A Brief Insight on DATA

Published in The Startup, Feb 23, 2020

Technology is not just a part of our lives; it is making our lives much better. Data is available almost everywhere, and we create an enormous amount of it in our day-to-day lives. It is all about the 1’s and 0’s that we create, and they are present almost everywhere.

For example, when we use an AI toothbrush, we create an enormous amount of data. This toothbrush records that data and can study our brushing patterns and how long we brush.

In short, this AI toothbrush knows more about your teeth than you do. It suggests which tooth requires extra care. Thus, technology creates enormous opportunities to build a sustainable and healthy society.

“Never before in history has innovation offered promise of so much to so many in so short a time.” (Bill Gates)

There is a buzz around the Fourth Industrial Revolution. It may mean building a home on Mars using a 3D printer, or a doctor operating on a patient from home using mixed reality, a robotic arm and 5G technology. Whatever it may be, data plays a crucial role in it. In the age of 5G, a lot of things will be connected to the Internet, which we call the Internet of Things (IoT). The IoT creates far more data than we create ourselves in our day-to-day lives.

The data we create can be in any format, e.g., photos, text, video, speech, etc. This large amount of data is called BIG DATA.

The 5 V’s of BIG DATA

Big data is commonly characterized by five V’s: volume, velocity, variety, veracity and value. It is a combination of structured, semi-structured and unstructured data that can be mined for information and used in predictive modeling and other advanced analytical applications.

This data is collected, stored and processed, and then predictive models are built to extract insights from it.

“It’s the knowledge derived from information that gives you a competitive edge.” (Bill Gates)

DATA SCIENCE

Data science is the art of collecting data, storing it, processing it, describing it and building predictive models to gain insights about a whole population based on a sample of that population. Data science is a combination of domain knowledge, mathematics and programming skills. Mining large amounts of structured and unstructured data to identify patterns can help an organization increase its efficiency, recognize new market opportunities and strengthen its competitive advantage.
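To make the idea of learning about a population from a sample concrete, here is a minimal sketch using only the Python standard library. The population of brushing times, the sample size and the helper name `estimate_population_mean` are all assumptions made up for illustration, and the interval uses a simple normal approximation.

```python
import random
import statistics

def estimate_population_mean(sample, z=1.96):
    """Return (mean, low, high): the sample mean and an approximate 95% interval."""
    n = len(sample)
    mean = statistics.mean(sample)
    # Standard error of the mean, based on the sample standard deviation.
    stderr = statistics.stdev(sample) / n ** 0.5
    return mean, mean - z * stderr, mean + z * stderr

# Hypothetical population: 100,000 simulated brushing times in seconds.
rng = random.Random(0)
population = [rng.gauss(120, 30) for _ in range(100_000)]

# In practice we only see a sample, never the whole population.
sample = rng.sample(population, 500)

mean, low, high = estimate_population_mean(sample)
print(f"sample mean = {mean:.1f}s, approx. 95% interval = ({low:.1f}, {high:.1f})")
print(f"true population mean = {statistics.mean(population):.1f}s")
```

The sample of 500 values is enough to place the population mean within a narrow interval, which is the basic bargain data science relies on.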

THE DATA SCIENTIST APPROACH TO A REAL WORLD PROBLEM

After collecting data, the data scientist runs some experiments with it, commonly known as Exploratory Data Analysis (EDA), to uncover insights and patterns hidden in the data by plotting and visualizing it.

“You need to understand things in order to invent beyond them.” (Bill Gates)

By doing EDA, the data scientist can understand the data much better and can determine which algorithm to use to solve the problem.
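A first pass at EDA can be sketched in a few lines of Python. The data below (brushing durations in seconds) and the helper names are hypothetical. A real project would more likely use a plotting library, but the idea is the same: summarize, look at the distribution, and spot values that do not fit.

```python
import statistics
from collections import Counter

# Hypothetical data: brushing duration in seconds for 12 sessions.
durations = [95, 110, 120, 118, 45, 130, 125, 115, 122, 300, 119, 121]

def summarize(values):
    """Basic descriptive statistics, usually the first step of EDA."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": median,
        "q1": q1,
        "q3": q3,
        "min": min(values),
        "max": max(values),
    }

def find_outliers(values):
    """Flag points more than 1.5 * IQR outside the quartiles (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return [v for v in values if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

def text_histogram(values, bin_width=50):
    """Print a crude text histogram so the distribution can be eyeballed."""
    bins = Counter((v // bin_width) * bin_width for v in values)
    for start in sorted(bins):
        print(f"{start:>4}-{start + bin_width - 1:<4} {'#' * bins[start]}")

summary = summarize(durations)
outliers = find_outliers(durations)
print(summary)
print("possible outliers:", outliers)
text_histogram(durations)
```

Here the two unusual sessions stand out immediately, which would influence both data cleaning and the choice of algorithm.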

THE RELATIONSHIP BETWEEN AI, ML, DL AND DS

ARTIFICIAL INTELLIGENCE

Artificial intelligence (AI) is an area of computer science that emphasizes the creation of intelligent machines that work and react like humans. The human-like activities they mimic are problem solving, knowledge representation, decision making, perception, communication and actuation.

TYPES OF AI

ARTIFICIAL NARROW INTELLIGENCE (ANI) -> (MACHINE LEARNING)

Specializes in one area and solves only one problem.

ARTIFICIAL GENERAL INTELLIGENCE (AGI) -> (MACHINE INTELLIGENCE)

Refers to a computer that is as smart as a human across the board.

ARTIFICIAL SUPER INTELLIGENCE (ASI) -> (MACHINE CONSCIOUSNESS)

An intellect that is much smarter than the best human brains in practically every field.

MACHINE LEARNING

Machine learning is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as “training data”, in order to make predictions or decisions without being explicitly programmed to perform the task.
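As a small illustration of building a mathematical model from training data, the sketch below fits a straight line by ordinary least squares. The training values and the function names `fit_line` and `predict` are invented for this example. The rule y = 2x + 1 is never written into the program; it is recovered from the data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Apply the learned line to a new input."""
    slope, intercept = model
    return slope * x + intercept

# Hypothetical training data generated by the hidden rule y = 2x + 1.
train_x = [1.0, 2.0, 3.0, 4.0]
train_y = [3.0, 5.0, 7.0, 9.0]

model = fit_line(train_x, train_y)
prediction = predict(model, 5.0)
print("learned slope and intercept:", model)
print("prediction for x = 5:", prediction)
```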

TYPES OF MACHINE LEARNING

SUPERVISED LEARNING

Supervised learning is when the model is trained on a labelled data-set. A labelled data-set is one that has both input and output parameters. In this type of learning, both the training and validation data-sets are labelled.
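The sketch below shows the supervised setting with a k-nearest-neighbours classifier, chosen here only because it is short. The labelled data (brushing seconds and pressure, tagged "poor" or "good") is hypothetical. Both the training set and the validation set carry labels, and the validation labels are used to measure accuracy.

```python
from collections import Counter

def knn_predict(train, point, k=3):
    """Predict a label by majority vote among the k closest training points."""
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], point)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical labelled data: (brushing seconds, pressure) -> quality label.
train = [
    ((30, 0.90), "poor"), ((40, 0.80), "poor"), ((45, 0.85), "poor"),
    ((120, 0.40), "good"), ((110, 0.50), "good"), ((130, 0.45), "good"),
]
validation = [((35, 0.95), "poor"), ((125, 0.50), "good")]

# Compare predictions against the known validation labels.
correct = sum(knn_predict(train, x) == y for x, y in validation)
accuracy = correct / len(validation)
print(f"validation accuracy = {accuracy:.2f}")
```

In a real project the two features would normally be scaled to comparable ranges first, since raw seconds dominate the distance here.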

UNSUPERVISED LEARNING

Unsupervised learning is the training of a machine using information that is neither classified nor labelled, allowing the algorithm to act on that information without guidance. Here the task of the machine is to group unsorted information according to similarities, patterns and differences without any prior training on the data.
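Grouping unlabelled points by similarity can be sketched with k-means clustering. The points and the naive initialisation (taking the first k points as starting centroids) are assumptions made to keep the example short and deterministic. No labels are supplied anywhere.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean_point(points):
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def kmeans(points, k, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-centre."""
    centroids = list(points[:k])  # naive initialisation, fine for a sketch
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabelled data: two visibly separate groups.
points = [(1, 1), (9, 9), (1.5, 2), (8, 8), (1, 0.5), (9, 10)]
centroids, clusters = kmeans(points, k=2)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```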

REINFORCEMENT LEARNING

Reinforcement learning is an area of machine learning concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. Due to its generality, the field is studied in many other disciplines, such as game theory, control theory, operations research, information theory, simulation-based optimization, multi-agent systems, swarm intelligence, statistics and genetic algorithms.
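A minimal sketch of an agent learning to maximize cumulative reward is tabular Q-learning on a tiny corridor. The environment (five cells, reward 1 for reaching the last cell) and the hyperparameters are invented for illustration. Q-learning is only one of many reinforcement learning algorithms.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = [-1, +1]    # index 0 = step left, index 1 = step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.2  # learning rate, discount, exploration

def train(episodes=500, seed=0):
    """Learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _step in range(100):  # cap on steps per episode
            # Explore at random sometimes, and whenever the two values tie.
            explore = rng.random() < EPSILON or q[state][0] == q[state][1]
            action = rng.randrange(2) if explore else q[state].index(max(q[state]))
            next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
            reward = 1.0 if next_state == N_STATES - 1 else 0.0
            # Q-learning update: move the estimate towards reward + discounted future value.
            target = reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
            if state == N_STATES - 1:
                break
    return q

q_table = train()
# Greedy policy for the non-terminal cells.
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

Nobody tells the agent that moving right is correct; it discovers this from the reward signal alone.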

DEEP LEARNING

Deep learning is a set of algorithms in machine learning that attempt to model high-level abstractions in data by using architectures composed of multiple non-linear transformations.
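The phrase "multiple non-linear transformations" can be illustrated with a tiny two-layer network. The weights below are hand-picked for this example, not learned, so the sketch shows only the forward pass. It computes XOR, a function that no single linear layer can represent.

```python
def relu(values):
    """Non-linear activation: negative inputs become zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus a bias per output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]

# Hand-picked (not learned) weights, chosen so that the network computes XOR.
W1, B1 = [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]
W2, B2 = [[1.0, -2.0]], [0.0]

def forward(x):
    """Stack the layers: linear -> ReLU -> linear."""
    hidden = relu(dense(x, W1, B1))
    return dense(hidden, W2, B2)[0]

outputs = [forward([a, b]) for a in (0.0, 1.0) for b in (0.0, 1.0)]
for (a, b), y in zip([(0, 0), (0, 1), (1, 0), (1, 1)], outputs):
    print(a, b, "->", y)
```

In real deep learning the weights are found by training, and networks stack many more layers than this.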

DIFFERENCE BETWEEN ML AND DL

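The main difference lies in feature extraction. In classical machine learning, features are usually designed by hand before the model is trained, whereas a deep learning model learns its features from the raw data. Feature extraction is a process of dimensionality reduction by which an initial set of raw data is reduced to more manageable groups for processing. A characteristic of such large data sets is a large number of variables that require a lot of computing resources to process.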
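Hand-crafted feature extraction can be sketched as follows: a raw signal with many samples is reduced to a handful of summary numbers that a classical model could consume. The signal and the particular features chosen are assumptions made for illustration.

```python
import statistics

def extract_features(signal):
    """Reduce a raw signal to four summary features."""
    # A zero crossing is a sign change between consecutive samples.
    zero_crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0)
    )
    return {
        "mean": statistics.mean(signal),
        "std": statistics.pstdev(signal),
        "peak": max(abs(v) for v in signal),
        "zero_crossings": zero_crossings,
    }

# Hypothetical raw sensor readings: 200 samples of a simple oscillation.
raw_signal = [1.0, -1.0] * 100

features = extract_features(raw_signal)
print(len(raw_signal), "raw values reduced to", len(features), "features")
print(features)
```

A deep learning model would instead take the 200 raw values directly and learn its own internal features during training.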

CONVOLUTIONAL NEURAL NETWORKS

In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of deep neural networks, most commonly applied to analyzing visual imagery. They are also known as shift invariant or space invariant artificial neural networks (SIANN), based on their shared-weights architecture and translation invariance characteristics. They have applications in image and video recognition, recommender systems, image classification, medical image analysis, natural language processing, and financial time series.
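The shared-weights idea can be sketched with a single convolution: one small kernel is slid over every position of the image, so the same weights detect the same pattern wherever it appears. The image and the edge-detecting kernel below are made up for illustration. A real CNN learns its kernels and stacks many such layers.

```python
def conv2d(image, kernel):
    """'Valid' 2D convolution as used in CNNs (no padding, stride 1, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                kernel[i][j] * image[r + i][c + j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

# Hypothetical kernel that responds to a dark-to-bright vertical edge.
edge_kernel = [[-1, 1]]

# The same edge at two different horizontal positions.
image = [[0, 0, 1, 1]] * 3
shifted = [[0, 0, 0, 1]] * 3

response = conv2d(image, edge_kernel)
shifted_response = conv2d(shifted, edge_kernel)
print(response)
print(shifted_response)
```

When the edge moves one pixel to the right, the response moves with it, and no new weights are needed to detect it at the new location.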

RECURRENT NEURAL NETWORK

A Recurrent Neural Network (RNN) is a class of artificial neural networks where connections between nodes form a directed graph along a temporal sequence. This allows it to exhibit temporal dynamic behavior. Derived from feedforward neural networks, RNNs can use their internal state (memory) to process variable length sequences of inputs. This makes them applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition.
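The role of the internal state can be sketched with a one-unit recurrent cell. The weights are hand-picked rather than learned, and the name `run_rnn` is hypothetical. The same cell is applied at every time step, so sequences of any length can be processed, and the final state depends on the order of the inputs.

```python
import math

# Hand-picked (not learned) weights for a single recurrent unit.
W_INPUT, W_STATE, BIAS = 0.5, 0.8, 0.0

def run_rnn(sequence):
    """Process a sequence step by step and return the final hidden state."""
    state = 0.0  # the "memory" carried from one time step to the next
    for x in sequence:
        # The new state mixes the current input with the previous state.
        state = math.tanh(W_INPUT * x + W_STATE * state + BIAS)
    return state

# The same cell handles sequences of different lengths.
for seq in ([1.0], [1.0, 0.0], [0.0, 1.0], [1.0, 2.0, 3.0, 4.0]):
    print(seq, "->", round(run_rnn(seq), 4))
```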

CONCLUSION

Homo sapiens is the one species that explored the world and discovered and invented new things. But it is the same species that is destroying the world. Compared to ordinary business problems, the world now faces more complex problems such as climate change, plastic pollution and space debris, and we rely on this technology to save the world. Technology is like a knife: it is all about how we use it. It is also worth remembering what a great physicist once said.

“AI is likely to be either the best or worst thing to happen to humanity.” (Stephen Hawking)

Using the available data, we cannot stop a natural disaster, but we can at least predict the occurrence of some natural disasters and save millions of lives.