DEEP LEARNING

Gayathri siva
10 min read · May 6, 2021


“Deep Learning is a collection of statistical machine learning techniques used to learn feature hierarchies, based on the concept of artificial neural networks.”

Artificial Intelligence

Artificial intelligence is a technology that enables a machine to simulate human behavior.

Subsets of Artificial Intelligence

Machine learning

Machine learning is a type of Artificial Intelligence (AI) that provides computers with the ability to learn without being explicitly programmed.


Types of Machine Learning:

  1. Supervised Learning
  2. Unsupervised Learning
  3. Reinforcement Learning

Limitations of Machine Learning

  1. Traditional models are not useful when working with high-dimensional data, that is, where we have a large number of inputs and outputs.
  2. They cannot solve crucial AI problems like NLP, image recognition, etc.
  3. One of the big challenges with traditional machine learning models is feature extraction: the programmer has to tell the machine which features to look for.
  4. For complex problems such as object recognition or handwriting recognition, hand-crafting such features is a huge challenge.

Deep Learning to the Rescue:

— Deep learning models are capable of focusing on the right features by themselves, requiring little guidance from the programmer.

— These models also partially solve the dimensionality problem.

DEEP LEARNING

Deep learning is a form of machine learning that uses a model of computing that’s very much inspired by the structure of the brain.

Artificial Neural Network:

A neural network is a system of hardware and/or software patterned after the operation of neurons in the human brain. Neural networks, also called Artificial Neural Networks (ANNs), are a way of achieving deep learning.

Biological Neuron vs. Artificial Neuron

Artificial neural networks, modeled on biological ones, allow computers to learn and interact somewhat as humans do. A biological neural network consists of interconnected neurons with dendrites that receive inputs; based on those inputs, each neuron produces an electrical signal, i.e. an output, through its axon.

An Artificial neural network is usually described as having different layers. The first layer is the input layer, it picks up the input signals and passes them to the next layer. The next layer does all kinds of calculations and feature extractions — it’s called the hidden layer. Often, there will be more than one hidden layer. And finally, there’s an output layer, which delivers the final result.

Why do we need Artificial Neurons?

Suppose we have measurements from two species of flowers, and we need a system to separate the two species.

With the help of an artificial neuron, we can learn a boundary that separates the two species of flowers.

A single artificial neuron of this type is called a perceptron.

How Does A Neural Network Work?

To understand neural networks, we need to break them down and understand the most basic unit of a Neural Network, i.e. a Perceptron.

  • Single layer Perceptron
  • Multilayer Perceptron

What is a Perceptron (Single-Layer Perceptron)?

A perceptron is the basic unit of a single-layer neural network and is used to classify linearly separable data.

Perceptron Learning Rule:

STEP 1: The inputs are multiplied by the weights, a summation is performed, and a bias is added.

STEP 2: The weighted sum of inputs is passed to an activation function to determine whether the neuron will fire.

STEP 3: If the sum of the input signals exceeds a certain threshold value, the perceptron outputs a signal (fires); otherwise it returns no output.

STEP 4: The error in the output is propagated backward, and the weights are updated to minimize the error.

Components of the basic Artificial Neuron:

  1. Inputs: Inputs are the set of values for which we need to predict the output. They can be viewed as the features or attributes in a dataset.
  2. Weights: Weights are the real values associated with each feature; they express the importance of that feature in predicting the final value.
  3. Bias: Bias shifts the activation function to the left or right; it can be thought of as the y-intercept in the line equation.
  4. Summation Function: The summation function binds the weights and inputs together and computes their sum.
  5. Activation Function: The activation function takes the weighted sum of the inputs plus the bias, (w·x) + b, as its input and decides whether the neuron should fire. It introduces non-linearity into the model.
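To make the components and the four steps above concrete, here is a minimal NumPy sketch of the perceptron learning rule applied to the two flower species mentioned earlier. The measurements, labels, and learning rate are made-up assumptions for illustration, not a real dataset:

```python
import numpy as np

# Hypothetical data: two measurements per flower (e.g. petal length, width),
# label 0 for one species and 1 for the other. Linearly separable by design.
X = np.array([[1.0, 0.5], [1.2, 0.8], [3.0, 2.5], [3.5, 2.0]])
y = np.array([0, 0, 1, 1])

rng = np.random.default_rng(0)
w = rng.normal(size=2)   # weights: one real value per input feature
b = 0.0                  # bias: shifts the decision boundary
lr = 0.1                 # learning rate (illustrative choice)

def step(z):
    # Threshold activation: fire (1) if the weighted sum exceeds 0
    return 1 if z > 0 else 0

for epoch in range(20):
    for x_i, y_i in zip(X, y):
        z = np.dot(w, x_i) + b   # STEP 1: weighted sum plus bias
        y_hat = step(z)          # STEPS 2-3: activation decides the output
        error = y_i - y_hat      # STEP 4: error drives the weight update
        w += lr * error * x_i
        b += lr * error

print("learned weights:", w, "bias:", b)
```

Because the made-up data is linearly separable, these updates settle on weights that put the two species on opposite sides of the learned boundary.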

Types of Activation Function:

Sigmoid Function

Used for models where we have to predict a probability as the output, since the sigmoid’s value lies between 0 and 1.

Threshold Function

It is a threshold-based activation function: if the input x is greater than a certain threshold value, the function fires (outputs 1); otherwise it does not (outputs 0).

ReLU (Rectified Linear Unit) Function

It is the most widely used activation function and outputs x if x is positive and 0 otherwise.

Hyperbolic Tangent Function

This function is similar to the sigmoid function but is bounded to the range (-1, 1).
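For reference, here is a small NumPy sketch of the four activation functions described above; the sample inputs are arbitrary:

```python
import numpy as np

def sigmoid(x):
    # Squashes any real number into (0, 1); useful when the output is a probability
    return 1 / (1 + np.exp(-x))

def threshold(x):
    # Fires (1) if x exceeds the threshold (here 0), else 0
    return np.where(x > 0, 1, 0)

def relu(x):
    # Outputs x if x is positive, 0 otherwise
    return np.maximum(0, x)

def tanh(x):
    # Like the sigmoid, but bounded to (-1, 1)
    return np.tanh(x)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
for f in (sigmoid, threshold, relu, tanh):
    print(f.__name__, f(z))
```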

Limitations of Single-Layer Perceptron:

There are two major problems:

  • Single-Layer Perceptrons cannot classify non-linearly separable data points.
  • Complex problems that involve many parameters cannot be solved by Single-Layer Perceptrons.

Multilayer Perceptron:

A Multilayer Perceptron (MLP) contains one or more hidden layers (apart from one input and one output layer). While a single-layer perceptron can only learn linear functions, a multilayer perceptron can also learn non-linear functions, and is thus considered a deep neural network.

A Deep neural network consists of the following layers:

  1. The Input Layer
  2. The Hidden Layer
  3. The Output Layer

A simple neural network includes an input layer, an output (or target) layer, and, in between, a hidden layer. The layers are connected via nodes, and these connections form a “network” — the neural network — of interconnected nodes.
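As a rough sketch, a single forward pass through such a network with one hidden layer might look like this in NumPy; the layer sizes and random weights here are arbitrary assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

n_in, n_hidden, n_out = 4, 5, 3            # arbitrary layer sizes
W1 = rng.normal(size=(n_hidden, n_in))     # input -> hidden weights
b1 = np.zeros(n_hidden)                    # hidden-layer biases
W2 = rng.normal(size=(n_out, n_hidden))    # hidden -> output weights
b2 = np.zeros(n_out)                       # output-layer biases

def relu(x):
    return np.maximum(0, x)

x = rng.normal(size=n_in)    # one input example
h = relu(W1 @ x + b1)        # hidden layer: weighted sum, bias, non-linearity
out = W2 @ h + b2            # output layer delivers the final result
print(out)
```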

Forward Propagation and Backpropagation

Let’s see forward propagation and backpropagation with a handwritten-letter image classification example.

This simple neural network must be trained to recognize the handwritten letters ‘a’, ‘b’, and ‘c’.

The human brain can easily recognize these letters, but what if a computer had to recognize them? That’s where deep learning comes in.

Here’s a neural network trained to identify handwritten letters. Each letter is presented as an image of 28×28 pixels.

That amounts to a total of 784 pixels. Neurons, the core entities of a neural network, are where the information processing takes place; each of the 784 pixel values is fed to a neuron in the first layer of our neural network.

This forms the input layer. At the other end, we have the output layer, with each neuron representing a letter, and the hidden layers existing between them.

Information is transferred from one layer to the next over connecting channels. Each of these channels has a value attached to it (initially random) and hence is called a weighted channel.

The initial prediction is made using these weighted channels.

Every neuron has a unique number associated with it, called the bias.

This bias is added to the weighted sum of inputs reaching the neuron, and the result is passed to a function known as the activation function.

The result of the activation function determines whether the neuron gets activated. Every activated neuron passes information on to the following layer; this continues up to the second-to-last layer.

The single neuron activated in the output layer corresponds to the predicted letter. During training, the weights and biases are continuously adjusted to produce a well-trained network.

Suppose our network predicts the input to be ‘b’ with a probability of 0.5.

The predicted probabilities are compared against the actual probabilities and the error is calculated.

The magnitude of the error indicates how much the weights should change, while the sign indicates whether they should increase or decrease.

This error information is transmitted back through the network (backpropagation).

Weights throughout the network are then adjusted in order to reduce the loss in prediction.
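To make this concrete, here is a hedged Keras sketch of such a letter classifier. The arrays x_train and y_train are hypothetical stand-ins for a real labeled dataset of 28×28 images of ‘a’, ‘b’, and ‘c’ (random noise here, just so the snippet runs end to end):

```python
import numpy as np
import tensorflow as tf

# Hypothetical stand-in data: 300 fake 28x28 "images", labels 0='a', 1='b', 2='c'
x_train = np.random.rand(300, 28, 28).astype("float32")
y_train = np.random.randint(0, 3, size=300)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28)),           # 784 pixel values in total
    tf.keras.layers.Flatten(),                       # one input neuron per pixel
    tf.keras.layers.Dense(128, activation="relu"),   # hidden layer
    tf.keras.layers.Dense(3, activation="softmax"),  # one output neuron per letter
])

# The loss compares the predicted probabilities against the true labels;
# the optimizer backpropagates the error and adjusts weights and biases.
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=5, batch_size=32)
```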

The network starts training itself by choosing random values for the weights. In the first iteration the loss is large; after the second and third iterations, the weights are adjusted and the loss gradually decreases.

Let’s focus on finding the minimum loss for the output ‘a’.

Gradient Descent:

Gradient descent is an optimization algorithm, often pictured graphically, for finding the minimum of a function.

  • The sign of the slope indicates whether the weight should be increased or decreased: the weight is moved against the slope, so a negative slope means the weight should increase and a positive slope means it should decrease.
  • A zero slope indicates the appropriate weight.

Our aim is to reach a point where the slope is zero.

We now plot a graph for weight versus loss.

Let’s assume we have such a graph of the prediction loss for the output ‘a’ against a weight contributing to it from the second-to-last layer.

Starting from a randomly chosen point on this graph, the slope of the loss is computed and backpropagated through the network in order to adjust the weights.

The network is run once again with the new weights. The process is repeated multiple times till it provides accurate predictions.
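Here is a minimal sketch of gradient descent on a single weight, using a made-up quadratic loss whose minimum is known to be at w = 3 so the behavior is easy to check:

```python
# Toy loss: L(w) = (w - 3)^2, minimized at w = 3 where the slope is zero
def loss_grad(w):
    return 2 * (w - 3)   # derivative dL/dw, i.e. the slope

w = 0.0     # arbitrary starting weight
lr = 0.1    # learning rate controls the step size

for step in range(100):
    slope = loss_grad(w)
    w -= lr * slope   # move against the slope: negative slope -> w increases
    # Stop once the slope is (close to) zero, i.e. the appropriate weight
    if abs(slope) < 1e-6:
        break

print(w)   # approaches 3.0
```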

The weights are further adjusted to identify ‘b’ and ‘c’ too.

Types of neural networks

There are different kinds of deep neural networks — and each has advantages and disadvantages, depending upon the use. Examples include:

  • Convolutional neural networks (CNNs) contain five types of layers: input, convolution, pooling, fully connected, and output. Each layer has a specific purpose, like summarizing, connecting, or activating. Convolutional neural networks have popularized image classification and object detection. However, CNNs have also been applied to other areas, such as natural language processing and forecasting. (A minimal sketch of this five-layer stack appears after this list.)
  • Recurrent neural networks (RNNs) use sequential information such as time-stamped data from a sensor device or a spoken sentence, composed of a sequence of terms. Unlike traditional neural networks, all inputs to a recurrent neural network are not independent of each other, and the output for each element depends on the computations of its preceding elements. RNNs are used in forecasting and time series applications, sentiment analysis, and other text applications.
  • Feedforward neural networks, in which each perceptron in one layer is connected to every perceptron in the next layer. Information flows from one layer to the next in the forward direction only; there are no feedback loops.
  • Autoencoder neural networks are used to create abstractions called encoders, created from a given set of inputs. Although similar to more traditional neural networks, autoencoders seek to model the inputs themselves, and therefore the method is considered unsupervised. The premise of autoencoders is to desensitize the irrelevant and sensitize the relevant. As layers are added, further abstractions are formulated at higher layers (layers closest to the point at which a decoder layer is introduced). These abstractions can then be used by linear or nonlinear classifiers.
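As referenced in the CNN bullet above, here is a minimal Keras sketch showing the five layer types in order. The input shape, filter counts, and class count are illustrative assumptions:

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(28, 28, 1)),                      # input layer
    tf.keras.layers.Conv2D(16, kernel_size=3, activation="relu"),  # convolution layer
    tf.keras.layers.MaxPooling2D(pool_size=2),                     # pooling summarizes regions
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(64, activation="relu"),                  # fully connected layer
    tf.keras.layers.Dense(3, activation="softmax"),                # output layer
])
model.summary()
```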

Future of Artificial Neural Network

  • Neural networks will be used in fields such as medicine and agriculture.
  • Neural network tools will be embedded in every design surface.
  • Neural networks will be much faster in the future.
  • They will enable more personalized choices for users and customers all over the world.
  • Hyper-intelligent virtual assistants will make life easier.
  • New forms of algorithms and learning methods will be discovered.

Popular Deep Learning Frameworks

Popular frameworks for building deep learning models include TensorFlow, Keras, PyTorch, Caffe, Theano, and MXNet.

Applications of Deep Learning:

Playing Games: Deep learning allows us to build machines that can play games.
Composing Music: Deep neural nets can be used to produce music by making computers learn the patterns in a composition.
Autonomous Driving Cars: Deep learning models distinguish different types of objects, people, and road signs, and enable driving without human intervention.
Building Robots: Deep learning is used to train robots to perform human tasks.
Medical Diagnosis: Deep neural nets are used to identify suspicious lesions and nodules in lung cancer patients.
