Neural Networks: An Introduction

Submitted by Souvick Moulick (Department: BCA, Batch: 2017-2020)

University Roll No: 15201217017

We want to explore machine learning on a deeper level by discussing neural networks, and we will do that by explaining how you can use TensorFlow to recognize handwriting. To get there, we must first understand what neural networks are. Having a solid grasp of deep learning techniques feels like acquiring a superpower these days. From classifying images and translating languages to building a self-driving car, tasks that once required manual human effort are increasingly carried out by computers. Deep learning has spread into many diverse industries, and it continues to break new ground on an almost weekly basis.

What Exactly Are Neural Networks?

To begin our discussion of how to use TensorFlow to work with neural networks, we first need to discuss what neural networks are.

Think of the linear regression problem we have looked at several times before. There we have the concept of a loss function, which measures how far the model's predictions are from the known answers. A neural network homes in on the correct answer to a problem by minimizing the loss function.
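To make the idea of a loss function concrete, here is a minimal sketch for the linear regression case. It assumes mean squared error as the loss, which is a common choice but not one the text names; the function name mse_loss and the toy data are hypothetical.

```python
def mse_loss(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b on the given points."""
    errors = [(w * x + b - y) ** 2 for x, y in zip(xs, ys)]
    return sum(errors) / len(errors)

# Hypothetical toy data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

good_fit = mse_loss(2.0, 1.0, xs, ys)  # the true line, so every error is zero
poor_fit = mse_loss(0.5, 0.0, xs, ys)  # wrong slope and intercept

print(good_fit, poor_fit)
```

The better the parameters match the data, the smaller the loss, so "learning" amounts to searching for the parameters that make this number as small as possible.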

Neural networks are a programming approach that is inspired by the neurons in the human brain and that enables computers to learn from observational data, be it images, audio, text, labels, strings or numbers. They try to model some unknown function that maps this data to numbers or classes by recognizing patterns in the data.

How Neural Networks Work

Generally speaking, neural network models consist of thousands of neurons (nodes) that are densely connected. In most neural network models, neurons are organized into layers. This includes an input layer, which includes neurons for all of the provided predictor variables, hidden layers, and an output layer. The hidden layers of a neural network effectively transform the inputs into something that the output layer can interpret. The output layer returns either a category label (classification) or an estimated value (regression).
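The layered structure described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the layer sizes (3 inputs, 4 hidden neurons, 2 outputs), the sigmoid activation, the random weights, and the helper names make_layer and layer_forward are all hypothetical, and the network is untrained, so its prediction is meaningless.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_layer(n_inputs, n_neurons, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases

def layer_forward(inputs, layer):
    """Compute the output of every neuron in the layer."""
    weights, biases = layer
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

rng = random.Random(0)            # fixed seed so the run is repeatable
hidden = make_layer(3, 4, rng)    # input layer (3 predictors) -> hidden layer
output = make_layer(4, 2, rng)    # hidden layer -> output layer (2 classes)

features = [0.5, -1.2, 3.0]       # one hypothetical observation
hidden_values = layer_forward(features, hidden)
scores = layer_forward(hidden_values, output)

# Classification: pick the index of the output neuron with the highest score.
predicted_class = scores.index(max(scores))
print(hidden_values, scores, predicted_class)
```

For a regression problem, the output layer would instead have a single neuron whose value is read directly as the estimate.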

At each neuron, all incoming values are multiplied by the weights of their connections and added together, and the sum is then processed with an activation function (e.g., the sigmoid function), which determines how strongly the neuron is “activated”. Often, a bias is also added to the sum before the activation function is applied. The bias is similar to an intercept term in a regression model.
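As a rough sketch of what a single neuron computes, the example below assumes the usual convention that each incoming value is multiplied by a connection weight before the values are summed. The function name neuron and the numbers are hypothetical.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    # Weighted sum of the incoming values, plus the bias (the "intercept").
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # The activation function decides how strongly the neuron fires.
    return sigmoid(z)

# 0.5*1.0 + (-0.25)*2.0 + 0.0 = 0.0, and sigmoid(0.0) is exactly 0.5.
activation = neuron([1.0, 2.0], [0.5, -0.25], 0.0)

# A positive bias shifts the same inputs toward a stronger activation.
shifted = neuron([1.0, 2.0], [0.5, -0.25], 2.0)

print(activation, shifted)
```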

How a Neural Network is Trained

Multilayer perceptron neural networks are typically trained with a method known as backpropagation, which calculates the gradient of the cost (loss) function with respect to each weight in the network so that the weights can be adjusted to reduce the cost.

To start training a neural network, all of the initial weights and thresholds (biases) are randomly generated. The training data is then fed through the input layer and passes through the model until it arrives at the output layer. At the output layer, a cost function is calculated to estimate how well the model performed in estimating the known target variable. The output of the cost function is small when the network confidently estimates the correct value and increases with misclassifications. The goal of the training algorithm is to minimize the value of the cost function.
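The behaviour of the cost function can be illustrated with a small sketch. It assumes binary cross-entropy, one common cost for classification; the text does not say which cost function is used, and the probabilities below are made up.

```python
import math

def binary_cross_entropy(predicted, target):
    """Cost for one example: predicted is a probability, target is 0 or 1."""
    eps = 1e-12  # keep the probability away from 0 and 1 so log() is defined
    p = min(max(predicted, eps), 1.0 - eps)
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))

# The true label is 1 in all three cases.
confident_correct = binary_cross_entropy(0.95, 1)  # small cost
unsure = binary_cross_entropy(0.50, 1)             # moderate cost
confident_wrong = binary_cross_entropy(0.05, 1)    # large cost

print(confident_correct, unsure, confident_wrong)
```

The cost shrinks toward zero as the network becomes confidently right and grows quickly as it becomes confidently wrong, which is the property the training algorithm relies on.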

Weights and thresholds in the neural network are then adjusted to minimize the cost function (this is the calculus part) until the model converges on a local minimum. The process is repeated and weights and thresholds continue to be adjusted based on the training data and a cost function until all data with the same labels result in similar values.
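The adjust-and-repeat loop can be sketched with the smallest possible network: a single sigmoid neuron trained by gradient descent to reproduce the logical OR function. Everything here is an assumption made for illustration (the toy data, the learning rate, the number of passes, the cross-entropy cost, and the helper names). A real multilayer network would use backpropagation to push the same kind of gradient through every layer, and unlike this tiny example it may settle in a local rather than a global minimum.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, weights, bias):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def mean_cost(data, weights, bias):
    """Average cross-entropy cost over the training data."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = min(max(predict(x, weights, bias), eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(data)

# Hypothetical training data: inputs and labels for logical OR.
data = [([0.0, 0.0], 0), ([0.0, 1.0], 1), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]

# Step 1: random initial weights and bias (fixed seed for repeatability).
rng = random.Random(42)
weights = [rng.uniform(-0.5, 0.5) for _ in range(2)]
bias = rng.uniform(-0.5, 0.5)
learning_rate = 0.5

initial_cost = mean_cost(data, weights, bias)

# Step 2: repeatedly feed the data forward and nudge the parameters downhill.
for epoch in range(2000):
    for x, y in data:
        # For sigmoid plus cross-entropy, the gradient of the cost with
        # respect to the weighted sum is simply (prediction - label).
        error = predict(x, weights, bias) - y
        for i in range(len(weights)):
            weights[i] -= learning_rate * error * x[i]
        bias -= learning_rate * error

final_cost = mean_cost(data, weights, bias)
print(initial_cost, final_cost)
```

After training, inputs that share a label produce similar outputs: the prediction for [0, 0] falls below 0.5, while the other three inputs score above it.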

Try Your Own Neural Network

Neural networks are data-driven algorithms, so the first step is to investigate your data thoroughly. Various statistical and visualization techniques can be used to see patterns and variations in the data. Once you have a better understanding of your data, decide on your network. The best bet is to start from networks that have been trained and validated by established researchers, or at least to take inspiration from the various “building units” in them. A great place to start is the Wolfram Neural Net Repository, where you can play with various network surgery functions. Once you have created the architecture, start experimenting with various parameters, initializations and losses. It is absolutely okay to overfit at this stage! Finally, you can apply regularization techniques to help your model generalize to data it has not seen.