This is part of my answer to interview question 9, which is to explain your favorite machine learning algorithm in five minutes.
Neural Networks Made Simple
Neural networks are loosely modeled on the way human brains learn. They consist of interconnected layers of nodes: first an input layer, followed by any number of hidden layers, and finally an output layer. The input layer takes in the values of the features in the training set, and the output layer produces the final prediction.
Each node computes a single output from a weighted combination of all of its inputs. Those inputs are either the features of the data (for the input layer) or the outputs of the previous layer. The weighted sum is passed through the node's activation function, and the result is a single value that is passed on to the next layer, or returned as the final output if the node is in the output layer.
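As a minimal sketch (plain Python, no framework, using a sigmoid as the example activation), a single node's computation looks like this:

```python
import math

def node_output(inputs, weights, bias):
    # Weighted combination of all inputs to the node, plus a bias term
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # Activation function: the sigmoid squashes the sum into (0, 1)
    return 1.0 / (1.0 + math.exp(-weighted_sum))

# Example: two input features, made-up weights and bias
value = node_output([1.0, 0.5], [0.4, -0.2], 0.1)
```

A whole layer is just many such nodes applied to the same inputs, each with its own weights.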
Neural networks learn by repeatedly updating the weights to minimize error. The idea of backpropagation is that the errors at the output "flow back" from the output layer to update the weights throughout the network. If the network's output already matches the label on the data, the weights are left unchanged. Two methods of updating the weights are the perceptron rule and the delta rule (gradient descent).
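A rough sketch of the delta rule for a single linear unit (the one-node case, no hidden layers; function name and toy data are my own for illustration):

```python
def train_linear_unit(data, labels, lr=0.1, epochs=200):
    """Delta rule: after each example, nudge each weight in the
    direction that reduces the squared error on that example."""
    weights = [0.0] * len(data[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            pred = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = y - pred
            # If the prediction matches the label, error is 0
            # and the weights are not updated.
            weights = [w + lr * error * xi for w, xi in zip(weights, x)]
            bias += lr * error
    return weights, bias

# Learn the relation y = 2*x from three points
w, b = train_linear_unit([[0.0], [1.0], [2.0]], [0.0, 2.0, 4.0])
```

After training, `w[0]` ends up close to 2 and `b` close to 0. Backpropagation extends this same error-driven update through the hidden layers by applying the chain rule.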
Different activation functions can be used at each layer. Common activation functions are:
- Rectified Linear Unit (ReLU) – thresholded at 0
- Perceptron – Discrete -1 or 1 value; guaranteed to find a separating boundary if the data are linearly separable
- Sigmoid – Gradual, continuous curve from 0 to 1; differentiable, so gradient descent can be used
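The three activations above can be written as one-liners (a sketch; the discrete ±1 step is sometimes called the sign function):

```python
import math

def relu(z):
    # ReLU: pass positive values through, clip everything else to 0
    return max(0.0, z)

def perceptron(z):
    # Perceptron unit: discrete step producing -1 or 1
    return 1 if z >= 0 else -1

def sigmoid(z):
    # Sigmoid: smooth, differentiable squash from 0 to 1
    return 1.0 / (1.0 + math.exp(-z))
```

Note that the sigmoid's smoothness is what makes the delta rule and backpropagation possible, since both need the activation's derivative.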
Strengths and weaknesses:
- Hidden layers can invent new features and therefore create a better representation of the problem
- Good at handling large data sets
- Hard to interpret the output
- The more complex/bigger the network, the more likely it is to overfit
Further reading:
- Keras – Python Deep Learning Library
- scikit-learn – Neural network models
- Wikipedia – Artificial Neural Network