Artificial Neural Networks¶
A neural network is a type of hypothesis class consisting of multiple parameterized, differentiable functions (layers) composed together to map the input to the output
It is made of layers of neurons, connected such that the input of one layer of neurons is the output of the previous layer of neurons (after activation)
They are loosely based on how our human brain works: Biological structure -> Biological function
You can think of a neural network as combining multiple non-linear decision surfaces into a single decision surface; a single layer computes something like \(h = \phi(Wx + b)\),
where \(\phi\) is a non-linear (activation) function
Neural networks can be thought of as "learning" (and hence optimizing the loss by tweaking)
- features (instead of manual feature specification)
- parameters
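As a concrete sketch of this composition (all weights below are random, purely illustrative), each layer is an affine transform followed by a non-linearity, and the output of one layer is the input of the next:

```python
import numpy as np

def layer(x, W, b, phi=np.tanh):
    """One layer: affine transform followed by a non-linear activation phi."""
    return phi(W @ x + b)

rng = np.random.default_rng(0)
x = rng.normal(size=3)                         # input vector
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # hidden-layer parameters
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)  # output-layer parameters

# The network is a composition: hidden activations act as learned features.
h = layer(x, W1, b1)                    # learned features
y = layer(h, W2, b2, phi=lambda z: z)   # linear output layer
```

Here the hidden activations `h` play the role of learned features, replacing manual feature specification.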
Universal Function Approximation¶
A 2-layer ANN is capable of approximating any function over a finite subset of the input space
Catch: the size of the hidden layer may need to grow to the number of data points
Over-exaggerated property; the same property is shared by Nearest Neighbors and splines, but no one cares
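The "catch" can be sketched numerically: give a 2-layer net as many hidden units as data points and it can interpolate them all. For brevity this sketch solves the output weights directly by least squares instead of training by gradient descent:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10
x = np.linspace(-1, 1, N)
y = np.sin(3 * x)                  # target values at N data points

# Hidden layer with N neurons: random weights and biases, tanh activation.
W, b = rng.normal(size=N), rng.normal(size=N)
H = np.tanh(np.outer(x, W) + b)    # (N, N) matrix of hidden activations

# Solve for output weights that fit all N points.
v = np.linalg.lstsq(H, y, rcond=None)[0]
pred = H @ v
print(np.max(np.abs(pred - y)))    # tiny residual: the net memorizes the data
```

This is memorization, not generalization, which is why the property is less impressive than it sounds.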
Hyperparameters¶
- Batch size
- Input size
- Output size
- No of hidden layers
- No of neurons in hidden layers
- Regularization
- Loss function
- Weight initialization technique
- Optimization
    - Algorithm
    - Learning rate
- No of epochs
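In practice these choices are often collected into a single configuration object; a hypothetical sketch (all keys and values illustrative):

```python
# Hypothetical hyperparameter configuration for a small feed-forward network.
config = {
    "batch_size": 32,
    "input_size": 784,
    "output_size": 10,
    "hidden_layers": [128, 64],      # no. of hidden layers and neurons per layer
    "regularization": {"l2": 1e-4},
    "loss": "cross_entropy",
    "weight_init": "xavier",
    "optimizer": {"algorithm": "sgd", "learning_rate": 0.01},
    "epochs": 20,
}
```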
Artificial Neuron¶
Most basic unit of an artificial neural network
Tasks¶
- Receive input from other neurons and combine them together
- Perform some kind of transformation to give an output. This transformation is usually a mathematical combination of inputs and application of an activation function.
Visual representation¶
MP Neuron¶
McCulloch-Pitts Neuron
Highly simplified computational model of a neuron
\(g\) aggregates the inputs and the function \(f\) takes a decision based on this aggregation, giving \(y \in \{ 0, 1 \}\)
- \(\sum x_i\) is the summation of boolean inputs
- \(\theta\) is threshold for the neuron
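The MP neuron can be written down directly: sum the boolean inputs and fire iff the sum reaches the threshold \(\theta\). Choosing \(\theta\) differently realizes different boolean functions:

```python
def mp_neuron(x, theta):
    """McCulloch-Pitts neuron: fire (1) iff the sum of boolean inputs meets theta."""
    g = sum(x)                       # aggregation g: sum of boolean inputs
    return 1 if g >= theta else 0    # decision f against threshold theta

# Boolean AND of 3 inputs: fire only when all inputs are 1.
print(mp_neuron([1, 1, 1], theta=3))  # 1
print(mp_neuron([1, 1, 0], theta=3))  # 0

# Boolean OR of 3 inputs: fire when any input is 1.
print(mp_neuron([0, 1, 0], theta=1))  # 1
```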
⚠ Limitation¶
MP neuron can only represent linearly-separable boolean functions (it cannot represent XOR, for example)
Perceptron¶
MP neuron with a mechanism to learn numerical weights for inputs
✅ Input is no longer limited to boolean values
- \(w_i\) are the weights for the inputs
Key Terms for Logic¶
- Pre-Activation (Aggregation)
- Activation (Decision)
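The two stages can be made explicit in a minimal perceptron sketch (the weight and input values below are illustrative):

```python
def perceptron(x, w, b):
    """Perceptron with real-valued inputs: weighted sum, then step decision."""
    a = sum(wi * xi for wi, xi in zip(w, x)) + b   # pre-activation (aggregation)
    return 1 if a >= 0 else 0                      # activation (decision)

print(perceptron([2.0, -1.0], w=[0.5, 1.0], b=-0.2))  # 0 (pre-activation = -0.2)
print(perceptron([2.0, 1.0], w=[0.5, 1.0], b=-0.2))   # 1 (pre-activation = 1.8)
```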
Perceptron Learning Algorithm¶
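A minimal sketch of the standard perceptron learning rule: whenever a point is misclassified, move the weight vector toward it (for label 1) or away from it (for label 0). The toy dataset (boolean OR) is illustrative:

```python
import numpy as np

def train_perceptron(X, y, epochs=100):
    """Perceptron learning: update weights only on mistakes."""
    X = np.hstack([X, np.ones((len(X), 1))])  # absorb the bias as an extra weight
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            pred = 1 if w @ xi >= 0 else 0
            if pred != yi:
                w += xi if yi == 1 else -xi   # the perceptron update rule
                mistakes += 1
        if mistakes == 0:                     # converged (guaranteed if separable)
            break
    return w

# Toy linearly separable data: boolean OR.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])
w = train_perceptron(X, y)
```

Convergence is guaranteed only when the data is linearly separable; on non-separable data the loop simply runs out of epochs.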
Perceptron vs Sigmoidal Neuron¶
| | Perceptron | Sigmoid/Logistic |
|---|---|---|
| Type of line | Step graph | Gradual curve |
| Smooth curve? | ❌ | ✅ |
| Continuous curve? | ❌ | ✅ |
| Differentiable curve? | ❌ | ✅ |
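The difference is easy to see numerically: the step function jumps at 0, while the sigmoid changes gradually and has a well-defined derivative everywhere:

```python
import math

def step(z):
    """Perceptron decision: hard threshold at 0."""
    return 1 if z >= 0 else 0

def sigmoid(z):
    """Sigmoid/logistic neuron: smooth squashing of z into (0, 1)."""
    return 1 / (1 + math.exp(-z))

for z in (-2.0, -0.1, 0.1, 2.0):
    print(z, step(z), round(sigmoid(z), 3))

# The sigmoid's derivative exists everywhere: sigma'(z) = sigma(z) * (1 - sigma(z)).
dz = sigmoid(0.0) * (1 - sigmoid(0.0))   # 0.25 at z = 0
```

This differentiability is what makes the sigmoid neuron usable with gradient-based training, unlike the perceptron's step.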