16 Feb 2023
An activation function determines the output of a node. It is also known as the Transfer Function.
Why do we use them?
- To determine the output of a neural network.
- To map the resulting values into a bounded range, e.g. 0 to 1 or -1 to 1.
Linear/Identity Activation Function
- f(x) = x: the output is just the weighted input, so the range is -infinity to infinity.
Sigmoid (Logistic Activation Function)
- Range: 0 to 1
- Used for models that try to predict a probability.
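A minimal NumPy sketch of the sigmoid (the helper name is my own):

```python
import numpy as np

def sigmoid(x):
    # Squashes any real input into the (0, 1) range: 1 / (1 + e^-x)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(np.array([-5.0, 0.0, 5.0])))  # ~[0.0067, 0.5, 0.9933]
```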
Softmax
A more generalized logistic activation function, used for multiclass classification.
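A small sketch of softmax in the same style (subtracting the max is a common numerical-stability trick, not part of the definition):

```python
import numpy as np

def softmax(x):
    # Exponentiate (shifted by the max for stability), then normalize so the outputs sum to 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # ~[0.659, 0.242, 0.099]
```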
Tanh
- Like sigmoid but ranges from -1 to 1.
- Advantage over sigmoid: negative inputs are mapped strongly negative and 0 is mapped to 0, so the output is zero-centered.
- Mainly used for classification between two classes.
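Tanh is built into NumPy; a quick check of the range and the zero-centering:

```python
import numpy as np

# tanh maps inputs into (-1, 1) and maps 0 to 0
print(np.tanh(np.array([-2.0, 0.0, 2.0])))  # ~[-0.964, 0.0, 0.964]
```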
ReLU
- Very popular right now (used for deep learning).
- Range: 0 to infinity
- Disadvantage: all negative values become zero, which decreases the ability of the model to fit or train from the data.
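A one-line NumPy sketch of ReLU, showing how every negative input collapses to zero:

```python
import numpy as np

def relu(x):
    # max(0, x): positives pass through unchanged, negatives become 0
    return np.maximum(0.0, x)

print(relu(np.array([-3.0, 0.0, 4.0])))  # [0. 0. 4.]
```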
Leaky ReLU
- Attempts to solve the “dying” ReLU problem.
- Usually uses f(x) = 0.01x for negative inputs (the left side of the graph).
- Range: -infinity to infinity
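A sketch of Leaky ReLU, assuming the common 0.01 slope for negative inputs:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Like ReLU for x >= 0, but keeps a small slope (alpha * x) for x < 0 so the unit doesn't "die"
    return np.where(x >= 0, x, alpha * x)

print(leaky_relu(np.array([-3.0, 0.0, 4.0])))  # [-0.03  0.    4.  ]
```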