16 Feb 2023
An activation function determines the output of a node. It is also known as the Transfer Function.
Why do we use them?
- To determine the output of a neural network.
- To map the resulting values into a bounded range, e.g. 0 to 1 or -1 to 1.
Linear/Identity Activation Function
- f(x) = x: the output is just the weighted input, so the range is -infinity to infinity.
Sigmoid (Logistic Activation Function)
- Range: 0 to 1
- Used for models that try to predict a probability.
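A minimal NumPy sketch of the sigmoid (the helper name is my own):

```python
import numpy as np

def sigmoid(x):
    # Squashes any real input into the (0, 1) range: 1 / (1 + e^-x)
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(np.array([-5.0, 0.0, 5.0])))  # ~[0.0067, 0.5, 0.9933]
```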
Softmax
A more generalized logistic activation function, used for multiclass classification.
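A small sketch of softmax in the same style (subtracting the max is a common numerical-stability trick, not part of the definition):

```python
import numpy as np

def softmax(x):
    # Exponentiate (shifted by the max for stability), then normalize so the outputs sum to 1
    e = np.exp(x - np.max(x))
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # ~[0.659, 0.242, 0.099]
```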
Tanh
- Like sigmoid but ranges from -1 to 1.
- Advantage over sigmoid: negative inputs are mapped strongly negative and 0 is mapped to 0, so the output is zero-centered.
- Mainly used for classification between two classes.
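Tanh is built into NumPy; a quick check of the range and the zero-centering:

```python
import numpy as np

# tanh maps inputs into (-1, 1) and maps 0 to 0
print(np.tanh(np.array([-2.0, 0.0, 2.0])))  # ~[-0.964, 0.0, 0.964]
```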
ReLU
- Very popular right now (used for deep learning).
- Range: 0 to infinity
- Disadvantage: all negative values become zero, which decreases the ability of the model to fit or train from the data.
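A one-line NumPy sketch of ReLU, showing how every negative input collapses to zero:

```python
import numpy as np

def relu(x):
    # max(0, x): positives pass through unchanged, negatives become 0
    return np.maximum(0.0, x)

print(relu(np.array([-3.0, 0.0, 4.0])))  # [0. 0. 4.]
```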
Leaky ReLU
- Attempts to solve the “dying” ReLU problem.
- Usually uses f(x) = 0.01x for negative inputs (the left side of the graph).
- Range: -infinity to infinity
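A sketch of Leaky ReLU, assuming the common 0.01 slope for negative inputs:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Like ReLU for x >= 0, but keeps a small slope (alpha * x) for x < 0 so the unit doesn't "die"
    return np.where(x >= 0, x, alpha * x)

print(leaky_relu(np.array([-3.0, 0.0, 4.0])))  # [-0.03  0.    4.  ]
```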