16 Feb 2023

Activation Functions in Machine Learning

An activation function is used to compute the output of a node in a neural network. It is also known as a transfer function.

Why do we use them?

  • Determine the output of each node in a neural network.
  • Map the resulting values into a bounded range, such as 0 to 1 or -1 to 1.

Linear/Identity Activation Function

  • Output is not confined to any range.

  • f(x) = x
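
A minimal sketch in Python (assuming NumPy as the array library) of the identity activation:

```python
import numpy as np

def linear(x):
    # Identity activation: returns the input unchanged, so the
    # output is not restricted to any range.
    return x

print(linear(np.array([-2.0, 0.0, 3.5])))  # [-2.   0.   3.5]
```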

Sigmoid (Logistic Activation Function)

  • Range: 0 to 1
  • Used for models that predict a probability as their output.
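
A quick sketch of the sigmoid, again assuming NumPy:

```python
import numpy as np

def sigmoid(x):
    # Logistic sigmoid: squashes any real input into the range (0, 1).
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(np.array([-4.0, 0.0, 4.0])))  # approx [0.018, 0.5, 0.982]
```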

Softmax

A generalization of the logistic (sigmoid) activation function that maps a vector of scores to a probability distribution; used for multiclass classification.
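
As an illustrative sketch (NumPy assumed), softmax turns a vector of scores into probabilities that sum to 1:

```python
import numpy as np

def softmax(x):
    # Subtracting the max keeps the exponentials from overflowing;
    # it does not change the result.
    exps = np.exp(x - np.max(x))
    return exps / np.sum(exps)

print(softmax(np.array([1.0, 2.0, 3.0])))  # approx [0.09, 0.245, 0.665]
```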

Tanh

  • Like sigmoid but ranges from -1 to 1.
  • Advantage over sigmoid: negative inputs are mapped strongly negative and an input of 0 is mapped to 0 (the output is zero-centered).
  • Mainly used for classification between two classes.
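
A small sketch (NumPy assumed) showing the zero-centered range of tanh:

```python
import numpy as np

def tanh(x):
    # Hyperbolic tangent: squashes input into (-1, 1) and maps 0 to 0.
    return np.tanh(x)

print(tanh(np.array([-2.0, 0.0, 2.0])))  # approx [-0.964, 0.0, 0.964]
```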

ReLU

  • Very popular right now (used for deep learning).
  • Range: 0 to infinity
  • Disadvantage: all negative inputs become zero, which reduces the model's ability to fit or learn from the data (the “dying” ReLU problem).
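
A minimal sketch of ReLU (NumPy assumed); note how every negative input collapses to zero:

```python
import numpy as np

def relu(x):
    # ReLU: keeps positive values, maps all negative values to 0.
    return np.maximum(0.0, x)

print(relu(np.array([-3.0, 0.0, 2.5])))  # [0.  0.  2.5]
```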

Leaky ReLU

  • Attempts to solve the “dying” ReLU problem.
  • Usually uses f(x) = 0.01x for negative inputs (the left side of the graph).
  • Range: -infinity to infinity
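
A sketch of Leaky ReLU (NumPy assumed); the slope of 0.01 for negative inputs is the usual default, not a fixed rule:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    # Leaky ReLU: keeps a small slope (alpha) for negative inputs
    # instead of zeroing them out, so gradients still flow there.
    return np.where(x > 0, x, alpha * x)

print(leaky_relu(np.array([-3.0, 0.0, 2.5])))  # [-0.03  0.    2.5 ]
```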