Activation Function
- Determines a neuron’s output value and whether the neuron is considered “activated”.
- Enables neural networks to learn nonlinear relationships and make predictions on unseen data.
- Common choices (sigmoid, tanh, ReLU, softmax) have different output ranges and suit different tasks.
Definition
Activation functions determine the output of a node (neuron) in a neural network and whether that neuron should be activated, based on the input it receives.
Explanation
Activation functions are essential components of neural networks because they decide a neuron’s output for a given input and thereby allow the network to learn from data and make predictions on unseen inputs. Different activation functions have distinct characteristics (such as output range and computational behavior) that make them more or less suitable for particular tasks.
Common activation functions include:
- Sigmoid: produces outputs between 0 and 1.
- Tanh (hyperbolic tangent): produces outputs between -1 and 1.
- ReLU (Rectified Linear Unit): outputs the maximum of 0 and the input.
- Softmax: produces a probability distribution over classes.
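To make this concrete, here is a minimal sketch (not from the source; the inputs, weights, and bias are made-up values) of a single neuron that computes a weighted sum of its inputs and passes it through an activation function:

```python
import numpy as np

def neuron_output(x, w, b, activation):
    """Weighted sum of inputs plus bias, passed through an activation function."""
    z = np.dot(w, x) + b   # pre-activation value
    return activation(z)   # the activation function decides the neuron's output

x = np.array([0.5, -1.2, 3.0])   # made-up inputs
w = np.array([0.8, 0.1, -0.4])   # made-up weights
b = 0.2                          # made-up bias

relu = lambda z: np.maximum(0.0, z)
print(neuron_output(x, w, b, np.tanh))  # output in (-1, 1)
print(neuron_output(x, w, b, relu))     # output >= 0
```

Swapping the activation changes the neuron’s output range without changing the weights, which is one reason the choice of activation matters for the task at hand.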
Examples
Sigmoid
- Usage: binary classification tasks.
- Outputs: between 0 and 1.
- Numerical examples: input -10 → output ≈ 0; input 10 → output ≈ 1.
- Formula: σ(x) = 1 / (1 + e^(-x))
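As a quick check (a minimal sketch, not from the source), the numerical examples above can be reproduced with NumPy:

```python
import numpy as np

def sigmoid(x):
    # 1 / (1 + e^(-x)); the output always lies strictly between 0 and 1
    return 1.0 / (1.0 + np.exp(-x))

print(sigmoid(-10.0))  # ~4.5e-05, effectively 0
print(sigmoid(10.0))   # ~0.99995, effectively 1
```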
Tanh
- Usage: classification and regression tasks.
- Outputs: between -1 and 1.
- Numerical examples: input -10 → output ≈ -1; input 10 → output ≈ 1.
- Formula: tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x))
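A minimal check of the tanh examples above (not from the source), using NumPy’s built-in np.tanh:

```python
import numpy as np

# tanh squashes its input into the range (-1, 1)
print(np.tanh(-10.0))  # ~-1.0
print(np.tanh(10.0))   # ~1.0
```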
ReLU
- Usage: regression and classification tasks.
- Behavior: outputs the maximum of 0 and the input.
- Numerical examples: input -10 → output 0; input 10 → output 10.
- Formula: ReLU(x) = max(0, x)
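A minimal ReLU sketch (not from the source) matching the examples above:

```python
import numpy as np

def relu(x):
    # ReLU(x) = max(0, x): negative inputs map to 0, positive inputs pass through
    return np.maximum(0.0, x)

print(relu(-10.0))  # 0.0
print(relu(10.0))   # 10.0
```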
Softmax
- Usage: classification tasks.
- Behavior: outputs the probability of each class in a classification task.
- Numerical example: with three classes, softmax outputs a probability for each class, and the three probabilities sum to 1.
- Formula: softmax(x_i) = e^(x_i) / Σ_j e^(x_j)
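A minimal softmax sketch for three classes (not from the source; the scores are made-up values):

```python
import numpy as np

def softmax(x):
    # Subtracting the max is a common numerical-stability trick; it does not
    # change the result because softmax is shift-invariant.
    e = np.exp(x - np.max(x))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])  # made-up scores for three classes
probs = softmax(scores)
print(probs)        # ~[0.66, 0.24, 0.10]
print(probs.sum())  # 1.0 -- a valid probability distribution over the classes
```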
Use cases
- Sigmoid: binary classification.
- Tanh: classification and regression.
- ReLU: regression and classification.
- Softmax: multi-class classification (outputs class probabilities).
Notes or pitfalls
- Sigmoid: useful for binary classification but not suitable for tasks with multiple classes.
- Tanh: can suffer from vanishing gradients, which can slow or stall training.
- ReLU: computationally efficient but can suffer from the dying ReLU problem, where a neuron’s output gets stuck at 0 for every input (see the sketch after this list, which also illustrates vanishing gradients).
- Softmax: suitable for classification tasks but not for regression tasks.
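A small sketch (not from the source; the inputs are made-up) that illustrates the two gradient-related pitfalls numerically: the tanh gradient shrinks toward 0 for large-magnitude inputs (vanishing gradients), and the ReLU gradient is exactly 0 for negative inputs, so a neuron stuck in that region stops updating (dying ReLU):

```python
import numpy as np

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2
    return 1.0 - np.tanh(x) ** 2

def relu_grad(x):
    # Derivative of ReLU: 1 for positive inputs, 0 otherwise
    return (x > 0).astype(float)

xs = np.array([-10.0, -1.0, 0.5, 10.0])
print(tanh_grad(xs))  # [~0, 0.42, 0.79, ~0] -> near-zero at the extremes
print(relu_grad(xs))  # [0, 0, 1, 1]         -> zero gradient for negative inputs
```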
Related terms
- Sigmoid
- Tanh
- ReLU (Rectified Linear Unit)
- Softmax
- Vanishing gradients
- Dying ReLU