Adaptive ELU

Module for the Adaptive ELU activation function.

class AdaptiveELU(alpha=None, beta=None, gamma=None, fixed=None)

Bases: BaseAdaptiveFunction

Adaptive, trainable variant of the ELU activation.

This module extends the standard ELU by introducing learnable scaling and shifting parameters applied to both the input and the output.

Given the function \(\text{ELU}:\mathbb{R}^n\rightarrow\mathbb{R}^n\), the corresponding adaptive activation \(\text{ELU}_{\text{adaptive}}:\mathbb{R}^n\rightarrow\mathbb{R}^n\) is defined as:

\[\text{ELU}_{\text{adaptive}}({x}) = \alpha\,\text{ELU}(\beta{x}+\gamma),\]

where \(\alpha\), \(\beta\), and \(\gamma\) are trainable parameters controlling output scaling, input scaling, and input shifting, respectively.

The ELU function is defined elementwise as:

\[\begin{split}\text{ELU}(x) = \begin{cases} x, & \text{ if }x > 0\\ \exp(x) - 1, & \text{ if }x \leq 0 \end{cases}\end{split}\]
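The two definitions above can be checked numerically with a minimal scalar sketch. It uses only the Python standard library and is not the module's implementation; the helper names elu and adaptive_elu are hypothetical and exist only for illustration.

```python
import math


def elu(x: float) -> float:
    """Standard ELU: identity for x > 0, exp(x) - 1 otherwise."""
    # math.expm1(x) computes exp(x) - 1 accurately for small |x|.
    return x if x > 0 else math.expm1(x)


def adaptive_elu(x: float, alpha: float = 1.0, beta: float = 1.0,
                 gamma: float = 0.0) -> float:
    """Evaluate alpha * ELU(beta * x + gamma) for a scalar input."""
    return alpha * elu(beta * x + gamma)


# With alpha=1, beta=1, gamma=0 the adaptive form reduces to plain ELU.
plain = [adaptive_elu(x) for x in (-1.0, 0.0, 2.0)]

# Input scaling and shifting happen before ELU, output scaling after it:
# beta * x + gamma = 3 * 1 - 1 = 2, ELU(2) = 2, alpha * 2 = 4.
scaled = adaptive_elu(1.0, alpha=2.0, beta=3.0, gamma=-1.0)
```

With the default values the adaptive function coincides with the standard ELU, which is consistent with the documented initialization of alpha and beta to 1 and gamma to 0.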

See also

Original reference: Godfrey, L. B., Gashler, M. S. (2015). A continuum among logarithmic, linear, and exponential functions, and its potential to improve generalization in neural networks. 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K), Vol. 1. arXiv preprint arXiv:1602.01321.

Original reference: Jagtap, A. D., Karniadakis, G. E. (2020). Adaptive activation functions accelerate convergence in deep and physics-informed neural networks. Journal of Computational Physics, 404. DOI: JCP 10.1016.

Initialization of the AdaptiveELU class.

Parameters:
  • alpha (int | float) – The output scaling parameter of the adaptive function. If None, it is initialized to 1. Default is None.

  • beta (int | float) – The input scaling parameter of the adaptive function. If None, it is initialized to 1. Default is None.

  • gamma (int | float) – The input shifting parameter of the adaptive function. If None, it is initialized to 0. Default is None.

  • fixed (str | list[str]) – The names of parameters to keep fixed during training. These parameters will not be optimized and will have requires_grad=False. Available options are "alpha", "beta", and "gamma". If None, all parameters are trainable. Default is None.

Raises:
  • ValueError – If alpha, when provided, is not a number.

  • ValueError – If beta, when provided, is not a number.

  • ValueError – If gamma, when provided, is not a number.

  • ValueError – If fixed, when provided, is neither a string nor a list of strings.

  • ValueError – If fixed contains invalid parameter names.
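The documented defaults and error conditions can be summarized in a short sketch. This is an assumption about how such checks might be written, not the class's actual code: resolve_arguments, VALID_NAMES, and DEFAULTS are hypothetical names, and the error messages are illustrative.

```python
VALID_NAMES = ("alpha", "beta", "gamma")
DEFAULTS = {"alpha": 1.0, "beta": 1.0, "gamma": 0.0}


def resolve_arguments(alpha=None, beta=None, gamma=None, fixed=None):
    """Hypothetical helper mirroring the documented defaults and checks."""
    values = {}
    for name, value in (("alpha", alpha), ("beta", beta), ("gamma", gamma)):
        if value is None:
            # None falls back to the documented default (1, 1, 0).
            values[name] = DEFAULTS[name]
        elif not isinstance(value, (int, float)):
            raise ValueError(f"{name} must be a number.")
        else:
            values[name] = float(value)

    # Accept a single name or a list of names; None means nothing is fixed.
    if fixed is None:
        fixed = []
    elif isinstance(fixed, str):
        fixed = [fixed]
    elif not (isinstance(fixed, list)
              and all(isinstance(item, str) for item in fixed)):
        raise ValueError("fixed must be a string or a list of strings.")

    invalid = [name for name in fixed if name not in VALID_NAMES]
    if invalid:
        raise ValueError(f"Invalid parameter names in fixed: {invalid}.")

    # A fixed parameter is the one that would get requires_grad=False.
    trainable = {name: name not in fixed for name in VALID_NAMES}
    return values, trainable


values, trainable = resolve_arguments(beta=2, fixed="alpha")
```

In this sketch, passing fixed="alpha" marks only alpha as non-trainable, while beta and gamma remain trainable; a name outside "alpha", "beta", and "gamma" raises ValueError.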