DeepEnsembleSupervisedSolver

class DeepEnsembleSupervisedSolver(problem, models, loss=None, optimizers=None, schedulers=None, weighting=None, use_lt=False, ensemble_dim=0)

Bases: SupervisedSolverInterface, DeepEnsembleSolverInterface

Deep Ensemble Supervised Solver class. This class implements a Deep Ensemble Supervised Solver using user-specified models to solve a specific problem.

An ensemble model is constructed by combining multiple models that solve the same type of problem. Mathematically, this creates an implicit distribution \(p(\mathbf{u} \mid \mathbf{s})\) over the possible outputs \(\mathbf{u}\), given the original input \(\mathbf{s}\). The models \(\mathcal{M}_{i\in (1,\dots,r)}\) in the ensemble work collaboratively to capture different aspects of the data or task, with each model contributing a distinct prediction \(\mathbf{y}_{i}=\mathcal{M}_i(\mathbf{u} \mid \mathbf{s})\). By aggregating these predictions, the ensemble can be more robust and accurate than its individual members, because the diversity of the models reduces overfitting and improves generalization. Furthermore, statistical metrics can be computed, e.g. the ensemble mean and variance:

\[\mathbf{\mu} = \frac{1}{r}\sum_{i=1}^r \mathbf{y}_{i}\]
\[\mathbf{\sigma^2} = \frac{1}{r}\sum_{i=1}^r (\mathbf{y}_{i} - \mathbf{\mu})^2\]
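
The two statistics above can be sketched in plain PyTorch. This is only an illustration, not the solver's internal code: the three `torch.nn.Linear` members, the input shape, and the variable names are assumptions made for the example. The sums are normalised by the number of ensemble members \(r\).

```python
import torch

torch.manual_seed(0)

# Hypothetical ensemble of r = 3 small networks (2 inputs -> 1 output).
r = 3
models = [torch.nn.Linear(2, 1) for _ in range(r)]

# Five input samples s.
s = torch.rand(5, 2)

with torch.no_grad():
    # Stack the r member predictions y_i along dimension 0 -> shape (r, 5, 1).
    y = torch.stack([m(s) for m in models], dim=0)

# Ensemble mean: (1/r) * sum_i y_i
mu = y.mean(dim=0)

# Ensemble variance: (1/r) * sum_i (y_i - mu)^2
sigma2 = ((y - mu) ** 2).mean(dim=0)
```

The variance here is the population (biased) variance, matching the formula; `y.var(dim=0, unbiased=False)` gives the same result.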

During training, each model in the ensemble minimizes the supervised loss:

\[\mathcal{L}_{\rm{problem}} = \frac{1}{N}\sum_{i=1}^N \mathcal{L}(\mathbf{u}_i - \mathcal{M}_{j}(\mathbf{s}_i)), \quad j \in (1,\dots,r)\]

where \(\mathcal{L}\) is a specific loss function, typically the MSE:

\[\mathcal{L}(v) = \| v \|^2_2.\]

In this context, the index \(i\) on \(\mathbf{u}_i\) and \(\mathbf{s}_i\) runs over the \(N\) observations: the goal is to approximate multiple (discretised) functions \(\mathbf{u}_i\) given multiple (discretised) input functions \(\mathbf{s}_i\).
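
As a rough illustration of the loss above, the sketch below evaluates one MSE loss per ensemble member on the same batch. The member networks, data shapes, and names are hypothetical, and the snippet does not reproduce the solver's actual training loop.

```python
import torch

torch.manual_seed(0)

# Hypothetical ensemble members M_j and the default-style MSE loss.
models = [torch.nn.Linear(2, 1) for _ in range(3)]
loss_fn = torch.nn.MSELoss()

# N = 8 observations: inputs s_i and targets u_i.
s = torch.rand(8, 2)
u = torch.rand(8, 1)

# One scalar loss per member, each averaged over the N observations.
losses = torch.stack([loss_fn(m(s), u) for m in models])

# Summing before backward still gives each member gradients that depend
# only on its own loss, since the members share no parameters.
losses.sum().backward()
```

Because the members share no parameters, each one is effectively trained on its own; in the setting of Lakshminarayanan et al., diversity between members comes from factors such as different random initializations.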

See also

Original reference: Lakshminarayanan, B., Pritzel, A., & Blundell, C. (2017). Simple and scalable predictive uncertainty estimation using deep ensembles. Advances in Neural Information Processing Systems, 30. arXiv:1612.01474.

Initialization of the DeepEnsembleSupervisedSolver class.

Parameters:
  • problem (AbstractProblem) – The problem to be solved.

  • models (torch.nn.Module) – The neural network models to be used.

  • loss (torch.nn.Module) – The loss function to be minimized. If None, the torch.nn.MSELoss loss is used. Default is None.

  • optimizers (Optimizer) – The optimizers to be used. If None, the torch.optim.Adam optimizer is used. Default is None.

  • schedulers (Scheduler) – The learning rate schedulers. If None, the torch.optim.lr_scheduler.ConstantLR scheduler is used. Default is None.

  • weighting (WeightingInterface) – The weighting schema to be used. If None, no weighting schema is used. Default is None.

  • use_lt (bool) – If True, the solver uses LabelTensors as input. Default is False.

  • ensemble_dim (int) – The dimension along which the ensemble outputs are stacked. Default is 0.
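
To make the role of ensemble_dim more concrete, the sketch below shows how the stacking dimension changes the shape of the combined output. It assumes the stacking behaves like `torch.stack`, which the text above does not confirm; the tensors and names are hypothetical.

```python
import torch

torch.manual_seed(0)

# Three hypothetical member outputs, each of shape (batch=4, features=2).
outputs = [torch.rand(4, 2) for _ in range(3)]

# Stacking along dimension 0 puts the ensemble axis first ...
stacked_first = torch.stack(outputs, dim=0)   # shape (3, 4, 2)

# ... while stacking along dimension 1 places it after the batch axis.
stacked_second = torch.stack(outputs, dim=1)  # shape (4, 3, 2)

# The ensemble mean is the same either way, provided the reduction
# is taken over the ensemble axis.
mean_first = stacked_first.mean(dim=0)
mean_second = stacked_second.mean(dim=1)
```

Whichever dimension is chosen, downstream statistics have to reduce over that same axis.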

loss_data(input, target)

Compute the data loss for the DeepEnsembleSupervisedSolver by evaluating the loss between each model's output and the true solution. This method should not be overridden unless that is done intentionally.

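Parameters:
  • input – The input to the neural network models.

  • target – The true solution to compare with the models' outputs.

Returns: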

The supervised loss, averaged over the number of observations.

Return type:

torch.Tensor