Utilities
Utility functions module.
CrossValidation.fit | Uses Gaussian process regression to build the response surface as a side effect.
CrossValidation.predict | Predict method of cross-validation.
CrossValidation.run | Run the k-fold cross-validation procedure.
CrossValidation.scorer | Score function of cross-validation.
Normalizer.fit_transform | Return corresponding points shifted and scaled to [-1, 1]^n_params.
Normalizer.inverse_transform | Return corresponding points shifted and scaled to [self.lb, self.ub].
average_rrmse | Objective function to be optimized during the tuning process of the method tune_pr_matrix().
initialize_weights | Initialize uniform weights for simple Monte Carlo method or linear regression in local linear gradients.
linear_program_ineq | Solves an inequality-constrained linear program.
local_linear_gradients | Estimate a collection of gradients from input/output pairs.
rrmse | Evaluates the relative root mean square error.
sort_eigpairs | Sort eigenpairs.
-
class
CrossValidation
(inputs, outputs, gradients, subspace, folds=5, **kwargs)[source] Bases:
object
Class to perform k-fold cross-validation when tuning hyperparameters for the design of a response surface with ActiveSubspaces or KernelActiveSubspaces. It is used in particular when tuning the parameters of the spectral distribution of the feature map, inside the objective function average_rrmse. The default score is the relative root mean square error (rrmse).
- Parameters
inputs (numpy.ndarray) – n_samples-by-input_dim input matrix.
outputs (numpy.ndarray) – n_samples-by-output_dim output matrix.
gradients (numpy.ndarray) – n_samples-by-output_dim-by-input_dim gradients matrix.
subspace (Subspaces) – ActiveSubspaces or KernelActiveSubspaces object from which the response surface is evaluated. The dimension of the response surface is specified in the subspace.dim attribute.
folds (int) – number of folds of the cross-validation procedure.
kwargs (dict) – additional parameters organized in a dictionary to pass to the subspace.fit method, for example ‘weights’ or ‘metric’.
- Variables
gp (sklearn.gaussian_process.GaussianProcessRegressor) – Gaussian process of the response surface built with scikit-learn.
-
fit
(inputs, gradients, outputs)[source] Uses Gaussian process regression to build the response surface as a side effect. The dimension of the response surface is specified in the attribute self.ss.dim.
- Parameters
inputs (numpy.ndarray) – n_samples-by-input_dim input matrix.
outputs (numpy.ndarray) – n_samples-by-output_dim output matrix.
gradients (numpy.ndarray) – n_samples-by-output_dim-by-input_dim gradients matrix.
-
predict
(inputs)[source] Predict method of cross-validation.
- Parameters
inputs (numpy.ndarray) – n_samples-by-input_dim input matrix.
- Returns
n_samples-by-dim prediction of the surrogate response surface model at the inputs. The value dim corresponds to self.ss.dim.
- Return type
-
run
()[source] Run the k-fold cross-validation procedure. In each fold, a fit is performed and the score is evaluated.
- Returns
mean and standard deviation of the scores.
- Return type
list of two numpy.ndarray.
-
scorer
(inputs, outputs)[source] Score function of cross-validation.
- Parameters
inputs (numpy.ndarray) – n_samples-by-input_dim input matrix.
outputs (numpy.ndarray) – n_samples-by-output_dim output matrix.
- Returns
relative root mean square error between the predictions at inputs and the outputs.
- Return type
np.float64
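
The k-fold procedure of run can be sketched as follows. This is a minimal illustration, not the class implementation: the helper names k_fold_scores and rrmse_score are hypothetical, an affine least-squares fit stands in for the Gaussian process response surface, and the score assumes the rrmse is the root mean square error divided by the standard deviation of the targets.

```python
import numpy as np


def rrmse_score(predictions, targets):
    # Assumed definition: RMSE divided by the standard deviation of the targets.
    rmse = np.sqrt(np.mean((predictions - targets) ** 2))
    return rmse / np.std(targets)


def k_fold_scores(inputs, outputs, folds=5, seed=0):
    """Return the mean and standard deviation of the per-fold scores."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(inputs.shape[0])
    scores = []
    for validation_idx in np.array_split(indices, folds):
        train_idx = np.setdiff1d(indices, validation_idx)
        # Stand-in surrogate: affine least-squares fit (not a Gaussian process).
        design = np.hstack([inputs[train_idx], np.ones((train_idx.size, 1))])
        coeffs, *_ = np.linalg.lstsq(design, outputs[train_idx], rcond=None)
        # Score the fitted model on the held-out fold only.
        val_design = np.hstack(
            [inputs[validation_idx], np.ones((validation_idx.size, 1))]
        )
        scores.append(rrmse_score(val_design @ coeffs, outputs[validation_idx]))
    return np.mean(scores), np.std(scores)


rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=(100, 3))
y = x @ np.array([[1.0], [2.0], [-0.5]]) + 0.01 * rng.standard_normal((100, 1))
mean_score, std_score = k_fold_scores(x, y, folds=5)
```

Each sample is used for validation exactly once, so the mean score summarizes the error on data the surrogate did not see during fitting.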
-
class
Normalizer
(lb, ub)[source] Bases:
object
A class for normalizing and unnormalizing bounded inputs.
- Parameters
lb (numpy.ndarray) – array n_params-by-1 that contains lower bounds on the simulation inputs.
ub (numpy.ndarray) – array n_params-by-1 that contains upper bounds on the simulation inputs.
-
fit_transform
(inputs)[source] Return corresponding points shifted and scaled to [-1, 1]^n_params.
- Parameters
inputs (numpy.ndarray) – contains all input points to normalize. The shape is n_samples-by-n_params. The components of each row of inputs should be between self.lb and self.ub.
- Returns
the normalized inputs. The components of each row should be between -1 and 1.
- Return type
-
inverse_transform
(inputs)[source] Return corresponding points shifted and scaled to [self.lb, self.ub].
- Parameters
inputs (numpy.ndarray) – contains all input points to unnormalize. The shape is n_samples-by-n_params. The components of each row of inputs should be between -1 and 1.
- Returns
the unnormalized inputs. The components of each row should be between self.lb and self.ub.
- Return type
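
The two transforms amount to an affine map and its inverse. A minimal sketch, assuming lb and ub can be flattened to one bound per parameter; the function names normalize and unnormalize are hypothetical and are not the class methods themselves.

```python
import numpy as np


def normalize(inputs, lb, ub):
    # Map each row from [lb, ub] to [-1, 1]^n_params.
    lb = np.asarray(lb, dtype=float).reshape(1, -1)
    ub = np.asarray(ub, dtype=float).reshape(1, -1)
    return 2.0 * (inputs - lb) / (ub - lb) - 1.0


def unnormalize(inputs, lb, ub):
    # Inverse map: from [-1, 1]^n_params back to [lb, ub].
    lb = np.asarray(lb, dtype=float).reshape(1, -1)
    ub = np.asarray(ub, dtype=float).reshape(1, -1)
    return (ub - lb) * (inputs + 1.0) / 2.0 + lb


points = np.array([[0.0, 10.0], [2.0, 20.0], [1.0, 15.0]])
scaled = normalize(points, lb=[0.0, 10.0], ub=[2.0, 20.0])
restored = unnormalize(scaled, lb=[0.0, 10.0], ub=[2.0, 20.0])
```

Lower bounds map to -1, upper bounds to 1, and midpoints to 0; applying the inverse recovers the original points.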
-
average_rrmse
(hyperparams, best, csv, verbose=False, resample=5)[source] Objective function to be optimized during the tuning process of the method tune_pr_matrix(). The optimal hyperparameters of the spectral distribution are searched for in a domain logarithmically scaled in base 10. For each call of average_rrmse() by the optimizer, the same hyperparameter is tested in two nested procedures: in the external procedure the projection matrix is resampled a number of times specified by the resample parameter; in the internal procedure the relative root mean squared error (rrmse()) is evaluated as the mean over the folds of a k-fold cross-validation procedure. The score of a single fold of this cross-validation procedure is the rrmse on the validation set of the predictions of the response surface built with a Subspace object on the training set.
- Parameters
hyperparams (list) – base-10 logarithm of the parameter of the spectral distribution passed to average_rrmse by the optimizer.
csv (CrossValidation) – CrossValidation object which contains the same Subspace object and the inputs, outputs, gradients datasets.
best (list) – list that records the best score and the best projection matrix. The initial values are 0.8 and an n_features-by-input_dim numpy.ndarray of zeros.
resample (int) – number of times the projection matrix is resampled from the same spectral distribution with the same hyperparameter.
verbose (bool) – True to print the score for each resample.
- Returns
minimum of the scores evaluated for the same hyperparameter and a specified number of resamples of the projection matrix.
- Return type
numpy.float64
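
The nested structure described above can be sketched as follows. This is an illustration under stated assumptions, not the function's implementation: min_score_over_resamples and score_fn are hypothetical names, the projection matrix is drawn from a zero-mean normal distribution whose scale is the hyperparameter (an assumption about the spectral distribution), and score_fn stands in for the mean k-fold rrmse returned by the cross-validation object.

```python
import numpy as np


def min_score_over_resamples(log10_hyperparam, score_fn, best, resample=5, seed=0):
    """Return the minimum score over several resamples of the projection matrix."""
    rng = np.random.default_rng(seed)
    # The optimizer works in a log10-scaled domain, so undo the logarithm first.
    scale = 10.0 ** log10_hyperparam[0]
    scores = []
    for _ in range(resample):
        # External procedure: resample the projection matrix (assumed normal).
        pr_matrix = rng.normal(0.0, scale, size=best[1].shape)
        # Internal procedure: score_fn stands in for the mean k-fold rrmse.
        score = score_fn(pr_matrix)
        scores.append(score)
        # Record the best score and projection matrix seen so far.
        if score < best[0]:
            best[0], best[1] = score, pr_matrix
    return min(scores)


def toy_score(pr_matrix):
    # Toy stand-in score, used only to make the sketch runnable.
    return float(np.abs(np.mean(pr_matrix ** 2) - 1.0))


best = [0.8, np.zeros((4, 3))]
value = min_score_over_resamples([0.0], toy_score, best, resample=5)
```

Returning the minimum rather than the mean over the resamples rewards a hyperparameter as soon as one sampled projection matrix performs well, and best keeps that matrix for later use.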
-
initialize_weights
(matrix)[source] Initialize uniform weights for simple Monte Carlo method or linear regression in local linear gradients.
- Parameters
matrix (numpy.ndarray) – matrix whose shape[0] value gives the dimension of the weights to be computed.
- Returns
weights
- Return type
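
A minimal sketch of uniform weights, assuming they are returned as a column vector that sums to one (both the shape and the normalization are assumptions; uniform_weights is a hypothetical name):

```python
import numpy as np


def uniform_weights(matrix):
    # One equal weight per row of the matrix, normalized to sum to one.
    n_samples = matrix.shape[0]
    return np.ones((n_samples, 1)) / n_samples


weights = uniform_weights(np.zeros((4, 2)))
```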
-
linear_program_ineq
(c, A, b)[source] Solves an inequality-constrained linear program. This method returns the minimizer of the following linear program.
minimize c^T x subject to A x >= b
- Parameters
c (numpy.ndarray) – coefficients vector of the linear objective function to be minimized.
A (numpy.ndarray) – 2-D array which, when matrix-multiplied by x, gives the values of the lower-bound inequality constraints at x.
b (numpy.ndarray) – 1-D array of values representing the lower-bound of each inequality constraint (row) in A.
- Returns
the independent variable vector which minimizes the linear programming problem.
- Return type
- Raises
RuntimeError
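
One way to solve this program is with scipy.optimize.linprog, sketched below under the assumption that the variables are otherwise unbounded. The name solve_lp_ineq is hypothetical, and this is not necessarily how the function is implemented.

```python
import numpy as np
from scipy.optimize import linprog


def solve_lp_ineq(c, A, b):
    c = np.asarray(c, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    A = np.asarray(A, dtype=float)
    # linprog expects A_ub x <= b_ub, so A x >= b is rewritten as -A x <= -b.
    # bounds=(None, None) leaves every variable free; the default is x >= 0.
    result = linprog(c, A_ub=-A, b_ub=-b, bounds=(None, None))
    if not result.success:
        raise RuntimeError(result.message)
    return result.x


# minimize x0 + x1 subject to x0 >= 1 and x1 >= 2
x_opt = solve_lp_ineq([1.0, 1.0], np.eye(2), [1.0, 2.0])
```

An infeasible or unbounded program makes linprog report failure, which the sketch turns into a RuntimeError.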
-
local_linear_gradients
(inputs, outputs, weights=None, n_neighbors=None)[source] Estimate a collection of gradients from input/output pairs.
Given a set of input/output pairs, choose subsets of neighboring points and build a local linear model for each subset. The gradients of these local linear models comprise estimates of sampled gradients.
- Parameters
inputs (numpy.ndarray) – M-by-m matrix that contains the m-dimensional inputs
outputs (numpy.ndarray) – M-by-1 matrix that contains scalar outputs
weights (numpy.ndarray) – M-by-1 matrix that contains the weights for each observation (default None)
n_neighbors (int) – how many nearest neighbors to use when constructing the local linear model. The default value is floor(1.7*m).
- Returns
M-by-m matrix that contains estimated partial derivatives approximated by the local linear models; the corresponding new inputs
- Return type
- Raises
ValueError, TypeError
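
The idea can be sketched with an unweighted least-squares fit on each neighborhood. This simplified illustration is not the function's implementation: local_gradients is a hypothetical name, all observations carry equal weight, and a gradient is estimated at every input point, so no separate set of new inputs is returned.

```python
import numpy as np


def local_gradients(inputs, outputs, n_neighbors=None):
    n_samples, dim = inputs.shape
    if n_neighbors is None:
        n_neighbors = int(np.floor(1.7 * dim))
    # A local affine model has dim + 1 coefficients, so it needs more points.
    if not dim < n_neighbors <= n_samples:
        raise ValueError("n_neighbors must be in (input_dim, n_samples]")
    gradients = np.empty((n_samples, dim))
    for i in range(n_samples):
        # Indices of the n_neighbors points closest to inputs[i], itself included.
        distances = np.sum((inputs - inputs[i]) ** 2, axis=1)
        nearest = np.argsort(distances)[:n_neighbors]
        # Fit outputs ~ intercept + gradient . x on the neighborhood.
        design = np.hstack([np.ones((n_neighbors, 1)), inputs[nearest]])
        coeffs, *_ = np.linalg.lstsq(design, outputs[nearest], rcond=None)
        # Drop the intercept; the remaining coefficients estimate the gradient.
        gradients[i] = coeffs[1:, 0]
    return gradients


rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(30, 2))
f = 3.0 * x[:, [0]] - 2.0 * x[:, [1]] + 1.0
grads = local_gradients(x, f, n_neighbors=6)
```

For an exactly linear function every local model recovers the same gradient, here (3, -2); for a nonlinear function the estimates vary from point to point and their accuracy depends on how close the neighbors are.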
-
rrmse
(predictions, targets)[source] Evaluates the relative root mean square error. It can be vectorized for multidimensional predictions and targets.
- Parameters
predictions (numpy.ndarray) – predictions input.
targets (numpy.ndarray) – targets input.
- Returns
relative root mean squared error
- Return type
np.float64
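
The normalization is not stated here. A minimal sketch, assuming the root mean square error is divided by the standard deviation of the targets and computed separately for each output column (relative_rmse is a hypothetical name):

```python
import numpy as np


def relative_rmse(predictions, targets):
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    # Root mean square error over the samples, one value per output column.
    rmse = np.sqrt(np.mean((predictions - targets) ** 2, axis=0))
    # Assumed normalization: standard deviation of the targets.
    spread = np.sqrt(np.mean((targets - targets.mean(axis=0)) ** 2, axis=0))
    return rmse / spread


targets = np.array([[1.0], [2.0], [3.0]])
perfect = relative_rmse(targets, targets)
baseline = relative_rmse(np.full_like(targets, 2.0), targets)
```

Under this definition a perfect prediction scores 0, and always predicting the mean of the targets scores 1, so values below 1 indicate a model that is better than the constant baseline.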
-
sort_eigpairs
(evals, evects)[source] Sort eigenpairs.
- Parameters
evals (numpy.ndarray) – eigenvalues.
evects (numpy.ndarray) – eigenvectors.
- Returns
vector of sorted eigenvalues; orthogonal matrix of corresponding eigenvectors.
- Return type
Note
Eigenvectors are unique up to a sign. We make the choice to normalize the eigenvectors so that the first component of each eigenvector is positive. This normalization is very helpful for the bootstrapping.
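
The sorting and the sign convention from the note can be sketched as follows, assuming eigenvalues are ordered from largest to smallest and eigenvectors are stored as columns (both are assumptions; sort_eigenpairs is a hypothetical name):

```python
import numpy as np


def sort_eigenpairs(evals, evects):
    # Assumed ordering: largest eigenvalue first.
    order = np.argsort(evals)[::-1]
    evals = evals[order]
    evects = evects[:, order]
    # Flip each column so that its first component is non-negative.
    signs = np.sign(evects[0, :])
    signs[signs == 0] = 1.0
    return evals, evects * signs


matrix = np.array([[2.0, 1.0], [1.0, 3.0]])
evals, evects = sort_eigenpairs(*np.linalg.eigh(matrix))
```

Fixing the sign makes eigenvectors computed from different bootstrap replicates directly comparable, since otherwise v and -v would count as different vectors.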