Utilities
Utility functions module.
CrossValidation.fit | Uses Gaussian process regression to build the response surface as a side effect.
CrossValidation.predict | Predict method of cross-validation.
CrossValidation.run | Run the k-fold cross-validation procedure.
CrossValidation.scorer | Score function of cross-validation.
Normalizer.fit_transform | Return corresponding points shifted and scaled to [-1, 1]^n_params.
Normalizer.inverse_transform | Return corresponding points shifted and scaled to [self.lb, self.ub].
average_rrmse | Objective function to be optimized during the tuning process of the method tune_pr_matrix().
initialize_weights | Initialize uniform weights for simple Monte Carlo method or linear regression in local linear gradients.
linear_program_ineq | Solves an inequality-constrained linear program with variable bounds.
local_linear_gradients | Estimate a collection of gradients from input/output pairs.
rrmse | Evaluates the relative root mean square error.
sort_eigpairs | Sort eigenpairs.
- 
class CrossValidation(inputs, outputs, gradients, subspace, folds=5, **kwargs)
Bases: object
Class to perform k-fold cross-validation when tuning hyperparameters for the design of a response surface with ActiveSubspaces or KernelActiveSubspaces. It is used in particular when tuning the parameters of the spectral distribution of the feature map, inside the objective function average_rrmse. The default score is the relative root mean square error (rrmse).
- Parameters
 inputs (numpy.ndarray) – n_samples-by-input_dim input matrix.
outputs (numpy.ndarray) – n_samples-by-output_dim output matrix.
gradients (numpy.ndarray) – n_samples-by-output_dim-by-input_dim gradients matrix.
subspace (Subspaces) – ActiveSubspaces or KernelActiveSubspaces object from which the response surface is evaluated. The dimension of the response surface is specified in the subspace.dim attribute.
folds (int) – number of folds of the cross-validation procedure.
kwargs (dict) – additional parameters, organized in a dictionary, to pass to the subspace.fit method. For example ‘weights’ or ‘metric’.
- Variables
 gp (sklearn.gaussian_process.GaussianProcessRegressor) – Gaussian process of the response surface built with scikit-learn.
- 
fit(inputs, gradients, outputs) Uses Gaussian process regression to build the response surface as a side effect. The dimension of the response surface is specified in the attribute self.ss.dim.
- Parameters
 inputs (numpy.ndarray) – n_samples-by-input_dim input matrix.
outputs (numpy.ndarray) – n_samples-by-output_dim output matrix.
gradients (numpy.ndarray) – n_samples-by-output_dim-by-input_dim gradients matrix.
- 
predict(inputs) Predict method of cross-validation.
- Parameters
 inputs (numpy.ndarray) – n_samples-by-input_dim input matrix.
- Returns
 n_samples-by-dim prediction of the surrogate response surface model at the inputs. The value dim corresponds to self.ss.dim.
- Return type
 numpy.ndarray
- 
run() Run the k-fold cross-validation procedure. In each fold, a fit and an evaluation of the score are computed.
- Returns
 mean and standard deviation of the scores.
- Return type
 list of two numpy.ndarray.
- 
scorer(inputs, outputs) Score function of cross-validation.
- Parameters
 inputs (numpy.ndarray) – n_samples-by-input_dim input matrix.
outputs (numpy.ndarray) – n_samples-by-output_dim output matrix.
- Returns
 relative root mean square error between the predictions at inputs and the outputs.
- Return type
 np.float64
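The procedure described for run() can be sketched in a few lines of NumPy. This is a minimal illustration and not the class's own implementation: the names k_fold_scores, fit_linear, predict_linear, and rmse are hypothetical, a plain least-squares model stands in for the Gaussian process response surface, and gradients are left out for brevity.

```python
import numpy as np

def k_fold_scores(inputs, outputs, fit, predict, score, folds=5, seed=0):
    """Return the mean and standard deviation of the per-fold scores."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(inputs.shape[0])
    scores = []
    for val_idx in np.array_split(indices, folds):
        # Every sample not in the validation fold is used for training.
        train_idx = np.setdiff1d(indices, val_idx)
        model = fit(inputs[train_idx], outputs[train_idx])
        scores.append(score(predict(model, inputs[val_idx]), outputs[val_idx]))
    scores = np.asarray(scores)
    return scores.mean(), scores.std()

# Hypothetical stand-in model: ordinary least squares with an intercept.
def fit_linear(x, y):
    design = np.hstack([np.ones((x.shape[0], 1)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_linear(coef, x):
    return np.hstack([np.ones((x.shape[0], 1)), x]) @ coef

def rmse(pred, target):
    return float(np.sqrt(np.mean((pred - target) ** 2)))

rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=(50, 3))
y = x @ np.array([[1.0], [-2.0], [0.5]]) + 3.0  # exactly linear data
mean_score, std_score = k_fold_scores(x, y, fit_linear, predict_linear, rmse)
```

Because the toy data are exactly linear, both the mean and the standard deviation of the fold scores are numerically zero here; with a real response surface they quantify how well the surrogate generalizes.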
- 
class Normalizer(lb, ub)
Bases: object
A class for normalizing and unnormalizing bounded inputs.
- Parameters
 lb (numpy.ndarray) – array n_params-by-1 that contains lower bounds on the simulation inputs.
ub (numpy.ndarray) – array n_params-by-1 that contains upper bounds on the simulation inputs.
- 
fit_transform(inputs) Return corresponding points shifted and scaled to [-1, 1]^n_params.
- Parameters
 inputs (numpy.ndarray) – contains all input points to normalize. The shape is n_samples-by-n_params. The components of each row of inputs should be between self.lb and self.ub.
- Returns
 the normalized inputs. The components of each row should be between -1 and 1.
- Return type
 numpy.ndarray
- 
inverse_transform(inputs) Return corresponding points shifted and scaled to [self.lb, self.ub].
- Parameters
 inputs (numpy.ndarray) – contains all input points to unnormalize. The shape is n_samples-by-n_params. The components of each row of inputs should be between -1 and 1.
- Returns
 the unnormalized inputs. The components of each row should be between self.lb and self.ub.
- Return type
 numpy.ndarray
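The two transforms are affine maps between [lb, ub] and [-1, 1]^n_params. A minimal sketch, assuming the standard shift-and-scale formulas; the free functions normalize and unnormalize are hypothetical stand-ins for the two methods:

```python
import numpy as np

def normalize(inputs, lb, ub):
    """Map rows of inputs from [lb, ub] to [-1, 1]^n_params."""
    lb = np.reshape(lb, (1, -1))  # n_params-by-1 bounds become a row
    ub = np.reshape(ub, (1, -1))
    return 2.0 * (inputs - lb) / (ub - lb) - 1.0

def unnormalize(inputs, lb, ub):
    """Map rows of inputs from [-1, 1]^n_params back to [lb, ub]."""
    lb = np.reshape(lb, (1, -1))
    ub = np.reshape(ub, (1, -1))
    return (ub - lb) * (inputs + 1.0) / 2.0 + lb

lb = np.array([[0.0], [10.0]])
ub = np.array([[2.0], [20.0]])
points = np.array([[0.0, 10.0], [1.0, 15.0], [2.0, 20.0]])

normalized = normalize(points, lb, ub)        # lower bound, midpoint, upper bound
restored = unnormalize(normalized, lb, ub)    # round trip back to the originals
```

The lower bounds map to -1, the midpoints to 0, and the upper bounds to 1, and applying the inverse recovers the original points.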
- 
average_rrmse(hyperparams, best, csv, verbose=False, resample=5) Objective function to be optimized during the tuning process of the method tune_pr_matrix(). The optimal hyperparameters of the spectral distribution are searched for in a domain logarithmically scaled in base 10. For each call of average_rrmse() by the optimizer, the same hyperparameter is tested in two nested procedures: in the outer procedure, the projection matrix is resampled a number of times specified by the resample parameter; in the inner procedure, the relative root mean square error (rrmse()) is evaluated as the mean over the folds of a k-fold cross-validation procedure. The score of a single fold is the rrmse, on the validation set, of the predictions of the response surface built with a Subspace object on the training set.
- Parameters
 hyperparams (list) – logarithm of the parameter of the spectral distribution passed to average_rrmse by the optimizer.
csv ('CrossValidation') – CrossValidation object which contains the same Subspace object and the inputs, outputs, gradients datasets. The
best (list) – list that records the best score and the best projection matrix. The initial values are 0.8 and an n_features-by-input_dim numpy.ndarray of zeros.
resample (int) – number of times the projection matrix is resampled from the same spectral distribution with the same hyperparameter.
verbose (bool) – True to print the score for each resample.
- Returns
 minimum of the scores evaluated for the same hyperparameter and a specified number of resamples of the projection matrix.
- Return type
 numpy.float64
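The outer loop of this objective can be sketched as follows. This is an illustration of the resample-and-take-the-minimum structure only: min_score_over_resamples, sample_matrix, and cv_mean_score are hypothetical names, and the toy sampling and scoring functions merely stand in for the spectral distribution and the cross-validated rrmse.

```python
import numpy as np

def min_score_over_resamples(log10_param, sample_matrix, cv_mean_score,
                             resample=5, seed=0):
    """Resample the projection matrix and keep the lowest score."""
    rng = np.random.default_rng(seed)
    param = 10.0 ** log10_param  # the optimizer works on a log10 scale
    scores = []
    for _ in range(resample):
        matrix = sample_matrix(param, rng)    # outer procedure: new matrix
        scores.append(cv_mean_score(matrix))  # inner procedure: k-fold mean
    return min(scores)

# Toy stand-ins (assumptions, not the library's distribution or score).
def sample_matrix(scale, rng):
    return rng.normal(0.0, scale, size=(4, 2))

def cv_mean_score(matrix):
    return float(np.abs(matrix).mean())

min_score = min_score_over_resamples(-1.0, sample_matrix, cv_mean_score)
```

Returning the minimum means a hyperparameter is judged by its best resampled projection matrix, which matches the documented return value.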
- 
initialize_weights(matrix) Initialize uniform weights for simple Monte Carlo method or linear regression in local linear gradients.
- Parameters
 matrix (numpy.ndarray) – matrix whose shape[0] value gives the dimension of the weights to be computed.
- Returns
 weights
- Return type
 numpy.ndarray
- 
linear_program_ineq(c, A, b) Solves an inequality-constrained linear program with variable bounds. This method returns the minimizer of the following linear program.
minimize c^T x subject to A x >= b
- Parameters
 c (numpy.ndarray) – coefficients vector of the linear objective function to be minimized.
A (numpy.ndarray) – 2-D array which, when matrix-multiplied by x, gives the values of the lower-bound inequality constraints at x.
b (numpy.ndarray) – 1-D array of values representing the lower-bound of each inequality constraint (row) in A.
- Returns
 the independent variable vector which minimizes the linear programming problem.
- Return type
 numpy.ndarray
- Raises
 RuntimeError
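A linear program of this form can be handed to scipy.optimize.linprog by negating the constraint, since A x >= b is equivalent to -A x <= -b. The sketch below is one way to do it and is not necessarily how the library does; solve_lp_ineq is a hypothetical name, and leaving the variables unbounded is an assumption, because the bounds mentioned above are not specified here.

```python
import numpy as np
from scipy.optimize import linprog

def solve_lp_ineq(c, A, b):
    """Minimize c^T x subject to A x >= b, with free variables (assumed)."""
    c = np.asarray(c, dtype=float).ravel()
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float).ravel()
    # linprog expects A_ub x <= b_ub, so flip the sign of both sides.
    res = linprog(c, A_ub=-A, b_ub=-b, bounds=(None, None))
    if not res.success:
        raise RuntimeError(f"linear program failed: {res.message}")
    return res.x

# minimize x0 + x1 subject to x0 >= 1 and x1 >= 2
x_opt = solve_lp_ineq([1.0, 1.0], [[1.0, 0.0], [0.0, 1.0]], [1.0, 2.0])
```

The minimizer sits at the corner where both constraints are active. An infeasible or unbounded problem surfaces as a RuntimeError, mirroring the exception documented above.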
- 
local_linear_gradients(inputs, outputs, weights=None, n_neighbors=None) Estimate a collection of gradients from input/output pairs.
Given a set of input/output pairs, choose subsets of neighboring points and build a local linear model for each subset. The gradients of these local linear models comprise estimates of sampled gradients.
- Parameters
 inputs (numpy.ndarray) – M-by-m matrix that contains the m-dimensional inputs
outputs (numpy.ndarray) – M-by-1 matrix that contains scalar outputs
weights (numpy.ndarray) – M-by-1 matrix that contains the weights for each observation (default None)
n_neighbors (int) – how many nearest neighbors to use when constructing the local linear model. The default value is floor(1.7*m).
- Returns
 M-by-m matrix that contains estimated partial derivatives approximated by the local linear models; the corresponding new inputs
- Return type
 numpy.ndarray
- Raises
 ValueError, TypeError
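The idea of fitting a local linear model around each point can be sketched with NumPy alone. This is a simplified illustration under stated assumptions, not the library's routine: local_linear_gradients_sketch is a hypothetical name, every input point is used as a center, the fits are unweighted, and only the gradients are returned.

```python
import numpy as np

def local_linear_gradients_sketch(inputs, outputs, n_neighbors=None):
    """Estimate one gradient per input point from a local least-squares fit."""
    n_samples, dim = inputs.shape
    if n_neighbors is None:
        n_neighbors = int(np.floor(1.7 * dim))  # documented default
    if not dim < n_neighbors <= n_samples:
        raise ValueError("need dim < n_neighbors <= n_samples")
    gradients = np.empty((n_samples, dim))
    for i in range(n_samples):
        # Indices of the n_neighbors points closest to inputs[i].
        dist = np.sum((inputs - inputs[i]) ** 2, axis=1)
        idx = np.argsort(dist)[:n_neighbors]
        # Fit outputs ~ intercept + slope . x on the neighborhood.
        design = np.hstack([np.ones((n_neighbors, 1)), inputs[idx]])
        coef, *_ = np.linalg.lstsq(design, outputs[idx], rcond=None)
        gradients[i] = coef[1:, 0]  # drop the intercept, keep the slope
    return gradients

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=(40, 3))
true_grad = np.array([2.0, -1.0, 0.5])
f = (x @ true_grad + 1.0).reshape(-1, 1)  # M-by-1 outputs of a linear function
grads = local_linear_gradients_sketch(x, f)
```

For an exactly linear function every local fit recovers the same slope, so each row of grads matches true_grad up to rounding. For nonlinear functions the rows approximate the gradient near each point.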
- 
rrmse(predictions, targets) Evaluates the relative root mean square error. It can be vectorized for multidimensional predictions and targets.
- Parameters
 predictions (numpy.ndarray) – predictions input.
targets (numpy.ndarray) – targets input.
- Returns
 relative root mean squared error
- Return type
 np.float64
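The normalization used for the relative error is not stated here. One common choice, assumed in the sketch below, divides the root mean square error by the standard deviation of the targets and averages over output components; rrmse_sketch is a hypothetical name.

```python
import numpy as np

def rrmse_sketch(predictions, targets):
    """RMSE divided by the targets' standard deviation (assumed definition)."""
    targets = np.asarray(targets, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    n_samples = targets.shape[0]
    p = predictions.reshape(n_samples, -1)
    t = targets.reshape(n_samples, -1)
    rmse = np.sqrt(np.mean((p - t) ** 2, axis=0))
    spread = np.sqrt(np.mean((t - t.mean(axis=0)) ** 2, axis=0))
    # Average the per-component ratios for multidimensional outputs.
    return np.mean(rmse / spread)

targets = np.array([1.0, 2.0, 3.0, 4.0])
perfect = rrmse_sketch(targets, targets)
constant = rrmse_sketch(np.full(4, targets.mean()), targets)
```

Under this definition a perfect prediction scores 0, while always predicting the mean of the targets scores 1, so values below 1 indicate a model that beats the constant baseline.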
- 
sort_eigpairs(evals, evects) Sort eigenpairs.
- Parameters
 evals (numpy.ndarray) – eigenvalues.
evects (numpy.ndarray) – eigenvectors.
- Returns
 vector of sorted eigenvalues; orthogonal matrix of corresponding eigenvectors.
- Return type
 numpy.ndarray, numpy.ndarray
Note
Eigenvectors are unique up to a sign. We make the choice to normalize the eigenvectors so that the first component of each eigenvector is positive. This normalization is very helpful for the bootstrapping.
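The sign convention in the note can be sketched as follows. The descending sort order is an assumption, since the ordering is not stated here, and sort_eigpairs_sketch is a hypothetical name.

```python
import numpy as np

def sort_eigpairs_sketch(evals, evects):
    """Sort eigenpairs (descending, assumed) and fix each eigenvector's sign."""
    order = np.argsort(evals)[::-1]
    evals = evals[order]
    evects = evects[:, order]
    # Flip any column whose first component is negative; leave zeros alone.
    signs = np.where(evects[0, :] < 0, -1.0, 1.0)
    return evals, evects * signs

A = np.array([[2.0, 1.0], [1.0, 3.0]])
evals, evects = sort_eigpairs_sketch(*np.linalg.eigh(A))
```

Flipping a column's sign leaves it an eigenvector for the same eigenvalue, so the result still satisfies A v = lambda v. Fixing the sign makes eigenvectors from repeated runs, such as bootstrap replicates, directly comparable.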