Internal API Reference#

This page documents methods that are internal to the package. They are subject to considerable change between releases, and no backwards compatibility is promised for them.

The package consists of a single general class for estimators, modelled after scikit-learn’s Estimator class.

Classes#

CoreEstimator#

class scikit_stan.modelcore.CoreEstimator[source]#

Abstract class for all estimator-type models in this package.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:

deep (bool, default True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params (dict) – Parameter names mapped to their values.

set_params(**params)[source]#

Set the parameters of this estimator. The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self (estimator instance) – Estimator instance.
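The <component>__<parameter> convention can be illustrated with a small pure-Python sketch. ToyBase, ToyEstimator and ToyPipeline are hypothetical classes written only for this example; they mimic the behaviour documented above and are not CoreEstimator's actual implementation.

```python
import inspect


class ToyBase:
    """Hypothetical base class mimicking the get_params/set_params contract."""

    def _param_names(self):
        # Assumption: constructor arguments (minus self) are the parameters.
        sig = inspect.signature(type(self).__init__)
        return [name for name in sig.parameters if name != "self"]

    def get_params(self, deep=True):
        params = {}
        for name in self._param_names():
            value = getattr(self, name)
            params[name] = value
            # With deep=True, also expose parameters of nested estimators.
            if deep and hasattr(value, "get_params"):
                for sub_name, sub_value in value.get_params(deep=True).items():
                    params[f"{name}__{sub_name}"] = sub_value
        return params

    def set_params(self, **params):
        valid = self._param_names()
        for key, value in params.items():
            # "component__parameter" is routed to the nested object.
            name, sep, sub_key = key.partition("__")
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            if sep:
                getattr(self, name).set_params(**{sub_key: value})
            else:
                setattr(self, name, value)
        return self


class ToyEstimator(ToyBase):
    def __init__(self, alpha=1.0):
        self.alpha = alpha


class ToyPipeline(ToyBase):
    def __init__(self, model=None):
        self.model = model


pipe = ToyPipeline(model=ToyEstimator(alpha=1.0))
pipe.set_params(model__alpha=0.5)              # updates the nested estimator
flat_params = pipe.get_params(deep=True)       # includes "model__alpha"
shallow_params = pipe.get_params(deep=False)   # only "model"
```

Returning self from set_params is what allows calls to be chained, as in the toy example.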

Validation Methods#

Validating Input Data#

scikit_stan.utils.validation.check_array(X, ensure_2d=True, allow_nd=False, allow_sparse=False, dtype=<class 'numpy.float64'>)[source]#

Input validation on an array, list, sparse matrix or similar. By default, the input is checked to be a non-empty 2D array containing only finite values.

Parameters:
  • X (array-like) – Array-like, list, sparse matrix, or similar of data to be checked.

  • ensure_2d (bool, optional) – Whether to ensure that the array is 2D.

  • allow_nd (bool, optional) – Whether to allow the array to be an n-dimensional matrix where n > 2.

  • dtype (type, optional) – Dtype of the data; regressions are only supported on float64 or int64 arrays.

Returns:

NDArray[Union[np.float64, np.int64]] – Verified set of data that can be used for regression.

Raises:
  • ValueError – Sparse, complex, or otherwise invalid data type passed for X.

  • ValueError – Invalid number of dimensions in the data passed for X, or data that otherwise cannot be recast to satisfy the dimension requirements.
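The default checks described above (non-empty, 2D, finite values only) can be sketched in pure Python on nested lists. check_array_sketch is a hypothetical stand-in written for illustration; the real check_array works on NumPy arrays and handles the allow_nd, allow_sparse and dtype options, which this sketch omits.

```python
import math
from numbers import Real


def check_array_sketch(X, ensure_2d=True):
    """Hypothetical stand-in: validate a list of rows and return floats."""
    X = list(X)
    if len(X) == 0:
        raise ValueError("Empty input.")
    is_2d = all(isinstance(row, (list, tuple)) for row in X)
    if ensure_2d and not is_2d:
        raise ValueError("Expected a 2D input (a sequence of rows).")
    # Treat a 1D input as a single row so the checks below are shared.
    rows = X if is_2d else [X]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("Rows have inconsistent lengths.")
    if len(rows[0]) == 0:
        raise ValueError("Rows are empty.")
    checked = []
    for row in rows:
        for value in row:
            # Rejects complex numbers, strings, nested lists, and so on.
            if not isinstance(value, Real):
                raise ValueError(f"Unsupported value type: {type(value).__name__}")
            if not math.isfinite(value):
                raise ValueError("Input contains NaN or infinity.")
        checked.append([float(value) for value in row])
    return checked if is_2d else checked[0]


validated = check_array_sketch([[1, 2], [3, 4]])
```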

scikit_stan.utils.validation.check_is_fitted(estimator, attributes=None, *, msg=None, all_or_any=<built-in function all>)[source]#

Perform is_fitted validation for estimator. Checks if the estimator is fitted by verifying the presence of fitted attributes (ending with a trailing underscore) and otherwise raises a NotFittedError with the given message. If an estimator does not set any attributes with a trailing underscore, it can define a __sklearn_is_fitted__ method returning a boolean to specify if the estimator is fitted or not.

Parameters:
  • estimator (estimator instance) – estimator instance for which the check is performed.

  • attributes (str, list or tuple of str, default None) – Attribute name(s) given as a string or a list/tuple of strings, e.g. ["coef_", "estimator_", ...] or "coef_". If None, the estimator is considered fitted if there exists an attribute that ends with an underscore and does not start with a double underscore.

  • msg (str, default None) – The default error message is “This %(name)s instance is not fitted yet. Call ‘fit’ with appropriate arguments before using this estimator.” For custom messages, if “%(name)s” is present in the message string, it is replaced by the estimator name, e.g. “Estimator, %(name)s, must be fitted before sparsifying”.

  • all_or_any (callable, {all, any}, default all) – Specify whether all or any of the given attributes must exist.

Returns:

None

Raises:

NotFittedError – If the attributes are not found.
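The fitted check described above can be sketched as follows. This is an illustrative reimplementation built only from the description on this page: check_is_fitted_sketch, the local NotFittedError class and ToyModel are hypothetical names, and the order in which the three strategies are tried is an assumption.

```python
class NotFittedError(ValueError):
    """Local stand-in for the exception named above."""


def check_is_fitted_sketch(estimator, attributes=None, *, msg=None, all_or_any=all):
    if msg is None:
        msg = (
            "This %(name)s instance is not fitted yet. Call 'fit' with "
            "appropriate arguments before using this estimator."
        )
    if attributes is not None:
        # Accept a single name or a list/tuple of names.
        if isinstance(attributes, str):
            attributes = [attributes]
        fitted = all_or_any([hasattr(estimator, attr) for attr in attributes])
    elif hasattr(estimator, "__sklearn_is_fitted__"):
        # The estimator reports its own fitted state.
        fitted = estimator.__sklearn_is_fitted__()
    else:
        # Look for any instance attribute such as "coef_".
        fitted = any(
            name.endswith("_") and not name.startswith("__")
            for name in vars(estimator)
        )
    if not fitted:
        raise NotFittedError(msg % {"name": type(estimator).__name__})


class ToyModel:
    """Hypothetical estimator that sets a trailing-underscore attribute in fit."""

    def fit(self):
        self.coef_ = [0.0]
        return self


model = ToyModel()
try:
    check_is_fitted_sketch(model)
    error_message = None
except NotFittedError as exc:
    error_message = str(exc)

model.fit()
check_is_fitted_sketch(model)  # passes silently once coef_ exists
```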

scikit_stan.utils.validation.check_consistent_length(*arrays)[source]#

Check that all arrays have consistent first dimensions. Checks whether all objects in arrays have the same shape or length.

Parameters:

*arrays (list or tuple of input objects) – Objects that will be checked for consistent length.
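A minimal sketch of such a length check, using only len() on plain Python sequences. check_consistent_length_sketch is a hypothetical name, and raising ValueError on a mismatch is an assumption, since the exception type is not stated on this page.

```python
def check_consistent_length_sketch(*arrays):
    """Hypothetical stand-in: require every input to have the same length."""
    lengths = [len(array) for array in arrays]
    if len(set(lengths)) > 1:
        # Assumption: a mismatch is reported as a ValueError.
        raise ValueError(f"Inconsistent numbers of samples: {lengths}")


X = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y = [0, 1, 0]
check_consistent_length_sketch(X, y)  # three rows and three targets: passes
```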

scikit_stan.utils.validation._num_samples(x)[source]#

Return number of samples in array-like x.

Validating Priors#

scikit_stan.utils.validation.validate_prior(prior_spec, coeff_type)[source]#

Perform validation on given prior dictionary for prior on either slope or intercept. This is only called when there is a prior to check.

Parameters:
  • prior_spec (Dict[str, Any]) – Proposed prior dictionary, can be either for slope or intercept.

  • coeff_type (str) – Specify whether the prior is for slope or intercept - should only be ‘slope’ or ‘intercept’.

Returns:

Dict[str, Any] – Validated dictionary of parameters for the given prior.

Raises:
  • ValueError – coeff_type is neither ‘slope’ nor ‘intercept’.

  • ValueError – Prior distribution is not specified.

  • ValueError – Not all parameters for prior set-up are specified.

  • ValueError – Prior sigma is negative.
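The four error conditions listed above can be sketched as a sequence of checks. The key names used here (prior_<coeff_type>_dist, prior_<coeff_type>_mu, prior_<coeff_type>_sigma) and the scalar sigma are assumptions made for illustration, since the expected keys are not listed on this page; validate_prior_sketch is a hypothetical name, not the library function.

```python
def validate_prior_sketch(prior_spec, coeff_type):
    """Hypothetical stand-in mirroring the documented error conditions."""
    if coeff_type not in ("slope", "intercept"):
        raise ValueError(f"Unknown coefficient type: {coeff_type!r}")

    # Assumed key naming scheme; the real keys may differ.
    dist_key = f"prior_{coeff_type}_dist"
    mu_key = f"prior_{coeff_type}_mu"
    sigma_key = f"prior_{coeff_type}_sigma"

    if dist_key not in prior_spec:
        raise ValueError("Prior distribution is not specified.")

    missing = [key for key in (mu_key, sigma_key) if key not in prior_spec]
    if missing:
        raise ValueError(f"Missing prior parameters: {missing}")

    if prior_spec[sigma_key] < 0:
        raise ValueError("Prior sigma must be non-negative.")

    # Return only the recognised fields.
    return {key: prior_spec[key] for key in (dist_key, mu_key, sigma_key)}


slope_prior = validate_prior_sketch(
    {"prior_slope_dist": "normal", "prior_slope_mu": 0.0, "prior_slope_sigma": 1.0},
    "slope",
)
```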

scikit_stan.utils.validation.validate_aux_prior(aux_prior_spec)[source]#

Validate the configuration passed for the prior on auxiliary parameters. This does not perform parameter autoscaling.

This validation method returns the following fields in the dictionary:

  • “prior_aux_dist”: distribution of the prior on auxiliary parameters from this mapping:

    PRIORS_AUX_MAP = {
        "exponential": 0,  # exponential distribution, requires only beta parameter
        "chi2": 1,  # chi-squared distribution, requires only nu parameter
        "gamma": 2,  # gamma distribution, requires only alpha and beta parameters
        "inv_gamma": 3,  # inverse gamma distribution, requires only alpha and beta parameters
    }

  • “num_prior_aux_params”: number of parameters in the prior on auxiliary parameters, determined by the number of parameters in the distribution.

  • “prior_aux_params”: list of parameters in the prior on auxiliary parameters; this must be a list even if the distribution only has one parameter.

Parameters:

aux_prior_spec (Dict[str, Any]) –

Dictionary containing configuration for prior on auxiliary parameters. Currently supported priors are: “exponential” and “chi2”, which are both parameterized by a single scalar.

Priors with more parameters are planned as a future feature. For single-parameter priors, this field is a dictionary with the following keys:

  • “prior_aux_dist”: distribution of the prior on this parameter

  • “prior_aux_param”: parameter of the prior on this parameter

For example, to specify a chi2 prior with nu=2.5, pass:

{"prior_aux_dist": "chi2", "prior_aux_param": 2.5}

Returns:

Dict[str, Any] – Dictionary containing validated configuration for prior on auxiliary parameters.

Raises:
  • ValueError – Prior’s distribution is not specified.

  • ValueError – Unsupported prior distribution for auxiliary parameter.

  • ValueError – Prior distribution parameters are not specified.
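The mapping from the input dictionary to the returned fields can be sketched for the single-parameter case. validate_aux_prior_sketch is a hypothetical stand-in, not the library function. It assumes that the returned “prior_aux_dist” holds the integer code from PRIORS_AUX_MAP and that only “exponential” and “chi2” are accepted, as the parameter description above states.

```python
# Mapping reproduced from the description above.
PRIORS_AUX_MAP = {"exponential": 0, "chi2": 1, "gamma": 2, "inv_gamma": 3}

# Assumption: only the single-parameter priors are accepted for now.
SINGLE_PARAM_PRIORS = ("exponential", "chi2")


def validate_aux_prior_sketch(aux_prior_spec):
    """Hypothetical stand-in covering the single-parameter priors only."""
    if "prior_aux_dist" not in aux_prior_spec:
        raise ValueError("Prior's distribution is not specified.")

    dist = aux_prior_spec["prior_aux_dist"]
    if dist not in SINGLE_PARAM_PRIORS:
        raise ValueError(f"Unsupported prior distribution: {dist!r}")

    if "prior_aux_param" not in aux_prior_spec:
        raise ValueError("Prior distribution parameters are not specified.")

    # The parameters are always returned as a list, even for one value.
    params = [aux_prior_spec["prior_aux_param"]]
    return {
        "prior_aux_dist": PRIORS_AUX_MAP[dist],
        "num_prior_aux_params": len(params),
        "prior_aux_params": params,
    }


validated_aux = validate_aux_prior_sketch(
    {"prior_aux_dist": "chi2", "prior_aux_param": 2.5}
)
```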