Boosting Machines

Boosting Machine Classifier

class snapml.BoostingMachineClassifier(n_jobs=1, num_round=100, max_depth=None, min_max_depth=1, max_max_depth=5, early_stopping_rounds=10, random_state=0, base_score=None, learning_rate=0.1, verbose=False, compress_trees=False, class_weight=None, use_histograms=True, hist_nbins=256, use_gpu=False, gpu_ids=[0], colsample_bytree=1.0, subsample=1.0, lambda_l2=0.0, tree_select_probability=1.0, regularizer=1.0, fit_intercept=False, gamma=1.0, n_components=10)

Boosting machine for binary and multi-class classification tasks.

A heterogeneous boosting machine that mixes binary decision trees (each with a randomly chosen max_depth) with linear models on random Fourier features (an approximation of kernel ridge regression).
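The random Fourier feature idea mentioned above can be illustrated in a few lines of plain Python. This is a minimal sketch, not snapml's implementation: it assumes the Gaussian kernel is parameterised as exp(-gamma * ||x - y||^2), and the helper names `gaussian_kernel` and `make_rff_map` are hypothetical. It shows that the inner product of two randomly projected vectors approximates the exact kernel value, which is why a linear model on these features approximates kernel ridge regression.

```python
import math
import random


def gaussian_kernel(x, y, gamma):
    """Exact Gaussian kernel, assuming the form exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def make_rff_map(n_features, n_components, gamma, seed=0):
    """Build a random Fourier feature map (hypothetical helper).

    Weights are drawn from N(0, 2 * gamma) and offsets from U(0, 2 * pi),
    so that the dot product of two mapped vectors estimates the kernel above.
    """
    rng = random.Random(seed)
    std = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(n_features)]
               for _ in range(n_components)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    scale = math.sqrt(2.0 / n_components)

    def transform(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return transform


# A large n_components is used here only to make the approximation visible.
transform = make_rff_map(n_features=3, n_components=2000, gamma=1.0)
x = [0.1, 0.2, 0.3]
y = [0.3, 0.1, 0.0]
exact = gaussian_kernel(x, y, gamma=1.0)
approx = sum(a * b for a, b in zip(transform(x), transform(y)))
print(f"exact={exact:.4f} approx={approx:.4f}")
```

The approximation error shrinks roughly as one over the square root of the number of components, so n_components trades accuracy against the cost of the linear model.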

Parameters

num_round: int, default=100: Number of boosting iterations.
learning_rate: float, default=0.1: Learning rate / shrinkage factor.
random_state: int, default=0: Random seed.
colsample_bytree: float, default=1.0: Fraction of feature columns used at each boosting iteration.
subsample: float, default=1.0: Fraction of training examples used at each boosting iteration.
verbose: bool, default=False: Print information during training.
lambda_l2: float, default=0.0: L2-regularization penalty used during tree-building.
early_stopping_rounds: int, default=10: When a validation set is provided, training stops if the validation loss has not decreased for this number of rounds.
compress_trees: bool, default=False: Compress trees after training for fast inference.
base_score: float, default=None: Base score used to initialize the boosting algorithm. If None, the algorithm initializes the base score to the average target (regression), the logit of the probability of the positive class (binary classification), or zero (multiclass classification).
class_weight: {‘balanced’, None}, default=None: If set to ‘balanced’, sample weights are applied to account for class imbalance; otherwise no sample weights are used.
max_depth: int, default=None: If set, overrides both depth bounds so that min_max_depth = max_max_depth = max_depth.
min_max_depth: int, default=1: Minimum max_depth of trees in the ensemble.
max_max_depth: int, default=5: Maximum max_depth of trees in the ensemble.
n_jobs: int, default=1: Number of threads to use during training.
use_histograms: bool, default=True: Use histograms to accelerate tree-building.
hist_nbins: int, default=256: Number of histogram bins.
use_gpu: bool, default=False: Use GPU for tree-building.
gpu_ids: array-like of int, default=[0]: Device IDs of the GPUs used when GPU acceleration is enabled.
tree_select_probability: float, default=1.0: Probability of selecting a tree (rather than a kernel ridge regressor) at each boosting iteration.
regularizer: float, default=1.0: L2-regularization penalty for the kernel ridge regressor.
fit_intercept: bool, default=False: Include an intercept term in the kernel ridge regressor.
gamma: float, default=1.0: Gaussian kernel parameter.
n_components: int, default=10: Number of components in the random projection.
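For the binary case, the default base_score described above is the logit of the positive-class probability. The following is a minimal sketch of that calculation, assuming unweighted labels in {0, 1}; the function names `logit_base_score` and `sigmoid` are hypothetical and not part of snapml.

```python
import math


def logit_base_score(labels):
    """Logit of the empirical positive-class probability (hypothetical helper).

    Assumes labels are 0/1 and that both classes are present; otherwise the
    probability is 0 or 1 and the logit is undefined.
    """
    p = sum(1 for label in labels if label == 1) / len(labels)
    return math.log(p / (1.0 - p))


def sigmoid(z):
    """Inverse of the logit: maps a raw score back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


labels = [1, 0, 0, 0]            # 25% positive examples
base = logit_base_score(labels)  # log(0.25 / 0.75)
print(base, sigmoid(base))
```

Starting from this score means that, before any boosting round has run, the model already predicts the training-set class frequency.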

Attributes

feature_importances_: array-like, shape=(n_features,): Feature importances computed across trees.

apply(X)

Map batch of examples to leaf indices and labels.

Parameters

X: dense matrix (ndarray): Batch of examples.

Returns

indices: array-like, shape = (n_samples, num_round) or (n_samples, num_round, num_classes): The leaf indices. Output is 2-dim for binary classification and 3-dim for multiclass classification.
labels: array-like, shape = (n_samples, num_round) or (n_samples, num_round, num_classes): The leaf labels. Output is 2-dim for binary classification and 3-dim for multiclass classification.

fit(X, y, sample_weight=None, X_val=None, y_val=None, sample_weight_val=None, aggregate_importances=True)

Fit the model to the given training data.

Parameters

X: dense matrix (ndarray): Training dataset.
y: array-like, shape = (n_samples,): The target vector corresponding to X.
sample_weight: array-like, shape = (n_samples,): Training sample weights.
X_val: dense matrix (ndarray): Validation dataset.
y_val: array-like, shape = (n_samples,): The target vector corresponding to X_val.
sample_weight_val: array-like, shape = (n_samples,): Validation sample weights.
aggregate_importances: bool, default=True: Aggregate feature importances over boosting rounds.

Returns

self: object
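When X_val and y_val are passed to fit, the early_stopping_rounds parameter controls when training halts. The sketch below shows one common patience rule in plain Python; it is an assumption about the general technique, not snapml's internal logic, and `early_stopping_round` is a hypothetical helper operating on a precomputed list of validation losses.

```python
def early_stopping_round(val_losses, early_stopping_rounds):
    """Return (best_round, stop_round) under a simple patience rule.

    Training is considered stopped at the first round where the validation
    loss has failed to improve on the best value for `early_stopping_rounds`
    rounds in a row. If that never happens, the last round is returned.
    """
    best_loss = float("inf")
    best_round = -1
    for round_idx, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_round = round_idx
        elif round_idx - best_round >= early_stopping_rounds:
            return best_round, round_idx
    return best_round, len(val_losses) - 1


# Illustrative validation losses, one per boosting round.
losses = [0.70, 0.60, 0.55, 0.56, 0.57, 0.58, 0.54]
best, stopped = early_stopping_round(losses, early_stopping_rounds=3)
print(best, stopped)
```

With a patience of 3, the loop stops before reaching the later improvement at the final round; a larger patience would have found it, at the cost of more boosting rounds.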

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep: bool, default=True: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params: dict: Parameter names mapped to their values.

predict(X, n_jobs=None)

Predict class labels

Parameters

X: dense matrix (ndarray): Dataset used for predicting class estimates.
n_jobs: int: Number of threads to use for prediction.

Returns

pred: array-like, shape = (n_samples,): Returns the predicted class labels

predict_proba(X, n_jobs=None)

Predict class label probabilities

Parameters

X: dense matrix (ndarray): Dataset used for predicting class estimates.
n_jobs: int: Number of threads to use for prediction.

Returns

proba: array-like, shape = (n_samples, 2): Returns the predicted class probabilities

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters

X: array-like of shape (n_samples, n_features): Test samples.
y: array-like of shape (n_samples,) or (n_samples, n_outputs): True labels for X.
sample_weight: array-like of shape (n_samples,), default=None: Sample weights.

Returns

score: float: Mean accuracy of self.predict(X) wrt. y.
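The mean accuracy returned by score can be reproduced by hand. The following plain-Python sketch shows the weighted form, where each sample contributes its weight when predicted correctly; `accuracy` is a hypothetical helper that stands in for comparing self.predict(X) against y.

```python
def accuracy(y_true, y_pred, sample_weight=None):
    """Weighted fraction of predictions that match the true labels."""
    if sample_weight is None:
        sample_weight = [1.0] * len(y_true)
    correct = sum(w for t, p, w in zip(y_true, y_pred, sample_weight) if t == p)
    return correct / sum(sample_weight)


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]  # the third prediction is wrong

unweighted = accuracy(y_true, y_pred)
# Doubling the weight of the misclassified sample lowers the score.
weighted = accuracy(y_true, y_pred, sample_weight=[1.0, 1.0, 2.0, 1.0])
print(unweighted, weighted)
```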

set_params(**params)

Set the parameters of this model.

Valid parameter keys can be listed with get_params().

Returns

self

Boosting Machine Regressor

class snapml.BoostingMachineRegressor(n_jobs=1, num_round=100, objective='mse', max_depth=None, min_max_depth=1, max_max_depth=5, early_stopping_rounds=10, random_state=0, base_score=None, learning_rate=0.1, verbose=False, compress_trees=False, use_histograms=True, hist_nbins=256, use_gpu=False, gpu_id=0, colsample_bytree=1.0, subsample=1.0, lambda_l2=0.0, tree_select_probability=1.0, regularizer=1.0, fit_intercept=False, gamma=1.0, n_components=10)

Boosting machine for regression tasks.

A heterogeneous boosting machine that mixes binary decision trees (each with a randomly chosen max_depth) with linear models on random Fourier features (an approximation of kernel ridge regression).

Parameters

num_round: int, default=100: Number of boosting iterations.
objective: {‘mse’, ‘cross_entropy’}, default=‘mse’: Training objective.
learning_rate: float, default=0.1: Learning rate / shrinkage factor.
random_state: int, default=0: Random seed.
colsample_bytree: float, default=1.0: Fraction of feature columns used at each boosting iteration.
subsample: float, default=1.0: Fraction of training examples used at each boosting iteration.
verbose: bool, default=False: Print information during training.
lambda_l2: float, default=0.0: L2-regularization penalty used during tree-building.
early_stopping_rounds: int, default=10: When a validation set is provided, training stops if the validation loss has not decreased for this number of rounds.
compress_trees: bool, default=False: Compress trees after training for fast inference.
base_score: float, default=None: Base score used to initialize the boosting algorithm. If None, the algorithm initializes the base score to the average target (regression) or the logit of the probability of the positive class (binary classification).
max_depth: int, default=None: If set, overrides both depth bounds so that min_max_depth = max_max_depth = max_depth.
min_max_depth: int, default=1: Minimum max_depth of trees in the ensemble.
max_max_depth: int, default=5: Maximum max_depth of trees in the ensemble.
n_jobs: int, default=1: Number of threads to use during training.
use_histograms: bool, default=True: Use histograms to accelerate tree-building.
hist_nbins: int, default=256: Number of histogram bins.
use_gpu: bool, default=False: Use GPU for tree-building.
gpu_id: int, default=0: Device ID of the GPU to use during training.
tree_select_probability: float, default=1.0: Probability of selecting a tree (rather than a kernel ridge regressor) at each boosting iteration.
regularizer: float, default=1.0: L2-regularization penalty for the kernel ridge regressor.
fit_intercept: bool, default=False: Include an intercept term in the kernel ridge regressor.
gamma: float, default=1.0: Gaussian kernel parameter.
n_components: int, default=10: Number of components in the random projection.
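The interaction between tree_select_probability, min_max_depth and max_max_depth can be pictured as a per-round draw of the base learner. The sketch below is an illustration only: `plan_ensemble` is a hypothetical helper, and drawing the depth uniformly between the two bounds is an assumption, since the source says only that the depth is stochastic.

```python
import random


def plan_ensemble(num_round, tree_select_probability,
                  min_max_depth, max_max_depth, seed=0):
    """Draw the type (and depth, for trees) of each round's base learner.

    Assumption: tree depth is sampled uniformly from the inclusive range
    [min_max_depth, max_max_depth]. The actual distribution may differ.
    """
    rng = random.Random(seed)
    plan = []
    for _ in range(num_round):
        if rng.random() < tree_select_probability:
            plan.append(("tree", rng.randint(min_max_depth, max_max_depth)))
        else:
            plan.append(("kernel_ridge", None))
    return plan


# Default-like setting: every round fits a tree with depth between 1 and 5.
trees_only = plan_ensemble(100, 1.0, min_max_depth=1, max_max_depth=5)

# Mixed ensemble: roughly 80% trees, 20% kernel ridge regressors.
mixed = plan_ensemble(100, 0.8, min_max_depth=1, max_max_depth=5)
print(sum(1 for kind, _ in mixed if kind == "tree"), "trees out of", len(mixed))
```

Under this reading, setting max_depth pins both bounds to the same value, so every tree in the ensemble has the same maximum depth.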

Attributes

feature_importances_: array-like, shape=(n_features,): Feature importances computed across trees.

apply(X)

Map batch of examples to leaf indices and labels.

Parameters

X: dense matrix (ndarray): Batch of examples.

Returns

indices: array-like, shape = (n_samples, num_round): The leaf indices.
labels: array-like, shape = (n_samples, num_round): The leaf labels.

fit(X, y, sample_weight=None, X_val=None, y_val=None, sample_weight_val=None, aggregate_importances=True)

Fit the model to the given training data.

Parameters

X: dense matrix (ndarray): Training dataset.
y: array-like, shape = (n_samples,): The target vector corresponding to X.
sample_weight: array-like, shape = (n_samples,): Training sample weights.
X_val: dense matrix (ndarray): Validation dataset.
y_val: array-like, shape = (n_samples,): The target vector corresponding to X_val.
sample_weight_val: array-like, shape = (n_samples,): Validation sample weights.
aggregate_importances: bool, default=True: Aggregate feature importances over boosting rounds.

Returns

self: object

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep: bool, default=True: If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params: dict: Parameter names mapped to their values.

predict(X, n_jobs=None)

Predict estimates

Parameters

X: dense matrix (ndarray): Dataset used for prediction.
n_jobs: int: Number of threads to use for prediction.

Returns

pred: array-like, shape = (n_samples,): Returns the predictions

score(X, y, sample_weight=None)

Return the coefficient of determination of the prediction.

The coefficient of determination \(R^2\) is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and the score can be negative because a model can be arbitrarily worse than a constant predictor. A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.
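The definition above translates directly into code. This is a minimal unweighted sketch in plain Python; `r2_score_sketch` is a hypothetical name, and it assumes y_true is not constant, since otherwise the total sum of squares is zero.

```python
def r2_score_sketch(y_true, y_pred):
    """Coefficient of determination: 1 - u / v.

    u is the residual sum of squares and v is the total sum of squares.
    Assumes y_true is not constant, so that v is non-zero.
    """
    mean = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    v = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - u / v


y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r2_score_sketch(y_true, y_true)                 # no residual error
constant = r2_score_sketch(y_true, [2.5, 2.5, 2.5, 2.5])  # always predicts the mean
reversed_order = r2_score_sketch(y_true, [4.0, 3.0, 2.0, 1.0])
print(perfect, constant, reversed_order)
```

The three cases cover the range described in the text: a perfect fit, a constant mean predictor, and a model that does worse than the mean and therefore scores below zero.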

Parameters

X: array-like of shape (n_samples, n_features): Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
y: array-like of shape (n_samples,) or (n_samples, n_outputs): True values for X.
sample_weight: array-like of shape (n_samples,), default=None: Sample weights.

Returns

score: float: \(R^2\) of self.predict(X) wrt. y.

Notes

The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with default value of r2_score(). This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).

set_params(**params)

Set the parameters of this model.

Valid parameter keys can be listed with get_params().

Returns

self