dl85.supervised.classifiers.DL85Classifier

class dl85.supervised.classifiers.DL85Classifier(max_depth=1, min_sup=1, error_function=None, fast_error_function=None, max_error=0, stop_after_better=False, time_limit=0, verbose=False, desc=False, asc=False, repeat_sort=False, quiet=True, print_output=False)

An optimal binary decision tree classifier.

Parameters
max_depth : int, default=1

Maximum depth of the tree to be found.

min_sup : int, default=1

Minimum number of examples per leaf.

error_function : function, default=None

User-defined error function based on transactions.

fast_error_function : function, default=None

User-defined error function based on the supports per class.

max_error : int, default=0

Maximum allowed error. The default value means no bound. If no tree strictly better than this bound can be found, the model remains empty.

stop_after_better : bool, default=False

Whether the search stops as soon as it finds a tree better than max_error.

time_limit : int, default=0

Time allocated to the search, in seconds. The default value means no limit. The best tree found within the time limit is stored, provided it is better than max_error.

verbose : bool, default=False

Whether to print what happens during the search.

desc : bool, default=False

Whether the items are sorted in descending order of information gain.

asc : bool, default=False

Whether the items are sorted in ascending order of information gain.

repeat_sort : bool, default=False

Whether the items are sorted at each level of the lattice (True) or only once before the search (False).

print_output : bool, default=False

Whether the search output is printed.
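As a minimal sketch of what a custom fast_error_function might look like, the function below computes the misclassification error of a leaf from its per-class supports. It assumes the callback receives an iterable of class supports and returns an (error, predicted class) pair; that calling convention is an assumption and should be checked against the library before use. The name misclassification_error is hypothetical.

```python
def misclassification_error(supports):
    """Return (error, majority_class_index) for a leaf.

    Assumption: `supports` is an iterable holding, for each class,
    the number of training examples of that class reaching the leaf.
    """
    counts = list(supports)
    # The leaf predicts the majority class (lowest index on ties).
    majority = max(range(len(counts)), key=counts.__getitem__)
    # Every example outside the majority class counts as an error.
    return sum(counts) - counts[majority], majority


# Hypothetical usage, assuming the dl85 package is installed:
# from dl85.supervised.classifiers import DL85Classifier
# clf = DL85Classifier(max_depth=2, fast_error_function=misclassification_error)
```

Working from supports only keeps the callback cheap, since it never touches individual transactions; an error_function based on transactions would be the alternative when per-example information is needed.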

Attributes
tree_ : str

Outputted tree in serialized form; remains empty as long as no model is learned.

size_ : int

Size of the output tree.

depth_ : int

Depth of the found tree.

error_ : float

Error of the found tree.

accuracy_ : float

Accuracy of the found tree on the training set.

lattice_size_ : int

Number of nodes explored before finding the optimal tree.

runtime_ : float

Running time of the optimal decision tree search.

timeout_ : bool

Whether the search reached the timeout.

classes_ : ndarray, shape (n_classes,)

The classes seen at fit().
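The attributes above can be gathered into a single report after fitting. The helper below is a small sketch of that idea; search_summary is a hypothetical name, and the stand-in object with made-up values only replaces a real fitted DL85Classifier so the snippet runs without the library.

```python
from types import SimpleNamespace

# Attribute names taken from the list above.
SEARCH_ATTRIBUTES = ("size_", "depth_", "error_", "accuracy_",
                     "lattice_size_", "runtime_", "timeout_")


def search_summary(clf):
    """Collect the documented search attributes of a fitted classifier."""
    summary = {name: getattr(clf, name) for name in SEARCH_ATTRIBUTES}
    # tree_ remains empty as long as no model has been learned.
    summary["has_tree"] = bool(clf.tree_)
    return summary


# Stand-in with made-up values, NOT the output of a real search.
fake_clf = SimpleNamespace(tree_="", size_=0, depth_=0, error_=0.0,
                           accuracy_=0.0, lattice_size_=0,
                           runtime_=0.0, timeout_=True)
summary = search_summary(fake_clf)
print(summary["has_tree"], summary["timeout_"])
```

Checking has_tree together with timeout_ is one way to tell apart a search that ran out of time from one that found no tree better than max_error.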

fit(X, y=None, sample_weight=None)

Implements the standard fitting function for a DL8.5 classifier.

Parameters
X : array-like, shape (n_samples, n_features)

The training input samples.

y : array-like, shape (n_samples,)

The target values, as an array of int.

Returns
self : object

Returns self.
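The following sketch shows one way to prepare input for fit, under the assumption that the classifier expects a matrix of 0/1 features (this page does not state it, so treat it as an assumption to verify). The helper binarize and its thresholds are hypothetical and chosen only for illustration.

```python
def binarize(rows, thresholds):
    """Map each numeric feature to 1 if it exceeds its threshold, else 0."""
    return [[int(value > t) for value, t in zip(row, thresholds)]
            for row in rows]


# Made-up numeric data with two features.
X_raw = [[1.0, 5.0], [3.0, 2.0], [2.5, 4.0]]
y = [0, 1, 1]  # integer class labels, as fit() expects

X_bin = binarize(X_raw, thresholds=[2.0, 3.0])
print(X_bin)

# Hypothetical usage, assuming the dl85 package is installed:
# from dl85.supervised.classifiers import DL85Classifier
# clf = DL85Classifier(max_depth=2).fit(X_bin, y)
```

Because fit returns self, construction and fitting can be chained as in the commented usage.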

get_params(deep=True)

Get parameters for this estimator.

Parameters
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns
params : dict

Parameter names mapped to their values.

predict(X)

Implements the standard predict function for a DL8.5 classifier.

Parameters
X : array-like, shape (n_samples, n_features)

The input samples.

Returns
y : ndarray, shape (n_samples,)

The predicted class label for each sample.

predict_proba(X)

Implements the standard predict_proba function for a DL8.5 classifier.

Parameters
X : array-like, shape (n_samples, n_features)

The input samples.

Returns
y : ndarray, shape (n_samples, n_classes)

The predicted class probabilities for each sample.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

Parameters
X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns
score : float

Mean accuracy of self.predict(X) w.r.t. y.
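To make the returned quantity concrete, here is a minimal sketch of unweighted mean accuracy for single-label targets. It ignores sample_weight and the multi-label case, and mean_accuracy is a hypothetical helper rather than the method itself.

```python
def mean_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot score an empty set of samples")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels: three of the four predictions match.
acc = mean_accuracy([0, 1, 1, 0], [0, 1, 0, 0])
print(acc)
```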

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters
**params : dict

Estimator parameters.

Returns
self : estimator instance

Estimator instance.
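The <component>__<parameter> naming convention can be illustrated with a simplified sketch that splits a parameter name at its first double underscore. This is not the estimator's implementation; split_param_name is a hypothetical helper, and the component names used below are made up.

```python
def split_param_name(name):
    """Split 'component__parameter' at the first double underscore.

    Returns (None, name) when the name addresses the estimator itself.
    """
    component, delimiter, parameter = name.partition("__")
    if not delimiter:
        return None, component
    return component, parameter


print(split_param_name("max_depth"))
print(split_param_name("clf__max_depth"))
print(split_param_name("pipe__clf__max_depth"))

# Hypothetical usage on a classifier instance named clf:
# clf.set_params(max_depth=3, min_sup=5)
```

Only the first delimiter is consumed, so the remainder of a deeper name can be passed on to the nested component for further resolution.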