dl85.unsupervised.clustering.DL85Cluster
- class dl85.unsupervised.clustering.DL85Cluster(max_depth=1, min_sup=1, error_function=None, max_error=0, stop_after_better=False, time_limit=0, verbose=False, desc=False, asc=False, repeat_sort=False, leaf_value_function=None, print_output=False)
An optimal binary decision tree for clustering.
- Parameters
- max_depth : int, default=1
Maximum depth of the tree to be found.
- min_sup : int, default=1
Minimum number of examples per leaf.
- max_error : int, default=0
Maximum allowed error. The default value (0) means no bound. If no tree with an error strictly below this bound can be found, the model remains empty.
- stop_after_better : bool, default=False
Whether the search stops as soon as it finds a tree better than max_error.
- time_limit : int, default=0
Time allocated to the search, in seconds. The default value (0) means no limit. The best tree found within the time limit is stored, provided it is better than max_error.
- verbose : bool, default=False
Whether to print what happens during the search.
- desc : bool, default=False
Whether the items are sorted in descending order of information gain.
- asc : bool, default=False
Whether the items are sorted in ascending order of information gain.
- repeat_sort : bool, default=False
Whether the items are re-sorted at each level of the lattice (True) or sorted only once, before the search (False).
- print_output : bool, default=False
Whether the search output is printed.
- Attributes
- tree_ : str
Learned tree in serialized form; remains empty as long as no model is learned.
- size_ : int
Size of the learned tree.
- depth_ : int
Depth of the learned tree.
- error_ : float
Error of the learned tree.
- accuracy_ : float
Accuracy of the learned tree on the training set.
- lattice_size_ : int
Number of nodes explored before the optimal tree was found.
- runtime_ : float
Duration of the optimal decision tree search.
- timeout_ : bool
Whether the search reached the time limit.
- classes_ : ndarray, shape (n_classes,)
The classes seen at fit().
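A minimal usage sketch, assuming the `dl85` package is installed and that, as with other DL8.5 estimators, X holds binary (0/1) features. The helper names `summarize_search` and `fit_and_summarize` are illustrative and not part of the library; only the constructor arguments and the trailing-underscore attributes are taken from the reference above. The `fake` object is a hypothetical stand-in for a fitted estimator, so the attribute-reading part can be tried without running a search.

```python
from types import SimpleNamespace


def summarize_search(model):
    """Collect the documented post-fit attributes into a plain dict.

    Works on any object exposing the attributes listed above.
    Missing attributes are reported as None.
    """
    names = ("size_", "depth_", "error_", "lattice_size_", "runtime_", "timeout_")
    summary = {name: getattr(model, name, None) for name in names}
    # tree_ stays empty when no model was learned.
    summary["has_tree"] = bool(getattr(model, "tree_", None))
    return summary


def fit_and_summarize(X, max_depth=2, min_sup=1, time_limit=10):
    """Fit a DL85Cluster on X (assumed binary) and summarize the search."""
    # Imported lazily so summarize_search stays usable without dl85.
    from dl85.unsupervised.clustering import DL85Cluster

    model = DL85Cluster(max_depth=max_depth, min_sup=min_sup, time_limit=time_limit)
    model.fit(X)
    return summarize_search(model)


# Hypothetical stand-in for an estimator whose search timed out
# before any tree was stored (placeholder values, not real output).
fake = SimpleNamespace(tree_="", size_=0, depth_=0, error_=0.0,
                       lattice_size_=0, runtime_=0.0, timeout_=True)
summary = summarize_search(fake)
print(summary)
```

Checking `has_tree` together with `timeout_` is one way to tell a search that found nothing within the bounds from one that simply ran out of time.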
- fit(X, X_error=None)
Implements the standard fit function for a DL8.5 estimator.
- Parameters
- X : array-like, shape (n_samples, n_features)
The training input samples. If X_error is provided, X serves only as the explanatory input used to describe the samples.
- X_error : array-like, shape (n_samples, n_features_1)
The training input used to calculate the error. If it is not provided, X is used to calculate the error.
- Returns
- self : object
Returns self.
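The roles of the two inputs of `fit` can be illustrated with a pure-Python sketch that does not depend on `dl85`. It assumes, hypothetically, that the error of a leaf is the sum of squared distances to the leaf centroid; the library's actual default error function may differ. The functions `leaf_error` and `split_error` are illustrative names. Samples are routed to leaves by a binary column of X, while the error is measured on the matching rows of X_error.

```python
def leaf_error(rows):
    """Sum of squared distances from each row to the centroid of `rows`."""
    if not rows:
        return 0.0
    n_cols = len(rows[0])
    centroid = [sum(r[j] for r in rows) / len(rows) for j in range(n_cols)]
    return sum((r[j] - centroid[j]) ** 2 for r in rows for j in range(n_cols))


def split_error(X, X_error, feature):
    """Error of a depth-1 tree that splits on binary column `feature` of X.

    X decides which leaf a sample falls into; X_error supplies the values
    on which the error is measured.
    """
    left = [e for x, e in zip(X, X_error) if x[feature] == 0]
    right = [e for x, e in zip(X, X_error) if x[feature] == 1]
    return leaf_error(left) + leaf_error(right)


# Toy data: binary descriptions (X) and numeric values used for the error.
X = [[0, 1], [0, 0], [1, 1], [1, 0]]
X_error = [[1.0], [1.0], [5.0], [5.0]]

# Evaluate every candidate split and keep the one with the lowest error.
errors = {f: split_error(X, X_error, f) for f in range(2)}
best = min(errors, key=errors.get)
print(errors, best)
```

In this toy case, splitting on column 0 of X separates the two groups of X_error values perfectly, whereas column 1 mixes them.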
- fit_predict(X, y=None)
Performs clustering on X and returns the cluster labels.
- Parameters
- X : array-like of shape (n_samples, n_features)
Input data.
- y : Ignored
Not used, present for API consistency by convention.
- Returns
- labels : ndarray of shape (n_samples,), dtype=np.int64
Cluster labels.
- get_params(deep=True)
Get parameters for this estimator.
- Parameters
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
- params : dict
Parameter names mapped to their values.
- predict(X)
Implements the standard predict function for a DL8.5 estimator.
- Parameters
- X : array-like, shape (n_samples, n_features)
The input samples.
- Returns
- y : ndarray, shape (n_samples,)
The label for each sample is the label of the closest sample seen during fit.
- predict_proba(X)
Implements the standard predict function for a DL8.5 classifier.
- Parameters
- X : array-like, shape (n_samples, n_features)
The input samples.
- Returns
- y : ndarray, shape (n_samples,)
The label for each sample is the label of the closest sample seen during fit.
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter>, so that each component of a nested object can be updated.
- Parameters
- **params : dict
Estimator parameters.
- Returns
- self : estimator instance
Estimator instance.
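The `<component>__<parameter>` naming can be sketched in plain Python. This is a simplified illustration of how a double-underscore key is routed to a sub-object, not scikit-learn's actual implementation; the `Node` class and its `set_nested` method are hypothetical.

```python
class Node:
    """Tiny stand-in for an estimator holding parameters and sub-estimators."""

    def __init__(self, **params):
        self.params = dict(params)

    def set_nested(self, **updates):
        for key, value in updates.items():
            # Split only on the first "__": the head names a parameter or
            # component, the tail is forwarded to that component unchanged.
            head, sep, tail = key.partition("__")
            if head not in self.params:
                raise ValueError(f"unknown parameter {head!r}")
            if sep:
                self.params[head].set_nested(**{tail: value})
            else:
                self.params[head] = value
        return self


# A "pipeline" with one component named "cluster".
tree = Node(max_depth=1, min_sup=1)
pipe = Node(cluster=tree)

# Update max_depth of the nested component through its parent.
result = pipe.set_nested(cluster__max_depth=3)
print(tree.params)
```

Returning `self` mirrors the documented return value of `set_params`, which allows calls to be chained.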