EuclideanClustering
- class EuclideanClustering
Bases: etna.clustering.hierarchical.base.HierarchicalClustering
Hierarchical clustering with Euclidean distance.
Examples
>>> from etna.clustering import EuclideanClustering
>>> from etna.datasets import TSDataset
>>> from etna.datasets import generate_ar_df
>>> ts = generate_ar_df(periods = 40, start_time = "2000-01-01", n_segments = 10)
>>> ts = TSDataset(TSDataset.to_dataset(ts), freq="D")
>>> model = EuclideanClustering()
>>> model.build_distance_matrix(ts)
>>> model.build_clustering_algo(n_clusters=3, linkage="average")
>>> segment2cluster = model.fit_predict()
>>> segment2cluster
{'segment_0': 2, 'segment_1': 1, 'segment_2': 0, 'segment_3': 1, 'segment_4': 1, 'segment_5': 0, 'segment_6': 0, 'segment_7': 0, 'segment_8': 2, 'segment_9': 2}
Create an instance of EuclideanClustering.
Methods
build_clustering_algo([n_clusters, linkage]) – Build clustering algo (see sklearn.cluster.AgglomerativeClustering) with given params.
build_distance_matrix(ts) – Build distance matrix with Euclidean distance.
fit_predict() – Fit the clustering algorithm and predict clusters according to the built distance matrix.
get_centroids(**averaging_kwargs) – Get centroids of clusters.
set_params(**params) – Return new object instance with modified parameters.
to_dict() – Collect all information about etna object in dict.
- build_clustering_algo(n_clusters: int = 30, linkage: Union[str, etna.clustering.hierarchical.base.ClusteringLinkageMode] = ClusteringLinkageMode.average, **clustering_algo_params)
Build clustering algo (see sklearn.cluster.AgglomerativeClustering) with given params.
- Parameters
n_clusters (int) – number of clusters to build
linkage (Union[str, etna.clustering.hierarchical.base.ClusteringLinkageMode]) – rule for computing the distance between newly formed clusters; allowed values are “ward”, “single”, “average”, “maximum”, “complete”
Notes
Calling this method again reinitializes the clustering algorithm and resets any previous clustering results.
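To make the "average" linkage rule concrete, here is a minimal pure-Python sketch of agglomerative clustering over a precomputed distance matrix. It is an illustration only: according to the text above, the library relies on sklearn.cluster.AgglomerativeClustering, and the function name `average_linkage_clusters` and the toy matrix are hypothetical.

```python
def average_linkage_clusters(distances, n_clusters):
    """Merge the two closest clusters until only n_clusters remain.

    The distance between two clusters is the mean of all pairwise
    distances between their members (the "average" linkage rule).
    """
    # Start with every item in its own cluster.
    clusters = [[i] for i in range(len(distances))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pair_dists = [
                    distances[i][j] for i in clusters[a] for j in clusters[b]
                ]
                avg = sum(pair_dists) / len(pair_dists)
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        # Merge cluster b into cluster a and drop b.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Hypothetical symmetric distance matrix for four series.
distances = [
    [0.0, 1.0, 9.0, 10.0],
    [1.0, 0.0, 8.0, 9.0],
    [9.0, 8.0, 0.0, 2.0],
    [10.0, 9.0, 2.0, 0.0],
]
clusters = average_linkage_clusters(distances, n_clusters=2)
print(clusters)
```

With a different linkage rule, only the way `avg` is computed would change (for example, taking the minimum for "single" linkage).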
- build_distance_matrix(ts: TSDataset)
Build distance matrix with Euclidean distance.
- Parameters
ts (TSDataset) – TSDataset with the series used to build the distance matrix
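As a rough sketch of what a pairwise Euclidean distance matrix looks like, the following pure-Python example computes one for a few toy series. It assumes equal-length series with no missing values; how the library aligns timestamps or handles gaps is not described here, and the helper names are hypothetical.

```python
import math


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length numeric sequences."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pairwise_distance_matrix(series_by_segment):
    """Return sorted segment names and a symmetric distance matrix."""
    segments = sorted(series_by_segment)
    n = len(segments)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = euclidean_distance(
                series_by_segment[segments[i]], series_by_segment[segments[j]]
            )
            # The distance is symmetric, so fill both halves at once.
            matrix[i][j] = d
            matrix[j][i] = d
    return segments, matrix


# Toy data standing in for the segments of a dataset.
toy = {
    "segment_0": [0.0, 0.0],
    "segment_1": [3.0, 4.0],
    "segment_2": [6.0, 8.0],
}
segments, matrix = pairwise_distance_matrix(toy)
print(segments)
print(matrix)
```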
- fit_predict() -> Dict[str, int]
Fit the clustering algorithm and predict clusters according to the built distance matrix.
- Returns
dict in format {segment: cluster}
- Return type
Dict[str, int]
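The returned {segment: cluster} mapping is often easier to inspect when grouped by cluster. A small sketch using only the standard library, with the mapping copied from the class example above:

```python
from collections import defaultdict

# Mapping in the {segment: cluster} format returned by fit_predict().
segment2cluster = {
    "segment_0": 2, "segment_1": 1, "segment_2": 0, "segment_3": 1,
    "segment_4": 1, "segment_5": 0, "segment_6": 0, "segment_7": 0,
    "segment_8": 2, "segment_9": 2,
}

# Invert it into {cluster: [segments]}.
cluster2segments = defaultdict(list)
for segment, cluster in segment2cluster.items():
    cluster2segments[cluster].append(segment)

for cluster in sorted(cluster2segments):
    print(cluster, cluster2segments[cluster])
```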
- get_centroids(**averaging_kwargs) -> pandas.core.frame.DataFrame
Get centroids of clusters.
- Returns
dataframe with centroids
- Return type
pd.DataFrame
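One common way to define a cluster centroid is the pointwise mean of its member series. The sketch below illustrates that idea in pure Python. Using the mean is an assumption here: the averaging that get_centroids applies, and the options accepted through averaging_kwargs, are not specified in this section. The helper `mean_centroids` is hypothetical and returns plain lists rather than a DataFrame.

```python
from statistics import fmean


def mean_centroids(series_by_segment, segment2cluster):
    """Return {cluster: pointwise mean of the series in that cluster}."""
    members = {}
    for segment, cluster in segment2cluster.items():
        members.setdefault(cluster, []).append(series_by_segment[segment])
    return {
        # zip(*group) walks the member series timestamp by timestamp.
        cluster: [fmean(values) for values in zip(*group)]
        for cluster, group in members.items()
    }


# Toy series and labels (assumed data, equal lengths).
series = {
    "segment_0": [1.0, 2.0],
    "segment_1": [3.0, 4.0],
    "segment_2": [10.0, 10.0],
}
labels = {"segment_0": 0, "segment_1": 0, "segment_2": 1}
centroids = mean_centroids(series, labels)
print(centroids)
```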
- set_params(**params: dict) -> etna.core.mixins.TMixin
Return new object instance with modified parameters.
The method also allows changing parameters of nested objects within the current object. For example, it is possible to change parameters of a model in a Pipeline.
Nested parameters are expected to be in a <component_1>.<...>.<parameter> form, where components are separated by a dot.
- Parameters
**params – Estimator parameters
self (etna.core.mixins.TMixin) –
params (dict) –
- Returns
New instance with changed parameters
- Return type
etna.core.mixins.TMixin
Examples
>>> from etna.pipeline import Pipeline
>>> from etna.models import NaiveModel
>>> from etna.transforms import AddConstTransform
>>> model = NaiveModel(lag=1)
>>> transforms = [AddConstTransform(in_column="target", value=1)]
>>> pipeline = Pipeline(model, transforms=transforms, horizon=3)
>>> pipeline.set_params(**{"model.lag": 3, "transforms.0.value": 2})
Pipeline(model = NaiveModel(lag = 3, ), transforms = [AddConstTransform(in_column = 'target', value = 2, inplace = True, out_column = None, )], horizon = 3, )
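The dotted <component_1>.<...>.<parameter> keys can be read as paths into a nested structure. The sketch below shows one way such flat keys might be expanded into nested dictionaries. It is not the library's implementation; `expand_dotted_params` is a hypothetical helper, and list indices such as "0" are kept as string keys for simplicity.

```python
def expand_dotted_params(params):
    """Expand {"a.b.c": value} into {"a": {"b": {"c": value}}}."""
    nested = {}
    for dotted_key, value in params.items():
        # Everything before the last dot names a component; the last part
        # is the parameter itself.
        *parents, leaf = dotted_key.split(".")
        node = nested
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return nested


nested = expand_dotted_params({"model.lag": 3, "transforms.0.value": 2})
print(nested)
```

Each top-level key then identifies the component to update, and the inner dictionary holds the parameters to change on it.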
- to_dict()
Collect all information about etna object in dict.