scalers
Classes
MaxAbsScalerTransform | Scale each feature by its maximum absolute value.
MinMaxScalerTransform | Transform features by scaling each feature to a given range.
RobustScalerTransform | Scale features using statistics that are robust to outliers.
StandardScalerTransform | Standardize features by removing the mean and scaling to unit variance.
- class MaxAbsScalerTransform(in_column: Optional[Union[str, List[str]]] = None, inplace: bool = True, out_column: Optional[str] = None, mode: Union[etna.transforms.math.sklearn.TransformMode, str] = 'per-segment')
Scale each feature by its maximum absolute value.
Uses sklearn.preprocessing.MaxAbsScaler inside.
Warning
This transform can suffer from look-ahead bias. For transforming data at some timestamp it uses information from the whole train part.
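To make the warning concrete, here is a minimal sketch in plain Python rather than etna. It assumes the scaling factor is computed once from the whole training series, so the scaled value at an early timestamp depends on observations that arrive later. The series values and the helper name `max_abs_scale` are made up for illustration.

```python
def max_abs_scale(values, fit_values):
    """Divide each value by the maximum absolute value found in fit_values."""
    scale = max(abs(v) for v in fit_values)
    return [v / scale for v in values]


train = [1.0, 2.0, 4.0, 8.0]  # hypothetical training series, oldest first

# Scale fitted on the whole train part, as described in the warning.
scaled_full = max_abs_scale(train, fit_values=train)

# Scale fitted only on what was known at the second timestamp.
scaled_causal = max_abs_scale(train[:2], fit_values=train[:2])

# The value at the second timestamp differs: 2/8 versus 2/2.
print(scaled_full[1], scaled_causal[1])  # 0.25 1.0
```

The gap between the two numbers is the information leaked from later timestamps.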
Init MaxAbsScalerTransform.
- Parameters
in_column (Optional[Union[str, List[str]]]) – columns to be scaled; if None, all columns are scaled.
inplace (bool) – if True, the scaled values replace the original columns; if False, they are written to new columns.
out_column (Optional[str]) – base for the names of generated columns; self.__repr__() is used if not given.
mode (Union[etna.transforms.math.sklearn.TransformMode, str]) – “macro” or “per-segment”, the way to transform features over segments.
If “macro”, transforms features globally, gluing the corresponding ones for all segments.
If “per-segment”, transforms features for each segment separately.
- Raises
ValueError – if an incorrect mode is given.
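The difference between the two modes can be sketched without etna. The snippet below is an illustration under stated assumptions, not the library's implementation: it treats a dataset as a plain dict of segment name to values (a hypothetical stand-in for a TSDataset) and applies max-abs scaling either per segment or with one scale shared by all segments.

```python
def max_abs(values):
    """Largest absolute value in a sequence."""
    return max(abs(v) for v in values)


# Hypothetical dataset: one feature observed in two segments.
segments = {
    "segment_a": [1.0, -2.0, 4.0],
    "segment_b": [10.0, 20.0, -40.0],
}

# "per-segment": each segment is scaled by its own maximum absolute value.
per_segment = {
    name: [v / max_abs(values) for v in values]
    for name, values in segments.items()
}

# "macro": values of the feature from all segments are pooled, and one
# shared scale is applied to every segment.
pooled_scale = max_abs([v for values in segments.values() for v in values])
macro = {
    name: [v / pooled_scale for v in values]
    for name, values in segments.items()
}

print(per_segment["segment_a"])  # [0.25, -0.5, 1.0]
print(macro["segment_a"])        # [0.025, -0.05, 0.1]
```

With “per-segment” every segment ends up in [-1, 1] regardless of its magnitude, while “macro” preserves the relative magnitudes between segments.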
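- fit(ts: etna.datasets.tsdataset.TSDataset) → etna.transforms.math.sklearn.SklearnTransform
Fit the transform.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit the transform on.
- Return type
etna.transforms.math.sklearn.SklearnTransform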
- fit_transform(ts: etna.datasets.tsdataset.TSDataset) → etna.datasets.tsdataset.TSDataset
Fit and transform TSDataset.
May be reimplemented, but this is not recommended.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – TSDataset to transform.
- Returns
Transformed TSDataset.
- Return type
etna.datasets.tsdataset.TSDataset
- get_regressors_info() → List[str]
Return the list with regressors created by the transform.
- Return type
List[str]
- inverse_transform(ts: etna.datasets.tsdataset.TSDataset) → etna.datasets.tsdataset.TSDataset
Inverse transform TSDataset.
Apply the _inverse_transform method.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – TSDataset to be inverse transformed.
- Returns
TSDataset after applying inverse transformation.
- Return type
etna.datasets.tsdataset.TSDataset
- classmethod load(path: pathlib.Path) → typing_extensions.Self
Load an object.
Warning
This method uses the dill module, which is not secure. It is possible to construct malicious data which will execute arbitrary code during loading. Never load data that could have come from an untrusted source, or that could have been tampered with.
- Parameters
path (pathlib.Path) – Path to load object from.
- Returns
Loaded object.
- Return type
typing_extensions.Self
- params_to_tune() → Dict[str, etna.distributions.distributions.BaseDistribution]
Get default grid for tuning hyperparameters.
This grid tunes the mode parameter. Other parameters are expected to be set by the user.
- Returns
Grid to tune.
- Return type
Dict[str, etna.distributions.distributions.BaseDistribution]
- save(path: pathlib.Path)
Save the object.
- Parameters
path (pathlib.Path) – Path to save object to.
- set_params(**params: dict) → etna.core.mixins.TMixin
Return new object instance with modified parameters.
The method also allows changing parameters of nested objects within the current object. For example, it is possible to change parameters of a model in a Pipeline.
Nested parameters are expected to be in a <component_1>.<...>.<parameter> form, where components are separated by a dot.
- Parameters
**params – Estimator parameters
self (etna.core.mixins.TMixin) –
params (dict) –
- Returns
New instance with changed parameters
- Return type
etna.core.mixins.TMixin
Examples
>>> from etna.pipeline import Pipeline
>>> from etna.models import NaiveModel
>>> from etna.transforms import AddConstTransform
>>> model = NaiveModel(lag=1)
>>> transforms = [AddConstTransform(in_column="target", value=1)]
>>> pipeline = Pipeline(model, transforms=transforms, horizon=3)
>>> pipeline.set_params(**{"model.lag": 3, "transforms.0.value": 2})
Pipeline(model = NaiveModel(lag = 3, ), transforms = [AddConstTransform(in_column = 'target', value = 2, inplace = True, out_column = None, )], horizon = 3, )
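How a dotted key such as `transforms.0.value` addresses a nested parameter can be sketched on plain dicts and lists. This is only an illustration of the addressing scheme, not etna's implementation; the helper `set_nested` and the `pipeline_params` structure are hypothetical. Like `set_params`, the sketch returns a new object and leaves the original untouched.

```python
import copy


def set_nested(config, dotted_key, value):
    """Return a copy of config with the entry at dotted_key replaced."""
    result = copy.deepcopy(config)
    *parents, last = dotted_key.split(".")
    node = result
    for part in parents:
        # A numeric component indexes into a list, e.g. "transforms.0".
        node = node[int(part)] if isinstance(node, list) else node[part]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value
    return result


# Hypothetical plain-dict stand-in for a pipeline's parameters.
pipeline_params = {
    "model": {"lag": 1},
    "transforms": [{"in_column": "target", "value": 1}],
    "horizon": 3,
}

updated = set_nested(pipeline_params, "model.lag", 3)
updated = set_nested(updated, "transforms.0.value", 2)

print(updated["model"]["lag"], updated["transforms"][0]["value"])  # 3 2
print(pipeline_params["model"]["lag"])  # 1
```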
- to_dict()
Collect all information about the etna object in a dict.
- transform(ts: etna.datasets.tsdataset.TSDataset) → etna.datasets.tsdataset.TSDataset
Transform TSDataset inplace.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – Dataset to transform.
- Returns
Transformed TSDataset.
- Return type
etna.datasets.tsdataset.TSDataset
- class MinMaxScalerTransform(in_column: Optional[Union[str, List[str]]] = None, inplace: bool = True, out_column: Optional[str] = None, feature_range: Tuple[float, float] = (0, 1), clip: bool = True, mode: Union[etna.transforms.math.sklearn.TransformMode, str] = 'per-segment')
Transform features by scaling each feature to a given range.
Uses sklearn.preprocessing.MinMaxScaler inside.
Warning
This transform can suffer from look-ahead bias. For transforming data at some timestamp it uses information from the whole train part.
Init MinMaxScalerTransform.
- Parameters
in_column (Optional[Union[str, List[str]]]) – columns to be scaled; if None, all columns are scaled.
inplace (bool) – if True, the scaled values replace the original columns; if False, they are written to new columns.
out_column (Optional[str]) – base for the names of generated columns; self.__repr__() is used if not given.
feature_range (Tuple[float, float]) – desired range of transformed data.
clip (bool) – set to True to clip transformed values of held-out data to the provided feature range.
mode (Union[etna.transforms.math.sklearn.TransformMode, str]) – “macro” or “per-segment”, the way to transform features over segments.
If “macro”, transforms features globally, gluing the corresponding ones for all segments.
If “per-segment”, transforms features for each segment separately.
- Raises
ValueError – if an incorrect mode is given.
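The interplay of `feature_range` and `clip` can be sketched in plain Python. This is a simplified illustration, not the code etna or scikit-learn runs: the helper names are made up, and the sketch assumes the training values are not all equal (otherwise the division below would fail).

```python
def fit_min_max(train):
    """Remember the minimum and maximum seen in the training data."""
    return min(train), max(train)


def min_max_transform(values, data_min, data_max, feature_range=(0.0, 1.0), clip=True):
    """Map [data_min, data_max] linearly onto feature_range."""
    low, high = feature_range
    scaled = [
        (v - data_min) / (data_max - data_min) * (high - low) + low
        for v in values
    ]
    if clip:
        # Held-out values outside the training range are cut to the range.
        scaled = [min(max(s, low), high) for s in scaled]
    return scaled


train = [10.0, 20.0, 30.0]    # hypothetical training values
held_out = [5.0, 25.0, 40.0]  # two values fall outside the training range

data_min, data_max = fit_min_max(train)
unclipped = min_max_transform(held_out, data_min, data_max, clip=False)
clipped = min_max_transform(held_out, data_min, data_max, clip=True)

print(unclipped)  # [-0.25, 0.75, 1.5]
print(clipped)    # [0.0, 0.75, 1.0]
```

Clipping keeps held-out data inside the requested range, at the cost of making the transform non-invertible for the clipped points.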
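- fit(ts: etna.datasets.tsdataset.TSDataset) → etna.transforms.math.sklearn.SklearnTransform
Fit the transform.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit the transform on.
- Return type
etna.transforms.math.sklearn.SklearnTransform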
- fit_transform(ts: etna.datasets.tsdataset.TSDataset) → etna.datasets.tsdataset.TSDataset
Fit and transform TSDataset.
May be reimplemented, but this is not recommended.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – TSDataset to transform.
- Returns
Transformed TSDataset.
- Return type
etna.datasets.tsdataset.TSDataset
- get_regressors_info() → List[str]
Return the list with regressors created by the transform.
- Return type
List[str]
- inverse_transform(ts: etna.datasets.tsdataset.TSDataset) → etna.datasets.tsdataset.TSDataset
Inverse transform TSDataset.
Apply the _inverse_transform method.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – TSDataset to be inverse transformed.
- Returns
TSDataset after applying inverse transformation.
- Return type
etna.datasets.tsdataset.TSDataset
- classmethod load(path: pathlib.Path) → typing_extensions.Self
Load an object.
Warning
This method uses the dill module, which is not secure. It is possible to construct malicious data which will execute arbitrary code during loading. Never load data that could have come from an untrusted source, or that could have been tampered with.
- Parameters
path (pathlib.Path) – Path to load object from.
- Returns
Loaded object.
- Return type
typing_extensions.Self
- params_to_tune() → Dict[str, etna.distributions.distributions.BaseDistribution]
Get default grid for tuning hyperparameters.
This grid tunes parameters: mode, clip. Other parameters are expected to be set by the user.
- Returns
Grid to tune.
- Return type
Dict[str, etna.distributions.distributions.BaseDistribution]
- save(path: pathlib.Path)
Save the object.
- Parameters
path (pathlib.Path) – Path to save object to.
- set_params(**params: dict) → etna.core.mixins.TMixin
Return new object instance with modified parameters.
The method also allows changing parameters of nested objects within the current object. For example, it is possible to change parameters of a model in a Pipeline.
Nested parameters are expected to be in a <component_1>.<...>.<parameter> form, where components are separated by a dot.
- Parameters
**params – Estimator parameters
self (etna.core.mixins.TMixin) –
params (dict) –
- Returns
New instance with changed parameters
- Return type
etna.core.mixins.TMixin
Examples
>>> from etna.pipeline import Pipeline
>>> from etna.models import NaiveModel
>>> from etna.transforms import AddConstTransform
>>> model = NaiveModel(lag=1)
>>> transforms = [AddConstTransform(in_column="target", value=1)]
>>> pipeline = Pipeline(model, transforms=transforms, horizon=3)
>>> pipeline.set_params(**{"model.lag": 3, "transforms.0.value": 2})
Pipeline(model = NaiveModel(lag = 3, ), transforms = [AddConstTransform(in_column = 'target', value = 2, inplace = True, out_column = None, )], horizon = 3, )
- to_dict()
Collect all information about the etna object in a dict.
- transform(ts: etna.datasets.tsdataset.TSDataset) → etna.datasets.tsdataset.TSDataset
Transform TSDataset inplace.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – Dataset to transform.
- Returns
Transformed TSDataset.
- Return type
etna.datasets.tsdataset.TSDataset
- class RobustScalerTransform(in_column: Optional[Union[str, List[str]]] = None, inplace: bool = True, out_column: Optional[str] = None, with_centering: bool = True, with_scaling: bool = True, quantile_range: Tuple[float, float] = (25, 75), unit_variance: bool = False, mode: Union[etna.transforms.math.sklearn.TransformMode, str] = 'per-segment')
Scale features using statistics that are robust to outliers.
Uses sklearn.preprocessing.RobustScaler inside.
Warning
This transform can suffer from look-ahead bias. For transforming data at some timestamp it uses information from the whole train part.
Init RobustScalerTransform.
- Parameters
in_column (Optional[Union[str, List[str]]]) – columns to be scaled; if None, all columns are scaled.
inplace (bool) – if True, the scaled values replace the original columns; if False, they are written to new columns.
out_column (Optional[str]) – base for the names of generated columns; self.__repr__() is used if not given.
with_centering (bool) – if True, center the data before scaling.
with_scaling (bool) – if True, scale the data to the interquartile range.
quantile_range (Tuple[float, float]) – quantile range (q_min, q_max) used to calculate the scale.
unit_variance (bool) – if True, scale data so that normally distributed features have a variance of 1.
In general, if the difference between the x-values of q_max and q_min for a standard normal distribution is greater than 1, the dataset will be scaled down. If less than 1, the dataset will be scaled up.
mode (Union[etna.transforms.math.sklearn.TransformMode, str]) – “macro” or “per-segment”, the way to transform features over segments.
If “macro”, transforms features globally, gluing the corresponding ones for all segments.
If “per-segment”, transforms features for each segment separately.
- Raises
ValueError – if an incorrect mode is given.
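A rough sketch of what these parameters control, using only the Python standard library. It is not the code path etna or scikit-learn executes: the helper names are invented, `percentile` accepts only integer percentiles from 1 to 99, and the `unit_variance` adjustment is an assumption (the interquantile range is divided by the same range measured on a standard normal distribution).

```python
from statistics import NormalDist, quantiles


def percentile(values, q):
    """Return the q-th percentile (integer 1..99) using linear interpolation."""
    return quantiles(values, n=100, method="inclusive")[q - 1]


def fit_robust(train, quantile_range=(25, 75), unit_variance=False):
    """Return (center, scale): the median and the interquantile range."""
    q_min, q_max = quantile_range
    center = percentile(train, 50)
    scale = percentile(train, q_max) - percentile(train, q_min)
    if unit_variance:
        # Assumed adjustment: divide by the same interquantile range
        # measured on a standard normal distribution.
        normal = NormalDist()
        scale /= normal.inv_cdf(q_max / 100) - normal.inv_cdf(q_min / 100)
    return center, scale


def robust_transform(values, center, scale, with_centering=True, with_scaling=True):
    """Subtract the center and divide by the scale, each step optional."""
    shift = center if with_centering else 0.0
    divisor = scale if with_scaling else 1.0
    return [(v - shift) / divisor for v in values]


train = [1.0, 2.0, 3.0, 4.0, 100.0]  # hypothetical series with one outlier

center, scale = fit_robust(train)
print(center, scale)  # 3.0 2.0
print(robust_transform(train, center, scale))  # [-1.0, -0.5, 0.0, 0.5, 48.5]

_, scale_uv = fit_robust(train, unit_variance=True)
print(round(scale_uv, 3))  # 1.483
```

The outlier at 100.0 does not affect the median or the interquartile range, so the remaining points keep a sensible spread after scaling.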
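- fit(ts: etna.datasets.tsdataset.TSDataset) → etna.transforms.math.sklearn.SklearnTransform
Fit the transform.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit the transform on.
- Return type
etna.transforms.math.sklearn.SklearnTransform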
- fit_transform(ts: etna.datasets.tsdataset.TSDataset) → etna.datasets.tsdataset.TSDataset
Fit and transform TSDataset.
May be reimplemented, but this is not recommended.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – TSDataset to transform.
- Returns
Transformed TSDataset.
- Return type
etna.datasets.tsdataset.TSDataset
- get_regressors_info() → List[str]
Return the list with regressors created by the transform.
- Return type
List[str]
- inverse_transform(ts: etna.datasets.tsdataset.TSDataset) → etna.datasets.tsdataset.TSDataset
Inverse transform TSDataset.
Apply the _inverse_transform method.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – TSDataset to be inverse transformed.
- Returns
TSDataset after applying inverse transformation.
- Return type
etna.datasets.tsdataset.TSDataset
- classmethod load(path: pathlib.Path) → typing_extensions.Self
Load an object.
Warning
This method uses the dill module, which is not secure. It is possible to construct malicious data which will execute arbitrary code during loading. Never load data that could have come from an untrusted source, or that could have been tampered with.
- Parameters
path (pathlib.Path) – Path to load object from.
- Returns
Loaded object.
- Return type
typing_extensions.Self
- params_to_tune() → Dict[str, etna.distributions.distributions.BaseDistribution]
Get default grid for tuning hyperparameters.
This grid tunes parameters: mode, with_centering, with_scaling, unit_variance. Other parameters are expected to be set by the user.
- Returns
Grid to tune.
- Return type
Dict[str, etna.distributions.distributions.BaseDistribution]
- save(path: pathlib.Path)
Save the object.
- Parameters
path (pathlib.Path) – Path to save object to.
- set_params(**params: dict) → etna.core.mixins.TMixin
Return new object instance with modified parameters.
The method also allows changing parameters of nested objects within the current object. For example, it is possible to change parameters of a model in a Pipeline.
Nested parameters are expected to be in a <component_1>.<...>.<parameter> form, where components are separated by a dot.
- Parameters
**params – Estimator parameters
self (etna.core.mixins.TMixin) –
params (dict) –
- Returns
New instance with changed parameters
- Return type
etna.core.mixins.TMixin
Examples
>>> from etna.pipeline import Pipeline
>>> from etna.models import NaiveModel
>>> from etna.transforms import AddConstTransform
>>> model = NaiveModel(lag=1)
>>> transforms = [AddConstTransform(in_column="target", value=1)]
>>> pipeline = Pipeline(model, transforms=transforms, horizon=3)
>>> pipeline.set_params(**{"model.lag": 3, "transforms.0.value": 2})
Pipeline(model = NaiveModel(lag = 3, ), transforms = [AddConstTransform(in_column = 'target', value = 2, inplace = True, out_column = None, )], horizon = 3, )
- to_dict()
Collect all information about the etna object in a dict.
- transform(ts: etna.datasets.tsdataset.TSDataset) → etna.datasets.tsdataset.TSDataset
Transform TSDataset inplace.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – Dataset to transform.
- Returns
Transformed TSDataset.
- Return type
etna.datasets.tsdataset.TSDataset
- class StandardScalerTransform(in_column: Optional[Union[str, List[str]]] = None, inplace: bool = True, out_column: Optional[str] = None, with_mean: bool = True, with_std: bool = True, mode: Union[etna.transforms.math.sklearn.TransformMode, str] = 'per-segment')
Standardize features by removing the mean and scaling to unit variance.
Uses sklearn.preprocessing.StandardScaler inside.
Warning
This transform can suffer from look-ahead bias. For transforming data at some timestamp it uses information from the whole train part.
Init StandardScalerTransform.
- Parameters
in_column (Optional[Union[str, List[str]]]) – columns to be scaled; if None, all columns are scaled.
inplace (bool) – if True, the scaled values replace the original columns; if False, they are written to new columns.
out_column (Optional[str]) – base for the names of generated columns; self.__repr__() is used if not given.
with_mean (bool) – if True, center the data before scaling.
with_std (bool) – if True, scale the data to unit standard deviation.
mode (Union[etna.transforms.math.sklearn.TransformMode, str]) – “macro” or “per-segment”, the way to transform features over segments.
If “macro”, transforms features globally, gluing the corresponding ones for all segments.
If “per-segment”, transforms features for each segment separately.
- Raises
ValueError – if an incorrect mode is given.
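The effect of `with_mean` and `with_std`, and the round trip through the inverse transformation, can be sketched with the standard library. This is an illustration under stated assumptions rather than the library's code: the helper names are made up, and the population standard deviation is used, which matches how scikit-learn's StandardScaler computes its scale.

```python
from statistics import fmean, pstdev


def fit_standard(train, with_mean=True, with_std=True):
    """Return the (mean, std) pair to apply; disabled steps become no-ops."""
    mean = fmean(train) if with_mean else 0.0
    std = pstdev(train) if with_std else 1.0
    return mean, std


def standard_transform(values, mean, std):
    """Remove the mean, then scale to unit standard deviation."""
    return [(v - mean) / std for v in values]


def standard_inverse_transform(values, mean, std):
    """Undo standard_transform."""
    return [v * std + mean for v in values]


train = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical series

mean, std = fit_standard(train)
print(mean, std)  # 5.0 2.0

scaled = standard_transform(train, mean, std)
print(scaled)  # [-1.5, -0.5, -0.5, -0.5, 0.0, 0.0, 1.0, 2.0]

restored = standard_inverse_transform(scaled, mean, std)
print(restored == train)  # True
```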
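- fit(ts: etna.datasets.tsdataset.TSDataset) → etna.transforms.math.sklearn.SklearnTransform
Fit the transform.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – TSDataset to fit the transform on.
- Return type
etna.transforms.math.sklearn.SklearnTransform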
- fit_transform(ts: etna.datasets.tsdataset.TSDataset) → etna.datasets.tsdataset.TSDataset
Fit and transform TSDataset.
May be reimplemented, but this is not recommended.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – TSDataset to transform.
- Returns
Transformed TSDataset.
- Return type
etna.datasets.tsdataset.TSDataset
- get_regressors_info() → List[str]
Return the list with regressors created by the transform.
- Return type
List[str]
- inverse_transform(ts: etna.datasets.tsdataset.TSDataset) → etna.datasets.tsdataset.TSDataset
Inverse transform TSDataset.
Apply the _inverse_transform method.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – TSDataset to be inverse transformed.
- Returns
TSDataset after applying inverse transformation.
- Return type
etna.datasets.tsdataset.TSDataset
- classmethod load(path: pathlib.Path) → typing_extensions.Self
Load an object.
Warning
This method uses the dill module, which is not secure. It is possible to construct malicious data which will execute arbitrary code during loading. Never load data that could have come from an untrusted source, or that could have been tampered with.
- Parameters
path (pathlib.Path) – Path to load object from.
- Returns
Loaded object.
- Return type
typing_extensions.Self
- params_to_tune() → Dict[str, etna.distributions.distributions.BaseDistribution]
Get default grid for tuning hyperparameters.
This grid tunes parameters: mode, with_mean, with_std. Other parameters are expected to be set by the user.
- Returns
Grid to tune.
- Return type
Dict[str, etna.distributions.distributions.BaseDistribution]
- save(path: pathlib.Path)
Save the object.
- Parameters
path (pathlib.Path) – Path to save object to.
- set_params(**params: dict) → etna.core.mixins.TMixin
Return new object instance with modified parameters.
The method also allows changing parameters of nested objects within the current object. For example, it is possible to change parameters of a model in a Pipeline.
Nested parameters are expected to be in a <component_1>.<...>.<parameter> form, where components are separated by a dot.
- Parameters
**params – Estimator parameters
self (etna.core.mixins.TMixin) –
params (dict) –
- Returns
New instance with changed parameters
- Return type
etna.core.mixins.TMixin
Examples
>>> from etna.pipeline import Pipeline
>>> from etna.models import NaiveModel
>>> from etna.transforms import AddConstTransform
>>> model = NaiveModel(lag=1)
>>> transforms = [AddConstTransform(in_column="target", value=1)]
>>> pipeline = Pipeline(model, transforms=transforms, horizon=3)
>>> pipeline.set_params(**{"model.lag": 3, "transforms.0.value": 2})
Pipeline(model = NaiveModel(lag = 3, ), transforms = [AddConstTransform(in_column = 'target', value = 2, inplace = True, out_column = None, )], horizon = 3, )
- to_dict()
Collect all information about the etna object in a dict.
- transform(ts: etna.datasets.tsdataset.TSDataset) → etna.datasets.tsdataset.TSDataset
Transform TSDataset inplace.
- Parameters
ts (etna.datasets.tsdataset.TSDataset) – Dataset to transform.
- Returns
Transformed TSDataset.
- Return type
etna.datasets.tsdataset.TSDataset