plots

Functions

`_cross_correlation`(a, b[, maxlags, normed])	Calculate cross correlation between arrays.
`acf_plot`(ts[, n_segments, lags, partial, ...])	Autocorrelation and partial autocorrelation plot for multiple timeseries.
`cross_corr_plot`(ts[, n_segments, maxlags, ...])	Cross-correlation plot between multiple timeseries.
`distribution_plot`(ts[, n_segments, ...])	Distribution of z-values grouped by segments and time frequency.
`plot_clusters`(ts, segment2cluster[, ...])	Plot clusters [with centroids].
`plot_correlation_matrix`(ts[, columns, ...])	Plot pairwise correlation heatmap for selected segments.
`plot_holidays`(ts, holidays[, segments, ...])	Plot holidays for segments.
`plot_imputation`(ts, imputer[, segments, ...])	Plot the result of imputation by a given imputer.
`plot_periodogram`(ts, period[, ...])	Plot the periodogram using `scipy.signal.periodogram()`.

acf_plot(ts: TSDataset, n_segments: int = 10, lags: int = 21, partial: bool = False, columns_num: int = 2, segments: Optional[List[str]] = None, figsize: Tuple[int, int] = (10, 5))

Autocorrelation and partial autocorrelation plot for multiple timeseries.

Notes

Definition of autocorrelation.

Definition of partial autocorrelation.

If partial=False, the function works with NaNs at any position in the time series.
If partial=True, the function works only with NaNs at the edges of the time series and fails if there are NaNs inside it.

Parameters

ts (TSDataset) – TSDataset with timeseries data
n_segments (int) – number of random segments to plot
lags (int) – number of timeseries shifts for autocorrelation
partial (bool) – if True, plot partial autocorrelation; otherwise plot autocorrelation
columns_num (int) – number of columns in subplots
segments (Optional[List[str]]) – segments to plot
figsize (Tuple[int, int]) – size of the figure per subplot with one segment in inches

Raises

ValueError: – If partial=True and there is a NaN in the middle of the time series
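
The NaN constraint for partial=True can be checked before plotting. The sketch below is a minimal pure-Python illustration of that check, not etna's implementation; the helper name `has_internal_nans` is hypothetical.

```python
import math


def has_internal_nans(values):
    """Return True if a NaN occurs between the first and last non-NaN values."""
    # Indices of all observed (non-NaN) points.
    observed = [i for i, v in enumerate(values) if not math.isnan(v)]
    if not observed:
        return False  # all-NaN series: nothing lies "inside"
    first, last = observed[0], observed[-1]
    return any(math.isnan(v) for v in values[first:last + 1])


nan = float("nan")
print(has_internal_nans([nan, 1.0, 2.0, 3.0, nan]))  # False: NaNs only at the edges
print(has_internal_nans([1.0, nan, 3.0]))            # True: NaN inside the series
```

A series for which this returns True would be expected to trigger the ValueError described above when partial=True.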

cross_corr_plot(ts: TSDataset, n_segments: int = 10, maxlags: int = 21, segments: Optional[List[str]] = None, columns_num: int = 2, figsize: Tuple[int, int] = (10, 5))

Cross-correlation plot between multiple timeseries.

Parameters

ts (TSDataset) – TSDataset with timeseries data
n_segments (int) – number of random segments to plot, ignored if parameter segments is set
maxlags (int) – number of timeseries shifts for cross-correlation, should be >=1 and <= len(timeseries)
segments (Optional[List[str]]) – segments to plot
columns_num (int) – number of columns in subplots
figsize (Tuple[int, int]) – size of the figure per subplot with one segment in inches

Raises

ValueError: – if maxlags doesn’t satisfy the constraints
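
To make the role of maxlags concrete, here is a simplified pure-Python sketch of a normalized cross-correlation over lags -maxlags..maxlags. It is an illustration under stated assumptions (equal-length inputs, no NaNs, neither series all zeros) and does not reproduce etna's `_cross_correlation`; the function name and normalization are choices made for this example.

```python
import math


def cross_correlation(a, b, maxlags):
    """Normalized cross-correlation of two equal-length sequences."""
    n = len(a)
    if len(b) != n:
        raise ValueError("a and b must have the same length")
    if not 1 <= maxlags <= n:
        raise ValueError("maxlags should be >= 1 and <= len(timeseries)")
    # Normalize by the geometric mean of the two energies
    # (assumes neither series is all zeros).
    norm = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    result = {}
    for lag in range(-maxlags, maxlags + 1):
        # Pair a[t + lag] with b[t] wherever both indices are valid.
        total = sum(a[t + lag] * b[t] for t in range(n) if 0 <= t + lag < n)
        result[lag] = total / norm
    return result


cc = cross_correlation([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0], maxlags=2)
print(cc[0])  # 1.0: a series is perfectly correlated with itself at lag 0
```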

distribution_plot(ts: TSDataset, n_segments: int = 10, segments: Optional[List[str]] = None, shift: int = 30, window: int = 30, freq: str = '1M', n_rows: int = 10, figsize: Tuple[int, int] = (10, 5))

Distribution of z-values grouped by segments and time frequency.

The mean is calculated over windows:

\[\text{mean}_{i} = \sum_{j=i-\text{shift}}^{i-\text{shift}+\text{window}} \frac{x_{j}}{\text{window}}\]

The standard deviation is calculated over the same windows.
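
As a worked illustration of the window formula above, the sketch below computes z-values as (x_i - mean_i) / std_i in pure Python. It rests on two assumptions that are not stated in this page: the upper limit of the sum is read as exclusive, so that each window holds exactly `window` points, and the sample standard deviation is used. The function name `z_values` is hypothetical, and the library's own computation may differ.

```python
import statistics


def z_values(x, shift, window):
    """Z-value of each point against a shifted window; None where undefined."""
    if window < 2:
        raise ValueError("window must be >= 2 to compute a sample standard deviation")
    result = []
    for i in range(len(x)):
        lo = i - shift
        hi = lo + window  # exclusive upper bound (assumption, see lead-in)
        if lo < 0 or hi > len(x):
            result.append(None)  # window falls outside the series
            continue
        chunk = x[lo:hi]
        mean = sum(chunk) / window
        std = statistics.stdev(chunk)
        result.append((x[i] - mean) / std if std > 0 else None)
    return result


z = z_values([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shift=2, window=2)
print(z[:2])  # [None, None]: no full window is available yet
```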

Parameters

ts (TSDataset) – dataset with timeseries data
n_segments (int) – number of random segments to plot
segments (Optional[List[str]]) – segments to plot
shift (int) – number of timeseries shifts for statistics calculation
window (int) – number of points for statistics calculation
freq (str) – frequency by which z-values are grouped
n_rows (int) – maximum number of rows to plot
figsize (Tuple[int, int]) – size of the figure per subplot with one segment in inches

plot_clusters(ts: TSDataset, segment2cluster: Dict[str, int], centroids_df: Optional[pandas.core.frame.DataFrame] = None, columns_num: int = 2, figsize: Tuple[int, int] = (10, 5))

Plot clusters [with centroids].

Parameters

ts (TSDataset) – TSDataset with timeseries
segment2cluster (Dict[str, int]) – mapping from segment to cluster in format {segment: cluster}
centroids_df (Optional[pandas.core.frame.DataFrame]) – dataframe with centroids
columns_num (int) – number of columns in subplots
figsize (Tuple[int, int]) – size of the figure per subplot with one segment in inches
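
The {segment: cluster} format of segment2cluster can be illustrated with a small example. The segment names and cluster labels below are hypothetical, and the helper `group_segments` is only a sketch of how such a mapping might be inverted to see which segments share a cluster.

```python
from collections import defaultdict

# Hypothetical segment names and cluster labels, for illustration only.
segment2cluster = {"segment_a": 0, "segment_b": 1, "segment_c": 0}


def group_segments(mapping):
    """Invert {segment: cluster} into {cluster: [segments]}."""
    clusters = defaultdict(list)
    for segment, cluster in mapping.items():
        clusters[cluster].append(segment)
    return dict(clusters)


print(group_segments(segment2cluster))
# {0: ['segment_a', 'segment_c'], 1: ['segment_b']}
```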

plot_correlation_matrix(ts: TSDataset, columns: Optional[List[str]] = None, segments: Optional[List[str]] = None, method: str = 'pearson', mode: str = 'macro', columns_num: int = 2, figsize: Tuple[int, int] = (10, 10), **heatmap_kwargs)

Plot pairwise correlation heatmap for selected segments.

Parameters

ts (TSDataset) – TSDataset with timeseries data
columns (Optional[List[str]]) – Columns to use, if None use all columns
segments (Optional[List[str]]) – Segments to use
method (str) –
Method of correlation:
- pearson: standard correlation coefficient
- kendall: Kendall Tau correlation coefficient
- spearman: Spearman rank correlation
mode ('macro' or 'per-segment') – Aggregation mode
columns_num (int) – Number of subplots columns
figsize (Tuple[int, int]) – size of the figure in inches
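
The difference between the pearson and spearman methods can be seen on a monotonic but nonlinear relationship. The pure-Python sketch below is illustrative only: the function names are hypothetical, ties are not handled in the ranking, and it is not the code the library uses to compute correlations.

```python
import math


def pearson(x, y):
    """Standard (Pearson) correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """Rank 1 for the smallest value; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


x = [1.0, 2.0, 3.0, 4.0]
y = [1.0, 4.0, 9.0, 16.0]  # monotonic in x, but not linear
print(spearman(x, y))  # 1.0: the ordering is preserved exactly
print(pearson(x, y) < 1.0)  # True: the relationship is not linear
```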

plot_holidays(ts: TSDataset, holidays: Union[str, pandas.core.frame.DataFrame], segments: Optional[List[str]] = None, columns_num: int = 2, figsize: Tuple[int, int] = (10, 5), start: Optional[str] = None, end: Optional[str] = None, as_is: bool = False)

Plot holidays for segments.

A sequence of consecutive timestamps with one holiday is drawn as a colored region. An individual holiday is drawn as a colored point.

It is not possible to distinguish points plotted at the same timestamp, but this case is considered rare. This problem isn’t relevant for regions because they are partially transparent.
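
The region-versus-point distinction described above can be sketched by grouping consecutive holiday flags into runs. This is a minimal pure-Python illustration, not etna's drawing logic; `holiday_runs` is a hypothetical helper, and the 0/1 flags stand in for one holiday column.

```python
def holiday_runs(flags):
    """Group consecutive 1-flags into (start, end) index pairs, inclusive."""
    runs = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i  # a new run begins
        elif not flag and start is not None:
            runs.append((start, i - 1))  # the run ended at the previous index
            start = None
    if start is not None:
        runs.append((start, len(flags) - 1))  # run reaches the end of the series
    return runs


flags = [0, 1, 1, 1, 0, 1, 0]
for start, end in holiday_runs(flags):
    kind = "point" if start == end else "region"
    print(kind, start, end)
# region 1 3
# point 5 5
```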

Parameters

ts (TSDataset) – TSDataset with timeseries data
holidays (Union[str, pandas.core.frame.DataFrame]) –
there are several options:
- if str, then this is the code of the country in the holidays library;
- if DataFrame, then the dataframe is expected to be in prophet's holiday format;
segments (Optional[List[str]]) – segments to use
columns_num (int) – number of columns in subplots
figsize (Tuple[int, int]) – size of the figure per subplot with one segment in inches
as_is (bool) – use this option if holidays is a DataFrame with a timestamp index and one column per holiday name; in a holiday column, the value 0 represents the absence of the holiday at that timestamp and 1 represents its presence
start (Optional[str]) – start timestamp for plot
end (Optional[str]) – end timestamp for plot

Raises

ValueError: –
- If holidays is neither a pd.DataFrame nor a string.
- If holidays is an empty pd.DataFrame.
- If as_is=True while holidays is a string.
- If upper_window is negative.
- If lower_window is positive.

plot_imputation(ts: TSDataset, imputer: TimeSeriesImputerTransform, segments: Optional[List[str]] = None, columns_num: int = 2, figsize: Tuple[int, int] = (10, 5), start: Optional[str] = None, end: Optional[str] = None)

Plot the result of imputation by a given imputer.

Parameters

ts (TSDataset) – TSDataset with timeseries data
imputer (TimeSeriesImputerTransform) – transform used to impute NaNs
segments (Optional[List[str]]) – segments to use
columns_num (int) – number of columns in subplots
figsize (Tuple[int, int]) – size of the figure per subplot with one segment in inches
start (Optional[str]) – start timestamp for plot
end (Optional[str]) – end timestamp for plot

plot_periodogram(ts: TSDataset, period: float, amplitude_aggregation_mode: Union[str, Literal['per-segment']] = AggregationMode.mean, periodogram_params: Optional[Dict[str, Any]] = None, segments: Optional[List[str]] = None, xticks: Optional[List[Any]] = None, columns_num: int = 2, figsize: Tuple[int, int] = (10, 5))

Plot the periodogram using scipy.signal.periodogram().

It is useful to determine the optimal order parameter for FourierTransform.
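
To show why passing the period as the sampling frequency is convenient, the sketch below computes a bare-bones periodogram with a direct discrete Fourier transform. With fs equal to the period, the frequency axis is measured in cycles per period, so a seasonal pattern with that period peaks at frequency 1 and its harmonics fall at 2, 3, and so on. This is a simplified stand-in written for illustration (mean removal, no windowing, simplified scaling), not a reimplementation of scipy.signal.periodogram(); the function name is hypothetical.

```python
import cmath
import math


def simple_periodogram(x, fs):
    """Return (frequencies, power) for non-negative frequencies up to fs / 2."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]  # remove the constant component
    freqs, power = [], []
    for k in range(n // 2 + 1):
        # k-th DFT coefficient computed directly from the definition.
        coef = sum(centered[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        freqs.append(k * fs / n)
        power.append(abs(coef) ** 2 / (fs * n))
    return freqs, power


period = 7  # e.g. weekly seasonality in daily data
series = [math.sin(2 * math.pi * t / period) for t in range(70)]
freqs, power = simple_periodogram(series, fs=period)
peak = freqs[max(range(len(power)), key=power.__getitem__)]
print(peak)  # 1.0: one cycle per period
```

In a plot of this kind, the highest frequency that still shows a clear peak gives a rough indication of how many harmonics are worth modelling.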

Parameters

ts (TSDataset) – TSDataset with timeseries data
period (float) – the period of the seasonality to capture, in frequency units of the time series; it should be >= 2 and is translated to the fs parameter of scipy.signal.periodogram()
amplitude_aggregation_mode (Union[str, Literal['per-segment']]) – aggregation strategy for the per-segment periodograms; all available strategies are listed in AggregationMode
periodogram_params (Optional[Dict[str, Any]]) – additional keyword arguments for periodogram, scipy.signal.periodogram() is used
segments (Optional[List[str]]) – segments to use
xticks (Optional[List[Any]]) – list of tick locations of the x-axis, useful to highlight specific reference periodicities
columns_num (int) – number of columns in subplots if amplitude_aggregation_mode="per-segment"; otherwise the value is ignored
figsize (Tuple[int, int]) – size of the figure per subplot with one segment in inches

Raises

ValueError: – if period < 2
ValueError: – if periodogram can’t be calculated on segment because of the NaNs inside it

Notes

In non-per-segment mode, all segments are truncated to the same length by keeping the last values.
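
The truncation described in this note can be sketched as follows. The helper `truncate_to_common_length` is hypothetical and only illustrates the idea of keeping the last values of each segment so that all segments have equal length; it is not taken from the library.

```python
def truncate_to_common_length(series_list):
    """Cut every series to the length of the shortest one, keeping the last values."""
    min_len = min(len(s) for s in series_list)
    if min_len == 0:
        raise ValueError("every series must contain at least one value")
    return [s[-min_len:] for s in series_list]


segments = [[1, 2, 3, 4, 5], [10, 20, 30]]
print(truncate_to_common_length(segments))
# [[3, 4, 5], [10, 20, 30]]
```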