hist_outliers
Functions
adjust_estimation – Compute sse_one_bin[i][k] using binary search.
compute_f – Compute F.
get_anomalies_hist – Get point outliers in time series using histogram model.
hist – Compute outliers indices according to hist rule.
optimal_sse – Compute the approximation error of one bin covering the elements from left to right.
v_optimal_hist – Compute the approximation error of a series with [1, bins_number] bins.
- adjust_estimation(i: int, k: int, sse: numpy.ndarray, sse_one_bin: numpy.ndarray) → float
Compute sse_one_bin[i][k] using binary search.
- Parameters
i (int) – left border of series
k (int) – number of bins
sse (numpy.ndarray) – array of approximation errors
sse_one_bin (numpy.ndarray) – array of approximation errors with one bin
- Returns
result – calculated sse_one_bin[i][k]
- Return type
float
- compute_f(series: numpy.ndarray, k: int, p: numpy.ndarray, pp: numpy.ndarray) → Tuple[numpy.ndarray, list]
Compute F, where F[a][b][k] is the minimum approximation error on series[a:b+1] with k outliers.
- Parameters
series (numpy.ndarray) – array to count F
k (int) – number of outliers
p (numpy.ndarray) – array of sums of elements, p[i] - sum from 0th to i elements
pp (numpy.ndarray) – array of sums of squares of elements, pp[i] - sum of squares from 0th to i elements
- Returns
result – array F, outliers_indices
- Return type
Tuple[numpy.ndarray, list]
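The definition of F can be illustrated with a brute-force sketch. This is not the library's dynamic-programming implementation: it assumes each segment is approximated by a single bin (its mean) and simply tries every way of dropping k points. The names sse and min_error_with_outliers are hypothetical and exist only for this example.

```python
from itertools import combinations


def sse(values):
    """SSE of approximating `values` by their mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def min_error_with_outliers(series, a, b, k):
    """Brute-force F[a][b][k], assuming one bin per segment.

    Returns the smallest one-bin SSE on series[a:b+1] after dropping
    k points, together with the indices of the dropped points.
    """
    indices = range(a, b + 1)
    best_error, best_removed = float("inf"), ()
    for removed in combinations(indices, k):
        kept = [series[i] for i in indices if i not in removed]
        error = sse(kept)
        if error < best_error:
            best_error, best_removed = error, removed
    return best_error, list(best_removed)


series = [1.0, 1.0, 50.0, 1.0]
print(min_error_with_outliers(series, 0, 3, 1))  # (0.0, [2])
```

The brute force is exponential in k and only practical for tiny inputs; it is meant as a reference for what the value F[a][b][k] represents.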
- get_anomalies_hist(ts: TSDataset, in_column: str = 'target', bins_number: int = 10) → Dict[str, List[pandas._libs.tslibs.timestamps.Timestamp]]
Get point outliers in time series using histogram model.
Outliers are all points that, when removed, result in a histogram with a lower approximation error, even with the number of bins less than the number of outliers.
- Parameters
ts (TSDataset) – TSDataset with timeseries data
in_column (str) – name of the column in which to search for anomalies
bins_number (int) – number of bins
- Returns
dict of outliers in format {segment: [outliers_timestamps]}
- Return type
Dict[str, List[pandas._libs.tslibs.timestamps.Timestamp]]
- hist(series: numpy.ndarray, bins_number: int) → numpy.ndarray
Compute outliers indices according to hist rule.
- Parameters
series (numpy.ndarray) – array to count F
bins_number (int) – number of bins
- Returns
indices – outliers indices
- Return type
np.ndarray
- optimal_sse(left: int, right: int, p: numpy.ndarray, pp: numpy.ndarray) → float
Compute the approximation error of one bin covering the elements from left to right.
- Parameters
left (int) – left border
right (int) – right border
p (numpy.ndarray) – array of sums of elements, p[i] - sum from first to i elements
pp (numpy.ndarray) – array of sums of squares of elements, pp[i] - sum of squares from first to i elements
- Returns
result – approximation error
- Return type
float
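A one-bin approximation error can be read off prefix sums in constant time, since sum((x - mean)^2) = sum(x^2) - (sum(x))^2 / n. The following is a minimal sketch of that identity, not the library's code. It assumes 0-based, inclusive prefix sums (p[i] covers elements 0 through i) and an inclusive right border; prefix_sums and one_bin_sse are hypothetical names, and plain Python lists stand in for NumPy arrays.

```python
from itertools import accumulate


def prefix_sums(series):
    """Return (p, pp): inclusive prefix sums of the values and of their squares."""
    p = list(accumulate(series))
    pp = list(accumulate(x * x for x in series))
    return p, pp


def one_bin_sse(left, right, p, pp):
    """SSE of approximating series[left..right] (inclusive) by its mean."""
    # Subtract the prefix that ends just before `left` (nothing when left == 0).
    s = p[right] - (p[left - 1] if left > 0 else 0.0)
    ss = pp[right] - (pp[left - 1] if left > 0 else 0.0)
    n = right - left + 1
    # Clamp at zero to absorb tiny negative values caused by rounding.
    return max(ss - s * s / n, 0.0)


series = [1.0, 2.0, 3.0, 10.0]
p, pp = prefix_sums(series)
print(one_bin_sse(0, 2, p, pp))  # 2.0
print(one_bin_sse(0, 3, p, pp))  # 50.0
```

Building p and pp once costs O(n); every later segment query is O(1), which is what makes the histogram searches over many (left, right) pairs affordable.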
- v_optimal_hist(series: numpy.ndarray, bins_number: int, p: numpy.ndarray, pp: numpy.ndarray) → numpy.ndarray
Compute the approximation error of a series with [1, bins_number] bins.
- Parameters
series (numpy.ndarray) – array whose approximation error with bins_number bins is computed
bins_number (int) – number of bins
p (numpy.ndarray) – array of sums of elements, p[i] - sum from 0th to i elements
pp (numpy.ndarray) – array of sums of squares of elements, pp[i] - sum of squares from 0th to i elements
- Returns
error – approximation error of a series with [1, bins_number] bins
- Return type
np.ndarray
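The errors for 1 to bins_number bins can be obtained with the textbook V-optimal histogram dynamic program. The sketch below shows that general technique under stated assumptions and may differ from the library's implementation: v_optimal_errors is a hypothetical name, the prefix sums are built internally with a leading zero (so p[i] is the sum of the first i elements), and plain lists replace NumPy arrays.

```python
from itertools import accumulate


def v_optimal_errors(series, bins_number):
    """Return a list where item k - 1 is the minimal SSE using exactly k bins."""
    n = len(series)
    # Exclusive prefix sums: p[i] = sum(series[:i]), pp[i] = sum of squares.
    p = [0.0] + list(accumulate(series))
    pp = [0.0] + list(accumulate(x * x for x in series))

    def sse(a, b):
        """SSE of series[a:b] approximated by its mean."""
        s = p[b] - p[a]
        return max(pp[b] - pp[a] - s * s / (b - a), 0.0)

    inf = float("inf")
    # best[k][i]: minimal SSE of the first i points split into exactly k bins.
    best = [[inf] * (n + 1) for _ in range(bins_number + 1)]
    best[0][0] = 0.0
    for k in range(1, bins_number + 1):
        for i in range(k, n + 1):
            # The last bin covers series[j:i]; the first j points use k - 1 bins.
            best[k][i] = min(best[k - 1][j] + sse(j, i) for j in range(k - 1, i))
    return [best[k][n] for k in range(1, bins_number + 1)]


print(v_optimal_errors([1.0, 1.0, 5.0, 5.0], 2))  # [16.0, 0.0]
```

This version runs in O(bins_number * n^2) time. If bins_number exceeds the series length, the corresponding entries stay at infinity because no such split exists.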