density_outliers
Functions
Calculate distance for the get_anomalies_density() function by taking the absolute value of the difference.
Compute outliers according to density rule.
Get indices of outliers for one series.
- absolute_difference_distance(x: float, y: float) → float
Calculate distance for the get_anomalies_density() function by taking the absolute value of the difference.
- Parameters
x (float) – first value
y (float) – second value
- Returns
result – absolute difference between values
- Return type
float
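As a minimal sketch of the distance described above, the stand-in below computes the absolute value of the difference. The names `abs_diff_sketch` and `squared_diff_sketch` are hypothetical and are not part of the library; the second one only illustrates another function with the same `Callable[[float, float], float]` shape that could be passed as `distance_func`.

```python
from typing import Callable


def abs_diff_sketch(x: float, y: float) -> float:
    """Hypothetical stand-in: absolute difference between two values."""
    return abs(x - y)


def squared_diff_sketch(x: float, y: float) -> float:
    """Hypothetical alternative distance with the same signature."""
    return (x - y) ** 2


# Both functions fit the Callable[[float, float], float] type.
distance: Callable[[float, float], float] = abs_diff_sketch
print(distance(3.0, 5.5))  # prints 2.5
```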
- get_anomalies_density(ts: TSDataset, in_column: str = 'target', window_size: int = 15, distance_coef: float = 3, n_neighbors: int = 3, distance_func: typing.Callable[[float, float], float] = absolute_difference_distance) → Dict[str, List[pandas._libs.tslibs.timestamps.Timestamp]]
Compute outliers according to density rule.
For each element in the series, build all windows of size window_size that contain this point. If any of these windows contains at least n_neighbors points that are closer than distance_coef * std(series) to the target point according to distance_func, the target point is not an outlier.
- Parameters
ts (TSDataset) – TSDataset with timeseries data
in_column (str) – name of the column in which to search for anomalies
window_size (int) – size of windows to build
distance_coef (float) – multiplier of the series standard deviation that gives the distance threshold below which two points count as close
n_neighbors (int) – minimum number of close neighbors a point needs in order not to be an outlier
distance_func (Callable[[float, float], float]) – distance function
- Returns
dict of outliers in format {segment: [outliers_timestamps]}
- Return type
Dict[str, List[pandas._libs.tslibs.timestamps.Timestamp]]
Notes
It is a variation of the distance-based (index) outlier detection method adapted for time series.
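The density rule described above can be sketched for a single series in plain Python. This is an illustration under stated assumptions, not the library's implementation: the function name `density_outlier_indices` is hypothetical, the population standard deviation is assumed for `std(series)`, windows are truncated when the series is shorter than `window_size`, and a constant series is assumed to have no outliers.

```python
from statistics import pstdev
from typing import Callable, List, Sequence


def abs_diff(x: float, y: float) -> float:
    """Stand-in distance: absolute value of the difference."""
    return abs(x - y)


def density_outlier_indices(
    series: Sequence[float],
    window_size: int = 15,
    distance_coef: float = 3,
    n_neighbors: int = 3,
    distance_func: Callable[[float, float], float] = abs_diff,
) -> List[int]:
    """Hypothetical sketch: positions of points that fail the density rule."""
    # Assumption: population standard deviation of the whole series.
    threshold = distance_coef * pstdev(series)
    if threshold == 0:
        # Assumption: a constant series has no outliers.
        return []
    n = len(series)
    outliers = []
    for i in range(n):
        is_outlier = True
        # Windows containing position i start in [i - window_size + 1, i],
        # clipped so that each window stays inside the series.
        first_start = max(0, i - window_size + 1)
        last_start = min(i, max(n - window_size, 0))
        for start in range(first_start, last_start + 1):
            stop = min(start + window_size, n)
            close = sum(
                1
                for j in range(start, stop)
                if j != i and distance_func(series[i], series[j]) < threshold
            )
            if close >= n_neighbors:
                # One dense enough window is sufficient.
                is_outlier = False
                break
        if is_outlier:
            outliers.append(i)
    return outliers


example = [1.0, 1.1, 0.9, 1.0, 10.0, 1.0, 1.05, 0.95, 1.0, 1.1]
print(density_outlier_indices(example, window_size=5, distance_coef=1, n_neighbors=2))  # prints [4]
```

The sketch returns integer positions; get_anomalies_density() instead reports timestamps per segment in the format {segment: [outliers_timestamps]}.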