datasets_generation
Functions
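| generate_ar_df | Create DataFrame with AR process data. |
| generate_const_df | Create DataFrame with const data. |
| generate_from_patterns_df | Create DataFrame from patterns. |
| generate_hierarchical_df | Create DataFrame with hierarchical structure and AR process data. |
| generate_periodic_df | Create DataFrame with periodic data. |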
- generate_ar_df(periods: int, start_time: str, ar_coef: Optional[list] = None, sigma: float = 1, n_segments: int = 1, freq: str = '1D', random_seed: int = 1) -> pandas.core.frame.DataFrame
Create DataFrame with AR process data.
- Parameters
periods (int) – number of timestamps
start_time (str) – start timestamp
ar_coef (Optional[list]) – AR coefficients
sigma (float) – scale of AR noise
n_segments (int) – number of segments
freq (str) – pandas frequency string for pandas.date_range() that is used to generate timestamps
random_seed (int) – random seed
- Return type
pandas.core.frame.DataFrame
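To make the role of ar_coef and sigma more concrete, here is a minimal pure-Python sketch of an AR process. It is not the library's implementation: the helper name simulate_ar, the way the coefficients are applied (as weights on the previous values), and the zero initial history are all assumptions made for illustration.

```python
import random


def simulate_ar(periods, ar_coef, sigma=1.0, random_seed=1):
    """Illustrative AR(p) simulation (hypothetical helper, not the library code)."""
    rng = random.Random(random_seed)
    p = len(ar_coef)
    values = []
    for t in range(periods):
        # Weighted sum of up to p previous values; missing history counts as 0.
        prev = sum(
            ar_coef[k] * values[t - 1 - k] for k in range(p) if t - 1 - k >= 0
        )
        # Add Gaussian noise whose scale is controlled by sigma.
        values.append(prev + rng.gauss(0.0, sigma))
    return values


series = simulate_ar(periods=5, ar_coef=[0.5], sigma=1.0, random_seed=1)
```

Under these assumptions, a fixed random_seed yields the same series on every call, and a larger sigma produces a noisier series.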
- generate_const_df(periods: int, start_time: str, scale: float, n_segments: int = 1, freq: str = '1D', add_noise: bool = False, sigma: float = 1, random_seed: int = 1) -> pandas.core.frame.DataFrame
Create DataFrame with const data.
- Parameters
periods (int) – number of timestamps
start_time (str) – start timestamp
scale (float) – const value to fill
n_segments (int) – number of segments
freq (str) – pandas frequency string for pandas.date_range() that is used to generate timestamps
add_noise (bool) – if True, noise is added to the final samples
sigma (float) – scale of added noise
random_seed (int) – random seed
- Return type
pandas.core.frame.DataFrame
- generate_from_patterns_df(periods: int, start_time: str, patterns: List[List[float]], freq: str = '1D', add_noise=False, sigma: float = 1, random_seed: int = 1) -> pandas.core.frame.DataFrame
Create DataFrame from patterns.
- Parameters
periods (int) – number of timestamps
start_time (str) – start timestamp
patterns (List[List[float]]) – list of lists with patterns to be repeated
freq (str) – pandas frequency string for pandas.date_range() that is used to generate timestamps
add_noise – if True, noise is added to the final samples
sigma (float) – scale of added noise
random_seed (int) – random seed
- Return type
pandas.core.frame.DataFrame
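The idea of "patterns to be repeated" can be sketched in plain Python as follows. The helper repeat_pattern and the segment_0, segment_1 naming are hypothetical and only illustrate one plausible reading, namely that each inner list is cycled until periods values are produced and yields one series.

```python
def repeat_pattern(pattern, periods):
    """Cycle through `pattern` until `periods` values are produced."""
    return [pattern[i % len(pattern)] for i in range(periods)]


patterns = [[0.0, 1.0], [5.0, 6.0, 7.0]]

# Assumption: one series per pattern; the key names are illustrative only.
segments = {
    f"segment_{i}": repeat_pattern(pattern, periods=5)
    for i, pattern in enumerate(patterns)
}
```

Note that a pattern whose length does not divide periods is simply cut off partway through its last repetition in this sketch.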
- generate_hierarchical_df(periods: int, n_segments: List[int], freq: str = 'D', start_time: str = '2000-01-01', ar_coef: Optional[list] = None, sigma: float = 1, random_seed: int = 1) -> pandas.core.frame.DataFrame
Create DataFrame with hierarchical structure and AR process data.
- The hierarchical structure is generated as follows:
The number of levels in the structure is the same as the length of the n_segments parameter.
Each level contains the number of segments set in n_segments.
Connections from parent to child level are generated randomly.
- Parameters
periods (int) – number of timestamps
n_segments (List[int]) – number of segments on each level.
freq (str) – pandas frequency string for pandas.date_range() that is used to generate timestamps
start_time (str) – start timestamp
ar_coef (Optional[list]) – AR coefficients
sigma (float) – scale of AR noise
random_seed (int) – random seed
- Returns
DataFrame at the bottom level of the hierarchy
- Raises
ValueError – n_segments is empty
ValueError – n_segments contains non-positive integers
ValueError – n_segments is not a non-decreasing sequence
- Return type
pandas.core.frame.DataFrame
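The validation rules and the random parent-to-child linking described for generate_hierarchical_df can be sketched as below. This is not the library's code: build_random_hierarchy is a hypothetical helper, the error messages are invented, and the guarantee that every parent receives at least one child is an assumption that the reference does not state.

```python
import random


def build_random_hierarchy(n_segments, random_seed=1):
    """Validate level sizes and randomly link children to parents (hypothetical)."""
    if len(n_segments) == 0:
        raise ValueError("`n_segments` must not be empty")
    if any(not isinstance(n, int) or n <= 0 for n in n_segments):
        raise ValueError("`n_segments` must contain positive integers")
    if any(a > b for a, b in zip(n_segments, n_segments[1:])):
        raise ValueError("`n_segments` must be a non-decreasing sequence")

    rng = random.Random(random_seed)
    # links[k][child] is the parent index on level k for a child on level k + 1.
    links = []
    for n_parents, n_children in zip(n_segments, n_segments[1:]):
        # Assumption: each parent gets one child first, the rest are drawn at random.
        parents = list(range(n_parents))
        parents += [rng.randrange(n_parents) for _ in range(n_children - n_parents)]
        rng.shuffle(parents)
        links.append(parents)
    return links


links = build_random_hierarchy([2, 3, 6])
```

The non-decreasing requirement is what makes this linking feasible in the sketch: a level with fewer segments than its parent level could not give every parent a child.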
- generate_periodic_df(periods: int, start_time: str, scale: float = 10, period: int = 1, n_segments: int = 1, freq: str = '1D', add_noise: bool = False, sigma: float = 1, random_seed: int = 1) -> pandas.core.frame.DataFrame
Create DataFrame with periodic data.
- Parameters
periods (int) – number of timestamps
start_time (str) – start timestamp
scale (float) – we sample data from Uniform[0, scale)
period (int) – period of the data, so that x[i+period] = x[i]
n_segments (int) – number of segments
freq (str) – pandas frequency string for pandas.date_range() that is used to generate timestamps
add_noise (bool) – if True, noise is added to the final samples
sigma (float) – scale of added noise
random_seed (int) – random seed
- Return type
pandas.core.frame.DataFrame
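The relation x[i+period] = x[i] together with sampling from Uniform[0, scale) can be illustrated with a short pure-Python sketch. The helper simulate_periodic is hypothetical and omits the optional noise; it assumes that one cycle of period values is sampled once and then repeated.

```python
import random


def simulate_periodic(periods, scale=10.0, period=1, random_seed=1):
    """Illustrative periodic series (hypothetical helper, noise omitted)."""
    rng = random.Random(random_seed)
    # One cycle sampled from Uniform[0, scale); random() returns values in [0.0, 1.0).
    cycle = [rng.random() * scale for _ in range(period)]
    # Repeat the cycle so that x[i + period] == x[i] for every valid i.
    return [cycle[i % period] for i in range(periods)]


x = simulate_periodic(periods=10, scale=10.0, period=3)
```

With period=1, this sketch degenerates to a constant series.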