akkudoktoreos.core.dataabc.DataImportMixin
- class akkudoktoreos.core.dataabc.DataImportMixin
Bases: object
Mixin class for import of generic data.
This class is designed to handle generic data provided in the form of a key-value dictionary.
- Keys: Identifiers that correspond to the record keys of the data.
- Values: Lists of data values starting at a specified start_datetime, where each value corresponds to a subsequent time interval (e.g., hourly).
Two special keys are handled: start_datetime may be used to define the starting datetime of the values, and interval may be used to define the fixed time interval between two values.
On import, self.update_value(datetime, key, value) is called for each value; the class using this mixin has to provide that method. self.ems_start_datetime may also be necessary as a default in case start_datetime is not given.
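The contract described above can be illustrated with a minimal, self-contained sketch. It does not use akkudoktoreos: `SketchStore` and `sketch_import` are hypothetical stand-ins, and the sketch assumes that start_datetime and interval are already `datetime` and `timedelta` objects (parsing of string values is not shown).

```python
from datetime import datetime, timedelta


class SketchStore:
    """Hypothetical stand-in for a class that provides update_value()."""

    def __init__(self):
        self.values = {}

    def update_value(self, date, key, value):
        # Store one value under its timestamp and record key.
        self.values[(date, key)] = value


def sketch_import(store, import_data, default_start):
    """Expand each value list into one update_value() call per time step."""
    data = dict(import_data)  # copy, so the caller's dict is not modified
    start = data.pop("start_datetime", default_start)
    interval = data.pop("interval", timedelta(hours=1))
    for key, series in data.items():
        for step, value in enumerate(series):
            store.update_value(start + step * interval, key, value)


store = SketchStore()
sketch_import(
    store,
    {
        "start_datetime": datetime(2024, 11, 10, 0, 0),
        "interval": timedelta(minutes=30),
        "loadforecast_power_w": [20.5, 21.0, 22.1],
    },
    default_start=datetime(2024, 1, 1),
)
```

Each value ends up at start_datetime plus a whole number of intervals, which is the behavior the two special keys control.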
- __init__()
Methods
__init__()
import_datetimes(start_datetime, value_count) – Generates a list of tuples containing timestamps and their corresponding value indices.
import_from_dataframe(df[, key_prefix, ...]) – Updates generic data by importing it from a pandas DataFrame.
import_from_dict(import_data[, key_prefix, ...]) – Updates generic data by importing it from a dictionary.
import_from_file(import_file_path[, ...]) – Updates generic data by importing it from a file.
import_from_json(json_str[, key_prefix, ...]) – Updates generic data by importing it from a JSON string.
- import_datetimes(start_datetime: DateTime, value_count: int, interval: Duration | None = None) → List[Tuple[DateTime, int]]
Generates a list of tuples containing timestamps and their corresponding value indices.
The function accounts for daylight saving time (DST) transitions:
- During a spring forward transition (e.g., DST begins), skipped hours are omitted.
- During a fall back transition (e.g., DST ends), repeated hours are included, but they share the same value index.
- Parameters:
start_datetime (DateTime) – Start datetime of values
value_count (int) – The number of timestamps to generate.
interval (Duration, optional) – The fixed time interval. Defaults to 1 hour.
- Returns:
A list of tuples, where each tuple contains:
- A DateTime object representing an hourly step from start_datetime.
- An integer value index corresponding to the logical hour.
- Return type:
List[Tuple[DateTime, int]]
- Behavior:
Skips invalid timestamps during DST spring forward transitions.
Includes both instances of repeated timestamps during DST fall back transitions.
Ensures the list contains exactly value_count entries.
Example
>>> start_datetime = pendulum.datetime(2024, 11, 3, 0, 0, tz="America/New_York")
>>> import_datetimes(start_datetime, 5)
[(DateTime(2024, 11, 3, 0, 0, tzinfo=Timezone('America/New_York')), 0),
 (DateTime(2024, 11, 3, 1, 0, tzinfo=Timezone('America/New_York')), 1),
 (DateTime(2024, 11, 3, 1, 0, tzinfo=Timezone('America/New_York')), 1),  # Repeated hour
 (DateTime(2024, 11, 3, 2, 0, tzinfo=Timezone('America/New_York')), 2),
 (DateTime(2024, 11, 3, 3, 0, tzinfo=Timezone('America/New_York')), 3)]
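The DST behavior documented above can be reproduced with the standard library alone. The following is a sketch under stated assumptions, not the library's implementation: `sketch_import_datetimes` is a hypothetical name, it uses `zoneinfo` instead of pendulum, and it assumes the value index is the wall-clock distance from the start divided by the interval.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def sketch_import_datetimes(start, value_count, interval=timedelta(hours=1)):
    """Return (local timestamp, value index) pairs, stepping in absolute time."""
    start_utc = start.astimezone(timezone.utc)
    start_wall = start.replace(tzinfo=None)  # naive wall-clock start
    pairs = []
    for step in range(value_count):
        # Stepping in UTC never lands on a non-existent local time and
        # visits a repeated local hour twice.
        local = (start_utc + step * interval).astimezone(start.tzinfo)
        # Assumption: the index follows the wall clock, so a repeated hour
        # shares an index and a skipped hour leaves a gap in the indices.
        index = (local.replace(tzinfo=None) - start_wall) // interval
        pairs.append((local, index))
    return pairs


new_york = ZoneInfo("America/New_York")
fall_back = sketch_import_datetimes(datetime(2024, 11, 3, 0, 0, tzinfo=new_york), 5)
spring_forward = sketch_import_datetimes(datetime(2024, 3, 10, 0, 0, tzinfo=new_york), 4)

fall_indices = [index for _, index in fall_back]         # [0, 1, 1, 2, 3]
spring_indices = [index for _, index in spring_forward]  # [0, 1, 3, 4]
```

For the fall back date the sketch yields the same hours and indices as the documented example. For the spring forward date the non-existent 02:00 hour never appears, so index 2 is never produced.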
- import_from_dict(import_data: dict, key_prefix: str = '', start_datetime: DateTime | None = None, interval: Duration | None = None) → None
Updates generic data by importing it from a dictionary.
This method reads generic data from a dictionary, matches keys based on the record keys and the provided key_prefix, and updates the data values sequentially. All value lists must have the same length.
- Parameters:
import_data (dict) – Dictionary containing the generic data with optional ‘start_datetime’ and ‘interval’ keys.
key_prefix (str, optional) – A prefix to filter relevant keys from the generic data. Only keys starting with this prefix will be considered. Defaults to an empty string.
start_datetime (DateTime, optional) – Start datetime of values if not in dict.
interval (Duration, optional) – The fixed time interval if not in dict.
- Raises:
ValueError – If value lists have different lengths or if datetime conversion fails.
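The key matching and the length check described for import_from_dict can be sketched as follows. `select_value_lists` and its `record_keys` parameter are hypothetical names introduced for illustration; the real method may organize this differently.

```python
def select_value_lists(import_data, key_prefix="", record_keys=None):
    """Pick the value lists to import and check that they have equal length."""
    reserved = {"start_datetime", "interval"}  # the two special keys
    selected = {}
    for key, values in import_data.items():
        if key in reserved or not key.startswith(key_prefix):
            continue
        # Assumption: keys unknown to the record are skipped.
        if record_keys is not None and key not in record_keys:
            continue
        selected[key] = values
    lengths = {len(values) for values in selected.values()}
    if len(lengths) > 1:
        raise ValueError(f"Value lists differ in length: {sorted(lengths)}")
    return selected


import_data = {
    "start_datetime": "2024-11-10 00:00:00",
    "loadforecast_power_w": [20.5, 21.0, 22.1],
    "other_xyz": [10.5, 11.0, 12.1],
}
selected = select_value_lists(import_data, key_prefix="load")

try:
    select_value_lists({"a": [1, 2], "b": [1, 2, 3]})
    uneven_rejected = False
except ValueError:
    uneven_rejected = True
```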
- import_from_dataframe(df: DataFrame, key_prefix: str = '', start_datetime: DateTime | None = None, interval: Duration | None = None) → None
Updates generic data by importing it from a pandas DataFrame.
This method reads generic data from a DataFrame, matches columns based on the record keys and the provided key_prefix, and updates the data values using the DataFrame’s index as timestamps.
- Parameters:
df (pd.DataFrame) – DataFrame containing the generic data with datetime index or sequential values.
key_prefix (str, optional) – A prefix to filter relevant columns from the DataFrame. Only columns starting with this prefix will be considered. Defaults to an empty string.
start_datetime (DateTime, optional) – Start datetime if DataFrame doesn’t have datetime index.
interval (Duration, optional) – The fixed time interval if DataFrame doesn’t have datetime index.
- Raises:
ValueError – If DataFrame structure is invalid or datetime conversion fails.
- import_from_json(json_str: str, key_prefix: str = '', start_datetime: DateTime | None = None, interval: Duration | None = None) → None
Updates generic data by importing it from a JSON string.
This method reads generic data from a JSON string, matches keys based on the record keys and the provided key_prefix, and updates the data values sequentially, starting from the start_datetime.
If start_datetime and/or interval is given in the JSON dict, it will be used. Otherwise the given parameters are used. If None is given, start_datetime defaults to 'self.ems_start_datetime' and interval defaults to 1 hour.
- Parameters:
json_str (str) – The JSON string containing the generic data.
key_prefix (str, optional) – A prefix to filter relevant keys from the generic data. Only keys starting with this prefix will be considered. Defaults to an empty string.
start_datetime (DateTime, optional) – Start datetime of values.
interval (Duration, optional) – The fixed time interval. Defaults to 1 hour.
- Raises:
JSONDecodeError – If the string content is not valid JSON.
Example
Given a JSON string with the following content:
```json
{
    "start_datetime": "2024-11-10 00:00:00",
    "interval": "30 minutes",
    "loadforecast_power_w": [20.5, 21.0, 22.1],
    "other_xyz": [10.5, 11.0, 12.1]
}
```
and key_prefix = "load", only the "loadforecast_power_w" key will be processed even though both keys are in the record.
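The example can be checked with the standard `json` module. This sketch only parses the string and applies the prefix filter; it does not call import_from_json, and the variable names are illustrative. It also shows that a trailing comma, which is not valid JSON, raises `json.JSONDecodeError`.

```python
import json

json_str = """{
    "start_datetime": "2024-11-10 00:00:00",
    "interval": "30 minutes",
    "loadforecast_power_w": [20.5, 21.0, 22.1],
    "other_xyz": [10.5, 11.0, 12.1]
}"""

payload = json.loads(json_str)

# Keep only data keys that start with the prefix; the two special keys
# describe the time axis and are not data series.
key_prefix = "load"
special_keys = {"start_datetime", "interval"}
selected = {
    key: values
    for key, values in payload.items()
    if key not in special_keys and key.startswith(key_prefix)
}

# A trailing comma before the closing brace is rejected by the parser.
try:
    json.loads('{"loadforecast_power_w": [20.5, 21.0, 22.1],}')
    trailing_comma_accepted = True
except json.JSONDecodeError:
    trailing_comma_accepted = False
```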
- import_from_file(import_file_path: Path, key_prefix: str = '', start_datetime: DateTime | None = None, interval: Duration | None = None) → None
Updates generic data by importing it from a file.
This method reads generic data from a JSON file, matches keys based on the record keys and the provided key_prefix, and updates the data values sequentially, starting from the start_datetime. Unless an interval is given, each data value is associated with an hourly interval.
If start_datetime and/or interval is given in the JSON dict, it will be used. Otherwise the given parameters are used. If None is given, start_datetime defaults to 'self.ems_start_datetime' and interval defaults to 1 hour.
- Parameters:
import_file_path (Path) – The path to the JSON file containing the generic data.
key_prefix (str, optional) – A prefix to filter relevant keys from the generic data. Only keys starting with this prefix will be considered. Defaults to an empty string.
start_datetime (DateTime, optional) – Start datetime of values.
interval (Duration, optional) – The fixed time interval. Defaults to 1 hour.
- Raises:
FileNotFoundError – If the specified file does not exist.
JSONDecodeError – If the file content is not valid JSON.
Example
Given a JSON file with the following content:
```json
{
    "loadforecast_power_w": [20.5, 21.0, 22.1],
    "other_xyz": [10.5, 11.0, 12.1]
}
```
and key_prefix = "load", only the "loadforecast_power_w" key will be processed even though both keys are in the record.
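Because the file in this example contains neither start_datetime nor interval, the documented fallbacks apply. The precedence (value in the JSON dict, then the parameter, then the default) can be sketched as below. `read_import_file` and `fallback_start` are hypothetical names; `fallback_start` stands in for self.ems_start_datetime, and values read from the file are returned unparsed.

```python
import json
import tempfile
from datetime import datetime, timedelta
from pathlib import Path


def read_import_file(path, start_datetime=None, interval=None, fallback_start=None):
    """Load a JSON file and resolve the start datetime and interval to use."""
    # Raises FileNotFoundError for a missing file, JSONDecodeError for bad JSON.
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    # Precedence: value in the file, then the parameter, then the default.
    start = data.pop("start_datetime", None) or start_datetime or fallback_start
    step = data.pop("interval", None) or interval or timedelta(hours=1)
    return start, step, data


with tempfile.TemporaryDirectory() as tmp_dir:
    import_file_path = Path(tmp_dir) / "import.json"
    import_file_path.write_text(
        json.dumps({"loadforecast_power_w": [20.5, 21.0, 22.1],
                    "other_xyz": [10.5, 11.0, 12.1]}),
        encoding="utf-8",
    )
    start, step, data = read_import_file(
        import_file_path, fallback_start=datetime(2024, 11, 10, 0, 0)
    )

    try:
        read_import_file(Path(tmp_dir) / "missing.json")
        missing_file_rejected = False
    except FileNotFoundError:
        missing_file_rejected = True
```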