shrecc.database

Attributes

UNUSED_SOURCE

Functions

apply_cutoff(df_filt, cutoff, include_cutoff)

Apply a cutoff value to filter out smaller values in the dataframe and optionally include a "rest" category.

apply_mapping(Z_cons_to_multiply, el_map_all_norm)

Apply the technology mapping to the consumption data.

create_activity_dict(dataframe_filt, known_inputs, ...)

Creates a dictionary of activities for the BW database based on the filtered dataframe and known inputs.

create_database(dataframe_filt, project_name, db_name, ...)

Creates an "ecoinvent-like" BW database based on a previously filtered dataframe.

filt_cutoff(countries[, times, general_range, ...])

Filters data based on selected countries and times (either one-off, a range, or periodical range).

filter_by_countries(dataframe, countries)

Filter the dataframe by selected countries.

filter_by_range(dataframe, general_range, ...)

Filter the dataframe by a general time range and optionally by a refined time range.

filter_by_times(dataframe, times)

Filter the dataframe by specific times.

get_network_activities(eidb_name)

load_mapping_data(mapping_location)

Load the mapping data from an Excel file.

load_time_series_data(path_to_data, year)

Load the time series data from a pickle file and format it as a DataFrame.

map_known_inputs(eidb_name, dataframe_filt)

Maps known inputs from the ecoinvent database to the filtered dataframe.

prepare_consumption_data(Z_cons)

Prepare the consumption data by removing trade data and adjusting indices.

setup_database(project_name, db_name)

Sets up the BW2 database for the given project.

tech_mapping(year, path_to_data[, path_to_mapping])

Main function to map the technologies and scale them to 1 kWh.

Module Contents

shrecc.database.apply_cutoff(df_filt, cutoff, include_cutoff)[source]

Apply a cutoff value to filter out smaller values in the dataframe and optionally include a “rest” category.

Parameters:
  • df_filt (pd.DataFrame) – The filtered dataframe.

  • cutoff (float) – The cutoff value for technology values.

  • include_cutoff (bool) – If True, sums values below cutoff and includes them as a new technology “The rest”.

Returns:

A dataframe with values below the cutoff set to zero, optionally including a “rest” category.

Return type:

pd.DataFrame
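
The cutoff behaviour described above can be illustrated with plain pandas. This is a minimal sketch, not the shrecc implementation: the helper name `cutoff_sketch` is hypothetical, and it assumes rows are time steps and columns are technologies, which may differ from the actual layout.

```python
import pandas as pd


def cutoff_sketch(df: pd.DataFrame, cutoff: float, include_rest: bool) -> pd.DataFrame:
    """Zero out entries below `cutoff`; optionally collect them in 'The rest'."""
    below = df < cutoff
    # Entries below the cutoff are replaced by zero, all others are kept.
    result = df.mask(below, 0.0)
    if include_rest:
        # Row-wise sum of everything that was zeroed out (assumed layout).
        result["The rest"] = df.where(below, 0.0).sum(axis=1)
    return result


shares = pd.DataFrame(
    {"wind": [0.6, 0.5], "solar": [0.3995, 0.4], "oil": [0.0005, 0.1]},
    index=["t0", "t1"],
)
trimmed = cutoff_sketch(shares, cutoff=0.001, include_rest=True)
```

With the extra column in place, each row keeps its original total, so the mix still sums to the same amount after small contributions are dropped.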

shrecc.database.apply_mapping(Z_cons_to_multiply, el_map_all_norm)[source]

Apply the technology mapping to the consumption data.

Parameters:
  • Z_cons_to_multiply (pd.DataFrame) – The consumption data to be mapped.

  • el_map_all_norm (pd.DataFrame) – The normalized mapping data.

Returns:

The resulting DataFrame after applying the technology mapping.

Return type:

pd.DataFrame
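
One plausible reading of "applying the mapping", suggested by the argument name Z_cons_to_multiply, is a matrix product between the consumption data and the normalized mapping. The sketch below assumes exactly that; the labels and the orientation of both frames are invented for illustration and are not taken from shrecc.

```python
import pandas as pd

# Assumed layout: rows are time steps, columns are source categories.
consumption = pd.DataFrame(
    {"Solar": [2.0, 0.0], "Wind": [1.0, 4.0]},
    index=["t0", "t1"],
)

# Assumed layout: rows are source categories, columns are technologies,
# and every row sums to 1 (a normalized mapping).
mapping = pd.DataFrame(
    {"pv_rooftop": [0.25, 0.0], "pv_ground": [0.75, 0.0], "wind_onshore": [0.0, 1.0]},
    index=["Solar", "Wind"],
)

# DataFrame.dot aligns the columns of the left frame with the index of the right one.
mapped = consumption.dot(mapping)
```

Because each mapping row sums to 1, the total per time step is unchanged by the product.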

shrecc.database.create_activity_dict(dataframe_filt, known_inputs, known_inputs_network, db_name)[source]

Creates a dictionary of activities for the BW database based on the filtered dataframe and known inputs.

Parameters:
  • dataframe_filt (pd.DataFrame) – The filtered dataframe containing technology data.

  • known_inputs (dict) – A dictionary mapping known inputs to ecoinvent database entries.

  • known_inputs_network (dict) – A dictionary mapping known network inputs to ecoinvent database entries.

  • db_name (str) – The name of the BW database.

Returns:

A dictionary containing activities to be written to the BW2 database.

Return type:

dict
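
For orientation, Brightway databases are commonly written from a dictionary keyed by (database name, code) tuples, with each value describing one activity and its exchanges. The sketch below shows that general shape only. All names, codes and amounts are hypothetical, and the exact keys produced by create_activity_dict may differ.

```python
db_name = "example_electricity_db"  # hypothetical database name
eidb_name = "ecoinvent"             # hypothetical ecoinvent database name

activity_key = (db_name, "FR_2023-06-16_08")  # hypothetical activity code

activities = {
    activity_key: {
        "name": "electricity mix, FR, 2023-06-16 08:00",
        "unit": "kilowatt hour",
        "location": "FR",
        "exchanges": [
            # The activity produces 1 kWh of its own reference product.
            {"input": activity_key, "amount": 1.0, "type": "production"},
            # Inputs from the background database, given as shares of 1 kWh.
            {"input": (eidb_name, "hypothetical-wind-code"), "amount": 0.6, "type": "technosphere"},
            {"input": (eidb_name, "hypothetical-solar-code"), "amount": 0.4, "type": "technosphere"},
        ],
    }
}

technosphere_total = sum(
    exc["amount"]
    for exc in activities[activity_key]["exchanges"]
    if exc["type"] == "technosphere"
)
```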

shrecc.database.create_database(dataframe_filt, project_name, db_name, eidb_name, network='True')[source]

Creates an “ecoinvent-like” BW database based on a previously filtered dataframe.

Parameters:
  • dataframe_filt (pd.DataFrame) – Scaled and filtered dataframe.

  • project_name (str) – BW project name to which the database will be saved.

  • db_name (str) – Name of the BW database to be created.

  • eidb_name (str) – Name of the ecoinvent database. Must be the same as in the BW project.

  • network (bool) – If True, network activities will be considered. The default shown in the signature is the string 'True' rather than the boolean True.

Returns:

None

shrecc.database.filt_cutoff(countries, times=[], general_range=0, refined_range=0, freq=0, cutoff=0.001, include_cutoff=True, path_to_data=None)[source]

Filters data based on selected countries and times (either one-off, a range, or periodical range).

Parameters:
  • year (int) – Selected year of the downloaded data.

  • countries (list of str) – Countries selected by the user for their database, e.g. countries=['FR', 'DE'].

  • times (list of str) – One or more specific timestamps, e.g. times = ['2023-06-16 8:00:00', '2023-06-16 22:00:00']. Can be applied alone.

  • general_range (list of str) – Start and end of a general time range, e.g. for the month of June general_range = ['2023-06-01 01:00:00', '2023-06-30 23:00:00']. Can be applied alone.

  • refined_range (list of int) – Hours used to refine general_range, e.g. the mornings of June (with June selected in general_range): refined_range = [8, 9, 10, 11]. Can only be applied together with general_range.

  • freq (str) – Days to be included, e.g. freq='D' selects calendar days; see https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases.

  • cutoff (float) – Cutoff value for technology values.

  • include_cutoff (bool) – If True, the cutoff is applied and the values below it are summed into a new technology “The rest”. If False, the cutoff is applied but the new technology is not created.

  • path_to_data (str or Path) – Location of the data. If None, the data is taken from within the package.

Returns:

The filtered dataframe.

Return type:

pd.DataFrame
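
The three ways of selecting time steps can be written out as keyword arguments, reusing the example values from the parameter list above. This is a usage sketch only: the helper `run_selection` is hypothetical, it needs shrecc and its data to be available, and it has not been checked here that every combination is accepted as written.

```python
# One-off timestamps.
by_times = dict(
    countries=["FR", "DE"],
    times=["2023-06-16 8:00:00", "2023-06-16 22:00:00"],
)

# A general range on its own.
by_range = dict(
    countries=["FR", "DE"],
    general_range=["2023-06-01 01:00:00", "2023-06-30 23:00:00"],
)

# The same range, refined to morning hours on calendar days.
by_refined_range = dict(by_range, refined_range=[8, 9, 10, 11], freq="D")


def run_selection(selection):
    """Call filt_cutoff with one of the selections above."""
    # Imported lazily so that this sketch can be loaded without shrecc installed.
    from shrecc.database import filt_cutoff

    return filt_cutoff(cutoff=0.001, include_cutoff=True, **selection)
```

For example, `run_selection(by_refined_range)` would request the June mornings for France and Germany.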

shrecc.database.filter_by_countries(dataframe, countries)[source]

Filter the dataframe by selected countries.

Parameters:
  • dataframe (pd.DataFrame) – The original dataframe containing data for multiple countries.

  • countries (list of str) – A list of country codes to filter by.

Returns:

A dataframe filtered by the specified countries.

Return type:

pd.DataFrame

shrecc.database.filter_by_range(dataframe, general_range, refined_range, freq)[source]

Filter the dataframe by a general time range and optionally by a refined time range.

Parameters:
  • dataframe (pd.DataFrame) – The original dataframe containing data.

  • general_range (list of str) – The start and end of the general range to filter by (e.g., ['2023-06-01', '2023-06-30']).

  • refined_range (list of int) – A list specifying the refined range of hours to filter within the general range.

  • freq (str) – The frequency for generating timestamps (e.g., 'D' for daily).

Returns:

A dataframe filtered by the specified time range and refined range.

Return type:

pd.DataFrame
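
The combination of a general range and a refined range of hours can be sketched with a datetime-indexed frame. The helper `range_sketch` is hypothetical and assumes the time steps sit in the row index. The freq argument is left out because its exact role is not described here beyond generating timestamps.

```python
import pandas as pd

# Three days of hourly data (assumed layout: timestamps in the row index).
index = pd.date_range("2023-06-01 00:00:00", periods=72, freq=pd.Timedelta(hours=1))
hourly = pd.DataFrame({"wind": list(range(72))}, index=index)


def range_sketch(df, general_range, refined_range=None):
    """Keep rows inside general_range, optionally only at the given hours."""
    start, end = pd.to_datetime(general_range)
    selected = df[(df.index >= start) & (df.index <= end)]
    if refined_range:
        # Keep only the hours of the day listed in refined_range.
        selected = selected[selected.index.hour.isin(refined_range)]
    return selected


mornings = range_sketch(
    hourly,
    general_range=["2023-06-01 00:00:00", "2023-06-02 23:00:00"],
    refined_range=[8, 9, 10, 11],
)
```

Here `mornings` holds eight rows: four morning hours on each of the two days inside the general range.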

shrecc.database.filter_by_times(dataframe, times)[source]

Filter the dataframe by specific times.

Parameters:
  • dataframe (pd.DataFrame) – The original dataframe containing data for multiple times.

  • times (list of str) – A list of specific times to filter by.

Returns:

A dataframe filtered by the specified times.

Return type:

pd.DataFrame
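
Selecting individual timestamps can be sketched in the same spirit. The example assumes the timestamps are stored in the row index and is not taken from the shrecc source.

```python
import pandas as pd

# One day of hourly data (assumed layout: timestamps in the row index).
index = pd.date_range("2023-06-16 00:00:00", periods=24, freq=pd.Timedelta(hours=1))
hourly = pd.DataFrame({"wind": list(range(24))}, index=index)

times = ["2023-06-16 08:00:00", "2023-06-16 22:00:00"]

# Convert the strings to timestamps and keep only the matching rows.
picked = hourly[hourly.index.isin(pd.to_datetime(times))]
```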

shrecc.database.get_network_activities(eidb_name)[source]

shrecc.database.load_mapping_data(mapping_location)[source]

Load the mapping data from an Excel file. mapping_location can be either a full file path or a directory. If it is a directory, the file name is assumed to be el_map_all_norm.csv.

Parameters:

mapping_location (str or Path) – Full file name of the scaled technology mapping, or the directory that contains it.

Returns:

A DataFrame containing the technology mapping data from the Excel file.

Return type:

pd.DataFrame

shrecc.database.load_time_series_data(path_to_data, year)[source]

Load the time series data from a pickle file and format it as a DataFrame.

Parameters:
  • path_to_data (str or Path) – The path to the directory containing the time series data.

  • year (int) – The year corresponding to the time series data.

Returns:

A DataFrame containing the time series data, with levels reordered and sorted.

Return type:

pd.DataFrame
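
Reading a pickled frame and then reordering and sorting its levels can be sketched as follows. The file name, the level names and the placement of the MultiIndex on the columns are assumptions made for illustration; the actual data layout is not specified here.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical two-level columns: (source, country).
columns = pd.MultiIndex.from_tuples(
    [("wind", "FR"), ("solar", "DE"), ("wind", "DE")],
    names=["source", "country"],
)
raw = pd.DataFrame([[1.0, 2.0, 3.0]], columns=columns)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example_2023.pkl"  # hypothetical file name
    raw.to_pickle(path)
    loaded = pd.read_pickle(path)

# Put the country level first, then sort the columns lexicographically.
tidy = loaded.reorder_levels(["country", "source"], axis=1).sort_index(axis=1)
```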

shrecc.database.map_known_inputs(eidb_name, dataframe_filt)[source]

Maps known inputs from the ecoinvent database to the filtered dataframe.

Parameters:
  • eidb_name (str) – The name of the ecoinvent database in the BW project.

  • dataframe_filt (pd.DataFrame) – The filtered dataframe containing technology data.

Returns:

A dictionary mapping known inputs to their corresponding entries in the ecoinvent database.

Return type:

dict

shrecc.database.prepare_consumption_data(Z_cons)[source]

Prepare the consumption data by removing trade data and adjusting indices.

Parameters:

Z_cons (pd.DataFrame) – The original consumption data DataFrame.

Returns:

The prepared consumption data, with the trade data removed and indices swapped.

Return type:

pd.DataFrame

shrecc.database.setup_database(project_name, db_name)[source]

Sets up the BW2 database for the given project.

Parameters:
  • project_name (str) – The name of the BW project.

  • db_name (str) – The name of the BW database to set up.

Returns:

The newly registered BW2 database.

Return type:

bd.Database
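
In Brightway terms, setting up a database usually means selecting a project and registering a database object in it. The sketch below shows one way to do that with bw2data. The function name is hypothetical, and the step that replaces an existing database of the same name is an assumption rather than documented behaviour of setup_database.

```python
def setup_database_sketch(project_name: str, db_name: str):
    """Switch to a Brightway project and register a fresh, empty database."""
    # Imported lazily; requires Brightway (bw2data) to be installed.
    import bw2data as bd

    bd.projects.set_current(project_name)
    if db_name in bd.databases:
        # Assumption: an existing database with the same name is replaced.
        del bd.databases[db_name]
    db = bd.Database(db_name)
    db.register()
    return db
```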

shrecc.database.tech_mapping(year, path_to_data, path_to_mapping=None)[source]

Main function to map the technologies and scale them to 1 kWh.

Parameters:
  • year (int) – The year corresponding to the data.

  • path_to_data (str or Path) – Root directory of the data.

  • path_to_mapping (str or Path) – File containing the scaled technology mapping. If None, the mapping shipped with the package is used.

Returns:

A DataFrame with the scaled technology mappings.

Return type:

pd.DataFrame
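
Scaling to 1 kWh can be read as normalizing each time step so that the technology contributions sum to one. The sketch below assumes that reading, with rows as time steps and columns as technologies; it is not the shrecc implementation.

```python
import pandas as pd

# Hypothetical absolute contributions per time step.
mix = pd.DataFrame(
    {"wind": [30.0, 10.0], "solar": [10.0, 0.0], "gas": [10.0, 30.0]},
    index=["t0", "t1"],
)

# Divide every row by its total so that each row sums to 1 (i.e. 1 kWh).
per_kwh = mix.div(mix.sum(axis=1), axis=0)
```

Each value in `per_kwh` is then the share of one technology in 1 kWh consumed at that time step.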

shrecc.database.UNUSED_SOURCE = 'Import balance (physical)'[source]