shrecc.database

Attributes

UNUSED_SOURCE

Functions

`apply_cutoff`(df_filt, cutoff, include_cutoff)	Apply a cutoff value to filter out smaller values in the dataframe and optionally include a "rest" category.
`apply_mapping`(Z_cons_to_multiply, el_map_all_norm)	Apply the technology mapping to the consumption data.
`create_activity_dict`(dataframe_filt, known_inputs, ...)	Creates a dictionary of activities for the BW database based on the filtered dataframe and known inputs.
`create_database`(dataframe_filt, project_name, db_name, ...)	Creates an "ecoinvent-like" BW database based on a previously filtered dataframe.
`filt_cutoff`(countries[, times, general_range, ...])	Filters data based on selected countries and times (either one-off, a range, or periodical range).
`filter_by_countries`(dataframe, countries)	Filter the dataframe by selected countries.
`filter_by_range`(dataframe, general_range, ...)	Filter the dataframe by a general time range and optionally by a refined time range.
`filter_by_times`(dataframe, times)	Filter the dataframe by specific times.
`get_network_activities`(eidb_name)
`load_mapping_data`(mapping_location)	Load the mapping data from an Excel file.
`load_time_series_data`(path_to_data, year)	Load the time series data from a pickle file and format it as a DataFrame.
`map_known_inputs`(eidb_name, dataframe_filt)	Maps known inputs from the ecoinvent database to the filtered dataframe.
`prepare_consumption_data`(Z_cons)	Prepare the consumption data by removing trade data and adjusting indices.
`setup_database`(project_name, db_name)	Sets up the BW2 database for the given project.
`tech_mapping`(year, path_to_data[, path_to_mapping])	Main function to map the technologies and scale them to 1 kWh.

Module Contents

shrecc.database.apply_cutoff(df_filt, cutoff, include_cutoff)[source]

Apply a cutoff value to filter out smaller values in the dataframe and optionally include a “rest” category.

Parameters:

df_filt (pd.DataFrame) – The filtered dataframe.
cutoff (float) – The cutoff value for technology values.
include_cutoff (bool) – If True, sums values below cutoff and includes them as a new technology “The rest”.

Returns:

A dataframe with values below the cutoff set to zero, optionally including a “rest” category.

Return type:

pd.DataFrame
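
The cutoff behaviour described above can be sketched with plain pandas. This is a minimal illustration, not the shrecc implementation: it assumes technologies are columns and time steps are rows, and the helper name `cutoff_sketch` and the sample values are hypothetical.

```python
import pandas as pd


def cutoff_sketch(df: pd.DataFrame, cutoff: float, include_rest: bool) -> pd.DataFrame:
    """Zero out values below `cutoff`; optionally collect them in 'The rest'."""
    below = df < cutoff                # boolean mask of the small values
    result = df.mask(below, 0.0)       # small values become zero
    if include_rest:
        # Sum the removed values per row into one extra column.
        result["The rest"] = df.where(below, 0.0).sum(axis=1)
    return result


# Hypothetical shares of three technologies at two time steps.
shares = pd.DataFrame(
    {"wind": [0.5, 0.25], "solar": [0.4995, 0.75], "oil": [0.0005, 0.0]},
    index=["t0", "t1"],
)
filtered = cutoff_sketch(shares, cutoff=0.001, include_rest=True)
```

With `include_rest=True` the row totals are preserved, because whatever is removed from a small technology reappears in the extra column.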

shrecc.database.apply_mapping(Z_cons_to_multiply, el_map_all_norm)[source]

Apply the technology mapping to the consumption data.

Parameters:

Z_cons_to_multiply (pd.DataFrame) – The consumption data to be mapped.
el_map_all_norm (pd.DataFrame) – The normalized mapping data.

Returns:

The resulting DataFrame after applying the technology mapping.

Return type:

pd.DataFrame

shrecc.database.create_activity_dict(dataframe_filt, known_inputs, known_inputs_network, db_name)[source]

Creates a dictionary of activities for the BW database based on the filtered dataframe and known inputs.

Parameters:

dataframe_filt (pd.DataFrame) – The filtered dataframe containing technology data.
known_inputs (dict) – A dictionary mapping known inputs to ecoinvent database entries.
known_inputs_network (dict) – A dictionary mapping known network inputs to ecoinvent database entries.
db_name (str) – The name of the BW database.

Returns:

A dictionary containing activities to be written to the BW2 database.

Return type:

dict

shrecc.database.create_database(dataframe_filt, project_name, db_name, eidb_name, network='True')[source]

Creates an “ecoinvent-like” BW database based on a previously filtered dataframe.

Parameters:

dataframe_filt (pd.DataFrame) – Scaled and filtered dataframe.
project_name (str) – BW project name to which the database will be saved.
db_name (str) – Name of the BW database to be created.
eidb_name (str) – Name of the ecoinvent database. Must be the same as in the BW project.
network (bool) – If True, network activities will be considered. Note that the default shown in the signature is the string 'True', not the boolean True.

Returns:

None

shrecc.database.filt_cutoff(countries, times=[], general_range=0, refined_range=0, freq=0, cutoff=0.001, include_cutoff=True, path_to_data=None)[source]

Filters data based on selected countries and times (either one-off, a range, or periodical range).

Parameters:

year (int) – Selected year of the downloaded data. This parameter is documented here but does not appear in the signature above.
countries (list of str) – Countries selected by the user for their database, e.g. countries = ['FR', 'DE'].
times (list of str) – One or more specific times, e.g. times = ['2023-06-16 8:00:00', '2023-06-16 22:00:00']. Can be applied alone.
general_range (list of str) – A general range given as [start, end], e.g. for the month of June general_range = ['2023-06-01 01:00:00', '2023-06-30 23:00:00']. Can be applied alone.
refined_range (list of int) – Hours that refine the general range, e.g. the mornings of June (with June already selected in general_range): refined_range = [8, 9, 10, 11]. Can only be applied together with general_range.
freq (str) – Days to be included, e.g. freq='D' selects calendar days; see https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#offset-aliases for the available offset aliases.
cutoff (float) – Cutoff value for technology values.
include_cutoff (bool) – If True, the cutoff is applied and the values below it are summed into a new technology “The rest”. If False, the cutoff is applied but no new technology is created.
path_to_data (str or Path) – Location of the data. If None, the data is taken from within the package.

Returns:

The filtered dataframe.

Return type:

pd.DataFrame
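
The selection modes listed above (one-off times, a general range, and a general range refined by hours) can be kept apart with a small helper. This is only a sketch: `build_selection` is a hypothetical name, not a shrecc function, and it does nothing more than collect arguments in the documented shapes and enforce the stated rule that refined_range needs general_range.

```python
def build_selection(countries, times=None, general_range=None,
                    refined_range=None, freq=None):
    """Hypothetical helper (not part of shrecc): collect selection arguments."""
    if refined_range and not general_range:
        raise ValueError("refined_range can only be applied with general_range")
    selection = {"countries": list(countries)}
    optional = {
        "times": times,
        "general_range": general_range,
        "refined_range": refined_range,
        "freq": freq,
    }
    # Keep only the options that were actually provided.
    selection.update({key: value for key, value in optional.items() if value})
    return selection


# One-off times, applied alone.
one_off = build_selection(
    ["FR", "DE"], times=["2023-06-16 8:00:00", "2023-06-16 22:00:00"]
)

# Mornings of June: a general range refined by hours.
june_mornings = build_selection(
    ["FR"],
    general_range=["2023-06-01 01:00:00", "2023-06-30 23:00:00"],
    refined_range=[8, 9, 10, 11],
    freq="D",
)
```

Each dictionary could then be unpacked into the call, for example `filt_cutoff(**june_mornings)`, although that call is not exercised here.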

shrecc.database.filter_by_countries(dataframe, countries)[source]

Filter the dataframe by selected countries.

Parameters:

dataframe (pd.DataFrame) – The original dataframe containing data for multiple countries.
countries (list of str) – A list of country codes to filter by.

Returns:

A dataframe filtered by the specified countries.

Return type:

pd.DataFrame
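
A country filter of this kind can be sketched as follows. The layout of the real dataframe is not documented here, so the example assumes a column MultiIndex with a level named "country"; that level name and the sample values are assumptions.

```python
import pandas as pd

# Hypothetical layout: one column per (country, technology) pair.
columns = pd.MultiIndex.from_tuples(
    [("FR", "nuclear"), ("DE", "wind"), ("ES", "solar")],
    names=["country", "technology"],
)
data = pd.DataFrame([[0.7, 0.2, 0.1], [0.6, 0.3, 0.1]], columns=columns)

countries = ["FR", "DE"]
# Boolean mask over the columns whose country code was requested.
keep = data.columns.get_level_values("country").isin(countries)
selected = data.loc[:, keep]
```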

shrecc.database.filter_by_range(dataframe, general_range, refined_range, freq)[source]

Filter the dataframe by a general time range and optionally by a refined time range.

Parameters:

dataframe (pd.DataFrame) – The original dataframe containing data.
general_range (list of str) – The start and end of the general range to filter by (e.g., ['2023-06-01', '2023-06-30']).
refined_range (list of int) – A list specifying the refined range of hours to filter within the general range.
freq (str) – The frequency for generating timestamps (e.g., 'D' for daily).

Returns:

A dataframe filtered by the specified time range and refined range.

Return type:

pd.DataFrame
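
How the three arguments might combine can be shown with a short pandas sketch. It assumes an hourly DatetimeIndex on the rows and is one plausible reading of the description above, not the shrecc implementation; the variable names and sample data are hypothetical.

```python
import pandas as pd

# Hypothetical hourly data covering three days.
index = pd.date_range("2023-06-01 00:00:00", "2023-06-03 23:00:00", freq="h")
load = pd.DataFrame({"FR": range(len(index))}, index=index)

general_range = ["2023-06-01 01:00:00", "2023-06-02 23:00:00"]
refined_range = [8, 9, 10, 11]
freq = "D"

# 1. Keep only the rows inside the general range (both ends inclusive).
in_range = load.loc[general_range[0]:general_range[1]]

# 2. Generate the days to include from the frequency alias.
days = pd.date_range(general_range[0], general_range[1], freq=freq).normalize()
on_days = in_range[in_range.index.normalize().isin(days)]

# 3. Refine to the requested hours of each day.
mornings = on_days[on_days.index.hour.isin(refined_range)]
```

Two days with four morning hours each leave eight rows in `mornings`.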

shrecc.database.filter_by_times(dataframe, times)[source]

Filter the dataframe by specific times.

Parameters:

dataframe (pd.DataFrame) – The original dataframe containing data for multiple times.
times (list of str) – A list of specific times to filter by.

Returns:

A dataframe filtered by the specified times.

Return type:

pd.DataFrame
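
Selecting individual timestamps can be sketched like this, again assuming an hourly DatetimeIndex on the rows. The sample data is hypothetical and the code is an illustration of the idea rather than the shrecc implementation.

```python
import pandas as pd

# Hypothetical hourly data for a single day.
index = pd.date_range("2023-06-16 00:00:00", periods=24, freq="h")
load = pd.DataFrame({"FR": range(24)}, index=index)

times = ["2023-06-16 8:00:00", "2023-06-16 22:00:00"]
# Parse each string so that "8:00:00" and "08:00:00" are treated alike.
stamps = [pd.Timestamp(t) for t in times]
picked = load.loc[load.index.isin(stamps)]
```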

shrecc.database.get_network_activities(eidb_name)[source]

shrecc.database.load_mapping_data(mapping_location)[source]

Load the mapping data from an Excel file. mapping_location can be either the full path to a file or a directory. If it is a directory, the file name is assumed to be el_map_all_norm.csv.

Parameters:

mapping_location (str or Path) – A full filename as a string, or a path to the scaled technology mapping.

Returns:

A DataFrame containing the technology mapping data from the Excel file.

Return type:

pd.DataFrame
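
The file-or-directory rule described above can be sketched with pathlib. `resolve_mapping_path` is a hypothetical helper written for illustration; only the default file name comes from the description.

```python
import tempfile
from pathlib import Path

DEFAULT_NAME = "el_map_all_norm.csv"  # file name stated in the description


def resolve_mapping_path(mapping_location):
    """Hypothetical helper: return the mapping file for a file or directory."""
    path = Path(mapping_location)
    # A directory gets the default file name appended; a file is used as-is.
    return path / DEFAULT_NAME if path.is_dir() else path


with tempfile.TemporaryDirectory() as tmp:
    from_dir = resolve_mapping_path(tmp)
    from_file = resolve_mapping_path(Path(tmp) / "custom_map.csv")
```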

shrecc.database.load_time_series_data(path_to_data, year)[source]

Load the time series data from a pickle file and format it as a DataFrame.

Parameters:

path_to_data (str or Path) – The path to the directory containing the time series data.
year (int) – The year corresponding to the time series data.

Returns:

A DataFrame containing the time series data, with levels reordered and sorted.

Return type:

pd.DataFrame
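
Reading a pickled DataFrame and then reordering and sorting its index levels might look like the sketch below. The file name, the level names ("source", "country") and the sample values are assumptions; the real file layout is not documented here.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical frame with a two-level row index.
index = pd.MultiIndex.from_tuples(
    [("wind", "FR"), ("nuclear", "FR"), ("wind", "DE")],
    names=["source", "country"],
)
raw = pd.DataFrame({"2023-06-16 08:00:00": [1.0, 2.0, 3.0]}, index=index)

with tempfile.TemporaryDirectory() as tmp:
    pickle_path = Path(tmp) / "example_2023.pkl"  # hypothetical file name
    raw.to_pickle(pickle_path)
    loaded = pd.read_pickle(pickle_path)

# Put the country level first, then sort the rows.
tidy = loaded.reorder_levels(["country", "source"]).sort_index()
```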

shrecc.database.map_known_inputs(eidb_name, dataframe_filt)[source]

Maps known inputs from the ecoinvent database to the filtered dataframe.

Parameters:

eidb_name (str) – The name of the ecoinvent database in the BW project.
dataframe_filt (pd.DataFrame) – The filtered dataframe containing technology data.

Returns:

A dictionary mapping known inputs to their corresponding entries in the ecoinvent database.

Return type:

dict

shrecc.database.prepare_consumption_data(Z_cons)[source]

Prepare the consumption data by removing trade data and adjusting indices.

Parameters:

Z_cons (pd.DataFrame) – The original consumption data DataFrame.

Returns:

The prepared consumption data, with the trade data removed and indices swapped.

Return type:

pd.DataFrame

shrecc.database.setup_database(project_name, db_name)[source]

Sets up the BW2 database for the given project.

Parameters:

project_name (str) – The name of the BW project.
db_name (str) – The name of the BW database to set up.

Returns:

The newly registered BW2 database.

Return type:

bd.Database

shrecc.database.tech_mapping(year, path_to_data, path_to_mapping=None)[source]

Main function to map the technologies and scale them to 1 kWh.

Parameters:

year (int) – The year corresponding to the data.
path_to_data (str or Path) – Root directory of the data.
path_to_mapping (str or Path) – File with the mapping of the scaled technology mappings. If None, it will use the mapping from the package.

Returns:

A DataFrame with the scaled technology mappings.

Return type:

pd.DataFrame
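
One plausible reading of "scale them to 1 kWh" is that each consumer's technology contributions are normalised to sum to one. The sketch below shows that normalisation with pandas; the orientation (technologies as rows, consumers as columns) and the sample values are assumptions, and it is not the shrecc implementation.

```python
import pandas as pd

# Hypothetical supply to two consumers, in arbitrary energy units.
mix = pd.DataFrame(
    {"FR": [300.0, 100.0], "DE": [50.0, 150.0]},
    index=["nuclear", "wind"],
)

# Divide every column by its total so that each column sums to 1 (per kWh).
per_kwh = mix.div(mix.sum(axis=0), axis=1)
```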

shrecc.database.UNUSED_SOURCE = 'Import balance (physical)'[source]