Modeling pipeline for the TRA420 course project: starting from regional energy demand and elasticity assumptions, deriving energy mixes, estimating emissions, linking those emissions to global temperature responses, and evaluating local/global impact metrics such as the Social Cost of Carbon (SCC).
- src/
  - calc_emissions/: converts electricity demand and mix into emission deltas.
  - climate_module/: FaIR wrappers and scenario tools.
  - air_pollution/: maps non-CO₂ deltas to concentration-driven health impacts.
  - economic_module/: SCC utilities (damages, discounting, reporting).
  - local_climate_impacts/: converts global responses into country-level temperature and precipitation trajectories.
- scripts/: CLI helpers such as run_fair_scenarios.py for quick experiments.
- data/: input datasets (raw and processed). Includes calc_emissions/ (country configs and emission factors), air-pollution statistics, GDP/Population tables, and local climate-impact scaling factors. The canonical Excel workbooks for electricity mixes and technology intensities (Electricity_OECD.xlsx, Emission_factors_all.xlsx) live under data/calc_emissions/.
- results/: generated outputs. Per-country emissions live under results/<run>/emissions/<mix>/<Country>/ and aggregates under results/<run>/emissions/All_countries/<mix>/. All downstream modules reuse the same <run> prefix, so each experiment (set via run.output_subdir, or overridden by results.run_directory) keeps its own climate, air-pollution, economic, and summary folders.
- tests/: pytest suite covering emissions, climate, economic, and local climate-impact modules.
- config.yaml: project-level configuration (scenario metadata, default parameters).
- environment.yaml: preferred Python environment specification for reproducibility.
- pyproject.toml: project metadata plus Ruff lint/format configuration.
- .gitignore: excludes generated results and other artifacts.
- README.md: usage guidance and development conventions.
- docs/: module documentation and CLI guides (economic_module.md, climate_module.md, local_climate_impacts.md, air_pollution.md, results_summary.md, scripts.md).
- Create the environment:

      conda env create -f environment.yaml  # or using mamba or pip
      conda activate tra420-modeling

- Install the project in editable mode (once package metadata is finalized):

      pip install -e .

Run the calculator for one country (writes per-country deltas under results/<run>/emissions/<mix>/<Country>/). Country names correspond to config_<name>.yaml files in the directory configured by calc_emissions.countries.directory (default data/calc_emissions/configs/):

    python scripts/run_calc_emissions.py --country Albania

Valid names match config filenames (underscores instead of spaces), e.g. Serbia, Bosnia-Herzegovina, North_Macedonia, Kosovo, Montenegro. Scenario names follow the pattern <mix_case>__<demand_case>, where the demand case is one of base_demand, scen1_lower, scen1_mean, or scen1_upper. Downstream modules derive their worklists from these names, so keep the structure when creating new cases.
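
A minimal sketch of how a scenario name could be split and validated before it is handed to downstream modules. The helper name parse_scenario_name and the ScenarioName tuple are hypothetical and not part of the repository; only the naming pattern and the four demand cases come from the text above.

```python
from typing import NamedTuple

# Demand cases listed in this README.
DEMAND_CASES = ("base_demand", "scen1_lower", "scen1_mean", "scen1_upper")


class ScenarioName(NamedTuple):
    mix_case: str
    demand_case: str


def parse_scenario_name(name: str) -> ScenarioName:
    """Split '<mix_case>__<demand_case>' and validate the demand case."""
    # rpartition splits on the last double underscore, so a mix case that
    # itself contains '__' is kept intact on the left-hand side.
    mix_case, sep, demand_case = name.rpartition("__")
    if not sep or not mix_case:
        raise ValueError(f"expected '<mix_case>__<demand_case>', got {name!r}")
    if demand_case not in DEMAND_CASES:
        raise ValueError(f"unknown demand case {demand_case!r} in {name!r}")
    return ScenarioName(mix_case, demand_case)
```

Calling parse_scenario_name("base_mix__scen1_upper") yields the mix case base_mix and the demand case scen1_upper; a name without the double underscore raises ValueError.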
Need to call the per-country writer programmatically? Import it from calc_emissions.writers:

    from pathlib import Path

    from calc_emissions.writers import write_per_country_results

    # per_country_map: {Country: {scenario_name: EmissionScenarioResult}}
    write_per_country_results(per_country_map, Path("results/my_run/emissions"))

per_country_map must follow {Country: {scenario_name: EmissionScenarioResult}}, which is the structure returned by run_calc_emissions.py.
Compute deltas for all countries and write the sum by mix to results/<run>/emissions/All_countries/<mix>/co2.csv (with per-demand columns):

    python scripts/run_calc_emissions_all.py

Options:

- Restrict to specific countries:

      python scripts/run_calc_emissions_all.py --countries Albania Serbia

- Choose a different output directory or mirror results elsewhere:

      python scripts/run_calc_emissions_all.py --output results/emissions/All_countries_custom --results-output results/emissions/All_countries_custom_copy

Outputs mirror the per-country structure (co2.csv, nox.csv, so2.csv, pm25.csv, and gwp100.csv when available) with absolute_*/delta_* columns for each demand case. The list of country configs, scenario filter, and the default aggregate output directories are configurable via the calc_emissions.countries block in config.yaml.
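
As a small illustration of the folder layout described above, the following sketch builds the per-country and aggregate emission directories for a run. The function emissions_dir is a hypothetical helper, not something the repository is known to export.

```python
from pathlib import Path
from typing import Optional


def emissions_dir(results_root: str, run: str, mix: str,
                  country: Optional[str] = None) -> Path:
    """Return the emissions folder for one mix.

    With a country: results/<run>/emissions/<mix>/<Country>/
    Without:        results/<run>/emissions/All_countries/<mix>/
    """
    base = Path(results_root) / run / "emissions"
    if country is None:
        return base / "All_countries" / mix
    return base / mix / country
```

For example, emissions_dir("results", "my_run", "base_mix", "Albania") / "co2.csv" points at the per-country CO₂ file, while omitting the country gives the aggregate folder.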
To execute emissions, climate, local climate impacts, air-pollution and SCC modules in a single command (using the defaults from config.yaml):

    python scripts/run_full_pipeline.py

Typical workflow (driven by config.yaml):

- Emissions: python scripts/run_calc_emissions.py --country <name> (per-country deltas) or python scripts/run_calc_emissions_all.py (aggregated deltas). Run this before the downstream modules so that results/<run>/emissions/<mix>/<Country>/ and results/<run>/emissions/All_countries/<mix>/ hold current data.
- Air-pollution impacts: python scripts/run_air_pollution.py combines non-CO₂ deltas with concentration stats to estimate concentration changes, mortality percentage changes, absolute deaths, and monetary benefits (per pollutant and aggregated).
- Global climate: python scripts/run_fair_scenarios.py writes results/<run>/climate/*.csv. Each CSV includes a climate_scenario column, and the run also produces background baseline CSVs (background_climate_full.csv, background_climate_horizon.csv) for plotting/reference.
- Local climate impacts (optional): python scripts/run_local_climate_impacts.py consumes the global climate CSVs plus the scaling factors table and produces per-country temperature and precipitation deltas under results/<run>/climate_scaled/<ISO3>/<scenario>.csv (with an AVERAGE/ folder containing equal-weight means). Files are truncated to the modeling horizon (e.g., 2025–2100).
- Economics (pulse SCC only): python scripts/run_scc.py infers the SSP family from the climate CSVs, builds socioeconomics (SSP tables or DICE mode), and evaluates the SCC via the FaIR pulse workflow once per climate scenario/discount method. Outputs include pulse_scc_timeseries_<method>_<ssp>.csv plus per-mix damages (results/<run>/economic/<mix>/damages_<method>_<scenario>.csv).
- Summary: PYTHONPATH=src python scripts/generate_summary.py compiles emissions, climate, SCC, damages, and air-pollution metrics into results/<run>/summary/summary.csv and generates mix-specific plots with lower/mean/upper envelopes. A shared plots/scc_timeseries.png shows SCC trajectories by SSP.
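
When stepping through the workflow manually rather than via run_full_pipeline.py, the order above can be captured in a short driver. This is only a sketch: the script paths are the ones named in this README, but the driver itself (WORKFLOW, build_commands, run_workflow) is hypothetical, and setting PYTHONPATH=src for every step is an assumption (the README only shows it for the summary script).

```python
import os
import subprocess
import sys

# Scripts in the order listed in the typical workflow.
WORKFLOW = [
    "scripts/run_calc_emissions_all.py",
    "scripts/run_air_pollution.py",
    "scripts/run_fair_scenarios.py",
    "scripts/run_local_climate_impacts.py",
    "scripts/run_scc.py",
    "scripts/generate_summary.py",
]


def build_commands(python=sys.executable):
    """Return one argv list per workflow step without running anything."""
    return [[python, script] for script in WORKFLOW]


def run_workflow():
    """Run every step in order; stop at the first failing script."""
    env = {**os.environ, "PYTHONPATH": "src"}  # assumption, see lead-in
    for cmd in build_commands():
        subprocess.run(cmd, check=True, env=env)
```

build_commands() is kept separate from run_workflow() so the sequence can be inspected or logged before anything is executed.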
The summary CSV stores one row per (energy_mix, demand_case, climate_scenario) combination with column groups in the following order:
- Scenario descriptors: energy_mix, climate_scenario, demand_case.
- Headline emissions: delta_co2_Mt_all_countries_<year> plus per-country pollutant deltas.
- Other pollutants: aggregated delta_<pollutant>_<unit>_all_countries_<year> columns.
- Temperature: global delta_T_C_<year> and pattern-scaled delta_T_<ISO3>_<year>.
- SCC: SCC_<method>_<year>_PPP_USD_2025_discounted_to_year_per_tco2 or scc_average_<method>.
- Damages: damages_PPP2020_usd_baseyear_<base_year>_<method>_<year> plus year_period sums when configured.
- Air pollution: air_pollution_mortality_difference_all_countries_<year>, air_pollution_mortality_percent_change_all_countries_<year> (percentage points), monetary benefits, concentration deltas averaged and per country (µg/m³), along with optional period sums (air_pollution_mortality_difference_sum_all_countries_<start>_to_<end>).
- Socioeconomics: GDP/population snapshots backing the SCC calculations.
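
To pick columns out of summary.csv programmatically, it can help to generate the expected names from the patterns above. The sketch below covers only the descriptor, CO₂ and temperature groups; summary_columns is a hypothetical helper and the exact column set in a given run depends on the configured years and countries.

```python
def summary_columns(years, iso3_codes):
    """Build descriptor, CO2 and temperature column names for summary.csv."""
    cols = ["energy_mix", "climate_scenario", "demand_case"]
    # Headline emissions: one aggregated CO2 delta per configured year.
    cols += [f"delta_co2_Mt_all_countries_{year}" for year in years]
    # Temperature: global delta first, then pattern-scaled country deltas.
    for year in years:
        cols.append(f"delta_T_C_{year}")
        cols += [f"delta_T_{iso3}_{year}" for iso3 in iso3_codes]
    return cols
```

The resulting list can be passed, for instance, as the usecols argument of a CSV reader to load only the columns of interest.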
Plots live in results/<run>/summary/plots/. Mix-specific folders contain emissions, mortality, concentration, and damage timeseries with shaded envelopes for scen1_lower/scen1_upper. The main folder contains background climate graphics, socioeconomics, and an SCC timeseries plot showing every SSP over the full horizon.
- Set run.output_subdir in config.yaml (or pass --run-subdir <name> to scripts/run_full_pipeline.py) to keep outputs under results/<name>/….
- results.run_directory takes precedence over run.output_subdir and is useful as an explicit override (for example when post-processing a prior run). The summary generator also supports PYTHONPATH=src python scripts/generate_summary.py --run-directory <name>.
- For batch experiments, set run.mode: scenarios and point run.scenario_file to a YAML file mapping scenario names to overrides. Each entry can tweak any part of the base config; the pipeline deep-merges the overrides, forces run.mode back to normal, assigns a per-scenario subdirectory (results/<suite>/<scenario>/…), and runs the full workflow. After the suite finishes, an aggregate CSV/JSON plus a copy of the scenario YAML are written to results/<suite>/ for provenance.
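
The deep-merge step for batch scenarios can be pictured roughly as follows. This is a sketch under assumptions, not the pipeline's actual code: deep_merge and scenario_config are hypothetical names, and storing the per-scenario folder in run.output_subdir is a guess at how the subdirectory is assigned.

```python
import copy


def deep_merge(base: dict, overrides: dict) -> dict:
    """Return a copy of base with overrides applied recursively."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested mappings are merged key by key instead of replaced.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def scenario_config(base: dict, suite: str, name: str, overrides: dict) -> dict:
    """Build the config for one scenario of a suite."""
    cfg = deep_merge(base, overrides)
    run = cfg.setdefault("run", {})
    run["mode"] = "normal"                   # avoid recursing into batch mode
    run["output_subdir"] = f"{suite}/{name}"  # assumption: results/<suite>/<scenario>/
    return cfg
```

Because the merge works on a deep copy, the base configuration stays untouched and can be reused for every scenario in the suite.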
- The socioeconomics block in config.yaml (default mode dice) replaces the old SSP lookup with a DICE-style projection. Population follows logistic growth using scenario-specific asymptotes, total factor productivity declines gradually, and capital evolves via DICE's savings/depreciation rules. These trajectories feed directly into run_scc.py.
- The IIASA GDP/Population tables bundled in data/GDP_and_Population_data/IIASA/ end at 2100. To analyze horizons beyond that, switch socioeconomics.mode to dice, which synthesizes GDP/POP series for the entire damage window.
- Set socioeconomics.mode: ssp (default) when you prefer empirical pathways: the SCC runner will pull GDP and population from the IIASA Scenario Explorer extracts shipped under data/GDP_and_Population_data/IIASA/.
- The climate module supports calibrated FaIR runs via climate_module.fair.calibration. Provide the local calibration dataset once (this repo ships data/FaIR_calibration_data/v1.5.0) and the runner will:
  - pick the requested posterior ensemble member,
  - replace FaIR's carbon-cycle pools, CH₄ lifetime terms, and F₂× forcing with the calibrated values,
  - replay the CMIP7 historical emissions plus solar/volcanic forcings from 1750 before transitioning to each SSP scenario.

  No pooch downloads are required; the files are read directly from the specified base_path.
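
To make the "logistic growth towards an asymptote" idea in the dice mode concrete, here is a generic discrete logistic update. It is an illustration only: the function name, the update rule, and the parameter values in the usage note are assumptions, and the project's own parameterisation may differ.

```python
def logistic_population(p0: float, asymptote: float, rate: float, years: int):
    """Project population with a discrete logistic update.

    P[t+1] = P[t] + rate * P[t] * (1 - P[t] / asymptote)
    Growth slows as the population approaches the asymptote.
    """
    series = [float(p0)]
    for _ in range(years):
        current = series[-1]
        series.append(current + rate * current * (1.0 - current / asymptote))
    return series
```

With, say, logistic_population(8000.0, 10000.0, 0.03, 75) (millions, purely illustrative numbers) the series rises monotonically and stays below the asymptote.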
Install development dependencies and run the pytest suite from the project root:

    pip install -e '.[dev]'
    python -m pytest

Individual modules can be exercised via python -m pytest tests/test_calc_emissions.py (or similar) when iterating quickly. Continuous integration expects the full suite to pass before changes are submitted.
- Module boundaries: keep each package focused (e.g., energy should not depend on UI code).
- Configuration over constants: read scenario parameters from config.yaml or files in config/.
- Typing & docs: use type hints and concise docstrings to clarify model interfaces.
- Testing: add unit tests alongside new functionality (mirror the structure under tests/).
- Data provenance: document dataset sources and preprocessing steps in data/README.md (create the file when data arrives).
- Version control: exclude large datasets or exploratory notebooks unless essential; prefer lightweight CSV/JSON inputs in Git.

- Create a feature branch (e.g., feature/energy-mix).
- Implement changes with tests, documentation updates, and Ruff-clean code.
- Run pre-commit run -a (or at least ruff check / ruff format) before committing.
- Open a pull request summarizing scientific assumptions and validation steps.

- Put reusable library code under src/ (e.g. src/climate_module/). Modules here should expose functions/classes without side effects so they can be imported from notebooks, other scripts, or tests.
- Place runnable entry points or one-off helpers under scripts/. These are thin wrappers that import from src/, read configuration (like config.yaml), and orchestrate the workflow.
- Use results/emissions/ for intermediate inputs (each mix folder contains co2/so2/nox/pm25.csv with per-demand columns). This folder is ignored by Git so you can generate or edit CSVs without polluting commits.

- Ruff is configured in pyproject.toml to handle both linting and code formatting.
- Install the Git hooks once per clone to enable automatic fixes:

      pre-commit install

- Run Ruff manually when needed:

      ruff check . --fix
      ruff format .

- For more comprehensive checks including unsafe fixes:

      ruff check . --fix --unsafe-fixes

- pre-commit run -a will apply the same checks to the whole repository (useful before opening a PR).

All runtime settings live in config.yaml.

time_horizon

- Shared {start, end, step} used across calc-emissions, climate, and downstream modules. The climate runner upgrades this window to annual resolution automatically.
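
A tiny sketch of what "upgrading to annual resolution" means for a {start, end, step} window. The helper names are hypothetical, and how values between the original steps are filled in is not covered here.

```python
def horizon_years(horizon: dict) -> list:
    """Expand a {start, end, step} mapping into the list of years it covers."""
    return list(range(horizon["start"], horizon["end"] + 1, horizon["step"]))


def to_annual(horizon: dict) -> dict:
    """Return the same window with a one-year step."""
    return {"start": horizon["start"], "end": horizon["end"], "step": 1}
```

A five-year window from 2025 to 2040 covers four years in the shared horizon and sixteen once upgraded for the climate run.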

calc_emissions

- Defines electricity demand/mix scenarios and converts them into Mt/year deltas for CO₂, SOₓ, NOₓ, PM₂.₅ (and optional GWP100).
- Key subsections:
  - emission_factors_file: CSV with a technology column and pollutant intensities (*_kg_per_kwh, *_mt_per_twh, etc.); values are harmonised to Mt/TWh.
  - demand_scenarios / mix_scenarios: named templates used by baseline and entries in scenarios.
  - baseline: reference demand + mix used to compute differences.
  - scenarios: list of electricity cases. Each entry can reference a named scenario or supply *_custom mappings.
  - countries: metadata pointing to per-country configs, aggregate output folders, optional notes file, and the shared scenario filter (names must exist in every country file).
- Outputs one folder per scenario in the configured directory (default results/emissions/<mix>/<Country>/). Files include co2.csv, sox.csv, nox.csv, pm25.csv, gwp100.csv (when available); the climate module consumes co2.csv while the others support air-pollution analysis.
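
The unit harmonisation and the delta calculation can be sketched as below. Since 1 kg/kWh equals 10⁹ kg/TWh, i.e. 1 Mt/TWh, the kg_per_kwh factor is 1. The g_per_kwh entry and all function names are assumptions added for illustration; the actual module may handle a different set of suffixes.

```python
# Multipliers that bring an intensity to Mt/TWh.
TO_MT_PER_TWH = {
    "kg_per_kwh": 1.0,   # 1 kg/kWh = 1e9 kg/TWh = 1 Mt/TWh
    "g_per_kwh": 1e-3,   # hypothetical extra unit for illustration
    "mt_per_twh": 1.0,
}


def to_mt_per_twh(value: float, unit: str) -> float:
    """Convert one emission intensity to Mt/TWh."""
    if unit not in TO_MT_PER_TWH:
        raise ValueError(f"unsupported intensity unit: {unit!r}")
    return value * TO_MT_PER_TWH[unit]


def emissions_mt(mix_twh: dict, factors: dict) -> float:
    """Total emissions (Mt/year) for a generation mix given in TWh/year."""
    return sum(twh * factors[tech] for tech, twh in mix_twh.items())


def emission_delta_mt(baseline_mix: dict, scenario_mix: dict, factors: dict) -> float:
    """Scenario minus baseline emissions, in Mt/year."""
    return emissions_mt(scenario_mix, factors) - emissions_mt(baseline_mix, factors)
```

A negative delta means the scenario emits less than the baseline, which is the sign convention assumed in this sketch.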

climate_module

- Consumes emission-difference files and runs FaIR temperature responses.
- Key options:
  - output_directory: where summary CSVs are written (results/climate by default).
  - sample_years_option: default (5-year to 2050, then 10-year) or full (every year 2025–2100).
  - parameters: global FaIR settings (e.g. deep_ocean_efficacy, forcing_4co2, equilibrium_climate_sensitivity). Start/end years inherit from time_horizon, and runs always use 1-year steps.
  - climate_scenarios: SSP pathways to run (use run: all or a list of IDs) with per-pathway tweaks.
  - emission_scenarios: which emission scenario folders in results/emissions/All_countries/ to process (all or a list of mix names). Only co2.csv feeds FaIR; other pollutant files are optional analytics inputs.
- When economic_module.damage_duration_years exceeds the emission horizon, FaIR extends its run to start + duration - 1 and holds the terminal emission delta constant.
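
The tail extension described in the last bullet amounts to padding the emission-delta series with its final value up to start + duration - 1. A minimal sketch, with a hypothetical function name and plain lists standing in for whatever data structure the runner actually uses:

```python
def extend_emission_delta(years, deltas, start, duration_years):
    """Pad an annual emission-delta series up to start + duration_years - 1.

    Years beyond the original horizon reuse the terminal delta.
    """
    end_year = start + duration_years - 1
    out_years = list(years)
    out_deltas = list(deltas)
    for year in range(years[-1] + 1, end_year + 1):
        out_years.append(year)
        out_deltas.append(deltas[-1])  # hold the last delta constant
    return out_years, out_deltas
```

If the damage window already fits inside the emission horizon, the loop does nothing and the series is returned unchanged.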

local_climate_impacts

- Consumes global climate CSVs and applies country-specific scaling coefficients to produce temperature and precipitation responses.
- Key options:
  - output_directory: destination for per-country scaled results.
  - scaling_factors_file: path to the scaling table (e.g., data/pattern_scaling/cmip6_pattern_scaling_by_country_mean.csv).
  - scaling_weighting: selects which patterns.* column to use (e.g., area, gdp.2000, pop.2100).
  - countries: ISO3 codes to generate outputs for.
- Matches climate scenarios using the first four characters of each climate_module.climate_scenarios.definitions[*].id or the climate_scenario column injected into climate CSVs.
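
Matching on the first four characters can be pictured like this. The scenario IDs in the usage note are examples, the helper names are hypothetical, and lower-casing before comparison is an assumption made for robustness.

```python
def scenario_key(identifier: str) -> str:
    """Reduce a scenario ID or CSV label to its four-character key."""
    return identifier.strip().lower()[:4]


def match_scenarios(definition_ids, csv_labels):
    """Map each CSV label to the configured ID sharing its key (or None)."""
    by_key = {scenario_key(ident): ident for ident in definition_ids}
    return {label: by_key.get(scenario_key(label)) for label in csv_labels}
```

With configured IDs such as "ssp119" and "ssp245", a CSV label "ssp2" resolves to "ssp245". Two IDs from the same SSP family would collide on the same key, so this scheme presumes at most one pathway per family.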

air_pollution

- Translates emission changes for PM₂.₅ and NOₓ into mortality percentage differences by scaling baseline concentrations with emission ratios.
- Key options:
  - output_directory: where health-impact CSVs are written (results/air_pollution by default).
  - concentration_measure: preferred statistic (median, mean, etc.); the module falls back through concentration_fallback_order if the field is missing in the data.
  - country_weights: weighting used when averaging country-level responses (equal or a mapping {Country: weight}, normalised automatically; can be overridden per pollutant).
  - pollutants: per-pollutant overrides (stats file, relative_risk or beta, reference concentration delta). Use baseline_deaths to convert percentage changes into annual death deltas (per_year or total plus years/span); a module-level baseline_deaths entry applies to the combined total, optionally weighted by weights.
  - scenarios: all or a list of emission scenario names to evaluate.
- Outputs include one *_health_impact.csv per pollutant plus optional *_mortality_summary.csv (if baseline deaths are configured) and a combined total_mortality_summary.csv aggregating all pollutants.
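
One common way to turn a concentration change into a mortality percentage change is a log-linear exposure-response function, where beta = ln(RR) / reference delta. The sketch below assumes that form and a proportional link between emissions and concentration; neither the function names nor the numbers in the usage note are taken from the module itself.

```python
import math


def beta_from_relative_risk(relative_risk: float, reference_delta: float) -> float:
    """Log-linear slope per unit concentration (e.g. per µg/m³)."""
    return math.log(relative_risk) / reference_delta


def mortality_percent_change(baseline_conc: float, baseline_emissions: float,
                             scenario_emissions: float, beta: float) -> float:
    """Percentage change in mortality implied by an emission change."""
    # Scale the baseline concentration with the emission ratio.
    ratio = scenario_emissions / baseline_emissions
    delta_conc = baseline_conc * (ratio - 1.0)
    # Log-linear exposure-response: RR = exp(beta * delta_conc).
    return (math.exp(beta * delta_conc) - 1.0) * 100.0
```

For instance, with an illustrative relative risk of 1.08 per 10 µg/m³, halving emissions from a 20 µg/m³ baseline lowers the concentration by 10 µg/m³ and mortality by about 7.4 percent under these assumptions.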

results

- run_directory: optional subfolder inserted under results/ (for example run_A). When set, each module automatically writes to results/<run_directory>/<module>/… so you can compare runs without overwriting prior outputs.
- summary: configuration for the cross-module report (see docs/results_summary.md for details on available fields).

economic_module

- Computes SCC by combining temperature, emission, and GDP series.
- Configure discounting under economic_module.methods and provide GDP/emission inputs.
- damage_function supports optional threshold amplification, smooth saturation, and catastrophic add-ons in addition to the DICE quadratic terms (delta1, delta2). Tune behaviour via keys such as use_threshold, threshold_temperature, use_saturation, max_fraction, use_catastrophic, and related parameters (see config.yaml).
- Per-year SCC is computed with definition-faithful FaIR pulses (one FaIR evaluation per emission year) so the reported SCC(τ) is exact for the chosen discounting method; see docs/economic_module.md for details.
- Temperature CSVs export a climate_scenario column; the SCC runner reads it to select the matching SSP GDP/population series from gdp_population_directory (defaults to the IIASA Scenario Explorer extracts in data/GDP_and_Population_data/IIASA/{GDP,Population}.csv). At load time the GDP series is scaled by the BEA GDPDEF conversion factor (1.05) to express values in PPP USD-2025. Provide gdp_series only when overriding the IIASA tables.
- damage_duration_years extends the SCC damage window beyond the shared time horizon (starting at the global start year); datasets must supply values through the requested end year, and the climate module reuses the last available emission delta during the tail.
- data_sources.emission_root and data_sources.temperature_root should target the intermediate results/ products (results/emissions/All_countries/<mix>/co2.csv and results/climate/<scenario>_<climate>.csv). Scenario names use the format <mix>__<demand> (e.g. base_mix__scen1_upper). Climate pathways default to the SSPs enabled under climate_module.climate_scenarios.run; specify economic_module.data_sources.climate_scenarios only when you need to override that list.
- When aggregation is set to average, provide aggregation_horizon (start, end) to bound the averaging window. The CLI enforces this so you always know which portion of the timeline feeds the aggregate SCC.
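
The average aggregation over a bounded window can be sketched as follows. The function name and the dict-based inputs are hypothetical; the sketch only mirrors the rule that an explicit aggregation_horizon with start and end is required.

```python
def average_scc(scc_by_year: dict, aggregation_horizon: dict) -> float:
    """Mean SCC over the years inside [start, end], both inclusive."""
    if not aggregation_horizon:
        raise ValueError("aggregation_horizon (start, end) is required for 'average'")
    start = aggregation_horizon["start"]
    end = aggregation_horizon["end"]
    window = [value for year, value in scc_by_year.items() if start <= year <= end]
    if not window:
        raise ValueError(f"no SCC values between {start} and {end}")
    return sum(window) / len(window)
```

Values outside the window are ignored, so a late-horizon spike in SCC does not leak into an average meant to cover, say, 2025 to 2035.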

results

- summary collects cross-module indicators (SCC, damages, temperature and emission deltas, mortality impacts) for configured years and writes summary.csv plus optional comparison bar charts to output_directory. See docs/results_summary.md.
- Toggle include_plots to disable chart generation (useful on headless systems) or change plot_format for publication-ready graphics.