The following is a replication notebook of the paper by Bennedsen, Hillebrand, and Koopman (2024). The replication is done in Julia using Quarto as part of the paper “On measurements errors in CO\(_2\) airborne fraction estimates” by J. Eduardo Vera-Valdés and Charisios Grivas. This notebook contains no new results.
Notebook setup
This notebook is written in Julia and uses the following packages:
DataFrames for data manipulation
XLSX for reading data from an Excel file
Plots
LinearAlgebra
Statistics
HypothesisTests
All packages are available in the Julia registry and can be installed with the Julia package manager, for example with Pkg.add("DataFrames").
In the following, we load a project environment that contains the necessary packages. This step is not required if the packages are already installed in the current environment.
Airborne fraction
The airborne fraction is the fraction of CO\(_2\) emissions that remain in the atmosphere. It is a key parameter in the carbon cycle and is used to estimate the impact of human activities on the climate system. The airborne fraction is defined as the ratio of the increase in atmospheric CO\(_2\) concentration to the total CO\(_2\) emissions. It is usually expressed as a percentage.
Data
We load the data from an Excel file and plot the CO\(_2\) emissions and the atmospheric CO\(_2\) concentration over time.
The data is neatly collected in an Excel file in the authors’ GitHub repository at the following link.
For convenience, we have downloaded the data directly from the repository and saved it in the file AF_data.xlsx in the local folder.
We can read the data using the XLSX.jl package and convert it to a data frame using DataFrames.jl. We then recover the year, emissions, and coverage variables. Note that emissions are defined as the sum of fossil fuels (FF) and land-use and land-coverage changes (LULCC).
Note that atmospheric concentration growth, hereinafter denoted G, is converted to a vector of Float64 at definition to avoid type issues. Emissions, the sum of fossil fuels (fossilfuels) and land-use and land-coverage changes (lulcc), are denoted by E.
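The conversion described above can be sketched as follows. This is a minimal illustration with made-up numbers rather than the values in AF_data.xlsx; the names raw_growth, raw_ff, and raw_lulcc are hypothetical, and the element type Any stands in for what a spreadsheet import may return.

```julia
# Hypothetical raw columns with element type Any (made-up values)
raw_growth = Any[2.04, 1.5, 3]
raw_ff     = Any[2.4, 2.5, 2.6]
raw_lulcc  = Any[1.8, 1.7, 1.7]

# Convert to Float64 at definition so later arithmetic is type-stable
G = Float64.(raw_growth)
fossilfuels = Float64.(raw_ff)
lulcc = Float64.(raw_lulcc)

# Emissions are the sum of fossil fuels and land-use changes
E = fossilfuels .+ lulcc
```

With concrete Float64 vectors, expressions such as G ./ E or E'E behave as ordinary numeric operations.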
Once loaded, we can plot the data using the Plots.jl package.
Classic estimation
Commonly, the airborne fraction is estimated using the following formula:
\[\frac{G_t}{E_t} = \alpha + \epsilon_t\]
where \(G_t\) is the growth in atmospheric CO\(_2\) concentration at time \(t\), \(E_t\) is the total CO\(_2\) emissions at time \(t\), and \(\epsilon_t\) is the error term that captures the natural variability in the carbon cycle. The parameter \(\alpha\) is the airborne fraction.
In practice, we estimate \(\alpha\) by taking the mean of the ratio of the coverage to the emissions.
using Statistics
α₁ = mean(G ./ E)
0.43856861803874964
This value, about 0.44, is the estimated airborne fraction and matches the consensus value in the literature.
We plot the yearly airborne fraction and the estimated mean airborne fraction.
A new approach
Recently, Bennedsen, Hillebrand, and Koopman (2024) have suggested a new approach to estimate the airborne fraction. They propose to use the following formula:
\[G_t = \alpha E_t + \epsilon_t,\]
and estimate \(\alpha\), the airborne fraction, using ordinary least squares (OLS). They argue that this approach has better statistical properties. In particular, the OLS estimator is super-consistent, meaning that it converges to the true value faster than the usual \(\sqrt{T}\) rate of OLS with stationary regressors. They also show that the estimator has lower variance than the classic ratio-based estimator and is asymptotically normal.
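One way to see the lower-variance property is to compare the two estimators under a simplified setting. The sketch below assumes a fixed emissions path and i.i.d. errors with variance σ2 in \(G_t = \alpha E_t + \epsilon_t\); these assumptions, the made-up emissions values, and all variable names are illustrative and are not taken from the paper. Under them, the mean of the ratios has variance \(\sigma^2 T^{-2}\sum_t E_t^{-2}\), while OLS has variance \(\sigma^2/\sum_t E_t^2\).

```julia
# Made-up, trending emissions path (illustrative values only)
E = collect(range(4.0, 11.0; length = 64))
T = length(E)
σ2 = 1.0   # error variance; it cancels in the ratio below

# Variance of the classic estimator mean(G ./ E) under the stated assumptions
var_ratio = σ2 * sum(1 ./ E .^ 2) / T^2

# Variance of the OLS estimator (E'E) \ (E'G) under the same assumptions
var_ols = σ2 / sum(E .^ 2)

# By the Cauchy-Schwarz inequality this ratio is at least 1
efficiency = var_ratio / var_ols
```

The two variances coincide only when emissions are constant over time, so the gain from OLS in this setting grows with the spread of \(E_t\).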
The new approach relies on estimating the cointegration relationship between the emissions and the coverage using OLS. As for all cointegration analyses, as a first step, we need to check if the variables are integrated of the same order. We can do this by testing for the presence of a unit root in the series.
Unit root test
We use the Augmented Dickey-Fuller (ADF) test (Dickey and Fuller 1979) to test for the presence of a unit root in the series. The null hypothesis is that the series has a unit root, while the alternative hypothesis is that the series is stationary.
In Julia, we can use the ADFTest function from the HypothesisTests.jl package to perform the test.
As a demonstration, we test the emissions series for the presence of a unit root in a model with a trend and two lags.
using HypothesisTests
τᵉₜ = ADFTest(E, :trend, 2)
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -0.262114
Test summary:
outcome with 95% confidence: fail to reject h_0
p-value: 0.2907
Details:
sample size in regression: 61
number of lags: 2
ADF statistic: -2.57687
Critical values at 1%, 5%, and 10%: adjoint([-4.10768, -3.48147, -3.16849])
In this case, the null hypothesis of a unit root in the emissions series is not rejected.
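To make explicit what the reported point estimate and ADF statistic measure, the underlying regression can be sketched by hand. The helper adf_by_hand below is hypothetical and is not part of HypothesisTests.jl; it regresses \(\Delta y_t\) on \(y_{t-1}\), lagged differences, and optional deterministic terms, and returns the coefficient on \(y_{t-1}\) with its t-ratio. It does not compute p-values or critical values, which follow a non-standard distribution, and the package's implementation may differ in its details.

```julia
using LinearAlgebra

# Hypothetical helper, not part of HypothesisTests.jl
function adf_by_hand(y::AbstractVector{<:Real}, lags::Integer;
                     constant::Bool = false, trend::Bool = false)
    yf = float.(y)
    Δy = diff(yf)
    n = length(Δy) - lags                  # observations left after lagging
    dep = Δy[lags+1:end]                   # dependent variable Δyₜ
    cols = [yf[lags+1:end-1]]              # yₜ₋₁, the regressor of interest
    for i in 1:lags
        push!(cols, Δy[lags+1-i:end-i])    # lagged differences Δyₜ₋ᵢ
    end
    constant && push!(cols, ones(n))
    trend && push!(cols, collect(1.0:n))
    X = hcat(cols...)
    β = X \ dep                            # least-squares coefficients
    resid = dep - X * β
    s2 = sum(abs2, resid) / (n - size(X, 2))
    se = sqrt(s2 * inv(X' * X)[1, 1])
    return (coef = β[1], stat = β[1] / se)
end

# Geometric decay: Δyₜ = -0.5 yₜ₋₁ exactly, so the coefficient is -0.5
r = adf_by_hand(0.5 .^ (0:20), 0)

# Made-up persistent series; two lags with a constant and a linear trend
y = cumsum(sin.((1:64) .^ 2))
r2 = adf_by_hand(y, 2; constant = true, trend = true)
```

A strongly negative t-ratio is evidence against the unit root, but it has to be judged against Dickey-Fuller critical values rather than the usual Student-t ones.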
We can perform the same test for both the emissions and the coverage series while considering different combinations of models and lags.
# Data frame to store the results
resultsdf = DataFrame("Variable" => String[], "Model" => String[],
    "L = 0" => Float64[], "L = 1" => Float64[], "L = 2" => Float64[],
    "L = 3" => Float64[], "L = 4" => Float64[], "L = 5" => Float64[])

for variable in [:E, :G]
    for model in [:none, :constant, :trend]
        fila = zeros(6)  # one p-value per lag length
        for lags in 0:5
            τ = ADFTest(eval(variable), model, lags)
            fila[lags+1] = pvalue(τ)
        end
        push!(resultsdf, [titlecase(string(variable)), titlecase(string(model)), fila...])
    end
end

resultsdf
6×8 DataFrame
 Row │ Variable  Model     L = 0       L = 1       L = 2      L = 3      L = 4     L = 5
     │ String    String    Float64     Float64     Float64    Float64    Float64   Float64
─────┼───────────────────────────────────────────────────────────────────────────────────────
   1 │ E         None      1.0         1.0         0.999998   0.999812   0.997157  0.99959
   2 │ E         Constant  0.953474    0.939141    0.91181    0.862177   0.828175  0.896349
   3 │ E         Trend     0.0834996   0.202047    0.290654   0.313837   0.213148  0.474568
   4 │ G         None      0.235183    0.520472    0.695203   0.803914   0.84451   0.85085
   5 │ G         Constant  0.00180684  0.0718307   0.299885   0.444168   0.465884  0.386677
   6 │ G         Trend     2.4945e-9   7.17214e-6  0.0011416  0.0129135  0.016052  0.00266726
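Based on these results, we cannot reject the null hypothesis of a unit root in the emissions series at the 5% level for any model or lag length. For the coverage series, the tests fail to reject for the model without deterministic terms and, beyond zero lags, for the model with a constant; they reject only for the model with a trend and for the constant model with no lags. Overall, this suggests that both series are non-stationary, although the evidence for the coverage series depends on the deterministic specification.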
Cointegration test
We can test for cointegration between the emissions and the coverage using the Engle and Granger (1987) test. The null hypothesis is that there is no cointegration relationship between the series, while the alternative hypothesis is that there is a cointegration relationship.
To test for cointegration, we first estimate the OLS regression of the coverage on the emissions. We then test the residuals for a unit root using the ADF test. Note that the residuals should be stationary if there is a cointegration relationship between the series.
α₂ = (E'E) \ (E'G)
0.4477918844144535
The estimated airborne fraction is slightly larger than the classical estimate.
To test if there is a cointegration relationship, and thus that we have a valid estimate, we must recover the residuals and test them for a unit root.
res₂ = G - α₂ * E
τᵣ = ADFTest(res₂, :none, 0)
Augmented Dickey-Fuller unit root test
--------------------------------------
Population details:
parameter of interest: coefficient on lagged non-differenced variable
value under h_0: 0
point estimate: -0.963507
Test summary:
outcome with 95% confidence: reject h_0
p-value: <1e-11
Details:
sample size in regression: 63
number of lags: 0
ADF statistic: -7.58332
Critical values at 1%, 5%, and 10%: adjoint([-2.60156, -1.9459, -1.61324])
Because the residuals come from an estimated regression, the test statistic has to be compared against the critical values generated by MacKinnon (2010) rather than the standard ADF critical values printed above. We can reject the null hypothesis of a unit root in the residuals, which suggests that there is a cointegration relationship between the emissions and the coverage and that the estimate is valid.
Standard errors
We compute the standard errors of the estimates of the airborne fraction using the formula:
\[SE(\hat{\alpha}) = \sqrt{\hat{\sigma}^2_\epsilon\,(E'E)^{-1}},\]
where \(\hat{\sigma}^2_\epsilon = \sum_t\hat{\epsilon}_t^2/N\) is the estimate of the variance of the error term, \(\hat{\epsilon}_t\) are the residuals from the regression, and \(N\) is the sample size.
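A minimal sketch of this computation for the single-regressor model follows. It assumes the usual OLS expression \(\sqrt{\hat{\sigma}^2_\epsilon/(E'E)}\) for a regression through the origin, with the variance estimate divided by \(N\) as defined in the text. The data are synthetic (a made-up emissions path plus a deterministic disturbance), and the names αhat, σ2hat, and se are illustrative.

```julia
using LinearAlgebra

# Synthetic data, not the notebook's series
N = 64
E = collect(range(4.0, 11.0; length = N))
G = 0.45 .* E .+ 0.5 .* sin.(1:N)   # "true" coefficient 0.45 plus a disturbance

# OLS through the origin, as in the cointegration regression
αhat = (E'E) \ (E'G)

# Residuals and the variance estimate that divides by N
res = G .- αhat .* E
σ2hat = sum(abs2, res) / N

# Standard error of the estimated airborne fraction
se = sqrt(σ2hat / (E'E))
```

With several regressors, the scalar \(E'E\) would be replaced by the matching diagonal element of the inverse of \(X'X\).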
We also consider adding covariates to the model, namely \(ENSO\) (El Niño Southern Oscillation) and \(VAI\) (volcanic activity index). These variables are known to affect the carbon cycle and can potentially influence the airborne fraction.
In the extended regression, \(ENSO_t\) and \(VAI_t\), the El Niño Southern Oscillation and volcanic activity index at time \(t\), enter as additional regressors alongside \(E_t\).
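As a sketch of how such a regression can be estimated, assume the extended model is linear, \(G_t = \alpha E_t + \gamma_1 ENSO_t + \gamma_2 VAI_t + \epsilon_t\), where the coefficient names \(\gamma_1\) and \(\gamma_2\) are hypothetical. The series below are synthetic stand-ins with arbitrary coefficients, not the actual ENSO or VAI data.

```julia
using LinearAlgebra

n = 64
E    = collect(range(4.0, 11.0; length = n))           # made-up emissions path
ENSO = sin.(0.9 .* (1:n))                              # stand-in for a detrended ENSO index
VAI  = [t in (5, 24, 33) ? 1.0 : 0.0 for t in 1:n]     # stand-in volcanic index

# Noise-free outcome built from arbitrary coefficients, so OLS recovers
# them up to rounding error
G = 0.45 .* E .+ 0.3 .* ENSO .- 0.8 .* VAI

# Stack the regressors column-wise and solve the least-squares problem
X = [E ENSO VAI]
coef = X \ G

α_extended = coef[1]   # airborne fraction in the extended model
```

The first coefficient plays the role of the airborne fraction, now net of the variation attributed to the two covariates.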
Note that the authors first detrend the ENSO series before estimating the model. We can do this by regressing the series on a time trend and taking the residuals.
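The detrending step can be sketched as below. The example assumes the trend regression includes a constant and a linear time trend, which may differ from the authors' exact specification, and it uses a made-up series; ENSO_raw and ENSO_detrended are illustrative names.

```julia
n = 64
t = collect(1.0:n)

# Made-up series with a linear trend plus an oscillation
ENSO_raw = 0.02 .* t .+ sin.(0.9 .* t)

# Regress on a constant and a time trend, then keep the residuals
Z = [ones(n) t]
trend_coef = Z \ ENSO_raw
ENSO_detrended = ENSO_raw .- Z * trend_coef
```

By construction, the residuals sum to zero and are uncorrelated with the time trend.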
Given the variability of the LULCC measurements at the beginning of the series, we consider a recent subsample of the data. We consider the data from 1992 and estimate the airborne fraction using the new approach.
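Restricting the sample amounts to filtering the series by year before running the same regression. The sketch below uses synthetic data; the year range 1959 to 2022 and the choice to include 1992 itself are assumptions made for illustration.

```julia
using LinearAlgebra

# Synthetic yearly data; the year range is an assumption
year = collect(1959:2022)
E = collect(range(4.0, 11.0; length = length(year)))
G = 0.45 .* E            # noise-free, so the estimate equals 0.45 up to rounding

# Keep observations from 1992 onwards
keep = year .>= 1992
Es = E[keep]
Gs = G[keep]

# Same OLS estimator as before, applied to the subsample
α_sub = (Es'Es) \ (Es'Gs)
```

The same mask can be applied to any covariates so that all series stay aligned.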
We can also test the new approach on other datasets. We can use the same methodology to estimate the airborne fraction. The preferred data for the LULCC emissions are from the Global Carbon Project (Friedlingstein et al. 2023). However, we can also use data from Houghton and Castanho (2022) and Marle et al. (2022).
We can also estimate the airborne fraction using Deming regression (Deming 1943). Deming regression is a method for estimating the parameters of a linear regression model when both the dependent and independent variables are subject to measurement error. The Deming regression assumes that the measurement errors in the dependent and independent variables are normally distributed with known variances.
The estimator depends on \(\delta = \frac{Variance(\omega_{G,t})}{Variance(\eta_{E,t})}\), the ratio of the variance of the measurement error in the coverage to the variance of the measurement error in emissions.
Several values for \(\delta\) are tried, given that the true value is unknown.
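A minimal sketch of a Deming slope for several values of \(\delta\) is given below. It assumes a regression through the origin, so the textbook Deming formula is applied to uncentered second moments; this is an assumption, and the specification used in the paper may include an intercept. The helper deming_slope is hypothetical and the data are synthetic.

```julia
# Hypothetical helper: Deming slope for a line through the origin.
# δ is the ratio of the error variance in G to the error variance in E.
function deming_slope(E, G, δ)
    sEE = sum(abs2, E)
    sGG = sum(abs2, G)
    sEG = sum(E .* G)
    d = sGG - δ * sEE
    return (d + sqrt(d^2 + 4 * δ * sEG^2)) / (2 * sEG)
end

# Synthetic data with a deterministic disturbance in G
n = 64
E = collect(range(4.0, 11.0; length = n))
G = 0.45 .* E .+ 0.5 .* sin.(1:n)

# Try several values of δ, since the true ratio is unknown
deltas = (0.2, 1.0, 5.0)
slopes = [deming_slope(E, G, δ) for δ in deltas]

# As δ grows, all error is attributed to G and the slope approaches OLS
ols = sum(E .* G) / sum(abs2, E)
slope_large_δ = deming_slope(E, G, 1e6)
```

Comparing the slopes across \(\delta\) shows how sensitive the estimate is to the assumed error-variance ratio.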
Other drawbacks of the Deming regression are that there is no closed-form expression for the standard errors and that it cannot easily handle additional regressors like those in the preferred specification. We solve these issues in the paper: On measurements errors in CO\(_2\) airborne fraction estimates by J. Eduardo Vera-Valdés and Charisios Grivas.
Note that Bennedsen, Hillebrand, and Koopman (2024) considered heteroscedasticity- and autocorrelation-robust standard errors. Nonetheless, their selected bandwidth is quite small, so these are almost identical to the OLS standard errors. We report the latter here for simplicity.
References
Bennedsen, Mikkel, Eric Hillebrand, and Siem Jan Koopman. 2024. “A Regression-Based Approach to the CO2 Airborne Fraction.” Nature Communications 15 (1): 8507. https://doi.org/10.1038/s41467-024-52728-1.
Deming, William Edwards. 1943. Statistical Adjustment of Data. Wiley.
Dickey, David A., and Wayne A. Fuller. 1979. “Distribution of the Estimators for Autoregressive Time Series with a Unit Root.” Journal of the American Statistical Association 74 (366a): 427–31.
Engle, Robert F., and C. W. J. Granger. 1987. “Co-Integration and Error Correction: Representation, Estimation, and Testing.” Econometrica 55 (2): 251–76. https://www.jstor.org/stable/1913236.
Friedlingstein, Pierre, Michael O’Sullivan, Matthew W. Jones, Robbie M. Andrew, Dorothee C. E. Bakker, Judith Hauck, Peter Landschützer, et al. 2023. “Global Carbon Budget 2023.” Earth System Science Data 15 (12): 5301–69.
Houghton, Richard A., and Andrea Castanho. 2022. “Annual Emissions of Carbon from Land Use, Land-Use Change, and Forestry 1850–2020.” Earth System Science Data Discussions 2022: 1–36.
MacKinnon, James G. 2010. “Critical Values for Cointegration Tests.” Queen’s Economics Department Working Paper.
Marle, Margreet J. E. van, Dave van Wees, Richard A. Houghton, Robert D. Field, Jan Verbesselt, and Guido R. van der Werf. 2022. “RETRACTED ARTICLE: New Land-Use-Change Emissions Indicate a Declining CO2 Airborne Fraction.” Nature 603 (7901): 450–54.