
Do High Emitting Buildings Report Emissions Less Often?

Analyzing Patterns in Greenhouse Gas Emissions Reporting


Analysis by Colton Lapp
Assistance and review by Viktor Köves

Many buildings in the Chicago building benchmarking data don't report their emissions. Is there a pattern to which buildings fail to report? Our team has noticed that some high-emissions buildings stop reporting, while more efficient buildings tend to keep reporting year after year. In this post, we investigate whether there is a broader pattern at play: are a building's emission levels linked to its reporting compliance?

Analyzing the Emissions Data

To understand if there was a link between reporting compliance and emissions, we analyzed Chicago's building benchmarking data. As a first step, we wanted to investigate the overall landscape of emissions reporting. We started by computing some general statistics about emissions reporting.

Historical Patterns of Non-Reporting

The graph below depicts the count of buildings that did and did not report emissions data each year.

Several things can be seen from the time series analysis above.

  • The set of buildings subject to reporting requirements grew from a few hundred to a few thousand during the program's ramp-up period from 2014 to 2016.
  • There was a sharp drop in emissions reporting during the COVID-19 pandemic. Buildings report emissions for the previous year in the spring, so emissions for 2019 were not reported in the spring of 2020.
  • Outside of the COVID-19 disruption, between a quarter and a third of buildings in the benchmarking database have failed to report each year since 2018.
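A minimal sketch of how yearly counts like these could be tallied. The record layout (building ID, data year, and a flag for whether emissions were reported) and the sample rows are assumptions made for illustration, not the actual schema of the benchmarking data.

```python
from collections import Counter

# Hypothetical records: (building_id, data_year, reported_emissions)
records = [
    ("A", 2018, True),
    ("B", 2018, False),
    ("C", 2018, True),
    ("A", 2019, False),
    ("B", 2019, False),
    ("C", 2019, True),
]


def reporting_counts_by_year(rows):
    """Count reporting and non-reporting buildings for each data year."""
    counts = Counter((year, reported) for _, year, reported in rows)
    summary = {}
    for year in sorted({year for _, year, _ in rows}):
        reported = counts[(year, True)]
        not_reported = counts[(year, False)]
        summary[year] = {
            "reported": reported,
            "not_reported": not_reported,
            # Share of buildings in the data that did not report this year
            "share_not_reported": not_reported / (reported + not_reported),
        }
    return summary


summary = reporting_counts_by_year(records)
for year, stats in summary.items():
    print(year, stats)
```

The share of non-reporting buildings per year is the number that the "quarter to a third" observation refers to.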

Understanding Emissions Intensities

Our goal was to understand if 'bad' levels of emissions are related to buildings failing to report in subsequent years. To answer this, we need to choose a metric to measure what constitutes 'bad' levels of emissions. Our preferred metric is Greenhouse Gas Intensity (GHG Intensity), the amount of greenhouse gas released per square foot. Buildings with GHG Intensity levels that are abnormally high or trending higher may strategically choose not to report data the next year.
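As a rough illustration, GHG Intensity is a simple ratio of total emissions to floor area. The function name, argument names, and the handling of missing values below are assumptions for this sketch; the units of the result are whatever units the emissions figure is in, per square foot.

```python
def ghg_intensity(total_emissions, gross_floor_area_sqft):
    """Return emissions per square foot, or None if it cannot be computed."""
    # Missing emissions or a missing/zero/negative floor area gives no result
    if total_emissions is None or not gross_floor_area_sqft:
        return None
    if gross_floor_area_sqft <= 0:
        return None
    return total_emissions / gross_floor_area_sqft


# Hypothetical building: 500,000 units of emissions over 100,000 square feet
example = ghg_intensity(500_000, 100_000)
print(example)
```

Dividing by floor area is what makes a small, inefficient building comparable to a large, efficient one.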

What are normal emissions intensities?

As can be seen above, most buildings have GHG intensities between 0 and 20, but some outliers have GHG intensities well above 200.

Analysis: Do levels/trends in emissions predict reporting compliance?

To understand whether or not emissions patterns were related to reporting compliance, we looked at two different emissions characteristics:

  1. The GHG Intensity level a year prior
  2. The change in GHG intensity levels between last year and two years ago

For both of these values, we were curious to know:

  1. Does last year's GHG intensity level predict this year's reporting compliance?
  2. Does the change in GHG intensity in the last two years predict this year's reporting compliance?
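The two characteristics above can be sketched as lagged features for each building. The input layout (a dictionary of GHG intensity by year, with None for a year without a report) and all names are assumptions for illustration.

```python
# Hypothetical input: GHG intensity by building and year.
# None marks a year in which the building did not report.
histories = {
    "A": {2020: 10.0, 2021: 12.5, 2022: 11.0},
    "B": {2020: 8.0, 2021: 7.5, 2022: None},
    "C": {2020: None, 2021: 6.0, 2022: 6.5},
}


def lagged_features(history, year):
    """Return (level last year, change over the prior two years), or None."""
    last_year = history.get(year - 1)
    two_years_ago = history.get(year - 2)
    if last_year is None or two_years_ago is None:
        return None  # both prior years are needed to compute the change
    return last_year, last_year - two_years_ago


def build_rows(histories, year):
    """One row per building with both prior years of data available."""
    rows = []
    for building_id, history in histories.items():
        features = lagged_features(history, year)
        if features is None:
            continue
        level, change = features
        rows.append({
            "building_id": building_id,
            "ghg_intensity_last_year": level,
            "change_in_ghg_intensity": change,
            "did_not_report": history.get(year) is None,
        })
    return rows


rows = build_rows(histories, 2022)
for row in rows:
    print(row)
```

Building C is dropped here because its intensity from two years earlier is unknown, so no change can be computed for it.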

We computed the mean and median of the level and trend of GHG intensity for both groups of buildings: Reporting and Non-Reporting. In the graphs below, we show these statistics only for the most recent year of data. When we expand the analysis to all years of data, the findings do not meaningfully change. As can be seen, there are no meaningful differences in emission characteristics between the two groups:
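A small sketch of this kind of group comparison, using Python's standard library. The row layout, field names, and sample values are made up for illustration and are not drawn from the benchmarking data.

```python
from statistics import mean, median

# Hypothetical rows: last year's GHG intensity and this year's reporting status
rows = [
    {"did_not_report": False, "ghg_intensity_last_year": 6.0},
    {"did_not_report": False, "ghg_intensity_last_year": 8.0},
    {"did_not_report": False, "ghg_intensity_last_year": 13.0},
    {"did_not_report": True, "ghg_intensity_last_year": 7.0},
    {"did_not_report": True, "ghg_intensity_last_year": 11.0},
]


def summarize_by_group(rows, value_key):
    """Mean, median, and count of value_key for each reporting group."""
    groups = {"Reporting": [], "Non-Reporting": []}
    for row in rows:
        label = "Non-Reporting" if row["did_not_report"] else "Reporting"
        groups[label].append(row[value_key])
    return {
        label: {"mean": mean(values), "median": median(values), "n": len(values)}
        for label, values in groups.items()
        if values  # skip a group with no observations
    }


group_summary = summarize_by_group(rows, "ghg_intensity_last_year")
for label, stats in group_summary.items():
    print(label, stats)
```

The same function can be pointed at the change in GHG intensity instead of the level by passing a different key.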

Note: Graphs above show only most recent year of data. Buildings that didn't report data in both years are not represented in these scatter plots because we don't know what their emission levels were in the previous year. To see additional exploration of the data and graphs looking at different observation sets, refer to the linked Jupyter notebook.

Results: No meaningful difference between groups

As both graphs above show, there is no clear relationship between compliance and either emission intensities or emission trends. Whether or not a building reports its emissions data appears unrelated to its emissions profile in the previous year.

Investigating Further

Although the graphs above suggest that the average values between compliant and non-compliant buildings are similar, several factors could be driving this finding. These considerations are grouped into three key areas:

1. Impact of COVID-19 on Reporting Rates

Most instances of non-reporting coincided with the COVID-19 drop in reporting rates, so this one-time disruption could be dominating the comparison.

2. Influence of Outliers

While most buildings have low emissions intensities and minimal changes, several outliers are present. Outliers in GHG intensity values distort mean values for groups (although median values should not be as affected).
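A toy example of why a single extreme value matters more for the mean than for the median. The numbers are invented for illustration.

```python
from statistics import mean, median

typical = [5, 6, 7, 8]          # made-up, typical GHG intensities
with_outlier = typical + [250]  # the same group plus one extreme building

mean_typical, median_typical = mean(typical), median(typical)
mean_outlier, median_outlier = mean(with_outlier), median(with_outlier)

print("without outlier:", mean_typical, median_typical)
print("with outlier:   ", mean_outlier, median_outlier)
```

One extreme building moves the mean from 6.5 to 55.2, while the median only moves from 6.5 to 7.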

3. Omitted Predictive Characteristics

Emissions intensity and trends are not the only factors that might predict compliance with reporting laws. The building type, for example, likely matters a great deal when it comes to GHG intensity (e.g. data centers, gyms and aquariums may have higher energy needs and longer operating hours). Some of these characteristics might also correlate with reporting compliance in a way that could be obscuring underlying trends.

Additional Analysis

To check whether the first two factors (COVID-19 and outliers) were driving our results, we ran a further analysis that excluded both sets of data points and recalculated the mean and median values across groups. This simple descriptive analysis still found no meaningful difference between groups, which suggests that neither factor was driving our results.
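One way such an exclusion step could look is sketched below. The post does not say how outliers were defined, so the 1.5 × IQR rule here is an assumption, as are the field names and the choice of 2019 as the year to exclude (following the earlier remark that 2019 emissions went unreported in spring 2020).

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Lower and upper fences at k times the interquartile range."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def filter_rows(rows, excluded_years, value_key="ghg_intensity_last_year"):
    """Drop rows from excluded years and rows outside the IQR fences."""
    low, high = iqr_bounds([row[value_key] for row in rows])
    return [
        row for row in rows
        if row["year"] not in excluded_years and low <= row[value_key] <= high
    ]


# Hypothetical rows: one extreme value (300) and one row from the excluded year
values = [4, 5, 6, 7, 8, 9, 10, 300]
years = [2018, 2018, 2019, 2021, 2021, 2022, 2022, 2022]
rows = [
    {"year": y, "ghg_intensity_last_year": v} for y, v in zip(years, values)
]

kept = filter_rows(rows, excluded_years={2019})
kept_values = [row["ghg_intensity_last_year"] for row in kept]
print(kept_values)
```

Group means and medians can then be recomputed on the filtered rows and compared against the unfiltered results.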

If COVID-19 and outliers don't seem to be affecting our results, what about omitted predictive characteristics? Is it possible that there are strong connections between certain building characteristics (age of building, type of building, etc) and reporting rates? One building characteristic, for example, that likely affects reporting and emissions is the building type (office, school, etc). A quick visualization of reporting rates by building type in fact does show that there are some patterns across groups:

Variance in Reporting Across Building Type

Note: This graph only shows building categories with 100 or more observations. There are over 50 different building types, but most have under a dozen buildings in the data. To see this graph with all the different building types, please view the full analysis in the linked Jupyter notebook.
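A sketch of how reporting rates by building type could be computed with a minimum-observation cutoff like the one described in the note. The input layout and names are assumptions, and the example uses a cutoff of 3 instead of 100 so the toy data stays small.

```python
from collections import defaultdict


def reporting_rate_by_type(rows, min_observations=100):
    """Share of observations that reported, per sufficiently common type."""
    totals = defaultdict(int)
    reported = defaultdict(int)
    for building_type, did_report in rows:
        totals[building_type] += 1
        reported[building_type] += int(did_report)
    return {
        building_type: reported[building_type] / totals[building_type]
        for building_type in totals
        if totals[building_type] >= min_observations
    }


# Hypothetical observations: (building type, whether the building reported)
observations = [
    ("Office", True), ("Office", True), ("Office", False),
    ("K-12 School", True), ("K-12 School", False), ("K-12 School", False),
    ("Aquarium", True),
]

rates = reporting_rate_by_type(observations, min_observations=3)
print(rates)
```

Rare categories are dropped because a rate computed from a handful of buildings is too noisy to compare.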

In the graph above, it can be seen that two of the most common building categories, "K-12 Schools" and "Multifamily Housing", have very different reporting rates (27% and 17%, respectively). Is it possible that these trends, as well as their intersection with other patterns (like the drop off in COVID-19 reporting), could be obscuring a relationship between emission levels and reporting compliance?

Regression Analysis

With these considerations in mind, we wanted to test the possibility that external factors might be hiding an underlying connection between emissions and reporting. To test this, we decided to run a linear regression analysis. Linear regression is a statistical technique that estimates the isolated effect of a variable on a given outcome while controlling for external factors. In our case, we are trying to rule out the possibility that some building characteristics (e.g. square footage, age of building, building type) and/or time trends could be obscuring a hidden link between emissions and reporting.
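To make "controlling for external factors" concrete, here is a toy least-squares example written with only the standard library. It is not the model from this analysis: the data is invented so that building type fully explains non-reporting, and the solver is a bare-bones illustration rather than what a statistics package would use.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Invented data: type B buildings have higher intensity AND never report
intensity = [1, 2, 3, 11, 12, 13]
is_type_b = [0, 0, 0, 1, 1, 1]
did_not_report = [0, 0, 0, 1, 1, 1]

# Without controls, intensity appears to predict non-reporting
naive = ols([[1, x] for x in intensity], did_not_report)
# With a building-type dummy, the intensity coefficient drops to about zero
controlled = ols([[1, x, d] for x, d in zip(intensity, is_type_b)], did_not_report)

print("naive slope on intensity:", naive[1])
print("controlled coefficients:", controlled)
```

In this toy case the apparent effect of intensity disappears once building type is included; the same mechanism can also work in reverse and hide a real effect, which is the possibility the regression is meant to rule out.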

Specifically, we fit a linear probability model where our outcome of interest is equal to 1 if a building failed to report in a given year. The analysis includes buildings that reported complete emissions data in the past two years, regardless of whether they reported in the current year. Thus, buildings that consistently fail to report are excluded, as we can't determine how their emissions affect reporting compliance if we don't have data on their emissions. Many buildings that consistently report appear multiple times in the dataset for different time periods.
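In symbols, a linear probability model of this kind can be written roughly as follows. The notation is illustrative and not taken from the original analysis: i indexes buildings, t indexes years, GHGI is GHG intensity, γ stands for the building type dummy variables, and δ stands for the year dummy variables.

```latex
\text{DidntReport}_{i,t} = \beta_0
  + \beta_1\,\text{GHGI}_{i,t-1}
  + \beta_2\,\bigl(\text{GHGI}_{i,t-1} - \text{GHGI}_{i,t-2}\bigr)
  + \beta_3\,\text{FloorArea}_{i}
  + \beta_4\,\text{YearBuilt}_{i}
  + \gamma_{\text{type}(i)} + \delta_{t} + \varepsilon_{i,t}
```

Because the outcome is 0 or 1, each coefficient can be read as a change in the probability of not reporting associated with a one-unit change in that variable, holding the others fixed.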

In the regression model, we controlled for building characteristics that we observe as well as the time period the data was collected in (to account for trends like COVID-19). The estimates of that model are reported below:

Regression Results: Linear Probability Model

| Variable | Coefficient | P-Value | Confidence Interval (0.025, 0.975) |
|---|---|---|---|
| Constant | 0.2722 | 0.082 | [-0.035, 0.579] |
| GHG Intensity (Last Year) | 0.0004 | 0.064 | [-0.000, 0.001] |
| Change in GHG Intensity (Last Two Years) | -0.0000 | 0.947 | [-0.000, 0.000] |
| Gross Floor Area (Millions) | -0.0135 | 0.085 | [-0.029, 0.002] |
| Year Building Was Built | -0.0001 | 0.103 | [-0.000, 0.000] |
| Building Type Dummy Variables | Included | | |
| Year Dummy Variables | Included | | |

Dependent Variable: "Didn't Report"
"Building Type" Categorical Dummy Variables: Included
"Year" Dummy Variables: Included
Number of Observations: 10588
Number of Observations Dropped in Regression Due to Missing Independent Variables: 14303 (57.46%)
Share of Observations Where Outcome Was "Didn't Report": 12.22%
R-Squared: 0.209
Adjusted R-Squared: 0.204
Covariance Type: nonrobust

As can be seen from the regression analysis above, there doesn't seem to be any meaningful relationship between the emission levels or trends of a building and whether or not it reports data. The coefficient estimates for both of our emission-related variables (GHG Intensity Last Year and Change in GHG Intensity) are essentially zero, and neither is statistically significant at the 5% level (p = 0.064 and p = 0.947). In fact, most of our variables don't seem to have a very strong connection to reporting rates. The only variables that seem to explain some of the reporting patterns are those related to time (i.e. the COVID-19 data disruption) and those related to building type (a couple of building categories have higher or lower reporting rates than average).
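To put "essentially zero" in concrete terms, the short calculation below reads the GHG intensity coefficient from the table as a change in probability. The 10-unit increase is an arbitrary example chosen for illustration; the coefficient and the 12.22% baseline come from the regression output above.

```python
# Values taken from the regression table above
COEF_GHG_INTENSITY_LAST_YEAR = 0.0004
BASELINE_SHARE_NOT_REPORTING = 0.1222


def change_in_probability(coefficient, delta_x):
    """In a linear probability model, the effect scales linearly with delta_x."""
    return coefficient * delta_x


# Hypothetical scenario: GHG intensity last year was 10 units higher
effect = change_in_probability(COEF_GHG_INTENSITY_LAST_YEAR, 10)
effect_in_points = effect * 100

print(f"Change in probability of not reporting: {effect_in_points:.1f} percentage points")
print(f"Baseline share not reporting: {BASELINE_SHARE_NOT_REPORTING * 100:.2f}%")
```

Even a 10-unit jump in GHG intensity, which is large given that most buildings fall between 0 and 20, is associated with only about a 0.4 percentage point change against a 12.22% baseline.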

Conclusion: No pattern found

After looking at summary statistics and running regression analysis, it seems like there is no meaningful relationship between emission patterns of buildings and their reporting compliance. This contradicts our initial hypothesis that some buildings might withhold data to conceal poor emissions performance.

This raises a key question: why do buildings choose to report or not? At Electrify Chicago, we’ve noticed that few politicians and policymakers are even aware of the existence of the Chicago Building Benchmarking Data. So, maybe it's not surprising we don't detect a pattern between emissions levels and reporting compliance. After all, why would buildings try to hide their emission levels if no one is paying attention?

Electrify Chicago aims to change this reality by making the building benchmarking data more accessible. Greater accessibility means greater scrutiny—if more people are aware of the data, buildings hopefully will not only be more likely to report but also more motivated to transition to renewable energy and cut emissions.

Explore our Analysis

Interested in diving deeper into our analysis? You can view the code and interactive data exploration in the linked Jupyter Notebook.

Questions?

Contact the lead developer on this site, Viktor Köves, by emailing contact@viktorkoves.com