How Will the States and the Private Sector Respond to the Federal Environmental Regulatory Retreat?
Today, the New York Times reports that the Trump Administration will eliminate the Environmental Protection Agency’s internal Research Group. As the Federal Government invests fewer resources in monitoring pollution and in conducting research with the resulting Big Data, will pollution levels rise in the United States? Will the population’s exposure to pollution increase? An alternative hypothesis is that, as the Federal Government retreats, those states that prioritize environmental protection will increase their investments to monitor pollution and to inform the local populace.
In this Substack, I present an economic analysis of the consequences of the Federal Government’s decentralization of environmental protection. I argue that in this Big Data age, data collection plays a central role in helping private and public decision makers keep up to date on the emerging challenges we face.
I wrote this piece by first recording my thoughts on Zoom. I then fed my transcript to Grok 4, which rewrote my piece into a logical outline. I then edited the Grok 4 output.
Introduction
The federal government's disinvestment in data collection and in-house analysis, exemplified by the phasing out of the EPA's research group under the Trump administration, signals a shift toward decentralizing these responsibilities to states. Historically, data collection was centralized because it was costly. The Federal Government achieved scale economies in collecting data, and it openly shared these data, first at libraries and later on agency web pages. This created a level playing field, allowing citizens, businesses, and researchers to download and analyze data freely.
However, technological democratization (cheaper satellites, sensors, and personal devices) has enabled broader collection, prompting debates on optimal government spending. Decentralization promises efficiency but risks fragmentation: states may adopt different monitoring plans, producing data that cannot be compared across states.
The Benefits of Federal Centralization in Data Collection and Analysis
Centralized federal data collection solves coordination issues by standardizing units, measurements, and monitoring choices, ensuring clarity on where and when data are gathered. This fosters comparability across jurisdictions. Without it, researchers risk comparing "apples and oranges." For instance, if Colorado, California, and Oklahoma use different air pollution monitoring plans, a naive analysis that treats each state's monitors as a representative sample could yield statistical errors.
As we learned in 2020 during the COVID-19 crisis, society gains from having access to real-time national data with geographic accuracy on infection rates over time. During COVID, debates over test validity and fears of statistical type I and type II errors delayed responses; imperfect but quick data could have allowed firms, households, and governments to adapt their behavior to the virus more cost-effectively. Such continuous, comparable data would have also facilitated our real-time understanding of what progress we were making in “bending the curve”.
At Federal agencies, in-house researchers guide collection, clean data, flag errors, and adapt surveys amid economic changes under budget constraints. Their statistical training uncovers hidden facts through counterfactual thinking. President Trump's EPA cuts, which reduce staffing by nearly 23% and eliminate research offices, erode the Federal capacity for data collection and hypothesis testing.
Will the States and the Private Sector Offset the Federal Retreat?
Economists often discuss the “crowding out” hypothesis. Consider the example of the U.S. government introducing Social Security in 1935 and sending money to senior citizens. The crowding out hypothesis posits that if the Federal Government saves more on people’s behalf, then young people will save less of their own earnings because they anticipate that the Federal Government is already saving for them. This logic implies that a booster of Social Security overestimates how much the program contributes to the retirement consumption of older people, because these same people saved less when they were young in the expectation that they would receive Social Security payments when old.
In a similar sense, as the Federal Government retreats from certain regulatory functions, will the States and the Private sector step up their efforts? So, this is the “crowding out” hypothesis in reverse!
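The crowding out logic, and its reverse, comes down to one line of arithmetic. Here is a minimal sketch in Python. The function name `net_change`, the `offset` parameter, and the numbers are all hypothetical and chosen only for illustration; nothing here is an estimate of the actual offset for Social Security or for environmental monitoring.

```python
def net_change(public_change: float, offset: float) -> float:
    """Net change in the total once private actors respond.

    public_change: change in the government's contribution (negative = retreat).
    offset: assumed share of that change undone by private behavior,
            from 0.0 (no response) to 1.0 (full one-for-one offset).
    """
    if not 0.0 <= offset <= 1.0:
        raise ValueError("offset must be between 0 and 1")
    return public_change * (1.0 - offset)


# Crowding out: government adds 100 of retirement saving and,
# by assumption, households cut their own saving by 25.
print(net_change(100.0, 0.25))   # 75.0

# Reverse case: federal monitoring effort falls by 100 and,
# by assumption, states and firms replace 25 of it.
print(net_change(-100.0, 0.25))  # -75.0
```

The open empirical question is the size of `offset`. If it is close to 1, the Federal retreat changes little; if it is close to 0, the full loss in monitoring effort shows up in the total.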
How Will Different States Respond to the Rise of Decentralization?
Going forward, States will be “free to choose” how they regulate air and water within their borders, how they monitor pollution, and what they do with the data they collect. Will cross-state differences emerge? Richer, more educated states like California or Massachusetts may invest in robust environmental protection, yielding superior data, while poorer ones like Mississippi or Alabama may choose to engage in less regulation, data collection, and data analysis.
Environmentalists will grow concerned that a consequence of this decentralization will be the rise of "domestic pollution havens," attracting dirty activities like toxic waste siting to lightly regulated areas. If there are locations within the United States where it is common knowledge that pollution is not measured, will polluting activity be attracted to operate there “in the shadows”?
Environmentalists will also point out that many environmental issues cross state boundaries. Consider the management of the Colorado River or the direction that smoke from a coal-fired power plant in Ohio takes. Where such cross-boundary spillovers occur, regulators in the state that receives the pollution will seek to regulate the upstream polluters, but those polluters will often be located in another state. A more recent example is when wildfires burn in the Pacific West and in Canada and the smoke travels downwind to cities such as Chicago.
An optimist would counter that environmental liability law can be used to address these cases. In the case of downwind PM2.5 caused by wildfires, the victims are spread out and face transaction costs to work together to launch a class action lawsuit. The free rider problem predicts that they would have trouble coordinating.
In cases where the victims of pollution exposure are members of the same local community and they can identify the polluter, the enforcement of the law would provide the polluter with an incentive to reduce its pollution. Here, the clear existence of property rights and the ability to use the legal system provide a strategy for deterring pollution even if the Federal regulators retreat.
A pessimist would push back here and argue that latency effects (how long it takes for pollution exposure to cause health damage) and the complexity of how one’s health depends on environmental factors versus personal choices over nutrition and smoking cloud the ability of the courts to hold firms accountable for their pollution.
Big Data Statistical Challenges Created By Regulatory Decentralization
Decentralization heightens risks of selection bias and non-random sampling. Economists worry that varying state criteria (e.g., Republican areas monitoring less in polluted spots versus progressive ones monitoring every two blocks) would yield skewed samples, so that an “apples and oranges” issue would arise both in comparing pollution trends across different parts of the United States at a point in time and in studying the same area over time. For example, suppose that California’s local EPA prioritized environmental justice and placed monitors in poor neighborhoods while a rural state placed just a few monitors in rural places. The statistical facts learned from the data generated by these different monitors would not yield valid cross-area comparisons.
Researchers using the Big Data generated from these non-random samples would have trouble testing hypotheses. This would slow down our scientific understanding of reality. Here an economist asks: “Who gains from slowing down the scientific method? How much do the losers lose?” An optimist would counter that if these data are truly valuable, the private sector will eventually step in, collect them, and design cheaper ways to do so, and the public can use its own funds to access them. This is an example of how society must choose between privately provided services and publicly provided services.
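To see how much monitor placement alone can distort a comparison, here is a small simulation sketch in Python. Everything in it is hypothetical: two made-up states with the same underlying pollution distribution (the mean of 10 and spread of 3 are arbitrary), one that places its 50 monitors at its most polluted sites and one that places them at random. The function names are illustrative and do not describe any agency's actual siting rule.

```python
import random
import statistics


def simulate_state(n_sites: int, rng: random.Random) -> list[float]:
    """Hypothetical true pollution level at every site in a state."""
    return [rng.gauss(10.0, 3.0) for _ in range(n_sites)]


def targeted_monitors(levels: list[float], k: int) -> list[float]:
    """Place k monitors at the k most polluted sites."""
    return sorted(levels, reverse=True)[:k]


def random_monitors(levels: list[float], k: int, rng: random.Random) -> list[float]:
    """Place k monitors at sites chosen uniformly at random."""
    return rng.sample(levels, k)


rng = random.Random(0)  # fixed seed so the run is repeatable
state_a = simulate_state(1000, rng)  # will use targeted placement
state_b = simulate_state(1000, rng)  # will use random placement

# The gap a researcher would see with a monitor at every site.
true_gap = statistics.mean(state_a) - statistics.mean(state_b)

# The gap a researcher actually sees from each state's 50 monitors.
measured_gap = (
    statistics.mean(targeted_monitors(state_a, 50))
    - statistics.mean(random_monitors(state_b, 50, rng))
)

print(f"true gap between states:     {true_gap:.2f}")
print(f"measured gap from monitors:  {measured_gap:.2f}")
```

By construction the two states are equally polluted, so the true gap is close to zero, yet the measured gap is large and positive. A researcher who treated both sets of monitors as representative would wrongly conclude that the first state is far dirtier.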
More Prospective Analysis of Data Collection Decentralization
In recent years, has the U.S. Federal Government collected too much information about individuals and firms? Do such entities have the right to privacy?
A property rights issue arises. Who owns your data? What data must you share in the commons with statistical analysts? Do you have the right to opt in to decide what data related to you becomes part of the public domain?
I can imagine a compact between progressive Blue States where they would agree on data protocols for collecting standardized data on a variety of outcomes for people and firms and local environmental performance. This approach would offer a spatially segmented overview about quality of life such that researchers would know much more about quality of life dynamics in Blue States than in Red States.
A counterargument that Governors in Red States would make is that they would now have the freedom to prioritize the issues they focus on. If a governor deeply cares about disaster resilience and K-12 educational quality, then more state resources might be devoted to collecting detailed data on these issues.
State governments do not have a monopoly on collecting data. In recent years, there has been an explosion in academic and private sector data collection. Many firms have their own Internet platforms and can use their platforms to collect data and to run their own A/B tests. Of course, these firms are focused on their own profits.
Firms that credibly demonstrate that they have collected data for a representative sample of Americans or have sampled pollution from a representative sample of locations could sell licenses to use these data. This approach will raise the upfront costs of doing research but will provide an incentive for the private sector to engage in data collection, data creation and data innovation.
Conclusion
President Trump's Administration is launching a large-scale experiment in governance. At any point in time, what data must be collected to have a real-time understanding of our economy and emerging quality of life challenges? Who has an edge in collecting these data? How should these suppliers be compensated? What are the gains and costs of decentralizing data collection and analysis from the Federal government to the state governments and the Private Sector?
I wrote this Substack on July 19th, 2025. If I write a sequel to this Substack on July 19th, 2050, will I point to this day as a turning point in environmental governance? Could the environmental performance of the United States actually improve due to this decentralization effort?