Capturing the Digital Footprint of a Day in One's Life
A Mildly Interesting Story About My 2017 Rejected NSF Proposal
Back in the late 2010s, the U.S. National Science Foundation announced a creative call for proposals:
A DIRECT QUOTE
“The NSF 2026 Idea Machine encouraged individuals from all walks of life, age 14 or older, to submit pressing “grand challenges” requiring fundamental research in science, engineering, or STEM education in order to inform NSF’s long-term planning. Approximately 800 entries were received from nearly every state in the U.S. and from established researchers, undergraduate and graduate students, teachers on behalf of their classes, and high school and middle school students. The submitted entries went through five selection stages, including a public comment phase. A blue-ribbon panel of 12 eminent, broad thinkers recommended seven ideas for the final prizes that were found to be exciting, ambitious, creative, and highly interdisciplinary.”
My proposal to this call was rejected. In this Substack, I want to sketch out my core idea and see if this interests you. Let me say upfront that I am aware that it is a bit creepy and that the Hawthorne Effect lurks.
Back in 2017, I proposed using a randomized payment device to recruit a representative sample of American adults who would opt in to having all of their Internet interactions recorded, linked to the same person, and tracked at the individual level for a few months.
So, consider Matthew Kahn. The following information would be collected:
The date, the hour, and Matthew’s geographic location (based on his cell phone). His Google searches and his social media postings on X, Facebook, etc. His Amazon searches and purchases. His text messages (both the contents and his network of connections). And in 2026, now that we have AI, his Grok and ChatGPT interactions.
While social scientists have time diaries, we do not have comprehensive, high-frequency data on how the same person interacts with the digital world over the course of days that stretch into months.
I may have also proposed that each participant be fitted with a Fitbit to measure sleep and physical activity.
As shocks play out (COVID in 2020, the LA fires of 2025, economic recessions, natural disasters), this sample of people could be studied to see how they adapt to the challenges they face. On polluted or hot days, what steps do people take to have a normal day?
How proactive are people in being aware of emerging risks and taking steps to protect themselves and their families? Who reveals themselves to be sluggish in the midst of an emerging challenge?
I thought that this “natural experiment” style research would inform public health and social workers in cities.
I thought that this new database would inform sociology research and economic research on social interactions.
By observing the same person’s time allocation and Internet interactions for 100 days or more, what could economists learn about the consistency of people’s choices, and what new tests of behavioral economics would become available? Now, as I think about this in April 2026, the rise of AI and the information it can vacuum up about a person is similar to what I was proposing back in 2017.
Of course, I understand that people have the right to privacy. Up front, I stated that this would be an opt-in design and that the randomized dollar payment for participating could be used to encourage more privacy-inclined people to opt in. Self-selection bias would need to be addressed in making inferences based on the sample of people who agreed to participate. There could be a Hawthorne effect, as some people might choose not to go to dirty websites while they know they are being watched.
In April 2026, people generate so much daily data that could be captured for research. Private companies such as Google have these data, but we have no mechanism to merge data at the person level across the various platforms to create a 360-degree view of a day in Matthew Kahn’s life. What do we not know about the modern economy and modern quality of life because we don’t collect these data?
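To make the person-level merge concrete, here is a minimal Python sketch. It rests on assumptions that were not part of the proposal: that each platform's records carry a shared, study-assigned person_id and a timestamp, and that the Event fields and the build_daily_timelines helper are hypothetical names chosen for illustration. The records are made up.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    person_id: str       # hypothetical study-assigned ID shared across platforms
    timestamp: datetime  # when the interaction happened
    source: str          # e.g. "search", "purchase", "location"
    detail: str          # free-text description of the interaction


def build_daily_timelines(*streams):
    """Merge per-platform event lists into one time-ordered list per person per day."""
    timelines = defaultdict(list)
    for stream in streams:
        for event in stream:
            timelines[(event.person_id, event.timestamp.date())].append(event)
    for events in timelines.values():
        events.sort(key=lambda e: e.timestamp)
    return dict(timelines)


# Made-up records for illustration only.
searches = [Event("p001", datetime(2026, 4, 1, 8, 15), "search", "air quality today")]
purchases = [Event("p001", datetime(2026, 4, 1, 9, 40), "purchase", "air purifier")]
locations = [
    Event("p001", datetime(2026, 4, 1, 7, 30), "location", "home"),
    Event("p001", datetime(2026, 4, 2, 12, 0), "location", "office"),
]

timelines = build_daily_timelines(searches, purchases, locations)
for (person, day), events in sorted(timelines.items()):
    print(person, day, [e.source for e in events])
```

The hard part in practice is the shared identifier, which is exactly what the platforms do not provide today. The sketch simply assumes it exists.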
A final example. In recent years, more people with blood sugar control issues are wearing continuous blood sugar monitoring devices. This could be another input source in an Internet dataset.
In a Nordic nation such as Sweden, would there be greater trust among the public to opt in to participate? If the State and firms could see the complete “picture” of our days, would this help to improve policies and/or the set of products offered? What are the social costs of having fragmented data about the populace?
Would Hayek say that the price system solves this information aggregation issue so the Kahn NSF proposal should definitely be rejected?
Or would improved estimates of conditional probabilities help our economy to be more efficient and innovative?


What stayed with me in this piece is that it is not really asking for more data. It is asking what becomes visible when a life is no longer broken into institutional fragments.
That is what makes the proposal so interesting, and also so unsettling. A time diary tells one story. A search history tells another. A phone location trail tells another. But once those streams are fused, the question is no longer only what people do. It becomes what kind of intelligibility a human life acquires when behavior, attention, movement, consumption, and response to stress are gathered into one continuous record. That is a very different kind of knowledge, closer to a portrait than to a dataset.
What gives the piece its real force is that it does not hide the moral discomfort. It seems to recognize that the promise of deeper social understanding is inseparable from a deeper risk: the more fully a life can be rendered legible, the easier it becomes to confuse visibility with understanding, and description with permission. In that sense, the deepest question may not be whether this would teach us more about modern life. It almost certainly would. It is whether we yet know how to live ethically with that kind of knowledge once it exists.
One concern: what if firms use the collected data for price discrimination and profit maximization rather than for the benefit of the public?