Scientists Work to Preserve Data from Trumpists
January 20, 2017
In another time, people like Donald Trump’s supporters burned the library at Alexandria.
At 11:59 a.m. Eastern, the official White House website had a lengthy information page about the threat of climate change and the steps the federal government had taken to fight it. At noon, the instant Donald Trump took office, the page was gone, along with any mention of climate change or global warming.
It’s customary for www.whitehouse.gov to flip over to the new administration exactly at noon, but the only mention of climate on President Trump’s new website is under his “America First Energy Plan” page, in which he vows to destroy President Obama’s Climate Action Plan, which is a government-wide plan to reduce carbon emissions and address climate change.
“President Trump is committed to eliminating harmful and unnecessary policies such as the Climate Action Plan and the Waters of the U.S. rule,” the site says. A search of the website found no mention of “global warming,” and the only mentions of “climate change” were archived pages that, after clicking on the links, led to scrubbed pages.
At 10 a.m. the Saturday before inauguration day, on the sixth floor of the Van Pelt Library at the University of Pennsylvania, roughly 60 hackers, scientists, archivists, and librarians were hunched over laptops, drawing flow charts on whiteboards, and shouting opinions on computer scripts across the room. They had hundreds of government web pages and data sets to get through before the end of the day, all strategically chosen from the pages of the Environmental Protection Agency and the National Oceanic and Atmospheric Administration. Any of them, the volunteers felt, might be deleted, altered, or removed from the public domain by the incoming Trump administration.
Their undertaking, at the time, was purely speculative, based on the travails of Canadian government scientists under the Stephen Harper administration, which muzzled them on the subject of climate change. Researchers watched as Harper officials threw thousands of books of aquatic data into dumpsters as federal environmental research libraries closed.
But three days later, speculation became reality as news broke that the incoming Trump administration’s EPA transition team did indeed intend to remove some climate data from the agency’s website. That would include references to President Barack Obama’s June 2013 Climate Action Plan and the strategies for 2014 and 2015 to cut methane, according to an unnamed source who spoke with Inside EPA. “It’s entirely unsurprising,” said Bethany Wiggin, director of the environmental humanities program at Penn and one of the organizers of the data-rescuing event.
Back at the library, dozens of cups of coffee sat precariously close to electronics, and coders were passing around 32-gigabyte zip drives from the university bookshop like precious artifacts.
The group was split in two. One half was setting web crawlers upon NOAA web pages that could be easily copied and sent to the Internet Archive. The other was working their way through the harder-to-crack data sets—the ones that fuel pages like the EPA’s incredibly detailed interactive map of greenhouse gas emissions, zoomable down to each high-emitting factory and power plant. “In that case, you have to find a back door,” said Michelle Murphy, a technoscience scholar at the University of Toronto.
Murphy had traveled to Philly from Toronto, where another data-rescuing hackathon had taken place a month prior. Murphy brought with her a list of all the data sets that were too tough for the Toronto volunteers to crack before their event ended. “Part of the work is finding where the data set is downloadable—and then sometimes that data set is hooked up to many other data sets,” she said, making a tree-like motion with her hands.
But data, no matter how expertly it is harvested, isn’t useful divorced from its meaning. “It no longer has the beautiful context of being a website, it’s just a data set,” Allen says.
That’s where the librarians came in. In order to be used by future researchers—or possibly used to repopulate the data libraries of a future, more science-friendly administration—the data would have to be untainted by suspicions of meddling. So the data must be meticulously kept under a “secure chain of provenance.” In one corner of the room, volunteers were busy matching data to descriptors like which agency the data came from, when it was retrieved, and who was handling it. Later, they hope, scientists can properly input a finer explanation of what the data actually describes.
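For readers curious what such a record might look like in practice, here is a minimal sketch in Python. It is not the volunteers' actual tooling. The descriptor fields (agency, retrieval time, handler) come from the description above, while the `ProvenanceRecord` class, the `make_record` and `verify` helpers, the example URL, and the use of a SHA-256 checksum to detect later changes are all assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ProvenanceRecord:
    """Hypothetical descriptor attached to one rescued data set."""
    agency: str        # which agency the data came from
    source_url: str    # where it was downloaded (illustrative field)
    retrieved_at: str  # when it was retrieved, ISO 8601
    handler: str       # who was handling it
    sha256: str        # checksum of the bytes (an assumption, not from the article)


def make_record(data: bytes, agency: str, source_url: str, handler: str,
                retrieved_at: Optional[datetime] = None) -> ProvenanceRecord:
    """Build a record for the given bytes, stamping the current UTC time by default."""
    when = retrieved_at or datetime.now(timezone.utc)
    return ProvenanceRecord(
        agency=agency,
        source_url=source_url,
        retrieved_at=when.isoformat(),
        handler=handler,
        sha256=hashlib.sha256(data).hexdigest(),
    )


def verify(data: bytes, record: ProvenanceRecord) -> bool:
    """Return True if the bytes still match the checksum stored in the record."""
    return hashlib.sha256(data).hexdigest() == record.sha256


if __name__ == "__main__":
    payload = b"facility,co2e_tonnes\nExample Plant,12345\n"  # made-up sample data
    rec = make_record(payload, "EPA", "https://example.invalid/ghg.csv", "volunteer-1")
    print(json.dumps(asdict(rec), indent=2))
    print("unaltered:", verify(payload, rec))
```

Storing a checksum alongside the descriptors is one common way to let a later researcher confirm that a copy has not changed since it was retrieved. It does not by itself prove who made the copy or that the original download was faithful.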