Big Questions: Digital Preservation of Big Data in Government

Emily Larson

Big Data is becoming a key part of transactions and decision-making processes, and archivists are increasingly called to intervene in its management. This article examines the digital preservation needs of government Big Data from the perspective of archival theory. While Big Data presents unique challenges, particularly in the areas of record capture, access, and privacy, it is nonetheless becoming a core component of modern government recordkeeping. Managing both the technical and ethical aspects of Big Data is essential, and each requires specific consideration. Taking a systems-level view of Big Data by attempting to capture instances of bounded variability may be one path forward, and technical tools and systems can successfully manage such large volumes of information. Ultimately, however, as with all digital preservation initiatives, proper documentation is key. Appropriate metadata capturing the identity, technical characteristics, and management actions for Big Data must also document the multiprovenancial origins of such data sets. More broadly, Big Data reminds archivists of their larger responsibilities. Recognizing the power dynamics in Big Data requires an interrogation and documentation of the data themselves, as well as of the ways in which governments and corporations use them. Digital preservation must balance technical knowledge with critical perspectives to truly capture the context of Big Data and the records it produces.

The American Archivist
Vol. 83, No. 1
Keywords: Digital preservation, Big Data, Government archives, Privacy, Ethics, Digital archives, Data repositories
