To address the challenges of preserving born-digital collections across the Smithsonian, the Archives launched a survey to inventory and assess the condition of born-digital holdings at seven units: the Archives itself, the National Anthropological Archives, the Human Studies Film Archive, the National Air and Space Museum Archives, the Archives Center at the National Museum of American History, the Archives of American Art, and the National Museum of African American History and Culture.
The survey’s goals were to uncover hidden holdings, to establish physical and intellectual control of born-digital material, and to perform a baseline preservation assessment to strengthen collections care. A shared methodology and shared metrics were used so that the survey could serve as a foundation for future joint preservation initiatives and stewardship planning.
The survey began in 2012 with an inventory of the removable digital storage media accessioned at the different archives, categorized by media type, age, and visible physical condition. At the same time, the participating archives completed questionnaires that evaluated their perceived ability to manage these types of collections. Using the data from the first-phase inventory, a second grant was received in 2014 to complete the survey, perform risk analysis at the individual file level, and provide essential interventions to stabilize these fragile materials. The survey was completed in April 2015, and the resulting qualitative and quantitative insights are being incorporated into collections stewardship planning at the participating archives and museums.
Hidden Collections and Intellectual Control
During the survey, 6,613 new pieces of digital storage media containing 651,629 files were identified in more than 470 accessions across the participating units. These numbers were combined with SI Archives’ previously inventoried born-digital material to arrive at a total in excess of 1.5 million files inventoried, assessed, and placed under improved archival control.
As the digital content was identified and classified, each digital object was assessed for risk and stabilized: it was scanned for viruses, a fixity value was established, backups were made into secure storage environments, and metadata was generated. In this way, a baseline of bit-level preservation for the newly described holdings was established across the participating archives. Risk level was categorized according to format and age as described in Figure 1. Of the content assessed, 81 percent was at medium or low risk, while 14 percent, or approximately 91,000 files, was at severe risk, being both more than 10 years old and in a format the holding archive could not access. Taken as a whole, risk was distributed as 14 percent Severe, 5 percent High, 43 percent Medium, and 38 percent Low. The levels were defined as follows:
- Severe (1) indicated files older than 10 years in formats that the participating archive was unable to access.
- High (2) indicated files younger than 10 years in formats that the participating archive was unable to access.
- Medium (3) indicated files older than 10 years in formats that the participating archive was able to access.
- Low (4) indicated files younger than 10 years in formats that the participating archive was able to access.
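The four levels above amount to a two-by-two matrix of age against format accessibility. The following is a minimal sketch of that matrix in Python, not the tooling used in the survey: the names `Risk`, `classify_risk`, and `AGE_THRESHOLD_YEARS` are hypothetical, and treating a file of exactly 10 years as "younger" is an assumption, since the definitions do not say how that boundary was handled.

```python
from collections import Counter
from enum import IntEnum


class Risk(IntEnum):
    """Risk levels as numbered in the list above (1 is the most urgent)."""
    SEVERE = 1
    HIGH = 2
    MEDIUM = 3
    LOW = 4


# The survey's age cutoff. Assumption: a file of exactly 10 years counts as "younger".
AGE_THRESHOLD_YEARS = 10


def classify_risk(age_years: float, format_accessible: bool) -> Risk:
    """Map a file's age and format accessibility onto one of the four levels."""
    is_old = age_years > AGE_THRESHOLD_YEARS
    if not format_accessible:
        return Risk.SEVERE if is_old else Risk.HIGH
    return Risk.MEDIUM if is_old else Risk.LOW


# Hypothetical sample records: (age in years, whether the archive can open the format).
sample_files = [(14, False), (3, False), (12, True), (2, True), (6, True)]

# Tally how many sample files fall into each level.
distribution = Counter(classify_risk(age, ok) for age, ok in sample_files)
for level in Risk:
    print(f"{level.name}: {distribution[level]}")
```

Because accessibility is judged by what the holding archive can open, the same file could be rated differently at two units; the sketch takes that judgment as an input rather than trying to infer it from the file format.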
In 2014, five universities co-published the results of the Digital POWRR (Preserving Digital Objects With Restricted Resources) Project, which included particularly helpful ways to visualize the important elements of digital preservation. Placed in the context of the POWRR framework, the progress in the participating archives’ born-digital collections stewardship between 2012 and 2015 is striking. Before the survey began in 2012, preservation and curation of born-digital holdings were significantly lacking, as demonstrated by the absence of color in the chart below.
By the survey’s conclusion in 2015, improvements had been made in almost every category of the POWRR framework. As indicated by the increased shading of the color blocks in the figure below, major gains were achieved across the participating archives in Ingest, Processing, and Storage.
We are excited by the enduring effect this survey has had on the born-digital holdings in Smithsonian collections, on their stakeholders, and on the wider stewardship community, as well as by the born-digital advocacy it empowers.
- Disk Diving: A Born Digital Collections Survey at the Smithsonian, The Bigger Picture, Smithsonian Institution Archives
- The End of the Beginning: A Born Digital Survey at the Smithsonian Institution, The Bigger Picture, Smithsonian Institution Archives
- From Theory to Action: Good Enough Digital Preservation for Under-Resourced Cultural Heritage Institutions, Northern Illinois University