To date, the Archives holds more than 21 TB of digital records in a wide variety of types and formats that reflect the complex nature of the Smithsonian and the activity of its staff. Its earliest accession of digital content occurred in 1994, and its oldest digital content dates back to the early 1980s. Today, well over half of the Archives' annual accessions contain digital material. The Archives employs a lifecycle management approach to establish physical and intellectual control as soon as it acquires custody of digital collections, using a multi-pronged preservation strategy built on a foundation of bit-level preservation. Format migration or emulation strategies are also employed when appropriate as the Archives carries out its commitment to provide durable access to its digital holdings.
In 2001, the Technical Services group was charged with the care and preservation of the digital holdings in its collections. By that point, it was well accepted within the archival profession that electronic records could not be faithfully preserved without a digital approach. Initial efforts focused on documenting the presence and scope of digital storage media found in new accessions and on two areas of research: website preservation and email preservation. Both areas were recognized as leading challenges in the world of electronic records and digital preservation.
In 2003, the Archives’ Electronic Records Program was formed and quickly set to work compiling detailed documentation and condition assessments of existing digital holdings. A preservation protocol was established, and its methodology was integrated into the Archives’ acquisition process to ensure that preservation treatments began at the earliest point possible. An exhaustive inventory of all digital holdings was made the following year.
In 2005, website capture and preservation transitioned into normal operations. The Program resumed its research into the preservation of email, joining with the Rockefeller Archive Center on the Collaborative Electronic Records Program (CERP, 2005-2008). Working with depositors from both organizations and with bodies of email ranging from a few hundred messages to accounts with more than 80,000 messages, the project team developed guidance for email depositors and their archival organizations, both large and small; a preservation methodology; and software appropriate for dealing with whole email accounts and single messages. The CERP Preservation Parser software and documentation were released as open source in 2008.
Other challenges have been tackled and solutions defined since then. However, the nature of computing technology ensures that research will always be part of digital curation best practice. Learn more at Challenges and Solutions and Project Highlights.
Program staff often join their peers in professional initiatives to advance digital preservation, serving as participants or advisors in standards development and research projects and hosting specialized internships.
- The Federal Agencies Digitization Guidelines Initiative, The Library of Congress
- The Federal Web Archiving Working Group, The Signal Blog, The Library of Congress
- The international Task Force on Technical Approaches to Email Archives, Mellon Foundation