Digital Curation


The use and usefulness of computers have exploded since the 1960s, as computers shifted from highly specialized tools to something ubiquitous in the workplace, in the field, and at home.

The Archives digital curation team provides the necessary care for this growing segment of the Smithsonian’s historical and scientific records. Across the cultural heritage sector, digital curation encompasses the selection, acquisition, preservation, maintenance, and delivery of digital data. At the Archives, we manage our born-digital and digitized holdings in the context of a lifecycle management approach.

The Archives Electronic Records Program was implemented in 2003 to care for the born-digital holdings spread across the collections. Each year, more than half of our new accessions contain born-digital material: images, audio, video, text, databases and datasets, CAD (computer-aided design) files, websites, email, social media, and custom-built software.

Digitization of our analog holdings began around 2008, and that function quickly became incorporated into our preservation and reference services. Since that time more than 1.5 million digital surrogates of textual, photographic, and video materials have been created and are being made available to researchers and scholars.

Digital Preservation

Often confused with the broader scope of digital curation, digital preservation is focused on the task of ensuring that digital collections are accessible to the public in the future. For analog collections that are digitized, the high-quality digital surrogates produced require the same attention as born-digital collections in order to avoid subjecting the analog materials to repeated digitization. Several professional standards and best practices guide this work.

Preservation of Analog Collections – Digitization

The oldest Archives collections date back to the 1820s. Some of our most valued, permanent collections are also the most heavily used by scholars and researchers. Careful handling and storage conditions are essential to the longevity of those materials. Paper and photographic materials degrade over time, and frequent handling speeds up this process. Furthermore, physical holdings can only be accessed by one person at a time in the Archives reading room, whereas digitization makes these collections available to people around the world.

Therefore, to extend the life of these physical holdings, the Archives takes a digital preservation approach when digitizing its collections. By using high-resolution imaging and digital capture of audio and video at or above best-practice specifications, and by storing the results in long-lived preservation file formats, we seek to avoid the negative impact of repeated digitization, which can occur when the chief purpose of digitization is access alone. The preservation master surrogates can then be used to generate access derivatives as needed for researchers and scholars without affecting the original object. Learn more in Digitizing Collections.

Preservation of Born-Digital Collections

In the case of the Archives’ “younger” digital collections, dependencies on aging hardware and software, and degradation of storage media make this content more fragile than paper or magnetic media. The Archives established its Electronic Records Program to devote the attention born-digital collections require as soon as possible, and to continue that level of stewardship throughout the life of those holdings. The Digital Curation team establishes physical and intellectual control at the point of collection accession, and performs preemptive preservation migrations as baseline risk assessments indicate, following a functional OAIS Reference Model workflow. Not all file formats the Archives acquires have an appropriate preservation file format to which they can be migrated. When this is the case, an emulation-based preservation plan is considered. Learn more about the work and history of the Electronic Records Program.

Digital Asset Management

Whether preserving born-digital collections or digital surrogates of analog collections, managing those digital files and their associated metadata is an essential component of their stewardship at the Archives. Selection of the preservation file formats in both cases uses the same criteria: a basis in standards, the capacity to accommodate all the significant properties of the object, and expected longevity. Once preserved, these assets are stored in a digital repository where they are managed efficiently and effectively as part of the Archives’ overall digital curation efforts.

Lifecycle Management and Electronic Records

Lifecycle management is the care of a product or item from its creation throughout its use, whether for a set amount of time or indefinitely. For an electronic record, the creator needs to determine what software to use, how to name the file, what content it should contain, how to store it, and how to use it.

Lifecycle management also includes caring for the file once it is no longer active for the creator or custodian. The record can still be useful in that it can provide historical information even if the record creator no longer needs it on a regular basis. It is also possible that a file is no longer useful and can be disposed of, depending on records disposition schedules or other guidance.

The Archives’ role in lifecycle management typically starts when the record has become inactive and has been appraised for accessioning. The Archives acquires the file(s) as part of an official archival collection.

This process with electronic records includes the following best practices for long-term preservation:

  • Virus scan
  • File transfer (ingest)
  • Integrity check (Did the files transfer completely and without problems? Do the hashes match?)
  • File backup
  • File format identification
  • Creation and review of metadata
  • Preservation (bit-level or migration when needed/possible)
  • Secure storage
  • Retrievability and accessibility
  • Ongoing maintenance of hardware, software, and operating systems.
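
The integrity check in the list above can be illustrated with a short Python sketch. It is not the Archives’ actual tooling: the function names (sha256_of, verify_transfer), the choice of SHA-256, and the directory layout are assumptions made for illustration only. The idea is to hash each file before and after transfer and report any copy that is missing or whose hash differs.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(source_dir: Path, dest_dir: Path) -> list[str]:
    """Compare every file under source_dir with its copy under dest_dir.

    Returns the relative paths that are missing or whose hashes differ.
    An empty list means every source file arrived intact.
    (Hypothetical helper; names and layout are illustrative assumptions.)
    """
    problems = []
    for src in sorted(p for p in source_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(source_dir)
        dst = dest_dir / rel
        if not dst.is_file() or sha256_of(src) != sha256_of(dst):
            problems.append(rel.as_posix())
    return problems
```

Calling verify_transfer(Path("transfer_media"), Path("repository/accession")) with two hypothetical directories returns an empty list when the copy is complete. Reading in chunks keeps memory use flat even for large audio or video files.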

Open Archival Information System Reference Model

Another way of viewing this process is through the Reference Model for an Open Archival Information System (OAIS). The model, developed by a cross-section of data and information professionals, serves as a framework for an archival system that provides preservation and access functionality over the long term. It can apply to digital and analog collections and does not specify particular hardware or software to use. Many cultural heritage organizations follow this framework for digital preservation, as does the Archives.

The OAIS Reference Model – Functional Diagram

Under the OAIS functional model, the record creator or producer creates a Submission Information Package (SIP) for deposit with or transfer to a repository. This package should include the files, regardless of format, associated metadata or documentation, and a transfer manifest that contains the file names, dates, and file sizes. In some cases, only the files are provided by the record holder/creator.
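
As an illustration of the transfer manifest described above, the following Python sketch lists each file in a package with its name, last-modified date, and size. The column names, the CSV layout, and the function names (build_manifest, manifest_to_csv) are assumptions for the example, not a format the Archives prescribes.

```python
import csv
import io
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column names; a real manifest may use different fields.
FIELDS = ["file_name", "modified_utc", "size_bytes"]


def build_manifest(package_dir: Path) -> list[dict]:
    """Collect name, last-modified date (UTC), and size for every file."""
    rows = []
    for path in sorted(p for p in package_dir.rglob("*") if p.is_file()):
        info = path.stat()
        modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
        rows.append({
            "file_name": path.relative_to(package_dir).as_posix(),
            "modified_utc": modified.isoformat(timespec="seconds"),
            "size_bytes": info.st_size,
        })
    return rows


def manifest_to_csv(rows: list[dict]) -> str:
    """Serialize manifest rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

A manifest like this gives the receiving repository something to check the transfer against. When only the files arrive, the same sketch could be run on receipt to record what was actually delivered.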

The Archives accepts the SIP for ingest or transfer. Additional information is added to an accession note by Archives staff indicating the processing that has been done: virus scan, integrity check, file format identification, preservation work, etc. The SIP is now an Archival Information Package (AIP), and is securely managed and stored by the Archives. AIP information, including file location and preservation actions, is tracked in a specialized database, and cross-referenced in the Archives’ collections management system.

When a member of the public or staff wants to use a file or files from a collection, a Dissemination Information Package (DIP) is created for access from the AIP that is in the secure digital repository, leaving the original files intact. The DIP can be used and reused over time. A DIP does not need to include all files from a collection. For example, a researcher might be interested in three images from a collection of 100 born-digital files. This select group of images would be copied from the AIP and shared (or disseminated) as the DIP with the researcher.
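
The relationship between the AIP and a DIP can be sketched in Python as a copy of selected files. This is a minimal illustration under assumed conditions: the AIP is taken to be a plain directory, and the function name create_dip and the example paths are hypothetical rather than the Archives’ actual repository interface.

```python
import shutil
from pathlib import Path


def create_dip(aip_dir: Path, dip_dir: Path, selected: list[str]) -> list[Path]:
    """Copy the selected files from an AIP directory into a DIP directory.

    The AIP is only read, never modified. Raises ValueError for a path
    that points outside the AIP and FileNotFoundError for a missing file.
    (Hypothetical helper; a directory-based AIP is an assumption.)
    """
    aip_root = aip_dir.resolve()
    copied = []
    for name in selected:
        source = (aip_root / name).resolve()
        # Refuse requests that escape the AIP, e.g. "../other/file".
        if aip_root not in source.parents:
            raise ValueError(f"{name!r} is outside the AIP")
        if not source.is_file():
            raise FileNotFoundError(name)
        target = dip_dir / name
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copy2 also carries over timestamps
        copied.append(target)
    return copied
```

For the example above, a call such as create_dip(aip, dip, ["images/img0.tif", "images/img2.tif", "images/img4.tif"]) would place three copies in the DIP while all 100 files remain untouched in the AIP.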

The Archives is grateful for the opportunities it has had to contribute to the advancement of digital curation among museums, libraries and archives. Learn more about our work and projects.
