The Bigger Picture: Visual Archives and the Smithsonian
Digital Video Preservation: Further Challenges for Preserving Digital Video and Beyond
As one might expect, the complexity of digital video adds several factors to track and assess compared with its analog moving image counterparts in the archive. Preservation strategies and technical standards developed for digital imaging and sound certainly apply to digital video, but standards for preserving and maintaining digital video itself are only now emerging. That makes my internship at the Smithsonian Institution Archives this summer an especially exciting opportunity to assess what’s been accomplished so far and to address the many issues that affect the preservation of digital video in the archival community.
As noted in my previous blog post, the Archives has accessioned a variety of born-digital materials, resulting in some 10,000 unique video files with a wide variety of technical attributes, including an array of container formats and codec types. A migration preservation strategy will identify video files at risk of obsolescence and convert them to more sustainable formats, many of which are tracked on the Library of Congress’ website “Sustainability of Digital Formats: Planning for Library of Congress Collections.”
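To make the first step of a migration strategy concrete, here is a minimal Python sketch of profiling an inventory by container and codec and flagging migration candidates. It is an illustration only, not the Archives' actual workflow: the `VideoFile` record, the function names, and especially the `AT_RISK_CODECS` and `AT_RISK_CONTAINERS` lists are hypothetical, and a real risk list would come from an institution's own format assessment.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoFile:
    """One inventoried video file (hypothetical record layout)."""
    path: str
    container: str  # wrapper format, e.g. "mov", "avi", "swf"
    codec: str      # video encoding, e.g. "h264", "cinepak"


# Hypothetical risk lists, for illustration only.
AT_RISK_CODECS = {"cinepak", "sorenson"}
AT_RISK_CONTAINERS = {"swf", "rm"}


def is_at_risk(video: VideoFile) -> bool:
    """A file is flagged if either its codec or its container is on a risk list."""
    return (video.codec.lower() in AT_RISK_CODECS
            or video.container.lower() in AT_RISK_CONTAINERS)


def migration_candidates(inventory):
    """Return the files that should be reviewed for conversion."""
    return [video for video in inventory if is_at_risk(video)]


def format_profile(inventory):
    """Count files per (container, codec) pair to show how varied a collection is."""
    return Counter((v.container.lower(), v.codec.lower()) for v in inventory)
```

Counting by container and codec pair, rather than by file extension alone, reflects the point above that the same wrapper can hold very different encodings.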
For archives, a sustainable video file format must be either lossless (meaning the exact original data can be reconstructed from the compressed data) or uncompressed. Because of the costs associated with storing large, uncompressed files, lossless compression is gaining acceptance in cultural heritage archives. A recent post from the Library of Congress’ digital preservation blog details efforts to establish the Motion JPEG2000 video codec and MXF wrapper as the preservation target format for digital video. However, outside the motion picture industry, the Motion JPEG2000 codec has seen little adoption, which has led to some apprehension about supporting the format for preservation purposes. Once the format becomes more prevalent in free, open-source conversion tools and media players, it will become a viable preservation target for cultural heritage archives. In the meantime, the Archives will continue to explore uncompressed preservation formats, as well as lossless codecs, for its born-digital video accessions.
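The defining property of lossless compression, that the original data comes back bit for bit, can be checked mechanically. The sketch below uses Python's general-purpose `zlib` compressor purely as a stand-in (it is not a video codec, and the function names and sample data are made up for illustration), but the same checksum comparison applies to any lossless format.

```python
import hashlib
import zlib


def sha256(data: bytes) -> str:
    """Fixity checksum of a byte string."""
    return hashlib.sha256(data).hexdigest()


def lossless_round_trip(original: bytes) -> bool:
    """Compress, decompress, and confirm the result matches the original exactly."""
    compressed = zlib.compress(original, 9)
    restored = zlib.decompress(compressed)
    return sha256(restored) == sha256(original)


# Made-up, highly repetitive "frame" data standing in for real video content.
frame = bytes([10, 10, 10, 200] * 4096)

if __name__ == "__main__":
    print("bit-for-bit identical:", lossless_round_trip(frame))
    print("original bytes:", len(frame))
    print("compressed bytes:", len(zlib.compress(frame, 9)))
```

A lossy codec would fail this comparison by design, which is why it cannot serve as a preservation master even when the picture looks the same to the eye.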
In addition to a lack of standards or guidelines in the wider community, further digital video preservation challenges persist, especially for video files that are part of larger multimedia objects such as CD-ROMs, DVDs, and websites. Often, video files associated with these objects possess interactive attributes that provide navigational functionality to other content or features within the video file itself. Simply converting these types of video files to a preservation format like Motion JPEG2000 will not preserve these interactive features or their relationship to other files within the multimedia object. So a different digital preservation workflow, one that captures these features via a conversion tool or, more likely, metadata, will have to be developed.
This SWF file from the Smithsonian's Hirshhorn Museum and Sculpture Garden website from 2004 (Smithsonian Institution Archives, Accession 04-095) compiles multiple SWF Flash files into one object and uses navigation buttons to access the content. Preserving each SWF file separately does not preserve the object as a whole, which presents considerable preservation challenges. In both the Internet Archive's Wayback Machine capture of this video and the SWF file itself, the “loading” indicator continues to flash. See the video above for a sample of the interactivity. Courtesy of the Smithsonian Institution Archives.
Progress is occurring in the web, video game, and art preservation realms, where metadata schemes for capturing interactive information provide the means for emulating or recreating an object should it be rendered unplayable in its original format. The Variable Media Network has produced “The Variable Media Questionnaire,” which allows curators and archivists alike to assess a multimedia object like a DVD, CD-ROM, or avant-garde performance art piece for the relevant information needed to preserve the object via storage, migration, emulation, or reinterpretation strategies. A similar project, PANIC, produced a metadata scheme that facilitates the collection of important structural and technical information inherent to multimedia objects, which can be used in future emulation efforts.
The above video is a QuickTime Virtual Reality (QTVR) file from the Smithsonian Photographic Services website in the late 1990s, from Smithsonian Institution Archives Accession 09-257. By dragging the mouse cursor over the video, the viewer gets a panoramic view of the crowd attending President Bill Clinton's inaugural address at the US Capitol Building in 1997. The interactive panoramic view, taken by Dane A. Penland, is only accessible in the QuickTime and RealPlayer video applications. Preserving this interactivity will not be an easy task. Courtesy of the Smithsonian Institution Archives.
For the many digital video files on CD-ROMs and DVDs currently accessioned at the Archives, preserving this content will require the adoption of a metadata scheme and the use of other innovative strategies. Currently, the Archives is looking at screen capture software as a potential means of recording how a CD-ROM functions and links to other material before software and hardware obsolescence renders the content unplayable. A similar method can be employed for capturing the navigational structure of DVD menus and Flash-based websites. However, generating screen captures and metadata that record this information depends on a working multimedia object. Once the object cannot play (like some of the already obsolete CD-ROMs), obtaining this information may require acquiring obsolete hardware and software, such as specialized CD-ROM drives and old operating system environments.
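As a rough illustration of the kind of structural metadata discussed above, the Python sketch below records the navigation links inside a multimedia object and reports which files cannot be reached from its entry point. Every field name, file name, and the accession number "00-000" are hypothetical placeholders; this is not the Variable Media Questionnaire or the PANIC schema, only a sketch of the idea.

```python
import json
from collections import deque
from dataclasses import asdict, dataclass, field


@dataclass
class NavigationLink:
    """One interactive link, e.g. a button in one file that opens another."""
    label: str
    source: str
    target: str


@dataclass
class MultimediaObjectRecord:
    """Hypothetical record describing how a multimedia object hangs together."""
    accession: str
    carrier: str  # e.g. "CD-ROM", "DVD", "website"
    required_software: list = field(default_factory=list)
    links: list = field(default_factory=list)

    def add_link(self, label: str, source: str, target: str) -> None:
        self.links.append(NavigationLink(label, source, target))

    def unreachable_files(self, all_files, entry_point):
        """List files that no recorded navigation path reaches from the entry point."""
        seen = {entry_point}
        queue = deque([entry_point])
        while queue:
            current = queue.popleft()
            for link in self.links:
                if link.source == current and link.target not in seen:
                    seen.add(link.target)
                    queue.append(link.target)
        return sorted(set(all_files) - seen)

    def to_json(self) -> str:
        """Serialize the record so it can be stored alongside the files."""
        return json.dumps(asdict(self), indent=2)


if __name__ == "__main__":
    record = MultimediaObjectRecord("00-000", "CD-ROM", ["Flash Player"])
    record.add_link("Tour", "menu.swf", "tour.swf")
    record.add_link("Back", "tour.swf", "menu.swf")
    print(record.to_json())
    print(record.unreachable_files(["menu.swf", "tour.swf", "credits.swf"], "menu.swf"))
```

A check like `unreachable_files` hints at why documentation has to happen while the object still plays: a file that appears unreachable may simply mean a link has not been recorded yet.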