The Bigger Picture: Visual Archives and the Smithsonian
How we share information and spread knowledge has changed drastically since the Smithsonian Institution unveiled its first homepage in May 1995. The official debut of "America's Treasure House for Learning" took place on Capitol Hill in House Speaker Newt Gingrich's office, with Smithsonian Secretary I. Michael Heyman in attendance. The site linked to video, images, pages, maps and audio clips from across the Institution. Other Smithsonian homepages went online as well.
Secretary Heyman reported in his annual statement for 1995 that as of September 30, the site had more than 8.5 million visits. To put this in perspective, Smithsonian websites combined had more than 99 million visits in fiscal year 2014.
Prior to this, the National Museum of Natural History was already using Gopher technology on the Internet in 1993. The Smithsonian Astrophysical Observatory also launched its Telescope Data Center website in 1993; it was one of the first 250 websites on the Internet and is still active today.
While the Archives has been preserving Smithsonian websites since the late 1990s, we do not have the electronic files preserved from this first Smithsonian homepage. Multiple attempts to retrieve files off a data tape have been unsuccessful. We do have the press kit, a printout of the top part of the site, and other related files. We continue to hope someone out there might have another copy of the digital files from 1995.
The earliest captures of the homepage at the Wayback Machine from the Internet Archive only go back to 1997 and are missing some items.
Anyone who has done complex searches on the web knows they can be challenging, especially with digital information that is nearly 20 years old. Using a standard search engine does not always deliver the desired results.
This is where Memento comes in. With funding from the Library of Congress and the Andrew W. Mellon Foundation, it was developed by Los Alamos National Laboratory and Old Dominion University. Dubbed "Time Travel for the Web," the Chrome extension works by taking the URI (uniform resource identifier) of a web resource and a date in the past when that resource may have been on the web. I entered www.si.edu in my browser and selected "get near saved date" with January 1, 1996 (the earliest date available with the plugin), and found a web capture from the Portuguese Web Archive. This display of the homepage from October 13, 1996, has more details than what was found previously, and these results did not appear in regular search engine queries. I also recently found that Indiana University has a capture as well.
Memento uses a protocol to search archived websites from the Internet Archive, Archive-It (where you can find archived Smithsonian websites), the UK Web Archive, the Icelandic Archive, and other sites. It also works with Wikipedia, and other tools are being developed. Obviously, it only works if the website was captured in the past and the capture is still available on a server.
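For readers curious about what happens behind the extension, here is a minimal Python sketch of the kind of request the Memento protocol describes: the client asks a "TimeGate" for a URI and passes the desired date in an Accept-Datetime header. The aggregator address below is an assumption on my part (substitute whichever TimeGate you use), and the function name is purely illustrative. The sketch only builds the request; it does not contact any server.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumed TimeGate endpoint of a Memento aggregator; not named in this post.
TIMEGATE = "http://timetravel.mementoweb.org/timegate/"


def build_timegate_request(uri, when):
    """Return the URL and headers for a Memento TimeGate lookup.

    `uri` is the original web address and `when` is a timezone-aware
    datetime for the moment in the past we would like to see.
    """
    when_utc = when.astimezone(timezone.utc)
    headers = {
        # The protocol expects an HTTP-style date expressed in GMT.
        "Accept-Datetime": format_datetime(when_utc, usegmt=True),
    }
    return TIMEGATE + uri, headers


url, headers = build_timegate_request(
    "http://www.si.edu/",
    datetime(1996, 1, 1, tzinfo=timezone.utc),
)
print(url)
print(headers["Accept-Datetime"])
```

Sending that URL and header with any HTTP client should, if a capture exists, redirect to the archived copy closest to the requested date, which may come from any of the participating archives.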
It is rewarding to see a few more pieces come together from the early days of the web and the Smithsonian's role in it.
- Home Page of the Smithsonian's First Website, Historic Images of the Smithsonian, Smithsonian Institution Archives
- Web Archiving Update, October 2014, The Bigger Picture blog, Smithsonian Institution Archives
- Smithsonian Institution websites, Archive-It
As the saying goes, "Time flies when you're having a good time," and indeed it seems like yesterday that we met the new Smithsonian Secretary G. Wayne Clough at the Staff Picnic in July of 2008. In some ways, he appeared to fit the mold of the typical Smithsonian Secretary: a very tall man with a Ph.D. But in other ways, he broke the mold – a Southerner? – an engineer? How would someone like that lead the quirky Smithsonian? Hints of what the future had in store for us could be seen that first day, as he walked around the music and research tents, engaging staff in discussions about what they did at the Institution. His positive energy and smile were infectious, and I remember thinking, well, maybe he can liven this place up again . . .
Dr. Clough turned out to be a quick study as he surveyed the Institution and the people who make it tick – our strengths and weaknesses – and he formulated a plan that moved forward simultaneously on several tracks. The first task was daunting – to turn around a negative attitude that had crept across the Smithsonian in the last decade. He visited units, demonstrated to staff that he genuinely valued the work being done here, and publicly rewarded those with creativity and dedication. He dug up fossils, learned to work a snow blower, snorkeled in the Caribbean and hiked around the South Pole. He got to know the Smithsonian in-depth. And he challenged all of the staff to think positively about the future of the Smithsonian, rather than dwell upon the past.
Dr. Clough was a graduate student at Berkeley in the 1970s, when the computer revolution was taking off and Silicon Valley was coming into existence, and he had been immersed in information technology throughout his engineering work. To him, the Smithsonian seemed behind the curve. So, his second front was to encourage expanded digitization of our collections, the use of digital communications to reach new audiences, and support for projects that used information technology in new and creative ways for Smithsonian web 2.0. Before long, Dr. Clough had the staff digitizing everything in 3D – even mini-Wayne himself! Today we reach millions of people across the globe and thousands of online volunteers have become part of the Smithsonian family.
So what does an engineer do at the Smithsonian? We quickly took solace in his expertise in earthquake engineering when a quake hit the mid-Atlantic region in 2011, damaging Smithsonian buildings. A lot of environmental research is conducted across the Smithsonian, but putting that research into practice in our own facilities had not been a priority. Dr. Clough challenged the facilities staff and they substantially reduced the amount of fossil fuels used and increased the amount of renewable fuel sources.
The Smithsonian is a large and complex organization – so Dr. Clough looked for ways to increase interactions across diverse units. He brought together a group who distilled Smithsonian interests into the four "Grand Challenges" of Unlocking the Mysteries of the Universe, Understanding and Sustaining a Biodiverse Planet, Valuing World Cultures, and Understanding the American Experience. And then he actually found funding for collaborative grants! "The Anthropocene" challenged astrophysicists, anthropologists, art historians, cultural historians, botanists, and paleontologists to actually work together on a coherent project. Could they do it? Yes, they did, and they got us all thinking in new ways while we got to know coworkers whose work was very different from our own.
So it is now time to bid farewell to Dr. Clough, as he returns to his beloved Georgia. I'll be busy for the next couple of years ensuring all his positive accomplishments are properly documented in our historical record. We'll miss the smile, enthusiasm, and challenges to reach higher every day, but we can build on his legacy to create a truly 21st century Smithsonian!
The Smithsonian Channel produces award-winning television programming that engages viewers in much the same way that the Smithsonian's museums and galleries engage their visitors throughout the United States. Just as the Smithsonian is working to digitize its collections for greater access and preservation, the Smithsonian Channel and the Smithsonian Institution Archives are also undertaking various efforts to ensure the digital preservation of these television programs.
The reformatting workflow for this project has been dynamic, and it should be. During earlier accessions of Smithsonian Channel programming, the programs were transferred on DVDs, numbering in the hundreds. In order to preserve the files digitally and prepare them for ingest into the Smithsonian's Digital Asset Management System (DAMS), the DVDs underwent a lengthy workflow process to ensure the highest level of playability and playback quality.
As part of the project's workflow, and best practices at the Archives, each disc is individually scanned using virus detection software. While this process is lengthy, it is critical to ensuring the security of the Archives' IT infrastructure. The next step in the workflow is to individually create .ISO images of each disc, which retain each program's DVD menu functionality. After creation of the .ISOs, the individual .vobs are extracted and combined into a single .vob using a command prompt script. This single .vob is then converted to an .mpeg, also using command prompt, to ensure the greatest playability across multiple software programs. This process is individually repeated for every DVD within the collection and can take months to complete.
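The combine-and-convert step can be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the script used at the Archives: the file names are hypothetical, the .vob segments are simply appended byte for byte in order, and the conversion is shown as an ffmpeg command that is built but never run, since the post does not name the tool that was actually used.

```python
import shutil
import tempfile
from pathlib import Path


def concatenate_vobs(parts, dest):
    """Append each .vob segment, in the order given, to one output file."""
    dest = Path(dest)
    with dest.open("wb") as out:
        for part in parts:
            with Path(part).open("rb") as src:
                shutil.copyfileobj(src, out)
    return dest


def remux_command(vob, mpg):
    """Build (but do not run) one possible conversion command.

    Assumption: ffmpeg is the converter; `-c copy` keeps the existing
    audio and video streams and only changes the container.
    """
    return ["ffmpeg", "-i", str(vob), "-c", "copy", str(mpg)]


# Small demonstration with stand-in files rather than real video.
with tempfile.TemporaryDirectory() as folder:
    first = Path(folder) / "VTS_01_1.VOB"   # hypothetical segment names
    second = Path(folder) / "VTS_01_2.VOB"
    first.write_bytes(b"first-segment;")
    second.write_bytes(b"second-segment")
    combined = concatenate_vobs([first, second], Path(folder) / "program.vob")
    combined_bytes = combined.read_bytes()

command = remux_command("program.vob", "program.mpg")
print(" ".join(command))
```

Whatever tools are used, the order of the segments matters: the pieces have to be joined in the sequence they appear on the disc or the program will play out of order.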
After creation of the mpegs, the associated metadata must be created for each individual file in preparation for ingest into the DAMS. The metadata is applied to each file using Adobe Bridge; however, the metadata cannot be embedded into the actual video files, so a sidecar .xmp file must be created to hold the associated video file's metadata. Once this process is complete, the .ISO, .mpg, and .xmp files are entered simultaneously into the DAMS to ensure that the proper parent (.iso)/child (.mpg and .xmp) relationships are maintained.
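To give a sense of what a sidecar file is, here is a minimal Python sketch that builds a bare-bones .xmp document holding only a title and works out the name the sidecar would take next to its video file. It is an illustration only: the Archives created its sidecars with Adobe Bridge, and the file name, the title, and the choice of a single Dublin Core title field are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path
from xml.sax.saxutils import escape

# A stripped-down XMP packet carrying one Dublin Core title.
XMP_TEMPLATE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
 <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
   <dc:title>
    <rdf:Alt>
     <rdf:li xml:lang="x-default">{title}</rdf:li>
    </rdf:Alt>
   </dc:title>
  </rdf:Description>
 </rdf:RDF>
</x:xmpmeta>
"""

RDF_NS = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"


def sidecar_path(video):
    """The sidecar shares the video's name but ends in .xmp."""
    return Path(video).with_suffix(".xmp")


def build_sidecar(title):
    """Return XMP text for the given title, escaping &, < and >."""
    return XMP_TEMPLATE.format(title=escape(title))


# Hypothetical file name and title, used only for demonstration.
sidecar_name = str(sidecar_path("program_001.mpg"))
xmp_text = build_sidecar("Ships & Shipwrecks (hypothetical title)")

# Parse the result back to confirm it is well-formed XML.
parsed_title = ET.fromstring(xmp_text).find(".//" + RDF_NS + "li").text
print(sidecar_name, "->", parsed_title)
```

Because the sidecar is a separate file, it has to travel with its video at every step, which is why the .iso, .mpg, and .xmp files are ingested together.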
Throughout the entire workflow (upon initial receipt, after each conversion, and after upload to the DAMS), each file has been viewed for quality assurance, further adding time to an already lengthy workflow. In total, processing the collection of 136 DVDs within the accession took roughly 300 hours to complete.
In an effort to simplify the workflow, archivists from the Smithsonian Channel and the Archives met to develop a plan to achieve maximum efficiency in the preservation of Smithsonian Channel's programming. During the meeting, it was decided to test a pilot program wherein the Smithsonian Channel would send a number of .mov files through a secure server to the Archives to develop a new workflow based solely on the digital transfer of the Smithsonian Channel's programs. While not eliminating the original DVD transfer yet, this process significantly shortened the workflow and reduced the time involved in the entire preservation process.
With the transfer of the .mov files, the conversion process was removed entirely from the workflow. Further, the metadata can be directly embedded into the file header of the .mov files, eliminating the need to create a separate file for the metadata. For DAMS ingest, only the .mov file is needed, as opposed to the .ISO, .mpg, and .xmp files. In essence, what used to take nearly 300 hours could be completed in as little as a day for a collection of programming.
Making the process of preserving Smithsonian Channel programs simpler means that programs can be preserved more quickly. With fewer files to work with and a more straightforward workflow, there is also less likelihood of errors. The collaborative effort between the Smithsonian Channel and the Archives is a prime example of two institutions working together toward digital preservation.
- What are You Watching?, The Bigger Picture blog, Smithsonian Institution Archives
- And Action: The Ins and Outs of DVD Video Preservation, The Bigger Picture blog, Smithsonian Institution Archives
- Digital Video Preservation: Further Challenges for Preserving Digital Video and Beyond, The Bigger Picture blog, Smithsonian Institution Archives
- Smithsonian Channel records at the Smithsonian Institution Archives
- Awesome news - NARA has a new, updated National Archives Catalog, to help make it easier for people to search and find records in their collections. [via NARAtions blog, NARA]
- The Digital Einstein Papers launched last week, making available the collected papers of Albert Einstein, including a letter he wrote to Marie Curie supporting her and giving counsel on how to deal with her critics. [via Open Culture]
- The last of the Hidden Collections awards were given out by the Council on Library and Information Resources. The program was created in 2008 and is supported by ongoing funding from The Andrew W. Mellon Foundation. It has awarded 129 grants totaling about $27.3 million and has allowed repositories to process and make available collections that were previously hidden. [via InfoDocket]
- The report is out - The FADGI (Federal Agencies Digitization Guidelines Initiative Audio-Visual Working Group) report on "Creating and Archiving Born Digital Video" was released this week, and the Archives was one of the contributors. [via The Signal: Digital Preservation, LOC]
- Science lesson for the week - Five major advances in scientific knowledge that have occurred since the National Museum of Natural History opened in 1910. [via Unearthed blog, NMNH]
- Available now - Two new online exhibitions from the Biodiversity Heritage Library - Early Women in Science and Latino Natural History. [via Field Book Project blog, NMNH and SIA]
- Watch as paleontologists discover Sue, who, at 42 feet long and weighing nearly 4,000 pounds, is the largest, most complete Tyrannosaurus rex skeleton ever found. [via Underwire, Wired]