The Smithsonian: Using and Archiving Facebook

Did you know that the Smithsonian Institution has about eighty Facebook accounts spread across its museums, research centers, and offices? These Facebook pages do everything from listing upcoming speakers and films, to sharing photos of exhibitions and expeditions, to posting job announcements. Facebook is one of many tools the Smithsonian Institution uses to reach its multiple audiences around the globe. As of May 23, 2011, the main Smithsonian Institution Facebook site had more than 95,000 “Likes” from Facebook users.

Since it’s the Smithsonian Institution Archives’ responsibility to retain the Institution’s history, it’s important for us to capture a representative sampling of Facebook pages to document how the Smithsonian used new technology in the early 21st century, as well as to preserve content not available elsewhere. While traditional websites might be updated only daily, weekly, or even less often, Facebook pages give Smithsonian museums and offices the opportunity to distribute information more quickly, and they also give the public a chance to interact by posting to Facebook Walls. In this sense, Facebook is more personal than a website and delivers more than the 140 characters of a Tweet.

Owney the Railway Mail Dog's Facebook Info Page on March 2, 2011.

Here’s a sampling of what Smithsonian Facebook pages had to offer in March, for example:

  • Smithsonian Folkways Recordings offered an audio clip of “Chopin: Etude in C minor, Op. 25, No. 12” by Robert Pritchard.
  • Owney the Railway Mail Dog from the National Postal Museum posted a link about the use of pneumatic tubes for mail delivery in the 19th century.
  • The Smithsonian Marine Ecosystems Exhibit shared coral photos from a Belize research trip.
  • Smithsonian Theaters posted showing times for the film Arabia 3D.
  • The Hirshhorn Museum and Sculpture Garden posted an image of Dana Hoey's Waimea from its collections.

At that same time, five students from the University of Michigan were spending their Alternative Spring Break at the Archives, participating in various projects including digitizing field books and transferring digital audio files. One student spent the week saving a multitude of recent Smithsonian Facebook pages along with their additional components, such as the Info, Photo, and Notes tabs.

The Smithsonian American Art Museum and Renwick Gallery's Facebook Wall.

While we have been using a web crawler known as Heritrix to archive the Smithsonian’s traditional websites, we decided to archive Facebook differently, since web crawlers run into some complications with social media sites. So, we created PDF/A captures of the Facebook pages. PDF/A is a subset of PDF designed for long-term digital archiving. This international standard requires 100 percent self-containment, meaning that fonts are embedded and that audio and video are not allowed. The PDF/A file cannot rely on outside content, such as material reachable only through hyperlinks. The files should render identically every time, regardless of the system used to open them.

Overall, the archiving process went smoothly but was time-consuming. We opened each Facebook page in a web browser and used “Print to PDF” with PDF/A settings. When some of the Notes tabs crashed, we experimented with other methods to try to create the PDF files. These captures immediately became records of moments in time. Some pages only have information from 2011. Others include content that goes further back. The plan is to archive these pages on an annual basis, so that in the future, viewing these pages can give a researcher a greater appreciation of the scope of what the Smithsonian is and does.

Update: If you are interested in “archiving” your own personal account, Facebook offers a feature that allows you to download your information. Go to Account, Account Settings, and Download Your Information. This will copy all your photos, videos, Wall posts, messages, etc. into a compressed file that only you can download. To find out more information on PDF captures of web pages, do a search on “screenshots of web pdf.”
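
For anyone planning a similar page-by-page capture routine, it can help to settle on consistent file names before the printing begins. The sketch below is only an illustration and is not the process the Archives used: the function names, the file-naming pattern, and the choice of Python are all assumptions. It simply lists one dated PDF file name for each tab of each page, using the tab names mentioned above.

```python
from datetime import date
import re

# Tab names taken from the post; the naming scheme below is hypothetical.
TABS = ["Wall", "Info", "Photo", "Notes"]


def slugify(name):
    """Lowercase a page name and turn runs of other characters into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def capture_plan(pages, capture_date, tabs=TABS):
    """Return one (page, tab, filename) tuple for every tab of every page."""
    stamp = capture_date.isoformat()  # e.g. 2011-03-02
    return [
        (page, tab, f"{stamp}_{slugify(page)}_{tab.lower()}.pdf")
        for page in pages
        for tab in tabs
    ]


if __name__ == "__main__":
    pages = ["Owney the Railway Mail Dog", "Smithsonian Folkways Recordings"]
    for page, tab, filename in capture_plan(pages, date(2011, 3, 2)):
        print(f"{page} / {tab} -> {filename}")
```

Putting the capture date at the front of each name keeps yearly captures of the same page sorted together in order, which suits an annual schedule. The script only plans file names; the captures themselves would still be made by hand in the browser.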

Produced by the Smithsonian Institution Archives. For copyright questions, please see the Terms of Use.