To Preserve or Not to Preserve: Social Media

The Smithsonian Institution currently has over five hundred social media, social networking, and other "web 2.0" accounts (many of them are listed on the Smithsonian’s "Connect" page). These accounts include approximately 143 Facebook accounts, one hundred Twitter accounts, seventy-four blogs, sixty-six Flickr accounts, and sixty-one YouTube accounts. These accounts are used for public outreach and to bring attention to the Smithsonian’s objects, exhibitions, research, programs, projects, events, activities, staff, and educational resources. Each of these accounts focuses on a different audience and specializes in a unique topic.

These vehicles for engaging audiences may be new, but the Smithsonian has always performed this sort of outreach using a variety of means including articles in scholarly journals, Smithsonian-published magazines, newsletters, press releases, teacher packets, email lists, and websites. Since all of these have been considered historically valuable materials, it is only logical that the Smithsonian’s social media accounts should also be preserved.

Preservation of any type of digital record is more complicated than paper preservation and requires more resources over time. Keeping this in mind, we look closely at the records to determine whether we need to preserve them in their entirety. Our goal is to preserve enough data to satisfy the needs of future researchers while minimizing the amount of duplicate, extraneous, and less historically valuable data. We attempt to find this "happy medium" as part of our appraisal process, the process by which we determine what will become part of the Archives’ collections.

When we appraise social media accounts, we look at each account individually because they are all used differently. Some accounts contain mostly original content or other information that is not quickly and easily available elsewhere. Other accounts consist primarily of links to the Smithsonian’s own websites, to news articles, or to the websites of other organizations. Many social media accounts fall somewhere in the middle. A major factor in how we appraise a social media account is the amount of significant original content it includes. This is more of an art than a science, and we attempt to err on the side of caution.

The Archives of American Art Facebook page was the first we preserved in the new

Social media accounts with significant original content are captured in full or at least back to the last time they were captured. Social media accounts with little original content are also captured and preserved to document their existence and how they were used, but we will generally only capture a sample of the account, such as two or three months of a Facebook timeline.
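The capture rule described above can be sketched in a few lines of Python. This is only an illustration, not the Archives' actual tooling: the `Post` type, the function name, and the 90-day sample window (standing in for "two or three months") are all assumptions. The judgment about whether an account has significant original content remains a human appraisal decision and is simply passed in as a flag.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class Post:
    """Hypothetical record for one post on an account."""
    posted_on: date
    text: str


def select_posts_to_capture(
    posts: List[Post],
    has_significant_original_content: bool,  # decided by an archivist, not computed
    last_captured: Optional[date],
    today: date,
    sample_days: int = 90,  # assumed stand-in for "two or three months"
) -> List[Post]:
    """Return the posts that fall within the capture scope for an account."""
    if has_significant_original_content:
        if last_captured is None:
            # Never captured before: take the account in full.
            return list(posts)
        # Otherwise capture back to the last time the account was captured.
        return [p for p in posts if p.posted_on > last_captured]
    # Little original content: keep only a recent sample.
    cutoff = today - timedelta(days=sample_days)
    return [p for p in posts if p.posted_on >= cutoff]
```

In practice the sample need not be the most recent months; any representative stretch of the timeline would serve the same purpose of documenting how the account was used.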

There are other ways to minimize the amount of data we preserve from social media accounts. Some accounts are structured in such a way that the content and metadata can be exported as a spreadsheet or XML document. Twitter is a good example. These documents are often much smaller than the data collected by crawling the account. We will also preserve a screenshot of the account to document its look. For accounts with more complicated structures, we will often look at the entire account and determine if there are pieces that are not necessary to preserve. Oftentimes photographs, videos, or calendars of events uploaded to the account are also available on a Smithsonian website or in a publication that is also being preserved. In some cases, these duplicate items can be excluded when we capture the account.
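To give a sense of why an export is so much smaller than a crawl, here is a minimal Python sketch that writes an account's posts and basic metadata to a single XML document using the standard library. The element and attribute names (`account`, `post`, `id`, `date`) are assumptions made for illustration; they are not the schema of any real export from Twitter or any other provider.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def export_posts_to_xml(account_name: str, posts: List[Dict[str, str]]) -> str:
    """Serialize posts (hypothetical dicts with 'id', 'date', 'text') to XML."""
    root = ET.Element("account", attrib={"name": account_name})
    for post in posts:
        # Keep only the text and a little metadata; page layout, scripts,
        # and images that a crawler would collect are left out entirely.
        item = ET.SubElement(
            root, "post", attrib={"id": post["id"], "date": post["date"]}
        )
        item.text = post["text"]
    # encoding="unicode" returns a str; special characters are escaped.
    return ET.tostring(root, encoding="unicode")
```

Because an export like this carries none of the account's visual presentation, it makes sense to pair it with a screenshot that documents the account's look.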

The National Museum of African Art's Twitter account, exported as an XML document on May 18, 2012 u

Another major concern when appraising social media is privacy. Personal information is everywhere in social media applications and we do our best to minimize the amount of that information that we capture and preserve. We avoid capturing content outside of the scope of the Smithsonian-administered account, meaning that we do not capture the profiles or accounts of the individuals who like, follow, or connect with Smithsonian accounts. That does not mean that we do not capture any personal information. For instance, if you comment on a blog or a Facebook post, the text of your comment as well as your name, profile picture, and any other publicly displayed information will likely be captured. However, if we feel that too much personal information would be disclosed by capturing the account, we do not capture it.

While the popularity of individual social media providers will likely fade over time and become just a blip in web history, they exemplify current and future trends in communication. By capturing and preserving the Smithsonian’s social media presence, we are continuing to document the evolution of the Institution’s methods of sharing information and engaging new audiences.

Produced by the Smithsonian Institution Archives. For copyright questions, please see the Terms of Use.