The Bigger Picture: Visual Archives and the Smithsonian
Three Cheers for Embedded Metadata
I love metadata (data about data) because it makes my professional life as a digital archivist, as well as my personal life, easier. Of course, creating and embedding it also requires time and effort. As a case in point, Mike Ashenfelder from the Library of Congress has written about how challenging the process of adding metadata to digital photos can be. Indeed, the challenges are numerous.
For example, files from only a few years ago can become forgotten or obsolete as formats, operating systems, or software change. Adding to the difficulty is the fact that these files are everywhere: online, on computers and servers, and on CDs, DVDs, and thumb drives tucked in drawers and closets.
Image information is often lost as many digital items end up having multiple lives as they are repurposed on various websites, blogs, and social media sites.
Digital images appear in unexpected places, and Internet searches can take unexpected turns. For instance, my own recent Internet search for images of Midcentury Modern decorating led me to images of a restaurant I used to frequent (the explanation: a blogger with a penchant for the Midcentury was also a fan of the kitschy décor of the familiar Cozy Dog Drive In). In these cases, if you don’t have any context, you might not know what you are looking at.
A file name can help if it is descriptive enough, such as "2012_01_09_Smithsonian_Final_Report.doc." Captions with images also are important. Taking this one step further is embedded metadata, that is, metadata that is part of the file itself. This information will live with a photo or file no matter where that digital asset is stored, unless the data is removed. Embedded metadata can tell you who shot the photo, camera model, F-stop, ISO speed, where it was taken, copyright, and, if you are lucky, who or what is in the photo.
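To make "part of the file itself" a little more concrete, here is a minimal sketch in Python. It assumes the descriptive metadata was written as an XMP packet, which is plain XML stored inside the image file, and it simply scans the file's bytes for that packet. The function name `read_xmp_fields` is made up for this example. The sketch does not read the camera's technical EXIF values (model, F-stop, ISO), and it ignores cases such as very large packets that are split across segments.

```python
import re
import xml.etree.ElementTree as ET

# Namespace URIs used inside XMP packets (RDF and Dublin Core).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# An XMP packet is plain XML wrapped in an <x:xmpmeta> element.
XMP_PATTERN = re.compile(rb"<x:xmpmeta.*?</x:xmpmeta>", re.DOTALL)


def read_xmp_fields(data: bytes) -> dict:
    """Return creator, description, and keywords from the first XMP packet found.

    Hypothetical helper for illustration; returns {} when no packet is present.
    """
    match = XMP_PATTERN.search(data)
    if match is None:
        return {}  # no embedded XMP (it may never have been added, or was stripped)
    root = ET.fromstring(match.group(0))

    def items(tag: str) -> list:
        # Dublin Core values are stored as rdf:li entries inside a container.
        values = []
        for container in root.iter(f"{{{DC}}}{tag}"):
            for li in container.iter(f"{{{RDF}}}li"):
                if li.text:
                    values.append(li.text.strip())
        return values

    return {
        "creator": items("creator"),
        "description": items("description"),
        "keywords": items("subject"),  # keywords/tags live in dc:subject
    }
```

With a real file, you would pass in its contents, for example `read_xmp_fields(open("photo.jpg", "rb").read())`, where `photo.jpg` stands in for any image of your own.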

At the Smithsonian Institution Archives, we receive a variety of born-digital and scanned images that become part of our permanent collections and tell stories about the Smithsonian behind the scenes. These images range from objects and exhibition spaces, to museum special events, to researchers and students working in the field. Some file names are descriptive and helpful, while others are just the generic name assigned by the camera (IMG_0001). In some cases, the actual media (CDs, DVDs, disks) that house the files carry some useful information. And sometimes a separate word-processing or PDF file with captions is provided.
The more information we have from the source, the better. While technical metadata is provided by the digital camera (for example, the camera manufacturer and the photo date, if set properly), descriptive information (keywords/tags, captions) has to be added manually. This can be done with specialized software once the images are transferred to a computer. Many of these programs offer batch functionality that makes the process more efficient, so that we do not, for instance, have to enter the photographer’s name every time.
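The batching idea can be sketched independently of any particular program. The Python snippet below is only an illustration: the function `apply_batch`, the field names, and the sample values are all hypothetical. It merges a set of shared fields into each image's own record, and any value entered for an individual image takes precedence.

```python
def apply_batch(shared: dict, per_image: dict) -> dict:
    """Merge shared fields into each image's record; image-specific values win."""
    return {name: {**shared, **fields} for name, fields in per_image.items()}


# Hypothetical values: fields that are identical for the whole transfer.
shared_fields = {"creator": "Jane Doe", "credit": "Example Archives"}

# Hypothetical values: fields that differ from image to image.
captions = {
    "IMG_0001.jpg": {"caption": "Exhibition hall before opening"},
    "IMG_0002.jpg": {"caption": "Field research team", "creator": "John Roe"},
}

records = apply_batch(shared_fields, captions)
```

Here `IMG_0001.jpg` picks up the shared creator, while `IMG_0002.jpg` keeps the creator that was entered for it specifically. Actually writing these values into the image files is a separate step that depends on the software in use.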
Embedded metadata is not only for images. Many word-processing, presentation, and spreadsheet applications can also store the author, keywords, title, and other information. In a Windows 7 environment, right-click on a file and select Properties and then Details to view and edit fields. Older Windows operating systems require opening the file itself in software that allows viewing of Description metadata. On a Mac, right-click on the file and select Get Info for some basic information. By taking the time to add metadata to your own digital files, you’ll ensure that important and identifying information remains at your fingertips.
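For the curious, these fields can also be read with a few lines of Python. The sketch below assumes a file in one of the newer zip-based Office formats (.docx, .xlsx, .pptx), which by convention keep the title, author, and keywords in a part named `docProps/core.xml`. It will not work on older binary .doc files, and the function name `read_core_properties` is made up for this example.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs used in the core-properties part of an Office Open XML package.
CP = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC = "http://purl.org/dc/elements/1.1/"


def read_core_properties(source) -> dict:
    """Read title, author, and keywords from a .docx/.xlsx/.pptx file.

    `source` may be a path or a file-like object. Hypothetical helper for
    illustration; assumes the conventional docProps/core.xml location.
    """
    with zipfile.ZipFile(source) as package:
        try:
            xml_bytes = package.read("docProps/core.xml")
        except KeyError:
            return {}  # the package carries no core-properties part
    root = ET.fromstring(xml_bytes)
    wanted = {
        "title": f"{{{DC}}}title",
        "author": f"{{{DC}}}creator",
        "keywords": f"{{{CP}}}keywords",
    }
    result = {}
    for label, tag in wanted.items():
        element = root.find(tag)
        # Missing or empty fields are reported as empty strings.
        result[label] = element.text if element is not None and element.text else ""
    return result
```

Calling `read_core_properties("report.docx")`, with `report.docx` standing in for one of your own files, returns a small dictionary with the three fields.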

There is even an embedded metadata group made up of photography and advertising organizations that has released the Embedded Metadata Manifesto. The five principles follow:
1) Metadata is essential to describe, identify and track digital media and should be applied to all media items which are exchanged as files or by other means such as data streams.
2) Media file formats should provide the means to embed metadata in ways that can be read and handled by different software systems.
3) Metadata fields, their semantics (including labels on the user interface) and values, should not be changed across metadata formats.
4) Copyright management information metadata must never be removed from the files.
5) Other metadata should only be removed from files by agreement with their copyright holders.
Thanks to embedded metadata in an image, a researcher looking at a food blog in 2015 will know who those seven attendees were from a scientific conference in 2005.
Metadata is also poignant, in that the photographer, Chip Clark, passed away in the summer of 2010. It is good to know that Chip's singular and brilliant vision can be credited to him, embedded in the photograph. Long live metadata!
"A file name can help if it is descriptive enough, such as '2012_01_09_Smithsonian_Final_Report.doc.'"
Excellent idea, but consider putting the date last instead of first:
Smithsonian_Final_Report_YYYY_MM_DD
Peter,
Thanks. Yes, that is just another way of doing it. Some prefer to have their files sort by the year in a directory listing under file name.
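As a quick illustration of the sorting point, here is a small Python sketch. The file names are made up for the example, apart from the one quoted from the post. An ordinary alphabetical sort puts date-first names in chronological order, while date-last names group by title first.

```python
# Hypothetical file names using the date-first pattern (YYYY_MM_DD_Title).
date_first = [
    "2012_01_09_Smithsonian_Final_Report.doc",
    "2011_11_30_Smithsonian_Final_Report.doc",
    "2012_03_15_Budget_Summary.doc",
]

# The same files using the date-last pattern (Title_YYYY_MM_DD).
date_last = [
    "Smithsonian_Final_Report_2012_01_09.doc",
    "Smithsonian_Final_Report_2011_11_30.doc",
    "Budget_Summary_2012_03_15.doc",
]

# A plain alphabetical sort, as in a directory listing ordered by file name.
by_date = sorted(date_first)   # chronological: 2011_11_30, 2012_01_09, 2012_03_15
by_title = sorted(date_last)   # grouped by title, then by date within each title

for name in by_date + by_title:
    print(name)
```

Either pattern sorts predictably as long as the date is written year first with zero-padded months and days.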

There are a number of commercial and free tools available. Many can be found by doing an Internet search for “batch photo metadata.” We don’t endorse specific tools here at the Smithsonian.
Thanks for helping to raise the awareness of others in the archives community about the value of embedded metadata! As one of the people behind http://www.embeddedmetadata.org/ I was glad to see the mention of this new site. As one of the authors of the original "Metadata Manifesto" I thought I would pass on a few other resources for those considering embedding metadata into their image files. The tutorials on http://www.photometadata.org/ show you how to add metadata using a variety of popular programs, and explain the underpinnings of the various methods used to store this information -- as well as a guide to how the various metadata fields are to be used.
For those wanting to dig deeper into effective ways of adding keywords/tags to images, see http://www.controlledvocabulary.com/ and the companion forum.
If someone wants to quickly see more than just Chip Clark's name, such as what is actually going on in this image, they can use the online metadata reader at http://regex.info/exif.cgi or install the toolbar widget. Note: to select the specific image, hover over the second image, Control-click (Mac) or right-click (Windows), and choose the option to "view image" first. Then either paste that URL into the reader at the URL above or click on the installed toolbar widget. For those interested who don't want to install anything, use the following URL to see the full metadata:
Enjoy!
David
Embedded metadata is fine and good for dissemination and access, but collecting institutions must really capture the information in the collection management or digital asset management systems as well, as it's -very- brittle, and largely invisible, and many, many processing tools do not correctly handle XMP.
The manifesto is nice and all, but a good archival system is one that isn't based on trust and also does not specify requirements it cannot enforce (filenames are terrible places for data because they are easy to change, filesystems impose limitations, processing tools can alter them, etc.).

Ryan,
All great points. We do use collections and digital asset management systems here at the Archives and throughout the Smithsonian. Every component needs to play its part.
Thanks for this interesting post. You’re right, of course, to call attention to the potential value – for the professional archivist and others – of technical, copyright and other kinds of metadata. I’m wondering about a further dimension to this. If you look at the digital landscape of our culture today, much of the ubiquitous imagery is highly photoshopped. Think of magazine covers, advertisements, movie posters and so on, where hardly any image today is “real”. Almost everything has been subject to extensive postproduction work. In addition to the traditional metadata, it seems that there also might be a value in preserving – as far as possible – the photoshop layers of culturally salient imagery.

Christian,
Very true about enhancements to imagery. As we know, many printed images have been subject to changes as well. Documentation by those making the "improvements" to the images is so important.

Hi all,
Another good resource that didn't make it into my initial posting is The Case for Implementing Core Descriptive Embedded Metadata at the Smithsonian.
Included metadata is very cool. Software making it easy to add metadata to photos is key. Copyright matters, so does other stuff. The focus on copyright in the 5 items seems excessive - and it does expire (even if Congress keeps taking from the public and giving new rights to private holders against the balance urged in the Constitution [I think it was the Constitution where they laid out the reason why allowing government to take rights from the public and grant those rights to an individual for a limited time to encourage the creation of such public goods]).
Certainly if 40% of the list wants to talk about copyrights it should mention that many choose to make material available to the public using more sensible means than the current copyright system - such as Creative Commons.
It is good to give credit to those that create work (photos, art, writing...). The current warped view of copyrights however, is doing great damage to our country.
Great article and comments. I have never really embedded metadata in my scans, and would like to start doing so. I'm wondering if completing the "Properties" fields in Windows Explorer is enough, or is it necessary to use other software? And how do people resolve the difference between the original image author and the creator of the digital object (scan)? Finally, how do you differentiate the fields "subject" and "tags"?
Thanks for any guidance!

Peter,
These are good questions. If there are enough fields for the user's needs in the Windows Explorer view to populate, then that should be fine. If you are looking for batch editing or additional fields, other software can be useful.
Consistency is key here. If the person who took the photo entered his/her name, then that data should suffice unless one determines it was placed in the wrong field. If data about a digital scan is needed, the Creator field can be used for that information and Credit field used for the photographer's or organization's name.
Tags are the same as Keywords. Different software can have different names for the same IPTC fields (http://www.iptc.org/site/index.html?channel=CH0099). Subject or Description in the photo with this posting is the caption.
Also see http://si-pddr.si.edu/jspui/bitstream/10088/9719/1/GuidelinesEmbeddedMetadata.pdf. It offers tips on embedded metadata for images and provides definitions of various fields.
Good luck,
Lynda
Produced by the Smithsonian Institution Archives. For copyright questions, please see the Terms of Use.