The Bigger Picture: Visual Archives and the Smithsonian
Preservation of Executable Files
As digital materials become more prolific in large institutions like the Smithsonian, one of the biggest challenges that archives face is figuring out how to preserve uncommon or obsolete computer files. One example is executable files. EXEs are, generally, programs. But it’s not quite that simple! They might be interactive games, video slideshows, or install drivers. As a result, they aren't as uniform as .doc or .jpg files. As part of my internship at the Smithsonian Institution Archives, I'm trying to figure out exactly what we should do with the executables in our collections, conducting research into the filetype and solutions for preservation, and examining the collections in order to create a plan.
As I went through them, I quickly learned that you never know what you're going to get! When I opened our executables, they ranged from databases to video games and, so far, date from 1985 to 2006. That's precisely what makes these files difficult to preserve.
In that sense, it’s worth asking: do we want them?
They may or may not be worth keeping, depending on your institution's goals. (For instance, Library and Archives Canada doesn't keep executables in its trusted repository.) At the Archives, I found slideshows that were once parts of exhibits: clearly important. I also found install files for programs for which we have updated versions: less important. Then there were files where we didn't even know what they were. Do we preserve them and hope that someday we will be able to figure it out? It seems to me that this has to be a case-by-case judgment call, based on the creation date, the files the executable came with, and whether any metadata is available.
How do we preserve them?
Executables are problematic. They're designed to work on a specific operating system with certain specifications.
A quick tip: Many executable files will not run off of servers. If you're getting the error "The parameter is incorrect," copy the file onto a single machine and try from there. You'll get a much higher success rate!
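The tip above can be sketched in a few lines of Python. This is only an illustration, not the Archives' actual workflow: the function name `copy_to_local` and the folder prefix `exe_review_` are hypothetical. It copies the whole folder that holds the executable, since some programs need the files that sit beside them.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_local(exe_path, local_root=None):
    """Copy the folder containing exe_path to local storage.

    Returns the path of the copied executable. If local_root is not
    given, a fresh temporary folder on this machine is used.
    """
    exe_path = Path(exe_path)
    if local_root is None:
        local_root = tempfile.mkdtemp(prefix="exe_review_")
    # Fall back to a generic name if the executable sits at the top of a share.
    folder_name = exe_path.parent.name or "copied_files"
    destination = Path(local_root) / folder_name
    # copytree refuses to overwrite an existing folder, which protects earlier copies.
    shutil.copytree(exe_path.parent, destination)
    return destination / exe_path.name
```

The returned path points at the local copy, so the file on the server is never opened or modified during testing.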
So, once we have identified the executables we want to keep, there are four options for those that can be opened successfully.
- Migrate: Migrating means that you update the information so that it runs on modern equipment. If you care about the content, but not about the original environment (processor speed, same operating system, etc.), this might be an option. You can update it so that it works for a while, but computers aren’t static. They’re going to keep changing. Ten years from now, will your future-tablet-PC-holographic-super-machine run a Windows 7 executable? What about twenty, thirty, forty years? It’s an issue archives are dealing with for all sorts of files, but with executables, the system specifications are so important that it is an even more pressing concern.
- Emulate: Emulating means that you create the proper environment for the file, and run it through there. As long as your emulation environment keeps working, you don’t have to worry about your executable becoming obsolete. But it takes a lot of computing power, time, and effort. And when you have 100 executables, most of which will need their own environments to run (especially in an archive that gets files from multiple institutions over a period of time), that might be beyond your resources.
- Extract Information: This is a great option if you don’t want to worry about the long term. If you can open the file, you might be able to extract all the content, assuming it isn’t too interactive. Moving images can become video files, sound can become audio files. Gain: drive space and preservation file format. Loss: functionality and interaction.
- Sit on It: If you cannot do what you want with the file, you can sit on it and hope that you find more metadata or figure out a way to get the information sometime in the future. However, you have to keep in mind that as technology develops, you move further away from the original environment. This is an option only if you think you’ll ultimately get more information about the file or figure out what you want to do fairly soon.
With the files I examined, we decided extracting the information might be the best option for the Archives at this point. Along with this plan, we document as much as we can about the EXE, including what it seems to be and the year it might have been created. Some of the information we are trying to keep doesn’t have to be in executable format. For instance, the slideshows can be turned into video files without losing intellectual content or experience. Sitting on these files won’t help anything, and if we wait, we might lose information. One of the nice things about trying to preserve rather than alter or create is that we don’t necessarily need the code of an executable; after all, we want to save the experience, not the programming.
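As a rough illustration of what documenting an EXE could look like, here is a minimal Python sketch that gathers a few basic facts about a file. The function name `describe_file` and the field names are hypothetical, not the Archives' actual record format. The one format-specific check relies on the fact that DOS and Windows executables begin with the two bytes "MZ"; the checksum is included on the assumption that you would want to confirm later that the file has not changed.

```python
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def describe_file(path, note=""):
    """Collect basic facts about a file for a documentation record."""
    path = Path(path)
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # DOS and Windows executables start with the two bytes "MZ".
        header = handle.read(2)
        digest.update(header)
        # Read the rest in chunks so large files do not fill memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    stat = path.stat()
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "name": path.name,
        "size_bytes": stat.st_size,
        "last_modified_utc": modified.isoformat(),
        "sha256": digest.hexdigest(),
        "starts_with_mz": header == b"MZ",
        "note": note,  # free text, e.g. what the program seems to be
    }
```

The last-modified date is only a hint about when the file was created, since copying a file between systems can change it.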
Related Collections
- Smithsonian Office of Education, Smithsonian Online Records, 1991-1997, Accession 97-136, Smithsonian Institution Archives
- National Zoological Park, Office of Public Affairs, Subject Files, 1977-2003, Accession 07-023, Smithsonian Institution Archives
- National Museum of Natural History, Office of Education, Productions, 1996-2000, undated, Accession 11-014, Smithsonian Institution Archives
Related Resources
- Virtualization for Preservation of Executable Art, Kam Woods, Indiana University, Department of Computer Science, 2008
Comments (15)
When deciding whether or not to keep files that will not open, what are some criteria that you consider?
Hello Rachel,
We have not created exact guidelines for what to keep, but there are a couple of things to think about. For instance, what else came with the file? If it was with exhibit materials, it might be related to the exhibit. On the other hand, it could have been on a CD from a former employee's papers that might not relate to the goals of the Smithsonian. Some executable files might have been on the digital media even before the relevant files were added (remember how old AOL CDs used to come with other programs on them, or games would come with trial versions of other games?).
Another thing to consider is what else is on the drive. You might be able to look at a different file and learn something about the computer from which the files came. For instance, files can give you a range of dates (from their creation date or last modified date) that you know the drive was in use. Files might only open in certain programs, or they might reveal what operating system was used to create them.
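The date-range idea can be sketched in Python. This is a minimal example under the assumption that the drive's contents are available as an ordinary folder; the function name `modified_date_range` is hypothetical. It walks every file under a folder and reports the earliest and latest last-modified times it finds.

```python
import os
from datetime import datetime, timezone
from pathlib import Path


def modified_date_range(root):
    """Return (earliest, latest) last-modified times for files under root.

    Returns None when no readable files are found.
    """
    timestamps = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                timestamps.append((Path(dirpath) / name).stat().st_mtime)
            except OSError:
                # Skip unreadable entries rather than stopping the survey.
                continue
    if not timestamps:
        return None
    earliest = datetime.fromtimestamp(min(timestamps), tz=timezone.utc)
    latest = datetime.fromtimestamp(max(timestamps), tz=timezone.utc)
    return earliest, latest
```

Treat the result as a clue rather than proof: timestamps can be reset when files are copied from one medium to another.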
One more thing to note is that the Archives keeps the original media. While there is a chance of data loss over time, files that are not kept on the servers might be able to be retrieved from the media if new information is found.
While those examples are by no means comprehensive, I hope that gives you a little insight into what might be considered when deciding whether or not to keep a file that cannot be opened.
Thanks,
Jessica Schaengold
It will be interesting to live several hundred years from now when archeologists are studying us. Since we have so much digital information, will it remain for future generations or be more easily lost? With digital imaging, we take pictures and share them online, but people rarely print those photos. The same can be said of digital messaging. Future generations won't likely find as many letters, diaries, and photos to learn of their ancestry as prior generations had. Clearly historians and museum curators are also wondering how to deal with this dilemma of preserving non-tangible items.
Michael,
This is definitely a problem that archivists are trying to contend with. One of the things that archivists do is migrate electronic files into formats that are more likely to last longer. For instance, when we get JPG images, we transfer them to TIF. JPG image files are compressed; they take up less space, but they also lose little bits of data when they are compressed. TIF files, on the other hand, are very big, but that's because they are not compressed in a lossy way, so they do not lose data when they are re-saved. This doesn't solve every issue - you still need a working computer that can read the files! But hopefully we will be able to change the formats over time so that they stay up-to-date, and won't be lost to future generations.
If you are interested in making sure your own files are not lost over time, the Library of Congress has resources for personal digital archiving here: http://www.digitalpreservation.gov/personalarchiving/.
Hopefully people will hold onto their digital documents (like love emails instead of love letters), and make sure they are in a format that will remain accessible and readable, and not just on websites like Facebook and Flickr!
Digital communication has changed the way we function as a society. If we continue to use digital channels (yes, that's a big if!), archivists will continue to do what we can to make sure the information is preserved for future generations.
Of course, we're still figuring things out - there's a lot to discuss and debate.
Thanks for your thoughts!
Jessica Schaengold
EXE files can be malicious programs! Treat with extreme caution if you don't know what they are.
Gary,
That's very true! If you're not sure exactly what the executable file is, be sure to run a virus scan first. Many viruses have the file extension .EXE. One of the very first things we do when we get new digital files is run a virus check.
Thanks for pointing this out! It is something for everyone to keep in mind whenever you are working with computer files!
Thanks,
Jessica Schaengold
Rachel,
I am glad to learn that you are working on the issue of executables. About a decade or so ago I did a study for the SIA (Edie Hedlin) in which I identified interactive HTML web pages (part of museum exhibits) as a major challenge for the Smithsonian Institution Museum exhibits. In most cases migration is effective in mitigating the effects of technology obsolescence for static digital content. It is not as effective for dynamic, interactive digital content (e.g., a virtual museum exhibit), especially where it is important to capture the "behavior" of that content. Emulation is a potential solution to this issue, but I have not seen very much substantive movement in this direction other than a couple of prototype demonstrations. Absent any commercialization, I suspect that emulation will remain an undeveloped access tool. In the meantime, my view is to continue the extract process and see how well it works, and to complement it with keeping alive the bit stream(s) underlying the dynamic digital content.
One last matter: do you have a written report that describes this project that can be shared with me?
Many thanks.
Charles Dollar
Thank you, this is very informative! We have some EXE files that we're currently sitting on and any information I can gather about them is welcome.
Susan,
I'm glad I could help! I have two main pieces of advice. First, try to open executables from different machines. If the file won't open on the server, move the file and the files it came with (sometimes it needs the entire directory to work) onto a hard drive, and try from there. Second, if that doesn't work, try opening the file in a basic text reader like Notepad or WordPad. While you won't be able to interpret most of the file, there might be bits of readable text mixed in that can provide some insight.
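If scrolling through an executable in a text reader is too tedious, the same idea can be approximated with a short Python sketch. The function name `readable_strings` and the default minimum length of six characters are assumptions chosen for illustration. It pulls out runs of printable ASCII characters and ignores everything else.

```python
import re


def readable_strings(path, min_length=6):
    """Return runs of printable ASCII, at least min_length long, found in a file."""
    with open(path, "rb") as handle:
        data = handle.read()
    # Bytes 0x20 through 0x7e are the printable ASCII characters, space to tilde.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]
```

This version reads the whole file into memory and only finds plain ASCII text. Windows programs often store text in a two-byte encoding, which this sketch would miss, so an empty result does not mean the file contains no text.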
Thanks,
Jessica Schaengold
One nasty problem you will come across with programs for personal computers is copy protection, serial numbers, dongles, etc. It would be nice if collecting institutions began compiling a list of the protection mechanisms (if any) that are needed to make executables run. Failing to preserve the protection information will make it much more difficult, if not impossible, to run protected programs in the future.
Al Kossow
Software Curator
Computer History Museum
Al,
While protections, serial numbers, etc. are always a potential problem, the files we've worked with so far have not had these issues. We always make sure to note any information about the files that is needed for the files to properly run.
Loss of metadata information is an issue that archives have to work around with digital files. As archives accession and process digital files, it is very important to gain as much information about the files as possible, and preserve any and all information necessary for future access to the file.
Thanks for your thoughts! There are many issues around executable files, and we are still trying to figure out the best way to handle them.
Thanks,
Jessica Schaengold
Jessica,
Good information on the issue of executable files. We have quite a few EXEs that are stored locally and are recognized by several operating systems, but as we incorporate new and updated software for both the PC and Mac platforms, we are running into problems. We've taken precautions by purchasing virus protection software, but as you know, the structure of EXE files makes them prone to malicious routines. The best results we've had (so far) have come from opening our EXE files on our Linux server. Plus, our IT team has had success with some EXEs by using a hex viewer and converting to PDFs. You can then open the PDFs in Illustrator if needed. With the number of files the Smithsonian has, opening, converting and saving them seems like a time-consuming and involved task, but it will be beneficial - not to mention nostalgic - for people to see this outdated technology in the years to come.
Derek Thomas
Derek,
Thank you! Yes; for collection archives, it is crucial to differentiate malicious executable files from relevant executable files. Converting directly to PDF is an interesting concept for viewing and preserving executables. That can be a great way to capture information about the executable and what it is meant to do, as well.
The Archives collects the files from museums about their exhibits. As we move into the digital age, that means collecting information about videos, programs, and the integration of digital materials into these exhibits. Some of the digital files, like documents and images, can be put into an automated process to meet preservation standards. Executable files, however, cannot - as a result, they are indeed very time-consuming! Ultimately, it is worth it, for both the history these files contain and, as you said, the nostalgia. In fact, the Smithsonian American Art Museum currently has an exhibit called The Art of Video Games (http://americanart.si.edu/exhibitions/archive/2012/games/) that, in part, focuses on the changing technology of video games and the nostalgia around older video games.
Thanks,
Jessica Schaengold
Hi Jessica,
Thank you for the reply to my post. I am glad that you found my mention of the PDF approach to be interesting. You will most likely need to test different ways of EXE to PDF conversion as some might end up looking like a bunch of numbers and letters, and not your images. Also, depending on how the EXE files are configured, you might be able to save them with the help of some e-book software systems. I haven't tried them myself, but I've read some articles online that indicate it works in some cases.
Another method that you can try with Photoshop is to save your EXE files as JPEGs. I've been able to do this with some success. There are some great step-by-step instructions online for Mac and PC platforms.
Lastly, the link you posted to The Art of Video Games is so nostalgic. I watched the preview and saw some information indicating that Pac-Man, Space Invaders and Donkey Kong will be featured. That takes me back in time... when video games were simple and safe for children to play.
All the best,
Derek Thomas
Produced by the Smithsonian Institution Archives. For copyright questions, please see the Terms of Use.