Over the years, we have written about our digitized and born-digital materials, including images, video, architectural drawings or CAD, and websites. We have not said much about the preservation of word-processing documents, though. Most people do not find them as exciting as an image of a Smithsonian event, a drawing of the plans for a museum, or an animal video from the National Zoological Park, so they can get overlooked as something that needs digital preservation. However, consider all the typing we do on a computer, either at work or at home. Some of those digital documents do have long-term value.
The advent of computer word processing caused a revolution in the business world. No longer did one have to use Liquid Paper or correction tape to fix mistakes made while using a typewriter (electric or manual).
Word-processing software as we think of it today dates back to the 1970s. Prior to that, word processing was considered more of a business process for making work more efficient through procedures and machines.
The Archives has documents from across the Institution in various versions of WordStar, XyWrite, WordPerfect, and Microsoft Word for both PC and Mac. These files include press releases, memos, and photo captions. Some of these files are 30 years old, and the software that originally created them no longer exists.
In some ways these files are easier to preserve or convert to another format because they are not as complex as an image file or a website. If you have an older file and cannot read it, here are three ways you might be able to access your document. Make sure to work on a copy of the file, not the original, before proceeding.
- Try using viewer software that can read older word-processing files. Some of these programs can even tell you the version of the software program, e.g., MS Word 8 vs. MS Word 14. Keep in mind the font and display may be different from the original. Search the web for “file viewing software” for possible options.
- Try opening it as a text file in a text editor. In some cases you can also figure out if the file is WordPerfect, Microsoft Word, or something else. See image examples. Older WordPerfect files have WPC at the beginning of the file. Microsoft Word files have Word and version information at the bottom of the file. Additional coding (like font and printer information) also might appear in the text file view. When dealing with decades-old documents, a file with a .doc extension might just mean it is a document and not an MS Word file. Other “extensions” we have seen in the Archives, on files from the 1980s, are .let for letter and .mem for memo.
- Try opening it with current word-processing software even if it is a different program. Keep in mind that different software can render files in different ways.
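
The text-editor check described above can also be scripted when there are many files to sort through. The following is a minimal sketch in Python, not a tool used by the Archives: the function names and the 4096-byte window are assumptions made for illustration, and the exact strings searched for at the end of a Word file ("Microsoft Word", "Word.Document") are guesses at the kind of "Word and version information" mentioned above. It only reads the file, but it is still safest to run it on a copy.

```python
from pathlib import Path


def guess_word_processor(data: bytes, window: int = 4096) -> str:
    """Guess which program wrote a document from its raw bytes.

    Rough heuristic only (hypothetical helper, not an official tool):
    it mirrors what you would look for by eye in a text editor.
    """
    head = data[:8]          # first few bytes of the file
    tail = data[-window:]    # last `window` bytes (whole file if shorter)

    # Older WordPerfect files carry the letters "WPC" at the very start.
    if b"WPC" in head:
        return "WordPerfect"

    # Microsoft Word files tend to name the application near the end.
    # These two marker strings are assumptions; real files may differ.
    if b"Microsoft Word" in tail or b"Word.Document" in tail:
        return "Microsoft Word"

    # The extension alone (.doc, .let, .mem) is not evidence of a format.
    return "unknown"


def guess_from_file(path: str) -> str:
    """Read a file (ideally a copy) and guess its word processor."""
    return guess_word_processor(Path(path).read_bytes())
```

Calling `guess_from_file("copy_of_memo.doc")` (a made-up file name) returns one of "WordPerfect", "Microsoft Word", or "unknown". An "unknown" result simply means neither marker was found, so viewer software or a manual look in a text editor is still the next step.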
If you are working on something very important, you should also consider saving the file in PDF or PDF/A (the A stands for archival). This is a good step, especially when you want to preserve the look and feel (layout, fonts, etc.) of the document and not rely on proprietary software. These are the best practices we follow at the Archives.
PDF/A files are harder to create, though, because of certain requirements: no encryption, no audio or video content, and fonts that must be embedded within the document. Some proprietary fonts cannot be embedded. Another option is to migrate the document by saving it in a newer version of the software, if available.
Happy digital preservation in 2016!
A Peek into an Electronic Records Archivist’s Toolbox, The Bigger Picture Blog, Smithsonian Institution Archives