Help me convert 20-year-old word processing files into a current format
September 13, 2014 1:34 PM

I have a bunch of old word processing files created in the early to mid 1990s. I can no longer open them without turning them into a jumble of unformatted text and symbols, and I'm hoping there's some kind of amazing program that can both identify old file types and then convert them to a current program. Details inside.

The files were created on early/mid-90s PCs running DOS and then Windows (the main one being a Toshiba T1000LE, with its oh-so-amazing 20 MB hard drive).

I don't know what word processing programs were used. I would have suspected early versions of Word and/or WordPerfect, but when I try changing the file extensions to things like .wpd and .doc it doesn't help. (Unfortunately when I saved the files back in the 90s I didn't quite understand the function of file extensions and deleted them all.)

I now have a Mac running OS X 10.9.4. I can sort of open the files in Word for Mac 2011, going via a pop-up box that says "Convert file from" which gives me a bunch of options like "MS-DOS Text" and "Unicode File." The results aren't pretty, and lose a lot of formatting and add tons of random characters.

I could manually clean up the files one by one and save them anew, but ideally there is some magic program that will tell me what program made each file and then convert beautifully to Word for Mac 2011.

Help? Thanks!
posted by bassomatic to Computers & Internet (24 answers total) 18 users marked this as a favorite
Response by poster: btw, along with the files I want to convert I have lots of equally old backup files with a .BK! extension, which are obviously backup files, but from what word processing program I don't know.
posted by bassomatic at 1:36 PM on September 13, 2014

Best answer: OpenOffice is one. Libreoffice is another.

See if either of these fit the bill.
posted by Ruthless Bunny at 1:44 PM on September 13, 2014 [1 favorite]

If you can post a sample file somewhere (like on dropbox or something) you'll have a much better chance at someone being able to help you. There were a myriad of word processors available at that time; Word and WordPerfect were popular, but they were by no means ubiquitous.
posted by Aleyn at 1:44 PM on September 13, 2014 [1 favorite]

Word doesn't automatically install all of its converters when you install it. So it may be able to recognize and convert it, but the converter just isn't installed. And I don't know if the OS X version comes with the same converters as the Windows version. @Aleyn has it right. You'd need to post a sample file to give people a chance to assist.
posted by cnc at 1:47 PM on September 13, 2014

Response by poster: Alas, the files all contain personal information so I can't share them, although I know that would help immensely. Ruthless Bunny, I've tried OpenOffice; I'll give LibreOffice a shot too.
posted by bassomatic at 2:14 PM on September 13, 2014

Best answer: Open it in TextEdit or BBEdit and look at the first line or two, there may be a clue to which program it is.
posted by Sophont at 2:42 PM on September 13, 2014 [1 favorite]

Copy one of the files, rename it to be [filename].chk and run one or both of these apps over it. They're designed to recover the filetypes of files found when running scandisk or chkdsk, which automatically add the .chk suffix.

You might get lucky and find that one of the tools can pull the extension out of the file for you. Googling that will hopefully lead you to an app that can open the file correctly.
posted by Solomon at 2:43 PM on September 13, 2014 [1 favorite]

Ah, you have a Mac. Scratch that idea then. I can't imagine there are many apps written for Mac that will do such a niche task.
posted by Solomon at 2:45 PM on September 13, 2014

Best answer: From the command line, type "file -k filename". File's been around for decades, and contains the DNA of hundreds of long-dead file formats. You might get lucky.
posted by Leon at 3:00 PM on September 13, 2014 [3 favorites]

Best answer: If you're comfortable using the terminal you could try the file command to find what format the files are in. The GNU tools might recognize even more formats.

If that doesn't work you could go through the Wikipedia list of old word processors and see if anything jumps out I guess.
posted by Baron Humbert von Gikkingen at 3:02 PM on September 13, 2014

Paste the output of the first page of xxd file.dat | more into a <pre> section here, making sure no obvious private text is readable. If file can't handle it, maybe we can.

It probably uses upper bits shifted to indicate control codes, hence the line noise look.
posted by scruss at 3:09 PM on September 13, 2014 [1 favorite]
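
If xxd isn't available, a hex dump of the first few hundred bytes can be approximated in Python. This is a minimal sketch, not a replacement for xxd: the function names `hexdump` and `dump_file_head` are made up for illustration, and the column layout only loosely imitates xxd's.

```python
import sys


def hexdump(data: bytes, width: int = 16) -> str:
    """Format bytes as offset, hex columns, and printable ASCII."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3 - 1}}  {text_part}")
    return "\n".join(lines)


def dump_file_head(path: str, count: int = 256) -> str:
    """Read only the first `count` bytes so private body text stays out of view."""
    with open(path, "rb") as handle:
        return hexdump(handle.read(count))


if __name__ == "__main__" and len(sys.argv) > 1:
    print(dump_file_head(sys.argv[1]))
```

Reading in binary mode matters here: a text editor may reinterpret or drop bytes, whereas this prints each byte exactly as stored.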

I've had fairly good luck using Zamzar to convert ancient/unknown format files. You might try dropping one of the files on there and tell it to convert it to something like an .rtf file and see what happens.
posted by Thorzdad at 3:55 PM on September 13, 2014

Response by poster: Opened a file from 1991 in TextEdit; here's what appears before actual text:


(there are actually several lines of symbols after that, but I can't seem to paste them here.)

Here's another:

posted by bassomatic at 4:32 PM on September 13, 2014

WPCG and WPC4 seem likely to be WordPerfect files of some sort, but I can't find a confirmation of that easily online. What are the extensions other than BK!?
posted by dttocs at 4:52 PM on September 13, 2014

Best answer: bassomatic, what does "file -k" give you? What's the output of scruss's command? TextEdit will chew up the bytes in ways xxd won't.

There's a chance that the WPC4 file is a WordPerfect Graphics file. Copy it to test.wpg and see if ImageMagick can handle it.
posted by Leon at 4:56 PM on September 13, 2014

Response by poster: Leon! Took me a minute to figure out how to use the file -k command in the terminal, but bingo: (Corel/WP).

Now at least I know what kind of file I'm dealing with.

Bonus thing I learned: You just drag a file to the terminal window and it automatically writes out the file path.
posted by bassomatic at 5:06 PM on September 13, 2014 [1 favorite]
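
What file does can be roughly imitated by checking a file's first bytes against a known signature. The sketch below assumes only one signature, the `\xffWPC` prefix that WordPerfect 5.x and later documents start with; the names `looks_like_wordperfect` and `sniff` are hypothetical, and the real file utility knows far more formats than this.

```python
WP_MAGIC = b"\xffWPC"  # leading bytes of WordPerfect 5.x and later files


def looks_like_wordperfect(head: bytes) -> bool:
    """Return True if the given leading bytes carry the WordPerfect signature."""
    return head.startswith(WP_MAGIC)


def sniff(path: str) -> str:
    """Read the first few bytes of a file and report a best guess."""
    with open(path, "rb") as handle:
        head = handle.read(16)
    if looks_like_wordperfect(head):
        return "WordPerfect (probably)"
    return "unknown"
```

A file that comes back "unknown" is not necessarily damaged. As far as I know, WordPerfect 4.x files carry no such signature, so very old documents may simply fail this check.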

Response by poster: Thanks everyone! I've now deduced that the files were created in an old version of WordPerfect, and after changing the extensions to .wp (or .wpd, both seem to work) I can open the files using OpenOffice. There's still a line or two of symbols at top, but I can delete that and then save as a Word document.

Two of the very oldest files created in 1991 render mostly junk, so it's possible they're corrupted. Considering they've been copied over the years from floppies to CDs and to various hard drives, I'm not too surprised.

You have all just helped me cross a big-ticket item off my to-do list. Thank you.

Bonus question: Is there any file format that's more likely to be readable in 20 more years? Or is the secret to keep saving in new formats every so often?
posted by bassomatic at 5:56 PM on September 13, 2014 [2 favorites]

Is there any file format that's more likely to be readable in 20 more years? Or is the secret to keep saving in new formats every so often?

[I'm a software developer, so I am somewhat qualified to answer thusly:] Open formats (as opposed to proprietary) are more likely to be readable in 20 years, though I doubt that the knowledge of how to read any widely-used or commercial formats will be wholly lost to humanity in the next couple of decades. The advantage of using an open format (for you) is that the libraries for reading/writing them are generally easily obtained, thus anyone making text processing software is more likely to support an open format than a proprietary one. All other things being equal, go with whatever is the bare minimum for your needs, if your sole concern is archiving. Plain text is the most universally understood text format there is, but if you also need font/typesetting controls, maybe you can get away with just rich text format? Word and its cohort have now moved on to XML-based file formats (rather than the older binary formats), which should be more easily readable for a long period of time, so I wouldn't really worry too much about it.
posted by axiom at 6:06 PM on September 13, 2014

Is there any file format that's more likely to be readable in 20 more years? Or is the secret to keep saving in new formats every so often?

Seconding axiom for the idea of using an open format. The one that seems most promising at this time is the OpenDocument standard.

An added bonus is that it is the standard format used in OpenOffice and LibreOffice. A suggestion that may help to make your files more future proof: Save a copy of the LibreOffice program with the files. Save both a Windows and a Linux version on your storage medium of choice (it is an entirely different Ask Mefi to determine what storage medium is best for an expected 20 year span).

Chances are good that some means of running the program can be found even a couple of decades later, likely through some sort of virtual environment. It might be helpful to save a disk image of a current linux installation (Linux Mint or Ubuntu) as well.

(It might be safer and easier to revisit your files every five years or so, convert them to the most current OpenDocument version, and include current versions of whatever open source software and OS are available at the time. You can also change to whatever storage media appear likely to still be common five years later. A few hours of work every 5 years is worth it if you want to ensure access in 20 years.)
posted by 1367 at 7:44 PM on September 13, 2014

Damn! I was hoping for Wordstar.
posted by InsertNiftyNameHere at 11:06 PM on September 13, 2014 [1 favorite]

The strings command will give a dump of all plaintext in the file so an incantation like

strings filename | more

will let you page through to see if you can recognise anything (press space to get next page, q to quit).

Plain text is the most futureproof, maybe markdown format where the intent is clear, XML if you really must. Open projects don't seem to have very long lifespans. When the demand for dedicated word processors vanishes then people will stop working on these open projects and go on to more fashionable things.
posted by epo at 5:11 AM on September 14, 2014
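
For anyone without strings to hand, the same idea can be sketched in Python: scan the raw bytes and keep runs of printable ASCII. The function name `printable_runs` and the default minimum of 4 characters are assumptions for illustration. The real command has more options and may handle other encodings.

```python
import re


def printable_runs(data: bytes, min_length: int = 4) -> list:
    """Return runs of printable ASCII at least `min_length` characters long."""
    # Bytes 0x20 through 0x7e cover space through tilde.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def printable_runs_in_file(path: str, min_length: int = 4) -> list:
    """Apply printable_runs to the whole contents of a file."""
    with open(path, "rb") as handle:
        return printable_runs(handle.read(), min_length)
```

Short runs are dropped because random binary data often contains two or three printable bytes in a row by accident.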

Response by poster: For anyone else facing the same challenge: I just downloaded WordPerfect Viewer for Mac and it's perfect. Once I added the appropriate extension to my ancient WordPerfect files, I could do a fast drag-and-drop file conversion, turning them into clean files that I can save in a variety of formats, including Word or simpler rtf and text files. I'll opt for the latter to make future readability easier.
posted by bassomatic at 6:50 AM on September 14, 2014
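
Adding the extension by hand gets tedious with many files. A rough sketch of doing it in bulk follows, assuming the files have no extension at all and start with the `\xffWPC` signature used by WordPerfect 5.x and later. The function name `add_wp_extension` and the `dry_run` parameter are hypothetical, and it is safest to run this on a copy of the folder.

```python
from pathlib import Path

WP_MAGIC = b"\xffWPC"  # leading bytes of WordPerfect 5.x and later files


def add_wp_extension(folder, extension=".wpd", dry_run=True):
    """Append `extension` to extensionless files that start with WP_MAGIC.

    Returns a list of (old_name, new_name) pairs. With dry_run=True
    nothing on disk is changed, so the plan can be reviewed first.
    """
    planned = []
    for path in sorted(Path(folder).iterdir()):
        # Skip folders and anything that already has an extension (e.g. .BK!).
        if not path.is_file() or path.suffix:
            continue
        with path.open("rb") as handle:
            if handle.read(len(WP_MAGIC)) != WP_MAGIC:
                continue
        target = path.with_name(path.name + extension)
        if target.exists():
            continue  # never overwrite an existing file
        if not dry_run:
            path.rename(target)
        planned.append((path.name, target.name))
    return planned
```

Calling `add_wp_extension("some_folder")` only reports what would be renamed. Passing `dry_run=False` performs the renames.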

Glad you found out what it was. Maybe your really old files are in a different format?

As an intermediate archival format, and one that can be written by WordPerfect Viewer, I'd recommend docx. Plain text loses formatting, and if there's anything other than plain ASCII (A-Z, a-z, 0-9, very basic punctuation) it'll get converted into who-knows-what. Line endings are a problem too for plain text; some archaic Mac software (cough Microsoft Office cough) still believes that Macs use CR as end of line, and this will mess your shit up.

docx (but not doc; don't use that; it's complicated), at the very lowest level, is a zip file with an XML representation of the document. Both of those are defined by open standards, so you'll get your text out somehow with at least as much formatting as plain text.

Don't use RTF. Although mostly text, it hasn't aged well, and is stuck in the era of 8-bit character sets. Legacy software (which sadly includes the delightful Protext) writes perfectly valid RTF which almost all modern software will misread.
posted by scruss at 10:21 AM on September 14, 2014
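
The point about docx being a zip file with XML inside can be seen with nothing but the Python standard library. This is a minimal sketch that pulls paragraph text out of `word/document.xml`; the function name `docx_paragraphs` is made up here, and it ignores formatting, headers, footnotes, and everything else a real reader would handle.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by WordprocessingML elements such as w:p and w:t.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source):
    """Return the text of each paragraph in a .docx file.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across one or more w:t elements.
        text = "".join(node.text or "" for node in para.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

Even if every word processor that understands docx disappeared, a zip reader and an XML parser would still be enough to get the words back, which is the archival argument being made above.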

Have you ever wanted to convert files without the need to download software?
posted by theora55 at 9:15 PM on September 14, 2014
