A "digital rosetta stone" for translating obsolete computer files

beschizza · March 13, 2018, 12:39pm

Originally published at: https://boingboing.net/2018/03/13/a-digital-rosetta-stone-fo.html

…

johnson · March 13, 2018, 1:50pm

Years ago I had a client who was hanging on to an ancient Apple computer that had some MacWrite (I think) formatted file on it. He held onto the computer so that if someday he needed to view that file, he COULD.

He had spent many hours and maybe a hundred dollars trying to convert it to something that would work on his current Mac.

Me: What is on the file?
Him: Just an old list of Club Members
Me: Are you ever going to change the info?
Him: No. I just want to be able to see it at some time. I don’t want it lost forever.
Me: Why don’t you just print it out and then get rid of the computer?
Him: Oh…I hadn’t thought of that.

micah · March 13, 2018, 1:56pm

I have an old Mac that I keep just for Quicken 2007.

comedian · March 13, 2018, 2:20pm

For time-travel money laundering?

Dennis_el_Campesino · March 13, 2018, 3:01pm

Ummm… Couldn’t you just use a virtual machine? Unless there is some obscure hardware level interaction, there’s no reason for this, especially on something as new as Windows NT.

philipstorry · March 13, 2018, 3:15pm

Good luck with that. We didn’t even have a good digital Rosetta Stone back in the day…

I remember having file conversion software back in the early 90’s, the days of DOS 5 and Windows 3.1. There was still a competitive software ecosystem back then - people still using WordStar, WordPerfect, IBM’s DisplayWrite, Lotus 1-2-3, SuperCalc, Lotus Symphony, and more. Hell, this was a time when your graphics might come in PCX, BMP, GIF, LBM, TIF, and plenty of others.

Since then we’ve had something of a monoculture occur. Yes, there are plenty of people using software like LibreOffice, but it’s a minority. Most things produced are going to be in Microsoft Office formats.

Back then, when you needed to get your files converted so that you could get your Deluxe Paint II art into your mate’s WordStar document, it wasn’t peachy. The only real way to ensure full fidelity was to run the original software - you often ended up looking for a format that two applications mutually supported, and would use that as a workaround. Good third-party format conversion software was both expensive and imperfect.

And here’s why I wish them luck - things haven’t actually improved, even with a monoculture. Even today I see old corporate documents that were produced a decade (or three versions) ago, and Word seems to have odd little issues with them. Backwards compatibility for Office isn’t as good as you’d think it is.

If you want the general gist of it, a digital Rosetta stone will be OK. But if you want to see it as it was intended, you should go to the original environment it was produced in.

If that means maintaining a fleet of VMs - well, you only need to build a DOS 5/Windows 3.1/Office 4.3 machine once. Then you can just clone it and keep using it again and again…

anon41912231 · March 13, 2018, 3:21pm

Yes, I think that emulating and virtualizing where possible is the solution. Some of those expensive CAD programs use those annoying hardware dongles (HASP?); I could see that requiring actual physical hardware.

What the article described, successively converting the file from one version to the next until it works in the current version, really scared me, because how do you know that all the information was indeed transferred correctly?

How could you ever know if anything except for the original code is going to interpret the document the way the original author at least intended it to be? So shouldn’t you try and run the original code when possible?

Boundegar · March 13, 2018, 4:00pm

I have a powerful solution to almost all of these issues, spelled ASCII. No matter how advanced my software is, ASCII still works.

Kind of sucks for video, I’ll admit.

LurkingGrue · March 13, 2018, 4:28pm

I recently converted a ton of documents using a Windows 95 VM and running a very old copy of Word. I would say it isn’t rocket science but I’ve been in IT and I realize for most people it would be rocket science.

Dammit! All my old code is in EBCDIC and PETSCII.

jandrese · March 13, 2018, 4:38pm

Converting between EBCDIC and ASCII is relatively easy. iconv has all of those encodings. PETSCII however is more challenging. I’m not sure there are even Unicode codepoints for all of the PETSCII drawing characters.
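For anyone who wants to try the EBCDIC side without iconv, here is a minimal sketch using Python's built-in codecs instead. It assumes the data is in code page 37 (`cp037`), which is only one of several EBCDIC variants; the function names are made up for illustration.

```python
# Minimal sketch: EBCDIC <-> text using Python's built-in "cp037" codec.
# Assumption: the source data really is code page 37. Other EBCDIC
# variants (cp500, cp1140, ...) need a different codec name.

def ebcdic_to_text(data: bytes, codepage: str = "cp037") -> str:
    """Decode EBCDIC bytes into a Python (Unicode) string."""
    return data.decode(codepage)


def text_to_ebcdic(text: str, codepage: str = "cp037") -> bytes:
    """Encode a string as EBCDIC bytes; raises UnicodeEncodeError
    if a character has no place in the chosen code page."""
    return text.encode(codepage)


if __name__ == "__main__":
    raw = b"\xc8\xc5\xd3\xd3\xd6"        # "HELLO" in code page 37
    print(ebcdic_to_text(raw))
    print(text_to_ebcdic("HELLO") == raw)
```

Picking the wrong code page will not raise an error for most bytes; it will just quietly give you the wrong punctuation, so it is worth spot-checking the output.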

Trogdor · March 13, 2018, 4:50pm

Still have a Star Trek game for TRS-80 on cassette tape somewhere…

Tom_Rombouts · March 13, 2018, 4:52pm

Just to add a bit of history, back around 1990 two DOS / Windows products that attempted to convert files from one format to another were Data Junction and CrossFile.

anon34399329 · March 13, 2018, 5:36pm

Came here for this, leaving satisfied. That S stands for standard.

Wait, no, I’m not leaving yet. Not till I mention that I’ve got an old thesis written on a CP/M VT180 in a word processor called Select that I’d like to get converted some day (the Rosetta software will need to accept 5 1/4" floppies, though)…

Jorpho · March 13, 2018, 7:12pm

I imagine what it might be like to throw the entire computing power of the USS Enterprise at an ancient binary file – let some monstrously powerful AI reconstruct the cultural gestalt at the time of the document’s creation, developing an approximation not just of all the contemporary coding paradigms but also trends in graphic design, going so far as to interpret the document not just the way the original author intended, but even how the original author might have preferred it to be interpreted if released even slightly from the technical limitations of the time.

anon41912231 · March 13, 2018, 7:43pm

Well it would be a start.

d_r · March 13, 2018, 7:58pm

Another problem we had back in the day was that there was no single 5 1/4" floppy standard. There were some conversion programs out there – I wrote one that let my Morrow read disks from a machine on campus – but it exacerbated the ‘moving files between systems’ problem.

armozel · March 13, 2018, 8:09pm

It’s articles like this that make me wonder if a new kind of programmer will come about as a consequence of all this. One that doesn’t really create new software but rather writes programs to access lost but valuable data. Sorta like archaeology but for data. I know we keep adding new frameworks and libraries all the time but look at how C++ and C keep chugging along. It’s inevitable that either programmers will not be able to easily access the data directly and thus need virtualization to even load a file or we’ll have to do other kinds of tricks to interact with ancient technology. It’s really weird to consider, honestly.

nixiebunny · March 13, 2018, 8:49pm

This discussion so far is entirely about 80x86 processors. I started in this business before these machines existed, and data were stored in myriad physical formats on floppies of two sizes, with from 8 to 26 sectors per track, or on paper tape (thankfully mostly in ASCII), or on one of a dozen incompatible cassette tape formats.

My brother was able to write a program for a modern computer to accept the cassette tapes from his homebrew 6800 c.1977 and do some DSP to extract the ones and zeroes. It took several hours.

Medievalist · March 13, 2018, 9:58pm

It’s reasonably easy to go from ASCII to EBCDIC, as long as you didn’t use any characters that aren’t in typical mainframe code pages (like, for example, curly braces) but it’s pretty hard to go the opposite direction. Even if you know you’re going to code page 37, you’re still trying to stuff eight EBCDIC bits into seven ASCII bits.
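A rough way to see where the EBCDIC-to-ASCII direction goes wrong is to decode each byte and flag the ones that land outside 7-bit ASCII. This is only a sketch: it assumes code page 37 via Python's built-in `cp037` codec, and `find_non_ascii` is a made-up helper name.

```python
# Sketch: list the bytes in an EBCDIC buffer that have no 7-bit ASCII
# equivalent. Assumption: the data is code page 37 ("cp037").

def find_non_ascii(data: bytes, codepage: str = "cp037"):
    """Return (offset, byte value, decoded character) for every byte
    whose decoded character falls outside the 7-bit ASCII range."""
    problems = []
    for offset, byte in enumerate(data):
        char = bytes([byte]).decode(codepage)
        if ord(char) > 0x7F:
            problems.append((offset, byte, char))
    return problems


if __name__ == "__main__":
    # 0xC1 = "A", 0x4A = cent sign, 0xC2 = "B" in code page 37
    for offset, byte, char in find_non_ascii(b"\xc1\x4a\xc2"):
        print(f"offset {offset}: 0x{byte:02X} decodes to {char!r}")
```

What to do with the flagged bytes (substitute, drop, or keep them as Unicode rather than forcing ASCII) is a policy decision the sketch deliberately leaves open.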

The company I’m currently working for processes a lot of mainframe data. Periodically some gigantic org with thousands of employees will accidentally change the code page translation table on their mainframe and suddenly all their EDI will break…

(Another fun translation task is dealing with the packed decimal formats that crop up in mainframe files amongst the EBCDIC text.)
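For the curious, a sketch of what decoding one packed decimal field can look like. It assumes the common layout of two decimal digits per byte with the sign in the final nibble (0xD or 0xB for negative; 0xC, 0xF, 0xA or 0xE for positive). The field length and the implied decimal places (`scale`) are not stored in the data and have to come from the record layout; the function name is made up.

```python
from decimal import Decimal

# Sketch: decode one packed decimal field.
# Assumptions: two digits per byte, sign in the last nibble, and the
# caller knows the number of implied decimal places (scale).

def unpack_packed_decimal(data: bytes, scale: int = 0) -> Decimal:
    if not data:
        raise ValueError("empty packed decimal field")

    # Split every byte into its high and low nibble.
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)

    sign = nibbles.pop()              # last nibble carries the sign
    if any(n > 9 for n in nibbles):
        raise ValueError("invalid digit nibble in packed decimal field")
    if sign in (0xB, 0xD):
        negative = True
    elif sign in (0xA, 0xC, 0xE, 0xF):
        negative = False
    else:
        raise ValueError("invalid sign nibble in packed decimal field")

    digits = int("".join(str(n) for n in nibbles) or "0")
    value = Decimal(digits).scaleb(-scale)   # apply implied decimal point
    return -value if negative else value


if __name__ == "__main__":
    print(unpack_packed_decimal(b"\x12\x34\x5c"))           # positive, no decimals
    print(unpack_packed_decimal(b"\x12\x34\x5d", scale=2))  # negative, 2 decimals
```

The important part is that these fields must be skipped by any EBCDIC text translation: running packed bytes through a character table scrambles the digits.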

d_r · March 14, 2018, 2:44am

The Morrow I mention above was a Z80 (running CP/M).
