Software Heritage: Creating a safe haven for software




“software embodies our technical and scientific knowledge and humanity cannot afford the risk of losing it.”

Yes, humanity can risk it just fine. Not every word written is worth saving, nor every program. When the Great Library of Alexandria was destroyed, I’m willing to bet it was 90% crap, and only 10% treasures.


Maybe, but even actual literal crap becomes valuable if it’s old enough.


In one sense true, but in another sense the crap is more valuable to an archaeologist than the masterpieces. They tell us of the culture that made the masterpiece. Think of what we learn when we discover old love letters and supply records found in ancient Roman latrines.


Nice, Matthias! :slight_smile:



While I fully agree that Sturgeon’s law was probably in full effect (possibly mitigated slightly by the fact that hand-copied manuscripts were pretty expensive, which provides an incentive to weed out the crap), I would offer two considerations:

  1. Even if 90% is crap, you still lose a lot of non-crap when the library burns down.

  2. In the case of software, the techniques required to preserve a given program (crap or not) are very likely to be partially or wholly applicable to other programs targeted at the same platform or from the same period. So the rewards of developing a mechanism to preserve a given program will typically be closer to “technique for halting the decay of documents printed on acidic paper” than to “exhaustive preservation of random penny-dreadful novel”: such mechanisms are not necessarily at all easy to develop, but once available they can be applied to entire classes of programs, not just individual ones.

Some of the very early programs are, indeed, pretty much once-offs, often deeply tied to esoteric architectures; but the more recent you get, the more software targets something that was at least a relatively mass-market platform, so it becomes easier to preserve entire classes of software with the same underlying work.

(Alternatively, one could give Stallman his as-always-well-deserved credit for foresight; and argue that proprietary binaries chose obscurity, and deserve to reap obsolescence, while OSS shall live so long as anyone has need of it. The artifact-collector in me finds this willingness to embrace destruction scary. The purist in me can’t suppress a grin at the prospect of DRMed-and/or-obfuscated proprietary stuff being lost to history because it deliberately chose to be brittle.)


When I was a bobling, I made a choose-your-own-adventure game called Abstinence Quest on my personal organizer. The organizer wasn’t programmable, but if you typed the first characters of a memo entry in the right context, it would jump to that entry, so I made the game as a bunch of specially-structured memos. As much as I’d like to have a copy (there were lots of hilarious boner jokes) that’s a good illustration of why I’m not sure a “software library of congress” really makes sense.
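For anyone curious how a "jump to the memo whose title starts with what you typed" feature can double as a game engine, here is a rough Python sketch. It is purely illustrative: the memo titles, the story text, and the `jump` helper are all made up for the example, and the organizer itself ran nothing like this.

```python
# Hypothetical memo store: each title begins with a short key (A1, B1, ...).
# The story text is invented for illustration only.
MEMOS = {
    "A1 Start": "You wake up in your dorm. Type B1 to go out, B2 to stay in.",
    "B1 Party": "The party is loud. Type C1 to leave.",
    "B2 Study": "You study all night. The end.",
    "C1 Home": "You walk home alone. The end.",
}


def jump(prefix, memos=MEMOS):
    """Return (title, body) of the first memo whose title starts with prefix.

    Returns None when no memo matches, mimicking a lookup that goes nowhere.
    """
    for title, body in memos.items():
        if title.startswith(prefix):
            return title, body
    return None


if __name__ == "__main__":
    # "Playing" the game is just typing the key named in the current memo.
    print(jump("A1"))
    print(jump("B1"))
```

Each memo body tells the player which key to type next, so the whole game is data and the only "code" is the prefix lookup.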

In some ways software is a performance, like a magic trick; there’s a lot of technical work that can be archived, but the overall design may also depend strongly on assumptions about how and where the code will be used. By the time you’ve simulated the hardware, and the hardware glitches, and the OS, and everything else you need to recreate a historical program, it may be more informative to just archive a video of it being used in its native context.

Preserving useful algorithms and design patterns is a different story, but again, if a DOS disk-compression utility contains a clever algorithm, the best way to preserve that is to document it on its own, so people can get at it without learning about low-level programming of the early nineties.


I’m just gonna leave a few things here:


I slung a lot of Assembler and COBOL on mainframes back in the day. Will that stuff get preserved, too?


What is the library in the photograph?

  1. Maybe they should partner with the Internet Archive - similar missions and the IA might have some tools already built.

  2. Way back in the day, when our organization contracted to “lease” mainframe software from major vendors, there used to be an ‘escrow’ clause in the contracts, requiring the vendor to put the source code of their proprietary software into escrow somewhere, so that if they went out of business, we could obtain the source and maintain it ourselves if necessary. In practical terms that is rarely done, as there is often a competitor to get similar software from, and a migration tool to switch over, but once in a while something can be so critical or so intertwined with everything that you really have to have it even if the company is gone. Do vendors of software do that any more?


If they’re looking for a good domain name, it looks like, .net, and .org are all available.

