Seriously, try "view source" on google.com

I’m honestly not sure what the shock and surprise is. Is it because the code looks unreadable? Is it because there’s so much of it? There really isn’t that much; it’s just that they’ve put it all into the HTML page, to save the user from needing to download multiple files.

The entire thing is just 189 KB. The BoingBoing home page downloads 222 KB of JavaScript. It’s not magically smaller just because it’s hidden behind a bunch of one-line tags for jQuery and what-not.

Sure, we get to look at all the foreign code and say “Arghhh! It looks sinister, Google is tracking us!”, but Google is tracking us whether their front page looks like that or like this:

<html>
    <head>
        <script src="combined-google-scripts.js"></script>
    </head>
    <body>
        <input>
        <button>Search</button><button>I'm feeling lucky!</button>
    </body>
</html>

That would contain all the same evil tracking code, but no one would think to mention it, except maybe to say how lovely and simple it is, even though it would actually be a worse user experience.

The point of the XKCD comic wasn’t that it’s “all that code” – that’s the result of optimization.


This is cool. What the hell am I looking at?


Actually, I did give some thought to this scenario.

Two words: File Hashing. (Not really a thing in 1984)

Granted, that can’t be applied on the user’s side of things, but, yeah, the site owner can reliably use it to determine whether they are producing the build artifacts they expect to produce.
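For what it's worth, here is a minimal sketch of that kind of check on the site owner's side. It assumes SHA-256 and a digest recorded from a build that was trusted at the time; the function names are made up for illustration and aren't from any particular build tool.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def artifact_matches(path, expected_hex):
    """True if the file's digest equals the digest recorded for a known-good build."""
    # `expected_hex` is a hypothetical value the site owner stored earlier.
    return sha256_of_file(path) == expected_hex.lower()
```

A deploy script could call `artifact_matches` on each build output and refuse to publish anything whose digest has drifted from the recorded one.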

It would be a neat trick indeed to be able to infect anything in such a way that it produces an MD5 collision. My problem with the KTH doomsday scenario is that you can ask, “what if the next guy up the chain is infected?” all day long, but at some point it becomes improbable that you’ve got a malicious actor with that much time on their hands to custom tailor the malware to be undetectable in many custom executables.

It’s not unlike the chain of logic followed by conspiracy theorists who, when confronted with facts, start to exponentially increase the size of the conspiracy to ensure said facts are suspect. At some point, you lose all credibility.

Anyways, self-modifying malware running rampant and spreading itself all over the place is theoretically possible, but there are ways to combat it, which is why it is not widespread like real viruses. But like I said, we’re not there yet even so.


File hashing did exist in 1984, and it was used to verify files in instances where it was common for corruptions to occur, but most of the algorithms used were targeted at detecting, and sometimes correcting, random changes that might occur in transmission, not tamper resistance.
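To illustrate the difference, here is a small sketch using a toy additive checksum as a stand-in for the error-detecting checks of that era (an assumption for illustration, not a specific historical algorithm). It catches most random corruption, but a deliberate edit that merely reorders bytes slips straight through, while a cryptographic hash notices.

```python
import hashlib


def additive_checksum(data):
    """Toy 16-bit checksum: the sum of all bytes modulo 65536."""
    return sum(data) % 65536


original = b"pay 19 dollars"
tampered = b"pay 91 dollars"  # same bytes, two digits swapped on purpose

# Reordering bytes never changes their sum, so the toy checksum cannot see it.
same_checksum = additive_checksum(original) == additive_checksum(tampered)
same_sha256 = hashlib.sha256(original).digest() == hashlib.sha256(tampered).digest()

print(same_checksum)  # True
print(same_sha256)    # False
```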

Hashing is also not necessarily a defense. Assume for a moment that KTH occurred in the wild sometime around 1984, and that some part of the C build tool chain was compromised. If this had propagated sufficiently widely afterwards, there is no way to tell with certainty whether any new build of that part of the tool chain is clean or infected. If the very first version of GCC ever built was itself already infected, then all hashes for GCC would be ‘correct’, but they would also be hashes of an infected version.
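A toy sketch of that failure mode, with a made-up `infected_compile` function standing in for a compromised tool chain (everything here is hypothetical and only meant to show the logic): the hash check passes because the reference hash was itself taken from an infected build.

```python
import hashlib

PAYLOAD = "\n# hidden backdoor"


def infected_compile(source):
    """Hypothetical compromised compiler: deterministic, but always adds a payload."""
    return source + PAYLOAD


source = "print('hello')"

# The "first ever" build is already infected, and its hash becomes the reference.
reference_build = infected_compile(source)
reference_hash = hashlib.sha256(reference_build.encode()).hexdigest()

# Every later build reproduces the same bytes, so the hash check passes.
new_build = infected_compile(source)
hash_matches = hashlib.sha256(new_build.encode()).hexdigest() == reference_hash
is_clean = PAYLOAD not in new_build

print(hash_matches)  # True
print(is_clean)      # False
```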

removes tinfoil hat

You’re correct that at some point any discussion of KTH becomes a discussion of conspiracy theories. I can’t prove that this hack is now a part of all software, because if the hack is in all software, it prevents me from seeing that it’s in all software. But you can’t prove that KTH could not have happened, because ultimately there is always one more devious way to implement the hack that would defeat that test.

I don’t honestly believe that there is this unknown self-replicating compiler/linker/micro-code hack that has been propagating in all our systems for 30 years without making a single mistake that could have revealed it to the world. Working as a programmer for a living, I have far too much confidence in the utter fallibility of programmers to think that someone could have written something that has worked perfectly for three decades without a bug fix.

Conspiracy theories are rarely fully disproved; instead, it’s far more common for the conspiracy itself to break or leak, revealing at least some part of the theory to be true. (The NSA really is spying on everyone. Ford really did cover up known fatal (for the occupants) design flaws in its vehicles, multiple times. Etc.)


Yeah, but as a security best practice? Even MD4 (from what I read), released in 1990, was very prone to collisions. I mean, I was a kid in the ’80s, so perhaps we can write off intelligent discussion of my knowledge back then of anything more than Commodore 64 command syntax not related to running videogames.

OK, let’s keep the tinfoil hat on and see where this discussion leads.

The scenario you describe would indeed negate known-good hashing methods. But, with all the eyes on the GCC code surely someone would have noticed an indicator like a file being maintained or a connection to a C2 server, or mysterious listening socket that couldn’t be explained by the source? Modern intrusion detection and malware detection tools simply automate this once-manual process.

From that point of being made aware that something was amiss, it becomes possible to inspect the binary and map the source code to the paths and data structures in it. This isn’t easy, and one presumes it was a hell of a lot harder in the 1980s, but people do this today with static and dynamic tools like IDA Pro, OllyDbg, and SoftICE, and can map source code to binary with some knowledge about how certain compilers do things.

“Yes, but what if those programs were all infected during compilation?”

The infected GCC would become weirdly large with all these purpose built code paths, right? It would be known that the toolchain was infected at some point (or at least really suspect), and some crazy fiend like this guy would build a compiler from scratch in assembly to see what it spits out for comparison.
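The comparison idea can be sketched with two toy "compilers" (both hypothetical, and both just string functions): build the same source with an independently written tool and with the suspect one, then diff the results. Real compilers legitimately emit different machine code for the same source, so a real-world version of this comparison is considerably more involved; the sketch assumes a clean compiler would produce identical output.

```python
def trusted_compile(source):
    """Hypothetical independent compiler, assumed clean: passes the source through."""
    return source


def suspect_compile(source):
    """Hypothetical compromised compiler: quietly appends extra code to its output."""
    return source + "\nopen_backdoor()"


def builds_agree(source):
    """Compare what the two compilers produce for the same source."""
    return trusted_compile(source) == suspect_compile(source)


print(builds_agree("check_password()"))  # False: the suspect build carries extra code
```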

We have all sorts of modern stuff going on in the security/malware space that muddy the waters from this point. Hardware DEP, application profilers, packet sniffers, hidden NTFS streams, hypervisors…

Yeah, I didn’t figure. This is just a fun argument for the sake of it–one I am making with the utmost respect. Maybe I’ll learn something new from it.

Exactly. The hypothetical hacker would have to be the smartest programmer in history to write something that would escape detection for that long. And not just that, to adapt to new software whose purposes were something other than compiling stuff.

So you didn’t know about the RMS terminator and Skynet’s long game, then?
Interesting…
