Excellent video about why the SHA 256 hash algorithm is so cool and useful

frauenfelder · July 15, 2019, 4:46pm

Originally published at: https://boingboing.net/2019/07/15/excellent-video-about-why-the.html

…

Bernel · July 15, 2019, 5:37pm

“It’s a one-way algorithm, which means there’s no known way to practically retrieve the input from the output.”

I think you need to modify this statement. There will be an infinite number of different inputs that give the same output, so reversing the algorithm is not only hard but impossible.

What would be important for uses like bitcoin is an effective way of calculating any input that gives the correct hash signature. While this would break bitcoin, many other uses would still be safe. If you have an image you want to fingerprint, even if someone manages to find another bit sequence that gives exactly the same hash, that bit sequence will just be gibberish if looked at as a picture.
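For readers who want to try the fingerprinting idea, here is a minimal sketch using Python's standard `hashlib` module. The helper name `fingerprint` and the sample byte strings are made up for illustration; they stand in for real image data.

```python
import hashlib

def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hexadecimal characters."""
    return hashlib.sha256(data).hexdigest()

# Hypothetical stand-ins for the bytes of an image file.
original = b"pretend these bytes are an image file"
tampered = b"pretend these bytes are an image file."  # one extra byte

print(fingerprint(original))
print(fingerprint(tampered))  # a completely different-looking digest
```

The digest is always 256 bits long no matter how large the input is, which is why many inputs must share each output even though nobody knows how to find two of them.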

RickMycroft · July 15, 2019, 5:45pm

Um. SHA-2 256 or SHA-3 256? (I guess it doesn’t really matter for the purpose of the video.)

Speaker-to-Lampposts · July 15, 2019, 9:03pm

The SHA-2 one. SHA-256 is the official name of the 256-bit variant of SHA-2. This terminology predates SHA-3, and SHA-1 is fixed at 160 bits, so at the time there was no ambiguity. When SHA-3 was standardized, they named its variants SHA3-(size) to avoid a conflict.

Reference: the NIST hash function project page, and the linked standards documents FIPS 180-4 (SHA-2) and FIPS 202 (SHA-3).
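The naming difference shows up directly in Python's standard `hashlib` module, which exposes the two algorithms as `sha256` and `sha3_256`. A small sketch (the variable names are just for illustration):

```python
import hashlib

message = b"abc"

# SHA-256: the 256-bit member of the SHA-2 family (FIPS 180-4).
sha2_digest = hashlib.sha256(message).hexdigest()

# SHA3-256: the 256-bit member of the SHA-3 family (FIPS 202).
sha3_digest = hashlib.sha3_256(message).hexdigest()

# SHA-1 always produces 160 bits, so it needs no size suffix.
sha1_digest = hashlib.sha1(message).hexdigest()

print("SHA-256 :", sha2_digest)
print("SHA3-256:", sha3_digest)
print("SHA-1   :", sha1_digest)
```

Both 256-bit digests have the same length, but they are unrelated values because the underlying algorithms are entirely different.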

Matthew_Weathers · July 16, 2019, 8:59pm

(Video OP here) You’re right, of course. I will clarify that in an extended “follow-up” video I’m planning for next week. Thanks for your input, I appreciate it. (Subscribe to my channel if you want to see the update when it comes out.)

Matthew_Weathers · July 16, 2019, 9:01pm

Thanks. Yes, you’re correct. I’ll clarify these names in a follow-up video I’m doing, to explain the different variants better. I appreciate your concise description of the differences, and might quote you, if that’s okay?

Speaker-to-Lampposts · July 17, 2019, 1:59am

Welcome, Matthew, and thanks for the video! Feel free to quote me in your follow-up. BTW, I do have one gripe about the video: around 1:30, when you talk about collision resistance, your phrasing seems to imply that there aren’t any collisions; there are lots, they’re just incredibly hard to find among all the non-collisions.
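To see that collisions exist and are only hard to find, one can shrink the output. The sketch below keeps just the first 16 bits of each SHA-256 digest, which is an assumption made purely so the search finishes instantly; the helper names `truncated_hash` and `find_collision` are invented for this example.

```python
import hashlib
from itertools import count

def truncated_hash(data: bytes, n_bytes: int = 2) -> bytes:
    """First n_bytes of the SHA-256 digest (a deliberately weakened toy hash)."""
    return hashlib.sha256(data).digest()[:n_bytes]

def find_collision(n_bytes: int = 2):
    """Hash the strings "0", "1", "2", ... until two share a truncated digest."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        h = truncated_hash(msg, n_bytes)
        if h in seen:
            return seen[h], msg, h
        seen[h] = msg

first, second, shared = find_collision()
print(first, second, shared.hex())
```

With only 65,536 possible truncated outputs, a repeat is guaranteed within 65,537 attempts and typically appears after a few hundred. The full 256-bit digests of the two messages still differ; at that size the same brute-force search is far out of reach.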

Matthew_Weathers · July 17, 2019, 6:39pm

Here’s a draft of what I’m planning to include in my follow-up video:

In the original video, I said that “the only time you ever end up with the same hash, or the same fingerprint, is if you started with the exactly identical input,” and while that’s true in the practical sense, it’s not technically, mathematically accurate. In reality, for each possible hash output, there is a finite set of inputs that will produce that specific output.

However, since currently no one knows how to determine anything about the set of inputs for a specific hash output, and no one has ever found two files that are in the same input set producing the same output, for now we can assume that if you see identical hash output, it almost certainly came from identical inputs.

And I’ll show an example. The reason I say “finite set of inputs” is because I just learned that SHA-256 specifies a maximum input size (of 2048 pebibytes minus one bit, or about 2305.8 petabytes - see FIPS 180-3 specification). In addition, SHA-256 has never been proven to be mathematically surjective (you don’t want it to be), so we don’t know whether any of those input sets are the empty set. (In other words, we don’t know if there are any outputs for which there is no possible input).

Does that sound accurate to you? Thanks for your input.
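The size figures in the draft can be checked with a few lines of arithmetic, starting from the limit of 2^64 - 1 bits on the message length. The variable names below are only for illustration.

```python
# SHA-256 stores the message length in a 64-bit field, so the longest
# allowed message is 2**64 - 1 bits.
MAX_BITS = 2**64 - 1

# Rounding up to 2**64 bits (the "minus one bit" in the draft) gives:
limit_bytes = 2**64 // 8           # 2**61 bytes
pebibytes = limit_bytes / 2**50    # binary units: 1 PiB = 2**50 bytes
petabytes = limit_bytes / 10**15   # decimal units: 1 PB = 10**15 bytes

print(f"{MAX_BITS} bits maximum")
print(f"{pebibytes:.0f} PiB, or about {petabytes:.1f} PB")
```

Because the input length is capped, the number of possible inputs is finite, yet it is still vastly larger than the 2^256 possible outputs, so collisions must exist.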

hngr · July 18, 2019, 7:48am

Years back, I was trying to “time-stamp” my Zipped collection of notes for a project, using something like https://stampd.io/, but I couldn’t, because there was another file already on the Bitcoin blockchain a few years earlier with the same SHA-256 hash.

I now remember it’s https://proofofexistence.com/ where I did it. I found the file, but it should be sitting at Proof of Existence.

It’s possible OneDrive mucked with the file contents (watermarking it?) and changed the resulting SHA-256 hash. I regret not writing everything down before I got distracted back then. I had no idea that collisions were not supposed to happen.

Of course, even if I could reproduce the same record, we wouldn’t have had much luck tracking down the other owner of the other file that supposedly matched the same hash.

Matthew_Weathers · July 18, 2019, 5:11pm

Interesting. I had never heard of the Proof of Existence project.

I believe you that it said there was another file with the same hash. But the odds against that really happening are so huge that I wonder if something else was really happening. Maybe a software error on their part (like they were hashing an empty file, or the wrong file)? Or maybe you uploaded the same file a few years before and forgot?

Who knows? Even if there really was an accidental hash collision like this, I think the algorithm is still secure as long as no one knows how to do it intentionally.
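To put a rough number on "so huge": if SHA-256 outputs are assumed to behave like uniformly random 256-bit values (an idealization, not a proven property), the chance of any accidental collision among n files is at most n(n-1)/2 divided by 2^256. A sketch, with the function name and the file count chosen only for illustration:

```python
from fractions import Fraction

def collision_probability_bound(n_items: int, hash_bits: int = 256) -> Fraction:
    """Upper bound on P(any two of n_items collide), assuming uniform random
    hash values: (number of pairs) / (number of possible outputs)."""
    pairs = n_items * (n_items - 1) // 2
    return Fraction(pairs, 2**hash_bits)

# Hypothetical scenario: a trillion distinct files have been hashed.
p = collision_probability_bound(10**12)
print(f"at most {float(p):.2e}")
```

Even with a trillion files the bound is below 1 in 10^50, which is why a software error or a repeated upload is a far more plausible explanation than a genuine accidental collision.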

hngr · July 19, 2019, 3:32am

Oh, I was under the impression from your description that it was almost mathematically impossible to have a hash collision. Like having to land on the same beach on all the earth-like planets in the observable part of our galaxy or something.

But since I was not able to reproduce the same result, something went wrong in the chain. With large files we are supposed to compute the SHA-256 hash ourselves and paste it in before starting the notarization process. So either the ZIP file contents were watermarked by Microsoft (shaking fist) when it got uploaded to OneDrive, thus changing the resulting hash (I wouldn’t know unless I found another local copy of the same file), or there was indeed a bug at Proof of Existence.

I know I hadn’t uploaded before that date because I produced the ZIP file that day when I was about to notarize, and remember making a small tweak to the file with the intention to try again (making another 500 MB ZIP file). Back then the BTC amount that they wanted was equivalent to $7 so I quickly got distracted and didn’t follow through. The price did come down as they’ve adjusted the BTC ask price since then.
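For anyone hashing a large file locally before pasting the digest into a notarization service, a common approach is to read the file in chunks so it never has to fit in memory. This is a sketch, not the procedure any particular service prescribes; the function name, chunk size, and file name are assumptions.

```python
import hashlib

def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of the file at path, read 1 MiB at a time."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"" (EOF).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Example call with a hypothetical file name:
# print(sha256_of_file("notes.zip"))
```

Hashing the local copy and a re-downloaded copy and comparing the two digests would show whether a storage service altered the file's bytes.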

frauenfelder · July 20, 2019, 4:46pm

This topic was automatically closed after 5 days. New replies are no longer allowed.
