You can unscramble the hashes of humanity's 5 billion email addresses in ten milliseconds for $0.0069

Yeah, the problem is that the whole point of what they are doing is that two different groups can match their databases together. There’s no way to allow that to happen without allowing third parties to crack the code.
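
To make that concrete, here is a minimal sketch of why matching and cracking are the same operation. It assumes an unsalted SHA-256 over the lowercased address, which is my guess at a typical scheme and not something confirmed for any particular company; `hash_email` and the example addresses are made up for illustration.

```python
import hashlib


def hash_email(email: str) -> str:
    # Assumed scheme: SHA-256 over the trimmed, lowercased address, no salt.
    # The hash has to be deterministic or the two parties could never match.
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


# Two parties hash their own lists and match on the digests.
party_a = {hash_email(e) for e in ["alice@example.com", "bob@example.com"]}
party_b = {hash_email(e) for e in ["Bob@Example.com", "carol@example.com"]}
shared = party_a & party_b  # digests present in both databases

# A third party holding a list of candidate addresses does exactly the same
# thing: hash every candidate, then look the leaked digests up in the table.
candidates = ["alice@example.com", "bob@example.com", "dave@example.com"]
lookup = {hash_email(e): e for e in candidates}
recovered = sorted(lookup[h] for h in party_a if h in lookup)
```

Nothing in the attacker's half needs a secret. Anyone who can guess the address can compute the digest, so the only protection is how large the candidate list has to be.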

Part of my job involves data privacy and I’m constantly asking people a question very close to this. If you are going to release a dataset and you don’t want anyone to be identified, then you just don’t release fields that could identify them. I get that this reduces the value of the dataset for some purposes, but you really have to choose which side of that tradeoff you are going to land on. In government, people are actively thinking about that tradeoff; I imagine in these companies people are just trying to figure out a way to make it look like they made a nod towards privacy.

Like you say, the whole reason they add the hash is so that someone else can match it. And yeah, the data privacy problems go way beyond that. Who even needs to match an email address if you have the location where people were standing for every text they ever sent?