This is the fact that no one wants to admit: if you are giving away a dataset that has potentially useful correlations that you don’t understand, you can’t be surprised when someone teases out correlations that you didn’t expect. That was sort of the point. We’re basically just lying to ourselves about being able to throw these things out into the ether in the hope that they help some stranger figure out how to help us. The only way to share these datasets while respecting user privacy is under contract with select, vetted researchers, with the understanding that nothing is ever really anonymized.