There is a big difference between a data breach and a failure of de-identification. The former concerns data that is identifying, and relies on the storage being secure against unauthorized access. The latter concerns data that is freely available, and relies on it containing no identifying marks.
I think both are impossible to solve in any real-world way: any large system has too many holes to secure perfectly against leaks, and many data sets are too complex to protect against a determined sleuth. We live in a world where we struggle even to build cryptographically secure methods of hiding identity; the wrong PRNG can result in secrets being spilled. A de-identified dataset doesn’t come close to that level of security; it is full of all kinds of identifying information. A typical dataset of genetic information contains thousands of correlates with your identity. You cannot release this data in any meaningful sense (i.e., expose actual unique information about an individual, which is the whole point of releasing the data) without risking identification. To some extent this means that people who want their data to be shared “anonymously” must be made aware of, and accept, the risk of identification.
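To make the point about correlates concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the records, the field names (`zip`, `birth_year`, `sex`) and the helper `unique_fraction` are all hypothetical, and real genetic data would have far more columns. It only shows how quickly records become unique once a few harmless-looking attributes are combined.

```python
from collections import Counter

# Hypothetical "de-identified" records: names removed, a few attributes kept.
records = [
    {"zip": "02139", "birth_year": 1980, "sex": "F"},
    {"zip": "02139", "birth_year": 1980, "sex": "M"},
    {"zip": "02139", "birth_year": 1975, "sex": "F"},
    {"zip": "02140", "birth_year": 1980, "sex": "F"},
    {"zip": "02140", "birth_year": 1980, "sex": "F"},
    {"zip": "02139", "birth_year": 1975, "sex": "M"},
]


def unique_fraction(rows, fields):
    """Fraction of rows whose combination of `fields` appears exactly once."""
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(rows)


if __name__ == "__main__":
    # Each added attribute can only split groups further, never merge them.
    for fields in (["zip"], ["zip", "birth_year"], ["zip", "birth_year", "sex"]):
        print(fields, round(unique_fraction(records, fields), 2))
```

In this toy data no record is unique by `zip` alone, but four of the six are unique once all three fields are combined. Anyone holding an outside list with the same three attributes plus names could match those rows, and a dataset with thousands of columns only makes that easier.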