> The source dataset lets those inferences or joins be tied back to the original identifying data.
But if the attacker lacks the source dataset, they can't do this, and if they possess the source dataset, they'd use it for their analysis rather than using the anonymised dataset.
The point is that if the attacker can connect your user record in the source data with user # 188da24a7789d in the "anonymized" data, they can use that link to re-identify all information derived from or built on the "anonymized" data.
Oh, there is a Netflix account for user # 188da24a7789d, and the IRS released tax summaries for user # 188da24a7789d? That's interesting, since I know that user # 188da24a7789d is really MaxBarraclough.
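To make that concrete, here is a minimal sketch of the linkage in Python. Everything in it is hypothetical: the `netflix` and `tax` dictionaries, their contents, and the `reidentify` helper are made up for illustration and do not describe any real release. It only assumes the attacker holds one mapping from a real identity to the pseudonym.

```python
# Hypothetical data: all names and values are invented for illustration.
# The single link the attacker recovered from the source data.
source = {"MaxBarraclough": "188da24a7789d"}

# Two separately released "anonymized" datasets, both keyed by the same pseudonym.
netflix = {"188da24a7789d": ["Movie A", "Movie B"]}
tax = {"188da24a7789d": {"bracket": "example"}}


def reidentify(source, *released):
    """Attach every released record keyed by a known pseudonym to the real identity."""
    profiles = {}
    for name, pseudonym in source.items():
        profiles[name] = [ds[pseudonym] for ds in released if pseudonym in ds]
    return profiles


profiles = reidentify(source, netflix, tax)
print(profiles)
```

One known link is enough: every dataset that reuses the pseudonym gets attached to the real name, including ones published long after the source data leaked.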
Suppose a dataset removes all information except for, say, a user's fingerprints, so that the only thing stored in the anonymous dataset is an image of a fingerprint. By their nature, fingerprints cannot meet this requirement as stated, which effectively eliminates any research that could be done with the data. Given that the only way the dataset could be linked to the original user is if an attacker already had access to the source data, how is this regulation benefiting anyone?