
However, that statistical guarantee also requires your quasi-identifiers to be picked correctly, i.e. it only holds if you select every variable the attacker could possibly know about a subject. I think that is the hard part here. For high-dimensional data, it's not something I would recommend doing without a lot of research and experience.
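To make the dependence on that choice concrete, here is a minimal sketch, assuming the guarantee under discussion is a k-anonymity-style one. The records, the column names, and the helper `k_anonymity` are all hypothetical and exist only for illustration. It shows the same table looking safer or weaker depending on which columns you assume the attacker knows.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group of records that share the same values
    on the chosen quasi-identifier columns (hypothetical helper)."""
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return min(groups.values())

# Made-up toy data, not taken from any real dataset.
records = [
    {"zip": "12345", "age": "30-39", "sex": "F", "diagnosis": "flu"},
    {"zip": "12345", "age": "30-39", "sex": "F", "diagnosis": "asthma"},
    {"zip": "12345", "age": "30-39", "sex": "M", "diagnosis": "flu"},
    {"zip": "12345", "age": "30-39", "sex": "M", "diagnosis": "diabetes"},
    {"zip": "67890", "age": "40-49", "sex": "F", "diagnosis": "flu"},
    {"zip": "67890", "age": "40-49", "sex": "M", "diagnosis": "asthma"},
]

# Assumption 1: the attacker only knows zip code and age band.
k_narrow = k_anonymity(records, ["zip", "age"])

# Assumption 2: the attacker also knows sex.
k_wide = k_anonymity(records, ["zip", "age", "sex"])

print(k_narrow, k_wide)  # prints "2 1"
```

With two assumed attacker-known columns the table is 2-anonymous, but adding a single extra column drops it to k = 1, meaning some records are unique. Any column left out of the list is silently treated as unknown to the attacker, which is exactly the part that is hard to justify.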


Right. Even if you assume the worst-case scenario, there is no standard risk metric or threshold to meet.

I feel like differential privacy is the strongest definition we have, but it is also lacking from a practical standpoint. What does it mean to have N nats/bits of information gain from seeing the result of a query? How does this translate to my risk of a PII leak?
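One partial way to read such a number, sketched under stated assumptions: if the N nats are taken to be the epsilon of a pure epsilon-differentially-private query, then seeing the result can multiply an attacker's odds for any claim about one individual by at most e^epsilon. The function names below are hypothetical, and the bound covers a single query with no composition. It still leaves open which posterior counts as acceptable, which is the missing threshold.

```python
import math

def nats_to_bits(epsilon_nats):
    """Convert a privacy-loss value from nats to bits."""
    return epsilon_nats / math.log(2)

def posterior_upper_bound(prior, epsilon_nats):
    """Upper bound on the attacker's posterior belief in a claim about one
    individual after seeing one pure epsilon-DP output.

    Uses: posterior odds <= exp(epsilon) * prior odds.
    Assumes 0 <= prior < 1.
    """
    factor = math.exp(epsilon_nats)
    return factor * prior / (factor * prior + (1.0 - prior))

# Illustrative values only: a 1% prior and a few epsilon settings.
prior = 0.01
for epsilon in (0.1, 1.0, 5.0):
    bits = nats_to_bits(epsilon)
    bound = posterior_upper_bound(prior, epsilon)
    print(f"epsilon={epsilon} nats ({bits:.2f} bits): posterior <= {bound:.4f}")
```

The bound says nothing about what the attacker already knew: a high prior stays high whatever epsilon is, so it measures the extra risk from the query rather than the total risk of a PII leak.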



