Even if the user posts a comment containing what is undeniably personal data, you still might not have to consider it personal data simply because Hacker News search sucks; Recital 26 says:
To ascertain whether means are reasonably likely to be used to identify the natural person, account should be taken of all objective factors, such as the costs of and the amount of time required for identification, taking into consideration the available technology at the time of the processing and technological developments.
> Would just pointing at the website be enough to satisfy the request for a copy?
Yes, in fact recital 63 recommends "remote access" as a method:
What HN probably should do is offer a "takeout" option and a "delete me" option on the account page. The former would export every comment and submission along with vote counts, links to upvoted comments/stories, profile data, etc., all in machine-readable form (eg: s-expressions).
The latter would delete profile along with data. Or, possibly, simply anonymize the posts.
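To make the two options concrete, here is a minimal Python sketch, not a description of how HN actually stores anything. The record layout (`by`, `id`, `text`, `score`) and the function names are assumptions made up for illustration.

```python
def to_sexp(value):
    """Render nested dicts, lists, strings and numbers as an s-expression string."""
    if isinstance(value, dict):
        return "(" + " ".join(f"({k} {to_sexp(v)})" for k, v in value.items()) + ")"
    if isinstance(value, (list, tuple)):
        return "(" + " ".join(to_sexp(v) for v in value) + ")"
    if isinstance(value, str):
        # Escape backslashes and double quotes so the string stays parseable.
        return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return str(value)


def takeout(user, items):
    """'Takeout': export every item stored for one user (hypothetical layout)."""
    mine = [i for i in items if i["by"] == user]
    return to_sexp({"user": user, "items": mine})


def anonymize(user, items):
    """'Delete me' variant that keeps the posts but drops the author name."""
    return [dict(i, by="[deleted]") if i["by"] == user else i for i in items]


items = [
    {"id": 1, "by": "alice", "text": "first post", "score": 3},
    {"id": 2, "by": "bob", "text": "a reply", "score": 1},
]
print(takeout("alice", items))
print(anonymize("alice", items))
```

Whether anonymizing instead of deleting would satisfy an erasure request is exactly the open question here; the sketch only shows that both are cheap to implement.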
I'm not entirely clear on the GDPR vs publishing - i don't think it's meant as a tool for "book burning" - and I've yet to see an interpretation vis-a-vis public discourse. There certainly are laws governing public archives that override parts of the GDPR in certain contexts.
So while HN would probably have an obligation to export all comments, I'm less clear if they'd have an obligation to delete, under the GDPR.
If the ip is logged along with actions, that'd also be considered personal data, and fall under the GDPR.
The reason I think Hacker News would simply delete it has nothing to do with the GDPR, but because they seem to have responded to requests to delete an account and comments in the past:
> i don't think it's meant as a tool for "book burning"
I think you've confused my statement of "I suspect Hacker News would..." to be a legal/professional opinion about what Hacker News should do, or would be compelled to do so under the GDPR.
That wasn't my intention.
> If the ip is logged along with actions, that'd also be considered personal data, and fall under the GDPR.
"A single household PC may have different family members using it under the same login identity. As a result, the IP address and cookies cannot be connected to a single user. Therefore it is unlikely that this information will be personal data."
That it may be personal data does not mean that it is personal data, nor are you under an express obligation to attempt to unmask anyone just because you might have the ability to do so.
There is a risk/reward concept in the GDPR however. There are reasons that are useful to users to keep their IP addresses in a database, and there are risks with keeping their IP addresses in a database. This is why the ICO also recommends you blank out the last octet of the IP address.
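Blanking the last octet is a one-liner with Python's standard `ipaddress` module. This is only a sketch for IPv4; the function name is made up, and IPv6 would need a different prefix length.

```python
import ipaddress


def blank_last_octet(ip: str) -> str:
    """Zero the final octet of an IPv4 address before storing it."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    network = ipaddress.ip_network(f"{addr}/24", strict=False)
    return str(network.network_address)


print(blank_last_octet("5.62.58.243"))
```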
> There are reasons that are useful to users to keep their IP addresses in a database, and there are risks with keeping their IP addresses in a database. This is why the ICO also recommends you blank out the last octet of the IP address.
Note: If you are going to use that IP address for determining location (which is common when dealing with the EU, because that is one of the things the EU considers acceptable evidence to justify your choice of which country's VAT to collect for an online sale), do the location lookup before blanking the last octet.
I had hoped that the first 24 bits would be sufficient to determine country, but that is not the case. For example, here are current results from MaxMind's GeoIP service:
5.62.58.243 US
5.62.58.244 US
5.62.58.245 DE
5.62.58.246 DE
5.62.58.247 DE
5.62.58.248 US
5.62.58.249 US
5.62.58.250 US
A couple weeks ago, BTW, 5.62.58.244 was identified as DE. This suggests that it might be a good idea to keep the full IP address around at least until you file your quarterly VAT MOSS documents, so that you can do another lookup then and possibly get a more clear picture of who you owe VAT to for the sale.
PS: I have no relationship with whoever owns those IP addresses, as far as I know. A few weeks ago I did GeoIP lookups on all 4 billion IPv4 addresses to find all the ranges of US IP addresses (there were 22029 ranges) as part of optimizing a filter that is supposed to reject non-US traffic from certain reports. To get an example for this comment I looked through those ranges looking for one where there were two different US ranges overlapping the same /24, and 5.62.58.0/24 was the first one I noticed.
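For anyone who wants to reproduce the idea, here is a rough Python sketch of both steps: collapsing per-address results into country ranges, and doing the lookup before truncating. The `OBSERVED` table just hard-codes the results listed above as a stand-in for a real GeoIP lookup, and `country_runs` and `record_sale` are names invented for this example.

```python
import ipaddress

# Results copied from the list above; a stand-in for a real GeoIP service.
OBSERVED = {
    "5.62.58.243": "US", "5.62.58.244": "US",
    "5.62.58.245": "DE", "5.62.58.246": "DE", "5.62.58.247": "DE",
    "5.62.58.248": "US", "5.62.58.249": "US", "5.62.58.250": "US",
}


def country_runs(results):
    """Collapse consecutive addresses with the same country into (first, last, country)."""
    ordered = sorted(results.items(), key=lambda kv: int(ipaddress.IPv4Address(kv[0])))
    runs = []
    for ip, country in ordered:
        n = int(ipaddress.IPv4Address(ip))
        if runs and runs[-1][2] == country and int(ipaddress.IPv4Address(runs[-1][1])) + 1 == n:
            runs[-1] = (runs[-1][0], ip, country)  # extend the current run
        else:
            runs.append((ip, ip, country))  # start a new run
    return runs


def record_sale(ip, lookup):
    """Look up the country on the full address first, then keep only the /24."""
    country = lookup(ip)
    truncated = str(ipaddress.ip_network(f"{ip}/24", strict=False).network_address)
    return {"country": country, "ip": truncated}


for run in country_runs(OBSERVED):
    print(run)
print(record_sale("5.62.58.245", OBSERVED.get))
```

With the results above this gives three runs inside the same /24 (US, DE, US), which is why a lookup on the truncated address cannot recover the country.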
Those IP addresses belong to the same AS, have the same announcement[1], and have very similar traceroute outputs (both have final hops around Miami). The only thing different is their reverse DNS, which I think is throwing MaxMind's algorithms off.
https://gdpr-info.eu/art-4-gdpr/
"‘personal data’ means any information relating to an identified or identifiable natural person (‘data subject’); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person;"
So yeah, a single IP in isolation might not trace back to a single individual - but with a timestamp and billing info it might track to a residence - with other data (eg: age, occupation) it certainly will trace back to an individual.
I'm surprised at the ICO's interpretation / statement on this.
> > i don't think it's meant as a tool for "book burning"
> I think you've confused my statement of "I suspect Hacker News would..." to be a legal/professional opinion about what Hacker News should do, or would be compelled to do so under the GDPR.
Indeed, that wasn't meant as a direct reply to you, more as a general comment on the GDPR.
There's a provision on right to be forgotten, and it'll be interesting to see that vis-a-vis a public interest in keeping an open archive of public discourse.
> The reason I think Hacker News would simply delete it has nothing to do with the GDPR, but because they seem to have responded to requests to delete an account and comments in the past:
True. I don't think that'll be enough to comply with the GDPR. Just as storing child pornography in bulk isn't ok if you remove individual pictures on request.
On appeal, the Regional Court of Berlin (the "Kammergericht") ruled that IP addresses in the hands of website operators could qualify as personal data if the relevant individual provides additional details to the website operator (e.g., name, email address, etc.) in the course of using the website
That's basically the same thing as the John Smith example: There's a threshold when you have personally identifying information, and whilst it can certainly include an IP address in some circumstances, there are enough other valid uses for the IP (fraud, VAT, etc) and enough uncertainty (NAT, multiuser computers, etc) that it by itself isn't PII.
> There's a provision on right to be forgotten, and it'll be interesting to see that vis-a-vis a public interest in keeping an open archive of public discourse.
Yes. I don't think it's clear what Internet forums are required to do.
so a flag to hide all the comments of a user who has chosen to be forgotten should be sufficient.
However, if a site wants to refuse the order, it may be successful if it can argue the comments are in the public interest, but if I were a company that wanted to refuse a person's rights in this way, I would call the ICO to get clarity.
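The hide flag itself is trivial to build. A minimal Python sketch, assuming a made-up `Comment` record and a set of usernames who asked to be forgotten:

```python
from dataclasses import dataclass


@dataclass
class Comment:
    author: str
    text: str


def visible_comments(comments, forgotten):
    """Return only the comments whose author has not asked to be forgotten."""
    return [c for c in comments if c.author not in forgotten]


thread = [Comment("alice", "first"), Comment("bob", "second")]
print(visible_comments(thread, forgotten={"alice"}))
```

Note that this only hides the data from readers; it is still stored, so whether hiding counts as erasure is a separate question.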
> I don't think that'll be enough to comply with the GDPR.
If someone contacts the data controller (e.g. pg) and asks to have their data removed (or flagged hidden or whatever), and pg does it, why don't you think that would be compliant?
> I suspect Hacker News would simply delete the user's information from the site and explain that they control no data on the subject.
No, that would be illegal. Hacker News can set itself up so that it doesn't keep user data longer than 30 days, and then it can just always say it has no data, but, excluding that, you can't respond to an export request by deleting the user's data and then telling them you have no data - that violates the user's right to see what data you have on them.
I don't think there's a requirement that HN has to keep personal data that they don't need, and FYI the Hacker News privacy policy they publish[1] argues they don't have to do it if they don't want to:
You agree that any termination of your access to the Site under any provision of this Terms of Use may be effected without prior notice, and acknowledge and agree that Y Combinator may (but has no obligation to) immediately deactivate or delete your account and all related information and files in your account and/or bar any further access to such files or the Site.
so we're really out on a limb here anyway. But let's assume that HN is GDPR compliant, and say that they delete all personal data after 10 years and on request, etc... Are they then required to keep that data for ten years?
My guess is not. The ICO suggests[2] repeatedly that you not keep data any longer than is necessary, and that you repeatedly review whether it is necessary.
The ICO also says[3]:
However, in many cases, routine use of the data may result in it being amended or even deleted while you are dealing with the request. So it would be reasonable for you to supply information you hold when you send out a response, even if this is different to that held when you received the request
which makes it sound like it's acceptable, except:
it is not acceptable to amend or delete the data if you would not otherwise have done so.
which then suggests that HN only needs to have a policy that they delete personal data whenever it is identified for export. If I were HN and actually wanted to do this, however, I would probably call the ICO to confirm.
The text calls out "routine use" - this clause is to permit, e.g. the last access date on the account to be the date of access to the GDPR export request portal (deleting the prior value).
The point of GDPR is to force companies to explain the data they retain and show it to users on request. Setting up a scheme where data is retained but is never available to users for export is a great example of acting in "bad faith" that is likely to increase the possibility that a judge will make an example out of you.
Would all of a user's comments be classed as personal data? Would just pointing at the website be enough to satisfy the request for a copy?