GDPR – A Practical Guide for Developers (2017) (bozho.net)
379 points by FooBarWidget on March 3, 2018 | 198 comments



I just want to point out that a lot of the article is the author's own opinion on how the regulation should be implemented in software; many of the suggestions are probably not normally needed and would be a burden for businesses.

My own take (and the take of most European data protection lawyers I meet) is that consent is not needed, and is possibly even inappropriate, in 90% of cases; the legal basis called "legitimate interest" should be used instead. This is where you make your own judgement about whether your data processing is reasonable. Imagine if you yourself always had to consent to every common-sense use of your personal data - what a hassle!

If you use legitimate interest, you can also skip the "under 16" part, the consent checkboxes and the re-request-consent part of the article. (Of course there is more to it, but I would not get into that unless you are doing data processing you think the data subjects in general would not approve of.)

Functions that allow users to delete and automatically download/access their own data are good practice under legitimate interest but not required. In general you are anyway allowed to handle these types of request on a case-by-case basis, as long as you provide your data subjects with an email address.

What you should do, though, is automatically delete data that you no longer need, such as old logs, contact details of customers long gone, old backups, etc.
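As a rough sketch of that kind of scheduled cleanup (the directory path and the 90-day period here are placeholders for your own retention policy, not anything the GDPR prescribes):

```python
import os
import time

# Hypothetical retention policy: anything older than 90 days goes.
RETENTION_DAYS = 90

def purge_old_logs(log_dir, retention_days=RETENTION_DAYS):
    """Delete files in log_dir whose modification time is older than
    the retention window; return the names of the files removed."""
    cutoff = time.time() - retention_days * 86400
    removed = []
    for name in os.listdir(log_dir):
        path = os.path.join(log_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed
```

Run something like this from a daily cron job; the point is that deletion happens automatically rather than when someone remembers.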

The parts about encryption, creating an API, etc. are not required but may be good practice. Just make sure you have normal, adequate data and data-access security.

As discussed above, you MAY use data for purposes that the customer has not agreed/consented to. However, never use personal data for purposes that are incompatible with the purposes you informed your customers of when collecting the data (normally stated in the privacy notice on your web site). If you did not have a privacy notice at the time of collection, pre-GDPR, what counts as a compatible purpose will be a judgement call based on the context of collection.


A few clarifications (author here)

1. Yes, you are correct, most of the features don't need to be implemented in code and having documented procedures would be sufficient (as is pointed out in a number of places in the article). However, if you are not a small business, or you have a lot of users, the time needed to implement the features will be negligible compared to the time needed to handle requests manually.

2. The "legitimate interest" legal basis is harder than it seems and many regulators warn against its overuse. Lawyers in my country are skeptical that regulators will accept legitimate interest in many cases, so "to be on the safe side" they recommend relying on consent. Again, as pointed out in the article, this is up to the legal team to decide.

3. The right to be forgotten is valid even under legitimate interest. Article 17(1)(c) is clear about that - it applies whenever a user objects to their data being processed on the basis of legitimate interest. It is a bit hidden, as Article 17 refers to Article 21 which in turn refers to Article 6, but you can piece the whole scenario together anyway.

4. About the best practices - agreed, they are not mandatory under the regulation (as pointed out in the article), but having them in place will demonstrate a higher level of compliance.


Thanks for replying to my comment.

2. Yes, it's up to the legal team and the types of processing you do. If you do processing that the data subject would not expect, or that is not in their interest, you have to consider this carefully. It may or may not be allowed under legitimate interest, but you have to be careful and do a proper assessment. I believe, and I have heard many EU data protection lawyers state, that consent is a last-resort option. That view is probably not universally shared, but many people appear to think that way. Also, remember that consent bypasses important principles, such as the necessity test present for all other legal bases.

3. I agree - both the RTBF and SAR rights apply under legitimate interest, but there is no absolute requirement to automate the process in either case. Voluntarily implementing data portability is good practice, and could tip the balance when using legitimate interest; see below.

4. Yes, and concerning legitimate interest, if you implement these best practice measures this could “tip the balance” in your favor if you read the WP29 legitimate interest opinion.


It seems like when using legitimate interest as a basis for processing that _what you do_ with the data is much more important than what it is you’re collecting in the first place.

When registering an account with an online service, you will probably have to give up your email address. The legitimate interest is to be able to let you log in again and to send password reset emails, or other account related notifications like “we have detected a suspicious login from another continent”.

If you want to stick someone on your marketing email list, asking for consent is a much better option! Unless the context is extremely clear (the email field is specifically for signing up for the email list), consent seems the safer route.

But in both cases, the basis is about the processing of the data, not the data itself.


Yes, this is important - GDPR is mainly about how you are allowed to use data, i.e. for what purposes you are processing the data (although, as a side point, collection and storage are also "processing").


You would not use legitimate interests to cover off your processing of data in connection with letting a user log in to your site, if it is a requirement of using the service that you are logged in, for example to authenticate who you are. The correct processing basis here would be to process data to provide a service, not under legitimate interests.

If you were processing someone's data to, for example, ensure the safety of your network/detect unauthorised login attempts, then that would likely fall under legitimate interests, because it is processing that is not necessary to provide the underlying service, but is in the users' interests to ensure the protection of their personal data.


Regarding your first statement: It depends if you have a valid contract with the user and the data processing is sufficiently related to the performance of that contract.


> Functions that allow users to delete and automatically download/access their own data are good practice under legitimate interest but not required. In general you are anyway allowed to handle these types of request on a case-by-case basis, as long as you provide your data subjects with an email address.

I want to be very clear - you _almost always, without exception_, have to provide access to/a copy of personal data to the data subject, no matter what legal basis is used (consent or not). "Data portability" - providing the data in a commonly used electronic format, such as a JSON/XML download - is optional when using legitimate interest but mandatory when using consent.
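For the portability part, a dump in a commonly used format can be as simple as this sketch (the field names and record layout are illustrative - the regulation doesn't specify a schema, only a structured, machine-readable format):

```python
import json

def export_user_data(user_record):
    """Serialise one user's personal data as machine-readable JSON,
    one simple way to satisfy data portability. The fields shown are
    hypothetical examples of what a service might hold."""
    portable = {
        "email": user_record["email"],
        "display_name": user_record["name"],
        "posts": user_record.get("posts", []),
    }
    return json.dumps(portable, indent=2, ensure_ascii=False)
```

The output can then be offered as a download or attached to the reply to a subject access request.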

You also _normally_ have to delete personal data of those data subjects who have requested it. You also normally have to stop using the data for the purposes the data subject requests you to stop with.

These processes do not normally need to be automated (but with consent it should be as easy to provide consent for the data subject as it is to revoke it).


Data portability is no big deal; you can simply dump the unformatted output of your SAR process. So if you've built the SAR process you've built that as a byproduct.


Yes. Unless you have legacy systems with screenshot-based SAR responses, for example.


The sort of companies with those legacy systems can afford the development time to extract data, and aren't looking for advice on HN.


The main processing basis for many entities will be straightforward processing to provide a service under Art 6.1.b. Legitimate interests, like consent, should generally be avoided wherever possible because of the additional burden it places on organisations to document the balancing test undertaken, and the potential for that assessment to be questioned in future. That said, I fully agree that in many cases it will be entirely appropriate to use legitimate interests as the relevant processing basis.


I agree with this too. However, if you are going to use that basis, it depends on whether you have a valid contract with the user and whether the data processing is sufficiently related to the performance of that contract.

Notwithstanding Article 5.2 on accountability, I do not believe that non-controversial data processing under legitimate interest needs to be well documented in practice, although I may be proven wrong. I believe it is sufficient to mention legitimate interest and what your legitimate interest is in broad terms in the privacy notice.

For controversial use - that is, processing which the data subjects may not approve of - I believe you need documentation on the balancing test. However, in controversial cases you probably also need documentation on necessity, both when it comes to performance of contract and legitimate interest.


Sure, maybe it doesn't need to be, but the Article 29 guidance is clear that, as a matter of good practice, you should document the balancing test you have undertaken to determine that legitimate interests is an appropriate basis.

Yes, quite right on the contract side. At the lowest level, a contract could be implied, or would arise from terms of use on a site. Absolutely, processing under the service provision ground should be limited solely to that which is necessary to provide a service.

I guess if you wanted to take things further on the authentication front, you could argue that authentication/login may not strictly be necessary to provide certain of the services, as they could be provided in the absence of a login (creation of a to-do list, for example). However, my view would be to take things on a broad basis: if a good proportion of the services require authentication (buying content on the basis of a to-do list, say), then you could put everything under service provision pursuant to a contract, rather than splitting between the legitimate interests and service provision grounds.


I agree with everything in your comment.

Regarding necessity I think it should not be interpreted too strictly. Rather, I believe it means something like this in EU law:

“Necessity implies the need for a combined, fact-based assessment of the effectiveness of the measure for the objective pursued and of whether it is less intrusive compared to other options for achieving the same goal.”

(Quote from: https://edps.europa.eu/sites/edp/files/publication/17-06-01_... )


This is the biggest problem with GDPR: there's no agreement on what it means, yet it goes into effect in a few weeks.


What makes you say that there's no agreement as to what it means?

It feels like I see this sort of view expressed quite frequently. My guess is that it's primarily because people want a reason not to look to comply in lots of cases, or to dismiss GDPR. "How can we comply if no one knows what it really means to comply".

In many cases, the GDPR simply reiterates/builds upon existing data protection law which has a wealth of interpretative decisions and guidance. In other areas, the Article 29 Working Party has been issuing guidance on specific aspects of GDPR.

Yes, the GDPR is a lengthy piece of legislation but there are straightforward steps people can take and they generally centre around respecting users' data.


What makes you say that there's no agreement as to what it means? It feels like I see this sort of view expressed quite frequently.

One fundamental problem is that the GDPR, if interpreted literally and fully enforced to the letter, is absurdly onerous for any small organisation and allows for fines that pose an existential threat without any requirement for proportionality.

Defenders of the GDPR, including some of the official regulators, often argue that such concerns are exaggerated and that regulators are likely to take a pragmatic approach, trying to educate those breaching the rules rather than coming in with crippling fines. Maybe that will turn out to be true, but in past instances of overly powerful or broad EU rules there certainly have been cases of heavy-handedness by regulators and courts, so it seems unwise to count on a different result this time.

In any case, pragmatic enforcement would not make the law itself any better. Those responsible for working with personal data still have to choose between erring on the side of going too far in their compliance efforts, thus finding themselves at a disadvantage compared to competitors who do not, or not going far enough, risking a regulator dropping the sword of Damocles at any time, with no objective standard for "far enough".


Sure, I see where you're coming from. I guess we'll just have to wait and see whether data protection authorities start dropping 20 million Euro fines on people from day 1 for breaches of the law. My view and instinct is that this won't happen. Even with the relatively low level of fines at present, in the UK for example the Information Commissioner's Office has rarely reached the limit.

However, can you give me specific examples of where it would be 'absurdly onerous' to comply? I assume you're talking about restriction of processing, data portability, rights of erasure in the main? Yes, this creates costs, but overall these are minor matters compared to what the regulators will actually focus on which is blatant misuse of consumer data and failure to implement appropriate security measures.

Also, can you give me examples of heavy-handedness by regulators and courts in relation to EU rules? The main example that could potentially fall within this bracket relates to anti-competitive behaviour. In relation to privacy-related matters, the revised E-Privacy Directive in relation to cookie consent was widely ignored without any real ramifications that I'm aware of. On existing data protection law generally, data protection authorities have been relatively restrained in my experience, with the larger fines coming from blatant misuse of personal data or data breaches where even basic security protections were not in place.

What detrimental effects do you think will follow from complying with GDPR compared to those who do not? I'm not saying there won't be any but would be good to understand if you have any specific examples. Do you imagine that whilst some organisations strain to comply with GDPR, others will be forging ahead with new features and capturing market share?

Another point on competition is that on one view, because GDPR is expanding the territorial scope this levels the playing field to an extent. Increased fines also create a disincentive to engage in behaviour harmful to users' privacy. I appreciate that enforcement will likely remain an issue for those outside the EEA depending on the nature of the entity. I cannot imagine that Google would simply avoid paying the previously levied fines, depending on how the appeals go.

My experience is that many businesses are not falling over backwards to comply with GDPR. I certainly haven't seen businesses going 'too far' in looking to comply. Businesses that have taken sound advice have adopted a risk-based approach to GDPR compliance, assessing where the greatest risks are and acting accordingly. The regulatory focus will not be on small businesses, but instead on players like Google, Facebook and those losing vast quantities of user data.


Sure, I see where you're coming from. I guess we'll just have to wait and see whether data protection authorities start dropping 20 million Euro fines on people from day 1 for breaches of the law. My view and instinct is that this won't happen.

Of course it won't, but the unlikelihood of the extreme position doesn't make the broader risk of an excessive or heavy-handed response any better.

Also, can you give me examples of heavy-handedness by regulators and courts in relation to EU rules?

Sure: one of my own businesses received a letter from a national tax authority in another EU member state shortly after the new VAT rules for digital sales came in, alleging that we had committed serious tax offences, demanding payment of money we couldn't possibly afford by a deadline that wouldn't even allow time for consulting lawyers or accountants, and threatening immediate and very scary action against us if we did not comply. At first, we thought it must be some kind of hoax, but then the terrifying reality that we really were being threatened by a state actor with enough power to wipe our fledgling business from existence dawned.

If you've never been on the wrong side of a government mistake, you might suggest that our concern over that letter was overblown, paranoia even. Surely no government would not only make such a mistake but then follow through and cause real damage, right? Well, writing as someone who unfortunately has previously been the victim of another serious government mistake in connection with tax affairs, and had life turned upside down for several months trying to sort it out with very real and very scary consequences, I can personally assure you that concern about the consequences when the system goes wrong is quite justified.

What detrimental effects do you think will follow from complying with GDPR compared to those who do not?

Do you mean what is the cost of compliance for those who try to comply, as compared to just ignoring the rules? The cost is all the overhead of writing documents and conducting audits and setting up systems you might never need, just so that you can tick the right boxes. There are plenty of estimates around suggesting that actually carrying out all the work set out in black and white in the ICO's guidance for data controllers and data processors would take weeks and cost tens of thousands of pounds at a minimum. There are a lot of microbusinesses, which of course are covered by this law just like anyone else, for which that represents literally their entire annual turnover and probably a substantial proportion of the total time they have available to do their work in a year.

Do you imagine that whilst some organisations strain to comply with GDPR, others will be forging ahead with new features and capturing market share?

I'm absolutely sure that will be the case, just as it was with things like the new VAT or consumer protection rules before.

As a direct personal example again, that same business I mentioned before lost weeks of developer time updating systems to comply with the EU VAT rules, including a substantial part of one of our developers' Christmas holiday because the rules came into effect right at the start of the year and guidance was still being updated just days before. We later discovered that hardly any other businesses of our size or even substantially larger were even making a serious attempt to comply, essentially meaning that we had wasted all of that time and money trying to do the right thing, while others including our competitors were apparently committing tax fraud with impunity.

As another direct personal example, not only did we have to spend time and money updating systems to comply with the new consumer protection rules for online sales a few years back, we also saw a noticeable drop in conversions because of the scary legal wording we are now required (and this is directly from our lawyer) to display prominently during our checkout process, even though in reality we had always offered significantly better conditions for our customers than anything those consumer protection rules actually required anyway. And of course any competitor outside the EU was free to continue with the streamlined checkout process they had, no scary wording required.

My experience is that many businesses are not falling over backwards to comply with GDPR. I certainly haven't seen businesses going 'too far' in looking to comply.

Are you advising my business to knowingly break the law?

Businesses that have taken sound advice have adopted a risk-based approach to GDPR compliance, assessing where the greatest risks are and acting accordingly.

What did that advice cost, and what proportion of small or micro businesses do you think have paid to receive it?

The regulatory focus will not be on small businesses, but instead on players like Google, Facebook and those losing vast quantities of user data.

So they said about the VAT rules, a few weeks before a government organisation against which my business and I had no meaningful defence threatened to destroy a large part of my life that I and others had spent several years building. You'll forgive me, I hope, if I don't take their word for it this time.


What is “it”? GDPR is in large part based on the Data Protection Directive from 1995 and “Convention 108” from 1981. There is ample case law, as well as data protection authority opinions, guidance, etc.


GDPR will go into effect this month.

EDIT: My mistake, in May


It will apply from 25 May 2018


I really don't understand how this is going to work in practice for small side projects with a single part-time developer. How are they supposed to afford implementing all these changes, none of which seem trivial or even practical for your standard little PHP site?

So if I run a forum as a side project, what are my options?

1) Spend all free time over the next few months adding these features and neglect any other work on the project.

2) Ignore the GDPR and hope nobody complains.

3) Shut down the side project.

Of course if you're Facebook or Twitter you just assign a few developers to this and you'll be fine. But I don't understand how this will not end up killing small-time web companies, or at least make them a lot less feasible to create.

I suspect many people will go for (2) and hope this fizzles out the same way the cookie law did.


Don't store a bunch of personally identifiable data and you don't have to do any of this.

We have seen what this laissez-faire attitude of "capture everything, delete never" has done. Trust has been supremely squandered, so at this point I don't think anyone is particularly inclined to believe it when someone cries wolf.


The problem is that pretty much everything seems to be considered personally identifiable data. Any web community will at least be storing usernames, passwords, emails and most likely IP addresses. As far as I know, all of that counts.

And even if you don't have a login system, your web server is still going to be logging people's IP addresses.


You're allowed to log people's IP addresses for diagnostic reasons, to prevent fraud and abuse. You don't have to delete these logs when someone emails you if they're still useful for those purposes. You do need to delete those logs eventually though, and you need to disclose how often that is. It can be ten years if you want.

If you're not using IP addresses for those reasons, you shouldn't collect them. It is good advice to mask off the bottom bits of the IP address, since it will still likely be unique enough for your purposes without being useful to someone who has an unrelated list of IP addresses.

You should probably be doing that anyway: If you get hacked, you're basically giving your attacker a list of targets.
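A sketch of that masking, using Python's standard ipaddress module (keeping a /24 for IPv4 and a /48 for IPv6 is just a common convention, not a legal requirement - pick prefixes that still serve your abuse-detection needs):

```python
import ipaddress

def mask_ip(addr, v4_bits=24, v6_bits=48):
    """Zero the host bits of an IP address before it hits the logs,
    so the stored value is coarser than a full address."""
    ip = ipaddress.ip_address(addr)
    bits = v4_bits if ip.version == 4 else v6_bits
    # strict=False lets us pass a host address rather than a network
    net = ipaddress.ip_network(f"{addr}/{bits}", strict=False)
    return str(net.network_address)
```

For example, mask_ip("203.0.113.77") yields "203.0.113.0", which still groups requests by network for abuse patterns without recording the exact host.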

You should be able to delete usernames, passwords and email addresses on request. You should be able to remember that request if you restore from backups. And you should not be storing passwords themselves (only salted hashes), since plaintext passwords put your users at risk if you get hacked, and you're responsible for keeping their personal data safe.
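On the password point, a minimal sketch of storing a salted, slow hash instead of the password itself (the PBKDF2 iteration count here is illustrative, and a maintained library such as bcrypt or argon2 is usually a better choice than rolling your own):

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None):
    """Derive a salted slow hash of the password; store (salt, digest),
    never the password itself."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest

def verify_password(password, salt, digest):
    """Recompute the hash with the stored salt and compare in
    constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, digest)
```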


You don't have to delete these logs when someone emails you if they're still useful for those purposes. You do need to delete those logs eventually though, and you need to disclose how often that is. It can be ten years if you want.

That sort of argument has worked pretty well for Disney and friends when it comes to extending copyright indefinitely and yet not technically violating the US constitution's "temporary" wording. I wonder how well it will work in this case. How can anyone possibly make a reasonable, intelligent, a priori assessment of how long that data will be useful for that kind of purpose?

It is good advice to mask off the bottom bits of the IP address since it will still likely be unique enough for you, without being useful to someone who has an unrelated list of IP addresses.

That isn't necessarily true if, for example, you're using IP addresses to detect a pattern of abusive access within a general pool of acceptable access. And if a full IP address isn't enough to potentially identify a specific threat, it surely also isn't enough to constitute personal data. Either way, your advice seems excessively broad here.

You should be able to delete usernames, passwords and email addresses on request. You should be able to remember that request if you restore from backups.

Doing this effectively without immediately becoming self-contradictory probably requires something like storing a hash of every user name or email address that has had deletion requested, and querying a database of such hashes during the restoration process in order to exclude affected data. While this may well be in the spirit of the GDPR, there is no denying that it is extremely onerous, particularly for anyone working with personal data who rarely if ever receives such a request yet would have to redesign a large part of their IT infrastructure around the ability to do it.
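The mechanism described above is roughly this (the names and structure are hypothetical, just to show the shape of it):

```python
import hashlib

def _h(email):
    # Normalise before hashing so case/whitespace variants match.
    return hashlib.sha256(email.strip().lower().encode()).hexdigest()

class ErasureLedger:
    """Keep hashes of erased identifiers so a backup restore can
    re-apply past deletions without retaining the identifiers
    themselves."""

    def __init__(self):
        self.erased = set()

    def record(self, email):
        self.erased.add(_h(email))

    def filter_restored(self, records):
        # Drop restored rows whose email was previously erased.
        return [r for r in records if _h(r["email"]) not in self.erased]
```

The code itself is small; the burden claimed here is in wiring a check like this into every backup and restore path an organisation has.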


> How can anyone possibly make a reasonable, intelligent, a priori assessment of how long that data will be useful for that kind of purpose?

By first coming to grips with the fact that this isn't your data.

Keeping data is always a risk. You risk being hacked and jeopardising losing control of people's personal data. The longer you keep it, the longer you are putting that data at risk.

How long can you be expected to go without being hacked? A year? Five years? Ten years? At which point do you think you can be compliant?

> > If you're not using IP addresses for those reasons [fraud detection], you shouldn't collect it. It is good advice to mask off the bottom bits of the IP address since it will still likely be unique enough for you, without being useful to someone who has an unrelated list of IP addresses.

> That isn't necessarily true if, for example, you're using IP addresses to detect a pattern of abusive access within a general pool of acceptable access

Why do you say something is not necessarily true when I have already given this exact reason?

> Doing this effectively without immediately becoming self-contradictory

No, it requires keeping the two or three requests for data or erasure posted to you, or emailed to you.

This isn't complicated. I have email from over twenty years ago.


By first coming to grips with the fact that this isn't your data...

The trouble is that you haven't adopted any objective or actionable position here. It's easy to pass the buck with more questions. What small organisations need is simple, verifiable answers. The absence of such answers from authoritative sources is possibly the single greatest criticism being made of the GDPR.

Why do you say something is not necessarily true when I have already given this exact reason?

You said it was good advice to mask off the bottom bits of the IP address. I don't think that is good advice in general, for the reason I gave: either the full address has a legitimate use for identifying specific threats, or it probably isn't specific enough to constitute controlled personal data in the first place. Either way, it's unclear what benefit derives from masking part of it.

No, it requires keeping the two or three requests for data or erasure posted to you, or emailed to you.

Sorry, but again I can't see how your reply fits with anything I wrote. What point were you trying to make here?


> The trouble is that you haven't adopted any objective or actionable position here.

What, that you don't own someone else's personal data?

If someone signs up for your service, you do not own their email address and cannot use it in a way they wouldn't want.

What part of that is unclear?

> You said it was good advice to mask off the bottom bits of the IP address

If it wasn't needed.

You keep dropping this, so maybe it's because you didn't read it when I wrote it or when you quoted it.

> I don't think that is good advice in general, for the reason I gave: either the full address has a legitimate use for identifying specific threats, or it probably isn't specific enough to constitute controlled personal data in the first place.

Well, you're wrong. The ICO has recommended it in general for exactly this reason:

https://ico.org.uk/media/1061/anonymisation-code.pdf

But again, the GDPR allows for an exception if the data is being used to eliminate fraud:

https://www.privacy-regulation.eu/en/r47.htm

> What small organisations need is simple, verifiable answers

Which is convenient that the ICO is offering simple, verifiable answers: https://ico.org.uk/for-organisations/business/

> The absence of such answers from authoritative sources is possibly the single greatest criticism being made of the GDPR.

Possibly, but also possibly not. So what?

It's still law.

> I can't see how your reply fits with anything I wrote. What point were you trying to make here?

I said: "You should be able to delete usernames, passwords and email addresses on request. You should be able to remember that request if you restore from backups."

You said I can't do that without storing a hash of every user name or email address that has had deletion requested, and querying a database of such hashes during the restoration process, which is bonkers: how many complaints and requests for deletion are you going to get? You might get two or three. Ever. How often do you restore backups? Monthly? Yearly? How hard is it to search two or three requests whenever you do a database restoration, if you do something a dozen times a year?

If you're a bigger company you might do it more frequently, but then it's clearly no longer onerous.


What, that you don't own someone else's personal data?

No, sorry, I was referring to your whole opening section, which was a response to my question, "How can anyone possibly make a reasonable, intelligent, a priori assessment of how long that data will be useful for that kind of purpose?"

My point is that even something as "simple" as deciding how long you should retain some data that you originally acquired and have used for some obviously legitimate purpose is not necessarily an easy question. These are the sorts of issues where I'm arguing that small organisations really need simple, concise, clear guidance.

You've linked to a few pages on the ICO's site throughout this discussion. I note in passing that reading and understanding them fully would take many hours, even just looking at the high-level guides and checklists, and having done so, they still leave numerous subjects open to interpretation or judgement where someone would probably need real legal advice to find out where they might stand in practice.

Now, if you're a business with an in-house legal team and an in-house IT team and a turnover well into the millions and a designated management structure and established processes for doing most things, that's probably not a big deal. But if you're three guys running an Internet startup from someone's bedroom, or even a significantly more established online business but not large enough for dedicated IT staff or in-house lawyers (which is still the scale that the vast majority of businesses are working at), no-one has time to read through dozens of pages of "guidance" full of subjective-anyway legalese. You need clear, actionable guidance, and you need a clear, unambiguous legal framework.

I would argue that expecting even a good faith effort at compliance from many of those smaller businesses is unrealistic, given the "support" available today. They're just going to break the law, either of out of ignorance or out of apathy, and either way the law didn't achieve anything useful in all of those cases. That then means that any of us who are concerned with trying to do the right thing and do want to understand our real legal obligations will automatically be at a disadvantage. That's not a good way to encourage compliance or to support smaller businesses growing (and in particular, growing responsibly and legally).


> No, sorry, I was referring to your whole opening section, which was a response to my question, "How can anyone possibly make a reasonable, intelligent, a priori assessment of how long that data will be useful for that kind of purpose?"

Doesn't matter.

It's not your data.

It's their personal data.

You may use it as long as it benefits them; As long as they would want you to.

Why do I want a site to have my email address? To send me notifications I've requested? To facilitate a password recovery process I initiate? To send me marketing material that I am interested in? Surely it's obvious that if I change my mind, that's up to me.

Why do I want a site to have my IP address? To help protect my account? Sure. For blacklisting addresses that try to log into my account with an invalid password? Of course.

See? Specific examples are easy. Enumerating them is hard though, which is why European courts don't do that. They rely on organisations like the ICO to field questions, make a judgement call, and publish guidance for frequently asked questions. But for the most part, it's just common sense.

> My point is that even something as "simple" as deciding how long you should retain some data that you originally acquired and have used for some obviously legitimate purpose is not necessarily an easy question.

You should keep it for as short a period as possible.

The longer you keep personal data, the longer it is at risk. Remember you're also responsible for keeping that data safe. If you get hacked and lose control of that data, you're responsible!
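Deleting data you no longer need can be a very small scheduled job. A sketch, assuming a SQLite table with an ISO-8601 `created_at` column (all names here are illustrative):

```python
# Illustrative retention purge: drop log rows older than the chosen
# retention window. Run it from cron or a scheduler.
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 90  # keep it as short as your purpose allows

def purge_old_logs(conn):
    cutoff = (datetime.now(timezone.utc) - timedelta(days=RETENTION_DAYS)).isoformat()
    # ISO-8601 timestamps with the same offset compare correctly as strings
    conn.execute("DELETE FROM access_log WHERE created_at < ?", (cutoff,))
```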

> You've linked to a few pages on the ICO's site throughout this discussion. I note in passing that reading and understanding them fully would take many hours, even just looking at the high-level guides and checklists, and having done so, they still leave numerous subjects open to interpretation or judgement where someone would probably need real legal advice to find out where they might stand in practice.

Understanding how to pay tax in the US correctly takes much more than "many hours", and very often requires professional advice.

However very few people worry about tax on millions when their income is under $100; Very few companies need to worry about how to handle millions of requests for erasure when they don't even have any personal data.

But some prudence helps: Simply "not keeping data you don't need" is the ICO's advice. It's also best practice for IT security.

Take the few hours to understand it. If you've got specific questions, I might try to answer them, but you're unlikely to have more than a few for your specific business case, and you'll find legal advice for those questions cheaper than tax advice.

> I would argue that expecting even a good faith effort at compliance from many of those smaller businesses is unrealistic

Well, good luck with that.

My experience with European regulators like the ICO is that they're not going to be amused by your argument.

The fact that most of this has been law for decades is a big part of why I think it's not unrealistic.

> That then means that any of us who are concerned with trying to do the right thing and do want to understand our real legal obligations will automatically be at a disadvantage.

Then do the right thing: Treat the data as the person would want it treated. Make sure you are proactive and do your best. Don't worry so much that someone else is going to treat people a little worse so you need to abuse people as much as is legally permitted.


I'm still grasping GDPR myself, but the problem of deleting users from backups might be solved via UIDs. Each user should have a UID that isn't PII by itself. Upon getting an erasure request you remove everything but preserve the UID and flag it. Then, when restoring from backup you can easily see which users need to be erased, and you've not stored PII for any amount of time.
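A minimal sketch of that UID scheme, with the user store modelled as a plain dict (all names here are illustrative):

```python
# Erase a user's PII but keep a flagged, non-PII UID so that data
# restored from backup can be scrubbed again.

erased_uids = set()  # tombstones; persist these outside the backups

def erase_user(users, uid):
    users.pop(uid, None)   # drop all personal data
    erased_uids.add(uid)   # keep only the opaque flag

def scrub_restored(users):
    """After a restore, re-apply every recorded erasure."""
    for uid in erased_uids:
        users.pop(uid, None)
```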

As an aside, you also have to understand that some data is only PII when you have other data joined with it. Extended PII can easily be ingested into a system and stripped of its association with the user. A value independent of other identity data means it's no longer PII, extended or otherwise. But, again, I'm still grasping this myself. Please correct me if I am wrong.


I think your approach with UUIDs is practically equivalent to my suggestion involving hashing: you replace something that is the actual personal data with an irreversible proxy.

My concern remains the same either way. It's not that such measures can't technically be implemented, it's that the effort required to do so in practice is disproportionate, particularly for smaller organisations using limited personal data for legitimate purposes where there is little risk to privacy from otherwise properly handled data not being fully deleted on demand.

For example, instead of a small transport business buying a standard backup service and using backup and restore tools that just save all their important data to a secure, reliable location in case of disaster, it appears that they might now have to implement data crunching logic customised to their specific circumstances, despite possibly having no knowledge about how databases or programming work at all.

I fail to see how such a requirement would be constructive in terms of safeguarding anyone's privacy in a meaningful way, but I also fail to see how it isn't required according to the letter of the GDPR.


PII isn't a term used by the GDPR, and the association stripping we're used to (from healthcare, for example) isn't necessarily enough... GDPR covers pseudonymized and anonymized personal data too.


It's even easier than that: Just keep the requests for deletion in a separate database, like hardcopy in a filing cabinet. Then train your staff to check the filing cabinet whenever they restore from backups.


Your webserver logging IP addresses is a legitimate interest that does not require consent (preventing abuse).

I do hope you're not storing passwords, only a securely stretched and hashed variant, which is not subject to the GDPR.

Emails and Usernames will require consent, as do IPs stored outside a context of legitimate interest (preventing abuse).

The only thing that most communities will be missing is a function to delete all user data from their website (ie, delete user + all posts on request) but that should come into standard software soon enough or is already present.


Passwords are not personal data, and even if they were, you aren't storing them anyway (right?).

There's a tricky question around usernames: they're technically not personal data since they aren't necessarily the real name of the user, but the user can choose to use their real name as their username, which is not easy to distinguish in any automated way.

You don't need to store IPs.


If someone makes a real name username and you don't do anything else about it, you should be fine.

However, if you take that username, look up the user in the real world and send spam without consent you are in deep shit.

It's all about consent of use and for what. The user has implicitly given consent to use their username as a username but nothing else.


The point about IP addresses and emails makes me wonder what would happen to spammer blacklists. If they're personal information, does that mean services like Stop Forum Spam and Akismet need to remove data when requested, even if it makes life harder for forum admins and community managers? What about those topics you see on admin forums where you're given a giant list of IPs and emails to copy into the ban settings page?


> where you're given a giant list of IPs and emails to copy into the ban settings page?

If you're behind HN, and I email you and tell you to delete CM30's profile, would you do it or would you request me to prove ownership of the profile?

If you're behind HN, and I email you asking you to delete thousands of accounts, would you just do it, or would you request me to prove ownership of those thousands of profiles?

Point being: You're not gonna delete anything until you have verified that the user is really who he is claiming to be.


Preventing abuse is a legitimate interest to keep data even if deletion is requested.

For the same reason users can't request you delete all copies of bills and receipts on them.


A blacklist is a legitimate interest isn't it?


Usernames are not necessarily personally identifying, and are freely given by the user to be publicly displayed on a forum. Passwords are not personally identifying, and you shouldn't be storing them anyway!


> Usernames are not necessarily personally identifying

Literally the only purpose of usernames is identifying people.

I don't know the actual text of the GDPR, so I don't know the definition of "personally identifying", but I find it very strange that people seem to restrict the definition of "personally identifying" to "connects directly to your IRL identity".


The distinction stems partly from the US-EU divide. The US uses the term personally identifiable information (PII). This has a narrow definition and relates to that which connects to your real identity. However, 'PII' is not a concept under EU data protection law, which instead uses the far broader term of personal data.

Personal data is both PII and any information which could be used to identify someone when combined with other information. As you suggest, a username is likely to be personal data because in lots of cases it is used as a unique ID and is used by the same person across multiple sites to identify them.


Smaller summarised guide for very small side projects:

1. ideally, don't store personal data

2. if you absolutely need to, store the bare minimum

3. if you're doing 2, don't give any of it to 3rd parties

The article splits its bullet points into three sections. The second section is basic security best practice: you should have this covered anyway regardless of the size of your project.

If you stick to my points above, the author's remaining bullets should either be null, or much much easier to implement.


One man shop here...

Cloudfront forwards country information to your origin servers in AWS. My plan was to not do business or display content in European countries until an easy solution to GDPR enables me to quickly meet its criteria. Certainly libraries will crop up helping to ease the burden of the regulation for smaller operations.

Though... I'm not quite sure what happens when a European citizen uses VPN to spoof a non-GDPR country to gain access to my site, provides me their personal data, then requests to be forgotten. Would it be relevant that I intended my site _not_ to be used in Europe and the user in question circumvented my attempts not to do business in Europe? My bet is that it wouldn't. * Shrug *

> 2) Ignore the GDPR and hope nobody complains.

> 3) Shut down the side project.

Basically, I'm implementing a hybrid of #2 and #3 by not letting European users into my system until I know I can comply with GDPR cheaply and easily.


I cannot imagine why anyone would want to intentionally violate someone's privacy and ignore their stated preferences; to use their data in a way they would not want, regardless of what country they live in. It surprises me that anyone would want to make an insecure system on purpose and take no responsibility for being hacked.

That's what "ignoring the GDPR" means on some level.

It just so happens that the GDPR provides a powerful tool to protect me and my rights, even when I'm in the USA on business, at a Starbucks on their public Wifi.

There's a lot of guffing about IP addresses and weblogs that is confusing a lot of people. Without knowing what kind of information you think you need to collect without permission and without benefitting the person involved, it's difficult to tell if you're a chicken little or if you're a scumbag spammer: I'll have no sympathy for the latter, but am happy to try and decode the GDPR for the former (within reason: I'm doing some GDPR consultancy at the moment).


Fair enough. Here's what I'll promise you: I'll dig more into GDPR for my side projects, and will opt to shut down projects (instead of firewalling them) if the effort to conform to GDPR is too large, while I bring my projects up to standard.


Do you do business in Europe? If you only have a handful of European customers, what exactly can happen if you get a letter from the French privacy regulator? Toss it in the bin and move on with your life. Maybe delete the complainant's account if you want.

The major costs with GDPR for a small player are things like:

* understanding the law (far from trivial, particularly given how the various privacy regulators can't be arsed to produce final guidance even to date). Consent is moderately straightforward, but eg legitimate interest balancing tests aren't.

* figuring out every database and table that has user data stored in it

* figuring out 3rd party systems with such data (your transactional mailer, marketing mailer, billing, logging, etc)

* were your marketing consents GDPR-compliant (pro-tip: they weren't)? What consent is every marketing contact tied to? Why do we have to reconsent everyone when there is already a working 1-click opt-out link in every marketing email?


I don't think you can escape it by simply throwing up a firewall to try to block EU users. It seems that you're just inextricably pulled into this as soon as you record any data about an EU citizen. Unfortunately you can't know with good certainty whether or not that has happened.


The GDPR applies to european citizens living outside Europe.

You have a fire in your kitchen, and instead of addressing it, you're closing the door.

From the inside.

[Edit: And to be clear, if I ever heard about a site pulling this sort of shit, I would be extremely compelled to issue GDPR requests towards it. This mindset amuses exactly nobody except you.]

Edit 2: I'm being a little snarky here, so here's a bit of actual helpful advice: Instead of trying to be a smartass with the law, which never works, do what every other one-man shop your size will do and ignore it until it becomes an issue. Once it does become an issue, comply in the best faith you can.

Most requests you'll get will probably simply be: "Please delete my account" and "Please send me a copy of my data". You have 1 month to comply with it. I have no doubt you'll be able to.


The GDPR applies to european citizens living outside Europe.

The EU would like to think so. Whether it actually can enforce its law extra-territorially is an entirely different question, the answer to which most likely depends on the nature of any formal agreements it has with other relevant jurisdictions and/or the local law in those jurisdictions.

And to be clear, if I ever heard about a site pulling this sort of shit, I would be extremely compelled to issue GDPR requests towards it.

And that is exactly why the GDPR is too absolute and one-sided. It imposes significant burdens on those working with personal data and provides data subjects with rights that can be abused in a form of barratry.

Instead of trying to be a smartass with the law, which never works, do what every other one-man shop your size will do and ignore it until it becomes an issue.

Isn't that also being a smartass with the law?

Moreover, if compliance with the law is so onerous that small organisations can't reasonably be expected to do it anyway, that's a pretty clear case that the law is too strong.


> Moreover, if compliance with the law is so onerous that small organisations can't reasonably be expected to do it anyway, that's a pretty clear case that the law is too strong.

The text on the GDPR is actually super reasonable. The whole thing is pretty short for how big people say it is, every article is sub 1-page, and essentially everything comes with "within reason" asterisks of various kinds (deadline extensions, "appropriate for context", "doesn't apply if request is unreasonable", etc).

So no, small organizations can absolutely be expected to follow it.

What's happening is some americans here are simply having culture shock. In Europe, the concept that consumers have strong protections and that businesses have responsibilities is neither new nor uncommon.

> Isn't that also being a smartass with the law?

Not really no. Some provisions of the GDPR are continuous, but most of them are what consumers can request of you (their data, deletion/update of their data, etc). Most of the things people freak out about is stuff that doesn't have to be handled until you get your first request.

If you are a one-man shop, aren't handling a lot of personally-identifiable information and in general don't have a big site, you won't have a problem following it. None of GDPR requires that you build automated systems for everything, as other commenters have pointed out.

All this is just restoring some sanity in a world where far too many businesses don't give a crap about their customers.


Just to be clear: I'm in the UK (as are my businesses), I'm generally an advocate of strong privacy protections, I can and do put my money where my mouth is by supporting various organisations that defend such protections, I have read the entire GDPR and a large amount of guidance related to it, I have consulted with other experts on it, and from day one my businesses have always followed careful practices in terms of how much data we collect, what we do with it (nothing at all shady) and how we store it.

In short, my view that the GDPR is bad law doesn't come from a culture shock, a lack of familiarity, or a lack of understanding or expert advice. It comes from not liking poorly-written EU laws that are open to abuse, and from direct personal experience (described in more detail in other comments, so I won't repeat that here) that such laws can actually be abused in practice with potentially serious consequences. And it comes from not liking vague regulation where you don't know how far you really have to go to comply and what the real rules of the game are, and misjudging in either direction has a cost.


> In short, my view that the GDPR is bad law doesn't come from a culture shock

I wasn't implying that. I'm saying that the poster I was initially replying to is getting culture shock; overwhelmed by the sudden need to care about their customers' privacy.

It's a regulation document, of course it's not going to be perfect, I don't know a single one that is. Overall as far as these go, I don't see major problems with it (other than yes, some of the terms in it are loosely defined, including what kind of data is covered under it; these will be things I expect will be learned over time).


It looks like the maximum fine is 4% of annual revenue... seems like the regulation has no teeth if you have no revenue. IANAL and could be totally wrong.

To your point about small companies, I agree, it feels onerous.

What irks me about the right to be forgotten is that it directly counters my right to remember things. Should a shop keeper be allowed to record their observations about who enters their store each day? If they maintain a physical guest book in their brick and mortar store, does a visitor have a right to be erased from that book?


> It looks like the maximum fine is 4% of annual revenue... seems like the regulation has no teeth if you have no revenue.

Up to €20 million, or 4% annual global turnover – whichever is higher.
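So the cap is effectively a `max()`, which matters precisely when revenue is low:

```python
# Administrative fine ceiling under GDPR Art. 83(5): the greater of
# EUR 20 million and 4% of total worldwide annual turnover.
def max_fine_eur(annual_turnover_eur):
    return max(20_000_000, 0.04 * annual_turnover_eur)
```

A company with zero revenue still faces a ceiling of €20M, not €0.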


The fine is up to 20M euro or 4% whichever is greater AFAIK.. and will match the seriousness of the offence. ie a small side project who refuses to honour a forget me request is highly unlikely to be fined 20m euro. However, one that deliberately mines the web for huge amounts of PID without consent and profits from it now has a new regulation to watch out for.


Re the last question, yes, absolutely. Why would data format matter?


I don't think it matters to the principle, but rather to the practicality. It's now practical to aggregate massive amounts of data and create a privacy concern that's unlikely to have existed in the world of offline records.

The principle is freedom of speech. If you tell me that I cannot write down the names of the people who came to visit me today without their permission, you are violating my right to freedom of speech.


If you need to retain information for a particular reason, for example to provide a service, or to retain for legal reasons, or legitimate reasons in connection with your business, then in many cases that will trump the right to be forgotten. See Art 17(1)(a).


Yes, for privacy reasons, it’s not appropriate to keep observations on your fellow citizen...


Dear diary, today I had a discussion on HN with tajen.

Will HN support erasing your comment history if you ask?


If you're a European citizen, in a couple of months they'll legally have to (assuming they don't want to just ignore any European court judgements, which is possible but risky).

I believe you can already delete your account and comments though?


Plus Hacker News doesn't ask for personally identifiable information in order to use it, unless you choose to put your full name in as a username.


HN has everyone's email.

With modern language usage analysis it would probably not be too difficult to work out who everyone is on HN who posts regularly. This is one of the reasons I post under my real name.


HN only has email addresses for members who provide it. It's not a requirement for account creation.


So what is the procedure going to be, exactly? Will HN request the user to send over a photo of their national ID card or passport to prove EU citizenship, and then proceed to delete comments and data?


Option 4) - adopt a risk-based approach to compliance, and look to assess whether any aspect of your service, and the way it makes use of data in its current form is an egregious breach of GDPR. If that is the case, you're likely in breach of existing data protection law.

In terms of risk factors, most side projects will generally use data to provide a service to customers. This type of processing is unlikely to attract regulatory attention.

It's also unlikely that the vast majority of entities will be 100% compliant come May 25, but as the Belgian data protection authority has pointed out, it is important that people demonstrate a good faith approach to compliance.

For new requirements around restriction of processing/provision of data in a programmatic manner, again, dependent on risk, it is likely not necessary to implement these features and side project owners should focus on building their product instead.

Relevant risks for a side-project owner are around a) volume of data held, b) types of uses of data, c) likelihood of users making requests around erasure/restriction (look at any historic requests received here), and d) regulatory focus on specific areas of the legislation.

This law is not going to fizzle out and on a general level, in my view it is advisable for any entity to look to respect user data regardless of legal obligations relating to that.


Small side project means small data too, easy to manually sort out gdpr requests as need arises. No need to automate everything.


True for some parts of it, but what about things like asking for explicit consent, logging each read access to the database or implementing a "restricted" feature? None of these things are supported by most standard software.


I can't remember the last time I worked with a database that couldn't write out a query log?

That's not to say GDPR isn't a pain, but there's not any significant technical challenges that I know of, it's much more of a challenge in terms of business processes


Last week I looked at a couple of hosted data services whose free tier didn’t give access to a query log. This is pretty common for hobbyist projects.

This means this logging has to be implemented at the app layer.

The developer has to pay for log retention, which doesn’t come for free with several of the hobbyist-level hosting services that I’m familiar with.
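When the hosted database exposes no query log, read access can be recorded at the app layer instead. A hypothetical sketch (function and table names are made up):

```python
# App-layer access log: record who read which user's data, since the
# hosted DB's free tier gives us no query log.
import logging

logger = logging.getLogger("data_access")

def fetch_user(conn, user_id, accessed_by):
    logger.info("read user=%s by=%s", user_id, accessed_by)
    return conn.execute(
        "SELECT email FROM users WHERE id = ?", (user_id,)
    ).fetchone()
```

In a real system the log would go to durable storage you control, which is exactly the retention cost mentioned above.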

I agree with your general point that there’s no significant technical challenges, at least for even a minimal dev team or a practitioner with time to put into this.

Maybe hobbyists shouldn't be handling this data in the first place, or maybe this is an opportunity for GDPR-compliant SaaS to help them do so safely.


Until Stripe, that seemed to be true of PCI compliance as well. Just keep saying your project is too small for anyone to spend any time trying to hack it.


'I suspect many people will go for (2) and hope this fizzles out the same way the cookie law did.'

I suspect you may be right. http://nocookielaw.com/


Or just don't store personal data?


4) If you are not living in the EU you are not obligated to implement any of the changes. Being a small time developer, what is the worst that can happen?


Banned from the majority of the world's developed countries.


What will happen? They will block your website?


If you service any users in the EU or process any personal information of EU citizens, you must comply with GDPR.


What will they do if you don't comply?


Warn you, help you, or fine you, just like they would an eu business.


Use forum software that already implements the required features.


God damn it EU, all these regulations make it impossible for small companies, indie developers to cope with all the bureaucracy.

The VAT for digital products, now the GDPR.

10 more years of regulation and you will spend 90% of the time working on implementing legal requirements and 10% on the actual product.


GDPR—while vastly different to what has become the de facto standard practice in most companies—is largely simple, basic, common decency and common sense. My very tiny startup won't have any problems complying because we've actually given a smidgen of consideration to our users' privacy up until now.

In fact, I foresee it being a much greater tax on large corporations: the work in GDPR is not compliance—that's relatively easy once you have procedures in place—the real work is converting existing non-compliant systems to bring them into compliance. This is going to be much easier for those maintaining relatively small, simpler systems, and easiest of all for brand new startups.


From what I have seen and understood about the regulations and the spirit of them this is basically right.

If your system was intentionally designed with both privacy and the ability for users to own their data (i.e. edit & hard delete whatever, whenever for any reason) in mind, then GDPR should be essentially complied with already 'out of the box'.

If this was not the case, either for cynical reasons, simple disregard for the importance of these things, or a decision to not prioritise these things in favour of shipping more features faster, and you just essentially slapped a checkbox with some legal copy over your signup process and thought you were done with all that pesky user data privacy stuff, well, you're in for a pretty bad time now.

Maybe my reading of the regulations is naive and it won't be so easy in the first case and will be easy to subvert anyway in the second case. But if not, to be perfectly honest it seems just like what good regulation should do - incentivise good behaviour - allowing businesses that behave well by nature to thrive without too much extra hassle introduced, and suppressing both the bad behaviour itself and the general productivity of the business behind it where that's not the case.


I'd hardly say that. "Forget me" can take a lot of design work (can introduce a ton of edge cases). "Export data" requires building an entire information processing pipeline.

Larger corporations have the resources to dedicate to this. But for a small startup deciding between spending 4 dev-months on "forget me" and "export data" versus on enabling the top 3 new primary use cases users are asking for, I understand how this could feel really difficult.

I really wonder if it wouldn't be better to make some of the requirements only for companies above a certain revenue threshold or the types of data collected. (E.g. export data is critical for health or finance-related sites, probably less so for a meme generator startup.)


I would. I'm doing some GDPR consulting at the moment and most of my conversations are "I don't think it's as complicated as you do". Americans tend to read law very pathologically unless they are familiar with how European legislation works, and every programmer out there thinks they are an armchair lawyer since there are "obvious" skillset similarities between decoding software and decoding law.

"Forget me" is very simple: If someone calls you up and asks you to stop using their data, you stop using it and remember that they've done this.

You do not have to:

- Destroy invoices

- Delete web logs

- Delete the record of them asking you to stop using their data

- Reprocess all of your backups

- Recall any reports you might have sent out

Or anything else that is silly. But your salespeople aren't allowed to see that person's details in your CRM anymore.
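Under that reading, "forget me" can be as small as a suppression list over the CRM. A sketch (the flat-dict CRM is an assumption):

```python
# Record the request and hide the contact from operational use,
# without destroying invoices, logs, or the request record itself.

suppressed = set()  # the record that they asked; keep this

def forget_me(email):
    suppressed.add(email)

def visible_contacts(crm):
    """What salespeople may still see."""
    return {e: c for e, c in crm.items() if e not in suppressed}
```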

"Export data" is also very simple for most companies. If you have a CRM containing information about a person, then that person can ask for that information.

> probably less so for a meme generator startup

What possible "personal information" do you think a meme generator startup actually has to collect on individuals that aren't their customers?

They should have a CRM containing companies who are purchasing advertising space on their meme generator startup, and perhaps leads that they have obtained through various incremental marketing sources. They probably do not have any personal information on their users, or if they do, their business will not be impacted by simply not collecting that personal information.

But maybe I don't understand what a "meme generator startup" would do because I'm not in their target market.


You keep mentioning how you're consulting on this issue at the moment and claiming that those of us more cautious than you just don't understand how European law works. Would you mind sharing a little more to justify that authority -- what qualifications do you have that we don't, what sorts of business are you consulting with and how much is compliance (including your advice) costing them, and why is your interpretation of the GDPR reliable in cases where a literal reading either clearly contradicts you or contains significant ambiguity that you imply doesn't matter?


Hi Silhouette,

I'm not claiming anyone more cautious than me doesn't understand how European law works. That's just silly.

I also don't know what qualifications I have that you don't. What qualifications do you have?

The sorts of business I am consulting to are sales and marketing agencies based in the US. As an SME I work with their in-house counsel to help them understand what the business is doing. I also help define process designed to make compliance obvious and transparent surrounding areas of my expertise.

I have no idea how much compliance is costing them. I don't know if they look at it this way.

Your last "question" consists of some more straw man and a little too much hand-waving: By all means, feel free to point to any contradiction with a specific recital and I can try to address it. If you have another source who claims to be an expert, I can also try to explain why I may have a different opinion than them.


First of all, please let me apologise if my previous comment came across as unnecessarily aggressive. Looking over the thread today, it could be read as quite hostile, which wasn't my intent.

My concern here is that in this discussion (and indeed in other recent HN discussions around the GDPR), you have on several occasions relied on your role as a consultant to support statements that various actions weren't necessary because of the GDPR, and to dismiss some of the potential legal arguments/concerns that several of us have raised suggesting otherwise as if they are some sort of legal trickery and EU courts/legal systems would not like them.

I claim no special qualifications in this area. I'm just a guy who is running businesses that might be affected by the new law and wants them to do the right thing, but wants that right thing to be practical and to know that we're on safe legal ground with it. Naturally I also talk to others in a similar position from time to time, and occasionally with consultants or lawyers active in the field, and so I know that many others share similar concerns and are asking the same sorts of questions.

What I'm seeing is that most of the experts are arguing for things like a "risk-based approach", which is the standard CYA consultant/lawyer answer to almost anything where they can't say "We don't actually know either, but you'll probably get away with it if you don't rock the boat". My point is that this is not good enough. The EU and member state authorities have form, as I've written about elsewhere, for introducing overly broad laws with insufficient safeguards and insufficient consideration for small businesses, and for then causing real and sometimes very serious damage to those smaller businesses in practice afterwards.

This is why I'm arguing that the GDPR as it stands is a bad law. This is why I want to see clear, concise, unambiguous answers from authoritative sources on issues around backups, log/journal-based records, and the like. And this is why I'm asking what your own qualifications are and what you know that we don't, given that just a couple of comments up you have casually dismissed concerns that many of us seem to have as being "silly", when those concerns are based on reading what the GDPR actually says and the ambiguity that we're hearing from other experts who don't seem to share your clear view of the subject.


> [I'm just a guy that] wants that right thing to be practical and to know that we're on safe legal ground with it.

Then explain clearly and specifically what thing you want to do that you believe isn't practical. Please say exactly what you want to do that you think is reasonable but that the GDPR says isn't.

- You don't need to destroy invoices. [1] [2]

- You don't need to delete web logs (if you block out the bottom octet of the IP addresses) [3]

- You don't need to delete web logs if you're using them to prevent fraud [4]

- You don't need to delete the record of them asking you to stop using their data [5] [6]

- You don't need to reprocess all of your backups [7] [8]

- You don't have to recall any reports you might have sent out [9]

Those are everything that I labelled as silly with a link to the authority and a supporting opinion if I think that the authority isn't clear.

If you see someone with a contrary opinion, my offer remains to try and refute any specific example.
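To make the web-log point in [3] concrete: masking the bottom octet is a one-liner in most languages. An illustrative Python sketch (the combined-log format here is an assumption, not taken from the ICO guidance itself):

```python
import re

def mask_ip(ip: str) -> str:
    """Zero the last octet of an IPv4 address, per the octet-masking
    approach; IPv6 addresses would need their own rule."""
    octets = ip.split(".")
    if len(octets) == 4:
        octets[-1] = "0"
        return ".".join(octets)
    return ip  # leave non-IPv4 tokens untouched

IP_RE = re.compile(r"^(\d{1,3}(?:\.\d{1,3}){3})")

def anonymise_log_line(line: str) -> str:
    """Replace the leading client IP in an access-log line."""
    return IP_RE.sub(lambda m: mask_ip(m.group(1)), line)

print(anonymise_log_line('203.0.113.42 - - [03/Mar/2018] "GET / HTTP/1.1" 200'))
# 203.0.113.0 - - [03/Mar/2018] "GET / HTTP/1.1" 200
```

Run it over the log at write time (not as a later batch job) and the unmasked address never touches disk.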

> What I'm seeing is that most of the experts are arguing for things like a "risk-based approach", which is the standard CYA consultant/lawyer answer to almost anything

The ICO recommends something similar, but it's not just about rocking the boat: If you're not putting people at risk, and you're not pissing anyone off, then you're probably not going to have trouble because an honest examination of your processes isn't going to reveal neglect or recklessness of another kind.

> and for then causing real and sometimes very serious damage to those smaller businesses in practice afterwards.

A citation would be helpful.

I suspect there's a balance: Are we harming a smaller business that was being inappropriate? Putting people's data at risk? What exactly are we talking about?

[1]: https://ico.org.uk/for-organisations/guide-to-the-general-da...

[2]: https://www.planetverify.com/impact-of-the-eu-gdpr-on-accoun...

[3]: https://ico.org.uk/media/for-organisations/documents/1591/pe...

[4]: http://www.privacy-regulation.eu/en/recital-47-GDPR.htm

[5]: https://www.twobirds.com/~/media/pdfs/gdpr-pdfs/34--guide-to...

[6]: http://www.privacy-regulation.eu/en/recital-65-GDPR.htm (note especially you keep the data in order to comply)

[7]: https://community.jisc.ac.uk/blogs/regulatory-developments/a...

[8]: https://ico.org.uk/media/for-organisations/documents/1475/de...

[9]: https://ico.org.uk/for-organisations/guide-to-data-protectio...


Interesting. Do you have a link to your consulting company? Do you have a blog on GDPR related topics?


I don't operate a blog, and my primary function at my company is as an SME, so I mostly consult to our customer's in-house legal. That said, my contact details aren't difficult to discover, so by all means reach out if there's something specific you want to talk about that you don't want to share publicly.


It wasn't the company's data to begin with. Modern businesses have caused harm to countless individuals by treating data cavalierly.

The GDPR puts things right. It brings the externality into the market, and now the market can correct.

Businesses that rely upon slinging private information around irresponsibly need to adapt. If they can't, their failure in the marketplace is just.


I'm not sure I've read anything in there that is hard to implement, other than retroactively.

I'm sure as time passes there will be frameworks and best practices developed for conforming to these regulations, but I honestly don't see anything egregious or complex to develop in there.


So what's the alternative? Completely lose all of your privacy? It is only developers who can fix this massive PII leaking.


There's plenty of alternatives. The main problem with GDPR is not the goal of advocating privacy but the details. I would have done it like this:

a) bring out regulation gradually instead of in a single big change like GDPR to have companies time to comply

b) don't write vague laws

c) give specific examples of what GDPR means in practice

d) be more lenient on smaller companies


a) companies had 2 years to comply. Furthermore, the guidelines of the European Commission are clear that the process should be gradual - inspect, write recommendations, small fines, bigger fines. Nothing like "20 million in June"

b) the law had to cover a lot of use cases and in order to do that concisely, it may sound vague in places. I don't like that either (developers never like uncertainty), but there's established practice already in regulators and courts about what is considered "adequate", "appropriate", etc. I agree it could've been better though.

c) that is happening already, e.g. ICO (the UK regulator) has a pretty good set of guidelines and examples. There's also the process of "prior consultation" where if you are not sure about something, you go ask your regulator for a decision

d) this is exactly what the "proportionate", "adequate", etc. are in there for. If you are a small company with 2000 data records, you are not posing a high risk to the rights and freedoms of data subjects, and so most of the things are not a strict requirement


a) The problem with this is that this practical guide was released on November 29, 2017. And it is unofficial. The EU should have released a practical guide two years ago in my opinion.

If the process is gradual the law should reflect that.

c) Good to hear :). Apparently it's this: https://ico.org.uk/for-organisations/guide-to-the-general-da... - I hope it's not written from the perspective of the UK legislation.

d) The law should clearly define what is required for smaller companies and what is not. There's some disagreement if this is the case in GDPR articles too.


Every country has a slightly different implementation of the directive, so I don't think the EU will have a single example to give.


However, GDPR is a regulation, not a directive. I haven't seen that countries pass their own implementation of it.


Each country-specific privacy org gets leeway around rules like legitimate interest.


a) The regulators had 2 years to write final regulations. They didn't do that either. Apparently it's too much to ask to have e.g. final guidance more than 3 months before the implementation deadline.

aa) In actuality, the ICO has made it clear that grace periods are not part of their regulatory strategy. See e.g. speeches by senior regulators.

b) hahaha go spend a pile of cash on lawyers (we're at roughly $50k) who are familiar with 30-ish countries privacy regulators. American companies are quite unlikely to have a lead regulator.

d) proportionate and adequate are words that create giant legal bills, because the gdpr naturally declines to spell out in any concrete fashion what those mean.


a) It is not a big change from the 1995 regulation. It is incremental. There is a feeling that the previous regulation lacked teeth with the multinationals, some of whom have chosen to ignore it. Facebook have lost two cases over aggregating data in Belgium and Germany in the last month.

b) I don't know if you are familiar with European law, but what you see as vague is what others see as flexibility. Laws setting out the spirit of what you are trying to achieve tend to age better than a rule based approach.

c) They did [0]. Because of b) it is not part of the regulation itself.

d) They were under the existing regulation, so why wouldn't they be now? The 'vagueness' as you put it gives a judge considerable flexibility to see if the steps taken to safeguard privacy were appropriate to your size

edit:add reference [0]:https://ec.europa.eu/info/law/law-topic/data-protection/refo...


Well, I personally don't like laws to be vague.


> a) bring out regulation gradually instead of in a single big change like GDPR to have companies time to comply

GDPR wasn't announced yesterday. The time span between announcement and implementation date is over two years. Of course if you only start now there isn't much time left, but then that's your own fault.


GDPR was announced years ago, but this practical guide was authored a few months ago. The EU should have released an official guide two years ago.


"I think all of the above features can be implemented in a few weeks by a small team. " how to trust the rest of the article after reading this?


It's interesting to see how the GDPR seems to clash with some popular data models. For example, git.

Rewriting history of a shared branch is disastrous, but it's currently the only way to redact, say, an e-mail address someone committed with a couple of years ago. I'm curious how the various code hosting sites plan to handle that. Perhaps we'll see an extension of the data model that links commits to committer UUIDs, with the actual information being linked to that, making removal easier.


Apparently Git is ok by GDPR as data subjects do not have the right to erasure if the information is meant for archiving purposes in the public interest [1].

[1] http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:320... (Article 17)


> Paragraphs 1 and 2 shall not apply to the extent that processing is necessary:

> [...]

> (d) for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) in so far as the right referred to in paragraph 1 is likely to render impossible or seriously impair the achievement of the objectives of that processing;

(emphasis mine)

I'd not say redacting a git repository does 'seriously impair' processing for archiving purposes. All the data (with the exception of the redacted e-mail) is still there, after all.

Still, the hashes will have changed, making the repo less useful for current users. But that has nothing to do with archival.
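That hash change falls straight out of git's data model: the author identity is part of the hashed commit content, and each commit hashes its parent's id, so redacting one email re-hashes every descendant. A toy sketch of the idea (not real git object encoding):

```python
import hashlib

def commit_hash(author_email: str, message: str, parent: str) -> str:
    """Toy model of a git commit id: author identity is part of the
    hashed content, and the parent hash is too, so one redaction
    cascades new hashes through all later commits."""
    blob = f"parent {parent}\nauthor {author_email}\n\n{message}".encode()
    return hashlib.sha1(blob).hexdigest()

c1 = commit_hash("alice@private.example", "initial", parent="none")
c2 = commit_hash("bob@example.com", "feature", parent=c1)

# Redact the email in the first commit...
c1_redacted = commit_hash("redacted@example.invalid", "initial", parent="none")
# ...and the downstream commit gets a new hash as well:
print(c1_redacted != c1)                                              # True
print(commit_hash("bob@example.com", "feature", parent=c1_redacted) != c2)  # True
```

Which is exactly why every clone has to be re-fetched after such a rewrite, whatever tool (filter-branch, git-filter-repo, BFG) performs it.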


What if the purpose of the archiving is to not only record what was changed but also who changed it?


From the GDPR, recital 45:

> [...] where processing is necessary for the performance of a task carried out in the public interest [...] the processing should have a basis in Union or Member State law.

I don't think that purpose of archiving has a basis in law.

That said, I do remember my law professor calling the 'right to be forgotten' one of the weaker parts of the GDPR, and I'm not an expert, so it's possible I'm missing something.


Thank you for emphasizing that last part. I think you might be right.


And it's also protected by the right of freedom of speech: the entity operating the git server has the right to inform the public of who committed which changes. The GDPR explicitly recognizes "exercising the right of freedom of expression and information", although I'm not sure how European courts would interpret this provision. But for an American entity without a physical presence or assets in Europe, any EU judgment would be quickly quashed by American courts.


And Facebook has the right to 'inform' another company about all data it collected from its users (only in exchange for a nice sum of money!)

Except in the EU, freedom of speech and privacy are both considered human rights, which need to be weighed against each other. Freedom of speech will win when someone uses the GDPR to try to censor e.g. an online news article with some personal facts. But it won't for my Facebook tongue-in-cheek example, and I doubt it will for the redacted committer example either.


How would the judgement be quashed by american courts? No american court has jurisdiction over European courts. For an entity without presence or business in Europe, enforcing a european court decision might be a problem, but that’s a different matter. I’m sure the EU will find a way if the sum is sufficiently high.


Sorry, I was sloppy in saying that the judgment itself would be quashed. What I meant is that any attempt to enforce the judgment would be quashed. Since (by assumption) the defendant doesn't have any assets in Europe to pay the fine, enforcing the judgment would require going after the defendant's assets located in the US. American courts will typically enforce foreign judgments from 'friendly' jurisdictions, but if the judgment is incompatible with American law, American courts will quash any attempt to enforce the judgment in the US.

Of course, Facebook and other large American corporations can be expected to comply with GDPR, since the cost of compliance is much less than the opportunity cost of being excluded from the EU market.


Sure, but these are two entirely different matters. Having an open, unenforced judgment against you might lead to complications, for example if the defendant happens to travel to the EU. It’s unlikely that a minor fine will be enforced by snatching the defendant at the airport, but it could at least legally be an option.

Or the defendant may later open up a German subsidiary or plan on selling to a company with a german subsidiary. Things would get complicated in those cases.

So it’s important to be somewhat precise here - no enforcement doesn’t equal a quashed judgment.


I suspect preserving git history is allowed for the purpose of determining copyright compliance.


There's an exception 'for the establishment, exercise or defence of legal claims.', but there's situations where that would not apply. E.g. commits fixing a single spelling mistake are probably not copyrightable.

Also, I doubt you can just keep a copy of all data you ever process, just because it might some day be useful as legal evidence.


Why would you say that? If you can get sued for a piece of code written 30 years ago, then it seems legitimate to me to store legal evidence for at least 30 years. As far as I know there is no time limit to being sued over something.


That makes sense for repository users keeping a private copy.

But, I was thinking more about companies like Github. If they can hide behind that clause for every single repo they host, the GDPR as a whole becomes useless. Pretty much everything could serve as evidence one day. As far as I know, judges don't like 'hacks' like that.

Also, additionally, code hosting platforms argue they are service providers and should not be liable for copyright infringement as long as they apply notice and takedown.


Does this apply to internal software like Slack and Github provided by an employer to an employee?

e.g. An ex-employee requests that all their identifiable data be deleted from all communication and systems of their former employer. That seems like a problem for institutional knowledge transfer. Will the employer have to adhere to that request?


It's "personally identifiable data" that's covered. What you did at work doesn't count; your company's record of which days you worked is not personal within the company. If they shared it without anonymising it, then it becomes personal.

"andygcook wrote this library" in internal company data isn't personal data.


How is identifiable data important for institutional knowledge transfer?


I can imagine scenarios where it would be important to know who worked on something.

For example, using git blame I might learn someone was heavily involved with a project or feature. Then I might look on our internal Wiki for old posts which include discussions explaining why certain design decisions were made.


Right, this is the scenario I was thinking of. Most systems that include any type of communication nowadays require you to use a real identity. An ex-employee that asks to have their identify obfuscated from all their work breaks the system of record to answer questions like who worked on what, what decisions were made by whom, etc.


> Restrict processing – in your admin panel where there’s a list of users, there should be a button “restrict processing”. The user settings page should also have that button. When clicked (after reading the appropriate information), it should mark the profile as restricted. That means it should no longer be visible to the backoffice staff, or publicly. You can implement that with a simple “restricted” flag in the users table and a few if-clauses here and there.

The simple hubris in this statement is jaw-dropping. “Just a flag and a few if clauses! Easy peasy!”


This article is one of the best I've seen for describing actual features that you need to build.

I agree that the specific language here is poorly chosen ("simple" and "a few if-clauses" are perilously close to the word "just") but I don't think that should detract from the enormous value the article itself provides.


I think the right to be forgotten is a serious flaw in what otherwise is a major step forward in Data handling law.

Data today has been compared by Schneier to pollution in the industrial revolution. The GDPR is probably the first anti-pollution law with real bite and with a real grasp of just how far this all goes (the extra-territoriality etc)

This does not make this perfect solution. I honestly don't think that "being forgotten" actually makes sense as a right - it seems to have sprung from some unusual case law in ECJ and could much more easily be dealt with by a "do not further process".

But we genuinely can always find ways to implement new laws - the most obvious is to encrypt user data, and then lose the key, but beyond that i think the best outcome of all this is to stop moving data around so much. moving data from system to system is a smell in my view - and one that a eu law is going to help architects the world over realise they are doing wrong
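That "encrypt and lose the key" idea is sometimes called crypto-shredding, and it's easy to sketch. Note this is a toy: the XOR-with-SHA-256 keystream stands in for a real AEAD cipher, and all the names are hypothetical — the point is only that deleting one small key erases data that may live on in backups forever:

```python
import hashlib
import secrets

class UserVault:
    """Toy crypto-shredding store: one random key per user; deleting
    the key makes that user's ciphertext permanently unreadable."""

    def __init__(self):
        self.keys = {}   # user_id -> key (would live in a separate key store)
        self.data = {}   # user_id -> ciphertext (may sit in backups forever)

    def _keystream(self, key: bytes, n: int) -> bytes:
        # Derive a keystream from the key; a real system would use AES-GCM.
        out, counter = b"", 0
        while len(out) < n:
            out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return out[:n]

    def store(self, user_id, plaintext: bytes):
        key = self.keys.setdefault(user_id, secrets.token_bytes(32))
        ks = self._keystream(key, len(plaintext))
        self.data[user_id] = bytes(a ^ b for a, b in zip(plaintext, ks))

    def read(self, user_id) -> bytes:
        key = self.keys[user_id]   # raises KeyError once the user is forgotten
        ct = self.data[user_id]
        ks = self._keystream(key, len(ct))
        return bytes(a ^ b for a, b in zip(ct, ks))

    def forget(self, user_id):
        del self.keys[user_id]     # ciphertext can stay in every backup
```

Deleting a 32-byte key is cheap even when the encrypted records are scattered across years of backup tapes.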


Like pollution laws, it's nonsense if not enforced worldwide. The web can't be contained to a specific locality anymore, it's against the core idea of the technology. The people in the EU who are responsible for this have no clue about the technology.


The GDPR specifically covers any processing of EU citizens' personal / private data, wherever that processing happens.

Essentially two huge things come out of GDPR - personal data about an EU citizen belongs to that citizen, and if you process data about an EU citizen, even if you are outside the EU, you are covered by this law (extra-territoriality).

These are huge, forward-thinking political steps. They do get this stuff. I just think the deletion part is a mis-step


It just sounds good, but in a more or less open world, small businesses selling digital products will just block EU access, even if they're physically located in the EU. Opening a company in the US is easy.


I’ve been reading the EU’s General Data Protection Regulation, and it seems to contain certain loopholes that may be exploited by less-than-honest agents. The sad possibility is that the mere existence of such loopholes can push otherwise law-abiding small companies towards mostly ignoring GDPR in order to remain competitive.

For example, there’s this huge “if” concerning personal data removal, reiterated in multiple sections of GDPR. Quoting the very first section about data processing principles[0], personal data can be stored even after you’ve achieved the initial explicitly stated purpose, as long as it:

> will be processed solely for archiving purposes in the public interest, scientific or historical research purposes or statistical purposes in accordance with Article 89(1) subject to implementation of the appropriate technical and organisational measures required by this Regulation in order to safeguard the rights and freedoms of the data subject (‘storage limitation’)

How wide is the range of activities that can be reasonably claimed to be for scientific or statistical purposes, or for safeguarding the rights and freedoms of your user? How strictly would this be enforced in cases where scientific and statistical purposes are closely intertwined with commercial interests, as it often happens?

Meanwhile, the referenced Article 89(1)[1] doesn’t seem to take a hard stance except for requiring “data minimization”. Even pseudonymisation is explicitly optional, as long as you’ll have a convincing argument that pseudonymising the PII you’ve collected prevents you from fulfilling your “statistical purposes”.

I’m not a lawyer and I’m wondering if someone with more expertise can weigh in on this.

[0] https://gdpr-info.eu/art-5-gdpr/

[1] https://gdpr-info.eu/art-89-gdpr/


Any exceptions to the regulation will inevitably be subject to a narrow interpretation particularly if it is clear that someone is looking to do something which is outside the spirit of the regulation.


These sort of cases will be decided on by a judge. You'll at the very least have to make your decisions sound reasonable.

But yeah, these loopholes do exist nonetheless. The GDPR has been reported on as the most lobbied law in the history of the EU. It scares Google, Facebook, Microsoft et al.


GDPR and Kappa/event sourcing/message queue based/you name it architecture goes together nicely as you get audit logs of everything and it should be quite doable to propagate "delete this person's data" events around the place.

It's a huge hassle compared to what many companies are doing with customer data now but I think it's for the best.

Most things about GDPR go like "Does it feel a bit shady? It probably is. Don't do that." (depending on your moral compass of course)

One thing is for sure: there's a lot of opportunities for consultants as all the big companies need help to resolve the mess of legacy systems storing customer data.


I didn't realize it until reading this post, but certain very popular technologies break GDPR in a deep way.

Bitcoin, for instance, contains a wealth of personal information, which by design are both public, persisted forever, and immutable.

Are blockchain products all going to need a full rewrite or a complicated hard fork?

What about the Wayback Machine? Will they need to have an endpoint that every company will need to call for every “right to be forgotten” request worldwide?


I think the dependency order here is reversed. GDPR breaks certain very popular technologies in a deep way.


What personal information does bitcoin contain?


All payment orders and credit transfers to and from all accounts.

For any Bitcoin address you find on the Web.


Is Bitcoin identifiable? Or is an association to a Bitcoin transaction in a payment system identifiable? Which would mean the personal info could be removed in the payment system


How is that personal, how is it connected to a person?


On the off-chance that you are not asking this in bad faith…

All speculators go through a KYC with their exchange, which identifies them very precisely.

All other users paste it publicly, saying “I own this account! Send me money.” And even those users often have to convert it back to fiat, which requires an exchange, which makes them go through a KYC.


Ah, right, Bitcoin exchanges in the USA, and elsewhere, harbour personally identifiable information. Bitcoin doesn't impose it, but it's a practical requirement in some jurisdictions.

(KYC = know your customers)


(assuming you meant to write Kafka) Being able to notify every internal service to delete a user's data is always nice, but in the case of event sourcing, the events are the data. Yet you can't delete Kafka events (not sure about other platforms). In my eyes, GDPR is the death of Kafka as an event sourcing store.


Kafka doesn't persist things forever. AFAIK it's totally OK if you have say 14d retention and then the data is deleted from Kafka too

(it was deleted from everywhere else already because there was an event/request to do so)


Heh, I think we're talking about different scenarios. In the case of event sourcing, we often set the retention period to 'forever', because the events in Kafka are our source of truth. Then we just build a materialising layer on top of Kafka, with the possibility to rehydrate based on _every_ event in the Kafka topics. In this case we would have to do some really weird compaction to delete singular events.


I think that scenario is addressed in this Confluent blog post:

> Deleting a message from a compacted topic is as simple as writing a new message to the topic with the key you want to delete and a null value. When compaction runs the message will be deleted forever.

Handling GDPR with Apache Kafka: How does a log forget?

https://www.confluent.io/blog/handling-gdpr-log-forget/
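The mechanism in that post can be modelled in a few lines. This is a toy in-memory stand-in for a compacted topic, not the real Kafka client API: compaction keeps only the latest value per key, and a latest value of `None` (a tombstone) drops the key entirely:

```python
def compact(log):
    """Keep only the latest value per key; drop keys whose latest
    value is a None tombstone -- roughly what Kafka log compaction
    does with cleanup.policy=compact."""
    latest = {}
    for key, value in log:
        latest[key] = value
    return {k: v for k, v in latest.items() if v is not None}

log = [
    ("user-42", {"email": "alice@example.com"}),
    ("user-7",  {"email": "bob@example.com"}),
    ("user-42", None),   # tombstone: erase user-42 at the next compaction
]
print(compact(log))  # {'user-7': {'email': 'bob@example.com'}}
```

The caveat in practice is that compaction runs lazily, so there is a window before the tombstoned data is physically gone.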


Oh, wow, must have missed that post, thanks!


It really depends on what you're doing. If you're doing something where the people you're storing data about don't need to interact with each other, you can actually store each user's events in a separate event log.

This can be fed to more transient event queues which do not have an indefinite retention period where interaction is required.

Not sure about how well-geared Kafka is to this scenario though.


Scenario specific: with Kafka's log compaction you could use message keys and republish a stripped message to the old key, preserving the history and non-personal information, but keeping the queue and message series intact...


As a freelance developer I'm quite sure that if I were to force my clients to comply with as strict an interpretation of GDPR as this, I would pretty shortly find myself replaced by a freelance developer with a more relaxed attitude to GDPR compliance.


This is probably true. If I were in this situation, I would probably only make suggestions and not force compliance.


> I think all of the above features can be implemented in a few weeks by a small team.

That's.. optimistic in the enterprise world.


Practical guide for developers - build your product in the US, then expand to the Indo-Pacific. Don't bother with rolling out to Europe. AI is the future of business & healthcare, which, due to its inherent need for data, is incompatible with anti-data-sharing laws such as GDPR. The population is rapidly aging in Europe (47.1-year-old average in Germany, 42.9 in the EU), so you might as well set your business up for the long term by pivoting to the region where growth will take place (and where the general population is more accepting of emerging technologies that rely on easy access to data).


This is the worst advice in this thread. Not only do you lose the European market for no good reason and on logic you might hear from moon landing conspiracy theorists, but you don't even solve your issue as you will still have European users no matter what. People do travel.


If you don't have physical assets in Europe which can be seized then whether or not traveling Europeans decide to use your service is irrelevant. GDPR is not going to force you to take down your service in the US or India. It's kind of like a Saudi visiting California and requesting that local gay people be stoned as per the law of their land - not going to happen.


What about things explicitly designed in a way that there is no option to be forgotten. What about commits in version control sites? What about mailing lists?

From skimming over the spec, it seems that politicians haven't thought about any sites other than social networks or other profit-making sites. Even in that case, if some ML system is trained on the data of the customer, do they have to re-train after anyone invokes the right to be forgotten?


Well, if your model can guess my name from 100 browser history entries, then yes, I want the law to require you to retrain your model once I invoke that right.

An interesting matter is the use of blockchain-like schemes. I guess the law would mean you can't put GDPR-protected information in a public distributed blockchain, but instead put only identifiers on-chain and keep the GDPR-protected info elsewhere, linked by that ID. And the invocation of the right to be forgotten would require us to permanently delete the entry linking that identifier to a user.
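A sketch of that decoupling (all names hypothetical; the point is only that the replicated, immutable side never holds PII directly):

```python
import uuid

chain = []       # immutable, replicated: holds opaque IDs and amounts only
pii_store = {}   # mutable, off-chain: subject_id -> personal data

def record_transaction(name: str, amount: int) -> str:
    """Append a transaction on-chain, keeping the PII off-chain."""
    subject_id = str(uuid.uuid4())
    pii_store[subject_id] = {"name": name}
    chain.append({"subject": subject_id, "amount": amount})  # no PII here
    return subject_id

def forget(subject_id: str):
    """Erasure: delete the off-chain link; the on-chain entry remains
    but now points at an ID that identifies nobody."""
    pii_store.pop(subject_id, None)

sid = record_transaction("Alice", 100)
forget(sid)
print(sid in pii_store, chain[0]["subject"] == sid)  # False True
```

Whether an orphaned random ID still counts as personal data is of course exactly the kind of question a regulator or court would have to settle.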


One of the clearest things I’ve read on the subject.

It is a lot of common sense. Questions over the right to be forgotten vs tax / legal issues come under the “legitimate interest” clause I think. You should delete their data except where you are required to keep it. And that may mean deleting preferences and browsing history, but not their name and address if you are required to keep it.

I intend to implement a “forget me” feature by anonymising any PID and potentially redacting things like messages between users on our system. That way we keep info for stats purposes but don’t have any way to id a person from the data we hold.

The restoring backups / storing preferences about deletion requests etc. in a separate DB solution is also a good idea. It shows willingness to comply with the regulation, even if it may not strictly be compliant (e.g. until the backup has synced up with the preferences DB, you still have the PID). I think so long as you show a lot of willingness and progress towards being compliant and take all practical and reasonable steps to do so, then it shouldn’t be too much of a burden.
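A minimal sketch of such a forget-me routine (field names hypothetical); the row survives for stats purposes while the person becomes unidentifiable:

```python
def forget_me(user_row: dict, messages: list) -> tuple:
    """Anonymise a user: strip identifying fields, keep aggregate-friendly
    ones (signup year, country), and redact their message bodies.
    Returns new objects; the originals are left untouched."""
    anonymised = dict(user_row,
                      name="[deleted]",
                      email=None,
                      address=None)
    redacted = [dict(m, body="[redacted]") if m["author_id"] == user_row["id"]
                else m
                for m in messages]
    return anonymised, redacted

user = {"id": 7, "name": "Alice", "email": "a@x.example",
        "address": "1 Main St", "signup_year": 2015, "country": "DE"}
msgs = [{"author_id": 7, "body": "hi, call me on 555-0100"},
        {"author_id": 9, "body": "ok"}]
anon, msgs2 = forget_me(user, msgs)
print(anon["signup_year"], anon["name"])  # 2015 [deleted]
```

The hard part in practice is the free text: PII can hide anywhere in a message body, which is why redacting the whole body (rather than trying to scrub names out of it) is the safer choice.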


This is basically developed to protect users from big abusive companies such as Facebook, Google, Twitter and big marketing agencies.

But it really is overkill for the local restaurant that wants to mail their customers.

Using a bazooka to kill some flies.


It's worse than that, because GDPR, in and of itself, will not technically stop users from inadvertently blasting data to any service.

Decentralized services that EU citizens use will be even less in compliance as data is shared and copied between nodes by default. Sure block a few servers by spending more resources to find/go through the legal moves than it will take for a dozen more to pop up… see torrent sites/software and how people are monetizing such, because that will be the future… laws like GDPR only make such even more attractive.

And lets just set aside that nation state actors that are routinely compromised will still collect this data that will leak on to the internet… lol

These laws are analogous to those against the printing press… fighting the tide of reality, where it's easier to do nothing than to contort something (on top of building a functional product) to fit a luddite's dream of personal privacy provided by state mandates, without having to do anything oneself to protect one's interests, in the age of deep packet inspection, 0day-exploit-exfil-as-a-service, and metadata drone strikes.

Would be more effective to just make it law that users have to plug a black box into their devices/networks so it can filter out non-GDPR-colored bytes lol


What about Wikipedia? User accounts are linked with article edits/history. So if you delete the user how do you handle their edits?


Suppression[0] or similar tech. Delete the username associated with the edit, while keeping the diff around.

[0]: https://en.wikipedia.org/wiki/Wikipedia:Oversight
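A sketch of the same idea outside MediaWiki, assuming a hypothetical `revisions` table: the author column is nulled while the diff itself stays in the history.

```python
import sqlite3


def suppress_author(conn: sqlite3.Connection, username: str) -> int:
    """Detach a username from its past edits, Wikipedia-suppression style.

    The diffs remain in the history; only the link back to the person
    is cut. Returns the number of revisions anonymised.
    """
    cur = conn.execute(
        "UPDATE revisions SET author = NULL WHERE author = ?",
        (username,),
    )
    conn.commit()
    return cur.rowcount
```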


This strikes me as all very pie-in-the-sky. I understand the law, and the policies that it serves, but the article assumes that a company has a single, centralized data source that you can just put some hand-waving “if then statements” around to limit access, and that supports perfect cascading of data from the user down so we can just implement a few checkboxes to configure, etc. It sounds like good stuff, but that’s not how things work in the real world, where half your users trade Excel output, and can’t be bothered to log their interactions with third parties. I’m not saying that they shouldn’t do it, but they won’t.


I wonder how to deal with data that is accidentally identifiable. For example, imagine that you are running an anonymous poll or survey. In the general case that would not identify an individual person, but in some circumstances a particular collected answer will be unique and could theoretically be connected to an individual.

In such cases it's not really possible to give individuals control over their data, because except for the special case the whole point is that it's not connected to an individual...


For example, imagine you only collect an email at sign up (no name, no country) and you state in your EULA that you might use the email to send onboarding information or commercial communications (promotions, newsletter) that can be opted out.

If you do not have any means to know the country where the owner of the email is located, how do you ensure the right of non-EU citizens to receive the commercial communications they have agreed to receive in your EULA unless they opt out later?

If you do not collect your users' countries for privacy reasons (I would be wary of signing up for a trial of a service that wants to know my citizenship), how can you prevent EU citizens from using your product?


What happens if your service lets users manage their own customers’ data? I mean, for a product such as Airtables, FieldBookApp, Sharepoint Forms or, simply, Google Forms: Are we, cloud app providers, supposed to ensure our users don’t put PII data in the spreadsheets, and if they do, are we supposed to manage their users’ consent and process their users’ requests for editing and deletion? At the extreme, what should Heroku do for the Postgres databases they provide to their customers?

I could only find GDPR blogs about apps facing the final users, but they generally don’t talk about compliance for B2B apps.


In that case, your customer is the "controller" and you're the "data processor". The whole GDPR is written with such B2B relationships in mind.

Chapter 4 is specifically all about that. Article 28 specifies the situation for the data processor: https://gdpr-info.eu/art-28-gdpr/

And well, basically your customer has to tell you all of these things in the contract. (Article 28 Section 3)

Furthermore, if you yourself want to pass that data on to another data processor, you have to notify the controller of that and tell that data processor those things that the controller specified in the contract, too.

It's also in the responsibility of the controller to select data processors that implement "appropriate technical and organisational measures".

That's a term that you should also find plenty of literature and discussions on. The GDPR specifies somewhat in its recitals: https://gdpr-info.eu/recitals/no-78/

And you might also want to get certified that you are GDPR-compliant, just to make it trivial for customers to see that you can implement those appropriate organizational and technical measures.


It seems likely this will lead to increased centralization and/or standardization as many website owners decide it is too complicated and risky to write custom software managing their own user accounts.

This might be similar to how merchants often sell through larger websites like Amazon, or mobile developers sell through app stores.

Or, much like many startups begin with Bootstrap for their CSS and Django provides a built-in admin user interface, perhaps there will be open source skeleton web apps that have all the data models and UI needed for GDPR?

It sounds like there will be lots of business opportunities here.


> Age checks – you should ask for the user’s age, and if the user is a child (below 16), you should ask for parent permission

Please tell me this is only required when age is relevant, such as for sale of alcohol and tobacco?

Otherwise, surely most data holders don't need a customer's age, and this would be forcing them to collect more personal information!

And personally, as a consumer, I don't want to provide information that isn't relevant, so I'm more likely to use a competitor that doesn't ask for such needless PI.


If you use “consent” as your “legal basis” and you are asking for consent related to the offering of services over the internet (such as a web shop, a social media web site, a discussion forum, ...), you need to somehow verify age (16 years normally) or be very clear that under-18-year-olds (“children”) are not to access your service (and have no evidence pointing to this being undermined).

Note that the 16-years-old rule applies to “consent” only. As I stated in a separate comment, “consent” is often not the way to do things. I, and many EU data protection lawyers I have met, believe consent to be a “last resort” legal basis for processing personal data. Instead, the legal basis called “legitimate interest” should normally be used, where you as a company decide what is reasonable, what you think is needed to achieve the purposes you are processing data for, and what the data subject would reasonably expect.

There is no under-16 age limit or age verification requirement in general for “legitimate interest”.


In my own case, we only need customer PI as contacts for billing and support, so from what I've read, that doesn't need consent.

Thinking about it, I suppose it does make sense to ask users if they are over 16 if you are going to be processing data in a way that does require consent, just so you know that they can legally give that consent.


I believe this is true. In my country (Sweden), it appears that consent outside of offering internet services may require the data subject to be 18, not 13 (Sweden has made a local adjustment of the 16-years-old rule referred to, lowering it to 13). So this rule may actually be a relaxation of who can consent to what.


What about cookies?

I noticed .NET Core 2.1 is adding a cookie bar, saying it's for GDPR compliance - but I didn't see anything about cookies in the article?


We've created an open-source client SDK as a starter kit for making apps compliant with EU GDPR: https://github.com/gdprhq/GdprHq.Io.ClientSdk

Anyone interested in beta testing/integration?


Let's say I have a bookmark file which contains a list of URLs. It is associated with one user account. It's likely associated with one person, but I cannot identify that person. There is no other information associated with the user account. Is the bookmark file personal data?


I would go with "better safe than sorry" if I had to manage this sort of data. It might be fine so long as the URLs are all non-identifying stuff like the Wikipedia home page. But I bet I would have users with much more revealing bookmarks, like, say, their Keybase profile page, or URLs with session/auth tokens in query strings, or the home page of their church which has a congregation of less than 50 people.


Another way to phrase this question is “what use is the bookmark file without a user to use it?”

I don’t think there’s anything wrong with storing anonymous data: a log of bookmarked urls for instance. So long as there’s no way to trace back who they belong to if that person wants to be forgotten.


In general, it's probably even sensitive personal data. A list of URLs can potentially identify a person's religious beliefs, sexual orientation, political views etc.


> A list of URLs can potentially identify a person's religious beliefs, sexual orientation, political views etc.

But can they be identified? I believe if it's too difficult to identify someone even using "religious beliefs, sexual orientation, political views etc.", then it's not personal data. Though I can't seem to find the link to the page where I read that.

On a similar note, even an IP address _alone_ appears to be non-personal data, since it cannot identify a person [1].

From Case 582/14 – Patrick Breyer v Germany [2]

On appeal, the Regional Court of Berlin (the "Kammergericht") ruled that IP addresses in the hands of website operators could qualify as personal data if the relevant individual provides additional details to the website operator (e.g., name, email address, etc.) in the course of using the website.

Lol, even the German government can't seem to figure out its own laws. They should have gone to one of those consultants.

[1] https://www.gdpreu.org/the-regulation/key-concepts/personal-...

[2] https://www.whitecase.com/publications/alert/court-confirms-...


A few questions:

"Forget me" - What is personal information? Say I have a table with user_id and username, and an order table with user_id, order_id and other stuff. If the user requests a 'forget me', what do I delete? Blank the username? Delete the user_id row? Delete all orders belonging to the user (how would I report these gaps to tax agencies)? Delete the user from my trained ML model (is that even possible)?

"Consent checkboxes" - To what extent can users be forced to give consent or be denied from a service? Like the Cookie law, almost every website requires you to accept the fact that cookies are used, otherwise your experience is degraded (eg. you cannot watch news videos). Or say I want to order something from a webshop, and in order to place an order, I must consent with sharing my personal information with third parties for marketing purposes, else I cannot place the order. Do I have to call them out later? How is this law going to solve things if it prevents me from using things?

"Export data" / "See all my data" - What is 'all my data' here? Is it information I entered when I signed up for a service? Is it information derived from this data (e.g. my Google search suggestions/ads profile)?

"Don’t assume 3rd parties are compliant" - if I, the data collector, get fined because a 3rd party, the data processor, is not compliant, can I recover part of the losses from the data processor? I mean, OpenID allows sharing a lot of personal information from data collectors like Facebook and Google with almost any random site. What can I expect here?

"Consent checkboxes – “I accept the terms and conditions” would no longer be sufficient to claim that the user has given their consent for processing their data." - So if I, as a user, don't give any explicit consent to any personal information sharing, and in May I receive a marketing email from a party I don't have an account with (because they sold my personal information prior to this), I could say they broke the GDPR law?

"Keeping data for no longer than necessary" - My tax agency requires me to keep records of orders/sales/invoices up to 5 years ago. If a user requests deleting their personal information within that time period, what should I do?

"Forget me" - Say an employee leaves a company, and they request their personal information to be deleted. What information do I have to delete? Their Active Directory account? Their salary statements (I need those for tax agencies)? Their name in the git history? Their name from all minutes of all meetings they attended? Their name from documentation they wrote?

Some things might be doable to implement in just 8 weeks, if I had clear guidelines on how to do this, but as of now, I have so many situations where it is unclear what I should do, and no clear way to get answers, that I don't know how I can comply with this law within 8 weeks, as a small software company.


I will try to answer some of your questions based on my findings so far, as I am in the process of modifying 3 webapps to be GDPR compliant and am also starting a side project.

IANAL and please take this as a starting point. I am not sure that what I understood is correct, but I read the GDPR and this is what I will implement.

> What is personal information

The definition for this is here [0].

What I am doing is I am creating some docs where I write very clearly what information I use and for what.

For existing projects I am looking in schema.db and models and extract from there. For the new one (which will be in Rails) I am thinking to make a gem like annotate or something for this specific purpose.

Also I am documenting the information that is in logs and I will treat most of the information the same way I am treating passwords. So far I am looking for SQL statements, params and custom logging messages.
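As a sketch of treating log PII like passwords, here is a `logging` filter that masks anything email-shaped before a record reaches the handlers (the regex is a rough illustration, not a complete PII detector; a real deployment would also cover names, tokens, raw SQL params, ...):

```python
import logging
import re

# Rough illustration of an email pattern, not a complete PII detector.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


class PiiRedactingFilter(logging.Filter):
    """Mask anything email-shaped before a record reaches the handlers,
    the same way one would never let a password reach the logs."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message first so %-style args are included, then redact.
        record.msg = EMAIL_RE.sub("[email redacted]", record.getMessage())
        record.args = None
        return True
```

Attach it with `logger.addFilter(PiiRedactingFilter())` so every handler on that logger sees only the redacted message.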

> Say I have a table with user_id and username, and an order table with user_id, order_id and other stuff. If the user requests a 'forget me', what do I delete

Nothing so far, if the user_id and username are not related in any way to anything that can identify a person.

> To what extent can users be forced to give consent or be denied from a service?

Here is the phrasing from the GDPR [1]: “the request for consent shall be presented in a manner which is clearly distinguishable from the other matters, in an intelligible and easily accessible form, using clear and plain language.” So in my opinion this is very different from the Cookie Law, as you must make sure the subject understands what the consent has been given for. You should also take a look at Recitals 42 and 43 in the beginning of the GDPR, where they talk about consent being “freely given” and also describe an imbalanced relation between the controller and the user.

> "Export data" / "See all my data" - What is 'all my data'

This is part of Article 15 and I think the situation you are describing is defined by item (3) of that Article. You should correlate it with the definition of personal data. This means that you should provide data you took from the personal data subject but also the personal data you got from anywhere else that is connected to the data subject - see “personal data are collected from the data subject” and “personal data have not been obtained from the data subject” as it is described in the titles of Article 14 and Article 15.

> “I accept the terms and conditions” would no longer be sufficient to claim that the user has given their consent for processing their data."

Consent cannot be included in the Terms and Conditions. Due to the Recital 42 in the beginning “consent should not be regarded as freely given if the data subject has no genuine or free choice or is unable to refuse or withdraw consent without detriment” and also “safeguards should ensure that the data subject is aware of the fact that and the extent to which consent is given”

> My tax agency requires me to keep records of orders/sales/invoices up to 5 years ago. If a user requests deleting their personal information within that time period, what should I do?

You keep them. Article 17, item (3) states that “shall not apply to the extent that processing is necessary” and you should take a look at letter (b) “for compliance with a legal obligation which requires processing by Union or Member State law to which the controller is subject”
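That exemption can be encoded directly in the erasure handler. A sketch, assuming a hypothetical schema with `marketing_profiles` and `invoices` tables and an (assumed) 5-year bookkeeping period; check your own country's retention rules:

```python
import sqlite3
from datetime import datetime, timedelta, timezone

RETENTION_YEARS = 5  # assumed statutory bookkeeping period; varies by country


def handle_erasure_request(conn: sqlite3.Connection, user_id: int) -> None:
    """Honour a "forget me" request, minus what the law forces us to keep.

    Marketing data is deleted immediately; invoices inside the statutory
    retention window are kept under the legal-obligation exemption
    (Article 17(3)(b)), and only older ones are removed now.
    """
    conn.execute("DELETE FROM marketing_profiles WHERE user_id = ?", (user_id,))
    cutoff = (
        datetime.now(timezone.utc) - timedelta(days=365 * RETENTION_YEARS)
    ).isoformat()
    conn.execute(
        "DELETE FROM invoices WHERE user_id = ? AND issued_at < ?",
        (user_id, cutoff),
    )
    conn.commit()
```

The retained invoices would then need a scheduled job that deletes them once their retention period expires.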

> "Forget me" - Say an employee leaves a company, and they request their personal information to be deleted.

I think you should delete everything that is not subject to the law and that cannot be used “for the establishment, exercise or defence of legal claims”.

Regarding Git commits, for me it is clear that they will not be deleted, as they are part of “the purposes for which they were collected or otherwise processed”. If they are part of a project which is part of a legal contract with some users or a beneficiary, then it is also OK not to delete the Git commits, because you need the info “for the exercise or defence of legal claims” in case anyone asks in a court who did that feature and when.

To be 100% sure, one way would be to anonymise the Git user (I have not tried that so far) by changing the username to something generated like “user0000113”, along with the email associated with that account.

[0] - http://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELE... - Article 4, item (1)

[1] - http://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELE... - Article 7, item (2)

edit: formatting


Thank you for the long answer. It clarifies some issues. I wish the EU would put out a 'brochure' along with the official law, containing explanations, examples etc. Our government and official bodies provide these for many of our nation's contracts and official documents (not the law itself, but rather things like housing contracts). Some follow-up comments:

>> Say I have a table with user_id and username, and an order table with user_id, order_id and other other stuff. If the user request a 'forget me', what do I delete

> Nothing so far, if the user_id and username are not related in any way to anything that can identify a person

How do I handle this situation when users get to choose their own username? If a user uses their own natural name as a username, then it's identifiable information and I'd have to remove it (then again, I'd remove or anonymize it anyway).

>> To what extent can users be forced to give consent or be denied from a service?

> Here is the phrasing from the GDPR [1]: “the request for consent shall be presented in a manner which is clearly distinguishable from the other matters, in an intelligible and easily accessible form, using clear and plain language.” So in my opinion this is very different that the Cookie Law as you must make sure the subject understands for what the consent has been given. You should also take a look at Recital 42 and 43 in the beginning of the GDPR where they talk about “consent freely given” and they describe also an imbalance relation between the controller and the user.

It also describes that "(red. consent) should not contain unfair terms.". Would forced consent for using information for third party marketing purposes during an order check-out be 'unfair terms'? I guess "Consent should not be regarded as freely given if the data subject has no genuine or free choice" would say it doesn't. It would be nice if such situations/examples with a (legal) answer would be searchable somewhere.

Would you be allowed to get consent for an all-encompassing 'third party marketing purposes'? Sounds like that is the thing this law is meant to avoid.

> "for the establishment, exercise or defence of legal claims"

That's a very broad statement. So many loopholes possible there. Just introduce one law in a foreign, non-EU country that requires you to keep all personal information for 'assisting in criminal investigations', and you get to keep whatever you want.


Is it just for me that the link about the author being an advisor to the deputy prime minister of an EU country does not work?


Does the GDPR state that all this needs to be automated?

Given how infrequently a small business will get requests to delete, restrict or export data, is it allowed to just do it manually when requested by email?


No, it does not have to be automated, and if manual / email approaches work at your scale that's fine.

The legislation explicitly refers to "proportionality", although people exploiting the FUD around it to drum up business rarely use the word.


The GDPR does not state that it needs to be automated. I assume SMEs will also not automate this unless they get a lot of requests. Basically all features that are required by the GDPR are already in most common SME software.


In that case, the GDPR actually sounds quite positive - I believe users should be able to request that their data is deleted, and be told in advance if it's going to be used for anything non-obvious (e.g. training an ML model).


> In this particular case, it applies to companies that are not registered in Europe, but are having European customers.

Umm, nope. The EU doesn't have any authority to force companies outside of the EU to do anything.


That’s not true. You’re within their scope if you process and/or target EU citizens.

It might of course be difficult to execute the judgment but they can still sue you in the EU. In the US that happens also, for example when some stakeholder sues a foreign website that infringes on their rights. If the sued entity doesn’t show up in court, they just issue a default judgment (meaning the plaintiff wins by default). You can even sue a John Doe in court (at least in the US).

In practice, the EU is also resource constrained like any other government entity, so you probably won’t have much to fear. I mean they’re not going to sue millions of companies all over the world, it just means they have created themselves a new stick that they can choose to use.

I hear that they’ll likely first go after entities that have a big impact on the public (i.e., the most blatant cases).


You are confusing legislation and enforcement. A great deal of legislation has an extra-territorial dimension.

We live in a connected world, so ignoring this legislation should not be assumed to be free of consequences.



