I wonder what they would have done had it been named "Basecamp Competitor Business Plan.pdf". Would have been awfully tempting to take a peek. Which is exactly why they shouldn't even be looking at filenames.
Ultimately a company storing files is almost certainly going to require its staff to look through directories, log files, database tables. And it is certainly going to require staff to have the ability, even if they never have to use it.
By giving them your files you are trusting them not to screw you over.
> Ultimately a company storing files is almost certainly going to require its staff to look through directories, log files, database tables.
Why? (Or at least, why should they see anything private in raw form?)
> And it is certainly going to require staff to have the ability, even if they never have to use it.
Why?
> By giving them your files you are trusting them not to screw you over.
By giving themselves the technical ability to examine private user data, they are making a strong (or indeed legally compelling, in some cases) argument for not using their service to store anything private at all. That's a death sentence for most cloud services.
We don't accept companies storing passwords in plain text. We don't accept companies transmitting credit card data in the clear, and PCI DSS requires quite strict controls on access to such data even when it's stored internally on the company network. Businesses dealing with sensitive data such as health records are subject to all kinds of regulations on the privacy of that data. Professionals dealing with privileged communications such as between lawyers and clients don't get a pass. Off-site backup services give all kinds of strong guarantees about the security and privacy of the data entrusted to them.
Why should we give a pass to anyone else just because they can't figure out how to set up a security system where only the end user can access the unencrypted version of their own data?
> We don't accept companies storing passwords in plain text. We don't accept companies transmitting credit card data in the clear
This is in case somebody gains unauthorised access to the data, not in case staff can't be trusted. For example, paying by credit card over the phone you hand over your card number to whoever is taking your order, but if they were to enter it into a system, that system then has to comply with regulations.
As to plain text passwords, again this is in case of the data being stolen. It's all very well saying "Google shouldn't store plain text passwords", but if Google as a company wanted to read my email, they could a) just replace the hashed password in my database entry with one they can use, b) place code in their login system that secretly logs the plain text password, or c) go straight to where my emails are stored and access them there.
How do you get around this and make it impossible for them to access your data?
> This is in case somebody gains unauthorised access to the data, not in case staff can't be trusted.
Actually, no. This is very much for both reasons. Part of PCI compliance is ensuring CC data is encrypted with a key that is split among a few people, each of whom knows only a part. So, let's say three people each know a part of the key. The goal here is that in production, no one person can access the whole key, but data can still be encrypted/decrypted.
To put it plainly, it's not just a matter of encrypting and salting your CC data.
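As a rough illustration of that split-knowledge idea (real PCI deployments use HSMs or proper secret-sharing schemes such as Shamir's; this XOR split is only a toy showing the principle), a key can be divided so that any subset of the shares reveals nothing about it:

```python
import secrets
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def split_key(key: bytes, n: int) -> list[bytes]:
    """Split a key into n shares; XOR of all n shares recreates it.
    Each share on its own is indistinguishable from random bytes."""
    shares = [secrets.token_bytes(len(key)) for _ in range(n - 1)]
    shares.append(reduce(xor_bytes, shares, key))
    return shares

def combine_key(shares: list[bytes]) -> bytes:
    """Reconstruct the key by XORing every share together."""
    return reduce(xor_bytes, shares)

# Three custodians each hold one share; only together can they
# reassemble the key used to decrypt cardholder data.
key = secrets.token_bytes(32)
shares = split_key(key, 3)
```

Note that this is an all-or-nothing scheme: all three custodians are required. Threshold schemes (e.g. any 2 of 3) need something more sophisticated than XOR.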
As for your "Google can just" remarks: yes. This can happen in many places. However, you mitigate the risk of this happening with procedures and security. I guarantee you that just working for Google doesn't give you access to the emails. I'd be surprised if the number of people that have direct access to emails at any time is in the double digits. Getting your code into production, I imagine, isn't just a cherry-pick.
You can't prevent people from having access to data you give them. However, they can mitigate the risk of that access being abused.
> How do you get around this and make it impossible for them to access your data?
The same way everyone else does: encrypt the data using a secret known only to the customer, isolate internal systems that have access to the decrypted data so that no one person can ever access that data on their own authority, and ensure that whatever procedure does permit access with the requisite authority creates a robust audit trail. If security really matters, the whole system and its logs should be regularly audited by an independent party, too.
You have to have some sort of trust, because obviously if everyone in the company is crooked then nothing but encrypting everything client-side using auditable code is bulletproof. But you can certainly engineer systems so that access requires multiple people's consent and gets securely logged, which would eliminate casual snooping and provide robust evidence for legal action in the event of collective abuse.
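A toy sketch of the "no one person on their own authority" part of that, with hypothetical names throughout (this is not any real service's access-control system): every access request needs two approvers other than the requester, and every attempt, granted or denied, lands in an audit log.

```python
from datetime import datetime, timezone

AUDIT_LOG = []  # in a real system this would be an append-only, tamper-evident store

def request_access(record_id: str, requester: str, approvers: list[str]) -> bool:
    """Grant access to a customer record only with at least two
    distinct approvers who are not the requester; log every attempt."""
    granted = len(set(approvers) - {requester}) >= 2
    AUDIT_LOG.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "record": record_id,
        "requester": requester,
        "approvers": sorted(set(approvers)),
        "granted": granted,
    })
    return granted
```

Casual snooping fails the approval check, and collective abuse leaves a log entry an independent auditor can find later.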
"The same way everyone else does: encrypt the data using a secret known only to the customer, isolate internal systems .., and .. creates a robust audit trail."
Who?
Who does that?
If you give your data to someone, chances are they might look at it. If you store files, use email or surf the web at work, chances are the IT guys can look at it. Of course, they should not, and they probably have better things to do etc etc, but believing this will never happen just seems naive.
And then somebody loses their private key material and they look to you to fix their problem.
Client-side encryption with end-user key management is not yet practical for the average end user. Until it is (I'm hopeful we'll get there), the average service will require some sort of administrative back door that is controlled by process and people.
> This is in case somebody gains unauthorised access to the data, not in case staff can't be trusted. For example, paying by credit card over the phone you hand over your card number to whoever is taking your order, but if they were to enter it into a system, that system then has to comply with regulations.
Not exactly. The purpose is to limit the number of people who have access to your credit card number so that if one of them uses it fraudulently, it's easy to isolate and verify the source of the fraudulent transactions. Yes, the guy taking the order over the phone will have access to your credit card number. Yes, the waiter running your card at the restaurant will have access to your credit card number. If the system is well designed, though, no one else will, and it'll be easy to find the person to blame if fraudulent transactions are made.
I've never worked on PCI compliant systems myself, but I know many developers who have, and they say that the sysadmins take solid measures to ensure that no one, not even the developers, gets any data from a database that handles credit card information. Any data pulled from those servers is first sanitized to ensure that credit card numbers and other personally identifying information are removed. Credit card numbers are replaced with a "sample" number that can be used for validation purposes. Names and other information are replaced with sanitized data that has the same "shape" (e.g. number of characters and identical punctuation) as the original.
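A minimal sketch of that kind of shape-preserving sanitization (the field names and the use of a standard test PAN are my own assumptions for illustration, not anything from an actual PCI toolchain):

```python
TEST_PAN = "4111111111111111"  # widely used Visa test number; passes Luhn validation

def sanitize_row(row: dict) -> dict:
    """Mask a database row before handing it to developers: the card
    number becomes a test PAN usable for validation, and every other
    field keeps its 'shape' (length, case, punctuation) but loses its
    actual content."""
    def mask(value: str) -> str:
        return "".join(
            "9" if c.isdigit()
            else "X" if c.isupper()
            else "x" if c.islower()
            else c  # punctuation and spaces survive unchanged
            for c in value
        )
    return {
        key: TEST_PAN if key == "card_number" else mask(value)
        for key, value in row.items()
    }
```

For example, `sanitize_row({"card_number": "4556 7375 8689 9855", "name": "Jane O'Neil"})` yields a row where the name becomes `"Xxxx X'Xxxx"`, so code that depends on field shapes still works while nothing personally identifying remains.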
The purpose of these regulations is to ensure that there's always a clear chain of custody over your credit card numbers. Preventing unauthorized access is only one part of maintaining that chain of custody.
Unless explicitly authorized by the customer, or for the purpose of providing the service, your staff should not be allowed to look at customer data, and what data they look at should be limited to what's necessary to perform their function.
If you do want the right to spelunk through customer data, you need to declare that in the privacy policy. If you don't declare it and do it anyway, you're breaching the contract with the customer.
The problem is that incidents and attitudes like this make the market lose trust with the cloud services industry, which is poison to everyone.
I agree; however, it's somewhat disturbing how often I have to view customer data in my current job. I think the bigger companies with good processes in place probably don't need people to do it much, but let's just say that companies with older applications that have seen better days (like the one I work for) end up having people make a lot of manual database updates, and also end up giving developers access to the production DB in case of emergencies.
I'm not sure I understand what you're saying here.
The only contract with the customer is the privacy policy. The privacy policy is just a promise from a site to abide by certain rules. In the case that there is not a privacy policy, then whatever you tell that site can and will be used against you. From tracking cookies to the most sensitive of files, if you are providing information to a site then you have to assume that it will be used in any way the company sees fit unless they promise otherwise.
Ethically there may be different obligations, but to say that there is some implicit "contract with the customer" is simply not the case.
Since 1890, tort law in the United States has had concepts of invasion of privacy and breach of trust. Further, on a state by state level there may be laws, such as California's Online Privacy Protection Act of 2003, which requires a privacy policy to be published. Canada and the EU have even more protective laws if you trade there.
I feel it is safer and more realistic to presume what I wrote in the first paragraph is the case, and to cover yourself with a privacy policy if you want to do otherwise, as I mentioned.
As always, ask a lawyer if you want professional advice.
I came here to post exactly that. In most serious cloud teams you have a select few people with authorization to look at customer data (the operations team), and everybody else is outside of that group. When debugging the service, you have to pass instructions to that team so that, in case confidential data is revealed, only they get to see it.
If a file storage company that claims to be protecting users' data isn't storing it in an encrypted manner that requires people to jump through all manner of technical and procedural hoops to get access to it, then they are failing quite badly.
Whatever encryption is there (and we don't know what they are doing in this respect), their staff who manage the systems still have the access to look up the file name of the Xth file, or, if they like, to go snooping through all files.
This is true. The problem here is that they went looking through private user data when they didn't need to. If they were only doing it when essential, e.g. to debug a problem, people wouldn't be complaining. It's the fact that they did it without there being an urgent need that has bothered people, I think. What other trivial reasons have they used to look through people's data?
If (in the course of essential sysadmin duties) I saw a file named "Basecamp Competitor Business Plan.pdf", or in fact any name, I wouldn't look at it because that would be wrong, no temptation.
Unfortunately, when around 1 in 3 sysadmins spy on their own colleagues (depending on whose report you read), it's apparent that not everyone has your moral fibre and something more than simply "trust the individual to do the job professionally" is called for.
Same here. There are countless sysadmins who would look though. Especially if they had a financial interest. There are even more who wouldn't look, but would just mention, "Guess what our 100 millionth file was named" to a Director, who also happened to have access.
All the data should be encrypted and only the user's password should be able to decrypt the data. That way you eliminate the problem of someone peeking into your files.
How would a 'forgot password' function work in this case? If you're using something like GPG to encrypt and the password is basically the passphrase for the key, a customer forgetting his/her password would become an irreversible event, and he/she would end up losing all data.
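That's the crux: if the encryption key is derived from the password alone, there is nothing for a reset function to work with. A sketch using PBKDF2 from Python's standard library shows why (the passwords and parameters here are illustrative, not a recommendation for a production scheme):

```python
import hashlib
import os

def derive_key(password: str, salt: bytes) -> bytes:
    """Derive a 32-byte encryption key from the user's password.
    The server stores only the random salt, never the password or
    the derived key, so it can't decrypt anything on its own."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)

salt = os.urandom(16)
key = derive_key("correct horse battery staple", salt)
# A "forgot password" reset is impossible in this model: without
# the original password the key cannot be re-derived, so any data
# encrypted under it is permanently unrecoverable.
```

This is why services offering this model typically either hand the user a one-time recovery code to print and store offline, or simply warn that a lost password means lost data.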
The filename could contain enough information to be a big breach of privacy by itself. Think "microsoft bankruptcy proposal.doc"; "google downsizing plans 2012/2.xls"; "ipad 3 presentation draft.ppt".
Temptation can be resisted, but in these cases you are in trouble just for glancing.
Very true. What if it was a screenshot of a secret CATalog? Or a scan of some secret Company Anonymous Transfer or whatever.
So I hope they never took a look at the "cat.jpg" file because you just cannot tell what's inside.
It's funny you mention that because I actually do have such a file! I also went ahead and used Basecamp too to get a sense of what could be improved or simplified. But of course I didn't even consider uploading that file and this was long before this whole debacle started. We need to trust services like Basecamp, Google Docs and the others... A lot. But we also need to be smart about that trust. A healthy distrust is definitely in order in certain circumstances.