While some of the behaviour here is less professional than I'd expect for user data, I'm wondering what a good architecture for building a system to leverage visual telemetry without harming user privacy would really look like.
The problem is that:
1) It's very difficult to manually or automatically scrub identifying information from video, and doing so probably greatly reduces the value of the data for AI training purposes
2) If you allow users to delete videos that have been sent, then you no longer have a reproducible corpus of data
3) If you keep the recordings for, say, a week and allow the user to delete them before they're sent, users either won't have time to review them (reviewing hundreds of Sentry clips comes to mind) or will just forget about it, and we'll have the same problem.
I suspect they don't want to talk much about the data collection in general because, just as Spotify built a major value-add ("we have lots of playlists") by tricking users into making their playlists public by default (it's actually very non-obvious when playlists are created), having a huge fleet of cars recording video is hugely valuable for development of self driving.
Where I'm conflicted is that, while I personally consider this kind of thing an invasion of privacy and overall a nuisance, I recognize that there is societal good/value to be gleaned from mass collection of datasets. If we asked users whether they wanted to share their playlists or car video recordings, we would end up with almost no data, and the data we got would invariably be biased (see Apple Music and the paucity of good user playlists).
Clearly there's a missing middle solution, but I'm not sure there is a good one.
> While some of the behaviour here is less professional than I'd expect for user data
It's like showing up to a school shooting and saying, "while some of the behavior here is less professional than I'd expect for an interaction with children..."
If I have already paid tens of thousands of dollars / pounds for your car, why am I also generating data for your AI garbage?
Opt in if you like, with a coherent discussion and understanding of the privacy risks. Have a "Tesla club" where you opt in and get sent a bottle of wine every month or something, but the attitude that you should get this information by default, for free, is fucked up.
> having a huge fleet of cars recording video is hugely valuable for development of self driving.
Ok. But so what? Doing medical experiments on large, unknowing populations of humans might be hugely valuable for development of medicine, but we don't do it. Claiming there's some potential future benefit to violating someone's autonomy today is weak sauce at best and highly unethical at worst.
> I'm wondering what a good architecture for building a system to leverage visual telemetry without harming user privacy would really look like
For starters, employees should have no way to save images from customers. No way to export them, load them on a USB drive, FTP them, or take a picture with their smartphone. If you need people to do tagging, fine, do it [WITH PERMISSION] on special workstations in restricted areas.
This is the solution and something that most companies refuse to do. At best there's some kind of opt-out, but companies love hoovering up everything they can get their hands on whether you want them to or not. It's bad enough when it happens in a corporate-owned space like a grocery store, office, or mall, but when it's happening with items you ostensibly own on your own property it feels fucking criminal. I don't understand why consumers and legislators continue to tolerate it.
Yeah, clearly we should be sympathetic to the difficulties AI researchers have in response to an article about how regular employees are sharing pictures of people being intimate in cars.