I have to disagree with the statement that those techniques are not being used in the wild. I've observed a porn advertising network delivering some JS that opened a popup in the background, loaded a cookie-setting PDF served from a third-party domain, and then immediately closed the popup again. I was wondering what that was about; now it's clear to me.
Pornhub. It could of course be a popup playing a different role (e.g. part of a "you need to upgrade your vulnerable software naow!1" scheme) that's only visible if no blockers are used at all.
^ important. When I'm trying to provide free access, I always look for ways to mitigate cost. If the API can update slowly, like once a day or whatever, then I stick everything behind a static CDN or something similar, so it should cost you next to nothing: just a small CPU somewhere that rebuilds the JSON endpoints each update cycle.
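A minimal sketch of that setup in Python; fetch_records() and the output paths are hypothetical stand-ins for your actual data source and the directory your CDN syncs from:

```python
import json
import pathlib

OUT_DIR = pathlib.Path("public/api/v1")  # directory your CDN/object store serves statically

def fetch_records():
    # Hypothetical data source; swap in your DB query or upstream API call.
    return [{"id": 1, "name": "example"}, {"id": 2, "name": "another"}]

def build_endpoints():
    """Render every 'endpoint' as a static JSON file, once per update cycle."""
    OUT_DIR.mkdir(parents=True, exist_ok=True)
    records = fetch_records()
    (OUT_DIR / "items.json").write_text(json.dumps(records))
    for record in records:
        # One file per resource, so /api/v1/item-1.json behaves like GET /items/1.
        (OUT_DIR / f"item-{record['id']}.json").write_text(json.dumps(record))

if __name__ == "__main__":
    build_endpoints()  # e.g. run from a daily cron job, then sync OUT_DIR to the CDN
```

From there, a cron entry plus something like `aws s3 sync` (or your CDN's equivalent) is the entire "backend".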
That's been exactly my experience. Most time is spent connecting or waiting for the server's response (TTFB). Using an async I/O event loop on top of epoll/kqueue, you can handle thousands of concurrent connections. You then push the responses to your worker nodes, which process the data in a multi-threaded fashion; stream-processing frameworks like Apache Spark or Storm work great for that.
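For illustration, a minimal sketch of that pattern in Python using asyncio and aiohttp (asyncio's default event loop sits on epoll on Linux and kqueue on BSD/macOS); the URLs and concurrency limit are placeholders:

```python
import asyncio
import aiohttp  # third-party: pip install aiohttp

CONCURRENCY = 1000  # epoll/kqueue handle this many mostly-idle sockets just fine

async def fetch(session, sem, url):
    async with sem:  # cap the number of in-flight requests
        try:
            async with session.get(url, timeout=aiohttp.ClientTimeout(total=30)) as resp:
                return url, await resp.text()
        except (aiohttp.ClientError, asyncio.TimeoutError):
            return url, None  # a real crawler would log and retry here

async def crawl(urls):
    sem = asyncio.Semaphore(CONCURRENCY)
    async with aiohttp.ClientSession() as session:
        # One event loop drives all requests; most of the wall-clock time
        # is spent waiting on the network, not on CPU.
        return await asyncio.gather(*(fetch(session, sem, u) for u in urls))

if __name__ == "__main__":
    urls = [f"https://example.com/page/{i}" for i in range(10_000)]  # placeholder URLs
    pages = asyncio.run(crawl(urls))
    # From here, hand the responses off to worker processes or a Spark/Storm
    # topology for the CPU-bound parsing, as described above.
```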
Crawling may be cheap, but you also want to store that data and make it queryable without waiting minutes for a query to return. That makes it way more expensive.
By combining those two APIs you could get a list of projects that exist on both GitLab and GitHub. Using the created_at field from both APIs, you could figure out which copy was there first and which one was imported/pushed onto the other platform.
(you would of course miss all projects that have already been deleted on GitHub, although forks should still exist, which should help in most cases)
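A rough sketch of that created_at comparison against the two public APIs; the project identifiers at the bottom are hypothetical placeholders:

```python
from datetime import datetime
import urllib.parse
import requests  # third-party: pip install requests

def _parse(ts):
    # Both APIs return ISO 8601 timestamps; Python < 3.11 can't parse the 'Z' suffix.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))

def github_created_at(owner, repo):
    # Public GitHub REST API; unauthenticated requests are heavily rate-limited.
    r = requests.get(f"https://api.github.com/repos/{owner}/{repo}")
    r.raise_for_status()
    return _parse(r.json()["created_at"])

def gitlab_created_at(path):
    # GitLab v4 API takes the URL-encoded "namespace/project" path.
    encoded = urllib.parse.quote(path, safe="")
    r = requests.get(f"https://gitlab.com/api/v4/projects/{encoded}")
    r.raise_for_status()
    return _parse(r.json()["created_at"])

def original_platform(gh_owner, gh_repo, gl_path):
    # Whichever copy is older is presumably the original.
    if github_created_at(gh_owner, gh_repo) < gitlab_created_at(gl_path):
        return "GitHub"
    return "GitLab"

# e.g. original_platform("someuser", "someproject", "someuser/someproject")  # hypothetical
```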
Herd mentality is very much in evidence in the stock market, and there's often no logic to it. By standing apart from the herd, and sometimes running in the opposite direction, you can often make a killing. Example: the 2008 crash. A lot of people panicked and sold because the market dropped (and the market dropped because a lot of people were selling).
Since demand far exceeds supply, most consultancies take on less experienced candidates, then train them and/or pair them with a more experienced colleague.
I've worked for several smaller consultancies focusing on different domains and technology stacks, and I always learned them on the job.
tl;dr:
TLS inspection is just another tool in your toolbox for controlling corporate network traffic. While it might help avert infections and detect exfiltration traffic, it's by no means required by the GDPR.
The reasoning for this is mostly:
Malware can use TLS to load malicious payloads and exfiltrate data + data loss and data breaches are targeted by the GDPR => decrypting the traffic lets you detect the malicious activity and prevent the infection / notice the exfiltration, which can help you stay GDPR compliant.
IANAL, but as long as it's a black box, the traffic isn't stored or otherwise accessible, the logs don't contain any personal information, and the users are informed about this processing, it should be okay.