What are some example sites where this is both necessary and sufficient? In my e...

Retr0id · 2024-12-30T14:10:32 1735567832

A lot of WAFs make it a simple thing to set up. Since it doesn't require any application-level changes, it's an easy "first move" in the anti-bot arms race.

At the time I wrote this up, r1-api.rabbit.tech required TLS client fingerprints to match an expected value, and not much else: https://gist.github.com/DavidBuchanan314/aafce6ba7fc49b19206...

(I haven't paid attention to what they've done since, so that might no longer be the case.)
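To make "fingerprint must match an expected value" concrete, here is a minimal sketch assuming a JA3-style fingerprint: an MD5 over the ClientHello's TLS version, cipher suites, extensions, curves, and point formats. The comment above does not say which fingerprint scheme the service used, so JA3 is an assumption. The helper names (`ja3_string`, `ja3_hash`, `matches_expected`) and the sample values are made up for illustration.

```python
import hashlib
from typing import Sequence


def ja3_string(
    version: int,
    ciphers: Sequence[int],
    extensions: Sequence[int],
    curves: Sequence[int],
    point_formats: Sequence[int],
) -> str:
    """Build the JA3 input string: fields joined by ',', values within a field by '-'."""
    groups = (ciphers, extensions, curves, point_formats)
    fields = [str(version)] + ["-".join(str(v) for v in group) for group in groups]
    return ",".join(fields)


def ja3_hash(ja3: str) -> str:
    """JA3 fingerprints are the MD5 hex digest of the JA3 string."""
    return hashlib.md5(ja3.encode("ascii")).hexdigest()


def matches_expected(ja3: str, expected_hash: str) -> bool:
    """Hypothetical server-side check: accept only the one expected fingerprint."""
    return ja3_hash(ja3) == expected_hash


# Illustrative ClientHello values (not taken from any real client).
browser_like = ja3_string(771, [4865, 4866, 4867], [0, 10, 11], [29, 23], [0])
# Same cipher suites in a different order, as another TLS stack might send them.
other_client = ja3_string(771, [4867, 4866, 4865], [0, 10, 11], [29, 23], [0])

expected = ja3_hash(browser_like)
print(browser_like)
print(matches_expected(browser_like, expected))
print(matches_expected(other_client, expected))
```

Because the order of cipher suites and extensions feeds into the hash, a client with a different TLS stack gets a different fingerprint even when it sends identical HTTP headers. That is why tools like curl-impersonate reproduce the browser's handshake and not only its headers.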

oefrha · 2024-12-30T14:20:10 1735568410

Makes sense, thanks.

jonatron · 2024-12-30T14:10:27 1735567827

There are sites that will block curl and python-requests completely, but will allow curl-impersonate. IIRC, Amazon is an example that has some bot protection but it isn't "serious".

ekimekim · 2024-12-30T16:20:51 1735575651

In most cases this is just based on the user agent. It's widespread enough that I habitually tell requests not to set a User-Agent at all (requests with no UA aren't blocked, but ones whose UA contains "python" are).
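A minimal sketch of dropping the default User-Agent in `requests`, assuming a `requests.Session` is in use. The helper name `build_session_without_user_agent` and the `example.com` URL are placeholders. Nothing is sent over the network here: the request is only prepared so its headers can be inspected.

```python
import requests


def build_session_without_user_agent() -> requests.Session:
    """Return a Session whose default headers carry no User-Agent entry."""
    session = requests.Session()
    # A new Session defaults to "python-requests/<version>"; remove that entry.
    session.headers.pop("User-Agent", None)
    return session


session = build_session_without_user_agent()

# Placeholder URL; prepare_request() merges session and request headers
# without opening a connection.
prepared = session.prepare_request(requests.Request("GET", "https://example.com/"))
has_user_agent = "User-Agent" in prepared.headers
print(has_user_agent)
```

One caveat: this only controls what `requests` itself adds. As far as I know, urllib3 2.x fills in its own `python-urllib3/...` User-Agent when none is supplied, so it is worth checking what actually goes over the wire for the versions you have installed.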

thrdbndndn · 2024-12-30T17:41:53 1735580513

Lots of sites, actually.

> I doubt sites without serious anti-bot detection will do TLS fingerprinting

They don't set it up themselves. Cloudflare offers this by default (?).

oefrha · 2024-12-30T18:18:14 1735582694

Pretty sure it's not the default, and Cloudflare's browser check and/or captcha is a way bigger problem than TLS fingerprinting; at least that was the case the last time I scraped a site behind Cloudflare.

Avamander · 2024-12-30T14:39:48 1735569588

Cloudflare offers it. Even if it's not used for blocking, it might be used for analytics or threat scoring, so you might get hit later.

remram · 2024-12-30T18:50:24 1735584624

Those JavaScript scripts often get data from some API, and it's that API that will usually be behind some fingerprinting wall.