Hacker News new | past | comments | ask | show | jobs | submit login

What are some example sites where this is both necessary and sufficient? In my experience sites with serious anti-bot protection basically always have JavaScript-based browser detection, and some are capable of defeating puppeteer-extra-plugin-stealth even in headful mode. I doubt sites without serious anti-bot detection will do TLS fingerprinting. I guess it is useful for the narrower use case of getting a short-lived token/cookie with a headless browser on a heavily defended site, then performing requests using said tokens with this lightweight client for a while?



A lot of WAFs make it a simple thing to set up. Since it doesn't require any application-level changes, it's an easy "first move" in the anti-bot arms race.

At the time I wrote this up, r1-api.rabbit.tech required TLS client fingerprints to match an expected value, and not much else: https://gist.github.com/DavidBuchanan314/aafce6ba7fc49b19206...

(I haven't paid attention to what they've done since so it might no longer be the case)


Makes sense, thanks.


There are sites that will block curl and python-requests completely, but will allow curl-impersonate. IIRC, Amazon is an example that has some bot protection but it isn't "serious".


In most cases this is just based on user agent. It's widespread enough that I just habitually tell requests not to set a User Agent at all (these aren't blocked, but if the UA contains "python" it is).


Lots of sites, actually.

> I doubt sites without serious anti-bot detection will do TLS fingerprinting

They don't set it up themselves. CloudFlare offer such thing by default (?).


Pretty sure it’s not default, and Cloudflare browser check and/or captcha is a way bigger problem than TLS fingerprinting, at least was the case the last time I scraped a site behind Cloudflare.


CloudFlare offers it. Even if it's not used for blocking it might be used for analytics or threat calculations, so you might get hit later.


Those JavaScript scripts often get data from some API, and it's that API that will usually be behind some fingerprinting wall.




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: