Yep, that's why I've built my own; the existing ones don't give out a list of th...

jdc0589 · on Nov 17, 2016

The biggest headache with the C# implementation was the threading. A lot of the out-of-the-box threading structures (pools, etc.) have limitations you might not think to check for; e.g. you can't set the number of threads lower than the CPU count on the machine with some of the official .NET ThreadPool helpers. You can try, but it will just silently ignore you.
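A minimal sketch of that gotcha, assuming the helper in question is `ThreadPool.SetMaxThreads` (the comment doesn't name it). As far as I know, the call doesn't throw when the requested worker count is below the processor count; it returns `false` and leaves the limits unchanged, which is easy to miss if the return value is discarded. `ThreadPoolLimitDemo` and `TrySetWorkerLimit` are made-up names for illustration.

```csharp
using System;
using System.Threading;

public static class ThreadPoolLimitDemo
{
    // Hypothetical helper: asks the pool to cap worker threads at `workers`
    // and reports whether the runtime accepted the request.
    public static bool TrySetWorkerLimit(int workers)
    {
        int currentWorkers, ioThreads;
        // Keep the existing I/O completion thread limit untouched.
        ThreadPool.GetMaxThreads(out currentWorkers, out ioThreads);
        // Returns false (no exception) when the request is rejected.
        return ThreadPool.SetMaxThreads(workers, ioThreads);
    }

    public static void Main()
    {
        // One fewer than the CPU count: the case described above.
        int tooLow = Environment.ProcessorCount - 1;
        bool accepted = TrySetWorkerLimit(tooLow);

        int workersNow, ioNow;
        ThreadPool.GetMaxThreads(out workersNow, out ioNow);
        Console.WriteLine("requested " + tooLow + ", accepted: " + accepted
            + ", worker max is now " + workersNow);
    }
}
```

If you really need fewer concurrent workers than cores, one common alternative is to throttle the work yourself (a `SemaphoreSlim`, for example) rather than resizing the shared pool.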

There is some super useful stuff too, though, that made it easy to write a generic, extensible crawler. My implementation ended up supporting separately compiled plugins you could just dump in a 'plugins\' directory, which responded to events and had full ability to manipulate the output pipeline. Doable in lots of languages, but C# has some formalized helpers around it that make it super easy.
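A rough sketch of the drop-in plugin idea, not the actual implementation. The comment doesn't say which helpers were used (MEF is one candidate), so this assumes plain reflection instead. `ICrawlerPlugin`, `OnPageCrawled`, and `PluginLoader` are hypothetical names, and a single callback stands in for the event hooks.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Reflection;

// Hypothetical contract that every plugin assembly implements.
public interface ICrawlerPlugin
{
    // Called for each crawled page; returns the (possibly rewritten) output
    // that continues down the pipeline.
    string OnPageCrawled(string url, string output);
}

public static class PluginLoader
{
    // Instantiates every concrete ICrawlerPlugin type found in one assembly.
    // Assumes each plugin has a public parameterless constructor.
    public static List<ICrawlerPlugin> FromAssembly(Assembly assembly)
    {
        return assembly.GetTypes()
            .Where(t => t.IsClass && !t.IsAbstract
                        && typeof(ICrawlerPlugin).IsAssignableFrom(t))
            .Select(t => (ICrawlerPlugin)Activator.CreateInstance(t))
            .ToList();
    }

    // Loads every .dll in the directory and collects the plugins inside.
    // A missing directory simply yields no plugins.
    public static List<ICrawlerPlugin> FromDirectory(string directory)
    {
        var plugins = new List<ICrawlerPlugin>();
        if (!Directory.Exists(directory)) return plugins;

        foreach (string path in Directory.GetFiles(directory, "*.dll"))
        {
            Assembly assembly = Assembly.LoadFrom(Path.GetFullPath(path));
            plugins.AddRange(FromAssembly(assembly));
        }
        return plugins;
    }
}
```

The crawler would then call something like `PluginLoader.FromDirectory("plugins")` at startup and pass each page's output through every plugin in turn.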