
@everyone who doesn't work in hosting....

>"why did Rails Machine throw out the site so quickly?" If you run a datacenter, you pay for an uplink. That uplink has limited capacity. 4gbit, 10gbit...whatever. A big attack can saturate that link completely, so even with the biggest most expensive "mitigation device" on the market (some of this gear can get into the hundreds-of-thousands-of-dollars for /one/ device, mind you), if a DDoS is overloading your upstream bandwidth providers, you can either have your entire DC brought to a crawl, or null route the site.

With that said, how did CloudFlare keep lulzsec up? Anycast, lots of iron, lots of smart technicians, and probably tens to hundreds of thousands of dollars in bandwidth fees. TL;DR it was a publicity stunt that they very smartly played up.

DDoS is pretty misunderstood, and lots of clients think there is some magical box that can take all the traffic. Again, if your link is saturated, a "mitigation" device can only filter traffic that has already come down your pipe; your upstream providers can and will take you offline if you don't fix it. Failing that, you get a massive overage bill and every other client at the facility is crawling. A mitigation device isn't really a solution on its own (it /can/ help with smaller attacks, but for the big stuff, null routing is the best option unless you have something like CloudFlare -- and even they will pull the plug if the attack gets too heavy, because it's simply not worth the expense to them to keep your site online).



As someone who does not work in hosting, I am somewhat surprised that "upstream bandwidth providers" don't have mitigation strategies for DDoS attacks. It is my understanding that most of the asshats out there aren't sitting on Stuxnet: they have well under a hundred machines to flood you with traffic (which is certainly more than enough). It seems like there are network-level mechanisms you could use to just block such an attack at the ingress connections. Is anyone willing to explain more about the issues here? (I, at least, would find it fascinating.)


Dropping UDP is a start, because most of these attacks randomize the UDP source address. The real problem is AS operators who let packets with forged source addresses escape their network. If your outgoing edge ACL is not dropping source addresses that do not belong to you, you are doing it wrong. Period. Full stop.
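
To make the egress-filtering point concrete, here's a minimal sketch of the check an edge ACL performs: only forward outbound packets whose source address belongs to a prefix you actually originate. The prefixes and names are placeholders, not any particular router's config.

    # Minimal sketch of the egress (anti-spoofing) check an edge ACL performs.
    # The prefixes are placeholders; a real deployment would list the networks
    # the AS actually originates.
    import ipaddress

    OWNED_PREFIXES = [
        ipaddress.ip_network("203.0.113.0/24"),
        ipaddress.ip_network("198.51.100.0/24"),
    ]

    def should_forward(source_ip: str) -> bool:
        """Forward outbound traffic only if its source address is ours."""
        src = ipaddress.ip_address(source_ip)
        return any(src in prefix for prefix in OWNED_PREFIXES)

    # A forged source (e.g. a randomized address in a UDP flood) gets dropped:
    assert should_forward("203.0.113.42") is True
    assert should_forward("192.0.2.7") is False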

If the packets have randomized UDP source addresses, then there's not really much you can do on the receiving end except strategies that key on the destination address. There's really only one effective one: nulling it.
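
And for what "nulling it" amounts to: at its simplest it's a blackhole route for the victim address, which upstreams do network-wide with BGP-triggered blackholing. A rough sketch on a Linux box with iproute2 (the victim address is a placeholder, and this needs root):

    # Rough sketch: blackhole (null route) all traffic destined for the victim.
    # 203.0.113.10 is a placeholder; requires root and iproute2. Upstreams do
    # the equivalent network-wide via BGP-triggered blackholing.
    import subprocess

    def null_route(victim_ip: str) -> None:
        """Drop everything destined for victim_ip at this router."""
        subprocess.run(["ip", "route", "add", "blackhole", f"{victim_ip}/32"], check=True)

    def restore(victim_ip: str) -> None:
        """Remove the blackhole once the attack subsides."""
        subprocess.run(["ip", "route", "del", "blackhole", f"{victim_ip}/32"], check=True)

    if __name__ == "__main__":
        null_route("203.0.113.10")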


Okay, but then why null route the entire machine, as opposed to dropping just the UDP packets going towards it? If you already have routing infrastructure that can disseminate "attempts to access this IP address will fail" it does not seem a stretch to disseminate "attempts to access this IP address over UDP will fail; over TCP there is no issue". Are "upstream bandwidth providers" really that impotent against that kind of issue? :(

I, personally, have absolutely no need for incoming UDP packets of any kind entering my network, and cannot come up with a reason why any web hosting company would: it seems like it should almost be a question on your bandwidth contract, "will you need UDP (recommendation: no)". (Given the simplicity of the UDP problem, I personally assumed that the complexity would come from stateful TCP filtering issues.)


The issue with dropping UDP is that DNS uses UDP in most implementations. Unless you have no need for DNS on your network, you probably don't want UDP packets dropped completely.


Ah, that a web host might want to run their own DNS in house (maybe for ease or cost) is not something I considered (I outsource DNS, as it is sufficiently performance-sensitive that you really want to anycast it across numerous networks, and there are people who specialize in that). As a client you can just use TCP for DNS. (Again: I am not a host ;P. Thanks!)
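
For the "just use TCP for DNS" part, a client can force it explicitly. A small sketch using the dnspython library (assuming dnspython 2.x, where resolve() takes a tcp flag):

    # Sketch: resolve a name over TCP instead of UDP using dnspython
    # (pip install dnspython). Assumes dnspython 2.x, where resolve()
    # accepts tcp=True; older versions used resolver.query().
    import dns.resolver

    resolver = dns.resolver.Resolver()
    answer = resolver.resolve("example.com", "A", tcp=True)  # force TCP transport
    for record in answer:
        print(record.address)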


That's what most providers (certainly the one I've worked at) do, in fact: null route the entire machine. Rather than leave you nulled in the router for weeks waiting for the attack to subside, eventually they'll just cut you loose. That's the typical form these things take.

It would be tempting to blackhole UDP, but it's just as easy to flood a pipe with TCP. You don't need an established connection to get packets into the pipe, and there are many mismanaged networks on planet Earth (as I alluded to) that let just about anything exit. A small botnet of five or ten machines with gigabit uplinks on poorly-managed networks is enough to be a force to be reckoned with. When I was an administrator on IRC, Cisco routers themselves were common targets to exploit and use for this purpose; they're often plugged directly into gigabit or even ten-gigabit connectivity, and there are really easy ways in IOS to perform a DoS attack.

I can't speak to the decision Pastie's host made in this case, but it sounds like, since Rails Machine was donating resources (and Pastie wasn't paying them), Rails Machine had to act like a smart company -- certainly, any entrepreneur who enjoys Hacker News would sympathize -- and protect its income flow. Which means: you mess with my customers, I fire you.


"You mess with my customers"? So, pastie.org was asking for it by hosting a free-form data pasting site? Again, this is you acting like pastie.org is the one at fault and is responsible for a bunch of idiots deciding to saturate the line.

It seems as though you're skewing the issue here, and I think the real issue has nothing to do with whether pastie.org was a paying customer. I'd be interested to know if RM would do the same thing if there were no sponsorship arrangement and Pastie were paying regular bills. My hypothesis is they'd throw them under that same bus -- and that's really what this comes down to. It's hard to be sympathetic with a company that gives up on its customers (paying or not) after "9 hours". Given that they had been hosted for 3 years prior, a night of DDoSing seems like a really isolated incident, and no reason to drop them permanently. Of course, we don't know if there were other DDoSes, but given that wrecked was so eager to share the piracy concerns and didn't mention any other DDoSes, I don't think there were any.


Imagine for a moment that your million-dollar app on Amazon goes down. You file a ticket. They are currently absorbing a DoS attack, but they can't tell you that due to their privacy policy with the victim. So instead they tell you they are looking into it and it appears to be some kind of network issue.

Nine hours pass. You get frustrated. You take to Twitter. "Anybody else on Amazon down?" you ask. Several people confirm that they are. You tweet from your company account that it's an Amazon issue. You start Googling alternatives. You write a blog post, months later, about how incompetent Amazon must be and how glad you are that you moved your million-dollar app to Rackspace Cloud. You make the front page of Hacker News. Hundreds follow you. Amazon gains a reputation for unreliability among those who read HN. Sales start decreasing.

Or, they null the customer and none of this happens.

Welcome to hosting.


Yes, that.


I've been in your shoes. Keep your chin up.


That's assuming the AS operator is not actually sponsoring the DDoS attack.


Good point. Then I blame upstreams.


A question about null routing - would the upstream provider or the datacenter do that?

Wouldn't one have to be dropping the packets at the upstream level in order to keep the pipe clear? At larger scales, is it common for the datacenter to own/control a router upstream, or would that be something that they would have to get in touch with their upstream provider to make happen?


People who haven't worked in hosting don't realize that not only is the gear $100,000+, but the administrators who understand it are just as expensive. Annually. (Mitigation is a relatively rare skill for network administrators.)

Edit: Let's say you run Joe's Web Hosting. Joe's has three facilities, and you run redundant ten gigabit uplinks at each. Last I priced a device that could handle ten gigabit at line rate, it was ~$120,000, so figure:

   $120,000 x 2 uplinks x 3 facilities = $720,000
Just for the gear.

(I honestly don't remember if that figure was for the gigabit device or the ten gigabit device. I think the ten.)


your example is pointless, as no actual datacenter pays its equipment vendor anywhere near list price for those devices. i certainly don't.

network equipment vendors like to have high list prices in order to:

- ensure they get sales inquiries from serious buyers;

- give sales representatives a large amount of leverage when negotiating the actual line-item pricing for each purchase order;

- promote some kind of enhanced "value" to executives ("it costs a lot of money, so it must be good!").

a realistic price for two routers with two 10gig-e linecards is about $15,000-$25,000 per unit direct from brocade. cisco would be around the $35,000-$50,000 per unit ballpark. if you're paying any more than that, you're getting extremely shafted.

further, the circuits for the uplinks aren't going to cost that much themselves, as actual transit is all sold under percentile billing (not even 95th percentile; usually more like 90th or 80th at over-gig-e levels these days). commits for "Joe's Web Hosting" might only be 1.5gig on each 10gig circuit, which cuts down the pricing a lot right there.
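
as a worked example of how percentile billing keeps the circuit cost down (all numbers here are made up for illustration, not real transit pricing): the provider samples utilization every five minutes, throws away the top slice of samples, and bills the highest remaining sample against your commit.

    # illustrative sketch of percentile (burstable) billing. sample values,
    # commit level and overage rate are made up, not real transit pricing.
    import random

    random.seed(1)

    # a month of 5-minute utilization samples on a 10gig port, in Mbps.
    samples = sorted(random.uniform(200, 1800) for _ in range(30 * 24 * 12))

    def billable_mbps(samples_sorted, pct):
        """discard the top (100 - pct)% of samples, bill the max of the rest."""
        keep = int(len(samples_sorted) * pct / 100)
        return samples_sorted[keep - 1]

    commit_mbps = 1500        # hypothetical 1.5gig commit on the 10gig circuit
    overage_rate = 3.00       # hypothetical $/Mbps for traffic above the commit

    billed = billable_mbps(samples, 90)   # 90th percentile, as mentioned above
    overage = max(0.0, billed - commit_mbps)
    print(f"90th percentile: {billed:.0f} Mbps, overage: {overage:.0f} Mbps, "
          f"overage charge: ${overage * overage_rate:.2f}")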

in terms of getting an experienced engineer, most attacks can be detected and mitigated automatically using netflow/sflow/ipfix flow analysis. i wrote software which handles the detection automatically, with a fairly high success rate, and it's free to use on bitbucket. :)
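
roughly, the detection side boils down to something like this (an illustrative sketch only, not the actual tool mentioned above; record fields and thresholds are made up):

    # illustrative sketch of threshold-based detection over exported flow
    # records (netflow/sflow/ipfix). not the actual tool mentioned above;
    # field names and thresholds are made up.
    from collections import defaultdict
    from dataclasses import dataclass

    @dataclass
    class FlowRecord:
        dst_ip: str
        octets: int
        packets: int

    PPS_THRESHOLD = 500_000          # packets/sec toward a single host
    BPS_THRESHOLD = 2_000_000_000    # ~2 gbit/s toward a single host

    def detect_targets(records, window_seconds):
        """return destination IPs whose aggregate rate crosses a threshold."""
        octets_by_dst = defaultdict(int)
        pkts_by_dst = defaultdict(int)
        for rec in records:
            octets_by_dst[rec.dst_ip] += rec.octets
            pkts_by_dst[rec.dst_ip] += rec.packets

        targets = []
        for dst, octets in octets_by_dst.items():
            bps = octets * 8 / window_seconds
            pps = pkts_by_dst[dst] / window_seconds
            if bps > BPS_THRESHOLD or pps > PPS_THRESHOLD:
                targets.append(dst)
        return targets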


This was a price I received at a trade show directly from a vendor who isn't Cisco or Brocade. I honestly don't remember who, because I lost interest, but I believe it was Black Lotus. It was priced per gigabit, and we needed ten: ~$12,000/gigabit, so a $120,000 device. Someone from Black Lotus can feel free to correct me. I was uninterested in negotiations from the get-go based on that ballpark, so if what you say is true, these vendors are doing themselves a disservice by talking me out of investigating them further simply on a ballpark price. Also, I wasn't pricing routers, I was pricing mitigation gear. Cisco discontinued theirs, didn't they? The Guard? I seem to recall an admin who deployed Guards telling me they were (are?) six figures as well.

I worked for a serious buyer when I had this casual conversation with the vendor. You do business with my former employer (I've spoken to you before, when you took over cia.vc). I'd consider those facilities "real" datacenters, and they're certainly at least the same size as, if not significantly larger than, your hosting outfit. No need to appeal to authority with me; honestly, I think we're on the same page -- I think your comment misread me as pricing routers instead of mitigation gear.

As for flows, I was interested in a solution I didn't have to write software to implement. I'd rather have a supported appliance that can handle figuring out DoS attacks itself than parse flows and feed that information back myself. That way, if the software doesn't work, I can blame somebody else rather than myself. My time is precious. Yours sounds less so, and that's your prerogative.


yes, Cisco discontinued the Guard module, which is unfortunate, as it did a pretty good job at determining what ACLs needed to be generated and applying them, and was actually much cheaper than the appliances.

in terms of mitigation hardware, all of those appliances are a ripoff. the solution is to replace them with free software that does automatic analysis on the flows and then automatically hands the results to whatever does the mitigation, putting the appliance vendors out of business.

i have a system set up that does automatic mitigation using ddosmon, with a "scrubbing center" running on freebsd-based devices. this is accomplished with a custom 'action' module in ddosmon which does three things:

- calculate the necessary ACLs

- insert them into the appropriate pf tables so the mitigation strategy fits my ruleset

- direct the router to send traffic for the IP being flooded to the scrubbing appliance

this is basically the way that the mitigation appliance vendors tell you to do it if you're handling >10gig floods anyway.
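
for anyone curious what such an 'action' hook looks like, here's a very rough sketch of those three steps (illustrative only -- not the real module's code; the pf table name, next-hop, and route-announce plumbing are placeholders):

    # very rough sketch of a mitigation "action" hook. illustrative only;
    # table name, next-hop and the announce mechanism are placeholders.
    import subprocess

    PF_TABLE = "ddos_targets"       # pf table the scrubbing ruleset matches on
    SCRUB_NEXT_HOP = "192.0.2.1"    # next-hop of the freebsd scrubbing box

    def build_acl(target_ip):
        # 1. calculate the necessary ACL: here, just the /32 under attack.
        return [f"{target_ip}/32"]

    def add_to_pf_table(prefixes):
        # 2. insert the prefixes into the pf table so the mitigation strategy
        #    fits the existing ruleset (pfctl -t <table> -T add is standard pf).
        subprocess.run(["pfctl", "-t", PF_TABLE, "-T", "add", *prefixes], check=True)

    def announce_route(prefix, next_hop):
        # placeholder: in a real deployment this would talk to a BGP daemon or
        # push a route to the router; that part is site-specific.
        print(f"announce {prefix} next-hop {next_hop}")

    def divert_to_scrubber(prefixes):
        # 3. direct the router to send traffic for the flooded IP to the
        #    scrubbing appliance by injecting a more-specific route.
        for prefix in prefixes:
            announce_route(prefix, next_hop=SCRUB_NEXT_HOP)

    def on_attack_detected(target_ip):
        prefixes = build_acl(target_ip)
        add_to_pf_table(prefixes)
        divert_to_scrubber(prefixes)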

i think ddos mitigation is really a place where free software can cause a massively needed paradigm shift.


I will check it out, and it genuinely sounds interesting. Thanks.

> i think ddos mitigation is really a place where free software can cause a massively needed paradigm shift.

I'd argue resilient, reliable network gear in general is such a place. Reassuring to hear that Google is adopting OpenFlow and rolling their own... maybe that'll trickle down. Cisco gear has led the way in being overpriced for fucking years.



