
What I meant by "combining" is that your router effectively becomes the VIP: it is the load balancer, and the peers are chosen and routed to from it. A normal router, by contrast, merely passes routed traffic into the network and lets a different device handle load balancing. The idea is that different devices serve different purposes, and separating their functions can improve overall stability and increase the flexibility of your network services.

One downside here is that ECMP assumes all paths cost the same, which is a ridiculous assumption in real-world load balancing. One of your haproxies is going to get overloaded, and then traffic to your site is going to intermittently suck balls as sessions stream into both under-loaded and over-loaded boxes.
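
A quick sketch of the problem (hypothetical Python, not any vendor's actual hash function): the router hashes the packet's 5-tuple and pins the flow to one next hop, with zero awareness of how loaded the chosen box is:

    import hashlib

    def ecmp_next_hop(src_ip, src_port, dst_ip, dst_port, proto, next_hops):
        # Hash the 5-tuple and index into the equal-cost next hops.
        # Real routers use vendor-specific hardware hashes; this just
        # illustrates the idea.
        key = f"{src_ip}:{src_port}:{dst_ip}:{dst_port}:{proto}".encode()
        h = int.from_bytes(hashlib.sha256(key).digest()[:8], "big")
        return next_hops[h % len(next_hops)]

    haproxies = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
    # Every packet of a given flow hashes identically, so the session
    # stays pinned to one proxy -- idle or melting down alike.
    print(ecmp_next_hop("198.51.100.7", 51312, "203.0.113.10", 443, 6, haproxies))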

Of course, you have the same problem with round-robin DNS pointing at load balancers, but with an LVS director in DR mode, for example, at least it's just starting the connection and handing it off to an appropriate proxy rather than randomly pinning sessions to specific interfaces. With DR, the backend proxy determines its own return path; the LVS VIP isn't in the return path at all. And LVS can pick a destination based on real-world load.
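
That load-aware pick is roughly what LVS's "wlc" (weighted least-connections) scheduler does at connection time - a toy sketch, not the kernel implementation:

    def pick_real_server(servers):
        # Prefer the backend with the fewest active connections per
        # unit of weight, roughly what LVS's 'wlc' scheduler does.
        return min(servers, key=lambda s: s["active"] / s["weight"])

    servers = [
        {"ip": "10.0.0.11", "weight": 1, "active": 420},
        {"ip": "10.0.0.12", "weight": 1, "active": 37},   # under-loaded
        {"ip": "10.0.0.13", "weight": 2, "active": 300},
    ]
    print(pick_real_server(servers)["ip"])  # -> 10.0.0.12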

The other downside you seem to gloss over with regard to scaling is the maximum of 16 ECMP next hops in the forwarding table. I'm sure we'll never need more than 16 of those, though... (For reference: the company I used to work for had up to 23 proxies just for one application, which might cause some hiccups with this setup.)

Doing maintenance on a VIP address and doing maintenance on one of these BGP peers works about the same: you stop accepting new connections, let old connections expire, then take down the VIP. As for changing DNS records, instead of that you can either bring up a hot-spare VIP with the IP of the one you want to maintain, or add that VIP's IP to an existing load balancer.
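
In the BGP-peer case the drain is just a route withdrawal followed by waiting the connections out - something like this sketch, assuming BIRD as the routing daemon ("vip_export" is a made-up protocol name) and counting sockets very crudely:

    import subprocess, time

    # 1. Withdraw the VIP route; the router re-hashes new flows to the
    #    remaining peers while existing flows on this box keep working.
    subprocess.run(["birdc", "disable", "vip_export"], check=True)

    # 2. Let established sessions drain. Crude: this counts every
    #    established TCP socket, including your own SSH session --
    #    filter on the VIP address in practice.
    def established_count():
        out = subprocess.run(["ss", "-Htn", "state", "established"],
                             capture_output=True, text=True, check=True).stdout
        return len(out.splitlines())

    while established_count() > 0:
        time.sleep(5)

    # 3. Safe to stop the proxy / swap hardware / reboot now.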




Ok, I understand what you meant now. I do disagree, though - the router is just doing what routers do, and has no specific knowledge of or configuration for the VIP. It forwards traffic based on a destination table, the same as any other packet. If this were problematic in any way, your average backbone would implode; ECMP is used extensively to balance busy peers. Routers also already do redundancy (at least at L3) extremely robustly, so it's essentially a "free" way to load balance your load balancers. You're simply not going to get the same level of performance out of an LVS/DR solution, since it's competing with very mature implementations done in silicon. We'll have to agree to disagree here.

Of course all paths are equal in ECMP - I don't see that as a downside, though. Most router vendors do support ECMP weights if you really need them, but there are better ways to architect things. I've run this setup with over 1,500 Gbps of Internet-facing traffic and never saturated a 10G line, because it was engineered properly. An in-house app that reduces the entropy of my hashing inputs would probably require a different setup though, I agree.
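
For what it's worth, one common way those weights get implemented is next-hop replication in a fixed-size bucket table - a rough sketch, not any particular vendor's scheme:

    import hashlib

    def build_buckets(weighted_hops, table_size=64):
        # Fill a fixed-size bucket table with next hops in proportion
        # to their weights; flows then hash into buckets as usual.
        total = sum(w for _, w in weighted_hops)
        buckets = []
        for hop, weight in weighted_hops:
            buckets += [hop] * round(table_size * weight / total)
        return buckets

    buckets = build_buckets([("10.0.0.11", 2), ("10.0.0.12", 1), ("10.0.0.13", 1)])
    flow = int.from_bytes(hashlib.sha256(b"some-5-tuple").digest()[:8], "big")
    print(buckets[flow % len(buckets)])  # .11 carries ~half the flows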

16 ECMP paths is a decent number, but most routers I work with these days support 32, and some now support 64. That's almost irrelevant, though, unless you're stuffing all your load balancers onto a single switch. The limit is per-device: you have 8 load balancers connected (and peering via BGP) to one switch, 8 to another, and so on. Those switches then advertise the routes up to the router(s), which ECMP from there (across up to 16/32 downstream switches per VIP). I've never needed more than two levels of this, so I haven't really played with a sane configuration for more than 1024 load balancers behind a single VIP (or 512 in your 16-way case); the fan-out math is sketched below. It scales further than perhaps a dozen companies in the world would ever need. This explanation may sound complicated, but in a well-engineered network (i.e., not one giant L2 broadcast domain spanning the entire DC) it just happens, without you specifically configuring for it.
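
To make the fan-out concrete:

    # Two-level ECMP fan-out: the router ECMPs across up to R downstream
    # switches, each switch ECMPs across up to S attached load balancers,
    # so a single VIP can front R * S boxes.
    for router_ways, switch_ways in ((16, 32), (32, 32)):
        print(f"{router_ways}-way router x {switch_ways}-way switches = "
              f"{router_ways * switch_ways} load balancers per VIP")
    # -> 512 and 1024, the numbers above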

Since my knowledge is dated: how do you "stop accepting new connections" in the LVS/DR model? I'm sure you can; I just can't mentally model it at the moment. You need the VIP bound to the host in question for current connections to complete, so how do you simultaneously re-route new connections to a different physical piece of gear using the same VIP?

There are certainly downsides to this model as well; I don't want to pretend it's the ultimate solution. But it's generally leaps and bounds better than any vendor trying to sell you a few million dollars of gear to do the same job. The biggest downside to ECMP-based load balancing is hash redistribution when a load balancer enters or leaves the pool, which breaks every session that gets re-hashed to a different box. I know some router vendors support persistent hashing, but my use case didn't make this a huge problem. There are ways to mitigate it as well, of course, but they get complicated.
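
To put a number on that redistribution pain, here's a hypothetical sketch using naive modulo hashing, which is the worst case:

    import hashlib

    def bucket(flow, pool):
        h = int.from_bytes(hashlib.sha256(flow).digest()[:8], "big")
        return pool[h % len(pool)]

    pool = [f"lb{i}" for i in range(8)]
    flows = [f"flow-{i}".encode() for i in range(10_000)]

    before = {f: bucket(f, pool) for f in flows}
    after = {f: bucket(f, pool[:-1]) for f in flows}  # one balancer leaves

    moved = sum(before[f] != after[f] for f in flows)
    print(f"{moved / len(flows):.0%} of flows re-hashed")
    # ~88% of sessions land on a different box (ideal would be 1/8 =
    # 12.5%), which is exactly what persistent hashing tries to avoid.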

In the end, for the scale you can achieve with this, the simplicity is absolutely wonderful. It's one of those implementations you look at when you're done and say "this is beautiful," because there's no horrible-to-troubleshoot ARP spoofing or other fuckery on the network to make it work. ECMP+BGP is what you get: you can traceroute, look at route tables, etc., and they reflect reality with no room for confusion. No STP debugging to be found anywhere :)



