Hacker News

It wasn't that there was a crunch — that had always existed. There just wasn't all the tooling to implement anything like flex. At least this video was made after "buying Borg quota" was a normal thing. Before it, you had to "buy" regular machines and donate/assimilate them into Borg. Then after X days you'd receive your quota, minus a Borg "tax" of 10% to cover borglet and system daemons' overhead.


And then they got Steamrolled away if you didn't use them enough.


Hah, I think that at least at the beginning, the tools wanted to steamroll the Bigtable service, too. As the one that had to vet the changes every week, I had to go explain that it was really neither our quota nor our usage. For Colossus, sometimes large new clusters would go from very idle to super busy and underprovisioned within days or a few weeks, AFTER the steamrolling was already done, just because large users moved in. We then had to bring up new curators (masters) quickly, many times with quota that didn't exist. Quite often, when out of options, I ended up taking precious p360 quota from the storage monitoring user to mint production resources for Colossus. Fun times.


So before Borgmon existed, the quota unit was basically an entire machine? No virtualization?

Oh wait, this was probably 2008, VMware had only just figured out JITed software virtualization a couple years prior. Makes a bit more sense now. And now the containerization thing makes a tad more sense: Google basically skipped over the "use QEMU" (or more recently, Firecracker) phase everyone else is now going through.


Borgmon and Borg are different things, but yes, before the latter, machines were owned by teams. Brian Grant talks a bit about the predecessors, Babysitter and GWQ: https://softwareengineeringdaily.com/2018/04/27/google-clust...

Borg was started in 2003 and by 2006-2007 was already the default way to run things, even though isolation wasn't perfect then. I think it started with chroot jails, then fake NUMA, then cgroups, which were written for it. It took years before all of web search moved to dedicated Borg machines (their quota belonged to you and nobody else could run on them) and, eventually, shared ones.


Fun fact: up this thread, you replied to a comment by one of the original cgroups/Borg creators.


Ah, see, should've given 'em to us! Big rectangular state, you know, #3 machine owner at the time. We didn't charge overhead, probably because it never occurred to us to do it.

People brought machines, we gave 'em quota. Easy enough.

So glad to not be doing that any more.


I know the service! The folks in NY had the state flag by their cubicle. My team in 2007-2011 was probably one of the largest users and I think I donated machines to y'all in the old Groningen cluster, before quotas were automated. GFS didn't charge for chunkserver overhead, either, and that mistake took years and lots of pain to fix...



