Octobot - A Low-Latency, Highly Parallel Distributed Task Queue Worker

Sukotto · on Sept 20, 2010

This is a good case-study of what to do when building a home page for your project.

1) State what your project is about using language that a reasonably savvy (technology-wise) user can understand, even if unfamiliar with the problem domain. (this site would be better if they linked on "message queue" for those people that don't know what that is.)

2) Eye-catching call to action icons

3) Brief list of most important features

4) Brief high-level view

5) Brief low-level view

6) Some clean and minimalist stats on why you might want to use it

7) Easy to understand links to more info.

I don't know if I will ever need this product... but I'm bookmarking it anyway as an outstanding example of how to introduce people to it. (Except for the swear word. The word "Fucking" is really out of place)

Nice job guys.

superjared · on Sept 20, 2010

"Portland Fucking Oregon" is definitely a PDX thing. "We're too cool to not swear" or something. I dig it, but then I live here.

JoachimSchipper · on Sept 21, 2010

I thought their list of adjectives was too long; half of them boiled down to "performant", too, which is not too useful (of course you're trying to be performant; calling out throughput or latency specifically is useful because it helps people evaluate your work for their needs, but having both gives very little information...)

evgen · on Sept 20, 2010

The throughput numbers on the chart would have seemed a bit more valid if the task had not included stuffing the result into a protobuf given the notoriously slow performance of the standard python protobuf implementation that would have been used by Celery and PyInvoke. If they wanted to impress, or at least be honest, the data would have stayed in JSON the whole way through the stack and the worker task would have been a simple string manipulation of something similar.

cscotta · on Sept 20, 2010

Hi there,

The task benchmarked was from a component of our messaging stack at work that I'd ported. My intention was to offer an example of the end-to-end performance of a real-world task that someone might be writing, rather than a sample task that's not much more than a no-op. I'd suggest that this particular one is real-world as it's straight out of one of our applications. I've no intention of trying to be dishonest here - just offering the measurements of my (and our) internal evaluation of the tool for our needs.

While I can't provide the source of the task, I can offer the quick-and-dirty source of the quick little "PyInvoker" I'd whipped up (sorry - didn't realize there was something called PyInvoke at the time). It just takes a message, unpacks the JSON, and uses getattr to call the appropriate task. Nothing fancy like e-mail error notifications, retries, and the like: https://gist.github.com/18a30689832569d67861

Anyhow, always and absolutely take any claims regarding the performance of a tool with a massive grain of salt, and try them for yourself to see if they suit your needs. Octobot's not designed to replace background processing in most applications as tools like Celery and DelayedJob (both of which I use myself) are great, a bit easier to write for, and a bit simpler to get up and running, depending on the application and language being used.

There's no intention to have slighted anyone or any other project here. I'm stoked that a lot of tools exist in this space. I just hadn't seen one on the JVM that offered this level of simplicity and parallelism with a bit of restraint when it comes to feature creep. But if you have an application that demands high throughput / low latency execution of tasks, this might be worth evaluating.

asksol · on Sept 20, 2010

I'm interested in seing the code/config used to benchark Celery. The default settings are not at all optimized for processing lots of small jobs, and you could easily tweak it to get a 100x speed up for that use case, e.g.:

   CELERYD_PREFETCH_MULTIPLIER = 0
   CELERY_DISABLE_RATE_LIMITS = True

Also, channels are not re-used unless you explicitly pass the Publisher, so e.g.

   publisher = task.get_publisher()
   for i in xrange(1000):
       task.apply_async(args=(i, ), publisher=publisher)
   publisher.close()

is known to be a massive speed-up for sending tasks in batch (it seems the creation of channels is very expensive in pyamqplib).

asksol · on Sept 20, 2010

By the way the performance increase you're seeing with the PyInvoker (from your gist) is most likely because it doesn't have prefetch_count enabled.

Celery enables this so a single worker doesn't suck in a million messages at a time, and to balance the work load between available resources. As noted previously it can be disabled.

Btw, octobot looks great, maybe we can share ideas.

cscotta · on Sept 20, 2010

Right on, thanks Ask! I'm checking out some of this right now and might not be able to get through it all today, but will give it a try. Just shot you a couple messages outside of HN - love to talk when you have a chance!

stavros · on Sept 21, 2010

Does CELERYD_PREFETCH_MULTIPLIER do what I think it does (disable prefetching so workers get items as they need them)? I like Celery a lot, having used it for a few days, but the documentation is a bit frustrating :/

evgen · on Sept 20, 2010

No problem. I did not want to suggest that you were fudging numbers or anything, but it just appeared to be a strange selection of parameters for a comparison chart (and to be honest, whenever I see a comparison chart that shows X outperforming Y by more than 10X I tend to take a closer look at the details of what is being compared...)

As to the task selection, I do not think showing a "real-world" task help here, since we all have different tasks to be performed out in the real world. I want to know what overhead a particular piece of infrastructure adds, so something as close as possible to a no-op is a useful data point.

chrisduesing · on Sept 20, 2010

Am I the only one who got really excited upon seeing the domain was octobot.taco.cat, and then very confused when going to www.taco.cat? It seems to be an art gallery in Spanish or something...

I realize this is not germane to the topic, but .cat!? What other really interesting tld's exist that I have never heard about?

listic · on Sept 20, 2010

It really surprised me that they assigned three-letter TLD for Catalan, whereas all other countries have two-letter TLD's.

It happens to be not a country code (http://en.wikipedia.org/wiki/List_of_Internet_top-level_doma...), but the generic TLD, alongside the likes of .com and .org, "for Web sites in the Catalan language or related to Catalan culture", the only one of this kind in this category. Go figure.

Semiapies · on Sept 20, 2010

It's for the Catalan language:

http://en.wikipedia.org/wiki/.cat - http://en.wikipedia.org/wiki/Catalan_language

I think I need to take a look at the TLDs available, lately, because I know a lot have slipped under my radar.

moe · on Sept 20, 2010

Looks interesting at a glance.

But guys, seriously, make your docs available in HTML format. PDF-only docs are out of fashion since the 90s.

gmcquillan · on Sept 20, 2010

Fascinating. The thing I find interesting about this service is that it seems to work seamlessly with multiple queue backends. That would have been really useful at my company, where we completely swapped out our queue server infrastructure. Nice work!

johngalt · on Sept 20, 2010

Sorry to be the slow one here... What is a Distributed Task Queue Worker? What problem does it solve?

listic · on Sept 20, 2010

Yup. I'm just wrapping my head around the things like RabbitMQ and Redis and I think I understand what those are for. But can someone explain straight: in which case should I want to use this Octobot?

superjared · on Sept 20, 2010

Octobot (like Celery, Resque, et al) is a worker, meaning that it takes messages from a queue, such as Rabbit or Redis, and processes that message based on a task that you write. Imagine Octobot being used to create thumbnails for Flickr--a job that should be done asynchronously.

johngalt · on Sept 20, 2010

Do I have this correctly?

Between a list of actions, and the logic that runs those tasks is Octobot. So in your thumbnail example the problem that would be solved would be something along the lines of:

"I've got this code that can create a thumbnail from a given image and let me know if it succeeded or failed, but how do I run this on my backlog of 10million images? I'd need something that can check my list of incoming images and distribute the jobs over X number of computers. At the end it would be great to know my failure rate, and for those failures not to block my ongoing process of creating thumbnails."

So is Octobot there to provide a method of resource allocation? Or is it more of a monitoring app that checks pass/fail of the jobs?

skorgu · on Sept 21, 2010

To expand the image resizing example:

Most web gallery requests go like: (ignoring caching)

Browser -> PHP -> Database

When you upload an image you could simply handle it in-line in the server:

Browser -> PHP -> DB -> Resizer

That means the next page refresh is waiting on that resizer to finish which means long page latency to the most latency-sensitive component imaginable (the fickle user).

So you really want that resize to happen asynchronously, i.e. not wait for the result before showing the page. The roll-your-own method is to put a row in the DB that says "Hey I need to be resized" and have a cron job or somesuch that does the resizing:

Browser -> PHP -> DB

Resizer -> DB

This of course puts all the load on the hardest thing to scale (the DB) so you grow out of it fast. Hence message queues. RabbitMQ, ActiveMQ, Redis, etc are all variants of the queue, so now you have:

Browser -> PHP -> DB -> Queue

And the Queue holds all the resizing jobs that need to be done. You could just modify your cronjob to check the Queue instead of the DB of course.

Octobot (and Celery) is a queue runner that connects to that Queue, reads in the jobs that need to be done and runs them. So instead of a cron job you write your resizer in a way that your queue runner understands and the runner will manage some of the plumbing for you.

So you have

Browser -> PHP -> DB -> Queue

Queue -> Octobot -> Resizer ( -> DB to say it's done perhaps.)

Now that you're decoupled you can add more resizers, webservers, distribute them across multiple systems, expand to EC2 to handle overflow load, etc by leveraging what your queue provides.

stavros · on Sept 21, 2010

For an example of a perfect fit for a distributed worker, see my blog post:

http://blog.historio.us/asynchronous-processing-using-celery

jnoller · on Sept 20, 2010

Interesting; I'm looking forward to comparing/contrasting this with my current favorite - Celery (http://celeryproject.org/) which supports a variety of backends.

metabrew · on Sept 20, 2010

I like that it has the capability to be an email queue build in, by supporting SMTP/SSL. Worth installing just for that, since that's the first thing lots of websites want a message queue for.

DEinspanjer · on Sept 20, 2010

Very interesting. We're currently working on a quick project to test integrating Hazelcast into a distributed server to use as a queue. Did you look at it by any chance? Curious if it was missing something that you needed.

cnlwsu · on Sept 20, 2010

From my personal experience with these, Hazelcast seems to be better suited for something like a single worker queue... We had issues when we had thousands of distributed queues on different systems. We ended up going with Terracotta (which handles this well if you work through the lock contention).

This project sounds promising for that kind of application but wont know until I try it out.

DEinspanjer · on Sept 20, 2010

Sorry for my confusion, so is the problem you experienced one of having thousands of queues that a small number of clients pulled from, or thousands of workers that worked off of a small number of queues?

In our case, we want to be having a very small number of queues that have a high rate of i/o with dozens of reader clients and writer clients.

bananaandapple · on Sept 20, 2010

Could you eloborate where you did have problems? We also plan on using a small number of queues having a high rate of readers and clients (e.g. distributed task execution).

cnlwsu · on Sept 21, 2010

We have a large number of queues with small number of workers.

waratuman · on Sept 20, 2010

Interesting, but won't ever use. It also seems that it is just wrapping typical queueing systems, which I would use anyways because this only supports the JVM.

superjared · on Sept 20, 2010

This is a worker system that uses queues, not a queue unto itself.

swah · on Sept 20, 2010

Surprised to see it wasn't written in Scala, since Scala shows up before in their list of supported languages and Scala code example also comes first.

dataguy · on Sept 20, 2010

Sounds like a nice tool. Will definitely give it a try for high work-load statistical data processing.

there · on Sept 20, 2010

did anyone else have to take a second look at that domain name?

rubashov · on Sept 20, 2010

tako is japanese for octopus...

there · on Sept 21, 2010

i was more interested in the .cat tld