We built this to deploy models to Replicate (https://replicate.com/), but it can also be used to deploy models to your own infra.
Andreas, my co-founder, used to work at Spotify. Spotify wanted to run models inside Docker containers, but Docker was too hard to use for most ML researchers. So, Andreas built a set of templates and scripts to help researchers deploy their own models.
This was mixed in with my experience working at Docker. I created Docker Compose, which makes Docker easier to use for dev environments. We were also joined by Zeke, who created Swagger (now OpenAPI), which is used to define a model’s inputs/outputs. Dominic and some other contributors have since joined! https://github.com/replicate/cog#contributors-
It’s still early days, so expect a few rough edges, but it’s ready to use for deploying models. We’d love to hear what you think.
My first reaction was: sigh _another_ tool to help ML/DS folk not write a Dockerfile? Aren't there enough already?
But on closer inspection, Cog seems to have an edge on some of the competitors like Seldon or Bento – namely its use of modern Python libraries (like Pydantic and FastAPI), its handling of CUDA/cuDNN/PyTorch/TensorFlow/Python compatibility, and (probably most important to me) automatic queue workers.
It generated a 1 GB image with nothing but Python 3.8 in the config, so folks who really care about deployment size will want to keep writing their own Dockerfiles.
Cog is optimized for getting a deep learning model inside a Docker image. We found that ML researchers struggled to use Docker, so we made that process easier. It generates a best-practice Dockerfile with all your dependencies, and resolves the CUDA versions automatically. It also includes a queue worker, which we found was the optimal way of deploying long-running/batch models at Spotify and Replicate.
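To give a sense of what that looks like in practice, here's a rough sketch of a Cog config. The package versions and keys are illustrative, not exact – check the Cog repo for the current schema:

```yaml
# cog.yaml — declares the environment; Cog generates the Dockerfile from this
build:
  gpu: true                # Cog resolves compatible CUDA/cuDNN base images for you
  python_version: "3.8"
  python_packages:
    - "torch==1.8.1"
    - "torchvision==0.9.1"
# points at a Python class that defines the model's inputs/outputs
predict: "predict.py:Predictor"
```

Running `cog build` then produces a Docker image containing the model, its resolved dependencies, and an HTTP prediction server – no hand-written Dockerfile needed.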
Bento is more flexible – the models can be used outside of Docker, and it has built-in support for deploying to lots of deployment environments, which Cog doesn't have yet.