Hacker News
Cog: Containers for Machine Learning (github.com/replicate)
160 points by fdalvi on April 21, 2022 | 14 comments



Hello HN! One of the creators of Cog here.

We built this to deploy models to Replicate (https://replicate.com/), but it can also be used to deploy models to your own infra.

Andreas, my co-founder, used to work at Spotify. Spotify wanted to run models inside Docker containers, but Docker was too hard to use for most ML researchers. So, Andreas built a set of templates and scripts to help researchers deploy their own models.

This was mixed in with my experience working at Docker. I created Docker Compose, which makes Docker easier to use for dev environments. We were also joined by Zeke, who created Swagger (now OpenAPI), which is used to define a model’s inputs/outputs. Dominic and some other contributors have since joined! https://github.com/replicate/cog#contributors-

It’s still early days, so expect a few rough edges, but it’s ready to use for deploying models. We’d love to hear what you think.


This, right here, is going to save time on the order of (wo)man-lifetimes!

> No more CUDA hell. Cog knows which CUDA/cuDNN/PyTorch/Tensorflow/Python combos are compatible and will set it all up correctly for you.

I've personally spent more than a week of my life, in total, sorting this out. It's high time someone stopped the madness!
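For reference, the declarative config Cog builds from looks roughly like this (adapted from the project's README; the version pins here are illustrative, not a recommendation):

```yaml
# cog.yaml — Cog picks a compatible CUDA/cuDNN base image from these pins
build:
  gpu: true
  python_version: "3.8"
  python_packages:
    - "torch==1.8.1"
predict: "predict.py:Predictor"
```

The point is that you declare only the framework version; the matching CUDA/cuDNN combination is resolved for you.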


No CUDA hell if you're using PyTorch. A single pip or conda command installs everything you need.


It only does that by installing a bunch of pre-compiled shared libraries:

https://hpc.guix.info/blog/2021/09/whats-in-a-package/


Awesome, right?


My first reaction was: sigh, _another_ tool to help ML/DS folks not write a Dockerfile? Aren't there enough already?

But at a closer glance, Cog seems to have an edge on some of its competitors like Seldon or Bento – namely modern Python libraries (like Pydantic and FastAPI), CUDA/cuDNN/PyTorch/TensorFlow/Python compatibility resolution, and (probably most important to me) automatic queue workers.

It generated a 1 GB image with nothing but Python 3.8 in the config, though, so folks who really care about deployment size will want to keep writing their own container files.


How does this compare to BentoML (https://www.bentoml.com/)?


There is a fair bit of overlap.

Cog is optimized for getting a deep learning model inside a Docker image. We found that ML researchers struggled to use Docker, so we made that process easier. It generates a best-practice Dockerfile with all your dependencies, and resolves the CUDA versions automatically. It also includes a queue worker, which we found was the optimal way of deploying long-running/batch models at Spotify and Replicate.
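A hypothetical, stdlib-only sketch of the queue-worker pattern described above – requests are enqueued and a background thread runs the (slow) model one job at a time, so the accepting layer never blocks. All names here are invented for illustration and are not Cog's actual API:

```python
import queue
import threading


def slow_model(x):
    # Stand-in for a long-running model inference.
    return x * 2


class QueueWorker:
    """Run jobs from a queue on a background thread, one at a time."""

    def __init__(self, model):
        self.model = model
        self.jobs = queue.Queue()
        self.results = {}
        self.done = {}
        threading.Thread(target=self._run, daemon=True).start()

    def submit(self, job_id, payload):
        # Enqueue and return immediately; the caller polls for the result.
        self.done[job_id] = threading.Event()
        self.jobs.put((job_id, payload))

    def _run(self):
        while True:
            job_id, payload = self.jobs.get()
            self.results[job_id] = self.model(payload)
            self.done[job_id].set()

    def result(self, job_id, timeout=5):
        # Block until the background worker has finished this job.
        self.done[job_id].wait(timeout)
        return self.results[job_id]


worker = QueueWorker(slow_model)
worker.submit("job-1", 21)
print(worker.result("job-1"))  # → 42
```

The win over a plain request/response server is backpressure: jobs queue up instead of tying up one connection per long-running prediction.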

Bento is more flexible – the models can be used outside of Docker, and it has built-in support for many deployment targets, which Cog doesn't have yet.


Woah, perfect timing! I was just about to start writing a Dockerfile+FastAPI wrapper for our newest ML project – I'll try this out instead!


Me too! About to start a new project that EXACTLY fits the bill for this tool... It sure sounds promising!


How is this different from Pachyderm [1]?

[1] https://www.pachyderm.com/


Is there an easy way to create a similar dev environment for training on Linux that will take care of all the CUDA driver nonsense?


You can do this with Cog! Once you've written cog.yaml, you can run arbitrary commands inside an environment with CUDA set up correctly:

  $ cog run python train.py


This is similar to buildpacks, but maybe not as easy?



