
I've used Metaflow for the past 4 years or so on different ML teams. It's really great!

It's straightforward for data/ML scientists to pick up, has a familiar Python class API for defining DAGs, and simplifies scaling out parallel jobs on AWS Batch (or k8s). The UI is pretty nice. Been happy to see the active development on it too.

Currently using it at our small biotech startup to run thousands of protein engineering computations (including models like RFDiffusion, ProteinMPNN, boltz, AlphaFold, ESM, etc.).

Data engineering focused DAG tools like Airflow are awkward for doing these kinds of ML computations, where we don't need the complexity of schedules, etc. Metaflow, imho, is also a step up from orchestration tools that were born out of bioinformatics groups, like Snakemake or Nextflow.

Just a satisfied customer of Metaflow here. thx



If you've tried, has it been clunky to run non-Python workflows? I.e., if you want to run bedtools or diamond without having to write a bunch of subprocess.run calls?


Right, for most of our workflows we stay in Python land, which is great and seamless since Metaflow itself is Python. But yes, there are occasions when we have to make a system call to run an old R script or even a compiled C++ executable :shrug: (Metaflow does have some native R support tho). I have not had to use the specific tools you called out, bedtools or diamond.

Most of the time this is not a blocking problem, since each step in a flow is mapped to a Docker image and/or your choice of EC2 instance (e.g. one step on a GPU, another on a memory-optimized instance). You can have one step use an image with all of your Python-based ML stuff, and another step use a different image with compiled executables that are triggered by a system call. If needed, outputs from such a system call would then need to be persisted in a database/S3 or read back into the Python flow for persistence. So it is not as seamless as a flow in all Python, but it can work "good enough".
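To make the "system call, then read the output back" part concrete, here is a rough sketch using only the standard library. The helper name run_tool is made up for illustration, and the Python interpreter stands in for a real binary like bedtools, so treat it as an assumption-laden example rather than how anyone's flow actually looks. Inside a Metaflow step you would call something like this and assign the result to an attribute on self so it gets stored as an artifact.

```python
import subprocess
import sys


def run_tool(cmd, cwd=None):
    """Run an external command and return its stdout as text.

    Hypothetical helper: check=True raises CalledProcessError on a
    non-zero exit code, so a failing tool fails the calling step
    instead of silently continuing.
    """
    result = subprocess.run(
        cmd,
        cwd=cwd,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


if __name__ == "__main__":
    # Stand-in for a compiled executable; swap in the real command line.
    output = run_tool([sys.executable, "-c", "print('chr1\\t100\\t200')"])
    # Parse the tab-separated line back into Python values.
    chrom, start, end = output.strip().split("\t")
    print(chrom, int(start), int(end))
```

Passing the command as a list (not a shell string) avoids quoting headaches with file paths, and anything larger than a bit of stdout is probably better written to a file or S3 by the tool and then loaded in the step.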



