
In my experience, Polars streaming runs out of memory at much smaller scales than both DuckDB and DataFusion and tends to use much more memory for the same workload when it doesn't outright segfault.

Polars is faster than those two below a few GB of data, but beyond that you're better off with DuckDB or DataFusion.

I would love for this to improve in Polars, and I'm sure it will!



Do you mean segfault or OOM? I am not aware of Polars segfaulting on high memory pressure.

If it does segfault, would you mind opening an issue?

Some context: Polars is building a new streaming engine that will eventually be able to run the whole Polars API (including the hard stuff) in a streaming fashion. We expect the initial release at the end of this year or early next year.

Our in-memory engine isn't designed for out-of-core processing, so if you benchmark it with restricted RAM it will perform poorly as data is swapped, or it will go OOM. If you have a machine with enough RAM, Polars is very competitive in performance, and in our experience it is tough to beat on time-series and window functions.


Segmentation violations can result from several different underlying problems, one of which can be running out of memory.

We (the Ibis team) have opened related issues and the usual response is to not use streaming until it's ready, or to fix the problem if it can be fixed.

Not sure what else there is to do; it seems like things are working as expected/intended for the moment!

We'll definitely be the first to try out any improvements to the streaming engine.


They have different implications for us. An abort due to OOM isn't a bug in our program, whereas a segfault is a serious bug we want to fix.
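
For anyone trying to tell the two cases apart before filing an issue, here is a rough sketch that runs a job in a child process and labels how it ended. It assumes POSIX semantics, where Python's subprocess module reports death by signal as a negative return code. The names classify_exit and run_and_classify are made up for this example and are not part of Polars or any other library. A SIGKILL may come from the Linux OOM killer, but it can have other causes too, so treat that label as a hint to go check the system logs.

```python
import signal
import subprocess
import sys
from typing import Sequence


def classify_exit(returncode: int) -> str:
    """Give a rough label to a child process return code (POSIX assumed)."""
    if returncode == 0:
        return "ok"
    if returncode == -signal.SIGSEGV:
        # Invalid memory access: the kind of crash worth reporting as a bug.
        return "segfault"
    if returncode == -signal.SIGABRT:
        # The process aborted itself, e.g. a runtime giving up after a failed allocation.
        return "abort"
    if returncode == -signal.SIGKILL:
        # Killed from outside; possibly the kernel OOM killer (an assumption to verify).
        return "killed"
    return f"error (exit status {returncode})"


def run_and_classify(argv: Sequence[str]) -> str:
    """Run a command to completion and classify how it ended."""
    return classify_exit(subprocess.run(list(argv)).returncode)


if __name__ == "__main__":
    # Example: a child interpreter that exits cleanly.
    print(run_and_classify([sys.executable, "-c", "pass"]))
```

Running the actual workload as the child (for example a script that executes the query) keeps the parent alive to record the result even when the child dies.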


When you say "our in-memory engine", are you talking about the DataFrame or the LazyFrame?


A DataFrame is our in-memory table. A LazyFrame is a compute plan that can have DataFrames as sources.

The engine is what executes our plans and materializes a result. There is more than one engine, as we are building a new one.
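
To make the plan/engine split concrete, here is a toy model in plain Python. It is not the Polars API or how Polars is implemented. Plan, in_memory_engine, and streaming_engine are hypothetical names invented for this sketch. The point is only that one plan can be handed to different engines: one materializes the whole source first, the other pulls rows in small batches so that only one batch is held at a time.

```python
from dataclasses import dataclass
from itertools import islice
from typing import Callable, Dict, Iterable

Row = Dict[str, int]


@dataclass(frozen=True)
class Plan:
    """Toy compute plan: filter rows from a source, then sum one column."""
    source: Callable[[], Iterable[Row]]  # produces rows on demand
    predicate: Callable[[Row], bool]     # which rows to keep
    column: str                          # column to sum over the kept rows


def in_memory_engine(plan: Plan) -> int:
    """Materialize every source row, then filter and aggregate."""
    rows = list(plan.source())
    kept = [row for row in rows if plan.predicate(row)]
    return sum(row[plan.column] for row in kept)


def streaming_engine(plan: Plan, batch_size: int = 2) -> int:
    """Execute the same plan batch by batch, holding one batch at a time."""
    rows = iter(plan.source())
    total = 0
    while True:
        batch = list(islice(rows, batch_size))
        if not batch:
            return total
        total += sum(row[plan.column] for row in batch if plan.predicate(row))


# The in-memory table that serves as the plan's source.
table = [{"x": 1}, {"x": 5}, {"x": 8}, {"x": 3}, {"x": 10}]
plan = Plan(source=lambda: table, predicate=lambda row: row["x"] > 4, column="x")

# Both engines produce the same result from the same plan.
assert in_memory_engine(plan) == streaming_engine(plan) == 23
```

Nothing about the plan changes between the two calls; only the execution strategy does, which is why the result is the same while the peak memory can differ.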


My understanding is that the Polars team is working on a new streaming engine. It looks like you will get your wish.



