Even medium tasks, the slow step is rarely computations, but the squishy human entering them.
When I work on a decent 50GB+ dataset, I have to do something fairly naive before I get frustrated at computation time.
Edit: pandas is now Boring Technology (not a criticism). It is a solid default choice. In contrast, we are still in a Cambrian explosion of NeoPandas wannabes. I have no idea who will win, but there is a lot of fragmentation which makes it difficult me for to jump head first into one of these alternatives. Most of them are faster or less RAM pressure which is really low on my list of problems.
When I work on a decent 50GB+ dataset, I have to do something fairly naive before I get frustrated at computation time.
Edit: pandas is now Boring Technology (not a criticism). It is a solid default choice. In contrast, we are still in a Cambrian explosion of NeoPandas wannabes. I have no idea who will win, but there is a lot of fragmentation which makes it difficult me for to jump head first into one of these alternatives. Most of them are faster or less RAM pressure which is really low on my list of problems.