
This is extremely counterproductive, because I want to keep the results you're calling "stale" as a reference or inspiration. I don't want to destroy old results just because I changed some parameter value to test an idea.


It improves reproducibility, consistency, and sharing, but reduces convenience for some operations. It's a trade-off in favor of programming in the large.

If you don't want to recompute dependent nodes, then use new names for your experiments rather than redefining old functions and variables. Yes, in some ways this is less convenient for you, but it's more convenient for people receiving your notebooks, because the notebook is always in a consistent, reproducible state.

Maybe it doesn't work well for your workflow, particularly if you're not sharing notebooks and you keep your notebooks small. On the other hand, if your workflow regularly leaves notebooks in an inconsistent state, switching may save you significant frustration with larger notebooks, and keep you from losing work when you lose track of the inconsistencies you were tracking in your head.

Also, if you hit a state that you really don't want to lose, you should probably do a quick git commit. You can always squash commits later if needed.

It might be worth changing your workflow, or it might not.


I think this is the interesting point though. Many people want to use Jupyter notebooks so that their work looks reproducible, not to make it actually reproducible. God forbid it actually has to be rerun, it could have different results!

I think that's my main notebook gripe: they make it look like if you run the code you'll get these results, but that's not even close to the case. Many people abuse this. At this point, I pretty much assume anything in a Jupyter notebook isn't reproducible.


Yes. A Jupyter notebook is only reproducible in my opinion if you can hit "Restart Kernel and execute all cells" and get the same result.

Otherwise, it should never be shared with other people, or even contain relevant analysis you may need for yourself later.

But this is not enough: the library dependencies also need to be fixed (pinned). Pluto will make this very easy in the near future: https://github.com/fonsp/Pluto.jl/pull/844
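
For what it's worth, the "restart and run all" check can be roughly approximated in plain Python. This is only a sketch under assumptions: `run_all` and `is_reproducible` are hypothetical helper names, cells are modeled as source strings rather than a real notebook file, and it says nothing about pinned dependencies.

```python
def run_all(cells):
    """Run cell sources top to bottom in a fresh namespace,
    roughly what a restarted kernel followed by "run all" does."""
    namespace = {}
    for source in cells:
        exec(source, namespace)
    # exec() injects __builtins__; drop it so only the cells' names remain.
    return {name: value for name, value in namespace.items()
            if name != "__builtins__"}


def is_reproducible(cells):
    """Two independent fresh runs must end in equal state.
    Only meaningful for plain values: functions defined inside a cell
    are distinct objects on each run and would compare unequal."""
    return run_all(cells) == run_all(cells)


if __name__ == "__main__":
    print(is_reproducible(["x = 2", "y = x * 21"]))
    print(is_reproducible(["import os", "token = os.urandom(16)"]))
```

The second call draws fresh random bytes on every run, so it is the kind of notebook that looks fine on screen but would not give the same result after a restart.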


If you’re so attached to that data, you should probably do something to save it other than let it sit in RAM or maybe an old plot in a random notebook.


I'm not sure what exactly you're trying to do: harangue me into using this half-baked notebook replacement, or just tell me how to do my job.


Then instead of reassigning new data to foo, just assign new data to foo2. You can still use the notebook to experiment; all you are doing is removing ambiguity.
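
A minimal sketch of the difference, in plain Python standing in for a non-reactive notebook (the function names and the `total` variable are made up for illustration; `foo` and `foo2` are the names from the comment):

```python
def run_with_reassignment():
    """Reuse the name `foo`: the derived value silently goes stale."""
    foo = [1, 2, 3]
    total = sum(foo)        # computed from the first foo
    foo = [10, 20, 30]      # experiment: reassign foo, total is not rerun
    # What the notebook still shows vs. what the code would now compute.
    return total, sum(foo)


def run_with_new_name():
    """Use `foo2` for the experiment: both results stay consistent."""
    foo = [1, 2, 3]
    total = sum(foo)
    foo2 = [10, 20, 30]     # experiment lives under its own name
    total2 = sum(foo2)
    return (total, sum(foo)), (total2, sum(foo2))


if __name__ == "__main__":
    print(run_with_reassignment())
    print(run_with_new_name())
```

In the first version the old result survives, but it no longer matches the code that appears to have produced it. In the second, the old result is kept as a reference and nothing on screen is ambiguous.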



