A Python tutorial on Bayesian modeling techniques (github.com/markdregan)
218 points by gedrap on Nov 24, 2015 | 15 comments



I am the author of this tutorial. If you are interested in contributing a section to this tutorial, please get in touch. Some suggested topics: survival analysis, mixture models, classification, time series models... Twitter @markdregan


A few pedantic notes

It would be nice if random variables were denoted by capital letters to distinguish them from particular observations.

In the section about regression you should have the conditional mean of Y equal to \beta X, rather than the overall mean.
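To spell out the distinction, here is a sketch in LaTeX. It assumes the simple linear predictor \beta X mentioned above with no separate intercept term; the tutorial's exact equation may differ.

```latex
% Conditional mean: holds for each value x of the predictor
\mathrm{E}[Y \mid X = x] = \beta x

% Overall (marginal) mean: what remains after averaging over X
\mathrm{E}[Y] = \beta \, \mathrm{E}[X]
```

The second line follows from the first by the law of total expectation, which is why writing the overall mean as \beta X mixes up the two.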

Of course, this doesn't really matter too much since the substance of the tutorial is correct. I thought I'd mention it though because sloppy notation can undermine your credibility in some circles, and the equations are the most notable object when you're just skimming the page.

More (very) pedantic notes:

Usually, if you're going to express a sum with \cdots you put A+B+\cdots+Z (with plus signs on both sides of the ellipsis for clarity).

The probability operator is usually denoted with a capital P.

The log function shouldn't be italicized; you can just type \log to avoid this.

Distribution names in statements like y~Pois(\mu) aren't usually italicized.
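Put together, the notation suggestions above might look like this in LaTeX. This is only a sketch: \operatorname assumes the amsmath package is loaded, and the symbols y and \mu are taken from the Pois(\mu) example rather than from any specific equation in the tutorial.

```latex
% Ellipsis with an operator on both sides
A + B + \cdots + Z

% Capital P for probability, upright \log, upright distribution name
P(Y = y \mid \mu), \qquad \log \mu, \qquad y \sim \operatorname{Pois}(\mu)
```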


Looks pretty nice, good job. I think survival analysis is a very underrated tool. I'm working in UX now and there are a lot of test setups where survival analysis makes a lot of sense but isn't used (mostly because people don't know it).
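As a rough illustration of the kind of setup meant here, below is a minimal Kaplan-Meier sketch in plain Python. Everything in it is hypothetical: the `kaplan_meier` helper, the numbers, and the framing (seconds until a participant completes a task, with sessions that end before completion treated as right-censored). It is not taken from the tutorial.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(t, S(t))] at each time with at least one observed event."""
    # Count observed events per time point (censored records are excluded).
    events = Counter(t for t, o in zip(durations, observed) if o)
    # Events and censorings both remove a participant from the risk set.
    exits = Counter(durations)
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction that "survives" time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical usability test: seconds until each participant completed a task.
# observed=False means the session ended before completion (right-censored).
durations = [10, 20, 20, 30, 40]
observed = [True, True, False, True, False]

curve = kaplan_meier(durations, observed)
for t, s in curve:
    print(t, round(s, 3))
```

Here "survival" at time t is the estimated probability that a participant has not yet completed the task. The point of the method is that the censored sessions still count toward the risk set up to the moment they end, instead of being dropped or treated as completions.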


Would you like to expand on that?


Just wanted to say thanks a lot for taking the time to write it! :)


My pleasure. I learned a lot in writing it.


Are you @markdregan or @thinkvein? The @markdregan seems to indicate @thinkvein is you now?


Both are me. @markdregan is my choice handle. @thinkvein was for a blog I used to write.


It's a good initiative!

However, I think the introduction could be improved by briefly describing the "why/what" of Bayesian modeling before you get into the first Hangouts example.


Good idea. It does jump in pretty quickly. I'll update this during the next revision.


Great initiative!

I have a few suggestions. Maybe I missed it, but a prerequisites section would be useful, covering both the required knowledge and the platform, software, etc.

I am new to Python and believe this tutorial would be great for me. However, for novice users like me, a lot of time is spent getting the environment right rather than understanding the code.

For example, after downloading and installing Anaconda, Jupyter and seaborn, I stumbled on the error message "C:\Anaconda3\lib\site-packages\ipykernel\__main__.py:89: FutureWarning: sort(columns=....) is deprecated, use sort_values(by=.....)"

And here I am stuck. My next step, had it not been for this post, would be to investigate syntax changes in Python.

Maybe that's not the correct way to address the problem, but that is mainly my point. If the tutorial is targeted at beginners like me, a few more pointers on common errors when setting up the environment would be helpful!
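For anyone hitting the same message: it is a pandas FutureWarning rather than a Python syntax change, and the replacement is named in the warning itself. A minimal sketch follows, with a made-up DataFrame (the column names are illustrative and not taken from the tutorial's notebooks):

```python
import pandas as pd

# Hypothetical data; the column names are illustrative only.
df = pd.DataFrame({"sender": ["a", "b", "c"], "n_messages": [5, 2, 9]})

# Deprecated spelling that triggers the FutureWarning
# (and that was removed in later pandas releases):
#     df.sort(columns="n_messages")

# Replacement named in the warning message:
sorted_df = df.sort_values(by="n_messages")
print(sorted_df["n_messages"].tolist())
```

A FutureWarning does not stop execution, so on a pandas version that still accepts `sort(columns=...)` the notebook cell should still produce its result.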

Thank you for an otherwise great tutorial and keep up the good work!


This is great work. I don't have any substantive comments (need to read it in depth for that). I did miss having "next" links, though - not sure if there is a Jupyter-native way to do that.

I like the matplotlib style created for this too.


Thank you. Next links would be nice and would keep the user within the nbviewer mode (which formats the notebooks correctly). I will add these.


Nice work, thanks for taking the time to put this together. Bookmarked.


I don't have any Google Hangouts chat messages to run the first example in Jupyter. I know that you are not going to share your data, but it would be handy if some fake conversations could be included. People like me like to install the applications first and then run them to see whether they work as claimed. I installed the conda distribution and the Jupyter notebook works correctly. (I installed conda on Ubuntu and then seaborn, PyMC3 and pandas; PyMC3 and seaborn with pip, since conda installs version 2.3 of PyMC3.) It works.

I should say that the first step is to clone:

cd where_you_want_the_data_to_be_copied
git clone ....

# and now start jupyter notebook with

jupyter notebook

# go to File/open/ and select the first section.

I see that I can edit the markdown. I translated the introduction to section 0, here it goes. Thanks for this tutorial. The graphics are nice.

### Section 0: Introduction

Welcome to "Bayesian Modelling in Python" - a tutorial for people interested in Bayesian statistical techniques with Python. The list of tutorial sections can be found on the project [homepage](https://github.com/markdregan/Hangout-with-PyMC3).

Statistics is a subject I never liked during my university years. The frequentist techniques we were taught (p-values, etc.) seemed contrived, and ultimately I turned my back on a subject I was not interested in.

That changed when I discovered Bayesian statistics - a branch of statistics quite different from the frequentist statistics usually taught at most universities. My learning was inspired by numerous publications, blogs and videos. To those starting out in Bayesian statistics, I would strongly recommend the following:

- [Doing Bayesian Data Analysis](http://www.amazon.com/Doing-Bayesian-Analysis-Second-Edition...) by John Kruschke
- [Python port](https://github.com/aloctavodia/Doing_Bayesian_data_analysis) of John Kruschke's examples by Osvaldo Martin
- [Bayesian Methods for Hackers](https://github.com/CamDavidsonPilon/Probabilistic-Programmin...) was a great source of inspiration for me in learning Bayesian statistics. In recognition of the great influence it had on me, I have adopted the same visual style used in BMH.
- [While My MCMC Gently Samples](http://twiecki.github.io/), blog by Thomas Wiecki
- [Healthy Algorithms](http://healthyalgorithms.com/tag/pymc/), blog by Abraham Flaxman
- [Scipy Tutorial 2014](https://github.com/fonnesbeck/scipy2014_tutorial) by Chris Fonnesbeck

I created this tutorial in the hope that others will find it useful and that it will help them learn Bayesian techniques in the same way these resources helped me. Any input from the community (corrections, comments, contributions) is welcome.



