Markov Chains for programmers (czekster.github.io)
217 points by raister on April 1, 2022 | hide | past | favorite | 29 comments



First thought upon reading the title: "Surely you could pay them to stay."


I have used Markov chains for a few fun projects, like this Markov chain headline generator (https://locserendipity.com/Markov_Headlines.html), and a Markov generator based on a certain someone's Twitter feed (https://locserendipity.com/Markov_Trump.html), but this looks like a good resource for more serious applications.
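For anyone curious how generators like those headline toys work, here is a minimal sketch (my own illustration, not code from either project): build a table mapping each run of `order` consecutive words to the words observed after it, then walk the table randomly.

```python
import random
from collections import defaultdict

def build_chain(text, order=2):
    """Map each tuple of `order` consecutive words to the words seen after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return chain

def generate(chain, length=12, seed=0):
    """Random walk over the chain: each next word depends only on the current state."""
    rng = random.Random(seed)
    state = rng.choice(list(chain))
    out = list(state)
    for _ in range(length):
        followers = chain.get(state)
        if not followers:
            break  # dead end: this state never appeared mid-corpus
        nxt = rng.choice(followers)
        out.append(nxt)
        state = state[1:] + (nxt,)
    return " ".join(out)
```

Keeping duplicate followers in the lists (rather than deduplicating) is what makes sampling proportional to observed frequency.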


https://github.com/coleifer/irc/blob/master/bots/redisbot.py

This is great fun. Once it has learned a little about you and your friends it sometimes spits out a lyrical home-run. It is of course you who provide the intelligence of interpretation, but it still feels mysterious when it 'talks' in a way relevant to the context.


- "Markov Chains for programmers." Cool!

- Opens PDF.

- Typeset in Computer Modern.

- Starts running, screaming in Comic Sans.

Jokes aside, CM is not the only game for math-heavy documents. Something like Libertinus [1] would probably be more screen-friendly.

[1] https://github.com/alerque/libertinus


This has been downvoted, but I think selecting more screen-friendly fonts is a valid concern nowadays. Personally I would also like to see a reflowable format (which I guess would mean HTML with MathJax).


Use LaTeXML; it's what's running under the hood of ar5iv (https://ar5iv.labs.arxiv.org/) and should be able to compile any LaTeX document to HTML.


What's wrong with computer modern?


The author has uploaded a version of the book in the Libertinus font - I guess he heard your complaints!


This is seriously the most insightful comment you can come up with? If it is, you would do well to consider silence.


Only the frontispiece uses a different font (can't say for sure whether it is Comic Sans though) - the rest is pure Times New Roman.


Computer Modern is the best looking font


With Comic Sans a close second, surely.


Interesting find - really introductory examples. Good read, recommend it.


PyMC3 is pretty good. https://docs.pymc.io/en/v3/


It's for programmers, but comes with Matlab code and Excel sheets.


Nothing wrong with either of those. Also, if you take the time to check out the book or its GitHub repo, then you will see that there is also C code and C "challenges" (projects) for the reader to go through.


Right, but it is a pretty unusual combination and 1.5 of these languages are proprietary.


The solutions in proprietary formats are for demonstration only - you can refer to the C code at all times. And they could easily be ported to GNU Octave and LibreOffice Calc without any issues.


There is typically a large gap between code written in Matlab and stuff you can use in production written in C; this is the case with DSP code, for example. So it is a red flag indicating the author is an academic and lacks actual experience writing production code.


I take it you've never worked on radio systems, radar systems, satellites, or avionics then, or with engineers working on new system designs and modeling. (And those are just the areas where I've used it or seen it used by non-academics in my own experience, not at all a comprehensive list.) Matlab has lots of non-academic uses, and when someone claims it's only for academics, it's usually a sign that they have never used it, or only saw it in their freshman "programming for engineers" class or maybe numerical methods.


I've had to use Matlab in undergrad and grad courses. Yes, it's useful because it has tools for filter design, for plotting, etc. What tends to happen, though, is that a person works on some prototype in Matlab without any regard to actual implementation in practice and then throws it over the wall, and then the person writing the actual firmware has to re-invent half of what has been done in Matlab so it works within the constraints of the hardware.


There are just so many of these fun AI-related concepts that seem really cool and give you the thrill that they will take over the world some day.

Decades pass and you realize they either have little to no application or are incredibly niche :(

Too bad that "solution in search of a problem" is generally a bad approach to problem-solving. I wish our industry were more fun as a whole.


Most of the time, these things are resource hogs arriving way before their time to shine, either needing Moore's law to let the hardware catch up, or some nerd to wrestle with the combinatorial explosion and win. Transformers can be seen as a variation on Markov chains, but the innovation of attention mechanisms means you can use vocabularies of hundreds of thousands of tokens and sequences thousands of tokens long without the problem space going all Buzz Lightyear on you.

https://www.zabaware.com/ultrahal/

Ultra Hal was a best-in-class chat bot back when fixed-response systems like Alice/AIML were the standard. Ultra Hal used Markov chains and some clever pruning, but it dealt with a few hundred tokens as words and looked only 2 or 3 tokens out in a sequence. It occasionally produced novel and relevant output, like a really shitty GPT-2.

I think we may see a resurgence of expert systems soon, as GPT-3 and transformers have proved capable of automating rule creation in systems like Cyc. Direct lookups into static databases have already been incorporated into RETRO-type GPT models. Incorporating predicate-logic inference engines seems like the logical and potent next step. GPT could serve as a personality and process engine that eliminates the flaw (tedium) in the massive, human-level micro-tasking systems from GOFAI.

It's worth going through all the literature, all the way back to the 1956 Dartmouth summer workshop, hunting for ideas that just didn't work yet.

https://en.wikipedia.org/wiki/Dartmouth_workshop


...Markov Chains (via MCMC) underlie most Bayesian inference problems, and pretty much all stochastic dynamical systems models are based on Markov Chains.


Markov chain Monte Carlo is incredibly useful and widely applied.
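To make this concrete, here is a toy sketch of random-walk Metropolis, the simplest MCMC algorithm (my own illustration; the target density and step size are arbitrary choices, not from the book). Each sample depends only on the previous one, which is exactly the Markov property:

```python
import math
import random

def metropolis(log_density, n_samples, step=1.0, x0=0.0, seed=0):
    """Random-walk Metropolis sampler over an unnormalized log-density."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        # Accept with probability min(1, p(proposal) / p(x))
        log_ratio = log_density(proposal) - log_density(x)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples

# Target: standard normal, known only up to a normalizing constant
samples = metropolis(lambda x: -0.5 * x * x, n_samples=20000)
```

The point of the method is that you never need the normalizing constant; the acceptance ratio cancels it, which is why MCMC works for otherwise intractable posteriors.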


Not to mention that the entire class of Markov Chain Monte Carlo techniques only form a subset of general uses for Markov chains.

Markov chains form the basis of n-gram language models, which are still useful today.

Markov chains are also the basis of the PageRank algorithm.
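PageRank is just the stationary distribution of the "random surfer" Markov chain, and can be sketched in a few lines of power iteration (a toy illustration with a made-up three-page link graph, not production code):

```python
def pagerank(links, damping=0.85, iters=50):
    """Power iteration on the random-surfer chain: follow an outgoing
    link with probability `damping`, otherwise jump to a uniformly
    random page."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outgoing in links.items():
            if outgoing:
                share = damping * rank[p] / len(outgoing)
                for q in outgoing:
                    new[q] += share
            else:  # dangling page: spread its rank uniformly
                for q in pages:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

# Hypothetical link graph: a <-> b, b -> c, c -> a
links = {"a": ["b"], "b": ["a", "c"], "c": ["a"]}
ranks = pagerank(links)
```

The damping factor makes the chain irreducible and aperiodic, which is what guarantees a unique stationary distribution for the iteration to converge to.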

Hidden Markov Models (which are just an extension of Markov Chains to have unobserved states) are a powerful and commonly used time series model found all over the place in industry.
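The workhorse computation for HMMs is the forward algorithm, which sums over all hidden-state paths in linear time. A minimal sketch using the classic rainy/sunny toy numbers (illustrative values only, not from the book):

```python
def forward(obs, states, start, trans, emit):
    """Forward algorithm: probability of an observation sequence
    under an HMM, marginalizing over all hidden state paths."""
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emit[s][o] * sum(alpha[r] * trans[r][s] for r in states)
                 for s in states}
    return sum(alpha.values())

# Classic toy model: hidden weather, observed activity
states = ["Rainy", "Sunny"]
start = {"Rainy": 0.6, "Sunny": 0.4}
trans = {"Rainy": {"Rainy": 0.7, "Sunny": 0.3},
         "Sunny": {"Rainy": 0.4, "Sunny": 0.6}}
emit = {"Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1}}

p = forward(["walk", "shop", "clean"], states, start, trans, emit)
```

Naively summing over paths is exponential in sequence length; the forward recursion collapses it to O(length x states^2), which is why HMMs were practical at industrial scale long before deep learning.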

In the pre-deep-learning era, Markov chains (and HMMs in particular) had very widespread usage in speech processing.

They are probably one of the most practical statistical techniques out there (outside of obvious examples like linear models).


Not to mention, it was less than a decade ago that one could have said about neural networks "Decades pass and you realize they either have little to no application or are incredibly niche".


To give an example: Prediction by Partial Matching [0] is basically a Markov chain in disguise, and an incredibly powerful way to do compression that beats most other forms of text compression (at the price of a lot more memory overhead).

[0] https://en.wikipedia.org/wiki/Prediction_by_partial_matching


Aren't Markov chains how predictive typing usually works? (or worked, I suppose the big players are probably using some sort of neural net now)



