Decision Making Under Uncertainty with POMDPs.jl (juliaacademy.com)
87 points by amkkma on Sept 20, 2021 | 9 comments


My Ph.D. dissertation was in "Decision Making Under Uncertainty". The topics in the OP are extensive, and most of them look good. E.g., I noticed interpolation -- an old idea but sometimes important. Apparently missing, however, was the old technique of scenario aggregation.

Before going very far, we should note that this subject got a lot of attention at the ORFE department at Princeton.

Some of the related advanced math is pretty, but in practice here are some of the real issues:

(1) Far too easily, the needed computer resources -- both CPU cycles and memory, especially main memory -- are high beyond reason, feasibility, or practicality. Without exaggeration, this subject is Godzilla for any supercomputer of less than 100 million square feet of floor space -- it right away gobbles it up and spits it out.

(2) Too often this methodology is what you would do -- provably the best possible for any means of processing the input data -- if you had a lot of data, and assumptions about that data, that you don't have.

(3) So, in practice, the first obstacle is to find a problem that gets around (1) and (2) and is otherwise doable.

(4) In practice, the next obstacle is convincing the suits, who are nearly always wildly afraid of anything so new to them, to devote any time, money, data, or attention to the work. The suits too easily see any such project as a lose-lose for them: (A) If the project works, with quite valuable results, the project head might get promoted, maybe over the relevant suits. (B) If the project doesn't work really well, generally it will be seen as a waste, with the suits having a black mark on their records, a black mark that makes them easy targets in the C-suite competition.

One possible path: If you have a project that passes (1) and (2), give it one more attribute: make it a project you (A) can fund and do yourself, (B) have a good list of good candidate customers for, and (C) can do a corresponding startup around. Due to (B), you are just a vendor to the customer organization and, thus, not a threat to the C-suite suits. And to the customer the project has no "waste", since if the customer is not happy the startup doesn't get paid.

Net, do a startup.


Care to share your dissertation?

If I'm guessing correctly what scenario aggregation is from just the name, I've found that technique very powerful when doing plain old human-based decision making. Basically, if you have a bunch of complex scenarios that you're trying to consider at once, it can be very difficult to make progress, because you have to compare whole trees of options against each other, along with their probabilities, every time you make a change anywhere. But if several scenarios have some prefix where they are basically equivalent (or can be made to be), you can bundle them together and consider the group as a whole, putting off deciding between them until later. Sometimes this can collapse a huge, complex decision tree into the one obviously correct thing to do right now, up to the point where the scenarios diverge again.
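
Here is a minimal sketch of that prefix-bundling idea in Python; the scenario encoding and the action names are hypothetical, just to show the mechanics:

    from itertools import groupby

    # Each hypothetical scenario is a (probability, actions) pair, where
    # actions is the sequence of decisions that scenario would call for.
    scenarios = [
        (0.4, ["hire", "expand", "raise_prices"]),
        (0.3, ["hire", "expand", "cut_costs"]),
        (0.2, ["hire", "hold", "cut_costs"]),
        (0.1, ["sell", "exit"]),
    ]

    def aggregate(scens):
        """Bundle scenarios that share a first action.

        Returns (total_probability, action, remaining_scenarios) groups;
        if one group dominates, the first action is decided without
        comparing the full downstream trees."""
        key = lambda s: s[1][0]
        groups = []
        for action, grp in groupby(sorted(scens, key=key), key=key):
            grp = list(grp)
            total = sum(p for p, _ in grp)
            # Renormalize the tails; deciding among them is deferred.
            rest = [(p / total, acts[1:]) for p, acts in grp if len(acts) > 1]
            groups.append((total, action, rest))
        return sorted(groups, reverse=True)

    for prob, action, rest in aggregate(scenarios):
        print(f"p={prob:.1f}: do {action!r} now, defer {len(rest)} sub-scenarios")

With these made-up numbers, the bundled group says "hire" with probability 0.9 before any of the downstream comparisons are made, which is the collapse described above.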


Good, fast view of scenario aggregation. The idea is powerful, not just a heuristic, and is from R. T. Rockafellar.

I didn't want to publish my dissertation because that would be like giving it away for free. Instead I wanted to sell it! I made some efforts to sell it but ran into the problem I mentioned for the suits. So, it didn't sell. But I still don't want to give it away for free.


I've never heard of selling a dissertation, but this seems to be a reasonable case for it. I guess you're selling at a fairly high price, otherwise it would just be a book, no?

I can imagine one fear such suits might have is that they would invest in access to this idea but be unable to operationalize it. Perhaps a startup to sell good decisions (a funny thing to type) is in order.

Hit me up on the email on my about page if you'd like to chat more. :)

Good luck!


Thanks for the offer.

Before my Ph.D., I had a good career going, non-academic. At times, math was a help, so I got a pure/applied math Ph.D. But I never had any intention of being a professor; I got the Ph.D. to help the good career I already had.

Yup, mostly Ph.D.s are guided to think that their future is as a prof, and there, with "publish or perish", they should publish everything they can. Fine for them. But for me, did I mention, I had no intention of being a prof.

As for selling the work: it was to help some big companies in a big industry save some millions of dollars a year. I assumed that the world-class university where I got my Ph.D. would provide significant credibility and that the chance of such big-bucks savings would at least get me a lot of conversations.

Nope: I got a lot of silence and otherwise some resentment -- "We are already doing the most best possible optimal, thank you very much," or some such.

Net, the selling I tried didn't work. I didn't have the money, selling skills, or insight into organizational behavior to continue trying to sell. If they regarded my world class research university Ph.D. and dissertation as junk to be tossed in the trash, with no investigation at all, then they were too tough for me to work with.

Thanks for the offer, but I gave my view of applying Markov decision theory: in practice it's nearly always too difficult technically, and otherwise still not wanted or even deeply resented.

So, the whole subject is just a waste? Maybe. Here is a mature view: The best of research is about the best there is in civilization. Still, nearly always the world of practice would like to see research just go away.

For Markov decision theory, a reasonable research question was, what can be done? Can we find optimal solutions in any sense? If so, can those solutions make money in practice?

So, for the good news, the research was successful: Yes, we can write out in great detail, highly polished, how to get optimal solutions. So, we wanted optimal solutions, and now we know what optimal solutions look like.

Or, the research is now done, and it has removed all doubt: essentially always, optimal solutions are too challenging technically. So we know we should find something else to do.

Now for the bad news: Now that we do know what the optimal solutions look like, we also know that the solutions are too challenging technically to be applicable in practice except for some rare situations. So, for 99 44/100% of practice, Markov decision theory and a dime will just cover a 10 cent cup of coffee.

Thanks for the offer, but I'm doing a startup, one that has essentially nothing to do with Markov decision theory.


Any recommendations for introductory reading in this area?


Reasoning under uncertainty isn't the sole focus of the book, but Russell & Norvig's AI textbook has a couple of chapters on the topic that are pretty good, with the right amount of detail for the uninitiated.


The basic idea is dirt simple. After that, apparently the OP has an overview. After that, the math gets quite pure. E.g., the Markov assumption is that, in the underlying process, the past and future are conditionally independent given the present. Fairly quickly this assumption gets discussed in terms of sigma algebras from measure theory instead of just random variables. The sigma algebras are pretty: (a) If we have the sigma algebra generated by a random variable, the random variable is determined (from memory from 20 years ago; my source was

I. I. Gihman and A. V. Skorohod, The Theory of Stochastic Processes I, ISBN 0-387-06573-3, Springer-Verlag, New York, 1974.

and I lost my copy in a move and have yet to order another from Springer). (b) If we have infinitely many random variables, discussing conditional independence directly is at least clumsy, but the sigma algebra they generate makes talking about conditional independence easy. Where do we get infinitely many random variables? In the continuous-time case, the past one second will do. In the discrete-time case, the full past of the process, if we assume the process goes back all the way in time.

But, sure, in practice we can't do anything very direct with sigma algebras; that is, they are pretty theory.
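
For concreteness, here is the Markov assumption in the discrete-time case, first in the random-variable form and then in the sigma-algebra form (standard notation, nothing specific to the OP):

    P(X_{n+1} \in A \mid X_n, X_{n-1}, \ldots, X_0) = P(X_{n+1} \in A \mid X_n)

    % the same statement with \mathcal{F}_n = \sigma(X_0, \ldots, X_n),
    % the sigma algebra generated by the past:
    P(X_{n+1} \in A \mid \mathcal{F}_n) = P(X_{n+1} \in A \mid \sigma(X_n))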

Here is the whole idea in a reasonably general case in brief:

Do some financial planning for the next five years. Make time discrete, say, one point in time for each month. Have a point at the beginning and also at the end for a total of 61 points.

Build a spreadsheet with one column for each of the 61 points in time and one row for each of the relevant variables. To make this stochastic optimal control, Markov decision theory, or other equivalent names, in the spreadsheet have some cells that get their values from the built-in random number generator.

Also, at each of the first 60 points in time, have some variables that are the decisions we get to make.

In the last column, have the value of the business then.

Our mission, and we have to accept it, is to determine the decisions (values in the decision cells) that maximize the value of the business at the end. Ah, since we have the random cells, typically we want to maximize the expected (average) value at the end.

So, we start at column 60 and consider all the possible data we could have in column 60. Sure, that data fills maybe a few exabytes. Then we pick values for the decisions in column 60 and reevaluate the spreadsheet many times, letting the random cells generate random numbers, average, and see what average value we get in column 61. We do this for all possible decisions. Sure, now we are keeping a few million square feet of computing busy. But the work is highly parallel. That's about what we would have to do using a general-purpose program, but in practice we would look at the specific problem, note special structure (properties, assumptions, simplifications), and get some speedups.

To be more clear: at column 60, set aside the random cells and the decision cells -- the rest hold the state at column 60. So, for each possible state, we try all the possible decisions, and for each decision we reevaluate many times, get the expected value of the business in column 61, and pick the best decision. So, there in column 60, for each of the possible states, we get the best decision and the corresponding expected business value in column 61. We store this -- that is, we store the best decision and expected value for each state. Here we fill some exabytes of storage.

Now we back up to column 59. For each state we try all the possible decisions. For each of those decisions we recalculate, that is, exercise the random number generators, see what state we get in column 60, and look up the corresponding expected business value in column 61 that we already tabulated for that state. So, for each state in column 59 we find the best decision.

Now we continue in this way back to columns 58, 57, ..., 1. In column 1 we know the state we are in because that is our present. So our calculation for column 1 told us what decision to make there in column 1.
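
In symbols, the recursion being computed column by column is the standard dynamic programming (Bellman) backward recursion; with S_t the state and D_t the decision at column t:

    V_{61}(s) = \text{terminal business value in state } s,
    V_t(s)    = \max_{d} \mathbb{E}\big[\, V_{t+1}(S_{t+1}) \mid S_t = s,\; D_t = d \,\big],
                \qquad t = 60, 59, \ldots, 1.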

Then in reality, soon we get to column 2, see what state we are in, and use the decision we tabulated for that state. We continue with column 3, 4, ..., 60, make our decisions in column 60, and see what business value we get in column 61.

The whole thing is the same except much faster if we have no random cells.
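
Here is a minimal sketch of that backward pass in Python, with a deliberately tiny toy model -- the state space, decisions, transition function, and terminal value are all made up for illustration; a real problem supplies its own:

    import random

    T = 60                # decision points (columns 1..60; column 61 is terminal)
    STATES = range(5)     # tiny stand-in for the exabytes of possible states
    DECISIONS = range(3)  # tiny stand-in for the possible decisions
    SAMPLES = 200         # Monte Carlo samples per (state, decision) pair

    def step(state, decision, rng):
        """Hypothetical one-column transition: new state from the old
        state, the decision taken, and a draw from the random cells."""
        return (state + decision + rng.randint(-1, 1)) % len(STATES)

    def terminal_value(state):
        """Hypothetical business value in the last column."""
        return float(state)

    rng = random.Random(0)
    value = {s: terminal_value(s) for s in STATES}   # column 61
    policy = []                                      # policy[t][s] = best decision

    for t in reversed(range(T)):                     # columns 60, 59, ..., 1
        new_value, best = {}, {}
        for s in STATES:
            # Try every decision; estimate the expected downstream value
            # by re-evaluating the "spreadsheet" SAMPLES times and
            # looking up the already-tabulated next-column values.
            scores = {}
            for d in DECISIONS:
                scores[d] = sum(value[step(s, d, rng)]
                                for _ in range(SAMPLES)) / SAMPLES
            best[s] = max(scores, key=scores.get)
            new_value[s] = scores[best[s]]
        value = new_value
        policy.insert(0, best)

    print("decision to make now, per current state:", policy[0])

The shape of the loops is the whole point; with a realistic state space, the value table is exactly the exabyte store described above, which is issue (1).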

There are also continuous time treatments, essentially the same intuitively.

For some books,

Dimitri P. Bertsekas, Dynamic Programming: Deterministic and Stochastic Models.

George L. Nemhauser, Dynamic Programming.

Stuart E. Dreyfus and Averill M. Law, The Art and Theory of Dynamic Programming.

E. B. Dynkin and A. A. Yushkevich, Controlled Markov Processes.

Wendell H. Fleming and Raymond W. Rishel, Deterministic and Stochastic Optimal Control.

There is more from R. T. Rockafellar and R. J.-B. Wets.

And of course we should mention R. Bellman, the father of this subject.


Thanks for the explanation! I'm actually working on a tool right now that uses a similar process, but haven't had much of a formal background. Looks like I'll be adding quite a bit to my reading list!



