
Oh, I see what you're getting at.

I think I can clear that up for you: Bayes' formula works both for concrete probabilities (P(A), P(A|B), etc.) and for PDFs (p(A), p(A|B), ...). So

    p(A|B) = (p(B|A) * p(A)) / p(B)
is just

    P(A|B) = (P(B|A) * P(A)) / P(B)
The only difference is that in the first version the inputs and the output are density functions of A and B, whereas in the second version A and B are given and the output is a concrete number.

In the case of the Kalman filter (or LS estimation in general) all your p(..) PDFs are Gaussians.
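
If it helps, here's a rough 1-D sketch of the lowercase version in Python (the names are mine, not from any particular library); the observation is modelled as state plus Gaussian noise, and p(B) is computed by brute-force integration just to show it's only a normalizer:

    import numpy as np

    def gaussian_pdf(x, mean, var):
        # Density of N(mean, var) evaluated at x.
        return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

    def posterior_pdf(a, b, prior_mean, prior_var, obs_var):
        # p(A|B) = p(B|A) * p(A) / p(B); p(B) is found by integrating the
        # numerator over A, which is all the denominator ever does here.
        grid = np.linspace(prior_mean - 10.0, prior_mean + 10.0, 10001)
        numer = gaussian_pdf(b, grid, obs_var) * gaussian_pdf(grid, prior_mean, prior_var)
        p_b = np.sum(numer) * (grid[1] - grid[0])
        return gaussian_pdf(b, a, obs_var) * gaussian_pdf(a, prior_mean, prior_var) / p_b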




Yeah, the difference between lowercase p and capital P seems pretty important. Most places show capital P, so that's part of the confusion.

There's more though. Lowercase p(B|A) seems to really mean p_B(h(a)), and that's not obvious. Hell, I might still have it wrong.

And most everyone says to ignore p(B) in the denominator, but that's really sloppy hand-waving. The notation means something, and there should be a well-defined set of substitutions, but in each of the four terms, they do something radically different. I can't see a pattern to follow.


> Lowercase p(B|A) seems to really mean p_B(h(a)), and that's not obvious. Hell, I might still have it wrong.

I think you have it about right. It's "semi-obvious" in the sense that observation and state are related through the observation function (in your case, observing a 3D point as a 2D coordinate), and that function is naturally part of the PDF.
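
Something along these lines is roughly what p(B|A) = p_B(h(a)) works out to when h maps a 3D state to a 2D measurement; the projection below is just a stand-in I made up, your h is whatever your camera model says:

    import numpy as np

    def h(state_3d):
        # Example observation function: perspective projection onto the z = 1 plane.
        x, y, z = state_3d
        return np.array([x / z, y / z])

    def likelihood(observed_2d, state_3d, obs_cov):
        # p(B|A) = p_B(h(a)): a Gaussian measurement-noise density evaluated at
        # the difference between the actual observation and the predicted one.
        diff = np.asarray(observed_2d) - h(np.asarray(state_3d))
        inv = np.linalg.inv(obs_cov)
        norm = 2 * np.pi * np.sqrt(np.linalg.det(obs_cov))
        return float(np.exp(-0.5 * diff @ inv @ diff) / norm)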

> And most everyone says to ignore p(B) in the denominator, but that's really sloppy hand-waving.

It's not. You want to know the most probable value of A; in other words, you are looking for the argmax of a function of A. p(B) is purely a function of B, with no A involved, so it is a constant in your equation. Since it sits in the denominator, it's a normalizing constant.

Note that if you have the "Uppercase Bayes" (P(A|B) = ...) you are looking for concrete values, so the normalizing P(B) does matter.

Now in the case of "lowercase Bayes" (p(A|B) = ...) it matters just as much, but you can still ignore it if all you're looking for is the argmax of the resulting PDF, since p(B) is just a scaling constant and does not change the argmax of p(A|B).
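
A quick numerical sanity check of that point, with made-up numbers:

    import numpy as np

    a_grid = np.linspace(-5, 5, 1001)
    numerator = np.exp(-0.5 * (a_grid - 1.2) ** 2)     # p(B|A) * p(A), up to scale
    p_b = np.sum(numerator) * (a_grid[1] - a_grid[0])  # the normalizing constant p(B)
    posterior = numerator / p_b                        # p(A|B)

    # Dividing by p(B) rescales every value but leaves the argmax where it was.
    assert np.argmax(numerator) == np.argmax(posterior)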

> but in each of the four terms, they do something radically different. I can't see a pattern to follow.

I don't understand what you mean here.


> It's not. You want to know the most probable value of A.

Nah, it's really not that simple. When I've done this in the past, I've needed both the mean (which is the mode for Normal distributions) and the variance, so I can make confidence ellipses. I don't just care about the most probable location.

I already have a set of techniques for working with Kalman filters. The only reason I would want to understand applying Bayes' theorem in this context is if it offers insight into a wider class of problems (non-Gaussian PDFs) or if it helps me communicate with others. In both of those cases, I'd like to understand the thing first before I hand-wave the denominator away.
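
For what it's worth, what I do today is roughly this 1-D fusion (names are mine); it gives me both the mean and the variance, and the variance is what the ellipses need:

    def fuse_1d(prior_mean, prior_var, meas, meas_var):
        # Standard 1-D Kalman/Gaussian fusion: returns the posterior mean
        # (the argmax) and the posterior variance (what the ellipses use).
        k = prior_var / (prior_var + meas_var)   # Kalman gain
        post_mean = prior_mean + k * (meas - prior_mean)
        post_var = (1.0 - k) * prior_var
        return post_mean, post_var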


> Nah, it's really not that simple.

Well, you came here and asked, and I gave you a response because I happen to know the topic and wanted to help. It's your choice not to believe what I explained, but I doubt you'll get a very different answer from other people.


Heh, I think we stepped in the wrong direction. I didn't mean to offend you.

Take care.



