I've seen the Kalman filter presented from a few different angles, and the one that made the most sense to me came from a Bayesian methods class that spoke only in terms of marginal and conditional Gaussian distributions and discarded a lot of the control theory terminology.
I succeeded in understanding the Kalman filter only when I found a text that took a similar approach. It was this invaluable article, which presents the Kalman filter from a Bayesian perspective:
Meinhold, Richard J., and Nozer D. Singpurwalla. 1983. "Understanding the Kalman Filter." American Statistician 37 (May): 123–27.
This was one of the books we used: https://link.springer.com/chapter/10.1007/978-1-4757-9365-9_...
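For anyone who wants to see what "only marginal and conditional Gaussians" looks like in practice, here is a minimal one-dimensional sketch. It is not taken from the article or the book; the scalar model (x_t = a * x_{t-1} + noise with variance q, y_t = h * x_t + noise with variance r) and the names `predict`, `update`, `a`, `q`, `h`, `r` are my own assumptions for illustration.

```python
def predict(mean, var, a, q):
    """Marginalize over the previous state.

    Assumed model: x_t = a * x_{t-1} + noise, noise variance q.
    """
    return a * mean, a * a * var + q


def update(mean, var, y, h, r):
    """Condition the joint Gaussian of (x_t, y_t) on the observed y.

    Assumed model: y_t = h * x_t + noise, noise variance r.
    """
    s = h * h * var + r   # marginal variance of y
    k = var * h / s       # cov(x, y) / var(y), usually called the Kalman gain
    return mean + k * (y - h * mean), var - k * h * var


# Prior N(0, 1), static state, one observation y = 2 with unit noise.
mean, var = 0.0, 1.0
mean, var = predict(mean, var, a=1.0, q=0.0)
mean, var = update(mean, var, y=2.0, h=1.0, r=1.0)
print(mean, var)  # 1.0 0.5
```

The update step is just the standard formula for conditioning one jointly Gaussian variable on another, which is roughly the point of the Bayesian presentation: the "gain" falls out of the conditional distribution rather than being introduced as a separate concept.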