I think if you have exposure to so many stat and probability books, and yet cannot recommend one good one, then clearly you are the person fated by the universe to write that one book correctly. :)
Back in grad school, my fellow students and I were amazed at how polished the books of Rudin, Neveu, Royden, Luenberger, Dynkin, etc. were, and how comparatively ... let's say not good ... the books in statistics were. There are some hints that there are more good statistics books now.
One of my fellow students was very capable, and I was hoping he would write a good book; I doubt he ever got around to it.
I'm glad the statistics community has at least one foot in important applications, but both feet? Way back there was Cramer. At the Brown University Division of Applied Math, for a long time there was U. Grenander -- maybe he could have written a Cramer Volume II.
I'd like to see (A) much more polish on the foundations and then (B) a selection, made with good insight and expertise, of some of the keys to some of the more important applications.
Some of the application areas where I suspect, with varying degrees of strength, there is some good work include (i) particle physics such as at the LHC, (ii) a huge range of bio-medical research, and (iii) high-end military radar, sonar, and tracking more generally.
When I was in grad school, some of the gossip was that the statistics of sample paths of stochastic processes was a wide-open field -- I suspect it still is.
Apparently in the US, at least in attitudes toward well-done theory, statistics is a poor cousin of probability theory, which in turn is a poor cousin of pure math, and stochastic processes are just out of the picture.
I haven't tried to be a statistician, but I've done some projects and gotten some results. But for each of those results, clearly there were plenty of loose ends and more to do, without any very clear theory, examples, experience, or methods to tie them off.
Maybe here's one -- though I'm not putting a lot into this just now: above I gave my little derivation that, with data X, the best estimate of Y is E[Y|X], with the idea that this partly justifies cross tabulation as the discrete version. Okay, but X might be a sample path from the history of a stochastic process, with lots of dimensions and goofy data types. So, to cut down on the data required for cross tabulation on several variables -- which grows exponentially in the number of variables -- maybe pick and choose the variables. Okay, but at first cut we have not even zip, zilch, or zero on how to do that.
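For what it's worth, here is that little derivation written out, under the usual finite-variance assumption -- nothing beyond the standard least-squares argument:

    % Why E[Y|X] is the best (least-squares) estimate of Y from data X,
    % assuming E[Y^2] < \infty. For any candidate estimator g(X):
    \begin{align*}
    E\!\left[(Y - g(X))^2\right]
      &= E\!\left[(Y - E[Y\mid X])^2\right]
       + E\!\left[(E[Y\mid X] - g(X))^2\right] \\
      &\ge E\!\left[(Y - E[Y\mid X])^2\right],
    \end{align*}
    % since the cross term 2 E[(Y - E[Y|X])(E[Y|X] - g(X))] vanishes by
    % conditioning on X (tower property). Cross tabulation estimates
    % E[Y|X] cell by cell; with k variables at m levels each that is m^k
    % cells -- the exponential explosion in the number of variables.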
Once I published a paper on multi-dimensional, distribution-free statistical hypothesis tests, intended for zero-day computer security. But, again, the number of variables with data is huge; we encounter another exponential explosion and would like some help on which variables to choose.
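For readers who haven't seen one, here is a minimal sketch of a generic distribution-free test in several dimensions (a two-sample permutation test on a multivariate statistic) -- emphatically not the method in that paper, just the flavor of "distribution-free"; the data and statistic are made up for illustration:

    import numpy as np

    def permutation_test(a, b, n_perm=10_000, seed=0):
        """Two-sample permutation test on the difference of mean vectors.

        Distribution-free: the null distribution of the statistic comes from
        relabeling the pooled observations, not from any parametric model.
        a, b: arrays of shape (n_a, d) and (n_b, d). Returns an approximate p-value.
        (Generic illustration only, not the test from the paper mentioned above.)
        """
        rng = np.random.default_rng(seed)
        pooled = np.vstack([a, b])
        n_a = len(a)
        observed = np.linalg.norm(a.mean(axis=0) - b.mean(axis=0))
        count = 0
        for _ in range(n_perm):
            perm = rng.permutation(len(pooled))
            pa, pb = pooled[perm[:n_a]], pooled[perm[n_a:]]
            if np.linalg.norm(pa.mean(axis=0) - pb.mean(axis=0)) >= observed:
                count += 1
        return (count + 1) / (n_perm + 1)

    # Toy usage: "baseline" vs. "suspect" summaries, 5 variables each.
    rng = np.random.default_rng(1)
    baseline = rng.normal(size=(200, 5))
    suspect = rng.normal(loc=0.3, size=(50, 5))
    print(permutation_test(baseline, suspect))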
Very broadly, from 200,000 feet up, we get to choose the variables to use and then want to know something about the accuracy of our results -- too often we are left with the TIFO (try it and find out) method: some form of Monte Carlo, resampling techniques (B. Efron, P. Diaconis), deleting some variables or observations and trying again, etc.
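To make the resampling idea concrete, here is a minimal sketch in the spirit of Efron's bootstrap -- resample the observations, recompute the statistic, read off a rough standard error; the data and statistic here are placeholders, not any particular application:

    import numpy as np

    def bootstrap_se(data, statistic, n_boot=2000, seed=0):
        """Estimate the standard error of a statistic by resampling rows of
        `data` with replacement (Efron's nonparametric bootstrap)."""
        rng = np.random.default_rng(seed)
        n = len(data)
        reps = [statistic(data[rng.integers(0, n, size=n)]) for _ in range(n_boot)]
        return np.std(reps, ddof=1)

    # Toy usage: rough accuracy of a correlation estimate from 100 paired observations.
    rng = np.random.default_rng(2)
    x = rng.normal(size=100)
    y = 0.5 * x + rng.normal(size=100)
    data = np.column_stack([x, y])
    corr = lambda d: np.corrcoef(d[:, 0], d[:, 1])[0, 1]
    print(corr(data), bootstrap_se(data, corr))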
My guess is that finding a welcoming department in a research university and/or an interested problem sponsor in a funding agency would be too tough.