I tried this intro competition a while ago, and as I remember it was relatively easy to get a score of ~0.78 or ~0.79 with about 5 lines of Python or R (using your favorite lib/algo, of course). That said, am I the only one who thinks that chasing a few extra percentage points over what looks like a "natural baseline" (which admittedly might translate into a few dozen more correctly classified people in this particular case) is a somewhat strange use of one's time and skills? I'll grant that you can pick up data processing and programming skills along the way. My point is that nothing remotely "scientific" (or even insightful) seems to come out of such an exhaustive search, which IMO characterizes a lot of these data science competitions: you squeeze to death every possible transformation of a given dataset in order to maximize one very precise metric. To me this looks like a degenerate form of statistical science that no longer has much to do with reality.
Having "PassengerId" as the most important feature seems like a bad sign. Is this historical data, like an id assigned at boarding, or is it a synthetic id assigned to records potentially in some non-random order?
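One rough way to probe the "non-random order" possibility is to sort the records by id, cut them into contiguous buckets, and compare the positive rate across buckets. If the ids carry no ordering information, the rates should all sit near the overall rate. The sketch below uses synthetic data only (not the competition file), and the helper names `rate_by_id_bucket` and `spread` are made up for illustration.

```python
import random


def rate_by_id_bucket(ids, labels, n_buckets=10):
    """Sort records by id, split them into contiguous buckets,
    and return the fraction of positive labels in each bucket."""
    pairs = sorted(zip(ids, labels))
    n = len(pairs)
    rates = []
    for b in range(n_buckets):
        lo = b * n // n_buckets
        hi = (b + 1) * n // n_buckets
        chunk = [label for _, label in pairs[lo:hi]]
        rates.append(sum(chunk) / len(chunk))
    return rates


def spread(rates):
    """Gap between the highest and lowest bucket rate."""
    return max(rates) - min(rates)


# Synthetic illustration: 1000 records with ids 1..1000.
rng = random.Random(0)
ids = list(range(1, 1001))

# Case 1: labels are independent of the id.
random_labels = [1 if rng.random() < 0.4 else 0 for _ in ids]

# Case 2: the same labels, but sorted so that the id order predicts the label.
ordered_labels = sorted(random_labels)

random_spread = spread(rate_by_id_bucket(ids, random_labels))
ordered_spread = spread(rate_by_id_bucket(ids, ordered_labels))

print("spread when ids are uninformative:", round(random_spread, 3))
print("spread when ids encode the order: ", round(ordered_spread, 3))
```

A large spread on the real training data would suggest the id leaks something about how the records were ordered. A small spread would not fully clear it, since a tree-based model can also rank a high-cardinality column highly just by overfitting to it.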