I've had my own wins with avoiding premature optimization. My personal data access code snippets use a lot of reflection to figure out which stored procedures to call without me having to specify much metadata (in keeping with the DRY principle). When I started at a new company and brought the code with me, the resident "alpha nerd" criticized it: "all that reflection must be slow." I just smiled, did what I wanted to do anyway (I now outrank him), and showed him the proof in the pudding: the ~50ms the code spends reflecting on its own stack trace is completely dwarfed by the overhead of connecting to the database, waiting on stored procedures to execute, transferring data over the network, and processing result sets into suitable outputs.
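If you're curious, the idea is roughly this. The sketch below is a minimal Java/JDBC stand-in rather than my actual code; the `SprocCaller` class, the `usp_` naming convention, and the assumption that the calling method's name matches the procedure name are all illustrative:

```java
import java.sql.CallableStatement;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Collections;

public final class SprocCaller {

    // Look up the stack for the data-access method that called
    // callProcedure, and map its name onto a stored-procedure name.
    private static String procNameFromCaller() {
        // [0] Thread.getStackTrace, [1] this method, [2] callProcedure,
        // [3] the data-access method whose name we want.
        StackTraceElement caller = Thread.currentThread().getStackTrace()[3];
        return "usp_" + caller.getMethodName();   // e.g. getCustomerOrders -> usp_getCustomerOrders
    }

    // Build "{call usp_<callerMethod>(?, ?, ...)}" and execute it with the
    // given parameters, so data-access methods never have to spell out the
    // procedure name themselves.
    public static ResultSet callProcedure(Connection conn, Object... params) throws SQLException {
        String placeholders = String.join(", ", Collections.nCopies(params.length, "?"));
        CallableStatement stmt = conn.prepareCall("{call " + procNameFromCaller() + "(" + placeholders + ")}");
        for (int i = 0; i < params.length; i++) {
            stmt.setObject(i + 1, params[i]);     // JDBC parameters are 1-based
        }
        return stmt.executeQuery();               // caller is responsible for closing
    }
}
```

A hypothetical data-access method like getCustomerOrders(conn, id) would just call callProcedure(conn, id), and the procedure name falls out of the stack trace. That single lookup is the only reflection cost per call, which is why it vanishes next to the round trip to the database.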
We typically write apps for 10 to 50 users; our complexity is in reporting, not request throughput. There's no good way to get around outer joining 10 different legacy tables of non-normalized data (which are that way because of some cheese-head early-optimizer who hasn't worked here in three years) to produce the report that the client insists is absolutely needed, even though they can't come up with any sort of business case for it.
It was outsourced to a vendor that pretty much took advantage of the contracting startup's lack of DB knowledge, so they got the kitchen sink and a fat invoice.
Of course it's easy for the vendor to defend this: "What, we want you to be scalable!"