Caching creates a mature product, no matter what your product roadmap says. If a rewrite is the tool of last resort, caching is the second to last. Like pruning shears: once you use them, your world of options collapses considerably. All of those opportunities are gone, and nature won't let you put them back.
Critically, caches are global shared state. That's why cache invalidation is famously one of the hard things. Global state also means people stop trying to maintain a data flow or architecture: no call tree needs to advertise that it uses a particular value, because anyone can just grab it from the global cache on demand. You no longer know where things are actually used, because they all come from cache lookups, and because the lookup is so cheap (under typical load; under high load it's catastrophic), people don't even try to hold onto previously acquired data. They just fetch it again. That refetching is yet another boundary condition for cache invalidation: what if half a transaction sees the old value and half sees the new one?
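To make that last hazard concrete, here's a minimal, self-contained sketch (all names, values, and the sleep-based timing are hypothetical, purely for illustration): two cache reads inside one logical operation straddle an update, so the result is consistent with neither the old state nor the new one.

```python
import threading
import time

cache = {"price": 100}  # toy global cache

def checkout():
    subtotal = cache["price"]     # first read sees the old value (100)
    time.sleep(0.01)              # another thread updates the cache in here
    tax = cache["price"] * 0.1    # second read sees the new value (200)
    return subtotal + tax         # 120.0: neither 110.0 (old) nor 220.0 (new)

def repricer():
    time.sleep(0.005)
    cache["price"] = 200          # invalidation/update lands mid-"transaction"

t = threading.Thread(target=repricer)
t.start()
print(checkout())                 # 120.0, a value no consistent snapshot produces
t.join()
```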
They also make flame graphs essentially useless, sometimes immediately, and certainly once the above starts happening. You quickly lose any clear idea of what anything costs, because you never pay the cost, and if you try to pay it you amplify it until there is no signal. Once people are promiscuously fetching from the cache, they don't even attempt to avoid duplicate lookups, so a single activity will end up fetching the same data six times. And if you turn off the cache entirely for performance analysis, the cost skyrockets and the flame graph is still wrong.
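Here's a hypothetical sketch of that duplicate-lookup shape (none of these names come from a real system): because the lookup feels free, every helper fetches its own data instead of receiving it, and one activity hits the same keys repeatedly. The counter stands in for the attribution a profiler can no longer give you.

```python
from collections import Counter

lookups = Counter()
store = {("user", 1): "Ada", ("order", 7): 42}  # toy backing data

def cache_get(key):
    lookups[key] += 1  # tally every fetch
    return store[key]

def render_header(): return cache_get(("user", 1))
def render_items():  return cache_get(("order", 7))
def render_totals(): return cache_get(("order", 7))  # refetch, not a reuse
def render_footer(): return cache_get(("user", 1))   # refetch, not a reuse

for part in (render_header, render_items, render_totals, render_footer):
    part()
print(lookups)  # every key fetched twice; in a deep call tree it compounds
```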
You are back to dead reckoning: manually adding telemetry to suspected hotspots to sort it out. That is very slow, and it can be demoralizing. Before long, today's build is the fastest the app will be for a very long time.
All great points! For the kind of work I'm thinking of (mostly offline data processing), I use caches in a much more limited way than you describe. A short-lived, process-local LRU cache on an expensive pure function over immutable data can significantly cut our resource costs without adding much complexity to the code. In some cases dynamic programming would be more efficient, but it would require a mode of thinking and abstraction that's very different from most of our code.
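For what it's worth, the pattern I mean looks roughly like this (a minimal sketch; expensive_transform, the cache size, and the toy workload are stand-ins, not code from our system):

```python
from functools import lru_cache

@lru_cache(maxsize=4096)  # bounded and process-local: old entries just fall out
def expensive_transform(record: tuple) -> int:
    # Pure function over immutable input: the result can never go stale,
    # so there is no invalidation logic to get wrong.
    return sum(hash(field) % 97 for field in record)

# Offline batch with many repeated records: hits are free after the first miss.
batch = [("a", "b"), ("c",), ("a", "b")] * 1000
results = [expensive_transform(r) for r in batch]
print(expensive_transform.cache_info())  # hits/misses show the savings directly
```

Because the cache lives and dies with the batch job, none of the global-shared-state problems above apply: nothing outside the process can observe or depend on it.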