> way before that if you're thrashing those piddly L3 and L2 caches.
And that's why knowing your hardware and your workload is important. I remember once we had a performance problem with an x86 server. Over lunch, the guy with the problem told me the working dataset was about 8 megs per batch or so. I told him to move the workload to an Itanium server we had for testing, because I knew that server had 11 megs of L2 cache and was also slightly faster per instruction than the Xeon that was running the workload. The throughput more than tripled.
It breaks my heart when people just assume more memory/GHz/cores is always faster.
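
For what it's worth, the lunch-table reasoning boils down to a one-line comparison. Here is a rough sketch of it; the 8 and 11 meg figures are the ones from the story, while the Xeon's cache size is a made-up placeholder (I didn't give the real number above), and `fits_in_cache` is just an illustrative helper name.

```python
MIB = 1024 * 1024

def fits_in_cache(working_set_bytes: int, cache_bytes: int) -> bool:
    """Return True if the whole working set could, in principle, stay in cache."""
    return working_set_bytes <= cache_bytes

working_set = 8 * MIB      # "about 8 megs per batch" from the story
itanium_cache = 11 * MIB   # the 11 megs quoted above
xeon_cache = 2 * MIB       # hypothetical placeholder: the Xeon's size is not stated

print(fits_in_cache(working_set, itanium_cache))
print(fits_in_cache(working_set, xeon_cache))
```

This ignores associativity, code and other data competing for the same cache, and anything else sharing the box, so treat it as a first guess to confirm with a real measurement.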