Speaking of unfamiliarity and specific needs: FPGAs are much better suited than CPUs to processing minimum-sized frames at wire speed, and they can still forward any frames they don't handle to a CPU. Yes, it's a lot more development effort than a CPU-only solution, but considering all the kernel-optimizing multicore cleverness from the OP, I would say we are approaching the break-even point.
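
To put a rough number on why minimum-sized frames are the hard case, here is a back-of-the-envelope sketch. The link speeds are my own assumption (the thread doesn't pin one down), and the function names are just illustrative. It only uses the standard Ethernet on-wire overhead: a 64-byte minimum frame plus 8 bytes of preamble/SFD and a 12-byte inter-frame gap.

```python
# Standard Ethernet on-wire sizes, in bytes.
MIN_FRAME = 64        # minimum frame, including the FCS
PREAMBLE_SFD = 8      # preamble + start-of-frame delimiter
INTERFRAME_GAP = 12   # mandatory idle time between frames


def max_frame_rate(link_bps, frame_bytes=MIN_FRAME):
    """Frames per second at line rate for the given frame size."""
    bits_on_wire = (frame_bytes + PREAMBLE_SFD + INTERFRAME_GAP) * 8
    return link_bps / bits_on_wire


def per_frame_budget_ns(link_bps, frame_bytes=MIN_FRAME):
    """Time available to handle one frame before the next one arrives."""
    return 1e9 / max_frame_rate(link_bps, frame_bytes)


if __name__ == "__main__":
    # Assumed link speeds, purely for illustration.
    for gbps in (1, 10, 40):
        bps = gbps * 1e9
        print(f"{gbps:>3} Gbit/s: "
              f"{max_frame_rate(bps) / 1e6:.2f} Mpps, "
              f"{per_frame_budget_ns(bps):.1f} ns per frame")
```

At an assumed 10 Gbit/s that works out to about 14.88 million frames per second, or roughly 67 ns per frame. On a single 3 GHz core (again, an assumption) that is only around 200 cycles per frame, which is why a CPU needs all that multicore cleverness while an FPGA pipeline can just take one frame per slot.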