I've had to solve this in a number of ways. The fastest approach I've found is to precompute a hash map at low granularity and refresh it on a regular batch cycle rather than computing answers at query time. GraphHopper and OSRM, fed with OpenStreetMap data, are useful in this domain, to the point where relatively dense polygons can be mapped in 16 CPU hours for a 100 km by 100 km block.
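As a rough illustration of the precomputed hash map idea, here is a minimal sketch in Python. It assumes the map is keyed by a coarse latitude/longitude grid cell and that the expensive per-cell value comes from some routing query. The names `cell_key`, `CoarseGridCache`, and the `compute` callback are all hypothetical, and the 0.01 degree cell size is an arbitrary choice; none of this is a GraphHopper or OSRM API.

```python
import math
from typing import Any, Callable, Dict, Iterable, Optional, Tuple

Cell = Tuple[int, int]


def cell_key(lat: float, lon: float, cell_deg: float = 0.01) -> Cell:
    """Snap a coordinate to a coarse grid cell.

    0.01 degrees is roughly 1 km of latitude; the size is an assumption.
    """
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


class CoarseGridCache:
    """Hash map from grid cell to a precomputed value, rebuilt in batches."""

    def __init__(self, compute: Callable[[float, float], Any],
                 cell_deg: float = 0.01) -> None:
        # `compute` stands in for the expensive step (for example a call
        # to a routing engine); it is a hypothetical placeholder.
        self.compute = compute
        self.cell_deg = cell_deg
        self.table: Dict[Cell, Any] = {}

    def rebuild(self, points: Iterable[Tuple[float, float]]) -> None:
        """Batch cycle: compute one value per cell, then swap the table in."""
        fresh: Dict[Cell, Any] = {}
        for lat, lon in points:
            key = cell_key(lat, lon, self.cell_deg)
            if key not in fresh:  # only the first point in a cell pays the cost
                fresh[key] = self.compute(lat, lon)
        self.table = fresh  # readers keep seeing the old table until this line

    def lookup(self, lat: float, lon: float,
               default: Optional[Any] = None) -> Any:
        """Query time: a single dictionary lookup, no routing call."""
        return self.table.get(cell_key(lat, lon, self.cell_deg), default)
```

The trade-off is precision for speed: every point inside a cell gets the same answer, so the cell size should be no finer than the use case actually needs, and the batch interval bounds how stale a value can be.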