Absolutely. And beyond weight regularization, for any weighted sum fed into a sigmoid or other squashing function, large weights simply tend to saturate that function. Once it saturates, the gradient becomes very small (quickly effectively zero), so there is little to gain from increasing the weight value past that point.
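To make the saturation point concrete, here is a minimal sketch, assuming a single scalar weight w, a fixed input x = 1, and the logistic sigmoid as the squashing function. The helper names (sigmoid, grad_wrt_weight) are just illustrative, not from any particular library.

```python
import math

def sigmoid(z):
    """Logistic squashing function."""
    return 1.0 / (1.0 + math.exp(-z))

def grad_wrt_weight(w, x=1.0):
    """Derivative of sigmoid(w * x) with respect to w: x * s * (1 - s)."""
    s = sigmoid(w * x)
    return x * s * (1.0 - s)

# Assumed setup: one weight, input fixed at x = 1.
# As w grows, the output approaches 1 and the gradient collapses toward zero.
for w in (0.0, 2.0, 5.0, 10.0, 20.0):
    print(f"w={w:5.1f}  output={sigmoid(w):.6f}  grad={grad_wrt_weight(w):.2e}")
```

The gradient peaks at 0.25 when w = 0 and shrinks roughly exponentially as w grows, so a gradient step barely changes the output once the unit is saturated.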