You’ve made this assertion multiple times, but so far it’s been entirely unsupported by evidence, despite TFA having made the full code available for you to test your hypothesis on.
import numpy as np
import time
# 1,000,000 random 2-D points and a query point
vals = np.random.randn(1000000, 2)
point = np.array([.2, .3])

# Time the per-row Python loop
s = time.time()
for x in vals:
    np.linalg.norm(x - point) < 3
a = time.time() - s

# Time the single vectorized call over all rows
s = time.time()
np.linalg.norm(vals - point, axis=1) < 3
b = time.time() - s

print(a / b)  # loop time / vectorized time
The vectorized version comes out ~296x faster, significantly faster than the solution in the article. And my assertion is backed by nearly 20 years of numpy being a leading tool in various quantitative fields. It’s not hard to imagine that a ubiquitous tool that’s been used and optimized for almost 20 years is actually pretty good when used properly.
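For what it’s worth, there’s likely still headroom beyond this: since the norm is only ever compared against a threshold, you can compare squared distances and skip the square root entirely. A minimal sketch, not from the article (einsum is just one way to get row-wise squared norms):

import numpy as np

vals = np.random.randn(1000000, 2)
point = np.array([.2, .3])

# Same boolean result as np.linalg.norm(vals - point, axis=1) < 3,
# but comparing squared distance against the squared radius (9)
# avoids a million square roots.
diff = vals - point
within = np.einsum('ij,ij->i', diff, diff) < 9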