There is already an API that does this today. The Microsoft Cognitive Services Computer Vision API detects words, phrases, and lines, and includes bounding box information indicating where each piece of text was found. See https://azure.microsoft.com/en-us/services/cognitive-service....
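
To give a rough idea of what you get back, here is a minimal sketch of pulling words and their boxes out of an OCR-style JSON response. I'm assuming a layout of regions containing lines containing words, each with a "boundingBox" string of "left,top,width,height"; the sample payload and the helper names (parse_box, iter_words) are made up for illustration, so check the actual API docs for the real response shape before relying on this.

```python
import json

# Hypothetical sample payload (not real API output). Assumed layout:
# regions -> lines -> words, each with "boundingBox": "left,top,width,height".
SAMPLE = """
{
  "regions": [
    {
      "boundingBox": "20,10,300,60",
      "lines": [
        {
          "boundingBox": "28,16,262,41",
          "words": [
            {"boundingBox": "28,16,120,41", "text": "HELLO"},
            {"boundingBox": "160,16,130,41", "text": "WORLD"}
          ]
        }
      ]
    }
  ]
}
"""


def parse_box(box):
    """Turn 'left,top,width,height' into a tuple of four ints."""
    left, top, width, height = (int(v) for v in box.split(","))
    return left, top, width, height


def iter_words(response):
    """Yield (text, (left, top, width, height)) for every word found."""
    for region in response.get("regions", []):
        for line in region.get("lines", []):
            for word in line.get("words", []):
                yield word["text"], parse_box(word["boundingBox"])


words = list(iter_words(json.loads(SAMPLE)))
for text, box in words:
    print(text, box)
```

The same walk works at the line level if you only need whole lines: read "boundingBox" on each line and join the word texts.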