I believe such an API already exists: https://pdftables.com/ (no affiliation). W...

vic-traill · on Oct 8, 2021

I had a look. Quoting their FAQ [0]:

However, some PDFs are scanned documents, or only contain images. PDFTables doesn't perform Optical Character Recognition (OCR) to turn these images into text.

To process these kinds of documents, you will need to either enable OCR in your scanning software, or run the PDF through specialist OCR software before using PDFTables.
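
(End of quote.) For anyone who wants to script that pre-processing step, here is a rough sketch. It assumes the open-source OCRmyPDF command-line tool as the "specialist OCR software"; the FAQ doesn't name one, so that choice is mine, and the file names and helper functions are made up for illustration. It only adds a text layer to the PDF; uploading the result to PDFTables is left out.

```python
import shutil
import subprocess
from pathlib import Path


def build_ocr_command(src: Path, dst: Path) -> list[str]:
    """Return the argv for an OCRmyPDF run that adds a text layer to src."""
    # --skip-text leaves pages that already contain text untouched,
    # so mixed scanned/digital PDFs are not re-processed needlessly.
    return ["ocrmypdf", "--skip-text", str(src), str(dst)]


def add_text_layer(src: Path, dst: Path) -> bool:
    """Run OCR if the ocrmypdf executable is on PATH; return True on success."""
    if shutil.which("ocrmypdf") is None:
        return False  # tool not installed; nothing was done
    result = subprocess.run(build_ocr_command(src, dst), check=False)
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    ok = add_text_layer(Path("scanned.pdf"), Path("scanned-ocr.pdf"))
    print("OCR succeeded" if ok else "OCR did not run or failed")
```

If this works, the output file should have selectable text, which is what a table extractor that doesn't do its own OCR needs.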

[0] https://pdftables.com/faq