I've been using Azure's "Document Intelligence" thingy (prebuilt "read" model) to extract text from PDFs with pretty good results [1]. Their terminology is so bad, it's easy to dismiss the whole thing for another Microsoft pile, but it actually, like, for real, works.
[1] https://learn.microsoft.com/en-us/azure/ai-services/document...