> like convert PDF bank statements into CSV transaction files I've tried this re...

ChuckMcM · 2024-07-06T20:51:26 1720299086

It is remarkably difficult and continues to provide a good example of the limitations of LLM-based systems.

In my case, I used Perl and exploited the fact that, for a given bank, the statements are consistently formatted. Further, PDF OCR conversion responds consistently to documents with the same formatting. With this combination, it is possible to extract the characters and numbers associated with transactions from the document, and then to transform those extracted bundles of text into lines for a CSV file.

The caveat is that it works for only that bank, that "kind" of account (usually checking, credit card, or savings), and when using that specific document OCR tool. Within those constraints it is eminently reliable but utterly non-transferable to a general case.
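
To make the shape of that concrete, here is a minimal sketch in Python (not the Perl I actually used). Everything specific in it is an assumption for illustration: the `MM/DD  description  amount` line layout, the `TXN_RE` pattern, the function names, and the sample text are all hypothetical, and a real statement from a real OCR tool would need its own pattern.

```python
import csv
import io
import re

# Hypothetical line layout for one bank's OCR output (an assumption):
#   MM/DD  DESCRIPTION  AMOUNT    e.g. "07/03  COFFEE SHOP #12   -4.50"
TXN_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2})\s+(?P<desc>.+?)\s+(?P<amount>-?\$?[\d,]+\.\d{2})$"
)


def parse_transactions(ocr_text):
    """Return (date, description, amount) tuples for lines matching TXN_RE."""
    rows = []
    for line in ocr_text.splitlines():
        m = TXN_RE.match(line.strip())
        if m is None:
            continue  # skip headers, footers, balance lines, and other noise
        # Normalize the amount: drop the currency sign and thousands separators.
        amount = m.group("amount").replace("$", "").replace(",", "")
        rows.append((m.group("date"), m.group("desc"), amount))
    return rows


def to_csv(rows):
    """Serialize the parsed rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["date", "description", "amount"])
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample standing in for OCR output; not taken from any real bank.
SAMPLE = """\
ACME BANK  Checking Statement
Date   Description          Amount
07/01  PAYROLL DEPOSIT      2,150.00
07/03  COFFEE SHOP #12      -4.50
Ending balance: 2,145.50
"""

if __name__ == "__main__":
    print(to_csv(parse_transactions(SAMPLE)), end="")
```

The whole thing hinges on that one regular expression, which is exactly why it does not transfer: a different bank, account type, or OCR tool shifts the columns and the pattern stops matching.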