I think I take something different away from the article, yes tokenizers are imp...

SEGyges · 2024-10-23T17:00:24 1729702824

You might have better luck giving the LLM the original document and having it generate its own OCR independently, then asking it to tiebreak between its own generation and the OCR output, with the image still in the context window, until it is satisfied that it got things right.
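A rough sketch of that loop, in case it helps. Everything named here is hypothetical: `ask_model` stands in for whatever vision-capable LLM call you use (assumed to take a prompt plus image bytes and return text), and `reconcile_ocr`, `max_rounds`, and the prompts are made up for illustration. It also assumes stateless calls, so the image is resent on every turn as a stand-in for keeping it in the context window.

```python
from typing import Callable

# Hypothetical model interface: (prompt, image_bytes) -> text.
ModelFn = Callable[[str, bytes], str]


def reconcile_ocr(image: bytes, ocr_text: str, ask_model: ModelFn,
                  max_rounds: int = 3) -> str:
    """Have the model transcribe on its own, then tiebreak against the OCR output."""
    # Step 1: independent transcription; the model does not see the OCR text yet.
    current = ask_model("Transcribe this document exactly.", image)

    # If both sources already agree, there is nothing to tiebreak.
    if current.strip() == ocr_text.strip():
        return current

    # Step 2: repeated tiebreak, with the image passed in on every call.
    for _ in range(max_rounds):
        prompt = (
            "Two transcriptions of the attached image disagree.\n"
            f"Yours:\n{current}\n\nOCR engine:\n{ocr_text}\n\n"
            "Check each difference against the image and return the "
            "corrected full transcription only."
        )
        revised = ask_model(prompt, image)
        # Treat "no further changes" as the model being satisfied.
        if revised.strip() == current.strip():
            return revised
        current = revised

    # Give up after max_rounds and return the latest revision.
    return current
```

Using "the answer stopped changing" as the stopping rule is just one way to read "until it is satisfied"; you could also ask the model for an explicit done/not-done flag.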

7thpower · 2024-10-23T18:23:28 1729707808

This is interesting. What types of content are you using this approach on, and how does it handle semi-structured data? For instance, embedded tables.