
Founder of Extend (https://www.extend.ai/) here. It's a great question, and thanks for the tag. There definitely are a lot of document processing companies, but it's a large market and more competition is always better for users.

In this case, the Reducto team seems to have cloned us down to the small details [1][2], which is a bit disappointing to see. But imitation is the best form of flattery I suppose! We thought deeply about how to build an ergonomic configuration experience for recursive type definitions (which is deceptively complex), and concluded that a recursive spreadsheet-like experience would be the best form factor (which we shipped over a year ago).

> "How do you see the space evolving as LLMs commoditize PDF extraction?"

Having worked with a ton of startups & F500s, we've seen that there's still a large gap for businesses in going from raw OCR outputs -> document pipelines deployed in prod for mission-critical use cases. LLMs and VLMs aren't magic, and anyone who goes in expecting 100% automation is in for a surprise.

The prompt engineering / schema definition is only the start. You still need to build and label datasets, orchestrate pipelines (classify -> split -> extract), detect uncertainty and correct with human-in-the-loop, fine-tune, and a lot more. You can certainly get close to full automation over time, but it takes time and effort — and that's where we come in. Our goal is to give AI teams all of that tooling on day 1, so they hit accuracy quickly and focus on the complex downstream post-processing of that data.
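To make the classify -> split -> extract flow with human-in-the-loop a bit more concrete, here's a rough sketch. Every name in it (`classify`, `split`, `extract`, `run_pipeline`, the 0.8 threshold, the keyword rules) is made up for illustration and is not any vendor's API; a real pipeline would call models where this uses toy string checks.

```python
from dataclasses import dataclass

# All names and rules here are hypothetical stand-ins, for illustration only.


@dataclass
class Extraction:
    doc_type: str
    fields: dict[str, str]
    confidence: float  # 0.0 to 1.0; a real system would get this from its model


def classify(page: str) -> str:
    """Toy classifier: a real one would call a model."""
    return "invoice" if "invoice" in page.lower() else "other"


def split(pages: list[str]) -> list[list[str]]:
    """Group consecutive pages of the same type into one sub-document."""
    groups: list[list[str]] = []
    last_type = None
    for page in pages:
        page_type = classify(page)
        if groups and page_type == last_type:
            groups[-1].append(page)
        else:
            groups.append([page])
        last_type = page_type
    return groups


def extract(doc: list[str]) -> Extraction:
    """Toy extractor: collects 'key: value' lines; confidence is a placeholder."""
    fields: dict[str, str] = {}
    for page in doc:
        for line in page.splitlines():
            if ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip().lower()] = value.strip()
    confidence = 0.9 if "total" in fields else 0.4
    return Extraction(classify(doc[0]), fields, confidence)


def run_pipeline(pages: list[str], threshold: float = 0.8):
    """Route confident results to automation, the rest to human review."""
    auto, review = [], []
    for doc in split(pages):
        result = extract(doc)
        (auto if result.confidence >= threshold else review).append(result)
    return auto, review
```

Calling `run_pipeline(["Invoice\nTotal: 100", "Invoice page 2\nVendor: Acme", "Cover letter"])` puts the two invoice pages into one auto-accepted result and sends the cover letter to the review queue. The threshold is where most of the tuning effort tends to go.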

[1] https://dub.sh/ojv9b7p

[2] https://dub.sh/X7GFlDd



Hey, we've never used or even attempted to use your platform. Respectfully, I think you know that, and that you also know that your team has tried to get access to ours using personal Gmail accounts dating back to 2024.

A schema builder with nested array fields has been part of our playground (and nearly every structured extraction solution) for a very long time and is just not something that we even view as a defining part of the platform.


Thanks for the reply. Not sure what you're referring to, but I don't believe we've ever copied or taken inspo from you guys on anything — but please do let me know if you feel otherwise.

It's not a big deal at the end of the day, and I'm excited to see what we can both deliver for customers. Congrats on the launch!


Two YC companies openly fighting and accusing each other. Not a good look and I'm surprised that you haven't been reprimanded yet.


I'm completely impartial here - seems like there are only so many ways you can design a schema builder?


I agree. I don't know either company, but a schema builder is a very common feature in many data platforms, nested or otherwise. Neither is claiming this is a big deal though.


I've used Instabase before, which has had the same UX for years. What about benchmarks between the two on extraction performance?



