
In the problem space of “reading data from tables in a manner that can be easily integrated and scaled within a broader semantic processing system”... I would assume that “reading data from tables” isn’t the hard part.



You would assume correctly. The core issue is that you can't recover the meaning of a table and its values from the text of the cells alone. A table's layout (which header a value sits under, which row it belongs to) conveys a great deal of the meaning.
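To make that concrete, here's a minimal sketch in Python. The table contents and the function names (`flatten`, `cell_facts`) are made up for illustration and aren't taken from any real system. It contrasts reading the cells as a plain token stream with reading them by position:

```python
# Hypothetical example data: a tiny table of the kind found in a maintenance manual.
grid = [
    ["Bolt", "Torque (Nm)", "Interval (h)"],
    ["M6", "10", "500"],
    ["M8", "25", "250"],
]


def flatten(grid):
    """Read cells in reading order as plain text, discarding row/column positions."""
    return " ".join(cell for row in grid for cell in row)


def cell_facts(grid):
    """Use the layout: pair each value with its row header and its column header."""
    header, *rows = grid
    facts = []
    for row in rows:
        row_key = row[0]
        for col_name, value in zip(header[1:], row[1:]):
            facts.append((row_key, col_name, value))
    return facts


flat_text = flatten(grid)   # one undifferentiated string of tokens
facts = cell_facts(grid)    # (row header, column header, value) triples

print(flat_text)
for fact in facts:
    print(fact)
```

In the flattened string nothing says whether "25" is a torque or an interval, or which bolt it belongs to. The triples only exist because the grid positions were kept, and that's the easy case, where the grid is already known.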

I remember looking at a couple of systems that tried to do visual zonal tagging of a table, but I think the challenge there was how to logically integrate the zonal tags into the broader semantic processing of the surrounding text.

Not being able to extract information from tables is a huge stumbling block for semantic and NLP systems in the many use cases that involve technical content. Automating patent research is one I looked at 6 or 7 years ago, and tables tanked the concept. Semantic search over digitized maintenance manuals is another use case I've wrestled with, and it's a tough nut to crack if the underlying manuals aren't available in a structured schema.



