
Have fun going down the rabbit hole of parsing structured data from ingredient phrases. Had a fun few weeks with that!


Yeah, it has not been trivial so far! Let me know if you have any tips you can share.


I settled on a regex-based approach with lots of data clean-up and normalisation up front. Example on my site: [4]
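To give a rough idea of what "normalise first, then regex" can look like, here is a minimal sketch in Python. It is not the code behind the site. The unit table, the fraction map, and the names `normalise` and `parse_ingredient` are made up for illustration, and a real parser needs far more units and edge cases.

```python
import re
from fractions import Fraction

# Hypothetical lookup tables; real ones would be much longer.
VULGAR_FRACTIONS = {"½": "1/2", "¼": "1/4", "¾": "3/4", "⅓": "1/3", "⅔": "2/3"}
UNITS = {
    "cup": "cup", "cups": "cup",
    "tbsp": "tbsp", "tablespoon": "tbsp", "tablespoons": "tbsp",
    "tsp": "tsp", "teaspoon": "tsp", "teaspoons": "tsp",
    "g": "g", "gram": "g", "grams": "g",
    "oz": "oz", "ounce": "oz", "ounces": "oz",
}

# Mixed number ("1 1/2"), plain fraction ("1/2"), or integer/decimal ("2", "2.5").
# Order matters: the longer alternatives must be tried first.
QUANTITY = re.compile(
    r"^(?P<qty>\d+\s+\d+/\d+|\d+/\d+|\d+(?:\.\d+)?)\s*(?P<rest>.*)$"
)


def normalise(phrase: str) -> str:
    """Clean-up pass so the regex only has to handle one shape of input."""
    text = phrase.lower()
    for char, ascii_fraction in VULGAR_FRACTIONS.items():
        # Leading space turns "1½" into "1 1/2" rather than "11/2".
        text = text.replace(char, f" {ascii_fraction}")
    text = re.sub(r"\([^)]*\)", " ", text)  # drop parenthetical notes
    return re.sub(r"\s+", " ", text).strip()


def parse_ingredient(phrase: str) -> dict:
    """Split a phrase into quantity, unit, and name (each may be missing)."""
    text = normalise(phrase)
    quantity = None
    match = QUANTITY.match(text)
    if match:
        # "1 1/2" -> Fraction(1) + Fraction(1, 2) -> 1.5
        quantity = float(sum(Fraction(part) for part in match["qty"].split()))
        text = match["rest"]
    first, _, remainder = text.partition(" ")
    unit = UNITS.get(first.rstrip("."))
    # If the first word is not a known unit, it belongs to the name.
    name = remainder if unit else text
    return {"quantity": quantity, "unit": unit, "name": name}


print(parse_ingredient("1½ cups All-Purpose Flour (sifted)"))
# {'quantity': 1.5, 'unit': 'cup', 'name': 'all-purpose flour'}
```

Doing the clean-up before matching keeps the regex itself small. Phrases without a quantity or unit (for example "Salt to taste") fall through with `None` in those fields instead of failing.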

Some other approaches I spent a lot of time on:

* Extracting Structured Data From Recipes Using Conditional Random Fields [1]

* Chef Watson [2]

* Ingredient Parser - Model Guide [3]

[1] https://archive.nytimes.com/open.blogs.nytimes.com/2015/04/0...

[2] https://blog.kitchenpc.com/2011/07/06/chef-watson/

[3] https://ingredient-parser.readthedocs.io/en/latest/guide/ind...

[4] https://pretty-recip.es/recipe?recipe-url=https%3A%2F%2Fwww....


Maybe GPT-4 or some other LLM? Maybe too expensive, but I would think they'd be able to accomplish the technical task.



