
Oh yes, it's been good for a while. When we built our Android-use[1] tool (like computer use), it was the cheapest and best option among OpenAI, Claude, Llama, etc.

We have a planner phase followed by a "finder" phase, where vision models are used. Below is a summary of our findings for each phase. Some models are marked "work in progress" because they do not support tool calling (or are extremely bad at it).

  +------------------------+------------------+------------------+
  | Models                 | Planner          | Finder           |
  +------------------------+------------------+------------------+
  | Gemini 1.5 Pro         | recommended      | recommended      |
  | Gemini 1.5 Flash       | can use          | recommended      |
  | OpenAI GPT-4o          | recommended      | work in progress |
  | OpenAI GPT-4o mini     | recommended      | work in progress |
  | Llama 3.2 latest       | work in progress | work in progress |
  | Llama 3.2 vision       | work in progress | work in progress |
  | Molmo 7B-D-4bit        | work in progress | recommended      |
  +------------------------+------------------+------------------+
1. https://github.com/BandarLabs/clickclickclick
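
For anyone who wants to mix models across the two phases, here is a minimal sketch of the table as data, with a helper that lists the models rated "recommended" for a given phase. The dict layout and the function names are made up for illustration and are not taken from the clickclickclick repo.

```python
# Ratings copied from the table above.
# SUPPORT, recommended_for and recommended_for_both are hypothetical names,
# not part of the clickclickclick codebase.
SUPPORT = {
    "Gemini 1.5 Pro": {"planner": "recommended", "finder": "recommended"},
    "Gemini 1.5 Flash": {"planner": "can use", "finder": "recommended"},
    "OpenAI GPT-4o": {"planner": "recommended", "finder": "work in progress"},
    "OpenAI GPT-4o mini": {"planner": "recommended", "finder": "work in progress"},
    "Llama 3.2 latest": {"planner": "work in progress", "finder": "work in progress"},
    "Llama 3.2 vision": {"planner": "work in progress", "finder": "work in progress"},
    "Molmo 7B-D-4bit": {"planner": "work in progress", "finder": "recommended"},
}


def recommended_for(phase):
    """Return the models rated 'recommended' for a phase, in table order."""
    if phase not in ("planner", "finder"):
        raise ValueError(f"unknown phase: {phase!r}")
    return [model for model, rating in SUPPORT.items() if rating[phase] == "recommended"]


def recommended_for_both():
    """Return the models rated 'recommended' for both planner and finder."""
    finder_ok = set(recommended_for("finder"))
    return [model for model in recommended_for("planner") if model in finder_ok]


if __name__ == "__main__":
    print("planner:", recommended_for("planner"))
    print("finder: ", recommended_for("finder"))
    print("both:   ", recommended_for_both())
```

Since the phases are rated separately, one option is a planner from one list paired with a finder from the other, e.g. GPT-4o for planning and Molmo for finding.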

