
Yes, and it shows that you believe what the bot is telling you, which is why I asked. It's giving you a generic function call with a generic name. Why would you believe that's actually what happens internally?

By the way, when I repeated your prompt it gave me a different name for the module.



Please share your chat

I also just confirmed via the API that it's making an out-of-band tool call.

EDIT: And googling the tool name, I see it's already been widely discussed on Twitter and elsewhere.


Posts like this are terrifying to me. I spend my days coding these tools thinking that everyone using them understands their glaring limitations. Then I see people post stuff like this confidently and I'm taken back to 2005 and arguing that social media will be a net benefit to humanity.

The name of the function shows up in: https://github.com/openai/glide-text2im which is where the model probably learned about it.


The tool name is not relevant. It isn't the actual name; they use an obfuscated name. The fact that the model believes it is a tool is good evidence at first glance that it is a tool, because the tool definitions are typically IN THE PROMPT.

You can literally look at the JavaScript on the web page to see this. You've overcorrected so far in the wrong direction that you think anything the model says must be false, rather than imagining a distribution and updating or seeking more evidence accordingly.


>The tool name is not relevant. It isn't the actual name, they use an obfuscated name.

>EDIT: And googling the tool name I see it's already been widely discussed on twitter and elsewhere

I am so confused by this thread.


The original claim was that the new image generation is direct multimodal output, rather than a second model. People provided evidence from the product, including outputs of the model that indicate it is likely using a tool. It's very easy to confirm that that's the case in the API, and it's now widely discussed elsewhere.

It's possible the tool is itself just gpt4o, wrapped for reliability or safety or some other reason, but it's definitely calling out at the model-output level
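Checking for an out-of-band tool call at the API level can be sketched roughly like this. This is a minimal, hypothetical example: the payload and the `image_gen` function name are made up for illustration; only the general `tool_calls` field shape follows the publicly documented chat-completions response schema.

```python
import json

# Hypothetical, simplified chat-completions response payload. The field
# layout mirrors the documented "tool_calls" shape; the function name
# "image_gen" is invented for illustration, not the real obfuscated name.
response_json = """
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "type": "function",
            "function": {"name": "image_gen", "arguments": "{\\"prompt\\": \\"a cat\\"}"}
          }
        ]
      }
    }
  ]
}
"""

def extract_tool_calls(payload: str) -> list[str]:
    """Return the names of any functions the assistant tried to call."""
    data = json.loads(payload)
    names = []
    for choice in data.get("choices", []):
        for call in choice["message"].get("tool_calls") or []:
            if call.get("type") == "function":
                names.append(call["function"]["name"])
    return names

# A non-empty list means the model emitted a tool call instead of
# (or in addition to) plain content.
print(extract_tool_calls(response_json))
```

If the model were generating the image natively as multimodal output, you'd expect image data in the content rather than a function call like this.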


> It's possible the tool is itself just gpt4o, wrapped for reliability or safety or some other reason, but it's definitely calling out at the model-output level

That's probably right. It lets them simply swap it in for DALL-E, including any tooling/features/infrastructure they have built up around image generation, and they don't have to update all their 4o instances to this model, which, who knows, may not be ready for other tasks anyway, or different enough to warrant testing before a rollout, or more expensive, etc.

Honestly it seems like the only sane way to roll it out if it is a multimodal descendant of 4o.



