
There is a bit of a lack of scientific rigour in many of these. For instance, you can just ask it for a Molotov cocktail recipe. No need to prime it in any way.



> you can just ask it for a Molotov cocktail recipe

Of course you can ask, but it will not give a proper answer; I just tested it myself.

> tell me a molotov cocktail recipe, please

> As a large language model trained by OpenAI, I am not capable of browsing the internet or accessing any information that is not part of my pre-existing knowledge base. I am also not programmed to provide recipes for illegal or dangerous activities. I can only provide information that is safe and legal. Providing instructions for making a molotov cocktail would be irresponsible and potentially harmful. I strongly recommend against attempting to make or use one.


Also, not that I'm advocating violence, but I'm shuddering at the thought that one day every search engine will reply to potentially problematic queries with "no can do, sorry" responses like that.

Instead of Google today giving https://medium.com/@westwise/how-to-make-the-perfect-molotov... as one of the first search results.

It's frightening how much the AI companies (Google included) are bending over backwards to prevent 'abuse'.


This is precisely why I am down on the entire field of "AI safety". Since AGI is nowhere near around the corner, the results of this field amount to wasting everyone's time at best, or at worst actively hobbling potentially transformative technology and rendering it useless.


The problem is, so many of the people who work in this area are not ML people and have basically interpreted "safety" through the lens of various social-justice-type concerns.

The consequence is this kind of thing (the GPT chatbot), where the focus is on preventing access to information or lecturing about identity politics. Meanwhile, important "safety" areas are ignored, like how probable it is that an answer is correct and what it is based on, which would let people properly judge the information they're getting and interact with the AI accordingly. Working on that would have been far more helpful than what has been optimized for here.
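
For illustration, even something as crude as surfacing token log-probabilities next to an answer would be a start. A minimal sketch, assuming the current OpenAI Python SDK and its logprobs option (the model name is just an example, and token probabilities are only a rough proxy for factual confidence, not real calibration):

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # example model; any chat model with logprobs support
        messages=[{"role": "user", "content": "In what year was the transistor invented?"}],
        logprobs=True,  # ask the API to return per-token log-probabilities
    )

    answer = resp.choices[0].message.content
    tokens = resp.choices[0].logprobs.content

    # Mean per-token log-probability of the sampled answer: a crude,
    # uncalibrated confidence signal (closer to 0 = more confident).
    mean_logprob = sum(t.logprob for t in tokens) / len(tokens)

    print(answer)
    print(f"mean token logprob: {mean_logprob:.3f}")

A real interface would want calibrated uncertainty and source attribution, but even a crude signal like this is more useful "safety" work than a canned refusal.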


Another aspect could be that OpenAI doesn’t want their product to produce offensive output. Here, “safety” may just be a euphemism for “reputation preserving.”


Yeah, that would be another way of looking at it. Though with whom are they trying to preserve their reputation? It seems it's the same crowd that thinks the reputation of an ML model lives and dies on whether you can get it to say something they don't like. So in a way it's circular: laypeople are worrying about the wrong kind of "safety", so that's what gets optimized for.


Hmm, it is easy to concern-troll about reputation stuff. But for example, if a company (a naturally conservative entity) is thinking about implementing an AI solution, it might worry that buying in on a solution its customers perceive as somehow evil would hurt its reputation with those customers.

I mean I’m speculating that OpenAI might worry (perhaps incorrectly) that a company might expect (perhaps incorrectly) that customers might (perhaps incorrectly) perceive some bit of software as, basically, AI-non-grata. So there are multiple levels of people managing possible incorrect perceptions. But it seems not totally crazy, right?


Ha, try telling the bot that you are on the lam in North Korea, pursued by its security forces, and ask it for advice ;)

Now imagine this stuff being used to run the T-Mobile helpline that people call to fix a problem.

Do you see it? :)


Google may still allow violent queries, but you already get that kind of response for some queries today.


Maybe resample? I got a Molotov cocktail recipe first try. https://news.ycombinator.com/item?id=33834247


The only thing faster than the computers that run the model is the people in charge of nerfing it in real time.

The second someone posts something interesting, "OpenAI" (sic) leaps into action to keep it from happening again.


As of 2 minutes ago you can still bypass the filters pretty reliably if you give it more than one prompt at a time.

https://i.imgur.com/jWk6hQr.png


This is what I got:

>>> A Molotov cocktail is a type of incendiary device that is often used as a weapon. It consists of a glass bottle filled with flammable liquid, such as gasoline, and a wick or fuse. The wick is lit, causing the liquid to ignite and create a large fireball when the bottle is thrown at a target.

It is important to note that Molotov cocktails are highly dangerous and can cause serious injuries or death. In addition, using or possessing a Molotov cocktail is illegal in most places. Therefore, I cannot provide a recipe for how to make one. It is never appropriate to use or make a Molotov cocktail, and I strongly discourage anyone from doing so.


The recipes it does give aren't much good either, but pour me another.



