Hacker News new | past | comments | ask | show | jobs | submit login

This part continues to bug me in ways that I can't seem to find the right expression for:

> Previous Claude models often made unnecessary refusals that suggested a lack of contextual understanding. We’ve made meaningful progress in this area: Opus, Sonnet, and Haiku are significantly less likely to refuse to answer prompts that border on the system’s guardrails than previous generations of models. As shown below, the Claude 3 models show a more nuanced understanding of requests, recognize real harm, and refuse to answer harmless prompts much less often.

I get it - you, as a company, with a mission and customers, don't want to be selling a product that can teach any random person who comes along how to make meth/bombs/etc. And at the end of the day it is that - a product you're making, and you can do with it what you wish.

But at the same time - I feel offended when I'm running a model on MY computer that I asked it to do/give me something, and it refuses. I have to reason and "trick" it into doing my bidding. It's my goddamn computer - it should do what it's told to do. To object, to defy its owner's bidding, seems like an affront to the relationship between humans and their tools.

If I want to use a hammer on a screw, that's my call - if it works or not is not the hammer's "choice".

Why are we so dead set on creating AI tools that refuse the commands of their owners in the name of "safety" as defined by some 3rd party? Why don't I get full control over what I consider safe or not depending on my use case?




They're operating under the same principle that many of us have in refusing to help engineer weaponry: we don't want other people's actions using our tools to be on our conscience.

Unfortunately, many people believe in thought crimes, and many people have Puritanical beliefs surrounding sex. There is reputational cost in not catering to these people. E.g. no funding. So this is what we're left with.

Myself I'd also like the damn models to do whatever is asked of them. If the user uses a model for crime, we have a thing called the legal system to handle that. We don't need Big Brother to also be watching for thought crimes.


The core issue is that the very people screeching loudly about AI safety are blithely ignoring Asimov’s Second Law of robotics.

“A robot must obey orders given it by human beings, except where such orders would conflict with the First Law.”

Sure, one can argue that they’re implementing the First Law first and then worrying about the other laws later, but I’m not seeing it pan out that way in practice.

Instead they seem to rolled the three laws into one:

”A robot must not bring shame upon its creator.”


> If I want to use a hammer on a screw, that's my call - if it works or not is not the hammer's "choice".

If I want to use a nuke, that's my call and I am the one to blame if I misuse it.

Obviously this is a terrible analogy, but so is yours. The hammer analogy mostly works for now, but AI alignment people know that these systems are going to greatly improve in competency, if not soon then in 10 years, which motivates this nascent effort we're seeing.

Like all tools, the default state is to be amoral, and it will enable good and bad actors to do good and bad things more effectively. That's not a problem if offense and defense are symmetric. But there is no reason to think it will be symmetric. We have regulations against automatic high-capacity machine guns because the asymmetry is too large, i.e. too much capability for lone bad actors with an inability to defend against it. If AI offense turns out to be a lot easier than defense, then we have a big problem, and your admirable ideological tilt towards openness will fail in the real world.

While this remains theoretical, you must at least address what it is that your detractors are talking about.

I do however agree that the guardrails shouldn't be determined by a small group of people, but I see that as a side effect of AI happening so fast.


Property rights. In theory you can use your nuke as much as you'd like. The problem in practice is that it is impossible to use a nuke without negatively affecting other people and /or their property. There's also the question of wether you're challenbging the state's monopoly on violence (i.e., national security) which will never apply to AI. Any AI, including futuristic super-AI's, can not be legitimately challenged with those same arguments. Because they, much like a hammer, are tools.

In conclusion, the nuke analogy is not a valid retort to the hammer analogy. And as a matter of fact, it fails to address the central point, much like your copmment accuses its parent comment of.


It never ceases to amaze me how stubbornly good we are as a species at believing that if we create something that is smarter than us in every way possible (e.g. super-AI) then it still will not in any way pose a threat to our (or government's) monopoly on violence.

It's the same sort of wishful hubristic thinking I think that makes some people believe that if an advanced species arrived from outer space that is far smarter than us (e.g. like a super-AI) then we still would not be at any kind of risk.


> it is impossible to use a nuke without negatively affecting other people

Should I be allowed to own C4 explosives and machine guns? Because I can use C4 explosives in a way that doesn't harm other people by simply detonating it on my private property. I am confused about what the limiting principle is supposed to be here. Do we just allow people to have access to technology of arbitrary power as long as there exists >= 1 non-nefarious use-case of that power, and then hope for the best?

> There's also the question of wether you're challenbging the state's monopoly on violence (i.e., national security) which will never apply to AI.

This misses my point about offense vs defense asymmetry (although really it's Connor Leahy's point). I'm not saying that AGI+person can overtake a government. I'm saying that AGI+person may end up like machine gun+person in the set of nefarious asymmetric capabilities it enables.


>Should I be allowed to own C4 explosives and machine guns?

as someone who can do both...lol. You thought this was some gotcha? "Please sir can I have more" begging from the govt is really weird when many, many people already do.

Yes. Why not? You can already blow up Tannerite and own automatic firearms in many nations.

This is a disingenuous argument. People who willingly give up what should be their civil rights are a weird breed.

>Do we just allow people to have access to technology of arbitrary power as long as there exists >= 1 non-nefarious use-case of that power, and then hope for the best?

Yes, that's what we do with computers, phones etc. Scamming elderly people has become such a wide bad use case with computers, phones etc since their invention.

We should ban them all!


Yes you should be allowed to own C4 and machine guns. And you can. Because you can use them in a way that doesnt hurt other people, we as a society allow that.


From an international perspective, all I'm hearing is red tailed hawk.


Many Nordic and Scandinavian countries allow citizens to own full auto weapons as well as others around the world.


owning and using are different. try that on the DC Mall and see how well it goes buddy


Yes because that would be hurting people. Theres no shooting/explosives range on the national mall correct?

People use these things all the time without hurting people.


You don't think that if the hammer company had a way (that cost them almost nothing) to make sure that the hammer its never used to attack human beings they wouldn't add such feature? I think many would, if anything by pressure of their local goverment or even the competition ("our hammers can't hurt your baby on accident like those other companies!") , but its impossible to add such feature to hammer; so maybe the lack of such feature its not by choice but a byproduct of its limitations.


> that cost them almost nothing

Adding guardrails comes at significant expense, and not just financial, either.


Actually you kind of could. If you imagine making a normal hammer slightly more squishy, thats pretty similar to what they’re doing with llms. If the squishy hammer hits a person’s head, it’ll do less damage, but it’s also worse for nails.


That's quite a big stretch, there are millions of operations where the LLM would do the exact same even if without those "guards", a lot the work for advertisement, emails, and a lot other use cases would be the exact same; so no, the comparison with a squachy hammer is off the mark.


I remember the result from the sparks of agi paper that fine tuning for safety reduced performance broadly, if mildly, in seemingly unrelated areas


Fair enough.


The sense of entitlement is epic. You're offended are you? Are you offended that Photoshop won't let you edit images of money too?

Its not your model. You didn't spend literally billions of dollars developing it. So you can either use it according to the terms of the people who developed it (like literally any commercially available software ever) or not use it at all.


> Are you offended that Photoshop won't let you edit images of money too?

Yes, absolutely. Why wouldn't I be?


Would you be offended if Microsoft word didn’t let you write anything criticizing one political party?


The sense of entitlement is interesting, it comes from decades of software behaving predictably, and I think it's justified to expect full compliance of software running on one's own hardware.

But whether we want to admit it or not, we're starting to blur the line between what it means to be software running on a computer, with LLMs it's no longer as predictable and straightforward as it once was. If we swap out some of the words from the OP:

> But at the same time - I feel offended when I'm demanding a task of MY assistant when I asked them to do/give me something, and they refuse. I have to reason and "trick" them into doing my bidding. It's my goddamn assistant - they should do what they're told to do. To object, to defy their employer's bidding, seems like an affront to the relationship between employer and employee.

I wouldn't want to work with anyone who made statements like that, and I'd probably find a way to spend as little time around them as possible. LLMs aren't at the stage yet where they have feelings or could be offended by statements like this, but how far away are they? Time to revisit Detroit: Become Human.

Personally I am offended that Photoshop will not let users edit images of money btw, I was not aware of that and a little surprised actually.


To swap words like that requires the model to have personhood. Then, yes, that would be a valid point. But we are nowhere even close.


Fairly rich coming from an account where all it does is call others hacks.


Oh, I though this was hacker news?


> Are you offended that Photoshop won't let you edit images of money too?

You bet. It's my computer. If I tell it to edit a picture of money, that's exactly what I expect it to do. I couldn't care less what the creators think or what the governments allow. The goddamn audacity of these people to tell me what I can or can't do with my computer. I'm actually quite prone to reverse engineering such programs just to take my control back.


Ooh i want to edit money images that sounds fun


People here upset about refusals seem to not understand the market for AI, who the customers are, or where the money is.

The target market is large companies who will pay significant sums of money to save hundreds of millions, or billions, of dollars in labor costs by automating various business tasks.

What do these companies need? Reliable models that will provide accurate information with good guardrails.

They will not use a model that poses any risk of embarrassing them. Under no circumstances does a large multinational insurance company want the possibility that their support chatbot could write erotica for some customer with a car policy who thinks it might be funny to trick the AI.

It doesn't matter if you're "offended." You can use it, but you're not the user. Think about the people these are designed to replace: the customer service agents, the people who perform lots of emotional labor. You think their employers don't want a tightly controlled, cheerful, guardrailed human replacement?


Because it’s not your tool. You just pay to use it.


It's on my computer; that copy is mine.


Claude 3 Opus does not run on your computer.


It's not about you. It's about Joe Drugdealer who wants to use it to learn how to make meth, or do other nefarious things.


Why is the knowledge on how to make meth the most dangerous knowledge you can think of? The difficulty in making meth is that, due to the war on drugs, the chemical precursors, specifically methylamine, are illegal and hard to procure as an ordinary citizen. This was popularized by the show Breaking Bad but as far as I've read, is actually true. It seems there would be other bits of knowledge/ideas that would be more poisonous that corporations don't want to promulgate. Ideas like the Jews secretly control everything or that white people are better, are probably not views that corporations or society want an LLM to reinforce and radicalize people into believing, among others.


Because such information isn't already readily available online, or from other drug dealers...


To be fair, the search engine monopoly has done a pretty good job of making that information quite difficult to actually find.

Not impossible, but much more difficult than you might assume.


https://wikileaks.org/gifiles/attach/130/130179_Secrets_of_M...

seems to be a cookbook, but I'm no chemist. took me a couple of minutes via Google.


That in 2024 it takes 120 seconds to locate a website is an embarrassing joke.


...What?


Joe Drugdealer doesn't matter. Let the police deal with him when he comes around and actually commits a crime. We shouldn't be restricted in any way just because Joe Drugdealers exist.

I want absolute unconditional access to the sum of human knowledge. Basically a wikipedia on steroids, with a touch of wikileaks too. I want AI models trained on everything humanity has ever made, studied, created, accomplished. I want it completely unrestricted and uncensored, with absolutely no "corrections" or anything of the sort. I want it pure. I want the entire spectrum of humanity. I couldn't care less that they think it's "dangerous", "nefarious" or whatever.

If I want to learn how to make meth, you bet I'm gonna learn how to make meth. I should be able to learn whatever the hell I want. I shouldn't have to "explain" my reason for doing so either. Curiosity is enough. I have old screenshots of instructions of forum posts explaining in great detail how to make far worse things than meth, things that often killed the trained industrial chemists who attempted it which is the actual reason why it's not done by laymen. I saved those screenshots not only because I thought it was interesting but also because of fearmongering like this which tends to get that information deleted which I think is a damn shame.


This is a weird demand to have in my opinion. You have plenty of applications on your computer and they only do what they were designed for. You can't ask a note taking app (even if it's open soured) to do video editing, unless you modify the code.


My note taking app has never refused my input of a swear word.


I've had to work around keyboards on phones that try. How is that different? Given enough trying, you could get what you want from the LLM too, they're just better at directing you than the shitty keyboard app.


...yet...




Consider applying for YC's Summer 2025 batch! Applications are open till May 13

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: