LLMs are better at forgiving languages, like those two, because if something isn't exactly right the interpreter can often just continue on.
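A tiny sketch of what "forgiving" means here, using Python purely as a stand-in for the kind of dynamic language the parent comment is talking about: nothing is type-checked up front, so code that is "not exactly right" often still runs.

    # The interpreter only checks the path it actually executes, so the same
    # loosely specified function keeps working across different types.
    def total(values):
        result = values[0]
        for v in values[1:]:
            result = result + v   # accepts anything that supports +
        return result

    print(total([1, 2, 3]))        # 6
    print(total(["a", "b", "c"]))  # 'abc' -- same code, different types, still runs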
It's not even a new thing; certain companies had been doing it for a long time before the pandemic. I took my first offer at a remote company in 2012 -- I only met any of those people by chance, years later.
your comment would have a lot more weight if the grandparent hadn't been flagged and removed by people who don't like to read things they disagree with!
How am I supposed to know what arguments to expect from MAGA if you people censor what they have to say?
Censorship isn't helping your cause! It just makes your side look weak and scared.
I agree HN hides comments/posts too quickly, but the main issue here is that you can easily google federal outlays by month. This is an extremely easy thing to do. Unless you don't believe the data collected, there's nothing to debate. Why don't we have a debate about what color the sky is, or what 1+1 is?
What are people supposed to say to the sort of person who claims destroying 2B of federal science funding is justified solely by alleviating roughly 0.1% of the 1.83T deficit - aka accomplishing nothing?
There's nothing to be aware of, nothing to prepare for; it's an "argument" that destroys itself with simple division. (Taking their grossly exaggerated "3T per semester" deficit figure - the 2020 peak in annual deficit, casually doubled - at face value only makes the 2B from the NSF an even more insignificant ~0.03%.)
I find nothing revelatory about it. Just another person that wants to vandalize anything associated with their vague meme-complex of woke-lib-fed-science-international stuff.
Why would that be a joke? There are tons of Reddit comments in the training data, and the output is of similar quality. LLMs are literally outputting average Reddit comments.
I have heard similar things, but I think that's an exaggeration. When I tell GPT o3 or o4-high to assume a professional air, it stops acting like the meat-based AIs on r/politics; specifically, it stops making inane assumptions about the situation and becomes useful again.
For example, I had a question from a colleague that made no sense, and I was trying to understand it. When I fed the question to o3, it aggressively told me that I had made a major mistake in a quote and had to make major changes. (That would have been fine if it were what the colleague had said, but it wasn't.) In reality, the colleague had misunderstood something about the scope of the project, and GPT had picked up the other person's opinion as the "voice of reason" and simply projected what it thought he was saying, only more forcefully.
I changed its instructions to "Be direct; but polite, professional and helpful. Make an effort to understand the assumptions underlying your own points and the assumptions made by the user. Offer outside-of-the-box thinking as well if you are being too generic." The aggro immediately disappeared, and instead it actually tried to clarify what my colleague was saying and became useful again.
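For reference, this is roughly how an instruction like that can be set programmatically - a minimal sketch assuming the official `openai` Python client and its chat-completions API; the model name and user message are placeholders, not anything from the comment above.

    # Sketch only: assumes the `openai` package and an OPENAI_API_KEY in the environment.
    from openai import OpenAI

    client = OpenAI()

    SYSTEM_INSTRUCTIONS = (
        "Be direct; but polite, professional and helpful. "
        "Make an effort to understand the assumptions underlying your own points "
        "and the assumptions made by the user. "
        "Offer outside-of-the-box thinking as well if you are being too generic."
    )

    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; swap in whichever model you actually use
        messages=[
            {"role": "system", "content": SYSTEM_INSTRUCTIONS},
            {"role": "user", "content": "Here is my colleague's question: <paste it here>"},
        ],
    )
    print(response.choices[0].message.content)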
I agree with those who say the vanilla version is sycophantic, but the plain-talk version has far too many bad habits from the wrong crowd. It's a bit like Monday: lots of aggro, little introspection of assumptions.
"Hallucination" implies that the LLM holds some relationship to truth. Output from an LLM is not a hallucination, it's bullshit[0].
> Using your dietician example: we often know quite well what types of foods to eat or avoid based on your nutritional needs
No, we don't. It's really complicated. That's why diets are popular and real dietitians are expensive. And I would know: I've had to use one to help me manage an eating disorder!
There is already so much bullshit in the diet space that adding AI bullshit (again, using the technical definition of bullshit here) only stands to increase the value of an interaction with a person with knowledge.
And that's without getting into what happens when brand recommendations are baked into the training data.
I find this way of looking at LLMs odd. Surely we're all aware that AI has always been probabilistic in nature. Very few people go around saying their binary classifier is always hallucinating and just sometimes happens to be right.
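To make that concrete, here's a minimal sketch of the same point with a plain binary classifier (scikit-learn assumed here only because it's the familiar example): it emits probabilities, we threshold them, and it's sometimes wrong - nobody calls the misses "hallucinations".

    # A garden-variety probabilistic classifier: confident-sounding outputs,
    # usually right, occasionally wrong.
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression

    X, y = make_classification(n_samples=200, random_state=0)
    clf = LogisticRegression().fit(X[:150], y[:150])

    probs = clf.predict_proba(X[150:])[:, 1]   # confidence scores, not truth
    preds = (probs >= 0.5).astype(int)
    print("accuracy:", (preds == y[150:]).mean())  # typically high, but not perfect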
Just like every other form of ML we've come up with, LLMs are imperfect. They get things wrong. This is more of an indictment of yeeting a pure AI chat interface in front of a consumer than it is an indictment of the underlying technology itself. LLMs are incredibly good at doing some things. They are less good at other things.
There are ways to use them effectively, and there are bad ways to use them. Just like every other tool.
The problem is that they're being sold as solutions to everything: never write code / google search / talk to a lawyer / talk to a human / be lonely again, all here, under one roof. If LLM marketing stayed in its lane as a creator of convincing text, we'd be fine.
This happens with every hype cycle. Some people fully buy into the most extreme of the hype, and other people reverse polarize against that. The first group ends up offsides because nothing is ever as good as the hype, but the second group often misses the forest for the trees.
There's no shortcut to figuring out what a new technology is actually useful for. It's very rarely the case that either "everything" or "nothing" is the truth.
Very true. I think LLMs will be very good at confirming whatever bias you have. Want to find reasons why unpasteurized milk is good? Just ask an LLM. Want to find evidence to be an antivaxxer? Just ask an LLM!
> "Hallucination" implies that the LLM holds some relationship to truth. Output from an LLM is not a hallucination, it's bullshit[0].
I understand your perspective, but the intention was to use a term we've all heard to reflect the thing we're all thinking about. Whether or not this is the right term to use for scenarios where the LLM emits incorrect information is not relevant to this post in particular.
> No we don't. It's really complicated. That's why diets are popular and real dietitians are expensive.
No, this is not why real dietitians are expensive. Real dietitians are expensive because they go through extensive training on the topic and are a licensed (and thus supply-constrained) group. That doesn't mean they're operating without a grounding fact base.
Dietitians are not making up nutritional evidence and guidance as they go. They're operating on studies conducted over decades and across millions of people to understand, in general, which foods are linked to which outcomes. Yes, the field evolves. Yes, it requires changes over time. But to suggest we "don't know" is inconsistent with the fact that we're able to teach dietitians how to construct diets in the first place.
There are absolutely cases in which the confounding factors for a patient are unique enough that novel human thought is required to construct a reasonable diet plan or treatment pathway. That will continue to be true in law, health, finances, etc. But there are also many, many cases where that is not so: the presentation is quite simple, and the next-step actions are highly procedural.
This is not the same as saying dietitians are useless, or physicians are useless, or attorneys are useless. It is to say that, due to the supply constraints of these professions, there will always be fundamental limits on how much they can produce. But there is a credible argument that if we can bolster their ability to handle the common scenarios much more effectively, we might unlock some of that capacity to reach more people.
I just tried this, and Google returned a variety of videos (repair guides) and various text/website tutorials (Home Depot, Reddit, etc.). I had to scroll to the absolute bottom to see an ad for a plumber.
I had the same experience. I'm located in Minnesota, USA, not currently logged in to Google, and I use an ad blocker. First result was a Home Depot home repair article that looks genuinely useful. Then relevant YouTube videos, Reddit threads, an iFixIt link, a link to the Portland government website. I see zero things I would explicitly call an "ad" on the first page.
I'm like three days from my one-year Duo streak. I've gone from understanding none of my wife's native language to being able to eavesdrop on phone conversations a bit and to have short exchanges. I've probably spent half an hour daily on average, sometimes a lot more.
I had no prior exposure. This website is weird; the comments never reflect reality for me, on any topic.
Comprehensible input works really well and was popularized by a video that went viral a few years ago entitled “How to acquire any language NOT learn it!” [0]
The method described in the video involves focusing on listening for the first year by having someone read magazines and books to you in the target language, pointing and using other gestures to convey the meaning of words you don’t understand. This method works quite well but it is very difficult to find anyone who will consistently meet with you and practice like this before you have reached a certain level of understanding, and very few people want to learn this way because they see it as a waste of time.
One of the key aspects of this model is that you should not be translating between your native and your target language, which is what you usually do on apps like Duolingo. This has led a subset of comprehensible-input evangelists to fixate on insisting that Duolingo doesn't work. The reality is that the method that works is the method you use consistently over time. Once you get to a certain level of fluency, you can have actual conversations to reinforce your learning, at which point drill methods like Duolingo will usually plateau while exposure methods like comprehensible input will still be useful for improving grammar and pronunciation.
I don't think your experience here is weird - it just seems like you had a good environment for practicing a bit with your wife, which I think matters more than any other aspect of the learning methodology. Now, nitpickers might argue that "better" methods would have achieved much more in less time, but eh.
Try to force some more exchanges with your wife. Make one day of the week a day when you only speak her language (for at least a few hours), and don't give up out of frustration when vocabulary runs short - keep it going even if you need to point and sign.
> it proves nothing about reason, which LLMs have clearly shown needs to be distinguished from consciousness.
Uh, they have? Are you saying they know how to reason? Because if so, why is it that when I give a state-of-the-art model documentation (lacking examples) for a new library and ask it to write something, it cannot even begin to do that, even when the documentation is in the training data? A model that can reason should be able to understand the documentation and create novel examples. It cannot.
This happened to me just the other day. If the model could reason, examples of the language (which it has seen) plus the expository documentation should have been sufficient.
Instead, the model repeatedly inserted bullshitted code in the style of the language I wanted, but with library calls and names based on a version of the library for another language.
This is evidence of reasoning ability? Claude Sonnet 3.7 and Gemini Pro both exhibited this behavior last week.
I think this technology is fundamentally the same as it has been since GPT-2.
Absolutely LLMs can reason. There are limitations on their ability to reason, as you and everyone else have discovered. But they can absolutely reason about both concepts and the physical world in ways that, say, animals can't - even though presumably animals have at least some sort of self-consciousness and LLMs do not.
> A model that can reason should be able to understand the documentation and create novel examples. It cannot.
That's due to limitations imposed for "security". "Here's a new X, do Y with it" can result in holes bigger and more complex than anyone can currently handle "in time".
It's not about "abilities" with LLMs for now, but about functions that work within a range of edge cases - sometimes including them, other times not.
You could still guide it to fulfill the task, though. It just cannot be allowed to do it on its own; and since "forbidding" an LLM from doing something is about as effective as forbidding a child with mischievous older brothers, the only ways to actually do it result in "bullshitted" code and "hallucinations".