Today's Large Language Models are Essentially BS Machines

Veraticus · 10 months ago

Dark Arc · edit-2 10 months ago

I don’t believe in scaling as a way to discover understanding. Doing that is just praying that the machine comes alive… these machines weren’t programmed to come alive in that way. That’s my fundamental argument: the design of LLMs ignores understanding of the content, and it doesn’t matter how much content it’s been scaled up to.

If I teach a real AI about fishing, it should be able to reason about fishing and it shouldn’t need to have read a supplementary knowledge of mankind to do it.

What the LLMs seem to be moving towards is more of a search and summary engine (for existing content). That’s a very similar and potentially quite useful thing, but it’s not the same thing as understanding.

It’s the difference between the kid that doesn’t know much but is really good at figuring things out based on what they know, and the kid that’s read all the textbooks front to back and can’t come up with anything original to save their life but can quickly regurgitate and summarize anything they’ve ever read.

Communist · edit-2 10 months ago

If I teach a real AI about fishing, it should be able to reason about fishing and it shouldn’t need to have read a supplementary knowledge of mankind to do it.

This is a faulty assumption.

In order for you to learn about fishing, you had to learn a shitload about the world. Babies don’t come out of the womb able to do such tasks. There is a shitload of prerequisite knowledge needed in order to fish, so it’s unfair to expect an AI to do this without prerequisite knowledge.

Furthermore, LLMs have been shown to do many things that aren’t in their training data, so the notion that they’re stochastic parrots is also false.

Dark Arc · edit-2 10 months ago

Furthermore, LLMs have been shown to do many things that aren’t in their training data, so the notion that they’re stochastic parrots is also false.

And (from what I’ve seen) they get things wrong with extreme regularity, increasingly so as things diverge from the training data. I wouldn’t say they’re a “stochastic parrot” but they don’t seem to be much better when things need to be correct… and again, based on my (admittedly limited) understanding of their design, I don’t anticipate this technology (at least without some kind of augmented approach that can reason about the substance) overcoming that.

In order for you to learn about fishing, you had to learn a shitload about the world. Babies don’t come out of the womb able to do such tasks. There is a shitload of prerequisite knowledge needed in order to fish, so it’s unfair to expect an AI to do this without prerequisite knowledge.

That’s missing the forest for the trees. Of course an AI isn’t going to go fishing. However, I should be able to assert some facts about fishing and it should be able to reason based on those assertions. E.g., a child can work off of facts presented about fishing: “fish are hard to catch in muddy water” -> “the water is muddy, does that impact my chances of catching a bluegill?” -> “yes, it does, bluegill are fish, and fish don’t like muddy water”.
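As a rough illustration of the kind of inference described above, here is a minimal Python sketch. The fact tuples and the `hard_to_catch` helper are hypothetical names made up for this example. It is hand-written rule application over asserted facts, not a claim about how an LLM or any real reasoning system works internally.

```python
# Asserted facts, mirroring the bluegill example (hypothetical encoding).
facts = {
    ("is_a", "bluegill", "fish"),   # "bluegill are fish"
    ("water", "muddy"),             # "the water is muddy"
}


def hard_to_catch(species, known_facts):
    """Apply the asserted rule: fish are hard to catch in muddy water."""
    is_fish = ("is_a", species, "fish") in known_facts
    water_is_muddy = ("water", "muddy") in known_facts
    return is_fish and water_is_muddy


# Does muddy water impact my chances of catching a bluegill?
print(hard_to_catch("bluegill", facts))
```

The point of the sketch is only that the conclusion follows from the three stated facts, with no other knowledge about fishing required.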

There are also “teachings” brought about by how these are programmed that make the flaws less obvious, e.g., if I try to repeat the experiment in the post here, Google’s Bard outright refuses to continue because it doesn’t have information about Ryan McGreal. I’ve also seen Bard get notably better as it’s been scaled up; early on I tried asking it about RuneScape and it spewed absolute nonsense. Now… it’s reasonable-ish.

I was able to reproduce a nonsense response (once again) by asking about RuneScape. I asked how to get 99 Firemaking, and it invented a mechanic that doesn’t exist: “Using a bonfire in the Charred Stump: The Charred Stump is a bonfire located in the Wilderness. It gives 150% Firemaking experience, but it is also dangerous because you can be attacked by other players.” This is a novel (if not creative) invention of Bard, likely derived from advice for training Prayer (which does have something in the Wilderness which gives 350% experience).

Communist · edit-2 10 months ago

And (from what I’ve seen) they get things wrong with extreme regularity, increasingly so as things diverge from the training data. I wouldn’t say they’re a “stochastic parrot” but they don’t seem to be much better when things need to be correct… and again, based on my (admittedly limited) understanding of their design, I don’t anticipate this technology (at least without some kind of augmented approach that can reason about the substance) overcoming that.

Keep in mind, you’re talking about a rudimentary, introductory version of this. My argument is that we don’t know what will happen when they’ve scaled up. We know for certain hallucinations become less frequent as the model size increases (see the statistics on GPT-3 vs. GPT-4 on hallucinations). Perhaps hallucinations only persist because the models haven’t reached a critical size yet? We don’t know.

There’s so much we don’t know.

That’s missing the forest for the trees. Of course an AI isn’t going to go fishing. However, I should be able to assert some facts about fishing and it should be able to reason based on those assertions. E.g., a child can work off of facts presented about fishing: “fish are hard to catch in muddy water” -> “the water is muddy, does that impact my chances of catching a bluegill?” -> “yes, it does, bluegill are fish, and fish don’t like muddy water”.

https://blog.research.google/2022/05/language-models-perform-reasoning-via.html

They do this already, albeit imperfectly, but again, this is, like, a baby LLM.

And just to prove it:

https://chat.openai.com/share/54455afb-3eb8-4b7f-8fcc-e144a48b6798

Today's Large Language Models are Essentially BS Machines - Ryan McGreal