Such convincing gaslighting. I’m too lazy to look, but was Gomer Pyle a cameo/guest character on that episode of The Simpsons? Or is it just total hallucination?
I wouldn’t call it gaslighting or even hallucination, but just getting things mixed up. I described Bart Simpson and asked if it could tell me which character from which show I meant.
There is a Gomer Pyle who appears in several episodes of The Simpsons and does have pointy hair. I’m pretty sure he doesn’t wear a helmet and bandana.
But Gomer Pyle is also a character in The Andy Griffith Show and Gomer Pyle, U.S.M.C., which aired in the ’60s.
The local versions I’ve tested out today are absolutely garbage. They frustrated me over simple questions.
I know it’s not DeepSeek, but this is what I got out of the Reasoner V1 model in GPT4All (“Based on Qwen2.5-Coder 7B”). Use local models with care!
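If you want to poke at the same kind of setup yourself, the gpt4all Python bindings will load a local GGUF file. Here’s a minimal sketch; the model filename and the prompt are my own guesses rather than anything from the actual session:

```python
from gpt4all import GPT4All

# Assumed filename -- substitute whatever GGUF file your GPT4All
# install actually downloaded for the Reasoner v1 / Qwen2.5-Coder model.
model = GPT4All("qwen2.5-coder-7b-instruct-q4_0.gguf")

with model.chat_session():
    reply = model.generate(
        "Which cartoon character has spiky hair, wears shorts, "
        "and rides a skateboard?",
        max_tokens=200,
    )
    print(reply)
```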
Why is that? I mean, why does the locally run version suck?
Because it has fewer parameters and (in some cases) it’s quantized. The hardware needed to run local inference on the full model isn’t really within reach for most people. That said, its release will probably still have a broad impact on the quality of upcoming smaller models that are distilled from it, trained on synthetic data it generates, merged with it, and so on.
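To make the quantization point concrete, here’s a toy numpy sketch of symmetric 4-bit rounding. This isn’t any real GGUF scheme (those are more elaborate, with per-block scales), just the core idea: every weight snaps to one of 16 levels, and the rounding error is quality you don’t get back.

```python
import numpy as np

# Toy symmetric 4-bit quantization: map float32 weights onto the
# integer grid [-8, 7], then dequantize and measure the error.
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=8).astype(np.float32)

scale = np.abs(weights).max() / 7           # one scale for the whole block
q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
dequantized = q.astype(np.float32) * scale  # what the model actually computes with

print("original:   ", weights)
print("dequantized:", dequantized)
print("max error:  ", np.abs(weights - dequantized).max())
```

Every matmul in the network then runs on the dequantized values, so these small per-weight errors compound across layers.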