It’s ternary, and I understand why they say “1-bit” instead, but it still bugs me that they call it that.
I’d love to see how low they can push this and still get spooky results. Something with ten million parameters could fit on a Macintosh Classic II - and if it ran at any speed worth calling interactive, it’d undercut a lot of loud complaints about energy use. Training takes a zillion watts. Using the model is like running a video game.
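For a rough sense of scale, here’s a back-of-envelope sketch, assuming the simple 2-bits-per-weight packing rather than the theoretical log2(3) ≈ 1.58-bit floor (the ten-million figure and the Classic II’s roughly 10 MB RAM ceiling are just the numbers from this thread):

    // Hypothetical back-of-envelope: 10M ternary weights packed 4 per byte.
    #include <cstdio>

    int main() {
        const long long params = 10000000;             // assumed model size
        const long long bytes  = params * 2 / 8;       // 2 bits per ternary weight
        std::printf("%.2f MB\n", bytes / (1024.0 * 1024.0));  // ~2.38 MB
    }

Activations and the runtime itself would add to that, of course, so treat it as a lower bound rather than a claim that it would actually run well on that hardware.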
Can someone tell me what’s meant by “kernel” here?
Does it mean you need to run your OS with a specific kernel from bitnet.cpp, or is it a different kind of ‘kernel’?
I think they mean whatever’s handling the model: a program you feed this inherently restricted format into, so it can take advantage of those limitations and run more efficiently.
Like if every number’s magnitude is 1 or 0, you don’t need to do floating-point multiplication.
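A minimal sketch of that idea (a toy illustration, not bitnet.cpp’s actual kernel; ternary_dot and the int8 weight layout are made up for the example): with weights restricted to -1, 0, or +1, each step of a dot product becomes add, subtract, or skip.

    // Toy illustration: weights in {-1, 0, +1} turn multiply-accumulate
    // into plain adds and subtracts.
    #include <cstdint>
    #include <cstdio>
    #include <vector>

    float ternary_dot(const std::vector<int8_t>& w, const std::vector<float>& x) {
        float acc = 0.0f;
        for (size_t i = 0; i < w.size(); ++i) {
            if (w[i] == 1)       acc += x[i];   // +1: add the activation
            else if (w[i] == -1) acc -= x[i];   // -1: subtract it
            // 0: skip, contributes nothing
        }
        return acc;  // a real kernel would also fold in a per-tensor scale factor
    }

    int main() {
        std::vector<int8_t> w = {1, 0, -1, 1};
        std::vector<float>  x = {0.5f, 2.0f, 1.5f, -1.0f};
        std::printf("%f\n", ternary_dot(w, x));  // 0.5 - 1.5 - 1.0 = -2.0
    }

Real kernels presumably also pack several weights per byte and vectorize the loop, but the no-multiplication idea is the same.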