• mindbleach@sh.itjust.works
    3 days ago

    It’s ternary, and I understand why they round log₂(3) ≈ 1.58 bits down to “1-bit,” but it still bugs me that they call it that.

    I’d love to see how low they can push this and still get spooky results. Something with ten million parameters could fit on a Macintosh Classic II - and if it ran at any speed worth calling interactive, it’d undercut a lot of loud complaints about energy use. Training takes a zillion watts. Using the model is like running a video game.
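    The Macintosh claim checks out on a napkin. A rough sketch (the 1.58 bits/weight figure assumes ideal packing; real formats pad slightly, e.g. 5 trits per byte):

```python
# Back-of-the-envelope: storage for a ten-million-parameter ternary model.
# Assumes ideal packing at log2(3) ≈ 1.585 bits per weight.
import math

params = 10_000_000
bits_per_weight = math.log2(3)                    # ≈ 1.585
total_mb = params * bits_per_weight / 8 / 1_000_000
print(f"{total_mb:.1f} MB")                       # ≈ 2.0 MB of weights
```

    About 2 MB of weights, against the Classic II’s 2 MB stock RAM (expandable to 10 MB).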

    • milicent_bystandr@lemm.ee
      2 days ago

      Can someone tell me what’s meant by,

      The repository describes bitnet.cpp as offering “a suite of optimized kernels that support fast and lossless inference of 1.58-bit models on CPU”

      Does it mean you need to run your OS with a specific kernel from bitnet.cpp? Or is it a different kind of ‘kernel’?

      • mindbleach@sh.itjust.works
        21 hours ago

        I think they mean whatever’s handling the model: a program you feed this inherently restricted format into, which exploits those restrictions to run more efficiently.

        Like if every number’s magnitude is 1 or 0, you don’t need to do floating-point multiplication.
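        A toy sketch of that idea: with weights in {-1, 0, +1}, a dot product degenerates into adds, subtracts, and skips. Illustrative only; real kernels like bitnet.cpp’s operate on packed ternary data with SIMD, not Python lists.

```python
# Ternary dot product with no multiplications: each weight either
# adds the activation, subtracts it, or skips it entirely.
def ternary_dot(weights, activations):
    acc = 0.0
    for w, x in zip(weights, activations):
        if w == 1:
            acc += x      # +1: accumulate
        elif w == -1:
            acc -= x      # -1: subtract
        # 0: no work at all
    return acc

print(ternary_dot([1, 0, -1, 1], [0.5, 2.0, 1.5, 3.0]))  # 0.5 - 1.5 + 3.0 = 2.0
```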