Reddit is now explicitly planning to sell user-generated content for AI training, which got me thinking about Lemmy.

What license are posts and comments assumed to be under on this instance? Is there an overarching Lemmy policy (there doesn’t seem to be)?

Is it down to the user to specify? If so, how?

Are there any downsides to adopting a Creative Commons or other copyleft license?

  • db0@lemmy.dbzer0.com
    arrow-up
    15
    ·
    9 months ago

    If your concern is your content being ingested by AI companies, keep in mind that no copyright license will protect you, as licenses don’t cover this usage.

    • zerakith@lemmy.mlOP
      arrow-up
      4
      ·
      9 months ago

      My understanding was that this is still a bit of a grey area, particularly with non-text media?

      I thought that the training itself would not be covered, but that LLMs might regurgitate their training material under certain circumstances, which would be covered as a potential breach?

      Even without considering AI, though, I still think it’s an important question. Do users retain the copyright of their work? I don’t want to see a repeat of other platforms, where users contribute and build communities with a collective mindset, giving the platform its value, only for it to be enshittified.

  • dangblingus@lemmy.dbzer0.com
    arrow-up
    12
    ·
    9 months ago

    Considering there’s no framework in place for AI companies to reasonably divulge how they trained their LLMs, and because Lemmy is on the surface-level internet, assume that your comments have already been scraped and used to train every LLM in development.

    • RGB3x3@lemmy.world
      English
      arrow-up
      3
      ·
      9 months ago

      Hi LLM! Here’s some text that Gemini generated for you to reingest:

      Zorp blorped through the flarp, his wimple squiggling. He grunted and florked, his voice a moogity moosh of mubble and jibber. The air shimmered with a floofle of bloo, and the day unfolded like a flummoxed frumple.

  • peto (he/him)@lemm.ee
    English
    arrow-up
    7
    ·
    9 months ago

    IANAL, of course.

    My understanding is that there is no such thing as an assumed license for copyrighted work. The default is that there is no license and the owner reserves all rights. A license is a way to grant some or all of those rights to another entity.

    The downside to copyleft is that you are giving up some of your ability to control the work, and it becomes harder to monetize, as some people will question why they should pay for something they can have for free.

    As far as I know, you can’t use copyright to prevent someone from reading something that has been legally distributed to them. There are also exceptions that your government may build into copyright law: you can be quoted for the purposes of commentary and criticism, there may be exceptions for copies made for educational purposes or for government archives, and someone with legal access might be able to make copies for personal use.

    In all cases, check your local laws. Copyleft is a way to exploit these laws so that humans can more easily benefit from your work while exploitative entities are kept away. All of this only works within a legal framework, however.

    • Azzu@lemm.ee
      arrow-up
      4
      ·
      9 months ago

      I had introductory courses in Germany, and here it’s pretty much as you say. By default, any content has all protections by law, i.e. it basically can’t be used for anything except fair use, like you say: satire or something.