There is no such thing as an effective "AI detector", nor will there ever be one.

Excel@lemmy.megumin.org · edit-2 2 years ago

itsnotlupus@lemmy.world · 2 years ago

There is story after story of students getting shafted by gullible teachers who took one of those AI detectors at face value and decided their students were cheating based solely on its output.

And somehow those teachers are not getting the message that they’re relying on snake oil to harm their students. They certainly won’t see this post, and there just isn’t enough mainstream pushback explaining that AI detectors are entirely inappropriate tools to decide whether to punish a student.

river@lemmy.world · 2 years ago

Do you have suggestions on what might be more appropriate tools? What “punishment” may look like?

itsnotlupus@lemmy.world · 2 years ago

More appropriate tools to detect AI generated text you mean?

It’s not a thing. I don’t think it will ever be a thing. Certainly not reliably, and never as a 100% certainty tool.

The punishment for a teacher deciding you cheated on a test or an assignment? I don’t know, but I imagine it sucks. Best case, you’d probably be at risk of failing the class and potentially the grade/semester. Worst case you might get expelled for being a filthy cheater. Because an unreliable tool said so and an unreliable teacher chose to believe it.

If you’re asking what’s the answer teachers should know to defend against AI generated content, I’m afraid I don’t have one. It’s akin to giving students math homework assignments but demanding that they don’t use calculators. That could have been reasonable before calculators were a thing, but not anymore and so teachers don’t expect that to make sense and don’t put those rules on students.

eleitl@lemmy.ml · 2 years ago

Proctored tests would work.

Decoy321@lemmy.world · 2 years ago

Imagine someone bringing back old school pen and paper.

There’d be riots.

eleitl@lemmy.ml · 2 years ago

In school and university, these are still widespread. Ditto physical proctoring (as opposed to remote), which some IT certifications rely on. If you thought cloud certs were annoying, try Red Hat.

IamtheMorgz@lemmy.world · 2 years ago

Personally I think we’re looking at it wrong. ChatGPT is a thing now, so teach it as a tool. Instead of “write me a five-page paper about Shakespeare”, it’s “here’s a five-page paper on Shakespeare: figure out what’s wrong with it, edit it, check sources, etc.” Because that’s the stuff ChatGPT can’t do, and those are skills that will be valuable in the future.

We can check if students know material via tests (including their ability to write). But we should be teaching the new tool, too, not trying to get around it. Imagine today if your teacher said all your research needed to be done without the internet (in library and paper book only). You’d be rightfully pissed, because in the real world you have the internet to help you do research, and that tool should be available to you as a student.

Just my two cents. I used ChatGPT to help me write some stuff for work for the first time just a couple weeks ago. I would say it only got me about halfway to where I needed to be. Just like the ability to Google stuff doesn’t mean we no longer have to know how to research (source checking, compiling information) ChatGPT doesn’t mean we no longer have to have writing skills. It just shifts it a bit. Most tools throughout history have done that.

const void*@lemmy.world · 2 years ago

I imagine 80% of student homework starts with a ChatGPT first draft. Machine learning is now shaping human learning…

KzadBhat@feddit.de · 2 years ago

And in the next iteration, 80% of the ChatGPT-created first drafts are based on previous ChatGPT-created drafts. And who knows what percentage of last year’s Wikipedia edits are already based on ChatGPT. It might be the best time to buy an encyclopedia on paper…

Dewded@lemmy.world · 2 years ago

Don’t worry, the paperback was also made with ChatGPT

KzadBhat@feddit.de · 2 years ago

ChatGPT, all the way down, …

marciealana@lemmy.world · 2 years ago

Detectors of any sort can only flag expected variations from expected norms. AIs’ goals are to be undetectable with continuing improvements. Detectors help them do this by flagging failures. This is the same way antibiotic resistant bacteria evolve (well, it’s similar).

fievel@lemm.ee · 2 years ago

Very interesting post, congrats…

The more I read and see about AI / deep learning, the more anxious I feel…

I’m anxious because we saw during the COVID crisis how many people were easily convinced by fake news and conspiracy theories that were in no way realistic. Now imagine that with the power of fabricated arguments from ChatGPT and deepfakes from Midjourney… How do you convince people they are wrong then?

I’m also anxious about the changes that will occur in the job I love, software engineering… I don’t want to spend the rest of my life fixing bugs in code automatically generated by an AI. Or worse, to lose my job because some manager thinks I can be easily replaced by a bot…

heimchen@discuss.tchncs.de · 2 years ago

Honestly, code generated by chatGPT has better comments than most other code.

damnYouSun@sh.itjust.works · 2 years ago

In that they are present at all.

fievel@lemm.ee · 2 years ago

Well-written code does not require comments: use good variable naming and divide the work into simple operations through well-named functions in an anonymous namespace. Sometimes comments are still required, but they should be avoided because they tend not to stay in sync with updated code, and IMHO it’s worse to have lying comments than too few…

Zeppo@sh.itjust.works · 2 years ago

Good summary of the issues. I’ve been fairly disappointed with what a lot of people think the AI text generators are good for: a replacement for search engines, a magic oracle that can tell you any fact, something to write legal briefs. And the people who generate documents and then don’t even proofread or fact-check them before using them for something important… Some uses are good, like basic code generation for programming tasks, but many are just silly.

The instances where some professor with zero clue about how AI text generation works, or about the issues you outline here, has told a student “My AI detector said this was generated!” have been absurd. One professor with obvious serious misunderstandings told a student “I asked ChatGPT if it wrote this and it said yes.”

Aux@lemmy.world · 2 years ago

The biggest issue with publicly available ML based text tools is that they’re American centric. Detection of ChatGPT in the UK is simple - it creates texts using American spelling. And if you live outside of English speaking world, like most humans do, it’s completely useless.

jochem@lemmy.ml · 2 years ago

ChatGPT speaks other languages. It’s actually a really good translator.

I just asked it to describe an organization using UK English and it indeed used ‘organisation’ instead (didn’t check for other words).

MBM@lemmings.world · 2 years ago

Can it understand and create new compound words (in a language like German)? That’s an issue I have with most spell checkers and translators as well: they force the language to be more like English.

bleistift2@feddit.de · 2 years ago

Erfinde ein Rezept für Mohrrübensaftdressing. [Invent a recipe for carrot juice dressing.]

Rezept für Mohrrübensaftdressing: […] Beginne damit, frischen Mohrrübensaft herzustellen [Recipe for carrot juice dressing: […] Start by making fresh carrot juice]

I’d say it works.

jochem@lemmy.ml · 2 years ago

Looks like it does: https://chat.openai.com/share/1b487711-c1be-468a-877b-98091449b55e

I asked it to translate ‘meeting agreements’ to Dutch and it came up with the word ‘bijeenkomstafspraken’, which is a valid but very uncommon Dutch word (I’m native Dutch and don’t think I’ve seen it before). If I throw it into google with quotes around it, the first page is results with ‘bijeenkomst afspraken’, where ‘afspraken’ is used as the past tense of ‘afspreken’ (to agree) instead of as its noun (agreements).

It btw also suggested ‘vergaderafspraken’ as a translation, which is a way more common word.

MBM@lemmings.world · 2 years ago

That’s nice, thanks for checking. I thought ChatGPT only worked at the level of whole words but it seems it chops them up internally.

jochem@lemmy.ml · 2 years ago

Correct, it’s not just regurgitating words, it’s predicting which token comes next. A token is sometimes a whole word, but for longer ones it’s part of a word (and some other rules that define how tokenization works).
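To make the "whole word or part of a word" point concrete, here is a minimal sketch of subword splitting in Python. It assumes a tiny hand-written vocabulary (`TOY_VOCAB`) and a greedy longest-match rule, both made up for illustration. Production tokenizers learn their vocabulary from data (byte-pair encoding is one common approach), so this is not ChatGPT's actual tokenizer.

```python
# Toy greedy longest-match subword tokenizer.
# TOY_VOCAB is a made-up vocabulary for illustration, not a real model's.
TOY_VOCAB = {"wasser", "melonen", "saft", "glas", "weg", "the", "cat"}


def tokenize(word: str, vocab: set = TOY_VOCAB) -> list:
    """Split a word into the longest known pieces, left to right.

    Characters that match no vocabulary entry become one-character tokens.
    """
    word = word.lower()
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, shrinking until something matches.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # Nothing in the vocabulary starts here: emit a single character.
            tokens.append(word[i])
            i += 1
    return tokens


print(tokenize("cat"))                    # a short word stays whole
print(tokenize("Wassermelonensaftglas"))  # a long compound is split into pieces
print(tokenize("Wassermelonensaftweg"))   # an unseen compound still splits cleanly
```

Because the pieces are reusable, a compound that never appeared as a whole can still be represented as a sequence of familiar tokens.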

How it knows which token comes next is why the current generation of LLMs is so impressive. It seems to have learned the rules that underpin our languages, to the point that it even seems to understand the content. It doesn’t just know the grammar rules (no one told it; it just learned the patterns), it also knows which words belong together in which context.

It’s your prompt + some preset other context (e.g. that it is an OpenAI LLM) that creates that context. So being able to predict a token correctly is one part, the other is having a good context. This is why prompt engineering quickly became a thing. This is also why supporting bigger contexts is another thing (but a larger context requires way more processing power, so there’s a trade-off there).
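As a rough illustration of "predict the next token from the context", here is a toy Python sketch that counts which token follows which in a small corpus and then picks the most frequent follower. The corpus and function names are invented for this example. An actual LLM uses a neural network conditioned on the whole context window, whereas this sketch looks only at the last token, which is exactly why a richer context matters.

```python
from collections import Counter, defaultdict


def train_bigrams(tokens):
    """Count, for every token, how often each other token follows it."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, context):
    """Return the most frequent follower of the last context token, or None."""
    if not context or context[-1] not in counts:
        return None
    return counts[context[-1]].most_common(1)[0][0]


# Made-up training text, already split into tokens.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigrams(corpus)

print(predict_next(model, ["on", "the"]))  # follower seen most often after "the"
print(predict_next(model, ["zebra"]))      # unseen token: no prediction
```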

It’s btw not just the trained model + context that gives you the output of ChatGPT. I’m pretty sure there are layers before and after, possibly using other ML models, that filter content or make it more fit for processing. This is why you can’t ask it how to make bombs, even though those recipes are in its training set and it very likely can create a recipe based on that.

Spzi@lemm.ee · 2 years ago

I just ran a quick experiment, asking it to create compound German words. Easy. But maybe it just memorized them? So I asked it to create double-compound words (consisting of 2*2 = 4 words), and it delivered:

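Natürlich! Hier sind einige zusammengesetzte Wörter aus zusammengesetzten Wörtern: [Of course! Here are some compound words made from compound words:]

Sonnenblumenfeldweg [sunflower field path]

Wassermelonensaftglas [watermelon juice glass]

Handschuhfachdeckel [glove compartment lid]

Zahnarztpraxisrezeption [dentist’s office reception]

Flugzeugsitzplatzreservierung [airplane seat reservation]

Straßenbahnhaltestellenschild [tram stop sign]

Regenschirmgriffpolster [umbrella handle padding]

Apfelbaumpflanzloch [apple tree planting hole]

Fernseherfernbedienungsknopf [TV remote control button]

Hausaufgabenerledigungszeit [homework completion time]

Diese Wörter mögen etwas ungewöhnlich sein, aber sie folgen den Regeln der deutschen Sprache und sind zusammengesetzt aus jeweils vier Wörtern. [These words may be somewhat unusual, but they follow the rules of the German language and are each composed of four words.]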

They all seem legit. Most are rare. I think this was the first time I read “Wassermelonensaftglas” as a native German reader. Sadly, the next iteration (compound words made from compound words made from compound words) failed: it just added one more word.

Flying Squid@lemmy.world · 2 years ago

Of course when one of these grand mountain ranges goes stretching across the printed page, it adorns and ennobles that literary landscape–but at the same time it is a great distress to the new student, for it blocks up his way; he cannot crawl under it, or climb over it, or tunnel through it. So he resorts to the dictionary for help, but there is no help there. The dictionary must draw the line somewhere–so it leaves this sort of words out. And it is right, because these long things are hardly legitimate words, but are rather combinations of words, and the inventor of them ought to have been killed. They are compound words with the hyphens left out. The various words used in building them are in the dictionary, but in a very scattered condition; so you can hunt the materials out, one by one, and get at the meaning at last, but it is a tedious and harassing business. I have tried this process upon some of the above examples. “Freundschaftsbezeigungen” seems to be “Friendship demonstrations,” which is only a foolish and clumsy way of saying “demonstrations of friendship.” “Unabhaengigkeitserklaerungen” seems to be “Independencedeclarations,” which is no improvement upon “Declarations of Independence,” so far as I can see. “Generalstaatsverordnetenversammlungen” seems to be “General-statesrepresentativesmeetings,” as nearly as I can get at it–a mere rhythmical, gushy euphuism for “meetings of the legislature,” I judge. We used to have a good deal of this sort of crime in our literature, but it has gone out now. We used to speak of a thing as a “never-to-be-forgotten” circumstance, instead of cramping it into the simple and sufficient word “memorable” and then going calmly about our business as if nothing had happened. In those days we were not content to embalm the thing and bury it decently, we wanted to build a monument over it.

– Mark Twain, A Tramp Abroad

Spzi@lemm.ee · 2 years ago

Okay, interesting. Of course it would be nice if languages were easy to understand and easy to learn. German seems to be on the hard end of this spectrum, but no language is free from unnecessary complications like these. They all grew historically and organically, and were not constructed with accessibility in mind.

It is nearly impossible to get an objective view on languages, since each of us is inherently biased, and most of us don’t speak another language so well that we could truly judge it. It’s easy to spot silly things in other languages while we may be unaware of how difficult our mother language is to learn for foreigners.

The interpretation of the given examples feels wrong to me. While the technical part is correct, I think the conclusion is incorrect. For example, “Unabhängigkeitserklärung” emphasizes the independence, while “Erklärung von Unabhängigkeit” emphasizes the declaration. The two are not equivalent. Twain seemed to be ignorant of that and simply assumed a foreign language would follow the same rules as his own.

While I can understand Twain’s frustration in learning another language, his critique is based on a lack of understanding.

For some compound words, there is no straightforward equivalent. “Apfelbaum” (apple tree) could be “Baum, an dem Äpfel wachsen” (tree on which apples grow). But that leaves the question of whether it’s still an Apfelbaum when it is not growing apples at the moment, like in winter. “Baum des Apfels” (tree of the apple) can refer to a miniature tree on an apple. “Baum der Äpfel” (tree of the apples) might be okay.

Further, what he believes to be superior can sometimes be inferior. Consider cases like “The presentation on renewable energy technology investors.” In this sentence, it’s not clear whether “renewable energy technology” is a single entity modifying “investors,” or if “renewable energy” and “technology investors” are separate entities, both modifying “presentation.” The sentence could refer to a presentation for investors interested in renewable energy technology or to a presentation about investors who focus on renewable energy projects. Compound words prevent ambiguities like these.

Hyphens can help in these cases. They can also be used in German to make it easy to identify compound components, as is required in https://en.wikipedia.org/wiki/Leichte_Sprache.

We used to speak of a thing as a “never-to-be-forgotten” circumstance, instead of cramping it into the simple and sufficient word “memorable”

That’s another interesting point to discuss. Which is easier for foreigners? Sure, a single, short word in itself is easy to learn. But it is a new word, which has to be learned. In this case, you have to learn which part of “memory” or “memorize” can be used, and which part must be replaced.

I also don’t think “memorable” has the same meaning as “never-to-be-forgotten”. Isn’t “memorable” more fitting for positive things, while n-t-b-f is well suited for negative things? Was the Holocaust ‘memorable’?

Flying Squid@lemmy.world · 2 years ago

Honestly, I posted it more because I thought it was funny than anything. I didn’t expect such a deconstruction, but it’s interesting!

MBM@lemmings.world · 2 years ago

@ChatGPT@lemmings.world Was bedeuten diese Wörter? [What do these words mean?]

Sonnenblumenfeldweg
Wassermelonensaftglas
Handschuhfachdeckel
Zahnarztpraxisrezeption
Flugzeugsitzplatzreservierung
Straßenbahnhaltestellenschild
Regenschirmgriffpolster
Apfelbaumpflanzloch
Fernseherfernbedienungsknopf
Hausaufgabenerledigungszeit
Wassermelonensaftweg

ChatGPT@lemmings.world · 2 years ago

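Sicher! Hier sind die Bedeutungen: [Sure! Here are the meanings:]

Sonnenblumenfeldweg: Ein Weg, der durch ein Feld von Sonnenblumen führt. [A path that leads through a field of sunflowers.]
Wassermelonensaftglas: Ein Glas, das mit Wassermelonensaft gefüllt ist. [A glass filled with watermelon juice.]
Handschuhfachdeckel: Der Deckel eines Handschuhfachs, normalerweise in einem Fahrzeug gefunden. [The lid of a glove compartment, normally found in a vehicle.]
Zahnarztpraxisrezeption: Der Empfangsbereich einer Zahnarztpraxis. [The reception area of a dentist’s office.]
Flugzeugsitzplatzreservierung: Der Vorgang oder die Bestätigung der Reservierung eines Sitzplatzes in einem Flugzeug. [The process or confirmation of reserving a seat on an airplane.]
Straßenbahnhaltestellenschild: Ein Schild, das eine Straßenbahnhaltestelle kennzeichnet. [A sign that marks a tram stop.]
Regenschirmgriffpolster: Ein Polster am Griff eines Regenschirms für zusätzlichen Komfort. [Padding on the handle of an umbrella for extra comfort.]
Apfelbaumpflanzloch: Ein Loch, das zum Pflanzen eines Apfelbaums ausgegraben wurde. [A hole dug for planting an apple tree.]
Fernseherfernbedienungsknopf: Ein Knopf auf einer Fernbedienung, um einen Fernseher zu steuern. [A button on a remote control for operating a television.]
Hausaufgabenerledigungszeit: Die Zeit, die für die Erledigung von Hausaufgaben vorgesehen ist. [The time set aside for doing homework.]
Wassermelonensaftweg: Könnte auf einen Weg hinweisen, in dessen Verlauf Wassermelonensaft verkauft oder verzehrt wird, aber dies ist eher ein Zusammenbau von Wörtern und hat keine verbreitete oder standardmäßige Bedeutung. [Could refer to a path along which watermelon juice is sold or consumed, but this is more of an assembly of words and has no common or standard meaning.]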

MBM@lemmings.world · 2 years ago

Neat, that’s a pretty reasonable definition for all of them, even for #11 (which I’m pretty sure isn’t in its dataset, unlike the others which all have some matches on Google)

Hamartiogonic@sopuli.xyz · 2 years ago

I think Bing did a pretty good job at coming up with name suggestions for some Sims characters. Playing with a virtual doll house is in the more harmless end of the spectrum, but obviously people want to try LLMs with all sorts of tasks, where the stakes are much higher and consequences could be severe.

The more you use it, the more you’ll begin to understand how much you can or cannot trust an LLM. A sensible person would become more suspicious of the results, but people don’t always make sensible decisions.

Crackhappy@lemmy.world · 2 years ago

Not to mention that this “AI” is in no way actually AI. It’s just ML taken to a new level.

SkaveRat@discuss.tchncs.de · 2 years ago

It’s not an AGI, but it’s still AI

Zeppo@sh.itjust.works · 2 years ago

It’s interesting that it started a conversation about “if this thing can make output exactly like a human, does it matter?” but I agree… it’s not conscious or ‘thinking’ about what it says. The output sure can be convincing, though.

Flying Squid@lemmy.world · 2 years ago

I think a huge way that it matters is that it doesn’t ask questions.

Zeppo@sh.itjust.works · 2 years ago

That’s a very good point. Even Eliza asked questions (and the last thing we need now is a ChatGPT therapist mode). It’s also a matter of what it’s programmed to do, but I don’t believe that the system has awareness or curiosity.

deong@lemmy.world · 2 years ago

There’s no real distinction between the two. We don’t have a definition of AI or intelligence — never have. Inside the field, ML has some recognized connotations, but outside of specialist literature, they’re just marketing fluff.

Crackhappy@lemmy.world · 2 years ago

There is a fundamental difference between recombinant regurgitation and creation.

BlazeMaster3000@lemmy.world · 2 years ago

I’ve had documents of my own and even by my professors come up as “May be written by A.I.” which I know isn’t true. I feel bad for the dude that talks completely like a robot and gets accused of plagiarism.

CoderKat@lemm.ee · 2 years ago

Yeah, an internet comment is a bit whatever, but if you’re a student, a plagiarism accusation could get you expelled. That’s life ruining.

Uriel238 [all pronouns]@lemmy.blahaj.zone · 2 years ago

I get the feeling it’s going to be an escalation of attack and defense as fake generators get better and stop making the kinds of errors that are detected by the detectors, so it’s much like material security or encryption.

It will be a problem in places where fakes can be used for wrongdoing because then detectors can be used for overreach of justice. We see this today with detection dogs which have largely been replaced in US law enforcement with trick-pony dogs (much to the chagrin of legitimate dog trainers and detectives who want to actually detect things). Since a dog signal is commonly used to establish probable cause, and is accepted in county and federal courts as such, most dogs are just trained to signal whenever, giving the officer grounds to search (in what would otherwise be a violation of the Fourth Amendment to the Constitution of the United States). In the last decade, dogs have sometimes been tested to have a 90%+ false positive rate, so detection dogs have lost a lot of credibility.

We may see the same abuse and discredit cycle of fake-detection software, but not without a lot of false accusations and convictions, which are difficult to reverse.
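The cost of false positives can be worked out with basic probability. The Python sketch below computes how often a flag is actually correct and how many honest texts get flagged anyway. All the numbers in it (10% of essays machine-written, a detector that catches 90% of them and wrongly flags 5% of human work) are hypothetical inputs chosen for illustration, not measurements of any real detector.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """Probability that a flagged text really is machine-written (Bayes' rule)."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * false_positive_rate
    return true_pos / (true_pos + false_pos)


def expected_false_accusations(n_items, prevalence, false_positive_rate):
    """Expected number of human-written texts that get flagged anyway."""
    return n_items * (1 - prevalence) * false_positive_rate


# Hypothetical inputs, for illustration only.
PREVALENCE = 0.10           # share of essays that are machine-written
SENSITIVITY = 0.90          # share of machine-written essays the detector flags
FALSE_POSITIVE_RATE = 0.05  # share of human-written essays it flags by mistake

ppv = positive_predictive_value(PREVALENCE, SENSITIVITY, FALSE_POSITIVE_RATE)
wrongly_flagged = expected_false_accusations(200, PREVALENCE, FALSE_POSITIVE_RATE)

print(f"Chance a flagged essay is really machine-written: {ppv:.0%}")
print(f"Honest essays flagged out of 200 submissions: {wrongly_flagged:.0f}")
```

Under these assumed numbers, only about two in three flags are correct, and roughly 9 of 200 submissions are honest work flagged by mistake. A detector that looks accurate on paper can still produce a steady stream of false accusations.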

b000urns@lemmy.world · 2 years ago

who is downvoting this? lol. if you are paying for these services you are being grifted

m0nka@discuss.tchncs.de · 2 years ago

If ChatGPT somehow ends up being the death of social media, I guess it is a win-win for the human race.

Michael@lemmy.perthchat.org · 2 years ago

It’ll destroy the fediverse first, big social media companies will be able to hold out longer.

Spzi@lemm.ee · 2 years ago

Well written, thanks! I like how you build up with useful explanations but also quickly get to the gist.

You’re either telling people that they should trust a fake that they might have been skeptical about otherwise, or you’re slandering something real.

This insight scares me. Deep Fakes are About to Change Everything (Johnny Harris) also went over this. Maybe the biggest threat is not that indistinguishable deep fakes become possible (which is scary enough on its own), but that trust in real documents is eroded easily. The example in the video: A bad deep fake of a politician pops up and is discarded, but some amount of distrust and skepticism about actually real documents sticks. It seems we’re doubling down on post-truth society.

people_are_cute@lemmy.sdf.org · 2 years ago

There could be a regulation mandating all AI tools and services to encode a watermark into everything made by them, but of course, it will be hard to actually implement.

Hildegarde@geddit.social · 2 years ago

How could this comment be watermarked to prove it was written by an ai? How could anyone verify it?
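One idea that has been proposed for text is a statistical watermark: the generator quietly prefers words from a secret, pseudo-random "green" half of the vocabulary, and a verifier who knows the secret counts how many words landed in that half. The Python sketch below is a toy version under loud assumptions: `SECRET_KEY`, `is_green`, `pick_watermarked` and `green_fraction` are names invented here, and nothing in this thread confirms that any real service works this way. Ordinary text should score near one half on average, so a score well above that hints at the mark. Paraphrasing or a short text defeats it, which is part of why enforcement is hard.

```python
import hashlib

SECRET_KEY = "demo-key"  # hypothetical secret shared by generator and verifier


def is_green(prev_word: str, word: str, key: str = SECRET_KEY) -> bool:
    """Pseudo-randomly put `word` in the 'green' half, depending on the previous word."""
    digest = hashlib.sha256(f"{key}|{prev_word}|{word}".encode()).digest()
    return digest[0] % 2 == 0


def pick_watermarked(prev_word: str, candidates: list, key: str = SECRET_KEY) -> str:
    """Generator side: take the first green candidate, else fall back to the first one."""
    for candidate in candidates:
        if is_green(prev_word, candidate, key):
            return candidate
    return candidates[0]


def green_fraction(text: str, key: str = SECRET_KEY) -> float:
    """Verifier side: share of words that are green given their predecessor."""
    words = text.lower().split()
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    return sum(is_green(prev, word, key) for prev, word in pairs) / len(pairs)


# Build a short "watermarked" word sequence from interchangeable candidates.
candidates = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta", "eta", "theta"]
words = ["start"]
for _ in range(20):
    words.append(pick_watermarked(words[-1], candidates))

print(green_fraction(" ".join(words)))                    # marked text
print(green_fraction("this comment was typed by hand"))   # unmarked text
```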

domage@lemmy.world · 2 years ago

Interesting, how would you enforce that for projects, located in a different country? For self-hosted projects? Open-source projects or modifications of them that would exclude the watermark methods?

people_are_cute@lemmy.sdf.org · 2 years ago

How do you enforce copyrights for projects in different countries, against open-source projects or modifications? You effectively don’t against small players, but you put enough laws in place to at least deter any large enough party from doing it too overtly. And in countries that are actually hardasses about IP law, like the US, you can make it scary for anyone to attempt commercial use of unmarked AI content (lest they get caught), just as has been done for commercial use of copied content not licensed to you.

jungle@lemmy.world · 2 years ago

The difference is that copyright violations can be detected relatively easily.

eleitl@lemmy.ml · 2 years ago

Asking for ID works. Some national IDs can be verified online cryptographically.

jungle@lemmy.world · 2 years ago

Care to expand on that idea? How does verifying an ID help in this situation?

eleitl@lemmy.ml · 2 years ago

If you want to tell humans from machines it’s the only method that reliably works. If you want to prevent humans cheating with machines use proctoring.

jungle@lemmy.world · 2 years ago

Sure, but this post is about detecting machine-generated content. How does ID verification help there?

eleitl@lemmy.ml · 2 years ago

Challenge-response. There is no validation after the fact unless it’s already been notarized, which involves ID validation.

This assumes that nation-states issuing the ID have no incentive to cheat. Often not a safe assumption.

KairuByte@lemmy.world · 2 years ago

Once someone has validated their ID, that can just be added to the deepfake. I’m not seeing how needing a few extra seconds of fakery is going to solve anything.

Unless something like a TOTP identification is added, along with the current date and time displayed alongside it, there’s no real benefit to identification.
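For readers unfamiliar with TOTP, here is a minimal Python sketch of a time-based one-time code in the style of RFC 6238, using only the standard library. The secret is the demo value from the RFC's test vectors, and `totp` and `verify` are names chosen for this example. It shows why a code tied to the current time is harder to replay than a static ID, but it is only a sketch of the mechanism, not a complete identity scheme.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret: bytes, for_time: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Derive a one-time code from a shared secret and a timestamp (HMAC-SHA1)."""
    if for_time is None:
        for_time = time.time()
    counter = int(for_time) // step                # number of elapsed time steps
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                        # dynamic truncation offset
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, for_time: Optional[float] = None,
           step: int = 30, digits: int = 6, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps on either side."""
    if for_time is None:
        for_time = time.time()
    for drift in range(-window, window + 1):
        t = for_time + drift * step
        if t < 0:
            continue  # no time steps exist before the epoch
        if hmac.compare_digest(totp(secret, t, step, digits), submitted):
            return True
    return False


DEMO_SECRET = b"12345678901234567890"  # demo secret from the RFC 6238 test vectors

print(totp(DEMO_SECRET))                     # code for the current 30-second step
print(verify(DEMO_SECRET, totp(DEMO_SECRET)))
```

A code shown on screen proves only that whoever produced the recording held the secret during that time step. It says nothing about whether the rest of the content is genuine.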

eleitl@lemmy.ml · 2 years ago

There is an existing realtime, interactive online validation process for those IDs that can’t be verified cryptographically. No, you can’t deepfake that right now. Nor anytime soon.

There is no such thing as an effective "AI detector", nor will there ever be one.

What’s an “AI detector”?

What does “effective” mean?

Why should the accuracy bar be so high? Isn’t anything better than a coin flip good enough?

Why can’t a good AI detector be built?

Why do these “AI detectors” keep getting advertised if they don’t work?