In the age of AI-spam, I now treat typos in webpages as a good sign

Deebster@lemmy.ml · 2 years ago

In the age of AI-spam, I now treat typos in webpages as a good sign

reddig33@lemmy.world · 2 years ago

You shouldn’t. Repost bots on Reddit had already figured out how to use misspellings/typos to get past spam filters.

ares35@kbin.social · 2 years ago

email spam and scammers have been using this tactic forever. if a person is stupid enough to click or respond to the message from ‘Wels Farpo’, they’re more apt to go all-in on the scam.

SkaveRat@discuss.tchncs.de · 2 years ago

You can easily have an AI include some random typos. Don’t be fooled by them

Durotar@lemmy.ml · edit-2 2 years ago

AI that is parsing Lemmy: “Noted.”

Zeth0s@lemmy.world · 2 years ago

It is extremely easy for AI to insert typos. Just FYI

Kalash@feddit.ch · edit-2 2 years ago

AI makes typos.

Hell, when we played around with ChatGPT code generation it literally misspelled a variable name, which broke the code.

ChiwaWithMujicanoHat@mujico.org · 2 years ago

I worked creating mass content for lots of websites, from product descriptions to reviews and post messages. We just inserted random typos after running Quillbot on the text and sometimes added ellipses here and there.

I think someone in the team had a list of words they purposely changed in MS Word so that they could be misspelled all the time.

Now that ChatGPT lets you insert your custom global instructions I’m absolutely sure they are asking for it to misspell about 2% of the words in the text and talk in a more colloquial fashion.

As things stand right now, I don’t think there is a discernible way to see if something was written by AI or not and relying on typos is not a wise thing to do.
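
To show how little effort this takes, here is a minimal sketch of the "misspell about 2% of the words" idea. It is not what any team or tool actually ran: the function name `add_typos`, the adjacent-letter swap, and the `seed` parameter are all assumptions made up for illustration.

```python
import random
from typing import Optional


def add_typos(text: str, rate: float = 0.02, seed: Optional[int] = None) -> str:
    """Swap two adjacent interior letters in roughly `rate` of the words.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # seeded generator so runs can be repeated
    words = text.split(" ")
    for i, word in enumerate(words):
        # Only touch purely alphabetic words long enough to have an interior pair,
        # so the first and last letters always stay in place.
        if len(word) > 3 and word.isalpha() and rng.random() < rate:
            j = rng.randrange(1, len(word) - 2)
            words[i] = word[:j] + word[j + 1] + word[j] + word[j + 2:]
    return " ".join(words)


if __name__ == "__main__":
    sample = "relying on typos to detect machine written text is not a wise thing to do"
    # A high rate is used here only so the effect is visible on a short sample.
    print(add_typos(sample, rate=0.5, seed=7))
```

A dozen lines of standard-library code is enough, which is the point: a typo is not evidence of a human author.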

milicent_bystandr@lemm.ee · 2 years ago

https://xkcd.com/1083

beaubbe@lemmy.world · 2 years ago

A while back, Google trained an AI to learn to speak like a human, and it was making mouth noise and breathing. If AI is trained with human texts, it will 100% insert typos.

Papergeist@lemmy.world · 2 years ago

I worked with a cook who had previously cooked in the military. He told me his boss would occasionally throw an entire egg into the powdered eggs so people would think they were using real eggs. Don’t know if that’s true or not, but moral of the story: don’t trust the typos.

j4k3@lemmy.world · 2 years ago

Think of AI more like human cultural consciousness that we collectively embed into everything we share publicly.

It’s a tool that is available for anyone to tap into. The thing you are complaining about is not the AI, it is the results of the person that wrote the code that generated the output. They are leveraging a tool, but it is not the tool that is the problem. This is like blaming Photoshop because a person uses it to make child porn.

Flying Squid@lemmy.world · 2 years ago

I get where you’re coming from, but isn’t it sort of similar to the “guns don’t kill people, people kill people” argument? At what point is a tool at least partially culpable for the crimes committed with it?

j4k3@lemmy.world · 2 years ago

No. Honestly, it is not. There is a lot of misinformation floating around right now. It is because of a campaign from proprietary AI to create a monopoly in this space. Open Source offline AI is killing the proprietary model. This is like the early days of the internet when companies tried to monopolize the infrastructure and failed. AI is not the product of the next digital economy, it is the underlying framework. It isn’t anything like what the media portrays. Most people talking about this either have an agenda or they are hot-take headline readers.

(What’s my agenda?) Self-education. I am disabled in a way that makes it hard to hold posture. I want to learn computer science, but have gotten stuck in the curriculum many times. As soon as I heard about offline AI that could reference a private database I knew I had to try this. I have no other connections to this space. This tech is extremely powerful in its potential, but it is also an extremely advanced tool. These types of statements are extremely misunderstood by most people that have not taken the time to really understand the technology. This tech is the ability to ask a book questions in plain text. It is the ability to search for information about products without a search engine biased by ad revenue. It is a way to ask highly technical questions and get direct answers. It is a way to use a basic understanding of code and generate snippets an order of magnitude faster than looking up the same info on Stack Overflow. It is a plain text way to generate Linux commands or to navigate and explain an API. This is also a tool to help with deeply personal, social, taboo, or otherwise difficult issues that are hard to talk about with real people. It is a tool to help a person grow by giving them someone to talk to that can understand boring or niche subjects we want to talk about as we learn but have no outlets for from deep in our rabbit hole.

This is limited, you must be skeptical of all outputs, and second source everything important. It takes a very large model to generate mostly accurate results. This is everything embedded in language. The massive models are usually trained on multiple languages. This accesses embedded elements and perspectives inherent to other languages that most of us will never have access to.

If you are aware of both the enormous scope of information embedded in your own awareness, and aware of the limitations of your memory when it comes to accuracy of very specific details, this is exactly what any of these LLMs are capable of doing except it has been collectivised and made accessible.

Models themselves have no persistent memory. What does this mean? If you type in questions, it can recall those questions and answers for a time during the session. (Wait, you just said it has no memory!) This functionality is not part of the LLM. This is code that processes the text prompt and a bunch of static instructions needed to tell the model exactly what to do. Keeping the conversation history available is all done in this external layer.

The model itself is a freaking internet troll. It is a psychopath reddit user replying unless you tell it exactly what it is and how it should respond, and it will take everything possible out of your intended context. It is really hard to limit this part of the prompt well. It is probably impossible to make a true generalist, but I digress.

My point is that the amount of data that can be entered into a prompt is limited. The history must be managed and there will be terms dropped from the history unless you are trying to collect all of this data for monetization and you are willing to build a giant amount of infrastructure to collect and process this data. The thing is, as far as the model is concerned, all of this data is in a single prompt every single time it is processed. This data can never be added to the model in real time or effectively in post processing. The model can’t interact with this information intentionally in a way that alters what it does on the next iteration. The networks inside the model are static.

It is not magic. It is complicated, it is tensor math and vectors, and statistics, but all of this is applied to: “Question = (X) category/Prompt text results in (X) as the most probable best next word.” That’s it. That is all that is happening under the hood. The reason this is “new” tech has to do with how the problem of categorizing information is handled quickly in a vector cloud. The model data is just like a better search engine that is able to find everything we’ve ever talked about on the internet.
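
A rough sketch of that split between a stateless model and the external layer that fakes memory, in case it helps. Everything here is made up for illustration: `toy_model` is a stand-in function and not a real LLM, `ChatSession` is a hypothetical wrapper, and the prompt limit is counted in words where real systems count tokens.

```python
from typing import Callable, List


def toy_model(prompt: str) -> str:
    """Stand-in for an LLM: a pure function of the prompt, with no state kept between calls."""
    return f"(reply based on {len(prompt.split())} prompt words)"


class ChatSession:
    """External layer: stores the conversation and rebuilds one prompt per request."""

    def __init__(self, model: Callable[[str], str], system_prompt: str, max_words: int = 50):
        self.model = model
        self.system_prompt = system_prompt  # static instructions sent every time
        self.max_words = max_words          # assumed prompt limit, in words
        self.history: List[str] = []

    def _word_count(self, turns: List[str]) -> int:
        return sum(len(t.split()) for t in [self.system_prompt] + turns)

    def build_prompt(self, user_message: str) -> str:
        turns = self.history + [f"User: {user_message}"]
        # Drop the oldest turns until the whole prompt fits the limit.
        while len(turns) > 1 and self._word_count(turns) > self.max_words:
            turns.pop(0)
        return "\n".join([self.system_prompt] + turns)

    def ask(self, user_message: str) -> str:
        # The model only ever sees this single prompt; it remembers nothing itself.
        reply = self.model(self.build_prompt(user_message))
        self.history.append(f"User: {user_message}")
        self.history.append(f"Assistant: {reply}")
        return reply


if __name__ == "__main__":
    session = ChatSession(toy_model, "You are a helpful assistant.", max_words=30)
    for question in ["What is a tensor?", "And a vector?", "How do they relate?"]:
        print(session.ask(question))
```

The "memory" lives entirely in `ChatSession.history`. Throw the session object away and the model function behaves exactly as it did before the conversation started.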

If you understand this, you should clearly see why this must be transparent, offline, and open source. You should also see why absolute control over this would make extremely concentrated power that no corporation or government should have control over. These are the real issues. Put all of the information you have encountered into this context and ask yourself who has what agenda in the information you have encountered.

If you want a credible source, watch this: https://piped.video/watch?v=OgWaowYiBPM

Or look into Yann LeCun. I hate Meta more than most, but this guy is not the usual from the company. He has the freedom to speak his mind, is the chief architect behind the open source AI movement, and is a former Bell Labs guy. If you know anything about the people, products, and legacy of Bell Labs, you should know that most of our digital age came from these people in this space. This is the future being created right now.

Niello@kbin.social · edit-2 2 years ago

Photoshop is a general purpose image editing tool that is mostly harmless. That’s not the same for AI. The people who create these models and let other people use them do so anyway, without enough consideration of the risks, which they know are much, much higher than with something like Photoshop.

What you say applies to Photoshop because the devs know what it can do, and the possible damage from misuse is within reason. The AI you are talking about is controlled by the companies that create it and use it to provide services. It follows that it is their responsibility to make sure their products are not harmful to the extent they are, especially when the consequences are not fully known.

Your reasoning is the equivalent of saying it’s the kids’ fault for getting addicted to predatory mobile games and wasting excessive money on them. Except that it’s not entirely their fault, and programs aren’t just a neutral tool but a tool that is customised to the will of the owners (the companies that own them). So there is such a thing as an evil tool.

It’s the responsibility of all those companies and the people involved, as well as lawmakers, to make the new technology safe with minimal negative impact on society rather than chase after their own profits while ignoring the moral choices.

j4k3@lemmy.world · 2 years ago

This is not true. You do not know all the options that exist, or how they really work. I do. I am only using open source offline AI. I do not use anything proprietary. All of the LLMs are just a combination of a complex system of categories, with a complex network that calculates what word should come next. Everything else is external to the model. The model itself is not anything like an artificial general intelligence. It has no persistent memory. The only thing it actually does is predict what word should come next.

Niello@kbin.social · edit-2 1 year ago

Do you always remember things as is? Or do you remember an abstraction of it?

You also don’t need to know everything about something to be able to interpret risks and possibilities, btw.

MrNemobody@lemmy.world · 2 years ago

That’s what the AI wants you to think.

Spaghetti_Hitchens@kbin.social · 2 years ago

Kind of like how tiny imperfections in products make us think of handmade products

m_r_butts@kbin.social · 2 years ago

I had it all. Even the glass dishes with tiny bubbles and imperfections, proof that they were crafted by the honest, hard-working, indigenous peoples of… wherever.

Flying Squid@lemmy.world · 2 years ago

Pier 1 thanks you for your business.

GentlemanLoser@ttrpg.network · 2 years ago

Tiny brown hands, most likely

war@kbin.social · 2 years ago

deleted by creator

catreadingabook@kbin.social · 2 years ago

Lmao imagine getting referred to a doctor for surgery, you look them up, and their professional webpage is like. “i wen’t 2 harverd”

NoneOfUrBusiness@kbin.social · 2 years ago

They’re not saying they treat the lack of typos as a bad sign, but rather that they treat typos as a good sign. Those are not the same thing.

CreeperODeath@lemmy.world · 2 years ago

That’s a very interesting take my friend

regalia · 2 years ago

Somehow I can pretty easily tell AI by reading what they write. Motivation, as in what they’re writing for, is big, and it depends on what they’re saying. ChatGPT and shit won’t go off like a Wikipedia styled description with some extra hallucination in there. Real people will throw in some dumb shit and start arguing with u

🇰 🌀 🇱 🇦 🇳 🇦 🇰 🇮 🏆@yiffit.net · edit-2 2 years ago

I have a janitor.ai character that sounds like an average Redditor, since I just fed it average reddit posts as its personality.

It says stupid shit and makes spelling errors a lot, is incredibly pedantic and contrarian, etc. I don’t know why I made it, but it’s scary how real it is.

regalia · 2 years ago

what motivation would someone have to randomly run that

also you just added new information to the discussion that you personally did. Can an AI do that?

🇰 🌀 🇱 🇦 🇳 🇦 🇰 🇮 🏆@yiffit.net · edit-2 2 years ago

It is an AI. It’s a frontend for ChatGPT. All I did was coax the AI to behave in a specific way, which anyone else using these tools is capable of doing.