Google Books ingesting the AI-generated rubbish

beschizza · April 8, 2024, 2:56pm

Originally published at: https://boingboing.net/2024/04/08/google-books-ingesting-the-ai-generated-rubbish.html

…

anon61221983 · April 8, 2024, 2:56pm

No Way Reaction GIF by Originals

theophrastus · April 8, 2024, 3:26pm

Quasi-related, there’s a good news item in today’s New York Times (yes yes i know) about the early insane dash to get data to train big money ‘A.I.’ products, mostly at OpenAI, Google and ‘meta’. Their large language models (LLMs) were data-starved for input so they took to speech recognition and YouTube and similar sources for input. Just contemplate the error rate of speech recognition one generally sees (typically around 15% for diverse English speakers!), and then realize that that’s the A.I. garbage-in-garbage-out that they want to run our lives and diagnose our conditions.

How Tech Giants Cut Corners to Harvest Data for A.I.

robertmckenna · April 8, 2024, 3:34pm

Google Books has always been a dumpster fire for quality control. It’s kind of a Google “tell” at this stage, like “as of my last knowledge update”.

sandrodacruz · April 8, 2024, 4:54pm

I’m pretty much braced for the entire internet to sink into a godawful glob of Googley grey-goo gobbledygook.

Horselover_Fat · April 8, 2024, 5:46pm

Benjamin Bratton and Blaise Agüera y Arcas describe this as the ‘Ouroboros Language Problem’ in this text written for Noema.

Basically, as AI-generated material becomes the majority of online content, the AI generators will increasingly begin to take in more of their own output as source material. Presumably this would lead to an ever-lessening pool of probable results and a self-marginalization of AI as it enshittifies the internet.

fuzzyfungus · April 8, 2024, 5:52pm

I’m not surprised that they don’t care about the results; I’m a bit surprised that they aren’t more concerned about their precious bots.

My (layman’s) understanding was that a few rounds of inhuman centipede with bots training on bot spew had significant negative effects on the performance of the model.

Kilkrazy · April 8, 2024, 6:33pm

Just As Planned.

Fortunately for human writers we can come up with new, original, well-written stuff because we have brains rather than a very large database of vocab concordances.

robertmckenna · April 8, 2024, 6:45pm

Funny! Because I’ve been calling it “Ouroboros, only the snake(’s tail) is made of shit”.

gracchus · April 8, 2024, 11:59pm

A portrait of AI-driven Google Books.

FGD135 · April 10, 2024, 6:08am

I rather hope today’s “AI” will never be as powerful or long-lived as that.

Doctor_Faustus · April 10, 2024, 7:12am

I understand the concern about the overwhelming flood of AI-generated content and Google’s pervasive influence on the internet landscape. As AI continues to evolve, there’s a risk of drowning in a sea of generic, algorithmically-produced content that lacks depth and authenticity. However, amidst this challenge, there are still avenues for curated, human-centric content and platforms that prioritize quality over quantity. By actively supporting and participating in these spaces, we can counteract the homogenizing effects of Google’s search algorithms and ensure that the internet remains a diverse and enriching ecosystem for all.

[Yes, this is of course AI generated grey goo]

sandrodacruz · April 10, 2024, 12:21pm

I guess the real problem isn’t AI at all. It’s search engine optimisation that broke the contract between what people ask for and what online services serve up.
A search box has become a malevolent Djinni, giving me what I asked for, but not what I want.

AI language models simply make the Djinni more powerful. Why homogenise? Now it can conjure custom content off the cuff. Content crowded with convincing communities.
People just like me, sharing their authentic views, every one of them a homunculus of hype…

system · April 13, 2024, 2:57pm

This topic was automatically closed after 5 days. New replies are no longer allowed.
