AI systems could be on the verge of collapsing into nonsense, scientists warn

Published

July 24, 2024

Admin

AI systems could collapse into nonsense as more of the internet gets filled with content made by artificial intelligence, researchers have warned.

Recent years have seen increased excitement about text-generating systems such as OpenAI’s ChatGPT. That excitement has led many to publish blog posts and other content created by those systems, and ever more of the internet has been produced by AI.

Many of the companies producing those systems use text taken from the internet to train them, however. That may lead to a loop in which the same AI systems being used to produce that text are then being trained on it.

That could quickly lead those AI tools to fall into gibberish and nonsense, researchers have warned in a new paper. Their warnings come amid a more general worry about the “dead internet theory”, which suggests that more and more of the web is becoming automated in what could be a vicious cycle.

It takes only a few cycles of both generating and then being trained on that content for those systems to produce nonsense, according to the research.

They found that one system tested with text about medieval architecture only needed nine generations before the output was just a repetitive list of jackrabbits, for instance.

The process in which AI is trained on datasets that were themselves created by AI, which then pollutes its output, has been referred to as “model collapse”. Researchers warn that it could become increasingly prevalent as AI systems are used more across the internet.

It happens because, as those systems produce data and are then trained on it, the less common parts of the data tend to be left out. Researcher Emily Wenger, who did not work on the study, used the example of a system trained on pictures of different dog breeds: if there are more golden retrievers in the original data, then it will pick those out, and as the process goes round those other dogs will eventually be left out entirely – before the system falls apart and just generates nonsense.
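The way rare items drop out can be illustrated with a toy simulation. The sketch below is not the method used in the study: it simply assumes that each "generation" draws a finite sample from the previous generation's data and treats that sample as the new training set. The function name, the breed names other than golden retriever, and all the counts are hypothetical, chosen only for illustration.

```python
import random
from collections import Counter


def resample_generations(initial_counts, generations, sample_size, seed=0):
    """Toy model of recursive training (an illustrative assumption,
    not the study's procedure).

    Each generation draws `sample_size` items in proportion to the
    previous generation's counts, then uses that sample as its data.
    Returns the set of surviving categories after each generation.
    """
    rng = random.Random(seed)
    counts = {k: v for k, v in initial_counts.items() if v > 0}
    history = [set(counts)]
    for _ in range(generations):
        categories = list(counts)
        weights = [counts[c] for c in categories]
        sample = rng.choices(categories, weights=weights, k=sample_size)
        # A category that is not drawn has no examples left, so it can
        # never be produced by any later generation.
        counts = Counter(sample)
        history.append(set(counts))
    return history


# Hypothetical starting data: one common breed and several rare ones.
breeds = {
    "golden retriever": 900,
    "dalmatian": 40,
    "basset hound": 30,
    "corgi": 20,
    "whippet": 10,
}

for generation, survivors in enumerate(
    resample_generations(breeds, generations=9, sample_size=50)
):
    print(generation, sorted(survivors))
```

In this toy setting a breed that misses out in one round is gone for good, so the set of surviving breeds can only shrink, and the rare breeds are the most likely to be lost first. Real language models are far more complicated, but the sketch shows why the less common parts of the data are the most exposed.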

The same effect happens with large language models like those that power ChatGPT and Google’s Gemini, the researchers found.

That could be a problem not only because the systems eventually become useless, but also because they will gradually become less diverse in their outputs. As the data is produced and recycled, the systems may fail to reflect all of the variety of the world, and smaller groups or outlooks might be erased entirely.

The problem “must be taken seriously if we are to sustain the benefits of training from large-scale data scraped from the web”, the researchers write in their paper. It might also mean that those companies that have already scraped data to train their systems could be in a beneficial position, since data taken earlier will have more genuine human output in it.

The problem could be addressed in a number of ways, including watermarking AI output so that it can be spotted by automated systems and then filtered out of training sets. But it is easy to remove those watermarks and AI companies have been resistant to working together to use them, among other issues.

The study, ‘AI models collapse when trained on recursively generated data’, is published in Nature.
