Sunday, June 18, 2023

Will GPT Models Choke on Their Own Exhaust?

Until about now, most of the text online was written by humans. But this text has been used to train GPT3(.5) and GPT4, and these have popped up as writing assistants in our editing tools. So more and more of the text will be written by large language models (LLMs). Where does it all lead? What will happen to GPT-{n} once LLMs contribute most of the language found online?

And it’s not just text. If you train a music model on Mozart, you can expect output that’s a bit like Mozart but without the sparkle – let’s call it ‘Salieri’. And if Salieri now trains the next generation, and so on, what will the fifth or sixth generation sound like?

In our latest paper, we show that using model-generated content in training causes irreversible defects. The tails of the original content distribution disappear. Within a few generations, text becomes garbage, as Gaussian distributions converge and may even become delta functions. We call this effect model collapse.
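To make the mechanism concrete, here is a small toy simulation (my own sketch, not code from the paper): fit a Gaussian to a finite sample, draw a fresh sample from the fit, refit, and repeat. Because each generation only ever sees a finite sample of the previous generation's output, estimation error compounds, the fitted variance tends to drift toward zero, and the distribution collapses toward a delta function. The sample sizes and generation count below are arbitrary, chosen just to make the drift visible.

```python
# Toy illustration of recursive training on model-generated data
# (illustrative sketch only; parameters are my own, not from the paper).
import numpy as np

rng = np.random.default_rng(0)

n_samples = 100      # size of each generation's training set (assumed)
n_generations = 50   # how many times we retrain on the previous model's output

mu, sigma = 0.0, 1.0  # generation 0: the "human" data distribution

for gen in range(1, n_generations + 1):
    # Draw a finite sample from the current model's distribution...
    data = rng.normal(mu, sigma, n_samples)
    # ...and fit the next-generation model to that sample alone.
    mu, sigma = data.mean(), data.std()
    if gen % 10 == 0:
        print(f"generation {gen:3d}: mean={mu:+.3f}, std={sigma:.3f}")

# Typically the fitted std shrinks over the generations: the tails of the
# original distribution vanish first, and the fit heads toward a delta
# function -- the "model collapse" behaviour described above.
```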

Just as we’ve strewn the oceans with plastic trash and filled the atmosphere with carbon dioxide, so we’re about to fill the Internet with blah. This will make it harder to train newer models by scraping the web, giving an advantage to firms which already did that, or which control access to human interfaces at scale. Indeed, we already see AI startups hammering the Internet Archive for training data.

After we published this paper, we noticed that Ted Chiang had already commented on the effect in February, noting that ChatGPT is like a blurry jpeg of all the text on the Internet, and that copies of copies get worse. In our paper we work through the math, explain the effect in detail, and show that it is universal.

by Ross Anderson, Light Blue Touchpaper | Read more:
Image: Reddit
[ed. Do read Ted Chiang's excellent essay - ChatGPT Is a Blurry JPEG of the Web; and also: The Age of Chat (New Yorker - excerpt below). I had to work with N.P.C.s all my career and until now didn't have a good term for them, other than idiots...lol.]

"The gaming industry has long relied on various forms of what might be called A.I., and at G.D.C. at least two of the talks addressed the use of large language models and generative A.I. in game development. They were focussed specifically on the construction of non-player characters, or N.P.C.s: simple, purely instrumental dramatis personae—villagers, forest creatures, combat enemies, and the like—who offer scripted, constrained, utilitarian lines of dialogue, known as barks. (“Hi there”; “Have you heard?”; “Look out!”) N.P.C.s are meant to make a game feel populated, busy, and multidimensional, without being too distracting. Meanwhile, outside of gaming, the N.P.C. has become a sort of meme. In 2018, the Times identified the term “N.P.C.” as “the Pro-Trump Internet’s new favorite insult,” after right-wing Internet users began using it to describe their political opponents as mechanical chumps brainwashed by political orthodoxy. These days, “N.P.C.” can be used to describe another person, or oneself, as generic—basic, reproducible, ancillary, pre-written, powerless.

When I considered the possibilities of N.P.C.s powered by L.L.M.s, I imagined self-replenishing worlds of talk, with every object and character endowed with personality. I pictured solicitous orcs, garrulous venders, bantering flora. In reality, the vision was more workaday. N.P.C.s “may be talking to the player, to each other, or to themselves—but they must speak without hesitation, repetition, or deviation from the narrative design,” the summary for one of the talks, given by Ben Swanson, a researcher at Ubisoft, read. Swanson’s presentation promoted Ubisoft’s Ghostwriter program, an application designed to reduce the busywork involved in creating branching “trees” of dialogue, in which the same types of conversations occur many times, slightly permuted depending on what players say. Ghostwriter uses a large language model to suggest potential barks for N.P.C.s, based on various criteria inputted by humans, who then vet the model’s output. The player doesn’t directly engage the A.I.

Yet it seems likely that “conversations” with A.I. will become more common in the coming years. Large language models are being added to search engines and e-commerce sites, and into word processors and spreadsheet software. In theory, the capabilities of voice assistants, such as Apple’s Siri or Amazon’s Alexa, could become more complex and layered. Call centers are introducing A.I. “assistants”; Amazon is allegedly building a chatbot; and Wendy’s, the fast-food chain, has been promoting a drive-through A.I. Earlier this spring, the Khan Lab School, a small private school in Palo Alto, introduced an experimental tutoring bot to its students, and the National Eating Disorders Association tried replacing its human-run telephone helpline with a bot dubbed Tessa. (NEDA suspended the service after Tessa gave harmful advice to callers.)

Chatbots are not the end point of L.L.M.s. Arguably, the technology’s most impressive capabilities have less to do with conversation and more to do with processing data—unearthing and imitating patterns, regurgitating inputs to produce something close to summaries—of which conversational text is just one type. Still, it seems that the future could hold endless opportunities for small talk with indefatigable interlocutors. Some of this talk will be written, and some may be conducted through verbal interfaces that respond to spoken commands, queries, and other inputs. Anything hooked up to a database could effectively become a bot, or an N.P.C., whether in virtual or physical space. We could be entering an age characterized by endless chat—useful or skimmable, and bottomless by design."