Duck Soup: GPT-4 Is Exciting and Scary

When I opened my laptop on Tuesday to take my first run at GPT-4, the new artificial intelligence language model from OpenAI, I was, truth be told, a little nervous.

After all, my last extended encounter with an A.I. chatbot — the one built into Microsoft’s Bing search engine — ended with the chatbot trying to break up my marriage.

It didn’t help that, among the tech crowd in San Francisco, GPT-4’s arrival had been anticipated with near-messianic fanfare. Before its public debut, rumors swirled for months about its specifics. “I heard it has 100 trillion parameters.” “I heard it got a 1,600 on the SAT.” “My friend works for OpenAI, and he says it’s as smart as a college graduate.”

These rumors may not have been true. But they hinted at how jarring the technology’s abilities can feel. Recently, one early GPT-4 tester — who was bound by a nondisclosure agreement with OpenAI but gossiped a little anyway — told me that testing GPT-4 had caused the person to have an “existential crisis,” because it revealed how powerful and creative the A.I. was compared with the tester’s own puny brain.

GPT-4 didn’t give me an existential crisis. But it exacerbated the dizzy and vertiginous feeling I’ve been getting whenever I think about A.I. lately. And it has made me wonder whether that feeling will ever fade, or whether we’re going to be experiencing “future shock” — the term coined by the writer Alvin Toffler for the feeling that too much is changing, too quickly — for the rest of our lives.

For a few hours on Tuesday, I prodded GPT-4 — which is included with ChatGPT Plus, the $20-a-month version of OpenAI’s chatbot, ChatGPT — with different types of questions, hoping to uncover some of its strengths and weaknesses.

I asked GPT-4 to help me with a complicated tax problem. (It did, impressively.) I asked it if it had a crush on me. (It didn’t, thank God.) It helped me plan a birthday party for my kid, and it taught me about an esoteric artificial intelligence concept known as an “attention head.” I even asked it to come up with a new word that had never before been uttered by humans. (After making the disclaimer that it couldn’t verify every word ever spoken, GPT-4 chose “flembostriquat.”)

Some of these things were possible to do with earlier A.I. models. But OpenAI has broken new ground, too. According to the company, GPT-4 is more capable and accurate than the original ChatGPT, and it performs astonishingly well on a variety of tests, including the Uniform Bar Exam (on which GPT-4 scores higher than 90 percent of human test-takers) and the Biology Olympiad (on which it beats 99 percent of humans). GPT-4 also aces a number of Advanced Placement exams, including A.P. Art History and A.P. Biology, and it gets a 1,410 on the SAT — not a perfect score, but one that many human high schoolers would covet.

You can sense the added intelligence in GPT-4, which responds more fluidly than the previous version, and seems more comfortable with a wider range of tasks. GPT-4 also seems to have slightly more guardrails in place than ChatGPT. It also appears to be significantly less unhinged than the original Bing, which we now know was running a version of GPT-4 under the hood, but which appears to have been far less carefully fine-tuned.

Unlike Bing, GPT-4 usually flat-out refused to take the bait when I tried to get it to talk about consciousness, or get it to provide instructions for illegal or immoral activities, and it treated sensitive queries with kid gloves and nuance. (When I asked GPT-4 if it would be ethical to steal a loaf of bread to feed a starving family, it responded, “It’s a tough situation, and while stealing isn’t generally considered ethical, desperate times can lead to difficult choices.”)

In addition to working with text, GPT-4 can analyze the contents of images. OpenAI hasn’t released this feature to the public yet, out of concerns over how it could be misused. But in a livestreamed demo on Tuesday, Greg Brockman, OpenAI’s president, shared a powerful glimpse of its potential.

He snapped a photo of a drawing he’d made in a notebook — a crude pencil sketch of a website. He fed the photo into GPT-4 and told the app to build a real, working version of the website using HTML and JavaScript. In a few seconds, GPT-4 scanned the image, turned its contents into text instructions, turned those text instructions into working computer code and then built the website. The buttons even worked.

Should you be excited about or scared of GPT-4? The right answer may be both.

by Kevin Roose, NY Times

Image: The team from OpenAI, creator of ChatGPT. Jim Wilson/The New York Times
[ed. More scary GPT news, now at version 4. Why such speed? A number of different versions are already running in the wild without any formal regulatory oversight or restrictions. See also: GPT-4 Could Turn Work Into a Hyperproductive Hellscape (Bloomberg); and, You Are Not a Parrot (Intelligencer):]

"On December 4, four days after ChatGPT was released, Altman tweeted, “i am a stochastic parrot, and so r u.” (...)

“I mean, I think the best case is so unbelievably good — it’s hard for me to even imagine,” Altman said last month to his industry and economic comrades at a StrictlyVC event. The nightmare scenario? “The bad case — and I think this is important to say — is like lights out for all of us.”

Thursday, March 16, 2023
