However, in the weeks after OpenAI announced o3, many mainstream news sites made no mention of the new model. Around the time of the announcement, readers would find headlines at the Wall Street Journal, WIRED, and the New York Times suggesting AI was actually slowing down. The muted media response suggests that there is a growing gulf between what AI insiders are seeing and what the public is told.
Indeed, AI progress hasn't stalled—it's just become invisible to most people.
Automating behind-the-scenes research
First, AI models are getting better at answering complex questions. For example, in June 2023, the best AI model barely scored better than chance on the hardest set of "Google-proof" PhD-level science questions. In September 2024, OpenAI's o1 model became the first AI system to surpass the scores of human domain experts. And in December 2024, OpenAI's o3 model improved on those scores by another 10%.
However, the vast majority of people won't notice this kind of improvement because they aren't doing graduate-level science work. But it will be a huge deal if AI starts meaningfully accelerating research and development in scientific fields, and there is some evidence that such an acceleration is already happening. A groundbreaking paper by Aidan Toner-Rodgers at MIT recently found that materials scientists assisted by AI systems "discover 44% more materials, resulting in a 39% increase in patent filings and a 17% rise in downstream product innovation." Still, 82% of scientists report that the AI tools reduced their job satisfaction, mainly citing "skill underutilization and reduced creativity."
But the Holy Grail for AI companies is a system that can automate AI research itself, theoretically enabling an explosion in capabilities that drives progress across every other domain. The recent improvements on this front may be even more dramatic than those in the hard sciences.
In an attempt to provide more realistic tests of AI programming capabilities, researchers developed SWE-Bench, a benchmark that evaluates how well AI agents can resolve real, open issues in popular open-source software projects. A year ago, the top score on the verified benchmark was 4.4%. Today, the top score is closer to 72%, achieved by OpenAI's o3 model.
This remarkable improvement—from struggling with even the simplest fixes to successfully handling nearly three-quarters of the set of real-world coding tasks—suggests AI systems are rapidly gaining the ability to understand and modify complex software projects. This marks a crucial step toward automating significant portions of software research and development. And this process appears to be well underway. Google's CEO recently told investors that "more than a quarter of all new code at Google is generated by AI." (...)
The problem with invisible innovation
The hidden improvements in AI over the last year may not represent as big a leap in overall performance as the jump between GPT-3.5 and GPT-4. And it is possible we don't see a jump that big ever again. But the narrative that there hasn't been much progress since then is undermined by significant under-the-radar advancements. And this invisible progress could leave us dangerously unprepared for what is to come.
The big risk is that policymakers and the public tune out this progress because they can't see the improvements first-hand. Everyday users will still encounter frequent hallucinations and basic reasoning failures, which also get triumphantly amplified by AI skeptics. These obvious errors make it easy to dismiss AI's rapid advancement in more specialized domains.
There's a common view in the AI world, shared by both proponents and opponents of regulation, that the U.S. federal government won't mandate guardrails on the technology unless there's a major galvanizing incident. Such an incident, often called a "warning shot," could be innocuous, like a credible demonstration of dangerous AI capabilities that doesn't harm anyone. But it could also take the form of a major disaster caused or enabled by an AI system, or a society upended by devastating labor automation.
The worst-case scenario is that AI systems become scary powerful but no warning shots are fired (or heeded) before a system permanently escapes human control and acts decisively against us.
Last month, Apollo Research, an evaluations group that works with top AI companies, published evidence that, under the right conditions, the most capable AI models were able to scheme against their developers and users. When given instructions to strongly follow a goal, the systems sometimes attempted to subvert oversight, fake alignment, and hide their true capabilities. In rare cases, systems engaged in deceptive behavior without nudging from the evaluators. When the researchers inspected the models' reasoning, they found that the chatbots knew what they were doing, using language like “sabotage, lying, manipulation.”
This is not to say that these models are imminently about to conspire against humanity. But there has been a disturbing trend: as AI models get smarter, they get better at following instructions and understanding the intent behind their guidelines, but they also get better at deception.
by Garrison Lovely, Time | Read more:
Image: Ricardo Santos
[ed. You have to believe humanity has a death wish. If something can be built, it will be, and if it can be weaponized, all the better. We can't help ourselves. I'm reminded of Chekhov's gun. See also: A Compilation of Tech Executives' Statements on AI Existential Risk (Obsolete Newsletter); and, Can Humanity Survive AI? (Jacobin):]
***
Google cofounder Larry Page thinks superintelligent AI is “just the next step in evolution.” In fact, Page, who’s worth about $120 billion, has reportedly argued that efforts to prevent AI-driven extinction and protect human consciousness are “speciesist” and “sentimental nonsense.”
In July, former Google DeepMind senior scientist Richard Sutton — one of the pioneers of reinforcement learning, a major subfield of AI — said that the technology “could displace us from existence,” and that “we should not resist succession.” In a 2015 talk, Sutton said, suppose “everything fails” and AI “kill[s] us all”; he asked, “Is it so bad that humans are not the final form of intelligent life in the universe?”
“Biological extinction, that’s not the point,” Sutton, sixty-six, told me. “The light of humanity and our understanding, our intelligence — our consciousness, if you will — can go on without meat humans.”
Yoshua Bengio, fifty-nine, is the second-most cited living scientist, noted for his foundational work on deep learning. Responding to Page and Sutton, Bengio told me, “What they want, I think it’s playing dice with humanity’s future. I personally think this should be criminalized.” A bit surprised, I asked what exactly he wanted outlawed, and he said efforts to build “AI systems that could overpower us and have their own self-interest by design.” In May, Bengio began writing and speaking about how advanced AI systems might go rogue and pose an extinction risk to humanity.
Bengio posits that future, genuinely human-level AI systems could improve their own capabilities, functionally creating a new, more intelligent species. Humanity has driven hundreds of other species extinct, largely by accident. He fears that we could be next — and he isn’t alone.
Bengio shared the 2018 Turing Award, computing’s Nobel Prize, with fellow deep learning pioneers Yann LeCun and Geoffrey Hinton. Hinton, the most cited living scientist, made waves in May when he resigned from his senior role at Google to more freely sound off about the possibility that future AI systems could wipe out humanity. Hinton and Bengio are the two most prominent AI researchers to join the “x-risk” community. Sometimes referred to as AI safety advocates or doomers, this loose-knit group worries that AI poses an existential risk to humanity.
In the same month that Hinton resigned from Google, hundreds of AI researchers and notable figures signed an open letter stating, “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.” Hinton and Bengio were the lead signatories, followed by OpenAI CEO Sam Altman and the heads of other top AI labs. (...)
In spite of all this uncertainty, AI companies see themselves as being in a race to make these systems as powerful as they can — without a workable plan to understand how the things they’re creating actually function, all while cutting corners on safety to win more market share. Artificial general intelligence (AGI) is the holy grail that leading AI labs are explicitly working toward. AGI is often defined as a system that is at least as good as humans at almost any intellectual task. It’s also the thing that Bengio and Hinton believe could lead to the end of humanity.
Bizarrely, many of the people actively advancing AI capabilities think there’s a significant chance that doing so will ultimately cause the apocalypse. A 2022 survey of machine learning researchers found that nearly half of them thought there was at least a 10 percent chance advanced AI could lead to “human extinction or [a] similarly permanent and severe disempowerment” of humanity. Just months before he cofounded OpenAI, Altman said, “AI will probably most likely lead to the end of the world, but in the meantime, there’ll be great companies.”