The company I am referring to is Anthropic, and the tool they possess is called Claude Mythos. Researchers at the company have said that the new model stands to fundamentally upend cybersecurity. At least, for the time being. They postulate that after a transitional period, the world will end up in a steady state where advanced AI benefits defenders rather than cyberattackers. Yet the transitional period could be a long and brutal storm, and we do not know what will break as it hits.
“The threat is not hypothetical,” they conclude. “Advanced language models are here.”
What we do next, both collectively and as individuals, will determine if we can weather the storm.
***
What do the capabilities of Mythos mean, prosaically speaking? It’s hard to say, because I do not have access to it, and in all likelihood, neither do you. The model is not currently public, and may never be in its current form. But broadly speaking, if one takes Anthropic at their word, the model can conduct automated software vulnerability discovery with nearly superhuman performance in some domains. The model can find security vulnerabilities in software, including software systems upon which modern civilization rests, that have eluded security researchers for years, and sometimes decades. The model has found thousands of vulnerabilities so far, most of which have not yet been fixed (for this reason, Anthropic has not publicized the exploits, but they have reported them to the developers of the software in question). An enormous range of consumer and commercial services—from banking to healthcare to education to AI itself—are plausibly implicated.
My model of modern software is that, if you look hard enough, you will find critical vulnerabilities. Looking hard, however, used to be expensive—only the best hackers in the world could do it, and their time was limited. With Mythos, the price of “looking hard” at software has plummeted, and it will get cheaper each month.
This is not wholly bad news; after all, “looking hard” at software is also how software gets improved. In that sense, Mythos, and the similarly capable models from other companies that will soon follow, are among the greatest gifts to cybersecurity ever given.
Yet as things stand today, the world is deeply vulnerable. Every day, you rely on untold millions of lines of code maintained by a global population of millions of developers. It will not all be fixed tomorrow, or next month, or next year. The reality is that models of this capability level—and more capable—will almost certainly diffuse widely before all “critical” software is patched. How much damage will be done is anyone’s guess.
If you doubted whether AI systems might have object-level national security implications, now we have clear evidence. Some of the most capable and prized teams in the United States intelligence community do precisely the kind of work that Claude Mythos automates. The same is true of China. You may be inclined to believe this will all work out fine in the end, but it is simply no longer credible to contend that there are no implications for national security from large language models, and therefore for government as a whole.
***
This has been a frustrating issue to discuss candidly for the past two years. The reason is that, in the adolescent period of AI policy and discourse that is now—I hope—coming to a close, taking AI risks seriously was considered uncouth. Speaking about how near-future models might have straightforwardly dangerous capabilities was enough to provoke suspicion: were you a secret “doomer” or Effective Altruist? Were you part of a grand conspiracy to achieve “regulatory capture” for the frontier AI companies? Were you trying to “ban open source”? These sorts of questions constrained debate and put blinders on a large number of otherwise-sane policymakers and other influential people. And these constraints, in turn, meant that one had to tiptoe around reality. But I am done with tiptoeing now, and so should everyone else be. It is a great relief, albeit also a bit uncomfortable, to feel the biting winds against one’s face.
In that spirit, here are some things I believe to be true:
1. Actors who are hostile to the U.S. will possess the capabilities of Mythos, if not better, within a year or two. We will not stop this through “nonproliferation” or some other clever regulatory scheme. We can only blunt the impact of this reality by strengthening our cyberdefenses rapidly.
2. Strengthening cyberdefense will require coordination among state and local government entities, private sector critical infrastructure operators, frontier labs, and the broader private sector, as well as the federal government. But even more importantly, it will require compute: data centers. In recent testimony to the Federal Energy Regulatory Commission, I wrote about the urgency of speeding transmission siting to facilitate the buildout of supercomputing infrastructure for national security. Running massive fleets of automated software vulnerability researchers is precisely one of the use cases I cited in that testimony. In addition to speeding up the FERC process through administrative actions, we need permitting reform urgently.
3. Speaking of national security: The U.S. Department of War, and the federal government more broadly, are engaged in a lawfare campaign against Anthropic whose underlying motivations are deeply unclear and which attacks core American values. The strategic wisdom of this campaign looks worse and worse by the week. We are fighting a war against Iran, a highly capable cyberoffensive actor. It is inconceivable that the government can have a healthy relationship with the frontier AI industry while attempting to destroy what is arguably the field’s leading company. Anthropic and the Department of War must come to a truce, if not a resolution, as soon as possible, for the good of America’s national defense.
4. In the context of national-security-relevant cybersecurity capabilities, the key and salient difference between the United States and China is not our “innovation ecosystem,” but instead the simple reality that our firms possess the computing power to train and operate models like Mythos today, and theirs do not. It is that simple. China is already prioritizing efforts to develop its own compute manufacturing capacity, and Mythos is likely to motivate those efforts even further. The best way to disrupt this is a serious increase in targeted export controls on semiconductor manufacturing equipment, too much of which flows freely today from the U.S. and its allies to China. It is long past time for major effort here from Congress and the Trump Administration.
5. The utility of SB 53, which requires frontier AI companies to disclose their assessments of their own models’ cybersecurity risks, is hopefully more apparent now. Some criticisms of that legislative framework have asserted that it attempts to control frontier AI or micromanage companies. But in truth, the framework rests on the notion that AI will not be controllable—that stopping the diffusion of potentially dangerous capabilities is impossible—and that therefore today’s “frontier” capabilities will be broadly dispersed within a short while. This is exactly why we need transparency about what developers see at the frontier: so that a large range of societal actors can prepare their defenses appropriately.
6. Today, Mythos is accessible only within Anthropic and to Anthropic’s chosen partners. Limited releases of this kind will likely be a growing trend because of both compute constraints and safety concerns. Mythos appears to be about five times more expensive to run than Opus, which was already not cheap, but for Anthropic the issue is not so much cost as it is allocating sufficient compute to serve Mythos to the public. This means that the best AI models of the future may be disproportionately, if not exclusively, used within frontier labs for their own purposes, which at least at first will be automated AI R&D. These so-called “internal deployments” have motivated my own pursuit of transparency and private governance frameworks, the latter being private organizations that would audit the safety and security posture of frontier AI companies, including their internal deployments.
***
I wrote on X that Mythos means the training wheels are coming off of AI policy. Perhaps the Department of War’s effort to strangle Anthropic is, to use another metaphor, a sign that the gloves are off too. If the last month has made anything clear, it is that we are in a nastier, sharper, harsher, meaner era of AI discourse, policy, and—ultimately—of AI development and use. I will be honest: I do not see how it is possible for Mythos-level capabilities to diffuse through the world without causing at least some significant security crises and economic disruption. And of course, this cycle of compute infrastructure buildout has only just begun; within a year or so, gigawatts of additional AI compute capacity will be online.
The pimply and ill-shapen adolescence of AI and AI policy has come to an end. The first maturity has now begun.
by Dean W. Ball, Hyperdimensional | Read more:
[ed. See also: Everything Reinforces Predictions and Policy Preferences (DWAtV):]
***
Indeed, Anthropic itself has ‘slowed down AI’ in this situation, and done the closest thing we have had to a pause, by not releasing Mythos widely, and pretty much everyone agrees this was the right thing to do. Consider that we might need more similar restraint, including more broadly.

But how long will it be before an open source version, even if somewhat inferior, is available? Will OpenAI and Google soon be showing similar capabilities? (And how will that shift the equilibrium?) Should we upgrade our estimates of the returns to investing in compute?

That depends on what counts as similar, especially with the ‘even if somewhat inferior.’ For reasonable values my guess is 1-2 years for open models in terms of absolute capabilities (by then bugs will be a lot harder to find), on the order of months for OpenAI, and probably a few more months for Google.
How will the willingness of attackers to pay for tokens evolve, relative to the willingness of defenders to pay for tokens? Which are our softest targets?

I think this absolutely will lead to higher economic concentration, as it favors economies of scale across the board. [...]
As a side effect, will this also lead to higher economic concentration, as perhaps only the larger institutions can invest in quality patches rapidly enough?
Solve For The Equilibrium
Tyler Cowen shares a model from Jacob Gloudemans of what might happen, where vulnerabilities become much easier to find quickly, but the big problems actually go away due to the increased velocity of defenses and patching.
Rather than being able to hoard exploits, everyone has to use their exploits right away or lose them, and most of the time most important actors don’t especially want to mess with any particular target, so they won’t even look for the exploits.
This model assumes good defense is being played where it counts, and that the supply of exploits is limited, and that when you catch an exploit you can defend against those who have already found it and tried to use it. I don’t think those are safe assumptions.
One also should consider the opposite scenario. Right now, an intelligence agency might find an exploit and sit on it for years, perhaps forever, because even if it normally goes unused its value at the right time is very high. But, if that exploit will not last, then they may try to use it.
Ultimately the equilibrium will still involve cyberattacks, because the correct number of cyberattacks is not zero. It might be correct to price out attacks to the point where everyone involved should have better things to do with their time, but if we collectively actually cause everyone to fully give up and go home then everyone is selfishly overinvesting in defenses, unless there is a modest cost to being fully safe. [...]
Conclusion: How To Think About Mythos
[ed. Ten points...]
Things are only going to get faster and weirder and scarier from here.