Game Theory Optimal Strategy for AI Safety and Dominance: Acceleration and Transparency
Let's use first principles reasoning and a few axioms to calculate the optimal strategy for US government and corporate policy.
Everyone is losing their mind over DeepSeek R1 and getting their panties in a twist. Demis Hassabis, the CEO of Google DeepMind, has said that R1 is the “best work” out of China, but that it doesn’t represent “any scientific progress.”
Meanwhile, Dario Amodei, CEO of Anthropic, has said we need more export controls against China, but simultaneously that DeepSeek isn’t a threat to his company. So which is it? Is it enough of a threat to justify strangling China further, or is it not really a big deal?
What if I told you the game theory optimal strategy was aggressive timelines and radical transparency?
Let’s establish a few axioms to serve as ground truth for this first-principles reasoning exercise:
From a global perspective, AI progress cannot be meaningfully slowed down. There is neither the political willpower within nations to hamper AI progress, nor is there geopolitical coordination (or incentive) to do so. The Pause AI movement has fizzled, no one cares, and progress is only accelerating. Despite concerted effort from concerned scientists, the general consensus is that AI does not represent a significant enough threat to humanity to justify a moratorium. Max Tegmark himself, author of the Pause AI letter, concedes that “only 50% of scientists agree that there is a 10% chance that AI represents an existential threat.” That’s a pretty extreme minority. Compare that to 99% of scientists agreeing that climate change represents an imminent threat to humanity. That’s what scientific consensus looks like. The “Doomer” movement is a fringe conspiracy theory by comparison. Consider also that export controls haven’t really worked, and if anything have forced China to figure out better, more efficient ways to train AI. China is a sovereign nation with massive infrastructure and its own talent.
A world in which the USA retains geopolitical dominance over China is preferable. Many people will debate this point, some vehemently, but knowing what I do about CCP ambitions, I would much rather live in a world where America leads. This point concedes that we are in an arms race with China, which is a fairly defensible argument. Any policy that results in America losing to China represents a moral bad, or undesirable outcome.
AI safety is best served by tight feedback loops via the “ship early and often” strategy pioneered by OpenAI. There has been much consternation at OpenAI over the last few years, with notable departures such as Ilya Sutskever and Mira Murati, as well as many others. Furthermore, there are differing schools of thought regarding AI safety. Sam has championed the “ship early and often” model, allowing the public to break and exploit models, essentially performing high-velocity, decentralized, free research. Ilya and others have advocated for closed-door safety testing, with Anthropic apparently following in that line of thought. However, Sam made the observation that no matter how much money they threw at safety testing and red teaming, the public would still find things they missed. Software rarely survives first contact with the public.
All AI companies have a profit motive to satisfy, and need capital to do their research. This one is kind of obvious, but basically, if you want to develop AI models, you need to make money doing so. This is why OpenAI changed from an open-source nonprofit to a closed-source for-profit. Many other companies, including Anthropic, have followed their lead. Harming your market position by withholding models or being overly cautious is just bad business. Furthermore, any business that focuses more on shipping better, more powerful models with “good enough” safety will simply overtake you. See China’s DeepSeek.
All nations and corporations have a strong incentive to maintain first mover advantage. This applies to the companies within nations as well as between nations. Right now, the only two national contenders for “first mover advantage” on AI are America and China. This is why OpenAI is comfortably out ahead of the competition, though they were humbled by DeepSeek to the point that they’ve backtracked, and Sam has even suggested they were “on the wrong side of history where Open Source was concerned.”
Taken all together, what is the game theory optimal strategy?
I asked Claude to synthesize for me:
The logical conclusion is pretty straightforward: If you can't slow down AI development, and you want the U.S. to maintain technological superiority, then the optimal strategy would be to develop and release AI models aggressively while being open about safety issues and findings. This means Anthropic’s current approach of holding back releases while advocating for export controls is exactly backwards.
Since first-mover advantage is crucial both for market dominance and profit, and real safety improvements come from public testing and feedback, companies should be pushing their advances out quickly while being transparent about safety concerns they discover. This serves multiple goals at once—it helps maintain U.S. technological leadership over China, generates revenue to fund further development, and creates the kind of open environment needed for genuine safety progress.
What’s particularly important is that this approach aligns the profit motive with both national security and safety goals. By being first to market with powerful models while maintaining transparency about safety issues, a company can build market share, contribute to U.S. technological dominance, and help create better safety standards through real-world testing and feedback. This beats trying to develop perfect safety measures in isolation while competitors race ahead.
So in game theory terms, the optimal move is aggressive development combined with radical transparency about safety—not holding back releases while arguing for restrictions on competitors.
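To make the game theory framing concrete, here is a minimal toy payoff model. To be clear, this is a sketch: the strategy labels and payoff numbers are illustrative assumptions I chose to encode the axioms above (progress can't be slowed, first-mover advantage matters, transparency improves safety through public feedback), not measured quantities.

```python
# Toy payoff model for the argument above. All payoff numbers are
# illustrative assumptions, not data. Each player (say, a US lab and a
# rival lab or state) picks a strategy; payoffs are (row player, column player).

STRATEGIES = ["accelerate+transparent", "hold_back+restrict"]

# Hypothetical payoffs: if you hold back while the rival accelerates,
# you lose market position AND the safety benefit of public testing.
PAYOFFS = {
    ("accelerate+transparent", "accelerate+transparent"): (3, 3),
    ("accelerate+transparent", "hold_back+restrict"):     (5, 1),
    ("hold_back+restrict",     "accelerate+transparent"): (1, 5),
    ("hold_back+restrict",     "hold_back+restrict"):     (2, 2),
}

def best_response(opponent_strategy: str) -> str:
    """Return the row player's payoff-maximizing reply to a fixed opponent move."""
    return max(STRATEGIES, key=lambda s: PAYOFFS[(s, opponent_strategy)][0])

if __name__ == "__main__":
    for opp in STRATEGIES:
        print(f"If the rival plays {opp!r}, best response is {best_response(opp)!r}")
```

Under these assumed payoffs, accelerating with transparency is a dominant strategy: it is the best response whether the rival races ahead or holds back.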
One counterargument that people might have to this assertion is “yeah, but what about bad actors? If you release everything open source, then you’re just giving the bad guys equal capabilities.”
This is true, which is why I’m not advocating for an exclusively open-source approach. That is not necessarily the optimal policy. However, as demonstrated by DeepSeek and Meta, companies don’t really stand to benefit for long by doubling down on an exclusively closed-source strategy. Open source ultimately benefits all of humanity, so it has an appropriate place in this conversation. Furthermore, open source models aid researchers, such as universities and nonprofits, by being more accessible.
In this case, we can look at existing cybersecurity paradigms, where the “bad guys” or malicious actors are pretty much always just as well equipped as you. We see this in cyberwarfare and cybersecurity operations, where chaos actors can use ransomware (“cryptolockers”) to hold your company’s data hostage. In an anarchic internet, how do we maintain any level of security and stability? How is it not the wild west out there?
It sort of is, but the reason the world can function with a global internet is that there are simply more “good guys” than “bad guys.” The vast majority of people do not want cybercrime or cyberwarfare to happen. Every company employs numerous cybersecurity experts, and some of those companies have huge reach in terms of monitoring, detection, and deterrence against cyber attacks. Open source software benefits them as well. These are white hat hackers, pen testers, and the like.
DeepSeek’s Wake-Up Call
For a while, it looked like closed source AI was the only way to go. AI development is expensive, which means it requires massive investment. That means you need IP law to protect that investment.
However, we’re also rapidly getting to a point where the cost of developing AI is dropping while capabilities are rising. Eventually, likely by 2026, AI will be able to develop itself. That means recursive self-improvement is coming. Guess what? Open source accelerates that, and the first nation to figure that out will have a huge advantage.
America has 11x the data centers that China does, which means we win on raw compute alone. However, if China is producing models that are 100x more efficient (which DeepSeek might be), then they effectively have roughly 10x more compute than us. This is a race for efficiency, or what I call a “terminal race condition.” Before too long, every AI model is going to have superhuman levels of expertise. We’re talking a human IQ equivalent of 160 to 200. That means any problem you want to solve, they can solve it.
The only questions that remain are: how fast do they run, and how many can you run in parallel?
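Here’s the back-of-the-envelope arithmetic behind those figures. This is just a sketch: the 11x data center ratio and the 100x efficiency multiplier are the assumptions stated above, and the per-instance cost is a made-up placeholder purely to illustrate the parallelism question.

```python
# Rough effective-compute math for the "terminal race condition."
# The 11x and 100x figures are the assumptions from this post; the
# per-instance cost below is purely hypothetical.

US_RAW_COMPUTE = 11.0      # arbitrary units: ~11x China's data centers
CHINA_RAW_COMPUTE = 1.0
US_EFFICIENCY = 1.0
CHINA_EFFICIENCY = 100.0   # if DeepSeek-style training is ~100x more efficient

us_effective = US_RAW_COMPUTE * US_EFFICIENCY            # 11 units
china_effective = CHINA_RAW_COMPUTE * CHINA_EFFICIENCY   # 100 units

print(f"China's effective compute advantage: {china_effective / us_effective:.1f}x")
# -> ~9.1x, i.e. the "roughly 10x" figure above

# Parallelism: how many superhuman-expert instances can each side run at once?
COST_PER_INSTANCE = 0.01   # hypothetical effective-compute units per running model
print(f"US parallel instances:    {us_effective / COST_PER_INSTANCE:,.0f}")
print(f"China parallel instances: {china_effective / COST_PER_INSTANCE:,.0f}")
```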
The bottom line is simple: the game theory optimal strategy—for companies and nations—is acceleration and transparency.