Bhawika Chhabra/ReutersSince the introduction of ChatGPT to the public, AI safety experts have been calling for a “pause” on the development of the most powerful AI models. Anthropic itself did so in a recent blog post. On Friday evening, that pause may finally have happened. But it came from an unlikely place: The White House. At 5:21 pm on Friday, Anthropic received a directive from the US government instructing it to limit access to its most powerful AI models to US nationals. That includes Anthropic employees, many of whom are foreign nationals. If the directive is actual policy and more than just political retribution for Anthropic’s impasse with the Pentagon, it means other AI labs will be subject to the same restrictions. The impact of the new rule: It will be unprofitable to develop frontier AI models, and potentially even illegal to develop them in the first place, given how many foreigners work on these models at all the major frontier labs. Anthropic pushed back on the decision in a blog post Friday, saying it disagreed with the government’s decision. The White House has taken a more aggressive approach lately in regulating AI, spearheaded in part by National Cyber Director Sean Cairncross. A recent executive order directs agencies to create a voluntary framework under which AI developers could give the government secure access to certain powerful frontier models for up to 30 days before a broader release to trusted partners. The White House did not immediately respond to a request for comment. Cairncross also did not respond. Anthropic declined to comment beyond its blog post. When Anthropic launched its most advanced model yet, Mythos, in early April, it told the world that Mythos was simply too dangerous to release publicly. “The fallout — for economies, public safety, and national security — could be severe,” it said in a blog post. Earlier this week, the company released a special version of Mythos with extra guardrails. It called this product Fable 5. According to Anthropic, the government claims to have found a “jailbreak,” or a way that users could bypass some of Fable 5’s guardrails. Anthropic says whatever jailbreak the government found is not serious enough to warrant such a drastic action. The truth is that no AI model has ever avoided being jailbroken. And while Anthropic put a tremendous amount of effort into trying to find holes in the models’ defenses, in a process known as “red teaming,” there’s no reason to think Fable 5 won’t also be jailbroken. The other thing to keep in mind is that nobody, not even Anthropic, understands how these AI models really work. Despite Anthropic’s research into finding clues, a discipline known as “observability research,” no one can be certain exactly how they’ll behave in every situation. If Mythos is really that big of a risk, then so is Fable 5, potentially. But the truth may be that Anthropic is wrong about Mythos’ dangers, and that the government is taking a drastic action that could potentially hurt the US in the technology race with China. Anthropic will have to take some responsibility for that overreaction. It has been the leading lobbyist for the government taking AI dangers seriously. The government was listening. |