Anthropic and OpenAI are booming, but so are providers of open source AI models and other cheaper alternatives, as businesses turn to open source to control their costs, we reported in detail this morning. But there’s more.

Together AI, the neocloud that rents out Nvidia cloud servers and access to open source models, has raised its annual revenue projections at least three times in the past few months because of growing usage of open source models, CEO Vipul Ved Prakash told us. (Together was generating around $1 billion in annualized revenue as of March.) The number of tokens processed through its cloud soared to 400 trillion this month from just 30 billion tokens at the same time last year, with much of the growth occurring in the past six months, he said. If you were spending those tokens on DeepSeek v4 Pro, which costs roughly 18 cents per million tokens, that increase would mean a difference of about $70 million in monthly spending. (An AI agent creating a simple document might consume anywhere from 1,000 to 15,000 tokens.) Prakash believes open source models will eventually account for the vast majority of AI usage globally.

At Hugging Face, which operates a repository of open source AI models, the number of paying subscribers doubled between January and June, co-founder and CEO Clem Delangue told us, without providing specifics. His co-founder Thomas Wolf said he’s seen a “huge sober, waking-up in companies” concerned about AI costs and about getting locked into specific AI vendors.

For their part, OpenAI and Anthropic have said that new models’ prices are more than justified, given how much more they can accomplish for customers. It’s worth pointing out that open source models aren’t a monolith, and figuring out their relative strengths and weaknesses can take time.

The boom in cheaper models extends to providers of so-called model routers that allow companies to easily switch between AI models for different tasks to save on cost.
Major firms such as Cisco and Adobe use such routers to temper costs. “We’ve seen a huge surge in demand over the last six months,” said Not Diamond CEO Tomás Hernando Kofman of the startup’s router. The router allows companies doing AI coding tasks to shift to older and cheaper Anthropic models, which he says saves them 20% to 40% in costs compared to using Anthropic’s most expensive AI. Providers of such routers and “harnesses,” software that helps an AI agent connect to applications so it can perform complex white-collar work more efficiently, also include Martian AI, Factory, Portkey, OpenRouter, CodeStrap and Vercel.

AWS Joins Boom of Forward Deployed Engineers

Amazon is stoking the boom of specialized consultants, known as forward-deployed engineers, who help customers develop AI applications. Amazon Web Services sales leaders recently began training some of the cloud firm’s “solution architects,” staff who help customers determine the technology they need, to be FDEs who work on-site at customers’ offices, according to a current employee.

As we explained in a recent article about FDEs, many large companies aren’t yet savvy enough to do AI on their own, and the FDEs can serve as a combination of software engineers, business consultants and product managers to help them learn the ropes. While Palantir pioneered the role, everyone from OpenAI and Anthropic to Salesforce and Snowflake has jumped on the trend as they push for large enterprise AI contracts.

The newly trained FDEs at AWS are joining other staff, including engineers, applied scientists and AI strategists, working with customers on projects, according to Taimur Rashid, managing director of the unit’s Generative AI Innovation Center. AWS prefers using teams of staff, as opposed to individual FDEs, to handle such work, he said.

The move expands on work AWS has already been doing at customers’ offices.
In early 2025, for instance, AWS and Anthropic sent engineers to the Atlanta-based headquarters of Cox Automotive to help the firm build an AI agent that creates marketing webpages for dealerships. More recently, AWS has also sent teams to work on-site with employees from the Jane Goodall Institute, a wildlife conservation organization, and to other customers such as Formula 1, Fox, Nasdaq, the National Football League and S&P Global, according to an AWS spokesperson.

Rashid said the teams are at the center of an AWS program that aims to help customers go from ideas to finished projects within 45 days. The teams have previously helped customers with projects such as customizing open source AI models for specific industries, but are increasingly focused on helping customers develop AI models and agents, he said.

The AWS FDE teams are also helping customers develop data management tools that help agents understand a company’s systems better so they can operate accurately, he said. These include knowledge graphs, which show how different types of data, such as people, companies and products, are connected to each other; and semantic layers, which establish standard definitions for business metrics such as revenue and customers, he said. The teams also help AWS customers fine-tune the way they run AI models, known as inference, in a way that improves their performance and lowers operational costs, said Rashid.