Thanks for reading The Briefing, our nightly column where we break down the day’s news. If you like what you see, I encourage you to subscribe to our reporting here.
Greetings!
It’s party time in San Jose, Calif.! Nvidia on Monday kicks off its annual GTC developer conference in the city, the biggest event on the chip giant’s calendar—which, given the number of public events featuring its CEO, Jensen Huang, is saying something. Huang is scheduled to deliver his keynote tomorrow morning, and it is sure to be a must-watch for people in the AI sector.
One of the biggest things he’s expected to announce is a new chip system that combines Nvidia’s technology with that of Groq, the independent chip firm whose tech Nvidia licensed in a roughly $20 billion deal late last year. This is the first time Nvidia has integrated another company’s AI processor directly into its server racks. Nvidia’s current flagship systems rely almost entirely on its own processors and high-speed interconnects to link the components together.
So why has Nvidia broken the mold for Groq? One reason: inference. That’s the technical term for running AI models to generate answers to questions from ordinary users, as opposed to training those models in the first place. Nvidia’s chips are outstanding for training, which is why they’ve been in such high demand in recent years, and they’re good at inference as well. But Groq’s chips are tailored specifically for inference, which is set to become a growing share of AI data center workloads.
Nvidia is expected to name OpenAI as a buyer of the new chip, which could power the AI agent that assists with OpenAI’s coding tasks. If Nvidia does unveil the Groq-Nvidia system as expected, it will be quite a departure from Huang’s comment in January that whatever Nvidia built with Groq would be “quite unique and quite cool, but it won’t affect our core business.”
Groq’s chip—known as a language processing unit (yes, we have GPUs, TPUs and now LPUs!)—is expected to be mass-produced in the second half of this year at Samsung Electronics’ foundry, according to two people with direct knowledge of Nvidia’s plans. It will be the first time Nvidia has manufactured a server chip outside Taiwan Semiconductor Manufacturing Co., the Taiwanese chipmaker that has supplied virtually all of its flagship AI chips.
Diversifying away from a single supplier makes a lot of sense for Nvidia. But the company eventually plans to move LPU production to TSMC, the people said, because the next generation of the LPU is designed to integrate more closely with a coming Nvidia AI chip.
Here are some technical details of the new Nvidia-Groq system for the AI chip nerds out there. The Nvidia-Groq rack will use a different architecture from that of existing Nvidia racks. It will contain 256 Groq chips in a single rack, and Intel processors will help manage communication between them, according to a person involved in the project—a role Nvidia’s own hardware typically performs in its GPU systems. The decision to use Intel components suggests Nvidia’s existing technologies don’t yet integrate cleanly with the LPU.
Still, this is only the beginning of Nvidia’s plans for Groq’s technology. The company is exploring ways to integrate the LPU more deeply into its chip road map. One idea under consideration would be to fuse Groq’s processor with a Feynman GPU—Nvidia’s next generation after Rubin—into a single chip, which would improve performance while lowering costs, according to two people involved in the development. Not that Huang will talk about any of that tomorrow!—Anissa Gardizy and Wayne Ma contributed to this item
In Other News
• Elon Musk says he’s rebuilding xAI from the ground up after the vast majority of its founding staff left the company. “xAI was not built right first time around, so is being rebuilt from the foundations up,” Musk said in an X post on Thursday. “Same thing happened with Tesla.”
• Amazon Web Services is working with AI chip startup Cerebras Systems on a new service that aims to boost the performance of AI applications running in the cloud.
• Amazon Prime Video told customers today it was raising the price for its ad-free tier from $2.99 a month to $4.99 a month. That’s in addition to the cost of a Prime membership.
• Meta Platforms is planning layoffs that could affect 20% or more of the company, Reuters reported Friday. The cuts would offset the high costs of artificial intelligence spending and prepare for efficiency gains from AI assistants, the report said.
• The Trump Administration will get $10 billion in payments from investors in a company designed to safeguard the data of U.S.-based TikTok users, The Wall Street Journal reported.
Today on The Information’s TITV
Check out Friday’s episode of TITV in which we speak with ChatGPT’s former head of product about OpenAI’s shopping pivot and consumer AI at large.
Recommended Newsletter
Start your day with Applied AI, the newsletter from The Information that uncovers how leading businesses are leveraging AI to automate tasks across the board. Subscribe now for free to get it delivered straight to your inbox twice a week.