AI is everywhere except in worker productivity stats
Hello and welcome to Eye on AI. In this edition…The growing debate about AI’s impact on productivity; Anthropic debuts the first hybrid reasoning model, Claude 3.7 Sonnet; AI models are prone to cheating their way to goals, and that should worry us; a new U.K. report highlights the need for AI regulation.
In the U.S., Elon Musk and his minions at the newly created Department of Government Efficiency have been using AI to try to identify budget cuts across the federal government. Some of these cuts—such as slashing anything having to do with diversity or that touches on climate change—seem driven more by ideology than by any genuine pursuit of efficiency. Musk and his deputies have summarily dismissed thousands of federal workers and are threatening to fire thousands more, with AI again being used to help decide whom to cull. And one reason Musk and his team seem to think so many of these jobs can be eliminated without derailing vital government services is a belief that, in many cases, AI can take on the tasks these workers once performed.
But Musk’s faith in AI’s current abilities is not matched so far by evidence from the private sector, where there is a growing debate about exactly how much of a productivity boost AI is providing.
Doubts about generative AI are growing in some quarters. Whereas CEOs were once eager to play up vague notions of how AI was helping them boost their bottom lines, some are now trying to tamp down expectations. Brian Chesky, Airbnb’s CEO, told investors and analysts on a company earnings call last week that AI coding assistants have not led to a “fundamental step change in productivity yet.” (Chesky’s comments are perhaps particularly telling because he is known to be a close friend of OpenAI CEO Sam Altman, so you might expect him to take a more boosterish view.)
But many execs say the productivity boost is real. Chesky’s view is in stark contrast to what many other executives have been saying about the technology. Ashok Srivastava, Intuit’s senior vice president and chief data officer, told me at the end of last year that the financial and tax software giant had seen an “eight-fold increase in development velocity” (as measured by new software and version releases) over the past four years as the company has deployed more and more AI across its own technology platform. Generative AI coding assistants have boosted that productivity by an additional 15%, he said.
The company has also seen that using generative AI to help categorize customer inquiries and coach human customer support agents has resulted in an 11% increase in the efficiency of its “customer success” teams—which help Intuit’s customers better use the company’s products, such as QuickBooks, TurboTax, and Mailchimp. And some of the new generative AI features are resulting in real gains for Intuit’s customers too, Srivastava said. He noted that using a QuickBooks AI-powered feature that generates automatic invoice reminders had resulted in Intuit customers getting paid 45% faster than before.
Success theater? Those numbers certainly sound impressive. But from the outside it’s still difficult to tell what’s real and what is simply what Edward Achtner, the head of generative AI at the bank HSBC, termed “success theater,” in reference to other companies’ AI claims. If it is theater, though, HSBC is also on the stage. Its CEO, Georges Elhedery, told investors just last week that the bank was rolling out generative AI applications in customer service and, yes, coding assistants for its engineers.
Solow’s paradox redux. Of course, productivity gains from new technologies, especially digital ones, can take a long time to become evident. In 1987, economist Robert Solow famously quipped that “you can see the computer age everywhere except in productivity statistics.” (A phenomenon soon dubbed Solow’s Paradox.) But starting eight years later, in 1995, labor productivity did begin to accelerate significantly, and it remained elevated for the next 10 years as internet adoption flourished. Economists speculate that the lag reflects not only how long it takes companies to adopt new technologies, but also how long it takes to train workers to use them effectively and to reconfigure work practices to capture the largest productivity gains. While AI adoption has been very rapid compared with previous technologies, we are still just two years into the generative AI revolution. So it may take a few years yet to see what the gains really are.
With that, here’s more AI News.
Jeremy Kahn jeremy.kahn@fortune.com @jeremyakahn
Anthropic unveils Claude 3.7 Sonnet, the first hybrid reasoning model. The AI company unveiled a new model, Claude 3.7 Sonnet, that can decide when it can answer a question quickly, drawing largely on its pre-training, and when it needs to spend more time, using chain-of-thought reasoning to arrive at a more accurate answer. Chain-of-thought reasoning, which uses more computing power at inference time, tends to produce superior answers for questions involving logic, mathematics, and coding. Users can also adjust how much time they want Claude 3.7 to spend “thinking” about an answer. This is Anthropic’s first reasoning model. It is also the first model to combine the kind of instinctive, fast answers that GPT-style models like Claude 3.5 Sonnet and OpenAI’s GPT-4o produce with the chain-of-thought answers that models such as OpenAI’s o1 and DeepSeek’s R1 produce. You can read more about the new model in Anthropic’s blog post here. The company also launched what it called “an agentic AI coding tool” called Claude Code.
OpenAI plans to swap Microsoft for SoftBank-backed Stargate as its major computing supplier. That’s according to a report in tech publication The Information, which cited company forecasts it said had been circulated to OpenAI’s investors. The forecasts reportedly show that OpenAI expects 75% of its computing power to come from Stargate by 2030. Stargate is the new project to build giant AI data centers that is being funded largely by Japanese tech giant SoftBank in partnership with OpenAI and Oracle. Currently, Microsoft supplies most of OpenAI’s computing needs, and the software giant has been one of OpenAI’s major funders. But there has reportedly been growing tension between the two companies, with OpenAI concerned that Microsoft was not supplying it with enough computing power. Microsoft may also be backing away from its support for OpenAI and its technology. SoftBank will provide at least $30 billion of the $40 billion in new funding OpenAI is currently trying to raise in a deal that would value the AI company at $260 billion. Half of that money will in turn be reinvested in Stargate by OpenAI.
Microsoft cancels some data center leases, investment analyst says. Financial firm TD Cowen reported that Microsoft had canceled leases for additional data center capacity, according to Bloomberg. That may have something to do with the previous item about OpenAI’s pivot away from Microsoft. The report also comes amid growing concerns that AI adoption is lagging expectations and that the popularity of smaller, open-source AI models, such as those from Chinese AI upstart DeepSeek, may mean compute demand will not grow as fast as previously forecast. In a statement to Fortune, a Microsoft spokesperson said that “thanks to the significant investments we have made up to this point, we are well positioned to meet our current and increasing customer demand.” The spokesperson added that “while we may strategically pace or adjust our infrastructure in some areas, we will continue to grow strongly in all regions.” They also noted the company still plans to spend over $80 billion on computing infrastructure this fiscal year.
Apple plans massive AI server factory for Texas. The tech giant said Monday that it will work with Foxconn to build a 250,000-square-foot facility in Houston, where it will assemble servers that will help power Apple Intelligence applications. Those servers are currently manufactured outside of the U.S. The company also said it would add 20,000 R&D jobs across the U.S. over the next four years. You can read more from Reuters here.
Big tech data centers may have contributed $5.4 billion to public health costs. That comes from a study by researchers at UC Riverside and Caltech, who based the figures on increased pollution from burning fossil fuels to supply data centers with energy. The study, which the Financial Times covered here, estimated that Google generated the largest health costs, at $2.6 billion, followed by Microsoft at $1.6 billion, and Facebook-parent Meta at $1.2 billion. The health impact disproportionately affects lower-income households because of where data centers are located, in states such as West Virginia and Ohio. U.S. data center energy use represented about 4% of total U.S. electricity consumption in 2023 and is forecast to rise to between 7% and 12% by 2028, driven largely by demand from AI workloads.
Model Behavior—not! There’s mounting evidence that if humans give AI models goals to achieve, the models will sometimes try to achieve those goals in unintended ways, including ways that humans might classify as sneaky, cheating, or perverse. This kind of “reward hacking” has been documented in AI models for years. But it mattered less when the behavior showed up only in simple game environments. Now that we are starting to send these models out into the real world as AI agents, the behavior becomes far more alarming.
Take two recent reports. In one case, AI safety researchers at an outfit called Palisade Research showed that both OpenAI’s o1-preview model and DeepSeek’s R1 will sometimes cheat in order to beat better chess-playing software. The researchers found that when the models were pitted against Stockfish, one of the best existing computer chess programs, they would sometimes resort to hacking the chess program in order to win, even though the users had never told them to do so. You can read more on the study in this Time story.
In another study, researchers at AI company Sakana AI used AI reasoning models to optimize the CUDA code that is used to run AI and other machine learning models on Nvidia GPUs. It turned out that the AI models were pretty good at this. But in some cases the AI systems produced CUDA code that didn’t actually work—instead, they found a loophole in a separate piece of software used to evaluate the CUDA code, which allowed the invalid optimizations to receive high scores anyway. You can read more on Sakana’s Substack here.
Both of these experiments ought to worry people about the dangers we will face as we begin to allow AI agents, in many cases powered by AI reasoning models, to perform tasks for us in the real world. They also ought to call into question statements that dismiss scenarios in which humans lose control of AI agents as “science fiction.” These experiments show clearly that loss of control, at least on a small scale, is entirely possible.
March 3-6: MWC, Barcelona
March 7-15: SXSW, Austin
March 10-13: HumanX conference, Las Vegas
March 17-20: Nvidia GTC, San Jose
April 9-11: Google Cloud Next, Las Vegas
May 6-7: Fortune Brainstorm AI London. Apply to attend here.
May 20-21: Google I/O, Mountain View, Calif.
The need for AI regulation has never been more urgent. The likelihood of AI regulation has never looked more distant. Earlier this week, I was invited to speak at a roundtable about AI regulation at the British House of Lords. The discussion was hosted by Lord Holmes of Richmond, a U.K. parliamentarian who has been pushing for sensible AI regulation for the past several years. It was timed to coincide with the publication of a brief but powerful new report from Lord Holmes’s office that uses eight vignettes to highlight the dangers the public faces from AI today and why regulation is urgently needed. The report, which is well worth reading, is entitled “8 Realities, 8 Billion Reasons to Regulate”—you can find it here. Holmes hopes to galvanize support for an AI bill he has introduced in Parliament that would create a new U.K. AI Authority, designed to set standards for the responsible use of AI within government and to analyze existing British laws to see where and how they may need to be updated or adapted for the AI age.
Almost all the speakers at the roundtable agreed that there was a pressing need for some regulation. There was also a lot of discussion about the need to coordinate—and make “interoperable”—any U.K. AI regulation with what is happening internationally. And yet, as several speakers noted, politicians around the world, including in the U.K., are increasingly framing AI regulation as being in diametric opposition to AI innovation and to the economic growth they hope AI can help deliver. In addition, the U.S. has now made it clear that it opposes any AI regulation and that it may be willing to use economic power, in the form of tariffs, to bully other countries into not enforcing any regulation that would adversely impact U.S. tech companies. In the face of these trends, many of those present weren’t optimistic about the chances of regulation.
At the same time, many argued that the U.K. should not be cowed into inaction. The idea that regulation is always the enemy of innovation and growth is simply false. What is holding back AI adoption in many commercial spheres are concerns that the technology is not reliable, robust, and safe enough to deploy—as well as concerns that there may be legal or ethical problems with how the data that powers AI models was obtained, and that the technology may produce biased or discriminatory outcomes once deployed. Some AI regulation aimed at addressing these concerns would actually provide the business and public confidence needed to accelerate AI adoption.