When Donald Trump fired the head of the Copyright Office, Shira Perlmutter—a move he arguably does not have the power to make, given that the office falls under the jurisdiction of the Library of Congress, which is by law and tradition overseen by the legislature and not the executive—I kind of assumed that this was a move organized by the White House’s tech bros. A few news reports suggested when all this went down that the intention might be to neuter the Copyright Office’s ability to restrict AI companies from using and abusing copyrighted materials for their own benefit. If the pro-AI crowd did organize this defenestration, The Verge is reporting that it backfired: Apparently, Perlmutter’s replacement is from a wing of the MAGA movement that is at best ambivalent about the tech bros.

All of this is worth keeping an eye on for multiple reasons (not least among them being, you know, the potential violation of the separation of powers). But I just want to take a moment to highlight the actual report from the Copyright Office that may have sparked all this, because I think it’s an incredibly important document for those of us who worry about the AI-obsessed billionaires who think they should be allowed to steal whatever copyrighted artwork they need to power their suicide-recommendation engines.

Some background: For large language models to be any good at mimicking human conversation, they first need to be fed enormous quantities of conversational data. For reasons that should be fairly obvious, movies and novels are full of exactly that kind of dialogue, making them very valuable for this purpose. (For more on the technical side of this, you could listen to my interview with Alex Reisner, who has tracked which movies and TV shows are illicitly being fed to the LLMs.)

Long story short, in order to train programs like ChatGPT and some of its competitors, copyrighted material was used. No one really denies this. The question was whether or not that training fell under the rubric of fair use—a legal carveout that, in certain circumstances, lets copyrighted material be used without first getting the copyright owner’s permission.

Artificial intelligence executives defended force-feeding an LLM millions of pages of scripts by comparing it to a novice screenwriter reading, say, William Goldman’s Marathon Man screenplay. How do screenwriters learn to write? They read screenplays! This is no different, just on a larger scale. Of course, it is different, and this specious line of reasoning fooled roughly no one. The question is the degree of difference and whether or not it runs afoul of fair use.

There’s a four-part test to determine whether or not something falls under fair use: the purpose and character of the use (e.g., is it for profit or for education?), the nature of the copyrighted work in question, the amount of the work being used, and the effect on the market for the already-copyrighted work. So whether or not something counts as fair use is a complicated question, more art than science—and in the case of the large language models, the answer to that question has enormous financial consequences. The tech magnates have said that licensing the material they want to use to train their LLMs would be prohibitively expensive, so they would prefer simply to steal it, thanks. To this the Copyright Office has, apparently, said, “Uh, not so fast, guys.”
“When a model is deployed for purposes such as analysis or research—the types of uses that are critical to international competitiveness—the outputs are unlikely to substitute for expressive works used in training,” the Copyright Office report concludes. “But making commercial use of vast troves of copyrighted works to produce expressive content that competes with them in existing markets, especially where this is accomplished through illegal access, goes beyond established fair use boundaries.”

We’ll see how this shakes out in the courts. But this is a good first step toward ensuring that the creators of AI have to, at the very least, compensate the people they’ve ripped off these last few years.

On this week’s Bulwark Goes to Hollywood, I interviewed Matthew Specktor about his new book, The Golden Hour: A Story of Family and Power in Hollywood. Part memoir, part historical novel, and entirely about the evolution of Hollywood as it grew from a dream factory into an Armani-suited shark tank, Matthew’s book is a really fascinating and compulsively readable ground’s-eye view of a changing industry.