Fighting for journalism and profitable news media

Tony Livesey faces questions over time as Daily Sport editor | US publishers vs Common Crawl

Plus world's biggest news websites ranking and interview with travel correspondent Simon Calder

Good morning from the team at Press Gazette on Tuesday, 9 June.

Press Gazette’s awards for the best digital journalism products (newsletters, podcasts, websites, etc.) are now open for entries. Find out more here.

⚽ A long-brewing investigation into West Ham owner and pornographer David Sullivan aired on BBC 2 last night and will have made uncomfortable viewing for 5 Live host Tony Livesey. The investigation has also cast light on coverage in the Sport newspapers during the Livesey era, such as the “countdown to 16” feature, which does not reflect well on him. The Times/BBC Panorama investigation is a rare example of publishers revealing allegations of sexual wrongdoing against a named individual (Sullivan) who has not been convicted of (or arrested for) any crime. As the Crispin Odey and Noel Clarke cases show, these sorts of stories require huge commitment from publishers as well as deep pockets (especially when the accused is a billionaire).

📉 Our annual ranking of the top 50 biggest news websites in the world shows that the BBC is the most popular online news source in any language, according to Similarweb data. It also shows that traffic decline is a global problem as publishers face falling referrals from Google around the world. Substack is one news platform that is bucking the trend, growing 50% year on year to make it into our global top 50 for the first time, helped by its reliance on email-powered direct relationships with readers.

🚨 It’s sometimes worth reminding ourselves that the multibillion-dollar generative AI industry is a business built largely on theft.
AI answer engines are only as good as the data they are trained on, and OpenAI was able to launch ChatGPT after training the machine on billions of webpages that it did not own (including huge amounts of journalism). Now US trade body Digital Content Next is threatening legal action against the Common Crawl Foundation, which creates an archive of online content. Common Crawl is accused of providing much of the foundational data for ChatGPT, enabling it to answer reader questions without the need to visit the web pages it is drawing information from. It is the latest in a number of moves by publishers seeking to stop big tech from bulldozing them into oblivion. Meanwhile, the UK Competition and Markets Authority has opened the door to the first meaningful negotiations between publishers and Google over how their work is indexed and surfaced by the search giant.