Showing posts with label Copyright. Show all posts

Thursday, April 16, 2026

A Monkey Goes to Court

What happens when something that isn't human makes art? A series of bizarre court battles centred on this image has tried to answer that question. Ultimately, it will influence what ends up on your screens and headphones forever.

It was a humid day in the Indonesian jungle, and photographer David Slater was following a group of crested black macaques, a critically endangered and particularly photogenic species of monkey.

He wanted pictures, but the macaques were nervous. So Slater mounted his camera on a tripod, switched on the autofocus and attached a flash, and let the monkeys inspect it. Just as he hoped, they started playing with his gear. Then one of them reached up and hit the shutter button while staring directly into the lens. The result was a selfie, taken by a monkey. And its toothy grin inadvertently raised a basic question that sits at the heart of technology.

What came next was nearly a decade of legal battles around an unusual dispute: when something that isn't human makes a work of art, who owns the copyright? Thanks to AI, that's become an issue with some deep implications for modern life – and what it means to be human.

One of the most alarming predictions about AI is that corporations will replace the human-created music, movies and books you love with an endless stream of AI slop. But the US Supreme Court just upheld a decision about AI and copyright which suggests that future may be harder to pull off than the tech industry hoped. The path is still uncertain, and right now, the legal system is the site of a battle that will shape what you read, watch and listen to for the rest of your life. It all traces back to that one little monkey.

Monkey business

The monkey took that selfie in 2011. For a brief, blissful period, Slater enjoyed global attention from the picture, but the troubles began when someone uploaded the photo to Wikipedia, from where it could be downloaded and used free of charge. He asked the Wikimedia Foundation to take it down, arguing it cost him £10,000 (worth about $13,400 today) in lost sales. In 2014, the organisation refused, arguing the photo was in the public domain because it wasn't taken by a person.

The row prompted the US Copyright Office to issue a statement that it would not register work created by a non-human author, putting "a photograph taken by a monkey" first in a list of examples. (Slater didn't respond to interview requests, but his representation arranged for the BBC to use the photo in this article.)

The story gets weirder. Soon after, the advocacy group People for the Ethical Treatment of Animals (Peta) sued Slater on behalf of the monkey. The case argued all proceeds from the photo belonged to the macaque that took the picture, but it was really seen as a test case, an attempt to establish legal rights for animals. After four years and multiple court battles, a San Francisco judge dismissed the case. The judge's reasoning was simple: monkeys can't file lawsuits.

"It was kind of the biggest public conversation piece on this topic," says intellectual property lawyer Ryan Abbott, a partner at Brown, Neri, Smith and Khan in the US. "At the time it was very much about animal rights. But it could have been a conversation about AI." [...]

The missing author

When the US passed the Copyright Act of 1790, we only had to deal with things like writing and drawing. But the invention of photography decades later raised troubling questions. You could argue cameras do the real work; a person just hits a button.

"The Supreme Court looked at this and said, you know, we're going to interpret this purposively," says Abbott, who represented Thaler in a case against the Copyright Office. "Copyright was designed to protect the expression of tangible ideas. And that's broad enough to cover something like photography."

The same logic could apply to AI. "What you really have in photography is exactly the same thing you have here. You have a person issuing instructions to a machine to generate a work," he says. "What's the difference between that and me asking ChatGPT to make an image?"

by Thomas Germain, BBC | Read more:
Image: David Slater/Caters News/BBC
[ed. More issues than you might imagine.]

Friday, March 20, 2026

A.I. Is Writing Fiction. Publishers Are Unprepared.

For months, speculation has been building online that a buzzy horror novel, “Shy Girl,” was written with the help of A.I.

The novel, about a desperate young woman who is held hostage by a man she met online and forced to live as his pet, was self-published in February 2025. The book quickly found an audience among horror fans, and Hachette published it in the United Kingdom last fall and planned to release it in the United States this spring, billing it as “an unapologetic, visceral revenge horror novel.”

Earlier this year, Max Spero, the founder and chief executive of Pangram, an A.I. detection program, heard of the claims about “Shy Girl” and decided to run a test of the full text. Its results indicated that the book was 78 percent A.I. generated.

“I’m very confident that this is largely A.I. generated, or very heavily A.I. assisted,” said Spero, who posted his research on X in January.

The Times also analyzed passages from the novel using several A.I. detection tools and found recurring patterns characteristic of A.I. generated text, like gaps in logic, excessive use of melodramatic adjectives and an overreliance on the rule of three.

In the months since “Shy Girl” was released in Britain, more readers voiced their suspicions online that the writer relied on A.I., citing nonsensical metaphors and odd, repetitive phrasing. As a chorus of allegations built online in late January that the novel was A.I. generated, Hachette stayed silent.

In response to questions from The New York Times about the A.I. allegations against “Shy Girl,” Hachette told The Times that its imprint Orbit has canceled plans to release the novel in the United States and that Hachette will discontinue its U.K. edition.

The author of “Shy Girl,” Mia Ballard, who according to her author bio writes poetry and lives in Northern California, has very little social media presence, and doesn’t appear to have addressed the allegations of A.I. use on her feeds. In an email to The Times late on Thursday night, Ballard denied using A.I. to write “Shy Girl,” contending that an acquaintance she hired to edit the self-published version of the novel had used A.I.

The decision to cancel the publication came after a lengthy and thorough analysis, Hachette’s spokeswoman said, noting that the company values human creativity and requires authors to attest that their work is original. Hachette also asks its authors to disclose whether they are using A.I. to the company.

“Shy Girl” appears to be the first commercial novel from a major publishing house to be pulled over evidence of A.I. use. Its cancellation is a sign that A.I. writing is not only appearing in cheap self-published e-books that are flooding Amazon but is seeping into even traditionally published fiction.

The stunning fact that “Shy Girl” got so far into the editorial process, and was even released in the U.K. before publishers thoroughly investigated the claims of A.I. use, is a sign of how unprepared many in the book world are to deal with the rise of A.I. It also signals the dawn of an uncertain new era for the book world, as editors and readers alike are increasingly left wondering whether the prose they are reading was written by a human or a machine. [...]

For now, the most obvious disruptions from A.I. are hitting the self-publishing sphere, where authors say the ecosystem has been flooded with A.I. slop. But some in the industry believe that it’s only a matter of time before more books written with A.I. slip past editors at major houses. The technology has become increasingly widespread — as has the practice of picking up self-published books and rereleasing them through traditional imprints.

“It’s not merely inevitable,” said Thad McIlroy, a publishing industry consultant who has urged publishers to clarify their policies around the technology. “We’re in the midst of it.” [...]

Many publishers don’t explicitly prohibit authors from using A.I. in their book contracts. Instead, they rely on longstanding contractual clauses that require writers to affirm that their work is “original,” which many people in the book business now interpret as effectively banning the use of A.I. for text or image creation.

Publishers are also wary of A.I. content because currently, A.I.-generated text and art can’t be protected by copyright. Still, given the widespread uses for A.I. during research, outlining and other parts of the writing process, there’s little clarity on what constitutes its appropriate use. Many in the industry worry that publishers are leaving themselves vulnerable to scammers — or even writers who believe their A.I. use doesn’t cross any lines.

One problem in regulating authors’ A.I. use is that most corporate publishing houses don’t want to ban it outright. Editors recognize that authors use A.I. in a range of ways short of writing with it. And publishing executives want to ensure that their employees can use the technology for tasks like creating marketing copy, audio narration and translation.

The fact that publishing companies generally haven’t drawn a hard line around A.I. use is sowing confusion about what is permissible. Could a novelist ask A.I. to suggest plot twists, propose an alternate ending or polish a draft and still claim it as original work? At what point does the work stop being human?

by Alexandra Alter, NY Times | Read more:
Image: George Wylesol
[ed. I guess I'm of two minds on this. If the writing eventually becomes so good that it's indiscernible from a human-produced product (or even better) why should it be banned? And, why wouldn't you want to read it? Authors and publishing houses have a right to be concerned, but why should they be treated any differently from other professions (programmers being an example) facing the same threat? Because they occupy a so-called creative space? How long will that last? I can imagine an AI producing very high quality material: fiction, non-fiction, screenplays, poetry, advertising copy, etc. because it can draw upon hundreds of years of examples, criticism, reviews, college courses, awards and whatever else is out there to discern patterns, storylines, jokes, whatever, that have proven to produce the highest impact and success. So what to do? The only thing I can think of is labeling: highlighting what's AI produced and what's not and letting the market decide its worth. Many people might actually prefer AI - along the lines of craft brews vs. Bud Light. Who knows? Another option would involve updating copyright laws, but that would require Congress to actually do something, which as we all know is pretty much a non-starter. Just another example of all the disruption that's been predicted now occurring in real time.]

Wednesday, March 4, 2026

Why Libraries Don't Stock Many Audiobooks

Have you ever wondered …

Why can’t my library get more copies of e-books and digital audiobooks?

You’re not alone! And there are a couple of reasons you might find yourself on a long wait list for e-content:
  • Most materials are licensed, not owned by the library like print books are, and publishers put limits on how long and/or how often the content can be used. Once the limit is reached, the library must re-purchase the license if we want to keep offering the e-content to our community. 
  • At the same time, e-books and digital audiobooks cost libraries more than print copies and more than what consumers would pay to purchase them commercially.
How can you help?
  • If you finish with e-content early, please return it so the next person can jump off the waiting list and into the book! Just go to Manage Loan and select Return Early in the Libby App.
  • And keep borrowing e-content from your library! The numbers help us advocate for funding.
by Hawaii State Library Association 
[ed. Would it hurt publishers or whoever's collecting these licensing fees to be a little more civic-minded by providing complimentary copies to libraries? (or at least getting rid of repurchasing requirements?) Guess so.]

Tuesday, January 20, 2026

It's Not Normal

Samantha: This town has a weird smell that you're all probably used to…but I'm not.
Mrs Krabappel: It'll take you about six weeks, dear. 
-The Simpsons, "Bart's Friend Falls in Love," S3E23, May 7, 1992
We are living through weird times, and they've persisted for so long that you probably don't even notice it. But these times are not normal.

Now, I realize that this covers a lot of ground, and without detracting from all the other ways in which the world is weird and bad, I want to focus on one specific and pervasive and awful way in which this world is not normal, in part because this abnormality has a defined cause, a precise start date, and an obvious, actionable remedy.

6 years, 5 months and 22 days after Fox aired "Bart's Friend Falls in Love," Bill Clinton signed a new bill into law: the Digital Millennium Copyright Act of 1998 (DMCA).

Under Section 1201 of the DMCA, it's a felony to modify your own property in ways that the manufacturer disapproves of, even if your modifications accomplish some totally innocuous, legal, and socially beneficial goal. Not a little felony, either: DMCA 1201 provides for a five year sentence and a $500,000 fine for a first offense.

Back when the DMCA was being debated, its proponents insisted that their critics were overreacting. They pointed to the legal barriers to invoking DMCA 1201, and insisted that these new restrictions would only apply to a few marginal products in narrow ways that the average person would never even notice.

But that was obvious nonsense, obvious even in 1998, and far more obvious today, more than a quarter-century on. In order for a manufacturer to criminalize modifications to your own property, they have to satisfy two criteria: first, they must sell you a device with a computer in it; and second, they must design that computer with an "access control" that you have to work around in order to make a modification.

For example, say your toaster requires that you scan your bread before it will toast it, to make sure that you're only using a special, expensive kind of bread that kicks back a royalty to the manufacturer. If the embedded computer that does the scanning ships from the factory with a program that is supposed to prevent you from turning off the scanning step, then it is a felony to modify your toaster to work with "unauthorized bread":

If this sounds outlandish, then a) You definitely didn't walk the floor at CES last week, where there were a zillion "cooking robots" that required proprietary feedstock; and b) You haven't really thought hard about your iPhone (which will not allow you to install software of your choosing):

But back in 1998, computers – even the kind of low-powered computers that you'd embed in an appliance – were expensive and relatively rare. No longer! Today, manufacturers source powerful "System on a Chip" (SoC) processors at prices ranging from $0.25 to $8. These are full-fledged computers, easily capable of running an "access control" that satisfies DMCA 1201.

Likewise, in 1998, "access controls" (also called "DRM," "technical protection measures," etc) were a rarity in the field. That was because computer scientists broadly viewed these measures as useless. A determined adversary could always find a way around an access control, and they could package up that break as a software tool and costlessly, instantaneously distribute it over the internet to everyone in the world who wanted to do something that an access control impeded. Access controls were a stupid waste of engineering resources and a source of needless complexity and brittleness:

But – as critics pointed out in 1998 – chips were obviously going to get much cheaper, and if the US Congress made it a felony to bypass an access control, then every kind of manufacturer would be tempted to add some cheap SoCs to their products so they could add access controls and thereby felonize any uses of their products that cut into their profits. Basically, the DMCA offered manufacturers a bargain: add a dollar or two to the bill of materials for your product, and in return, the US government will imprison any competitors who offer your customers a "complementary good" that improves on it.

It's even worse than this: another thing that was obvious in 1998 was that once a manufacturer added a chip to a device, they would probably also figure out a way to connect it to the internet. Once that device is connected to the internet, the manufacturer can push software updates to it at will, which will be installed without user intervention. What's more, by using an access control in connection with that over-the-air update mechanism, the manufacturer can make it a felony to block its updates.

Which means that a manufacturer can sell you a device and then mandatorily update it at a later date to take away its functionality, and then sell that functionality back to you as a "subscription":

A thing that keeps happening:

And happening:

And happening:

In fact, it happens so often I've coined a term for it, "The Darth Vader MBA" (as in, "I'm altering the deal. Pray I don't alter it any further"):

Here's what this all means: any manufacturer who devotes a small amount of engineering work and incurs a small hardware expense can extinguish private property rights altogether.

What do I mean by private property? Well, we can look to Blackstone's 1753 treatise:
The right of property; or that sole and despotic dominion which one man claims and exercises over the external things of the world, in total exclusion of the right of any other individual in the universe.
You can't own your iPhone. If you take your iPhone to Apple and they tell you that it is beyond repair, you have to throw it away. If the repair your phone needs involves "parts pairing" (where a new part won't be recognized until an Apple technician "initializes" it through a DMCA-protected access control), then it's a felony to get that phone fixed somewhere else. If Apple tells you your phone is no longer supported because they've updated their OS, then it's a felony to wipe the phone and put a different OS on it (because installing a new OS involves bypassing an "access control" in the phone's bootloader). If Apple tells you that you can't have a piece of software – like ICE Block, an app that warns you if there are nearby ICE killers who might shoot you in the head through your windshield, which Apple has barred from its App Store on the grounds that ICE is a "protected class" – then you can't install it, because installing software that isn't delivered via the App Store involves bypassing an "access control" that checks software to ensure that it's authorized (just like the toaster with its unauthorized bread).

It's not just iPhones: versions of this play out in your medical implants (hearing aid, insulin pump, etc); appliances (stoves, fridges, washing machines); cars and ebikes; set-top boxes and game consoles; ebooks and streaming videos; small appliances (toothbrushes, TVs, speakers), and more.

Increasingly, things that you actually own are the exception, not the rule.

And this is not normal. The end of ownership represents an overturn of a foundation of modern civilization. The fact that the only "people" who can truly own something are the transhuman, immortal colony organisms we call "Limited Liability Corporations" is an absolutely surreal reversal of the normal order of things.

It's a reversal with deep implications: for one thing, it means that you can't protect yourself from raids on your private data or ready cash by adding privacy blockers to your device, which would make it impossible for airlines or ecommerce sites to guess about how rich/desperate you are before quoting you a "personalized price":

It also means you can't stop your device from leaking information about your movements, or even your conversations – Microsoft has announced that it will gather all of your private communications and ship them to its servers for use by "agentic AI": (...)

Microsoft has also confirmed that it provides US authorities with warrantless, secret access to your data:

This is deeply abnormal. Sure, greedy corporate control freaks weren't invented in the 21st century, but the laws that let those sociopaths put you in prison for failing to arrange your affairs to their benefit – and your own detriment – are.

But because computers got faster and cheaper over decades, the end of ownership has had an incremental rollout, and we've barely noticed that it's happened. Sure, we get irritated when our garage-door opener suddenly requires us to look at seven ads every time we use the app that makes it open or close:

But societally, we haven't connected that incident to this wider phenomenon. It stinks here, but we're all used to it.

It's not normal to buy a book and then not be able to lend it, sell it, or give it away. Lending, selling and giving away books is older than copyright. It's older than publishing. It's older than printing. It's older than paper. It is fucking weird (and also terrible) (obviously) that there's a new kind of very popular book that you can go to prison for lending, selling or giving away.

We're just a few cycles away from a pair of shoes that can figure out which shoelaces you're using, or a dishwasher that can block you from using third-party dishes:

It's not normal, and it has profound implications for our security, our privacy, and our society. It makes us easy pickings for corporate vampires who drain our wallets through the gadgets and tools we rely on. It makes us easy pickings for fascists and authoritarians who ally themselves with corporate vampires by promising them tax breaks in exchange for collusion in the destruction of a free society.

I know that these problems are more important than whether or not we think this is normal. But still. It. Is. Just. Not. Normal.

by Cory Doctorow, Pluralistic |  Read more:
Image: uncredited
[ed. Anything labeled 'smart' is usually suspect. What's particularly dangerous is if successive generations fall prey to what conservation biology calls shifting baseline syndrome (forgetting or never really missing something that's been lost, so we don't grieve or fight to restore it). For a deep dive into why everything keeps getting worse see Mr. Doctorow's new book: "Enshittification: Why Everything Suddenly Got Worse and What to Do About It," Farrar, Straus and Giroux, October 7, 2025.]

Sony Goes for Peanuts

It wasn’t so long ago that purchases of American institutions by Japanese companies sparked outrage in the United States. When Mitsubishi bought the Rockefeller Center in 1989, a local auto dealership ran a TV spot that invited Americans to “imagine a few years from now. It’s December, and the whole family’s going to see the big Christmas tree at Hirohito Center… Enough already.” Sony’s purchase of Columbia Pictures that same year caused such unease that chairman Akio Morita felt the need to declare “this is not a Japanese invasion.” A Newsweek poll of the era revealed that 54% of Americans saw Japan as a bigger threat to America than the Soviet Union. Many exploited this fear of Japan for their own ends. Politicians grandstanded by smashing Japanese products and demanding investigations into purchases. Predictably, Donald Trump’s first public foray into politics was a jeremiad against Japan in a 1989 appearance on the Oprah Winfrey Show.

Contrast this to yesterday, when Sony announced that it had paid nearly half a billion dollars for another American icon: Peanuts Holding LLC, the company that administers the rights to the Peanuts franchise. Talk about A Charlie Brown Christmas for shareholders! The reaction to this Japanese acquisition of a cultural institution? Crickets. This speaks to how dramatically the relationship between the US and Japan has changed. It also speaks to how dramatically Peanuts changed, how Peanuts changed Japan, and how that in turn changed all of us. But perhaps most of all, it illustrates (pun intended) how stories need products, and products need stories.

There are countless stories out there, and countless products. But crossing these streams — giving stories products in the form of merchandise, or products stories to make them more than just commodities, can supercharge both. It can create international empires. Peanuts is a perfect case in point.

When Charles Schulz's Peanuts debuted in October of 1950, it was utterly unlike any cartoon Americans had seen in the funny pages. The very first strip’s punchline involved an adorable tyke declaring his hatred for Charlie Brown. Li’l Abner creator Al Capp described the cast as “good mean little bastards eager to hurt each other.” Matt Groening of The Simpsons fame recalled being “excited by the casual cruelty and offhand humiliations at the heart of the strip.” To Garry Trudeau of Doonesbury, it “vibrated with fifties alienation.”

A hint of darkness made Peanuts stick out in a crowded comics page. But it’s hard to square these comments with the Happiness Is a Warm Puppy-era Peanuts I remember from my childhood. By that time Schulz had sanded the rough edges off those “little bastards,” distilling them into cute and lovable archetypes. More to the point, he de-centered the kids to focus on Snoopy, who had morphed from his origins as a four-legged canine into a bipedal, anthropomorphic creature with a bulbous head and a penchant for tap-dancing and flying biplanes.

The vibe shift seems to date to 1966, when the animated It’s the Great Pumpkin, Charlie Brown devoted roughly a quarter of its screen time to Snoopy’s solo flights of fancy. Schulz was already lauded for his short-form social satire: his characters had graced the cover of Time the year before. But he seems to have grasped that the way to riches would only be found by looking at the brighter side of life.

This new Peanuts, less mean, less casually cruel, less alienated, was arguably also less interesting. But there was no question that it was way, way more marketable. You might have identified with one or another of the human characters, with their all too human foibles, but anthropomorphic Snoopy was someone anyone and everyone could inhabit. Kids in particular. You didn’t even have to be American to get him.

This later, kinder, gentler incarnation of Peanuts, and Snoopy in particular, would charm Japanese audiences, thanks to the efforts of a serial entrepreneur named Shintaro Tsuji. He was a would-be poet turned wartime chemist, then a postwar black-market bootlegger of moonshine, and an inveterate hatcher of business schemes ranging from silks to produce to kitchenware. You are undoubtedly familiar with the most successful of his ventures. It is called Sanrio — the home of Hello Kitty.

Tsuji, long interested in American trends, played a key role in importing many of them to Japan. He forged a relationship with Hallmark to translate their greeting cards, and negotiated with Mattel for the rights to Barbie. He acquired the license to Peanuts in 1968, when his company, then known as the Yamanashi Silk Center, was at a low. Snoopy-branded merchandise proved so popular that it put his struggling company back in the black within a year. Snoopy wasn’t the first cute animal to hit big in Japan; Tsuji himself had scored a big hit in the mid-sixties with merchandise featuring Mii-tan, a cute cat designed by the artist Ado Mizumori. But Snoopy’s runaway success seems to have sparked an epiphany in Tsuji.

As he later put it, Japan was “a world in which ‘making money’ meant ‘making things.’ I desperately wanted to leapfrog the ‘things’—the ‘hardware’—and make a business out of the intellectual property—the ‘software.’ I suspect everyone around me thought I was nuts.”

He was nuts. Merchandising characters from hit stories was common sense, then as now. Many Japanese companies did that sort of thing. Creating hit characters without stories was fiendishly difficult, bordering on impossible. Stories breathe life into characters, bestowing them with an authenticity that standalone designs simply do not possess (or need to earn in other ways). Yet Tsuji would not be deterred. In 1971, he launched an in-house art department, staffing it with young women straight out of art school. In the wake of Peanuts’ continuing success, he gave the team a singular directive: “Draw cats and bears. If a dog hit this big, one of those two is sure to follow.”

Two years later, he renamed the Yamanashi Silk Center “Sanrio.” (There’s a whole story about how that came to be, which you can read in my book, if you’re so inclined.) The year after that, in 1974, one of Sanrio’s designers struck gold, in the form of an anthropomorphic cat with a bulbous head and a penchant for hugging: Hello Kitty. Soon, Kitty products were a full-blown fiiba (fever) in Japan. And this time, Tsuji didn’t have to split the proceeds with anyone, because Sanrio owned the character outright. Schulz needed decades of narrative to make stars of Peanuts’ menagerie of characters. Tsuji upended this process by making characters stars without any story at all.

Sanrio famously insists that Hello Kitty isn’t really a cat; she’s a little girl who happens to look like a cat. I take no particular stance on this globally divisive issue. But I think you can make the case that she wouldn’t exist at all, if it hadn’t been for the trail Schulz blazed with Peanuts, shifting away from social satire to make an anthropomorphic dog the star of the show. Tsuji’s genius was realizing that you could make a star without a show — provided you had the ability to print it on countless school supplies, kitchenware, and accessories. That was the trick up his sleeve. The medium is the message, as they say. In essence, Kitty products, ubiquitous to the point of absurdity, became her story.

by Matt Alt, Pure Invention |  Read more:
Image: uncredited
[ed. See also: Super Galapagos (PI):]
***
Once the West feared Japan’s supposed technological superiority. Then came the schadenfreude over Japan’s supposed fall. Now a new generation is projecting upon the country an almost desperate longing for comfort. And is it any wonder? The meme centers on companies producing products that make the lives of consumers easier. That must feel like a dreamy fantasy to young folks who’ve only known life in an attention economy, where corporations are the consumers and they’re the products.

To them, Japan isn’t in the past or the future. It’s a very real place — a place where things haven’t gone haywire. This is Japan as a kind of Galapagos, but not in a pejorative sense. Rather, it’s a superlative, asking, a little plaintively: Why can’t we have nice things like this in our country?...

I agree that Japan is a kind of Galapagos, in the sense that it can be oblivious to global trends. But I disagree that this is a weakness. The reason being that nearly everything the planet loves from Japan was made by Japanese, for Japanese in the first place.

Looking back, this has always been the case. Whether the woodblock prints that wowed the world in the 19th century, or the Walkmans and Nintendo Entertainment Systems that were must-haves in the Eighties, or the Pokémania that seized the planet at the turn of the Millennium, or the life-changing cleaning magic of the 2010s, or the anime blockbusters Japan keeps unleashing in the 2020s – they hit us in the feels, so we assumed that they were made just for us. But they weren’t.

Friday, October 3, 2025

The War Between Silicon Valley and Hollywood is Officially Over...

Silicon Valley needs creativity, but hates to pay for it. So the war rages on.

But we are now entering the final stage of the war. Silicon Valley is now swallowing up Hollywood. The pace at which this is happening is frightening.

The main protagonist here is a man named David Ellison—a tech scion who wants to be the biggest mogul in the movie industry. A few days ago, his father Larry Ellison became (briefly) the richest man in the world.

The elder Ellison is the founder of Oracle, a database software company with a market cap of almost a trillion dollars. That’s nice work if you can get it. But his son David wanted to do something more fun than databases, so he dropped out of college to dabble in movies.

But this quickly became more than dabbling. He produced hit franchise films, notably the Mission Impossible movies. He didn’t care much about artsy cinema—and instead churned out Baywatch and Spy Kids. Like his dad, he knows how to make money.

And then he went on a spending spree. It’s not hard to do that in Hollywood if you have a big pile of cash. Many of the legendary properties from the past have fallen on hard times, and can be acquired from their current owners at a very reasonable price.

So David Ellison bought up National Amusements and merged it into his production company. This gave him control over:
  • Paramount Pictures
  • CBS (and CBS News)
  • MTV, Comedy Central, Nickelodeon, and other cable TV channels
  • Paramount+ and Pluto TV streaming platforms
But now he is preparing a takeover of Warner Bros Discovery—which would put him in charge of:
  • Warner Bros film and television studios
  • HBO
  • CNN
  • TNT
And it doesn’t stop there.

Oracle, the company founded by David Ellison’s father, will now be one of the new owners of TikTok. So in one generation, the Ellison family has gone from software entrepreneurs to the biggest powerhouse in film and media.

What’s next?

The most attractive remaining asset in Hollywood is Disney. It’s just a matter of time before it gets swallowed up by the techies. The most likely outcome is Apple acquiring Disney, but some other Silicon Valley powerhouse could also do this deal.

There’s so much money in NorCal, and a lot of techies would like to own their own famous mouse (along with theme parks and all the rest). Musk could do it. Zuckerberg could do it. Alphabet could do it. Even some company you don’t think much about, like Broadcom (market cap = $1.6 trillion), could easily finance this deal.

There’s heavy irony in this fact. That’s because Disney helped launch Silicon Valley.

There’s a garage on Addison Avenue in Palo Alto that’s called the birthplace of Silicon Valley. But that birthing only happened because Walt Disney gave Bill Hewlett and Dave Packard an order for eight audio oscillators—which gave them enough cash and confidence to create their garage-born company.

Hewlett-Packard trained the next generation of tech entrepreneurs, including Steve Wozniak, co-founder of Apple. So an Apple buyout of Disney would simply take things full circle.

There are still some chess pieces on the board, and a few moves left to be made—but the winner of this game is already obvious. Hollywood, or what’s left of it, will become a subsidiary of tech interests. I don’t see any other outcome.

by Ted Gioia, Honest Broker |  Read more:
Image: Hollywood Boulevard in the 1930s

Monday, September 8, 2025

The Unbelievable Scale of AI’s Pirated-Books Problem

When employees at Meta started developing their flagship AI model, Llama 3, they faced a simple ethical question. The program would need to be trained on a huge amount of high-quality writing to be competitive with products such as ChatGPT, and acquiring all of that text legally could take time. Should they just pirate it instead?

Meta employees spoke with multiple companies about licensing books and research papers, but they weren’t thrilled with their options. This “seems unreasonably expensive,” wrote one research scientist on an internal company chat, in reference to one potential deal, according to court records. A Llama-team senior manager added that this would also be an “incredibly slow” process: “They take like 4+ weeks to deliver data.” In a message found in another legal filing, a director of engineering noted another downside to this approach: “The problem is that people don’t realize that if we license one single book, we won’t be able to lean into fair use strategy,” a reference to a possible legal defense for using copyrighted books to train AI.

Court documents released last night show that the senior manager felt it was “really important for [Meta] to get books ASAP,” as “books are actually more important than web data.” Meta employees turned their attention to Library Genesis, or LibGen, one of the largest of the pirated libraries that circulate online. It currently contains more than 7.5 million books and 81 million research papers. Eventually, the team at Meta got permission from “MZ”—an apparent reference to Meta CEO Mark Zuckerberg—to download and use the data set.

This act, along with other information outlined and quoted here, recently became a matter of public record when some of Meta’s internal communications were unsealed as part of a copyright-infringement lawsuit brought against the company by Sarah Silverman, Junot Díaz, and other authors of books in LibGen. Also revealed recently, in another lawsuit brought by a similar group of authors, is that OpenAI has used LibGen in the past. (A spokesperson for Meta declined to comment, citing the ongoing litigation against the company. In a response sent after this story was published, a spokesperson for OpenAI said, “The models powering ChatGPT and our API today were not developed using these datasets. These datasets, created by former employees who are no longer with OpenAI, were last used in 2021.”)

Until now, most people have had no window into the contents of this library, even though they have likely been exposed to generative-AI products that use it; according to Zuckerberg, the “Meta AI” assistant has been used by hundreds of millions of people (it’s embedded in Meta products such as Facebook, WhatsApp, and Instagram). (...)

Meta and OpenAI have both argued in court that it’s “fair use” to train their generative-AI models on copyrighted work without a license, because LLMs “transform” the original material into new work. The defense raises thorny questions and is likely a long way from resolution. But the use of LibGen raises another issue. Bulk downloading is often done with BitTorrent, the file-sharing protocol popular with pirates for its anonymity, and downloading with BitTorrent typically involves uploading to other users simultaneously. Internal communications show employees saying that Meta did indeed torrent LibGen, which means that Meta could have not only accessed pirated material but also distributed it to others—well established as illegal under copyright law, regardless of what the courts determine about the use of copyrighted material to train generative AI. (Meta has claimed that it “took precautions not to ‘seed’ any downloaded files” and that there are “no facts to show” that it distributed the books to others.) OpenAI’s download method is not yet known.

Meta employees acknowledged in their internal communications that training Llama on LibGen presented a “medium-high legal risk,” and discussed a variety of “mitigations” to mask their activity. One employee recommended that developers “remove data clearly marked as pirated/stolen” and “do not externally cite the use of any training data including LibGen.” Another discussed removing any line containing ISBN, Copyright, ©, All rights reserved. A Llama-team senior manager suggested fine-tuning Llama to “refuse to answer queries like: ‘reproduce the first three pages of “Harry Potter and the Sorcerer’s Stone.”’” One employee remarked that “torrenting from a corporate laptop doesn’t feel right.”
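[ed. The line-filtering “mitigation” described above is trivial to implement, which gives you a sense of how thin the fig leaf was. A purely illustrative Python sketch — not Meta’s actual code, just what such a filter might look like:]

```python
# Illustrative sketch of the "mitigation" described above: drop any line
# containing common rights-management markers before using text for training.
# Hypothetical code for illustration only; not taken from any court filing.

COPYRIGHT_MARKERS = ("ISBN", "Copyright", "\u00a9", "All rights reserved")

def strip_copyright_lines(text: str) -> str:
    """Remove lines that contain any of the copyright markers."""
    kept = [
        line for line in text.splitlines()
        if not any(marker in line for marker in COPYRIGHT_MARKERS)
    ]
    return "\n".join(kept)

sample = (
    "A novel begins here.\n"
    "Copyright 2020 Some Author\n"
    "All rights reserved.\n"
    "Chapter One"
)
# Prints the sample with the two rights-management lines removed.
print(strip_copyright_lines(sample))
```

[ed. A few dozen characters of string matching, in other words, was the difference between “pirated/stolen” and plausible deniability.]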

It is easy to see why LibGen appeals to generative-AI companies, whose products require huge quantities of text. LibGen is enormous, many times larger than Books3, another pirated book collection whose contents I revealed in 2023. Other works in LibGen include recent literature and nonfiction by prominent authors such as Sally Rooney, Percival Everett, Hua Hsu, Jonathan Haidt, and Rachel Khong, and articles from top academic journals such as Nature, Science, and The Lancet. It includes many millions of articles from top academic-journal publishers such as Elsevier and Sage Publications.

by Alex Reisner, The Atlantic | Read more:
Image: Matteo Giuseppe Pani
[ed. Zuckerberg should have his own chapter in the Book of Liars (a notable achievement, given the competition). See also: These People Are Weird (WWL). But there's also some good news: “First of its kind” AI settlement: Anthropic to pay authors $1.5 billion (ArsT):]

"Today, Anthropic likely breathes a sigh of relief to avoid the costs of extended litigation and potentially paying more for pirating books. However, the rest of the AI industry is likely horrified by the settlement, which advocates had suggested could set an alarming precedent that could financially ruin emerging AI companies like Anthropic." 

Friday, September 5, 2025

Universal Music Group is Going After Rick Beato

Just when you thought major labels couldn't get more stupid...

I lost faith in the music industry decades ago, and I’ll never get it back. You will have an easier time convincing me that Elvis still lives in Graceland or Santa Claus delivers gifts from an Amazon truck.

I’ve heard too many horror stories and I’ve seen too much firsthand. I eventually came up with my “Idiot Nephew Theory” to explain why major record labels seem so much more stupid than other businesses.

Here’s how I’ve described it:
THE IDIOT NEPHEW THEORY: Whenever a record label makes a strategic decision, it picks the option that the boss’s idiot nephew thinks is best.

And what do idiot nephews decide? That’s easy—they always do whatever the company lawyer recommends.

But just when I think I’ve seen it all, some new kind of stupid comes my way via the music biz.

And that’s the case right now. Universal Music Group has gone to war with Rick Beato.

If UMG were wise, they would thank Mr. Beato, who works tirelessly to grow the audience for their recording artists. Rick is smart and trustworthy, and is probably the most influential music educator in the world right now.

He does his work on YouTube, where he has more than five million subscribers. I’m one of them. I learn a lot from Rick’s videos, and have been fortunate to be his guest on two occasions (here and here).

He offers sharp commentary, and has conducted smart interviews with Sting, Pat Metheny, Rick Rubin, David Gilmour, Ron Carter, George Benson, Keith Jarrett, Michael McDonald, Jimmy Webb, and many other legends. These artists open up with Rick, because he is so knowledgeable, with big ears and a big heart.

So why is Universal Music upset?

Like any music educator, Beato plays a few seconds of the songs he discusses on these videos. But he’s very careful to limit himself to just a short extract—and this is allowed by law.

It’s called fair use. And it’s part of our copyright law.

Universal Music can’t change fair use standards. But it can file a constant stream of copyright infringement complaints with YouTube. And this puts Beato in a difficult situation—because he will get banned from YouTube after just three copyright strikes.

If that happens, his 2,000 videos disappear from the web—including all those historic interviews. His five million subscribers lose a trusted voice.

That may be what Universal Music wants. Listen to Beato explain this dire situation:


Universal Music is making surprising claims. In a short 42-second video about Olivia Rodrigo, Beato included just ten seconds of a song. But UMG still charged him with copyright violation—although this seems a straightforward example of fair use.

Beato pushes back and successfully defends his fair use rights—but the disputes keep coming. He showed us his email box on the recent video.


Rick has been forced to hire a full-time lawyer to handle the endless stream of infringement claims. He has won repeatedly—but maybe that’s what gets the label so upset.

“We have successfully fought thousands of these now,” Rick explains in the video. “But it literally has cost me so much money to do this. Since we’ve been fighting these things and have never lost one, they still keep coming in….And they’re all Universal Music Group.”

“It looks to me like Rick Beato was targeted,” claims lawyer Krystle Delgado, who runs the Top Music Attorney channel on YouTube. “What the major labels have said in their closed door meetings to me is nothing short of shocking.”

“If you try fighting them, they get upset,” she adds. “And that’s when this thing starts to escalate.” She notes that her other clients run into this problem and one company—Universal Music Group—is the leading instigator. (...)

I could share many other videos expressing support of Beato. But you get the idea—the wider community of music educators and commentators is alarmed.

This is sad confirming evidence for my Idiot Nephew Theory. Maybe some corporate lawyer thinks this is a smart strategy for UMG. But people who care about music see it differently—they know how destructive this kind of behavior really is. (...)

His audience knows how much good Beato does. We see how much he loves the music and how much he supports the record labels and their artists. The labels should give him their support in return.

by Ted Gioia, Honest Broker |  Read more:
Image: YouTube/Rick Beato
[ed. Everyone knows about Rick, right? If you don't, choose any musical artist that comes to mind and you'll probably find an interview or analysis of their music on his channel. A great educator, historian, and fine musician in his own right. Also, for an additional dose of stupidity, see: We've Reached the Sad Cracker Barrel Stage of Cultural Evolution (HB):]
***
"Hey, I love American traditions as much as the next bumpkin. But Cracker Barrel isn’t a tradition by any stretch of the imagination. The company was founded on September 19, 1969. That’s exactly one month after the end of Woodstock.

Even Jed Clampett could sniff out the phoniness at this chain restaurant.

Cracker Barrel is a postmodern pastiche of rural tropes. Jean Baudrillard would call it a simulacrum. By that he means that it’s a symbol disconnected from reality—it merely refers vaguely to other symbols.

So you can’t bring back my grandpa’s Cracker Barrel—because my paw-paw never saw a Cracker Barrel. (...)

The biggest shareholder is BlackRock. Did you think it was Dolly Parton or Willie Nelson?"

Saturday, March 29, 2025

The Unbelievable Scale of AI’s Pirated-Books Problem

When employees at Meta started developing their flagship AI model, Llama 3, they faced a simple ethical question. The program would need to be trained on a huge amount of high-quality writing to be competitive with products such as ChatGPT, and acquiring all of that text legally could take time. Should they just pirate it instead?

Meta employees spoke with multiple companies about licensing books and research papers, but they weren’t thrilled with their options. This “seems unreasonably expensive,” wrote one research scientist on an internal company chat, in reference to one potential deal, according to court records. A Llama-team senior manager added that this would also be an “incredibly slow” process: “They take like 4+ weeks to deliver data.” In a message found in another legal filing, a director of engineering noted another downside to this approach: “The problem is that people don’t realize that if we license one single book, we won’t be able to lean into fair use strategy,” a reference to a possible legal defense for using copyrighted books to train AI.

Court documents released last night show that the senior manager felt it was “really important for [Meta] to get books ASAP,” as “books are actually more important than web data.” Meta employees turned their attention to Library Genesis, or LibGen, one of the largest of the pirated libraries that circulate online. It currently contains more than 7.5 million books and 81 million research papers. Eventually, the team at Meta got permission from “MZ”—an apparent reference to Meta CEO Mark Zuckerberg—to download and use the data set.

This act, along with other information outlined and quoted here, recently became a matter of public record when some of Meta’s internal communications were unsealed as part of a copyright-infringement lawsuit brought against the company by Sarah Silverman, Junot Díaz, and other authors of books in LibGen. Also revealed recently, in another lawsuit brought by a similar group of authors, is that OpenAI has used LibGen in the past. (A spokesperson for Meta declined to comment, citing the ongoing litigation against the company. In a response sent after this story was published, a spokesperson for OpenAI said, “The models powering ChatGPT and our API today were not developed using these datasets. These datasets, created by former employees who are no longer with OpenAI, were last used in 2021.”)

Until now, most people have had no window into the contents of this library, even though they have likely been exposed to generative-AI products that use it; according to Zuckerberg, the “Meta AI” assistant has been used by hundreds of millions of people (it’s embedded in Meta products such as Facebook, WhatsApp, and Instagram). To show the kind of work that has been used by Meta and OpenAI, I accessed a snapshot of LibGen’s metadata—revealing the contents of the library without downloading or distributing the books or research papers themselves—and used it to create an interactive database that you can search here:

There are some important caveats to keep in mind. Knowing exactly which parts of LibGen Meta and OpenAI used to train their models, and which parts they might have decided to exclude, is impossible. Also, the database is constantly growing. My snapshot of LibGen was taken in January 2025, more than a year after it was accessed by Meta, according to the lawsuit, so some titles here wouldn’t have been available to download at that point.

LibGen’s metadata are quite disorganized. There are errors throughout. Although I have cleaned up the data in various ways, LibGen is too large and error-strewn to easily fix everything. Nevertheless, the database offers a sense of the sheer scale of pirated material available to models trained on LibGen. Cujo, The Gulag Archipelago, multiple works by Joan Didion translated into several languages, an academic paper named “Surviving a Cyberapocalypse”—it’s all in here, along with millions of other works that AI companies could feed into their models.

Meta and OpenAI have both argued in court that it’s “fair use” to train their generative-AI models on copyrighted work without a license, because LLMs “transform” the original material into new work. The defense raises thorny questions and is likely a long way from resolution. But the use of LibGen raises another issue. Bulk downloading is often done with BitTorrent, the file-sharing protocol popular with pirates for its anonymity, and downloading with BitTorrent typically involves uploading to other users simultaneously. Internal communications show employees saying that Meta did indeed torrent LibGen, which means that Meta could have not only accessed pirated material but also distributed it to others—well established as illegal under copyright law, regardless of what the courts determine about the use of copyrighted material to train generative AI. (Meta has claimed that it “took precautions not to ‘seed’ any downloaded files” and that there are “no facts to show” that it distributed the books to others.) OpenAI’s download method is not yet known.

Meta employees acknowledged in their internal communications that training Llama on LibGen presented a “medium-high legal risk,” and discussed a variety of “mitigations” to mask their activity. One employee recommended that developers “remove data clearly marked as pirated/stolen” and “do not externally cite the use of any training data including LibGen.” Another discussed removing any line containing ISBN, Copyright, ©, All rights reserved. A Llama-team senior manager suggested fine-tuning Llama to “refuse to answer queries like: ‘reproduce the first three pages of “Harry Potter and the Sorcerer’s Stone.”’” One employee remarked that “torrenting from a corporate laptop doesn’t feel right.”

It is easy to see why LibGen appeals to generative-AI companies, whose products require huge quantities of text. LibGen is enormous, many times larger than Books3, another pirated book collection whose contents I revealed in 2023. Other works in LibGen include recent literature and nonfiction by prominent authors such as Sally Rooney, Percival Everett, Hua Hsu, Jonathan Haidt, and Rachel Khong, and articles from top academic journals such as Nature, Science, and The Lancet. It includes many millions of articles from top academic-journal publishers such as Elsevier and Sage Publications. (...)

Publishers have tried to stop the spread of pirated material. In 2015, the academic publisher Elsevier filed a complaint against LibGen, Sci-Hub, other sites, and Sci-Hub’s founder, Alexandra Elbakyan, personally. The court granted an injunction, directed the sites to shut down, and ordered Sci-Hub to pay Elsevier $15 million in damages. Yet the sites remained up, and the fines went unpaid. A similar story played out in 2023, when a group of educational and professional publishers, including Macmillan Learning and McGraw Hill, sued LibGen. This time the court ordered LibGen to pay $30 million in damages, in what TorrentFreak called “one of the broadest anti-piracy injunctions we’ve seen from a U.S. court.” But that fine also went unpaid, and so far authorities have been largely unable to constrain the spread of these libraries online. Seventeen years after its creation, LibGen continues to grow.

by Alex Reisner, The Atlantic |  Read more:
Image: Matteo Giuseppe Pani/The Atlantic

Friday, January 31, 2025

Copyright Office: AI Copyright Debate Was Settled in 1965

The US Copyright Office issued AI guidance this week that declared no laws need to be clarified when it comes to protecting authorship rights of humans producing AI-assisted works.

"Questions of copyrightability and AI can be resolved pursuant to existing law, without the need for legislative change," the Copyright Office said.

More than 10,000 commenters weighed in on the guidance, with some hoping to convince the Copyright Office to guarantee more protections for artists as AI technologies advance and the line between human- and AI-created works seems to increasingly blur.

But the Copyright Office insisted that the AI copyright debate was settled in 1965 after commercial computer technology started advancing quickly and "difficult questions of authorship" were first raised. That was the first time officials had to ponder how much involvement human creators had in works created using computers. (...)

The office further clarified that this doesn't mean works assisted by AI can never be copyrighted.

"Where AI merely assists an author in the creative process, its use does not change the copyrightability of the output," the Copyright Office said.

Following the advice of Abraham Kaminstein, the Register of Copyrights who weighed those questions in 1965, officials plan to continue reviewing AI disclosures and weighing, on a case-by-case basis, what parts of each work are AI-authored and which parts are human-authored. Any human-authored expressive element can be copyrighted, the office said, but any aspect of the work deemed to have been generated purely by AI cannot.

Prompting alone isn’t authorship, Copyright Office says

After testing whether the exact same prompt can generate widely varied outputs, even from the same AI tool, the Copyright Office further concluded that "prompts do not alone provide sufficient control" over outputs to allow creators to copyright purely AI-generated works based on highly intelligent or creative prompting. (...)


"The Office concludes that, given current generally available technology, prompts alone do not provide sufficient human control to make users of an AI system the authors of the output. Prompts essentially function as instructions that convey unprotectable ideas," the guidance said. "While highly detailed prompts could contain the user’s desired expressive elements, at present they do not control how the AI system processes them in generating the output." (...)

New guidance likely a big yawn for AI companies

For AI companies, the copyright guidance may mean very little. According to AI company Hugging Face's comments to the Copyright Office, no changes in the law were needed to ensure the US continued leading in AI innovation, because "very little to no innovation in generative AI is driven by the hope of obtaining copyright protection for model outputs." (...)

Although the Copyright Office suggested that this week's report might be the most highly anticipated, Hugging Face researcher Yacine Jernite said that the company is eager to see the next report, which officials said would focus on "the legal implications of training AI models on copyrighted works, including licensing considerations and the allocation of any potential liability."

"As a platform that supports broader participation in AI, we see more value in distributing its benefits than in concentrating all control with a few large model providers," Jernite said. "We’re looking forward to the next part of the Copyright Office’s Report, particularly on training data, licensing, and liability, key questions especially for some types of output, like code."

by Ashley Belanger, Ars Technica |  Read more:
Image: Copilot; Copyright Office
[ed. So, upshot (as I understand it): there has to be some significant (whatever that means) human involvement in the production of a work to receive copyright protection (not sure if that applies to all or parts of the end product). Designing a special prompt is not considered significant human involvement.]

Thursday, August 29, 2024

Xpressenglish.com


The dystopian civilization envisioned in this Charles Beaumont story has eliminated many of today’s “distractions” such as food preparation, books and even the need for sleep. It has also specified uniform male and female appearances to be adopted by undergoing a “Transformation” (operation) upon turning nineteen. A brave girl resists the change, not only putting her job and family’s social position at risk, but also threatening social stability. As she is frog-marched to the operating theater, she realizes the sinister purpose of Transformation… to remove the population’s sense of individual identity. Themes: identity, body shaming, scientific “advancement”, superficial beauty, conformity.


Video Version

This film adaptation of the story is an episode from Series Five of the famous American TV series, The Twilight Zone. It follows the original plot quite closely, with the exception of the conclusion, where we see the startling result of the girl’s Transformation. Watch and enjoy!


[ed. Wow, what a find! I went looking for Isaac Asimov's "Nightfall", which in 1968 was voted by the Science Fiction Writers of America the best short science fiction story of all time. Hoping to find it in the public domain, I stumbled on this amazing site of collected short stories and novellas: xpressenglish.com (see the About page). This story (The Beautiful People) just happened to be on the first page, but there are literally thousands of other stories available. Bookmark and enjoy!]

Sunday, April 7, 2024

How Tech Giants Cut Corners to Harvest Data for A.I.

In late 2021, OpenAI faced a supply problem.

The artificial intelligence lab had exhausted every reservoir of reputable English-language text on the internet as it developed its latest A.I. system. It needed more data to train the next version of its technology — lots more.

So OpenAI researchers created a speech recognition tool called Whisper. It could transcribe the audio from YouTube videos, yielding new conversational text that would make an A.I. system smarter.

Some OpenAI employees discussed how such a move might go against YouTube’s rules, three people with knowledge of the conversations said. YouTube, which is owned by Google, prohibits use of its videos for applications that are “independent” of the video platform.

Ultimately, an OpenAI team transcribed more than one million hours of YouTube videos, the people said. The team included Greg Brockman, OpenAI’s president, who personally helped collect the videos, two of the people said. The texts were then fed into a system called GPT-4, which was widely considered one of the world’s most powerful A.I. models and was the basis of the latest version of the ChatGPT chatbot.

The race to lead A.I. has become a desperate hunt for the digital data needed to advance the technology. To obtain that data, tech companies including OpenAI, Google and Meta have cut corners, ignored corporate policies and debated bending the law, according to an examination by The New York Times.

At Meta, which owns Facebook and Instagram, managers, lawyers and engineers last year discussed buying the publishing house Simon & Schuster to procure long works, according to recordings of internal meetings obtained by The Times. They also conferred on gathering copyrighted data from across the internet, even if that meant facing lawsuits. Negotiating licenses with publishers, artists, musicians and the news industry would take too long, they said.

Like OpenAI, Google transcribed YouTube videos to harvest text for its A.I. models, five people with knowledge of the company’s practices said. That potentially violated the copyrights to the videos, which belong to their creators.

Last year, Google also broadened its terms of service. One motivation for the change, according to members of the company’s privacy team and an internal message viewed by The Times, was to allow Google to be able to tap publicly available Google Docs, restaurant reviews on Google Maps and other online material for more of its A.I. products.

The companies’ actions illustrate how online information — news stories, fictional works, message board posts, Wikipedia articles, computer programs, photos, podcasts and movie clips — has increasingly become the lifeblood of the booming A.I. industry. Creating innovative systems depends on having enough data to teach the technologies to instantly produce text, images, sounds and videos that resemble what a human creates.

The volume of data is crucial. Leading chatbot systems have learned from pools of digital text spanning as many as three trillion words, or roughly twice the number of words stored in Oxford University’s Bodleian Library, which has collected manuscripts since 1602. The most prized data, A.I. researchers said, is high-quality information, such as published books and articles, which have been carefully written and edited by professionals.

For years, the internet — with sites like Wikipedia and Reddit — was a seemingly endless source of data. But as A.I. advanced, tech companies sought more repositories. Google and Meta, which have billions of users who produce search queries and social media posts every day, were largely limited by privacy laws and their own policies from drawing on much of that content for A.I.

Their situation is urgent. Tech companies could run through the high-quality data on the internet as soon as 2026, according to Epoch, a research institute. The companies are using the data faster than it is being produced.

“The only practical way for these tools to exist is if they can be trained on massive amounts of data without having to license that data,” Sy Damle, a lawyer who represents Andreessen Horowitz, a Silicon Valley venture capital firm, said of A.I. models last year in a public discussion about copyright law. “The data needed is so massive that even collective licensing really can’t work.”

Tech companies are so hungry for new data that some are developing “synthetic” information. This is not organic data created by humans, but text, images and code that A.I. models produce — in other words, the systems learn from what they themselves generate.

OpenAI said each of its A.I. models “has a unique data set that we curate to help their understanding of the world and remain globally competitive in research.” Google said that its A.I. models “are trained on some YouTube content,” which was allowed under agreements with YouTube creators, and that the company did not use data from office apps outside of an experimental program. Meta said it had “made aggressive investments” to integrate A.I. into its services and had billions of publicly shared images and videos from Instagram and Facebook for training its models.

For creators, the growing use of their works by A.I. companies has prompted lawsuits over copyright and licensing. The Times sued OpenAI and Microsoft last year for using copyrighted news articles without permission to train A.I. chatbots. OpenAI and Microsoft have said using the articles was “fair use,” or allowed under copyright law, because they transformed the works for a different purpose.

More than 10,000 trade groups, authors, companies and others submitted comments last year about the use of creative works by A.I. models to the Copyright Office, a federal agency that is preparing guidance on how copyright law applies in the A.I. era.

Justine Bateman, a filmmaker, former actress and author of two books, told the Copyright Office that A.I. models were taking content — including her writing and films — without permission or payment.

“This is the largest theft in the United States, period,” she said in an interview.

by Cade Metz, Cecilia Kang, Sheera Frenkel, Stuart A. Thompson and Nico Grant, NY Times | Read more:
Image: Jason Henry for The New York Times
[ed. Read the whole thing. Of course it's illegal. Arrogantly so. It's part of Tech's ethic - move fast, ask for permission/forgiveness later. Congress needs to get off their lazy, self-absorbed asses and do some fast moving themselves (ha!). Big Tech have simply become modern day robber barons. See also: OpenAI transcribed over a million hours of YouTube videos to train GPT-4 (The Verge).]

Thursday, November 16, 2023

Turning Hums into Melodies


YouTube's first AI-generated music tools can clone artist voices and turn hums into melodies (Engadget)
Images: YouTube
[ed. Great, just what we need: more music without musicianship. Why would anyone actually want to learn to play an instrument? Can't imagine.]

Sunday, July 16, 2023

We Are All Background Actors

In Hollywood, the cool kids have joined the picket line.

I mean no offense, as a writer, to the screenwriters who have been on strike against film and TV studios for over two months. But writers know the score. We’re the words, not the faces. The cleverest picket sign joke is no match for the attention-focusing power of Margot Robbie or Matt Damon.

SAG-AFTRA, the union representing TV and film actors, joined the writers in a walkout over how Hollywood divvies up the cash in the streaming era and how humans can thrive in the artificial-intelligence era. With that star power comes an easy cheap shot: Why should anybody care about a bunch of privileged elites whining about a dream job?

But for all the focus that a few boldface names will get in this strike, I invite you to consider a term that has come up a lot in the current negotiations: “Background actors.”

You probably don’t think much about background actors. You’re not meant to, hence the name. They’re the nonspeaking figures who populate the screen’s margins, making Gotham City or King’s Landing or the beaches of Normandy feel real, full and lived-in.

And you might have more in common with them than you think.

The lower-paid actors who make up the vast bulk of the profession are facing simple dollars-and-cents threats to their livelihoods. They’re trying to maintain their income amid the vanishing of residual payments, as streaming has shortened TV seasons and decimated the syndication model. They’re seeking guardrails against A.I. encroaching on their jobs.

There’s also a particular, chilling question on the table: Who owns a performer’s face? Background actors are seeking protections and better compensation in the practice of scanning their images for digital reuse.

In a news conference about the strike, a union negotiator said that the studios were seeking the rights to scan and use an actor’s image “for the rest of eternity” in exchange for one day’s pay. The studios argue that they are offering “groundbreaking” protections against the misuse of actors’ images, and counter that their proposal would only allow a company to use the “digital replica” on the specific project a background actor was hired for. (...)

You could, I guess, make the argument that if someone is insignificant enough to be replaced by software, then they’re in the wrong business. But background work and small roles are precisely the routes to someday promoting your blockbuster on the red carpet. And many talented artists build entire careers around a series of small jobs. (Pamela Adlon’s series “Better Things” is a great portrait of the life of ordinary working actors.) (...)

Maybe it’s unfair that exploitation gets more attention when it involves a union that Meryl Streep belongs to. (If the looming UPS strike materializes, it might grab the spotlight for blue-collar labor.) And there’s certainly a legitimate critique of white-collar workers who were blasé about automation until A.I. threatened their own jobs.

But work is work, and some dynamics are universal. As the entertainment reporter and critic Maureen Ryan writes in “Burn It Down,” her investigation of workplace abuses throughout Hollywood, “It is not the inclination nor the habit of the most important entities in the commercial entertainment industry to value the people who make their products.”

If you don’t believe Ryan, listen to the anonymous studio executive, speaking of the writers’ strike, who told the trade publication Deadline, “The endgame is to allow things to drag out until union members start losing their apartments and losing their houses.”

by James Poniewozik, NY Times | Read more:
Image: Jenna Schoenefeld for The New York Times
[ed. See also: On ‘Better Things,’ a Small Story Goes Out With a Big Bang (NYT).]

Tuesday, July 4, 2023

The Abuse of YouTube's Copyright Policy

[ed. Important. We need a massive overhaul of copyright law, especially with AI coming.]