Tuesday, June 9, 2020

It’s Time to Archive the Internet Archive

Four of the world's largest publishers have sued the Internet Archive, claiming its open-access digital library amounts to mass infringement of their copyrights. The move puts the internet’s most important archive in danger, and has at least got some data hoarders talking about archiving the Internet Archive, and what that would even look like.

Last week, Hachette Book Group, Inc., HarperCollins Publishers LLC, John Wiley & Sons, Inc., and Penguin Random House LLC filed a copyright infringement lawsuit against the Internet Archive and five ‘Doe’ defendants, claiming that the Internet Archive is a piracy site.

In March, the Internet Archive set up a new service for people displaced from library and educational access due to COVID-19, called the "National Emergency Library." Nearly 1.4 million books are available in full for anyone to download and read, without a waitlist, until the end of June or the end of the coronavirus pandemic crisis in the US, according to an announcement on their site.

While damages haven't been set, the publishers could claim up to $150,000 in statutory damages per infringement, for each of the 1.4 million copyrighted works in the emergency library. They're also demanding a preliminary and permanent injunction barring the Internet Archive, and anyone involved with it, from reproducing and distributing more works, and that all current copyrighted copies on the site be destroyed, effectively shutting down the entire library. (...)

The move puts one of the internet’s largest repositories of knowledge in peril. Over on the DataHoarder subreddit, threads have been started about what it would take to archive the archive, which holds dozens of petabytes of data and is constantly growing (there have been attempts to simply understand the sheer amount of data the archive holds). Academics have been saying for years that the Internet Archive must be made more resilient by creating backups of the backups and storing them in other locations. When Donald Trump was elected president, the Internet Archive announced it was making a backup in Canada. Egypt’s Bibliotheca Alexandrina once had a backup of the Internet Archive’s Wayback Machine, but it has not been updated in years.

by Samantha Cole and Jason Koebler, Motherboard
Image: Wikimedia Commons
[ed. For background, see also: You Can Now Access 1.4 Million Books for Free Thanks to the Internet Archive (Motherboard).]