Friday, August 25, 2023

Erasure of Content Can Be a Problem for the Public and for History

Janine Jackson: In the 1980s, when we at FAIR would talk about how the goals of journalism as a public service, and of information as a public good, were in conflict with those of media as a profit-driven business, we were often met with the contention that the internet was going to make that conflict meaningless, by democratizing access to information and somehow sidelining that profit motive with... technology!

Well, now we’re here, and much of our lives are online. It’s where many get news and information, how we communicate and learn. But power is still power, and the advertising model that drives so much fear and favor in traditional journalism is still in effect.

So, while much is different, there are still core questions to consider when you’re trying to figure out why some kinds of news or “content” are in your face, like it or not, why some perspectives are very hard to find, and why there’s so much garbage to get through to get to any of it.

Our next guest’s job is to report on life online. Thomas Germain is a senior reporter at Gizmodo. He joins us now by phone from here in town. Welcome to CounterSpin, Thomas Germain.

Thomas Germain: Happy to be here.

JJ: There are internet rules that are not visible to all users, particularly those of us who aren’t looking into the gears of the thing, you know? We just want to read articles, or look at cats falling off chairs.

But as “offline” media have unseen rules—like if a sponsor can’t be found to buy ads on a show, well, that show’s not going to air, no matter how much people might like it—there are also behind-the-scenes factors for internet content that are not journalistic factors, if you will.

I wonder if you would talk us through what CNET—which many listeners will know is a longstanding website dedicated to tech news—is currently doing, and what do you think it means or portends?

TG: Yeah, so CNET is one of the oldest technology news sites on the internet. It’s been around since 1995, and they have tens of thousands, maybe hundreds of thousands, of articles that they’ve put up over the years.

But I got a tip that CNET had started deleting its old content, because of the theory about improving the site’s performance on Google. And I went and I checked it out, and what I found was the company has been deleting thousands of its own articles.

Now, there’s a lot of complicated reasons that this is happening, but the No. 1 thing that people need to understand is a lot of the writing that happens on the internet is aimed as much at robots as it is at humans. And what I mean here is the algorithms that run Google search, right? Almost all internet traffic is driven by how high you show up in the search results on Google.

And there’s an entire industry called “search engine optimization” that is essentially a kind of gamified effort to get your content and your website and individual pages to perform better on Google.

And this is actually a huge thing that drives the journalism business. It’s the reason that you look at articles and you see the same keyword repeated over and over. It’s basically one of the things that dictates what subjects journalists write about, what’s covered and how it’s written.

And the performance of your entire site dictates how your individual pages will do. And Google issued some guidance last year which suggested that if you’ve got some content on your site that’s not performing well, it might help if you take it down. It didn’t say this explicitly, but a lot of companies, CNET included, have been going through and looking at pages that aren’t performing well, which tends to be older content.

And some of that content, they’re redirecting the URL of that page to other articles that they want to promote. And in some cases, they’re taking it down altogether.

So the effect of this is this kind of ironic thing, right? Google’s entire reason for being is to make information easier to find, but in effect, because of the design of their algorithms, they’re actually encouraging companies, indirectly, to take some information off the internet altogether.

JJ: Because if folks are not “engaging”—that’s the word we’ve all learned to use—with a particular piece that a website might have up, then that’s dragging down the SEO of the site generally, is what you’re saying? Like if you have a lot of content that folks are not actively engaging with, then maybe your new stuff might not show up so high up on Google. Is that, vaguely, somewhere in the ballpark of what’s happening?

TG: That’s basically it. It’s really complicated. And also, we don’t really know exactly what’s going on here. Google isn’t super transparent about the way that its algorithms function, and search engine optimization, or SEO, is as much a guessing game as it is based on actual data. There’s some information that journalists and content publishers have access to, about how certain things are performing, but in other cases, it’s just best practices, and people crossing their fingers, essentially.

So the one thing we know for sure is the more content that’s on your website, the longer it takes Google’s robots (they call them “crawlers”) to go through every page, which is how the company determines how certain pages will rank for search results.

So what they’ve said is, you’ve got a giant, old site like CNET, and there’s some content that’s not performing well; shrinking that down (they call it “content pruning”) can help you increase the performance of the content that you want to promote. So in effect, it could be an advantage to you, if you’ve got a giant site, to take some of that content down.

JJ: I think listeners will already understand the harm that that does to public information and to journalism, because obviously we think of the internet, dumbly perhaps, as an archive, and there is a severe loss implied in sites like CNET, and others if they follow their lead, in deleting old material.

TG: Yeah. Journalism, they say that it’s the first draft of history, right? And if you’re doing any kind of archival research, if you want to know what people were talking about in 1997, it helps to be able to have a record of all these old articles, even if no one’s reading them, even if they’re about topics that don’t have any obvious importance now. CNET used the example of old articles that talk about the prices of AOL, which is a thing that you can’t even get anymore.

But this stuff can be important for reasons that aren’t immediately obvious. And the loss of this information can really have a serious detrimental effect on the public record.

There are some companies that are working to preserve this stuff. The most well-known one is the Internet Archive. It’s got this tool called the Wayback Machine, which goes and preserves copies of webpages.

And CNET says that before it deletes content, it lets the Internet Archive know to make a copy of it, so it’s not gone forever. And they say they preserve their own copy, but they’re relying on a third-party service that’s a nonprofit to maintain this content, and who knows whether it’s going to be around in the long term.

But there’s an effect on the journalists, right? Because you want a record of your work in order to just keep track of what you’ve done, but also to have stuff to put in your portfolio to get new jobs. So the erasure of this content can be a problem, for just the general public and for history, but also for the people who are tasked with writing this stuff in the first place.

JJ: Absolutely. And, of course, who knows what’s going to be interesting from the past to look back on, because, who knows, you can’t predict what you might want to go back and look through. You know, maybe AOL will come up in the future, and we’ll want to know what was said about it at the time. So it seems like a loss. (...)

TG: Yeah, I think this is something that everybody experiences, you’re aware of it, we all know that we’re seeing more ads, but I think people don’t quite recognize how prevalent it is and how dramatically it’s changed.

And it’s actually a recent change. So over the last year, we’ve seen a massive increase in the amount of advertising. We’re seeing it in places we’ve never seen before; Uber, I think, is an example, where we’re getting pop-up notifications that have ads in them, but just about every context you can think of: I saw an ad in a fortune cookie the other day. If there’s a space where there’s people’s eyes, it’s being turned into a space for advertising.

And there are two, I think, counterintuitive reasons that this is happening. And the first one’s actually because there are increasingly regulations and restrictions about privacy, right? There’s laws, more so in Europe than in the United States, that are restricting the ways that companies can collect and use your data.

And simultaneously, Google and Apple, who control all of the phones, understand that the writing is on the wall here, and they’re trying to get out in front of regulation before it happens, by putting their own limits on how companies collect data on their platforms.

Now what this does is it makes advertising less profitable, right, because targeted ads make more money than regular ads. But those targeted ads need lots of data. And if the data’s harder to find, it’s harder to make money if you’re a company that makes its cash on ads.

So what do you do in that situation? You just increase the number of ads that you’re showing people.

Simultaneously, there’s this other thing that’s happening in the technology industry, which is the economy, right? The federal government has raised interest rates; that makes it more expensive to borrow money. And all of this endless runway that the technology companies had for the better part of the decade is suddenly drying up.

And there’s been this shift where investors have started to understand that the technology industry isn’t some kind of magic money printing machine, and people are expecting more return on their investment.

So if you’re a company, and you need to add a new revenue stream and you don’t have any great ideas, the obvious one is to add more ads to your platform, or put them in places where they’ve never been before.

So there’s these two competing forces, right, privacy and the economy, that are pushing companies to inundate us with ads. And it’s really grown to an astonishing level. 

by Janine Jackson and Thomas Germain, FAIR