To see where this dump truck is headed, let’s first follow the debris trail.
Tracing the trail is difficult, but the impacts of internet content factories, which flourished until around 2010, are still visible.
The net effect of content being generated at a rate of ten to thirty pieces a day on specialized topics – all in the hands of non-specialists, via the guiding hand of Google Trends – was an Internet that, by 2010, was saturated with fluff (if not nonsense): keyword-stuffed articles that offered little usable information and, in many cases, a lot of advice and information that was just plain incorrect.
And because content mills naturally spawned more content mills (and why not, when Associated Content, among the worst of them, sold to Yahoo! for $100 million), what happened next was inevitable. These new content companies simply replicated what they found on bigger content mills, using the Internet of the day as a training set, so to speak. The cycle of bad articles with few details or, worse, inaccurate details repeated itself over and over again until it became difficult to distinguish one article from the next unless it was found on one of the rare edited and renowned sites.
The name of the game for these early content companies was pure volume. Revenue from ad networks (Google AdSense, etc.) was already down in 2005, but with thousands if not millions of articles, each generating maybe three cents a day, the money wasn’t bad. For a content factory with 200,000 articles, this was a tidy $2 million business with extremely low overhead. Hosting wasn’t hugely expensive, web design was easy with open source CMS tools like WordPress, Drupal and others, and most importantly (and ultimately disastrously), bulk content could be purchased for only pennies per item from offshore shops.
This pattern meant that the internet was quickly flooded with poorly written nonsense, much of which is still searchable in its original or even worse re-edited form. Google had to start stepping up its game to filter around this and learn how to deliver quality content versus the magic mix of keywords that content mills could exploit.
The issues with content mills are clear, especially all these years later, but it all happened on a human scale, with the limitations of “slow” writers and keyword stuffers. The future presents us with a new problem, one that could permanently change the way we use the Internet.
Let’s do some math
Imagine it’s 2006 and you’re in the content industry. You are at the top of your game. You have a team of 100 writers in India who each earn the equivalent of $10 a day to write and publish twenty 400-word articles (topics driven by Google keyword trending data rather than expertise).
Your daily salary costs are around $1,000. Every day of the year, your content mill publishes 2,000 pieces of “unique” content, and each of those articles, assuming good search engine rankings (which could easily be gamed with keyword tricks at the time), will generate three cents a day.
And while we’re using nice, rounded numbers for ease, consider these annual numbers (annual because you only need to run this business for a year, Adsense money comes in no matter what, at least for some time):
Paying writers who each create, publish, and tag 20 articles a day is costing you $365,000 a year. They generate 730,000 pieces of content, each worth $10.95 over the year (assuming three cents per day for 365 days). And all of that, which is pretty handy for you, Western Content Lord, means you have a business generating about $8 million a year.
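For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The figures are the round numbers used above (100 writers, $10 a day, 20 articles each, three cents per article per day); the variable names are purely illustrative. Like the text, it credits every article with a full 365 days of earnings.

```python
# Back-of-the-envelope content mill economics, using the article's round numbers.
# Variable names are illustrative; nothing here comes from real AdSense data.
WRITERS = 100
DAILY_WAGE_USD = 10
ARTICLES_PER_WRITER_PER_DAY = 20
CENTS_PER_ARTICLE_PER_DAY = 3
DAYS = 365

# Cost side: what the writers are paid over the year.
annual_wages_usd = WRITERS * DAILY_WAGE_USD * DAYS

# Output side: how many articles get published over the year.
articles_published = WRITERS * ARTICLES_PER_WRITER_PER_DAY * DAYS

# Revenue side, kept in whole cents to avoid floating-point rounding.
cents_per_article_per_year = CENTS_PER_ARTICLE_PER_DAY * DAYS
annual_revenue_cents = articles_published * cents_per_article_per_year
annual_revenue_usd = annual_revenue_cents / 100

print(f"Writers cost:       ${annual_wages_usd:,}")
print(f"Articles published: {articles_published:,}")
print(f"Value per article:  ${cents_per_article_per_year / 100:.2f}")
print(f"Annual revenue:     ${annual_revenue_usd:,.0f}")
```

Changing any one constant shows how sensitive the whole business is to the per-article ad rate, which is the number the ad networks control.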
Oh. But you have to subtract hosting and such. Let’s call it five thousand dollars. The big ugly cost? All those “expensive” writers. And you think to yourself: who needs them?
Because oh boy, is there a new business model for content mills. And while their early 2000s predecessors made the internet boring and full of junk articles that hit keyword and word count goals without saying anything at all, this one is disruptive enough to turn the internet into complete garbage. And not only from a content point of view, but also from the point of view of how the business operates on the Internet.
Putting the S in IoS
This new business model is already being rolled out. You have probably read many articles generated by GPT or similar AI models. The reason you probably haven’t noticed is that they’re not bad. Or rather, you think they’re not bad, but that’s because you’ve been weaned on the Internet of Shit (IoS) that content factories created, which has lowered our expectations for information consumption.
The problem is that these AI-generated articles must get their information somewhere, and in sufficient volume, to produce new clones of that information masked in slightly more eloquent language. And where does the AI training data come from? From the IoS, of course.
To do some more math, suppose that 10% of the training data derived from the IoS contains factual errors. As the AI trains, then retrains and retrains again, these errors increase. And go up. And multiply. After a decade of retraining on bad, weirdly worded, and increasingly incomprehensible data, we end up with a real IoS.
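To make the compounding concrete, here is a toy model in Python. Only the 10% starting figure comes from the text. The rule that each retraining round keeps every existing error and corrupts a further 5% of whatever was still correct is an assumption invented for illustration, not a measured rate, and the function name is hypothetical.

```python
def error_share(initial_error, new_error_rate, generations):
    """Toy model of error build-up across retraining rounds.

    ASSUMPTION (not from any real measurement): every round keeps all
    existing errors and corrupts `new_error_rate` of the remaining
    correct content.
    """
    shares = [initial_error]
    for _ in range(generations):
        previous = shares[-1]
        # Old errors persist; a slice of the still-correct data goes bad.
        shares.append(previous + new_error_rate * (1.0 - previous))
    return shares


# 10% bad data to start (from the text), 5% new corruption per round (assumed).
shares = error_share(0.10, 0.05, 10)
for generation, share in enumerate(shares):
    print(f"round {generation:2d}: {share:.1%} of the corpus is wrong")
```

Under this rule the share of correct data shrinks geometrically with every round. The exact numbers mean nothing, but the direction of travel does.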
And math is again super important, as is volume.
A single Western Content Lord-scale content mill operator, for example, can use free tools to generate content as quickly as human operators can plug in a simple prompt. That same team of 100 workers can enter 300 pieces per day.
They don’t write it, they just ask ChatGPT. They can ask it to stuff the copy with keywords like a mofo, and to generate the keywords too, for that matter. Eventually, this ChatGPT process (as one of many examples) will have API hooks to post output directly to WordPress or whatever CMS Content Lord chooses.
When this unification of the AI platform to CMS is complete, the circle is complete: the Internet does nothing but talk to itself.
The race to the bottom
What Western Content Lord and its competitors don’t realize is how quickly this race to the bottom will begin, and it will be soon.
Google AdSense and every other ad network on the planet will recognize the deluge and cut what they pay for a click or a view to next to nothing. And then it will be nothing, but not before Google and its ilk scramble to blacklist known AI content factories. But too many will appear too quickly. It will just be easier for a Google, for example, to create a safe list of well-known publishers supported by hard-working humans.
Great, you think, balance is restored! Not really.
Keeping up with all the search innovations that push those IoS results down for you will cost Google money: AI training at billion-dollar scale, plus massive and frequent retraining on the corpus of the internet. That corpus will be infected fast and furiously. And how do the search giants pay for all this search innovation? Via advertising revenue.
Search advertising giants like Google might seem to be holding their noses and accepting content mill results in the queue because it’s in their economic interest to do so. But what if the pool of “acceptable” content drops by 95%?
The Exponential Rate of Internet Shittification
We return again to the theme of mathematics and volume to address the most important point: bad information is an exponential problem. A series of errors generated and then repeated by content factories over a decade means that these errors are trained into the foundational AI language models from the corpus of the Internet, and then reinforced.
It’s one thing to live in the age of fake news, partly because, for most thinking people, it’s obviously fake. But when the internet repeats a mistake often enough, it becomes the truth, and that is the most insidious accidental result of it all.
Personally, it would make me feel better to end this piece with some kind of “fight the power” message, but honestly, at this point, the cat is out of the bag. Content factories can make do with revenue per item measured on a five-year value plan and as low as 0.05 cents over time. But who cares, right? It’s free money. Hosting is cheap, a CMS is free, and as long as there’s advertising money, it’s worth it.
It’s the internet you deserve, apparently. ®