Not quite 15 years ago, I proposed that Web 3.0 would be system-generated content. There was talk about the semantic web, where we started tagging things, even auto-tagging, and then operating on chunks by rules connecting tags, not hard wiring. I think, however, that we’ve reached a new interpretation of Web 3.0 and system-generated content.
Back then, I postulated that Web 1.0 was producer-generated content. That is, the only folks who could put up content were those who had all the skills: teams (or the rare individual) who could manage the writing, the technical specification, and the technical implementation. They had to put prose into HTML and then host it on the web with all the server requirements. These were the ones who controlled what was out there. Pages were static.
Then, CGIs came along, and folks could maintain state. This enabled some companies to make tools that could handle the backend, and so individuals could create. There were forms where you could type in content, and the system could handle posting it to the web (e.g. this blog!). So, most anyone could be a web creator. Social media emerged (with all the associated good and bad). This was Web 2.0, user-generated content.
I saw the next step as system-generated content. Here, I meant small chunks of (human-generated) content linked together on the fly by rules. This is, indeed, what we see in many sites. For instance, when you see recommendations, they’re based upon your actions and statistical inferences from a database of previous actions. Rules pull up content descriptions by tags and present them together.
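To make the idea concrete, here’s a minimal sketch of that tag-driven assembly. All the names (the chunk list, the `recommend` rule) are hypothetical illustrations, not any particular site’s implementation: small chunks of human-authored content carry tags, and a simple rule pulls related chunks together on the fly.

```python
# Hypothetical content chunks, each tagged by topic.
chunks = [
    {"title": "Intro to CGI", "tags": {"web", "history"}},
    {"title": "Tagging basics", "tags": {"semantics", "web"}},
    {"title": "Rule engines", "tags": {"semantics", "rules"}},
]

def recommend(viewed_tags, chunks):
    """A toy rule: surface any chunk sharing at least one tag
    with what the visitor has already viewed."""
    return [c["title"] for c in chunks if c["tags"] & viewed_tags]

# A visitor who has been reading about semantics gets the two
# semantics-tagged chunks assembled together.
print(recommend({"semantics"}, chunks))
```

Real recommendation systems layer statistical inference on top of this, but the core move is the same: content is stored as tagged fragments and composed by rules at request time, not hard-wired into pages.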
There is another interpretation of Web 3.0, which is where systems are disaggregated. So, your content isn’t hosted in one place, but is distributed (cf. Mastodon or blockchain). Here, the content and content creation are not under the control of one provider. This disaggregation undermines unified control, really a political issue with a technical solution.
However, we now see a new form of system-generated content. I’ll be clear, this isn’t what I foresaw (though, post-hoc, it could be inferred). That is, generative AI is taking semantics to a new level. It’s generating content based upon previous content. That’s different than what I meant, but it is an extension. It has positives and negatives, as did the previous approaches.
Ethics, ultimately, plays a role in how these play out. As they say, PowerPoint doesn’t kill people, bad design does. So, too, with these technologies. While I laud exploration, I also champion keeping experimentation in check. That is, nicely sandboxing such experimentation until we understand it and can have appropriate safeguards in place. As it is, we don’t yet understand the copyright implications, for one. I note that this blog was contributing to Google C4 (according to a tool I can no longer find), for instance. Also, everyone using ChatGPT 3 has to assume that their queries are data.
I think we’re seeing system-generated content in a very new way. It’s exciting in terms of work automation, and scary in terms of the trustworthiness of the output. I’m erring on the side of not using such tools, for now. I’m fortunate to work in a position where people pay me for my expertise. Thus, I will continue to rely on my own interpretation of what others say, not on an aggregation tool. Of course, people could generate stuff and say it’s from me; that’s Web 3.0 and system-generated content. Do be careful out there!