Some Thoughts on Content Creation and Theft

I’ve never been fond of the term “content creator”, basically because it’s thrown around by large numbers of people who have nothing to say, other than that they want to be thought of as content creators. That self-applied term is as meaningless as things like start-up (which has become a meaningless buzzword in Japanese as well), entrepreneur, solopreneur, and a diverse spectrum of other popular buzzwords. Anyone can call themselves a content creator, and that has led to a serious devaluing of the term.

But for people who actually create content or have likenesses they wish to protect the rights to, the Internet—and social media in particular—has simply enabled theft thereof without consequences, including theft of material purportedly protected by laws.

Anything you create and dare to put online can be unlawfully published and used to make profit, and there’s virtually nothing you can do about it that will have any effect, unless you are a large corporation with a team of attorneys, and even those entities are plagued by pirating and unlawful publishing.

The provenance of most of the content uploaded to social media is unknown and undisclosed, not that disclosing the provenance grants publishing rights; it does not. Since a lot of that content is the result of multiple rounds of unlawful publishing, an unlawful republisher very likely doesn’t even know who owns the content they have unlawfully republished. The proliferation of “Where is that?” questions about photos, and the annoyance some thieves show at those questions, is evidence of this situation. The unlawful republisher often does not know where an impressive photo was taken.

Anonymity and the social media business models that rely on providing and protecting user and advertiser anonymity have rendered legal remedies meaningless, even if they were economically feasible, which they seldom are.

This is demonstrated by the countless anonymous page posts on Facebook. Zuckerberg is certainly not interested in stopping these posts, because they provoke engagement, and engagement gives him and his company more money and increased power to capture the attention—and manipulate the behavior—of what are now billions of users.

The game has been won by the tech giants, and it looks like nobody is willing to stop them. People who remain silent are guilty of contributory negligence and act as accomplices, although apparently many haven’t a clue as to what’s going on.

Jaron Lanier was right.

Where did the chatbot hear that?

For more than a year now, the buzz in cyberspace has arguably been buzzier than anything we’ve seen in a while. It is the buzz about AI chatbots, the highest-profile one at the moment being ChatGPT and its peripheral functions, created by OpenAI.

The buzz has been triggered by ChatGPT’s abilities in several areas. One is its ability to come up with plausible answers to questions, in English that borders on human-written text.

Another is its amazing ability to come up with things in diverse styles such as haiku and rap on demand.

Yet another is ChatGPT’s ability to make breathtakingly stupid factual mistakes, some of them total fabrications that have come to be called hallucinations, yet that could still fool unwary and credulous chatbot-struck users. A related problem is its own credulity in accepting leading questions and producing responses that rely on the falsehoods and mischaracterizations in the questions put to it.

These aspects of ChatGPT’s behavior aside, the appearance of such chatbots means that humans must pay more attention to credibility and accountability than ever before.

If a human friend tells you something that is not only shocking but incredible in the true sense of the word, you can ask the friend “Where in the world did you hear that?” And if your friend says she heard it from YouTube, you might be just a bit skeptical. If she learned it from a certain highly opinionated podcaster known for promoting conspiracy theories, you might start to wonder about the trustworthiness of that friend’s statement, including statements about other subjects. But you should be thankful that your human friend is at least willing and able to reveal the source of her information, enabling you to evaluate it. That’s where AI chatbots part ways with the real world.

ChatGPT and its like collect information from countless Internet sources, some good, some not-so-good, and some totally wrong. The learning process is an opaque and impenetrable black box. You might wonder what sources were used to generate a totally fabricated and factually incorrect account of events that you know is wrong; or about what sources were used to generate a true, useful response. You might not care if you know the answer to the question you asked and are only window-shopping for chatbot failure stories to post online.

But what about when you ask ChatGPT or its now-multiplying wannabe clones a non-trivial question you don’t know the answer to? If the chatbot gives you a plausible-sounding answer, you or others might believe it and could make decisions based on the chatbot response.

I have experimented numerous times with some leading questions I know the answers to; ChatGPT failed miserably in too many cases to repair the damage already done to its reputation with me. Getting facts wrong about events that are not likely to affect our lives or fortunes is one thing. Fabricating answers to questions that are more important, however, is potentially very dangerous.

Since AI chatbots learn from what humans have written on the Internet, the quality of what the humans write is even more important than before. When you consider that much of what is written on the Internet is not even written by fully identified humans, the potential problems come into focus. It is important to be able to know and evaluate the sources of an AI chatbot’s learning. But before that, it would be better if the chatbot itself could know and evaluate the sources of the information from which it is learning, thereby front-loading quality into its knowledge base and, by extension, its responses. The anonymity and lack of accountability that have long been characteristic of Internet information make that quite difficult.

That anonymity and lack of accountability are a problem even when chatbots are learning from human-sourced information. But when chatbots start flooding the Internet with their own content, sometimes helped along by humans who trusted them, will chatbots effectively start learning from other chatbots that themselves have learned from not-very-learned humans or even from other chatbots? The image of the multiplying brooms in Disney’s Sorcerer’s Apprentice comes to mind. Let the believer beware.