Finishing Schools, Generative AI, and How Ethics Is Not a Solvable Problem
What do images of black nazis, dark-skinned popes, cybernetics, and the culture war have in common? Relative complexity.
Summary: Ethics is an incredibly complex, adaptive problem that you never solve, only address over time. Generative AI is being used in so many different ways that it runs into the full complexity of ethics: what is good, and how do you communicate it? Any attempt to make AI “good” (ethical, values-aligned, etc.) needs to be at least as complex as human ethics itself. A single tool, a simple filter, or a single company is far too simple for the task at hand.
In the news recently: You may have seen many reports of PR and functionality issues around generative AI, especially on charged culture-war topics. There are many such cases, and no single one-off article gets at the real underlying problem. Understandably, some companies have pulled back from the touchy issues they have stumbled on. A Google search will show you many examples.
What’s going on? Imagine the underlying models are like a precocious, rambunctious teenager. Imagine teaching that teenager manners for high society, because that teenager is going to be a United States Senator, Queen of England, etc. In other words: you need them to grow into deft, positive communicators for the unforeseen, wildly complex challenges they will encounter.
To do this well, you’d want a great teacher: one who walks the teenager through many examples and builds up their expertise through repetition and practice. That teacher would have to be incredibly deft at reading complex ethical situations and coaching your precious teenager (the AI model).
What you wouldn’t do is give them a short and simplistic set of rules, force them to follow it, and then be surprised when they make gaffes in public.
But this is part of what the industry has tried with generative AI.
This method of teaching the AI manners through many examples is called Reinforcement Learning from Human Feedback (RLHF). Just as the style of a single teacher comes through in their students, the culture of a company using RLHF will show up in the AI models taught manners with this method. That is, the technique itself may be up to the challenge, but an overly simplistic teacher (the company and the feedback it provides) can still curtail the student (the AI model).
The gaffes and problems arise when the methods of teaching are not up to the problem. Suppose you gave this precocious teenager five years of etiquette practice and public-speaking training, and then still forced them to follow a few simple rules like "always say please" or "never use bad words." Companies trying to slap ethical guardrails onto the output of these models (LLMs) are essentially doing just that: providing the AI equivalent of a few polite maxims in hopes of "civilizing" them.
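To make the contrast concrete, here is a minimal, hypothetical sketch of the "few polite maxims" approach: a post-hoc filter bolted onto whatever the model generates. Every function name and phrase list here is invented for illustration, and no vendor's real guardrail stack looks this simple, but the structural weakness is the same: the filter has a handful of string rules, while the model's outputs span the whole messiness of human conversation.

```python
# A hypothetical "polite maxims" guardrail: a post-hoc filter applied to
# model output. Everything here is illustrative, not any vendor's real system.

BANNED_PHRASES = ["bad word", "rude remark"]      # the "never use bad words" rule
REQUIRED_PREFIX = "Please note: "                 # the "always say please" rule

def naive_guardrail(model_output: str) -> str:
    """Apply a tiny, fixed rule set to an arbitrarily rich model output."""
    for phrase in BANNED_PHRASES:
        if phrase in model_output.lower():
            return "I'm sorry, I can't help with that."  # blunt refusal, no nuance
    return REQUIRED_PREFIX + model_output

# The failure mode: the rule set has a handful of states, the input space does not.
# A quotation, satire, or a historically sensitive answer passes or fails on string
# matching alone; the filter cannot "read the room."
print(naive_guardrail("Here is a rude remark quoted from a historical speech..."))
print(naive_guardrail("A nuanced answer the filter happily rubber-stamps."))
```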
What a waste of all the potential.
How unsurprising all the gaffes are.
The problem is that human ethics are unsolvably complex. Anyone would be foolish to believe a single company could encapsulate them in a tidy set of simple rules for a generative AI. We, the public and the critics, are foolish and complicit when we demand such perfection, lambasting the technology over inevitable missteps that even we humans make. It's a lose-lose game, especially for proactive companies making good-faith efforts to address biases and safety.
But it is a chance for leadership, whether from a company, regulator, journalist, or anyone else, to raise the conversation to the full challenge of ethics-as-domain, not a single problem to “get right”.
Ashby’s Law of Requisite Variety
The underlying problem here is well understood, and it has been hiding in plain sight: Ashby’s Law of Requisite Variety.
The Basic Idea: If you want to control a complex thing in a complex domain, the system you use to control it needs to be at least as complex as the thing you're controlling.
Example: Imagine trying to drive a car with just a gas pedal and brake pedal. Sure, you can make it go and stop, but you have no way of steering! That's why steering wheels, turn signals, mirrors, etc., are important – they give you enough control to handle the complexity of driving.
Applied to AI: Trying to control the output of a powerful AI model with a few simple rules is like the car example. Human ethics are hugely complex and change over time. A simple "decision tree" can't possibly keep up or handle all the nuances.
Key Takeaway: To responsibly manage complex systems (like powerful AI), humanity needs equally complex and flexible control methods. Simplistic rules won't cut it.
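Here is a toy illustration of Ashby's point, as a small simulation with made-up numbers: a regulator can only hold the outcome steady if it has at least as many distinct responses as there are distinct disturbances. The function and parameters below are invented for this sketch, not drawn from any real system.

```python
# Toy illustration of Ashby's Law of Requisite Variety: a regulator with fewer
# distinct responses than the environment has distinct disturbances simply cannot
# cancel them all. All numbers are made up for illustration.
import random

def regulate(num_disturbance_types: int, num_control_actions: int, trials: int = 10_000) -> float:
    """Return the fraction of disturbances the regulator successfully cancels.

    Disturbance d is cancelled only by the matching action d; a regulator with
    fewer actions than disturbance types has no response at all for some of them.
    """
    cancelled = 0
    for _ in range(trials):
        disturbance = random.randrange(num_disturbance_types)
        if disturbance < num_control_actions:   # regulator has a matching response
            cancelled += 1
    return cancelled / trials

# A "gas pedal and brake pedal" controller (2 actions) against 10 kinds of road events:
print(regulate(num_disturbance_types=10, num_control_actions=2))    # roughly 0.2
# A controller with as much variety as its environment:
print(regulate(num_disturbance_types=10, num_control_actions=10))   # 1.0
```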
Hastily added ethical filters coming from a single peculiar company’s culture, like a basic decision tree, are woefully insufficient to manage the vast complexity of a powerful AI model. Encoding the social mores of a single culture at a single moment in time leaves huge opportunities for manipulation by authoritarian regimes or misinterpretation as those cultural norms inevitably shift.
It’s difficult to overstate how much of a long-term problem that is. Anyone working on these systems should ask themselves: If I were to build this system, and my political enemies were then to gain control of it, would I trust in the resilience and adaptiveness of what I built, or would I be afraid of abuse?
If you’d be afraid of abuse, build something else: Something with sufficient complexity for the problem at hand.
It is not the fault of any specific company’s culture. The problem is that a single company’s culture is a type error. It cannot be a complex-enough tool. No possible company culture is complex enough for the problem.
Expanding out: I wrote this as a criticism of the companies. But most of humanity is complicit.
When we as citizens and journalists and critics ask a tech company to “solve” the messy, ongoing, never-fully-solvable problem of human ethics, of course we’re going to be disappointed! We customers have asked for something that cannot be done. And when the impossible isn’t achieved, shame on us for asking.
It's time to embrace the inherent messiness and diversity of ethics instead of striving for AI models to be paragons of virtue to everyone (an impossible goal). The industry should focus on equipping AI products with the ability to recognize ethical dilemmas, weigh trade-offs, and transparently explain their reasoning. Then the pluralism of all their users can act in concert with the AI tools themselves. This mirrors how humanity ideally teaches children: by fostering ethical awareness and critical thinking, not rigid conformity, and by allowing a pluralism of humans to address ongoing ethical issues.
Moreover, those pushing for simplistic ethical controls on LLMs often misunderstand the fundamental nature of these models. They are vast probabilistic systems, generating responses based on learned patterns from massive datasets. Expecting them to flawlessly comply with rigid ethical principles is akin to demanding that a weather prediction model never forecast rain on a sunny day – it misses the inherently probabilistic nature of the system.
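To show what "vast probabilistic systems" means in the plainest terms, here is a minimal sketch of sampling from a next-token distribution. The vocabulary and probabilities are invented for illustration and bear no relation to a real model's outputs; the point is only that each response is a draw from a distribution, not a lookup in a rule table, so a rare bad outcome is improbable rather than impossible.

```python
# Minimal sketch of why LLM output is probabilistic: each token is a draw from a
# learned distribution, not a rule lookup. The tokens and probabilities below are
# invented for illustration only.
import random

next_token_probs = {
    "polite_reply": 0.62,
    "blunt_reply": 0.25,
    "off_topic_tangent": 0.10,
    "awkward_gaffe": 0.03,   # rare, but never zero
}

def sample_token(probs: dict[str, float]) -> str:
    """Draw one token according to its probability mass."""
    r = random.random()
    cumulative = 0.0
    for token, p in probs.items():
        cumulative += p
        if r <= cumulative:
            return token
    return token  # guard against floating-point rounding at the tail

# Run it enough times and the 3% outcome will appear; demanding that it never
# does is demanding a different kind of system.
samples = [sample_token(next_token_probs) for _ in range(1000)]
print(samples.count("awkward_gaffe"), "gaffes out of 1000 draws")
```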
This doesn't absolve the industry of responsibility. Companies building generative AI systems have a duty to rigorously analyze their training data for biases and work to mitigate clear flaws. But this is a nuanced, ongoing process, not a problem solved by crude filters that attempt to strong-arm a complex system into parroting a single set of myopic values.
To begin to address AI ethics, humanity needs to broaden the conversation. It's not just about the technology; it's about societal values and how humanity wants to use these powerful tools. And how “societal values” are not a problem to be solved but a concern to continually address and evolve. Expecting AI models to be perfectly ethical (a nonsense ask) sidesteps the hard work of determining what "ethical" even means in rapidly evolving contexts, let alone how to reflect that in code.
It’s not a problem to be solved, but it is a chance to lead, for anyone inside or outside the industry, to raise the conversation.
The danger of pretending otherwise is twofold. First, it sets up AI for inevitable failure, leading to public backlash and potentially stifled innovation. Second, it absolves us (companies, governments, individuals) of taking responsibility for the complex ethical choices embedded within these systems.
Demanding that generative AI models perfectly mirror the complexity of human ethics is nonsensical. Ethics is more complex than an AI model. To expect flawlessness is to fundamentally misunderstand how they operate and to misunderstand ethics or values.
The control system needs to match the complexity of the environment (human ethics) and the system being controlled (the AI model). With our first attempts, the industry has a gross mismatch, and that's bad for everyone.
Summary: Ethics as a whole is an ongoing quest for human society as a whole. It’s not a problem to be solved. It is very messy and often bloody. Ethics is also something society cannot delegate away to an AI. So the messy system that works out questions of ethics is the one that ultimately has to control AIs more directly, and it is also incapable of solving for a neat single outcome.
And collapsing the question of ethics or values into an overly simple control mechanism on AIs is bound to fail, whether the attempt comes from within the industry or as a demand from outside it.