Set an AI
A Tale of Two Recursions
Did Stalin’s position, then, rest on data of any sort whatever? Of course not. In such cases facts and figures did not interest him. If Stalin said anything, it meant it was so - after all, he was a “genius,” and a genius does not need to count, he only needs to look and can immediately tell how it should be. When he expresses his opinion, everyone has to repeat it and to admire his wisdom… the proposal was not based on an actual assessment of the situation but on the fantastic ideas of a person divorced from reality.
One of the problems of dictatorship is sometimes described as the “dictator’s dilemma”. The dictator wants to know how much repression he needs to apply to maintain order, but the act of applying repression destroys the feedback mechanisms that tell him how much repression is needed. Eventually, the dictator is making decisions in the dark, untethered from reality and guided only by his own paranoia.
Those same pressures tend to destroy essential information feedback loops in any authoritarian structure. If the people around a leader are nervous about telling them the truth, the information the leader receives eventually contains less and less reality and more and more of the leader’s own thoughts, echoed back to them.
This dynamic was sufficiently clear by 1532 that Machiavelli devoted a chapter of The Prince to the dangers that flatterers pose to a prince.
Some of the strongest characterisations of where AI will take us rely on the idea of “recursive self-improvement”.
The concept is that as AI systems get better, they will produce output that can be used to train further AI systems. This whole process could happen more and more quickly and, so the story goes, would lead to an AI system that gets smarter faster than human society can keep up with.
The opposing view says that this doesn’t actually happen: AI learning from AI ultimately produces lower and lower quality output as it becomes more disconnected from reality. One example of that viewpoint is Zenil’s paper On the Limits of Self-Improving in Large Language Models, and a number of people have found that as the training corpus fills up with AI-generated content, you see something called “model collapse”: the resulting AIs get stupider.
Does that kill the idea of recursive self-improvement?
Not entirely. While model collapse is a real thing, Zenil’s paper specifically describes the case where the proportion of external signal (i.e. connection to reality) in the training data set decreases towards zero. This modern version of the dictator’s dilemma may be true, but it’s an extreme that people training models on synthetic data tend not to fall into in real life.
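To make the role of that proportion concrete, here is a toy sketch. It is not Zenil’s model, just an illustration I am assuming for the sake of argument: the “model” is a Gaussian that is refitted each generation to a mix of its own samples and fresh samples from “reality” (a fixed N(0, 1)). The function name final_spread and all of its parameters are hypothetical.

```python
import random
import statistics


def final_spread(real_fraction, generations=1000, n=50, seed=0):
    """Refit a Gaussian "model" to a mix of real and self-generated samples.

    Toy assumption: "reality" is a fixed N(0, 1). Each generation trains on
    n samples, of which real_fraction come from reality and the rest from
    the previous generation's model. Returns the last fitted std deviation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 matches reality exactly
    n_real = round(real_fraction * n)
    for _ in range(generations):
        real = [rng.gauss(0.0, 1.0) for _ in range(n_real)]
        synthetic = [rng.gauss(mu, sigma) for _ in range(n - n_real)]
        data = real + synthetic
        # "Training" here is just re-estimating the two parameters.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
    return sigma


if __name__ == "__main__":
    print("no external signal:  ", final_spread(real_fraction=0.0))
    print("half external signal:", final_spread(real_fraction=0.5))
```

With no external signal the fitted spread shrinks towards zero over the generations, which is the toy version of the model forgetting what the world looks like. With a fixed share of real samples in every generation it stays in the neighbourhood of the true value.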
Machiavelli’s suggested solution to flatterers was not to get rid of flattery altogether - in his view it provided a useful respect signal, and stopped you from being swamped in too many different opinions. His recommendation for avoiding the dangers of becoming untethered from reality was to carefully choose a restricted group of advisors who are required, when asked, to tell the truth even if it is uncomfortable. That way the wise prince’s court incorporates both flatterers and grounded truth tellers.
Even though Einstein can get a long way with Gedankenexperiment, at some point, if you want your physics to match up to reality, you need to test it. There are lots of ways to test AIs against an external reality. They can play games with fixed rules and look at their win record. They can write code, and we can tell quickly whether it compiles and meets the requirements or not. We can give them access to simulated virtual environments, or even real robotic environments. To avoid Zenil’s conclusion, we just need to ensure that some amount of reality is always finding its way into the training data.
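The code case is the easiest one to sketch. The snippet below is a minimal illustration, under my own assumptions, of using tests as the external check: candidate programs (which would come from a model) are kept as training examples only if they compile and pass every test case. The helper names passes_tests and keep_verified are hypothetical, and a real pipeline would run untrusted code in a sandbox with timeouts rather than calling exec directly.

```python
def passes_tests(source, func_name, test_cases):
    """Return True if `source` defines `func_name` and it passes every case."""
    namespace = {}
    try:
        # compile() rejects candidates that are not valid Python at all.
        exec(compile(source, "<candidate>", "exec"), namespace)
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in test_cases)
    except Exception:
        # Syntax errors, missing functions and runtime crashes all fail.
        return False


def keep_verified(candidates, func_name, test_cases):
    """Keep only the candidates that survive contact with the tests."""
    return [src for src in candidates if passes_tests(src, func_name, test_cases)]


if __name__ == "__main__":
    candidates = [
        "def add(a, b):\n    return a + b\n",  # correct
        "def add(a, b):\n    return a - b\n",  # runs, but wrong
        "def add(a, b) return a + b\n",        # does not compile
    ]
    tests = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
    print(len(keep_verified(candidates, "add", tests)), "candidate(s) kept")
```

The tests play the part of Machiavelli’s truth tellers: the generated code can say whatever it likes about itself, but only what passes gets back into the training data.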
I’ve recently been fine-tuning an AI to read simple musical scores from images. I created synthetic music, rendered it to sheet music, and then gave it to the AI to try to reproduce the music that I’d used to render the score.
Well, I say that I created the synthetic music, but actually, an AI did that for me. The fact that the dataset I was using was partially synthetically generated by AI didn’t disrupt the training process.
On top of that, AI also did almost all of the actual work of setting up the code to run the training process, choosing the service to run it on (Runpod) and the service to deploy it to for inference (Modal), and writing the code to evaluate loss.
I was there to direct and to keep it honest (we occasionally had differences of opinion about what 100% accurate meant). We didn’t have a true multilevel recursive structure, and this didn’t result in an intelligence take-off event. Even so, the idea of recursive self-improvement (in at least some areas) seems to me to be much closer to a practical fact than a theoretical impossibility.
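Because the score is rendered from music that is already known, every example comes with its own ground truth. The sketch below shows two ways such a round trip could be scored. It is not the evaluation code from my project: the note encoding (strings like "C4:q" for a quarter-note C4) and the function names are assumptions made up for illustration.

```python
def exact_match(predicted, truth):
    """Strict score: the whole transcription must match note for note."""
    return predicted == truth


def note_accuracy(predicted, truth):
    """Lenient score: fraction of positions where the notes agree.

    Dividing by the longer sequence means missing or extra notes
    count as errors rather than being silently ignored.
    """
    if not predicted and not truth:
        return 1.0
    matches = sum(p == t for p, t in zip(predicted, truth))
    return matches / max(len(predicted), len(truth))


if __name__ == "__main__":
    truth = ["C4:q", "D4:q", "E4:h", "G4:q"]
    predicted = ["C4:q", "D4:q", "E4:q", "G4:q"]  # one wrong duration
    print(exact_match(predicted, truth), note_accuracy(predicted, truth))
```

The two functions can disagree sharply on the same output: one wrong duration in four notes is a complete failure by the strict measure and 75% by the lenient one, so it matters which of them a claim of accuracy refers to.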