Fractured Entanglement

Some of the most interesting perspectives on AI can be found on the Jim Rutt Show, which I have previously referenced with regard to AI risk.

In a more recent episode, Rutt interviews Ken Stanley about his Fractured Entangled Representation hypothesis. A preprint describing this hypothesis is available here.

Here is how Stanley defined his terms:

[Fractured] means that information underlying the same unitary concepts (e.g. how to add numbers) is split into disconnected pieces. Importantly, these pieces then become redundant as a result of their fracture: they are separately invoked in different contexts to model the same underlying concept, when ideally, a single instance of the concept would have sufficed. In other words, where there would ideally be the reuse of one deep understanding of a concept, instead there are different mechanisms for achieving the same function. At the same time, these fractured (and hence redundant) functions tend to become entangled with other fractured functions, which means that behaviors that should be independent and modular end up influencing each other in idiosyncratic and inappropriate ways. For example, a set of neurons within an image generator that change hair color might also cause the foliage in the background to change as well, and separating these two effects could be impossible.
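To make the entanglement part concrete for myself, here is a toy sketch in Python (my own illustration, not from Stanley's preprint, with made-up weights): a hidden unit whose outgoing weights feed both a "hair color" output and a "foliage" output cannot be nudged without moving both attributes at once.

```python
import numpy as np

# Toy illustration of entanglement: one hidden unit projects to BOTH outputs,
# so any intervention on it changes both behaviors together.
rng = np.random.default_rng(0)

W1 = rng.normal(size=(3, 4))           # input (4 features) -> hidden (3 units)
W2 = np.array([[1.0, 0.7, 0.0],        # output 0: "hair color"
               [0.0, 0.5, 1.0]])       # output 1: "foliage"
# Hidden unit 1 has nonzero weights to both outputs: that is the entanglement.

def forward(x, hidden_offset=None):
    h = np.tanh(W1 @ x)
    if hidden_offset is not None:
        h = h + hidden_offset           # simulate editing specific hidden units
    return W2 @ h

x = rng.normal(size=4)
baseline = forward(x)

# Try to change only "hair color" by nudging hidden unit 1:
edited = forward(x, hidden_offset=np.array([0.0, 1.0, 0.0]))

print("hair color change:", edited[0] - baseline[0])  # nonzero, as intended
print("foliage change:   ", edited[1] - baseline[1])  # also nonzero: side effect
```

In this caricature the side effect is obvious from the weight matrix; Stanley's point is that in a large trained network the same kind of shared, redundant circuitry is spread across millions of parameters, where it can no longer be cleanly separated.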

Stanley’s hypothesis reminded me of the way we understand the movement of the planets. In the pre-Copernican world, astronomers devised all sorts of complex models to explain why the planets moved the way they did. If we had given an AI enough observational data about planetary motion back then, it could have predicted the planets’ positions accurately, but it would have had no understanding of the underlying physics; its internal representations would have been complex and uninterpretable. It took a leap of understanding by Kepler to arrive at his elegant and simple laws of planetary motion, and it’s not clear that an AI, even with limitless training data, could have come up with them.
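As an aside (my own back-of-the-envelope check, not from the episode): Kepler's third law in heliocentric units, T² ≈ a³ with a in astronomical units and T in years, compresses the regularity into a single line, something a black-box fit to position data never has to represent explicitly.

```python
# Kepler's third law: T**2 == a**3 (a in AU, T in years), checked against
# approximate published values for six planets.
planets = {
    "Mercury": (0.387, 0.241),
    "Venus":   (0.723, 0.615),
    "Earth":   (1.000, 1.000),
    "Mars":    (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn":  (9.537, 29.447),
}

for name, (a, T) in planets.items():
    print(f"{name:8s}  a^3 = {a**3:8.3f}   T^2 = {T**2:8.3f}")
```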

I have only a basic understanding of neural networks, even though I’ve now used algorithms based on them for more than two decades to analyze genomic data. More recently, I have become what one might consider a power user of large language models. Stanley’s hypothesis articulates something that has long bothered me about both LLMs and the underlying neural networks.

Maybe the most important question this raises is whether the current unprecedented investment in AI infrastructure will actually pay off, or whether we need to rethink the way we train and construct AIs.
