The missing formal science of machine intelligence
There’s a convenient metaphor for talking about our understanding of AI: we are doing alchemy, and we need chemistry.
While helpful, it often gets flattened into a much narrower claim. AI needs more rigor, we’re told: better definitions, cleaner experiments, stronger mathematics, fewer hand-wavy stories dressed up as theory.
That would matter. It would improve the field.
It still would not capture the transition the metaphor is trying to name.
Beyond rigor
When chemistry emerged, the decisive change was not simply that experimenters became more careful. Late alchemy could be highly systematic. Procedures could be reproduced. Variables could be controlled. Practical knowledge could accumulate. What remained missing was deeper and stranger: a formal structure that organized the domain itself. Particular cases were visible. The space of possible cases was not.
That distinction matters.
This is the distinction the metaphor usually blurs. A field can be empirical, disciplined, and technologically powerful without yet possessing a formal science of its own objects. It can measure reliably, reproduce effects, and build artifacts that work. Practice often outruns understanding. What formal science adds is not mere seriousness. It is a structure that makes the space itself visible.
A field can become extraordinarily good at handling things before it learns how to think about them. Precision can increase. Techniques can stabilize. Engineering power can grow. Meanwhile the underlying objects remain only half-legible.
And this is not just a matter of elegance. It changes what kinds of problems can be solved at all.
Without formal structure, the shape of the search space remains invisible. When the search space is invisible, certain problems do not simply become harder. They become structurally inaccessible. Not because the researchers are foolish. Not because they are insufficiently determined. Because the task itself requires navigating a space they cannot yet see.
That is what chemistry changed. It did not merely make the study of matter more rigorous. It turned many problems from continuous, blind exploration into discrete, navigable search. Once there are elements, molecules, constraints on combination, and recurring structural families, you can enumerate candidates. You can rule things out. You can distinguish a dead end from a near miss. You can know when you have exhausted a region of possibility. The alchemist is not a worse chemist. The alchemist is searching in the dark.
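To make that contrast concrete, here is a minimal sketch in Python, using invented names and a toy “valence” constraint that merely stands in for real chemistry. Blind search samples an unstructured space and can never say when a region has been exhausted; structured search enumerates a finite space of candidates and knows exactly when it is done.

```python
import itertools
import random

# Toy illustration only: the "elements" and the valence rule are stand-ins,
# not real chemistry. The point is the shape of the search, not the science.
ELEMENTS = {"A": 1, "B": 2, "C": 3}   # hypothetical element -> valence
TARGET_VALENCE = 4                    # a made-up constraint on combinations


def blind_search(steps=1000):
    """Alchemy-style search: sample mixtures at random.

    There is no notion of having covered the space, so the only
    stopping rule is running out of patience.
    """
    found = set()
    for _ in range(steps):
        mixture = tuple(sorted(random.choices(list(ELEMENTS), k=2)))
        if sum(ELEMENTS[e] for e in mixture) == TARGET_VALENCE:
            found.add(mixture)
    return found  # no way to certify that nothing was missed


def structured_search():
    """Chemistry-style search: enumerate the finite space of pairs.

    Because the space is explicit, candidates can be ruled out one by one,
    and we know when a region of possibility is exhausted.
    """
    candidates = itertools.combinations_with_replacement(ELEMENTS, 2)
    return {pair for pair in candidates
            if sum(ELEMENTS[e] for e in pair) == TARGET_VALENCE}


if __name__ == "__main__":
    print("blind search found:     ", blind_search())
    print("structured search found:", structured_search())  # provably complete
```

Nothing about the toy matters except the difference between the two loops: one has a stopping criterion given by the structure of the space, the other has only patience.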
The periodic table did far more than summarize known substances. It organized the space of possible elements. It made gaps visible. It turned discovery into navigation.
Once a field acquires that kind of structure, inquiry changes character. Researchers are no longer wandering through an inventory. They are moving through a landscape.
What formal structure changes
Thermodynamics brought the same shift to engines. Engineers already knew plenty about engines; they could compare them, improve them, and scale them. Thermodynamics introduced entropy, free energy, the second law. Suddenly there was a way to separate design problems from physical limits. Some ambitions remained difficult. Others turned out to be impossible. That distinction changes everything. It saves effort, redirects invention, and prevents whole generations from pushing on locked doors.
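One standard illustration of such a limit, stated here only to make the distinction concrete: the second law bounds the efficiency of any heat engine operating between a hot reservoir at temperature T_h and a cold one at T_c, however ingenious its design.

```latex
% Carnot bound: no heat engine between reservoirs at T_h and T_c (in kelvin)
% can convert heat into work more efficiently than this, by the second law.
\eta \;=\; \frac{W_{\text{out}}}{Q_{\text{in}}} \;\le\; 1 - \frac{T_c}{T_h}
```

A proposed engine that claims to beat this bound can be rejected before it is ever built. That is what it means to separate a design problem from a physical limit.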
Geometry offers another version of the story. Euclidean geometry had been enormously successful, and for centuries it was treated as the only geometry there could be. Riemann opened a larger world in which Euclidean space appeared as one case among many.
Then there is the more unsettling example: quantum mechanics.
Quantum theory did not merely sharpen classical physics. It changed what counted as an acceptable description of a physical system. State, property, measurement, even the relation between observation and reality became unstable under the old picture. New formal tools made rigorous reasoning possible, but they also did something more disorienting: they altered the conceptual furniture of the domain.
That is the stronger lesson.
Sometimes the missing ingredient in a field is not only rigor. Sometimes the field lacks the language in which its phenomenon becomes thinkable at all.
Where AI stands
AI may be in that kind of moment.
The field already possesses enormous sophistication. It can probe model internals, steer behavior, analyze training dynamics, and explain why particular architectures work as well as they do. Much of this work is excellent. Some of it is mathematically deep. Some of it has already changed practice.
Still, the dominant pattern is familiar. We are learning how to inspect systems we already know how to build. We are getting better at intervention, diagnosis, and local explanation.
Useful, certainly. Important, often. A science of machine intelligence would ask for more.
To put it bluntly: AI has many microscopes. What remains hard to see is anything like a periodic table.
Researchers already feel the absence. The complaints recur: interpretability is too correlational, the units of analysis wobble, explanations remain underdefined, favored concepts carry more rhetorical force than formal clarity. Beneath those complaints sits a deeper uncertainty. What counts as an explanation here? What are the relevant objects of analysis? Which abstractions are illuminating the phenomenon, and which are merely convenient?
For that reason, the call for stronger foundations in interpretability is exactly right. The field needs cleaner notions of explanation, sharper standards of evidence, and fewer methods that appear persuasive while resting on concepts nobody has really nailed down.
Even so, the deeper issue may lie elsewhere.
Greater rigor can stabilize a practice without making the underlying domain intelligible. Methods can improve. Definitions can tighten. Benchmarks can become more convincing. All the while, the field may still be missing the formal objects that would reorganize what it is studying.
The practical trap
And that is not only a philosophical deficiency. It is a practical trap.
Without formal structure, you cannot see the space of possible systems clearly enough to navigate it. You do not know which failures reflect bad luck, which reflect poor design, and which reflect fundamental limits. You do not know whether a promising direction is one step away or impossible in principle. You do not know whether a problem calls for more scale, a different architecture, a new training regime, or an entirely different way of carving the phenomenon. In such a condition, effort becomes hard to interpret. Search becomes expensive improvisation.
That is the possibility AI researchers should take seriously.
The next advance may arrive as more than a better toolkit for interpreting today’s models. It may come as a way of seeing machine intelligence as a structured space: a domain with classifications, invariants, hierarchies, obstructions, recurring forms, and real limits.
This is not purely speculative. Researchers are already reaching for it: trying to formalize compositionality rather than merely invoke it, borrowing mathematical structures from other domains to see what they reveal about this one, asking what kinds of formal strength are needed to underwrite properties the field often treats as given. The work is scattered, early, and far less visible than the empirical programs that dominate AI. Still, it exists. And if the history of science is any guide, this is often how a formal transition begins: not with one decisive breakthrough, but with the growing sense that the inherited categories no longer hold, and that new ones are needed.
A further danger
What might follow from that recognition?
Perhaps interpretability, as currently practiced, will come to look like one local method inside a larger science. Perhaps the important shift will involve new primitives, new failure modes, new ways of comparing systems. Perhaps some of our most familiar words — feature, concept, representation, reasoning, understanding — will survive only as rough approximations, useful in some regimes and misleading in others.
There is a further danger. Those words may owe as much to our vantage point as to the phenomenon itself: to the systems we happen to have built, the behaviors we happen to notice, and the forms of explanation human beings find most natural. We may be mistaking the part of the space that is easiest for us to describe for the part that is most fundamental. A formal science would help us tell the difference.
That possibility is unsettling. It should be.
Because there may be problems — perhaps there already are — that cannot be solved by scaling, empirical tuning, architectural tinkering, or more diligent interpretability alone. Not because the field is lazy. Because those problems require reasoning over a space of possibilities that current methods do not make visible. They require seeing the room before searching it.
Why this matters
These systems are no longer quiet engineering artifacts. They are entering science, institutions, and ordinary human judgment. They produce useful outputs in settings where explanation matters. Under those conditions, a correct answer and an understood answer begin to diverge. So do prediction and comprehension. So do usefulness and accountability.
That is why the missing science matters.
The issue is larger than incompleteness in our present methods. We may still be missing the formal perspective in which machine intelligence becomes an object of understanding rather than only of manipulation.
The history and philosophy of science offers a consistent warning: better tools are rarely the whole story. At certain moments, progress depends on conceptual invention: a new classification, a new calculus, a new ontology, sometimes all three arriving in a tangle.
AI does need more rigor. Everyone can see that.
The larger task is harder to name and harder to build. It is the search for the formal structures through which machine intelligence becomes legible, and through which the space of possible systems becomes visible enough to navigate.
That is the transition the metaphor is really pointing toward, and a central research program at FABRIAL.