Why we’re launching a frontier mathematical foundations lab

Harny and I are launching Fabrial Labs, the only lab designed around the collision of abstract math and applied AI.

How fields get stuck

There's a useful analogy in the history of cartography. For centuries, mapmakers got better and better at drawing flat maps of the Earth. Mercator projections, conic projections, azimuthal projections. Each one solved real problems. Each one was a genuine improvement. And each one inherited the same constraint: you cannot flatten a sphere without distortion. No amount of skill or ingenuity changes that. It's a mathematical fact about the relationship between curved and flat geometry.

You could spend your career making better flat maps, and you'd produce real value. Ships navigated the world with Mercator projections. But certain questions, like "what is the true shortest path between two points on the globe," simply can't be answered correctly on a flat map. The distortion isn't a bug in the mapmaker's technique. It's a property of the projection itself. To answer those questions, you don't need a better flat map. You need to work on the globe.

This pattern recurs throughout the history of science and mathematics, and it always has the same structure.

Before Copernicus, astronomers built on a geocentric model: the Earth at the center, everything else revolving around it. The model wasn't failing. It predicted planetary positions reasonably well. But every time observations got more precise, the model needed more compensatory machinery: epicycles, circles on circles, corrections layered on corrections. Each one improved the predictions. The system got more complex and more accurate simultaneously, and that looked like progress.

But the complexity was a symptom of the wrong foundation, not a sign that the right one was getting closer. The breakthrough came from replacing the foundation itself: first heliocentrism, then Kepler's elliptical orbits. Entire classes of corrections became unnecessary, because the thing they were compensating for no longer existed.

How fields get unstuck

A subtler example, and the one closest to what we're doing: the relationship between algebra and calculus.

Algebra is a language of static relationships. It can describe states, solve for unknowns, express fixed equations. If you want to describe continuous change, like acceleration, you can approximate it in algebra: compute the position at time 1, time 2, time 3, make the intervals smaller, get closer and closer. The approximation works. You can get correct answers to arbitrary precision.

But "continuous change" is not a thing in algebra. It's not an object you can name, manipulate, compose with other objects, or reason about directly. It's a pattern you reconstruct, laboriously, from static snapshots. You can approximate its effects. You cannot speak about it.

What calculus did was not just make the computation easier. It introduced a new kind of mathematical object: the derivative. Continuous change became something you could write down directly, compose with other operations, inspect, and prove things about. The operation that previously required an infinite limiting process became native to the language.
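The contrast can be made concrete with a small sketch (an illustrative example, not part of the original argument): approximating the velocity of a falling object by shrinking static snapshots, versus naming the derivative directly once it exists as an object in the language.

```python
# Approximating continuous change with static snapshots: the "algebraic"
# route to the velocity of s(t) = t**2 near t = 3.
def position(t):
    return t * t

def average_velocity(t, dt):
    # A ratio of two static snapshots -- the best a language without
    # "continuous change" as an object can express.
    return (position(t + dt) - position(t)) / dt

# Shrink the interval: the approximation creeps toward a value
# it can never name.
approximations = [average_velocity(3.0, 10.0 ** -k) for k in range(1, 6)]

# Calculus names that value directly: d/dt t**2 = 2t.
def velocity(t):
    return 2 * t

print(approximations)   # values approaching 6.0
print(velocity(3.0))    # exactly 6.0
```

The approximations converge on 6.0 but never reach it in finitely many steps; `velocity` simply states it, because the derivative is a primitive of the formalism rather than a pattern reconstructed from snapshots.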

This is worth sitting with, because it reveals something deep about how higher-order abstractions actually work. They don't just add tools for doing existing computations more efficiently. They elevate something that was previously an emergent approximation into a native primitive.

And the difference between "emergent approximation" and "native primitive" is not one of convenience. It's a difference in kind.

When something is a primitive in your language, it comes with structural guarantees: you can compose it and know the composition is well-defined, inspect it and know what you're looking at, build on it and know the foundation is sound. When it's an emergent approximation, you get none of that. You can run the approximation and check the output. But you cannot guarantee its behavior in advance, because the thing you care about doesn't exist as an entity in your system. It's a shadow cast by entities that do.

The AI question

The AI parallel writes itself: attention mechanisms, massive parameter matrices, RLHF, chain-of-thought prompting, elaborate post-training pipelines. Each one is a genuine improvement. Each one adds compensatory machinery that makes the system more capable. And the system keeps getting more complex and more accurate simultaneously.

The question is whether that complexity is the path to the breakthrough, or whether it's epicycles.

We believe it's closer to epicycles. Here's why.

The feeling

Many people already feel intuitively that something is off. Tens of billions of dollars are being spent and qualitatively new capabilities are rare. Models get better at what they already do. The step-change doesn't arrive. That feeling is widespread.

Where people diverge is in the diagnosis. The mainstream view is that architectural innovation downstream of the current foundations can still break through: a better attention mechanism, a novel state space design, a new approach to program synthesis. That's a reasonable position. Some of those bets will pay off, and they're worth funding.

But there are three independent reasons to believe the wall is deeper than architecture.

Three reasons

Anecdotally: every architecture shares the same foundational mathematical choices, and every architecture shares the same qualitative gaps. Compositionality is brittle everywhere. Reasoning is opaque everywhere. Efficiency hits the same scaling curves everywhere. If the gaps were architectural, you'd expect different architectures to have different gaps. They don't.

Empirically: early models built on different foundational math show signatures that shouldn't exist under the current paradigm:

  • Inverse scaling, where the performance advantage grows as problems get harder

  • Factorial-to-polynomial compression

  • Learning that compounds as permanent reusable objects rather than dissolving into parameters

These results are peer-reviewed (NeurIPS '23 and '24).

Formally: you can prove, as a theorem, ceilings on what specific architecture classes can achieve regardless of scale or training. Not benchmarks. Proofs.

The shared root

The shared root is a structural commitment so deep that most researchers don't think of it as a choice: learned knowledge lives in parameters.

What a model acquires from data is deposited into continuous numerical weights in high-dimensional space. This is true across the board:

  • Transformers, SSMs, MoE, retrieval-augmented systems. The canonical architectures. Knowledge is weight matrices.

  • Neurosymbolic hybrids. The symbolic component is either hand-crafted or extracted from the neural side. The learned knowledge still lives in parameters, with a translation bridge bolted on top.

  • Program synthesis. The output might be a discrete program, but the learning mechanism that proposes programs stores its knowledge in parameters.

  • Geometric deep learning and GNNs. The closest to caring about structure. Group equivariance, gauge invariance, and symmetry preservation are real and meaningful constraints on how parameters interact. But the learned objects are still vectors in high-dimensional continuous space. Structure governs what the parameters are allowed to do. It doesn't change what they are.

  • "First principles" labs rethinking search, generalization, and optimization. If knowledge still lives in parameters and learning still means optimizing them, the foundational commitment is unchanged.

A lab can rethink everything downstream of that commitment. But if what the system knows is still a point in continuous parameter space, it inherits whatever that medium can and can't express.

What the current paradigm is good at

The current paradigm is not out of room. More than that: it is genuinely excellent at what it does.

Pattern completion at scale turns out to cover an enormous range of valuable applications. Drug screening, protein structure prediction, materials discovery, coding assistants, content generation, search. SWE-bench scores will keep climbing. Enterprises will keep adopting. These are real achievements, and the markets they serve are massive.

For the broad landscape of tasks that reduce to learning statistical regularities from large datasets, the current substrate works, will keep working, and will keep improving.

Where we're looking instead

The frontier we're interested in is somewhere else. Not the same capabilities done better, but a different class of capability entirely:

  • Systems that can represent a scientific theory as a mathematical object and compose it with another theory to see what follows.

  • Systems whose reasoning is a sequence of verified structural transformations you can inspect step by step, not a statistical prediction you can only evaluate after the fact.

  • Systems where what gets learned becomes a permanent, reusable, composable building block, the way a proven lemma becomes a permanent tool in mathematics, rather than a parameter configuration that might or might not transfer to the next problem.

  • Systems that can manipulate abstraction itself: patterns in relationships between theories, and patterns in those patterns, at arbitrary depth.

None of this is a better version of what current AI does. It's a different kind of thing.

What compositionality actually means

This frontier hinges on a specific property that's worth being precise about, because the word is used loosely.

In ML, "compositional" typically means a behavioral property: the model generalizes to novel combinations on a benchmark. We mean something structural and deeper.

The meaning of a composite is determined by the meanings of its parts and how they are connected, by construction, for all valid inputs. And meaning flows bidirectionally: the context in which a component appears shapes its effective meaning, and the component shapes its context, through the same formal structure. This isn't something learned from data. Attention mechanisms approximate it. The mathematical framework we build on guarantees it.

In the current substrate, none of this holds. Knowledge is dissolved into parameter fields via gradient descent. Reasoning is a statistical traversal through those fields. Composition is something you hope the model learns to approximate from data. Sometimes it does. It comes with no structural guarantees that it will, and no formal way to verify that it has.

The approximations keep getting better. But "composition" is not a thing in the language. It's a pattern you reconstruct from parameters and hope holds up. The qualitative breakthrough requires a formalism where composition is native: a primitive you can name, inspect, compose, and prove things about. Where structure isn't a constraint imposed on the medium, but the medium itself.
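What "composition as a native primitive" looks like can be illustrated with a toy denotational semantics (a familiar sketch, not the formalism Fabrial builds on): a tiny expression language whose meaning function is defined by structural recursion, so the meaning of every composite is fixed by the meanings of its parts and how they are connected, for all valid inputs, by construction.

```python
from dataclasses import dataclass

# A tiny expression language. Meaning is assigned by structural
# recursion: the denotation of a composite is determined by the
# denotations of its parts -- a guarantee, not a learned behavior.

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class Add:
    left: object
    right: object

@dataclass(frozen=True)
class Mul:
    left: object
    right: object

def meaning(expr, env):
    """Denotation of an expression relative to an environment."""
    if isinstance(expr, Var):
        return env[expr.name]
    if isinstance(expr, Add):
        return meaning(expr.left, env) + meaning(expr.right, env)
    if isinstance(expr, Mul):
        return meaning(expr.left, env) * meaning(expr.right, env)
    raise TypeError(f"unknown expression: {expr!r}")

# Composition is primitive: nesting expressions yields a new expression
# whose meaning is well-defined in advance, with nothing to verify after
# the fact.
e = Mul(Add(Var("x"), Var("y")), Var("x"))
print(meaning(e, {"x": 3, "y": 4}))  # (3 + 4) * 3 = 21
```

No training data was needed to make `Mul(Add(...), ...)` mean the right thing; the guarantee is carried by the structure of the language itself. That property, scaled up to theories rather than arithmetic, is the kind of guarantee the paragraph above is describing.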

The inevitability of abstract math

If the wall is mathematical, then the question of what to do about it has a surprisingly narrow answer.

You need math that operates on formalisms rather than within them. Math that can diagnose the structural limits of the current language and provide higher-order abstractions where the hard things become native. That isn't one option among many. It's the only kind of math that does this.

You arrive at abstract math not by preference, but by elimination.

That's why Fabrial exists: the first institution built natively on the collision between abstract mathematics and applied AI. Not as a skunk works or a side project within a lab organized around other priorities. As the thesis itself.
