manifesto
An AI you can argue with.
The case for white-box affective intelligence — and why the alternative is a category mistake.
Every model that has been called empathetic in the last five years has been a black box. You give it a sentence; it gives you back a sentence that sounds like it understood. There is no audit trail. No mechanism. No way to disagree with the model on the basis of how it arrived at its output, because nobody — not even the people who trained it — can describe how it arrived at its output.
This is fine when the stakes are low. Autocomplete a function name. Translate a recipe. Suggest a chord. The cost of being wrong is friction.
When the stakes are feeling, the calculation changes. A system that listens to a person describe their day, infers something about their interior life, and answers from that inference is making a clinical move. If it gets it wrong, we should know why. If it gets it right, we should still know why — because rightness you can account for is the whole game.
The category problem.
Affective AI built on top of large neural language models is, at best, a category mistake. The model is trained on text. Text is the shadow of feeling, not the feeling itself. When the model produces a response that reads as understanding, it is performing textual fidelity to past responses that read as understanding. It is not feeling. It is mirroring the surface of feeling.
This is not a complaint about LLMs. They are remarkable tools. The point is narrower: using them as the substrate for affective intelligence concedes the very thing we want to build. You cannot bolt an explanation layer onto a system that, by construction, has no explanation layer. You can only bolt on a plausible-sounding generator of post-hoc explanations, which is worse than nothing.
What white-box buys you.
Serena is built differently. Every feeling Serena reports back is the output of an arithmetic trace over symbols you can inspect. Tokens, parts of speech, ARPAbet phonemes, pointwise mutual information lexicon weights, Kneser-Ney n-gram conditionals, a 12-dimensional emotion projection, a 7-dimensional non-semantic affective vector, a 6-dimensional DNA trait vector, and lifespan priors that scale every signal by a developmental window from birth to twenty-five years.
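To make "arithmetic trace over symbols" concrete, here is a minimal sketch in Python of the stages just listed. The stage names and dimensions come from the paragraph above; every field name, every type, and the linear lifespan ramp are assumptions made for illustration, not Serena's actual code.

```python
from dataclasses import dataclass

@dataclass
class SymbolTrace:
    tokens: list[str]              # surface tokens
    pos_tags: list[str]            # parts of speech, one per token
    phonemes: list[str]            # ARPAbet transcription
    pmi_weights: dict[str, float]  # pointwise mutual information lexicon weights
    kn_logprobs: list[float]       # Kneser-Ney n-gram conditionals, per token

@dataclass
class AffectState:
    emotion: list[float]      # 12-dimensional emotion projection
    nonsemantic: list[float]  # 7-dimensional non-semantic affective vector
    dna_traits: list[float]   # 6-dimensional DNA trait vector

def apply_lifespan_prior(state: AffectState, age_years: float) -> AffectState:
    """Scale every signal by a developmental window from birth to 25 years.
    The linear ramp below is a placeholder; the real curves are
    simplifications of their own, per this document."""
    w = min(max(age_years / 25.0, 0.0), 1.0)
    def scale(v: list[float]) -> list[float]:
        return [w * x for x in v]
    return AffectState(scale(state.emotion), scale(state.nonsemantic),
                       scale(state.dna_traits))
```

Under this placeholder ramp, a listener modeled at age seven would have every signal scaled by 0.28. The point is not the particular curve; it is that the curve is a number you can read and replace.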
Every one of those quantities lives in a graph with a stable trace ID. Every value can be challenged. If Serena says you sound val −0.32 · aro 0.58 (valence mildly negative, arousal elevated), you can ask which symbols made that so, and the system can show you, edge by edge, how it arrived. If it is wrong, you can correct the lexicon. You can adjust the priors. You can disagree with the model on terms the model itself can articulate.
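And a minimal sketch, under the same caveat, of what "a graph with a stable trace ID" could look like: every value is a node that remembers its parents and its edge weights, so explaining a value is just a walk. The node structure, the IDs, and the example weights below are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class TraceNode:
    trace_id: str  # stable in the real system; random in this sketch
    label: str     # e.g. "token:awful" or "valence"
    value: float
    parents: list["TraceNode"] = field(default_factory=list)
    weights: list[float] = field(default_factory=list)  # one edge weight per parent

def node(label: str, value: float, parents=(), weights=()) -> TraceNode:
    return TraceNode(uuid.uuid4().hex[:8], label, value,
                     list(parents), list(weights))

def explain(n: TraceNode, depth: int = 0) -> None:
    """Walk the graph edge by edge, showing how a value was arrived at."""
    print(f"{'  ' * depth}[{n.trace_id}] {n.label} = {n.value:+.2f}")
    for parent, w in zip(n.parents, n.weights):
        print(f"{'  ' * (depth + 1)}via edge weight {w:+.2f}:")
        explain(parent, depth + 2)

# A toy trace: two lexicon entries combining into a valence score.
lexicon = {"awful": -0.50, "fine": +0.10}  # PMI weights you are free to correct
t_awful = node("token:awful", lexicon["awful"])
t_fine = node("token:fine", lexicon["fine"])
valence = node("valence",
               0.8 * t_awful.value + 0.4 * t_fine.value,  # = -0.36
               parents=[t_awful, t_fine], weights=[0.8, 0.4])

explain(valence)          # prints the full derivation, edge by edge
lexicon["awful"] = -0.30  # disagree: correct the lexicon, then rebuild the trace
```

Re-run the same sentence after the correction and the trace shows the new arithmetic. That loop, inspect, correct, recompute, is what disagreeing with the model means here.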
What this is not.
This is not anti-neural. It is anti-opaque. There is interesting work to be done in mating symbolic affective cores to neural surface generators — using the symbolic system to govern the neural system, not the other way around. That work is a separate document.
This is also not a claim of perfection. The lexicon is imperfect. The lifespan curves are simplifications. The phonemic mappings throw away information. All of that is in the open. You cannot say the same of any frontier neural assistant.
We don't generate. We feel — then respond. The trace is the product. The trace is yours.
— Beyond the Box, GlassMind Division