architecture

Six layers, fully traceable.

Serena's stack is not a model. It is a pipeline of symbolic operations on tokens, phonemes, and lifespan priors. Every layer reports its inputs, outputs, and the operation that connected them.

tokens → POS → ARPAbet → PMI lex → n-gram → phonΔ → 12D EVec → 7D affect → trace

example trace · 9 nodes · 10 edges · trace_id 0x7af2

L0 / Surface NLP

Tokens, POS, phonemes

Input is tokenized, part-of-speech-tagged, and mapped to ARPAbet phonemes. No semantic embedding step. No vector store. Symbols stay symbols all the way down.

  • Tokenizer: NLTK-derived, lossless on whitespace
  • POS tagger: rule + lookup hybrid
  • Phonemic mapper: CMU dict + heuristic fallback
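The L0 pass can be sketched in a few lines. This is a toy stand-in, not the real stack: the tiny lookup tables below substitute for the NLTK-derived tokenizer, the rule + lookup tagger, and the CMU dictionary, and the fallback is a deliberately crude heuristic.

```python
import re

POS_LOOKUP = {"the": "DT", "cat": "NN", "sat": "VBD"}            # hypothetical entries
CMU_DICT = {"cat": ["K", "AE1", "T"], "sat": ["S", "AE1", "T"]}  # CMU-dict stand-in

def tokenize(text: str) -> list[str]:
    # Split on whitespace; original spans remain recoverable, so the
    # step is lossless on whitespace.
    return re.findall(r"\S+", text)

def tag(token: str) -> str:
    # Rule + lookup hybrid, collapsed here to a lookup with one default rule.
    return POS_LOOKUP.get(token.lower(), "NN")

def phonemes(token: str) -> list[str]:
    # Dictionary first, then a letter-level heuristic fallback.
    return CMU_DICT.get(token.lower(), [c.upper() for c in token if c.isalpha()])

def l0(text: str) -> list[dict]:
    # Symbols in, symbols out: no embedding, no vectors.
    return [{"tok": t, "pos": tag(t), "arpa": phonemes(t)} for t in tokenize(text)]
```

Every downstream layer consumes this symbolic record directly.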

L1 / Lexicon & n-grams

PMI weights · Kneser-Ney

A PMI lexicon assigns each token a tuple of emotion, valence, and intensity weights. A Kneser-Ney-smoothed n-gram model supplies conditional surprise. Combined, they form the emotional signature of the surface form.

  • Lexicon: 12-axis emotion weights (NRC-derived + curated)
  • n-gram: 4-gram with absolute discounting
  • Output: 12D emotion vector (EVec)
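A minimal sketch of the L1 signature, assuming one plausible combination rule: a Kneser-Ney bigram (the document uses a 4-gram) supplies surprise in bits, which scales the token's 12-axis lexicon weights. Counts, the discount constant, and the lexicon entry are toy data.

```python
import math
from collections import Counter

bigrams = Counter({("the", "cat"): 3, ("the", "dog"): 1, ("a", "cat"): 1})
unigrams = Counter()
for (u, w), c in bigrams.items():
    unigrams[u] += c

D = 0.75  # absolute-discounting constant

def p_kn(u: str, w: str) -> float:
    # Interpolated Kneser-Ney: discounted bigram mass plus a
    # continuation probability weighted by the leftover mass.
    cont = sum(1 for (_, b) in bigrams if b == w) / len(bigrams)
    follows = sum(1 for (a, _) in bigrams if a == u)
    lam = D * follows / unigrams[u]
    return max(bigrams[(u, w)] - D, 0) / unigrams[u] + lam * cont

LEX = {"cat": [0.1] * 12}  # hypothetical 12-axis emotion weights

def evec(u: str, w: str) -> list[float]:
    surprise = -math.log2(p_kn(u, w))          # conditional surprise in bits
    return [x * surprise for x in LEX.get(w, [0.0] * 12)]
```

Rarer continuations carry more surprise, so the same lexicon entry yields a stronger EVec in an unexpected context.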

L2 / Affective core (7D)

Non-semantic structural affect

Independent of the lexicon, a 7D affective vector is computed from non-semantic structural features — phoneme distributions, prosody proxies, syntactic patterns. Decoupling the 7D from the 12D guards against lexical bias dominating affect.

  • Axes: VAL, ARO, DOM, PRD, NOV, COM, SRF
  • Sources: phonemic energy, syntactic depth, repetition signals
  • Output: 7D affect vector
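The structural pass can be sketched as below. The feature definitions are illustrative assumptions (a vowel-ratio proxy for phonemic energy, token length for syntactic depth, type/token ratio for repetition); the point is that nothing here consults a lexicon.

```python
AXES = ("VAL", "ARO", "DOM", "PRD", "NOV", "COM", "SRF")

def affect7(tokens: list[str]) -> dict[str, float]:
    n = max(len(tokens), 1)
    vowels = sum(ch in "aeiou" for t in tokens for ch in t.lower())
    letters = sum(ch.isalpha() for t in tokens for ch in t)
    energy = vowels / max(letters, 1)          # phonemic-energy proxy
    depth = sum(len(t) for t in tokens) / n    # crude syntactic-depth proxy
    rep = 1 - len(set(tokens)) / n             # repetition signal
    return {
        "VAL": 0.0,                            # neutral without lexical cues
        "ARO": energy,
        "DOM": min(depth / 10, 1.0),
        "PRD": rep,
        "NOV": 1 - rep,
        "COM": min(depth / 12, 1.0),
        "SRF": energy * (1 - rep),
    }
```

Because the 7D vector never reads the lexicon, a lexically loaded but structurally flat input cannot drag structural affect along with it.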

L3 / DNA traits (6D)

Stable agent personality

A six-dimensional trait vector parameterises the agent itself, not the input. STAB, OPEN, AGREE, ASSERT, SENSE, LEARN. Traits scale priors, modulate response shape, and persist across sessions.

  • Initialised at agent creation
  • Updated by long-horizon homeostatic integration
  • Constant during a single inference call
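One way the trait vector could scale priors, as a sketch. The axis names follow the text; the specific coupling (SENSE amplifies, STAB damps) is an assumed example, not the documented rule.

```python
TRAITS = ("STAB", "OPEN", "AGREE", "ASSERT", "SENSE", "LEARN")

def scale_priors(priors: dict[str, float], traits: dict[str, float]) -> dict[str, float]:
    # Hypothetical coupling: sensitivity amplifies every affect prior,
    # stability damps the result back toward rest. Traits are read-only
    # here, matching "constant during a single inference call".
    gain = traits["SENSE"] * (1 - 0.5 * traits["STAB"])
    return {k: v * gain for k, v in priors.items()}
```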

L4 / Lifespan priors

0–25y developmental window

Affect is age-shaped. Newborn blending, adolescent volatility, adult homeostasis. Every signal in L1–L3 is multiplied by a lifespan-scaled prior so the same input is read differently at different ages.

  • Curves: piecewise smooth, fit to developmental literature
  • Newborn blending: bias toward arousal-led states
  • Plasticity: age-decayed update gain
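A sketch of the lifespan shaping, with assumed curve shapes: an arousal-led bias that holds through infancy and then relaxes smoothly, and an update gain that decays with age. The real system fits piecewise-smooth curves to developmental literature; these closed forms only illustrate the mechanism.

```python
import math

def arousal_bias(age: float) -> float:
    # Newborn blending: strong arousal-led bias, flat through infancy,
    # then a smooth decay (illustrative shape, not the fitted curve).
    if age < 2:
        return 1.0
    return math.exp(-(age - 2) / 8)

def plasticity(age: float) -> float:
    # Age-decayed update gain for prior/trait learning.
    return 1.0 / (1.0 + age / 5)

def lifespan_scale(signal: float, age: float) -> float:
    # The same L1-L3 signal reads differently at different ages.
    return signal * (0.5 + 0.5 * arousal_bias(age))
```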

L5 / Memory & response

LMDB · MPHF · template

Short-term memory is held in an LMDB-backed adaptive resonance structure. Long-term memory uses minimal perfect hashing (MPHF) for O(1) symbolic recall. Response generation is template-based and deterministic given the trace.

  • STM: LMDB + ART, recency-weighted
  • LTM: MPHF, ~10ns per recall
  • Response: typed templates, no token sampling
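The response step can be sketched as below. A plain dict stands in for the LMDB/MPHF stores, and the template text and `kind` key are hypothetical; what the sketch preserves is the property that generation is a deterministic fill of a typed template, with no token sampling.

```python
# Stand-in for MPHF-backed long-term recall (real system: O(1) lookup).
LTM = {"greeting": "Hello. Trace {trace_id} resolved in {nodes} nodes."}

def respond(trace: dict) -> str:
    template = LTM[trace["kind"]]      # symbolic recall by key
    return template.format(**trace)    # deterministic fill, no sampling

trace = {"kind": "greeting", "trace_id": "0x7af2", "nodes": 9}
```

The same trace always yields the same response, which is what makes the full pipeline traceable end to end.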