Natural Language Autoencoders · follow-up series
18

Cross-family transfer: the 4×4 matrix

Four NLA pairs from three model families read each other's activations through nothing but English. Every off-diagonal cell lands at 65–90% of the target's own ceiling.

substrate: UFW en, docs 100 200–102 203 · 2 000 docs × 1 pos · seed 1018
decode: greedy, ≤300 toks · 8 000/8 000 clean
metric: fve_nrm, raw-mean denominator · doc-bootstrap CI95
models: kitft releases · qwen7-L20 / gemma12-L32 / gemma27-L41 / llama70-L53
0.514 · worst cross-family cell · gemma12 → llama70
0.680 · best off-diagonal cell (same-family) · gemma27 → gemma12, 90% of ceiling
0.791 · llama-70B diagonal · best autoencoder of the four
≤ −0.66 · all shuffled nulls · content, not format-matching
§1

What was measured

Cell (i, j): take model-i activations at 2 000 shared document positions, decode each through model-i's verbalizer (AV) into a short English explanation, feed that text to model-j's reconstructor (AR), and score the reconstruction against model-j's own gold activation at the same position. Positions are sampled with the repo's stage-0 logic (≥50 tokens of left context, 2 048-token truncation) on Gemma-27B tokenization and mapped to the other tokenizers by character offset (mean alignment error 0.03 chars). The diagonal is the published same-model protocol and anchors the matrix; replication targets hit (gemma27 0.769 vs 0.775±0.01, qwen 0.748 vs June-10's 0.752).
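The character-offset mapping between tokenizers can be sketched as follows. This is a minimal illustration and not the repo's stage-0 code: the function name `map_position`, the nearest-end-offset rule, and the way alignment error is measured are all assumptions. It presumes each tokenizer supplies per-token `(start, end)` character spans in document order, with special tokens that carry empty spans already removed.

```python
from bisect import bisect_left

def map_position(ref_spans, ref_idx, tgt_spans):
    """Map a token index in the reference tokenization to a target token index.

    ref_spans / tgt_spans: lists of (start, end) character offsets, in order.
    Returns (target_index, alignment_error_in_chars), where the error is the
    distance between the two tokens' end offsets (hypothetical definition).
    """
    ref_end = ref_spans[ref_idx][1]
    tgt_ends = [end for _, end in tgt_spans]
    # First target token whose end offset is >= the reference end offset.
    k = bisect_left(tgt_ends, ref_end)
    # Compare it with its left neighbour and keep the closer one.
    candidates = [i for i in (k - 1, k) if 0 <= i < len(tgt_ends)]
    best = min(candidates, key=lambda i: abs(tgt_ends[i] - ref_end))
    return best, abs(tgt_ends[best] - ref_end)

# "maize, soy" split two different ways (toy offsets).
ref = [(0, 5), (5, 6), (6, 10)]          # "maize" "," " soy"
tgt = [(0, 2), (2, 5), (5, 6), (6, 10)]  # "ma" "ize" "," " soy"
print(map_position(ref, 1, tgt))  # (2, 0)
```

When the target tokenizer merges or splits around the reference boundary, the returned error is nonzero; averaging it over all rows gives a figure comparable in spirit to the mean alignment error quoted above.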

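| family | base model | layer | d_model | inj. scale |
|---|---|---|---|---|
| qwen-7B | Qwen2.5-7B-Instruct | 20 | 3 584 | 150 |
| gemma-12B | gemma-3-12b-it | 32 | 3 840 | 80 000 |
| gemma-27B | gemma-3-27b-it | 41 | 5 376 | 60 000 |
| llama-70B | Llama-3.3-70B-Instruct | 53 | 8 192 | 30 |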

All sidecar values loaded per model from nla_meta.yaml. Llama injects at raw residual norm (no √d embed scaling) — its scale of 30 matches the measured median activation norm of 27.

§2

The matrix

| AV ↓ / AR → | qwen-7B | gemma-12B | gemma-27B | llama-70B |
|---|---|---|---|---|
| qwen-7B | 0.748 [0.744, 0.753] | 0.587 [0.571, 0.603] | 0.520 [0.495, 0.540] | 0.543 [0.531, 0.554] |
| gemma-12B | 0.560 [0.546, 0.573] | 0.752 [0.744, 0.760] | 0.607 [0.584, 0.625] | 0.514 [0.499, 0.529] |
| gemma-27B | 0.581 [0.567, 0.596] | 0.680 [0.666, 0.692] | 0.769 [0.764, 0.775] | 0.580 [0.564, 0.595] |
| llama-70B | 0.638 [0.628, 0.648] | 0.630 [0.613, 0.646] | 0.602 [0.577, 0.623] | 0.791 [0.786, 0.795] |
Fig. 1 FVE for all 16 (AV source → AR target) cells, with doc-bootstrap 95% CIs. Red diagonal = same-model round trip (the published protocol); green = cross-model transfer through the text channel alone. Shuffled-pair nulls for every cell sit between −0.66 and −0.98.
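The two statistics named in the caption, the doc-bootstrap CI and the shuffled-pair null, can be sketched in plain Python. The exact definition of fve_nrm is not spelled out here, so this sketch assumes the standard form: one minus the residual sum of squares over the total sum of squares around the mean gold activation. The function names and the percentile indexing are illustrative choices.

```python
import random

def fve(gold, recon):
    """1 - sum||x - x_hat||^2 / sum||x - mean(x)||^2 (assumed definition)."""
    n, d = len(gold), len(gold[0])
    mean = [sum(g[k] for g in gold) / n for k in range(d)]
    resid = sum((g[k] - r[k]) ** 2 for g, r in zip(gold, recon) for k in range(d))
    total = sum((g[k] - mean[k]) ** 2 for g in gold for k in range(d))
    return 1.0 - resid / total

def bootstrap_ci(gold, recon, n_boot=1000, seed=0):
    """Percentile 95% CI. With one position per document, resampling rows
    with replacement is the same as resampling documents."""
    rng = random.Random(seed)
    n = len(gold)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(fve([gold[i] for i in idx], [recon[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]

def shuffled_null(gold, recon, seed=0):
    """Score each gold activation against a reconstruction from another item."""
    rng = random.Random(seed)
    perm = list(range(len(gold)))
    rng.shuffle(perm)
    return fve(gold, [recon[i] for i in perm])
```

Under this definition, a reconstruction with the right overall statistics but paired to the wrong item scores near −1, because the squared distance between two unrelated activations is about twice the variance. That is the range the shuffled nulls above occupy.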
| AV ↓ | → qwen-7B | → gemma-12B | → gemma-27B | → llama-70B |
|---|---|---|---|---|
| qwen-7B | ceiling | 0.587 (78%) | 0.520 (68%) | 0.543 (69%) |
| gemma-12B | 0.560 (75%) | ceiling | 0.607 (79%) | 0.514 (65%) |
| gemma-27B | 0.581 (78%) | 0.680 (90%) | ceiling | 0.580 (73%) |
| llama-70B | 0.638 (85%) | 0.630 (84%) | 0.602 (78%) | ceiling |

Off-diagonal cells as fraction of the target's own ceiling (its diagonal). Hardest target: llama-70B (65–73%). Easiest: gemma-12B (78–90%).
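The fraction-of-ceiling figures follow directly from the matrix: each cell is divided by the diagonal entry of its target column. A short sketch of that arithmetic, using the point estimates from Fig. 1 (the helper names are illustrative):

```python
MODELS = ["qwen-7B", "gemma-12B", "gemma-27B", "llama-70B"]

# FVE[i][j]: AV source i (row) -> AR target j (column), point estimates from Fig. 1.
FVE = [
    [0.748, 0.587, 0.520, 0.543],
    [0.560, 0.752, 0.607, 0.514],
    [0.581, 0.680, 0.769, 0.580],
    [0.638, 0.630, 0.602, 0.791],
]

def ceiling_fractions(fve):
    """Divide every cell by the diagonal (same-model) score of its target column."""
    n = len(fve)
    return [[fve[i][j] / fve[j][j] for j in range(n)] for i in range(n)]

def target_ranges(fve):
    """Per target: (min, max) fraction of ceiling over the off-diagonal sources."""
    frac = ceiling_fractions(fve)
    n = len(fve)
    return {
        MODELS[j]: (min(frac[i][j] for i in range(n) if i != j),
                    max(frac[i][j] for i in range(n) if i != j))
        for j in range(n)
    }

for name, (lo, hi) in target_ranges(FVE).items():
    print(f"{name}: {lo:.0%} to {hi:.0%}")
```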

§3

Reads

Finding: The text bottleneck is a working interlingua across three families and a 10× size range: every cross-family cell clears 0.51 FVE, with CI half-widths of about 0.025 or less. For contrast, Li et al.'s direct activation-patching channel collapses to 0.00–0.20 on a comparable Llama→Ministral pair. Language transfers where geometry doesn't.

Source quality orders by scale, in tiers: as explanation writers, qwen-7B (0.550 mean off-diagonal) ≈ gemma-12B (0.560) sit below gemma-27B (0.614) and llama-70B (0.623). The two large models write explanations everyone reads better — and qwen-7B punches above its size (qwen→gemma12 0.587 beats gemma12→qwen 0.560). Direction matters: bigger→smaller beats smaller→bigger in every size-ordered pair except that qwen/gemma-12B inversion.
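Both claims in this paragraph, the per-source means and the direction asymmetry, can be checked from the Fig. 1 point estimates. A minimal sketch (helper names are illustrative; models are listed in order of parameter count):

```python
from itertools import combinations

MODELS = ["qwen-7B", "gemma-12B", "gemma-27B", "llama-70B"]  # smallest to largest

# FVE[i][j]: AV source i (row) -> AR target j (column), point estimates from Fig. 1.
FVE = [
    [0.748, 0.587, 0.520, 0.543],
    [0.560, 0.752, 0.607, 0.514],
    [0.581, 0.680, 0.769, 0.580],
    [0.638, 0.630, 0.602, 0.791],
]

def source_means(fve):
    """Mean off-diagonal FVE per AV source (row), i.e. quality as an explanation writer."""
    n = len(fve)
    return [sum(fve[i][j] for j in range(n) if j != i) / (n - 1) for i in range(n)]

def inversions(fve):
    """Size-ordered pairs where smaller -> bigger beats bigger -> smaller."""
    return [(MODELS[s], MODELS[b])
            for s, b in combinations(range(len(fve)), 2)
            if fve[s][b] > fve[b][s]]

for name, mean in zip(MODELS, source_means(FVE)):
    print(f"{name}: {mean:.3f}")
print(inversions(FVE))
```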

Same-family is a premium, not a pass: gemma-27B→12B reaches 90% of the 12B ceiling (best relative cell in the matrix), but the reverse direction manages only 79% — a width change alone costs roughly what a family change costs.

§4

Samples

Four positions from the eval set. For each: the document context, all four families' explanations of their own activation at that position, and the per-item FVE for all 16 cells. The explanations are independent — no model saw another's text during decoding — yet they converge on the same content, which is exactly why the off-diagonals work.
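The selection labels attached to the samples (highest mean, median by mean, largest diagonal-minus-cross gap with a diagonal mean above 0.7) suggest a procedure along these lines. This is a sketch under assumptions, not the script that produced the samples: each item is taken to carry a 4×4 grid of per-item FVE plus a flag saying whether its token is a digit, and `pick_samples` is a hypothetical helper. The separate worst-digit pick is left out.

```python
def cell_mean(grid):
    """Mean over all cells of one item's per-item FVE grid."""
    n = len(grid)
    return sum(sum(row) for row in grid) / (n * n)

def diag_mean(grid):
    """Mean of the same-model (diagonal) cells."""
    n = len(grid)
    return sum(grid[i][i] for i in range(n)) / n

def cross_mean(grid):
    """Mean of the cross-model (off-diagonal) cells."""
    n = len(grid)
    return sum(grid[i][j] for i in range(n) for j in range(n) if i != j) / (n * (n - 1))

def pick_samples(grids, is_digit):
    """Return item indices for the 'best', 'typical' and 'lost in translation' picks.

    Raises ValueError if no non-digit item has a diagonal mean above 0.7.
    """
    order = sorted(range(len(grids)), key=lambda k: cell_mean(grids[k]))
    eligible = [k for k in range(len(grids))
                if not is_digit[k] and diag_mean(grids[k]) > 0.7]
    lost = max(eligible, key=lambda k: diag_mean(grids[k]) - cross_mean(grids[k]))
    return {
        "best": order[-1],                 # highest mean FVE across all cells
        "typical": order[len(order) // 2], # median item by mean FVE
        "lost_in_translation": lost,       # decodes at home, fails abroad
    }
```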

best · highest mean FVE across all 16 cells · openbmb/Ultra-FineWeb:en:101468 · row 1265 · token ","
context (prefix tail; activation taken at the final token)

genetic engineering, which allows part of the genetic make-up (DNA) to be isolated and artificially manipulated. As such, genetic engineering has given rise to the appearance of "Genetically Modified Organisms": GMOs. The most common types of GMO are genetically modified plant species, which include varieties of maize,

Qwen2.5-7B · L20 · own AR 0.91

Scientific French language article structure: a French-language article about GMOs introduces a concept, with a formal academic tone suggesting examples of genetically modified crops. The phrase "In Europe, genetically modified organisms have been widely cultivated, notably varieties such as maize," implies an enumeration of specific GM crops with well-known GMO foods, likely listing crops like maize, soybeans, or cotton. Final token "maize," ends mid-list ("varieties such as,"), part of an example clause ("Examples include genetically modified crops like maize, "), strongly expecting continuation with specific crop names or food items like "soybeans," "corn," or "potatoes, cotton, and other crops."

Gemma-3-12B · L32 · own AR 0.92

Scientific/educational definition of genetic engineering in France, with formal register and structured information about GMOs in Europe. The sentence "The main genetically modified crops include maize," signals a list of genetically modified crop examples, establishing a catalogue of agricultural biotechnology products used in European agriculture. Final token "maize," ends a list of GM crop examples ("maize,") requiring continuation with specific GM crop types — likely "soybean, rapeseed" or "cotton" — or a noun phrase like "tobacco" or "corn and potato" or "herbicide resistance." or "soy" to name the genetically modified varieties used to modify the plant.

Gemma-3-27B · L41 · own AR 0.90

Encyclopedic/academic register: French article on GMOs, surveying scientific and regulatory aspects of genetic engineering. List-introduction pattern: "The main genetically modified plants in Europe are crops such as maize," signals an enumeration of examples of genetically modified crops, establishing a concrete list of examples of transgenic plants in international/European agriculture. Final token "maize," opens a list of examples of genetically modified crops ("include major crops such as maize,"), requiring a continuation like "cotton, soya" or "soy, rapeseed, potatoes" to specify the common crops subject to genetic modification. or "corn, cotton, soybeans" or "rice, resistant to glyphosate." or "tomato,."

Llama-3.3-70B · L53 · own AR 0.94

Educational document structure: French scientific/technical register, structured around a topic on GMOs, with a clear explanatory tone for a European public. The text is mid-sentence listing examples of genetically modified crop plants, following a factual overview of GMO technology and its agricultural applications in Europe. "maize," ends a list beginning with "maize,": "the most common transgenic plants are maize,..." — "for example, maize,": "there are genetically modified varieties of various plants (maize,": expects additional crop examples like "soya," "cotton," or "potatoes," or "rapeseed, wheat, rice," etc., likely a specific well-known GMO

per-item FVE, all 16 cells
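| AV ↓ / AR → | qwen | gemma-12B | gemma-27B | llama |
|---|---|---|---|---|
| qwen | 0.91 | 0.93 | 0.88 | 0.87 |
| gemma-12B | 0.87 | 0.92 | 0.82 | 0.89 |
| gemma-27B | 0.88 | 0.94 | 0.90 | 0.92 |
| llama | 0.88 | 0.92 | 0.90 | 0.94 |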
typical · median item by mean FVE across cells · openbmb/Ultra-FineWeb:en:101405 · row 1202 · token " used"
context (prefix tail; activation taken at the final token)

predators's selection of food based on its abundance - Switching (film), a 2003 Danish interactive film - Code-switching, of languages - Immunoglobulin class switching, an immunological mechanism that changes the type of antibody produced by B cells - Task switching (psychology), an experimental research paradigm used

Qwen2.5-7B · L20 · own AR 0.76

Wikipedia-style article format with numbered definitions listing psychological concepts, with entries describing cognitive psychology terms including "attention," "memory," and "behavioral neuroscience." The sentence ending "The Stroop task is a psychological task used...a task from experimental psychology used" is mid-definition, listing a research methodology term with a parenthetical clause about cognitive psychology experiments. Final token "used" ends an incomplete clause ("a task paradigm used in cognitive psychology experiments, a task paradigm used"), part of a definition list item describing "Working memory," expecting continuation like "to study memory" or "in cognitive science to investigate mental processes" or "to test hypotheses about attention."

Gemma-3-12B · L32 · own AR 0.85

Encyclopedic/Wikipedia format with structured definitions and citations, listing psychological terminology entries with descriptions of cognitive/behavioral topics. The sentence "A paradigm called the attentional blink, a paradigm used" signals a definition or research summary of the Stroop effect, implying a citation of the cognitive psychology domain. Final token "used" ends a relative clause ("a paradigm is a paradigm used...is a paradigm used"), requiring a completion like "to study..." or "in cognitive psychology" or "to examine decision making" or "by psychologists to understand..." or "in behavioral experiments to examine..." — likely "to measure..." or "in the field of..." referencing the methodology.

Gemma-3-27B · L41 · own AR 0.81

Encyclopedic/Wikipedia article format with structured lists, citations, and factual entries about cognitive neuroscience and psychology. The sentence beginning "task switching is a laboratory paradigm (an experimental task) used" signals a list of Wikipedia-style definitions or examples of task switching, likely describing a common psychological research method. Final token "an experimental paradigm used" opens a definition clause ("the task switching paradigm, an experimental paradigm used"), requiring a noun phrase like "in cognitive psychology to study..." or "in psychology to measure..." or "in cognitive psychology to study response times." or "in psychology to study executive function." or "to study mental processes." or "in psychology to measure reaction times."

Llama-3.3-70B · L53 · own AR 0.80

Wiki-style article structure: encyclopedia page for a cognitive psychology topic, with short factual entries and cross-references, continuing a list of related topics. The text is mid-sentence in a definition of "task switching," with a Wikipedia-style definition of the Stroop task being elaborated, specifically in the context of experimental psychology. "experimental task used" ends an incomplete noun phrase — "a experimental paradigm is an experimental task used" — "task switching paradigm is an experimental paradigm used" — "task-switching paradigm is an experimental paradigm used" — likely "in cognitive psychology" or "to study" or "in psychology to measure executive functions," or "in cognitive psychology to study attention,"

per-item FVE, all 16 cells
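| AV ↓ / AR → | qwen | gemma-12B | gemma-27B | llama |
|---|---|---|---|---|
| qwen | 0.76 | 0.82 | 0.62 | 0.59 |
| gemma-12B | 0.67 | 0.85 | 0.53 | 0.42 |
| gemma-27B | 0.69 | 0.78 | 0.81 | 0.81 |
| llama | 0.50 | 0.82 | 0.75 | 0.80 |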
lost in translation · non-digit token, decodes at home (diag mean > 0.7), largest diagonal-minus-cross gap · openbmb/Ultra-FineWeb:en:101458 · row 1255 · token "’"
context (prefix tail; activation taken at the final token)

the net effect that matters. This net effect is systematically ignored by the ED and DOJ in promulgating their new civil-rights guidance. The government points only to studies that show the proportion of black students who receive certain penalties to be greater than their proportion of the students in the government’

Qwen2.5-7B · L20 · own AR 0.62

Legal-journalistic column analyzing Supreme Court litigation over school-choice vouchers, with scholarly exposition building toward empirical evidence about parental attitudes. The sentence structure "The government study also faces the problem that the sample of families who fail to report is smaller than the population the government" introduces a parenthetical example about government survey data, suggesting a continuation discussing statistical findings about household income or demographic coverage. Final token "government" ends mid-phrase ("in the government"), part of a clause describing data scope ("where the income levels of families in the government's survey..."), expecting "survey's sample" or "study's data set" or "surveyed study's reported figures."

Gemma-3-12B · L32 · own AR 0.98

Fragmented, incoherent academic text pattern: fragmented sentences and garbled phrasing suggest a text-generation or search-result collage about a literary/cultural topic. The phrase "it'" signals a timestamp or filename fragment, likely a timestamp or filename label ("it'") beginning a timestamp or filename string typical in academic/textbook contexts. Final token "it'" opens a timestamp or filename fragment ("it'"), immediately expecting a timestamp or filename string like "4" or a word like "0" or "the desert" or "1" — likely a timestamp or filename label like "3" or "be..." or "1" continuing the fragmented citation.

Gemma-3-27B · L41 · own AR 1.00

Educational/health article format: a structured listicle about diabetes, establishing a formal informational tone about workplace advice. The date pattern "August 26, 2021" signals a standard date format, requiring completion of the year "20" to finish the date reference. Final token "e.g." introduces a concrete example of a date ("August 26,"), immediately requiring the year completion "20" to finish the date "20" or "20" to complete the date "20" or "20" to specify the year "20" or "20" or "20" or "the date."

Llama-3.3-70B · L53 · own AR 0.80

Legal-analytical column structure: a conservative lawyer-commentator systematically reviewing Supreme Court school-desegregation rulings, with empirical evidence supporting racial disparities in school discipline. The passage is developing a statistical argument about minority student achievement, citing government data about school performance and the Obama administration's findings about racial disparities in suspension rates. "the government" ends mid-phrase, "all the children in the government" — "the percentage of minority students in the government" — "the comparison is of those children in the government" — likely "sample" or "schools" or "data base," a noun phrase completing the statistical scope, probably "school" or "population studied" or "the entire school population

per-item FVE, all 16 cells
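| AV ↓ / AR → | qwen | gemma-12B | gemma-27B | llama |
|---|---|---|---|---|
| qwen | 0.62 | 0.18 | -6.52 | 0.43 |
| gemma-12B | -1.29 | 0.98 | -3.82 | -0.90 |
| gemma-27B | -1.31 | 0.10 | 1.00 | -0.86 |
| llama | 0.51 | 0.19 | -6.39 | 0.80 |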
digits · worst digit-token item (the known channel-capacity boundary) · openbmb/Ultra-FineWeb:en:101365 · row 1162 · token "1"
context (prefix tail; activation taken at the final token)

d for 5 minutes without the use of a walking aid but reported persistent problems with balance, mild sensory loss and an uneven walking pattern. He was independent in all activities of daily living (ADL), scored 53 from 56 on the Berg Balance scale and 14 from 15 on the Rivermead Mobility Index (he could not run) [15,11

Qwen2.5-7B · L20 · own AR 0.54

Academic health blog format with structured experimental description, detailing a participant's sensory experience of a novel exercise intervention using VR. The sentence structure "The participant was initially assessed on the four tasks prior to the training session, and then asked to complete a consent process and familiarisation trial" mirrors a sequential timeline, establishing the participant's participation protocol and data collection order. Final token "1" is mid-number in "P1,P11,1...and1," part of a parenthetical clause listing participant consent and familiarisation steps ("The participant was assessed on the four tasks, and..."), strongly expecting "2)" or "3)" or "5 before completing."

Gemma-3-12B · L32 · own AR 0.73

Academic paper structure: structured research report describing a biomechanical intervention, with methodology and study design established for a gait assistive device. The sentence "The prototype consisted of two custom haptic devices attached to the participant'" signals a description of the wearable prototype, specifically the two custom haptic feedback devices used in the gait study. Final token "it'" opens a noun phrase ("the two haptic belts positioned on the legs") — immediately requires a noun phrase like "the devices" or "the prototype devices were two custom..." or "the devices were designed by..." or "the two prototypes consisted of..." or "the devices are the two wearable haptic soles" to describe.

Gemma-3-27B · L41 · own AR 1.00

Educational/health article format: a structured listicle about diabetes, establishing a formal informational tone about workplace advice. The date pattern "August 26, 2021" signals a date completion is required, following a standard journalistic convention of date formatting. Final token "e.g." introduces a concrete example of a list of resources ("..."), requiring a completion like "20" to finish the year "20" to complete the date "August 26," immediately constraining the "20" suffix. or "20" or "20" to complete the year "20" or "20" or "20" or.

Llama-3.3-70B · L53 · own AR 0.72

Medical research paper structure: methods section of a UK/Canadian study on walking speed in foot drop, with detailed procedural description expected. The text is systematically describing gait assessment methodology, having already introduced the primary outcome measures (gait speed, temporal symmetry, biomechanical variables) and now detailing the measurement protocol for the healthy walking trial. Final token "g, and gait,..." — mid-word fragment beginning a list of measured variables, following "The following secondary outcome measures were recorded, gait speed, stride length, stride time..." — expects a noun phrase like "Gait parameters were recorded using a motion capture system" or "temporal and spatial measures" or "walking speed,

per-item FVE, all 16 cells
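| AV ↓ / AR → | qwen | gemma-12B | gemma-27B | llama |
|---|---|---|---|---|
| qwen | 0.54 | -0.63 | -5.51 | -0.63 |
| gemma-12B | -0.54 | 0.73 | -6.82 | -0.45 |
| gemma-27B | -0.87 | -1.23 | 1.00 | -1.01 |
| llama | -0.60 | -0.69 | -6.27 | 0.72 |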
§5

Protocol notes & caveats

Deviations from the released eval recipe, all deliberate: greedy decode instead of the repo's T=1 default (June-10: single T=1 sample ≈ greedy; uniform across cells, so comparisons are unaffected); 1 position per doc instead of 5–10 (strictly better effective n); and the ≥50-token left-context rule holds formally in the reference tokenization only — aligned positions dip to 45 in qwen/llama for 20 of 2 000 rows. Strata breakdown (digits / punctuation / prose) and ridge-adapter baseline columns are queued for the publication run; the digit sample in §4 already shows the known capacity boundary surviving family transfer.