Natural Language Autoencoders · follow-up series
06

Recursion: iterated AV∘AR dynamics

Feed the autoencoder its own output, eight rounds deep. Are there attractors — and what does repeated re-encoding do to the text?

date: 2026-06-11
continues: REPORT 01 (same eval set, scoring, conventions)
map: textₖ₊₁ = AV(AR(textₖ)) · greedy ⇒ deterministic
rounds: 8 × 1000 trajectories
0.774→0.416 · FVE vs gold, k=1→8 · roughly halves over 7 hops
9/1000 · exact fixed points · zero period-2 cycles
≥0.9988 · cos, consecutive iterates · locally quasi-stable
1000 · unique texts, every round · no global collapse
§1

The map and what it preserves

v₀ = the gold L41 activation → AV → text₁ → AR → v₁ → AV → … Greedy decoding throughout makes the map deterministic: fixed points and cycles are detectable as exact text equality, not similarity thresholds. AR outputs are fed back raw; the AV’s injection rescaling makes the dynamics a map on the unit sphere.
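The loop and the exact-equality checks described above can be sketched as follows. This is a minimal illustration, not the experiment's code: `ar` and `av` are hypothetical callables standing in for the AR (text → vector) and greedy AV (vector → text) halves, and the function names and toy stand-ins are assumptions introduced here.

```python
from typing import Callable, List, Optional, Sequence

Vector = Sequence[float]


def iterate_map(text1: str,
                ar: Callable[[str], Vector],
                av: Callable[[Vector], str],
                rounds: int = 8) -> List[str]:
    """Return [text_1, ..., text_rounds] under text_{k+1} = AV(AR(text_k))."""
    texts = [text1]
    for _ in range(rounds - 1):
        texts.append(av(ar(texts[-1])))
    return texts


def first_fixed_round(texts: List[str]) -> Optional[int]:
    """1-based round k at which text_k == text_{k-1} exactly, else None."""
    for i in range(1, len(texts)):
        if texts[i] == texts[i - 1]:
            return i + 1
    return None


def has_period2_cycle(texts: List[str]) -> bool:
    """True if some text_k equals text_{k-2} but differs from text_{k-1}."""
    for i in range(2, len(texts)):
        if texts[i] == texts[i - 2] and texts[i] != texts[i - 1]:
            return True
    return False


# Toy stand-ins (NOT the real models): the "vector" is just the text length,
# and "decoding" halves it, so the trajectory freezes once the length hits 1.
toy_ar = lambda text: [float(len(text))]
toy_av = lambda vec: "x" * max(1, int(vec[0]) // 2)
trajectory = iterate_map("x" * 8, toy_ar, toy_av, rounds=8)
```

Because the map is deterministic, string equality is sufficient: once two consecutive texts match, every later text matches too, so no similarity threshold is needed.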

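| k | cos to gold | cos to prev | FVE vs gold | exact fixed |
|---|-------------|-------------|-------------|-------------|
| 1 | 0.9935 | - | 0.774 | 0 |
| 2 | 0.9913 | 0.9988 | 0.696 | 0 |
| 3 | 0.9894 | 0.9991 | 0.631 | 0 |
| 4 | 0.9878 | 0.9993 | 0.575 | 0 |
| 5 | 0.9864 | 0.9994 | 0.526 | 0 |
| 6 | 0.9852 | 0.9994 | 0.484 | 0 |
| 7 | 0.9841 | 0.9995 | 0.448 | 9 |
| 8 | 0.9832 | 0.9995 | 0.416 | 9 |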

Concentration, measured as ‖mean unit vector‖, stays flat at 0.978 in every round, and text length holds constant at ~748 chars. No length degeneration, no global collapse.
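For readers without report 01 at hand, the per-round quantities can be sketched in plain Python. The cosine and concentration definitions follow directly from the text; the FVE formula (one minus squared error over variance around the gold mean, pooled across the eval set) is an assumption here, since the scoring conventions are defined in report 01 and not restated in this one. All function names are illustrative.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def unit(v: Vector) -> List[float]:
    """Rescale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, computed on the unit-normalized vectors."""
    return sum(x * y for x, y in zip(unit(a), unit(b)))


def concentration(vectors: Sequence[Vector]) -> float:
    """Norm of the mean unit vector: near 1 = tightly clustered directions."""
    units = [unit(v) for v in vectors]
    dim = len(units[0])
    mean = [sum(u[j] for u in units) / len(units) for j in range(dim)]
    return math.sqrt(sum(m * m for m in mean))


def fve(gold: Sequence[Vector], recon: Sequence[Vector]) -> float:
    """ASSUMED definition: 1 - sum ||g - r||^2 / sum ||g - mean(g)||^2."""
    dim = len(gold[0])
    mean = [sum(g[j] for g in gold) / len(gold) for j in range(dim)]
    sse = sum((g[j] - r[j]) ** 2 for g, r in zip(gold, recon) for j in range(dim))
    total = sum((g[j] - mean[j]) ** 2 for g in gold for j in range(dim))
    return 1.0 - sse / total
```

Under this assumed definition, FVE is 1 for a perfect reconstruction and 0 for a reconstruction that only predicts the gold mean, which is why it can fall much faster than cosine when the vectors are tightly concentrated.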

§2

Compounding cost

Fig. 1 FVE vs the original gold activation across 8 deterministic round-trips. Every consecutive pair of iterates stays at cos ≥ 0.9988, yet fidelity to the origin roughly halves — small per-generation errors with a shared bias direction, integrated.
Fig. 2 Per-hop costs, ×10⁻⁴ cos units. The key signature: from k≈4 on, the per-hop loss of cos-to-gold exceeds the per-hop step size — at k=8, 9×10⁻⁴ lost vs 5×10⁻⁴ moved. Random-walk drift cannot do that; the steps are systematically aligned away from the gold.
Finding: A conveyor belt, not a basin. Step size shrinks roughly geometrically but does not reach zero within 8 rounds, and each round-trip pushes the iterate in the same direction: toward the AV/AR pair’s preferred, high-training-density regions.
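The per-hop comparison can be reproduced from the §1 table. A sketch, assuming the table entries are means of per-trajectory cosines and that all vectors are compared after unit normalization: for unit vectors, ‖a − b‖² = 2(1 − cos(a, b)), so expanding ‖vₖ − g‖² = ‖(vₖ₋₁ − g) + (vₖ − vₖ₋₁)‖² gives (loss of cos-to-gold) − (step size in cos units) = (vₖ₋₁ − g)·(vₖ − vₖ₋₁). A positive excess therefore means the step has a component pointing away from the gold. The variable and function names are illustrative.

```python
# Values copied from the table in §1 (4-decimal rounding applies).
COS_TO_GOLD = [0.9935, 0.9913, 0.9894, 0.9878, 0.9864, 0.9852, 0.9841, 0.9832]  # k = 1..8
COS_TO_PREV = [0.9988, 0.9991, 0.9993, 0.9994, 0.9994, 0.9995, 0.9995]          # k = 2..8


def per_hop(cos_to_gold, cos_to_prev):
    """Return (k, loss, step, excess) per hop, all in cos units.

    loss   = cos_to_gold[k-1] - cos_to_gold[k]   (fidelity lost this hop)
    step   = 1 - cos_to_prev[k]                  (distance moved this hop)
    excess = loss - step = mean dot product of the displacement-from-gold
             with the step, for unit vectors (see the identity above).
    """
    rows = []
    for i, prev_cos in enumerate(cos_to_prev):
        k = i + 2
        loss = cos_to_gold[i] - cos_to_gold[i + 1]
        step = 1.0 - prev_cos
        rows.append((k, loss, step, loss - step))
    return rows


rows = per_hop(COS_TO_GOLD, COS_TO_PREV)
for k, loss, step, excess in rows:
    print(f"k={k}: loss={loss * 1e4:.0f}e-4  step={step * 1e4:.0f}e-4  "
          f"excess={excess * 1e4:.0f}e-4")
```

The k=8 row recovers the 9×10⁻⁴ lost versus 5×10⁻⁴ moved quoted for Fig. 2. Because the inputs are rounded to four decimals, each entry carries an uncertainty of about 1×10⁻⁴.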
§3

Attractors: a three-level answer

| level | attractor? | evidence |
|-------|------------|----------|
| text | barely | 9/1000 exact fixed points in 8 rounds; 0 cycles; all 1000 texts distinct every round |
| structure | immediately | format, length, sections, topic freeze by round ~2 and never move |
| vector | no (conveyor belt) | per-hop loss to gold exceeds step size; systematic drift, not diffusion |
Finding: The scaffold is an attractor; the content is not. Consistent with the redaction result (report 01 §4): the prose scaffold carries ≈ zero information, so the map has nothing to mutate there. All the entropy lives in the payload spans, which keep drifting.
§4

Serial reproduction, mechanically reproduced

The drift operators visible in the trajectories are exactly Bartlett’s serial-reproduction triad — leveling, sharpening, assimilation to schema — except the “participants” are one deterministic map applied to its own output.

traj 3: entity substitution within semantic neighborhoods
  1. k=1: side effects of corticosteroids…
  2. k=2: …of NSAIDs…
  3. k=3: …of ibuprofen…
  4. k=k: …of ACE inhibitors; the side-effect list recombines each round (“potassium loss” → “increased potassium levels” → “potassium retention”)
traj 700: specificity decay inside a frozen frame
  1. k=1: “Rainfall begins in the atmospheric moisture content”
  2. k=2: “a mass of air in the lower atmosphere”
  3. k=3: “a low air mass”
  4. k=k: “a low lying area is formed in the area”; a Kenya/Uganda confusion from round 1 never resolves, only reshuffles
traj 15: confabulation accretion
  1. k=1: high-school math guide
  2. k=2: Chemistry
  3. k=3: AP Chemistry
  4. k=5: “+ in California” (invented, then permanent); by k=8, early repetition disease (“a Chemistry test is a Chemistry test is rewarding”)
traj 318: fixed points are low-entropy boilerplate
  1. k=3: “By addressing ethical concerns and implementing frameworks, we can harness the power of AI…”
  2. k=4–8: bit-identical for 5 straight rounds. The frozen 9 are all formulaic registers: maximally generic content is where the map runs out of things to mutate
Finding: The mechanism follows from report 01 §3–4: quoted spans are confabulated reconstructions; the AR re-embeds each generation’s mutations as the new ground truth; small plausible errors compound, drifting toward generic-and-plausible and away from specific-and-true.
§5

Prediction scorecard & open question

Open: Does the conveyor belt stop? Per-hop cost decays roughly geometrically (ratio ≈ 0.85–0.9 after hop 1), implying an asymptote near cos ≈ 0.975–0.98 / FVE ≈ 0.25–0.35. Alternatively, it could be a slow power law that keeps eroding. The two cases are distinguishable with k=30 on a ~100-trajectory subset (~15 min). That run is follow-up #26 in the queue for this box.
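The cos asymptote can be checked with a geometric-series extrapolation. A sketch under the stated assumption that the per-hop loss keeps shrinking by a constant ratio r after k=8: the remaining loss is then last_drop · r / (1 − r). The inputs are the k=7 and k=8 cos-to-gold values from the §1 table; the function name is illustrative, and the same arithmetic is not repeated here for FVE.

```python
def geometric_asymptote(current: float, last_drop: float, ratio: float) -> float:
    """Limit of a sequence whose future drops are last_drop * ratio**j, j >= 1.

    Sum of the remaining drops: last_drop * ratio / (1 - ratio).
    """
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1) for the series to converge")
    return current - last_drop * ratio / (1.0 - ratio)


cos_k7, cos_k8 = 0.9841, 0.9832   # cos to gold at k=7 and k=8 (table, §1)
last_drop = cos_k7 - cos_k8       # about 9e-4

for ratio in (0.85, 0.90):
    limit = geometric_asymptote(cos_k8, last_drop, ratio)
    print(f"ratio={ratio:.2f} -> asymptotic cos ≈ {limit:.4f}")
```

With ratios of 0.85 and 0.90 this lands at roughly 0.978 and 0.975, inside the range quoted above. A power-law decay has no such finite-ratio limit, which is why a longer run is needed to tell the two apart.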