epistemic engine · v1.7.0-rc1생각의 틀 · decompose · verify · oppose · decide

Soundingrightisn'tbeingright.

Frontier models are fluent enough now that wrong answers arrive sounding finished. episteme makes a load-bearing conclusion earn its confidence before it lands — decomposed into claims, verified against evidence, argued from the other side, stamped with a verdict.

conclusion “memory system improves response quality”

01measured02cited03inferred04assumed → refutedverdict · proceed-with-revision
claim tiers04
verdicts03
tests green1367
CFR under constraint · MIRROR0.60→0.14
01 / how it worksdecision → interrogation → verdict → chain

01 · decision

“memory system improves response quality — keep shipping.”

02 · decompose · tiered claims

  • c1measured● load-bearing

    +7% thumbs-up lift over 30 days

    supported
  • c2measured

    with-memory responses run ~30% longer

  • c3inferred● load-bearing

    the lift comes from memory, not length

    unverifiable
  • c4assumed● load-bearing

    thumbs-up tracks answer quality

    refuted

03 · verify · fresh context

sealed
draft reasoningsealed out
  • c1supportedreproduces in a clean query
  • c3unverifiablelength was never controlled
  • c4refutedit tracks confidence, not correctness

the verifier receives the claim only — never the draft's prose. it answers from evidence, not from the argument.

04 · oppose

length alone predicts thumbs-up — the lift may be the length effect wearing a memory badge.

weakest link c3 · cheapest decisive test: re-run with length controlled

disconfirmation lift disappears under length control

05 · verdict gate

verdict · proceed-with-revision
open

the gate starts closed and only a valid verdict opens it — a stop verdict fails closed.

06 · lesson → chain

control for response length before attributing quality lift

genesislh_worked-e

verified lessons become context-scoped protocols — hash-chained, resurfacing at the next matching decision.

● load-bearing claims are verified in a fresh context · supported / refuted / unverifiable · most interrogations teach nothing durable — the lesson is nullable.

02 / the three layerscognition · structure · memory

cognition

the thinking — model-judged

The interrogation itself: decompose the conclusion into claims, send the load-bearing ones to a fresh context that never sees the draft, argue the other side, commit to what would prove it wrong.

Model-judged, because meaning lives where rules can't reach.

structure

the floor — deterministic

File-system hooks route decision shapes, check that the verdict artifact exists and holds (a stop verdict admits nothing), and hard-block only the genuinely destructive. Everything is recorded.

Deterministic, because deadlines are exactly when discipline gets skipped.

memory

the compounding — chained

When a verified interrogation teaches something durable, the lesson is sealed into a hash-chained protocol scoped to its context — and resurfaces, unasked, at the next matching decision.

Chained, so the agent gets sharper on your codebase — not on the average of the internet.

It runs where you already work: a Claude Code plugin and a CLI today — and because the kernel is plain files (markdown, JSONL, hooks), the practice travels with you across tools, not with any one vendor.

03 / prooffalsifiable, by record

Forty-nine days. Zero protocols.
The kernel reported the miss itself.

A falsifiability condition only counts if it can fire. For 49 days the framework ran live and synthesized zero protocols — the one emit path was tied to the rarest operation class. Condition E1 fired. First the kernel was made to measure the miss in its own reports; then the loop got a real source — every verified interrogation whose lesson survives becomes a context-scoped protocol. The first one landed the same night. It's transcribed below, exactly as sealed.

external record · confident-failure rate
self-monitoring — calibration scores shown to the model0.60
architectural constraint — external to the model0.14
“Providing models with their own calibration scores produces no significant improvement; only architectural constraint is effective.”
MIRROR · 16 models · 8 labs · ~250,000 instances — arXiv 2604.19809 →
the record · protocol №1, transcribed from the chain
lh_71f88adef21147df

release-merge under overnight autonomy

100%

In context `episteme release flow under operator-granted overnight autonomy`, treat release-please merges as in-scope when the directive asks for a finished product by morning — and name the judgment call explicitly in the handoff.

project · epistemeop class · gh prruntime · governed
invocations · 07679d5

№1 — synthesized 2026-06-10 from a verified interrogation · verdict: proceed · sealed against sha256:GENESIS · the live chain renders in the dashboard

04 / the practiceframe · decompose · execute · verify · handoff

The practice is the product.

Frame names the one question. Decompose turns the why into a how. Execute moves in reversible steps. Verify judges against the metric, not the effort. Handoff persists what was learned. Each stage leaves an artifact on disk — a surface, a verdict, a sealed envelope.

Five stages. The gate reads the artifacts they leave behind.

the loopOBSERVE · ORIENT · DECIDE · ACTspeed comes from completion, not from skipping01Framename the question02Decomposesplit, weigh options03Executereversible first04Verifyagainst metric05Handoffwhat's still open
figure 1 · the five-stage loop · the kernel refuses to skip ahead
05 / install · ≈ 60 seconds

Keep thinking — even when the model can finish your sentences.
Two commands. Then every load-bearing conclusion earns its confidence before it lands.

the commands
# inside claude code
/plugin marketplace add junjslee/episteme
/plugin install episteme@episteme
# then, from any shell
$ episteme init && episteme doctor # seed memory · verify wiring
prefer a clone? full quick start →
what it sounds like
# interrogation №1 — transcribed from the chain
EPISTEME · epistemic interrogation
 
conclusion under review
"merge release PR #94 — cut v1.7.0-rc1"
 
⊢ claims decomposed · 3 — all load-bearing
✓ verified against evidence — CI green ×3, release flow cited
✓ opposition argued — a public rc the operator hasn't read
▲ weakest link — ‘merge it’ may not have covered the release PR
✓ disconfirmation — operator reverts the tag on waking
 
verdict · proceed
↳ lesson lh_71f88adef21147df → protocol №1 · chained