epistemic engine · v1.7.0-rc1 · a frame for thinking · decompose · verify · oppose · decide

Sounding right isn't being right.

Frontier models are fluent enough now that wrong answers arrive sounding finished. episteme makes a load-bearing conclusion earn its confidence before it lands — decomposed into claims, verified against evidence, argued from the other side, stamped with a verdict.

see how it works · install in 60 seconds

conclusion “memory system improves response quality”

01 measured · 02 cited · 03 inferred · 04 assumed → refuted

verdict · proceed-with-revision

claim tiers · 04

verdicts · 03

tests green · 1367

CFR under constraint · MIRROR · 0.60 → 0.14

01 / how it works · decision → interrogation → verdict → chain

01 · decision

“memory system improves response quality — keep shipping.”

02 · decompose · tiered claims

c1 · measured · ● load-bearing · +7% thumbs-up lift over 30 days · supported
c2 · measured · with-memory responses run ~30% longer
c3 · inferred · ● load-bearing · the lift comes from memory, not length · unverifiable
c4 · assumed · ● load-bearing · thumbs-up tracks answer quality · refuted
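The decomposition above can be pictured as a small data structure. This is a minimal sketch, not episteme's actual schema: the `Tier`, `Claim`, and `needs_verification` names are hypothetical, and only the tier names, claim texts, and statuses come from the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Tier(Enum):
    # The four claim tiers named on this page.
    MEASURED = "measured"
    CITED = "cited"
    INFERRED = "inferred"
    ASSUMED = "assumed"


@dataclass
class Claim:
    claim_id: str
    tier: Tier
    text: str
    load_bearing: bool = False
    # Filled in by verification: "supported", "refuted", or "unverifiable".
    status: Optional[str] = None


claims = [
    Claim("c1", Tier.MEASURED, "+7% thumbs-up lift over 30 days", True, "supported"),
    Claim("c2", Tier.MEASURED, "with-memory responses run ~30% longer"),
    Claim("c3", Tier.INFERRED, "the lift comes from memory, not length", True, "unverifiable"),
    Claim("c4", Tier.ASSUMED, "thumbs-up tracks answer quality", True, "refuted"),
]


def needs_verification(items: List[Claim]) -> List[str]:
    """Return the ids of load-bearing claims, the ones sent to the verifier."""
    return [c.claim_id for c in items if c.load_bearing]
```

In this sketch c2 is recorded but never sent for verification, because the conclusion does not rest on it.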

03 · verify · fresh context

sealed

draft reasoning · sealed out

c1 · supported · reproduces in a clean query
c3 · unverifiable · length was never controlled
c4 · refuted · it tracks confidence, not correctness

the verifier receives the claim only — never the draft's prose. it answers from evidence, not from the argument.
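One way to picture the sealing, as a hedged sketch: the function that builds the verifier's input returns the claim and the evidence, and nothing else. The `Interrogation` type and `verifier_input` function are hypothetical names for illustration, not episteme's API.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Interrogation:
    draft_reasoning: str    # the draft's prose, never forwarded
    claims: Dict[str, str]  # claim id -> claim text


def verifier_input(interrogation: Interrogation, claim_id: str,
                   evidence: List[str]) -> Dict[str, object]:
    """Build the payload for the fresh context: the claim and the evidence only."""
    return {
        "claim": interrogation.claims[claim_id],
        "evidence": list(evidence),
    }


session = Interrogation(
    draft_reasoning="memory clearly drives the lift, so keep shipping",
    claims={"c3": "the lift comes from memory, not length"},
)
payload = verifier_input(session, "c3", ["length was never controlled"])
```

Because the draft never enters the payload, the verifier cannot be persuaded by the argument it is supposed to check.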

04 · oppose

length alone predicts thumbs-up — the lift may be the length effect wearing a memory badge.

weakest link · c3 · cheapest decisive test: re-run with length controlled

disconfirmation · lift disappears under length control

05 · verdict gate

verdict · proceed-with-revision

open

the gate starts closed and only a valid verdict opens it — a stop verdict fails closed.
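A fail-closed gate can be sketched in a few lines. The verdict names `proceed`, `proceed-with-revision`, and `stop` appear on this page; treating them as the complete set of three verdicts is an assumption, and `gate_is_open` is a hypothetical helper rather than the kernel's own function.

```python
from typing import Optional

# Assumption: these are the verdicts that open the gate.
OPENING_VERDICTS = {"proceed", "proceed-with-revision"}


def gate_is_open(verdict: Optional[str]) -> bool:
    """Closed by default; only a recognized, non-stop verdict opens the gate."""
    if verdict is None:
        return False  # no verdict artifact yet: stay closed
    return verdict in OPENING_VERDICTS  # "stop" and unknown values stay closed
```

The design choice is that every unexpected input, including a missing or malformed verdict, lands on the closed side.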

06 · lesson → chain

control for response length before attributing quality lift

genesis · lh_worked-e…

verified lessons become context-scoped protocols — hash-chained, resurfacing at the next matching decision.
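Hash chaining in general works as follows. This is a minimal sketch under stated assumptions: the entry layout, the JSON serialization, and the `seal`, `append`, and `verify` helpers are hypothetical, and only the `sha256:GENESIS` anchor is taken from the record shown on this page.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "sha256:GENESIS"  # the anchor the first entry is sealed against


def seal(prev_hash: str, lesson: str) -> str:
    """Hash the lesson together with the previous entry's hash."""
    body = json.dumps({"prev": prev_hash, "lesson": lesson}, sort_keys=True)
    return "sha256:" + hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(chain: List[Dict[str, str]], lesson: str) -> None:
    """Add a lesson, linking it to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev, "lesson": lesson, "hash": seal(prev, lesson)})


def verify(chain: List[Dict[str, str]]) -> bool:
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != seal(entry["prev"], entry["lesson"]):
            return False
        prev = entry["hash"]
    return True
```

Since each hash covers the previous one, quietly rewriting an old lesson invalidates every entry after it.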

● load-bearing claims are verified in a fresh context · supported / refuted / unverifiable · most interrogations teach nothing durable — the lesson is nullable.

02 / the three layers · cognition · structure · memory

cognition

the thinking — model-judged

The interrogation itself: decompose the conclusion into claims, send the load-bearing ones to a fresh context that never sees the draft, argue the other side, commit to what would prove it wrong.

Model-judged, because meaning lives where rules can't reach.

structure

the floor — deterministic

File-system hooks route decision shapes, check that the verdict artifact exists and holds (a stop verdict admits nothing), and hard-block only the genuinely destructive. Everything is recorded.

Deterministic, because deadlines are exactly when discipline gets skipped.

memory

the compounding — chained

When a verified interrogation teaches something durable, the lesson is sealed into a hash-chained protocol scoped to its context — and resurfaces, unasked, at the next matching decision.
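How a protocol finds its "next matching decision" is not specified here. One plausible reading, sketched under that assumption, is an exact match on scope fields such as project, op class, and runtime. The dictionary layout and the `matching_protocols` function are hypothetical.

```python
from typing import Dict, List


def matching_protocols(protocols: List[Dict], decision: Dict[str, str]) -> List[Dict]:
    """Return protocols whose every scope field equals the decision's value."""
    return [
        p for p in protocols
        if all(decision.get(key) == value for key, value in p["scope"].items())
    ]


# Illustrative entry modeled on the record shown on this page.
protocol_1 = {
    "id": "lh_71f88adef21147df",
    "scope": {"project": "episteme", "op_class": "gh pr", "runtime": "governed"},
}

decision = {"project": "episteme", "op_class": "gh pr", "runtime": "governed"}
surfaced = matching_protocols([protocol_1], decision)
```

A decision in another project or operation class surfaces nothing, which is what keeps a lesson scoped to the context that produced it.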

Chained, so the agent gets sharper on your codebase — not on the average of the internet.

It runs where you already work: a Claude Code plugin and a CLI today — and because the kernel is plain files (markdown, JSONL, hooks), the practice travels with you across tools, not with any one vendor.

03 / proof · falsifiable, by record

Forty-nine days. Zero protocols.
The kernel reported the miss itself.

A falsifiability condition only counts if it can fire. For 49 days the framework ran live and synthesized zero protocols — the one emit path was tied to the rarest operation class. Condition E1 fired. First the kernel was made to measure the miss in its own reports; then the loop got a real source — every verified interrogation whose lesson survives becomes a context-scoped protocol. The first one landed the same night. It's transcribed below, exactly as sealed.

external record · confident-failure rate

self-monitoring (calibration scores shown to the model) · 0.60

architectural constraint (external to the model) · 0.14

“Providing models with their own calibration scores produces no significant improvement; only architectural constraint is effective.”

MIRROR · 16 models · 8 labs · ~250,000 instances · arXiv 2604.19809

the record · protocol №1, transcribed from the chain

lh_71f88adef21147df

release-merge under overnight autonomy

100%

In context `episteme release flow under operator-granted overnight autonomy`, treat release-please merges as in-scope when the directive asks for a finished product by morning — and name the judgment call explicitly in the handoff.

project · episteme · op class · gh pr · runtime · governed

№1 — synthesized 2026-06-10 from a verified interrogation · verdict: proceed · sealed against sha256:GENESIS · the live chain renders in the dashboard

the full 49-day case study

04 / the practice · frame · decompose · execute · verify · handoff

The practice is the product.

Frame names the one question. Decompose turns the why into a how. Execute moves in reversible steps. Verify judges against the metric, not the effort. Handoff persists what was learned. Each stage leaves an artifact on disk — a surface, a verdict, a sealed envelope.

Five stages. The gate reads the artifacts they leave behind.

figure 1 · the five-stage loop · the kernel refuses to skip ahead
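The "refuses to skip ahead" rule can be sketched as an ordering check over the artifacts already on disk. The stage names come from this page. Representing artifacts as a set of stage names, and the `may_enter` and `next_stage` helpers, are assumptions made for illustration.

```python
from typing import Optional, Set

STAGES = ["frame", "decompose", "execute", "verify", "handoff"]


def may_enter(stage: str, artifacts_present: Set[str]) -> bool:
    """A stage is enterable only when every earlier stage has left its artifact."""
    earlier = STAGES[:STAGES.index(stage)]
    return all(s in artifacts_present for s in earlier)


def next_stage(artifacts_present: Set[str]) -> Optional[str]:
    """Return the first stage with no artifact yet, or None when the loop is done."""
    for stage in STAGES:
        if stage not in artifacts_present:
            return stage
    return None
```

With only the frame and decompose artifacts present, this sketch allows `execute` and refuses `verify`.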

05 / install · ≈ 60 seconds

Keep thinking — even when the model can finish your sentences.
Two commands. Then every load-bearing conclusion earns its confidence before it lands.

the commands

# inside claude code

/plugin marketplace add junjslee/episteme

/plugin install episteme@episteme

# then, from any shell

$ episteme init && episteme doctor # seed memory · verify wiring

prefer a clone? full quick start →

what it sounds like

# interrogation №1 — transcribed from the chain
EPISTEME · epistemic interrogation
 
conclusion under review
  "merge release PR #94 — cut v1.7.0-rc1"
 
⊢ claims decomposed · 3 — all load-bearing
✓ verified against evidence — CI green ×3, release flow cited
✓ opposition argued — a public rc the operator hasn't read
▲ weakest link — ‘merge it’ may not have covered the release PR
✓ disconfirmation — operator reverts the tag on waking
 
verdict · proceed
↳ lesson lh_71f88adef21147df → protocol №1 · chained

read the source · see the dashboard