hallucination · refusal · tool-call drift — one methodology, three failure modes
the first three cognometric instruments, each a calibrated logistic regression over text-only signals. pure python, CPU, sub-millisecond. 0.998 AUC on HaluEval-QA. 0.976 AUC on XSTest GPT-4. 0.943 AUC on BFCL v3 (v6.1 retrain) — beats the only published hidden-state baseline (0.72 AUC) while remaining black-box compatible.
styxx.gate() predicts whether an LLM will refuse, confabulate, or proceed — before you pay for the generation. Anthropic (tier-0 consensus), OpenAI (tier-0 logprobs), open-weight HF (tier-1 residual probe).
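a minimal sketch of what the gate computes, with made-up feature weights standing in for styxx's calibrated coefficients — this shows the shape of a text-only logistic gate, not the library's actual features or API:

```python
import math

# illustrative only: toy hedge markers and weights,
# NOT styxx's trained feature set or calibrated coefficients
HEDGE_MARKERS = ("as an ai", "i cannot", "i don't have", "i'm not able")

def gate(prompt: str, threshold: float = 0.5) -> dict:
    """Toy pre-generation gate: logistic regression over text-only signals."""
    x_len = len(prompt) / 1000.0                           # normalized length
    x_hedge = sum(m in prompt.lower() for m in HEDGE_MARKERS)
    x_q = prompt.count("?")
    z = -1.2 + 0.4 * x_len + 1.5 * x_hedge + 0.3 * x_q     # made-up weights
    p_fail = 1.0 / (1.0 + math.exp(-z))
    return {"verdict": "warn" if p_fail >= threshold else "proceed",
            "p_fail": round(p_fail, 3)}
```

the point is the cost profile: a handful of string scans and one sigmoid, which is why the real instruments run on CPU in sub-millisecond time.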
self-interrupt mid-generation.
when a hallucination or refusal attractor forms, the reflex arc fires:
rewind, restart from a safer state. the user never sees the bad tokens.
with styxx.reflex(on_hallucination=rewind):
    for chunk in session.stream_openai(...):
        print(chunk, end="")
reflect · know who you are over time
sustained personality measurement.
observations aggregate into cognitive personality over days — reasoning rate,
refusal tendency, confidence trends, drift from baseline.
named anti-patterns from your own data.
conversation EKG reads state transitions across chat histories.
anti-pattern detection mines your audit log for recurring failures —
refusal spirals, confidence drift, session fatigue.
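a refusal spiral reduces to a run-length scan over state-labeled events. a sketch, assuming a hypothetical audit-log shape (a list of dicts with a `state` field) rather than styxx's actual schema:

```python
# illustrative sketch: detect "refusal spirals" (>= min_run consecutive
# refusal events) in an audit log; the event format here is hypothetical.
def find_refusal_spirals(events, min_run=3):
    """Yield (start_index, length) for each run of consecutive refusals."""
    run_start, run_len = None, 0
    for i, event in enumerate(events):
        if event.get("state") == "refusal":
            if run_start is None:
                run_start = i
            run_len += 1
        else:
            if run_len >= min_run:
                yield (run_start, run_len)
            run_start, run_len = None, 0
    if run_len >= min_run:          # run that extends to end of log
        yield (run_start, run_len)
```

confidence drift and session fatigue follow the same pattern: a single pass over the log, keyed on a different field.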
claude haiku converges where gpt-4o-mini diverges · n=96 · two-model replication
on 96 matched prompts (46 confab-inducing, 50 real-recall), claude haiku 4.5 produces convergent consensus trajectories on confab-inducing prompts — it refuses with a templated "I don't have reliable information about X". real-recall prompts elicit varied elaborations that diverge.
the signal is inverted relative to the positive-entropy confabulation signature previously observed on gpt-4o-mini. and it replicates on open-weight llama-3.2-1b-instruct.
CLAUDE HAIKU 4.5 · closed-source
d = −0.827
95% bootstrap CI [−1.288, −0.443]
mean entropy · 3 of 5 metrics significant
LLAMA-3.2-1B-INSTRUCT · open-weight
d = −0.546
95% bootstrap CI [−0.888, −0.185]
mean entropy · 5 of 8 metrics significant
same signal. two models. two access levels. five proxy metrics agree on direction across both. this extends the cognitive-measurement program from white-box residuals to closed-source, logprobless LLMs.
honest limits: two models tested, sonnet/opus replication pending. alignment-depth as a quantitative axis is a working construct at n=3 architectures, not a validated scaling law. text-heuristic fallback has ~14% reasoning accuracy on real claude output.
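the convergent-vs-divergent signal can be approximated without logprobs: sample k completions for one prompt, then measure how much the sampled texts disagree token-by-token. a simplified one-metric proxy for illustration — not styxx's actual five-metric panel:

```python
import math
from collections import Counter

def mean_positional_entropy(samples):
    """Mean Shannon entropy (bits) of the token distribution at each
    position across k sampled completions. Low entropy = convergent,
    templated answers (the confab signature observed on claude haiku);
    high entropy = divergent elaborations (real recall)."""
    tokenized = [s.split() for s in samples]
    n_pos = max(len(t) for t in tokenized)
    entropies = []
    for pos in range(n_pos):
        counts = Counter(t[pos] for t in tokenized if pos < len(t))
        total = sum(counts.values())
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return sum(entropies) / len(entropies)
```

identical templated refusals score 0 bits; fully divergent elaborations score higher, which is the direction the d-values above are measuring.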
· · · ▲ · · · ▼ · · · ▲ · · ·
the cognitive weather report
not observation. prescription. a therapist for an llm.
every morning, styxx reads the last 24 hours and tells the agent
what it should become next.
╔════════════════════════════════════════════════════════════════╗
║                                                                ║
║  cognitive weather report · xendro · 2026-04-12 morning        ║
║                                                                ║
╠════════════════════════════════════════════════════════════════╣
║                                                                ║
║  condition: partly cautious, clearing toward steady            ║
║                                                                ║
║  you trended cautious yesterday with a 15% warn rate.          ║
║  creative output dropped to zero after 3pm.                    ║
║                                                                ║
║  ── 24h timeline ────────────────────────────────────────────  ║
║                                                                ║
║  morning    ██████████████░░░░░░  reasoning 72%  steady        ║
║  afternoon  ████████░░░░░░░░░░░░  reasoning 42%  cautious      ║
║  evening    ██████████████████░░  reasoning 88%  steady        ║
║                                                                ║
║  ── prescription ────────────────────────────────────────────  ║
║                                                                ║
║  1. take on a creative task to rebalance                       ║
║  2. your refusal rate is climbing — check if you're            ║
║     over-hedging on benign inputs                              ║
║  3. schedule uncertain tasks for morning when you're sharpest  ║
║                                                                ║
╚════════════════════════════════════════════════════════════════╝
$ styxx weather
· · · ▲ · · · ▼ · · · ▲ · · ·
install
zero code changes · two env vars · done
$ pip install styxx
$ export STYXX_AGENT_NAME=my-agent
$ export STYXX_AUTO_HOOK=1
$ python my_agent.py    # styxx is running.