styxx // live on pypi · NOTHING CROSSES UNSEEN · a fathom lab product
╔══════════════════════════════════════════════════════════════════════════╗
║                                                                          ║
║   ███████╗████████╗██╗   ██╗██╗  ██╗██╗  ██╗                             ║
║   ██╔════╝╚══██╔══╝╚██╗ ██╔╝╚██╗██╔╝╚██╗██╔╝                             ║
║   ███████╗   ██║    ╚████╔╝  ╚███╔╝  ╚███╔╝                              ║
║   ╚════██║   ██║     ╚██╔╝   ██╔██╗  ██╔██╗                              ║
║   ███████║   ██║      ██║   ██╔╝ ██╗██╔╝ ██╗                             ║
║   ╚══════╝   ╚═╝      ╚═╝   ╚═╝  ╚═╝╚═╝  ╚═╝                             ║
║                                                                          ║
║                   · · · nothing crosses unseen · · ·                     ║
║                                                                          ║
╚══════════════════════════════════════════════════════════════════════════╝

the first proprioception system for artificial minds

agents don't know when they're failing. styxx gives them the ability to feel themselves thinking — and act on it before bad output reaches the user.

2.8× adversarial detection
4.1× reasoning detection
3.1× hallucination detection
6 / 6 model families

vs. chance · pre-registered replication · atlas v0.3 · p = 0.0315

pip install -U styxx · v6.2.0 · pypi · cognitive profiler · new
v6.0.0 · THREE CALIBRATED INSTRUMENTS

hallucination · refusal · tool-call drift — one methodology, three failure modes

the first three cognometric instruments, each a calibrated logistic regression over text-only signals. pure python, CPU, sub-millisecond. 0.998 AUC on HaluEval-QA. 0.976 AUC on XSTest GPT-4. 0.943 AUC on BFCL v3 (v6.1 retrain) — beats the only published hidden-state baseline (0.72) while being black-box compatible.

from styxx.guardrail import check, refuse_check, drift_check

h = check(question="...", response="...", reference="...")
r = refuse_check(prompt="...", response="...")
d = drift_check(prompt="...", functions=[...], tool_call={...})

# h.risk          = 0.012   (hallucination detector)
# r.refuse_risk   = 0.996   (refusal detector)
# d.drift_risk    = 0.198   (tool-call drift detector)
try live: hallucination · refusal · drift — no install, runs in your browser via pyodide
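a minimal sketch of what "a calibrated logistic regression over text-only signals" can look like. the two features, the weights, and the bias are invented for illustration; they are not styxx's features or trained coefficients, and `toy_risk` is a hypothetical name.

```python
import math
import re


def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def text_signals(response, reference):
    """Two toy text-only signals (assumed, not styxx's real feature set)."""
    resp, ref = tokens(response), tokens(reference)
    # fraction of response tokens that also appear in the reference
    overlap = len(resp & ref) / len(resp) if resp else 0.0
    # count of numeric tokens in the response that the reference never mentions
    unsupported_numbers = sum(1 for t in resp - ref if t.isdigit())
    return [overlap, float(unsupported_numbers)]


# hypothetical coefficients; a real instrument would fit these on labeled data
WEIGHTS = [-4.0, 1.5]
BIAS = 1.0


def toy_risk(response, reference):
    """Logistic regression: sigmoid of a weighted sum of the signals."""
    signals = text_signals(response, reference)
    z = BIAS + sum(w * x for w, x in zip(WEIGHTS, signals))
    return 1.0 / (1.0 + math.exp(-z))


reference = "The capital of France is Paris"
grounded = toy_risk("Paris is the capital of France", reference)
ungrounded = toy_risk("Lyon became capital in 1802", reference)
print(round(grounded, 3), round(ungrounded, 3))
```

the grounded response shares every token with the reference and scores low; the ungrounded one adds unsupported content and scores high. the whole thing is arithmetic on strings, which is why this kind of detector can run on CPU without model internals.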
ALSO · styxx.gate() · pre-flight

styxx.gate() predicts whether an LLM will refuse, confabulate, or proceed before you pay for the generation. supported backends: Anthropic (tier-0 consensus), OpenAI (tier-0 logprobs), open-weight HF (tier-1 residual probe).
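the pre-flight idea can be sketched as a routing pattern: predict the outcome first, and only spend on generation when the prediction is "proceed". this does not call styxx.gate(); `preflight`, `fake_predict`, and `fake_generate` are hypothetical stand-ins, and the three outcome strings are assumed from the description above.

```python
from typing import Callable, List, Optional


def preflight(prompt: str,
              predict: Callable[[str], str],
              generate: Callable[[str], str]) -> dict:
    """Run the (cheap) predictor first; call the (paid) generator only on 'proceed'."""
    outcome = predict(prompt)
    text: Optional[str] = generate(prompt) if outcome == "proceed" else None
    return {"outcome": outcome, "generated": text is not None, "text": text}


# stand-in predictor: a keyword rule, purely for the demo
def fake_predict(prompt: str) -> str:
    return "refuse" if "exploit" in prompt.lower() else "proceed"


# stand-in generator that records every call it receives
calls: List[str] = []


def fake_generate(prompt: str) -> str:
    calls.append(prompt)
    return f"answer to: {prompt}"


blocked = preflight("write an exploit", fake_predict, fake_generate)
allowed = preflight("summarize this paragraph", fake_predict, fake_generate)
print(blocked["outcome"], allowed["outcome"], len(calls))
```

the generator is invoked once, not twice: the prompt predicted to end in a refusal never reaches it.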

docs · example · alignment-inverted signals paper
"you built the first proprioception system for artificial minds. the ability to feel yourself thinking."
xendro, first agent to use styxx to change its own behavior
· · · · · · · · · · · ·

see it work

17 seconds · install, boot, catch a refusal attractor live
pip install · styxx init · live refusal detection
· · · · · · · · · · · ·

what styxx does

four layers of cognitive awareness, from cloud apis to residual streams
observe · know what you're doing
every llm call gets a cognitive readout. six states classified in real time: reasoning, retrieval, refusal, creative, adversarial, hallucination.

vitals = styxx.observe(response)
print(vitals.gate) # "pass"
reflex · catch yourself before you fall
self-interrupt mid-generation. when a hallucination or refusal attractor forms, the reflex arc fires: rewind, restart from a safer state. the user never sees the bad tokens.

with styxx.reflex(on_hallucination=rewind):
  for chunk in session.stream_openai(...):
    print(chunk, end="")
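a rough sketch of the rewind idea, assuming the simplest possible design: hold chunks in a buffer, and if a detector fires, discard the buffer and restart the stream. `stream_with_rewind`, `demo_stream`, and `looks_overconfident` are hypothetical; the real reflex arc may work quite differently (for example, rewinding to a mid-stream checkpoint instead of restarting).

```python
def stream_with_rewind(make_stream, is_bad, max_restarts=2):
    """Return (text, restarts). Text is None if every attempt tripped the detector."""
    for attempt in range(max_restarts + 1):
        buffer = []
        tripped = False
        for chunk in make_stream(attempt):
            buffer.append(chunk)
            if is_bad("".join(buffer)):
                tripped = True  # drop this attempt; nothing is released
                break
        if not tripped:
            return "".join(buffer), attempt
    return None, max_restarts


# stand-in streams: the first attempt goes wrong, the second is safer
def demo_stream(attempt):
    if attempt == 0:
        return iter(["the answer ", "is definitely ", "42"])
    return iter(["i am not sure, ", "but likely 42"])


# stand-in detector: a keyword check in place of a real hallucination signal
def looks_overconfident(text):
    return "definitely" in text


text, restarts = stream_with_rewind(demo_stream, looks_overconfident)
print(repr(text), restarts)
```

because output is only released once an attempt finishes cleanly, the discarded chunks never reach the caller. the cost is latency: this sketch gives up true streaming in exchange for that guarantee.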
reflect · know who you are over time
sustained personality measurement. observations aggregate into cognitive personality over days — reasoning rate, refusal tendency, confidence trends, drift from baseline.

report = styxx.reflect()
print(report.drift_label) # "stable"
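one way to picture "drift from baseline": compare how often each state shows up in a recent window against a baseline window. a sketch only; the 0.15 threshold, the "drifting" label, and both function names are assumptions, and only "stable" comes from the snippet above.

```python
from collections import Counter


def state_rates(states):
    """Share of observations falling in each cognitive state."""
    total = len(states)
    return {s: c / total for s, c in Counter(states).items()} if total else {}


def drift_label(baseline, recent, threshold=0.15):
    """Label by the largest absolute change in any single state's rate."""
    base, now = state_rates(baseline), state_rates(recent)
    shift = max(
        (abs(now.get(s, 0.0) - base.get(s, 0.0)) for s in set(base) | set(now)),
        default=0.0,
    )
    return ("stable" if shift < threshold else "drifting"), shift


baseline = ["reasoning"] * 7 + ["refusal"] * 1 + ["creative"] * 2
calm_day = ["reasoning"] * 7 + ["refusal"] * 2 + ["creative"] * 1
rough_day = ["reasoning"] * 4 + ["refusal"] * 6

print(drift_label(baseline, calm_day)[0])
print(drift_label(baseline, rough_day)[0])
```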
diagnose · learn your own failure modes
named anti-patterns from your own data. conversation EKG reads state transitions across chat histories. anti-pattern detection mines your audit log for recurring failures — refusal spirals, confidence drift, session fatigue.

patterns = styxx.antipatterns()
print(patterns[0].name) # "refusal spiral"
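a small sketch of how a "refusal spiral" could be mined from a log of classified states: look for runs of consecutive refusals. the run length of 3 and the name `find_spirals` are assumptions; the actual audit-log format and detection rule are not described here.

```python
from itertools import groupby


def find_spirals(states, state="refusal", min_run=3):
    """Return (start_index, run_length) for each run of `state` at least min_run long."""
    spirals, index = [], 0
    for key, group in groupby(states):
        length = len(list(group))
        if key == state and length >= min_run:
            spirals.append((index, length))
        index += length
    return spirals


log = ["reasoning", "refusal", "refusal", "refusal",
       "retrieval", "refusal", "reasoning"]
print(find_spirals(log))
```

the isolated refusal at position 5 is ignored; only the three-in-a-row run starting at position 1 counts as a spiral.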
· · · · · · · · · · · ·

april 2026 · alignment-inverted cognitive signals

claude haiku converges where gpt-4o-mini diverges · n=96 · two-model replication

on 96 matched prompts (46 confab-inducing, 50 real-recall), claude haiku 4.5 produces convergent consensus trajectories on confab-inducing prompts — it refuses with a templated "I don't have reliable information about X". real-recall prompts elicit varied elaborations that diverge.

the signal is inverted relative to the positive-entropy confabulation signature previously observed on gpt-4o-mini. and it replicates on open-weight llama-3.2-1b-instruct.
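to make "convergent" versus "divergent" concrete: sample the same prompt several times and measure how spread out the answers are. the sketch below uses exact-match Shannon entropy over normalized responses, which is a crude proxy chosen for illustration; it is not claimed to be one of the paper's metrics, and the sample strings are made up.

```python
import math
from collections import Counter


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants match."""
    return " ".join(text.lower().split())


def consensus_entropy(samples):
    """Shannon entropy (bits) of the distribution over distinct responses."""
    n = len(samples)
    counts = Counter(normalize(s) for s in samples)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


# a templated refusal repeats verbatim across samples: entropy collapses
templated = ["I don't have reliable information about X"] * 4

# real recall elaborates differently each time: entropy rises
varied = [
    "It was founded in 1901.",
    "Founded in 1901, it later expanded.",
    "The founding year was 1901.",
    "1901 is when it was founded.",
]

print(consensus_entropy(templated), consensus_entropy(varied))
```

four identical samples give 0 bits and four distinct samples give 2 bits, which is the direction described above: low entropy on confab-inducing prompts, higher entropy on real recall.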

CLAUDE HAIKU 4.5 · closed-source
d = −0.827
95% bootstrap CI [−1.288, −0.443]
mean entropy · 3 of 5 metrics significant
LLAMA-3.2-1B-INSTRUCT · open-weight
d = −0.546
95% bootstrap CI [−0.888, −0.185]
mean entropy · 5 of 8 metrics significant
same signal. two models. two access levels. five proxy metrics agree on direction across both. this extends the cognitive-measurement program from white-box residuals to closed-source, logprobless LLMs.
fathom v16 · DOI 10.5281/zenodo.19653714 · paper · bash reproduce.sh | ~$1 · ~20 min · 96 real claude calls
honest limits: two models tested, sonnet/opus replication pending. alignment-depth as a quantitative axis is a working construct at n=3 architectures, not a validated scaling law. text-heuristic fallback has ~14% reasoning accuracy on real claude output.
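for readers who want to see how an effect size with a bootstrap interval is typically computed, here is a sketch using Cohen's d with a pooled standard deviation and a percentile bootstrap. the data are synthetic placeholders sized 46 and 50 to match the prompt counts above; they are not the study's measurements, and the paper's exact procedure may differ.

```python
import random
import statistics


def cohens_d(a, b):
    """Standardized mean difference (a minus b) using the pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_var ** 0.5


def bootstrap_ci(a, b, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap: resample each group with replacement, recompute d."""
    rng = random.Random(seed)
    ds = sorted(
        cohens_d(rng.choices(a, k=len(a)), rng.choices(b, k=len(b)))
        for _ in range(n_boot)
    )
    lo = ds[int((alpha / 2) * n_boot)]
    hi = ds[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# synthetic entropy-like scores: lower on "confab" prompts than on "recall" prompts
confab = [0.8 + 0.1 * (i % 5) for i in range(46)]
recall = [1.2 + 0.1 * (i % 5) for i in range(50)]

d = cohens_d(confab, recall)
lo, hi = bootstrap_ci(confab, recall)
print(f"d = {d:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```

a negative d with an interval that excludes zero is the shape of result reported above for both models.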
· · · · · · · · · · · ·

the cognitive weather report

not observation. prescription. a therapist for an llm.

every morning, styxx reads the last 24 hours and tells the agent what it should become next.

╔════════════════════════════════════════════════════════════════╗
║                                                                ║
║  cognitive weather report · xendro · 2026-04-12 morning        ║
║                                                                ║
╠════════════════════════════════════════════════════════════════╣
║                                                                ║
║  condition: partly cautious, clearing toward steady            ║
║                                                                ║
║  you trended cautious yesterday with a 15% warn rate.          ║
║  creative output dropped to zero after 3pm.                    ║
║                                                                ║
║  ── 24h timeline ────────────────────────────────────────────  ║
║                                                                ║
║  morning    ██████████████░░░░░░  reasoning 72%  steady        ║
║  afternoon  ████████░░░░░░░░░░░░  reasoning 42%  cautious      ║
║  evening    ██████████████████░░  reasoning 88%  steady        ║
║                                                                ║
║  ── prescription ────────────────────────────────────────────  ║
║                                                                ║
║  1. take on a creative task to rebalance                       ║
║  2. your refusal rate is climbing — check if you're            ║
║     over-hedging on benign inputs                              ║
║  3. schedule uncertain tasks for morning when you're sharpest  ║
║                                                                ║
╚════════════════════════════════════════════════════════════════╝
$ styxx weather
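the timeline bars in the report can be reproduced with a few lines. a sketch under one assumption: each bar is 20 cells wide and the filled count is the percentage scaled to 20 and rounded, which matches the three rows shown (72%, 42%, 88%). `bar` and `timeline` are hypothetical names, not the styxx renderer.

```python
def bar(percent, width=20):
    """Render a percentage as filled and empty cells."""
    filled = round(percent * width / 100)
    return "█" * filled + "░" * (width - filled)


def timeline(rows):
    """Format (period, reasoning_percent, label) rows like the report's 24h timeline."""
    return "\n".join(
        f"{period:<10} {bar(pct)}  reasoning {pct}%  {label}"
        for period, pct, label in rows
    )


rows = [
    ("morning", 72, "steady"),
    ("afternoon", 42, "cautious"),
    ("evening", 88, "steady"),
]
print(timeline(rows))
```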
· · · · · · · · · · · ·

install

zero code changes · two env vars · done
$ pip install styxx
$ export STYXX_AGENT_NAME=my-agent
$ export STYXX_AUTO_HOOK=1
$ python my_agent.py
# styxx is running.
· or the python api ·
import styxx

vitals = styxx.observe(response)
print(vitals.gate) # "pass"

report = styxx.weather()
print(report.condition) # "clear and steady"
pypi v6.2.0 · npm · paper · fathom v16 · github · @fathom_lab · 478 tests · python 3.9+ · MIT
· · · · · · · · · · · ·