// DOCS · CURRENT BUILD

Setup & Reference

Everything needed to run the full pipeline — signal ingest, cross-venue arb detection, deterministic trade plans, 2-click execution with QA gating, journaling, ground-truth hindsight, and policy feedback — plus the Claude Code ↔ Hermes handoff skill for delegating build tasks to a remote agent.

Quickstart

pip install -r requirements.txt
npm install
cp .env.example .env

python scripts/seed_db.py          # creates organizedmarket.db
python scripts/dev.py              # boots the pipeline (13 agents + bridge)
# another terminal:
npx wrangler pages dev dashboard --port 8789
# open http://localhost:8789

Within seconds the dashboard/data/*.json feeds start filling and every dashboard page goes live. Inspect the SQLite database at any time:

sqlite3 organizedmarket.db "select tier, count(*) from signal_log group by tier;"
sqlite3 organizedmarket.db "select status, count(*) from journal_entry_log group by status;"
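The same checks can be scripted. Below is a minimal sketch using Python's standard sqlite3 module. It assumes only the table and column names visible in the two queries above; the helper name count_by is hypothetical and not part of the project.

```python
import sqlite3

# Table/column pairs taken from the two CLI queries above; nothing else
# about the schema is assumed.
ALLOWED = {("signal_log", "tier"), ("journal_entry_log", "status")}


def count_by(conn: sqlite3.Connection, table: str, column: str) -> dict:
    """Return {value: row_count} for `column` in `table`."""
    if (table, column) not in ALLOWED:
        # Identifiers cannot be bound as SQL parameters, so whitelist them.
        raise ValueError(f"unexpected table/column: {table}.{column}")
    rows = conn.execute(
        f"SELECT {column}, COUNT(*) FROM {table} GROUP BY {column}"
    ).fetchall()
    return {value: n for value, n in rows}
```

For example, count_by(sqlite3.connect("organizedmarket.db"), "signal_log", "tier") would return the per-tier counts once the pipeline has written some signals.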

Dashboards

Each feed-backed page polls its feed every 2s over CF Pages and highlights HIGH-tier items with a teal border and pulse animation.

Route	Purpose	Feed
`/dashboard`	Live signal feed — gap, z, tier, confidence, trade-plan card with copy-ticket + deeplinks	`data/signals.json`
`/actions`	2-click action queue: QA assessment per-check, Approve QA → Place Trade, per-leg execution status	`data/actions.json`
`/arb`	Cross-venue edges (Polymarket YES vs TastyTrade options-implied prob), Lightweight Charts triple-line (poly / tasty / edge), symbol + market dropdowns	`data/arb.json`
`/journal`	Trading journal — one entry per HIGH signal OR profitable arb (shadow even if not executed). Live MFE / MAE, ideal entry/exit notes, autoresearch block with post-exec + periodic thesis reviews	`data/journal.json`
`/policy`	Per-bucket learning stats: TP rate, mean realized capture, suggested confidence multiplier, suggested exit rule. Read-only surface of the feedback loop	`data/policy.json`
`/signals?id=…`	Single-signal detail view	derived
`/`	Architecture overview (this deploy: organized-market-arch.pages.dev)	static

Sidebar is consistent across every page; the active one is highlighted in teal. Copy buttons (plan copy, ticket copy, approve/execute) use navigator.clipboard — no server round-trip. Deeplinks open venue market pages in a new tab. Your local project is never mutated by the dashboard; it only reads rolling JSON feeds the agent pipeline writes.

ClawBox desktop app

clawbox-app/ is a Tauri 2 wrapper that mirrors the HIGH-tier hero card + action list from the web dashboard using the same plan.js renderer. Build once with cd clawbox-app && npm run build; rebuild after dashboard asset changes.

Agent Pipeline (13 agents)

All boot under scripts/dev.py on a shared in-process asyncio bus with Pydantic-validated topics.

Agent	Role	Publishes
`signal`	TastyTrade market-data + market-metrics polling → options-implied probability per symbol	`quote.update`
`poly`	Polymarket Gamma + CLOB HTTP reads (no wallet required). CLI-first with HTTP fallback. Populates `market_slug` for resolution lookups	`odds.update`, `market.microstructure`
`kalshi`	Kalshi odds (mock mode until the RSA-JWT live client is wired)	`odds.update`
`sentiment`	Twitter/X + NewsAPI with Claude-NLP scoring	`sentiment`
`correlator`	Welford-rolling divergence, z-scored tier. Consults policy feedback cache before emit	`signal`
`arb`	Polymarket × TastyTrade edge detector with Black-based calibrated probabilities. Deduplicates micro-moves. Tags a calibration record	`arb.poly_tasty`
`qa`	Gates HIGH signals + profitable arbs through freshness / confidence / edge / cooldown / daily-budget / dedup checks → `ProposedAction`	`action.proposed`
`journal`	Shadow-journals every flagged event, promotes shadow→open on EXECUTED, tracks MFE/MAE + ideal entry/exit on every snapshot	`journal.entry`
`hindsight`	Labels matured entries at horizon; swaps in real PnL when Polymarket resolves	`journal.entry` (replay)
`policy`	Per-bucket aggregate stats + suggested multipliers / exit rules	`policy.stat`
`tier_correlator`	MED→HIGH lift mining on `signal_log`	`tier.lift`
`autoresearch`	Model-drift probes + periodic journal thesis/risk notes (dedup'd, 60s cap per entry)	`model.drift`, `journal.research`
`sniffer`	Counterparty fingerprinting from microstructure + signal cross-ref	`counterparty.fingerprint`
`dispatcher`	Terminal subscriber — routes each topic to its sinks (dashboard JSON + SQLite + optional Slack/Discord)	—
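The correlator row mentions Welford-style statistics and z-scoring. As an illustration only, here is a sketch of Welford's online mean/variance update with a z-score helper. It is the cumulative form; whether the real correlator uses a window or decay is not stated here, and the class name RollingStats is hypothetical.

```python
import math


class RollingStats:
    """Welford's online algorithm for a running mean and variance."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the deviation from the *updated* mean, per Welford.
        self._m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        # Sample standard deviation; reported as 0.0 when n < 2.
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0

    def zscore(self, x: float) -> float:
        s = self.std
        return (x - self.mean) / s if s > 0 else 0.0
```

A plausible use, assuming divergence is the gap between a venue's YES price and the options-implied probability: call update(gap) on every tick and read zscore(gap) to rank the current gap against its own history.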

2-Click Agentic Execution

Every ProposedAction flows through a state machine. Two explicit human clicks are required before any order hits a venue.

State machine

PROPOSED ──▶ QA_PASSED ───click 1──▶ APPROVED ───click 2──▶ EXECUTING ──▶ EXECUTED
          └─▶ QA_REJECTED                                              └─▶ FAILED
                                                                       └─▶ EXPIRED
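The diagram can be expressed as a transition table. This is a sketch rather than the project's code: the state names come from the diagram, while the State enum, the TRANSITIONS dict and the advance helper are hypothetical. EXPIRED is attached to EXECUTING because that is how the diagram draws it; the real source state may differ.

```python
from enum import Enum


class State(str, Enum):
    PROPOSED = "PROPOSED"
    QA_PASSED = "QA_PASSED"
    QA_REJECTED = "QA_REJECTED"
    APPROVED = "APPROVED"
    EXECUTING = "EXECUTING"
    EXECUTED = "EXECUTED"
    FAILED = "FAILED"
    EXPIRED = "EXPIRED"


# Allowed moves, read off the diagram. States missing from the dict are terminal.
TRANSITIONS = {
    State.PROPOSED: {State.QA_PASSED, State.QA_REJECTED},
    State.QA_PASSED: {State.APPROVED},    # click 1
    State.APPROVED: {State.EXECUTING},    # click 2
    State.EXECUTING: {State.EXECUTED, State.FAILED, State.EXPIRED},
}


def advance(current: State, target: State) -> State:
    """Return `target` if the move is legal, otherwise raise ValueError."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Keeping the table as data makes it easy to assert that no path reaches EXECUTING without passing through both QA_PASSED and APPROVED.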

QA gates (deterministic)

freshness — signal ≤ QA_MAX_AGE_SECONDS (default 120s)
confidence — ≥ QA_MIN_CONFIDENCE (default 0.75)
edge — |divergence| ≥ QA_MIN_EDGE (default 0.05)
cooldown — ≥ QA_COOLDOWN_SECONDS since last action for same symbol (default 300s)
daily_budget — running risk ≤ QA_DAILY_CAP_USD (default $5,000)
dedup — the spawning source_id hasn't already queued
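The six gates above can be sketched as pure comparisons. The thresholds and their defaults come from the list; the Candidate fields, the run_gates helper, and the choice to count the candidate's own risk against the daily cap are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QAConfig:
    max_age_seconds: float = 120.0   # QA_MAX_AGE_SECONDS
    min_confidence: float = 0.75     # QA_MIN_CONFIDENCE
    min_edge: float = 0.05           # QA_MIN_EDGE
    cooldown_seconds: float = 300.0  # QA_COOLDOWN_SECONDS
    daily_cap_usd: float = 5000.0    # QA_DAILY_CAP_USD


@dataclass(frozen=True)
class Candidate:  # hypothetical shape, not the project's schema
    source_id: str
    age_seconds: float
    confidence: float
    divergence: float
    seconds_since_last_action: float  # same symbol; float("inf") if none
    risk_usd: float


def run_gates(c: Candidate, cfg: QAConfig,
              risk_used_today_usd: float, queued_source_ids: set) -> dict:
    """Evaluate every gate and return a per-check pass/fail map."""
    return {
        "freshness": c.age_seconds <= cfg.max_age_seconds,
        "confidence": c.confidence >= cfg.min_confidence,
        "edge": abs(c.divergence) >= cfg.min_edge,
        "cooldown": c.seconds_since_last_action >= cfg.cooldown_seconds,
        # Assumption: the candidate's own risk counts toward the cap.
        "daily_budget": risk_used_today_usd + c.risk_usd <= cfg.daily_cap_usd,
        "dedup": c.source_id not in queued_source_ids,
    }
```

Returning the full map instead of a single boolean matches the per-check QA assessment shown on /actions: a candidate passes only if all(result.values()) is true.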

Bridge server

scripts/bridge.py runs alongside the agents on http://127.0.0.1:18799.

HTTP	Purpose
`GET /api/actions`	Full queue
`GET /api/actions/{id}`	Single action detail
`POST /api/approve/{id}`	QA_PASSED → APPROVED
`POST /api/execute/{id}`	APPROVED → EXECUTING → EXECUTED/FAILED. Fires both legs; publishes back to the bus so journal + policy pick it up
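For scripted testing, the two clicks map to two POSTs. The sketch below uses only the standard library. The endpoint paths and address come from the table above; the empty request body and the JSON response shape are assumptions, and build_request / send are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

BRIDGE_URL = "http://127.0.0.1:18799"  # bridge address given above


def build_request(step: str, action_id: str) -> urllib.request.Request:
    """Build the POST for click 1 ("approve") or click 2 ("execute")."""
    if step not in ("approve", "execute"):
        raise ValueError(f"unknown step: {step!r}")
    safe_id = urllib.parse.quote(action_id, safe="")
    # Assumption: the bridge accepts an empty POST body.
    return urllib.request.Request(
        f"{BRIDGE_URL}/api/{step}/{safe_id}", data=b"", method="POST"
    )


def send(step: str, action_id: str, timeout: float = 10.0) -> dict:
    """Send the request; assumes the bridge replies with a JSON object."""
    request = build_request(step, action_id)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the pipeline running, send("approve", some_id) followed by send("execute", some_id) would walk one action through both clicks. With all AUTO_EXEC_* flags at 0, the execute step should stay in manual mode.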

Executor layer — fail-closed

Venue	Path	Needs
TastyTrade (options vertical)	`agents/executor/tasty_orders.py` — chain resolution + 50%-width limit debit/credit spread	OAuth (already in `.env`) + `AUTO_EXEC_TASTY=1`
Polymarket (YES/NO)	`agents/executor/poly_orders.py` via `py-clob-client`	`POLY_WALLET_PRIVATE_KEY` + Polygon USDC + `AUTO_EXEC_POLY=1`
Kalshi (YES/NO)	`agents/executor/kalshi_orders.py` — RSA-PSS signed headers	`KALSHI_KEY_ID` + PEM + `AUTO_EXEC_KALSHI=1`

Master kill switch: AUTO_EXEC_ENABLED=0 — any place_* call returns status=manual with a clear message. Nothing leaves the machine until both the master and the per-venue flag are flipped.
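The fail-closed rule amounts to a logical AND of the master flag and the per-venue flag, with anything missing treated as off. A minimal sketch: the flag names are from this page, while the venue keys and the may_execute / guard helpers are hypothetical.

```python
from typing import Mapping, Optional

# Venue keys are illustrative labels; the flag names come from this page.
VENUE_FLAGS = {
    "tasty": "AUTO_EXEC_TASTY",
    "poly": "AUTO_EXEC_POLY",
    "kalshi": "AUTO_EXEC_KALSHI",
}


def may_execute(venue: str, env: Mapping[str, str]) -> bool:
    """True only when both the master and the venue flag are exactly "1"."""
    flag = VENUE_FLAGS.get(venue)
    if flag is None:
        return False  # unknown venue: fail closed
    master_on = env.get("AUTO_EXEC_ENABLED", "0") == "1"
    venue_on = env.get(flag, "0") == "1"
    return master_on and venue_on


def guard(venue: str, env: Mapping[str, str]) -> Optional[dict]:
    """Return a manual-mode result when blocked, or None when allowed."""
    if may_execute(venue, env):
        return None
    return {"status": "manual", "message": f"auto-exec disabled for {venue}"}
```

Passing os.environ as env reproduces the boot-time behaviour described above. Comparing against the literal "1" means typos such as "true" or "yes" also leave execution off.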

Learning Loop

Every flagged event gets journaled + evaluated + bucketed; the policy cache feeds multipliers back into future signals. Four layers:

Layer	Agent	What it does
1 · Shadow journal	`journal`	Opens a `SHADOW` entry on every HIGH signal / profitable arb regardless of whether you executed. Promotes to `OPEN` in place when an action with matching `source_id` executes — same `entry_id`, full snapshot history preserved
2 · Hindsight evaluator	`hindsight`	At horizon (`HINDSIGHT_HORIZON_SECONDS`, default 1h) labels entries with `realized_capture`, `ideal_entry_lag_seconds`, `ideal_exit_lag_seconds`. Separately polls `gamma-api.polymarket.com` for resolution — when a market settles, writes `ground_truth_capture` with the real $/$-risked return and overwrites the proxy
3 · Policy aggregator	`policy`	Per `pattern_key` bucket (`symbol\|venue\|gap:bucket\|z:bucket\|sentiment:regime`) — TP rate, mean realized capture, mean MFE/MAE, suggested confidence multiplier (clipped [0.3, 1.3], shrunk toward 1.0 until n≥20), suggested exit rule. Read-only; exposed on `/policy`
4 · Feedback	`policy/feedback.py`	Correlator + arb consult the cache at emit time. `LEARN_MODE=shadow` (default) stamps a `PolicyAdjustment` audit on every `TradePlan` but doesn't mutate confidence/size. `LEARN_MODE=active` applies the multipliers (size capped at ×1.2)
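To make the multiplier rules in layers 3 and 4 concrete, here is one possible reading. The clip range [0.3, 1.3], the n≥20 threshold and the ×1.2 size cap are from the table. The linear shrinkage formula, the cap of adjusted confidence at 1.0, and both function names are assumptions, since the actual formula is not documented here.

```python
def suggested_multiplier(raw: float, n: int, full_weight_n: int = 20,
                         lo: float = 0.3, hi: float = 1.3) -> float:
    """Shrink `raw` toward 1.0 while the bucket is small, then clip."""
    # Assumption: linear shrinkage, reaching full weight at n >= 20.
    weight = min(max(n, 0) / full_weight_n, 1.0)
    shrunk = 1.0 + weight * (raw - 1.0)
    return min(max(shrunk, lo), hi)


def apply_policy(confidence: float, size_usd: float,
                 multiplier: float, learn_mode: str = "shadow"):
    """Return (confidence, size_usd) after the policy adjustment."""
    if learn_mode != "active":
        # Shadow mode: the adjustment is audited elsewhere, values untouched.
        return confidence, size_usd
    new_confidence = min(confidence * multiplier, 1.0)  # assumed cap at 1.0
    new_size = size_usd * min(multiplier, 1.2)          # size capped at x1.2
    return new_confidence, new_size
```

Under this reading, a bucket with 10 samples and a raw multiplier of 1.4 would suggest about 1.2, and an empty bucket always suggests 1.0.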

Autoresearch notes on the journal

Autoresearch subscribes to TOPIC_ACTION (EXECUTED) + TOPIC_JOURNAL. On entry open: one ideal_entry_exit note. Periodically thereafter (60s cap per entry, headline-deduped): thesis_update or risk_review notes with MFE giveback / drawdown commentary. The LLM path lives behind AGENT_AUTORESEARCH_LIVE=1; the deterministic path runs even without an API key.
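The note throttling described above (at most one note per entry per 60s, with duplicate headlines dropped) could look like the sketch below. The NoteThrottle class and its case-insensitive headline matching are hypothetical; only the 60s cap and the headline dedup come from the text.

```python
class NoteThrottle:
    """Per-entry rate cap plus headline dedup for research notes."""

    def __init__(self, min_interval_seconds: float = 60.0) -> None:
        self.min_interval = min_interval_seconds
        self._last_emit = {}  # entry_id -> timestamp of last accepted note
        self._seen = set()    # (entry_id, normalized headline) pairs

    def allow(self, entry_id: str, headline: str, now: float) -> bool:
        """Return True and record the note if it may be emitted at `now`."""
        key = (entry_id, headline.strip().lower())
        if key in self._seen:
            return False  # same headline already attached to this entry
        last = self._last_emit.get(entry_id)
        if last is not None and now - last < self.min_interval:
            return False  # still inside the per-entry cap
        self._seen.add(key)
        self._last_emit[entry_id] = now
        return True
```

Passing the clock in as now (for example time.monotonic()) keeps the throttle deterministic under test.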

Hermes Handoff

A Claude Code skill at ~/.claude/skills/hermes/ delegates coding tasks to the Hermes agent running on claws-mac-mini over Tailscale. Your local Claude Code session keeps full read/write access throughout — Hermes operates on a server-side git worktree.

Command	Action
`/hermes delegate "<task>"`	Rsync current project to `claws:~/hermes-handoffs/<project>/`, create an isolated `git worktree`, launch `hermes chat` in a detached tmux session. Returns a task-id
`/hermes status [id]`	RUNNING/STOPPED + tail log + list worktree changes
`/hermes pull [id]`	Rsync worktree back into `./.hermes/incoming/<id>/`. Shows `HANDOFF_RESULT.md` + diff vs local. You review and selectively merge
`/hermes list`	Recent task-ids + running state
`/hermes cancel <id>`	tmux kill-session

One-time remote setup:

Isolated HERMES_HOME=~/.hermes-handoff/ on claws, with GTM MCP stripped to avoid a port-10918 conflict with the always-on Hermes gateway
Codex OAuth via codex login --device-auth
Two small patches to hermes-agent/run_agent.py that fix tool-call parsing on ChatGPT-account Responses API streams

Env Variables

All AGENT_*_LIVE and AUTO_EXEC_* flags default to 0 (mock / dry-run). Every client has a mock and a live implementation; the flag selects between them at boot.

Data agents

Variable	Purpose
`AGENT_SIGNAL_LIVE`	TastyTrade REST + DXLink — needs OAuth token block
`AGENT_POLY_LIVE`	Polymarket public Gamma + CLOB reads (no wallet needed for read)
`POLY_MARKETS`	Comma-separated `slug[:LINKED_SYMBOL]` list
`AGENT_KALSHI_LIVE`	Kalshi REST — needs `KALSHI_KEY_ID` + PEM
`AGENT_SENTIMENT_LIVE`	Twitter/X filtered stream + NewsAPI
`AGENT_TIER_CORRELATOR_LIVE`	MED→HIGH lift mining from `signal_log`
`AGENT_AUTORESEARCH_LIVE`	Anthropic-powered post-mortems; deterministic notes fire regardless
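The POLY_MARKETS format (comma-separated slug[:LINKED_SYMBOL]) is simple to parse. A sketch, assuming whitespace around items should be ignored and that a missing symbol means the market is unlinked; parse_poly_markets is a hypothetical helper and the slugs in the usage note are made up.

```python
from typing import List, Optional, Tuple


def parse_poly_markets(raw: str) -> List[Tuple[str, Optional[str]]]:
    """Split "slug[:SYMBOL],..." into (slug, symbol_or_None) pairs."""
    markets = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas and blank items
        slug, _, symbol = item.partition(":")
        # partition() leaves `symbol` empty when there is no ":" in the item.
        markets.append((slug.strip(), symbol.strip() or None))
    return markets
```

For example, parse_poly_markets("example-spx-market:SPY, example-btc-market") yields one linked and one unlinked market.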

Trade plans + execution

Variable	Default	Purpose
`TRADE_UNIT_USD`	1000	Dollar-risk unit; plans size legs by `confidence × unit`
`ARB_MIN_EDGE`	0.05	Minimum \|edge\| to fire an arb
`ARB_FEE_ROUNDTRIP`	0.02	Friction haircut on net edge
`AUTO_EXEC_ENABLED`	0	Master switch — all `place_*` return `manual` when 0
`AUTO_EXEC_TASTY` / `_POLY` / `_KALSHI`	0	Per-venue enable; each also requires master=1
`TASTY_ACCOUNT_NUMBER`	—	Override; first account used if blank
`POLY_WALLET_PRIVATE_KEY`	—	32-byte hex; required for poly orders
`POLY_SIGNATURE_TYPE`	0	0=EOA, 1=POLY_PROXY, 2=GNOSIS_SAFE
`KALSHI_KEY_ID` / `KALSHI_PRIVATE_KEY_PATH`	—	Required for Kalshi orders
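As a worked example of how the first three variables might combine: leg size is confidence × TRADE_UNIT_USD, as stated in the table. How the fee haircut and the minimum edge interact is not spelled out, so the sketch assumes the fee is subtracted from the absolute gap and the threshold is applied to the result. All three function names are hypothetical.

```python
def leg_size_usd(confidence: float, unit_usd: float = 1000.0) -> float:
    """Dollar risk for a leg: confidence x TRADE_UNIT_USD."""
    return confidence * unit_usd


def net_edge(poly_yes: float, options_implied: float,
             fee_roundtrip: float = 0.02) -> float:
    """Absolute probability gap minus the ARB_FEE_ROUNDTRIP haircut."""
    return abs(poly_yes - options_implied) - fee_roundtrip


def should_fire(poly_yes: float, options_implied: float,
                min_edge: float = 0.05, fee_roundtrip: float = 0.02) -> bool:
    # Assumption: ARB_MIN_EDGE is compared against the net (post-fee) edge.
    return net_edge(poly_yes, options_implied, fee_roundtrip) >= min_edge
```

With the defaults, a Polymarket YES price of 0.62 against an options-implied 0.50 leaves a net edge of about 0.10 and fires. A 0.55 versus 0.50 gap nets about 0.03 and does not.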

QA, hindsight, learning

Variable	Default	Purpose
`QA_MAX_AGE_SECONDS`	120	Reject the signal if it is older than this many seconds
`QA_MIN_CONFIDENCE`	0.75	Minimum signal confidence to enqueue
`QA_MIN_EDGE`	0.05	Minimum \|edge\|
`QA_COOLDOWN_SECONDS`	300	Per-symbol cooldown
`QA_DAILY_CAP_USD`	5000	Daily risk cap
`HINDSIGHT_HORIZON_SECONDS`	3600	When proxy outcomes get labelled
`HINDSIGHT_CADENCE_SECONDS`	60	Sweep + resolution-lookup interval
`LEARN_MODE`	shadow	`shadow` = audit only; `active` = apply multipliers
`AUTORESEARCH_MODELS`	claude-opus-4-6,claude-opus-4-7	Drift probe pair
`AUTORESEARCH_MODEL_DEEP` / `_FAST` / `_HAIKU`	4-7 / 4-6 / haiku-4-5	Token-router slots for the forthcoming router import

Message Topics (Pydantic schemas)

Topic	Publisher(s)	Subscriber(s)
`quote.update`	signal	correlator, arb, journal
`odds.update`	poly, kalshi	correlator, arb, journal
`market.microstructure`	poly, kalshi	sniffer
`sentiment`	sentiment	correlator
`signal`	correlator	dispatcher, qa, journal, tier_correlator
`arb.poly_tasty`	arb	dispatcher, qa, journal
`action.proposed`	qa, bridge (state transitions)	dispatcher, journal
`journal.entry`	journal, hindsight (replay)	dispatcher, autoresearch, policy
`journal.research`	autoresearch	journal (appends note)
`policy.stat`	policy	dispatcher
`tier.lift`	tier_correlator	autoresearch
`model.drift`	autoresearch	downstream attribution
`counterparty.fingerprint`	sniffer	dispatcher
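The topic table describes a typed publish/subscribe fan-out. The sketch below shows the general shape with asyncio queues. It uses a standard-library dataclass in place of the project's Pydantic models so that it runs without dependencies. The Bus class, the QuoteUpdate fields and the SCHEMAS registry are all hypothetical.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QuoteUpdate:  # stand-in payload; the real schema is not shown here
    symbol: str
    implied_prob: float


SCHEMAS = {"quote.update": QuoteUpdate}  # topic -> required message type


class Bus:
    """In-process bus: each subscriber gets its own queue per topic."""

    def __init__(self) -> None:
        self._subs = defaultdict(list)

    def subscribe(self, topic: str) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._subs[topic].append(queue)
        return queue

    async def publish(self, topic: str, msg: object) -> None:
        expected = SCHEMAS.get(topic)
        if expected is None or not isinstance(msg, expected):
            raise TypeError(f"invalid message for topic {topic!r}")
        for queue in self._subs[topic]:
            await queue.put(msg)  # fan out to every subscriber


async def demo():
    bus = Bus()
    correlator_inbox = bus.subscribe("quote.update")
    arb_inbox = bus.subscribe("quote.update")
    await bus.publish("quote.update", QuoteUpdate("SPY", 0.57))
    return await correlator_inbox.get(), await arb_inbox.get()
```

Running asyncio.run(demo()) delivers the same QuoteUpdate to both inboxes, mirroring how quote.update reaches several subscribers in the table.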

Confidence tiers: HIGH > 0.75 · MED 0.45–0.75 · LOW < 0.45.
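Read literally, the tier boundaries translate to the function below. It is a sketch with a hypothetical name, tier_for, and the handling of the exact boundary values follows the inequalities as written.

```python
def tier_for(confidence: float) -> str:
    """Map a confidence score to HIGH / MED / LOW per the stated bounds."""
    if confidence > 0.75:
        return "HIGH"
    if confidence >= 0.45:
        return "MED"  # 0.45 through 0.75 inclusive
    return "LOW"
```

Under this reading a confidence of exactly 0.75 is MED, even though it would satisfy the QA confidence gate (≥ 0.75).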

Going Live (one agent at a time)

Populate one agent's credential block in .env (start with TastyTrade — OAuth already scaffolded by scripts/oauth_login.py).
Flip its flag: AGENT_SIGNAL_LIVE=1. Restart that agent; the rest stay mocked.
Capture real payloads with scripts/record_fixture.py for reproducible dev.
Expand coverage via POLY_MARKETS=slug1[:SYM],slug2[:SYM],…. Current config: 6 SPX → SPY markets + 2 NDX → QQQ markets + 2 BTC markets.
Once 50+ entries have reached EVALUATED status in the journal, flip LEARN_MODE=active to let policy multipliers adjust live confidence + size.
Only then consider flipping any AUTO_EXEC_* flag — and start with Tasty's options leg (defined-risk) before touching prediction venues.

Operational Checklist

TastyTrade OAuth refresh tested (24h token expiry) — scripts/refresh_token.py
>100 entries in signal_log; false-positive rate reviewed by tier
>50 entries in journal_entry_log reached EVALUATED before flipping LEARN_MODE=active
Bridge reachable: curl http://127.0.0.1:18799/api/actions returns JSON
All AUTO_EXEC_* flags confirmed 0 in a production .env audit before every session
pytest -q green locally; verify CI runs on every PR
Twitter rate limits monitored (<500k tweets/month on Basic)
Slack / Discord webhooks tested on a HIGH-tier dry run
.env, keys/, .hermes/, .wrangler/ in .gitignore; rolling dashboard feeds also ignored
CF Pages dashboard deployed; arch page at organized-market-arch.pages.dev current
Hermes gateway + codex OAuth valid on claws-mac-mini; ssh claws true succeeds over Tailscale
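The AUTO_EXEC_* audit in the checklist can be automated. This is a minimal sketch that scans .env-style text and reports any AUTO_EXEC_* key not set to 0. It assumes simple KEY=value lines (optional export prefix and quotes, no inline comments), and enabled_exec_flags is a hypothetical helper, not a project script.

```python
from typing import List


def enabled_exec_flags(env_text: str) -> List[str]:
    """Return AUTO_EXEC_* keys whose value is anything other than "0"."""
    enabled = []
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments and malformed lines
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        value = value.strip().strip("'\"")
        if key.startswith("AUTO_EXEC_") and value != "0":
            enabled.append(key)
    return enabled
```

An empty result from enabled_exec_flags(open(".env").read()) means every execution flag present in the file is off. Flags absent from the file default to 0 per the Env Variables section.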

OrganizedMarket surfaces signals and can execute defined-risk trades when its kill switches are explicitly flipped. Default posture is paper-trade-and-learn: every HIGH opportunity journals regardless of execution, so the policy loop gains reps even when you sit out.