
Setup & Reference

Everything needed to run the full pipeline — signal ingest, cross-venue arb detection, deterministic trade plans, 2-click execution with QA gating, journaling, ground-truth hindsight, and policy feedback — plus the Claude Code ↔ Hermes handoff skill for delegating build tasks to a remote agent.

Quickstart

pip install -r requirements.txt
npm install
cp .env.example .env

python scripts/seed_db.py          # creates organizedmarket.db
python scripts/dev.py              # boots the pipeline (13 agents + bridge)
# another terminal:
npx wrangler pages dev dashboard --port 8789
# open http://localhost:8789

Within seconds dashboard/data/*.json start filling and every dashboard page goes live. Inspect SQLite at any time:

sqlite3 organizedmarket.db "select tier, count(*) from signal_log group by tier;"
sqlite3 organizedmarket.db "select status, count(*) from journal_entry_log group by status;"

Dashboards

All six pages poll their feed every 2s over CF Pages and highlight HIGH-tier items prominently (teal border, pulse animation).

Route | Purpose | Feed
/dashboard | Live signal feed: gap, z, tier, confidence, trade-plan card with copy-ticket + deeplinks | data/signals.json
/actions | 2-click action queue: QA assessment per-check, Approve QA → Place Trade, per-leg execution status | data/actions.json
/arb | Cross-venue edges (Polymarket YES vs TastyTrade options-implied prob), Lightweight Charts triple-line (poly / tasty / edge), symbol + market dropdowns | data/arb.json
/journal | Trading journal: one entry per HIGH signal or profitable arb (shadow even if not executed). Live MFE / MAE, ideal entry/exit notes, autoresearch block with post-exec + periodic thesis reviews | data/journal.json
/policy | Per-bucket learning stats: TP rate, mean realized capture, suggested confidence multiplier, suggested exit rule. Read-only surface of the feedback loop | data/policy.json
/signals?id=… | Single-signal detail view | derived
/ | Architecture overview (this deploy: organized-market-arch.pages.dev) | static

Sidebar is consistent across every page; the active one is highlighted in teal. Copy buttons (plan copy, ticket copy, approve/execute) use navigator.clipboard — no server round-trip. Deeplinks open venue market pages in a new tab. Your local project is never mutated by the dashboard; it only reads rolling JSON feeds the agent pipeline writes.

ClawBox desktop app

clawbox-app/ is a Tauri 2 wrapper that mirrors the HIGH-tier hero card + action list from the web dashboard using the same plan.js renderer. Build once with cd clawbox-app && npm run build; rebuild after dashboard asset changes.

Agent Pipeline (13 agents)

All boot under scripts/dev.py on a shared in-process asyncio bus with Pydantic-validated topics.

Agent | Role | Publishes
signal | TastyTrade market-data + market-metrics polling → options-implied probability per symbol | quote.update
poly | Polymarket Gamma + CLOB HTTP reads (no wallet required). CLI-first with HTTP fallback. Populates market_slug for resolution lookups | odds.update, market.microstructure
kalshi | Kalshi odds (mock mode until the RSA-JWT live client is wired) | odds.update
sentiment | Twitter/X + NewsAPI with Claude-NLP scoring | sentiment
correlator | Welford-rolling divergence, z-scored tier. Consults policy feedback cache before emit | signal
arb | Polymarket × TastyTrade edge detector with Black-based calibrated probabilities. Deduplicates micro-moves. Tags a calibration record | arb.poly_tasty
qa | Gates HIGH signals + profitable arbs through freshness / confidence / edge / cooldown / daily-budget / dedup checks → ProposedAction | action.proposed
journal | Shadow-journals every flagged event, promotes shadow→open on EXECUTED, tracks MFE/MAE + ideal entry/exit on every snapshot | journal.entry
hindsight | Labels matured entries at horizon; swaps in real PnL when Polymarket resolves | journal.entry (replay)
policy | Per-bucket aggregate stats + suggested multipliers / exit rules | policy.stat
tier_correlator | MED→HIGH lift mining on signal_log | tier.lift
autoresearch | Model-drift probes + periodic journal thesis/risk notes (dedup'd, 60s cap per entry) | model.drift, journal.research
sniffer | Counterparty fingerprinting from microstructure + signal cross-ref | counterparty.fingerprint
dispatcher | Terminal subscriber: routes each topic to its sinks (dashboard JSON + SQLite + optional Slack/Discord) | (none)
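
The correlator row mentions Welford-rolling divergence with a z-scored tier. As a rough illustration only, the sketch below shows Welford's online mean/variance update and a z-score computed against the history seen so far. The class name `RunningStats` and its methods are hypothetical, and this is the plain cumulative form of the algorithm; the actual correlator may use a windowed or decayed variant.

```python
import math


class RunningStats:
    """Welford's online algorithm: one-pass mean and variance in O(1) memory.

    Hypothetical helper for illustration; not the project's correlator code.
    """

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Standard Welford update: adjust the mean, then accumulate m2
        # using the deviation before and after the mean moved.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self) -> float:
        # Sample standard deviation; reported as 0.0 below two points.
        if self.n < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.n - 1))

    def zscore(self, x: float) -> float:
        # z-score of x against the history so far (0.0 if there is no spread).
        s = self.std()
        return 0.0 if s == 0.0 else (x - self.mean) / s
```

A caller would typically compute `zscore(gap)` for a new divergence value first and then call `update(gap)`, so the new point is scored against prior history rather than against itself.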

2-Click Agentic Execution

Every ProposedAction flows through a state machine. Two explicit human clicks before any order hits a venue.

State machine

PROPOSED ──▶ QA_PASSED ───click 1──▶ APPROVED ───click 2──▶ EXECUTING ──▶ EXECUTED
          └─▶ QA_REJECTED                                              └─▶ FAILED
                                                                       └─▶ EXPIRED
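
One way to read the diagram above is as a table of allowed transitions. The sketch below is an illustration under that reading, not the bridge's actual code: the names `ActionState`, `ALLOWED`, and `transition` are hypothetical, and attaching EXPIRED to EXECUTING simply follows where the branch is drawn, which may not match the real expiry rule.

```python
from enum import Enum


class ActionState(str, Enum):
    PROPOSED = "PROPOSED"
    QA_PASSED = "QA_PASSED"
    QA_REJECTED = "QA_REJECTED"
    APPROVED = "APPROVED"
    EXECUTING = "EXECUTING"
    EXECUTED = "EXECUTED"
    FAILED = "FAILED"
    EXPIRED = "EXPIRED"


S = ActionState

# Allowed transitions as read from the diagram. States without an entry
# (QA_REJECTED, EXECUTED, FAILED, EXPIRED) are terminal.
ALLOWED = {
    S.PROPOSED: {S.QA_PASSED, S.QA_REJECTED},
    S.QA_PASSED: {S.APPROVED},   # click 1
    S.APPROVED: {S.EXECUTING},   # click 2
    S.EXECUTING: {S.EXECUTED, S.FAILED, S.EXPIRED},  # EXPIRED placed as drawn (assumption)
}


def transition(current: ActionState, target: ActionState) -> ActionState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    return target
```

Keeping the table explicit means a request that tries to skip click 1 (QA_PASSED straight to EXECUTING) fails loudly instead of reaching a venue.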

QA gates (deterministic)

Bridge server

scripts/bridge.py runs alongside the agents on http://127.0.0.1:18799.

HTTP | Purpose
GET /api/actions | Full queue
GET /api/actions/{id} | Single action detail
POST /api/approve/{id} | QA_PASSED → APPROVED
POST /api/execute/{id} | APPROVED → EXECUTING → EXECUTED/FAILED. Fires both legs; publishes back to the bus so journal + policy pick it up
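
For scripting the two clicks outside the dashboard, a small client along these lines could work. It is a sketch under assumptions: the routes and port come from the table above, but the empty POST body and JSON response are guesses about the bridge, and `build_request` / `post_action` are hypothetical names.

```python
import json
import urllib.parse
import urllib.request

BRIDGE_URL = "http://127.0.0.1:18799"  # address given for scripts/bridge.py


def build_request(action: str, action_id: str) -> urllib.request.Request:
    """Build the POST for click 1 ("approve") or click 2 ("execute")."""
    if action not in ("approve", "execute"):
        raise ValueError(f"unknown action: {action!r}")
    safe_id = urllib.parse.quote(action_id, safe="")
    url = f"{BRIDGE_URL}/api/{action}/{safe_id}"
    # Assumption: the bridge accepts an empty request body.
    return urllib.request.Request(url, data=b"", method="POST")


def post_action(action: str, action_id: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the reply (assumed to be JSON)."""
    req = build_request(action, action_id)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the bridge running, `post_action("approve", some_id)` followed by `post_action("execute", some_id)` would mirror the two dashboard clicks, where `some_id` is an id taken from GET /api/actions.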

Executor layer — fail-closed

Venue | Path | Needs
TastyTrade (options vertical) | agents/executor/tasty_orders.py: chain resolution + 50%-width limit debit/credit spread | OAuth (already in .env) + AUTO_EXEC_TASTY=1
Polymarket (YES/NO) | agents/executor/poly_orders.py via py-clob-client | POLY_WALLET_PRIVATE_KEY + Polygon USDC + AUTO_EXEC_POLY=1
Kalshi (YES/NO) | agents/executor/kalshi_orders.py: RSA-PSS signed headers | KALSHI_KEY_ID + PEM + AUTO_EXEC_KALSHI=1

Master kill switch: while AUTO_EXEC_ENABLED=0, every place_* call returns status=manual with a clear message. No order leaves the machine until both the master flag and the relevant per-venue flag are set to 1.
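
The fail-closed rule can be pictured as a single gate that every order path passes through. The following is a minimal sketch, assuming the flags are read as the strings "0"/"1" from the environment; `execution_allowed`, `place_order`, the venue keys, and the "submitted" status are hypothetical stand-ins for the real place_* functions.

```python
from typing import Dict, Mapping

# Per-venue flags named in this document; the short venue keys are assumptions.
VENUE_FLAGS = {
    "tasty": "AUTO_EXEC_TASTY",
    "poly": "AUTO_EXEC_POLY",
    "kalshi": "AUTO_EXEC_KALSHI",
}


def execution_allowed(venue: str, env: Mapping[str, str]) -> bool:
    """True only when the master switch AND the venue's own flag are "1"."""
    flag = VENUE_FLAGS.get(venue)
    if flag is None:
        return False  # unknown venue: fail closed
    master_on = env.get("AUTO_EXEC_ENABLED", "0") == "1"
    venue_on = env.get(flag, "0") == "1"
    return master_on and venue_on


def place_order(venue: str, env: Mapping[str, str]) -> Dict[str, str]:
    """Hypothetical stand-in for a place_* call."""
    if not execution_allowed(venue, env):
        return {
            "status": "manual",
            "message": f"auto-exec disabled for {venue}; place the order by hand",
        }
    # A real executor would submit to the venue here (placeholder status).
    return {"status": "submitted"}
```

Passing `os.environ` as `env` gives the live behaviour; passing a plain dict keeps the gate easy to test. Missing or malformed flags count as off, which is the point of failing closed.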

Learning Loop

Every flagged event gets journaled + evaluated + bucketed; the policy cache feeds multipliers back into future signals. Four layers:

Layer | Agent | What it does
1 · Shadow journal | journal | Opens a SHADOW entry on every HIGH signal / profitable arb regardless of whether you executed. Promotes to OPEN in place when an action with matching source_id executes (same entry_id, full snapshot history preserved)
2 · Hindsight evaluator | hindsight | At horizon (HINDSIGHT_HORIZON_SECONDS, default 1h) labels entries with realized_capture, ideal_entry_lag_seconds, ideal_exit_lag_seconds. Separately polls gamma-api.polymarket.com for resolution; when a market settles, writes ground_truth_capture with the real $/$-risked return and overwrites the proxy
3 · Policy aggregator | policy | Per pattern_key bucket (symbol|venue|gap:bucket|z:bucket|sentiment:regime): TP rate, mean realized capture, mean MFE/MAE, suggested confidence multiplier (clipped [0.3, 1.3], shrunk toward 1.0 until n≥20), suggested exit rule. Read-only; exposed on /policy
4 · Feedback | policy/feedback.py | Correlator + arb consult the cache at emit time. LEARN_MODE=shadow (default) stamps a PolicyAdjustment audit on every TradePlan but doesn't mutate confidence/size. LEARN_MODE=active applies the multipliers (size capped at ×1.2)
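
To make the policy row concrete, here is a hedged sketch of a pattern_key builder and a confidence multiplier that is shrunk toward 1.0 for small samples and then clipped to [0.3, 1.3]. The key layout and the clip bounds come from the table above; the linear shrinkage weight n/20 is an assumption (the table does not state the shrinkage formula), and the function names are hypothetical.

```python
def pattern_key(symbol: str, venue: str, gap_bucket: str,
                z_bucket: str, sentiment_regime: str) -> str:
    """Compose the bucket key in the layout shown in the table."""
    return (f"{symbol}|{venue}|gap:{gap_bucket}"
            f"|z:{z_bucket}|sentiment:{sentiment_regime}")


def suggested_multiplier(raw: float, n: int, full_weight_n: int = 20,
                         lo: float = 0.3, hi: float = 1.3) -> float:
    """Shrink a raw multiplier toward 1.0 for small n, then clip.

    ASSUMPTION: shrinkage is linear in n and reaches full weight at
    n >= full_weight_n. The real aggregator may use a different scheme.
    """
    weight = min(max(n, 0), full_weight_n) / full_weight_n
    shrunk = 1.0 + weight * (raw - 1.0)
    return max(lo, min(hi, shrunk))
```

Under this assumption a bucket with no history always yields 1.0, so a sparse bucket cannot swing confidence much. In LEARN_MODE=shadow the result would only be recorded for audit, not applied.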

Autoresearch notes on the journal

Autoresearch subscribes to TOPIC_ACTION (EXECUTED) + TOPIC_JOURNAL. On entry open: one ideal_entry_exit note. Periodically thereafter (60s cap per entry, headline-deduped): thesis_update or risk_review notes with MFE giveback / drawdown commentary. The LLM path lives behind AGENT_AUTORESEARCH_LIVE=1; the deterministic path runs even without an API key.
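
The per-entry cap and headline dedup described above could be implemented roughly as follows. This is a sketch, assuming "60s cap per entry" means at most one periodic note per entry per 60 seconds and that dedup compares exact headline strings; `NoteThrottle` is a hypothetical name.

```python
from typing import Dict, Set


class NoteThrottle:
    """Decide whether a periodic note should be emitted for a journal entry."""

    def __init__(self, min_interval_s: float = 60.0) -> None:
        self.min_interval_s = min_interval_s
        self._last_emit: Dict[str, float] = {}  # entry_id -> last emit time
        self._seen: Dict[str, Set[str]] = {}    # entry_id -> headlines already used

    def should_emit(self, entry_id: str, headline: str, now: float) -> bool:
        seen = self._seen.setdefault(entry_id, set())
        if headline in seen:
            return False  # headline-deduped
        last = self._last_emit.get(entry_id)
        if last is not None and now - last < self.min_interval_s:
            return False  # still inside this entry's rate cap
        seen.add(headline)
        self._last_emit[entry_id] = now
        return True
```

Passing `now` explicitly (for example `time.monotonic()`) keeps the throttle deterministic under test.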

Hermes Handoff

A Claude Code skill at ~/.claude/skills/hermes/ delegates coding tasks to the Hermes agent running on claws-mac-mini over Tailscale. Your local Claude Code session keeps full read/write access throughout — Hermes operates on a server-side git worktree.

Command | Action
/hermes delegate "<task>" | Rsync current project to claws:~/hermes-handoffs/<project>/, create an isolated git worktree, launch hermes chat in a detached tmux session. Returns a task-id
/hermes status [id] | RUNNING/STOPPED + tail log + list worktree changes
/hermes pull [id] | Rsync worktree back into ./.hermes/incoming/<id>/. Shows HANDOFF_RESULT.md + diff vs local. You review and selectively merge
/hermes list | Recent task-ids + running state
/hermes cancel <id> | tmux kill-session

One-time remote setup: isolated HERMES_HOME=~/.hermes-handoff/ on claws with GTM MCP stripped to avoid port-10918 conflict with the always-on Hermes gateway; Codex OAuth via codex login --device-auth; two small patches to hermes-agent/run_agent.py to fix tool-call parsing on ChatGPT-account Responses API streams.

Env Variables

All AGENT_*_LIVE and AUTO_EXEC_* flags default to 0 (mock / dry-run). Every client has mock and live implementations; the flag selects at boot.

Data agents

Variable | Purpose
AGENT_SIGNAL_LIVE | TastyTrade REST + DXLink; needs OAuth token block
AGENT_POLY_LIVE | Polymarket public Gamma + CLOB reads (no wallet needed for read)
POLY_MARKETS | Comma-separated slug[:LINKED_SYMBOL] list
AGENT_KALSHI_LIVE | Kalshi REST; needs KALSHI_KEY_ID + PEM
AGENT_SENTIMENT_LIVE | Twitter/X filtered stream + NewsAPI
AGENT_TIER_CORRELATOR_LIVE | MED→HIGH lift mining from signal_log
AGENT_AUTORESEARCH_LIVE | Anthropic-powered post-mortems; deterministic notes fire regardless
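
The POLY_MARKETS format (comma-separated slug[:LINKED_SYMBOL]) is simple to parse. The sketch below is one possible reading, with `parse_poly_markets` as a hypothetical helper; it treats the linked symbol as optional and ignores empty items and stray whitespace.

```python
from typing import List, Optional, Tuple


def parse_poly_markets(raw: str) -> List[Tuple[str, Optional[str]]]:
    """Parse "slug1[:SYM],slug2[:SYM]" into (slug, symbol-or-None) pairs."""
    markets: List[Tuple[str, Optional[str]]] = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas and blank items
        slug, _, symbol = item.partition(":")
        symbol = symbol.strip()
        markets.append((slug.strip(), symbol if symbol else None))
    return markets
```

For example, `parse_poly_markets(os.environ.get("POLY_MARKETS", ""))` would give the list of markets to track; the slugs used in any test are made up.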

Trade plans + execution

Variable | Default | Purpose
TRADE_UNIT_USD | 1000 | Dollar-risk unit; plans size legs by confidence × unit
ARB_MIN_EDGE | 0.05 | Minimum |edge| to fire an arb
ARB_FEE_ROUNDTRIP | 0.02 | Friction haircut on net edge
AUTO_EXEC_ENABLED | 0 | Master switch: all place_* return manual when 0
AUTO_EXEC_TASTY / _POLY / _KALSHI | 0 | Per-venue enable; each also requires master=1
TASTY_ACCOUNT_NUMBER | (none) | Override; first account used if blank
POLY_WALLET_PRIVATE_KEY | (none) | 32-byte hex; required for poly orders
POLY_SIGNATURE_TYPE | 0 | 0=EOA, 1=POLY_PROXY, 2=GNOSIS_SAFE
KALSHI_KEY_ID / KALSHI_PRIVATE_KEY_PATH | (none) | Required for Kalshi orders
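
As a worked illustration of the sizing and edge settings, the sketch below uses the defaults from the table. It assumes the ARB_MIN_EDGE threshold applies to the gross |edge| and that the fee haircut is a plain subtraction floored at zero; neither detail is spelled out here, and the function names are hypothetical.

```python
def leg_size_usd(confidence: float, unit_usd: float = 1000.0) -> float:
    """Dollar risk for a leg: confidence × TRADE_UNIT_USD."""
    return confidence * unit_usd


def net_edge(edge: float, fee_roundtrip: float = 0.02) -> float:
    """|edge| after the round-trip friction haircut (ASSUMED: simple subtraction)."""
    return max(abs(edge) - fee_roundtrip, 0.0)


def arb_fires(edge: float, min_edge: float = 0.05) -> bool:
    """ASSUMED: the minimum applies to gross |edge|, as the table wording suggests."""
    return abs(edge) >= min_edge
```

With the defaults, a confidence of 0.8 sizes a leg at about $800, and a raw edge of -0.07 passes the threshold with roughly 0.05 left after fees.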

QA, hindsight, learning

Variable | Default | Purpose
QA_MAX_AGE_SECONDS | 120 | Freshness check: reject a signal or arb older than this
QA_MIN_CONFIDENCE | 0.75 | Minimum signal confidence to enqueue
QA_MIN_EDGE | 0.05 | Minimum |edge|
QA_COOLDOWN_SECONDS | 300 | Per-symbol cooldown
QA_DAILY_CAP_USD | 5000 | Daily risk cap
HINDSIGHT_HORIZON_SECONDS | 3600 | When proxy outcomes get labelled
HINDSIGHT_CADENCE_SECONDS | 60 | Sweep + resolution-lookup interval
LEARN_MODE | shadow | shadow = audit only; active = apply multipliers
AUTORESEARCH_MODELS | claude-opus-4-6,claude-opus-4-7 | Drift probe pair
AUTORESEARCH_MODEL_DEEP / _FAST / _HAIKU | 4-7 / 4-6 / haiku-4-5 | Token-router slots for the forthcoming router import
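
The QA thresholds above lend themselves to a per-check result map, which is also what the /actions page displays. The sketch below is illustrative only: `Candidate`, `QAConfig`, and `run_qa` are hypothetical names, the inclusive comparisons are assumptions, and the dedup check is left out because its rule is not described here.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Candidate:
    symbol: str
    confidence: float
    edge: float
    age_seconds: float
    risk_usd: float


@dataclass
class QAConfig:
    # Defaults mirror the QA_* values in the table.
    max_age_seconds: float = 120.0
    min_confidence: float = 0.75
    min_edge: float = 0.05
    cooldown_seconds: float = 300.0
    daily_cap_usd: float = 5000.0


def run_qa(c: Candidate, cfg: QAConfig,
           seconds_since_last: Optional[float],
           spent_today_usd: float) -> Dict[str, bool]:
    """Return one pass/fail flag per deterministic check (dedup omitted)."""
    return {
        "freshness": c.age_seconds <= cfg.max_age_seconds,
        "confidence": c.confidence >= cfg.min_confidence,
        "edge": abs(c.edge) >= cfg.min_edge,
        # None means this symbol has not fired before.
        "cooldown": seconds_since_last is None
        or seconds_since_last >= cfg.cooldown_seconds,
        "daily_budget": spent_today_usd + c.risk_usd <= cfg.daily_cap_usd,
    }


def qa_passed(checks: Dict[str, bool]) -> bool:
    return all(checks.values())
```

Returning the full map rather than a single boolean lets a rejected action show exactly which gate failed.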

Message Topics (Pydantic schemas)

Topic | Publisher(s) | Subscriber(s)
quote.update | signal | correlator, arb, journal
odds.update | poly, kalshi | correlator, arb, journal
market.microstructure | poly, kalshi | sniffer
sentiment | sentiment | correlator
signal | correlator | dispatcher, qa, journal, tier_correlator
arb.poly_tasty | arb | dispatcher, qa, journal
action.proposed | qa, bridge (state transitions) | dispatcher, journal
journal.entry | journal, hindsight (replay) | dispatcher, autoresearch, policy
journal.research | autoresearch | journal (appends note)
policy.stat | policy | dispatcher
tier.lift | tier_correlator | autoresearch
model.drift | autoresearch | downstream attribution
counterparty.fingerprint | sniffer | dispatcher

Confidence tiers: HIGH > 0.75 · MED 0.45–0.75 · LOW < 0.45.
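
The tier boundaries translate directly into a small function. This sketch follows the thresholds as written (HIGH strictly above 0.75, LOW strictly below 0.45, MED in between); `confidence_tier` is a hypothetical name.

```python
def confidence_tier(confidence: float) -> str:
    """Map a confidence score to HIGH / MED / LOW using the stated cut-offs."""
    if confidence > 0.75:
        return "HIGH"
    if confidence >= 0.45:
        return "MED"
    return "LOW"
```

Note that a confidence of exactly 0.75 lands in MED under this reading.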

Going Live (one agent at a time)

  1. Populate one agent's credential block in .env (start with TastyTrade — OAuth already scaffolded by scripts/oauth_login.py).
  2. Flip its flag: AGENT_SIGNAL_LIVE=1. Restart that agent; the rest stay mocked.
  3. Capture real payloads with scripts/record_fixture.py for reproducible dev.
  4. Expand coverage via POLY_MARKETS=slug1[:SYM],slug2[:SYM],…. Current config: 6 SPX → SPY markets + 2 NDX → QQQ markets + 2 BTC markets.
  5. Once 50+ entries have reached EVALUATED status in the journal, flip LEARN_MODE=active to let policy multipliers adjust live confidence + size.
  6. Only then consider flipping any AUTO_EXEC_* flag — and start with Tasty's options leg (defined-risk) before touching prediction venues.
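
Step 5 depends on knowing how many journal entries have reached EVALUATED. A quick check could look like the sketch below, which assumes the journal_entry_log table from the Quickstart stores that status as the literal string 'EVALUATED'; the function names are hypothetical.

```python
import sqlite3


def evaluated_count(conn: sqlite3.Connection) -> int:
    """Count journal entries whose status is EVALUATED (assumed literal value)."""
    row = conn.execute(
        "SELECT COUNT(*) FROM journal_entry_log WHERE status = ?",
        ("EVALUATED",),
    ).fetchone()
    return int(row[0])


def ready_for_active_learning(conn: sqlite3.Connection, minimum: int = 50) -> bool:
    """True once at least `minimum` entries have been evaluated."""
    return evaluated_count(conn) >= minimum
```

Against the local database this would be called as `ready_for_active_learning(sqlite3.connect("organizedmarket.db"))` before changing LEARN_MODE.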

Operational Checklist

OrganizedMarket surfaces signals and can execute defined-risk trades when its kill switches are explicitly flipped. Default posture is paper-trade-and-learn: every HIGH opportunity journals regardless of execution, so the policy loop gains reps even when you sit out.