Setup & Reference
Everything needed to run the full pipeline — signal ingest, cross-venue arb detection, deterministic trade plans, 2-click execution with QA gating, journaling, ground-truth hindsight, and policy feedback — plus the Claude Code ↔ Hermes handoff skill for delegating build tasks to a remote agent.
Quickstart
pip install -r requirements.txt
npm install
cp .env.example .env
python scripts/seed_db.py # creates organizedmarket.db
python scripts/dev.py # boots the pipeline (13 agents + bridge)
# another terminal:
npx wrangler pages dev dashboard --port 8789
# open http://localhost:8789
Within seconds, the dashboard/data/*.json feeds start filling and every dashboard page goes live. Inspect SQLite at any time:
sqlite3 organizedmarket.db "select tier, count(*) from signal_log group by tier;"
sqlite3 organizedmarket.db "select status, count(*) from journal_entry_log group by status;"
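The same two checks can be scripted from Python when a shell is not handy. This is a minimal sketch: the table and column names come from the queries above, while the helper name count_by is hypothetical and not part of the repo.

```python
import sqlite3


def count_by(conn: sqlite3.Connection, table: str, column: str) -> dict:
    """Return {value: row_count} for one column, mirroring the GROUP BY queries above."""
    # Identifiers cannot be bound as SQL parameters, so only the two known pairs are allowed.
    allowed = {("signal_log", "tier"), ("journal_entry_log", "status")}
    if (table, column) not in allowed:
        raise ValueError(f"unexpected table/column: {table}.{column}")
    rows = conn.execute(f"SELECT {column}, COUNT(*) FROM {table} GROUP BY {column}")
    return {value: n for value, n in rows}


# Usage against the seeded database:
#   conn = sqlite3.connect("organizedmarket.db")
#   print(count_by(conn, "signal_log", "tier"))
#   print(count_by(conn, "journal_entry_log", "status"))
```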
Dashboards
All six pages poll their feed every 2s over CF Pages and highlight HIGH-tier items prominently (teal border, pulse animation).
| Route | Purpose | Feed |
|---|---|---|
/dashboard | Live signal feed — gap, z, tier, confidence, trade-plan card with copy-ticket + deeplinks | data/signals.json |
/actions | 2-click action queue: QA assessment per-check, Approve QA → Place Trade, per-leg execution status | data/actions.json |
/arb | Cross-venue edges (Polymarket YES vs TastyTrade options-implied prob), Lightweight Charts triple-line (poly / tasty / edge), symbol + market dropdowns | data/arb.json |
/journal | Trading journal — one entry per HIGH signal OR profitable arb (shadow even if not executed). Live MFE / MAE, ideal entry/exit notes, autoresearch block with post-exec + periodic thesis reviews | data/journal.json |
/policy | Per-bucket learning stats: TP rate, mean realized capture, suggested confidence multiplier, suggested exit rule. Read-only surface of the feedback loop | data/policy.json |
/signals?id=… | Single-signal detail view | derived |
/ | Architecture overview (this deploy: organized-market-arch.pages.dev) | static |
Sidebar is consistent across every page; the active one is highlighted in teal. Copy buttons (plan copy, ticket copy, approve/execute) use navigator.clipboard — no server round-trip. Deeplinks open venue market pages in a new tab. Your local project is never mutated by the dashboard; it only reads rolling JSON feeds the agent pipeline writes.
ClawBox desktop app
clawbox-app/ is a Tauri 2 wrapper that mirrors the HIGH-tier hero card + action list from the web dashboard using the same plan.js renderer. Build once with cd clawbox-app && npm run build; rebuild after dashboard asset changes.
Agent Pipeline (13 agents)
All boot under scripts/dev.py on a shared in-process asyncio bus with Pydantic-validated topics.
| Agent | Role | Publishes |
|---|---|---|
signal | TastyTrade market-data + market-metrics polling → options-implied probability per symbol | quote.update |
poly | Polymarket Gamma + CLOB HTTP reads (no wallet required). CLI-first with HTTP fallback. Populates market_slug for resolution lookups | odds.update, market.microstructure |
kalshi | Kalshi odds (mock mode until the RSA-JWT live client is wired) | odds.update |
sentiment | Twitter/X + NewsAPI with Claude-NLP scoring | sentiment |
correlator | Welford-rolling divergence, z-scored tier. Consults policy feedback cache before emit | signal |
arb | Polymarket × TastyTrade edge detector with Black-based calibrated probabilities. Deduplicates micro-moves. Tags a calibration record | arb.poly_tasty |
qa | Gates HIGH signals + profitable arbs through freshness / confidence / edge / cooldown / daily-budget / dedup checks → ProposedAction | action.proposed |
journal | Shadow-journals every flagged event, promotes shadow→open on EXECUTED, tracks MFE/MAE + ideal entry/exit on every snapshot | journal.entry |
hindsight | Labels matured entries at horizon; swaps in real PnL when Polymarket resolves | journal.entry (replay) |
policy | Per-bucket aggregate stats + suggested multipliers / exit rules | policy.stat |
tier_correlator | MED→HIGH lift mining on signal_log | tier.lift |
autoresearch | Model-drift probes + periodic journal thesis/risk notes (dedup'd, 60s cap per entry) | model.drift, journal.research |
sniffer | Counterparty fingerprinting from microstructure + signal cross-ref | counterparty.fingerprint |
dispatcher | Terminal subscriber — routes each topic to its sinks (dashboard JSON + SQLite + optional Slack/Discord) | — |
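For readers unfamiliar with the Welford update the correlator row mentions, here is a minimal sketch of online mean/variance with a z-score on top. It is cumulative rather than windowed, and the class and method names are hypothetical; the real correlator may use a rolling or decayed variant.

```python
import math
from dataclasses import dataclass


@dataclass
class RollingStats:
    """Welford's online mean/variance over every observation seen so far."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the current mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        # Uses the deviation from the old mean times the deviation from the new mean.
        self.m2 += delta * (x - self.mean)

    def std(self) -> float:
        # Sample standard deviation; 0.0 until at least two points exist.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def zscore(self, x: float) -> float:
        s = self.std()
        return (x - self.mean) / s if s > 0 else 0.0
```

Feeding each new divergence through update() and then zscore() gives a z value without storing the history, which is the property that makes the method attractive for a streaming agent.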
2-Click Agentic Execution
Every ProposedAction flows through a state machine, and two explicit human clicks are required before any order hits a venue.
State machine
PROPOSED ──▶ QA_PASSED ───click 1──▶ APPROVED ───click 2──▶ EXECUTING ──▶ EXECUTED
└─▶ QA_REJECTED └─▶ FAILED
└─▶ EXPIRED
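The diagram can be read as a transition table. The sketch below is illustrative only: the state names come from the diagram, but the source of the EXPIRED branch is an assumption (here, any action still waiting on a click), and the advance helper is hypothetical.

```python
# Allowed transitions, read off the diagram above.
# Assumption: EXPIRED is reachable from QA_PASSED and APPROVED, i.e. an action
# that waited too long for a human click. The diagram does not state this.
TRANSITIONS = {
    "PROPOSED": {"QA_PASSED", "QA_REJECTED"},
    "QA_PASSED": {"APPROVED", "EXPIRED"},   # click 1 moves to APPROVED
    "APPROVED": {"EXECUTING", "EXPIRED"},   # click 2 moves to EXECUTING
    "EXECUTING": {"EXECUTED", "FAILED"},
    # Terminal states have no outgoing edges.
    "QA_REJECTED": set(),
    "EXPIRED": set(),
    "EXECUTED": set(),
    "FAILED": set(),
}


def advance(current: str, target: str) -> str:
    """Return the new state, or raise if the diagram has no such edge."""
    if target not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {target}")
    return target
```

Rejecting every edge that is not listed keeps the machine fail-closed: an action cannot jump from PROPOSED to EXECUTING without passing QA and both clicks.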
QA gates (deterministic)
- freshness: signal ≤ QA_MAX_AGE_SECONDS (default 120s)
- confidence: ≥ QA_MIN_CONFIDENCE (default 0.75)
- edge: |divergence| ≥ QA_MIN_EDGE (default 0.05)
- cooldown: ≥ QA_COOLDOWN_SECONDS since last action for same symbol (default 300s)
- daily_budget: running risk ≤ QA_DAILY_CAP_USD (default $5,000)
- dedup: the spawning source_id hasn't already queued
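A rough sketch of how the six gates could be evaluated per candidate follows. The thresholds are the documented defaults; the Candidate and QAState shapes, the run_gates name, and the choice to count the candidate's own risk against the daily cap are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class QAConfig:
    # Defaults mirror the documented QA_* values.
    max_age_seconds: float = 120.0
    min_confidence: float = 0.75
    min_edge: float = 0.05
    cooldown_seconds: float = 300.0
    daily_cap_usd: float = 5000.0


@dataclass
class Candidate:  # hypothetical shape of a HIGH signal or profitable arb
    source_id: str
    symbol: str
    age_seconds: float
    confidence: float
    divergence: float
    risk_usd: float


@dataclass
class QAState:  # hypothetical running state kept by the qa agent
    last_action_at: dict = field(default_factory=dict)  # symbol -> unix seconds
    risk_today_usd: float = 0.0
    queued_source_ids: set = field(default_factory=set)


def run_gates(c: Candidate, state: QAState, now: float, cfg: QAConfig = QAConfig()) -> dict:
    """Return a per-check pass/fail map; the action passes QA only if all are True."""
    last = state.last_action_at.get(c.symbol)
    return {
        "freshness": c.age_seconds <= cfg.max_age_seconds,
        "confidence": c.confidence >= cfg.min_confidence,
        "edge": abs(c.divergence) >= cfg.min_edge,
        "cooldown": last is None or now - last >= cfg.cooldown_seconds,
        # Assumption: the candidate's own risk counts toward the running total.
        "daily_budget": state.risk_today_usd + c.risk_usd <= cfg.daily_cap_usd,
        "dedup": c.source_id not in state.queued_source_ids,
    }
```

Returning the full map rather than a single boolean matches the per-check QA assessment that the /actions page displays.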
Bridge server
scripts/bridge.py runs alongside the agents on http://127.0.0.1:18799.
| HTTP | Purpose |
|---|---|
GET /api/actions | Full queue |
GET /api/actions/{id} | Single action detail |
POST /api/approve/{id} | QA_PASSED → APPROVED |
POST /api/execute/{id} | APPROVED → EXECUTING → EXECUTED/FAILED. Fires both legs; publishes back to the bus so journal + policy pick it up |
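The two clicks map onto two POST calls, which can also be driven from a script. This is a minimal standard-library sketch: the base URL and routes come from the table above, while the helper names and the assumption that every endpoint answers with a JSON body are not confirmed by the source.

```python
import json
import urllib.request

BRIDGE = "http://127.0.0.1:18799"


def bridge_request(method: str, path: str) -> urllib.request.Request:
    """Build (but do not send) a request against the local bridge."""
    return urllib.request.Request(f"{BRIDGE}{path}", method=method)


def approve(action_id: str) -> urllib.request.Request:
    return bridge_request("POST", f"/api/approve/{action_id}")  # click 1


def execute(action_id: str) -> urllib.request.Request:
    return bridge_request("POST", f"/api/execute/{action_id}")  # click 2


def send(req: urllib.request.Request, timeout: float = 5.0):
    """Send the request; assumes the bridge replies with a JSON body."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage, with the bridge running:
#   queue = send(bridge_request("GET", "/api/actions"))
#   send(approve("<action-id>")); send(execute("<action-id>"))
```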
Executor layer — fail-closed
| Venue | Path | Needs |
|---|---|---|
| TastyTrade (options vertical) | agents/executor/tasty_orders.py — chain resolution + 50%-width limit debit/credit spread | OAuth (already in .env) + AUTO_EXEC_TASTY=1 |
| Polymarket (YES/NO) | agents/executor/poly_orders.py via py-clob-client | POLY_WALLET_PRIVATE_KEY + Polygon USDC + AUTO_EXEC_POLY=1 |
| Kalshi (YES/NO) | agents/executor/kalshi_orders.py — RSA-PSS signed headers | KALSHI_KEY_ID + PEM + AUTO_EXEC_KALSHI=1 |
Master kill switch: AUTO_EXEC_ENABLED=0 — any place_* call returns status=manual with a clear message. Nothing leaves the machine until both the master and the per-venue flag are flipped.
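The double-flag rule can be sketched as a single guard that every place_* call checks first. The flag names are the documented ones; the function names, the venue keys, and the shape of the returned dict are hypothetical.

```python
import os
from typing import Mapping

VENUE_FLAGS = {
    "tasty": "AUTO_EXEC_TASTY",
    "poly": "AUTO_EXEC_POLY",
    "kalshi": "AUTO_EXEC_KALSHI",
}


def execution_allowed(venue: str, env: Mapping[str, str] = os.environ) -> bool:
    """True only when both the master switch and the venue flag are exactly '1'."""
    flag = VENUE_FLAGS.get(venue)
    if flag is None:
        return False  # unknown venue: fail closed
    return env.get("AUTO_EXEC_ENABLED", "0") == "1" and env.get(flag, "0") == "1"


def place_order(venue: str, ticket: dict, env: Mapping[str, str] = os.environ) -> dict:
    if not execution_allowed(venue, env):
        # Nothing is sent; the caller gets the ticket back to place by hand.
        return {
            "status": "manual",
            "message": f"auto-exec disabled for {venue}; place this ticket manually",
            "ticket": ticket,
        }
    raise NotImplementedError("live order routing is outside this sketch")
```

Treating a missing or malformed flag the same as 0 is what makes the layer fail-closed: a typo in .env cannot enable trading.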
Learning Loop
Every flagged event gets journaled + evaluated + bucketed; the policy cache feeds multipliers back into future signals. Four layers:
| Layer | Agent | What it does |
|---|---|---|
| 1 · Shadow journal | journal | Opens a SHADOW entry on every HIGH signal / profitable arb regardless of whether you executed. Promotes to OPEN in place when an action with matching source_id executes — same entry_id, full snapshot history preserved |
| 2 · Hindsight evaluator | hindsight | At horizon (HINDSIGHT_HORIZON_SECONDS, default 1h) labels entries with realized_capture, ideal_entry_lag_seconds, ideal_exit_lag_seconds. Separately polls gamma-api.polymarket.com for resolution — when a market settles, writes ground_truth_capture with the real $/$-risked return and overwrites the proxy |
| 3 · Policy aggregator | policy | Per pattern_key bucket (symbol\|venue\|gap:bucket\|z:bucket\|sentiment:regime): TP rate, mean realized capture, mean MFE/MAE, suggested confidence multiplier (clipped [0.3, 1.3], shrunk toward 1.0 until n≥20), suggested exit rule. Read-only; exposed on /policy |
| 4 · Feedback | policy/feedback.py | Correlator + arb consult the cache at emit time. LEARN_MODE=shadow (default) stamps a PolicyAdjustment audit on every TradePlan but doesn't mutate confidence/size. LEARN_MODE=active applies the multipliers (size capped at ×1.2) |
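Layers 3 and 4 can be illustrated with two small functions. The clip range, the n≥20 threshold, the ×1.2 size cap, and the LEARN_MODE values are from the table; the linear shrinkage formula, the clamp of confidence at 1.0, and all function names are assumptions.

```python
def suggested_multiplier(raw: float, n: int, full_weight_n: int = 20) -> float:
    """Shrink a raw multiplier toward 1.0 for small buckets, then clip to [0.3, 1.3]."""
    # Assumption: linear shrinkage, with full weight once the bucket has 20 entries.
    weight = min(n, full_weight_n) / full_weight_n
    shrunk = 1.0 + weight * (raw - 1.0)
    return max(0.3, min(1.3, shrunk))


def apply_policy(confidence: float, size_usd: float, multiplier: float,
                 learn_mode: str = "shadow"):
    """Return (confidence, size_usd, audit); only 'active' mode changes the numbers."""
    audit = {"learn_mode": learn_mode, "multiplier": multiplier}
    if learn_mode != "active":
        return confidence, size_usd, audit  # shadow: stamp the audit, change nothing
    new_confidence = min(1.0, confidence * multiplier)  # assumption: clamp at 1.0
    new_size = size_usd * min(multiplier, 1.2)          # size multiplier capped at x1.2
    return new_confidence, new_size, audit
```

In shadow mode the audit record still shows what the policy would have done, which is what allows the multipliers to be reviewed before LEARN_MODE=active is switched on.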
Autoresearch notes on the journal
Autoresearch subscribes to TOPIC_ACTION (EXECUTED) + TOPIC_JOURNAL. On entry open: one ideal_entry_exit note. Periodically thereafter (60s cap per entry, headline-deduped): thesis_update or risk_review notes with MFE giveback / drawdown commentary. The LLM path lives behind AGENT_AUTORESEARCH_LIVE=1; the deterministic path runs even without an API key.
Hermes Handoff
A Claude Code skill at ~/.claude/skills/hermes/ delegates coding tasks to the Hermes agent running on claws-mac-mini over Tailscale. Your local Claude Code session keeps full read/write access throughout — Hermes operates on a server-side git worktree.
| Command | Action |
|---|---|
/hermes delegate "<task>" | Rsync current project to claws:~/hermes-handoffs/<project>/, create an isolated git worktree, launch hermes chat in a detached tmux session. Returns a task-id |
/hermes status [id] | RUNNING/STOPPED + tail log + list worktree changes |
/hermes pull [id] | Rsync worktree back into ./.hermes/incoming/<id>/. Shows HANDOFF_RESULT.md + diff vs local. You review and selectively merge |
/hermes list | Recent task-ids + running state |
/hermes cancel <id> | tmux kill-session |
One-time remote setup: isolated HERMES_HOME=~/.hermes-handoff/ on claws with GTM MCP stripped to avoid port-10918 conflict with the always-on Hermes gateway; Codex OAuth via codex login --device-auth; two small patches to hermes-agent/run_agent.py to fix tool-call parsing on ChatGPT-account Responses API streams.
Env Variables
All AGENT_*_LIVE and AUTO_EXEC_* flags default to 0 (mock / dry-run). Every client has mock and live implementations; the flag selects at boot.
Data agents
| Variable | Purpose |
|---|---|
AGENT_SIGNAL_LIVE | TastyTrade REST + DXLink — needs OAuth token block |
AGENT_POLY_LIVE | Polymarket public Gamma + CLOB reads (no wallet needed for read) |
POLY_MARKETS | Comma-separated slug[:LINKED_SYMBOL] list |
AGENT_KALSHI_LIVE | Kalshi REST — needs KALSHI_KEY_ID + PEM |
AGENT_SENTIMENT_LIVE | Twitter/X filtered stream + NewsAPI |
AGENT_TIER_CORRELATOR_LIVE | MED→HIGH lift mining from signal_log |
AGENT_AUTORESEARCH_LIVE | Anthropic-powered post-mortems; deterministic notes fire regardless |
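The slug[:LINKED_SYMBOL] format for POLY_MARKETS is simple enough to parse by hand. The sketch below shows one way to do it; the function name and the example slugs in the usage comment are hypothetical.

```python
from typing import List, Optional, Tuple


def parse_poly_markets(raw: str) -> List[Tuple[str, Optional[str]]]:
    """Split 'slug[:SYMBOL],slug[:SYMBOL],...' into (slug, symbol-or-None) pairs."""
    markets = []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate trailing commas and blank entries
        slug, _, symbol = item.partition(":")
        markets.append((slug.strip(), symbol.strip() or None))
    return markets


# Usage (slugs are made up):
#   parse_poly_markets("example-spx-market:SPY, example-btc-market")
```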
Trade plans + execution
| Variable | Default | Purpose |
|---|---|---|
TRADE_UNIT_USD | 1000 | Dollar-risk unit; plans size legs by confidence × unit |
ARB_MIN_EDGE | 0.05 | Minimum \|edge\| to fire an arb |
ARB_FEE_ROUNDTRIP | 0.02 | Friction haircut on net edge |
AUTO_EXEC_ENABLED | 0 | Master switch — all place_* return manual when 0 |
AUTO_EXEC_TASTY / _POLY / _KALSHI | 0 | Per-venue enable; each also requires master=1 |
TASTY_ACCOUNT_NUMBER | — | Override; first account used if blank |
POLY_WALLET_PRIVATE_KEY | — | 32-byte hex; required for poly orders |
POLY_SIGNATURE_TYPE | 0 | 0=EOA, 1=POLY_PROXY, 2=GNOSIS_SAFE |
KALSHI_KEY_ID / KALSHI_PRIVATE_KEY_PATH | — | Required for Kalshi orders |
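To make the sizing and arb thresholds concrete, here is a small arithmetic sketch using the defaults above. It assumes the fee haircut is subtracted before the result is compared with ARB_MIN_EDGE; the source does not say whether the minimum applies to gross or net edge, and the function names are hypothetical.

```python
def leg_size_usd(confidence: float, unit_usd: float = 1000.0) -> float:
    """Dollar risk for a leg: confidence x TRADE_UNIT_USD."""
    return confidence * unit_usd


def arb_net_edge(poly_yes: float, tasty_prob: float, fee_roundtrip: float = 0.02) -> float:
    """Absolute probability gap between venues, less the round-trip friction haircut."""
    return abs(poly_yes - tasty_prob) - fee_roundtrip


def should_fire(poly_yes: float, tasty_prob: float,
                min_edge: float = 0.05, fee_roundtrip: float = 0.02) -> bool:
    # Assumption: the minimum is checked against the edge after fees.
    return arb_net_edge(poly_yes, tasty_prob, fee_roundtrip) >= min_edge
```

For example, a Polymarket YES price of 0.62 against an options-implied probability of 0.50 leaves a net edge of about 0.10, which clears the default minimum; a gap of 0.05 leaves only about 0.03 and does not.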
QA, hindsight, learning
| Variable | Default | Purpose |
|---|---|---|
QA_MAX_AGE_SECONDS | 120 | Reject signals older than this |
QA_MIN_CONFIDENCE | 0.75 | Minimum signal confidence to enqueue |
QA_MIN_EDGE | 0.05 | Minimum \|edge\| |
QA_COOLDOWN_SECONDS | 300 | Per-symbol cooldown |
QA_DAILY_CAP_USD | 5000 | Daily risk cap |
HINDSIGHT_HORIZON_SECONDS | 3600 | When proxy outcomes get labelled |
HINDSIGHT_CADENCE_SECONDS | 60 | Sweep + resolution-lookup interval |
LEARN_MODE | shadow | shadow = audit only; active = apply multipliers |
AUTORESEARCH_MODELS | claude-opus-4-6,claude-opus-4-7 | Drift probe pair |
AUTORESEARCH_MODEL_DEEP / _FAST / _HAIKU | 4-7 / 4-6 / haiku-4-5 | Token-router slots for the forthcoming router import |
Message Topics (Pydantic schemas)
| Topic | Publisher(s) | Subscriber(s) |
|---|---|---|
quote.update | signal | correlator, arb, journal |
odds.update | poly, kalshi | correlator, arb, journal |
market.microstructure | poly, kalshi | sniffer |
sentiment | sentiment | correlator |
signal | correlator | dispatcher, qa, journal, tier_correlator |
arb.poly_tasty | arb | dispatcher, qa, journal |
action.proposed | qa, bridge (state transitions) | dispatcher, journal |
journal.entry | journal, hindsight (replay) | dispatcher, autoresearch, policy |
journal.research | autoresearch | journal (appends note) |
policy.stat | policy | dispatcher |
tier.lift | tier_correlator | autoresearch |
model.drift | autoresearch | downstream attribution |
counterparty.fingerprint | sniffer | dispatcher |
Confidence tiers: HIGH > 0.75 · MED 0.45–0.75 · LOW < 0.45.
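The publisher/subscriber routing in the topic table can be pictured with a very small in-process bus. This sketch uses a frozen dataclass with a range check as a stand-in for the Pydantic schemas, and the Bus class, the OddsUpdate fields, and the demo handlers are all hypothetical.

```python
import asyncio
from collections import defaultdict
from dataclasses import dataclass
from typing import Awaitable, Callable, DefaultDict, List


@dataclass(frozen=True)
class OddsUpdate:  # hypothetical payload for the odds.update topic
    venue: str
    market: str
    yes_price: float

    def __post_init__(self) -> None:
        # Stand-in for schema validation: reject prices that are not probabilities.
        if not 0.0 <= self.yes_price <= 1.0:
            raise ValueError("yes_price must be in [0, 1]")


Handler = Callable[[object], Awaitable[None]]


class Bus:
    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._subs[topic].append(handler)

    async def publish(self, topic: str, message: object) -> None:
        # Fan the message out to every subscriber of this topic concurrently.
        await asyncio.gather(*(handler(message) for handler in self._subs[topic]))


async def demo() -> list:
    bus, seen = Bus(), []

    async def correlator(msg: OddsUpdate) -> None:
        seen.append(("correlator", msg.market))

    async def arb(msg: OddsUpdate) -> None:
        seen.append(("arb", msg.market))

    bus.subscribe("odds.update", correlator)
    bus.subscribe("odds.update", arb)
    await bus.publish("odds.update", OddsUpdate("poly", "example-market", 0.62))
    return seen
```

Validating at construction time means a malformed message fails in the publisher, before any subscriber sees it.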
Going Live (one agent at a time)
- Populate one agent's credential block in .env (start with TastyTrade; OAuth is already scaffolded by scripts/oauth_login.py).
- Flip its flag: AGENT_SIGNAL_LIVE=1. Restart that agent; the rest stay mocked.
- Capture real payloads with scripts/record_fixture.py for reproducible dev.
- Expand coverage via POLY_MARKETS=slug1[:SYM],slug2[:SYM],…. Current config: 6 SPX → SPY markets + 2 NDX → QQQ markets + 2 BTC markets.
- Once 50+ entries have reached EVALUATED status in the journal, flip LEARN_MODE=active to let policy multipliers adjust live confidence + size.
- Only then consider flipping any AUTO_EXEC_* flag, and start with Tasty's options leg (defined-risk) before touching prediction venues.
Operational Checklist
- TastyTrade OAuth refresh tested (24h token expiry): scripts/refresh_token.py
- >100 entries in signal_log; false-positive rate reviewed by tier
- >50 entries in journal_entry_log reached EVALUATED before flipping LEARN_MODE=active
- Bridge reachable: curl http://127.0.0.1:18799/api/actions returns JSON
- All AUTO_EXEC_* flags confirmed 0 in a production .env audit before every session
- pytest -q green locally; verify CI runs on every PR
- Twitter rate limits monitored (<500k tweets/month on Basic)
- Slack / Discord webhooks tested on a HIGH-tier dry run
- .env, keys/, .hermes/, .wrangler/ in .gitignore; rolling dashboard feeds also ignored
- CF Pages dashboard deployed; arch page at organized-market-arch.pages.dev current
- Hermes gateway + codex OAuth valid on claws-mac-mini; ssh claws true succeeds over Tailscale
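The AUTO_EXEC_* audit item lends itself to a quick script. The sketch below parses .env-style text and lists any execution flag that is not 0; the parser handles only plain KEY=VALUE lines, and both function names are hypothetical.

```python
def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def enabled_exec_flags(env_text: str) -> list:
    """Return the sorted AUTO_EXEC_* keys whose value is anything other than '0'."""
    env = parse_env(env_text)
    return sorted(k for k, v in env.items() if k.startswith("AUTO_EXEC_") and v != "0")


# Usage before a session:
#   with open(".env") as fh:
#       assert enabled_exec_flags(fh.read()) == [], "execution flags are on"
```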
OrganizedMarket surfaces signals and can execute defined-risk trades when its kill switches are explicitly flipped. Default posture is paper-trade-and-learn: every HIGH opportunity journals regardless of execution, so the policy loop gains reps even when you sit out.