Hermes Agent — Setup Overview

AI Chief of Staff · Telegram-native · Cloud + Local hybrid · Last updated 2026-05-17

Cloud
DigitalOcean Droplet — Hermes Daemon
Ubuntu 24.04 LTS · 2 GB / 1 vCPU · NYC1 · $12/mo
Telegram bot (inbound + outbound) · Scheduled cron · systemd Restart=always
State: ~/.hermes/state.db (SQLite + FTS5 · ~30 day rolling)
LLM Gateway
Anthropic Claude
Primary: claude-haiku-4-5-20251001
Fallback: OpenRouter (Haiku-class)
Hard cap: $50/mo workspace
Expected: $0.30–$1.80/mo
↕ SSH via 1Password Agent                     ↕ Telegram API (outbound only)
Local
M1 Max MBP — Daily Driver
Claude Code (Anvil) · Dev work
gbrain CLI + MCP → PGLite local
Vault: ~/second-brain/ (Obsidian Sync)
Free · Deferred
gbrain (Knowledge Layer)
PGLite local (free, $0/mo)
Indexes vault for semantic search
MCP tools: mcp__gbrain__*
Supabase deferred
Mobile
iPhone / iPad
Telegram → Hermes bot
Voice memos inbound
Obsidian app (read vault)
↕ Obsidian Sync              ↕ OpenAI text-embedding-3-large (~$1.25 one-time)
Durable Substrate
~/second-brain/ Markdown Vault (Tolaria)
~8,000 markdown files · Obsidian Sync · Source of truth for all knowledge · gbrain dual-writes here first
Hard Rules (Bright Lines)
Never touch financial accounts (read-only max)
Never send messages without approval gate
No credentials in model context
No unvetted community skills
No medical or legal output acted upon
Hermes drafts — Jesse decides
All vault writes to _quarantine/ first
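One way the quarantine rule could be enforced is to route every write through a single helper. The sketch below assumes the vault is a plain directory on disk; the helper name quarantine_write and its overwrite policy are hypothetical and not part of Hermes or gbrain.

```python
from pathlib import Path
import tempfile

QUARANTINE_DIR = "_quarantine"  # folder name taken from the rule above


def quarantine_write(vault_root: Path, filename: str, text: str) -> Path:
    """Write a note into <vault>/_quarantine/ and nowhere else.

    Hypothetical helper: only the bare file name is kept, so a caller cannot
    escape the quarantine folder with a path such as "../inbox/note.md".
    """
    safe_name = Path(filename).name
    if safe_name in ("", ".", ".."):
        raise ValueError("filename must name a file")
    target_dir = vault_root / QUARANTINE_DIR
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / safe_name
    if target.exists():
        # Assumption: drafts are never silently overwritten.
        raise FileExistsError(f"refusing to overwrite {target}")
    target.write_text(text, encoding="utf-8")
    return target


if __name__ == "__main__":
    # Demo against a throwaway directory, not the real ~/second-brain/ vault.
    with tempfile.TemporaryDirectory() as tmp:
        written = quarantine_write(Path(tmp), "../voice-memo.md", "# Draft\n")
        print(written.relative_to(tmp))
```

Promotion out of _quarantine/ stays a manual step in this sketch, which matches "Hermes drafts — Jesse decides".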
Auth & Security
SSH via 1Password Agent (no keys on disk)
UFW: SSH-only inbound on DO droplet
Hermes API bound to 127.0.0.1 only
Telegram: allowed user IDs whitelist
HERMES_APPROVALS_MODE=on
All secrets in ~/.hermes/.env (mode 600)
$50/mo Anthropic workspace cap
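The allowlist and approval gate above can be pictured as two small checks that run before any message is handled or sent. This is a minimal sketch, not Hermes's actual code: the user ID is a placeholder, the function names are hypothetical, and treating any value other than an explicit "off" as on is an assumption about how HERMES_APPROVALS_MODE would be read.

```python
import os
from dataclasses import dataclass

# Placeholder ID for illustration; real IDs would come from configuration.
ALLOWED_USER_IDS = frozenset({111111111})


@dataclass
class Draft:
    text: str
    approved: bool = False  # flipped only by an explicit human approval


def is_allowed(user_id: int) -> bool:
    """Drop inbound messages from any Telegram user not on the allowlist."""
    return user_id in ALLOWED_USER_IDS


def approvals_enabled(env=None) -> bool:
    """Assumption: fail closed, so a missing or odd value still means 'on'."""
    env = os.environ if env is None else env
    return env.get("HERMES_APPROVALS_MODE", "on").strip().lower() != "off"


def may_send(draft: Draft, env=None) -> bool:
    """A draft goes out only if it was approved or approvals are switched off."""
    return draft.approved or not approvals_enabled(env)


if __name__ == "__main__":
    env = {"HERMES_APPROVALS_MODE": "on"}
    print(is_allowed(42))                                   # unknown sender
    print(may_send(Draft("Reply to supplier"), env))        # not yet approved
    print(may_send(Draft("Reply to supplier", True), env))  # approved
```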
🤖
Claude Haiku 4.5
DigitalOcean Droplet · Primary
Orchestration, routing, message classification, Telegram responses, cron summaries
$0.30–$1.80
per month
🔁
OpenRouter Haiku-class fallback
DigitalOcean Droplet · Fallback only
Activated if Anthropic API is down or rate-limited
Negligible
Qwen 3.6-35B-A3B MoE Q4_K_M
M1 Max MBP (64 GB) · Ollama · Primary all-rounder
Daily driver for dev work, research synthesis, draft writing, brainstorming. GPT-4-class, fully private.
$0
local
💻
Kimi K2.6
M1 Max MBP (64 GB) · Ollama · Coding specialist
Complex, multi-file coding sessions. Deep work context. Use for privacy-sensitive coding.
$0
local
🦙
Llama 4 Scout
M1 Max MBP (64 GB) · Ollama · Long-context / multimodal
Long-context documents, PDFs, multimodal tasks. Batch processing overnight.
$0
local
🔧
GPT-OSS-20B Q4_K_M
M1 Air (16 GB) · Ollama · Primary
Burst tasks under 2 min: route Telegram messages, classify voice memos, briefing templates.
$0
local
🔄
Qwen3.5-9B Q4_K_M
M1 Air (16 GB) · Ollama · Fallback
Lighter fallback if GPT-OSS-20B has issues.
$0
local
🎙️
faster-whisper (STT)
M1 Air · Local speech-to-text
Transcribes Telegram voice memos locally. No API cost, no throttle, fully private.
$0
local
Decision Rule: Local vs Cloud
Use local (Ollama) for: privacy-sensitive docs, prompt iteration, batch overnight processing.
Use cloud (Haiku) for: Hermes orchestration on DO, multi-turn agentic tasks, final polish.
Do NOT run coding sessions or large-context jobs on the M1 Air (16 GB; thermal throttling after 5 min).
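The decision rule can be sketched as a small routing function. The Task fields and choose_backend are hypothetical names; letting privacy override every other signal and defaulting unclassified work to cloud are assumptions, since the rule above does not rank its criteria.

```python
from dataclasses import dataclass


@dataclass
class Task:
    privacy_sensitive: bool = False
    prompt_iteration: bool = False
    batch_overnight: bool = False
    agentic_multi_turn: bool = False
    final_polish: bool = False


def choose_backend(task: Task) -> str:
    """Return 'local' (Ollama) or 'cloud' (Haiku) for a task."""
    if task.privacy_sensitive:
        return "local"  # assumption: privacy outranks everything else
    if task.agentic_multi_turn or task.final_polish:
        return "cloud"
    if task.prompt_iteration or task.batch_overnight:
        return "local"
    return "cloud"  # assumption: unclassified work follows Hermes on the droplet


if __name__ == "__main__":
    print(choose_backend(Task(privacy_sensitive=True, final_polish=True)))
    print(choose_backend(Task(batch_overnight=True)))
    print(choose_backend(Task(agentic_multi_turn=True)))
```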
Monthly Costs (MVP)
DigitalOcean Droplet (2 GB / 1 vCPU): $12.00
Anthropic API (Haiku, capped at $50): $0.30–$1.80
OpenAI Embeddings (gbrain, ongoing): ~$0.05
gbrain / PGLite (local): $0.00
Telegram Bot: $0.00
Obsidian Sync (existing): $0.00
Tailscale (free tier): $0.00
Ollama (all local models): $0.00
Total / Month: $12.35–$13.85
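The monthly total can be checked by summing the low and high ends of each line item. The figures below are copied from the cost list above; the structure and the name monthly_range are only an illustration.

```python
from decimal import Decimal

# (label, low, high) in USD per month, copied from the cost list above.
LINE_ITEMS = [
    ("DigitalOcean Droplet", "12.00", "12.00"),
    ("Anthropic API (Haiku)", "0.30", "1.80"),
    ("OpenAI Embeddings", "0.05", "0.05"),
    ("gbrain / PGLite", "0.00", "0.00"),
    ("Telegram Bot", "0.00", "0.00"),
    ("Obsidian Sync", "0.00", "0.00"),
    ("Tailscale", "0.00", "0.00"),
    ("Ollama", "0.00", "0.00"),
]


def monthly_range(items):
    """Sum the low and high bounds separately; Decimal avoids float drift."""
    low = sum(Decimal(lo) for _, lo, _ in items)
    high = sum(Decimal(hi) for _, _, hi in items)
    return low, high


if __name__ == "__main__":
    low, high = monthly_range(LINE_ITEMS)
    print(f"${low} to ${high} per month")  # $12.35 to $13.85 per month
```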
One-Time Costs
OpenAI vault embedding (~8K files): ~$1.25
One-Time: ~$1.25
Deferred / Upgrade Path
Supabase Pro (cross-machine gbrain): +$25/mo
DO resize to 4 GB: +$12/mo
Tailscale Pro (if mesh needed): +$6/mo
Original Plan
$60
per month
Supabase Pro + Hermes + LLM
Current MVP
$13
per month
DO + Haiku only, free gbrain
Saving
78%
less than original plan
~$570/yr saved
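The savings figures follow from simple arithmetic on the two monthly amounts. As a quick check using the rounded $60 and $13 shown above (the function name is illustrative): the exact yearly figure with those inputs is $564, which the summary rounds to about $570.

```python
def savings(original: int, current: int):
    """Return (monthly saving, percent saved rounded to a whole number, yearly saving)."""
    monthly = original - current
    percent = round(100 * monthly / original)
    return monthly, percent, monthly * 12


if __name__ == "__main__":
    monthly, percent, yearly = savings(60, 13)
    print(f"${monthly}/mo, {percent}% less, ${yearly}/yr")  # $47/mo, 78% less, $564/yr
```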
Build Status
P0
Decisions + keys locked
DO, Anthropic cap, Telegram bot, SSH, OpenAI embedding key
✓ Done
P1
gbrain engine on MBP
PGLite init, MCP registered, put/search smoke test passed
✓ Done (2026-05-16)
P2
DO droplet + Hermes install
SSH connect, apt prep, pip install hermes-agent, configure .env
⟳ In Flight (Day 1)
P3
Vault embedding pass
OpenAI key → gbrain sync → ~$1.25 → MCP semantic search live
⟳ In Flight (Day 1 Track B)
P6
Voice capture pipeline
faster-whisper → classifier → wikilinker → dual-write
Week 2+
P7
Daily 7am scorecard
financial-freedom data → Telegram delivery → cron 07:00
Day 10
P8
Friday audit cron
/wealth-advisor review via Hermes at 4pm Friday
Day 12
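The two scheduled jobs in P7 and P8 map onto standard five-field cron expressions. The sketch below lists them as constants and mirrors the same logic in a small Python check; the job names are hypothetical, and it assumes the droplet's clock is set to the intended local timezone.

```python
from datetime import datetime

# Standard five-field cron expressions (minute hour day-of-month month weekday).
SCORECARD_CRON = "0 7 * * *"      # every day at 07:00
FRIDAY_AUDIT_CRON = "0 16 * * 5"  # Fridays at 16:00 (cron weekday 5 = Friday)


def due_jobs(now: datetime) -> list[str]:
    """Return the (hypothetical) job names that should fire at this minute."""
    jobs = []
    if now.hour == 7 and now.minute == 0:
        jobs.append("daily-scorecard")
    # datetime.weekday() counts Monday as 0, so Friday is 4.
    if now.weekday() == 4 and now.hour == 16 and now.minute == 0:
        jobs.append("friday-audit")
    return jobs


if __name__ == "__main__":
    print(due_jobs(datetime(2026, 5, 22, 7, 0)))   # a Friday morning
    print(due_jobs(datetime(2026, 5, 22, 16, 0)))  # the same Friday at 4pm
```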
Earning Trust Protocol
W1
Read-only / draft only
Watch every output. No autonomous writes.
Week 1
W2–3
Add one write workflow
After observing correct operation for a week
Weeks 2–3
M2+
Expand one verified workflow at a time
Blast radius compounds with access level
Month 2+
M1 Air Constraints
Burst tasks under 2 min (routing, classification)
Voice transcription via faster-whisper
No coding sessions (thermals after 5 min)
No 50+ page PDFs (16GB + daemon = swap hell)
No multi-turn chains (slow by turn 5)
Month 1 — Foundation
The Librarian (Voice-First Capture)
Telegram voice → faster-whisper STT → classifier → wikilinker → vault write + gbrain dual-write. Start: just capture to _quarantine/. Add classifier at Week 3.
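The wikilinker stage of this pipeline could look roughly like the sketch below. It assumes the memo has already been transcribed to text and that a list of existing note titles is available; the function name and the example titles are hypothetical.

```python
import re


def wikilink(text: str, note_titles: list[str]) -> str:
    """Wrap every mention of a known note title in [[...]] (case-insensitive)."""
    if not note_titles:
        return text
    # Longest titles first so "Pronto Launch" is preferred over "Pronto".
    titles = sorted(note_titles, key=len, reverse=True)
    canonical = {t.lower(): t for t in titles}
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in titles) + r")\b",
        re.IGNORECASE,
    )
    # One pass over the text, so a link is never created inside another link.
    return pattern.sub(lambda m: f"[[{canonical[m.group(1).lower()]}]]", text)


if __name__ == "__main__":
    transcript = "Call about pronto launch and Pronto pricing"
    print(wikilink(transcript, ["Pronto", "Pronto Launch"]))
    # Call about [[Pronto Launch]] and [[Pronto]] pricing
```

Linking against existing titles only keeps the stage from inventing notes, which fits the capture-to-_quarantine/ starting point.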
Month 2 — Self-Knowledge
The Cartographer (Longitudinal Self)
Nightly 5-min Telegram check-in (3 rotating questions: energy, persona tilt, money anxiety). After Month 6: weekly synthesis + cross-year pattern queries.
Month 3 — Decision Radio
The F1 Race Engineer (Multi-Venture Radio)
Max 5 outbound radio calls/day. One Telegram channel. Pronto first. Never auto-sends. Solves operational fatigue across 5 ventures.
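The cap of 5 radio calls per day can be sketched as a per-day counter that is consulted before a call is even drafted. RadioBudget and try_reserve are hypothetical names; the sketch keeps counts in memory, whereas a real daemon would presumably persist them.

```python
from collections import Counter
from datetime import date

MAX_CALLS_PER_DAY = 5  # from the rule above


class RadioBudget:
    """Counts proposed radio calls per calendar day."""

    def __init__(self, limit: int = MAX_CALLS_PER_DAY):
        self.limit = limit
        self._used = Counter()

    def try_reserve(self, day: date) -> bool:
        """Reserve one slot for `day`; return False once the cap is reached."""
        if self._used[day] >= self.limit:
            return False
        self._used[day] += 1
        return True


if __name__ == "__main__":
    budget = RadioBudget()
    monday = date(2026, 5, 18)
    print([budget.try_reserve(monday) for _ in range(6)])
    # [True, True, True, True, True, False]
```

A reserved slot still produces only a draft in this sketch; sending remains behind the approval gate, in line with "Never auto-sends".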
Q3+ Honorable Mentions
War Correspondent's Fixer: Travel templates
Reading Compounder: Kindle/Readwise
Renaissance Practice Tracker: Music/writing reps
Mossad Briefings: Pre-meeting intel
Integrations — Deferred
Hermes ↔ gbrain bridge: Tailscale + HTTP MCP
Supabase Pro: No forcing function yet
Readwise / GoodNotes OCR: Week 2+
Klaviyo / Stripe / Shopify: Scoped keys, Week 2+
Multi-profile (RBC, Pronto): Post-MVP