Hermes Agent — Setup Overview

AI Chief of Staff · Telegram-native · Cloud + Local hybrid · Last updated 2026-05-17

System Architecture

Cloud

DigitalOcean Droplet — Hermes Daemon

Ubuntu 24.04 LTS · 2 GB / 1 vCPU · NYC1 · $12/mo
Telegram bot (inbound + outbound) · Scheduled cron · systemd Restart=always
State: ~/.hermes/state.db (SQLite + FTS5 · ~30-day rolling window)
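
A minimal sketch of how a SQLite + FTS5 store with a roughly 30-day rolling window could work. The table name `messages`, its columns, and the helper names are hypothetical, since the real Hermes schema is not described here, and the sketch assumes the local SQLite build has FTS5 enabled.

```python
import sqlite3
import time

RETENTION_SECONDS = 30 * 24 * 3600  # roughly 30 days


def open_state(path=":memory:"):
    """Open the state DB and create a hypothetical FTS5 table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS messages "
        "USING fts5(body, created_at UNINDEXED)"
    )
    return conn


def remember(conn, body, now=None):
    """Store one message together with its Unix timestamp."""
    ts = int(time.time() if now is None else now)
    conn.execute(
        "INSERT INTO messages (body, created_at) VALUES (?, ?)", (body, ts)
    )


def prune(conn, now=None):
    """Drop rows that have fallen out of the retention window."""
    ts = int(time.time() if now is None else now)
    conn.execute(
        "DELETE FROM messages WHERE created_at < ?", (ts - RETENTION_SECONDS,)
    )


def search(conn, query):
    """Full-text search over message bodies, best match first."""
    rows = conn.execute(
        "SELECT body FROM messages WHERE messages MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]
```

Calling `prune` from the same scheduled job that produces summaries would keep the file small without a separate maintenance task.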

LLM Gateway

Anthropic Claude

Primary: claude-haiku-4-5-20251001
Fallback: OpenRouter (Haiku-class)
Hard cap: $50/mo workspace
Expected: $0.30–$1.80/mo

↕ SSH via 1Password Agent ↕ Telegram API (outbound connections only)

Local

M1 Max MBP — Daily Driver

Claude Code (Anvil) · Dev work
gbrain CLI + MCP → PGLite local
Vault: ~/second-brain/ (Obsidian Sync)

Free · Deferred

gbrain (Knowledge Layer)

PGLite local (free, $0/mo)
Indexes the vault for semantic search
MCP tools: mcp__gbrain__*
Supabase deferred

Mobile

iPhone / iPad

Telegram → Hermes bot
Voice memos inbound
Obsidian app (read vault)

↕ Obsidian Sync ↕ OpenAI text-embedding-3-large (~$1.25 one-time)

Durable Substrate

~/second-brain/ Markdown Vault (Tolaria)

~8,000 markdown files · Obsidian Sync · Source of truth for all knowledge · gbrain dual-writes here first

Hard Rules (Bright Lines)

✗ Never write to financial accounts (read-only at most)

✗ Never send messages without passing the approval gate

✗ No credentials in model context

✗ No unvetted community skills

✗ Never act on medical or legal output

✓ Hermes drafts — Jesse decides

✓ All vault writes to _quarantine/ first
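
The quarantine-first rule can be sketched as a small write helper. This is an illustration only: `quarantine_write` is a hypothetical name, and the only detail taken from the rules above is that every write lands under `_quarantine/` inside the vault.

```python
from pathlib import Path


def quarantine_write(vault: Path, name: str, text: str) -> Path:
    """Write a note under <vault>/_quarantine/ and nowhere else."""
    quarantine = (vault / "_quarantine").resolve()
    target = (quarantine / name).resolve()
    # Reject names such as "../inbox/x.md" that would escape the folder.
    if quarantine not in target.parents:
        raise ValueError(f"refusing to write outside _quarantine/: {name}")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target
```

Promoting a note out of `_quarantine/` then stays a separate, human-approved step.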

Auth & Security

✓ SSH via 1Password Agent (no keys on disk)

✓ UFW: SSH-only inbound on DO droplet

✓ Hermes API bound to 127.0.0.1 only

✓ Telegram: allowlist of permitted user IDs

✓ HERMES_APPROVALS_MODE=on

✓ All secrets in ~/.hermes/.env (mode 600)

✓ $50/mo Anthropic workspace cap
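
Two of the checks above are easy to verify in code. The sketch below assumes the allowed Telegram user IDs are stored as a comma-separated string, which is a guess about the format, and all function names are hypothetical.

```python
import os
import stat


def parse_allowed_ids(raw: str) -> set[int]:
    """Parse a comma-separated list such as "111,222" into a set of ints."""
    return {int(part) for part in raw.split(",") if part.strip()}


def is_allowed(user_id: int, allowed: set[int]) -> bool:
    """Only explicitly listed Telegram user IDs get through."""
    return user_id in allowed


def env_file_is_private(path: str) -> bool:
    """True when the permission bits are exactly 0o600 (owner read/write)."""
    return stat.S_IMODE(os.stat(path).st_mode) == 0o600
```

A startup routine could refuse to run when `env_file_is_private` returns False, so a loosened `.env` is caught before any secret is read.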

Model Stack by Machine & Use Case

☁ Cloud (Hermes Daemon on DO)

🤖

Claude Haiku 4.5

DigitalOcean Droplet · Primary

Orchestration, routing, message classification, Telegram responses, cron summaries

$0.30–$1.80
per month

🔁

OpenRouter Haiku-class fallback

DigitalOcean Droplet · Fallback only

Activated if Anthropic API is down or rate-limited

Negligible
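
The primary/fallback arrangement can be sketched as an ordered list of provider callables. Nothing here calls a real API: `ProviderUnavailable` and `complete_with_fallback` are hypothetical names, and each provider is assumed to raise `ProviderUnavailable` when its API is down or rate-limited.

```python
from typing import Callable, Sequence


class ProviderUnavailable(Exception):
    """Raised by a provider callable when its API is down or rate-limited."""


def complete_with_fallback(
    prompt: str, providers: Sequence[Callable[[str], str]]
) -> str:
    """Try each provider in order and return the first successful reply."""
    last_error = None
    for call in providers:
        try:
            return call(prompt)
        except ProviderUnavailable as exc:
            last_error = exc  # remember the failure, then try the next one
    raise RuntimeError("all providers failed") from last_error
```

Passing the Anthropic client first and the OpenRouter client second would reproduce the order described above.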

💜 M1 Max MBP — Daily Driver (Local, Free)

⚡

Qwen 3.6-35B-A3B MoE Q4_K_M

M1 Max MBP (64 GB) · Ollama · Primary all-rounder

Daily driver for dev work, research synthesis, draft writing, brainstorming. GPT-4-class, fully private.

$0
local

💻

Kimi K2.6

M1 Max MBP (64 GB) · Ollama · Coding specialist

Complex, multi-file coding sessions. Deep work context. Use for privacy-sensitive coding.

$0
local

🦙

Llama 4 Scout

M1 Max MBP (64 GB) · Ollama · Long-context / multimodal

Long-context documents, PDFs, multimodal tasks. Batch processing overnight.

$0
local

🍃 M1 Air — Hermes Daemon Host (Local)

🔧

GPT-OSS-20B Q4_K_M

M1 Air (16 GB) · Ollama · Primary

Burst tasks under 2 min: route Telegram messages, classify voice memos, briefing templates.

$0
local

🔄

Qwen3.5-9B Q4_K_M

M1 Air (16 GB) · Ollama · Fallback

Lighter fallback if GPT-OSS-20B has issues.

$0
local

🎙️

faster-whisper (STT)

M1 Air · Local speech-to-text

Transcribes Telegram voice memos locally. No API cost, no throttle, fully private.

$0
local

Decision Rule: Local vs Cloud

Use local (Ollama) for: privacy-sensitive docs, prompt iteration, batch overnight processing.
Use cloud (Haiku) for: Hermes orchestration on DO, multi-turn agentic tasks, final polish.
Do NOT run coding sessions or large-context jobs on the M1 Air (16 GB; thermal throttling after ~5 min).
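
One way to encode the rule above is a small routing function. The `Task` fields and the task-kind labels are hypothetical, and the 2-minute threshold for the Air comes from the burst-task limit stated elsewhere in this overview.

```python
from dataclasses import dataclass


@dataclass
class Task:
    kind: str                      # e.g. "orchestration", "coding", "routing"
    privacy_sensitive: bool = False
    est_minutes: float = 1.0
    large_context: bool = False


def choose_backend(task: Task) -> str:
    """Return "cloud", "mbp" or "air" for a task."""
    if task.privacy_sensitive:
        return "mbp"    # private material stays on local Ollama
    if task.kind in {"orchestration", "agentic", "final_polish"}:
        return "cloud"  # Haiku on the droplet
    if task.kind in {"coding", "batch", "prompt_iteration"}:
        return "mbp"    # heavier local work belongs on the M1 Max
    if task.large_context or task.est_minutes > 2:
        return "mbp"    # too heavy for the 16 GB Air
    return "air"        # short burst work such as routing
```

Checking privacy first means a sensitive document never reaches the cloud, even when the task kind would otherwise route there.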

Monthly Costs (MVP)

DigitalOcean Droplet (2 GB / 1 vCPU): $12.00

Anthropic API (Haiku, capped at $50): $0.30–$1.80

OpenAI Embeddings (gbrain, ongoing): ~$0.05

gbrain / PGLite (local): $0.00

Telegram Bot: $0.00

Obsidian Sync (existing): $0.00

Tailscale (free tier): $0.00

Ollama (all local models): $0.00

Total / Month: $12.35–$13.85
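
The monthly total can be re-derived from the line items above. This short check uses `Decimal` so the cents add up exactly; the dictionary keys are shortened labels, not official names.

```python
from decimal import Decimal

# (low, high) monthly cost in USD for each line item listed above.
COSTS = {
    "DigitalOcean Droplet": ("12.00", "12.00"),
    "Anthropic API (Haiku)": ("0.30", "1.80"),
    "OpenAI Embeddings": ("0.05", "0.05"),
    "gbrain / PGLite": ("0.00", "0.00"),
    "Telegram Bot": ("0.00", "0.00"),
    "Obsidian Sync": ("0.00", "0.00"),
    "Tailscale": ("0.00", "0.00"),
    "Ollama": ("0.00", "0.00"),
}


def monthly_total(costs):
    """Sum the low and high ends of every line item."""
    low = sum(Decimal(lo) for lo, _ in costs.values())
    high = sum(Decimal(hi) for _, hi in costs.values())
    return low, high


low, high = monthly_total(COSTS)
print(f"${low} to ${high} per month")
```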

One-Time Costs

OpenAI vault embedding (~8K files): ~$1.25

Total one-time: ~$1.25
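
A rough sanity check on the one-time embedding figure. Both inputs are assumptions: $0.13 per million tokens is taken to be the list price for text-embedding-3-large and may have changed, and 1,200 tokens per file is a hypothetical average chosen to illustrate the arithmetic.

```python
def embedding_cost_usd(
    n_files: int, avg_tokens_per_file: int, usd_per_million_tokens: float
) -> float:
    """Estimate the cost of embedding every file once."""
    total_tokens = n_files * avg_tokens_per_file
    return total_tokens / 1_000_000 * usd_per_million_tokens


# ~8,000 files at an assumed 1,200 tokens each and an assumed $0.13 per 1M tokens
estimate = embedding_cost_usd(8_000, 1_200, 0.13)
print(f"about ${estimate:.2f}")
```

Under those assumptions the estimate lands close to the ~$1.25 quoted above; a vault with longer notes would scale the cost linearly.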

Deferred / Upgrade Path

Supabase Pro (cross-machine gbrain): +$25/mo

DO resize to 4 GB: +$12/mo

Tailscale Pro (if mesh needed): +$6/mo

Original vs Revised Cost Architecture

Original Plan: $60 per month (Supabase Pro + Hermes + LLM)

Current MVP: $13 per month (DO + Haiku only, free gbrain)

Saving: 78% less than the original plan (~$570/yr saved)
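
The saving can be recomputed from the two rounded monthly figures. On $60 and $13 this gives 78% and $564 a year, close to the ~$570 quoted above; the function name is hypothetical.

```python
def saving(original_monthly: float, current_monthly: float):
    """Return (percent saved, dollars saved per year)."""
    delta = original_monthly - current_monthly
    percent = round(delta / original_monthly * 100)
    return percent, delta * 12


percent, per_year = saving(60, 13)
print(f"{percent}% less, ${per_year}/yr saved")
```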

Build Status

Decisions + keys locked

DO, Anthropic cap, Telegram bot, SSH, OpenAI embedding key

✓ Done

gbrain engine on MBP

PGLite init, MCP registered, put/search smoke test passed

✓ Done (2026-05-16)

DO droplet + Hermes install

SSH connect, apt prep, pip install hermes-agent, configure .env

⟳ In Flight (Day 1)

Vault embedding pass

OpenAI key → gbrain sync → ~$1.25 → MCP semantic search live

⟳ In Flight (Day 1 Track B)

Telegram round-trip smoke test

hermes chat → gateway start → DM bot → response confirmed

▷ Next

systemd persistence

Restart=always, reboot test, Docker sandbox backend

▷ Next

Voice capture pipeline

faster-whisper → classifier → wikilinker → dual-write

Week 2+

Daily 7am scorecard

financial-freedom data → Telegram delivery → cron 07:00

Day 10

Friday audit cron

/wealth-advisor review via Hermes at 4pm Friday

Day 12
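
The two scheduled jobs above (daily at 07:00, Fridays at 4pm) correspond to the crontab schedule fields `0 7 * * *` and `0 16 * * 5`. As a sketch of what those schedules mean, the hypothetical helper below computes the next run time from a naive local `datetime`; it assumes the droplet clock is set to the intended timezone.

```python
from datetime import datetime, timedelta


def next_run(now: datetime, hour: int, weekday=None) -> datetime:
    """Next occurrence of hour:00, optionally on one weekday (Mon=0 ... Sun=6)."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has already passed
    while weekday is not None and candidate.weekday() != weekday:
        candidate += timedelta(days=1)  # walk forward to the wanted weekday
    return candidate


now = datetime(2026, 5, 17, 8, 0)
print(next_run(now, 7))              # daily scorecard
print(next_run(now, 16, weekday=4))  # Friday audit
```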

Earning Trust Protocol

Read-only / draft only

Watch every output. No autonomous writes.

Week 1

Add one write workflow

After observing correct operation for a week

Weeks 2–3

Expand one verified workflow at a time

Blast radius compounds with access level

Month 2+

M1 Air Constraints

✓ Burst tasks under 2 min (routing, classification)

✓ Voice transcription via faster-whisper

✗ No coding sessions (thermal throttling after ~5 min)

✗ No 50+ page PDFs (16 GB + daemon = swap hell)

✗ No multi-turn chains (slow by turn 5)

M1 → M3 Vision: Specialist Agents

Month 1 — Foundation

The Librarian (Voice-First Capture)

Telegram voice → faster-whisper STT → classifier → wikilinker → vault write + gbrain dual-write. Start: just capture to _quarantine/. Add classifier at Week 3.
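
A sketch of the first stage only (transcribe, then park the note in `_quarantine/`), matching the "just capture" starting point. The classifier and wikilinker are left out. `whisper_transcriber` and `capture_voice_memo` are hypothetical names; the faster-whisper calls follow that library's documented `WhisperModel` interface, and the model size and `int8` compute type are assumptions.

```python
from pathlib import Path
from typing import Callable


def whisper_transcriber(model_size: str = "small") -> Callable[[str], str]:
    """Build a transcriber backed by faster-whisper (imported lazily)."""
    from faster_whisper import WhisperModel  # needs: pip install faster-whisper

    model = WhisperModel(model_size, device="cpu", compute_type="int8")

    def transcribe(audio_path: str) -> str:
        segments, _info = model.transcribe(audio_path)
        return " ".join(segment.text.strip() for segment in segments)

    return transcribe


def capture_voice_memo(
    audio_path: str, vault: Path, transcribe: Callable[[str], str]
) -> Path:
    """Transcribe one memo and write the text to <vault>/_quarantine/."""
    text = transcribe(audio_path)
    quarantine = vault / "_quarantine"
    quarantine.mkdir(parents=True, exist_ok=True)
    note = quarantine / (Path(audio_path).stem + ".md")
    note.write_text(text + "\n", encoding="utf-8")
    return note
```

Because the transcriber is passed in as a callable, the capture step can be tested with a stub before any model is downloaded.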

Month 2 — Self-Knowledge

The Cartographer (Longitudinal Self)

Nightly 5-min Telegram check-in (3 rotating questions: energy, persona tilt, money anxiety). After Month 6: weekly synthesis + cross-year pattern queries.

Month 3 — Decision Radio

The F1 Race Engineer (Multi-Venture Radio)

Max 5 outbound radio calls/day. One Telegram channel. Pronto first. Never auto-sends. Solves operational fatigue across 5 ventures.
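
The daily cap and the never-auto-send rule can be combined in one small guard. `RadioBudget` is a hypothetical name; the limit of 5 and the approval requirement come from the description above.

```python
from datetime import date


class RadioBudget:
    """Allow at most `limit` approved outbound calls per calendar day."""

    def __init__(self, limit: int = 5):
        self.limit = limit
        self._day = None
        self._sent = 0

    def try_send(self, today: date, approved: bool) -> bool:
        """Return True and count the call only if it is approved and within budget."""
        if today != self._day:  # new day: reset the counter
            self._day, self._sent = today, 0
        if not approved or self._sent >= self.limit:
            return False        # unapproved drafts never go out
        self._sent += 1
        return True
```

Rejected drafts do not consume budget, so a declined suggestion never crowds out a later, more useful call.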

Q3+ Honorable Mentions

War Correspondent's Fixer: Travel templates

Reading Compounder: Kindle/Readwise

Renaissance Practice Tracker: Music/writing reps

Mossad Briefings: Pre-meeting intel

Integrations — Deferred

Hermes ↔ gbrain bridge: Tailscale + HTTP MCP

Supabase Pro: No forcing function yet

Readwise / GoodNotes OCR: Week 2+

Klaviyo / Stripe / Shopify: Scoped keys, Week 2+

Multi-profile (RBC, Pronto): Post-MVP