What Hermes can do
A capable assistant, reachable from a chat thread.
Nine live integrations, a conversational brain, voice, and web access — running 24/7 on a private server and controlled entirely through everyday language in WhatsApp.
Conversational brain
Powered by GLM-5.1 with a custom persona and long-term memory, so it remembers context across conversations.
Calendar
List your schedule, create events from plain language (“lunch with Sam tomorrow 1pm”), and delete events.
Drive & Docs
Search Drive, create Google Docs with content, read them back, append text, and organise into folders.
Sheets & Slides
Build spreadsheets — write, read, append rows — and assemble slide decks with slides and text boxes.
Tasks & Contacts
Manage your to-do lists and look up, add, or update people in your Google Contacts.
ClickUp
Create, update and browse tasks in your IBEX workspace — a focused 19-tool slice, no destructive actions.
Web & docs
Live web search (Tavily) for fresh answers, plus Context7 for up-to-date library and framework documentation.
Voice, vision & email
Transcribes voice notes, reads images, and sends/receives email under its own address on your behalf.
Weather & Prayer · Karachi
Live Karachi weather and today's prayer times (University of Islamic Sciences, Karachi method; Hanafi). Free, no API keys.
Maps-lite
Geocoding, place search, and driving distance/ETA over OpenStreetMap — no Google billing.
Document Intelligence
Send a PDF/Word/Excel/CSV — it extracts text & tables, answers questions, and can push tables into a Sheet.
Tech & AI news
Fresh technology and AI headlines on demand, and bundled into the daily brief.
Daily brief & reminders
An 11 AM brief (weather, prayer, news, calendar, Google + ClickUp tasks with links) to WhatsApp, email & Discord, plus a reminder at each prayer time.
Health monitor
Checks every 10 min — instant Discord alerts on failure + a colorful 4-hourly heartbeat with CPU/RAM/disk bars and a witty AI sign-off.
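As an illustration of the CPU/RAM/disk bars in the heartbeat, here is a minimal sketch of how a percentage could be rendered as a text bar. The function names `bar` and `disk_percent`, the bar width, and the block characters are assumptions for this example, not taken from the actual monitor.

```python
import shutil


def bar(percent: float, width: int = 10) -> str:
    """Render a percentage as a fixed-width text bar (hypothetical helper)."""
    clamped = max(0.0, min(100.0, percent))
    filled = round(clamped / 100 * width)
    return "█" * filled + "░" * (width - filled) + f" {clamped:.0f}%"


def disk_percent(path: str = "/") -> float:
    """Percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total * 100


if __name__ == "__main__":
    print("disk", bar(disk_percent()))
```

Clamping keeps the bar a constant width even if a metric reads slightly out of range, which keeps chat messages aligned.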
Things to ask it
Just say it in the chat.
A few of the things Hermes can handle today. Phrase them naturally — it figures out which tool to reach for.
How to talk to it
A phrasing cheat-sheet.
You never need exact commands — but here’s the kind of thing each integration responds to.
Calendar
- “what’s on my calendar today?”
- “schedule lunch with Sam Friday 1pm”
- “add a dentist appt next Tuesday 3pm”
- “cancel my 4pm”
Docs & Drive
- “create a doc called Notes with…”
- “find my doc about the budget”
- “read me the doc <link>”
- “make a folder called Trips”
Sheets
- “make a spreadsheet of my expenses”
- “add a row: Rent, 1500, paid”
- “what’s in the Budget sheet?”
Slides
- “create a deck titled Q3 Review”
- “add a slide about revenue”
Tasks
- “add a task to renew insurance”
- “what’s on my to-do list?”
- “mark ‘call bank’ done”
Contacts
- “what’s Ali’s email?”
- “add a contact: Mark, mark@x.com”
ClickUp
- “show my open ClickUp tasks”
- “create a task in <list>”
Web & email
- “search for…”, “summarise <link>”
- “email Omair about the demo”
- “check my email”
Weather & Prayer
- “what's the weather?”
- “weather in Lahore tomorrow”
- “today's prayer times”
- “when's Maghrib?”
Maps
- “directions to the airport”
- “how far is Dolmen Mall?”
- “find a pharmacy in Clifton”
Documents
- send a 📄 file + “summarise this”
- “what's the total on this invoice?”
- “put these tables in a Sheet”
News
- “latest AI news”
- “top tech headlines”
- “news about <topic>”
General
- “remember that I prefer…”
- send a 🎙️ voice note
- send an 🖼️ image to describe
Connected systems
Integrations matrix.
Every connected service, how it’s built, and its current status. Google services are in-house adapters over official Google APIs.
| Integration | Type | Status | Tools | Notes |
|---|---|---|---|---|
| Calendar | Google · in-house | live | 4 | events, quick-add |
| Drive & Docs | Google · in-house | live | 6 | create/read/append docs, search |
| Sheets | Google · in-house | live | 5 | read/write/append |
| Slides | Google · in-house | live | 4 | decks, slides, text |
| Tasks | Google · in-house | live | 5 | lists & tasks |
| Contacts | Google · in-house | live | 5 | People API |
| ClickUp | Third-party | live | 19 | IBEX workspace, no deletes |
| Context7 | Third-party | live | 2 | live library docs |
| Tavily | Third-party | live | — | web search |
| Weather | In-house · Open-Meteo | live | 2 | Karachi default, any city |
| Prayer times | In-house · AlAdhan | live | 2 | Karachi · Hanafi method |
| Maps-lite | In-house · OpenStreetMap | live | 4 | geocode · search · directions |
| Document Intelligence | In-house | live | 5 | extract · Q&A · tables → Sheet |
| News | In-house · Tavily | live | 3 | tech · AI · any topic |
| Discord | Native channel | live | — | 3rd channel · allowlisted to you |
| Health monitor | In-house · cron | live | — | Discord alerts + 4-hourly metrics + AI summary |
| Morning brief + reminders | In-house · cron | live | — | WhatsApp + email + Discord · 11 AM · prayer alerts |
| Maps (Google) | Google · API key | parked | — | needs billing — using Maps-lite instead |
| Gmail API | — | idea | — | richer than IMAP/SMTP |
| Keep | — | declined | — | API is Workspace-only |
Under the hood
How it’s actually built.
Hermes runs as a small set of services on a private VPS. Every Google integration is an in-house adapter calling Google’s official APIs — no third-party connectors hold your data.
01 Infrastructure
- host: Oracle VPS · ssh oraclevps
- os: Ubuntu 24.04 · aarch64 · 24 GB
- services: hermes-gateway, hermes-dashboard (systemd)
- ops: weekly backups · model watcher
02 The brain
- primary: GLM-5.1 via z.ai coding endpoint
- fallback: Gemini 2.5 Flash
- speech: Groq Whisper (STT) · Gemini (TTS)
- vision: Gemini
03 Channels
- whatsapp: Baileys bridge · self-chat
- email: IMAP/SMTP as its own address
- dashboard: localhost:9119 via SSH tunnel
- access: allowlisted to you
04 Integrations
- google: in-house adapters, official APIs
- libraries: google-api-python-client · google-auth
- others: ClickUp · Context7 · Tavily
- protocol: MCP (stdio) per service
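To make "MCP (stdio) per service" concrete: over the stdio transport, the agent and each adapter exchange newline-delimited JSON-RPC 2.0 messages. The sketch below builds a single `tools/call` request. The tool name `create_event` and its arguments are hypothetical, and the initialization handshake that a real session needs first is omitted.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing request ids


def tool_call(name: str, arguments: dict) -> str:
    """Serialise one JSON-RPC 2.0 `tools/call` request as a single line."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # json.dumps escapes embedded newlines, so the message stays on one line.
    return json.dumps(message) + "\n"


if __name__ == "__main__":
    # Hypothetical tool name and arguments, for illustration only.
    print(tool_call("create_event", {"summary": "Lunch with Sam"}), end="")
```

Running each adapter as its own stdio process means a crash in one service does not take the others down.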
05 Resource footprint
- compute: 4-core ARM · 24 GB · always-on
- model: GLM coding plan (z.ai)
- aux: Gemini (free tier) · Groq
- storage: local STT model + caches
06 Custom adapters
- location: ~/.hermes/custom-mcp/
- runtime: Python 3.12 venv (via uv)
- auth: shared gauth.py + 1 token
- size: ~300 lines you can audit
Security posture
One OAuth client, one combined-scope refresh token in a 0600 file behind a shared auth helper — no secrets live in the config. Scopes are minimal per service, the OAuth app is published to production (long-lived tokens), and all access is allowlisted to your number and address.
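The owner-only token file can be enforced in code as well as on disk. The following is a minimal sketch of a loader that refuses any token file whose mode is not 0600. The function name `load_token` and the JSON layout are assumptions; the real gauth.py is not shown on this page and presumably hands the loaded token to google-auth.

```python
import json
import os
import stat
from pathlib import Path


def load_token(path: Path) -> dict:
    """Load a JSON token file, refusing it unless it is owner-only (0600)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode != 0o600:
        raise PermissionError(f"{path} has mode {mode:o}, expected 600")
    return json.loads(path.read_text())
```

Failing loudly on a loosened mode turns an accidental `chmod` into an immediate, visible error instead of a silent exposure.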
The shape of it
System architecture.
Messages flow in from WhatsApp or email, through the gateway to the GLM-5.1 agent, which reaches in-house MCP adapters that speak to each provider’s official API.
flowchart TD
  U["📱 You · WhatsApp / Email"] --> GW["Hermes Gateway<br/>(systemd · allowlisted)"]
  GW --> AG["Hermes Agent<br/>GLM-5.1 · persona · memory"]
  AG -. fallback / senses .-> AUX["Gemini · Groq Whisper<br/>vision · STT · TTS"]
  AG --> MCP{{"MCP adapters (stdio)"}}
  subgraph INH["In-house adapters · official Google APIs"]
    CAL["gcal"]
    DRV["gdrive · Docs"]
    SHT["gsheets"]
    SLD["gslides"]
    TSK["gtasks"]
    CON["gcontacts"]
  end
  subgraph EXT["Other integrations"]
    CU["clickup"]
    C7["context7"]
    TV["Tavily"]
  end
  MCP --> INH
  MCP --> EXT
  INH --> GAPI["Google APIs<br/>Calendar · Drive · Docs · Sheets · Slides · Tasks · People"]
  CU --> CUAPI["ClickUp API"]
  AUTH["gauth.py · 1 OAuth client<br/>1 token · 0600 file"] -. credentials .-> INH
Trust & data
Security & privacy.
Plain-language account of where your data lives, what Hermes can touch, and how access is locked down.
- ◆ Your data stays on your turf. Everything runs on your VPS and talks directly to your Google account. No third-party server sits in the middle holding your data.
- ◆ Official auth, minimal scopes. One Google OAuth client with only the scopes each feature needs: Calendar, Drive, Docs, Sheets, Slides, Tasks, Contacts. Nothing broader.
- ◆ One token, well guarded. A single refresh token in a 0600 file (owner-only) behind a shared helper. No secrets in the config file.
- ◆ Locked to you. WhatsApp and email are allowlisted to your number and addresses, so no one else can talk to it.
- ◆ Auditable by design. The Google adapters are ~300 lines of our own code calling Google’s official libraries; you can read every line in ~/.hermes/custom-mcp/.
- ◆ What it can’t do. No admin/Workspace powers, no access to other people’s accounts, and destructive ClickUp actions are deliberately excluded.
- ◆ Recoverable. Weekly encrypted-at-rest backups of the whole setup; tokens are revocable and re-mintable at any time.
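The "locked to you" rule amounts to a sender check before any message reaches the agent. Here is a minimal sketch of such a check; the helper names, the normalisation rules, and the placeholder number and address are all assumptions, since the gateway's actual identifier format is not described here.

```python
def normalise(sender: str) -> str:
    """Lower-case emails; reduce phone numbers to their digits."""
    sender = sender.strip().lower()
    if "@" in sender:
        return sender
    return "".join(ch for ch in sender if ch.isdigit())


def is_allowed(sender: str, allowlist: set[str]) -> bool:
    """True only if the normalised sender matches an allowlist entry."""
    return normalise(sender) in {normalise(entry) for entry in allowlist}


if __name__ == "__main__":
    # Placeholder values, not real contact details.
    allow = {"+92 300 0000000", "owner@example.com"}
    print(is_allowed("923000000000", allow))
    print(is_allowed("stranger@example.com", allow))
```

Normalising both sides before comparing avoids rejecting the owner over formatting differences such as spaces or capitalisation.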
Keeping it running
Operations & recovery.
The handful of things worth knowing to run, watch, and recover Hermes.
01 Services
- gateway: systemctl restart hermes-gateway
- dashboard: hermes-dashboard · :9119
- logs: ~/.hermes/logs/
- reload: after MCP change: /new in chat
02 Backups
- when: weekly · Sunday 03:00
- keep: 4 rotating snapshots
- scope: whole ~/.hermes
- extras: config .bak-* per change
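Keeping four rotating snapshots comes down to deleting everything but the newest four after each backup run. A minimal sketch follows, assuming snapshots are archives named `hermes-YYYY-MM-DD.tar.gz` in a single directory. That naming scheme and the function name are hypothetical; the actual backup script is not shown.

```python
from pathlib import Path


def prune_snapshots(backup_dir: Path, keep: int = 4) -> list[Path]:
    """Delete all but the `keep` newest snapshots; return the deleted paths."""
    # ISO dates in the (assumed) file names sort chronologically as strings.
    snapshots = sorted(backup_dir.glob("hermes-*.tar.gz"))
    doomed = snapshots[:-keep] if keep > 0 else snapshots
    for path in doomed:
        path.unlink()
    return doomed
```

Sorting by a date embedded in the name, rather than by modification time, keeps the rotation stable even if files are copied or restored.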
03 Dashboard access
- tunnel: ssh -L 9119:localhost:9119
- open: http://localhost:9119
- bind: loopback-only (not public)
04 Watchers
- models: glm-5.2 watcher · 6-hourly
- alert: emails you on a new model
- tokens: long-lived (app in production)
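A watcher like this can be reduced to comparing the provider's current model list against the list seen on the previous run. The sketch below shows only that comparison. How the model list is fetched and how the email is sent are not described on this page, so both are left out, and the state file and the function name `check` are assumptions.

```python
import json
from pathlib import Path


def check(state_file: Path, current: list[str]) -> list[str]:
    """Return model ids not seen on earlier runs, then remember them."""
    seen = set(json.loads(state_file.read_text())) if state_file.exists() else set()
    fresh = [model for model in current if model not in seen]
    state_file.write_text(json.dumps(sorted(seen | set(current))))
    return fresh
```

On the very first run every model counts as new, so a real watcher would probably seed the state file once before enabling alerts.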
Common questions
FAQ.
Is my data private?
What model powers it?
What happens if it goes offline?
systemctl restart hermes-gateway) brings it back.
Can anyone else use it?
Why build your own Google connectors instead of off-the-shelf ones?
How do I add a new capability?
/new in the chat. Just ask: that’s the workflow that built this.
Why is Google Maps not available?
What's next
Roadmap & ideas.
Recently shipped, what's parked, and ideas worth exploring next.
✅ Recently shipped
Maps-lite (free)
Geocoding, place search and driving distance/ETA over OpenStreetMap — no Google billing.
Document Intelligence
Reads PDF/Word/Excel/CSV — extracts text & tables, answers questions, pushes tables to a Sheet.
Daily brief & reminders
11 AM brief to WhatsApp + email + Discord, plus a reminder at each prayer time.
Discord channel
Same Hermes, now reachable on Discord alongside WhatsApp & email.
Health monitor
Self-checks every 10 min, instant Discord alerts on trouble, plus a 4-hourly metrics heartbeat with an AI one-liner.
⏸️ Parked
Google Maps
Richer data than Maps-lite, but it needs a billing account — on hold by choice.
💡 Ideas to explore next
Multi-doc & knowledge base
Ask across many documents at once, with a personal "remember this" store to query later.
OCR & field capture
Read scanned PDFs and photos, and auto-extract invoice & contract fields.
Gmail API
Richer email than IMAP/SMTP — labels, threads, search, smarter drafting & triage.
Scheduled digests
Weekly summaries, end-of-day recaps, or topic digests on a cadence you set.
Deeper ClickUp
Auto-create tasks from chat/email/docs, recurring tasks, and status roll-ups.
Finance & expenses
Track spending, bill & card reminders, and stock or crypto price alerts.
Smart home
Control lights, devices & routines via Home Assistant, from chat.
Proactive follow-ups
"Nudge me if X hasn't happened yet" — smart reminders beyond the brief.
GitHub
Repos, issues & PRs — "what changed?", create issues, review summaries.
Voice-first & calls
Richer voice replies — and maybe a number you can actually call.
Travel & currency
Flight tracking, live PKR/USD conversion, and trip itineraries.
Image generation
"Make me a graphic / logo / diagram" straight from chat.
Urdu & multilingual
Converse in Urdu and translate on the fly.
Current limitations
- ▲Google Maps is parked — using free Maps-lite (OpenStreetMap) instead; Google's version needs billing.
- ▲Google Keep is unavailable — its API is Workspace-only and won't work with a personal account.
- ▲Unverified-app caps — the OAuth app is in production but unverified: a one-time consent warning and a 100-user cap (you're the only user).
- ▲Maps-lite is driving-only — OpenStreetMap routing covers car routes; walking/transit aren't wired yet.
History
Changelog.
How Hermes got here, by date.
- Added Weather (Open-Meteo) and Prayer times (AlAdhan · Karachi/Hanafi) — free, no API keys.
- Shipped a proactive morning brief at 11 AM to WhatsApp + email — weather, prayer, Tech + AI news, calendar (with links), tasks due today, and open tasks from Google Tasks + ClickUp (clickable links) — plus a reminder at each prayer time.
- Built Maps-lite on OpenStreetMap (geocode, search, driving directions) — no Google billing.
- Added Document Intelligence — send a PDF/Word/Excel/CSV and it extracts text & tables, answers questions, and can push tables into a Google Sheet.
- Added Discord as a third channel (same Hermes, allowlisted to you) and routed the morning brief to WhatsApp + email + Discord.
- Built a health monitor — every 10 min it checks services/channels/brain/disk and posts to a Discord #health channel: instant @mention alerts (also email + WhatsApp) on failure/stress, recovery notices, and a colorful 4-hourly heartbeat with CPU/RAM/disk progress bars, full system metrics, and a witty AI one-liner.
- Validated & hardened the ClickUp integration — added comments & tags (now 19 tools), fixed task moves & duplication, added request timeouts and rate-limit handling, taught it Karachi-time due dates, and moved its API key out of config into a secrets file. Still no destructive deletes.
- Named this portal Omair Aslam · Hermes Agent Setup and folded all of the above in.
- Added Google Docs / Drive / Sheets / Slides, then Tasks, then Contacts.
- Investigated Google Keep (Workspace-only) and declined it.
- Pivot: on the “official components only” rule, rebuilt every Google integration as in-house adapters over Google’s official client libraries.
- Published the OAuth app to production (long-lived tokens); consolidated to one shared token, no secrets in config.
- Migrated Calendar, Drive/Docs, Tasks & Contacts in-house; added Sheets & Slides in-house. Retired all third-party Google connectors.
- Installed a GLM-5.2 watcher (emails you when a newer model lands); built this dossier; Maps parked on billing.
- Web dashboard shipped as a permanent service (SSH-tunnel access).
- Weekly rotating backups of the whole setup.
- ClickUp tools trimmed 154 → 19; auxiliary chores offloaded to Gemini Flash.
- Deployed on the Oracle VPS with a WhatsApp self-chat gateway and email.
- Brain set to GLM-5.1 (z.ai); Gemini fallback + vision + TTS; Groq Whisper STT.
- First integrations: ClickUp, Context7, Google Calendar; Tavily web search; custom persona.
- Retired a duplicate bot (OpenClaw) sharing the same number.