Infrastructure Findings — SquareMCP / FetcherPay
This document captures the as-built architecture, ingress behavior, monitoring state, and Hermes route table discovered during the 2026-06-14 outage response.
1. High-level architecture
The single production server (104.190.60.129) hosts two separate ingress layers:
| Ingress Layer | Technology | Serves |
|---|---|---|
| Docker edge proxy | Traefik v3 | `*.fetcherpay.com` Docker Compose stacks, plus static file-provider routes for `*.squaremcp.com` |
| Kubernetes ingress | nginx-ingress-microk8s + cert-manager | `*.squaremcp.com` K8s workloads (currently bypassed by Traefik) |
Both layers use Let’s Encrypt TLS. Public ports 80/443 are bound by the Docker Traefik container, so its iptables rules win over host-network K8s services.
2. Traefik configuration
Static config
File: /home/garfield/traefik.yml
- Dashboard enabled on `:8080` with `insecure: true`.
- Entrypoints: `web` (HTTP → HTTPS redirect) and `websecure` (HTTPS, `:443`).
- Providers: Docker (socket) + file provider (`/letsencrypt/manual/tls.yml`, `watch: true`).
- Certificate resolver: `letsencrypt` via GoDaddy DNS-01.
Compose
File: /home/garfield/traefik-compose.yml
- Networks: `hermes-net`, `obsidian-net`, `fetcherpay` (all external).
- Volumes: Docker socket, static config, `letsencrypt` directory.
Dynamic routing
File: /home/garfield/letsencrypt/manual/tls.yml
After the fix, this file defines file-provider routers for all commercial domains, plus path-specific rules that send /api/pilot-request and /auth/tiktok to Hermes.
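The routing behavior described here can be modeled in a few lines of TypeScript. This is only a sketch of the decision logic, not the contents of tls.yml: the host `example.squaremcp.com` and the `static-site` backend name are hypothetical placeholders, and the model simply prefers the longest matching path prefix, whereas the real file may set explicit router priorities.

```typescript
// Simplified model of host + path-prefix routing.
// The host and the "static-site" backend are hypothetical placeholders.
interface RouteRule {
  host: string;
  pathPrefix?: string; // undefined means "match every path on this host"
  backend: string;
}

const routes: RouteRule[] = [
  { host: "example.squaremcp.com", backend: "static-site" },
  { host: "example.squaremcp.com", pathPrefix: "/api/pilot-request", backend: "hermes" },
  { host: "example.squaremcp.com", pathPrefix: "/auth/tiktok", backend: "hermes" },
];

// Pick the matching rule with the longest path prefix, so path-specific
// rules take precedence over the host-wide catch-all.
function pickBackend(host: string, path: string, table: RouteRule[] = routes): string | undefined {
  const matches = table.filter(
    (r) => r.host === host && (r.pathPrefix === undefined || path.startsWith(r.pathPrefix)),
  );
  matches.sort((a, b) => (b.pathPrefix?.length ?? 0) - (a.pathPrefix?.length ?? 0));
  return matches[0]?.backend;
}

console.log(pickBackend("example.squaremcp.com", "/auth/tiktok/callback"));
console.log(pickBackend("example.squaremcp.com", "/pricing"));
```

A request for a host that has no rule at all returns `undefined`, which corresponds to the edge proxy having no router for that domain.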
3. Kubernetes ingress mismatch
- Controller class: `public`
- Ingress class used by manifests: `nginx`
This means the active controller ignores most Ingress resources. Even if Traefik were removed, those Ingresses would not be served until the class is reconciled.
Affected manifests include:
- `hermes-mcp/hermes-k8s.yaml`
- `hermes-mcp/product/app/app-k8s.yaml`
- `hermes-mcp/docs/docs-k8s.yaml`
- `hermes-mcp/product/site/squaremcp-k8s-ingress.yaml`
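A class mismatch of this kind is easy to detect mechanically. The sketch below assumes the class of each manifest has already been extracted into a single `ingressClassName` field (real manifests may use the `spec.ingressClassName` field or the older annotation instead); the `IngressManifest` type and `findUnservedIngresses` helper are illustrative names, not existing code.

```typescript
// Hypothetical, already-parsed view of an Ingress manifest.
interface IngressManifest {
  file: string;
  ingressClassName: string;
}

// Return the manifests whose class the active controller would ignore.
function findUnservedIngresses(
  controllerClass: string,
  manifests: IngressManifest[],
): IngressManifest[] {
  return manifests.filter((m) => m.ingressClassName !== controllerClass);
}

const manifests: IngressManifest[] = [
  { file: "hermes-mcp/hermes-k8s.yaml", ingressClassName: "nginx" },
  { file: "hermes-mcp/docs/docs-k8s.yaml", ingressClassName: "nginx" },
];

// With the controller class set to "public", both manifests are reported.
const unserved = findUnservedIngresses("public", manifests);
console.log(unserved.map((m) => m.file));
```

Run against the real manifests, an empty result would indicate that the classes have been reconciled.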
4. Hermes MCP route table
File: hermes-mcp/src/index.ts
Public / commercial endpoints
| Method | Path | Notes |
|---|---|---|
| GET | `/` | Static files from `../product` |
| GET | `/openapi-living-brief.json` | Obsidian-only OpenAPI spec for ChatGPT |
| GET | `/openapi.json` | Full OpenAPI spec |
| GET | `/auth/tiktok/start` | Redirect to TikTok Login Kit |
| GET | `/auth/tiktok/callback` | TikTok OAuth callback |
| POST | `/api/pilot-request` | Public form submission; origin-gated |
| GET | `/health` | Liveness/readiness probe |
OAuth / MCP discovery
| Method | Path |
|---|---|
| POST | `/oauth/register` |
| GET / POST | `/oauth/authorize` |
| POST | `/oauth/token` |
| GET | `/.well-known/oauth-authorization-server` |
| GET | `/.well-known/openid-configuration` |
| GET / POST / DELETE | `/mcp` |
| GET | `/sse` |
| POST | `/messages` |
| GET | `/tools` |
Capability-guarded tool API
All `/api/*` tool routes require authentication plus a matching capability grant:
| Capability | Example endpoints |
|---|---|
| `obsidian` | `/api/obsidian/search`, `/api/obsidian/note`, `/api/obsidian/note/append`, `/api/obsidian/sync` |
| `email` | `/api/email/profile`, `/api/email/search`, `/api/email/read`, `/api/email/send` |
| `whatsapp` | `/api/whatsapp/send`, `/api/whatsapp/templates` |
| `linkedin` | `/api/linkedin/profile`, `/api/linkedin/post`, `/api/linkedin/message` |
| `telegram` | `/api/telegram/me`, `/api/telegram/message`, `/api/telegram/updates` |
| `discord` | `/api/discord/me`, `/api/discord/guilds`, `/api/discord/message` |
| `instagram` | `/api/instagram/profile`, `/api/instagram/media`, `/api/instagram/post` |
| `twitter` | `/api/twitter/search`, `/api/twitter/tweets`, `/api/twitter/tweet` |
| `facebook` | `/api/facebook/page`, `/api/facebook/posts`, `/api/facebook/post` |
| `tiktok` | `/api/tiktok/profile`, `/api/tiktok/video`, `/api/tiktok/video/status` |
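The guard implied by this table can be sketched as a pure decision function. This is not the code in index.ts; it assumes the capability is always the first path segment after `/api/`, which matches every example endpoint listed, and the names `requiredCapability`, `authorize`, and `Decision` are hypothetical.

```typescript
const CAPABILITIES = [
  "obsidian", "email", "whatsapp", "linkedin", "telegram",
  "discord", "instagram", "twitter", "facebook", "tiktok",
] as const;
type Capability = (typeof CAPABILITIES)[number];

// Derive the capability from the first path segment after /api/.
// Paths outside the capability list (e.g. /api/pilot-request) yield undefined.
function requiredCapability(path: string): Capability | undefined {
  const match = /^\/api\/([^/]+)(?:\/|$)/.exec(path);
  if (!match) return undefined;
  const segment = match[1] ?? "";
  return (CAPABILITIES as readonly string[]).includes(segment)
    ? (segment as Capability)
    : undefined;
}

type Decision = "allow" | "unauthenticated" | "forbidden" | "not-a-tool-route";

// grants is undefined when the caller presented no valid credentials.
function authorize(path: string, grants: ReadonlySet<Capability> | undefined): Decision {
  const needed = requiredCapability(path);
  if (needed === undefined) return "not-a-tool-route";
  if (grants === undefined) return "unauthenticated";
  return grants.has(needed) ? "allow" : "forbidden";
}

console.log(authorize("/api/email/send", new Set<Capability>(["email"])));
console.log(authorize("/api/pilot-request", undefined));
```

Keeping the decision separate from the HTTP layer means it can be unit-tested without starting the server; a middleware would map `unauthenticated` and `forbidden` to the appropriate status codes.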
Health endpoint
```typescript
app.get('/health', (_req, res) => {
  res.json({
    status: 'ok',
    service: 'hermes-mcp',
    toolCount,
    transports,
    endpoints,
  });
});
```
This endpoint is used by both the K8s readiness and liveness probes in hermes-k8s.yaml.
5. Monitoring gaps
Prometheus / Grafana
- Prometheus and Grafana containers exist in `docker-compose.fetcherpay.yml`.
- Prometheus scrapes itself, `fetcherpay-api:3000`, and Docker metrics at `172.20.0.1:9323`.
- Hermes MCP is not scraped and has no `/metrics` endpoint.
- No Alertmanager, no alert rules.
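To illustrate what a `/metrics` endpoint for Hermes would need to return, the sketch below renders a counter in the Prometheus text exposition format. The metric name `hermes_http_requests_total` and its labels are hypothetical, and in practice a client library such as `prom-client` would normally generate this output rather than hand-written code.

```typescript
interface CounterSample {
  labels: Record<string, string>;
  value: number;
}

interface CounterMetric {
  name: string;
  help: string;
  samples: CounterSample[];
}

// Escape backslash, double quote, and newline in label values,
// as the text exposition format requires.
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

// Render one counter as HELP/TYPE comment lines followed by its samples.
function renderCounter(counter: CounterMetric): string {
  const lines = [
    `# HELP ${counter.name} ${counter.help}`,
    `# TYPE ${counter.name} counter`,
  ];
  for (const sample of counter.samples) {
    const labels = Object.entries(sample.labels)
      .map(([key, value]) => `${key}="${escapeLabelValue(value)}"`)
      .join(",");
    lines.push(
      labels
        ? `${counter.name}{${labels}} ${sample.value}`
        : `${counter.name} ${sample.value}`,
    );
  }
  return lines.join("\n") + "\n";
}

// Hypothetical metric for illustration only.
console.log(
  renderCounter({
    name: "hermes_http_requests_total",
    help: "Total HTTP requests handled.",
    samples: [{ labels: { path: "/health", status: "200" }, value: 12 }],
  }),
);
```

Once such an endpoint exists, Hermes would also need to be added as a scrape target in the Prometheus configuration before any data is collected.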
Health checks
- Hermes has `/health` but no `/ready` or `/livez` separation.
- Docker health checks exist for Postgres, MySQL, Redis, Gitea, and FetcherPay API, but not for Hermes.
Uptime / synthetic probes
- No blackbox exporter.
- No external uptime monitoring (Pingdom, UptimeRobot, Grafana Cloud, etc.).
- No cert-expiry alerting.
- No K8s ingress reconciliation check.
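As one example of closing these gaps, a cert-expiry check reduces to comparing a certificate's `notAfter` date against a warning threshold. The sketch below covers only that comparison; the 14-day default is an arbitrary assumption, and fetching the date (for instance from the `valid_to` field returned by Node's `tls` `getPeerCertificate()`) is left out.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Whole days remaining until the certificate expires (negative once expired).
function daysUntilExpiry(notAfter: Date, now: Date): number {
  return Math.floor((notAfter.getTime() - now.getTime()) / MS_PER_DAY);
}

type CertState = "ok" | "warning" | "expired";

// warnDays is an assumed threshold; tune it to the rotation process in use.
function classifyCert(notAfter: Date, now: Date, warnDays = 14): CertState {
  if (notAfter.getTime() <= now.getTime()) return "expired";
  return daysUntilExpiry(notAfter, now) < warnDays ? "warning" : "ok";
}

const now = new Date("2026-06-14T00:00:00Z");
console.log(classifyCert(new Date("2026-07-14T00:00:00Z"), now));
console.log(classifyCert(new Date("2026-06-20T00:00:00Z"), now));
```

A scheduled job could run this against each commercial domain and raise an alert on anything other than `ok`.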
Logs
- No centralized log aggregation (Loki, Vector, Fluentd).
6. Secret management
- `hermes-k8s.yaml` is gitignored and contains plaintext secrets (email, DB, OAuth, API keys).
- Docker Compose stacks rely on exported env vars or `.env` files.
- No Sealed Secrets, External Secrets Operator, or Vault in use.
7. Notable risks
- Single point of failure: one residential IP, one host, one edge proxy.
- Split edge: two ingress controllers with conflicting class configuration.
- Manual certificate workaround: static K8s-extracted certs in Traefik must be manually rotated before expiry.
- No observability: no metrics, alerting, or synthetic probes for the commercial domains.
- Stopped services not detected: Docker restart policies only apply to containers that have already been started, so a container that was stopped manually or never started stays down unnoticed.