$ emrebener

OpenClaw & Hermes Infrastructure

technologies: ubuntu, hetzner cloud, tailscale, wireguard, tinyproxy, surfshark, camoufox, systemd, docker, google pub/sub
repository: closed source
published:
updated:
type: pet

1. Agent infrastructure, not just an agent install

This is the cloud setup I built to run OpenClaw and Hermes as long-lived agents on Hetzner without exposing their admin surfaces to the public internet. The interesting part is the infrastructure around the agents: public ingress is closed, administration and service-to-service traffic ride over Tailscale, browser automation goes through Camofox, and Camofox exits through a separate WireGuard-backed HTTP CONNECT proxy instead of the raw datacenter IP.

The system is split across two Ubuntu 24.04 VPS hosts:

| Host | Role | Public ingress model |
| --- | --- | --- |
| openclaw-2 | Main agent host for OpenClaw, Hermes, Camofox, Gmail watch, SearXNG, Whisper, and job automation services | Hetzner Cloud Firewall blocks unsolicited inbound traffic |
| vpn-proxy | Single-purpose browser egress proxy for Camofox | Hetzner Cloud Firewall blocks unsolicited inbound traffic except Tailscale UDP 41641 for a direct tailnet path |

I deliberately did not make the agent host a general public web server. OpenClaw’s gateway, Hermes, Camofox, Whisper, the job tracker API, and the dashboard all bind to localhost or tailnet-reachable addresses. From my laptop I use SSH aliases that target Tailscale IPs, and the SSH config forwards the few dashboards I need onto local ports.

local machine
  -> ssh openclaw-server-tailscale
  -> local forwards:
       127.0.0.1:18789 -> OpenClaw gateway/dashboard
       127.0.0.1:15173 -> Job tracker dashboard
       127.0.0.1:13001 -> Job tracker API

The tradeoff is that I now depend on Tailscale for ordinary operations. I accepted that because the recovery path is still Hetzner’s KVM console, and the security payoff is large: no public SSH, no public Camofox, no public dashboard, and no accidental admin panel exposed because a service changed its bind address.

2. Tailscale as the private control plane

Tailscale is the control plane for this setup. Both machines have public IPs for outbound internet access, but the useful addresses are their stable 100.x.x.x tailnet addresses. SSH, OpenClaw dashboard forwarding, the browser proxy hop, and operational health checks all use the tailnet.

That choice replaced a more conventional VPS model. I could have opened SSH to my home IP, exposed dashboards behind Nginx and basic auth, and managed firewall rules per service. I went with Tailscale instead because the services are operational tools, not public products. They do not need internet ingress, and private WireGuard links are a better fit than hardening a pile of HTTP endpoints one by one.

The proxy VM keeps one public UDP exception for Tailscale’s direct connectivity path. That is not an application port. It exists so the openclaw-2 to vpn-proxy hop can stay direct instead of falling back to a DERP relay, which matters because every browser request crosses that link.

openclaw-2
  -> Tailscale encrypted peer link
  -> vpn-proxy:8888 on the tailnet

Normal ICMP and public SSH can stay blocked. The useful diagnostic here is tailscale ping, because it reports whether the machines are talking directly or through a relay, which ordinary ping cannot answer in this firewall model.
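To make the direct-versus-relay distinction concrete, here is a minimal sketch that classifies one reply line from tailscale ping. The line shapes in the comment are my assumption about the CLI's usual output, and the peer name and addresses are placeholders, so treat the regex as illustrative rather than as a stable interface.

```python
import re

# Assumed reply shapes (one line per pong), with placeholder addresses:
#   pong from vpn-proxy (100.64.0.2) via DERP(fra) in 24ms
#   pong from vpn-proxy (100.64.0.2) via 203.0.113.7:41641 in 3ms
PONG = re.compile(
    r"pong from (?P<peer>\S+) \((?P<ip>[^)]+)\) via (?P<path>\S+) in (?P<rtt>\S+)"
)


def classify_pong(line: str) -> str:
    """Return 'relay', 'direct', or 'unknown' for one line of ping output."""
    match = PONG.search(line)
    if match is None:
        return "unknown"
    # A DERP(...) path means the reply came through a relay; an ip:port
    # path means the two peers reached each other directly.
    return "relay" if match.group("path").startswith("DERP(") else "direct"
```

A health check could run the command against vpn-proxy and alert if the result is anything other than "direct".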

The nice operational detail is that the SSH command itself still looks normal. The alias lives in my local ~/.ssh/config; it points HostName at the server’s Tailscale IP, selects the Unix user and identity file, and defines the local forwards I want when I connect. There is no secret in the alias. The private key stays on my machine, and the server only sees the corresponding public key in authorized_keys.

Host openclaw-server-tailscale
    HostName 100.x.y.z
    User cc
    IdentityFile ~/.ssh/my-main-key
    IdentitiesOnly yes

    LocalForward 18789 127.0.0.1:18789
    LocalForward 15173 127.0.0.1:5173
    LocalForward 13001 127.0.0.1:3001

That is why ssh openclaw-server-tailscale can both log into the machine and make private dashboards available on my laptop as 127.0.0.1 URLs. The public firewall never has to allow inbound TCP 22, 18789, 5173, or 3001; the only reachable path is the authenticated tailnet path.
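As a small illustration of what the alias produces, the sketch below reads a stanza like the one above and lists the loopback URLs it opens on the laptop. It assumes each LocalForward uses a bare local port (no bind address) and that the forwarded services speak plain HTTP; the function name is hypothetical.

```python
def local_urls(ssh_config_stanza: str) -> list:
    """List the http://127.0.0.1:<port> URLs opened by LocalForward lines."""
    urls = []
    for raw in ssh_config_stanza.splitlines():
        parts = raw.split()
        # Shape: LocalForward <local port> <remote host:port>.
        # ssh_config keywords are case-insensitive.
        if len(parts) == 3 and parts[0].lower() == "localforward":
            urls.append(f"http://127.0.0.1:{parts[1]}")
    return urls


STANZA = """
Host openclaw-server-tailscale
    HostName 100.x.y.z
    LocalForward 18789 127.0.0.1:18789
    LocalForward 15173 127.0.0.1:5173
    LocalForward 13001 127.0.0.1:3001
"""

if __name__ == "__main__":
    for url in local_urls(STANZA):
        print(url)
```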

3. Browser identity had to be consistent

The browser layer is where the project stopped being “install two agents on a VPS” and became real infrastructure work. Agentic browsing from a Hetzner IP has two problems: the IP belongs to a known datacenter, and browser fingerprint signals can contradict the location implied by the network.

OpenClaw and Hermes do not launch arbitrary browsers directly. They talk to a shared Camofox service on openclaw-2, which wraps Camoufox, a Firefox fork aimed at reducing automation fingerprints. Camofox handles the browser-engine side: tabs, navigation, clicks, typing, and the lower-level fingerprint surface. That still leaves the network identity.

I wanted the browser to present as a stable London user:

| Signal | Target |
| --- | --- |
| Exit IP | Surfshark dedicated London IP |
| Browser timezone | Europe/London, derived from the exit IP |
| Geolocation | London, derived from the exit IP |
| Locale | pinned en-GB |
| Browser engine | Camoufox through Camofox |

The locale detail matters more than it looks. Camofox’s geoip mode can align timezone and geolocation from the proxy exit IP, but the language selection was not deterministic enough for this setup. A browser that exits in London but changes language between launches is a fingerprint in its own right. I patched the Camofox launch options so CAMOFOX_LOCALE=en-GB is passed explicitly while geoip still handles timezone and geolocation.
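The shape of that patch can be sketched as a function that builds launch options with geoip left on and the locale taken from the environment. This is not Camofox's actual code: the option keys ("proxy", "geoip", "locale") and the function name are assumptions for illustration, and only the CAMOFOX_LOCALE variable comes from the setup described here.

```python
import os
from typing import Mapping, Optional


def build_launch_options(proxy_url: str,
                         env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical launch options: geoip on, locale pinned explicitly."""
    env = os.environ if env is None else env
    locale = env.get("CAMOFOX_LOCALE")
    if not locale:
        # Refuse to launch with a language that could vary between runs.
        raise RuntimeError("CAMOFOX_LOCALE is not set; refusing to let the locale float")
    return {
        "proxy": {"server": proxy_url},  # exit path for all browser traffic
        "geoip": True,                   # timezone and geolocation follow the exit IP
        "locale": locale,                # language is pinned, not inferred
    }
```

Raising when the variable is missing keeps the language from quietly reverting to whatever the launcher would otherwise pick.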

The other important decision was failure behavior. OpenClaw’s Camofox plugin can auto-start its own browser if the shared Camofox server is down. That is convenient, but the fallback browser would not necessarily be proxied. For this system, failing loudly is better than silently crawling from the Hetzner IP.
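One way to express "fail loudly" is a preflight that refuses to continue unless the shared Camofox server answers on its loopback port. The sketch below is an assumption about how such a guard could look, not the plugin's real behavior. It only proves that something is listening on the port, not that the browser behind it is proxied.

```python
import socket


def require_shared_browser(host: str = "127.0.0.1", port: int = 9377,
                           timeout: float = 2.0) -> None:
    """Raise instead of letting a caller fall back to an unproxied browser."""
    try:
        # A successful TCP connect is the cheapest sign the service is up.
        with socket.create_connection((host, port), timeout=timeout):
            return
    except OSError as exc:
        raise RuntimeError(
            f"shared Camofox server not reachable at {host}:{port}; "
            "aborting rather than browsing without the proxy"
        ) from exc
```

A wrapper script would call this before starting an agent run and exit non-zero on the exception.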

4. Turning a VPN tunnel into an HTTP CONNECT proxy

Surfshark gives me a WireGuard configuration, not a browser proxy. Camofox needs an HTTP CONNECT proxy endpoint. The core networking task was therefore to turn “a WireGuard VPN tunnel” into “a tailnet-only authenticated HTTP proxy that Camofox can use.”

I decided against running the VPN on openclaw-2. A full-tunnel WireGuard configuration replaces the host’s default route, and on a machine I administer over Tailscale that is a good way to break access to the box. I also decided against making the proxy VM itself a host-wide VPN client, because that machine also depends on Tailscale for management and for the agent-to-proxy hop.

The solution is a Linux network namespace named vpn on the vpn-proxy host. The root namespace keeps normal Hetzner networking and Tailscale. The VPN namespace gets the Surfshark default route. tinyproxy runs inside the namespace, so only traffic that enters tinyproxy exits via Surfshark.

vpn-proxy root namespace
  eth0, tailscale0, SSH, apt, systemd management
  veth-h 10.99.99.1
  DNAT tailscale0:8888 -> 10.99.99.2:8888

vpn namespace
  veth-v 10.99.99.2
  wg0 default route
  tinyproxy on 10.99.99.2:8888
  Surfshark DNS resolvers

The subtle WireGuard part is where the interface gets created. wg0 is created in the root namespace first, then moved into the vpn namespace. WireGuard keeps the encrypted transport socket associated with the namespace where it was created, so the tunnel can still send encrypted UDP through the real host network while the tunnel interface itself lives inside the isolated namespace.
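The ordering is easier to see as a command plan. The sketch below builds the list of ip and iptables invocations without running them by default. The interface names, veth addresses, and port come from the layout above; the /24 prefix, the WireGuard config path, the tunnel address, and the return route for tailnet clients are my assumptions, and DNS setup, IP forwarding, and teardown are left out.

```python
import subprocess


def vpn_namespace_plan(wg_conf: str = "/etc/wireguard/wg0.conf",
                       tunnel_addr: str = "10.14.0.2/16") -> list:
    """Return argv lists that build the vpn namespace (placeholder defaults)."""
    return [
        ["ip", "netns", "add", "vpn"],
        # veth pair: veth-h stays in the root namespace, veth-v moves into vpn.
        ["ip", "link", "add", "veth-h", "type", "veth", "peer", "name", "veth-v"],
        ["ip", "link", "set", "veth-v", "netns", "vpn"],
        ["ip", "addr", "add", "10.99.99.1/24", "dev", "veth-h"],
        ["ip", "link", "set", "veth-h", "up"],
        ["ip", "-n", "vpn", "addr", "add", "10.99.99.2/24", "dev", "veth-v"],
        ["ip", "-n", "vpn", "link", "set", "veth-v", "up"],
        ["ip", "-n", "vpn", "link", "set", "lo", "up"],
        # Create wg0 in the ROOT namespace first so its UDP socket stays there,
        # then move only the interface into the vpn namespace.
        ["ip", "link", "add", "wg0", "type", "wireguard"],
        ["ip", "link", "set", "wg0", "netns", "vpn"],
        # Assumes wg_conf is in `wg setconf` format (no wg-quick-only keys).
        ["ip", "netns", "exec", "vpn", "wg", "setconf", "wg0", wg_conf],
        ["ip", "-n", "vpn", "addr", "add", tunnel_addr, "dev", "wg0"],
        ["ip", "-n", "vpn", "link", "set", "wg0", "up"],
        ["ip", "-n", "vpn", "route", "add", "default", "dev", "wg0"],
        # Assumption: replies to tailnet clients go back over the veth pair.
        ["ip", "-n", "vpn", "route", "add", "100.64.0.0/10", "via", "10.99.99.1"],
        ["iptables", "-t", "nat", "-A", "PREROUTING", "-i", "tailscale0",
         "-p", "tcp", "--dport", "8888",
         "-j", "DNAT", "--to-destination", "10.99.99.2:8888"],
    ]


def run_plan(plan: list, dry_run: bool = True) -> None:
    """Print the plan, or execute it as root when dry_run is False."""
    for argv in plan:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
```

Keeping the plan as data makes the create-then-move order of wg0 reviewable before anything touches the host.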

The resulting request path is the whole project in one diagram:

agent on openclaw-2
  -> Camofox on 127.0.0.1:9377
  -> Camoufox browser configured with an HTTP proxy
  -> Tailscale link to vpn-proxy:8888
  -> iptables DNAT into the vpn namespace
  -> tinyproxy
  -> WireGuard wg0
  -> Surfshark London
  -> target website

That gives me the property I wanted: the browser’s traffic takes the VPN, while management traffic, package updates, SSH, and Tailscale itself do not.
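That property can be checked from the agent host by asking an IP echo service what address it sees through the proxy. The sketch below uses only the standard library. The echo URL is left as a parameter (any endpoint that returns the caller's IP as plain text would do), and passing proxy credentials inside the proxy URL is an assumption about how the proxy authentication is configured.

```python
import urllib.request


def fetch_exit_ip(proxy_url: str, echo_url: str, timeout: float = 10.0) -> str:
    """Return the address an IP echo service sees when we go via the proxy."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    )
    with opener.open(echo_url, timeout=timeout) as resp:
        return resp.read().decode().strip()


def assert_exit_ip(seen: str, expected: str) -> None:
    """Raise if the observed exit address is not the expected VPN address."""
    if seen != expected:
        raise RuntimeError(
            f"exit IP is {seen}, expected {expected}; "
            "browser traffic is not taking the VPN path"
        )
```

Running the same fetch without the proxy should return the Hetzner address, which confirms that only proxied traffic changes its exit.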

5. Systemd makes it survive reboot

The proxy stack is rebuilt by systemd instead of by a hand-run script. vpn-netns.service creates the namespace, veth pair, WireGuard interface, routes, DNS, and DNAT rule. tinyproxy-vpn.service starts tinyproxy inside that namespace. On the agent host, Camofox, OpenClaw, Hermes, the Gmail watch, and the job tracker all run as systemd user services with linger enabled for their owning users.

The split by Unix user is intentional:

| User | Owns |
| --- | --- |
| claw | OpenClaw, SearXNG, Whisper, Gmail watch, job automation |
| herm | Hermes, Camofox, KittenTTS |
| cc | Claude Code and general operations |

I could have run everything under one agent account and called it done. I separated the services because the homes and state directories are meaningful boundaries: OpenClaw state lives under ~claw/.openclaw, Hermes under ~herm/.hermes, browser service ownership under herm, and job automation assets under claw. It makes service management more verbose, especially because user systemd commands need XDG_RUNTIME_DIR, but it keeps the blast radius and operational model much clearer.
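The XDG_RUNTIME_DIR detail is the part that usually trips people up, so here is a minimal sketch of a helper that builds the command line for managing another user's services. It assumes the conventional /run/user/<uid> runtime directory and that the caller may use sudo; the unit name in the usage comment is hypothetical.

```python
import pwd
from typing import Optional


def user_systemctl(user: str, *args: str, uid: Optional[int] = None) -> list:
    """Build an argv that runs `systemctl --user` as another Unix user."""
    if uid is None:
        # Look up the numeric uid of the service owner on this host.
        uid = pwd.getpwnam(user).pw_uid
    return [
        "sudo", "-u", user,
        "env", f"XDG_RUNTIME_DIR=/run/user/{uid}",
        "systemctl", "--user", *args,
    ]


# Usage (hypothetical unit name):
#   subprocess.run(user_systemctl("claw", "status", "openclaw.service"), check=True)
```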

The same pattern shows up in the job automation layer. The tracker is the source of truth, not prompt memory or browser tabs:

finder timer
  -> OpenClaw job-opportunity-finder skill
  -> POST /api/jobs

manual applier run
  -> OpenClaw job-application-worker skill
  -> POST /api/jobs/claim-next
  -> generate tailored cover letter
  -> apply through Camofox
  -> PATCH /api/jobs/:id/status

stale recovery timer
  -> POST /api/jobs/recover-stale

The finder runs on a timer with queue backpressure, while the applier timer is intentionally disabled because form automation is still risky enough that I prefer manual runs. The wrappers do cheap tracker preflight checks before launching OpenClaw, so an empty queue or already-full queue does not burn an agent session.
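The preflight logic amounts to two cheap predicates over the tracker's queue state. The sketch below assumes the wrapper has already fetched a snapshot from the tracker; the field names and the max_pending threshold are hypothetical, since the tracker's real schema is not shown here.

```python
from dataclasses import dataclass


@dataclass
class QueueSnapshot:
    pending: int      # jobs waiting to be claimed (hypothetical field)
    in_progress: int  # jobs currently claimed by a worker (hypothetical field)


def finder_should_run(queue: QueueSnapshot, max_pending: int) -> bool:
    """Backpressure: skip the finder when the queue is already full."""
    return queue.pending < max_pending


def applier_should_run(queue: QueueSnapshot) -> bool:
    """Skip the applier when there is nothing to claim."""
    return queue.pending > 0
```

Either predicate returning False lets the wrapper exit before an agent session is started.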

6. The Gmail pipeline is push-driven

OpenClaw also triages Gmail. I wanted this to be event-driven instead of a polling script that checks mail every few minutes. The path uses Gmail push notifications, Google Pub/Sub, Tailscale Funnel, and a local gog gmail watch serve process.

Gmail inbox
  -> Google Pub/Sub push subscription
  -> Tailscale Funnel public URL
  -> 127.0.0.1:8788 gog gmail watch serve
  -> 127.0.0.1:18789/hooks/gmail
  -> OpenClaw triage session
  -> Telegram DM for important mail

The public edge here is deliberately narrow. Tailscale Funnel exposes only the Gmail Pub/Sub receiver path. The actual OpenClaw gateway remains localhost-only on openclaw-2, and the watch server forwards into it over loopback with a hook token.

There were two operational details worth making boring. Gmail watch registrations expire after seven days, so a systemd user timer renews the watch daily and restarts the watch service. Also, gog gmail watch serve returning HTTP 202 does not mean an email was delivered. A real forwarded message is HTTP 200; 202 can simply mean a duplicate push or a Gmail history event with nothing useful to forward. That distinction is now part of the runbook because it changes how I debug the pipeline.
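The runbook rule can be written down as a tiny classifier over the watch server's response codes. This is a sketch of the interpretation described above, not part of gog itself; the label strings are made up for illustration.

```python
from collections import Counter


def classify_watch_status(status: int) -> str:
    """Interpret a response code from the watch server per the runbook."""
    if status == 200:
        return "forwarded"            # a real message reached the hook
    if status == 202:
        return "nothing-to-forward"   # duplicate push or empty history event
    return "other"                    # anything else needs a closer look


def summarize(statuses: list) -> dict:
    """Count how many pushes fell into each category."""
    return dict(Counter(classify_watch_status(s) for s in statuses))
```

When debugging, a run of "nothing-to-forward" with no "forwarded" entries points at Gmail history state, not at a broken hook.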

7. Defensive boundaries around automation

The job automation side has explicit policy boundaries, not just prompt instructions. LinkedIn and Indeed are excluded from the pipeline. LinkedIn is too valuable to risk with bot automation, and Indeed is excluded by policy.

I enforced that in several layers:

  • The finder goals and agent skills say not to search, browse, scrape, or apply through either platform.
  • The job tracker API rejects LinkedIn and Indeed job creation.
  • tinyproxy blocks LinkedIn domains before Camofox can browse them.
  • The old LinkedIn automation protocol was moved out of the active OpenClaw workspace.

This is the same design principle as the proxy work: prompts are not enough when the failure mode matters. If an agent should not cross a boundary, put the boundary in the surrounding system too.
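The API-level boundary can be sketched as a hostname check that runs before a job is accepted. The domain list and function name below are assumptions for illustration; the real tracker may match on more domains or on other fields.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist; regional hosts such as uk.indeed.com match by suffix.
BLOCKED_DOMAINS = ("linkedin.com", "indeed.com")


def is_blocked_job_url(url: str) -> bool:
    """Return True when the URL's host is a blocked domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS)
```

Matching on the parsed hostname instead of a substring means a URL that only mentions a blocked domain in its path or query string is not rejected by mistake.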

8. The result

The end state is a small private agent platform rather than a single bot process. OpenClaw and Hermes run continuously on openclaw-2; their browsing goes through a shared Camofox server; browser traffic exits through a London WireGuard tunnel on a separate proxy VM; admin access and service-to-service traffic stay inside Tailscale; and the whole thing comes back after reboot through systemd.

The main technical choices were all about reducing hidden coupling:

  • Tailscale handles the private control plane, so the cloud firewall can block public admin ingress.
  • The VPN lives on a separate host, so a route mistake cannot take down the agent host.
  • WireGuard lives inside a network namespace, so a full-tunnel VPN cannot steal the proxy VM’s own management path.
  • Camofox gets a stable proxy, Europe/London geo signals, and a pinned en-GB locale, so browser identity stays consistent.
  • The tracker, Gmail watch, and job automation wrappers give agents explicit state boundaries instead of depending on chat history.

The setup is not pretending to be undetectable. A Surfshark exit is still a VPN exit, and the strictest websites can score that. The point is to remove the cheap inconsistencies that get browser automation blocked quickly: raw datacenter IP, mismatched locale, unstable language, public admin surfaces, and ad hoc long-running processes with no service owner. The work was mostly routing, isolation, systemd, and failure-mode design, which is exactly where reliable agent infrastructure seems to live.