ACME With Caddy + Let's Encrypt — Free & Automated TLS Certificates
Setting up HTTPS used to be an ordeal. Buy a certificate from a CA, prove ownership, install the cert by hand, set a calendar reminder for the next year, and hope you didn’t forget. Today the same job is two lines of config and zero ongoing maintenance. Three pieces made that change: Let’s Encrypt (a free CA), ACME (the protocol that automates everything between client and CA), and reverse proxies like Caddy that bake the whole flow into the server itself.
1. What a TLS certificate actually is
A TLS certificate is a signed statement that says “this public key belongs to this domain name”. That’s it. It doesn’t encrypt anything on its own — it’s an identity document, signed by a third party the browser already trusts, that lets a server prove who it is during the TLS handshake.
The cryptography behind the scenes is public-key (asymmetric) crypto. A keypair has two halves: a private key, which the server keeps secret, and a public key, which it hands out freely. Anyone can encrypt a message with the public key that only the private key can decrypt, and the private key can produce signatures that anyone can verify with the public key. TLS uses both directions.
The certificate itself is a structured document in a format called X.509. The relevant fields are roughly:
- Subject. Identifies who the cert is for. For modern web certs the hostname lives in the Subject Alternative Name (SAN) extension, possibly multiple names. Browsers ignore the older Common Name field for hostname matching.
- Subject Public Key. The actual public key the cert is binding to the subject.
- Issuer. The CA that signed the cert.
- Validity. “Not before” and “Not after” timestamps. Modern public certs cap at 398 days; Let’s Encrypt issues for 90.
- Signature. The issuer’s signature over the rest of the cert’s contents.
During a TLS handshake, the server sends this certificate to the client. The client checks three things: the signature is valid and traces back to a CA the client already trusts (more on that next), the cert hasn’t expired or been revoked, and the SAN contains the hostname the client was trying to reach. If all three pass, the client believes the public key in the cert really belongs to the server it’s talking to, and the two sides go on to negotiate the symmetric session keys that encrypt the actual traffic.
So the cert is a credential, and the encryption is downstream of trusting it.
2. Certificate authorities and the chain of trust
Trust on the public web bottoms out in a small set of root CAs whose public keys are baked into operating systems and browsers. Everything else inherits its trustworthiness through a chain of signatures that ends at one of those roots.
Root CAs don’t sign end-entity certs directly. The root keypair is too valuable to expose, so it lives offline in a hardware security module and only comes out to sign intermediate CAs, which then do the day-to-day work of signing customer certs. A real-world chain looks like this:
ISRG Root X1 (root, in every modern trust store)
└── Let's Encrypt R11 (intermediate, signed by the root)
    └── example.com (your cert, signed by the intermediate)
When a TLS server hands a client its certificate, it actually hands over the leaf certificate plus any intermediates needed to bridge to a root. The client already has the root locally and uses it to verify the intermediate’s signature, then uses the intermediate’s public key to verify the leaf’s signature. If any link is broken, expired, or signed by something not in the trust store, the chain fails and the browser shows a warning.
Two practical consequences fall out of this:
- Configure intermediates, or expect mysterious failures. A common misconfiguration is serving only the leaf cert. It works in Chrome on a developer laptop (Chrome will sometimes fetch the missing intermediate itself) and fails everywhere else. Caddy handles this automatically; if you ever do TLS by hand, remember to concatenate the intermediate.
- A root rolling its key is a multi-year migration. During Let’s Encrypt’s first years, IdenTrust cross-signed the X1 root of ISRG (the org behind Let’s Encrypt), and that cross-signature was the only thing keeping older Android devices working. The chain matters for compatibility too, not just verification.
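The chain walk described in this section can be sketched as a toy model. This is only an illustration: a link here is just "issuer name equals subject name", whereas a real client verifies every signature and validity window. The `Cert` class and `build_chain` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str
    issuer: str


def build_chain(leaf, presented, trust_store):
    """Return the subjects from leaf to root, or None if no path exists.

    Toy model: no signatures or expiry are checked, only issuer names.
    """
    by_subject = {c.subject: c for c in presented}
    chain = [leaf.subject]
    current = leaf
    # Bound the walk so a loop of issuers cannot spin forever.
    for _ in range(len(presented) + 1):
        if current.issuer in trust_store:
            return chain + [current.issuer]
        parent = by_subject.get(current.issuer)
        if parent is None:
            return None  # missing intermediate: the chain cannot be built
        chain.append(parent.subject)
        current = parent
    return None


TRUST_STORE = {"ISRG Root X1"}
LEAF = Cert("example.com", "Let's Encrypt R11")
R11 = Cert("Let's Encrypt R11", "ISRG Root X1")
```

Calling `build_chain(LEAF, [R11], TRUST_STORE)` finds the path to the root, while `build_chain(LEAF, [], TRUST_STORE)` returns `None`, which is the "served only the leaf" misconfiguration from the first bullet.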
3. Let’s Encrypt: free certs, by design
Let’s Encrypt is a free, automated, publicly-trusted CA operated by the Internet Security Research Group (ISRG). Its goal is to make HTTPS the default for the entire web by removing the two friction points that kept it from being so: cost and manual operation.
Before Let’s Encrypt, a DV (Domain Validation) cert from a commercial CA cost up to roughly $200 per year and required someone to receive a verification email, click a link, download a .crt file, and install it on the server. Multiply that across every blog, every internal tool, every side project, and HTTPS-everywhere just wasn’t going to happen.
Let’s Encrypt makes two opinionated choices that change the math:
- Certs are free. Funded by sponsors and donations rather than per-cert fees.
- Certs are short-lived (90 days) and machine-issued. No email handshake, no waiting for a human at the CA, no fee. The protocol does the work.
The 90-day lifetime is the part that throws people off. A common reaction is “why force me to renew four times a year instead of once?” Short lifetimes are a feature, though:
- They make automation non-optional. A 90-day cert that has to be renewed without human help forces you to actually wire up the renewal, which means by the time something goes wrong, the recovery path is already exercised. A 397-day cert that “just needs to be replaced annually” is a calendar entry that gets missed.
- They limit the blast radius of a key compromise. Revocation infrastructure on the public web is famously flaky (OCSP, CRL); short lifetimes are a backstop that works even when revocation doesn’t.
- They let the CA roll protocol changes faster. When SHA-1 needed to die, when key sizes needed to grow, when log requirements changed, short-lived certs let the ecosystem move in months rather than years.
Note that Let’s Encrypt only issues DV certs, the kind that prove “you control the domain”, nothing more. EV (Extended Validation) and OV (Organization Validation) certs require manual paperwork verifying a legal entity, and Let’s Encrypt doesn’t compete in that space. For ~95% of the web, DV is all that matters; the green company-name bar that EV used to render is mostly gone from browsers anyway.
4. ACME: the protocol that makes automation possible
ACME (Automatic Certificate Management Environment, RFC 8555) is the protocol a client speaks to a CA to register, request a certificate, prove domain control, and pick the issued cert up, all over HTTPS with JWS-signed JSON payloads. It’s what makes machine-issued certs possible. Let’s Encrypt is the most prominent ACME server but not the only one: ZeroSSL, Google’s public CA, and internal CAs like Smallstep all speak ACME too.
4.1. Account, order, authorization, challenge
The ACME flow has four states, in order: account, order, authorization, challenge. Each gets its own definition:
- Account. The client generates a keypair and registers it with the CA. The CA returns an account URL. Every subsequent request is signed by this account key, so the CA always knows who’s asking. You usually have one account per server (Caddy manages this for you).
- Order. The client asks for a certificate covering one or more domain names. The CA returns the order with a list of authorization objects, one per domain.
- Authorization. For each domain, the CA wants proof that the requester controls it. Authorization objects contain one or more challenges, and the client picks one to complete. Once an authorization is valid, it’s cached for a while; re-ordering for the same domain inside that window skips straight to the next step.
- Challenge. The actual proof-of-control mechanism. ACME defines three: HTTP-01, DNS-01, and TLS-ALPN-01. Each works on a different layer.
Once every domain in the order has at least one valid authorization, the client submits a Certificate Signing Request (CSR), the CA finalizes the order, and the client downloads the issued certificate. The CA also publishes the cert to Certificate Transparency logs, public append-only logs that browsers cross-check, which is what makes large-scale silent mis-issuance very hard.
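The relationship between authorizations and the order can be sketched as a small status function. The status names follow RFC 8555, but the function itself (`order_status`) is a simplified, hypothetical illustration and not how any particular CA or client implements it.

```python
# Authorization statuses that can never become valid again (per RFC 8555).
DEAD = ("invalid", "expired", "revoked", "deactivated")


def order_status(authorization_statuses, finalized=False, issued=False):
    """Derive a simplified order status from its authorizations.

    finalized: the client has submitted its CSR.
    issued:    the CA has produced the certificate.
    """
    if any(s in DEAD for s in authorization_statuses):
        return "invalid"
    if not all(s == "valid" for s in authorization_statuses):
        return "pending"      # at least one domain still needs a challenge
    if issued:
        return "valid"        # certificate is ready to download
    if finalized:
        return "processing"   # CSR submitted, CA is issuing
    return "ready"            # all domains proven, waiting for the CSR
```

For a two-name order, `order_status(["valid", "pending"])` stays `"pending"` until the second domain is proven, at which point the order becomes `"ready"` and the CSR can be submitted.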
The challenge step is where the protocol meets your infrastructure. Three challenge types exist because no single one works in every environment.
4.2. HTTP-01
HTTP-01 proves you control a domain by serving a specific file at a specific path over plain HTTP on port 80.
The CA gives the client a token (a random string). The client computes a key authorization by combining the token with the SHA-256 thumbprint of the account public key: keyAuthorization = token || '.' || base64url(SHA256(accountKey)). It then serves that string at:
http://{domain}/.well-known/acme-challenge/{token}
The CA fetches that URL (following redirects within rules), checks the body matches, and marks the authorization valid.
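A minimal sketch of the key authorization computation, assuming the account key is available as a JWK dictionary. The thumbprint is taken over the canonical JWK form defined in RFC 7638 (required members only, sorted, no whitespace). The JWK values and the token below are placeholders, not a real key, and the function names are hypothetical.

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, as ACME uses throughout."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    # RFC 7638: hash only the required members, sorted, with no whitespace.
    required = {"EC": ("crv", "kty", "x", "y"), "RSA": ("e", "kty", "n")}[jwk["kty"]]
    canonical = json.dumps(
        {name: jwk[name] for name in required},
        sort_keys=True,
        separators=(",", ":"),
    )
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def key_authorization(token: str, account_jwk: dict) -> str:
    return f"{token}.{jwk_thumbprint(account_jwk)}"


def http01_url(domain: str, token: str) -> str:
    return f"http://{domain}/.well-known/acme-challenge/{token}"


# Placeholder values for illustration only.
EXAMPLE_JWK = {"kty": "EC", "crv": "P-256", "x": "placeholder-x", "y": "placeholder-y"}
EXAMPLE_TOKEN = "placeholder-token"
```

The body served at `http01_url("example.com", EXAMPLE_TOKEN)` would be `key_authorization(EXAMPLE_TOKEN, EXAMPLE_JWK)`: the token, a dot, and a 43-character thumbprint.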
HTTP-01 is the default for most setups because it requires zero domain-side configuration: no DNS API token, no special TLS handling. Two limits to be aware of:
- It needs port 80 reachable from the public internet. Cloud environments that only expose 443, or backends sitting behind a load balancer that doesn’t forward port 80, can’t use HTTP-01.
- It can’t issue wildcards. A wildcard cert (*.example.com) has no specific hostname to serve a token under, so HTTP-01 isn’t applicable.
4.3. DNS-01
DNS-01 proves control by setting a TXT record on the domain itself, which means it works even when nothing is publicly reachable over HTTP.
The client computes the same key authorization as HTTP-01, then publishes base64url(SHA256(keyAuthorization)) as the value of:
_acme-challenge.{domain}. TXT "...base64url hash..."
The CA queries DNS for that TXT record (using its own resolvers, not yours) and validates the value.
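As a rough sketch, the TXT record name and value can be derived like this. The function name `dns01_record` is hypothetical, and the key authorization is assumed to have been computed already as described for HTTP-01.

```python
import base64
import hashlib


def dns01_record(domain: str, key_authorization: str):
    """Return (record_name, txt_value) for a DNS-01 challenge."""
    # A wildcard name is validated at the base domain, so drop the "*." label.
    if domain.startswith("*."):
        domain = domain[2:]
    name = "_acme-challenge." + domain.rstrip(".") + "."
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # base64url without padding: a SHA-256 digest always yields 43 characters.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return name, value
```

Both `dns01_record("example.com", ka)` and `dns01_record("*.example.com", ka)` produce the record name `_acme-challenge.example.com.`, which is why one zone-level credential can cover the apex and the wildcard.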
DNS-01 is the only way to get a wildcard cert. It’s also the right choice when:
- Port 80 is firewalled off and you can’t open it.
- The origin server isn’t publicly reachable at all (e.g. it sits behind Cloudflare’s tunnel, or it’s an internal-only ingress that still has a public DNS name).
- You’re issuing certs for many subdomains and would rather not deal with HTTP routing for each.
The downside is that the ACME client now needs API credentials to write to your DNS zone. Caddy handles this through provider-specific modules covered in section 7.
4.4. TLS-ALPN-01
TLS-ALPN-01 is the third option, designed for the case where port 443 is open but port 80 isn’t.
The client serves a self-signed certificate during a TLS handshake that negotiates the acme-tls/1 ALPN (Application-Layer Protocol Negotiation) identifier. ALPN lets a client and server pick which protocol to speak over a TLS connection, normally h2 for HTTP/2 or http/1.1, but the ACME CA opens a handshake asking for the special acme-tls/1 identifier on port 443. The TLS server must respond with a specially-formed self-signed cert that carries a SHA-256 digest of the key authorization in a dedicated X.509 extension. The CA verifies the extension contents and marks the authorization valid.
Nothing transfers over the TLS connection at the application layer. The whole proof lives in the handshake metadata, so you don’t need an HTTP listener and you don’t need to interrupt your real HTTPS traffic.
TLS-ALPN-01 is mostly useful in load-balancer scenarios where only port 443 reaches your server. Because it happens during a real TLS handshake on port 443, the same listener can serve both real traffic and ACME validation handshakes without conflict. Caddy tries it before HTTP-01 for that reason.
5. Caddy: the reverse proxy that hides all of this
Caddy is a web server and reverse proxy written in Go, distributed as a single static binary, with HTTPS as the default and ACME built into the core. Point it at a domain, and it gets a real certificate, redirects HTTP to HTTPS, staples OCSP responses, and renews on schedule, none of which requires configuration. You configured the reverse proxy, and HTTPS came with it.
That’s what makes Caddy different from nginx, HAProxy, or Apache. Those are excellent TLS terminators, but they terminate using certs you brought yourself; you’d pair them with certbot, lego, acme.sh, or another external ACME client running on a cron job. The split works fine, but it’s two systems with two failure modes, and certbot reload-hooks have a long history of subtle bugs (nginx not picking up the new cert, services not restarting cleanly, the cert file racing the symlink swap). Caddy collapses both into one process that owns both the cert and the listener that uses it.
The trade-off worth naming: Caddy is younger than nginx by about a decade and has fewer of the very-large-fleet stories you’ll see in the nginx ecosystem. For multi-instance high availability you need to think about its storage layer (the default is a filesystem directory, which doesn’t work for clustered instances; there are modules for shared storage via Redis, etcd, S3). For one-to-a-few servers the defaults are excellent.
For everything that follows, “Caddy” means a single instance of Caddy v2 on a server with a public IP, with DNS pointing at it.
6. A working Caddyfile, end to end
6.1. The minimal config
Here’s the smallest useful Caddyfile:
example.com {
    reverse_proxy localhost:8080
}
Two lines. That’s a complete production-ready HTTPS reverse proxy for example.com, fronting an application on localhost:8080. On startup Caddy will:
- Listen on :80 and :443.
- Acquire a Let’s Encrypt cert for example.com over ACME.
- Redirect any plain-HTTP request to its HTTPS equivalent.
- Serve HTTPS, proxying to the backend.
- Renew the cert automatically when it’s within 1/3 of its lifetime from expiry.
The prerequisites Caddy can’t do for you:
- example.com’s A/AAAA record points to this server’s public IP.
- Ports 80 and 443 are reachable from the public internet (HTTP-01 needs 80; TLS-ALPN-01 needs 443; one of them has to work).
- The process can bind to 80 and 443. On Linux, that means either running Caddy via the official systemd unit (which grants the CAP_NET_BIND_SERVICE capability), running as root, or granting the capability to the Caddy binary with setcap.
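The first two prerequisites can be checked with a short preflight script before starting Caddy. This is a sketch using only the standard library; the function names are hypothetical. The port check is only meaningful when run from a machine outside your network, since it has to see what Let’s Encrypt would see.

```python
import socket


def resolves_to(hostname: str, expected_ip: str) -> bool:
    """True if any A/AAAA record for hostname equals expected_ip."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return False
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return any(info[4][0] == expected_ip for info in infos)


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A typical run would call `resolves_to("example.com", "<your public IP>")` and then `port_open("example.com", 80)` and `port_open("example.com", 443)`; if neither port answers, no HTTP-01 or TLS-ALPN-01 validation can succeed.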
One small upgrade: set an email so Let’s Encrypt can reach you if there’s a problem with your account (expiration warnings if renewal stalls, security notices):
{
    # placeholder address; use one you actually read
    email you@example.com
}
example.com {
    reverse_proxy localhost:8080
}
The block with no name ({ ... }) is the global options block; it sets server-wide defaults. Caddy works without it, but supplying an email is good citizenship.
6.2. What happens on the first request
On first start, Caddy walks through the full ACME flow once: register an account, order a cert, complete a challenge, store the result. In detail:
- Caddy generates an ACME account keypair and registers it with Let’s Encrypt’s production directory (https://acme-v02.api.letsencrypt.org/directory). The account URL is saved to disk.
- Caddy submits an order for example.com. Let’s Encrypt returns an authorization with HTTP-01, TLS-ALPN-01, and DNS-01 challenges.
- Caddy tries TLS-ALPN-01 first (just port 443). It sets up the special ALPN responder on the existing port-443 listener, signals readiness, and tells Let’s Encrypt to validate. Let’s Encrypt opens a TLS connection with acme-tls/1, Caddy returns the validation cert, the authorization becomes valid.
- If TLS-ALPN-01 fails (rare), Caddy falls back to HTTP-01, serving the token under /.well-known/acme-challenge/.
- With the authorization valid, Caddy submits a CSR, Let’s Encrypt finalizes the order and returns the issued cert.
- Caddy stores the cert on disk and starts serving it. All of this typically completes within 5–15 seconds.
Subsequent requests use the on-disk cert directly. When renewal time comes around (Caddy checks every ~10 minutes), it goes through the same flow but the account already exists, so it skips registration. Renewal failures are retried with backoff; you get logs but no user-visible disruption until the cert actually expires.
6.3. Where certs and account keys live
Caddy stores everything ACME-related under its data directory. The exact path depends on how you run it:
- Default for caddy run started by a user: $XDG_DATA_HOME/caddy/ (typically ~/.local/share/caddy/).
- Default for the official systemd unit: /var/lib/caddy/.
Inside, the layout looks like this:
caddy/
├── acme/
│ └── acme-v02.api.letsencrypt.org-directory/
│ └── users/
│ └── [email protected]/
│ ├── [email protected] (account private key)
│ └── [email protected]
├── certificates/
│ └── acme-v02.api.letsencrypt.org-directory/
│ └── example.com/
│ ├── example.com.crt (leaf + intermediate chain)
│ ├── example.com.key (cert private key)
│ ├── example.com.json (metadata: issuer, OCSP, etc.)
│ └── example.com.ocsp (cached OCSP response)
└── locks/ (transient renewal locks)
Two things matter operationally:
- Back up the data directory. Losing it means re-registering an account and re-issuing every cert. Re-issuing under time pressure can run into rate limits (section 8.2).
- Permissions are sensitive. The account key and cert keys are private material; restrict to the user Caddy runs as.
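One way to audit the permissions point is to walk the data directory and flag private keys that anyone other than the owner can access. This is a sketch for POSIX systems; `loose_private_files` is a hypothetical helper, and it assumes private keys end in `.key` as in the layout above.

```python
import os
import stat


def loose_private_files(data_dir):
    """Yield .key files under data_dir that group or others can access."""
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if not name.endswith(".key"):
                continue
            path = os.path.join(root, name)
            mode = stat.S_IMODE(os.stat(path).st_mode)
            # Any group or other permission bit on a private key is too much.
            if mode & (stat.S_IRWXG | stat.S_IRWXO):
                yield path
```

Running `list(loose_private_files("/var/lib/caddy"))` as the Caddy user or root should return an empty list on a healthy install; anything it reports is worth tightening with `chmod 600`.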
For multi-instance setups (active-active behind a load balancer), filesystem storage doesn’t share between hosts. Use the Redis or S3 storage modules instead, configured under storage in the global options block.
7. Wildcards and DNS-01 with Caddy
A wildcard cert covers any single-label subdomain. *.example.com matches api.example.com, app.example.com, whatever.example.com, and the only way to get one is DNS-01. There’s no specific hostname to serve a token under, so the proof has to live at the zone level.
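The "single-label" rule is easy to get wrong, so here is a small sketch of the matching logic. `wildcard_covers` is a hypothetical helper for illustration, not a function from any TLS library.

```python
def wildcard_covers(pattern: str, hostname: str) -> bool:
    """True if a cert name (possibly a wildcard) covers the hostname."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if not pattern.startswith("*."):
        return pattern == hostname
    suffix = pattern[1:]  # "*.example.com" -> ".example.com"
    if not hostname.endswith(suffix):
        return False
    label = hostname[: -len(suffix)]
    # The wildcard stands for exactly one non-empty label.
    return label != "" and "." not in label
```

So `wildcard_covers("*.example.com", "api.example.com")` is true, but both `a.b.example.com` and the bare `example.com` are not covered, which is why the Caddyfile later in this section lists the apex alongside the wildcard.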
DNS-01 with Caddy requires two things the default binary doesn’t have on its own:
- A DNS provider module for whoever runs your zone (Cloudflare, Route53, DigitalOcean, deSEC, …). These live under github.com/caddy-dns/* and aren’t bundled into the default binary because there are dozens of them and most users only need one.
- A build of Caddy that includes that module. The supported tool is xcaddy, which compiles a custom binary with the modules you specify.
A wildcard config for a Cloudflare-hosted zone looks like this. First, build Caddy with the Cloudflare module:
xcaddy build --with github.com/caddy-dns/cloudflare
Then a Caddyfile that gets both apex and wildcard in one cert:
*.example.com, example.com {
    tls {
        dns cloudflare {env.CLOUDFLARE_API_TOKEN}
    }
    @api host api.example.com
    handle @api {
        reverse_proxy localhost:8080
    }
    @app host app.example.com
    handle @app {
        reverse_proxy localhost:8081
    }
    handle {
        respond "unknown subdomain" 404
    }
}
{env.CLOUDFLARE_API_TOKEN} is read from the process environment at start, so the token isn’t hard-coded into the file. The token needs Zone:Read and DNS:Edit scope for the relevant zone; CF’s “edit zone DNS” template covers it.
What happens at issuance: Caddy creates a _acme-challenge.example.com TXT record via the Cloudflare API, waits for DNS propagation, signals Let’s Encrypt to validate, then cleans the record up. Same flow for the wildcard, since *.example.com validates against the same _acme-challenge.example.com record.
Section 4.3 already covered the non-wildcard cases where DNS-01 is the right pick (port 80 firewalled, origin not publicly reachable, many subdomains under one zone). One worth singling out in the Caddy context: when the origin sits behind Cloudflare’s “orange cloud” proxy, Cloudflare terminates TLS upstream, so HTTP-01 against your origin doesn’t reach the right place. DNS-01 bypasses the question of where the origin actually sits.
8. Operating it: renewals, rate limits, debugging
Once the first issuance is working, the next concerns are renewals (will they happen, and will they fail loudly if they don’t?), rate limits (have I burned through Let’s Encrypt’s quota?), and debugging (when something does break, where do I look?).
8.1. Renewal cadence
Caddy’s certificate manager wakes up every ~10 minutes and renews certs whose remaining lifetime is less than 1/3 of their original lifetime. For a 90-day Let’s Encrypt cert that’s a 30-day window before expiry, which leaves plenty of room for retries if the first attempts fail.
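The arithmetic behind the 30-day window, renewing once a third of the lifetime remains (the rule section 6.1 states), can be written out as a short sketch. The function names are hypothetical and this is not Caddy’s actual code.

```python
from datetime import datetime, timedelta, timezone


def renewal_time(not_before: datetime, not_after: datetime) -> datetime:
    """Moment at which one third of the cert's lifetime remains."""
    lifetime = not_after - not_before
    return not_after - lifetime / 3


def needs_renewal(now: datetime, not_before: datetime, not_after: datetime) -> bool:
    return now >= renewal_time(not_before, not_after)


# Worked example: a 90-day cert issued on 1 January 2025.
ISSUED = datetime(2025, 1, 1, tzinfo=timezone.utc)
EXPIRES = ISSUED + timedelta(days=90)
```

For this example `renewal_time(ISSUED, EXPIRES)` falls 60 days after issuance, leaving 30 days of retries before the cert expires.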
Renewal goes through the full ACME order/authorization/challenge flow again, but skips account registration (the account key persists and gets reused indefinitely). Authorizations are cached at Let’s Encrypt for ~30 days after they go valid, so renewals within that window often skip the challenge step entirely: Caddy submits a CSR and gets a fresh cert back immediately.
If renewal fails, Caddy logs the error and retries with exponential backoff. The old cert keeps serving traffic until it actually expires; users see no disruption from a single-day renewal hiccup. The disruption happens if renewals fail for the entire 30-day window, which is rare but worth alerting on (certificate_expiration is exported as a Prometheus metric if you wire up the metrics endpoint).
8.2. Rate limits and the staging directory
Let’s Encrypt enforces several rate limits in production. The ones that come up:
| Limit | Value | Window |
|---|---|---|
| Certificates per registered domain | 50 | 1 week (rolling) |
| Duplicate certificates (same set of names) | 5 | 1 week (rolling) |
| Failed validation attempts per account/hostname | 5 | 1 hour |
| New orders per account | 300 | 3 hours |
“Registered domain” means the eTLD+1, so example.com, api.example.com, and *.example.com all count against the same 50-per-week pool. The limits that bite first when you’re iterating on DNS configuration in a tight loop are the smaller ones: misconfigure something, fix it, try again, fix it, try again. Five failed validations put that hostname on hold for an hour, and five issued certs for the same set of names lock that set out for a week.
The fix is to point Caddy at Let’s Encrypt’s staging directory while you iterate:
{
    acme_ca https://acme-staging-v02.api.letsencrypt.org/directory
}
example.com {
    reverse_proxy localhost:8080
}
Staging speaks the same protocol but issues certs from an untrusted-by-browsers root, so clients will reject them. That’s the point; staging exists for testing. The rate limits are much higher than production, so a run of failed validations while you debug won’t lock you out. When the config is stable, remove the acme_ca line and Caddy re-issues against production.
If you do hit production rate limits, the recovery is to wait out the rolling window. There’s no manual reset, and Let’s Encrypt’s support won’t bypass it. Switch to staging while you wait.
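To see why a rolling window has no reset button, here is a generic sliding-window counter. It is a sketch of the general technique only; how Let’s Encrypt implements its limiter internally is not something this post covers, and `RollingLimit` is a hypothetical name.

```python
from collections import deque
from datetime import datetime, timedelta, timezone


class RollingLimit:
    """Allow at most `limit` events inside any trailing `window`."""

    def __init__(self, limit: int, window: timedelta):
        self.limit = limit
        self.window = window
        self.events = deque()

    def _prune(self, now: datetime) -> None:
        # Drop events that have aged out of the trailing window.
        while self.events and now - self.events[0] >= self.window:
            self.events.popleft()

    def allow(self, now: datetime) -> bool:
        """Record an event at `now` if the window still has room."""
        self._prune(now)
        if len(self.events) >= self.limit:
            return False
        self.events.append(now)
        return True

    def retry_after(self, now: datetime) -> datetime:
        """Earliest time a new event would fit, or `now` if there is room."""
        self._prune(now)
        if len(self.events) < self.limit:
            return now
        return self.events[0] + self.window
```

With `RollingLimit(5, timedelta(weeks=1))` modelling the duplicate-certificate limit, the sixth issuance is refused and `retry_after` points exactly one week past the oldest of the five, so capacity comes back one slot at a time rather than all at once.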
8.3. Common failures and reading the ACME log
The top failure modes, in roughly the order of how often they bite people:
- Port 80 blocked. HTTP-01 can’t reach you. If your environment also can’t do TLS-ALPN-01 (some cloud LBs only forward port 443 to one backend at a time), issuance fails entirely. Fix: open port 80, or switch to DNS-01.
- DNS hasn’t propagated. You created the A record five minutes ago, started Caddy, and the order fails because Let’s Encrypt’s resolvers haven’t seen it yet. Caddy retries; just wait. If the propagation never happens, your DNS record is wrong or the zone isn’t authoritative on the resolvers you think it is.
- CAA records blocking issuance. A CAA 0 issue "digicert.com" record on the zone tells every CA other than DigiCert not to issue. If you have CAA records (recommended; they limit who can issue for you), make sure letsencrypt.org is in the allowed list: example.com. CAA 0 issue "letsencrypt.org".
- DNS-01 with wrong token scope. Caddy authenticates fine but the API call to create the TXT record gets a 403. Re-check the token’s zone scope; Cloudflare’s per-zone tokens need both Read and Edit on the specific zone, not the wildcard “all zones”.
- Production rate-limit lockout. Covered above. Switch to staging.
- Lost data directory after a server rebuild. No account key, no certs, everything has to re-issue from scratch, and if you’re rebuilding the same server repeatedly during automation development, you can rate-limit yourself. Pin the data directory to persistent storage outside the rebuild path.
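The CAA rule from the list above can be sketched as a small check. This is a simplification: it looks only at `issue` records that you pass in, and ignores `issuewild` and the way CAs climb the DNS tree to find the relevant record set. `caa_permits` is a hypothetical helper.

```python
def caa_permits(records, ca_domain: str) -> bool:
    """records: iterable of (flags, tag, value) tuples for the relevant name."""
    issuers = [
        value.split(";")[0].strip().lower()  # drop any parameters after ';'
        for _flags, tag, value in records
        if tag.lower() == "issue"
    ]
    if not issuers:
        return True  # no "issue" records: any CA may issue
    return ca_domain.lower() in issuers
```

With only `(0, "issue", "digicert.com")` present, `caa_permits(records, "letsencrypt.org")` is false; adding `(0, "issue", "letsencrypt.org")` flips it to true.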
When something goes wrong, the ACME conversation is logged at INFO level by default. Tail it directly if you ran Caddy in the foreground:
caddy run --config /etc/caddy/Caddyfile 2>&1 | grep -i -E 'acme|tls|cert'
Under the official systemd unit, the same log lives in journald:
journalctl -u caddy -f
Let’s Encrypt’s error messages are usually direct: urn:ietf:params:acme:error:rateLimited, urn:ietf:params:acme:error:dns, urn:ietf:params:acme:error:unauthorized. The error type maps cleanly to one of the cases above. If the message is unauthorized, you have a challenge-validation problem (CA reached your server but didn’t see the expected token); if it’s dns, the CA couldn’t resolve your domain or your TXT record; if it’s rateLimited, you’re stuck until the window resets.
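If you want to triage logs automatically, the three error types above can be pulled out of log text and mapped to a next step. This is a sketch; the hint wording and the function names are this example’s own, and the exact shape of Caddy’s log lines is not assumed beyond the error URN appearing somewhere in them.

```python
import re

ACME_ERROR_HINTS = {
    "urn:ietf:params:acme:error:unauthorized":
        "challenge validation failed: the CA reached you but saw the wrong token",
    "urn:ietf:params:acme:error:dns":
        "the CA could not resolve the domain or its TXT record",
    "urn:ietf:params:acme:error:rateLimited":
        "rate limited: wait for the window to reset and iterate on staging",
}

ERROR_URN = re.compile(r"urn:ietf:params:acme:error:[A-Za-z]+")


def acme_errors(log_text: str):
    """Return the distinct ACME error URNs found in log_text, in order."""
    seen = []
    for urn in ERROR_URN.findall(log_text):
        if urn not in seen:
            seen.append(urn)
    return seen


def hint_for(error_type: str) -> str:
    return ACME_ERROR_HINTS.get(error_type, "unrecognized error type; read the full log entry")
```

Piping the output of `journalctl -u caddy` into a script that calls `acme_errors` and then `hint_for` on each result gives a quick first answer about which failure mode you are in.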
The two-line Caddyfile from section 6.1 still works under the hood the same way the longest config in this post does: it’s all the same ACME flow against the same Let’s Encrypt directory. The configuration is short because ACME, Let’s Encrypt, and Caddy collectively did the hard work years ago. Good protocol, free reliable CA, a server that speaks both natively.