
I Closed Every Port and Could Still SSH In — How VPNs Really Work

Author: Emre Bener · Read time: 29 min
About: virtual private network, WireGuard, NAT traversal
Mentions: Tailscale, IPsec, OpenVPN, network address translation, secure shell

1. A server that accepts no connections

One of my servers runs on a Hetzner VPS, and its cloud firewall is configured to drop every unsolicited inbound packet, SSH and HTTP included. If you take the box’s public IP and try to connect to anything on it, the packets are dropped at the firewall before they reach the machine. In practice it has no reachable public surface at all.

Despite that, this command connects instantly from my laptop:

ssh openclaw-server-tailscale

Note that the short name is not a real hostname. It is an alias defined in my ~/.ssh/config, which ssh expands before it connects:

Host openclaw-server-tailscale
    # the server's Tailscale address
    HostName 100.x.y.z
    User <user>
    IdentityFile ~/.ssh/<private-key>

When I type ssh openclaw-server-tailscale, the client looks up the matching Host block and substitutes the real values: the address to dial, the login user, and the private key to authenticate with. The alias itself is pure convenience and has nothing to do with the puzzle. The part that matters is the HostName it points at, a 100.x.y.z address, which is the thread the rest of this post pulls on.

That ssh command opens a shell on a machine that refuses every incoming connection, and no port was opened to make that possible. The firewall really is configured that way, which is easy to confirm: an SSH attempt against the box’s public IP times out every time.

This isn’t a trick or a misconfiguration. It’s a VPN doing its job, and the VPN in question is Tailscale. But “it’s a VPN” is an unsatisfying answer, because (as I found out while building this setup) “VPN” isn’t a single thing. It is a category that covers several fairly different technologies, and the reason I can reach a sealed box depends on the specifics of how one of them works.

This post works through that from the bottom up: what a VPN actually is, which protocols matter today, the two different shapes a VPN network can take, and finally the thing that resolves the opening puzzle, NAT traversal. By the end, SSH into a closed box should be straightforward to explain.

The running example is real infrastructure. One of my servers runs two VPNs at the same time, for two unrelated reasons, and that pairing is a convenient way to see the whole concept at once.

2. What “VPN” actually means

“VPN” names a goal, not a specific technology. It stands for Virtual Private Network, and the three words are worth reading literally:

  • Network: a way for machines to reach each other.
  • Private: that reach is limited to a defined set of participants, and outsiders can neither read the traffic nor join in.
  • Virtual: the network is not built from dedicated physical cabling. It is assembled on top of some other network, almost always the public internet.

Taken together, a VPN is any technique that makes a set of machines behave as if they share a private network, while physically they are scattered across the internet and connected only by links that everyone else also uses. That is a goal, and many different technologies meet it, with genuinely different trade-offs between them.

It helps to treat “VPN” the way we treat “database”. Nobody hears “database” and assumes one specific product, because the alternatives are familiar: PostgreSQL, SQLite, a key-value store, and others. “VPN” feels more monolithic only because most people encounter it through a single product category.

That category is the consumer privacy VPN: the NordVPN, Surfshark, and ExpressVPN tier, sold with the pitch “install this and your traffic is hidden”. Those products are real VPNs, but they expose just one use of the technology, wrapped in one workflow, and brand the whole bundle as “a VPN”. So “VPN” ends up meaning “the app that hides my IP”, and the rest of the category falls out of view.

The rest of that category is what the opening puzzle depends on, because the puzzle lives in a part of it that the consumer pitch never touches. The next section separates the two questions that the single word “VPN” tends to blur together.

3. Purpose and protocol are two separate questions

Most confusion about VPNs comes from collapsing two independent questions into one word. The questions are: what is the VPN for, and how is its tunnel actually built. They vary independently, and any given VPN is one answer to each.

3.1. What the VPN is for

The first question is about purpose, or equivalently the shape of the problem being solved. There are four common answers.

Remote access. One person, outside a private network, needs to act as if they were inside it. The classic case is connecting to a company network from home to reach internal services. Traffic flows from the single client into the private network and back.

Site-to-site. Two entire networks are joined so that machines on each side reach machines on the other as if they shared one network. Two office LANs linked into one is the standard example. The link is permanent, and the participants are networks rather than individual users.

Privacy, or egress relocation. Here the private network is not the point at all. The goal is only to make your traffic leave the internet from somewhere other than where you physically are, usually to change the apparent source IP. This is the consumer VPN from the last section. You are not joining a network you care about; you are borrowing its exit point.

Mesh. Many machines, each a peer of the others, all directly reachable across the group regardless of where they physically sit or what firewalls stand in front of them. No single side is “the network” and the rest “clients”; every node is both.

These four are not variations on one thing. They are different shapes, with different topologies and different failure modes. A consumer privacy VPN and a mesh VPN have very little in common beyond both being VPNs.

3.2. How the tunnel is built

The second question is mechanical. Whatever a VPN is for, it has to do three jobs on every packet:

  1. Encapsulation. Wrap the original packet inside another packet so it can travel across the public internet to the other end. This wrapping is what the word “tunnel” refers to.
  2. Encryption. Scramble the contents so the networks the packet crosses cannot read it.
  3. Authentication. Prove that each end is who it claims to be, so an outsider cannot impersonate a participant or inject traffic.

A VPN protocol is one concrete recipe for doing those three jobs. WireGuard is one recipe, OpenVPN another, IPsec another. They differ in which cryptography they use, how the handshake works, what transport they run over, and how much they attempt beyond the minimum. Section 4 covers the recipes that matter today.
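To make the three jobs concrete, here is a toy sketch in Python. It is not how WireGuard, OpenVPN, or IPsec do it, and it is not secure: the SHA-256 keystream is only a stand-in for a real cipher, and the names `wrap` and `unwrap` and the byte layout are my own assumptions. It only shows where encapsulation, encryption, and authentication each happen on one packet.

```python
import hashlib
import hmac
import os
import struct
from typing import Optional

NONCE_LEN = 12
TAG_LEN = 32  # HMAC-SHA256 output size


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (illustration only, NOT a real cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + struct.pack(">I", counter)).digest()
        counter += 1
    return out[:length]


def wrap(inner_packet: bytes, enc_key: bytes, mac_key: bytes) -> bytes:
    """Turn an inner packet into the payload of an outer packet."""
    nonce = os.urandom(NONCE_LEN)
    # Job 2, encryption: scramble the inner packet.
    stream = _keystream(enc_key, nonce, len(inner_packet))
    ciphertext = bytes(a ^ b for a, b in zip(inner_packet, stream))
    # Job 3, authentication: a tag only a key holder can produce.
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    # Job 1, encapsulation: this blob travels inside an ordinary outer packet.
    return nonce + ciphertext + tag


def unwrap(outer_payload: bytes, enc_key: bytes, mac_key: bytes) -> Optional[bytes]:
    """Return the inner packet, or None if authentication fails."""
    if len(outer_payload) < NONCE_LEN + TAG_LEN:
        return None
    nonce = outer_payload[:NONCE_LEN]
    ciphertext = outer_payload[NONCE_LEN:-TAG_LEN]
    tag = outer_payload[-TAG_LEN:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(a ^ b for a, b in zip(ciphertext, stream))
```

A real protocol replaces the toy pieces with vetted primitives and adds a handshake to agree on the keys, but the order of operations on each packet has the same shape.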

The point is that purpose and protocol are independent. The same protocol can serve any purpose, and the same purpose can be served by different protocols.

3.3. The running example: two servers

My setup makes that independence concrete. It is two small servers, and between them they cover every distinction in this section.

The first server is the box from Section 1, the one I SSH into. It runs a single VPN, Tailscale, for one reason: so that I and my other machines can reach it privately. Purpose: mesh.

The second server is a dedicated proxy. Its job is to take web traffic from the first server and send it back out to the internet from a residential IP in London rather than from a datacenter. To do that it runs two VPNs at once:

  • Surfshark, a consumer privacy VPN, which provides the London exit. Purpose: privacy. Topology: hub-and-spoke, which Section 6 covers.
  • Tailscale, the same mesh VPN the first server runs, so the two servers can reach each other privately. Purpose: mesh.

Two servers, three VPN instances, two unrelated purposes, two topologies. And the detail that ties this to the previous section: every one of those instances uses the same protocol underneath. Surfshark and Tailscale both move their packets with WireGuard. Same recipe for building the tunnel, very different things built from it.

That is the whole purpose-versus-protocol distinction sitting in real infrastructure. The next section takes the protocol side: which recipes are actually in use today.

[Figure: Two servers, three VPN instances, one protocol. Server 1 (the sealed box) runs Tailscale. Server 2 (the proxy) runs Surfshark and Tailscale. The Tailscale mesh lets the two servers reach each other privately; Surfshark provides privacy through its London exit hub (hub-and-spoke). All three VPN instances run on WireGuard underneath.]

4. The protocols that matter today

Three protocols carry essentially all the VPN traffic worth discussing today: IPsec, OpenVPN, and WireGuard. They arrived roughly in that order, and each was largely a reaction to the weaknesses of what came before, so the clearest way to understand them is in sequence.

4.1. IPsec / IKEv2

IPsec is the oldest of the three and the one written into the standards. It is not a single protocol but a suite, standardized by the IETF in the 1990s, that operates directly at the IP layer. IKEv2 (Internet Key Exchange version 2) is the part that handles authentication and key negotiation; IPsec proper is the part that encrypts and carries the packets. In practice people say “IPsec/IKEv2” to mean the whole working combination.

Its strengths come from being a standard. It is implemented natively in the kernels of Windows, macOS, iOS and Linux, so a client often needs no extra software. IKEv2 also handles roaming well: it has a defined mechanism for keeping a tunnel alive when a device changes network, which is why it is common on mobile.

Its weakness is complexity. The suite has accumulated decades of options, modes, and extensions, with many ways to combine them and many ways to combine them badly. A misconfigured IPsec tunnel tends to be subtly weak rather than obviously broken. The protocol is capable, but it is not the one to reach for if you value a small, auditable configuration.

4.2. OpenVPN

OpenVPN, released in 2001, is the long-standing open-source workhorse. Rather than operating at the IP layer like IPsec, it runs as an ordinary userspace program and builds its tunnel using TLS, the same protocol family that secures HTTPS. Authentication is usually certificate-based, with a full public-key infrastructure (a certificate authority, issued client certificates, revocation lists), which suits organizations that already manage certificates.

Its defining practical feature is that it can run over TCP on port 443. To a firewall or a network filter, an OpenVPN-over-443 connection is hard to tell apart from a normal HTTPS request, so OpenVPN often gets through restrictive networks that block more obvious VPN traffic. That flexibility is the main reason it stayed dominant for so long.

The cost is weight. OpenVPN is a large codebase, it runs in userspace with the context-switching overhead that implies, and it is historically slower than the alternatives. Carrying the tunnel over TCP can also trigger “TCP meltdown”, where a reliable protocol running inside another reliable protocol produces compounding retransmissions under packet loss. Newer versions cut the overhead with a kernel data-channel module, but the protocol’s age shows.

4.3. WireGuard

WireGuard, merged into the Linux kernel in 2020, is the modern reset. It was written as a deliberate reaction to the complexity of IPsec and the weight of OpenVPN, and its central design value is minimalism: the original implementation is roughly 4,000 lines of code, against tens of thousands for the alternatives. A small codebase is easier to audit, and so easier to trust.

It reaches that size by refusing to be configurable. It uses one fixed, modern set of cryptographic primitives with no negotiation, runs over UDP only, and identifies peers purely by public key. It is fast, it runs in the kernel, and it is the protocol both of my VPNs use underneath.

WireGuard also makes a deliberate trade-off in the other direction: it leaves a lot out. It has no built-in way to distribute keys, no peer discovery, no NAT traversal, and no real notion of user identity. Those omissions are exactly what the opening puzzle turns on, so WireGuard gets its own section next. For now it is enough to place it: the smallest, fastest, and most modern of the three, and the one this post spends the most time on.

4.4. Older protocols and a newer direction

A few protocols still appear in VPN settings menus but are not worth choosing. PPTP, from the 1990s, is cryptographically broken and should not be used for anything. L2TP/IPsec and SSTP still function but are dated and offer nothing the three above do not.

One direction is worth watching at the other end. Some newer VPNs tunnel over QUIC, the transport behind HTTP/3. Because so much ordinary web traffic is now QUIC, a VPN built on it is very hard to distinguish from normal browsing. That is the same firewall-evasion property OpenVPN-over-443 had, brought up to date. Apple’s iCloud Private Relay works roughly along these lines. QUIC-based VPNs are not yet a mainstream self-hosted option, but they are the most plausible “next” protocol.

The three current protocols line up like this:

| Protocol | Year | Runs over | Peer identity | Notable strength | Notable weakness |
| --- | --- | --- | --- | --- | --- |
| IPsec / IKEv2 | 1990s | IP layer | certificates or pre-shared key | native in every OS, good roaming | sprawling, easy to misconfigure |
| OpenVPN | 2001 | TCP or UDP, often port 443 | certificates (PKI) | can disguise as HTTPS, flexible | large, slower, userspace |
| WireGuard | 2020 | UDP only | public keys | tiny, fast, modern crypto | omits key distribution and NAT traversal |

The next section takes WireGuard apart, because its omissions are what the opening puzzle is built on.

5. How WireGuard works

WireGuard is small because it collapses jobs that other VPNs keep in separate subsystems. Four ideas carry most of the design: a peer is just a keypair, routing and encryption are the same lookup, the handshake is short and forward-secret, and there is almost no connection state. Understanding those four is enough to understand why the opening puzzle needs a layer on top.

5.1. A peer is a keypair

A WireGuard peer is identified by a single Curve25519 keypair. There are no usernames, no passwords, no certificates, and no certificate authority vouching for anything. A peer has a private key it keeps and a public key it shares, and to the rest of the network that public key is its identity. If you know a peer’s public key you can address it; if you do not, you cannot.

This is unusual. Identity in WireGuard is not tied to a hostname, an IP address, or an account; it travels with the keypair. The same peer keeps the same identity whether it sits in a datacenter, on a home network, or on a phone hotspot, and that property is what later makes painless roaming possible.

5.2. Cryptokey routing

The central idea in WireGuard is cryptokey routing, and it is the main reason the implementation is so small. A WireGuard configuration is mostly a list of peers, and each peer entry carries a field called AllowedIPs. A minimal configuration looks like this:

[Interface]
PrivateKey = <this peer's private key>
Address = 10.0.0.2/32

[Peer]
PublicKey = <the other peer's public key>
Endpoint = 203.0.113.10:51820
AllowedIPs = 10.0.0.0/24

AllowedIPs on a peer means “this peer is allowed to send from, and is reachable at, these IP ranges”. That one field does two jobs that other VPNs handle with separate machinery.

For outbound packets it acts as a routing table. When WireGuard has a packet to send, it takes the destination IP, finds the peer whose AllowedIPs covers it, and encrypts the packet for that peer, using the session keys negotiated against that peer’s public key (Section 5.3). Choosing the route and choosing the encryption key are a single lookup.

For inbound packets it acts as an access-control check. When an encrypted packet arrives, WireGuard decrypts it with the sending peer’s key, then checks that the packet’s source IP falls inside that peer’s AllowedIPs. If it does not, the packet is dropped. Authenticating who sent a packet and authorizing what they were allowed to send are a single check.

Routing, encryption-key selection, and access control, which in IPsec or OpenVPN are three separate subsystems, collapse into one table of peers. That collapse is most of where the 4,000-line figure comes from.
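The double role of AllowedIPs is easy to sketch in Python. This is a simplified model, not WireGuard's actual data structure: the `Peer` class, the function names, and the placeholder keys are my own, and I am assuming the most specific matching range wins, as in an ordinary routing table.

```python
import ipaddress
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Peer:
    public_key: str  # placeholder string standing in for a Curve25519 key
    allowed_ips: List[ipaddress.IPv4Network]


def route_outbound(peers: List[Peer], dst: str) -> Optional[Peer]:
    """Outbound: pick the peer (and so the key) whose AllowedIPs covers dst."""
    addr = ipaddress.ip_address(dst)
    best: Optional[Peer] = None
    best_len = -1
    for peer in peers:
        for net in peer.allowed_ips:
            if addr in net and net.prefixlen > best_len:
                best, best_len = peer, net.prefixlen
    return best  # None means no peer covers it, so nothing is sent


def accept_inbound(sender: Peer, src: str) -> bool:
    """Inbound: after decryption, the source IP must sit inside the
    sending peer's AllowedIPs, otherwise the packet is dropped."""
    addr = ipaddress.ip_address(src)
    return any(addr in net for net in sender.allowed_ips)


# Hypothetical two-peer table for illustration.
peers = [
    Peer("PUBKEY_A", [ipaddress.ip_network("10.0.0.0/24")]),
    Peer("PUBKEY_B", [ipaddress.ip_network("10.0.1.5/32")]),
]
```

With that table, `route_outbound(peers, "10.0.0.7")` selects the peer holding `PUBKEY_A`, while a decrypted packet from peer B claiming source `10.0.0.7` fails `accept_inbound` and would be discarded.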

5.3. The handshake, forward secrecy, and silence

Before two peers exchange data they perform a short handshake, based on the Noise protocol framework, a modern and well-analyzed toolkit for building secure handshakes. The WireGuard handshake takes a single round trip. Each side proves it holds the private key matching the public key the other expects, and in the process they derive a fresh pair of ephemeral session keys that encrypt the actual traffic.

The session keys are what give the connection forward secrecy. They rotate roughly every two minutes and are derived from temporary values discarded afterward. If an attacker records encrypted traffic today and later steals a peer’s long-term private key, they still cannot decrypt what they recorded, because the long-term key was never the thing that encrypted the data.

The handshake also explains a property worth stating now, because Section 7 depends on it. WireGuard never replies to a packet it cannot authenticate. A packet that does not belong to a valid handshake or session is dropped silently, with no error and no response. Port-scan a WireGuard endpoint and you get nothing back; the port behaves exactly as if it were closed. A WireGuard listener is effectively invisible to anyone who does not already hold a valid key.

5.4. Almost no connection state

WireGuard barely has a notion of a “connection”. A traditional VPN like OpenVPN establishes a session, holds it open, runs keepalives, and tears it down on timeout. WireGuard instead keeps only two things per peer: the most recent network address it has seen that peer at, and the current session keys. When there is no traffic to send, nothing happens. There is no idle session to maintain or to time out.

This near-statelessness produces a genuinely useful property: roaming without reconnecting. Suppose a peer moves from Wi-Fi to cellular, so its public IP changes. A connection-oriented VPN treats that as a broken session and forces a reconnect. WireGuard does not: the peer simply sends its next packet from the new address, the receiving end decrypts it, sees that it authenticates correctly against a known peer’s key, and quietly updates its record of where that peer is. The tunnel never “reconnected” because it was never a connection to begin with. This is the keypair-as-identity idea from 5.1 paying off: identity is bound to the key, not the location, so the location is free to change.
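A rough Python model of that per-peer state, and of how roaming falls out of it. The HMAC here is only a stand-in for WireGuard's real authenticated encryption, and `PeerTable`, `seal`, and the addresses are hypothetical names I chose for the sketch.

```python
import hashlib
import hmac
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip, udp_port)


def seal(session_key: bytes, payload: bytes) -> bytes:
    """Stand-in for authenticated encryption: produce a tag for payload."""
    return hmac.new(session_key, payload, hashlib.sha256).digest()


class PeerTable:
    """Per peer, keep only the session key and the last address seen."""

    def __init__(self) -> None:
        self.session_keys: Dict[str, bytes] = {}
        self.endpoints: Dict[str, Optional[Endpoint]] = {}

    def add_peer(self, peer_id: str, session_key: bytes,
                 endpoint: Optional[Endpoint] = None) -> None:
        self.session_keys[peer_id] = session_key
        self.endpoints[peer_id] = endpoint

    def receive(self, peer_id: str, payload: bytes, tag: bytes,
                source: Endpoint) -> bool:
        key = self.session_keys.get(peer_id)
        if key is None:
            return False  # unknown peer: drop silently
        if not hmac.compare_digest(tag, seal(key, payload)):
            return False  # fails authentication: drop silently, no reply
        # Authenticated, so trust the packet's source as the peer's new location.
        self.endpoints[peer_id] = source
        return True
```

The endpoint update happens only after authentication succeeds, which is why a forged packet from a random address cannot redirect the tunnel.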

5.5. What WireGuard leaves out

Everything above is what WireGuard does. What makes it a primitive rather than a finished product is what it deliberately does not do.

It does not distribute keys. Every peer’s configuration must already contain every other peer’s public key and a way to reach it, put there by some outside means. Add one peer to a ten-machine network and you have eleven configuration files to edit.

It does not discover peers or traverse NAT. WireGuard needs at least one side of each pair to have a known, reachable Endpoint. If both peers sit behind home routers with no fixed public address, plain WireGuard cannot connect them on its own.

It also does not assign IP addresses, hand out DNS, or offer any dynamic notion of user identity, group membership, or revocation. Removing a peer’s access means editing configuration files and deleting a key by hand.

None of this is an oversight; it is the same minimalism as the rest of the design. WireGuard is an excellent way to move encrypted packets between two peers that already know about each other, and nothing more. The two missing pieces the opening puzzle needs, automatic key distribution and NAT traversal, are not WireGuard’s job. They belong to whatever is built on top of it, and the next section is about that layer.

6. Traditional and mesh topologies

A VPN’s topology answers the question “who connects directly to whom”. There are two shapes worth knowing, and my two VPNs happen to be one of each.

6.1. Hub-and-spoke: the traditional shape

The traditional VPN topology is hub-and-spoke. Every client, a “spoke”, connects to one central server, the “hub”, and all traffic flows through that hub. Two spokes that want to reach each other do not connect directly; their traffic goes spoke to hub to spoke. The hub is the only machine every participant holds a tunnel to.

This is the shape of every consumer privacy VPN, and for that purpose it is exactly right. A privacy VPN’s job, from Section 3, is egress relocation: you want your traffic to leave the internet from somewhere else, and that requires a chokepoint. Surfshark’s London server is the hub, my proxy server is a spoke, and the web traffic routed through it flows up the tunnel to that hub and exits onto the public internet from there, carrying the hub’s IP. A privacy VPN without a central exit point would not be a privacy VPN. The hub is the product.

Hub-and-spoke has well-known costs. The hub is a bottleneck, because all traffic crosses it. It is a single point of failure. And it handles your traffic at the point where it exits, so you are trusting the hub operator. For a consumer privacy VPN those costs are acceptable, because the chokepoint is the entire point. For other purposes they are pure overhead.

6.2. The mesh alternative

The other shape is a mesh. There is no hub. Every node connects directly to every other node it needs to reach, peer to peer, over its own dedicated encrypted tunnel. If machine A sends a packet to machine B, that packet goes straight from A to B and passes through no third machine.

For my second VPN this is the right shape. I am not relocating an exit point; I am letting a handful of my own machines reach each other privately, from wherever each one happens to be. No traffic should funnel through a central box, so a mesh removes the bottleneck and the single point of failure, and keeps every connection on the shortest path. When I SSH from my laptop to the server in Section 1, that is a direct laptop-to-server tunnel with nothing in between.

But a mesh reintroduces the problem Section 5 left open. WireGuard can build the A-to-B tunnel only if A already holds B’s public key and address, and vice versa, for every pair. A hub-and-spoke VPN has one relationship per spoke, the spoke and the hub, which an administrator configures once. A mesh of $n$ machines has $n(n-1)/2$ relationships, and every time a machine is added or changes address, every other machine’s configuration is out of date. Hand-editing WireGuard configs, as described in Section 5.5, does not survive contact with a real mesh.
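The arithmetic is worth seeing with numbers. A small Python sketch, with function names of my own choosing, that counts pairwise relationships in a full mesh of n machines as n(n-1)/2 and compares that with hub-and-spoke:

```python
from itertools import combinations


def hub_and_spoke_links(spokes: int) -> int:
    """One relationship per spoke: the spoke and the hub."""
    return spokes


def mesh_links(nodes: int) -> int:
    """Every unordered pair of nodes is a relationship: n(n-1)/2."""
    return nodes * (nodes - 1) // 2


def configs_touched_by_new_peer(existing_nodes: int) -> int:
    """Hand-managed mesh: every existing config plus the new one."""
    return existing_nodes + 1


for n in (3, 10, 50):
    # Cross-check the formula by enumerating the pairs directly.
    assert mesh_links(n) == len(list(combinations(range(n), 2)))
    print(n, "nodes:", hub_and_spoke_links(n), "hub links vs", mesh_links(n), "mesh links")
```

Ten machines already mean 45 pairwise relationships, and fifty mean 1,225, which is the growth that makes hand-edited configuration impractical.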

Something has to distribute keys and addresses automatically. That something is the control plane.

[Figure: Hub-and-spoke versus mesh. In hub-and-spoke, four spokes each connect to one hub and all traffic flows through the hub. In the mesh, four nodes connect over direct peer-to-peer paths with no hub.]

6.3. Control plane and data plane

A mesh VPN like Tailscale splits into two planes, and keeping them separate is the key to understanding what it does.

The data plane is the actual encrypted traffic: the WireGuard tunnels between your machines carrying your SSH sessions, file transfers, and everything else you send. The control plane is the bookkeeping: which machines belong to the network, what their public keys are, and where each can currently be reached.

Tailscale runs a central service called the coordination server, and it is the control plane and nothing else. When a machine joins, it generates its WireGuard keypair locally, keeps the private key, and sends only the public key to the coordination server. The server maintains a directory of every machine’s public key and current set of possible addresses, and whenever something changes, a machine joining, leaving, or moving, it pushes the updated directory to the others.

The property that matters is that the coordination server never carries your traffic. It is a phone book, not a switchboard. It tells your machines how to find and authenticate each other, and then the machines build WireGuard tunnels directly, peer to peer. Your data takes the data plane; the coordination server only ever touches the control plane. Even if that server were compromised, it holds no session keys and sees no traffic, because WireGuard’s keys never left your machines.

This is the layer that supplies what WireGuard omitted. WireGuard said it would not distribute keys; the coordination server is the answer, providing automatic and continuous key and address distribution so the mesh’s many pairwise relationships stay current without anyone editing a file. The coordination server can itself be self-hosted: Tailscale runs one as a service, and an open-source implementation called Headscale lets you run your own. Either way the split between the planes is the same.
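As a mental model only, the control plane can be sketched as a directory that stores public keys and candidate addresses and pushes the peer list to every member when something changes. This is not Tailscale's or Headscale's actual API; `Node`, `CoordinationServer`, `register`, and the callback shape are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Node:
    name: str
    public_key: str              # only the public half is ever sent
    endpoints: Tuple[str, ...]   # candidate "ip:port" addresses


class CoordinationServer:
    """Phone book: holds public keys and addresses, never traffic."""

    def __init__(self) -> None:
        self._directory: Dict[str, Node] = {}
        self._subscribers: Dict[str, Callable[[List[Node]], None]] = {}

    def register(self, node: Node,
                 on_update: Callable[[List[Node]], None]) -> None:
        """A machine joins, or re-registers after moving to a new address."""
        self._directory[node.name] = node
        self._subscribers[node.name] = on_update
        self._push()

    def _push(self) -> None:
        # Send every member the current list of its peers (everyone but itself).
        for name, callback in self._subscribers.items():
            callback([n for n in self._directory.values() if n.name != name])
```

Nothing in this model has a method for forwarding packets, which is the point: once each member has the other's key and addresses, the tunnel is built between the members themselves.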

[Figure: A phone book, not a switchboard. The coordination server (control plane) sends keys and addresses to the laptop and to the server. A direct WireGuard tunnel between laptop and server carries the traffic (data plane). The coordination server never carries traffic; it only tells peers how to find each other.]

6.4. What a tailnet is

With both planes in place, the term tailnet has a precise meaning. A tailnet is the set of machines registered under one coordination account, together with the policy of which of them may talk to which. It is a membership list, not a place and not a server.

Every machine in a tailnet receives a stable private IP, the 100.x.y.z address that appeared in Section 1’s SSH config. The control plane assigns that address and binds it to the machine’s key, so it does not change when the machine moves between physical networks. The range it comes from, 100.64.0.0/10, is the shared address space that RFC 6598 reserves for carrier-grade NAT. It sits outside the private ranges that home and office networks normally use (10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16), so it rarely collides with ordinary LAN addressing.

That is why openclaw-server-tailscale resolved to a 100.x.y.z address. My laptop and my server are both members of one tailnet, both received the other’s key from the coordination server, and both hold a stable 100.x.y.z address for the other. From the laptop’s point of view, the server is simply “the machine at that address”, reachable as if it were on the same LAN.
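The address range is easy to check with Python's standard `ipaddress` module. The sample addresses below are made up for the example (the post's real address stays redacted as 100.x.y.z), and the helper name is my own.

```python
import ipaddress

# The block tailnet addresses are drawn from.
TAILNET_RANGE = ipaddress.ip_network("100.64.0.0/10")

# The usual private LAN ranges.
PRIVATE_LAN_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_tailnet_address(addr: str) -> bool:
    """True if addr falls inside 100.64.0.0/10."""
    return ipaddress.ip_address(addr) in TAILNET_RANGE


# The tailnet block does not overlap any of the usual LAN ranges.
assert not any(TAILNET_RANGE.overlaps(lan) for lan in PRIVATE_LAN_RANGES)

print(TAILNET_RANGE[0], "to", TAILNET_RANGE[-1])   # first and last address in the block
print(is_tailnet_address("100.101.102.103"))        # hypothetical tailnet address
print(is_tailnet_address("192.168.1.10"))           # typical home LAN address
```

The block runs from 100.64.0.0 to 100.127.255.255, so not every address starting with 100 belongs to it.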

Which leaves exactly one thing unexplained, and it is the original puzzle. The control plane has handed my laptop the server’s key and the fact that it lives at a certain 100.x.y.z address. But the server’s real, physical network still drops every unsolicited inbound packet. Holding a peer’s key and wanting to reach it does not, by itself, get a packet through a firewall that refuses inbound connections. How the direct tunnel is established through that firewall is the subject of the next section.

7. NAT traversal: reaching a machine that accepts no connections

The resolution to the opening puzzle is that Tailscale never needs an inbound connection at all. It connects two machines that both make only outbound connections, by having them dial out at the same moment. To see why that works, start with what a firewall actually blocks.

7.1. “Inbound” really means “unsolicited inbound”

The puzzle assumes a firewall that blocks inbound traffic blocks everything arriving from outside. It does not, and it cannot, because a machine that truly accepted nothing inbound could not browse the web.

Walk through an ordinary outbound request. My server opens a connection to a website on port 443; the request packet leaves; the website’s response then has to arrive inbound. A firewall that dropped every inbound packet would drop that response, and no outbound connection could ever complete. So every stateful firewall, and every NAT device, distinguishes two kinds of inbound packet: unsolicited ones, with no matching prior outbound packet, which it drops, and replies to something a machine inside already sent, which it allows.

It tells them apart with a state table, often called connection tracking. Each time a machine inside sends a packet out, the firewall records an entry: the source address and port, and the destination address and port. When a packet arrives inbound, the firewall checks it against the table. If it matches an existing entry, meaning it looks like a reply to a recorded outbound packet, it is allowed through. If nothing matches, it is dropped.

The consequence is the important part. Every outbound packet briefly opens a specific, narrow return path through the firewall, and that path admits only packets coming back from the exact destination the original packet was sent to. The firewall on my server blocks unsolicited inbound traffic. It does not, and structurally cannot, block replies to connections the server itself started. That distinction is the crack NAT traversal works through.
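Connection tracking reduces to a very small data structure. The sketch below is a simplified model, assuming a table keyed on source and destination address and port; real firewalls also track protocol, timeouts, and TCP state, and the class and method names are mine.

```python
from typing import Set, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


class StatefulFirewall:
    """Allow inbound packets only if they look like replies."""

    def __init__(self) -> None:
        # Each entry records one outbound flow: (inside endpoint, outside endpoint).
        self._table: Set[Tuple[Endpoint, Endpoint]] = set()

    def outbound(self, src: Endpoint, dst: Endpoint) -> None:
        """A machine inside sends a packet out: record the flow."""
        self._table.add((src, dst))

    def inbound_allowed(self, src: Endpoint, dst: Endpoint) -> bool:
        """An inbound packet is a reply if the mirrored flow was recorded."""
        return (dst, src) in self._table
```

For example, after `fw.outbound(("10.0.0.5", 50000), ("203.0.113.80", 443))`, a packet from `("203.0.113.80", 443)` to `("10.0.0.5", 50000)` is admitted, while the same packet from any other address or port is dropped.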

7.2. Hole punching

Here is the key move. If an outbound packet opens a return path for replies from its destination, then two machines that each send an outbound packet to each other, at the same time, each open a return path for the other. The two packets cross in transit, and each arrives at a firewall that has just recorded a matching outbound entry. To each firewall, the incoming packet looks exactly like the reply it was expecting, so each is allowed in. This is hole punching, and it is how a direct tunnel forms between two machines that both refuse unsolicited inbound traffic.

What this needs is addresses and timing: each side has to know where to send its packet, and both have to send at roughly the same moment. That is the coordination server’s job. Both machines hold an ordinary outbound connection to it, which every firewall permits, so it can always reach them. It learns each machine’s public-facing address and the port its NAT will use, tells each machine about the other, and signals them to begin. The coordination server never carries the resulting traffic; it only arranges the rendezvous.

Step by step, for my laptop reaching the firewalled server:

  1. My laptop and the server each hold an open outbound channel to the coordination server.
  2. The coordination server tells each one the other’s current public address and port.
  3. Both begin sending WireGuard UDP packets to the other’s address.
  4. My laptop’s first packet leaves and opens a return path, in my laptop’s network, for replies from the server’s address.
  5. The server’s first packet leaves and opens a return path, in its firewall, for replies from my laptop’s address.
  6. The packets cross. Each arrives at a firewall that just recorded a matching outbound entry, so each is admitted as a “reply”.
  7. A direct, bidirectional WireGuard tunnel now exists between the two machines. SSH runs inside it.

No inbound port was ever opened. The server’s firewall rule, drop unsolicited inbound, was never violated, because by the time my laptop’s packet arrived the server had already sent a packet outbound toward my laptop, so the arriving packet was not unsolicited. Both machines only ever initiated outbound connections. They simply did it toward each other, at the same time, with a coordinator telling them where to aim.

That is the resolution of the opening puzzle. The firewall was not bypassed, fooled, or misconfigured. Tailscale never needs the thing the firewall forbids. It uses only outbound connections, which nothing blocks, and is careful about making two of them meet at the right moment.

7.3. When hole punching fails: DERP relays

Hole punching does not always succeed. It depends on each side’s NAT using a predictable public address and port, and some NATs, called symmetric or “hard” NATs, assign a fresh and unpredictable port for every distinct destination. The port the coordination server learned then says nothing about the port the NAT will pick for traffic to the peer, so the two packets cannot be aimed at each other, and the punch fails.
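
The difference between the two NAT behaviours is easy to see in a small model. This is a sketch, not a description of any real NAT implementation: the `EasyNat` and `HardNat` classes, the port ranges, and the placeholder addresses are assumptions for illustration. The question each run answers is whether the port observed on the way to the coordinator is the same port the peer's packets must be aimed at.

```python
import itertools
import random


class EasyNat:
    """Endpoint-independent mapping: one public port per internal socket."""

    def __init__(self, first_port: int = 40000) -> None:
        self._ports = itertools.count(first_port)
        self._map: dict[str, int] = {}

    def public_port(self, internal: str, destination: str) -> int:
        # The destination is ignored: the mapping is reused for every peer.
        if internal not in self._map:
            self._map[internal] = next(self._ports)
        return self._map[internal]


class HardNat:
    """Endpoint-dependent mapping: a fresh port per (socket, destination)."""

    def __init__(self) -> None:
        self._rng = random.Random()
        self._map: dict[tuple[str, str], int] = {}

    def public_port(self, internal: str, destination: str) -> int:
        key = (internal, destination)
        if key not in self._map:
            used = set(self._map.values())
            port = self._rng.randrange(1024, 65536)
            while port in used:  # never hand out the same port twice
                port = self._rng.randrange(1024, 65536)
            self._map[key] = port
        return self._map[key]


SOCKET = "10.0.0.5:41641"        # hypothetical internal socket
COORDINATOR = "192.0.2.1:443"    # hypothetical coordinator address
PEER = "198.51.100.20:41641"     # hypothetical peer address


def prediction_holds(nat) -> bool:
    seen_by_coordinator = nat.public_port(SOCKET, COORDINATOR)
    used_toward_peer = nat.public_port(SOCKET, PEER)
    return seen_by_coordinator == used_toward_peer


print("EasyNat:", prediction_holds(EasyNat()))  # True
print("HardNat:", prediction_holds(HardNat()))  # False
```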

For those cases Tailscale has a fallback called DERP (Designated Encrypted Relay for Packets). DERP servers are relays, reached over plain outbound HTTPS on port 443, which works on essentially every network. Both peers connect outbound to the same DERP server, and DERP shuttles packets between them. Because both connections are outbound, no firewall blocks them, so a relayed connection works even under the worst NAT conditions.

Two things keep DERP from being a real compromise. First, the relay cannot read your traffic: the packets are WireGuard-encrypted end to end, and DERP only moves opaque encrypted blobs. Second, DERP exists to get you connected immediately, but Tailscale keeps attempting hole punching in the background and silently upgrades the link to a direct connection the moment a punch succeeds. A relayed link is the floor, not the resting state. The command tailscale ping reports which mode a given link is using, direct or via DERP.
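
The "relay is the floor" idea can be sketched as a tiny path selector. This is a toy model and not Tailscale's actual logic: the `PeerLink` class and its method are hypothetical, and the sketch does not model what happens if a direct path later breaks. It shows only that a link starts usable over the relay and switches to direct the first time a punch succeeds.

```python
from dataclasses import dataclass


@dataclass
class PeerLink:
    """Toy path selector: start relayed, prefer direct once a punch works."""

    path: str = "relay"

    def on_probe_result(self, punch_succeeded: bool) -> str:
        # A failed probe changes nothing: traffic keeps flowing via the relay.
        if punch_succeeded:
            self.path = "direct"
        return self.path


link = PeerLink()
history = [link.path]
for result in (False, False, True):  # two failed punches, then a success
    history.append(link.on_probe_result(result))

print(history)  # ['relay', 'relay', 'relay', 'direct']
```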

7.4. One open UDP port

The two servers in this setup are firewalled differently, and now the difference makes sense. The first server, the one from the opening puzzle, has every inbound port closed and relies entirely on hole punching, exactly as described above. The second server, the proxy box, has one deliberate exception: a single inbound UDP port, 41641, left open. That is the default port Tailscale listens on for its WireGuard traffic.

It is not strictly required. Hole punching, with DERP as a fallback, would connect the proxy box regardless. But leaving 41641 open gives that machine a stable, predictable inbound port, so the other side does not have to predict anything. It can send straight there, and a direct connection forms reliably, with very little risk of the link settling on a relayed path.

The reason to do this for that box specifically is the traffic pattern. Every bit of relocated web traffic from the first server crosses the link to the proxy box, continuously. That link is the one connection in the setup where a relayed fallback would actually be felt, so it is worth guaranteeing a direct path, and one open UDP port buys that guarantee.

Opening it costs almost nothing, because of WireGuard’s behaviour from Section 5.3. WireGuard is silent to anyone without a valid key: a packet that does not authenticate is dropped with no response. Port 41641 is open, but to a port scanner it looks the same as the firewalled ports around it, because a probe goes unanswered either way. Only a peer that already holds a valid key can do anything with it. It is an open port that an attacker cannot even confirm exists.
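
The "silent unless authenticated" behaviour can be illustrated with a stand-in. This sketch does not implement WireGuard's handshake: it swaps in a shared-key HMAC-SHA256 tag purely to show the shape of the idea, and the `seal` and `handle_packet` functions and the demo key are hypothetical. What matters is that the unauthenticated path returns nothing at all, so there is no reply for a scanner to observe.

```python
import hashlib
import hmac
from typing import Optional

TAG_LEN = 32  # bytes of HMAC-SHA256 appended to every packet


def seal(key: bytes, payload: bytes) -> bytes:
    # Append an authentication tag computed over the payload.
    return payload + hmac.new(key, payload, hashlib.sha256).digest()


def handle_packet(key: bytes, packet: bytes) -> Optional[bytes]:
    """Return a reply for authenticated packets, None (silence) otherwise."""
    if len(packet) < TAG_LEN:
        return None
    payload, tag = packet[:-TAG_LEN], packet[-TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None  # no error and no rejection message: nothing is sent back
    return seal(key, b"ack:" + payload)


SERVER_KEY = b"demo-key-not-a-real-wireguard-key"

probe = handle_packet(SERVER_KEY, b"\x00" * 64)               # scanner's guess
peer = handle_packet(SERVER_KEY, seal(SERVER_KEY, b"hello"))  # valid peer

print("scanner gets:", probe)
print("peer gets a reply:", peer is not None)
```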

8. What the setup demonstrates

Every distinction in this post is visible in two small servers.

Two of the three VPN instances are a privacy VPN and a mesh VPN, and they could not differ more in purpose or shape. Surfshark exists to relocate an exit point, so it is hub-and-spoke: traffic funnels through a central server on purpose, because that central server is the whole product. Tailscale exists to let my own machines reach each other, so it is a mesh: no hub, direct peer-to-peer tunnels, every node equal. One topology wants a chokepoint; the other removes it.

Underneath, though, both are WireGuard. The same protocol, the same handshake, the same cryptokey routing carries packets for two VPNs that share nothing at the level of purpose. That is the purpose-versus-protocol split made physical: the protocol is the recipe for a tunnel, the topology is what you build with it, and the two vary independently.

The mesh VPN is also where the opening puzzle resolved. Tailscale gets a packet into a server that accepts no connections not by opening a port but by never needing one. WireGuard supplies the encrypted tunnel and deliberately omits key distribution and NAT traversal; Tailscale supplies both, through a coordination server that distributes keys and addresses and arranges the rendezvous, and through hole punching that turns two outbound connections into a single direct link. The firewall keeps its rule the entire time. Nothing about it was weakened.

A few things are worth taking away from all of this.

VPN is a category, not a product or a protocol. It names a goal: make scattered machines behave as if they shared a private network. The consumer privacy app is one member of that category, not the category itself.

Purpose and protocol are independent questions. What a VPN is for (remote access, site-to-site, privacy, mesh) and how its tunnel is built (IPsec, OpenVPN, WireGuard) vary separately. The same protocol can serve any purpose, and the same purpose can run on different protocols.

WireGuard mostly settled the protocol question. It is small, fast, and modern enough that new VPN systems tend to build on it rather than replace it. The interesting engineering moved up a layer, to the control plane: discovery, key distribution, identity, and NAT traversal. That is the layer Tailscale, Headscale, and similar systems actually compete in.

The closed-port trick is not a trick. It is the quietly elegant idea at the bottom of the whole stack: a firewall only blocks what it was never asked for, so a system that only ever connects outbound, from both ends at once, never meets the rule in the first place. The lesson generalizes past VPNs. The cleanest way through a restriction is often to design so that you never depend on the thing it forbids.