Why we dropped public UDP RADIUS and went pure RadSec

2026-05-11

Early on, Arbiter exposed a public UDP RADIUS endpoint on the internet, the same shape every other cloud NAC still ships. We turned it off. Every tenant now reaches the cloud through a RadSec tunnel from an Edge appliance, with no plain UDP exposed at all. This is why we made the call.

For a stretch of Arbiter's early life we ran a public UDP RADIUS endpoint on the open internet. It was the obvious thing to ship: low onboarding friction, every cloud NAC vendor does it, customers could point a switch at a hostname and have an Access-Accept ten seconds later. We have since turned that endpoint off. Every Arbiter tenant now talks to the cloud through a RadSec (RFC 6614) tunnel from an on-premises Edge appliance. There is no plain UDP exposed at arbiter-radius.arbiter.ie any more. This post is why we made that call.

What was wrong with the public UDP endpoint

Cleartext metadata over the public internet

RADIUS over UDP protects only the User-Password attribute, obfuscating it with an MD5 keystream derived from the shared secret. Everything else (the username, the MAC address, the NAS-IP, the Called-Station-Id, the VLAN names returned in the reply) travels in the clear. On a private network that's fine. Across the public internet it's a confidentiality problem on its own merits and a GDPR Article 32 problem the moment a regulator asks how you transport identifiers. We were not comfortable telling customers their MAC addresses traversed Hetzner's edge in plain text.
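To make the asymmetry concrete, here is a minimal sketch of the User-Password hiding scheme from RFC 2865 section 5.2, next to two attributes that get no protection at all. The helper names (hide_user_password, reveal_user_password, encode_attr) and the sample values are ours for illustration, not Arbiter code.

```python
import hashlib


def hide_user_password(password: bytes, secret: bytes, request_auth: bytes) -> bytes:
    """Hide a password as RFC 2865 section 5.2 describes (MD5 keystream, XOR)."""
    if len(request_auth) != 16:
        raise ValueError("Request Authenticator must be 16 octets")
    # Pad with NULs to a multiple of 16 octets (at least one block).
    padded = password + b"\x00" * (-len(password) % 16) or b"\x00" * 16
    hidden = b""
    previous = request_auth
    for i in range(0, len(padded), 16):
        mask = hashlib.md5(secret + previous).digest()
        block = bytes(p ^ m for p, m in zip(padded[i:i + 16], mask))
        hidden += block
        previous = block  # each block is chained on the previous ciphertext block
    return hidden


def reveal_user_password(hidden: bytes, secret: bytes, request_auth: bytes) -> bytes:
    """Reverse the hiding; only a holder of the shared secret can do this."""
    plain = b""
    previous = request_auth
    for i in range(0, len(hidden), 16):
        mask = hashlib.md5(secret + previous).digest()
        block = hidden[i:i + 16]
        plain += bytes(c ^ m for c, m in zip(block, mask))
        previous = block
    return plain.rstrip(b"\x00")


def encode_attr(attr_type: int, value: bytes) -> bytes:
    """Encode one RADIUS attribute as type, length, value."""
    return bytes([attr_type, len(value) + 2]) + value


if __name__ == "__main__":
    secret, request_auth = b"s3cret", bytes(range(16))
    attrs = (
        encode_attr(1, b"alice")                      # User-Name: cleartext
        + encode_attr(2, hide_user_password(b"hunter2", secret, request_auth))
        + encode_attr(31, b"AA-BB-CC-DD-EE-FF")       # Calling-Station-Id: cleartext
    )
    print(attrs)
```

Printing the encoded attributes shows the username and MAC address as readable bytes on either side of the hidden password, which is exactly what a passive observer on the path sees.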

BlastRADIUS and the broader UDP attack surface

CVE-2024-3596 (BlastRADIUS) showed that the MD5-based Response Authenticator in classic RADIUS lets an on-path attacker sitting between the RADIUS client and server forge a valid Access-Accept in response to a failed request. The published mitigation is mandatory Message-Authenticator enforcement, which we do. But the broader point is that the attack class only exists because unauthenticated UDP gives an adversary on the path packets to modify and inject. Moving the transport onto a mutually authenticated TLS tunnel removes the attack surface, not just the specific exploit. We would rather be off the road than wearing a better seatbelt.
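For readers who have not looked at what Message-Authenticator enforcement involves, the sketch below builds an Access-Request carrying the attribute (type 80, HMAC-MD5 keyed with the shared secret, computed over the packet with the attribute value zeroed, per RFC 3579) and rejects any packet where it is missing or wrong. It is an illustration under our own assumptions, not Arbiter's listener: the function names are hypothetical, and only the request direction is covered.

```python
import hashlib
import hmac
import struct

MESSAGE_AUTHENTICATOR = 80  # attribute type; value is always 16 octets
ACCESS_REQUEST = 1


def build_access_request(identifier: int, request_auth: bytes,
                         attrs: bytes, secret: bytes) -> bytes:
    """Build an Access-Request with Message-Authenticator as the first attribute."""
    placeholder = bytes([MESSAGE_AUTHENTICATOR, 18]) + bytes(16)
    body = placeholder + attrs
    header = struct.pack("!BBH", ACCESS_REQUEST, identifier, 20 + len(body))
    packet = bytearray(header + request_auth + body)
    # HMAC-MD5 over the whole packet while the value field is still zeroed.
    mac = hmac.new(secret, bytes(packet), hashlib.md5).digest()
    packet[22:38] = mac  # 20-octet header, then type and length octets
    return bytes(packet)


def verify_message_authenticator(packet: bytes, secret: bytes) -> bool:
    """Return True only if a valid Message-Authenticator is present."""
    if len(packet) < 20:
        return False
    end = struct.unpack("!H", packet[2:4])[0]
    if end < 20 or end > len(packet):
        return False
    offset = 20
    while offset + 2 <= end:
        attr_type, attr_len = packet[offset], packet[offset + 1]
        if attr_len < 2 or offset + attr_len > end:
            return False  # malformed attribute
        if attr_type == MESSAGE_AUTHENTICATOR:
            if attr_len != 18:
                return False
            received = packet[offset + 2:offset + 18]
            zeroed = packet[:offset + 2] + bytes(16) + packet[offset + 18:end]
            expected = hmac.new(secret, zeroed, hashlib.md5).digest()
            return hmac.compare_digest(received, expected)
        offset += attr_len
    return False  # enforcement: a missing attribute is a reject, not a pass
```

The last line is the part that matters for enforcement: a packet without the attribute is dropped rather than waved through.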

No clean path for inbound Change-of-Authorization

RFC 5176 CoA is how a modern NAC pushes a session change (VLAN flip, session termination, quarantine) after the initial authentication. With public UDP, the cloud has to reach back into the customer's network on UDP/3799, which means an inbound firewall rule, which most security teams refuse and we would not recommend they accept. The workaround on UDP is polling, which is slow and noisy. There was no good answer here.

A public RADIUS port is an abuse magnet

An internet-facing UDP/1812 attracts brute force, replay attempts and reconnaissance traffic from the moment it goes up. We added per-source rate limiting and Message-Authenticator gates and it cut the noise, but the steady-state log volume from drive-by abuse was an operational tax we were paying for a transport we did not want to recommend in the first place.
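Per-source rate limiting of the kind mentioned above is commonly built as a token bucket keyed on the source address. The sketch below shows that general technique; the class name, parameters and thresholds are hypothetical and do not describe the limiter we actually ran.

```python
import time
from typing import Callable, Dict, Tuple


class SourceRateLimiter:
    """Token bucket per source address: `rate` tokens per second, up to `burst`."""

    def __init__(self, rate: float, burst: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.burst = burst
        self._clock = clock
        # source address -> (tokens remaining, time of last update)
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, source: str) -> bool:
        """Return True if a packet from `source` may be processed now."""
        now = self._clock()
        tokens, last = self._buckets.get(source, (self.burst, now))
        # Refill for the time elapsed since the last packet, capped at burst.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[source] = (tokens - 1.0, now)
            return True
        self._buckets[source] = (tokens, now)
        return False


if __name__ == "__main__":
    limiter = SourceRateLimiter(rate=5.0, burst=10.0)
    print(limiter.allow("198.51.100.7"))
```

A limiter like this caps the work each source can cause, but every dropped packet still has to be received, parsed far enough to attribute, and logged, which is the operational tax described above.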

UDP across the WAN is fragile

UDP does not retransmit. Asymmetric routes, NAT timeouts and ISP path changes meant occasional lost auths even on otherwise healthy networks. The customer experience was a switch that worked nineteen times out of twenty and a help desk ticket nobody could reproduce.

The decision: pure RadSec, no exceptions

We removed the public UDP listener. Every Arbiter tenant connects via an Edge appliance over a single outbound TCP/2083 RadSec tunnel, authenticated with a per-tenant mTLS client cert. RADIUS, accounting, DHCP profiling discoveries, CoA and operator diagnostics all share that one tunnel. No inbound firewall rules at the customer site. No public UDP RADIUS port at arbiter-radius.arbiter.ie. No cleartext identifiers on the internet.

We were honest with ourselves about the trade. We lost the frictionless onboarding story where a prospect points a Cisco switch at a hostname and gets a green light in the demo. We accept that cost. The kind of customer who refuses to deploy a 2 vCPU appliance for security plumbing was rarely the kind we were going to do good work for.

What pure RadSec gives us

All RADIUS traffic encrypted on the wire under mutual TLS, including the metadata that UDP leaks in cleartext.
Certificate-based trust using per-tenant ECDSA PKI, with strict isolation between tenants.
A single outbound TLS connection per site. No inbound holes for RADIUS, accounting or CoA.
Change-of-Authorization over the existing tunnel, so VLAN flips and quarantine actions land in seconds without firewall changes.
A local cache on the Edge that keeps authentication working through cloud or WAN outages, then resynchronises automatically.

First-boot banner over SSH. After activation the appliance is sealed: bootstrap password locked, password SSH disabled, configuration portal-driven from then on.

How the Edge actually works

The Edge ships as a small Debian 13 virtual appliance: two vCPUs, 2 GB RAM, 8 GB disk. Download the OVA (or the .deb if you'd rather drop it onto an existing host) and import into VMware, Hyper-V, Proxmox or VirtualBox. Boot. In your tenant portal you click Issue activation token on the appliance row, copy the one-time token (Argon2id-hashed at rest, plaintext shown exactly once) and paste it at the appliance's first-boot prompt.

The appliance then opens a single outbound TCP/2083 connection to arbiter-radius.arbiter.ie and authenticates with an mTLS client cert signed by your tenant's own Arbiter root: ECDSA P-256, 10-year validity, private key AES-256-GCM-wrapped under a per-tenant data encryption key that lives on a separate host the auth pipeline never reaches. Neither machine alone can sign a cert.
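From the client side, a connection of this shape looks roughly like the following, using Python's standard ssl module. This is a generic mTLS client sketch, not the Edge's code: the function names, the file paths passed in and the TLS 1.2 floor are our assumptions for illustration.

```python
import socket
import ssl
from typing import Optional

RADSEC_PORT = 2083  # RadSec over TCP, RFC 6614


def make_radsec_context(ca_file: Optional[str] = None,
                        cert_file: Optional[str] = None,
                        key_file: Optional[str] = None) -> ssl.SSLContext:
    """Build a client context that verifies the server and presents a client cert."""
    # PROTOCOL_TLS_CLIENT enables hostname checking and CERT_REQUIRED by default.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # assumed floor
    if ca_file:
        # Trust only the tenant root (hypothetical path), not the system store.
        ctx.load_verify_locations(cafile=ca_file)
    if cert_file:
        # The client certificate and key are the "m" in mTLS.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx


def open_tunnel(host: str, ctx: ssl.SSLContext, timeout: float = 10.0) -> ssl.SSLSocket:
    """Open one outbound TCP connection and complete the TLS handshake."""
    raw = socket.create_connection((host, RADSEC_PORT), timeout=timeout)
    try:
        raw.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
        return ctx.wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

Everything is initiated outbound from the customer site, so the only firewall requirement is permitting TCP/2083 egress.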

Inside the tunnel runs a four-frame protocol carrying RADIUS, accounting, DHCP discoveries for endpoint profiling and Change-of-Authorization, plus operator-initiated diagnostic SSH gated by the tenant audit log. CoA is fan-in by design: any source (MDM compliance flip, captive-portal accept, operator click) writes a row to a coa_pending queue and a dispatcher pushes a frame down the matching tunnel. The session on the switch re-evaluates in seconds, with no inbound hole in your firewall.
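The fan-in pattern can be sketched as a table plus a dispatcher loop. Only the coa_pending name comes from the description above; the column names, the enqueue_coa and dispatch_pending helpers, and the idea of representing a tunnel as a send callable are hypothetical, chosen to keep the example self-contained.

```python
import sqlite3
from typing import Callable, Dict


def open_queue() -> sqlite3.Connection:
    """Create an in-memory coa_pending table (columns are assumed)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE coa_pending (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               appliance_id TEXT NOT NULL,
               session_id TEXT NOT NULL,
               action TEXT NOT NULL,
               source TEXT NOT NULL,
               dispatched INTEGER NOT NULL DEFAULT 0)"""
    )
    return db


def enqueue_coa(db: sqlite3.Connection, appliance_id: str, session_id: str,
                action: str, source: str) -> int:
    """Any producer (MDM, captive portal, operator) writes one row."""
    cur = db.execute(
        "INSERT INTO coa_pending (appliance_id, session_id, action, source) "
        "VALUES (?, ?, ?, ?)",
        (appliance_id, session_id, action, source),
    )
    db.commit()
    return cur.lastrowid


def dispatch_pending(db: sqlite3.Connection,
                     tunnels: Dict[str, Callable[[dict], None]]) -> int:
    """Push each pending row down its appliance's tunnel; return how many were sent."""
    rows = db.execute(
        "SELECT id, appliance_id, session_id, action FROM coa_pending "
        "WHERE dispatched = 0 ORDER BY id"
    ).fetchall()
    sent = 0
    for row_id, appliance_id, session_id, action in rows:
        send = tunnels.get(appliance_id)
        if send is None:
            continue  # tunnel not connected: leave the row queued for later
        send({"session_id": session_id, "action": action})
        db.execute("UPDATE coa_pending SET dispatched = 1 WHERE id = ?", (row_id,))
        sent += 1
    db.commit()
    return sent
```

Because producers only ever write rows, adding a new CoA source does not touch the dispatcher, and a row for a disconnected appliance simply waits until its tunnel returns.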

When the link drops, the appliance answers from a seven-day SQLite cache of every Access-Accept and Access-Reject it has seen, with re-signed Message-Authenticators. TCP keepalives detect a half-open path inside a minute, and devices the appliance has never seen fall through to the switch's own dead-server policy. Time discipline runs against NTS-authenticated stratum-1 NTP. The appliance has no open inbound ports and no console wizard after first boot. Updates land via a signed apt repository on a schedule the cloud controls.
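A cache with those semantics can be sketched in a few lines of SQLite. The table layout, function names and the choice to key on a single identity string are assumptions for illustration; re-signing the Message-Authenticator on cached replies is not shown.

```python
import sqlite3
import time
from typing import Optional

CACHE_TTL_SECONDS = 7 * 24 * 60 * 60  # seven days


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the verdict cache; the schema is assumed."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS auth_cache (
               identity TEXT PRIMARY KEY,
               verdict TEXT NOT NULL,
               seen_at REAL NOT NULL)"""
    )
    return db


def remember(db: sqlite3.Connection, identity: str, verdict: str,
             now: Optional[float] = None) -> None:
    """Record the latest cloud verdict ('accept' or 'reject') for an identity."""
    seen_at = time.time() if now is None else now
    db.execute(
        "INSERT OR REPLACE INTO auth_cache (identity, verdict, seen_at) "
        "VALUES (?, ?, ?)",
        (identity, verdict, seen_at),
    )
    db.commit()


def lookup(db: sqlite3.Connection, identity: str,
           now: Optional[float] = None) -> Optional[str]:
    """Return the cached verdict, or None if the identity is unknown or stale."""
    current = time.time() if now is None else now
    row = db.execute(
        "SELECT verdict, seen_at FROM auth_cache WHERE identity = ?", (identity,)
    ).fetchone()
    if row is None:
        return None  # never seen: stay silent, the switch applies its own policy
    verdict, seen_at = row
    if current - seen_at > CACHE_TTL_SECONDS:
        return None  # older than seven days: treat as unknown
    return verdict
```

Returning None rather than a default verdict is the design point: an unknown device gets no answer from the appliance, so the switch's dead-server policy decides instead of a guess.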

Summary

Public UDP RADIUS on the internet is the path of least resistance and the wrong default. Cleartext metadata, a live UDP attack surface, no clean inbound CoA and a transport that drops packets across the WAN are not problems you patch around; they are reasons to change the transport. We did. Arbiter is RadSec-only, via the Edge, end to end. If a vendor is still offering you a cloud RADIUS hostname to point your switch at over UDP, ask them what they are doing about the four problems above.