Network Reliability

See Network Failures Your Servers Can't

When users can't reach your servers, your servers have no idea. Network Error Logging turns every browser into a distributed probe, reporting DNS failures, TCP timeouts, and TLS errors before users give up.

The Problem

Your Server Logs Have a Massive Blind Spot

Here's an uncomfortable truth: if a user never reaches your servers, your servers will never know. DNS failures, TCP timeouts, TLS handshake errors — they all happen before a single byte reaches your infrastructure. Your application logs show nothing. Your APM tools show nothing. Everything looks fine.

As a rule of thumb, the "last mile" (the final 20% of the network path between your users and your servers) accounts for 80% of network problems. ISP routing issues, regional DNS outages, and BGP misconfigurations can affect thousands of users while your dashboards stay green.

In 2024, when 50% of Canva's users on Cox Communications in San Diego suddenly couldn't connect, Canva's servers just saw "a slight drop in traffic." No alerts fired. No errors logged. Without client-side visibility, diagnosing the issue would have taken hours. With it, they identified the ISP routing problem within minutes.

Meanwhile, your support team fields tickets saying "your site is down" while you're staring at perfect uptime metrics. The classic "it works for me" nightmare — except you have no data to prove otherwise.

The Solution

Network Error Logging: Your Users' Browsers Report Failures

Network Error Logging (NEL) is a W3C specification that lets browsers automatically report network failures to an endpoint you specify. When a user experiences a DNS resolution failure, TCP timeout, or TLS handshake error, their browser captures the details and sends a report to that endpoint once it can reach it, even though the original request never reached your servers.

NEL is supported by Chromium-based browsers (Chrome 71+, Edge 79+, and Opera 58+), which together account for roughly 80% of browser usage worldwide. It's been battle-tested at scale: Google has used NEL across all their domains since 2014 to detect DNS hijacking and BGP route leaks. Wikimedia uses it to power their public status page.

Reports include detailed diagnostic information: the failure phase (dns, connection, or application), the specific error type (for example dns.name_not_resolved, tcp.timed_out, or tls.cert.name_invalid), elapsed time, and the server IP if one was resolved. Everything you need to understand what went wrong.
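To make those fields concrete, here is a minimal Python sketch that reads one report and prints a one-line summary. The payload follows the report shape described in the NEL specification, but the URL and values are made-up examples, and the summarize helper is a hypothetical name used only for illustration.

```python
import json

# Illustrative payload shaped like a NEL report; the URL and values are
# made-up examples, not real traffic.
SAMPLE = """
[{
  "age": 1200,
  "type": "network-error",
  "url": "https://www.example.com/",
  "body": {
    "sampling_fraction": 1.0,
    "server_ip": "",
    "protocol": "",
    "method": "GET",
    "status_code": 0,
    "elapsed_time": 48,
    "phase": "dns",
    "type": "dns.name_not_resolved"
  }
}]
"""

def summarize(report: dict) -> str:
    """Build a one-line summary from the body of a single NEL report."""
    body = report["body"]
    # server_ip is empty when the failure happened before DNS resolution finished.
    server = body.get("server_ip") or "unresolved"
    return f'{body["phase"]}/{body["type"]} after {body["elapsed_time"]} ms (server: {server})'

# Browsers batch reports into a JSON array; keep only NEL entries.
summaries = [summarize(r) for r in json.loads(SAMPLE) if r.get("type") == "network-error"]
print(summaries[0])
```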

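Implementation requires just two HTTP response headers: no JavaScript SDK, no build process changes, no client-side weight. Add a NEL header and a Report-To header, and any browser that has successfully loaded your site over HTTPS will report failures on later requests for as long as the policy's max_age lasts.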
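As a rough sketch of what those two headers contain, the Python below builds both values as JSON. The collector URL is a placeholder and build_nel_headers is a hypothetical helper; the field names (group, max_age, endpoints, report_to, include_subdomains) come from the NEL and Reporting API specifications.

```python
import json

# Hypothetical collector URL; substitute your own HTTPS reporting endpoint.
COLLECTOR_URL = "https://reports.example.com/nel"

def build_nel_headers(max_age_seconds: int = 86400) -> dict[str, str]:
    """Return the two response headers that enable NEL for an origin."""
    report_to = {
        "group": "nel",              # name the NEL policy refers to
        "max_age": max_age_seconds,  # seconds the browser keeps the endpoint group
        "endpoints": [{"url": COLLECTOR_URL}],
    }
    nel = {
        "report_to": "nel",          # must match the group name above
        "max_age": max_age_seconds,  # seconds the browser keeps the policy
        "include_subdomains": False,
    }
    return {"Report-To": json.dumps(report_to), "NEL": json.dumps(nel)}

headers = build_nel_headers()
for name, value in headers.items():
    print(f"{name}: {value}")
```

Attach both headers to responses served over HTTPS. A collector hosted on separate infrastructure from the site itself is more likely to stay reachable when the site is not.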

The Challenge

But Raw Reports Are Just Noise

NEL gives you visibility, but individual error reports aren't actionable. A single tcp.timed_out could be a user's flaky WiFi. A dns.name_not_resolved might be a misconfigured corporate proxy. Without context, you're drowning in data that doesn't tell you anything.

The signal is in the patterns. When dozens of users from the same ASN report DNS failures, that's an ISP outage. When TCP timeouts cluster in a specific metro area, that's a regional network problem. When TLS errors spike after a deployment, that's a certificate misconfiguration. But finding these patterns requires aggregating reports by geography, ASN, and error type.
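The grouping step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a production detector: the asn and country fields are assumed to have been added by the collector from the reporting client's IP (browsers do not send them), the AS numbers are from the range reserved for documentation, and find_clusters with its min_reports threshold is a hypothetical helper.

```python
from collections import Counter

# Hypothetical enriched reports. Browsers do not send "asn" or "country";
# this sketch assumes the collector derived them from the client's IP.
reports = [
    {"asn": 64500, "country": "US", "type": "dns.name_not_resolved"},
    {"asn": 64500, "country": "US", "type": "dns.name_not_resolved"},
    {"asn": 64500, "country": "US", "type": "dns.name_not_resolved"},
    {"asn": 64501, "country": "DE", "type": "tcp.timed_out"},
    {"asn": 64502, "country": "US", "type": "dns.name_not_resolved"},
]

def find_clusters(reports, min_reports=3):
    """Return (asn, country, error type) groups with at least min_reports reports."""
    counts = Counter((r["asn"], r["country"], r["type"]) for r in reports)
    return [(key, n) for key, n in counts.most_common() if n >= min_reports]

clusters = find_clusters(reports)
for (asn, country, error_type), n in clusters:
    print(f"AS{asn} ({country}): {n} x {error_type}")
```

A fixed threshold is the simplest possible rule. A real detector would more likely compare each group against its own baseline rate within a time window.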

High-traffic sites face another challenge: volume. A site with millions of daily visitors could generate hundreds of thousands of error reports. Without sampling and intelligent filtering, you'll overwhelm any endpoint — and your team.
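NEL has sampling built in: a policy can set failure_fraction so that browsers report only a share of failures, and each report carries the sampling_fraction that was in effect. The sketch below shows one way to use both, with hypothetical helper names (sampled_policy, estimate_failures): emit a sampled policy, then scale received reports back up to an estimate of the true failure count.

```python
import json

def sampled_policy(failure_fraction: float, group: str = "nel", max_age: int = 86400) -> str:
    """Build a NEL header value that asks browsers to report only a fraction of failures."""
    if not 0.0 <= failure_fraction <= 1.0:
        raise ValueError("failure_fraction must be between 0.0 and 1.0")
    return json.dumps({
        "report_to": group,
        "max_age": max_age,
        "failure_fraction": failure_fraction,
    })

def estimate_failures(report_bodies) -> float:
    """Estimate true failures: each report stands for 1 / sampling_fraction events."""
    return sum(
        1.0 / body["sampling_fraction"]
        for body in report_bodies
        if body["sampling_fraction"] > 0
    )

print(sampled_policy(0.25))

# Five reports received at a 25% sampling rate suggest about 20 real failures.
bodies = [{"sampling_fraction": 0.25}] * 5
print(estimate_failures(bodies))  # 20.0
```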

And then there's the response question. Even when you identify a pattern, what do you do? If the problem is an ISP routing issue, you need evidence to escalate. If it's a CDN edge failure, you need data to route around it. Raw reports don't give you that.

The Answer

The Reporting API Turns Network Errors Into Insights

Aggregate reports, detect patterns, and route to your existing observability tools.

Regional Pattern Detection
Aggregate reports by geography and ASN to distinguish real outages from individual user issues. Know immediately when an ISP, region, or CDN edge is experiencing problems — not hours later when traffic drops.
Real-Time Alerting
Route network errors to AppSignal, webhooks, or Google Chat. Get notified when failure patterns emerge, not when customers complain. Catch DNS hijacking attempts and BGP issues as they happen.
Distributed Probing
Every browser becomes a network probe. No synthetic monitoring agents needed — your actual users report real connectivity from their actual locations, ISPs, and network conditions.
SLA Evidence
When a third-party network causes an outage, you need proof. NEL reports provide timestamped evidence of failures by ASN and geography — exactly what you need to escalate to ISPs or hold vendors accountable.

Ready to Monitor Network Errors?

See the failures your servers can't. Detect regional outages, ISP issues, and connectivity problems in minutes, not hours.