How to Implement Layer 7 DDoS Protection for Apps

Most teams that get hit by an application-layer attack already had DDoS protection deployed. The protection just wasn’t covering the right layer. Layer 7 DDoS protection solves a fundamentally different problem than network-layer defenses, and treating them as interchangeable is one of the most common and costly infrastructure mistakes in production.

The Layer Split Is an Architectural Decision, Not Just a Classification

L3/L4 defenses—BGP blackholing, anycast scrubbing, SYN-flood protection—operate on packet volume and protocol behavior. They’re excellent at absorbing 100 Gbps UDP amplification attacks or SYN floods because those attacks are distinguishable from legitimate traffic by volume alone. Hardware can process them at line rate without understanding what the packets contain.

L7 attacks are different. An HTTP flood, a slow POST, a cache-bypass sweep, or a credential-stuffing campaign all look like valid TCP connections with properly formatted HTTP headers. The packets arrive one at a time, each individually legitimate from a network perspective. Volume-based systems see nothing unusual until the origin collapses.

The architectural consequence is non-negotiable: L7 inspection requires a full inline HTTP proxy. You cannot scrub application-layer attacks out-of-path via BGP divert—there is no way to inspect request bodies, headers, and session patterns from a network scrubbing center that only sees packet flows. If your DDoS provider is not terminating every HTTP connection and proxying clean requests to your origin, you have zero L7 protection regardless of what the vendor dashboard says.

The Attack That Doesn’t Look Like an Attack

The most operationally dangerous L7 pattern is also the least discussed: the cache-bypass attack.

An attacker who understands your application’s cache key structure—and most CDN caching behavior is publicly observable—crafts requests that systematically avoid cached content. The simplest version appends a randomized query parameter (?v=7f3a2b, ?t=1718290000) to every request. More targeted versions rotate Accept-Language headers, Accept-Encoding values, or X-Forwarded-For addresses to defeat cache normalization rules.
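One common mitigation for the randomized-parameter variant is to normalize the cache key so that unknown query parameters never create new cache entries. The following is a minimal sketch of that idea, not any particular CDN's behavior. The `ALLOWED_PARAMS` allowlist and the `normalized_cache_key` function are hypothetical names, and the sketch assumes a single-host site where the path plus allowlisted parameters fully determines the response.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical allowlist: the only query parameters this application
# actually uses to vary its responses. Everything else is ignored.
ALLOWED_PARAMS = {"page", "q", "sort"}


def normalized_cache_key(url: str) -> str:
    """Build a cache key from the path plus allowlisted query parameters."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name in ALLOWED_PARAMS
    ]
    kept.sort()  # parameter order must not create distinct cache entries
    query = urlencode(kept)
    return f"{parts.path}?{query}" if query else parts.path
```

With this key, `/products?v=7f3a2b` and `/products?t=1718290000` both map to the same entry as `/products`, so the cache-busting parameter no longer forces an origin fetch. Header-rotation variants need the equivalent treatment on the header side (for example, collapsing Accept-Language to a small set of supported values).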

The result: every request passes through to origin. No CDN cache hit. No edge absorption. At 5,000 requests per second (a rate that produces less than 50 Mbps of bandwidth) a mid-tier application server can exhaust its database connection pool within two to three minutes. Origin response time degrades, the connection queue backs up, health checks start failing, and the load balancer marks nodes unhealthy.
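The gap between bandwidth and origin load can be checked with back-of-the-envelope arithmetic. The sketch below uses assumed figures that are not from any measurement: an average request size of about 1 KB, an average database query time of 50 ms once caches stop absorbing traffic, and a hypothetical pool of 200 connections. The connection estimate uses Little's law (average concurrency equals arrival rate times average time in the system).

```python
def inbound_mbps(requests_per_second: int, avg_request_bytes: int) -> float:
    """Inbound bandwidth in megabits per second for a given request rate."""
    return requests_per_second * avg_request_bytes * 8 / 1_000_000


def connections_in_use(requests_per_second: int, avg_query_ms: int) -> float:
    """Little's law: average concurrent DB connections needed at steady state."""
    return requests_per_second * avg_query_ms / 1000


# Assumed figures for illustration only.
ASSUMED_POOL_SIZE = 200
bandwidth = inbound_mbps(5000, 1000)      # ~1 KB requests
demand = connections_in_use(5000, 50)     # 50 ms per uncached query
pool_exhausted = demand > ASSUMED_POOL_SIZE
```

Under these assumptions the flood stays around 40 Mbps, well under a typical bandwidth alert, while the steady-state connection demand already exceeds the pool. Real numbers depend heavily on query cost and on how latency grows as the database saturates.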

From the L3/4 monitoring dashboard, this looks like a modest traffic uptick. No bandwidth alerts fire. No packet-rate thresholds are crossed. By the time operators notice the origin error rate climbing, the application is already unavailable.

In environments SIRAYA has observed across APAC deployments, this pattern is consistently underestimated because teams focus monitoring on bandwidth graphs rather than origin connection depth and cache hit ratio. Cache hit ratio dropping from 92% to 15% in a five-minute window is the real leading indicator—and most teams do not have that alert configured.
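A cache hit ratio alert of this kind can be sketched as a sliding window over per-interval counters. This is a minimal illustration under assumptions: the `CacheHitRatioMonitor` class, the one-sample-per-minute cadence, and the 50% threshold are all hypothetical, and in practice the counters would come from your CDN's logs or metrics export.

```python
from collections import deque
from typing import Optional


class CacheHitRatioMonitor:
    """Track cache hit ratio over a sliding window of per-interval samples."""

    def __init__(self, window_size: int = 5, min_ratio: float = 0.5,
                 min_requests: int = 1000):
        # Each sample is (hits, total) for one interval, e.g. one minute.
        self.samples = deque(maxlen=window_size)
        self.min_ratio = min_ratio
        self.min_requests = min_requests

    def record(self, hits: int, total: int) -> None:
        self.samples.append((hits, total))

    def ratio(self) -> Optional[float]:
        total = sum(t for _, t in self.samples)
        if total < self.min_requests:
            return None  # too little traffic to judge
        return sum(h for h, _ in self.samples) / total

    def should_alert(self) -> bool:
        current = self.ratio()
        return current is not None and current < self.min_ratio
```

Feeding it five one-minute samples at a 92% hit ratio leaves the alert quiet; once the window fills with samples at 15%, `should_alert()` turns true. The `min_requests` floor is there so that quiet periods with a handful of requests do not page anyone.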

The API Endpoint Blind Spot

A second failure pattern deserves equal attention. Many WAF and bot-management products protect web applications using JavaScript challenges: the CDN serves a lightweight JS snippet, verifies the browser can execute it, and then issues a clearance cookie. Browsers pass this check automatically. Bots that cannot run JavaScript fail it.

This works well for HTML applications. It fails completely for API endpoints.

Your /api/v1/authenticate endpoint, your GraphQL endpoint, your webhook receiver: none of these are called by a browser rendering a page. They’re called by mobile apps, backend services, and automated clients, and equally by credential-stuffing tools and API flood scripts. A JS challenge on an API endpoint will break your legitimate integrations, which cannot execute JavaScript any more than an attack script can, so the challenge has no way to tell the two apart.

A common deployment pattern that creates this exposure: WAF and bot management rules are configured for the primary domain, with /api/* routes either excluded from challenge rules or handled by a separate origin that lacks WAF coverage entirely. The API endpoints—which are often the highest-value attack surface for data exfiltration, account takeover, and service disruption—are left exposed.

For API paths, Layer 7 DDoS protection requires a different toolkit: rate limiting scoped to authenticated tokens or API keys (not IPs), TLS fingerprint analysis (JA3/JA4 signatures to identify non-browser clients), request body anomaly detection, and per-endpoint RPS thresholds tuned to actual traffic baselines rather than global defaults.
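The first and last items in that toolkit, limits keyed by API key and tuned per endpoint, can be sketched as a token bucket. This is an illustrative in-memory version, not a production design: the `TokenBucketLimiter` class, the example limits, and the injectable `clock` parameter are assumptions, and a real deployment would typically keep bucket state in a shared store at the gateway.

```python
import time


class TokenBucketLimiter:
    """Token bucket keyed by (API key, endpoint) instead of client IP."""

    def __init__(self, limits, default=(5.0, 10.0), clock=time.monotonic):
        # limits maps endpoint -> (refill tokens per second, burst capacity)
        self.limits = limits
        self.default = default
        self.clock = clock
        self.buckets = {}  # (api_key, endpoint) -> (tokens, last_seen)

    def allow(self, api_key: str, endpoint: str) -> bool:
        rate, burst = self.limits.get(endpoint, self.default)
        now = self.clock()
        tokens, last = self.buckets.get((api_key, endpoint), (burst, now))
        # Refill for the elapsed time, capped at the burst capacity.
        tokens = min(burst, tokens + (now - last) * rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self.buckets[(api_key, endpoint)] = (tokens, now)
        return allowed


# Hypothetical per-endpoint limits: strict on authentication, looser elsewhere.
limiter = TokenBucketLimiter({"/api/v1/authenticate": (1.0, 3.0)})
```

Because the bucket key is the API key rather than the source address, thousands of users behind one carrier IP each get their own budget, and one abusive key is throttled regardless of how many IPs it rotates through. The bucket dictionary grows without bound in this sketch, so a real version would need eviction.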

Why IP-Based Rate Limiting Fails at APAC Scale

IP-based rate limiting is the first control most teams reach for, and in many APAC deployments, it creates more problems than it solves.

Mobile carrier networks in Southeast Asia, South Korea, and parts of China use large-scale Carrier-Grade NAT (CGNAT). Thousands of legitimate users can share a single public IP address. An IP-rate limit tuned to block aggressive crawlers will trigger on the IP blocks of major mobile carriers, dropping legitimate traffic from real users mid-session. This shows up as a sudden spike in 429 errors concentrated in mobile user segments—often blamed on the CDN, sometimes on the application, rarely traced back to the rate-limit configuration.

At the same time, a well-resourced attacker running through a residential proxy network will distribute attack traffic across tens of thousands of IPs, keeping each IP’s individual request rate below any threshold loose enough to avoid the CGNAT false positives described above.

The more durable approach at the application layer is to rate-limit by authenticated session, API token, or behavioral fingerprint—not by IP. For unauthenticated endpoints where that is not possible, consider a shorter-window burst limit combined with progressive challenge escalation rather than a hard per-IP cutoff.
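A burst limit with progressive escalation might look like the sketch below. It is one possible interpretation, with hypothetical names and thresholds: a client that exceeds the burst limit is challenged rather than dropped, repeated noisy windows escalate to a block, and clean windows step the client back down. The `client_id` is assumed to be whatever identifier is available for unauthenticated traffic, such as a behavioral or TLS fingerprint.

```python
import time


class ProgressiveEscalation:
    """Short-window burst limit that escalates allow -> challenge -> block."""

    ACTIONS = ("allow", "challenge", "block")

    def __init__(self, burst_limit=20, window_seconds=10.0, clock=time.monotonic):
        self.burst_limit = burst_limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.state = {}  # client_id -> {"start", "count", "strikes"}

    def check(self, client_id: str) -> str:
        now = self.clock()
        s = self.state.setdefault(
            client_id, {"start": now, "count": 0, "strikes": 0}
        )
        if now - s["start"] >= self.window_seconds:
            # Window rolled over: a noisy window adds a strike,
            # a clean window forgives one.
            if s["count"] > self.burst_limit:
                s["strikes"] = min(s["strikes"] + 1, 2)
            else:
                s["strikes"] = max(s["strikes"] - 1, 0)
            s["start"], s["count"] = now, 0
        s["count"] += 1
        level = s["strikes"]
        if s["count"] > self.burst_limit:
            level = min(level + 1, 2)  # over the limit: one step harsher
        return self.ACTIONS[level]
```

The point of the design is that a first burst from a shared carrier IP or fingerprint costs real users a challenge, not a dropped session, while a client that keeps bursting window after window ends up blocked.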

The Origin IP Exposure That Nullifies Edge Protection

Deploying a CDN with full WAF and L7 DDoS coverage provides no protection if the attacker knows your origin server’s IP address and targets it directly.

Origin IP addresses are more discoverable than most teams expect. Historical DNS records (pre-CDN migration), certificate transparency logs that reveal hostnames resolving directly to the origin, IPv6 addresses left in application-level error messages, A records on subdomains not covered by the CDN, or responses from a misconfigured staging environment: any of these can expose the origin. Attackers enumerate these routinely before launching application-layer attacks, specifically to bypass edge protection.

The fix is to treat origin IP concealment as a hard requirement of L7 DDoS protection architecture, not a nice-to-have. At the firewall level, accept inbound HTTP/HTTPS connections only from your CDN provider’s published IP ranges and reject everything else. For higher-assurance setups, replace direct origin exposure entirely with an outbound tunnel or private connectivity (Cloudflare Tunnel, AWS PrivateLink, or an equivalent) so that the origin never has to accept public inbound connections at all.
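The allowlist check itself is simple to express. The sketch below uses Python's standard `ipaddress` module; the ranges shown are placeholders from the reserved documentation blocks, not any provider's real ranges, and `is_from_cdn` is a hypothetical helper name. In practice this rule belongs in the firewall or security group, with the application-level check as a second line of defense.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks.
# Replace with the ranges your CDN provider actually publishes.
CDN_RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in ("198.51.100.0/24", "203.0.113.0/24", "2001:db8::/32")
]


def is_from_cdn(remote_addr: str) -> bool:
    """Return True only if the connecting address is inside a CDN range."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # unparseable addresses are rejected, not trusted
    return any(addr in network for network in CDN_RANGES)
```

The check should use the TCP peer address of the connection, not a client-supplied header such as X-Forwarded-For, since the latter can be forged by anyone connecting directly to the origin. Published CDN ranges also change over time, so the list needs to be refreshed on a schedule.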

Choosing the Right Defense Architecture

The practical decision is not “L3/4 or L7”; it’s understanding which layer represents your dominant attack surface.

If you run infrastructure that is a target for volumetric attacks by virtue of what it is—a large DNS resolver, a financial exchange, a gaming platform handling high connection counts—you need L3/4 coverage with a high-capacity anycast network upstream of everything else.

If you run a web application or API, L7 is almost certainly where the risk concentrates. A mid-size application is not going to receive a 500 Gbps UDP flood. It will receive 30,000 HTTP requests per second to its unauthenticated password-reset endpoint, or 8,000 slow connections consuming all available worker threads, or a targeted crawl designed to exhaust your search index query budget.

For most web and API workloads, L7 inline protection at the CDN edge covers more real attack surface than L3/4 scrubbing alone. Running both layers together is the production-grade answer, but if you have to prioritize coverage given cost or operational complexity, the application layer is where modern attacks land.

A common architecture SIRAYA recommends for APAC-deployed applications runs three tiers: an anycast CDN edge with integrated WAF and bot management as the first contact point, origin-level connection rate limiting at the load balancer tier (tuned conservatively to avoid CGNAT false positives), and per-endpoint rate limits enforced in the API gateway layer for authenticated API paths. Each tier handles what it is well-positioned to detect; no single tier is expected to stop everything.

The WAF Mode Deployment Risk

One operational failure that appears with surprising regularity: WAF rulesets deployed in “detect” or “log-only” mode that were never promoted to “block” mode before traffic ramped up.

The reasoning is understandable: teams want to observe false positives before enforcing rules aggressively. But “detect mode” is not protection. It is monitoring. In production environments under real attack conditions, a WAF that is logging but not blocking is contributing to your post-incident report, not preventing the incident.

The better operational practice is to deploy in block mode on lower-risk endpoints first (static asset paths, documentation pages) while monitoring, then progressively expand rule enforcement to higher-sensitivity paths like authentication and API endpoints. This builds confidence in rule tuning without leaving the entire application exposed during the observation period.
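One way to keep a staged rollout honest is to make the enforcement state explicit and queryable, so that paths still in detect mode show up in a report instead of being forgotten. The sketch below is purely illustrative: the `ROLLOUT` table, the path patterns, and the mode names are hypothetical and do not correspond to any specific WAF product's configuration format.

```python
from fnmatch import fnmatchcase

# Hypothetical rollout table, evaluated top to bottom.
# Lower-risk paths are enforced first; sensitive paths follow after tuning.
ROLLOUT = [
    ("/static/*", "block"),
    ("/docs/*", "block"),
    ("/api/*", "detect"),
    ("/login", "detect"),
]
DEFAULT_MODE = "detect"


def waf_mode(path: str) -> str:
    """Return the enforcement mode for a request path."""
    for pattern, mode in ROLLOUT:
        if fnmatchcase(path, pattern):
            return mode
    return DEFAULT_MODE


def unenforced_patterns() -> list:
    """List path patterns that are still logging instead of blocking."""
    return [pattern for pattern, mode in ROLLOUT if mode != "block"]
```

Reviewing the output of `unenforced_patterns()` on a schedule, or failing a deployment check when it is non-empty past an agreed date, turns "promote to block mode" from an intention into a tracked task.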

Layer 7 DDoS protection is not a product you buy and forget. It is a configuration that must be matched to your specific application’s traffic patterns, API structure, authentication model, and geographic user distribution. The defaults rarely hold under a real targeted attack. The teams that hold are those who have tuned their rules against their own baseline traffic, know exactly what their cache behavior is under load, and have hidden their origin before the attacker starts looking for it.
