
How to Fix Ahrefs Site Audit 404 Errors from Cloudflare’s Email Protection


If you’re running Ahrefs Site Audit and watching your site health score plummet due to hundreds of “Page has links to broken page” errors, you’re not alone. When every broken link error points to a mysterious /cdn-cgi/l/email-protection URL returning 404 status codes, you’ve encountered one of the most frustrating conflicts between Cloudflare’s security features and SEO crawlers.

This isn’t just about Ahrefs. Popular SEO auditing tools like Screaming Frog, Sitebulb, and SEMrush can all stumble over the same Cloudflare Email Address Obfuscation mechanism. The good news? There’s a proper solution that doesn’t require disabling your spam protection or settling for inaccurate audit data.

Understanding Why Cloudflare Email Obfuscation Breaks SEO Audits

Cloudflare’s Email Address Obfuscation is part of their Scrape Shield protection suite, designed to hide email addresses from spam harvesters while keeping them visible to human visitors through JavaScript decoding. When enabled, Cloudflare automatically transforms email links in your HTML from simple mailto: addresses into encoded paths under the /cdn-cgi/l/email-protection directory.

Here’s what actually happens behind the scenes. Your original markup might look like this:

<a href="mailto:contact@example.com">Get in touch</a>

Cloudflare’s edge servers intercept the response and transform it into:

<a href="/cdn-cgi/l/email-protection#5b3834352f3a382f1b3e233a362b373e75383436">Get in touch</a>
<script data-cfasync="false" src="/cdn-cgi/scripts/f2bf09f8/cloudflare-static/email-decode.min.js"></script>

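The hexadecimal string after the hash is your email address in obfuscated form: the first byte is a key chosen by Cloudflare, and each following byte is one character of the address XOR-ed with that key. This is obfuscation rather than real encryption, since the key travels with the data. Cloudflare injects the email-decode.min.js script to reverse the XOR and display the address when JavaScript executes in a browser.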
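The scheme is small enough to reproduce in a few lines. Here is a minimal Python sketch of the encoding as described above (key in the first byte, every later byte XOR-ed with it). The function names and the key value 0x5B are illustrative choices, and this is not Cloudflare’s own code.

```python
def encode_email(address: str, key: int) -> str:
    """Return the hex fragment: the key byte first, then each byte XOR-ed with the key."""
    data = bytes([key]) + bytes(b ^ key for b in address.encode("utf-8"))
    return data.hex()


def decode_email(fragment: str) -> str:
    """Reverse encode_email: read the key from the first byte, XOR the remaining bytes."""
    raw = bytes.fromhex(fragment)
    key, payload = raw[0], raw[1:]
    return bytes(b ^ key for b in payload).decode("utf-8")


# Round trip using the address from the example markup and an arbitrary key.
fragment = encode_email("contact@example.com", key=0x5B)
print(fragment)
print(decode_email(fragment))
```

Because the key is shipped inside the fragment, anyone who knows the scheme can decode it. The protection comes from harvesters not bothering to, not from secrecy.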

For human visitors browsing with JavaScript enabled, this process is invisible and seamless. The problem emerges when SEO crawlers encounter these transformed links.

Most technical SEO auditing tools operate without executing JavaScript during their crawl, or execute it in limited capacity. The crawler discovers internal links pointing to /cdn-cgi/l/email-protection#... URLs, attempts to fetch them, receives 404 responses (because these aren’t real pages), and dutifully reports them as broken links in your audit results.

This creates a cascade of false positives that can mask genuine technical SEO issues on your website. When your Ahrefs dashboard shows 300 broken internal links and 290 of them are email obfuscation artifacts, identifying the 10 actual problems becomes needle-in-haystack territory.
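To make the crawler’s view concrete, the sketch below parses HTML the way a non-JavaScript crawler might and separates email-protection links from everything else. It is a simplified illustration using only Python’s standard library; the class and function names are made up for this example, and it does not reflect how any particular audit tool is implemented.

```python
from html.parser import HTMLParser

EMAIL_PROTECTION_PREFIX = "/cdn-cgi/l/email-protection"


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags without running any JavaScript."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_links(html: str) -> tuple[list[str], list[str]]:
    """Return (email_protection_links, other_links) found in the HTML."""
    parser = LinkCollector()
    parser.feed(html)
    artifacts = [h for h in parser.hrefs if h.startswith(EMAIL_PROTECTION_PREFIX)]
    others = [h for h in parser.hrefs if not h.startswith(EMAIL_PROTECTION_PREFIX)]
    return artifacts, others
```

Running split_links over an exported list of flagged pages is one way to estimate how many of the reported broken links are obfuscation artifacts before changing anything in Cloudflare.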

Why the Popular robots.txt Solution Doesn’t Work

Search any SEO forum for “Ahrefs cdn-cgi 404 errors” and you’ll find countless threads suggesting you add this line to your robots.txt file:

Disallow: /cdn-cgi/

This advice appears logical on the surface but fundamentally misunderstands how site audit tools process discovered links. When you disallow a path in robots.txt, you’re not making those URLs invisible to crawlers – you’re merely telling them not to request those specific paths.

Here’s what actually happens with the robots.txt approach. The Ahrefs Site Audit crawler visits your homepage and parses the HTML. It discovers ten links pointing to various /cdn-cgi/l/email-protection#... URLs.

The crawler checks your robots.txt and sees that /cdn-cgi/ is disallowed. Following proper robot protocol, it doesn’t attempt to fetch those URLs. However, the crawler has already discovered these as internal links on your page. Since it cannot verify whether they’re working or broken (due to the robots.txt restriction), it reports them as problematic links that it was prevented from crawling.

The end result is identical to the original problem: hundreds of flagged issues in your audit report, your site health score takes a hit, and you still can’t distinguish real problems from false positives. The robots.txt method is a band-aid on a wound that needs stitches.
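The robots.txt mechanics can be illustrated with Python’s standard-library parser. This sketch only shows that a disallowed URL is still a discovered URL that the crawler has to account for; the function name and the example.com URLs are placeholders, and how Ahrefs reports such links is as described above, not something this code demonstrates.

```python
from urllib.robotparser import RobotFileParser


def classify(urls, robots_lines, user_agent="AhrefsSiteAudit"):
    """Map each discovered URL to True (may be fetched) or False (blocked by robots.txt)."""
    robots = RobotFileParser()
    robots.parse(robots_lines)
    return {url: robots.can_fetch(user_agent, url) for url in urls}


robots_txt = [
    "User-agent: *",
    "Disallow: /cdn-cgi/",
]
discovered = [
    "https://example.com/about",
    "https://example.com/cdn-cgi/l/email-protection",
]

for url, allowed in classify(discovered, robots_txt).items():
    # Blocked URLs stay in the discovered set; they are simply never requested.
    print(url, "fetch" if allowed else "blocked, still listed as a link")
```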

How to Whitelist Ahrefs IP Addresses in Cloudflare Configuration Rules

The only effective approach is creating targeted Cloudflare rules that selectively disable Email Address Obfuscation for verified SEO crawler traffic while maintaining protection for everyone else. Cloudflare’s Configuration Rules system provides exactly this granular control.

There are two implementation methods, each with distinct security implications and technical requirements.

Method 1, IP-based whitelisting, verifies that requests originate from Ahrefs’ publicly documented IP ranges before disabling email obfuscation.

IP address verification is significantly more secure than user-agent checking because spoofing an IP address from Ahrefs’ actual server infrastructure is technically impractical for malicious actors, whereas copying a user-agent string requires no technical sophistication whatsoever.
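The difference between the two checks can be sketched in Python. The ranges below are documentation-reserved placeholder blocks (RFC 5737), not real Ahrefs ranges, and the function names are illustrative.

```python
import ipaddress

# Placeholder ranges from documentation-reserved blocks, NOT real Ahrefs ranges.
ALLOWED_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def ip_in_ranges(ip: str, ranges: list[str]) -> bool:
    """True if the source IP falls inside any of the allowed CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(r) for r in ranges)


def user_agent_matches(user_agent: str, token: str = "AhrefsSiteAudit") -> bool:
    """True if the header contains the token; any client can send this string."""
    return token in user_agent


# A spoofed request: the user-agent check passes, the IP check does not.
spoofed_ua = "Mozilla/5.0 (compatible; AhrefsSiteAudit/6.1)"
print(user_agent_matches(spoofed_ua))
print(ip_in_ranges("203.0.113.5", ALLOWED_RANGES))
```

The user-agent value is entirely under the client’s control, while the source address of a completed HTTPS connection is not, which is why the IP check is the stronger signal.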

Prerequisites: This method requires access to create and manage IP Lists in your Cloudflare account. Navigate to Manage Account > Settings > Lists to verify you have this capability. Some Cloudflare plan levels or organizational permission structures may restrict IP List creation. If you encounter permission barriers, proceed directly to Method 2.

Creating the Ahrefs IP List

First, you need to define which IP addresses belong to Ahrefs’ crawler infrastructure:

  1. In your Cloudflare dashboard, navigate to Manage Account (top-left account switcher)
  2. Select Settings from the account-level menu
  3. Click the Lists tab and then Create list
  4. Name your list descriptively: ahrefs_crawler_ips or seo_audit_tools
  5. Set the content type to IP
  6. Add Ahrefs’ complete IP range inventory. Ahrefs maintains their crawler fleet across several IP blocks distributed globally. They publish and update this information at https://help.ahrefs.com/en/articles/78658. Copy and paste all IP ranges simultaneously.
  7. Click Add to list and then Save

Important maintenance note: Ahrefs occasionally adds new IP ranges as they expand infrastructure or shift providers. Set a calendar reminder to verify your IP list quarterly against their official documentation. Outdated IP lists will cause new Ahrefs crawler nodes to encounter the obfuscation issue, gradually reintroducing 404 errors into your audit results.
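The quarterly check amounts to a set comparison. The sketch below assumes you have both lists as plain CIDR strings (for example, copied from the Cloudflare list and from Ahrefs’ documentation); the function name is hypothetical and the sample ranges are documentation placeholders, not real crawler ranges.

```python
import ipaddress


def diff_ranges(configured: list[str], published: list[str]) -> tuple[list[str], list[str]]:
    """Return (ranges_to_add, ranges_to_remove) so the configured list matches the published one."""
    # Normalise through ipaddress so malformed CIDR strings raise instead of silently mismatching.
    cfg = {str(ipaddress.ip_network(r)) for r in configured}
    pub = {str(ipaddress.ip_network(r)) for r in published}
    return sorted(pub - cfg), sorted(cfg - pub)


to_add, to_remove = diff_ranges(
    configured=["192.0.2.0/24", "198.51.100.0/24"],
    published=["198.51.100.0/24", "203.0.113.0/24"],
)
print("add:", to_add)
print("remove:", to_remove)
```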

Implementing the Configuration Rule

Now you’ll create the rule that leverages your IP list:

  1. Return to your domain-specific dashboard (not account settings)
  2. Navigate to Rules in the left sidebar
  3. Click Create rule and select Configuration Rule from the dropdown
  4. Provide a clear, descriptive name like Allow Ahrefs Site Audit - Disable Email Obfuscation
  5. In the “When incoming requests match…” section, configure: IP Source Address → is in list → Select your ahrefs_crawler_ips list
  6. In the “Then…” actions section, locate Email Obfuscation
  7. Click Add next to Email Obfuscation and set the toggle to Off
  8. Review your configuration and click Deploy

The rule activates immediately across Cloudflare’s global edge network. Your next Ahrefs Site Audit crawl (whether scheduled or manually triggered) will encounter unobfuscated email addresses and report zero false-positive 404 errors from email protection.
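For teams that manage Cloudflare as code, the same rule can be expressed as data. The sketch below only builds a rule object and makes no API call. The field names (set_config, email_obfuscation) and the `ip.src in $list_name` expression reflect my understanding of Cloudflare’s Rulesets API for the configuration-rules phase and should be treated as assumptions to verify against the current documentation; the function name is hypothetical.

```python
import json


def build_config_rule(list_name: str) -> dict:
    """Build a rule object that turns Email Obfuscation off for IPs in the named list.

    Assumption: field names follow Cloudflare's Rulesets API for configuration rules.
    """
    return {
        "description": "Allow Ahrefs Site Audit - Disable Email Obfuscation",
        # "$name" refers to an account-level list in Cloudflare's rules language.
        "expression": f"(ip.src in ${list_name})",
        "action": "set_config",
        "action_parameters": {"email_obfuscation": False},
    }


print(json.dumps(build_config_rule("ahrefs_crawler_ips"), indent=2))
```

The dashboard steps above produce an equivalent rule, so this is only useful if you already keep Cloudflare settings in version control.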

Extending This Solution to Other SEO Tools

The Cloudflare email obfuscation conflict isn’t exclusive to Ahrefs. If you use multiple SEO auditing platforms or desktop crawlers like Screaming Frog SEO Spider (https://www.screamingfrog.co.uk/seo-spider/), Sitebulb, SEMrush Site Audit, or OnCrawl, you can expand your Cloudflare rules to accommodate them.

For IP-based whitelisting, create separate IP lists for each crawler service (most publish their IP ranges) and modify your Configuration Rule to check multiple lists using OR logic. For user-agent approaches, extend your expression with additional OR conditions. Here’s an example that covers multiple tools:

(http.user_agent contains "AhrefsSiteAudit") or
(http.user_agent contains "AhrefsBot") or
(http.user_agent contains "Screaming Frog SEO Spider") or
(http.user_agent contains "SemrushBot")

However, remember that each additional user-agent you whitelist increases the surface area for potential spoofing. The best practice for protecting against email harvesting while maintaining audit accuracy is implementing separate IP-based rules for each crawler service you legitimately use.
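If the list of tools changes often, the expression can be generated rather than hand-edited. This is a small sketch with hypothetical helper names; the tokens are the crawler names used above, and the local matcher only mimics a case-sensitive substring check to show how loosely such a rule matches.

```python
def build_user_agent_expression(tokens: list[str]) -> str:
    """Join one 'contains' clause per crawler token with 'or'."""
    for token in tokens:
        if '"' in token:
            raise ValueError(f"token must not contain a double quote: {token!r}")
    return " or ".join(f'(http.user_agent contains "{t}")' for t in tokens)


def would_match(user_agent: str, tokens: list[str]) -> bool:
    """Local approximation of the rule: true if any token appears in the header."""
    return any(t in user_agent for t in tokens)


TOKENS = ["AhrefsSiteAudit", "AhrefsBot", "Screaming Frog SEO Spider", "SemrushBot"]
print(build_user_agent_expression(TOKENS))
# A harvester only needs to include one token in its header to pass.
print(would_match("EvilHarvester/1.0 AhrefsBot", TOKENS))
```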

Verifying Your Solution Works

After implementing either method, verification ensures your configuration resolves the issue:

  1. Clear Cloudflare’s cache: Navigate to Caching in your Cloudflare dashboard and select Purge Everything. This forces fresh content generation with your new rule applied.
  2. Trigger a fresh Ahrefs crawl: In your Ahrefs project, initiate a new Site Audit. For scheduled audits, you may need to wait until the next automatic run, though you can typically trigger manual crawls immediately.
  3. Monitor the crawl progress: Watch the audit as it completes. Previously problematic pages should no longer generate /cdn-cgi/l/email-protection 404 errors.
  4. Check Cloudflare Firewall Events: Navigate to Security > Events in Cloudflare to verify Ahrefs requests are being processed according to your rule. Look for requests matching your configured criteria and confirm they’re showing “Configure Email Obfuscation: OFF” in the action details.
  5. Review audit results: Once the crawl completes, examine your site health score and broken link report. The cdn-cgi related 404 errors should have vanished, leaving only genuine technical issues requiring attention.

If you still observe email obfuscation errors after implementing your rule, verify that the rule is deployed and positioned correctly (Cloudflare processes rules in order), check that you haven’t inadvertently created conflicting rules, and confirm that Ahrefs hasn’t recently updated their crawler infrastructure.
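A quick spot check is also possible without waiting for a full crawl: fetch a page and look for the email-protection path in the returned HTML. The sketch below uses only the standard library, and the function names are illustrative. Note the limitation: a request from your own machine can only exercise a user-agent rule. With an IP-based rule, your request should still come back obfuscated, which is the expected result.

```python
import urllib.request

MARKER = "/cdn-cgi/l/email-protection"


def is_obfuscated(html: str) -> bool:
    """True if the HTML still contains Cloudflare email-protection links."""
    return MARKER in html


def fetch_html(url: str, user_agent: str, timeout: float = 10.0) -> str:
    """Fetch a page with a chosen User-Agent header and return the decoded body."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


# Example usage (replace the placeholder URL with a page that contains an email link):
#   html = fetch_html("https://example.com/contact", "AhrefsSiteAudit")
#   print("still obfuscated" if is_obfuscated(html) else "plain mailto links served")
```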

Final Recommendations

For most website operators, the IP whitelist method represents the optimal balance of security and functionality. The slight additional complexity of managing IP lists pays dividends in protection against email harvesting attempts by malicious actors spoofing legitimate crawler user-agents.

Never disable Cloudflare’s Email Address Obfuscation globally just to satisfy SEO auditing tools. Email addresses published in plain HTML remain prime targets for spam bots, and cleaner audit reports aren’t worth the resulting inbox pollution. Targeted, conditional disabling for verified crawlers gives you accurate SEO data while maintaining protection where it matters most.

The robots.txt approach, despite its popularity in forum discussions, simply doesn’t solve the underlying problem. It’s a misapplication of what robots.txt controls and leaves you with identical audit report issues. Skip this ineffective solution entirely and implement proper Cloudflare rule-based handling from the start.

With these Cloudflare Configuration Rules in place, your Ahrefs Site Audit will finally reflect genuine technical SEO issues rather than false positives from email protection mechanisms.

What causes Ahrefs 404 errors with Cloudflare email protection?

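They occur because Cloudflare’s Email Address Obfuscation rewrites mailto: links into /cdn-cgi/l/email-protection URLs. Crawlers such as Ahrefs Site Audit, which run little or no JavaScript, treat those URLs as internal links, request them, receive 404 responses, and report them as broken links.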

Can Cloudflare’s email protection be configured to allow Ahrefs crawlers?

Yes. Create a Cloudflare Configuration Rule that turns Email Obfuscation off only for requests from Ahrefs crawlers, ideally matched against an IP list of Ahrefs’ published ranges. All other visitors still receive obfuscated addresses.
