Bot Traffic

I discovered Peter Rukavina’s post Bots Are Eating My Blog for Lunch thanks to this post on Kev’s blog. In short, by analyzing his server’s logs he found that around 85% of his blog’s traffic is generated by bots.

I couldn’t resist following the same procedure to analyze my blog’s traffic from January to May 2025:

  • SemrushBot accounts for 64.2% of my traffic
  • 18.6% are visits from other bots
  • 17.2% is human traffic
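For readers who want to try a similar breakdown, here is a minimal sketch of counting requests per user agent. It is not the exact procedure used for this post: it assumes the server writes Apache's "combined" log format, and the function names and the log path in the usage comment are made up for illustration.

```python
import re
import sys
from collections import Counter

# In the "combined" format each line ends with: "referer" "user-agent".
# Assumption: user agents contain no escaped quotes; such lines are skipped.
LINE_RE = re.compile(r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"\s*$')


def count_user_agents(lines):
    """Return a Counter mapping each user-agent string to its request count."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            counts[match.group("agent")] += 1
    return counts


def share_report(counts, top=20):
    """Return (agent, requests, percent) tuples for the most frequent agents."""
    total = sum(counts.values())
    return [
        (agent, n, round(100 * n / total, 1))
        for agent, n in counts.most_common(top)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python ua_report.py /var/log/apache2/access.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for agent, n, pct in share_report(count_user_agents(fh)):
            # Apache logs a missing User-Agent header as "-".
            print(f"{n:>10,}  {pct:>5}%  {agent}")
```

Grouping the raw strings into categories such as "SEO bot" or "human" is a separate, more subjective step that this sketch does not attempt.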

Claude.ai generated this nice table:

<div style="background: rgba(255, 255, 255, 0.95); backdrop-filter: blur(10px); border-radius: 15px; padding: 25px;">
    
    <h2 style="color: #2d3748; text-align: center; font-size: 2rem; margin-bottom: 10px; background: linear-gradient(135deg, #4facfe 0%, #00f2fe 100%); -webkit-background-clip: text; -webkit-text-fill-color: transparent; background-clip: text; margin-top: 0;">Apache Log Analysis</h2>
    
    <p style="text-align: center; color: #718096; font-size: 1.1rem; margin-bottom: 25px;">User Agent Summary & Traffic Breakdown</p>
    
    <!-- Stats Grid -->
    <div style="display: grid; grid-template-columns: repeat(auto-fit, minmax(180px, 1fr)); gap: 15px; margin-bottom: 25px;">
        <div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 15px; border-radius: 12px; text-align: center; box-shadow: 0 8px 20px rgba(102, 126, 234, 0.3);">
            <div style="font-size: 1.8rem; font-weight: bold; margin-bottom: 5px;">1,579,188</div>
            <div style="font-size: 0.85rem; opacity: 0.9;">Total Requests</div>
        </div>
        <div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 15px; border-radius: 12px; text-align: center; box-shadow: 0 8px 20px rgba(102, 126, 234, 0.3);">
            <div style="font-size: 1.8rem; font-weight: bold; margin-bottom: 5px;">20</div>
            <div style="font-size: 0.85rem; opacity: 0.9;">Unique User Agents</div>
        </div>
        <div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 15px; border-radius: 12px; text-align: center; box-shadow: 0 8px 20px rgba(102, 126, 234, 0.3);">
            <div style="font-size: 1.8rem; font-weight: bold; margin-bottom: 5px;">64.2%</div>
            <div style="font-size: 0.85rem; opacity: 0.9;">SemrushBot Traffic</div>
        </div>
        <div style="background: linear-gradient(135deg, #667eea 0%, #764ba2 100%); color: white; padding: 15px; border-radius: 12px; text-align: center; box-shadow: 0 8px 20px rgba(102, 126, 234, 0.3);">
            <div style="font-size: 1.8rem; font-weight: bold; margin-bottom: 5px;">17.0%</div>
            <div style="font-size: 0.85rem; opacity: 0.9;">Human Traffic</div>
        </div>
    </div>
    
    <!-- Responsive Table Container -->
    <div style="overflow-x: auto; margin-top: 20px;">
        <table style="width: 100%; border-collapse: collapse; background: white; border-radius: 12px; overflow: hidden; box-shadow: 0 8px 25px rgba(0, 0, 0, 0.1); min-width: 600px;">
            <thead>
                <tr style="background: linear-gradient(135deg, #4facfe 0%, #00f2fe 100%); color: white;">
                    <th style="padding: 12px 10px; text-align: left; font-weight: 600; font-size: 0.9rem;">Category</th>
                    <th style="padding: 12px 10px; text-align: left; font-weight: 600; font-size: 0.9rem;">User Agent / Bot</th>
                    <th style="padding: 12px 10px; text-align: left; font-weight: 600; font-size: 0.9rem;">Requests</th>
                    <th style="padding: 12px 10px; text-align: left; font-weight: 600; font-size: 0.9rem;">%</th>
                    <th style="padding: 12px 10px; text-align: left; font-weight: 600; font-size: 0.9rem;">Description</th>
                </tr>
            </thead>
            <tbody>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #e53e3e;">SEO Bot</td>
                    <td style="padding: 10px; color: #2d3748;">SemrushBot</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">1,013,643</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">64.2%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">SEO analysis and competitive research</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #718096;">Unknown</td>
                    <td style="padding: 10px; color: #2d3748;">Empty/Null User Agent</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">293,203</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">18.6%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Missing or stripped user agent</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #e53e3e;">Monitoring</td>
                    <td style="padding: 10px; color: #2d3748;">UptimeRobot</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">80,669</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">5.1%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Website uptime monitoring service</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #e53e3e;">SEO Bot</td>
                    <td style="padding: 10px; color: #2d3748;">AhrefsBot</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">30,751</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">1.9%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">SEO backlink analysis crawler</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #e53e3e;">Image Bot</td>
                    <td style="padding: 10px; color: #2d3748;">ImagesiftBot</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">26,659</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">1.7%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Image analysis and processing</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #e53e3e;">E-commerce</td>
                    <td style="padding: 10px; color: #2d3748;">Amazonbot</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">25,723</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">1.6%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Amazon's web crawler</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #e53e3e;">Search Bot</td>
                    <td style="padding: 10px; color: #2d3748;">MJ12bot v1.4.8</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">17,493</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">1.1%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Search engine crawler</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #38a169;">Desktop</td>
                    <td style="padding: 10px; color: #2d3748;">Chrome 91 (Windows 10)</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">14,711</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.9%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Human user - Windows desktop</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #3182ce;">Mobile</td>
                    <td style="padding: 10px; color: #2d3748;">Chrome 90 (Android - Redmi)</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">13,408</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.8%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Human user - Android mobile</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #e53e3e;">Search Bot</td>
                    <td style="padding: 10px; color: #2d3748;">PetalBot</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">12,997</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.8%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Huawei search engine crawler</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #38a169;">Desktop</td>
                    <td style="padding: 10px; color: #2d3748;">Safari 18.4 (Mac OS X)</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">12,922</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.8%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Human user - Mac desktop</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #3182ce;">Mobile</td>
                    <td style="padding: 10px; color: #2d3748;">Chrome 60 (Samsung)</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">12,754</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.8%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Human user - Samsung mobile</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #e53e3e;">Custom Bot</td>
                    <td style="padding: 10px; color: #2d3748;">l9explore</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">12,715</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.8%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Custom exploration/scraping tool</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #d69e2e;">Social Media</td>
                    <td style="padding: 10px; color: #2d3748;">Facebook External Agent</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">12,339</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.8%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Facebook link preview crawler</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #38a169;">Desktop</td>
                    <td style="padding: 10px; color: #2d3748;">Chrome 80 (Mac OS X)</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">11,986</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.8%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Human user - Mac desktop (older)</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #e53e3e;">Search Bot</td>
                    <td style="padding: 10px; color: #2d3748;">MJ12bot v2.0.0</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">9,769</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.6%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Search engine crawler (newer)</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #e53e3e;">AI Bot</td>
                    <td style="padding: 10px; color: #2d3748;">GPTBot (OpenAI)</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">9,613</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.6%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">OpenAI's web crawler for AI training</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #805ad5;">RSS Reader</td>
                    <td style="padding: 10px; color: #2d3748;">Feedbin</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">9,355</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.6%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">RSS feed reader (3 subscribers)</td>
                </tr>
                <tr style="border-bottom: 1px solid #e2e8f0;">
                    <td style="padding: 10px; font-weight: 600; color: #38a169;">Desktop</td>
                    <td style="padding: 10px; color: #2d3748;">Edge 114 (Windows 10)</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">8,727</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.6%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Human user - Windows Edge</td>
                </tr>
                <tr>
                    <td style="padding: 10px; font-weight: 600; color: #38a169;">Desktop</td>
                    <td style="padding: 10px; color: #2d3748;">Chrome 133 (Mac OS X)</td>
                    <td style="padding: 10px; font-weight: 600; color: #2d3748;">8,427</td>
                    <td style="padding: 10px; color: #718096; font-size: 0.9rem;">0.5%</td>
                    <td style="padding: 10px; color: #4a5568; font-size: 0.85rem;">Human user - Mac desktop (latest)</td>
                </tr>
            </tbody>
        </table>
    </div>
    
    <!-- Summary Note -->
    <div style="margin-top: 20px; padding: 15px; background: #f7fafc; border-radius: 10px; border-left: 4px solid #4facfe;">
        <p style="margin: 0; color: #4a5568; font-size: 0.9rem;"><strong>Key Insights:</strong> SemrushBot dominates with 64.2% of traffic, while legitimate human users account for only 17%. Consider implementing bot blocking or rate limiting to improve server performance and reduce bandwidth costs.</p>
    </div>
    
</div>

I don’t mind bot visits in general, but Semrush’s traffic seems excessive. I decided to block the bots that generated over 10% of the traffic in my robots.txt file.
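As a rough check of which agents cross a given cutoff, the sketch below recomputes shares from the request counts in the table above. The cutoff is left as a parameter, and the function name is only illustrative.

```python
# Request counts copied from the first rows of the table above.
TOTAL_REQUESTS = 1_579_188
REQUESTS = {
    "SemrushBot": 1_013_643,
    "Empty/Null User Agent": 293_203,
    "UptimeRobot": 80_669,
    "AhrefsBot": 30_751,
}


def agents_over(threshold, requests=REQUESTS, total=TOTAL_REQUESTS):
    """Return {agent: share} for agents whose share of requests exceeds threshold."""
    return {
        agent: round(n / total, 3)
        for agent, n in requests.items()
        if n / total > threshold
    }


print(agents_over(0.10))
# {'SemrushBot': 0.642, 'Empty/Null User Agent': 0.186}
```

A lower cutoff such as 0.05 would also pick up UptimeRobot at 5.1%. Of the two agents above 10%, only SemrushBot can be addressed by name in robots.txt.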

# robots.txt - Block bots with 10%+ traffic
# Generated based on Apache log analysis

# Block SemrushBot (64.2% of traffic)
# SEO analysis and competitive research bot
User-agent: SemrushBot
Disallow: /

# Crawlers match the group above by product token ("SemrushBot"),
# so versioned strings such as SemrushBot/7~bl need no separate entries.

# Note: Cannot block empty/null user agents via robots.txt
# Consider server-level blocking for requests without User-Agent headers

# Allow other legitimate bots and crawlers
# UptimeRobot (5.1%) - monitoring service, usually beneficial
User-agent: UptimeRobot
Allow: /

# AhrefsBot (1.9%) - SEO bot, but lower traffic
User-agent: AhrefsBot
Allow: /

# Amazonbot (1.6%) - e-commerce crawler
User-agent: Amazonbot
Allow: /

# GPTBot (0.6%) - OpenAI's crawler for AI training
# Uncomment the following lines if you want to block AI training
# User-agent: GPTBot
# Disallow: /

# Facebook crawler (0.8%) - for link previews
User-agent: facebookexternalhit
Allow: /

# Allow all other bots by default
User-agent: *
Allow: /
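One way to sanity-check rules like these before deploying them is Python's standard-library `urllib.robotparser`. The sketch below feeds it a trimmed copy of the file (the blocked group plus the catch-all) and asks whether a few agents may fetch the home page. Its matching is only one implementation of the robots.txt conventions, so a real crawler may behave differently; the helper name is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Trimmed copy of the rules above: one blocked group, one catch-all group.
ROBOTS_TXT = """\
User-agent: SemrushBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent, url="https://zoia.org/", robots_txt=ROBOTS_TXT):
    """Report whether the standard-library parser lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


for agent in ("SemrushBot", "SemrushBot/7~bl", "AhrefsBot"):
    print(agent, is_allowed(agent))
```

In this parser the versioned string is reduced to the part before the slash, so the plain `SemrushBot` group already covers it.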

Also, since bots don’t always respect the robots.txt file, I blocked Semrush in my web server configuration:

<IfModule mod_rewrite.c>
    RewriteEngine On
    
    # Block SemrushBot and all its variants.
    # The pattern is unanchored and case-insensitive ([NC]), so it also
    # matches versioned strings such as "SemrushBot/7~bl"; one rule is enough.
    # The F flag returns 403 Forbidden and implies L.
    RewriteCond %{HTTP_USER_AGENT} SemrushBot [NC]
    RewriteRule ^ - [F]
    
    # Optional: Block empty/null user agents (18.6% of your traffic)
    # Uncomment the following lines to also block empty user agents
    # RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
    # RewriteCond %{HTTP_USER_AGENT} ^-$
    # RewriteRule ^.*$ - [F,L]
</IfModule>
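To see which header values the `SemrushBot [NC]` condition would catch, the same unanchored, case-insensitive match can be mimicked in Python. This is only an approximation of what mod_rewrite does, and the sample user-agent strings are illustrative rather than taken from my logs.

```python
import re

# Same pattern as the RewriteCond: unanchored, case-insensitive.
BLOCK_PATTERN = re.compile(r"SemrushBot", re.IGNORECASE)


def would_be_blocked(user_agent):
    """Mimic `RewriteCond %{HTTP_USER_AGENT} SemrushBot [NC]` for one header value."""
    return BLOCK_PATTERN.search(user_agent) is not None


# Illustrative header values (assumptions, not real log entries).
samples = [
    "SemrushBot",
    "Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html)",
    "mozilla/5.0 (compatible; semrushbot/8)",
    "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)",
]

for ua in samples:
    print(would_be_blocked(ua), ua)
```

Because the match is a substring search, a single condition covers every versioned variant, which is why extra rules for specific versions add nothing.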

Testing the new configuration:

$ curl -I -A "SemrushBot" https://zoia.org

HTTP/1.1 403 Forbidden
Date: Wed, 11 Jun 2025 18:10:38 GMT
Server: Apache/2.4.58 (Ubuntu)
Content-Type: text/html; charset=iso-8859-1

I’ll wait and check the new traffic stats in a couple of months.

