Guide · #333

How to Run an Hourly Crawl Test on Your New Site

Run hourly crawl tests on your new site to catch indexability issues in minutes, not weeks. Step-by-step guide with tools and pro tips for founders.

Filed
March 9, 2026
Read
19 min
Author
The Seoable Team

Why Hourly Crawl Tests Matter for New Sites

You shipped. Your site is live. Now what?

Most founders assume Google will find their site and index it automatically. Wrong. You need to verify that crawlers can actually reach your pages, follow your links, and understand your site structure—before you waste weeks wondering why you're not getting organic traffic.

An hourly crawl test surfaces indexability issues within the first 60 minutes of launch. Broken redirects. Missing sitemaps. Robots.txt blocking your entire site. Duplicate content. 404s on critical pages. These problems kill organic visibility, and they're invisible until you test.

The brutal truth: most technical founders skip crawl testing because they assume their site "just works." It doesn't. A one-hour crawl test catches 80% of SEO disasters before they cost you months of lost traffic.

This guide walks you through running hourly crawl tests on your new site, what tools to use, and what to do when you find problems.

Prerequisites: What You Need Before Testing

Before you run your first crawl test, make sure you have these in place:

A live, publicly accessible site. Your domain must be pointing to a live server. If your site is behind a firewall, password-protected, or still in staging, crawlers can't reach it. Test that you can access your homepage from an incognito browser first.
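
If you'd rather check this from a script than an incognito window, here's a minimal Python sketch, assuming you have the requests library installed and substituting your own domain for the placeholder URL:

  import requests

  # Placeholder URL: swap in your own domain.
  response = requests.get("https://yoursite.com", timeout=10)
  print(response.status_code)  # You want 200 here
  print(response.url)          # Watch for surprise redirects to a login page

If this prints anything other than 200, crawlers are hitting the same wall you are.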

A sitemap.xml file. Crawlers use sitemaps to understand your site structure and find all your pages. If you don't have one yet, generate it now. For Next.js, Webflow, Shopify, Lovable, WordPress, and Framer, we have step-by-step walkthroughs to generate sitemap.xml for your specific stack.
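
If you're not sure what a valid sitemap even looks like, here's a minimal hand-written example. The URLs and date are placeholders; your generator will produce something equivalent:

  <?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
      <loc>https://yoursite.com/</loc>
      <lastmod>2026-03-09</lastmod>
    </url>
    <url>
      <loc>https://yoursite.com/pricing</loc>
    </url>
  </urlset>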

A robots.txt file. This file tells crawlers which pages to crawl and which to skip. If you don't have one, crawlers will try to crawl everything—including admin pages, login forms, and private content. Learn how to write your first robots.txt file in 10 minutes with a founder-friendly template.
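
As a reference point, a minimal robots.txt looks like this. The Disallow paths are examples; adjust them to whatever your stack actually needs hidden:

  User-agent: *
  Disallow: /admin/
  Disallow: /login

  Sitemap: https://yoursite.com/sitemap.xml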

Google Search Console access. You'll need to verify your domain and submit your sitemap. If you haven't set up GSC yet, follow this 10-minute setup guide first.

A crawl testing tool. You'll use this to simulate how Google crawls your site. We'll cover specific tools in the next section.

If you're missing any of these, stop and fix them now. The rest of this guide assumes you have all five in place.

Step 1: Choose Your Crawl Testing Tool

You have three options: free online crawlers, premium desktop tools, or a combination of both.

Free Crawlers (Best for Quick Tests)

Most of these are web-based: you paste in your URL, they crawl your site in seconds, and you get a report. No credit card. Perfect for founders on a budget.

Screaming Frog is the industry standard, and the one exception here: it's a free desktop app rather than a web tool. The free version crawls up to 500 URLs. For a new site, that's usually enough. You download the app, point it at your domain, and it crawls your entire site structure, reporting broken links, missing meta tags, redirect chains, and canonicals.

Sitechecker.pro's website crawler is a free online alternative that doesn't require installation. You enter your URL, it crawls your site, and you get a detailed report on indexability, broken links, and SEO issues—all in your browser.

Seomator's crawl test tool specifically checks whether Google and Bing can crawl and index your pages. It tests response codes, robots.txt compliance, and sitemap validity in one pass.

Premium Tools (Best for Continuous Monitoring)

If you're running crawl tests hourly, manual testing gets tedious. Automated tools run tests on a schedule.

PageCrawl.io monitors your site for changes and crawl errors automatically. You set it to crawl every hour, and it alerts you when it finds indexability issues. For new sites in their first week, hourly tests catch problems in real time.

Screaming Frog's paid version (£199/year) removes the 500-URL limit and adds scheduled crawls. If you're running multiple sites or need daily crawls, the paid version pays for itself.

Google's Native Tools (Free and Built-In)

Don't overlook Google Search Console itself. The URL Inspection tool tests whether Google can crawl and index a specific page. It's not a full-site crawl, but it's instant and it shows you exactly what Google sees.

For a comprehensive overview of all available crawlers, this guide to 60+ website crawlers covers everything from open-source options to enterprise tools.

Our Recommendation for New Sites

Start with a free online crawler (Sitechecker or Seomator) for your first test. Takes 5 minutes, no setup. Then set up PageCrawl.io or Screaming Frog's scheduled crawls to test hourly for the first week. After launch week, you can dial it back to daily or weekly tests.

Step 2: Run Your First Crawl Test

Let's walk through a real crawl test using Screaming Frog, the most popular tool.

Download and Install Screaming Frog

Go to Screaming Frog's website, download the free version for your OS (Mac, Windows, or Linux), and install it. Takes 2 minutes.

Point It at Your Domain

Open Screaming Frog. In the top-left, you'll see a URL field. Type your domain (e.g., https://yoursite.com) and hit Enter. The crawler starts immediately.

You'll see a progress bar and a growing list of URLs. Screaming Frog crawls your site by following links, starting from your homepage and working through your entire site structure.

Let It Run

For a new site, this usually takes 30 seconds to 2 minutes. The crawler will find every page it can reach, check for broken links, test response codes, and validate HTML.

Don't interrupt it. Let it finish the full crawl.

Read the Results

When it's done, you'll see tabs at the bottom: Overview, Response Codes, Directives, Hreflang, Pagination, and more.

Start with Response Codes. This tab shows HTTP status codes for every page:

  • 200 (OK): Pages that crawled successfully. This is what you want.
  • 301/302 (Redirects): Pages that redirect to another URL. Check for redirect chains (A→B→C), which slow down crawling.
  • 404 (Not Found): Broken links. These hurt SEO and user experience. Fix them.
  • 403 (Forbidden): The server refused access, often because of a firewall, bot-protection rule, or permissions problem. Make sure nothing is walling off important pages. (Pages blocked by robots.txt aren't reported as 403s; Screaming Frog flags those separately as "Blocked by Robots.txt.")
  • 500 (Server Error): Your server is returning errors. This is a critical issue. Fix it before crawlers retry.

If you see mostly 200s, you're in good shape. If you see 404s or 403s, you have work to do.
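
If you want a scriptable version of this check for a handful of known URLs, here's a rough Python sketch using the requests library. Unlike Screaming Frog, it only tests the URLs you give it; it doesn't discover pages by following links:

  import requests

  # Placeholder URLs: swap in your own critical pages.
  urls = [
      "https://yoursite.com/",
      "https://yoursite.com/pricing",
      "https://yoursite.com/blog",
  ]

  for url in urls:
      # allow_redirects=False so 301s and 302s report as themselves
      r = requests.get(url, allow_redirects=False, timeout=10)
      print(r.status_code, url)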

Next, check the Directives tab. This shows page-level directives like meta robots tags. You should see:

  • Index/Follow: Pages you want crawled and indexed.
  • Noindex/Nofollow: Pages you've deliberately told crawlers to skip (admin, login, private content).

Robots.txt blocks show up separately, under Response Codes as "Blocked by Robots.txt." If your entire site appears there, your robots.txt is blocking Google. This is a critical error. Review our robots.txt template and fix it immediately.

Finally, check Hreflang and Canonicals. These tabs show whether your site is using these tags correctly. For a new site, you probably don't need hreflang (that's for multi-language sites), but canonicals matter if you have duplicate content.

Step 3: Test Your Sitemap Submission

Your sitemap tells Google which pages to crawl. If Google can't find or read your sitemap, it won't crawl all your pages.

Verify Your Sitemap Is Accessible

Open your browser and go to https://yoursite.com/sitemap.xml. You should see an XML file listing all your URLs. If you get a 404, your sitemap doesn't exist or is in the wrong location. Generate one now using our stack-specific guide.

Check Sitemap Validity

Use Seomator's crawl test tool to validate your sitemap. It checks:

  • Whether the sitemap is accessible (no 404s).
  • Whether the XML is valid (no syntax errors).
  • Whether all URLs in the sitemap are crawlable (no 403s or 500s).
  • Whether the sitemap is too large (max 50,000 URLs per file).

If you find errors, fix them before submitting to Google.
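
If you'd rather script these checks yourself, here's a rough Python sketch covering the same four points, assuming the requests library and substituting your own sitemap URL:

  import requests
  import xml.etree.ElementTree as ET

  SITEMAP = "https://yoursite.com/sitemap.xml"  # placeholder
  NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

  resp = requests.get(SITEMAP, timeout=10)
  assert resp.status_code == 200, f"Sitemap returned {resp.status_code}"

  root = ET.fromstring(resp.content)  # raises ParseError if the XML is invalid
  locs = [el.text for el in root.iter(NS + "loc")]
  assert len(locs) <= 50000, "Sitemap exceeds the 50,000-URL limit"

  # Spot-check the first 10 URLs for crawlability.
  for loc in locs[:10]:
      code = requests.head(loc, timeout=10, allow_redirects=True).status_code
      print(code, loc)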

Submit Your Sitemap to Google Search Console

Once your sitemap is valid, submit it to Google Search Console. Follow this step-by-step guide to submit your sitemap in under 5 minutes.

After submission, Google will crawl your sitemap and start indexing your pages. You'll see progress in the "Coverage" report (labeled "Pages" in the current GSC interface) within 24-48 hours.

Step 4: Use Google Search Console's URL Inspection Tool

Google Search Console's URL Inspection tool is your direct line to Google's crawl data. It shows you exactly what Google sees when it crawls your pages.

Test Individual Pages

In Google Search Console, go to the URL Inspection tool (top search bar). Type in a critical page URL (e.g., your homepage or main landing page).

Google will crawl that page and show you:

  • Crawlability: Can Google crawl this page? If not, why not?
  • Indexability: Can Google index this page? Are there any blocks (robots.txt, noindex tags, redirects)?
  • Rendering: How does Google see this page? Is JavaScript rendering correctly?
  • Mobile Usability: Is this page mobile-friendly?

If you see "URL is not on Google" or "Crawled but not indexed," you have a problem. Common causes:

  • Robots.txt is blocking it: Check your robots.txt file. Remove the block if it's unintentional.
  • Noindex tag: You may have accidentally added a noindex meta tag. Remove it.
  • Redirect loop: Your page redirects to itself. Fix the redirect.
  • Server error: Your page returns a 500 error. Fix your server.
  • Duplicate content: This page is a duplicate of another. Add a canonical tag pointing to the original.

Test at least 5-10 critical pages (homepage, main landing pages, key product pages) in your first crawl test.

Step 5: Set Up Hourly Crawl Tests

Manual testing is fine for a one-time check, but if you want to catch issues as they happen, you need automated hourly tests.

Option A: PageCrawl.io (Easiest)

PageCrawl.io is designed for exactly this use case. You sign up, add your domain, and set it to crawl every hour.

  1. Go to PageCrawl.io and create an account.
  2. Add your domain.
  3. Set crawl frequency to "Hourly."
  4. Enable alerts for 404s, 500s, and other critical errors.
  5. Sit back and get alerts when problems arise.

PageCrawl.io runs in the background and alerts you only when something breaks. For new sites in launch week, this catches issues in real time.

Option B: Screaming Frog Scheduled Crawls (More Control)

If you prefer Screaming Frog, the paid version (£199/year) includes scheduled crawls.

  1. Upgrade to Screaming Frog's paid plan.
  2. Go to File > Scheduling.
  3. Set up a crawl to run every hour.
  4. Configure alerts for broken links, errors, and crawl changes.
  5. Screaming Frog runs crawls in the background and emails you reports.

Option C: Manual URL Inspection Checks (Free)

Google Search Console doesn't have native hourly crawls, but you can use the URL Inspection tool repeatedly on critical pages. Set a calendar reminder to test 5-10 key URLs every morning for the first week.

It's manual, but it's free and it gives you Google's perspective directly.
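
If you're comfortable with a little code, Google also exposes the URL Inspection tool through the Search Console API, which makes these spot-checks scriptable. A rough sketch using the google-api-python-client library, assuming you've already created a service account with access to your property (credentials.json and the URLs below are placeholders):

  from google.oauth2 import service_account
  from googleapiclient.discovery import build

  creds = service_account.Credentials.from_service_account_file(
      "credentials.json",  # placeholder path to your service-account key
      scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
  )
  service = build("searchconsole", "v1", credentials=creds)

  result = service.urlInspection().index().inspect(body={
      "inspectionUrl": "https://yoursite.com/",  # page to check
      "siteUrl": "https://yoursite.com/",        # your GSC property
  }).execute()

  print(result["inspectionResult"]["indexStatusResult"]["coverageState"])

Run it from cron every morning against your 5-10 key URLs and you've automated Option C.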

Step 6: Monitor Crawl Health with Google Search Console

After you've run your crawl tests, monitor Google's crawl activity in Google Search Console.

Check the Coverage Report

Go to Google Search Console > Coverage. This report shows:

  • Indexed: Pages Google has indexed. This is good.
  • Excluded: Pages Google found but didn't index. Usually intentional (noindex tags, robots.txt blocks).
  • Errors: Pages Google tried to crawl but couldn't. These are problems.

For a new site, you should see your indexed count grow over 48-72 hours. If your indexed count is flat or declining, you have a crawl issue.

Check the Crawl Stats Report

Go to Google Search Console > Settings > Crawl Stats. This shows:

  • Requests per day: How often Google crawls your site. For a new site, this should increase as Google discovers more pages.
  • Bandwidth used: How much of your server's bandwidth Google uses. Usually negligible, but if it's high, you may have crawl issues (infinite loops, duplicate content).
  • Response time: How fast your server responds to Google's crawls. If it's slow (>1 second), fix your server performance.

If crawl requests are flat or declining, Google isn't finding new pages. Check your sitemap and robots.txt.

Set Up Crawl Alerts

Google Search Console doesn't have a dedicated "Crawl Alerts" settings page, but it emails verified owners automatically when it detects critical crawl problems. Make sure email notifications are enabled in your GSC preferences, and watch for:

  • Robots.txt fetch errors: Your robots.txt is returning errors.
  • Sitemap fetch errors: Your sitemap is returning errors.
  • Site move errors: If you're migrating domains, Google will alert you to problems.

These alerts are critical. If your robots.txt or sitemap breaks, Google can't crawl your site. Fix these immediately.

For a deeper dive into which GSC alerts actually matter, read our guide to Google Search Console alerts.

Step 7: Fix Issues Found in Your Crawl Test

You've run your crawl test. You've found issues. Now what?

Priority 1: Fix 500 Errors and Server Issues

If your crawl test found 500 errors, your server is broken. This is the most critical issue. Users and crawlers can't access your site.

Check your server logs. Look for:

  • Out of memory errors.
  • Database connection failures.
  • Missing dependencies or misconfigured environment variables.
  • Unhandled exceptions in your code.

Fix the root cause and redeploy. Then run a crawl test again to confirm the 500 errors are gone.

Priority 2: Fix Robots.txt Blocks

If your crawl test found that robots.txt is blocking important pages, fix it immediately.

Review our robots.txt template and update your file. Common mistakes:

  • Blocking / (your entire site).
  • Blocking /admin (fine) but also blocking paths like /api that your pages need in order to render.
  • Mixing User-agent: * rules with per-crawler groups. A crawler obeys only the most specific group that matches it, so rules don't combine. Be explicit in every group.

After updating, wait 24 hours for Google to re-fetch your robots.txt. Then run another crawl test to verify the blocks are removed.
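
While you wait, you can sanity-check the updated file locally with Python's built-in urllib.robotparser. Its matching is simpler than Google's, so treat it as a smoke test, not proof (the URLs below are placeholders):

  from urllib.robotparser import RobotFileParser

  rp = RobotFileParser("https://yoursite.com/robots.txt")
  rp.read()

  for url in ["https://yoursite.com/", "https://yoursite.com/pricing"]:
      ok = rp.can_fetch("Googlebot", url)
      print(url, "->", "crawlable" if ok else "BLOCKED")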

Priority 3: Fix 404 Errors

If your crawl test found 404s on important pages, you have broken links.

For each 404:

  1. Check if the page should exist. If yes, restore it or create a redirect.
  2. If the page is truly gone, find all links pointing to it (use Screaming Frog's "Inlinks" tab) and update them to point to a working page.
  3. If it's a link from external sites, set up a 301 redirect from the old URL to the new one.

Run a crawl test again to confirm the 404s are fixed.

Priority 4: Fix Redirect Chains

If your crawl test found redirect chains (A → B → C), simplify them.

Redirect chains slow down crawling and dilute link equity. Instead of:

A → B → C

Use:

A → C
B → C

Update your redirects and run a crawl test to verify the chains are gone.
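
A quick way to check a single URL's chain is the requests library, which records every intermediate hop in response.history. A minimal sketch, with a placeholder URL:

  import requests

  r = requests.get("https://yoursite.com/old-page", allow_redirects=True, timeout=10)
  hops = [h.url for h in r.history] + [r.url]

  if len(hops) > 2:
      print("Redirect chain:", " -> ".join(hops))
  else:
      print("OK:", " -> ".join(hops))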

Priority 5: Validate Canonicals and Hreflang

If your crawl test found missing or incorrect canonicals, add them.

For single-language sites, add a self-referential canonical to every page:

<link rel="canonical" href="https://yoursite.com/page-url" />

This tells Google which version of a page is the "official" one, preventing duplicate content issues.
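
To spot-check which canonical a page actually serves, here's a rough sketch using requests plus the standard library's HTML parser. The URL is a placeholder:

  import requests
  from html.parser import HTMLParser

  class CanonicalFinder(HTMLParser):
      canonical = None

      def handle_starttag(self, tag, attrs):
          a = dict(attrs)
          if tag == "link" and a.get("rel") == "canonical":
              self.canonical = a.get("href")

  url = "https://yoursite.com/page-url"
  finder = CanonicalFinder()
  finder.feed(requests.get(url, timeout=10).text)

  if finder.canonical == url:
      print("Self-referential canonical: OK")
  else:
      print("Canonical points to:", finder.canonical)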

For multi-language sites, use hreflang tags to tell Google which version is for which language. This is more complex—read our guide to robots, sitemaps, and canonicals for details.

Step 8: Request Indexing for Critical Pages

After you've fixed crawl issues, you can speed up indexing by requesting it manually in Google Search Console.

Use the URL Inspection Tool to Request Indexing

  1. Go to Google Search Console > URL Inspection.
  2. Type in a critical page URL (homepage, main landing page, key product page).
  3. Click "Request Indexing."
  4. Google will re-crawl and re-index that page within 24-48 hours.

You have a small daily quota per property (Google doesn't publish the exact number, but it's limited). Use it on your most important pages first.

For a comprehensive guide to indexing requests, read our guide to requesting indexing in Google Search Console.

Use IndexNow to Ping Bing and Yandex

Google isn't the only search engine. Bing and Yandex also index your site. You can ping them instantly using IndexNow.

IndexNow setup takes 10 minutes and lets you notify Bing and Yandex instantly when you publish new pages. They crawl within minutes instead of weeks.

After setup, ping your critical pages to IndexNow. Bing will crawl them immediately.
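
The ping itself is a single HTTP POST to the shared IndexNow endpoint. A minimal Python sketch, with your domain, key, and URLs as placeholders:

  import requests

  payload = {
      "host": "yoursite.com",
      "key": "your-indexnow-key",
      "keyLocation": "https://yoursite.com/your-indexnow-key.txt",
      "urlList": [
          "https://yoursite.com/",
          "https://yoursite.com/pricing",
      ],
  }

  r = requests.post("https://api.indexnow.org/indexnow", json=payload, timeout=10)
  print(r.status_code)  # 200 or 202 means the ping was accepted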

Step 9: Verify Indexing Status

After you've fixed issues and requested indexing, verify that Google has actually indexed your pages.

Use the Site: Operator

In Google Search, type site:yoursite.com. This shows all pages Google has indexed from your domain.

For a new site, you should see your indexed pages appear within 48 hours of requesting indexing. If you don't see them, you still have crawl issues.

Use Google Search Console's Coverage Report

Go to Google Search Console > Coverage. The "Indexed" count should match your sitemap size (or close to it). If it's much lower, you have excluded pages.

Click "Excluded" to see why pages aren't indexed. Common reasons:

  • Noindex tag: You added a noindex tag. Remove it.
  • Robots.txt block: Your robots.txt is blocking it. Fix it.
  • Redirect: The page redirects somewhere else. Check if the redirect is correct.
  • Duplicate content: It's a duplicate of another page. Add a canonical tag.

Use the URL Inspection Tool

For critical pages, use Google Search Console's URL Inspection tool to verify indexing status. It shows exactly what Google sees and whether the page is indexed.

For a detailed guide to checking indexing status, read our guide to checking if Google has indexed your page in 30 seconds.

Pro Tips: Crawl Testing Best Practices

Test on a Schedule, Not Just Once

One crawl test at launch is good. Hourly tests for the first week are better. Weekly tests for the first month are best.

As you ship new features, add new pages, or change your site structure, crawl issues can appear. Regular testing catches them early.

Crawl Before You Deploy

Don't wait until your site is live to test crawlability. Use Screaming Frog on your staging site before deployment. Catch issues in staging, not production.

Test Different User-Agents

Google uses different user-agents for desktop and mobile crawling. Screaming Frog lets you test both. Under Configuration > User-Agent, switch between Googlebot Desktop and Googlebot (Smartphone) and run crawls for both.

If your site renders differently for mobile, make sure both crawls return 200s and valid HTML.
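
You can approximate this check in Python by sending the same request with different User-Agent headers and comparing what comes back. It won't render JavaScript the way Google does, but it catches servers that block or reroute bot traffic. The URL is a placeholder, and Google rotates the Chrome version in its mobile UA string:

  import requests

  url = "https://yoursite.com/"
  agents = {
      "desktop": "Mozilla/5.0 (compatible; Googlebot/2.1; "
                 "+http://www.google.com/bot.html)",
      "mobile": "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
                "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/125.0.0.0 "
                "Mobile Safari/537.36 (compatible; Googlebot/2.1; "
                "+http://www.google.com/bot.html)",
  }

  for name, ua in agents.items():
      r = requests.get(url, headers={"User-Agent": ua}, timeout=10)
      print(name, r.status_code, len(r.content), "bytes")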

Monitor Crawl Bandwidth

If Google is crawling your site too aggressively and using too much bandwidth, be aware that Google retired Search Console's crawl-rate limiter in early 2024, so there's no longer a setting to dial it down. Google's guidance is to let Googlebot self-regulate; if it's genuinely overloading your server, temporarily returning 503 or 429 responses will make it back off. Be careful with this, since throttling crawls can slow down indexing.

For most new sites, Google's default crawl rate is fine.

Set Up Uptime Monitoring

Crawl tests are useless if your site goes down between tests. Set up uptime monitoring to ensure crawlers always find your site. Tools like UptimeRobot alert you instantly if your site goes down, so you can fix it before Google tries to crawl and gets a 500 error.

Document Your Crawl Test Results

Keep records of your crawl tests. Screenshot the reports, note the date, and track improvements over time. This helps you see progress and identify patterns.

If you're reading Google Search Console performance reports like a founder, you'll want to correlate crawl test results with organic traffic changes.

Common Crawl Test Issues and Fixes

Issue: "Crawled but not indexed"

Google crawled your page but didn't index it. Causes:

  • Noindex tag: Remove it.
  • Low quality content: Google may think it's thin or duplicate. Add more unique value.
  • Redirect: Your page redirects to another URL. Check if the redirect is intentional.
  • Canonicals: Your page has a canonical pointing elsewhere. Check if it's correct.

Issue: "Discovered but not crawled"

Google found a link to your page but hasn't crawled it yet. This is normal for new pages—Google crawls gradually. Wait 48 hours and check again. If it's still not crawled, use the URL Inspection tool to request indexing.

Issue: "Excluded by robots.txt"

Your robots.txt is blocking this page. Check your robots.txt file and remove the block if it's unintentional. Wait 24 hours for Google to re-fetch your robots.txt, then request indexing again.

Issue: "Crawl timeout"

Your server is too slow, and Google gave up crawling. Optimize your server performance. Check for:

  • Slow database queries.
  • Unoptimized images.
  • Render-blocking JavaScript.
  • Missing caching headers.

After optimizing, run a crawl test again.
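
To get a rough response-time number without a full crawl, requests exposes the elapsed time between sending a request and receiving the response headers. A minimal sketch with a placeholder URL:

  import requests

  url = "https://yoursite.com/"
  times = [requests.get(url, timeout=30).elapsed.total_seconds() for _ in range(5)]
  print(f"avg {sum(times) / len(times):.2f}s over {len(times)} requests")

Anything consistently over a second is worth investigating.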

The Seoable Approach: Crawl Testing as Part of Your Launch

At Seoable, we've automated crawl testing into our launch process. When you use Seoable, we run a comprehensive domain audit that includes:

  • Full-site crawlability testing.
  • Robots.txt and sitemap validation.
  • Indexability analysis for every page.
  • Technical SEO issue detection.
  • Competitor keyword analysis.
  • 100 AI-generated blog posts optimized for your target keywords.

All of this happens in under 60 seconds for a one-time $99 fee. No monthly subscriptions. No agency bloat. Just the SEO foundation you need to ship with confidence.

If you don't want to run crawl tests manually, Seoable does it for you and gives you a roadmap to fix everything we find.

Summary: Your Hourly Crawl Test Checklist

Here's what you need to do in your first hour after launch:

First 15 minutes:

  • Download and install Screaming Frog (or use a free online crawler).
  • Run a full-site crawl on your domain.
  • Check Response Codes for 404s, 500s, and 403s.
  • Check Directives to ensure robots.txt isn't blocking important pages.

Next 15 minutes:

  • Verify your sitemap.xml is accessible and valid.
  • Submit your sitemap to Google Search Console.
  • Test individual critical pages with Google Search Console's URL Inspection tool.

Next 15 minutes:

  • Fix any critical issues (500 errors, robots.txt blocks, broken redirects).
  • Request indexing for your homepage and main landing pages.
  • Set up PageCrawl.io or Screaming Frog scheduled crawls for hourly tests.

Next 15 minutes:

  • Set up uptime monitoring to ensure your site stays live.
  • Set up Google Search Console crawl alerts.
  • Document your crawl test results.

Total time: 1 hour. Cost: $0 (or $99 for Seoable if you want automated audits and content generation).

Result: You'll catch 80% of SEO disasters before they cost you months of lost traffic.

Don't skip this. Ship, test, fix, and then watch your organic traffic grow.
