PostHog Feature Flags for SEO Experiments
Use PostHog feature flags to safely A/B test on-page SEO changes. Step-by-step guide for founders running keyword experiments without breaking production.
The Problem With Shipping SEO Changes
You've identified a keyword opportunity. The search volume is there. Your competitor is ranking with mediocre content. You know you can win the ranking, but you also know that shipping an SEO change to production without testing it is how you tank your traffic.
Traditional SEO tools don't solve this. Ahrefs tells you what to rank for. Semrush shows you competitor gaps. But neither lets you test title tag variations, meta description rewrites, or heading restructures safely before committing them to Google's index.
This is where feature flags become your SEO weapon. PostHog feature flags let you roll out on-page changes to a percentage of your traffic, measure the impact on click-through rate (CTR), dwell time, and conversions, then decide whether to ship it to 100% of users or kill it.
No rollback delays. No index pollution. No guessing whether your SEO change actually moved the needle.
This guide walks you through using PostHog feature flags to run controlled SEO experiments on your live site. You'll learn how to structure tests, measure the right metrics, and ship winning changes with confidence.
Prerequisites: What You Need Before Starting
Before you implement feature flags for SEO testing, make sure you have these pieces in place:
PostHog account and SDK integration. You need PostHog installed on your site. If you haven't done this yet, set up PostHog on your domain first. The SDK takes minutes to install and works with most frontend frameworks.
Google Analytics 4 configured. You'll need GA4 running to track the metrics that matter for SEO experiments. If you're starting from scratch, follow the step-by-step GA4 setup for SEO tracking to configure events and dimensions correctly from day one.
Google Search Console connected. You need GSC to monitor how your changes affect impressions and CTR in Google's search results. Link GA4 with Google Search Console in 2 minutes so you can see search performance alongside user behavior.
Rank tracking in place. Before and after your experiment, you need to know where you rank. Use free and low-cost rank tracking to monitor keyword positions daily. This tells you if your test moved the needle in Google's index, not just on your site.
A clear hypothesis. Know what you're testing and why. "Does a longer title tag increase CTR for the keyword 'how to X'?" is a hypothesis. "Test something about the page" is not.
If you're missing any of these, pause here and set them up. The rest of this guide assumes you have working tracking infrastructure.
How Feature Flags Work for SEO Testing
Feature flags are conditional code toggles. They let you show different versions of a page to different users without deploying new code.
In PostHog, a feature flag works like this:
// 'my_experiment' is a placeholder flag key
if (posthog.getFeatureFlag('my_experiment') === 'test') {
  // show variant B (new title tag, new meta description, new heading structure)
} else {
  // show variant A (control, original version)
}
PostHog handles the logic. You define the flag, set the rollout percentage, and PostHog randomly assigns users to variants. You measure the difference in behavior between the groups.
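Under the hood, the "random" assignment is deterministic per user: the same visitor always hashes into the same bucket, so they see the same variant on every visit. Here's a conceptual sketch of the idea, not PostHog's actual implementation:

// Conceptual sketch of deterministic bucketing (not PostHog's real code):
// hash user ID + flag key into [0, 1) and compare against the rollout split.
function bucket(distinctId, flagKey) {
  let h = 0;
  for (const ch of distinctId + flagKey) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // simple 32-bit string hash
  }
  return (h % 1000) / 1000; // deterministic value in [0, 1)
}

const inTestGroup = bucket('user_123', 'title_tag_freshness_modifier') < 0.15; // 15% rollout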
For SEO specifically, feature flags let you:
- Test title tags and meta descriptions without a separate deploy for every variation, and measure on-site behavior immediately instead of waiting for Google to re-index.
- A/B test heading structures and content formatting to see if one layout drives more engagement than another.
- Safely roll out schema markup changes to a percentage of traffic before committing to 100% (see the sketch after this list).
- Test internal linking strategies (different anchor text, link placement, link density) without affecting your entire site's link topology at once.
- Experiment with content length on high-value pages to see if longer content with more keyword mentions improves CTR without cannibalizing other pages.
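Here's what the schema markup case above can look like in practice: a minimal sketch that injects FAQ JSON-LD only for users in the test group. The flag key and question content are hypothetical:

posthog.onFeatureFlags(() => {
  // 'faq_schema_test' is a hypothetical flag key; onFeatureFlags waits
  // until flags have loaded before running the check
  if (posthog.getFeatureFlag('faq_schema_test') === 'test') {
    const script = document.createElement('script');
    script.type = 'application/ld+json';
    script.textContent = JSON.stringify({
      '@context': 'https://schema.org',
      '@type': 'FAQPage',
      mainEntity: [{
        '@type': 'Question',
        name: 'How do I X?',
        acceptedAnswer: { '@type': 'Answer', text: 'Short answer to X.' }
      }]
    });
    document.head.appendChild(script);
  }
});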
The key advantage: you're measuring real user behavior on your live site. You're not guessing. You're not relying on SEO agency claims. You're running experiments.
Step 1: Define Your SEO Experiment and Hypothesis
Before you touch PostHog, write down what you're testing.
Your hypothesis should follow this format:
"If I [change X on the page], then [metric Y] will [increase/decrease] by [Z%], because [reason based on SEO principle]."
Examples:
- "If I change the title tag from 'How to X | Brand' to 'How to X: Complete Guide [2024]', then CTR will increase by 15%, because the new title is more specific and includes a year modifier that signals freshness."
- "If I restructure the H2s to match search intent more closely (moving 'Comparison' before 'How-to'), then dwell time will increase by 20%, because users find what they're looking for faster."
- "If I add FAQ schema markup to the page, then CTR will increase by 10%, because Google will show rich snippets in the SERP."
Write this down. Share it with your team. This hypothesis prevents you from running experiments that don't matter.
Choose your primary metric. For SEO experiments, your primary metric is usually one of these:
- Click-through rate (CTR) from Google Search Console. This is the gold-standard metric. If your change increases CTR in the SERP, Google sees more engagement and may rank you higher.
- Dwell time (time on page). If users spend more time on your page after the change, it signals relevance to Google.
- Scroll depth or engagement rate. If users scroll further or engage more with the content, the change likely improved content quality or readability.
- Conversion rate. If the page is a landing page or conversion driver, measure whether the change improves conversions.
Don't measure everything. Pick one metric that aligns with your hypothesis. This is your success criterion.
Step 2: Create a Feature Flag in PostHog
Log into PostHog and navigate to Feature Flags in the left sidebar.
Click New feature flag.
Fill in the flag details:
Flag key: Use a descriptive name. Example: title_tag_freshness_modifier or h2_restructure_intent_match. This is what you'll reference in your code.
Flag name: A human-readable label. Example: "Title Tag Freshness Modifier Test" or "H2 Restructure for Intent Matching."
Description: Write one sentence explaining what the flag does and why. Example: "Test whether adding a year modifier to the title tag increases CTR for the 'how to X' keyword."
Under Rollout, select Rollout to a percentage of users. Start conservative: 10-20% of traffic. This limits the risk if the change tanks your metrics.
Set the percentage. For a new experiment, 15% is a good starting point. This gives you enough data to measure impact without risking your whole site if something goes wrong.
Under Variants, you have two options:
Simple flag (on/off): The flag is either true or false. Use this if you're toggling a feature on or off.
Multivariate flag: The flag returns different values for different users. Use this for A/B testing multiple variations. For example:
- Variant A (control): {"title": "How to X | Brand", "description": "Original description"}
- Variant B (test): {"title": "How to X: Complete Guide [2024]", "description": "Longer, more specific description"}
For SEO experiments, multivariate flags are usually better because each variant can carry a payload with every page element you're changing (title, description, headings) in one place.
Click Save to create the flag.
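If you attach those JSON payloads to your variants, you can read them in one call with posthog-js's payload helper, so the variant content lives in PostHog rather than in hardcoded branches. A minimal sketch, assuming the flag and payloads defined above:

// Which variant is this user in? ('control', 'test', or undefined)
const variant = posthog.getFeatureFlag('title_tag_freshness_modifier');
// The JSON payload attached to that variant in the PostHog UI
const payload = posthog.getFeatureFlagPayload('title_tag_freshness_modifier');

if (payload && payload.title) {
  document.title = payload.title; // e.g. 'How to X: Complete Guide [2024]'
}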
Step 3: Implement the Feature Flag in Your Code
Now you need to add the flag logic to your site's code.
PostHog provides SDKs for JavaScript, React, Python, and other languages. The implementation depends on your tech stack, but the pattern is the same:
- Check if the user is in the test group using the feature flag.
- If yes, render variant B (the new version).
- If no, render variant A (the control, original version).
JavaScript example:
const flagValue = posthog.getFeatureFlag('title_tag_freshness_modifier');
// Guard against pages that are missing the meta description tag
const metaDescription = document.querySelector('meta[name="description"]');

if (flagValue === 'test') {
  // Render new title tag and meta description (variant B)
  document.title = 'How to X: Complete Guide [2024]';
  metaDescription?.setAttribute(
    'content',
    'Step-by-step guide to X. Learn the best practices, common mistakes, and tools. Updated 2024.'
  );
} else {
  // Keep original title tag and meta description (control)
  document.title = 'How to X | Brand';
  metaDescription?.setAttribute(
    'content',
    'Learn how to X with our guide.'
  );
}
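One caveat with the snippet above: posthog-js fetches flags asynchronously, so getFeatureFlag can return undefined if it runs before the flags arrive. Wrapping the check in the onFeatureFlags callback avoids branching too early:

posthog.onFeatureFlags(() => {
  // Flags are loaded by the time this callback runs
  const flagValue = posthog.getFeatureFlag('title_tag_freshness_modifier');
  if (flagValue === 'test') {
    document.title = 'How to X: Complete Guide [2024]';
  }
});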
React example:
import { useFeatureFlagVariantKey } from 'posthog-js/react';

// Note: rendering <title>/<meta> directly in a component requires React 19+,
// which hoists document metadata into <head>. On older versions, use a head
// manager such as react-helmet or your framework's Head component.
export function PageTitle() {
  const variant = useFeatureFlagVariantKey('title_tag_freshness_modifier');
  if (variant === 'test') {
    return (
      <>
        <title>How to X: Complete Guide [2024]</title>
        <meta name="description" content="Step-by-step guide to X..." />
      </>
    );
  }
  return (
    <>
      <title>How to X | Brand</title>
      <meta name="description" content="Learn how to X..." />
    </>
  );
}
The key is this: PostHog assigns the user to a variant on first visit and keeps them in that variant for the duration of the experiment. This prevents the same user from seeing both versions, which would skew your data.
Deploy your code to production. The feature flag is already live in PostHog, so once your code is deployed, the experiment is running.
Step 4: Monitor the Experiment in Real Time
Go back to PostHog and open your feature flag. You'll see a dashboard showing:
- Rollout percentage: How much traffic is in the test group.
- User counts: How many users have seen each variant.
- Variant assignment: The breakdown of users across control and test variants.
This tells you the flag is working. Users are being assigned to variants correctly.
But here's the catch: PostHog doesn't automatically measure your SEO metrics. PostHog tracks user behavior (page views, events, clicks) and knows which variant each user saw, but it can't see how that variant affected impressions or CTR in Google Search Console.
You need to wire PostHog events to your GA4 account so you can measure the experiment's impact.
Step 5: Connect PostHog Events to GA4
To measure your SEO experiment, you need to track which variant each user sees and how they behave as a result.
The best way to do this is to send PostHog feature flag data to GA4 as a custom dimension.
Step 5A: Create a custom dimension in GA4.
In Google Analytics 4, go to Admin > Custom definitions > Custom dimensions.
Click Create custom dimension.
- Dimension name: feature_flag_variant
- Scope: Event (the code below sends the variant as an event parameter)
- Description: "The feature flag variant assigned to this user (control or test)."
- Event parameter name: feature_flag_variant (this must match the parameter name you send from your code)
Click Save.
Step 5B: Send feature flag data from PostHog to GA4.
In your code, whenever you check a feature flag, send that flag assignment to GA4:
const flagValue = posthog.getFeatureFlag('title_tag_freshness_modifier');

// Send the flag assignment to GA4. 'feature_flag_exposure' is an arbitrary
// custom event name (avoid reserved ecommerce events like 'view_item');
// the parameter name must match your custom dimension.
gtag('event', 'feature_flag_exposure', {
  'feature_flag_variant': flagValue || 'control'
});

if (flagValue === 'test') {
  document.title = 'How to X: Complete Guide [2024]';
} else {
  document.title = 'How to X | Brand';
}
Now every page load fires an event tagged with the feature flag variant the user saw, and you can segment GA4 reports by that dimension.
For more on setting up GA4 events properly, follow the guide to GA4 events for SEO to track custom events that reveal user intent and conversion paths.
Step 6: Set Up Your Experiment Dashboard
You need to see the experiment results in one place. Build a simple dashboard in Looker Studio that shows:
- Sessions by variant (control vs. test).
- Average engagement time by variant (dwell time proxy).
- Scroll depth by variant (if you're tracking scroll events).
- Conversion rate by variant (if applicable).
- Bounce rate by variant (lower is usually better).
You can build this in Looker Studio by connecting to GA4 and filtering by your custom feature_flag_variant dimension.
Alternatively, use PostHog's built-in analytics to track custom events. For example, you can create an event called page_scroll_50_percent and measure what percentage of control vs. test users trigger that event.
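If you go the PostHog route, that scroll event is a few lines of posthog-js. A minimal sketch; the event name and 50% threshold mirror the example above:

// Capture a one-time event when the user scrolls past 50% of the page
let scrollEventSent = false;
window.addEventListener('scroll', () => {
  const scrolled = window.scrollY + window.innerHeight;
  if (!scrollEventSent && scrolled >= document.documentElement.scrollHeight * 0.5) {
    scrollEventSent = true;
    posthog.capture('page_scroll_50_percent');
  }
});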
The goal: one dashboard showing whether your SEO change is winning or losing.
Step 7: Run the Experiment for Sufficient Duration
How long should you run the experiment?
Minimum: 2 weeks. This captures different traffic patterns (weekdays vs. weekends, different times of day).
Ideal: 4 weeks. This gives you enough data to detect meaningful differences and account for Google's indexing lag.
For high-traffic sites: 1 week may be enough if you're getting hundreds of sessions per day in each variant.
For low-traffic sites: Run for 6-8 weeks to build statistical confidence.
During the experiment:
- Check your dashboard daily but don't obsess over early results. Early data is noisy.
- Watch for unexpected events (traffic spike, outage, competitor ranking change) that could skew results.
- Monitor your rank tracking to see if Google has re-indexed the page and if rankings have moved.
Don't stop the experiment early just because you see a positive trend. You need statistical significance, not just a promising direction.
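What "statistical significance" means concretely: for rate metrics like CTR or conversion rate, the standard check is a two-proportion z-test. A minimal sketch with hypothetical numbers:

// Two-proportion z-test: is the gap between two rates likely real?
function zTest(convA, nA, convB, nB) {
  const pA = convA / nA;
  const pB = convB / nB;
  const pPooled = (convA + convB) / (nA + nB);
  const se = Math.sqrt(pPooled * (1 - pPooled) * (1 / nA + 1 / nB));
  return (pB - pA) / se;
}

// Hypothetical counts: 4,000 control sessions with 320 conversions (8.0%)
// vs. 1,000 test sessions with 95 conversions (9.5%)
const z = zTest(320, 4000, 95, 1000);
// |z| > 1.96 is significant at the 95% level; here z ≈ 1.54, so this
// promising-looking lift is still within the noise
console.log(Math.abs(z) > 1.96 ? 'significant' : 'keep collecting data');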
Step 8: Analyze Results and Make a Decision
After 2-4 weeks, analyze your dashboard. Compare control vs. test across your primary metric.
Winning result: Your test variant shows a statistically significant improvement in your primary metric, with enough data volume that the difference is unlikely to be random. Many teams also require a practical threshold, such as a 10%+ lift, before shipping.
Losing result: Your test variant shows a statistically significant decline in your primary metric.
No clear winner: The difference is small or the data is too noisy to draw conclusions.
What to do in each case:
If you're winning: Roll out the test variant to 100% of traffic. Change the PostHog rollout percentage to 100%. Update your site's code to make the test variant the default. Monitor your rank tracking over the next 2-4 weeks to see if Google re-ranks you higher.
If you're losing: Kill the experiment. Revert the code to the original version. Disable the feature flag in PostHog. This is a win—you learned something and didn't damage your site.
If there's no clear winner: Run the experiment longer, or adjust your hypothesis and try a different variation. Maybe a smaller change (different title tag format, different heading order) would move the needle.
Step 9: Document and Iterate
After each experiment, document the result:
- Flag name: title_tag_freshness_modifier
- Hypothesis: "Adding a year modifier to the title tag increases CTR."
- Variants tested: Original vs. new title format.
- Duration: 4 weeks.
- Primary metric: CTR in Google Search Console.
- Result: CTR increased 12% in test variant.
- Action: Rolled out to 100%.
- Follow-up: Monitor rank tracking for 4 weeks to confirm Google re-ranks the page.
Keep a running log of experiments. Over time, you'll build institutional knowledge about what works for your site.
Then run the next experiment. Test a different page element. Test a different hypothesis. Use the quarterly SEO review process to identify which pages have the most impact potential, then prioritize your experiments there.
Pro Tips for SEO Feature Flag Experiments
Test one variable at a time. If you change the title tag AND the H2 structure AND add schema markup in one experiment, you won't know which change drove the result. Isolate variables.
Use multivariate feature flags for complex tests. If you're testing multiple variations of the same element (three different title formats, for example), use PostHog's multivariate flag feature to test all three simultaneously. This is faster than running three separate experiments.
Track secondary metrics alongside your primary metric. If your primary metric is CTR, also track dwell time, bounce rate, and conversion rate. Sometimes a change improves CTR but tanks dwell time (users click but leave fast), which is a red flag.
Use feature toggles for emergency rollbacks. If a change causes a massive drop in traffic after rollout, you can disable the flag instantly without deploying code. This is your safety net.
Be cautious with how Google sees the test. When you're running experiments with different title tags and meta descriptions, Googlebot may see a different version on different crawls, so the indexed title can flip between variants mid-experiment. For low-traffic pages, this is usually fine. For your highest-value pages, validate the pattern on lower-priority pages first.
Don't test vanity metrics. Bounce rate, pages per session, and average session duration are often misleading for SEO. Focus on metrics that correlate with rankings: CTR, dwell time, and conversion rate.
Common Mistakes to Avoid
Mistake 1: Not waiting for Google to re-index. You roll out a title tag change, see no CTR improvement in GA4, and kill the experiment. But Google hasn't re-indexed the page yet. Wait 1-2 weeks after rollout before checking Google Search Console for CTR changes.
Mistake 2: Running too many experiments at once. You test a title tag change on 10 pages simultaneously, see mixed results, and can't figure out what worked. Run experiments on one page at a time until you have a repeatable playbook.
Mistake 3: Ignoring statistical significance. You see a 5% improvement in your primary metric after 1 week and declare victory. With low sample size, a 5% difference is noise. Use a statistical significance calculator to determine if your result is real.
Mistake 4: Testing changes that don't align with SEO principles. You test removing keywords from the title tag because it "looks better." But Google uses title tags to understand page relevance. Test changes that align with SEO best practices, not just design preferences.
Mistake 5: Not tracking rank position alongside on-page metrics. You see improved CTR in GA4 but your rank position hasn't moved. This could mean the change improved CTR without improving relevance (users click more but leave fast). Always cross-check rank tracking with on-page metrics.
Connecting Experiments to Your Broader SEO Strategy
Feature flags are a tool. They work best when integrated into a broader SEO process.
Start with a keyword roadmap. Identify the keywords you want to rank for. Then identify the pages that rank for those keywords (or should rank for those keywords).
For each page, use rank tracking on a bootstrapper's budget to measure current position. Then run feature flag experiments to improve CTR, dwell time, and engagement on those pages.
Measure the impact using the 5 SEO metrics that actually matter: organic traffic, rankings, CTR, conversion rate, and crawl health.
Every 90 days, run a quarterly SEO review to audit rankings, fix crawl issues, validate keywords, and identify the next batch of pages to experiment on.
This is how you turn feature flags from a cool tool into a repeatable, measurable SEO process.
Scaling Your Experiments
Once you've run a few experiments and found patterns that work, scale them.
Example: You discover that adding a year modifier to title tags increases CTR by 10% consistently. Now apply that pattern to your top 20 ranking pages. Use a feature flag to roll it out to 50% of traffic on all 20 pages simultaneously, then measure the aggregate impact.
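One lightweight way to do that is a single shared flag plus a small helper included on every target page. A hypothetical sketch (both the helper and the flag key are made up):

// Append the year modifier to the title on any page gated behind one flag
function applyYearModifier(flagKey) {
  posthog.onFeatureFlags(() => {
    if (posthog.getFeatureFlag(flagKey) === 'test') {
      document.title = `${document.title} [${new Date().getFullYear()}]`;
    }
  });
}

applyYearModifier('year_modifier_rollout'); // hypothetical flag key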
Or: You discover that restructuring H2s to match search intent improves dwell time. Create a content brief that includes this H2 structure, then use it to guide your AI-generated content process. Every new piece of content ships with the winning structure baked in.
This is how feature flags become infrastructure. You're not running one-off experiments. You're building repeatable, data-driven SEO processes.
When to Use Feature Flags vs. Traditional A/B Testing
Feature flags are powerful, but they're not always the right tool.
Use feature flags when:
- You're testing on-page changes (title tags, meta descriptions, heading structure, content formatting).
- You want to measure impact before full rollout.
- You need emergency rollback capability.
- You're testing changes that might affect rankings (you want to limit exposure while measuring impact).
- You have engineering resources to implement the flag in code.
Use traditional A/B testing tools (Optimizely, VWO, or similar) when:
- You're testing page layout, design, or user experience changes.
- You want a no-code solution.
- You're measuring conversion rate only (not SEO metrics).
- You have a high-traffic site and can afford to split traffic 50/50.
For SEO specifically, feature flags are better. They let you control rollout percentage, measure impact on search metrics, and roll back instantly if something goes wrong. Traditional A/B testing tools aren't designed for SEO experiments.
Conclusion: From Guessing to Measuring
Most founders ship SEO changes and hope they work. They change a title tag, wait for Google to re-index, check their rank tracking, and either celebrate or panic.
Feature flags let you measure first, then commit. You run a controlled experiment on a percentage of traffic, measure the impact on CTR and dwell time, and only roll out the change if it's winning.
This removes the guessing. You're not relying on SEO agency claims or gut instinct. You're running experiments on your live site and measuring real user behavior.
Here's what you need to do:
- Set up PostHog on your site if you haven't already.
- Define a clear hypothesis for your first SEO experiment.
- Create a feature flag in PostHog with a 10-15% rollout.
- Implement the flag in your code and deploy to production.
- Connect PostHog events to GA4 so you can measure the experiment's impact.
- Run the experiment for 2-4 weeks and monitor your dashboard.
- Analyze the results and make a decision: roll out, kill, or iterate.
- Document the result and run the next experiment.
Repeat this process every quarter. Over a year, you'll run 12+ experiments. Some will win. Some will lose. But every experiment teaches you something about what works on your site.
This is how you build SEO as infrastructure. Not through agency relationships or one-time content drops. Through repeatable, data-driven experiments.
Start with one experiment. Ship it. Measure it. Learn from it. Then ship the next one.
That's how you build organic visibility that compounds.