
App Store A/B Testing: How to Run Experiments That Actually Work

Most developers waste months on bad A/B tests. Learn the statistical framework and practical tips for tests that drive real growth.

Marcus Chen · February 12, 2025 · 13 min read
a/b testing · optimization · google play · conversion

Why Most A/B Tests Fail

I've reviewed over 100 A/B tests run by indie developers.

90% of them were useless.

Not because the developers were incompetent. But because they made basic statistical mistakes that invalidated their results.

Here's how to run A/B tests that actually tell you something.

The Statistical Foundation

What is Statistical Significance?

When you run an A/B test, you're comparing two versions:

  • Version A: 20% conversion
  • Version B: 22% conversion

    Is B actually better? Or is this just random noise?

    Statistical significance tells you how likely it is that the difference is real, not chance.

    The Magic Number: 95%

    Standard practice: only trust results with 95% confidence.

    This means: "If there were no real difference, there's only a 5% chance we'd see a gap this large by luck alone."

    Why Sample Size Matters

    Small sample = unreliable results.

    If you test for 3 days and get 200 impressions, your results mean nothing. You need thousands of impressions for reliable data.

    Rule of thumb: Aim for at least 1,000 impressions per variant before drawing conclusions.
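The math behind a significance check is a two-proportion z-test. Here's a minimal Python sketch (standard library only; the numbers are the illustrative 20% vs. 22% example from above) you can use to sanity-check your own results:

```python
import math

def significance(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test: is the gap between variant A and
    variant B bigger than random noise would explain?"""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis (no real difference)
    p = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p * (1 - p) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# 20% vs. 22% conversion at 1,000 impressions per variant
z, p = significance(200, 1000, 220, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")  # p well above 0.05: NOT significant yet
```

Note what this shows: even at the 1,000-impression floor, a 2-point lift is still inside the noise. The rule of thumb is a minimum, not a guarantee; small lifts need far more traffic to reach significance.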

    A/B Testing on Google Play

    Google Play has built-in A/B testing. It's free and powerful.

    How It Works

    1. Create a "Store Listing Experiment"
    2. Upload variant screenshots/descriptions
    3. Google splits traffic 50/50
    4. Wait for statistical significance
    5. Apply the winner

    What You Can Test

  • **App icon** — High impact, fast results
  • **Screenshots** — Most common test
  • **Short description** — Visible in search
  • **Feature graphic** — Top of listing page
  • **Full description** — Lower impact

    Google's Statistical Engine

    Google will tell you when a result is "statistically significant." Trust this.

    Don't end tests early. Don't declare winners based on gut feeling.

    A/B Testing on iOS

    Apple's App Store is harder. No built-in A/B testing.

    Your Options

    1. Product Page Optimization

    Apple's official solution. You can test:

  • Up to 3 treatments vs. your control
  • Screenshots
  • App previews
  • Promotional text

    Limitations:

  • Only available for certain account types
  • Slower to set up
  • Less granular than Google

    2. Time-Based Testing

    Change your screenshots, measure for 2-4 weeks, compare to previous period.

    Problems:

  • External factors (seasonality, marketing) affect results
  • Less reliable than true A/B split

    3. Third-Party Tools

    Services like SplitMetrics or StoreMaven run fake App Store pages and drive traffic to test.

    Pros: Real A/B testing

    Cons: Expensive, requires ad spend

    My iOS Recommendation

    For indie developers: Use Product Page Optimization if available. Otherwise, focus on Google Play testing and apply learnings to iOS.

    What to Test First

    Highest Impact (Test These First)

    1. First Screenshot

    This single asset drives 70%+ of conversions. Test:

  • Different headlines
  • Different value propositions
  • Dark vs. light backgrounds

    2. App Icon

    Fastest to show results. Test:

  • Different color schemes
  • With/without text
  • Different shapes/styles

    Medium Impact

    3. Second Screenshot

    If first screenshot is optimized, test your second.

    4. Feature Graphic (Android)

    Highly visible but often overlooked.

    Lower Impact (Test Later)

    5. Description

    Most users don't read it. Test last.

    6. Later Screenshots (3-10)

    Fewer users see these. Optimize first screenshots first.

    The Testing Framework

    Step 1: Form a Hypothesis

    Don't test randomly. Have a reason.

    Bad: "Let's try a blue background"

    Good: "Our analytics show users drop off at screenshot 2. Maybe the feature isn't clear. Let's test a simpler headline."

    Step 2: Create Variants

    Rule: Only test ONE variable at a time.

    If you change the headline AND the background AND the device frame, you won't know what caused the difference.

    Control: Original screenshot

    Variant: Same screenshot, different headline only

    Step 3: Run the Test

    Let it run until:

  • Statistical significance (Google will tell you)
  • At least 1,000 impressions per variant
  • At least 7 days (to account for weekly patterns)
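Those three stopping criteria are easy to encode as a checklist. A small Python sketch (the thresholds come from the rules of thumb above; the function name and sample values are just illustrative):

```python
from datetime import date

def can_end_test(impressions_a, impressions_b, p_value, start, today):
    """True only when all three stopping criteria are met."""
    enough_traffic = min(impressions_a, impressions_b) >= 1000
    significant = p_value < 0.05              # 95% confidence
    full_week = (today - start).days >= 7     # cover weekly patterns
    return enough_traffic and significant and full_week

# Significant and plenty of traffic, but only 5 days in: keep it running.
print(can_end_test(1400, 1350, 0.03, date(2025, 2, 1), date(2025, 2, 6)))
```

The point of writing it down: a test ends when *all* the criteria pass, not when the first one does.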

    Step 4: Analyze Results

    Don't just look at "which won."

    Look at:

  • By how much? (+2% vs +20% matters)
  • Is it statistically significant?
  • Are there any segment differences? (Country, device type)
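One way to keep yourself honest is to compute the size of the lift and the significance check together, so a "winner" with a tiny or noisy lift doesn't get overstated. A Python sketch (the p-value comes from whatever significance test you use):

```python
def analyze(conv_a, n_a, conv_b, n_b, p_value):
    """Summarize a finished test: size of the lift, and whether it
    clears the 95% confidence bar."""
    rate_a, rate_b = conv_a / n_a, conv_b / n_b
    return {
        "absolute_lift": rate_b - rate_a,             # e.g. +4 points
        "relative_lift": (rate_b - rate_a) / rate_a,  # e.g. +20%
        "significant": p_value < 0.05,
    }

result = analyze(200, 1000, 240, 1000, p_value=0.02)
print(f"{result['relative_lift']:.0%} lift, significant: {result['significant']}")
# → 20% lift, significant: True
```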

    Step 5: Implement Winner

    Apply the winning variant. Document what you learned.

    Step 6: Test Next Thing

    A/B testing is continuous. When one test ends, start the next.

    Common A/B Testing Mistakes

    Mistake 1: Ending Tests Too Early

    "Version B is up 15% after 2 days! Let's end the test!"

    No. Early results are unreliable. Let tests run to statistical significance.

    Mistake 2: Testing Too Many Variables

    "Let's test new icon, new screenshots, new description, and new price all at once!"

    You'll learn nothing. Test one variable at a time.

    Mistake 3: Ignoring Statistical Significance

    "Version B has 21% conversion vs Version A's 20%. B wins!"

    Maybe. Or maybe that's within the margin of error. Check the confidence level.

    Mistake 4: Not Having a Hypothesis

    Random testing wastes time. Form hypotheses based on data.

    Mistake 5: Testing Small Tweaks

    "Let's test making the headline 2px bigger."

    Small changes = small effects. Test bold differences that could move the needle.

    Advanced Testing Tips

    Segment Your Analysis

    If possible, analyze results by:

  • Country (US might prefer different designs than Japan)
  • Device type (phone vs tablet)
  • Traffic source (organic vs paid)

    Use Seasonal Baselines

    Downloads fluctuate by day of week and season. Compare like-to-like periods.

    Document Everything

    Keep a testing log:

  • What you tested
  • Hypothesis
  • Results (with confidence levels)
  • What you learned

    Future you will thank you.
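A log can be as simple as a CSV file you append to after every test. A minimal sketch (the file name, field names, and sample entry are all just illustrative):

```python
import csv

# Field names are a suggestion; adapt them to your workflow.
LOG_FIELDS = ["date", "asset", "hypothesis", "result", "confidence", "lesson"]

def log_test(path, **entry):
    """Append one finished test to a CSV log, writing the header
    first if the log doesn't exist yet."""
    try:
        is_new = open(path).read().strip() == ""
    except FileNotFoundError:
        is_new = True
    with open(path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=LOG_FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(entry)

log_test("ab_tests.csv",
         date="2025-02-12", asset="first screenshot",
         hypothesis="simpler headline will clarify the core feature",
         result="+12% conversion", confidence="97%",
         lesson="plain benefit copy beat feature jargon")
```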

    Testing Resources

    Free

  • **Google Play Experiments** — Built into Play Console
  • **Apple Product Page Optimization** — Built into App Store Connect

    Paid

  • **SplitMetrics** — Third-party iOS testing
  • **StoreMaven** — Third-party iOS testing
  • **Phiture** — ASO agency with testing tools

    For Screenshots

  • **Shotsy** — Generate variants quickly for testing

    Creating multiple screenshot variants for testing used to take hours. With Shotsy, you can generate variants in minutes, making it practical to run more tests.

    The Testing Mindset

    A/B testing isn't about "finding the perfect screenshot."

    It's about continuous improvement.

    Even if your test "fails" (no winner), you learned something. That's valuable.

    The best apps aren't built by genius designers. They're built by teams that test relentlessly and let data guide decisions.

    Start testing. Keep testing. Never stop testing.
