Introduction

  • The Big Question: Are brand search ads in Google App Campaigns truly incremental, or are we just paying for installs that would have happened organically?
  • The Client: A productivity-focused mobile app.
  • The Hypothesis: If brand ads were truly driving incremental installs, excluding them should negatively impact revenue and ROAS.
  • The Approach: A controlled A/B test using stratified sampling and difference-in-difference analysis to measure the real impact of brand exclusion.

Why We Ran This Experiment

  • Google Ads aggressively targets brand terms, potentially cannibalizing organic installs.
  • Many users searching for the brand might install with or without an ad.
  • The goal was to determine if brand campaigns actually increased total revenue or just shifted attribution.

How We Designed the Experiment

Step 1: Ensuring a Fair Comparison – Stratified Sampling

  • Simply dividing states randomly would not work since revenue varies greatly across different regions.
  • We used unsupervised learning to cluster states into five groups based on daily revenue.
  • From each cluster, states were randomly assigned to either the control or treatment group to ensure both groups were comparable.
  • This technique, known as stratified sampling, helped reduce variability between the groups.
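
For readers who want to reproduce this, below is a minimal sketch of how such a cluster-then-split assignment can be done, assuming a hypothetical `state_daily` DataFrame with `state` and `revenue` columns and using k-means as the clustering method; the names, the seed, and k-means itself are illustrative assumptions rather than our exact pipeline.

```python
# A minimal sketch of the stratified assignment, assuming a DataFrame
# `state_daily` with columns `state` and `revenue` (one row per state-day).
# The column names, k-means, and the seed are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

def assign_groups(state_daily: pd.DataFrame, n_clusters: int = 5, seed: int = 42) -> pd.DataFrame:
    # Summarise each state by its average daily revenue.
    features = state_daily.groupby("state")["revenue"].mean().to_frame("avg_revenue")

    # Cluster states into revenue strata (five clusters in our test).
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    features["cluster"] = km.fit_predict(features[["avg_revenue"]])

    # Within each stratum, randomly split states between control and treatment.
    rng = np.random.default_rng(seed)
    rows = []
    for _, stratum in features.groupby("cluster"):
        states = stratum.index.to_numpy()
        rng.shuffle(states)
        half = len(states) // 2
        for i, state in enumerate(states):
            rows.append({"state": state, "group": "treatment" if i < half else "control"})
    return pd.DataFrame(rows)
```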

Step 2: Checking the Parallel Trends Assumption

  • Before running the experiment, we checked whether both groups followed a parallel trend in revenue.
  • The parallel trends check passed, meaning any differences observed after brand exclusion could be attributed to the experiment rather than to pre-existing trends.
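
One common way to run this kind of check is to regress pre-period revenue on a time trend, a treatment dummy, and their interaction: a non-significant interaction is consistent with parallel trends. The sketch below assumes a hypothetical `pre_period` DataFrame with `date`, `group`, and `revenue` columns; it illustrates the idea rather than our exact test.

```python
# A rough sketch of the pre-period check, assuming a DataFrame `pre_period`
# with columns `date`, `group` ("treatment"/"control") and `revenue` covering
# only the days before the test launched; the names are illustrative.
import pandas as pd
import statsmodels.formula.api as smf

def parallel_trends_pvalue(pre_period: pd.DataFrame) -> float:
    df = pre_period.copy()
    df["date"] = pd.to_datetime(df["date"])
    df["t"] = (df["date"] - df["date"].min()).dt.days        # day index
    df["treated"] = (df["group"] == "treatment").astype(int)

    # The group x time interaction tests whether the two revenue trends diverge
    # before the experiment; a large p-value is consistent with parallel trends.
    model = smf.ols("revenue ~ t * treated", data=df).fit()
    return model.pvalues["t:treated"]
```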

Step 3: Structuring the Campaigns for Testing

  • Brand Terms Removed from the Main US Campaign
    • First, we excluded all brand-related traffic from the primary Google App Campaign in the US.
  • Created a Separate Brand Campaign
    • Only included text assets (ensuring it primarily captured brand searches).
    • Set a very high tROAS target to limit aggressive bidding.
    • Excluded brand ads in half of the US regions (treatment group).
  • Difference-in-Difference Analysis to Remove Bias
    • Since the two groups still had slight differences in pre-treatment revenue, we used difference-in-difference (DiD) analysis.
    • This statistical method adjusts for pre-existing differences, ensuring a more accurate causal inference.
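
As a rough illustration, a stripped-down DiD estimation can be written as a two-way interaction regression like the sketch below; the hypothetical `panel` DataFrame, its column names, and the state-clustered standard errors are assumptions for illustration, not necessarily our exact specification.

```python
# A stripped-down DiD sketch, assuming a state-day panel `panel` with columns
# `state`, `revenue`, `treated` (1 = brand ads excluded in that state) and
# `post` (1 = after the exclusion went live), covering both the pre- and
# post-periods; names and the exact specification are assumptions.
import pandas as pd
import statsmodels.formula.api as smf

def did_estimate(panel: pd.DataFrame):
    # The coefficient on treated:post is the DiD estimate of the effect of
    # excluding brand ads, net of pre-existing differences between the groups.
    model = smf.ols("revenue ~ treated * post", data=panel).fit(
        cov_type="cluster", cov_kwds={"groups": panel["state"]}
    )
    return model.params["treated:post"], model.pvalues["treated:post"]
```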

The Key Findings

1. Revenue Was Slightly Higher in the Brand-Excluded Group—But Not Statistically Significant

  • The treatment group (no brand ads) had 10.63% higher revenue with outliers included and 7.89% higher with outliers removed.
  • However, this difference was not statistically significant, meaning we cannot conclude that brand exclusion caused the revenue increase.
  • Another important observation: the brand-excluded group spent more overall on the main non-brand (NB) UAC campaign, which inflated its revenue (an interesting shift in dynamics). However, when we ran a regression analysis, we did not find that NB UAC spend increased because of the brand exclusion.

2. Blended ROAS Was Practically the Same

  • After the difference-in-difference analysis, the estimated effect of excluding brand on ROAS was 1.4%, which is negligible. Thus, we could not reject the null hypothesis: brand ads did not have an incremental impact on ROAS.

3. Total Revenue Was Not Affected by Brand Exclusion

  • Despite turning off brand ads in half of the US, overall revenue remained stable, as checked by a regression analysis (sketched after this list).
  • This suggests that users searching for app brand terms likely installed organically when no ad was shown.
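
As an illustration of the kind of regression check mentioned above, the sketch below fits a simple trend-plus-step model to a hypothetical daily series of total (paid + organic) revenue for the treatment states; the `treat_daily` name, its columns, and the specification are assumptions made for the example.

```python
# An illustrative pre/post stability check on total (paid + organic) revenue
# for the treatment states, assuming a daily series `treat_daily` with columns
# `date` and `total_revenue`; names and specification are assumptions.
import pandas as pd
import statsmodels.formula.api as smf

def revenue_stability(treat_daily: pd.DataFrame, launch_date: str):
    df = treat_daily.copy()
    df["date"] = pd.to_datetime(df["date"])
    df["t"] = (df["date"] - df["date"].min()).dt.days
    df["post"] = (df["date"] >= pd.Timestamp(launch_date)).astype(int)

    # A `post` coefficient close to zero (and not significant) suggests total
    # revenue did not shift when brand ads were switched off.
    model = smf.ols("total_revenue ~ t + post", data=df).fit()
    return model.params["post"], model.pvalues["post"]
```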

What This Means for Advertisers Using UAC

  • Brand search ads in Google App Campaigns may not be necessary, especially when organic demand and the brand are strong.
  • Treat blended ROAS as the real KPI, not just what Google Ads reports. The ROAS reported in Google Ads dropped significantly for the campaign after excluding brand, but the same drop was not visible in blended ROAS (a small illustration follows this list).
  • Rigorous testing methods (stratified sampling, parallel trends checks, difference-in-difference analysis) are essential to avoid misleading conclusions.
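
To make the blended-ROAS point concrete, here is a tiny illustration with made-up inputs: platform ROAS divides only platform-attributed revenue by campaign spend, while blended ROAS divides all revenue, paid plus organic, by total spend, so installs that merely shift from organic to paid attribution cannot move it.

```python
# Hypothetical helper functions, for illustration only.
def platform_roas(attributed_revenue: float, campaign_spend: float) -> float:
    # Only revenue the ad platform attributes to the campaign.
    return attributed_revenue / campaign_spend

def blended_roas(total_revenue: float, total_spend: float) -> float:
    # All revenue (paid + organic) over all spend: attribution shifts
    # between paid and organic leave this ratio unchanged.
    return total_revenue / total_spend
```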

Final Thoughts & Next Steps

  • Conclusion: Every app and situation is different, but in this case, excluding brand ads did not affect ROAS. 
  • For apps with a strong brand, it may be worth testing brand exclusions rather than assuming ads on brand terms are driving incremental installs.
  • App rating probably matters here. The lower your app rating and the more prominently your main competitors show up on your brand searches, the more likely you are to need to keep bidding on your brand term, since a very low rating may deter users. In those cases, having your ad and organic result take up more space and push competitors down the results may be helpful.
  • Why is it important to run A/B tests independently rather than relying on the platforms? A/B tests are essential for making decisions based on causal inference. Recent research by Braun and Schwartz shows that A/B tests conducted through ad platforms are biased due to divergent delivery: when a test is run through the platform, the algorithm decides which ad is shown to which type of user, so the control and treatment groups are neither random nor comparable. For more information, see: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3896024
  • Hence, it is essential to conduct these experiments independently, choosing an appropriate methodology based on the context.