Facebook ad testing: are more ads better?

Yellow ad, red ad… Does it matter in the end?

Introduction

I used to think differently about creating ad variations, but after testing both methods I’ve changed my mind. Here’s why.

There are two alternative approaches to ad testing:

  1. “Qwaya” method* — you create a set of base elements (headlines, copy texts, images), from which a tool generates up to hundreds of ad variations
  2. “Careful advertiser” method — you hand-craft a few creatives, perhaps three (versions A, B, and C), which you test against one another.

In both cases, you can measure performance differences between ad versions and pick the winning design. The rationale behind the first method is that it “covers more ground”, i.e., produces variations we wouldn’t have tried otherwise (for lack of time or other reasons).
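To see how quickly a combination tool “covers ground”, here is a minimal sketch of the first method. The element counts and texts are hypothetical, not Qwaya’s actual limits:

```python
from itertools import product

# Hypothetical base elements; a combination tool crosses every
# headline with every copy text and every image.
headlines = ["Save time", "Save money", "Work smarter"]
copy_texts = ["Try it free.", "Join 10,000 users.", "No card needed.", "Cancel anytime."]
images = ["yellow.png", "red.png", "blue.png"]

variations = list(product(headlines, copy_texts, images))
print(len(variations))  # 3 * 4 * 3 = 36 ad variations from just 10 base elements
```

Ten base elements already yield 36 ads; a slightly larger pool easily crosses into the hundreds.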

Why the large search space fails

I used to advocate the first method, but it has three major downsides:

  1. it requires a lot more data to reach statistical significance
  2. false positives are likely to emerge along the way, and
  3. internal coherence tends to suffer, because independently generated creative elements can clash (e.g., a mismatch between copy text and image that produces an awkward message).
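The false-positive risk compounds quickly with the number of variations tested. A back-of-the-envelope sketch, assuming each variation is evaluated independently at a significance level of 0.05:

```python
# Probability of at least one false positive ("spurious winner") when
# testing k independent ad variations, each at significance level alpha.
alpha = 0.05

for k in (3, 10, 36, 100):
    p_any_false_positive = 1 - (1 - alpha) ** k
    print(f"{k:3d} variations -> {p_any_false_positive:.1%} chance of a spurious winner")
```

With three hand-crafted ads the risk stays near 14%, but with a hundred machine-generated variations a spurious “winner” is almost guaranteed, unless you correct for multiple comparisons, which in turn demands far more data per variation.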

Clearly though, the advertiser must generate enough variation among the ad versions to find a globally optimal solution. This can be done by a) making drastically different ad versions (e.g., humor vs. informativeness) as opposed to incrementally different ones, and b) covering the extremes of each creative dimension (e.g., humor: subtle vs. radical; informativeness: all benefits vs. main benefit only).

Conclusion

Overall, this argument is an example of how marketing automation may not always be the best way to go. And as a corollary, creative work done by humans is hard for machines to replace when seeking optimal creative solutions.

*Named after the Swedish Facebook advertising tool Qwaya, which advertises this feature as one of its selling points.