Facebook uses an algorithm to estimate the proportion of text in an image embedded into a Facebook ad.
The algorithm works well and is well designed (this is not an issue of bad data science per se), but it results in 50% of my ads being rejected; see the sample ad and rejection notice below. This is an example where better communication between legal, management, clients, and finance people would improve revenue. Facebook should either relax the threshold my ad must meet to be approved (who at Facebook decides on that threshold?) or detect legitimate clients (like us) and turn them into trusted advertisers, meaning that in our case 95% of our ads would be approved (just like on Google), rather than 50%.
One of our ads rejected because of too much text in the image
You might argue that it is the advertiser's responsibility not to put images with too much text on landing pages. I disagree: these ads are created automatically by Facebook when I boost an article; Facebook fetches the image and integrates it into the ad. Hint for Facebook: when your algorithm detects multiple images on a page, pick one that meets your threshold rather than the picture at the top. Of course, this requires more bandwidth, as Facebook would have to crawl entire landing pages rather than just the first few paragraphs to find the best image (though in our case, and I guess for most publishers, most articles have 0 or 1 image anyway).
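The hint above can be sketched in a few lines of Python. This is only an illustration, not Facebook's actual pipeline: the function `pick_ad_image` and the `text_fraction` callable (a stand-in for whatever algorithm estimates the share of text in an image) are hypothetical names. The only figure taken from the source is the 20% limit quoted in the rejection notice.

```python
from typing import Callable, Optional, Sequence

# Limit quoted in Facebook's rejection notice ("not more than 20% text").
MAX_TEXT_FRACTION = 0.20


def pick_ad_image(
    image_urls: Sequence[str],
    text_fraction: Callable[[str], float],
    max_text_fraction: float = MAX_TEXT_FRACTION,
) -> Optional[str]:
    """Return the first image, in page order, whose estimated text share
    is within the limit, or None if no image on the page qualifies.

    `text_fraction` is a hypothetical stand-in for the text-estimation
    algorithm; it maps an image URL to a value between 0 and 1.
    """
    for url in image_urls:
        if text_fraction(url) <= max_text_fraction:
            return url
    return None


if __name__ == "__main__":
    # Made-up estimates for two images on one landing page.
    estimates = {"top-chart.png": 0.35, "photo.jpg": 0.05}
    print(pick_ad_image(list(estimates), estimates.__getitem__))  # photo.jpg
```

Scanning in page order keeps the current behavior (the top picture wins) whenever that picture already passes, so only pages that would otherwise be rejected cost extra crawling.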
What is the motivation behind Facebook's business rule, regarding the accepted proportion of text in Facebook display ads? Would accepting images with more text cause a loss of revenue elsewhere, big enough to outweigh the benefits? Or is there not enough inventory, meaning that
Even if this is the explanation (I doubt it is), the algorithm should not be black and white. It's not like my ad has porn or anything like that in it (which would require another algorithm for detection, possibly analyzing the metadata found in the image header). Note that my Facebook post in question (see above picture) had received one genuine "like" before I decided to boost it (I tend to boost a few posts that are popular, rather than tons of stuff that few people are interested in). I believe an ad with genuine "likes" and too much text in the image is better than the other way around, for both Facebook and the advertiser. Or am I missing something?
Here's the automated message Facebook sends when your ad is rejected. It takes about 10 minutes for the algorithms to make a decision (probably because they must process millions of ads in real time).
Facebook automated rejection email
One or more of the posts you promoted don’t meet our guidelines and have been disapproved. Your post is still visible on your Page but will not be promoted. You will only pay for any actual impressions or clicks your ad receives.
Your ad wasn't approved because it uses too much text in its image, which violates Facebook's ad guidelines. Ads that show in the Feed are not allowed to include more than 20% text. You'll still be charged for any impressions or clicks your ad received before it was disapproved.
Data scientists working in a specific field (advertising in this case) should be knowledgeable about all the business aspects, not just the purely technical statistics or computer science. In this example, that means having a good understanding of all the issues addressed in this article. Data scientists should be able to identify whether the problem in question is big or small (by computing the loss or missed revenue per month, measured in dollars), who should be involved in fixing it, and how to convince the right people to get it fixed, and then to measure the lift once the problem is fixed. This is even more critical than knowing the most advanced or obscure data mining algorithms. Data scientists who do not have this kind of skill or experience are not real data scientists or, at best, they are too junior and need to gain more business experience.
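As a rough illustration of sizing the problem in dollars, here is a back-of-the-envelope sketch in Python. The function name and its simple linear model are assumptions made for illustration. The 50% and 95% approval rates come from this article, but the ad volume and revenue per approved ad in the example are placeholders, not real figures.

```python
def missed_monthly_revenue(
    ads_submitted_per_month: int,
    current_approval_rate: float,
    target_approval_rate: float,
    revenue_per_approved_ad: float,
) -> float:
    """Estimate the dollars missed per month because ads are rejected.

    Assumes (hypothetically) that every additional approved ad brings in
    the same revenue. Returns 0.0 if the target rate is not higher than
    the current rate.
    """
    extra_approved = ads_submitted_per_month * (
        target_approval_rate - current_approval_rate
    )
    return max(extra_approved, 0.0) * revenue_per_approved_ad


if __name__ == "__main__":
    # Approval rates from the article; volume and revenue are placeholders.
    loss = missed_monthly_revenue(
        ads_submitted_per_month=100,
        current_approval_rate=0.50,
        target_approval_rate=0.95,
        revenue_per_approved_ad=20.0,
    )
    print(f"${loss:.2f} missed per month")
```

Running the same calculation on the approval rate observed after a fix gives the lift, which is the number most likely to convince the people who own the threshold.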