This permanent experimental design setting allows you to learn, participate, or check out the results at any time, as data is gathered and reported in real time. This article illustrates a few concepts:
You can actively participate and earn a reward (up to $1,000), or passively watch how the test progresses day after day (or even second after second) and offer your explanation of the problem we are trying to solve, by analyzing data that is available to anyone.
1. The problem
When we send an email to more than 100,000 subscribers, we want to check the performance of the campaign. We measure performance using various metrics, such as clicks on links, open rate, and so on. Some email blasts are text only (for instance, LinkedIn), and the only way to track performance is through redirect links. We could create our own redirect system for better monitoring, but so far we have relied on the bit.ly and goo.gl URL shorteners, which also provide traffic monitoring in real time. These statistics are sometimes used for traffic attribution.
We suspect that the numbers reported by bit.ly are wrong, sometimes inflated by a factor of 3, sometimes by a factor of 10. We suspect (based on gut feeling) that a glitch in Outlook is responsible for the discrepancies, causing clicks to sometimes be double or triple counted, if not worse. The problem appears when users click from an email message, not when they click from a web page.
The question is: how do we assess the bias? How can we do a test to measure the discrepancy?
2. The solution - below are test links
We proceed here with classic experimental design. We observed the problem with bit.ly, and we assume that Google's shortener (goo.gl) is more accurate. Thus we want to compare bit.ly (test) numbers with goo.gl (control) numbers.
We have created two test messages, A and B (see section 5), each featuring 4 links:
The first message contains the following links:
The second message contains the following links:
All links point to the exact same page, but we used different tags to create 8 distinct URLs. This allows us to track the traffic for each link separately. In particular, it helps identify the effect of link position in the message on the number of clicks. Note that the bit.ly and goo.gl links are interlaced, with goo.gl in first position in the first message and bit.ly in first position in the second message. This is how experimental design works: it eliminates the effect of external variables such as link position. Ideally, you would run the test with more than two messages, especially if you want to assess the variance of your estimates.
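To see why interlacing cancels the position effect, consider the following sketch with made-up click counts (the real numbers come from the traffic reports in section 3). Because each shortener occupies every position exactly once across the two messages, summing over both messages balances out any position bias:

```python
# Hypothetical click counts for the 8 tracked links (illustrative, not real data).
# Each entry: (message, position, shortener, clicks)
clicks = [
    ("A", 1, "goo.gl", 120), ("A", 2, "bit.ly", 95),
    ("A", 3, "goo.gl", 60),  ("A", 4, "bit.ly", 50),
    ("B", 1, "bit.ly", 130), ("B", 2, "goo.gl", 90),
    ("B", 3, "bit.ly", 55),  ("B", 4, "goo.gl", 45),
]

# Interlacing means each shortener appears once in each of positions 1-4
# across the two messages, so per-shortener totals are position-balanced.
totals = {}
for _message, _position, shortener, n in clicks:
    totals[shortener] = totals.get(shortener, 0) + n

print(totals)
```

With these invented numbers, bit.ly totals 330 clicks and goo.gl totals 315, a modest difference; the real test asks whether the observed gap is far larger than that.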
3. The results - click on the links below to check out traffic statistics
The traffic statistics (page views), broken down per day, are available in real time at the following URLs (each new click appears instantly in the traffic report, so you can check it yourself):
First message:
Second message:
You can use these links to collect data. In addition, a total page view count, called the Ning count, is available (also updated in real time) on the target web page, although most of the traffic reaching this page does not come from our test.
4. Caveats and potential improvements
A potential issue is that goo.gl and bit.ly may not differ at all, but instead both fall victim to the same glitch, in which case both numbers would be inflated. The first thing to check, of course, is whether the two sets of numbers are similar.
A big improvement to this test would be to build two artificial (test) pages rather than using a single, popular page: one page to which bit.ly traffic is directed, and another to which goo.gl traffic is directed. These two pages would only be accessed through the test, so the Ning count should equal the total bit.ly click count for the first page, and the total goo.gl click count for the second page. But this is not a perfect fix: there is no way to guarantee that traffic would come only from our test, and the Ning numbers themselves might be wrong - too low or too high. We also use two additional mechanisms to monitor traffic: Google Analytics, and the statistics provided by our newsletter management vendors.
Other potential issues:
5. How to participate
There are different ways to participate.
You can share this experiment with your students, and they can actively participate.
You can gather the data, analyze it, assess whether there is a discrepancy between bit.ly and goo.gl (and even look at daily numbers from the Ning count if it helps), and post the results of your investigation in the comment section below. If you find an explanation for the discrepancy, that would be even better - it would prove (assuming your conclusion is different from mine) that data analysis can beat gut feeling.
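One simple way to assess whether a gap between the two totals is real or just noise: treat each shortener's daily total as a Poisson count and compare them with a normal approximation. The sketch below uses made-up totals (plug in your own from the traffic reports), and relies only on the standard library:

```python
import math

# Hypothetical daily totals aggregated from the traffic reports (not real data).
bitly_clicks = 330
gogl_clicks = 315

# Under the null hypothesis that both shorteners record the same true click
# rate, each total is Poisson with the same mean, and
# z = (n1 - n2) / sqrt(n1 + n2) is approximately standard normal.
z = (bitly_clicks - gogl_clicks) / math.sqrt(bitly_clicks + gogl_clicks)

# Two-sided p-value via the normal CDF, built from math.erf.
p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(f"z = {z:.2f}, p = {p:.3f}")
```

With these invented totals the p-value is large (around 0.55), so a gap of 15 clicks in ~645 would be unremarkable; a genuine factor-of-3 inflation would produce an extreme z-score instead.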
Another way to participate, and earn a reward, is to contact me at [email protected] at least 24 hours in advance, to let me know that you are going to email or post copies of message A and/or message B (see below) on a specific date, for your own testing purposes. You can modify the messages, as long as the 4 URLs are unchanged. If more than 100 clicks are generated across the 8 target URLs, you will be paid $1 per click (capped at $1,000) for the traffic generated that day - the whole traffic, not just yours - as reported by bit.ly and goo.gl. You must use a corporate email address to contact me, or else have a profile on Data Science Central with a link to your LinkedIn profile. Fake clicks do not count, and we have sophisticated mechanisms to detect them.
Message A
Subject: Invitation to participate in a statistical experiment
Data Science Central is currently running a statistical test using a methodology known as experimental design, to measure discrepancy between two metrics and find the root cause.
This is an opportunity to learn about experimental design, be able to follow results in real-time, criticize Granville's test, access real life data, and help develop better analytics.
Click on one of the following links to help us answer our statistical question. These 4 links redirect to the exact same page on DataScienceCentral.com, so there is no need to click on more than one.
- http://bit.ly/1dm1N6T
- http://goo.gl/nkFdCQ
- http://bit.ly/1iftd2J
- http://goo.gl/OZJsrI
You can also actively participate (and possibly earn a little money). Go to http://bit.ly/1oqj1XB for details about the test, or just to learn more about experimental design - a fundamental topic in statistics and data science, listed as a requirement in many job ads. Or feel free to forward to a colleague who might be interested.
Message B
Subject: Invitation to participate in a statistical experiment
Data Science Central is currently running a statistical test using a methodology known as experimental design, to measure discrepancy between two metrics and find the root cause.
This is an opportunity to learn about experimental design, be able to follow results in real-time, criticize Granville's test, access real life data, and help develop better analytics.
Click on one of the following links to help us answer our statistical question. These 4 links redirect to the exact same page on DataScienceCentral.com, so there is no need to click on more than one.
- http://goo.gl/kOYLgB
- http://bit.ly/1g7rfD4
- http://goo.gl/jTR0SN
- http://bit.ly/1mfNusQ
You can also actively participate (and possibly earn a little money). Go to http://bit.ly/1oqj1XB for details about the test, or just to learn more about experimental design - a fundamental topic in statistics and data science, listed as a requirement in many job ads. Or feel free to forward to a colleague who might be interested.