Test your data science skills!
Our new challenge is to identify periodicity (simple or multiple), and especially the periodic peaks occurring in each cycle, in the attached spreadsheet, after accounting for seasonality, outliers (e.g. Christmas Day), noise, and messy data.
Cyclic peaks, download spreadsheet for detailed data
We know what these cyclic peaks are, because they are caused by our actions, whose purpose is precisely - assuming it works as expected - to create these peaks. Here we ask you to
To learn the cause of this phenomenon and what the data is about, and to download a much bigger, multi-dimensional data set related to the same time series, go to our members-only page, where the solution is provided. Previous challenges of the week can be found here.
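One possible starting point for the periodicity question, sketched here under stated assumptions: difference the series to suppress the trend, then look for the lag with the highest autocorrelation. The data below is made up for illustration (it is not the spreadsheet), the function names are hypothetical, and this is not the solution from the members-only page.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive integer lag."""
    n = len(series)
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance / variance


def dominant_period(series, max_lag):
    """Return the lag (1..max_lag) with the highest autocorrelation.

    The series is differenced first so that a slow upward trend does not
    dominate the autocorrelation.
    """
    deltas = [b - a for a, b in zip(series, series[1:])]
    scores = {lag: autocorrelation(deltas, lag) for lag in range(1, max_lag + 1)}
    return max(scores, key=scores.get)


# Made-up data (assumption): 20 weeks of daily counts with an upward trend
# and two peaks per week, on days 0 and 3 of each cycle.
weekly_shape = [300, 50, 0, 250, 40, -100, -50]
page_views = [1000 + 2 * t + weekly_shape[t % 7] for t in range(140)]

print(dominant_period(page_views, max_lag=14))
```

On this noise-free toy series the strongest lag is 7, i.e. a weekly cycle. Real data with outliers and multiple periods would need more care, for example removing holiday outliers before computing the autocorrelation.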
I see "go to our members-only page", but I am unable to find the solution there. Where is it?
Dr. Vincent, can you please let me know where the solution for this one is? I see this as a typical problem in web data and am eager to understand the cyclical patterns.
Answer is in item #4 in the members-only page.
Best,
Vincent
I am sorry, I couldn't locate that page. Could you send the link, please?
Howdy All,
I created a couple of plots in R: geom_line with a geom_boxplot on top. You can see a peak on Monday and Thursday corresponding to DSC emails, and a dip on Saturday, when people mow their lawns and do laundry. You can also see from the lines an upward shift in page counts as the year progresses. There are only two outlier points.
Also, looking at or plotting the page view deltas (today minus yesterday) will help to discover the patterns.
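The delta idea can be sketched in a few lines of Python. This is only an illustration: the start date, the counts, and the function names are assumptions, not the challenge data or the R code used for the plots above.

```python
from collections import defaultdict
from datetime import date, timedelta

WEEKDAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def daily_deltas(views):
    """Day-over-day change: today minus yesterday."""
    return [today - yesterday for yesterday, today in zip(views, views[1:])]


def mean_delta_by_weekday(start, views):
    """Average delta per weekday; delta i belongs to the date start + (i + 1) days."""
    buckets = defaultdict(list)
    for i, delta in enumerate(daily_deltas(views)):
        day = start + timedelta(days=i + 1)
        buckets[WEEKDAY_NAMES[day.weekday()]].append(delta)
    return {name: sum(vals) / len(vals) for name, vals in buckets.items()}


# Made-up data (assumption): four weeks starting on a Monday, with a mild
# upward trend and bumps on Monday and Thursday.
start = date(2014, 1, 6)
weekly_shape = [300, 50, 0, 250, 40, -100, -50]
page_views = [1000 + 2 * t + weekly_shape[t % 7] for t in range(28)]

means = mean_delta_by_weekday(start, page_views)
for name in WEEKDAY_NAMES:
    print(f"{name:<9} {means[name]:8.1f}")
```

Days with a large positive average delta are the ones where traffic jumps relative to the previous day, which in this toy series are Monday and Thursday.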
Finally, we also do a Friday blast (usually called Good Friday Reading), but it has a much smaller reach and is not sent every week. With Friday being a relatively low day, you don't see it in the data.
Thanks for this interesting challenge!
http://www.datasciencecentral.com/profiles/blogs/how-we-combined-di.... I implemented a regression model following the trend + seasonal component strategy.
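For readers who want to try something along these lines, here is a simplified two-step sketch of a trend + seasonal decomposition: fit a straight-line trend by least squares, then average the residuals by position in the cycle. It is not the model from the comment above (whose details are not given), and the data and function names are assumptions for illustration.

```python
def fit_linear_trend(y):
    """Ordinary least squares fit of y[t] = intercept + slope * t."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y[t] - y_mean) for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return intercept, slope


def seasonal_effects(y, period):
    """Average detrended residual for each position in the cycle."""
    intercept, slope = fit_linear_trend(y)
    residuals = [y[t] - (intercept + slope * t) for t in range(len(y))]
    effects = []
    for k in range(period):
        values = residuals[k::period]  # every residual at cycle position k
        effects.append(sum(values) / len(values))
    return intercept, slope, effects


# Made-up data (assumption): 20 weeks, true slope of 2 per day, and a weekly
# shape with peaks at positions 0 and 3.
weekly_shape = [300, 50, 0, 250, 40, -100, -50]
page_views = [1000 + 2 * t + weekly_shape[t % 7] for t in range(140)]

intercept, slope, effects = seasonal_effects(page_views, period=7)
print(f"slope per day: {slope:.2f}")
print("seasonal effects:", [round(e, 1) for e in effects])
```

Fitting the trend and the seasonal terms jointly (for example with weekday dummy variables in one regression) would avoid the small bias that the seasonal shape introduces into the slope estimate here; the two-step version is shown only because it fits in a few lines.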
Interesting how the differenced variables removed any sort of overall trend and made seasonal patterns easier to identify.
© 2021 TechTarget, Inc.