
I stumbled upon this book by chance, when searching for material about time series (probably the most interesting chapter in this collection). The various chapters are accessible from the top tabs on this web page. It is mostly about R, but it has a few interesting chapters on statistical science too. Below is a summary.

Time series decomposition (in chapter 23)

This website was created with 6 major sections: Programming, Plotting, Regression, ANOVA, Advanced topics, and R-Apps. The tutorials build on each other, but can also be used independently of one another, and they refer back to other chapters that cover related topics in greater depth.

R-programming: includes 9 chapters which cover the basics of how to install R, a review of the important basic functions, and some advanced concepts such as data manipulation and transformations to prepare your data for analysis.

Plotting: includes 2 chapters on how to make pretty plots for the most common uses in psychology.

Regression: includes 8 chapters spanning how to conduct different types of regressions (linear, multiple, moderation/mediation, moderated mediation, logistic, Poisson, and multilevel and mixed). Chapters focus on how to run models and check assumptions. Some have short theoretical reviews.

ANOVA: includes 2 chapters on how to run between-, within-, and mixed-subjects ANOVAs with a simple set of follow-up tests.

Advanced topics: includes 4 chapters on selecting correlation types, ARIMA, decision trees and signal detection.

R Apps: includes a chapter which shows how to make a Shiny application, a living online document which is reactive to user input, and a chapter which shows how an ANOVA parses variance.

Some of the chapters simulate datasets and others have links for you to download csv files. Each chapter might use different packages (i.e., libraries of functions); please run install.packages("name of package") for the packages indicated at the start of each chapter before doing the tutorial. For more information on installing packages see https://www.r-bloggers.com/installing-r-packages/.

List of chapters

The Basics
Indexing
Logicals and Loops
apply Functions
plyr
Sampling and Replication
Melting & Casting
Reshaping Data Using Tidyr
GGplot for Scatterplots & Density Plots
Boxplots and Bar Graphs
Regression: Basics, Assumptions, & Diagnostics
Plotting Regression Interactions
Mediation and Moderation
Moderated Mediation
Multilevel Modeling
Mixed Effects Modeling
Testing the Assumptions of Multilevel Models
Logistic and Poisson Regression
Between-Subjects ANOVA in R
ANOVA (afex): Within Subjects and Mixed Designs
Correlation Types and When to Use Them
Using ARIMA for Time Series Analysis
Decision Trees
Shiny Apps in RStudio

The authors of the tutorials were all graduate students in the department of psychology at the University of Illinois at Chicago.


I found an interesting website featuring hundreds of charts derived from US census data. It shows contrasts between states and cities regarding education, jobs, languages spoken, and salaries, and even discrepancies between men and women or between Asians and Caucasians, for various metrics broken down by location, education, or other criteria. I selected four of these charts. You can access all the charts here. The data was last updated in September 2018, according to the website.


I want to test the optimum price for some items sold online. One way to do it is to set two different prices and do some A/B testing to see which price generates the most revenue, or to compare user-customized versus flat prices, using Thompson sampling, the Taguchi method or multi-armed A/B testing.

How should you proceed if you want to test a continuous set of prices, not just two or three prices A / B / C? Is testing (say) 10,000 different prices any better than standard A/B testing, or does it lead to over-fitting and thus a non-robust solution? Likewise, if you want to test which background color works best for a website, is testing one million different colors more efficient than standard testing, and how do you do it?

Also, let's say you want to modify 20 features on your website, each one having 4 potential values (color, font size, font face and so on). In short, instead of A/B testing with 2 potential outcomes (A or B), you perform a multivariate test with 4^20 (4 to the power 20, about 1.1 trillion) outcomes. Of course you will be able to test only a tiny fraction of all the possibilities, but is it more efficient than sequentially doing an A/B test for one feature, then another A/B test for another feature, and so on? The latter approach would take a lot of time and would result in a very local optimum. For instance, for the first feature, maybe A works best; for the second one (after choosing A for the first one), C works best; but for both features combined, maybe (D, B) works best. How do you run such a test when the number of potential combinations is 4^20?

Finally, how do you determine the sample size for these types of experiments? Or in other words, what is the stopping criterion? Are p-values still being used in this context?
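
To make the multi-armed option mentioned above more concrete, here is a minimal Python sketch of Thompson sampling over a small grid of candidate prices. It is an illustration under stated assumptions, not an answer to the questions raised: the function name `thompson_price_test`, the price grid and the simulated conversion rates are all hypothetical, each visitor is assumed to either buy at the price shown or not buy at all, and each price starts with a uniform Beta(1, 1) prior on its conversion rate.

```python
import random


def thompson_price_test(prices, true_conv_rates, n_visitors, seed=0):
    """Simulate Thompson sampling over a discrete grid of prices.

    prices          -- candidate prices (the "arms")
    true_conv_rates -- hypothetical purchase probability at each price,
                       used only to simulate visitor behavior
    Returns (times each price was shown, total simulated revenue).
    """
    rng = random.Random(seed)
    n_arms = len(prices)
    wins = [1] * n_arms    # Beta alpha parameter: 1 + purchases
    losses = [1] * n_arms  # Beta beta parameter: 1 + non-purchases
    shown = [0] * n_arms
    revenue = 0.0

    for _ in range(n_visitors):
        # Draw one plausible conversion rate per price from its posterior,
        # then score each price by its sampled expected revenue.
        scores = [
            price * rng.betavariate(a, b)
            for price, a, b in zip(prices, wins, losses)
        ]
        arm = max(range(n_arms), key=scores.__getitem__)
        shown[arm] += 1

        # Simulated visitor: buys with the (unknown to the algorithm) true rate.
        if rng.random() < true_conv_rates[arm]:
            wins[arm] += 1
            revenue += prices[arm]
        else:
            losses[arm] += 1

    return shown, revenue


if __name__ == "__main__":
    # Hypothetical numbers chosen for illustration only.
    prices = [5.0, 10.0, 15.0, 20.0]
    rates = [0.30, 0.20, 0.10, 0.04]
    shown, revenue = thompson_price_test(prices, rates, n_visitors=20000, seed=1)
    for price, count in zip(prices, shown):
        print(f"price {price:5.2f} shown {count} times")
    print(f"total simulated revenue: {revenue:.2f}")
```

A continuous price range would have to be discretized into such a grid (or replaced by a model of the demand curve), and the finer the grid, the fewer visitors each price receives, which is where the over-fitting concern comes in. The sketch also has no stopping rule: it simply runs for a fixed number of visitors.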
