In this 5 Minute Analysis we'll preprocess, join, and explore disjoint sales data for retail locations. Then we’ll dive into the relation between store type, fuel prices, and weekly sales as well as prepare the data for easy loading into Business Analysis tools such as Tableau and PowerBI. Finally, we’ll explore and analyze our preprocessed and joined data in Tableau and use its powerful visualizations to draw additional insights.
Dataset: Retail Data Analytics
This blog post explores and analyzes the data using PivotBillions, available freely on docker.
You can now see the columns and types of each dataset and modify them as you see fit. You can also view or change which column or columns are set as primary keys and determine how the datasets should be joined. When you are done viewing or modifying the data structure, you can import and load the data.
We now have access to all 421,569 rows of the joined retail sales data. This contains the data and features from each of the three separate data files from our original dataset.
By hovering over each column name you can sort the data by that column, view that column’s distribution over all of the data, filter by the data in that column, or rename that column. For example, we can view the distribution of the Temperature during the sales period in our data by clicking on the second-from-the-left icon (distribution) in for the Temperature column.
As expected it has a relatively normal distribution.
Now, for our analysis we’ll filter all of the data to get rid of Holiday sales since they are less indicative of general sales during the year.
All of our data is immediately filtered to data occurring on non-Holiday days.
We’ll now do some basic analysis to see what relationship the fuel price has on each store type’s weekly sales.
Pivot Billions now quickly reorganizes your data by store type and fuel price. It also provides counts, sums, and statistics on the corresponding weekly sales. You can also sort by a column or filter the data. The data is sorted by Count by default.
You can see that Pivot Billions quickly organizes the data by the 2,086 unique combinations of the Type and Fuel Price Columns. We’ll now interactively view the data.
We can immediately see that Store Type A has by far the most sales of each of the store types.
Now let’s view the relationship between weekly sales by store type and fuel price.
You can see an increase in sales during some periods of low fuel prices for Store Type B. However, it is unclear whether this increase is solely due to low fuel prices since the increase in sales is not present for all periods of low fuel prices.
This warrants further exploration either in Pivot Billions or another tool. In Pivot Billions we could simply add another dimension or filter to explore their effect on our results. If we want to use another tool such as Tableau or PowerBI we simply click the download button on the top right of the pivot widget, enter our filename, and select Download.
We can also load all or some of the data directly into Tableau, without the need to rearrange it. For efficiency we’ll look at all of the non-holiday data for the first ten departments across all stores and types.
We can now easily import this data into Tableau and use its powerful visualizations to dive into the data.
Visualizing the data this way makes it clear that Department 6 has the least average weekly sales of the first 10 departments for each store type. Also, Type C Stores have heavily concentrated sales in Departments 1, 2, 4, and 8 with the other departments making up an insignificant percentage of sales. This could warrant dropping some of those departments for Type C Stores or Department 6 for all store types if those departments don’t have secondary benefits such as drawing customers to the stores.
To view and interact with this visualization or download the workbook to Tableau, see my Getting the Whole Picture of Retail Sales Workbook on Tableau Public.