I believe everyone is born with the innate, curiosity-driven, explore-test-learn Data Science capability. At Halloween, kids naturally embrace a rapid exploration, rapid testing, failure-empowering “Scientific Method” to optimize their candy yield and logistical “Trick or Treating” algorithms.
So, what can we – as parents and teachers – provide to help nurture these budding data scientists? How can we prepare them for a future using data and analysis (analytics) to make informed operational, policy and life decisions?
Don’t be a scaredy cat: let’s talk about how we can get our kids ready for the future by preparing them to embrace their inner data scientist.
The Hypothesis Development Canvas is a collaborative design tool that succinctly defines the problem that one is trying to solve. It captures the details of the hypothesis or problem, brainstorms the metrics and variables against which progress and success will be measured, identifies the stakeholders who either impact or are impacted by the targeted hypothesis, and identifies and prioritizes the decisions that those stakeholders need to make in support of it (see Figure 2).
Figure 2: Halloween “Trick or Treating” Candy Optimization Hypothesis Development Canvas
Having your students construct a Hypothesis Development Canvas for their Trick or Treating objectives is a great way to help our future data scientists understand the importance of preparation before actually putting science to the data. The Hypothesis Development Canvas in Figure 2 provides a “paint by the numbers” example that helps our future data scientists thoroughly understand what they are trying to achieve, how they will measure success and how they can leverage data and analysis to optimize the key decisions behind their Halloween “Trick or Treating” endeavor. This canvas helps clarify the following before actually diving into the analysis that drives the event optimization:
Note: the two most important outcomes from the Hypothesis Development Canvas exercise are 1) the identification of the variables and metrics against which hypothesis progress and success will be measured, and 2) the identification, validation, valuation and prioritization of the key decisions that the students need to make in support of the targeted hypothesis. Get these two items right, and your students are well down the path to becoming data scientists and serving up Frankenstein his favorite kind of potatoes: monster-mashed!
Children are naturally able to optimize across multiple, sometimes conflicting variables – volume of candy, quality of candy, distance to travel between sources of candy, time to wait at the door to get their candy – in order to optimize their candy gathering decisions. So, while we as parents see a traditional neighborhood map such as Figure 3…
Figure 3: Traditional Neighborhood Map
…our children are applying their innate data science (data and analysis) skills to map out the candy gathering targets and their logistical paths that they believe will yield the best results given the metrics against which they will measure progress and success (see Figure 4).
Figure 4: Optimized Candy Gathering Logistical Map
One last thing to help our future data scientists is a simple but effective homework assignment. In this exercise, we want to 1) help our students get comfortable optimizing across different metrics while 2) performing some rudimentary analytics to create a “score” that tells them the best neighborhoods to target for their candy optimization journey.
Figure 5 provides a simple spreadsheet designed to help students get comfortable playing with the data and the decision variable weights in order to make an informed decision about which neighborhoods they should target for their “Trick or Treating” venture.
Figure 5: Rudimentary Neighborhood Scoring Algorithm
To calculate the Neighborhood Candy Gathering Optimization Score in the last column of Figure 5, the students need to do the following (indicated in red in Figure 5):
Allow the students to play with the weights on the Variables and the Neighborhoods to see the impact that each has on the resulting Candy Optimization Score in the final column of the spreadsheet.
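For students who want to try the same idea outside the spreadsheet, here is a minimal Python sketch of a weighted scoring calculation. It is not a copy of the Figure 5 spreadsheet: the neighborhood names, the 1-to-10 ratings, the weights and the function names are all made up for illustration, and the score is assumed to be a weighted average of the ratings multiplied by a per-neighborhood weight. The four variables are the ones mentioned earlier (candy volume, candy quality, distance to travel and wait time at the door), each rated so that a higher number is better.

```python
# Hypothetical example data: every name and number here is invented for illustration.
# Ratings run from 1 to 10 and higher is always better, so a short walk or a
# short wait at the door earns a HIGH rating.
VARIABLE_WEIGHTS = {
    "candy_volume": 4,
    "candy_quality": 3,
    "walking_distance": 2,
    "wait_time": 1,
}

NEIGHBORHOOD_RATINGS = {
    "Maple Street": {"candy_volume": 8, "candy_quality": 6, "walking_distance": 9, "wait_time": 7},
    "Oak Hills": {"candy_volume": 9, "candy_quality": 9, "walking_distance": 3, "wait_time": 5},
    "Elm Court": {"candy_volume": 5, "candy_quality": 7, "walking_distance": 8, "wait_time": 9},
}

# Optional per-neighborhood weight (assumed to be a simple multiplier).
NEIGHBORHOOD_WEIGHTS = {"Maple Street": 1.0, "Oak Hills": 1.0, "Elm Court": 1.0}


def candy_score(ratings, variable_weights, neighborhood_weight=1.0):
    """Weighted average of the ratings, scaled by the neighborhood weight."""
    total_weight = sum(variable_weights.values())
    weighted_sum = sum(variable_weights[name] * ratings[name] for name in variable_weights)
    return neighborhood_weight * weighted_sum / total_weight


def rank_neighborhoods(all_ratings, variable_weights, neighborhood_weights):
    """Return (neighborhood, score) pairs sorted from best score to worst."""
    scores = {
        name: candy_score(ratings, variable_weights, neighborhood_weights.get(name, 1.0))
        for name, ratings in all_ratings.items()
    }
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    ranking = rank_neighborhoods(NEIGHBORHOOD_RATINGS, VARIABLE_WEIGHTS, NEIGHBORHOOD_WEIGHTS)
    for name, score in ranking:
        print(f"{name}: {score:.2f}")
```

Have students change the numbers in VARIABLE_WEIGHTS (for example, make candy quality matter far more than candy volume) and rerun the script to see whether the top-ranked neighborhood changes.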
Extra credit: ask them what data they might want to gather in order to help them make even more informed, accurate weighting decisions.
Finally, the spreadsheet from Figure 5 can be pulled off of Google Docs: https://docs.google.com/spreadsheets/d/13fwmBLm5DPsDRNGqHvzI-9u5wWw...
Extra, extra credit: What do you get when you divide the circumference of your Jack-o’Lantern by its diameter?
Did you answer, Pumpkin Pi? Hehehe
Kids are natural data scientists; they have the natural curiosity to leverage data and basic analysis to make more informed decisions. But what are we as parents and teachers doing to nurture that innate, curiosity-driven, explore-test-learn Data Science capability? Help them by introducing them to a structured way to perform basic analysis – using the Hypothesis Development Canvas – and watch their natural curiosity, creativity and innovation cycle kick in.
In closing, I ‘witch’ you a Happy Halloween and have fun “Trick or Treating”, you crazy data scientists you!