
Ghostly greetings!

I believe everyone is born with the innate, curiosity-driven, explore-test-learn Data Science capability. At Halloween, kids naturally embrace a rapid exploration, rapid testing, failure-empowering “Scientific Method” to optimize their candy yield and logistical “Trick or Treating” algorithms.

So, what can we – as parents and teachers – provide to help nurture these budding data scientists? How can we prepare them for a future using data and analysis (analytics) to make informed operational, policy and life decisions?

Don’t be a scaredy cat: let’s talk about how we can get our kids ready for the future by preparing them to embrace their inner data scientist.

The Hypothesis Development Canvas is a collaborative design tool that succinctly defines the problem one is trying to solve. It captures the details of the **hypothesis** or problem, brainstorms the **metrics and variables** against which progress and success will be measured, identifies the **stakeholders** who either impact or are impacted by the targeted hypothesis, and identifies and prioritizes the **decisions** that those stakeholders need to make in support of it (see Figure 2).

**Figure 2: Halloween “Treat or Treating” Candy Optimization Hypothesis Development Canvas**

Having your students construct a Hypothesis Development Canvas for their Trick or Treating objectives is a great way to help our future data scientists understand the importance of preparation before actually putting science to the data. The canvas in Figure 2 provides a “paint by the numbers” example that helps them thoroughly understand what they are trying to achieve, how they will measure success, and how they can leverage data and analysis to optimize the key decisions behind their Halloween “Treat or Treating” endeavor. Before diving into the analysis that drives the event optimization, the canvas helps clarify the following:

- What are your Halloween candy gathering objectives? For example: “To gather and retain as much high-quality candy as possible within the allotted time period.”
- What are the metrics against which you will measure candy gathering progress and success? For example: “Maximize candy quality, optimize candy volume, minimize effort exerted to gather candy, minimize distance covered to gather candy.”
- Who are your key stakeholders who can help you achieve your objectives? For example: “Friends, parents, neighbors, siblings.”
- What are the key decisions that you need to make? For example:
  - What outfit are you going to wear?
  - What neighborhoods and residences are you going to target?
  - When will you start out, and how long will you stay out?
  - With which friends are you going? (Be sure to leave your skeleton friend at home because he’s got no-body to go with.)
  - What candies do you keep for yourself?
  - What candies do you offer up for the “Dad Tax”?
  - What treats (raisins, apples) do you offload to your younger siblings?
- What data might one want to use to help optimize the above decisions? For example:
  - Last Year’s Yield by Residence or Store
  - New Neighbors
  - Neighborhood Construction
  - Weather
  - Day of the Week (school night versus non-school night)
  - Friends’ Neighborhood Recommendations
  - Traffic
  - Local Events

Note: the two most important outcomes from the Hypothesis Development Canvas exercise are 1) the identification of the **variables and metrics** against which hypothesis progress and success will be measured, and 2) the identification, validation, valuation and prioritization of the key **decisions** that the stakeholders need to make in support of the targeted hypothesis. Get these two items right, and your students are well down the path to becoming data scientists and serving up Frankenstein his favorite kind of potatoes: monster-mashed!

Children are naturally able to optimize across multiple, sometimes conflicting variables – volume of candy, quality of candy, distance to travel between sources of candy, time to wait at the door to get their candy – in order to optimize their candy gathering decisions. So, while we as parents see a traditional neighborhood map such as Figure 3…

**Figure 3: Traditional Neighborhood Map**

…our children are applying their innate data science (data and analysis) skills to map out the candy gathering targets and their logistical paths that they believe will yield the best results given the metrics against which they will measure progress and success (see Figure 4).

**Figure 4: Optimized Candy Gathering Logistical Map**
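As a rough illustration of the trade-off behind a map like Figure 4, the sketch below picks the next house by expected candy per block walked, stopping when a walking budget runs out. This is only a simple greedy heuristic, not the method used to draw the figure and not guaranteed to find the best route. The house names, coordinates, candy counts, the `greedy_route` function and the `max_distance` budget are all made up for the example.

```python
import math

# Hypothetical houses: name -> (x, y, expected pieces of candy).
# Positions are in blocks; none of these values come from the article.
houses = {
    "A": (0, 1, 3),
    "B": (2, 1, 5),
    "C": (5, 5, 4),
    "D": (1, 0, 1),
}


def greedy_route(houses, start=(0, 0), max_distance=10.0):
    """Repeatedly visit the reachable house with the most candy per block walked."""
    position = start
    remaining = dict(houses)
    route, travelled, candy = [], 0.0, 0

    while remaining:
        best = None  # (ratio, name, distance) of the best reachable house
        for name, (x, y, pieces) in remaining.items():
            dist = math.dist(position, (x, y))
            if travelled + dist > max_distance:
                continue  # too far for the remaining walking budget
            ratio = pieces / dist if dist > 0 else float("inf")
            if best is None or ratio > best[0]:
                best = (ratio, name, dist)

        if best is None:
            break  # nothing left within budget

        _, name, dist = best
        x, y, pieces = remaining.pop(name)
        position = (x, y)
        travelled += dist
        candy += pieces
        route.append(name)

    return route, travelled, candy


if __name__ == "__main__":
    print(greedy_route(houses))
```

Students can change the walking budget or the candy estimates and watch the route change, which mirrors the multi-variable balancing act described above.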

One last thing to help our future data scientists is a simple but effective homework assignment. In this exercise, we want to 1) help our students get comfortable optimizing across different metrics while 2) performing some rudimentary analytics to create a “score” that tells them the best neighborhoods to target for their candy optimization journey.

Figure 5 provides a simple spreadsheet that is designed to help students get comfortable playing with the data and the decision variable weights in order to make an informed decision about what neighborhoods they should target for their “Treat or Treating” venture.

**Figure 5: Rudimentary Neighborhood Scoring Algorithm**

To calculate the Neighborhood Candy Gathering Optimization Score in the last column of Figure 5, the students need to do the following (indicated in red in Figure 5):

- Enter the names of their potential target Neighborhoods.
- Next, enter a weight for the relative importance to them of each of the 3 variables (Variable 1: Amount of Candy, Variable 2: Quality of Candy, and Variable 3: Time to Gather Candy). We use a scale of 1 to 10, where 10 is your most important variable and 1 is your least important variable.

  **Note:** Not all variables are of equal weight. Part of the data science process is making trade-offs between the weights assigned to the different variables. Because the variables are probably not evenly spaced in importance, feel free to use the full range of 1 to 10 to make a relative determination of the value of each variable vis-à-vis the others.
- Finally, for each neighborhood, enter a weight (again out of 10) for how well you think that particular neighborhood does vis-à-vis each variable. For example: for Variable 1 (Amount of Candy), I felt that Mid Town and South Side would yield the highest volume of candy based upon previous experience and recommendations from friends (so both got 8’s out of 10), while I felt that Old Town would probably yield the lowest volume of candy on the same basis (so I gave Old Town a 2 out of 10).

Allow the students to play with the weights on the Variables and the Neighborhoods to see the impact that each has on the resulting Candy Optimization Score in the final column of the spreadsheet.
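For students who want to try the same exercise outside the spreadsheet, here is a minimal Python sketch. It assumes the score is a simple weighted sum (variable weight times neighborhood weight, added up across the three variables) and that a higher number is always better, including for Time to Gather Candy. The text above does not spell out the formula used in Figure 5, so treat this as one plausible version. Only the Amount of Candy values for Mid Town, South Side and Old Town come from the example above; every other number, and the function names, are made up.

```python
# Hypothetical variable weights on the 1-to-10 scale (10 = most important).
weights = {"amount": 8, "quality": 10, "time": 4}

# Neighborhood weights out of 10 for each variable (higher = better).
# The "amount" values are from the article's example; the rest are invented.
ratings = {
    "Mid Town": {"amount": 8, "quality": 5, "time": 6},
    "South Side": {"amount": 8, "quality": 3, "time": 7},
    "Old Town": {"amount": 2, "quality": 9, "time": 5},
}


def neighborhood_score(rating, weights):
    """Weighted sum: each variable's weight times the neighborhood's value for it."""
    return sum(weights[var] * rating[var] for var in weights)


def rank_neighborhoods(ratings, weights):
    """Return (neighborhood, score) pairs sorted from best to worst score."""
    scores = {name: neighborhood_score(r, weights) for name, r in ratings.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_neighborhoods(ratings, weights):
        print(f"{name}: {score}")
```

With these sample numbers, Old Town's low candy volume is partly offset by the heavy weight on quality. Changing a single weight and re-running the ranking is the code equivalent of playing with the red cells in the spreadsheet.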

Extra credit: ask them what data they might want to gather in order to help them make even more informed, accurate weighting decisions.

Finally, the spreadsheet from Figure 5 is available on Google Docs: https://docs.google.com/spreadsheets/d/13fwmBLm5DPsDRNGqHvzI-9u5wWw...

Extra, extra credit: What do you get when you divide the circumference of your Jack-o’Lantern by its diameter?

Did you answer, Pumpkin Pi? Hehehe

Kids are natural data scientists; they have the natural curiosity to leverage data and basic analysis to make more informed decisions. But what are we as parents and teachers doing to nurture that innate, curiosity-driven, explore-test-learn Data Science capability? Help them by introducing them to a structured way to perform basic analysis, using the Hypothesis Development Canvas, and watch their natural curiosity, creativity and innovation cycle kick in.

In closing, I ‘witch’ you a Happy Halloween and have fun “Trick or Treating”, you crazy data scientists you!

Posted 29 March 2021

© 2021 TechTarget, Inc. Powered by

Badges | Report an Issue | Privacy Policy | Terms of Service

**Most Popular Content on DSC**

To not miss this type of content in the future, subscribe to our newsletter.

- Book: Applied Stochastic Processes
- Long-range Correlations in Time Series: Modeling, Testing, Case Study
- How to Automatically Determine the Number of Clusters in your Data
- New Machine Learning Cheat Sheet | Old one
- Confidence Intervals Without Pain - With Resampling
- Advanced Machine Learning with Basic Excel
- New Perspectives on Statistical Distributions and Deep Learning
- Fascinating New Results in the Theory of Randomness
- Fast Combinatorial Feature Selection

**Other popular resources**

- Comprehensive Repository of Data Science and ML Resources
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- 100 Data Science Interview Questions and Answers
- Cheat Sheets | Curated Articles | Search | Jobs | Courses
- Post a Blog | Forum Questions | Books | Salaries | News

**Archives:** 2008-2014 |
2015-2016 |
2017-2019 |
Book 1 |
Book 2 |
More

**Most popular articles**

- Free Book and Resources for DSC Members
- New Perspectives on Statistical Distributions and Deep Learning
- Time series, Growth Modeling and Data Science Wizardy
- Statistical Concepts Explained in Simple English
- Machine Learning Concepts Explained in One Picture
- Comprehensive Repository of Data Science and ML Resources
- Advanced Machine Learning with Basic Excel
- Difference between ML, Data Science, AI, Deep Learning, and Statistics
- Selected Business Analytics, Data Science and ML articles
- How to Automatically Determine the Number of Clusters in your Data
- Fascinating New Results in the Theory of Randomness
- Hire a Data Scientist | Search DSC | Find a Job
- Post a Blog | Forum Questions

## You need to be a member of Data Science Central to add comments!

Join Data Science Central