Integrating GenAI into “Thinking Like a Data Scientist” Methodology

It’s incredible how many organizations utilize Generative AI (GenAI) and Large Language Models (LLMs) to enhance their information assembly, integration, and application abilities. These GenAI technologies have been applied in various areas, from drafting legal documents and resolving service issues to coding software applications and (er, um) writing blog posts. The potential uses of GenAI seem limited only by our creativity.

Inspired by this GenAI innovation wave, I decided to experiment with using GenAI as a “research assistant” to enhance the effectiveness of my “Thinking Like a Data Scientist” (TLADS) methodology and class. In essence: how could my students leverage GenAI to speed up the TLADS process, improve the TLADS outcomes that are critical to successfully executing a business initiative, and free up time to spend on the other aspects of successfully applying data and analytics?

So, I decided to run an experiment against my in-class Chipotle Case Study. And the results and learnings so far have been staggering, causing me to fine-tune the TLADS process and re-engineer some of the supporting design canvases to make them more GenAI-friendly.

I am going to write a series of blogs so that I can share what I am learning as part of this journey. And if you already understand the “Thinking Like a Data Scientist” methodology, I think you’ll find this series of blogs both interesting and illuminating in understanding how GenAI can improve the research necessary to ensure data and analytics success.

Note: I will be using Microsoft Bing AI for this exercise for the following reasons:

Uses GPT-4, which has access to more current data
Updated more frequently with new data from new sources (including my new book, “AI & Data Literacy: Empowering Citizens of Data Science,” which I just released a week ago!)
Free (which is essential when dealing with college students).

TLADS Step #1: Identify and Assess Targeted Business Initiative

Step 1 of the TLADS process is to identify the targeted business initiative and then assess it. The assessment covers the initiative’s desired outcomes, the KPIs and metrics against which the effectiveness of the initiative and its business outcomes will be measured, the likely benefits, the potential impediments, the costs and risks associated with initiative failure, and the potential unintended consequences (Figure 1).

Figure 1: Value Engineering Canvas

We must convert the information in the renamed Value Engineering Canvas into a narrative (prompt) that we can feed into the Bing thread.
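One way to picture that conversion is a small script that stitches the canvas sections into a single block of text. This is only a minimal sketch: the function name `canvas_to_prompt` is made up for illustration, the section names are my reading of the canvas fields listed in Step 1, and the values are placeholders, not the actual Chipotle canvas content.

```python
# Hypothetical sketch: section names mirror the Value Engineering Canvas
# fields described in Step 1; the values are placeholders, not real data.

def canvas_to_prompt(initiative: str, canvas: dict[str, list[str]]) -> str:
    """Turn the canvas sections into one narrative prompt string."""
    lines = [f'My targeted business initiative is "{initiative}".']
    for section, items in canvas.items():
        if not items:  # skip sections left blank on the canvas
            continue
        lines.append(f"{section}: " + "; ".join(items) + ".")
    return "\n".join(lines)


canvas = {
    "Desired outcomes": ["<outcome 1>", "<outcome 2>"],
    "KPIs and metrics": ["<KPI 1>"],
    "Likely benefits": ["<benefit 1>"],
    "Potential impediments": ["<impediment 1>"],
    "Costs and risks of failure": [],  # intentionally blank in this example
    "Potential unintended consequences": ["<consequence 1>"],
}

prompt = canvas_to_prompt("Increase Same Store Sales", canvas)
print(prompt)
```

The resulting text is what you would paste into the Bing question box as the opening narrative; nothing here calls Bing itself.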

Once we have fed the narrative into Bing, here are some sample prompts that you might want to explore with Bing:

Prompt: With this information about my targeted business initiative, are there other KPIs and metrics I should consider?
Prompt: What other potential business benefits should I consider for the “Increase Same Store Sales” business initiative?
Prompt: What other potential impediments should I consider for the “Increase Same Store Sales” business initiative?
Prompt: What additional potential risks should I consider for the “Increase Same Store Sales” business initiative?
Prompt: What additional unintended consequences should I consider for the “Increase Same Store Sales” business initiative?
Prompt: Rank order, or score on a scale of 1 to 100, the Impediments in the Chipotle exercise.
Prompt: What is your rationale for scoring the Impediments?
Prompt: What actions could I take to reduce the risks posed by the Impediments?
Prompt: And what else should I be asking?

And I’m sure there are even more dimensions to explore with Bing to develop a deeper and more comprehensive understanding of your targeted business initiative’s factors and requirements for successful execution.

TLADS Step #2: Understand Stakeholders and Expectations

Step 2 of the TLADS process seeks to identify the key initiative stakeholders – those people or roles that either impact or are impacted by the business initiative – and then understand why the business initiative is important to them. To fully leverage GenAI in Step 2, I have reworked the renamed Stakeholders Requirements Assessment design canvas to include the stakeholders’ desired outcomes, the critical decisions that they need to make in support of the business initiatives, and the KPIs and metrics against which they will measure the effectiveness of the desired outcomes and their key decisions (Figure 2).

Figure 2: Stakeholders Requirements Assessment

You can also create a Persona Map and a Customer Journey Map for each stakeholder to uncover more data about each stakeholder, their critical decisions, the influencers of those decisions, and their associated gains and pains.

Once you have fed the stakeholder information into Bing (via a massive prompt), you can ask various questions to expand your knowledge, insights, and understanding of your stakeholders. Here are some sample prompts that you might want to explore:

Prompt: Are there other key stakeholders that I should consider for my targeted business initiative?
Prompt: Why would this business initiative be important to each stakeholder?
Prompt: What critical decisions must each stakeholder make to support the business initiative?
Prompt: What are the stakeholders’ desired outcomes, and what are the KPIs and metrics against which they would measure outcomes and decision effectiveness?
Prompt: Across all stakeholders, what are the most critical decisions to support my targeted business initiative?
Prompt: And what else should I be asking?

TLADS Step #3: Identify and Understand Business Entities

Step 3 of the Thinking Like a Data Scientist methodology focuses on identifying and understanding the business initiative’s key business entities. These business entities can be either humans or devices.

We will create and apply our analytic scores based on these business entities. For humans, we might want to measure their likelihood of buying a specific product, leaving the company, suffering a stroke, attending a certain movie, and so on. For devices or equipment, we might want to estimate their remaining useful life or measure their likelihood of needing maintenance, degrading in performance, consuming excess energy, generating noise, or producing quality output (Figure 3).

Figure 3: TLADS Step 3: Business Entities Assessment

To gain more insights into the business entities that might be relevant to your targeted business initiative, you might want to explore these prompts:

Prompt: What are the key business entities (either human or equipment business entities) around which I want to build analytic propensity scores to optimize their performance considering the targeted business initiative?
Prompt: What behavioral or performance insights would I want to gather for each business entity?
Prompt: Which of these business entities are most important to the successful execution of my business initiative, and why?
Prompt: Create a score (from 1 to 100) for each business entity on that entity’s value potential, data availability, and influenceability.
Prompt: And what else should I be asking?

Summary: Integrating GenAI + TLADS – Part I

I am going to stop Part I of this exercise at this point because 1) I’m still working through the entirety of integrating GenAI with the Thinking Like a Data Scientist methodology, so I want to capture what I’ve learned so far, and 2) what I am learning in the rest of the exercise is blowing my mind.

But here is my key learning so far in applying GenAI to the TLADS methodology:

The only things limiting your ability to exploit GenAI are your curiosity and creativity and your ability to communicate clearly and effectively.

And in case you are following along on this journey, here are some pragmatic learnings in using Bing AI:

You can reset the context in a new thread, but it is time-consuming to re-enter the aggregated insights from the previous conversation. And while the Bing question box is limited to 4,000 characters, I found a hack for expanding that maximum number of characters to 25,000, which is an excellent help in resetting the next conversation thread.
Each Bing conversation is limited to 30 questions, and starting a new thread will reset the conversation memory. I am still figuring out how to override that limit, so be judicious in using your questions.
I ran this exercise several times with Bing by resetting the conversational thread, and each time, I got slightly different responses. You will need to judge when “good enough” is actually “good enough” from your Bing research assistant. The good news is that your Bing research assistant does not tire of repeatedly answering the same questions. And it is free (at least, so far).
I ended up creating separate Bing threads for each of the major TLADS topics, such as Stakeholders, Business Entities, Use Cases, Analytic Scores / Features, and Recommendations. Yes, staying organized given the 30-question limit has been a wee bit challenging, especially as I get excited on this journey.
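If you want to keep an eye on those limits while you work, a few lines of bookkeeping can help. The sketch below is purely illustrative: the `ThreadBudget` class is a hypothetical helper, not part of any Bing tooling, and the two constants are simply the limits reported above, which may well change. It does not talk to Bing; it only counts.

```python
from dataclasses import dataclass

# Limits as reported in this post; treat them as assumptions that may change.
MAX_PROMPT_CHARS = 4_000
MAX_QUESTIONS_PER_THREAD = 30


@dataclass
class ThreadBudget:
    """Hypothetical helper that tracks questions used in one conversation thread."""
    topic: str
    asked: int = 0

    def remaining(self) -> int:
        """Number of questions left before the thread hits its limit."""
        return MAX_QUESTIONS_PER_THREAD - self.asked

    def check(self, prompt: str) -> None:
        """Raise if the prompt is too long or the thread is out of questions."""
        if len(prompt) > MAX_PROMPT_CHARS:
            raise ValueError(
                f"Prompt is {len(prompt)} characters; limit is {MAX_PROMPT_CHARS}."
            )
        if self.remaining() <= 0:
            raise RuntimeError(f"Thread '{self.topic}' has used all its questions.")

    def record(self, prompt: str) -> None:
        """Validate the prompt, then count it against the thread's budget."""
        self.check(prompt)
        self.asked += 1


# One budget per TLADS topic thread, e.g. Stakeholders.
stakeholders = ThreadBudget(topic="Stakeholders")
stakeholders.record("Are there other key stakeholders that I should consider?")
print(stakeholders.remaining())
```

Keeping one `ThreadBudget` per topic mirrors the one-thread-per-topic habit described above and makes it obvious when a thread is about to run out of questions.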

Final note: So far, the result of this experiment is a “GenAI + Thinking Like a Data Scientist” eBook that is currently over 60 pages (and still growing as I uncover additional questions to explore with my GenAI research assistant, Bing AI). I have not yet decided what to do with this eBook, but I will probably include it in my “Big Data MBA: Thinking Like a Data Scientist” university and client classes and my forthcoming 2-day masterclass.

It’s just too big to post on a blog. And yes, when it comes to capturing all my learnings from integrating GenAI with my TLADS methodology in an eBook, “We’re going to need a bigger boat!”
