📊 Data Science Methodology 101: From Problem to Approach – Analytic Approach
In the previous stage of the data science methodology, we explored the importance of Business Understanding—defining the problem and clarifying the goals. Once the question is clear, the next step is choosing the right Analytic Approach.
This stage is crucial because different questions require different techniques. Selecting the wrong approach can lead to misleading insights, while the right one ensures the solution aligns with business requirements.
🔹 What Is an Analytic Approach?
An analytic approach refers to the strategy or method used to analyze data in order to solve the defined problem. It depends on:
The type of question being asked.
The kind of patterns we want to uncover.
The business requirements defined in the first stage.
Think of it as choosing the right “tool” for the job: prediction, descriptive analysis (such as clustering), statistical analysis, or classification.
🔹 Choosing the Right Approach
Here are some common scenarios:
Prediction (Probabilities of Action)
Example: “What is the likelihood of a customer churning next month?”
Tool: Predictive models (e.g., regression, decision trees).
Descriptive Analysis (Relationships & Groups)
Example: “What groups of customers show similar buying behavior?”
Tool: Clustering or association models.
Statistical Analysis (Counts or Summaries)
Example: “How many patients were readmitted within 30 days?”
Tool: Hypothesis testing, descriptive statistics.
Classification (Yes/No or Categorical Outcomes)
Example: “Will this transaction be fraudulent or not?”
Tool: Decision trees, logistic regression, random forests.
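To make the mapping concrete, the four scenarios above can be written down as a small lookup table. This is only a sketch for my own notes: the keys (`probability`, `groups`, `counts`, `category`) and the function name `suggest_approach` are labels I made up, not anything from the course.

```python
# Hypothetical lookup table mirroring the four scenarios listed above.
# The keys are made-up labels for the kind of answer the question needs.
APPROACHES = {
    "probability": ("Prediction", ["regression", "decision trees"]),
    "groups": ("Descriptive analysis", ["clustering", "association models"]),
    "counts": ("Statistical analysis", ["hypothesis testing", "descriptive statistics"]),
    "category": ("Classification", ["decision trees", "logistic regression", "random forests"]),
}


def suggest_approach(question_type):
    """Return (approach, candidate tools) for a question type, or raise if unknown."""
    key = question_type.strip().lower()
    if key not in APPROACHES:
        raise ValueError(f"Unknown question type: {question_type!r}")
    return APPROACHES[key]


# "Will this transaction be fraudulent or not?" needs a categorical answer.
approach, tools = suggest_approach("category")
print(approach, "->", ", ".join(tools))
```

In practice the choice is rarely this mechanical, but writing it out shows that the type of answer the business needs drives the choice of technique.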
🔹 Machine Learning in Analytic Approach
Machine Learning (ML) gives computers the ability to learn from data without being explicitly programmed. In data science methodology, ML helps identify:
Hidden relationships between variables.
Emerging patterns or trends.
Insights into human behavior (e.g., clustering social media users by interests).
Depending on the problem, ML approaches might include:
Clustering → grouping similar data points.
Association rules → finding patterns like “people who buy X also buy Y.”
Classification → predicting categories like “spam” or “not spam.”
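As a rough illustration of the “people who buy X also buy Y” idea, here is a minimal sketch that scores one association rule by its support and confidence using plain Python. The baskets are invented for illustration, and the helper names (`count`, `support`, `confidence`) are my own; a real project would more likely use a dedicated library and a proper algorithm such as Apriori.

```python
# Made-up shopping baskets, for illustration only.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]


def count(itemset, baskets):
    """Number of baskets that contain every item in itemset."""
    items = set(itemset)
    return sum(1 for basket in baskets if items <= basket)


def support(itemset, baskets):
    """Fraction of all baskets that contain every item in itemset."""
    return count(itemset, baskets) / len(baskets)


def confidence(antecedent, consequent, baskets):
    """Among baskets containing the antecedent, the fraction that also contain the consequent."""
    antecedent_count = count(antecedent, baskets)
    if antecedent_count == 0:
        raise ValueError("antecedent never appears in the baskets")
    both = set(antecedent) | set(consequent)
    return count(both, baskets) / antecedent_count


# Rule: people who buy bread also buy butter.
rule_support = support({"bread", "butter"}, baskets)
rule_confidence = confidence({"bread"}, {"butter"}, baskets)
print(f"bread -> butter: support={rule_support:.2f}, confidence={rule_confidence:.2f}")
```

Here 3 of the 5 baskets contain both items (support 0.60), and 3 of the 4 baskets with bread also contain butter (confidence 0.75).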
🔹 Case Study: Healthcare Readmissions
Let’s revisit the healthcare insurance provider case study from the Business Understanding stage.
Problem: Reduce patient readmissions, especially for congestive heart failure patients.
Chosen Analytic Approach:
The team selected a Decision Tree Classification Model.
How it works:
Each node in the tree splits data based on conditions (e.g., age, treatment type).
Each path leads to a “leaf” with a predicted outcome (Yes = readmission, No = no readmission).
The proportion of “Yes” (readmission) outcomes among the patients who fall into a leaf gives that leaf’s risk score.
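The steps above can be sketched with a tiny hand-built tree. Everything here is an assumption for illustration: the split features (`age`, `prior_admissions`), the thresholds, and the leaf counts are invented, and a real model would be learned from data rather than typed in by hand.

```python
# Hand-built toy tree. Split conditions and leaf counts are invented for illustration.
# Internal nodes hold a feature and threshold; leaves hold outcome counts from "training" patients.
TREE = {
    "feature": "age",
    "threshold": 65,
    "left": {"leaf": {"yes": 10, "no": 40}},        # age <= 65
    "right": {                                      # age > 65
        "feature": "prior_admissions",
        "threshold": 1,
        "left": {"leaf": {"yes": 15, "no": 35}},    # prior_admissions <= 1
        "right": {"leaf": {"yes": 35, "no": 15}},   # prior_admissions > 1
    },
}


def find_leaf(node, patient):
    """Follow the splits from the root until a leaf is reached; return its outcome counts."""
    while "leaf" not in node:
        goes_left = patient[node["feature"]] <= node["threshold"]
        node = node["left"] if goes_left else node["right"]
    return node["leaf"]


def risk_score(counts):
    """Share of patients in the leaf who were readmitted."""
    return counts["yes"] / (counts["yes"] + counts["no"])


patient = {"age": 72, "prior_admissions": 3}
leaf = find_leaf(TREE, patient)
prediction = "Yes" if leaf["yes"] > leaf["no"] else "No"
risk = risk_score(leaf)
print(f"Predicted readmission: {prediction}, risk score: {risk:.2f}")
```

The path taken (age over 65, more than one prior admission) doubles as the explanation for the prediction, which is what makes the model readable for clinicians.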
Why decision trees?
Easy for non-data scientists (like clinicians) to understand.
Provides both prediction and explanation (which conditions led to the prediction).
Flexible → separate models can be applied at different stages of a hospital stay.
Creates a dynamic risk profile that evolves with ongoing treatments.
Example:
If the leaf shows 70% of similar patients were readmitted, the new patient’s risk is 0.7 (70%).
If the dominant outcome in the leaf is “No,” the risk is 1 minus the proportion of “No” patients (for example, 80% “No” gives a risk of 0.2).
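The arithmetic in this example fits in a few lines of Python. This is a sketch under my own assumptions: the function name `risk_from_leaf` and the convention of passing the leaf's majority outcome together with its share are mine, not the course's.

```python
def risk_from_leaf(dominant_outcome, dominant_share):
    """Convert a leaf's majority outcome and its share of patients into a readmission risk."""
    if not 0.0 <= dominant_share <= 1.0:
        raise ValueError("dominant_share must be between 0 and 1")
    if dominant_outcome == "Yes":
        # The share of readmitted patients is the risk itself.
        return dominant_share
    if dominant_outcome == "No":
        # The remaining patients in the leaf are the ones who were readmitted.
        return 1.0 - dominant_share
    raise ValueError("dominant_outcome must be 'Yes' or 'No'")


print(risk_from_leaf("Yes", 0.70))  # leaf where 70% of patients were readmitted
print(risk_from_leaf("No", 0.75))   # leaf where 75% of patients were not readmitted
```

Either way the result is the estimated probability of readmission, so risk scores from different leaves can be compared on the same scale.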
🔹 Why Analytic Approach Matters
Selecting the right analytic approach ensures:
The method matches the question.
Insights are actionable and relevant.
Stakeholders can understand and trust the model outputs.
In this case, the decision tree gave healthcare providers a transparent, data-driven basis for decisions about patient care, with the goal of reducing readmissions.
🔹 Wrapping Up
The Analytic Approach is the second stage in the Data Science Methodology.
It answers the question:
👉 “How should we use data to solve this problem?”
With business requirements guiding the process, data scientists can now decide whether to use predictive modeling, descriptive analysis, classification, clustering, or statistical techniques.
In the next stage, we’ll dive into Data Requirements—figuring out what data is needed to support the chosen analytic approach.
Source: IBM Data Science Professional Certificate (Coursera)
Instructors: A special thanks to the incredible IBM instructors and the entire IBM Skills Network Team.
Disclaimer: This blog post is part of my personal learning journal where I document my progress through the IBM Data Science Professional Certificate. These articles represent my personal understanding and interpretation of the course material. They are not official course notes and are not endorsed by IBM or Coursera.