causal-inference
npx machina-cli add skill pablodiegoo/Data-Pro-Skill/causal-inference --openclaw
Causal Inference
This skill uses mathematical modeling to go beyond correlation and identify which variables actually drive an outcome.
Core Capabilities
1. Driver & Association Analysis
- drivers_analysis.py: Identifies key drivers of a target variable (e.g., Overall Satisfaction).
- association_matrix.py: Calculates strong associations between categorical groups.
- chi2_residuals.py: Computes standard and adjusted residuals for cross-tabulations.
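To make the residual terminology concrete, here is a minimal sketch of cross-tab residuals in plain Python. It is not the code in chi2_residuals.py, whose internals are not shown here. It assumes "standard" refers to Pearson residuals, (observed - expected) / sqrt(expected), and "adjusted" to the usual adjusted standardized residuals. The function name and the sample table are hypothetical.

```python
import math

def chi2_residuals(table):
    """Return (expected, pearson, adjusted) for a 2-D table of counts.

    Hypothetical helper for illustration; assumes every expected count > 0.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    expected, pearson, adjusted = [], [], []
    for i, row in enumerate(table):
        e_row, p_row, a_row = [], [], []
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            e = row_totals[i] * col_totals[j] / n
            # Pearson residual: deviation scaled by sqrt(expected).
            p = (observed - e) / math.sqrt(e)
            # Adjusted residual: also corrects for the row and column margins,
            # so values are roughly comparable to a standard normal.
            se = math.sqrt(e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            a = (observed - e) / se
            e_row.append(e)
            p_row.append(p)
            a_row.append(a)
        expected.append(e_row)
        pearson.append(p_row)
        adjusted.append(a_row)
    return expected, pearson, adjusted

# Made-up 2x2 cross-tab (rows: two groups, columns: two outcomes).
table = [[30, 10], [20, 40]]
expected, pearson, adjusted = chi2_residuals(table)

# The chi-square statistic is the sum of squared Pearson residuals.
chi2 = sum(p * p for row in pearson for p in row)
print(expected)
print(round(chi2, 3))
```

Cells with a large absolute adjusted residual (a common rule of thumb is about 2 or more) are the ones contributing most to a significant chi-square result.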
2. Modeling Diagnostics
- partial_residual_plot.py & glm_partial_residual_plot.py: Validates linear and generalized linear models.
- factor_analysis.py: Discovers latent variables and structure in complex data.
- multivariate_normal_contours.py: Plots multidimensional relationships.
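For readers unfamiliar with partial residuals, the sketch below shows the quantity such a plot displays for an ordinary linear model: the model residual plus the fitted contribution of one predictor, plotted against that predictor. It is an illustration under stated assumptions, not the implementation in partial_residual_plot.py. The helpers `solve`, `ols`, and `partial_residuals` and the sample data are hypothetical, and the GLM variant is not covered.

```python
def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    design = [[1.0] + list(row) for row in X]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

def partial_residuals(X, y, j):
    """Partial residuals for predictor j: residual + beta_j * x_j."""
    beta = ols(X, y)
    result = []
    for row, yi in zip(X, y):
        fitted = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
        result.append((yi - fitted) + beta[j + 1] * row[j])
    return result

# Made-up data: y is linear in x1 but quadratic in x2.
X = [(1.0, 1.0), (2.0, 0.0), (3.0, 2.0), (4.0, 1.0), (5.0, 3.0), (6.0, 4.0)]
y = [1.0 + 2.0 * x1 + 0.5 * x2 ** 2 for x1, x2 in X]

# Plot these against x2: visible curvature suggests the linear term is not enough.
for (x1, x2), pr in zip(X, partial_residuals(X, y, j=1)):
    print(x2, round(pr, 3))
```

If the model is adequate for a predictor, its partial residuals scatter around a straight line with slope equal to that predictor's coefficient; systematic curvature points to a missing transformation or term.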
Source
git clone https://github.com/pablodiegoo/Data-Pro-Skill/blob/main/src/datapro/data/skills/causal-inference/SKILL.md
Overview
Causal Inference uses mathematical modeling to go beyond correlation and isolate the variables that actually drive an outcome. It supports driver and association analysis, chi-square residuals, and partial residual diagnostics to validate relationships, plus factor analysis to reveal latent structure in complex datasets.
How This Skill Works
Three scripts handle driver and association analysis: drivers_analysis.py finds key drivers of a target (e.g., Overall Satisfaction), association_matrix.py identifies strong associations between categorical groups, and chi2_residuals.py computes cross-tab residuals. For modeling diagnostics, partial_residual_plot.py and glm_partial_residual_plot.py validate linear and generalized linear models. Finally, factor_analysis.py and multivariate_normal_contours.py reveal latent structure and multidimensional relationships.
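As a rough illustration of what an association matrix between categorical variables can look like, the sketch below computes Cramér's V for every pair of columns. This is an assumption for illustration only: the measure association_matrix.py actually uses is not stated here, and `cramers_v`, the column names, and the data are hypothetical. The version shown is the plain (not bias-corrected) Cramér's V.

```python
import math
from collections import Counter

def cramers_v(xs, ys):
    """Cramér's V between two equal-length sequences of category labels."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_counts = Counter(xs)
    col_counts = Counter(ys)

    # Chi-square statistic over every cell, including empty ones.
    chi2 = 0.0
    for r, r_count in row_counts.items():
        for c, c_count in col_counts.items():
            expected = r_count * c_count / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(row_counts), len(col_counts)) - 1
    # With a single category on either side, association is undefined; report 0.
    return math.sqrt(chi2 / (n * k)) if k > 0 else 0.0

# Made-up categorical data, one list per column.
data = {
    "segment": ["A", "A", "B", "B", "A", "B"],
    "region":  ["N", "N", "S", "S", "N", "S"],
    "plan":    ["x", "y", "x", "y", "y", "x"],
}

# Pairwise matrix stored as a dict keyed by (column, column).
matrix = {(a, b): cramers_v(data[a], data[b]) for a in data for b in data}

for (a, b), v in matrix.items():
    print(f"{a:8s} {b:8s} {v:.3f}")
```

Values range from 0 (no association) to 1 (one variable fully determines the other), which makes the pairs easy to compare in a single table or heatmap.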
When to Use It
- When you need to identify which factors truly drive a target metric (Key Driver Analysis).
- When you want to assess cross-tab fit using chi-square residuals.
- When exploring strong associations between categorical groups with an association matrix.
- When validating linear/GLM models with partial residual plots.
- When uncovering latent structure or multidimensional relationships with factor analysis and contour plots.
Quick Start
- Step 1: Define your target variable and run drivers_analysis.py to identify key drivers.
- Step 2: Run association_matrix.py and chi2_residuals.py to map associations and assess cross-tab fit.
- Step 3: Use partial_residual_plot.py or glm_partial_residual_plot.py to validate models; optionally run factor_analysis.py and multivariate_normal_contours.py for latent structure.
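The idea behind Step 1 can be sketched as follows. The method drivers_analysis.py uses is not described here, so this example swaps in the simplest possible stand-in: ranking candidate drivers by the absolute Pearson correlation with the target. The function names and survey scores are hypothetical. A correlation ranking is only a first screen and does not establish causation.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def rank_drivers(features, target):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Made-up survey scores on a 1-5 scale.
overall_satisfaction = [3, 4, 2, 5, 4, 1]
features = {
    "support":  [3, 4, 2, 5, 5, 1],
    "price":    [4, 3, 3, 2, 4, 3],
    "delivery": [2, 4, 3, 4, 3, 2],
}

ranking = rank_drivers(features, overall_satisfaction)
for name, r in ranking:
    print(f"{name:10s} r = {r:+.3f}")
```

Whatever method produces the ranking, the top candidates are the ones worth corroborating with the association and residual checks in Steps 2 and 3.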
Best Practices
- Define a clear target variable before analysis.
- Pre-clean data and ensure sufficient category counts for chi-square tests.
- Use drivers_analysis to rank drivers, then corroborate with association and residual analyses.
- Regularly validate models with partial_residual_plot.py and glm_partial_residual_plot.py.
- Leverage factor_analysis.py and multivariate_normal_contours.py to explore latent structure.
Example Use Cases
- Use drivers_analysis.py to identify key drivers of Overall Satisfaction in a customer survey.
- Apply chi2_residuals.py to compute residuals for a cross-tab of product category by region.
- Run association_matrix.py to reveal strong associations between customer segments.
- Validate a revenue prediction model with partial_residual_plot.py.
- Discover latent factors using factor_analysis.py and visualize relationships between variables with multivariate_normal_contours.py.