
causal-inference

npx machina-cli add skill pablodiegoo/Data-Pro-Skill/causal-inference --openclaw

Causal Inference

This skill uses mathematical modeling to go beyond correlation and identify which factors genuinely drive a target metric.

Core Capabilities

1. Driver & Association Analysis

  • drivers_analysis.py: Identifies key drivers of a target variable (e.g., Overall Satisfaction).
  • association_matrix.py: Measures the strength of association between categorical variables.
  • chi2_residuals.py: Computes standardized and adjusted residuals for cross-tabulations.
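
The listing does not show how chi2_residuals.py is implemented, so the following is only a minimal sketch of the usual textbook definitions, assuming a plain list-of-lists contingency table. The function name `chi2_residuals` and the sample counts are hypothetical.

```python
from math import sqrt

def chi2_residuals(table):
    """Return (expected, pearson, adjusted) grids for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected, pearson, adjusted = [], [], []
    for i, row in enumerate(table):
        exp_row, pea_row, adj_row = [], [], []
        for j, observed in enumerate(row):
            e = row_totals[i] * col_totals[j] / n   # expected count under independence
            diff = observed - e
            exp_row.append(e)
            pea_row.append(diff / sqrt(e))          # standardized (Pearson) residual
            # Adjusted residual: also corrects for the row and column margins.
            adj_row.append(
                diff / sqrt(e * (1 - row_totals[i] / n) * (1 - col_totals[j] / n))
            )
        expected.append(exp_row)
        pearson.append(pea_row)
        adjusted.append(adj_row)
    return expected, pearson, adjusted

# Hypothetical counts: rows = product category, columns = region.
observed = [[30, 10], [20, 40]]
expected, pearson, adjusted = chi2_residuals(observed)
```

Under independence the adjusted residuals are approximately standard normal, so cells with an absolute value well above 2 are the ones worth a closer look.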

2. Modeling Diagnostics

  • partial_residual_plot.py & glm_partial_residual_plot.py: Validates linear and generalized linear models.
  • factor_analysis.py: Discovers latent variables and structure in complex data.
  • multivariate_normal_contours.py: Plots multidimensional relationships.
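
As an illustration of what a partial residual plot draws, here is a small sketch that assumes an ordinary least-squares fit. For predictor j, the partial residual is the model residual plus that predictor's fitted contribution. The helper names (`solve`, `ols`, `partial_residuals`) and the data are hypothetical and are not taken from the skill's scripts.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty), rows

def partial_residuals(X, y, j):
    """Residual plus the fitted contribution of predictor j, one value per row."""
    beta, rows = ols(X, y)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    return [yi - fi + beta[j + 1] * r[j + 1] for yi, fi, r in zip(y, fitted, rows)]

# Hypothetical data generated exactly from y = 1 + 2*x1 - 3*x2.
X = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1 + 2 * x1 - 3 * x2 for x1, x2 in X]
pr_x1 = partial_residuals(X, y, 0)
```

Plotting `pr_x1` against the first predictor should give a straight line when the linear term is adequate. Visible curvature would suggest a transformation or an extra term.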

Source

https://github.com/pablodiegoo/Data-Pro-Skill/blob/main/src/datapro/data/skills/causal-inference/SKILL.md

Overview

Causal Inference uses mathematical modeling to go beyond correlation and isolate which factors genuinely drive a target metric. It supports driver and association analysis, chi-square residuals, partial residual diagnostics for validating model fit, and factor analysis for revealing latent structure in complex datasets.

How This Skill Works

Core components drive the workflow: drivers_analysis.py finds key drivers of a target (e.g., Overall Satisfaction), association_matrix.py measures the strength of relationships between categorical groups, and chi2_residuals.py computes cross-tab residuals. For modeling and diagnostics, partial_residual_plot.py and glm_partial_residual_plot.py validate linear and generalized linear models. Finally, factor_analysis.py and multivariate_normal_contours.py reveal latent structure and multidimensional relationships.

When to Use It

  • When you need to identify which factors truly drive a target metric (Key Driver Analysis).
  • When you want to see which cells of a cross-tab depart from independence, using chi-square residuals.
  • When exploring strong associations between categorical groups with an association matrix.
  • When validating linear/GLM models with partial residual plots.
  • When uncovering latent structure or multidimensional relationships with factor analysis and contour plots.
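
To make the association matrix use case concrete, the sketch below assumes that association strength is measured with Cramér's V, a common chi-square-based statistic for categorical pairs. The listing does not say which measure association_matrix.py uses, so treat the choice, the function names, and the sample columns as assumptions.

```python
from collections import Counter
from math import sqrt

def cramers_v(xs, ys):
    """Cramér's V for two equal-length label sequences (0 = none, 1 = perfect)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row, col = Counter(xs), Counter(ys)
    chi2 = 0.0
    for r in row:
        for c in col:
            e = row[r] * col[c] / n                      # expected count
            chi2 += (joint.get((r, c), 0) - e) ** 2 / e  # chi-square contribution
    k = min(len(row), len(col)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0

def association_matrix(columns):
    """Pairwise Cramér's V for a dict mapping column name -> list of labels."""
    names = list(columns)
    return {a: {b: cramers_v(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical customer data.
columns = {
    "segment": ["a", "a", "b", "b"],
    "plan":    ["x", "x", "y", "y"],   # perfectly aligned with segment
    "channel": ["p", "q", "p", "q"],   # unrelated to segment
}
matrix = association_matrix(columns)
```

Here `matrix["segment"]["plan"]` comes out at 1 and `matrix["segment"]["channel"]` at 0, which are the two extremes of the scale.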

Quick Start

  1. Define your target variable and run drivers_analysis.py to identify key drivers.
  2. Run association_matrix.py and chi2_residuals.py to map associations and assess cross-tab fit.
  3. Use partial_residual_plot.py or glm_partial_residual_plot.py to validate models; optionally run factor_analysis.py and multivariate_normal_contours.py for latent structure.
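
The scripts' command-line arguments are not documented here, so the first step is illustrated with a stand-in, not with the script itself. The sketch ranks candidate drivers by the absolute Pearson correlation with the target. This is a deliberately simple technique and not necessarily what drivers_analysis.py does; the feature names and scores are made up.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def rank_drivers(features, target):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical survey: five respondents, target = overall satisfaction.
satisfaction = [1, 2, 3, 4, 5]
features = {
    "delivery": [3, 1, 3, 1, 2],
    "support":  [1, 2, 3, 4, 5],
    "price":    [5, 3, 4, 2, 1],
}
ranking = rank_drivers(features, satisfaction)
```

Pairwise correlations ignore overlap between predictors, which is one reason to corroborate a ranking like this with the association and residual analyses in the later steps.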

Best Practices

  • Define a clear target variable before analysis.
  • Pre-clean data and ensure sufficient category counts for chi-square tests.
  • Use drivers_analysis to rank drivers, then corroborate with association and residual analyses.
  • Regularly validate models with partial_residual_plot.py and glm_partial_residual_plot.py.
  • Leverage factor_analysis.py and multivariate_normal_contours.py to explore latent structure.
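
The advice about sufficient category counts can be checked before running a chi-square test. The sketch below assumes the common rule of thumb that every expected cell count should be at least 5. The threshold, the function name, and the sample table are illustrative and do not come from the skill.

```python
def low_expected_cells(table, minimum=5.0):
    """Return (row, col, expected) for cells whose expected count is below `minimum`."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    flagged = []
    for i, r_total in enumerate(row_totals):
        for j, c_total in enumerate(col_totals):
            e = r_total * c_total / n   # expected count under independence
            if e < minimum:
                flagged.append((i, j, e))
    return flagged

# Hypothetical sparse cross-tab: 12 observations spread over 4 cells.
sparse = [[8, 1], [2, 1]]
flagged = low_expected_cells(sparse)
```

When cells are flagged, merging rare categories or collecting more data are the usual remedies before the residuals are interpreted.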

Example Use Cases

  • Use drivers_analysis.py to identify key drivers of Overall Satisfaction in a customer survey.
  • Apply chi2_residuals.py to compute residuals for a cross-tab of product category by region.
  • Run association_matrix.py to reveal strong associations between customer segments.
  • Validate a revenue prediction model with partial_residual_plot.py.
  • Discover latent factors using factor_analysis.py and visualize relationships between variables with multivariate_normal_contours.py.
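
For the contour use case, the quantity being contoured is presumably a multivariate normal density. The sketch below evaluates the two-dimensional case on a grid. The mean, covariance, and grid are made-up values, and the connection to multivariate_normal_contours.py is an assumption based only on the script's name.

```python
from math import exp, pi, sqrt

def bivariate_normal_pdf(x, y, mean, cov):
    """Density of a 2-D normal distribution at (x, y) for a 2x2 covariance matrix."""
    (sxx, sxy), (_, syy) = cov
    det = sxx * syy - sxy * sxy
    dx, dy = x - mean[0], y - mean[1]
    # Squared Mahalanobis distance, using the explicit inverse of a 2x2 matrix.
    d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return exp(-0.5 * d2) / (2 * pi * sqrt(det))

# Hypothetical parameters: two positively correlated variables.
mean = (0.0, 0.0)
cov = [[1.0, 0.6], [0.6, 1.0]]
axis = [i / 2 for i in range(-4, 5)]   # grid points from -2.0 to 2.0
grid = [[bivariate_normal_pdf(x, y, mean, cov) for x in axis] for y in axis]
```

Passing `axis`, `axis`, and `grid` to a contour routine such as matplotlib's `contour` would draw ellipses whose tilt reflects the covariance between the two variables.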
