Pywayne Statistics

@wangyendt

npx machina-cli add skill @wangyendt/statistics-2 --openclaw

Comprehensive statistical testing library for hypothesis testing, A/B testing, and data analysis.

Quick Start

from pywayne.statistics import NormalityTests, LocationTests
import numpy as np

# Test data normality
nt = NormalityTests()
data = np.random.normal(0, 1, 100)
result = nt.shapiro_wilk(data)
print(f"p-value: {result.p_value:.4f}, is_normal: {not result.reject_null}")

# Compare two groups
lt = LocationTests()
group_a = np.random.normal(100, 15, 50)
group_b = np.random.normal(105, 15, 50)
result = lt.two_sample_ttest(group_a, group_b)
print(f"Significant difference: {result.reject_null}")

Test Categories

Normality Tests (NormalityTests)

Test if data follows a normal distribution or other specified distributions.

| Method | Description | Use Case |
| --- | --- | --- |
| shapiro_wilk | Shapiro-Wilk test | Small-medium samples (n ≤ 5000) |
| ks_test_normal | K-S normality test | Medium-large samples |
| ks_test_two_sample | Two-sample K-S test | Compare two sample distributions |
| anderson_darling | Anderson-Darling test | Tail-sensitive normality test |
| dagostino_pearson | D'Agostino-Pearson K² | Based on skewness and kurtosis |
| jarque_bera | Jarque-Bera test | Large samples, regression residuals |
| chi_square_goodness_of_fit | Chi-square goodness-of-fit | Categorical data |
| lilliefors_test | Lilliefors test | Unknown parameters K-S test |

Example:

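from pywayne.statistics import NormalityTests
import numpy as np

data = np.random.normal(0, 1, 100)  # example data; replace with your own sample

nt = NormalityTests()
result = nt.shapiro_wilk(data)
# A small p-value is evidence against normality; a large one does not prove normality
if result.p_value < 0.05:
    print("Data is NOT normally distributed")
else:
    print("No significant departure from normality detected")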
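
To make the "based on skewness and kurtosis" idea from the table concrete, here is a minimal pure-Python sketch of the Jarque-Bera statistic, JB = n/6 · (S² + (K − 3)²/4), where S is the sample skewness and K the sample kurtosis. The helper name `jarque_bera_statistic` is made up for illustration and is not part of pywayne; how the library's `jarque_bera` method computes its result is not shown in this document.

```python
def jarque_bera_statistic(data):
    """Jarque-Bera statistic for a sequence of numbers (hypothetical helper, not pywayne API)."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(data) / n
    # Biased (divide-by-n) central moments, as used in the JB definition
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    if m2 == 0:
        raise ValueError("data has zero variance")
    skewness = m3 / m2 ** 1.5   # 0 for a symmetric sample
    kurtosis = m4 / m2 ** 2     # 3 for a normal distribution
    return n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)


jb = jarque_bera_statistic([-2, -1, 0, 1, 2])
print(f"JB statistic: {jb:.4f}")
```

Under the null hypothesis of normality the statistic is approximately chi-square distributed with 2 degrees of freedom, so values above roughly 5.99 reject at the 5% level. The approximation is only reliable for large samples, which matches the use case listed in the table.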

Location Tests (LocationTests)

Compare means or medians across groups (parametric and non-parametric).

| Method | Description | Use Case |
| --- | --- | --- |
| one_sample_ttest | One-sample t-test | Compare sample mean to a value |
| two_sample_ttest | Two-sample t-test | Compare two independent group means |
| paired_ttest | Paired t-test | Compare before/after measurements |
| one_way_anova | One-way ANOVA | Compare 3+ group means |
| mann_whitney_u | Mann-Whitney U test | Non-parametric two-sample test |
| wilcoxon_signed_rank | Wilcoxon signed-rank | Non-parametric paired test |
| kruskal_wallis | Kruskal-Wallis H test | Non-parametric multi-group test |

Example (A/B Testing):

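from pywayne.statistics import LocationTests, NormalityTests

lt = LocationTests()
nt = NormalityTests()

# control and treatment are 1-D arrays (or lists) holding the metric for each group

# Check normality of both groups first
both_normal = (
    nt.shapiro_wilk(control).p_value > 0.05
    and nt.shapiro_wilk(treatment).p_value > 0.05
)
if both_normal:
    result = lt.two_sample_ttest(control, treatment)
else:
    result = lt.mann_whitney_u(control, treatment)

print(f"Effect significant: {result.reject_null}")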

Correlation Tests (CorrelationTests)

Test correlation between variables and independence of categorical variables.

| Method | Description | Use Case |
| --- | --- | --- |
| pearson_correlation | Pearson correlation | Linear relationship |
| spearman_correlation | Spearman's rank | Monotonic relationship |
| kendall_tau | Kendall's tau | Rank correlation, small samples |
| chi_square_independence | Chi-square independence | Categorical variables |
| fisher_exact_test | Fisher's exact test | 2×2 contingency table |
| mcnemar_test | McNemar's test | Paired categorical data |

Example:

from pywayne.statistics import CorrelationTests

ct = CorrelationTests()
# x and y are equal-length 1-D arrays or lists of paired observations
result = ct.pearson_correlation(x, y)
print(f"Correlation: {result.statistic:.3f}, p-value: {result.p_value:.4f}")
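
The difference between the "linear" and "monotonic" use cases in the table can be illustrated without the library. The sketch below computes Pearson's r and Spearman's rho by hand for a perfectly monotonic but non-linear relationship. The helper names `pearson_r`, `ranks`, and `spearman_rho` are illustrative only and are not pywayne functions.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # monotonic in x, but not linear
print(f"Pearson r:    {pearson_r(x, y):.3f}")
print(f"Spearman rho: {spearman_rho(x, y):.3f}")
```

Spearman's rho reaches 1 here because the ordering of y matches the ordering of x exactly, while Pearson's r stays below 1 because the points do not lie on a straight line.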

Time Series Tests (TimeSeriesTests)

Test time series properties: stationarity, autocorrelation, cointegration.

| Method | Description | Use Case |
| --- | --- | --- |
| adf_test | Augmented Dickey-Fuller | Unit root test for stationarity |
| kpss_test | KPSS test | Stationarity test (complements ADF) |
| ljung_box_test | Ljung-Box Q test | Overall autocorrelation |
| runs_test | Runs test | Randomness testing |
| arch_test | ARCH effect test | Heteroscedasticity |
| granger_causality | Granger causality | Causal relationship |
| engle_granger_cointegration | Engle-Granger cointegration | Long-term equilibrium |
| breusch_godfrey_test | Breusch-Godfrey | Higher-order autocorrelation |

Example:

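from pywayne.statistics import TimeSeriesTests

tst = TimeSeriesTests()
# time_series_data is a 1-D array or list of observations in time order
adf_result = tst.adf_test(time_series_data)    # null hypothesis: unit root
kpss_result = tst.kpss_test(time_series_data)  # null hypothesis: stationary

# The two tests have opposite null hypotheses, so read them together
if adf_result.reject_null and not kpss_result.reject_null:
    print("Series is stationary")
elif not adf_result.reject_null and kpss_result.reject_null:
    print("Series has unit root (non-stationary)")
else:
    print("ADF and KPSS disagree; inspect the series further")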

Model Diagnostics (ModelDiagnostics)

Regression model diagnostics: heteroscedasticity, autocorrelation, multicollinearity.

| Method | Description | Use Case |
| --- | --- | --- |
| breusch_pagan_test | Breusch-Pagan | Heteroscedasticity test |
| white_test | White's test | General heteroscedasticity |
| goldfeld_quandt_test | Goldfeld-Quandt | Structural break heteroscedasticity |
| durbin_watson_test | Durbin-Watson | First-order autocorrelation |
| variance_inflation_factor | VIF | Multicollinearity diagnosis |
| levene_test | Levene's test | Homogeneity of variance |
| bartlett_test | Bartlett's test | Homogeneity (normal assumption) |
| residual_normality_test | Residual normality | Regression assumption check |

Example:

from pywayne.statistics import ModelDiagnostics

md = ModelDiagnostics()
# model is an already-fitted regression model; X and y are its features and target
residuals = y - model.predict(X)

# Check assumptions
bp_result = md.breusch_pagan_test(residuals, X)
dw_result = md.durbin_watson_test(residuals)

if bp_result.reject_null:
    print("Warning: Heteroscedasticity detected")
print(f"Durbin-Watson statistic: {dw_result.statistic:.2f}")

TestResult Object

All test methods return a unified TestResult object:

result = nt.shapiro_wilk(data)

# Access results
result.test_name        # Test method name
result.statistic        # Test statistic value
result.p_value          # P-value
result.reject_null      # True if null hypothesis is rejected
result.critical_value   # Critical value (if applicable)
result.confidence_interval # Tuple (lower, upper) if applicable
result.effect_size      # Effect size if applicable
result.additional_info  # Dict with additional information
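
The class definition itself is not shown in this document. As a rough mental model only, the attributes listed above could be mirrored by a dataclass like the one below. The name `ResultSketch`, the field types, the use of `None` for "not applicable" fields, and the `summarize` helper are all assumptions made for illustration, not the library's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class ResultSketch:
    """Hypothetical stand-in that mirrors the attributes listed above."""
    test_name: str
    statistic: float
    p_value: float
    reject_null: bool
    critical_value: Optional[float] = None                      # assumed None when not applicable
    confidence_interval: Optional[Tuple[float, float]] = None   # (lower, upper)
    effect_size: Optional[float] = None
    additional_info: Dict[str, Any] = field(default_factory=dict)


def summarize(result):
    """Build a one-line report, skipping optional fields that are not set."""
    parts = [f"{result.test_name}: statistic={result.statistic:.3f}, p={result.p_value:.4f}"]
    parts.append("reject H0" if result.reject_null else "fail to reject H0")
    if result.effect_size is not None:
        parts.append(f"effect size={result.effect_size:.3f}")
    if result.confidence_interval is not None:
        low, high = result.confidence_interval
        parts.append(f"CI=({low:.3f}, {high:.3f})")
    return "; ".join(parts)


demo = ResultSketch("demo test", statistic=1.5, p_value=0.2, reject_null=False)
print(summarize(demo))
```

Because `summarize` only reads attributes, it should also work on any real result object that exposes the same attribute names, provided unset optional fields are `None`.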

Utility Functions

list_all_tests()

List all available test methods across all modules.

from pywayne.statistics import list_all_tests
print(list_all_tests())

show_test_usage(method_name)

Display usage and documentation for a specific test.

from pywayne.statistics import show_test_usage
show_test_usage('shapiro_wilk')

Method Selection Guide

Normality Tests

| Sample Size | Recommended Method |
| --- | --- |
| n < 30 | Shapiro-Wilk |
| 30 ≤ n ≤ 300 | Shapiro-Wilk, D'Agostino-Pearson |
| n > 300 | Jarque-Bera, Kolmogorov-Smirnov |
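
The sample-size thresholds above can be encoded as a small lookup helper. This is a sketch: `recommend_normality_tests` is a hypothetical function, not part of pywayne, and it simply returns the method names from the NormalityTests table that correspond to each row.

```python
def recommend_normality_tests(n):
    """Map a sample size to the method names suggested by the table above."""
    if n < 3:
        # Shapiro-Wilk needs at least three observations
        raise ValueError("sample is too small for a normality test")
    if n < 30:
        return ["shapiro_wilk"]
    if n <= 300:
        return ["shapiro_wilk", "dagostino_pearson"]
    return ["jarque_bera", "ks_test_normal"]


for size in (12, 150, 2000):
    print(size, recommend_normality_tests(size))
```

The returned names could then be looked up on a `NormalityTests` instance with `getattr`, for example `getattr(nt, name)(data)`.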

Location Tests

| Condition | Parametric | Non-parametric |
| --- | --- | --- |
| Normal data | t-test, ANOVA | - |
| Non-normal data | - | Mann-Whitney U, Kruskal-Wallis |
| Paired data | Paired t-test | Wilcoxon signed-rank |
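
The same decision table can be written as a function. The helper below is a hypothetical sketch (not a pywayne function) that returns the LocationTests method name for a given design. It assumes that "normal" has already been judged separately, for example with a normality test on each group or on the paired differences.

```python
def recommend_location_test(n_groups, paired=False, normal=True):
    """Pick a LocationTests method name from the table above."""
    if n_groups < 2:
        raise ValueError("this helper covers comparisons between two or more groups")
    if paired:
        if n_groups != 2:
            raise ValueError("paired comparisons need exactly two sets of measurements")
        return "paired_ttest" if normal else "wilcoxon_signed_rank"
    if n_groups == 2:
        return "two_sample_ttest" if normal else "mann_whitney_u"
    # Three or more independent groups
    return "one_way_anova" if normal else "kruskal_wallis"


print(recommend_location_test(2, normal=False))
print(recommend_location_test(4, normal=True))
```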

Multiple Testing Correction

When performing multiple tests, apply p-value correction:

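from statsmodels.stats.multitest import multipletests

# results is a list of TestResult objects collected from earlier tests
p_values = [r.p_value for r in results]
# Benjamini-Hochberg false discovery rate correction
rejected, p_corrected, _, _ = multipletests(
    p_values, alpha=0.05, method='fdr_bh'
)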
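
For readers who want to see what the `fdr_bh` (Benjamini-Hochberg) correction does, here is a minimal pure-Python sketch of the procedure. The function name `benjamini_hochberg` is illustrative. For independent tests it should agree with `multipletests(..., method='fdr_bh')` up to floating-point error, but in practice the statsmodels function is the better choice.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (rejected, adjusted_p) using the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    if m == 0:
        return [], []
    # Indices of the p-values in ascending order
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    rejected = [p <= alpha for p in adjusted]
    return rejected, adjusted


rejected, adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(rejected)
print([round(p, 4) for p in adjusted])
```

Each raw p-value is scaled by m / rank, where rank is its position in ascending order, and the running minimum keeps the adjusted values from decreasing as the raw p-values increase.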

Common Applications

Data Quality Check

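import numpy as np
from pywayne.statistics import NormalityTests

def data_quality_check(data):
    # Convert lists to an array so the boolean mask below works
    data = np.asarray(data, dtype=float)
    nt = NormalityTests()

    normality = nt.shapiro_wilk(data)

    # Outlier detection (IQR): flag points beyond 1.5 * IQR from the quartiles
    Q1, Q3 = np.percentile(data, [25, 75])
    IQR = Q3 - Q1
    outliers = data[(data < Q1 - 1.5*IQR) | (data > Q3 + 1.5*IQR)]

    return {
        'size': len(data),
        'is_normal': not normality.reject_null,
        'p_value': normality.p_value,
        'outliers': len(outliers)
    }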

A/B Testing Workflow

from pywayne.statistics import LocationTests, NormalityTests

def ab_test_analysis(control, treatment):
    nt = NormalityTests()
    lt = LocationTests()

    # Check normality on the first 100 observations of each group
    # (assumes the data are not sorted; otherwise draw a random subsample)
    norm_c = nt.shapiro_wilk(control[:100])
    norm_t = nt.shapiro_wilk(treatment[:100])

    # Choose appropriate test: parametric only if both groups look normal
    if norm_c.p_value > 0.05 and norm_t.p_value > 0.05:
        result = lt.two_sample_ttest(control, treatment)
    else:
        result = lt.mann_whitney_u(control, treatment)

    return {
        'test_used': result.test_name,
        'p_value': result.p_value,
        'significant': result.reject_null,
        'effect_size': result.effect_size
    }

Regression Model Diagnostics

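from pywayne.statistics import ModelDiagnostics

def diagnose_model(y, X, model):
    # model must already be fitted and expose predict(X)
    md = ModelDiagnostics()
    residuals = y - model.predict(X)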

    return {
        'heteroscedasticity_bp': md.breusch_pagan_test(residuals, X).reject_null,
        'autocorrelation_dw': md.durbin_watson_test(residuals).statistic,
        'residuals_normal': md.residual_normality_test(residuals).p_value,
        'vif_max': max(md.variance_inflation_factor(X))
    }
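
The `autocorrelation_dw` entry above is a raw statistic rather than a yes/no decision, so it helps to know how to read it. The sketch below computes the Durbin-Watson statistic by hand, DW = Σ(e_t − e_{t−1})² / Σ e_t², and applies a common rule of thumb. Both helper names and the 0.5 tolerance are illustrative assumptions, not pywayne behavior or a formal critical value.

```python
def durbin_watson_statistic(residuals):
    """Durbin-Watson statistic: ranges from 0 to 4, with values near 2 meaning no first-order autocorrelation."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


def interpret_durbin_watson(dw, tolerance=0.5):
    """Rule-of-thumb reading; the tolerance is a heuristic, not a critical value."""
    if dw < 2 - tolerance:
        return "positive autocorrelation suspected"
    if dw > 2 + tolerance:
        return "negative autocorrelation suspected"
    return "no strong first-order autocorrelation"


alternating = [1, -1, 1, -1]  # residuals that flip sign every step
dw = durbin_watson_statistic(alternating)
print(dw, interpret_durbin_watson(dw))
```

Residuals that flip sign at every step push the statistic toward 4, while residuals that drift slowly push it toward 0.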

Notes

  • All methods accept np.ndarray or list as input
  • All methods return TestResult with consistent interface
  • Always validate test assumptions before applying parametric tests
  • Apply multiple testing correction when performing several tests
  • Report effect sizes alongside p-values for complete interpretation
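
As an illustration of the last note, one widely used effect size for a two-group comparison is Cohen's d, the difference in means divided by the pooled standard deviation. The document does not say which effect size `result.effect_size` holds for each test, so the helper below is an independent sketch with a made-up name (`cohens_d`), not a description of the library.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    # statistics.variance is the sample variance (divides by n - 1)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    if pooled_variance == 0:
        raise ValueError("both groups have zero variance")
    return (mean(group_b) - mean(group_a)) / math.sqrt(pooled_variance)


control = [1, 2, 3, 4, 5]
treatment = [3, 4, 5, 6, 7]
print(f"Cohen's d: {cohens_d(control, treatment):.3f}")
```

A p-value says whether a difference is detectable; d says how large it is relative to the spread of the data, which is why the two are best reported together.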

Source

git clone https://clawhub.ai/wangyendt/statistics-2

Overview

Pywayne Statistics is a statistical testing library for hypothesis testing, A/B testing, and data analysis. It offers 37+ methods across five categories: normality tests, location tests, correlation tests, time series tests, and model diagnostics. All methods return a unified TestResult object that exposes the test statistic and p-value, plus a confidence interval and effect size where applicable.

How This Skill Works

Tests are organized into five classes: NormalityTests, LocationTests, CorrelationTests, TimeSeriesTests, and ModelDiagnostics. Each method accepts np.ndarray or list inputs and returns a TestResult with p_value, statistic, and reject_null, along with confidence_interval and effect_size where applicable, so results can be interpreted the same way across tests.

When to Use It

  • Perform hypothesis testing for A/B experiments to determine if there is a real effect between groups
  • Check data quality and normality before choosing parametric vs non-parametric tests
  • Validate regression model assumptions using diagnostics and residual analysis
  • Analyze time series data for stationarity, autocorrelation, and cointegration
  • Assess relationships and independence between variables using correlation and contingency tests

Quick Start

  1. Import modules, e.g., from pywayne.statistics import NormalityTests, LocationTests
  2. Create data and run tests (e.g., nt.shapiro_wilk(data) and lt.two_sample_ttest(group_a, group_b))
  3. Interpret results by checking result.p_value, result.reject_null, and result.effect_size

Best Practices

  • Run normality tests before selecting parametric tests to avoid invalid conclusions
  • Compare parametric and non-parametric options when assumptions are violated
  • Report p-values alongside effect sizes and confidence intervals for practical significance
  • Use a unified TestResult interface to compare results across tests
  • Ensure adequate sample size to achieve reliable p-values and stable estimates

Example Use Cases

  • A/B test quality check: test normality with Shapiro-Wilk and compare groups using two_sample_ttest
  • Non-parametric alternative: use Mann-Whitney U when normality fails
  • Exploring relationships: compute Pearson or Spearman correlations between features
  • Categorical analysis: test independence with Chi-square or Fisher's exact test on contingency tables
  • Model diagnostics: apply time-series and residual tests to validate forecasting or regression models
