How to Run a Power Analysis

This guide explains how to use GeoSC to evaluate test strength before running an experiment.

Running Power Analysis via CLI

Use the power command to simulate experiments with known treatment effects and estimate the resulting statistical power.

Packaged-install example:

geosc power --config /path/to/power_analysis_config.yaml --use-gpu --jobs -1

Source-checkout example with the shipped demo config:

geosc power --config data-config/power_analysis_config.yaml --use-gpu --jobs -1

Key Flags

  • --config: Path to the YAML configuration file.
  • --use-gpu: (Optional) Accelerates computation using compatible GPUs.
  • --jobs -1: Uses all available CPU cores for parallel processing.

Built wheel and sdist installs do not include data-config/, so packaged users should provide their own YAML path.
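For scripted runs, the invocation above can be wrapped in a small helper. The following is a minimal sketch, not part of GeoSC: build_power_command and run_power_analysis are hypothetical names, and the only flags used are the ones documented above. It assumes the geosc executable is on PATH and checks that the YAML file exists before calling it, which matters for packaged installs where data-config/ is not shipped.

```python
import shutil
import subprocess
from pathlib import Path


def build_power_command(config_path, use_gpu=False, jobs=-1):
    """Return the argv list for `geosc power` (hypothetical helper).

    Only the flags documented in this guide are used.
    """
    cmd = ["geosc", "power", "--config", str(config_path), "--jobs", str(jobs)]
    if use_gpu:
        cmd.append("--use-gpu")
    return cmd


def run_power_analysis(config_path, use_gpu=False, jobs=-1):
    """Validate the config path, then invoke the CLI (hypothetical wrapper)."""
    config_path = Path(config_path)
    if not config_path.is_file():
        # Packaged installs do not ship data-config/, so fail early and clearly.
        raise FileNotFoundError(f"Config not found: {config_path}")
    if shutil.which("geosc") is None:
        raise RuntimeError("geosc executable not found on PATH")
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(build_power_command(config_path, use_gpu, jobs), check=True)
```

Calling run_power_analysis("/path/to/power_analysis_config.yaml") would then be equivalent to the packaged-install example without --use-gpu.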

Expected Outputs

  • power_analysis_results.csv: Detailed grid of simulated lifts, sample sizes, power, Monte Carlo uncertainty, simulation failures, seed, DGP rank, feasible placebo support, and SparseSC/backend parameters.
  • power_curves.png: Visualization of the power curves across different minimum detectable effects (MDEs).

Rows with valid: false either exceeded the configured simulation failure tolerance or had no successful simulations. Do not use those rows for feasibility decisions.

Seeded runs use CPU generation for reproducible simulation draws, so the shipped demo config sets use_gpu: false.

Design Validity and Plot Filtering

The power calculator fails fast at construction with a ValueError if the design has no treatment units, or if there are fewer control units than treatment units. In that case placebo inference is infeasible and no power analysis is run.
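The two conditions above can be sketched as a standalone check. This is an illustration only, not GeoSC's actual constructor: check_design and its parameters are hypothetical names, and the error messages are invented for the example.

```python
def check_design(n_treated, n_control):
    """Raise ValueError for designs where placebo inference is infeasible.

    Hypothetical stand-in for the validation described above; the real
    calculator performs its own check at construction time.
    """
    if n_treated == 0:
        raise ValueError("Design has no treatment units.")
    if n_control < n_treated:
        raise ValueError(
            f"Design has fewer control units ({n_control}) than "
            f"treatment units ({n_treated}); placebo inference is infeasible."
        )


# A design with as many controls as treated units passes the check.
check_design(n_treated=3, n_control=3)

# A design with more treated units than controls is rejected.
try:
    check_design(n_treated=5, n_control=2)
except ValueError as exc:
    print(f"Rejected: {exc}")
```

Under the rule as stated, an equal number of control and treatment units is accepted; only strictly fewer controls triggers the error.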

plot_power_curves filters rows where valid is false out of the decision plots (logging the number of excluded rows), and skips plot creation entirely if no valid rows remain. Inspect power_analysis_results.csv directly to see why rows were marked invalid (high failure_rate, zero n_successful, etc.).
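The filtering behaviour can be approximated outside the plotting code when inspecting results by hand. The sketch below is an assumption-laden illustration, not the plot_power_curves implementation: select_plot_rows is a hypothetical helper, the sample rows are made up, the lift and power column names are guesses, and the valid column is assumed to serialize as true/false text. Only valid, failure_rate, and n_successful are column names taken from this guide.

```python
import csv
import io
import logging

logger = logging.getLogger("power_plot_sketch")


def is_valid(row):
    """Interpret the `valid` column, assuming it is stored as true/false text."""
    return str(row["valid"]).strip().lower() == "true"


def select_plot_rows(rows):
    """Drop invalid rows, log how many were excluded, return None if none remain."""
    valid_rows = [row for row in rows if is_valid(row)]
    excluded = len(rows) - len(valid_rows)
    if excluded:
        logger.warning("Excluding %d invalid row(s) from power plots", excluded)
    if not valid_rows:
        logger.warning("No valid rows remain; skipping plot creation")
        return None
    return valid_rows


# Made-up sample data; `lift` and `power` are hypothetical column names.
SAMPLE_CSV = """lift,power,failure_rate,n_successful,valid
0.02,0.41,0.00,200,true
0.05,0.83,0.01,198,true
0.10,,1.00,0,false
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
plot_rows = select_plot_rows(rows)
invalid_rows = [row for row in rows if not is_valid(row)]
```

Here invalid_rows holds the rows to inspect for a high failure_rate or a zero n_successful.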

Reproducibility

When random_seed is set, repeated runs of the same config produce identical power_analysis_results.csv content (excluding timestamp/path metadata), in both sequential and parallel modes. This is covered by both unit and CLI integration tests.
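One way to spot-check this locally is to hash the results of two runs while ignoring metadata columns. The following is a minimal sketch under stated assumptions: content_digest is a hypothetical helper, the metadata column names (timestamp, output_path) are placeholders to replace with whatever your results file actually contains, and the sample rows are invented. It is not the project's test suite.

```python
import csv
import hashlib
import io

# Hypothetical metadata column names; adjust to match the real CSV header.
METADATA_COLUMNS = frozenset({"timestamp", "output_path"})


def content_digest(csv_text, ignore=METADATA_COLUMNS):
    """Return a SHA-256 digest of the CSV rows, skipping ignored columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    digest = hashlib.sha256()
    for row in reader:
        kept = [(key, value) for key, value in sorted(row.items()) if key not in ignore]
        digest.update(repr(kept).encode("utf-8"))
    return digest.hexdigest()


# Invented sample output from two seeded runs and one run that differs.
RUN_A = "lift,power,timestamp\n0.05,0.83,2024-01-01T10:00:00\n"
RUN_B = "lift,power,timestamp\n0.05,0.83,2024-01-02T09:30:00\n"
RUN_C = "lift,power,timestamp\n0.05,0.79,2024-01-02T09:45:00\n"

same_content = content_digest(RUN_A) == content_digest(RUN_B)
different_content = content_digest(RUN_A) != content_digest(RUN_C)
```

With real output, read each power_analysis_results.csv with Path.read_text() and compare the two digests. Row order is part of the digest, so reordered but otherwise identical files would compare as different.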

Example Plot

The source-checkout demo workflow ships a representative power-curve output:

[Figure: Example power curves]