A package that supplies fully tested re-implementations of useful functions for significance testing: link

A Python implementation of standard statistical tests: link

A Python implementation of replicability analysis for experiments with multiple datasets: link

A Python implementation of a test to compare deep neural network models: link

An implementation of power analysis: link
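To give a rough idea of the kind of test these resources provide, here is a minimal sketch of a paired sign-flip permutation test in plain Python. It is not taken from any of the linked packages. The function name `paired_permutation_test`, its parameters, and the example scores are all hypothetical and chosen only for illustration.

```python
import random


def paired_permutation_test(scores_a, scores_b, n_resamples=10000, seed=0):
    """Two-sided paired sign-flip permutation test on the mean difference.

    Hypothetical helper for illustration; not the API of any linked package.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    if not scores_a:
        raise ValueError("score lists must not be empty")

    # Per-example differences between the two systems.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)

    rng = random.Random(seed)
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each difference is arbitrary,
        # so flip each sign independently with probability 0.5.
        resampled = sum(d if rng.random() < 0.5 else -d for d in diffs) / n
        if abs(resampled) >= observed:
            hits += 1

    # Add-one smoothing keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_resamples + 1)


# Made-up per-dataset scores for two systems, for illustration only.
system_a = [0.81, 0.79, 0.84, 0.80, 0.77, 0.83, 0.82, 0.78, 0.85, 0.80, 0.79, 0.86]
system_b = [0.76, 0.75, 0.80, 0.74, 0.73, 0.79, 0.77, 0.72, 0.80, 0.76, 0.74, 0.81]
p_value = paired_permutation_test(system_a, system_b)
print(f"estimated p-value: {p_value:.4f}")
```

The sketch uses only the standard library and a fixed seed so that repeated runs give the same estimate. For real experiments, the tested implementations listed above are the safer choice.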