A package that supplies fully tested re-implementations of useful functions for significance testing: link
A Python implementation of standard statistical tests: link
A Python implementation of replicability analysis for experiments with multiple datasets: link
A Python implementation of a test to compare deep neural network models: link
An implementation of power analysis: link