Under the null hypothesis, p-values are uniformly distributed
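The central limit theorem states that sample means are approximately normally distributed. People often confuse this with the distribution of p-values and question whether p-values can really be uniform under the null hypothesis. Shouldn't we observe extreme p-values much less often than ordinary p-values like 0.2 or 0.3?

Imagine the null hypothesis is true. This means there is no difference between treatment and control. The simulation below draws 100,000 samples of size 10 from a standard normal distribution, so the null hypothesis of zero mean holds, and records each sample's mean, t-statistic, and two-sided p-value.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from scipy import stats
    import matplotlib.pyplot as plt

    pval_lis = []
    t_stat_lis = []
    samp_mean_lis = []
    for i in range(100000):
        # Draw a sample for which the null hypothesis (mean = 0) is true
        sample = np.random.normal(0, 1, 10)
        samp_mean = sample.mean()
        samp_mean_lis.append(samp_mean)
        # One-sample t-statistic; ddof=1 gives the unbiased sample standard deviation
        t_stat = samp_mean / (sample.std(ddof=1) / np.sqrt(sample.size))
        t_stat_lis.append(t_stat)
        # Two-sided p-value from a t distribution with n - 1 degrees of freedom
        pval = stats.t.sf(abs(t_stat), df=sample.size - 1) * 2
        pval_lis.append(pval)

    plt.hist(np.array(samp_mean_lis))

Distribution of sample means ...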
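The uniformity claim can also be checked numerically, without a plot. The following is a minimal sketch, not part of the original analysis: it assumes a fixed seed, 20,000 simulated samples of size 10, and SciPy's built-in `stats.ttest_1samp` in place of a hand-computed t-statistic. The names `rng`, `samples`, `pvals`, `frac_below_05`, `counts`, and `ks` are illustrative choices.

```python
import numpy as np
from scipy import stats

# Assumption: fixed seed and 20,000 simulations, chosen only for reproducibility and speed
rng = np.random.default_rng(0)
samples = rng.normal(0, 1, size=(20000, 10))  # each row is one sample drawn under the null

# Two-sided one-sample t-test of "mean = 0", run on every row at once
pvals = stats.ttest_1samp(samples, popmean=0, axis=1).pvalue

# If p-values are uniform on [0, 1], about 5% of them should fall below 0.05
frac_below_05 = (pvals < 0.05).mean()

# ... and each of 10 equal-width bins should hold roughly 10% of the p-values
counts, _ = np.histogram(pvals, bins=10, range=(0, 1))

# Kolmogorov-Smirnov distance between the p-values and the Uniform(0, 1) distribution
ks = stats.kstest(pvals, "uniform")

print(f"fraction below 0.05: {frac_below_05:.4f}")
print("bin counts:", counts)
print(f"KS statistic: {ks.statistic:.4f}")
```

Roughly equal bin counts and a small KS statistic are what a uniform distribution would produce. The sample means from the same simulation would still look bell-shaped, which is why the two facts do not contradict each other: the p-value is a transformation of the test statistic through its own distribution function, and that transformation flattens the distribution.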