Suppose that I have a series of data:
age;height 8;120 8;123 8;130 8;125 10;160 9;158 8;120 7;126 6;98 5;97 7;115 7;120 7;118 8;117 6;97 6;99 9;123 10;157 10;155 9;155 9;153 5;96 7;115 6;94 6;94 5;87 8;117 6;96 5;97 6;91 6;88 9;149 6;94 8;117 10;156 10;160 6;90 6;90 7;116 5;89 6;90 7;118 10;162
I would like to assess normality with the Kolmogorov-Smirnov test in both SPSS and Python. SPSS gives:
| variables | statistics | sig |
|---|---|---|
| age | 0.190 | 0.000 |
| height | 0.173 | 0.002 |
I tried to compare using Python with this code:
import numpy as np
import pandas as pd
from scipy.stats import kstest
from scipy.stats import norm
data = pd.DataFrame([[8, 120], [8, 123], [8, 130], [8, 125], [10, 160], [9, 158], [8, 120], [7, 126], [6, 98], [5, 97], [7, 115], [7, 120], [7, 118], [8, 117], [6, 97], [6, 99], [9, 123], [10, 157], [10, 155], [9, 155], [9, 153], [5, 96], [7, 115], [6, 94], [6, 94], [5, 87], [8, 117], [6, 96], [5, 97], [6, 91], [6, 88], [9, 149], [6, 94], [8, 117], [10, 156], [10, 160], [6, 90], [6, 90], [7, 116], [5, 89], [6, 90], [7, 118], [10, 162]], columns=['age', 'height'])
x = np.log(data.age)
n = norm(loc=0,scale=1)
kstest(x, n.cdf)
which gives:
KstestResult(statistic=0.9462396895483368, pvalue=5.139087762288979e-55)
Even if I don't log-transform the data, the result is still different:
kstest(data.age, n.cdf)
which gives:
KstestResult(statistic=0.9999997133484281, pvalue=9.27397852188504e-282)
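The SciPy calculation is correct given your input: the KS statistic is the maximum absolute difference between the empirical CDF and the provided CDF, evaluated at the data points. Here the provided CDF is that of the standard normal distribution, and every age is at least 5, so that CDF is essentially 1 over the whole sample and the statistic is close to 1. For the log-transformed data the smallest value is log(5) ≈ 1.61, where the standard normal CDF is already about 0.946, which is the statistic you got.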
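To make that definition concrete, here is a minimal sketch that computes the statistic by hand for the raw `age` values and compares it with `kstest`. The variable names (`d_plus`, `d_minus`, `d_manual`) are my own, and the ages are copied from the question.

```python
import numpy as np
from scipy.stats import kstest, norm

# Ages copied from the question (43 observations).
age = np.array([8, 8, 8, 8, 10, 9, 8, 7, 6, 5, 7, 7, 7, 8, 6, 6, 9, 10,
                10, 9, 9, 5, 7, 6, 6, 5, 8, 6, 5, 6, 6, 9, 6, 8, 10, 10,
                6, 6, 7, 5, 6, 7, 10])

x = np.sort(age)
n = len(x)
cdf = norm.cdf(x)  # standard normal CDF evaluated at the sorted data

# The empirical CDF jumps at each sorted point, so compare the reference
# CDF with the ECDF just after and just before each jump.
d_plus = np.max(np.arange(1, n + 1) / n - cdf)
d_minus = np.max(cdf - np.arange(0, n) / n)
d_manual = max(d_plus, d_minus)

result = kstest(age, norm.cdf)
print(d_manual, result.statistic)
```

Both numbers should agree with the `0.9999997...` shown in the question: the largest gap occurs at the smallest observation, where the empirical CDF is still 0 but the standard normal CDF is already almost 1.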
The SPSS code is not provided, so I cannot assess the reason for the discrepancy. Perhaps SPSS is not testing the null hypothesis that the data follow the standard normal distribution (which they clearly do not). Instead, perhaps it is performing Lilliefors' test, which uses the KS statistic to test the null hypothesis that the data follow a normal distribution in which the parameters `loc` and `scale` are treated as unknown.
If you want to perform such a test, there are many more powerful options available. Consider the Shapiro-Wilk test.
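As a rough check of that guess, the sketch below runs `kstest` against a normal distribution whose `loc` and `scale` are estimated from the sample, and then runs `scipy.stats.shapiro`. It assumes SPSS uses the sample mean and the sample standard deviation (`ddof=1`); that is my assumption, not something confirmed from the SPSS output.

```python
import numpy as np
from scipy.stats import kstest, norm, shapiro

# Ages copied from the question (43 observations).
age = np.array([8, 8, 8, 8, 10, 9, 8, 7, 6, 5, 7, 7, 7, 8, 6, 6, 9, 10,
                10, 9, 9, 5, 7, 6, 6, 5, 8, 6, 5, 6, 6, 9, 6, 8, 10, 10,
                6, 6, 7, 5, 6, 7, 10])

# Assumption: loc and scale are estimated from the data, with the
# sample (ddof=1) standard deviation.
fitted = norm(loc=age.mean(), scale=age.std(ddof=1))

# The statistic is the usual KS distance, now against the fitted normal.
# The p-value reported here ignores that the parameters were estimated
# from the same data, so it should not be used as is.
ks_fitted = kstest(age, fitted.cdf)
print("KS statistic with fitted parameters:", ks_fitted.statistic)

# Shapiro-Wilk tests normality with unknown mean and variance directly.
sw = shapiro(age)
print("Shapiro-Wilk:", sw.statistic, sw.pvalue)
```

With these data the fitted-parameter statistic should come out close to the 0.190 that SPSS reports for `age`, which is consistent with SPSS estimating the parameters. For a valid p-value you need the Lilliefors correction; if statsmodels is installed, `statsmodels.stats.diagnostic.lilliefors` is one implementation.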