
Introduction to Non-parametric Statistical Significance Tests in Python

Nonparametric statistics are methods that do not assume a specific distribution for the data. Often, they refer to statistical methods that do not assume a Gaussian distribution. They were developed for use with ordinal or interval data, but in practice can also be used with a ranking of real-valued observations in a data sample rather than with the observation values themselves.

A common question about two or more datasets is whether they are different. Specifically, whether the difference between their central tendencies (e.g. mean or median) is statistically significant. This question can be answered for data samples that do not have a Gaussian distribution by using nonparametric statistical significance tests. The null hypothesis of these tests is often the assumption that both samples were drawn from a population with the same distribution, and therefore the same population parameters, such as the mean or median.

If, after calculating the significance test on two or more samples, the null hypothesis is rejected, it indicates that there is evidence to suggest that the samples were drawn from different populations, and in turn that the difference between sample estimates of population parameters, such as means or medians, may be significant. These tests are often used on samples of model skill scores in order to confirm that the difference in skill between machine learning models is significant.



In general, each test calculates a test statistic that must be interpreted with some background in statistics and a deeper knowledge of the statistical test itself. Tests also return a p-value that can be used to interpret the result of the test. The p-value can be thought of as the probability of observing the two data samples given the base assumption (null hypothesis) that the two samples were drawn from a population with the same distribution.

The p-value can be interpreted in the context of a chosen significance level called alpha. A common value for alpha is 5%, or 0.05. If the p-value is below the significance level, then the test says there is enough evidence to reject the null hypothesis and that the samples were likely drawn from populations with differing distributions.




  • p <= alpha: reject H0, different distribution.

  • p > alpha: fail to reject H0, same distribution.
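As a minimal sketch of this decision rule (the p-value below is a hypothetical placeholder; in practice it would come from one of the SciPy test functions covered in the sections that follow):

# interpret a p-value against a chosen significance level (alpha)
alpha = 0.05   # chosen significance level
p = 0.02       # hypothetical p-value returned by a significance test

if p <= alpha:
    print('Reject H0: samples likely drawn from different distributions')
else:
    print('Fail to reject H0: samples likely drawn from the same distribution')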



Mann-Whitney U Test

The Mann-Whitney U test is a nonparametric statistical significance test for determining whether two independent samples were drawn from a population with the same distribution. The test was named for Henry Mann and Donald Whitney, although it is sometimes called the Wilcoxon-Mann-Whitney test, also named for Frank Wilcoxon, who developed a variation of the test.

The default assumption or null hypothesis is that there is no difference between the distributions of the data samples. Rejection of this hypothesis suggests that there is likely some difference between the samples. More specifically, the test determines whether it is equally likely that any randomly selected observation from one sample will be greater or less than an observation in the other sample. If this is violated, it suggests differing distributions.




  • Fail to Reject H0: Sample distributions are equal.

  • Reject H0: Sample distributions are not equal.



For the test to be effective, it requires at least 20 observations in each data sample. We can implement the Mann-Whitney U test in Python using the mannwhitneyu() SciPy function. The function takes the two data samples as arguments and returns the test statistic and the p-value.
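As a minimal sketch, and assuming two synthetic uniform samples of 100 observations each stand in for real data (the samples, seed, and 5% alpha are illustrative choices, not prescribed values):

# Mann-Whitney U test on two independent (synthetic) samples
from numpy.random import seed, rand
from scipy.stats import mannwhitneyu

seed(1)
# two independent samples drawn from slightly shifted uniform distributions
data1 = 50 + (rand(100) * 10)
data2 = 51 + (rand(100) * 10)

# compare the samples
stat, p = mannwhitneyu(data1, data2)
print('Statistic=%.3f, p=%.3f' % (stat, p))

# interpret the p-value at the chosen significance level
alpha = 0.05
if p > alpha:
    print('Same distribution (fail to reject H0)')
else:
    print('Different distribution (reject H0)')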



You may also like to read:



https://innovationalofficesolution.com/Blog/detail/maximising-data-analytics:-microsoft-fabric-vs.-power-bi



Wilcoxon Signed-Rank Test

In some cases, the data samples may be paired. There are many reasons why this may be the case; for example, the samples are related or matched in some way, or represent two measurements of the same technique. More specifically, each sample is independent, but comes from the same population. Examples of paired samples in machine learning might be the same algorithm evaluated on different datasets, or different algorithms evaluated on exactly the same training and test data.

Because the samples are not independent, the Mann-Whitney U test cannot be used. Instead, the Wilcoxon signed-rank test is used, also called the Wilcoxon T test, named for Frank Wilcoxon. It is the equivalent of the paired Student's t-test, but for ranked data instead of real-valued data with a Gaussian distribution. The default assumption for the test, the null hypothesis, is that the two samples have the same distribution.




  • Fail to Reject H0: Sample distributions are equal.

  • Reject H0: Sample distributions are not equal.



For the test to be effective, it requires at least 20 observations in each data sample. The Wilcoxon signed-rank test can be implemented in Python using the wilcoxon() SciPy function. The function takes the two samples as arguments and returns the calculated statistic and p-value.
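A minimal sketch along the same lines, again with synthetic uniform samples standing in for two paired sets of measurements (an illustrative assumption):

# Wilcoxon signed-rank test on two paired (synthetic) samples
from numpy.random import seed, rand
from scipy.stats import wilcoxon

seed(1)
# two paired samples; in practice these would be two measurements of the
# same subjects, or matched results from two algorithms on the same data
data1 = 50 + (rand(100) * 10)
data2 = 51 + (rand(100) * 10)

# compare the paired samples
stat, p = wilcoxon(data1, data2)
print('Statistic=%.3f, p=%.3f' % (stat, p))

# interpret the p-value at the chosen significance level
alpha = 0.05
if p > alpha:
    print('Same distribution (fail to reject H0)')
else:
    print('Different distribution (reject H0)')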



Kruskal-Wallis H Test



When working with significance tests such as the Mann-Whitney U and the Wilcoxon signed-rank tests, comparisons between data samples must be performed pairwise. This can be inefficient if you have many data samples and you are only interested in whether two or more samples have a different distribution.

The Kruskal-Wallis test is a nonparametric version of the one-way analysis of variance test, or ANOVA for short. It is named for the developers of the method, William Kruskal and Wilson Allen Wallis. This test can be used to determine whether more than two independent samples have a different distribution. It can be thought of as the generalization of the Mann-Whitney U test.

The default assumption or null hypothesis is that all data samples were drawn from the same distribution. Specifically, that the population medians of all groups are equal. A rejection of the null hypothesis indicates that there is enough evidence to suggest that one or more samples dominate another sample, but the test does not indicate which samples or by how much.




  • Fail to Reject H0: All sample distributions are equal.

  • Reject H0: One or more sample distributions are not equal.



Each data sample must be independent, have 5 or more observations, and the data samples can differ in size. For example, we can use a test problem with three data samples instead of two, where two of the samples have the same sample mean and one differs. Given that one sample differs, we would expect the test to discover the difference and reject the null hypothesis. The Kruskal-Wallis H test can be implemented in Python using the kruskal() SciPy function. It takes two or more data samples as arguments and returns the test statistic and p-value as the result.
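A minimal sketch of this setup, with three synthetic uniform samples of which the third has a shifted mean (again, purely illustrative assumptions):

# Kruskal-Wallis H test on three independent (synthetic) samples
from numpy.random import seed, rand
from scipy.stats import kruskal

seed(1)
# two samples share the same mean; the third is shifted
data1 = 50 + (rand(100) * 10)
data2 = 50 + (rand(100) * 10)
data3 = 52 + (rand(100) * 10)

# compare all three samples at once
stat, p = kruskal(data1, data2, data3)
print('Statistic=%.3f, p=%.3f' % (stat, p))

# interpret the p-value at the chosen significance level
alpha = 0.05
if p > alpha:
    print('All sample distributions are equal (fail to reject H0)')
else:
    print('One or more sample distributions are not equal (reject H0)')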



Visit: https://innovationalofficesolution.com



You may also like to read:



https://innovationalofficesolution.com/Blog/detail/office-solution-helped-medical-wearable-device-company-to-get-more-clients.



 

