I am rather confused about the Mann-Whitney test. Many sources state that it tests for equality of distributions between two populations, while others state that it tests only for equality of means, medians, or central tendency. I ran some simple tests, and they suggest it tests only for central tendency, not shape. Why do many books state distribution equality (pdf)? Can you please explain?
Distribution equality statements
Sheldon Ross' book: Suppose that one is considering two different methods of production in determining whether the two methods result in statistically identical items. To attack this problem let X1,...,Xn, Y1,...,Ym denote samples of the measurable values of items by method 1 and method 2. If we let F and G, both assumed to be continuous, denote the distribution functions of the two samples, respectively, then the hypothesis we wish to test is H0:F=G. One procedure for testing H0 is the Mann-Whitney test.
Some Caltech notes: Now suppose we have two samples. We want to know whether they could have been drawn from the same population, or from different populations, and, if the latter, whether they differ in some predicted direction. Again assume we know nothing about probability distributions, so that we need non-parametric tests. Mann-Whitney (Wilcoxon) U test. There are two samples, A (m members) and B (n members); H0 is that A and B are from the same distribution or have the same parent population.
Wikipedia: This test can be used to investigate whether two independent samples were selected from populations having the same distribution.
Nonparametric Statistical Tests: The null hypothesis is H0: θ = 0; that is, there is no difference at all between the distribution functions F and G.
But when I test with F=N(0,10) and G=U(-3,3), the p-value is very high. The two could hardly be more different, apart from both being symmetric with E(F)=E(G).
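A possible way to see why the p-value comes out high: the U statistic only counts how often a value from one sample falls below a value from the other. The sketch below is a hand-rolled illustration, not any library's implementation; the function name `mann_whitney_u` and the two toy samples are assumptions made up for this example. Both samples are symmetric about 0 but differ in spread by a factor of 100.

```python
def mann_whitney_u(x, y):
    """Count pairs with x_i < y_j, counting ties as one half."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi < yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u

# Hypothetical deterministic samples (no random draws), both symmetric about 0.
narrow = [i / 10 for i in range(-5, 6) if i != 0]         # -0.5 ... 0.5
wide = [float(i) for i in range(-50, 51, 10) if i != 0]   # -50 ... 50

u = mann_whitney_u(narrow, wide)
p_hat = u / (len(narrow) * len(wide))   # estimates P(X < Y) + 0.5 * P(X = Y)
print(u, p_hat)                         # 50.0 0.5
```

Here U is exactly n*m/2, the value expected under the null, even though the spreads are nothing alike. That is consistent with the high p-value: for two symmetric distributions with the same centre, P(X < Y) is 1/2 whatever their spreads.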
Mean/median equality statements
- Article: The Mann–Whitney U-test can be used when the aim is to show a difference between two groups in the value of an ordinal, interval or ratio variable. It is the non-parametric version of the t-test.
- Test results
# Octave
pkg load statistics                    # load the Octave statistics package
x  = normrnd(0, 1,  [1, 100]);         # 100 draws from N(0, sd=1)
y1 = normrnd(0, 3,  [1, 100]);         # 100 draws from N(0, sd=3)
y2 = normrnd(0, 20, [1, 100]);         # 100 draws from N(0, sd=20)
y3 = unifrnd(-5, 5, [1, 100]);         # 100 draws from U(-5,5)
[p, ks] = kolmogorov_smirnov_test(y1, "norm", 0, 1)  # KS test: is y1 ~ N(0,1)?
# p = 0.000002: rejected, N(0,3) is not N(0,1)
[p, z] = u_test(x, y1)                 # Mann-Whitney: x~N(0,1) vs y1~N(0,3)
# p = 0.52: null not rejected
[p, z] = u_test(x, y2)                 # Mann-Whitney: x~N(0,1) vs y2~N(0,20)
# p = 0.32: null not rejected
[p, z] = u_test(x, y3)                 # Mann-Whitney: x~N(0,1) vs y3~U(-5,5)
# p = 0.15: null not rejected
# Apparently, Mann-Whitney doesn't test pdf equality
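The same contrast can be sketched without random draws. The Python below uses only the standard library. It compares what Mann-Whitney looks at (the share of pairs with x < y) with what a two-sample KS test looks at (the largest gap between the empirical CDFs). Note that this is the two-sample KS statistic, not the one-sample test against N(0,1) used above. The function names and the two samples are hypothetical, and the 5% cut-off is the usual large-sample approximation, so treat it as rough for samples this small.

```python
import math
from bisect import bisect_right

def ks_two_sample_d(x, y):
    """Largest vertical gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    return max(abs(bisect_right(xs, t) / n - bisect_right(ys, t) / m)
               for t in xs + ys)

def pair_fraction(x, y):
    """Share of pairs with x_i < y_j (ties count one half); equals U / (n*m)."""
    return sum((a < b) + 0.5 * (a == b) for a in x for b in y) / (len(x) * len(y))

# Hypothetical deterministic samples, both symmetric about 0.
narrow = [(2 * i - 39) / 40 for i in range(40)]   # -0.975 ... 0.975
wide = [(2 * i - 39) / 2 for i in range(40)]      # -19.5 ... 19.5

d = ks_two_sample_d(narrow, wide)
n, m = len(narrow), len(wide)
# Large-sample 5% critical value: c(alpha) * sqrt((n + m) / (n * m)).
d_crit = math.sqrt(-0.5 * math.log(0.05 / 2)) * math.sqrt((n + m) / (n * m))

print(pair_fraction(narrow, wide))   # what Mann-Whitney looks at
print(d, d_crit, d > d_crit)         # what KS looks at
```

On these samples the pair fraction is exactly one half, so Mann-Whitney has nothing to detect, while the KS gap is well above the approximate critical value.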
Confusing
- Nonparametric Statistical Methods, 3rd Edition: I don't understand how its H0: E(Y)-E(X) = 0 (no shift) can be deduced from (4.2), which seems to suggest pdf equality (equal higher moments) apart from the shift.
- Article: The test can detect differences in shape and spread as well as just differences in medians. Differences in population medians are often accompanied by equally important differences in shape. Really? How? I am confused.
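One hedged illustration of how shape can matter even when medians agree: the quantity behind U is the share of pairs with x < y, and that share can move away from one half through skewness alone. The two toy samples and the helper name `prob_less` below are made up for this sketch. With only five points each, nothing here would be statistically significant; the point is only what the statistic measures.

```python
from statistics import median

def prob_less(x, y):
    """Share of pairs with x_i < y_j, ties counted as one half (U / (n*m))."""
    wins = sum((xi < yj) + 0.5 * (xi == yj) for xi in x for yj in y)
    return wins / (len(x) * len(y))

# Hypothetical toy samples: identical medians, mirror-image skew.
right_skewed = [-2, -1, 0, 10, 20]
left_skewed = [-20, -10, 0, 1, 2]

print(median(right_skewed), median(left_skewed))   # 0 0
print(prob_less(right_skewed, left_skewed))        # 0.34
```

Both medians are 0, yet only 34% of the pairs have the right-skewed value below the left-skewed one. With enough data of this kind, Mann-Whitney would reject although the medians are equal.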
After-thoughts
It seems many notes teach MW in a duck-typing way: MW is introduced as a duck because, if we focus only on the key behaviours of a duck (quack=pdf, swim=shape), MW does look like a duck (a location-shift test). Most of the time a duck and Donald Duck do not behave markedly differently, so such a description of MW seems fine and easy to understand; but when Donald Duck dominates a duck whilst still quacking like a duck, MW can show significance, baffling unsuspecting students. This is not the students' fault but a pedagogical mistake: claiming Donald Duck is a duck without clarifying that he can be un-duck at times.
Also, my feeling is that in parametric hypothesis testing, tests are introduced with their purpose framed in H0, making H1 implicit. Many authors move on to nonparametric testing without first highlighting the difference in how the test statistic's probabilities are obtained (permuting the X and Y samples under H0), so students continue to differentiate tests by looking at H0.
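To make the permutation idea concrete, here is a minimal sketch of an exact two-sided permutation p-value for U. It assumes only that the group labels are exchangeable when F=G. The function names are hypothetical, and the brute-force enumeration is practical only for very small samples.

```python
from itertools import combinations

def u_stat(x, y):
    """Count pairs with x_i < y_j, ties counted as one half."""
    return sum((xi < yj) + 0.5 * (xi == yj) for xi in x for yj in y)

def exact_permutation_pvalue(x, y):
    """Two-sided exact p-value from every relabelling of the pooled data."""
    pooled = list(x) + list(y)
    n, total = len(x), len(x) + len(y)
    center = n * len(y) / 2                 # value of U expected when F = G
    observed = abs(u_stat(x, y) - center)
    hits = count = 0
    for idx in combinations(range(total), n):
        chosen = set(idx)
        px = [pooled[i] for i in idx]
        py = [pooled[i] for i in range(total) if i not in chosen]
        count += 1
        hits += abs(u_stat(px, py) - center) >= observed
    return hits / count

print(exact_permutation_pvalue([1, 2, 3], [4, 5, 6]))   # 0.1
```

With three values per group there are 20 relabellings, and only 2 of them are as extreme as the observed split, hence 0.1. The null distribution comes entirely from relabelling under F=G; what the test can detect is decided by the statistic, here U.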
For example, we are taught to use the t-test for H0: μX = μY and the F-test for H0: σX = σY, with H1: μX ≠ μY and H1: σX ≠ σY implicit. In permutation-type tests, on the other hand, we need to be explicit about what we test in H1, as H0: F=G is trivially true for all tests of a permutation nature. So instead of seeing H0: F=G and automatically thinking of H1: F≠G, so it must be a K-S test, we should rather pay attention to H1 in deciding what is under analysis, and pick a test (KS, MW) accordingly.
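Written out, the afterthought might look like the following. This is a sketch under assumptions: the labels on the alternatives are mine, Δ denotes a shift parameter, and (4.2) in Nonparametric Statistical Methods is assumed to be the usual location-shift model.

$$
\begin{aligned}
H_0 &: F = G \quad \text{(labels exchangeable, so the permutation distribution of } U \text{ is known)} \\
H_1^{\mathrm{KS}} &: F(t) \neq G(t) \ \text{for some } t \\
H_1^{\mathrm{MW}} &: P(X < Y) \neq \tfrac{1}{2} \\
\text{shift model} &: G(t) = F(t - \Delta) \ \text{for all } t, \qquad H_0 : \Delta = 0, \qquad \Delta = E(Y) - E(X) \ \text{when the means exist}
\end{aligned}
$$

Under the shift model, Y is distributed as X + Δ, so Δ = 0 is the same statement as F = G, and E(Y) - E(X) = 0 is just another way of writing it. Outside that model, F ≠ G is compatible with P(X < Y) = 1/2, which is the case where KS can reject and MW cannot.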