Dictionaries of Paradoxes for Statistical Tests on k Samples

Abstract
A relationship between the Kruskal–Wallis nonparametric statistical test on k samples and the Borda positional method for voting is established and then exploited to gain a complete analysis of the counterintuitive results and statistical orderings that arise when a set of data is restricted to various subsets of the k samples. This is done by introducing and examining the “dictionary” of possible orderings of the samples occurring with the Kruskal–Wallis procedure. It is shown that using ranks, as opposed to any other weights, minimizes the number and kinds of projection paradoxes, that is, paradoxes that arise from examining subsets of the data. The idea of a dictionary for a statistical procedure is discussed. The dictionaries for the Deshpandé class of statistical procedures (including Bhapkar's V test and Deshpandé's L test) are computed. In a few cases, the relative numbers of paradoxes for the Kruskal–Wallis test and any other such test are estimated by comparing the relative sizes of their dictionaries. An analysis is given of the method of selecting the “best” of several samples by recursively narrowing down the set of samples that one is “reasonably certain” must contain the optimal choice. It is shown that the best sample may depend on which recursive method is chosen: for the same data, different recursive methods can yield different outcomes.
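
To make the notion of a projection paradox concrete, the following is a minimal Python sketch. It orders samples by their mean rank in the pooled data, which is the quantity the Kruskal–Wallis statistic is built from. The data values, the sample names A, B, C, the helper `mean_rank_ordering`, and the convention that a higher mean rank counts as "better" are all assumptions made for illustration and are not taken from the paper. The sketch also assumes there are no tied observations.

```python
def mean_rank_ordering(samples, subset=None):
    """Order sample names by mean rank in the pooled data, highest first.

    `samples` maps a sample name to a list of observations.
    `subset` restricts the pooling to the named samples (default: all).
    Assumes no tied values, so plain positional ranks suffice.
    """
    names = sorted(samples) if subset is None else list(subset)
    # Pool the observations of the chosen samples and sort them by value.
    pooled = sorted((value, name) for name in names for value in samples[name])
    # Give rank 1 to the smallest pooled value, rank 2 to the next, and so on.
    rank_sums = {name: 0 for name in names}
    for rank, (_, name) in enumerate(pooled, start=1):
        rank_sums[name] += rank
    mean_ranks = {name: rank_sums[name] / len(samples[name]) for name in names}
    return sorted(names, key=lambda name: mean_ranks[name], reverse=True)


# Hypothetical data, constructed only to exhibit the effect.
data = {"A": [1, 7, 9], "B": [5, 6, 8], "C": [2, 3, 4]}

full = mean_rank_ordering(data)              # all three samples pooled
pair = mean_rank_ordering(data, ["A", "B"])  # data restricted to A and B
print(full, pair)
```

On these made-up values, pooling all three samples gives rank sums of 17 for A, 19 for B, and 9 for C, so B is ranked above A. When sample C is dropped and the remaining six observations are re-ranked, A has rank sum 11 and B has rank sum 10, so A is ranked above B. The relative order of two samples thus changes with the subset of samples examined.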
