To use the module in an S-Lang script, it must first be loaded into
the interpreter. The standard way to do this is with the
require function, e.g.,
require ("stats");
To load it into a specific namespace, e.g., "S", use
require ("stats", "S");
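Functions loaded this way are then reached through the S-Lang
namespace operator ->. The following is a minimal sketch, assuming the
module was loaded into the "S" namespace as above; the arrays x and y
are made-up illustration data, not values from the module
documentation.

```slang
require ("stats", "S");

% x and y are hypothetical sample data used only for illustration
variable x = [0.1, 0.4, 0.7, 1.3, 2.2];
variable y = [0.3, 0.5, 0.9, 1.1, 1.8];

% Call ks_test2 through the "S" namespace rather than the global one
variable p = S->ks_test2 (x, y);
vmessage ("p = %g", p);
```

Loading into a namespace keeps the module's function names from
colliding with names already defined in the script.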
Most of the stats module's functions provide a brief usage
message when called without arguments, e.g.,
slsh> chisqr_test;
Usage: p=chisqr_test(X,Y,...,Z [,&T])
More detailed help is available using the help function:
slsh> help chisqr_test
chisqr_test
SYNOPSIS
Apply the Chi-square test to a two or more datasets
USAGE
prob = chisqr_test (X_1, X_2, ..., X_N [,&t])
DESCRIPTION
This function applies the Chi-square test to the N datasets
.
.
To illustrate the use of the module, consider the task of comparing
Gaussian-distributed random numbers to a uniform distribution of
numbers. In the following, the ran_gaussian function from the
GNU Scientific Library module will be used to generate the
Gaussian-distributed random numbers.
First, load the stats and gslrand modules into slsh:
slsh> require ("gslrand");
slsh> require ("stats");
Now generate 10 random numbers with a variance of 1.0 using the
ran_gaussian function, and assign the resulting array to the
variable g:
slsh> g = ran_gaussian (1.0, 10);
Similarly, assign to u an array of 10 uniformly spaced numbers
from -3 to 3:
slsh> u = [-3:3:#10];
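The [a:b:#n] form produces n evenly spaced values that include both
endpoints. This can be checked with a short sketch that uses only core
S-Lang intrinsics:

```slang
variable u = [-3:3:#10];

vmessage ("length = %d", length (u));              % number of elements
vmessage ("first = %g, last = %g", u[0], u[-1]);   % the two endpoints
vmessage ("step = %g", u[1] - u[0]);               % spacing, (3 - (-3))/9
```

Note that u is a regular grid rather than a random sample, so it
stands in for an idealized uniform distribution on the interval.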
These two datasets may be compared using the stats module's
two-sample non-parametric tests. First, the Kolmogorov-Smirnov test
may be applied using ks_test2:
slsh> ks_test2 (g,u);
0.78693
This shows a p-value of about 0.79, so the test finds no significant
difference between these distributions. Similarly, the
Mann-Whitney-Wilcoxon and Kuiper tests yield p-values of 0.97 and
0.46, respectively:
slsh> mw_test (g,u);
0.970512
slsh> kuiper_test2 (g,u);
0.462481
Instead of 10 points per dataset, perform the tests using 100 points:
slsh> g = ran_gaussian (1.0, 100);
slsh> u = [-3:3:#100];
slsh> ks_test2 (g,u);
0.00613403
slsh> mw_test (g,u);
0.741508
slsh> kuiper_test2 (g,u);
1.38757e-06
As this example shows, both the Kolmogorov-Smirnov and Kuiper tests
found significant differences between the datasets, whereas the
Mann-Whitney-Wilcoxon test failed to find a significant difference.
The reason the Mann-Whitney-Wilcoxon test failed to find a
difference is that it assumes that the underlying distributions
have the same shape but may differ in location. Clearly the
distributions represented by g and u violate this assumption: they
share a location but differ in shape.
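The whole comparison can also be written as a script instead of typed
interactively. The sketch below repeats the three tests for both
sample sizes and prints the median and standard deviation of each
dataset. It assumes that the median and stddev functions are available
from the stats module; check their usage messages before relying on
them. Because g is random, the p-values differ from run to run and
will not match the numbers shown above.

```slang
require ("gslrand");
require ("stats");

variable n, g, u;
foreach n ([10, 100])
  {
     g = ran_gaussian (1.0, n);     % n Gaussian random numbers
     u = [-3:3:#n];                 % n evenly spaced values on [-3,3]

     vmessage ("N=%d  ks=%g  mw=%g  kuiper=%g",
               n, ks_test2 (g,u), mw_test (g,u), kuiper_test2 (g,u));

     % median and stddev are assumed to be provided by the stats module
     vmessage ("      median: g=%g u=%g   stddev: g=%g u=%g",
               median (g), median (u), stddev (g), stddev (u));
  }
```

The medians of both datasets should lie near 0, while the spread of u
is noticeably larger than that of g (a continuous uniform distribution
on [-3,3] has a standard deviation of sqrt(3), about 1.73). The two
datasets thus agree in location and differ in shape, which is the
situation the Mann-Whitney-Wilcoxon test is not designed to detect.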