Statistics - Kolmogorov Smirnov Test
This test is used in situations where a comparison has to be made between an observed sample distribution and a theoretical distribution.
K-S One Sample Test
This test is used as a test of goodness of fit and is ideal when the size of the sample is small. It compares the cumulative distribution function of a variable with a specified theoretical distribution. The null hypothesis assumes no difference between the observed and theoretical distributions, and the value of the test statistic 'D' is calculated as:
Formula
${D = \max |F_o(X)-F_r(X)|}$
Where −
${F_o(X)}$ = Observed cumulative frequency distribution of a random sample of n observations.
and ${F_o(X) = \frac{k}{n}}$, where ${k}$ = number of observations ≤ X and ${n}$ = total number of observations.
${F_r(X)}$ = The theoretical frequency distribution.
The critical value of ${D}$ is found from the K-S table of critical values for the one-sample test.
Acceptance Criteria: If the calculated value is less than the critical value, accept the null hypothesis.
Rejection Criteria: If the calculated value is greater than the critical value, reject the null hypothesis.
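For readers who want to verify the formula numerically, the sketch below computes D for a simulated sample tested against a standard normal distribution. The sample data and the choice of a normal CDF are assumptions made purely for illustration (they do not come from the text), and SciPy's built-in kstest is shown only as a cross-check.

```python
# A minimal sketch of the one-sample K-S statistic, assuming an
# illustrative simulated sample tested against a standard normal CDF.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(size=50)          # hypothetical data for demonstration

x = np.sort(sample)
n = len(x)
F_o = np.arange(1, n + 1) / n         # observed cumulative frequencies k/n
F_r = stats.norm.cdf(x)               # theoretical CDF evaluated at each X

D = np.max(np.abs(F_o - F_r))         # D = max |F_o(X) - F_r(X)|
print("D =", D)

# SciPy's built-in test also checks the deviation just below each jump,
# so its reported statistic can be slightly larger than the simple k/n form.
print(stats.kstest(sample, "norm"))
```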
Example
Problem Statement:
In a study conducted across the various streams of a college, 60 students, with an equal number drawn from each stream, were interviewed and their intention to join the Drama Club of the college was noted.
| Stream | B.Sc. | B.A. | B.Com | M.A. | M.Com |
|---|---|---|---|---|---|
| No. in each class | 5 | 9 | 11 | 16 | 19 |
It was expected that 12 students from each stream would join the Drama Club. Use the K-S test to find whether there is any difference among the student streams with regard to their intention of joining the Drama Club.
Solution:
${H_o}$: There is no difference among students of different streams with respect to their intention of joining the drama club.
We develop the cumulative frequencies for observed and theoretical distributions.
| Streams | No. interested, Observed (O) | No. expected, Theoretical (T) | ${F_O(X)}$ | ${F_T(X)}$ | ${\lvert F_O(X)-F_T(X) \rvert}$ |
|---|---|---|---|---|---|
| B.Sc. | 5 | 12 | 5/60 | 12/60 | 7/60 |
| B.A. | 9 | 12 | 14/60 | 24/60 | 10/60 |
| B.Com. | 11 | 12 | 25/60 | 36/60 | 11/60 |
| M.A. | 16 | 12 | 41/60 | 48/60 | 7/60 |
| M.Com. | 19 | 12 | 60/60 | 60/60 | 0/60 |
| Total | n = 60 | 60 | | | |
The test statistic ${|D|}$ is the largest absolute difference in the last column:
${D = \max |F_O(X)-F_T(X)| = \frac{11}{60} = 0.183}$
The table value of D at the 5% significance level for n = 60 is
${D_{0.05} = \frac{1.36}{\sqrt{n}} = \frac{1.36}{\sqrt{60}} = 0.175}$
Since the calculated value (0.183) is greater than the critical value (0.175), we reject the null hypothesis and conclude that there is a difference among students of different streams in their intention of joining the Drama Club.
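The whole calculation above can be reproduced in a few lines. The sketch below uses the observed counts from the table and the common large-sample approximation ${D_{0.05} \approx 1.36/\sqrt{n}}$ for the 5% critical value, matching the table value used above.

```python
# Reproducing the worked example: cumulative observed and theoretical
# proportions for the five streams, the statistic D, and the approximate
# 5% critical value 1.36/sqrt(n).
import numpy as np

observed = np.array([5, 9, 11, 16, 19])   # students intending to join
expected = np.full(5, 12)                  # 12 students expected per stream
n = observed.sum()                         # 60 students in total

F_O = np.cumsum(observed) / n              # 5/60, 14/60, 25/60, 41/60, 60/60
F_T = np.cumsum(expected) / n              # 12/60, 24/60, 36/60, 48/60, 60/60

D = np.max(np.abs(F_O - F_T))              # 11/60 = 0.183
D_crit = 1.36 / np.sqrt(n)                 # 0.175 at the 5% level

print(f"D = {D:.3f}, critical value = {D_crit:.3f}")
print("Reject H0" if D > D_crit else "Fail to reject H0")
```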
K-S Two Sample Test
When there are two independent samples instead of one, the K-S two-sample test can be used to test the agreement between the two cumulative distributions. The null hypothesis states that there is no difference between the two distributions. The D statistic is calculated in the same manner as in the K-S one-sample test.
Formula
${D = \max |F_{n_1}(X)-F_{n_2}(X)|}$
Where −
${n_1}$ = Number of observations in the first sample, with ${F_{n_1}(X)}$ its cumulative distribution.
${n_2}$ = Number of observations in the second sample, with ${F_{n_2}(X)}$ its cumulative distribution.
A large maximum deviation ${|D|}$ between the two cumulative distributions indicates a difference between the two sample distributions.
For samples where ${n_1 = n_2}$ and both are ≤ 40, the critical value of D is found from the K-S table for the two-sample case. When ${n_1}$ and/or ${n_2}$ is greater than 40, the K-S table for large samples in the two-sample test should be used. The null hypothesis is accepted if the calculated value is less than the table value, and rejected otherwise.
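As a final illustration, a minimal two-sample sketch is given below. The two samples are simulated here purely for demonstration (they are not data from the text), and the manually computed D is compared against SciPy's ks_2samp.

```python
# A minimal sketch of the two-sample K-S statistic with simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample1 = rng.normal(loc=0.0, size=35)     # hypothetical first sample  (n1 = 35)
sample2 = rng.normal(loc=0.5, size=40)     # hypothetical second sample (n2 = 40)

# D = max |F_n1(X) - F_n2(X)| evaluated over the pooled observations
pooled = np.sort(np.concatenate([sample1, sample2]))
F1 = np.searchsorted(np.sort(sample1), pooled, side="right") / len(sample1)
F2 = np.searchsorted(np.sort(sample2), pooled, side="right") / len(sample2)
D = np.max(np.abs(F1 - F2))
print("D =", D)

# SciPy's two-sample test returns the same statistic together with a p-value.
print(stats.ks_2samp(sample1, sample2))
```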
Thus, the use of any of these nonparametric tests helps a researcher to test the significance of results when the characteristics of the target population are unknown or no assumptions have been made about them.