Instructor: Prof. Jaimie Kwon (Homepage)
Lecture: MW ScN206, 6:00-7:50 pm
Objectives: A nonparametric procedure is a statistical procedure that makes relatively mild assumptions about the underlying distribution and/or the form of the underlying functional relationship. This contrasts with many traditional procedures, which assume that the underlying populations are normally distributed and/or that the relationship is a straight line. We will study nonparametric methods such as the sign, Wilcoxon, and rank-correlation tests for one-sample and two-sample problems, one-way and two-way layouts, and independence and regression problems. We will also cover 'modern,' computer-intensive nonparametric methods, including kernel density estimation and nonparametric regression (splines, Nadaraya-Watson, and local regression). We expect students to i) understand the merits of nonparametric methods in certain situations, ii) be able to decide which nonparametric techniques apply in a given situation, and iii) apply the techniques to real datasets using R/S-Plus. Lectures will mix overviews of the theoretical background with hands-on demonstrations of the R implementations of the methods.
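As a rough preview of the 'modern' methods named in the objectives, the sketch below uses only base R functions (density, ksmooth, loess, smooth.spline). The simulated data, the seed, and the bandwidth are assumptions made for illustration, not course material.

```r
# Simulated data (an assumption, for illustration only)
set.seed(1)
x <- sort(runif(100, 0, 10))
y <- sin(x) + rnorm(100, sd = 0.3)

# Kernel density estimate of y with a Gaussian kernel
dens <- density(y, kernel = "gaussian")

# Nadaraya-Watson kernel regression of y on x (bandwidth chosen arbitrarily)
nw <- ksmooth(x, y, kernel = "normal", bandwidth = 1)

# Local regression and a smoothing spline fitted to the same data
lo <- loess(y ~ x)
sp <- smooth.spline(x, y)
```

To look at the fits, something like plot(x, y) followed by lines(nw) and lines(x, fitted(lo)) should be enough.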
Link: The R Project for Statistical Computing
Link: College of Science Computer Lab
HW: Download R and install it on your machine (if you have one). You may also use the copy installed in the computer lab. Using S-Plus is OK, but it may not be compatible with what I teach in class.
Misc: There is "An Introduction to R" under the 'Manuals' tab of the R webpage. The PDF file usually comes with an R installation. Read it at your leisure.
ANN: Wed office hours are moved to Fri 10-11:30
HW: Run the following code in R and think about what leads to the difference between the t-test and the Wilcoxon test.
data.active <- c(9.00, 9.50, 9.75, 10.00, 13.00, 9.50)
data.passive <- c(11.00, 10.00, 10.00, 11.75, 10.50, 15.00)
t.test(data.active, data.passive)
wilcox.test(data.active, data.passive)
boxplot(data.active, data.passive)
Some background on the data: the original data set lists age at walking (in months) for four groups of infants; we use only the first two groups for simplicity.
(From Zelazo et al. (1972), "Walking" in the newborn, Science, 176, 314-315.)
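One possible starting point for the comparison is to see what each test actually uses. The sketch below computes the joint ranks and the rank sums by hand for the two groups above; the variable names rank.sum.active, rank.sum.passive, and W are introduced here for illustration only.

```r
data.active  <- c(9.00, 9.50, 9.75, 10.00, 13.00, 9.50)
data.passive <- c(11.00, 10.00, 10.00, 11.75, 10.50, 15.00)

# Joint midranks of all 12 observations (tied values share the average rank)
r <- rank(c(data.active, data.passive))
rank.sum.active  <- sum(r[1:6])
rank.sum.passive <- sum(r[7:12])

# wilcox.test() reports W = rank sum of the first sample minus n(n+1)/2
n <- length(data.active)
W <- rank.sum.active - n * (n + 1) / 2

# The raw values feed the t statistic; only the ranks feed the Wilcoxon statistic
c(mean.active = mean(data.active), mean.passive = mean(data.passive))
c(rank.sum.active = rank.sum.active, rank.sum.passive = rank.sum.passive, W = W)
```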
ANN: Parts of the book (Hollander and Wolfe) that will be covered:
ANN: We will have two quizzes, approximately 2 weeks before the midterm and the final.
ANN: Details about midterms and finals
Format of the quizzes and tests:
In-class, open book, open note.
HW: Chapter 2 of H&W, #4-7, #13, #17, #20. Graduate students: #8 as well. (Due next Monday; will be graded!)
Third week survey (ppt)
HW: In Chapter 3,
#1, 2, 6, 7, 8, 16
Do #1 by hand with a simple calculator.
For #16, use R or any other software, including 'StatExact'.
#14 (Graduate students only)
Due next Monday.
There will be a quiz on Wednesday.
Solution to HW 1. (pdf)
Quiz #1 and Solution (PDF)
Solution to HW 2. (pdf)
Lab Session
x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29)
wilcox.test(y - x, alternative = "less")  # exact
wilcox.test(y - x, alternative = "less",
            exact = FALSE, correct = FALSE)  # large-sample approximation
wilcox.test(y - x, conf.int = TRUE, conf.level = .95)  # CI
wilcox.test(y - x, conf.int = TRUE, conf.level = .95,
            alternative = "less")  # one-sided CI
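To connect the output of wilcox.test with the hand calculation, the following sketch computes the signed rank statistic and the Hodges-Lehmann estimate (the median of the Walsh averages) directly from the differences. The names d, t.plus, walsh, and hl are introduced here for illustration.

```r
x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29)
d <- y - x

# Signed rank statistic T+: sum of the ranks of |d| belonging to positive d
t.plus <- sum(rank(abs(d))[d > 0])

# Walsh averages (d_i + d_j)/2 for i <= j; their median is the
# Hodges-Lehmann estimate of the location of d
w <- outer(d, d, "+") / 2
walsh <- w[upper.tri(w, diag = TRUE)]
hl <- median(walsh)

# Compare with what wilcox.test() reports (its V is T+)
fit <- wilcox.test(d, conf.int = TRUE)
c(T.plus = t.plus, V = unname(fit$statistic))
c(HL = hl, estimate = unname(fit$estimate))
```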
Given a data file 'depress.txt' in the current working directory with contents like
x y
1 1.83 0.878
2 0.5 0.647
3 1.62 0.598
4 2.48 2.05
5 1.68 1.06
6 1.88 1.29
7 1.55 1.06
8 3.06 3.14
9 1.3 1.29
You can run something like
write.table(data.frame(x,y), 'depress.txt', quote=FALSE)
data <- read.table('depress.txt')
wilcox.test(data$y - data$x, alternative = "less") # The same.
wilcox.test(data$y - data$x, alternative = "less",
exact = FALSE, correct = FALSE) # H&W large sample
wilcox.test(data$y - data$x, conf.int=TRUE, conf.level=.95)
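As a small variation, the same test can be written in paired two-sample form or with with(). The sketch below builds the data frame in memory rather than reading 'depress.txt', and the name depress is an assumption chosen to avoid masking R's data() function.

```r
depress <- data.frame(
  x = c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30),
  y = c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29)
)

# One-sample test on the differences y - x
one.sample <- wilcox.test(depress$y - depress$x, alternative = "less")

# Paired two-sample form; with y given first it also works on y - x
paired <- wilcox.test(depress$y, depress$x, paired = TRUE, alternative = "less")

# with() avoids repeating the data frame name
via.with <- with(depress, wilcox.test(y - x, alternative = "less"))

c(one.sample$p.value, paired$p.value, via.with$p.value)
```

All three calls should report the same statistic and p-value.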
Data files
Homework #2 Due Monday, 10/25
In Chapter 3, do #1, 2, 6, 7, 16; 18, 27, 36 (the first part only), 37, 41, 42
Do all by hand with a simple calculator, except #16 and #42: for those, use R or any other software, including 'StatExact'.
#14, 22, 23 (Graduate students only)
Very crude lecture notes (PDF). They contain many typos and errors, so they should not be used as the main material.
Solution to HW 2 (PDF)
Midterm on Wednesday
Midterm Problems + Solution (PDF)
x <- c(0.80, 0.83,1.89,1.04,1.45,1.38,1.91,1.64,0.73,1.46)
y <- c(1.15,0.88,0.90,0.74,1.21)
wilcox.test(x,y, alternative='greater')
wilcox.test(x,y, alternative='greater', exact=FALSE, correct=FALSE)
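Note that the W printed by wilcox.test is the Mann-Whitney form of the statistic, not the rank sum itself. The sketch below, with names chosen here for illustration, computes both by hand so the two can be related.

```r
x <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46)
y <- c(1.15, 0.88, 0.90, 0.74, 1.21)

# Joint ranks and the rank sum of each sample
r <- rank(c(x, y))
rank.sum.x <- sum(r[seq_along(x)])
rank.sum.y <- sum(r[-seq_along(x)])

# Mann-Whitney count: number of (x_i, y_j) pairs with x_i > y_j
U <- sum(outer(x, y, ">"))

# R's W equals the rank sum of x minus n(n+1)/2, which is the count U
n <- length(x)
W.R <- unname(wilcox.test(x, y, alternative = "greater")$statistic)
c(rank.sum.x = rank.sum.x, rank.sum.y = rank.sum.y,
  U = U, shifted = rank.sum.x - n * (n + 1) / 2, W.R = W.R)
```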
In Minitab, cut and paste the data into columns C1 and C2, and then run
mann-whitney C1 C2;
alternative 1.
For estimation,
Mann-whitney C1 C2
WDIFF C1 C2 C3 C4 C5
SORT C3 C6
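The Minitab commands above appear to tabulate and sort the pairwise differences between the two columns. Assuming that reading, a rough R counterpart is sketched below: the Hodges-Lehmann estimate of the shift is the median of all x_i - y_j differences. The names diffs and hl are introduced here for illustration.

```r
x <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46)
y <- c(1.15, 0.88, 0.90, 0.74, 1.21)

# All 10 * 5 = 50 pairwise differences x_i - y_j, sorted
diffs <- sort(as.vector(outer(x, y, "-")))

# Hodges-Lehmann estimate of the shift: the median of the differences
hl <- median(diffs)

# wilcox.test() reports the same estimate together with a confidence interval
fit <- wilcox.test(x, y, conf.int = TRUE, conf.level = 0.95)
c(HL = hl, estimate = unname(fit$estimate))
fit$conf.int
```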
11/3/2004 (Wed): Continuing two-sample Wilcoxon rank sum test; Minitab+R session
11/8: Two-sample Wilcoxon rank sum test;
HW #1, 6*, 7*, 14;
15, 18 (estimate only, no confidence interval), 23;
27 (no table entry? just use normal approximation. Then skip #32), 32 (don't compare with #17), 35*, 36*
due next Monday (11/15)
* : graduate students only
11/10: Lecture Note for two-sample location and dispersion problem, Kolmogorov-Smirnov test (PDF)
Kolmogorov-Smirnov test; One-way layout (Kruskal-Wallis);
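For the Kolmogorov-Smirnov test, a minimal sketch in R follows. It reuses the two-sample data from the earlier Wilcoxon rank sum session (an assumption about which data are convenient, not necessarily what was used in class) and checks the statistic against the largest gap between the two empirical distribution functions.

```r
x <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46)
y <- c(1.15, 0.88, 0.90, 0.74, 1.21)

# D is the largest vertical distance between the two empirical CDFs,
# which can only change at the observed points
grid <- sort(c(x, y))
d.by.hand <- max(abs(ecdf(x)(grid) - ecdf(y)(grid)))

# Two-sample Kolmogorov-Smirnov test
ks <- ks.test(x, y)
c(by.hand = d.by.hand, D = unname(ks$statistic), p.value = ks$p.value)
```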
11/15: One-way layout (Kruskal-Wallis); muco.txt
Test for independence (Kendall, Spearman); Minitab+R
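A sketch of the Kruskal-Wallis computation in R follows. Since the contents of muco.txt are not reproduced here, it uses R's built-in PlantGrowth data (three groups of ten) as a stand-in; that substitution is an assumption made for illustration.

```r
# Built-in data used as a stand-in for the course data file
w <- PlantGrowth$weight
g <- PlantGrowth$group

# Joint ranks, then rank sums and sample sizes per group
r  <- rank(w)
N  <- length(r)
Rj <- tapply(r, g, sum)
nj <- tapply(r, g, length)

# Kruskal-Wallis H, followed by the correction for ties
H <- 12 / (N * (N + 1)) * sum(Rj^2 / nj) - 3 * (N + 1)
ties <- as.numeric(table(w))
H.corrected <- H / (1 - sum(ties^3 - ties) / (N^3 - N))

# kruskal.test() reports the tie-corrected statistic
kw <- kruskal.test(weight ~ group, data = PlantGrowth)
c(H = H, H.corrected = H.corrected, statistic = unname(kw$statistic))
```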
11/17: More problems in one-way layout; Minitab+R
Lecture Note for one-way and two-way layouts (PDF)
11/22: Quiz (50 minutes). Will cover from the Wilcoxon rank sum test to the Kruskal-Wallis test.
Quiz and Solution (PDF)
Pearson, Fisher's exact test
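For a 2x2 table, both tests are available in base R. The sketch below uses made-up counts (a hypothetical table, not data from the course) and checks the Pearson statistic and the one-sided Fisher p-value against direct calculations.

```r
# Hypothetical 2x2 table of counts, made up for illustration
tab <- matrix(c(12, 5, 6, 13), nrow = 2,
              dimnames = list(group = c("A", "B"),
                              outcome = c("success", "failure")))

# Pearson chi-squared statistic by hand: sum of (observed - expected)^2 / expected
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
X2 <- sum((tab - expected)^2 / expected)
pearson <- chisq.test(tab, correct = FALSE)

# Fisher's exact test; the one-sided p-value is a hypergeometric tail
# probability for the top-left cell with all margins held fixed
fisher.two.sided <- fisher.test(tab)
fisher.greater   <- fisher.test(tab, alternative = "greater")
p.by.hand <- phyper(tab[1, 1] - 1, sum(tab[, 1]), sum(tab[, 2]),
                    sum(tab[1, ]), lower.tail = FALSE)

c(X2 = X2, pearson = unname(pearson$statistic))
c(by.hand = p.by.hand, fisher.greater = fisher.greater$p.value,
  fisher.two.sided = fisher.two.sided$p.value)
```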
11/24: Minitab+R
Homework Due next Wednesday
Chapter 8: 1, 4*, 20, 41
Chapter 10: 12, 13
Lecture Note for independence test and Comparing two success probabilities (PDF)
Read the sections in "Comparing two success probabilities", especially "Approximate tests and confidence intervals for p1-p2".
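As a rough illustration of the approximate test and interval for p1-p2, the sketch below compares a hand calculation with prop.test. The counts are hypothetical, and the formulas are the usual large-sample (Wald and pooled z) versions, which may differ in detail from the presentation in the lecture note.

```r
# Hypothetical counts, made up for illustration
successes <- c(15, 9)
trials    <- c(25, 25)
p.hat <- successes / trials

# Large-sample 95% confidence interval for p1 - p2 (unpooled standard error)
se <- sqrt(sum(p.hat * (1 - p.hat) / trials))
wald.ci <- (p.hat[1] - p.hat[2]) + c(-1, 1) * qnorm(0.975) * se

# Large-sample z statistic for H0: p1 = p2 (pooled standard error)
p.pool <- sum(successes) / sum(trials)
z <- (p.hat[1] - p.hat[2]) / sqrt(p.pool * (1 - p.pool) * sum(1 / trials))

# prop.test() without continuity correction reports z^2 and the same interval
fit <- prop.test(successes, trials, correct = FALSE)
c(z.squared = z^2, X.squared = unname(fit$statistic))
rbind(by.hand = wald.ci, prop.test = as.numeric(fit$conf.int))
```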
11/29: Spearman's rank correlation; Fisher's exact test
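A short sketch of the rank correlation tests in R follows, using the built-in cars data as a stand-in (an assumption; the class examples may use other data). Spearman's coefficient is the ordinary correlation of the ranks, which the code checks directly.

```r
# Built-in data used as a stand-in
speed <- cars$speed
dist  <- cars$dist

# Spearman's rank correlation: Pearson correlation computed on the (mid)ranks
rs.by.hand <- cor(rank(speed), rank(dist))

# The data contain ties, so use the large-sample p-values (exact = FALSE)
spearman <- cor.test(speed, dist, method = "spearman", exact = FALSE)
kendall  <- cor.test(speed, dist, method = "kendall", exact = FALSE)

c(rs.by.hand = rs.by.hand,
  rho = unname(spearman$estimate),
  tau = unname(kendall$estimate))
```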
12/1: Review session
Last Lecture note, Solution, Practice question (PDF)
Final (40%): December 8th, Wednesday 7:00 pm to 8:50 pm
In-class final and solution (PDF)
Misc: Lab reservation:
Mondays - 7:00 PM - 7:50 PM JAIMYOUNG KWON STAT 3872-01
Mondays - 8:00 PM - 9:50 PM JAIMYOUNG KWON STAT 6601-01
Weds - 7:00 PM - 7:50 PM JAIMYOUNG KWON STAT 6872-01
Weds - 8:00 PM - 9:50 PM JAIMYOUNG KWON STAT 6601-01