STAT {4610, 6872}, Introduction to Nonparametric Statistical Methods, Fall 2004

Instructor: Prof. Jaimie Kwon (Homepage)

Lecture: MW, ScN206, 6:00-7:50 pm

Objectives: A nonparametric procedure is a statistical procedure that makes relatively mild assumptions about the underlying distribution and/or the form of the underlying functional relationship. This is in contrast to many traditional procedures, which assume that the underlying populations follow a normal distribution and/or that the relationship is a straight line. We will study nonparametric methods such as the sign, Wilcoxon and rank-correlation tests for various situations, including one-sample and two-sample problems, one-way and two-way layouts, and independence and regression problems. We will also cover 'modern,' computer-intensive nonparametric methods, including kernel density estimation and nonparametric regression (splines, Nadaraya-Watson and local regression). We expect the student to i) understand the merit of nonparametric methods in certain situations, ii) be able to decide which nonparametric techniques are applicable in various situations, and iii) apply the techniques to real datasets using R or S-Plus. The lectures will mix an overview of the theoretical background with hands-on demonstrations of R implementations of the methods.
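To give a first impression of the 'modern' methods named above, here is a minimal sketch in R. It is not taken from the course handouts: the choice of the built-in `faithful` data set, the bandwidth of 0.5, and all variable names are assumptions made purely for illustration.

```r
# Built-in Old Faithful data: eruption duration and waiting time (minutes)
dur  <- faithful$eruptions
wait <- faithful$waiting

# Kernel density estimate of the durations (Gaussian kernel, default bandwidth)
dens <- density(dur, kernel = "gaussian")
plot(dens, main = "Kernel density estimate of eruption duration")

# Three nonparametric regression fits of waiting time on duration
nw <- ksmooth(dur, wait, kernel = "normal", bandwidth = 0.5)  # Nadaraya-Watson
lo <- loess(wait ~ dur)                                        # local regression
sp <- smooth.spline(dur, wait)                                 # smoothing spline

# Overlay the three fitted curves on a scatterplot
grid <- seq(min(dur), max(dur), length.out = 200)
plot(dur, wait, xlab = "Eruption duration", ylab = "Waiting time")
lines(nw, lty = 1)
lines(grid, predict(lo, newdata = data.frame(dur = grid)), lty = 2)
lines(predict(sp, grid), lty = 3)
legend("topleft", legend = c("Nadaraya-Watson", "loess", "spline"), lty = 1:3)
```

None of the three fits assumes a straight-line relationship; the bandwidth (or smoothing parameter) controls how wiggly each curve is.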

Syllabus

ANNouncements/HandOut/LINKs/MISC.

Week 1

Link: The R Project for Statistical Computing

Link: College of Science Computer Lab

HW: Download R and install it on your machine (if you have one). You may also use the copy installed in the computer lab. Using S-Plus is OK, but it may not be compatible with what I teach in class.

Misc: "An Introduction to R" is available under the 'Manuals' tab of the R webpage. The PDF file is usually included with an R installation. Read at your leisure.

ANN: Wed office hours are moved to Fri 10-11:30

HW: Run the following code in R and think about what leads to the difference between the t-test and the Wilcoxon test.

data.active <- c(9.00, 9.50, 9.75, 10.00, 13.00, 9.50)
data.passive <- c(11.00, 10.00, 10.00, 11.75, 10.50, 15.00)
t.test(data.active, data.passive)
wilcox.test(data.active, data.passive)
boxplot(data.active, data.passive)

Some background on the data (we use only the first two groups for simplicity): the full data set records age at walking (in months) for four groups of infants.

(From Zelazo et al. (1972), "Walking" in the newborn, Science, 176, 314-315.)
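One possible way to start exploring the homework question is sketched below. It is only a sketch: the helper name `rank.sum.W` and the idea of inflating the largest observation to 150 are assumptions introduced for illustration, not part of the assignment. The t-test works with means and standard deviations, while the Wilcoxon test uses only the ranks in the pooled sample.

```r
data.active  <- c(9.00, 9.50, 9.75, 10.00, 13.00, 9.50)
data.passive <- c(11.00, 10.00, 10.00, 11.75, 10.50, 15.00)

# Rank-sum statistic in the form reported by wilcox.test():
# sum of the first sample's pooled ranks minus n(n+1)/2 (ties get midranks)
rank.sum.W <- function(a, b) {
  r <- rank(c(a, b))
  n <- length(a)
  sum(r[seq_len(n)]) - n * (n + 1) / 2
}

W <- rank.sum.W(data.active, data.passive)

# Hypothetical perturbation: make the largest passive value ten times larger.
# Its rank stays the same, so W does not move, but the mean and SD do.
data.passive2 <- replace(data.passive, which.max(data.passive), 150)
W2 <- rank.sum.W(data.active, data.passive2)

c(W = W, W.perturbed = W2)
c(t.original  = unname(t.test(data.active, data.passive)$statistic),
  t.perturbed = unname(t.test(data.active, data.passive2)$statistic))
```

Because the data contain ties, `wilcox.test()` warns that it cannot compute an exact p-value and falls back to the normal approximation.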

ANN: Parts of the book (Hollander and Wolfe) that will be covered:
Some of you have asked which parts of the book are covered in class. They are as follows:
Sections 2.1, 2.2, 2.3, 3.1, 3.4, 4.1, 5.4, 6.1, 7.1, 8.1, 8.5, 10.1, and 10.2.

ANN: We will have two quizzes, approximately two weeks before the midterm and the final.

ANN: Details about midterms and finals
Format of the quizzes and tests:
In-class, open book, open note.

ANN: Office hours are changed to MW 5-6, 10-10:30 PM

 

Week 2

HW: Chapter 2 of H&W, #4-7, #13, #17, #20. Graduate students: #8 too. (Due next Monday; will be graded!)

Week 3

Third week survey (ppt)

HW: In Chapter 3,
#1, 2, 6, 7, 8, 16
Do #1 by hand with a simple calculator.
For #16, use R or any other software, including 'StatExact'.
#14 (graduate students only)
Due next Monday.

There will be a quiz on Wednesday.

Solution to HW 1. (pdf)

Quiz #1 and Solution (PDF)

Week 4

Solution to HW 2. (pdf)

Lab Session

x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29)
wilcox.test(y - x, alternative = "less")  # exact
wilcox.test(y - x, alternative = "less",
            exact = FALSE, correct = FALSE)  # large-sample approximation
wilcox.test(y - x, conf.int = TRUE, conf.level = 0.95)  # two-sided CI
wilcox.test(y - x, conf.int = TRUE, conf.level = 0.95, alternative = "less")  # one-sided CI
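To see what `wilcox.test()` computes for these paired data, the signed-rank statistic and the associated point estimate can be reproduced by hand. This is a sketch under the assumption that there are no zero differences and no tied absolute differences (true for these data); the names `d`, `V`, `walsh` and `hl` are illustrative only.

```r
x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29)
d <- y - x

# Signed-rank statistic: rank the absolute differences, then add up
# the ranks that belong to positive differences
r <- rank(abs(d))
V <- sum(r[d > 0])

# Walsh averages (d_i + d_j) / 2 for i <= j; their median is the
# Hodges-Lehmann estimate that wilcox.test() labels "(pseudo)median"
w <- outer(d, d, "+") / 2
walsh <- w[upper.tri(w, diag = TRUE)]
hl <- median(walsh)

fit <- wilcox.test(d, conf.int = TRUE)
c(V.by.hand = V, V.wilcox = unname(fit$statistic))
c(hl.by.hand = hl, hl.wilcox = unname(fit$estimate))
```

The two rows printed at the end should agree pairwise, which is a quick check that the hand computation follows the same definition as the built-in function.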

Given a data file 'depress.txt' in the current working directory with contents like

x y
1 1.83 0.878
2 0.5 0.647
3 1.62 0.598
4 2.48 2.05
5 1.68 1.06
6 1.88 1.29
7 1.55 1.06
8 3.06 3.14
9 1.3 1.29

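you can run something like the following (the first line creates the file from the vectors above, in case you do not have it yet):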

write.table(data.frame(x, y), 'depress.txt', quote = FALSE)
data <- read.table('depress.txt')
wilcox.test(data$y - data$x, alternative = "less")  # same as above
wilcox.test(data$y - data$x, alternative = "less",
            exact = FALSE, correct = FALSE)  # H&W large-sample approximation
wilcox.test(data$y - data$x, conf.int = TRUE, conf.level = 0.95)

Data files

depress.txt

salary.txt

blood.txt

 

Week 5

Homework #2 Due Monday, 10/25
In Chapter 3, do #1, 2, 6, 7, 16; 18, 27, 36 (the first part only), 37, 41, 42
Do all by hand with a simple calculator, except #16 and #42, for which you may use R or any other software, including 'StatExact'.
#14, 22, 23 (graduate students only)

Very crude lecture notes (PDF). They contain many typos and errors, so they should not be used as the main material.

Solution to HW 2 (PDF)

Midterm on Wednesday

Week 6

Midterm Problems + Solution (PDF)

Week 7

diffusion.txt

alcohol.txt

R

x <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46)
y <- c(1.15, 0.88, 0.90, 0.74, 1.21)
wilcox.test(x, y, alternative = 'greater')  # exact

wilcox.test(x, y, alternative = 'greater', exact = FALSE, correct = FALSE)  # large-sample approximation

Minitab

Cut and paste the data and then run

mann-whitney C1 C2;
alternative 1.

For estimation,

Mann-whitney C1 C2

WDIFF C1 C2 C3 C4 C5

SORT C3 C6
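A rough R counterpart of the estimation step can be sketched as follows, assuming that the Minitab commands above are meant to form all pairwise differences between the two samples and sort them. The names `diffs`, `W` and `hl` are illustrative, and the sketch relies on there being no ties between the two samples.

```r
x <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46)
y <- c(1.15, 0.88, 0.90, 0.74, 1.21)

# All length(x) * length(y) pairwise differences x_i - y_j, sorted
diffs <- sort(as.vector(outer(x, y, "-")))

# The Mann-Whitney form of the rank-sum statistic counts the positive differences
W <- sum(diffs > 0)

# Hodges-Lehmann estimate of the location shift: median of the differences
hl <- median(diffs)

fit <- wilcox.test(x, y, conf.int = TRUE)
c(W.by.hand = W, W.wilcox = unname(fit$statistic))
c(hl.by.hand = hl, hl.wilcox = unname(fit$estimate))
```

The confidence interval reported in `fit$conf.int` is likewise built from order statistics of the sorted differences.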

 

11/3/2004 (Wed): Continuing the two-sample Wilcoxon rank-sum test; Minitab+R session

--

Week 8

11/8: Two-sample Wilcoxon rank-sum test
HW: #1, 6*, 7*, 14;
15, 18 (estimate only, no confidence interval), 23;
27 (no table entry? Just use the normal approximation, then skip #32), 32 (don't compare with #17), 35*, 36*
Due next Monday (11/15).
* : graduate students only

11/10: Lecture notes for the two-sample location and dispersion problems and the Kolmogorov-Smirnov test (PDF)
Kolmogorov-Smirnov test; one-way layout (Kruskal-Wallis)
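As a small illustration of the two-sample Kolmogorov-Smirnov test, the sketch below computes the statistic as the largest vertical gap between the two empirical distribution functions and compares it with `ks.test()`. Reusing the Week 7 vectors here is an assumption made for convenience; it is not an example from the lecture notes.

```r
x <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46)
y <- c(1.15, 0.88, 0.90, 0.74, 1.21)

# Empirical distribution functions of the two samples
Fx <- ecdf(x)
Fy <- ecdf(y)

# The gap can only change at observed values, so evaluate both step
# functions at the pooled data points and take the largest difference
pooled <- sort(c(x, y))
D <- max(abs(Fx(pooled) - Fy(pooled)))

fit <- ks.test(x, y)
c(D.by.hand = D, D.ks.test = unname(fit$statistic))

# Visual check: the two step functions on one plot
plot(Fx, main = "Empirical CDFs", xlim = range(pooled))
lines(Fy, col = "gray40")
```

Unlike the rank-sum test, which targets a shift in location, this statistic responds to any difference between the two distributions.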

Week 9

11/15: One-way layout (Kruskal-Wallis), muco.txt; test for independence (Kendall, Spearman)
Minitab+R

11/17: More problems in one-way layout; Minitab+R

Lecture Note for one-way and two-way layouts (PDF)
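For the one-way layout, a minimal sketch of the Kruskal-Wallis statistic computed by hand is given below. Since muco.txt is not reproduced on this page, the sketch uses R's built-in `airquality` data (ozone by month) instead; that substitution and the variable names are assumptions for illustration.

```r
# Keep the rows where Ozone was observed
aq <- airquality[!is.na(airquality$Ozone), c("Ozone", "Month")]

# Rank all observations together (midranks for ties)
r <- rank(aq$Ozone)
N <- length(r)

# Rank sum and sample size for each treatment (month)
Rj <- tapply(r, aq$Month, sum)
nj <- tapply(r, aq$Month, length)

# Kruskal-Wallis H, then the usual correction for ties
H <- 12 / (N * (N + 1)) * sum(Rj^2 / nj) - 3 * (N + 1)
ties <- table(aq$Ozone)
H.adj <- H / (1 - sum(ties^3 - ties) / (N^3 - N))

fit <- kruskal.test(Ozone ~ Month, data = airquality)
c(H.by.hand = H.adj, H.kruskal = unname(fit$statistic))

# Large-sample p-value: chi-square with (number of groups - 1) df
pchisq(H.adj, df = length(nj) - 1, lower.tail = FALSE)
```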

Week 10

11/22: Quiz (50 minutes). It will cover material from the Wilcoxon rank-sum test through the Kruskal-Wallis test.

Quiz and Solution (PDF)

Pearson, Fisher's exact test
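A short sketch of both tests on a 2 x 2 table follows. The counts are made up for illustration only (they are not course data), and the labels "A", "B", "success" and "failure" are hypothetical.

```r
# Hypothetical 2 x 2 table of counts (rows: groups, columns: outcomes)
tab <- matrix(c(12, 5,
                 6, 11), nrow = 2, byrow = TRUE,
              dimnames = list(group = c("A", "B"),
                              outcome = c("success", "failure")))

# Pearson chi-square statistic by hand: sum of (observed - expected)^2 / expected
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
X2 <- sum((tab - expected)^2 / expected)
pearson <- chisq.test(tab, correct = FALSE)

# Fisher's exact test, one-sided: with all margins fixed, the top-left
# count follows a hypergeometric distribution under independence
a <- tab[1, 1]
p.fisher <- phyper(a - 1, m = sum(tab[, 1]), n = sum(tab[, 2]),
                   k = sum(tab[1, ]), lower.tail = FALSE)
fisher <- fisher.test(tab, alternative = "greater")

c(X2.by.hand = X2, X2.chisq.test = unname(pearson$statistic))
c(p.by.hand = p.fisher, p.fisher.test = fisher$p.value)
```

The chi-square test relies on a large-sample approximation, whereas Fisher's test is exact given the margins, which matters most when the counts are small.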

11/24: Minitab+R

Homework Due next Wednesday
Chapter 8: 1, 4*, 20, 41
Chapter 10: 12, 13

Lecture Note for independence test and Comparing two success probabilities (PDF)

Read the sections in "Comparing two success probabilities", especially "Approximate tests and confidence intervals for p1-p2".
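The approximate (large-sample) test and confidence interval for p1-p2 can be sketched in R as below. The counts are hypothetical, chosen only for illustration, and the sketch assumes the standard pooled-variance z test and the unpooled Wald interval, which is what `prop.test()` reports when the continuity correction is turned off.

```r
# Hypothetical data: s successes out of n trials in each of two samples
s1 <- 18; n1 <- 40
s2 <- 9;  n2 <- 35

p1 <- s1 / n1
p2 <- s2 / n2

# Test of H0: p1 = p2 using the pooled estimate of the common probability
p.pool <- (s1 + s2) / (n1 + n2)
z <- (p1 - p2) / sqrt(p.pool * (1 - p.pool) * (1 / n1 + 1 / n2))
p.value <- 2 * pnorm(-abs(z))

# Approximate 95% confidence interval for p1 - p2 (unpooled standard error)
se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
ci <- (p1 - p2) + c(-1, 1) * qnorm(0.975) * se

# Built-in equivalent; its X-squared statistic is z^2
fit <- prop.test(c(s1, s2), c(n1, n2), correct = FALSE)
c(z.squared = z^2, X.squared = unname(fit$statistic))
rbind(by.hand = ci, prop.test = as.numeric(fit$conf.int))
```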

Week 11

11/29: Spearman's rank correlation; Fisher's exact test
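As a brief sketch of the rank-correlation tests, Spearman's coefficient is simply the Pearson correlation of the ranks. The example reuses the paired vectors from the Week 4 lab session purely for illustration (an assumption, not a course example); because y contains ties, the large-sample versions of the tests are requested.

```r
x <- c(1.83, 0.50, 1.62, 2.48, 1.68, 1.88, 1.55, 3.06, 1.30)
y <- c(0.878, 0.647, 0.598, 2.05, 1.06, 1.29, 1.06, 3.14, 1.29)

# Spearman's rho by hand: ordinary correlation applied to the (mid)ranks
rho <- cor(rank(x), rank(y))

# Built-in tests of independence; exact = FALSE because of the ties in y
spearman <- cor.test(x, y, method = "spearman", exact = FALSE)
kendall  <- cor.test(x, y, method = "kendall",  exact = FALSE)

c(rho.by.hand = rho, rho.cor.test = unname(spearman$estimate))
c(tau = unname(kendall$estimate), p.value = kendall$p.value)
```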

12/1: Review session

Last Lecture note, Solution, Practice question (PDF)

Final

Final (40%): December 8th, Wednesday 7:00 pm to 8:50 pm

In-class final and solution (PDF)


Misc: Lab reservation:
Mondays - 7:00 PM - 7:50 PM JAIMYOUNG KWON STAT 3872-01
Mondays - 8:00 PM - 9:50 PM JAIMYOUNG KWON STAT 6601-01
Weds - 7:00 PM - 7:50 PM JAIMYOUNG KWON STAT 6872-01
Weds - 8:00 PM - 9:50 PM JAIMYOUNG KWON STAT 6601-01