Example of using a bootstrap t-test for a non-normal distribution.
Here, take a sample of size n=10 from \(X\sim \mbox{Exp}(\lambda)\) and treat it as the given data. Then build the bootstrap distribution of the t statistic from the data and use it to evaluate the p-value for the two-sided hypothesis test of \(H_0: \mu = 1/\lambda\).
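In symbols, one common formulation of the bootstrap t-test is sketched below. It is stated as an assumption about what this example intends, with \(\mu_0 = 1/\lambda\) the hypothesised mean, \(\bar{x}\) and \(s\) the mean and standard deviation of the data, and \(\bar{x}^{*}_{i}\), \(s^{*}_{i}\) those of the i-th resample:

\[
T_{\mathrm{obs}} = \frac{\bar{x}-\mu_0}{s/\sqrt{n}}, \qquad
T^{*}_{i} = \frac{\bar{x}^{*}_{i}-\bar{x}}{s^{*}_{i}/\sqrt{n}}, \quad i = 1,\dots,N,
\]
\[
p_{2} = 2\min\left( \frac{\#\{i : T^{*}_{i} \le T_{\mathrm{obs}}\}}{N},\; \frac{\#\{i : T^{*}_{i} \ge T_{\mathrm{obs}}\}}{N} \right).
\]

The resampled statistics are centred at \(\bar{x}\), not \(\mu_0\), because within the bootstrap world the data mean plays the role of the true mean.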
Generate a sample of size n:
n <- 10
lambda <- 1
mu = 1/lambda
sigma = 1/lambda
data = rexp(n,lambda)
data_mean = mean(data)
data_sd = sd(data)
theory mean = 1
data mean = 0.4633832
theory std dev = 1
data std dev = 0.3297858
Plot the histogram of the sample and a scaled version of the pdf
hist(data,xlim=c(0,5))
xfine<-seq(0,5,length=101)
scale = 10 # vertical scale factor so the pdf is visible against the histogram counts
lines(xfine,scale*dexp(xfine,lambda),col='green')
abline(v=mu,col='green')
abline(v=data_mean,col='blue',lty='dashed')
NOTE: The mean of the data (blue dashed) is not necessarily close to the theoretical mean (green solid).
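The size of this gap can be checked with a short simulation. The sketch below is an illustration, not part of the original analysis: the seed, the number of replicates (5000), and the variable names means, se_theory and se_sim are all assumptions. It compares the theoretical standard error sigma/sqrt(n) with the spread of simulated sample means.

```r
# Sketch: spread of the sample mean for n = 10 draws from Exp(1).
set.seed(1)                       # assumed seed, for reproducibility only
n <- 10
lambda <- 1
# 5000 independent samples of size n, keeping only each sample mean
means <- replicate(5000, mean(rexp(n, lambda)))
se_theory <- (1/lambda)/sqrt(n)   # sigma/sqrt(n), with sigma = 1/lambda
se_sim <- sd(means)               # simulated standard error of the mean
cat("theory se =", se_theory, "\n")
cat("simulated se =", se_sim, "\n")
```

With n = 10 the theoretical standard error is 1/sqrt(10), about 0.316, so the data mean 0.4633832 reported above sits roughly 1.7 standard errors below the theoretical mean of 1.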
Approximate the t-bootstrap distribution from the data using N random resamples
N <- 10**4 # number of bootstrap resamples
tboot <- numeric(N) # vector to hold the bootstrap t statistics
# loop: draw a resample with replacement, then calculate its t statistic
for (i in seq_len(N)) {
  resample <- sample(data,n,replace=TRUE)
  xbar <- mean(resample)
  s <- sd(resample)
  tboot[i] <- (xbar-data_mean)/(s/sqrt(n)) # centred at the data mean
}
head(tboot)
## [1] 0.1407625 -2.5782255 1.3467943 1.0096115 -0.5674166 -3.0201808
tboot_mean <- mean(tboot)
tboot_sd <- sd(tboot)
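# observed t statistic for H0: mu = 1/lambda
t_obs <- (data_mean-mu)/(data_sd/sqrt(n))
p_minus <- sum(tboot <= t_obs)/N # fraction of bootstrap t values at or below t_obs
p_plus <- sum(tboot >= t_obs)/N # fraction of bootstrap t values at or above t_obs
p2 <- 2*min(p_minus,p_plus) # two-sided p value
Plot the histogram of the t-bootstrap distribution
hist(tboot)
abline(v=0,col='green')
abline(v=tboot_mean,col='red',lty='dotted')
true mean = 1
data mean = 0.4633832
bootstrap mean = -0.7016526
p_minus = 0.7304
p_plus = 0.2696
p2 = 0.5392
NOTE: The three p values printed above were obtained by comparing tboot with data_mean (0.4633832), which is a sample mean and not a t statistic. The comparison value for the test is t_obs, about -5.15 for this sample, so rerunning the code will give different p values.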
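The whole procedure can be collected into one function. The following is a minimal sketch under stated assumptions, not the original author's code: the function name boot_t_pvalue, its arguments, the seed, and the freshly generated sample x are all hypothetical, so its output will not reproduce the numbers above.

```r
# Sketch: bootstrap t p-value for H0: mean = mu0 (hypothetical helper).
boot_t_pvalue <- function(x, mu0, N = 10^4) {
  n <- length(x)
  xbar <- mean(x)
  # observed t statistic, centred at the hypothesised mean
  t_obs <- (xbar - mu0) / (sd(x) / sqrt(n))
  # bootstrap t statistics, centred at the data mean
  tboot <- replicate(N, {
    r <- sample(x, n, replace = TRUE)
    (mean(r) - xbar) / (sd(r) / sqrt(n))
  })
  tboot <- tboot[is.finite(tboot)]  # drop resamples with zero spread
  p_minus <- mean(tboot <= t_obs)   # lower-tail fraction
  p_plus <- mean(tboot >= t_obs)    # upper-tail fraction
  list(t_obs = t_obs, p_minus = p_minus, p_plus = p_plus,
       p2 = min(1, 2 * min(p_minus, p_plus)))
}

set.seed(42)                        # assumed seed, for reproducibility only
x <- rexp(10, 1)                    # new sample, not the data used above
res <- boot_t_pvalue(x, mu0 = 1)
print(res$p2)
```

Using mean() on a logical vector gives the same fraction as sum(...)/N, and capping p2 at 1 guards against the case where both tail fractions exceed one half.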