I have been following Khan Academy videos to gain understanding of hypothesis testing, and I must confess that all my understanding thus far is based on that source. Now, the following videos talk about z-score/hypothesis testing:
Now, coming to my doubts, which are all about the denominator in the z-score:
- For the z-score formula, z = (x – μ) / σ: we use this directly when the standard deviation of the population (σ) is known. But when it's unknown and we use a sampling distribution, we have z = (x – μ) / (σ / √n), and we estimate σ with σs, where σs is the standard deviation of the sample and n is the sample size.
Then z = (x – μ) / (σs / √n). Why are we dividing by √n when σs is already known? Even in the video Hypothesis Testing, Sal divides the sample's standard deviation by √n. Why are we doing this when σs is directly given?
Please help me understand.
- I tried applying this on the following question, and faced the problems below:
Question: Yardley designed new perfumes. The Yardley company claimed that an average new perfume bottle lasts 300 days. Another company randomly selects 35 new perfume bottles from Yardley for testing. The sampled bottles last an average of 190 days, with a standard deviation of 50 days. If Yardley's claim were true, what is the probability that 35 randomly selected bottles would have an average life of no more than 190 days?
So, for the above question, when I do the following:
z = (190 – 300)/(50/√35), I get z = -13.05, which is not a possible score, since a z-score should be between ±3.
And when I do z = (190 – 300)/50, i.e. z = (x – μ) / σ, I seem to get an acceptable answer.
Please help me figure out what I am missing.
I think the origin of the $1/\sqrt{n}$ is simply a matter of whether you're calculating the standard deviation of the lifetime of a single bottle, or the standard deviation of the (sample) mean of a set of bottles.
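To make that distinction concrete, here is a small simulation sketch. It assumes, purely for illustration, that individual bottle lifetimes are normally distributed with mean 300 days and standard deviation 50 days (the question does not say this), and the names `MU`, `SIGMA`, `N` and `TRIALS` are my own. It draws many samples of 35 bottles and compares the spread of the sample means with $\sigma/\sqrt{n}$:

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed values for illustration only: normal lifetimes, mean 300, sd 50
MU, SIGMA, N = 300.0, 50.0, 35
TRIALS = 20_000  # number of simulated 35-bottle samples

# Mean lifetime of each simulated sample of N bottles
sample_means = [
    statistics.fmean([random.gauss(MU, SIGMA) for _ in range(N)])
    for _ in range(TRIALS)
]

# Spread of the sample means versus the sigma / sqrt(n) prediction
spread_of_means = statistics.stdev(sample_means)
predicted = SIGMA / math.sqrt(N)

print(f"sd of a single bottle (assumed): {SIGMA:.2f}")
print(f"sd of the sample means (simulated): {spread_of_means:.2f}")
print(f"sigma / sqrt(n) (predicted): {predicted:.2f}")
```

The simulated spread of the means should come out close to 8.45 days, far smaller than the 50-day spread of individual bottles, which is why the denominator changes when the quantity being tested is a sample mean.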
The question indicates that 50 days is the standard deviation of the lifetimes of the individual bottles in the set of 35. That implies that the estimated mean lifetime (190 days) has an uncertainty of about $50/\sqrt{35} \approx 8.5$ days. Assuming the same uncertainty applies to sample means drawn around the claimed 300-day lifetime, one can calculate the probability that a set of 35 bottles would have a measured mean of 190 days or less, using the complementary error function.
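Your z = -13.05 looks about right (I get roughly -13.0), implying that it is extremely unlikely that the claimed 300-day lifetime is consistent with what was seen in the 35-bottle experiment. A z-score is not restricted to ±3; values that far out are simply very improbable.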
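For reference, the calculation can be sketched in a few lines of Python using only the standard library. This uses the normal approximation with the sample standard deviation standing in for σ; strictly, with an estimated standard deviation one would use a t-distribution with 34 degrees of freedom, but at this distance from the mean the conclusion is the same. The variable names are my own.

```python
import math

claimed_mean = 300.0  # days, Yardley's claim
sample_mean = 190.0   # days, observed average of the tested bottles
sample_sd = 50.0      # days, standard deviation of the tested bottles
n = 35                # number of bottles tested

# Standard deviation of the sample mean (standard error)
standard_error = sample_sd / math.sqrt(n)

# z-score of the observed mean under the claimed mean
z = (sample_mean - claimed_mean) / standard_error

# Lower-tail probability of a standard normal: P(Z <= z) = 0.5 * erfc(-z / sqrt(2))
p_value = 0.5 * math.erfc(-z / math.sqrt(2))

print(f"standard error = {standard_error:.2f} days")
print(f"z = {z:.2f}")
print(f"P(mean <= {sample_mean:.0f} days | claim true) = {p_value:.3g}")
```

The resulting probability is vanishingly small, so the data are essentially incompatible with the 300-day claim.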