# Hypothesis testing and p-values | Inferential statistics | Probability and Statistics | Khan Academy

5847 ratings | 2193670 views
Fabrizio Blasio (3 hours ago)
That was amazing, I was watching random statistics videos on YouTube, then I remembered about you and I asked myself if you had created a video designed for medical students, p value was <0.001 for the null hypothesis (was it?! XD)
Such a great explanation, though I had to watch the normal distribution video twice before watching this. If you have a clear understanding of deviation and mean, you'll surely get this.
@5:14 he says "std dev of sampling distribution" it should be "std dev of sample MEAN distribution"
Colin Java (3 hours ago)
+Aditya Chauhan Thanks for the clarification
+Colin Java Yes for the first statement. Yes for the second as well, on the condition that the sample size is large.
Colin Java (7 hours ago)
+Aditya Chauhan Thanks, so I'm basically confusing the std dev of the sample with the std dev of the sample means? And we can take the population and sample std devs to be the same thing, since in reality we wouldn't know the population std dev? Thanks
+Colin Java Okay, so for the formula you need the std dev of the distribution of sample means, but in the question 0.5 is the std dev of the sample taken for the experiment itself. When the reaction times of 100 rats were measured, 0.5 sec was the std dev of that sample. Now we need to calculate the std dev of the distribution formed by the means of infinitely many such samples, and we have the central limit theorem for that. The theorem needs the population std dev, which is approximated here by the sample std dev. I hope this helps.
Colin Java (13 hours ago)
I don't get it though, with central limit theorem, you take the std dev of the distribution of sample means to be sigma/sqrt(n), where sigma is the population std dev, then when you look at the bell curve, you have z = [x- mu] / [sigma/sqrt(n)]. But... in the question, it explicitly tells you the std dev of the sample times is 0.5 seconds, so why not just use that when you get to the bell curve and use z = [x - mu] / 0.5? I don't think I'm the only one who thinks something is wrong here.
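For reference, the arithmetic discussed in this thread can be sketched in a few lines of Python. This is only a sketch using the numbers quoted in the comments (null mean 1.2 s, sample mean 1.05 s, sample standard deviation 0.5 s, n = 100); the variable names are illustrative, and it assumes the sample standard deviation is an acceptable stand-in for the unknown population value.

```python
import math

mu_null = 1.2       # mean response time assumed under the null hypothesis (s)
sample_mean = 1.05  # observed mean of the 100 injected rats (s)
sample_sd = 0.5     # standard deviation of the 100 individual times (s)
n = 100

# Standard deviation of the distribution of sample means (standard error),
# using the sample sd as an estimate of the population sd.
standard_error = sample_sd / math.sqrt(n)

# z-score of the observed sample mean: the quantity under test is a mean,
# so the difference is scaled by the standard error.
z_mean = (sample_mean - mu_null) / standard_error

# Dividing by 0.5 instead answers a different question: how unusual a
# single rat with a time of 1.05 s would be.
z_single = (sample_mean - mu_null) / sample_sd

print(round(standard_error, 3), round(z_mean, 2), round(z_single, 2))
```

The two z-values differ by a factor of sqrt(n) = 10, which is the whole gap between "one rat at 1.05 s" and "an average of 100 rats at 1.05 s".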
liviyabags (14 days ago)
Oh man ... you've made my day !!!!
Teboho Lebone (16 days ago)
Am I the only one who got this the first time?
sandeep mouli Nalluri (24 days ago)
Easy way to remember: P = Probability of null hypothesis is true, which is => u = 1.2 being true. Since the P value calculated is small, only 0.003 (0.3%) of samples taken from the population will actually have u = 1.2. This means u is not equal to 1.2 in 99.7% of samples taken from the population. Therefore we reject the null hypothesis, since it's only true in 0.3% of samples taken from the population.
Haroun Trabelsi (24 days ago)
shouldn't the p value be equal to 0.003/2 since our sample value is on the left side ?
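Whether to halve the value depends on whether the alternative hypothesis is one-sided ("the drug lowers response time") or two-sided ("the drug changes response time"). A small sketch of both numbers for z = -3, using only Python's standard library, is below; the 0.3% figure quoted in these comments matches the two-tailed result.

```python
from statistics import NormalDist

z = -3.0  # z-score of the sample mean in the rat example

std_normal = NormalDist()  # mean 0, standard deviation 1

# Probability of a sample mean at least 3 standard errors BELOW 1.2 s.
p_one_tailed = std_normal.cdf(z)

# Probability of a sample mean at least 3 standard errors away
# from 1.2 s in EITHER direction.
p_two_tailed = 2 * std_normal.cdf(-abs(z))

print(f"one-tailed: {p_one_tailed:.5f}, two-tailed: {p_two_tailed:.5f}")
```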
Ganesh Shelke (26 days ago)
Very nicely explained! It's one of the most important concepts in Data Science and Machine Learning! Thanks team! :)
Edsknife (1 month ago)
"the drug is gwen". Yeah, Gwen *is* my drug.
Luka Pavlovic (1 month ago)
Thanks man this really helped a lot
Yanfeng Liu (1 month ago)
Hypotheseseseses
Maker 003 (1 month ago)
This...makes so much more sense. You just saved me from my stats final, thank you
Why do you do Mu minus X?
Jane Deijnen (1 month ago)
hypothesisesises
no words sir thank you very much
Takashi Huang (1 month ago)
This is amazing! Having heard about Khan Academy a million times, this is my first time watching it. This is too good and I want my tuition back
Shreyash Agarwal (1 month ago)
Why are we calculating the sample standard deviation? Can't we use 0.5 seconds as the sample standard deviation, as it's calculated from the sample only?
James Oh (1 month ago)
That's the sample standard deviation which is different than the standard deviation of the sampling distribution. Watching the Central Limit Theorem video may help you understand.
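The distinction can also be seen in a quick simulation. The sketch below assumes a hypothetical normal population with mean 1.2 s and standard deviation 0.5 s (the real distribution of rat response times is not given), and compares the spread inside one sample of 100 with the spread of the means of many such samples.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population of response times: normal, mean 1.2 s, sd 0.5 s.
POP_MEAN, POP_SD, N = 1.2, 0.5, 100

def draw_sample():
    """Return N simulated response times from the assumed population."""
    return [random.gauss(POP_MEAN, POP_SD) for _ in range(N)]

# Spread of individual times inside ONE sample of 100 rats.
one_sample = draw_sample()
sd_within_sample = statistics.stdev(one_sample)

# Spread of the MEANS of many such samples.
sample_means = [statistics.fmean(draw_sample()) for _ in range(2000)]
sd_of_sample_means = statistics.stdev(sample_means)

print(f"sd within one sample: {sd_within_sample:.3f}")
print(f"sd of sample means:   {sd_of_sample_means:.3f}")
```

The first number should land near 0.5 and the second near 0.5/sqrt(100) = 0.05, which is the value used in the z-score.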
whats the probability I'll pass tomorrow if i inject myself with that drug now
James Lactao (2 months ago)
I think that the null hypothesis should be "There is no difference between the response time of rats injected with the drug (1.05 seconds), and the response time of rats without the drug (1.2 seconds)" rather than saying that mu is equal 1.2 s (even w/ drug) since obviously, it's not (it's 1.05 s).
Monica Liu (2 months ago)
literally every video is so helpful
Siwei Zhang (2 months ago)
But why is 0.5 considered the population standard deviation in the calculation? I thought 0.5 is the sample deviation????
Ben Cooley (2 months ago)
Waiiiiit, is this z stuff or t stuff
Suraj Kumar (2 months ago)
4:12 can you explain this ?
Faizan Mohiuddin (2 months ago)
Khan Academy, regarding determining the z-score, shouldn't it rather be the t-score, because the population SD is unknown? That is, t = (1.05 - 1.2)/(0.05) = -3. The area to the left of t = -3 is 0.0017. This means the P-value is 0.0017, because probability is area under the distribution curve. The sampling distribution is approximately normal b/c the sample size is 100 > 30; thus, this is in accord with the Central Limit Theorem. If I compare this P-value = 0.0017 to a low alpha value, alpha = 0.01, then, yes, reject H null (that is, reject the claim that the drug has no effect). My point is that the t-distribution should have been used and not the z-score distribution. Please advise.
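The size of the z-versus-t difference at n = 100 can be checked numerically. Python's standard library has no Student's t distribution, so the sketch below integrates the t density by hand with Simpson's rule; the helper names `t_pdf` and `t_lower_tail` are made up for this example, and in practice a statistics library would normally be used instead.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def t_lower_tail(t, df, steps=20000, lower=-60.0):
    """Approximate P(T <= t) with Simpson's rule on [lower, t].

    `steps` must be even; the mass below `lower` is negligible here.
    """
    h = (t - lower) / steps
    total = t_pdf(lower, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(lower + i * h, df)
    return total * h / 3

statistic = -3.0   # (1.05 - 1.2) / 0.05
df = 100 - 1       # degrees of freedom for a one-sample t-test

p_t = t_lower_tail(statistic, df)      # one-tailed p-value, t distribution
p_z = NormalDist().cdf(statistic)      # one-tailed p-value, normal distribution

print(f"t (df={df}): {p_t:.4f}   z: {p_z:.4f}")
```

Both values are far below alpha = 0.01, so with 100 observations the choice between z and t does not change the conclusion, although the t-based p-value is slightly larger.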
JordanFireBomb (2 months ago)
My teacher couldn't even help me with this so this was my only choice
SK (3 months ago)
For those of you who are confused, P value is the probability that the data from a given sample is not due to the changes made or external influences. In other words, if the null hypothesis was correct, we would end up with a large probability that the sample mean would still be possible without any external influences i.e. injecting rats in this case. A smaller P value means greater confidence that the results were due to the external factors.
Md Saleh (3 months ago)
great lecture. Can anyone help me with the device he uses for writing.
I thought Z statistic (3 in here) should be compared to Z parameter (not looked at here) to conclude anything.
Ivan (4 months ago)
Why are we estimating the standard deviation of the sampling distribution when we already have the sample standard deviation? Why are we dividing the "population" standard deviation with square root of the "sample" size?
Mesno barole (4 months ago)
how did he find 0.3%
Simone zanetti (4 months ago)
Great man!!
Renato De Leon (5 months ago)
Hello ABM 11 students who are cramming for the exams tomorrow
Roy Flores (5 months ago)
Nice knowing you guys
Minhee Kim (5 months ago)
This lecture definitely helped me a lot.
Monir Real Viva (5 months ago)
How can find 0.003
Yatin Arora (5 months ago)
explained very well thank you for clearing my doubt
rero chan (5 months ago)
I didn't understand 😭💔
Sunny Yoda (5 months ago)
What? I am so confused.... and screwed
hubert1990s (5 months ago)
so puzzling part about sd (6:40). the instruction says "with a sample sd of 0,5", then he says "best estimation of sampling distribution standard deviation" - does it mean population distribution or what?
James Bekurs (5 months ago)
THANK YOU!!!! I was struggling to grasp this concept in a practical sense, and your video helped me connect the dots.
Crystal Ss (6 months ago)
I’m a third stage medical student why I should study statistics 🤯🤯😤🤐
denzel (6 months ago)
i am going to reject i am going to reject i am going to reject
Halle Finn (6 months ago)
isn't it a t score because you're taking the mean? also none of the conditions were met or even tested...
Does anyone know the difference between sample standard deviation and standard deviation of sampling distribution???
Light Yagami (6 months ago)
love you khan
Isabelle Contreras (6 months ago)
Thanks a lot! This video really explains hypothesis testing in a very simple way.
jiabin luo (6 months ago)
sorry. why it's 99.7%? how to calculate. Thanks
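The 99.7% figure is the share of a normal distribution that lies within 3 standard deviations of its mean (the "68-95-99.7" rule). A short sketch with Python's standard library reproduces it; the function name `share_within` is just illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    inside = share_within(k)
    print(f"within {k} sd: {inside:.4f}   outside: {1 - inside:.4f}")
```

The remainder outside 3 standard deviations, about 0.3%, is the p-value quoted for the example.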
ig ig (6 months ago)
thank you you helped me a lot in fact, you save me you are a very great teacher
Krishna Kumar (7 months ago)
I did not understand why we chose to look for values >3 std-devs instead of <3 std-devs or =3 (in which case the prob would be 0).
Amine Zaidi (8 months ago)
guys, what does he mean by result this extreme in 9:50
Amine Zaidi (7 months ago)
Bioengineer (7 months ago)
It means that the null hypothesis is unlikely to be the true population mean. The nearer the null hypothesis is to the sample mean, the greater its probability of being the true population mean. But the result showed it was far away, so you can reject it with a 0.3% probability of being mistaken. There's a 0.3% chance that the true population mean will be greater than or equal to 1.05+3*0.05=1.2s or less than or equal to 1.05-3*0.05=0.9s. As a result, there's a 99.7% chance that 0.9<True pop. mean<1.2.
A T (8 months ago)
I learned more in 12 minutes online than I did for 3 90 minute university lectures lol
Sheldon Tauro (9 months ago)
Could anyone explain why we divided the standard deviation by the root of the sample size?
Shakeel Alam (9 days ago)
Go through his previous videos, he gave good explanations there.
sephiroth (10 months ago)
i am from multimedia design, doing phd now in social science n have to do quanti, n learn statistic, n here i am, some school level statistic lesson..cuz i have zero knowledge on statistic..
SquidySquirts (10 months ago)
So is the P value for the hypothesis (not null) 0.97?
Salma Mohamed (11 months ago)
hey
John Dempsey (11 months ago)
Take a shot every time he repeats himself
Viswanath Viswanath (11 months ago)
We are already given that the sample standard deviation is 0.5, so why do we have to estimate?
abc def (11 months ago)
It would be better if you stopped repeating yourself
Plague Doctor (11 months ago)
How on earth are you teaching this if you don't even know that the plural of hypothesis is hypotheses (hypothe-sees, like see with your eyes)
Spurgeon Green (11 months ago)
Eternally grateful
duc anh duong (11 months ago)
so where does 99.7% come from?
AgeOfTechnology (11 months ago)
Wouldn't that be .03 not .003...
Kashyap Iyer (11 months ago)
shouldn't the p-value be 0.15%? as we have to consider the part only on the left side of the mean, that is 1.05, and not 1.35...
D.F (1 year ago)
Can i borrow your mind for tomorrows exam?
Joana Jenkings (1 year ago)
Can someone tell me when I'm ever going to use this in my everyday life?
Mar Mar Meows (1 year ago)
I’m gonna fail my statistics class
Lloyd Carmona (1 year ago)
Thank you
Selena Rodriguez (1 year ago)
What about the time saying if there's enough or not enough evidence?
Hannah H (1 year ago)
Your videos make everything so much clearer! Thank you so much for sharing 😊
Misaki Ichigo (1 year ago)
how did you know the p value is 0.3%? what's the relation with 3 std deviations?
Moody San (1 year ago)
thank you༼☯﹏☯༽
Jordan - (1 year ago)
Why dont we divide the 0.003 by 2 as we are only looking at one side of the bell curve?
Sizwe Mbokazi (1 year ago)
i need GPS for this
Mark Elrod (1 year ago)
Look at it this way, the null is saying that "there is no way on earth that the average is any value but 1.2 seconds." Then, assuming that is true, we do some math and figure out that if the drug indeed did have no effect, and we randomly sampled mice 100 times, there would be a 0.3% chance that some of those mice had response times of 1.05 seconds. So it would be super improbable to get a value of 1.05 seconds. Now... rewind back to the problem. We were told that the scientist not only had mice with a response time of 1.05, but even better, that was his average response time! This means that it is crazy to think that the drug had no effect, because if it didn't there would only be a 0.3% chance we got a value of 1.05 seconds.
seshant bhansali (29 days ago)
eteoklos man, you should make your own set of videos. Thanks for that explanation :D
eteoklos (4 months ago)
But that the mean of the sample would be 1.05 seems different (even less) than 0.3%, right?
SA 3 (1 year ago)
He says that we assume that the null hypothesis is true. What does our assumption change in the calculations? Would we make other calculations if we assumed the alternative hypothesis was true?
Fun Uber Games (1 year ago)
Jagroop Singh (1 year ago)
Nvm. I'm just gonna skip this question tomorrow
Małgorzata Górska (1 year ago)
Okay, I don't get why you'd choose to reject the null hypothesis if the probability of getting the extreme result of the alternative hypothesis is only 0.3%. For me, if something has a 99.7% chance of happening, then it is almost certain. I feel like I'm stupid here.
Katherine G Carpio (1 year ago)
This legit didn’t help me at all. I don’t get it...
thatnolan (1 year ago)
Yes! Thank you!
MegaMsc123 (1 year ago)
This guy is boring AF
tomlinzombie (1 year ago)
My stats final is at 6:00.... I still dont get it....
Rose B (1 year ago)
i dont understand how it differs from a value?
CK (1 year ago)
5:18 it's not the standard "deviation" of the sample distribution but standard "error" of the sample distribution. se=s/sqrt(n)
Pirilani Banda (1 year ago)
but isn't it that when we are not given the population standard deviation we use the T table? am a bit confused
SheStillRuns (1 year ago)
SALMAN KHAN IS THERE ANYTHING THAT YOU DON'T KNOW? EVEN GODS DIVIDED THE COSMOS AND THE ELEMENTS AMONGST THEMSELVES. SAL KHAN, GOD OF PAN-OLOGY.
Matin Sayed (1 year ago)
Thank you Khan!! You're the best. So much explained under 15 min - incredible!
Prinz von Kirchberg (1 year ago)
Wrong. 99.7% of the prob is within 2.75 deviations...
Nicolai Or Die (1 year ago)
why isn't the sample standard deviation what the video calls the standard deviation of our sampling distribution? I understand that s = 0.5 and that std dev of x bar = 0.05 but i just wanna understand the difference so i know how to clock it on the exam
Peter Uhd (1 year ago)
Thanks dude, clear and easy to follow
Daniel Murillo (1 year ago)
This is an awesome explanation.
rtong12 (1 year ago)
You guys are awesome...even for this middle aged professional...education never ends 😀
Abhay Dhotrekar (1 year ago)
0:38
Rubi Goswami (1 year ago)
the sample's standard deviation is already given to be 0.5 seconds. Why do we calculate it again as 0.5/sqrt(sample size)?
Bretta Sipa (1 year ago)
Why are you not using the one sample t-test? I don't understand why you chose this method?
Maanya P (1 year ago)
why are you not using the t statistic when sigma is not known? it will lead to more accurate conclusions.
Naz Kauser (1 year ago)
can somebody tell me the reason why an author would not add a p value or confidence interval in his randomised controlled trial? the author already stated there is no significant difference, but is there any literature I can use to back up the reason why he didn't add a p value or confidence interval in his randomised controlled trial???
Truly Fave (1 year ago)
This is one of the few khan tutorials that I didn't really understand :(
pablo escobar (1 year ago)
Khan rocks!!!