Statistical Hypothesis Testing
Information to accompany SPC Essentials and Productivity Improvement: A Manufacturing Approach
Every statistical test tests the null hypothesis H0 against the alternate hypothesis H1. Null means "nothing," and the null hypothesis is that nothing is present: the process change or treatment makes no difference, or the process is operating properly.

The null hypothesis is like the presumption of innocence. "Accepting the null hypothesis" is like acquitting a defendant. It does NOT prove that the null hypothesis is true, or that the defendant is innocent. It means only that there is a reasonable doubt about the defendant's guilt.

In statistical testing, the significance level, Type I risk, or alpha risk is the "reasonable doubt." It is the chance of wrongly rejecting the null hypothesis when it is true. In acceptance sampling, it is the producer's risk, or the risk of wrongly rejecting a lot that meets requirements.

The alternate hypothesis is that the process change or treatment has an effect, or that something is wrong with the process. The Type II risk is the chance of accepting the null hypothesis when it is false. The "consumer's risk" is the Type II risk for an acceptance sampling plan. It is the chance of passing a lot that does not meet the requirements. If the Type I risk is the chance of crying wolf, the Type II risk is the chance of not seeing a real wolf. The following table explains hypothesis testing and risks.

| Decision | H0 is true | H0 is false |
| Accept H0 | Correct decision | Type II risk (consumer's risk) |
| Reject H0 | Type I risk (alpha risk, producer's risk) | Correct decision |
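The producer's and consumer's risks described above can be put into numbers for a single acceptance sampling plan. The sketch below is only an illustration: the plan (inspect 50 items, accept the lot with at most 1 defective) and the two quality levels (1 percent and 8 percent defective) are hypothetical values that do not come from the book, and the calculation assumes a binomial model, which is reasonable when the lot is much larger than the sample.

```python
from math import comb


def accept_probability(n: int, c: int, p: float) -> float:
    """Chance that a sample of n items contains c or fewer defectives.

    Assumes a binomial model: each item is defective with probability p,
    independently of the others.
    """
    return sum(comb(n, d) * p**d * (1 - p) ** (n - d) for d in range(c + 1))


# Hypothetical plan and quality levels, chosen only for illustration.
SAMPLE_SIZE = 50     # items inspected per lot
ACCEPT_NUMBER = 1    # accept the lot if the sample has this many defectives or fewer
GOOD_LOT = 0.01      # fraction defective in a lot that meets requirements
BAD_LOT = 0.08       # fraction defective in a lot that does not

# Type I (producer's) risk: wrongly rejecting the lot that meets requirements.
producer_risk = 1 - accept_probability(SAMPLE_SIZE, ACCEPT_NUMBER, GOOD_LOT)

# Type II (consumer's) risk: wrongly passing the lot that does not.
consumer_risk = accept_probability(SAMPLE_SIZE, ACCEPT_NUMBER, BAD_LOT)

print(f"Producer's (Type I) risk:  {producer_risk:.3f}")
print(f"Consumer's (Type II) risk: {consumer_risk:.3f}")
```

Raising the acceptance number lowers the producer's risk but raises the consumer's risk. With a fixed sample size, one risk can be reduced only at the expense of the other.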
Significance level

This is among the more confusing terms. "Does a 5 percent significance level mean there is only a 5 percent chance that my results are significant?" No. The significance level is actually the alpha, or Type I risk. If the null hypothesis is true, there is a 5 percent chance of rejecting it because of random variation (luck).

Statistical tables also can be confusing. The significance level (Type I risk, alpha risk) is the UPPER tail of the distribution (for the F and chi square distributions). The figure shows a chi square distribution with 6 degrees of freedom. 12.59 is the 95th percentile of the distribution. If we run an experiment whose result follows a chi square distribution with 6 degrees of freedom, and the null hypothesis is true, we expect to get chi square <= 12.59 95 times out of 100. If chi square > 12.59, a result that extreme would occur only 5 percent of the time through luck or variation if the null hypothesis were true, so we reject the null hypothesis at the 5 percent significance level. We can "convict the defendant (the null hypothesis) beyond a reasonable doubt." In statistics, we can quantify this "doubt" as the significance level.

F(12.59) is the cumulative chi square distribution evaluated at 12.59. It is the integral of the distribution from 0 to 12.59, and 95 percent of the area under the curve is to the left of 12.59. Alpha, or the upper tail, is 5 percent (the rest).

[Figure: chi square distribution with 6 degrees of freedom]

Power

A test's power is the chance of detecting a real effect, or 1 minus the Type II risk. Power improves with sample size. Tests also become more powerful as the situation gets worse. As the wolf gets closer, the shepherd is more likely to see it. In statistical process control (SPC), the only way to improve the chart's power without increasing the false alarm rate is to use a bigger sample. In SPC, the average run length (ARL) is the average number of samples that we will take before we detect an out of control condition. It is the reciprocal of the power.
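The figures in the significance level and power discussions above can be checked with a few lines of standard-library Python. This is a minimal sketch under stated assumptions, not the book's method: the chi square tail uses a closed form that holds only for even degrees of freedom, and the power calculation assumes a Shewhart X-bar chart with +/- 3 sigma limits, normally distributed data, and a known sigma. All function names are illustrative.

```python
from math import erfc, exp, sqrt


def chi2_upper_tail_6df(x: float) -> float:
    """Upper tail area (alpha) of a chi square distribution with 6 degrees of freedom.

    For 6 degrees of freedom the tail has the closed form
    exp(-x/2) * (1 + x/2 + (x/2)**2 / 2).
    """
    h = x / 2.0
    return exp(-h) * (1.0 + h + h * h / 2.0)


def normal_cdf(z: float) -> float:
    """Cumulative standard normal distribution."""
    return 0.5 * erfc(-z / sqrt(2.0))


def xbar_power(shift_in_sigmas: float, sample_size: int) -> float:
    """Chance that one sample average falls outside +/- 3 sigma control limits.

    Assumes (for illustration) normal data, known sigma, and a process mean
    that has moved by shift_in_sigmas process standard deviations.
    """
    scaled_shift = shift_in_sigmas * sqrt(sample_size)
    above_upper_limit = normal_cdf(scaled_shift - 3.0)
    below_lower_limit = normal_cdf(-scaled_shift - 3.0)
    return above_upper_limit + below_lower_limit


def average_run_length(power: float) -> float:
    """Average number of samples until a point falls outside the limits."""
    return 1.0 / power


print(f"Upper tail beyond 12.59 (6 df): {chi2_upper_tail_6df(12.59):.4f}")

# Power and ARL for a 1 sigma shift at several hypothetical sample sizes.
for n in (1, 4, 9):
    power = xbar_power(1.0, n)
    print(f"n = {n}: power = {power:.4f}, ARL = {average_run_length(power):.1f}")
```

With a shift of zero, `xbar_power` returns the false alarm rate of about 0.0027 per sample for any sample size. A bigger sample raises the power against a real shift while leaving that false alarm rate unchanged.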
If there is a 10 percent chance that a point will be outside a control limit (because the process has shifted), we will, on average, take ten samples before this happens.

* Some people say that a control chart is not a hypothesis test. The original philosophy was that +/- 3 sigma control limits will capture most of the random variation (~99.73 percent for a normal distribution) in any process, even one that does not follow a normal distribution. Modern computers, however, can characterize even highly nonnormal processes (e.g. gamma distributions, which are common when there is a one-sided specification, especially for impurities). We can, therefore, characterize a nonnormal process and set appropriate control limits for it. In any event, when we start talking about average run lengths and a control chart's false alarm risks, this implies a hypothesis test. "It looks like a duck, walks like a duck, and quacks like a duck..."

Excerpt from SPC Essentials and Productivity Improvement: A Manufacturing Approach
All material (C) 1996, Intersil Corporation (formerly Harris Semiconductor) or ASQC Quality Press