Statistical Hypothesis Testing
Information to accompany SPC Essentials and Productivity Improvement: A Manufacturing Approach
  • NEW from Levinson Productivity Systems: Statistical Process Control Chart Simulator, designed to teach production workers how to read and interpret control charts in an hour or less. It also facilitates very rapid training in the concepts of variation and accuracy. It achieves this by using animated target and histogram (quincunx-style) figures in conjunction with control charts. The price is $50.00, which includes shipping and handling within the U.S. Download the instruction file (1 megabyte, Word document) for free.
Hypothesis testing often confuses people but it is the keystone of most statistical applications. Every acceptance sampling test, designed experiment, and control chart* is a statistical hypothesis test.
  1. Statistical tests separate significant effects from mere luck or random chance.
  2. All hypothesis tests have unavoidable, but quantifiable, risks of reaching the wrong conclusion. Statistical tests always involve Type I (producer's or alpha) and Type II (consumer's or beta) risks. The Type I risk is the chance of deciding that a significant effect is present when it isn't. The Type II risk is the chance of not detecting a significant effect when one exists.
Levinson Productivity Systems, P.C. can train your company's employees in statistical process control or implement statistical controls for real-world manufacturing processes that don't follow the normal (bell curve) distribution. Other products and services include lean manufacturing, Theory of Constraints, and ISO 9000.

Design of Experiments: Applications and Basic Principles
184 PowerPoint slides (including Notes pages for handouts) $85.00.
Download package description as a Word document
Objectives: Overview course, no in-depth mathematical knowledge is required. Includes some Minitab examples.
(1) Know what kind of experiments are available and how they are used.
(2) Know how to interpret results (hypothesis testing, outlier analysis)
  1. Introduction: what is Design of Experiments?
  2. Planning the experiment: randomization, blocking, and replication
  3. Interpreting test statistics (including hypothesis testing)
  4. Types of experiments. One-way Analysis of Variance
  5. Two-factor experiments and interactions
  6. Multi-factor experiments. Factorial designs.
  7. Linear regression
  8. Nonparametric methods
Available on CD-ROM or via E-mail  (contact me for the latter). See Lean Manufacturing for licensing terms (one copy may be in use on a projector at any given time, and you can make unlimited hard copies of the notes pages for your audiences) and directions for printing PowerPoint notes pages for handouts. When ordering by PayPal, please send an E-mail to TheBoss "at" so I will check for the electronic payment, and include your shipping address. Check orders to Levinson Productivity Systems, P.C., 6 Lexington Court, Wilkes-Barre PA 18702.
    Null and Alternate Hypothesis
    Every statistical test tests the null hypothesis H0 against the alternate hypothesis H1. Null means "nothing," and the null hypothesis is that nothing is present. The process change or treatment makes no difference, or the process is operating properly. The null hypothesis is like presumption of innocence.

    "Accepting the null hypothesis" is like acquitting a defendant. It does NOT prove that the null hypothesis is true, or that the defendant is innocent. It means there is a reasonable doubt about the defendant's guilt. In statistical testing, the significance level, Type I risk, or alpha risk is the "reasonable doubt." It is the chance of wrongly rejecting the null hypothesis when it is true. In acceptance sampling, it is the producer's risk, or risk of wrongly rejecting a lot that meets requirements.

    The alternate hypothesis is that the process change or treatment has an effect, or something is wrong with the process. The Type II risk is the chance of accepting the null hypothesis when it is false. The "consumer's risk" is the Type II risk for an acceptance sampling plan. It is the chance of passing a lot that does not meet the requirements. If the Type I risk is the chance of crying wolf, the Type II risk is the chance of not seeing a real wolf. The following table explains hypothesis testing and risks.

State of nature (actual situation): There isn't a problem; the situation is as it should be.
  Decide that there is a problem: False alarm risk (alpha) or Type I risk
    • The risk of crying wolf when there isn't one
    • Risk of convicting an innocent defendant
    • Quality acceptance sampling: risk of rejecting a good lot
    • SPC: risk of calling the process out of control when it is in control
    • Design of experiments (DOE or DOX): risk of concluding that there is a difference between the treatments when there isn't
  Decide that there is no problem: 100% - alpha
    • Chance of acquitting an innocent defendant
    • Quality acceptance sampling: chance of accepting a good lot
    • SPC: chance of calling the process in control when it is
    • DOE: conclude that there is no difference between the treatments when there isn't

State of nature (actual situation): There is a problem; the situation requires adjustment.
  Decide that there is a problem: Power (gamma), a test's ability to detect a real problem, or difference
    • Chance of seeing the wolf
    • Chance of convicting a guilty defendant
    • Quality acceptance sampling: chance of rejecting a bad lot
    • SPC: chance of calling the process out of control when it is
    • DOE: chance of detecting a difference between the treatments
  Decide that there is no problem: Risk of missing the problem, or Type II risk (beta)
    • Risk of not seeing the wolf
    • Risk of acquitting a guilty defendant
    • Quality acceptance sampling: risk of shipping a bad lot
    • SPC: risk of calling the process in control when it is out of control
    • DOE: chance of missing a difference between the treatments
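For acceptance sampling, the four outcomes above can be put into numbers. The following is a minimal sketch, assuming a single sampling plan that accepts a lot when a sample of n items contains c or fewer defects, and a binomial model. The plan (n = 50, c = 1), the "good" and "bad" defect fractions, and the function name are illustrative assumptions, not values from the book.

```python
import math

def prob_accept(n, c, p):
    """Probability that a lot with defect fraction p passes the plan (n, c).

    The lot is accepted when the sample of n items contains c or fewer
    defects. Binomial model: the lot is large relative to the sample.
    """
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

# Hypothetical plan and quality levels, chosen only for illustration.
n, c = 50, 1
good_lot_p = 0.01  # defect fraction of a lot that meets requirements
bad_lot_p = 0.10   # defect fraction of a lot that does not

alpha = 1 - prob_accept(n, c, good_lot_p)  # producer's (Type I) risk
beta = prob_accept(n, c, bad_lot_p)        # consumer's (Type II) risk
power = 1 - beta                           # chance of rejecting the bad lot

print(f"Producer's risk (alpha): {alpha:.3f}")
print(f"Consumer's risk (beta):  {beta:.3f}")
print(f"Power:                   {power:.3f}")
```

Under these assumed numbers, the plan rejects a good lot a little under 9 percent of the time and passes a bad lot a little over 3 percent of the time.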

    Significance level

    This is among the more confusing terms. "Does a 5 percent significance level mean there is only a 5% chance that my results are significant?" No. The significance level is the alpha, or Type I, risk: if the null hypothesis is true, there is a 5 percent chance of rejecting it because of random variation (luck).
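A small simulation can make this concrete. This is a sketch under stated assumptions: the data come from a normal distribution with a known standard deviation of 1, the null hypothesis (mean = 0) is true in every trial, and a two-sided z-test is used. The function name, sample size, and trial count are hypothetical choices for illustration.

```python
import random
from statistics import NormalDist, fmean

def false_alarm_rate(trials=20000, n=5, alpha=0.05, seed=1):
    """Fraction of trials in which a true null hypothesis is rejected."""
    rng = random.Random(seed)
    # Two-sided critical value: alpha/2 in each tail of the standard normal.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        # The null hypothesis is true: every sample really has mean 0, sigma 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = fmean(sample) * n ** 0.5  # (x-bar - 0) / (sigma / sqrt(n)), sigma = 1
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

rate = false_alarm_rate()
print(f"Rejected a true null hypothesis in {rate:.1%} of trials")
```

The rejection rate should land close to 5 percent: that is all the significance level promises, and it says nothing about how likely a particular result is to be "significant."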

    Statistical tables also can be confusing. The significance level (Type I risk, alpha risk) is the UPPER tail of the distribution (for the F and chi square distributions). The figure shows a chi square distribution with 6 degrees of freedom. 12.59 is the 95th percentile of the distribution. If we run an experiment whose result follows a chi square distribution with 6 degrees of freedom, and the null hypothesis is true, we expect to get chi square <= 12.59 95 times out of 100. If chi square > 12.59, a result that large would happen only 5 percent of the time through luck or random variation alone, so we reject the null hypothesis at the 5 percent significance level. We can "convict the defendant (the null hypothesis) beyond a reasonable doubt." In statistics, we can quantify this "doubt" as the significance level.

    F(12.59) is the cumulative chi square distribution evaluated at 12.59. It is the integral of the distribution from 0 to 12.59, and 95 percent of the area under the curve is to the left of 12.59. Alpha, or the upper tail, is the remaining 5 percent.
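The 95 percent and 5 percent figures can be checked without a table. A minimal sketch: for an even number of degrees of freedom, the chi square upper tail has a closed form (a finite sum), so only the standard library is needed. The function name is a hypothetical helper, and the closed form used here does not apply to odd degrees of freedom.

```python
import math

def chi2_cdf_even_df(x, df):
    """Cumulative chi square distribution F(x) for even degrees of freedom.

    For df = 2m, the upper tail is exp(-x/2) * sum_{i<m} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    if x <= 0:
        return 0.0
    half = x / 2
    upper_tail = math.exp(-half) * sum(
        half**i / math.factorial(i) for i in range(df // 2)
    )
    return 1 - upper_tail

cdf = chi2_cdf_even_df(12.59, 6)  # area to the left of 12.59
alpha = 1 - cdf                   # upper tail (significance level)

print(f"F(12.59) = {cdf:.4f}, upper tail = {alpha:.4f}")
```

The result agrees with the table value to about three decimal places: roughly 0.95 to the left of 12.59 and 0.05 in the upper tail.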

Chi square distribution, 6 degrees of freedom


A test's power improves with sample size. Tests also become more powerful as the situation gets worse. As the wolf gets closer, the shepherd is more likely to see it. In statistical process control (SPC), the only way to improve the chart's power without increasing the false alarm rate is to use a bigger sample.
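The effect of sample size on power can be sketched for an X-bar chart. The assumptions here are mine, not the book's: measurements are normally distributed, the control limits sit at +/- 3 standard errors of the sample mean, and the process mean has shifted by a given number of process standard deviations. The function name is hypothetical.

```python
from statistics import NormalDist

def xbar_chart_power(shift_sigmas, n, limit=3.0):
    """Chance that one sample mean falls outside the control limits.

    shift_sigmas: size of the mean shift, in process standard deviations.
    n: sample (subgroup) size.
    limit: control limit width in standard errors of the mean.
    """
    z = NormalDist()
    d = shift_sigmas * n ** 0.5  # the shift measured in standard errors
    above_upper = 1 - z.cdf(limit - d)
    below_lower = z.cdf(-limit - d)
    return above_upper + below_lower

# Same 1-sigma shift, same 3-sigma limits, larger and larger samples.
for n in (1, 4, 9):
    print(f"n = {n}: power = {xbar_chart_power(1.0, n):.4f}")

# With no shift, the "power" is just the false alarm rate (about 0.27 percent).
print(f"false alarm rate = {xbar_chart_power(0.0, 4):.4f}")
```

Under these assumptions, the chance of catching a 1-sigma shift on a single point goes from roughly 2 percent (n = 1) to about 16 percent (n = 4) to about 50 percent (n = 9), while the false alarm rate stays the same because the limits are always 3 standard errors wide.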

In SPC, the average run length (ARL) is the average number of samples that we will take before we detect an out of control condition. It is the reciprocal of the power. If there is a 10 percent chance that a point will be outside a control limit (because the process has shifted) we will, on average, take ten samples before this happens.
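The reciprocal relationship is easy to verify. The sketch below computes the ARL directly and then checks it by simulation, assuming each sample signals independently with the same probability. The function names, run count, and seed are illustrative assumptions.

```python
import random

def average_run_length(p_signal):
    """Expected number of samples up to and including the first signal."""
    if not 0 < p_signal <= 1:
        raise ValueError("p_signal must be in (0, 1]")
    return 1 / p_signal

def simulated_run_length(p_signal, runs=20000, seed=7):
    """Average samples needed to get the first out-of-limit point."""
    rng = random.Random(seed)
    total = 0
    for _ in range(runs):
        samples = 1
        # Keep sampling until a point falls outside the control limits.
        while rng.random() >= p_signal:
            samples += 1
        total += samples
    return total / runs

arl_shifted = average_run_length(0.10)       # power of 10 percent
arl_sim = simulated_run_length(0.10)         # should be close to 10
arl_in_control = average_run_length(0.0027)  # false alarm rate of 3-sigma limits

print(f"ARL at 10% power: {arl_shifted:.1f} (simulated: {arl_sim:.2f})")
print(f"ARL between false alarms: {arl_in_control:.0f}")
```

The same formula applies to false alarms: with a 0.27 percent false alarm risk per sample, an in-control process produces a false signal about once every 370 samples on average.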

* Some people say that a control chart is not a hypothesis test. The original philosophy was that +/- 3 sigma control limits will capture most of the random variation in any process, even one that does not follow a normal distribution (the familiar 99.73 percent figure applies exactly only to the normal distribution). Modern computers, however, can characterize even highly nonnormal processes (e.g. gamma distributions, common when there is a one-sided specification, especially impurities). We can, therefore, characterize a nonnormal process and set appropriate control limits for it. In any event, when we start talking about average run lengths, and a control chart's false alarm risks, this implies a hypothesis test. "It looks like a duck, walks like a duck, and quacks like a duck..."
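To illustrate why a nonnormal process may need its own control limits, here is a rough sketch for one hypothetical case: a gamma distribution with shape 2 and scale 1, picked only because an integer shape gives a closed-form cumulative distribution that needs no statistics package. The helper names and the 0.135 percent upper-tail target (half of the usual 0.27 percent) are assumptions for illustration, not the book's method.

```python
import math

def erlang_cdf(x, shape, scale=1.0):
    """CDF of a gamma distribution with integer shape (Erlang)."""
    if x <= 0:
        return 0.0
    z = x / scale
    tail = math.exp(-z) * sum(z**i / math.factorial(i) for i in range(shape))
    return 1.0 - tail

def erlang_quantile(p, shape, scale=1.0):
    """Value x with CDF(x) = p, found by bisection."""
    lo, hi = 0.0, 1.0
    while erlang_cdf(hi, shape, scale) < p:
        hi *= 2  # expand the bracket until it contains the quantile
    for _ in range(100):
        mid = (lo + hi) / 2
        if erlang_cdf(mid, shape, scale) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

shape, scale = 2, 1.0                # hypothetical impurity-like process
mean = shape * scale
sigma = math.sqrt(shape) * scale

# Naive upper limit at mean + 3 sigma (the lower limit would be negative).
ucl_3sigma = mean + 3 * sigma
false_alarm_3sigma = 1 - erlang_cdf(ucl_3sigma, shape, scale)

# Upper limit set from the fitted distribution for a 0.135% upper-tail risk.
ucl_fitted = erlang_quantile(1 - 0.00135, shape, scale)

print(f"3-sigma UCL = {ucl_3sigma:.2f}, false alarm risk = {false_alarm_3sigma:.4f}")
print(f"Fitted UCL  = {ucl_fitted:.2f}, false alarm risk = 0.00135")
```

For this skewed distribution, the 3-sigma upper limit gives a false alarm risk of roughly 1.4 percent per sample, about ten times the 0.135 percent that the same limit would give for a normal distribution, which is the hypothesis-testing view of the chart in action.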

excerpt from SPC Essentials and Productivity Improvement: A Manufacturing Approach
All material (C) 1996, Intersil Corporation (formerly Harris Semiconductor) or ASQC Quality Press