Chapter 7

February 5, 2018 | Author: Anonymous | Category: Math, Statistics And Probability, Normal Distribution


Week 5 Sep 29 – Oct 3 Four Mini-Lectures QMM 510 Fall 2014

Chapter Contents

7.1 Describing a Continuous Distribution
7.2 Uniform Continuous Distribution
7.3 Normal Distribution
7.4 Standard Normal Distribution
7.5 Normal Approximations
7.6 Exponential Distribution
7.7 Triangular Distribution (Optional)

So many topics, so little time …

Continuous Probability Distributions ML 5.1

Continuous Distributions

Events as Intervals

• Discrete Variable: each value of X has its own probability P(X).
• Continuous Variable: events are intervals and probabilities are areas under continuous curves. A single point has no probability.

Continuous Distributions

PDF: Probability Density Function

Continuous PDF:

• Denoted f(x)
• Must be nonnegative
• Total area under curve = 1
• Mean, variance, and shape depend on the PDF parameters
• PDF reveals the shape of the distribution

Continuous Distributions

CDF: Cumulative Distribution Function

Continuous CDF:

• Denoted F(x)
• Shows P(X ≤ x), the cumulative proportion below x.
• Shows the area to the left of any given point on the PDF.
• There are Excel functions for either the PDF or CDF.

Continuous Distributions

Probabilities as Areas

Continuous probability functions:

• Unlike discrete distributions, the probability at any single point is 0.
• The entire area under any PDF, by definition, is 1.
• The mean is the balance point of the distribution.

Continuous Distributions

Expected Value and Variance

The mean and variance of a continuous random variable are analogous to E(X) and Var(X) for a discrete random variable. Here the integral sign replaces the summation sign, so calculus is required to compute the integrals.
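As a rough illustration of "the integral replaces the summation," the sketch below approximates the integrals numerically with the midpoint rule. The PDF f(x) = 2x on [0, 1] is a hypothetical example chosen for this sketch, not one from the slides; its exact mean is 2/3 and its exact variance is 1/18.

```python
# Midpoint-rule approximation of E(X) and Var(X) for a continuous PDF.
# The PDF f(x) = 2x on [0, 1] is an illustrative assumption, not from the slides.

def integrate(g, lo, hi, steps=100_000):
    """Approximate the integral of g over [lo, hi] with the midpoint rule."""
    width = (hi - lo) / steps
    return sum(g(lo + (i + 0.5) * width) for i in range(steps)) * width

def pdf(x):
    """Hypothetical PDF: nonnegative, with total area 1 on [0, 1]."""
    return 2 * x if 0 <= x <= 1 else 0.0

area = integrate(pdf, 0, 1)                                      # should be 1
mean = integrate(lambda x: x * pdf(x), 0, 1)                     # E(X)
variance = integrate(lambda x: (x - mean) ** 2 * pdf(x), 0, 1)   # Var(X)

print(round(area, 4), round(mean, 4), round(variance, 4))
```

The same three integrals apply to any PDF; only the function and its limits change.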


Normal Distribution

Characteristics of the Normal Distribution

• The normal or Gaussian (bell-shaped) distribution was named for German mathematician Karl Gauss (1777 – 1855).
• Defined by two parameters, µ and σ.
• Denoted N(µ, σ).
• Domain is −∞ < X < +∞ (continuous scale).
• Almost all (99.7%) of the area under the normal curve is included in the range µ − 3σ < X < µ + 3σ.
• Symmetric and unimodal about the mean.

• The normal PDF f(x) reaches a maximum at µ and has points of inflection at µ ± σ.
• Bell-shaped curve. Note: All normal distributions have the same shape but differ in the axis scales.
• The Excel function for the PDF (height of the function at x) is =NORM.DIST(x, µ, σ, 0). The last argument is 0 for the PDF, 1 for the CDF.

• The normal CDF has a “lazy-S” shape.
• The Excel function for the CDF (area to the left of x) is =NORM.DIST(x, µ, σ, 1). The last argument is 0 for the PDF, 1 for the CDF.
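Outside Excel, the same two quantities can be sketched with Python's standard library. The parameters µ = 100 and σ = 15 below are assumed purely for illustration; `NormalDist.pdf` plays the role of the cumulative flag 0 and `NormalDist.cdf` the role of the flag 1.

```python
import math
from statistics import NormalDist

mu, sigma = 100, 15                 # illustrative parameters, not from the slides
dist = NormalDist(mu, sigma)

height_at_mean = dist.pdf(mu)       # comparable to =NORM.DIST(100, 100, 15, 0)
area_left_of_mean = dist.cdf(mu)    # comparable to =NORM.DIST(100, 100, 15, 1)

# The PDF peaks at the mean with height 1 / (sigma * sqrt(2 * pi)).
peak_formula = 1 / (sigma * math.sqrt(2 * math.pi))

print(height_at_mean, area_left_of_mean)
```

By symmetry, half of the area lies to the left of the mean, so the CDF at µ is 0.5 for any normal distribution.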


Characteristics of the Standard Normal Distribution

• Since there is a different normal distribution for every pair of values µ and σ, we transform a normal random variable to the standard normal distribution with µ = 0 and σ = 1 using the formula z = (x − µ)/σ.

Standard Normal Distribution

Characteristics of the Standard Normal

• The standard normal PDF f(z) reaches a maximum at z = 0 and has points of inflection at z = ±1.
• Shape is unaffected by the transformation. It is still a bell-shaped curve.
• Standard normal tables or Excel functions can be used to find the desired probabilities.
• The Excel function for the CDF (area to the left of z) is =NORM.S.DIST(z, 1) (Figure 7.11).

Characteristics of the Standard Normal

• Standard normal CDF.
• A common scale from −3 to +3 is used.
• The entire area under the curve is unity.
• The probability of an event P(z1 < Z < z2) is a definite integral of f(z).
• However, standard normal tables or Excel functions can be used to find the desired probabilities.

Standard Normal Distribution

Normal Areas from Appendix C-1

• Appendix C-1 allows you to find the area under the curve from 0 to z.
• For example, find P(0 < Z < 1.96). The table gives an area of .4750.
• Now find P(−1.96 < Z < 1.96).
• Due to symmetry, P(−1.96 < Z < 0) is the same as P(0 < Z < 1.96).
• So, P(−1.96 < Z < 1.96) = .4750 + .4750 = .9500, or 95% of the area under the curve.
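The table lookup can be cross-checked in code. This is a minimal sketch using Python's `statistics.NormalDist` in place of Appendix C-1; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mu = 0, sigma = 1

# Appendix C-1 style area from 0 to z: subtract the left half (0.5) from the CDF.
area_0_to_196 = z.cdf(1.96) - z.cdf(0)

# Symmetric interval around zero.
area_between = z.cdf(1.96) - z.cdf(-1.96)

print(round(area_0_to_196, 4), round(area_between, 4))
```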

Standard Normal Distribution

Basis for the Empirical Rule

• Approximately 68% of the area under the curve is between µ ± 1σ.
• Approximately 95% of the area under the curve is between µ ± 2σ.
• Approximately 99.7% of the area under the curve is between µ ± 3σ.
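The three percentages can be reproduced from the standard normal CDF. The short sketch below (names are illustrative) computes the area within k standard deviations of the mean for k = 1, 2, 3.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal

# Area within k standard deviations of the mean, for k = 1, 2, 3.
areas = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

for k, area in areas.items():
    print(f"within {k} sigma: {area:.4f}")
```

Because every normal distribution standardizes to the same z scale, these areas hold for any µ and σ.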

Normal Areas from Appendix C-2

• Appendix C-2 allows you to find the area under the curve to the left of z (similar to Excel).
• This table is the CDF (not the PDF). For example:

P(Z < 1.96): =NORM.S.DIST(1.96,1)
P(Z < −1.96): =NORM.S.DIST(-1.96,1)
P(−1.96 < Z < 1.96): =NORM.S.DIST(1.96,1)-NORM.S.DIST(-1.96,1)

Normal Areas from Appendices C-1 and C-2

• Appendices C-1 and C-2 yield identical results.
• Use whichever table is easiest.

Finding z for a Given Area

• Appendices C-1 and C-2 can be used to find the z-value corresponding to a given probability.
• For example, what z-value defines the top 1% of a normal distribution?
• This implies that 49% of the area lies between 0 and z, which gives z = 2.33 by looking for an area of 0.4900 in Appendix C-1.
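As a cross-check on the table lookup, the z-value for the top 1% can be sketched with the inverse CDF: the top 1% starts where the cumulative area to the left reaches 0.99. The variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal

# Top 1% of the distribution means 99% of the area lies to the left.
z_top_1_percent = z.inv_cdf(0.99)

# Area between 0 and that z, which is what Appendix C-1 tabulates.
area_0_to_z = z.cdf(z_top_1_percent) - 0.5

print(round(z_top_1_percent, 2), round(area_0_to_z, 4))
```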

Standard Normal Distribution

Finding Areas Using Standardized Variables

• John took an economics exam and scored 86 points. The class mean was 75 with a standard deviation of 7. What percentile is John in? That is, what is P(X < 86), where X represents the exam scores?
• z = (x − µ)/σ = (86 − 75)/7 = 1.57, so John’s score is 1.57 standard deviations above the mean.
• P(X < 86) = P(Z < 1.57) = .9418 (from Appendix C-2).
• John is approximately in the 94th percentile.

Appendix C-2: Cumulative Standard Normal Distribution (continued). This table shows the normal area less than z.

z     0.00    0.01    0.02    0.03    0.04    0.05    0.06    0.07    0.08    0.09
0.0   0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1   0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2   0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3   0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4   0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5   0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6   0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7   0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8   0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9   0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0   0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1   0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2   0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
1.3   0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
1.4   0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
1.5   0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
1.6   0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
1.7   0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
1.8   0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
1.9   0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
2.0   0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817

Finding Areas by Using Standardized Variables

• You can use Excel, Minitab, TI-83/84, etc. to compute these probabilities directly. The Excel functions are shown below.
• Without standardizing: =NORM.DIST(x, µ, σ, 1), so =NORM.DIST(86, 75, 7, 1) = .9420
• With standardizing: =NORM.S.DIST(z, 1), so =NORM.S.DIST(1.57, 1) = .9418
• The slight difference is due to rounding z to 1.57.
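The two routes (with and without standardizing) can also be sketched in Python, using `statistics.NormalDist` as a stand-in for the Excel functions. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, score = 75, 7, 86

# Without standardizing: evaluate the CDF of N(75, 7) directly at x = 86.
p_direct = NormalDist(mu, sigma).cdf(score)

# With standardizing: compute z, round it to two decimals as a table lookup would,
# then evaluate the standard normal CDF.
z = (score - mu) / sigma
p_rounded_z = NormalDist().cdf(round(z, 2))

print(round(z, 2), round(p_direct, 4), round(p_rounded_z, 4))
```

The gap between the two results comes only from rounding z before the lookup.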

ML 5.2

Inverse Normal

• How can we find the various normal percentiles (5th, 10th, 25th, 75th, 90th, 95th, etc.) known as the inverse normal? That is, how can we find X for a given area?
• We simply turn the standardizing transformation around: solving for x in z = (x − μ)/σ gives x = μ + zσ.

Inverse Normal: Excel

[Figure: Excel inverse normal examples in two panels, “Finding x” and “Finding z”]

Inverse Normal: Example

• John’s economics professor decides that any student who scores below the 10th percentile must retake the exam.
• The exam scores are normal with μ = 75 and σ = 7.
• What is the score that would require a student to retake the exam?
• We need to find the value of x that satisfies P(X < x) = .10.
• The z-score for the 10th percentile is z = −1.28.

Inverse Normal

The logical steps to solve the problem are:

• Use Appendix C to find z = −1.28 to satisfy P(Z < −1.28) = .10.
• Substitute z = −1.28 into z = (x − μ)/σ to get −1.28 = (x − 75)/7.
• Solve for x to get x = 75 − (1.28)(7) = 66.04 (or 66 after rounding).
• Students who score below 66 points on the economics exam will be required to retake the exam.

Or use Excel to obtain z: =NORM.S.INV(0.1) = −1.282

Or use Excel to solve in one step: =NORM.INV(0.1,75,7) = 66.03 (slightly different from 66.04 because z is not rounded)
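The same inverse lookup can be sketched with Python's standard library, where `NormalDist.inv_cdf` stands in for the Excel inverse functions. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma, area = 75, 7, 0.10

# Step 1: z-value with 10% of the area to its left.
z_10 = NormalDist().inv_cdf(area)

# Step 2: turn the standardizing transformation around, x = mu + z * sigma.
cutoff_manual = mu + z_10 * sigma

# One-step alternative: invert the CDF of N(75, 7) directly.
cutoff_direct = NormalDist(mu, sigma).inv_cdf(area)

print(round(z_10, 3), round(cutoff_manual, 2), round(cutoff_direct, 2))
```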

Normal Approximations

Normal Approximation to the Binomial

• Binomial probabilities are difficult to calculate when n is large.
• Use a normal approximation to the binomial distribution.
• As n becomes large, the binomial bars become smaller and the PDF approaches a continuous distribution.

Normal Approximation to the Binomial

• Rule of thumb: when nπ ≥ 10 and n(1 − π) ≥ 10, it is appropriate to use the normal approximation to the binomial distribution.
• Set the normal µ and σ equal to the mean and standard deviation of the binomial distribution: µ = nπ and σ = [nπ(1 − π)]^(1/2).

Example: Coin Flips

• If we flip a coin n = 32 times and π = .50, are the requirements for a normal approximation to the binomial distribution met?
• Yes, because:
  nπ = 32 × .50 = 16 (at least 10 “successes”)
  n(1 − π) = 32 × (1 − .50) = 16 (at least 10 “failures”)
• When translating a discrete scale into a continuous scale, care must be taken about individual points.
• For example, find the probability of more than 17 heads in 32 flips of a fair coin. This can be written as P(X ≥ 18).
• However, “more than 17” actually falls between 17 and 18 on a discrete scale.
• Since the cutoff point for “more than 17” is halfway between 17 and 18, we add 0.5 to the lower limit and find P(X > 17.5).
• This addition to X is called the continuity correction.
• At this point, the problem can be completed as any normal distribution problem.

With µ = nπ = 16 and σ = [32 × .50 × .50]^(1/2) = 2.828:

P(X > 17) = P(X ≥ 18) ≈ P(X ≥ 17.5) = P(Z > 0.53) = 0.2981
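To see how well the continuity-corrected approximation works here, the sketch below compares it with the exact binomial tail probability. The exact sum via `math.comb` is an addition for comparison, not part of the slides, and the variable names are illustrative.

```python
import math
from statistics import NormalDist

n, pi_ = 32, 0.50                        # pi_ is the success probability
mu = n * pi_                             # binomial mean
sigma = math.sqrt(n * pi_ * (1 - pi_))   # binomial standard deviation

# Normal approximation with continuity correction: P(X >= 18) ~ P(X > 17.5).
approx = 1 - NormalDist(mu, sigma).cdf(17.5)

# Exact binomial probability of 18 or more heads.
exact = sum(math.comb(n, k) * pi_**k * (1 - pi_)**(n - k) for k in range(18, n + 1))

print(round(approx, 4), round(exact, 4))
```

The approximation differs slightly from the slide's .2981 because the code does not round z to 0.53 before looking up the area.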

Normal Approximation to the Poisson

• The normal approximation to the Poisson distribution works best when λ is large (e.g., when λ exceeds the values in Appendix B).
• Set the normal µ and σ equal to the mean and standard deviation of the Poisson distribution: µ = λ and σ = λ^(1/2).

Example: Utility Bills

• On Wednesday between 10 a.m. and noon, customer billing inquiries arrive at a mean rate of 42 inquiries per hour at Consumers Energy. What is the probability of receiving more than 50 calls in an hour?
• λ = 42, which is too big to use the Poisson table.
• Use the normal approximation with µ = 42 and σ = 42^(1/2) = 6.48074.
• To find P(X > 50) calls, use the continuity-corrected cutoff point halfway between 50 and 51 (i.e., X = 50.5).
• At this point, the problem can be completed as any normal distribution problem.
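One way to finish the calculation is sketched below, using the continuity-corrected cutoff of 50.5. The exact Poisson sum is included only as a comparison and is not part of the slides; the variable names are illustrative.

```python
import math
from statistics import NormalDist

lam = 42                          # mean arrivals per hour
mu, sigma = lam, math.sqrt(lam)   # Poisson mean and standard deviation

# Standardize the continuity-corrected cutoff halfway between 50 and 51.
z = (50.5 - mu) / sigma

# Normal approximation to P(X > 50).
approx = 1 - NormalDist(mu, sigma).cdf(50.5)

# Exact Poisson tail: 1 - P(X <= 50).
exact = 1 - sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(51))

print(round(z, 2), round(approx, 4), round(exact, 4))
```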

Bottom Line:

• With Excel, we do not need these approximations for calculations.
• They are still useful when Excel is not available.
• They are taught to show the logical connection between discrete and continuous distributions.

ML 5.3

Exponential Distribution

Characteristics of the Exponential Distribution

• If events per unit of time follow a Poisson distribution (e.g., customer arrivals), the waiting time until the next event (e.g., customer arrival) follows the exponential distribution.
• The time until the next event is a continuous variable.

Note: We seek tail probabilities such as P(X ≥ x) or P(X ≤ x).

• Probability of waiting less than or equal to x: P(X ≤ x) = 1 − e^(−λx)
• Probability of waiting more than x: P(X > x) = e^(−λx)

Note: A point has no area, so P(X ≤ x) is the same as P(X < x) and, similarly, P(X > x) is the same as P(X ≥ x).

Example: Customer Waiting Time

• Between 2 p.m. and 4 p.m. on Wednesday, patient insurance inquiries arrive at Blue Choice insurance at a mean rate of 2.2 calls per minute.
• What is the probability of waiting more than 30 seconds (i.e., 0.50 minutes) for the next call?
• Set λ = 2.2 events/min and x = 0.50 min.
• P(X > 0.50) = e^(−λx) = e^(−(2.2)(0.5)) = .3329, or a 33.29% chance of waiting more than 30 seconds for the next call.
• P(X ≤ 0.50) = 1 − .3329 = .6671
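The two tail probabilities can be reproduced with a few lines of Python. This is a minimal sketch; the variable names are illustrative.

```python
import math

lam = 2.2   # mean arrival rate, calls per minute
x = 0.50    # waiting time in minutes (30 seconds)

p_wait_more = math.exp(-lam * x)   # P(X > x) = e^(-lambda * x)
p_wait_less = 1 - p_wait_more      # P(X <= x) = 1 - e^(-lambda * x)

print(round(p_wait_more, 4), round(p_wait_less, 4))
```

Note that λ and x must use the same time unit, which is why 30 seconds is converted to 0.50 minutes.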

Inverse Exponential

• If the mean arrival rate is 2.2 calls per minute, what is the 90th percentile for waiting time (the top 10% of waiting time)?
• Find the x-value that defines the upper 10%.
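A sketch of one way to solve this, assuming the exponential CDF P(X ≤ x) = 1 − e^(−λx): setting it equal to 0.90 and solving for x gives x = −ln(1 − 0.90)/λ. The variable names below are illustrative.

```python
import math

lam = 2.2           # mean arrival rate, calls per minute
percentile = 0.90   # area to the left of the cutoff

# Solve 1 - e^(-lam * x) = percentile for x.
x_90 = -math.log(1 - percentile) / lam

# Check: plugging x_90 back into the CDF should return 0.90.
check = 1 - math.exp(-lam * x_90)

print(round(x_90, 4), "minutes, or about", round(x_90 * 60, 1), "seconds")
```

Under these assumptions, about 90% of waiting times fall below roughly 1.05 minutes.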

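Mean Time Between Events

• The mean of the exponential distribution is 1/λ, so the mean time between events is the reciprocal of the mean arrival rate (e.g., 1/2.2 = 0.4545 minutes between calls when λ = 2.2 calls per minute).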

Bottom Line: You may encounter the exponential model in any situation that involves customer arrivals, waiting lines, and queueing (e.g., retail business, call center, concert, theme park, bank, grocery store, airline check-in, traffic planning). Such applications are not rare in our crowded world. Study simulation (Chapter 18) to learn more about how such situations can be modeled to plan facility capacity, predict waiting times, and study system throughput.


ML 5.4


Other Continuous Distributions

Characteristics of the Triangular Distribution

• The triangular distribution is a way of thinking about variation that corresponds rather well to what-if analysis in business.
• It is not surprising that business analysts are attracted to the triangular model.
• Its finite range and simple form are more understandable than a normal distribution.
• It is more versatile than a normal because it can be skewed in either direction.
• Yet it has some of the nice properties of a normal, such as a distinct mode.
• The triangular model is especially handy for what-if analysis when the business case depends on predicting a stochastic variable (e.g., the price of a raw material, an interest rate, a sales volume).
• If the analyst can anticipate the range (a to c) and most likely value (b), it will be possible to calculate probabilities of various outcomes.
• Many times, distributions will be skewed, so a normal wouldn’t be much help.

Triangular Distribution: Example T(15, 20, 30)

[Figure: triangular distribution T(15, 20, 30)]
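The slide's worked figures did not survive extraction, so the following is only a sketch using the standard triangular formulas, assuming T(a, b, c) means minimum a = 15, mode b = 20, and maximum c = 30. The query P(X > 25) is a hypothetical example, not taken from the slides.

```python
import math

a, b, c = 15, 20, 30   # assumed order: minimum, mode (most likely value), maximum

def triangular_cdf(x):
    """P(X <= x) for the triangular distribution T(a, b, c)."""
    if x <= a:
        return 0.0
    if x <= b:
        return (x - a) ** 2 / ((c - a) * (b - a))
    if x < c:
        return 1 - (c - x) ** 2 / ((c - a) * (c - b))
    return 1.0

mean = (a + b + c) / 3
std_dev = math.sqrt((a*a + b*b + c*c - a*b - a*c - b*c) / 18)

p_below_mode = triangular_cdf(20)    # area to the left of the mode
p_above_25 = 1 - triangular_cdf(25)  # hypothetical what-if query

print(round(mean, 2), round(std_dev, 2), round(p_below_mode, 4), round(p_above_25, 4))
```

Because the mode sits closer to the minimum than to the maximum, this distribution is skewed right, and only one third of the area lies below the mode.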


Uniform Continuous Distribution

Characteristics of the Uniform Distribution

If X is a random variable that is uniformly distributed between a and b, its PDF has constant height.

• Denoted U(a, b)
• Area = base × height = (b − a) × 1/(b − a) = 1

Example: Anesthesia Effectiveness

• An oral surgeon injects a painkiller prior to extracting a tooth. Given the varying characteristics of patients, the surgeon views the time for anesthesia effectiveness as a uniform random variable that takes between 15 minutes and 30 minutes.
• X is U(15, 30).
• With a = 15 and b = 30, find the mean and standard deviation: µ = (15 + 30)/2 = 22.5 minutes and σ = [(30 − 15)²/12]^(1/2) = 4.33 minutes.
• Find the probability that the anesthetic takes between 20 and 25 minutes to become effective:

P(20 < X < 25) = (25 − 20)/(30 − 15) = 5/15 = 0.3333 = 33.33%
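The uniform calculations are simple enough to sketch as a small helper. The function name `uniform_prob` is hypothetical and introduced only for this illustration.

```python
import math

a, b = 15, 30   # minutes: U(15, 30)

mean = (a + b) / 2
std_dev = math.sqrt((b - a) ** 2 / 12)

def uniform_prob(lo, hi):
    """P(lo < X < hi) for U(a, b): interval width times the constant height 1/(b - a)."""
    lo, hi = max(lo, a), min(hi, b)   # clip the interval to the distribution's range
    return max(hi - lo, 0) / (b - a)

p_20_to_25 = uniform_prob(20, 25)

print(mean, round(std_dev, 2), round(p_20_to_25, 4))
```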


Uniform Continuous Distribution

Uses of Uniform Distribution

• Can be a conservative “what-if” baseline model.
• Excel’s =RAND() function follows this model with a = 0 and b = 1:

μ = (a + b)/2 = (0 + 1)/2 = .5000
σ = [(b − a)²/12]^(1/2) = [(1 − 0)²/12]^(1/2) = [1/12]^(1/2) = .2887

Try it yourself! Calculate a bunch of =RAND() values in Excel, and look at the mean and standard deviation. They should be close to the above predictions (if the sample is large).
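The same experiment can be sketched in Python, where `random.random()` draws from U(0, 1) much as =RAND() does. The seed and sample size are arbitrary choices for this illustration.

```python
import random
import statistics

random.seed(510)   # arbitrary seed so the run is repeatable
sample = [random.random() for _ in range(50_000)]   # U(0, 1) draws

sample_mean = statistics.fmean(sample)
sample_sd = statistics.stdev(sample)

# Theoretical values for U(0, 1): mean 0.5 and standard deviation sqrt(1/12).
print(round(sample_mean, 4), round(sample_sd, 4))
```

With a sample this large, both statistics should land close to .5000 and .2887; a small sample can miss by noticeably more.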

Sample =RAND() values:

0.84328 0.33170 0.45351 0.53490 0.46443 0.43802 0.00549
0.68397 0.69134 0.56953 0.04807 0.70129 0.15553 0.96473
0.62752 0.98558 0.25002 0.37406 0.08978 0.32222 0.63328
0.09071 0.65731 0.36416 0.78566 0.05013 0.29142 0.28581
0.72185 0.73706 0.34992 0.79984 0.33627 0.71570 0.82808
0.34901 0.61517 0.09537 0.47772 0.25935 0.27208 0.81790
0.78645 0.97143 0.80646 0.14220 0.50000 0.36504 0.59686

Mean = 0.494637, St Dev = 0.271894

Comparison of Models

• The normal distribution is used most often.
• The exponential is useful in modeling waiting lines (queues).
• The triangular distribution is a way of thinking about variation that corresponds well to what-if analysis in business.
• The uniform distribution is useful as a baseline model or for random sampling (randomizing a list).
