ST-L1 Intro to Statistics

January 16, 2018 | Author: Anonymous | Category: Math, Statistics And Probability, Statistics

40S Applied Math Mr. Knight – Killarney School

Unit: Statistics Lesson: ST-L1 Intro to Statistics

Intro to Statistics

ST-L1 Objectives: To review measures of central tendency and dispersion. Learning Outcome B-4

Slide 1


Judith and Francine, both age 19, have decided to go on a Caribbean cruise, and they want to have an enjoyable time, which means that they want to travel with other people their own age. They buy tickets for a cruise where the average age of the other passengers is 20 years. Can you imagine their surprise at the start of the cruise when they discover that all the other passengers are parents (average age 32) with children (average age 8)? This lesson includes a brief review of sampling and the calculation of central tendencies. It also introduces 'measures of dispersion,' something Judith and Francine should have known about. Also, you will be introduced to information technology (Winstats) that will do the statistics calculations.

Theory – Intro

Slide 2


Teacher Bundy asked his students to measure the length of the classroom using metre sticks, and to write their measurements (rounded to the nearest millimetre) on the board. It quickly became evident that most of the measurements were quite similar, but a few were lower or higher than the 'main group' of measurements. Whenever a quantity is measured numerous times, there will likely be a variety of results, most of them close to a value believed to be the true value. The term variability in statistics refers to the variety of answers we get when measuring something. You will study ways of describing and interpreting variability in data. For example, you will review ways of finding a 'middle' measurement that represents all the data, and also ways of describing how the data are spread about this 'middle' value.

Theory – Variability, Continuous and Discrete Data

Slide 3


If we count books, the count would consist of whole numbers. This is an example of discrete data, or data you get when counting a finite number of distinct objects. (For example, there may be 71 or 72 books, but not 71.6 books.) If we measure the length of the classroom, we must decide on an acceptable level of precision and round our final answer. We can never say the measurement is exactly correct. Such measurements are continuous data: because the data are not discrete, the final answer is always an approximation (e.g., 71.4 cm, 71.38 cm, or 71.3810728 cm). Other examples of continuous data are measurements of speed, area, and time.

Theory – Variability, Continuous and Discrete Data cont’d

Slide 4


A measure of central tendency is a single quantity or score that in some way represents the 'middle' value of all the data in a sample or population. Three commonly used measures of central tendency are as follows:

1. The mean, or average, of a set of scores is found by adding the scores together and then dividing the sum by the number of scores. The symbols for mean are $\bar{x}$ (x-bar) for the mean of a set of data (sample mean), and $\mu$ (mu) for the mean of a population.

2. The mode for a sample or population is the value that appears most often.

Theory – Measures of Central Tendency

Slide 5


3. The median is the middle term when data are arranged in order of size from the smallest to the largest. Let 'n' represent the number of terms of data. If the data have an odd number of terms, the middle one is the $\frac{n+1}{2}$th term. If the data have an even number of terms, you find the average of the middle two terms. That is, you find the average of the terms at positions $\frac{n}{2}$ and $\frac{n}{2} + 1$.

Theory – Measures of Central Tendency

Slide 6
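The position rule for the median (the (n + 1)/2-th term for an odd number of terms, the average of the terms at positions n/2 and n/2 + 1 for an even number) can be sketched in Python as follows. This is only an illustration: the function name `median_by_position` and the two short data lists are made up for this example and are not part of the lesson.

```python
def median_by_position(values):
    """Return the median using the position rule for odd and even n."""
    ordered = sorted(values)          # arrange from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    if n % 2 == 1:
        # Odd n: the (n + 1)/2-th term, which is index (n - 1) // 2 when counting from 0.
        return ordered[(n - 1) // 2]
    # Even n: average the terms at positions n/2 and n/2 + 1 (counting from 1).
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


odd_example = median_by_position([7, 3, 5])       # hypothetical data, n = 3
even_example = median_by_position([7, 3, 5, 9])   # hypothetical data, n = 4
print(odd_example)   # 5
print(even_example)  # 6.0
```

Python lists count positions from 0, which is why each index in the code is one less than the term number used in the rule.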


A clerk in a men's clothing store keeps a weekly record of the number of pairs of pants sold. The following is her list for two weeks. Calculate the mean, mode, and median for the data shown.

          Mon   Tue   Wed   Thur  Fri   Sat
Week 1    34    40    36    36    38    38
Week 2    32    36    36    42    34    34

Test Your Knowledge – Measures of Central Tendency

Slide 7


The question from the previous page is repeated here: A clerk in a men's clothing store keeps a weekly record of the number of pairs of pants sold. The following is her list for two weeks. Calculate the mean, mode, and median for the data shown.

          Mon   Tue   Wed   Thur  Fri   Sat
Week 1    34    40    36    36    38    38
Week 2    32    36    36    42    34    34

Winstats is used to answer this question. The window on the left-hand side shows the data entered in two columns, and the window on the right shows the overall statistics.

Note that the mean shown is 36.333, and the median is 36.000. These answers are the same as the ones that were calculated. Winstats does not calculate the mode.

Test Your Knowledge – Measures of Central Tendency

Slide 8
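The lesson uses Winstats for these calculations. As an alternative check, the same three measures for the pants data can be computed with Python's standard `statistics` module. This is a minimal sketch, not the course's method; the variable names are chosen only for this example.

```python
from statistics import mean, median, multimode

# Pairs of pants sold, Monday to Saturday, from the table in the question.
week_1 = [34, 40, 36, 36, 38, 38]
week_2 = [32, 36, 36, 42, 34, 34]
sales = week_1 + week_2

sales_mean = mean(sales)        # sum of the scores divided by the number of scores
sales_median = median(sales)    # average of the two middle terms (n = 12 is even)
sales_modes = multimode(sales)  # list of the most frequent value(s)

print(f"mean   = {sales_mean:.3f}")    # mean   = 36.333
print(f"median = {sales_median:.3f}")  # median = 36.000
print(f"mode   = {sales_modes}")       # mode   = [36]
```

`multimode` returns a list because a data set can have more than one mode. Here 36 appears four times, more often than any other value.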


A central tendency is a measure of some kind of 'middle' number for a group of data. We also need ways to measure the variation of the individual data values, and how the data are spread about the central value.

For example, Robin can drive to university using the downtown route or the perimeter route. The downtown route is shorter, but it has more traffic, and can become quite crowded. The driving times in minutes for each route for 5 days (arranged in ascending order) are shown in the table.

Downtown Route     15    26    30    39    45
Perimeter Route    29    30    31    32    33

The average driving time for each route is 31 minutes. Which route should she take? The driving times for the downtown route vary from 15 to 45 minutes, and so she would need to allow 45 minutes travel time to ensure getting to class on time. The driving times for the perimeter route vary from 29 to 33 minutes, and so the travel times are more predictable. As you can see, Robin needs to consider more than the mean travel times when deciding on which route to take.

Theory – Measures of Dispersion

Slide 9


The simplest measure of variation is the range. The range of a set of numbers is the difference between the largest number and the smallest number.

In the previous example (repeated here), the ranges are as follows:

Downtown Route     15    26    30    39    45
Perimeter Route    29    30    31    32    33

Downtown: Range = 45 - 15 = 30 minutes
Perimeter: Range = 33 - 29 = 4 minutes

Other applications of range may be:
• temperature variation for the day
• prices of stocks on the stock market
• marks of students in class

Theory – Range

Slide 10


One limitation of using range to describe the variation in a group of data (for example, student marks in a class) is that range provides information about only two scores - the highest and the lowest - and does not provide any information about all the other scores. One extreme score will make the range very large, even if all the other scores are very close. For example, the marks of the five students in both groups below have a range of 55. The variability in the first group, however, appears to be primarily attributable to one extreme student.

Student marks: 25, 75, 77, 78, 80    Range = 80 - 25 = 55
Student marks: 25, 34, 56, 71, 80    Range = 80 - 25 = 55

Theory – Range

Slide 11
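One way to see how much a single extreme score drives the range is to recompute it with the lowest mark left out. The sketch below does this for the two groups of student marks. The helper name `data_range` and the idea of dropping the lowest mark are illustrations added here, not part of the original lesson.

```python
def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


group_1 = [25, 75, 77, 78, 80]
group_2 = [25, 34, 56, 71, 80]

for name, marks in (("Group 1", group_1), ("Group 2", group_2)):
    full_range = data_range(marks)
    # Drop the single lowest mark to see how much that one score affects the range.
    trimmed_range = data_range(sorted(marks)[1:])
    print(f"{name}: range = {full_range}, range without lowest mark = {trimmed_range}")
```

Both groups have a range of 55. Without the mark of 25, the range of the first group falls to 5, while the range of the second group is still 46.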


The standard deviation is a measure that shows how the data are spread about the mean value. We will use standard deviation to describe data, but we will not use algebraic formulas to calculate standard deviation. Instead, we will use technology (computer program or graphing calculator) to calculate standard deviation. The symbol for standard deviation of a population or large sample is $\sigma$ (sigma), and the symbol for standard deviation of a sample is s. A large sample is defined as a sample with 30 or more data items. In this course, we will use only $\sigma$, which represents the standard deviation of the population.

Theory – The Standard Deviation

Slide 12
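As one example of using technology for this calculation, Python's standard `statistics` module provides `pstdev` for the population standard deviation ($\sigma$) and `stdev` for the sample standard deviation (s). The sketch below applies both to Robin's driving times from the earlier example. Using Python instead of Winstats or a graphing calculator is an assumption made for this illustration.

```python
from statistics import pstdev, stdev

# Driving times in minutes for 5 days, from the earlier route example.
downtown = [15, 26, 30, 39, 45]
perimeter = [29, 30, 31, 32, 33]

# Population standard deviation (sigma), the measure used in this course.
sigma_downtown = pstdev(downtown)
sigma_perimeter = pstdev(perimeter)

# Sample standard deviation (s), shown only for comparison.
s_downtown = stdev(downtown)
s_perimeter = stdev(perimeter)

print(f"Downtown:  sigma = {sigma_downtown:.2f}, s = {s_downtown:.2f}")
print(f"Perimeter: sigma = {sigma_perimeter:.2f}, s = {s_perimeter:.2f}")
```

Both routes have a mean of 31 minutes, but the downtown times give a much larger standard deviation (about 10.41 compared with about 1.41), which matches the earlier observation that the perimeter route is more predictable.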


For example: The mean math mark for Class A is 74, and the standard deviation is 4. Assuming the marks follow a roughly bell-shaped (normal) distribution, this means that about 68 percent of all the marks in the class are within 4 of 74. In other words, about 68 percent of the marks in the class are between 70 (74 - 4) and 78 (74 + 4).

Theory – The Standard Deviation

Slide 13


Calculate Standard Deviation Using Algebra (not really)

Theory – The Standard Deviation

Slide 14


The mean math marks and standard deviation for two classes are shown below. Assume that 68 percent of the marks in each class are within one standard deviation of the mean mark.

           mean mark    standard deviation
Class A    74           4
Class B    72           8

Questions:

1. In which class is the set of marks more dispersed?
2. Bert in Class A and Beth in Class B each have a mark of 82%. How many standard deviations are they from their class means? Who appears to have the better mark?

Theory – The Standard Deviation

Slide 15

Answers:

           mean mark    standard deviation
Class A    74           4
Class B    72           8

1. Class A: 68% of all the marks are from 70 (74 - 4) to 78 (74 + 4). Class B: 68% of all the marks are from 64 (72 - 8) to 80 (72 + 8). Therefore, the marks in Class B are more dispersed.

2. Bert: Number of standard deviations above the mean = $\frac{82 - 74}{4} = 2$.

Beth: Number of standard deviations above the mean = $\frac{82 - 72}{8} = 1.25$.

Therefore, Bert appears to have the better mark because he is 2 standard deviations above his class average, while Beth is only 1.25 standard deviations above hers.

End day 1

Theory – The Standard Deviation

Slide 16
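The comparison of Bert and Beth comes down to one calculation: subtract the class mean from the mark and divide by the class standard deviation. A minimal Python sketch of that step follows. The function name `standard_deviations_from_mean` is made up for this example.

```python
def standard_deviations_from_mean(mark, class_mean, class_sigma):
    """How many standard deviations a mark lies above (+) or below (-) the class mean."""
    return (mark - class_mean) / class_sigma


bert = standard_deviations_from_mean(82, class_mean=74, class_sigma=4)  # Class A
beth = standard_deviations_from_mean(82, class_mean=72, class_sigma=8)  # Class B

print(f"Bert: {bert} standard deviations above the mean")  # 2.0
print(f"Beth: {beth} standard deviations above the mean")  # 1.25
```

Although both marks are 82%, Bert's mark stands out more within his class because the marks in Class A are less dispersed.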


On previous pages all the elements of data were listed in tables. This is fine if the number of elements is relatively small, but becomes awkward if the number of elements gets large -- say 100 elements or more. For this reason, it is convenient to present the data as a frequency distribution. A frequency distribution table shows the number of elements of data (frequency) at each measure. Sometimes the measures need to be grouped, especially if the measures are continuous.


Theory – Grouped Data

Slide 17


Example: The table below is a frequency distribution table that shows the heights of 100 Senior 4 students. The students are grouped into suitable height groups in 7 cm intervals.

height interval    interval mean    # of students
153.5 to 160.5     157              5
160.5 to 167.5     164              16
167.5 to 174.5     171              43
174.5 to 181.5     178              27
181.5 to 188.5     185              9
Total                               100

Determine the mean, median, mode, range, and standard deviation for the student heights. Be sure to select 'grouped data' in Winstats.

Theory – Grouped Data

Slide 18


From the diagram, you can read the following answers.

mean = $\mu$ = 172.33
median = 171.00
mode = 171.00 (i.e., the interval with the largest number of students, 43, read from the input data)
range = 28
standard deviation = $\sigma$ = 6.837

Theory – Grouped Data

Slide 19
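The grouped-data answers can be checked without Winstats by treating every student in an interval as having that interval's mean height. The sketch below expands the frequency table into a list of 100 heights and uses Python's `statistics` module. It is an illustration of the idea under that assumption, not a description of how Winstats computes its results.

```python
from statistics import mean, median, mode, pstdev

# Interval means and number of students, from the frequency distribution table.
interval_means = [157, 164, 171, 178, 185]
frequencies = [5, 16, 43, 27, 9]

# Repeat each interval mean once per student in that interval.
heights = [m for m, f in zip(interval_means, frequencies) for _ in range(f)]

height_mean = mean(heights)
height_median = median(heights)
height_mode = mode(heights)
height_range = max(heights) - min(heights)
height_sigma = pstdev(heights)   # population standard deviation (sigma)

print(f"mean   = {height_mean:.2f}")    # mean   = 172.33
print(f"median = {height_median:.2f}")  # median = 171.00
print(f"mode   = {height_mode}")        # mode   = 171
print(f"range  = {height_range}")       # range  = 28
print(f"sigma  = {height_sigma:.3f}")   # sigma  = 6.837
```

Note that the range here is the difference between the largest and smallest interval means (185 - 157), which is how the value of 28 arises.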


A histogram is a bar graph that shows equal intervals of a measured or counted quantity on the horizontal axis, and the frequencies associated with these intervals on the vertical axis. The data from the previous page is repeated below. The histogram has been drawn with Winstats, and represents the heights of the students in graphical form.

Note that the five bars represent the five height intervals, and the heights of the bars represent the number of students in each height interval. The tallest bar represents the mode height. Such a histogram is known as a Frequency Distribution Graph.

Theory – Histogram

Slide 20
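For readers without Winstats, a rough text version of the histogram for the student heights can be printed with a few lines of Python. This sketch draws the bars horizontally, one `#` per student, rather than vertically as described above. The function name `text_histogram` is made up for this example.

```python
# Height intervals and number of students, from the grouped-data example.
height_intervals = [
    "153.5 to 160.5",
    "160.5 to 167.5",
    "167.5 to 174.5",
    "174.5 to 181.5",
    "181.5 to 188.5",
]
frequencies = [5, 16, 43, 27, 9]


def text_histogram(labels, freqs, symbol="#"):
    """Return one line per interval, with a bar whose length equals the frequency."""
    return [f"{label:<15} {symbol * freq} ({freq})" for label, freq in zip(labels, freqs)]


histogram_lines = text_histogram(height_intervals, frequencies)
for line in histogram_lines:
    print(line)
```

The longest bar belongs to the interval 167.5 to 174.5, which contains the mode height of 171 cm.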


The frequency distribution table below shows the midterm marks of 85 Senior 4 math students at Parksville High. The first column shows the mark interval, the second column the average mark within each mark interval, and the third column the number of students at each mark. Answer the following questions.

mark interval    mark    # of students
29 to 37         33      1
38 to 46         42      4
47 to 55         51      12
56 to 64         60      18
65 to 73         69      24
74 to 82         78      16
83 to 91         87      7
92 to 100        96      3
Total                    85

Sample Problem

Slide 21

