Basic Statistics

KNES 510 Research Methods in Kinesiology

How Software is Used in Statistics

Types of software for statistics:
- Minitab
- Statistical Analysis System (SAS)
- Statistical Package for the Social Sciences (SPSS)
- Predictive Analytics Software (PASW)

Why not Microsoft Excel? http://www.youtube.com/

Why We Need Statistics

Statistics is an objective way of interpreting a collection of observations.

Types of statistics:
1. Descriptive: central tendency, variability
2. Correlational
3. Inferential: differences within or between groups

Ways to Select a Sample

1. Random sampling: tables of random numbers
2. Stratified random sampling: strata = small groups; sample from each stratum
3. Systematic sampling: pick a start and sample every nth number
4. Random assignment

Justifying post hoc explanations: convenience sample?

How good does the sample have to be? Good enough for our purposes!
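The first three selection methods can be sketched in Python. This is only an illustration and not part of the original slides: the population of 100 numbered participants, the two strata, the seed, and the function names are all assumptions made for the example.

```python
import random


def simple_random_sample(population, k, rng):
    """Random sampling: every member has an equal chance of selection."""
    return rng.sample(population, k)


def stratified_random_sample(strata, k_per_stratum, rng):
    """Stratified random sampling: draw a random sample from each stratum."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}


def systematic_sample(population, start, step):
    """Systematic sampling: pick a start and take every nth member."""
    return population[start::step]


# Hypothetical population: participant IDs 1..100
population = list(range(1, 101))
# Hypothetical strata: two groups of 50 participants each
strata = {"group_a": population[:50], "group_b": population[50:]}

rng = random.Random(510)  # fixed seed so the example is repeatable

random_ids = simple_random_sample(population, 10, rng)
stratified_ids = stratified_random_sample(strata, 5, rng)
systematic_ids = systematic_sample(population, start=3, step=10)

print(random_ids)
print(stratified_ids)
print(systematic_ids)
```

The stratified version guarantees that each group contributes the same number of participants, which a simple random sample of the whole population does not.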

Descriptive Statistics

Descriptive statistics are used to summarize or condense a group of scores. They include measures of central tendency and measures of variability.

Humans Mean=100 SD=15

Central Tendency

Measures of central tendency describe the average or common score of a group of scores. Common measures of central tendency include the mean, median, and mode.

Mean

The mean is the arithmetic average of the scores. The calculation of the mean considers both the number of scores and their value. The formula for the mean of the variable X is:

$$M = \frac{\sum X}{n}$$

Mean

Six men with high serum cholesterol participated in a study to examine the effects of diet on cholesterol. At the beginning of the study, their serum cholesterol levels (mg/dL) were: 366, 327, 274, 292, 274, 230. Determine the mean.

Mean

$$M = \frac{366 + 327 + 274 + 292 + 274 + 230}{6} = \frac{1{,}763}{6} = 293.83$$
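The same arithmetic can be checked in a few lines of Python. This is a minimal sketch using the cholesterol values from the example; the variable names are chosen for illustration only.

```python
# Serum cholesterol levels (mg/dL) from the worked example
cholesterol = [366, 327, 274, 292, 274, 230]

total = sum(cholesterol)  # sum of X
n = len(cholesterol)      # number of scores
mean = total / n          # M = (sum of X) / n

print(total, n, round(mean, 2))
```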

Calculating the Mean Using SPSS

The Analyze -> Descriptive Statistics -> Frequencies command may be used to determine the mean (you will need to select the “Statistics…” button to choose “Mean”).

Statistics: Cholesterol
N Valid      6
N Missing    0
Mean         293.83

Median

The median is the middle point in an ordered distribution at which an equal number of scores lie on each side of it. It is also known as the 50th percentile (P50), or 2nd quartile (Q2).

Median

The position of the median (Mdn) can be calculated as follows:

$$\text{Mdn position} = \frac{n + 1}{2}$$

Median

Example: Calculate the median for the following measurements for height: 71”, 73”, 74”, 75”, 72”.

Median

Step One: Place the scores in order from lowest to highest: 71”, 72”, 73”, 74”, 75”.

Step Two: Calculate the position of the median using the following formula:

$$\text{Mdn position} = \frac{n + 1}{2} = \frac{5 + 1}{2} = 3 \quad \text{(the 3rd score)}$$

Median

Step Three: Determine the value of the median by counting from either the highest or the lowest score until the desired score is reached (in this case the 3rd score, 73”).

Median

Suppose that in our previous distribution we had a sixth score as follows: 71”, 72”, 73”, 74”, 74”, 75”. What are the position and value of the median?
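The three steps can be sketched in Python for both height distributions. One assumption goes beyond the slides: when the position falls between two scores (an even number of scores), the sketch averages the two neighboring scores, which is the usual convention. The function name is hypothetical.

```python
import math


def median_position_and_value(scores):
    """Return the (1-based) position of the median and its value."""
    ordered = sorted(scores)              # Step One: order the scores
    position = (len(ordered) + 1) / 2     # Step Two: (n + 1) / 2
    # Step Three: count to that position; if it falls between two scores,
    # average the scores on either side (assumed convention).
    lower = ordered[math.floor(position) - 1]
    upper = ordered[math.ceil(position) - 1]
    return position, (lower + upper) / 2


five_heights = [71, 73, 74, 75, 72]       # inches
six_heights = [71, 72, 73, 74, 74, 75]    # inches

print(median_position_and_value(five_heights))
print(median_position_and_value(six_heights))
```

For the five scores the position is a whole number, so the median is an actual score; for the six scores the position is 3.5, halfway between the 3rd and 4th scores.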

Median

Consider the following example: Nine people each perform 40 sit-ups, and one does 1,000. The median score for the group is 40, and the mean (arithmetic average) is 136. The median would still be 40 even if the highest score were 2,000 instead of 1,000.

The Median is Unaffected by Extreme Scores

Statistics: Sit-Ups (highest score 1,000)
N Valid      10
N Missing    0
Mean         136.00
Median       40.00

Statistics: Sit-Ups (highest score 2,000)
N Valid      10
N Missing    0
Mean         236.00
Median       40.00
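The sit-up example is easy to reproduce with Python's standard `statistics` module. This is a small sketch; the list names are made up for the illustration.

```python
from statistics import mean, median

# Nine people perform 40 sit-ups; one person performs 1,000 (or 2,000)
situps_1000 = [40] * 9 + [1000]
situps_2000 = [40] * 9 + [2000]

for scores in (situps_1000, situps_2000):
    # The mean moves with the extreme score; the median does not
    print("mean:", mean(scores), "median:", median(scores))
```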

Mode

The mode is the most frequently occurring score.

Which of the following scores is the mode? 3, 7, 3, 9, 9, 3, 5, 1, 8, 5

Similarly, for another data set (2, 4, 9, 6, 4, 6, 6, 2, 8, 2), there are two modes. What are they?

What is the mode for 7, 7, 6, 6, 5, 5, 4, and 4?

Mode

A distribution with a single mode is said to be unimodal. A distribution with more than one mode is said to be bimodal, trimodal, etc., or in general, multimodal.
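The three data sets from the mode questions can be classified with `statistics.multimode` (Python 3.8 or later). This is an illustrative sketch: the helper name and the rule "every distinct score ties, so report no mode" are assumptions made here, not something the slides define.

```python
from statistics import multimode


def describe_modes(scores):
    """Return the mode(s) in ascending order and a label for the distribution."""
    modes = sorted(multimode(scores))
    # Assumed rule: if every distinct score is equally frequent, call it "no mode"
    if len(modes) > 1 and len(modes) == len(set(scores)):
        return modes, "no mode"
    labels = {1: "unimodal", 2: "bimodal", 3: "trimodal"}
    return modes, labels.get(len(modes), "multimodal")


set_one = [3, 7, 3, 9, 9, 3, 5, 1, 8, 5]
set_two = [2, 4, 9, 6, 4, 6, 6, 2, 8, 2]
set_three = [7, 7, 6, 6, 5, 5, 4, 4]

for data in (set_one, set_two, set_three):
    print(describe_modes(data))
```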

Calculating the Mode Using SPSS

The Analyze -> Descriptive Statistics -> Frequencies command may be used to calculate the mode (you will need to select the “Statistics…” button to choose the mode, etc.). Note differences in the SPSS output when the distribution is unimodal, multimodal, or when there is no mode.

SPSS Output - Unimodal

Statistics: Scores
N Valid      10
N Missing    0
Mode         3

SPSS Output - Bimodal

Statistics: Scores
N Valid      10
N Missing    0
Mode         2a

a. Multiple modes exist. The smallest value is shown.

SPSS Output - No Mode

Statistics: Scores
N Valid      8
N Missing    0
Mode         4a

a. Multiple modes exist. The smallest value is shown.

Variability

Measures of variability describe the extent of similarity or difference in a set of scores. These measures include the range, standard deviation, and variance.

Standard Deviation (SD)

Standard deviation: a measure of the variability, or spread, of a set of scores around the mean. Intuitively, the sum of the differences between each score and the mean (known as deviation scores) appears to be a good approach for measuring variability around the mean.

SD

Symbolically, we can write this as

$$\sum (X - M)$$

Let’s use the scores 1, 2, 6, 6, and 15, where

$$M = 6$$

SD

Now let’s calculate the sum of the deviation scores:

$$\sum (X - M) = (1-6) + (2-6) + (6-6) + (6-6) + (15-6) = (-5) + (-4) + 0 + 0 + 9 = -9 + 9 = 0$$

SD

We can avoid this problem (deviation scores sum to 0) by squaring each deviation score before summing them. This would be written symbolically as

$$\sum (X - M)^2$$

SD

Substituting our X scores again,

$$\sum (X - M)^2 = (1-6)^2 + (2-6)^2 + (6-6)^2 + (6-6)^2 + (15-6)^2 = (-5)^2 + (-4)^2 + 0^2 + 0^2 + 9^2 = 25 + 16 + 0 + 0 + 81 = 122$$

SD

We then divide this value by n-1 to arrive at the mean squared deviation:

$$\frac{122}{4} = 30.5$$

We then take the square root of this value to bring the units back to the raw score units:

$$\sqrt{30.5} = 5.52$$
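The derivation for the scores 1, 2, 6, 6, and 15 can be followed step by step in Python. This is a sketch with illustrative variable names; the final line cross-checks the hand calculation against `statistics.stdev`, which also uses the n-1 denominator.

```python
import math
import statistics

scores = [1, 2, 6, 6, 15]
n = len(scores)
m = sum(scores) / n                            # M = 6

deviations = [x - m for x in scores]           # X - M for each score
sum_of_deviations = sum(deviations)            # always sums to 0

sum_of_squares = sum(d ** 2 for d in deviations)   # sum of (X - M)^2
variance = sum_of_squares / (n - 1)            # mean squared deviation
sd = math.sqrt(variance)                       # back to raw score units

print(sum_of_deviations, sum_of_squares, variance, round(sd, 2))
print(math.isclose(sd, statistics.stdev(scores)))
```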

Example calculation of variance and standard deviation on strength scores.

Subj   Score (x)   Deviation (x - X̄)   Deviation squared
1      216           22.7                 515.29
2      144          -49.3                2430.49
3      183          -10.3                 106.09
4      138          -55.3                3058.09
5      212           18.7                 349.69
6      180          -13.3                 176.89
7      200            6.7                  44.89
8      264           70.7                4998.49
9      203            9.7                  94.09
Σ      1740           0                 11774.01

$$\bar{X} = \frac{\sum X}{n} = \frac{1740}{9} = 193.3$$

$$s^2 = \frac{\sum (x - \bar{X})^2}{n - 1} = \frac{11774.01}{8} = 1471.75$$

$$s = \sqrt{\frac{\sum (x - \bar{X})^2}{n - 1}} = 38.4$$
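The strength-score results can be verified with the standard `statistics` module, as in the sketch below. Note one difference from the hand calculation: the table rounds the mean to 193.3 before taking deviations, while the code keeps full precision, so intermediate values may differ slightly in the last decimal place.

```python
import statistics

# Strength scores for the nine subjects in the table
strength = [216, 144, 183, 138, 212, 180, 200, 264, 203]

mean = statistics.mean(strength)          # sum of X divided by n
variance = statistics.variance(strength)  # sample variance (n - 1 denominator)
sd = statistics.stdev(strength)           # square root of the sample variance

print(round(mean, 1), round(variance, 2), round(sd, 1))
```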

Calculating the SD Using SPSS

The Analyze -> Descriptive Statistics -> Frequencies command may be used to determine the standard deviation (you will need to select the “Statistics…” button to choose “Std. deviation”).

Statistics: Scores
N Valid           5
N Missing         0
Std. Deviation    5.523

Variance

The variance is the square of the standard deviation. It is used most commonly with more advanced statistical procedures such as regression analysis, analysis of variance (ANOVA), and the determination of the reliability of a test.

Variance

The variance is also known as the mean square (MS).

$$s^2 = \frac{\sum (X - M)^2}{n - 1}$$

Range

The range is equal to the high score minus the low score in a distribution. It is considered an unstable measure of variability, and can change drastically if extreme scores are introduced to the distribution.

Range

As a result of gas analysis in a respirometer, an investigator obtains the following four readings of oxygen percentages: 14.9, 10.8, 12.3, and 23.3. What is the range?

Calculating the Range Using SPSS

The Analyze -> Descriptive Statistics -> Frequencies command may be used to calculate the range (you will need to select the “Statistics…” button to choose “Minimum,” “Maximum,” and “Range”).

Statistics: Oxygen_Content
N Valid      4
N Missing    0
Range        12.5
Minimum      10.8
Maximum      23.3
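The range for the oxygen readings, and its sensitivity to an extreme score, can be shown in a short Python sketch. The extra reading of 40.0 is a hypothetical value added only to illustrate the instability; it does not come from the slides.

```python
oxygen = [14.9, 10.8, 12.3, 23.3]   # oxygen percentages from the example

low, high = min(oxygen), max(oxygen)
value_range = high - low            # high score minus low score

# Hypothetical extreme reading, added only to show how unstable the range is
with_extreme = oxygen + [40.0]
extreme_range = max(with_extreme) - min(with_extreme)

print(low, high, round(value_range, 1), round(extreme_range, 1))
```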

Example of Descriptive Statistics


Confidence Intervals

Confidence intervals provide an expected upper and lower limit for a statistic at a specified probability level (usually 95% or 99%). The CI depends on the sample size, the homogeneity of values within the sample, and the level of confidence selected by the researcher.

Confidence Interval, cont’d

For example, a sample mean is an estimate of the population mean. A confidence interval provides a band within which the population mean is likely to fall.

CI = mean ± (standard error × critical value for the chosen confidence level)

The standard error (sM) is the variability of the sampling distribution of the statistic:

$$s_M = \frac{s}{\sqrt{n}}$$

Calculating a CI

Example: n = 30, M = 40, s = 8

CI = 40 ± (1.46 × 2.045)
CI = 40 ± 2.99 = 37.01 to 42.99

The value “1.46” came from the following formula:

$$s_M = \frac{8}{\sqrt{30}} = 1.46$$

The value “2.045” came from table A.5 (next slide).
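The confidence interval example can be reproduced in Python. In this sketch the critical value 2.045 is simply copied from the slide (table A.5) rather than computed; it corresponds to the two-tailed 95% t value for n - 1 = 29 degrees of freedom. The variable names are illustrative.

```python
import math

n, m, s = 30, 40, 8      # sample size, sample mean, sample SD from the example
t_critical = 2.045       # taken from the slide's table, not computed here

standard_error = s / math.sqrt(n)      # s_M = s / sqrt(n)
margin = standard_error * t_critical   # half-width of the interval
lower, upper = m - margin, m + margin

print(round(standard_error, 2), round(margin, 2),
      round(lower, 2), round(upper, 2))
```

Because the standard error divides by the square root of n, a larger sample narrows the interval, which is one reason the CI depends on sample size.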

Correlation

Correlation “indicates the extent to which two variables are related or associated

The extent to which the direction and size of deviations from the mean in one variable are related to the direction and size of deviations from the mean in another variable”

$$r = \frac{\sum Z_X Z_Y}{N}$$
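The z-score formula for r can be sketched in Python as follows. Two assumptions are made here: the z-scores use the population standard deviation (`statistics.pstdev`), which is what makes dividing by N give the Pearson correlation, and the two small data sets are invented purely for illustration.

```python
from statistics import mean, pstdev


def pearson_r(xs, ys):
    """Correlation as the average product of paired z-scores."""
    n = len(xs)
    mean_x, mean_y = mean(xs), mean(ys)
    sd_x, sd_y = pstdev(xs), pstdev(ys)       # population SDs (divide by N)
    z_x = [(x - mean_x) / sd_x for x in xs]   # deviations from the mean in X
    z_y = [(y - mean_y) / sd_y for y in ys]   # deviations from the mean in Y
    return sum(a * b for a, b in zip(z_x, z_y)) / n


# Hypothetical paired scores for five participants
x_scores = [1, 2, 3, 4, 5]
y_scores = [2, 4, 5, 4, 5]

print(round(pearson_r(x_scores, y_scores), 4))
```

When deviations in X and Y tend to share the same direction and size, the products of the z-scores are mostly positive and r approaches 1; when they run in opposite directions, r approaches -1.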

Example of Correlations


Categories of Statistical Tests

Parametric
- Normal distribution
- Equal variances
- Independent observations

Nonparametric (distribution free)
- Distribution is not normal

Normal curve, skewness, kurtosis

Normal Curve

Skewness

Kurtosis

Statistics

What statistical techniques tell us:
- Reliability (significance) of effect
- Strength of the relationship (meaningfulness)

Types of statistical techniques:
- Relationships among variables
- Differences among groups

Cause and effect: correlation is no proof of causation.

Next Class

Chapters 7 and 8
Full Lit Review
