T-TEST: A Guide to Testing Differences Between Two Groups

REQUIREMENTS: The t-statistic is used to test differences in the means of two groups. The grouping variable is categorical, and the data for the dependent variable are interval scaled. The following table shows alternative statistical techniques that can be used to analyze 2 groups.
The t-distribution was developed by W. S. Gosset (1908) [1]. As an employee of the Guinness brewery in Dublin, Gosset published under the pseudonym "Student". The t-distribution revolutionized statistics and the ability to work with small samples. Prior to this time, statistical work was based largely on the value of z, which designates a point on the normal distribution when the population parameters are known. The z value is the deviation of the sample mean from the mean of the population, expressed in units of the standard error (the standard deviation of a normally distributed population divided by the square root of the sample size). [2]
The purpose of the z value is to express the amount of deviation between the sample mean and the population mean, and to permit inferences as to whether the sample mean belongs to the population in question. The characteristics of the population (μ and σ) about which we wish to make inferences are rarely known. The state of perfect knowledge assumed by the z test makes the use of this statistic difficult to justify. The t statistic does not require the population variance information needed for the z test; instead it uses the sample variance (and sample standard deviation).
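The contrast between the two statistics can be sketched in a few lines of Python. This is a minimal illustration, not taken from the original: the function names, the sample values, and the assumed population parameters (mean 5.0, standard deviation 0.5) are all hypothetical.

```python
import math
import statistics

def z_statistic(sample, pop_mean, pop_sd):
    """z for a sample mean when the population standard deviation is known."""
    n = len(sample)
    return (statistics.mean(sample) - pop_mean) / (pop_sd / math.sqrt(n))

def t_statistic(sample, pop_mean):
    """One-sample t: same form as z, but uses the sample standard deviation."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    return (statistics.mean(sample) - pop_mean) / (s / math.sqrt(n))

# Hypothetical data, invented purely for illustration.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4]

z = z_statistic(sample, pop_mean=5.0, pop_sd=0.5)  # requires sigma to be known
t = t_statistic(sample, pop_mean=5.0)              # needs only the sample itself
print(f"z = {z:.3f}, t = {t:.3f} with {len(sample) - 1} degrees of freedom")
```

The only difference between the two functions is where the standard deviation comes from, which is why the t statistic remains usable when σ is unknown.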
The t-distribution is symmetrical about its mean and is approximately normal. It is centered at 0 and, for large samples, has a variance close to 1 (σ² ≈ 1). The central limit theorem tells us that the sampling distribution of all possible sample means (x̄) approaches normality as the size of the samples increases. This is true even when the population is not normally distributed. The t-distribution rapidly approaches the shape of a normal distribution as sample size increases, and it is commonly treated as approximately normal once n reaches about 30.

The t-distribution is used to make inferences concerning the difference between the two population means μ1 and μ2. The specific statistical theory used relates to the distribution of differences between the two sets of independent sample means, that is, the sampling distribution of x̄1 − x̄2.

MATHEMATICAL COMPUTATIONS FOR THE T-TEST
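Before turning to the computations, the central limit theorem statement made earlier can be checked with a small simulation. The sketch below is an assumption-laden illustration rather than part of the original: the exponential population, the helper names `skewness` and `sample_means`, the seed, and the number of replicates are all chosen here for convenience. It draws repeated samples from a strongly skewed population and measures how skewed the resulting sample means are.

```python
import random
import statistics

def skewness(values):
    """Skewness: mean cubed deviation from the mean, in standard-deviation units."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def sample_means(n, replicates, rng):
    """Means of `replicates` samples, each of size n, drawn from an
    exponential population (strongly right-skewed)."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(replicates)
    ]

rng = random.Random(42)  # fixed seed so the run is repeatable

skew_small = skewness(sample_means(2, 5000, rng))   # tiny samples
skew_large = skewness(sample_means(30, 5000, rng))  # samples of size 30

print(f"skewness of sample means, n = 2:  {skew_small:.2f}")
print(f"skewness of sample means, n = 30: {skew_large:.2f}")
```

A normal distribution has skewness 0, so the drop in skewness from n = 2 to n = 30 is one way to see the sampling distribution moving toward normality even though the population itself is far from normal.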
a. Statistical analysis programs compute t-statistics and associated probability levels for the equality of the means of two groups, based on both pooled and separate variance estimates. An F-statistic and associated probability level for the equality of group variances are also computed. Groups may be defined by specifying the codes to be included. Several dependent variables may often be analyzed concurrently. Paired comparison t-ratios may be obtained through the use of IF and RECODE commands (SPSS).
b. Output from this program includes:
(1) F-ratio of variance
(2) t-value (based on pooled variance estimate; see note 1)
(3) t-value (based on separate variance estimate)
(4) Two-tailed probability levels for each t and for the F
(5) Means
(6) Standard deviations
(7) Standard error of the means
(8) Number of observations included in computing 5-7 above
(9) Optional output of data input

1. Pooling is required in each of the following two cases: (i) when we are testing for the same population proportion in two populations; (ii) when testing the difference in means between two small samples. In case of pooling, we pool the point estimate by simple averages. The pooled standard error is given by:
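For the second case (the difference in means between two small samples), the standard textbook form can be written as follows. The symbols $s_p^2$ (pooled variance), $s_1^2$ and $s_2^2$ (group variances), and $n_1$ and $n_2$ (group sizes) are notation assumed for this sketch rather than taken from the original.

```latex
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}
```

For the first case (a common population proportion), the analogous textbook estimate combines the counts of successes $x_1$ and $x_2$ from both samples, again in assumed notation:

```latex
\hat{p} = \frac{x_1 + x_2}{n_1 + n_2},
\qquad
SE_{\hat{p}_1 - \hat{p}_2} = \sqrt{\hat{p}\,(1 - \hat{p}) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}
```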
c. COMPUTATIONAL PROCEDURE

P variables for the first case are read. In the data manipulation, if Xi or Xj is missing, then Xk is missing. If a case meets the specifications for a specific t-test analysis, the observation will be included in that calculation. Each problem is divided into two groups: an X and a Y category. For each analysis, the number of non-missing observations, the mean, standard deviation, and standard error are computed for each variable of each category. The t-values, F-values, and corresponding probability levels for the between-category comparison are computed for each variable.

Step 1. Xijk, where
i = 1,2,...,n
j = 1,2,...,(p+q)
k = 1,2; 1 = X category, 2 = Y category

Step 2.
Degrees of freedom based on pooled variance estimate:
t based on separate variance estimate:
Degrees of freedom based on separate variance estimate:

DEPENDENT VARIABLE IS: V64
EDUCATION SPOUSE
GROUP 1 IS LESS THAN OR EQUAL TO 2 AND GROUP 2 IS GREATER THAN 2

| Var. index | Variable | Code | No. of cases (note 4) | Mean | Std. devn. | Std. error | F value (note 2) | P value (note 3) | Pooled variance estimate (note 1): t value | Pooled: degrees of freedom | Pooled: P value | Separate variance estimate: t value | Separate: degrees of freedom | Separate: P value |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | V18 COMPETENCE OF NURSING STAFF | 1 | 68 | 1.0147 | .121 | .015 | 1.69 | .014 | .38 | 181 | .707 | .35 | 113.86 | .726 |
| | | 2 | 115 | 1.0087 | .093 | .009 | | | | | | | | |
| 2 | V19 PROMPTNESS OF RESPONSE | 1 | 67 | 1.0299 | .171 | .021 | 3.29 | .000 | 1.05 | 177 | .294 | .92 | 90.39 | .361 |
| | | 2 | 112 | 1.0089 | .094 | .009 | | | | | | | | |
| 3 | V20 FRIENDLINESS OF STAFF | 1 | 68 | 1.0147 | .121 | .015 | 1.17 | .452 | -.14 | 181 | .891 | -.14 | 149.78 | .889 |
| | | 2 | 115 | 1.0174 | .131 | .012 | | | | | | | | |
| 4 | V21 NURSE EXPLAINS TO PATIENT | 1 | 65 | 1.0615 | .242 | .030 | 1.72 | .012 | .82 | 177 | .413 | .76 | 106.88 | .447 |
| | | 2 | 114 | 1.0351 | .185 | .017 | | | | | | | | |
| | | 1 | 67 | 1.0149 | .122 | .015 | .00 | 1.000 | 1.32 | 181 | .189 | 1.00 | 66.00 | .321 |
| | | 2 | 116 | 1.0000 | .000 | .000 | | | | | | | | |

1. In case of pooling, the pooled point estimate and pooled standard error are calculated. In case of the separate variance estimate, the difference between the two group means is used and its standard error is computed from each group's own variance.
2. The F-value is the ratio of the mean squares (the sample variances) of the two groups.
3. The P-value is the significance level corresponding to the computed F-value. If this P-value is less than or equal to the significance level of the test, then we reject the null hypothesis of equal group variances.
4. The two groups are defined by the cut-off value selected. The cases are assigned to the two groups as stated in the output header: group 1 contains values less than or equal to 2, and group 2 contains values greater than 2.
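The quantities reported in the output can be recomputed from the printed group summaries alone. The sketch below uses the standard textbook formulas for the pooled and separate (Welch) variance estimates; whether the original program used exactly these expressions is an assumption, as are the function name `two_group_t` and the choice to put the larger variance in the numerator of F. The inputs are the V18 figures from the output above.

```python
import math

def two_group_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled and separate-variance t statistics from group summary statistics.

    Assumes both standard deviations are non-zero.
    """
    var1, var2 = sd1 ** 2, sd2 ** 2
    diff = mean1 - mean2

    # F ratio of the two sample variances (larger over smaller: an assumption).
    f_ratio = max(var1, var2) / min(var1, var2)

    # Pooled variance estimate: one common variance, df = n1 + n2 - 2.
    df_pooled = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df_pooled
    t_pooled = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Separate variance estimate: each group keeps its own variance, with
    # Welch-Satterthwaite approximate degrees of freedom.
    a, b = var1 / n1, var2 / n2
    t_separate = diff / math.sqrt(a + b)
    df_separate = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

    return {
        "f_ratio": f_ratio,
        "t_pooled": t_pooled,
        "df_pooled": df_pooled,
        "t_separate": t_separate,
        "df_separate": df_separate,
    }

# Summary statistics for V18 (COMPETENCE OF NURSING STAFF), codes 1 and 2.
result = two_group_t(68, 1.0147, 0.121, 115, 1.0087, 0.093)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Because the printed means and standard deviations are rounded, the recomputed values can only be expected to agree with the V18 row (F 1.69, t .38 with 181 degrees of freedom, t .35 with 113.86 degrees of freedom) to within rounding.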
Footnotes:
1. Student (1908), "The Probable Error of a Mean", Biometrika, 6:1.
2. The normal probability distribution is defined by the equation: $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / (2\sigma^2)}$
Copyright 2006 www.surveyz.com All Rights Reserved