Using SPSS Version 19.0
This document contains instructions on using
SPSS to perform various statistical analyses.
The section titles are as follows:
Data Manipulation
Data Diagnostics
Hypothesis Tests Involving One Variable
Hypothesis Tests Involving Two Variables
Data Manipulation
IMPORTANT:
The default in SPSS is to use all cases. In order to use only selected cases in the
data file, first select the Data > Select Cases options to
display the Select Cases dialog box; then click on the If button to display the Select
Cases: If dialog box. After the
desired condition is entered, click on the Continue
button, and then click on the OK
button, after which every case that does not satisfy the condition should be marked as being
excluded from data analysis; also, a variable named filter_$ will be added to the data.
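As a rough illustration of what the filter variable does, the following Python sketch (not SPSS syntax) flags each case with 1 if it satisfies the condition and 0 otherwise, and keeps only the flagged cases. The variable names age and score, the data values, and the condition age >= 30 are all hypothetical.

```python
# Hypothetical data: each dict is one case (one row of the data file).
cases = [
    {"age": 25, "score": 71},
    {"age": 42, "score": 88},
    {"age": 37, "score": 64},
    {"age": 19, "score": 90},
]


def condition(case):
    """The condition typed into Select Cases: If (assumed here: age >= 30)."""
    return case["age"] >= 30


# Mimic the filter_$ variable: 1 = selected, 0 = excluded from analysis.
filter_flags = [1 if condition(c) else 0 for c in cases]

# Only the selected cases take part in later procedures.
selected = [c for c, flag in zip(cases, filter_flags) if flag == 1]
```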
In many of the SPSS dialog boxes (generally from clicking
an Options button), there will be
two choices for handling missing data.
The Exclude cases pairwise
choice will have SPSS perform each specific procedure using all cases with no
missing data for the variables involved in the given procedure, which implies
that with missing data the sample size may not be the same for each procedure
performed. The Exclude cases listwise choice will have
SPSS perform each specific procedure using only those cases with no missing
data for the variables involved in every procedure that is to be performed,
which implies that with missing data the sample size will be the same for each
procedure performed. (Of course, when
there are no missing data, it makes no difference which of these two choices is
made.)
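The difference between the two choices can be seen in a small Python sketch (an illustration only, not SPSS code). The variable names x and y and the data are hypothetical, and missing values are represented by None. Each function returns the sample size and mean used for every variable.

```python
from statistics import mean

# Hypothetical data with missing values recorded as None.
data = {
    "x": [5.0, 7.0, None, 9.0, 6.0],
    "y": [2.0, None, 4.0, 8.0, 3.0],
}


def pairwise_means(data):
    """Each variable's mean uses every case with a value for that variable."""
    result = {}
    for name, values in data.items():
        present = [v for v in values if v is not None]
        result[name] = (len(present), mean(present))
    return result


def listwise_means(data):
    """Every mean uses only the cases that are complete on all variables."""
    n_cases = len(next(iter(data.values())))
    complete = [
        i for i in range(n_cases)
        if all(values[i] is not None for values in data.values())
    ]
    return {
        name: (len(complete), mean([values[i] for i in complete]))
        for name, values in data.items()
    }
```

With these data, the first approach uses four cases for each variable (a different four for x than for y), while the second uses the same three complete cases for both.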
Creating new variables with transformation of existing
variables
1.
Select the Transform >
Compute Variable options to display the Compute Variable dialog
box.
2.
In the Target Variable slot, type an
appropriate name for the new variable which is to be a function of existing
variables.
3.
In the Numeric Expression section, a formula
is needed to indicate how the values for the new variable are to be calculated;
this can be accomplished by constructing the appropriate formula in the Numeric Expression section through the
selection of variable names from the list of variables on the left and clicking
on the arrow pointing toward the Numeric
Expression section, together with the selection of algebraic operation
buttons from the keypad displayed in the middle of the dialog box.
4.
Click on the OK button, after which the new variable
should be added to the data.
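In effect, the numeric expression is evaluated once for every case. A minimal Python sketch of the same idea follows; the variable names test1, test2, and average and the formula are hypothetical examples, not part of SPSS.

```python
# Hypothetical existing variables, one value per case.
test1 = [70, 85, 90]
test2 = [80, 75, 100]

# Target Variable: average; Numeric Expression: (test1 + test2) / 2.
# The formula is applied case by case to build the new variable.
average = [(a + b) / 2 for a, b in zip(test1, test2)]
```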
Creating new variables by recoding existing variables
1.
Select the Transform >
Recode into Different Variables options to display the Recode into
Different Variables dialog box.
2.
From the list of variables
on the left, select the existing variable to be recoded into a new variable,
and click on the arrow pointing toward the Numeric
Variable -> Output Variable
section.
3.
In the Output
Variable section, type an appropriate name in the Name slot for the
new variable to be created by recoding, and click on the Change button. You should
now see in the Numeric Variable -> Output Variable section an indication of which
existing variable is being recoded into the new variable.
4.
Click the Old
and New Values button to display the Recode into Different Variables:
Old and New Values dialog box.
5.
In the Old
Value section click an appropriate option and enter the appropriate
information for a value or range of values for the existing variable being
recoded; in the New Value section click an appropriate option and enter
the appropriate information for a corresponding value for the new variable to
be created by recoding; click on the Add
button, after which you should see in the Old -> New
section an indication of how this recoding will be done. Repeat this process until all the recoding
information has been entered.
6.
After all the
recoding information has been entered, click on the Continue button to
return to the Recode into Different Variables dialog box, and then click
on the OK button, after which the new variable should appear in the SPSS
data file.
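The recoding rules behave like a lookup from ranges of old values to new values. The Python sketch below illustrates this with a hypothetical variable (exam scores) and hypothetical ranges; in this sketch the first matching rule wins and a value matched by no rule is left missing (None), which is an assumption of the example.

```python
# Hypothetical existing variable to be recoded.
score = [42, 67, 88, 95, 73]

# Each (low, high, new_value) rule plays the role of one line in the
# Old -> New section of the dialog box.
rules = [
    (0, 59, 1),    # 0 through 59   -> 1
    (60, 79, 2),   # 60 through 79  -> 2
    (80, 100, 3),  # 80 through 100 -> 3
]


def recode(value, rules):
    """Return the new value of the first rule whose range contains value."""
    for low, high, new_value in rules:
        if low <= value <= high:
            return new_value
    return None  # no rule matched: the new variable is left missing


score_group = [recode(v, rules) for v in score]
```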
Data Diagnostics
Checking for skewness and
non-normality in data
1.
Identify the
(quantitative) variable(s) in the SPSS data file to be checked for normality or
skewness, and identify the grouping variable if there
is one.
2.
Select the Analyze >
Descriptive Statistics > Explore options to display the Explore
dialog box.
3.
From the list of
variables on the left, select the desired (quantitative) variable(s), and click
on the arrow pointing toward the Dependent List section; if there is a
grouping variable in the list, select this variable and click on the arrow
pointing toward the Factor section.
4.
In the Display
section of the dialog box, make certain that the Both
option is selected.
5.
Click on the Plots
button to display the Explore: Plots dialog box.
6.
In the Descriptive
section of the dialog box, select the Stem-and-leaf option and the Histogram
option; also, select the Normality plots with tests option.
7.
Click on the Continue
button to return to the Explore dialog box, and then click on the OK
button, after which the SPSS output will be generated.
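To make the skewness statistic less of a black box, here is a small Python sketch of the adjusted Fisher-Pearson skewness coefficient, which is assumed here to correspond to the skewness value shown in the Explore output. The function name and the example data are hypothetical.

```python
from math import sqrt


def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness coefficient of a sample."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three values are needed")
    m = sum(values) / n
    # Sample standard deviation (divisor n - 1).
    s = sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in values)


symmetric = sample_skewness([1, 2, 3, 4, 5])      # no skew
right_skewed = sample_skewness([1, 1, 1, 2, 10])  # long right tail
```

A value near zero suggests a roughly symmetric distribution, a positive value suggests a longer right tail, and a negative value suggests a longer left tail.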
Hypothesis Tests Involving One Variable
Performing a one‑sample t test about a mean μ
1.
Identify the
(quantitative) variable in the SPSS data file on which the test is to be
performed, decide on the hypothesized value for the mean μ, and select a (two‑sided) significance level α. (More than
one (quantitative) variable may be selected on which the test is to be performed
simultaneously, but only one hypothesized value for the mean is permitted.)
2.
Select the Analyze >
Compare Means > One Sample T Test options to display the One
Sample T Test dialog box.
3.
From the list of
variables on the left, select the desired (quantitative) variable (and more
than one selection is permitted), and click on the arrow pointing toward the Test Variable(s) section.
4.
Type the
hypothesized value for the mean μ in the Test
Value slot.
5.
Click on the Options button to set the desired (two‑sided)
confidence level (which will generally be 100% minus the significance level,
i.e., if the significance level is 0.05 (5%), then the confidence level will be
0.95 (95%)).
6.
Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated. The p-value
displayed on the SPSS output is for a two‑sided test; when doing a one‑sided
test, this must be divided by 2, provided that the sample mean lies on the side
of the hypothesized value stated in the alternative hypothesis (otherwise, the
one‑sided p-value is 1 minus half of the two‑sided p-value); also, the confidence interval
limits displayed on the SPSS output are actually the confidence interval
limits for the difference between the population mean and the hypothesized
value of the mean; adding the hypothesized value for the mean to each of these
limits gives the confidence interval limits for the population mean.
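The arithmetic behind this output can be sketched in Python (an illustration under stated assumptions, not SPSS code). The function names and the data are hypothetical. The one-sided conversion halves the two-sided p-value only when the t statistic lies in the direction of the alternative hypothesis, and the last function shifts the interval for the difference into an interval for the mean.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(values, test_value):
    """Return (t statistic, degrees of freedom) for H0: mean = test_value."""
    n = len(values)
    t = (mean(values) - test_value) / (stdev(values) / sqrt(n))
    return t, n - 1


def one_sided_p(two_sided_p, t, alternative):
    """Convert a two-sided p-value; alternative is 'greater' or 'less'."""
    in_direction = t > 0 if alternative == "greater" else t < 0
    return two_sided_p / 2 if in_direction else 1 - two_sided_p / 2


def ci_for_mean(diff_lower, diff_upper, test_value):
    """Shift the interval for (mean - test_value) to one for the mean."""
    return diff_lower + test_value, diff_upper + test_value


# Hypothetical sample and hypothesized mean.
t_stat, df = one_sample_t([12, 15, 11, 14, 13], test_value=10)
```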
One possible appropriate graphical
display for the data used in a one‑sample t test is a box plot, which can be
obtained in SPSS by doing the following:
1.
Select the Graphs >
Legacy Dialogs > Boxplot options to display the Boxplot
dialog box.
2.
Select the Simple option and the Summaries of separate variables option.
3.
Click on the Define button to display the Define
Simple Boxplot: Summaries of
Separate Variables dialog box.
4.
From the list of
variables on the left, select the desired (quantitative) variable, and click on
the arrow pointing toward the Boxes
Represent section.
5.
Click on the OK button, after which the SPSS output
will be generated. The box plot will
be displayed vertically; in order to display it horizontally, first double
click on the graph to enter the SPSS
Chart Editor, and then select the Options >
Transpose Chart options from the main menu, after which selecting the File > Close options will close
the chart editor.
Performing a chi‑square goodness‑of‑fit
test with hypothesized proportions
1.
If the data have
already been entered into an SPSS data file, identify the (qualitative)
variable on which the test is to be performed in the SPSS data file, and skip
to step #7; otherwise, enter the data into SPSS by following the instructions
beginning in step #2.
2.
Go to the Variable
View sheet (by clicking on the appropriate tab at the bottom of the
screen), and in the first row, enter a name for the (qualitative) variable on
which the test is to be performed.
3.
Define codes for
this variable so that 1 (one) represents one category, 2 (two)
represents a second category, 3
(three) represents a third category, etc., making certain that all categories
of the (qualitative) variable have been included.
4.
In the second
row, enter the variable name count, and since all the counts must be
integers, set the entry in the second cell of the Decimals column to 0
(zero).
5.
Go to the Data
View sheet (by clicking on the appropriate tab at the bottom of the
screen), and in the column for the (qualitative) variable on which the test is
to be performed, enter the codes 1, 2, 3, etc. respectively into the first
cell, the second cell, the third cell, etc., making certain that all codes used
have been entered. (If the category
labels are not displayed, then select View > Value Labels from the
main menu.)
6.
In the column for
the variable count, enter the corresponding counts (i.e., the raw
frequency of occurrence in the data for each category); after all the data is
entered, it may be a good idea to save this SPSS file using an appropriate file
name.
7.
If each line of
the data file represents one case (i.e., the data were not entered in
the format described in steps #2 to #6), then select the Analyze >
Nonparametric Tests > Legacy Dialogs > Chi‑Square options
to display the Chi‑Square Test dialog box, and proceed to the next
step; if each line of the data file represents one category of the
(qualitative) variable on which the test is to be performed (i.e., the data were
entered in the format described in steps #2 to #6), then do the following:
First, select the Data > Weight Cases options to display the Weight
Cases dialog box; then, select the Weight cases by option, select
from the list on the left the variable name for the raw frequency (count) of
occurrence in the data for each category, and click on the arrow button
pointing toward the Frequency Variable slot; next, click on the OK
button; finally, select the Analyze > Nonparametric Tests >
Legacy Dialogs > Chi‑Square options to display the Chi‑Square
Test dialog box, and proceed to the next step.
8.
For each category,
decide on the hypothesized value for the proportion.
9.
In the Chi-Square
Test dialog box, from the list of the variables on the left, select the
(qualitative) variable on which the test is to be performed, and click on the
arrow button pointing toward the Test Variable List section.
10.
Which option
should be selected in the Expected Values section depends on the
hypothesized proportions. If the
hypothesized proportions are all equal, then select the All categories equal option; if the hypothesized proportions are
not all equal, then select the Values option, and enter the hypothesized
proportions by typing the hypothesized proportion for the category coded with
the smallest value (which would be 1 if the data were entered in the
format described in steps #2 to #6) in the Values slot and clicking on
the Add button, typing the hypothesized proportion for the category
coded with the next smallest value (which would be 2 if the data were
entered in the format described in steps #2 to #6) in the Values slot and
clicking on the Add button, typing the hypothesized proportion for the
category coded with the third smallest value (which would be 3 if the data were
entered in the format described in steps #2 to #6) in the Values slot
and clicking on the Add button, etc., making certain that the
hypothesized proportion for each code used has been entered. (The order in which these hypothesized
proportions are entered must correspond to the numerical order of the codes for
the different categories; also, it does not matter whether percentages or
proportions are entered, so that, for instance, 45, 35, and 20 could be entered
instead of 0.45, 0.35, and 0.20.)
11.
Click on the OK button, after which the SPSS output
will be generated.
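The calculation behind this test can be sketched in Python (an illustration only; the function name and the counts are hypothetical). Because the hypothesized values are rescaled by their total, entering percentages or proportions leads to the same expected counts, which mirrors the remark in step #10.

```python
def goodness_of_fit(observed, hypothesized):
    """Return (expected counts, chi-square statistic, degrees of freedom).

    observed:     counts per category, in the numerical order of the codes
    hypothesized: proportions or percentages, in the same order
    """
    if len(observed) != len(hypothesized):
        raise ValueError("one hypothesized value is needed per category")
    n = sum(observed)
    total = sum(hypothesized)
    # Rescaling by the total makes 45, 35, 20 equivalent to 0.45, 0.35, 0.20.
    expected = [n * h / total for h in hypothesized]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi_square, len(observed) - 1


# Hypothetical counts for categories coded 1, 2, 3.
counts = [50, 30, 20]
exp_pct, chi_pct, df = goodness_of_fit(counts, [45, 35, 20])
exp_prop, chi_prop, _ = goodness_of_fit(counts, [0.45, 0.35, 0.20])
```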
One possible appropriate
graphical display for the data used in a chi‑square goodness‑of‑fit
test is a bar chart, which can be obtained in SPSS by doing the following:
1.
Select the Graphs >
Legacy Dialogs > Bar options to display the Bar Charts
dialog box.
2.
Select the Simple option and the Summaries for groups of cases option.
3.
Click on the Define button to display the Define
Simple Bar: Summaries for Groups of
Cases dialog box.
4.
From the list of
variables on the left, select the desired (qualitative) variable, and
click on the arrow pointing toward the Category
Axis slot.
5.
To display raw
frequency, select the N of cases
option in the Bars Represent
section; to display percentages, select the % of cases option in the Bars
Represent section.
6.
Click on the OK button, after which the SPSS output
will be generated. To edit the bar
chart, if desired, double click on the graph to enter the SPSS Chart Editor. When
editing is complete, select the File >
Close options to close the chart editor.
Another possible appropriate graphical
display for the data used in a chi‑square goodness‑of‑fit
test is a pie chart, which can be obtained in SPSS by doing the following:
1.
Select the Graphs >
Legacy Dialogs > Pie options to display the Pie Charts
dialog box.
2.
Select the Summaries for groups of cases option.
3.
Click on the Define button to display the Define
Pie: Summaries for Groups of Cases
dialog box.
4.
From the list of
variables on the left, select the desired (qualitative) variable, and click on
the arrow pointing toward the Define Slices
by slot.
5.
To display raw
frequency, select the N of cases
option; to display percentages, select the %
of cases option.
6.
Click on the OK button, after which the SPSS output
will be generated. To edit the pie
chart, if desired, double click on the graph to enter the SPSS Chart Editor. When
editing is complete, select the File >
Close options to close the chart editor.
Hypothesis Tests Involving Two Variables
Performing a paired‑sample t test about a mean difference μd (i.e., a difference between means from dependent samples)
1.
Identify in the
SPSS data file both of the (quantitative) variables for which the mean
differences are being tested, and select a (two‑sided) significance level α.
2.
Select the Analyze >
Compare Means > Paired‑Samples T Test options to display
the Paired‑Samples T Test dialog box.
3.
From the list of
variables on the left, select one of the desired (quantitative) variables, and
click on the arrow pointing toward the Paired
Variables section; then select the other desired (quantitative) variable,
and click on the arrow pointing toward the Paired
Variables section. (Selection of
more than one pair is permitted.)
4.
Click on the Options button to set the desired (two‑sided)
confidence level (which will generally be 100% minus the significance level,
i.e., if the significance level is 0.05 (5%), then the confidence level will be
0.95 (95%)).
5.
Click on the Continue button, and then click on the OK button, after which the SPSS output
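will be generated. The p-value
displayed on the SPSS output is for a two‑sided test; when doing a one‑sided
test, this must be divided by 2, provided that the sample mean difference is in
the direction stated in the alternative hypothesis.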
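A paired-sample t test is equivalent to a one-sample t test on the case-by-case differences with a hypothesized mean of zero. The Python sketch below shows the t statistic under that view; the function name and the before/after data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for H0: mean difference = 0."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    # The order of subtraction determines the sign of the t statistic.
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Hypothetical paired measurements on four cases.
t_stat, df = paired_t([10, 12, 9, 14], [8, 11, 9, 10])
```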
One possible appropriate
graphical display for the data used in a paired‑sample t test is two box plots, one for each
variable, which can be obtained in SPSS by doing the following:
1.
Select the Graphs >
Legacy Dialogs > Boxplot options to display the Boxplot
dialog box.
2.
Select the Simple option and the Summaries of separate variables option.
3.
Click on the Define button to display the Define
Simple Boxplot: Summaries of Separate
Variables dialog box.
4.
From the list of
variables on the left, select one of the desired (quantitative) variables, and
click on the arrow pointing toward the Boxes
Represent section; then select the other desired (quantitative) variable,
and click on the arrow pointing toward the Boxes
Represent section.
5.
Click on the OK button, after which the SPSS output
will be generated. The box plots will
be displayed vertically; in order to display them horizontally, first double
click on the graph to enter the SPSS
Chart Editor, and then select the Options >
Transpose Chart options from the main menu, after which selecting the File > Close options will close
the chart editor.
Another possible appropriate
graphical display for the data used in a paired‑sample t test is one box plot of the
differences between the two variables, which can be obtained in SPSS by doing
the following:
1.
A new variable
which is the difference between the two (quantitative) variables must be added
to the data; to accomplish this, decide on the order of subtraction for the
difference and select the Transform > Compute Variable
options to display the Compute Variable dialog box.
2.
In the Target Variable slot, type an
appropriate name for the new variable which is to be the difference between the
two (quantitative) variables.
3.
In the Numeric Expression section, a formula
is needed to indicate how the values for the new variable are to be calculated;
to accomplish this, select the name of the first variable in the difference
from the list of variables on the left, and click on the arrow pointing toward
the Numeric Expression section; then
select the minus sign button from the keypad displayed in the middle of the
dialog box; finally select the name of the second variable in the difference
from the list of variables on the left.
4.
Click on the OK button, after which the new variable
should be added to the data.
5.
Create a box plot
for this new variable by following the steps to create such a box plot in the
section titled Performing a one‑sample t
test about a mean μ.
Performing a two‑sample t test about a difference between means μ1 and μ2
1.
Identify in the
SPSS data file both the (qualitative‑dichotomous) variable which defines
the two groups being compared and the (quantitative) variable for which the
means are being compared, and select a (two‑sided) significance level α. (More than one (quantitative)
variable may be selected on which to compare means simultaneously, but only one
grouping variable may be selected.)
2.
Select the Analyze >
Compare Means > Independent‑Samples T Test options to
display the Independent‑Samples T Test dialog box.
3.
From the list of
variables on the left, select the desired (quantitative) variable (and more
than one selection is permitted), and click on the arrow pointing toward the Test Variable(s) section.
4.
From the list of
variables on the left, select the desired (qualitative‑dichotomous)
variable, and click on the arrow pointing toward the Grouping Variable
slot.
5.
Click on the Define
Groups button to display the Define
Groups dialog box.
6.
In the Group 1 slot type the numerical code
which represents one of the categories for the (qualitative‑dichotomous)
variable which defines the two groups being compared, and in the Group 2 slot type the numerical code
which represents the other category; then click on the Continue button to return to the Independent‑Samples T
Test dialog box.
7.
Click on the Options button to set the desired (two‑sided)
confidence level (which will generally be 100% minus the significance level,
i.e., if the significance level is 0.05 (5%), then the confidence level will be
0.95 (95%)).
8.
Click on the Continue button, and then click on the OK button, after which the SPSS output
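will be generated. The p-value
displayed on the SPSS output is for a two‑sided test; when doing a one‑sided
test, this must be divided by 2, provided that the difference between the sample
means is in the direction stated in the alternative hypothesis.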
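The SPSS output for this procedure reports one set of results with equal variances assumed and another with equal variances not assumed. The Python sketch below shows how the two t statistics and their degrees of freedom can be computed; the function name, the dictionary keys, and the data are hypothetical, and the unequal-variance version uses the Welch-Satterthwaite approximation for the degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance


def two_sample_t(group1, group2):
    """Return pooled and Welch t statistics with their degrees of freedom."""
    n1, n2 = len(group1), len(group2)
    v1, v2 = variance(group1), variance(group2)  # sample variances
    diff = mean(group1) - mean(group2)

    # Equal variances assumed: pool the two sample variances.
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = diff / sqrt(pooled_var * (1 / n1 + 1 / n2))

    # Equal variances not assumed: Welch statistic and approximate df.
    a, b = v1 / n1, v2 / n2
    t_welch = diff / sqrt(a + b)
    df_welch = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

    return {
        "t_pooled": t_pooled, "df_pooled": n1 + n2 - 2,
        "t_welch": t_welch, "df_welch": df_welch,
    }


# Hypothetical values of the quantitative variable in Group 1 and Group 2.
result = two_sample_t([5, 6, 7, 8], [1, 2, 3, 4])
```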
One possible appropriate
graphical display for the data used in a two‑sample t test is two box plots, one for each group, which can be obtained
in SPSS by doing the following:
1.
Select the Graphs >
Legacy Dialogs > Boxplot options to display the Boxplot
dialog box.
2.
Select the Simple option and the Summaries for groups of cases option.
3.
Click on the Define button to display the Define
Simple Boxplot: Summaries for Groups
of Cases dialog box.
4.
From the list of
variables on the left, first select the (quantitative) variable to be displayed
by each boxplot, and click on the arrow pointing toward the Variable section; then select the
(qualitative) variable which defines each group for which a box plot is to be
displayed, and click on the arrow pointing toward the Category Axis slot.
5.
Click on the OK button, after which the SPSS output
will be generated. The box plots will
be displayed vertically; in order to display them horizontally, first double
click on the graph to enter the SPSS
Chart Editor, and then select the Options >
Transpose Chart options from the main menu, after which selecting the File > Close options will close
the chart editor.
Performing a one‑way ANOVA (analysis of variance) to test for a difference between multiple means μ1, μ2, …, μk
1.
Identify in the
SPSS data file both the (qualitative) variable which defines the groups being
compared and the (quantitative) variable for which the means are being
compared, and select a significance level α. (More than
one (quantitative) variable may be selected on which to compare means
simultaneously, but only one grouping variable may be selected.)
2.
Select the Analyze >
Compare Means > One‑Way ANOVA options to display the One‑Way
ANOVA dialog box.
3.
From the list of
variables on the left, select the desired (quantitative) variable (and more
than one selection is permitted), and click on the arrow pointing toward the Dependent List section.
4.
From the list of
variables on the left, select the desired (qualitative) variable, and click on
the arrow pointing toward the Factor slot.
5.
Click on the Options
button to display the One‑Way ANOVA: Options dialog box; in order
to have descriptive statistics displayed, select the Descriptive option;
in order to have results for Levene’s test
(concerning equal variances) displayed, select the Homogeneity of variance
test option; in order to have results for alternative F tests (which are adjusted for unequal variances) displayed,
select the Brown‑Forsythe option and/or the Welch option;
click on the Continue button to
return to the One‑Way ANOVA dialog box.
6.
Click on the Post
Hoc button to display the One-Way ANOVA: Post Hoc Multiple Comparisons
dialog box, select the desired multiple comparisons method(s), enter the desired
significance level, and click on the Continue
button to return to the One‑Way ANOVA dialog box.
7.
Click on the OK button, after which the SPSS output
will be generated.
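The F statistic in the ANOVA table is the ratio of the between-groups mean square to the within-groups mean square. A minimal Python sketch of that calculation follows; the function name and the three groups of data are hypothetical.

```python
def one_way_anova(groups):
    """Return (F statistic, df between, df within) for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the values around their own group mean.
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical data for three groups defined by the factor.
f_stat, df1, df2 = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```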
One possible appropriate
graphical display for the data used in a one‑way ANOVA is multiple box
plots, one for each group, which can be obtained in SPSS by following the steps
to create two box plots in the section titled Performing a two‑sample t
test about a difference between means μ1 and μ2.
Performing a chi‑square test for association
1.
If the data have
already been entered into an SPSS data file, identify the two (qualitative)
variables on which the test is to be performed in the SPSS data file, and skip
to step #7; otherwise, enter the data into SPSS by following the instructions beginning
in step #2.
2.
Go to the Variable
View sheet (by clicking on the appropriate tab at the bottom of the
screen), in the first row enter a name for one of the two (qualitative)
variables on which the test is to be performed, and in the second row enter a
name for the other (qualitative) variable.
3.
For each of the
two variables, define codes so that 1 (one) represents one category, 2
(two) represents a second category, 3
(three) represents a third category, etc., making certain that all categories
of the (qualitative) variable have been included.
4.
In the third row,
enter the variable name count, and since all the counts must be
integers, set the entry in the third cell of the Decimals column to 0
(zero).
5.
Go to the Data
View sheet (by clicking on the appropriate tab at the bottom of the
screen), and in the first two columns for the two (qualitative) variables on
which the test is to be performed, enter the codes 1 and 1 respectively into
the first and second cells of the first row, enter the codes 1 and 2 respectively
into the first and second cells of the second row, enter the codes 1 and 3
respectively into the first and second cells of the third row, etc., making
certain that all codes used for the (qualitative) variable in the second column
have been entered. Now, repeat this in
the next rows with the code 2 entered in the first column, and then repeat this
again with the code 3 entered in the first column, etc. making certain that all
codes used for the (qualitative) variable in the first column have been
entered. Each possible combination of
categories for the two (qualitative) variables should now be displayed exactly
once in the first two columns. (If the
category labels are not displayed, then select View > Value Labels
from the main menu.)
6.
In the column for
the variable count, enter the corresponding counts (i.e., the raw
frequency of occurrence in the data for each combination of categories); after
all the data is entered, it may be a good idea to save this SPSS file using an
appropriate file name.
7.
If each line of
the data file represents one case (i.e., the data were not entered in
the format described in steps #2 to #6), then select the Analyze >
Descriptive Statistics > Crosstabs options to display the Crosstabs
dialog box, and proceed to the next step; if each line of the data file
represents a combination of categories of the two (qualitative) variables on
which the test is to be performed (i.e., the data were entered in the
format described in steps #2 to #6), then do the following: First, select the Data >
Weight Cases options to display the Weight Cases dialog box; then,
select the Weight cases by option, select from the list on the left the
variable name for the raw frequency (count) of occurrence in the data for each
combination of categories, and click on the arrow button pointing toward the Frequency
Variable slot; next, click on the OK button; finally, select the Analyze >
Descriptive Statistics > Crosstabs options to display the Crosstabs
dialog box, and proceed to the next step.
8.
From the list of
the variables on the left, select one of the two (qualitative) variables on
which the test is to be performed, and click on the arrow button pointing
toward the Row(s) section; then select the other (qualitative) variable,
and click on the arrow button pointing toward the Column(s) section.
9.
Click on the Statistics
button to display the Crosstabs: Statistics dialog box, select the Chi-square
option in the upper left corner of the dialog box, and click on the Continue
button to return to the Crosstabs dialog box.
10.
Click on the Cells
button to display the Crosstabs: Cell Display dialog box; in order to
have the data (observed frequencies) displayed, select the Observed
option in the Counts section; in order to have the expected frequencies
displayed, select the Expected option in the Counts section; in
order to have the percentages for column variable categories displayed for each
row variable category, select the Row option in the Percentages
section; in order to have the percentages for row variable categories displayed
for each column variable category, select the Column option in the Percentages
section; in order to have the percentages for each cell out of the total
displayed, select the Total option in the Percentages section; in
order to have the standardized residuals displayed for each cell, select the Standardized
option in the Residuals section; click on the Continue button to
return to the Crosstabs dialog box.
(NOTE: The column heading for the display of the p‑value for the Pearson Chi‑Square statistic has “2‑sided”
in parentheses on the SPSS output, which can be misleading since the Pearson
Chi‑Square test is generally a one‑sided test.)
11.
Click on the OK button, after which the SPSS output
will be generated.
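The expected frequencies, standardized residuals, and Pearson chi-square statistic requested above can be reproduced with a short Python sketch (an illustration only; the function name and the 2 x 2 table are hypothetical). Each expected frequency is the row total times the column total divided by the overall total, and each standardized residual is taken here as (observed - expected) divided by the square root of the expected frequency.

```python
from math import sqrt


def chi_square_association(table):
    """Return (expected, standardized residuals, chi-square, df).

    table is a list of rows of observed frequencies.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    expected = [[r * c / n for c in col_totals] for r in row_totals]
    std_resid = [
        [(o - e) / sqrt(e) for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]
    # The Pearson statistic is the sum of the squared standardized residuals.
    chi_square = sum(z ** 2 for row in std_resid for z in row)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, std_resid, chi_square, df


# Hypothetical observed frequencies (rows by columns).
expected, residuals, chi_sq, df = chi_square_association([[10, 20], [30, 40]])
```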
One possible appropriate
graphical display for the data used in a chi‑square test for association
is a stacked bar chart, which can be obtained in SPSS by doing the following:
1.
Select the Graphs >
Legacy Dialogs > Bar options to display the Bar Charts
dialog box.
2.
Select the Stacked option and the Summaries for groups of cases option.
3.
Click on the Define button to display the Define
Stacked Bar: Summaries for Groups of
Cases dialog box.
4.
From the list of
variables on the left, select the (qualitative) variable whose categories will
be represented by the bars, and click on the arrow pointing toward the Category Axis slot; then select the
(qualitative) variable whose categories will be represented by stacks on each
of the bars, and click on the arrow pointing toward the Define Stacks by
slot.
5.
Select the % of cases option in the Bars Represent section. (The N
of cases option can be selected if raw frequencies are desired, but it is
much more common to use percentages in stacked bar charts.)
6.
Click on the OK button, after which a stacked bar
chart in SPSS output will be generated, but the bars may not all be of the same
height, and the percentages scaled on the vertical axis are for the stacks
across bars instead of within bars. Both
of these issues can be addressed to make the stacked bar chart easier to read
by editing the graph.
7.
To edit the
stacked bar chart, double click on the graph to enter the SPSS Chart Editor. Then,
select the Options > Scale to
100% options, and select the File >
Close options to close the chart editor.
The stacked bar chart should now be easier to read.
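The effect of scaling to 100% can be sketched in Python: each stack is re-expressed as a percentage of its own bar, so that every bar totals 100. The function name, the category labels, and the frequencies below are hypothetical, and the sketch only illustrates the arithmetic, not how the Chart Editor works internally.

```python
def within_bar_percentages(counts):
    """counts maps each bar category to {stack category: frequency}."""
    result = {}
    for bar, stacks in counts.items():
        total = sum(stacks.values())
        # Each stack becomes its share of the bar it belongs to.
        result[bar] = {stack: 100 * n / total for stack, n in stacks.items()}
    return result


# Hypothetical frequencies: two bars, each split into two stacks.
percentages = within_bar_percentages({
    "A": {"yes": 30, "no": 10},
    "B": {"yes": 5, "no": 15},
})
```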
Performing a simple linear regression with bivariate
data, with checks of linearity, homoscedasticity, and normality assumptions
1.
Identify in the
SPSS data file the (quantitative) dependent (response) variable and the
(quantitative) independent (explanatory or predictor) variable.
2.
Select the Graphs
> Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot
dialog box.
3.
Make certain that
the Simple Scatter option is selected; then, click on the Define
button to display the Simple Scatterplot dialog box.
4.
From the list of
variables on the left, select the (quantitative) dependent (response) variable,
and click on the arrow pointing toward the Y-Axis slot; then select the
(quantitative) independent (explanatory or predictor) variable, and click on
the arrow pointing toward the X-Axis slot.
5.
Click on the OK
button, after which SPSS output displaying a scatter plot will be
generated. In order to have the least
squares line appear on the scatter plot, first double click on the graph to
enter the SPSS Chart Editor, and
then select the Elements > Fit Line at
Total options from the main menu (and close the dialog box which appears),
after which selecting the File >
Close options will close the chart editor.
By examining how the points on the scatterplot are distributed
around the least squares line, a decision can be made
as to whether the linearity assumption about the relationship between the
dependent and independent variables is satisfied.
6.
Select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
7.
From the list of
variables on the left, select the (quantitative) dependent (response) variable,
and click on the arrow pointing toward the Dependent
slot; then select the (quantitative) independent (explanatory or predictor)
variable, and click on the arrow pointing toward the Independent(s)
section (where selection of more than one variable is permitted).
8.
Click on the Plots button to display the Linear
Regression: Plots dialog box. From
the list of variables on the left, select ZRESID,
and click on the arrow pointing toward the Y slot; then select ZPRED, and click on the arrow pointing
toward the X slot. This will
generate a scatter plot with standardized predicted values on the horizontal
axis and standardized residuals on the vertical axis, so that a decision can be
made as to whether the homoscedasticity assumption is satisfied.
9.
In the Standardized Residual Plots section of
the Linear Regression: Plots dialog box, select the Histogram
option and the Normal probability plot option. This will generate a histogram and a
normal probability plot for standardized residuals, so that a decision can be
made as to whether the normality assumption is satisfied.
10.
Click on the Continue button, and then click on the Statistics button to display the Linear
Regression: Statistics dialog box.
In the Regression Coefficients
section, make certain that the Estimates
option is selected, and select the Descriptives
option to generate means, standard deviations, and the Pearson correlation;
also, select the Confidence intervals option to set the desired (two‑sided)
confidence level (which will generally be 100% minus the significance level,
i.e., if the significance level is 0.05 (5%), then the confidence level will be
0.95 (95%)). This will generate
confidence intervals for the slope and for the intercept in the regression.
11.
Click on the Continue button, and then click on the Save button to display the Linear Regression:
Save dialog box. In the Residuals
section, select the Standardized option to save the standardized
residuals as part of the data. This
allows further analysis to be performed using the standardized residuals.
12.
Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated.
13.
The scatter plot
with standardized predicted values on the horizontal axis and standardized
residuals on the vertical axis, requested in Step #8, will be displayed on the
SPSS output without a horizontal line at zero; since this line can be helpful
in examining this plot, the instructions in Step #5 to have the least squares
line appear on the scatter plot can be used to add this horizontal line at
zero.
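For readers who want to see the arithmetic behind these steps, the following is a minimal Python sketch, not SPSS code. The function name simple_linear_regression and the sample data are hypothetical. It assumes that ZRESID is the residual divided by the standard error of the estimate, and that ZPRED is the predicted value rescaled to mean 0 and standard deviation 1; both definitions are assumptions made for this sketch.

```python
import math

def simple_linear_regression(x, y):
    """Least squares fit of y = intercept + slope * x, with residual diagnostics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * xi for xi in x]
    residuals = [yi - pi for yi, pi in zip(y, predicted)]
    # Standard error of the estimate: sqrt(SSE / (n - 2)).
    std_error = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    # Assumed definition of ZRESID: residual divided by the standard error.
    zresid = [e / std_error for e in residuals]
    # Assumed definition of ZPRED: predictions rescaled to mean 0, sd 1.
    mean_p = sum(predicted) / n
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in predicted) / (n - 1))
    zpred = [(p - mean_p) / sd_p for p in predicted]
    return {"slope": slope, "intercept": intercept,
            "zpred": zpred, "zresid": zresid}

# Hypothetical data: x is the independent variable, y the dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
fit = simple_linear_regression(x, y)
print("slope:", round(fit["slope"], 4), "intercept:", round(fit["intercept"], 4))
# Pairs for a ZPRED (horizontal) versus ZRESID (vertical) plot.
for zp, zr in zip(fit["zpred"], fit["zresid"]):
    print(f"{zp:8.3f} {zr:8.3f}")
```

Plotting the printed pairs gives the kind of plot requested in Step #8: a roughly even vertical spread around zero across the horizontal axis is what the homoscedasticity assumption would lead one to expect.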
Statistical Analysis Involving Two or
More Variables
Performing a two‑way ANOVA (analysis of
variance)
1.
Identify in the
SPSS data file the two (qualitative) variables which define the cells and the
(quantitative) variable for which means are being compared, and select a
significance level α.
2.
Select the Analyze >
General Linear Model > Univariate options to
display the Univariate dialog box.
3.
From the list of
variables on the left, select the desired (quantitative) variable, and click on
the arrow pointing toward the Dependent Variable slot.
4.
From the list of
variables on the left, select the two desired (qualitative) variables, and
click on the arrow pointing toward the Fixed Factor(s) section.
5.
Click on the Post
Hoc button to display the Univariate:
Post Hoc Multiple Comparisons for Observed Means dialog box; select each
item from the list in the Factor(s) section on the left for the Post
Hoc Tests for section on the right. From
the Equal Variances Assumed section, select a desired multiple
comparison procedure (such as Bonferroni); if it
is deemed necessary later, an option from the Equal Variances Not Assumed
section can be used. These multiple comparison
procedures are generally needed when one or both main effects are statistically
significant. Click on the Continue
button to return to the Univariate dialog box.
6.
Click on the Options
button to display the Univariate: Options
dialog box. Select each item from the
list in the Factor(s) and Factor Interactions section on the left for
the Display Means for section on the right.
7.
In the Display
section, select Homogeneity tests (which will generate results for Levene’s test) and select Estimates of effect size. Click on the Continue button to return
to the Univariate dialog box.
8.
Click on the Plots
button to display the Univariate: Profile
Plots dialog box. To generate one
of the two possible interaction plots, select one of the two variables from the
Factor(s) section on the left for the Horizontal Axis slot on the
right, and select the other variable for the Separate Lines slot on the
right; then, click the Add button to add this plot to the Plots
section; to generate the other possible interaction plot, repeat this with the
roles of the variables reversed. Click
on the Continue button to return to the Univariate
dialog box.
9.
Click on the OK button, after which the SPSS output
will be generated.
Appropriate graphical displays
for the data used in a two‑way ANOVA include box plots to display main
effects, and interaction plots for the interaction effects.
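The quantities behind the two-way ANOVA table can be illustrated with a short Python sketch (not SPSS code). It assumes a balanced design, meaning the same number of observations in every cell, in which case the usual sums of squares partition the total exactly. The function name two_way_anova_balanced, the factor levels, and the data are all hypothetical.

```python
from collections import defaultdict

def two_way_anova_balanced(rows):
    """rows: list of (level_a, level_b, value); every cell must be the same size."""
    cells = defaultdict(list)
    for a, b, value in rows:
        cells[(a, b)].append(value)
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    r = len(next(iter(cells.values())))  # observations per cell
    if (len(cells) != len(levels_a) * len(levels_b)
            or any(len(v) != r for v in cells.values())):
        raise ValueError("this sketch requires a balanced design")

    def mean(values):
        return sum(values) / len(values)

    n = len(rows)
    grand = mean([v for _, _, v in rows])
    cell_mean = {key: mean(vals) for key, vals in cells.items()}
    mean_a = {a: mean([cell_mean[(a, b)] for b in levels_b]) for a in levels_a}
    mean_b = {b: mean([cell_mean[(a, b)] for a in levels_a]) for b in levels_b}

    ss_a = r * len(levels_b) * sum((mean_a[a] - grand) ** 2 for a in levels_a)
    ss_b = r * len(levels_a) * sum((mean_b[b] - grand) ** 2 for b in levels_b)
    ss_ab = r * sum((cell_mean[(a, b)] - mean_a[a] - mean_b[b] + grand) ** 2
                    for a in levels_a for b in levels_b)
    ss_error = sum((v - cell_mean[(a, b)]) ** 2 for a, b, v in rows)
    ss_total = sum((v - grand) ** 2 for _, _, v in rows)

    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    df_ab = df_a * df_b
    df_error = n - len(levels_a) * len(levels_b)
    ms_error = ss_error / df_error
    return {
        "cell_means": cell_mean,  # the values drawn in an interaction plot
        "ss": {"A": ss_a, "B": ss_b, "AxB": ss_ab,
               "error": ss_error, "total": ss_total},
        "F": {"A": ss_a / df_a / ms_error,
              "B": ss_b / df_b / ms_error,
              "AxB": ss_ab / df_ab / ms_error},
    }

# Hypothetical 2 x 2 design with two observations per cell.
data = [("low", "A", 4.0), ("low", "A", 6.0), ("low", "B", 7.0), ("low", "B", 9.0),
        ("high", "A", 8.0), ("high", "A", 10.0), ("high", "B", 15.0), ("high", "B", 17.0)]
result = two_way_anova_balanced(data)
print(result["cell_means"])
print(result["ss"])
print(result["F"])
```

The cell means are the points drawn in an interaction plot: lines that are far from parallel go together with a large interaction sum of squares. The sketch stops at the F statistics; obtaining p-values would require the F distribution, which the Python standard library does not provide.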
Performing a multiple linear regression with checks of
linearity, homoscedasticity, and normality assumptions
1.
Identify in the
SPSS data file the (quantitative) dependent (response) variable, all
quantitative independent (explanatory or predictor) variables, and all
qualitative independent (explanatory or predictor) variables.
2.
Select the Graphs
> Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot
dialog box.
3.
Make certain that
the Simple Scatter option is selected; then, click on the Define
button to display the Simple Scatterplot dialog box.
4.
From the list of
variables on the left, select the (quantitative) dependent (response) variable,
and click on the arrow pointing toward the Y-Axis slot; then select one
of the quantitative independent (explanatory or predictor) variables, and click
on the arrow pointing toward the X-Axis slot.
5.
Click on the OK
button, after which SPSS output displaying a scatter plot will be
generated. In order to have the least
squares line appear on the scatter plot, first double click on the graph to
enter the SPSS Chart Editor, and
then select the Elements > Fit Line at
Total options from the main menu (and close the dialog box which appears),
after which selecting the File >
Close options will close the chart editor.
By examining how the points on the scatter plot are distributed
around the least squares line, a decision can be made
as to whether the linearity assumption about the relationship between the
dependent and independent variables is satisfied.
6.
Repeat Steps #2
to #5 for each of the quantitative independent (explanatory or predictor)
variables.
7.
For each of the
qualitative independent (explanatory or predictor) variables, add the
appropriate dummy variable(s) to the data file.
For a qualitative variable with k
categories, this can be done by defining dummy variables, where the first dummy
variable is equal to 1 (one) for category #1 and 0 (zero) otherwise, the second
dummy variable is equal to 1 (one) for category #2 and 0 (zero) otherwise,
etc.; since k - 1 dummy variables are sufficient to represent a
qualitative variable with k
categories, the kth
dummy variable is not really necessary, and may or may not be used.
8.
Select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
9.
From the list of
variables on the left, select the (quantitative) dependent (response) variable,
and click on the arrow pointing toward the Dependent
slot; then select each independent (explanatory or predictor) variable, and
click on the arrow pointing toward the Independent(s) section (where
selection of more than one variable is permitted).
10.
Click on the Plots button to display the Linear
Regression: Plots dialog box. From
the list of variables on the left, select ZRESID,
and click on the arrow pointing toward the Y slot; then select ZPRED, and click on the arrow pointing
toward the X slot. This will
generate a scatter plot with standardized predicted values on the horizontal
axis and standardized residuals on the vertical axis, so that a decision can be
made as to whether the homoscedasticity assumption is satisfied.
11.
In the Standardized Residual Plots section of
the Linear Regression: Plots dialog box, select the Histogram
option and the Normal probability plot option. This will generate a histogram and a normal
probability plot for standardized residuals, so that a decision can be made as
to whether the normality assumption is satisfied.
12.
Click on the Continue button, and then click on the Statistics button to display the Linear
Regression: Statistics dialog box.
In the Regression Coefficients
section, make certain that the Estimates
option is selected, and select the Descriptives
option to generate means, standard deviations, and the Pearson correlation;
also, select the Confidence intervals option to set the desired (two‑sided)
confidence level (which will generally be 100% minus the significance level,
i.e., if the significance level is 0.05 (5%), then the confidence level will be
0.95 (95%)). This will generate
confidence intervals for the slope of each independent variable and for the intercept in the regression.
13.
Click on the Continue button, and then click on the Save button to display the Linear
Regression: Save dialog box. In the Residuals
section, select the Standardized option to save the standardized
residuals as part of the data. This
allows further analysis to be performed using the standardized residuals.
14.
Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated.
15.
The scatter plot
with standardized predicted values on the horizontal axis and standardized
residuals on the vertical axis, requested in Step #10, will be displayed on the
SPSS output without a horizontal line at zero; since this line can be helpful
in examining this plot, the instructions in Step #5 to have the least squares
line appear on the scatter plot can be used to add this horizontal line at
zero.
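The dummy variable coding described in Step #7 and the least squares fit itself can be illustrated with the following Python sketch (not SPSS code). The helper names dummy_code, solve, and ols, the choice of reference category, and the data are hypothetical. The sketch uses k - 1 dummy variables because, when the model contains an intercept, the k dummy variables always sum to 1 and the kth would therefore be redundant.

```python
def dummy_code(values, reference):
    """Return (names, rows): one 0/1 column per category except the reference."""
    categories = sorted(set(values))
    categories.remove(reference)
    rows = [[1.0 if v == c else 0.0 for c in categories] for v in values]
    return categories, rows

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def ols(rows, y):
    """rows: predictor values per case (no constant). Returns [intercept, b1, b2, ...]."""
    design = [[1.0] + list(r) for r in rows]
    p = len(design[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)

# Hypothetical data: one quantitative predictor and one three-category predictor.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
group = ["a", "b", "c", "a", "b", "c"]
# Constructed without noise as y = 1 + 2*x + 3*(group is b) + 5*(group is c).
y = [3.0, 8.0, 12.0, 9.0, 14.0, 18.0]

names, dummies = dummy_code(group, reference="a")
rows = [[xi] + d for xi, d in zip(x, dummies)]
coefficients = ols(rows, y)
print(["intercept", "x"] + ["group_" + name for name in names])
print([round(c, 6) for c in coefficients])
```

With this coding, each dummy coefficient is the estimated difference between its category and the reference category (here "a"), holding the quantitative predictor fixed.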
SPSS provides methods for deciding which of
many predictors are the most important to include in a model; to use one of
these methods, do the following:
1.
Select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
2.
From the list of
variables on the left, select the (quantitative) dependent (response) variable,
and click on the arrow pointing toward the Dependent
slot; then select each independent (explanatory or predictor) variable, and
click on the arrow pointing toward the Independent(s) section (where
selection of more than one variable is permitted).
3.
In the Method
slot, select the desired method for variable selection (such as the Stepwise
option).
4.
Click on the Statistics button to display the Linear
Regression: Statistics dialog box.
In the Regression Coefficients
section, make certain that the Estimates
option is selected, select the R squared change option, and select the Collinearity diagnostics option; also, select
the Confidence intervals option to set the desired (two‑sided)
confidence level (which will generally be 100% minus the significance level,
i.e., if the significance level is 0.05 (5%), then the confidence level will be
0.95 (95%)). This will generate
confidence intervals for the slope of each independent variable and for the intercept in the regression.
5.
Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated.
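The idea of adding predictors one at a time and watching the change in R squared can be illustrated with the following Python sketch (not SPSS code). It is a simplified forward selection that uses a plain R squared change threshold as a stand-in; SPSS's Stepwise method applies F-based entry and removal criteria instead, so this is not the same procedure. The function names, the min_change threshold, and the data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(columns, y):
    """R squared of the least squares fit of y on the given columns plus a constant."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - sse / sst

def forward_select(predictors, y, min_change=0.01):
    """Add the predictor with the largest R squared gain until the gain is too small.

    predictors: dict mapping name -> list of values.
    Returns [(name, r_squared_change), ...] in order of entry.
    """
    chosen, history, current = [], [], 0.0
    remaining = dict(predictors)
    while remaining:
        best_name, best_r2 = max(
            ((name, r_squared([predictors[c] for c in chosen] + [col], y))
             for name, col in remaining.items()),
            key=lambda item: item[1])
        if best_r2 - current < min_change:
            break
        history.append((best_name, best_r2 - current))
        chosen.append(best_name)
        current = best_r2
        del remaining[best_name]
    return history

# Hypothetical data: y depends on x1 and x2 (plus small noise) but not on x3.
predictors = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0],
    "x3": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
y = [5.1, 3.9, 6.05, 10.95, 13.0, 12.1, 13.9, 19.0]
steps = forward_select(predictors, y)
for name, change in steps:
    print(name, round(change, 4))
```

Each printed change is the gain in R squared from entering that predictor, which is the quantity the R squared change option reports for each step of the model.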
Generating a correlation matrix
????????????????