Using SPSS for Windows (Version 25.0)
This document contains instructions on using
SPSS to perform various statistical analyses.
The list of section and subsection titles is as follows:
I. Data Entry and Manipulation
IMPORTANT NOTES
(1) Defining Variables
(2) Creating New Variables by Transformation of Existing Variables
(3) Creating New Variables by Recoding of Existing Variables
II. Data Diagnostics, Graphical Displays, and Descriptive Statistics
IMPORTANT NOTES
(1) Checking Data Ranges, Summary Statistics, and Missing Values
(2) Checking for Skewness and Non-Normality
(3) Creating Graphical Displays and Obtaining Descriptive Statistics
III. Statistical Analysis Involving One Variable
(1) Performing a One-Sample t Test about a Mean μ
(2) Performing a Chi-square Goodness-of-Fit Test with Hypothesized Proportions
IV. Statistical Analysis Involving Two Variables
(1) Generating a Correlation Matrix with p-values
(2) Performing a Paired Sample t Test about a Mean Difference μd (i.e., a difference between means from dependent samples) or a Wilcoxon Signed Rank Test
(3) Performing a Two Sample t Test about a Difference Between Means μ1 and μ2 or a Mann-Whitney Rank Sum Test
(4) Performing a One-Way ANOVA (Analysis of Variance) to Test for at Least One Difference Among Multiple Means μ1, μ2, …, μk or a Kruskal-Wallis Rank Sum Test
(5) Performing a Chi-square Test Concerning Independence
(6) Performing a Simple Linear Regression with Checks of Linearity, Homoscedasticity, and Normality Assumptions
V. Statistical Analysis Involving Multiple (Two or More) Variables
(1) Performing a Quadratic Regression with Checks of Model, Homoscedasticity, and Normality Assumptions
(2) Performing a Multiple Linear Regression with Checks for Multicollinearity and of Linearity, Homoscedasticity, and Normality Assumptions
(3) Performing a Stepwise Linear Regression to Build a Model
(4) Performing a Stepwise Binary Logistic Regression to Build a Model
(5) Performing a Two-Way ANOVA (Analysis of Variance) with Checks of Equal Variance and Normality Assumptions
(6) Performing a One-Way ANCOVA (Analysis of Covariance) with Checks of Equal Variance and Normality Assumptions
(7) Performing a Two-Way ANCOVA (Analysis of Covariance) with Checks of Equal Variance and Normality Assumptions
(8) Performing a Repeated Measures Within-Subjects ANOVA with Checks of Sphericity and Normality Assumptions
(9) Performing a Repeated Measures Mixed Between-Within-Subjects ANOVA with Checks of Sphericity, Equal Variance and Covariance, and Normality Assumptions
I.
Data Entry and Manipulation
IMPORTANT NOTES:
Data can be entered in SPSS either before or after
variables are defined; cells with missing values will display a period. The default in SPSS is to use all
cases. In order to use only selected
cases in the data file, first select the Data > Select Cases
options to display the Select Cases dialog box; then click on the If button to display the Select
Cases: If dialog box. After the
desired condition is entered, click on the Continue
button, and then click on the OK
button, after which only the desired cases will be included in data analysis
(excluded cases are marked with a slash through their row numbers); also, a variable named filter_$ will be added to the data.
In many of the SPSS dialog boxes (generally from clicking
an Options button), there will be
two choices for handling missing data.
The Exclude cases pairwise
choice will have SPSS perform each specific procedure using all cases with no
missing data for the variables involved in that procedure, which implies
that with missing data the sample size may not be the same for each procedure
performed. The Exclude cases listwise choice will have SPSS perform each specific
procedure using only those cases with no missing data for the variables
involved in every procedure that is to be performed, which implies that with
missing data the sample size will be the same for each procedure performed. (Of course, when there is no missing data, it
makes no difference which of these two choices is made.)
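The pairwise/listwise distinction can be illustrated outside SPSS with a small pandas sketch (the data below are made up):

```python
import numpy as np
import pandas as pd

# Made-up data with one missing value in each variable.
df = pd.DataFrame({"x": [1.0, 2.0, np.nan, 4.0, 5.0],
                   "y": [2.0, 1.0, 3.0, np.nan, 5.0]})

# Pairwise-style exclusion: each statistic uses every case that is
# complete for the variables it needs, so sample sizes can differ.
n_x = int(df["x"].count())
n_y = int(df["y"].count())

# Listwise-style exclusion: drop any case with any missing value
# first, so every statistic is based on the same cases.
n_listwise = len(df.dropna())

print(n_x, n_y, n_listwise)
```

Here each variable has 4 complete cases pairwise, but only 3 cases survive listwise deletion, which is exactly why sample sizes can differ between the two choices.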
(1) Defining Variables
Step
1: After entering SPSS, you
should see at the bottom of the screen a tab for Data View and a tab
for Variable View. Go to
the Variable View tab, and notice that there are several column
headings. Each row corresponds to a
different variable, and the column headings indicate different information
about each variable. An abbreviated
name for each variable can be entered in the Name column.
A longer, more descriptive name for each variable can
optionally be entered in the Label column. There are several choices
available for the Type column, but
for basic analysis of data, the Numeric option suffices for all variables except those which are merely case
identifiers never to be used in any statistical analysis, for which the String option can be used.
Step
2: For variables which are not to
be treated as categorical, no more information is required, although options
for columns such as Width and Decimals (which are concerned with how
the data is displayed in the data file) can be set as desired. For variables which are designed to be
treated as categorical, the categories must be defined in the Values column. To define these categories for a variable
designed to be treated as categorical, click on the cell for the Values
column, and click on the button which appears in the right hand side of the
cell, after which the Value Labels dialog box is displayed.
Step 3: In the Value slot, type the
numerical code corresponding to one of the categories, and in the Label
slot, type the corresponding name for the category. Then click on the Add button, after
which the information just entered is listed in the section at the bottom of
the dialog box. Repeat this step until
the information for every category is listed there.
Step
4: To leave the dialog box, click
on the OK button. To return to
viewing the data, go to the Data View tab, and if you do not
now see the category names displayed (once the data is entered), then select View>
Value Labels from the main menu.
(2) Creating New Variables by
Transformation of Existing Variables
Step
1: Select the Transform >
Compute Variable options to display the Compute Variable dialog
box.
Step
2: In the Target Variable slot, type an appropriate name for the new variable
which is to be a function of existing variables.
Step
3: In the Numeric Expression section, a formula is needed to indicate how the
values for the new variable are to be calculated; this can be accomplished by
constructing the appropriate formula in the Numeric
Expression section through the selection of variable names from the list of
variables on the left and clicking on the arrow pointing toward the Numeric Expression section, together
with the selection of algebraic operation buttons from the keypad displayed in
the middle of the dialog box.
Step
4: Click on the OK button, after which the new variable
should be added to the data.
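As a rough analogue outside SPSS, computing a new variable from a numeric expression looks like this in pandas (the variable names and formula below are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"height_m": [1.60, 1.75],
                   "weight_kg": [55.0, 70.0]})

# Analogue of Transform > Compute Variable with the numeric
# expression weight_kg / (height_m ** 2): the target variable bmi
# is computed element-wise from the existing variables.
df["bmi"] = df["weight_kg"] / df["height_m"] ** 2

print(df["bmi"].round(1).tolist())
```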
(3) Creating New Variables by Recoding
Existing Variables
Step
1: Select the Transform >
Recode into Different Variables options to display the Recode into
Different Variables dialog box.
Step
2: From the list of variables on
the left, select the existing variable to be recoded into a new variable, and
click on the arrow pointing toward the Numeric Variable → Output Variable section.
Step
3: In the Output Variable
section, type an appropriate name in the Name slot for the new variable to
be created by recoding, and click on the Change
button. You should now see in the Numeric
Variable → Output
Variable section an indication of
which existing variable is being recoded into the new variable.
Step
4: Click the Old and New
Values button to display the Recode into Different Variables: Old and
New Values dialog box.
Step 5: In the Old Value section, click
an appropriate option and enter the appropriate information for a value or
range of values for the existing variable being recoded; in the New
Value section, click an appropriate option and enter the appropriate
information for a corresponding value for the new variable to be created by
recoding; then click on the Add button,
after which you should see in the Old → New section an indication
of how this recoding will be done.
Repeat this process until all the recoding information has been entered.
Step
6: After all the recoding
information has been entered, click on the Continue button to return to
the Recode into Different Variables dialog box, and then click on the OK
button, after which the new variable should appear in the SPSS data file.
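Outside SPSS, the same recode-into-a-different-variable idea can be sketched with pandas (the cutoffs and codes below are made up):

```python
import pandas as pd

df = pd.DataFrame({"score": [45, 72, 88, 60]})

# Analogue of Recode into Different Variables: map ranges of the old
# variable (score) onto numeric codes for a new variable (group),
# leaving the original variable unchanged.
df["group"] = pd.cut(df["score"], bins=[0, 59, 79, 100],
                     labels=[1, 2, 3])

print(df["group"].tolist())
```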
II.
Data Diagnostics, Graphical Displays, and Descriptive Statistics
IMPORTANT NOTES:
After all data has been entered in SPSS, it can be
desirable to check for data entry errors that might have been made and to
assess how much missing data there is; this can be accomplished by checking
data ranges and missing values as described below. Also, it can be desirable to evaluate the
degree to which the data satisfy certain assumptions required for statistical
analysis; some features in SPSS described below illustrate how to do this. Finally, it can be desirable to create
graphical displays and obtain descriptive statistics; using the appropriate
procedures in SPSS is addressed below.
(1) Checking Data Ranges, Summary
Statistics, and Missing Values
Step
1: Identify the qualitative
(i.e., categorical) variable(s) in the SPSS data file to be checked for data
entry errors and missing data.
Step
2: Select the Analyze >
Descriptive Statistics > Frequencies options to display the Frequencies
dialog box. From the list of variables
on the left, select one (or more) of the qualitative (i.e., categorical)
variables to be checked, and click on the arrow pointing toward the Variable(s) slot; any variable needing
to be removed from the Variable(s)
slot can be selected and removed by clicking on the arrow pointing toward the
list on the left. Use this process until
the list in the Variable(s) slot
consists exactly of all desired qualitative variables.
Step
3: The Display frequency tables option can be checked to generate a frequency
table listing the individual values of each qualitative variable. To see what descriptive statistics will be
displayed, click on the Statistics
button, after which the Frequencies: Statistics dialog box is
displayed. Since all the descriptive statistics
options displayed are primarily for quantitative variables, all boxes should be
unchecked (unless one or more of these is specifically of interest for some
reason). Click on the Continue
button to return to the Frequencies dialog box.
Step
4: Click on the OK button,
after which the SPSS output will be generated.
Step
5: Identify the quantitative
variable(s) in the SPSS data file to be checked for data entry errors and
missing data.
Step
6: Select the Analyze >
Descriptive Statistics > Frequencies options to display the Frequencies
dialog box. From the list of variables
on the left, select one (or more) of the quantitative variables to be checked,
and click on the arrow pointing toward the Variable(s)
slot; any variable needing to be removed from the Variable(s) slot can be selected and removed by clicking on the
arrow pointing toward the list on the left.
Use this process until the list in the Variable(s) slot consists exactly of all desired quantitative
variables.
Step
7: To obtain descriptive
statistics, click on the Statistics
button, after which the Frequencies: Statistics dialog box is
displayed. Select the options for the
descriptive statistics that would be of interest, which often include the Quartiles option in the Percentile Values section, the Mean, Median, and Mode options
in the Central Tendency section, the
Std deviation, Minimum, Maximum, and Range options in the Dispersion section, and the Skewness and Kurtosis options in the Distribution
section. Click on the Continue
button to return to the Frequencies dialog box. The Display
frequency tables option can be unchecked to avoid generating a frequency
table listing every individual value of the quantitative variable(s), unless
this is desired.
Step
8: Click on the OK button,
after which the SPSS output will be generated.
(2) Checking for Skewness and
Non-Normality
Step
1: In the SPSS data file,
identify one or more quantitative variables to be checked for normality or
skewness, and identify, if there are any, one or more qualitative (i.e.,
categorical) variables for defining groups.
Step
2: Select the Analyze >
Descriptive Statistics > Explore options to display the Explore
dialog box.
Step
3: From the list of variables on
the left, select the quantitative variable, or variables, for which skewness
and non‑normality are to be investigated, and click on the arrow pointing
toward the Dependent List section; if there is a grouping variable in
the list, select this variable and click on the arrow pointing toward the Factor
List section.
Step
4: In the Display section
near the bottom of the dialog box, select the Both option.
Step 5: Click on the Plots
button to display the Explore: Plots dialog box, and select the Normality
plots with tests option; notice that the Stem-and-leaf option is
selected, and it can be unselected if it is of no interest, in order
to minimize the amount of output.
Step
6: Click on the Continue
button to return to the Explore dialog box, and click on the OK
button, after which the SPSS output will be generated.
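SPSS's Normality plots with tests option reports, among other things, a Shapiro-Wilk test; the same checks can be sketched with scipy (the data below are simulated for illustration):

```python
import numpy as np
from scipy import stats

# Simulated sample standing in for a quantitative variable.
rng = np.random.default_rng(0)
sample = rng.normal(loc=50, scale=10, size=200)

# Sample skewness (near 0 for roughly symmetric data) and the
# Shapiro-Wilk test (a small p-value is evidence of non-normality).
skewness = stats.skew(sample)
w_stat, p_value = stats.shapiro(sample)

print(round(skewness, 3), round(p_value, 3))
```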
(3) Creating Graphical Displays and
Obtaining Descriptive Statistics
Creating
a Bar Chart
Step
1: In the SPSS data file,
identify the qualitative (i.e., categorical) variable for which a bar chart is
to be created.
Step
2: Select the Graphs >
Legacy Dialogs > Bar options to display the Bar Charts
dialog box.
Step
3: Make certain the Simple
option is selected, and that the Summaries for groups of cases option is
selected.
Step
4: Click on the Define
button to display the Define Simple Bar: Summaries for Groups of Cases
dialog box.
Step
5: From the list of variables on
the left, select the desired qualitative (i.e., categorical) variable, and
click on the arrow pointing toward the Category Axis slot.
Step
6: Click on the OK button,
after which the SPSS output will be generated, and note that raw frequencies
are displayed.
Step
7: If it is desirable to make
changes to the bar chart, double click on the graph to enter the SPSS Chart
Editor.
Step
8: After making desired changes,
exit from the Chart Editor, after which you should see that the SPSS output has
been updated.
Creating
a Pie Chart
Step
1: In the SPSS data file,
identify the qualitative (i.e., categorical) variable for which a pie chart is
to be created.
Step
2: Select the Graphs >
Legacy Dialogs > Pie options to display the Pie Charts
dialog box.
Step
3: Make certain the Simple
option is selected, and that the Summaries for groups of cases option is
selected.
Step
4: Click on the Define
button to display the Define Pie: Summaries for Groups of Cases dialog
box.
Step
5: From the list of variables on
the left, select the desired qualitative (i.e., categorical) variable, and
click on the arrow pointing toward the Define Slices by slot.
Step
6: Click on the OK button,
after which the SPSS output will be generated, and note that a legend has been
created, but no raw or relative frequencies are displayed.
Step
7: Double click on the graph to
enter the SPSS Chart Editor.
Step
8: Select the Elements > Show
Data Labels options to display the Properties dialog box, and select
the Data Value Labels tab.
Step
9: Move Percent (or any
other desired choice) from the Not Displayed box to the Displayed
box, and click on the Apply button.
Step
10: Close the Properties
dialog box, and exit from the Chart Editor, after which you should see that the
SPSS output has been updated.
Creating
a Stacked Bar Chart
Step
1: In the SPSS data file,
identify the two qualitative (i.e., categorical) variables for which a stacked
bar chart is to be created.
Step
2: Select the Graphs >
Legacy Dialogs > Bar options to display the Bar Charts
dialog box.
Step
3: Select the Stacked
option and the Summaries for groups of cases option.
Step
4: Click on the Define
button to display the Define Stacked Bar: Summaries for Groups of Cases
dialog box.
Step
5: From the list of variables on
the left, select the qualitative variable name that will be used to define the
bars, and click on the arrow pointing toward the Category Axis slot;
then select the variable name that will be used to define stacks, and click on
the arrow pointing toward the Define Stacks by slot.
Step
6: Select the N of cases
option in the Bars Represent section, and click the OK button,
after which a stacked bar chart in SPSS output will be generated; notice that
raw frequency is scaled on the vertical axis for the stacks within each bar,
and you may also notice that the bars are not all the same height.
Step 7: In order to have relative
frequency (percentages) scaled on the vertical axis with the bars all scaled to
the same height, which is generally preferable, double click on the graph to
enter the SPSS Chart Editor.
Step
8: Select the Options >
Scale to 100% options, after which percentages should be scaled on the
vertical axis; by double clicking on the title for the vertical axis, it can be
changed to Percent.
Step
9: Exit from the chart editor,
after which you should see that the SPSS output has been updated; the stacked
bar chart displayed is one of two possible stacked bar charts.
Step
10: To create the other stacked
bar chart, repeat all previous steps with the variable names switched in Step
5.
Creating
a Histogram
Step
1: In the SPSS data file,
identify the quantitative variable for which a histogram is to be created.
Step
2: Select the Graphs >
Legacy Dialogs > Histogram options to display the Histogram
dialog box.
Step
3: From the list of variables on
the left, select the desired quantitative variable, and click on the arrow
pointing toward the Variable slot.
Step
4: Click on the OK button,
after which the SPSS output will be generated.
Step
5: If it is desirable to make
changes to the histogram, double click on the graph to enter the SPSS Chart
Editor.
Step
6: After making desired changes,
exit from the Chart Editor, after which you should see that the SPSS output has
been updated.
Creating
One Box Plot or Multiple Box Plots for Commensurate Variables
Step
1: In the SPSS data file,
identify either the quantitative variable for which a box plot is to be
created, or the multiple commensurate quantitative variables for which box
plots are to be created.
Step
2: Select the Graphs >
Legacy Dialogs > Boxplot options to display the Boxplot
dialog box.
Step
3: Make certain the Simple
option is selected, and then select the Summaries of separate variables
option.
Step
4: Click on the Define
button to display the Define Simple Boxplot: Summaries of Separate Variables
dialog box.
Step
5: From the list of variables on
the left, select the desired quantitative variable(s), and click on the arrow
pointing toward the Boxes Represent box.
Step
6: Click on the OK button,
after which the SPSS output will be generated; note that the numerical scale is
on the vertical axis.
Step 7: In order to get the numerical
scale displayed on the horizontal axis, which is more common, double
click on the graph to enter the SPSS Chart Editor.
Step
8: Select the Options >
Transpose Chart options, after which the numerical scale should be on the
horizontal axis.
Step
9: Exit from the Chart Editor,
after which you should see that the SPSS output has been updated.
Creating Box Plots for Two or More Groups
Step
1: In the SPSS data file,
identify the quantitative variable for which box plots are to be created, and
identify the qualitative (i.e., categorical) variable which defines the groups.
Step
2: Select the Graphs > Legacy Dialogs > Boxplot
options to display the Boxplot dialog box.
Step
3: Make certain the Simple
option is selected, and that the Summaries for groups of cases option is
selected.
Step
4: Click on the Define
button to display the Define Simple Boxplot: Summaries for Groups of Cases
dialog box.
Step
5: From the list of variables on
the left, select the desired quantitative variable, and click on the arrow
pointing toward the Variable slot.
Step
6: From the list of variables on
the left, select the qualitative (i.e., categorical) variable which defines the
groups, and click on the arrow pointing toward the Category Axis slot.
Step 7: Click on the OK button,
after which the SPSS output will be generated; note that the box plots are
displayed with the numerical scale on the vertical axis.
Step 8: In order to get the numerical
scale displayed on the horizontal axis, which is more common, double
click on the graph to enter the SPSS Chart Editor.
Step
9: Select the Options >
Transpose Chart options, after which the numerical scale should be on the
horizontal axis.
Step
10: Exit from the Chart Editor,
after which you should see that the SPSS output has been updated.
Creating
a Scatter Plot
Step
1: In the SPSS data file,
identify the two quantitative variables for which a scatter plot is to be
created; if appropriate, one of the two variables can be designated as the
dependent (or response) variable and the other as the independent (or
predictor) variable.
Step
2: Select the Graphs >
Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot
dialog box.
Step
3: Make certain that the Simple
Scatter option is selected; then, click on the Define button to
display the Simple Scatterplot dialog box.
Step 4: From the list of variables on
the left, select the variable designated as the dependent variable or, if no
such designation was made, select either one of the quantitative variables, and
click on the arrow pointing toward the Y‑Axis slot; then select
the other quantitative variable (which would be treated as the independent
variable, if such a designation were made), and click
on the arrow pointing toward the X‑Axis slot.
Step
5: Click on the OK button,
after which the SPSS output will be generated.
Step
6: If it is desirable to include
a graph of the least squares line on the scatter plot, double click on the
graph to enter the SPSS Chart Editor.
Step
7: Select the Elements> Fit Line at Total options
from the main menu to display the least squares line and open the Properties
dialog box.
Step
8: You should notice that a label
displaying the equation of the least squares line appears in the middle of the
scatter plot; to delete this label, uncheck the Attach label to line option near the bottom of the Properties
dialog box, and click on the Apply button.
Step
9: Close the Properties
dialog box, and exit from the Chart Editor, after which you should see that the
SPSS output has been updated.
Creating
a Frequency Table
Step
1: In the SPSS data file,
identify the variable, or variables, for which a frequency table is to be
created; the variable(s) can be either qualitative (in which case the table
will display a list of codes used to represent categories) or quantitative (in
which case the table will display a list of all values of the variable in the
data set).
Step
2: Select the Analyze >
Descriptive Statistics > Frequencies options to display the Frequencies
dialog box.
Step
3: From the list of variables on
the left, select the variable, or variables, for which numerical summaries are
to be obtained, and click on the arrow pointing toward the Variable(s)
box.
Step
4: Make certain the Display
frequency tables option is checked, and click on the OK button,
after which the SPSS output will be generated.
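Outside SPSS, a frequency table of this kind amounts to counting value occurrences; a pandas sketch with made-up data:

```python
import pandas as pd

# Made-up qualitative variable.
color = pd.Series(["red", "blue", "red", "green", "red", "blue"])

# Frequencies and relative frequencies (percent), roughly matching
# the Frequency and Percent columns of an SPSS frequency table.
freq = color.value_counts()
percent = color.value_counts(normalize=True) * 100

print(freq.to_dict())
print(percent.round(1).to_dict())
```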
Obtaining
Numerical Summaries
First Method
Step
1: In the SPSS data file,
identify one or more quantitative variables for which numerical summaries are
to be obtained, and identify, if there are any, one or more qualitative (i.e.,
categorical) variables for defining groups.
Step
2: Select the Analyze >
Descriptive Statistics > Explore options to display the Explore
dialog box.
Step
3: From the list of variables on
the left, select the quantitative variable, or variables, for which numerical
summaries are to be obtained, and click on the arrow pointing toward the Dependent
List box; then select, if any, the qualitative (i.e., categorical)
variables for defining groups, and click on the arrow pointing toward the Factor
List box.
Step
4: In the Display section
near the bottom of the dialog box, select the Statistics option.
Step 5: Click on the Statistics
button to display the Explore: Statistics dialog box, and notice that
the Descriptives option is selected; if there
is interest in including the five‑number summary among the numerical
summaries, also check the Percentiles option.
Step
6: Click on the Continue
button to return to the Explore dialog box, and click on the OK
button, after which the SPSS output will be generated.
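The five-number summary requested via the Percentiles option can be sketched outside SPSS with pandas (the data below are made up; note that different percentile definitions can give slightly different quartiles than SPSS's weighted-average method):

```python
import pandas as pd

scores = pd.Series([60, 70, 75, 82, 88, 91, 95])  # made-up data

# Five-number summary: minimum, Q1, median, Q3, maximum
# (pandas uses linear interpolation between order statistics).
summary = scores.quantile([0.0, 0.25, 0.5, 0.75, 1.0])

print(summary.tolist())
```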
Second Method
Step
1: In the SPSS data file,
identify one or more quantitative variables for which numerical summaries are
to be obtained, and identify, if there are any, one or more qualitative (i.e.,
categorical) variables for defining groups.
Step
2: Select the Analyze >
Compare Means > Means options to display the Means dialog box.
Step
3: From the list of variables on
the left, select the quantitative variable, or variables, for which numerical
summaries are to be obtained, and click on the arrow pointing toward the Dependent
List box; then select, if any, the qualitative (i.e., categorical)
variables for defining groups, and click on the arrow pointing toward the Independent
List box.
Step
4: Click on the Options
button to display the Means: Options dialog box. In this dialog box, you should notice that
the options Mean, Number of Cases, and Standard Deviation
are each listed in the Cell Statistics box on the right, and that
several other options are listed in the Statistics box on the left.
Step 5: After selecting any additional
options desired from the list in the Statistics box, click on the
arrow button pointing toward the Cell Statistics box.
Step
6: Click on the Continue
button to return to the Means dialog box, and then click on the OK
button, after which the SPSS output will be generated.
Third Method
Step
1: In the SPSS data file,
identify one or more quantitative variables for which numerical summaries are
to be obtained.
Step
2: Select the Analyze >
Descriptive Statistics > Descriptives options
to display the Descriptives dialog box.
Step
3: From the list of variables on
the left, select the quantitative variable, or variables, for which numerical
summaries are to be obtained, and click on the arrow pointing toward the Variable(s)
box.
Step
4: Click on the Options
button to display the Descriptives: Options
dialog box. In this dialog box, you
should notice that the options Mean, Std. deviation, Minimum,
and Maximum are checked, and that other options are available to be
checked.
Step
5: After checking any additional
options desired, click on the Continue button to return to the Descriptives dialog box.
Step
6: Click on the OK button,
after which the SPSS output will be generated.
III. Statistical
Analysis Involving One Variable
(1) Performing a One-Sample t Test about a Mean μ
Step 1: Identify the (quantitative)
variable in the SPSS data file on which the test is to be performed, decide on
the hypothesized value for the mean μ, and select a (two‑sided) significance level α. (More than
one (quantitative) variable may be selected, in which case the test is
performed on each simultaneously, but only one hypothesized value for the mean is
permitted.)
Step
2: Select the Analyze >
Compare Means > One Sample T Test options to display the One
Sample T Test dialog box.
Step
3: From the list of variables on
the left, select the desired (quantitative) variable (and more than one
selection is permitted), and click on the arrow pointing toward the Test Variable(s) section.
Step 4: Type the hypothesized value
for the mean μ in the Test Value slot.
Step 5: Click on the Options button to set the desired (two‑sided)
confidence level, which will generally be 100% minus the significance level;
e.g., if the significance level is 0.05 (5%), then the confidence level will be
0.95 (95%).
Step 6: Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated. The p-value
displayed on the SPSS output is for a two‑sided test; it must be
divided by 2 when doing a one‑sided test. Also, the confidence interval
limits displayed on the SPSS output are actually the confidence
interval limits for the difference between the population mean and the
hypothesized value of the mean; adding the hypothesized value for the mean to
each of these limits gives the confidence interval limits for the population
mean.
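The two adjustments just described, halving the two-sided p-value for a one-sided test and shifting the displayed interval by the test value, can be sketched with scipy (the data and test value below are made up):

```python
import numpy as np
from scipy import stats

sample = np.array([9.8, 10.4, 10.1, 9.6, 10.9, 10.3])  # made-up data
mu0 = 10.0  # hypothesized mean (the Test Value)

t_stat, p_two_sided = stats.ttest_1samp(sample, popmean=mu0)
# One-sided p-value (appropriate when the sample mean falls on the
# side specified by the alternative hypothesis):
p_one_sided = p_two_sided / 2

# 95% CI for (mean - test value), as SPSS displays it, then shifted
# by the test value to obtain a CI for the mean itself.
n = len(sample)
se = sample.std(ddof=1) / np.sqrt(n)
margin = stats.t.ppf(0.975, df=n - 1) * se
diff = sample.mean() - mu0
ci_diff = (diff - margin, diff + margin)
ci_mean = (ci_diff[0] + mu0, ci_diff[1] + mu0)

print(round(p_two_sided, 3), [round(x, 2) for x in ci_mean])
```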
One possible appropriate graphical display
for the quantitative variable used in a one‑sample t test is a box plot, which can be obtained by following the steps
labeled “Creating One Box Plot or Multiple Box Plots for Commensurate
Variables” in Section II(3) of this document.
(2) Performing a Chi-square
Goodness-of-Fit Test with Hypothesized Proportions
Step
1: If the individual cases making
up the raw data have already been entered into an SPSS data file, identify the
(qualitative) variable on which the test is to be performed, and skip to Step
7; if the data is to be entered into SPSS with raw frequencies (i.e., counts),
then follow the instructions beginning in Step 2.
Step
2: Go to the Variable View
sheet (by clicking on the appropriate tab at the bottom of the screen), and in
the first row, enter a name for the (qualitative) variable on which the test is
to be performed.
Step
3: Define codes for this
(qualitative) variable so that 1 (one) represents one category, 2
(two) represents a second category, 3
(three) represents a third category, etc., making certain that all categories
of the variable have been included.
Step 4: In the second row, enter the
variable name RawFrequency, and since all the
raw frequencies must be integers, set the entry in the Decimals
column for this variable to 0 (zero).
Step
5: Go to the Data View
sheet (by clicking on the appropriate tab at the bottom of the screen), and in
the column for the (qualitative) variable on which the test is to be performed,
enter the codes 1, 2, 3, etc. respectively into the first cell, the second
cell, the third cell, etc., making certain that all codes used have been
entered. (If the category labels are not
displayed, then select View > Value Labels from the main menu.)
Step
6: In the column for the variable
RawFrequency, enter the corresponding raw
frequency in the data for each category; after all the data is entered, it may
be a good idea to save this SPSS file using an appropriate file name.
Step
7: If each line of the data file
represents one case (i.e., the data were not entered in the format
described in Steps 2 to 6), then select the Analyze > Nonparametric
Tests > Legacy Dialogs > Chi‑Square options to display the
Chi‑Square Test dialog box, and proceed to the next step; if each
line of the data file represents one category of the (qualitative) variable on
which the test is to be performed (i.e., the data were entered in the
format described in Steps 2 to 6), then do the following: First, select the Data >
Weight Cases options to display the Weight Cases dialog box; then,
select the Weight cases by option, select from the list on the left the
variable RawFrequency (which is the frequency
of occurrence in the data for each category), and click on the arrow button
pointing toward the Frequency Variable slot; next, click on the OK
button; finally, select the Analyze > Nonparametric Tests >
Legacy Dialogs > Chi‑Square options to display the Chi‑Square
Test dialog box, and proceed to the next step.
Step
8: In the Chi-Square Test
dialog box, from the list of the variables on the left, select the
(qualitative) variable on which the test is to be performed, and click on the
arrow button pointing toward the Test Variable List section. For each category, decide on the hypothesized
value for the proportion.
Step
9: Which option should be
selected in the Expected Values section depends on the hypothesized
proportions. If the hypothesized
proportions are all equal, then select the All
categories equal option; if the hypothesized proportions are not all equal,
then select the Values option, and enter the hypothesized proportions by
first typing the hypothesized proportion for the category coded with the
smallest value (which would be 1 if the data were entered in the format
described in Steps 2 to 6) in the Values slot and clicking on the Add
button. Then, type the hypothesized
proportion for the category coded with the next smallest value (which would be
2 if the data were entered in the format described in Steps 2 to 6) in
the Values slot and click on the Add button. Next, type the hypothesized proportion for
the category coded with the third smallest value (which would be 3 if the data were
entered in the format described in Steps 2 to 6) in the Values slot and
click on the Add button. Continue
this process until the hypothesized proportion for each code used has been
entered. (The order in which these
hypothesized proportions are entered must correspond to the numerical order of
the codes for the different categories; also, it does not matter whether
percentages or proportions are entered, so that, for instance, 45, 35, and 20
could be entered instead of 0.45, 0.35, and 0.20.)
Step
10: Click on the OK button, after which the SPSS output
will be generated.
One possible appropriate graphical display
for the qualitative variable used in a chi‑square goodness‑of‑fit
test is a bar chart, which can be obtained by following the steps labeled “Creating
a Bar Chart” in Section II(3) of this document;
another possible graphical display is a pie chart, which can be obtained by
following the steps labeled “Creating a Pie Chart” in Section II(3) of
this document.
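For readers who want to verify the SPSS result outside the program, the same goodness-of-fit test can be sketched in a few lines of Python. The fragment below is an illustration with hypothetical frequencies and proportions, assuming scipy is available; SPSS's All categories equal option corresponds to omitting the expected frequencies entirely.

```python
from scipy import stats

# Hypothetical raw frequencies for three categories (coded 1, 2, 3)
observed = [50, 30, 20]

# Hypothesized proportions for the three categories (must sum to 1);
# omitting f_exp below is the analogue of "All categories equal"
proportions = [0.45, 0.35, 0.20]
expected = [p * sum(observed) for p in proportions]

# Chi-square goodness-of-fit test
chi2, p = stats.chisquare(f_obs=observed, f_exp=expected)
print(chi2, p)
```

As in Step 9, the proportions must be listed in the numerical order of the category codes.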
IV. Statistical
Analysis Involving Two Variables
(1) Generating a Correlation Matrix with
p-values
Step
1: Identify the (quantitative or
qualitative‑ordinal) variables in the SPSS data file for which
correlations between pairs of variables are to be calculated.
Step
2: Select the Analyze >
Correlate > Bivariate options to display the Bivariate
Correlations dialog box.
Step
3: From the list of variables on
the left, select the desired (quantitative or qualitative‑ordinal) variables, and click on the arrow
pointing toward the Variables
section.
Step
4: In the Correlation Coefficients section, select all the desired options
(such as the Pearson option to
generate the Pearson product moment correlation(s) for variables which are
assumed to be at least approximately normally distributed, and the Spearman option to generate the
Spearman rank correlation(s) when at least one variable is either qualitative‑ordinal
or assumed to have a distribution substantially different from a normal
distribution).
Step
5: In the Tests of Significance section, select either the Two‑tailed or One‑tailed option, depending on
what type of hypothesis test is desired.
Step
6: Click on the OK button, after which the SPSS output
will be generated.
One possible appropriate graphical display
for the variables whose correlation is of interest is a scatter plot, which can be
obtained by following the steps labeled “Creating a Scatter Plot” in
Section II(3) of this document.
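As a cross-check on the SPSS output, the two correlation coefficients from Step 4 can be computed in Python. This is a sketch with hypothetical data, assuming scipy is installed; the p-values returned are two-tailed.

```python
from scipy import stats

# Hypothetical quantitative variables
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

# Pearson product-moment correlation (for approximately normal variables)
r, p_pearson = stats.pearsonr(x, y)

# Spearman rank correlation (for ordinal or clearly non-normal variables)
rho, p_spearman = stats.spearmanr(x, y)
print(r, rho)
```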
(2) Performing a Paired Sample t Test about a Mean Difference μd (i.e., a
difference between means from dependent samples) or a Wilcoxon Signed Rank Test
Step
1: Identify in the SPSS data file
the pair of (quantitative) variables for which the mean difference is being
tested, and select a (two‑sided) significance level α. (More than
one pair of (quantitative) variables may be selected on which to test the mean
difference simultaneously.)
Step
2: Select the Analyze >
Compare Means > Paired‑Samples T Test options to display
the Paired‑Samples T Test dialog box.
Step
3: From the list of variables on
the left, select one of the desired (quantitative) variables, and click on the
arrow pointing toward the Paired
Variables section; then select the other desired (quantitative) variable,
and click on the arrow pointing toward the Paired
Variables section. (Selection of
more than one pair is permitted.)
Step
4: Click on the Options button to set the desired (two‑sided)
confidence level (which will generally be 100% minus the significance level, i.e.,
if the significance level is 0.05 (5%), then the confidence level will be 0.95
(95%)).
Step
5: Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated. The p-value
displayed on the SPSS output is for a two‑sided test; this must be
divided by 2 when doing a one‑sided test.
A nonparametric test which can be considered
an alternative to the paired‑sample t
test (when appropriate assumptions might not be satisfied) is the Wilcoxon
signed rank test, which compares the median of the distribution of differences
between two quantitative or qualitative‑ordinal variables to zero (0); it
can be performed as follows:
Step
1: Identify in the SPSS data file
the pair of (quantitative or qualitative‑ordinal) variables for which the
distribution of differences is being tested, and select a (two‑sided)
significance level α. (More than one
pair of variables may be selected on which to test simultaneously.)
Step
2: Select the Analyze >
Nonparametric Tests > Legacy Dialogs > 2 Related Samples
options to display the Two-Related-Samples Tests dialog box.
Step
3: From the list of variables on
the left, select one of the desired (quantitative or qualitative‑ordinal)
variables, and click on the arrow pointing toward the Test Pairs section; then select the other desired
(quantitative or qualitative‑ordinal) variable, and click on the arrow
pointing toward the Test Pairs
section. (Selection of more than one
pair is permitted.)
Step
4: Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated. The p-value
displayed on the SPSS output is for a two‑sided test; this must be
divided by 2 when doing a one‑sided test.
Two appropriate graphical
displays for the data used in a paired‑sample t test or a Wilcoxon signed rank test are possible:
(1)
one box plot of the differences between the two variables; the differences can
be obtained by following the steps labeled “Creating New Variables with
Transformation of Existing Variables” in Section I(2) of this document, and
noting that the formula to be entered in Step 3 should be one of the two
variables minus the other (and the minus sign button from the keypad displayed
in the middle of the dialog box can be used); the box plot of the new variable
of differences can then be obtained by following the steps labeled “Creating
One Box Plot or Multiple Box Plots for Commensurate Variables” in Section
II(3) of this document.
(2)
two box plots, one for each variable, which can be obtained by following the steps
labeled “Creating One Box Plot or Multiple Box Plots for Commensurate
Variables” in Section II(3) of this document.
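For comparison, both the paired t test and the Wilcoxon signed rank test can be run in a few lines of Python. The example below uses hypothetical before/after measurements and assumes scipy; as with the SPSS output, the reported p-values are two-sided.

```python
from scipy import stats

# Hypothetical paired measurements on the same six subjects
before = [10.1, 9.8, 12.4, 11.0, 10.5, 9.9]
after = [10.9, 10.2, 12.9, 11.4, 10.4, 10.8]

# Paired-sample t test on the differences
t, p_t = stats.ttest_rel(before, after)

# Wilcoxon signed rank test as the nonparametric alternative
w, p_w = stats.wilcoxon(before, after)
print(t, p_t, p_w)
```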
(3) Performing a Two Sample t Test about a Difference Between Means μ1 and μ2 or a
Mann-Whitney Rank Sum Test
Step
1: Identify in the SPSS data file
both the (qualitative‑dichotomous) variable which defines the two groups
being compared and the (quantitative) variable for which the means are being
compared, and select a (two‑sided) significance level α. (More than
one (quantitative) variable may be selected on which to compare means
simultaneously, but only one grouping variable may be selected.)
Step
2: Select the Analyze >
Compare Means > Independent‑Samples T Test options to
display the Independent‑Samples T Test dialog box.
Step
3: From the list of variables on
the left, select the desired (quantitative) variable (and more than one
selection is permitted), and click on the arrow pointing toward the Test Variable(s) section.
Step
4: From the list of variables on
the left, select the desired (qualitative‑dichotomous) variable, and
click on the arrow pointing toward the Grouping Variable slot.
Step
5: Click on the Define Groups button to display the Define Groups dialog box.
Step
6: In the Group 1 slot type the numerical code which represents one of the
categories for the (qualitative‑dichotomous) variable which defines the
two groups being compared, and in the Group
2 slot type the numerical code which represents the other category; then
click on the Continue button to
return to the Independent‑Samples T Test dialog box.
Step
7: Click on the Options button to set the desired (two‑sided)
confidence level (which will generally be 100% minus the significance level,
i.e., if the significance level is 0.05 (5%), then the confidence level will be
0.95 (95%)).
Step
8: Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated. The p-value
displayed on the SPSS output is for a two‑sided test; this must be
divided by 2 when doing a one‑sided test.
A nonparametric test which can be considered
an alternative to the two‑sample t
test (when appropriate assumptions might not be satisfied) is the Mann‑Whitney
rank sum test to compare the distributions of a quantitative or qualitative‑ordinal
variable for two groups, which can be performed as follows:
Step
1: Identify in the SPSS data file
both the (qualitative‑dichotomous) variable which defines the two groups
being compared and the (quantitative or qualitative‑ordinal) variable for
which the distributions are being compared.
(More than one (quantitative or qualitative‑ordinal) variable may
be selected on which to compare distributions simultaneously, but only one
grouping variable may be selected.)
Step
2: Select the Analyze >
Nonparametric Tests > Legacy Dialogs > 2 Independent Samples
options to display the Two-Independent-Samples Tests dialog box.
Step
3: From the list of variables on
the left, select the desired (quantitative or qualitative‑ordinal)
variable (and more than one selection is permitted), and click on the arrow
pointing toward the Test Variable List
section.
Step
4: From the list of variables on
the left, select the desired (qualitative‑dichotomous) variable, and
click on the arrow pointing toward the Grouping Variable slot.
Step
5: Click on the Define Groups button to display the Define Groups dialog box.
Step
6: In the Group 1 slot type the numerical code which represents one of the
categories for the (qualitative‑dichotomous) variable which defines the
two groups being compared, and in the Group
2 slot type the numerical code which represents the other category; then
click on the Continue button to
return to the Two‑Independent‑Samples Test dialog box.
Step
7: Make certain that the Mann‑Whitney U option is selected
in the Test Type section.
Step
8: Click on the OK button, after which the SPSS output
will be generated. The p-value
displayed on the SPSS output is for a two‑sided test; this must be
divided by 2 when doing a one‑sided test.
One possible appropriate graphical display
for the data used in a two‑sample t
test or a Mann‑Whitney rank sum test is two box plots, one for each
group, which can be obtained by following the steps labeled “Creating Box
Plots for Two or More Groups” in Section II(3) of this document.
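The same pair of tests can be sketched in Python with hypothetical data for two groups, assuming scipy. Setting equal_var=False would give the Welch version that SPSS reports in its "Equal variances not assumed" row.

```python
from scipy import stats

# Hypothetical quantitative measurements for two independent groups
group1 = [12.0, 14.1, 13.3, 15.2, 13.8]
group2 = [10.9, 11.5, 12.2, 11.8, 12.7]

# Two-sample t test (equal variances assumed)
t, p_t = stats.ttest_ind(group1, group2, equal_var=True)

# Mann-Whitney rank sum test (two-sided)
u, p_u = stats.mannwhitneyu(group1, group2, alternative="two-sided")
print(t, p_t, p_u)
```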
(4) Performing a One-Way ANOVA (Analysis
of Variance) to Test for at Least One Difference Among Multiple Means μ1, μ2, …, μk or a
Kruskal-Wallis Rank Sum Test
Step
1: Identify in the SPSS data file
both the (qualitative) variable which defines the groups being compared and the
(quantitative) variable for which the means are being compared, and select a
significance level α. (More than one (quantitative)
variable may be selected on which to compare means simultaneously, but only one
grouping variable may be selected.)
Step
2: Select the Analyze >
Compare Means > One‑Way ANOVA options to display the One‑Way
ANOVA dialog box.
Step
3: From the list of variables on
the left, select the desired (quantitative) variable (and more than one
selection is permitted), and click on the arrow pointing toward the Dependent List section.
Step
4: From the list of variables on
the left, select the desired (qualitative) variable, and click on the arrow
pointing toward the Factor slot.
Step
5: Click on the Options
button to display the One‑Way ANOVA: Options dialog box; in order
to have descriptive statistics displayed, select the Descriptive option;
in order to have results for Levene’s test
(concerning equal variances) displayed, select the Homogeneity of variance
test option; in order to have results for alternative F tests (which are adjusted for unequal variances) displayed,
select the Brown‑Forsythe option and/or the Welch option;
click on the Continue button to
return to the One‑Way ANOVA dialog box.
Step
6: Click on the Post Hoc button
to display the One‑Way ANOVA: Post Hoc Multiple Comparisons dialog
box; in order to have results for one or more multiple comparison methods
displayed, make the desired selections in the Equal Variances Assumed
and/or Equal Variances Not Assumed sections, and enter the significance
level in the Significance level slot; click on the Continue button to return to the One‑Way ANOVA dialog
box.
Step
7: Click on the OK button, after which the SPSS output
will be generated.
A nonparametric test which can be considered
an alternative to the one‑way ANOVA F
test (when appropriate assumptions might not be satisfied) is the Kruskal‑Wallis
rank sum test to compare the distributions of a quantitative or qualitative‑ordinal
variable for k groups, which can be
performed as follows:
Step
1: Identify in the SPSS data file
both the (qualitative) variable which defines the k groups
being compared and the (quantitative or qualitative‑ordinal) variable for
which the distributions are being compared.
(More than one (quantitative or qualitative‑ordinal) variable may
be selected on which to compare distributions simultaneously, but only one
grouping variable may be selected.)
Step
2: Select the Analyze >
Nonparametric Tests > Legacy Dialogs > K Independent Samples
options to display the Tests for Several Independent Samples dialog box.
Step
3: From the list of variables on
the left, select the desired (quantitative or qualitative‑ordinal)
variable (and more than one selection is permitted), and click on the arrow
pointing toward the Test Variable List
section.
Step
4: From the list of variables on
the left, select the desired (qualitative) variable, and
click on the arrow pointing toward the Grouping Variable slot.
Step
5: Click on the Define Range
button to display the Several Independent Samples: Define Range dialog
box.
Step
6: In the Minimum slot type the smallest numerical code used to represent the
categories for the (qualitative) variable which defines the
groups being compared, and in the Maximum
slot type the largest numerical code used; then
click on the Continue button to
return to the Tests for Several Independent Samples dialog box.
Step
7: Make certain that the Kruskal‑Wallis H option is
selected in the Test Type section.
Step
8: Click on the OK button, after which the SPSS output
will be generated.
One possible appropriate graphical display
for the data used in a one‑way ANOVA or a Kruskal‑Wallis rank sum
test is multiple box plots, one for each group, which can be obtained by
following the steps labeled “Creating Box Plots for Two or More Groups”
in Section II(3) of this document.
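A Python cross-check of the F test, Levene's test, and the Kruskal-Wallis test is shown below (a sketch with hypothetical data for three groups, assuming scipy).

```python
from scipy import stats

# Hypothetical quantitative measurements for three groups
g1 = [5.1, 4.8, 5.6, 5.0]
g2 = [6.2, 5.9, 6.5, 6.1]
g3 = [4.4, 4.9, 4.6, 4.2]

# One-way ANOVA F test for at least one difference among the means
f, p_f = stats.f_oneway(g1, g2, g3)

# Levene's test concerning equal variances
lev, p_lev = stats.levene(g1, g2, g3)

# Kruskal-Wallis rank sum test as the nonparametric alternative
h, p_kw = stats.kruskal(g1, g2, g3)
print(f, p_f, p_kw)
```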
(5) Performing a Chi-square Test
Concerning Independence
Step
1: If the individual cases making
up the raw data have already been entered into an SPSS data file, identify the
two (qualitative) variables on which the test is to be performed, and skip to
Step 7; otherwise, enter the data into SPSS by following the instructions
beginning in Step 2.
Step
2: Go to the Variable View
sheet (by clicking on the appropriate tab at the bottom of the screen), in the
first row enter a name for one of the two (qualitative) variables on which the
test is to be performed, and in the second row enter a name for the other (qualitative)
variable.
Step
3: For each of the two
(qualitative) variables, define codes so that 1 (one) represents one
category, 2 (two) represents a second category, 3 (three) represents a third category, etc., making certain that
all categories of the variable have been included.
Step
4: In the third row, enter the
variable name RawFrequency, and since all the
counts must be integers, set the entry in the third cell of the Decimals
column to 0 (zero).
Step
5: Go to the Data View
sheet (by clicking on the appropriate tab at the bottom of the screen), and in
the first two columns for the two (qualitative) variables on which the test is
to be performed, enter the codes 1 and 1 respectively into the first and second
cells of the first row, enter the codes 1 and 2 respectively into the first and
second cells of the second row, enter the codes 1 and 3 respectively into the
first and second cells of the third row, etc., making certain that all codes
used for the (qualitative) variable in the second column have been
entered. Now, repeat this in the next
rows with the code 2 entered in the first column, and then repeat this again
with the code 3 entered in the first column, etc. making certain that all codes
used for the (qualitative) variable in the first column have been entered. Each possible combination of categories for
the two (qualitative) variables should now be displayed exactly once in the
first two columns. (If the category labels
are not displayed, then select View > Value Labels from the main
menu.)
Step
6: In the column for the variable
RawFrequency, enter the corresponding raw
frequency in the data for each combination of categories; after all the data is
entered, it may be a good idea to save this SPSS file using an appropriate file
name.
Step
7: If each line of the data file
represents one case (i.e., the data were not entered in the format
described in Steps 2 to 6), then select the Analyze > Descriptive
Statistics > Crosstabs options to display the Crosstabs
dialog box, and proceed to the next step; if each line of the data file
represents a combination of categories of the two (qualitative) variables on
which the test is to be performed (i.e., the data were entered in the
format described in Steps 2 to 6), then do the following: First, select the Data >
Weight Cases options to display the Weight Cases dialog box; then,
select the Weight cases by option, select from the list on the left the
variable name for the variable RawFrequency
(which is the frequency of occurrence in the data for each combination of
categories), and click on the arrow button pointing toward the Frequency
Variable slot; next, click on the OK button; finally, select the Analyze >
Descriptive Statistics > Crosstabs options to display the Crosstabs
dialog box, and proceed to the next step.
Step
8: From the list of the variables
on the left, select one of the two (qualitative) variables on which the test is
to be performed, and click on the arrow button pointing toward the Row(s)
section; then select the other (qualitative) variable, and click on the arrow
button pointing toward the Column(s) section.
Step
9: Click on the Statistics
button to display the Crosstabs: Statistics dialog box, select the Chi-square
option in the upper left corner of the dialog box, and click on the Continue
button to return to the Crosstabs dialog box.
Step
10: Click on the Cells
button to display the Crosstabs: Cell Display dialog box; in order to
have the data (observed frequencies) displayed, select the Observed
option in the Counts section; in order to have the expected frequencies
displayed, select the Expected option in the Counts section; in
order to have the percentages for column variable categories displayed for each
row variable category, select the Row option in the Percentages section;
in order to have the percentages for row variable categories displayed for each
column variable category, select the Column option in the Percentages
section; in order to have the percentages for each cell out of the total
displayed, select the Total option in the Percentages section; in
order to have the standardized residuals displayed for each cell, select the Standardized
option in the Residuals section; click on the Continue button to
return to the Crosstabs dialog box.
(NOTE: The column heading for the display of the p‑value for the Pearson Chi‑Square statistic has “2‑sided”
in parentheses on the SPSS output, which can be misleading since the Pearson
Chi‑Square test is generally a one‑sided test.)
Step
11: Click on the OK button, after which the SPSS output
will be generated.
One possible appropriate graphical display
for the data used in a chi‑square test concerning independence is a
stacked bar chart, which can be obtained by following the steps labeled “Creating
a Stacked Bar Chart” in Section II(3) of this document.
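The weighted-cases setup in Steps 2 to 6 amounts to entering a contingency table of raw frequencies; in Python the whole test is one call. The sketch below uses a hypothetical 2×3 table and assumes scipy.

```python
import numpy as np
from scipy import stats

# Hypothetical contingency table: rows are the categories of one
# qualitative variable, columns the categories of the other
table = np.array([[30, 10, 20],
                  [20, 25, 15]])

# Pearson chi-square test of independence; also returns the degrees
# of freedom and the table of expected frequencies
chi2, p, df, expected = stats.chi2_contingency(table, correction=False)
print(chi2, p, df)
```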
(6) Performing a Simple Linear
Regression with Checks of Linearity, Homoscedasticity, and Normality
Assumptions
Step
1: Identify in the SPSS data file
the (quantitative) dependent (response) variable and the (quantitative) independent
(explanatory or predictor) variable.
Step
2: Select the Graphs >
Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot
dialog box.
Step
3: Make certain that the Simple
Scatter option is selected; then, click on the Define button to
display the Simple Scatterplot dialog box.
Step
4: From the list of variables on
the left, select the (quantitative) dependent (response) variable, and click on
the arrow pointing toward the Y-Axis slot; then select the
(quantitative) independent (explanatory or predictor) variable, and click on
the arrow pointing toward the X-Axis slot.
Step
5: Click on the OK button,
after which SPSS output displaying a scatter plot will be generated. In order to have the least squares line
appear on the scatter plot, first double click on the graph to enter the SPSS Chart Editor, and then select the Elements > Fit Line at Total options
from the main menu (and close the dialog box which appears), after which
selecting the File > Close
options will close the chart editor. By
examining how the points on the scatter plot are distributed around the
least squares line, a decision can be made as to whether the linearity
assumption about the relationship between the dependent and independent
variables is satisfied.
Step
6: Select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
Step
7: From the list of variables on
the left, select the (quantitative) dependent (response) variable, and click on
the arrow pointing toward the Dependent
slot; then select the (quantitative) independent (explanatory or predictor)
variable, and click on the arrow pointing toward the Independent(s)
section (where selection of more than one variable is permitted).
Step
8: Click on the Plots button to display the Linear
Regression: Plots dialog box. From
the list of variables on the left, select ZRESID,
and click on the arrow pointing toward the Y slot; then select ZPRED, and click on the arrow pointing
toward the X slot. This will
generate a scatter plot with standardized predicted values on the horizontal
axis and standardized residuals on the vertical axis, so that a decision can be
made as to whether the homoscedasticity assumption is satisfied.
Step
9: In the Standardized Residual Plots section of the Linear Regression:
Plots dialog box, select the Histogram option and the Normal
probability plot option. This
will generate a histogram and a normal probability plot for standardized
residuals, so that a decision can be made as to whether the normality
assumption is satisfied.
Step
10: Click on the Continue button, and then click on the Statistics button to display the Linear
Regression: Statistics dialog box.
In the Regression Coefficients
section, make certain that the Estimates
option is selected, and select the Descriptives
option to generate means, standard deviations, and the Pearson correlation;
also, select the Confidence intervals option to set the desired (two‑sided)
confidence level (which will generally be 100% minus the significance level,
i.e., if the significance level is 0.05 (5%), then the confidence level will be
0.95 (95%)). This will generate
confidence intervals for the slope and for the intercept in the regression.
Step
11: Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated.
Step
12: The scatter plot with
standardized predicted values on the horizontal axis and standardized residuals
on the vertical axis, requested in Step 8, will be displayed on the SPSS output
without a horizontal line at zero; since this line can be helpful in examining
this plot, the instructions in Step 5 to have the least squares line appear on
the scatter plot can be used to add this horizontal line at zero.
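The least squares fit itself (slope, intercept, and the two-sided p-value for the slope) can be reproduced in Python with scipy's linregress; the sketch below uses hypothetical data, and the residuals computed at the end are the raw ingredients of the diagnostic plots requested in Steps 8 and 9.

```python
from scipy import stats

# Hypothetical independent (x) and dependent (y) variables
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

# Simple linear regression by least squares
res = stats.linregress(x, y)

# Residuals, used for the homoscedasticity and normality checks
residuals = [yi - (res.intercept + res.slope * xi)
             for xi, yi in zip(x, y)]
print(res.slope, res.intercept, res.pvalue)
```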
V.
Statistical Analysis Involving Multiple (Two or More) Variables
(1) Performing a Quadratic Regression
with Checks of Model, Homoscedasticity, and Normality Assumptions
Step
1: Identify in the SPSS data file
the (quantitative) dependent (response) variable and the (quantitative) independent
(explanatory or predictor) variable.
Step
2: Create a new variable in the
SPSS data file which is the square of the independent (explanatory or predictor)
variable. (This can be done using the
instructions in subsection (2) of section I.)
Step
3: Select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
Step
4: From the list of variables on
the left, select the (quantitative) dependent (response) variable, and click on
the arrow pointing toward the Dependent
slot; then select the independent (explanatory or predictor) variable and the
variable created in Step 2 which is its square, and click on the arrow pointing
toward the Independent(s) section.
Step
5: Click on the Plots button to display the Linear
Regression: Plots dialog box. From
the list of variables on the left, select ZRESID,
and click on the arrow pointing toward the Y slot; then select ZPRED, and click on the arrow pointing
toward the X slot. This will
generate a scatter plot with standardized predicted values on the horizontal
axis and standardized residuals on the vertical axis, so that a decision can be
made as to whether the homoscedasticity assumption is satisfied.
Step
6: In the Standardized Residual Plots section of the Linear Regression:
Plots dialog box, select the Histogram option and the Normal
probability plot option. This
will generate a histogram and a normal probability plot for standardized
residuals, so that a decision can be made as to whether the normality
assumption is satisfied.
Step
7: Click on the Continue button, and then click on the Statistics button to display the Linear
Regression: Statistics dialog box.
In the Regression Coefficients
section, make certain that the Estimates
option is selected, and select the Descriptives
option to generate means, standard deviations, and the Pearson correlation;
also, select the Confidence intervals option to set the desired (two‑sided)
confidence level (which will generally be 100% minus the significance level,
i.e., if the significance level is 0.05 (5%), then the confidence level will be
0.95 (95%)). This will generate
confidence intervals for each of the coefficients in the regression.
Step
8: Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated.
Step
9: The scatter plot with
standardized predicted values on the horizontal axis and standardized residuals
on the vertical axis, requested in Step 5, will be displayed on the SPSS output
without a horizontal line at zero; since this line can be helpful in examining
this plot, add this horizontal line at zero by doing the following: double click on
the graph to enter the SPSS Chart Editor,
and then select the Elements > Fit
Line at Total options from the main menu (and close the dialog box which
appears), after which selecting the File >
Close options will close the chart editor.
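Steps 2 to 4 amount to an ordinary least squares fit on the predictor and its square. A Python sketch with hypothetical data follows, using numpy; the column of ones supplies the intercept.

```python
import numpy as np

# Hypothetical data roughly following y = x**2
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([1.2, 4.1, 8.8, 16.3, 24.9, 36.2])

# Design matrix: intercept column, x, and the squared term from Step 2
X = np.column_stack([np.ones_like(x), x, x ** 2])

# Least squares estimates of the three regression coefficients
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, b1, b2 = coef
print(intercept, b1, b2)
```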
(2) Performing a Multiple Linear
Regression with Checks for Multicollinearity and of Linearity,
Homoscedasticity, and Normality Assumptions
Step
1: Identify in the SPSS data file
the (quantitative) dependent (response) variable, all quantitative independent
(explanatory or predictor) variables, and all qualitative independent
(explanatory or predictor) variables.
Step
2: Select the Graphs >
Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot
dialog box.
Step
3: Make certain that the Simple
Scatter option is selected; then, click on the Define button to
display the Simple Scatterplot dialog box.
Step
4: From the list of variables on
the left, select the (quantitative) dependent (response) variable, and click on
the arrow pointing toward the Y-Axis slot; then select one of the
quantitative independent (explanatory or predictor) variables, and click on the
arrow pointing toward the X-Axis slot.
Step
5: Click on the OK button,
after which SPSS output displaying a scatter plot will be generated. In order to have the least squares line
appear on the scatter plot, first double click on the graph to enter the SPSS Chart Editor, and then select the Elements > Fit Line at Total options
from the main menu (and close the dialog box which appears), after which
selecting the File > Close
options will close the chart editor. By
examining how the points on the scatter plot are distributed around the
least squares line, a decision can be made as to whether the linearity
assumption about the relationship between the dependent and independent
variables is satisfied.
Step
6: Repeat Steps 2 to 5 for each
of the quantitative independent (explanatory or predictor) variables.
Step
7: For each of the qualitative
independent (explanatory or predictor) variables, add the appropriate dummy
variable(s) to the data file. For a
qualitative variable with k
categories, this can be done by defining k dummy variables, where the first dummy
variable is equal to 1 (one) for category #1 and 0 (zero) otherwise, the second
dummy variable is equal to 1 (one) for category #2 and 0 (zero) otherwise,
etc.; since k - 1 dummy variables are sufficient to represent a
qualitative variable with k
categories, the kth dummy variable is
not really necessary, and may or may not be used.
Step
8: Select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
Step
9: From the list of variables on
the left, select the (quantitative) dependent (response) variable, and click on
the arrow pointing toward the Dependent
slot; then select each independent (explanatory or predictor) variable, and
click on the arrow pointing toward the Independent(s) section (where
selection of more than one variable is permitted).
Step
10: Click on the Plots button to display the Linear
Regression: Plots dialog box. From
the list of variables on the left, select ZRESID,
and click on the arrow pointing toward the Y slot; then select ZPRED, and click on the arrow pointing
toward the X slot. This will
generate a scatter plot with standardized predicted values on the horizontal
axis and standardized residuals on the vertical axis, so that a decision can be
made as to whether the homoscedasticity assumption is satisfied.
Step
11: In the Standardized Residual Plots section of the Linear Regression:
Plots dialog box, select the Histogram option and the Normal
probability plot option. This will
generate a histogram and a normal probability plot for standardized residuals,
so that a decision can be made as to whether the normality assumption is
satisfied.
Step
12: Click on the Continue button, and then click on the Statistics button to display the Linear
Regression: Statistics dialog box.
In the Regression Coefficients
section, make certain that the Estimates
option is selected, select the Descriptives
option to generate means, standard deviations, and the Pearson correlations,
and select the Collinearity diagnostics option to generate information
about whether multicollinearity could be a problem; also, select the Confidence
intervals option, and set the desired (two-sided) confidence level,
which will generally be 100% minus the significance level (e.g., a
significance level of 0.05 (5%) corresponds to a confidence level of 0.95
(95%)). This will generate confidence
intervals for each slope (regression coefficient) and for the intercept in the regression.
Step
13: Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated.
Step
14: The scatter plot with
standardized predicted values on the horizontal axis and standardized residuals
on the vertical axis, requested in Step #10, will be displayed on the SPSS
output without a horizontal line at zero; since this line can be helpful in
examining this plot, the instructions in Step #5 to have the least squares line
appear on the scatter plot can be used to add this horizontal line at zero.
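Steps 8 through 13 above can also be run from a syntax window; clicking the Paste button instead of OK in Step 13 produces commands roughly like the following (the variable names y, x1, x2, and d1 are hypothetical):

```spss
REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /STATISTICS COEFF OUTS CI(95) R ANOVA COLLIN TOL
  /DEPENDENT y
  /METHOD=ENTER x1 x2 d1
  /SCATTERPLOT=(*ZRESID ,*ZPRED)
  /RESIDUALS HISTOGRAM(ZRESID) NORMPROB(ZRESID).
```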
(3) Performing a Stepwise Linear Regression
to Build a Model
Step
1: Identify in the SPSS data file
the (quantitative) dependent (response) variable, all potential quantitative
independent (explanatory or predictor) variables, and all potential qualitative
independent (explanatory or predictor) variables.
Step
2: For each of the qualitative
independent (explanatory or predictor) variables, add the appropriate dummy
variable(s) to the data file. For a
qualitative variable with k
categories, this can be done by defining dummy variables, where the first dummy
variable is equal to 1 (one) for category #1 and 0 (zero) otherwise, the second
dummy variable is equal to 1 (one) for category #2 and 0 (zero) otherwise,
etc.; since k - 1 dummy variables are sufficient to represent a
qualitative variable with k
categories, the kth dummy variable is
redundant; in fact, including all k dummy variables together with an intercept in a regression
would create perfect multicollinearity, so the kth dummy variable is usually omitted.
Step
3: Select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
Step
4: From the list of variables on
the left, select the (quantitative) dependent (response) variable, and click on
the arrow pointing toward the Dependent
slot; then select each potential independent (explanatory or predictor)
variable, and click on the arrow pointing toward the Independent(s)
section (where selection of more than one variable is permitted).
Step
5: In the Method slot,
select the desired method for variable selection (such as the Stepwise
option).
Step
6: Click on the Statistics button to display the Linear
Regression: Statistics dialog box.
In the Regression Coefficients
section, make certain that the Estimates
option is selected, and select the R squared change option; also, select
the Confidence intervals option, and set the desired (two-sided)
confidence level, which will generally be 100% minus the significance level
(e.g., a significance level of 0.05 (5%) corresponds to a confidence level of
0.95 (95%)). This will generate
confidence intervals for the coefficients and for the intercept in the
regression.
Step
7: Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated.
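Clicking the Paste button instead of OK in Step 7 yields syntax roughly like the following (variable names are hypothetical); the PIN and POUT criteria control the p-values for entering and removing variables in the stepwise search:

```spss
REGRESSION
  /STATISTICS COEFF OUTS CI(95) R ANOVA CHANGE
  /CRITERIA=PIN(.05) POUT(.10)
  /DEPENDENT y
  /METHOD=STEPWISE x1 x2 x3 d1.
```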
(4) Performing a Stepwise Binary
Logistic Regression to Build a Model
Step
1: Identify in the SPSS data file
the binary (dichotomous) dependent (response) variable, all
potential quantitative independent (explanatory or predictor) variables, and
all potential qualitative independent (explanatory or predictor) variables.
Step
2: For each of the qualitative
independent (explanatory or predictor) variables, add the appropriate dummy
variable(s) to the data file. For a
qualitative variable with k
categories, this can be done by defining dummy variables, where the first dummy
variable is equal to 1 (one) for category #1 and 0 (zero) otherwise, the second
dummy variable is equal to 1 (one) for category #2 and 0 (zero) otherwise,
etc.; since k - 1 dummy variables are sufficient to represent a
qualitative variable with k
categories, the kth dummy variable is
redundant and is usually omitted.
Step
3: Select the Analyze >
Regression > Binary Logistic options to display the Logistic Regression
dialog box.
Step
4: From the list of variables on
the left, select the binary dependent (response) variable, and click on
the arrow pointing toward the Dependent
slot; then select each potential independent (explanatory or predictor)
variable, and click on the arrow pointing toward the Covariates
section (where selection of more than one variable is permitted).
Step
5: In the Method slot,
select the desired method for variable selection (such as the Forward: LR
option, which is a forward stepwise method based on likelihood ratio tests).
Step
6: Click on the Options button to display the Logistic
Regression: Options dialog box, and select the CI for exp(B)
option with the desired (two-sided) confidence level, which will generally be
100% minus the significance level (e.g., a significance level of 0.05 (5%)
corresponds to a confidence level of 0.95 (95%)). This will generate a confidence interval for
the odds ratio exp(B) associated with each coefficient in the regression.
Step
7: Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated.
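The corresponding pasted syntax for a stepwise binary logistic regression looks roughly like this (variable names are hypothetical; FSTEP(LR) is the Forward: LR stepwise method):

```spss
LOGISTIC REGRESSION VARIABLES y
  /METHOD=FSTEP(LR) x1 x2 d1
  /PRINT=CI(95)
  /CRITERIA=PIN(0.05) POUT(0.10) ITERATE(20) CUT(0.5).
```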
(5) Performing a Two-Way ANOVA (Analysis
of Variance) with Checks of Equal Variance and Normality Assumptions
Step
1: Identify in the SPSS data file
the quantitative (response) variable for which means are to be compared and the
two qualitative (independent) variables which define the groups among which
the means are to be compared; then select a significance level a.
Step
2: In order to examine residuals
for non-normality, first add the appropriate dummy variables to the data file
for each of the two qualitative (independent) variables as follows (but if no
check for non-normality in residuals is desired, skip to Step 7.): If one of the qualitative variables is
defined by r categories, and the
other is defined by c categories, then
create r - 1 dummy variables to define the qualitative
variable defined by r categories, and
create c - 1 dummy variables to define the qualitative
variable defined by c
categories. Finally, create the (r - 1)(c - 1) dummy variables which result from the products of one of the r - 1 dummy variables and one of the c - 1 dummy variables.
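As a sketch of the product dummies just described, suppose the hypothetical factors have r = 3 and c = 3 categories, with dummy variables r1, r2 and c1, c2 already created; the (r - 1)(c - 1) = 4 product dummy variables can then be computed with syntax such as:

```spss
* Interaction dummies are products of row and column dummies.
COMPUTE r1c1 = r1 * c1.
COMPUTE r1c2 = r1 * c2.
COMPUTE r2c1 = r2 * c1.
COMPUTE r2c2 = r2 * c2.
EXECUTE.
```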
Step
3: Select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
Step
4: From the list of variables on
the left, select the quantitative (response) variable, and click on the arrow
pointing toward the Dependent slot;
then select each dummy variable created in Step 2, and click on the arrow
pointing toward the Independent(s) section (where selection of more than
one variable is permitted).
Step
5: Click on the Plots button to display the Linear
Regression: Plots dialog box. In the
Standardized Residual Plots section
of the Linear Regression: Plots dialog box, select the Histogram
option and the Normal probability plot option. This will generate a histogram and a
normal probability plot for standardized residuals, so that a decision can be
made as to whether the normality assumption is satisfied.
Step
6: Click on the Continue button, and then click on the OK button, after which the SPSS output
containing an ANOVA table and a normal probability plot for standardized
residuals will be generated.
Step
7: Select the Analyze>
General Linear Model> Univariate options to display the Univariate
dialog box.
Step
8: From the list of variables on
the left, select the quantitative (response) variable, and click on the arrow
pointing toward the Dependent Variable slot.
Step
9: From the list of variables on
the left, select the two qualitative (independent) variables, and click on the
arrow pointing toward the Fixed Factor(s) section.
Step
10: Click on the Post Hoc
button to display the Univariate: Post Hoc Multiple Comparisons for Observed
Means dialog box; from the list in the Factor(s) section on the
left, identify each variable name which represents a qualitative variable
having more than two categories, and select these for the Post Hoc Tests for
section on the right. From the Equal
Variances Assumed section, select a desired multiple comparison procedure
(such as Bonferroni); if it is deemed necessary later, an option from
the Equal Variances Not Assumed section can be used. These multiple comparison procedures are
generally needed when one or both main effects are statistically
significant. Click on the Continue
button to return to the Univariate dialog box.
Step
11: Click on the EM Means
button to display the Univariate: Estimated Marginal Means dialog
box. Select each item from the list in
the Factor(s) and Factor Interactions section on the left for the Display
Means for section on the right.
Click on the Continue button to return to the Univariate
dialog box.
Step
12: Click on the Options
button to display the Univariate: Options dialog box. In the Display section, select Homogeneity
tests to generate results for Levene’s test
concerning the equal-variance assumption, and, if sample sizes are not equal
for all cells, select Descriptive statistics to get row and column means
calculated from individual observations in addition to those calculated from
cell means; if desired, select Estimates of effect size and/or Observed power. Click on the Continue button to return
to the Univariate dialog box.
Step
13: Click on the Plots
button to display the Univariate: Profile Plots dialog box. To generate one of the two possible
interaction plots, select one of the two variables from the Factor(s)
section on the left for the Horizontal Axis slot on the right, and
select the other variable for the Separate Lines slot on the right;
then, click the Add button to add this plot to the Plots section;
to generate the other possible interaction plot, repeat this with the roles of
the variables reversed. Click on the Continue
button to return to the Univariate dialog box.
Step
14: Click on the OK button, after which the SPSS output
will be generated.
An appropriate graphical display for
significant interaction in a two‑way ANOVA is one of the plots created in
Step 13; however, if there is no significant interaction, then an appropriate
graphical display for significant main effects is multiple box plots, which can
be obtained by following the steps labeled “Creating Box Plots for Two or
More Groups” in Section II(3) of this document.
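The two-way ANOVA of Steps 7 through 14 corresponds roughly to pasted syntax like the following (y is a hypothetical response variable; a and b are hypothetical factors):

```spss
UNIANOVA y BY a b
  /METHOD=SSTYPE(3)
  /POSTHOC=a b(BONFERRONI)
  /EMMEANS=TABLES(a)
  /EMMEANS=TABLES(b)
  /EMMEANS=TABLES(a*b)
  /PLOT=PROFILE(a*b b*a)
  /PRINT=HOMOGENEITY DESCRIPTIVE
  /DESIGN=a b a*b.
```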
(6) Performing a One-Way ANCOVA
(Analysis of Covariance) with Checks of Equal Variance and Normality
Assumptions
Step
1: Identify in the SPSS data file
the quantitative (response) variable for which means are to be compared, the
qualitative independent variable which defines the groups among which the means
are to be compared, and each quantitative variable which is a covariate.
Step
2: In order to use one-way
ANCOVA, the assumption of no interaction between the qualitative independent
variable and each covariate must be satisfied; to check the validity of this
assumption, select the Analyze> General Linear Model> Univariate
options to display the Univariate dialog box.
Step
3: From the list of variables on
the left, select the quantitative (response) variable, and click on the arrow
pointing toward the Dependent Variable slot.
Step
4: From the list of variables on
the left, select the qualitative independent variable, and click on the arrow
pointing toward the Fixed Factor(s) section.
Step
5: From the list of variables on
the left, select each of the covariate quantitative independent variables, and
click on the arrow pointing toward the Covariate(s) section.
Step
6: Click on the Model
button to display the Univariate: Model dialog box, and select the Build
terms option in the Specify Model section.
Step
7: From the list of variables in
the Factors & Covariates section on the left, select the qualitative
independent variable, and click on the arrow in the Build Term(s)
section pointing toward the Model section.
Step
8: Repeat the previous step for
each quantitative variable which is a covariate.
Step
9: From the list of variables in
the Factors & Covariates section on the left, simultaneously select
the qualitative independent variable and one quantitative variable which is a
covariate (which can be accomplished by using the ctrl key), and click on the arrow in the Build Term(s)
section pointing toward the Model section.
Step
10: Repeat the previous step for
the qualitative independent variable and each quantitative variable which is a
covariate.
Step
11: Verify that Type III is displayed in the Sum of squares slot at the lower left
of the dialog box, and click on the Continue
button to return to the Univariate dialog box.
Step
12: Click on the Options
button to display the Univariate: Options dialog box, and select the Homogeneity
tests option in the Display section.
Step
13: Click on the Continue button, and click on the OK button, after which the SPSS output
will be generated.
Step
14: The validity of the
no-interaction assumption (in Step 2) can be checked by looking at the p-value corresponding to each F test concerning an interaction between
the qualitative independent variable and a covariate; if none of these F tests is statistically
significant, then the no-interaction assumption is considered to be
satisfied. The equal variance assumption
can be checked by looking at the p-value
corresponding to Levene’s F test.
Step
15: If the no-interaction
assumption and equal variance assumption are considered to be satisfied, then
select a significance level a for the one-way ANCOVA and proceed to the next step; if the
no-interaction assumption is not considered to be satisfied, then one-way
ANCOVA is NOT an appropriate statistical analysis and instead of proceeding to
the next step, perform a more complicated regression analysis which is an
appropriate statistical analysis.
Step
16: In order to examine residuals
for non-normality, first add the appropriate dummy variables to the data file
for the qualitative independent variable as follows (but if no check for
non-normality in residuals is desired, skip to Step 21.): If the qualitative variable is defined by k categories, then create k - 1 dummy variables to define this qualitative
variable.
Step
17: Select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
Step
18: From the list of variables on
the left, select the quantitative (response) variable, and click on the arrow
pointing toward the Dependent slot;
then select each dummy variable created in Step 16, and click on the arrow
pointing toward the Independent(s) section (where selection of more than
one variable is permitted); finally, select each covariate, and click on the
arrow pointing toward the Independent(s) section.
Step
19: Click on the Plots button to display the Linear
Regression: Plots dialog box. In the
Standardized Residual Plots section
of the Linear Regression: Plots dialog box, select the Histogram
option and the Normal probability plot option. This will generate a histogram and a
normal probability plot for standardized residuals, so that a decision can be
made as to whether the normality assumption is satisfied.
Step
20: Click on the Continue button, and then click on the OK button, after which the SPSS output
containing an ANOVA table and a normal probability plot for standardized
residuals will be generated; this normal probability plot can be examined for
significant departures from normality.
Step
21: In order to verify the
linearity assumption for the dependent variable and each covariate (i.e., each
quantitative independent (explanatory or predictor) variable), select the Graphs
> Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot
dialog box (but if no check for linearity is desired, skip to Step 26.).
Step
22: Make certain that the Simple
Scatter option is selected; then, click on the Define button to
display the Simple Scatterplot dialog box.
Step
23: From the list of variables on
the left, select the quantitative dependent (response) variable, and click on
the arrow pointing toward the Y-Axis slot; then select one of the
covariates, and click on the arrow pointing toward the X-Axis slot.
Step
24: Click on the OK
button, after which SPSS output displaying a scatter plot will be
generated. In order to have the least
squares line appear on the scatter plot, first double click on the graph to
enter the SPSS Chart Editor, and
then select the Elements> Fit Line at
Total options from the main menu (and close the dialog box which appears),
after which selecting the File >
Close options will close the chart editor.
By examining how the points on the scatter plot are distributed
around the least squares line, a decision can be made as to whether the
linearity assumption about the relationship between the dependent and
independent variables is satisfied.
Step
25: Repeat Steps #21 to #24 for
each covariate. Each scatter plot can be
examined for significant departures from linearity.
Step
26: In order to do the one-way
ANCOVA, select the Analyze> General Linear Model> Univariate
options to display the Univariate dialog box.
Step
27: From the list of variables on
the left, select the quantitative (response) variable, and click on the arrow
pointing toward the Dependent Variable slot.
Step
28: From the list of variables on
the left, select the qualitative independent variable, and click on the arrow
pointing toward the Fixed Factor(s) section.
Step
29: From the list of variables on
the left, select each of the covariates, and click on the arrow pointing toward
the Covariate(s) section.
Step
30: Click on the Model
button to display the Univariate: Model dialog box, select the Full
factorial option in the Specify Model section, verify that Type III is displayed in the Sum of squares slot at the lower left
of the dialog box, and click on the Continue
button to return to the Univariate dialog box.
Step
31: Click on the Options
button to display the Univariate: Options dialog box. In the Display section, deselect Homogeneity
tests (the results of which were already generated in an earlier step) and,
if desired, select Descriptive statistics and/or Estimates of effect
size. Click on the Continue
button to return to the Univariate dialog box.
Step
32: Click on the EM Means
button to display the Univariate: Estimated Marginal Means dialog
box. Select each item from the list in
the Factor(s) and Factor Interactions section on the left for the Display
Means for section on the right.
Immediately below the Display Means for section, select the Compare
main effects option, and immediately below this, in the Confidence interval adjustment drop-down box, select a desired
multiple comparison procedure (such as Bonferroni). Click on the Continue button to return
to the Univariate dialog box.
Step
33: Click on the OK button, after which the SPSS output
will be generated; if desired, separate least squares regression equations for
each category of the qualitative independent variable can be obtained from the
instructions in the remaining steps; otherwise, there is no need to proceed
beyond this step.
Step
34: Based on the statistically
significant differences among parallel regressions found, decide which of the k - 1 dummy variables created in Step 16 should be
included in the regression along with the covariates, and then select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
Step
35: From the list of variables on
the left, select the (quantitative) dependent (response) variable, and click on
the arrow pointing toward the Dependent
slot; then select each of the dummy variables chosen in the previous step and
each covariate, and click on the arrow pointing toward the Independent(s)
section.
Step
36: Click on the OK button, after which the SPSS output
will be generated.
(7) Performing a Two-Way ANCOVA
(Analysis of Covariance) with Checks of Equal Variance and Normality
Assumptions
Step
1: Identify in the SPSS data file
the quantitative (response) variable for which means are to be compared, the
two qualitative independent variables which define the groups among which the
means are to be compared, and each quantitative variable which is a covariate.
Step
2: In order to use two-way
ANCOVA, the assumption of no interaction between each qualitative independent
variable and each covariate must be satisfied; to check the validity of this
assumption and also check the equal variance assumption, select the Analyze>
General Linear Model> Univariate options to display the Univariate
dialog box.
Step
3: From the list of variables on the
left, select the quantitative (response) variable, and click on the arrow
pointing toward the Dependent Variable slot.
Step
4: From the list of variables on
the left, select each of the two qualitative independent variables, and click
on the arrow pointing toward the Fixed Factor(s) section.
Step
5: From the list of variables on
the left, select each of the covariate quantitative independent variables, and
click on the arrow pointing toward the Covariate(s) section.
Step
6: Click on the Model
button to display the Univariate: Model dialog box, and select the Build
terms option in the Specify Model section.
Step
7: From the list of variables in
the Factors & Covariates section on the left, select one of the two
qualitative independent variables, and click on the arrow in the Build
Term(s) section pointing toward the Model section; repeat this for
the other qualitative independent variable and for each quantitative variable
which is a covariate.
Step
8: From the list of variables in
the Factors & Covariates section on the left, simultaneously select
both of the qualitative independent variables and one quantitative variable
which is a covariate (which can be accomplished by using the ctrl key), and click on the arrow in the
Build Term(s) section pointing toward the Model section; repeat
this for each quantitative variable which is a covariate.
Step
9: Verify that Type III is displayed in the Sum of squares slot at the lower left
of the dialog box, and click on the Continue
button to return to the Univariate dialog box.
Step
10: Click on the Options
button to display the Univariate: Options dialog box, and select the Homogeneity
tests option in the Display section.
Step
11: Click on the Continue button, and click on the OK button, after which the SPSS output
will be generated.
Step
12: The validity of the
no-interaction assumption (in Step 2) can be checked by looking at the p-value corresponding to each F test concerning an interaction between
a qualitative independent variable and a covariate; if none of these F tests is statistically
significant, then the no-interaction assumption is considered to be
satisfied. The equal variance assumption
can be checked by looking at the p-value
corresponding to Levene’s F test.
Step
13: If the no-interaction
assumption and equal variance assumption are considered to be satisfied, then
select a significance level a for the two-way ANCOVA and proceed to the next step; if the
no-interaction assumption is not considered to be satisfied, then two-way ANCOVA
is NOT an appropriate statistical analysis, and instead of proceeding to the
next step, perform a more complicated regression analysis which is an
appropriate statistical analysis.
Step
14: In order to examine residuals
for non-normality, first add the appropriate dummy variables to the data file
for each of the two qualitative (independent) variables as follows (but if no
check for non-normality in residuals is desired, skip to Step 19.): If one of the qualitative variables is
defined by r categories, and the
other is defined by c categories,
then create r - 1 dummy variables to define the qualitative
variable defined by r categories, and
create c - 1 dummy variables to define the qualitative
variable defined by c
categories. Finally, create the (r - 1)(c - 1) dummy variables which result from the products of one of the r - 1 dummy variables and one of the c - 1 dummy variables.
Step
15: Select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
Step
16: From the list of variables on
the left, select the quantitative (response) variable, and click on the arrow
pointing toward the Dependent slot;
then select each dummy variable created in Step 14, and click on the arrow
pointing toward the Independent(s) section (where selection of more than
one variable is permitted); finally, select each covariate, and click on the
arrow pointing toward the Independent(s) section.
Step
17: Click on the Plots button to display the Linear
Regression: Plots dialog box. In the
Standardized Residual Plots section
of the Linear Regression: Plots dialog box, select the Histogram
option and the Normal probability plot option. This will generate a histogram and a
normal probability plot for standardized residuals, so that a decision can be made
as to whether the normality assumption is satisfied.
Step
18: Click on the Continue button, and then click on the OK button, after which the SPSS output
containing an ANOVA table and a normal probability plot for standardized
residuals will be generated; this normal probability plot can be examined for
significant departures from normality.
Step
19: In order to verify the
linearity assumption for the dependent variable and each covariate (i.e., each
quantitative independent (explanatory or predictor) variable), select the Graphs
> Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot
dialog box (but if no check for linearity is desired, skip to Step 24.).
Step
20: Make certain that the Simple
Scatter option is selected; then, click on the Define button to
display the Simple Scatterplot dialog box.
Step
21: From the list of variables on
the left, select the (quantitative) dependent (response) variable, and click on
the arrow pointing toward the Y-Axis slot; then select one of the
covariates, and click on the arrow pointing toward the X-Axis slot.
Step
22: Click on the OK
button, after which SPSS output displaying a scatter plot will be
generated. In order to have the least
squares line appear on the scatter plot, first double click on the graph to
enter the SPSS Chart Editor, and
then select the Elements> Fit Line at
Total options from the main menu (and close the dialog box which appears),
after which selecting the File >
Close options will close the chart editor.
By examining how the points on the scatter plot are distributed
around the least squares line, a decision can be made as to whether the
linearity assumption about the relationship between the dependent and
independent variables is satisfied.
Step
23: Repeat Steps #19 to #22 for each
covariate. Each scatter plot can be
examined for significant departures from linearity.
Step
24: In order to do the two-way
ANCOVA, select the Analyze> General Linear Model> Univariate
options to display the Univariate dialog box.
Step
25: From the list of variables on
the left, select the quantitative (response) variable, and click on the arrow
pointing toward the Dependent Variable slot.
Step
26: From the list of variables on
the left, select each of the two qualitative independent variables, and click
on the arrow pointing toward the Fixed Factor(s) section.
Step
27: From the list of variables on
the left, select each of the covariates, and click on the arrow pointing toward
the Covariate(s) section.
Step
28: Click on the Model
button to display the Univariate: Model dialog box, select the Full
factorial option in the Specify Model section, verify that Type III is displayed in the Sum of squares slot at the lower left
of the dialog box, and click on the Continue
button to return to the Univariate dialog box.
Step
29: Click on the Options
button to display the Univariate: Options dialog box. In the Display section, deselect Homogeneity
tests (the results of which were already generated in an earlier step) and,
if desired, select Descriptive statistics and/or Estimates of effect
size. Click on the Continue
button to return to the Univariate dialog box.
Step
30: Click on the EM Means
button to display the Univariate: Estimated Marginal Means dialog
box. Select each item from the list in
the Factor(s) and Factor Interactions section on the left for the Display
Means for section on the right.
Immediately below the Display Means for section, select the Compare
main effects option, and immediately below this, in the Confidence interval adjustment drop-down box, select a desired
multiple comparison procedure (such as Bonferroni). Click on the Continue button to return
to the Univariate dialog box.
Step
31: Click on the Plots
button to display the Univariate: Profile Plots dialog box. To generate one of the two possible
interaction plots, select one of the two variables from the Factor(s)
section on the left for the Horizontal Axis slot on the right, and
select the other variable for the Separate Lines slot on the right;
then, click the Add button to add this plot to the Plots section;
to generate the other possible interaction plot, repeat this with the roles of
the variables reversed. Click on the Continue
button to return to the Univariate dialog box.
Step
32: Click on the OK button, after which the SPSS output
will be generated; if desired, separate least squares regression equations for
each category of the qualitative independent variable can be obtained from the
instructions in the remaining steps; otherwise, there is no need to proceed
beyond this step.
Step
33: Based on the statistically
significant differences among parallel regressions found, create the dummy
variables which should be included in the regression (some of which may have
already been created in Step 14) along with the statistically significant
covariates; then select the Analyze > Regression >
Linear options to display the Linear Regression dialog box.
Step
34: From the list of variables on
the left, select the (quantitative) dependent (response) variable, and click on
the arrow pointing toward the Dependent
slot; then select each of the dummy variables chosen in the previous step and
each covariate, and click on the arrow pointing toward the Independent(s)
section.
Step
35: Click on the OK button, after which the SPSS output
will be generated.
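The final two-way ANCOVA model of Steps 24 through 32 corresponds roughly to syntax like the following (hypothetical factors a and b, hypothetical covariate x):

```spss
UNIANOVA y BY a b WITH x
  /METHOD=SSTYPE(3)
  /EMMEANS=TABLES(a) WITH(x=MEAN) COMPARE ADJ(BONFERRONI)
  /EMMEANS=TABLES(b) WITH(x=MEAN) COMPARE ADJ(BONFERRONI)
  /PLOT=PROFILE(a*b b*a)
  /DESIGN=a b a*b x.
```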
(8) Performing a Repeated Measures
Within-Subjects ANOVA with Checks of Sphericity and Normality Assumptions
Step
1: Identify in the SPSS data file
the quantitative (response or dependent) variables whose means are to be
compared. (The normality assumption can
be checked by using the steps in (2) of the Data Diagnostics section.)
Step
2: Select the Analyze>
General Linear Model> Repeated Measures options to display the Repeated
Measures Define Factor(s) dialog box.
Step
3: In the Within-Subject Factor Name slot, you can change the default name factor1 to a more appropriate name for the
group of quantitative (response or dependent) variables.
Step
4: Enter the number of
quantitative (response or dependent) variables in the Number of Levels slot, and click on the Add button. (This number of
dependent variables is typically greater than 2; if it is equal to 2, then performing
a repeated measures ANOVA is equivalent to performing a paired t test.)
Step
5: Click on the Define
button to display the Repeated Measures dialog box, and from the list of
variables on the left, select each of the quantitative (response or dependent)
variables, and click on the arrow pointing toward the Within-Subjects Variables
section to replace each “_?_” which you will see there.
Step
6: Click on the Options
button to display the Repeated Measures: Options dialog box. In the Display section, select Descriptive
statistics and/or Estimates of effect size. Click on the Continue button to return
to the Repeated Measures dialog box.
Step
7: Click on the EM Means
button to display the Repeated Measures: Estimated Marginal Means
dialog box.  Select each item from the
list in the Factor(s) and Factor Interactions section on the left for
the Display Means for section on the right. Immediately below the Display Means for
section select the Compare main effects option, and immediately below
this in the Confidence interval adjustment
drop-down box select a desired multiple comparison procedure (such as Bonferroni). Click on the Continue button to return
to the Repeated Measures dialog box.
Step
8: Click on the Plots
button to display the Repeated Measures: Profile Plots dialog
box. Select the factor name from the Factors
section on the left for the Horizontal Axis slot on the right; then,
click the Add button to add this plot to the Plots section. Click on the Continue button to return
to the Repeated Measures dialog box.
Step
9: Click on the OK button, after which the SPSS output
will be generated.
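The F test that this procedure produces can be illustrated from first principles. The Python sketch below (scores are hypothetical) partitions the total sum of squares into subject, condition, and error components, which is the partition underlying the uncorrected univariate F in the Tests of Within-Subjects Effects table (sphericity corrections adjust the degrees of freedom, not this F):

```python
# Sketch: one-way within-subjects (repeated measures) ANOVA F statistic.
# Rows are subjects, columns are the repeated conditions; the scores
# are hypothetical.

def rm_anova_f(scores):
    """scores[i][j] is subject i under condition j.
    Returns (F, df_conditions, df_error), uncorrected for sphericity."""
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    subj_means = [sum(row) / k for row in scores]
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in subj_means)
    ss_conditions = n * sum((m - grand) ** 2 for m in cond_means)
    ss_error = ss_total - ss_subjects - ss_conditions
    df_cond, df_err = k - 1, (n - 1) * (k - 1)
    return (ss_conditions / df_cond) / (ss_error / df_err), df_cond, df_err

# Hypothetical: 4 subjects each measured under 3 conditions.
scores = [[8, 7, 6], [9, 9, 7], [6, 5, 4], [7, 6, 6]]
f_stat, df1, df2 = rm_anova_f(scores)
print(round(f_stat, 2), df1, df2)  # prints 15.86 2 6
```

Removing the subject sum of squares from the error term is what distinguishes this design from a one-way between-subjects ANOVA on the same numbers.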
(9) Performing a Repeated Measures Mixed
Between-Within-Subjects ANOVA with Checks of Sphericity, Equal Variance and
Covariance, and Normality Assumptions
Step
1: Identify in the SPSS data file
the quantitative (response or dependent) variables whose means are to be
compared and the qualitative independent
variable which defines the groups among which the means are to be
compared. (The normality assumption can
be checked by using the steps in (2) of the Data Diagnostics section.)
Step
2: Select the Analyze >
General Linear Model > Repeated Measures options to display the Repeated
Measures Define Factor(s) dialog box.
Step
3: In the Within-Subject Factor Name slot, you can change the default name factor(1) to a more appropriate name for the
group of quantitative (response or dependent) variables.
Step
4: Enter the number of
quantitative (response or dependent) variables in the Number of Levels slot, and click on the Add button.
Step
5: Click on the Define button
to display the Repeated Measures dialog box, and from the list of
variables on the left, select each of the quantitative (response or dependent)
variables, and click on the arrow pointing toward the Within-Subjects Variables
section to replace each “_?_” which you will see there; then select the
qualitative independent variable from the list of variables on the left, and
click on the arrow pointing toward the Between-Subjects Variables section.
Step
6: Click on the Options
button to display the Repeated Measures: Options dialog box. In the Display section, select Descriptive
statistics and/or Estimates of effect size. Click on the Continue button to return
to the Repeated Measures dialog box.
Step
7: Click on the EM Means
button to display the Repeated Measures: Estimated Marginal Means
dialog box.  Select each item from the
list in the Factor(s) and Factor Interactions section on the left for
the Display Means for section on the right. Immediately below the Display Means for
section select the Compare main effects option, and immediately below
this in the Confidence interval adjustment
drop-down box select a desired multiple comparison procedure (such as Bonferroni). Click on the Continue button to return
to the Repeated Measures dialog box.
Step
8: If the qualitative independent
variable has only 2 categories, then go to Step 9; if the qualitative
independent variable has more than 2 categories, then click on the Post Hoc
button to display the Repeated Measures: Post Hoc Multiple Comparisons for
Observed Means dialog box; from the list in the Factor(s) section on
the left, select the name of the qualitative variable having more than two
categories, and use the arrow button to move this name into the Post Hoc
Tests for section on the right. From
the Equal Variances Assumed section, select a desired multiple
comparison procedure (such as Bonferroni); if it is deemed necessary
later, an option from the Equal Variances Not Assumed section can be
used. These multiple comparison
procedures are generally needed when one or both main effects are statistically
significant. Click on the Continue
button to return to the Repeated Measures dialog box.
Step
9: Click on the Plots
button to display the Repeated Measures: Profile Plots dialog
box. To generate one of the two possible
interaction plots, select the factor name from the Factors section on
the left for the Horizontal Axis slot on the right, and select the name
of the qualitative independent variable from the Factors section on the
left for the Separate Lines slot on the right; then, click the Add
button to add this plot to the Plots section; to generate the other
possible interaction plot, repeat this with the roles of the factor name and
the name of the qualitative independent variable reversed. Click on the Continue button to return
to the Repeated Measures dialog box.
Step
10: Click on the OK button, after which the SPSS output
will be generated.
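The Bonferroni choice in the Confidence interval adjustment drop-down amounts to simple arithmetic: with k repeated measures there are k(k - 1)/2 pairwise comparisons, and each is tested at the overall significance level divided by that count (equivalently, SPSS multiplies each raw p-value by the count). A Python sketch, with hypothetical scores and function names of my own:

```python
# Sketch: paired t statistics for all pairwise condition comparisons,
# plus the Bonferroni-adjusted per-comparison significance level.
# Scores are hypothetical; SPSS adjusts the p-values equivalently.
from itertools import combinations
from math import sqrt

def paired_t(x, y):
    """t statistic for a paired comparison of two condition columns."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean = sum(d) / n
    var = sum((v - mean) ** 2 for v in d) / (n - 1)
    return mean / sqrt(var / n)

def bonferroni_alpha(k, alpha=0.05):
    """Per-comparison level when all k(k-1)/2 pairs are tested."""
    return alpha / (k * (k - 1) / 2)

# Hypothetical: 5 subjects under 3 conditions (one list per condition).
cond = [[8, 9, 6, 7, 8], [7, 9, 5, 6, 7], [6, 7, 4, 6, 6]]
print(round(bonferroni_alpha(len(cond)), 4))  # 0.05 / 3 comparisons
for i, j in combinations(range(len(cond)), 2):
    print(i, j, round(paired_t(cond[i], cond[j]), 2))
```

Each t statistic would then be referred to a t distribution with n - 1 degrees of freedom at the adjusted level, which is what the SPSS Pairwise Comparisons table reports.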
(10) Performing a Logistic Regression with Checks for
Multicollinearity
1.
Identify in the
SPSS data file the (qualitative-dichotomous) dependent (response) variable, all
quantitative independent (explanatory or predictor) variables, and all
qualitative independent (explanatory or predictor) variables.
2.
For each of the qualitative
independent (explanatory or predictor) variables, add the appropriate dummy
variable(s) to the data file. For a
qualitative variable with k
categories, this can be done by defining dummy variables, where the first dummy
variable is equal to 1 (one) for category #1 and 0 (zero) otherwise, the second
dummy variable is equal to 1 (one) for category #2 and 0 (zero) otherwise,
etc.; since k - 1 dummy variables are sufficient to represent a
qualitative variable with k
categories, the kth dummy variable is
not really necessary, and may or may not be used.
3.
The multiple regression routine can be used to check for
multicollinearity (but is not appropriate for statistical analysis,
since the dependent variable is not quantitative); select the Analyze >
Regression > Linear options, select the (qualitative-dichotomous)
dependent (response) variable for the Dependent slot, and select all quantitative
independent variables and all dummy variables representing qualitative
independent variables for the Independent(s) section. Click on the Statistics button, and in
the dialog box which appears select the Collinearity diagnostics
option. Click the Continue button
to close the dialog box, and click the OK button to obtain the desired
SPSS output. The desired values for
tolerance and VIF are all available in the Coefficients table of the
output.
4.
Select the Graphs >
Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot
dialog box; select the Simple Scatter option, and click on the Define
button to display the Simple Scatterplot dialog box.
5.
From the list of
variables on the left, select the (quantitative) dependent (response) variable,
and click on the arrow pointing toward the Y-Axis slot; then select one
of the quantitative independent (explanatory or predictor) variables, and click
on the arrow pointing toward the X-Axis slot.
6.
Click on the OK
button, after which SPSS output displaying a scatter plot will be
generated. In order to have the least
squares line appear on the scatter plot, first double click on the graph to
enter the SPSS Chart Editor, and
then select the Elements > Fit Line at
Total options from the main menu (and close the dialog box which appears),
after which selecting the File >
Close options will close the chart editor.
By examining how the points on the scatter plot are distributed
around the least squares line, a decision can be made as to whether the
linearity assumption about the relationship between the dependent and
independent variables is satisfied.
7.
Repeat Steps #4
to #6 for each of the quantitative independent (explanatory or predictor)
variables.
8.
For each of the
qualitative independent (explanatory or predictor) variables, add the
appropriate dummy variable(s) to the data file.
For a qualitative variable with k
categories, this can be done by defining dummy variables, where the first dummy
variable is equal to 1 (one) for category #1 and 0 (zero) otherwise, the second
dummy variable is equal to 1 (one) for category #2 and 0 (zero) otherwise,
etc.; since k - 1 dummy variables are sufficient to represent a
qualitative variable with k
categories, the kth dummy variable is
not really necessary, and may or may not be used.
9.
Select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
10. From the list of variables on the left, select the
(quantitative) dependent (response) variable, and click on the arrow pointing
toward the Dependent slot; then select
each independent (explanatory or predictor) variable, and click on the arrow
pointing toward the Independent(s) section (where selection of more than
one variable is permitted).
11. Click on the Plots
button to display the Linear Regression: Plots dialog box. From the list of variables on the left,
select ZRESID, and click on the arrow
pointing toward the Y slot; then select ZPRED, and click on the arrow pointing toward the X
slot. This will generate a scatter
plot with standardized predicted values on the horizontal axis and standardized
residuals on the vertical axis, so that a decision can be made as to whether
the homoscedasticity assumption is satisfied.
12. In the Standardized
Residual Plots section of the Linear Regression: Plots dialog box,
select the Histogram option and the Normal probability plot
option. This will generate a
histogram and a normal probability plot for standardized residuals, so that a
decision can be made as to whether the normality assumption is satisfied.
13. Click on the Continue
button, and then click on the Statistics
button to display the Linear Regression: Statistics dialog box. In the Regression
Coefficients section, make certain that the Estimates option is selected, select the Descriptives
option to generate means, standard deviations, and the Pearson correlations,
and select the Collinearity diagnostics option to generate information
about whether multicollinearity could be a problem; also, select the Confidence
intervals option to set the desired (two-sided) confidence level
(which will generally be 100% minus the significance level, i.e., if the
significance level is 0.05 (5%), then the confidence level will be 0.95
(95%)).  This will generate confidence
intervals for each slope and for the intercept in the regression.
14. Click on the Continue
button, and then click on the Save
button to display the Linear Regression: Save dialog box. In the Residuals section, select the Standardized
option to save the standardized residuals as part of the data. This allows further analysis to be
performed using the standardized residuals.
15. Click on the Continue
button, and then click on the OK
button, after which the SPSS output will be generated.
16. The scatter plot with standardized predicted values on
the horizontal axis and standardized residuals on the vertical axis, requested
in Step #11, will be displayed on the SPSS output without a horizontal line at
zero; since this line can be helpful in examining this plot, the instructions
in Step #6 to have the least squares line appear on the scatter plot can be
used to add this horizontal line at zero.
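The tolerance and VIF values reported by the Collinearity diagnostics option (item 3 above) have a simple definition: for predictor j, tolerance is 1 - R²_j, where R²_j comes from regressing that predictor on all of the others, and VIF is its reciprocal. With only two predictors, R²_j reduces to their squared Pearson correlation, which the following Python sketch uses (the data and function names are hypothetical):

```python
# Sketch: tolerance and VIF for the two-predictor case, where the
# R-squared from regressing one predictor on the other is simply their
# squared Pearson correlation. Data are hypothetical.
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equal-length columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """Tolerance and VIF shared by a pair of predictors."""
    r2 = pearson_r(x1, x2) ** 2
    tolerance = 1 - r2
    return tolerance, 1 / tolerance

# Hypothetical predictors that overlap heavily:
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 3, 5, 6, 8, 9]
tol, vif = vif_two_predictors(x1, x2)
print(round(tol, 4), round(vif, 2))  # tolerance near 0, VIF above 100
```

A common rule of thumb treats VIF values above about 10 (tolerance below about 0.1) as a warning sign of serious multicollinearity, as this pair of predictors illustrates.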
SPSS provides methods for deciding which of
many predictors are the most important to include in a model; to use them,
do the following:
1.
Select the Analyze >
Regression > Linear options to display the Linear Regression
dialog box.
2.
From the list of
variables on the left, select the (quantitative) dependent (response) variable,
and click on the arrow pointing toward the Dependent
slot; then select each independent (explanatory or predictor) variable, and click
on the arrow pointing toward the Independent(s) section (where selection
of more than one variable is permitted).
3.
In the Method
slot, select the desired method for variable selection (such as the Stepwise
option).
4.
Click on the Statistics button to display the Linear
Regression: Statistics dialog box.
In the Regression Coefficients
section, make certain that the Estimates
option is selected, and select the R squared change option; also, select
the Confidence intervals option to set the desired (two-sided)
confidence level (which will generally be 100% minus the significance level,
i.e., if the significance level is 0.05 (5%), then the confidence level will be
0.95 (95%)).  This will generate confidence
intervals for each slope and for the intercept in the regression.
5.
Click on the Continue button, and then click on the OK button, after which the SPSS output
will be generated.
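The Stepwise method selected in item 3 can be sketched in miniature. SPSS enters and removes variables using probability-of-F criteria; the simplified Python version below (hypothetical data, names, and threshold) does forward selection only, adding whichever candidate raises R-squared the most and stopping when the gain falls below a cutoff:

```python
# Sketch: greedy forward selection by R-squared gain. A simplified
# stand-in for stepwise regression; SPSS uses F-test probabilities
# to enter and remove variables.

def solve(a, b):
    """Gaussian elimination with partial pivoting for a x = b."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(y, predictors):
    """R^2 of an OLS fit of y on an intercept plus the given columns."""
    n = len(y)
    cols = [[1.0] * n] + predictors
    k = len(cols)
    xtx = [[sum(cols[i][t] * cols[j][t] for t in range(n)) for j in range(k)]
           for i in range(k)]
    xty = [sum(cols[i][t] * y[t] for t in range(n)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[i] * cols[i][t] for i in range(k)) for t in range(n)]
    ybar = sum(y) / n
    ss_tot = sum((v - ybar) ** 2 for v in y)
    ss_res = sum((y[t] - fitted[t]) ** 2 for t in range(n))
    return 1 - ss_res / ss_tot

def forward_select(y, candidates, min_gain=0.01):
    """Add predictors greedily until R^2 stops improving by min_gain."""
    chosen, best_r2 = [], 0.0
    while True:
        trials = {name: r_squared(y, [candidates[c] for c in chosen] + [col])
                  for name, col in candidates.items() if name not in chosen}
        if not trials:
            break
        name = max(trials, key=trials.get)
        if trials[name] - best_r2 < min_gain:
            break
        chosen.append(name)
        best_r2 = trials[name]
    return chosen, best_r2

# Hypothetical data: y is nearly linear in x1; x2 and x3 add little.
y = [3.0, 5.1, 7.2, 8.9, 11.1, 13.0]
cands = {"x1": [1, 2, 3, 4, 5, 6],
         "x2": [5, 1, 4, 2, 6, 3],
         "x3": [0, 1, 0, 1, 0, 1]}
picked, r2 = forward_select(y, cands)
print(picked, round(r2, 3))  # prints ['x1'] 0.999
```

A full stepwise procedure would also re-examine the entered variables at each step and remove any whose contribution has become negligible, which is the behavior the SPSS Stepwise option provides.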
Generating a correlation matrix