Using SPSS for Windows (Version 25.0)

This document contains instructions on using SPSS to perform various statistical analyses.  The list of section and subsection titles is as follows:

I. Data Entry and Manipulation

          IMPORTANT NOTES

          (1) Defining Variables

          (2) Creating New Variables by Transformation of Existing Variables

          (3) Creating New Variables by Recoding of Existing Variables

II. Data Diagnostics, Graphical Displays, and Descriptive Statistics

          IMPORTANT NOTES

          (1) Checking Data Ranges, Summary Statistics, and Missing Values

          (2) Checking for Skewness and Non-Normality

          (3) Creating Graphical Displays and Obtaining Descriptive Statistics

III. Statistical Analysis Involving One Variable

          (1) Performing a One-Sample t Test about a Mean μ

          (2) Performing a Chi-square Goodness-of-Fit Test with Hypothesized Proportions

IV. Statistical Analysis Involving Two Variables

          (1) Generating a Correlation Matrix with p-values

          (2) Performing a Paired Sample t Test about a Mean Difference μd (i.e., a difference between means from dependent samples) or a Wilcoxon Signed Rank Test

          (3) Performing a Two-Sample t Test about a Difference Between Means μ1 and μ2 or a Mann-Whitney Rank Sum Test

          (4) Performing a One-Way ANOVA (Analysis of Variance) to Test for at Least One Difference Among Multiple Means μ1, μ2, …, μk or a Kruskal-Wallis Rank Sum Test

          (5) Performing a Chi-square Test Concerning Independence

          (6) Performing a Simple Linear Regression with Checks of Linearity, Homoscedasticity, and Normality Assumptions

V. Statistical Analysis Involving Multiple (Two or More) Variables

          (1) Performing a Quadratic Regression with Checks of Model, Homoscedasticity, and Normality Assumptions

          (2) Performing a Multiple Linear Regression with Checks for Multicollinearity and of Linearity, Homoscedasticity, and Normality Assumptions

          (3) Performing a Stepwise Linear Regression to Build a Model

          (4) Performing a Stepwise Binary Logistic Regression to Build a Model

          (5) Performing a Two-Way ANOVA (Analysis of Variance) with Checks of Equal Variance and Normality Assumptions

          (6) Performing a One-Way ANCOVA (Analysis of Covariance) with Checks of Equal Variance and Normality Assumptions

          (7) Performing a Two-Way ANCOVA (Analysis of Covariance) with Checks of Equal Variance and Normality Assumptions

          (8) Performing a Repeated Measures Within-Subjects ANOVA with Checks of Sphericity and Normality Assumptions

          (9) Performing a Repeated Measures Mixed Between-Within-Subjects ANOVA with Checks of Sphericity, Equal Variance and Covariance, and Normality Assumptions

 

I. Data Entry and Manipulation

IMPORTANT NOTES:

          Data can be entered in SPSS either before or after variables are defined; cells with missing values will display a period (.).  The default in SPSS is to use all cases.  In order to use only selected cases in the data file, first select the Data > Select Cases options to display the Select Cases dialog box; then click on the If button to display the Select Cases: If dialog box.  After the desired condition is entered, click on the Continue button, and then click on the OK button, after which the cases not satisfying the condition will be marked as excluded from data analysis; also, a variable named filter_$ will be added to the data.
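
The Select Cases: If condition amounts to filtering cases on a logical expression.  As a rough illustration outside SPSS, the following Python sketch (using pandas, with hypothetical data and a hypothetical condition age >= 18) mimics the role of the filter_$ variable:

```python
import pandas as pd

# Hypothetical data; the condition plays the role of the
# Select Cases: If expression (here, "age >= 18").
df = pd.DataFrame({"age": [15, 22, 17, 30], "score": [80, 85, 75, 90]})

# filter_flag mimics SPSS's filter_$ variable: 1 = selected, 0 = excluded.
df["filter_flag"] = (df["age"] >= 18).astype(int)
selected = df[df["filter_flag"] == 1]  # only the selected cases are analyzed
```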

          In many of the SPSS dialog boxes (generally from clicking an Options button), there will be two choices for handling missing data.  The Exclude cases pairwise choice will have SPSS perform each specific procedure using all cases with no missing data for the variables involved in the given procedure, which implies that with missing data the sample size may not be the same for each procedure performed.  The Exclude cases listwise choice will have SPSS perform each specific procedure using only those cases with no missing data for the variables involved in every procedure that is to be performed, which implies that with missing data the sample size will be the same for each procedure performed.  (Of course, when there is no missing data, it makes no difference which of these two choices is made.)
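
The practical difference between the two choices can be illustrated outside SPSS; this Python sketch (using pandas, with hypothetical data) shows why the sample size can vary under pairwise deletion but not under listwise deletion:

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values in different variables.
df = pd.DataFrame({
    "x": [1.0, 2.0, np.nan, 4.0, 5.0],
    "y": [2.0, np.nan, 3.0, 4.0, 6.0],
})

# Pairwise deletion: each statistic uses every case that is complete
# for the variables it involves, so sample sizes can differ.
n_x_pairwise = df["x"].count()  # 4 complete cases for x
n_y_pairwise = df["y"].count()  # 4 complete cases for y

# Listwise deletion: only cases complete on ALL variables are used,
# so every statistic is based on the same sample size.
n_listwise = len(df.dropna())   # 3 complete cases overall
```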

 

        (1) Defining Variables

Step 1: After entering SPSS, you should see at the bottom of the screen a tab for Data View and a tab for Variable View.  Go to the Variable View tab, and notice that there are several column headings.  Each row corresponds to a different variable, and the column headings indicate different information about each variable.  The Name column is where an abbreviated name for each variable can be entered.  The Label column is where a longer, more descriptive name for each variable can be entered, but this is optional.  There are several choices available for the Type column, but for basic analysis of data, it will suffice to use the Numeric option for all variables except those which are merely case identifiers never to be used in any statistical analysis, for which the String option can be used.

Step 2: For variables which are not to be treated as categorical, no more information is required, although options for columns such as Width and Decimals (which are concerned with how the data is displayed in the data file) can be set as desired.  For variables which are to be treated as categorical, the categories must be defined in the Values column.  To define these categories, click on the cell for the Values column, and click on the button which appears in the right-hand side of the cell, after which the Value Labels dialog box is displayed.

Step 3: In the Value slot, type the numerical code corresponding to one of the categories, and in the Label slot, type the corresponding name for the category.  Then click on the Add button, after which the information just entered is listed in the section at the bottom of the dialog box.  Repeat this step until the information for every category is listed in the section at the bottom of the dialog box.

Step 4: To leave the dialog box, click on the OK button.  To return to viewing the data, go to the Data View tab, and if you do not now see the category names displayed (once the data is entered), then select View > Value Labels from the main menu.

 

        (2) Creating New Variables by Transformation of Existing Variables

Step 1: Select the Transform > Compute Variable options to display the Compute Variable dialog box.

Step 2: In the Target Variable slot, type an appropriate name for the new variable which is to be a function of existing variables.

Step 3: In the Numeric Expression section, a formula is needed to indicate how the values for the new variable are to be calculated; this can be accomplished by constructing the appropriate formula in the Numeric Expression section through the selection of variable names from the list of variables on the left and clicking on the arrow pointing toward the Numeric Expression section, together with the selection of algebraic operation buttons from the keypad displayed in the middle of the dialog box.

Step 4: Click on the OK button, after which the new variable should be added to the data.
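
In spreadsheet or programming terms, Compute Variable derives a new column as a function of existing columns.  A minimal Python sketch (using pandas), with a hypothetical body-mass-index formula standing in for the Numeric Expression:

```python
import pandas as pd

# Hypothetical data; "bmi" plays the role of the Target Variable and the
# right-hand side plays the role of the Numeric Expression.
df = pd.DataFrame({"weight_kg": [70.0, 80.0], "height_m": [1.75, 1.80]})
df["bmi"] = df["weight_kg"] / df["height_m"] ** 2  # new variable added to the data
```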

 

        (3) Creating New Variables by Recoding Existing Variables

Step 1: Select the Transform > Recode into Different Variables options to display the Recode into Different Variables dialog box.

Step 2: From the list of variables on the left, select the existing variable to be recoded into a new variable, and click on the arrow pointing toward the Numeric Variable → Output Variable section.

Step 3: In the Output Variable section, type an appropriate name in the Name slot for the new variable to be created by recoding, and click on the Change button.  You should now see in the Numeric Variable → Output Variable section an indication of which existing variable is being recoded into the new variable.

Step 4: Click the Old and New Values button to display the Recode into Different Variables: Old and New Values dialog box.

Step 5: In the Old Value section, click an appropriate option and enter the appropriate information for a value or range of values for the existing variable being recoded; in the New Value section, click an appropriate option and enter the appropriate information for a corresponding value for the new variable to be created by recoding; click on the Add button, after which you should see in the Old → New section an indication of how this recoding will be done.  Repeat this process until all the recoding information has been entered.

Step 6: After all the recoding information has been entered, click on the Continue button to return to the Recode into Different Variables dialog box, and then click on the OK button, after which the new variable should appear in the SPSS data file.
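
Recoding maps old values (or ranges of values) of an existing variable onto new values of a new variable.  A Python sketch of the same idea (using pandas, with a hypothetical recoding of scores into coded grade categories):

```python
import pandas as pd

# Hypothetical recoding: ranges of an existing score variable are
# mapped onto numeric codes for a new grade variable, analogous to
# the Old and New Values dialog.
df = pd.DataFrame({"score": [45, 62, 78, 91]})

def recode(score):
    if score < 60:
        return 1   # e.g., "fail"
    elif score < 80:
        return 2   # e.g., "pass"
    else:
        return 3   # e.g., "distinction"

df["grade"] = df["score"].apply(recode)  # new variable created by recoding
```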

 

II. Data Diagnostics, Graphical Displays, and Descriptive Statistics

IMPORTANT NOTES:

          After all data has been entered in SPSS, it can be desirable to check for data entry errors that might have been made and to assess how much missing data there is; this can be accomplished by checking data ranges and missing values as described below.  Also, it can be desirable to evaluate the degree to which the data satisfy certain assumptions required for statistical analysis; some features in SPSS described below illustrate how to do this.  Finally, it can be desirable to create graphical displays and obtain descriptive statistics; using the appropriate procedures in SPSS is addressed below.

 

        (1) Checking Data Ranges, Summary Statistics, and Missing Values

Step 1: Identify the qualitative (i.e., categorical) variable(s) in the SPSS data file to be checked for data entry errors and missing data.

Step 2: Select the Analyze > Descriptive Statistics > Frequencies options to display the Frequencies dialog box.  From the list of variables on the left, select one (or more) of the qualitative (i.e., categorical) variables to be checked, and click on the arrow pointing toward the Variable(s) slot; any variable needing to be removed from the Variable(s) slot can be selected and removed by clicking on the arrow pointing toward the list on the left.  Repeat this process until the list in the Variable(s) slot consists exactly of all desired qualitative variables.

Step 3: The Display frequency tables option can be checked to generate a frequency table listing the individual values of each qualitative variable.  To see what descriptive statistics will be displayed, click on the Statistics button, after which the Frequencies: Statistics dialog box is displayed.  Since all the descriptive statistics options displayed are primarily for quantitative variables, all boxes should be unchecked (unless one or more of these is specifically of interest for some reason).  Click on the Continue button to return to the Frequencies dialog box.

Step 4: Click on the OK button, after which the SPSS output will be generated.

Step 5: Identify the quantitative variable(s) in the SPSS data file to be checked for data entry errors and missing data.

Step 6: Select the Analyze > Descriptive Statistics > Frequencies options to display the Frequencies dialog box.  From the list of variables on the left, select one (or more) of the quantitative variables to be checked, and click on the arrow pointing toward the Variable(s) slot; any variable needing to be removed from the Variable(s) slot can be selected and removed by clicking on the arrow pointing toward the list on the left.  Repeat this process until the list in the Variable(s) slot consists exactly of all desired quantitative variables.

Step 7: To obtain descriptive statistics, click on the Statistics button, after which the Frequencies: Statistics dialog box is displayed.  Select the options for the descriptive statistics that would be of interest, which often include the Quartiles option in the Percentile Values section, the Mean, Median, and Mode options in the Central Tendency section, the Std deviation, Minimum, Maximum, and Range options in the Dispersion section, and the Skewness and Kurtosis options in the Distribution section.  Click on the Continue button to return to the Frequencies dialog box.  The Display frequency tables option can be unchecked to avoid generating a frequency table listing every individual value of the quantitative variable(s), unless this is desired.

Step 8: Click on the OK button, after which the SPSS output will be generated.
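
The descriptive statistics requested in Step 7 are standard summaries; for reference, here is how several of them can be computed directly in Python (using pandas, with a hypothetical quantitative variable):

```python
import pandas as pd

# Hypothetical quantitative variable; these mirror statistics commonly
# requested in the Frequencies: Statistics dialog.
x = pd.Series([2.0, 4.0, 4.0, 5.0, 10.0])

summary = {
    "n": int(x.count()),           # number of non-missing cases
    "mean": x.mean(),
    "median": x.median(),
    "min": x.min(),
    "max": x.max(),
    "range": x.max() - x.min(),
}
```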

 

        (2) Checking for Skewness and Non-Normality

Step 1: In the SPSS data file, identify one or more quantitative variables to be checked for normality or skewness, and identify, if there are any, one or more qualitative (i.e., categorical) variables for defining groups.

Step 2: Select the Analyze > Descriptive Statistics > Explore options to display the Explore dialog box.

Step 3: From the list of variables on the left, select the quantitative variable, or variables, for which skewness and non‑normality are to be investigated, and click on the arrow pointing toward the Dependent List section; if there is a grouping variable in the list, select this variable and click on the arrow pointing toward the Factor List section.

Step 4: In the Display section near the bottom of the dialog box, select the Both option.

Step 5: Click on the Plots button to display the Explore: Plots dialog box, and select the Normality plots with tests option; notice that the Stem-and-leaf option is selected by default, and it can be unselected if it is not of interest, in order to minimize the amount of output.

Step 6: Click on the Continue button to return to the Explore dialog box, and click on the OK button, after which the SPSS output will be generated.
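
The Normality plots with tests option reports, among other things, a Shapiro-Wilk test along with skewness statistics.  The same checks can be sketched in Python with scipy (hypothetical simulated data: one roughly normal sample and one strongly right-skewed sample):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
normal_sample = rng.normal(loc=50, scale=5, size=200)  # roughly symmetric
skewed_sample = rng.exponential(scale=2, size=200)     # right-skewed

# Shapiro-Wilk test: a small p-value indicates evidence of non-normality.
_, p_normal = stats.shapiro(normal_sample)
_, p_skewed = stats.shapiro(skewed_sample)

# Sample skewness: near 0 for symmetric data, large for skewed data.
skew_normal = stats.skew(normal_sample)
skew_skewed = stats.skew(skewed_sample)
```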

 

        (3) Creating Graphical Displays and Obtaining Descriptive Statistics

Creating a Bar Chart

Step 1: In the SPSS data file, identify the qualitative (i.e., categorical) variable for which a bar chart is to be created.

Step 2: Select the Graphs > Legacy Dialogs > Bar options to display the Bar Charts dialog box.

Step 3: Make certain the Simple option is selected, and that the Summaries for groups of cases option is selected.

Step 4: Click on the Define button to display the Define Simple Bar: Summaries for Groups of Cases dialog box.

Step 5: From the list of variables on the left, select the desired qualitative (i.e., categorical) variable, and click on the arrow pointing toward the Category Axis slot.

Step 6: Click on the OK button, after which the SPSS output will be generated, and note that raw frequencies are displayed.

Step 7: If it is desirable to make changes to the bar chart, double click on the graph to enter the SPSS Chart Editor.

Step 8: After making desired changes, exit from the Chart Editor, after which you should see that the SPSS output has been updated.

Creating a Pie Chart

Step 1: In the SPSS data file, identify the qualitative (i.e., categorical) variable for which a pie chart is to be created.

Step 2: Select the Graphs > Legacy Dialogs > Pie options to display the Pie Charts dialog box.

Step 3: Make certain the Simple option is selected, and that the Summaries for groups of cases option is selected.

Step 4: Click on the Define button to display the Define Pie: Summaries for Groups of Cases dialog box.

Step 5: From the list of variables on the left, select the desired qualitative (i.e., categorical) variable, and click on the arrow pointing toward the Define Slices by slot.

Step 6: Click on the OK button, after which the SPSS output will be generated, and note that a legend has been created, but no raw or relative frequencies are displayed.

Step 7: Double click on the graph to enter the SPSS Chart Editor.

Step 8: Select the Elements > Show Data Labels options to display the Properties dialog box, and select the Data Value Labels tab.

Step 9: Move Percent (or any other desired choice) from the Not Displayed box to the Displayed box, and click on the Apply button.

Step 10: Close the Properties dialog box, and exit from the Chart Editor, after which you should see that the SPSS output has been updated.

Creating a Stacked Bar Chart

Step 1: In the SPSS data file, identify the two qualitative (i.e., categorical) variables for which a stacked bar chart is to be created.

Step 2: Select the Graphs > Legacy Dialogs > Bar options to display the Bar Charts dialog box.

Step 3: Select the Stacked option and the Summaries for groups of cases option.

Step 4: Click on the Define button to display the Define Stacked Bar: Summaries for Groups of Cases dialog box.

Step 5: From the list of variables on the left, select the qualitative variable name that will be used to define the bars, and click on the arrow pointing toward the Category Axis slot; then select the variable name that will be used to define stacks, and click on the arrow pointing toward the Define Stacks by slot.

Step 6: Select the N of cases option in the Bars Represent section, and click the OK button, after which a stacked bar chart in SPSS output will be generated; notice that raw frequency is scaled on the vertical axis for the stacks within each bar, and you may also notice that the bars are not all the same height.

Step 7: In order to have relative frequency (percentages) scaled on the vertical axis with the bars all scaled to the same height, which is generally preferable, double click on the graph to enter the SPSS Chart Editor.

Step 8: Select the Options > Scale to 100% options, after which percentages should be scaled on the vertical axis; by double clicking on the title for the vertical axis, it can be changed to Percent.

Step 9: Exit from the chart editor, after which you should see that the SPSS output has been updated; the stacked bar chart displayed is one of two possible stacked bar charts.

Step 10: To create the other stacked bar chart, repeat all previous steps with the variable names switched in Step 5.

Creating a Histogram

Step 1: In the SPSS data file, identify the quantitative variable for which a histogram is to be created.

Step 2: Select the Graphs > Legacy Dialogs > Histogram options to display the Histogram dialog box.

Step 3: From the list of variables on the left, select the desired quantitative variable, and click on the arrow pointing toward the Variable slot.

Step 4: Click on the OK button, after which the SPSS output will be generated.

Step 5: If it is desirable to make changes to the histogram, double click on the graph to enter the SPSS Chart Editor.

Step 6: After making desired changes, exit from the Chart Editor, after which you should see that the SPSS output has been updated.

Creating One Box Plot or Multiple Box Plots for Commensurate Variables

Step 1: In the SPSS data file, identify either the quantitative variable for which a box plot is to be created, or the multiple commensurate quantitative variables for which box plots are to be created.

Step 2: Select the Graphs > Legacy Dialogs > Boxplot options to display the Boxplot dialog box.

Step 3: Make certain the Simple option is selected, and then select the Summaries of separate variables option.

Step 4: Click on the Define button to display the Define Simple Boxplot: Summaries of Separate Variables dialog box.

Step 5: From the list of variables on the left, select the desired quantitative variable(s), and click on the arrow pointing toward the Boxes Represent box.

Step 6: Click on the OK button, after which the SPSS output will be generated; note that the numerical scale is on the vertical axis.

Step 7: In order to get the numerical scale displayed on the horizontal axis, which is more common, double click on the graph to enter the SPSS Chart Editor.

Step 8: Select the Options > Transpose Chart options, after which the numerical scale should be on the horizontal axis.

Step 9: Exit from the Chart Editor, after which you should see that the SPSS output has been updated.

Creating Box Plots for Two or More Groups

Step 1: In the SPSS data file, identify the quantitative variable for which box plots are to be created, and identify the qualitative (i.e., categorical) variable which defines the groups.

Step 2: Select the Graphs > Legacy Dialogs > Boxplot options to display the Boxplot dialog box.

Step 3: Make certain the Simple option is selected, and that the Summaries for groups of cases option is selected.

Step 4: Click on the Define button to display the Define Simple Boxplot: Summaries for Groups of Cases dialog box.

Step 5: From the list of variables on the left, select the desired quantitative variable, and click on the arrow pointing toward the Variable slot.

Step 6: From the list of variables on the left, select the qualitative (i.e., categorical) variable which defines the groups, and click on the arrow pointing toward the Category Axis slot.

Step 7: Click on the OK button, after which the SPSS output will be generated; note that the box plots are displayed with the numerical scale on the vertical axis.

Step 8: In order to get the numerical scale displayed on the horizontal axis, which is more common, double click on the graph to enter the SPSS Chart Editor.

Step 9: Select the Options > Transpose Chart options, after which the numerical scale should be on the horizontal axis.

Step 10: Exit from the Chart Editor, after which you should see that the SPSS output has been updated.

Creating a Scatter Plot

Step 1: In the SPSS data file, identify the two quantitative variables for which a scatter plot is to be created; if appropriate, one of the two variables can be designated as the dependent (or response) variable and the other as the independent (or predictor) variable.

Step 2: Select the Graphs > Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot dialog box.

Step 3: Make certain that the Simple Scatter option is selected; then, click on the Define button to display the Simple Scatterplot dialog box.

Step 4: From the list of variables on the left, select the variable designated as the dependent variable or, if no such designation was made, select either one of the quantitative variables, and click on the arrow pointing toward the Y‑Axis slot; then select the other quantitative variable (which would be treated as the independent variable, if such a designation were made), and click on the arrow pointing toward the X‑Axis slot.

Step 5: Click on the OK button, after which the SPSS output will be generated.

Step 6: If it is desirable to include a graph of the least squares line on the scatter plot, double click on the graph to enter the SPSS Chart Editor.

Step 7: Select the Elements > Fit Line at Total options from the main menu to display the least squares line and open the Properties dialog box.

Step 8: You should notice that a label displaying the equation of the least squares line appears in the middle of the scatter plot; to delete this label, uncheck the Attach label to line option near the bottom of the Properties dialog box, and click on the Apply button.

Step 9: Close the Properties dialog box, and exit from the Chart Editor, after which you should see that the SPSS output has been updated.
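
The Fit Line at Total option displays the ordinary least squares line; its slope and intercept can be computed directly, as in this Python sketch (using numpy, with hypothetical paired data):

```python
import numpy as np

# Hypothetical paired data for a scatter plot.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Degree-1 polynomial fit = the least squares line y = slope * x + intercept.
slope, intercept = np.polyfit(x, y, 1)
```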

Creating a Frequency Table

Step 1: In the SPSS data file, identify the variable, or variables, for which a frequency table is to be created; the variable(s) can be either qualitative (in which case the table will display a list of codes used to represent categories) or quantitative (in which case the table will display a list of all values of the variable in the data set).

Step 2: Select the Analyze > Descriptive Statistics > Frequencies options to display the Frequencies dialog box.

Step 3: From the list of variables on the left, select the variable, or variables, for which numerical summaries are to be obtained, and click on the arrow pointing toward the Variable(s) box.

Step 4: Make certain the Display frequency tables option is checked, and click on the OK button, after which the SPSS output will be generated.

Obtaining Numerical Summaries

First Method

Step 1: In the SPSS data file, identify one or more quantitative variables for which numerical summaries are to be obtained, and identify, if there are any, one or more qualitative (i.e., categorical) variables for defining groups.

Step 2: Select the Analyze > Descriptive Statistics > Explore options to display the Explore dialog box.

Step 3: From the list of variables on the left, select the quantitative variable, or variables, for which numerical summaries are to be obtained, and click on the arrow pointing toward the Dependent List box; then select, if any, the qualitative (i.e., categorical) variables for defining groups, and click on the arrow pointing toward the Factor List box.

Step 4: In the Display section near the bottom of the dialog box, select the Statistics option.

Step 5: Click on the Statistics button to display the Explore: Statistics dialog box, and notice that the Descriptives option is selected; if there is interest in including the five‑number summary among the numerical summaries, check the appropriate boxes to also select the Percentiles option.

Step 6: Click on the Continue button to return to the Explore dialog box, and click on the OK button, after which the SPSS output will be generated.

Second Method

Step 1: In the SPSS data file, identify one or more quantitative variables for which numerical summaries are to be obtained, and identify, if there are any, one or more qualitative (i.e., categorical) variables for defining groups.

Step 2: Select the Analyze > Compare Means > Means options to display the Means dialog box.

Step 3: From the list of variables on the left, select the quantitative variable, or variables, for which numerical summaries are to be obtained, and click on the arrow pointing toward the Dependent List box; then select, if any, the qualitative (i.e., categorical) variables for defining groups, and click on the arrow pointing toward the Independent List box.

Step 4: Click on the Options button to display the Means: Options dialog box.  In this dialog box, you should notice that the options Mean, Number of Cases, and Standard Deviation are each listed in the Cell Statistics box on the right, and that several other options are listed in the Statistics box on the left.

Step 5: After selecting any additional options desired from the list in the Statistics box, click on the arrow button pointing toward the Cell Statistics box.

Step 6: Click on the Continue button to return to the Means dialog box, and then click on the OK button, after which the SPSS output will be generated.
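
The Means procedure amounts to computing cell statistics for a quantitative variable within the groups of a categorical variable; a Python sketch with pandas (hypothetical data):

```python
import pandas as pd

# Hypothetical data: a quantitative variable summarized within the
# groups of a categorical variable, as in Analyze > Compare Means > Means.
df = pd.DataFrame({
    "group": ["A", "A", "B", "B", "B"],
    "score": [10.0, 14.0, 20.0, 22.0, 24.0],
})

# Default cell statistics: mean, number of cases, standard deviation.
cell_stats = df.groupby("group")["score"].agg(["mean", "count", "std"])
```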

Third Method

Step 1: In the SPSS data file, identify one or more quantitative variables for which numerical summaries are to be obtained.

Step 2: Select the Analyze > Descriptive Statistics > Descriptives options to display the Descriptives dialog box.

Step 3: From the list of variables on the left, select the quantitative variable, or variables, for which numerical summaries are to be obtained, and click on the arrow pointing toward the Variable(s) box.

Step 4: Click on the Options button to display the Descriptives: Options dialog box.  In this dialog box, you should notice that the options Mean, Std. deviation, Minimum, and Maximum are checked, and that other options are available to be checked.

Step 5: After checking any additional options desired, click on the Continue button to return to the Descriptives dialog box.

Step 6: Click on the OK button, after which the SPSS output will be generated.

 

III. Statistical Analysis Involving One Variable

        (1) Performing a One-Sample t Test about a Mean μ

Step 1: Identify the (quantitative) variable in the SPSS data file on which the test is to be performed, decide on the hypothesized value for the mean μ, and select a (two‑sided) significance level α.  (More than one (quantitative) variable may be selected on which the test is to be performed simultaneously, but only one hypothesized value for the mean is permitted.)

Step 2: Select the Analyze > Compare Means > One Sample T Test options to display the One Sample T Test dialog box.

Step 3: From the list of variables on the left, select the desired (quantitative) variable (and more than one selection is permitted), and click on the arrow pointing toward the Test Variable(s) section.

Step 4: Type the hypothesized value for the mean μ in the Test Value slot.

Step 5: Click on the Options button to set the desired (two‑sided) confidence level (which will generally be 100% minus the significance level; e.g., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).

Step 6: Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.  The p-value displayed on the SPSS output is for a two‑sided test; it must be divided by 2 when doing a one‑sided test.  Also, the confidence interval limits displayed on the SPSS output are actually the confidence interval limits for the difference between the population mean and the hypothesized value of the mean; adding the hypothesized value for the mean to each of these limits gives the confidence interval limits for the population mean.

One possible appropriate graphical display for the quantitative variable used in a one‑sample t test is a box plot, which can be obtained by following the steps labeled “Creating One Box Plot or Multiple Box Plots for Commensurate Variables” in Section II(3) of this document.
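
For checking, the same test can be reproduced outside SPSS; this Python sketch (using scipy, with a hypothetical sample and a hypothesized mean of 50) also illustrates the halving of the two-sided p-value described in Step 6:

```python
from scipy import stats

# Hypothetical sample; the Test Value (hypothesized mean) is taken to be 50.
sample = [48.2, 51.5, 49.8, 50.9, 47.6, 52.3, 49.1, 50.4]
t_stat, p_two_sided = stats.ttest_1samp(sample, popmean=50.0)

# SPSS reports the two-sided p-value; divide it by 2 for a one-sided test
# (when the sample mean lies on the side specified by the alternative).
p_one_sided = p_two_sided / 2
```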

 

        (2) Performing a Chi-square Goodness-of-Fit Test with Hypothesized Proportions

Step 1: If the individual cases making up the raw data have already been entered into an SPSS data file, identify the (qualitative) variable on which the test is to be performed, and skip to Step 7; if the data is to be entered into SPSS with raw frequencies (i.e., counts), then follow the instructions beginning in Step 2.

Step 2: Go to the Variable View sheet (by clicking on the appropriate tab at the bottom of the screen), and in the first row, enter a name for the (qualitative) variable on which the test is to be performed.

Step 3: Define codes for this (qualitative) variable so that 1 (one) represents one category, 2 (two) represents a second category, 3 (three) represents a third category, etc., making certain that all categories of the variable have been included.

Step 4: In the second row, enter the variable name RawFrequency, and since all the raw frequencies must be integers, set the entry in the Decimals column for this variable to 0 (zero).

Step 5: Go to the Data View sheet (by clicking on the appropriate tab at the bottom of the screen), and in the column for the (qualitative) variable on which the test is to be performed, enter the codes 1, 2, 3, etc. respectively into the first cell, the second cell, the third cell, etc., making certain that all codes used have been entered.  (If the category labels are not displayed, then select View > Value Labels from the main menu.)

Step 6: In the column for the variable RawFrequency, enter the corresponding raw frequency in the data for each category; after all the data is entered, it may be a good idea to save this SPSS file using an appropriate file name.

Step 7: If each line of the data file represents one case (i.e., the data were not entered in the format described in Steps 2 to 6), then select the Analyze > Nonparametric Tests > Legacy Dialogs > Chi‑Square options to display the Chi‑Square Test dialog box, and proceed to the next step.  If each line of the data file represents one category of the (qualitative) variable on which the test is to be performed (i.e., the data were entered in the format described in Steps 2 to 6), then do the following: First, select the Data > Weight Cases options to display the Weight Cases dialog box; then, select the Weight cases by option, select from the list on the left the variable RawFrequency (which is the frequency of occurrence in the data for each category), and click on the arrow button pointing toward the Frequency Variable slot; next, click on the OK button; finally, select the Analyze > Nonparametric Tests > Legacy Dialogs > Chi‑Square options to display the Chi‑Square Test dialog box, and proceed to the next step.

Step 8: In the Chi-Square Test dialog box, from the list of the variables on the left, select the (qualitative) variable on which the test is to be performed, and click on the arrow button pointing toward the Test Variable List section.  For each category, decide on the hypothesized value for the proportion.

Step 9: Which option should be selected in the Expected Values section depends on the hypothesized proportions.  If the hypothesized proportions are all equal, then select the All categories equal option; if the hypothesized proportions are not all equal, then select the Values option, and enter the hypothesized proportions by first typing the hypothesized proportion for the category coded with the smallest value (which would be 1 if the data were entered in the format described in Steps 2 to 6) in the Values slot and clicking on the Add button.  Then, type the hypothesized proportion for the category coded with the next smallest value (which would be 2 if the data were entered in the format described in Steps 2 to 6) in the Values slot and click on the Add button.  Next, type the hypothesized proportion for the category coded with the third smallest value (which would be 3 if the data were entered in the format described in Steps 2 to 6) in the Values slot and click on the Add button.  Continue this process until the hypothesized proportion for each code used has been entered.  (The order in which these hypothesized proportions are entered must correspond to the numerical order of the codes for the different categories; also, it does not matter whether percentages or proportions are entered, so that, for instance, 45, 35, and 20 could be entered instead of 0.45, 0.35, and 0.20.)

Step 10: Click on the OK button, after which the SPSS output will be generated.

One possible appropriate graphical display for the qualitative variable used in a chi‑square goodness‑of‑fit test is a bar chart, which can be obtained by following the steps labeled “Creating a Bar Chart” in Section II(3) of this document; another possible graphical display is a pie chart, which can be obtained by following the steps labeled “Creating a Pie Chart” in Section II(3) of this document.
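SPSS itself is menu-driven, but the goodness-of-fit test above can be cross-checked outside SPSS. As a rough sketch in Python using scipy.stats.chisquare, with made-up counts and hypothesized proportions purely for illustration:

```python
# Hypothetical example: observed counts for three categories and
# hypothesized proportions 0.45, 0.35, 0.20 (all values are made up).
from scipy import stats

observed = [44, 37, 19]                      # raw frequencies, one per category
n = sum(observed)
expected = [n * p for p in (0.45, 0.35, 0.20)]  # expected counts under H0

# Chi-square goodness-of-fit test; compare chi2 and p_value to the
# Pearson Chi-Square line of the SPSS output.
chi2, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
```

As in Step 9, it is the proportions (not the raw counts) that are hypothesized; here they are converted to expected counts before the test is run.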

 

IV. Statistical Analysis Involving Two Variables

        (1) Generating a Correlation Matrix with p-values

Step 1: Identify the (quantitative or qualitative‑ordinal) variables in the SPSS data file for which correlations between pairs of variables are to be calculated.

Step 2: Select the Analyze > Correlate > Bivariate options to display the Bivariate Correlations dialog box.

Step 3: From the list of variables on the left, select the desired (quantitative or qualitative‑ordinal) variables, and click on the arrow pointing toward the Variables section.

Step 4: In the Correlation Coefficients section, select all the desired options (such as the Pearson option to generate the Pearson product moment correlation(s) for variables which are assumed to be at least approximately normally distributed, and the Spearman option to generate the Spearman rank correlation(s) when at least one variable is either qualitative‑ordinal or assumed to have a distribution substantially different from a normal distribution).

Step 5: In the Tests of Significance section, select either the Two‑tailed or One‑tailed option, depending on what type of hypothesis test is desired.

Step 6: Click on the OK button, after which the SPSS output will be generated.

One possible appropriate graphical display for the variables whose correlation is of interest is a scatter plot, which can be obtained by following the steps labeled “Creating a Scatter Plot” in Section II(3) of this document.
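Both correlation coefficients offered in the Bivariate Correlations dialog box have direct counterparts in scipy; a minimal Python sketch (the data below are randomly generated for illustration only) that reports each coefficient with its two-sided p-value:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)        # made-up illustrative data
x = rng.normal(size=30)
y = 0.5 * x + rng.normal(size=30)

# Pearson product-moment correlation (assumes approximate normality),
# and Spearman rank correlation (for ordinal or non-normal variables).
r, p_pearson = stats.pearsonr(x, y)
rho, p_spearman = stats.spearmanr(x, y)
```

The p-values returned here are two-sided, matching the Two-tailed option in Step 5.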

 

        (2) Performing a Paired Sample t Test about a Mean Difference md (i.e., a difference between means from dependent samples) or a Wilcoxon Signed Rank Test

Step 1: Identify in the SPSS data file the pair of (quantitative) variables for which the mean difference is being tested, and select a (two‑sided) significance level a.  (More than one pair of (quantitative) variables may be selected on which to test the mean difference simultaneously.)

Step 2: Select the Analyze > Compare Means > Paired‑Samples T Test options to display the Paired‑Samples T Test dialog box.

Step 3: From the list of variables on the left, select one of the desired (quantitative) variables, and click on the arrow pointing toward the Paired Variables section; then select the other desired (quantitative) variable, and click on the arrow pointing toward the Paired Variables section.  (Selection of more than one pair is permitted.)

Step 4: Click on the Options button to set the desired (two‑sided) confidence level (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).

Step 5: Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.  The p-value displayed on the SPSS output is for a two‑sided test; this must be divided by 2 when doing a one‑sided test.

A nonparametric test which can be considered an alternative to the paired‑sample t test (when appropriate assumptions might not be satisfied) is the Wilcoxon signed rank test, which compares the median of the distribution of differences between the paired (quantitative or qualitative‑ordinal) variables to zero (0); it can be performed as follows:

Step 1: Identify in the SPSS data file the pair of (quantitative or qualitative‑ordinal) variables for which the difference is being tested, and select a (two‑sided) significance level a.  (More than one pair of variables may be selected on which to perform the test simultaneously.)

Step 2: Select the Analyze > Nonparametric Tests > Legacy Dialogs > 2 Related Samples options to display the Two-Related-Samples Tests dialog box.

Step 3: From the list of variables on the left, select one of the desired (quantitative or qualitative‑ordinal) variables, and click on the arrow pointing toward the Test Pairs section; then select the other desired (quantitative or qualitative‑ordinal) variable, and click on the arrow pointing toward the Test Pairs section.  (Selection of more than one pair is permitted.)

Step 4: Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.  The p-value displayed on the SPSS output is for a two‑sided test; this must be divided by 2 when doing a one‑sided test.

Two appropriate graphical displays are possible for the data used in a paired‑sample t test or a Wilcoxon signed rank test:

          (1) one box plot of the differences between the two variables; the differences can be obtained by following the steps labeled “Creating New Variables with Transformation of Existing Variables” in Section I(2) of this document, and noting that the formula to be entered in Step 3 should be one of the two variables minus the other (and the minus sign button from the keypad displayed in the middle of the dialog box can be used); the box plot of the new variable of differences can then be obtained by following the steps labeled “Creating One Box Plot or Multiple Box Plots for Commensurate Variables” in Section II(3) of this document.

          (2) two box plots, one for each variable, which can be obtained by following the steps labeled “Creating One Box Plot or Multiple Box Plots for Commensurate Variables” in Section II(3) of this document.
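Both the paired-sample t test and its Wilcoxon signed rank alternative can be sketched in Python with scipy (the before/after measurements below are made up for illustration):

```python
from scipy import stats

before = [12.1, 11.4, 13.0, 12.7, 11.8, 12.5]   # made-up paired data
after  = [11.6, 11.0, 12.4, 12.9, 11.2, 12.0]

# Paired-sample t test on the mean difference (two-sided p-value,
# as on the SPSS output).
t_stat, p_paired = stats.ttest_rel(before, after)

# Wilcoxon signed rank test on the same pairs.
w_stat, p_wilcoxon = stats.wilcoxon(before, after)
```

As noted in Step 5, the two-sided p-value must be halved for a one-sided test (provided the observed difference is in the hypothesized direction).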

 

        (3) Performing a Two Sample t Test about a Difference Between Means m1 and m2 or a Mann-Whitney Rank Sum Test

Step 1: Identify in the SPSS data file both the (qualitative‑dichotomous) variable which defines the two groups being compared and the (quantitative) variable for which the means are being compared, and select a (two‑sided) significance level a.  (More than one (quantitative) variable may be selected on which to compare means simultaneously, but only one grouping variable may be selected.)

Step 2: Select the Analyze > Compare Means > Independent‑Samples T Test options to display the Independent‑Samples T Test dialog box.

Step 3: From the list of variables on the left, select the desired (quantitative) variable (and more than one selection is permitted), and click on the arrow pointing toward the Test Variable(s) section.

Step 4: From the list of variables on the left, select the desired (qualitative‑dichotomous) variable, and click on the arrow pointing toward the Grouping Variable slot.

Step 5: Click on the Define Groups button to display the Define Groups dialog box.

Step 6: In the Group 1 slot type the numerical code which represents one of the categories for the (qualitative‑dichotomous) variable which defines the two groups being compared, and in the Group 2 slot type the numerical code which represents the other category; then click on the Continue button to return to the Independent‑Samples T Test dialog box.

Step 7: Click on the Options button to set the desired (two‑sided) confidence level (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).

Step 8: Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.  The p-value displayed on the SPSS output is for a two‑sided test; this must be divided by 2 when doing a one‑sided test.

A nonparametric test which can be considered an alternative to the two‑sample t test (when appropriate assumptions might not be satisfied) is the Mann‑Whitney rank sum test to compare the distributions of a quantitative or qualitative‑ordinal variable for two groups, which can be performed as follows:

Step 1: Identify in the SPSS data file both the (qualitative‑dichotomous) variable which defines the two groups being compared and the (quantitative or qualitative‑ordinal) variable for which the distributions are being compared.  (More than one (quantitative or qualitative‑ordinal) variable may be selected on which to compare distributions simultaneously, but only one grouping variable may be selected.)

Step 2: Select the Analyze > Nonparametric Tests > Legacy Dialogs > 2 Independent Samples options to display the Two-Independent-Samples Tests dialog box.

Step 3: From the list of variables on the left, select the desired (quantitative or qualitative‑ordinal) variable (and more than one selection is permitted), and click on the arrow pointing toward the Test Variable List section.

Step 4: From the list of variables on the left, select the desired (qualitative‑dichotomous) variable, and click on the arrow pointing toward the Grouping Variable slot.

Step 5: Click on the Define Groups button to display the Define Groups dialog box.

Step 6: In the Group 1 slot type the numerical code which represents one of the categories for the (qualitative‑dichotomous) variable which defines the two groups being compared, and in the Group 2 slot type the numerical code which represents the other category; then click on the Continue button to return to the Two‑Independent‑Samples Test dialog box.

Step 7: Make certain that the Mann‑Whitney U option is selected in the Test Type section.

Step 8: Click on the OK button, after which the SPSS output will be generated.  The p-value displayed on the SPSS output is for a two‑sided test; this must be divided by 2 when doing a one‑sided test.

One possible appropriate graphical display for the data used in a two‑sample t test or a Mann‑Whitney rank sum test is two box plots, one for each group, which can be obtained by following the steps labeled “Creating Box Plots for Two or More Groups” in Section II(3) of this document.
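For comparison with the SPSS output, a minimal Python sketch of the two-sample t test and the Mann-Whitney rank sum test (the group data below are made up for illustration; SPSS reports both equal-variance and unequal-variance t results, and the unequal-variance version is shown here):

```python
from scipy import stats

group1 = [5.1, 4.8, 5.6, 5.0, 4.9, 5.3]   # made-up data for group 1
group2 = [4.4, 4.7, 4.2, 4.9, 4.5, 4.3]   # made-up data for group 2

# Welch's two-sample t test (equal_var=False matches the SPSS line
# "Equal variances not assumed").
t_stat, p_t = stats.ttest_ind(group1, group2, equal_var=False)

# Mann-Whitney rank sum test (two-sided, as on the SPSS output).
u_stat, p_u = stats.mannwhitneyu(group1, group2, alternative="two-sided")
```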

 

        (4) Performing a One-Way ANOVA (Analysis of Variance) to Test for at Least One Difference Among Multiple Means m1, m2, …, mk or a Kruskal-Wallis Rank Sum Test

Step 1: Identify in the SPSS data file both the (qualitative) variable which defines the groups being compared and the (quantitative) variable for which the means are being compared, and select a significance level a.  (More than one (quantitative) variable may be selected on which to compare means simultaneously, but only one grouping variable may be selected.)

Step 2: Select the Analyze > Compare Means > One‑Way ANOVA options to display the One‑Way ANOVA dialog box.

Step 3: From the list of variables on the left, select the desired (quantitative) variable (and more than one selection is permitted), and click on the arrow pointing toward the Dependent List section.

Step 4: From the list of variables on the left, select the desired (qualitative) variable, and click on the arrow pointing toward the Factor slot.

Step 5: Click on the Options button to display the One‑Way ANOVA: Options dialog box; in order to have descriptive statistics displayed, select the Descriptive option; in order to have results for Levene’s test (concerning equal variances) displayed, select the Homogeneity of variance test option; in order to have results for alternative f tests (which are adjusted for unequal variances) displayed, select the Brown‑Forsythe option and/or the Welch option; click on the Continue button to return to the One‑Way ANOVA dialog box.

Step 6: Click on the Post Hoc button to display the One‑Way ANOVA: Post Hoc Multiple Comparisons dialog box; in order to have results for one or more multiple comparison methods displayed, make the desired selections in the Equal Variances Assumed and/or Equal Variances Not Assumed sections, and enter the significance level in the Significance level slot; click on the Continue button to return to the One‑Way ANOVA dialog box.

Step 7: Click on the OK button, after which the SPSS output will be generated.

A nonparametric test which can be considered an alternative to the one‑way ANOVA f test (when appropriate assumptions might not be satisfied) is the Kruskal‑Wallis rank sum test to compare the distributions of a quantitative or qualitative‑ordinal variable for k groups, which can be performed as follows:

Step 1: Identify in the SPSS data file both the (qualitative) variable which defines the k groups being compared and the (quantitative or qualitative‑ordinal) variable for which the distributions are being compared.  (More than one (quantitative or qualitative‑ordinal) variable may be selected on which to compare distributions simultaneously, but only one grouping variable may be selected.)

Step 2: Select the Analyze > Nonparametric Tests > Legacy Dialogs > K Independent Samples options to display the Tests for Several Independent Samples dialog box.

Step 3: From the list of variables on the left, select the desired (quantitative or qualitative‑ordinal) variable (and more than one selection is permitted), and click on the arrow pointing toward the Test Variable List section.

Step 4: From the list of variables on the left, select the desired (qualitative) variable, and click on the arrow pointing toward the Grouping Variable slot.

Step 5: Click on the Define Range button to display the Several Independent Samples: Define Range dialog box.

Step 6: In the Minimum slot type the smallest numerical code used to represent the categories for the (qualitative) variable which defines the groups being compared, and in the Maximum slot type the largest numerical code used; then click on the Continue button to return to the Tests for Several Independent Samples dialog box.

Step 7: Make certain that the Kruskal‑Wallis H option is selected in the Test Type section.

Step 8: Click on the OK button, after which the SPSS output will be generated.

One possible appropriate graphical display for the data used in a one‑way ANOVA or a Kruskal‑Wallis rank sum test is multiple box plots, one for each group, which can be obtained by following the steps labeled “Creating Box Plots for Two or More Groups” in Section II(3) of this document.
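The one-way ANOVA f test and the Kruskal-Wallis rank sum test can both be sketched in Python with scipy (the three groups of values below are made up for illustration):

```python
from scipy import stats

g1 = [23, 25, 21, 24]     # made-up data for group 1
g2 = [27, 29, 26, 30]     # made-up data for group 2
g3 = [22, 24, 23, 21]     # made-up data for group 3

# One-way ANOVA f test for at least one difference among the group means.
f_stat, p_anova = stats.f_oneway(g1, g2, g3)

# Kruskal-Wallis rank sum test comparing the k group distributions.
h_stat, p_kw = stats.kruskal(g1, g2, g3)
```

Note that scipy does not reproduce the post hoc multiple comparisons of Step 6; those would need a separate procedure.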

 

        (5) Performing a Chi-square Test Concerning Independence

Step 1: If the individual cases making up the raw data have already been entered into an SPSS data file, identify the two (qualitative) variables on which the test is to be performed, and skip to Step 7; otherwise, enter the data into SPSS by following the instructions beginning in Step 2.

Step 2: Go to the Variable View sheet (by clicking on the appropriate tab at the bottom of the screen), in the first row enter a name for one of the two (qualitative) variables on which the test is to be performed, and in the second row enter a name for the other (qualitative) variable.

Step 3: For each of the two (qualitative) variables, define codes so that 1 (one) represents one category, 2 (two) represents a second category, 3 (three) represents a third category, etc., making certain that all categories of the variable have been included.

Step 4: In the third row, enter the variable name RawFrequency, and since all the counts must be integers, set the entry in the Decimals column for this row to 0 (zero).

Step 5: Go to the Data View sheet (by clicking on the appropriate tab at the bottom of the screen), and in the first two columns for the two (qualitative) variables on which the test is to be performed, enter the codes 1 and 1 respectively into the first and second cells of the first row, enter the codes 1 and 2 respectively into the first and second cells of the second row, enter the codes 1 and 3 respectively into the first and second cells of the third row, etc., making certain that all codes used for the (qualitative) variable in the second column have been entered.  Now, repeat this in the next rows with the code 2 entered in the first column, and then repeat this again with the code 3 entered in the first column, etc. making certain that all codes used for the (qualitative) variable in the first column have been entered.  Each possible combination of categories for the two (qualitative) variables should now be displayed exactly once in the first two columns.  (If the category labels are not displayed, then select View > Value Labels from the main menu.)

Step 6: In the column for the variable RawFrequency, enter the corresponding raw frequency in the data for each combination of categories; after all the data is entered, it may be a good idea to save this SPSS file using an appropriate file name.

Step 7: If each line of the data file represents one case (i.e., the data were not entered in the format described in Steps 2 to 6), then select the Analyze > Descriptive Statistics > Crosstabs options to display the Crosstabs dialog box, and proceed to the next step; if each line of the data file represents a combination of categories of the two (qualitative) variables on which the test is to be performed (i.e., the data were entered in the format described in Steps 2 to 6), then do the following: First, select the Data > Weight Cases options to display the Weight Cases dialog box; then, select the Weight cases by option, select from the list on the left the variable name for the variable RawFrequency (which is the frequency of occurrence in the data for each combination of categories), and click on the arrow button pointing toward the Frequency Variable slot; next, click on the OK button; finally, select the Analyze > Descriptive Statistics > Crosstabs options to display the Crosstabs dialog box, and proceed to the next step.

Step 8: From the list of the variables on the left, select one of the two (qualitative) variables on which the test is to be performed, and click on the arrow button pointing toward the Row(s) section; then select the other (qualitative) variable, and click on the arrow button pointing toward the Column(s) section.

Step 9: Click on the Statistics button to display the Crosstabs: Statistics dialog box, select the Chi-square option in the upper left corner of the dialog box, and click on the Continue button to return to the Crosstabs dialog box.

Step 10: Click on the Cells button to display the Crosstabs: Cell Display dialog box; in order to have the data (observed frequencies) displayed, select the Observed option in the Counts section; in order to have the expected frequencies displayed, select the Expected option in the Counts section; in order to have the percentages for column variable categories displayed for each row variable category, select the Row option in the Percentages section; in order to have the percentages for row variable categories displayed for each column variable category, select the Column option in the Percentages section; in order to have the percentages for each cell out of the total displayed, select the Total option in the Percentages section; in order to have the standardized residuals displayed for each cell, select the Standardized option in the Residuals section; click on the Continue button to return to the Crosstabs dialog box.  (NOTE: The column heading for the display of the p‑value for the Pearson Chi‑Square statistic has “2‑sided” in parentheses on the SPSS output, which can be misleading since the Pearson Chi‑Square test is generally a one‑sided test.)

Step 11: Click on the OK button, after which the SPSS output will be generated.

One possible appropriate graphical display for the data used in a chi‑square test concerning independence is a stacked bar chart, which can be obtained by following the steps labeled “Creating a Stacked Bar Chart” in Section II(3) of this document.
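The Crosstabs chi-square test of independence can be sketched in Python with scipy.stats.chi2_contingency, starting directly from the table of raw frequencies (the 2x3 table below is made up for illustration):

```python
import numpy as np
from scipy import stats

# Hypothetical 2x3 contingency table of raw frequencies
# (rows = categories of one variable, columns = categories of the other).
table = np.array([[20, 30, 25],
                  [35, 25, 15]])

# correction=False gives the uncorrected Pearson Chi-Square statistic,
# matching the "Pearson Chi-Square" line of the SPSS output.
chi2, p_value, dof, expected = stats.chi2_contingency(table, correction=False)
```

The returned expected array corresponds to the expected frequencies requested via the Expected option in Step 10, and dof equals (rows − 1) × (columns − 1).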

 

        (6) Performing a Simple Linear Regression with Checks of Linearity, Homoscedasticity, and Normality Assumptions

Step 1: Identify in the SPSS data file the (quantitative) dependent (response) variable and the (quantitative) independent (explanatory or predictor) variable.

Step 2: Select the Graphs > Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot dialog box.

Step 3: Make certain that the Simple Scatter option is selected; then, click on the Define button to display the Simple Scatterplot dialog box.

Step 4: From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Y-Axis slot; then select the (quantitative) independent (explanatory or predictor) variable, and click on the arrow pointing toward the X-Axis slot.

Step 5: Click on the OK button, after which SPSS output displaying a scatter plot will be generated.  In order to have the least squares line appear on the scatter plot, first double click on the graph to enter the SPSS Chart Editor, and then select the Elements > Fit Line at Total options from the main menu (and close the dialog box which appears), after which selecting the File > Close options will close the Chart Editor.  By examining how the points on the scatter plot are distributed around the least squares line, a decision can be made as to whether the linearity assumption about the relationship between the dependent and independent variables is satisfied.

Step 6: Select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

Step 7: From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Dependent slot; then select the (quantitative) independent (explanatory or predictor) variable, and click on the arrow pointing toward the Independent(s) section (where selection of more than one variable is permitted).

Step 8: Click on the Plots button to display the Linear Regression: Plots dialog box.  From the list of variables on the left, select ZRESID, and click on the arrow pointing toward the Y slot; then select ZPRED, and click on the arrow pointing toward the X slot.  This will generate a scatter plot with standardized predicted values on the horizontal axis and standardized residuals on the vertical axis, so that a decision can be made as to whether the homoscedasticity assumption is satisfied.

Step 9: In the Standardized Residual Plots section of the Linear Regression: Plots dialog box, select the Histogram option and the Normal probability plot option.  This will generate a histogram and a normal probability plot for standardized residuals, so that a decision can be made as to whether the normality assumption is satisfied.

Step 10: Click on the Continue button, and then click on the Statistics button to display the Linear Regression: Statistics dialog box.  In the Regression Coefficients section, make certain that the Estimates option is selected, and select the Descriptives option to generate means, standard deviations, and the Pearson correlation; also, select the Confidence intervals option to set the desired (two‑sided) confidence level (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).  This will generate confidence intervals for the slope and for the intercept in the regression.

Step 11: Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.

Step 12: The scatter plot with standardized predicted values on the horizontal axis and standardized residuals on the vertical axis, requested in Step 8, will be displayed on the SPSS output without a horizontal line at zero; since this line can be helpful in examining this plot, the instructions in Step 5 to have the least squares line appear on the scatter plot can be used to add this horizontal line at zero.
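The coefficient estimates and p-value from the SPSS regression output can be cross-checked with scipy.stats.linregress; the residuals computed below are the (unstandardized) counterparts of the residuals SPSS uses for the assumption-checking plots (the x and y values are made up for illustration):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])        # made-up predictor
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.2])       # made-up response

# Simple linear regression: slope, intercept, r, two-sided p-value,
# and standard error of the slope.
res = stats.linregress(x, y)

# Residuals (observed minus fitted), used for checking homoscedasticity
# and normality, as in Steps 8 and 9.
residuals = y - (res.intercept + res.slope * x)
```

With an intercept in the model, the residuals always average to zero, which is why the horizontal line at zero added in Step 12 is a natural reference line.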

 

V. Statistical Analysis Involving Multiple (Two or More) Variables

        (1) Performing a Quadratic Regression with Checks of Model, Homoscedasticity, and Normality Assumptions

Step 1: Identify in the SPSS data file the (quantitative) dependent (response) variable and the (quantitative) independent (explanatory or predictor) variable.

Step 2: Create a new variable in the SPSS data file which is the square of the independent (explanatory or predictor) variable.  (This can be done using the instructions in subsection (2) of section I.)

Step 3: Select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

Step 4: From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Dependent slot; then select the independent (explanatory or predictor) variable and the variable created in Step 2 which is its square, and click on the arrow pointing toward the Independent(s) section.

Step 5: Click on the Plots button to display the Linear Regression: Plots dialog box.  From the list of variables on the left, select ZRESID, and click on the arrow pointing toward the Y slot; then select ZPRED, and click on the arrow pointing toward the X slot.  This will generate a scatter plot with standardized predicted values on the horizontal axis and standardized residuals on the vertical axis, so that a decision can be made as to whether the homoscedasticity assumption is satisfied.

Step 6: In the Standardized Residual Plots section of the Linear Regression: Plots dialog box, select the Histogram option and the Normal probability plot option.  This will generate a histogram and a normal probability plot for standardized residuals, so that a decision can be made as to whether the normality assumption is satisfied.

Step 7: Click on the Continue button, and then click on the Statistics button to display the Linear Regression: Statistics dialog box.  In the Regression Coefficients section, make certain that the Estimates option is selected, and select the Descriptives option to generate means, standard deviations, and the Pearson correlation; also, select the Confidence intervals option to set the desired (two‑sided) confidence level (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).  This will generate confidence intervals for each of the coefficients in the regression.

Step 8: Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.

Step 9: The scatter plot with standardized predicted values on the horizontal axis and standardized residuals on the vertical axis, requested in Step 5, will be displayed on the SPSS output without a horizontal line at zero; since this line can be helpful in examining this plot, add this horizontal line at zero by doing the following: double click on the graph to enter the SPSS Chart Editor, and then select the Elements > Fit Line at Total options from the main menu (and close the dialog box which appears), after which selecting the File > Close options will close the Chart Editor.
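The quadratic model fitted here is just a linear regression on x and its square, which is why SPSS's Linear Regression procedure handles it; a numpy sketch of the same idea, with made-up data roughly following 1 + 0.8x + 1.1x²:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])          # made-up predictor
y = np.array([1.2, 2.9, 7.1, 12.8, 21.0, 31.2])       # made-up response

# Design matrix with intercept, x, and x**2 columns; the squared column
# plays the role of the new variable created in Step 2.
X = np.column_stack([np.ones_like(x), x, x ** 2])

# Least squares fit: coef holds intercept, linear, and quadratic terms.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
fitted = X @ coef
residuals = y - fitted
```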

 

        (2) Performing a Multiple Linear Regression with Checks for Multicollinearity and of Linearity, Homoscedasticity, and Normality Assumptions

Step 1: Identify in the SPSS data file the (quantitative) dependent (response) variable, all quantitative independent (explanatory or predictor) variables, and all qualitative independent (explanatory or predictor) variables.

Step 2: Select the Graphs > Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot dialog box.

Step 3: Make certain that the Simple Scatter option is selected; then, click on the Define button to display the Simple Scatterplot dialog box.

Step 4: From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Y-Axis slot; then select one of the quantitative independent (explanatory or predictor) variables, and click on the arrow pointing toward the X-Axis slot.

Step 5: Click on the OK button, after which SPSS output displaying a scatter plot will be generated.  In order to have the least squares line appear on the scatter plot, first double click on the graph to enter the SPSS Chart Editor, and then select the Elements > Fit Line at Total options from the main menu (and close the dialog box which appears), after which selecting the File > Close options will close the Chart Editor.  By examining how the points on the scatter plot are distributed around the least squares line, a decision can be made as to whether the linearity assumption about the relationship between the dependent and independent variables is satisfied.

Step 6: Repeat Steps 2 to 5 for each of the quantitative independent (explanatory or predictor) variables.

Step 7: For each of the qualitative independent (explanatory or predictor) variables, add the appropriate dummy variable(s) to the data file.  For a qualitative variable with k categories, this can be done by defining dummy variables, where the first dummy variable is equal to 1 (one) for category #1 and 0 (zero) otherwise, the second dummy variable is equal to 1 (one) for category #2 and 0 (zero) otherwise, etc.; since k - 1 dummy variables are sufficient to represent a qualitative variable with k categories, the kth dummy variable is not really necessary, and may or may not be used.
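The dummy coding described in Step 7 can be done by hand in SPSS, but the idea is easy to illustrate with pandas, which can generate the k − 1 dummies automatically (the variable name and categories below are hypothetical):

```python
import pandas as pd

# Hypothetical qualitative variable with k = 3 categories.
df = pd.DataFrame({"region": ["north", "south", "west", "south", "north"]})

# drop_first=True keeps k - 1 = 2 dummy variables, which is sufficient
# (and avoids perfect multicollinearity with the intercept); the dropped
# category becomes the reference category.
dummies = pd.get_dummies(df["region"], prefix="region", drop_first=True)
df = pd.concat([df, dummies], axis=1)
```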

Step 8: Select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

Step 9: From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Dependent slot; then select each independent (explanatory or predictor) variable, and click on the arrow pointing toward the Independent(s) section (where selection of more than one variable is permitted).

Step 10: Click on the Plots button to display the Linear Regression: Plots dialog box.  From the list of variables on the left, select ZRESID, and click on the arrow pointing toward the Y slot; then select ZPRED, and click on the arrow pointing toward the X slot.  This will generate a scatter plot with standardized predicted values on the horizontal axis and standardized residuals on the vertical axis, so that a decision can be made as to whether the homoscedasticity assumption is satisfied.

Step 11: In the Standardized Residual Plots section of the Linear Regression: Plots dialog box, select the Histogram option and the Normal probability plot option.  This will generate a histogram and a normal probability plot for standardized residuals, so that a decision can be made as to whether the normality assumption is satisfied.

Step 12: Click on the Continue button, and then click on the Statistics button to display the Linear Regression: Statistics dialog box.  In the Regression Coefficients section, make certain that the Estimates option is selected; select the Descriptives option to generate means, standard deviations, and the Pearson correlations, and select the Collinearity diagnostics option to generate information about whether multicollinearity could be a problem.  Also, select the Confidence intervals option to set the desired (two-sided) confidence level (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).  This will generate confidence intervals for each regression coefficient and for the intercept.

Step 13: Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.

Step 14: The scatter plot with standardized predicted values on the horizontal axis and standardized residuals on the vertical axis, requested in Step #10, will be displayed on the SPSS output without a horizontal line at zero; since this line can be helpful in examining the plot, the instructions in Step #5 for adding the least squares line to a scatter plot can be used to add this horizontal line at zero.

 

        (3) Performing a Stepwise Linear Regression to Build a Model

Step 1: Identify in the SPSS data file the (quantitative) dependent (response) variable, all potential quantitative independent (explanatory or predictor) variables, and all potential qualitative independent (explanatory or predictor) variables.

Step 2: For each of the qualitative independent (explanatory or predictor) variables, add the appropriate dummy variable(s) to the data file.  For a qualitative variable with k categories, this can be done by defining dummy variables, where the first dummy variable is equal to 1 (one) for category #1 and 0 (zero) otherwise, the second dummy variable is equal to 1 (one) for category #2 and 0 (zero) otherwise, etc.; since k - 1 dummy variables are sufficient to represent a qualitative variable with k categories, the kth dummy variable should be omitted, because including all k dummy variables together with the intercept would create perfect multicollinearity.

Step 3: Select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

Step 4: From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Dependent slot; then select each potential independent (explanatory or predictor) variable, and click on the arrow pointing toward the Independent(s) section (where selection of more than one variable is permitted).

Step 5: In the Method slot, select the desired method for variable selection (such as the Stepwise option).

Step 6: Click on the Statistics button to display the Linear Regression: Statistics dialog box.  In the Regression Coefficients section, make certain that the Estimates option is selected, and select the R squared change option; also, select the Confidence intervals option to set the desired (two-sided) confidence level (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).  This will generate confidence intervals for the coefficients and for the intercept in the regression.

Step 7: Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.
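The logic behind the Stepwise method can be sketched in pure Python.  This simplified illustration greedily adds whichever candidate predictor raises R-squared the most and stops when the gain falls below a threshold; SPSS's actual Stepwise method uses F-to-enter and F-to-remove probabilities and can also drop previously entered variables, so treat this only as a sketch of the idea.  All data values below are made up:

```python
def ols_r2(xcols, y):
    """R-squared of an OLS fit with intercept; xcols is a list of predictor columns."""
    n = len(y)
    cols = [[1.0] * n] + xcols                       # design matrix, by column
    p = len(cols)
    # normal equations (X'X) b = X'y, solved by Gauss-Jordan elimination
    a = [[sum(cols[i][k] * cols[j][k] for k in range(n)) for j in range(p)]
         + [sum(cols[i][k] * y[k] for k in range(n))] for i in range(p)]
    for i in range(p):
        piv = max(range(i, p), key=lambda r: abs(a[r][i]))
        a[i], a[piv] = a[piv], a[i]
        for r in range(p):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [a[r][c] - f * a[i][c] for c in range(p + 1)]
    b = [a[i][p] / a[i][i] for i in range(p)]
    fitted = [sum(b[j] * cols[j][k] for j in range(p)) for k in range(n)]
    ybar = sum(y) / n
    sse = sum((y[k] - fitted[k]) ** 2 for k in range(n))
    sst = sum((yk - ybar) ** 2 for yk in y)
    return 1.0 - sse / sst

def forward_select(candidates, y, min_gain=0.01):
    """Greedy forward selection: names of predictors chosen by R-squared gain."""
    chosen, r2 = [], 0.0
    while len(chosen) < len(candidates):
        best = max((name for name in candidates if name not in chosen),
                   key=lambda name: ols_r2([candidates[c] for c in chosen]
                                           + [candidates[name]], y))
        new_r2 = ols_r2([candidates[c] for c in chosen] + [candidates[best]], y)
        if new_r2 - r2 < min_gain:
            break
        chosen.append(best)
        r2 = new_r2
    return chosen

# Made-up data: y depends on x1 and x2 exactly; x3 is irrelevant
candidates = {"x1": [1, 2, 3, 4, 5, 6],
              "x2": [2, 1, 4, 3, 6, 5],
              "x3": [2, 5, 1, 6, 3, 4]}
y = [a + 2 * b for a, b in zip(candidates["x1"], candidates["x2"])]
chosen = forward_select(candidates, y)   # selects x1 and x2, skips x3
```

Whatever selection rule is used, the resulting model should still be checked with the residual diagnostics described in the previous subsection.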

 

        (4) Performing a Stepwise Binary Logistic Regression to Build a Model

(This subsection is still under construction.)

Step 1: Identify in the SPSS data file the binary (dichotomous) dependent (response) variable, all potential quantitative independent (explanatory or predictor) variables, and all potential qualitative independent (explanatory or predictor) variables.

Step 2: For each of the qualitative independent (explanatory or predictor) variables, add the appropriate dummy variable(s) to the data file.  For a qualitative variable with k categories, this can be done by defining dummy variables, where the first dummy variable is equal to 1 (one) for category #1 and 0 (zero) otherwise, the second dummy variable is equal to 1 (one) for category #2 and 0 (zero) otherwise, etc.; since k - 1 dummy variables are sufficient to represent a qualitative variable with k categories, the kth dummy variable should be omitted, because including all k dummy variables together with the intercept would create perfect multicollinearity.

Step 3: Select the Analyze > Regression > Binary Logistic options to display the Logistic Regression dialog box.

Step 4: From the list of variables on the left, select the binary dependent (response) variable, and click on the arrow pointing toward the Dependent slot; then select each potential independent (explanatory or predictor) variable, and click on the arrow pointing toward the Covariates section (where selection of more than one variable is permitted).

Step 5: In the Method slot, select the desired method for variable selection (such as the Forward: LR option for forward stepwise selection based on the likelihood ratio).

Step 6: Click on the Options button to display the Logistic Regression: Options dialog box.  Select the CI for exp(B) option, and enter the desired (two-sided) confidence level in the adjacent slot (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 95%).  This will generate confidence intervals for the odds ratios corresponding to the coefficients in the regression.

Step 7: Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.

 

        (5) Performing a Two-Way ANOVA (Analysis of Variance) with Checks of Equal Variance and Normality Assumptions

Step 1: Identify in the SPSS data file the quantitative (response) variable for which means are to be compared and the two qualitative (independent) variables which define the groups among which the means are to be compared; then select a significance level α.

Step 2: In order to examine residuals for non-normality, first add the appropriate dummy variables to the data file for each of the two qualitative (independent) variables as follows (but if no check for non-normality in residuals is desired, skip to Step 7.):  If one of the qualitative variables is defined by r categories, and the other is defined by c categories, then create r - 1 dummy variables to define the qualitative variable defined by r categories, and create c - 1 dummy variables to define the qualitative variable defined by c categories.  Finally, create the (r - 1)(c - 1) dummy variables which result from the products of one of the r - 1 dummy variables and one of the c - 1 dummy variables.
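The construction in Step 2 can be sketched for a single observation as follows (a hypothetical Python illustration; the factor levels are made up, and the last category of each factor is the omitted reference):

```python
def two_way_dummies(a_level, b_level, a_cats, b_cats):
    """Main-effect dummies for each factor plus their products, for one case."""
    ad = [1 if a_level == c else 0 for c in a_cats[:-1]]   # r - 1 dummies
    bd = [1 if b_level == c else 0 for c in b_cats[:-1]]   # c - 1 dummies
    products = [x * y for x in ad for y in bd]             # (r - 1)(c - 1) products
    return ad, bd, products

# Made-up factors with r = 3 and c = 2 categories
ad, bd, products = two_way_dummies("B", "high", ["A", "B", "C"], ["low", "high"])
```

The product columns carry the interaction effect, so the full set of (r - 1) + (c - 1) + (r - 1)(c - 1) dummies reproduces the two-way ANOVA model as a regression.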

Step 3: Select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

Step 4: From the list of variables on the left, select the quantitative (response) variable, and click on the arrow pointing toward the Dependent slot; then select each dummy variable created in Step 2, and click on the arrow pointing toward the Independent(s) section (where selection of more than one variable is permitted).

Step 5: Click on the Plots button to display the Linear Regression: Plots dialog box.  In the Standardized Residual Plots section of the Linear Regression: Plots dialog box, select the Histogram option and the Normal probability plot option.  This will generate a histogram and a normal probability plot for standardized residuals, so that a decision can be made as to whether the normality assumption is satisfied.

Step 6: Click on the Continue button, and then click on the OK button, after which the SPSS output containing an ANOVA table and a normal probability plot for standardized residuals will be generated.

Step 7: Select the Analyze > General Linear Model > Univariate options to display the Univariate dialog box.

Step 8: From the list of variables on the left, select the quantitative (response) variable, and click on the arrow pointing toward the Dependent Variable slot.

Step 9: From the list of variables on the left, select the two qualitative (independent) variables, and click on the arrow pointing toward the Fixed Factor(s) section.

Step 10: Click on the Post Hoc button to display the Univariate: Post Hoc Multiple Comparisons for Observed Means dialog box; from the list in the Factor(s) section on the left, identify each variable name which represents a qualitative variable having more than two categories, and select these for the Post Hoc Tests for section on the right.  From the Equal Variances Assumed section, select a desired multiple comparison procedure (such as Bonferroni); if it is deemed necessary later, an option from the Equal Variances Not Assumed section can be used.  These multiple comparison procedures are generally needed when one or both main effects are statistically significant.  Click on the Continue button to return to the Univariate dialog box.

Step 11: Click on the EM Means button to display the Univariate: Estimated Marginal Means dialog box.  Select each item from the list in the Factor(s) and Factor Interactions section on the left for the Display Means for section on the right.  Click on the Continue button to return to the Univariate dialog box.

Step 12: Click on the Options button to display the Univariate: Options dialog box.  In the Display section, select Homogeneity tests to generate results for Levene’s test concerning the equal-variance assumption, and, if sample sizes are not equal for all cells, select Descriptive statistics to get row and column means calculated from individual observations in addition to those calculated from cell means; if desired, select Estimates of effect size and/or Observed power.  Click on the Continue button to return to the Univariate dialog box.

Step 13: Click on the Plots button to display the Univariate: Profile Plots dialog box.  To generate one of the two possible interaction plots, select one of the two variables from the Factor(s) section on the left for the Horizontal Axis slot on the right, and select the other variable for the Separate Lines slot on the right; then, click the Add button to add this plot to the Plots section; to generate the other possible interaction plot, repeat this with the roles of the variables reversed.  Click on the Continue button to return to the Univariate dialog box.

Step 14: Click on the OK button, after which the SPSS output will be generated.

An appropriate graphical display for significant interaction in a two‑way ANOVA is one of the plots created in Step 13; however, if there is no significant interaction, then an appropriate graphical display for significant main effects is multiple box plots, which can be obtained by following the steps labeled “Creating Box Plots for Two or More Groups” in Section II(3) of this document.
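Each line in a profile (interaction) plot from Step 13 connects cell means; roughly parallel lines suggest little interaction.  A hypothetical Python sketch of computing those cell means (the factor levels and response values are made up):

```python
from collections import defaultdict

def cell_means(rows):
    """rows: (factor A level, factor B level, response) triples -> cell means."""
    cells = defaultdict(list)
    for a, b, y in rows:
        cells[(a, b)].append(y)
    return {cell: sum(ys) / len(ys) for cell, ys in cells.items()}

# Made-up treatment-by-sex design
data = [("drug", "male", 10), ("drug", "male", 12), ("drug", "female", 14),
        ("placebo", "male", 8), ("placebo", "female", 9), ("placebo", "female", 11)]
means = cell_means(data)
# each profile-plot line connects the means for one level of the Separate Lines factor
```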

 

        (6) Performing a One-Way ANCOVA (Analysis of Covariance) with Checks of Equal Variance and Normality Assumptions

Step 1: Identify in the SPSS data file the quantitative (response) variable for which means are to be compared, the qualitative independent variable which defines the groups among which the means are to be compared, and each quantitative variable which is a covariate.

Step 2: In order to use one-way ANCOVA, the assumption of no interaction between the qualitative independent variable and each covariate must be satisfied; to check the validity of this assumption, select the Analyze > General Linear Model > Univariate options to display the Univariate dialog box.

Step 3: From the list of variables on the left, select the quantitative (response) variable, and click on the arrow pointing toward the Dependent Variable slot.

Step 4: From the list of variables on the left, select the qualitative independent variable, and click on the arrow pointing toward the Fixed Factor(s) section.

Step 5: From the list of variables on the left, select each of the covariate quantitative independent variables, and click on the arrow pointing toward the Covariate(s) section.

Step 6: Click on the Model button to display the Univariate: Model dialog box, and select the Build terms option in the Specify Model section.

Step 7: From the list of variables in the Factors & Covariates section on the left, select the qualitative independent variable, and click on the arrow in the Build Term(s) section pointing toward the Model section.

Step 8: Repeat the previous step for each quantitative variable which is a covariate.

Step 9: From the list of variables in the Factors & Covariates section on the left, simultaneously select the qualitative independent variable and one quantitative variable which is a covariate (which can be accomplished by using the ctrl key), and click on the arrow in the Build Term(s) section pointing toward the Model section.

Step 10: Repeat the previous step for the qualitative independent variable and each quantitative variable which is a covariate.

Step 11: Verify that Type III is displayed in the Sum of squares slot at the lower left of the dialog box, and click on the Continue button to return to the Univariate dialog box.

Step 12: Click on the Options button to display the Univariate: Options dialog box, and select the Homogeneity tests option in the Display section.

Step 13: Click on the Continue button, and click on the OK button, after which the SPSS output will be generated.

Step 14: The validity of the no-interaction assumption (in Step 2) can be checked by looking at the p-value corresponding to each F test concerning an interaction between the qualitative independent variable and a covariate; if each of these F tests is not statistically significant, then the no-interaction assumption is considered to be satisfied.  The equal variance assumption can be checked by looking at the p-value corresponding to Levene’s F test.

Step 15: If the no-interaction assumption and equal variance assumption are considered to be satisfied, then select a significance level α for the one-way ANCOVA and proceed to the next step; if the no-interaction assumption is not considered to be satisfied, then one-way ANCOVA is NOT an appropriate statistical analysis, and instead of proceeding to the next step, a more complicated regression analysis which models the interaction should be performed.

Step 16: In order to examine residuals for non-normality, first add the appropriate dummy variables to the data file for the qualitative independent variable as follows (but if no check for non-normality in residuals is desired, skip to Step 21.):  If the qualitative variable is defined by k categories, then create k - 1 dummy variables to define this qualitative variable.

Step 17: Select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

Step 18: From the list of variables on the left, select the quantitative (response) variable, and click on the arrow pointing toward the Dependent slot; then select each dummy variable created in Step 16, and click on the arrow pointing toward the Independent(s) section (where selection of more than one variable is permitted); finally, select each covariate, and click on the arrow pointing toward the Independent(s) section.

Step 19: Click on the Plots button to display the Linear Regression: Plots dialog box.  In the Standardized Residual Plots section of the Linear Regression: Plots dialog box, select the Histogram option and the Normal probability plot option.  This will generate a histogram and a normal probability plot for standardized residuals, so that a decision can be made as to whether the normality assumption is satisfied.

Step 20: Click on the Continue button, and then click on the OK button, after which the SPSS output containing an ANOVA table and a normal probability plot for standardized residuals will be generated; this normal probability plot can be examined for significant departures from normality.

Step 21: In order to verify the linearity assumption for the dependent variable and each covariate (i.e., each quantitative independent (explanatory or predictor) variable), select the Graphs > Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot dialog box (but if no check for linearity is desired, skip to Step 26.).

Step 22: Make certain that the Simple Scatter option is selected; then, click on the Define button to display the Simple Scatterplot dialog box.

Step 23: From the list of variables on the left, select the quantitative dependent (response) variable, and click on the arrow pointing toward the Y-Axis slot; then select one of the covariates, and click on the arrow pointing toward the X-Axis slot.

Step 24: Click on the OK button, after which SPSS output displaying a scatter plot will be generated.  To have the least squares line appear on the scatter plot, first double click on the graph to enter the SPSS Chart Editor, then select the Elements > Fit Line at Total options from the main menu (and close the dialog box which appears); selecting the File > Close options will then close the Chart Editor.  By examining how the points on the scatter plot are distributed around the least squares line, a decision can be made as to whether the linearity assumption about the relationship between the dependent and independent variables is satisfied.

Step 25: Repeat Steps #21 to #24 for each covariate.  Each scatter plot can be examined for significant departures from linearity.

Step 26: In order to do the one-way ANCOVA, select the Analyze > General Linear Model > Univariate options to display the Univariate dialog box.

Step 27: From the list of variables on the left, select the quantitative (response) variable, and click on the arrow pointing toward the Dependent Variable slot.

Step 28: From the list of variables on the left, select the qualitative independent variable, and click on the arrow pointing toward the Fixed Factor(s) section.

Step 29: From the list of variables on the left, select each of the covariates, and click on the arrow pointing toward the Covariate(s) section.

Step 30: Click on the Model button to display the Univariate: Model dialog box, select the Full factorial option in the Specify Model section, verify that Type III is displayed in the Sum of squares slot at the lower left of the dialog box, and click on the Continue button to return to the Univariate dialog box.

Step 31: Click on the Options button to display the Univariate: Options dialog box.  In the Display section, deselect Homogeneity tests (the results of which were already generated in an earlier step) and, if desired, select Descriptive statistics and/or Estimates of effect size.  Click on the Continue button to return to the Univariate dialog box.

Step 32: Click on the EM Means button to display the Univariate: Estimated Marginal Means dialog box.  Select each item from the list in the Factor(s) and Factor Interactions section on the left for the Display Means for section on the right.  Immediately below the Display Means for section select the Compare main effects option, and immediately below this in the Confidence interval adjustment drop-down box select a desired multiple comparison procedure (such as Bonferroni).  Click on the Continue button to return to the Univariate dialog box.

Step 33: Click on the OK button, after which the SPSS output will be generated; if desired, separate least squares regression equations for each category of the qualitative independent variable can be obtained from the instructions in the remaining steps; otherwise, there is no need to proceed beyond this step.

Step 34: Based on the statistically significant differences among parallel regressions found, decide which of the k - 1 dummy variables created in Step 16 should be included in the regression along with the covariates, and then select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

Step 35: From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Dependent slot; then select each of the dummy variables chosen in the previous step and each covariate, and click on the arrow pointing toward the Independent(s) section.

Step 36: Click on the OK button, after which the SPSS output will be generated.
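The estimated marginal means requested in Step 32 are covariate-adjusted group means: under the standard ANCOVA model, each equals the observed group mean minus the pooled within-group slope times the group's covariate imbalance.  A hypothetical Python sketch with made-up data:

```python
def adjusted_means(groups):
    """groups: {name: (covariate values, response values)} -> adjusted group means."""
    all_x = [x for xs, _ in groups.values() for x in xs]
    grand_x = sum(all_x) / len(all_x)
    sxy = sxx = 0.0
    for xs, ys in groups.values():
        xb, yb = sum(xs) / len(xs), sum(ys) / len(ys)
        sxy += sum((x - xb) * (y - yb) for x, y in zip(xs, ys))
        sxx += sum((x - xb) ** 2 for x in xs)
    b = sxy / sxx                          # pooled within-group slope
    return {g: sum(ys) / len(ys) - b * (sum(xs) / len(xs) - grand_x)
            for g, (xs, ys) in groups.items()}

# Made-up data: within each group, y = x + group effect (A: +1, B: +3)
adj = adjusted_means({"A": ([1, 2, 3], [2, 3, 4]),
                      "B": ([3, 4, 5], [6, 7, 8])})
# the adjusted means differ by the group-effect difference (2), not the raw difference (4)
```

This is why ANCOVA can detect a group difference that the raw means exaggerate (or hide) when the groups differ on the covariate.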

 

        (7) Performing a Two-Way ANCOVA (Analysis of Covariance) with Checks of Equal Variance and Normality Assumptions

Step 1: Identify in the SPSS data file the quantitative (response) variable for which means are to be compared, the two qualitative independent variables which define the groups among which the means are to be compared, and each quantitative variable which is a covariate.

Step 2: In order to use two-way ANCOVA, the assumption of no interaction between each qualitative independent variable and each covariate must be satisfied; to check the validity of this assumption and also check the equal variance assumption, select the Analyze > General Linear Model > Univariate options to display the Univariate dialog box.

Step 3: From the list of variables on the left, select the quantitative (response) variable, and click on the arrow pointing toward the Dependent Variable slot.

Step 4: From the list of variables on the left, select each of the two qualitative independent variables, and click on the arrow pointing toward the Fixed Factor(s) section.

Step 5: From the list of variables on the left, select each of the covariate quantitative independent variables, and click on the arrow pointing toward the Covariate(s) section.

Step 6: Click on the Model button to display the Univariate: Model dialog box, and select the Build terms option in the Specify Model section.

Step 7: From the list of variables in the Factors & Covariates section on the left, select one of the two qualitative independent variables, and click on the arrow in the Build Term(s) section pointing toward the Model section; repeat this for the other qualitative independent variable and for each quantitative variable which is a covariate.

Step 8: From the list of variables in the Factors & Covariates section on the left, simultaneously select both of the qualitative independent variables and one quantitative variable which is a covariate (which can be accomplished by using the ctrl key), and click on the arrow in the Build Term(s) section pointing toward the Model section; repeat this for each quantitative variable which is a covariate.

Step 9: Verify that Type III is displayed in the Sum of squares slot at the lower left of the dialog box, and click on the Continue button to return to the Univariate dialog box.

Step 10: Click on the Options button to display the Univariate: Options dialog box, and select the Homogeneity tests option in the Display section.

Step 11: Click on the Continue button, and click on the OK button, after which the SPSS output will be generated.

Step 12: The validity of the no-interaction assumption (in Step 2) can be checked by looking at the p-value corresponding to each F test concerning an interaction between the two qualitative independent variables and a covariate; if each of these F tests is not statistically significant, then the no-interaction assumption is considered to be satisfied.  The equal variance assumption can be checked by looking at the p-value corresponding to Levene’s F test.

Step 13: If the no-interaction assumption and equal variance assumption are considered to be satisfied, then select a significance level α for the two-way ANCOVA and proceed to the next step; if the no-interaction assumption is not considered to be satisfied, then two-way ANCOVA is NOT an appropriate statistical analysis, and instead of proceeding to the next step, a more complicated regression analysis which models the interaction should be performed.

Step 14: In order to examine residuals for non-normality, first add the appropriate dummy variables to the data file for each of the two qualitative (independent) variables as follows (but if no check for non-normality in residuals is desired, skip to Step 19.):  If one of the qualitative variables is defined by r categories, and the other is defined by c categories, then create r - 1 dummy variables to define the qualitative variable defined by r categories, and create c - 1 dummy variables to define the qualitative variable defined by c categories.  Finally, create the (r - 1)(c - 1) dummy variables which result from the products of one of the r - 1 dummy variables and one of the c - 1 dummy variables.

Step 15: Select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

Step 16: From the list of variables on the left, select the quantitative (response) variable, and click on the arrow pointing toward the Dependent slot; then select each dummy variable created in Step 14, and click on the arrow pointing toward the Independent(s) section (where selection of more than one variable is permitted); finally, select each covariate, and click on the arrow pointing toward the Independent(s) section.

Step 17: Click on the Plots button to display the Linear Regression: Plots dialog box.  In the Standardized Residual Plots section of the Linear Regression: Plots dialog box, select the Histogram option and the Normal probability plot option.  This will generate a histogram and a normal probability plot for standardized residuals, so that a decision can be made as to whether the normality assumption is satisfied.

Step 18: Click on the Continue button, and then click on the OK button, after which the SPSS output containing an ANOVA table and a normal probability plot for standardized residuals will be generated; this normal probability plot can be examined for significant departures from normality.

Step 19: In order to verify the linearity assumption for the dependent variable and each covariate (i.e., each quantitative independent (explanatory or predictor) variable), select the Graphs > Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot dialog box (but if no check for linearity is desired, skip to Step 24.).

Step 20: Make certain that the Simple Scatter option is selected; then, click on the Define button to display the Simple Scatterplot dialog box.

Step 21: From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Y-Axis slot; then select one of the covariates, and click on the arrow pointing toward the X-Axis slot.

Step 22: Click on the OK button, after which SPSS output displaying a scatter plot will be generated.  To have the least squares line appear on the scatter plot, first double click on the graph to enter the SPSS Chart Editor, then select the Elements > Fit Line at Total options from the main menu (and close the dialog box which appears); selecting the File > Close options will then close the Chart Editor.  By examining how the points on the scatter plot are distributed around the least squares line, a decision can be made as to whether the linearity assumption about the relationship between the dependent and independent variables is satisfied.

Step 23: Repeat Steps #19 to #22 for each covariate.  Each scatter plot can be examined for significant departures from linearity.

Step 24: In order to do the two-way ANCOVA, select the Analyze > General Linear Model > Univariate options to display the Univariate dialog box.

Step 25: From the list of variables on the left, select the quantitative (response) variable, and click on the arrow pointing toward the Dependent Variable slot.

Step 26: From the list of variables on the left, select each of the two qualitative independent variables, and click on the arrow pointing toward the Fixed Factor(s) section.

Step 27: From the list of variables on the left, select each of the covariates, and click on the arrow pointing toward the Covariate(s) section.

Step 28: Click on the Model button to display the Univariate: Model dialog box, select the Full factorial option in the Specify Model section, verify that Type III is displayed in the Sum of squares slot at the lower left of the dialog box, and click on the Continue button to return to the Univariate dialog box.

Step 29: Click on the Options button to display the Univariate: Options dialog box.  In the Display section, deselect Homogeneity tests (the results of which were already generated in an earlier step) and, if desired, select Descriptive statistics and/or Estimates of effect size.  Click on the Continue button to return to the Univariate dialog box.

Step 30: Click on the EM Means button to display the Univariate: Estimated Marginal Means dialog box.  Select each item from the list in the Factor(s) and Factor Interactions section on the left for the Display Means for section on the right.  Immediately below the Display Means for section select the Compare main effects option, and immediately below this in the Confidence interval adjustment drop-down box select a desired multiple comparison procedure (such as Bonferroni).  Click on the Continue button to return to the Univariate dialog box.

Step 31: Click on the Plots button to display the Univariate: Profile Plots dialog box.  To generate one of the two possible interaction plots, select one of the two variables from the Factor(s) section on the left for the Horizontal Axis slot on the right, and select the other variable for the Separate Lines slot on the right; then, click the Add button to add this plot to the Plots section; to generate the other possible interaction plot, repeat this with the roles of the variables reversed.  Click on the Continue button to return to the Univariate dialog box.

Step 32: Click on the OK button, after which the SPSS output will be generated; if desired, separate least squares regression equations for each category of the qualitative independent variable can be obtained from the instructions in the remaining steps; otherwise, there is no need to proceed beyond this step.

Step 33: Based on the statistically significant differences among parallel regressions found, create the dummy variables which should be included in the regression (some of which may have already been created in Step 14) along with the statistically significant covariates; then select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

Step 34: From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Dependent slot; then select each of the dummy variables chosen in the previous step and each covariate, and click on the arrow pointing toward the Independent(s) section.

Step 35: Click on the OK button, after which the SPSS output will be generated.

 

        (8) Performing a Repeated Measures Within-Subjects ANOVA with Checks of Sphericity and Normality Assumptions

Step 1: Identify in the SPSS data file the quantitative (response or dependent) variables whose means are to be compared.  (The normality assumption can be checked by using the steps in (2) of the Data Diagnostics section.)

Step 2: Select the Analyze > General Linear Model > Repeated Measures options to display the Repeated Measures Define Factor(s) dialog box.

Step 3: In the Within-Subject Factor Name slot, you can change the default name (factor1) to a more appropriate name for the group of quantitative (response or dependent) variables.

Step 4: Enter the number of quantitative (response or dependent) variables in the Number of Levels slot, and click on the Add button.  (This number of dependent variables is typically greater than 2; if it is equal to 2, then performing a repeated measures ANOVA is equivalent to performing a paired t test.)

Step 5: Click on the Define button to display the Repeated Measures dialog box, and from the list of variables on the left, select each of the quantitative (response or dependent) variables, and click on the arrow pointing toward the Within-Subjects Variables section to replace each “_?_” which you will see there.

Step 6: Click on the Options button to display the Repeated Measures: Options dialog box.  In the Display section, select Descriptive statistics and/or Estimates of effect size.  Click on the Continue button to return to the Repeated Measures dialog box.

Step 7: Click on the EM Means button to display the Repeated Measures: Estimated Marginal Means dialog box.  Select each item from the list in the Factor(s) and Factor Interactions section on the left for the Display Means for section on the right.  Immediately below the Display Means for section, select the Compare main effects option, and immediately below this, in the Confidence interval adjustment drop-down box, select a desired multiple comparison procedure (such as Bonferroni).  Click on the Continue button to return to the Repeated Measures dialog box.

Step 8: Click on the Plots button to display the Repeated Measures: Profile Plots dialog box.  Select the factor name from the Factors section on the left for the Horizontal Axis slot on the right; then, click the Add button to add this plot to the Plots section.  Click on the Continue button to return to the Repeated Measures dialog box.

Step 9: Click on the OK button, after which the SPSS output will be generated.
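As a rough cross-check of the ANOVA table SPSS produces, the within-subjects F statistic can be computed by hand. The sketch below is a minimal pure-Python example with hypothetical scores for four subjects measured under three conditions; it partitions the total sum of squares into condition, subject, and error components.

```python
# Minimal sketch of a one-way repeated measures (within-subjects) ANOVA,
# using hypothetical data: rows are subjects, columns are conditions.
data = [
    [1, 2, 3],
    [2, 3, 4],
    [3, 4, 5],
    [4, 6, 5],
]

n = len(data)          # number of subjects
k = len(data[0])       # number of conditions (within-subjects levels)
grand = sum(sum(row) for row in data) / (n * k)

cond_means = [sum(row[j] for row in data) / n for j in range(k)]
subj_means = [sum(row) / k for row in data]

ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
ss_total = sum((x - grand) ** 2 for row in data for x in row)
ss_error = ss_total - ss_cond - ss_subj   # condition-by-subject interaction

df_cond, df_error = k - 1, (n - 1) * (k - 1)
f_stat = (ss_cond / df_cond) / (ss_error / df_error)
print(df_cond, df_error, round(f_stat, 2))   # 2 6 13.0
```

With only two conditions, the F computed this way equals the square of the paired t statistic, consistent with the remark in Step 4.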

 

        (9) Performing a Repeated Measures Mixed Between-Within-Subjects ANOVA with Checks of Sphericity, Equal Variance and Covariance, and Normality Assumptions

Step 1: Identify in the SPSS data file the quantitative (response or dependent) variables whose means are to be compared and the qualitative independent variable which defines the groups among which the means are to be compared.  (The normality assumption can be checked by using the steps in (2) of the Data Diagnostics section.)

Step 2: Select the Analyze > General Linear Model > Repeated Measures options to display the Repeated Measures Define Factor(s) dialog box.

Step 3: In the Within-Subject Factor Name slot, you can change the default name factor(1) to a more appropriate name for the group of quantitative (response or dependent) variables.

Step 4: Enter the number of quantitative (response or dependent) variables in the Number of Levels slot, and click on the Add button.

Step 5: Click on the Define button to display the Repeated Measures dialog box, and from the list of variables on the left, select each of the quantitative (response or dependent) variables, and click on the arrow pointing toward the Within-Subjects Variables section to replace each “_?_” which you will see there; then select the qualitative independent variable from the list of variables on the left, and click on the arrow pointing toward the Between-Subjects Variables section.

Step 6: Click on the Options button to display the Repeated Measures: Options dialog box.  In the Display section, select Descriptive statistics and/or Estimates of effect size.  Click on the Continue button to return to the Repeated Measures dialog box.

Step 7: Click on the EM Means button to display the Repeated Measures: Estimated Marginal Means dialog box.  Select each item from the list in the Factor(s) and Factor Interactions section on the left for the Display Means for section on the right.  Immediately below the Display Means for section, select the Compare main effects option, and immediately below this, in the Confidence interval adjustment drop-down box, select a desired multiple comparison procedure (such as Bonferroni).  Click on the Continue button to return to the Repeated Measures dialog box.

Step 8: If the qualitative independent variable has only 2 categories, then go to Step 9; if the qualitative independent variable has more than 2 categories, then click on the Post Hoc button to display the Repeated Measures: Post Hoc Multiple Comparisons for Observed Means dialog box; from the list in the Factor(s) section on the left, select the name of the qualitative variable having more than two categories, and use the arrow button to move this name into the Post Hoc Tests for section on the right.  From the Equal Variances Assumed section, select a desired multiple comparison procedure (such as Bonferroni); if it is deemed necessary later, an option from the Equal Variances Not Assumed section can be used.  These multiple comparison procedures are generally needed when one or both main effects are statistically significant.  Click on the Continue button to return to the Repeated Measures dialog box.

Step 9: Click on the Plots button to display the Repeated Measures: Profile Plots dialog box.  To generate one of the two possible interaction plots, select the factor name from the Factors section on the left for the Horizontal Axis slot on the right, and select the name of the qualitative independent variable from the Factors section on the left for the Separate Lines slot on the right; then, click the Add button to add this plot to the Plots section; to generate the other possible interaction plot, repeat this with the roles of the factor name and the name of the qualitative independent variable reversed.  Click on the Continue button to return to the Repeated Measures dialog box.

Step 10: Click on the OK button, after which the SPSS output will be generated.
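The profile plots requested in Step 9 display one mean per combination of the within-subjects factor and the between-subjects group. These cell means can be checked directly outside SPSS; the sketch below uses hypothetical long-format records with illustrative group and time labels.

```python
from collections import defaultdict

# Hypothetical long-format records: (group, time, score).
records = [
    ("control",   "t1", 10), ("control",   "t2", 12),
    ("control",   "t1", 14), ("control",   "t2", 16),
    ("treatment", "t1", 11), ("treatment", "t2", 20),
    ("treatment", "t1", 13), ("treatment", "t2", 22),
]

cells = defaultdict(list)
for group, time, score in records:
    cells[(group, time)].append(score)

# One mean per (group, time) cell -- the points on a profile plot.
cell_means = {cell: sum(v) / len(v) for cell, v in cells.items()}
print(cell_means)
```

Roughly parallel profile lines suggest little interaction; clearly non-parallel lines (as in this hypothetical data) suggest an interaction between the two factors.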


 

        (10) Performing a Logistic Regression with Checks for Multicollinearity

 

       1.       Identify in the SPSS data file the (qualitative-dichotomous) dependent (response) variable, all quantitative independent (explanatory or predictor) variables, and all qualitative independent (explanatory or predictor) variables.

       2.       For each of the qualitative independent (explanatory or predictor) variables, add the appropriate dummy variable(s) to the data file.  For a qualitative variable with k categories, this can be done by defining dummy variables, where the first dummy variable is equal to 1 (one) for category #1 and 0 (zero) otherwise, the second dummy variable is equal to 1 (one) for category #2 and 0 (zero) otherwise, etc.; since k - 1 dummy variables are sufficient to represent a qualitative variable with k categories, the kth dummy variable is not really necessary, and may or may not be used.

       3.       The multiple regression routine can be used to check for multicollinearity (but is not appropriate for statistical analysis, since the dependent variable is not quantitative); select the Analyze > Regression > Linear options, select the (qualitative-dichotomous) dependent (response) variable for the Dependent slot, and select all quantitative independent variables and all dummy variables representing qualitative independent variables for the Independent(s) section.  Click on the Statistics button, and in the dialog box which appears, select the Collinearity diagnostics option.  Click the Continue button to close the dialog box, and click the OK button to obtain the desired SPSS output.  The desired values for tolerance and VIF are all available in the Coefficients table of the output.

       4.       Select the Graphs > Legacy Dialogs > Scatter/Dot options, click on the Simple Scatter option, and click on the Define button to display the Simple Scatterplot dialog box.

       5.       From the list of variables on the left, select the dependent (response) variable, and click on the arrow pointing toward the Y-Axis slot; then select one of the quantitative independent (explanatory or predictor) variables, and click on the arrow pointing toward the X-Axis slot.

       6.       Click on the OK button, after which SPSS output displaying a scatter plot will be generated.  In order to have the least squares line appear on the scatter plot, first double click on the graph to enter the SPSS Chart Editor, and then select the Elements > Fit Line at Total options from the main menu (and close the dialog box which appears), after which selecting the File > Close options will close the Chart Editor.  By examining how the points on the scatter plot are distributed around the least squares line, a decision can be made as to whether the linearity assumption about the relationship between the dependent and independent variables is satisfied.

       7.       Repeat Steps #4 to #6 for each of the quantitative independent (explanatory or predictor) variables.

       8.       Verify that the dummy variable(s) created in Step #2 for each of the qualitative independent (explanatory or predictor) variables are present in the data file.

       9.       Select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

     10.     From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Dependent slot; then select each independent (explanatory or predictor) variable, and click on the arrow pointing toward the Independent(s) section (where selection of more than one variable is permitted).

     11.     Click on the Plots button to display the Linear Regression: Plots dialog box.  From the list of variables on the left, select ZRESID, and click on the arrow pointing toward the Y slot; then select ZPRED, and click on the arrow pointing toward the X slot.  This will generate a scatter plot with standardized predicted values on the horizontal axis and standardized residuals on the vertical axis, so that a decision can be made as to whether the homoscedasticity assumption is satisfied.

     12.     In the Standardized Residual Plots section of the Linear Regression: Plots dialog box, select the Histogram option and the Normal probability plot option.  This will generate a histogram and a normal probability plot for standardized residuals, so that a decision can be made as to whether the normality assumption is satisfied.

     13.     Click on the Continue button, and then click on the Statistics button to display the Linear Regression: Statistics dialog box.  In the Regression Coefficients section, make certain that the Estimates option is selected, select the Descriptives option to generate means, standard deviations, and the Pearson correlations, and select the Collinearity diagnostics option to generate information about whether multicollinearity could be a problem; also, select the Confidence intervals option to set the desired (two-sided) confidence level (which will generally be 100% minus the significance level, e.g., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).  This will generate confidence intervals for each slope coefficient and for the intercept in the regression.

     14.     Click on the Continue button, and then click on the Save button to display the Linear Regression: Save dialog box. In the Residuals section, select the Standardized option to save the standardized residuals as part of the data.  This allows further analysis to be performed using the standardized residuals.

     15.     Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.

     16.     The scatter plot with standardized predicted values on the horizontal axis and standardized residuals on the vertical axis, requested in Step #11, will be displayed on the SPSS output without a horizontal line at zero; since this line can be helpful in examining this plot, the instructions in Step #6 to have the least squares line appear on a scatter plot can be used to add this horizontal line at zero.
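Regarding the tolerance and VIF values read from the Coefficients table in Step #3: the tolerance for a predictor is 1 minus the R-squared obtained when that predictor is regressed on all the other predictors, and VIF is its reciprocal. In the special case of exactly two predictors, that R-squared reduces to their squared Pearson correlation, so the check can be sketched in a few lines of pure Python (hypothetical data):

```python
def pearson_r(x, y):
    # Pearson correlation from deviation sums of squares and cross-products.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical predictors; with only two, the R^2 from regressing one on
# the other equals their squared correlation.
x1 = [1, 2, 3, 4, 5]
x2 = [1, 2, 4, 3, 5]

r = pearson_r(x1, x2)          # 0.9
tolerance = 1 - r ** 2         # 0.19
vif = 1 / tolerance            # about 5.26
print(round(tolerance, 2), round(vif, 2))
```

With more than two predictors the same idea applies, but the R-squared must come from a full regression of each predictor on all of the others, which is what SPSS reports.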

 

Methods to decide which of many predictors are the most important to include in a model are available with SPSS by doing the following:

 

       1.       Select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

       2.       From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Dependent slot; then select each independent (explanatory or predictor) variable, and click on the arrow pointing toward the Independent(s) section (where selection of more than one variable is permitted).

       3.       In the Method slot, select the desired method for variable selection (such as the Stepwise option).

       4.       Click on the Statistics button to display the Linear Regression: Statistics dialog box.  In the Regression Coefficients section, make certain that the Estimates option is selected, and select the R squared change option; also, select the Confidence intervals option to set the desired (two-sided) confidence level (which will generally be 100% minus the significance level, e.g., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).  This will generate confidence intervals for each slope coefficient and for the intercept in the regression.

       5.       Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.
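The R squared change option reports how much R-squared increases when a variable enters the model. For two candidate predictors this increase can be sketched directly from the pairwise Pearson correlations, using the standard formula for the two-predictor R-squared (hypothetical data; an actual stepwise run repeats this kind of comparison at every step):

```python
def pearson_r(x, y):
    # Pearson correlation from deviation sums of squares and cross-products.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical dependent variable and two candidate predictors.
y  = [2, 1, 4, 3, 5]
x1 = [1, 2, 3, 4, 5]
x2 = [1, 3, 2, 5, 4]

r_y1, r_y2, r_12 = pearson_r(y, x1), pearson_r(y, x2), pearson_r(x1, x2)

r2_x1_only = r_y1 ** 2
# Standard two-predictor formula for R^2 in terms of the correlations.
r2_both = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

r2_change = r2_both - r2_x1_only   # gain from adding x2 after x1
print(round(r2_x1_only, 3), round(r2_both, 3), round(r2_change, 3))
```

A stepwise procedure enters or removes, at each step, the variable whose R-squared change best satisfies the entry and removal criteria.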

 

 

 
