
Using SPSS Version 19.0

This document contains instructions on using SPSS to perform various statistical analyses.  The section titles are as follows:

Data Manipulation

Data Diagnostics

Hypothesis Tests Involving One Variable

Hypothesis Tests Involving Two Variables

 

Data Manipulation

 

IMPORTANT:

          The default in SPSS is to use all cases.  In order to use only selected cases in the data file, first select the Data > Select Cases options to display the Select Cases dialog box; then click on the If button to display the Select Cases: If dialog box.  After the desired condition is entered, click on the Continue button, and then click on the OK button; the cases that do not satisfy the condition will then be marked as excluded from data analysis, and a variable named filter_$ will be added to the data.

          In many of the SPSS dialog boxes (generally from clicking an Options button), there will be two choices for handling missing data.  The Exclude cases pairwise choice will have SPSS perform each specific procedure using all cases with no missing data for the variables involved in the given procedure, which implies that with missing data the sample size may not be the same for each procedure performed.  The Exclude cases listwise choice will have SPSS perform each specific procedure using only those cases with no missing data for the variables involved in every procedure that is to be performed, which implies that with missing data the sample size will be the same for each procedure performed.  (Of course when there is no missing data, it makes no difference which of these two choices is made.)
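
The difference between the two missing-data choices can be illustrated outside SPSS.  The following is a minimal Python sketch, not SPSS code; the variable names, the data values, and the use of None to mark a missing value are all assumptions made for illustration.

```python
# Hypothetical data: None marks a missing value.
data = {
    "height": [170, 165, None, 180, 175],
    "weight": [65, None, 70, None, 72],
}

def complete_cases(columns):
    """Return the row indices with no missing value in any of the given columns."""
    n_rows = len(data[columns[0]])
    return [i for i in range(n_rows)
            if all(data[c][i] is not None for c in columns)]

# Two procedures are requested together, one per variable.
procedures = [["height"], ["weight"]]

# Pairwise: each procedure uses every case that is complete for its own variables.
pairwise_n = {tuple(p): len(complete_cases(p)) for p in procedures}

# Listwise: every procedure uses only the cases complete for all variables involved.
all_vars = sorted({v for p in procedures for v in p})
listwise_n = len(complete_cases(all_vars))

print(pairwise_n)   # sample size may differ from procedure to procedure
print(listwise_n)   # one common sample size for every procedure
```

With these values the pairwise sample sizes are 4 and 3, while the listwise sample size is 2 for both procedures.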

 

Creating new variables with transformation of existing variables

 

         1.        Select the Transform > Compute Variable options to display the Compute Variable dialog box.

         2.        In the Target Variable slot, type an appropriate name for the new variable which is to be a function of existing variables.

         3.        In the Numeric Expression section, a formula is needed to indicate how the values for the new variable are to be calculated; this can be accomplished by constructing the appropriate formula in the Numeric Expression section through the selection of variable names from the list of variables on the left and clicking on the arrow pointing toward the Numeric Expression section, together with the selection of algebraic operation buttons from the keypad displayed in the middle of the dialog box.

         4.        Click on the OK button, after which the new variable should be added to the data.

 

Creating new variables by recoding existing variables

 

         1.        Select the Transform > Recode into Different Variables options to display the Recode into Different Variables dialog box.

         2.        From the list of variables on the left, select the existing variable to be recoded into a new variable, and click on the arrow pointing toward the Numeric Variable -> Output Variable section.

         3.        In the Output Variable section, type an appropriate name in the Name slot for the new variable to be created by recoding, and click on the Change button.  You should now see in the Numeric Variable -> Output Variable section an indication of which existing variable is being recoded into the new variable.

         4.        Click the Old and New Values button to display the Recode into Different Variables: Old and New Values dialog box.

         5.        In the Old Value section click an appropriate option and enter the appropriate information for a value or range of values for the existing variable being recoded; in the New Value section click an appropriate option and enter the appropriate information for a corresponding value for the new variable to be created by recoding; click on the Add button, after which you should see in the Old -> New section an indication of how this recoding will be done.  Repeat this process until all the recoding information has been entered.

         6.        After all the recoding information has been entered, click on the Continue button to return to the Recode into Different Variables dialog box, and then click on the OK button, after which the new variable should appear in the SPSS data file.
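
The logic of recoding into a different variable can be sketched in Python as follows.  This is only an illustration of the idea of Old -> New rules, not SPSS code; the function name recode, the age ranges, and the group codes are hypothetical.

```python
def recode(value, rules, else_value=None):
    """Map an old value to a new one using (low, high, new_value) range rules."""
    if value is None:              # a missing value stays missing
        return None
    for low, high, new_value in rules:
        if low <= value <= high:   # ranges are inclusive at both ends
            return new_value
    return else_value              # no rule matched

# Hypothetical rules: recode age in years into three ordered groups.
age_rules = [(0, 17, 1), (18, 64, 2), (65, 120, 3)]

age = [12, 18, 40, 64, 65, None, 90]
age_group = [recode(a, age_rules) for a in age]   # new variable; age is unchanged

print(age_group)   # [1, 2, 2, 2, 3, None, 3]
```

As with the Recode into Different Variables dialog box, the existing variable is left untouched and the recoded values are stored in a new variable.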

 

 

Data Diagnostics

 

Checking for skewness and non-normality in data

 

         1.        Identify the (quantitative) variable(s) in the SPSS data file to be checked for normality or skewness, and identify the grouping variable if there is one.

         2.        Select the Analyze > Descriptive Statistics > Explore options to display the Explore dialog box.

         3.        From the list of variables on the left, select the desired (quantitative) variable(s), and click on the arrow pointing toward the Dependent List section; if there is a grouping variable in the list, select this variable and click on the arrow pointing toward the Factor section.

         4.        In the Display section of the dialog box, make certain that the Both option is selected.

         5.        Click on the Plots button to display the Explore: Plots dialog box.

         6.        In the Descriptive section of the dialog box, select the Stem-and-leaf option and the Histogram option; also, select the Normality plots with tests option.

         7.        Click on the Continue button to return to the Explore dialog box, and then click on the OK button, after which the SPSS output will be generated.
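
As a rough guide to reading the skewness value in the descriptive statistics, the following Python sketch computes one common bias-adjusted sample skewness and its standard error.  The formulas are standard textbook ones; that they match the values reported by SPSS exactly is an assumption that should be checked against actual output, and the data values are hypothetical.

```python
import math
import statistics

def sample_skewness(values):
    """Bias-adjusted sample skewness; needs at least 3 values that are not all equal."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample standard deviation (divisor n - 1)
    cubed = sum(((v - mean) / sd) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubed

def skewness_std_error(n):
    """Standard error of the skewness statistic for a sample of size n."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

symmetric = [1, 2, 3, 4, 5]
right_skewed = [2, 3, 3, 4, 4, 4, 5, 5, 12]   # one large value stretches the right tail

print(sample_skewness(symmetric))
print(sample_skewness(right_skewed))
print(skewness_std_error(len(right_skewed)))
```

A skewness near zero suggests a roughly symmetric distribution, while a positive value suggests a longer right tail; comparing the skewness with its standard error gives a sense of whether the departure from zero is large.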

 

 

Hypothesis Tests Involving One Variable

 

Performing a one‑sample t test about a mean μ

 

         1.        Identify the (quantitative) variable in the SPSS data file on which the test is to be performed, decide on the hypothesized value for the mean μ, and select a (two‑sided) significance level α.  (More than one (quantitative) variable may be selected on which the test is to be performed simultaneously, but only one hypothesized value for the mean is permitted.)

         2.        Select the Analyze > Compare Means > One‑Sample T Test options to display the One‑Sample T Test dialog box.

         3.        From the list of variables on the left, select the desired (quantitative) variable (and more than one selection is permitted), and click on the arrow pointing toward the Test Variable(s) section.

         4.        Type the hypothesized value for the mean μ in the Test Value slot.

         5.        Click on the Options button to set the desired (two‑sided) confidence level (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).

         6.        Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.  The p-value displayed on the SPSS output is for a two‑sided test; for a one‑sided test, divide this by 2, provided the sample mean differs from the hypothesized value in the direction stated by the alternative hypothesis.  Also, the confidence interval limits displayed on the SPSS output are for the difference between the population mean and the hypothesized value of the mean; adding the hypothesized value for the mean to each of these limits gives the confidence interval limits for the population mean.
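
The conversion of the confidence interval limits described in the last step can be checked by hand.  The following Python sketch is an illustration under stated assumptions: the sample values and the test value are hypothetical, and the critical value 2.262 is the two-sided 95% value for 9 degrees of freedom taken from a t table rather than computed.

```python
import math
import statistics

sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.3, 5.5, 5.7, 5.2, 5.9]  # hypothetical data
test_value = 5.0    # hypothesized value for the mean
t_crit = 2.262      # two-sided 95% critical value for df = 9 (from a t table)

n = len(sample)
mean = statistics.mean(sample)
se = statistics.stdev(sample) / math.sqrt(n)    # standard error of the mean

t_stat = (mean - test_value) / se               # one-sample t statistic, df = n - 1

# Limits for the difference (population mean minus test value).
diff_low = (mean - test_value) - t_crit * se
diff_high = (mean - test_value) + t_crit * se

# Adding the test value to each limit gives the limits for the population mean.
mean_low = diff_low + test_value
mean_high = diff_high + test_value

print(t_stat)
print(diff_low, diff_high)
print(mean_low, mean_high)
```

The last pair of limits is the usual interval, the sample mean plus or minus the critical value times the standard error.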

 

One possible appropriate graphical display for the data used in a one‑sample t test is a box plot, which can be obtained in SPSS by doing the following:

 

         1.        Select the Graphs > Legacy Dialogs > Boxplot options to display the Boxplot dialog box.

         2.        Select the Simple option and the Summaries of separate variables option.

         3.        Click on the Define button to display the Define Simple Boxplot: Summaries of Separate Variables dialog box.

         4.        From the list of variables on the left, select the desired (quantitative) variable, and click on the arrow pointing toward the Boxes Represent section.

         5.        Click on the OK button, after which the SPSS output will be generated.  The box plot will be displayed vertically; in order to display it horizontally, first double click on the graph to enter the SPSS Chart Editor, and then select the Options > Transpose Chart options from the main menu, after which selecting the File > Close options will close the chart editor.

 

 

Performing a chi‑square goodness‑of‑fit test with hypothesized proportions

 

         1.        If the data have already been entered into an SPSS data file, identify the (qualitative) variable on which the test is to be performed in the SPSS data file, and skip to step #7; otherwise, enter the data into SPSS by following the instructions beginning in step #2.

         2.        Go to the Variable View sheet (by clicking on the appropriate tab at the bottom of the screen), and in the first row, enter a name for the (qualitative) variable on which the test is to be performed.

         3.        Define codes for this variable so that 1 (one) represents one category, 2 (two) represents a second category, 3 (three) represents a third category, etc., making certain that all categories of the (qualitative) variable have been included.

         4.        In the second row, enter the variable name count, and since all the counts must be integers, set the entry in the second cell of the Decimals column to 0 (zero).

         5.        Go to the Data View sheet (by clicking on the appropriate tab at the bottom of the screen), and in the column for the (qualitative) variable on which the test is to be performed, enter the codes 1, 2, 3, etc. respectively into the first cell, the second cell, the third cell, etc., making certain that all codes used have been entered.  (If the category labels are not displayed, then select View > Value Labels from the main menu.)

         6.        In the column for the variable count, enter the corresponding counts (i.e., the raw frequency of occurrence in the data for each category); after all the data is entered, it may be a good idea to save this SPSS file using an appropriate file name.

         7.        If each line of the data file represents one case (i.e., the data were not entered in the format described in steps #2 to #6), then select the Analyze > Nonparametric Tests > Legacy Dialogs > Chi‑Square options to display the Chi‑Square Test dialog box, and proceed to the next step; if each line of the data file represents one category of the (qualitative) variable on which the test is to be performed (i.e., the data were entered in the format described in steps #2 to #6), then do the following: First, select the Data > Weight Cases options to display the Weight Cases dialog box; then, select the Weight cases by option, select from the list on the left the variable name for the raw frequency (count) of occurrence in the data for each category, and click on the arrow button pointing toward the Frequency Variable slot; next, click on the OK button; finally, select the Analyze > Nonparametric Tests > Legacy Dialogs > Chi‑Square options to display the Chi‑Square Test dialog box, and proceed to the next step.

         8.        For each category, decide on the hypothesized value for the proportion.

         9.        In the Chi-Square Test dialog box, from the list of the variables on the left, select the (qualitative) variable on which the test is to be performed, and click on the arrow button pointing toward the Test Variable List section.

      10.      Which option should be selected in the Expected Values section depends on the hypothesized proportions.  If the hypothesized proportions are all equal, then select the All categories equal option; if the hypothesized proportions are not all equal, then select the Values option, and enter the hypothesized proportions one at a time in order of the category codes: type the hypothesized proportion for the category coded with the smallest value (which would be 1 if the data were entered in the format described in steps #2 to #6) in the Values slot and click on the Add button; then do the same for the category coded with the next smallest value (which would be 2), then for the category coded with the third smallest value (which would be 3), etc., making certain that the hypothesized proportion for each code used has been entered.  (The order in which these hypothesized proportions are entered must correspond to the numerical order of the codes for the different categories; also, it does not matter whether percentages or proportions are entered, so that, for instance, 45, 35, and 20 could be entered instead of 0.45, 0.35, and 0.20.)

      11.      Click on the OK button, after which the SPSS output will be generated.
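
The arithmetic behind the goodness‑of‑fit test can be sketched in Python as follows.  This is an illustration rather than SPSS code; the observed counts are hypothetical, and rescaling the entered values so that they sum to the total count is an assumption about why proportions and percentages give the same result.

```python
def chi_square_gof(observed, expected_values):
    """Return expected counts, the chi-square statistic, and degrees of freedom.

    expected_values may be proportions or percentages; they are rescaled so
    that the expected counts sum to the total observed count.
    """
    total_obs = sum(observed)
    total_exp = sum(expected_values)
    expected = [total_obs * v / total_exp for v in expected_values]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return expected, statistic, df

# Hypothetical counts for the categories coded 1, 2, and 3, in code order.
observed = [50, 30, 20]

exp_a, stat_a, df = chi_square_gof(observed, [0.45, 0.35, 0.20])   # proportions
exp_b, stat_b, _ = chi_square_gof(observed, [45, 35, 20])          # percentages

print(exp_a, stat_a, df)
print(exp_b, stat_b)
```

Both calls give the same expected counts and the same statistic, which is why the order of the values matters but their scale does not.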

 

One possible appropriate graphical display for the data used in a chi‑square goodness‑of‑fit test is a bar chart, which can be obtained in SPSS by doing the following:

 

         1.        Select the Graphs > Legacy Dialogs > Bar options to display the Bar Charts dialog box.

         2.        Select the Simple option and the Summaries for groups of cases option.

         3.        Click on the Define button to display the Define Simple Bar: Summaries for Groups of Cases dialog box.

         4.        From the list of variables on the left, select the desired (qualitative) variable, and click on the arrow pointing toward the Category Axis slot.

         5.        To display raw frequency, select the N of cases option in the Bars Represent section; to display percentages, select the % of cases option in the Bars Represent section.

         6.        Click on the OK button, after which the SPSS output will be generated.  To edit the bar chart, if desired, double click on the graph to enter the SPSS Chart Editor.  When editing is complete, select the File > Close options to close the chart editor.

 

Another possible appropriate graphical display for the data used in a chi‑square goodness‑of‑fit test is a pie chart, which can be obtained in SPSS by doing the following:

 

         1.        Select the Graphs > Legacy Dialogs > Pie options to display the Pie Charts dialog box.

         2.        Select the Summaries for groups of cases option.

         3.        Click on the Define button to display the Define Pie: Summaries for Groups of Cases dialog box.

         4.        From the list of variables on the left, select the desired (qualitative) variable, and click on the arrow pointing toward the Define Slices by slot.

         5.        To display raw frequency, select the N of cases option; to display percentages, select the % of cases option.

         6.        Click on the OK button, after which the SPSS output will be generated.  To edit the pie chart, if desired, double click on the graph to enter the SPSS Chart Editor.  When editing is complete, select the File > Close options to close the chart editor.

 

Hypothesis Tests Involving Two Variables

 

Performing a paired‑sample t test about a mean difference μd (i.e., a difference between means from dependent samples)

 

         1.        Identify in the SPSS data file both of the (quantitative) variables for which the mean differences are being tested, and select a (two‑sided) significance level α.

         2.        Select the Analyze > Compare Means > Paired‑Samples T Test options to display the Paired‑Samples T Test dialog box.

         3.        From the list of variables on the left, select one of the desired (quantitative) variables, and click on the arrow pointing toward the Paired Variables section; then select the other desired (quantitative) variable, and click on the arrow pointing toward the Paired Variables section.  (Selection of more than one pair is permitted.)

         4.        Click on the Options button to set the desired (two‑sided) confidence level (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).

         5.        Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.  The p-value displayed on the SPSS output is for a two‑sided test; for a one‑sided test, divide this by 2, provided the sample mean difference lies in the direction stated by the alternative hypothesis.
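
A paired‑sample t test is equivalent to a one‑sample t test on the differences with a hypothesized mean difference of zero.  The following Python sketch shows the calculation of the test statistic only; the variable names and data values are hypothetical, and the p-value would still have to be obtained from a t distribution with the stated degrees of freedom.

```python
import math
import statistics

# Hypothetical paired measurements on the same six subjects.
before = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0]
after = [10.0, 14.0, 11.0, 12.0, 12.0, 13.0]

# The order of subtraction determines the sign of the statistic.
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
se_diff = statistics.stdev(diffs) / math.sqrt(n)   # standard error of the mean difference

t_stat = mean_diff / se_diff    # tests a mean difference of zero
df = n - 1

print(diffs)
print(mean_diff, t_stat, df)
```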

 

One possible appropriate graphical display for the data used in a paired‑sample t test is two box plots, one for each variable, which can be obtained in SPSS by doing the following:

 

         1.        Select the Graphs > Legacy Dialogs > Boxplot options to display the Boxplot dialog box.

         2.        Select the Simple option and the Summaries of separate variables option.

         3.        Click on the Define button to display the Define Simple Boxplot: Summaries of Separate Variables dialog box.

         4.        From the list of variables on the left, select one of the desired (quantitative) variables, and click on the arrow pointing toward the Boxes Represent section; then select the other desired (quantitative) variable, and click on the arrow pointing toward the Boxes Represent section.

         5.        Click on the OK button, after which the SPSS output will be generated.  The box plots will be displayed vertically; in order to display them horizontally, first double click on the graph to enter the SPSS Chart Editor, and then select the Options > Transpose Chart options from the main menu, after which selecting the File > Close options will close the chart editor.

 

Another possible appropriate graphical display for the data used in a paired‑sample t test is one box plot of the differences between the two variables, which can be obtained in SPSS by doing the following:

 

         1.        A new variable which is the difference between the two (quantitative) variables must be added to the data; to accomplish this, decide on the order of subtraction for the difference and select the Transform > Compute Variable options to display the Compute Variable dialog box.

         2.        In the Target Variable slot, type an appropriate name for the new variable which is to be the difference between the two (quantitative) variables.

         3.        In the Numeric Expression section, a formula is needed to indicate how the values for the new variable are to be calculated; to accomplish this, select the name of the first variable in the difference from the list of variables on the left, and click on the arrow pointing toward the Numeric Expression section; then select the minus sign button from the keypad displayed in the middle of the dialog box; finally select the name of the second variable in the difference from the list of variables on the left, and click on the arrow pointing toward the Numeric Expression section.

         4.        Click on the OK button, after which the new variable should be added to the data.

         5.        Create a box plot for this new variable by following the steps to create such a box plot in the section titled Performing a one‑sample t test about a mean μ.

 

Performing a two‑sample t test about a difference between means μ1 and μ2

 

         1.        Identify in the SPSS data file both the (qualitative‑dichotomous) variable which defines the two groups being compared and the (quantitative) variable for which the means are being compared, and select a (two‑sided) significance level α.  (More than one (quantitative) variable may be selected on which to compare means simultaneously, but only one grouping variable may be selected.)

         2.        Select the Analyze > Compare Means > Independent‑Samples T Test options to display the Independent‑Samples T Test dialog box.

         3.        From the list of variables on the left, select the desired (quantitative) variable (and more than one selection is permitted), and click on the arrow pointing toward the Test Variable(s) section.

         4.        From the list of variables on the left, select the desired (qualitative‑dichotomous) variable, and click on the arrow pointing toward the Grouping Variable slot.

         5.        Click on the Define Groups button to display the Define Groups dialog box.

         6.        In the Group 1 slot type the numerical code which represents one of the categories for the (qualitative‑dichotomous) variable which defines the two groups being compared, and in the Group 2 slot type the numerical code which represents the other category; then click on the Continue button to return to the Independent‑Samples T Test dialog box.

         7.        Click on the Options button to set the desired (two‑sided) confidence level (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).

         8.        Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.  The p-value displayed on the SPSS output is for a two‑sided test; for a one‑sided test, divide this by 2, provided the difference between the sample means lies in the direction stated by the alternative hypothesis.
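
The two‑sample t statistic can be computed either with a pooled variance (equal variances assumed) or with the Welch adjustment (equal variances not assumed).  The following Python sketch shows both calculations; the function name and data values are hypothetical, and it is offered as an aid to reading the output rather than as a description of how SPSS computes it.

```python
import math
import statistics

def two_sample_t(x, y):
    """Return (t, df) for the pooled test and (t, df) for the Welch test."""
    n1, n2 = len(x), len(y)
    m1, m2 = statistics.mean(x), statistics.mean(y)
    v1, v2 = statistics.variance(x), statistics.variance(y)   # sample variances

    # Equal variances assumed: pool the two sample variances.
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    df_pooled = n1 + n2 - 2

    # Equal variances not assumed: Welch statistic with adjusted degrees of freedom.
    se2 = v1 / n1 + v2 / n2
    t_welch = (m1 - m2) / math.sqrt(se2)
    df_welch = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))

    return (t_pooled, df_pooled), (t_welch, df_welch)

# Hypothetical values for the groups coded 1 and 2.
group_1 = [1, 2, 3, 4]
group_2 = [2, 4, 6, 8]

(t_pooled, df_pooled), (t_welch, df_welch) = two_sample_t(group_1, group_2)
print(t_pooled, df_pooled)
print(t_welch, df_welch)
```

When the two groups have the same size, as here, the two t statistics coincide and only the degrees of freedom differ.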

 

One possible appropriate graphical display for the data used in a two‑sample t test is two box plots, one for each group, which can be obtained in SPSS by doing the following:

 

         1.        Select the Graphs > Legacy Dialogs > Boxplot options to display the Boxplot dialog box.

         2.        Select the Simple option and the Summaries for groups of cases option.

         3.        Click on the Define button to display the Define Simple Boxplot: Summaries for Groups of Cases dialog box.

         4.        From the list of variables on the left, first select the (quantitative) variable to be displayed by each box plot, and click on the arrow pointing toward the Variable section; then select the (qualitative) variable which defines each group for which a box plot is to be displayed, and click on the arrow pointing toward the Category Axis slot.

         5.        Click on the OK button, after which the SPSS output will be generated.  The box plots will be displayed vertically; in order to display them horizontally, first double click on the graph to enter the SPSS Chart Editor, and then select the Options > Transpose Chart options from the main menu, after which selecting the File > Close options will close the chart editor.

 

Performing a one‑way ANOVA (analysis of variance) to test for a difference between multiple means μ1, μ2, …, μk

 

         1.        Identify in the SPSS data file both the (qualitative) variable which defines the groups being compared and the (quantitative) variable for which the means are being compared, and select a significance level α.  (More than one (quantitative) variable may be selected on which to compare means simultaneously, but only one grouping variable may be selected.)

         2.        Select the Analyze > Compare Means > One‑Way ANOVA options to display the One‑Way ANOVA dialog box.

         3.        From the list of variables on the left, select the desired (quantitative) variable (and more than one selection is permitted), and click on the arrow pointing toward the Dependent List section.

         4.        From the list of variables on the left, select the desired (qualitative) variable, and click on the arrow pointing toward the Factor slot.

         5.        Click on the Options button to display the One‑Way ANOVA: Options dialog box; in order to have descriptive statistics displayed, select the Descriptive option; in order to have results for Levene’s test (concerning equal variances) displayed, select the Homogeneity of variance test option; in order to have results for alternative F tests (which are adjusted for unequal variances) displayed, select the Brown‑Forsythe option and/or the Welch option; click on the Continue button to return to the One‑Way ANOVA dialog box.

         6.        Click on the Post Hoc button to display the One-Way ANOVA: Post Hoc Multiple Comparisons dialog box, select the desired multiple comparisons method(s), enter the desired significance level, and click on the Continue button to return to the One‑Way ANOVA dialog box.

         7.        Click on the OK button, after which the SPSS output will be generated.
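
The F statistic in the ANOVA table is the ratio of the between-groups mean square to the within-groups mean square.  The following Python sketch shows the calculation for hypothetical data in three groups; the function name is an assumption made for illustration.

```python
import statistics

def one_way_anova(groups):
    """Return the F statistic with its between-groups and within-groups df."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    k = len(groups)          # number of groups
    n = len(all_values)      # total number of cases

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the values around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical values for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

f_stat, df_between, df_within = one_way_anova(groups)
print(f_stat, df_between, df_within)
```

For these values the between-groups sum of squares is 26 and the within-groups sum of squares is 6, so that F is 13 on 2 and 6 degrees of freedom.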

 

One possible appropriate graphical display for the data used in a one‑way ANOVA is multiple box plots, one for each group, which can be obtained in SPSS by following the steps to create two box plots in the section titled Performing a two‑sample t test about a difference between means μ1 and μ2.

 

Performing a chi‑square test for association

 

         1.        If the data have already been entered into an SPSS data file, identify the two (qualitative) variables on which the test is to be performed in the SPSS data file, and skip to step #7; otherwise, enter the data into SPSS by following the instructions beginning in step #2.

         2.        Go to the Variable View sheet (by clicking on the appropriate tab at the bottom of the screen), in the first row enter a name for one of the two (qualitative) variables on which the test is to be performed, and in the second row enter a name for the other (qualitative) variable.

         3.        For each of the two variables, define codes so that 1 (one) represents one category, 2 (two) represents a second category, 3 (three) represents a third category, etc., making certain that all categories of the (qualitative) variable have been included.

         4.        In the third row, enter the variable name count, and since all the counts must be integers, set the entry in the third cell of the Decimals column to 0 (zero).

         5.        Go to the Data View sheet (by clicking on the appropriate tab at the bottom of the screen), and in the first two columns for the two (qualitative) variables on which the test is to be performed, enter the codes 1 and 1 respectively into the first and second cells of the first row, enter the codes 1 and 2 respectively into the first and second cells of the second row, enter the codes 1 and 3 respectively into the first and second cells of the third row, etc., making certain that all codes used for the (qualitative) variable in the second column have been entered.  Now, repeat this in the next rows with the code 2 entered in the first column, and then repeat this again with the code 3 entered in the first column, etc. making certain that all codes used for the (qualitative) variable in the first column have been entered.  Each possible combination of categories for the two (qualitative) variables should now be displayed exactly once in the first two columns.  (If the category labels are not displayed, then select View > Value Labels from the main menu.)

         6.        In the column for the variable count, enter the corresponding counts (i.e., the raw frequency of occurrence in the data for each combination of categories); after all the data is entered, it may be a good idea to save this SPSS file using an appropriate file name.

         7.        If each line of the data file represents one case (i.e., the data were not entered in the format described in steps #2 to #6), then select the Analyze > Descriptive Statistics > Crosstabs options to display the Crosstabs dialog box, and proceed to the next step; if each line of the data file represents a combination of categories of the two (qualitative) variables on which the test is to be performed (i.e., the data were entered in the format described in steps #2 to #6), then do the following: First, select the Data > Weight Cases options to display the Weight Cases dialog box; then, select the Weight cases by option, select from the list on the left the variable name for the raw frequency (count) of occurrence in the data for each combination of categories, and click on the arrow button pointing toward the Frequency Variable slot; next, click on the OK button; finally, select the Analyze > Descriptive Statistics > Crosstabs options to display the Crosstabs dialog box, and proceed to the next step.

         8.        From the list of the variables on the left, select one of the two (qualitative) variables on which the test is to be performed, and click on the arrow button pointing toward the Row(s) section; then select the other (qualitative) variable, and click on the arrow button pointing toward the Column(s) section.

         9.        Click on the Statistics button to display the Crosstabs: Statistics dialog box, select the Chi-square option in the upper left corner of the dialog box, and click on the Continue button to return to the Crosstabs dialog box.

      10.      Click on the Cells button to display the Crosstabs: Cell Display dialog box; in order to have the data (observed frequencies) displayed, select the Observed option in the Counts section; in order to have the expected frequencies displayed, select the Expected option in the Counts section; in order to have the percentages for column variable categories displayed for each row variable category, select the Row option in the Percentages section; in order to have the percentages for row variable categories displayed for each column variable category, select the Column option in the Percentages section; in order to have the percentages for each cell out of the total displayed, select the Total option in the Percentages section; in order to have the standardized residuals displayed for each cell, select the Standardized option in the Residuals section; click on the Continue button to return to the Crosstabs dialog box.  (NOTE: The column heading for the display of the p‑value for the Pearson Chi‑Square statistic has “2‑sided” in parentheses on the SPSS output, which can be misleading since the Pearson Chi‑Square test is generally a one‑sided test.)

      11.      Click on the OK button, after which the SPSS output will be generated.
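
The expected frequencies, standardized residuals, and Pearson chi‑square statistic can be reproduced by hand from data laid out as one line per combination of categories.  The following Python sketch is an illustration with hypothetical codes and counts; the standardized residual is taken to be (observed - expected) divided by the square root of expected, which is an assumption to check against the SPSS output.

```python
# Hypothetical data: (row code, column code, count), one line per combination.
records = [(1, 1, 30), (1, 2, 10), (2, 1, 20), (2, 2, 40)]

# Rebuild the two-way table of observed frequencies from the weighted layout.
row_codes = sorted({r for r, _, _ in records})
col_codes = sorted({c for _, c, _ in records})
counts = {(r, c): n for r, c, n in records}
observed = [[counts[(r, c)] for c in col_codes] for r in row_codes]

def chi_square_association(table):
    """Return expected counts, standardized residuals, the statistic, and df."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count = row total * column total / grand total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    std_resid = [[(o - e) / e ** 0.5 for o, e in zip(o_row, e_row)]
                 for o_row, e_row in zip(table, expected)]
    # The Pearson statistic is the sum of the squared standardized residuals.
    statistic = sum(z ** 2 for row in std_resid for z in row)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, std_resid, statistic, df

expected, std_resid, statistic, df = chi_square_association(observed)
print(observed)
print(expected)
print(statistic, df)
```

Cells with large standardized residuals in absolute value are the ones contributing most to the chi‑square statistic.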

 

One possible appropriate graphical display for the data used in a chi‑square test for association is a stacked bar chart, which can be obtained in SPSS by doing the following:

 

         1.        Select the Graphs > Legacy Dialogs > Bar options to display the Bar Charts dialog box.

         2.        Select the Stacked option and the Summaries for groups of cases option.

         3.        Click on the Define button to display the Define Stacked Bar: Summaries for Groups of Cases dialog box.

         4.        From the list of variables on the left, select the (qualitative) variable whose categories will be represented by the bars, and click on the arrow pointing toward the Category Axis slot; then select the (qualitative) variable whose categories will be represented by stacks on each of the bars, and click on the arrow pointing toward the Define Stacks by slot.

         5.        Select the % of cases option in the Bars Represent section.  (The N of cases option can be selected if raw frequencies are desired, but it is much more common to use percentages in stacked bar charts.)

         6.        Click on the OK button, after which a stacked bar chart in SPSS output will be generated, but the bars may not all be of the same height, and the percentages scaled on the vertical axis are for the stacks across bars instead of within bars.  Both of these issues can be addressed to make the stacked bar chart easier to read by editing the graph.

         7.        To edit the stacked bar chart, double click on the graph to enter the SPSS Chart Editor.  Then, select the Options > Scale to 100% options, and select the File > Close options to close the chart editor.  The stacked bar chart should now be easier to read.

 

Performing a simple linear regression with bivariate data, with checks of linearity, homoscedasticity, and normality assumptions

 

         1.        Identify in the SPSS data file the (quantitative) dependent (response) variable and the (quantitative) independent (explanatory or predictor) variable.

         2.        Select the Graphs > Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot dialog box.

         3.        Make certain that the Simple Scatter option is selected; then, click on the Define button to display the Simple Scatterplot dialog box.

         4.        From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Y-Axis slot; then select the (quantitative) independent (explanatory or predictor) variable, and click on the arrow pointing toward the X-Axis slot.

         5.        Click on the OK button, after which SPSS output displaying a scatter plot will be generated.  In order to have the least squares line appear on the scatter plot, first double click on the graph to enter the SPSS Chart Editor, and then select the Elements > Fit Line at Total options from the main menu (and close the dialog box which appears), after which selecting the File > Close options will close the chart editor.  By examining how the points on the scatter plot are distributed around the least squares line, a decision can be made as to whether the linearity assumption about the relationship between the dependent and independent variables is satisfied.

         6.        Select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

         7.        From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Dependent slot; then select the (quantitative) independent (explanatory or predictor) variable, and click on the arrow pointing toward the Independent(s) section (where selection of more than one variable is permitted).

         8.        Click on the Plots button to display the Linear Regression: Plots dialog box.  From the list of variables on the left, select ZRESID, and click on the arrow pointing toward the Y slot; then select ZPRED, and click on the arrow pointing toward the X slot.  This will generate a scatter plot with standardized predicted values on the horizontal axis and standardized residuals on the vertical axis, so that a decision can be made as to whether the homoscedasticity assumption is satisfied.

         9.        In the Standardized Residual Plots section of the Linear Regression: Plots dialog box, select the Histogram option and the Normal probability plot option.  This will generate a histogram and a normal probability plot for standardized residuals, so that a decision can be made as to whether the normality assumption is satisfied.

      10.      Click on the Continue button, and then click on the Statistics button to display the Linear Regression: Statistics dialog box.  In the Regression Coefficients section, make certain that the Estimates option is selected, and select the Descriptives option to generate means, standard deviations, and the Pearson correlation; also, select the Confidence intervals option to set the desired (two‑sided) confidence level (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).  This will generate confidence intervals for the slope and for the intercept in the regression.

      11.      Click on the Continue button, and then click on the Save button to display the Linear Regression: Save dialog box. In the Residuals section, select the Standardized option to save the standardized residuals as part of the data.  This allows further analysis to be performed using the standardized residuals.

      12.      Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.

      13.      The scatter plot with standardized predicted values on the horizontal axis and standardized residuals on the vertical axis, requested in Step #8, will be displayed on the SPSS output without a horizontal line at zero; since this line can be helpful in examining this plot, the instructions in Step #5 to have the least squares line appear on the scatter plot can be used to add this horizontal line at zero.
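
The quantities produced by the steps above can be checked by hand on a small data set.  The following is a minimal Python sketch (not SPSS syntax) of the least squares slope and intercept and of the standardized residuals, assuming that a standardized residual is the raw residual divided by the residual standard error (based on n - 2 degrees of freedom); the data values and function name are hypothetical.

```python
import math

# Hypothetical bivariate data: x is the independent variable, y the dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def simple_linear_regression(x, y):
    """Return least squares slope, intercept, residuals, and standardized residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard error: square root of SSE / (n - 2).
    std_error = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    standardized = [e / std_error for e in residuals]
    return slope, intercept, residuals, standardized


slope, intercept, residuals, standardized = simple_linear_regression(x, y)
print("slope:", slope, "intercept:", intercept)
print("standardized residuals:", standardized)
```

Because least squares residuals always sum to zero when the model includes an intercept, the residual plot is centered on the horizontal line at zero described in Step #13.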

 

 

Statistical Analysis Involving Two or More Variables

 

Performing a two‑way ANOVA (analysis of variance)

 

         1.        Identify in the SPSS data file the two (qualitative) variables which define the cells and the (quantitative) variable for which means are being compared, and select a significance level α.

         2.        Select the Analyze > General Linear Model > Univariate options to display the Univariate dialog box.

         3.        From the list of variables on the left, select the desired (quantitative) variable, and click on the arrow pointing toward the Dependent Variable slot.

         4.        From the list of variables on the left, select the two desired (qualitative) variables, and click on the arrow pointing toward the Fixed Factor(s) section.

         5.        Click on the Post Hoc button to display the Univariate: Post Hoc Multiple Comparisons for Observed Means dialog box; select each item from the list in the Factor(s) section on the left for the Post Hoc Tests for section on the right.  From the Equal Variances Assumed section, select a desired multiple comparison procedure (such as Bonferroni); if it is deemed necessary later, an option from the Equal Variances Not Assumed section can be used.  These multiple comparison procedures are generally needed when one or both main effects are statistically significant.  Click on the Continue button to return to the Univariate dialog box.

         6.        Click on the Options button to display the Univariate: Options dialog box.  Select each item from the list in the Factor(s) and Factor Interactions section on the left for the Display Means for section on the right.

         7.        In the Display section, select Homogeneity tests (which will generate results for Levene’s test) and select Estimates of effect size.  Click on the Continue button to return to the Univariate dialog box.

         8.        Click on the Plots button to display the Univariate: Profile Plots dialog box.  To generate one of the two possible interaction plots, select one of the two variables from the Factor(s) section on the left for the Horizontal Axis slot on the right, and select the other variable for the Separate Lines slot on the right; then, click the Add button to add this plot to the Plots section.  To generate the other possible interaction plot, repeat this with the roles of the variables reversed.  Click on the Continue button to return to the Univariate dialog box.

         9.        Click on the OK button, after which the SPSS output will be generated.

 

Appropriate graphical displays for the data used in a two‑way ANOVA include box plots to display main effects, and interaction plots for the interaction effects.
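
An interaction plot is a plot of the cell means, so the points on it can be checked by hand.  The following is a minimal Python sketch (not SPSS syntax) which computes the mean of the quantitative variable in each cell; the factor names, levels, data values, and function name are all hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (level of first factor, level of second factor, response).
records = [
    ("low", "control", 10.0), ("low", "control", 12.0),
    ("low", "treated", 14.0), ("low", "treated", 16.0),
    ("high", "control", 11.0), ("high", "control", 13.0),
    ("high", "treated", 20.0), ("high", "treated", 22.0),
]


def cell_means(rows):
    """Return {(level_a, level_b): mean response} for every cell in the data."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for level_a, level_b, value in rows:
        sums[(level_a, level_b)] += value
        counts[(level_a, level_b)] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


means = cell_means(records)

# Difference between treated and control at each level of the first factor.
effect_low = means[("low", "treated")] - means[("low", "control")]
effect_high = means[("high", "treated")] - means[("high", "control")]
print(means)
print("treated minus control:", effect_low, "(low)", effect_high, "(high)")
```

If the two differences were equal, the lines on the interaction plot would be parallel; here they differ, so the lines are not parallel, and whether that difference is statistically significant is what the interaction test in the ANOVA table addresses.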

 

Performing a multiple linear regression with checks of linearity, homoscedasticity, and normality assumptions

 

         1.        Identify in the SPSS data file the (quantitative) dependent (response) variable, all quantitative independent (explanatory or predictor) variables, and all qualitative independent (explanatory or predictor) variables.

         2.        Select the Graphs > Legacy Dialogs > Scatter/Dot options to display the Scatter/Dot dialog box.

         3.        Make certain that the Simple Scatter option is selected; then, click on the Define button to display the Simple Scatterplot dialog box.

         4.        From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Y-Axis slot; then select one of the quantitative independent (explanatory or predictor) variables, and click on the arrow pointing toward the X-Axis slot.

         5.        Click on the OK button, after which SPSS output displaying a scatter plot will be generated.  In order to have the least squares line appear on the scatter plot, first double click on the graph to enter the SPSS Chart Editor, and then select the Elements > Fit Line at Total options from the main menu (and close the dialog box which appears), after which selecting the File > Close options will close the chart editor.  By examining how the points on the scatter plot are distributed around the least squares line, a decision can be made as to whether the linearity assumption about the relationship between the dependent variable and this independent variable is satisfied.

         6.        Repeat Steps #2 to #5 for each of the quantitative independent (explanatory or predictor) variables.

         7.        For each of the qualitative independent (explanatory or predictor) variables, add the appropriate dummy variable(s) to the data file.  For a qualitative variable with k categories, this can be done by defining dummy variables, where the first dummy variable is equal to 1 (one) for category #1 and 0 (zero) otherwise, the second dummy variable is equal to 1 (one) for category #2 and 0 (zero) otherwise, etc.; since k - 1 dummy variables are sufficient to represent a qualitative variable with k categories, the kth dummy variable is redundant and should be left out of a regression which includes an intercept (the omitted category then serves as the reference category).

         8.        Select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

         9.        From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Dependent slot; then select each independent (explanatory or predictor) variable, and click on the arrow pointing toward the Independent(s) section (where selection of more than one variable is permitted).

      10.      Click on the Plots button to display the Linear Regression: Plots dialog box.  From the list of variables on the left, select ZRESID, and click on the arrow pointing toward the Y slot; then select ZPRED, and click on the arrow pointing toward the X slot.  This will generate a scatter plot with standardized predicted values on the horizontal axis and standardized residuals on the vertical axis, so that a decision can be made as to whether the homoscedasticity assumption is satisfied.

      11.      In the Standardized Residual Plots section of the Linear Regression: Plots dialog box, select the Histogram option and the Normal probability plot option.  This will generate a histogram and a normal probability plot for standardized residuals, so that a decision can be made as to whether the normality assumption is satisfied.

      12.      Click on the Continue button, and then click on the Statistics button to display the Linear Regression: Statistics dialog box.  In the Regression Coefficients section, make certain that the Estimates option is selected, and select the Descriptives option to generate means, standard deviations, and the Pearson correlations; also, select the Confidence intervals option to set the desired (two‑sided) confidence level (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).  This will generate confidence intervals for the intercept and for each slope in the regression.

      13.      Click on the Continue button, and then click on the Save button to display the Linear Regression: Save dialog box. In the Residuals section, select the Standardized option to save the standardized residuals as part of the data.  This allows further analysis to be performed using the standardized residuals.

      14.      Click on the Continue button, and then click on the OK button, after which the SPSS output will be generated.

      15.      The scatter plot with standardized predicted values on the horizontal axis and standardized residuals on the vertical axis, requested in Step #10, will be displayed on the SPSS output without a horizontal line at zero; since this line can be helpful in examining this plot, the instructions in Step #5 to have the least squares line appear on the scatter plot can be used to add this horizontal line at zero.
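
The dummy variables described in Step #7 can be illustrated on a small example.  The following is a minimal Python sketch (not SPSS syntax) which creates k - 1 dummy variables for a qualitative variable with k categories; the variable region, its categories, the d_ naming scheme, and the choice of the last category (in sorted order) as the omitted category are all assumptions made for illustration.

```python
def dummy_code(values, reference=None):
    """Return {dummy name: list of 0/1 values}, one dummy per non-reference category."""
    categories = sorted(set(values))
    if reference is None:
        reference = categories[-1]  # by default, omit the last category in sorted order
    kept = [c for c in categories if c != reference]
    return {f"d_{c}": [1 if v == c else 0 for v in values] for c in kept}


# Hypothetical qualitative variable with k = 3 categories.
region = ["north", "south", "west", "south", "north"]
dummies = dummy_code(region)
for name, column in dummies.items():
    print(name, column)
```

A case in the omitted category (west in this sketch) has 0 on every dummy variable, which is why k - 1 dummy variables are enough to identify all k categories.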

 

Methods to decide which of many predictors are the most important to include in a model are available with SPSS by doing the following:

 

         1.        Select the Analyze > Regression > Linear options to display the Linear Regression dialog box.

         2.        From the list of variables on the left, select the (quantitative) dependent (response) variable, and click on the arrow pointing toward the Dependent slot; then select each independent (explanatory or predictor) variable, and click on the arrow pointing toward the Independent(s) section (where selection of more than one variable is permitted).

         3.        In the Method slot, select the desired method for variable selection (such as the Stepwise option).

         4.        Click on the Statistics button to display the Linear Regression: Statistics dialog box.  In the Regression Coefficients section, make certain that the Estimates option is selected, select the R squared change option, and select the Collinearity diagnostics option; also, select the Confidence intervals option to set the desired (two‑sided) confidence level (which will generally be 100% minus the significance level, i.e., if the significance level is 0.05 (5%), then the confidence level will be 0.95 (95%)).  This will generate confidence intervals for the intercept and for each slope in the regression.  Click on the Continue button to return to the Linear Regression dialog box.

         5.        Click on the OK button, after which the SPSS output will be generated.
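
The collinearity diagnostics include a tolerance and a variance inflation factor (VIF) for each predictor.  The following is a minimal Python sketch (not SPSS syntax) for the special case of exactly two predictors, where the tolerance of each predictor is 1 minus the squared Pearson correlation between the two predictors, and the VIF is the reciprocal of the tolerance; the data values and function names are hypothetical.

```python
import math


def pearson_r(u, v):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def tolerance_and_vif(x1, x2):
    """Tolerance and VIF shared by both predictors in a two-predictor model."""
    r = pearson_r(x1, x2)
    tolerance = 1.0 - r ** 2
    return tolerance, 1.0 / tolerance


# Hypothetical values of two quantitative predictors.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
tolerance, vif = tolerance_and_vif(x1, x2)
print("tolerance:", tolerance, "VIF:", vif)
```

A tolerance near 0 (equivalently, a large VIF) indicates that a predictor is nearly a linear function of the other predictor, which makes its estimated coefficient unstable.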

 

 

 

 

Generating a correlation matrix

 

 
