Doc Project Resources

These are resources for students completing doctoral projects.

Resources for Results Chapter

This page contains resources that will help you write your doc project. This includes instructions for how to analyze data.

Before you can analyze your data, you will need to download it, clean it up, and prepare it for analysis.

Downloading Data

This video shows how to download your data from Qualtrics into SPSS. If you randomized your conditions, be sure to tell Qualtrics that you want the downloaded file to include the randomization order. To do that, go to your survey in Qualtrics. Click on Data & Analysis at the top of the window. Click on Export & Import --> Export Data. At the bottom of the box, click More options. Check the box next to "Export viewing order data for randomized surveys." Click Download.

Cleaning Data and Determining Who to Pay

If you used Prolific (or similar) for recruitment, you need to determine who to pay. Prolific has specific policies about who you can deny payment. Follow these directions to determine whether you should consider denying anyone payment. If you didn't use Prolific, you can use these instructions to determine whether you need to exclude any participants.

Preparing Your Data for Analysis

Before you can analyze your data, you will need to ensure that your data is in the correct format. You will likely need one column for each of your independent and dependent variables.

Dependent Variables

If you measured your DV using a single question

Your dependent variable (DV) will already be in a single column if all of your participants answered one question regardless of the condition to which they were assigned (e.g., if the DV question was not randomized with vignettes). If you asked just one question to measure your DV (e.g., how likely are you to find this person guilty?), then you are good to go.

The next thing you need to do is determine whether you have any missing data. In data view, right click on the top of the column for your DV. Select Sort Ascending. This will bring any cases with missing data to the top (they will have just a period in the cell). If a participant did not respond to your DV question, you will need to decide whether the person should be excluded from the analysis (you can't analyze data if you don't have information for all of the variables). Ideally, you would have identified such cases during the data cleaning process. Discuss with your chair if you have more than one DV measured with a single question each (e.g., how likely are you to vote for a ballot initiative and how much money are you willing to pay to fund the initiative) and a participant answered some but not others.
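If it helps to see the logic outside of SPSS, the missing-data check can be sketched in a few lines of Python. This is only an illustration, not SPSS syntax: the participant ids, the variable name guilt_likelihood, and the use of None for a skipped question are all made up for the example.

```python
# Hypothetical data: one dict per participant. None marks a skipped question
# (SPSS shows this as a period in the cell).
participants = [
    {"id": "P01", "guilt_likelihood": 5},
    {"id": "P02", "guilt_likelihood": None},
    {"id": "P03", "guilt_likelihood": 2},
]


def cases_missing(rows, variable):
    """Return the ids of participants who have no value for the variable."""
    return [row["id"] for row in rows if row.get(variable) is None]


missing_ids = cases_missing(participants, "guilt_likelihood")
print(missing_ids)  # ['P02']
```

The list tells you which cases to look at; whether to exclude them is still a decision to make with your chair.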

If you need to compute your DV from multiple questions on your survey

If you measured your DV using a questionnaire, you will need to compute your DV so that you have a single value to put into your analysis. Here, I will use the example of the Social Distance Scale, which has nine questions (3 personal, 3 housing, 3 workplace). Follow these steps:

1. Replace any missing data with imputed values. In data view, right click on the top of the column for your DV. Select Sort Ascending. This will bring any cases with missing data to the top (they will have just a period in the cell). You will need to replace each missing value with an estimated value (called an imputed value; for the Social Distance Scale, you would need to check all nine questions for missing data and impute any missing values for each question). In this case, you're going to impute the mean. This means that you're going to replace the missing values with the mean from all of the participants on that variable (this is a conservative way to estimate the missing value). To do this, go to Transform --> Replace Missing Values. Put the variable with the missing value(s) into the box labeled New Variable(s). Under the box labeled Name and Method, SPSS will propose a name for the new variable that will include the imputed values. You can change this name. I like to call it something with the word "imputed" in it so I can tell which variables are which (e.g., if you are using the Social Distance Scale, the original questions might be called SD1, SD2, etc.; you might name a new variable SD1_imputed). Remember that variable names cannot have spaces in them. Next to Method, select Series mean. This will tell SPSS to replace the missing data with the mean of the original variable. Press OK. SPSS will create the new variable at the end of your data set. Repeat this as necessary to ensure that all of your dependent variables are complete (no missing data). If you don't replace the missing values with something, the summed values will be artificially low (e.g., if someone skipped one question on the Social Distance Scale, you will be adding together eight values instead of nine). If there is no missing data, you can skip this step.

2. Sum the values from the separate questions. Remember to use the variables with the imputed values instead of the original variables as necessary (this is why it's helpful to put the word "imputed" into the variable name). To do this, go to Transform --> Compute Variable. In the box labeled Target Variable, type in a name for the new variable. You can call it whatever you want. For instance, if you're computing the social distance personal score, you might call it SDpersonal; if you're calculating the social distance total score, you might call it SDtotal. In the box labeled Numeric Expression, you will enter the formula. Select the first value you want to include from your list of variables on the left. Press the + button. Enter the next variable. Do this until you have included all necessary variables (e.g., for SDpersonal, the expression would be SD1 + SD2 + SD3; if you imputed means for one of those, it might look like SD1_imputed + SD2 + SD3). Click OK. SPSS will add the new variable to the end of your data set. For the Social Distance Scale, you need to compute sums for each subscale (personal, housing, and workplace) and the total (you can add together SDpersonal + SDhousing + SDworkplace or add together all nine questions - the result will be the same).
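To see the arithmetic behind steps 1 and 2, here is a minimal Python sketch (an illustration only, not SPSS syntax). The three rows of responses are invented, and only the three personal items are shown. For simplicity the sketch makes an _imputed copy of every item, whereas in SPSS you only need to create one for items that actually have missing data.

```python
from statistics import mean

# Hypothetical responses to the three personal items; None = skipped question.
responses = [
    {"SD1": 2, "SD2": 3, "SD3": 4},
    {"SD1": None, "SD2": 1, "SD3": 2},
    {"SD1": 4, "SD2": 5, "SD3": 3},
]


def impute_series_mean(rows, variable):
    """Add '<variable>_imputed': missing values become the mean of the observed ones."""
    observed = [row[variable] for row in rows if row[variable] is not None]
    series_mean = mean(observed)
    for row in rows:
        value = row[variable]
        row[variable + "_imputed"] = series_mean if value is None else value


items = ["SD1", "SD2", "SD3"]

# Step 1: replace missing values with the mean of that question.
for item in items:
    impute_series_mean(responses, item)

# Step 2: sum the (imputed) items into a single subscale score.
for row in responses:
    row["SDpersonal"] = sum(row[item + "_imputed"] for item in items)

print([row["SDpersonal"] for row in responses])
```

The second participant skipped SD1, so that cell is filled with 3 (the mean of 2 and 4) and the subscale score becomes 3 + 1 + 2 = 6. Without the imputation step, only two values would have been added together.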

If you put your DV question(s) on a page with each vignette (you randomized the questions with the vignettes)

Sometimes the wording of your DV questions needs to change depending on the condition participants receive (e.g., if you're varying the name of the person in your vignette (Joe vs. Mike), you might need to ask some participants whether they are willing to be friends with Joe and others whether they are willing to be friends with Mike). You will need to ensure that everyone who gets the vignette about Joe gets the Joe questions, so you'll have the question(s) randomize with the vignettes. In this case, your DV data ended up split into multiple columns. You will need to combine the data into a single column before you can analyze it. This video describes how to combine data from two (or more) columns into one column. Combine the columns first and then deal with any missing data (see instructions above).
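The idea of combining split columns can be sketched in Python as follows. This is not the procedure from the video, just an illustration of the goal: each participant has a value in at most one of the columns, and you want that value in a single new column. The column names friends_joe, friends_mike, and friends are hypothetical.

```python
# Hypothetical columns: each participant answered only the version of the
# question that matched their vignette, so the other column is None.
rows = [
    {"friends_joe": 4, "friends_mike": None},
    {"friends_joe": None, "friends_mike": 2},
    {"friends_joe": None, "friends_mike": None},  # skipped the question
]


def combine_columns(row, columns):
    """Return the first non-missing value among the columns, or None if all are missing."""
    for column in columns:
        if row[column] is not None:
            return row[column]
    return None


for row in rows:
    row["friends"] = combine_columns(row, ["friends_joe", "friends_mike"])

print([row["friends"] for row in rows])  # [4, 2, None]
```

The third participant is still missing after the columns are combined, which is why you deal with missing data only after combining.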

Independent Variables

Most data sets download with the randomization order, and you will need to combine those columns into new columns for your independent variables. The randomization order is split into multiple variables at the end of your data set. They usually start with the letters FL (e.g., FL_18_DO_FL_28) and are labeled (under the Label column) with something that says, for example, "FL_18 - Block Randomizer - Display Order FL_28." Each column is one condition. You first need to figure out which condition is which. For instance, if you had two variables, defendant age (13 or 16) and crime (murder or robbery), figure out which conditions are 13 and murder, 13 and robbery, 16 and murder, and 16 and robbery. You might want to rename each variable to reflect its condition (e.g., M13 for murder at age 13, R13 for robbery at age 13). SPSS variable names must begin with a letter and cannot contain spaces. Renaming will make it easier to see which is which in the analysis boxes.

Now you need to combine them so that you have a single column for age and one for crime. You will need to use the technique described in the video above to combine multiple variables into a single variable. To do this:

If your variable has two levels

1. Go to Transform --> Compute Variable.

2. Enter the name of your first independent variable in the Target Variable box (e.g., Crime). Then enter the formula in the Numeric Expression box. For crime, enter SUM.1 followed by the names of the two murder columns in parentheses, e.g., SUM.1(M13, M16) if you renamed those columns M13 and M16 (SPSS variable names must begin with a letter, so a name like 13M will not be accepted). Click OK. This will make a column called Crime, and all participants who read a vignette about a defendant who committed murder will have a 1 in that column. The rest will be blank.

3. You will need to change the blank cells to 0 (indicating that participants were in the other condition). To fill in the blank cells, go to Transform --> Recode into Same Variables. Select the variable you just created (it should be at the bottom of the variable list), and move it into the Numeric Variables box. Click Old and New Values... Under Old Value --> Value:, put in 1. Under New Value --> Value:, put 1 (you want the cells with 1 in them already to stay the same). Click Add. Then, do the same thing again, but under Old Value, click System-missing. Under New Value, put 0. Click Add. Click Continue and then OK.

4. Go to Variable View. Scroll to the bottom of the variable list, and find your new variable. Under the Values column, click the three dots to bring up the Value Labels box. Click the plus sign. Under Value, enter 0. Under Label, enter the condition associated with that (in the above example, robbery). Click the plus sign again, and enter Value 1 and Label murder. Now you have created one independent variable.

5. Repeat these steps for any other IVs.
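The logic of steps 1 through 5 can be sketched in Python. This is an illustration under stated assumptions, not SPSS syntax: the column names M13, R13, M16, and R16 are hypothetical renamed display-order columns, None stands for a blank cell, and sum_min_valid is a made-up helper that imitates what SUM.1 does here (add up the values that are present, and leave the result blank if none are).

```python
def sum_min_valid(values, min_valid=1):
    """Sum the non-missing values; return None if fewer than min_valid are present."""
    present = [v for v in values if v is not None]
    return sum(present) if len(present) >= min_valid else None


# Hypothetical display-order columns: a 1 marks the condition each participant saw.
rows = [
    {"M13": 1, "R13": None, "M16": None, "R16": None},
    {"M13": None, "R13": 1, "M16": None, "R16": None},
    {"M13": None, "R13": None, "M16": 1, "R16": None},
    {"M13": None, "R13": None, "M16": None, "R16": 1},
]

crime_labels = {0: "robbery", 1: "murder"}  # value labels (step 4)
age_labels = {0: "13", 1: "16"}

for row in rows:
    # Step 2: combine the two murder columns; everyone else stays blank.
    crime = sum_min_valid([row["M13"], row["M16"]])
    # Step 3: recode blank (system-missing) to 0.
    row["Crime"] = 0 if crime is None else crime

    # Step 5: repeat for the other IV, here combining the two age-16 columns.
    age = sum_min_valid([row["M16"], row["R16"]])
    row["Age"] = 0 if age is None else age

print([(crime_labels[r["Crime"]], age_labels[r["Age"]]) for r in rows])
```

Each participant ends up with one value for Crime and one for Age, which is the one-column-per-IV layout you need for analysis.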

If your variable has more than two levels

1. To do this with more than two levels, you will need to first change the values in the original columns so that they're not all 1. Decide how you are going to code your variable. For instance, if the variable is crime, and it has three levels (robbery, murder, assault), you might code 0 = robbery, 1 = murder, and 2 = assault.

2. Click Transform --> Recode into Same Variables... Put all of the original variables with the first condition (e.g., robbery) into the Variables: box. Click Old and New Values. In the Old Value box, under Value, put 1. Under New Value --> Value, put 0. Click Add. Click Continue and then OK. This will make all of the 1s into 0s. You don't need to do this for murder because it already has values of 1. Repeat this for assault, but under New Value, put 2.

3. Use the SUM.1 formula to combine all of the original variables into your intended variable (e.g., crime).
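Here is the same idea for three levels as a minimal Python sketch (again an illustration, not SPSS syntax). The column names Robbery, Murder, and Assault are hypothetical, None stands for a blank cell, and sum_min_valid is a made-up helper that imitates the way SUM.1 is used here.

```python
def sum_min_valid(values, min_valid=1):
    """Sum the non-missing values; return None if fewer than min_valid are present."""
    present = [v for v in values if v is not None]
    return sum(present) if len(present) >= min_valid else None


# Hypothetical display-order columns: a 1 marks the condition each participant saw.
rows = [
    {"Robbery": 1, "Murder": None, "Assault": None},
    {"Robbery": None, "Murder": 1, "Assault": None},
    {"Robbery": None, "Murder": None, "Assault": 1},
]

# Step 1: decide on the coding (0 = robbery, 1 = murder, 2 = assault).
codes = {"Robbery": 0, "Murder": 1, "Assault": 2}

for row in rows:
    # Step 2: change each 1 to the code for its column; blanks stay blank.
    for column, code in codes.items():
        if row[column] == 1:
            row[column] = code
    # Step 3: combine the recoded columns into one variable.
    row["Crime"] = sum_min_valid([row[column] for column in codes])

print([row["Crime"] for row in rows])  # [0, 1, 2]
```

Because the robbery column now holds a real 0 rather than a blank, no participant is left with a missing value after the columns are combined, so the blank-to-0 recode from the two-level case is not needed.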

NOTE: You can name your variables whatever you want in SPSS. They only have to make sense to you and whoever might help you with your SPSS file. I recommend keeping variable names short. Also note that variable names cannot contain spaces. Therefore, if you have a variable like defendant age, you might name it either DefAge or Def_Age, but you cannot name it Def Age.

You can also assign whatever numbers you want to your conditions (e.g., the results will be the same regardless of whether robbery is coded as 0 or 1). You don't need to report how you coded conditions when you write up your results.

Analyzing Your Data

See the page on this website labeled SPSS Instructions and Write-up Templates.

Reporting Your Results

See the document at the top of this page.

How many decimal places should I use in reporting my data?

APA is funny about rounding.

  1. Round all descriptive statistics to one decimal. APA specifically lists means and SDs. It doesn’t specifically address skewness and kurtosis. As those are descriptive, I recommend rounding to one decimal.
  2. Round everything else to two decimals except p values and some partial eta squared values. According to Purdue OWL, exact p values can be reported to two or three places. I usually report to three places when p < .06 and two when it is .06 or greater. When p is less than .001, report p < .001. APA doesn’t specifically mention partial eta squared. I usually round to two decimals if this is .01 or greater and to three decimals if it is less than .01.
  3. If the statistic can be greater than 1, use a leading zero (e.g., 0.24 in.). If the statistic cannot be greater than 1, do not use a leading zero (e.g., p = .042 or r = .65).
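The rounding rules above can be summarized in a short Python sketch. The function names are made up for this example, and the .06 cutoff for p values and the .01 cutoff for partial eta squared are the personal conventions described above rather than APA requirements. Python's built-in rounding may also treat values that sit exactly on a rounding boundary differently from SPSS, so check those by hand.

```python
def format_stat(value, decimals, leading_zero=True):
    """Format a number to a fixed number of decimals, optionally dropping the leading zero."""
    text = f"{value:.{decimals}f}"
    if not leading_zero:
        if text.startswith("0."):
            text = text[1:]
        elif text.startswith("-0."):
            text = "-" + text[2:]
    return text


def format_p(p):
    """Report p < .001 for tiny values, three decimals below .06, otherwise two."""
    if p < 0.001:
        return "p < .001"
    decimals = 3 if p < 0.06 else 2
    return "p = " + format_stat(p, decimals, leading_zero=False)


def format_partial_eta_squared(value):
    """Two decimals at .01 or above, three decimals below .01; no leading zero."""
    decimals = 2 if value >= 0.01 else 3
    return format_stat(value, decimals, leading_zero=False)


print(format_stat(3.456, 1))              # 3.5 (a mean or SD: one decimal)
print(format_p(0.042))                    # p = .042
print(format_p(0.31))                     # p = .31
print(format_p(0.0004))                   # p < .001
print(format_partial_eta_squared(0.004))  # .004
```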