After spending many years and countless hours working with SPSS data exports, I’ve developed hundreds of custom SPSS scripts, syntax, and macros that automate a wide variety of tasks. My hope is that I can save you some of that time by sharing this insider knowledge with you.
Starting to work with syntax in SPSS can seem daunting, but it doesn’t need to be. In this article I want to demonstrate a very simple syntax use that serves me very well on most projects.
On nearly every study I work on I end up graphing my results in Excel; it is very rare for me to graph the questions exactly as they were asked. I typically change the tense, streamline, or abbreviate long words.
SurveyGizmo exports both SPSS syntax and data files. While the data file is convenient, I usually export the syntax file and use it as a starting point for creating a well-defined data file with custom labels. Let’s take a look at how this works.
Pre-Work in SPSS
SPSS files are never going to be perfect when exported from a survey tool, but SurveyGizmo’s exports are some of the best I’ve seen. The first thing we’ll want to do is update the file path in the syntax file so that it points to where you saved the exported data file.
To keep it simple, we’ll work from the root directory of your C drive (“C:\”). If you save the files elsewhere, be sure to update the path to reflect that location. Also keep in mind that if you share your syntax with others, they will need to update the path to wherever they have saved the files.
Original syntax out of export:
GET DATA /TYPE=TXT /FILE="spss.txt" /DELCASE=LINE …
Updated version to reflect our path on the C drive:
GET DATA /TYPE=TXT /FILE="C:\spss.txt" /DELCASE=LINE …
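If you would rather not edit the path inside the GET DATA command itself, one option is the SPSS CD command, which sets the working directory so that a relative file name such as "spss.txt" is resolved against it. The lines below are a minimal sketch and assume the exported files were saved in C:\. Anyone you share the syntax with would then only need to change this one line.

```spss
* Assumption: the exported data and syntax files were saved in the root of the C drive.
* Change this single line if the files live elsewhere.
CD "C:\".
```

With the working directory set this way, the original /FILE="spss.txt" specification from the export can be left unchanged.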
Getting Started Applying Variable Labels to SPSS Data
Taking a quick look at the SPSS files exported from SurveyGizmo, you might notice a few things that don’t line up with the questions asked. Some characters, such as trademark and copyright symbols, come through garbled. For instance, below is a question I have in a test survey:
When considering your next soda purchase, which of the following brands do you consider? (select all that apply)
- Coke ™
- Diet Coke ©
- Pepsi ®
And below is the corresponding syntax that was generated when exporting the SPSS data/syntax file:
var5O3 "Coke â„¢:When considering your next soda purchase, which of the following brands do you consider? (select all that apply)"
var5O4 "Diet Coke Â©:When considering your next soda purchase, which of the following brands do you consider? (select all that apply)"
var5O5 "Pepsi Â®:When considering your next soda purchase, which of the following brands do you consider? (select all that apply)"
Also note that the stem of the question is included in each label. This is helpful when first working with the data file, especially if you have more than 80 variables; however, when it comes time to report, you don’t want to include the stem.
Let’s create labels that match the options listed and also remove the stem of the question. The following will do just that:
VARIABLE LABELS
var503 "Coke ™"
var504 "Diet Coke ©"
var505 "Pepsi ®"
.
To illustrate the benefit of making the change, here are two examples comparing the differences. The first chart is a side-by-side comparison of how the variables look when using the SPSS GUI to run descriptives.
The next example is how the SPSS output looks after running the descriptives.
While I’m not displaying it here, you can imagine how nicely our “updated” labels version would graph. (In case you’re wondering why I’m computing the mean of these variables, remember that when you compute the mean of a binary variable that only takes the values zero and one, the mean is the proportion of respondents who selected it. Thus in the above output you would read means of 0.35, 0.45, and 0.52 as 35% selected Coke ™, 45% selected Diet Coke ©, and 52% selected Pepsi ®.)
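For reference, the means described above can also be requested with syntax rather than through the GUI. This is a minimal sketch, assuming the three brand variables sit next to each other in the data file (so that TO selects all of them) and are coded 0 for Unchecked and 1 for Checked:

```spss
* The mean of each 0 and 1 coded variable is the share of respondents who checked that brand.
DESCRIPTIVES VARIABLES=var503 TO var505
  /STATISTICS=MEAN.
```

Multiply each mean by 100 to read it as a percentage.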
In the above example, the value labels in the syntax file from SurveyGizmo display as follows:
0 'Unchecked' 1 'Checked' /
0 'Unchecked' 1 'Checked' /
0 'Unchecked' 1 'Checked' /.
To display “Value Labels” in your data editor view, check this option as shown below:
Your data would look something like this:
Hint: I recommend unchecking the “Value Labels” option and re-examining the data to fully understand what this setting does.
The previous syntax works great; however, because it is generated by an automated process, there is a lot of redundant code that can be removed and cleaned up. For instance, since these variables are consecutive in the data file, we can use a shorthand trick to apply the same value labels to all three variables at the same time by changing the code to the following:
VALUE LABELS var503 TO var505 0 'Unchecked' 1 'Checked' .
To illustrate again, let’s say you add additional questions about “how devoted” the respondents are to each brand. Your syntax might look like this:
VALUE LABELS var603 TO var605 1 "Couldn't care less" 2 'Somewhat devoted' 3 "Can't live w/o it!" .
Notice in the above example that I switched to double quotes to wrap labels that contain an apostrophe (a single quote). This ensures that SPSS understands where you mean the value label to end. If you have double quotes in your label, be sure to wrap it in single quotes instead. Both work equally well, but in my experience you’ll run into apostrophes in value labels much more frequently than double quotes. This is why I tend to use double quotes by default.
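If a label ever needs to stay in single quotes, SPSS also accepts a doubled quote character inside the string as a stand-in for one literal quote. The sketch below rewrites the devotion labels that way. It is shown only as an alternative, not as what the SurveyGizmo export produces:

```spss
* Same three labels, written with single quotes only.
* Each doubled quote character inside a string stands for one literal character.
VALUE LABELS var603 TO var605
  1 'Couldn''t care less'
  2 'Somewhat devoted'
  3 'Can''t live w/o it!'.
```

Switching the outer quotes, as in the example above, is usually easier to read, which is another reason to default to double quotes.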
The above two examples work fine; however, if we really want to trim down our code, we can use one more shortcut: issue the VALUE LABELS command only once and separate each set of variables and labels with a slash.
VALUE LABELS
var503 TO var505 0 'Unchecked' 1 'Checked'
/ var603 TO var605 1 "Couldn't care less" 2 'Somewhat devoted' 3 "Can't live w/o it!"
.
Hint: I typically like leaving the period on its own line. This gives me a nice, clear visual reminder of where my command stops running. Also, a period at the end of a long line is easy to overlook when you later add more labels.
Another nice tip to be aware of is that if you want to change or add one value label in a list (and not affect the others), you can use the ADD VALUE LABELS command. Let’s say you’ve decided to add a fourth label to the devotion variables. Without restating the original value labels, you can add one more like so:
ADD VALUE LABELS var603 TO var605 6 "I'm in LOVE!".
Granted, in this instance it would probably be easier to just revise the original command. However, if you ever get pre-defined data sets from someone else and they have variables with 50-100 values (think of countries or states), it is very nice to be able to update just the values you want without having to rewrite everything!
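To make the country case concrete, here is a minimal sketch. The variable name country, the value 44, and the label text are all hypothetical, chosen only for illustration; substitute the names and codes from your own file.

```spss
* Hypothetical example: a variable named country that already carries many value labels.
* Only the label for value 44 is changed. All other labels stay as they were.
ADD VALUE LABELS country 44 "United Kingdom".
* List the dictionary entry for the variable to confirm the result.
DISPLAY DICTIONARY /VARIABLES=country.
```

Running DISPLAY DICTIONARY afterwards is a quick way to check that the one label changed and the rest survived.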
Joe Glines is the co-founder of the-Automator, a small company that specializes in automating reporting and daily tasks. He is an expert at SPSS as well as market research, and will be bringing his expertise to the SurveyGizmo blog on a regular basis.