SPSS Variable and Value Labels: A Quick Tutorial

I have hundreds of custom SPSS scripts, syntax files and macros that automate a wide variety of tasks.

Starting to work with syntax can seem daunting, but it does not need to be. In this post, I want to demonstrate a very simple use of syntax that I rely on in most projects.

On nearly every study I work on, I end up graphing my results in Excel. It is very rare for me to graph the questions exactly as they were asked. I typically change the tense, streamline, or abbreviate long words.

SurveyGizmo exports both SPSS syntax and data files. While the data file is convenient, I usually export the syntax file and use it as a starting point for creating a well-defined data file as well as custom labels.

To keep it simple, we’ll work from the root directory of your C drive (“C:”). If you save the files elsewhere, be sure to update the path to reflect where you saved them.

Taking a quick look at the file exported from SurveyGizmo, you might notice a few things that don’t line up perfectly with the questions asked. Special characters such as commas, apostrophes, slashes and periods (,’/. respectively) don’t show up in the labels. Also note that the stem of the question is included in each label. This is helpful when first working with the data file, but you don’t want the stem included when you report.

To get started, let’s create labels that match the questions we asked. The following will do just that:

* Full question wording as variable labels.
VARIABLE LABELS
  Week_Comm 'Get stuck in commute / "traffic"'
  Week_FF "Eat at a fast-food joint like McDonald's, Wendy's or Taco Bell"
  Week_Sing "Sing in the shower, in the car, on a boat, with a goat, crossing a moat, etc."
  Week_Faux "Pretend not to hear the kids fighting so my spouse/partner will deal with them instead of me"
  Week_Wiki "Use Wikipedia™ to learn the obscure origin of a commonly used phrase".

After running the above syntax, you’ll see the variable labels in SPSS now reflect the actual questions that were asked. In cross-tabs or frequency tables, you can get away with having long labels.
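
If you would rather confirm the labels from syntax than scroll through Variable View, something like the sketch below should do it. It assumes the data file is already open and uses only the five variable names from the example above.

```spss
* List the current labels for the five example variables.
DISPLAY LABELS
  /VARIABLES=Week_Comm Week_FF Week_Sing Week_Faux Week_Wiki.
```

Running it again after each label change is a cheap way to catch a typo or a missing closing quote.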

Most of my clients get glassy-eyed very quickly when I show tables, so I try to graph nearly all the data I present to them. Creating good-looking graphs with very long labels is difficult, so I edit the labels down, keeping the meaning while shortening the length. (TIP: Keep a copy of the original questions somewhere handy.)

* Short labels for graphing.
VARIABLE LABELS
  Week_Comm "Stuck in traffic"
  Week_FF "Eat fast food"
  Week_Sing "Sing…"
  Week_Faux "Faux unconsciousness"
  Week_Wiki "Use Wiki".

Now it will be much easier to graph the data and fit it onto a PowerPoint slide that is legible to everyone in the room. Here is an example of how this would look:

For each variable we’ve reviewed, there are three different variable labels, each with its own benefits depending on the need. Since there are three different labels, the question then arises, “Which one is ‘right’ for me?”

I would simply ask: why should you choose? Granted, one version should be saved in your data set. However, that is no reason why you can’t have the other two versions defined in syntax files that can be run using an INCLUDE command before your analysis.

Having separate syntax files allows you to easily work with whichever version is most beneficial to you. I’ve created a syntax file which runs descriptive commands on the variables using the three separate label commands via an INCLUDE command.

Normally, I put the label commands in separate files, then just call the version I want at that point in time. (TIP: In the descriptive command, I’ve sorted the variables in ascending order, lowest to highest, because Excel inverts the data when graphed.)
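
As a rough sketch of that setup, the syntax below switches label sets before each run. The three .sps file names are hypothetical placeholders, each assumed to hold one VARIABLE LABELS command for the five example variables, and the variables are assumed to be numeric. One thing to watch: INCLUDE reads files using batch rules, so in the label files each command must start in the first column and continuation lines must be indented by at least one space.

```spss
* Sketch only: the file names below are hypothetical placeholders.
* Adjust the path if the files are not saved in the root of the C drive.

* Labels as exported from SurveyGizmo.
INCLUDE FILE='C:\labels_exported.sps'.
DESCRIPTIVES VARIABLES=Week_Comm Week_FF Week_Sing Week_Faux Week_Wiki
  /STATISTICS=MEAN
  /SORT=MEAN (A).

* Full question wording.
INCLUDE FILE='C:\labels_full.sps'.
DESCRIPTIVES VARIABLES=Week_Comm Week_FF Week_Sing Week_Faux Week_Wiki
  /STATISTICS=MEAN
  /SORT=MEAN (A).

* Short labels for graphing.
INCLUDE FILE='C:\labels_short.sps'.
DESCRIPTIVES VARIABLES=Week_Comm Week_FF Week_Sing Week_Faux Week_Wiki
  /STATISTICS=MEAN
  /SORT=MEAN (A).
```

In day-to-day work you would normally keep only the INCLUDE line for the version you need. Newer versions of SPSS also offer an INSERT command, which can read files written with interactive rules.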

When your data file has a dozen variables or fewer, creating the various syntax files is probably not worth the effort. My typical data file has several hundred variables, and using syntax files allows me to crank out an incredible amount of work BEFORE data collection is completed. (There is no reason why you can’t begin working on your labels right after you’ve launched your study!)
