Variable names should be short, contain no strange characters, and be informative.
Short variable names are good
Variable names are typed and read constantly when writing code, and long names make that code hard to read. Consequently, it's better to use, say, Q12 than Question12 or WhatDidYouBuyThisBrand.
Further, some software, particularly older software (e.g., earlier versions of SPSS Statistics), requires that variable names be no longer than 8 characters.
Variable names should contain no punctuation or strange characters
Most data analysis software requires variable names to comply with certain rules. For example:
- Older versions of SPSS do not permit the use of punctuation.
- Modern versions of SPSS do not permit periods (i.e., ".") at the end of a variable name.
- Most software doesn't permit spaces in a variable name.
The safe thing to do is to only use letters and numbers and to always start with a letter.
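The safe rule above (letters and numbers only, starting with a letter) is easy to check automatically. The following is a minimal sketch rather than the rule set of any particular package: the function name `is_safe_name` and the optional `max_length` argument are assumptions made for illustration, with `max_length=8` standing in for the older 8-character limit.

```python
import re

# Letters and digits only, starting with a letter (the "safe" rule).
SAFE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_safe_name(name, max_length=None):
    """Return True if name uses only ASCII letters and digits, starts with
    a letter, and (optionally) is no longer than max_length characters."""
    if SAFE_NAME.fullmatch(name) is None:
        return False
    return max_length is None or len(name) <= max_length


names = ["Q12", "Q4a", "2ndChoice", "Q 12", "Q12.", "WhatDidYouBuyThisBrand"]

# Names that break the letters-and-numbers rule.
unsafe = [n for n in names if not is_safe_name(n)]

# Names that are otherwise safe but exceed a hypothetical 8-character limit.
too_long = [n for n in names
            if is_safe_name(n) and not is_safe_name(n, max_length=8)]

print(unsafe)
print(too_long)
```

Running a check like this before exporting a data file catches names that a stricter program might reject or silently rename.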
Good variable names are informative
Having informative names greatly improves the ease with which data can be analyzed.
Poor variable names, such as arbitrary names or a simple sequence (e.g., VAR001, VAR002, VAR003, etc.), make data analysis unnecessarily painful.
Consider a survey with 20 questions. A good set of variable names would be Q1, Q2, ..., Q20, where the numbering corresponds to the question numbering in the written questionnaire.
If the questionnaire contains sections, such as screeners at the beginning and demographics at the end, the variable names can include this information (e.g., S1, S2, …, Q1, Q2, …, C1, C2, …).
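As an illustration, names that follow the questionnaire's sections can be generated from a list of section prefixes and question counts. This is only a sketch: the helper `section_names` and the counts used below are hypothetical.

```python
def section_names(sections):
    """Build names such as S1, S2, Q1, ... from (prefix, question_count) pairs,
    numbering the questions within each section from 1."""
    return [f"{prefix}{i}"
            for prefix, count in sections
            for i in range(1, count + 1)]


# Hypothetical questionnaire: 2 screeners, 3 main questions, 2 closing questions.
names = section_names([("S", 2), ("Q", 3), ("C", 2)])
print(names)  # ['S1', 'S2', 'Q1', 'Q2', 'Q3', 'C1', 'C2']
```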
When variables are part of a set, it's helpful for them to share a common prefix (e.g., Q4a, Q4b, Q4c), rather than each variable having a different question number (e.g., Q231, Q232, Q233). Where a question is a loop of a multiple response question or a grid, this is generally best represented via a common prefix and two separate looping suffixes (e.g., Q4a1, Q4a2, Q4b1, Q4b2).
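One practical benefit of a common prefix is that the variables in a set can be collected automatically. The sketch below assumes names consist of a letter prefix and question number (e.g., Q4) followed by an optional suffix. The pattern and the helper `group_by_question` are illustrative assumptions, not a convention enforced by any particular software.

```python
import re
from collections import defaultdict

# Assumed shape: letters + question number (e.g. "Q4"), then an optional
# suffix such as "a", "b2" or "a1".
NAME_PATTERN = re.compile(r"([A-Za-z]+\d+)(.*)")


def group_by_question(names):
    """Group variable names by their shared question prefix."""
    groups = defaultdict(list)
    for name in names:
        match = NAME_PATTERN.fullmatch(name)
        # Names that do not fit the assumed shape form their own group.
        key = match.group(1) if match else name
        groups[key].append(name)
    return dict(groups)


grouped = group_by_question(["Q3", "Q4a1", "Q4a2", "Q4b1", "Q4b2", "Q5"])
print(grouped["Q4"])  # ['Q4a1', 'Q4a2', 'Q4b1', 'Q4b2']
```

With names such as Q231, Q232, Q233, the same function would return three separate groups, because nothing in the names indicates that the variables belong together.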