In the example below, the first character represents age bands (e.g., 3 may represent 35 to 44), and the second column gender (e.g., 1 = Male, 2 = Female).
Sometimes fixed-width files data contain no return characters (e.g., the data above may be represented as 31321242). Sometimes the first few lines of the data file contain other information (e.g., the name of the study, the number of variables in the study, the number of observations).
The file format is rarely a good choice for storing data
Fixed-width files were common in the early days of computers, as they are they have some efficiencies in terms of data storage and the time taken for a computer to read data from a file. However, their substantial disadvantages mean the format is rarely used today. Key disadvantages are:
- A data dictionary is required to analyze the data. Unless you know that age is in column 2, for example, there's no way of analyzing the data. Similarly, when there are more than 10 possible values that need to represent in a column, other symbols, such as letters in the alphabet are used, meaning that even numbers end up not being stored as numbers, making interpretation difficult.
- The file format is problematic if the data needs to represent changes. For example, if a variable is meant to be stored in a single column, it may become impossible to store all the unique values in that column.