Data should be shaped like a rectangle, because this greatly simplifies data analysis. There are several common ways in which data ends up misshapen.

## The desired rectangular shape

As discussed in Raw Data, efficient data analysis requires that the raw data has a rectangular shape, with rows representing cases and columns representing variables. This article reviews the most common ways that data can be misshapen, making data analysis problematic.
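To make the target shape concrete, here is a minimal sketch using Python and pandas (a choice made for illustration; the article itself does not prescribe any software). The variable names and values are hypothetical.

```python
import pandas as pd

# Hypothetical survey data: each row is one case (a respondent)
# and each column is one variable measured for that case.
data = pd.DataFrame(
    {
        "respondent_id": [1, 2, 3],
        "age": [34, 51, 27],
        "gender": ["F", "M", "F"],
    }
)

# A rectangle is fully described by its number of rows and columns.
n_cases, n_variables = data.shape
print(f"{n_cases} cases x {n_variables} variables")
```

Every cell holds exactly one value for one case on one variable, which is the layout most analysis routines expect.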

## The rectangular shape greatly simplifies data analysis

Data analysis software is a collection of algorithms for analyzing data. These algorithms assume that the data is in a specific format. When the data is not in the shape the algorithms assume, one or more of the following happens:

- The desired calculations cannot be performed until the data is placed into the correct format.
- Specialist algorithms need to be found or created that can perform the calculations with the data.
- Manual calculations are required to glue together results.
- The wrong results are calculated.
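The last consequence is the easiest to miss, because nothing fails visibly. The sketch below, in Python with pandas, uses made-up values and hypothetical column names (`variable`, `value`) to show a calculation that runs without error on misshapen data and still produces a meaningless number, followed by the same calculation after the data is reshaped into a rectangle.

```python
import pandas as pd

# Hypothetical misshapen data: two different variables (age and income)
# are stacked in a single "value" column.
stacked = pd.DataFrame(
    {
        "respondent_id": [1, 1, 2, 2],
        "variable": ["age", "income", "age", "income"],
        "value": [30, 50000, 40, 70000],
    }
)

# This runs without complaint, but it averages ages together with incomes.
misleading_mean = stacked["value"].mean()

# Reshape so that each row is a respondent and each column is a variable.
rectangular = stacked.pivot(
    index="respondent_id", columns="variable", values="value"
)

# The same calculation now gives one interpretable result per variable.
mean_age = rectangular["age"].mean()
mean_income = rectangular["income"].mean()
```

Here `misleading_mean` is 30017.5, which describes neither variable, while the reshaped data gives a mean age of 35 and a mean income of 60000.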

## Common ways in which data is misshapen

Some common mistakes when setting up data files are:

- Messy Rectangles, where there are gaps in the rectangle.
- Too Wide Data File, where the rectangle is wider and shorter than it should be.
- Multiple Variables in a Single Column, where the rectangle is taller and narrower than it should be.
- Multiple Tables, Rather Than a Single Table, where the data is split across several tables instead of being stored in one rectangle.
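As a rough illustration of how two of these problems can be repaired, the following sketch uses Python with pandas. The data, the column names, and the assumption that the correct case is a person (rather than a household) are all hypothetical; the right fix always depends on what a case is meant to be in the analysis.

```python
import pandas as pd

# Too wide: one row per household, with a separate age column per person.
# If the analysis is about people, the rectangle is wider and shorter
# than it should be.
too_wide = pd.DataFrame(
    {
        "household_id": [1, 2],
        "age_person1": [34, 51],
        "age_person2": [36, 49],
    }
)

# Stack the per-person columns so that each row is one person.
people = too_wide.melt(
    id_vars=["household_id"], var_name="person", value_name="age"
)
people["person"] = (
    people["person"].str.replace("age_person", "", regex=False).astype(int)
)
people = people.sort_values(["household_id", "person"]).reset_index(drop=True)

# Multiple tables: the same respondents are described in two separate tables.
demographics = pd.DataFrame({"respondent_id": [1, 2, 3], "age": [34, 51, 27]})
answers = pd.DataFrame({"respondent_id": [1, 2, 3], "satisfaction": [4, 5, 3]})

# Join them into a single rectangle keyed on the respondent.
# validate="one_to_one" raises an error if an ID is duplicated in either table.
combined = demographics.merge(
    answers, on="respondent_id", how="outer", validate="one_to_one"
)
```

After these steps, `people` has one row per person and `combined` has one row per respondent with all variables side by side.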
