Excel files are files saved in one of Microsoft Excel's file formats, most commonly .xls and .xlsx.
Although Excel files are ubiquitous, they are generally a poor file format for data analysis (unless the analysis is done in Excel itself), because:
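- They are limited in size: an .xlsx worksheet holds at most 1,048,576 rows and 16,384 columns, and a worksheet in the older .xls format at most 65,536 rows and 256 columns.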
- They can contain multiple sheets, but most data analysis software typically imports a single sheet and ignores the data in all the others.
- People who use Excel often store information in ways that are ignored when the data is imported, which creates bugs. Examples include comments (often inconsistently formatted), tables of variable definitions, charts, and pictures.
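One way to guard against the multiple-sheet problem is to list a workbook's sheet names before importing it. The following is a minimal sketch in Python using only the standard library. It assumes an .xlsx file, which is a ZIP archive of XML parts, and it will not work for the older binary .xls format. The function name `list_sheet_names` is hypothetical, chosen for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET


def list_sheet_names(xlsx_path):
    """Return the sheet names stored in an .xlsx workbook, in file order."""
    # An .xlsx file is a ZIP archive; the sheet list lives in xl/workbook.xml.
    with zipfile.ZipFile(xlsx_path) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    # Compare local tag names so the XML namespace need not be hard-coded.
    return [
        element.get("name")
        for element in root.iter()
        if element.tag.rsplit("}", 1)[-1] == "sheet"
    ]
```

If the returned list has more than one entry, it is worth confirming which sheet the analysis software will actually read and whether the other sheets hold data that matters.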
Typically, data stored in Excel is saved as a CSV file before it is analyzed. A practical check when doing this is to open the CSV file in Excel and confirm that the data looks as expected.
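The visual check can be supplemented with a quick programmatic one. The sketch below, in Python with only the standard library, reports a few signs that stray spreadsheet content ended up in the CSV: header cells with no name, repeated header names, blank rows, and rows whose field count differs from the header. The function name `check_csv` and the choice of checks are assumptions for illustration, not a complete validation. The default encoding `utf-8-sig` is also an assumption; it reads UTF-8 files with or without a byte-order mark.

```python
import csv
from collections import Counter


def check_csv(csv_path, encoding="utf-8-sig"):
    """Report simple structural problems in a CSV file.

    Row numbers are 1-based, with the header counted as row 1.
    """
    with open(csv_path, newline="", encoding=encoding) as handle:
        rows = list(csv.reader(handle))

    report = {
        "columns": 0,
        "data_rows": 0,
        "unnamed_columns": [],    # 1-based positions of empty header cells
        "duplicate_headers": [],  # header names that occur more than once
        "blank_rows": [],         # rows in which every cell is empty
        "ragged_rows": [],        # rows whose field count differs from the header
    }
    if not rows:
        return report

    header = [name.strip() for name in rows[0]]
    report["columns"] = len(header)
    report["unnamed_columns"] = [
        position for position, name in enumerate(header, start=1) if not name
    ]
    counts = Counter(name for name in header if name)
    report["duplicate_headers"] = sorted(
        name for name, count in counts.items() if count > 1
    )

    for row_number, row in enumerate(rows[1:], start=2):
        if not any(cell.strip() for cell in row):
            report["blank_rows"].append(row_number)
            continue
        report["data_rows"] += 1
        if len(row) != len(header):
            report["ragged_rows"].append(row_number)
    return report
```

An unnamed column or a blank row in the report often points to a note, a second table, or layout spacing in the original sheet, so those positions are a good place to start when inspecting the file by eye.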