The first step in data analysis is to import a data file. Before this can be done, a data file needs to be created or obtained. This can be done in several ways:
- Integrations between data analysis and data collection software
- Obtaining a file from another person or organization
- Exporting a file from data collection software
- Creating a file using data collection software
- Creating a data file "by hand"
- Database queries
- APIs
In general, the higher the option in the list above, the better. That is, integrations are usually the best way and APIs are usually the worst.
Integrations between data analysis and data collection software
An integration is where you can click a button in your data analysis software and it automatically pulls the data file in from wherever it is stored. For example, Displayr allows users to import data directly from Qualtrics, SurveyMonkey, and Decipher.
Typically, integrations can only be used when the user has passwords and login credentials for the software that stores the data.
Integrations need to be programmed by the engineers who build the data analysis software, using the API of the software that stores the data.
Integrations have a couple of big advantages over the other options:
- They are quick and easy to use.
- The integration will have been designed to ensure that the best available data file is imported.
Obtaining a file from another person or organization
Not all data files are created equal. It's important to get the best data file format available. For more information, see Obtaining a Data File From Another Person or Organization.
Exporting a file from data collection software
All the major data collection platforms have the ability to export data files. The basic process is to take the time to identify all the export options and choose the best file, following the same process as described in Obtaining a Data File From Another Person or Organization.
Creating a file using data collection software
Sometimes it is necessary to create your own data file, for example when data has been collected on paper questionnaires. This can be done using data collection software. Some data collection software has specialist data entry tools to make this process efficient. However, even when the software doesn't have such tools, a file can be created by:
- Setting up the software for online data collection.
- Re-entering any data into the software.
- Exporting a file (see the previous section).
Creating a data file "by hand"
Data files can also be created manually, in three steps:
- Creating a Flat File Using a Spreadsheet (this can also be done in SPSS Statistics).
- Reading the flat file into data analysis software and adding metadata.
- Either:
  - Analyzing the data in the data analysis software used to add the metadata, or
  - Saving the data file containing the metadata (e.g., saving it as an SPSS Data File (.SAV)).
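Reading a flat file and attaching metadata to it can be sketched in Python as follows. This is a minimal illustration using only the standard library, not a description of how any particular data analysis software works. The column names, codes, labels, and the JSON bundle are all hypothetical; real software would store the metadata in its own format (e.g., an SPSS .SAV file).

```python
import csv
import io
import json

# A tiny flat file, as it might be typed into a spreadsheet and saved as CSV.
# The column names and codes are made up for illustration.
FLAT_FILE = """id,q1_satisfaction,gender
1,5,1
2,3,2
3,4,2
"""

# Metadata that a flat file cannot hold: variable labels and value labels.
METADATA = {
    "q1_satisfaction": {
        "label": "Overall satisfaction",
        "values": {"3": "Neutral", "4": "Satisfied", "5": "Very satisfied"},
    },
    "gender": {
        "label": "Gender",
        "values": {"1": "Male", "2": "Female"},
    },
}


def read_flat_file(text):
    """Read CSV text into a list of dicts (one dict per respondent)."""
    return list(csv.DictReader(io.StringIO(text)))


def apply_value_labels(rows, metadata):
    """Return a copy of the rows with coded values replaced by their labels."""
    labelled = []
    for row in rows:
        new_row = {}
        for name, value in row.items():
            values = metadata.get(name, {}).get("values", {})
            # Values without a label (e.g., the id column) pass through unchanged.
            new_row[name] = values.get(value, value)
        labelled.append(new_row)
    return labelled


rows = read_flat_file(FLAT_FILE)
labelled_rows = apply_value_labels(rows, METADATA)

# Keep data and metadata together so the labels are not lost on re-import.
bundle = json.dumps({"metadata": METADATA, "data": rows})
```

The point of the sketch is that the flat file holds only codes, and the labels live alongside it. Saving the two together plays the role that a format such as .SAV plays in practice.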
Database queries
Data files can be imported from relational databases, typically via a SQL query. See Getting Survey Data via Database Queries (e.g., SQL).
Typically, the resulting file will be a flat data file (e.g., a CSV file), so it is then necessary to:
- Read the flat file into data analysis software and add metadata.
- Either:
  - Analyze the data in the data analysis software used to add the metadata, or
  - Save the data file containing the metadata (e.g., save it as an SPSS Data File (.SAV)).
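As a rough sketch of what a database query that produces a flat file can look like, the Python example below runs a SQL query and writes the result as CSV text. It assumes an in-memory SQLite database so that it is self-contained; the `responses` table, its columns, and the `query_to_csv` helper are hypothetical, and a real survey database would have its own schema and connection details.

```python
import csv
import io
import sqlite3

# Build a small in-memory database so the example is self-contained.
# The table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (id INTEGER, q1 INTEGER, country TEXT)")
conn.executemany(
    "INSERT INTO responses VALUES (?, ?, ?)",
    [(1, 5, "AU"), (2, 3, "US"), (3, 4, "AU")],
)


def query_to_csv(connection, sql, params=()):
    """Run a SQL query and return the result as CSV text with a header row."""
    cursor = connection.execute(sql, params)
    # cursor.description holds one tuple per result column; item 0 is the name.
    header = [column[0] for column in cursor.description]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(cursor)
    return buffer.getvalue()


csv_text = query_to_csv(
    conn,
    "SELECT id, q1 FROM responses WHERE country = ? ORDER BY id",
    ("AU",),
)
conn.close()
```

The resulting CSV contains only column names and values, which is why metadata still has to be added afterwards.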
APIs
APIs can be used to connect to data where:
- The place where the data is stored (e.g., the data collection platform) has an API that allows users to write code to extract data from it.
- The user has the necessary credentials (e.g., passwords) to use the API.
- The data analysis software allows users to write code to import data.
- The user has sufficient expertise to write the code. This last requirement is non-trivial. The programming skills required to use an API are different from those required to analyze data; it's not something that a skilled analyst can figure out on their own.
By definition, all integrations use APIs. However, they do so in the background, without the user having to do any programming because the programming has already been performed.
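To give a sense of the kind of code involved, here is a minimal Python sketch of pulling responses from an API. Everything specific in it is an assumption made for illustration: the URL, the bearer-token authentication, and the `data` and `next` fields in the JSON are hypothetical, and each real platform documents its own endpoints, authentication, and pagination.

```python
import json
import urllib.request

# Hypothetical endpoint and token, for illustration only.
API_URL = "https://api.example.com/v1/surveys/123/responses"
API_TOKEN = "YOUR_API_TOKEN"


def http_get_json(url, token):
    """Fetch a URL with a bearer token and decode the JSON body."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_all_responses(url, token, get_json=http_get_json):
    """Collect rows from every page by following an assumed 'next' link."""
    rows = []
    while url:
        page = get_json(url, token)
        rows.extend(page["data"])  # assumed field holding the responses
        url = page.get("next")  # assumed pagination field; None ends the loop
    return rows


# Usage (requires a real endpoint and credentials):
# responses = fetch_all_responses(API_URL, API_TOKEN)
```

Even this short sketch has to deal with authentication, pagination, and decoding, which is the work an integration does on the user's behalf.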