Steps Involved in Secondary Data Collection
Step 1 - Identify the topic of research - find a topic that you would like to pursue for your research.
Step 2 - Identify research sources - find information sources that will provide the most relevant data and information applicable to your research.
Step 3 - Collect existing data - check for any existing data that are closely related to your research topic.
Step 4 - Combine and compare - combine the data, compare them to identify any duplication, and assemble them into a usable format. Make sure to collect data from authentic/verified sources. Incorrect data will severely hamper your research.
Step 5 - Analyse data - analyse all the data collected and determine whether all of your research questions have been answered.
Source: The above five steps involved in secondary data collection were adapted from QuestionPro.
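For researchers who handle their data programmatically, Step 4 (combine and compare) can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed method: the source names (`report_a`, `report_b`), the field names (`region`, `year`, `value`) and all the values are invented for the example, and the function `combine_and_compare` is a hypothetical helper.

```python
def combine_and_compare(sources, key_fields):
    """Merge records from several sources and drop exact duplicates.

    Returns (combined, conflicts): `combined` holds one record per key,
    and `conflicts` lists the keys on which two sources disagree.
    """
    combined = {}
    conflicts = []
    for source_name, records in sources.items():
        for record in records:
            # The key identifies "the same observation" across sources.
            key = tuple(record[field] for field in key_fields)
            if key not in combined:
                combined[key] = {**record, "source": source_name}
                continue
            first = combined[key]
            agrees = all(first.get(f) == v for f, v in record.items())
            if not agrees:
                # Same observation, different values: flag it for checking.
                conflicts.append((key, first["source"], source_name))
            # Exact duplicates are simply skipped.
    return list(combined.values()), conflicts


# Invented example data standing in for two secondary sources.
sources = {
    "report_a": [
        {"region": "North", "year": 2020, "value": 10},
        {"region": "South", "year": 2020, "value": 12},
    ],
    "report_b": [
        {"region": "North", "year": 2020, "value": 10},  # duplicate
        {"region": "South", "year": 2020, "value": 15},  # disagrees
        {"region": "East", "year": 2020, "value": 9},
    ],
}

combined, conflicts = combine_and_compare(sources, ("region", "year"))
print(len(combined), "unique records")
print("conflicts to verify:", conflicts)
```

Any conflicts flagged this way would still need to be checked against the original, verified sources before the analysis in Step 5.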
Secondary data are useful for a range of purposes:
- identifying the research problem,
- developing an approach to the problem or to a sampling plan,
- conceptualising appropriate research designs,
- answering research questions,
- testing hypotheses, and
- validating qualitative findings.
However, the usefulness of secondary data may be limited by how relevant and accurate they are for the issue at hand (Malhotra and Birks, 2007).
Some types of data may be primary sources for some purposes and secondary sources for others. For example, a high school textbook on Latvian history is classified as a secondary source, but the same book would be a primary source of data if the researcher were studying the changing emphasis on national integration in high school history textbooks.
When collecting secondary data, researchers may consider the following questions to test the reliability of their data:
- Who originally made the discoveries or brought the conclusions in this document to light?
- Who collected the data?
- What were their sources?
- Did the author(s) who collected the data use proper methods?
- When (and, if it is relevant, at what time) were the data collected?
- Do the author(s)/researcher(s) who collected the data show evidence of bias?
In secondary sources, since the focus is on analysing or discussing a primary source, look for words that describe the author's action and indicate that the work is an analysis or discussion, such as:
- analysis
- synthesis
- overview
- appraisal/evaluation
- reported on
Watch the video below (2 minutes, 7 seconds) to learn how to cite secondary sources in APA format:
Table 1. Advantages and Disadvantages of Using Secondary Data in Social Inquiry
| Advantages | Disadvantages | 
| Cost-effective: The costs are shared or already paid. | Data might be dated | 
| Ease of access: The data have been already collected and analysed. | Objectivity of data: The data can be biased in favour of the researcher(s) who gathered the data. | 
| Time-saving: Others have already spent time researching the specific phenomenon. | Complexity of data: Researchers who use secondary data may need more time to fully understand the essence of data | 
| Allow researchers to generate new insights from earlier research | Generic or off-target: Not specific to your needs as a researcher | 
| Secondary data drawn from credible sources: The data is usually gathered by experienced researchers affiliated with large organisations, such as OECD-PISA. |  |
| Opportunity for longitudinal analysis |  |
Source: adapted from Bryman et al. (2019).
Primary data is gathered first-hand by a researcher for a specific research goal, usually through methods such as experiments, questionnaires, interviews, observations, and focus groups. Secondary data, by contrast, comprises “pre-existing data that was originally collected for a different research purpose or by someone other than the researcher” (Given, 2008, p. 803). In other words, secondary data has previously been collected, usually by other researchers and for reasons other than your research. With that in mind, “by virtue of being archived and made available, any type of primary data can serve as secondary data” (Hox & Boeije, 2005, p. 596). It should be noted that secondary data may be either published or unpublished.
Researchers using secondary data sources draw their findings from: (i) the already collected, analysed and completed work of other researchers, scholars and writers (e.g., books, articles or reports by historians, anthropologists, sociologists, teachers, journalists); (ii) records produced as a result of everyday activities within various organisational contexts (e.g., government offices, non-profit organisations, private businesses and corporations); and (iii) personal memoirs (e.g., correspondence, diaries, photographs).
In the contemporary context of digital technologies and communications, the internet is considered a rich and expanding resource for sociological analysis. It offers various types of text, both written and image-based (e.g., website pages, Facebook, Twitter, blogs), alongside more traditional sources such as periodicals, newspapers, or magazines from any period in history. In social research, however, secondary data is not limited to those sources! Watch the video below (1 minute, 11 seconds) to learn more about secondary data sources.