Empirical Research: Definition, Methods, Types and Examples

What is Empirical Research

Content Index

  • Empirical research: Definition
  • Empirical research: Origin
  • Quantitative research methods
  • Qualitative research methods
  • Steps for conducting empirical research
  • Empirical research methodology cycle
  • Advantages of empirical research
  • Disadvantages of empirical research
  • Why is there a need for empirical research?

Empirical research is defined as any research in which the conclusions of the study are drawn strictly from concrete, and therefore "verifiable", empirical evidence.

This empirical evidence can be gathered using quantitative market research and  qualitative market research  methods.

For example: suppose a study is conducted to find out whether listening to happy music in the workplace promotes creativity. An experiment is conducted using a music website survey, in which one set of participants is exposed to happy music while another set listens to no music at all, and both groups are then observed. The results of such a study provide empirical evidence of whether happy music does or does not promote creativity.

You must have heard the quote "I will not believe it unless I see it." This came from the ancient empiricists, a fundamental understanding that powered the emergence of medieval science during the Renaissance and laid the foundation of modern science as we know it today. The word itself has its roots in Greek: it is derived from the Greek word empeirikos, which means "experienced".

In today's world, the word "empirical" refers to data collected through observation, experience, or calibrated scientific instruments. All of these origins have one thing in common: a dependence on observation and experiment to collect data and test it in order to reach conclusions.

Types and methodologies of empirical research

Empirical research can be conducted and analysed using qualitative or quantitative methods.

  • Quantitative research: Quantitative research methods are used to gather information through numerical data. They are used to quantify opinions, behaviors, or other defined variables. These methods are predetermined and more structured. Some of the commonly used methods are surveys, longitudinal studies, polls, etc.
  • Qualitative research: Qualitative research methods are used to gather non-numerical data. They are used to find meanings, opinions, or the underlying reasons from subjects. These methods are unstructured or semi-structured. The sample size for such research is usually small, and the methods are conversational in nature, providing more insight or in-depth information about the problem. Some of the most popular methods are focus groups, experiments, interviews, etc.

Data collected through these methods then needs to be analysed. Empirical evidence can be analysed either quantitatively or qualitatively. Using this analysis, the researcher can answer empirical questions, which have to be clearly defined and answerable with the findings obtained. The type of research design used will vary depending on the field in which it is applied. Many researchers choose to combine quantitative and qualitative methods to better answer questions which cannot be studied in a laboratory setting.

Quantitative research methods aid in analyzing the empirical evidence gathered. Using these methods, a researcher can find out whether the hypothesis is supported or not.

  • Survey research: Survey research generally involves a large audience in order to collect a large amount of data. It is a quantitative method with a predetermined set of closed questions which are easy to answer. Because of the simplicity of the method, high response rates are achieved. It is one of the most commonly used methods for all kinds of research in today's world.

Previously, surveys were conducted face to face only, perhaps with a recorder. However, with advances in technology and for ease of use, new mediums such as email and social media have emerged.

For example: depletion of energy resources is a growing concern, and hence there is a need for awareness about renewable energy. According to recent studies, fossil fuels still account for around 80% of energy consumption in the United States. Even though the use of green energy rises every year, certain factors still keep the general population from opting for it. To understand why, a survey can be conducted to gather opinions about green energy and the factors that influence the choice of switching to renewable energy. Such a survey can help institutions or governing bodies promote appropriate awareness and incentive schemes to push the use of greener energy.
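
To make this concrete, here is a minimal Python sketch of how closed-ended answers from such a green-energy survey might be summarized. The response counts are invented for illustration only, and the margin of error uses a simple normal approximation.

```python
# Hypothetical sketch: summarizing closed-ended answers from a renewable-energy
# survey. The counts below are invented for illustration only.
import math

responses = {"Willing to switch": 412, "Not willing": 163, "Undecided": 85}
n = sum(responses.values())

for answer, count in responses.items():
    p = count / n
    # Approximate 95% margin of error for a proportion (normal approximation)
    margin = 1.96 * math.sqrt(p * (1 - p) / n)
    print(f"{answer}: {p:.1%} (±{margin:.1%})")
```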

  • Experimental research: In experimental research, an experiment is set up and a hypothesis is tested by creating a situation in which one of the variables is manipulated. This is also used to check cause and effect: the experiment tests what happens to the dependent variable when the independent variable is manipulated, removed, or altered. The process for such a method usually involves proposing a hypothesis, experimenting on it, analyzing the findings, and reporting the findings to understand whether they support the theory or not.

For example: a product company is trying to find out why it is unable to capture the market. The organisation makes changes in each of its processes, such as manufacturing, marketing, sales, and operations. Through the experiment it learns that sales training directly impacts the market coverage of its product: if the sales staff are trained well, the product will have better coverage.

  • Correlational research: Correlational research is used to find the relationship between two sets of variables. Regression analysis is generally used to predict outcomes with such a method. The correlation can be positive, negative, or zero.

For example: consider the hypothesis that individuals with higher levels of education get higher-paying jobs. A positive correlation between education and income would support this claim, implying that less education tends to go with lower-paying jobs, although correlation alone would not prove that education causes higher pay.
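
As an illustration only, the short Python sketch below computes a correlation coefficient and a simple regression line for invented education and income figures. It uses the standard library's statistics module (Python 3.10+); the numbers are not real data.

```python
# Hypothetical sketch: correlating years of education with annual income.
# The figures below are invented for illustration only.
from statistics import correlation, linear_regression  # Python 3.10+

years_of_education = [10, 12, 12, 14, 16, 16, 18, 20]
annual_income_usd = [28000, 32000, 35000, 41000, 52000, 55000, 63000, 70000]

r = correlation(years_of_education, annual_income_usd)  # Pearson's r, between -1 and +1
fit = linear_regression(years_of_education, annual_income_usd)

print(f"Pearson correlation: {r:.2f}")  # a value near +1 indicates a strong positive correlation
print(f"Predicted income at 15 years of education: {fit.slope * 15 + fit.intercept:,.0f} USD")
```

A strong positive r would be consistent with the stated relationship, but, as noted above, correlation by itself does not establish causation.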

  • Longitudinal study: A longitudinal study is used to understand the traits or behavior of a subject under observation by testing the subject repeatedly over a period of time. Data collected with such a method can be qualitative or quantitative in nature.

For example: a study to find out the benefits of exercise. The participants are asked to exercise every day for a particular period of time, and the results show higher endurance, stamina, and muscle growth. This supports the claim that exercise benefits an individual's body.

  • Cross-sectional study: A cross-sectional study is an observational method in which a set of subjects is observed at a single point in time. The subjects are chosen so that they are similar in all variables except the one being researched. This type does not enable the researcher to establish a cause-and-effect relationship, as the subjects are not observed over a continuous time period. It is mainly used by the healthcare sector and the retail industry.

For example: a medical study to find the prevalence of under-nutrition disorders in children of a given population. This involves looking at a wide range of parameters such as age, ethnicity, location, income, and social background. If a significant number of children from poor families show under-nutrition disorders, the researcher can investigate further. Usually a cross-sectional study is followed by a longitudinal study to find out the exact reason.

  • Causal-comparative research: This method is based on comparison. It is mainly used to find out the cause-and-effect relationship between two or more variables.

For example: a researcher measures the productivity of employees in a company that gives breaks during work and compares it to the productivity of employees in a company that gives no breaks at all.

Some research questions need to be analysed qualitatively, as quantitative methods are not applicable. In many cases in-depth information is needed, or the researcher needs to observe the behavior of a target audience, so the results take the form of a descriptive analysis. Qualitative research results are descriptive rather than predictive. They enable the researcher to build or support theories for future quantitative research. In such situations, qualitative research methods are used to derive a conclusion that supports the theory or hypothesis being studied.

  • Case study: The case study method is used to find more information by carefully analyzing existing cases. It is very often used in business research or to gather empirical evidence for investigation purposes. It is a method of investigating a problem within its real-life context through existing cases. The researcher has to analyze carefully, making sure the parameters and variables in the existing case are the same as in the case being investigated. Using the findings from the case study, conclusions can be drawn about the topic under study.

For example: a report describing the solution a company provided to its client, the challenges faced during initiation and deployment, the findings of the case, and the solutions offered for the problems. Such case studies are used by many companies because they form empirical evidence the company can promote in order to win more business.

  • Observational method: The observational method is a process of observing and gathering data from a target. Since it is a qualitative method, it is time-consuming and very personal. It can be said that the observational method is a part of ethnographic research, which is also used to gather empirical evidence. It is usually a qualitative form of research; however, in some cases it can be quantitative as well, depending on what is being studied.

For example: setting up a study to observe a particular animal in the rainforests of the Amazon. Such research usually takes a lot of time, as observation has to be carried out for a set period to study the patterns or behavior of the subject. Another widely used example is observing people shopping in a mall to figure out the buying behavior of consumers.

  • One-on-one interview: This method is purely qualitative and one of the most widely used, because it enables a researcher to obtain precise, meaningful data if the right questions are asked. It is a conversational method in which in-depth data can be gathered depending on where the conversation leads.

For example: A one-on-one interview with the finance minister to gather data on financial policies of the country and its implications on the public.

  • Focus groups: Focus groups are used when a researcher wants to find answers to "why", "what", and "how" questions. A small group is generally chosen for this method, and it is not always necessary to interact with the group in person; a moderator is generally needed when the group is addressed in person. Focus groups are widely used by product companies to collect data about their brands and products.

For example: a mobile phone manufacturer wants feedback on the dimensions of one of its models that is yet to be launched. Such studies help the company meet customer demand and position the model appropriately in the market.

  • Text analysis: The text analysis method is relatively new compared to the other types. It is used to analyse social life by examining the words and images individuals use. In today's world, with social media playing a major part in everyone's life, this method enables the researcher to follow patterns that relate to the study.

For example: many companies ask customers for detailed feedback describing how satisfied they are with the customer support team. Such data enables the researcher to make appropriate decisions to improve the support team.
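
As a simple illustration, the sketch below counts how often selected keywords appear in a handful of invented feedback comments. Real text analysis tools are far more sophisticated, but the principle of turning words into countable patterns is the same.

```python
# Hypothetical sketch: a very simple text analysis of customer-feedback comments,
# counting how often selected keywords appear. The comments are invented.
from collections import Counter
import re

feedback = [
    "Support team was quick and helpful",
    "Slow response, but the agent was helpful in the end",
    "Very slow support, still waiting for a refund",
]

keywords = {"helpful", "slow", "quick", "refund"}
counts = Counter(
    word
    for comment in feedback
    for word in re.findall(r"[a-z]+", comment.lower())
    if word in keywords
)

print(counts.most_common())  # e.g. [('helpful', 2), ('slow', 2), ('quick', 1), ('refund', 1)]
```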

Sometimes a combination of methods is needed for questions that cannot be answered using only one type of method, especially when a researcher needs to gain a complete understanding of a complex subject.

Since empirical research is based on observation and capturing experiences, it is important to plan the steps for conducting the experiment and for analysing the results. This enables the researcher to resolve problems or obstacles that may occur during the experiment.

Step #1: Define the purpose of the research

This is the step where the researcher has to answer questions like: What exactly do I want to find out? What is the problem statement? Are there any issues in terms of the availability of knowledge, data, time, or resources? Will this research be more beneficial than what it will cost?

Before going ahead, a researcher has to clearly define his purpose for the research and set up a plan to carry out further tasks.

Step #2: Supporting theories and relevant literature

The researcher needs to find out whether there are theories that can be linked to the research problem, and whether any theory can help support the findings. Reviewing all relevant literature will help the researcher discover whether others have researched the topic before and what problems they faced. The researcher will also have to set up assumptions and find out whether there is any history regarding the research problem.

Step #3: Creation of Hypothesis and measurement

Before beginning the actual research, the researcher needs to form a working hypothesis, or an educated guess about the probable result. The researcher has to set up the variables, decide on the environment for the research, and work out how the variables are related.

The researcher will also need to define the units of measurement and the tolerable margin of error, and determine whether the chosen measurements will be accepted by others.

Step #4: Methodology, research design and data collection

In this step, the researcher defines a strategy for conducting the research and sets up experiments to collect the data needed to test the hypothesis. The researcher decides whether an experimental or non-experimental method is required; the type of research design will vary depending on the field in which the research is being conducted. The researcher also has to identify the parameters that will affect the validity of the research design. Data collection needs to be done by choosing samples appropriate to the research question, using one of the many available sampling techniques. Once data collection is complete, the researcher has empirical data that needs to be analysed.

Step #5: Data Analysis and result

Data analysis can be done in two ways: qualitatively and quantitatively. The researcher needs to decide whether a qualitative method, a quantitative method, or a combination of both is required. Depending on the analysis of the data, the researcher will know whether the hypothesis is supported or rejected. Analyzing this data is the most important part of supporting the hypothesis.
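
For instance, returning to the happy-music example from earlier, a quantitative analysis might compare the creativity scores of the two groups with a two-sample t-test. The sketch below is illustrative only: the scores are invented, and it assumes SciPy is installed.

```python
# Hypothetical sketch: comparing creativity scores of a "happy music" group and
# a "no music" group. Scores are invented; requires SciPy (pip install scipy).
from scipy import stats

music_group = [7.1, 6.8, 7.9, 8.2, 7.4, 6.9, 8.0, 7.6]
no_music_group = [6.2, 6.9, 6.5, 7.0, 6.1, 6.7, 6.4, 6.8]

t_stat, p_value = stats.ttest_ind(music_group, no_music_group, equal_var=False)  # Welch's t-test
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

# A small p-value (commonly below 0.05) would count as support for, not proof of,
# the hypothesis that happy music is associated with higher creativity scores.
```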

Step #6: Conclusion

A report needs to be written up with the findings of the research. The researcher can present the theories and literature that support the research, and make suggestions or recommendations for further research on the topic.

Empirical research methodology cycle

A.D. de Groot, a renowned Dutch psychologist and chess expert, conducted some of the most notable experiments using chess in the 1940s. During his work he came up with a cycle that is now widely used to conduct empirical research. It consists of five phases, each as important as the next. The empirical cycle captures the process of coming up with a hypothesis about how certain subjects work or behave and then testing that hypothesis against empirical data in a systematic and rigorous way. It can be said to characterize the deductive approach to science. The phases of the empirical cycle are as follows.

  • Observation: At this phase, an idea is sparked for proposing a hypothesis, and empirical data is gathered through observation. For example: a particular species of flower blooms in a different color only during a specific season.
  • Induction: Inductive reasoning is then carried out to form a general conclusion from the data gathered through observation. For example: having observed that the species of flower blooms in a different color during a specific season, a researcher may ask, "Does the temperature in that season cause the color change in the flower?" The researcher can assume this is the case; however, it is mere conjecture, so an experiment needs to be set up to test the hypothesis. The researcher therefore tags a set of flowers kept at a different temperature and observes whether they still change color.
  • Deduction: This phase helps the researcher deduce a conclusion from the experiment, based on logic and rationality, to arrive at specific, unbiased results. For example: in the experiment, if the tagged flowers kept in a different temperature environment do not change color, it can be concluded that temperature plays a role in changing the color of the bloom.
  • Testing: In this phase the researcher returns to empirical methods to put the hypothesis to the test. The researcher now needs to make sense of the data, and therefore needs a statistical analysis plan to determine the relationship between temperature and bloom color. If the researcher finds that most flowers bloom a different color when exposed to a certain temperature while the others do not at other temperatures, the hypothesis is supported. Note that this is not proof, only support for the hypothesis.
  • Evaluation: This phase is often forgotten but is an important one for continuing to gain knowledge. During this phase the researcher presents the data collected, the supporting argument, and the conclusion. The researcher also states the limitations of the experiment and the hypothesis, and suggests how others can pick it up and continue more in-depth research in the future.

There is a reason why empirical research is one of the most widely used methods: it has several advantages, a few of which are listed below.

  • It is used to authenticate traditional research through various experiments and observations.
  • This research methodology makes the research being conducted more competent and authentic.
  • It enables the researcher to understand the dynamic changes that can happen and to adjust the strategy accordingly.
  • The level of control in such research is high, so the researcher can control multiple variables.
  • It plays a vital role in increasing internal validity.

Even though empirical research makes the research more competent and authentic, it does have a few disadvantages, listed below.

  • Such research needs patience, as it can be very time-consuming. The researcher has to collect data from multiple sources, and quite a few parameters are involved, which makes the research time-consuming.
  • Most of the time, a researcher will need to conduct research at different locations or in different environments, which can make it an expensive affair.
  • There are rules governing how experiments can be performed, and hence permissions are needed. Many times it is very difficult to obtain certain permissions to carry out different methods of this research.
  • Collection of data can sometimes be a problem, as it has to be collected from a variety of sources through different methods.

Empirical research is important in today's world because most people believe only in what they can see, hear, or experience. It is used to validate multiple hypotheses, increase human knowledge, and keep advancing various fields.

For example: pharmaceutical companies use empirical research to test a specific drug on controlled or random groups to study cause and effect. In this way, they gather support for the theories they have proposed for that drug. Such research is very important, as it can sometimes lead to finding a cure for a disease that has existed for many years. It is useful in science and many other fields such as history, the social sciences, and business.

With the advances in today's world, empirical research has become critical and a norm in many fields for supporting hypotheses and gaining knowledge. The methods mentioned above are very useful for carrying out such research. However, new methods will keep emerging as the nature of investigative questions changes.

Quantitative vs. Qualitative Research: The Differences Explained

From Scribbr 

Empirical Research

What is empirical research?

"Empirical research is research that is based on observation and measurement of phenomena, as directly experienced by the researcher. The data thus gathered may be compared against a theory or hypothesis, but the results are still based on real life experience. The data gathered is all primary data, although secondary data from a literature review may form the theoretical background."

Emerald Publishing (n.d.). How to... conduct empirical research. https://www.emeraldgrouppublishing.com/how-to/research-methods/conduct-empirical-research-l 

  • Quantitative vs. Qualitative
  • Data Collection Methods
  • Analyzing Data

When collecting and analyzing data,  quantitative research  deals with numbers and statistics, while  qualitative research  deals with words and meanings. Both are important for gaining different kinds of knowledge.

Quantitative research

Common quantitative methods include experiments, observations recorded as numbers, and surveys with closed-ended questions.

Qualitative research

Common qualitative methods include interviews with open-ended questions, observations described in words, and literature reviews that explore concepts and theories.

Streefkerk, R. (2022, February 7). Qualitative vs. quantitative research: Differences, examples & methods. Scribbr. https://www.scribbr.com/methodology/qualitative-quantitative-research/

Quantitative and qualitative data can be collected using various methods. It is important to use a  data collection  method that will help answer your research question(s).

Many data collection methods can be either qualitative or quantitative. For example, in surveys, observations or  case studies , your data can be represented as numbers (e.g. using rating scales or counting frequencies) or as words (e.g. with open-ended questions or descriptions of what you observe).

However, some methods are more commonly used in one type or the other.

Quantitative data collection methods

  • Surveys :  List of closed or multiple choice questions that is distributed to a  sample  (online, in person, or over the phone).
  • Experiments :  Situation in which  variables  are controlled and manipulated to establish cause-and-effect relationships.
  • Observations:  Observing subjects in a natural environment where variables can’t be controlled.

Qualitative data collection methods

  • Interviews : Asking open-ended questions verbally to respondents.
  • Focus groups:  Discussion among a group of people about a topic to gather opinions that can be used for further research.
  • Ethnography : Participating in a community or organization for an extended period of time to closely observe culture and behavior.
  • Literature review :  Survey of published works by other authors.

When to use qualitative vs. quantitative research

A rule of thumb for deciding whether to use qualitative or quantitative data is:

  • Use quantitative research if you want to  confirm or test something  (a theory or hypothesis)
  • Use qualitative research if you want to  understand something  (concepts, thoughts, experiences)

For most  research topics  you can choose a qualitative, quantitative or  mixed methods approach . Which type you choose depends on, among other things, whether you’re taking an  inductive vs. deductive research approach ; your  research question(s) ; whether you’re doing  experimental ,  correlational , or  descriptive research ; and practical considerations such as time, money, availability of data, and access to respondents.

Streefkerk, R. (2022, February 7). Qualitative vs. quantitative research: Differences, examples & methods. Scribbr. https://www.scribbr.com/methodology/qualitative-quantitative-research/

Qualitative or quantitative data by itself can’t prove or demonstrate anything, but has to be analyzed to show its meaning in relation to the research questions. The method of analysis differs for each type of data.

Analyzing quantitative data

Quantitative data is based on numbers. Simple math or more advanced  statistical analysis  is used to discover commonalities or patterns in the data. The results are often reported in graphs and tables.

Applications such as Excel, SPSS, or R can be used to calculate things like:

  • Average scores
  • The number of times a particular answer was given
  • The  correlation or causation  between two or more variables
  • The reliability and validity of the results (a short illustrative sketch of a few of these calculations follows this list)
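
To illustrate a few of the calculations listed above in a very reduced form, the sketch below computes item averages, answer counts, and Cronbach's alpha (one common reliability estimate) for an invented three-item satisfaction survey, using only Python's standard library.

```python
# Hypothetical sketch: basic quantitative summaries of invented ratings from a
# three-item satisfaction survey answered on a 1-5 scale.
from collections import Counter
from statistics import mean, variance

ratings = [          # rows = respondents, columns = survey items
    [4, 5, 4],
    [3, 3, 4],
    [5, 5, 5],
    [2, 3, 2],
    [4, 4, 5],
]

items = list(zip(*ratings))                       # regroup scores by item
print("Item means:", [round(mean(item), 2) for item in items])
print("Answer counts:", Counter(score for row in ratings for score in row))

# Cronbach's alpha: a simple internal-consistency (reliability) estimate
k = len(items)
totals = [sum(row) for row in ratings]
alpha = k / (k - 1) * (1 - sum(variance(item) for item in items) / variance(totals))
print(f"Cronbach's alpha: {alpha:.2f}")
```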

Analyzing qualitative data

Qualitative data is more difficult to analyze than quantitative data. It consists of text, images or videos instead of numbers.

Some common approaches to analyzing qualitative data include:

  • Qualitative content analysis: Tracking the occurrence, position and meaning of words or phrases (a brief illustrative sketch follows this list)
  • Thematic analysis : Closely examining the data to identify the main themes and patterns
  • Discourse analysis : Studying how communication works in social contexts
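
To illustrate the first approach above in a very reduced form, the sketch below tags invented interview answers with themes from a small keyword dictionary and counts how often each theme occurs. Real content or thematic analysis involves far more careful, iterative coding; this only shows the counting idea.

```python
# Hypothetical sketch: crude qualitative content analysis that tags answers with
# themes via a keyword dictionary. Answers and keywords are invented.
from collections import Counter

themes = {
    "price": {"expensive", "cheap", "cost", "price"},
    "support": {"support", "help", "agent", "service"},
    "quality": {"quality", "broke", "durable", "reliable"},
}

answers = [
    "The product broke after a week, but support was quick to help",
    "Great quality for the price",
    "Too expensive compared to other options",
]

theme_counts = Counter()
for answer in answers:
    words = set(answer.lower().replace(",", "").split())
    for theme, keywords in themes.items():
        if words & keywords:          # the answer mentions at least one keyword
            theme_counts[theme] += 1

print(theme_counts)  # Counter({'quality': 2, 'price': 2, 'support': 1})
```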

Comparison of Research Processes

Creswell, J. W., & Creswell, J. D. (2018). Research design: Qualitative, quantitative, and mixed methods approaches (5th ed.). SAGE Publications.

Empirical Research: Defining, Identifying, & Finding

Defining Empirical Research

Calfee & Chambliss (2005)  (UofM login required) describe empirical research as a "systematic approach for answering certain types of questions."  Those questions are answered "[t]hrough the collection of evidence under carefully defined and replicable conditions" (p. 43). 

The evidence collected during empirical research is often referred to as "data." 

Characteristics of Empirical Research

Emerald Publishing's guide to conducting empirical research identifies a number of common elements to empirical research: 

  • A  research question , which will determine research objectives.
  • A particular and planned  design  for the research, which will depend on the question and which will find ways of answering it with appropriate use of resources.
  • The gathering of  primary data , which is then analysed.
  • A particular  methodology  for collecting and analysing the data, such as an experiment or survey.
  • The limitation of the data to a particular group, area or time scale, known as a sample [emphasis added]: for example, a specific number of employees of a particular company type, or all users of a library over a given time scale. The sample should be somehow representative of a wider population.
  • The ability to  recreate  the study and test the results. This is known as  reliability .
  • The ability to  generalize  from the findings to a larger sample and to other situations.

If you see these elements in a research article, you can feel confident that you have found empirical research. Emerald's guide goes into more detail on each element. 

Empirical research methodologies can be described as quantitative, qualitative, or a mix of both (usually called mixed-methods).

Ruane (2016)  (UofM login required) gets at the basic differences in approach between quantitative and qualitative research:

  • Quantitative research  -- an approach to documenting reality that relies heavily on numbers both for the measurement of variables and for data analysis (p. 33).
  • Qualitative research  -- an approach to documenting reality that relies on words and images as the primary data source (p. 33).

Both quantitative and qualitative methods are empirical. If you can recognize that a research study is a quantitative or qualitative study, then you have also recognized that it is an empirical study.

Below is information on the characteristics of quantitative and qualitative research. This video from Scribbr also offers a good overall introduction to the two approaches to research methodology:

Characteristics of Quantitative Research 

Researchers test hypotheses, or theories, based on assumptions about causality, i.e., we expect variable X to cause variable Y. Variables have to be controlled as much as possible to ensure validity. The results explain the relationship between the variables. Measures are based on pre-defined instruments.

Examples: experimental or quasi-experimental designs, pretest and post-test, surveys or questionnaires with closed-ended questions. Such studies identify factors that influence an outcome, the utility of an intervention, or predictors of outcomes.

Characteristics of Qualitative Research

Researchers explore the “meaning individuals or groups ascribe to social or human problems” (Creswell & Creswell, 2018, p. 3). Questions and procedures emerge rather than being prescribed. Complexity, nuance, and individual meaning are valued. Research is both inductive and deductive. Data sources are multiple and varied, e.g., interviews, observations, documents, photographs, etc. The researcher is a key instrument and must reflect on how their background, culture, and experiences influence the research.

Examples: open question interviews and surveys, focus groups, case studies, grounded theory, ethnography, discourse analysis, narrative, phenomenology, participatory action research.

Calfee, R. C. & Chambliss, M. (2005). The design of empirical research. In J. Flood, D. Lapp, J. R. Squire, & J. Jensen (Eds.),  Methods of research on teaching the English language arts: The methodology chapters from the handbook of research on teaching the English language arts (pp. 43-78). Routledge.  http://ezproxy.memphis.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=125955&site=eds-live&scope=site .

Creswell, J. W., & Creswell, J. D. (2018).  Research design: Qualitative, quantitative, and mixed methods approaches  (5th ed.). Thousand Oaks: Sage.

How to... conduct empirical research . (n.d.). Emerald Publishing.  https://www.emeraldgrouppublishing.com/how-to/research-methods/conduct-empirical-research .

Scribbr. (2019). Quantitative vs. qualitative: The differences explained  [video]. YouTube.  https://www.youtube.com/watch?v=a-XtVF7Bofg .

Ruane, J. M. (2016).  Introducing social research methods : Essentials for getting the edge . Wiley-Blackwell.  http://ezproxy.memphis.edu/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=nlebk&AN=1107215&site=eds-live&scope=site .  

What is Empirical Research Study? [Examples & Method]

The bulk of human decisions relies on evidence, that is, what can be measured or proven as valid. In choosing between plausible alternatives, individuals are more likely to tilt towards the option that is proven to work, and this is the same approach adopted in empirical research. 

In empirical research, the researcher arrives at outcomes by testing his or her empirical evidence using qualitative or quantitative methods of observation, as determined by the nature of the research. An empirical research study is set apart from other research approaches by its methodology and features; hence, it is important for every researcher to know what constitutes this investigation method.

What is Empirical Research? 

Empirical research is a type of research methodology that makes use of verifiable evidence in order to arrive at research outcomes. In other words, this  type of research relies solely on evidence obtained through observation or scientific data collection methods. 

Empirical research can be carried out using qualitative or quantitative observation methods, depending on the data sample, that is, quantifiable or non-numerical data. Unlike theoretical research, which depends on preconceived notions about the research variables, empirical research carries out a scientific investigation to measure the experimental probability of the research variables.

Characteristics of Empirical Research

  • Research Questions

Empirical research begins with a set of research questions that guide the investigation. In many cases, these research questions constitute the research hypothesis, which is tested using qualitative and quantitative methods as dictated by the nature of the research.

In an empirical research study, the research questions are built around the core of the research, that is, the central issue which the research seeks to resolve. They also determine the course of the research by highlighting the specific objectives and aims of the systematic investigation. 

  • Definition of the Research Variables

The research variables are clearly defined in terms of their population, types, characteristics, and behaviors. In other words, the data sample is clearly delimited and placed within the context of the research. 

  • Description of the Research Methodology

An empirical research study also clearly outlines the methods adopted in the systematic investigation. Here, the research process is described in detail, including the selection criteria for the data sample, the qualitative or quantitative research methods used, and the testing instruments.

An empirical research report is usually divided into four parts: the introduction, methodology, findings, and discussion. The introduction provides the background of the empirical study, while the methodology describes the research design, processes, and tools for the systematic investigation.

The findings are the research outcomes; they can be presented as statistical data or as information obtained through qualitative observation of the research variables. The discussion highlights the significance of the study and its contributions to knowledge.

Uses of Empirical Research

Without any doubt, empirical research is one of the most useful methods of systematic investigation. It can be used for validating multiple research hypotheses in different fields including Law, Medicine, and Anthropology. 

  • Empirical Research in Law : In Law, empirical research is used to study institutions, rules, procedures, and personnel of the law, with a view to understanding how they operate and what effects they have. It makes use of direct methods rather than secondary sources, and this helps you to arrive at more valid conclusions.
  • Empirical Research in Medicine : In medicine, empirical research is used to test and validate multiple hypotheses and increase human knowledge.
  • Empirical Research in Anthropology : In anthropology, empirical research is used as an evidence-based systematic method of inquiry into patterns of human behaviors and cultures. This helps to validate and advance human knowledge.

The Empirical Research Cycle

The empirical research cycle is a five-phase cycle that outlines the systematic process for conducting empirical research. It was developed by the Dutch psychologist A.D. de Groot in the 1940s, and it describes five important stages that can be viewed as a deductive approach to empirical research.

In the empirical research methodology cycle, all processes are interconnected, and none of them is more important than another. The cycle clearly outlines the different phases involved in generating research hypotheses and testing them systematically using empirical data.

  • Observation: This is the process of gathering empirical data for the research. At this stage, the researcher gathers relevant empirical data using qualitative or quantitative observation methods, and this goes ahead to inform the research hypotheses.
  • Induction: At this stage, the researcher makes use of inductive reasoning in order to arrive at a general probable research conclusion based on his or her observation. The researcher generates a general assumption that attempts to explain the empirical data and s/he goes on to observe the empirical data in line with this assumption.
  • Deduction: This is the deductive reasoning stage. This is where the researcher generates hypotheses by applying logic and rationality to his or her observation.
  • Testing: Here, the researcher puts the hypotheses to test using qualitative or quantitative research methods. In the testing stage, the researcher combines relevant instruments of systematic investigation with empirical methods in order to arrive at objective results that support or negate the research hypotheses.
  • Evaluation: Evaluation is the final stage of an empirical research study. Here, the researcher outlines the empirical data, the research findings, and the supporting arguments, plus any challenges encountered during the research process.

This information is useful for further research. 

Examples of Empirical Research 

  • An empirical research study can be carried out to determine if listening to happy music improves the mood of individuals. The researcher may need to conduct an experiment that involves exposing individuals to happy music to see if this improves their moods.

The findings from such an experiment will provide empirical evidence that confirms or refutes the hypotheses. 

  • An empirical research study can also be carried out to determine the effects of a new drug on specific groups of people. The researcher may expose the research subjects to controlled quantities of the drug and observe the effects over a specific period of time to gather empirical data.
  • Another example of empirical research is measuring the levels of noise pollution found in an urban area to determine the average levels of sound exposure experienced by its inhabitants. Here, the researcher may have to administer questionnaires or carry out a survey in order to gather relevant data based on the experiences of the research subjects.
  • Empirical research can also be carried out to determine the relationship between seasonal migration and the body mass of flying birds. A researcher may need to observe the birds and carry out necessary observation and experimentation in order to arrive at objective outcomes that answer the research question.

Empirical Research Data Collection Methods

Empirical data can be gathered using qualitative and quantitative data collection methods. Quantitative data collection methods are used for numerical data gathering while qualitative data collection processes are used to gather empirical data that cannot be quantified, that is, non-numerical data. 

The following are common methods of gathering data in empirical research:

  • Survey/ Questionnaire

A survey is a method of data gathering that is typically employed by researchers to gather large sets of data from a specific number of respondents with regard to a research subject. This method of data gathering is often used for quantitative data collection, although it can also be deployed during qualitative research.

A survey contains a set of questions that can range from close-ended to open-ended questions together with other question types that revolve around the research subject. A survey can be administered physically or with the use of online data-gathering platforms like Formplus. 

  • Experiment

Empirical data can also be collected by carrying out an experiment. An experiment is a controlled simulation in which one or more of the research variables are manipulated using a set of interconnected processes in order to confirm or refute the research hypotheses.

An experiment is a useful method of measuring causality; that is cause and effect between dependent and independent variables in a research environment. It is an integral data gathering method in an empirical research study because it involves testing calculated assumptions in order to arrive at the most valid data and research outcomes. 

  • Case Study

The case study method is another common data gathering method in an empirical research study. It involves sifting through and analyzing relevant cases and real-life experiences about the research subject or research variables in order to discover in-depth information that can serve as empirical data.

  • Observation

The observational method is a method of qualitative data gathering that requires the researcher to study the behaviors of research variables in their natural environments in order to gather relevant information that can serve as empirical data.

How to Collect Empirical Research Data with a Questionnaire

With Formplus, you can create a survey or questionnaire for collecting empirical data from your research subjects. Formplus also offers multiple form-sharing options so that you can share your empirical research survey with research subjects via a variety of methods.

Here is a step-by-step guide of how to collect empirical data using Formplus:

Sign in to Formplus

In the Formplus builder, you can easily create your empirical research survey by dragging and dropping preferred fields into your form. To access the Formplus builder, you will need to create an account on Formplus. 

Once you do this, sign in to your account and click on “Create Form” to begin.

Edit Form Title

Click on the field provided to input your form title, for example, “Empirical Research Survey”.

Edit Form  

  • Click on the edit button to edit the form.
  • Add Fields: Drag and drop preferred form fields into your form in the Formplus builder inputs column. There are several field input options for survey forms in the Formplus builder.
  • Edit fields
  • Click on “Save”
  • Preview form.

Customize Form

Formplus allows you to add unique features to your empirical research survey form. You can personalize your survey using various customization options. Here, you can add background images, your organization’s logo, and use other styling options. You can also change the display theme of your form. 

Share your Form Link with Respondents

Formplus offers multiple form sharing options which enables you to easily share your empirical research survey form with respondents. You can use the direct social media sharing buttons to share your form link to your organization’s social media pages. 

You can send out your survey form as email invitations to your research subjects too. If you wish, you can share your form’s QR code or embed it on your organization’s website for easy access. 

Empirical vs Non-Empirical Research

Empirical and non-empirical research are common methods of systematic investigation employed by researchers. Unlike empirical research that tests hypotheses in order to arrive at valid research outcomes, non-empirical research theorizes the logical assumptions of research variables. 

Definition: Empirical research is a research approach that makes use of evidence-based data while non-empirical research is a research approach that makes use of theoretical data. 

Method: In empirical research, the researcher arrives at valid outcomes by mainly observing research variables, creating a hypothesis and experimenting on research variables to confirm or refute the hypothesis. In non-empirical research, the researcher relies on inductive and deductive reasoning to theorize logical assumptions about the research subjects.

The major difference between the methodologies of empirical and non-empirical research is that while assumptions are tested in empirical research, they are entirely theorized in non-empirical research.

Data Sample: Empirical research makes use of empirical data while non-empirical research does not make use of empirical data. Empirical data refers to information that is gathered through experience or observation. 

Unlike empirical research, theoretical or non-empirical research does not rely on data gathered through evidence. Rather, it works with logical assumptions and beliefs about the research subject. 

Data Collection Methods : Empirical research makes use of quantitative and qualitative data gathering methods which may include surveys, experiments, and methods of observation. This helps the researcher to gather empirical data, that is, data backed by evidence.  

Non-empirical research, on the other hand, does not make use of qualitative or quantitative methods of data collection . Instead, the researcher gathers relevant data through critical studies, systematic review and meta-analysis. 

Advantages of Empirical Research 

  • Empirical research is flexible. In this type of systematic investigation, the researcher can adjust the research methodology including the data sample size, data gathering methods plus the data analysis methods as necessitated by the research process.
  • It helps the researcher to understand how research outcomes can be influenced by different research environments.
  • Empirical research study helps the researcher to develop relevant analytical and observation skills that can be useful in dynamic research contexts.
  • This type of research approach allows the researcher to control multiple research variables in order to arrive at the most relevant research outcomes.
  • Empirical research is widely considered as one of the most authentic and competent research designs.
  • It improves the internal validity of traditional research using a variety of experiments and research observation methods.

Disadvantages of Empirical Research 

  • An empirical research study is time-consuming because the researcher needs to gather the empirical data from multiple resources which typically takes a lot of time.
  • It is not a cost-effective research approach. Usually, this method of research incurs a lot of cost because of the monetary demands of the field research.
  • It may be difficult to gather the needed empirical data sample because of the multiple data gathering methods employed in an empirical research study.
  • It may be difficult to gain access to some communities and firms during the data gathering process and this can affect the validity of the research.
  • The report from an empirical research study is intensive and can be very lengthy in nature.

Conclusion 

Empirical research is an important method of systematic investigation because it gives the researcher the opportunity to test the validity of different assumptions, in the form of hypotheses, before arriving at any findings. Hence, it is a more reliable research approach.

There are different quantitative and qualitative methods of data gathering employed during an empirical research study, depending on the purpose of the research; these include surveys, experiments, and various observational methods. Surveys are one of the most common methods of empirical data collection, and they can be administered online or physically.

You can use Formplus to create and administer your online empirical research survey. Formplus allows you to create survey forms that you can share with target respondents in order to obtain valuable feedback about your research context, question or subject. 

In the form builder, you can add different fields to your survey form and you can also modify these form fields to suit your research process. Sign up to Formplus to access the form builder and start creating powerful online empirical research survey forms. 

What Is a Research Design | Types, Guide & Examples

Published on June 7, 2021 by Shona McCombes . Revised on November 20, 2023 by Pritha Bhandari.

A research design is a strategy for answering your   research question  using empirical data. Creating a research design means making decisions about:

  • Your overall research objectives and approach
  • Whether you’ll rely on primary research or secondary research
  • Your sampling methods or criteria for selecting subjects
  • Your data collection methods
  • The procedures you’ll follow to collect data
  • Your data analysis methods

A well-planned research design helps ensure that your methods match your research objectives and that you use the right kind of analysis for your data.

Table of contents

  • Step 1: Consider your aims and approach
  • Step 2: Choose a type of research design
  • Step 3: Identify your population and sampling method
  • Step 4: Choose your data collection methods
  • Step 5: Plan your data collection procedures
  • Step 6: Decide on your data analysis strategies
  • Other interesting articles
  • Frequently asked questions about research design

Step 1: Consider your aims and approach

Before you can start designing your research, you should already have a clear idea of the research question you want to investigate.

There are many different ways you could go about answering this question. Your research design choices should be driven by your aims and priorities—start by thinking carefully about what you want to achieve.

The first choice you need to make is whether you’ll take a qualitative or quantitative approach.

Qualitative research designs tend to be more flexible and inductive , allowing you to adjust your approach based on what you find throughout the research process.

Quantitative research designs tend to be more fixed and deductive , with variables and hypotheses clearly defined in advance of data collection.

It’s also possible to use a mixed-methods design that integrates aspects of both approaches. By combining qualitative and quantitative insights, you can gain a more complete picture of the problem you’re studying and strengthen the credibility of your conclusions.

Practical and ethical considerations when designing research

As well as scientific considerations, you need to think practically when designing your research. If your research involves people or animals, you also need to consider research ethics .

  • How much time do you have to collect data and write up the research?
  • Will you be able to gain access to the data you need (e.g., by travelling to a specific location or contacting specific people)?
  • Do you have the necessary research skills (e.g., statistical analysis or interview techniques)?
  • Will you need ethical approval ?

At each stage of the research design process, make sure that your choices are practically feasible.


Step 2: Choose a type of research design

Within both qualitative and quantitative approaches, there are several types of research design to choose from. Each type provides a framework for the overall shape of your research.

Types of quantitative research designs

Quantitative designs can be split into four main types.

  • Experimental and   quasi-experimental designs allow you to test cause-and-effect relationships
  • Descriptive and correlational designs allow you to measure variables and describe relationships between them.

With descriptive and correlational designs, you can get a clear picture of characteristics, trends and relationships as they exist in the real world. However, you can’t draw conclusions about cause and effect (because correlation doesn’t imply causation ).

Experiments are the strongest way to test cause-and-effect relationships without the risk of other variables influencing the results. However, their controlled conditions may not always reflect how things work in the real world. They’re often also more difficult and expensive to implement.

Types of qualitative research designs

Qualitative designs are less strictly defined. This approach is about gaining a rich, detailed understanding of a specific context or phenomenon, and you can often be more creative and flexible in designing your research.

Common types of qualitative design include case studies, ethnography, and grounded theory. They often have similar approaches in terms of data collection, but focus on different aspects when analyzing the data.

Step 3: Identify your population and sampling method

Your research design should clearly define who or what your research will focus on, and how you’ll go about choosing your participants or subjects.

In research, a population is the entire group that you want to draw conclusions about, while a sample is the smaller group of individuals you’ll actually collect data from.

Defining the population

A population can be made up of anything you want to study—plants, animals, organizations, texts, countries, etc. In the social sciences, it most often refers to a group of people.

For example, will you focus on people from a specific demographic, region or background? Are you interested in people with a certain job or medical condition, or users of a particular product?

The more precisely you define your population, the easier it will be to gather a representative sample.

Sampling methods

Even with a narrowly defined population, it’s rarely possible to collect data from every individual. Instead, you’ll collect data from a sample.

To select a sample, there are two main approaches: probability sampling and non-probability sampling . The sampling method you use affects how confidently you can generalize your results to the population as a whole.

Probability sampling is the most statistically valid option, but it’s often difficult to achieve unless you’re dealing with a very small and accessible population.

For practical reasons, many studies use non-probability sampling, but it’s important to be aware of the limitations and carefully consider potential biases. You should always make an effort to gather a sample that’s as representative as possible of the population.
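As a minimal sketch of the difference, the snippet below draws a simple random (probability) sample and a convenience (non-probability) sample from an invented sampling frame; the frame, seed, and sample size are assumptions made purely for illustration.

```python
import random

# Hypothetical sampling frame: 10,000 student ID numbers
population = list(range(1, 10_001))
sample_size = 200

# Probability sampling: a simple random sample, where every unit has an
# equal chance of selection, so results can be generalized with known error.
random.seed(42)  # for reproducibility
probability_sample = random.sample(population, sample_size)

# Non-probability (convenience) sampling: taking whoever is easiest to
# reach -- here, simply the first 200 IDs. Cheaper, but it systematically
# excludes most of the population and risks bias.
convenience_sample = population[:sample_size]

print(len(probability_sample), len(convenience_sample))
```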

Case selection in qualitative research

In some types of qualitative designs, sampling may not be relevant.

For example, in an ethnography or a case study , your aim is to deeply understand a specific context, not to generalize to a population. Instead of sampling, you may simply aim to collect as much data as possible about the context you are studying.

In these types of design, you still have to carefully consider your choice of case or community. You should have a clear rationale for why this particular case is suitable for answering your research question .

For example, you might choose a case study that reveals an unusual or neglected aspect of your research problem, or you might choose several very similar or very different cases in order to compare them.

Step 4: Choose your data collection methods

Data collection methods are ways of directly measuring variables and gathering information. They allow you to gain first-hand knowledge and original insights into your research problem.

You can choose just one data collection method, or use several methods in the same study.

Survey methods

Surveys allow you to collect data about opinions, behaviors, experiences, and characteristics by asking people directly. There are two main survey methods to choose from: questionnaires and interviews .

Observation methods

Observational studies allow you to collect data unobtrusively, observing characteristics, behaviors or social interactions without relying on self-reporting.

Observations may be conducted in real time, taking notes as you observe, or you might make audiovisual recordings for later analysis. They can be qualitative or quantitative.

Other methods of data collection

There are many other ways you might collect data depending on your field and topic.

If you’re not sure which methods will work best for your research design, try reading some papers in your field to see what kinds of data collection methods they used.

Secondary data

If you don’t have the time or resources to collect data from the population you’re interested in, you can also choose to use secondary data that other researchers already collected—for example, datasets from government surveys or previous studies on your topic.

With this raw data, you can do your own analysis to answer new research questions that weren’t addressed by the original study.

Using secondary data can expand the scope of your research, as you may be able to access much larger and more varied samples than you could collect yourself.

However, it also means you don’t have any control over which variables to measure or how to measure them, so the conclusions you can draw may be limited.
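The sketch below is a minimal illustration of working with secondary data in Python, assuming pandas is available; the dataset, variable names, and subgroup are invented stand-ins for data someone else collected, and in practice you would load the file or database provided by the original source.

```python
import pandas as pd

# In practice you would load the secondary dataset from a file or an API,
# e.g. df = pd.read_csv("national_health_survey_2022.csv").
# Here a tiny invented table stands in for data someone else collected.
df = pd.DataFrame({
    "respondent_id": [1, 2, 3, 4, 5],
    "age": [23, 41, 37, 68, 55],
    "region": ["north", "south", "south", "east", "north"],
    "weekly_exercise_hours": [3.0, 1.5, 0.0, 4.0, 2.5],
    "self_rated_health": [4, 3, 2, 4, 3],
})

# You had no control over which variables were measured -- work with what
# exists. Keep only the columns relevant to the new research question and
# the subgroup of interest (here, working-age adults).
subset = df[["age", "weekly_exercise_hours", "self_rated_health"]]
working_age = subset[(subset["age"] >= 18) & (subset["age"] <= 65)]

print(working_age.describe())
```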


Step 5: Plan your data collection procedures

As well as deciding on your methods, you need to plan exactly how you’ll use these methods to collect data that’s consistent, accurate, and unbiased.

Planning systematic procedures is especially important in quantitative research, where you need to precisely define your variables and ensure your measurements are high in reliability and validity.

Operationalization

Some variables, like height or age, are easily measured. But often you’ll be dealing with more abstract concepts, like satisfaction, anxiety, or competence. Operationalization means turning these fuzzy ideas into measurable indicators.

If you’re using observations , which events or actions will you count?

If you’re using surveys , which questions will you ask and what range of responses will be offered?

You may also choose to use or adapt existing materials designed to measure the concept you’re interested in—for example, questionnaires or inventories whose reliability and validity has already been established.
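For example, “satisfaction” might be operationalized as the average of several Likert-type survey items. The sketch below is a simplified illustration, assuming pandas is available; the items, wording, and scoring rule are invented, not a validated instrument.

```python
import pandas as pd

# Hypothetical responses to three 5-point Likert items measuring satisfaction
# (1 = strongly disagree ... 5 = strongly agree)
responses = pd.DataFrame({
    "sat_1": [4, 5, 2, 3],   # "I am satisfied with the service overall"
    "sat_2": [4, 4, 1, 3],   # "The service met my expectations"
    "sat_3": [5, 4, 2, 2],   # "I would recommend the service to others"
})

# Operationalization: the abstract concept "satisfaction" becomes a
# measurable indicator -- here, the mean of the three item scores.
responses["satisfaction_score"] = responses[["sat_1", "sat_2", "sat_3"]].mean(axis=1)

print(responses)
```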

Reliability and validity

Reliability means your results can be consistently reproduced, while validity means that you’re actually measuring the concept you’re interested in.

For valid and reliable results, your measurement materials should be thoroughly researched and carefully designed. Plan your procedures to make sure you carry out the same steps in the same way for each participant.

If you’re developing a new questionnaire or other instrument to measure a specific concept, running a pilot study allows you to check its validity and reliability in advance.
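One common check on pilot data is internal consistency, often summarized with Cronbach’s alpha. The sketch below computes alpha directly from its standard formula on a small invented pilot dataset, assuming NumPy is available; it is an illustration, not a full psychometric validation.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a (respondents x items) matrix of scores."""
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical pilot responses: 6 participants x 4 questionnaire items
pilot = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 4, 5],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
    [1, 2, 2, 1],
])

# A value of roughly 0.7 or higher is a common rule of thumb for acceptable
# internal consistency; the made-up items above are deliberately consistent.
print(f"Cronbach's alpha = {cronbach_alpha(pilot):.2f}")
```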

Sampling procedures

As well as choosing an appropriate sampling method , you need a concrete plan for how you’ll actually contact and recruit your selected sample.

That means making decisions about things like:

  • How many participants do you need for an adequate sample size?
  • What inclusion and exclusion criteria will you use to identify eligible participants?
  • How will you contact your sample—by mail, online, by phone, or in person?

If you’re using a probability sampling method , it’s important that everyone who is randomly selected actually participates in the study. How will you ensure a high response rate?

If you’re using a non-probability method , how will you avoid research bias and ensure a representative sample?
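As a back-of-the-envelope illustration, the snippet below uses a standard formula for estimating a proportion (n = z^2 * p(1 - p) / e^2) and then inflates the number of invitations for an assumed response rate; the confidence level, margin of error, and response rate are assumptions you would set for your own study.

```python
import math

z = 1.96   # z-score for a 95% confidence level
p = 0.5    # assumed proportion (0.5 gives the most conservative estimate)
e = 0.05   # desired margin of error (plus or minus 5 percentage points)

# Completed responses needed to estimate a proportion at this precision
required_n = math.ceil((z**2 * p * (1 - p)) / e**2)   # 385

# Adjust for the share of invitees you expect to actually respond
expected_response_rate = 0.30                          # assumption: 30%
invitations_needed = math.ceil(required_n / expected_response_rate)

print(required_n, invitations_needed)   # 385 completed responses, 1284 invitations
```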

Data management

It’s also important to create a data management plan for organizing and storing your data.

Will you need to transcribe interviews or perform data entry for observations? You should anonymize and safeguard any sensitive data, and make sure it’s backed up regularly.

Keeping your data well-organized will save time when it comes to analyzing it. It can also help other researchers validate and add to your findings (high replicability ).
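As one small piece of a data management plan, the sketch below replaces a direct identifier (an invented email column) with a pseudonymous code using a salted hash, assuming pandas is available; a real plan would also cover secure storage, access control, and backups.

```python
import hashlib
import pandas as pd

SALT = "project-specific-secret"  # keep this value out of the shared dataset

def pseudonymize(identifier: str) -> str:
    """Replace a direct identifier with a stable, non-reversible code."""
    return hashlib.sha256((SALT + identifier).encode("utf-8")).hexdigest()[:10]

raw = pd.DataFrame({
    "email": ["a.smith@example.com", "b.jones@example.com"],
    "score": [42, 37],
})

# Keep a pseudonymous ID for linking records; drop the identifier itself
raw["participant_id"] = raw["email"].apply(pseudonymize)
anonymized = raw.drop(columns=["email"])

print(anonymized)
```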

Step 6: Decide on your data analysis strategies

On its own, raw data can’t answer your research question. The last step of designing your research is planning how you’ll analyze the data.

Quantitative data analysis

In quantitative research, you’ll most likely use some form of statistical analysis . With statistics, you can summarize your sample data, make estimates, and test hypotheses.

Using descriptive statistics , you can summarize your sample data in terms of:

  • The distribution of the data (e.g., the frequency of each score on a test)
  • The central tendency of the data (e.g., the mean to describe the average score)
  • The variability of the data (e.g., the standard deviation to describe how spread out the scores are)

The specific calculations you can do depend on the level of measurement of your variables.
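A minimal sketch of these three summaries, using invented test scores and assuming pandas is available:

```python
import pandas as pd

scores = pd.Series([55, 70, 70, 62, 80, 91, 70, 62, 85, 77], name="test_score")

print(scores.value_counts().sort_index())  # distribution: frequency of each score
print("mean:", scores.mean())              # central tendency: the average score
print("std deviation:", scores.std())      # variability: sample SD (ddof=1 by default)
```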

Using inferential statistics , you can:

  • Make estimates about the population based on your sample data.
  • Test hypotheses about a relationship between variables.

Regression and correlation tests look for associations between two or more variables, while comparison tests (such as t tests and ANOVAs ) look for differences in the outcomes of different groups.

Your choice of statistical test depends on various aspects of your research design, including the types of variables you’re dealing with and the distribution of your data.
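For instance, a correlation test checks whether two variables are associated strongly enough in the sample to infer an association in the population. The snippet below is an illustration on invented data, assuming SciPy is available.

```python
from scipy import stats

# Hypothetical paired measurements: hours studied and exam score
hours = [2, 4, 5, 7, 8, 10, 11, 13]
score = [52, 58, 61, 64, 70, 75, 74, 83]

r, p_value = stats.pearsonr(hours, score)
print(f"Pearson r = {r:.2f}, p = {p_value:.4f}")
# A small p-value suggests the association is unlikely to be due to sampling
# error alone -- provided the test's assumptions are met.
```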

Qualitative data analysis

In qualitative research, your data will usually be very dense with information and ideas. Instead of summing it up in numbers, you’ll need to comb through the data in detail, interpret its meanings, identify patterns, and extract the parts that are most relevant to your research question.

Two of the most common approaches to doing this are thematic analysis and discourse analysis .

There are many other ways of analyzing qualitative data depending on the aims of your research. To get a sense of potential approaches, try reading some qualitative research papers in your field.
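The interpretive work of reading and coding the data cannot be automated, but once codes have been assigned by hand, a short script can help keep track of how often each theme appears. The codes and counts below are invented, purely for illustration.

```python
from collections import Counter

# Codes assigned manually to interview excerpts during thematic analysis
assigned_codes = [
    "work-life balance", "recognition", "work-life balance",
    "autonomy", "recognition", "work-life balance", "career growth",
]

theme_counts = Counter(assigned_codes)
for theme, count in theme_counts.most_common():
    print(f"{theme}: {count} excerpt(s)")
```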


Frequently asked questions about research design

A research design is a strategy for answering your research question. It defines your overall approach and determines how you will collect and analyze data.

A well-planned research design helps ensure that your methods match your research aims, that you collect high-quality data, and that you use the right kind of analysis to answer your questions. This allows you to draw valid, trustworthy conclusions.

Quantitative research designs can be divided into two main categories:

  • Correlational and descriptive designs are used to investigate characteristics, averages, trends, and associations between variables.
  • Experimental and quasi-experimental designs are used to test causal relationships .

Qualitative research designs tend to be more flexible. Common types of qualitative design include case study , ethnography , and grounded theory designs.

The priorities of a research design can vary depending on the field, but you usually have to specify:

  • Your research questions and/or hypotheses
  • Your overall approach (e.g., qualitative or quantitative )
  • The type of design you’re using (e.g., a survey , experiment , or case study )
  • Your data collection methods (e.g., questionnaires , observations)
  • Your data collection procedures (e.g., operationalization , timing and data management)
  • Your data analysis methods (e.g., statistical tests  or thematic analysis )

A sample is a subset of individuals from a larger population . Sampling means selecting the group that you will actually collect data from in your research. For example, if you are researching the opinions of students in your university, you could survey a sample of 100 students.

In statistics, sampling allows you to test a hypothesis about the characteristics of a population.

Operationalization means turning abstract conceptual ideas into measurable observations.

For example, the concept of social anxiety isn’t directly observable, but it can be operationally defined in terms of self-rating scores, behavioral avoidance of crowded places, or physical anxiety symptoms in social situations.

Before collecting data , it’s important to consider how you will operationalize the variables that you want to measure.

A research project is an academic, scientific, or professional undertaking to answer a research question . Research projects can take many forms, such as qualitative or quantitative , descriptive , longitudinal , experimental , or correlational . What kind of research approach you choose will depend on your topic.


McCombes, S. (2023, November 20). What Is a Research Design | Types, Guide & Examples. Scribbr. Retrieved April 9, 2024, from https://www.scribbr.com/methodology/research-design/


Penn State University Libraries

Empirical research in the social sciences and education.


Introduction: What is Empirical Research?

Empirical research is based on observed and measured phenomena and derives knowledge from actual experience rather than from theory or belief. 

How do you know if a study is empirical? Read the subheadings within the article, book, or report and look for a description of the research "methodology."  Ask yourself: Could I recreate this study and test these results?

Key characteristics to look for:

  • Specific research questions to be answered
  • Definition of the population, behavior, or   phenomena being studied
  • Description of the process used to study this population or phenomena, including selection criteria, controls, and testing instruments (such as surveys)

Another hint: some scholarly journals use a specific layout, called the "IMRaD" format, to communicate empirical research findings. Such articles typically have 4 components:

  • Introduction : sometimes called "literature review" -- what is currently known about the topic -- usually includes a theoretical framework and/or discussion of previous studies
  • Methodology: sometimes called "research design" -- how to recreate the study -- usually describes the population, research process, and analytical tools used in the present study
  • Results : sometimes called "findings" -- what was learned through the study -- usually appears as statistical data or as substantial quotations from research participants
  • Discussion : sometimes called "conclusion" or "implications" -- why the study is important -- usually describes how the research results influence professional practices or future studies

Reading and Evaluating Scholarly Materials

Reading research can be a challenge. However, the tutorials and videos below can help. They explain what scholarly articles look like, how to read them, and how to evaluate them:

  • CRAAP Checklist A frequently-used checklist that helps you examine the currency, relevance, authority, accuracy, and purpose of an information source.
  • IF I APPLY A newer model of evaluating sources which encourages you to think about your own biases as a reader, as well as concerns about the item you are reading.
  • Credo Video: How to Read Scholarly Materials (4 min.)
  • Credo Tutorial: How to Read Scholarly Materials
  • Credo Tutorial: Evaluating Information
  • Credo Video: Evaluating Statistics (4 min.)



5 Research design

Research design is a comprehensive plan for data collection in an empirical research project. It is a ‘blueprint’ for empirical research aimed at answering specific research questions or testing specific hypotheses, and must specify at least three processes: the data collection process, the instrument development process, and the sampling process. The instrument development and sampling processes are described in the next two chapters, and the data collection process—which is often loosely called ‘research design’—is introduced in this chapter and is described in further detail in Chapters 9–12.

Broadly speaking, data collection methods can be grouped into two categories: positivist and interpretive. Positivist methods, such as laboratory experiments and survey research, are aimed at theory (or hypotheses) testing, while interpretive methods, such as action research and ethnography, are aimed at theory building. Positivist methods employ a deductive approach to research, starting with a theory and testing theoretical postulates using empirical data. In contrast, interpretive methods employ an inductive approach that starts with data and tries to derive a theory about the phenomenon of interest from the observed data.

These methods are often incorrectly equated with quantitative and qualitative research. Quantitative and qualitative methods refer to the type of data being collected—quantitative data involve numeric scores, metrics, and so on, while qualitative data include interviews, observations, and so forth—and analysed (i.e., using quantitative techniques such as regression or qualitative techniques such as coding). Positivist research uses predominantly quantitative data, but can also use qualitative data. Interpretive research relies heavily on qualitative data, but can sometimes benefit from including quantitative data as well. Sometimes, joint use of qualitative and quantitative data may help generate unique insight into a complex social phenomenon that is not available from either type of data alone, and hence, mixed-mode designs that combine qualitative and quantitative data are often highly desirable.

Key attributes of a research design

The quality of research designs can be defined in terms of four key design attributes: internal validity, external validity, construct validity, and statistical conclusion validity.

Internal validity, also called causality, examines whether the observed change in a dependent variable is indeed caused by a corresponding change in a hypothesised independent variable, and not by variables extraneous to the research context. Causality requires three conditions: covariation of cause and effect (i.e., if the cause happens, then the effect also happens; if the cause does not happen, the effect does not happen), temporal precedence (the cause must precede the effect in time), and the absence of spurious correlation (i.e., there is no plausible alternative explanation for the change). Certain research designs, such as laboratory experiments, are strong in internal validity by virtue of their ability to manipulate the independent variable (cause) via a treatment and observe the effect (dependent variable) of that treatment after a certain point in time, while controlling for the effects of extraneous variables. Other designs, such as field surveys, are poor in internal validity because of their inability to manipulate the independent variable (cause), and because cause and effect are measured at the same point in time, which defeats temporal precedence and makes it equally likely that the expected effect might have influenced the expected cause rather than the reverse. Although higher in internal validity compared to other methods, laboratory experiments are by no means immune to threats of internal validity, and are susceptible to history, testing, instrumentation, regression, and other threats that are discussed later in the chapter on experimental designs. Nonetheless, different research designs vary considerably in their respective levels of internal validity.

External validity or generalisability refers to whether the observed associations can be generalised from the sample to the population (population validity), or to other people, organisations, contexts, or time (ecological validity). For instance, can results drawn from a sample of financial firms in the United States be generalised to the population of financial firms (population validity) or to other firms within the United States (ecological validity)? Survey research, where data is sourced from a wide variety of individuals, firms, or other units of analysis, tends to have broader generalisability than laboratory experiments where treatments and extraneous variables are more controlled. The variation in internal and external validity for a wide range of research designs is shown in Figure 5.1.

Figure 5.1: Internal and external validity

Some researchers claim that there is a trade-off between internal and external validity—higher external validity can come only at the cost of internal validity, and vice versa. But this is not always the case. Research designs such as field experiments, longitudinal field surveys, and multiple case studies have higher degrees of both internal and external validity. Personally, I prefer research designs that have reasonable degrees of both internal and external validity, i.e., those that fall within the cone of validity shown in Figure 5.1. But this should not suggest that designs outside this cone are any less useful or valuable. Researchers’ choice of design is ultimately a matter of their personal preference and competence, and the level of internal and external validity they desire.

Construct validity examines how well a given measurement scale is measuring the theoretical construct that it is expected to measure. Many constructs used in social science research such as empathy, resistance to change, and organisational learning are difficult to define, much less measure. For instance, construct validity must ensure that a measure of empathy is indeed measuring empathy and not compassion, which may be difficult since these constructs are somewhat similar in meaning. Construct validity is assessed in positivist research based on correlational or factor analysis of pilot test data, as described in the next chapter.

Statistical conclusion validity examines the extent to which conclusions derived using a statistical procedure are valid. For example, it examines whether the right statistical method was used for hypotheses testing, whether the variables used meet the assumptions of that statistical test (such as sample size or distributional requirements), and so forth. Because interpretive research designs do not employ statistical tests, statistical conclusion validity is not applicable for such analysis. The different kinds of validity and where they exist at the theoretical/empirical levels are illustrated in Figure 5.2.

Figure 5.2: Different types of validity in scientific research

Improving internal and external validity

The best research designs are those that can ensure high levels of internal and external validity. Such designs would guard against spurious correlations, inspire greater faith in hypothesis testing, and ensure that the results drawn from a small sample are generalisable to the population at large. Controls are required to ensure the internal validity (causality) of research designs, and can be accomplished in five ways: manipulation, elimination, inclusion, statistical control, and randomisation.

In manipulation , the researcher manipulates the independent variables in one or more levels (called ‘treatments’), and compares the effects of the treatments against a control group where subjects do not receive the treatment. Treatments may include a new drug or different dosage of drug (for treating a medical condition), a teaching style (for students), and so forth. This type of control is achieved in experimental or quasi-experimental designs, but not in non-experimental designs such as surveys. Note that if subjects cannot distinguish adequately between different levels of treatment manipulations, their responses across treatments may not be different, and manipulation would fail.

The elimination technique relies on eliminating extraneous variables by holding them constant across treatments, such as by restricting the study to a single gender or a single socioeconomic status. In the inclusion technique, the role of extraneous variables is considered by including them in the research design and separately estimating their effects on the dependent variable, such as via factorial designs where one factor is gender (male versus female). Such technique allows for greater generalisability, but also requires substantially larger samples. In statistical control , extraneous variables are measured and used as covariates during the statistical testing process.
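As a small illustration of statistical control, the sketch below estimates a treatment effect while holding a measured extraneous variable constant by including it in an ordinary least squares regression, assuming NumPy is available; the data and variable names are invented.

```python
import numpy as np

# Invented data: an outcome, a 0/1 treatment indicator, and age as a covariate
outcome   = np.array([10.1, 11.8, 13.0, 14.2, 12.5, 14.9, 16.0, 17.3])
treatment = np.array([0,    0,    0,    0,    1,    1,    1,    1   ])
age       = np.array([20,   25,   30,   35,   22,   27,   31,   36  ])

# Design matrix: intercept, treatment, and the covariate (statistical control)
X = np.column_stack([np.ones_like(age, dtype=float), treatment, age])

# Ordinary least squares fit of: outcome ~ intercept + treatment + age
coefs, *_ = np.linalg.lstsq(X, outcome, rcond=None)
intercept, treatment_effect, age_effect = coefs

print(f"Estimated treatment effect, controlling for age: {treatment_effect:.2f}")
```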

Finally, the randomisation technique is aimed at cancelling out the effects of extraneous variables through a process of random sampling, if it can be assured that these effects are of a random (non-systematic) nature. Two types of randomisation are: random selection , where a sample is selected randomly from a population, and random assignment , where subjects selected in a non-random manner are randomly assigned to treatment groups.

Randomisation also ensures external validity, allowing inferences drawn from the sample to be generalised to the population from which the sample is drawn. Note that random assignment is mandatory when random selection is not possible because of resource or access constraints. However, generalisability across populations is harder to ascertain since populations may differ on multiple dimensions and you can only control for a few of those dimensions.
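A minimal sketch of both forms of randomisation, using an invented sampling frame and group sizes:

```python
import random

random.seed(1)  # for reproducibility

# Random selection: draw a probability sample from the sampling frame
sampling_frame = [f"person_{i:03d}" for i in range(1, 501)]   # 500 people
sample = random.sample(sampling_frame, 40)

# Random assignment: randomly allocate the selected subjects to groups
random.shuffle(sample)
treatment_group = sample[:20]
control_group = sample[20:]

print(len(treatment_group), len(control_group))
```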

Popular research designs

As noted earlier, research designs can be classified into two categories—positivist and interpretive—depending on the goal of the research. Positivist designs are meant for theory testing, while interpretive designs are meant for theory building. Positivist designs seek generalised patterns based on an objective view of reality, while interpretive designs seek subjective interpretations of social phenomena from the perspectives of the subjects involved. Some popular examples of positivist designs include laboratory experiments, field experiments, field surveys, secondary data analysis, and case research, while examples of interpretive designs include case research, phenomenology, and ethnography. Note that case research can be used for theory building or theory testing, though not at the same time. Not all techniques are suited for all kinds of scientific research. Some techniques such as focus groups are best suited for exploratory research, others such as ethnography are best for descriptive research, and still others such as laboratory experiments are ideal for explanatory research. Following are brief descriptions of some of these designs. Additional details are provided in Chapters 9–12.

Experimental studies are those that are intended to test cause-effect relationships (hypotheses) in a tightly controlled setting by separating the cause from the effect in time, administering the cause to one group of subjects (the ‘treatment group’) but not to another group (‘control group’), and observing how the mean effects vary between subjects in these two groups. For instance, if we design a laboratory experiment to test the efficacy of a new drug in treating a certain ailment, we can get a random sample of people afflicted with that ailment, randomly assign them to one of two groups (treatment and control groups), administer the drug to subjects in the treatment group, but only give a placebo (e.g., a sugar pill with no medicinal value) to subjects in the control group. More complex designs may include multiple treatment groups, such as low versus high dosage of the drug or combining drug administration with dietary interventions. In a true experimental design , subjects must be randomly assigned to each group. If random assignment is not followed, then the design becomes quasi-experimental . Experiments can be conducted in an artificial or laboratory setting such as at a university (laboratory experiments) or in field settings such as in an organisation where the phenomenon of interest is actually occurring (field experiments). Laboratory experiments allow the researcher to isolate the variables of interest and control for extraneous variables, which may not be possible in field experiments. Hence, inferences drawn from laboratory experiments tend to be stronger in internal validity, but those from field experiments tend to be stronger in external validity. Experimental data is analysed using quantitative statistical techniques. The primary strength of the experimental design is its strong internal validity due to its ability to isolate, control, and intensively examine a small number of variables, while its primary weakness is limited external generalisability since real life is often more complex (i.e., involving more extraneous variables) than contrived lab settings. Furthermore, if the research does not identify ex ante relevant extraneous variables and control for such variables, such lack of controls may hurt internal validity and may lead to spurious correlations.
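As an illustration of how the mean effects in the two groups might be compared, the sketch below runs an independent-samples t-test on invented outcome scores for a treatment group and a control group, assuming SciPy is available.

```python
from scipy import stats

# Invented outcome scores measured after the intervention
treatment_scores = [7.9, 8.4, 6.8, 9.1, 8.0, 7.5, 8.8, 9.3]
control_scores   = [6.1, 5.8, 6.9, 7.0, 5.5, 6.4, 6.6, 7.1]

t_stat, p_value = stats.ttest_ind(treatment_scores, control_scores)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# A small p-value suggests the difference in group means is unlikely to be
# due to chance alone -- internal validity still rests on random assignment
# and control of extraneous variables, not on the test itself.
```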

Field surveys are non-experimental designs that do not control for or manipulate independent variables or treatments, but measure these variables and test their effects using statistical methods. Field surveys capture snapshots of practices, beliefs, or situations from a random sample of subjects in field settings through a survey questionnaire or, less frequently, through a structured interview. In cross-sectional field surveys, independent and dependent variables are measured at the same point in time (e.g., using a single questionnaire), while in longitudinal field surveys, dependent variables are measured at a later point in time than the independent variables. The strengths of field surveys are their external validity (since data is collected in field settings), their ability to capture and control for a large number of variables, and their ability to study a problem from multiple perspectives or using multiple theories. However, because of their non-temporal nature, internal validity (cause-effect relationships) is difficult to infer, and surveys may be subject to respondent biases (e.g., subjects may provide a ‘socially desirable’ response rather than their true response), which further hurts internal validity.

Secondary data analysis is an analysis of data that has previously been collected and tabulated by other sources. Such data may include data from government agencies such as employment statistics from the U.S. Bureau of Labor Statistics or development statistics by country from the United Nations Development Programme, data collected by other researchers (often used in meta-analytic studies), or publicly available third-party data, such as financial data from stock markets or real-time auction data from eBay. This is in contrast to most other research designs, where collecting primary data is part of the researcher’s job. Secondary data analysis may be an effective means of research where primary data collection is too costly or infeasible, and secondary data is available at a level of analysis suitable for answering the researcher’s questions. The limitations of this design are that the data might not have been collected in a systematic or scientific manner, making them unsuitable for scientific research; that, because the data were collected for a presumably different purpose, they may not adequately address the research questions of interest to the researcher; and that internal validity is problematic if the temporal precedence between cause and effect is unclear.

Case research is an in-depth investigation of a problem in one or more real-life settings (case sites) over an extended period of time. Data may be collected using a combination of interviews, personal observations, and internal or external documents. Case studies can be positivist in nature (for hypotheses testing) or interpretive (for theory building). The strength of this research method is its ability to discover a wide variety of social, cultural, and political factors potentially related to the phenomenon of interest that may not be known in advance. Analysis tends to be qualitative in nature, but heavily contextualised and nuanced. However, interpretation of findings may depend on the observational and integrative ability of the researcher, lack of control may make it difficult to establish causality, and findings from a single case site may not be readily generalised to other case sites. Generalisability can be improved by replicating and comparing the analysis in other case sites in a multiple case design .

Focus group research is a type of research that involves bringing in a small group of subjects (typically six to ten people) at one location, and having them discuss a phenomenon of interest for a period of one and a half to two hours. The discussion is moderated and led by a trained facilitator, who sets the agenda and poses an initial set of questions for participants, makes sure that the ideas and experiences of all participants are represented, and attempts to build a holistic understanding of the problem situation based on participants’ comments and experiences. Internal validity cannot be established due to lack of controls and the findings may not be generalised to other settings because of the small sample size. Hence, focus groups are not generally used for explanatory or descriptive research, but are more suited for exploratory research.

Action research assumes that complex social phenomena are best understood by introducing interventions or ‘actions’ into those phenomena and observing the effects of those actions. In this method, the researcher is embedded within a social context such as an organisation and initiates an action—such as new organisational procedures or new technologies—in response to a real problem such as declining profitability or operational bottlenecks. The researcher’s choice of actions must be based on theory, which should explain why and how such actions may cause the desired change. The researcher then observes the results of that action, modifying it as necessary, while simultaneously learning from the action and generating theoretical insights about the target problem and interventions. The initial theory is validated by the extent to which the chosen action successfully solves the target problem. Simultaneous problem solving and insight generation is the central feature that distinguishes action research from all other research methods, and hence, action research is an excellent method for bridging research and practice. This method is also suited for studying unique social problems that cannot be replicated outside that context, but it is also subject to researcher bias and subjectivity, and the generalisability of findings is often restricted to the context where the study was conducted.

Ethnography is an interpretive research design inspired by anthropology that emphasises that research phenomenon must be studied within the context of its culture. The researcher is deeply immersed in a certain culture over an extended period of time—eight months to two years—and during that period, engages, observes, and records the daily life of the studied culture, and theorises about the evolution and behaviours in that culture. Data is collected primarily via observational techniques, formal and informal interaction with participants in that culture, and personal field notes, while data analysis involves ‘sense-making’. The researcher must narrate her experience in great detail so that readers may experience that same culture without necessarily being there. The advantages of this approach are its sensitiveness to the context, the rich and nuanced understanding it generates, and minimal respondent bias. However, this is also an extremely time and resource-intensive approach, and findings are specific to a given culture and less generalisable to other cultures.

Selecting research designs

Given the above multitude of research designs, which design should researchers choose for their research? Generally speaking, researchers tend to select those research designs that they are most comfortable with and feel most competent to handle, but ideally, the choice should depend on the nature of the research phenomenon being studied. In the preliminary phases of research, when the research problem is unclear and the researcher wants to scope out the nature and extent of a certain research problem, a focus group (for an individual unit of analysis) or a case study (for an organisational unit of analysis) is an ideal strategy for exploratory research. As one delves further into the research domain, but finds that there are no good theories to explain the phenomenon of interest and wants to build a theory to fill in the unmet gap in that area, interpretive designs such as case research or ethnography may be useful designs. If competing theories exist and the researcher wishes to test these different theories or integrate them into a larger theory, positivist designs such as experimental design, survey research, or secondary data analysis are more appropriate.

Regardless of the specific research design chosen, the researcher should strive to collect quantitative and qualitative data using a combination of techniques such as questionnaires, interviews, observations, documents, or secondary data. For instance, even in a highly structured survey questionnaire, intended to collect quantitative data, the researcher may leave some room for a few open-ended questions to collect qualitative data that may generate unexpected insights not otherwise available from structured quantitative data alone. Likewise, while case research employs mostly face-to-face interviews to collect qualitative data, the potential and value of collecting quantitative data should not be ignored. As an example, in a study of organisational decision-making processes, the case interviewer can record numeric quantities such as how many months it took to make certain organisational decisions, how many people were involved in that decision process, and how many decision alternatives were considered, which can provide valuable insights not otherwise available from interviewees’ narrative responses. Irrespective of the specific research design employed, the goal of the researcher should be to collect as much and as diverse data as possible to help generate the best possible insights about the phenomenon of interest.

Social Science Research: Principles, Methods and Practices (Revised edition) Copyright © 2019 by Anol Bhattacherjee is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.



Empirical Research: Quantitative & Qualitative


Introduction: What is Empirical Research?


Empirical research  is based on phenomena that can be observed and measured. Empirical research derives knowledge from actual experience rather than from theory or belief. 

Key characteristics of empirical research include:

  • Specific research questions to be answered;
  • Definitions of the population, behavior, or phenomena being studied;
  • Description of the methodology or research design used to study this population or phenomena, including selection criteria, controls, and testing instruments (such as surveys);
  • Two basic research processes or methods in empirical research: quantitative methods and qualitative methods (see the rest of the guide for more about these methods).

(based on the original from the Connelly Library of La Salle University)


Empirical Research: Qualitative vs. Quantitative


Quantitative Research

A quantitative research project is characterized by having a population about which the researcher wants to draw conclusions, but it is not possible to collect data on the entire population.

  • For an observational study, it is necessary to select a proper, statistical random sample and to use methods of statistical inference to draw conclusions about the population. 
  • For an experimental study, it is necessary to have a random assignment of subjects to experimental and control groups in order to use methods of statistical inference.

Statistical methods are used in all three stages of a quantitative research project.

For observational studies, the data are collected using statistical sampling theory. Then, the sample data are analyzed using descriptive statistical analysis. Finally, generalizations are made from the sample data to the entire population using statistical inference.

For experimental studies, the subjects are allocated to experimental and control group using randomizing methods. Then, the experimental data are analyzed using descriptive statistical analysis. Finally, just as for observational data, generalizations are made to a larger population.

Iversen, G. (2004). Quantitative research . In M. Lewis-Beck, A. Bryman, & T. Liao (Eds.), Encyclopedia of social science research methods . (pp. 897-898). Thousand Oaks, CA: SAGE Publications, Inc.
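To make the three stages above concrete, here is a minimal Python sketch of an observational quantitative workflow: draw a simple random sample from a population, describe the sample, and infer the population mean. The population values, sample size, and the large-sample 1.96 critical value are illustrative assumptions added for this example, not part of the encyclopedia entry above.

```python
# Minimal sketch of the three stages of an observational quantitative study:
# (1) draw a statistical random sample, (2) describe it, (3) infer to the population.
# The "population" here is synthetic, purely for illustration.
import random
import statistics
import math

random.seed(42)

# Hypothetical population: yearly income (in thousands) of 100,000 people.
population = [random.gauss(mu=50, sigma=12) for _ in range(100_000)]

# Stage 1: simple random sample (every member has an equal chance of selection).
sample = random.sample(population, k=500)

# Stage 2: descriptive statistical analysis of the sample.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)

# Stage 3: statistical inference - a 95% confidence interval for the population
# mean (large-sample normal approximation, hence the 1.96 critical value).
standard_error = sample_sd / math.sqrt(len(sample))
ci_low, ci_high = sample_mean - 1.96 * standard_error, sample_mean + 1.96 * standard_error

print(f"Sample mean: {sample_mean:.2f}, SD: {sample_sd:.2f}")
print(f"95% CI for the population mean: ({ci_low:.2f}, {ci_high:.2f})")
```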

Qualitative Research

What makes a work deserving of the label qualitative research is the demonstrable effort to produce richly and relevantly detailed descriptions and particularized interpretations of people and the social, linguistic, material, and other practices and events that shape and are shaped by them.

Qualitative research typically includes, but is not limited to, discerning the perspectives of these people, or what is often referred to as the actor’s point of view. Although both philosophically and methodologically a highly diverse entity, qualitative research is marked by certain defining imperatives that include its case (as opposed to its variable) orientation, sensitivity to cultural and historical context, and reflexivity. 

In its many guises, qualitative research is a form of empirical inquiry that typically entails some form of purposive sampling for information-rich cases; in-depth interviews and open-ended interviews, lengthy participant/field observations, and/or document or artifact study; and techniques for analysis and interpretation of data that move beyond the data generated and their surface appearances. 

Sandelowski, M. (2004).  Qualitative research . In M. Lewis-Beck, A. Bryman, & T. Liao (Eds.),  Encyclopedia of social science research methods . (pp. 893-894). Thousand Oaks, CA: SAGE Publications, Inc.


What is Empirical Research? Definition, Methods, Examples

Appinio Research · 09.02.2024 · 35min read


Ever wondered how we gather the facts, unveil hidden truths, and make informed decisions in a world filled with questions? Empirical research holds the key.

In this guide, we'll delve deep into the art and science of empirical research, unraveling its methods, mysteries, and manifold applications. From defining the core principles to mastering data analysis and reporting findings, we're here to equip you with the knowledge and tools to navigate the empirical landscape.

What is Empirical Research?

Empirical research is the cornerstone of scientific inquiry, providing a systematic and structured approach to investigating the world around us. It is the process of gathering and analyzing empirical or observable data to test hypotheses, answer research questions, or gain insights into various phenomena. This form of research relies on evidence derived from direct observation or experimentation, allowing researchers to draw conclusions based on real-world data rather than purely theoretical or speculative reasoning.

Characteristics of Empirical Research

Empirical research is characterized by several key features:

  • Observation and Measurement : It involves the systematic observation or measurement of variables, events, or behaviors.
  • Data Collection : Researchers collect data through various methods, such as surveys, experiments, observations, or interviews.
  • Testable Hypotheses : Empirical research often starts with testable hypotheses that are evaluated using collected data.
  • Quantitative or Qualitative Data : Data can be quantitative (numerical) or qualitative (non-numerical), depending on the research design.
  • Statistical Analysis : Quantitative data often undergo statistical analysis to determine patterns, relationships, or significance.
  • Objectivity and Replicability : Empirical research strives for objectivity, minimizing researcher bias. It should be replicable, allowing other researchers to conduct the same study to verify results.
  • Conclusions and Generalizations : Empirical research generates findings based on data and aims to make generalizations about larger populations or phenomena.

Importance of Empirical Research

Empirical research plays a pivotal role in advancing knowledge across various disciplines. Its importance extends to academia, industry, and society as a whole. Here are several reasons why empirical research is essential:

  • Evidence-Based Knowledge : Empirical research provides a solid foundation of evidence-based knowledge. It enables us to test hypotheses, confirm or refute theories, and build a robust understanding of the world.
  • Scientific Progress : In the scientific community, empirical research fuels progress by expanding the boundaries of existing knowledge. It contributes to the development of theories and the formulation of new research questions.
  • Problem Solving : Empirical research is instrumental in addressing real-world problems and challenges. It offers insights and data-driven solutions to complex issues in fields like healthcare, economics, and environmental science.
  • Informed Decision-Making : In policymaking, business, and healthcare, empirical research informs decision-makers by providing data-driven insights. It guides strategies, investments, and policies for optimal outcomes.
  • Quality Assurance : Empirical research is essential for quality assurance and validation in various industries, including pharmaceuticals, manufacturing, and technology. It ensures that products and processes meet established standards.
  • Continuous Improvement : Businesses and organizations use empirical research to evaluate performance, customer satisfaction, and product effectiveness. This data-driven approach fosters continuous improvement and innovation.
  • Human Advancement : Empirical research in fields like medicine and psychology contributes to the betterment of human health and well-being. It leads to medical breakthroughs, improved therapies, and enhanced psychological interventions.
  • Critical Thinking and Problem Solving : Engaging in empirical research fosters critical thinking skills, problem-solving abilities, and a deep appreciation for evidence-based decision-making.

Empirical research empowers us to explore, understand, and improve the world around us. It forms the bedrock of scientific inquiry and drives progress in countless domains, shaping our understanding of both the natural and social sciences.

How to Conduct Empirical Research?

So, you've decided to dive into the world of empirical research. Let's begin by exploring the crucial steps involved in getting started with your research project.

1. Select a Research Topic

Selecting the right research topic is the cornerstone of a successful empirical study. It's essential to choose a topic that not only piques your interest but also aligns with your research goals and objectives. Here's how to go about it:

  • Identify Your Interests : Start by reflecting on your passions and interests. What topics fascinate you the most? Your enthusiasm will be your driving force throughout the research process.
  • Brainstorm Ideas : Engage in brainstorming sessions to generate potential research topics. Consider the questions you've always wanted to answer or the issues that intrigue you.
  • Relevance and Significance : Assess the relevance and significance of your chosen topic. Does it contribute to existing knowledge? Is it a pressing issue in your field of study or the broader community?
  • Feasibility : Evaluate the feasibility of your research topic. Do you have access to the necessary resources, data, and participants (if applicable)?

2. Formulate Research Questions

Once you've narrowed down your research topic, the next step is to formulate clear and precise research questions. These questions will guide your entire research process and shape your study's direction. To create effective research questions:

  • Specificity : Ensure that your research questions are specific and focused. Vague or overly broad questions can lead to inconclusive results.
  • Relevance : Your research questions should directly relate to your chosen topic. They should address gaps in knowledge or contribute to solving a particular problem.
  • Testability : Ensure that your questions are testable through empirical methods. You should be able to gather data and analyze it to answer these questions.
  • Avoid Bias : Craft your questions in a way that avoids leading or biased language. Maintain neutrality to uphold the integrity of your research.

3. Review Existing Literature

Before you embark on your empirical research journey, it's essential to immerse yourself in the existing body of literature related to your chosen topic. This step, often referred to as a literature review, serves several purposes:

  • Contextualization : Understand the historical context and current state of research in your field. What have previous studies found, and what questions remain unanswered?
  • Identifying Gaps : Identify gaps or areas where existing research falls short. These gaps will help you formulate meaningful research questions and hypotheses.
  • Theory Development : If your study is theoretical, consider how existing theories apply to your topic. If it's empirical, understand how previous studies have approached data collection and analysis.
  • Methodological Insights : Learn from the methodologies employed in previous research. What methods were successful, and what challenges did researchers face?

4. Define Variables

Variables are fundamental components of empirical research. They are the factors or characteristics that can change or be manipulated during your study. Properly defining and categorizing variables is crucial for the clarity and validity of your research. Here's what you need to know:

  • Independent Variables : These are the variables that you, as the researcher, manipulate or control. They are the "cause" in cause-and-effect relationships.
  • Dependent Variables : Dependent variables are the outcomes or responses that you measure or observe. They are the "effect" influenced by changes in independent variables.
  • Operational Definitions : To ensure consistency and clarity, provide operational definitions for your variables. Specify how you will measure or manipulate each variable.
  • Control Variables : In some studies, controlling for other variables that may influence your dependent variable is essential. These are known as control variables.

Understanding these foundational aspects of empirical research will set a solid foundation for the rest of your journey. Now that you've grasped the essentials of getting started, let's delve deeper into the intricacies of research design.

Empirical Research Design

Now that you've selected your research topic, formulated research questions, and defined your variables, it's time to delve into the heart of your empirical research journey – research design. This pivotal step determines how you will collect data and what methods you'll employ to answer your research questions. Let's explore the various facets of research design in detail.

Types of Empirical Research

Empirical research can take on several forms, each with its own unique approach and methodologies. Understanding the different types of empirical research will help you choose the most suitable design for your study. Here are some common types:

  • Experimental Research : In this type, researchers manipulate one or more independent variables to observe their impact on dependent variables. It's highly controlled and often conducted in a laboratory setting.
  • Observational Research : Observational research involves the systematic observation of subjects or phenomena without intervention. Researchers are passive observers, documenting behaviors, events, or patterns.
  • Survey Research : Surveys are used to collect data through structured questionnaires or interviews. This method is efficient for gathering information from a large number of participants.
  • Case Study Research : Case studies focus on in-depth exploration of one or a few cases. Researchers gather detailed information through various sources such as interviews, documents, and observations.
  • Qualitative Research : Qualitative research aims to understand behaviors, experiences, and opinions in depth. It often involves open-ended questions, interviews, and thematic analysis.
  • Quantitative Research : Quantitative research collects numerical data and relies on statistical analysis to draw conclusions. It involves structured questionnaires, experiments, and surveys.

Your choice of research type should align with your research questions and objectives. Experimental research, for example, is ideal for testing cause-and-effect relationships, while qualitative research is more suitable for exploring complex phenomena.

Experimental Design

Experimental research is a systematic approach to studying causal relationships. It's characterized by the manipulation of one or more independent variables while controlling for other factors. Here are some key aspects of experimental design:

  • Control and Experimental Groups : Participants are randomly assigned to either a control group or an experimental group. The independent variable is manipulated for the experimental group but not for the control group.
  • Randomization : Randomization is crucial to eliminate bias in group assignment. It ensures that each participant has an equal chance of being in either group.
  • Hypothesis Testing : Experimental research often involves hypothesis testing. Researchers formulate hypotheses about the expected effects of the independent variable and use statistical analysis to test these hypotheses, as illustrated in the sketch after this list.
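
The sketch below illustrates, with invented data, the two ideas just listed: random assignment of participants to control and experimental groups, followed by a two-sample t-test of the group difference. Group sizes, scores, and the assumed treatment effect are all hypothetical.

```python
# A toy sketch (not a real trial) of the experimental-design ideas above:
# random assignment to control vs. experimental groups, then a hypothesis test.
# Participant IDs, group sizes, and the simulated "outcome" are all made up.
import random
from scipy import stats

random.seed(7)

participants = list(range(40))          # hypothetical participant IDs
random.shuffle(participants)            # randomization removes assignment bias
control_group = participants[:20]
experimental_group = participants[20:]

# Simulated outcome scores: the experimental group gets a small added effect.
control_scores = [random.gauss(70, 8) for _ in control_group]
experimental_scores = [random.gauss(75, 8) for _ in experimental_group]

# Hypothesis test: is the difference in group means statistically significant?
t_stat, p_value = stats.ttest_ind(experimental_scores, control_scores)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
print("Reject H0 at alpha = 0.05" if p_value < 0.05 else "Fail to reject H0")
```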

Observational Design

Observational research entails careful and systematic observation of subjects or phenomena. It's advantageous when you want to understand natural behaviors or events. Key aspects of observational design include:

  • Participant Observation : Researchers immerse themselves in the environment they are studying. They become part of the group being observed, allowing for a deep understanding of behaviors.
  • Non-Participant Observation : In non-participant observation, researchers remain separate from the subjects. They observe and document behaviors without direct involvement.
  • Data Collection Methods : Observational research can involve various data collection methods, such as field notes, video recordings, photographs, or coding of observed behaviors.

Survey Design

Surveys are a popular choice for collecting data from a large number of participants. Effective survey design is essential to ensure the validity and reliability of your data. Consider the following:

  • Questionnaire Design : Create clear and concise questions that are easy for participants to understand. Avoid leading or biased questions.
  • Sampling Methods : Decide on the appropriate sampling method for your study, whether it's random, stratified, or convenience sampling.
  • Data Collection Tools : Choose the right tools for data collection, whether it's paper surveys, online questionnaires, or face-to-face interviews.

Case Study Design

Case studies are an in-depth exploration of one or a few cases to gain a deep understanding of a particular phenomenon. Key aspects of case study design include:

  • Single Case vs. Multiple Case Studies : Decide whether you'll focus on a single case or multiple cases. Single case studies are intensive and allow for detailed examination, while multiple case studies provide comparative insights.
  • Data Collection Methods : Gather data through interviews, observations, document analysis, or a combination of these methods.

Qualitative vs. Quantitative Research

In empirical research, you'll often encounter the distinction between qualitative and quantitative research. Here's a closer look at these two approaches:

  • Qualitative Research : Qualitative research seeks an in-depth understanding of human behavior, experiences, and perspectives. It involves open-ended questions, interviews, and the analysis of textual or narrative data. Qualitative research is exploratory and often used when the research question is complex and requires a nuanced understanding.
  • Quantitative Research : Quantitative research collects numerical data and employs statistical analysis to draw conclusions. It involves structured questionnaires, experiments, and surveys. Quantitative research is ideal for testing hypotheses and establishing cause-and-effect relationships.

Understanding the various research design options is crucial in determining the most appropriate approach for your study. Your choice should align with your research questions, objectives, and the nature of the phenomenon you're investigating.

Data Collection for Empirical Research

Now that you've established your research design, it's time to roll up your sleeves and collect the data that will fuel your empirical research. Effective data collection is essential for obtaining accurate and reliable results.

Sampling Methods

Sampling methods are critical in empirical research, as they determine the subset of individuals or elements from your target population that you will study. Here are some standard sampling methods:

  • Random Sampling : Random sampling ensures that every member of the population has an equal chance of being selected. It minimizes bias and is often used in quantitative research.
  • Stratified Sampling : Stratified sampling involves dividing the population into subgroups or strata based on specific characteristics (e.g., age, gender, location). Samples are then randomly selected from each stratum, ensuring representation of all subgroups.
  • Convenience Sampling : Convenience sampling involves selecting participants who are readily available or easily accessible. While it's convenient, it may introduce bias and limit the generalizability of results.
  • Snowball Sampling : Snowball sampling is useful when studying hard-to-reach or hidden populations. One participant leads you to another, creating a "snowball" effect. This method is common in qualitative research.
  • Purposive Sampling : In purposive sampling, researchers deliberately select participants who meet specific criteria relevant to their research questions. It's often used in qualitative studies to gather in-depth information.

The choice of sampling method depends on the nature of your research, available resources, and the degree of precision required. It's crucial to carefully consider your sampling strategy to ensure that your sample accurately represents your target population.
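
As a rough illustration of how two of these methods differ in practice, the following Python sketch draws a simple random sample and a proportionally allocated stratified sample from a made-up sampling frame; the strata, frame size, and sample size are arbitrary assumptions for the example.

```python
# A small sketch of two of the sampling methods above - simple random and
# stratified sampling - using a hypothetical participant frame.
import random
from collections import defaultdict

random.seed(1)

# Hypothetical sampling frame: (participant_id, age_group)
frame = [(i, random.choice(["18-29", "30-49", "50+"])) for i in range(1_000)]

# Simple random sampling: every participant has an equal chance of selection.
simple_random_sample = random.sample(frame, k=100)

# Stratified sampling: sample proportionally from each age-group stratum.
strata = defaultdict(list)
for participant in frame:
    strata[participant[1]].append(participant)

stratified_sample = []
for stratum, members in strata.items():
    n = round(100 * len(members) / len(frame))   # proportional allocation
    stratified_sample.extend(random.sample(members, n))

print(len(simple_random_sample), len(stratified_sample))
```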

Data Collection Instruments

Data collection instruments are the tools you use to gather information from your participants or sources. These instruments should be designed to capture the data you need accurately. Here are some popular data collection instruments:

  • Questionnaires : Questionnaires consist of structured questions with predefined response options. When designing questionnaires, consider the clarity of questions, the order of questions, and the response format (e.g., Likert scale, multiple-choice).
  • Interviews : Interviews involve direct communication between the researcher and participants. They can be structured (with predetermined questions) or unstructured (open-ended). Effective interviews require active listening and probing for deeper insights.
  • Observations : Observations entail systematically and objectively recording behaviors, events, or phenomena. Researchers must establish clear criteria for what to observe, how to record observations, and when to observe.
  • Surveys : Surveys are a common data collection instrument for quantitative research. They can be administered through various means, including online surveys, paper surveys, and telephone surveys.
  • Documents and Archives : In some cases, data may be collected from existing documents, records, or archives. Ensure that the sources are reliable, relevant, and properly documented.

To streamline your process and gather insights with precision and efficiency, consider leveraging innovative tools like Appinio. With Appinio's intuitive platform, you can harness the power of real-time consumer data to inform your research decisions effectively. Whether you're conducting surveys, interviews, or observations, Appinio empowers you to define your target audience, collect data from diverse demographics, and analyze results seamlessly.

By incorporating Appinio into your data collection toolkit, you can unlock a world of possibilities and elevate the impact of your empirical research. Ready to revolutionize your approach to data collection?


Data Collection Procedures

Data collection procedures outline the step-by-step process for gathering data. These procedures should be meticulously planned and executed to maintain the integrity of your research.

  • Training : If you have a research team, ensure that they are trained in data collection methods and protocols. Consistency in data collection is crucial.
  • Pilot Testing : Before launching your data collection, conduct a pilot test with a small group to identify any potential problems with your instruments or procedures. Make necessary adjustments based on feedback.
  • Data Recording : Establish a systematic method for recording data. This may include timestamps, codes, or identifiers for each data point.
  • Data Security : Safeguard the confidentiality and security of collected data. Ensure that only authorized individuals have access to the data.
  • Data Storage : Properly organize and store your data in a secure location, whether in physical or digital form. Back up data to prevent loss.

Ethical Considerations

Ethical considerations are paramount in empirical research, as they ensure the well-being and rights of participants are protected.

  • Informed Consent : Obtain informed consent from participants, providing clear information about the research purpose, procedures, risks, and their right to withdraw at any time.
  • Privacy and Confidentiality : Protect the privacy and confidentiality of participants. Ensure that data is anonymized and sensitive information is kept confidential.
  • Beneficence : Ensure that your research benefits participants and society while minimizing harm. Consider the potential risks and benefits of your study.
  • Honesty and Integrity : Conduct research with honesty and integrity. Report findings accurately and transparently, even if they are not what you expected.
  • Respect for Participants : Treat participants with respect, dignity, and sensitivity to cultural differences. Avoid any form of coercion or manipulation.
  • Institutional Review Board (IRB) : If required, seek approval from an IRB or ethics committee before conducting your research, particularly when working with human participants.

Adhering to ethical guidelines is not only essential for the ethical conduct of research but also crucial for the credibility and validity of your study. Ethical research practices build trust between researchers and participants and contribute to the advancement of knowledge with integrity.

With a solid understanding of data collection, including sampling methods, instruments, procedures, and ethical considerations, you are now well-equipped to gather the data needed to answer your research questions.

Empirical Research Data Analysis

Now comes the exciting phase of data analysis, where the raw data you've diligently collected starts to yield insights and answers to your research questions. We will explore the various aspects of data analysis, from preparing your data to drawing meaningful conclusions through statistics and visualization.

Data Preparation

Data preparation is the crucial first step in data analysis. It involves cleaning, organizing, and transforming your raw data into a format that is ready for analysis. Effective data preparation ensures the accuracy and reliability of your results.

  • Data Cleaning : Identify and rectify errors, missing values, and inconsistencies in your dataset. This may involve correcting typos, removing outliers, and imputing missing data.
  • Data Coding : Assign numerical values or codes to categorical variables to make them suitable for statistical analysis. For example, converting "Yes" and "No" to 1 and 0.
  • Data Transformation : Transform variables as needed to meet the assumptions of the statistical tests you plan to use. Common transformations include logarithmic or square root transformations.
  • Data Integration : If your data comes from multiple sources, integrate it into a unified dataset, ensuring that variables match and align.
  • Data Documentation : Maintain clear documentation of all data preparation steps, as well as the rationale behind each decision. This transparency is essential for replicability.

Effective data preparation lays the foundation for accurate and meaningful analysis. It allows you to trust the results that will follow in the subsequent stages.
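
A brief pandas sketch of these preparation steps is shown below; the column names, values, and the choice to drop incomplete rows are hypothetical and only meant to illustrate cleaning, coding, and transformation.

```python
# A minimal data-preparation sketch with pandas; the columns and values are
# invented stand-ins for a survey dataset of your own.
import pandas as pd
import numpy as np

raw = pd.DataFrame({
    "respondent": [1, 2, 3, 4],
    "satisfied": ["Yes", "No", "Yes", None],     # categorical with a missing value
    "income": [42000, 55000, None, 61000],       # numeric with a missing value
})

clean = raw.copy()

# Data cleaning: handle missing values (here, drop incomplete responses).
clean = clean.dropna()

# Data coding: convert "Yes"/"No" to 1/0 for statistical analysis.
clean["satisfied_coded"] = clean["satisfied"].map({"Yes": 1, "No": 0})

# Data transformation: log-transform a skewed numeric variable.
clean["log_income"] = np.log(clean["income"])

print(clean)
```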

Descriptive Statistics

Descriptive statistics help you summarize and make sense of your data by providing a clear overview of its key characteristics. These statistics are essential for understanding the central tendencies, variability, and distribution of your variables. Descriptive statistics include:

  • Measures of Central Tendency : These include the mean (average), median (middle value), and mode (most frequent value). They help you understand the typical or central value of your data.
  • Measures of Dispersion : Measures like the range, variance, and standard deviation provide insights into the spread or variability of your data points.
  • Frequency Distributions : Creating frequency distributions or histograms allows you to visualize the distribution of your data across different values or categories.

Descriptive statistics provide the initial insights needed to understand your data's basic characteristics, which can inform further analysis.
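
For instance, the measures above can be computed with Python's standard library alone; the Likert-style scores below are invented purely for illustration.

```python
# Descriptive statistics on a small, made-up set of 1-5 survey scores.
import statistics
from collections import Counter

scores = [4, 5, 3, 4, 2, 5, 4, 3, 4, 1, 5, 4]

# Measures of central tendency.
print("Mean:  ", statistics.mean(scores))
print("Median:", statistics.median(scores))
print("Mode:  ", statistics.mode(scores))

# Measures of dispersion.
print("Range:   ", max(scores) - min(scores))
print("Variance:", round(statistics.variance(scores), 2))
print("Std dev: ", round(statistics.stdev(scores), 2))

# Frequency distribution: how often each response value occurs.
print("Frequencies:", dict(sorted(Counter(scores).items())))
```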

Inferential Statistics

Inferential statistics take your analysis to the next level by allowing you to make inferences or predictions about a larger population based on your sample data. These methods help you test hypotheses and draw meaningful conclusions. Key concepts in inferential statistics include:

  • Hypothesis Testing : Hypothesis tests (e.g., t-tests, chi-squared tests) help you determine whether observed differences or associations in your data are statistically significant or occurred by chance.
  • Confidence Intervals : Confidence intervals provide a range within which population parameters (e.g., population mean) are likely to fall based on your sample data.
  • Regression Analysis : Regression models (linear, logistic, etc.) help you explore relationships between variables and make predictions.
  • Analysis of Variance (ANOVA) : ANOVA tests are used to compare means between multiple groups, allowing you to assess whether differences are statistically significant.

Inferential statistics are powerful tools for drawing conclusions from your data and assessing the generalizability of your findings to the broader population.
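
The following sketch, using simulated data and SciPy, shows what a few of these inferential tools look like in code: a confidence interval for a mean, a simple linear regression, and a one-way ANOVA. All data and parameter choices are assumptions made for the example.

```python
# A short sketch of three inferential tools mentioned above, on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Confidence interval for a sample mean.
sample = rng.normal(loc=50, scale=10, size=200)
ci = stats.t.interval(0.95, df=len(sample) - 1,
                      loc=sample.mean(), scale=stats.sem(sample))
print("95% CI for the mean:", tuple(round(v, 2) for v in ci))

# Simple linear regression: does x predict y?
x = rng.uniform(0, 10, size=100)
y = 3 * x + rng.normal(0, 2, size=100)
reg = stats.linregress(x, y)
print(f"slope = {reg.slope:.2f}, p = {reg.pvalue:.4f}")

# One-way ANOVA: do three group means differ?
g1, g2, g3 = rng.normal(10, 2, 30), rng.normal(11, 2, 30), rng.normal(14, 2, 30)
f_stat, p_value = stats.f_oneway(g1, g2, g3)
print(f"ANOVA: F = {f_stat:.2f}, p = {p_value:.4f}")
```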

Qualitative Data Analysis

Qualitative data analysis is employed when working with non-numerical data, such as text, interviews, or open-ended survey responses. It focuses on understanding the underlying themes, patterns, and meanings within qualitative data. Qualitative analysis techniques include:

  • Thematic Analysis : Identifying and analyzing recurring themes or patterns within textual data.
  • Content Analysis : Categorizing and coding qualitative data to extract meaningful insights.
  • Grounded Theory : Developing theories or frameworks based on emergent themes from the data.
  • Narrative Analysis : Examining the structure and content of narratives to uncover meaning.

Qualitative data analysis provides a rich and nuanced understanding of complex phenomena and human experiences.
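
Qualitative coding is normally an interpretive, human-led process (often supported by dedicated software), but the toy sketch below conveys the mechanical core of a simple content analysis: applying a keyword codebook to open-ended responses and counting themes. The responses and codebook are invented for illustration.

```python
# Drastically simplified content-analysis sketch: code open-ended responses
# against a keyword "codebook" and count how often each theme appears.
from collections import Counter

responses = [
    "I felt supported by my manager but the workload was overwhelming",
    "Great team spirit, though communication between departments is poor",
    "The workload keeps growing and communication from leadership is unclear",
]

codebook = {
    "workload": ["workload", "overwhelming"],
    "communication": ["communication", "unclear"],
    "support": ["supported", "team spirit"],
}

theme_counts = Counter()
for response in responses:
    text = response.lower()
    for theme, keywords in codebook.items():
        if any(keyword in text for keyword in keywords):
            theme_counts[theme] += 1

print(theme_counts)   # each theme is counted once per response that mentions it
```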

Data Visualization

Data visualization is the art of representing data graphically to make complex information more understandable and accessible. Effective data visualization can reveal patterns, trends, and outliers in your data. Common types of data visualization include:

  • Bar Charts and Histograms : Used to display the distribution of categorical or discrete data.
  • Line Charts : Ideal for showing trends and changes in data over time.
  • Scatter Plots : Visualize relationships and correlations between two variables.
  • Pie Charts : Display the composition of a whole in terms of its parts.
  • Heatmaps : Depict patterns and relationships in multidimensional data through color-coding.
  • Box Plots : Provide a summary of the data distribution, including outliers.
  • Interactive Dashboards : Create dynamic visualizations that allow users to explore data interactively.

Data visualization not only enhances your understanding of the data but also serves as a powerful communication tool to convey your findings to others.
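
As a quick illustration, the matplotlib sketch below produces three of the chart types listed above from small invented datasets.

```python
# A minimal matplotlib sketch of three common chart types, using made-up data.
import matplotlib.pyplot as plt

fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(12, 3))

# Bar chart: distribution of a categorical variable.
ax1.bar(["Agree", "Neutral", "Disagree"], [42, 31, 27])
ax1.set_title("Bar chart")

# Histogram: distribution of a numeric variable.
ax2.hist([2, 3, 3, 4, 4, 4, 5, 5, 6, 7, 8, 8, 9], bins=5)
ax2.set_title("Histogram")

# Scatter plot: relationship between two variables.
ax3.scatter([1, 2, 3, 4, 5, 6], [2.1, 3.9, 6.2, 8.1, 9.8, 12.3])
ax3.set_title("Scatter plot")

plt.tight_layout()
plt.show()
```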

As you embark on the data analysis phase of your empirical research, remember that the specific methods and techniques you choose will depend on your research questions, data type, and objectives. Effective data analysis transforms raw data into valuable insights, bringing you closer to the answers you seek.

How to Report Empirical Research Results?

At this stage, you get to share your empirical research findings with the world. Effective reporting and presentation of your results are crucial for communicating your research's impact and insights.

1. Write the Research Paper

Writing a research paper is the culmination of your empirical research journey. It's where you synthesize your findings, provide context, and contribute to the body of knowledge in your field.

  • Title and Abstract : Craft a clear and concise title that reflects your research's essence. The abstract should provide a brief summary of your research objectives, methods, findings, and implications.
  • Introduction : In the introduction, introduce your research topic, state your research questions or hypotheses, and explain the significance of your study. Provide context by discussing relevant literature.
  • Methods : Describe your research design, data collection methods, and sampling procedures. Be precise and transparent, allowing readers to understand how you conducted your study.
  • Results : Present your findings in a clear and organized manner. Use tables, graphs, and statistical analyses to support your results. Avoid interpreting your findings in this section; focus on the presentation of raw data.
  • Discussion : Interpret your findings and discuss their implications. Relate your results to your research questions and the existing literature. Address any limitations of your study and suggest avenues for future research.
  • Conclusion : Summarize the key points of your research and its significance. Restate your main findings and their implications.
  • References : Cite all sources used in your research following a specific citation style (e.g., APA, MLA, Chicago). Ensure accuracy and consistency in your citations.
  • Appendices : Include any supplementary material, such as questionnaires, data coding sheets, or additional analyses, in the appendices.

Writing a research paper is a skill that improves with practice. Ensure clarity, coherence, and conciseness in your writing to make your research accessible to a broader audience.

2. Create Visuals and Tables

Visuals and tables are powerful tools for presenting complex data in an accessible and understandable manner.

  • Clarity : Ensure that your visuals and tables are clear and easy to interpret. Use descriptive titles and labels.
  • Consistency : Maintain consistency in formatting, such as font size and style, across all visuals and tables.
  • Appropriateness : Choose the most suitable visual representation for your data. Bar charts, line graphs, and scatter plots work well for different types of data.
  • Simplicity : Avoid clutter and unnecessary details. Focus on conveying the main points.
  • Accessibility : Make sure your visuals and tables are accessible to a broad audience, including those with visual impairments.
  • Captions : Include informative captions that explain the significance of each visual or table.

Compelling visuals and tables enhance the reader's understanding of your research and can be the key to conveying complex information efficiently.

3. Interpret Findings

Interpreting your findings is where you bridge the gap between data and meaning. It's your opportunity to provide context, discuss implications, and offer insights. When interpreting your findings:

  • Relate to Research Questions : Discuss how your findings directly address your research questions or hypotheses.
  • Compare with Literature : Analyze how your results align with or deviate from previous research in your field. What insights can you draw from these comparisons?
  • Discuss Limitations : Be transparent about the limitations of your study. Address any constraints, biases, or potential sources of error.
  • Practical Implications : Explore the real-world implications of your findings. How can they be applied or inform decision-making?
  • Future Research Directions : Suggest areas for future research based on the gaps or unanswered questions that emerged from your study.

Interpreting findings goes beyond simply presenting data; it's about weaving a narrative that helps readers grasp the significance of your research in the broader context.

With your research paper written, structured, and enriched with visuals, and your findings expertly interpreted, you are now prepared to communicate your research effectively. Sharing your insights and contributing to the body of knowledge in your field is a significant accomplishment in empirical research.

Examples of Empirical Research

To solidify your understanding of empirical research, let's delve into some real-world examples across different fields. These examples will illustrate how empirical research is applied to gather data, analyze findings, and draw conclusions.

Social Sciences

In the realm of social sciences, consider a sociological study exploring the impact of socioeconomic status on educational attainment. Researchers gather data from a diverse group of individuals, including their family backgrounds, income levels, and academic achievements.

Through statistical analysis, they can identify correlations and trends, revealing whether individuals from lower socioeconomic backgrounds are less likely to attain higher levels of education. This empirical research helps shed light on societal inequalities and informs policymakers on potential interventions to address disparities in educational access.

Environmental Science

Environmental scientists often employ empirical research to assess the effects of environmental changes. For instance, researchers studying the impact of climate change on wildlife might collect data on animal populations, weather patterns, and habitat conditions over an extended period.

By analyzing this empirical data, they can identify correlations between climate fluctuations and changes in wildlife behavior, migration patterns, or population sizes. This empirical research is crucial for understanding the ecological consequences of climate change and informing conservation efforts.

Business and Economics

In the business world, empirical research is essential for making data-driven decisions. Consider a market research study conducted by a business seeking to launch a new product. They collect data through surveys, focus groups, and consumer behavior analysis.

By examining this empirical data, the company can gauge consumer preferences, demand, and potential market size. Empirical research in business helps guide product development, pricing strategies, and marketing campaigns, increasing the likelihood of a successful product launch.

Psychology

Psychological studies frequently rely on empirical research to understand human behavior and cognition. For instance, a psychologist interested in examining the impact of stress on memory might design an experiment. Participants are exposed to stress-inducing situations, and their memory performance is assessed through various tasks.

By analyzing the data collected, the psychologist can determine whether stress has a significant effect on memory recall. This empirical research contributes to our understanding of the complex interplay between psychological factors and cognitive processes.

These examples highlight the versatility and applicability of empirical research across diverse fields. Whether in medicine, social sciences, environmental science, business, or psychology, empirical research serves as a fundamental tool for gaining insights, testing hypotheses, and driving advancements in knowledge and practice.

Conclusion for Empirical Research

Empirical research is a powerful tool for gaining insights, testing hypotheses, and making informed decisions. By following the steps outlined in this guide, you've learned how to select research topics, collect data, analyze findings, and effectively communicate your research to the world. Remember, empirical research is a journey of discovery, and each step you take brings you closer to a deeper understanding of the world around you. Whether you're a scientist, a student, or someone curious about the process, the principles of empirical research empower you to explore, learn, and contribute to the ever-expanding realm of knowledge.

How to Collect Data for Empirical Research?

Introducing Appinio , the real-time market research platform revolutionizing how companies gather consumer insights for their empirical research endeavors. With Appinio, you can conduct your own market research in minutes, gaining valuable data to fuel your data-driven decisions.

Appinio is more than just a market research platform; it's a catalyst for transforming the way you approach empirical research, making it exciting, intuitive, and seamlessly integrated into your decision-making process.

Here's why Appinio is the go-to solution for empirical research:

  • From Questions to Insights in Minutes : With Appinio's streamlined process, you can go from formulating your research questions to obtaining actionable insights in a matter of minutes, saving you time and effort.
  • Intuitive Platform for Everyone : No need for a PhD in research; Appinio's platform is designed to be intuitive and user-friendly, ensuring that anyone can navigate and utilize it effectively.
  • Rapid Response Times : With an average field time of under 23 minutes for 1,000 respondents, Appinio delivers rapid results, allowing you to gather data swiftly and efficiently.
  • Global Reach with Targeted Precision : With access to over 90 countries and the ability to define target groups based on 1200+ characteristics, Appinio empowers you to reach your desired audience with precision and ease.


Purdue University


Research: Overview & Approaches


Introduction to Empirical Research


The guide's introductory video covers what empirical research is, what kinds of questions and methods empirical researchers use, and some tips for finding empirical research articles in your discipline.

Examples of Empirical Research

  • Study on radiation transfer in human skin for cosmetics
  • Long-Term Mobile Phone Use and the Risk of Vestibular Schwannoma: A Danish Nationwide Cohort Study
  • Emissions Impacts and Benefits of Plug-In Hybrid Electric Vehicles and Vehicle-to-Grid Services
  • Review of design considerations and technological challenges for successful development and deployment of plug-in hybrid electric vehicles
  • Endocrine disrupters and human health: could oestrogenic chemicals in body care cosmetics adversely affect breast cancer incidence in women?




Clinical Research: A Review of Study Designs, Hypotheses, Errors, Sampling Types, Ethics, and Informed Consent

Addanki Purna Singh

1 Physiology, Department of Biomedical Sciences, Saint James School of Medicine, The Quarter, AIA

Sabitha Vadakedath

2 Biochemistry, Prathima Institute of Medical Sciences, Karimnagar, IND

Venkataramana Kandi

3 Clinical Microbiology, Prathima Institute of Medical Sciences, Karimnagar, IND

Recently, we have been noticing an increase in the emergence and re-emergence of microbial infectious diseases. In the previous 100 years, there were several incidences of pandemics caused by different microbial species like the influenza virus, human immunodeficiency virus (HIV), dengue virus, severe acute respiratory syndrome coronavirus (SARS-CoV), Middle East respiratory syndrome coronavirus (MERS-CoV), and SARS-CoV-2 that were responsible for severe morbidity and mortality among humans. Moreover, non-communicable diseases, including malignancies, diabetes, heart, liver, kidney, and lung diseases, have been on the rise. The medical fraternity, people, and governments all need to improve their preparedness to effectively tackle health emergencies. Clinical research, therefore, assumes increased significance in the current world and may potentially be applied to manage human health-related problems. In the current review, we describe the critical aspects of clinical research that include research designs, types of study hypotheses, errors, types of sampling, ethical concerns, and informed consent.

Introduction and background

To conduct successful and credible research, scientists/researchers should understand the key elements of clinical research like neutrality (unbiased), reliability, validity, and generalizability. Moreover, results from clinical studies are applied in the real world to benefit human health. As a result, researchers must understand the various types of research designs [ 1 ]. Before choosing a research design, the researchers must work out the aims and objectives of the study, identify the study population, and address the ethical concerns associated with the clinical study. Another significant aspect of clinical studies is the research methodology and the statistical applications that are employed to process the data and draw conclusions. There are primarily two types of research designs: observational studies and experimental studies [ 2 ]. Observational studies do not involve any interventions and are therefore considered inferior to experimental designs. The experimental studies include the clinical trials that are carried out among a selected group of participants who are given a drug to assess its safety and efficacy in treating and managing the disease. However, in the absence of a study group, a single-case experimental design (SCED) was suggested as an alternative methodology that is equally reliable as a randomization study [ 3 ]. The single case study designs are called N-of-1 type clinical trials [ 4 , 5 ]. The N-of-1 study design is being increasingly applied in healthcare-related research. Experimental studies are complex and are generally performed by pharmaceutical industries as a part of research and development activities during the discovery of a therapeutic drug/device. Also, clinical trials are undertaken by individual researchers or a consortium. In a recent study, the researchers were cautioned about the consequences of a faulty research design [ 6 ]. It was noted that clinical studies on the effect of the gut microbiome and its relationship with the feed could potentially be influenced by the choice of the experimental design, controls, and comparison groups included in the study. Moreover, clinical studies can be affected by sampling errors and biases [ 7 ]. In the present review, we briefly discuss the types of clinical study designs, study hypotheses, sampling errors, and the ethical issues associated with clinical research.

Research design

A research design is a systematic elucidation of the whole research process that includes methods and techniques, starting from the planning of research, execution (data collection), analysis, and drawing a logical conclusion based on the results obtained. A research design is a framework developed by a research team to find an answer/solution to a problem. The research designs are of several types that include descriptive research, surveys, correlation type, experimental, review (systematic/literature), and meta-analysis. The choice of research design is determined by the type of research question that is opted for. Both the research design and the research question are interdependent. For every research question, a complementary/appropriate research design must be chosen. The choice of research design influences the research credibility, reliability, and accuracy of the data collected. A well-defined research design would contain certain elements that include a specific purpose of the research, methods to be applied while collecting and analyzing the data, the research methodology used to interpret the collected data, research infrastructure, limitations, and most importantly, the time required to complete the research. The research design can broadly be categorized into two types: qualitative and quantitative designs. In a quantitative research method, the collected data are measured and evaluated numerically using mathematical and statistical applications, a larger sample size is selected, and the results derived from the statistics can be generalized to benefit society. In a qualitative research method, non-numerical data are collected and interpreted to understand meanings, opinions, and underlying reasons. The various types of research designs are shown in Figure 1 [ 8 ].

[Figure 1: types of research designs]

Types of research studies

There are various types of research study designs. The researcher who aims to take up the study determines the type of study design to choose among the available ones. The choice of study design depends on many factors that include but are not limited to the research question, the aim of the study, the available funds, manpower, and infrastructure, among others. The research study designs include systematic reviews, meta-analyses, randomized controlled trials, cross-sectional studies, case-control studies, cohort studies, case reports/studies, animal experiments, and other in vitro studies, as shown in Figure 2 [ 9 ].

[Figure 2: types of research study designs]

Systematic Reviews

In these studies, the researcher makes an elaborate and up-to-date search of the available literature. By doing a systematic review of a selected topic, the researcher collects the data, analyses them, and critically evaluates them to arrive at impactful conclusions. Systematic reviews can equip healthcare professionals with adequate evidence to support decisions in patient management, including diagnosis, interventions, prognosis, and others [ 10 ]. A recent systematic research study evaluated the role of socioeconomic conditions on the knowledge of risk factors for stroke in the World Health Organization (WHO) European region. This study collected data from PubMed, Embase, Web of Science (WoS), and other sources and finally included 20 studies and 67,309 subjects. This study concluded that the high socioeconomic group had better knowledge of risk factors and warning signs of stroke and suggested improved public awareness programs to better address the issue [ 11 ].

Meta-Analysis

Meta-analysis is like a systematic review, but this type of research design uses quantitative tools that include statistical methods to draw conclusions. Such a research method is therefore considered equal to, or even superior to, the original research studies. Both systematic reviews and meta-analyses follow a similar research process that includes the research question, preparation of a protocol, registration of the study, devising study methods using inclusion and exclusion criteria, an extensive literature survey, selection of studies, assessing the quality of the evidence, data collection, analysis, assessment of the evidence, and finally the interpretation/drawing of the conclusions [ 12 ]. A recent research study, using a meta-analytical study design, evaluated the quality of life (QoL) among patients suffering from chronic obstructive pulmonary disease (COPD). This study used WoS to collect the studies, and STATA to analyze and interpret the data. The study concluded that non-therapeutic mental health and multidisciplinary approaches were used to improve QoL along with increased support from high-income countries to low and middle-income countries [ 13 ].

Cross-Sectional Studies

These studies undertake the observation of a select population group at a single point in time, wherein the subjects included in the studies are evaluated for exposure and outcome simultaneously. These are probably the most common types of studies undertaken by students pursuing postgraduation. A recent study evaluated the activities of thyroid hormones among the pre- and post-menopausal women attending a tertiary care teaching hospital. The results of this study demonstrated that there was no significant difference in the activities of thyroid hormones in the study groups [ 14 ].

Cohort Studies

Cohort studies use participant groups called cohorts, which are followed up for a certain period and assess the exposure to the outcome. They are used for epidemiological observations to improve public health. Although cohort studies are laborious, financially burdensome, and difficult to undertake as they require a large population group, such study designs are frequently used to conduct clinical studies and are only second to randomized control studies in terms of their significance [ 15 ]. Also, cohort studies can be undertaken both retrospectively and prospectively. A retrospective study assessed the effect of alcohol intake among human immunodeficiency virus (HIV)-infected persons under the national program of the United States of America (USA) for HIV care. This study, which included more than 30,000 HIV patients under the HIV care continuum program, revealed that excessive alcohol use among the participants affected HIV care, including treatment [ 16 ].

Case-Control Study

The case-control studies use a single point of observation among two population groups that are categorized based on the outcome. Those who had an outcome are termed as cases, and the ones who did not develop the disease are called control groups. This type of study design is easy to perform and is extensively undertaken as a part of medical research. Such studies are frequently used to assess the efficacy of vaccines among the population [ 17 ]. A previous study evaluated the activities of zinc among patients suffering from beta-thalassemia and compared it with the control group. This study concluded that the patients with beta-thalassemia are prone to hypozincaemia and had low concentrations of zinc as compared to the control group [ 18 ].

Case Studies

Such types of studies are especially important from the perspective of patient management. Although these studies are just observations of single or multiple cases, they may prove to be particularly important in the management of patients suffering from unusual diseases or patients presenting with unusual presentations of a common disease. Listeria is a bacterium that generally affects humans in the form of food poisoning and neonatal meningitis. Such an organism was reported to cause breast abscesses [ 19 ].

Randomized Control Trial

This is probably the most trusted research design and is frequently used to evaluate the efficacy of a novel pharmacological drug or medical device. This type of study minimizes bias, and the results obtained from such studies are considered highly reliable. Randomized controlled studies use two groups: the treatment group receives the trial drug, while the other group, called the placebo group, receives an inert preparation that looks identical to the trial drug but lacks the active pharmacological ingredient. The trial can be single-blind (only the investigator knows who receives the trial drug and who receives the placebo) or double-blind (neither the investigator nor the study participant knows what is being given). A recent study (clinical trial registration number: NCT04308668) concluded that post-exposure prophylaxis with hydroxychloroquine does not protect against Coronavirus disease-19 (COVID-19) after high- or moderate-risk exposure when treatment was initiated within four days of potential exposure [ 20 ].

Factors that affect study designs

Among the factors that affect a study's design is the recruitment of study participants. It is not yet clear what the optimal method is for increasing participant enrollment in clinical studies. A previous study identified that language barriers and long study intervals can hamper the recruitment of subjects for clinical trials [ 21 ]. It has also been noted that patient recruitment for a new drug trial is more difficult than for a novel diagnostic study [ 22 ].

Reproducibility is an important factor that affects a research design. The study designs must be developed in such a way that they are replicable by others. Only those studies that can be re-done by others to generate the same/similar results are considered credible [ 23 ]. Choosing an appropriate study design to answer a research question is probably the most important factor that could affect the research result [ 24 ]. This can be addressed by clearly understanding various study designs and their applications before selecting a more relevant design.

Retention is another significant aspect of the study design. It is difficult to retain participants until a study is completed, and loss to follow-up among the study participants will affect both the study results and the credibility of the study. Other factors that considerably influence the research design are the availability of funding, the necessary infrastructure, and the skills of the investigators and clinical trial personnel.

Synthesizing a research question or a hypothesis

A research question is at the core of research and is the point from which a clinical study is initiated. It should be well thought out, clear, and concise, with an arguable element that requires well-designed research to answer it. A research question should generally arise from a topic of curiosity in the researcher's mind, and he/she must be passionate enough about it to do all that is possible to answer it [ 25 ].

A research question must be generated/framed only after a preliminary literature search, choosing an appropriate topic, identifying the audience, self-questioning, and brainstorming for its clarity, feasibility, and reproducibility.

A recent study suggested a stepwise process to frame the research question. The research question is developed to address a phenomenon, describe a case, establish a relationship for comparison, and identify causality, among others. A better research question is one that describes the statement of the problem, points out the study area, puts focus on the study aspects, and guides data collection, analysis, and interpretation. The aspects of a good research question are shown in Figure 3 [ 26 ].


Research questions may be framed to prove the existence of a phenomenon, describe and classify a condition, elaborate the composition of a disease condition, evaluate the relationship between variables, describe and compare disease conditions, establish causality, and compare the variables resulting in causality. Some examples of research questions include: (i) Does the coronavirus mutate when it jumps from one organism to another? (ii) What is the therapeutic efficacy of vitamin C and dexamethasone among patients infected with COVID-19? (iii) Is there any relationship between COPD and the complications of COVID-19? (iv) Does remdesivir, alone or in combination with vitamin supplements, improve the outcome of COVID-19? (v) Are males more prone to complications from COVID-19 than females?

A research hypothesis closely resembles a research question, except that in a hypothesis the researcher makes a positive or negative assumption about a causality, relationship, correlation, or association. An example of a research hypothesis: overweight and obesity are risk factors for cardiovascular disease.

Types of errors in hypothesis testing

An assumption or a preliminary observation made by the researcher about the potential outcome of the envisaged research may be called a hypothesis. There are different types of hypotheses, including simple, complex, empirical, statistical, null, and alternative hypotheses; however, the null hypothesis (H0) and the alternative hypothesis (HA) are the ones most commonly used in practice. The H0 assumes that there is no relation/causality/effect, whereas the HA assumes that there is a relationship/effect [ 27 , 28 ].

Hypothesis testing is affected by two types of errors: the type I error (α) and the type II error (β). A type I error (α) occurs when the investigator rejects the null hypothesis even though it is true, which is considered a false-positive error. A type II error (β) occurs when the researcher accepts the null hypothesis even though it is false, which is termed a false-negative error [ 28 , 29 ].

The reasons for errors in hypothesis testing may include bias and other causes. Researchers therefore set standards for studies to limit such errors: a type I error probability of 5% (α=0.05; range: 0.01-0.10) and a type II error probability of up to 20% (β=0.20; range: 0.05-0.20) are generally accepted [ 28 , 29 ]. The features of a good hypothesis include simplicity and specificity, and the hypothesis is generally determined by the researcher before the initiation of the study and during the preparation of the study proposal/protocol [ 28 , 29 ].
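
What the α=0.05 and β=0.20 thresholds mean in practice can be illustrated with a small simulation. The sketch below is a minimal Python example with arbitrary sample and effect sizes chosen only for illustration; it estimates how often a two-sample t-test wrongly rejects a true null hypothesis (the type I error rate) and how often it detects a real difference (power, whose complement is the type II error rate).

```python
# Simulated type I error rate and power for a two-sample t-test.
# Sample size (n), effect size (0.5), and the number of simulations are arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha, n, n_sim = 0.05, 50, 5000

def rejection_rate(true_difference):
    rejections = 0
    for _ in range(n_sim):
        group_a = rng.normal(0.0, 1.0, n)
        group_b = rng.normal(true_difference, 1.0, n)
        rejections += stats.ttest_ind(group_a, group_b).pvalue < alpha
    return rejections / n_sim

type_1 = rejection_rate(0.0)   # null is true: any rejection is a false positive
power = rejection_rate(0.5)    # null is false: rejections are true positives
print(f"type I error ~ {type_1:.3f} (target {alpha}), type II error ~ {1 - power:.3f}")
```

With these settings the estimated type I error rate should land near 0.05, while the power (and hence the type II error rate) depends entirely on the assumed effect and sample sizes.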

The applications of hypothesis testing

A hypothesis is tested by assessing samples: appropriate statistics are applied to the collected data, and an inference is drawn from them. It has been noted that a hypothesis can be formed from physicians' observations of anatomical characteristics and other physiological attributes [ 28 , 30 ]. The hypothesis is then tested by employing appropriate statistical techniques; hypothesis testing is carried out on the sample data either to support the null hypothesis or to reject it.

Based on the data collected from the samples, the investigator either retains the null hypothesis or accepts that the alternative hypothesis is true. Interestingly, most of the time a study has only a 50% chance of either the null hypothesis or the alternative hypothesis coming true [ 28 , 31 ].

Hypothesis testing is a step-by-step strategy that begins with an assumption and is followed by the measures applied to interpret the results, the analysis, and the conclusion. The margin of error and the level of significance (95% freedom from type I error and 80% freedom from type II error, i.e., 80% power) are fixed at the outset. This improves the chance that the study results can be reproduced by other researchers [ 32 ].
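
Fixing these error rates in advance is also what drives sample-size planning. As a minimal sketch (the standardized effect size of 0.5 is an arbitrary illustrative assumption), the required group size for a two-sample t-test at α=0.05 and 80% power can be computed with statsmodels:

```python
# Required sample size per group once alpha and power are fixed in advance.
# The standardized effect size of 0.5 is an arbitrary illustrative assumption.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Participants needed per group: {n_per_group:.0f}")   # about 64 per group
```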

Ethics in health research

Ethical concerns are an important aspect of civilized societies, and ethics in medical research and practice assumes added significance because most health-related research is undertaken to find a cure or to develop a medical device or diagnostic tool that can diagnose or treat disease. Because such research involves human participants, and because people approach doctors to find cures for their conditions, ethics and ethical concerns take center stage in public health-related clinical/medical practice and research.

National and international authorities, such as the Drugs Controller General of India (DCGI) and the Food and Drug Administration (FDA), ensure that health-related research is carried out in accordance with ethical requirements and good clinical practice (GCP) guidelines. Ethics guidelines are prescribed by both national and international bodies, such as the Indian Council of Medical Research (ICMR) and the World Medical Association (WMA) Declaration of Helsinki, which sets out ethical principles for medical research involving human subjects [ 33 ].

Ethical conduct is especially significant in clinical practice, medical education, and research. It is recommended that medical practitioners embrace self-regulation of the medical profession. Being proactive about ethical practice enhances the social image of a medical practitioner/researcher and allows people to see that the profession exists not for trade or money but for the benefit of patients and the public at large. Administrations should promote ethical practitioners and penalize unethical practitioners and clinical research organizations. It is also suggested that the medical curriculum incorporate ethics as a module and that ethics-related training be delivered to all medical personnel; a tiny seed grows into a gigantic tree if adequately watered and cared for [ 33 ]. It is therefore essential to address ethical concerns in medical education, research, and practice to produce more promising medical practitioners and acceptable medical educators and researchers, as shown in Figure 4.


Sampling in health research

Sampling is the procedure of selecting a precise number of individuals from a defined group to carry out a research study. The sample is a representative subset of individuals who share the same characteristics as the larger population, so that the results of the research can be generalized [ 34 , 35 ]. Sampling is a necessity because it is almost impossible to include every individual who might take part in a research investigation. A sample identified from a representative population is depicted in Figure 5.


Sampling methods are of different types and are broadly classified into probability sampling and non-probability sampling. In a probability sampling method, which is routinely employed in quantitative research, each individual in the representative population is provided with an equivalent likelihood of being included in the study [ 35 ]. Probability sampling can be separated into four types that include simple random sampling, systematic sampling, stratified sampling, and cluster sampling, as shown in Figure 6.


Simple Random Sample

In the simple random sampling method, every person in the representative population has an equal chance of being selected, and a random number generator may be used to select the study participants. For example, to study employees' perceptions of government policies, a researcher first assigns a number to each employee [ 35 ] and then randomly draws the required number of participants.

Systematic Sample

In this sampling method, the researcher selects study participants according to a pre-defined pattern. Each volunteer is first assigned a serial number (1 to n, e.g., 1-100) [ 35 ]; the researcher then picks a random starting number (say, between 1 and 10) and selects every tenth person thereafter (e.g., 2, 12, 22, 32, and so on).

Stratified Sample

The stratified sampling method is applied when the population from which the sample must be drawn has mixed characteristics. The representative population is divided into groups (strata) based on attributes such as age, sex, and other factors, and a simple random or systematic sampling method is then applied to select the sample from each stratum [ 35 ].

Cluster Sample

In cluster sampling, the representative population is divided into clusters, each of which contains a mix of characteristics. Because every cluster mirrors the population, each cluster can be regarded as a sample in its own right, or a sample can be drawn from it using simple random or systematic approaches. Cluster sampling is similar to stratified sampling but differs in the group characteristics: each cluster contains representatives of varied ages, both sexes, and other mixed characteristics rather than a single shared attribute [ 35 ]. Although each cluster appears to be a sample, the researcher typically applies a simple or systematic random sampling method within the selected clusters to choose the final sample.
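
Taken together, the four probability sampling schemes described above can be sketched in a few lines of Python. The example below uses an invented sampling frame (all column names and sizes are hypothetical) and is only meant to show the mechanics, not a recommended survey design.

```python
# Illustrative implementations of the four probability sampling methods
# on a hypothetical sampling frame; all names and sizes are invented.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
frame = pd.DataFrame({
    "person_id": range(1, 1001),
    "sex": rng.choice(["F", "M"], size=1000),
    "village": rng.choice([f"village_{i}" for i in range(20)], size=1000),
})

# 1. Simple random sample: every person has an equal chance of selection.
simple = frame.sample(n=100, random_state=0)

# 2. Systematic sample: random starting point, then every k-th person.
k = len(frame) // 100
start = int(rng.integers(0, k))
systematic = frame.iloc[start::k]

# 3. Stratified sample: a random sample is drawn within each stratum (here, sex).
stratified = frame.groupby("sex", group_keys=False).apply(
    lambda g: g.sample(n=50, random_state=0))

# 4. Cluster sample: whole clusters (villages) are picked at random, and a further
#    random sample can then be drawn within the chosen clusters.
chosen = rng.choice(frame["village"].unique(), size=4, replace=False)
cluster = frame[frame["village"].isin(chosen)].sample(frac=0.5, random_state=0)

print(len(simple), len(systematic), len(stratified), len(cluster))
```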

Non-probability Sample

In this type of sampling method, the participants are chosen based on non-random criteria, so the volunteers do not have an identical chance of being selected. Although this method appears reasonable and effortless to carry out, it is prone to selection bias. Non-probability sampling is routinely used in experimental and qualitative research and is suitable for pilot studies carried out to understand the characteristics of a representative population [ 35 ]. Non-probability sampling is of four types, including convenience sampling, voluntary response sampling, purposive sampling, and snowball sampling, as shown in Figure 7.


Convenience Sample

In the convenience sampling method, there are no pre-defined criteria, and only those volunteers who are readily available to the investigator are included. Although inexpensive, studies that apply convenience sampling may not reflect the qualities of the population, and therefore the results cannot be generalized [ 35 ]. A typical example is a researcher inviting people from his/her own work area (company, school, city, etc.).

Voluntary Response Sample

In the voluntary response sampling method, the researcher simply waits for participants to volunteer to take part in the study. This sampling method is similar to convenience sampling and therefore leaves considerable room for bias [ 35 ].

Purposive Sample/Judgment Sample

In the purposive or judgmental sampling method, the investigator chooses the participants based on his/her own judgment or discretion. This type of sampling captures the attributes (opinions/experiences) of a precisely defined population group [ 35 ]. An example of such a sampling method is seeking the opinions of students with disabilities on the facilities at an educational institute.

Snowball Sample

In the snowball sampling method, suitable study participants are found based on the recommendations and suggestions made by the participating subjects [ 36 ]. In this type, the individual/sample recruited by the investigator in turn invites/recruits other participants.

Significance of informed consent and confidentiality in health research

Informed consent is a document confirming that study participants are recruited only after being thoroughly informed about the research process, risks, and benefits, along with other important details of the study such as the time required for participation. Informed consent is generally drafted in a language known to the participants. Its essential contents include the aim of the research, stated in a way that is easily understood even by a layperson, and a clear description of what is expected from participation in the study. It also records that the participant agrees to share demographic characteristics and to take part in clinical and diagnostic procedures, and that he/she retains the liberty to withdraw from the study at any time during the research. The informed consent must also contain a statement confirming the confidentiality of the participant and the protection of the privacy of his/her information and identity [ 37 ].

Health research is so complex that there may be occasions when a researcher wants to revisit a medical record to investigate a specific clinical condition, which also requires informed consent [ 38 ]. Awareness of biomedical research and of the importance of human participation in research studies is a key element of an individual's knowledge that may determine whether he or she participates in a research study [ 39 ]. In the era of information technology, patients' medical data are stored as electronic health records, and research that attempts to use such records is associated with ethical, legal, and social concerns [ 40 , 41 ]. Technological advances and the availability of medical devices to treat, diagnose, and prevent diseases have posed a new challenge to healthcare professionals: medical devices should be used for interventions only after their potential benefit to patients is established, and they must never harm the patient's health but only improve the outcome [ 42 ]. Even in such cases, medical personnel must obtain informed consent from the patients.

Conclusions

Clinical research is an essential component of healthcare that enables physicians, patients, and governments to tackle health-related problems. Increased incidences of both communicable and non-communicable diseases warrant improved therapeutic interventions to treat, control, and manage diseases. Several illnesses do not have a treatment, and for many others, the treatment, although available, is plagued by drug-related adverse effects. For many other infections, like dengue, we require preventive vaccines. Therefore, clinical research studies must be carried out to find solutions to the existing problems. Moreover, the knowledge of clinical research, as discussed briefly in this review, is required to carry out research and enhance preparedness to counter conceivable public health emergencies in the future.


The authors have declared that no competing interests exist.


Empirical Research: A Comprehensive Guide for Academics 


Empirical research relies on gathering and studying real, observable data. The term ’empirical’ comes from the Greek word ’empeirikos,’ meaning ‘experienced’ or ‘based on experience.’ So, what is empirical research? Instead of using theories or opinions, empirical research depends on real data obtained through direct observation or experimentation. 

Why Empirical Research?

Empirical research plays a key role in checking or improving current theories, providing a systematic way to grow knowledge across different areas. By focusing on objectivity, it makes research findings more trustworthy, which is critical in research fields like medicine, psychology, economics, and public policy. In the end, the strengths of empirical research lie in deepening our awareness of the world and improving our capacity to tackle problems wisely. 1,2  

Qualitative and Quantitative Methods

There are two main types of empirical research methods – qualitative and quantitative. 3,4 Qualitative research delves into intricate phenomena using non-numerical data, such as interviews or observations, to offer in-depth insights into human experiences. In contrast, quantitative research analyzes numerical data to spot patterns and relationships, aiming for objectivity and the ability to apply findings to a wider context. 

Steps for Conducting Empirical Research

When it comes to conducting research, there are some simple steps that researchers can follow. 5,6  

  • Create Research Hypothesis:  Clearly state the specific question you want to answer or the hypothesis you want to explore in your study. 
  • Examine Existing Research:  Read and study existing research on your topic. Understand what’s already known, identify existing gaps in knowledge, and create a framework for your own study based on what you learn. 
  • Plan Your Study:  Decide how you’ll conduct your research—whether through qualitative methods, quantitative methods, or a mix of both. Choose suitable techniques like surveys, experiments, interviews, or observations based on your research question. 
  • Develop Research Instruments:  Create reliable research collection tools, such as surveys or questionnaires, to help you collate data. Ensure these tools are well-designed and effective. 
  • Collect Data:  Systematically gather the information you need for your research according to your study design and protocols using the chosen research methods. 
  • Data Analysis:  Analyze the collected data using suitable statistical or qualitative methods that align with your research question and objectives (a minimal sketch of this step follows this list). 
  • Interpret Results:  Understand and explain the significance of your analysis results in the context of your research question or hypothesis. 
  • Draw Conclusions:  Summarize your findings and draw conclusions based on the evidence. Acknowledge any study limitations and propose areas for future research. 
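
As a purely illustrative sketch of the collect-analyze-interpret steps above, the snippet below loads a hypothetical survey file, summarizes a numeric outcome, and tests a simple association. The file name and column names are invented and stand in for whatever instruments a real study would use.

```python
# Toy sketch of the analysis step; the file and column names are hypothetical.
import pandas as pd
from scipy import stats

df = pd.read_csv("survey_responses.csv")       # data gathered with your chosen instrument

# Descriptive statistics for the outcome of interest.
print(df["outcome_score"].describe())

# Inferential step: is the outcome associated with a numeric predictor?
r, p_value = stats.pearsonr(df["outcome_score"], df["hours_of_training"])
print(f"Pearson r = {r:.2f}, p = {p_value:.4f}")   # interpret against a pre-set alpha
```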

Advantages of Empirical Research

Empirical research is valuable because it stays objective by relying on observable data, lessening the impact of personal biases. This objectivity boosts the trustworthiness of research findings. Also, using precise quantitative methods helps in accurate measurement and statistical analysis. This precision ensures researchers can draw reliable conclusions from numerical data, strengthening our understanding of the studied phenomena. 4  

Disadvantages of Empirical Research

While empirical research has notable strengths, researchers must also be aware of its limitations when deciding on the right research method for their study.4 One significant drawback of empirical research is the risk of oversimplifying complex phenomena, especially when relying solely on quantitative methods. These methods may struggle to capture the richness and nuances present in certain social, cultural, or psychological contexts. Another challenge is the potential for confounding variables or biases during data collection, impacting result accuracy.  

Tips for Empirical Writing

In empirical research, the writing is usually done in research papers, articles, or reports. The empirical writing follows a set structure, and each section has a specific role. Here are some tips for your empirical writing. 7   

  • Define Your Objectives:  When you write about your research, start by making your goals clear. Explain what you want to find out or prove in a simple and direct way. This helps guide your research and lets others know what you have set out to achieve. 
  • Be Specific in Your Literature Review:  In the part where you talk about what others have studied before you, focus on research that directly relates to your research question. Keep it short and pick studies that help explain why your research is important. This part sets the stage for your work. 
  • Explain Your Methods Clearly : When you talk about how you did your research (Methods), explain it in detail. Be clear about your research plan, who took part, and what you did; this helps others understand and trust your study. Also, be honest about any rules you follow to make sure your study is ethical and reproducible. 
  • Share Your Results Clearly : After doing your empirical research, share what you found in a simple way. Use tables or graphs to make it easier for your audience to understand your research. Also, talk about any numbers you found and clearly state if they are important or not. Ensure that others can see why your research findings matter. 
  • Talk About What Your Findings Mean:  In the part where you discuss your research results, explain what they mean. Discuss why your findings are important and if they connect to what others have found before. Be honest about any problems with your study and suggest ideas for more research in the future. 
  • Wrap It Up Clearly:  Finally, end your empirical research paper by summarizing what you found and why it’s important. Remind everyone why your study matters. Keep your writing clear and fix any mistakes before you share it. Ask someone you trust to read it and give you feedback before you finish. 

References:  

  • Empirical Research in the Social Sciences and Education, Penn State University Libraries. Available online at  https://guides.libraries.psu.edu/emp  
  • How to conduct empirical research, Emerald Publishing. Available online at  https://www.emeraldgrouppublishing.com/how-to/research-methods/conduct-empirical-research  
  • Empirical Research: Quantitative & Qualitative, Arrendale Library, Piedmont University. Available online at  https://library.piedmont.edu/empirical-research  
  • Bouchrika, I.  What Is Empirical Research? Definition, Types & Samples  in 2024. Research.com, January 2024. Available online at  https://research.com/research/what-is-empirical-research  
  • Quantitative and Empirical Research vs. Other Types of Research. California State University, April 2023. Available online at  https://libguides.csusb.edu/quantitative  
  • Empirical Research, Definitions, Methods, Types and Examples, Studocu.com website. Available online at  https://www.studocu.com/row/document/uganda-christian-university/it-research-methods/emperical-research-definitions-methods-types-and-examples/55333816  
  • Writing an Empirical Paper in APA Style. Psychology Writing Center, University of Washington. Available online at  https://psych.uw.edu/storage/writing_center/APApaper.pdf  



Research Article

GIS for empirical research design: An illustration with georeferenced point data

Authors: Katsuo Kogure and Yoshito Takasaki

Affiliations: Center for Southeast Asian Studies, Kyoto University, Kyoto, Japan; Graduate School of Economics, University of Tokyo, Tokyo, Japan

Corresponding author e-mail: [email protected]


  • Published: March 4, 2019
  • https://doi.org/10.1371/journal.pone.0212316


This paper demonstrates how Geographic Information Systems (GIS) can be utilized to study the effects of spatial phenomena. Since experimental designs such as Randomized Controlled Trials are generally not feasible for spatial problems, researchers need to rely on quasi-experimental approaches using observational data. We provide a regression-based framework of the key procedures for GIS-based empirical research design using georeferenced point data for both spatial events of interest and subjects exposed to the events. We illustrate its utility and implementation through a case study on the impacts of the Cambodian genocide under the Pol Pot regime on post-conflict education.

Citation: Kogure K, Takasaki Y (2019) GIS for empirical research design: An illustration with georeferenced point data. PLoS ONE 14(3): e0212316. https://doi.org/10.1371/journal.pone.0212316

Editor: Roee Gutman, Brown University, UNITED STATES

Received: May 15, 2018; Accepted: January 31, 2019; Published: March 4, 2019

Copyright: © 2019 Kogure, Takasaki. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: All relevant data are within the paper and its Supporting Information files.

Funding: This study is supported by Grants-in-Aid for Scientific Research No. 15K17044, No. 25257106, and No. 18H05312, Japan Society for the Promotion of Science ( https://kaken.nii.ac.jp/ja/grant/KAKENHI-PROJECT-15K17044/ , https://kaken.nii.ac.jp/ja/grant/KAKENHI-PROJECT-25257106/ , https://kaken.nii.ac.jp/ja/grant/KAKENHI-PROJECT-18H05312/ ). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

With the growing availability of spatial data from Global Positioning Systems (GPS) and remote sensing, Geographic Information Systems (GIS)—computer systems designed to gather, store, manage, display, and analyze spatial data—have become an important tool in geographical, environmental, health, and social science research [ 1 , 2 ]. Although GIS has been used to visualize, process, create, and analyze spatial data across disciplines, its full potential has yet to be realized [ 3 , 4 ]. This paper demonstrates how GIS can help in designing empirical research to study the effects of spatial phenomena. Although investigating causal questions is one of the most important research themes across disciplines [ 5 ], it has received limited attention in spatial contexts [ 6 , 7 ]. The gold standard for answering causal questions is Randomized Controlled Trials, which can identify the causal effects of interest through random variation in treatment variables generated by researchers. In spatial contexts, this experimental approach is generally not feasible, so researchers need to rely on a quasi-experimental approach using observational data to approximate an experimental study [ 8 ].

Spatial data can be classified into three types: (1) geostatistical data collected over a continuous spatial domain (e.g., temperature), (2) area/lattice data collected on a regular or irregular lattice with well-defined boundaries (e.g., population density in an administrative area), and (3) point pattern data consisting of the observed locations of events/objects of interest (e.g., schools) [ 9 ]. Focusing on point pattern data, we consider the occurrence of particular ‘events’ as ‘treatments’ in the literature of causal analysis. This way of formulating an empirical question is a new perspective in the literature that enables empirical researchers to connect spatial statistics with causal analysis.

Our general goal is to evaluate the impacts of the occurrence of spatial events (treatments) during a specific period on local subjects. We consider the following situation/setting where researchers can use GIS to design credible and transparent empirical research. First, the impacts of spatial events are limited to the surrounding local areas. Such events may include the placement of local public goods (e.g., schools, health facilities, clean water facilities) and the occurrence of environmental, epidemiological, political, social, and cultural events (e.g., point source pollution, contagious disease, local conflicts, local beliefs/norms). Second, point pattern data of both events of interest and subjects exposed to the events are available. Although this presumption may not be satisfied in many cases at the moment, more and more household and field surveys have collected georeferenced information using a smartphone or tablet with an integrated GPS chip-set [ 10 ]. Spatial point data will become even more widely available in the near future, despite continuing privacy concerns [ 11 ]. Regarding the two sets of spatial point data, we assume that while the data-generating process of events follows some form of stochastic mechanism, the locations of subjects are fixed: Local residents do not move/relocate in response to the spatial events of interest (or the migration patterns of subjects are similar around the locations of events if migration is prevalent). Although how reasonable this assumption is depends on empirical contexts, it serves as a benchmark setting.

The key question in estimating the treatment effects in spatial contexts is how to address the threat of unobserved spatial confounding factors. Following the potential outcomes framework [ 12 ] for a binary treatment variable, extant studies use covariates and spatial (geographic) information, as a proxy for unobserved spatial confounding factors, to construct matched pairs of treated and control units. Specifically, one study [ 13 ] proposes an integer programming method that matches directly on covariates and spatial proximity (distance), and another study [ 14 ] proposes a distance-adjusted propensity score that combines spatial proximity with the propensity score.

Following the strategy of exploiting spatial proximity for comparison, we (1) limit a global sample to spatially limited (local) samples using point buffers around the locations of events processed by GIS (spatial clusters) and (2) adjust for unobserved factors that affect or are correlated with both outcomes of interest and treatments and are shared within spatial clusters by controlling for spatial cluster fixed effects in a linear regression model. We further limit the sample to subjects within selected spatial clusters where the locations of events are plausibly exogenous (according to the statistical significance test discussed below). Our approach shares the same objective of adjusting for unobserved spatial confounding factors in the literature.

Our approach follows one of Fisher’s three principles of experimental design, “blocking” (“local control”), with the aim of comparing outcomes among more homogeneous groups, with different levels of treatments within blocks, to reduce omitted variable bias systematically [ 15 ]. GIS can flexibly implement this principle in spatial contexts. We do not specifically consider/model spatially correlated residuals addressed in some related literature [ 16 , 17 ]. Rather, we restrict local samples so that residual spatial variability within small spatial clusters is plausibly assumed to be nonsignificant. Since a linear regression model is a primary workhorse in empirical studies across disciplines, our regression-based approach, which considers both binary and continuous treatment variables, should have significant practical merits.

The remainder of this paper is divided into two sections: we first provide a regression-based framework of the key procedures for GIS-based empirical research design and then implement the framework in a case study of historical political events, the Cambodian genocide.

A framework of GIS-based empirical research

The key procedures for GIS-based empirical research design consist of three stages: diagnosis, design, and analysis.

Stage 1: Diagnosis

The diagnosis stage examines how the events of interest occur. Based on theories, findings in the literature, and/or anecdotal evidence, researchers must understand the potential determinants of the locations of the events. Visualizing spatial patterns using GIS can facilitate this process.

Stage 2: Design

The design stage consists of two main tasks: creating treatment variables and constructing a credible analysis sample to approximate an experimental study.

Treatment variables.

We consider a general case where exactly how each subject is affected (treated) by the events is unknown (if it is known, it directly informs the creation of treatment variables). We assume that subjects living in closer proximity to the locations of events are more strongly treated by the events: The degree to which subjects are exposed to the events depends on the distance from the locations of events. This assumption should be reasonable for various spatial events. By calculating distance between two spatial points using GIS, we can create distance-based treatment variables. If information about the attributes of points (“marks”) is available, it is possible to make more flexible treatment variables. Treatment variables can be binary, discrete, or continuous.
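
To make the distance-based construction concrete, here is a minimal Python sketch. The coordinates are invented, and a projected coordinate system in metres is assumed so that straight-line distances are meaningful; real workflows would typically compute these distances and buffers in GIS software or with a library such as geopandas.

```python
# Distance-based treatment variables from two sets of georeferenced points.
# Coordinates are hypothetical and assumed to be in a projected CRS (metres).
import numpy as np
from shapely.geometry import Point

events = [Point(1000, 2000), Point(5000, 7000), Point(9000, 1000)]    # event locations
subjects = [Point(1200, 2300), Point(4800, 6500), Point(9000, 9000)]  # subject locations

threshold_m = 3000   # binary treatment: within 3 km of at least one event

for i, s in enumerate(subjects):
    d = np.array([s.distance(e) for e in events])   # distance to every event, in metres
    binary = int(d.min() <= threshold_m)            # binary treatment indicator
    nearest_km = d.min() / 1000
    print(f"subject {i}: nearest event {nearest_km:.2f} km, treated = {binary}")
```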

Credible sample.

This central task consists of two or three steps that construct a credible analysis sample from a given global sample representing the population of interest. The procedure is the same for binary, discrete, and continuous treatment variables.

Step 1—Global sample.

The first step assesses the exogeneity of locations of events/treatment variables (binary, discrete, or continuous) for the global sample. As commonly done in empirical studies based on a regression-based framework, we statistically examine the joint significance of observed covariates on treatment variables. Rejecting the exogeneity implies that adjusting for their systematic difference is indispensable. This exogeneity check cannot assess the significance of unobserved factors affecting or correlated with both the outcomes of interest and treatments (“confounders”). If such unobserved confounders exist, the global sample fails to capture the treatment effects of interest: The estimates suffer from omitted variable bias.

Step 2—Spatial clusters.

The second step deals with this potential omitted variable bias. While instrumental variables methods are conventional, it is often difficult to find valid instrumental variables that are correlated with the locations of events (treatments) but not with the outcome of interest. We employ an alternative approach, blocking, by considering point buffers around the locations of events processed by GIS as neighborhoods, which we call spatial clusters, and comparing outcomes among subjects with different levels of exposure to the events within spatial clusters. Since unobserved confounders (e.g., socioeconomic environments) should be similar within small spatial clusters, omitted variable bias can be reduced systematically. Specifically, for each subject, we create a dummy variable for each spatial cluster that takes the value of 1 if the subject belongs to the spatial cluster and 0 otherwise. If the subject belongs to more than one spatial cluster, the corresponding dummy variables all take the value of 1 (an example is given in our case study). Adjusting for these dummy variables (spatial cluster fixed effects) in the regression model can eliminate unobserved confounders that are constant within spatial clusters. Prior to the analysis stage, we reassess the exogeneity of the treatment variables for this spatially limited (local) sample, adding spatial cluster fixed effects.
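
A small pandas sketch of these membership dummies, using hypothetical household and cluster identifiers (a household falling inside two overlapping buffers gets a 1 in both columns):

```python
# Spatial-cluster membership dummies when a subject can fall inside several
# overlapping buffers; household and cluster identifiers are hypothetical.
import pandas as pd

membership = {
    "hh_1": ["c12"],
    "hh_2": ["c12", "c47"],   # lies within two overlapping buffers
    "hh_3": ["c47"],
}

rows = [{"subject": s, "cluster": c}
        for s, clusters in membership.items() for c in clusters]
long = pd.DataFrame(rows)
dummies = pd.crosstab(long["subject"], long["cluster"]).clip(upper=1)
print(dummies)   # one 0/1 column per spatial cluster; multiple 1s allowed per subject
```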

In this second step (and the third step discussed below), the size of spatial clusters is a key auxiliary parameter. Although with smaller cluster size, the number of balanced spatial clusters (described below) becomes larger and socioeconomic characteristics within spatial clusters should be more similar, the number of observations within spatial clusters becomes smaller and the variation in the exposure to the events becomes more limited, thus constraining the creation of treatment variables. This bias-variance trade-off needs to be carefully taken into account by researchers when they choose cluster size. Choosing the optimal size of spatial clusters based on data-driven approaches is beyond the scope of this paper.

Step 3—Balanced spatial clusters.

If concerns about omitted variable bias still remain in this local sample, the third step is called for. Although omitted variable bias cannot be directly assessed, it may be assessed indirectly through the correlations between the treatment variable and observed covariates (those that are relevant according to theories, contextual knowledge, and/or findings in the related literature), as well as their joint significance, because the unobserved confounders about which researchers are typically concerned are those related to observed confounders [ 18 ]: significant correlations may imply that some unobserved confounders (related to observed confounders) vary within spatial clusters. In that case, the estimates based on the local sample may still suffer from omitted variable bias. With information on the location determinants examined in the first diagnosis stage (which affect the occurrence of events), GIS can be utilized to alleviate the remaining concerns about omitted variable bias, as follows. Here we assume that a key binary location determinant is available.

We employ the following procedure for each spatial cluster. First, we additionally create small buffers around the location of the event and classify subjects into the buffers. Second, we create a crosstabulation on the distribution of the key binary location determinant across the buffers. Lastly, we examine the homogeneity of the distribution based on Fisher’s exact test, a statistical significance test for independence between two categorical variables [ 19 ]. Based on the results, we further limit the subjects to those within selected spatial clusters with the homogeneous distribution of the location determinant, which we call balanced spatial clusters . Since the locations of events within balanced spatial clusters are plausibly assumed to be exogenous (unobserved confounders are arguably more similar within selected spatial clusters), we can further reduce bias. The choice of buffer size for Fisher’s exact test involves a trade-off: Although with narrower buffer bandwidth, the homogeneity of the distribution of the location determinant is tested more strictly, the number of observations within buffers becomes smaller, thus weakening statistical power.
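
The homogeneity check in this step boils down to Fisher's exact test on a contingency table of the binary location determinant across buffers. The snippet below runs it on an invented 2x2 table with scipy; note that scipy's fisher_exact has traditionally handled 2x2 tables, so for the r x c tables that arise with three or more buffers (as in the case study below) other tools, such as R's fisher.test, are commonly used.

```python
# Fisher's exact test for independence on a hypothetical 2x2 table:
# rows = two distance buffers, columns = the binary location determinant.
from scipy.stats import fisher_exact

table = [[12, 88],    # inner buffer: 12 "yes", 88 "no" (invented counts)
         [15, 85]]    # outer buffer

odds_ratio, p_value = fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.3f}")
# A large p-value gives no evidence that the determinant differs across buffers,
# so the corresponding spatial cluster would be treated as "balanced".
```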

Stage 3: Analysis

The analysis stage examines the impacts of the treatment variables on the outcomes of interest based on the samples constructed in the second design stage. We estimate a regression equation controlling for spatial cluster fixed effects by ordinary least squares (OLS). We assume that within spatial clusters, there is no residual spatial variability and neighborhood spillover effects, if any, are constant. Then, in our regression model, spatial cluster fixed effects control for local spillover effects within spatial clusters. We assume that there are no spillover effects across spatial clusters. We check the sensitivity of estimation results to potential omitted variable bias due to remaining unobserved confounders within balanced spatial clusters. We do so by employing coefficient stability approaches (e.g., [ 18 , 20 ]) corresponding to our regression-based framework.
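
A minimal sketch of the estimating equation with statsmodels is shown below. The data frame, variable names, and the assumption that each subject carries a single cluster identifier are simplifications for illustration; with overlapping clusters, the 0/1 membership dummies built earlier would enter as separate regressors instead of one categorical term.

```python
# OLS with spatial-cluster fixed effects; data frame and variable names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("analysis_sample.csv")    # hypothetical local analysis sample

# C(cluster_id) expands into one dummy per spatial cluster (the fixed effects).
model = smf.ols("outcome ~ treatment + mother_age + father_age + C(cluster_id)", data=df)
results = model.fit()                       # cluster-robust standard errors could be added
print(results.params["treatment"], results.pvalues["treatment"])
```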

Internal and external validity.

Of critical importance in empirical studies is the validation of empirical findings [ 21 ]. Internal validity is the degree to which the findings for the (sub)population being studied are credible; external validity is the degree to which the findings can be extrapolated to other populations and settings [ 22 ]. Our framework aims to improve internal validity: The estimates based on the local sample within balanced spatial clusters have higher internal validity than those based on the global sample. On the other hand, the former estimates are prone to limited external validity because the local sample may not represent the population of interest. In quasi-experimental designs, such a trade-off always exists. Although the subpopulation being studied can differ distinctly from the population of interest, in our approach, well-defined subpopulations represented by the local sample are identified and thus assessing the credibility of findings for the population of interest is possible. This is not possible in standard instrumental variables methods because subpopulations affected by instruments are not identified.

A case study: Impacts of the Cambodian genocide

Our case study examines how the Cambodian genocide under the Pol Pot regime (1975-1979) altered people’s post-conflict behavior, parental investment in child education in particular. The locations of events and subjects of our interest are those of execution sites (“killing sites”) and households, respectively. The relationship between violence and behavior is an important topic in psychology and social sciences [ 23 – 25 ]. The availability of the two types of georeferenced point data enables our framework; in contrast, finding valid instrumental variables is difficult in this context. After providing brief historical background, the motivation, and the data, we implement our GIS-based empirical research design.

Historical background.

The Khmer Rouge (officially the Communist Party of Kampuchea) led by Pol Pot ruled Cambodia in a form of primitive communism from 1975 to 1979. Its communist revolution completely denied any right to private property, not only material private properties, but also one’s own family: Spouses and children were treated as collective property owned by the state [ 26 ]. People were forced to conform with the ideologies of the Pol Pot regime; those who disobeyed Khmer Rouge’s rules were treated as enemies of the society, being sent to reeducation camps and/or executed. During the Pol Pot era, approximately two million people died of execution, disease, starvation, or exhaustion [ 27 ].

Motivation.

Our case study is motivated by the following social contexts. Under the Pol Pot regime, intellectuals were persecuted and many of them were executed [ 28 ]; formal school education was also denied and abolished. After its collapse in 1979, the remnants of the Khmer Rouge continued guerilla warfare against the new government army until the 1990s; thus, the threat of violence against people by the Khmer Rouge persisted. Indeed, survivors often suffered from long-term mental health disorders, such as post-traumatic stress disorder (PTSD) [ 29 ]. Although formal school education resumed soon after the regime’s collapse [ 30 ], parental investment in child education may have been influenced negatively by the Khmer Rouge’s rules, particularly for those who were strongly exposed to the genocidal violence.

We utilize two data sets: the Khmer Rouge historical database and the complete count 1998 Cambodian Population Census microdata. The former data contain comprehensive information on the genocide during the Pol Pot era (events) with geocoded locations of more than 500 killing sites (e.g., burials, prisons) and the number of victims (marks). The latter data contain the basic information of individual and household socioeconomic characteristics (subjects’ outcomes) and geocoded locations of villages in the country. Since no location information is available at the household level, we substitute the point locations of villages for those of households; thus, all households residing in the same village share the same location information. Section 1 in S1 Supplementary Material provides a detailed description of the data. All data used in the paper were fully anonymized before we accessed them. The IRB approval for this study was received from the University of Tokyo (Approval number: 16-80, Date: August 10, 2016).

We first examine how killing sites were established. According to anecdotal evidence provided by international organizations and historians, the Khmer Rouge used schools, universities, and government buildings as prisons and reeducation camps and trucks to transport prisoners to killing sites [ 31 ]. This suggests that the locations of killing sites were relatively developed areas. Using GIS, we plot the locations of 514 killing sites, along with information about victims, as well as the locations of villages on the base map of the 1977 administrative divisions ( Fig 1 ). We also overlay the major roads (national and provincial road networks) in 1973 and the district mean education level of non-migrant women aged 36-50 (a cohort that should have finished primary school education, if receiving any education, before 1975; see Section 2 in S1 Supplementary Material for a detailed discussion) to capture the level of regional development before the Pol Pot era. This GIS map reveals that killing sites were commonly located near major roads and in districts with relatively high education levels (though in the eastern and western zones many sites were also located relatively far from major roads). These relationships are also confirmed empirically (Table B in S1 Supplementary Material ). These results suggest that the level of regional development prior to the Pol Pot era is a key determinant of the placement of killing sites.


Fig 1. The geographic distribution of 514 killing sites and their number of victims in districts surveyed by DC-Cam are depicted (districts not surveyed have a white background). The 1977 administrative zones of the Pol Pot regime (DK zones (1977)) and the 1998 districts are depicted. For information on education and major roads, see the text and Section 2 in S1 Supplementary Material .

https://doi.org/10.1371/journal.pone.0212316.g001

We first describe the population of interest and the sample, then create treatment variables, and after that follow the three steps of constructing a credible sample.

Population.

In 1975, urban people, who were persecuted under the Pol Pot regime, were forced to migrate to the countryside to engage in forced agricultural work [ 28 ]. During the Pol Pot era, many of them experienced forced migration several times. Because our treatment variables (defined below) are based on the locations of killing sites and villages where couples lived in 1998, we can construct treatment variables only for non-migrant couples. The population of our interest is thus non-migrant rural couples (who had their first child during and after the Pol Pot era, as detailed below). We exclude urban couples and rural couples with migration experience (about 57% of the couples in rural areas) from the scope of our study.

Our global sample consists of non-migrant rural couples in the areas surveyed about the mass killings who had their first child in 1977-1982 (see Section 3.1 in S1 Supplementary Material for details). Because the couples who had their first child during and after the Pol Pot regime had distinct institutional experiences (the former were controlled as family organizations and the latter were not), we divide the sample into two groups: couples whose first child was born in 1977-1979 and couples whose first child was born in 1980-1982. Taking into account a transition period, we further divide the latter into two groups: couples whose first child was born in 1980 and those whose first child was born in 1981-1982. We examine the genocide impacts on children’s educational outcomes (defined below) for these three subsamples separately. We assume that the timing of having the first child is unrelated to potential genocide impacts (Section 3.2 in S1 Supplementary Material provides evidence for this assumption). This assumption implies that for example, if couples whose first child was born in 1980 or 1981-1982 had had their first child in 1977-1979, their estimated genocide impacts would be identical to those for couples whose first child was actually born in 1977-1979. For illustrative purposes and the sake of brevity, we highlight the results for the subsample of couples whose first child was born in 1977-1979 in the text and report those for other two subsamples in S1 Supplementary Material .

Using GIS, we create two treatment variables of the genocidal violence. A binary measure ( Genocidal Violence I ) takes the value of 1 if the points of villages where couples resided are within 3.0 km of at least one killing site and 0 otherwise. A continuous measure ( Genocidal Violence II ) is the logarithmic value of the inverse-distance weighted sum of the number of victims for all killing sites located within 6.0 km of villages where couples resided (this corresponds to 6.0 km spatial clusters defined below) (we use log because the original value has right-skewed distribution, S1 Fig ). Since 184 killing sites lack information about the number of victims ( Fig 1 ), the continuous measure is defined only among villages with complete victim information for all killing sites located within 6.0 km of the villages. For this reason, we focus on the binary treatment variable in the design stage for the analysis of both the binary and continuous treatment variables. Section 4 in S1 Supplementary Material conducts robustness checks for alternative cluster size and continuous measures.
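
A rough numerical sketch of the continuous measure, with invented coordinates and victim counts (the exact weighting used in the paper may differ in detail; this only illustrates the idea of a log inverse-distance-weighted sum over sites within 6 km):

```python
# Log inverse-distance-weighted sum of victims over all sites within 6 km.
# Coordinates (in km, projected) and victim counts are invented for illustration.
import math

village = (0.0, 0.0)
sites = [((1.5, 0.5), 120),    # ((x, y), number of victims)
         ((4.0, 3.0), 300),
         ((10.0, 2.0), 50)]    # this site lies beyond 6 km and is ignored

weighted_sum = 0.0
for (x, y), victims in sites:
    d = math.hypot(x - village[0], y - village[1])
    if d <= 6.0:
        weighted_sum += victims / d        # inverse-distance weight

genocidal_violence_ii = math.log(weighted_sum)   # log to tame the right-skewed distribution
print(round(genocidal_violence_ii, 3))
```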

In the three subsamples of the global sample ( Global Sample ), we examine the joint significance of pre-treatment village characteristics (distance to major roads (km), the proportion of non-migrant women aged 36-50 with grade 1-5 of primary education and the proportion of non-migrant women aged 36-50 with grade 6 or above) and parental characteristics (mother’s and father’s age, and a set of dummy variables for mother’s and father’s educational attainment (grade 1-5 and grade 6 or above)). The dependent variable is the binary treatment variable defined above (i.e., Genocidal Violence I ). We estimate all regression equations by OLS with zone and district fixed effects controlled for. Logistic regression with many dummy variables can have a well-known incidental parameter problem [ 32 ]. The results reported in column 1 of Table 1 (and columns 1 and 2 of Table J in S1 Supplementary Material ) show that villages (couples) located (living) near killing sites are more likely to have been developed (educated) in the subsample of couples whose first child was born in 1977-1979 (the results for other two subsamples reported in columns 1 and 2 of Table J in S1 Supplementary Material are similar). Thus, controlling for these pre-treatment factors is indispensable. With limited relevant pre-treatment variables in our data, however, unobserved confounders are a major concern. It is likely that our estimates based on Global Sample suffer from omitted variable bias.


https://doi.org/10.1371/journal.pone.0212316.t001

Using GIS, we create 6.0 km spatial clusters around the locations of killing sites and identify village points within the spatial clusters ( Fig 2 ). We then limit the sample to couples living within the spatial clusters. We assess this local sample ( Local Sample I ) in the same way as we do Global Sample above, except that spatial cluster fixed effects are additionally controlled (433 spatial clusters in total): Unobserved confounders common within spatial clusters are now fully controlled for (see S2 Fig for the distribution of the number of spatial clusters to which villages in the three subsamples of all local samples belong). The results reported in column 2 of Table 1 (and columns 3 and 4 of Table J in S1 Supplementary Material ) show that although some significant differences disappear, substantial significant differences remain; the listed variables are all jointly significant in the three subsamples (in both panels A and B). These results raise a concern that our estimates in Local Sample I are likely to still suffer from omitted variable bias because unobserved confounders that are related to the observed covariates may vary within the spatial clusters.
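A minimal GeoPandas sketch of this step, under the same assumptions as the earlier sketch (hypothetical `villages` and `sites` GeoDataFrames in a metric CRS), buffers each killing site by 6.0 km and spatially joins villages to the resulting clusters; the village-cluster pairs can then be used to build spatial cluster fixed effects.

```python
# Sketch: 6.0 km spatial clusters around killing sites and the villages that
# fall inside them. A village can belong to several overlapping clusters
# (cf. S2 Fig); the village-cluster pairs from the spatial join can be used
# to construct spatial cluster fixed effects. Requires geopandas >= 0.10.
import geopandas as gpd

clusters = sites.copy()
clusters["cluster_id"] = range(len(clusters))
clusters["geometry"] = clusters.geometry.buffer(6000)  # 6.0 km in a metric CRS

# one row per village-cluster pair
village_clusters = gpd.sjoin(
    villages, clusters[["cluster_id", "geometry"]],
    how="inner", predicate="within")

# Local Sample I: villages (couples) inside at least one spatial cluster
local_sample_1 = villages[villages.index.isin(village_clusters.index)]
```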


The number of villages within 0-2.0 km, 2.0-4.0 km, and 4.0-6.0 km buffers of Killing Site 474 is 9, 15, and 23, respectively.

https://doi.org/10.1371/journal.pone.0212316.g002

We further limit the sample to couples living within selected spatial clusters with similar levels of regional development according to Fisher’s exact test. Since it is not feasible to implement Fisher’s exact test for the two measures depicted in Fig 1 , we use the proportion of migrant households as a proxy for the level of regional development (see Section 3.3 in S1 Supplementary Material for details); in-migration is generally strongly correlated with regional development [ 33 ]. With the lack of historical migration data, we use data from the 1998 Census to calculate the migrant proportion. We confirm that the migrant proportion is positively correlated with the locations of killing sites (Table B in S1 Supplementary Material ). We assume that this measure captures the level of regional development within spatial clusters well.

Within each 6.0 km spatial cluster, we create smaller buffers using a 2.0 km bandwidth which is narrower than the 3.0 km bandwidth determining the treatment status ( Fig 2 ). For each of the three subsamples of Local Sample I, we conduct Fisher’s exact test on the homogeneity in the proportion of migrant households across the three (0-2.0 km, 2.0-4.0 km, and 4.0-6.0 km) buffers. We define spatial clusters as balanced if the null hypothesis of no association in the proportion of migrant households across the three buffers cannot be rejected for all three subsamples. To be conservative, we also conduct the same exercise for 4.0 km spatial clusters.
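The snippet below illustrates the balance check for a single spatial cluster. Because SciPy's fisher_exact handles only 2x2 tables, it substitutes a chi-square test of homogeneity for the r x c Fisher's exact test used in the paper (an exact test could instead be run with R's fisher.test or a permutation approach); the household counts and the 0.10 cutoff are purely illustrative.

```python
# Sketch: homogeneity of the migrant-household proportion across the
# 0-2.0, 2.0-4.0, and 4.0-6.0 km buffers of one killing site. The paper uses
# Fisher's exact test on the 3x2 table; a chi-square test of homogeneity is
# substituted here as a stand-in. The counts are made up for illustration.
import numpy as np
from scipy.stats import chi2_contingency

#                 migrant  non-migrant
table = np.array([[12,  88],    # 0-2.0 km buffer
                  [15, 110],    # 2.0-4.0 km buffer
                  [20, 140]])   # 4.0-6.0 km buffer

chi2, p_value, dof, expected = chi2_contingency(table)
balanced = p_value > 0.10       # illustrative cutoff for "cannot reject"
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}, balanced = {balanced}")
```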

To provide a specific example, Table 2 reports the results of Fisher’s exact tests for Killing Site 474 depicted in Fig 2 (see S3 Fig for the location of Killing Site 474). In all three subsamples, the number of non-migrant households is larger than that of migrant households within the three buffers. As none of the three subsamples can reject the null hypothesis at conventional levels, the spatial cluster of Killing Site 474 is balanced (the same results hold for the 4.0 km spatial cluster (not reported)). S3 Fig shows the results of the same Fisher’s exact tests for all 514 killing sites. There are 115 balanced spatial clusters (see Section 3.3 in S1 Supplementary Material for details). Although the sample is limited to households within 115 balanced spatial clusters, we also adjust for spatial cluster fixed effects for unbalanced spatial clusters to which they belong, if any, in our regression model (see panel B of S2 Fig ).


https://doi.org/10.1371/journal.pone.0212316.t002

Almost all significant differences found in Global Sample and Local Sample I vanish in this further selected sample ( Local Sample II ) based on these 115 balanced spatial clusters ( Table 1 column 3 and Table J columns 5 and 6). The magnitude of many coefficients becomes smaller. In addition, the listed variables are jointly insignificant at conventional levels in all regressions. These results suggest that the observed covariates and unobserved covariates related to the observed ones, if any, are similar within the balanced spatial clusters; thus, Local Sample II should yield less biased estimates than Global Sample and Local Sample I. At the same time, relative to the population of interest, Local Sample II (and Local Sample I) contain households and villages with favorable characteristics because they focus on villages around killing sites, which tend to have been located in relatively developed areas (see Section 3.1 in S1 Supplementary Material for a more detailed consideration). Therefore, the results should be taken with some caution regarding external validity.

We construct outcome variables, specify regression models, and present the estimation results.

We consider educational outcomes for children aged 15-21 and 6-14. In the 1998 Cambodian education system, the former cohort had already finished the nine-year compulsory education (due to delayed entry, temporary dropout, or grade retention, some were still receiving it though), and the latter cohort were still receiving it, if they were receiving any education. Most of these children were born after the Pol Pot regime. The exposure of couples to the genocidal violence may have affected their fertility decisions and thus the existence of most of these children. Thus, we use household, not child, as a unit of analysis. The household-level outcome measures of interest are the average years of schooling ( Years of schooling ) for children aged 15-21 and the average grade progression ( Grade progression ) for children aged 6-14, where the grade progression of each child is given by Grade −( Age −5), which takes 0 if the child progresses from any grade to the next higher grade and negative values otherwise. Table D in S1 Supplementary Material provides the descriptive statistics of these outcome measures.
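A small pandas sketch of this construction, assuming a hypothetical child-level table with columns household_id, age, years_schooling, and grade (none of these names come from the survey):

```python
# Sketch: household-level outcomes from a hypothetical child-level table with
# columns household_id, age, years_schooling (children aged 15-21) and grade
# (current grade, children aged 6-14).
import pandas as pd

def household_outcomes(children: pd.DataFrame) -> pd.DataFrame:
    older = children[children["age"].between(15, 21)]
    younger = children[children["age"].between(6, 14)].copy()
    # grade progression: 0 if the child advanced one grade per year after
    # entering school at age 6, negative values otherwise
    younger["progression"] = younger["grade"] - (younger["age"] - 5)

    years = older.groupby("household_id")["years_schooling"].mean()
    prog = younger.groupby("household_id")["progression"].mean()
    return pd.concat(
        [years.rename("years_of_schooling"), prog.rename("grade_progression")],
        axis=1)
```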

Empirical specification.


Fig 3 plots the point estimates and 95% confidence intervals of the impacts of the binary genocide measure for children aged 15-21 (panel A) and 6-14 (panel B) of the couples whose first child was born in 1977-1979. For comparison, we present all results in Global Sample and Local Samples I and II; the smaller the sample size, the wider the confidence intervals. Although the point estimates are mostly positive in Global Sample and Local Sample I, estimates become negative in Local Sample II. For example, although the couples’ exposure to the genocidal violence increased the years of schooling of children aged 15-21 by 0.136 years in Global Sample, it decreased their years of schooling by 0.355 years (7.9% of the mean among those with no exposure) in Local Sample II. Such adverse impacts are found for children aged both 15-21 and 6-14. In contrast, among couples whose first child was born in 1980 or 1981-1982, none of the estimates in Local Sample II are statistically significantly different from 0 (see S4 Fig ).


https://doi.org/10.1371/journal.pone.0212316.g003

Fig 4 (and S5 Fig ) show the results for the continuous genocide measure in Local Sample I with victim information ( Local Sample III ) and Local Sample II with victim information ( Local Sample IV ), which contain 289 spatial clusters and 83 balanced spatial clusters, respectively (Table A in S1 Supplementary Material ). To construct Local Sample IV from Local Sample III , we employ the same procedure of Fisher’s exact test discussed above. The results are qualitatively the same as those for the binary genocide measure.


https://doi.org/10.1371/journal.pone.0212316.g004

Regression diagnostics based on Local Samples II and IV suggest that our models are appropriate (see S6 and S7 Figs). It is also noted that we use robust standard errors adjusted for clustering by village for conservative statistical inference.

To address remaining threats to internal validity, we assess the sensitivity of the results to potential omitted variable bias due to unobserved confounders that vary within balanced spatial clusters for Local Samples II and IV, using Oster’s coefficient stability approach [ 18 ] (see Section 5 in S1 Supplementary Material for details). The results confirm that the estimated negative impacts are robust to omitted variable bias even under conservative assumptions.
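For intuition, the function below implements the simple bias-adjusted coefficient that is commonly used to summarize Oster's approach (the exact estimator in [18] solves a cubic and is implemented, for example, in the Stata command psacalc, which is presumably closer to what the authors used); all numbers in the example call are made up.

```python
# Sketch: the simple bias-adjusted coefficient often used to summarize Oster's
# coefficient stability approach (the exact estimator solves a cubic; see the
# Stata command psacalc). beta_dot/r_dot come from the regression without
# controls, beta_tilde/r_tilde from the controlled regression.
def oster_beta_star(beta_dot, r_dot, beta_tilde, r_tilde, r_max, delta=1.0):
    return beta_tilde - delta * (beta_dot - beta_tilde) * \
        (r_max - r_tilde) / (r_tilde - r_dot)

# Illustrative numbers only: if the adjusted estimate keeps the sign of the
# controlled estimate, the negative impact is judged robust under the assumed
# delta and R_max.
print(oster_beta_star(beta_dot=-0.20, r_dot=0.05,
                      beta_tilde=-0.355, r_tilde=0.25, r_max=0.325))
```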

Thus, we conclude that the genocidal violence had adverse impacts on child education among the couples who had their first child during the Pol Pot era. The analysis and discussion of potential mechanisms underlying these patterns are provided elsewhere [ 34 ].

Concluding remarks

This paper provided a regression-based framework for GIS-based empirical research design using georeferenced point data for both spatial events of interest and subjects exposed to the events and illustrated its utility and implementation through an empirical case study from Cambodia. GIS is particularly useful in understanding the locational determinants of spatial events, creating treatment variables, constructing credible samples, and implementing blocking (local control). GIS can potentially play a key role in designing credible and transparent empirical research as spatial point data become more widely available. Much work is needed to overcome the limitations of our approach. Promising avenues include designing the optimal choice of the size of spatial clusters and the bandwidth of buffers, allowing a change in subject locations, capturing local spillover effects, and analyzing dynamic treatment effects. Ultimately, our approach needs to be extended to one in the potential outcomes framework for causal analysis. These works are left for future research.

Supporting information

S1 Fig. Distribution of continuous genocide measures.

Kernel density of the distribution of the continuous genocide measures based on the first-, second-, and third-order polynomials in distance is shown for each subsample in Global Sample.

https://doi.org/10.1371/journal.pone.0212316.s001

S2 Fig. Distribution of number of spatial clusters to which villages belong.

The figure provides the distribution of the number of spatial clusters to which villages in each subsample of Local Samples I (panel A), II (panel B), III (panel C), and IV (panel D) belong.

https://doi.org/10.1371/journal.pone.0212316.s002

S3 Fig. Results of Fisher’s exact tests.

The figure provides the results of Fisher’s exact tests. The location of killing site 474 analyzed in Fig 2 and Table 2 is depicted.

https://doi.org/10.1371/journal.pone.0212316.s003

S4 Fig. Estimation results (binary genocide measure)—Other subsamples.

The figure provides point estimates and 95% confidence intervals of genocide impacts on children’s educational outcomes based on binary genocide measure.

https://doi.org/10.1371/journal.pone.0212316.s004

S5 Fig. Estimation results (continuous genocide measure)—Other subsamples.

The figure provides point estimates and 95% confidence intervals of genocide impacts on children’s educational outcomes based on continuous genocide measure.

https://doi.org/10.1371/journal.pone.0212316.s005

S6 Fig. Regression diagnostics—Local Sample II (binary genocide measure).

Regression diagnostics are presented for each subsample of Local Sample II. Each of the three figures depicts the following: the distribution of residuals, along with a normal density (green) (left); a normal quantile-quantile plot of residuals (middle); a residual plot, along with a locally weighted scatterplot smoothing curve (bandwidth = 0.8) (green) (right).

https://doi.org/10.1371/journal.pone.0212316.s006

S7 Fig. Regression diagnostics—Local Sample IV (continuous genocide measure).

Regression diagnostics are presented for each subsample of Local Sample IV. See the notes to S6 Fig for each figure.

https://doi.org/10.1371/journal.pone.0212316.s007

S1 Supplementary Material. PDF file containing the Supplementary Material.

This file provides the detailed descriptions of data, diagnosis, design, and analysis (Section 1: Data details; 2: Diagnosis details; 3: Design details; 4: Robustness checks; 5: Sensitivity analysis), including 18 tables (Tables A–R).

https://doi.org/10.1371/journal.pone.0212316.s008

S1 File. Replication files.

This zip file contains the dataset and Stata do-file to replicate the empirical analyses presented in the case study.

https://doi.org/10.1371/journal.pone.0212316.s009

Acknowledgments

We thank the editor and three anonymous referees for comments that greatly improved the paper. An earlier version of this paper benefited significantly from the comments and suggestions of Oliver T. Coomes, Keisuke Hirano, Kosuke Imai, Da-Wei Kuan, and seminar participants at Chengchi University and GIScience 2018 (RMIT University). We also thank H.E. San Sy Than, H.E. Hang Lina, and Fumihiko Nishi for providing us with the 100% count 1998 Cambodia Population Census microdata.

  • 1. Alam BM. Application of Geographic Information Systems. InTech; 2012.
  • 2. Ballas D, Clarke G, Franklin RS, Newing A. GIS and the Social Sciences: Theory and Applications. 1st ed. Oxford: Routledge; 2017.
  • 5. Angrist JD, Pischke J. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton: Princeton University Press; 2009.
  • 6. Anselin L. Spatial Econometrics: Methods and Models. Boston: Kluwer; 1988.
  • 8. Rosenbaum PR. Design of Observational Studies. New York: Springer; 2010.
  • 9. Cressie NAC. Statistics for Spatial Data. Revised ed. New York: Wiley; 1993.
  • 11. Eldawy A, Mokbel MF. The Era of Big Spatial Data: A Survey. Hanover, MA: Now Publishers; 2016.
  • 14. Papadogeorgou G, Choirat C, Zigler CM. Adjusting for Unmeasured Spatial Confounding with Distance Adjusted Propensity Score Matching. Biostatistics (forthcoming).
  • 15. Fisher RA. Design of Experiments. 1st ed. London: Oliver and Boyd; 1935.
  • 18. Oster E. Unobservable Selection and Coefficient Stability: Theory and Evidence. Journal of Business & Economic Statistics (forthcoming).
  • 19. Fisher RA. Statistical Methods for Research Workers. 1st ed. London: Oliver and Boyd; 1925.
  • 21. Shadish WR, Cook TD, Campbell DT. Experimental and Quasi-Experimental Designs for Generalized Causal Inference. Belmont, CA: Wadsworth Cengage Learning; 2002.
  • 22. Stock J, Watson M. Introduction to Econometrics. 3rd ed. Boston: Pearson; 2011.
  • 26. Short P. Pol Pot: Anatomy of a Nightmare. New York: Henry Holt and Company; 2004.
  • 27. Dy K. A History of Democratic Kampuchea (1975-1979). Phnom Penh: Documentation Center of Cambodia; 2007.
  • 28. Kiernan B. The Pol Pot Regime: Race, Power, and Genocide in Cambodia under the Khmer Rouge, 1975-79. 3rd ed. New Haven, CT: Yale University Press; 2008.
  • 29. Beth VS, Daryn R, Youk C. Cambodia’s Hidden Scars: Trauma Psychology in the Wake of the Khmer Rouge. Phnom Penh: Documentation Center of Cambodia; 2011.
  • 30. Vickery M. Kampuchea: Politics, Economics and Society. London: Frances Pinter; 1986.
  • 31. Chandler D. Voices from S-21. Berkeley: University of California Press; 1999.
  • 33. Mazumdar D. Rural-Urban Migration in Developing Countries. In: Mills ES, editor. Handbook of Regional and Urban Economics: Urban Economics. vol. 2. New York: North Holland; 1987. p. 1097–1128.
  • 34. Kogure K, Takasaki Y. Conflict, Institutions, and Economic Behavior: Legacies of the Cambodian Genocide; 2016. Unpublished manuscript, CIRJE-F-1034, University of Tokyo.


Original research article
Impact of Industrial Policy on Urban Green Innovation: Empirical Evidence of China’s National High-Tech Zones Based on Double Machine Learning


  • College of Economics and Management, Taiyuan University of Technology, Taiyuan, China

Effective industrial policies need to be implemented, particularly aligning with environmental protection goals to drive the high-quality growth of China’s economy in the new era. Setting up national high-tech zones falls under the purview of both regional and industrial policies. Using panel data from 163 prefecture-level cities in China from 2007 to 2019, this paper empirically analyzes the impact of national high-tech zones on the level of urban green innovation and its underlying mechanisms. It utilizes the national high-tech zones as a quasi-natural experiment and employs a double machine learning model. The study findings reveal that the policy for national high-tech zones greatly enhances urban green innovation. This conclusion remains consistent even after adjusting the measurement method, empirical samples, and controlling for other policy interferences. The findings from the heterogeneity analysis reveal that the impact of the national high-tech zone policy on green innovation exhibits significant regional heterogeneity, with a particularly significant effect in the central and western regions. Among cities, there is a notable push for green innovation levels in second-tier, third-tier, and fourth-tier cities. The moderating effect results indicate that, at the current stage of development, transportation infrastructure primarily exerts a negative moderating effect on how the national high-tech zone policy impacts the level of urban green innovation. This research provides robust empirical evidence for informing the optimization of the industrial policy of China and the establishment of a future ecological civilization system.

1 Introduction

The Chinese economy currently focuses on high-quality development rather than quick growth. The traditional demographic and resource advantages are gradually diminishing, making the earlier crude development model reliant on excessive resource input and consumption unsustainable. Simultaneously, resource depletion, environmental pollution, and carbon emissions are growing more severe (Wang F. et al., 2022). Consequently, pursuing a mutually beneficial equilibrium between the economy and the environment has emerged as a critical concern in China’s economic growth. Green innovation, the integration of innovation with sustainable development ideas, is progressively gaining significance within the framework of reshaping China’s economic development strategy and addressing the challenges associated with resource and environmental limitations. In light of the present circumstances, and with the objectives outlined in the “3060 Plan” for carbon peaking and carbon neutrality, the pursuit of a green and innovative development trajectory, emphasizing heightened innovation alongside environmental preservation, has emerged as a pivotal concern in China’s contemporary economic progress.

Industrial policy is pivotal in government intervention within market-driven resource allocation and correcting structural disparities. The government orchestrates this initiative to bolster industrial expansion and operational effectiveness. In contrast to Western industrial policies, those in China are predominantly crafted within the administrative framework and promulgated through administrative regulations. Over an extended period, numerous industrial policies have been devised in response to regional disparities in industrial development. These policies aim to identify new growth opportunities in diverse regions, focusing on optimizing and upgrading industrial structures. These strategies have been implemented at various administrative levels, from the central government to local authorities (Sun and Sun, 2015). As a distinctive regional economic policy in China, the national high-tech zone represents one of the foremost supportive measures a city can acquire at the national level. Its crucial role involves facilitating the dissemination and advancement of regional economic growth. Over more than three decades, it has evolved into the primary platform through which China executes its strategy of concentrating on high-tech industries and fostering development driven by innovation. Concurrently, the national high-tech zone, operating as a geographically focused policy customized for a specific region (Cao, 2019), enhances the precision of policy support for the industries under its purview, covering a more limited range of municipalities, counties, and regions. Contrasting with conventional regional industrial policies, the industry-focused policy within national high-tech zones prioritizes comprehensive resource allocation guidance and economic foundations to maximize synergy and promote the long-term sustainable growth of the regional economy; this represents a significant paradigm shift in location-based policies within the framework of carrying out the new development concept. Its inception embodies a combination of central authorization, high-level strategic planning, local grassroots decision-making, and innovative system development. In recent years, driven by the dual carbon objective, national high-tech zones have proactively promoted environmentally friendly innovation. Nevertheless, given the proliferation of new industrial policies and the escalating complexity of the policy framework, has the setting up of national high-tech zones genuinely elevated the level of urban green innovation in contrast to conventional regional industrial policies? What are the underlying mechanisms? Simultaneously, concerning the variations among different cities, have the industrial policy tools within the national high-tech zones been employed judiciously and adaptably? What are the concrete practical outcomes? Investigating these matters has emerged as a significant subject requiring resolution by government, industry, and academia.

2 Literature review and research hypothesis

2.1 Literature review

When considering industrial policy, the setting up of national high-tech zones embodies the intersection of regional and industrial policies. Domestic and international academic research concerning the setting up of national high-tech zones primarily centers on economic activities and innovation. Notably, the economic impact of national high-tech zones encompasses a wide range of factors, including their influence on total factor productivity (Tan and Zhang, 2018; Wang and Liu, 2023), foreign trade (Alder et al., 2016), industrial structure upgrades (Yuan and Zhu, 2018), and economic growth (Liu and Zhao, 2015; Huang and Fernández-Maldonado, 2016; Wang Z. et al., 2022). Regarding innovation, numerous researchers have confirmed the positive effects of national high-tech zones on company innovation (Vásquez-Urriago et al., 2014; Díez-Vial and Fernández-Olmos, 2017; Wang and Xu, 2020); nevertheless, a few scholars have disagreed on this matter (Hong et al., 2016; Sosnovskikh, 2017). In general, the consensus among scholars is that setting up national high-tech zones fosters regional innovation significantly. This consensus is supported by various aspects of innovation, including innovation efficiency (Park and Lee, 2004; Chandrashekar and Bala Subrahmanya, 2017), agglomeration effects (De Beule and Van Beveren, 2012), and innovation capability (Yang and Guo, 2020), among other relevant dimensions. The existing literature predominantly delves into the correlation between the setting up of national high-tech zones, innovation, and economic significance. However, the rise of digital economic developments, notably industrial digitization, has accentuated the limitations of the traditional innovation paradigm. These shortcomings, such as the inadequate exploration of the social importance and sustainability of innovation, have become apparent in recent years. As the primary driver of sustainable development, green innovation represents a potent avenue for achieving economic benefits and environmental value (Weber et al., 2014). Its distinctiveness from other innovation forms lies in its potential to facilitate the transformation of development modes, reshape economic structures, and address pollution prevention and control challenges. In the context of green innovation, however, Wang et al. (2020), using the double-difference approach, have pointed out that national high-tech zones enhance the effectiveness of urban green innovation, but that this effect is only significant in the eastern region.

Furthermore, scholars have also explored the mechanisms underlying the innovation effects of national high-tech zones. For example, Cattapan et al. (2012) focused on science parks in Italy and found that technology transfer services positively influence product innovation. Albahari et al. (2017) confirmed that higher education institutions’ involvement in advancing corporate innovation within technology and science parks has a beneficial moderating effect. Using the moderating effect of spatial agglomeration as a basis, Li WH. et al. (2022) found that industrial agglomeration has a significantly unfavorable moderating influence on the effectiveness of performance transformation in national high-tech zones. Multiple studies have examined the national high-tech zone industrial policy’s regulatory framework and urban innovation. However, in the age of rapidly expanding new infrastructure, construction is concentrated on information technologies like blockchain, big data, cloud computing, artificial intelligence, and the Internet; further research is needed to explore whether traditional infrastructure, particularly transportation infrastructure, can promote urban green innovation. Transportation infrastructure has consistently been vital in fostering economic expansion, integrating regional resources, and facilitating coordinated development (Behrens et al., 2007; Zhang et al., 2018; Pokharel et al., 2021). Therefore, it is necessary to investigate whether transportation infrastructure can continue encouraging innovative urban green practices in the digital economy.

In summary, the existing literature has extensively examined the influence of national high-tech zones on economic growth and innovation from various levels and perspectives, establishing a solid foundation and offering valuable research insights for this study. Nonetheless, previous studies frequently overlooked the impact of national high-tech zones on urban green innovation levels, and a subsequent series of work in this paper aims to address this issue. Further exploration and expansion are needed to understand the industrial policy framework’s strategy for relating national high-tech zones to urban green innovation. Furthermore, there is a need for further improvement and refinement of the research model and methodology. Based on these, this paper aims to discuss the industrial policy effects of national high-tech zones from the perspective of urban green innovation to enrich and expand the existing research.

In contrast to earlier research, the marginal contribution of this paper is organized into three dimensions: 1) Most scholars have primarily focused on the effects of national high-tech zones on economic activity and innovation, with less emphasis on green innovation and few studies taking the perspective of the level of green innovation. This work therefore enhances the existing research on national high-tech zones as an industrial policy. 2) Regarding the research methodology, the Double Machine Learning (DML) approach is used to evaluate the policy effects of national high-tech zones, leveraging the advantages of machine learning algorithms for high-dimensional and non-parametric prediction. This approach circumvents the problems of model setting bias and the “curse of dimensionality” encountered in traditional econometric models (Chernozhukov et al., 2018), enhancing the credibility of the research findings. 3) By introducing transportation infrastructure as a moderator variable, this study investigates the underlying mechanism of national high-tech zones’ effect on urban green innovation, offering suggestions for maximizing the policy influence of these zones.

2.2 Theoretical analysis and hypotheses

2.2.1 National high-tech zones’ industrial policies and urban green innovation

As one of the ways to land industrial policies at the national level, national high-tech zones serve as effective driving forces for enhancing China’s ability to innovate regionally and its contribution to economic growth ( Xu et al., 2022 ). Green innovation is a novel form of innovation activity that harmoniously balances the competing goals of environmental preservation and technological advancement, facilitating the superior expansion of the economy by alleviating the strain on resources and the environment ( Li, 2015 ). National high-tech zones mainly impact urban green innovation through three main aspects. Firstly, based on innovation compensation effects, national high-tech zones, established based on the government’s strategic planning, receive special treatment in areas such as land, taxation, financing, credit, and more, serving as pioneering special zones and experimental fields established by the government to promote high-quality regional development. When the government offers R&D subsidies to enterprises engaged in green innovation activities within the zones, enterprises are inclined to respond positively to the government’s policy support and enhance their level of green innovation as a means of seeking external legitimacy ( Fang et al., 2021 ), thereby contributing to the advancement of urban green innovation. Secondly, based on the industrial restructuring effect, strict regulation of businesses with high emissions, high energy consumption, and high pollution levels is another aspect of implementing the national high-tech zone program. Consequently, businesses with significant emissions and energy consumption are required to optimize their industrial structure to access various benefits within the park, resulting in the gradual transformation and upgrading of high-energy-consumption industries towards green practices, thereby further contributing to regional green innovation. Based on Porter’s hypothesis, the green and low-carbon requirements of the park policy increase the production costs for polluting industries, prompting polluting enterprises to upgrade their existing technology and adopt green innovation practices. Lastly, based on the theory of industrial agglomeration, the national high-tech zones’ industrial policy facilitates the concentration of innovative talents to a certain extent, resulting in intensified competition in the green innovation market. Increased competition fosters the sharing of knowledge, technology, and talent, stimulating a market environment where the survival of the fittest prevails ( Melitz and Ottaviano, 2008 ). These increase the effectiveness of urban green innovation, helping to propel urban green innovation forward. Furthermore, the infrastructure development within the national high-tech zones establishes a favorable physical environment for enterprises to engage in creative endeavors. Also, it enables the influx of high-quality innovation capital from foreign sources, complementing the inherent characteristics of national high-tech zones that attract such capital and concentrate green innovation resources, ultimately resulting in both environmental and economic benefits. Based on the above analysis, Hypothesis 1 is proposed:

Hypothesis 1. Implementing industrial policies in national high-tech zones enhances levels of urban green innovation.

2.2.2 Heterogeneity analysis

Given the variations in economic foundations, industrial statuses, and population distributions across different regions, development strategies in different regions are also influenced by these variations (Chen and Zheng, 2008). Theoretically, when using administrative boundaries or geographic locations as benchmarks, the impact of national high-tech zone industrial policy on urban green innovation should be achieved through strategies like aligning with the region’s existing industrial structure. Compared to the western and central regions, the eastern region exhibits greater innovation and dynamism due to advantages such as a developed economy, good infrastructure, advanced management concepts, and technologies, combined with a relatively high initial level of green innovation factor endowment. Considering the diminishing marginal effect principle of green innovation, the implementation of industrial policy in national high-tech zones amounts to an “icing on the cake” approach in the eastern region, contrasting with a “delivering charcoal in the snow” approach in the central and western regions. In other words, the benefits of national high-tech zones for promoting urban green innovation may be weaker in the eastern region than in the central and western regions. Literature confirms that establishing national high-tech zones yields a more beneficial technology agglomeration effect in the less developed central and western regions (Liu and Zhao, 2015), leading to a more substantial impact on enhancing the level of urban green innovation.

Moreover, local governments consider economic development, industrial structure, and infrastructure levels when establishing national high-tech zones. These factors serve as the foundation for regional classification to address variations in regional quality and to compensate for gaps in theoretical research on the link between national high-tech zone industrial policy implementation and urban green innovation. Consequently, the execution of industrial policies in national high-tech zones relies on other vital factors influencing urban green innovation. Significant variations exist in economic development and infrastructure levels among cities of different grades ( Luo and Wang, 2023 ). Generally, cities with higher rankings exhibit strong economic growth and infrastructure, contrasting those with lower rankings. Consequently, the effect of establishing a national high-tech zone on green innovation may vary across different city grades. Thus, considering the disparities across city rankings, we delve deeper into identifying the underlying reasons for regional diversity in the green innovation outcomes of industrial policies implemented in national high-tech zones based on city grades. Based on the above analysis, Hypothesis 2 is proposed:

Hypothesis 2. There is regional heterogeneity and city-level heterogeneity in the impact of national high-tech zone policies on the level of urban green innovation.

2.2.3 The moderating effect of transportation infrastructure

Implementing industrial policies and facilitating the flow of innovation factors are closely intertwined with the role of transport infrastructure as carriers and linkages. Generally, enhanced transportation infrastructure facilitates the absorption of local factors and improves resource allocation efficiency, thereby influencing the spatial redistribution of production factors like labor, resources, and technology across cities. Enhanced transportation infrastructure fosters the development of more robust and advanced innovation networks (Fritsch and Slavtchev, 2011). Banister and Berechman (2001) highlighted that transportation infrastructure exhibits network properties that are fundamental to its agglomeration or diffusion effects. From this perspective, robust infrastructure impacts various economic activities, including interregional labor mobility, factor agglomeration, and knowledge exchange among firms, thereby expediting the spillover effects of green technological innovations (Yu et al., 2013). In turn, this could positively moderate the influence of national high-tech zone policies on green innovation. On the other hand, while transportation infrastructure facilitates the growth of national high-tech zone policies, it also brings negative impacts, including high pollution, emissions, and ecological landscape fragmentation. Improving transportation infrastructure can also lead to the “relative congestion effect” in national high-tech zones. This phenomenon, observed in specific regions, refers to the excessive concentration of similar enterprises across different links of the same industrial chain, which exacerbates the competition for innovation resources among enterprises, making it challenging for enterprises in the region to allocate their limited innovation resources to technological research and development activities (Li et al., 2015). As a result, the level of green innovation is held back. The impact of transportation infrastructure at the current stage of development is therefore complex: when the level of transport infrastructure is moderate, it supports the promotion of urban green innovation through national high-tech zone policies, but its moderating effect may also turn out to be harmful. Based on the above analysis, Hypothesis 3 is proposed:

Hypothesis 3. Transportation infrastructure moderates the relationship between national high-tech zones and levels of urban green innovation.

3 Research design

3.1 Model setting

This research explores the impact of industrial policies of national high-tech zones on the level of urban green innovation. Many related studies utilize traditional causal inference models to assess the impact of these policies. However, these models have several limitations in their application. For instance, the commonly used double-difference model has stringent requirements for the sample data in the parallel trend test. Although the synthetic control approach can create a virtual control group that meets the parallel trends requirement, it is limited to addressing the ‘one-to-many’ problem and requires excluding groups with extreme values. The selection of matching variables in propensity score matching is subjective, among other limitations (Zhang and Li, 2023). To address the limitations of conventional causal inference models, scholars have started to explore applying machine learning to infer causality (Chernozhukov et al., 2018; Knittel and Stolper, 2021). Machine learning algorithms excel at making accurate predictions and at providing an impartial assessment of the effect on the target variable.

In contrast to traditional machine learning algorithms, DML was formally proposed in 2018 (Chernozhukov et al., 2018). It offers a more robust basis for causal inference by mitigating bias through the incorporation of residual modeling. Currently, some scholars utilize DML to assess causality in economic phenomena. For instance, Hull and Grodecka-Messi (2022) examined the effects of local taxation, crime, education, and public services on migration using DML in the context of Swedish cities between 2010 and 2016. These existing research findings serve as valuable references for this study. Compared to traditional causal inference models, DML offers distinct advantages in variable selection and model estimation (Zhang and Li, 2023). Moreover, in promoting urban green innovation in China, there is a high probability of non-linear relationships between variables, and a traditional linear regression model may lead to bias and errors, whereas the double machine learning model can effectively avoid problems such as model setting bias. Based on this, the present study employs a DML model to evaluate the policy implications of establishing a national high-tech zone.

3.1.1 Double machine learning framework

Prior to applying the DML algorithm, this paper refers to the practice of Chernozhukov et al. (2018) to construct a partially linear DML model, as depicted in Eq. 1 below:

$$\ln GI_{it} = \theta_0\, Zone_{it} + g(X_{it}) + U_{it}, \qquad E\left[U_{it}\mid Zone_{it}, X_{it}\right] = 0 \tag{1}$$

where $i$ indexes the city and $t$ the year; $\ln GI_{it}$ is the explained variable, which in this paper is the green innovation level of the city; $Zone_{it}$ is the disposition variable, here the national high-tech zone policy variable, which takes the value of 1 after the implementation of the pilot and 0 otherwise; $\theta_0$ is the disposal coefficient that is the focus of this paper; $X_{it}$ is the set of high-dimensional control variables, whose unknown function $g(X_{it})$ is estimated by machine learning algorithms in the specific form $\hat g(X_{it})$; $U_{it}$ is the error term with a conditional mean of 0; and $n$ is the sample size. Direct estimation of Eq. 1 provides an estimate of the disposition coefficient.

We can further explore the estimation bias by combining Eqs 1 and 2, as depicted in Eq. 3 below:

$$\sqrt{n}\left(\hat{\theta}_0 - \theta_0\right) = a + b, \tag{3}$$

where

$$a = \left(\frac{1}{n}\sum_{i \in I,\, t \in T} Zone_{it}^2\right)^{-1} \frac{1}{\sqrt{n}}\sum_{i \in I,\, t \in T} Zone_{it}\, U_{it},$$

which converges to a normal distribution with mean 0, and

$$b = \left(\frac{1}{n}\sum_{i \in I,\, t \in T} Zone_{it}^2\right)^{-1} \frac{1}{\sqrt{n}}\sum_{i \in I,\, t \in T} Zone_{it}\left(g(X_{it}) - \hat{g}(X_{it})\right).$$

It is important to note that DML uses machine learning with a regularization algorithm to estimate the specific functional form $\hat{g}(X_{it})$. The introduction of regularization bias is inevitable, as regularization prevents the estimates from having excessive variance at the cost of some bias. Specifically, because $\hat{g}(X_{it})$ converges to $g(X_{it})$ at a rate $n^{-\varphi_g}$ slower than $n^{-1/2}$, the term $b$ diverges as $n$ tends to infinity, and $\hat{\theta}_0$ is difficult to converge to $\theta_0$. To expedite convergence and ensure unbiasedness of the disposal coefficient estimates with small samples, an auxiliary regression is constructed as follows:

$$Zone_{it} = m(X_{it}) + V_{it}, \qquad E\left[V_{it}\mid X_{it}\right] = 0 \tag{4}$$

where $m(X_{it})$ is the regression function of the disposition variable on the high-dimensional control variables, which also requires estimation with a machine learning algorithm in the specific form $\hat{m}(X_{it})$, and $V_{it}$ is the error term with a conditional mean of 0.
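For illustration, the sketch below hand-rolls the cross-fitted partialling-out (Robinson-style) estimator for this partially linear model with random forests; `y`, `d`, and `X` are assumed NumPy arrays built from the city panel, and dedicated packages such as DoubleML wrap the same logic. It is a sketch of the general technique, not the authors' implementation.

```python
# Sketch: cross-fitted partialling-out DML estimate of theta_0 in the
# partially linear model ln GI = theta_0*Zone + g(X) + U with auxiliary
# equation Zone = m(X) + V, using random forests as nuisance learners.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

def dml_plr(y, d, X, n_folds=5):
    rf = lambda: RandomForestRegressor(n_estimators=500, random_state=0)
    y_hat = cross_val_predict(rf(), X, y, cv=n_folds)  # cross-fitted E[y|X]
    d_hat = cross_val_predict(rf(), X, d, cv=n_folds)  # cross-fitted E[d|X]
    u, v = y - y_hat, d - d_hat                        # residualize y and d
    theta = (v @ u) / (v @ v)                          # orthogonal moment estimate
    se = np.sqrt(np.mean(((u - theta * v) * v) ** 2) * len(y)) / (v @ v)
    return theta, se
```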

3.1.2 The test of the moderating effect within the DML framework

This study investigates how the national high-tech zone industrial policy influences urban green innovation. It incorporates moderating variables within the DML framework, drawing on the testing procedure outlined by Jiang (2022), and integrates it with the practice of He et al. (2022), as outlined below:

Equation 5 is based on Eq. 1 with the addition of the variables $\ln tra_{it}$ and $Zone_{it} \times \ln tra_{it}$, where $\ln tra_{it}$ is the moderating variable, in this paper the transportation infrastructure, and $Zone_{it} \times \ln tra_{it}$ is the interaction term between the moderating variable and the disposition variable. The variables $\ln tra_{it}$ and $Zone_{it}$ are added to the high-dimensional control variables $X_{it}$, and the rest of the variables in Eq. 5 are identical to those in Eq. 1. $\theta_1$ is the disposal coefficient of interest.

3.2 Variable selection

3.2.1 Dependent variable: level of urban green innovation (lnGI)

Nowadays, many academics use indicators like the number of applications for patents or authorizations to assess the degree of urban innovation. To be more precise, the quantity of patent applications is a measure of technological innovation effort, while the number of patents authorized undergoes strict auditing and can provide a more direct reflection of the achievements and capacity of scientific and technological innovation. Thus, this paper refers to the studies of Zhou and Shen (2020) and Li X. et al. (2022) to utilize the count of authorized green invention patents in each prefecture-level city to indicate the level of green innovation. For the empirical study, the count of authorized green patents plus 1 is transformed using logarithm.

3.2.2 Disposal variable: dummy variables for national high-tech zones (Zone)

The national high-tech zone dummy variable’s value correlates with the city in which it is located and the list of national high-tech zones released by China’s Ministry of Science and Technology. If a national high-tech zone was established in the city by 2017, the value is set to 1 for the year the high-tech zone is established and subsequent years. Otherwise, it is set to 0.
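A minimal pandas sketch of this dummy construction, assuming a hypothetical city-year panel and a table mapping treated cities to their establishment year (column names are illustrative):

```python
# Sketch: treatment dummy Zone_it from a hypothetical table `zone_list`
# mapping each treated city to the year its national high-tech zone was
# established; never-treated cities get 0 in every year.
import pandas as pd

def add_zone_dummy(panel: pd.DataFrame, zone_list: pd.DataFrame) -> pd.DataFrame:
    panel = panel.merge(zone_list[["city", "establish_year"]],
                        on="city", how="left")
    panel["Zone"] = ((panel["year"] >= panel["establish_year"])
                     & panel["establish_year"].notna()).astype(int)
    return panel.drop(columns="establish_year")
```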

3.2.3 Moderating variable: transportation infrastructure (lntra)

Previous studies have shown that China’s highway freight transport comprises 75% of total freight transport ( Li and Tang, 2015 ). Highway transportation infrastructure has a significant influence on the evolution of the Chinese economy, and the development and improvement of highway infrastructure are crucial for modern transportation. Following the research method of Wu (2019), this paper uses the ratio of roadway mileage (measured in kilometers) to population as a measure of the quality of the transportation system.

3.2.4 Control variables

(1) Foreign direct investment (lnfdi): There is general agreement among academics that foreign direct investment (FDI) significantly influences urban green innovation, as FDI provides expertise in management, human resources, and cutting-edge industrial technology ( Luo et al., 2021 ). Thus, it is necessary to consider and control for the level of FDI. This paper uses the ratio of foreign investment (in millions of yuan) to local GDP.

(2) Financial development level (lnfd): Innovation in science and technology is greatly aided by finance. For the green innovation-driven strategy to advance, it is imperative that funding for science and technology innovation be strengthened. The amount of capital raised for innovation is strongly impacted by the state of urban financial development ( Zhou and Du, 2021 ). Thus, this paper uses the loan balance to GDP ratio as an indicator.

(3) Human capital (lnhum): Highly skilled human capital is essential for cities to drive green innovation. Generally, highly qualified human capital significantly boosts green innovation ( Ansaris et al., 2016 ). Therefore, a measure was employed: the proportion of people in the city who had completed their bachelor’s degree or above.

(4) Industrial structure (lnind): Generally, the secondary industry in China is the primary source of pollution, and there is a significant impact of industrial structure on green innovation ( Qiu et al., 2023 ). The metric used in this paper is the secondary industry-to-GDP ratio for the area.

(5) Regional economic development level (lnagdp): A region’s level of economic growth is indicative of the material foundation for urban green innovation and in-fluences the growth of green innovation in the region ( Bo et al., 2020 ). This research uses the annual gross domestic product per capita as a measurement.

3.3 Data source

By 2017, China had developed 157 national high-tech zones in total. In conjunction with the study’s objectives, this study performs sample adjustments and a screening process. The study’s sample period spans from 2007 to 2019. The 57 national high-tech zones created prior to 2000 are omitted to lessen the impact on the test results of cities whose high-tech zones were founded before 2007. Due to the limitations of high-tech zones in county-level cities in promoting urban green innovation, 8 high-tech zones located in county-level cities are excluded, and 4 high-tech zones with severe missing data are excluded. Among the list of established national high-tech zones, 88 high-tech zones are distributed across 83 prefecture-level cities because some cities contain multiple zones. As a result, 83 cities are selected as the experimental group for this study. Additionally, a control group of 80 cities was selected from among those that did not have high-tech zones by the end of 2019, resulting in a final sample size of 163 cities. This paper collects green patent data for each city from the China Green Patent Statistical Report published by the State Intellectual Property Office. The authors compiled the list of national high-tech zones and the starting year of their establishment from the official government website. In addition, the remaining data in this paper primarily originate from the China Urban Statistical Yearbook (2007–2019), the EPS database, and the official websites of the respective cities’ Bureaus of Statistics. Missing values were addressed through linear interpolation. To address heteroskedasticity in the model, the study logarithmically transforms the variables, excluding the disposal variable. Table 1 shows the descriptive analysis of the variables.


Table 1 . Descriptive analysis.

4 Empirical analysis

4.1 National high-tech zones’ policy effects on urban green innovation

This study utilizes the DML model to estimate the impact of industrial policies implemented in national high-tech zones on the level of urban green innovation. Following the approach of Zhang and Li (2023), the sample is split in a ratio of 1:4, and the random forest algorithm is used to perform predictions, combining Eq. (1) with Eq. (4) for the regression. Table 2 presents the results with and without controlling for time and city effects. The results indicate that the treatment effect estimates in the four columns are 0.376, 0.293, 0.396, and 0.268, respectively, each significant at the 1% level. Thus, Hypothesis 1 is supported.


Table 2 . Benchmark regression results.

4.2 Robustness tests

4.2.1 Eliminating the influence of extreme values

To reduce the impact of extreme values on the estimation outcomes, all variables in the benchmark regression, excluding the disposal variable, are winsorized at the upper and lower 1% and 5% quantiles: values below the lower quantile and above the upper quantile are replaced with the respective quantile values, and the regressions are re-estimated. Table 3 demonstrates that removing the influence of outliers does not substantially alter the findings of this study.
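A small pandas sketch of such two-sided winsorization, with `df` and the column list as illustrative placeholders:

```python
# Sketch: two-sided winsorization of the continuous variables at the q and
# 1-q quantiles (q = 0.01 or 0.05 in the robustness check).
def winsorize(df, cols, q=0.01):
    out = df.copy()
    for c in cols:
        lo, hi = out[c].quantile(q), out[c].quantile(1 - q)
        out[c] = out[c].clip(lower=lo, upper=hi)
    return out
```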


Table 3 . Extreme values removal results.

4.2.2 Considering province-time interaction fixed effects

Since provinces are critical administrative units in the governance system of the Chinese government, cities within the same province often share similarities in policy environment and location characteristics. Therefore, to account for the influence of temporal changes across different provinces, this study incorporates province-time interaction fixed effects based on the benchmark regression. Table 4 presents the individual regression results. Based on the regression results, after accounting for the correlation between different city characteristics within the same province, national high-tech zone policies continue to significantly influence urban green innovation, even at the 1% level.


Table 4 . The addition of province and time fixed effects interaction terms.

4.2.3 Excluding other policy disturbances

When analyzing how national high-tech zones affect urban green innovation, the estimates are susceptible to the influence of concurrent policies. This study accounts for other comparable policies implemented during the same period to ensure an accurate estimation of the policy effect. Since 2007, policies similar to the national high-tech zone policy have been successively implemented, including the development of “smart cities.” Therefore, this study incorporates a policy dummy variable for “smart cities” in the benchmark regression. The specific regression findings are shown in Table 5. After controlling for the impact of concurrent policies, the significance of the national high-tech zones’ policy impact remains consistent.


Table 5 . Results of removing the impact of parallel policies.

4.2.4 Resetting the DML model

To mitigate the potential bias introduced by the settings in the DML model on the conclusions, the purpose of this study is to assess the conclusions’ robustness using the following methods. First, the sample split ratio of the DML model is adjusted from 1:4 to 1:2 to examine the potential impact of the sample split ratio on the conclusions of this study. Second, the machine learning algorithm is substituted, replacing the random forest algorithm, which has been utilized as a prediction algorithm, with lasso regression, gradient boosting, and neural networks to investigate the potential influence of prediction algorithms on the conclusions of this study. Third, regarding benchmark regression, additional linear models were constructed and analyzed using DML, which involves subjective decisions regarding model form selection. Therefore, DML was employed to construct more comprehensive interactive models, aiming to assess the influence of model settings on the conclusions of this study. The main and auxiliary regressions utilized for the analysis were modified as follows:

Combining Eqs ( 7 ), ( 8 ) for the regression, the interactive model yielded estimated coefficients for the disposition effect:

The results of Eq. (9) are shown in column (5) of Table 6, and all the regression results obtained from the modified DML models are presented in Table 6.


Table 6 . Results of resetting the DML model.

The findings indicate that neither the sample split ratio in the DML model, nor the prediction algorithm used, nor the model estimation approach alters the conclusion that the national high-tech zone policy raises the level of green innovation in urban areas; these factors only modify the magnitude of the estimated policy effect to some degree.

4.3 Heterogeneity analysis

4.3.1 Regional heterogeneity

The sample cities were further divided into the east, central, and west regions based on the three major economic subregions to examine regional variations in the effects of national high-tech zone policies on urban green innovation, with the results presented in Table 7. National high-tech zone policies do not statistically significantly affect urban green innovation in the eastern region. However, they have a considerable beneficial influence in the central and western regions. The lack of statistical significance may be explained by the possibility that setting up national high-tech zones in the eastern region creates obstacles to the growth of urban green innovation, such as resource strain and environmental pollution. Given the central and western regions’ relatively underdeveloped economic status and industrial structure, coupled with the preceding theoretical analysis, establishing national high-tech zones is a crucial catalyst, significantly boosting urban green innovation levels. Furthermore, the central government emphasizes that setting up national high-tech zones should consider regional resource endowments and local conditions, implementing tailored policies. The central and western regions possess unique geographic locations and natural conditions that make them well-suited for developing solar energy, wind energy, and other forms of green energy. Compared to the central region, the national high-tech zone initiative has a more pronounced impact on promoting urban green innovation in the western region. While further optimization is needed for the western region’s urban innovation environment, the policy on national high-tech zones has a more substantial incentive effect in this region due to its greater development potential, positive transformation of industrial structure, and increased policy support from the state, including the development strategy for the western region.


Table 7 . Heterogeneity test results for different regions.

4.3.2 Urban hierarchical heterogeneity

The New Tier 1 Cities Institute’s ‘2020 City Business Charm Ranking’ is the basis for this study, with the sample cities categorized into Tier 1 (New Tier 1), Tier 2, Tier 3, Tier 4, and Tier 5. Table 8 presents the regression findings for each of the groups.


Table 8 . Heterogeneity test results for different classes of cities.

The results in Table 8 reveal significant heterogeneity at the city level regarding the effects of national high-tech zones on urban green innovation, confirming Hypothesis 2. In particular, the coefficients for the first-tier cities are not statistically significant due to the small sample size, and the same applies to the fifth-tier cities. This could be attributed to the relatively weak economy and infrastructure development issues in the fifth-tier cities. Additionally, due to their limited level of development, the fifth-tier cities may have a relatively homogeneous industrial structure, with a dominance of traditional industries or agriculture and a lack of a more diversified industrial layout. National high-tech zones have not greatly aided the development of green innovation in these cities. In contrast, national high-tech zone policies in second-tier, third-tier, and fourth-tier cities have a noteworthy favorable impact on green innovation, indicating their favorable influence on enhancing green innovation in these cities. Despite the lower level of economic development in fourth-tier cities compared to second-tier and third-tier cities, the fourth-tier cities’ national high-tech zones have the most pronounced impact on promoting green innovation. This could be attributed to the ongoing transformation of industries in fourth-tier cities, which are still in the technology diffusion and imitation stage, allowing these cities’ national high-tech zones to maintain a high marginal effect. Thus, Hypothesis 2 is supported.

5 Further analysis

The empirical findings show that establishing national high-tech zones significantly raises the level of urban green innovation. It is therefore essential to understand the underlying factors and mechanisms behind this positive effect. This paper constructs a moderating effect test model based on Eqs 5, 6 and provides a detailed discussion by introducing transportation infrastructure as a moderating variable.
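
Eqs 5, 6 are defined earlier in the paper. As a rough illustration of how a moderation test of this kind can be estimated, the sketch below regresses a simulated green innovation measure on the policy dummy, the moderator, and their interaction, with year effects and city-clustered standard errors. The variable names (lngi, zone, lntra) and the simulated data are illustrative assumptions, not the authors' exact specification.

```python
# Hedged sketch of a moderation (interaction-term) test; not the paper's exact Eqs 5-6.
# All variable names are illustrative placeholders for the paper's measures.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_city, n_year = 50, 10
df = pd.DataFrame({
    "city": np.repeat(np.arange(n_city), n_year),
    "year": np.tile(np.arange(2007, 2007 + n_year), n_city),
})
df["zone"] = (df["city"] < 25).astype(float)   # policy dummy (simulated)
df["lntra"] = rng.normal(size=len(df))         # transportation infrastructure (simulated)
df["lngi"] = 0.4 * df["zone"] - 0.1 * df["zone"] * df["lntra"] + rng.normal(size=len(df))

# 'zone * lntra' expands to zone + lntra + zone:lntra; the interaction coefficient
# captures how transportation infrastructure moderates the policy effect.
model = smf.ols("lngi ~ zone * lntra + C(year)", data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["city"]}
)
print(model.params["zone:lntra"])  # a negative estimate indicates negative moderation
```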

The empirical results for the moderating effect of transportation infrastructure are shown in Table 9. The interaction term Zone*lntra between the policy dummy and transportation infrastructure is significantly negative at the 5% level, suggesting that transportation infrastructure negatively moderates the impact of national high-tech zone policies on urban green innovation. This result deviates from the general expectation, but it aligns with the complexity of the role played by transportation infrastructure in modern economic development, as discussed in the preceding theoretical analysis. One explanation is that, at the current stage, the green innovation benefits generated by the national high-tech zone policy are insufficient to compensate for the excessive resource consumption and environmental pollution caused by zone construction. Furthermore, transportation infrastructure can lead to an excessive concentration of similar enterprises in the high-tech zones. This concentration creates a relative crowding effect that intensifies competition among enterprises, diminishes their inclination to engage in green innovation collaboration and investment, and hinders the effective implementation of technological research and development activities. Moreover, the excessive clustering of similar enterprises implies a lack of diversity in the green innovation activities of firms located in national high-tech zones, which results in duplicated green innovation outputs and hinders the advancement of green innovation. Thus, Hypothesis 3 is supported.

Table 9. Empirical results of moderating effects.

6 Conclusion and policy recommendations

6.1 Conclusion

Based on panel data for 163 prefecture-level cities in China from 2007 to 2019, the net effect of establishing national high-tech zones on urban green innovation was analyzed using a double machine learning model. The results show, first, that the national high-tech zone policy significantly raises the level of local green innovation, and this finding remains robust after accounting for various factors that could affect the estimates. Second, the national high-tech zone policy positively affects urban green innovation in the central and western regions, whereas its impact is insignificant in the eastern region; the effect is stronger in the western region than in the central region. Across city tiers, the policy has a more substantial impact on green innovation in fourth-tier cities than in second-tier and third-tier cities. Third, the moderating effect test shows that transportation infrastructure construction weakens the promotional effect of national high-tech zones on urban green innovation.
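
For readers unfamiliar with the estimator, the sketch below shows the generic partialling-out form of double machine learning for a partially linear model: nuisance functions are cross-fitted with machine learners, and the treatment effect is recovered from the residualized outcome and treatment. The simulated data, learner choices, and variable roles are illustrative assumptions, not the authors' exact implementation.

```python
# Hedged sketch of partially linear double machine learning (partialling-out with cross-fitting).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 10))                            # controls (simulated)
D = (X[:, 0] + rng.normal(size=n) > 0).astype(float)    # treatment: zone dummy (simulated)
Y = 0.5 * D + X[:, 0] + rng.normal(size=n)              # outcome: green innovation (simulated)

# Stage 1: cross-fitted predictions of the outcome and the treatment from the controls.
m_hat = cross_val_predict(RandomForestRegressor(n_estimators=200, random_state=0), X, Y, cv=5)
g_hat = cross_val_predict(RandomForestRegressor(n_estimators=200, random_state=0), X, D, cv=5)

# Stage 2: the policy effect is the OLS slope of the Y residuals on the D residuals.
u, v = Y - m_hat, D - g_hat
theta = np.sum(v * u) / np.sum(v * v)
print(f"estimated policy effect: {theta:.3f}")  # should be close to the true value of 0.5
```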

6.2 Policy recommendations

To enable national high-tech zones to better promote China’s high-quality development, this paper proposes the following policy recommendations:

(1) Accelerate the establishment of national high-tech zones and create an atmosphere that supports innovation. Establishing national high-tech zones as testbeds for high-quality development and green innovation has significantly elevated urban green innovation, so cities can efficiently foster green innovation by supporting the development of these zones. Cities that have already established national high-tech zones should further encourage enterprises within the zones to increase their investment in research and development, strengthen the zones' leading role in urban green innovation, and, as pilot cities, serve as models and leaders. Additionally, mechanisms for cooperation and synergy between pilot cities and their neighboring cities should be established to promote collective green development in the region.

(2) Expand the pilot program and implement tailored policies based on local conditions. Industrial policies concerning national high-tech zones have differing effects on urban green innovation. Regions should leverage their comparative advantages, consider both the commonalities and the unique aspects of urban development, and foster a stable and sustainable green innovation ecosystem. The central and western regions should prioritize constructing and enhancing new infrastructure and bolster support for the high-tech green industry. The western region should seize the opportunity presented by national policies that prioritize support for it, quicken the pace of green innovation, and progressively bridge the gap with the eastern and central regions. Furthermore, second-tier, third-tier, and fourth-tier cities should build on the advantages of national high-tech zone policies and keep green innovation at an elevated level. Regions facing challenges in green innovation, particularly fifth-tier cities, should learn from the development experiences of advanced regions with national high-tech zones to compensate for their deficiencies in green innovation.

(3) Emphasize transportation regulation and enhance collaboration in green innovation. First, transportation infrastructure should be used to its full potential to strengthen coordination and cooperation among regions, facilitate the smooth movement of innovative talent across regions, and promote the rational sharing of innovation resources, collectively enhancing green innovation. In addition, attention should be given to the industrial clustering effect of the zones to prevent the resource waste and inefficiency that result from the excessive clustering of similar industries. Efforts should focus on harnessing the latent potential of key transportation infrastructure areas as long-term drivers of development, promptly mitigating the negative impact of transportation infrastructure construction, and gradually achieving the synergistic promotion of national high-tech zone construction and urban green innovation, among other overarching objectives.

6.3 Limitations and future research

Our study has some limitations because the research is conducted in the institutional context of China; for example, not all countries are suited to implementing similar industrial policies that develop the economy while focusing on environmental protection. Nevertheless, the study remains relevant in that it encourages a sharper focus on environmental protection from an industrial policy perspective. The research process also has certain limitations. First, the urban green innovation index was measured by the number of green patent authorizations; future studies could consider other dimensions of green innovation, such as the quality of the green patents granted. Second, the paper employs machine learning techniques for causal inference; subsequent investigations could delve further into the potential applications of machine learning algorithms in the environmental sciences to maximize the benefits of innovative research methodologies.

Data availability statement

The raw data supporting the conclusion of this article will be made available by the authors, without undue reservation.

Author contributions

WC: Conceptualization, Data curation, Formal Analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Visualization, Writing–review and editing. YJ: Conceptualization, Data curation, Formal Analysis, Investigation, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing–original draft, Writing–review and editing. BT: Investigation, Project administration, Writing–review and editing.

Funding

The authors declare that financial support was received for the research, authorship, and/or publication of this article. This research was supported by the Youth Fund for Humanities and Social Science Research of the Ministry of Education (20YJC790004).

Acknowledgments

The authors are grateful to the editors and the reviewers for their insightful comments.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.


Keywords: national high-tech zone, industrial policy, green innovation, heterogeneity analysis, moderating effect, double machine learning

Citation: Cao W, Jia Y and Tan B (2024) Impact of industrial policy on urban green innovation: empirical evidence of China’s national high-tech zones based on double machine learning. Front. Environ. Sci. 12:1369433. doi: 10.3389/fenvs.2024.1369433

Received: 12 January 2024; Accepted: 15 March 2024; Published: 04 April 2024.


Copyright © 2024 Cao, Jia and Tan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Yu Jia, [email protected]

Serious games in high-stakes assessment contexts: a systematic literature review into the game design principles for valid game-based performance assessment

  • Research Article
  • Open access
  • Published: 08 April 2024


  • Aranka Bijl   ORCID: orcid.org/0000-0001-5745-1396 1 , 2 , 3 ,
  • Bernard P. Veldkamp 2 ,
  • Saskia Wools 3 &
  • Sebastiaan de Klerk 3  

This systematic literature review (1) investigates whether ‘serious games’ provide a viable solution to the limitations posed by traditional high-stakes performance assessments and (2) aims to synthesize game design principles for the game-based performance assessment of professional competencies. In total, 56 publications were included in the final review; they target knowledge, motor skills, and cognitive skills, narrowed down to the teaching, training, or assessment of professional competencies. Our review demonstrates that serious games are able to provide an environment and task authentic to the target competency. The in-game behaviors collected indicate that serious games are able to elicit behavior that is related to a candidate's ability level. Progress feedback and freedom of gameplay can be implemented in serious games to provide an engaging and enjoyable environment for candidates. Few studies examined adaptivity, and some examined serious games without an authentic environment or task. Overall, the review gives an overview of game design principles for game-based performance assessment, highlights two research gaps regarding authenticity and adaptivity, and concludes with three implications for practice.


In the years since their first introduction (ca. 1950s), videogames have only increased in popularity. In education, videogames are already widely applied as tools to support students in learning (cf. Boyle et al., 2016 ; Ifenthaler et al., 2012 ; Young et al., 2012 ). In contrast, less research has been done on the use of videogames as summative assessment environments, even though administering (high-stakes) summative assessments through games has several advantages.

First, videogames can be used to administer standardized assessments that provide richer data about candidate ability in comparison to traditional standardized assessments (e.g., multiple-choice tests; Schwartz & Arena, 2013 ; Shaffer & Gee, 2012 ; Shute & Rahimi, 2021 ). Second, assessment through videogames gives considerable freedom in recreating real-life criterion situations, which allows for authentic, situated assessment even when this is not feasible in the real working environment (Bell et al., 2008 ; Dörner et al., 2016 ; Fonteneau et al., 2020 ; Harteveld, 2011 ; Kirriemur & McFarlane, 2004 ; Michael & Chen, 2006 ). Third, videogames can offer candidates a more enjoyable test experience by providing an engaging environment where they are given a high degree of autonomy (Boyle et al., 2012 ; Jones, 1998 ; Mavridis & Tsiatsos, 2017 ). Finally, videogames allow for assessment through in-game behaviors (i.e., stealth assessment), which intends to make assessment less salient for candidates and lets them retain engagement (Shute & Ke, 2012 ; Shute et al., 2009 ).

The benefits above highlight why videogames are viable assessment environments, irrespective of the specific level of cognitive achievement (e.g., those depicted in Bloom’s revised taxonomy; Krathwohl, 2002 ). Moreover, the possibility for immersing candidates in complex, situated contexts make them especially interesting for higher-order learning outcomes such as problem solving and critical thinking (Dede, 2009 ; Shute & Ke, 2012 ). Therefore, videogames may provide a solution to the validity threats associated with traditional high-stakes performance assessments: an assessment type to evaluate competencies through a construct-relevant task in the context for which it is intended (Lane & Stone, 2006 ; Messick, 1994 ; Stecher, 2010 ), often used for the purpose of vocational certification.

The first validity threat associated with high-stakes performance assessments is the prevalence of test anxiety among candidates (Lane & Stone, 2006 ; Messick, 1994 ; Stecher, 2010 ), which is shown to be negatively correlated to test performance (von der Embse et al., 2018 ; von der Embse & Witmer, 2014 ). Although some debate exists about the causal relationship between the two (Jerrim, 2022 ; von der Embse et al., 2018 ), it is apparent that candidates who experience test anxiety are unfairly disadvantaged in high-stakes assessment contexts.

The second threat arises from the need for high-stakes performance assessments to be both standardized, to ensure objectivity and fairness (AERA et al., 2014; Kane, 2006), and built around a construct-relevant task (e.g., writing an essay, participating in a roleplay; Lane & Stone, 2006; Messick, 1994). While neither requirement rules out adaptivity (e.g., adaptive testing and open-ended assessments), their combination often restricts assessment to a linear performance task that is not adapted to candidate ability level. The potential mismatch between task difficulty and candidate ability poses two disadvantages. First, the mismatch can frustrate candidates, which negatively affects their test performance (Wainer, 2000). Second, candidates likely receive fewer tasks that align with their ability level, which negatively affects test reliability and efficiency (Burr et al., 2023). High-stakes performance assessments would thus benefit from adaptive testing that is personalized and appropriately difficult, allowing candidates to be challenged enough to retain engagement (Burr et al., 2023; Malone & Lepper, 1987; Van Eck, 2006) while assessors are able to determine efficiently and reliably whether the candidate is at the required level (Burr et al., 2023; Davey, 2011). Additionally, adaptive testing allows for more personalized (end-of-assessment) feedback that could further boost candidate performance (Burr et al., 2023; Martin & Lazendic, 2018).

The third threat identified in high-stakes performance assessment is a lack of assessment authenticity. Logically, assessment would best be administered in the authentic context (i.e., the workplace, in the case of professional competencies). This yields a high degree of fidelity: how closely the assessment environment mirrors reality (Alessi, 1988, as cited in Gulikers et al., 2004). Unfortunately, this is not attainable for competencies that are dangerous or unethical to carry out (Bell et al., 2008; Williams-Bell et al., 2015). Another concern is that workplace assessments are largely dependent on the particular workplace in which they are carried out. This would lead to considerable variation between candidates in testing conditions, but also in the construct relevance of the tasks on which they are evaluated (Baartman & Gulikers, 2017). Since authenticity of the physical context and of the task are two dimensions required for mobilizing the competencies of interest (Gulikers et al., 2004), there is a need to achieve authenticity in other ways. Authenticity is also related to transfer: applying what is learned to new contexts. The closer the alignment between assessment and reality, the more likely it is that competence transfers to professional practice.

The fourth threat identified is inconsistency between raters in scoring candidate performance. Traditional high-stakes performance assessments are often accompanied by rubrics for evaluating candidate performance; however, inconsistencies in how rubrics are interpreted and used lead to construct-irrelevant variance (Lane & Stone, 2006; Wools et al., 2010). In this study, the aim is to investigate whether ‘serious games’ (SGs), those “used for purposes other than mere entertainment” (Susi et al., 2007; p. 1), provide a viable solution to this and the other limitations posed by traditional high-stakes performance assessments.

The most important characteristic of games is that they are played with a clear goal in mind. Many games have a predetermined goal, while others allow players to define their own objectives (Charsky, 2010; Prensky, 2001). Goals are given structure through rules, choices, and feedback (Lameras et al., 2017). First, rules direct players towards the goal by placing restrictions on gameplay (Charsky, 2010). Second, choices enable players to make decisions, for example between different strategies for attaining the goal (Charsky, 2010). The extent to which rules restrict gameplay is closely related to the choices players have in the game (Charsky, 2010); rules and choices thus seem to sit on two ends of a continuum that determines the linearity of a game, defined as the extent to which players are given freedom of gameplay (Kim & Shute, 2015; Rouse, 2004). The third characteristic, feedback, is a well-studied topic in education, where its main purpose is to give students insight into their learning and bring their understanding to the level of the learning goals (Hattie & Timperley, 2007; Shute, 2008; van der Kleij et al., 2012). In games, feedback is used in a similar way to guide players towards the goal and to facilitate interactivity (Prensky, 2001). Feedback in games is provided in many modalities and informs players about how they are progressing and where they stand with regard to the goal, for instance whether their actions have brought them closer to it or further away.

Games are made up of a collection of game mechanics that define the game and determine how it is played (Rouse, 2004; Schell, 2015). In other words, game mechanics are how the defining features of games are translated into gameplay. To illustrate, game mechanics that provide feedback to players can include hints, gaining or losing lives, progress bars, dashboards, currencies, and/or progress trees (Lameras et al., 2017).

When designing a game-based performance assessment, the information that should be collected about candidates to inform judgments of competence, and the tasks designed to fulfill that information need, should be considered carefully for each professional competency. One way to do this is through the evidence-centered design (ECD) framework (cf. Mislevy & Riconscente, 2006). The ECD framework is a systematic approach to test development that relies on evidentiary arguments to move from a candidate's behavior on a task to inferences about candidate ability. It is beyond the scope of the current study to examine the design of game content in relation to the target professional competencies. In this systematic literature review, the aim is to determine which game mechanics could help overcome the validity threats associated with high-stakes performance assessments and are suitable for use in such assessments.

Previous research on game design has focused on instructional SGs (e.g., dos Santos & Fraternali, 2016; Gunter et al., 2008). For SGs used in high-stakes performance assessments, particular attention should be paid to the potential effect of game mechanics on the validity of inferences. For instance, choices in game design can affect the correlations between in-game behavior and player ability (Kim & Shute, 2015). Moreover, some game mechanics are likely to introduce construct-irrelevant variance when used in high-stakes performance assessments. To illustrate, when direct feedback about performance (e.g., points, lives, feedback messages) is given to players, at least part of the variance in test scores would be explained by the type and amount of feedback a candidate has received.

Establishing design principles for SGs for high-stakes performance assessment is important for several reasons. First, such an overview allows future developers of such assessments to make more informed choices regarding game design. Second, combining and organizing the insights gained from the available empirical evidence advances the knowledge framework around the implementation of high-stakes performance assessment through games. Reviews on the use of games exist for learning (e.g., Boyle et al., 2016; Connolly et al., 2012; Young et al., 2012) or are targeted at specific professional domains (e.g., Gao et al., 2019; Gorbanev et al., 2018; Graafland et al., 2012; Wang et al., 2016). Nevertheless, a research gap remains: to our knowledge, no systematic literature review addresses the high-stakes performance assessment of professional competencies. To this end, this study begins by identifying the available literature on SGs targeted at professional competencies, then extracts the implemented game mechanics that could help to overcome the validity threats associated with high-stakes performance assessment, and finally synthesizes game design principles for game-based performance assessment in high-stakes contexts.

The scope of the current review is limited to professional competencies specifically catered to a vocation (e.g., construction hazard recognition). More generic professional competencies (e.g., programming) are not taken into consideration, as the context in which they are used can also fall outside of secondary vocational and higher education. Additionally, there is a growing body of literature that recognizes the potential of in-game behavior as a source of information about ability level in the context of game-based learning (e.g., Chen et al., 2020 ; Kim & Shute, 2015 ; Shute et al., 2009 ; Wang et al., 2015 ; Westera et al., 2014 ). As the relationship between in-game behavior and candidate ability is of equal importance in assessment, the scope of the current review includes SGs that focus not only on assessment, but also teaching and training of professional competencies.

The following section describes the procedure followed in conducting the current systematic literature review. First, a description of the inclusion criteria and search terms is given, followed by a description of the selection process and data extraction, together with an evaluation of the objectivity of the inclusion and quality criteria. Then, the search and selection results are presented, where two further categorizations of the included studies are operationalized: the type of competency and how a successful SG is defined.

Following the guidelines described in Systematic Reviews in the Social Sciences (Petticrew & Roberts, 2005 ), the protocol below gives a description and the rationale behind the review along with a description of how different studies were identified, analyzed, and synthesized.

Databases and search terms

The databases that include most publications from the field of educational measurement (Education Resources Information Center (ERIC), PsycInfo, Scopus, and Web of Science) were consulted for the literature search using the following search terms:

Serious game : (serious gam* or game-based assess* or game-based learn* or game-based train*) and

Quality measure : (perform* or valid* or effect* or affect*)

Inclusion criteria and selection process

The initial search results were narrowed down by selecting only publications that were published in English in a scientific, peer-reviewed journal. To be included, studies were required to report the empirical results of a study that (1) focused on a digital SG used for teaching, training, or assessment of one or more professional competencies specific to a work setting, (2) was conducted in secondary vocational education, higher education, or vocational settings, and (3) included a measure to assess the dependent variable related to the quality of the SG. Studies were excluded when the focus was on simulations; although simulations overlap with SGs in their role in the acquisition of professional competencies, the two represent distinct types of digital environments.

All results from the databases were exported to EndNote X9 (The EndNote Team, 2013) for screening. The selection process was conducted in three rounds. In the first round, duplicates and alternative document types (e.g., editorials, conference proceedings, letters) were removed, and publications were screened based on their titles and abstracts; publications were removed when the title or abstract mentioned features of the study mutually exclusive with the inclusion criteria (e.g., primary school, rehabilitation, systematic literature review). In the second round, the titles and abstracts of the remaining results were screened again; when the title or abstract lacked information, the full article was inspected. To illustrate, some titles and abstracts did not mention the target population, whether the game was digital, or whether the professional competency was specific to a work setting. In the final round, full-text articles were screened for full compliance with the inclusion criteria, and data was extracted from the compliant publications.

The objectivity of the inclusion criteria was determined by blinded double classification on two occasions. On the first occasion, after the removal of duplicates and alternative document types, 30 randomly selected publications were independently double-classified on the basis of title and abstract by an expert in the field of educational measurement. An agreement rate of 93% with a Cohen's Kappa coefficient of .81 translated to near perfect inter-rater reliability (Landis & Koch, 1977). On the second occasion, a random selection of 32 publications considered for data extraction was blindly double-classified on the basis of the full text by a master's student in educational measurement, which resulted in an agreement rate of 97% with a near perfect Cohen's Kappa coefficient (.94; Landis & Koch, 1977).
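
For reference, Cohen's kappa is a standard statistic (not specific to this review) that corrects the observed agreement rate $p_o$ for the agreement $p_e$ expected by chance given each rater's marginal classification frequencies:

$$\kappa = \frac{p_o - p_e}{1 - p_e}$$

Applying this to the figures reported above, an observed agreement of 93% combined with $\kappa = .81$ implies a chance-expected agreement of roughly 63%.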

To assess the comprehensiveness of the systematic review and identify additional relevant studies, snowballing was conducted by backward and forward reference searching in Web of Science . For publications not available on Web of Science , snowballing was done in Scopus .

Data extraction

For the publications included, data was extracted systematically by means of a data extraction form (Supplementary Information SI1). The data extraction form includes: (1) general information, (2) details on the professional competency and research design, (3) serious game (SG) specifics and (4) a quality checklist.

The quality checklist contains 12 closed questions with three response options: the criterion is met (1), the criterion is partly met (.5), and the criterion is not met (0). Studies that scored 7 or below were considered to be of poor quality and were excluded. Studies that scored between 7.5 and 9.5 were considered to be of medium quality, while studies that scored 10 or above were considered to be of good quality (denoted with an asterisk in the data selection table; Supplementary Information SI2). These categories were determined by piloting the study quality checklist on two publications that met the inclusion criteria: one considered to be of poor quality and one considered to be of good quality. The scores obtained by those studies were set as the lower and upper thresholds, respectively.
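
As a minimal illustration of the scoring rule just described, the helper below simply restates the classification logic in code; the function name and structure are illustrative and not part of the review protocol.

```python
# Hedged sketch of the quality-checklist classification described above;
# illustrative only, not taken from the review protocol.
def classify_quality(item_scores):
    """item_scores: 12 values, each 0 (not met), 0.5 (partly met) or 1 (met)."""
    assert len(item_scores) == 12 and all(s in (0, 0.5, 1) for s in item_scores)
    total = sum(item_scores)
    if total <= 7:
        return "poor (excluded)"
    if total < 10:            # i.e., 7.5 to 9.5
        return "medium"
    return "good"             # 10 or above

print(classify_quality([1, 1, 0.5, 1, 1, 0.5, 1, 1, 1, 0.5, 1, 1]))  # -> good
```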

As this systematic literature review focuses on the extraction of game mechanics to inform game design principles, all articles included in the review needed to obtain a score of at least .5 on the criterion that the game is described in sufficient detail. When publications explicitly referred to external sources for additional information, information from those sources was included in the data extraction form as well.

Blinded double coding to determine the reliability of the quality criteria for inclusion was done by the same raters described above. Twenty-four randomly selected publications from the final review were included, with varying overlap between the three raters. The assigned scores were translated to the corresponding class (i.e., poor, medium, or good) to calculate the agreement rate. The rates ranged between 82 and 93%, corresponding to Cohen's Kappa coefficients between substantial and near perfect (.66–.88; Landis & Koch, 1977; Table 1).

Search and selection results

In the PRISMA flow diagram of the publication selection process (Fig.  1 ; Moher et al., 2009 ), the two rounds in which titles and abstracts were screened for eligibility are combined. The databases were consulted on the 21st of December 2020 and yielded a total of 6,128 publications. After the removal of duplicates, 3,160 publications were left. On the basis of the inclusion criteria, another 2,981 publications were excluded from the review. In total, data was extracted from 179 publications. During the examination of the full-text articles, 129 studies were excluded due to insufficient quality (n = 42), lack of a detailed game description (n = 6), unavailability of the article (n = 5), not classifying the application as a game (n = 10) and an overall mismatch with the inclusion criteria (n = 66). In total, 50 publications were included. Snowballing was conducted in November of 2021 and resulted in the inclusion of six additional studies. In total, 56 publications were included in the final review.

Figure 1. PRISMA flow diagram of inclusion of the systematic literature review (PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses).

Categorization of selected studies

Competency types

Professional competencies are acquired and assessed in different ways. Given the variety of professional competencies, there is no universal game design that is likely to be beneficial across the board (Wouters et al., 2009 ). Other researchers (e.g., Young et al., 2012 ) even suggest that game design principles should not be generalized across games, contexts or competencies. While more content-related game design principles likely need to be defined per context, this review is conducted with the idea that generic game design principles exist that can be successfully used in multiple contexts. In that sense, the aim is to provide a starting point from where more context-specific SGs can be designed, for example through the use of ECD.

The review is organized according to the type of professional competency that is evaluated rather than the content of the SG under investigation, as this provides an idea of what researchers expect to train or assess within the SG. Different distinctions between competencies can be made. For example, Wouters et al. ( 2009 ) distinguish between cognitive, motor, affective, and communicative competencies. Moreover, Harteveld ( 2011 ) distinguishes between knowledge, skills, and attitudes. These taxonomies served as a basis to inductively categorize the targeted professional competencies into knowledge, motor skills, and cognitive skills.

The knowledge category includes studies that focus on, for instance, declarative knowledge (i.e., fact-based knowledge) or procedural knowledge (i.e., knowing how to do something), such as the procedural steps involved in cardiopulmonary resuscitation (CPR). The motor skills category refers to motor behaviors (i.e., movements); for CPR, an example would be compression depth. The cognitive skills category encompasses skills such as reasoning, planning, and decision making, for example the recognition of situations that require CPR.

Successful SGs

The scope of this systematic literature review is limited to SGs that are shown to be successful in teaching, training, or the assessment of professional competencies. As research methodologies differ between studies, there is a need to define what characterizes a successful SG. When SGs were used in teaching or training, it was deemed successful when a significant improvement in the targeted professional competency was found (e.g., through an external validated measure of the competency). Some studies compared an active control group and an experimental group that additionally received an SG (e.g., Boada et al., 2015 ; Dankbaar et al., 2016 ; Graafland et al., 2017 ; see Supplementary Information SI2 for a full account): an SG was not deemed successful in the current results when such two groups showed comparable results. When SGs were used for assessment, it was deemed successful when (1) research results showed a significant relationship between the SG and a validated measure of the targeted competency, or (2) the SG was shown to accurately distinguish between different competency levels.

The studies included in the review are discussed in two ways. First, descriptives of the included studies are given in terms of the degree to which games were successful in teaching, training, or assessment of professional competencies, the professional domains, and the competency types. Then, the game mechanics associated with the potential solutions to the validity threats in traditional performance assessment are presented.

Descriptives of the included studies

The final review includes 56 studies, published between 2006 and 2020 (consult Supplementary Information SI2 for a more detailed overview). No noteworthy differences were found between the SGs that aimed to teach, train, and assess professional competencies. Therefore, the results for the SGs included in the review are presented collectively.

Serious games with successful results

Broken down by the type of professional competency evaluated, 84%, 83%, and 100% of studies reported research results showing the SG was successful for cognitive skills, knowledge, and motor skills, respectively (Table 2). Of the studies included in the systematic review, three found mixed effects of the SG under investigation across competency types (i.e., Luu et al., 2020; Phungoen et al., 2020; Tan et al., 2017).

Professional domains and competency types

The studies included in the review can be divided over seven professional domains (Table  3 ). These are further separated into professional competencies (see Supplementary Information SI2 for a full account). Examples include history taking (Alyami et al., 2019 ), crisis management (Steinrücke et al., 2020 ) and cultural understanding (Brown et al., 2018 ). Furthermore, the studies included in the review can be divided into three competency types: cognitive skills (n = 21), knowledge (n = 31), and motor skills (n = 4). An important note is that some studies evaluate the SG on more than one competency type, thus the sum of these categories is greater than the total number of studies included.

Game mechanics

The following section discusses the inclusion of game mechanics—all design choices within the game—for the SGs discussed in the studies included in the review. Following the aim of the current paper, the game mechanics discussed are selected for having the potential to (1) mediate the validity threats associated with traditional performance assessments, and (2) be appropriate for implementing in a game-based performance assessment.

Authenticity

Authenticity in the SGs is divided into two dimensions: authenticity of the physical context and authenticity of the task. First, an example of a physical context that was not representative of the real working environment was found for each of the three competency types (Table 4). Among the SGs targeted at cognitive skills, this was the case for Effic’ Asthme (Fonteneau et al., 2020). In this SG, the target population, medical students, would normally manage pediatric asthma exacerbation in a hospital setting, yet the game environment is the virtual bedroom of a child. Among the SGs targeted at knowledge, Alyami et al. (2019) implemented the game Metaphoria to teach history taking content to medical students; here, the game environment is inside a pyramid within a fantasy world. The final SG with a game environment that does not resemble the real working environment, in the motor skills competency type, was studied by Jalink et al. (2014); in this SG, laparoscopic skills are trained by having players perform tasks in an underground mining environment.

Second, of the studies for which task authenticity could be determined, all but four included a task authentic to the professional competency targeted (Table 5). Examples of a task that was not authentic were found for all three competency types. Two SGs that targeted cognitive skills did not include an authentic task (Brown et al., 2018; Chee et al., 2019) as a result of implementing role reversals: players played in a reversed role, so the task was not authentic to the task in the real working environment. One SG targeting knowledge did not include an authentic task (Alyami et al., 2019): in Metaphoria, the task for players is to interpret visual metaphors in relation to symptoms, whereas the target professional competency was history taking content. Finally, in the SG studied by Drummond et al. (2017), targeting motor skills, the professional competency under investigation was not represented authentically within the game, as navigation was through point-and-click.

Unobtrusive data collection

For all three competency types, studies were found that use in-game data to make inferences about player ability (Table  6 ). While other studies did mention the collection of in-game behaviors, the results were limited to those that assessed the appropriateness of using the data in the assessment of competencies.

Different measures of in-game behaviors were found. First, 12 SGs determine competency by comparing player performance to some predetermined target, sometimes also translated to a score. In the game VERITAS (Veracity Education and Reactance Instruction through Technology and Applied Skills; Miller et al., 2019 ), for instance, players are assessed on whether they accurately assess whether the statement given by a character in the game is true or false. Second, seven SGs use time spent (i.e., completion time or playing time) as a measure of performance. For example, in the SG Wii Laparoscopy (Jalink et al., 2014 ), completion time is used to assess performance. This performance metric in the game showed a high correlation with performance on a validated measure for laparoscopic skills, but it should be noted that time penalties were included for mistakes made during the task. Finally, the use of log data was found in one SG targeted at cognitive skills (Steinrücke et al., 2020 ). In the Dilemma Game, in-game measures collected during gameplay were found to have promising relationships with competency levels.

Adaptivity

In SGs, the difficulty level can be adapted in two ways: independent of the actions of players or dependent on the actions of players (Table 7). Whereas SGs that varied in difficulty level were found for professional competencies related to both knowledge and motor skills, none were found for professional competencies related to cognitive skills. Three SGs were found that adjust the difficulty level based on player actions; however, none of them adjusts the difficulty level downward based on player actions. Three studies evaluated SGs in which the difficulty level was varied independently of player actions. Among the SGs targeted at knowledge, players either received fixed assignments (Boada et al., 2015) or were able to set the difficulty level prior to gameplay (Taillandier & Adam, 2018). The SG studied by Asadipour et al. (2017), targeting motor skills, increased the challenge by building up the flying speed during the game and by randomly generating coins, but this was independent of player ability. Two SGs targeted at knowledge did mention difficulty levels, but not how they were adjusted: the SG Metaphoria (Alyami et al., 2019) included three difficulty levels, and the SG Sustainability Challenge (Dib & Adamo-Villani, 2014) became more challenging as players progressed to higher levels, but it is not clear when or how this was done.

Test anxiety

As described earlier, games are able to provide a more enjoyable testing experience by offering an engaging environment with a high degree of autonomy. Therefore, the way the game characteristics (feedback, rules, and choices) are expressed in the studies included in the review is discussed below. To avoid confusion with the linearity of assessment, the expression ‘freedom of gameplay’ is used to describe the interaction between rules and choices.

First, seven examples were found where players are given feedback unrelated to performance (Table 8). Ways in which such feedback was given include a dashboard (Perini et al., 2018), remaining resources (Calderón et al., 2018; Taillandier & Adam, 2018), remaining time (Calderón et al., 2018; Dankbaar et al., 2017a, 2017b; Mohan et al., 2014), and remaining tasks (Jalink et al., 2014).

Second, all but two of the studies included in the review feature game mechanics that give some freedom of gameplay (Table 9). For cognitive skills and knowledge, these game mechanics included the choice between multiple options (n = 14 for both), interactive elements (n = 8 for both), and the possibility of free exploration (n = 5 and n = 8, respectively). Two examples of customization were found: Dib and Adamo-Villani (2014) gave players the choice of avatar, whereas Alyami et al. (2019) allowed for a custom name. For the SGs that target motor skills, freedom of gameplay was given through control over the movements; for three of the four SGs in this category, special controllers were developed to give players authentic control over the movements in the game. This was not the case for Drummond et al. (2017), as their game did not explicitly train CPR; however, the researchers did assess its effect on motor skills.

Included studies

The final review included 56 studies, many of which reported positive results. This suggests that SGs are often successful in teaching, training, or assessing professional competencies, but it could also point to a publication bias towards positive results. As reviews similar to the current one (e.g., Connolly et al., 2012; Randel et al., 1992; Vansickle, 1986; Wouters et al., 2009) draw on similar databases, it is difficult to establish which is the case. Some studies found mixed results for different competency types, suggesting that different approaches are warranted. Therefore, game mechanics in SGs for different competency types are discussed separately.

The review included few studies on SGs targeting motor skills compared to those targeting cognitive skills and knowledge. The low number of SGs for motor skills could be due to the specialized equipment needed to create such an SG. For example, Wii Laparoscopy (Jalink et al., 2014) is played using controllers that were specifically designed for the game. Not only does this require an extra investment, it also affects the ease of large-scale implementation. There is no indication that motor skills cannot be assessed through SGs: four out of five studies showed positive effects, both in learning effectiveness and in assessment accuracy. Despite this, the benefits may only outweigh the added costs in situations where it is unfeasible to perform the professional competency in the real working environment.

Focusing on game mechanics for the authenticity of the physical context and the task, the results indicate that SGs are able to provide both. It should be noted that, while SGs are able to simulate the physical context and task with high fidelity, authenticity remains a matter of perception (Gulikers et al., 2008 ). The review focused only on those SGs that were successful when compared to validated measures of the targeted professional competency. Since these measures are considered to be accurate proxies for workplace performance, the transfer to the real working environment is likely to have been made. For all three competency types, examples were found for SGs that did not include an authentic physical context or authentic task, while still mobilizing competencies of interest. Even though the number of SGs in these categories is quite small, it does indicate that it is possible to assess professional competencies without an authentic environment or task.

The in-game measures most often used in the included SGs are those that indicate how well a player did in comparison to some standard or target. This suggests that SGs are able to elicit behavior in players that is dependent on their ability level in the target professional competency. Since the accuracy measures varied depending on the professional competency, an investigation is warranted to determine which in-game measures are indicative of ability per situation. Evidentiary frameworks such as the ECD framework can provide guidance in determining which data could be used to make inferences about candidate ability. Despite the promising results, more research should be done on the informational value of log data before claims can be made.
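
As a concrete (and purely hypothetical) illustration of how log data could be aggregated into evidence about ability in the spirit of an ECD-style framework, consider the sketch below. The event names, fields, and aggregation choices are assumptions for illustration only and are not taken from the reviewed studies.

```python
# Minimal sketch: aggregating hypothetical gameplay log events into
# candidate-level evidence variables, loosely following the idea behind
# evidence-centered design (observable behaviors -> evidence -> ability claims).
# All event names, fields, and thresholds are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class LogEvent:
    action: str      # e.g., "check_vitals", "administer_drug" (hypothetical)
    correct: bool    # whether the action matched the scenario's target standard
    seconds: float   # time taken for the action

def ability_indicators(events: list[LogEvent]) -> dict[str, float]:
    """Aggregate raw log data into simple candidate-level evidence variables."""
    if not events:
        return {"accuracy": 0.0, "mean_time": 0.0}
    accuracy = sum(e.correct for e in events) / len(events)
    mean_time = sum(e.seconds for e in events) / len(events)
    return {"accuracy": accuracy, "mean_time": mean_time}

# Example: two logged actions from one hypothetical scenario.
log = [LogEvent("check_vitals", True, 12.5), LogEvent("administer_drug", False, 30.0)]
print(ability_indicators(log))  # {'accuracy': 0.5, 'mean_time': 21.25}
```

Which indicators are actually informative would, as noted above, depend on the professional competency and the specific game design.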

Some studies were found in which the SG's difficulty was adaptive. In addition, some promising relationships between in-game behaviors and ability level were found. In traditional (high-stakes) testing, adaptivity has already been implemented successfully (Martin & Lazendic, 2018; Straetmans & Eggen, 2007). Admittedly, there are professional competencies for which ability levels cannot be differentiated: one is either able to perform the task or not. For such competencies, adaptivity offers no added benefit. In contrast, for professional competencies where ability levels can be differentiated, adaptivity should be considered.
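
One simple way adaptivity could be realized in an SG is a staircase-style rule that raises or lowers scenario difficulty based on recent performance. The sketch below is illustrative only: it is not drawn from the reviewed SGs, and operational adaptive tests typically rely on IRT-based item selection rather than this heuristic.

```python
# Minimal sketch (illustrative only): a "staircase" rule that steps scenario
# difficulty up after a strong performance and down after a weak one, as one
# possible way to match task difficulty to a candidate's ability level.

def next_difficulty(current: int, last_score: float,
                    min_level: int = 1, max_level: int = 5,
                    pass_mark: float = 0.7) -> int:
    """Return the next difficulty level given the score on the last scenario."""
    if last_score >= pass_mark:
        return min(current + 1, max_level)
    return max(current - 1, min_level)

level = 3
for score in [0.8, 0.9, 0.4, 0.75]:   # hypothetical per-scenario scores
    level = next_difficulty(level, score)
    print(f"score={score:.2f} -> next difficulty level {level}")
```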

Considering the appropriateness of game mechanics for high-stakes assessment, the feedback considered in the current review was limited to progress feedback. This adds a fourth type to the feedback types already recognized for assessment: knowledge of correct response, elaborated feedback, and delayed knowledge of results (van der Kleij et al., 2012). Although the small number of SGs that incorporated progress feedback limits the generalizability of the finding, it does indicate that feedback about progress may be the most appropriate option for high-stakes settings.

Freedom of gameplay

A variety of game mechanics implemented in the SGs included in the review provide freedom of gameplay. While some studies did not elaborate on the choices given in the game, common ways of giving players freedom are choice options, interactive elements, and freedom to explore. These game mechanics were found across various studies, which raises the possibility that the findings can be generalized to new SGs aimed at assessing professional competencies. Other game mechanics related to freedom of gameplay were found less frequently, so further research should examine their generalizability. Moreover, the freedom of gameplay provided to the player plays a substantial role in shaping overall player experience and behavior (Kim & Shute, 2015; Kirginas & Gouscos, 2017). Therefore, future research should also clarify whether different game mechanics influence players in different ways.

Limitations

Although the current systematic literature review provides a useful overview of the game design principles for game-based performance assessment of professional competencies, some limitations are identified.

First, the review covered a substantial number of studies from the healthcare domain. This may be because the medical field consists of many higher-order standardized tasks that are particularly suitable for SGs. The large share of healthcare studies could limit the generalizability of the results to other domains. However, the results of this systematic review were quite uniform; no indication was found that SGs in healthcare employed different game mechanics than SGs in other domains. Moreover, SGs are increasingly popular in healthcare education (Wang et al., 2016), which resulted in a higher number of available studies compared to other professional domains. It is advisable to regard the current results as a starting point for game design principles for game-based performance assessment. Further research into the generalizability of game design principles across professional domains is warranted.

The second limitation holds for all systematic literature reviews: the review is a cross-section of the literature and may not present the full picture. The inclusion of studies depends on what is available in the search databases, what is accessible, and what keywords are used in the literature. Likely due to this limitation, only studies published from 2006 onward are included in the review, while the use of SGs dates back much further (Randel et al., 1992; Vansickle, 1986). To minimize the omission of relevant literature, snowballing was conducted on the final selection of studies. This method allowed related and potentially relevant studies to be included. In total, six additional publications were included through this method out of the 2,370 considered.

After snowballing, an assessment of why these additionally included studies were not found through the original search yielded several insights. First, three studies used the term (educational) video game in their publication on SGs (Duque et al., 2008; Jalink et al., 2014; Mohan et al., 2017). Including this term in the original search would have produced too many hits outside the scope of the current review. Second, Moreno-Ger et al. (2010) used the term simulation to describe the application, but refer to it as game-like. As simulations fall outside the scope of the current review, the absence of this study in the initial search cannot be attributed to a gap in the search terms. Third, the publication by Blanié et al. (2020) was probably not found due to a mismatch in search terms related to the quality measure; additional search terms such as impact or improve could have been included. As only one additional study presented this issue, it is unlikely to have had a great effect on the outcome of the review. Finally, it is unclear why the study by Fonteneau et al. (2020) was not found through the initial search, as it matched the search terms used in the current review. Perhaps this omission can be ascribed to the search databases queried.

Finally, many of the studies included in the review compare SGs to other digital or non-digital alternatives in terms of learning. Such studies often include many confounding variables (Cook, 2005), because the compared interventions differ in more ways than one. These differences can affect the results in different ways: positively, negatively, or through interactions with other features.

Suggestions for future research

Besides providing interesting insights, the current review also has implications for research. First, the review identified SGs that were successful in teaching, training, or assessment even though they did not authentically represent the physical context or task; however, too few examples were found in this review to generalize this finding, so further study is needed. Second, while some studies were found in which the SG's difficulty was adaptive, more research should be conducted on the implementation of adaptivity within SGs, in particular on how in-game behavior can be used to match the difficulty level to the ability level of the candidates. Third, fantasy is included in many games (Charsky, 2010; Prensky, 2001) and is regarded as one of the reasons for playing them (Boyle et al., 2016). By including fantasy elements in game-based performance assessments, assessment can become even more engaging and enjoyable, and candidates can become even less aware of being assessed. For learning, it has been suggested that fantasy should be closely connected to the learning content (Gunter et al., 2008; Malone, 1981), but further research might explore whether this holds for SGs used for the (high-stakes) assessment of professional competencies. Furthermore, while fantasy elements may blur the direct link between the SG and professional practice, in-game behavior may still have a clear relationship with professional competencies (Kim & Shute, 2015; Simons et al., 2021). More research into the effect of authenticity on the measurement validity of SGs for assessing professional competencies is warranted.

Implications for practice

Based on the results of the review, four recommendations can be made for practice. First, regardless of the competency type, design the SG in such a way that both the task and the context are authentic. The results have shown that SGs are able to provide a representation of the physical context and task that is authentic to the professional competency under investigation. Thus, in situations where the physical context or assessment task is difficult to represent in a traditional performance assessment, SGs can provide a solution. At the same time, non-authentic (fantasy) contexts and tasks should be investigated further before being implemented in high-stakes performance assessment.

Second, ensure that in-game behavior within the SG is collected. This review has synthesized additional evidence for the potential of in-game behavior as a source of information about ability level. That being said, the in-game behavior that can be used to inform ability level depends on both the professional competency of interest and the game design. While no generalized design principles regarding the collection of gameplay data can be given, evidentiary frameworks (e.g., ECD) can be used to determine which in-game behavior can be used to infer ability level. This is ultimately connected to the implementation of adaptivity. While only a limited number of SGs were found that implemented adaptivity, the potential to unobtrusively collect data about ability level underscores a missed opportunity for the wider implementation of adaptivity in SGs. Taken together with the successful implementation of adaptive testing in traditional high-stakes assessments (Martin & Lazendic, 2018; Straetmans & Eggen, 2007), a third recommendation would be to implement adaptivity where appropriate.
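
In practical terms, collecting in-game behavior could be as simple as writing timestamped gameplay events to an append-only log so that evidence for ability claims can be extracted later. The sketch below is illustrative only; the event names, fields, and file format are assumptions, not a prescribed logging scheme from the reviewed studies.

```python
# Minimal sketch (illustrative): logging timestamped in-game events as JSON
# lines for later evidentiary analysis. Event names and fields are hypothetical.

import json
import time
from pathlib import Path

LOG_FILE = Path("gameplay_log.jsonl")  # assumed output location

def log_event(player_id: str, action: str, **details) -> None:
    """Append one gameplay event with a timestamp to the log file."""
    record = {"t": time.time(), "player": player_id, "action": action, **details}
    with LOG_FILE.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Example: two hypothetical events from a triage scenario.
log_event("cand-042", "select_patient", patient="P3")
log_event("cand-042", "assign_priority", patient="P3", priority="urgent")
```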

Finally, this review gives an overview of game mechanics that can be used in high-stakes game-based performance assessment with little risk of affecting validity. To provide freedom of gameplay in SGs targeting cognitive skills and knowledge, include free exploration, interactive elements, and choice options. For motor skills, giving control over movements is a, perhaps straightforward, game design principle. Furthermore, feedback in SGs for high-stakes performance assessments can be provided as progress feedback, which differs from traditional types of feedback in education (van der Kleij et al., 2012) but can still satisfy feedback as a game mechanic. These recommendations, intended for game developers, may prove useful in designing future SGs for the (high-stakes) assessment of professional competencies.

References

American Educational Research Association (AERA), American Psychological Association (APA), & National Council on Measurement in Education (NCME). (2014). Standards for educational and psychological testing . American Educational Research Association.

Baartman, L., & Gulikers, J. (2017). Assessment in Dutch vocational education: Overview and tensions of the past 15 years. In E. De Bruijn, S. Billet, & J. Onstenk (Eds.), Enhancing teaching and learning in the Dutch vocational education system: Reforms enacted (pp. 245–266). Springer.

Bell, B. S., Kanar, A. M., & Kozlowski, S. W. J. (2008). Current issues and future directions in simulation-based training in North America. The International Journal of Human Resource Management, 19 (8), 1416–1434. https://doi.org/10.1080/09585190802200173

Boyle, E., Hainey, T., Connolly, T. M., Gray, G., Earp, J., Ott, M., Lim, T., Ninaus, M., Ribeiro, C., & Pereira, J. (2016). An update to the systematic literature review of empirical evidence on the impacts and outcomes of computer games and serious games. Computers & Education, 94 , 178–192. https://doi.org/10.1016/j.compedu.2015.11.003

Boyle, E. A., Connolly, T. M., Hainey, T., & Boyle, J. M. (2012). Engagement in digital entertainment games: A systematic review. Computers in Human Behavior, 28 (3), 771–780. https://doi.org/10.1016/j.chb.2011.11.020

Burr, S., Gale, T., Kisielewska, J., Millin, P., Pêgo, J., Pinter, G., Robinson, I., & Zahra, D. (2023). A narrative review of adaptive testing and its application to medical education. MedEdPublish . https://doi.org/10.12688/mep.19844.1

Charsky, D. (2010). From edutainment to serious games: A change in the use of game characteristics. Games and Culture, 5 (2), 177–198. https://doi.org/10.1177/1555412009354727

Chen, F., Cui, Y., & Chu, M.-W. (2020). Utilizing game analytics to inform and validate digital game-based assessment with evidence-centered game design: A case study. International Journal of Artificial Intelligence in Education, 30 (3), 481–503. https://doi.org/10.1007/s40593-020-00202-6

Connolly, T. M., Boyle, E. A., MacArthur, E., Hainey, T., & Boyle, J. M. (2012). A systematic literature review of empirical evidence on computer games and serious games. Computers & Education, 59 (2), 661–686. https://doi.org/10.1016/j.compedu.2012.03.004

Cook, D. A. (2005). The research we still are not doing: An agenda for the study of computer-based learning. Academic Medicine, 80 (6), 541–548. https://doi.org/10.1097/00001888-200506000-00005

Davey, T. (2011). A guide to computer adaptive testing systems . Council of Chief State School Officers.

Dede, C. (2009). Immersive interfaces for engagement and learning. Science, 323 (5910), 66–69. https://doi.org/10.1126/science.1167311

Dörner, R., Göbel, S., Effelsberg, W., & Wiemeyer, J. (2016). Introduction. In R. Dörner, S. Göbel, W. Effelsberg, & J. Wiemeyer (Eds.), Serious games: Foundations, concepts and practice (pp. 1–34). Springer.

dos Santos, A. D., & Fraternali, P. (2016). A Comparison of methodological frameworks for digital learning game design. Lecture notes in computer science games and learning alliance. Springer.

Gao, Y., Gonzalez, V. A., & Yiu, T. W. (2019). The effectiveness of traditional tools and computer-aided technologies for health and safety training in the in the construction sector: A systematic review. Computers & Education, 138 , 101–115. https://doi.org/10.1016/j.compedu.2019.05.003

Gorbanev, I., Agudelo-Londoño, S., González, R. A., Cortes, A., Pomares, A., Delgadillo, V., Yepes, F. J., & Muñoz, Ó. (2018). A systematic review of serious games in medical education: Quality of evidence and pedagogical strategy. Medical Education Online, 23 (1), Article 1438718. https://doi.org/10.1080/10872981.2018.1438718

Graafland, M., Schraagen, J. M., & Schijven, M. P. (2012). Systematic review of serious games for medical education and surgical skills training. British Journal of Surgery, 99 (10), 1322–1330. https://doi.org/10.1002/bjs.8819

Gulikers, J. T. M., Bastiaens, T. J., & Kirschner, P. A. (2004). A five-dimensional framework for authentic assessment. Educational Technology Research and Development, 52 (3), 67. https://doi.org/10.1007/BF02504676

Gulikers, J. T. M., Bastiaens, T. J., Kirschner, P. A., & Kester, L. (2008). Authenticity is in the eye of the beholder: Student and teacher perceptions of assessment authenticity. Journal of Vocational Education and Training, 60 (4), 401–412. https://doi.org/10.1080/13636820802591830

Gunter, G. A., Kenny, R. F., & Vick, E. H. (2008). Taking educational games seriously: using the RETAIN model to design endogenous fantasy into standalone educational games. Educational Technology Research and Development, 56 (5), 511–537. https://doi.org/10.1007/s11423-007-9073-2

Harteveld, C. (2011). Foundations. Triadic Game design: balancing reality, meaning and play (pp. 31–93). Springer.

Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77 (1), 81–112. https://doi.org/10.3102/003465430298487

Ifenthaler, D., Eseryel, D., & Ge, X. (2012). Assessment in game-based learning: Foundations, innovations, and perspectives . Springer.

Jerrim, J. (2022). Test anxiety: Is it associated with performance in high-stakes examinations? Oxford Review of Education . https://doi.org/10.1080/03054985.2022.2079616

Jones, M. G. (1998). Creating engagement in computer-based learning environments . https://www.yumpu.com/en/document/read/18776351/creating-engagement-in-computer-based-learning-environments

Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 17–64). Praeger Publishers.

Kim, Y. J., & Shute, V. J. (2015). The interplay of game elements with psychometric qualities, learning, and enjoyment in game-based assessment. Computers & Education, 87 , 340–356. https://doi.org/10.1016/j.compedu.2015.07.009

Kirginas, S., & Gouscos, D. (2017). Exploring the impact of freeform gameplay on players’ experience: an experiment with maze games at varying levels of freedom of movement. International Journal of Serious Games . https://doi.org/10.17083/ijsg.v4i4.175

Kirriemur, J., & McFarlane, A. (2004). Literature review in games and learning . Sage.

Krathwohl, D. R. (2002). A revision of Bloom’s taxonomy: An overview. Theory Into Practice, 41 (4), 212–218. https://doi.org/10.1207/s15430421tip4104_2

Lameras, P., Arnab, S., Dunwell, I., Stewart, C., Clarke, S., & Petridis, P. (2017). Essential features of serious games design in higher education: Linking learning attributes to game mechanics. British Journal of Educational Technology, 48 (4), 972–994. https://doi.org/10.1111/bjet.12467

Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33 (1), 159–174.

Lane, S., & Stone, C. A. (2006). Performance assessment. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 387–431). Praeger Publishers.

Malone, T. W. (1981). Toward a theory of intrinsically motivating instruction. Cognitive Science, 4 , 333–369. https://doi.org/10.1207/s15516709cog0504_2

Malone, T. W., & Lepper, M. R. (1987). Making learning fun: A taxonomy of intrinsic motivations for learning. In R. E. Snow & M. J. Farr (Eds.), Aptitude, learning, and instruction: Conative and affective process analyses (pp. 223–253). Lawrence Erlbaum Associates, Inc.

Martin, A. J., & Lazendic, G. (2018). Computer-adaptive testing: Implications for students’ achievement, motivation, engagement, and subjective test experience. Journal of Educational Psychology, 110 , 27–45. https://doi.org/10.1037/edu0000205

Mavridis, A., & Tsiatsos, T. (2017). Game-based assessment: Investigating the impact on test anxiety and exam performance. Journal of Computer Assisted Learning, 33 (2), 137–150. https://doi.org/10.1111/jcal.12170

Messick, S. (1994). Alternative modes of assessment, uniform standards of validity. ETS Research Report Series, 1994 (2), i–22. https://doi.org/10.1002/j.2333-8504.1994.tb01634.x

Michael, D., & Chen, S. (2006). Serious games: Games that educate, train, and inform . Muska & Lipman/Premier-Trade.

Mislevy, R. J., & Riconscente, M. M. (2006). Evidence-centered assessment design. In S. M. Downing & T. M. Haladyna (Eds.), Handbook of test development (pp. 61–90). Lawrence Erlbaum Associates.

Moher, D., Liberati, A., Tetzlaff, J., Altman, D. G., & The PRISMA Group. (2009). Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLOS Medicine, 6 (7), e1000097. https://doi.org/10.1371/journal.pmed.1000097

Petticrew, M., & Roberts, H. (2005). Systematic reviews in the social sciences: A practical guide. Blackwell Publishing . https://doi.org/10.1002/9780470754887

Prensky, M. (2001). Fun, play and games: What makes games engaging? In M. Prensky (Ed.), Digital game-based learning (pp. 16–47). McGraw-Hill.

Randel, J. M., Morris, B. A., Wetzel, C. D., & Whitehill, B. V. (1992). The effectiveness of games for educational purposes: A review of recent research. Simulation & Gaming, 23 (3), 261–276. https://doi.org/10.1177/1046878192233001

Rouse, R. (2004). Game design: Theory and practice (2nd ed.). Jones and Bartlett Publishers, Inc.

Schell, J. (2015). The art of game design: A book of lenses (2nd ed.). CRC Press.

Schwartz, D. L., & Arena, D. (2013). Measuring what matters most: Choice-based assessment for the digital age . The MIT Press.

Shaffer, D. W., & Gee, J. P. (2012). The right kind of GATE: Computer games and the future of assessment. In M. C. Mayrath, J. Clarke-Midura, & D. H. Robinson (Eds.), Technology-based assessments for 21st century skills: Theoretical and practical implications from modern research (pp. 211–228). Information Age Publishing.

Shute, V. J. (2008). Focus on formative feedback. Review of Educational Research, 78 (1), 153–189. https://doi.org/10.3102/0034654307313795

Shute, V. J., & Ke, F. (2012). Games, learning, and assessment. In D. Ifenthaler, D. Eseryel, & X. Ge (Eds.), Assessment in game-based learning: foundations, innovations, and perspectives (pp. 43–58). Springer.

Shute, V. J., & Rahimi, S. (2021). Stealth assessment of creativity in a physics video game. Computers in Human Behavior, 116 , Article 106647. https://doi.org/10.1016/j.chb.2020.106647

Shute, V. J., Ventura, M., Bauer, M., & Zapata-Rivera, D. (2009). Melding the power of serious games and embedded assessment to monitor and foster learning: Flow and grow. In U. Ritterfeld, M. J. Cody, & P. Vorderer (Eds.), Serious games: Mechanisms and effects (pp. 295–321). Routledge.

Simons, A., Wohlgenannt, I., Weinmann, M., & Fleischer, S. (2021). Good gamers, good managers? A proof-of-concept study with Sid Meier’s Civilization. Review of Managerial Science, 15 (4), 957–990. https://doi.org/10.1007/s11846-020-00378-0

Stecher, B. (2010). Performance assessment in an era of standards-based educational accountability . Stanford University, Stanford Center for Opportunity Policy in Education.

Straetmans, G. J. J. M., & Eggen, T. J. H. M. (2007). WISCAT-pabo: computergestuurd adaptief toetspakket rekenen. Onderwijsinnovatie, 2017 (3), 17–27.

Susi, T., Johannesson, J., & Backlund, P. (2007). Serious game—An overview [IKI Technical Reports] . https://www.diva-portal.org/smash/get/diva2:2416/FULLTEXT01.pdf

The EndNote Team. (2013). EndNote (Version X9) Clarivate. https://endnote.com/

van der Kleij, F. M., Eggen, T. J. H. M., Timmers, C. F., & Veldkamp, B. P. (2012). Effects of feedback in a computer-based assessment for learning. Computers & Education, 58 (1), 263–272. https://doi.org/10.1016/j.compedu.2011.07.020

Van Eck, R. (2006). Digital game-based learning: It’s not just the digital natives who are restless. Educause Review, 41 (2), 16–30.

Vansickle, R. L. (1986). A quantitative review of research on instructional simulation gaming: A twenty-year perspective. Theory & Research in Social Education, 14 (3), 245–264. https://doi.org/10.1080/00933104.1986.10505525

von der Embse, N., Jester, D., Roy, D., & Post, J. (2018). Test anxiety effects, predictors, and correlates: A 30-year meta-analytic review. Journal of Affective Disorders, 227 , 483–493. https://doi.org/10.1016/j.jad.2017.11.048

von der Embse, N., & Witmer, S. E. (2014). High-stakes accountability: Student anxiety and large-scale testing. Journal of Applied School Psychology, 30 (2), 132–156. https://doi.org/10.1080/15377903.2014.888529

Wainer, H. (2000). Introduction and history. In H. Wainer, N. J. Dorans, R. Eignor, B. F. Green, R. J. Mislevy, L. Steinberg, & D. Thissen (Eds.), Computerized adaptive testing: A primer (2nd ed., pp. 1–21). Lawrence Erlbaum Associates Inc.

Wang, L., Shute, V., & Moore, G. R. (2015). Lessons learned and best practices of stealth assessment. International Journal of Gaming and Computer-Mediated Simulations, 7 (4), 66–87. https://doi.org/10.4018/ijgcms.2015100104

Wang, R., DeMaria, S., Jr., Goldberg, A., & Katz, D. (2016). A systematic review of serious games in training health care professionals. Simulation in Healthcare: The Journal of the Society for Simulation in Healthcare, 11 (1), 41–51. https://doi.org/10.1097/sih.0000000000000118

Westera, W., Nadolski, R., & Hummel, H. (2014). Serious gaming analytics—What students’ log files tell us about gaming and learning. International Journal of Serious Games, 1 (2), 35–50. https://doi.org/10.17083/ijsg.v1i2.9

Williams-Bell, F. M., Kapralos, B., Hogue, A., Murphy, B. M., & Weckman, E. J. (2015). Using serious games and virtual simulation for training in the fire service: A review. Fire Technology, 51 , 553–584. https://doi.org/10.1007/s10694-014-0398-1

Wools, S., Eggen, T., & Sanders, P. (2010). Evaluation of validity and validation by means of the argument-based approach. Cadmo . https://doi.org/10.3280/cad2010-001007

Wouters, P., van der Spek, E. D., & van Oostendorp, H. (2009). Current practices in serious game research: A review from a learning outcomes perspective. In T. Connolly, M. Stansfield, & L. Boyle (Eds.), Games-based learning advancements for multi-sensory human computer interfaces: Techniques and effective practices (pp. 232–250). IGI Global.

Young, M. F., Slota, S., Cutter, A. B., Jalette, G., Mullin, G., Lai, B., Simeoni, Z., Tran, M., & Yukhymenko, M. (2012). Our princess is in another castle: A review of trends in serious gaming for education. Review of Educational Research, 82 (1), 61–89. https://doi.org/10.3102/0034654312436980

Studies included in the systematic review

Adams, A., Hart, J., Iacovides, I., Beavers, S., Oliveira, M., & Magroudi, M. (2019). Co-created evaluation: Identifying how games support police learning. International Journal of Human-Computer Studies, 132 , 34–44. https://doi.org/10.1016/j.ijhcs.2019.03.009

Aksoy, E. (2019). Comparing the effects on learning outcomes of tablet-based and virtual reality–based serious gaming modules for basic life support training: Randomized trial. JMIR Serious Games, 7 (2), Article e13442. https://doi.org/10.2196/13442

Albert, A., Hallowell, M. R., Kleiner, B., Chen, A., & Golparvar-Fard, M. (2014). Enhancing construction hazard recognition with high-fidelity augmented virtuality. Journal of Construction Engineering and Management, 140 (7), Article 04014024. https://doi.org/10.1061/(ASCE)CO.1943-7862.0000860

Alyami, H., Alawami, M., Lyndon, M., Alyami, M., Coomarasamy, C., Henning, M., Hill, A., & Sundram, F. (2019). Impact of using a 3D visual metaphor serious game to teach history-taking content to medical students: Longitudinal mixed methods pilot study. JMIR Serious Games, 7 (3), Article e13748. https://doi.org/10.2196/13748

Ameerbakhsh, O., Maharaj, S., Hussain, A., & McAdam, B. (2019). A comparison of two methods of using a serious game for teaching marine ecology in a university setting. International Journal of Human-Computer Studies, 127 , 181–189. https://doi.org/10.1016/j.ijhcs.2018.07.004

Asadipour, A., Debattista, K., & Chalmers, A. (2017). Visuohaptic augmented feedback for enhancing motor skill acquisition. The Visual Computer, 33 (4), 401–411. https://doi.org/10.1007/s00371-016-1275-3

Barab, S. A., Scott, B., Siyahhan, S., Goldstone, R., Ingram-Goble, A., Zuiker, S. J., & Warren, S. (2009). Transformational play as a curricular scaffold: Using videogames to support science education. Journal of Science Education and Technology, 18 (4), 305–320. https://doi.org/10.1007/s10956-009-9171-5

Benda, N. C., Kellogg, K. M., Hoffman, D. J., Fairbanks, R. J., & Auguste, T. (2020). Lessons learned from an evaluation of serious gaming as an alternative to mannequin-based simulation technology: Randomized controlled trial. JMIR Serious Games, 8 (3), Article e21123. https://doi.org/10.2196/21123

Bindoff, I., Ling, T., Bereznicki, L., Westbury, J., Chalmers, L., Peterson, G., & Ollington, R. (2014). A computer simulation of community pharmacy practice for educational use. American Journal of Pharmaceutical Education, 78 (9), Article 168. https://doi.org/10.5688/ajpe789168

Binsubaih, A., Maddock, S., & Romano, D. (2006). A serious game for traffic accident investigators. Interactive Technology and Smart Education, 3 (4), 329–346. https://doi.org/10.1108/17415650680000071

Blanié, A., Amorim, M. A., & Benhamou, D. (2020). Comparative value of a simulation by gaming and a traditional teaching method to improve clinical reasoning skills necessary to detect patient deterioration: A randomized study in nursing students. BMC Medical Education, 20 (1), Article 53. https://doi.org/10.1186/s12909-020-1939-6

Boada, I., Rodriguez-Benitez, A., Garcia-Gonzalez, J. M., Olivet, J., Carreras, V., & Sbert, M. (2015). Using a serious game to complement CPR instruction in a nurse faculty. Computer Methods and Programs in Biomedicine, 122 (2), 282–291. https://doi.org/10.1016/j.cmpb.2015.08.006

Brown, D. E., Moenning, A., Guerlain, S., Turnbull, B., Abel, D., & Meyer, C. (2018). Design and evaluation of an avatar-based cultural training system. The Journal of Defense Modeling and Simulation, 16 (2), 159–174. https://doi.org/10.1177/1548512918807593

Buttussi, F., Pellis, T., Cabas Vidani, A., Pausler, D., Carchietti, E., & Chittaro, L. (2013). Evaluation of a 3D serious game for advanced life support retraining. International Journal Medical Informatics, 82 (9), 798–809. https://doi.org/10.1016/j.ijmedinf.2013.05.007

Calderón, A., Ruiz, M., & O’Connor, R. V. (2018). A serious game to support the ISO 21500 standard education in the context of software project management. Computer Standards & Interfaces, 60 , 80–92. https://doi.org/10.1016/j.csi.2018.04.012

Chan, W. Y., Qin, J., Chui, Y. P., & Heng, P. A. (2012). A serious game for learning ultrasound-guided needle placement skills. IEEE Transactions on Information Technology in Biomedicine, 16 (6), 1032–1042. https://doi.org/10.1109/titb.2012.2204406

Chang, C., Kao, C., Hwang, G., & Lin, F. (2020). From experiencing to critical thinking: A contextual game-based learning approach to improving nursing students’ performance in electrocardiogram training. Educational Technology Research and Development, 68 (3), 1225–1245. https://doi.org/10.1007/s11423-019-09723-x

Chee, E. J. M., Prabhakaran, L., Neo, L. P., Carpio, G. A. C., Tan, A. J. Q., Lee, C. C. S., & Liaw, S. Y. (2019). Play and learn with patients—Designing and evaluating a serious game to enhance nurses’ inhaler teaching techniques: A randomized controlled trial. Games for Health Journal, 8 (3), 187–194. https://doi.org/10.1089/g4h.2018.0073

Chon, S., Timmermann, F., Dratsch, T., Schuelper, N., Plum, P., Berlth, F., Datta, R. R., Schramm, C., Haneder, S., Späth, M. R., Dübbers, M., Kleinert, J., Raupach, T., Bruns, C., & Kleinert, R. (2019). Serious games in surgical medical education: A virtual emergency department as a tool for teaching clinical reasoning to medical students. JMIR Serious Games, 7 (1), Article e13028. https://doi.org/10.2196/13028

Cook, N. F., McAloon, T., O’Neill, P., & Beggs, R. (2012). Impact of a web based interactive simulation game (PULSE) on nursing students’ experience and performance in life support training—A pilot study. Nurse Education Today, 32 (6), 714–720. https://doi.org/10.1016/j.nedt.2011.09.013

Cowley, B., Fantato, M., Jennett, C., Ruskov, M., & Ravaja, N. (2014). Learning when serious: Psychophysiological evaluation of a technology-enhanced learning game. Journal of Educational Technology & Society, 17 (1), 3–16.

Creutzfeldt, J., Hedman, L., & Felländer-Tsai, L. (2012). Effects of pre-training using serious game technology on CPR performance—An exploratory quasi-experimental transfer study. Scandinavian Journal of Trauma, Resuscitation and Emergency Medicine, 20 (1), Article 79. https://doi.org/10.1186/1757-7241-20-79

Creutzfeldt, J., Hedman, L., Medin, C., Heinrichs, W. L., & Felländer-Tsai, L. (2010). Exploring virtual worlds for scenario-based repeated team training of cardiopulmonary resuscitation in medical students. Journal of Medical Internet Research, 12 (3), Article e38. https://doi.org/10.2196/jmir.1426

Dankbaar, M. E. W., Alsma, J., Jansen, E. E. H., van Merrienboer, J. J. G., van Saase, J. L. C. M., & Schuit, S. C. E. (2016). An experimental study on the effects of a simulation game on students’ clinical cognitive skills and motivation. Advances in Health Sciences Education, 21 (3), 505–521. https://doi.org/10.1007/s10459-015-9641-x

Dankbaar, M. E. W., Bakhuys Roozeboom, M., Oprins, E. A. P. B., Rutten, F., van Merrienboer, J. J. G., van Saase, J. L. C. M., & Schuit, S. C. E. (2017a). Preparing residents effectively in emergency skills training with a serious game. Simulation in Healthcare, 12 (1), 9–16. https://doi.org/10.1097/sih.0000000000000194

Dankbaar, M. E. W., Richters, O., Kalkman, C. J., Prins, G., ten Cate, O. T. J., van Merrienboer, J. J. G., & Schuit, S. C. E. (2017b). Comparative effectiveness of a serious game and an e-module to support patient safety knowledge and awareness. BMC Medical Education, 17 (1), Article 30. https://doi.org/10.1186/s12909-016-0836-5

de Sena, D. P., Fabrício, D. D., da Silva, V. D., Bodanese, L. C., & Franco, A. R. (2019). Comparative evaluation of video-based on-line course versus serious game for training medical students in cardiopulmonary resuscitation: A randomised trial. PLOS ONE, 14 (4), Article e0214722. https://doi.org/10.1371/journal.pone.0214722

Dib, H., & Adamo-Villani, N. (2014). Serious sustainability challenge game to promote teaching and learning of building sustainability. Journal of Computing in Civil Engineering, 28 (5), Article A4014007. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000357

Diehl, L. A., Souza, R. M., Gordan, P. A., Esteves, R. Z., & Coelho, I. C. M. (2017). InsuOnline, an electronic game for medical education on insulin therapy: A randomized controlled trial with primary care physicians. Journal of Medical Internet Research, 19 (3), Article e72. https://doi.org/10.2196/jmir.6944

Drummond, D., Delval, P., Abdenouri, S., Truchot, J., Ceccaldi, P., Plaisance, P., Hadchouel, A., & Tesnière, A. (2017). Serious game versus online course for pretraining medical students before a simulation-based mastery learning course on cardiopulmonary resuscitation: A randomised controlled study. European Journal of Anaesthesiology, 34 (12), 836–844. https://doi.org/10.1097/EJA.0000000000000675

Duque, G., Fung, S., Mallet, L., Posel, N., & Fleiszer, D. (2008). Learning while having fun: The use of video gaming to teach geriatric house calls to medical students. Journal of the American Geriatrics Society, 56 (7), 1328–1332. https://doi.org/10.1111/j.1532-5415.2008.01759.x

Fonteneau, T., Billion, E., Abdoul, C., Le, S., Hadchouel, A., & Drummond, D. (2020). Simulation game versus multiple choice questionnaire to assess the clinical competence of medical students: Prospective sequential trial. Journal of Medical Internet Research, 22 (12), Article e23254. https://doi.org/10.2196/23254

Gerard, J. M., Scalzo, A. J., Borgman, M. A., Watson, C. M., Byrnes, C. E., Chang, T. P., Auerbach, M., Kessler, D. O., Feldman, B. L., Payne, B. S., Nibras, S., Chokshi, R. K., & Lopreiato, J. O. (2018). Validity evidence for a serious game to assess performance on critical pediatric emergency medicine scenarios. Simulation in Healthcare, 13 (3), 168–180. https://doi.org/10.1097/SIH.0000000000000283

Graafland, M., Bemelman, W. A., & Schijven, M. P. (2014). Prospective cohort study on surgeons’ response to equipment failure in the laparoscopic environment. Surgical Endoscopy, 28 (9), 2695–2701. https://doi.org/10.1007/s00464-014-3530-x

Graafland, M., Bemelman, W. A., & Schijven, M. P. (2017). Game-based training improves the surgeon’s situational awareness in the operation room: A randomized controlled trial. Surgical Endoscopy, 31 (10), 4093–4101. https://doi.org/10.1007/s00464-017-5456-6

Hannig, A., Lemos, M., Spreckelsen, C., Ohnesorge-Radtke, U., & Rafai, N. (2013). Skills-O-Mat: Computer supported interactive motion- and game-based training in mixing alginate in dental education. Journal of Educational Computing Research, 48 (3), 315–343. https://doi.org/10.2190/EC.48.3.c

Hummel, H. G. K., van Houcke, J., Nadolski, R. J., van der Hiele, T., Kurvers, H., & Löhr, A. (2011). Scripted collaboration in serious gaming for complex learning: Effects of multiple perspectives when acquiring water management skills. British Journal of Educational Technology, 42 (6), 1029–1041. https://doi.org/10.1111/j.1467-8535.2010.01122.x

Jalink, M. B., Goris, J., Heineman, E., Pierie, J. P., & ten Cate Hoedemaker, H. O. (2014). Construct and concurrent validity of a Nintendo Wii video game made for training basic laparoscopic skills. Surgical Endoscopy, 28 (2), 537–542. https://doi.org/10.1007/s00464-013-3199-6

Katz, D., Zerillo, J., Kim, S., Hill, B., Wang, R., Goldberg, A., & DeMaria, S. (2017). Serious gaming for orthotopic liver transplant anesthesiology: A randomized control trial. Liver Transplantation, 23 (4), 430–439. https://doi.org/10.1002/lt.24732

Knight, J. F., Carley, S., Tregunna, B., Jarvis, S., Smithies, R., de Freitas, S., Dunwell, I., & Mackway-Jones, K. (2010). Serious gaming technology in major incident triage training: A pragmatic controlled trial. Resuscitation, 81 (9), 1175–1179. https://doi.org/10.1016/j.resuscitation.2010.03.042

LeFlore, J. L., Anderson, M., Zielke, M. A., Nelson, K. A., Thomas, P. E., Hardee, G., & John, L. D. (2012). Can a virtual patient trainer teach student nurses how to save lives—Teaching student nurses about pediatric respiratory diseases. Simulation in Healthcare, 7 (1), 10–17. https://doi.org/10.1097/SIH.0b013e31823652de

Li, K., Hall, M., Bermell-Garcia, P., Alcock, J., Tiwari, A., & González-Franco, M. (2017). Measuring the learning effectiveness of serious gaming for training of complex manufacturing tasks. Simulation & Gaming, 48 (6), 770–790. https://doi.org/10.1177/1046878117739929

Luu, C., Talbot, T. B., Fung, C. C., Ben-Isaac, E., Espinoza, J., Fischer, S., Cho, C. S., Sargsyan, M., Korand, S., & Chang, T. P. (2020). Development and performance assessment of a digital serious game to assess multi-patient care skills in a simulated pediatric emergency department. Simulation & Gaming, 51 (4), 550–570. https://doi.org/10.1177/1046878120904984

Middeke, A., Anders, S., Schuelper, M., Raupach, T., & Schuelper, N. (2018). Training of clinical reasoning with a serious game versus small-group problem-based learning: A prospective study. PLoS ONE, 13 (9), Article e0203851. https://doi.org/10.1371/journal.pone.0203851

Miller, C. H., Dunbar, N. E., Jensen, M. L., Massey, Z. B., Lee, Y., Nicholls, S. B., Anderson, C., Adams, A. S., Cecena, J. E., Thompson, W. M., & Wilson, S. N. (2019). Training law enforcement officers to identify reliable deception cues with a serious digital game. International Journal of Game-Based Learning, 9 (3), 1–22. https://doi.org/10.4018/IJGBL.2019070101

Mohan, D., Angus, D. C., Ricketts, D., Farris, C., Fischhoff, B., Rosengart, M. R., Yealy, D. M., & Barnato, A. E. (2014). Assessing the validity of using serious game technology to analyze physician decision making. PLOS ONE, 9 (8), Article e105445. https://doi.org/10.1371/journal.pone.0105445

Mohan, D., Farris, C., Fischhoff, B., Rosengart, M. R., Angus, D. C., Yealy, D. M., Wallace, D. J., & Barnato, A. E. (2017). Efficacy of educational video game versus traditional educational apps at improving physician decision making in trauma triage: Randomized controlled trial. BMJ, 359 , Article j5416. https://doi.org/10.1136/bmj.j5416

Mohan, D., Fischhoff, B., Angus, D. C., Rosengart, M. R., Wallace, D. J., Yealy, D. M., Farris, C., Chang, C. H., Kerti, S., & Barnato, A. E. (2018). Serious games may improve physician heuristics in trauma triage. Proceedings of the National Academy of Sciences, 115 (37), 9204–9209. https://doi.org/10.1073/pnas.1805450115

Moreno-Ger, P., Torrente, J., Bustamante, J., Fernandez-Galaz, C., Fernandez-Manjon, B., & Comas-Rengifo, M. D. (2010). Application of a low-cost web-based simulation to improve students’ practical skills in medical education. International Journal of Medical Informatics, 79 (6), 459–467. https://doi.org/10.1016/j.ijmedinf.2010.01.017

Perini, S., Luglietti, R., Margoudi, M., Oliveira, M., & Taisch, M. (2018). Learning and motivational effects of digital game-based learning (DGBL) for manufacturing education—The life cycle assessment (LCA) game. Computers in Industry, 102 , 40–49. https://doi.org/10.1016/j.compind.2018.08.005

Phungoen, P., Promto, S., Chanthawatthanarak, S., Maneepong, S., Apiratwarakul, K., Kotruchin, P., & Mitsungnern, T. (2020). Precourse preparation using a serious smartphone game on advanced life support knowledge and skills: Randomized controlled trial. Journal of Medical Internet Research, 22 (3), Article e16987. https://doi.org/10.2196/16987

Steinrücke, J., Veldkamp, B. P., & de Jong, T. (2020). Information literacy skills assessment in digital crisis management training for the safety domain: Developing an unobtrusive method. Frontiers in Education, 5 (140), Article 140. https://doi.org/10.3389/feduc.2020.00140

Su, C. (2016). The effects of students' learning anxiety and motivation on the learning achievement in the activity theory based gamified learning environment. Eurasia Journal of Mathematics, Science and Technology Education, 13 , 1229–1258. https://doi.org/10.12973/eurasia.2017.00669a

Taillandier, F., & Adam, C. (2018). Games ready to use: A serious game for teaching natural risk management. Simulation & Gaming, 49 (4), 441–470. https://doi.org/10.1177/1046878118770217

Tan, A. J. Q., Lee, C. C. S., Lin, P. Y., Cooper, S., Lau, L. S. T., Chua, W. L., & Liaw, S. Y. (2017). Designing and evaluating the effectiveness of a serious game for safe administration of blood transfusion: A randomized controlled trial. Nurse Education Today, 55 , 38–44. https://doi.org/10.1016/j.nedt.2017.04.027

Zualkernan, I. A., Husseini, G. A., Loughlin, K. F., Mohebzada, J. G., & El Gami, M. (2013). Remote labs and game-based learning for process control. Chemical Engineering Education, 47 (3), 179–188.

Download references

Author information

Authors and affiliations.

eX:plain, Department of Applied Research, P.O. Box 1230, 3800 BE, Amersfoort, The Netherlands

Aranka Bijl

Faculty of Behavioural, Management and Social Sciences, Cognition, Data and Education, University of Twente, P.O. Box 217, 7500 AE, Enschede, The Netherlands

Aranka Bijl & Bernard P. Veldkamp

Cito, Department of Research and Innovation, P.O. Box 1034, 6801 MG, Arnhem, The Netherlands

Aranka Bijl, Saskia Wools & Sebastiaan de Klerk

Corresponding author

Correspondence to Aranka Bijl .

Ethics declarations

Conflict of interest.

We have no conflict of interest to disclose.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 42 kb)

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .

About this article

Bijl, A., Veldkamp, B.P., Wools, S. et al. Serious games in high-stakes assessment contexts: a systematic literature review into the game design principles for valid game-based performance assessment. Education Tech Research Dev (2024). https://doi.org/10.1007/s11423-024-10362-0

Download citation

Accepted : 24 February 2024

Published : 08 April 2024

DOI : https://doi.org/10.1007/s11423-024-10362-0

Keywords

  • Systematic literature review
  • Serious games
  • Professional competencies
  • Performance assessment
  • Game design principles
