Variables in Research – Definition, Types and Examples

Variables in Research


In research, variables are characteristics or attributes that can be measured, manipulated, or controlled. They are the factors that researchers observe or manipulate to understand how they relate to one another and to the outcomes of interest.

Types of Variables in Research

The main types of variables in research are as follows:

Independent Variable

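This is the variable that the researcher manipulates or, in non-experimental studies, selects as the presumed cause. It is also known as the predictor variable, as it is used to predict changes in the dependent variable. Examples of independent variables include dosage and treatment type, which can be manipulated directly, and age and gender, which cannot be manipulated and are instead used as predictors.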

Dependent Variable

This is the variable that is measured or observed to determine the effects of the independent variable. It is also known as the outcome variable, as it is the variable that is affected by the independent variable. Examples of dependent variables include blood pressure, test scores, and reaction time.

Confounding Variable

This is a variable that is not the focus of the study but is related to both the independent variable and the dependent variable, so it can distort the apparent relationship between them. For example, in a study on the effects of a new drug on a disease, a confounding variable could be the patient’s age: older patients may have more severe symptoms, and if age also influences who receives the drug, the drug’s apparent effect will be mixed with the effect of age.
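The drug-and-age example can be made concrete with a small sketch. The patient records below are invented purely for illustration (they do not come from any study), and the helper name `mean_difference` is hypothetical. The sketch compares treated and untreated patients overall, and then separately within each age group.

```python
from statistics import mean

# Hypothetical records: (treated, age_group, symptom_score). Lower scores are better.
# Older patients have higher scores AND are more likely to be treated.
patients = [
    (False, "young", 4), (False, "young", 4), (False, "young", 4),
    (True, "young", 2),
    (False, "old", 8),
    (True, "old", 6), (True, "old", 6), (True, "old", 6),
]

def mean_difference(records):
    """Mean score of treated patients minus mean score of untreated patients."""
    treated = [score for is_treated, _, score in records if is_treated]
    untreated = [score for is_treated, _, score in records if not is_treated]
    return mean(treated) - mean(untreated)

# Comparison that ignores age (the confounder).
naive = mean_difference(patients)

# Comparison within each age group, which holds the confounder constant.
by_age = {
    group: mean_difference([r for r in patients if r[1] == group])
    for group in ("young", "old")
}

print("ignoring age:", naive)
print("within age groups:", by_age)
```

In this made-up data set the overall comparison shows no difference between treated and untreated patients, while within each age group the treated patients score 2 points lower. Stratifying by the suspected confounder is only one of several ways to account for it.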

Mediating Variable

This is a variable that explains the relationship between the independent variable and the dependent variable. It is a variable that comes in between the independent and dependent variables and is affected by the independent variable, which then affects the dependent variable. For example, in a study on the relationship between exercise and weight loss, the mediating variable could be metabolism, as exercise can increase metabolism, which can then lead to weight loss.

Moderator Variable

This is a variable that affects the strength or direction of the relationship between the independent variable and the dependent variable. It is a variable that influences the effect of the independent variable on the dependent variable. For example, in a study on the effects of caffeine on cognitive performance, the moderator variable could be age, as older adults may be more sensitive to the effects of caffeine than younger adults.
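As a rough sketch of the caffeine example, moderation can be examined by estimating the effect of the independent variable separately at each level of the moderator and comparing the results. The scores and the function name `caffeine_effect` below are hypothetical.

```python
from statistics import mean

# Hypothetical cognitive test scores, keyed by (age_group, caffeine_given).
scores = {
    ("younger", False): [50, 52],
    ("younger", True): [53, 55],
    ("older", False): [48, 50],
    ("older", True): [56, 58],
}

def caffeine_effect(age_group):
    """Mean score with caffeine minus mean score without it, within one age group."""
    return mean(scores[(age_group, True)]) - mean(scores[(age_group, False)])

effect_younger = caffeine_effect("younger")
effect_older = caffeine_effect("older")

# If age moderates the effect, the two effects differ (an interaction).
moderation = effect_older - effect_younger

print(effect_younger, effect_older, moderation)
```

With these invented numbers the caffeine effect is larger in the older group than in the younger group. A real analysis would also test whether such a difference is statistically significant, typically with an interaction term in a regression model.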

Control Variable

This is a variable that is held constant or controlled by the researcher to ensure that it does not affect the relationship between the independent variable and the dependent variable. Control variables are important to ensure that any observed effects are due to the independent variable and not to other factors. For example, in a study on the effects of a new teaching method on student performance, the control variables could include class size, teacher experience, and student demographics.

Continuous Variable

This is a variable that can take on any value within a certain range. Continuous variables can be measured on a scale and are often used in statistical analyses. Examples of continuous variables include height, weight, and temperature.

Categorical Variable

This is a variable that can take on a limited number of values or categories. Categorical variables can be nominal or ordinal. Nominal variables have no inherent order, while ordinal variables have a natural order. Examples of categorical variables include gender, race, and educational level.

Discrete Variable

This is a variable that can only take on specific values. Discrete variables are often used in counting or frequency analyses. Examples of discrete variables include the number of siblings a person has, the number of times a person exercises in a week, and the number of students in a classroom.

Dummy Variable

This is a variable that takes on only two values, typically 0 and 1, and is used to represent categorical variables in statistical analyses. Dummy variables are often used when a categorical variable cannot be used directly in an analysis. For example, in a study on the effects of gender on income, a dummy variable could be created, with 0 representing female and 1 representing male.
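The sketch below shows one way dummy coding might be done by hand. The function name `dummy_code` and the data are hypothetical. It follows the common convention, assumed here rather than stated in the text above, that a categorical variable with k categories is represented by k - 1 dummy variables, with the omitted category serving as the reference group.

```python
def dummy_code(values, reference):
    """Return one 0/1 column per non-reference category (k - 1 columns for k categories)."""
    categories = sorted(set(values) - {reference})
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}

# Two categories give a single dummy variable: 0 = female (reference), 1 = male.
gender = ["female", "male", "male", "female"]
gender_dummies = dummy_code(gender, reference="female")

# Three categories give two dummy variables; "high school" is the reference group.
education = ["high school", "college", "graduate", "college"]
education_dummies = dummy_code(education, reference="high school")

print(gender_dummies)
print(education_dummies)
```

A respondent in the reference category has 0 on every dummy column, which is why no separate column is needed for it.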

Extraneous Variable

This is any variable other than the independent variable that is not of interest in the study but can still affect the dependent variable. Extraneous variables can lead to erroneous conclusions and can be controlled through random assignment or statistical techniques.
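Random assignment, mentioned above as one way of controlling extraneous variables, can be sketched in a few lines. The function name `randomly_assign` and the participant IDs are hypothetical. The optional `seed` argument is included only so that the allocation can be reproduced.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into treatment and control groups."""
    rng = random.Random(seed)       # local generator, so global random state is untouched
    shuffled = list(participants)   # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

participant_ids = list(range(1, 11))   # hypothetical IDs 1 to 10
groups = randomly_assign(participant_ids, seed=42)

print(groups)
```

Because chance alone decides who ends up in each group, extraneous variables such as age or motivation tend to be spread evenly across the groups, although this is not guaranteed in small samples.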

Latent Variable

This is a variable that cannot be directly observed or measured, but is inferred from other variables. Latent variables are often used in psychological or social research to represent constructs such as personality traits, attitudes, or beliefs.

Moderator-mediator Variable

This is a variable that acts both as a moderator and a mediator. It can moderate the relationship between the independent and dependent variables and also mediate the relationship between the independent and dependent variables. Moderator-mediator variables are often used in complex statistical analyses.

Variables Analysis Methods

There are different methods to analyze variables in research, including:

  • Descriptive statistics: This involves analyzing and summarizing data using measures such as mean, median, mode, range, standard deviation, and frequency distribution. Descriptive statistics are useful for understanding the basic characteristics of a data set.
  • Inferential statistics: This involves making inferences about a population based on sample data. Inferential statistics use techniques such as hypothesis testing, confidence intervals, and regression analysis to draw conclusions from data.
  • Correlation analysis: This involves examining the relationship between two or more variables. Correlation analysis can determine the strength and direction of the relationship between variables, and can be used to make predictions about future outcomes.
  • Regression analysis: This involves modelling the relationship between one or more independent variables and a dependent variable. Regression analysis can be used to predict the value of the dependent variable from the values of the independent variables, and to test whether the relationship is statistically significant.
  • Factor analysis: This involves identifying patterns and relationships among a large number of variables. Factor analysis can be used to reduce the complexity of a data set and identify underlying factors or dimensions.
  • Cluster analysis: This involves grouping data into clusters based on similarities between variables. Cluster analysis can be used to identify patterns or segments within a data set, and can be useful for market segmentation or customer profiling.
  • Multivariate analysis: This involves analyzing multiple variables simultaneously. Multivariate analysis can be used to understand complex relationships between variables, and can be useful in fields such as social science, finance, and marketing.
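As a rough illustration of three of the methods listed above (descriptive statistics, correlation analysis, and simple regression analysis), the sketch below works through a tiny made-up data set using only the Python standard library. The data and the helper name `predict` are hypothetical, and the regression shown is ordinary least squares with a single independent variable.

```python
from math import sqrt
from statistics import mean, median, stdev

# Hypothetical data: hours studied (independent) and test score (dependent).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

# Descriptive statistics for the dependent variable.
summary = {"mean": mean(scores), "median": median(scores), "stdev": stdev(scores)}

# Sums of squares and cross-products around the means.
mx, my = mean(hours), mean(scores)
sxx = sum((x - mx) ** 2 for x in hours)
syy = sum((y - my) ** 2 for y in scores)
sxy = sum((x - mx) * (y - my) for x, y in zip(hours, scores))

r = sxy / sqrt(sxx * syy)      # Pearson correlation: strength and direction
slope = sxy / sxx              # least-squares slope: change in score per extra hour
intercept = my - slope * mx    # least-squares intercept

def predict(x):
    """Predicted score for x hours of study under the fitted line."""
    return intercept + slope * x

print(summary)
print(round(r, 3), round(slope, 2), round(intercept, 2))
```

The correlation coefficient describes how closely the two variables move together, while the regression line turns that relationship into a prediction rule. Neither one, on its own, shows that studying causes higher scores.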

Examples of Variables

  • Age: This is a continuous variable that represents the age of an individual, usually recorded in years.
  • Gender: This is a categorical variable that represents the biological sex of an individual and can take on values such as male and female.
  • Education level: This is a categorical variable that represents the level of education completed by an individual and can take on values such as high school, college, and graduate school.
  • Income: This is a continuous variable that represents the amount of money earned by an individual in a year.
  • Weight: This is a continuous variable that represents the weight of an individual in kilograms or pounds.
  • Ethnicity: This is a categorical variable that represents the ethnic background of an individual and can take on values such as Hispanic, African American, and Asian.
  • Time spent on social media: This is a continuous variable that represents the amount of time an individual spends on social media in minutes or hours per day.
  • Marital status: This is a categorical variable that represents the marital status of an individual and can take on values such as married, divorced, and single.
  • Blood pressure: This is a continuous variable that represents the force of blood against the walls of arteries in millimeters of mercury.
  • Job satisfaction: This is a variable that represents an individual’s level of satisfaction with their job. When measured using a Likert scale it is strictly ordinal, although summed or averaged scale scores are often treated as continuous.

Applications of Variables

Variables are used in many different applications across various fields. Here are some examples:

  • Scientific research: Variables are used in scientific research to understand the relationships between different factors and to make predictions about future outcomes. For example, scientists may study the effects of different variables on plant growth or the impact of environmental factors on animal behavior.
  • Business and marketing: Variables are used in business and marketing to understand customer behavior and to make decisions about product development and marketing strategies. For example, businesses may study variables such as consumer preferences, spending habits, and market trends to identify opportunities for growth.
  • Healthcare: Variables are used in healthcare to monitor patient health and to make treatment decisions. For example, doctors may use variables such as blood pressure, heart rate, and cholesterol levels to diagnose and treat cardiovascular disease.
  • Education: Variables are used in education to measure student performance and to evaluate the effectiveness of teaching strategies. For example, teachers may use variables such as test scores, attendance, and class participation to assess student learning.
  • Social sciences: Variables are used in social sciences to study human behavior and to understand the factors that influence social interactions. For example, sociologists may study variables such as income, education level, and family structure to examine patterns of social inequality.

Purpose of Variables

Variables serve several purposes in research, including:

  • To provide a way of measuring and quantifying concepts: Variables help researchers measure and quantify abstract concepts such as attitudes, behaviors, and perceptions. By assigning numerical values to these concepts, researchers can analyze and compare data to draw meaningful conclusions.
  • To help explain relationships between different factors: Variables help researchers identify and explain relationships between different factors. By analyzing how changes in one variable affect another variable, researchers can gain insight into the complex interplay between different factors.
  • To make predictions about future outcomes: Variables help researchers make predictions about future outcomes based on past observations. By analyzing patterns and relationships between different variables, researchers can make informed predictions about how different factors may affect future outcomes.
  • To test hypotheses: Variables help researchers test hypotheses and theories. By collecting and analyzing data on different variables, researchers can test whether their predictions are accurate and whether their hypotheses are supported by the evidence.

Characteristics of Variables

The main characteristics of variables are as follows:

  • Measurement: Variables can be measured using different scales, such as nominal, ordinal, interval, or ratio scales. The scale used to measure a variable can affect the type of statistical analysis that can be applied.
  • Range: Variables have a range of values that they can take on. The number of possible values can be finite, such as the number of students in a class, or infinite, such as the values a continuous variable like temperature can take within an interval.
  • Variability: Variables can have different levels of variability, which refers to the degree to which the values of the variable differ from each other. Variables with high variability have values that are widely spread, while variables with low variability have values that are more similar to each other.
  • Validity and reliability: Variables should be both valid and reliable to ensure accurate and consistent measurement. Validity refers to the extent to which a variable measures what it is intended to measure, while reliability refers to the consistency of the measurement, for example over time or across observers.
  • Directionality: Some variables have directionality, meaning that the relationship between the variables is not symmetrical. For example, in a study of the relationship between smoking and lung cancer, smoking is the independent variable and lung cancer is the dependent variable.
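The point that the measurement scale constrains the analysis can be sketched as follows. The mapping used here (mode for nominal data, mode and median for ordinal data, and mode, median, and mean for interval or ratio data) is the conventional textbook rule of thumb, assumed here rather than taken from the list above. The function name `summarize` and all of the data are hypothetical.

```python
from statistics import mean, median, mode

def summarize(values, scale):
    """Return the central-tendency measures conventionally used for a measurement scale."""
    if scale == "nominal":
        return {"mode": mode(values)}
    if scale == "ordinal":
        return {"mode": mode(values), "median": median(values)}
    if scale in ("interval", "ratio"):
        return {"mode": mode(values), "median": median(values), "mean": mean(values)}
    raise ValueError(f"unknown measurement scale: {scale}")

marital_status = ["married", "single", "married", "divorced"]  # nominal
star_ratings = [5, 4, 4, 1, 3]                                 # ordinal (1 to 5 stars)
weights_kg = [61.5, 70.0, 70.0, 82.5]                          # ratio

print(summarize(marital_status, "nominal"))
print(summarize(star_ratings, "ordinal"))
print(summarize(weights_kg, "ratio"))
```

In practice researchers sometimes relax these rules, for example by averaging ordinal ratings, so the mapping is a starting point rather than a strict requirement.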

Advantages of Variables

Here are some of the advantages of using variables in research:

  • Control: Variables allow researchers to control the effects of external factors that could influence the outcome of the study. By manipulating and controlling variables, researchers can isolate the effects of specific factors and measure their impact on the outcome.
  • Replicability: Variables make it possible for other researchers to replicate the study and test its findings. By defining and measuring variables consistently, other researchers can conduct similar studies to validate the original findings.
  • Accuracy: Variables make it possible to measure phenomena accurately and objectively. By defining and measuring variables precisely, researchers can reduce bias and increase the accuracy of their findings.
  • Generalizability: Variables allow researchers to generalize their findings to larger populations. By measuring the same variables in a sample that is representative of the population, researchers can draw conclusions that are applicable to a broader range of individuals.
  • Clarity: Variables help researchers to communicate their findings more clearly and effectively. By defining and categorizing variables, researchers can organize and present their findings in a way that is easily understandable to others.

Disadvantages of Variables

Here are some of the main disadvantages of using variables in research:

  • Simplification: Variables may oversimplify the complexity of real-world phenomena. By breaking down a phenomenon into variables, researchers may lose important information and context, which can affect the accuracy and generalizability of their findings.
  • Measurement error: Variables rely on accurate and precise measurement, and measurement error can affect the reliability and validity of research findings. The use of subjective or poorly defined variables can also introduce measurement error into the study.
  • Confounding variables: Confounding variables are factors outside the focus of the study that are related to both variables of interest. If confounding variables are not measured and accounted for, they can distort or obscure the relationship between the variables of interest.
  • Limited scope: Variables are defined by the researcher, and the scope of the study is therefore limited by the researcher’s choice of variables. This can lead to a narrow focus that overlooks important aspects of the phenomenon being studied.
  • Ethical concerns: The selection and measurement of variables may raise ethical concerns, especially in studies involving human subjects. For example, using variables that are related to sensitive topics, such as race or sexuality, may raise concerns about privacy and discrimination.

About the author

Muhammad Hassan

Researcher, Academic Writer, Web developer

Types of Variables in Research | Definitions & Examples

Published on 19 September 2022 by Rebecca Bevans . Revised on 28 November 2022.

In statistical research, a variable is defined as an attribute of an object of study. Choosing which variables to measure is central to good experimental design.

You need to know which types of variables you are working with in order to choose appropriate statistical tests and interpret the results of your study.

You can usually identify the type of variable by asking two questions:

  • What type of data does the variable contain?
  • What part of the experiment does the variable represent?

Table of contents

  • Types of data: quantitative vs categorical variables
  • Parts of the experiment: independent vs dependent variables
  • Other common types of variables
  • Frequently asked questions about variables

Data is a specific measurement of a variable – it is the value you record in your data sheet. Data is generally divided into two categories:

  • Quantitative data represents amounts.
  • Categorical data represents groupings.

A variable that contains quantitative data is a quantitative variable; a variable that contains categorical data is a categorical variable. Each of these types of variable can be broken down into further types.

Quantitative variables

When you collect quantitative data, the numbers you record represent real amounts that can be added, subtracted, divided, etc. There are two types of quantitative variables: discrete and continuous.

Categorical variables

Categorical variables represent groupings of some kind. They are sometimes recorded as numbers, but the numbers represent categories rather than actual amounts of things.

There are three types of categorical variables: binary, nominal, and ordinal variables.

Note that sometimes a variable can work as more than one type! An ordinal variable can also be used as a quantitative variable if the scale is numeric and doesn’t need to be kept as discrete integers. For example, star ratings on product reviews are ordinal (1 to 5 stars), but the average star rating is quantitative.

Example data sheet

To keep track of your salt-tolerance experiment, you make a data sheet where you record information about the variables in the experiment, like salt addition and plant health.

To gather information about plant responses over time, you can fill out the same data sheet every few days until the end of the experiment. This example sheet is colour-coded according to the type of variable: nominal, continuous, ordinal, and binary.

Example data sheet showing types of variables in a plant salt tolerance experiment

Experiments are usually designed to find out what effect one variable has on another – in our example, the effect of salt addition on plant growth.

You manipulate the independent variable (the one you think might be the cause) and then measure the dependent variable (the one you think might be the effect) to find out what this effect might be.

You will probably also have variables that you hold constant (control variables) in order to focus on your experimental treatment.

In this experiment, we have one independent and three dependent variables.

The other variables in the sheet can’t be classified as independent or dependent, but they do contain data that you will need in order to interpret your dependent and independent variables.

Example of a data sheet showing dependent and independent variables for a plant salt tolerance experiment.

What about correlational research?

When you do correlational research, the terms ‘dependent’ and ‘independent’ don’t apply, because you are not trying to establish a cause-and-effect relationship.

However, there might be cases where one variable clearly precedes the other (for example, rainfall leads to mud, rather than the other way around). In these cases, you may call the preceding variable (i.e., the rainfall) the predictor variable and the following variable (i.e., the mud) the outcome variable.

Once you have defined your independent and dependent variables and determined whether they are categorical or quantitative, you will be able to choose the correct statistical test.

But there are many other ways of describing variables that help with interpreting your results. Some useful types of variable are listed below.

A confounding variable is closely related to both the independent and dependent variables in a study. An independent variable represents the supposed cause, while the dependent variable is the supposed effect. A confounding variable is a third variable that influences both the independent and dependent variables.

Failing to account for confounding variables can cause you to wrongly estimate the relationship between your independent and dependent variables.

Discrete and continuous variables are two types of quantitative variables:

  • Discrete variables represent counts (e.g., the number of objects in a collection).
  • Continuous variables represent measurable amounts (e.g., water volume or weight).

You can think of independent and dependent variables in terms of cause and effect: an independent variable is the variable you think is the cause, while a dependent variable is the effect.

In an experiment, you manipulate the independent variable and measure the outcome in the dependent variable. For example, in an experiment about the effect of nutrients on crop growth:

  • The independent variable is the amount of nutrients added to the crop field.
  • The dependent variable is the biomass of the crops at harvest time.

Defining your variables, and deciding how you will manipulate and measure them, is an important part of experimental design.

Cite this Scribbr article

Bevans, R. (2022, November 28). Types of Variables in Research | Definitions & Examples. Scribbr. Retrieved 15 April 2024, from https://www.scribbr.co.uk/research-methods/variables-types/

Independent and Dependent Variables

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

In research, a variable is any characteristic, number, or quantity that can be measured or counted. In experimental investigations, two types of variable are of particular interest: the independent variable and the dependent variable.

In research, the independent variable is manipulated to observe its effect, while the dependent variable is the measured outcome. Essentially, the independent variable is the presumed cause, and the dependent variable is the observed effect.

Variables provide the foundation for examining relationships, drawing conclusions, and making predictions in research studies.


Independent Variable

In psychology, the independent variable is the variable the experimenter manipulates or changes and is assumed to directly affect the dependent variable.

It’s considered the cause or factor that drives change, allowing psychologists to observe how it influences behavior, emotions, or other dependent variables in an experimental setting. Essentially, it’s the presumed cause in cause-and-effect relationships being studied.

For example, allocating participants to drug or placebo conditions (independent variable) to measure any changes in the intensity of their anxiety (dependent variable).

In a well-designed experimental study, the independent variable is the only important difference between the experimental (e.g., treatment) and control (e.g., placebo) groups.

By changing the independent variable and holding other factors constant, psychologists aim to determine if it causes a change in another variable, called the dependent variable.

For example, in a study investigating the effects of sleep on memory, the amount of sleep (e.g., 4 hours, 8 hours, 12 hours) would be the independent variable, as the researcher might manipulate or categorize it to see its impact on memory recall, which would be the dependent variable.

Dependent Variable

In psychology, the dependent variable is the variable being tested and measured in an experiment and is “dependent” on the independent variable.

In psychology, a dependent variable represents the outcome or results and can change based on the manipulations of the independent variable. Essentially, it’s the presumed effect in a cause-and-effect relationship being studied.

An example of a dependent variable is depression symptoms, which depend on the independent variable (type of therapy).

In an experiment, the researcher looks for the possible effect on the dependent variable that might be caused by changing the independent variable.

For instance, in a study examining the effects of a new study technique on exam performance, the technique would be the independent variable (as it is being introduced or manipulated), while the exam scores would be the dependent variable (as they represent the outcome of interest that’s being measured).

Examples in Research Studies

For example, we might change the type of information (e.g., organized or random) given to participants to see how this might affect the amount of information remembered.

In this example, the type of information is the independent variable (because it changes), and the amount of information remembered is the dependent variable (because this is being measured).

Independent and Dependent Variables Examples

For the following hypotheses, name the IV and the DV.

1. Lack of sleep significantly affects learning in 10-year-old boys.



2. Social class has a significant effect on IQ scores.


3. Stressful experiences significantly increase the likelihood of headaches.

4. Time of day has a significant effect on alertness.

Operationalizing Variables

To ensure cause and effect are established, it is important that we identify exactly how the independent and dependent variables will be measured; this is known as operationalizing the variables.

Operational definitions (or operationalized variables) state how you will define and measure a specific variable as it is used in your study. This enables another psychologist to replicate your research and is essential in establishing reliability (achieving consistency in the results).

For example, if we are concerned with the effect of media violence on aggression, then we need to be very clear about what we mean by the different terms. In this case, we must state what we mean by the terms “media violence” and “aggression” as we will study them.

Therefore, you could state that “media violence” is operationally defined (in your experiment) as “exposure to a 15-minute film showing scenes of physical assault”; “aggression” is operationally defined as “levels of electrical shocks administered to a second ‘participant’ in another room”.

In another example, the hypothesis “Young participants will have significantly better memories than older participants” is not operationalized. How do we define “young,” “old,” or “memory”? “Participants aged between 16 – 30 will recall significantly more nouns from a list of twenty than participants aged between 55 – 70” is operationalized.

The key point here is that we have clarified what we mean by the terms as they were studied and measured in our experiment.

If we didn’t do this, it would be very difficult (if not impossible) to compare the findings of different studies of the same behavior.

Operationalization has the advantage of generally providing a clear and objective definition of even complex variables. It also makes it easier for other researchers to replicate a study and check for reliability.
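One way to see what operationalization buys you is that an operationalized hypothesis can be written down as an explicit procedure. The sketch below encodes the age bands and the noun-recall measure from the memory example above. The participant data and the function name `age_group` are hypothetical.

```python
from statistics import mean

def age_group(age):
    """Operational definition of the age groups, using the bands from the hypothesis."""
    if 16 <= age <= 30:
        return "young"
    if 55 <= age <= 70:
        return "older"
    return None  # outside both bands, so not part of the comparison

# Hypothetical participants: (age, number of nouns recalled from a list of twenty).
participants = [(18, 14), (25, 12), (29, 13), (40, 11), (58, 9), (63, 10), (70, 8)]

recall = {"young": [], "older": []}
for age, nouns_recalled in participants:
    group = age_group(age)
    if group is not None:
        recall[group].append(nouns_recalled)

mean_recall = {group: mean(values) for group, values in recall.items()}
print(mean_recall)
```

Because “young”, “older”, and “memory” are each tied to a concrete rule or measurement, another researcher could apply exactly the same procedure to new participants and compare results.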

For the following hypotheses, name the IV and the DV and operationalize both variables.

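1. Women are more attracted to men without earrings than men with earrings.

I.V. ____________________________________________________________

D.V. ____________________________________________________________

Operational definitions:

I.V. ____________________________________________________________

D.V. ____________________________________________________________

2. People learn more when they study in a quiet versus noisy place.

I.V. ____________________________________________________________

D.V. ____________________________________________________________

Operational definitions:

I.V. ____________________________________________________________

D.V. ____________________________________________________________

3. People who exercise regularly sleep better at night.

I.V. ____________________________________________________________

D.V. ____________________________________________________________

Operational definitions:

I.V. ____________________________________________________________

D.V. ____________________________________________________________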

Can there be more than one independent or dependent variable in a study?

Yes, it is possible to have more than one independent or dependent variable in a study.

In some studies, researchers may want to explore how multiple factors affect the outcome, so they include more than one independent variable.

Similarly, they may measure multiple things to see how they are influenced, resulting in multiple dependent variables. This allows for a more comprehensive understanding of the topic being studied.

What are some ethical considerations related to independent and dependent variables?

Ethical considerations related to independent and dependent variables involve treating participants fairly and protecting their rights.

Researchers must ensure that participants provide informed consent and that their privacy and confidentiality are respected. Additionally, it is important to avoid manipulating independent variables in ways that could cause harm or discomfort to participants.

Researchers should also consider the potential impact of their study on vulnerable populations and ensure that their methods are unbiased and free from discrimination.

Ethical guidelines help ensure that research is conducted responsibly and with respect for the well-being of the participants involved.

Can qualitative data have independent and dependent variables?

Yes, both quantitative and qualitative data can have independent and dependent variables.

In quantitative research, independent variables are usually measured numerically and manipulated to understand their impact on the dependent variable. In qualitative research, independent variables can be qualitative in nature, such as individual experiences, cultural factors, or social contexts, influencing the phenomenon of interest.

The dependent variable, in both cases, is what is being observed or studied to see how it changes in response to the independent variable.

So, regardless of the type of data, researchers analyze the relationship between independent and dependent variables to gain insights into their research questions.

Can the same variable be independent in one study and dependent in another?

Yes, the same variable can be independent in one study and dependent in another.

The classification of a variable as independent or dependent depends on how it is used within a specific study. In one study, a variable might be manipulated or controlled to see its effect on another variable, making it independent.

However, in a different study, that same variable might be the one being measured or observed to understand its relationship with another variable, making it dependent.

The role of a variable as independent or dependent can vary depending on the research question and study design.


Grad Coach

Research Variables 101

Independent variables, dependent variables, control variables and more

By: Derek Jansen (MBA) | Expert Reviewed By: Kerryn Warren (PhD) | January 2023

If you’re new to the world of research, especially scientific research, you’re bound to run into the concept of variables sooner or later. If you’re feeling a little confused, don’t worry – you’re not the only one! Independent variables, dependent variables, confounding variables – it’s a lot of jargon. In this post, we’ll unpack the terminology surrounding research variables using straightforward language and loads of examples.

What (exactly) is a variable?

The simplest way to understand a variable is as any characteristic or attribute that can experience change or vary over time or context – hence the name “variable”. For example, the dosage of a particular medicine could be classified as a variable, as the amount can vary (i.e., a higher dose or a lower dose). Similarly, gender, age or ethnicity could be considered demographic variables, because each person varies in these respects.

Within research, especially scientific research, variables form the foundation of studies, as researchers are often interested in how one variable impacts another, and the relationships between different variables. For example:

  • How someone’s age impacts their sleep quality
  • How different teaching methods impact learning outcomes
  • How diet impacts weight (gain or loss)

As you can see, variables are often used to explain relationships between different elements and phenomena. In scientific studies, especially experimental studies, the objective is often to understand the causal relationships between variables. In other words, the role of cause and effect between variables. This is achieved by manipulating certain variables while controlling others – and then observing the outcome. But, we’ll get into that a little later…

The “Big 3” Variables

Variables can be a little intimidating for new researchers because there are a wide variety of variables, and oftentimes, there are multiple labels for the same thing. To lay a firm foundation, we’ll first look at the three main types of variables, namely:

  • Independent variables (IV)
  • Dependent variables (DV)
  • Control variables

What is an independent variable?

Simply put, the independent variable is the “cause” in the relationship between two (or more) variables. In other words, when the independent variable changes, it has an impact on another variable.

For example:

  • Increasing the dosage of a medication (Variable A) could result in better (or worse) health outcomes for a patient (Variable B)
  • Changing a teaching method (Variable A) could impact the test scores that students earn in a standardised test (Variable B)
  • Varying one’s diet (Variable A) could result in weight loss or gain (Variable B).

It’s useful to know that independent variables can go by a few different names, including, explanatory variables (because they explain an event or outcome) and predictor variables (because they predict the value of another variable). Terminology aside though, the most important takeaway is that independent variables are assumed to be the “cause” in any cause-effect relationship. As you can imagine, these types of variables are of major interest to researchers, as many studies seek to understand the causal factors behind a phenomenon.

What is a dependent variable?

While the independent variable is the “cause”, the dependent variable is the “effect” – or rather, the affected variable. In other words, the dependent variable is the variable that is assumed to change as a result of a change in the independent variable.

Keeping with the previous example, let’s look at some dependent variables in action:

  • Health outcomes (DV) could be impacted by dosage changes of a medication (IV)
  • Students’ scores (DV) could be impacted by teaching methods (IV)
  • Weight gain or loss (DV) could be impacted by diet (IV)

In scientific studies, researchers will typically pay very close attention to the dependent variable (or variables), carefully measuring any changes in response to hypothesised independent variables. This can be tricky in practice, as it’s not always easy to reliably measure specific phenomena or outcomes – or to be certain that the actual cause of the change is in fact the independent variable.

As the adage goes, correlation is not causation. In other words, just because two variables have a relationship doesn’t mean that it’s a causal relationship – they may just happen to vary together. For example, you could find a correlation between the number of people who own a certain brand of car and the number of people who have a certain type of job. Just because these two numbers are correlated, it doesn’t mean that owning that brand of car causes someone to have that type of job or vice versa. The correlation could, for example, be caused by another factor such as income level or age group, which would affect both car ownership and job type.
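To make the car-and-job example concrete, here’s a minimal Python sketch. Everything in it is made up for illustration: `income` is assumed to be the hidden common cause, `a` and `b` are stand-ins for unrelated personal quirks, and the `car` and `job` scores are hypothetical numbers rather than real data. The point is simply that two variables can be strongly correlated overall, yet show no relationship once you only compare people with the same income.

```python
from itertools import product
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical toy data: income (-1 = low, 1 = high) drives BOTH scores.
# a and b are unrelated quirks; every combination appears exactly once.
rows = [(income, 2 * income + a, 2 * income + b)
        for income, a, b in product((-1, 1), repeat=3)]

car = [r[1] for r in rows]  # tendency to own the car brand (assumed score)
job = [r[2] for r in rows]  # tendency to hold the job type (assumed score)
overall_r = pearson(car, job)

# Now look only at people with the same (low) income
low = [r for r in rows if r[0] == -1]
within_low_r = pearson([r[1] for r in low], [r[2] for r in low])

print(round(overall_r, 2))     # strong correlation across everyone
print(round(within_low_r, 2))  # no correlation once income is fixed
```

In this toy setup, the overall correlation comes entirely from income. Real data is never this tidy, so treat this as an illustration of the logic rather than a recipe for analysis.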

To confidently establish a causal relationship between an independent variable and a dependent variable (i.e., X causes Y), you’ll typically need an experimental design, where you have complete control over the environment and the variables of interest. But even so, this doesn’t always translate into the “real world”. Simply put, what happens in the lab sometimes stays in the lab!

As an alternative to pure experimental research, correlational or “quasi-experimental” research (where the researcher cannot manipulate or change variables) can be done on a much larger scale more easily, allowing one to understand specific relationships in the real world. These types of studies also assume some causality between independent and dependent variables, but it’s not always clear. So, if you go this route, you need to be cautious in terms of how you describe the impact and causality between variables and be sure to acknowledge any limitations in your own research.

What is a control variable?

In an experimental design, a control variable (or controlled variable) is a variable that is intentionally held constant to ensure it doesn’t have an influence on any other variables. As a result, this variable remains unchanged throughout the course of the study. In other words, it’s a variable that’s not allowed to vary – tough life 🙂

As we mentioned earlier, one of the major challenges in identifying and measuring causal relationships is that it’s difficult to isolate the impact of the independent variable from the impact of other variables. Simply put, there’s always a risk that there are factors beyond the ones you’re specifically looking at that might be impacting the results of your study. So, to minimise the risk of this, researchers will attempt (as best possible) to hold other variables constant. These factors are then considered control variables.

Some examples of variables that you may need to control include:

  • Temperature
  • Time of day
  • Noise or distractions

Which specific variables need to be controlled for will vary tremendously depending on the research project at hand, so there’s no generic list of control variables to consult. As a researcher, you’ll need to think carefully about all the factors that could vary within your research context and then consider how you’ll go about controlling them. A good starting point is to look at previous studies similar to yours and pay close attention to which variables they controlled for.

Of course, you won’t always be able to control every possible variable, and so, in many cases, you’ll just have to acknowledge their potential impact and account for them in the conclusions you draw. Every study has its limitations, so don’t get fixated or discouraged by troublesome variables. Nevertheless, always think carefully about the factors beyond what you’re focusing on – don’t make assumptions!
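Here’s a small Python sketch of what “holding a variable constant” can look like in practice. The records below are hypothetical (invented teaching methods, times of day and scores), and comparing groups within one time of day is just one simple way to control a variable, assumed here for illustration.

```python
from statistics import mean

# Hypothetical records: (teaching_method, time_of_day, test_score)
records = [
    ("A", "morning", 80), ("A", "morning", 82), ("A", "afternoon", 70),
    ("B", "morning", 81), ("B", "afternoon", 69), ("B", "afternoon", 71),
]

def gap(rows):
    """Mean score of method A minus mean score of method B."""
    a = [score for method, _, score in rows if method == "A"]
    b = [score for method, _, score in rows if method == "B"]
    return mean(a) - mean(b)

# Naive comparison: time of day is free to vary between the groups
naive_gap = gap(records)

# Controlled comparisons: time of day is held constant
morning_gap = gap([r for r in records if r[1] == "morning"])
afternoon_gap = gap([r for r in records if r[1] == "afternoon"])

print(naive_gap, morning_gap, afternoon_gap)
```

In this made-up data, method A looks better in the naive comparison only because more of its classes ran in the morning. Once time of day is held constant, the gap between the methods disappears.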

Other types of variables

As we mentioned, independent, dependent and control variables are the most common variables you’ll come across in your research, but they’re certainly not the only ones you need to be aware of. Next, we’ll look at a few “secondary” variables that you need to keep in mind as you design your research.

  • Moderating variables
  • Mediating variables
  • Confounding variables
  • Latent variables

Let’s jump into it…

What is a moderating variable?

A moderating variable is a variable that influences the strength or direction of the relationship between an independent variable and a dependent variable. In other words, moderating variables affect how much (or how little) the IV affects the DV, or whether the IV has a positive or negative relationship with the DV (i.e., moves in the same or opposite direction).

For example, in a study about the effects of sleep deprivation on academic performance, gender could be used as a moderating variable to see if there are any differences in how men and women respond to a lack of sleep. In such a case, one may find that gender has an influence on how much students’ scores suffer when they’re deprived of sleep.

It’s important to note that while moderators can have an influence on outcomes, they don’t necessarily cause them; rather they modify or “moderate” existing relationships between other variables. This means that it’s possible for two different groups with similar characteristics, but different levels of moderation, to experience very different results from the same experiment or study design.
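A quick way to see moderation is to estimate the IV-DV relationship separately for each level of the moderator. The Python sketch below does this with invented numbers: `hours_lost` is a hypothetical measure of sleep deprivation, and `group_1` and `group_2` are two assumed levels of a moderating variable.

```python
def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: hours of sleep lost (IV) and test scores (DV),
# recorded separately for two levels of the moderator
hours_lost = [0, 1, 2, 3]
scores = {
    "group_1": [80, 78, 76, 74],
    "group_2": [80, 75, 70, 65],
}

# Same IV, same DV, but the strength of the relationship differs by group
slopes = {group: slope(hours_lost, ys) for group, ys in scores.items()}
print(slopes)
```

Both groups lose marks as sleep is lost, but the second group loses them faster. That difference in slope is what a moderating effect looks like in this simplified setup.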

What is a mediating variable?

Mediating variables are often used to explain the relationship between the independent and dependent variable(s). For example, if you were researching the effects of age on job satisfaction, then education level could be considered a mediating variable, as it may explain why older people have higher job satisfaction than younger people – they may have more experience or better qualifications, which lead to greater job satisfaction.

Mediating variables also help researchers understand how different factors interact with each other to influence outcomes. For instance, if you wanted to study the effect of stress on academic performance, then coping strategies might act as a mediating factor: stress levels can shape the coping strategies students adopt, and those strategies in turn influence academic performance. For example, students who respond to stress with effective coping strategies might perform better academically due to their improved mental state.

In addition, mediating variables can provide insight into causal relationships between two variables by helping researchers determine whether changes in one factor directly cause changes in another – or whether there is an indirect relationship between them mediated by some third factor(s). For instance, if you wanted to investigate the impact of parental involvement on student achievement, you would need to consider family dynamics as a potential mediator, since parental involvement could shape family dynamics, which in turn could influence student achievement.
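One common way to picture mediation is as a chain, where the IV shifts the mediator and the mediator shifts the DV. The Python sketch below builds a toy dataset on that assumption. The variables `x`, `m` and `y`, the coefficients, and the noise terms `e` and `u` are all hypothetical, and the relationship is constructed to run entirely through the mediator.

```python
from itertools import product

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical chain: x (IV) -> m (mediator) -> y (DV)
# e and u are unrelated noise terms; every combination appears once.
rows = [(x, x + e, 3 * (x + e) + u)
        for x, e, u in product((-1, 1), repeat=3)]
xs = [r[0] for r in rows]
ms = [r[1] for r in rows]
ys = [r[2] for r in rows]

a = slope(xs, ms)             # path from the IV to the mediator
b = slope(ms, ys)             # path from the mediator to the DV
total_effect = slope(xs, ys)  # overall IV -> DV relationship
indirect_effect = a * b       # the part that runs through the mediator

print(total_effect, indirect_effect)
```

Here the total effect equals the indirect effect, which is what you’d expect when the mediator carries the whole relationship. In real studies the two usually differ, and estimating them properly takes more care than this sketch suggests.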

What is a confounding variable?

A confounding variable (also known as a third variable or lurking variable) is an extraneous factor that can influence the relationship between two variables being studied. Specifically, for a variable to be considered a confounding variable, it needs to meet two criteria:

  • It must be correlated with the independent variable (this can be causal or not)
  • It must have a causal impact on the dependent variable (i.e., influence the DV)

Some common examples of confounding variables include demographic factors such as gender, ethnicity, socioeconomic status, age, education level, and health status. In addition to these, there are also environmental factors to consider. For example, air pollution could confound the impact of the variables of interest in a study investigating health outcomes.

Naturally, it’s important to identify as many confounding variables as possible when conducting your research, as they can heavily distort the results and lead you to draw incorrect conclusions. So, always think carefully about what factors may have a confounding effect on your variables of interest and try to manage these as best you can.

What is a latent variable?

Latent variables are unobservable factors that can influence the behaviour of individuals and explain certain outcomes within a study. They’re also known as hidden or underlying variables, and what makes them rather tricky is that they can’t be directly observed or measured. Instead, latent variables must be inferred from other observable data points such as responses to surveys or experiments.

For example, in a study of mental health, the variable “resilience” could be considered a latent variable. It can’t be directly measured, but it can be inferred from measures of mental health symptoms, stress, and coping mechanisms. The same applies to a lot of concepts we encounter every day – for example:

  • Emotional intelligence
  • Quality of life
  • Business confidence
  • Ease of use

One way in which we overcome the challenge of measuring the immeasurable is latent variable models (LVMs). An LVM is a type of statistical model that describes a relationship between observed variables and one or more unobserved (latent) variables. These models allow researchers to uncover patterns in their data that might otherwise stay hidden because of the data’s complexity and interrelatedness with other variables. Those patterns can then inform hypotheses about previously unknown cause-and-effect relationships among those same variables. Powerful stuff, we say!
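A full latent variable model is beyond a quick snippet, but the basic idea of inferring something unobservable from observable measures can be sketched with a much cruder stand-in: a composite score. The Python example below is not an LVM. It simply standardises two hypothetical survey measures (`coping` and `stress`, with invented values), reverse-codes stress, and averages them into an assumed “resilience” index.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardise values to mean 0 and standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical survey results for four respondents
coping = [1, 2, 3, 4]   # higher = better coping
stress = [8, 6, 4, 2]   # higher = more stress

coping_z = zscores(coping)
# Reverse-code stress so that higher always means "more resilient"
stress_z = [-z for z in zscores(stress)]

# Composite index: the average of the standardised indicators
resilience = [mean(pair) for pair in zip(coping_z, stress_z)]
print([round(score, 2) for score in resilience])
```

The index ranks respondents on the unobserved trait using only what was actually measured. Proper LVMs go further by modelling how strongly each indicator relates to the latent variable, rather than weighting them equally as this sketch does.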

Let’s recap

In the world of scientific research, there’s no shortage of variable types, some of which have multiple names and some of which overlap with each other. In this post, we’ve covered some of the popular ones, but remember that this is not an exhaustive list.

To recap, we’ve explored:

  • Independent variables (the “cause”)
  • Dependent variables (the “effect”)
  • Control variables (the variable that’s not allowed to vary)


Social Sci LibreTexts

2.2: Concepts, Constructs, and Variables


  • Anol Bhattacherjee
  • University of South Florida via Global Text Project

We discussed in Chapter 1 that although research can be exploratory, descriptive, or explanatory, most scientific research tends to be of the explanatory type in that it searches for potential explanations of observed natural or social phenomena. Explanations require development of concepts or generalizable properties or characteristics associated with objects, events, or people. While objects such as a person, a firm, or a car are not concepts, their specific characteristics or behavior such as a person’s attitude toward immigrants, a firm’s capacity for innovation, and a car’s weight can be viewed as concepts.

Knowingly or unknowingly, we use different kinds of concepts in our everyday conversations. Some of these concepts have been developed over time through our shared language. Sometimes, we borrow concepts from other disciplines or languages to explain a phenomenon of interest. For instance, the idea of gravitation borrowed from physics can be used in business to describe why people tend to “gravitate” to their preferred shopping destinations. Likewise, the concept of distance can be used to explain the degree of social separation between two otherwise collocated individuals. Sometimes, we create our own concepts to describe a unique characteristic not described in prior research. For instance, technostress is a new concept referring to the mental stress one may face when asked to learn a new technology.

Concepts may also have progressive levels of abstraction. Some concepts such as a person’s weight are precise and objective, while other concepts such as a person’s personality may be more abstract and difficult to visualize. A construct is an abstract concept that is specifically chosen (or “created”) to explain a given phenomenon. A construct may be a simple concept, such as a person’s weight, or a combination of a set of related concepts such as a person’s communication skill, which may consist of several underlying concepts such as the person’s vocabulary, syntax, and spelling. The former instance (weight) is a unidimensional construct, while the latter (communication skill) is a multi-dimensional construct (i.e., it consists of multiple underlying concepts). The distinction between constructs and concepts is clearer in multi-dimensional constructs, where the higher order abstraction is called a construct and the lower order abstractions are called concepts. However, this distinction tends to blur in the case of unidimensional constructs.

Constructs used for scientific research must have precise and clear definitions that others can use to understand exactly what they mean and what they do not mean. For instance, a seemingly simple construct such as income may refer to monthly or annual income, before-tax or after-tax income, and personal or family income, and is therefore neither precise nor clear. There are two types of definitions: dictionary definitions and operational definitions. In the more familiar dictionary definition, a construct is often defined in terms of a synonym. For instance, attitude may be defined as a disposition, a feeling, or an affect, and affect in turn is defined as an attitude. Such definitions of a circular nature are not particularly useful in scientific research for elaborating the meaning and content of that construct. Scientific research requires operational definitions that define constructs in terms of how they will be empirically measured. For instance, the operational definition of a construct such as temperature must specify whether we plan to measure temperature in Celsius, Fahrenheit, or Kelvin scale. A construct such as income should be defined in terms of whether we are interested in monthly or annual income, before-tax or after-tax income, and personal or family income. One can imagine that constructs such as learning, personality, and intelligence can be quite hard to define operationally.
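The temperature example can be illustrated with a short Python sketch. The function names and the sample reading are hypothetical choices made for this illustration; the conversion formulas themselves are the standard ones. The sketch shows why an operational definition must name the scale: the same physical state produces three different numbers.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9 / 5 + 32

def celsius_to_kelvin(celsius):
    """Convert a Celsius reading to Kelvin."""
    return celsius + 273.15

# One hypothetical reading, expressed under three operational definitions
reading_celsius = 25.0
readings = {
    "celsius": reading_celsius,
    "fahrenheit": celsius_to_fahrenheit(reading_celsius),
    "kelvin": celsius_to_kelvin(reading_celsius),
}

for scale, value in readings.items():
    print(f"{scale}: {value:.2f}")
```

A recorded value of 25 is uninterpretable without the scale attached to it, which is the sense in which a construct without an operational definition is neither precise nor clear.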


A term frequently associated with, and sometimes used interchangeably with, a construct is a variable. Etymologically speaking, a variable is a quantity that can vary (e.g., from low to high, negative to positive, etc.), in contrast to constants that do not vary (i.e., remain constant). However, in scientific research, a variable is a measurable representation of an abstract construct. As abstract entities, constructs are not directly measurable, and hence, we look for proxy measures called variables. For instance, a person’s intelligence is often measured as his or her IQ (intelligence quotient) score, which is an index generated from an analytical and pattern-matching test administered to people. In this case, intelligence is a construct, and IQ score is a variable that measures the intelligence construct. Whether IQ scores truly measure one’s intelligence is anyone’s guess (though many believe that they do), and depending on how well it measures intelligence, the IQ score may be a good or a poor measure of the intelligence construct. As shown in Figure 2.1, scientific research proceeds along two planes: a theoretical plane and an empirical plane. Constructs are conceptualized at the theoretical (abstract) plane, while variables are operationalized and measured at the empirical (observational) plane. Thinking like a researcher implies the ability to move back and forth between these two planes.

Depending on their intended use, variables may be classified as independent, dependent, moderating, mediating, or control variables. Variables that explain other variables are called independent variables, those that are explained by other variables are dependent variables, those that are explained by independent variables while also explaining dependent variables are mediating variables (or intermediate variables), and those that influence the relationship between independent and dependent variables are called moderating variables. As an example, if we state that higher intelligence causes improved learning among students, then intelligence is an independent variable and learning is a dependent variable. There may be other extraneous variables that are not pertinent to explaining a given dependent variable, but may have some impact on the dependent variable. These variables must be controlled for in a scientific study, and are therefore called control variables.


To understand the differences between these different variable types, consider the example shown in Figure 2.2. If we believe that intelligence influences (or explains) students’ academic achievement, then a measure of intelligence such as an IQ score is an independent variable, while a measure of academic success such as grade point average is a dependent variable. If we believe that the effect of intelligence on academic achievement also depends on the effort invested by the student in the learning process (i.e., between two equally intelligent students, the student who puts in more effort achieves higher academic achievement than one who puts in less effort), then effort becomes a moderating variable. Incidentally, one may also view effort as an independent variable and intelligence as a moderating variable. If academic achievement is viewed as an intermediate step to higher earning potential, then earning potential becomes the dependent variable for the independent variable academic achievement, and academic achievement becomes the mediating variable in the relationship between intelligence and earning potential. Hence, variables are defined as independent, dependent, moderating, or mediating based on the nature of their association with each other. The overall network of relationships between a set of related constructs is called a nomological network (see Figure 2.2). Thinking like a researcher requires not only being able to abstract constructs from observations, but also being able to mentally visualize a nomological network linking these abstract constructs.
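The relationships described in this example can be encoded as a small directed graph, from which each variable’s role follows mechanically. The Python sketch below is one hypothetical encoding, not a standard representation: `causal_paths`, `moderators`, and `classify` are names invented for this illustration, and the classification rule (a variable that both explains and is explained is mediating) is a simplification of the definitions given above.

```python
# Hypothetical encoding of the relationships described for Figure 2.2.
# Each pair reads "first variable explains second variable".
causal_paths = [
    ("intelligence", "academic_achievement"),
    ("academic_achievement", "earning_potential"),
]
# Moderators are attached to the relationship they influence
moderators = {("intelligence", "academic_achievement"): ["effort"]}

def classify(paths, moderators):
    """Assign each variable a role based on its position in the network."""
    causes = {source for source, _ in paths}
    effects = {target for _, target in paths}
    roles = {}
    for name in causes | effects:
        if name in causes and name in effects:
            roles[name] = "mediating"    # explained by one, explains another
        elif name in causes:
            roles[name] = "independent"  # only explains
        else:
            roles[name] = "dependent"    # only explained
    for names in moderators.values():
        for name in names:
            roles[name] = "moderating"   # influences a relationship
    return roles

roles = classify(causal_paths, moderators)
for name in sorted(roles):
    print(f"{name}: {roles[name]}")
```

Note that the roles depend on the network as a whole. If the second path were removed, academic achievement would be classified as dependent rather than mediating, which mirrors the point that a variable’s type is defined by its association with other variables.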

Organizing Your Social Sciences Research Paper

Theoretical Framework

Theories are formulated to explain, predict, and understand phenomena and, in many cases, to challenge and extend existing knowledge within the limits of critical bounded assumptions or predictions of behavior. The theoretical framework is the structure that can hold or support a theory of a research study. The theoretical framework encompasses not just the theory, but the narrative explanation about how the researcher engages in using the theory and its underlying assumptions to investigate the research problem. It is the structure of your paper that summarizes concepts, ideas, and theories derived from prior research studies, which are synthesized in order to form a conceptual basis for your analysis and interpretation of meaning found within your research.

Abend, Gabriel. "The Meaning of Theory." Sociological Theory 26 (June 2008): 173–199; Kivunja, Charles. "Distinguishing between Theory, Theoretical Framework, and Conceptual Framework: A Systematic Review of Lessons from the Field." International Journal of Higher Education 7 (December 2018): 44-53; Swanson, Richard A. Theory Building in Applied Disciplines. San Francisco, CA: Berrett-Koehler Publishers, 2013; Varpio, Lara, Elise Paradis, Sebastian Uijtdehaage, and Meredith Young. "The Distinctions between Theory, Theoretical Framework, and Conceptual Framework." Academic Medicine 95 (July 2020): 989-994.

Importance of Theory and a Theoretical Framework

Theories can be challenging for the beginning researcher because they are rarely applied in high school social studies curriculum and, as a result, can come across as unfamiliar and imprecise when first introduced as part of a writing assignment. However, in its most simplified form, a theory is simply a set of assumptions or predictions about something you think will happen based on existing evidence and that can be tested to see if those outcomes turn out to be true. Of course, it is slightly more deliberate than that. Summarized from Kivunja (2018, p. 46), here are the essential characteristics of a theory.

  • It is logical and coherent
  • It has clear definitions of terms or variables, and has boundary conditions [i.e., it is not an open-ended statement]
  • It has a domain where it applies
  • It has clearly described relationships among variables
  • It describes, explains, and makes specific predictions
  • It comprises concepts, themes, principles, and constructs
  • It must have been based on empirical data [i.e., it is not a guess]
  • It must have made claims that are subject to testing, been tested and verified
  • It must be clear and concise
  • Its assertions or predictions must be different and better than those in existing theories
  • Its predictions must be general enough to be applicable to and understood within multiple contexts
  • Its assertions or predictions are relevant, and if applied as predicted, will result in the predicted outcome
  • The assertions and predictions are not immutable, but subject to revision and improvement as researchers use the theory to make sense of phenomena
  • Its concepts and principles explain what is going on and why
  • Its concepts and principles are substantive enough to enable us to predict a future

Given these characteristics, a theory can best be understood as the foundation from which you investigate assumptions or predictions derived from previous studies about the research problem, but in a way that leads to new knowledge and understanding as well as, in some cases, discovering how to improve the relevance of the theory itself or to argue that the theory is outdated and a new theory needs to be formulated based on new evidence.

A theoretical framework consists of concepts, together with their definitions and references to relevant scholarly literature, and the existing theory that is used for your particular study. The theoretical framework must demonstrate an understanding of theories and concepts that are relevant to the topic of your research paper and that relate to the broader areas of knowledge being considered.

The theoretical framework is most often not something readily found within the literature. You must review course readings and pertinent research studies for theories and analytic models that are relevant to the research problem you are investigating. The selection of a theory should depend on its appropriateness, ease of application, and explanatory power.

The theoretical framework strengthens the study in the following ways:

  • An explicit statement of theoretical assumptions permits the reader to evaluate them critically.
  • The theoretical framework connects the researcher to existing knowledge. Guided by a relevant theory, you are given a basis for your hypotheses and choice of research methods.
  • Articulating the theoretical assumptions of a research study forces you to address questions of why and how. It permits you to intellectually transition from simply describing a phenomenon you have observed to generalizing about various aspects of that phenomenon.
  • Having a theory helps you identify the limits to those generalizations. A theoretical framework specifies which key variables influence a phenomenon of interest and highlights the need to examine how those key variables might differ and under what circumstances.
  • The theoretical framework adds context around the theory itself based on how scholars had previously tested the theory in relation to their overall research design [i.e., purpose of the study, methods of collecting data or information, methods of analysis, the time frame in which information is collected, study setting, and the methodological strategy used to conduct the research].

By virtue of its applicative nature, good theory in the social sciences is of value precisely because it fulfills one primary purpose: to explain the meaning, nature, and challenges associated with a phenomenon, often experienced but unexplained in the world in which we live, so that we may use that knowledge and understanding to act in more informed and effective ways.

The Conceptual Framework. College of Education. Alabama State University; Corvellec, Hervé, ed. What is Theory?: Answers from the Social and Cultural Sciences. Stockholm: Copenhagen Business School Press, 2013; Asher, Herbert B. Theory-Building and Data Analysis in the Social Sciences. Knoxville, TN: University of Tennessee Press, 1984; Drafting an Argument. Writing@CSU. Colorado State University; Kivunja, Charles. "Distinguishing between Theory, Theoretical Framework, and Conceptual Framework: A Systematic Review of Lessons from the Field." International Journal of Higher Education 7 (2018): 44-53; Omodan, Bunmi Isaiah. "A Model for Selecting Theoretical Framework through Epistemology of Research Paradigms." African Journal of Inter/Multidisciplinary Studies 4 (2022): 275-285; Ravitch, Sharon M. and Matthew Riggan. Reason and Rigor: How Conceptual Frameworks Guide Research. Second edition. Los Angeles, CA: SAGE, 2017; Trochim, William M.K. Philosophy of Research. Research Methods Knowledge Base. 2006; Jarvis, Peter. The Practitioner-Researcher. Developing Theory from Practice. San Francisco, CA: Jossey-Bass, 1999.

Strategies for Developing the Theoretical Framework

I.  Developing the Framework

Here are some strategies to develop an effective theoretical framework:

  • Examine your thesis title and research problem . The research problem anchors your entire study and forms the basis from which you construct your theoretical framework.
  • Brainstorm about what you consider to be the key variables in your research . Answer the question, "What factors contribute to the presumed effect?"
  • Review related literature to find how scholars have addressed your research problem. Identify the assumptions from which the author(s) addressed the problem.
  • List  the constructs and variables that might be relevant to your study. Group these variables into independent and dependent categories.
  • Review key social science theories that are introduced to you in your course readings and choose the theory that can best explain the relationships between the key variables in your study [note the Writing Tip on this page].
  • Discuss the assumptions or propositions of this theory and point out their relevance to your research.

A theoretical framework is used to limit the scope of the relevant data by focusing on specific variables and defining the specific viewpoint [framework] that the researcher will take in analyzing and interpreting the data to be gathered. It also facilitates the understanding of concepts and variables according to given definitions and builds new knowledge by validating or challenging theoretical assumptions.

II.  Purpose

Think of theories as the conceptual basis for understanding, analyzing, and designing ways to investigate relationships within social systems. To that end, the following roles served by a theory can help guide the development of your framework.

  • Means by which new research data can be interpreted and coded for future use,
  • Response to new problems that have no previously identified solution strategy,
  • Means for identifying and defining research problems,
  • Means for prescribing or evaluating solutions to research problems,
  • Ways of discerning which facts among the accumulated knowledge are important and which are not,
  • Means of giving old data new interpretations and new meaning,
  • Means by which to identify important new issues and prescribe the most critical research questions that need to be answered to maximize understanding of the issue,
  • Means of providing members of a professional discipline with a common language and a frame of reference for defining the boundaries of their profession, and
  • Means to guide and inform research so that it can, in turn, guide research efforts and improve professional practice.

Adapted from: Torraco, R. J. “Theory-Building Research Methods.” In Swanson R. A. and E. F. Holton III, editors. Human Resource Development Handbook: Linking Research and Practice. (San Francisco, CA: Berrett-Koehler, 1997): pp. 114-137; Jaccard, James and Jacob Jacoby. Theory Construction and Model-Building Skills: A Practical Guide for Social Scientists. New York: Guilford, 2010; Ravitch, Sharon M. and Matthew Riggan. Reason and Rigor: How Conceptual Frameworks Guide Research. Second edition. Los Angeles, CA: SAGE, 2017; Sutton, Robert I. and Barry M. Staw. “What Theory is Not.” Administrative Science Quarterly 40 (September 1995): 371-384.

Structure and Writing Style

The theoretical framework may be rooted in a specific theory , in which case, your work is expected to test the validity of that existing theory in relation to specific events, issues, or phenomena. Many social science research papers fit into this rubric. For example, Peripheral Realism Theory, which categorizes perceived differences among nation-states as those that give orders, those that obey, and those that rebel, could be used as a means for understanding conflicted relationships among countries in Africa. A test of this theory could be the following: Does Peripheral Realism Theory help explain intra-state actions, such as, the disputed split between southern and northern Sudan that led to the creation of two nations?

However, you may not always be asked by your professor to test a specific theory in your paper, but to develop your own framework from which your analysis of the research problem is derived . Based upon the above example, it is perhaps easiest to understand the nature and function of a theoretical framework if it is viewed as an answer to two basic questions:

  • What is the research problem/question? [e.g., "How should the individual and the state relate during periods of conflict?"]
  • Why is your approach a feasible solution? [i.e., justify the application of your choice of a particular theory and explain why alternative constructs were rejected. I could choose instead to test Instrumentalist or Circumstantialist models developed among ethnic conflict theorists that rely upon socio-economic-political factors to explain individual-state relations and to apply this theoretical model to periods of war between nations].

The answers to these questions come from a thorough review of the literature and your course readings [summarized and analyzed in the next section of your paper] and the gaps in the research that emerge from the review process. With this in mind, a complete theoretical framework will likely not emerge until after you have completed a thorough review of the literature .

Just as a research problem in your paper requires contextualization and background information, a theory requires a framework for understanding its application to the topic being investigated. When writing and revising this part of your research paper, keep in mind the following:

  • Clearly describe the framework, concepts, models, or specific theories that underpin your study . This includes noting who the key theorists are in the field who have conducted research on the problem you are investigating and, when necessary, the historical context that supports the formulation of that theory. This latter element is particularly important if the theory is relatively unknown or it is borrowed from another discipline.
  • Position your theoretical framework within a broader context of related frameworks, concepts, models, or theories . As noted in the example above, there will likely be several concepts, theories, or models that can be used to help develop a framework for understanding the research problem. Therefore, note why the theory you've chosen is the appropriate one.
  • The present tense is used when writing about theory. Although the past tense can be used to describe the history of a theory or the role of key theorists, the construction of your theoretical framework is happening now.
  • You should make your theoretical assumptions as explicit as possible . Later, your discussion of methodology should be linked back to this theoretical framework.
  • Don’t just take what the theory says as a given! Reality is never accurately represented in such a simplistic way; if you imply that it can be, you fundamentally distort a reader's ability to understand the findings that emerge. Given this, always note the limitations of the theoretical framework you've chosen [i.e., what parts of the research problem require further investigation because the theory inadequately explains a certain phenomenon].

The Conceptual Framework. College of Education. Alabama State University; Conceptual Framework: What Do You Think is Going On? College of Engineering. University of Michigan; Drafting an Argument. Writing@CSU. Colorado State University; Lynham, Susan A. “The General Method of Theory-Building Research in Applied Disciplines.” Advances in Developing Human Resources 4 (August 2002): 221-241; Tavallaei, Mehdi and Mansor Abu Talib. "A General Perspective on the Role of Theory in Qualitative Research." Journal of International Social Research 3 (Spring 2010); Ravitch, Sharon M. and Matthew Riggan. Reason and Rigor: How Conceptual Frameworks Guide Research . Second edition. Los Angeles, CA: SAGE, 2017; Reyes, Victoria. Demystifying the Journal Article. Inside Higher Education; Trochim, William M.K. Philosophy of Research. Research Methods Knowledge Base. 2006; Weick, Karl E. “The Work of Theorizing.” In Theorizing in Social Science: The Context of Discovery . Richard Swedberg, editor. (Stanford, CA: Stanford University Press, 2014), pp. 177-194.

Writing Tip

Borrowing Theoretical Constructs from Other Disciplines

An increasingly important trend in the social and behavioral sciences is to think about and attempt to understand research problems from an interdisciplinary perspective. One way to do this is to not rely exclusively on the theories developed within your particular discipline, but to think about how an issue might be informed by theories developed in other disciplines. For example, if you are a political science student studying the rhetorical strategies used by female incumbents in state legislature campaigns, theories about the use of language could be derived, not only from political science, but linguistics, communication studies, philosophy, psychology, and, in this particular case, feminist studies. Building theoretical frameworks based on the postulates and hypotheses developed in other disciplinary contexts can be both enlightening and an effective way to be more engaged in the research topic.

CohenMiller, A. S. and P. Elizabeth Pate. "A Model for Developing Interdisciplinary Research Theoretical Frameworks." The Qualitative Researcher 24 (2019): 1211-1226; Frodeman, Robert. The Oxford Handbook of Interdisciplinarity . New York: Oxford University Press, 2010.

Another Writing Tip

Don't Undertheorize!

Do not leave the theory hanging out there in the introduction never to be mentioned again. Undertheorizing weakens your paper. The theoretical framework you describe should guide your study throughout the paper. Be sure to always connect theory to the review of pertinent literature and to explain in the discussion part of your paper how the theoretical framework you chose supports analysis of the research problem or, if appropriate, how the theoretical framework was found to be inadequate in explaining the phenomenon you were investigating. In that case, don't be afraid to propose your own theory based on your findings.

Yet Another Writing Tip

What's a Theory? What's a Hypothesis?

The terms theory and hypothesis are often used interchangeably in newspapers and popular magazines and in non-academic settings. However, the difference between theory and hypothesis in scholarly research is important, particularly when using an experimental design. A theory is a well-established principle that has been developed to explain some aspect of the natural world. Theories arise from repeated observation and testing and incorporate facts, laws, predictions, and tested assumptions that are widely accepted [e.g., rational choice theory; grounded theory; critical race theory].

A hypothesis is a specific, testable prediction about what you expect to happen in your study. For example, an experiment designed to look at the relationship between study habits and test anxiety might have a hypothesis that states, "We predict that students with better study habits will suffer less test anxiety." Unless your study is exploratory in nature, your hypothesis should always explain what you expect to happen during the course of your research.

The key distinctions are:

  • A theory predicts events in a broad, general context;  a hypothesis makes a specific prediction about a specified set of circumstances.
  • A theory has been extensively tested and is generally accepted among a set of scholars; a hypothesis is a speculative guess that has yet to be tested.

Cherry, Kendra. Introduction to Research Methods: Theory and Hypothesis. About.com Psychology; Gezae, Michael et al. Welcome Presentation on Hypothesis. Slideshare presentation.

Still Yet Another Writing Tip

Be Prepared to Challenge the Validity of an Existing Theory

Theories are meant to be tested and their underlying assumptions challenged; they are not rigid or intransigent, but are meant to set forth general principles for explaining phenomena or predicting outcomes. Given this, testing theoretical assumptions is an important way that knowledge in any discipline develops and grows. If you're asked to apply an existing theory to a research problem, the analysis will likely include the expectation by your professor that you should offer modifications to the theory based on your research findings.

Indications that theoretical assumptions may need to be modified can include the following:

  • Your findings suggest that the theory does not explain or account for current conditions or circumstances or the passage of time,
  • The study reveals a finding that is incompatible with what the theory attempts to explain or predict, or
  • Your analysis reveals that the theory overly generalizes behaviors or actions without taking into consideration specific factors revealed from your analysis [e.g., factors related to culture, nationality, history, gender, ethnicity, age, geographic location, legal norms or customs , religion, social class, socioeconomic status, etc.].

Philipsen, Kristian. "Theory Building: Using Abductive Search Strategies." In Collaborative Research Design: Working with Business for Meaningful Findings . Per Vagn Freytag and Louise Young, editors. (Singapore: Springer Nature, 2018), pp. 45-71; Shepherd, Dean A. and Roy Suddaby. "Theory Building: A Review and Integration." Journal of Management 43 (2017): 59-86.

  • Last Updated: Apr 19, 2024 11:16 AM
  • URL: https://libguides.usc.edu/writingguide


Key Concepts in Quantitative Research

In this module, we are going to explore the nuances of quantitative research, including the main types of quantitative research, more exploration into variables (including confounding and extraneous variables), and causation.

Content includes:

  • Flaws, “Proof”, and Rigor
  • The Steps of Quantitative Methodology
  • Major Classes of Quantitative Research
  • Experimental versus Non-Experimental Research
  • Types of Experimental Research
  • Types of Non-Experimental Research
  • Research Variables
  • Confounding/Extraneous Variables
  • Causation versus correlation/association


  • Discuss the flaws, proof, and rigor in research.
  • Describe the differences between independent variables and dependent variables.
  • Describe the steps in quantitative research methodology.
  • Describe experimental, quasi-experimental, and non-experimental research studies
  • Describe confounding and extraneous variables.
  • Differentiate cause-and-effect (causality) versus association/correlation

Flaws, Proof, and Rigor in Research

One of the biggest hurdles for students and seasoned researchers alike is grasping that research cannot “prove” or “disprove” anything. Research can only support a hypothesis with reasonable, statistically significant evidence.

Indeed. You’ve heard it incorrectly your entire life. You will hear professors, scientists, radio ads, podcasts, and even researchers comment something to the effect of, “It has been proven that…” or “Research proves that…” or “Finally! There is proof that…”

We have been duped. Consider the “prove” word a very bad word in this course. The forbidden “P” word. Do not say it, write it, allude to it, or repeat it. And, for the love of avocados and all things fluffy, do not include the “P” word on your EBP poster. You will be deducted some major points.

We can only conclude with reasonable certainty through statistical analyses that there is a high probability that something did not happen by chance but instead happened due to the intervention that the researcher tested. Got that? We will come back to that concept but for now know that it is called “statistical significance”.
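To make "did not happen by chance" a little more concrete, here is a minimal sketch of one way to estimate that probability: a permutation test. It repeatedly shuffles the group labels and counts how often a difference at least as large as the observed one shows up purely by chance. The pain scores and the function name `permutation_p_value` are hypothetical, made up for illustration, and a real study would pick the statistical test that fits its design.

```python
import random

def average(values):
    return sum(values) / len(values)

def permutation_p_value(treated, control, n_shuffles=10_000, seed=0):
    """Estimate how often a mean difference at least as large as the
    observed one appears when group labels are shuffled at random."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(average(treated) - average(control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # pretend group membership was arbitrary
        diff = abs(average(pooled[:n_treated]) - average(pooled[n_treated:]))
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles

# Hypothetical pain scores (0-10) after an intervention vs. standard care.
treated = [2, 3, 3, 4, 2, 3, 1, 2]
control = [5, 6, 4, 6, 5, 7, 5, 6]

p_value = permutation_p_value(treated, control)
print(f"Estimated p-value: {p_value:.4f}")
```

A small p-value supports the hypothesis that the intervention made a difference. Notice that it still does not "prove" anything.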

All research has flaws. We might not know what those flaws are, but we will be learning about confounding and extraneous variables later on in this module to help explain how flaws can happen.

Remember this: Sometimes, the researcher might not even know that there was a flaw that occurred. No research project is perfect. There is no 100% awesome. This is a major reason why it is so important to be able to duplicate a research project and obtain similar results. The more we can duplicate research with the same exact methodology and protocols, the more certainty we have in the results and we can start accounting for flaws that may have sneaked in.

Finally, not all research is equal. Some research is done very sloppily, and other research has a very high standard of rigor. How do we know which is which when reading an article? Well, within this module, we will start learning about some things to look for in a published research article to help determine rigor. We do not want lazy research to determine our actions as nurses, right? We want the strongest, most reliable, most valid, most rigorous research evidence possible so that we can take those results and embed them into patient care. Who wants shoddy evidence determining the actions we take with your grandmother’s heart surgery?

Independent Variables and Dependent Variables

As we have already learned, research involves measures called “variables”. This will be a bit of a review, but it is important to bring up again, as variables are a hallmark of quantitative research. In quantitative studies, the concepts being measured are called variables (AKA: something that varies). Variables are something that can change – either by manipulation or from something causing a change. In the article snapshots that we have looked at, researchers are trying to find causes for phenomena. Does a nursing intervention cause an improvement in patient outcomes? Does the cholesterol medication cause a decrease in cholesterol level? Does smoking cause cancer?

The presumed cause is called the independent variable. The presumed effect is called the dependent variable. The dependent variable is “dependent” on something causing it to change. The dependent variable is the outcome that a researcher is trying to understand, explain, or predict.

Think back to our PICO questions. You can think of the intervention (I) as the independent variable and the outcome (O) as the dependent variable.

The independent variable is manipulated by the researcher or can be variants of influence, whereas the dependent variable is never manipulated.


Variables do not always measure cause-and-effect. They can also measure a direction of influence.

Here is an example of that: If we compared levels of depression among men and women diagnosed with pancreatic cancer and found men to be more depressed, we cannot conclude that depression was caused by gender. However, we can note that the direction of influence clearly runs from gender to depression. It makes no sense to suggest the depression influenced their gender.

In the above example, what is the independent variable (IV) and what is the dependent variable (DV)? If you guessed gender as the IV and depression as the DV, you are correct! It is important to note in this case that the researcher did not manipulate the IV; instead, the IV varies on its own (male or female).

Researchers do not always have just one IV. In some cases, more than one IV may be measured. Take, for instance, a study that wants to measure the factors that influence one’s study habits. Independent variables of gender, sleep habits, and hours of work may be considered. Likewise, multiple DVs can be measured. For example, perhaps we want to measure weight and abdominal girth on a plant-based diet (IV).

Now, some studies do not have an intervention. We will come back to that when we talk about non-experimental research.

The point of variables is so that researchers have a very specific measurement that they seek to study.


Let’s look at a couple of examples:

Now you try! Identify the IVs and DVs:

IV and DV Case Studies (Leibold, 2020)

Case Three: Independent variable: Healthy Lifestyle education with a focus on physical activity; Dependent variable: Physical activity rate before and after education intervention, Heart rate before and after education intervention, Blood pressures before and after education intervention.

Case Four: Independent variable: Playing classical music; Dependent variable: Grade point averages post classical music, compared to pre-classical music.

Case Five: Independent variable: No independent variable as there is no intervention. Dependent variable: The themes that emerge from the qualitative data.

The Steps in Quantitative Research Methodology

Now, as we learned in the last module, quantitative research is completely objective. There is no subjectivity to it. Why is this? Well, as we have learned, the purpose of quantitative research is to make an inference about the results in order to generalize these results to the population.

In quantitative studies, there is a very systematic approach that moves from the beginning point of the study (writing a research question) to the end point (obtaining an answer). This is a very linear and purposeful flow across the study, and all quantitative research should follow the same sequence.

  • Identifying a problem and formulating a research question. Quantitative research begins with a theory. As in, “something is wrong and we want to fix it or improve it”. Think back to when we discussed research problems and formulating a research question. Here we are! That is the first step in formulating a quantitative research plan.
  • Formulate a hypothesis. This step is key. Researchers need to know exactly what they are testing so that testing the hypothesis can be achieved through specific statistical analyses.
  • A thorough literature review. At this step, researchers strive to understand what is already known about a topic and what evidence already exists.
  • Identifying a framework. When an appropriate framework is identified, the findings of a study may have broader significance and utility (Polit & Beck, 2021).
  • Choosing a study design. The research design will determine exactly how the researcher will obtain the answers to the research question(s). The entire design needs to be structured and controlled, with the overarching goal of minimizing bias and errors. The design determines what data will be collected and how, how often data will be collected, and what types of comparisons will be made. You can think of the study design as the architectural backbone of the entire study.
  • Sampling. The researcher needs to determine a subset of the population that is to be studied. We will come back to the sampling concept in the next module. However, the goal of sampling is to choose a subset of the population that adequately reflects the population of interest.
  • Instruments to be used to collect data (with reliability and validity as a priority). Researchers must find a way to measure the research variables (intervention and outcome) accurately. The task of measuring is complex and challenging, as data needs to be collected reliably (measuring consistently each time) and validly. Reliability and validity are both about how well a method measures something. The next module will cover this in detail.
  • Obtaining approval for ethical/legal human rights procedures. As we will learn in an upcoming module, there need to be methods in place to safeguard human rights.
  • Data collection. The fun part! Finally, after everything has been organized and planned, the researcher(s) begin to collect data. The pre-established plan (methodology) determines when data collection begins, how to accomplish it, how data collection staff will be trained, and how data will be recorded.
  • Data analysis. Here come the statistical analyses. The next module will dive into this.
  • Discussion. After all the analyses have been completed, the researcher then needs to interpret the results and examine the implications. Researchers attempt to explain the findings in light of the theoretical framework, prior evidence, theory, clinical experience, and any limitations in the study now that it has been completed. Often, the researcher discusses not just the statistical significance, but also the clinical significance, as it is common to have one without the other.
  • Summary/references. Part of the final steps of any research project is to disseminate (AKA: share) the findings. This may be in a published article, conference, poster session, etc. The point of this step is to communicate to others the information found through the study. All references are collected so that the researchers can give credit to others.
  • Budget and funding. As a last mention in the overall steps, budget and funding for research is a consideration. Research can be expensive. Often, researchers can obtain a grant or other funding to help offset the costs.


Edit: Steps in Quantitative Research video. Step 12 should say “Dissemination” (sharing the results).

Experimental, Quasi-Experimental, and Non-Experimental Studies

To start this section, please watch this wonderful video by Jenny Barrow, MSN, RN, CNE, that explains experimental versus nonexperimental research.

(Jenny Barrow, 2019)

Now that you have that overview, continue reading this module.

Experimental Research : In experimental research, the researcher is seeking to draw a conclusion between an independent variable and a dependent variable. This design attempts to establish cause-effect relationships among the variables. You could think of experimental research as experimenting with “something” to see if it caused “something else”.

A true experiment is called a Randomized Controlled Trial (or RCT). An RCT is at the top of the echelon as far as quantitative experimental research. It’s the gold standard of scientific research. An RCT, a true experimental design, must have 3 features:

  • An intervention : The experimenter does something to the participants by manipulating the independent variable.
  • Control : Some participants in the study receive either the standard care, or no intervention at all. This is also called the counterfactual – meaning, it shows what would happen if no intervention was introduced.
  • Randomization : Randomization happens when the researcher makes sure that it is completely random who receives the intervention and who receives the control. The purpose is to make the groups equal regarding all other factors except receipt of the intervention.

Note: There is a lot of confusion with students (and even some researchers!) when they refer to “random assignment” versus “random sampling”. Random assignment is a signature of a true experiment. This means that if participants are not truly randomly assigned to intervention groups, then it is not a true experiment. We will talk more about random sampling in the next module.
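Since these two terms get mixed up so often, here is a small sketch that shows them side by side. The population of student IDs, the sample size of 40, and the variable names are all hypothetical, chosen only for illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 1,000 student IDs.
population = [f"student_{i:04d}" for i in range(1000)]

# Random SAMPLING: chance decides WHO takes part in the study.
sample = rng.sample(population, k=40)

# Random ASSIGNMENT: chance decides WHICH GROUP each participant joins.
shuffled = sample[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
intervention_group = shuffled[:half]
control_group = shuffled[half:]

print(len(intervention_group), "participants receive the intervention")
print(len(control_group), "participants receive the control condition")
```

A study can use one without the other. It is the second step, random assignment, that makes a design a true experiment.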

One very common method for RCTs is called a pretest-posttest design. This is when the researcher measures the outcome before and after the intervention. For example, if the researcher had an IV (intervention/treatment) of a pain medication, the DV (pain) would be measured before the intervention is given and after it is given. The control group may just receive a placebo. This design permits the researcher to see if the change in pain was caused by the pain medication because only some people received it (Polit & Beck, 2021).
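The arithmetic behind a pretest-posttest comparison can be sketched in a few lines. The pain scores below are hypothetical and the helper name `mean_change` is made up for this example. A real analysis would also test whether the difference is statistically significant.

```python
def mean_change(pre, post):
    """Average within-person change (posttest minus pretest)."""
    changes = [after - before for before, after in zip(pre, post)]
    return sum(changes) / len(changes)

# Hypothetical pain scores (0-10 scale), one pre/post pair per participant.
medication_pre = [7, 8, 6, 9, 7]
medication_post = [3, 4, 2, 5, 4]
placebo_pre = [7, 7, 8, 6, 8]
placebo_post = [6, 7, 7, 6, 7]

medication_change = mean_change(medication_pre, medication_post)
placebo_change = mean_change(placebo_pre, placebo_post)

# How much more did pain change with the medication than with the placebo?
difference_in_change = medication_change - placebo_change

print("Mean change, medication group:", medication_change)
print("Mean change, placebo group:", placebo_change)
print("Difference in change:", difference_in_change)
```

Comparing the two changes, rather than looking only at the medication group, is what lets the control group stand in for "what would have happened anyway".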

Another experimental design is called a crossover design . This type of design involves exposing participants to more than one treatment. For example, subject 1 first receives treatment A, then treatment B, then treatment C. Subject 2 might first receive treatment B, then treatment A, and then treatment C. In this type of study, the three conditions for an experiment are met: Intervention, randomization, and control – with the subjects serving as their own control group.
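Here is a rough sketch of how treatment orders in a crossover design might be randomized. It assumes three treatments and six subjects, so that each possible order is used exactly once. The subject labels and variable names are hypothetical.

```python
import itertools
import random

treatments = ["A", "B", "C"]

# Every possible order in which one subject could receive all three treatments.
all_orders = list(itertools.permutations(treatments))  # 6 sequences

rng = random.Random(7)  # fixed seed so the sketch is repeatable
subjects = [f"subject_{i}" for i in range(1, 7)]

# Shuffle the orders, then pair each subject with one of them.
orders = all_orders[:]
rng.shuffle(orders)
schedule = dict(zip(subjects, orders))

for subject, order in schedule.items():
    print(subject, "receives", " then ".join(order))
```

Every subject receives every treatment, which is how the subjects can serve as their own control group.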

Control group conditions can be done in 4 ways:

  • No intervention is used; control group gets no treatment at all
  • “Usual care” or standard of care or normal procedures used
  • An alternative intervention is used (e.g. auditory versus visual stimulation)
  • A placebo or pseudo-intervention, presumed to have no therapeutic value, is used

Quasi-Experimental Research : Quasi-experiments involve an experiment just like true experimental research. However, they lack randomization and some even lack a control group.  Therefore, there is implementation and testing of an intervention, but there is an absence of randomization.

For example, perhaps we wanted to measure the effect of yoga for nursing students. The IV (intervention of yoga) is being offered to all nursing students and therefore randomization is not possible. For comparison, we could measure quality of life data on nursing students at a different university. Data is collected from both groups at baseline and then again after the yoga classes. Note that in quasi-experiments, the phrase “comparison group” is sometimes used instead of “control group” against which outcome measures are collected.

Sometimes there is no comparison group either. This would be called a one-group pretest-posttest design .

Non-Experimental Research : Sometimes, cause-probing research questions cannot be answered with an experimental or quasi-experimental design because the IV cannot be manipulated. For example, if we want to measure what impact prerequisite grades have on student success in nursing programs, we obviously cannot manipulate the prerequisite grades. In another example, if we wanted to investigate how low birth weight impacts developmental progression in children, we cannot manipulate the birth weight. Often, you will see the word “observational” in lieu of non-experimental research. This does not mean the researcher is just standing and watching people, but instead it refers to the method of observing data that has already been established without manipulation.

There are various types of non-experimental research:

Correlational research : A correlational research design investigates relationships between two variables (or more) without the researcher controlling or manipulating any of them. In the example of prerequisites and nursing program success, that is a correlational design. Consider hypothetically, a researcher is studying a correlation between cancer and marriage. In this study, there are two variables: disease and marriage. Let us say marriage has a negative association with cancer. This means that married people are less likely to develop cancer.
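As a sketch of what "investigating a relationship" can look like in practice, the snippet below computes a Pearson correlation coefficient for the prerequisite-grades example. The GPA values are hypothetical and `pearson_r` is a helper written just for this illustration. A coefficient near +1 means the two variables rise together, near -1 means one falls as the other rises (a negative association), and near 0 means little linear relationship.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)

# Hypothetical data: one pair of values per student, nothing manipulated.
prerequisite_gpa = [2.8, 3.0, 3.2, 3.5, 3.7, 3.9]
program_gpa = [2.9, 2.7, 3.3, 3.4, 3.8, 3.6]

r = pearson_r(prerequisite_gpa, program_gpa)
print(f"Correlation between prerequisite GPA and program GPA: {r:.2f}")
```

Remember that even a strong correlation only describes an association. It does not show that one variable caused the other.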

Cohort design (also called a prospective design) : In a cohort study, the participants do not have the outcome of interest to begin with. They are selected based on the exposure status of the individual. They are then followed over time to evaluate for the occurrence of the outcome of interest. Cohorts may be divided into exposure categories once baseline measurements of a defined population are made. For example, the Framingham Cardiovascular Disease Study (CVD) used baseline measurements to divide the population into categories of CVD risk factors. Another example of a cohort study is comparing the test scores of one group of people who underwent extensive tutoring and a special curriculum with those of a group who did not receive any extra help. The groups could be studied for years to assess whether their scores improve over time and at what rate.

Retrospective design : In retrospective studies, the outcome of interest has already occurred (or not occurred – e.g., in controls) in each individual by the time s/he is enrolled, and the data are collected either from records or by asking participants to recall exposures. There is no follow-up of participants. For example, a researcher might examine the medical histories of 1000 elderly women to identify the causes of health problems.

Case-control design : A study that compares two groups of people: those with the disease or condition under study (cases) and a very similar group of people who do not have the condition. For example, investigators conducted a case-control study to determine if there is an association between colon cancer and a high fat diet. Cases were all confirmed colon cancer cases in North Carolina in 2010. Controls were a sample of North Carolina residents without colon cancer.

Descriptive research : Descriptive research design is a type of research design that aims to obtain information to systematically describe a phenomenon, situation, or population. More specifically, it helps answer the what, when, where, and how questions regarding the research problem, rather than the why. For example, the researcher might wish to discover the percentage of motorists who tailgate – the prevalence  of a certain behavior.

Two other designs are worth mentioning; both are defined by when the data are collected.

Cross-sectional design : All data are collected at a single point in time. Retrospective studies are usually cross-sectional. The IV usually concerns events or behaviors occurring in the past. One cross-sectional study example in medicine is a data collection of smoking habits and lung cancer incidence in a given population. A cross-sectional study like this cannot solely determine that smoking habits cause lung cancer, but it can suggest a relationship that merits further investigation. Cross-sectional studies serve many purposes, and the cross-sectional design is the most relevant design when assessing the prevalence of disease, attitudes and knowledge among patients and health personnel, in validation studies comparing, for example, different measurement instruments, and in reliability studies.

Longitudinal design : Data are collected two or more times over an extended period. Longitudinal designs are better at showing patterns of change and at clarifying whether a cause occurred before an effect (outcome). A challenge in longitudinal studies is attrition or the loss of participants over time. In a longitudinal study subjects are followed over time with continuous or repeated monitoring of risk factors or health outcomes, or both. Such investigations vary enormously in their size and complexity. At one extreme a large population may be studied over decades. An example of a longitudinal design is a multiyear comparative study of the same children in an urban and a suburban school to record their cognitive development in depth.

Confounding and Extraneous Variables

Confounding variables are a type of extraneous variable that interferes with or influences the relationship between the independent and dependent variables. In research that investigates a potential cause-and-effect relationship, a confounding variable is an unmeasured third variable that influences both the supposed cause and the supposed effect.

It’s important to consider potential confounding variables and account for them in research designs to ensure results are valid. You can imagine that if something sneaks in to influence the measured variables, it can really muck up the study!

Here is an example:

You collect data on sunburns and ice cream consumption. You find that higher ice cream consumption is associated with a higher probability of sunburn. Does that mean ice cream consumption causes sunburn?

Here, the confounding variable is temperature: hot temperatures cause people to both eat more ice cream and spend more time outdoors under the sun, resulting in more sunburns.
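The ice cream and sunburn example can be sketched as a small simulation. The numbers below are hypothetical: temperature is assumed to drive both variables, and neither variable affects the other. The helper names `pearson` and `partial_corr` are illustrative, not from any library.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical model: temperature influences both variables;
# ice cream consumption and sunburn do not influence each other.
temperature = [rng.gauss(25, 5) for _ in range(5000)]
ice_cream = [0.5 * t + rng.gauss(0, 2) for t in temperature]
sunburn = [0.3 * t + rng.gauss(0, 2) for t in temperature]

raw_r = pearson(ice_cream, sunburn)
adjusted_r = partial_corr(ice_cream, sunburn, temperature)
print(f"raw correlation:          {raw_r:.2f}")
print(f"adjusted for temperature: {adjusted_r:.2f}")
```

In this sketch the raw correlation is clearly positive, while the correlation adjusted for temperature is close to zero. That pattern is what one would expect when a third variable accounts for the association.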


To ensure the internal validity of research, the researcher must account for confounding variables. If he/she fails to do so, the results may not reflect the actual relationship between the variables that they are interested in.

For instance, they may find a cause-and-effect relationship that does not actually exist, because the effect they measure is caused by the confounding variable (and not by the independent variable).

Here is another example:

The researcher finds that babies born to mothers who smoked during their pregnancies weigh significantly less than those born to non-smoking mothers. However, if the researcher does not account for the fact that smokers are more likely to engage in other unhealthy behaviors, such as drinking or eating less healthy foods, then he/she might overestimate the relationship between smoking and low birth weight.

Extraneous variables are any variables that the researcher is not investigating that can potentially affect the outcomes of the research study. If left uncontrolled, extraneous variables can lead to inaccurate conclusions about the relationship between IVs and DVs.

Extraneous variables can threaten the internal validity of a study by providing alternative explanations for the results. In an experiment, the researcher manipulates an independent variable to study its effects on a dependent variable.

In a study on mental performance, the researcher tests whether wearing a white lab coat, the independent variable (IV), improves scientific reasoning, the dependent variable (DV).

Students from a university are recruited to participate in the study. The researcher manipulates the independent variable by splitting participants into two groups:

  • Participants in the experimental group are asked to wear a lab coat during the study.
  • Participants in the control group are asked to wear a casual coat during the study.

All participants are given a scientific knowledge quiz, and their scores are compared between groups.

When extraneous variables are uncontrolled, it’s hard to determine the exact effects of the independent variable on the dependent variable, because the effects of extraneous variables may mask them.

Uncontrolled extraneous variables can also make it seem as though there is a true effect of the independent variable in an experiment when there’s actually none.

In the above experiment example, these extraneous variables can affect the science knowledge scores:

  • Participant’s major (e.g., STEM or humanities)
  • Participant’s interest in science
  • Demographic variables such as gender or educational background
  • Time of day of testing
  • Experiment environment or setting

If these variables systematically differ between the groups, you can’t be sure whether your results come from your independent variable manipulation or from the extraneous variables.

In summary, an extraneous variable is anything that could influence the dependent variable. A confounding variable influences the dependent variable, and also correlates with or causally affects the independent variable.


Cause-and-Effect (Causality) Versus Association/Correlation  

A very important concept to understand is cause-and-effect, also known as causality, versus correlation. Let’s look at these two concepts in very simplified statements. Causation means that one thing caused another thing to happen. Correlation means there is some association between the two things we are measuring.

It would be nice if it were as simple as that. These two concepts are often confused. Let’s dive deeper.

Two or more variables are considered to be related or associated, in a statistical context, if their values change so that as the value of one variable increases or decreases so does the value of the other variable (or the opposite direction).

For example, for the two variables of “hours worked” and “income earned”, there is a relationship between the two if the increase in hours is associated with an increase in income earned.

However, correlation is a statistical measure that describes the size and direction of a relationship between two or more variables. A correlation does not automatically mean that the change in one variable caused the change in value in the other variable.

Theoretically, the difference between the two types of relationships is easy to identify — an action or occurrence can cause another (e.g. smoking causes an increase in the risk of developing lung cancer), or it can correlate with another (e.g. smoking is correlated with alcoholism, but it does not cause alcoholism). In practice, however, it remains difficult to clearly establish cause and effect, compared with establishing correlation.

Simplified in this image, we can say that hot and sunny weather causes an increase in ice cream consumption. Similarly, we can surmise that hot and sunny weather increases the incidence of sunburns. However, we cannot say that ice cream caused a sunburn (or that a sunburn increases consumption of ice cream). The two are associated only because they share a common cause. In this example, it is pretty easy to tell correlation from causation anecdotally. In research, however, we rely on statistical tests and specialized analyses to help differentiate the two.

[Image: a sun with arrows pointing to an ice cream cone and to a person with a sunburn, labeled as causation; the link between the ice cream cone and the sunburn is labeled as correlation.]

Here is a great Khan Academy video of about 5 minutes that shows a worked example of correlation versus causation with regard to sledding accidents and frostbite cases:



References & Attribution

“ Light bulb doodle ” by rawpixel licensed CC0 .

“ Magnifying glass ” by rawpixel licensed CC0

“ Orange flame ” by rawpixel licensed CC0 .

Jenny Barrow. (2019). Experimental versus nonexperimental research. https://www.youtube.com/watch?v=FJo8xyXHAlE

Leibold, N. (2020). Research variables. Measures and Concepts Commonly Encountered in EBP. Creative Commons License: BY NC

Polit, D. & Beck, C. (2021).  Lippincott CoursePoint Enhanced for Polit’s Essentials of Nursing Research  (10th ed.). Wolters Kluwer Health.

Evidence-Based Practice & Research Methodologies Copyright © by Tracy Fawns is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License , except where otherwise noted.

Indian Dermatol Online J, v.10(1); Jan-Feb 2019

Types of Variables, Descriptive Statistics, and Sample Size

Feroze Kaliyadan

Department of Dermatology, King Faisal University, Al Hofuf, Saudi Arabia

Vinay Kulkarni

Department of Dermatology, Prayas Amrita Clinic, Pune, Maharashtra, India

This short “snippet” covers three important aspects of statistics: the concept of variables, the importance and practical aspects of descriptive statistics, and issues related to sampling (types of sampling and sample size estimation).

What is a variable?[1,2]

To put it in very simple terms, a variable is an entity whose value varies. A variable is an essential component of any statistical data. It is a feature of a member of a given sample or population, which is unique, and can differ in quantity or quality from another member of the same sample or population. Variables either are the primary quantities of interest or act as practical substitutes for the same. The importance of variables is that they help in operationalization of concepts for data collection. For example, if you want to do an experiment based on the severity of urticaria, one option would be to measure the severity using a scale to grade severity of itching. This becomes an operational variable. For a variable to be “good,” it needs to have some properties such as good reliability and validity, low bias, feasibility/practicality, low cost, objectivity, clarity, and acceptance. Variables can be classified in various ways as discussed below.

Quantitative vs qualitative

A variable can collect either qualitative or quantitative data. A variable differing in quantity is called a quantitative variable (e.g., weight of a group of patients), whereas a variable differing in quality is called a qualitative variable (e.g., the Fitzpatrick skin type).

A simple test which can be used to differentiate between qualitative and quantitative variables is the subtraction test. If you can subtract the value of one variable from the other to get a meaningful result, then you are dealing with a quantitative variable (this of course will not apply to rating scales/ranks).

Quantitative variables can be either discrete or continuous

Discrete variables are variables in which no values may be assumed between the two given values (e.g., number of lesions in each patient in a sample of patients with urticaria).

Continuous variables, on the other hand, can take any value in between the two given values (e.g., duration for which the weals last in the same sample of patients with urticaria). One way of differentiating between continuous and discrete variables is to use the “mid-way” test. If, for every pair of values of a variable, a value exactly mid-way between them is meaningful, the variable is continuous. For example, two values for the time taken for a weal to subside can be 10 and 13 min. The mid-way value would be 11.5 min which makes sense. However, for a number of weals, suppose you have a pair of values – 5 and 8 – the midway value would be 6.5 weals, which does not make sense.

Under the umbrella of qualitative variables, you can have nominal/categorical variables and ordinal variables

Nominal/categorical variables are, as the name suggests, variables which can be slotted into different categories (e.g., gender or type of psoriasis).

Ordinal variables or ranked variables are similar to categorical, but can be put into an order (e.g., a scale for severity of itching).

Dependent and independent variables

In the context of an experimental study, the dependent variable (also called outcome variable) is directly linked to the primary outcome of the study. For example, in a clinical trial on psoriasis, the PASI (psoriasis area severity index) would possibly be one dependent variable. The independent variable (sometimes also called explanatory variable) is something which is not affected by the experiment itself but which can be manipulated to affect the dependent variable. Other terms sometimes used synonymously include blocking variable, covariate, or predictor variable. Confounding variables are extra variables, which can have an effect on the experiment. They are linked with dependent and independent variables and can cause spurious association. For example, in a clinical trial for a topical treatment in psoriasis, the concomitant use of moisturizers might be a confounding variable. A control variable is a variable that must be kept constant during the course of an experiment.

Descriptive Statistics

Statistics can be broadly divided into descriptive statistics and inferential statistics.[3,4] Descriptive statistics give a summary about the sample being studied without drawing any inferences based on probability theory. Even if the primary aim of a study involves inferential statistics, descriptive statistics are still used to give a general summary. When we describe the data using tools such as frequency distribution tables, percentages, and measures of central tendency like the mean, we are talking about descriptive statistics. When we use a specific statistical test (e.g., Mann–Whitney U-test) to compare scores between groups and express the result in terms of statistical significance, we are talking about inferential statistics. Descriptive statistics can help in summarizing data in the form of simple quantitative measures such as percentages or means or in the form of visual summaries such as histograms and box plots.

Descriptive statistics can be used to describe a single variable (univariate analysis) or more than one variable (bivariate/multivariate analysis). In the case of more than one variable, descriptive statistics can help summarize relationships between variables using tools such as scatter plots.

Descriptive statistics can be broadly put under two categories:

  • Sorting/grouping and illustration/visual displays
  • Summary statistics.

Sorting and grouping

Sorting and grouping is most commonly done using frequency distribution tables. For continuous variables, it is generally better to use groups in the frequency table. Ideally, group sizes should be equal (except in extreme ends where open groups are used; e.g., age “greater than” or “less than”).

Another form of presenting frequency distributions is the “stem and leaf” diagram, which is considered to be a more accurate form of description.

Suppose the weight in kilograms of a group of 10 patients is as follows:

56, 34, 48, 43, 87, 78, 54, 62, 61, 59

The “stem” records the value of the “ten's” place (or higher) and the “leaf” records the value in the “one's” place [ Table 1 ].

Stem and leaf plot
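The stem and leaf plot for the ten weights above can be rebuilt with a few lines of Python. This is a minimal sketch; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

weights = [56, 34, 48, 43, 87, 78, 54, 62, 61, 59]

def stem_and_leaf(values):
    """Group each value's ones digit (leaf) under its tens digit (stem)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(weights)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))
# 3 | 4
# 4 | 3 8
# 5 | 4 6 9
# 6 | 1 2
# 7 | 8
# 8 | 7
```

Unlike a grouped frequency table, the plot keeps every original value readable, which is why it is described as a more accurate form of description.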

Illustration/visual display of data

The most common tools used for visual display include frequency diagrams, bar charts (for noncontinuous variables) and histograms (for continuous variables). Composite bar charts can be used to compare variables. For example, the frequency distribution in a sample population of males and females can be illustrated as given in Figure 1.

Figure 1: Composite bar chart

A pie chart helps show how a total quantity is divided among its constituent categories. Scatter diagrams can be used to illustrate the relationship between two variables, for example, the global scores for improvement in a condition like acne given by the patient and by the doctor [Figure 2].

Figure 2: Scatter diagram

Summary statistics

The main tools used for summary statistics are broadly grouped into measures of central tendency (such as mean, median, and mode) and measures of dispersion or variation (such as range, standard deviation, and variance).

Imagine that the data below represent the weights of a sample of 15 pediatric patients arranged in ascending order:

30, 35, 37, 38, 38, 38, 42, 42, 44, 46, 47, 48, 51, 53, 86

Just having the raw data does not mean much to us, so we try to express it in terms of some values, which give a summary of the data.

The mean is basically the sum of all the values divided by the total number. In this case, we get a value of 45.

The problem is that some extreme values (outliers), like “86” in this case, can skew the value of the mean. In such situations, we consider other values like the median, which is the point that divides the distribution into two equal halves. It is also referred to as the 50th percentile (50% of the values are above it and 50% are below it). In our previous example, since we have already arranged the values in ascending order, we find that the point which divides it into two equal halves is the 8th value: 42. If the total number of values is even, we take the two middle points and average them to reach the median.

The mode is the most common data point. In our example, this would be 38. The mode as in our case may not necessarily be in the center of the distribution.
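The three measures can be checked against the 15 weights with Python's standard `statistics` module. The last line is an added illustration, not part of the original example: it drops the outlier to show how strongly a single extreme value pulls the mean.

```python
import statistics

weights = [30, 35, 37, 38, 38, 38, 42, 42, 44, 46, 47, 48, 51, 53, 86]

mean = statistics.mean(weights)      # sum of all values / number of values
median = statistics.median(weights)  # middle (8th) value of the sorted list
mode = statistics.mode(weights)      # most frequent value

# Illustration only: the mean of the same data without the outlier (86).
mean_without_outlier = statistics.mean(weights[:-1])

print(mean, median, mode)              # 45 42 38
print(round(mean_without_outlier, 2))  # 42.07
```

Without the outlier, the mean falls to about 42, close to the median, which matches the point made above about outliers skewing the mean.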

With skewed data or outliers, the median is often the most robust of the three measures of central tendency. In a symmetric distribution with a single peak, the mean, median, and mode are the same, whereas in skewed data they differ: the median and mean lie more toward the skew (the tail), with the mean lying further toward the skew than the median. For example, in Figure 3, a right skewed distribution is seen (the direction of skew is based on the tail); the distribution of data values is longer on the right-hand (positive) side than on the left-hand side. The mean is typically greater than the median in such cases.

Figure 3: Location of mode, median, and mean

Measures of dispersion

The range gives the spread between the lowest and highest values. In our previous example, this will be 86-30 = 56.

A more valuable measure is the interquartile range. A quartile is one of the values which break the distribution into four equal parts. The 25th percentile is the data point which divides the group between the first one-fourth and the last three-fourths of the data. The first one-fourth will form the first quartile. The 75th percentile is the data point which divides the distribution into a first three-fourths and last one-fourth (the last one-fourth being the fourth quartile). The range between the 25th percentile and 75th percentile is called the interquartile range.

Variance is also a measure of dispersion. The larger the variance, the further the individual units are from the mean. Let us consider the same example we used for calculating the mean. The mean was 45.

For the first value (30), the deviation from the mean will be −15; for the last value (86), the deviation will be +41. Similarly, we can calculate the deviations for all values in a sample. Adding these deviations and averaging might seem to give a measure of the total dispersion, but because the deviations are a mix of negative and positive values, their total is always zero. To calculate the variance, this problem is overcome by adding the squares of the deviations. The variance is therefore the sum of the squared deviations divided by the total number in the population (for a sample we divide by “n − 1”). Because the variance is expressed in squared units, we take its square root to get a measure of dispersion in the original units, which is called the “standard deviation.”
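The dispersion measures can be worked through on the same 15 weights. This is a sketch with one assumption to note: quartile conventions differ between textbooks and software, and the code uses the default ("exclusive") method of `statistics.quantiles`, so other tools may report slightly different quartiles.

```python
import math
import statistics

weights = [30, 35, 37, 38, 38, 38, 42, 42, 44, 46, 47, 48, 51, 53, 86]
n = len(weights)
mean = statistics.mean(weights)  # 45

value_range = max(weights) - min(weights)  # 86 - 30

# Quartiles using the default method of statistics.quantiles (an assumption;
# other quartile conventions can give slightly different cut points).
q1, q2, q3 = statistics.quantiles(weights, n=4)
iqr = q3 - q1

deviations = [w - mean for w in weights]
sum_of_deviations = sum(deviations)             # positives and negatives cancel
sum_of_squares = sum(d ** 2 for d in deviations)

population_variance = sum_of_squares / n        # divide by n for a population
sample_variance = sum_of_squares / (n - 1)      # divide by n - 1 for a sample
sample_sd = math.sqrt(sample_variance)          # back in the original units (kg)

print(value_range, iqr, sum_of_deviations, sum_of_squares)
print(round(sample_variance, 2), round(sample_sd, 2))
```

The sum of the raw deviations is zero, as described above, while the sum of squared deviations is not. That is why the variance is built from squares.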

The box plot

The box plot is a composite representation that portrays the mean, median, range, and the outliers [ Figure 4 ].

Figure 4: Box plot

The concept of skewness and kurtosis

Skewness is a measure of the symmetry of a distribution. If the distribution curve is symmetric, it looks the same on either side of the central point. When this is not the case, it is said to be skewed. Kurtosis is a representation of outliers. Distributions with high kurtosis tend to have “heavy tails,” indicating a larger number of outliers, whereas distributions with low kurtosis have light tails, indicating fewer outliers. There are formulas to calculate both skewness and kurtosis [Figures 5–8].
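As a sketch of such formulas, the code below uses the simple moment-based definitions, applied to the 15 weights from the summary statistics example. This is one of several conventions; statistical packages often apply small-sample corrections, so their values may differ. The function names are illustrative.

```python
def central_moment(values, k):
    """Average of (value - mean) raised to the k-th power."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** k for v in values) / n

def skewness(values):
    """Third central moment scaled by the variance to the power 3/2."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Fourth central moment scaled by the squared variance, minus 3
    (so that a normal distribution scores 0)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

weights = [30, 35, 37, 38, 38, 38, 42, 42, 44, 46, 47, 48, 51, 53, 86]
symmetric = [1, 2, 3, 4, 5]

print(round(skewness(weights), 2))         # positive: long tail on the right
print(round(skewness(symmetric), 2))       # 0.0 for symmetric data
print(round(excess_kurtosis(weights), 2))  # positive: heavy tail from the outlier
```

The single outlier (86) is enough to give the weights both a positive skew and a positive excess kurtosis under these definitions.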

Figure 5: Positive skew

Figure 6: Negative skew

Figure 7: Low kurtosis (negative kurtosis, also called platykurtic)

Figure 8: High kurtosis (positive kurtosis, also called leptokurtic)

Sample Size

In an ideal study, we should be able to include all units of a particular population under study, something that is referred to as a census.[5,6] This would remove the chances of sampling error (difference between the outcome characteristics in a random sample when compared with the true population values, something that is virtually unavoidable when you take a random sample). However, it is obvious that this would not be feasible in most situations. Hence, we have to study a subset of the population to reach our conclusions. This representative subset is a sample, and we need to have sufficient numbers in this sample to make meaningful and accurate conclusions and reduce the effect of sampling error.

Broadly, sampling can be divided into two types: probability sampling and nonprobability sampling. Examples of probability sampling include simple random sampling (each member in a population has an equal chance of being selected), stratified random sampling (in nonhomogeneous populations, the population is divided into subgroups, followed by random sampling in each subgroup), systematic sampling (sampling is based on a systematic technique, e.g., every third person is selected for a survey), and cluster sampling (similar to stratified sampling, except that the clusters are preexisting, unlike stratified sampling where the researcher decides on the stratification criteria). In nonprobability sampling, not every unit in the population has an equal chance of inclusion in the sample. It includes methods such as convenience sampling (e.g., a sample selected based on ease of access) and purposive sampling (where only people who meet specific criteria are included in the sample).
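Three of the probability sampling methods can be sketched in Python as follows. The population of 90 patients and the function names are hypothetical, chosen only for illustration.

```python
import random

def simple_random_sample(population, k, rng):
    """Each member has an equal chance of being selected."""
    return rng.sample(population, k)

def systematic_sample(population, step, rng):
    """Pick a random starting point, then take every `step`-th member."""
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(population, stratum_of, fraction, rng):
    """Randomly sample the same fraction from each subgroup (stratum)."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 90 patients, 60 female and 30 male.
patients = [(i, "F" if i < 60 else "M") for i in range(90)]

srs = simple_random_sample(patients, 9, rng)
systematic = systematic_sample(patients, 3, rng)  # every third patient
stratified = stratified_sample(patients, lambda p: p[1], 0.1, rng)

print(len(srs), len(systematic), len(stratified))
```

The stratified sample always contains 6 females and 3 males, mirroring the population, whereas the simple random sample of the same size can by chance over-represent either group.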

An accurate calculation of sample size is an essential aspect of good study design. It is important to calculate the sample size well in advance, rather than having to rely on post hoc analysis. A sample size that is too small may make the study underpowered, whereas a sample size which is larger than necessary might lead to a wastage of resources.

We will first go through the sample size calculation for a hypothesis-based design (like a randomized control trial).

The important factors to consider for sample size calculation include study design, type of statistical test, level of significance, power and effect size, variance (standard deviation for quantitative data), and expected proportions in the case of qualitative data. These inputs are based on previous data, drawn either from previous studies or from the clinicians' experience. If the study is being conducted for the first time, a pilot study might be conducted to generate these data for further studies based on a larger sample size. It is also important to know whether the data follow a normal distribution or not.

Two essential concepts we must understand are Type I and Type II errors. In a study that compares two groups, the null hypothesis assumes that there is no significant difference between the two groups, with any observed difference being due to sampling or experimental error. When we reject a null hypothesis that is actually true, we make a Type I error (its probability is denoted “alpha,” which corresponds to the significance level). In a Type II error (its probability is denoted “beta”), we fail to reject a null hypothesis when the alternate hypothesis is actually true. The power of the test is expressed as “1 − β.” While there are no absolute rules, the conventionally accepted limits are 0.05 for α (corresponding to a significance level of 5%) and 0.20 for β (corresponding to a minimum recommended power of “1 − 0.20,” or 80%).

Effect size and minimal clinically relevant difference

For a clinical trial, the investigator will have to decide in advance what change is clinically significant and should be detectable (for numerical data, this could be the anticipated outcome means in the two groups, whereas for categorical data, it could be the proportions of successful outcomes in the two groups). While we will not go into details of the formula for sample size calculation, some important points are as follows:

In the context where effect size is involved, the sample size is inversely proportional to the square of the effect size. What this means in effect is that reducing the effect size will lead to an increase in the required sample size.

Reducing the level of significance (alpha) or increasing power (1-β) will lead to an increase in the calculated sample size.

An increase in variance of the outcome leads to an increase in the calculated sample size.
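These three points can be seen in one commonly used normal-approximation formula for comparing the means of two equal-sized groups: n per group = 2 × ((z for α + z for power) × SD / difference)². The article does not give a formula, so this is a sketch under that assumption; the function name and input values (a difference of 5 with a standard deviation of 10) are hypothetical. Dedicated software that uses the t-distribution will give slightly larger numbers.

```python
import math
from statistics import NormalDist

def n_per_group(difference, sd, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided comparison of two
    means (normal approximation, equal group sizes and variances)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # power = 1 - beta
    return 2 * ((z_alpha + z_beta) * sd / difference) ** 2

baseline = math.ceil(n_per_group(difference=5, sd=10))
smaller_effect = math.ceil(n_per_group(difference=2.5, sd=10))
stricter_alpha = math.ceil(n_per_group(difference=5, sd=10, alpha=0.01))
higher_power = math.ceil(n_per_group(difference=5, sd=10, power=0.90))
more_variance = math.ceil(n_per_group(difference=5, sd=15))

print(baseline)        # 63
print(smaller_effect)  # 252: halving the effect size roughly quadruples n
print(stricter_alpha, higher_power, more_variance)  # all larger than baseline
```

Each change described above (a smaller effect size, a lower alpha, a higher power, a larger variance) raises the required sample size relative to the baseline.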

For estimation-type studies/surveys, sample size calculation needs to consider some other factors too. One is the total population size (this generally does not make a major difference when population size is above 20,000, so in situations where population size is not known we can assume a population of 20,000 or more). Another is the “margin of error,” the amount of deviation which the investigators find acceptable in terms of percentages. Regarding confidence levels, ideally, a 95% confidence level is the minimum recommended for surveys too. Finally, we need an idea of the expected/crude prevalence, based either on previous studies or on estimates.
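For a survey estimating a prevalence, a widely used formula combines these factors: n = z² × p × (1 − p) / e², optionally followed by a finite population correction. The article does not state a formula, so the sketch below is an assumption, and the function name and example inputs (50% expected prevalence, 5% margin of error) are hypothetical.

```python
import math
from statistics import NormalDist

def survey_sample_size(prevalence, margin, confidence=0.95, population=None):
    """Approximate sample size for estimating a proportion.

    prevalence: expected proportion (0 to 1)
    margin:     acceptable margin of error (0 to 1)
    population: total population size, or None if unknown/very large
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * prevalence * (1 - prevalence) / margin ** 2
    if population is not None:
        n = n / (1 + (n - 1) / population)  # finite population correction
    return math.ceil(n)

print(survey_sample_size(0.5, 0.05))                    # 385
print(survey_sample_size(0.5, 0.05, population=20000))  # 377
print(survey_sample_size(0.5, 0.05, population=1000))   # 278
```

The result for a population of 20,000 is already close to the result for an unlimited population, which is consistent with the point that population size matters little above that level.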

Sample size calculation also needs to add corrections for patient drop-outs/lost-to-follow-up patients and missing records. An important point is that in some studies dealing with rare diseases, it may be difficult to achieve desired sample size. In these cases, the investigators might have to rework outcomes or maybe pool data from multiple centers. Although post hoc power can be analyzed, a better approach suggested is to calculate 95% confidence intervals for the outcome and interpret the study results based on this.

Financial support and sponsorship

Conflicts of interest.

There are no conflicts of interest.

Once you have brainstormed project topics, narrowed down the list, and reviewed the research related to that narrowed list, select a topic that seems most appealing to you. However, this project topic is not set in stone yet. After you begin working through the project, you may realize that the topic needs to be revised, or even entirely changed to a different topic. The next step is to identify the key variables and the research design.

All research projects are based around variables. A variable is the characteristic or attribute of an individual, group, educational system, or the environment that is of interest in a research study. Variables can be straightforward and easy to measure, such as gender, age, or course of study. Other variables are more complex, such as socioeconomic status, academic achievement, or attitude toward school. Variables may also include an aspect of the educational system, such as a specific teaching method or counseling program. Characteristics of the environment may also be variables, such as the amount of school funding or availability of computers. Therefore, once the general research topic has been identified, the researcher should identify the key variables of interest.

For example, a researcher is interested in low levels of literacy. Literacy itself is still a broad topic. In most instances, the broad topic and general variables need to be specifically identified. For example, the researcher needs to identify specific variables that define literacy: reading fluency (the ability to read a text out loud), reading comprehension (understanding what is read), vocabulary, interest in reading, etc. If a researcher is interested in motivation, which specific motivation variables are of interest: external motivation, goals, need for achievement, etc.? Reading other research studies about your chosen topic will help you better identify the specific variables of interest.

  • The key variables provide focus when writing the Introduction section.
  • The key variables are the major terms to use when searching for research articles for the Literature Review.
  • The key variables are the terms to be operationally defined if an Operational Definition of Terms section is necessary.
  • The key variables provide focus to the Methods section.
  • The Instrument will measure the key variables. These key variables must be directly measured or manipulated for the research study to be valid.
  • Descriptive: Describes the current state of variables. For example, a descriptive study might examine teachers' knowledge of literacy development. This is a descriptive study because it simply describes the current state of teachers' knowledge of literacy development.
  • Causal Comparative: Examines the effect of one variable that cannot be manipulated on other variables. An example would be the effect of gender on examination malpractice. A researcher cannot manipulate a person's gender, so instead males and females are compared on their examination malpractice behavior. Because the variable of interest cannot be manipulated, causal comparative studies (sometimes also called ex post facto) take groups that differ on the independent variable (e.g., gender) and compare them on the dependent variable (e.g., examination malpractice). Thus, the key identifying factor of a causal comparative study is that it compares two or more existing groups on another variable.
  • Correlational: Describes the relationship between variables. Correlational studies must examine two variables that have continuous values. For example, academic achievement is a continuous variable because students' scores have a wide range of values - oftentimes from 0 to 100. However, gender is not a continuous variable because there are only two categories that gender can have: male and female. A correlational study might examine the relationship between motivation and academic achievement - both continuous variables. Note that in a correlational design, both variables must be studied within the same group of individuals. In other words, it is acceptable to study the relationship between academic achievement and motivation in students because the two variables (academic achievement and motivation) are in the same group of individuals (students). However, it is extremely difficult to study two variables in two groups of people, such as the relationship between teacher motivation and student achievement. Here, the two variables are compared between two groups: teachers and students. I strongly advise against this latter type of study.
  • Experimental and Quasi-Experimental: Examines the effect of a variable that the researcher manipulates on other variables. An experimental or quasi-experimental study might examine the effect of telling stories on children's literacy skills. In this case, the researcher will "manipulate" the variable of telling stories by placing half of the children in a treatment group that listens to stories and the other half of children in a control group that gets the ordinary literacy instruction. The difference between an experimental design and quasi-experimental design is described in Step 4: Research Design.
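
The difference between the causal comparative and correlational designs above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the student records, the field names (`gender`, `motivation`, `achievement`), and both helper functions are hypothetical and are not taken from any real study. A causal comparative analysis compares existing groups on an outcome, while a correlational analysis relates two continuous variables measured on the same individuals.

```python
from statistics import mean


def group_mean_difference(records, group_key, outcome_key, group_a, group_b):
    """Causal comparative sketch: difference in mean outcome between two
    existing groups (the grouping variable is observed, not manipulated)."""
    a = [r[outcome_key] for r in records if r[group_key] == group_a]
    b = [r[outcome_key] for r in records if r[group_key] == group_b]
    if not a or not b:
        raise ValueError("both groups need at least one record")
    return mean(a) - mean(b)


def pearson_r(xs, ys):
    """Correlational sketch: Pearson correlation between two continuous
    variables measured on the same individuals."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: every variable is recorded for the same group (students).
students = [
    {"gender": "female", "motivation": 4.0, "achievement": 70},
    {"gender": "female", "motivation": 4.5, "achievement": 80},
    {"gender": "male", "motivation": 3.5, "achievement": 60},
    {"gender": "male", "motivation": 3.0, "achievement": 50},
]

diff = group_mean_difference(students, "gender", "achievement", "female", "male")
r = pearson_r(
    [s["motivation"] for s in students],
    [s["achievement"] for s in students],
)
print("mean difference between groups:", diff)
print("correlation between motivation and achievement:", round(r, 3))
```

Neither number says anything about statistical significance; a real study would follow up with an appropriate test chosen in the Method of Data Analysis.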

Descriptive studies are the simplest research design and provide the least information about improving education. Therefore, descriptive studies should only be conducted for first degree and diploma projects. Only in special cases should a Masters thesis be descriptive. Doctoral dissertations should aim for experimental or quasi-experimental studies.

  • The purpose, research questions, and hypotheses will be written about the variables based on the research design.
  • The Instruments will be developed to measure the key variables and the Instruments section in Chapter 3 is written to describe the instruments.
  • The Procedures section describes the treatment for experimental studies and/or how the instrument will be administered.
  • The Method of Data Analysis describes how the data is summarized and tested based on the research questions and hypotheses.

Copyright 2012, Katrina A. Korb, All Rights Reserved


Selecting Your Key Variables

The first step in a disclosure risk assessment is the selection of key variables. These are the variables, or the columns in your dataset, that are most likely to lead to the disclosure of confidential information, including an individual’s identity.

Key Takeaways

Classify your variables as identifying and non-identifying.

Identifying variables contain information that can lead to the identification of respondents in the dataset. These can be further categorized as either direct identifiers or indirect identifiers (also referred to as quasi-identifiers). Remember, direct identifiers such as full names, addresses, phone numbers and GPS coordinates should always be removed from the microdata before starting the risk assessment.

Select your key variables.

A key variable is typically an indirect identifier that could be used to re-identify individuals within a dataset or to link records between different datasets. Common examples of key variables are age, marital status, geographical variables, gender, and religion. Removing all indirect identifiers from a dataset is likely to severely limit the analytical value of the dataset. The SDC process is intended to assess the disclosure risk presented by the indirect identifiers and to take steps to limit that risk while maintaining the analytic power of the data.

Note whether your key variables are continuous or categorical.

You will use different techniques to assess the disclosure risk of continuous and categorical variables. Categorical variables take values from a finite set (e.g. gender) whereas continuous variables are numeric variables that can take an infinite number of values (e.g. income). Continuous variables can be transformed into categorical variables by creating intervals (e.g. income brackets).
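
As a rough illustration of turning a continuous variable into a categorical one, the sketch below maps incomes to brackets. The bracket boundaries, the labels, and the helper name `to_bracket` are assumptions made up for this example; suitable intervals depend on your data and context.

```python
from bisect import bisect_right


def to_bracket(value, edges, labels):
    """Return the label of the interval that a continuous value falls into.

    `edges` holds the ascending lower bounds of every interval after the
    first, so len(labels) must equal len(edges) + 1.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    return labels[bisect_right(edges, value)]


# Hypothetical income brackets: below 1000, 1000 to 4999, 5000 and above.
EDGES = [1000, 5000]
LABELS = ["low", "middle", "high"]

incomes = [250.0, 1000.0, 4999.99, 7200.0]
brackets = [to_bracket(income, EDGES, LABELS) for income in incomes]
print(brackets)
```

After this step the bracket column can be treated like any other categorical key variable.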

The sensitivity of indirect identifiers depends on the context.

Direct identifiers are always considered sensitive while the sensitivity of indirect identifiers is often context-specific. This is why it is important to understand both the data environment and the real-life situation when selecting your key variables. Keep in mind that even when indirect identifiers are not themselves sensitive, it may be possible to combine them with other variables to lead to the disclosure of sensitive information.

Pay close attention to exclusive or partial variables.

While you do not want to remove all indirect identifiers, it may be important to remove some. For example, you may want to consider removing variables with many missing values, such as a variable recorded only for a select group.

General Questions

What is the difference between ‘key variables’ and ‘keys’?

Key variables are your indirect identifiers that are most likely to lead to a disclosure whereas keys are all the unique combinations of values those indirect identifiers take. For the key variables ‘Marital Status’ and ‘Gender’ you could have keys such as ‘Married, Female’, ‘Married, Male’ and ‘Single, Female’. The number of times, or the frequency, a given key appears in a dataset is the basis for many disclosure risk measures.
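
The relationship between key variables, keys, and key frequencies can be sketched as follows. The records, the column names, and the function `key_frequencies` are hypothetical and serve only to illustrate the counting step; this is not the sdcMicro implementation.

```python
from collections import Counter


def key_frequencies(records, key_variables):
    """Count how often each key (combination of key-variable values) occurs."""
    return Counter(tuple(r[v] for v in key_variables) for r in records)


# Hypothetical microdata with two key variables.
records = [
    {"marital_status": "Married", "gender": "Female"},
    {"marital_status": "Married", "gender": "Female"},
    {"marital_status": "Married", "gender": "Male"},
    {"marital_status": "Single", "gender": "Female"},
]

freqs = key_frequencies(records, ["marital_status", "gender"])
for key, count in freqs.items():
    print(key, count)

# Keys that occur only once single out one record in this sample.
unique_keys = [key for key, count in freqs.items() if count == 1]
print("keys with frequency 1:", unique_keys)
```

Keys that appear only once are a natural place to start looking, because a record with a unique combination of values is easier to single out than one that shares its key with many others.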

What if I don’t feel confident selecting key variables?

Selecting key variables does take some practice. When in doubt, we recommend working with a few colleagues to do the selection. You can also select different sets of key variables and run a disclosure risk assessment on each. Finally, remember that it is important for you to have an understanding of the data environment before selecting the key variables. Selecting key variables correctly requires you to make assumptions about the data that others are likely to have access to as well as whether specific data is sensitive in your context (even if it might not be considered sensitive in another context).


Path Analysis: The Predictive Relationships of Problem-based Learning Processes on Preservice Teachers’ Learning Strategies


Path Analysis is used to provide estimates of the magnitude and significance of hypothesized causal connections among sets of variables displayed using path diagrams. It is an extension of multiple regression analysis and holds strength as a methodology as it allows researchers to assess both direct and indirect effects of multiple independent variables on one or more dependent variables. In this paper, Path Analysis is used to examine the predictive relations of preservice teachers’ perception of key Problem-based Learning (PBL) processes and their learning strategies before and after their PBL experience. The sample comprised 1041 preservice teachers in the core Educational Psychology course using the PBL approach at a Teacher Education Institute in Singapore. The participants consisted of 333 males, 662 females, and 46 preservice teachers who did not indicate their gender. The mean age was 25.6 (SD = 5.41). The Motivated Strategies for Learning Questionnaire (MSLQ) by Pintrich, Smith, Garcia, and McKeachie (1993) was used to measure preservice teachers’ learning strategies. It consisted of five subscales, namely rehearsal, elaboration, organization, critical thinking and metacognitive self-regulation. The Problem-based Learning Process Inventory (PBLPI) by Chua (2016) was used to measure the key PBL processes, namely problem-posing, scaffolding and connecting. Findings from the study suggested that in the PBL environment, (i) preservice teachers’ pre-PBL metacognitive self-regulation played a pivotal role in determining preservice teachers’ perceived importance of the key processes in enhancing their PBL experience; (ii) the key PBL scaffolding and connecting processes were salient predictors of preservice teachers’ subsequent post-PBL learning strategies; and (iii) the key PBL processes played a mediating role in relating preservice teachers’ pre-PBL learning strategies to their corresponding post-PBL factors. Implications for using path analysis for Problem-based Learning research will be discussed.
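
To make the idea of direct and indirect effects concrete, here is a minimal sketch of the simplest possible path model: one independent variable, one mediator, and one dependent variable, estimated by ordinary least squares. It is not the model used in the study, which involves several subscales and processes. The variable names, the made-up scores, and the function `simple_mediation_paths` are assumptions for illustration only, and significance testing is omitted.

```python
from statistics import mean


def cov(xs, ys):
    """Sample covariance of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def simple_mediation_paths(x, m, y):
    """Least-squares path estimates for the model x -> m -> y with x -> y.

    a        : slope of m regressed on x
    b        : slope of y on m, holding x constant
    direct   : slope of y on x, holding m constant
    indirect : a * b, the effect of x on y that runs through m
    total    : slope of y regressed on x alone
    """
    var_x, var_m = cov(x, x), cov(m, m)
    cov_xm, cov_xy, cov_my = cov(x, m), cov(x, y), cov(m, y)
    denom = var_x * var_m - cov_xm ** 2
    if denom == 0:
        raise ValueError("x and m are perfectly collinear (or constant)")
    a = cov_xm / var_x
    b = (cov_my * var_x - cov_xy * cov_xm) / denom
    direct = (cov_xy * var_m - cov_my * cov_xm) / denom
    return {
        "a": a,
        "b": b,
        "direct": direct,
        "indirect": a * b,
        "total": cov_xy / var_x,
    }


# Hypothetical scores, invented for this example only.
pre_strategy = [2.0, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0]
pbl_process = [3.1, 3.0, 4.2, 4.0, 5.1, 4.8, 5.9, 6.3]
post_strategy = [2.5, 3.4, 3.9, 4.6, 4.9, 5.2, 6.1, 6.4]

paths = simple_mediation_paths(pre_strategy, pbl_process, post_strategy)
for name, value in paths.items():
    print(f"{name}: {value:.3f}")
```

In a linear model like this one, the total effect equals the direct effect plus the indirect effect, which is what allows path analysis to separate how much of a relationship runs through the mediating variable.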

Article Details

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.


