Linear Regression in Medical Research

Patrick Schober

From the * Department of Anesthesiology, Amsterdam UMC, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands

Thomas R. Vetter

† Department of Surgery and Perioperative Care, Dell Medical School at the University of Texas at Austin, Austin, Texas.

Related Article, see p 110

Linear regression is used to quantify the relationship between ≥1 independent (predictor) variables and a continuous dependent (outcome) variable.

In this issue of Anesthesia & Analgesia , Müller-Wirtz et al 1 report results of a study in which they used linear regression to assess the relationship in a rat model between tissue propofol concentrations and exhaled propofol concentrations (Figure).


Table 2 given in Müller-Wirtz et al, 1 showing the estimated relationships between tissue (or plasma) propofol concentrations and exhaled propofol concentrations. The authors appropriately report the 95% confidence intervals as a measure of the precision of their estimates, as well as the coefficient of determination ( R 2 ). The presented values indicate, for example, that (1) the exhaled propofol concentrations are estimated to increase on average by 4.6 units, equal to the slope (regression) coefficient, for each 1-unit increase of plasma propofol concentration; (2) the “true” mean increase could plausibly be expected to lie anywhere between 3.6 and 5.7 units as indicated by the slope coefficient’s confidence interval; and (3) the R 2 suggests that about 71% of the variability in the exhaled concentration can be explained by its relationship with plasma propofol concentrations.

Linear regression is used to estimate the association of ≥1 independent (predictor) variables with a continuous dependent (outcome) variable. 2 In the simplest case, thus referred to as “simple linear regression,” there is only one independent variable. Simple linear regression fits a straight line to the data points that best characterizes the relationship between the dependent ( Y ) variable and the independent ( X ) variable, with the y -axis intercept ( b 0 ), and the regression coefficient being the slope ( b 1 ) of this line:

\[ Y = b_0 + b_1 X \]

A model that includes several independent variables is referred to as “multiple linear regression” or “multivariable linear regression.” Even though the term linear regression suggests otherwise, it can also be used to model curved relationships.

Linear regression is an extremely versatile technique that can be used to address a variety of research questions and study aims. Researchers may want to test whether there is evidence for a relationship between a categorical (grouping) variable (eg, treatment group or patient sex) and a quantitative outcome (eg, blood pressure). The 2-sample t test and analysis of variance, 3 which are commonly used for this purpose, are essentially special cases of linear regression. However, linear regression is more flexible, allowing for >1 independent variable and allowing for continuous independent variables. Moreover, when there is >1 independent variable, researchers can also test for the interaction of variables—in other words, whether the effect of 1 independent variable depends on the value or level of another independent variable.
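The statement that the 2-sample t test is a special case of linear regression can be checked numerically. The sketch below uses made-up blood pressure values (not data from any study discussed here) and regresses the outcome on a 0/1 group indicator; the variable names are illustrative assumptions.

```python
import numpy as np

# Hypothetical systolic blood pressures (mm Hg) for two groups of four patients.
group = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)  # 0 = control, 1 = treatment
bp = np.array([118, 122, 125, 131, 127, 133, 138, 142], dtype=float)
n = len(bp)

# Regression of the outcome on an intercept and the group indicator.
X = np.column_stack([np.ones(n), group])
(b0, b1), *_ = np.linalg.lstsq(X, bp, rcond=None)

# Standard error of the slope, computed from the residual variance.
residuals = bp - X @ np.array([b0, b1])
sigma2 = residuals @ residuals / (n - 2)
se_b1 = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
t_regression = b1 / se_b1

# Classical pooled-variance 2-sample t statistic.
treated, control = bp[group == 1], bp[group == 0]
mean_diff = treated.mean() - control.mean()
pooled_var = ((len(treated) - 1) * treated.var(ddof=1)
              + (len(control) - 1) * control.var(ddof=1)) / (n - 2)
t_classic = mean_diff / np.sqrt(pooled_var * (1 / len(treated) + 1 / len(control)))

print(b1, mean_diff)            # slope equals the difference in group means
print(t_regression, t_classic)  # the two t statistics coincide
```

The intercept is the mean of the reference group, and the slope is the difference between the group means, so testing whether the slope is zero is the same test as the equal-variance 2-sample t test.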

Linear regression not only tests for relationships but also quantifies their direction and strength. The regression coefficient describes the average (expected) change in the dependent variable for each 1-unit change in the independent variable for continuous independent variables or the expected difference versus a reference category for categorical independent variables. The coefficient of determination, commonly referred to as R 2 , describes the proportion of the variability in the outcome variable that can be explained by the independent variables. With simple linear regression, the coefficient of determination is also equal to the square of the Pearson correlation between the x and y values.
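A small numerical sketch may make these quantities concrete. The x and y values below are invented for illustration and are not the propofol data from Müller-Wirtz et al; the code fits a straight line, computes R 2 from the residuals, and compares it with the squared Pearson correlation.

```python
import numpy as np

# Hypothetical paired measurements (illustrative values only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([5.1, 9.8, 14.2, 19.5, 22.9, 28.6])

# Least-squares line: np.polyfit returns the slope first, then the intercept.
b1, b0 = np.polyfit(x, y, 1)
y_hat = b0 + b1 * x

# Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Pearson correlation between x and y.
r = np.corrcoef(x, y)[0, 1]

print(f"slope = {b1:.3f}, intercept = {b0:.3f}")
print(f"R^2 = {r_squared:.4f}, r^2 = {r ** 2:.4f}")
```

In simple linear regression the two printed values agree, which is the equality described in the text. With several independent variables, R 2 no longer equals the square of any single pairwise correlation.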

When including several independent variables, the regression model estimates the effect of each independent variable while holding the values of all other independent variables constant. 4 Thus, linear regression is useful (1) to distinguish the effects of different variables on the outcome and (2) to control for other variables—like systematic confounding in observational studies or baseline imbalances due to chance in a randomized controlled trial. Ultimately, linear regression can be used to predict the value of the dependent outcome variable based on the value(s) of the independent predictor variable(s).

Valid inferences from linear regression rely on its assumptions being met, including

  • the residuals are the differences between the observed values and the values predicted by the regression model, and the residuals must be approximately normally distributed and have approximately the same variance over the range of predicted values;
  • the residuals are also assumed to be uncorrelated. In simple language, the observations must be independent of each other; for example, there must not be repeated measurements within the same subjects. Other techniques like linear mixed-effects models are required for correlated data 5 ; and
  • the model must be correctly specified, as explained in more detail in the next paragraph.

Whereas Müller-Wirtz et al 1 used simple linear regression to address their research question, researchers often need to specify a multivariable model and make choices on which independent variables to include and on how to model the functional relationship between variables (eg, straight line versus curve; inclusion of interaction terms).

Variable selection is a much-debated topic, and the details are beyond the scope of this Statistical Minute. Basically, variable selection depends on whether the purpose of the model is to understand the relationship between variables or to make predictions. This is also predicated on whether there is informed a priori theory to guide variable selection and on whether the model needs to control for variables that are not of primary interest but are confounders that could distort the relationship between other variables.

Omitting important variables or interactions can lead to biased estimates and a model that poorly describes the true underlying relationships, whereas including too many variables leads to modeling the noise (sampling error) in the data and reduces the precision of the estimates. Various statistics and plots, including adjusted R 2 , Mallows C p , and residual plots are available to assess the goodness of fit of the chosen linear regression model.
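As one concrete instance of the statistics mentioned above, the adjusted R 2 in its commonly used form penalizes the ordinary R 2 for the number of independent variables. The symbols n (number of observations) and p (number of independent variables, not counting the intercept) are notation introduced here for illustration and do not appear in the article.

```latex
\[
  R^2_{\mathrm{adj}} \;=\; 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
\]
```

Because the penalty factor grows with p, adding a variable raises the adjusted R 2 only if it improves the fit by more than would be expected from fitting noise.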

Is It the Intervention or the Students? Using Linear Regression to Control for Student Characteristics in Undergraduate STEM Education Research

  • Roddy Theobald
  • Scott Freeman

Address correspondence to: Roddy Theobald ( E-mail Address: [email protected] ).

*Department of Statistics, University of Washington, Seattle, WA 98195-4322

Department of Biology, University of Washington, Seattle, WA 98195-4322

Although researchers in undergraduate science, technology, engineering, and mathematics education are currently using several methods to analyze learning gains from pre- and posttest data, the most commonly used approaches have significant shortcomings. Chief among these is the inability to distinguish whether differences in learning gains are due to the effect of an instructional intervention or to differences in student characteristics when students cannot be assigned to control and treatment groups at random. Using pre- and posttest scores from an introductory biology course, we illustrate how the methods currently in wide use can lead to erroneous conclusions, and how multiple linear regression offers an effective framework for distinguishing the impact of an instructional intervention from the impact of student characteristics on test score gains. In general, we recommend that researchers always use student-level regression models that control for possible differences in student ability and preparation to estimate the effect of any nonrandomized instructional intervention on student performance.

INTRODUCTION

For the past several decades, discipline-based education researchers have focused on testing whether educational interventions in college science classrooms lead to improved student understanding and performance. Most interventions are given at the classroom level, meaning that all students in a given classroom receive the intervention. For example, all students in a class may be exposed to a new multimedia program ( Aly et al. , 2004 ), asked to participate in reciprocal peer tutoring ( Fantuzzo et al. , 1989 ), or taught in a workshop or studio format ( Udovic et al. , 2002 ).

To evaluate the impact of educational interventions like these on student performance, researchers typically collect student test scores before and after the intervention—that is, from a pretest and a posttest. Although some researchers are interested in whether student scores improve after instruction (see Arwood, 2004 ; McConnell et al. , 2006 ; Nam and Ito, 2011 ), most are interested in demonstrating that student test scores improve more in treatment classrooms than in control classrooms—that is, in sections that do receive the intervention versus sections that do not.

What is the best way to analyze pre–post data in this setting? At least four different methods for determining whether learning gains differ in the treatment and control classrooms are commonly used in the science, technology, engineering, and mathematics (STEM) education literature: comparing 1) raw change scores (e.g., Udovic et al. , 2002 ); 2) normalized gain scores ( Hake, 1998 ); 3) normalized change scores ( Marx and Cummings, 2007 ); and 4) effect sizes ( Andrews et al ., 2011 ). Unfortunately, none of these methods accounts for a fundamental problem: controlling for student equivalence, or lack thereof, in the classrooms being compared.

The problem of student nonequivalence is pervasive, because it is seldom possible to use randomization to control for differences in student ability or preparation (but see Fantuzzo et al ., 1989 ; Buzzell et al. , 2002 ; Aly et al. , 2004 ; Bilgin et al. , 2009 ). While nonrandomized designs are often unavoidable—it is very difficult to convince a registrar's office to randomly assign students to courses—they raise difficult questions about interpreting results. Namely, researchers who use the methods listed above have no way of knowing whether observed differences in learning gains between the treatment and control classes are due to the impact of the intervention itself or to differences between treatment and control classes—including the instructor, the instructional techniques used, and student characteristics—that are completely independent of the intervention.

In this paper, we use test score data from two sections of a college-level introductory biology course to illustrate how each of the four commonly used methods can lead to misleading conclusions. The two sections were taught by the same instructor, in the same term, using identical instructional techniques. However, due to a scheduling conflict during that term, the students enrolled in one of the sections had substantially better academic qualifications, on average, than students in the other section. We show that each of the four methods commonly used to assess educational interventions in college STEM classrooms would support the conclusion that an “instructional intervention” in the higher-performing section led to larger student learning gains, when in fact there was no intervention at all.

We propose a solution to the problem by introducing an approach that is ubiquitous in many other research areas but currently underused in the STEM education literature: multiple linear regression analysis. Specifically, we employ a student-level regression model that controls for observable differences between students in the treatment and control classes and demonstrate that it leads to the correct conclusion: differences in learning gains between the two sections are driven by differences in the composition of the students in the two sections, not by any intervention that was given in one section or the other. We argue that to estimate an unbiased intervention effect when analyzing data from nonrandom experimental designs, researchers must account for student background in a regression framework.

REVIEWING EXISTING METHODS FOR ANALYZING PRE–POST DATA

Before introducing regression approaches for analyzing pre–post data, we provide a brief review of the four approaches commonly used in the undergraduate STEM literature to analyze pre/posttest data and discuss some relative advantages and disadvantages of each. However, we stress that none of these four methods accounts for possible differences in the student composition of the treatment and control courses.

Raw Change Scores

Udovic et al. (2002) compare student learning gains in a “workshop” introductory biology course, which included numerous active-learning activities, with learning gains in comparison courses taught primarily through lectures. Like Dori et al. (2007 ), Fallahi (2008 ), and Linsey et al. (2007, 2009 ), Udovic and colleagues use a t test to compare what we refer to as “raw change scores” between treatment and control classes. Raw change scores are simply the difference between the postscore and the prescore. Udovic and coworkers contend that if student scores in the treatment course improve more, on average, from the pretest to the posttest than do student scores in the control course, then the gains must be due to the intervention in the treatment courses. This procedure is identical—meaning that it will result in the same p value and conclusions regarding the effect of the intervention—to the two-way repeated-measures analysis of variance (ANOVA) used by Martin et al. (2007) .

In both the treatment and control classes, the authors compute the average raw change for each of the 11 questions on their pre- and posttest. The mean raw change was higher in the treatment classes than in the control classes for all 11 questions, and the t test rejected the null hypothesis that the mean raw change scores were equal for seven of the 11 questions. The authors conclude that the active-learning strategies in the treatment courses had a significant impact on student learning gains.

Analyzing raw change is attractive in terms of simplicity but does not account for the observation that students with low scores on a pretest have more to gain than students who score higher. The problem arises because test scores are bounded—meaning that they have an upper limit. To account for differences in “ease of improving” from pre to post, researchers have used two methods for standardizing or normalizing gain scores, one at the classroom level and one at the student level.

Normalized Gain Scores

Reporting < g >, the normalized learning gain, became popular in the undergraduate STEM education literature for several reasons. First, by normalizing by the maximum gain possible in each class, it accounts for the fact that some classrooms have more room to gain than other classrooms. A class that scores an 80% on the pretest and a 90% on the posttest has an average normalized gain of 0.5, matching a class that scores 60% on the pretest and 80% on the posttest (i.e., each class gained exactly half of the amount it could have gained on the posttest). Second, the size of Hake's initial study made it possible for researchers to compare learning gains informally across classrooms, even if their own study did not include enough classes to make a formal statistical test possible. That is, researchers could compute < g > for one or a few classrooms under study and make a judgment about whether the values are similar to those reported in the Hake study (e.g., Knight and Wood, 2005 ; McDaniel et al. , 2007 ; Tanahoung et al. , 2009 ). Finally, in studies with large numbers of treatment and control classrooms, < g > can be used to formally test whether learning gains in the treatment classes are larger than in the control classes (e.g., Redish and Steinberg, 1999 ; LoPresto and Murrell, 2009 ). However, using < g > results in low sample sizes and thus poor statistical power, because it uses the class as the unit of analysis instead of using individual students.
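The two classes in the example above can be checked with a short calculation. This sketch assumes the usual form of the class-level normalized gain (the gain achieved divided by the maximum gain possible on a 100-point scale); the function name and the `max_score` parameter are illustrative choices, not taken from the paper.

```python
def normalized_gain(pre_mean, post_mean, max_score=100.0):
    """Class-level normalized gain: achieved gain over the maximum possible gain."""
    if pre_mean >= max_score:
        raise ValueError("no room to gain when the pretest mean is at the maximum")
    return (post_mean - pre_mean) / (max_score - pre_mean)


# The two classes described in the text.
gain_high_pretest = normalized_gain(80, 90)  # gained 10 of a possible 20 points
gain_low_pretest = normalized_gain(60, 80)   # gained 20 of a possible 40 points
print(gain_high_pretest, gain_low_pretest)
```

Both classes gained half of what they could have gained, even though their raw changes (10 and 20 points) differ by a factor of two.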

Normalized Change Scores

For students who score higher on the posttest than the pretest, the student-level normalized change score is computed similarly to a classroom-level normalized gain. The last three possibilities deal with unusual circumstances: students who score 0 or 100 on both the pre- and posttest are dropped; students who score the same on the pre- and posttest get a 0; and students who score lower on the posttest than the pretest have this negative gain scaled by the possible number of points they could have lost.
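The four cases described in this paragraph can be written as a piecewise definition. The following is a reconstruction from the verbal description, assuming scores on a 0 to 100 scale; the exact notation of the original equation is not recoverable here.

```latex
\[
c =
\begin{cases}
  \dfrac{\text{post} - \text{pre}}{100 - \text{pre}} & \text{if } \text{post} > \text{pre}, \\[2ex]
  \text{dropped} & \text{if } \text{post} = \text{pre} = 0 \text{ or } \text{post} = \text{pre} = 100, \\[1ex]
  0 & \text{if } \text{post} = \text{pre} \text{ (otherwise)}, \\[1ex]
  \dfrac{\text{post} - \text{pre}}{\text{pre}} & \text{if } \text{post} < \text{pre}.
\end{cases}
\]
```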

Because normalized change scores compare learning gains for students rather than for classrooms, they have two substantial advantages over normalized gain scores. First, they can be used to compare the impact of interventions assigned within classrooms rather than across classrooms (see Smith et al. , 2011 , for an example). Second, because the observations are at the student level rather than at the classroom level, the sample size is substantially larger compared with using normalized gain scores, providing increased statistical power.

Normalized change scores have an important limitation, however. If students get a perfect score on the posttest, their c is 1 no matter whether their prescore was 1% or 99%. Similarly, if students score the same on the pre- and posttest, their score is 0, no matter whether their prescore was 1% or 99%. In these cases, the goal of normalizing for “ease of improvement” is lost.
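The limitation is easy to see numerically. The sketch below implements the student-level normalized change as described in the text, assuming a 0 to 100 scale; the function name and the use of `None` for dropped students are illustrative choices.

```python
def normalized_change(pre, post, max_score=100.0):
    """Student-level normalized change c, following the four cases in the text."""
    if pre == post and pre in (0.0, max_score):
        return None  # student is dropped from the analysis
    if post > pre:
        return (post - pre) / (max_score - pre)  # gain scaled by the possible gain
    if post == pre:
        return 0.0
    return (post - pre) / pre  # loss scaled by the possible loss


# A perfect posttest gives c = 1 whether the prescore was 1% or 99%.
print(normalized_change(1, 100), normalized_change(99, 100))

# An unchanged score gives c = 0 whether the prescore was 1% or 99%.
print(normalized_change(1, 1), normalized_change(99, 99))
```

A student who rose from 1% to 100% and a student who rose from 99% to 100% receive the same c, so the score no longer reflects how hard the improvement was to achieve.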

Effect Sizes

Andrews and colleagues’ (2011) use of linear regression is an important addition to the undergraduate STEM education literature, as it allows them to control for factors other than active learning—such as the instructor's position and years of teaching experience, class size, and student-rated course difficulty—that could influence learning gains in the treatment and control classrooms. However, Andrews and colleagues estimate their regression at the classroom level and do not have access to student characteristics that can be used as control variables. A large K–12 literature (e.g., Rockoff 2004 ; Rivkin et al. , 2005 ) demonstrates that observable student characteristics are often correlated not just with student performance but also with student learning gains. Thus, this approach—like the prior three methods we reviewed—does not account for the possibility that differences in student learning gains, or lack thereof, are due to differences in the characteristics of students in the treatment and control classrooms rather than to the effect of the intervention.

CONTROLLING FOR STUDENT NONEQUIVALENCE: THE PROBLEM

To illustrate the importance of controlling for observable student characteristics in the treatment and control classes when evaluating the impact of nonrandomized educational interventions, we apply the four methods above to pre- and posttest scores from two sections of an introductory biology course offered during the Summer of 2012 at the University of Washington. Each section was taught by the same instructor using the exact same materials and instructional strategies. Thus, without knowing anything about the student composition of the two classes, there is no a priori reason to expect different student performance in the two classes. Given that there is actually no treatment at all, this should be an example of a statistical test wherein the null hypothesis—that the treatment had no impact on student learning gains—should not be rejected. We will label one of these sections as the treatment (section A) and the other section as the control (section B).

We will demonstrate that each of the methods above does lead to the conclusion that, for this particular pair of sections in this particular course, learning gains in the treatment class are higher than in the control class. This would ordinarily be taken as evidence that the “instructional intervention” in the treatment class had a significant impact on student learning gains. But given that there was no intervention at all, we explore whether the student composition of these two particular sections may have contributed to the incorrect conclusion. Throughout the analysis that follows, we interpret the results of all tests of statistical significance at the 90% confidence level—meaning that the p value must be <0.1 to reject the null hypothesis. We caution, however, against overreliance on conventional levels of statistical significance.

Data Overview

At the start of the term, students in each section took a diagnostic test ( Shi et al ., 2010 ), converted to a 100-point scale, that was intended to measure their prior knowledge about the topics to be covered in the course. Then 2 wk into the course, students took an in-class exam on the same material—also graded on a 100-point scale—that covered material taught in the first 2 wk of the course. We will treat the diagnostic test as the pretest and the in-class exam as the posttest. The average pretest scores were 59.8 (SD = 18.1) in section A (the “treatment” section) and 59.3 (SD = 17.0) in section B (the “control” section), and are not significantly different between the two sections (the p value from a two-sample t test is 0.865). This is important because many authors (e.g., McDaniel et al ., 2007 ) assume that the treatment and control classes have similar incoming characteristics if the pretest scores are not significantly different. The average posttest scores were 72.0 (SD = 15.8) in the treatment section and 67.0 (SD = 15.0) in the control section, which a t test indicates is significantly different ( p = 0.050). We now analyze these data using the four methods discussed in the preceding section.

Comparison of Raw Change Scores

In the treatment class, the average raw change score is 72.0 – 59.8 = 12.2 (SD = 15.0), while in the control class, the average raw change score is 67.0 – 59.3 = 7.7 (SD = 15.8). A t test of the null hypothesis that these average raw change scores are the same gives a p value of 0.077, which is statistically significant at the 90% confidence level. Thus, with this methodology, there is sufficient evidence to reject the null hypothesis and conclude that student learning gains were greater in the treatment class than in the control class.

Comparison of Normalized Change Scores

The average normalized change score in the treatment class is 0.31 (SD = 0.29), while the average normalized change score in the control class is 0.19 (SD = 0.29). A t test of the null hypothesis that these average normalized change scores are the same gives a p value of 0.012, which is statistically significant. Thus, with this methodology, there is sufficient evidence to reject the null hypothesis and conclude that student learning gains were greater in the treatment class than in the control class.

Potential Explanation

Each of the above methods could lead to the conclusion that the intervention in the treatment class had a significant positive impact on student learning gains. But given that there was no intervention at all, there must be another explanation for the observed difference in learning gains. One possibility is that the differences occurred by chance. The p value for a t test comparing normalized change scores, for example, tells us that if there were truly no difference between the sections, there would be only a 1.2% chance of observing a difference this extreme. Another more probable explanation, though, is that the student composition of the two sections is driving the differences.

To investigate this hypothesis, we collected data on two measures that should reflect student ability and preparation: incoming undergraduate grade point average (GPA) and final grade in the preceding course in the introductory biology sequence. Due to a scheduling conflict during this particular term, the two sections had substantially different incoming performance levels. Specifically, the average incoming GPA in the treatment class was 3.33 (SD = 0.42), which a t test shows is significantly higher ( p < 0.001) than the average incoming GPA in the control class, 3.04 (SD = 0.43). Likewise, the prior biology grade averaged 3.09 (SD = 0.76) in the treatment class, which a t test indicates is significantly higher ( p < 0.001) than the average prior biology grade in the control class, 2.69 (SD = 0.66).

This observation underlines the central message of this paper. The gold standard for evaluating the impact of treatments of any kind—educational or otherwise—is a large randomized controlled trial. If sample sizes are large and if treatments are randomly assigned to the experimental subjects (or students, in this case), then there is no reason to expect the treatment and control groups to differ in any way, except that the treatment group received the treatment, while the control group did not. But in the context of evaluating the impact of interventions in undergraduate STEM classrooms, it is often not feasible to randomly assign students to treatment and control classes. The nonrandomized design that results opens up the possibility that the treatment and control classes will be substantially different, as our example shows. If these differences are correlated with student learning gains, then any of the methods above runs the risk of attributing observed differences to the impact of the treatment, when in reality they are due to differences in the composition of the groups being compared. This is true even if the treatment had no effect at all.

Given that many interventions in undergraduate STEM education cannot be randomized, is there a way to distinguish the impact of incoming student characteristics from the impact of the intervention itself? We argue that the answer is often yes , and that multiple linear regression can be a useful tool in any such analysis. We introduce this methodology in the next section, apply it to our data, and demonstrate that it leads to the correct conclusion: controlling for incoming student characteristics, there is no statistically significant difference in learning gains between the treatment and control classes in our example. Our goal in the following section is not to provide a rigorous theory of linear regression, but rather to motivate its use for evaluating the impact of educational interventions on student learning gains. We refer interested readers to chapters 3 and 4 of Gelman and Hill (2007) for an accessible discussion of broader considerations in linear regression.

CONTROLLING FOR STUDENT NONEQUIVALENCE: A SOLUTION

It is intuitive to think of a student's performance on a test as a function of many factors: the student's prior knowledge about the specific topics on the test, the student's understanding of the larger discipline, the student's work habits and study skills, and the intervention itself. A linear regression model formalizes this intuition by assuming that an outcome or dependent variable (in this case, a student's score on the posttest) is a linear function of explanatory (or control) variables and the intervention itself. Linear regression is not the only methodology that allows for this framework, but we will restrict our attention to it for simplicity.

A key step in a linear regression analysis is collecting data about control variables—measurements that can serve as proxies for factors that may influence the outcome variable, other than the treatment of interest. In a pre- and posttest setting, each student's score on the pretest is one obvious control variable, as the prescore controls for each student's prior knowledge about the specific topics on the test. In our example, we also collected data on each student's undergraduate GPA and grade in a previous biology class. The latter may be a reasonable proxy for each student's understanding of the broader field of biology, while both measures provide some information about each student's work habits and study skills.

Undergraduate GPA and previous biology grade are certainly not the only variables we could select to control for variation in student preparation and ability. In fact, researchers often have access to more student-level variables than are practical to use. Procedures like stepwise regression can assist researchers in selecting control variables that are most predictive of the outcome variable (see Freeman et al ., 2007 , for an example). Researchers can also use professional judgment—based on the available variables, data from similar studies in the literature, and their own experience—in selecting control variables. We chose a measure of overall academic performance (undergraduate GPA) and a measure of performance specific to biology (previous biology grade). Although the correlation between these variables is high ( r = 0.84), we chose both to account for the possibility that they capture different dimensions of student academic background—a decision that is borne out in the results in the next section. We also note that if we had data on many classrooms taught at different times by different instructors, we would also consider controlling for indicators such as time-of-day and class instructor, if the data suggested these indicators were relevant to the outcome being measured.

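With each control variable centered at its mean, the regression model for student i in our example can be written as

\[ \text{Postscore}_i = \beta_0 + \beta_1\,\text{Prescore}_i + \beta_2\,\text{GPA}_i + \beta_3\,\text{PriorGrade}_i + \beta_4\,\text{Treatment}_i + \varepsilon_i \qquad (1) \]

where Treatment equals 1 for students who received the intervention and 0 otherwise, and: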

β 0 is the intercept, or the expected postscore for a student with an average prescore, GPA, and prior grade, who did not receive the treatment (note that if the control variables are not centered, β 0 is the expected postscore for a student with prescore of 0, GPA of 0, etc., which are not meaningful values);

β 1 is the expected increase in the postscore for each additional point on the student's prescore;

β 2 is the expected increase in the postscore for each additional GPA point;

β 3 is the expected increase in the postscore for each additional grade point from the previous course; and

β 4 is the expected increase in the postscore for students who received the intervention relative to students who did not receive the intervention.

In contrast to approaches like type I ANOVA that estimate the effect of each variable sequentially, a linear regression estimates each of these coefficients simultaneously. Thus, each of these regression coefficients should be interpreted as “all else equal,” meaning that they represent the marginal effect of changing one variable while holding all the other variables constant . The error term ɛ captures the reality that the regression equation does not perfectly predict each student's postscore.
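The "all else equal" interpretation can be illustrated with simulated data. The sketch below does not use the authors' data: every number in it (sample size, coefficients, group shifts, noise level) is an assumption chosen for illustration. It builds two sections with no true treatment effect, gives the "treatment" section a higher GPA and prior grade, and compares a naive difference in mean postscores with the regression estimate of β 4 .

```python
import numpy as np

rng = np.random.default_rng(0)
n_per = 100  # hypothetical number of students per section

# Hypothetical baseline characteristics, reused for both sections so that the
# only systematic differences are the shifts added to the "treatment" section.
base_pre = rng.normal(60, 18, n_per)
base_gpa = rng.normal(3.0, 0.4, n_per)
base_prior = rng.normal(2.7, 0.7, n_per)

treat = np.repeat([0.0, 1.0], n_per)          # 0 = control, 1 = "treatment"
pre = np.tile(base_pre, 2)
gpa = np.tile(base_gpa, 2) + 0.3 * treat      # treatment section has higher GPA
prior = np.tile(base_prior, 2) + 0.4 * treat  # and higher prior biology grade

# Center the control variables so the intercept is the expected postscore
# for an average student in the control section.
pre_c, gpa_c, prior_c = pre - pre.mean(), gpa - gpa.mean(), prior - prior.mean()

# Simulated postscores: the true treatment effect (beta_4) is zero.
noise = rng.normal(0, 1, 2 * n_per)
post = 65 + 0.3 * pre_c + 10 * gpa_c + 3 * prior_c + 0.0 * treat + noise

# Naive comparison that ignores student characteristics.
naive_diff = post[treat == 1].mean() - post[treat == 0].mean()

# Least-squares fit of the model with all coefficients estimated simultaneously.
X = np.column_stack([np.ones(2 * n_per), pre_c, gpa_c, prior_c, treat])
beta_hat, *_ = np.linalg.lstsq(X, post, rcond=None)

print(f"naive difference in mean postscore: {naive_diff:.2f}")
print(f"regression estimate of beta_4:      {beta_hat[4]:.2f}")
```

In this simulation the naive difference is several points, entirely because of the GPA and prior-grade shifts, while the regression estimate of the treatment coefficient stays close to zero. In practice a package such as statsmodels or R's `lm` would also report standard errors and p values for each coefficient.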

Statistical software packages provide an estimate for each regression coefficient and the p value from the t test of the null hypothesis that the coefficient equals zero. For example, consider the coefficient of interest in Eq. 1: $\beta_4$, or the “treatment effect” for students who received the intervention relative to students who did not receive the intervention. The null hypothesis is that this coefficient equals zero, that is, the intervention had no effect. Linear regression provides both an estimate of this treatment effect and a test of whether the treatment effect really is significantly different from zero, controlling for the influence of each of the other variables in the model. Note that if differences in learning gains between the treatment and control classes can be explained by the control variables and not by the intervention itself, then the treatment effect should not be significantly different from zero. On the other hand, if the intervention does have a significant impact on student performance, the null hypothesis should be rejected, and (if the regression model is correctly specified) the estimated treatment effect should quantify the average effect of the intervention on student test scores.
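
To make the role of the control variables concrete, the following is a minimal sketch on simulated data, not the authors' data or analysis. It assumes a simplified model with a single control (GPA) and a hypothetical true treatment effect of 2 points, and it deliberately gives the treated students higher GPAs. The helper `ols` is written here for illustration; it computes point estimates only and does not perform the t test that statistical packages report.

```python
import random


def ols(columns, y):
    """Least-squares coefficients [b0, b1, ...] from the normal equations."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented matrix (X'X | X'y).
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        piv = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[piv] = A[piv], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    return [A[r][p] / A[r][r] for r in range(p)]


rng = random.Random(1)
TRUE_EFFECT = 2.0  # hypothetical treatment effect, in test points

gpa, treated, post = [], [], []
for i in range(4000):
    t = i % 2                              # alternate control / treatment
    g = rng.gauss(3.0 + 0.5 * t, 0.4)      # treated students have higher GPAs
    gpa.append(g)
    treated.append(float(t))
    post.append(40.0 + 10.0 * g + TRUE_EFFECT * t + rng.gauss(0.0, 5.0))

naive = ols([treated], post)[1]            # difference in means, ignores GPA
adjusted = ols([gpa, treated], post)[2]    # "all else equal" estimate
print(f"naive: {naive:.2f}  adjusted: {adjusted:.2f}")
```

In this setup the naive comparison attributes the GPA gap to the intervention, while the regression that includes GPA should return an estimate close to the assumed effect of 2.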

Linear regression makes some important assumptions. While it is beyond the scope of this paper to discuss all of them in depth (see Gelman and Hill, 2007, section 3.6, for more details), there are a few that are particularly important for the present application. The first is that the error term $\varepsilon$ is normally distributed. This assumption can be problematic if the maximum score on the test creates a “ceiling effect” that artificially limits the scores of the best students in the class. In this situation, these students will consistently score lower than the model predicts, so their errors are systematically negative and the normality assumption is violated. Another assumption is that the influence of the control variables truly is linear and additive. There is no compelling reason, other than mathematical convenience, to assume that the influence of a student's prescore, GPA, and prior grades on his or her postscore is truly additive as opposed to multiplicative or otherwise nonlinear.

These assumptions are important, and there are many methods to test and relax them (see Gelman and Hill, 2007, chapters 3–6). Here, however, we focus on standard linear regression.

DATA ANALYSIS USING MULTIPLE LINEAR REGRESSION

a Significance levels from two-sided t test: + , p < 0.1; *, p < 0.05; **, p < 0.01; ***, p < 0.001.

For each coefficient, the null hypothesis that the coefficient equals zero is rejected at the 90% confidence level, so we have sufficient evidence to conclude that each of these control variables has an independent, significant correlation with student performance on the posttest. This is extremely important, as it means that even when controlling for a student's score on the pretest, a student's GPA and prior grades are still predictive of his or her score on the posttest. This may be due to students with higher GPAs having better study skills and work habits, and therefore preparing more effectively for the posttest. As in many studies, the posttest in our example was announced on the syllabus and awarded course points, while the pretest was not—a situation that may increase the impact of differences in motivation or preparation. Alternatively, it is possible that students who received a better grade in the previous biology course have a better understanding of the broader discipline, which helped them prepare for and answer questions on the posttest.

TOWARD INTEGRATION OF LINEAR REGRESSION IN UNDERGRADUATE STEM RESEARCH

We have shown that existing methods of evaluating interventions in college science classrooms can lead to erroneous conclusions when the interventions are not randomly assigned to students, and that linear regression can help mitigate this problem by controlling for observable characteristics that are also correlated with student learning gains. We caution that estimates from a linear regression do not justify a causal interpretation, except under strict assumptions, and that randomization of the intervention is still the best way to establish a treatment effect.

In our motivating example, we have used a multiple linear regression model to illustrate the simplicity and utility of a regression framework. However, there are many other reasons that education researchers should be drawn to this framework. Although we choose not to control for gender and ethnicity in our regression model, regression can also be used to test whether women, minorities, or any other affinity group are gaining more or less in our classrooms, all else equal. Regression models can also include interaction terms that test whether the intervention has a differential impact on different types of students. Researchers who currently use normalized change scores can simply use these values as the outcome variable in a linear regression. (When doing so, though, we recommend not including prescore as a predictor variable, as prescore is already included in normalized change.) Finally, while it is beyond the scope of this paper to discuss more complex regression methods, an even more rigorous approach could use generalized linear models to model nonlinear relationships between student characteristics and test scores, analyze student responses at the individual-question level, or produce unbiased estimates in the presence of a ceiling effect.

The undergraduate STEM education literature has made remarkable strides in recent years, but the methods commonly used to estimate the impact of instructional interventions lead to troubling questions about whether these treatment effects really are due to the interventions. It is possible that none of the results in the studies we reviewed would have changed if the researchers had controlled for student characteristics in a regression framework, but we hope we have illustrated that linear regression should be a component of any analysis of a nonrandomized instructional intervention. It is time for this growing literature to take the next step and ensure that reported treatment effects are the result of the intervention itself, not the students.

ACKNOWLEDGMENTS

We thank Ben Wiggins for discussion and for access to the pre–post testing data on introductory biology students, and Sara Brownell, Sarah Eddy, Elli Theobald, Mary Pat Wenderoth, and two anonymous reviewers for comments that improved the manuscript. The study was funded by a grant from the National Science Foundation (DUE-0942215) and conducted under Human Subjects Division application #44438.

  • Aly M, Elen J, Willems G (2004). Instructional multimedia program versus standard lecture: a comparison of two methods for teaching the undergraduate orthodontic curriculum. Eur J Dent Educ 8, 43-46.
  • Andrews TM, Leonard MJ, Colgrove CA, Kalinowski ST (2011). Active learning not associated with student learning in a random sample of college biology courses. CBE Life Sci Educ 10, 394-405.
  • Arwood L (2004). Teaching cell biology to nonscience majors through forensics, or how to design a killer course. Cell Biol Educ 3, 131-138.
  • Becker BJ (1988). Synthesizing standardized mean-change measures. Br J Math Stat Psych 41, 257-278.
  • Bilgin I, Şenocak E, Sözbilir M (2009). The effects of problem-based learning instruction on university students’ performance of conceptual and quantitative problems in gas concepts. Eurasia J Math Sci Tech Ed 5, 153-164.
  • Buzzell PR, Chamberlain VM, Pintauro SJ (2002). The effectiveness of web-based, multimedia tutorials for teaching methods of human body composition analysis. Adv Physiol Ed 26, 21-29.
  • Dori YJ, Hult E, Breslow L, Belcher JW (2007). How much have they retained? Making unseen concepts seen in a freshman electromagnetism course at MIT. J Sci Ed Tech 16, 299-323.
  • Dunlap WP, Cortina JM, Vaslow JB, Burke MJ (1996). Meta-analyses of experiments with matched groups or repeated measures designs. Psychol Methods 1, 170-177.
  • Fallahi CR (2008). Redesign of a life span development course using Fink's taxonomy. Teach Psychol 35, 169-175.
  • Fantuzzo JW, Dimeff LA, Fox SL (1989). Reciprocal peer tutoring: a multimodal assessment of effectiveness with college students. Teach Psychol 16, 133-135.
  • Freeman S, O’Connor E, Parks JW, Cunningham M, Hurley D, Haak D, Dirks C, Wenderoth MP (2007). Prescribed active learning increases performance in introductory biology. CBE Life Sci Educ 6, 132-139.
  • Gelman A, Hill J (2007). Data Analysis Using Regression and Multilevel/Hierarchical Models, New York: Cambridge University Press.
  • Hake RR (1998). Interactive-engagement versus traditional methods: a six-thousand-student survey of mechanics test data for introductory physics courses. Am J Phys 66, 64-74.
  • Knight JK, Wood WB (2005). Teaching more by lecturing less. Cell Biol Educ 4, 298-310.
  • Linsey J, Talley A, White C, Jensen D, Wood K (2009). From Tootsie Rolls to broken bones: an innovative approach for active learning in mechanics of materials. Adv Eng Educ 1, 1-23.
  • LoPresto MC, Murrell SR (2009). Using the Star Properties Concept Inventory to compare instruction with lecture tutorials to traditional lectures. Astron Educ Rev 8, 010105.
  • Martin T, Rivale SD, Diller KR (2007). Comparison of student learning in challenge-based and traditional instruction in biomedical engineering. Ann Biomed Eng 35, 1312-1323.
  • Marx JD, Cummings K (2007). Normalized change. Am J Phys 75, 87-91.
  • McConnell DA, et al. (2006). Using conceptests to assess and improve student conceptual understanding in introductory geoscience courses. J Geosci Educ 54, 61-68.
  • McDaniel CN, Lister BC, Hanna MH, Roy H (2007). Increased learning observed in redesigned introductory biology course that employed web-enhanced, interactive pedagogy. CBE Life Sci Educ 6, 243-9.
  • Nam Y, Ito E (2011). A climate change course for undergraduate students. J Geosci Educ 59, 229-241.
  • Redish EF, Steinberg RN (1999). Teaching physics: figuring out what works. Phys Today 52, 24-30.
  • Rivkin SG, Hanushek EA, Kain JF (2005). Teachers, schools, and academic achievement. Econometrica 73, 417-458.
  • Rockoff J (2004). The impact of individual teachers on student achievement: evidence from panel data. Am Econ Rev 94, 247-252.
  • Shi J, Wood WB, Martin JM, Guild NA, Vicens Q, Knight JK (2010). A diagnostic assessment for introductory molecular and cell biology. CBE Life Sci Educ 9, 453-461.
  • Smith MK, Wood WB, Krauter K, Knight JK (2011). Combining peer discussion with instructor explanation increases student learning from in-class concept questions. CBE Life Sci Educ 10, 55-63.
  • Tanahoung C, Chitaree R, Soankwan C, Sharma MD, Johnston ID (2009). The effect of interactive lecture demonstrations on students’ understanding of heat and temperature: a study from Thailand. Res Sci Tech Educ 27, 61-74.
  • Udovic D, Morris D, Dickman A, Postlethwait J, Wetherwax P (2002). Workshop Biology: demonstrating the effectiveness of active learning in an introductory biology course. BioScience 52, 272-281.

Submitted: 19 July 2013 Revised: 15 October 2013 Accepted: 21 October 2013

© 2014 R. Theobald and S. Freeman. CBE—Life Sciences Education © 2014 The American Society for Cell Biology. This article is distributed by The American Society for Cell Biology under license from the author(s). It is available to the public under an Attribution–Noncommercial–Share Alike 3.0 Unported Creative Commons License (http://creativecommons.org/licenses/by-nc-sa/3.0).

  • Published: 01 December 2015

Points of Significance

Multiple linear regression

  • Martin Krzywinski 2 &
  • Naomi Altman 1  

Nature Methods volume  12 ,  pages 1103–1104 ( 2015 ) Cite this article

When multiple variables are associated with a response, the interpretation of a prediction equation is seldom simple.

Last month we explored how to model a simple relationship between two variables, such as the dependence of weight on height¹. In the more realistic scenario of dependence on several variables, we can use multiple linear regression (MLR). Although MLR is similar to simple linear regression, the interpretation of MLR regression coefficients is confounded by the way in which the predictor variables relate to one another.

In simple linear regression¹, we model how the mean of variable $Y$ depends linearly on the value of a predictor variable $X$; this relationship is expressed as the conditional expectation $E(Y \mid X) = \beta_0 + \beta_1 X$. For more than one predictor variable $X_1, \ldots, X_p$, this becomes $\beta_0 + \sum_j \beta_j X_j$. As for simple linear regression, one can use the least-squares estimator (LSE) to determine estimates $b_j$ of the $\beta_j$ regression parameters by minimizing the residual sum of squares, $\mathrm{SSE} = \sum_i (y_i - \hat{y}_i)^2$, where $\hat{y}_i = b_0 + \sum_j b_j x_{ij}$. With the regression sum of squares defined as $\mathrm{SSR} = \sum_i (\hat{y}_i - \bar{Y})^2$, the ratio $R^2 = \mathrm{SSR}/(\mathrm{SSR} + \mathrm{SSE})$ is the proportion of variation explained by the regression model and in multiple regression is called the coefficient of determination.
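
The sums of squares defined above can be computed directly. The sketch below uses a single predictor, for which the least-squares estimates have a closed form, and six made-up height and weight values chosen only for illustration. It checks that SSR and SSE together account for the total variation around the mean, which is why the ratio SSR/(SSR + SSE) reads as a proportion.

```python
# Made-up illustrative data: heights in cm, weights in kg.
heights = [160.0, 162.0, 165.0, 167.0, 170.0, 172.0]
weights = [62.5, 61.8, 65.9, 66.1, 70.2, 69.6]

n = len(heights)
mean_x = sum(heights) / n
mean_y = sum(weights) / n

# Closed-form least-squares estimates for one predictor.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(heights, weights))
sxx = sum((x - mean_x) ** 2 for x in heights)
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x

fitted = [b0 + b1 * x for x in heights]

sse = sum((y - f) ** 2 for y, f in zip(weights, fitted))   # residual sum of squares
ssr = sum((f - mean_y) ** 2 for f in fitted)               # regression sum of squares
sst = sum((y - mean_y) ** 2 for y in weights)              # total sum of squares

r_squared = ssr / (ssr + sse)
print(f"b0={b0:.2f} b1={b1:.3f} R^2={r_squared:.3f}")
```

The same definitions carry over unchanged to several predictors; only the computation of the fitted values differs.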

The slope $\beta_j$ is the change in $Y$ if predictor $j$ is changed by one unit and others are held constant. When normality and independence assumptions are fulfilled, we can test whether any (or all) of the slopes are zero using a t-test (or regression F-test). Although the interpretation of $\beta_j$ seems to be identical to its interpretation in the simple linear regression model, the innocuous phrase “and others are held constant” turns out to have profound implications.

To illustrate MLR, and some of its perils, here we simulate predicting the weight ($W$, in kilograms) of adult males from their height ($H$, in centimeters) and their maximum jump height ($J$, in centimeters). We use a model similar to that presented in our previous column¹, but we now include the effect of $J$: $W = \beta_H H + \beta_J J + \beta_0 + \varepsilon$, so that $E(W \mid H, J) = \beta_H H + \beta_J J + \beta_0$, with $\beta_H = 0.7$, $\beta_J = -0.08$, $\beta_0 = -46.5$ and normally distributed noise $\varepsilon$ with zero mean and $\sigma = 1$ (Table 1). We set $\beta_J$ negative because we expect a negative correlation between $W$ and $J$ when height is held constant (i.e., among men of the same height, lighter men will tend to jump higher). For this example we simulated a sample of size $n = 40$ with $H$ and $J$ normally distributed with means of 165 cm ($\sigma = 3$) and 50 cm ($\sigma = 12.5$), respectively.

Although the statistical theory for MLR seems similar to that for simple linear regression, the interpretation of the results is much more complex. Problems in interpretation arise entirely as a result of the sample correlation² among the predictors. We do, in fact, expect a positive correlation between $H$ and $J$: tall men will tend to jump higher than short ones. To illustrate how this correlation can affect the results, we generated values using the model for weight with samples of $J$ and $H$ with different amounts of correlation.
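
One way to generate such data is sketched below. The column does not say how its correlated samples were constructed, so the mixing step used here (building $J$ from the same standard normal draw as $H$ plus an independent one) is an assumption, and it fixes the population correlation rather than the exact sample correlation. The function names `simulate` and `pearson` are illustrative.

```python
import math
import random


def simulate(n, r, seed=0):
    """Draw n values of (H, J, W) with population correlation r between H and J."""
    rng = random.Random(seed)
    H, J, W = [], [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        h = 165.0 + 3.0 * z1
        # Mix the shared draw z1 with an independent draw z2 to get correlation r.
        j = 50.0 + 12.5 * (r * z1 + math.sqrt(1.0 - r * r) * z2)
        # Weight model with the coefficients stated in the text and unit noise.
        w = 0.7 * h - 0.08 * j - 46.5 + rng.gauss(0.0, 1.0)
        H.append(h)
        J.append(j)
        W.append(w)
    return H, J, W


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


H, J, W = simulate(40, 0.9)
print(f"sample r(H, J) = {pearson(H, J):.2f}")
```

With only 40 subjects the sample correlation will scatter around the target value, so a single run should not be expected to reproduce the requested correlation exactly.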

Let's look first at the regression coefficients estimated when the predictors are uncorrelated, $r(H, J) = 0$, as evidenced by the zero slope in the association between $H$ and $J$ (Fig. 1a). Here $r$ is the Pearson correlation coefficient². If we ignore the effect of $J$ and regress $W$ on $H$, we find $\hat{W} = 0.71H - 51.7$ ($R^2 = 0.66$) (Table 1 and Fig. 1b). Ignoring $H$, we find $\hat{W} = -0.088J + 69.3$ ($R^2 = 0.19$). If both predictors are fitted in the regression, we obtain $\hat{W} = 0.71H - 0.088J - 47.3$ ($R^2 = 0.85$). This regression fit is a plane in three dimensions ($H$, $J$, $W$) and is not shown in Figure 1. In all three cases, the results of the F-test for zero slopes show high significance ($P \le 0.005$).

Figure 1: (a) Simulated values of uncorrelated predictors, $r(H, J) = 0$. The thick gray line is the regression line, and thin gray lines show the 95% confidence interval of the fit. (b) Regression of weight ($W$) on height ($H$) and of weight on jump height ($J$) for uncorrelated predictors shown in a. Regression slopes are shown ($b_H = 0.71$, $b_J = -0.088$). (c) Simulated values of correlated predictors, $r(H, J) = 0.9$. Regression and 95% confidence interval are denoted as in a. (d) Regression (red lines) using correlated predictors shown in c. Light red lines denote the 95% confidence interval. Notice that $b_J = 0.097$ is now positive. The regression line from b is shown in blue. In all graphs, horizontal and vertical dotted lines show average values.

When the sample correlations of the predictors are exactly zero, the regression slopes ($b_H$ and $b_J$) for the “one predictor at a time” regressions and the multiple regression are identical, and the simple regression $R^2$ values sum to the multiple regression $R^2$ (0.66 + 0.19 = 0.85; Fig. 2). The intercept changes when we add a predictor with a nonzero mean, because the least-squares regression fit must pass through the sample means, which is always true when the regression model includes an intercept.

Figure 2

Shown are the values of regression coefficient estimates ( b H , b J , b 0 ) and R 2 and the significance of the test used to determine whether the coefficient is zero from 250 simulations at each value of predictor sample correlation −1 < r ( H , J ) < 1 for each scenario where either H or J or both H and J predictors are fitted in the regression. Thick and thin black curves show the coefficient estimate median and the boundaries of the 10th–90th percentile range, respectively. Histograms show the fraction of estimated P values in different significance ranges, and correlation intervals are highlighted in red where >20% of the P values are >0.01. Actual regression coefficients ( β H , β J , β 0 ) are marked on vertical axes. The decrease in significance for b J when jump height is the only predictor and r ( H , J ) is moderate (red arrow) is due to insufficient statistical power ( b J is close to zero). When predictors are uncorrelated, r ( H , J ) = 0, R 2 of individual regressions sum to R 2 of multiple regression (0.66 + 0.19 = 0.85). Panels are organized to correspond to Table 1 , which shows estimates of a single trial at two different predictor correlations.

Balanced factorial experiments show a sample correlation of zero among the predictors when their levels have been fixed. For example, we might fix three heights and three jump heights and select two men representative of each combination, for a total of 18 subjects to be weighed. But if we select the samples and then measure the predictors and response, the predictors are unlikely to have zero correlation.
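The additivity described above can be checked numerically. The following is a minimal sketch, not the authors' simulation: it builds the balanced 3 × 3 design with two subjects per cell, generates weights from the model with β H = 0.7 and β J = −0.08, and compares the one-predictor and two-predictor fits. The factor levels, the intercept of −46.5, the noise standard deviation and the helper names (`sxy`, `simple_fit`, `two_predictor_fit`) are assumptions made for illustration.

```python
import random

def sxy(x, y):
    """Sum of centered cross-products: sum((x - mean(x)) * (y - mean(y)))."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

def simple_fit(x, y):
    """Slope and R^2 of the one-predictor least-squares fit of y on x."""
    slope = sxy(x, y) / sxy(x, x)
    r2 = sxy(x, y) ** 2 / (sxy(x, x) * sxy(y, y))
    return slope, r2

def two_predictor_fit(x1, x2, y):
    """Slopes and R^2 of the least-squares fit of y on x1 and x2 (with intercept)."""
    s11, s22, s12 = sxy(x1, x1), sxy(x2, x2), sxy(x1, x2)
    s1y, s2y, syy = sxy(x1, y), sxy(x2, y), sxy(y, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    r2 = (b1 * s1y + b2 * s2y) / syy  # regression sum of squares / total
    return b1, b2, r2

# Balanced design: 3 heights x 3 jump heights x 2 subjects = 18 subjects.
# The levels are hypothetical; only the balance matters for the result.
random.seed(1)
H, J = [], []
for h in (160.0, 170.0, 180.0):
    for j in (40.0, 50.0, 60.0):
        H += [h, h]
        J += [j, j]

# Assumed model: W = 0.7 H - 0.08 J - 46.5 + noise (noise SD is an assumption).
W = [0.7 * h - 0.08 * j - 46.5 + random.gauss(0, 1) for h, j in zip(H, J)]

bH_simple, r2_H = simple_fit(H, W)
bJ_simple, r2_J = simple_fit(J, W)
bH_multi, bJ_multi, r2_multi = two_predictor_fit(H, J, W)

print(f"b_H: simple {bH_simple:.4f}, multiple {bH_multi:.4f}")
print(f"b_J: simple {bJ_simple:.4f}, multiple {bJ_multi:.4f}")
print(f"R2:  {r2_H:.4f} + {r2_J:.4f} = {r2_H + r2_J:.4f}, multiple {r2_multi:.4f}")
```

Because the sample cross-product of H and J is exactly zero in this design, the two-predictor normal equations decouple, which is why the slopes agree and the R 2 values add regardless of the noise that is drawn.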

When we simulate highly correlated predictors r ( H , J ) = 0.9 ( Fig. 1c ), we find that the regression parameters change depending on whether we use one or both predictors ( Table 1 and Fig. 1d ). If we consider only the effect of H , the coefficient β H = 0.7 is inaccurately estimated as b H = 0.44. If we include only J , we estimate β J = −0.08 inaccurately, and even with the wrong sign ( b J = 0.097). When we use both predictors, the estimates are quite close to the actual coefficients ( b H = 0.63, b J = −0.056).

In fact, as the correlation between predictors r ( H , J ) changes, the estimates of the slopes ( b H , b J ) and intercept ( b 0 ) vary greatly when only one predictor is fitted. We show the effects of this variation for all values of predictor correlation (both positive and negative) across 250 trials at each value ( Fig. 2 ). We include negative correlation because although J and H are likely to be positively correlated, other scenarios might use negatively correlated predictors (e.g., lung capacity and smoking habits). For example, if we include only H in the regression and ignore the effect of J , b H steadily decreases from about 1 to 0.35 as r ( H , J ) increases. Why is this? For a given height, larger values of J (an indicator of fitness) are associated with lower weight. If J and H are negatively correlated, as J increases, H decreases, and both changes result in a lower value of W . Conversely, as J decreases, H increases, and thus W increases. If we use only H as a predictor, J is lurking in the background, depressing W at low values of H and enhancing W at high levels of H , so that the effect of H is overestimated ( b H increases). The opposite effect occurs when J and H are positively correlated. A similar effect occurs for b J , which increases in magnitude (becomes more negative) when J and H are negatively correlated. Supplementary Figure 1 shows the effect of correlation when both regression coefficients are positive.
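The direction of this bias follows from an exact least-squares identity: the slope from the H-only fit equals the two-predictor slope for H plus the two-predictor slope for J multiplied by the slope of J regressed on H. The sketch below illustrates this with simulated data; it is not the authors' code, and the means, standard deviations, intercept, sample size and function names are all assumptions.

```python
import math
import random

def sxy(x, y):
    """Sum of centered cross-products of x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y))

def simulate(r, n=200, seed=0):
    """Draw H and J with population correlation r, then W from the assumed model."""
    rng = random.Random(seed)
    H, J, W = [], [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        h = 170 + 10 * z1                                   # hypothetical scale
        j = 50 + 10 * (r * z1 + math.sqrt(1 - r * r) * z2)  # correlated with h
        H.append(h)
        J.append(j)
        W.append(0.7 * h - 0.08 * j - 46.5 + rng.gauss(0, 1))
    return H, J, W

def slopes(H, J, W):
    """Return (b_H from the H-only fit, b_H and b_J from the joint fit, slope of J on H)."""
    sHH, sJJ, sHJ = sxy(H, H), sxy(J, J), sxy(H, J)
    sHW, sJW = sxy(H, W), sxy(J, W)
    det = sHH * sJJ - sHJ ** 2
    bH = (sJJ * sHW - sHJ * sJW) / det
    bJ = (sHH * sJW - sHJ * sHW) / det
    return sHW / sHH, bH, bJ, sHJ / sHH

for r in (-0.9, 0.0, 0.9):
    bH_only, bH, bJ, j_on_h = slopes(*simulate(r))
    print(f"r = {r:+.1f}: H-only b_H = {bH_only:.3f}; "
          f"b_H + b_J * slope(J on H) = {bH + bJ * j_on_h:.3f}")
```

With a negative β J, a positive slope of J on H pulls the H-only estimate below β H and a negative slope pushes it above, which matches the trend described for Figure 2.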

When both predictors are fitted ( Fig. 2 ), the regression coefficient estimates ( b H , b J , b 0 ) are centered at the actual coefficients ( β H , β J , β 0 ) with the correct sign and magnitude regardless of the correlation of the predictors. However, the standard error in the estimates steadily increases as the absolute value of the predictor correlation increases.
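For a model with two predictors and an intercept, the growth in standard error can be quantified with the variance inflation factor 1 / (1 − r²), where r is the sample correlation of the predictors: the sampling variance of each slope is multiplied by this factor relative to the uncorrelated case. The sketch below tabulates it; the function name is an assumption, and the formula applies to the two-predictor case only.

```python
import math

def variance_inflation(r):
    """Variance inflation factor for a two-predictor model with predictor correlation r."""
    return 1.0 / (1.0 - r * r)

for r in (0.0, 0.5, 0.9, 0.99):
    vif = variance_inflation(r)
    # The standard error scales with the square root of the variance.
    print(f"r = {r:4.2f}: variance x {vif:6.2f}, standard error x {math.sqrt(vif):5.2f}")
```

At r = 0.9 the variance is inflated by a factor of about 5.3, so the standard error is a little more than twice its value for uncorrelated predictors.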

Neglecting important predictors has implications not only for R 2 , which is a measure of the predictive power of the regression, but also for interpretation of the regression coefficients. Unconsidered variables that may have a strong effect on the estimated regression coefficients are sometimes called 'lurking variables'. For example, muscle mass might be a lurking variable with a causal effect on both body weight and jump height. The results and interpretation of the regression will also change if other predictors are added.

Given that missing predictors can affect the regression, should we try to include as many predictors as possible? No, for three reasons. First, any correlation among predictors will increase the standard error of the estimated regression coefficients. Second, having more slope parameters in our model will reduce interpretability and cause problems with multiple testing. Third, the model may suffer from overfitting. As the number of predictors approaches the sample size, we begin fitting the model to the noise. As a result, we may seem to have a very good fit to the data but still make poor predictions.
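The overfitting point can be made concrete with a standard result: when the response is pure noise (independent normal errors, intercept included), the expected R 2 from fitting p predictors to n observations is p / (n − 1). The sketch below tabulates this alongside adjusted R 2, which corrects for the number of predictors. The function names and the choice n = 20 are assumptions for illustration.

```python
def expected_null_r2(n, p):
    """Expected R^2 when p predictors are fitted to n observations of pure noise."""
    return p / (n - 1)

def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors and an intercept."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

n = 20
for p in (1, 5, 10, 15, 18):
    r2 = expected_null_r2(n, p)
    print(f"p = {p:2d}: expected R2 from noise = {r2:.2f}, "
          f"adjusted R2 = {adjusted_r2(r2, n, p):.2f}")
```

With 20 observations, 15 unrelated predictors are expected to "explain" roughly 79% of the variance, while the adjusted value stays at zero.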

MLR is powerful for incorporating many predictors and for estimating the effects of a predictor on the response in the presence of other covariates. However, the estimated regression coefficients depend on the predictors in the model, and they can be quite variable when the predictors are correlated. Accurate prediction of the response is not an indication that regression slopes reflect the true relationship between the predictors and the response.

1. Altman, N. & Krzywinski, M. Nat. Methods 12, 999–1000 (2015).

2. Altman, N. & Krzywinski, M. Nat. Methods 12, 899–900 (2015).

Author information

Authors and affiliations.

Naomi Altman is a Professor of Statistics at The Pennsylvania State University.

Martin Krzywinski is a staff scientist at Canada's Michael Smith Genome Sciences Centre.

Ethics declarations

Competing interests.

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1: Regression coefficients and R 2.

The significance and value of regression coefficients and R 2 for a model with both regression coefficients positive, E( W | H,J ) = 0.7 H + 0.08 J - 46.5 + ε. The format of the figure is the same as that of Figure 2 .

About this article

Cite this article.

Krzywinski, M., Altman, N. Multiple linear regression. Nat Methods 12 , 1103–1104 (2015). https://doi.org/10.1038/nmeth.3665

Published : 01 December 2015

Issue Date : December 2015

DOI : https://doi.org/10.1038/nmeth.3665

Open Access

Peer-reviewed

Research Article

Anxiety, Affect, Self-Esteem, and Stress: Mediation and Moderation Effects on Depression

Affiliations Department of Psychology, University of Gothenburg, Gothenburg, Sweden, Network for Empowerment and Well-Being, University of Gothenburg, Gothenburg, Sweden

Affiliation Network for Empowerment and Well-Being, University of Gothenburg, Gothenburg, Sweden

Affiliations Department of Psychology, University of Gothenburg, Gothenburg, Sweden, Network for Empowerment and Well-Being, University of Gothenburg, Gothenburg, Sweden, Department of Psychology, Education and Sport Science, Linneaus University, Kalmar, Sweden

Affiliations Network for Empowerment and Well-Being, University of Gothenburg, Gothenburg, Sweden, Center for Ethics, Law, and Mental Health (CELAM), University of Gothenburg, Gothenburg, Sweden, Institute of Neuroscience and Physiology, The Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden

  • Ali Al Nima
  • Patricia Rosenberg
  • Trevor Archer
  • Danilo Garcia

PLOS

  • Published: September 9, 2013
  • https://doi.org/10.1371/journal.pone.0073265

23 Sep 2013: Nima AA, Rosenberg P, Archer T, Garcia D (2013) Correction: Anxiety, Affect, Self-Esteem, and Stress: Mediation and Moderation Effects on Depression. PLOS ONE 8(9): 10.1371/annotation/49e2c5c8-e8a8-4011-80fc-02c6724b2acc. https://doi.org/10.1371/annotation/49e2c5c8-e8a8-4011-80fc-02c6724b2acc

Mediation analysis investigates whether a variable (i.e., mediator) changes in response to an independent variable and, in turn, affects a dependent variable. Moderation analysis, on the other hand, investigates whether the statistical interaction between independent variables predicts a dependent variable. Although the difference between these two types of analysis is explicit in the current literature, there is still confusion with regard to the mediating and moderating effects of different variables on depression. The purpose of this study was to assess the mediating and moderating effects of anxiety, stress, positive affect, and negative affect on depression.

Two hundred and two university students (males  = 93, females  = 113) completed questionnaires assessing anxiety, stress, self-esteem, positive and negative affect, and depression. Mediation and moderation analyses were conducted using techniques based on standard multiple regression and hierarchical regression analyses.

Main Findings

The results indicated that (i) anxiety partially mediated the effects of both stress and self-esteem upon depression, (ii) that stress partially mediated the effects of anxiety and positive affect upon depression, (iii) that stress completely mediated the effects of self-esteem on depression, and (iv) that there was a significant interaction between stress and negative affect, and between positive affect and negative affect upon depression.

The study highlights different research questions that can be investigated depending on whether researchers decide to use the same variables as mediators and/or moderators.

Citation: Nima AA, Rosenberg P, Archer T, Garcia D (2013) Anxiety, Affect, Self-Esteem, and Stress: Mediation and Moderation Effects on Depression. PLoS ONE 8(9): e73265. https://doi.org/10.1371/journal.pone.0073265

Editor: Ben J. Harrison, The University of Melbourne, Australia

Received: February 21, 2013; Accepted: July 22, 2013; Published: September 9, 2013

Copyright: © 2013 Nima et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: The authors have no support or funding to report.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Mediation refers to the covariance relationships among three variables: an independent variable (1), an assumed mediating variable (2), and a dependent variable (3). Mediation analysis investigates whether the mediating variable accounts for a significant amount of the shared variance between the independent and the dependent variables–the mediator changes in regard to the independent variable, in turn, affecting the dependent one [1] , [2] . On the other hand, moderation refers to the examination of the statistical interaction between independent variables in predicting a dependent variable [1] , [3] . In contrast to the mediator, the moderator is not expected to be correlated with both the independent and the dependent variable–Baron and Kenny [1] actually recommend that it is best if the moderator is not correlated with the independent variable and if the moderator is relatively stable, like a demographic variable (e.g., gender, socio-economic status) or a personality trait (e.g., affectivity).

Although the two types of analysis lead to different conclusions [3] and the distinction between the statistical procedures is part of the current literature [2] , there is still confusion about the use of moderation and mediation analyses with data pertaining to the prediction of depression. There are, for example, contradictions among studies that investigate mediating and moderating effects of anxiety, stress, self-esteem, and affect on depression. Depression, anxiety and stress are suggested to influence individuals' social relations and activities, work, and studies, as well as compromising decision-making and coping strategies [4] , [5] , [6] . Successfully coping with anxiety, depressiveness, and stressful situations may contribute to high levels of self-esteem and self-confidence, in addition to increasing well-being and psychological and physical health [6] . Thus, it is important to disentangle how these variables are related to each other. However, while some researchers perform mediation analysis with some of the variables mentioned here, other researchers conduct moderation analysis with the same variables. Seldom are both moderation and mediation performed on the same dataset. Before disentangling mediation and moderation effects on depression in the current literature, we briefly present the methodology behind the analysis performed in this study.

Mediation and moderation

Baron and Kenny [1] postulated several criteria for the analysis of a mediating effect: (i) there is a significant correlation between the independent and the dependent variable, (ii) the independent variable is significantly associated with the mediator, (iii) the mediator predicts the dependent variable even when the independent variable is controlled for, and (iv) the correlation between the independent and the dependent variable is eliminated or reduced when the mediator is controlled for. All the criteria are then tested using the Sobel test, which shows whether indirect effects are significant or not [1] , [7] . A complete mediating effect occurs when the correlation between the independent and the dependent variable is eliminated when the mediator is controlled for [8] . Analyses of mediation can, for example, help researchers to move beyond answering if high levels of stress lead to high levels of depression. With mediation analysis researchers might instead answer how stress is related to depression.
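As a worked illustration of the Sobel test, the statistic is z = ab / sqrt(b²·SE a² + a²·SE b²), where a is the unstandardized path from the independent variable to the mediator and b is the path from the mediator to the dependent variable with the independent variable controlled for. The sketch below plugs in the stress → anxiety → depression paths reported in the mediation analysis of this article ( B = .21, t = 7.35 and B = .05, t = 3.17). The standard errors are not reported and are back-calculated here as B / t, which is an assumption; the function name is also hypothetical.

```python
import math

def sobel_test(a, se_a, b, se_b):
    """Sobel z statistic and two-sided normal p-value for the indirect effect a * b."""
    z = (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

# Path a: stress -> anxiety; path b: anxiety -> depression, controlling for stress.
# Standard errors are approximated as B / t from the rounded reported values.
a, t_a = 0.21, 7.35
b, t_b = 0.05, 3.17
z, p = sobel_test(a, a / t_a, b, b / t_b)
print(f"Sobel z = {z:.2f}, p = {p:.4f}")
```

Under these assumptions the statistic comes out at about 2.9, in the same range as the z = 2.89 reported for this pathway; the small difference is what one would expect from working with rounded coefficients.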

In contrast to mediation, moderation investigates the unique conditions under which two variables are related [3] . The third variable here, the moderator, is not an intermediate variable in the causal sequence from the independent to the dependent variable. For the analysis of moderation effects, the relation between the independent and dependent variable must be different at different levels of the moderator [3] . Moderators are included in the statistical analysis as an interaction term [1] . When analyzing moderating effects, the variables should first be centered (i.e., transformed so that the mean becomes 0) or standardized (so that, in addition, the standard deviation becomes 1) in order to avoid problems with multicollinearity [8] . Moderating effects can be calculated using multiple hierarchical linear regressions whereby main effects are presented in the first step and interactions in the second step [1] . Analysis of moderation, for example, helps researchers to answer when or under which conditions stress is related to depression.
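The reason for centering before forming an interaction term can be seen in a small example. The sketch below uses hypothetical scores on a balanced 5 × 5 grid (not data from this study) and compares the correlation between a predictor and its product term before and after centering; the helper names are assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def center(values):
    """Subtract the mean so that the centered variable has mean 0."""
    m = sum(values) / len(values)
    return [v - m for v in values]

# Hypothetical predictor x (10..14) and moderator z (20..24), every combination once.
x = [xi for xi in range(10, 15) for _ in range(5)]
z = [zi for _ in range(5) for zi in range(20, 25)]

raw_product = [a * b for a, b in zip(x, z)]
xc, zc = center(x), center(z)
centered_product = [a * b for a, b in zip(xc, zc)]

r_raw = pearson(x, raw_product)
r_centered = pearson(xc, centered_product)
print(f"corr(x, x*z) with raw scores:      {r_raw:.3f}")
print(f"corr(x, x*z) with centered scores: {r_centered:.3f}")
```

In this balanced example the correlation between the predictor and the product term drops from about 0.88 to zero after centering. With real questionnaire data the reduction is typically substantial but not exact, and centering does not change the test of the interaction term itself.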

Mediation and moderation effects on depression

Cognitive vulnerability models suggest that maladaptive self-schema mirroring helplessness and low self-esteem explain the development and maintenance of depression (for a review see [9] ). These cognitive vulnerability factors become activated by negative life events or negative moods [10] and are suggested to interact with environmental stressors to increase risk for depression and other emotional disorders [11] , [10] . In this line of thinking, the experience of stress, low self-esteem, and negative emotions can cause depression, but also be used to explain how (i.e., mediation) and under which conditions (i.e., moderation) specific variables influence depression.

Using mediational analyses to investigate how cognitive therapy interventions reduced depression, researchers have shown that the intervention reduced anxiety, which in turn was responsible for 91% of the reduction in depression [12] . In the same study, reductions in depression, by the intervention, accounted only for 6% of the reduction in anxiety. Thus, anxiety seems to affect depression more than depression affects anxiety and, together with stress, is both a cause of and a powerful mediator influencing depression (see also [13] ). Indeed, there are positive relationships between depression, anxiety and stress in different cultures [14] . Moreover, while some studies show that stress (independent variable) increases anxiety (mediator), which in turn increases depression (dependent variable) [14] , other studies show that stress (moderator) interacts with maladaptive self-schemata (independent variable) to increase depression (dependent variable) [15] , [16] .

The present study

In order to illustrate how mediation and moderation can be used to address different research questions, we first focus our attention on anxiety and stress as mediators of different variables that have earlier been shown to be related to depression. Secondly, we use all variables to find which of these variables moderate the effects on depression.

The specific aims of the present study were:

  • To investigate if anxiety mediated the effect of stress, self-esteem, and affect on depression.
  • To investigate if stress mediated the effects of anxiety, self-esteem, and affect on depression.
  • To examine moderation effects between anxiety, stress, self-esteem, and affect on depression.

Ethics statement

This research protocol was approved by the Ethics Committee of the University of Gothenburg and written informed consent was obtained from all the study participants.

Participants

The present study was based upon a sample of 206 participants (males  = 93, females  = 113). All the participants were first year students in different disciplines at two universities in South Sweden. The mean age for the male students was 25.93 years ( SD  = 6.66), and 25.30 years ( SD  = 5.83) for the female students.

In total, 206 questionnaires were distributed to the students. Of these, 202 were completed, leaving a total dropout of 1.94%. The dropout comprised three questionnaires with sections that the participants chose not to respond to at all and one questionnaire with a section that was completed incorrectly. None of these four questionnaires was included in the analyses.

Instruments

Hospital Anxiety and Depression Scale [17] .

The Swedish translation of this instrument [18] was used to measure anxiety and depression. The instrument consists of 14 statements (7 of which measure depression and 7 measure anxiety) for which participants are asked to rate their degree of agreement on a Likert scale (0 to 3). The utility, reliability and validity of the instrument have been shown in multiple studies (e.g., [19] ).

Perceived Stress Scale [20] .

The Swedish version [21] of this instrument was used to measure individuals' experience of stress. The instrument consists of 14 statements that participants rate on a Likert scale (0 = never , 4 = very often ). High values indicate that the individual expresses a high degree of stress.

Rosenberg's Self-Esteem Scale [22] .

Rosenberg's Self-Esteem Scale (Swedish version by Lindwall [23] ) consists of 10 statements focusing on general feelings toward the self. Participants are asked to report their degree of agreement on a four-point Likert scale (1 = agree not at all , 4 = agree completely ). This is the most widely used instrument for estimation of self-esteem, with high levels of reliability and validity (e.g., [24] , [25] ).

Positive Affect and Negative Affect Schedule [26] .

This is a widely applied instrument for measuring individuals' self-reported mood and feelings. The Swedish version has been used among participants of different ages and occupations (e.g., [27] , [28] , [29] ). The instrument consists of 20 adjectives, 10 positive affect (e.g., proud, strong) and 10 negative affect (e.g., afraid, irritable). The adjectives are rated on a five-point Likert scale (1 =  not at all , 5 =  very much ). The instrument is a reliable, valid, and effective self-report instrument for estimating these two important and independent aspects of mood [26] .

Questionnaires were distributed to the participants at several different locations within the university, including the library and lecture halls. Participants were asked to complete the questionnaire after being informed about the purpose and duration (10–15 minutes) of the study. Participants were also assured of complete anonymity and informed that they could end their participation whenever they liked.

Correlational analysis

Depression showed positive, significant relationships with anxiety, stress and negative affect. Table 1 presents the correlation coefficients, mean values and standard deviations ( SD ), as well as Cronbach's α for all the variables in the study.

https://doi.org/10.1371/journal.pone.0073265.t001

Mediation analysis

Regression analyses were performed in order to investigate if anxiety mediated the effect of stress, self-esteem, and affect on depression (aim 1). The first regression showed that stress ( B = .03, 95% CI [.02,.05], β = .36, t = 4.32, p <.001), self-esteem ( B = −.03, 95% CI [−.05, −.01], β = −.24, t = −3.20, p <.001), and positive affect ( B = −.02, 95% CI [−.05, −.01], β = −.19, t = −2.93, p = .004) each had a unique effect on depression. Surprisingly, negative affect did not predict depression ( p = 0.77) and was therefore removed from the mediation model and not included in further analysis.

The second regression tested whether stress, self-esteem and positive affect uniquely predicted the mediator (i.e., anxiety). Stress was found to be positively associated ( B = .21, 95% CI [.15,.27], β = .47, t = 7.35, p <.001), whereas self-esteem was negatively associated ( B = −.29, 95% CI [−.38, −.21], β = −.42, t = −6.48, p <.001), with anxiety. Positive affect, however, was not associated with anxiety ( p = .50) and was therefore removed from further analysis.

A hierarchical regression analysis using depression as the outcome variable was performed using stress and self-esteem as predictors in the first step, and anxiety as predictor in the second step. This analysis allows the examination of whether stress and self-esteem predict depression and whether this relation is weakened in the presence of anxiety as the mediator. The result indicated that, in the first step, both stress ( B = .04, 95% CI [.03,.05], β = .45, t = 6.43, p <.001) and self-esteem ( B = .04, 95% CI [.03,.05], β = .45, t = 6.43, p <.001) predicted depression. When anxiety (i.e., the mediator) was controlled for, predictability was reduced somewhat but was still significant for stress ( B = .03, 95% CI [.02,.04], β = .33, t = 4.29, p <.001) and for self-esteem ( B = −.03, 95% CI [−.05, −.01], β = −.20, t = −2.62, p = .009). Anxiety, as a mediator, predicted depression even when both stress and self-esteem were controlled for ( B = .05, 95% CI [.02,.08], β = .26, t = 3.17, p = .002). Anxiety improved the prediction of depression over and above the independent variables (i.e., stress and self-esteem) (Δ R 2 = .03, F (1, 198) = 10.06, p = .002). See Table 2 for the details.
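The F test for the change in R 2 between the two steps of a hierarchical regression is F = (Δ R 2 / q) / ((1 − R 2 full) / (n − k − 1)), where q is the number of predictors added in the second step, k is the number of predictors in the full model and n is the sample size. The sketch below shows the calculation. The degrees of freedom follow from the design described here (n = 202, three predictors in the second step, one added), but the two R 2 values are hypothetical inputs chosen for illustration, not values taken from Table 2, and the function name is an assumption.

```python
def f_change(r2_reduced, r2_full, n, k_full, q):
    """F statistic and degrees of freedom for the R^2 change when q predictors are added."""
    df1 = q
    df2 = n - k_full - 1
    f = ((r2_full - r2_reduced) / df1) / ((1 - r2_full) / df2)
    return f, df1, df2

# Step 1: stress and self-esteem; step 2 adds anxiety (k_full = 3, q = 1).
# The R^2 values below are hypothetical and only illustrate the arithmetic.
f, df1, df2 = f_change(r2_reduced=0.38, r2_full=0.41, n=202, k_full=3, q=1)
print(f"F({df1}, {df2}) = {f:.2f}")
```

With these inputs the degrees of freedom match the reported F (1, 198), which shows how a modest Δ R 2 of .03 can still be clearly significant at this sample size.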

https://doi.org/10.1371/journal.pone.0073265.t002

A Sobel test was conducted to test the mediating criteria and to assess whether indirect effects were significant or not. The result showed that the complete pathway from stress (independent variable) to anxiety (mediator) to depression (dependent variable) was significant ( z = 2.89, p = .003). The complete pathway from self-esteem (independent variable) to anxiety (mediator) to depression (dependent variable) was also significant ( z = 2.82, p = .004). This indicates that anxiety partially mediates the effects of both stress and self-esteem on depression. This result may also indicate that both stress and self-esteem contribute directly to explaining the variation in depression and indirectly via the experienced level of anxiety (see Figure 1 ).

Changes in Beta weights when the mediator is present are highlighted in red.

https://doi.org/10.1371/journal.pone.0073265.g001

For the second aim, regression analyses were performed in order to test if stress mediated the effect of anxiety, self-esteem, and affect on depression. The first regression showed that anxiety ( B  = .07, 95% CI [.04,.10], β = .37, t  = 4.57, p <.001), self-esteem ( B  = −.02, 95% CI [−.05, −.01], β = −.18, t  = −2.23, p  = .03), and positive affect ( B  = −.03, 95% CI [−.04, −.02], β = −.27, t  = −4.35, p <.001) predicted depression independently of each other. Negative affect did not predict depression ( p  = 0.74) and was therefore removed from further analysis.

The second regression investigated if anxiety, self-esteem and positive affect uniquely predicted the mediator (i.e., stress). Stress was positively associated with anxiety ( B = 1.01, 95% CI [.75, 1.30], β = .46, t = 7.35, p <.001), negatively associated with self-esteem ( B = −.30, 95% CI [−.50, −.01], β = −.19, t = −2.90, p = .004), and negatively associated with positive affect ( B = −.33, 95% CI [−.46, −.20], β = −.27, t = −5.02, p <.001).

A hierarchical regression analysis using depression as the outcome and anxiety, self-esteem, and positive affect as the predictors in the first step, and stress as the predictor in the second step, allowed the examination of whether anxiety, self-esteem and positive affect predicted depression and if this association would weaken when stress (i.e., the mediator) was present. In the first step of the regression anxiety ( B  = .07, 95% CI [.05,.10], β = .38, t  = 5.31, p  = .02), self-esteem ( B  = −.03, 95% CI [−.05, −.01], β = −.18, t  = −2.41, p  = .02), and positive affect ( B  = −.03, 95% CI [−.04, −.02], β = −.27, t  = −4.36, p <.001) significantly explained depression. When stress (i.e., the mediator) was controlled for, predictability was reduced somewhat but was still significant for anxiety ( B  = .05, 95% CI [.02,.08], β = .05, t  = 4.29, p <.001) and for positive affect ( B  = −.02, 95% CI [−.04, −.01], β = −.20, t  = −3.16, p  = .002), whereas self-esteem did not reach significance ( p < = .08). In the second step, the mediator (i.e., stress) predicted depression even when anxiety, self-esteem, and positive affect were controlled for ( B  = .02, 95% CI [.08,.04], β = .25, t  = 3.07, p  = .002). Stress improved the prediction of depression over-and-above the independent variables (i.e., anxiety, self-esteem and positive affect) (Δ R 2  = .02, F (1, 197)  = 9.40, p  = .002). See Table 3 for the details.

https://doi.org/10.1371/journal.pone.0073265.t003

Furthermore, the Sobel test indicated that the complete pathways from the independent variables (anxiety: z = 2.81, p = .004; self-esteem: z = 2.05, p = .04; positive affect: z = 2.58, p <.01) to the mediator (i.e., stress), to the outcome (i.e., depression) were significant. These results suggest that stress partially mediated the effects of both anxiety and positive affect on depression, while stress completely mediated the effects of self-esteem on depression. In other words, anxiety and positive affect contributed directly to explaining the variation in depression and indirectly via the experienced level of stress, whereas self-esteem contributed only indirectly via the experienced level of stress. That is, the effect of stress on depression originates from “its own power”, and stress explained more of the variation in depression than self-esteem did (see Figure 2 ).

https://doi.org/10.1371/journal.pone.0073265.g002

Moderation analysis

Multiple linear regression analyses were used in order to examine moderation effects between anxiety, stress, self-esteem and affect on depression. The analysis indicated that about 52% of the variation in the dependent variable (i.e., depression) could be explained by the main effects and the interaction effects ( R 2 = .55, adjusted R 2 = .51, F (55, 186) = 14.87, p <.001). When the variables (dependent and independent) were standardized, the standardized regression coefficients (β) and the unstandardized regression coefficients ( B ) took the same values for the main effects. Three of the main effects were significant and contributed uniquely to high levels of depression: anxiety ( B = .26, t = 3.12, p = .002), stress ( B = .25, t = 2.86, p = .005), and self-esteem ( B = −.17, t = −2.17, p = .03). The main effect of positive affect was also significant and contributed to low levels of depression ( B = −.16, t = −2.027, p = .02) (see Figure 3 ). Furthermore, the results indicated that two moderator effects were significant: the interaction between stress and negative affect ( B = −.28, β = −.39, t = −2.36, p = .02) (see Figure 4 ) and the interaction between positive affect and negative affect ( B = −.21, β = −.29, t = −2.30, p = .02) (see Figure 5 ).

https://doi.org/10.1371/journal.pone.0073265.g003

Low stress and low negative affect lead to lower levels of depression compared to high stress and high negative affect.

https://doi.org/10.1371/journal.pone.0073265.g004

High positive affect and low negative affect lead to lower levels of depression compared to low positive affect and high negative affect.

https://doi.org/10.1371/journal.pone.0073265.g005

The results in the present study show that (i) anxiety partially mediated the effects of both stress and self-esteem on depression, (ii) that stress partially mediated the effects of anxiety and positive affect on depression, (iii) that stress completely mediated the effects of self-esteem on depression, and (iv) that there was a significant interaction between stress and negative affect, and positive affect and negative affect on depression.

Mediating effects

The study suggests that anxiety contributes directly to explaining the variance in depression, while stress and self-esteem might contribute directly to explaining the variance in depression and indirectly by increasing feelings of anxiety. Indeed, individuals who experience stress over a long period of time are susceptible to increased anxiety and depression [30] , [31] and previous research shows that high self-esteem seems to buffer against anxiety and depression [32] , [33] . The study also showed that stress partially mediated the effects of both anxiety and positive affect on depression and that stress completely mediated the effects of self-esteem on depression. Anxiety and positive affect contributed directly to explaining the variation in depression and indirectly via the experienced level of stress. Self-esteem contributed only indirectly via the experienced level of stress to explaining the variation in depression, i.e., stress affects depression on the basis of ‘its own power’ and explains much more of the variation in depressive experiences than self-esteem. In general, individuals who experience low anxiety and frequently experience positive affect seem to experience low stress, which might reduce their levels of depression. Academic stress, for instance, may increase the risk for experiencing depression among students [34] . Although self-esteem did not emerge as an important variable here, under circumstances in which difficulties in life become chronic, some researchers suggest that low self-esteem facilitates the experience of stress [35] .

Moderator effects/interaction effects

The present study showed that the interactions between stress and negative affect and between positive and negative affect influenced self-reported depression symptoms. The moderation effect between stress and negative affect implies that students experiencing low levels of stress and low negative affect reported lower levels of depression than those experiencing high levels of stress and high negative affect. This result confirms earlier findings that underline the strong positive association between negative affect and both stress and depression [36], [37]. Nevertheless, negative affect by itself did not predict depression. In this regard, it is important to point out that the absence of positive emotions is a better predictor of morbidity than the presence of negative emotions [38], [39]. A modification to this statement, as illustrated by the results discussed next, could be that the presence of negative emotions in conjunction with the absence of positive emotions increases morbidity.

The moderating effect between positive and negative affect on the experience of depression implies that students experiencing high levels of positive affect and low levels of negative affect reported lower levels of depression than those experiencing low levels of positive affect and high levels of negative affect. This result fits previous observations indicating that different combinations of these affect dimensions are related to different measures of physical and mental health and well-being, such as blood pressure, depression, quality of sleep, anxiety, life satisfaction, psychological well-being, and self-regulation [40]–[51].

Limitations

The results indicated a relatively low mean value for depression ( M  = 3.69), perhaps because the study population consisted of university students. This might limit the generalizability of the results and might also explain why negative affect, commonly associated with depression, was not related to depression in the present study. Moreover, there is a potential influence of single source/single method variance on the findings, especially given the high correlations among all the variables under examination.

Conclusions

The present study highlights the different results that can be arrived at depending on whether researchers treat variables as mediators or as moderators. For example, in the mediational analyses, anxiety and stress seem to be important factors that explain how the different variables used here influence depression: increases in anxiety and stress brought about by any other factor seem to lead to increases in depression. In contrast, in the moderation analyses, the interaction of stress and affect predicted depression, and the interaction of both affectivity dimensions (i.e., positive and negative affect) also predicted depression: stress might increase depression under the condition that the individual is high in negative affectivity, and negative affectivity, in turn, might increase depression under the condition that the individual experiences low positive affectivity.
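
The contrast between the two approaches can be written in generic regression notation. The equations below are a textbook-style sketch, not the models fitted in the study: $X$ (a predictor such as self-esteem), $M$ (a mediator such as stress), $Z$ (a moderator such as negative affect), $Y$ (depression), and all coefficients are illustrative symbols assumed here.

```latex
% Mediation: X influences Y partly or wholly through M
\begin{aligned}
Y &= i_1 + c\,X + \varepsilon_1            && \text{(total effect of } X\text{)}\\
M &= i_2 + a\,X + \varepsilon_2            && \text{(effect of } X \text{ on the mediator)}\\
Y &= i_3 + c'\,X + b\,M + \varepsilon_3    && \text{(direct effect } c' \text{, indirect effect } ab\text{)}
\end{aligned}

% Moderation: the effect of X on Y depends on the level of Z
Y = b_0 + b_1 X + b_2 Z + b_3 (X \times Z) + \varepsilon
```

In this notation, partial mediation corresponds to a direct effect $c'$ that shrinks but remains non-zero once $M$ is included, complete mediation to a $c'$ that is no longer distinguishable from zero, and moderation to a non-zero interaction coefficient $b_3$.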

Acknowledgments

The authors would like to thank the reviewers for their openness and suggestions, which significantly improved the article.

Author Contributions

Conceived and designed the experiments: AAN TA. Performed the experiments: AAN. Analyzed the data: AAN DG. Contributed reagents/materials/analysis tools: AAN TA DG. Wrote the paper: AAN PR TA DG.

  • 3. MacKinnon DP, Luecken LJ (2008) How and for Whom? Mediation and Moderation in Health Psychology. Health Psychol 27 (2 Suppl.): s99–s102.
  • 4. Aaroe R (2006) Vinn över din depression [Defeat depression]. Stockholm: Liber.
  • 5. Agerberg M (1998) Ut ur mörkret [Out from the Darkness]. Stockholm: Nordstedt.
  • 6. Gilbert P (2005) Hantera din depression [Cope with your Depression]. Stockholm: Bokförlaget Prisma.
  • 8. Tabachnick BG, Fidell LS (2007) Using Multivariate Statistics, Fifth Edition. Boston: Pearson Education, Inc.
  • 10. Beck AT (1967) Depression: Causes and treatment. Philadelphia: University of Pennsylvania Press.
  • 21. Eskin M, Parr D (1996) Introducing a Swedish version of an instrument measuring mental stress. Stockholm: Psykologiska institutionen Stockholms Universitet.
  • 22. Rosenberg M (1965) Society and the Adolescent Self-Image. Princeton, NJ: Princeton University Press.
  • 23. Lindwall M (2011) Självkänsla – Bortom populärpsykologi & enkla sanningar [Self-Esteem – Beyond Popular Psychology and Simple Truths]. Lund:Studentlitteratur.
  • 25. Blascovich J, Tomaka J (1991) Measures of self-esteem. In: Robinson JP, Shaver PR, Wrightsman LS (Red.) Measures of personality and social psychological attitudes San Diego: Academic Press. 161–194.
  • 30. Eysenck M (Ed.) (2000) Psychology: an integrated approach. New York: Oxford University Press.
  • 31. Lazarus RS, Folkman S (1984) Stress, Appraisal, and Coping. New York: Springer.
  • 32. Johnson M (2003) Självkänsla och anpassning [Self-esteem and Adaptation]. Lund: Studentlitteratur.
  • 33. Cullberg Weston M (2005) Ditt inre centrum – Om självkänsla, självbild och konturen av ditt själv [Your Inner Centre – About Self-esteem, Self-image and the Contours of Yourself]. Stockholm: Natur och Kultur.
  • 34. Lindén M (1997) Studentens livssituation. Frihet, sårbarhet, kris och utveckling [Students' Life Situation. Freedom, Vulnerability, Crisis and Development]. Uppsala: Studenthälsan.
  • 35. Williams S (1995) Press utan stress ger maximal prestation [Pressure without Stress gives Maximal Performance]. Malmö: Richters förlag.
  • 37. Garcia D, Kerekes N, Andersson-Arntén A–C, Archer T (2012) Temperament, Character, and Adolescents' Depressive Symptoms: Focusing on Affect. Depress Res Treat. DOI:10.1155/2012/925372.
  • 40. Garcia D, Ghiabi B, Moradi S, Siddiqui A, Archer T (2013) The Happy Personality: A Tale of Two Philosophies. In Morris EF, Jackson M-A editors. Psychology of Personality. New York: Nova Science Publishers. 41–59.
  • 41. Schütz E, Nima AA, Sailer U, Andersson-Arntén A–C, Archer T, Garcia D (2013) The affective profiles in the USA: Happiness, depression, life satisfaction, and happiness-increasing strategies. In press.
  • 43. Garcia D, Nima AA, Archer T (2013) Temperament and Character's Relationship to Subjective Well- Being in Salvadorian Adolescents and Young Adults. In press.
  • 44. Garcia D (2013) La vie en Rose: High Levels of Well-Being and Events Inside and Outside Autobiographical Memory. J Happiness Stud. DOI: 10.1007/s10902-013-9443-x.
  • 48. Adrianson L, Djumaludin A, Neila R, Archer T (2013) Cultural influences upon health, affect, self-esteem and impulsiveness: An Indonesian-Swedish comparison. Int J Res Stud Psychol. DOI: 10.5861/ijrsp.2013.228.

ORIGINAL RESEARCH article

This article is part of the research topic.

Orthorexia Nervosa: New Insights into Clinical Management and Social Environment Aspects

Orthorexic Tendency and Its Association with Weight Control Methods and Dietary Variety in Polish Adults: A Cross-Sectional Study Provisionally Accepted

  • 1 Department of Food Market and Consumer Research, Institute of Human Nutrition Sciences, Warsaw University of Life Sciences, Poland
  • 2 Department of Human Nutrition, Faculty of Food Science, University of Warmia and Mazury in Olsztyn, Poland

The final, formatted version of the article will be published soon.

The methods for controlling weight play a central role in formally diagnosed eating disorders (EDs) and appear to be important in the context of other nonformally recognized disorders, such as orthorexia nervosa (ON). These methods also have an impact on eating behaviors, including dietary variety. Our study aimed to: (i) assess the intensity of ON tendency by sex and BMI groups, (ii) evaluate the associations between ON tendency, weight control methods, and dietary variety, and (iii) determine the extent to which weight control methods and dietary variety contribute to the ON tendency among both females and males. Data were gathered from a sample of 936 Polish adults (463 females and 473 males) through a cross-sectional quantitative study conducted in 2019. Participants were requested to complete the ORTO-6, the Weight Control Methods Scale, and the Food Intake Variety Questionnaire (FIVeQ). Multiple linear regression analysis was employed to evaluate associations between ON tendency, weight control methods, and dietary variety. Females exhibited a higher ON tendency than males (14.4 ± 3.4 vs. 13.5 ± 3.7, p < 0.001, d = 0.25). In the regression model, the higher ON tendency was predicted by more frequent use of weight control methods, such as restricting the amount of food consumed, using laxatives, and physical exercise among both females and males as well as following a starvation diet in females, and drinking teas to aid bowel movements among males. Moreover, the higher ON tendency was predicted by higher dietary variety, lower age in both sexes, and higher level of education among males. However, there were no differences in ON tendency across BMI groups. In conclusion, the findings showed that ON tendency was predicted by a higher frequency of weight control methods commonly used by individuals with anorexia nervosa (AN) and bulimia nervosa (BN). 
The resemblance to these two EDs is also suggested by the higher intensity of ON tendency among females and younger people. However, the prediction of ON tendency by dietary variety indicates that the obsessive preoccupation with healthy eating may not be advanced enough to observe a decrease in the dietary variety among these individuals.

Keywords: orthorexic tendency, weight control methods, dietary variety, adults, Poland

Received: 14 Dec 2023; Accepted: 02 Apr 2024.

Copyright: © 2024 Plichta and Kowalkowska. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) . The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

* Correspondence: Dr. Marta Plichta, Department of Food Market and Consumer Research, Institute of Human Nutrition Sciences, Warsaw University of Life Sciences, Warsaw, Poland

Decoded

A behind-the-scenes blog about research methods at Pew Research Center.

For our latest research findings, visit  pewresearch.org .

A short intro to linear regression analysis using survey data

Many of Pew Research Center’s survey analyses show relationships between two variables. For example, our reports may explore how attitudes about one thing — such as views of the economy — are associated with attitudes about another thing — such as views of the president’s job performance. Or they might look at how different demographic groups respond to the same survey question.

But analysts are sometimes interested in understanding how  multiple  factors might contribute simultaneously to the same outcome. One useful tool to help us make sense of these kinds of problems is regression. Regression is a statistical method that allows us to look at the relationship between two variables, while holding other factors equal.

This post will show how to estimate and interpret linear regression models with survey data using R. We’ll use data taken from a Pew Research Center 2016 post-election survey, and you can  download the dataset for your own use here . We’ll discuss both bivariate regression, which has one outcome variable and one explanatory variable, and multiple regression, which has one outcome variable and multiple explanatory variables.

This post is meant as a brief introduction to how to estimate a regression model in R. It also offers a brief explanation of some of the aspects that need to be accounted for in the process.

Bivariate regression models with survey data

In the Center’s 2016 post-election survey, respondents were asked to rate then President-elect Donald Trump on a 0–100 “feeling thermometer.” Respondents were told, “a rating of zero degrees means you feel as cold and negative as possible. A rating of 100 degrees means you feel as warm and positive as possible. You would rate the person at 50 degrees if you don’t feel particularly positive or negative toward the person.”

We can use R’s plot function to take a look at the answers people gave. The plot below shows the distribution of the ratings of Trump. Round numbers and increments of 5 typically received more responses than other numbers. For example, 50 had a larger number of responses than 49.

In most survey research we also want to represent a population (in this case, the adult population in the U.S.), which requires weighting the data to known national statistics. Weights are used to correct for under- and overrepresentation among different demographic groups in our sample (like age, gender, region, education, race). When working with weighted survey data, we need to account for these weights correctly. Otherwise, population estimates, standard errors and significance tests will be incorrect.
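
To make the effect of weighting concrete, here is a minimal Python sketch. It is an analogue only: the post itself works in R with the `survey` package, and the ratings and weights below are invented for illustration rather than taken from the Pew dataset. It compares an unweighted mean with a weighted mean when respondents who give low ratings are underrepresented in the sample.

```python
# Hypothetical thermometer ratings (0-100) and survey weights.
# Weights above 1 stand in for respondents from underrepresented groups.
ratings = [90, 80, 10, 20, 30]
weights = [0.5, 0.5, 2.0, 1.5, 1.5]


def weighted_mean(values, weights):
    """Return the weighted mean: sum(w * x) / sum(w)."""
    total_weight = sum(weights)
    return sum(w * x for w, x in zip(weights, values)) / total_weight


unweighted = sum(ratings) / len(ratings)
weighted = weighted_mean(ratings, weights)

print(f"unweighted mean: {unweighted:.1f}")
print(f"weighted mean:   {weighted:.1f}")
```

In this toy sample the weighted mean is lower than the unweighted one, because the respondents with low ratings carry more weight. The sketch covers point estimates only; it does not produce the design-based standard errors that a dedicated survey package provides.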

One option for working with survey data in R is to use the “survey” package. For an introduction on working with survey data in R, see  our earlier blog post .

The first step involves creating a survey design object with our weights variable. Below, we define the “d_design” object with the corresponding weight from the WEIGHT_W23 variable. We can use this survey object to perform a wide variety of analyses included in the `survey` package. In this case, we’ll use it to calculate averages and run a regression.

The `svymean()` function lets us calculate Trump’s average thermometer rating and its standard error. Overall, the average rating of Trump among those who gave him a rating in this data is 43, but we know from existing research that  public views of Trump differ substantially by race , among other things. We can see this by tabulating the average Trump thermometer score by the race/ethnicity variable in the dataset (“F_RACETHN_RECRUITMENT”). The `svyby()` function lets us do that separately for each race category:

We can see that there is a large difference between whites, blacks and Hispanics, with whites rating Trump at least 23 points higher than the other racial/ethnic groups do. (The “other” and “don’t know/refused” categories account for about 7% of the public.) However, since we know that  there are large racial and ethnic differences in party identification , it may be that the racial divide in Trump ratings is a function of partisanship. This is where regression comes in.

By using the regression function `svyglm()` in R, we can conduct a regression analysis that includes party differences in the  same  model as race. Using `svyglm()` from the survey package (rather than `lm()` or `glm()`) is important because it accounts for the survey weights while estimating the model. The output from our `svyglm()` function will allow us to see whether a racial gap persists even after accounting for differences in partisanship between racial groups.

First, we can look at the results when we only include race in the regression:

When interpreting regression output, we want to examine the coefficients of the independent variables. These are given by the values in the “Estimate” column.

Notice that the estimate and standard error for the “(Intercept)” are identical to the values we calculated earlier for white non-Hispanics. By default, R treats the first category in an independent variable as the reference category. The coefficients for the other racial groups show how each group differs from whites in terms of the Trump thermometer score. Notice that the coefficients for blacks, Hispanics and those who identify with other racial groups are all negative. This means that, on average, the ratings of Trump are lower across each of these groups compared to whites. For example, the coefficient for blacks is -23.7. This can be interpreted as meaning that, on average, Trump’s thermometer rating is 23.7 points lower for blacks than for whites. If we think back to the overall averages, this makes sense because all the nonwhite racial/ethnic groups rated Trump lower than whites did. And, in fact, if you combine the intercept estimate with the estimate for non-Hispanic blacks, you get 49.3–23.7 = 25.6, exactly what we saw in the simple tabulation above.
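
The link between the coefficients and the group averages can be checked with a small Python sketch. This is not the `svyglm()` call from the post: it is a plain weighted least squares fit written with the standard library, run on made-up respondents, and the helper names `solve`, `wls` and `group_mean` are hypothetical. It shows that, with one dummy-coded categorical predictor, the intercept equals the weighted mean of the reference group and each coefficient equals the difference from that mean.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def wls(X, y, w):
    """Weighted least squares: solve (X'WX) beta = X'Wy."""
    k = len(X[0])
    xtwx = [[sum(wi * row[a] * row[b] for row, wi in zip(X, w))
             for b in range(k)] for a in range(k)]
    xtwy = [sum(wi * row[a] * yi for row, yi, wi in zip(X, y, w))
            for a in range(k)]
    return solve(xtwx, xtwy)


# Hypothetical respondents: (group, rating, weight).
data = [
    ("white", 60, 1.0), ("white", 40, 2.0), ("white", 50, 1.0),
    ("black", 20, 1.5), ("black", 30, 0.5),
    ("hispanic", 35, 1.0), ("hispanic", 25, 1.0),
]
levels = ["white", "black", "hispanic"]  # the first level is the reference

# Design matrix: intercept plus one 0/1 dummy per non-reference level.
X = [[1.0] + [1.0 if g == lvl else 0.0 for lvl in levels[1:]]
     for g, _, _ in data]
y = [rating for _, rating, _ in data]
w = [weight for _, _, weight in data]

intercept, coef_black, coef_hispanic = wls(X, y, w)


def group_mean(group):
    """Weighted mean rating within one group."""
    rows = [(r, wt) for g, r, wt in data if g == group]
    return sum(r * wt for r, wt in rows) / sum(wt for _, wt in rows)


print("intercept:", intercept, "white mean:", group_mean("white"))
print("intercept + black coef:", intercept + coef_black,
      "black mean:", group_mean("black"))
```

The same identity is what makes 49.3 - 23.7 = 25.6 match the tabulated average in the text above.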

Multiple regression models with survey data

Regression becomes a more useful tool when researchers want to look at multiple factors simultaneously. If we want to know whether the racial divide persists even after accounting for differences in party identification, we can enter partisanship into the regression equation. Note that the only difference here is one added explanatory variable (F_PARTYSUM_FINAL) which contains responses to questions about which political party the respondents identify with or lean toward. Since we have two independent variables now, the reference categories are now the group of people who are in the first level for the F_RACETHN_RECRUITMENT and F_PARTYSUM_FINAL variables. In this case, that means that the intercept is the expected average thermometer score among non-Hispanic whites who also identify as or lean Republican.

After including a new variable for partisanship, the racial and ethnic differences almost entirely disappear. The coefficients are quite small (none exceed 5) and are not statistically significant at p < 0.05. For blacks, we can interpret the coefficient of -2.1 as meaning that if we hold party constant, race does not explain differences in Trump’s rating. We would expect both black and white Republicans to give similar ratings of Trump. Likewise, we would expect only small differences between white and black Democrats. In contrast, party matters a lot: Democrats rate Trump about 51 points lower than Republicans on average. Those who don’t lean toward either party rate Trump about 39 points lower than Republicans.
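
The way a group difference can shrink once a second variable enters the model can be reproduced with synthetic data. The Python sketch below is illustrative only: the four cells, their ratings and their weighted counts are invented so that ratings depend on party alone while race is correlated with party, and `solve` and `wls` are hypothetical helpers, not the `survey` package.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def wls(X, y, w):
    """Weighted least squares: solve (X'WX) beta = X'Wy."""
    k = len(X[0])
    xtwx = [[sum(wi * row[a] * row[b] for row, wi in zip(X, w))
             for b in range(k)] for a in range(k)]
    xtwy = [sum(wi * row[a] * yi for row, yi, wi in zip(X, y, w))
            for a in range(k)]
    return solve(xtwx, xtwy)


# Four hypothetical cells: (black, democrat, mean rating, weighted count).
cells = [
    (0, 0, 80.0, 6.0),  # white Republicans
    (0, 1, 30.0, 4.0),  # white Democrats
    (1, 0, 80.0, 1.0),  # black Republicans
    (1, 1, 30.0, 9.0),  # black Democrats
]
y = [c[2] for c in cells]
w = [c[3] for c in cells]

# Model 1: race only. Model 2: race plus party.
race_only = wls([[1.0, c[0]] for c in cells], y, w)
race_and_party = wls([[1.0, c[0], c[1]] for c in cells], y, w)

print("race-only coefficient for black:", race_only[1])
print("adjusted coefficient for black: ", race_and_party[1])
print("coefficient for Democrat:       ", race_and_party[2])
```

In this constructed example the race-only model shows a 25-point gap, and the gap disappears once party is in the model, because the gap arose entirely from the different party mix of the two groups. Real survey data will rarely be this clean.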

Further analysis could be conducted to explore how other factors might account for variance in Trump thermometer ratings. Perhaps there are significant interactions that we haven’t accounted for (e.g., it might be the case that there is some kind of interaction between race and partisanship that isn’t accounted for in the simple additive model that we looked at above), and it is always important to remember that standard regression analysis of the kind presented in this post is not sufficient to show causal relationships. Regression allows us to sort out the relationships between many variables simultaneously, but we can’t say that just because a significant relationship was found between two variables, one  caused the other. Regression is a useful tool for summarizing descriptive relationships, but it is not a silver bullet (see  this post  for more on where regression can go wrong).

Copyright 2022 Pew Research Center

  • Open access
  • Published: 03 April 2024

Trends in antidiabetic drug use and expenditure in public hospitals in Northwest China, 2012-21: a case study of Gansu Province

  • Wenxuan Cao 1 ,
  • Hu Feng 1 ,
  • Yaya Yang 1 ,
  • Lei Wang 1 ,
  • Xuemei Wang 1 ,
  • Yongheng Ma 2 ,
  • Defang Zhao 2 &
  • Xiaobin Hu 1  

BMC Health Services Research volume 24, Article number: 415 (2024)

Since the start of the twenty-first century, the prevalence of diabetes has risen globally year by year. In Gansu Province, an economically underdeveloped province in northwest China, the cost of drugs for diabetes patients accounted for one-third of their total treatment costs. To fundamentally reduce national drug expenditure and the burden of medication costs on the population, relevant government departments have continued to reform and improve drug policies. This study aimed to analyse long-term trends in antidiabetic drug use and expenditure in Gansu Province from 2012 to 2021 and to explore the role of pharmaceutical policy.

Data were obtained from the provincial centralised bidding and purchasing (CBP) platform. Drug use was quantified using the anatomical therapeutic chemistry/defined daily dose (ATC/DDD) method and standardised by DDD per 1000 inhabitants per day (DID), and drug expenditure was expressed in terms of the total amount and defined daily cost (DDC). Linear regression was used to analyse the trends and magnitude of drug use and expenditure.

Overall, the use of and expenditure on antidiabetic drugs rose: use increased from 1.04 DID in 2012 to 16.02 DID in 2021, and expenditure increased from 48.36 million yuan in 2012 to 496.42 million yuan in 2021 (from 7.66 to 76.95 million USD). Some new and expensive drugs changed the use pattern, with their shares of use and expenditure (as percentages of all antidiabetic drugs) increasing from 0 to 11.17% and 11.37%, respectively, but insulins and analogues and biguanides remained the most used drug classes. The DDC of all oral drugs showed a decreasing trend, and the DDC of essential medicines (EMs) and medical insurance drugs gradually decreased as their use increased. The price reduction of the bid-winning drugs was over 40%, and the top three drugs were glimepiride 2mg/30, acarbose 50mg/30 and acarbose 100mg/30.

Conclusions

The implementation of pharmaceutical policies has significantly increased drug use and expenditure while reducing drug prices, and the introduction of novel drugs and updated treatment guidelines has led to changes in use patterns.

About one in ten people worldwide has diabetes, and this proportion is increasing every year. According to the International Diabetes Federation, 537 million adults (aged 20–79) worldwide had diabetes in 2021, resulting in an estimated total global healthcare expenditure of 966 billion USD [ 1 ]. China has the largest number of diabetes patients in the world. Approximately 11.2% of adults aged 18 years and older had diabetes in 2021, with the total number of patients reaching 141 million, a figure expected to reach 174 million by 2045 [ 2 ].

Gansu Province is an economically underdeveloped province in northwestern China, and a study showed that the local diabetes prevalence rate was 7.33% in 2018 [ 3 ]. In the same year, the total healthcare cost of diabetes patients was 1.348 billion yuan (20.37 million USD) [ 4 ], accounting for 2.56% of the total treatment cost in the province and 0.16% of the regional gross domestic product (GDP). Although this percentage was not high, the cost of drugs accounted for about one-third, which was much higher than the average level of 19.7% among the member countries of the Organisation for Economic Co-operation and Development in 2019 [ 5 ].

Appropriate use of antidiabetic drugs is key in the management of diabetes (especially type 2 diabetes) and is important for delaying disease progression, reducing the risk of complications and reducing the disease burden. To fundamentally reduce national drug expenditure and the burden of drug costs on the population, relevant Chinese government departments have been reforming and improving their drug policies. After the National Essential Medicine System (NEMS) was first established in 2009, provinces and cities across the country started to implement the "three unified" policies (unified bidding, distribution, and pricing) and the zero-price markup policy to expand the coverage of essential medicines (EMs) [ 6 ]. All EMs were included in the National Basic Medical Insurance Drug Catalogue (NBMIDC), so their reimbursement rate was higher than that of non-EMs. As basic medical insurance coverage among Chinese residents was upwards of 95% [ 7 ], this policy significantly reduced out-of-pocket payments for residents.

Since the beginning of the three-year new medical reform (2009–11), Gansu has brought government-run medical and health institutions within the scope of national EMs management and has implemented zero mark-up on drugs: no mark-ups or other surcharges are allowed in the price of medicines, so the actual selling price is close to the cost price. By the end of 2012, the coverage rate of the EMs system reached 100%. Once the policy matured, it reduced the drug cost share from 34.98% in 2014 to 26.91% in 2018, a span of four years [ 8 ]. Gansu Province has also actively implemented the national centralised drug procurement policy: by 2021, it had carried out five nationally organised batches of centralised procurement covering 218 drugs, which included 14 varieties (based on Anatomical Therapeutic Chemical (ATC)-5) and 23 strains (based on formulations, dosage specifications and manufacturers) of antidiabetic drugs [ 9 , 10 , 11 , 12 , 13 ]. According to incomplete statistics from relevant government departments, the policy has led to an average price reduction of 54.6% for the drugs included and relative cost savings of more than 1.3 billion yuan (20.15 million USD) [ 14 ]. To date, several studies have reported on temporal trends in the use of antidiabetic drugs in other countries, but few have focused on changes in the use and expenditure of this class of drugs in less economically developed areas of China or on the role of pharmaceutical policies behind those changes [ 15 , 16 ].

This study collected data on antidiabetic drugs in Gansu Province from 2012 to 2021, aiming (1) to explore the long-term trends and patterns of antidiabetic drug use and expenditure in Northwest China and (2) to reveal the impact of the implementation of pharmaceutical policy on drug use and DDC.

Study setting

Gansu Province is in northwest China, has a narrow and curving topography, and is economically underdeveloped. The province's GDP was 102.43 billion yuan (15.88 billion USD) in 2021, the fifth lowest in the country [ 17 ]. With over 70% of the population living in rural areas, the dispersal of the population due to geography and economic level, limited medical resources and inadequate public health education were the main reasons that prevented diabetes patients in the region from effectively managing their disease [ 18 ]. Health insurance coverage among residents was over 97%, and patients' medical expenses were shared between the national health insurance (coordinated payment) and the patient (individual out-of-pocket payment) within reasonable reimbursement limits [ 19 ].

Data source

The study used data from the centralised bidding and purchasing (CBP) platform managed by the Drug Procurement Division of the Gansu Provincial Public Resources Transaction Center. The main information recorded included the drug’s generic name, dosage form, specification, conversion factor, approval number, manufacturer, purchasing unit, purchasing time, purchasing quantity and amount. By 2020, the hospitals covered by this database accounted for 93.39% of all public hospitals. This retrospective study used the daily drug procurement data of public hospitals in the province from 2012 to 2021 and identified a total of 2 major categories (based on ATC-3), 10 subcategories (based on ATC-4), 40 varieties (based on ATC-5) and 338 strains (based on formulations, dosage specifications and manufacturers) of antidiabetic drugs, covering essentially all the main types of diabetes therapeutic drugs. In China, all antidiabetic drugs approved by the healthcare authorities are prescription drugs, so diabetes patients must have them prescribed by clinicians and obtain them from hospital pharmacies.

Data management

The ATC system classifies drugs into different groups according to the organ or system on which they act and their chemical, pharmacological and therapeutic properties. Drugs were classified into ATC groups by their international non-proprietary names. In this paper, all "A10" (drugs used in diabetes) codes in group "A" (alimentary tract and metabolism) were included. Among them, we emphasized the use of a group of novel antidiabetic drugs, with the following ATC codes: A10BH and A10BD07-13 for dipeptidyl peptidase 4 inhibitors (DPP-4i); A10BJ, A10AE54 and A10AE56 for glucagon-like peptide-1 receptor analogues (GLP-1RAs); A10BK, A10BD15 and A10BD20 for sodium-glucose co-transporter 2 inhibitors (SGLT2i) [ 20 ]. The data were analysed using the ATC/defined daily dose (DDD) system developed by the WHO [ 21 ], which was used to calculate the frequency of use in defined daily doses (DDDs) and the defined daily cost (DDC) of each drug based on the DDD, and to express the standardised intensity of use in DDDs per 1000 inhabitants per day (DID). The DDD is the average daily dose used in adults for a drug's primary therapeutic purpose; DDD values were taken mainly from those published on the official website. DDDs and DDC were calculated using the following formulas. The higher the DDDs, the more frequently the drug was used and the greater the clinical tendency to choose it; the higher the DDC, the greater the financial burden on the patient.

where N i represents the number of packages of drug ( i ).
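
In the standard ATC/DDD methodology, these quantities are usually defined along the following lines. This is a sketch of the conventional form rather than the authors' exact notation: apart from $N_i$, every symbol (units per package $U_i$, strength per unit $S_i$, expenditure $C_i$, population $P$, and number of days in the period $T$) is an assumption introduced here.

```latex
\mathrm{DDDs}_i = \frac{N_i \times U_i \times S_i}{\mathrm{DDD}_i},
\qquad
\mathrm{DID} = \frac{\sum_i \mathrm{DDDs}_i \times 1000}{P \times T},
\qquad
\mathrm{DDC}_i = \frac{C_i}{\mathrm{DDDs}_i}
```

Under these definitions, DDDs counts how many standard daily doses were purchased, DID scales that count to a rate per 1000 inhabitants per day, and DDC is the average cost of one daily dose.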

Data analysis

The number of residents in each year was taken as the mid-year population, i.e. the average of the populations at the end of that year and of the previous year as reported in the Gansu Statistical Yearbook [ 22 ]. The number of permanent residents was then adjusted using data on the mobile population in Gansu Province [ 23 ] to reduce the error that population movement might introduce. Expenditure was adjusted for price inflation at an annual rate of 1.42% (the average annual inflation rate of the Gansu provincial consumer price index from 2012 to 2021) [ 24 ]. Descriptive statistics were used to describe the annual use and expenditure of antidiabetic drugs in Gansu Province, and to calculate the use shares (as the proportion of the total use) and DDC values of each drug type. Linear regression was used to analyse the changes in the use and expenditure of each type of drug with at least five consecutive years of purchase records during the study period, and the regression coefficient ( B value) and significance ( P value) were used to describe the trend. The sign of the B value indicates the direction of change (> 0, upward; < 0, downward), and P values < 0.05 were taken to indicate statistical significance. Microsoft Office Excel 2016 and Stata 16.0 were used for data management and analysis, and GraphPad Prism 9 was used for graphing.
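
The steps in this paragraph can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' Stata workflow: the function names are hypothetical, the choice of 2012 as the base year for the inflation adjustment is assumed, and the DID series is invented except for its 2012 and 2021 endpoints, which echo the values reported in the text.

```python
def mid_year_population(previous_year_end, current_year_end):
    """Average of two consecutive year-end populations."""
    return (previous_year_end + current_year_end) / 2


def to_base_year_prices(nominal, year, base_year=2012, annual_rate=0.0142):
    """Deflate a nominal amount to base-year prices at a constant annual rate.

    Using 2012 as the base year is an assumption made for this sketch.
    """
    return nominal / (1 + annual_rate) ** (year - base_year)


def ols_trend(xs, ys):
    """Simple linear regression of ys on xs; returns (slope B, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


years = list(range(2012, 2022))
# Hypothetical DID series; only the first and last values come from the text.
did = [1.04, 4.1, 5.2, 6.8, 8.3, 9.9, 11.5, 10.9, 13.8, 16.02]

slope, intercept = ols_trend(years, did)
direction = "upward" if slope > 0 else "downward"
print(f"B = {slope:.3f} ({direction})")
```

The sketch stops at the slope. The P value reported alongside each B value additionally requires the standard error of the slope and a t distribution, which statistical software such as Stata supplies.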

Overview of the use of antidiabetic drugs

During the ten-year study period, both the use of and expenditure on antidiabetic drugs in Gansu Province showed an overall increasing trend ( B = 1.373 and 44.229, respectively; P < 0.001). Use increased from 1.04 DID in 2012 to 16.02 DID in 2021, and expenditure increased from 48.36 million yuan in 2012 to 496.42 million yuan in 2021 (from 7.66 to 76.95 million USD), with the largest increase in 2013 and the largest decrease in 2018.

There were large differences in the composition of use by drug class. Insulins and analogues were the most used, but their proportion of use decreased by almost 20% over the decade ( B = -0.048, P < 0.001). In contrast, the proportion of use of oral antidiabetic drugs remained mostly stable or increased. For example, the proportion of use of biguanides increased by about 10% ( B = 0.017, P = 0.002) and that of sulfonylureas remained stable at about 20%, followed by α-glucosidase inhibitors, whose proportion of use increased rapidly to one-fifth of the total ( B = 0.272, P = 0.001) (Fig. 1 ).

Fig. 1. Trends in the use and proportion of antidiabetic drugs. A, trends in the use of antidiabetic drugs; B, trends in the proportion of antidiabetic drugs. GLP-1RAs, glucagon-like peptide-1 receptor analogues; SGLT2, sodium-glucose co-transporter 2; DDD, defined daily dose

Among the different hospital levels, use was comparable between tertiary and secondary hospitals (44.19% vs 42.18%) and lowest in primary hospitals (13.63%). Drug use increased at all hospital levels during the study period, but the proportion of drugs used in tertiary hospitals gradually decreased ( B = -0.010, P = 0.029) while the proportion used in secondary hospitals increased ( B = 0.009, P = 0.042) (Fig. 2 ).

Fig. 2. Trends in the use and proportion of antidiabetic drugs in different hospital levels. A, trends in the use of antidiabetic drugs in different hospital levels; B, trends in the proportion of antidiabetic drugs in different hospital levels. PHCs, primary health care centres; DDD, defined daily dose

Trends in the use of novel antidiabetic drugs

The total use ( B  = 0.420, P  = 0.034) and expenditure ( B = 13.529,  P = 0.008) of novel antidiabetic drugs have continued to increase since 2017, and their shares reached 11.17% and 11.37% of the total by 2021 respectively. The most widely used class of novel antidiabetic drugs was SGLT2i, accounting for 56.67% of the total use in this group, followed by DPP-4i and GLP-1RAs with 30.83% and 10.68% respectively.

Trends in DDC for different classes of antidiabetic drugs

For the different classes of antidiabetic drugs, the top three drug classes in terms of DDC were GLP-1RAs (¥12.81), combinations of oral blood glucose lowering drugs (¥12.66), insulins and analogues (¥11.79), and the bottom three were sulfonylureas (¥1.17), biguanides (¥1.80) and SGLT2i (¥2.58), with the highest class having an average annual DDC of about 11 times that of the lowest class.

The mean DDC of all antidiabetic drugs showed a decreasing trend over the study period ( B  = -0.0250, P  = 0.044), with an average decrease of 4 percentage points per year, and this trend was evident after 2017. The DDC of insulins and analogues increased from year to year ( B  = 0.458, P  < 0.001), whereas the DDC of the remaining oral antidiabetic drugs showed a decreasing trend, with significant decreases for SGLT2i (-26.64%), α-glucosidase inhibitors (-13.74%) and other blood glucose lowering drugs (-7.23%) (Table  1 ).

The impact of pharmaceutical policy on drug use and expenditure

Based on the EMs perspective

The overall trend in the share of use and expenditure of EMs was slowly increasing during the study period, and the relative ratio with fixed base for both was almost the same (18.88% vs. 18.24%). In terms of DDC for both types of drugs, both EMs and non-EMs showed a decreasing trend, but the decrease for EMs was significantly higher than non-EMs (-35.93% vs. -8.84%) (Fig.  3 A).

Fig. 3. Trends in the use and DDC of EMs and medical insurance drugs. A, trends in the use and DDC of EMs and non-EMs; B, trends in the use and DDC of medical insurance and non-medical insurance drugs. DDC, defined daily cost; EMs, essential medicines

Based on the medical insurance classification perspective

Both the use and expenditure shares of medical insurance drugs tended to increase during the study period ( B  = 0.022, 0.030, P  = 0.024, 0.009), with a significantly higher relative ratio with fixed base in expenditure than in use (28.69% vs. 14.76%). In terms of DDC for both types of drugs, both medical insurance and non-medical insurance drugs showed a decreasing trend, but the decrease for non-medical insurance drugs was significantly higher than for medical insurance (-61.80% vs. -27.77%) (Fig.  3 B).

Based on the perspective of the centralised procurement

Under the centralised procurement policy, the use and expenditure of the bid-winning drugs showed a short-term response to the implementation of each batch: use increased significantly and expenditure decreased significantly, ultimately manifesting as a "precipitous" decrease in the DDC of the bid-winning drugs. Among them, the second batch of bid-winning drugs had the largest increase in use (189.91%) and the largest decrease in expenditure (-76.32%) after the policy, while the impact on the fourth and fifth batches of bid-winning drugs was relatively small (Fig. 4 ).

Fig. 4. Trends in the use and expenditure of bid-winning drugs under the centralised procurement policy. A, trends in the use of bid-winning drugs; B, trends in the expenditure of bid-winning drugs. DDDs, defined daily doses; DDC, defined daily cost

In terms of DDC, the price reductions for all batches of bid-winning drugs were above 40%, again with the largest price reduction (81.71%) for the second batch of bid-winning drugs. Among the six antidiabetic drugs with records of use before the implementation of the policy, the top three price reductions were for glimepiride 2mg/30, acarbose 50mg/30 and acarbose 100mg/30, with price reductions of 6.43% (¥1.31), 8.45% (¥11.81) and 9.55% (¥8.81) in that order (Table  2 ).

The global prevalence and incidence of diabetes have risen dramatically since the beginning of the twenty-first century [ 25 ]. As diabetes and its early complications were heavily dependent on medication, this changing disease prevalence trend inevitably led to a continued increase in the use of related drugs. In addition, as the level of drug development has improved, new and more expensive antidiabetic drugs with better efficacy and fewer side effects have been included in the recommended treatment guidelines [ 26 ] and have been introduced to the market, which have also contributed to a significant increase in drug expenditure. Ultimately, the three main reasons were the ageing of the world's population, the decline in mortality from diseases triggered by improvements in health care, and the decreasing risk factors for disease [ 27 ].

Notwithstanding the trends above, the level of use and expenditure on antidiabetic drugs in Gansu Province was much lower than in some economically developed countries and regions worldwide, and was even comparable to that of Denmark at the end of the twentieth century [ 15 , 20 , 28 , 29 ]. This is partly because the prevalence of diabetes in Gansu Province was lower than in these countries and awareness and treatment of diabetes were also lower, and partly because the consumption of medicines in retail pharmacies was not included in our study, resulting in an underestimation of overall use [ 30 ]. To reduce the burden of drug costs on the grass-roots population, the relevant departments of the Gansu Provincial Government have in recent years continued to update the medical insurance policy for the two diseases, including expanding the scope of outpatient medication services, raising the overall reimbursement rate, implementing long-term prescription management and implementing instant settlement at the place of medical treatment, among other measures [ 31 ]. In the future, publicity efforts should also be stepped up, and, with the help of new technologies such as artificial intelligence, big data and cloud computing, personalised health management, proactive follow-up and lifestyle intervention should be carried out for key groups of people with chronic diseases, to reduce the waste of medical resources caused by unregulated management and treatment of chronic diseases.

The composition of use across classes of antidiabetic drugs in Gansu Province varied greatly. Among the oral antidiabetic drugs, biguanides overtook sulfonylureas to take first place, and α-glucosidase inhibitors caught up to become the third most used oral antidiabetic drug. Hypoglycemia remains one of the dangerous complications of diabetes. Because sulfonylureas tended to trigger weight gain and hypoglycemia in patients [ 32 ], international guidelines preferred metformin, which did not increase the risk of hypoglycemia. Metformin was still the most widely used first-line treatment in most countries because it combined safety and cost-effectiveness [ 33 ]. For patients in whom single-drug therapy had limited effect, combinations of metformin with other oral antidiabetic drugs or injectable drugs were used to control blood glucose. The use of α-glucosidase inhibitors was higher in Gansu Province than in the United States and some European countries. Owing to a diet with carbohydrates as a core food and to genetic differences, most Chinese had higher postprandial blood glucose levels than Europeans [ 34 ], and α-glucosidase inhibitors were widely used to treat type 2 diabetes in Chinese patients because they lower postprandial blood glucose levels by inhibiting the absorption of carbohydrates in the upper small intestine [ 35 ].

As a representative of the novel antidiabetic drug group, dapagliflozin was approved for entry into China in March 2017, becoming the first SGLT2i to be marketed in China, and in the same year sitagliptin, a DPP-4i, was added to the NBMIDC (Class B) [ 36 ]; both events contributed to the increased use of this drug group. Another reason could be that the base level of use in the early years was very small. The proportion of novel antidiabetic drugs in Gansu Province has continued to increase over the past five years, with SGLT2i accounting for the highest proportion (about three-fifths), followed by DPP-4i, even though DPP-4i was the first novel antidiabetic drug class to enter China. Owing to the wide disparity in economic and educational levels between urban and rural residents, the rate of early diagnosis and effective treatment of diabetes patients in rural areas of China was low, and patients often waited until they developed serious complications before receiving treatment. Some studies have shown that the incidence of complications in Chinese diabetes patients was higher than in some high-income countries, especially cardiovascular complications and kidney damage [ 37 ]. SGLT2i had good efficacy in treating these complications of diabetes, and the inclusion of this class of drugs in the NBMIDC has resulted in more significant price reductions, which may be a reasonable explanation for the above phenomenon.

The results of the study showed that diabetes treatment in Gansu Province was still concentrated in secondary and tertiary hospitals, and there was no gradual shift toward primary health care. Despite the Chinese government's commitment to managing and treating chronic diseases through the primary health care system over the past few decades, including a significant increase in investment in health care resources and human resources [ 38 ], an unexpectedly large number of residents still tended to bypass nearby primary health care facilities in favour of higher-level facilities for the treatment of minor ailments. The reason for this was that residents living in urban areas chose high-level hospitals because of their proximity, while, as mentioned earlier, some rural residents could only choose high-level hospitals because of the severity of their illnesses [ 39 ].

The DDC of different classes of antidiabetic drugs varied considerably. Insulins and analogues and novel antidiabetic drugs were generally more expensive, which is consistent with other studies [ 40 ]. The price of combinations of oral antihyperglycemic drugs was also high. The combination of metformin and thiazolidinediones was the most used combination hypoglycemic drug in Gansu Province, followed by combinations that include the novel antidiabetic drug group, which was the main reason for the high price of this class of drugs. In addition, the limitations of the ATC/DDD evaluation system in handling the dosage of combinations of oral blood glucose lowering drugs also inflated the DDC of this class to some extent.

The use of antidiabetic drugs in Gansu Province has been shifting towards EMs and, gradually, towards medically insured drugs, and more high-priced drugs were included in the NBMIDC. This phenomenon has been reported in other studies [ 41 ]. Two key aspects of drug policy were pricing and reimbursement. In China, provincial governments set uniform procurement prices for their regions within the national guideline price range based on tenders to ensure reasonable value for money [ 42 ]. The inclusion of drugs in the NBMIDC meant that pharmaceutical companies could own the majority of the public hospital market, so companies were often willing to "trade volume for price" [ 43 ]. At the same time, high health insurance coverage played a benign role in reducing the financial burden of illness on the population and reducing catastrophic health expenses for families, especially for chronic diseases [ 44 ]. The rapid decline in the price of non-medical insurance drugs may partly reflect "irrational price reduction" by enterprises that, for the sake of immediate interests, bid without regard to cost [ 45 ]. This is not a good trend: in the long run it may not only affect the supply of drugs but also hinder the research and development of innovative drugs.

In 2019, the Chinese government launched the first round of national centralised drug procurement to reduce drug prices and save drug costs through economies of scale [ 46 ]. Available studies have shown that the implementation of this policy has significantly reduced the DDC of bid-winning drugs, including some common chronic disease drugs (e.g. antihypertensive drugs) and acute disease drugs (e.g. emergency drugs) [ 47 , 48 ]. As the results show, the DDC of each batch of bid-winning drugs in Gansu Province has experienced a "precipitous" decline, accompanied by a significant increase in dosage, which showed that patients' medication is generally concentrated in the bid-winning drugs. As the number of bid-winning drugs continues to increase in the future, it will be conducive to improving the overall quality of medication for diabetes patients in China and reducing the burden of medication to a greater extent. However, it is worth noting that most of the drugs have only been procured and used since they were selected, reflecting the strong policy guidance on drug selection, which also placed higher demands on the quality of the winning product.

The study has some limitations. Firstly, the data used in the study were procurement data, which may not directly reflect the actual use of antidiabetic drugs. Secondly, the procurement hospitals included in the study were all public healthcare institutions, so the results cannot reflect the contribution of private hospitals and retail pharmacies to the consumption of antidiabetic drugs, even though this was a small proportion. Thirdly, owing to the limitations of the available data, we were unable to analyse the relationship between market consumption and patient characteristics or prescriber information.

The study analysed the changing trends and use patterns of antidiabetic drugs and explored the impact of different pharmaceutical policies on drug use and expenditure. It was found that the use and expenditure of antidiabetic drugs showed a continuous upward trend in the last decade, and the prices of drugs were constantly reduced in the direction of benefiting the public, which was closely related to the increasing prevalence of the disease and the implementation of a series of pharmaceutical policies. The introduction of new antidiabetic drugs and changes in guideline recommendations have both influenced drug use patterns and led to significant increases in drug expenditure. This suggested that prevention of the disease is a key priority to reduce the financial burden on the health care system and individuals. In the future, the capacity of primary healthcare institutions to prevent and manage the disease should be improved as a whole, and the national basic public health service program should be carried out solidly, while at the same time strictly controlling the quality of the bid-winning drugs under the centralized purchasing policy, setting up an information system for the files of drug varieties, and steadily advancing the traceability management of the medicines that have been selected.

Availability of data and materials

The datasets generated or analysed during the current study are not publicly available due to confidentiality policies but are available from the corresponding author upon reasonable request.

Abbreviations

Centralised bidding and purchasing

ATC: Anatomical therapeutic chemistry

DDD: Defined daily dose

DID: DDD per 1000 inhabitants per day

DDC: Defined daily cost

EMs: Essential medicines

WHO: World Health Organization

GDP: Gross domestic product

National Essential Medicine System

NBMIDC: National Basic Medical Insurance Drug Catalogue

DPP-4i: Dipeptidyl peptidase 4 inhibitors

GLP-1RAs: Glucagon-like peptide-1 receptor analogue

SGLT2i: Sodium-glucose co-transporter 2 inhibitors

DDDs: Defined daily doses

Sun H, Saeedi P, Karuranga S, et al. IDF Diabetes Atlas: global, regional and country-level diabetes prevalence estimates for 2021 and projections for 2045. Diabetes Res Clin Pract. 2022;183:109119. https://doi.org/10.1016/j.diabres.2021.109119 .

Li Y, Teng D, Shi X, et al. Prevalence of diabetes recorded in mainland China using 2018 diagnostic criteria from the American Diabetes Association: national cross sectional study. BMJ. 2020;369:m997. https://doi.org/10.1136/bmj.m997 .

Xi T, Pan L, Ren X, et al. Analysis of the prevalence and influencing factors of prediabetes in Gansu Province. Chin J Dis Control. 2018;22(10):987–91. https://doi.org/10.16462/j.cnki.zhjbkz.2018.10.002 .

Wang C, Hu Y, Hu X, et al. A study on accounting for diabetes treatment costs in Gansu Province based on the health cost accounting system 2011. Chronic Disease Prevention and Control in China. 2021;29(08):582-8. https://doi.org/10.16386/j.cjpccd.issn.1004-6194.2021.08.005 .

OECD. Health at a Glance. 2019. https://doi.org/10.1787/4dd50c09-en . Accessed 30 Jun 2023.

Fang Y, Wagner AK, Yang S, et al. Access to affordable medicines after health reform: evidence from two cross-sectional surveys in Shaanxi Province, western China. Lancet Glob Health. 2013;1(4):e227-37. https://doi.org/10.1016/S2214-109X(13)70072-X .

Dong W, Zwi AB, Bai R, et al. Benefit of China’s social health insurance schemes: trend analysis and associated factors since health reform. Int J Environ Res Public Health. 2021;18(11):5672. https://doi.org/10.3390/ijerph18115672 .

Quan P, Hu X, Wang C, et al. Trends in hospitalization costs of diabetic patients in Gansu Province and analysis of factors affecting them, 2014–2018. Chronic Disease Prevention and Control in China. 2022;30(01):8–13. https://doi.org/10.16386/j.cjpccd.issn.1004-6194.2022.01.003 .

Shanghai Sunshine Pharmaceutical Purchasing Network. Notice on the announcement of the successful results of the centralized purchase of drugs in the union region. 2019. https://www.smpaa.cn/gjsdcg/2019/09/30/9040.shtml . Accessed 30 Jun 2023.

Shanghai Sunshine Pharmaceutical Purchasing Network. Announcement of the proposed winning results of National Centralized Drug Purchasing. 2020. https://www.smpaa.cn/gjsdcg/2020/01/17/9261.shtml . Accessed 30 Jun 2023.

Shanghai Sunshine Pharmaceutical Purchasing Network. Notice on the announcement of the winning results of the National Centralized Purchasing of Pharmaceuticals. 2020. https://www.smpaa.cn/gjsdcg/2020/08/24/9560.shtml . Accessed 30 Jun 2023.

Lanzhou Municipal Bureau of Medical Protection. Supply list of selected drugs in the fourth batch of National Collective Purchasing (Gansu). 2021. http://ybj.lanzhou.gov.cn/art/2021/7/22/art_15304_1030279.html . Accessed 30 Jun 2023.

Lanzhou Municipal Bureau of Medical Protection. Supply list of selected medicines in Gansu Province collective purchasing (Fifth Batch 2021.10.15-2022.10.14). 2021. http://ybj.lanzhou.gov.cn/art/2022/4/27/art_15304_1114981.html . Accessed 30 Jun 2023.

CHINANEWS. Gansu to reduce the burden of public medicine: the collection of drug cost savings of more than 1.3 billion yuan. 2021. https://www.chinanews.com.cn/cj/2021/07-27/9529748.shtml . Accessed 30 Jun 2023.

Bucsa C, Farcas A, Iaru I, Mogosan C, Rusu A. Drug utilisation study of antidiabetic medication during 2012–2019 in Romania. Int J Clin Practice. 2021;75(11):e14770. https://doi.org/10.1111/ijcp.14770 .

Moura AM, Martins SO, Raposo JF. Consumption of antidiabetic medicines in Portugal: results of a temporal data analysis of a thirteen-year study (2005–2017). BMC Endocrine Disorders. 2021;21(1):1–10. https://doi.org/10.1186/s12902-021-00686-w .

National Bureau of Statistics of China. China statistics yearbook. 2021. http://www.stats.gov.cn/sj/ndsj/2022/indexch.htm . Accessed 30 Jun 2023.

Dang Y, Yang Y, Yang A, et al. Factors influencing catastrophic health expenditure of households with people with diabetes in Northwest China-an example from Gansu Province. BMC Health Serv Res. 2023;23(1):401. https://doi.org/10.1186/s12913-023-09411-w .

Gansu Taxation Bureau of the State Administration of Taxation. Gansu daily: let the people feel more livelihood temperature. 2021. http://gansu.chinatax.gov.cn/art/2021/3/17/art_289_245484.html . Accessed 30 Jun 2023.

Csatordai M, Benko R, Matuz M, et al. Use of glucose-lowering drugs in Hungary between 2008 and 2017: the increasing use of novel glucose-lowering drug groups. Diabet Med. 2019;36(12):1612–20. https://doi.org/10.1111/dme.14117 .

The WHO Collaborating Centre for Drug Statistics Methodology. Implementation and maintenance of the ATC/DDD methodology. 2022. https://www.whocc.no/use_of_atc_ddd/ . Accessed 30 Jun 2023.

Gansu Provincial Bureau of Statistics. Gansu Province statistics yearbook. 2021. http://tjj.gansu.gov.cn/tjj/c109464/info_disp.shtml . Accessed 30 Jun 2023.

National Health Commission Floating Population Service Center. Data from China floating population dynamic monitoring survey. 2021. https://www.chinaldrk.org.cn/wjw/#/data/classify/population/provinceList . Accessed 30 Jun 2023.

CEIC statistical database. Gansu, China: Consumer Price Index. 2021. https://www.ceicdata.com/zh-hans . Accessed 30 Jun 2023.

Reed J, Bain S, Kanamarlapudi V. A review of current trends with type 2 diabetes epidemiology, aetiology, pathogenesis, treatments and future perspectives. Diabetes, Metabolic Syndrome and Obesity: Targets and Therapy. 2021;14:3567–602. https://doi.org/10.2147/DMSO.S319895 .

Inzucchi SE, Bergenstal RM, Buse JB, et al. Management of hyperglycemia in type 2 diabetes, 2015: a patient-centered approach: update to a position statement of the American Diabetes Association and the European Association for the study of diabetes. Diabetes Care. 2015;38(1):140–9. https://doi.org/10.2337/diaspect.25.3.154 .

Khawandanah J. Double or hybrid diabetes: a systematic review on disease prevalence, characteristics and risk factors. Nutr Diabetes. 2019;9(1):33. https://doi.org/10.1038/s41387-019-0101-1 .

Bang C, Mortensen MB, Lauridsen KG, Bruun JM. Trends in antidiabetic drug utilization and expenditure in Denmark: A 22-year nationwide study. Diabetes Obes Metab. 2020;22(2):167–72. https://doi.org/10.1111/dom.13877 .

Fürnsinn C, Śliwczyński A, Brzozowska M, et al. Drug-class-specific changes in the volume and cost of antidiabetic medications in Poland between 2012 and 2015. Plos One. 2017;12(6):e0178764. https://doi.org/10.1371/journal.pone.0178764 .

Dang Y, Yang Y, Cao S, et al. Exploring the factors influencing the use of health services by people with diabetes in Northwest China: an example from Gansu Province. J Health Popul Nutr. 2023;42(1):64. https://doi.org/10.1186/s41043-023-00402-5 .

People's Daily Online (PRC newspaper). Gansu Province, a number of health insurance policies to benefit the people to be implemented from next year. 2022. http://gs.people.com.cn/n2/2022/1209/c183283-40224921.html . Accessed 30 Jun 2023.

Harsch IA, Kaestner RH, Konturek PC. Hypoglycemic side effects of sulfonylureas and repaglinide in ageing patients - knowledge and self-management. J Physiol Pharmacol. 2018;69(4). https://doi.org/10.26402/jpp.2018.4.15 .

Fang M, Wang D, Coresh J, et al. Trends in diabetes treatment and control in U.S. adults, 1999–2018. N Engl J Med. 2021;384(23):2219–28. https://doi.org/10.1056/NEJMsa2032271 .

Chen Y, Li Q, Han Y, et al. Vildagliptin versus alpha-glucosidase inhibitor as add-on to metformin for type 2 diabetes: subgroup analysis of the China Prospective Diabetes Study. Diabetes Ther. 2020;11(1):247–57. https://doi.org/10.1007/s13300-019-00742-8 .

Alssema M, Ruijgrok C, Blaak EE, et al. Effects of alpha-glucosidase-inhibiting drugs on acute postprandial glucose and insulin responses: a systematic review and meta-analysis. Nutr Diabetes. 2021;11(1):11. https://doi.org/10.1038/s41387-021-00152-5 .

Ministry of Human Resources and Social Security of China. Circular of the Ministry of Human Resources and Social Security on printing and issuing the drug catalogue of National Basic Medical Insurance, Industrial Injury Insurance and Maternity Insurance (2017 Edition). 2017. http://www.mohrss.gov.cn/SYrlzyhshbzb/shehuibaozhang/zcwj/yiliao/201702/t20170223_266775.html . Accessed 30 Jun 2023.

Gregg EW, Sattar N, Ali MK. The changing face of diabetes complications. Lancet Diabetes Endocrinol. 2016;4(6):537–47. https://doi.org/10.1016/S2213-8587(16)30010-9 .

Sun X, Meng H, Ye Z, et al. Factors associated with the choice of primary care facilities for initial treatment among rural and urban residents in Southwestern China. PLoS One. 2019;14(2):e0211984. https://doi.org/10.1371/journal.pone.0211984 .

Liu Y, Zhong L, Yuan S, et al. Why patients prefer high-level healthcare facilities: a qualitative study using focus groups in rural and urban China. BMJ Glob Health. 2018;3(5):e000854. https://doi.org/10.1136/bmjgh-2018-000854 .

Vaughan EM, Rueda JJ, Samson SL, et al. Reducing the burden of diabetes treatment: a review of low-cost oral hypoglycemic medications. Curr Diabetes Rev. 2020;16(8):851–8. https://doi.org/10.2174/1573399816666200206112318 .

Yao Q, Liu C, Ferrier JA, Liu Z, Sun J. Urban-rural inequality regarding drug prescriptions in primary care facilities – a pre-post comparison of the National Essential Medicines Scheme of China. Int J Equity Health. 2015;14(1):1–9. https://doi.org/10.1186/s12939-015-0186-7 .

Dong Z, Tao Q, Sun G. Survey and analysis of the availability and affordability of essential drugs in Hefei based on WHO / HAI standard survey methods. BMC Public Health. 2020;20(1):1405. https://doi.org/10.1186/s12889-020-09477-9 .

Liu GG, Vortherms SA, Hong X. China’s health reform update. Annu Rev Public Health. 2017;38:431–48. https://doi.org/10.1146/annurev-publhealth-031816-044247 .

Sun J, Lyu S. The effect of medical insurance on catastrophic health expenditure: evidence from China. Cost Effectiveness and Resource Allocation. 2020;18(1):1–11. https://doi.org/10.1186/s12962-020-00206-y .

Yu X, Mao N. Analysis of the causes of abnormal fluctuations in drug prices and proposals for countermeasures. Health Economics Research. 2021;38(7):44–7 ( https://kns.cnki.net/kcms/detail/33.1056.F.20210701.1746.020.html ).

State Council of the People’s Republic of China. Pilot program for national centralized drug procurement and use. 2019. http://www.gov.cn/zhengce/content/2019-01/17/content_5358604.htm . Accessed 30 Jun 2023.

Yang Y, Tong R, Yin S, et al. The impact of “4 + 7” volume-based drug procurement on the volume, expenditures, and daily costs of antihypertensive drugs in Shenzhen, China: an interrupted time series analysis. BMC Health Serv Res. 2021;21(1):1275. https://doi.org/10.1186/s12913-021-07143-3 .

Yang Y, Chen L, Ke X, et al. The impacts of Chinese drug volume-based procurement policy on the use of policy-related antibiotic drugs in Shenzhen, 2018–2019: an interrupted time-series analysis. BMC Health Serv Res. 2021;21(1):668. https://doi.org/10.1186/s12913-021-06698-5 .

Acknowledgements

We thank the Special Research on Lanzhou University Serving Gansu Economic and Social Development (contract number 2019-FWZX-11) and Research on Total Health Expenditure in Gansu Province (contract number 2022620005002671) for supporting this study.

This study was supported by Special Research on Lanzhou University Serving Gansu Economic and Social Development (contract number 2019-FWZX-11) and Research on Total Health Expenditure in Gansu Province (contract number 2022620005002671).

Author information

Authors and affiliations.

School of Public Health, Lanzhou University, 222# Tianshui South Road, Lanzhou, 730000, China

Wenxuan Cao, Hu Feng, Yaya Yang, Lei Wang, Xuemei Wang & Xiaobin Hu

Division of Pharmaceutical Procurement, Gansu Public Resources Trading Center, 68# Yanxing Road, Lanzhou, 730000, China

Yongheng Ma & Defang Zhao

Contributions

Yongheng Ma and Defang Zhao provided data and technical support, Hu Feng and Yaya Yang carried out partial data management, Lei Wang and Xuemei Wang polished the language, Xiaobin Hu provided critical comments, and Wenxuan Cao analyzed the data and wrote the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xiaobin Hu .

Ethics declarations

Ethics approval and consent to participate.

Not applicable. No ethical approval was required for this study by the author’s institution because this research only included the medication procurement information and all the information was anonymous. No secondary data were used in this study and no humans were involved.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Cite this article.

Cao, W., Feng, H., Yang, Y. et al. Trends in antidiabetic drug use and expenditure in public hospitals in Northwest China, 2012-21: a case study of Gansu Province. BMC Health Serv Res 24 , 415 (2024). https://doi.org/10.1186/s12913-024-10917-0

Received : 16 August 2023

Accepted : 27 March 2024

Published : 03 April 2024

DOI : https://doi.org/10.1186/s12913-024-10917-0

  • Antidiabetic drug
  • Public hospitals
  • Use pattern
  • Pharmaceutical policy

BMC Health Services Research

ISSN: 1472-6963

Predicting macroinvertebrate average score per taxon (ASPT) at water quality monitoring sites in Japanese rivers

  • Research Article
  • Published: 01 April 2024

  • Yuichi Iwasaki   ORCID: orcid.org/0000-0001-7006-8113 1 ,
  • Tomomi Suemori 1 &
  • Yuta Kobayashi   ORCID: orcid.org/0000-0001-8923-5006 2  

Biomonitoring with bioindicators such as river macroinvertebrates is fundamental for assessing the status of freshwater ecosystems. In Japan, water quality and biomonitoring surveys are conducted separately, leading to a lack of nationwide information on their relationships and on the biological status of water quality monitoring (WQM) sites. To understand the biological status of WQM sites across Japan, we developed a multiple linear regression model to estimate the average score per taxon (ASPT) using river macroinvertebrate data surveyed at a total of 237 “aligned” sites, identified by the co-occurrence of biomonitoring and WQM sites. The resulting regression model, with eight predictors such as biological oxygen demand and the proportion of urban areas in the catchment, could predict ASPT with reasonable accuracy (e.g., an error within ±1 for 96% of the aligned data). Using this model, we estimated ASPT values at 2925 WQM sites in rivers nationwide, categorizing them into four levels of river environment quality: “very good” (29% of WQM sites), “good” (50%), “fairly good” (14%), and “not good” (8%). Furthermore, we observed statistically significant correlations (p < 0.05; 0.4 ≤ r ≤ 0.7) between ASPT and all eight macroinvertebrate metrics examined, such as mayfly and stonefly richness, providing ecological implications of changes in ASPT.
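The accuracy figure quoted in the abstract (an error of ±1 for 96% of the aligned data) is a share-within-tolerance summary. A minimal sketch of how such a share could be computed is given below. The function name and the ASPT values are hypothetical and are not taken from the study or its repository.

```python
def share_within_tolerance(observed, predicted, tolerance=1.0):
    """Return the fraction of predictions whose absolute error is <= tolerance."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    hits = sum(1 for o, p in zip(observed, predicted) if abs(o - p) <= tolerance)
    return hits / len(observed)


# Hypothetical ASPT values, for illustration only (not data from the study).
observed_aspt = [7.2, 6.5, 5.1, 7.8, 4.0]
predicted_aspt = [6.9, 6.8, 6.4, 7.5, 4.6]

print(share_within_tolerance(observed_aspt, predicted_aspt))
```

In this toy example, four of the five predictions fall within ±1 of the observed value.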

Data availability

All data and R code used are available from a GitHub repository at https://github.com/yuichiwsk/predict_ASPT_Japan .

Acknowledgments

We thank Terutaka Mori, Hidetaka Ichiyanagi, Takeshi Mizukami, Noriyoshi Shimura, and Takashi Yamasaki for their help and advice during data collection and handling. During the preparation of this work the authors used ChatGPT to improve readability and language. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.

This study was supported by JSPS KAKENHI [grant numbers JP18H04141 and JP20K12213]. The funder had no role in the study design, data collection and analysis, interpretation of data, manuscript preparation, or decision to submit.

Author information

Authors and affiliations.

Research Institute of Science for Safety and Sustainability, National Institute of Advanced Industrial Science and Technology (AIST), 16-1 Onogawa, Tsukuba, Ibaraki, 305-8569, Japan

Yuichi Iwasaki & Tomomi Suemori

Field Science Center, Faculty of Agriculture, Tokyo University of Agriculture and Technology, 3-5-8 Saiwai-tyo, Fuchu, Tokyo, 183-8509, Japan

Yuta Kobayashi

Contributions

Yuichi Iwasaki: Conceptualization, Methodology, Formal analysis, Investigation, Data curation, Writing – original draft, Writing – review & editing, Visualization, Funding acquisition. Tomomi Suemori: Investigation, Data curation, Writing – review & editing. Yuta Kobayashi: Methodology, Formal analysis, Investigation, Resources, Data curation, Writing – review & editing.

Corresponding author

Correspondence to Yuichi Iwasaki .

Ethics declarations

Ethical approval.

Not applicable.

Consent to participate

Consent for publication, competing interests.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Responsible Editor: Thomas Hein

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

(DOCX 2.83 mb)

(XLSX 18 kb)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Iwasaki, Y., Suemori, T. & Kobayashi, Y. Predicting macroinvertebrate average score per taxon (ASPT) at water quality monitoring sites in Japanese rivers. Environ Sci Pollut Res (2024). https://doi.org/10.1007/s11356-024-33053-y

Received : 21 November 2023

Accepted : 19 March 2024

Published : 01 April 2024

DOI : https://doi.org/10.1007/s11356-024-33053-y

  • Aquatic insect
  • Invertebrate
  • Water pollution
  • Bioindicator
  • Biomonitoring
  • Open access
  • Published: 02 April 2024

Effect of evidence-based nursing practices training programme on the competency of nurses caring for mechanically ventilated patients: a randomised controlled trial

  • Sameh Elhabashy 1 ,
  • Michiko Moriyama 2 ,
  • Eman Ibrahim El-Desoki Mahmoud 3 &
  • Basem Eysa 3  

BMC Nursing volume 23, Article number: 225 (2024)

Evidence-Based Practice (EBP) has been recognised worldwide as a standardised approach for enhancing the quality of healthcare and patient outcomes. Nurses play a significant role in integrating EBP, especially in the Intensive Care Unit (ICU). Consequently, this study aims to examine the effect of an adapted evidence-based nursing practices training programme on the competency level of nurses caring for mechanically ventilated patients.

A prospective, open-label, parallel 1:1 randomised controlled trial was conducted on 80 nurses caring for ICU patients at the National Hepatology and Tropical Medicine Research Institute, Egypt. The trial was carried out between November 2022 and February 2023 under the registration number NCT05721664. The enrolled nurses were randomly divided into intervention and control groups. The intervention group received the evidence-based nursing practice training programme (EBNPTP) in accordance with the Johns Hopkins EBP conceptual model, whereas the control group received traditional in-service education. Four assessments (one pre- and three post-assessments) were conducted to evaluate nurses’ competency level over time using the adapted evidence-based nursing competency assessment checklist. The primary endpoint was an increase in the competency level of nurses caring for mechanically ventilated patients.

The results revealed statistically significant differences in competency level between the intervention and control groups across the three post-assessments (p < .001). The study also showed that the nurses’ competency level declined significantly over time (p < .001). Additionally, multiple linear regression indicated that the nurses’ pre-assessment score and educational level, as independent variables (predictors), were significantly associated with the third (endpoint) assessment (p < .01).

The EBP training programme demonstrated a significant increase in the nurses’ level of competency compared with traditional in-service education. This suggests that by training the nurses in various settings with the essential skills and knowledge for EBP, their competency level can be enhanced, leading to the delivery of effective care and improving patient outcomes. However, the long-term sustainability of the EBP adoptions was insufficient; further studies are needed to investigate the factors that affect the durability of EBP adoption.

Trial registration

The study was registered with ClinicalTrials.gov (Registration # NCT05721664) on 10/02/2023.

Evidence-based practice (EBP) is a universal fundamental approach for delivering standardised care based on the most recent scientific evidence to enhance healthcare quality [1]. EBP is a problem-solving method for making effective, safe clinical decisions as a foundation for improving patient outcomes, as it bridges the theory-to-practice gap and delivers innovative patient care, while also reducing healthcare costs and encouraging lifelong learning [2]. Nurses play a crucial role in maximising the efficiency of healthcare services. Furthermore, they directly interact with patients, particularly in Intensive Care Units (ICUs) [3]. Therefore, healthcare organisations should always strive to make it easier for frontline nurses to use the best evidence in their everyday practices and overcome obstacles that may impede the implementation of the evidence [4]. EBP in healthcare is not a novel concept; Florence Nightingale introduced EBP to nursing in 1858 [5]. The concept of EBP changed as the nursing profession evolved and expanded significantly over the past few decades [6].

Care for critically ill patients necessitates a high level of competency [7]. In the ICU, mechanical ventilation (MV) is the most frequently utilised treatment modality [8]. Although MV aids the survival of patients with respiratory compromise, it frequently results in a number of complications if patients do not receive adequate nursing care [9]. Since the primary purpose of EBP is to address healthcare issues that contribute to higher mortality and morbidity rates, we chose to focus on ventilator-associated pneumonia (VAP), as it is a predominant complication among MV patients in Egypt. The incidence of VAP in Egypt ranges from 16 to 75% [10, 11], which is considerably higher than in other regions: the global incidence of VAP is 15.6%, with rates of 13.5% in the United States, 13.8% in Latin America, and 16.0% in the Asia-Pacific region [12, 13]. Additionally, the survival rate among VAP patients in Egypt ranges from 31.8 to 58.3% [11, 14], whereas the global survival rate of VAP typically falls between 50% and 75% [15], which is also higher than the rate observed in Egypt. This high incidence of VAP and low survival rate in Egypt may indicate a lack of EBP in nursing practice. Because of inadequate nursing practices, particularly in the care of patients on MV, several studies recommend extensive training for nurses [16, 17, 18].

Therefore, we hypothesised that nurses who received an evidence-based nursing practice training programme (EBNPTP) (μ1) would demonstrate a sustained and greater increase in their level of competency in caring for mechanically ventilated patients than those who received the usual traditional in-service education (μ2) (H1: μ1 > μ2). This study aims to examine the effect of a designed EBNPTP on the competency level of nurses caring for patients on MV in selected ICUs in Egypt.

Conceptual framework

The revised Johns Hopkins Evidence-Based Practice (JHEBP) Model [4] was selected as a systematic and efficient approach to implementing an evidence-based programme into practice in this study (Fig. 1). The JHEBP model encompasses four essential components: inquiry, practice, practice improvements, and learning. Nurse performance is considered the most typical determinant and predictor of the quality of care and patient outcomes [19]. Insufficient competency among nurses caring for MV patients negatively affects the quality of care provided and patient outcomes [16, 17, 18]. As the independent variable, we designed the EBP training programme for ICU nurses based on the JHEBP method. The programme contains eight domains, listed in Fig. 1, which meet the educational needs of nurses in terms of both knowledge and practice. The ultimate objective of the training is to produce a positive, sustainable change in nurses’ level of competency, thereby improving patient outcomes [20]. The JHEBP Model defines learning as a sustainable change in candidates’ behaviour. Therefore, nurses’ competency, the dependent variable, was measured three times after the intervention at one-month intervals to evaluate the change over time relative to the baseline pre-assessment and to the control group.

Fig. 1. Conceptual framework of this study

EBNCAC: Evidence-Based Nursing Competency Assessment Checklist; AARC: American Association for Respiratory Care; AACN American Association of Critical-Care Nurses; EBNPTP: Evidence-Based Nursing Practice Training Programme; MV: Mechanical Ventilator; ICU: Intensive Care Unit; EBP: Evidence-Based Practices; VAP: Ventilator-Associated Pneumonia

Trial design

The current study was a prospective, open-label, parallel 1:1 randomised controlled trial. This study’s protocol was developed following the Standard Protocol Items Recommendations for Interventional Trials [21]. The study was conducted between November 2022 and February 2023 at the National Hepatology and Tropical Medicine Research Institute (NHTMRI) in Cairo, Egypt, in accordance with the Consolidated Standards of Reporting Trials (CONSORT) guidelines [22].

Participants sampling and study setting

The study was conducted in adult ICUs at the NHTMRI, Cairo, Egypt. The total capacity of the ICU is 14 beds, divided into two sections. One section was allocated to nurses assigned to the intervention group, and the other to nurses who served as the control group. Randomly assigning nurses to each group and ensuring that they worked in separate ICU sections reduced the risk of contamination bias. The sections are comparable in terms of patient flow, equipment availability, and the number of working nurses. The total number of nurses in the selected setting is 94. All selected nurses met the following inclusion criteria. First, they were willing to participate in the research. Second, they had held their existing position for at least three months; this criterion ensures that participants have had sufficient time to become familiar with their roles and responsibilities, allowing a more accurate assessment of any change in competency resulting from the EBP training programme. Third, they had at least two years of critical care experience, ensuring a solid foundation of the knowledge and skills necessary for caring for MV patients; this experience enhances the credibility of their feedback on the training programme’s effectiveness. Nurses intending to leave their jobs within the study period (four months) were excluded.

Sample size calculation

The sample size of 80 nurses was estimated with G*Power software V.3.1.9.4 (Psychonomic Society, Madison, Wisconsin, USA) using α = 0.05, power (1 − β) = 0.80, effect size = 0.56, and a confidence level of 0.95. The chosen sample size was deemed adequate in terms of statistical power and effect size, based on a previous study that examined the impact of an education programme on the performance of nurses providing care for patients on MV [23].
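The calculation above can be approximated by hand. The sketch below uses the standard normal approximation for comparing two group means. It assumes that the reported effect size of 0.56 is a standardised mean difference (Cohen's d) and that the two groups are equal in size; the source states neither the test family nor whether the test was one- or two-sided. G*Power itself uses exact noncentral distributions, so its results typically come out slightly larger than this approximation. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80, one_sided=False):
    """Approximate sample size per group for comparing two means.

    Normal approximation: n = 2 * (z_alpha + z_beta)^2 / d^2,
    where d is the standardised mean difference (assumed here).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha if one_sided else 1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


print(n_per_group(0.56, one_sided=False))  # two-sided test
print(n_per_group(0.56, one_sided=True))   # one-sided test (H1: mu1 > mu2)
```

Under these assumptions the one-sided approximation gives 40 nurses per group, which is in line with the reported total of 80, although the source does not confirm that this was the calculation used.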

Randomisation and allocation

After verifying the eligibility criteria, the enrolled nurses were randomly divided into intervention and control groups. A simple random sample was generated by a lottery method. The eligible nurses were assigned a number, and each number was written and placed in a small opaque envelope. Then random selection and allocation were performed sequentially for the intervention and control groups. Randomisation and allocation were conducted by an unaffiliated third party.
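The lottery procedure described above was performed manually with numbered envelopes. As an illustration only, a software analogue of simple 1:1 random allocation might look like the following sketch. The function name, the participant numbering, and the seed are assumptions introduced here, not details from the trial.

```python
import random


def allocate_1_to_1(participant_ids, seed=None):
    """Shuffle participant IDs and split them into two equal-sized groups."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "intervention": sorted(shuffled[:half]),
        "control": sorted(shuffled[half:]),
    }


# Hypothetical numbering of the 80 eligible nurses (1..80).
groups = allocate_1_to_1(range(1, 81), seed=2022)
print(len(groups["intervention"]), len(groups["control"]))
```

Recording the seed alongside the allocation list is one way to make the assignment auditable by a third party.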

The primary endpoint

The primary outcome was an ‘increase in the competency level’ of nurses caring for MV patients, measured with the Evidence-Based Nursing Competency Assessment Checklist (EBNCAC) over three months after receiving the EBNPTP. The three-month follow-up was intended to evaluate comprehensively the effectiveness and durability of the EBNPTP and to ensure the stability of the results.

Measurement tools

The EBNCAC is a structured observational checklist for assessing nurses’ competency. It was developed by the researcher and compiled from the evidence-based clinical guidelines listed in Fig. 1. Its 74 items cover the eight domains addressed in the Evidence-Based Nursing Practice Training Programme (EBNPTP). The tool was structured based on various sources, including the American Association of Critical-Care Nurses (AACN), the American Association for Respiratory Care (AARC), the National Institutes of Health (NIH), and the Cochrane Library [24, 25, 26, 27]. Each item was graded on a scale of 0 to 2: “2 = performed correctly and satisfactory”, “1 = performed but unsatisfactory”, and “0 = not performed”. The total score ranged from 0 (lowest) to 148 (highest). The assessors were the charge nurses in the selected ICUs, who directly observed the nurses’ performance while they cared for MV patients. Based on the total score of 148, the competency level was divided into three categories: High (> 120 / > 80%), Moderate (74–120 / 50–80%), and Low (< 74 / < 50%).
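The scoring rule can be written out as a short sketch. It assumes that the Low category means a total below 74 and that a total of exactly 74 counts as Moderate; the function and constant names are hypothetical.

```python
N_ITEMS = 74          # number of checklist items
VALID_SCORES = {0, 1, 2}


def ebncac_total(item_scores):
    """Sum the 74 item scores (each 0, 1, or 2) into a total of 0..148."""
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item scores, got {len(item_scores)}")
    if any(score not in VALID_SCORES for score in item_scores):
        raise ValueError("each item score must be 0, 1, or 2")
    return sum(item_scores)


def competency_level(total):
    """Map a total score to a category using the cut-offs quoted in the text."""
    if total > 120:
        return "High"
    if total >= 74:   # assumption: exactly 74 is Moderate, below 74 is Low
        return "Moderate"
    return "Low"


# Hypothetical nurse: 50 items scored 2, 20 items scored 1, 4 items scored 0.
scores = [2] * 50 + [1] * 20 + [0] * 4
total = ebncac_total(scores)
print(total, competency_level(total))
```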

Validity and reliability

Content and scope validity of the EBNCAC were determined using the Lawshe method [28]. The tool was reviewed by five experts in critical care medicine and nursing. Following the subject matter expert (SME) ratings, the content validity ratio (CVR) was calculated for each item as CVR = (n_e − N/2) / (N/2), where n_e is the number of SMEs rating the item “essential” and N is the total number of SMEs. The Content Validity Index (CVI) was then calculated by averaging the CVRs across all items, giving a value of 0.98 (72.57/74). The scale’s reliability was assessed by internal consistency (Cronbach’s alpha) across all EBNCAC items; the calculated Cronbach’s alpha was 0.721.
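The two statistics in this paragraph follow simple formulas, sketched below. The CVR and CVI functions implement the Lawshe formula as stated in the text; the Cronbach's alpha function uses the usual variance-based definition, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). All function names and the small data sets are hypothetical and are not the study's ratings or scores.

```python
from statistics import mean, variance


def content_validity_ratio(n_essential, n_experts):
    """Lawshe CVR = (n_e - N/2) / (N/2) for a single item."""
    half = n_experts / 2
    return (n_essential - half) / half


def content_validity_index(essential_counts, n_experts):
    """CVI as the mean of the per-item CVRs."""
    return mean([content_validity_ratio(n, n_experts) for n in essential_counts])


def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding one score per respondent."""
    k = len(item_scores)
    item_variance_sum = sum(variance(item) for item in item_scores)
    totals = [sum(per_respondent) for per_respondent in zip(*item_scores)]
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical ratings: number of the 5 experts marking each of 4 items "essential".
print(content_validity_index([5, 5, 4, 5], n_experts=5))

# Hypothetical scores for 3 items from 4 respondents.
print(cronbach_alpha([[2, 1, 2, 0], [2, 1, 1, 0], [1, 1, 2, 0]]))

# The CVI reported in the text: summed CVRs of 72.57 over 74 items.
print(round(72.57 / 74, 2))
```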

Evidence-based nursing practices training programme (EBNPTP) for the intervention group

The EBNPTP pertains to the care of MV patients. It is an integrated theoretical and clinical course lasting one week (30 h), designed by the researchers. To ensure its validity and reliability, the EBNPTP was formulated based on the latest research findings in the areas relevant to this study, such as those from the AACN and the Cochrane Library [24, 27]. Furthermore, it was reviewed by three professional experts in critical care nursing and medicine to ascertain content validity. In addition, three ICU nurses were enlisted for a pre-test assessing the feasibility and acceptability of the training programme and the tool. Necessary revisions were made based on their feedback, and these nurses were excluded from the sampling frame for enrolment. We standardised the delivery of the EBNPTP to enhance reliability by providing clear instructions to facilitators and conducting training sessions in a controlled and consistent manner. Finally, the setting’s amenities and the preferences of nurses and patients were taken into account, as EBP requires. During the training week, the nurses in the intervention group (n = 40) were divided into two equal groups. They were scheduled to exchange their working days in the ICU with their training times to prevent any interruption of workflow in the ICU. The nurses’ considerable clinical experience enabled them to effectively fulfil the objectives of the condensed course.

Control group

The control group received traditional in-service education on a regular basis from the quality management department and nursing office. Usually, the educational content was provided in accordance with the educational needs of nurses. Routine clinical guidance was usually provided in real clinical settings. Additionally, periodic supplementary sessions were organised to address significant incidental clinical issues encountered by nurses.

Data collection procedure

After informed consent was obtained, recruitment started in November 2022, and the baseline pre-assessment was conducted in November 2022 using the EBNCAC. It served as the initial assessment before the EBNPTP intervention, which was delivered over one week at the end of November 2022. The first post-assessment was conducted immediately after the EBNPTP, at the beginning of December 2022, to measure the immediate impact of the EBNPTP using the EBNCAC. The second post-assessment took place in January 2023, one month after the first post-assessment. The third post-assessment occurred in the second half of February 2023, one month after the second post-assessment and three months after the EBNPTP, serving as the endpoint assessment. Considering that each assessment was held within one week, the interval between consecutive assessments was one month.

Data analysis

This study used a per-protocol analysis. The Statistical Package for the Social Sciences (SPSS) V.23.0 (IBM, New York) was used for analysis. Data were expressed as mean and standard deviation (SD). Normality of the data was examined for both the control and intervention groups using the Shapiro-Wilk test, histograms, box plots, and normal Q-Q plots; the data were normally distributed in both groups ( p  >.05). The two groups were compared using two-way repeated-measures ANOVA. Multiple linear regression was applied to test the effect of the study predictors on the endpoint (third post-assessment). Finally, the effects of demographic characteristics on the baseline pre-assessment and the third post-assessment endpoint were determined using t-tests and one-way ANOVA. The significance level was set at p  <.05.

Out of 94 nurses, 14 were excluded as they did not meet the eligibility criteria. Eighty nurses were equally allocated to the intervention and control groups. Ultimately, third post-assessment data were available for 71 nurses (intervention, n  = 37; control, n  = 34). The reasons for dropout during follow-up across the three post-assessments are reported in Fig. 2. The scores for the nurses’ competency subscales in the control and intervention groups at the four observation times are presented in supplementary material 1.

Fig. 2 CONSORT flow diagram showing participation in this study

At baseline, there were no significant differences between the groups in their demographic characteristics (Table 1). Most participants were female (78.8%), and more than half were diploma nurses (57.5%). Their mean age and length of ICU experience were 33.2 years and 10.3 years, respectively.

Fig. 3 Comparison of nurses’ level of competency across the four measurement times. * P  <.001

The stacked line chart in Fig. 3 depicts the change over time in nurses’ competency level, measured by the EBNCAC, in the two groups. It reveals a low level of competency at baseline, with no significant difference between the two groups ( p  =.81). The highest level of competency was observed in the intervention group at the first post-assessment, with a mean score of 90.4 ± 11.55. The mean score then declined steadily, reaching 65.6 ± 26.70 at the third post-assessment. The control group demonstrated a low level of competency across the four assessments, with mean scores ranging from 31.5 to 33.67 out of 148. Statistically significant differences were observed between the groups at all three post-assessments ( p  <.001).

According to Table 2, there was a statistically significant difference within groups across the pre-assessment and the first, second, and third post-assessments in the nurses’ competency level ( p  <.001). In addition, there was a statistically significant difference between groups in the nurses’ competency level ( p  <.001), specifically in the three post-assessment phases, with a considerably high estimated effect size (η² = 0.699). Finally, there was an interaction effect between measurement time and group ( p  <.001). Therefore, the results of the two-way repeated-measures ANOVA supported our hypothesis that the EBNPTP significantly increased nurses’ level of competency.

Table 3 presents the results of the multiple linear regression model, which reveals the effect of the independent variables on the endpoint observation. The pre-assessment score (B = 0.53, p  =.002), educational level (B = 14.12, p  <.001), and group allocation (control vs intervention; B = 35.69, p  <.001) significantly predicted the third post-assessment (endpoint) score. The adjusted R² was 0.667, indicating that the model accounted for approximately 66.7% of the variance in the third post-assessment score.
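To make the regression output easier to read, the following is a minimal sketch of how unstandardised coefficients (B) and the adjusted R² are obtained by ordinary least squares. It is not the authors' SPSS analysis: the function names, the 0/1 coding of education and group, and the eight noise-free example rows are all hypothetical and chosen only so that the fitted coefficients can be checked by hand.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def fit_ols(predictors, outcome):
    """Return (coefficients, r2, adjusted r2); coefficients[0] is the intercept."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    y_bar = sum(outcome) / len(outcome)
    sse = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    sst = sum((y - y_bar) ** 2 for y in outcome)
    r2 = 1 - sse / sst
    return beta, r2, adjusted_r2(r2, len(outcome), k - 1)


# Hypothetical rows (not study data): (pre-assessment score, education 0/1, group 0/1).
example_predictors = [(30, 0, 0), (32, 1, 0), (28, 0, 0), (35, 1, 0),
                      (31, 0, 1), (29, 0, 1), (33, 1, 1), (34, 1, 1)]
# Noise-free outcome built from assumed coefficients so the fit can be verified.
example_outcome = [2 + 0.5 * pre + 14 * edu + 35 * grp
                   for pre, edu, grp in example_predictors]

coefficients, r2, adj_r2 = fit_ols(example_predictors, example_outcome)
```

Because the example outcome is constructed without noise, the fit recovers the assumed coefficients and an R² of essentially 1; with real data, each B is the expected change in the endpoint score per one-unit change in that predictor with the other predictors held fixed.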

Table 4 indicates significant differences in the third post-assessment scores by nurses’ level of education (F = 9.41, p  <.01). The third post-assessment score was significantly higher for nurses with a bachelor’s degree than for diploma nurses ( p  <.001) and technical institute nurses ( p  <.05), whereas no significant differences in pre-assessment or third post-assessment scores were found in terms of gender, age, or years of experience.
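For readers unfamiliar with the F statistic, a one-way ANOVA compares the variability between group means with the variability within groups. The sketch below shows that calculation under stated assumptions: the function name and the three small score lists are hypothetical and do not reproduce the study data or its SPSS output.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group size times squared deviation of each group mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores for three education levels (not study data).
example_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(example_groups)
```

In this toy example the between-group sum of squares is 26 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, giving F = 13. Pairwise comparisons such as those reported for bachelor’s versus diploma nurses would require a separate post hoc test.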

To the best of our knowledge, this is the first randomised controlled trial (RCT) in Egypt and the Middle East that aimed to quantitatively examine the effect of the EBNPTP on the competency of nurses caring for MV patients and to assess the sustainability of that effect over time. The current study showed that the nurses who received the EBNPTP demonstrated a higher level of competency than those who did not. Congruently, several studies revealed that nurses’ level of competency was significantly improved by attending an EBP educational programme [ 29 , 30 , 31 , 32 ]. These findings strongly advocate for the widespread adoption of EBP to enhance nurses’ competency levels. Conversely, previous research has suggested that although EBP enhances nurses’ practices, it does not have a significant impact on their knowledge and attitude [ 33 ].

The improvement in nurses’ level of competency after receiving the EBNPTP can be attributed to a number of factors, including an increase in their job satisfaction, a sense of confidence, and an increase in their knowledge and skills, which provides them with a rationale for each specific task they perform. These explanations align with previous research [ 4 , 31 ]. From another point of view, this finding underscores deficiencies in nurses’ baseline understanding of EBP and its inadequate integration into their clinical practice. Moreover, the findings hint at the ineffectiveness of the traditional in-service education delivered to nurses. This indicates the importance of replacing traditional in-service education with EBP training programmes and of using such programmes widely across different nursing domains.

In terms of the sustainability of the EBNPTP effect in the intervention group, this study demonstrated that the mean scores of the nurses who received the EBNPTP decreased significantly over time, indicating a lack of sustainability in the nurses’ level of competency. Even though the third post-assessment score was the lowest post-assessment score for the intervention group, it was still significantly higher, at about twice the baseline pre-assessment score for the same group and the average scores of the control group. Similarly, Chu et al. (2019) [ 34 ] found that the experimental group’s scores improved substantially more than the control group’s one month after the training. However, both groups’ results declined; still, the experimental group performed better than the control group, indicating the effectiveness of the EBNPTP. Short-term initiatives of EBP education are likely to be successful. Nevertheless, there is little evidence regarding these initiatives’ sustainability [ 34 , 35 ]. Conversely, other studies confirmed that the participants’ EBP competencies were significantly improved and maintained over time [ 36 , 37 ].

The lack of sustainability of nurses’ competencies can be attributed to nurses’ attitudes, resistance to change, lack of motivation, inadequate commitment, insufficient clinical supervision, stressful work environment, and workload. This finding is consistent with previous studies [ 31 , 36 ]. The lack of sustainability may also result from insufficient availability of the essential equipment and supplies required for conducting EBP. Hence, it is strongly advised to provide effective clinical supervision for nurses in parallel with implementing EBP, to encourage nurses to adopt EBP consistently, and to ensure the constant availability of the necessary equipment.

Regarding the variables that affect the competency level of the nurses, multiple linear regression analysis revealed that a higher educational level was associated with a higher level of nurses’ competency. This result could be attributed to the higher level of skills and knowledge that baccalaureate-educated nurses possess; they are also more confident and take the initiative to update their knowledge, which makes them more capable of performing competently. Along the same lines, Hashish et al. (2020) [ 38 ] illustrated that experienced and baccalaureate nurses are more likely than diploma-prepared nurses to access the resources, power, and knowledge that enable them to undertake autonomous practice and EBP. Notably, the majority (approximately 90%) of nurses in Egypt hold diplomas, while only 6–8% hold bachelor’s degrees [ 39 ].

Furthermore, the study found that the baseline pre-assessment was an independent predictor and showed a significant relationship with the third post-assessment. This may be because nurses who performed at a given level tend to continue to do so, as their performance depends on their previously accumulated knowledge. This finding aligns with previous research which stated that nurses with a higher baseline competency level are likely to be more confident in implementing EBP [ 40 ]. However, the study revealed that nurses’ gender, age, and years of experience did not significantly affect their competency level before and after the EBNPTP. This finding may be due to the reliability and accessibility of the EBNPTP, which was available to all nurses irrespective of these demographic variations, suggesting that future EBNPTP implementation could be beneficial for all nurses, regardless of these demographic characteristics. Consistently, Stokke et al. (2014) stated that none of the nurses’ demographic characteristics were found to be correlated with the implementation of EBP [ 41 ]. Since nurses’ attitude and motivation reflect their professional values and performance, prior research emphasises the importance of enhancing these factors when encouraging nurses to adopt EBP into their practices [ 42 ].

The current RCT is the first in Egypt and the Middle East to investigate the effect of an EBP training programme on the competency of nurses caring for MV patients and to assess the effect’s sustainability over time. In accordance with the research hypothesis, the EBP training programme produced a significant increase in the nurses’ level of competency compared with traditional in-service education. This highlights the potential for widespread adoption of EBP across various areas of nursing to enhance the quality of care provided. However, the effect of the EBP training programme was not sustained over time. Addressing this challenge requires integrating the EBP training programme into nurses’ job development and remuneration-linked training to enhance its long-term effectiveness, together with ongoing monitoring of nurses’ performance and further assessment of the contributing factors. The baseline competency and educational level of nurses were significantly associated with their performance. Consequently, the difficulties of dealing with nurses with varying levels of education and diminishing competencies persist. This suggests the need for customised training programmes based on nurses’ baseline competency levels and educational backgrounds, as well as facilitating peer support and mentorship.

Limitations of the study

The participants were selected from a limited number of nurses. In addition, the dropout rate (11%) was not considered when the sample size was estimated, which may have affected the power of the study. The study included a small number of nurses with bachelor’s degrees. Another limitation is that the sustainability of the EBNPTP effect was assessed over only three months, which may be too short a period. Additionally, data were collected from only one hospital.

Recommendations

Based on the current study’s findings, we strongly suggest implementing EBP in nursing practice by raising awareness and delivering extensive training for nurses across different settings. Equipping nurses with the necessary skills and knowledge for EBP can enhance their competency and thus contribute to improved patient outcomes. Additionally, we recommend assigning highly educated nurses to critical care settings requiring advanced care. To sustain the implementation of EBP, we also recommend providing effective clinical supervision. Furthermore, we propose a larger-scale evaluation of the impact of EBP in a variety of nursing specialisations, and we strongly advise evaluating the effects of EBP on patient outcomes. In addition, it is important to assess the factors or obstacles that may affect the application of EBP in nursing and its sustainability, as well as to identify the gap between education and practice using qualitative and quantitative research methods.

Data availability

The data collection tool (EBNCAC), the intervention training programme (EBNPTP), and the raw data of this study are available from the corresponding author upon request.

Saunders H, Vehviläinen-Julkunen K. Key considerations for selecting instruments when evaluating healthcare professionals’ evidence‐based practice competencies: a discussion paper. J Adv Nurs. 2018;74:2301–11. https://doi.org/10.1111/jan.13802

American Nurses Association (ANA). What is evidence-based practice in nursing? https://www.nursingworld.org/practice-policy/nursing-excellence/evidence-based-practice-in-nursing/ . Accessed 12 Feb 2024.

Davies C, Lyons C, Whyte R. Optimizing nursing time in a day care unit: quality improvement using lean six sigma methodology. Int J Qual Health Care. 2019;31(Supplement1):22–8. https://doi.org/10.1093/intqhc/mzz087

Dang D, Dearholt SL, Bissett K, Ascenzi J, Whalen M. Johns Hopkins evidence-based practice for nurses and healthcare professionals: model and guidelines. Sigma Theta Tau; 2021.

Nightingale F. Notes on matters affecting the health, efficiency and hospital administration of the British Army… by Florence Nightingale… Harrison and Sons; 1858.

D’Souza P, George A, Noronha J, Renjith V. Integration of evidence-based practice in nursing education: a novel approach. 2015.

Bassford C. Decisions regarding admission to the ICU and international initiatives to improve the decision-making process. Crit Care. 2017;21:1–3. https://doi.org/10.1186/s13054-017-1749-3

Jung YT, Kim MJ, Lee JG, Lee SH. Predictors of early weaning failure from mechanical ventilation in critically ill patients after emergency gastrointestinal surgery: a retrospective study. Medicine. 2018;97. https://doi.org/10.1097/MD.0000000000012741

Kobayashi H, Uchino S, Takinami M, Uezono S. The impact of ventilator-associated events in critically ill subjects with prolonged mechanical ventilation. Respir Care. 2017;62:1379–86. https://doi.org/10.1097/MD.0000000000012741

Fathy A, Abdelhafeez R, Abdel-Hady E-G, Abd Elhafez SA. Analysis of ventilator associated pneumonia (VAP) studies in Egyptian University hospitals. Egypt J Chest Dis Tuberculosis. 2013;62:17–25. https://doi.org/10.1016/j.ejcdt.2013.04.008

Elkolaly RM, Bahr HM, El-Shafey BI, Basuoni AS, Elber EH. Incidence of ventilator-associated pneumonia: Egyptian study. Egypt J Bronchol. 2019;13:258–66. https://doi.org/10.4103/ejb.ejb_43_18

Kollef MH, Chastre J, Fagon JY, François B, Niederman MS, Rello J, et al. Global prospective epidemiologic and surveillance study of ventilator-associated pneumonia due to pseudomonas aeruginosa. Crit Care Med. 2014;42:2178–87. https://doi.org/10.1097/CCM.0000000000000510

Xie J, Yang Y, Huang Y, Kang Y, Xu Y, Ma X, et al. The current epidemiological landscape of ventilator-associated pneumonia in the intensive care unit: a multicenter prospective observational study in China. Clin Infect Dis. 2018;67 suppl2:S153–61. https://doi.org/10.1093/cid/ciy692

Galal YS, Youssef MRL, Ibrahiem SK. Ventilator-associated pneumonia: incidence, risk factors and outcome in paediatric intensive care units at Cairo University Hospital. J Clin Diagn Res. 2016;10:SC06. https://doi.org/10.7860/JCDR/2016/18570.7920

Karakuzu Z, Iscimen R, Akalin H, Girgin NK, Kahveci F, Sinirtas M. Prognostic risk factors in ventilator-associated pneumonia. Med Sci Monit. 2018;24:1321. https://doi.org/10.12659/msm.905919

Liu M, Lin Y, Dai Y, Deng Y, Chun X, Lv Y, et al. A multi-dimensional EBP educational program to improve evidence-based practice and critical thinking of hospital-based nurses: development, implementation, and preliminary outcomes. Nurse Educ Pract. 2021;52:102964. https://doi.org/10.1016/j.nepr.2020.102964

Meligy BS, Kamal S, El Sherbini SA. Mechanical ventilation practice in Egyptian pediatric intensive care units. Electron Physician. 2017;9:4370. https://doi.org/10.19082/4370

Abdelhafez AI, Tolba AA. Nurses’ practices and obstacles to oral care quality in intensive care units in Upper Egypt. Nurs Crit Care. 2021. https://doi.org/10.1111/nicc.12736

Gunawan NPIN, Hariyati RTS, Gayatri D. Motivation as a factor affecting nurse performance in regional general hospitals: a factors analysis. Enferm Clin. 2019;29:515–20. https://doi.org/10.1016/j.enfcli.2019.04.078

TEAL Center Fact Sheet No. 11: Adult Learning Theories| Adult Education and Literacy| U.S. Department of Education. https://lincs.ed.gov/state-resources/federal-initiatives/teal/guide/adultlearning . Accessed 17 Apr 2023.

Chan AW, Tetzlaff JM, Altman DG, Laupacis A, Gøtzsche PC, Krleža-Jerić K, et al. SPIRIT 2013 statement: defining standard protocol items for clinical trials. Ann Intern Med. 2013;158:200–7. https://doi.org/10.7326/0003-4819-158-3-201302050-00583

Schulz KF, Altman DG, Moher D. CONSORT 2010 statement: updated guidelines for reporting parallel group randomised trials. 2010;1:100–7. https://doi.org/10.4103/0976-500X72352

Behzadi F, Khanjari S, Haghani H. Impact of an education program on the performance of nurses in providing oral care for mechanically ventilated children. Australian Crit Care. 2019;32:307–13. https://doi.org/10.1016/j.aucc.2018.06.007

AACN Levels of Evidence - AACN. https://www.aacn.org/clinical-resources/practice-alerts/aacn-levels-of-evidence . Accessed 8 Feb 2024.

Mussa CC, Gomaa D, Rowley DD, Schmidt U, Ginier E, Strickland SL. AARC clinical practice guideline: management of adult patients with tracheostomy in the acute care setting. Respir Care. 2021;66:156–69. https://doi.org/10.4187/respcare.08206

Boltey E, Yakusheva O, Kelly Costa D, Michigan AA. 5 nursing strategies to prevent ventilator-associated pneumonia. Am Nurse Today. 2017;12:42.

Smith V, Devane D, Nichol A, Roche D. Care bundles for improving outcomes in patients with COVID-19 or related conditions in intensive care– a rapid scoping review. Cochrane Database Syst Reviews. 2020;2020. https://doi.org/10.1002/14651858.CD013819

Lawshe CH. A quantitative approach to content validity. Pers Psychol. 1975;28:563–75. https://doi.org/10.1111/j.1744-6570.1975.tb01393.x

Gallagher-Ford L, Koshy Thomas B, Connor L, Sinnott LT, Melnyk BM. The effects of an intensive evidence‐based practice educational and skills building program on EBP competency and attributes. Worldviews Evid Based Nurs. 2020;17:71–81. https://doi.org/10.1111/wvn.12397

Kim JS, Gu MO, Chang H. Effects of an evidence-based practice education program using multifaceted interventions: a quasi-experimental study with undergraduate nursing students. BMC Med Educ. 2019;19:1–10. https://doi.org/10.1186/s12909-019-1501-6

Melnyk BM, Fineout-Overholt E. Evidence-based practice in nursing & healthcare: a guide to best practice. Lippincott Williams & Wilkins; 2022.

van der Goot WE, Keers JC, Kuipers R, Nieweg RMB, de Groot M. The effect of a multifaceted evidence-based practice programme for nurses on knowledge, skills, attitudes, and perceived barriers: a cohort study. Nurse Educ Today. 2018;63:6–11. https://doi.org/10.1016/j.nedt.2018.01.008

Lee CY, Wang WF, Chang YJ. The effects of evidence-based nursing training program on nurses’ knowledge, attitude, and behavior. New Taipei J Nurs. 2011;13:19–31.

Chu T-L, Wang J, Monrouxe L, Sung Y-C, Kuo C, Ho L-H, et al. The effects of the flipped classroom in teaching evidence based nursing: a quasi-experimental study. PLoS ONE. 2019;14:e0210606. https://doi.org/10.1371/journal.pone.0210606

Fleiszer AR, Semenic SE, Ritchie JA, Richer M, Denis J. Nursing unit leaders’ influence on the long-term sustainability of evidence‐based practice improvements. J Nurs Manag. 2016;24:309–18. https://doi.org/10.1111/jonm.12320

Gorsuch C (ret) PF, Gallagher Ford L, Koshy Thomas B, Melnyk BM, Connor, L. Impact of a formal educational skill-building program based on the ARCC model to enhance evidence‐based practice competency in nurse teams. Worldviews Evid Based Nurs. 2020;17:258–68. https://doi.org/10.1111/wvn.12463

Ramos-Morcillo AJ, Fernández‐Salazar S, Ruzafa‐Martínez M, Del‐Pino‐Casado R. Effectiveness of a brief, basic evidence‐based practice course for clinical nurses. Worldviews Evid Based Nurs. 2015;12:199–207. https://doi.org/10.1111/wvn.12103

Hashish A, Aly E, Alsayed S. Evidence-based practice and its relationship to quality improvement: a cross-sectional study among Egyptian nurses. Open Nurs J. 2020;14. https://doi.org/10.2174/1874434602014010254

Bellizzi S, Padrini S. Report of the satisfaction survey amongst public health services nurses in Port Said. BMC Nurs. 2021;20:1–5. https://doi.org/10.1186/s12912-021-00707-y

Al-Busaidi IS, Al Suleimani SZ, Dupo JU, Al Sulaimi NK, Nair VG. Nurses’ knowledge, attitudes, and implementation of evidence-based practice in Oman: a multi-institutional, cross-sectional study. Oman Med J. 2019;34:521. https://doi.org/10.5001/omj.2019.95

Stokke K, Olsen NR, Espehaug B, Nortvedt MW. Evidence based practice beliefs and implementation among nurses: a cross-sectional study. BMC Nurs. 2014;13:1–10. https://doi.org/10.1186/1472-6955-13-8

Mlambo M, Silén C, McGrath C. Lifelong learning and nurses’ continuing professional development, a metasynthesis of the literature. BMC Nurs. 2021;20:1–13. https://doi.org/10.1186/s12912-021-00579-2

Acknowledgements

We would like to express our gratitude to all nurses who participated in this study. We thank Mrs. Samar Hashem for the assistance in data collection.

Not applicable.

Open access funding provided by The Science, Technology & Innovation Funding Authority (STDF) in cooperation with The Egyptian Knowledge Bank (EKB).

Author information

Authors and affiliations.

Faculty of Nursing, Cairo University, 11562, Cairo, Egypt

Sameh Elhabashy

Graduate School of Biomedical and Health Sciences, Hiroshima University, 734-8551, Kasumi, Hiroshima, Japan

Michiko Moriyama

National Hepatology and Tropical Medicine Research Institute, Cairo, Egypt

Eman Ibrahim El-Desoki Mahmoud & Basem Eysa

Contributions

Conceptualization, S.E. and M.M.; methodology, S.E., M.M., E.E., and B.E.; data collection, E.E., B.E., and S.E.; investigation and formal analysis, S.E. and M.M.; writing (review and editing), S.E., M.M., E.E., and B.E. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Sameh Elhabashy .

Ethics declarations

Ethics approval and informed consent to participate.

This study was registered with ClinicalTrials.gov (Registration # NCT05721664) on 10/02/2023 and approved by the Research Ethics Committee for Human Subject Research at NHTMRI-IRB, Egypt (approval # 33/22). Participation in this study was entirely voluntary, and written informed consent was obtained before the commencement of the study. The authors attest that all participants were aware of the study’s purpose, risks, and potential benefits before providing written consent. Participants had the right to withdraw from the study at any time without any repercussions on their professional evaluations, while still receiving the traditional routine education as usual. Even though the control group did not receive the intervention being studied, they still received equitable education to ensure fairness, and no expected harm was identified. Also, we intend to provide the EBP training programme to the control group if the intervention proves to be effective, thus ensuring equal treatment for all participants in the study. The study was conducted with the participants’ rights and safety protected by adhering to local Egyptian laws, and all methods were carried out in accordance with the relevant guidelines and regulations of the Declaration of Helsinki. Every participant received a unique identification number, which protected their anonymity. Confidentiality was also ensured.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Elhabashy, S., Moriyama, M., Mahmoud, ED. et al. Effect of evidence-based nursing practices training programme on the competency of nurses caring for mechanically ventilated patients: a randomised controlled trial. BMC Nurs 23 , 225 (2024). https://doi.org/10.1186/s12912-024-01869-1

Received : 19 April 2023

Accepted : 14 March 2024

Published : 02 April 2024

DOI : https://doi.org/10.1186/s12912-024-01869-1

  • Evidence-based practice
  • Mechanical ventilation

BMC Nursing

ISSN: 1472-6955

research article using linear regression

IMAGES

  1. Linear Regression Explained. A High Level Overview of Linear…

    research article using linear regression

  2. Linear Regression model sample illustration

    research article using linear regression

  3. Linear Regression Basics for Absolute Beginners

    research article using linear regression

  4. Linear regression models showing ultimate tensile strength, toughness

    research article using linear regression

  5. (PDF) Linear Regression Analysis Part 14 of a Series on Evaluation of

    research article using linear regression

  6. 28 Linear Regression

    research article using linear regression

VIDEO

  1. Regression analysis with R: A step-by-step guide

  2. Using Combined Linear Regression and Principal Component Analysis for Unsupervised Change Detection

  3. Linear Regression Analysis in SPSS

  4. Linear Regression Analysis: Predicting Final Exam Grades from Midterm Scores |Chap#11|

  5. using linear regression model to predict in Stata

  6. Lecture 6: Multiple Regression

COMMENTS

  1. Linear Regression in Medical Research

    Linear regression is an extremely versatile technique that can be used to address a variety of research questions and study aims. Researchers may want to test whether there is evidence for a relationship between a categorical (grouping) variable (eg, treatment group or patient sex) and a quantitative outcome (eg, blood pressure).

  2. Linear Regression in Medical Research : Anesthesia & Analgesia

    KEY POINT: Linear regression is used to quantify the relationship between ≥1 independent (predictor) variables and a continuous dependent (outcome) variable. In this issue of Anesthesia & Analgesia, Müller-Wirtz et al 1 report results of a study in which they used linear regression to assess the relationship in a rat model between tissue ...

  3. The clinician's guide to interpreting a regression analysis

    In a linear regression model, the dependent variable must be continuous ... Linear regression in medical research. Anesth Analg. 2021;132:108-9. Article Google Scholar

  4. (PDF) Linear regression analysis study

    Linear regression is a statistical procedure for calculating the value of a dependent variable from an independent variable. Linear regression measures the association between two variables. It is ...

  5. Is It the Intervention or the Students? Using Linear Regression to

    Andrews and colleagues' (2011) use of linear regression is an important addition to the undergraduate STEM education literature, as it allows them to control for factors other than active learning—such as the instructor's position and years of teaching experience, class size, and student-rated course difficulty—that could influence ...

  6. Improving the Prediction of Total Surgical Procedure Time Using Linear

    This number is based on the research performed by van Veen-Berkx et al., which showed that 33% of SCT is generally a good approximation of anesthesia-controlled time (ACT). We then systematically tested all possible linear regression models to predict TPT using eSCT in combination with the other available independent variables.

  7. Simple linear regression

    The most basic regression relationship is a simple linear regression. In this case, E ( Y | X) = μ ( X) = β0 + β1X, a line with intercept β0 and slope β1. We can interpret this as Y having a ...

  8. The Use and Interpretation of Linear Regression Analysis in

    A clear understanding of linear regression analysis is of fundamental importance to quantitative research. In this editorial, I briefly discuss some of the key concepts; a comprehensive treatment is available in many textbooks, such as that by Kutner and associates.1 Linear regression is used to describe the relationship of a continuous outcome measure to 1 or more explanatory or predictor ...

  9. Review of guidance papers on regression modeling in statistical ...

    An article within a series was considered to be topic-relevant if the title included one of the following keywords: regression, linear, logistic, Cox, survival, Poisson, multivariable, multivariate, or if the title suggested that the main topic of the article was statistical regression modeling. Both raters decided on the topic-relevance of an ...

  10. Regression analysis of student academic performance using ...

    We have applied regression using deep learning and linear regression on the dataset. For such models with smaller datasets, to tackle the issue of overfitting is critical. Hence, the parameters can be tuned to deal with such issues. ... Journal of Educational Computing Research, 57(3), 547-570. Yadav, S. K., Bharadwaj ...

  11. Multiple linear regression

    When we use the regression sum of squares, $\mathrm{SSR} = \sum_i (\hat{y}_i - \bar{Y})^2$, the ratio $R^2 = \mathrm{SSR}/(\mathrm{SSR} + \mathrm{SSE})$ is the amount of variation explained by the regression model and in multiple regression is ...

  12. Anxiety, Affect, Self-Esteem, and Stress: Mediation and ...

    Multiple linear regression analyses were used in order to examine moderation effects between anxiety, stress, self-esteem and affect on depression. The analysis indicated that about 52% of the variation in the dependent variable (i.e., depression) could be explained by the main effects and the interaction effects ($R^2$ = .55, adjusted $R^2$ = .51 ...

  13. Full article: Using Simple Linear Regression to Assess the Success of

    The data and accompanying analysis presented in this paper provide both a meaningful example of data analysis using simple linear regression and a story of remarkable success of international cooperation addressing a global environmental problem. ... Report of the International Ozone Trends Panel—1988 (Report 18, Global Ozone Research and ...

  14. Healthcare

    COVID-19, or SARS-CoV-2, is considered one of the greatest pandemics of modern times. It affected people's health, education, employment, the economy, tourism, and transportation systems. It will take a long time to recover from these effects and return people's lives to normal. The main objective of this study is to investigate the various factors in health and food access, and ...

  15. (PDF) Multiple Regression: Methodology and Applications

    Abstract. Multiple regression is one of the most significant forms of regression and has a wide range of applications. The study of the implementation of multiple regression analysis in different ...

  16. ORIGINAL RESEARCH article

    Multiple linear regression analysis was employed to evaluate associations between ON tendency, weight control methods, and dietary variety. Females exhibited a higher ON tendency than males (14.4 ± 3.4 vs. 13.5 ± 3.7, p < 0.001, d = 0.25). ... This article is part of the Research Topic.

  17. Predicting the actual location of faults in underground optical

    The difficulty of tracing these underground faults mostly results in undue delay and loss of revenue. This research presents a machine learning approach to predict the actual location of a fiber cable fault in an underground optical transmission link. Linear regression in the Python scikit-learn library was used to predict the actual ...

  18. A short intro to linear regression analysis using survey data

    Regression is a statistical method that allows us to look at the relationship between two variables, while holding other factors equal. This post will show how to estimate and interpret linear regression models with survey data using R. We'll use data taken from a Pew Research Center 2016 post-election survey, and you can download the dataset ...

  19. Trends in antidiabetic drug use and expenditure in public hospitals in

    Linear regression was used to analyse the trends and magnitude of drug use and expenditure. Results: The overall trend in the use and expenditure of antidiabetic drugs was on the rise, with the use increasing from 1.04 in 2012 to 16.02 DID in 2021 and the expenditure increasing from 48.36 in 2012 to 496.42 million yuan in 2021 (from 7.66 to 76. ...

  20. Predicting macroinvertebrate average score per taxon (ASPT ...

    Based on the predictor values obtained from our published database (Iwasaki et al. 2022), the ASPT values for all 2925 WQM sites (i.e., environmental reference points) were estimated by using the best multiple linear regression model (Fig. 4). These ASPT values indicated that 29% of the WQM sites should be classified as "very good," 50% as ...

  21. Effect of evidence-based nursing practices training programme on the

    Multiple linear regression was applied as a regression model to test the effect of the study predictors on the endpoint third post-assessment. Finally, the effects of demographic characteristics on the baseline pre-assessment and endpoint of the third post-assessment were determined utilising a t-test and one-way ANOVA.
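
Two of the snippets above state formulas: the simple linear regression line E(Y | X) = β0 + β1X (item 7) and the ratio R2 = SSR / (SSR + SSE) (item 11). As a rough illustration only, the sketch below estimates the intercept and slope by ordinary least squares and then computes that ratio. The function names and the small data set are assumptions made for this example and do not come from any of the listed articles.

```python
def fit_simple_linear_regression(x, y):
    """Return least-squares estimates (b0, b1) for the line E(Y | X) = b0 + b1 * X."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, with at least 2 points")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must contain at least two distinct values")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    b1 = sxy / sxx               # slope
    b0 = y_mean - b1 * x_mean    # intercept
    return b0, b1


def r_squared(x, y, b0, b1):
    """Return R^2 = SSR / (SSR + SSE) for the fitted line b0 + b1 * x."""
    y_mean = sum(y) / len(y)
    fitted = [b0 + b1 * xi for xi in x]
    ssr = sum((fi - y_mean) ** 2 for fi in fitted)        # regression sum of squares
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # error sum of squares
    total = ssr + sse
    if total == 0:
        raise ValueError("R^2 is undefined when y has no variation")
    return ssr / total


# Made-up illustrative data, not taken from any study listed above.
x_demo = [1, 2, 3, 4, 5]
y_demo = [1, 3, 2, 5, 4]
b0_demo, b1_demo = fit_simple_linear_regression(x_demo, y_demo)
r2_demo = r_squared(x_demo, y_demo, b0_demo, b1_demo)
print(f"intercept={b0_demo:.2f}, slope={b1_demo:.2f}, R^2={r2_demo:.2f}")
```

For a least-squares line with an intercept, SSR + SSE equals the total sum of squares around the mean of y, so this ratio matches the usual definition of the coefficient of determination.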