Open Access

Peer-reviewed

Research Article

Developing and Optimising the Use of Logic Models in Systematic Reviews: Exploring Practice and Good Practice in the Use of Programme Theory in Reviews

Affiliation Evidence for Policy and Practice Information and Co-ordinating Centre (EPPI-Centre), UCL Institute of Education, University College London, London, United Kingdom

Affiliation Centre for Paediatrics, Blizard Institute, Queen Mary University of London, London, United Kingdom

  • Dylan Kneale, 
  • James Thomas, 
  • Katherine Harris

PLOS

  • Published: November 17, 2015
  • https://doi.org/10.1371/journal.pone.0142187

Logic models are becoming an increasingly common feature of systematic reviews, as is the use of programme theory more generally in systematic reviewing. Logic models offer a framework to help reviewers to ‘think’ conceptually at various points during the review, and can be a useful tool in defining study inclusion and exclusion criteria, guiding the search strategy, identifying relevant outcomes, identifying mediating and moderating factors, and communicating review findings.

Methods and Findings

In this paper we critique the use of logic models in systematic reviews and protocols drawn from two databases representing reviews of health interventions and international development interventions. Programme theory featured only in a minority of the reviews and protocols included. Despite drawing from different disciplinary traditions, reviews and protocols from both sources shared several limitations in their use of logic models and theories of change, and these were almost universally used solely to depict pictorially the way in which the intervention worked. Logic models and theories of change were consequently rarely used to communicate the findings of the review.

Conclusions

Logic models have the potential to be an integral aid throughout the systematic reviewing process. The absence of good practice around their use and development may be one reason for the apparent limited utility of logic models in many existing systematic reviews. These concerns are addressed in the second half of this paper, where we offer a set of principles in the use of logic models and an example of how we constructed a logic model for a review of school-based asthma interventions.

Citation: Kneale D, Thomas J, Harris K (2015) Developing and Optimising the Use of Logic Models in Systematic Reviews: Exploring Practice and Good Practice in the Use of Programme Theory in Reviews. PLoS ONE 10(11): e0142187. https://doi.org/10.1371/journal.pone.0142187

Editor: Paula Braitstein, University of Toronto Dalla Lana School of Public Health, CANADA

Received: February 1, 2015; Accepted: October 19, 2015; Published: November 17, 2015

Copyright: © 2015 Kneale et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data can be found on the Cochrane database of systematic reviews ( http://onlinelibrary.wiley.com/cochranelibrary/search/ ) and the 3ie database of systematic reviews ( http://www.3ieimpact.org/evidence/systematic-reviews/ ).

Funding: This work was supported by the National Institute for Health Research (NIHR) Collaboration for Leadership in Applied Health Research and Care North Thames at Barts Health NHS Trust. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health. The funders had no direct role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Researchers in academic institutions have historically measured their success by the impact that their research has within their own research communities, and have paid less attention to measuring its broader social impact. This presents a contradiction between the metrics of success of research and its ultimate extrinsic value [ 1 ], serving to expose a gulf between ‘strictly objective’ and ‘citizen’ scientists and social scientists [ 2 ]; the former believing that research should be objective and independent of external societal influences and the latter whose starting point is that science should benefit society. In recent years the need to link research within broader knowledge utilisation processes has been recognised, or at least accepted, by research councils and increasing numbers of researchers. While some forms of academic enquiry that push disciplinary boundaries or that represent ‘blue skies’ thinking remain important, despite being only distally linked to knowledge utilisation, there is little doubt as to the capacity of many other forms of ‘research’ to influence and transform policy and practice (see [ 3 , 4 ]). In many ways, systematic reviews and logic models are both born of such a need for greater knowledge transference and influence. Policy and practice-relevance is integral to most systematic reviews, with the systematic and transparent synthesis of evidence serving to enhance the accessibility of research findings to other researchers and wider audiences [ 5 , 6 ]. Through an explicit, rigorous and accountable process of discovery, description, quality assessment, and synthesis of the literature according to defined criteria, systematic reviews can help to make research accessible to policy-makers and other stakeholders who may not otherwise engage with voluminous tomes of evidence.
Similarly, one of the motivations in evaluation research and programme management for setting out programme theory through a logic model or theory of change was to develop a shared understanding of the processes and underlying mechanisms by which interventions were likely to ‘work’. In the case of logic models this is undertaken through pictorially depicting the chain of components representing processes and conditions between the initial inputs of an intervention and the outcomes; a similar approach also underlies theories of change, albeit with a greater emphasis on articulating the specific hypotheses of how different parts of the chain result in progression to the next stage. This shared understanding was intended to develop across practitioners and programme implementers, who may otherwise have very different roles in an intervention, as well as among a broader set of stakeholders, including funders and policy-makers.

As others before us have speculated, there is room for the tools of programme theory and the approach of systematic reviewing to converge, or more precisely, for logic models to become a tool to be employed as part of undertaking a systematic review [ 7 – 9 ]. This is not in dispute in this paper. However, even among audiences engaged in systematic research methods, we remain far from a shared understanding about the purpose and potential uses of a logic model, and even its definition. This has also left us without any protocol around how a logic model should be constructed to enhance a systematic review. In this paper we offer:

  • an account of the way in which logic models are used in the systematic review literature
  • an example of a logic model we have constructed to guide our own work and the documented steps taken to construct this
  • a set of principles for good practice in preparing a logic model

Here, we begin with an outline of the introduction of logic models into systematic reviews and their utility as part of the process.

The Use of Programme Theory in Review Literature

As understood in the program evaluation literature, logic models are one way of representing the underlying processes by which an intervention effects a change on individuals, communities or organisations. Logic models themselves have become an established part of evaluation methodology since the late 60s [ 10 ], although documentation that outlines the underlying assumptions addressing the ‘why’ and ‘for whom’ questions that define interventions is found in literature that dates back further, to the late 50s [ 11 ].

Despite being established in evaluation research, programme theory and the use of logic models remains an approach that is underutilised by many practitioners who design, run, and evaluate interventions in the UK [ 12 , 13 ]. Furthermore, there is a substantial degree of fragmentation regarding the programme theory approach used. Numerous overlapping approaches have been developed within evaluation literature, including ‘logic model’, ‘theory of change’, ‘theory of action’, ‘outcomes chain’, ‘programme theory’ and ‘program logic’ [ 11 , 13 ]. This very lack of consistency and agreement as to the appropriate tools to be used to conceptualise programme theory has been identified as one reason why, in a survey of 1,000 UK charities running interventions, four-fifths did not report using any formal program theory tool to understand the way in which their interventions effected change on their beneficiaries [ 12 ].

Conversely, within systematic reviewing so far, there has been some degree of consensus on the terminology used to represent the processes that link interventions and their outcomes (for example [ 7 , 8 , 9 ]). Many systematic reviews of health interventions tend to settle on a logic model as being the instrument of choice for guiding the review. Alternatively, reviews of international development interventions often include a theory of change, perhaps reflective of the added complexity of such interventions, which often take place on a community, policy or systems basis. Logic models and theories of change sit on the same continuum, although a somewhat ‘fuzzy’ but important distinction exists. While a logic model may set out the chain of activities that are needed or expected to lead to a chain of outcomes, a theory of change will provide a fuller account of the causal processes, indicators, and hypothesised mechanisms linking activities to outcomes. However, how reviews utilise programme theory is relatively unexplored.

Methods and criteria

To examine the use of logic models and theories of change in the systematic review literature, we examined indicative evidence from two sources. The first of these sources, the Cochrane database, publishes reviews that have a largely standardised format that follows guidelines set out in the Cochrane Handbook (of systematic reviews of interventions) [ 14 ]. Currently, the handbook itself does not include a section on the use of programme theory in reviews. Other guidance set out by individual Cochrane review groups (of which there are 53, each focussed on a specific health condition or area) may highlight the utility of using programme theory in the review process. For example, the Public Health Review Group, in their guidance on preparing a review protocol, describe the way in which logic models can be used to describe how interventions work and to justify a focus of a review on a particular part of the intervention or outcome [ 15 ]. Meanwhile, in the 2012 annual Cochrane methods-focussed report, the use of logic models was viewed as holding potential to ‘confer benefits for review authors and their readers’ [ 16 ] and logic models have also been included in the Cochrane Colloquium programme [ 17 ]. However, a definitive recommendation for use is not found, at the time of writing, in standard guidance provided to review authors. The second source, the 3ie database, includes reviews with a focus on the effectiveness of social and economic interventions in low- and middle-income countries. The database includes reviews that have been funded by 3ie as well as those that are not directly funded but that nevertheless fall within its scope and are deemed to be of sufficient rigour. While the use of programme theory does not form part of the inclusion criteria, its use is encouraged in good practice set out by 3ie [ 18 ] and a high degree of importance is attributed to its use in 3ie’s criteria for awarding funding for reviews [ 19 ].

To obtain a sample of publications, we searched the Cochrane Library for systematic reviews and protocols and for material that included either the phrase ‘logic model’ or ‘theory of change’ occurring anywhere, published between September 2013 and September 2014 (over this period a total of 1,473 documents were published in the Cochrane Library). We also searched the 3ie (International Initiative for Impact Evaluation) database of systematic reviews published in 2013, and manually searched publications for the phrases ‘logic model’ or ‘theory of change’. Both searches were intended to provide a snapshot of review activity through capturing systematic review publications occurring over the course of a year. For the 3ie database, it was not possible to search by month; we therefore searched for publications by calendar year and, to ensure that we obtained a full sample for a year, selected 2013 as our focus. For the Cochrane database, in order to obtain a more recent snapshot of publications to reflect current trends, we opted to search for publications occurring over roughly a year (13 months in this case). All reviews and protocols of reviews that fell within the search parameters were analysed.
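The phrase screening described above can be illustrated with a minimal sketch. It assumes each publication is available as plain text; the function name `mentions_programme_theory` and the sample sentence are hypothetical and are not the authors' workflow, which relied on the databases' own search tools and on manual searching. Simple substring matching will also catch ‘logic models’ but not the plural ‘theories of change’.

```python
import re

# The two phrases searched for in the review (taken from the text above).
PHRASES = ("logic model", "theory of change")


def mentions_programme_theory(text):
    """Return, for each phrase, whether it occurs anywhere in the text.

    Matching is case-insensitive and treats any run of whitespace
    (including line breaks) as a single space.
    """
    normalised = re.sub(r"\s+", " ", text).lower()
    return {phrase: phrase in normalised for phrase in PHRASES}


# Hypothetical sample sentence, for illustration only.
sample = "We present a Logic\nModel of the intervention."
print(mentions_programme_theory(sample))
```

A document would count towards the sample if either value in the returned dictionary is true.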

In the Cochrane database, over this period, four reviews and ten protocols were published that included the phrase ‘logic model’, while two protocols were published that included the phrase ‘theory of change’. It should be noted therefore that, certainly within reviews of health topics that adhere to Cochrane standards, neither tool has made a substantial impact in this set of literature. This is likely to reflect the mainly clinical nature of many Cochrane reviews: among the 8 publications that were published through the public health group (all of which were protocols), 5 included mention of programme theory. Within the 3ie database of international development interventions, 53 reviews and protocols were published in 2013 (correct as of December 2014), of which 24 included a mention of either a logic model or a theory of change.
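To put these counts on a common scale, the short sketch below converts them to percentages. The counts are those reported above; the helper `share` and the variable names are illustrative assumptions, and the percentages are simple arithmetic on the reported figures rather than results stated in the paper.

```python
# Counts reported in the text above.
cochrane_total = 1473            # documents published in the Cochrane Library over the period
cochrane_logic_model = 4 + 10    # reviews + protocols containing 'logic model'
public_health_total = 8          # Cochrane public health group publications
public_health_theory = 5         # of which mention programme theory
threeie_total = 53               # 3ie reviews and protocols published in 2013
threeie_theory = 24              # of which mention a logic model or theory of change


def share(part, whole):
    """Percentage of `whole` made up by `part`, rounded to two decimal places."""
    return round(100 * part / whole, 2)


print("Cochrane, 'logic model':", share(cochrane_logic_model, cochrane_total), "%")
print("Cochrane public health group:", share(public_health_theory, public_health_total), "%")
print("3ie:", share(threeie_theory, threeie_total), "%")
```

On these figures, fewer than 1 in 100 Cochrane documents mentioned a logic model, compared with a little under half of 3ie publications mentioning either tool.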

We developed a template for summarising the way in which logic models were used in the included protocols and systematic reviews based on the different stages of undertaking systematic reviews [ 6 ] and the potential usage of logic models as identified by Anderson and colleagues and Waddington and colleagues [ 7 , 18 ], who in the former case describe logic models as being tools that can help to (i) scope the review; (ii) define and conduct the review; and (iii) make the review relevant to policy and practice including in communicating results. These template constructs also reflected the way in which logic model usage was described in the publications, which was primarily shaped by reporting conventions for protocols and reports published in Cochrane and 3ie (although the format for the latter source is less structured). Criteria around the constructs included in the template were defined before two reviewers independently assessed the use of logic models within the reviews and protocols published (see S1 Table, Data Coding Template); the reviewers then met to discuss their findings. What this template approach cannot capture is the extent to which using a logic model shaped the conceptual thinking of the review teams, which, as discussed later in this paper, is one of the major contributions of using a logic model framework.

While the two databases cover different disciplines, allowing us to make comparisons between them, some may argue that, because both rigidly enforce methodological guidelines in the production of reviews, we are unlikely to encounter innovative approaches to the use of programme theory in the reviews and protocols included here. This is a legitimate concern and is a caveat of the results presented here, although even among these sources we observe considerable diversity in the use of programme theory, as we describe in our results.

Results: how are logic models and theories of change used in systematic reviews?

Looking first at publications from the Cochrane database and the two studies that included some component examining ‘theories of change’, the first of these described ‘theory of change’ in the context of synthesising different underlying theoretical frameworks [ 20 ] while the second used ‘theory of change’ in the context of describing a particular modality of theory [ 21 ]. Meanwhile, logic models were incorporated in a variety of ways; most frequently, they have been used as a shorthand way to describe how interventions are expected to work, outlining stages from intervention inputs through to expected outcomes at varying levels of detail ( Table 1 ).

https://doi.org/10.1371/journal.pone.0142187.t001

In around half of reports and protocols, the authors described in some form how they planned to use, or did in fact use, their logic model in the review. Nevertheless, in the remainder of publications (all of which were protocols), the logic model was presented as a schematic to describe how the intervention may work and was not explicitly referred to further as a tool to guide the review process. Two Cochrane review protocols explicitly outlined the way in which the logic model would be used as a reference tool when considering the types of intervention that would be within the scope of the review [ 25 , 26 ]; this was also described in a full review [ 24 ]. We identified three publications where it was suggested that the logic model was developed through an iterative process where consensus was developed across the review team [ 23 , 24 , 30 ].

Two Cochrane reviews described how the logic model was used to determine the subgroup analyses a priori [ 22 , 23 ], helping to avoid some of the statistical pitfalls of running post-hoc sub-group analyses [ 35 ]. For example, in their review of psychosocial smoking cessation interventions for pregnant women, Chamberlain and colleagues [ 22 ] developed their logic model from a synthesis of information collected before the review process began, with the express purpose of guiding analyses and stating sub-group analyses of interest. In Glenton and colleagues’ study, a revision to the logic model they originally specified, based on the review findings, was also viewed as being a useful tool to guide the sub-group analyses of future reviews. However, none of the protocols (as opposed to reviews) from the Cochrane database explicitly mentioned that the logic model had been used to consider which sub-group analyses should be undertaken. The review by Glenton et al. [ 23 ] and the protocol by Ramke et al. [ 33 ] provided the only examples where the logic model was to be revised iteratively during the course of the review based on review findings. Of the three Cochrane reviews included in Table 1 , Glenton and colleagues’ study [ 23 ] can be considered the one to have used a logic model most comprehensively as a dynamic tool to be refined and used to actively guide the synthesis of results in the review. The authors describe a novel use of the logic model in their mixed methods review as being a tool to describe mechanisms by which the intervention barriers and facilitators identified in the qualitative synthesis could impact on the outcomes assessed quantitatively in their review of programme effectiveness.

Among the studies extracted from the 3ie database, the terminology was weighted towards the ‘theories of change’ as opposed to ‘logic models’ (as expected, based on the guidance provided). Out of the 24 studies that were included ( Table 2 ), fourteen included a Logic Model and nine included a Theory of Change, while one report used both terms. Despite more studies including mention or an actual depiction of a theory of change or logic model, this body of literature shared the same limitations around the extent of use of programme theory as a tool integral to the review process. The majority of studies used a Theory of Change/Logic Model to describe their overall conceptual model or how they viewed the intervention or policy change under review would work, although this was reported at different stages of the review. Of the eleven protocols that were included, eight explicitly mentioned that they planned to return to their model at the end of the review, emphasising the use of programme theory tools as tools to help design the review and communicate the findings in this field. For example, in Willey and colleagues’ review of approaches to strengthen health services in developing countries [ 59 ], the Logic Model was updated at the end of the review to reflect the strength of the evidence discovered for each of the hypothesised pathways. Seven of the twenty protocols and studies described how a theory of change/logic model would be used to guide the review in terms of search strategy or more generally as a reference throughout the screening and other stages. 
Finally, two publications [ 48 , 52 ] described how they would use a theory of change as the basis for synthesising qualitative findings and two described how they would use a logic model/theory of change to structure sub-group meta-analyses in quantitative syntheses [ 48 , 58 ]; both of these latter protocols described how programme theory would be used at a number of key decision points in the review itself.

https://doi.org/10.1371/journal.pone.0142187.t002

Among the Cochrane and 3ie publications, few reviews or protocols described the logic model as being useful in the review initiation, description of the study characteristics or in assessing the quality and relevance of publications. Three Cochrane protocols and one Cochrane review described developing their own logic model by using existing logic models in full or by examining components of existing logic models or reviews; only one publication in our sample of international development systematic reviews did so. Most authors appear to develop their own logic models afresh, largely in the absence of guidance on good practice in the use of logic models. As Glenton and colleagues describe, there is “no uniform template for developing logic models, although the most common approach involves identifying a logical flow that starts with specific planned inputs and activities and ends with specific outcomes or impacts, often with short-term or intermediate outcomes along the way” ([ 23 ]; p13).

Developing a Logic Model: A Worked Example from School Based Asthma Interventions

The second aim of this paper is to provide an example of the development of a logic model in practice. The logic model we describe is one developed as part of a systematic review examining the impact of school based interventions focussing on the self-management of asthma among children. This review is being carried out by a multidisciplinary team comprising team members with experience of systematic reviewing as well as team members who are trialists with direct experience in the field of asthma and asthma management. Of particular interest in this review are the modifiable intervention factors that can be identified as being associated with improvements in asthma management and health outcomes. The evidence will be used directly in the design of an intervention that will be trialled among London school children. Our approach was to view the development of the logic model as an iterative process, and we present three different iterations (Figs 1 – 3 ) that we undertook to arrive at the model we included in our review protocol [ 60 ]. Our first model was based on pathways identified by one reviewer through a summary examination of the literature and existing reviews. This was then challenged and refined through the input of a second reviewer and an information scientist, to form a second iteration of the model. Finally, a third iteration was constructed through the input of the wider review team, which included both methodological specialists and clinicians. These steps are summarised in Box 1 and are described in greater detail in the sections below. The example provided here is one that best reflects a process driven logic model where the focus is on establishing the key stages of interest and using the identified processes to guide later stages of the review. 
An alternative approach to developing a logic model may be to focus more on the representation of systems and theory [ 61 ], although this approach may be better placed to support reviews of highly complex interventions (such as many of the international development reviews described earlier) or reviews that are more methodological than empirical in nature.

https://doi.org/10.1371/journal.pone.0142187.g001

https://doi.org/10.1371/journal.pone.0142187.g002

https://doi.org/10.1371/journal.pone.0142187.g003

Box 1. Summary of steps taken in developing the logic model for school based asthma interventions.

  • Synthesis of existing logic models in the field
  • Reviewer 1 identified distal outcomes
  • Working backwards, reviewer 1 then identified the necessary preconditions to reach distal outcomes; from distal outcomes intermediate and proximal level outcomes were then identified
  • Once outcomes had been identified, the outputs were defined (necessary pre-conditions but not necessarily goals in themselves); on completion the change part of the model was complete in draft form
  • Modifiable processes were then specified; these were components that were not expected to be present in each intervention included in the review
  • Continuing to work backwards, intervention inputs (including core pedagogical inputs) were then specified. These were inputs that were expected to be present in each intervention included in the review, although their characteristics would differ between studies
  • In addition, external factors were identified as were potential moderators
  • Reviewers 1 and 2 then worked together to redevelop the model, paying particular attention to clarity, the conceptual soundness of groupings and the sequencing of aspects
  • The review team and external members were asked to comment on the second iteration, and later agree a revised version 3. This version would provide the structure for some aspects of quantitative analyses and highlight where qualitative analyses were expected to illuminate areas of ambiguity.
  • The final version was included in the protocol with details on how it would be used in later stages of the review, including the way in which it would be transformed, based on the results uncovered, into a theory of change.
  • Consider undertaking additional/supplementary steps12.
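The backwards-working order summarised in Box 1 can be sketched as a small data structure. This is only an illustration under stated assumptions: the class `LogicModel`, its method names and the stage labels are hypothetical simplifications of the box, and the example entries are outcomes and outputs named elsewhere in this paper, not the full model shown in Figs 1 – 3.

```python
from dataclasses import dataclass, field

# Stage labels paraphrased from Box 1, listed in causal (left-to-right) order.
CAUSAL_ORDER = [
    "intervention inputs",
    "modifiable processes",
    "outputs",
    "proximal outcomes",
    "intermediate outcomes",
    "distal outcomes",
]


@dataclass
class LogicModel:
    """Hypothetical container for a process-driven logic model."""
    stages: dict = field(default_factory=lambda: {name: [] for name in CAUSAL_ORDER})
    moderators: list = field(default_factory=list)
    external_factors: list = field(default_factory=list)

    def add(self, stage, item):
        """Record a component under one of the named stages."""
        if stage not in self.stages:
            raise KeyError(f"unknown stage: {stage}")
        self.stages[stage].append(item)

    def drafting_order(self):
        """Stages in the order they were drafted: backwards from distal outcomes."""
        return list(reversed(CAUSAL_ORDER))

    def causal_chain(self):
        """Non-empty stages in causal order, from inputs through to distal outcomes."""
        return [(name, self.stages[name]) for name in CAUSAL_ORDER if self.stages[name]]


model = LogicModel()
model.add("distal outcomes", "quality of life")
model.add("intermediate outcomes", "school attendance")
model.add("outputs", "knowledge of asthma")
print(model.drafting_order())
print(model.causal_chain())
```

Keeping the drafting order separate from the causal order mirrors the distinction in Box 1 between how the model was built (outcomes first) and how it is read (inputs first).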

Step 1, examination and synthesis of existing logic models

The first step we took in developing our logic model was to familiarise ourselves with the existing literature around the way in which improved self-management of asthma leads to improved outcomes among children and the way in which school-based learning can help to foster these. Previous systematic reviews around this intervention did not include a logic model or develop a theory of change but did help to identify some of the outcomes of self-management educational interventions. These included improved lung function, self-efficacy, absenteeism from school, days of restricted activity, and number of visits to an emergency department, among others (see [ 62 ]). A logic model framework helped to order these sequentially and separate process outputs from proximal, intermediate and distal outcomes. Other studies also pointed towards the school being a good site for teaching asthma self-management techniques among children for several reasons, including the familiar environment for learning that it provides for children, and the potential for identification of large numbers of children with asthma at a single location [ 63 – 65 ]. Some individual studies and government action plans also included logic models showing how increased education aimed at improving self-management skills was expected to lead to improvements in asthma outcomes (for example [ 66 , 67 , 68 ]). This evidence was synthesised and was found to be particularly useful in helping to identify some of the intervention processes taking place that could lead to better asthma outcomes, although these were of varying relevance to our specific situation of interest in terms of school-based asthma interventions, as well as being very heavily shaped by local context. We adopted an aggregative approach to the synthesis of the evidence at this point, including all information which was transferable across contexts [ 69 ].
After examining the available literature, the first reviewer was able to proceed with constructing a first draft of the logic model.

Step 2, identification of distal outcomes

Reviewer 1 started with the identification of the very distal outcomes that could change as a result of school-based interventions aimed at improving asthma self-management. From these outcomes the reviewer worked backwards and identified the necessary pre-conditions to achieving these to develop a causal chain. Identifying this set of distal outcomes was analogous to questioning why a potential funding, delivery or hosting organisation (such as a school or health authority) may want to fund such an intervention–the underlying goals of running the intervention. In this case, these outcomes could include potential improvements in population-level health, reductions in health budgets and/or potential increases in measures of school performance ( Fig 1 ). After identifying these macro-level outcomes, we identified the distal child level outcomes which were restricted to changes in children’s outcomes that would only be perceptible at long-term follow-up. These included changes in quality of life and academic achievement, which we identified as being modifiable only after sustained periods of behaviour change and a reduction in the physical symptoms of asthma.

Step 3, identification of intermediate and proximal outcomes

Next, reviewer 1 outlined some of the intermediate outcomes, those changes necessary to achieve the distal outcomes. Here our intermediate changes in health were based on observations of events, including emergency admissions and limitations in children’s activity over a period of time (which we left unspecified). The only intermediate educational outcome was school attendance, and we identified this as being the only (or at least main) pathway through which we may observe (distal) changes in academic achievement as a result of school-based asthma interventions. Working backwards, our proximal outcomes were defined as those pre-conditions necessary to achieve our intermediate outcomes; these revolved around health symptoms and behaviour around asthma and asthma management. We expect these to be observable shortly after the intervention ends (although they may be measured at long-term follow-up). The intention is for the systematic review to be published as a Cochrane review, which requires the identification of 2–3 primary outcomes and approximately 7 outcomes in total; in our case this helped to rationalise the number of outcomes we included, which, left unbounded, could have included many more.

Step 4, identification of outputs

Finally in the ‘change’ section of the logic model (see Fig 2 ), we then specified the outputs of the intervention, which we define here as those aspects of behaviour or knowledge that are the direct focus for modification within the activities of the intervention, but are unlikely to represent the original motivations underlying the intervention. Our outputs are those elements of the intervention where changes will be detectable during the course of the intervention itself. Here increased knowledge of asthma may be a pre-condition for improved symptomology and would have a direct focus within intervention activities (outputs), but increased knowledge in itself was not viewed as a likely underlying motivation of running the intervention. A different review question may prioritise improved knowledge of a health condition, and view increased knowledge as an end-point in itself.

Step 5, specification of modifiable intervention processes

To aid in later stages of the review we placed the modifiable design characteristics in sequence after intervention inputs, as we view these as variants that can occur once the inputs of the intervention have been secured. Separating these from standard intervention inputs was a useful stage when it came to considering the types of process data we might extract and in designing data extraction forms. The number of modifiable design characteristics of the intervention specified was enhanced by examining some of the literature described earlier as well as through discussions with members of the review team who were most involved with designing the intervention that will take place after the review.

Step 6, specification of intervention inputs

Standard intervention inputs were specified, as were the ‘core elements of the intervention’. These core elements represent the pedagogical focus of the intervention and form some of the selection criteria for studies that will be included, although studies will differ in terms of the number of core elements that are included as well as the way in which these are delivered. Studies that do not include any of these core elements were not considered to be interventions that focus on the improvement of asthma self-management skills.

Step 7, specification of intervention moderators including setting and population group

Finally, child-level moderators (population characteristics) and the characteristics of the schools receiving the intervention (context/setting characteristics) were specified. Specifying these early on in the logic model helped to identify the types of subgroup analyses we would conduct to investigate any potential sources of heterogeneity.

Step 8, share initial logic model, review and redraft

Reviewer 1 shared the draft logic model with a second member of the team. Of particular concern in this step was to establish consensus around the clarity and conceptual soundness of the groupings, the sequencing of the change part of the model, and the balance between meeting the design needs of the intervention and the generalisability of the findings to other settings. With respect to the latter, the second reviewer commented that specifying reductions in health budgets reflected our own experiences of the UK context, and may not be appropriate for all healthcare contexts likely to be included in our review. Therefore, Fig 2 in our second iteration only acknowledges that macro-level (beneficial) changes may follow from changes in the distal outcomes of children, but we do not specify what these might be. At this stage it was helpful to have the first reviewer working alongside a second member of the review team who had greater expertise and knowledge of the design and delivery of health-based interventions and who was working directly alongside schools in the preliminary stages of data collection around the intervention itself. Figs 1–3 show the development of the logic model across iterations. This second iteration had a clearer distinction between the action and change aspects of the logic model, and had refined the number of outcomes that would be explicitly outlined, which had implications for the search strategy. The action part of the model was also altered to better differentiate parts of the model that represented implementation processes from parts of the model that represented implementation measures or metrics.

Step 9, share revised logic model with wider group, review and redraft

The draft logic model was shared with the wider review team and with an information scientist, with comments sought particularly around those aspects of step 8 that had been the source of further discussion. The review team were asked specifically for input on the content of the different sections of the logic model, the sequencing of different parts, and the balance between meeting the design needs of the intervention and the generalisability of the findings to other settings. Input was sought from an information scientist external to the review to ensure that the model adequately communicated and captured the body of literature that the review team aimed to include, and this helped to make certain that the model was interpretable to those who were not directly part of the review team. For the third (and final) iteration, views were also sought about whether the main moderating factors across which the team might investigate sources of heterogeneity in meta-analyses were included and, for those that would be identified through earlier qualitative analyses, whether these were adequately represented. Once these views were collated, the third iteration was produced and agreed. The third iteration better represents the uncertainty in terms of processes that may be uncovered during qualitative analyses and the way in which these will be used to investigate heterogeneity in subgroup analyses in the quantitative (meta) analysis.

Step 10, present the final logic model in the protocol

The final version was included in the protocol with details on how it would be used in later stages of the review. At the end of the review, we intend to return to the logic model and represent those factors that are associated with successful interventions from the quantitative and qualitative analyses in a theory of change.

Potential additional and supplementary steps that could be taken elsewhere

Greater consultation or active collaboration with additional stakeholders on the logic model may be beneficial, particularly for complex reviews involving system-based interventions where different stakeholders will bring different types of knowledge [ 8 , 70 ]. There may also be merit in this approach at the outset in situations where the review findings are intended to inform on an intervention in a known setting, and to ensure that the elements that will enhance the applicability or transferability of the intervention are represented. In the example given here, as there were members of the review team who were both taking part in the review and the design of the intervention, there was less of a need to undertake this additional stage of consultation, and elements such as the presence or change in school asthma policies were included in the logic model to reflect the interests of the intervention team.

Produce further iterations of the logic model : When there is less consensus among the review team than was the case here, when there are greater numbers of stakeholders being consulted, or when the intervention itself is a more complex systems-based intervention, there may be a need to produce multiple further iterations of the logic model. In programme evaluation, logic models are considered to be iterative tools that reflect cumulative knowledge accrued during the course of running an intervention [ 11 ]. While exactly the same principle does not apply in the case of systematic reviews, a greater number of iterations may be necessary in order to produce a logic model to guide the review, for example to reflect the different forms of knowledge different stakeholders may bring. Where there are parts of the logic model that are unclear at the outset of a review, or in situations where there is an insurmountable lack of consensus and only the review findings can help to clarify the issue, these can be represented in a less concrete way in the logic model, for example the processes to be examined in our own review in Fig 3.

Multiple logic models : There may also be a need to construct multiple logic models for large interventions to reflect the complexity of the intervention, although it may also be the case that such a large or complex question may be unsuitable for a single review but would instead fall across multiple linked reviews. However, where the same question is being examined using different types of evidence (mixed method review), multiple logic models representing the same processes in different ways could be useful–for example a logic model focussing on theory and mechanistic explanations for processes in addition to a logic model focussing on empirical expected changes may be necessary for certain forms of mixed methods reviews (dependent on the research question). In other cases, the review may focus on a particular intervention or behaviour change mechanism within a (small) number of defined situations–for example a review may focus on the impact of mass media to tackle public health issues using smoking cessation, alcohol consumption and sexual health as examples. The review question may be focussed on the transferability of processes between these public health issues but in order to guide the review itself it may be necessary to produce a separate logic model for each public health issue, which could be synthesised into a unified theory of change for mass media as an intervention at a later stage.

Using the logic model in the review.

The logic model described in this paper is being used to guide a review that is currently in progress and as such we are not able to give a full outline of its potential use. Others in the literature before us have described logic models as having added value to systematic reviewers when (i) scoping the review (refining the question; deciding on lumping or splitting a review topic; identifying review components); (ii) defining and conducting the review (identifying the review study criteria; guiding the search strategy; rationale behind surrogate outcomes; justifying sub-group analyses); (iii) making the review relevant to policy and practice (structuring the reporting of results; illustrating how harms and feasibility are connected with interventions; interpreting the results based on intervention theory) [ 7 , p35]. Others still have emphasised the utility of a logic model framework in helping reviewers to think conceptually through illustrating the influential relationships and components from inputs to outcomes, suggesting that logic models can help reviewers identify testable hypotheses to focus upon [ 8 , 71 ]; they have also speculated that logic models could help to identify the parameters of a review as an addition to the well-established PICO framework [ 8 , 9 ].

Our own experience to date of using the logic model ( Fig 3 ) in a current systematic review is summarised in Table 3 below, which focuses on additions to the uses suggested elsewhere. While the additional description below provides an indication as to the potential added value of using a logic model, the use of a logic model has not been without its challenges. Firstly, the use of logic models is relatively novel within the systematic review literature (and even in program theory literature, as discussed earlier), and initially there was some apathy towards the logic model, even within the review team. Secondly, while we agree that a logic model could be used to depict the PICO criteria [ 8 , 9 ], our own logic model did not include a representation of ‘C’, the comparator, as this was the usual care provided across different settings, which could vary substantially. Others may also experience difficulties in representing the comparison element in their logic models. Finally, none of the utilities of the logic model described here and elsewhere is unique to, or contingent on, using a logic model; rather, using a logic model accelerates these processes and brings about a shared understanding more quickly. For example, the development of exclusion criteria is not contingent on having a logic model, but the logic model facilitates the process of identifying inclusion and exclusion criteria more rationally, and helps depict some of the reasoning underlying review decisions. Practically, the logic model has its advantages in aiding the initial conceptual thinking around the scope of interventions, in aiding decisions about individual parts of the intervention within the context of the intervention as a whole, in its flexibility and its use as a reference tool in decision-making, and in communication across the review team.

https://doi.org/10.1371/journal.pone.0142187.t003

Developing Elements of Good Practice for the Use of Logic Models and Theories of Change

The earlier analysis suggests that many systematic review authors tend to use programme theory tools to depict the conceptual framework pictorially, but may not view either a logic model or theory of change as integral review tools. To prevent logic models and theories of change being included in reviews and protocols as simply part of a tick-box exercise, there is a need to develop good practice on how to use programme theory in systematic reviews, as well as developing good practice on how to develop a logic model/theory of change. This is not to over-complicate the process or to introduce rigidity where rigidity is unwelcome, but to maximise the contribution that programme theory can make to a review.

Here we introduce some elements of good practice that can be applied when considering the use of logic models. These are derived from (i) the literature around the use of logic models in systematic reviews [ 7 , 8 , 17 ]; (ii) the broader literature around the use of theory in systematic reviews [ 72 , 73 ]; (iii) our analyses contrasting the suggested uses of logic models in systematic reviews with their actual use (section 1); (iv) the use of logic models in program theory literature [ 11 , 13 ]; as well as broader conceptual debates in systematic review and program theory literature. These principles draw from the work of Anderson and colleagues and Baxter and colleagues as well as our own experiences, but are unlikely to represent an exhaustive list as there is a need to maintain a degree of flexibility in the development and use of logic models. Our main concern is that logic models in the review literature appear to be used in such a limited way that a loose set of principles, such as those proposed here, can be applied with little cost in terms of imposing rigidity but with substantial impact in terms of enhanced transparency in use and benefit to the review concept, structure and communication.

A logic model is a tool and as such its use needs to be described

Logic models provide a framework for ‘thinking’ conceptually before, during and at the end of the review. In addition to the uses highlighted earlier by Anderson and Waddington [ 7 , 18 ], our own experiences of using the logic model in our review have emphasised the utility of a logic model in: (i) clarifying the scope of the review and assessing whether a question was too broad to be addressed in a single review; (ii) identifying points of uncertainty that could become focal points of investigation within the review; (iii) clarifying the scope of the study, particularly in distinguishing between different forms of intervention study design (in our own case between a process evaluation and a qualitative outcomes evaluation); (iv) ensuring that there is theoretical inclusivity at an early stage of the review; (v) clarifying inclusion and exclusion criteria, particularly with regard to core elements of the intervention; (vi) informing the search strategy with regard to the databases and scholarly disciplines upon which the review may draw literature; (vii) serving as a communication tool and reference point when making decisions about the review design; and (viii) serving as a project management tool in helping to identify dependencies within the review. Sharing the logic model with an information scientist was also a means of communicating the goals of the review itself, while examination of existing logic models was found to be a way of developing expertise around how an intervention was expected to work. Use of a logic model has also been linked with a reduced risk of type III errors occurring, helping to avoid conflation between errors in the implementation and flaws in the intervention [ 17 , 74 ].

Summarising our own learning around the uses of the logic model and the uses identified by others (primarily Anderson) for their use as a tool in systematic reviews in Table 4 highlights that a logic model may have utility primarily at the beginning and end of the systematic review, and may be a useful reference tool throughout.

https://doi.org/10.1371/journal.pone.0142187.t004

Our analyses suggest that the use of logic models has faltered and our earlier review of the systematic review literature highlighted that (i) logic models were infrequently used as a review tool and that the extent of use is likely to reflect the conventions of different disciplines; and (ii) where logic models were used, they were often used in a very limited way to represent the intervention pictorially. Often, they did not ostensibly appear to be used as tools integral to the review. There remains the possibility that some of the reviews and protocols featured earlier simply did not report the extent to which they used the logic model, although given that this is both a tool for thinking conceptually and a communication tool, it could be expected that the logic model would be referred to and referenced at different points in the review process. Logic models can be useful review tools, although the limited scope of use described in the literature could suggest that they are in danger of becoming a box-ticking exercise included in reviews and protocols rather than methodological aids in their own right.

Terminology is important: Logic models and theories of change

We earlier stated that ‘theories of change’ and ‘logic models’ were used interchangeably by reviewers, largely dependent on the discipline in which the reviews are conducted. However, outside the systematic review literature, a distinction often exists. Theories of change are often used to denote complex interventions where there is a need to identify the assumptions of how and why sometimes disparate elements of large interventions may effect change; they are also used for less complex interventions where assumptions of how and why program components effect change are pre-specified. Theories of change can also be used to depict how entirely different interventions can lead to the same set of outcomes. Logic models on the other hand are used to outline program components and check whether they are plausible in relation to the outcomes; they do not necessitate the underlying assumptions to be stated [ 11 , 75 ]. This distinction fits in well with the different stages of a systematic review. A logic model provides a sequential depiction of the components of interventions and their outcomes, but not necessarily the preconditions that are needed to achieve these outcomes, or the relative magnitude of different components. Given that few of the programme theory tools that are used in current protocols and reviews are derived from or build upon existing tools, for most systematic reviews that do not constitute whole system reviews of complex intervention strategies, or for reviews that are not testing a pre-existing theory of change, developing a Logic Model initially may be most appropriate. This assertion does not mean that systematic reviews should be atheoretical or ‘theory-lite’, and different possible conceptual frameworks can be included in Logic Models. However, the selection of a single conceptual framework upfront, as is implicitly the case when developing a Theory of Change, may not represent the diversity of disciplines that reviewers are likely to encounter. Except in the cases outlined earlier around highly complex/systems-based interventions (mainly restricted to development studies literature), theories of change are causal models that are most suitable when developed through the evidence gathered in the review process itself.

Logic models can evolve into theories of change

Once a review has identified the factors that are associated with different outcomes, their magnitude, and the underlying causal pathways linking intervention components with different outcomes, this evidence can in some cases be used to update a logic model and construct a theory of change. We can observe examples in the literature where review evidence has been synthesised to map out the direction and magnitude of evidence in the literature (see [ 8 ], although in this case, the resulting model was described as a ‘Logic Model’ and not a ‘Theory of Change’), and this serves as a good model for all reviews. Programme theory can effectively be used to represent the change in understanding developed as a result of the review, and even in some cases the learning acquired during a review, although this is not the case for all reviews and there may be some where this approach is unsuitable or is otherwise not possible. A logic model can be viewed iteratively as a preliminary for constructing a theory of change at the end of the review, which in turn forms a useful tool to communicate the findings of the review. However, some reviewers may find little to update in the logic model in terms of the theory of the intervention or may otherwise find that the evidence around the outcomes and process of the intervention is unclear among the literature as it stands. There may also be occasions where reporting conventions for disciplines or review groups may preclude updating the logic model on the basis of the findings of the review.

Programme theory should not be developed in isolation

In our exploration of health-based and international development reviews, we observed just one example where the reviewers described a Logic Model as having been developed through consensus within the review team [ 24 ]. Other examples are found in the literature, where logic models or theories of change have been developed with stakeholders, for example Baxter and colleagues [ 8 ; p3] record that ‘following development of a draft model we sought feedback from stakeholders regarding the clarity of representation of the findings, and potential uses’. These examples are clearly in the minority in the systematic review literature, although most programme theory described in the evaluation literature is clear that models should be developed through a series of iterations taking into account the views of different stakeholders [ 11 ]. While some of this effect may be due to reporting, as it is likely that at least some of the models included in Tables 1 and 2 were developed having reached a consensus, it is nevertheless important to highlight that a more collaborative approach to developing models could bring benefits to the review itself. Given that systematic review teams are often interdisciplinary in nature, and can be engaging with literature that is also interdisciplinary, programme theory should reflect the expertise and experience of all team members as well as that of external stakeholders if appropriate. Programme theory is also used as a shorthand communication tool, and the process of developing a working theoretical model within a team can help to simplify the conceptual model into a format that is understandable within review teams, but which can also be used to involve external stakeholders, as is often necessary in the review process [ 70 ].

A logic model should be used as an Early Warning System

Logic models have their original purpose as a planning and communication tool in the evaluation literature. However, in systematic reviews, they can also provide the first test of the underlying conceptual model of the review. If a review question cannot be represented in a Logic Model (or Theory of Change in the case of highly complex issues), this can provide a signal that the question itself may be too broad or needs further refining. A series of Logic Models may better represent the underlying concepts and the overall research question driving the review, and this may also reflect a need to undertake a series of reviews rather than a single review, particularly where the resources available may be more limited [ 73 ]. Alternatively, as is often the case with complex systems-based interventions (as encountered in many reviews of international development initiatives published on the 3ie database), the intervention may be based on a number of components, which could be represented individually through logic models, the mechanisms of which are relatively well-established and understood, and a theory of change may better represent the intervention. The tool can also help the reviewer to assess the degree to which the review may focus on collecting evidence around particular pathways of outcomes, and the potential contribution the review can make to the field, helping to establish whether the scope of the review is broad and deep (as might be the ideal in the presence of sufficient resource), or narrower and more limited in scope and depth [ 73 ]. This can also help to manage the expectations of stakeholders at the outset. Logic models can be used as the basis for developing a systematic review protocol, and should be considered living documents, subject to several iterations during the process of developing a protocol as the focus of the review is clarified. They can both guide and reflect the review scope and focus during the preparation of a review protocol.

There is no set format for a Logic Model (or Theory of Change), but there are key components that should be considered

Most Logic Models, at a minimum, depict the elements included in the PICO/T/S framework (patient/problem/population; intervention; comparison/control/comparator; outcomes and time/setting) [ 76 ]. However, a logic model represents a causal chain of events resulting from an intervention (or from exposure, membership of a group or other ‘treatment’); therefore it is necessary to consider how outcomes may precede or succeed one another sequentially. Dividing outcomes into distal (from the intervention), intermediate or proximal categories is a strategy that is often used to help identify sets of pre-existing conditions or events needed in order to achieve a given outcome. The result is a causal chain of events including outputs and outcomes that represent the pre-conditions that are hypothesised to lead to distal outcomes. Outcomes are only achieved once different sets of outputs are met; these may represent milestones of the intervention but are not necessarily success criteria of the intervention in themselves (for example in Fig 3 ). In the case of reviews of observational studies, the notion of outputs (and even interventions and intervention components) may be less relevant, and may instead be better represented by ‘causes’ and potential ‘intervention points’ [ 71 ] that are also structured sequentially to indicate which are identified as necessary pre-conditions for later events in the causal chain.
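For readers who find it helpful to see the sequential structure spelled out, the following is a minimal sketch in Python of a causal chain of this kind. It is an illustration only and not the tool used in our review: the class, method and stage names are hypothetical, the four example elements are simplified from the asthma example described earlier, and the single linear pathway is an assumption made for brevity.

```python
from dataclasses import dataclass, field

# Stage order follows the sequence described in the text.
# The stage labels themselves are illustrative, not a standard vocabulary.
STAGES = ["output", "proximal", "intermediate", "distal"]


@dataclass
class LogicModel:
    stages: dict = field(default_factory=dict)  # element name -> stage
    links: dict = field(default_factory=dict)   # element name -> elements it leads to

    def add(self, name, stage):
        if stage not in STAGES:
            raise ValueError(f"unknown stage: {stage}")
        self.stages[name] = stage
        self.links.setdefault(name, [])

    def link(self, cause, effect):
        # A pre-condition must sit earlier in the chain than what it leads to.
        if STAGES.index(self.stages[cause]) >= STAGES.index(self.stages[effect]):
            raise ValueError(f"{cause!r} cannot be a pre-condition of {effect!r}")
        self.links[cause].append(effect)

    def preconditions(self, name):
        """Return every element hypothesised to lie upstream of `name`."""
        direct = [c for c, effects in self.links.items() if name in effects]
        found = []
        for c in direct:
            for p in self.preconditions(c) + [c]:
                if p not in found:
                    found.append(p)
        return found


# Hypothetical, simplified pathway based on the worked example in this paper.
model = LogicModel()
model.add("increased knowledge of asthma", "output")
model.add("improved asthma management", "proximal")
model.add("school attendance", "intermediate")
model.add("academic achievement", "distal")
model.link("increased knowledge of asthma", "improved asthma management")
model.link("improved asthma management", "school attendance")
model.link("school attendance", "academic achievement")

print(model.preconditions("academic achievement"))
```

Calling `preconditions` on a distal outcome lists the outputs and earlier outcomes hypothesised to precede it, while `link` rejects any pathway that runs against the stage order. A real logic model would typically contain several parallel pathways as well as moderators and contextual factors, which this sketch deliberately omits.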

Many of the elements described above refer to the role of the intervention (or condition) in changing the outcomes for an individual (or other study unit), which can also be referred to as a theory of change; the elements of the causal chain that reflect the intervention and its modifiable elements are known as the theory of action [ 11 , 77 ]. Within the theory of action, the modifiable components of the intervention needed to achieve later outputs and outcomes, such as the study design, resources, and process-level factors such as quality and adherence, are usually included. Other modifiable elements, including population or group-level moderators, can also be included, and even the underlying conceptual theories that may support different interventions may be included as potential modifiers. Finally, some of the contextual factors that may reflect the environments in which interventions take place can also be represented. Within our example in Fig 3 , these include school-level factors such as the intake of the school and its local neighbourhood, as well as broader health service factors and local health policies. For some reviews and studies, the influence of these contextual factors may itself be the focus of the review.

Summary and Conclusions

In the past, whether justified or not, a critique often levelled at systematic reviews has been the absence of theory when classifying intervention components [ 78 ]. The absence of theory in reviews has clear negative consequences for the robustness of the findings and the applicability and validity of recommendations and, in the absence of mechanistic theories of why interventions work, limits the generalisability of the findings. A number of systematic reviewers are beginning to address this critique directly through considering the methodological options available when reviewing theories [ 69 , 78 , 79 ], while others have gone further through exploring the role that differences in taxonomies of theory hold in explaining effect sizes [ 78 , 80 , 81 ]. Nevertheless, despite the benefits of, and need for, using theory to guide secondary data analysis, reviewers may be confronted by several situations where the conceptualisation of the theoretical framework itself may be problematic. Such instances include those where there may be little available detail on the theories underlying interventions, or where competing theories or disciplinary differences in the articulation of theories for the same interventions exist (requiring synthesis) [ 79 ]; or where a review topic may necessitate the grouping and synthesis of very different interventions to address a particular research question; or, more fundamentally, where there is a need to consider alternative definitions, determinants and outcomes of interventions that go beyond representing these within ‘black boxes’. In common with others before us [ 7 , 8 ], in this paper we view a logic model as a tool to help reviewers to overcome these challenges and meet these needs through providing a framework for ‘thinking’ conceptually.

Much of this paper examines the application of a logic model to ‘interventionist’ systematic reviews, and we have not directly considered their use in systematic reviews of observational phenomena. Certainly, while some of the terminology would need to change to reflect the absence of ‘outputs’ and ‘resources’, the benefits to the review process would remain. For some, this idea may simply be too close to that of a graphical depiction of a conceptual framework. However, the logic model is distinct in that it represents only part of a conceptual framework: it does not definitively represent a single ideological perspective or epistemological stance, and the accompanying assumptions. Arguably, a theory of change often does attempt to represent an epistemological framework, and this is why we draw a distinction between the two tools. As the goal of a systematic review is to uncover the strength of evidence and the likely mechanisms underlying how different parts of a causal pathway relate to one another, the evidence can be synthesised into a theory of change; and we maintain the emphasis on this being a ‘theory’, to be investigated and tested across different groups and across different geographies of time and space.

In investigating the use of logic models, we found that among the comparatively small number of reviewers who used a theory of change or logic model, many described a limited role, as well as a role intrinsic to the beginning of the review and not as a tool to communicate review findings. A worked-through example may help to expand their use, as will making their use a formal requirement, but the formation of guidelines will help to ensure that, where they are used, they are used to greater effect. A recommendation of this paper is for greater guidance to be prepared around how programme theory could and should be used in systematic reviewing, incorporating the elements raised here and others. Much of this paper is concerned with the benefits that logic models can bring to reviewers as a pragmatic tool in carrying out the review, as a tool to help strengthen the quality of reviews, and perhaps most importantly as a communication tool to disseminate the findings to reviewers and trialists within academic communities and beyond to policy-makers and funders. With respect to this last purpose in particular, improving the way in which logic models are used in reviews can only serve to increase the impact that systematic reviews can have in shaping policy and influencing practice in healthcare and beyond.

Supporting Information

S1 Checklist. PRISMA Checklist.

https://doi.org/10.1371/journal.pone.0142187.s001

S1 Flowchart. PRISMA Flowchart.

https://doi.org/10.1371/journal.pone.0142187.s002

S1 Table. Data Coding Template.

https://doi.org/10.1371/journal.pone.0142187.s003

Acknowledgments

We would like to acknowledge the contributions of Jonathan Grigg, Toby Lasserson and Vanessa McDonald in helping to shape the development of the logic model.

Author Contributions

Conceived and designed the experiments: DK JT. Analyzed the data: DK KH. Contributed reagents/materials/analysis tools: DK. Wrote the paper: DK JT.

  • 6. Gough D, Oliver S, Thomas J. Introducing systematic reviews. In: Gough D, Oliver S, Thomas J, editors. An Introduction to Systematic Reviews. London: Sage; 2012.
  • 11. Funnell SC, Rogers PJ. Purposeful program theory: effective use of theories of change and logic models. San Francisco, CA: John Wiley & Sons; 2011.
  • 12. Ni Ogain E, Lumley T, Pritchard D. Making an Impact. London: NPC, 2012.
  • 14. Higgins JPT, Green S. Cochrane handbook for systematic reviews of interventions. Chichester: Wiley-Blackwell; 2011.
  • 15. The Cochrane Public Health Group. Guide for developing a Cochrane protocol. Melbourne, Australia: University of Melbourne, 2011.
  • 16. Brennan S. Using logic models to capture complexity in systematic reviews: Commentary. Oxford: The Cochrane Operations Unit, 2012.
  • 17. Francis D, Baker P. Developing and using logic models in reviews of complex interventions. 19th Cochrane Colloquium; October 18th; Madrid: Cochrane; 2011.
  • 19. Bhavsar A, Waddington H. 3ie tips for writing strong systematic review applications. London: 3ie; 2012. Available: http://www.3ieimpact.org/en/funding/systematic-reviews-grants/3ie-tips-for-writing-systematic-review-applications/. Accessed 3 December 2014.
  • 20. Barlow J, MacMillan H, Macdonald G, Bennett C, Larkin SK. Psychological interventions to prevent recurrence of emotional abuse of children by their parents. The Cochrane Library. 2013.
  • 21. McLaughlin AE, Macdonald G, Livingstone N, McCann M. Interventions to build resilience in children of problem drinkers. The Cochrane Library. 2014.
  • 25. Burns J, Boogaard H, Turley R, Pfadenhauer LM, van Erp AM, Rohwer AC, et al. Interventions to reduce ambient particulate matter air pollution and their effect on health. The Cochrane Library. 2014.
  • 26. Costello JT, Baker PRA, Minett GM, Bieuzen F, Stewart IB, Bleakley C. Whole-body cryotherapy (extreme cold air exposure) for preventing and treating muscle soreness after exercise in adults. The Cochrane Library. 2013.
  • 27. Gavine A, MacGillivray S, Williams DJ. Universal community-based social development interventions for preventing community violence by young people 12 to 18 years of age. The Cochrane Library. 2014.
  • 28. Kuehnl A, Rehfuess E, von Elm E, Nowak D, Glaser J. Human resource management training of supervisors for improving health and well-being of employees. The Cochrane Library. 2014.
  • 29. Land M-A, Christoforou A, Downs S, Webster J, Billot L, Li M, et al. Iodine fortification of foods and condiments, other than salt, for preventing iodine deficiency disorders. The Cochrane Library. 2013.
  • 30. Langbecker D, Diaz A, Chan RJ, Marquart L, Hevey D, Hamilton J. Educational programmes for primary prevention of skin cancer. The Cochrane Library. 2014.
  • 31. Michelozzi P, Bargagli AM, Vecchi S, De Sario M, Schifano P, Davoli M. Interventions for reducing adverse health effects of high temperature and heatwaves. The Cochrane Library. 2014.
  • 32. Peña-Rosas JP, Field MS, Burford BJ, De-Regil LM. Wheat flour fortification with iron for reducing anaemia and improving iron status in populations. The Cochrane Library. 2014.
  • 33. Ramke J, Welch V, Blignault I, Gilbert C, Petkovic J, Blanchet K, et al. Interventions to improve access to cataract surgical services and their impact on equity in low- and middle- income countries. The Cochrane Library. 2014.
  • 34. Sreeramareddy CT, Sathyanarayana TN. Decentralised versus centralised governance of health services. The Cochrane Library. 2013.
  • 37. Brody C, Dworkin S, Dunbar M, Murthy P, Pascoe L. The effects of economic self-help group programs on women's empowerment: A systematic review protocol. Oslo, Norway: The Campbell Collaboration, 2013.
  • 38. Cirera X, Lakshman R, Spratt S. The impact of export processing zones on employment, wages and labour conditions in developing countries. London: 3ie, 2013.
  • 40. Giedion U, Andrés Alfonso E, Díaz Y. The impact of universal coverage schemes in the developing world: a review of the existing evidence. Washington DC: World Bank, 2013.
  • 41. Gonzalez L, Piza C, Cravo TA, Abdelnour S, Taylor L. The Impacts of Business Support Services for Small and Medium Enterprises on Firm Performance in Low-and Middle-Income Countries: A Systematic Review. The Campbell Collaboration, 2013.
  • 42. Higginson A, Mazerolle L, Benier KH, Bedford L. Youth gang violence in developing countries: a systematic review of the predictors of participation and the effectiveness of interventions to reduce involvement. London: 3ie, 2013.
  • 43. Higginson A, Mazerolle L, Davis J, Bedford L, Mengersen K. The impact of policing interventions on violent crime in developing countries. London: 3ie, 2013.
  • 44. Kingdon G, Aslam M, Rawal S, Das S. Are contract and para-teachers a cost effective intervention to address teacher shortages and improve learning outcomes? London: Institute of Education, 2012.
  • 45. Kluve J, Puerto S, Robalino D, Rother F, Weidenkaff F, Stoeterau J, et al. Interventions to improve labour market outcomes of youth: a systematic review of training, entrepreneurship promotion, employment services, mentoring, and subsidized employment interventions. The Campbell Collaboration, 2013.
  • 46. Loevinsohn M, Sumberg J, Diagne A, Whitfield S. Under What Circumstances and Conditions Does Adoption of Technology Result in Increased Agricultural Productivity? A Systematic Review. Brighton: Institute for Development Studies, 2013.
  • 47. Lynch U, Macdonald G, Arnsberger P, Godinet M, Li F, Bayarre H, et al. What is the evidence that the establishment or use of community accountability mechanisms and processes improve inclusive service delivery by governments, donors and NGOs to communities. London: EPPI-Centre, Social Science Research Unit, Institute of Education, University of London, 2013.
  • 48. Molina E, Pacheco A, Gasparini L, Cruces G, Rius A. Community Monitoring to Curb Corruption and Increase Efficiency in Service Delivery: Evidence from Low Income Communities. Campbell Collaboration, 2013.
  • 49. Posthumus H, Martin A, Chancellor T. A systematic review on the impacts of capacity strengthening of agricultural research systems for development and the conditions of success. London: EPPI-Centre, Social Science Research Unit, Institute of Education, University of London, 2013. ISBN 1907345469.
  • 50. Samarajiva R, Stork C, Kapugama N, Zuhyle S, Perera RS. Mobile phone interventions for improving economic and productive outcomes for farm and non-farm rural enterprises and households in low and middle-income countries. London: 3ie, 2013.
  • 51. Samii C, Lisiecki M, Kulkarni P, Paler L, Chavis L. Impact of Payment for Environmental Services and De-Centralized Forest Management on Environmental and Human Welfare: A Systematic Review. The Campbell Collaboration, 2013.
  • 52. Seguin M, Niño-Zarazúa M. What do we know about non-clinical interventions for preventable and treatable childhood diseases in developing countries? United Nations University, 2013.
  • 53. Spangaro J, Zwi A, Adogu C, Ranmuthugala G, Davies GP, Steinacker L. What is the evidence of the impact of initiatives to reduce risk and incidence of sexual violence in conflict and post-conflict zones and other humanitarian crises in lower-and middle-income countries? London: EPPI-Centre, Social Science Research Unit, Institute of Education, University of London, 2013.
  • 56. Tripney J, Roulstone A, Vigurs C, Moore M, Schmidt E, Stewart R. Protocol for a Systematic Review: Interventions to Improve the Labour Market Situation of Adults with Physical and/or Sensory Disabilities in Low-and Middle-Income Countries. The Campbell Collaboration, 2013.
  • 58. Welch VA, Awasthi S, Cumberbatch C, Fletcher R, McGowan J, Krishnaratne S, et al. Deworming and Adjuvant Interventions for Improving the Developmental Health and Well-being of Children in Low- and Middle- income Countries: A Systematic Review and Network Meta-analysis. Campbell Collaboration, 2013.
  • 59. Willey B, Smith Paintain L, Mangham L, Car J, Armstrong Schellenberg J. Effectiveness of interventions to strengthen national health service delivery on coverage, access, quality and equity in the use of health services in low and lower middle income countries. London: EPPI-Centre, Social Science Research Unit, Institute of Education, University of London 2013.
  • 61. Rohwer AC, Rehfuess E. Logic model templates for systematic reviews of complex health interventions. Cochrane Colloquium; Quebec, Canada; 2013.
  • 66. Alamgir AH. Texas Asthma Control Program: Strategic Evaluation Plan 2011–2014. Austin, TX: Texas Department of State Health Services 2012.
  • 67. AAP. Schooled in Asthma–Physicians and Schools Managing Asthma Together. Elk Grove Village, IL: American Academy of Pediatrics (AAP) 2001.
  • 70. Rees R, Oliver S. Stakeholder perspectives and participation in reviews. In: Gough D, Oliver S, Thomas J, editors. An Introduction to Systematic Reviews. London: Sage Publications; 2012.
  • 73. Gough D, Thomas J. Commonality and diversity in reviews. In: Gough D, Oliver S, Thomas J, editors. An Introduction to Systematic Reviews. London: Sage; 2012.
  • 74. Waddington H. Response to 'Developing and using logic models in reviews of complex interventions'. 19th Cochrane Colloquium; October 18th; Madrid: Cochrane; 2011.
  • 75. Clark H, Anderson AA. Theories of Change and Logic Models: Telling Them Apart. American Evaluation Association; Atlanta, Georgia; 2004.
  • 76. Brunton G, Stansfield C, Thomas J. Finding relevant studies. In: Gough D, Oliver S, Thomas J, editors. An Introduction to Systematic Reviews. London: Sage; 2012.
  • 77. Chen H-T. Theory-driven evaluations. Newbury Park, CA, USA: Sage Publications; 1990.
  • Research article
  • Open access
  • Published: 10 May 2014

Using logic model methods in systematic review synthesis: describing complex pathways in referral management interventions

  • Susan K Baxter 1 ,
  • Lindsay Blank 1 ,
  • Helen Buckley Woods 1 ,
  • Nick Payne 1 ,
  • Melanie Rimmer 1 &
  • Elizabeth Goyder 1  

BMC Medical Research Methodology volume 14, Article number: 62 (2014)

There is increasing interest in innovative methods to carry out systematic reviews of complex interventions. Theory-based approaches, such as logic models, have been suggested as a means of providing additional insights beyond that obtained via conventional review methods.

This paper reports the use of an innovative method which combines systematic review processes with logic model techniques to synthesise a broad range of literature. The potential value of the model produced was explored with stakeholders.

The review identified 295 papers that met the inclusion criteria. The papers consisted of 141 intervention studies and 154 non-intervention quantitative and qualitative articles. A logic model was systematically built from these studies. The model outlines interventions, short term outcomes, moderating and mediating factors and long term demand management outcomes and impacts. Interventions were grouped into typologies of practitioner education, process change, system change, and patient intervention. Short-term outcomes identified that may result from these interventions were changed physician or patient knowledge, beliefs or attitudes and also interventions related to changed doctor-patient interaction. A range of factors which may influence whether these outcomes lead to long term change were detailed. Demand management outcomes and intended impacts included content of referral, rate of referral, and doctor or patient satisfaction.

Conclusions

The logic model details evidence and assumptions underpinning the complex pathway from interventions to demand management impact. The method offers a useful addition to systematic review methodologies.

Trial registration number

PROSPERO registration number: CRD42013004037 .

Worldwide shifts in demographics and disease patterns, accompanied by changes in societal expectations, are driving up treatment costs. As a result, several strategies have been developed to manage the referral of patients for specialist care. In the United Kingdom (UK) referrals from primary care to secondary services are made by General Practitioners (GPs), who may be termed Family Physicians or Primary Care Providers in other health systems. These physicians in the UK act as the gatekeeper for patient access to secondary care, and are responsible for deciding which patients require referral to specialist care. Similar models are found in health care services in Australia, Denmark and the Netherlands; however, this process differs from systems in other countries such as France and the United States of America.

As demand outstrips resources in the UK, the volume and appropriateness of referrals from primary care to specialist services has become a key concern. The term “demand management” is used to describe methods which monitor, direct or regulate patient referrals within the healthcare system. Evaluation of these referral management interventions however presents challenges for systematic review methodologies. Target outcomes are diverse, encompassing for example both the reduction of referrals and enhancing the optimal timing of referrals. Also, the interventions are varied and may target primary care, specialist services, or administration or infrastructure (such as triaging processes and referral management centres) [ 1 ].

In systematic review methodology there is increasing recognition of the need to evaluate not only what works, but the theory of why and how an intervention works [ 2 ]. The evaluation of complex interventions such as referral management therefore requires methods which move beyond reductionist approaches, to those which examine wider factors including mechanisms of change [ 3 – 5 ].

A logic model is a summary diagram which maps out an intervention and conjectured links between the intervention and anticipated outcomes in order to develop a summarised theory of how a complex intervention works. Logic models seek to uncover the theories of change or logic underpinning pathways from interventions to outcomes [ 2 ]. The aim is to identify assumptions which underpin links between interventions, and the intended short and long term outcomes and broader impacts [ 6 ]. While logic models have been used for some time in programme evaluation, their potential to make a contribution to systematic review methodology has been recognised only more recently. Anderson et al. [ 7 ] discuss their use at many points in the systematic review process including scoping the review, guiding the searching and identification stages, and during interpretation of the results. Referral management entails moving from a system that reacts in an ad hoc way to increasing needs, to one which is able to plan, direct and optimise services in order to optimise demand, capacity and access across an area. Uncovering the assumptions and processes within a referral management intervention therefore requires an understanding of whole systems and assumptions, which a logic model methodology is well placed to address.

A number of benefits from using logic models have been proposed including: identification of different understandings or theories about how an intervention should work; clarification of which interventions lead to which outcomes; providing a summary of the key elements of an intervention; and the generation of testable hypotheses [ 8 ]. These advantages relate to the power of diagrammatic representation as a communication tool. Logic models have the potential to make systematic reviews “more transparent and more cogent” to decision-makers [ 7 ]. The use of alternative methods of synthesis and presentation of reviews is also worthy of consideration given the poor awareness and use of systematic review results amongst clinicians [ 9 ]. In addition, logic models may move systematic review findings beyond the oft-repeated conclusion that more evidence is needed [ 7 ].

While the potential benefit as a communication tool has been emphasised, there has been limited evaluation of logic models. In this study we aimed to further develop and evaluate the use of logic models as synthesis tools, during a systematic review of interventions to manage referrals from primary care to hospital specialists.

The method we used built on previous work by members of the team [ 10 , 11 ]. The approach combines conventional rigorous and transparent review methods (systematic searching, identification, selection and extraction of papers for review, and appraisal of potential bias amongst included studies) with a logic model synthesis of data. The building of models systematically from the evidence contrasts to the approach typically adopted, whereby logic models are built by discussion and consensus at meetings of stakeholders or expert groups. The processes followed are described in further detail below.

Search strategy

A study protocol was devised (PROSPERO registration number: CRD42013004037) to guide the review which outlined the research questions, search strategy, inclusion criteria, and methods to be used. The primary research question was “what can be learned from the international evidence on interventions to manage referral from primary to specialist care?” Secondary questions were “what factors affect the applicability of international evidence in the UK”, and “what are the pathways from interventions to improved outcomes?”

Systematic searches of published and unpublished (grey literature) sources from healthcare and other industries were undertaken. Rather than a single search, an iterative (a number of different searches) and emergent (the understanding of the question develops throughout the process) approach was taken to identify evidence [ 12 , 13 ]. As the model was constructed, further searches were required in order to seek additional evidence where there were gaps in the chain of reasoning, as described below. An audit table of the search process was kept, recording the date of search, search terms/strategy, database searched, number of hits, keywords and other comments, so that searches were transparent, systematic and replicable.
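
The audit table described above could be kept as a simple structured log. The following is a minimal sketch in Python, not the authors' actual record: the field names, the search strategies and the hit counts are all hypothetical, and only the database names are taken from the text.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class SearchRecord:
    """One row of a search audit table (field names are illustrative)."""
    search_date: date
    database: str
    strategy: str  # search terms / strategy as run
    hits: int
    keywords: list[str] = field(default_factory=list)
    comments: str = ""


def total_hits(log: list[SearchRecord]) -> int:
    """Sum raw hits across all logged searches (before de-duplication)."""
    return sum(record.hits for record in log)


# Hypothetical entries: the strategies and counts are invented for illustration.
audit_log = [
    SearchRecord(date(2012, 11, 5), "Google Scholar", "referral AND primary care", 120, ["referral"]),
    SearchRecord(date(2013, 3, 18), "OpenGrey", "demand management", 15, comments="grey literature"),
]
print(total_hits(audit_log))
```

Recording each iterative search as a separate row is what would allow later searches, prompted by gaps in the chain of reasoning, to be traced and repeated.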

Searches took place between November 2012 and July 2013. A broad range of electronic databases was searched in order to reflect the diffuse nature of the evidence (see Additional file 1 ). Citation searches of included articles and other systematic reviews were also undertaken and relevant review articles were used to identify studies. Grey literature (in the form of published or unpublished reports, data published on websites, in government policy documents or in books) was searched for using OpenGrey, Greysource, and Google Scholar electronic databases. Reference lists of all included articles, including relevant systematic reviews, were also hand searched.

Identification of studies

Inclusion/exclusion criteria were developed using the established PICO framework [ 14 ]. Participants included all primary care physicians, hospital specialists, and their patients. Interventions included were those which aimed to influence and/or affect referral from primary care to specialist services by having an impact on the referral practices of the primary physician. Studies using any comparator group were eligible for inclusion, and all outcomes relating to referral were considered. With the increasing recognition that a broad range of evidence is able to inform review findings, no restrictions were placed on study design with controlled, non-controlled (before and after) studies, as well as qualitative work examined. Studies eligible for inclusion were limited by date (January 2000 to July 2013). Articles in non-English languages with English abstracts were considered for translation (none were found to meet the inclusion criteria for the review). The key criterion for inclusion in the review was that a study was able to answer or inform the research questions.

Selection of papers

Citations identified using the above search methods were imported into Reference Manager Version 12. The database was screened by two reviewers, with identification and coding of potential papers for inclusion. Full paper copies of potentially relevant articles were retrieved for further examination.

Data extraction

A data extraction form was developed using the previous expertise of the review team, trialled using a small number of papers, and refined for use here. Data extractions were completed by one reviewer and checked by a second. Extracted data included: country of the study, study design, data collection method, aim of the study, detail of participants (number, any reported demographics), study methods/intervention details, comparator details if any, length of follow up, response and/or attrition rate, context (referral from what/who to what/who), outcome measures, main results, and reported associations.

Quality appraisal

The potential for bias within each quantitative study was assessed drawing on work by the Cochrane Collaboration [ 15 ]. We slightly adapted their tool for assessing risk of bias in order that the appraisal would be suitable for our broader range of study designs. For the qualitative papers we adapted the Critical Appraisal Skills Checklist [ 16 ] to provide a similar format to the quantitative tool. In addition to assessing the quality of each individual paper we also considered the overall strength of evidence for papers grouped by typology, drawing on criteria used by Hoogendoom et al. [ 17 ]. Each group of papers was graded as providing either: stronger evidence (generally consistent findings in multiple higher quality studies); weaker evidence (generally consistent findings in one higher quality study and lower quality studies, or in multiple lower quality studies); inconsistent evidence (<75% consistency of findings in multiple studies); or very limited evidence (a single study). Strength of evidence appraisal was undertaken at a meeting of the research team to establish consensus.
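
The four grades above can be read as a decision rule. The sketch below is one possible reading in Python and is not the procedure used by the research team, who graded by consensus at a meeting. It assumes that "consistency" means the share of studies reporting the most common direction of finding, and that "multiple higher quality studies" means two or more; both are assumptions of this illustration.

```python
from collections import Counter


def grade_evidence(studies: list[tuple[str, bool]]) -> str:
    """Grade a group of studies.

    Each study is (direction_of_finding, is_higher_quality). Treating
    consistency as the share of studies that report the most common
    direction is an assumption of this sketch.
    """
    if not studies:
        raise ValueError("no studies to grade")
    if len(studies) == 1:
        return "very limited"
    directions = Counter(direction for direction, _ in studies)
    consistency = directions.most_common(1)[0][1] / len(studies)
    if consistency < 0.75:
        return "inconsistent"
    higher_quality = sum(1 for _, is_high in studies if is_high)
    # Two or more higher quality studies counts as "multiple" here (assumption).
    return "stronger" if higher_quality >= 2 else "weaker"


# Hypothetical group: three consistent studies, one of them higher quality.
print(grade_evidence([("positive", True), ("positive", False), ("positive", False)]))
```

Checking consistency before quality means that a group of high quality but conflicting studies is graded as inconsistent rather than stronger, which matches the order in which the criteria are stated.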

Logic model synthesis

Logic models typically adopt a left to right flow of “if…then” propositions to illustrate the chain of reasoning underpinning how interventions lead to immediate (or short term) outcomes and then to longer term outcomes and impacts. This lays out the logic or assumptions that underpin the pathway (in this case, what needs to happen in order for interventions with General Practitioners to impact on referral demand). In our approach, extracted data from the included papers across study designs are combined and treated as textual (qualitative) data. A process of charting, categorising and thematic synthesis [ 18 ] of the extracted quantitative intervention and qualitative data is used in order to identify individual elements of the model. A key part of the model is detailing the mechanism/s of change within the pathway and the moderating and mediating factors which may be associated with or influence outcomes [ 19 ]; this is often referred to as the theory of change [ 2 ].

Evaluation of the model

Following development of a draft model we sought feedback from stakeholders regarding the clarity of representation of the findings, and potential uses. We carried out group sessions with patient representatives, individual interviews and seminar presentations with GPs and consultants, and also interviews with commissioners (in the UK commissioning groups comprise individuals who are responsible for the process of planning, agreeing and monitoring services, having control of the budget to be spent). At these sessions we presented the draft model and asked for verbal comments regarding the clarity of the model as a way of understanding the review findings, any elements which seemed to be missing, elements which did not seem to make sense or fit participants’ knowledge or experience, and also how participants envisaged that the model could be used. We also gave out feedback forms for participants to provide written comments on these aspects. In addition to these sessions we circulated the draft model via email to topic experts for their input. The feedback we obtained was examined and discussed by the team in order to inform subsequent drafts of the model.

Ethical approval

The main study was secondary research and therefore exempt from requiring ethical approval. Approval for the final feedback phase of the work was obtained from the University of Sheffield School of Health and Related Research ethics committee (reference 0599). Informed consent was obtained from all participants.

The electronic searches generated a database of 8327 unique papers. Of these, 581 papers were selected for full paper review (see Additional file 1 ). After considering these and completing our further identification procedures, 295 papers were included in the review. Figure  1 illustrates the process of inclusion and exclusion. The included papers consisted of 141 intervention papers and 154 non-intervention papers. The 154 non-intervention papers included 33 qualitative studies and 121 quantitative studies (see Additional file 1 ).
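
The counts reported above can be checked for internal consistency with simple arithmetic. The short Python sketch below uses only the figures given in the text; the helper name `check_counts` is introduced here for illustration.

```python
def check_counts(total: int, parts: list[int]) -> bool:
    """Return True when the parts add up to the stated total."""
    return sum(parts) == total


# Counts as reported in the text above.
included = 295
intervention, non_intervention = 141, 154
qualitative, quantitative = 33, 121

print(check_counts(included, [intervention, non_intervention]))
print(check_counts(non_intervention, [qualitative, quantitative]))

# Records from the electronic searches not taken forward to full paper review.
not_selected = 8327 - 581
print(not_selected)
```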

Figure 1. The process of inclusion and exclusion. A flow chart illustrating the process of paper identification.

A logic model was systematically developed from reviewing and synthesising these papers (see Figure  2 ). The model illustrates the elements in the pathway from demand management interventions to their intended impact. While this paper will refer to the emerging review findings which are currently undergoing peer review, its primary purpose is to describe and evaluate the methodology.

Figure 2. The completed logic model built from examining the identified published literature. The model illustrates the pathway between demand management interventions and intended impact. It reads from left to right, with a typology of demand management interventions in the first column, the immediate or short term outcomes following the interventions in the second column, then factors which may act as barriers to achievement of longer term outcomes in the mediating and moderating factors column, outcomes in terms of demand management at the level of physicians and their practice in the fourth column, and longer term system-wide impacts in the final column. The model indicates (via differing text types) where there was stronger or weaker evidence of links in the pathway.

Logic model development

Following data extraction and quality appraisal, the process of systematically constructing the logic model began. We developed the model column by column, underpinned by the evidence. The model contains five columns detailing the pathway from interventions to short-term outcomes; via moderating and mediating factors; to demand management outcomes; and finally demand management impact.
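
The five-column structure can be pictured as a small data structure. The Python sketch below is an illustration only: the class and function names are hypothetical, and the example pathway strings together elements named in this paper without implying that this particular chain is a review finding.

```python
from dataclasses import dataclass

# Column names follow the five columns described in the text, left to right.
COLUMNS = (
    "interventions",
    "short-term outcomes",
    "moderating and mediating factors",
    "demand management outcomes",
    "demand management impact",
)


@dataclass(frozen=True)
class Element:
    column: str
    label: str
    evidence: str = "unspecified"  # e.g. "stronger", "weaker", "conflicting"

    def __post_init__(self):
        if self.column not in COLUMNS:
            raise ValueError(f"unknown column: {self.column}")


def read_pathway(elements: list[Element]) -> str:
    """Order elements left to right by column and join them as a chain."""
    ordered = sorted(elements, key=lambda e: COLUMNS.index(e.column))
    return " -> ".join(e.label for e in ordered)


# Illustrative pathway only; the pairing of these elements is an assumption.
pathway = [
    Element("demand management impact", "referral rate"),
    Element("interventions", "peer review/feedback", "stronger"),
    Element("short-term outcomes", "physician beliefs/attitudes"),
]
print(read_pathway(pathway))
```

Storing the evidence grade on each element mirrors the way the published model uses differing text types to mark stronger and weaker links.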

The first stage in building the model was to develop intervention typology tables from the extracted data, in order to begin the process of grouping and organising the intervention content and processes which would form the first column. This starting point in the pathway details the wide range of interventions which are reported in the literature. It groups these interventions into typologies of: practitioner education; process change; system change; and patient intervention. Within each of these boxes the specific types of interventions in each category have been listed, for example the GP education typology contains interventions targeting training sessions, peer feedback, and provision of guidelines. Process change interventions include electronic referral, direct access to screening and consultation with specialists prior to referral. System change interventions include additional staff in community, gate-keeping and payment systems. We found few examples of patient interventions. The model provides an indication of where the evidence is stronger or weaker. For example, in regard to physician education, it can be seen that peer review/feedback has stronger evidence underpinning its effectiveness, with the use of guidelines being underpinned by conflicting evidence of effectiveness. For all but two of the interventions, the evidence indicated either no effect or some level of positive outcome on referral management. For additional staff in primary care and for the addition or removal of gatekeeping, however, there was strong evidence that these interventions could worsen referral management outcomes.

The interventions thus formed the starting point, and first column of the logic model. By developing a typology we were able to group and categorise the data, and begin to explore questions regarding which types of intervention may work, and what characteristics of interventions may be successful in managing patient referral.

The intervention studies used a wide range of outcomes to judge efficacy. A key aim of logic models is to uncover assumptions in the chain of reasoning between interventions and their expected impacts, and to develop a theory of change which sets out these implicit “if…then” pathways. The next stage in development of the model was therefore to begin to unpack these outcomes and assumptions regarding links between interventions and demand management impacts.

The outcomes were divided into those which were considered to be short-term outcomes, long term outcomes or result in broader impact on demand management systems. In order to do this we used “if…then” reasoning to deduce in what order outcomes needed to occur for these to then lead to the intended impact. Short-term outcomes were classified as those that impacted immediately or specifically on individual referrers, patients or referrals. Long term outcomes were categorised as those which had an effect more widely beyond the level of the individual GP, service or patient, and impact factors were those that would determine the effectiveness of referral management across whole health systems.

Outcomes and impacts reported in the intervention studies were identified and grouped by typology, by the stage in the pathway, and by the level of evidence. The outcomes column includes all those outcomes which were reported in the included papers. They encompass: whether or not the adequacy of information provided by the referrer to the specialist was improved; whether there was an improvement to patient waiting time; whether there was an increase in level of GP or patient satisfaction with services; and whether referrals were actioned more appropriately. These outcomes form an important element of the pathway to the final impacts column and demonstrate the importance of identifying all the links in the chain of reasoning. For example, the model outlines that referral information needs to be accurate in order that referrals may be directed to the most appropriate place or person. Interventions need to include evaluation of this interim outcome and not only consider impact measures such as rate of referral if they are to explore whether and how an intervention is effective. Also, GP satisfaction with a service will determine where referrals are sent, and patient satisfaction may determine whether a costly appointment with a specialist is attended. Here again many studies we evaluated used only broad impact measures (such as referral rate) to evaluate outcomes rather than explore where the links in the pathway may be breaking down.
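
The “if…then” reasoning used to order outcomes resembles a dependency ordering: an outcome can only follow once the outcomes it relies on have occurred. The Python sketch below expresses this with a topological sort from the standard library. It is an analogy rather than the method the team applied, and the final link to the impact "referrals considered appropriate" is an assumption added for illustration.

```python
from graphlib import TopologicalSorter

# Each key can only happen if every outcome in its set has happened first.
requires = {
    "referral directed to the most appropriate place": {"accurate referral information"},
    "referrals considered appropriate": {"referral directed to the most appropriate place"},
}

# static_order() yields prerequisites before the outcomes that depend on them.
order = list(TopologicalSorter(requires).static_order())
for step in order:
    print(step)
```

Laying the links out in this way makes a missing interim outcome visible: if a study measures only the final impact, the earlier steps in the ordering are left unevaluated.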

The impacts column contains all those impacts that were reported in the included literature. These were: the impact on referral rate/level; whether attendance rate increased; any impact on referrals being considered appropriate; any impact on the appropriateness of the timing of the referral; and the effect on healthcare cost. As can be seen from the model, the relationship between interventions and a wider impact on systems was challenging to demonstrate from the evidence.

Having developed the first and final two columns of the model, attention then turned to the key middle section. This phase of the work required detailed exploration of the change pathway to explore exactly how the interventions would act on participants in order to produce the demand management outcomes and impacts. The second and third columns of the model are core elements of the theory of change within the model.

While a small amount of data for these elements came from the intervention studies, the majority came from analysis and synthesis of the qualitative papers and non-intervention studies. Much of the intervention literature seemed to have a “black box” between the intervention and the long-term impacts. This was a key area of the work, where we employed iterative additional searching to seek evidence for associations and so ensure that the chain of reasoning was complete. For example, the first additional search aimed to explore evidence underpinning the assumption that increasing GP knowledge would lead to improved referral practice. The second additional search aimed to identify evidence underpinning the link between changes in referral systems and changed physician attitudes or behaviour. The search also sought evidence regarding specific outcomes following interventions to change patient knowledge, attitudes or behaviour.

The second column of the model details the short-term outcomes for individual GPs, patients, and GP services that may result from interventions. These are the factors which need to be changed within the referrer or referral in order that the longer-term outcomes and impacts will happen. The short-term outcomes we identified were: physician knowledge; physician beliefs/attitudes; physician behaviour; doctor-patient interaction; and patient knowledge, attitudes/beliefs or behaviour. Of note is the weaker evidence that change in physician knowledge impacts on referrals, and the greater evidence that change to physician attitudes and beliefs, and to doctor-patient interaction, has an impact.

The third column (and final element to be completed) is another key part of the theory of change. This section identifies a range of factors which may be associated with or influence whether the short-term outcomes will lead to the intended longer term outcomes and impacts. This column examines the moderating and mediating variables which may act as predictors of whether an intervention will be successful. They can be considered as similar to the barriers and facilitators often described in qualitative studies. The model details a wide range of these moderating and mediating factors relating to: the physician; the patient; and the organisation. Of particular interest here is the conflicting evidence relating to physician and patient demographic factors (the subject of a large number of studies) influencing referral patterns, and the clearer picture regarding the influence of patient clinical and social factors in the referral process.

Having outlined the content of each column, the following provides an example of the flow of reasoning for one particular type of intervention underpinned by elements of the model. Much work in the UK has been directed towards issuing guidelines for GPs regarding whom and when to refer, with the assumption that changed knowledge will lead to changed referral practice. However, the model questions these assumptions by indicating that there is conflicting evidence regarding the efficacy of this type of intervention, and also suggesting that there is weak evidence of interventions such as these leading to enhanced knowledge outcomes. Perhaps if the guidelines focused more on elements of the model where evidence is stronger, such as addressing GP attitudes and beliefs (for example tolerance of risk) or behaviour (such as the optimal content of referral information), this may lead to more successful immediate outcomes. The model highlights, however, that the effect of any guideline intervention will also be modified by GP, patient and service factors, for example the complexity of the case, the GP’s emotional response to the patient and GP time pressure. These potential barriers need to be considered in the implementation of guideline interventions. If these elements can be addressed, however, use of guidelines by GPs may enhance referrer or patient satisfaction, improve waiting time, or change the content of a referral and thus have a resulting impact at a service-wide level.
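The guideline example can be read as a walk across the five columns of the model. The short Python sketch below is only an illustration of that flow of reasoning: the class name `Pathway` and its field names are hypothetical choices made here, and the entries are taken from the text above rather than from the full published model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pathway:
    """One intervention traced across the five columns of the logic model.

    Hypothetical structure for illustration; not the authors' own notation.
    """
    intervention: str
    short_term_outcomes: List[str] = field(default_factory=list)
    moderators: List[str] = field(default_factory=list)
    outcomes: List[str] = field(default_factory=list)
    impacts: List[str] = field(default_factory=list)

    def chain(self) -> str:
        """Return the "if...then" chain as one readable line."""
        steps = [
            self.intervention,
            " / ".join(self.short_term_outcomes),
            " / ".join(self.outcomes),
            " / ".join(self.impacts),
        ]
        return " -> ".join(steps)


# Entries drawn from the guideline example in the text.
guideline = Pathway(
    intervention="guidelines for GPs",
    short_term_outcomes=["GP attitudes and beliefs", "GP behaviour"],
    moderators=[
        "complexity of the case",
        "GP emotional response to the patient",
        "GP time pressure",
    ],
    outcomes=["referrer or patient satisfaction", "waiting time", "content of referral"],
    impacts=["service-wide impact"],
)

print(guideline.chain())
print("Modified by:", ", ".join(guideline.moderators))
```

Keeping the moderating and mediating factors in a separate field, rather than as a step in the chain, mirrors their role in the model: they condition whether each link holds rather than forming a link themselves.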

Following development of the model we sought feedback from stakeholders regarding the clarity of representation of the findings, and potential uses. This consultation was carried out via individual and group discussion with practitioners, patient and public representatives, and commissioners (individuals who have responsibility for purchasing services), and by circulating the model to experts in the field. In total we received input from 44 individuals (15 GPs, five commissioners, seven patient and public representatives, and 17 hospital specialists from a range of clinical areas). Thirty-eight of the respondents reported that they clearly understood the model; however, four specialists described the model as overly complex and two patient representatives reported some confusion in understanding it.
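As a quick consistency check on the figures reported above, the counts can be tallied in a few lines of Python. The variable names are ours; the numbers are those given in the text.

```python
# Respondent counts by stakeholder group, as reported in the text.
respondents = {
    "GPs": 15,
    "commissioners": 5,
    "patient and public representatives": 7,
    "hospital specialists": 17,
}
total = sum(respondents.values())

understood = 38      # reported a clear understanding of the model
found_complex = 4    # specialists who found the model overly complex
confused = 2         # patient representatives reporting some confusion

# The three response groups should account for every respondent.
print(total)
print(understood + found_complex + confused == total)
print(round(100 * understood / total, 1))  # percentage who clearly understood
```

The group sizes sum to the stated total of 44, and roughly 86% of respondents reported that they clearly understood the model.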

GPs in particular gave positive feedback, highlighting that it was a good fit with their experience of the way referrals are managed, and that it successfully conveyed the complexity of general practice. The model was also described positively as identifying the role of both the GPs’ and the patients’ attitudes and beliefs, and the doctor-patient interaction. Also, GPs noted with satisfaction that the model included the physicians’ emotional response to the patient, which resonated with their experiences. Most specialists also reported that the model was a good fit with their experience of factors influencing referral management. Potential uses of the model described were: as a tool for GP trainees and educators; as a teaching aid for undergraduate medical students; for analysing the demand management pathway when commissioning; for comparing what was being commissioned with what was evidence based; and to direct research into poorly evidenced areas.

Some of the feedback from participants concerned factors that had not been identified in the literature. For example, participants highlighted the potential role of carers as well as the patient in doctor-patient interactions, and the potential influence of being a GP temporarily covering a colleague’s work. Some amendments were made to the model following this feedback, principally clarifying where there was no evidence versus inconclusive evidence, and editing terminology.

While referral management is often considered to have only a capacity-limiting function, the model was able to identify the true complexity of what it aims to achieve. Our model has added to the existing literature by setting out the chain of reasoning that underpins whether and how interventions lead to their intended impacts, and by making explicit the assumptions that underpin the process. The logic model is able to summarise a wealth of information regarding the findings of a systematic review on a single page. The visual presentation of this information was clearly understood by almost all professionals and all commissioners in our sample. This study therefore supports the value of logic models as communication tools. The effectiveness of the model for communicating findings to patients and the public, however, warrants further exploration. While four of the seven patient representatives in our group found value in the model, three found it lacked relevance to patients. While this was a very small sample, it would be worth exploring in the future whether the topic of the model contributed to this perception. Perhaps a model relating to a specific clinical condition rather than service delivery may be perceived as of greater relevance to patients and the public.

The use of stakeholders in developing a theory of change has been recommended by other authors [ 20 ]. Participants in our sample were able to provide valuable input by suggesting areas where there were seeming gaps in evidence. Our method of building the model from the literature has sought to be systematic and evidence-based, rather than be influenced by expert/stakeholder opinion (as is more typically the process of logic model development). However, while it is important to be alert to potential sources of bias in the review process, it seems that the involvement of stakeholders for determining potential gaps in evidence alongside systematic identification processes should not be ignored. The logic model we have produced outlines only where we identified literature, and does not include the two areas suggested by the stakeholders. We debated whether these suggested areas should be added to the model, and concluded that it would be counter-intuitive in a model presenting the evidence to show areas of no evidence. Future development of the methodology could consider including “ghost boxes” or similar to indicate where experts or practitioners believe that there are links but where there is currently no research to substantiate them.

We endeavoured to enhance the communicative power of the model by adopting a system of evaluating the strength of evidence underpinning elements. The determination of strength of evidence is a challenging area, with our adopted system likely to be the subject of debate. Other widely used methods of appraising the quality of evidence (such as that used by the Cochrane Collaboration [ 15 ]) typically use different checklists for different study designs. The method that we adopted was able to be inclusive of the diversity of types of evidence in our review. In selecting an approach we also aimed to move beyond a simple count of papers. This “more equals stronger” approach may be misleading as a greater number may be only an indicator of where work has been carried out, or is perhaps where a topic is more amenable to investigation. The evaluation we utilised included elements of both quantity and quality, together with the consideration of consistency. However, the volume of studies in the rating was still influential. While we believe that the strength of evidence indicator adds considerably to the model and review findings, we recognise that there is still work to be done in refining this aspect of the method.

Our process of synthesising the data to develop the logic model draws on methodological developments in the area of qualitative evidence synthesis [ 18 , 21 ]. Our use of categorising and charting to build elements also draws on techniques of Framework Analysis [ 22 ], which is commonly used as a method of qualitative data analysis in policy research. The Framework Method may be particularly useful to underpin this process as it is a highly systematic method of categorizing and organizing data [ 23 ]. By its inclusion of a diverse range of evidence, our method also resonates with the growing use of mixed-methods research, which appreciates the contribution of both qualitative and quantitative evidence to answering a research question. While our method utilises the model for synthesis at the latter end of a systematic review, logic models have been suggested as being of value at various stages of the process [ 7 ]. Recently it has been proposed that logic models should be added to the established PICOs framework for establishing review parameters [ 24 ] in the initial stages.

In order to be of value, a visual representation should stand up to scrutiny so that concepts and meaning can be grasped by others and stimulate discussion [ 21 ]. We believe that evaluation of our model indicates that it met these requirements. The use of diagrams to explain complex interventions has been criticised in the past on the grounds that it can fail to identify mechanisms of change [ 19 ]. We argue that this potential limitation can be overcome by using a wide range of literature and employing methods of iterative searching to examine potential associations. Vogel [ 25 ] emphasised that diagrams should combine simplicity with validity: an acknowledgement of complexity but recognition that things are more complex than can be described. The vast majority of feedback on the model reported that this complexity represented the area as they knew it, and that this was a key asset of the model.

This work has demonstrated the potential value of a logic model synthesis approach to systematic review methodologies. In particular for this piece of work, the method proved valuable in unpicking the complexity of the area, illuminating multiple outcomes and potential impacts, and highlighting a range of factors that need to be considered if interventions are to lead to intended impacts.

Faulkner A, Mills N, Bainton D, Baxter K, Kinnersley P, Peters TJ, Sharp D: A systematic review of the effect of primary care-based service innovations on quality and patterns of referral to specialist secondary care. Brit J Gen Pract. 2003, 53: 878-884.

Weiss CH: Nothing as practical as a good theory: exploring theory-based evaluation for comprehensive community initiatives for children and families. New Approaches to Evaluating Community Initiatives. Edited by: Connell JP, Kubisch AC, Schorr LB, Weiss CH. 1995, Washington DC: Aspen Institute, 65-69.

Plsek PE, Greenhalgh T: The challenge of complexity in healthcare. BMJ. 2001, 323: 625-628. 10.1136/bmj.323.7313.625.

Miles A: Complexity in medicine and healthcare: people and systems, theory and practice. J Eval Clin Pract. 2009, 15: 409-410. 10.1111/j.1365-2753.2009.01204.x.

Pawson R: Evidence-based policy: the promise of realist synthesis. Evaluation. 2002, 8: 340-358. 10.1177/135638902401462448.

Rogers PJ: Theory-based evaluation: reflections ten years on. N Dir Eval. 2007, 114: 63-81.

Anderson LM, Petticrew M, Rehfuess E, Armstrong R, Ueffing E, Baker P, Francis D, Tugwell P: Using logic models to capture complexity in systematic reviews. Res Synth Meth. 2011, 2: 33-42. 10.1002/jrsm.32.

Rogers PJ: Using programme theory to evaluate complicated and complex aspects of interventions. Evaluation. 2008, 14: 29-48. 10.1177/1356389007084674.

Wallace J, Nwosu B, Clarke M: Barriers to the uptake of evidence from systematic reviews and meta-analyses: a systematic review of decision makers’ perceptions. BMJ Open. 2012, 2: e001220-doi:10.1136/bmjopen-2012-001220

Baxter S, Killoran A, Kelly M, Goyder E: Synthesizing diverse evidence: the use of primary qualitative data analysis methods and logic models in public health reviews. Pub Health. 2010, 124: 99-106. 10.1016/j.puhe.2010.01.002.

Allmark P, Baxter S, Goyder E, Guillaume L, Crofton-Martin G: Assessing the health benefits of advice services: using research evidence and logic model methods to explore complex pathways. Health Soc Care Comm. 2013, 21: 59-68. 10.1111/j.1365-2524.2012.01087.x.

EPPI-Centre: Methods for Conducting Systematic Reviews. 2010, London: EPPI-Centre, Social Science Research Unit, Institute of Education, University of London

Grant MJ, Brettle A, Long AF: Poster Presentation. Beyond the Basics of Systematic Reviews. Developing a Review Question: A Spiral Approach to Literature Searching. 2000, Oxford

Schardt C, Adams MB, Owens T, Keitz S, Fontelo P: Utilization of the PICO framework to improve searching PubMed for clinical questions. BMC Med Inform Decis Mak. 2007, 7: doi:10.1186/1472-6947-7-16

The Cochrane Collaboration: Cochrane handbook for systematic reviews of interventions, version 5.1.0, 2011. [Handbook.cochrane.org]

Critical Appraisal Skills Programme (CASP) qualitative research checklist. [ http://www.casp-uk.net/wpcontent/uploads/2011/11/CASP_Qualitative_Appraisal_Checklist_14oct10.pdf ]

Hoogendoorn WE, Van Poppel MN, Bongers PM, Koes BW, Bouter LM: Physical load during work and leisure time as risk factors for back pain. Scand J Work Environ Health. 1999, 25: 387-403. 10.5271/sjweh.451.

Thomas J, Harden A: Methods for the thematic synthesis of qualitative research in systematic reviews. BMC Med Res Methodol. 2008, 8: doi:10.1186/1471-2288-8-45

Weiss CH: Theory-based evaluation: past present and future. N Dir Eval. 1997, 76: 68-81.

Blamey A, Mackenzie M: Theories of change and realistic evaluation: peas in a pod or apples and oranges?. Evaluation. 2007, 13: 439-455. 10.1177/1356389007082129.

Dixon-Woods M, Fitzpatrick R, Roberts K: Including qualitative research in systematic reviews: opportunities and problems. J Eval Clin Pract. 2001, 2: 125-133.

Gale NK, Heath G, Cameron E, Rashid S, Redwood S: Using the framework method for the analysis of qualitative data in multi-disciplinary health research. BMC Med Res Methodol. 2013, 13: doi:10.1186/1471-2288-13-117

Ritchie J, Lewis J: Qualitative Research Practice: A Guide for Social Science Students and Researchers. 2003, London: Sage

McDonald KM, Schultz EM, Chang C: Evaluating the state of quality-improvement science through evidence synthesis: insights from the Closing the Quality Gap Series. Perm J. 2013, 17: 52-61. 10.7812/TPP/13-010.

Vogel I: Review of the Use of Theory of Change in International Development: Review Report. 2012, London: Department of International Development

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/14/62/prepub

Acknowledgements

We would like to thank the following members of the project steering group for their valuable input: Professor Danuta Kasprzyk; Professor Helena Britt; Ellen Nolte; Jon Karnon; Nigel Edwards; Christine Allmar; Brian Hodges; and Martin McShane.

This project was funded by the National Institute for Health Research (Health Service and Delivery Research Programme project number 11/1022/01). The views and opinions expressed therein are those of the authors and do not necessarily reflect those of the Health Service and Delivery Research Programme, NIHR, NHS or the Department of Health.

Author information

Authors and affiliations.

School of Health and Related Research, University of Sheffield, Regent Court, 30 Regent Street, Sheffield, S1 4DA, UK

Susan K Baxter, Lindsay Blank, Helen Buckley Woods, Nick Payne, Melanie Rimmer & Elizabeth Goyder

Corresponding author

Correspondence to Susan K Baxter .

Additional information

Competing interests.

The authors declare that they have no competing interests.

Authors’ contribution

SB was a reviewer and led development of the logic model. LB was principal investigator and lead reviewer. HB carried out the searches. NP and EG provided methodological and topic expertise throughout the work. MR carried out the evaluation phase. All members of the team read and commented on drafts of this paper.

Electronic supplementary material

Additional file 1: Using logic model methods in systematic review synthesis: describing complex pathways in referral management interventions.(DOCX 42 KB)

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article.

Baxter, S.K., Blank, L., Woods, H.B. et al. Using logic model methods in systematic review synthesis: describing complex pathways in referral management interventions. BMC Med Res Methodol 14 , 62 (2014). https://doi.org/10.1186/1471-2288-14-62

Received : 16 January 2014

Accepted : 30 April 2014

Published : 10 May 2014

DOI : https://doi.org/10.1186/1471-2288-14-62

  • Systematic review
  • Methodology
  • Evidence synthesis
  • Logic model
  • Demand management
  • Referral systems
  • Referral management

BMC Medical Research Methodology

ISSN: 1471-2288

  • Open access
  • Published: 25 September 2020

The Implementation Research Logic Model: a method for planning, executing, reporting, and synthesizing implementation projects

  • Justin D. Smith   ORCID: orcid.org/0000-0003-3264-8082 1 , 2 ,
  • Dennis H. Li 3 &
  • Miriam R. Rafferty 4  

Implementation Science volume 15, Article number: 84 (2020)

A Letter to the Editor to this article was published on 17 November 2021

Numerous models, frameworks, and theories exist for specific aspects of implementation research, including for determinants, strategies, and outcomes. However, implementation research projects often fail to provide a coherent rationale or justification for how these aspects are selected and tested in relation to one another. Despite this need to better specify the conceptual linkages between the core elements involved in projects, few tools or methods have been developed to aid in this task. The Implementation Research Logic Model (IRLM) was created for this purpose and to enhance the rigor and transparency of describing the often-complex processes of improving the adoption of evidence-based interventions in healthcare delivery systems.

The IRLM structure and guiding principles were developed through a series of preliminary activities with multiple investigators representing diverse implementation research projects in terms of contexts, research designs, and implementation strategies being evaluated. The utility of the IRLM was evaluated in the course of a 2-day training to over 130 implementation researchers and healthcare delivery system partners.

Preliminary work with the IRLM produced a core structure and multiple variations for common implementation research designs and situations, as well as guiding principles and suggestions for use. Results of the survey indicated a high utility of the IRLM for multiple purposes, such as improving rigor and reproducibility of projects; serving as a “roadmap” for how the project is to be carried out; clearly reporting and specifying how the project is to be conducted; and understanding the connections between determinants, strategies, mechanisms, and outcomes for their project.

Conclusions

The IRLM is a semi-structured, principle-guided tool designed to improve the specification, rigor, reproducibility, and testable causal pathways involved in implementation research projects. The IRLM can also aid implementation researchers and implementation partners in the planning and execution of practice change initiatives. Adaptation and refinement of the IRLM are ongoing, as is the development of resources for use and applications to diverse projects, to address the challenges of this complex scientific field.

Contributions to the literature

  • Drawing from and integrating existing frameworks, models, and theories, the IRLM advances the traditional logic model for the requirements of implementation research and practice.
  • The IRLM provides a means of describing the complex relationships between critical elements of implementation research and practice in a way that can be used to improve the rigor and reproducibility of research and implementation practice, and the testing of theory.
  • The IRLM offers researchers and partners a useful tool for the purposes of planning, executing, reporting, and synthesizing processes and findings across the stages of implementation projects.

In response to a call for addressing noted problems with transparency, rigor, openness, and reproducibility in biomedical research [ 1 ], the National Institutes of Health issued guidance in 2014 pertaining to the research it funds ( https://www.nih.gov/research-training/rigor-reproducibility ). The field of implementation science has similarly recognized a need for better specification with similar intent [ 2 ]. However, integrating the necessary conceptual elements of implementation research, which often involves multiple models, frameworks, and theories, is an ongoing challenge. A conceptually grounded organizational tool could improve rigor and reproducibility of implementation research while offering additional utility for the field.

This article describes the development and application of the Implementation Research Logic Model (IRLM). The IRLM can be used with various types of implementation studies and at various stages of research, from planning and executing to reporting and synthesizing implementation studies. Example IRLMs are provided for various common study designs and scenarios, including hybrid designs and studies involving multiple service delivery systems [ 3 , 4 ]. Last, we describe the preliminary use of the IRLM and provide results from a post-training evaluation. An earlier version of this work was presented at the 2018 AcademyHealth/NIH Conference on the Science of Dissemination and Implementation in Health, and the abstract appeared in Implementation Science [ 5 ].

Specification challenges in implementation research

An imprecise understanding of what was done, and why, during the implementation of an innovation obscures the factors responsible for successful implementation and prevents learning from what contributed to failed implementation. Thus, improving the specification of phenomena in implementation research is necessary to inform our understanding of how implementation strategies work, for whom, under what determinant conditions, and on what implementation and clinical outcomes. One challenge is that implementation science uses numerous models and frameworks (hereafter, “frameworks”) to describe, organize, and aid in understanding the complexity of changing practice patterns and integrating evidence-based health interventions across systems [ 6 ]. These frameworks typically address implementation determinants, implementation process, or implementation evaluation [ 7 ]. Although many frameworks incorporate two or more of these broad purposes, researchers often find it necessary to use more than one to describe the various aspects of an implementation research study. The conceptual connections and relationships between multiple frameworks are often difficult to describe and to link to theory [ 8 ].

Similarly, reporting guidelines exist for some of these implementation research components, such as strategies [ 9 ] and outcomes [ 10 ], as well as for entire studies (i.e., Standards for Reporting Implementation Studies [ 11 ]); however, they generally help describe the individual components and not their interactions. To facilitate causal modeling [ 12 ], which can be used to elucidate mechanisms of change and the processes involved in both successful and unsuccessful implementation research projects, investigators must clearly define the relations among variables in ways that are testable with research studies [ 13 ]. Only then can we open the “black box” of how specific implementation strategies operate to predict outcomes.

Logic models

Logic models, graphic depictions that present the shared relationships among various elements of a program or study, have been used for decades in program development and evaluation [ 14 ] and are often required by funding agencies when proposing studies involving implementation [ 15 ]. Used to develop agreement among diverse stakeholders of the “what” and the “how” of proposed and ongoing projects, logic models have been shown to improve planning by highlighting theoretical and practical gaps, support the development of meaningful process indicators for tracking, and aid in both reproducing successful studies and identifying failures of unsuccessful studies [ 16 ]. They are also useful at other stages of research and for program implementation, such as organizing a project/grant application/study protocol, presenting findings from a completed project, and synthesizing the findings of multiple projects [ 17 ].

Logic models can also be used in the context of program theory, an explicit statement of how a project/strategy/intervention/program/policy is understood to contribute to a chain of intermediate results that eventually produce the intended/observed impacts [ 18 ]. Program theory specifies both a Theory of Change (i.e., the central processes or drivers by which change comes about following a formal theory or tacit understanding) and a Theory of Action (i.e., how program components are constructed to activate the Theory of Change) [ 16 ]. Inherent within program theory is causal chain modeling. In implementation research, Fernandez et al. [ 19 ] applied mapping methods to implementation strategies to postulate the ways in which changes to the system affect downstream implementation and clinical outcomes. Their work presents an implementation mapping logic model based on Proctor et al. [ 20 , 21 ], which is focused primarily on the selection of implementation strategy(s) rather than a complete depiction of the conceptual model linking all implementation research elements (i.e., determinants, strategies, mechanisms of action, implementation outcomes, clinical outcomes) in the detailed manner we describe in this article.

Development of the IRLM

The IRLM began out of a recognition that implementation research presents some unique challenges due to the field’s distinct and still codifying terminology [ 22 ] and its use of implementation-specific and non-specific (borrowed from other fields) theories, models, and frameworks [ 7 ]. The development of the IRLM occurred through a series of case applications. This began with a collaboration between investigators at Northwestern University and the Shirley Ryan AbilityLab in which the IRLM was used to study the implementation of a new model of patient care in a new hospital and in other related projects [ 23 ]. Next, the IRLM was used with three already-funded implementation research projects to plan for and describe the prospective aspects of the trials, as well as with an ongoing randomized roll-out implementation trial of the Collaborative Care Model for depression management [Smith JD, Fu E, Carroll AJ, Rado J, Rosenthal LJ, Atlas JA, Burnett-Zeigler I, Carlo, A, Jordan N, Brown CH, Csernansky J: Collaborative care for depression management in primary care: a randomized rollout trial using a type 2 hybrid effectiveness-implementation design submitted for publication]. It was also applied in the later stages of a nearly completed implementation research project of a family-based obesity management intervention in pediatric primary care to describe what had occurred over the course of the 3-year trial [ 24 ]. Last, the IRLM was used as a training tool in a 2-day training with 63 grantees of NIH-funded planning project grants funded as part of the Ending the HIV Epidemic initiative [ 25 ]. Results from a survey of the participants in the training are reported in the “Results” section. From these preliminary activities, we identified a number of ways that the IRLM could be used, described in the section on “Using the IRLM for different purposes and stages of research.”

The Implementation Research Logic Model

In developing the IRLM, we began with the common “pipeline” logic model format used by AHRQ, CDC, NIH, PCORI, and others [ 16 ]. This structure was chosen because it is familiar to funders, investigators, readers, and reviewers. Although a number of characteristics of the pipeline logic model can be applied to implementation research studies, there is an overall misfit because implementation research focuses on the systems that support adoption and delivery of health practices, involves multiple levels within one or more systems, and has its own unique terminology and frameworks [ 3 , 22 , 26 ]. We adapted the typical evaluation logic model to integrate existing implementation science frameworks as its core elements while keeping to the same aim of facilitating causal modeling.

The most common IRLM format is depicted in Fig. 1 . Additional File A1 is a Fillable PDF version of Fig. 1 . In certain situations, it might be preferable to include the evidence-based intervention (EBI; defined as a clinical, preventive, or educational protocol or a policy, principle, or practice whose effects are supported by research [ 27 ]) (Fig. 2 ) to demonstrate alignment of contextual factors (determinants) and strategies with the components and characteristics of the clinical intervention/policy/program and to disentangle it from the implementation strategies. Foremost in these indications are “home-grown” interventions, whose components and theory of change may not have been previously described, and novel interventions that are early in the translational pipeline, which may require greater detail for the reader/reviewer. Variant formats are provided as Additional Files A 2 to A 4 for use with situations and study designs commonly encountered in implementation research, including comparative implementation studies (A 2 ), studies involving multiple service contexts (A 3 ), and implementation optimization designs (A 4 ). Further, three illustrative IRLMs are provided, with brief descriptions of the projects and the utility of the IRLM (A 5 , A 6 and A 7 ).

Fig. 1. Implementation Research Logic Model (IRLM) Standard Form. Notes: Domain names in the determinants section were drawn from the Consolidated Framework for Implementation Research. The format of the outcomes column is from Proctor et al. 2011.

Fig. 2. Implementation Research Logic Model (IRLM) Standard Form with Intervention. Notes: Domain names in the determinants section were drawn from the Consolidated Framework for Implementation Research. The format of the outcomes column is from Proctor et al. 2011.

Core elements and theory

The IRLM specifies the relationships between determinants of implementation, implementation strategies, the mechanisms of action resulting from the strategies, and the implementation and clinical outcomes affected. These core elements are germane to every implementation research project in some way. Accordingly, the generalized theory of the IRLM posits that (1) implementation strategies selected for a given EBI are related to implementation determinants (context-specific barriers and facilitators), (2) strategies work through specific mechanisms of action to change the context or the behaviors of those within the context, and (3) implementation outcomes are the proximal impacts of the strategy and its mechanisms, which then relate to the clinical outcomes of the EBI. Articulated in part by others [ 9 , 12 , 21 , 28 , 29 ], this causal pathway theory is largely explanatory and details the Theory of Change and the Theory of Action of the implementation strategies in a single model. The EBI Theory of Action can also be displayed within a modified IRLM (see Additional File A 4 ). We now briefly describe the core elements and discuss conceptual challenges in how they relate to one another and to the overall goals of implementation research.
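The generalized theory above can be read as a chain of linked elements. The following is a minimal sketch of that chain as a data structure, not part of the IRLM itself: the class name `Pathway`, its field names, and the example entries are illustrative assumptions chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Pathway:
    """One hypothesized causal path through an IRLM (hypothetical structure)."""
    determinants: List[str]
    strategies: List[str]
    mechanisms: List[str]
    implementation_outcomes: List[str]
    clinical_outcomes: List[str] = field(default_factory=list)

    def describe(self) -> str:
        # Join the non-empty columns left to right, mirroring the pipeline layout.
        columns = [
            self.determinants,
            self.strategies,
            self.mechanisms,
            self.implementation_outcomes,
            self.clinical_outcomes,
        ]
        return " -> ".join("; ".join(column) for column in columns if column)


# Illustrative entries only; a real model would use the study's own elements.
example = Pathway(
    determinants=["knowledge-related barriers"],
    strategies=["training", "fidelity monitoring"],
    mechanisms=["knowledge", "self-efficacy"],
    implementation_outcomes=["acceptability", "adoption", "fidelity", "sustainment"],
)
print(example.describe())
```

Keeping each pathway as its own record is one way to make the determinant-to-outcome links explicit enough to reference from hypotheses and analysis plans.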

Determinants

Determinants of implementation are factors that might prevent or enable implementation (i.e., barriers and facilitators). Determinants may act as moderators, “effect modifiers,” or mediators, thus indicating that they are links in a chain of causal mechanisms [ 12 ]. Common determinant frameworks are the Consolidated Framework for Implementation Research (CFIR) [ 30 ] and the Theoretical Domains Framework [ 31 ].

Implementation strategies

Implementation strategies are supports, changes to, and interventions on the system to increase adoption of EBIs into usual care [ 32 ]. Consideration of determinants is commonly used when selecting and tailoring implementation strategies [ 28 , 29 , 33 ]. Providing the theoretical or conceptual reasoning for strategy selection is recommended [ 9 ]. The IRLM can be used to specify the proposed relationships between strategies and the other elements (determinants, mechanisms, and outcomes) and assists with considering, planning, and reporting all strategies in place during an implementation research project that could contribute to the outcomes and resulting changes.

Because implementation research occurs within dynamic delivery systems with multiple factors that determine success or failure, the field has experienced challenges identifying consistent links between individual barriers and specific strategies to overcome them. For example, the Expert Recommendations for Implementing Change (ERIC) compilation of strategies [ 32 ] was used to determine which strategies would best address contextual barriers identified by CFIR [ 29 ]. An online CFIR–ERIC matching process completed by implementation researchers and practitioners resulted in a large degree of heterogeneity and few consistent relationships between barrier and strategy, meaning the relationship is rarely one-to-one (e.g., a single strategy is often linked to multiple barriers; more than one strategy is often needed to address a single barrier). Moreover, when implementation outcomes are considered, researchers often find that to improve one outcome, more than one contextual barrier needs to be addressed, which might in turn require one or more strategies.

Frequently, the reporting of implementation research studies focuses on the strategy or strategies that were introduced for the research study, without due attention to other strategies already used in the system or additional supporting strategies that might be needed to implement the target strategy. The IRLM allows for the comprehensive specification of all introduced and present strategies, as well as their changes (adaptations, additions, discontinuations) during the project.

Mechanisms of action

Mechanisms of action are processes or events through which an implementation strategy operates to affect desired implementation outcomes [ 12 ]. The mechanism can be a change in a determinant, a proximal implementation outcome, an aspect of the implementation strategy itself, or a combination of these in a multiple-intervening-effect model. An example of a causal process might be using training and fidelity monitoring strategies to improve delivery agents’ knowledge and self-efficacy about the EBI in response to knowledge-related barriers in the service delivery system. This could raise the acceptability of the EBI among delivery agents, increase the likelihood of adoption, improve the fidelity of delivery, and lead to sustainment. Relatively few implementation studies formally test mechanisms of action, but this area of investigation has received significant attention more recently as the need to understand how strategies operate grows in the field [ 33 , 34 , 35 ].

Outcomes

Implementation outcomes are the effects of deliberate and purposive actions to implement new treatments, practices, and services [ 21 ]. They can be indicators of implementation processes, or key intermediate outcomes in relation to service, or target clinical outcomes. Glasgow et al. [ 36 , 37 , 38 ] describe the interrelated nature of implementation outcomes as occurring in a logical, but not necessarily linear, sequence of adoption by a delivery agent, delivery of the innovation with fidelity, reach of the innovation to the intended population, and sustainment of the innovation over time. The combined impact of these nested outcomes, coupled with the size of the effect of the EBI, determines the population or public health impact of implementation [ 36 ]. Outcomes earlier in the sequence can be conceptualized as mediators and mechanisms of strategies on later implementation outcomes. Specifying which strategies are theoretically intended to affect which outcomes, through which mechanisms of action, is crucial for improving the rigor and reproducibility of implementation research and for testing theory.

Using the Implementation Research Logic Model

Guiding principles

One of the critical insights from our preliminary work was that the use of the IRLM should be guided by a set of principles rather than governed by rules. These principles are intended to be flexible both to allow for adaptation to the various types of implementation studies and evolution of the IRLM over time and to address concerns in the field of implementation science regarding specification, rigor, reproducibility, and transparency of design and process [ 5 ]. Given this flexibility of use, the IRLM will invariably require accompanying text and other supporting documents. These are described in the section “Use of Supporting Text and Documents.”

Principle 1: Strive for comprehensiveness

Comprehensiveness increases transparency, can improve rigor, and allows for a better understanding of alternative explanations to the conclusions drawn, particularly in the presence of null findings for an experimental design. Thus, all relevant determinants, implementation strategies, and outcomes should be included in the IRLM.

Concerning determinants, the valence should be noted as being either a barrier, a facilitator, neutral, or variable by study unit. This can be achieved by simply adding plus (+) or minus (–) signs for facilitators and barriers, respectively, or by using coding systems such as that developed by Damschroder et al. [ 39 ], which indicates the relative strength of the determinant on a scale: –2 (strong negative impact), –1 (weak negative impact), 0 (neutral or mixed influence), 1 (weak positive impact), and 2 (strong positive impact). The use of such a coding system could yield better specification compared to using study-specific adjectives or changing the name of the determinant (e.g., greater relative priority, addresses patient needs, good climate for implementation). It is critical to include all relevant determinants and not simply limit reporting to those that are hypothesized to be related to the strategies and outcomes, as there are complex interrelationships between determinants.
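As a minimal sketch of how such a valence rating might be recorded alongside a determinant name, the following assumes a hypothetical helper, `format_determinant`, and hypothetical ratings; only the five scale labels come from the text above.

```python
# Scale labels as given in the text; the helper itself is an illustrative assumption.
VALENCE_LABELS = {
    -2: "strong negative impact",
    -1: "weak negative impact",
    0: "neutral or mixed influence",
    1: "weak positive impact",
    2: "strong positive impact",
}


def format_determinant(name: str, rating: int) -> str:
    """Return the determinant name with its signed valence rating appended."""
    if rating not in VALENCE_LABELS:
        raise ValueError(f"rating must be one of {sorted(VALENCE_LABELS)}")
    # Show an explicit sign for non-zero ratings and a bare 0 for neutral.
    signed = f"{rating:+d}" if rating != 0 else "0"
    return f"{name} ({signed})"


# Hypothetical ratings for determinants named in the text's examples.
for name, rating in [
    ("Relative priority", 2),
    ("Patient needs", -1),
    ("Implementation climate", 0),
]:
    print(format_determinant(name, rating), "=", VALENCE_LABELS[rating])
```

Keeping the determinant name fixed and attaching the rating separately avoids the study-specific adjectives the paragraph cautions against.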

Implementation strategies should be reported in their entirety. First, when using the IRLM for planning a study, it is important to list all strategies in the system, including those already in use and those to be initiated for the purposes of the study, often in the experimental condition of the design. Second, strategies should be labeled to indicate whether they were (a) in place in the system prior to the study, (b) initiated prospectively for the purposes of the study (particularly for experimental study designs), (c) removed as a result of being ineffective or onerous, or (d) introduced during the study to address an emergent barrier or supplement other strategies because of low initial impact. This is relevant when using the IRLM for planning, as an ongoing tracking system, for retrospective application to a completed study, and in the final reporting of a study. There have been a number of processes proposed for tracking the use of and adaptations to implementation strategies over time [ 40 , 41 ]. Each of these is more detailed than would be necessary for the IRLM, but the processes described provide a method for accurately tracking the temporal aspects of strategy use that fulfills the comprehensiveness principle.
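The four labels (a) to (d) lend themselves to a simple tracking record. The sketch below is one hypothetical way to keep such a record; the names `StrategyStatus`, `Strategy`, and `group_by_status`, and the example strategies, are assumptions for illustration and are not drawn from the tracking processes cited above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class StrategyStatus(Enum):
    """Labels (a) to (d) from the text; the enum itself is illustrative."""
    PRE_EXISTING = "in place prior to the study"
    INITIATED = "initiated prospectively for the study"
    REMOVED = "removed as ineffective or onerous"
    ADDED = "introduced during the study"


@dataclass
class Strategy:
    name: str
    status: StrategyStatus


def group_by_status(strategies: List[Strategy]) -> Dict[StrategyStatus, List[str]]:
    """Collect strategy names under each status, keeping empty groups visible."""
    groups: Dict[StrategyStatus, List[str]] = {status: [] for status in StrategyStatus}
    for strategy in strategies:
        groups[strategy.status].append(strategy.name)
    return groups


# Hypothetical strategies for a single study.
tracked = [
    Strategy("Audit and provide feedback", StrategyStatus.PRE_EXISTING),
    Strategy("Conduct educational meetings", StrategyStatus.INITIATED),
    Strategy("Identify and prepare champions", StrategyStatus.ADDED),
]
for status, names in group_by_status(tracked).items():
    print(f"{status.value}: {names}")
```

Listing empty groups as well as populated ones makes it obvious when, for example, no strategy has been removed, which supports the comprehensiveness principle.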

Although most studies will indicate a primary implementation outcome, other outcomes will almost assuredly be measured. Thus, they ought to be included in the IRLM. This guidance is given in large part due to the interdependence of implementation outcomes, such that adoption relates to delivery with fidelity, reach of the intervention, and potential for sustainment [ 36 ]. Similarly, the overall public health impact (defined as reach multiplied by the effect size of the intervention [ 38 ]) is inextricably tied to adoption, fidelity, acceptability, cost, etc. Although the study might justifiably focus on only one or two implementation outcomes, the others are nonetheless relevant and should be specified and reported. For example, it is important to capture potential unintended consequences and indicators of adverse effects that could result from the implementation of an EBI.
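The definition of public health impact as reach multiplied by effect size can be illustrated with a short calculation. This is a sketch under stated assumptions: reach is expressed as a proportion between 0 and 1, the function name `public_health_impact` is hypothetical, and the two scenarios use made-up numbers rather than results from any study.

```python
def public_health_impact(reach: float, effect_size: float) -> float:
    """Impact as reach (a proportion, 0 to 1) multiplied by effect size."""
    if not 0.0 <= reach <= 1.0:
        raise ValueError("reach must be a proportion between 0 and 1")
    return reach * effect_size


# Hypothetical scenarios: a strong intervention with low reach
# versus a weaker intervention that reaches more of the population.
scenario_a = public_health_impact(reach=0.2, effect_size=0.8)
scenario_b = public_health_impact(reach=0.6, effect_size=0.4)
print(f"A: {scenario_a:.2f}  B: {scenario_b:.2f}")
```

Under these assumed values the weaker but farther-reaching scenario has the larger product, which is why reach and the outcomes that drive it are worth specifying even when they are not the primary outcome.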

Principle 2: Indicate key conceptual relationships

Although the IRLM has a generalized theory (described earlier), there is a need to indicate the relationships between elements in a manner aligning with the specific theory of change for the study. Researchers ought to provide some form of notation to indicate these conceptual relationships using color-coding, superscripts, arrows, or a combination of the three. Such notations in the IRLM facilitate reference in the text to the study hypotheses, tests of effects, causal chain modeling, and other forms of elaboration (see “Use of supporting text and documents”). We prefer the use of superscripts to color or arrows in grant proposals and articles for practical purposes, as colors can be difficult to distinguish, and arrows can obscure text and add visual clutter. When presenting the IRLM using presentation programs (e.g., PowerPoint, Keynote), colors and arrows can be helpful, and animations can make these connections dynamic and sequential without adding to visual complexity. This principle could also prove useful in synthesizing across similar studies to build the science of tailored implementation, where strategies are selected based on the presence of specific combinations of determinants. As previously indicated [ 29 ], there is much work to be done in this area.

Principle 3: Specify critical study design elements

The critical elements will vary by study design (e.g., hybrid effectiveness-implementation trial, observational study, which subsystems are assigned to the strategies). This principle applies not only to researchers but also to service systems and communities, whose consent is necessary to carry out any implementation design [ 3 , 42 , 43 ].

Primary outcome(s)

Indicate the primary outcome(s) at each level of the study design (i.e., clinician, clinic, organization, county, state, nation). The levels should align with the specific aims of a grant application or the stated objective of a research report. In the case of a process evaluation or an observational study including the RE-AIM evaluation components [ 38 ] or the Proctor et al. [ 21 ] taxonomy of implementation outcomes, the primary outcome may be the product of the conceptual or theoretical model used when a priori outcomes are not clearly indicated. We also suggest including downstream health services and clinical outcomes even if they are not measured, as these are important for understanding the logic of the study and the ultimate health-related targets.

For quasi/experimental designs

When quasi/experimental designs [ 3 , 4 ] are used, the independent variable(s) (i.e., the strategies that are introduced or manipulated or that otherwise differentiate study conditions) should be clearly labeled. This is important for internal validity and for differentiating conditions in multi-arm studies.

For comparative implementation trials

In comparative implementation trials [ 3 , 4 ], two or more competing implementation strategies are introduced for the purposes of the study (i.e., the comparison is not implementation-as-usual), and there is a need to indicate the determinants, strategies, mechanisms, and potentially outcomes that differentiate the arms (see Additional File A 2 ). As comparative implementation can involve multiple service delivery systems, the determinants, mechanisms, and outcomes might also differ, though there must be at least one comparable implementation outcome. In our preliminary work applying the IRLM to a large-scale comparative implementation trial, we needed a separate IRLM for each arm of the trial: a single IRLM was not feasible because the strategies being tested occurred across two delivery systems and were, by design, very different. This is an example of the flexible use of the IRLM.

For implementation optimization designs

A number of designs are now available that aim to test processes of optimizing implementation. These include factorial, Sequential Multiple Assignment Randomized Trial (SMART) [ 44 ], adaptive [ 45 ], and roll-out implementation optimization designs [ 46 ]. These designs allow for (a) building time-varying adaptive implementation strategies based on the order in which components are presented [ 44 ], (b) evaluating the additive and combined effects of multiple strategies [ 44 , 47 ], and (c) incorporating data-driven iterative changes to improve implementation in successive units [ 45 , 46 ]. The IRLM in Additional File A 4 can be used for such designs.

Additional specification options

Users of the IRLM are allowed to specify any number of additional elements that may be important to their study. For example, one could notate those elements of the IRLM that have been or will be measured versus those that were based on the researcher’s prior studies or inferred from findings reported in the literature. Users can also indicate when implementation strategies differ by level or unit within the study. In large multisite studies, strategies might not be uniform across all units, particularly those strategies that already exist within the system. Similarly, there might be a need to increase the dose of certain strategies to address the relative strengths of different determinants within units.

Using the IRLM for different purposes and stages of research

Commensurate with logic models more generally, the IRLM can be used for planning and organizing a project, carrying out a project (as a roadmap), reporting and presenting the findings of a completed project, and synthesizing the findings of multiple projects or of a specific area of implementation research, such as what is known about how learning collaboratives are effective within clinical care settings.

When the IRLM is used for planning, the process of populating each of the elements often begins with the known parameter(s) of the study. For example, if the problem is improving the adoption and reach of a specific EBI within a particular clinical setting, the implementation outcomes and context, as well as the EBI, are clearly known. The downstream clinical outcomes of the EBI are likely also known. Working from the two “bookends” of the IRLM, the researchers and community partners and/or organization stakeholders can begin to fill in the implementation strategies that are likely to be feasible and effective and then posit conceptually derived mechanisms of action. In another example, only the EBI and primary clinical outcomes were known. The IRLM was useful in considering different scenarios for what strategies might be needed and appropriate to test the implementation of the EBI in different service delivery contexts. The IRLM was a tool for the researchers and stakeholders to work through these multiple options.

When we used the IRLM to plan for the execution of funded implementation studies, the majority of the parameters were already proposed in the grant application. However, through completing the IRLM prior to the start of the study, we found that a number of important contextual factors had not been considered, additional implementation strategies were needed to complement the primary ones proposed in the grant, and mechanisms needed to be added and measured. At the time of award, mechanisms were not an expected component of implementation research projects as they will likely become in the future.

For another project, the IRLM was applied retrospectively to report on the findings and overall logic of the study. Because nearly all elements of the IRLM were known, we approached completion of the model as a means of showing what happened during the study and to accurately report the hypothesized relationships that we observed. These relationships could be formally tested using causal pathway modeling [ 12 ] or other path analysis approaches with one or more intervening variables [ 48 ].

Synthesizing

In our preliminary work with the IRLM, we used it in each of the first three ways; the fourth (synthesizing) is ongoing within the National Cancer Institute’s Improving the Management of symPtoms during And Following Cancer Treatment (IMPACT) research consortium. The purpose is to draw conclusions for the implementation of an EBI in a particular context (or across contexts) that are shared and generalizable to provide a guide for future research and implementation.

Use of supporting text and documents

While the IRLM provides a good deal of information about a project in a single visual, researchers will need to convey additional details about an implementation research study through the use of supporting text, tables, and figures in grant applications, reports, and articles. Some elements that require elaboration are (a) preliminary data on the assessment and valence of implementation determinants; (b) operationalization/detailing of the implementation strategies being used or observed, using established reporting guidelines [ 9 ] and labeling conventions [ 32 ] from the literature; (c) hypothesized or tested causal pathways [ 12 ]; (d) process, service, and clinical outcome measures, including the psychometric properties, method, and timing of administration, respondents, etc.; (e) study procedures, including subject selection, assignment to (or observation of natural) study conditions, and assessment throughout the conduct of the study [ 4 ]; and (f) the implementation plan or process for following established implementation frameworks [ 49 , 50 , 51 ]. By utilizing superscripts, subscripts, and other notations within the IRLM, as previously suggested, it is easy to refer to (a) hypothesized causal paths in theoretical overviews and analytic plan sections, (b) planned measures for determinants and outcomes, and (c) specific implementation strategies in text, tables, and figures.

Evidence of IRLM utility and acceptability

The IRLM was used as the foundation for a training in implementation research methods for a group of 65 planning projects awarded under the national Ending the HIV Epidemic initiative. One investigator (project director or co-investigator) and one implementation partner (i.e., a collaborator from a community service delivery system) from each project were invited to attend a 2-day in-person summit in Chicago, IL, in October 2019. One hundred thirty-two participants attended, representing 63 of the 65 projects. A survey, which included demographics and questions pertaining to the Ending the HIV Epidemic, was sent to potential attendees prior to the summit, to which 129 individuals responded (all 65 project directors, 13 co-investigators, and 51 implementation partners; 62% female). Those who indicated an investigator role ( n = 78) received additional questions about prior implementation research training (e.g., formal coursework, workshop, self-taught) and related experiences (e.g., involvement in a funded implementation project, program implementation, program evaluation, quality improvement) and the stage of their project (i.e., exploration, preparation, implementation, sustainment [ 50 ]).

Approximately 6 weeks after the summit, 89 attendees (69%) completed a post-training survey comprising more than 40 questions about their overall experience. Though the invitation to complete the survey made no mention of the IRLM, it included 10 items related to the IRLM and one more generally about the logic of implementation research, each rated on a 4-point scale (1 = not at all , 2 = a little , 3 = moderately , 4 = very much ; see Table 1 ). Forty-two investigators (65% of projects) and 24 implementation partners indicated attending the training and began and completed the survey (68.2% female). Of the 66 respondents who attended the training, 100% completed all 11 IRLM items, suggesting little potential response bias.

Table 1 provides the means, standard deviations, and percent of respondents endorsing either “moderately” or “very” response options. Results were promising for the utility of the IRLM on the majority of the dimensions assessed. More than 50% of respondents indicated that the IRLM was “moderately” or “very” helpful on all questions. Overall, 77.6% ( M = 3.18, SD = .827) of respondents indicated that their knowledge on the logic of implementation research had increased either moderately or very much after the 2-day training. At the time of the survey, when respondents were about 2.5 months into their 1-year planning projects, 44.6% indicated that they had already been able to complete a full draft of the IRLM.
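The summary statistics reported per item (mean, standard deviation, and percent endorsing the top two response options) can be computed as in the sketch below. The responses are invented for illustration and are not the survey data; the function name `summarize_item` is hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which form was reported.

```python
import statistics
from typing import Dict, List


def summarize_item(responses: List[int]) -> Dict[str, float]:
    """Summarize one survey item rated on the 4-point scale (1 to 4)."""
    if any(r not in (1, 2, 3, 4) for r in responses):
        raise ValueError("responses must be integers from 1 to 4")
    # Ratings of 3 or 4 correspond to the top two response options.
    endorsing = sum(1 for r in responses if r >= 3)
    return {
        "mean": statistics.mean(responses),
        "sd": statistics.stdev(responses),  # sample SD (assumed)
        "percent_endorsing": 100 * endorsing / len(responses),
    }


# Invented responses for a single item, for illustration only.
hypothetical_responses = [4, 3, 2, 4, 1, 3, 4, 3]
summary = summarize_item(hypothetical_responses)
print(summary)
```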

Additional analyses using a one-way analysis of variance indicated no statistically significant differences in responses to the IRLM questions between investigators and implementation partners. However, three items approached significance: planning the project ( F = 2.460, p = .055), clearly reporting and specifying how the project is to be conducted ( F = 2.327, p = .066), and knowledge on the logic of implementation research ( F = 2.107, p = .091). In each case, scores were higher for the investigators compared to the implementation partners, suggesting that perhaps the knowledge gap in implementation research lay more in the academic realm than among community partners, who may not have a focus on research but whose day-to-day roles include the implementation of EBPs in the real world. Lastly, analyses using ordinal logistic regression did not yield any significant relationship between responses to the IRLM survey items and prior training ( n = 42 investigators who attended the training and completed the post-training survey), prior related research experience ( n = 42), and project stage of implementation ( n = 66). This suggests that the IRLM is a useful tool for both investigators and implementers with varying levels of prior exposure to implementation research concepts and across all stages of implementation research. As a result of this training, the IRLM is now a required element in the FY2020 Ending the HIV Epidemic Centers for AIDS Research/AIDS Research Centers Supplement Announcement released March 2020 [ 15 ].

Resources for using the IRLM

As the use of the IRLM for different study designs and purposes continues to expand and evolve, we envision supporting researchers and other program implementers in applying the IRLM to their own contexts. Our team at Northwestern University hosts web resources on the IRLM that include completed examples and tools to assist users in completing their model, including templates in various formats (Figs. 1 and 2 , Additional Files A 1 , A 2 , A 3 and A 4 and others), a Quick Reference Guide (Additional File A 8 ), and a series of worksheets that provide guidance on populating the IRLM (Additional File A 9 ). These will be available at https://cepim.northwestern.edu/implementationresearchlogicmodel/ .

The IRLM provides a compact visual depiction of an implementation project and is a useful tool for academic–practice collaboration and partnership development. Used in conjunction with supporting text, tables, and figures to detail each of the primary elements, the IRLM has the potential to improve a number of aspects of implementation research as identified in the results of the post-training survey. The usability of the IRLM is high for seasoned and novice implementation researchers alike, as evidenced by our survey results and preliminary work. Its use in the planning, executing, reporting, and synthesizing of implementation research could increase the rigor and transparency of complex studies that ultimately could improve reproducibility—a challenge in the field—by offering a common structure to increase consistency and a method for more clearly specifying links and pathways to test theories.

Implementation occurs across the gamut of contexts and settings. The IRLM can be used when large organizational change is being considered, such as a new strategic plan with multifaceted strategies and outcomes. Within a narrower scope of a single EBI in a specific setting, the larger organizational context still ought to be included as inner setting determinants (i.e., the impact of the organizational initiative on the specific EBI implementation project) and as implementation strategies (i.e., the specific actions being done to make the organizational change a reality that could be leveraged to implement the EBI or could affect the success of implementation). The IRLM has been used by our team to plan for large systemic changes and to initiate capacity building strategies to address readiness to change (structures, processes, individuals) through strategic planning and leadership engagement at multiple levels in the organization. This aspect of the IRLM continues to evolve.

Among the drawbacks of the IRLM is that it might be viewed as a somewhat simplified format. This represents the challenges of balancing depth and detail with parsimony, ease of comprehension, and ease of use. The structure of the IRLM may inhibit creative thinking if applied too rigidly, which is among the reasons we provide numerous examples of different ways to tailor the model to the specific needs of different project designs and parameters. Relatedly, we encourage users to iterate on the design of the IRLM to increase its utility.

The promise of implementation science lies in the ability to conduct rigorous and reproducible research, to clearly understand the findings, and to synthesize findings from which generalizable conclusions can be drawn and actionable recommendations for practice change emerge. As scientists and implementers have worked to better define the core methods of the field, the need for theory-driven, testable integration of the foundational elements involved in impactful implementation research has become more apparent. The IRLM is a tool that can aid the field in addressing this need and moving toward the ultimate promise of implementation research to improve the provision and quality of healthcare services for all people.

Availability of data and materials

Not applicable.

Abbreviations

CFIR: Consolidated Framework for Implementation Research

EBI: Evidence-based intervention

ERIC: Expert Recommendations for Implementing Change

IRLM: Implementation Research Logic Model

References

Nosek BA, Alter G, Banks GC, Borsboom D, Bowman SD, Breckler SJ, Buck S, Chambers CD, Chin G, Christensen G, et al. Promoting an open research culture. Science. 2015;348:1422–5.

Slaughter SE, Hill JN, Snelgrove-Clarke E. What is the extent and quality of documentation and reporting of fidelity to implementation strategies: a scoping review. Implement Sci. 2015;10:1–12.

Brown CH, Curran G, Palinkas LA, Aarons GA, Wells KB, Jones L, Collins LM, Duan N, Mittman BS, Wallace A, et al. An overview of research and evaluation designs for dissemination and implementation. Annu Rev Public Health. 2017;38.

Hwang S, Birken SA, Melvin CL, Rohweder CL, Smith JD. Designs and methods for implementation research: advancing the mission of the CTSA program. J Clin Transl Sci. 2020. Available online.

Smith JD. An Implementation Research Logic Model: a step toward improving scientific rigor, transparency, reproducibility, and specification. Implement Sci. 2018;14:S39.

Google Scholar  

Tabak RG, Khoong EC, Chambers DA, Brownson RC. Bridging research and practice: models for dissemination and implementation research. Am J Prev Med. 2012;43:337–50.

Nilsen P. Making sense of implementation theories, models and frameworks. Implement Sci. 2015;10:53.

Damschroder LJ. Clarity out of chaos: use of theory in implementation research. Psychiatry Res. 2019.

Proctor EK, Powell BJ, McMillen JC. Implementation strategies: recommendations for specifying and reporting. Implement Sci. 2013;8.

Kessler RS, Purcell EP, Glasgow RE, Klesges LM, Benkeser RM, Peek CJ. What does it mean to “employ” the RE-AIM model? Evaluation & the Health Professions. 2013;36:44–66.

Pinnock H, Barwick M, Carpenter CR, Eldridge S, Grandes G, Griffiths CJ, Rycroft-Malone J, Meissner P, Murray E, Patel A, et al. Standards for Reporting Implementation Studies (StaRI): explanation and elaboration document. BMJ Open. 2017;7:e013318.

Lewis CC, Klasnja P, Powell BJ, Lyon AR, Tuzzio L, Jones S, Walsh-Bailey C, Weiner B. From classification to causality: advancing understanding of mechanisms of change in implementation science. Front Public Health. 2018;6.

Glanz K, Bishop DB. The role of behavioral science theory in development and implementation of public health interventions. Annu Rev Public Health. 2010;31:399–418.

WK Kellogg Foundation: Logic model development guide. Battle Creek, Michigan: WK Kellogg Foundation; 2004.

CFAR/ARC Ending the HIV Epidemic Supplement Awards [ https://www.niaid.nih.gov/research/cfar-arc-ending-hiv-epidemic-supplement-awards ].

Funnell SC, Rogers PJ. Purposeful program theory: effective use of theories of change and logic models. San Francisco, CA: John Wiley & Sons; 2011.

Petersen D, Taylor EF, Peikes D. The logic model: the foundation to implement, study, and refine patient-centered medical home models (issue brief). Mathematica Policy Research: Mathematica Policy Research Reports; 2013.

Davidoff F, Dixon-Woods M, Leviton L, Michie S. Demystifying theory and its use in improvement. BMJ Quality &amp; Safety. 2015;24:228–38.

Fernandez ME, ten Hoor GA, van Lieshout S, Rodriguez SA, Beidas RS, Parcel G, Ruiter RAC, Markham CM, Kok G. Implementation mapping: using intervention mapping to develop implementation strategies. Front Public Health. 2019;7.

Proctor EK, Landsverk J, Aarons G, Chambers D, Glisson C, Mittman B. Implementation research in mental health services: an emerging science with conceptual, methodological, and training challenges. Admin Pol Ment Health. 2009;36.

Proctor EK, Silmere H, Raghavan R, Hovmand P, Aarons G, Bunger A, Griffey R, Hensley M. Outcomes for implementation research: conceptual distinctions, measurement challenges, and research agenda. Adm Policy Ment Health Ment Health Serv Res. 2011;38.

Rabin BA, Brownson RC: Terminology for dissemination and implementation research. In Dissemination and implementation research in health: translating science to practice. 2 edition. Edited by Brownson RC, Colditz G, Proctor EK. New York, NY: Oxford University Press; 2017: 19-45.

Smith JD, Rafferty MR, Heinemann AW, Meachum MK, Villamar JA, Lieber RL, Brown CH: Evaluation of the factor structure of implementation research measures adapted for a novel context and multiple professional roles. BMC Health Serv Res 2020.

Smith JD, Berkel C, Jordan N, Atkins DC, Narayanan SS, Gallo C, Grimm KJ, Dishion TJ, Mauricio AM, Rudo-Stern J, et al. An individually tailored family-centered intervention for pediatric obesity in primary care: study protocol of a randomized type II hybrid implementation-effectiveness trial (Raising Healthy Children study). Implement Sci. 2018;13:1–15.

Fauci AS, Redfield RR, Sigounas G, Weahkee MD, Giroir BP. Ending the HIV epidemic: a plan for the United States: Editorial. JAMA. 2019;321:844–5.

Grimshaw JM, Eccles MP, Lavis JN, Hill SJ, Squires JE. Knowledge translation of research findings. Implement Sci. 2012;7:50.

Brown CH, Curran G, Palinkas LA, Aarons GA, Wells KB, Jones L, Collins LM, Duan N, Mittman BS, Wallace A, et al. An overview of research and evaluation designs for dissemination and implementation. Annu Rev Public Health. 2017;38:1–22.

Krause J, Van Lieshout J, Klomp R, Huntink E, Aakhus E, Flottorp S, Jaeger C, Steinhaeuser J, Godycki-Cwirko M, Kowalczyk A, et al. Identifying determinants of care for tailoring implementation in chronic diseases: an evaluation of different methods. Implement Sci. 2014;9:102.

Waltz TJ, Powell BJ, Fernández ME, Abadie B, Damschroder LJ. Choosing implementation strategies to address contextual barriers: diversity in recommendations and future directions. Implement Sci. 2019;14:42.

Damschroder LJ, Aron DC, Keith RE, Kirsh SR, Alexander JA, Lowery JC. Fostering implementation of health services research findings into practice: a consolidated framework for advancing implementation science. Implement Sci. 2009;4.

Atkins L, Francis J, Islam R, O’Connor D, Patey A, Ivers N, Foy R, Duncan EM, Colquhoun H, Grimshaw JM, et al. A guide to using the Theoretical Domains Framework of behaviour change to investigate implementation problems. Implement Sci. 2017;12:77.

Powell BJ, Waltz TJ, Chinman MJ, Damschroder LJ, Smith JL, Matthieu MM, Proctor EK, Kirchner JE. A refined compilation of implementation strategies: results from the Expert Recommendations for Implementing Change (ERIC) project. Implement Sci. 2015;10.

Powell BJ, Fernandez ME, Williams NJ, Aarons GA, Beidas RS, Lewis CC, McHugh SM, Weiner BJ. Enhancing the impact of implementation strategies in healthcare: a research agenda. Front Public Health. 2019;7.

PAR-19-274: Dissemination and implementation research in health (R01 Clinical Trial Optional) [ https://grants.nih.gov/grants/guide/pa-files/PAR-19-274.html ].

Edmondson D, Falzon L, Sundquist KJ, Julian J, Meli L, Sumner JA, Kronish IM. A systematic review of the inclusion of mechanisms of action in NIH-funded intervention trials to improve medication adherence. Behav Res Ther. 2018;101:12–9.

Gaglio B, Shoup JA, Glasgow RE. The RE-AIM framework: a systematic review of use over time. Am J Public Health. 2013;103:e38–46.

Glasgow RE, Harden SM, Gaglio B, Rabin B, Smith ML, Porter GC, Ory MG, Estabrooks PA. RE-AIM planning and evaluation framework: adapting to new science and practice with a 20-year review. Front Public Health. 2019;7.

Glasgow RE, Vogt TM, Boles SM. Evaluating the public health impact of health promotion interventions: the RE-AIM framework. Am J Public Health. 1999;89:1322–7.

Damschroder LJ, Reardon CM, Sperber N, Robinson CH, Fickel JJ, Oddone EZ. Implementation evaluation of the Telephone Lifestyle Coaching (TLC) program: organizational factors associated with successful implementation. Transl Behav Med. 2016;7:233–41.

Bunger AC, Powell BJ, Robertson HA, MacDowell H, Birken SA, Shea C. Tracking implementation strategies: a description of a practical approach and early findings. Health Research Policy and Systems. 2017;15:15.

Boyd MR, Powell BJ, Endicott D, Lewis CC. A method for tracking implementation strategies: an exemplar implementing measurement-based care in community behavioral health clinics. Behav Ther. 2018;49:525–37.

Brown CH, Kellam S, Kaupert S, Muthén B, Wang W, Muthén L, Chamberlain P, PoVey C, Cady R, Valente T, et al. Partnerships for the design, conduct, and analysis of effectiveness, and implementation research: experiences of the Prevention Science and Methodology Group. Adm Policy Ment Health Ment Health Serv Res. 2012;39:301–16.

McNulty M, Smith JD, Villamar J, Burnett-Zeigler I, Vermeer W, Benbow N, Gallo C, Wilensky U, Hjorth A, Mustanski B, et al: Implementation research methodologies for achieving scientific equity and health equity. In Ethnicity & disease, vol. 29. pp. 83-92; 2019:83-92.

Collins LM, Murphy SA, Strecher V. The multiphase optimization strategy (MOST) and the sequential multiple assignment randomized trial (SMART): new methods for more potent eHealth interventions. Am J Prev Med. 2007;32:S112–8.

Brown CH, Ten Have TR, Jo B, Dagne G, Wyman PA, Muthén B, Gibbons RD. Adaptive designs for randomized trials in public health. Annu Rev Public Health. 2009;30:1–25.

Smith JD: The roll-out implementation optimization design: integrating aims of quality improvement and implementation sciences. Submitted for publication 2020.

Dziak JJ, Nahum-Shani I, Collins LM. Multilevel factorial experiments for developing behavioral interventions: power, sample size, and resource considerations. Psychol Methods. 2012;17:153–75.

MacKinnon DP, Lockwood CM, Hoffman JM, West SG, Sheets V. A comparison of methods to test mediation and other intervening variable effects. Psychol Methods. 2002;7:83–104.

Graham ID, Tetroe J. Planned action theories. In: Straus S, Tetroe J, Graham ID, editors. Knowledge translation in health care: Moving from evidence to practice. Wiley-Blackwell: Hoboken, NJ; 2009.

Moullin JC, Dickson KS, Stadnick NA, Rabin B, Aarons GA. Systematic review of the Exploration, Preparation, Implementation, Sustainment (EPIS) framework. Implement Sci. 2019;14:1.

Rycroft-Malone J. The PARIHS framework—a framework for guiding the implementation of evidence-based practice. J Nurs Care Qual. 2004;19:297–304.

Download references

Acknowledgements

The authors wish to thank our colleagues who provided input at different stages of developing this article and the Implementation Research Logic Model, and for providing the examples included in this article: Hendricks Brown, Brian Mustanski, Kathryn Macapagal, Nanette Benbow, Lisa Hirschhorn, Richard Lieber, Piper Hansen, Leslie O’Donnell, Allen Heinemann, Enola Proctor, Courtney Wolk-Benjamin, Sandra Naoom, Emily Fu, Jeffrey Rado, Lisa Rosenthal, Patrick Sullivan, Aaron Siegler, Cady Berkel, Carrie Dooyema, Lauren Fiechtner, Jeanne Lindros, Vinny Biggs, Gerri Cannon-Smith, Jeremiah Salmon, Sujata Ghosh, Alison Baker, Jillian MacDonald, Hector Torres and the Center on Halsted in Chicago, Michelle Smith, Thomas Dobbs, and the pastors who work tirelessly to serve their communities in Mississippi and Arkansas.

This study was supported by grant P30 DA027828 from the National Institute on Drug Abuse, awarded to C. Hendricks Brown; grant U18 DP006255 to Justin Smith and Cady Berkel; grant R56 HL148192 to Justin Smith; grant UL1 TR001422 from the National Center for Advancing Translational Sciences to Donald Lloyd-Jones; grant R01 MH118213 to Brian Mustanski; grant P30 AI117943 from the National Institute of Allergy and Infectious Diseases to Richard D’Aquila; grant UM1 CA233035 from the National Cancer Institute to David Cella; a grant from the Woman’s Board of Northwestern Memorial Hospital to John Csernansky; grant F32 HS025077 from the Agency for Healthcare Research and Quality; grant NIFTI 2016-20178 from the Foundation for Physical Therapy; the Shirley Ryan AbilityLab; and by the Implementation Research Institute (IRI) at the George Warren Brown School of Social Work, Washington University in St. Louis, through grant R25 MH080916 from the National Institute of Mental Health and the Department of Veterans Affairs, Health Services Research & Development Service, and Quality Enhancement Research Initiative (QUERI) to Enola Proctor. The opinions expressed herein are the views of the authors and do not necessarily reflect the official policy or position of the National Institutes of Health, the Centers for Disease Control and Prevention, the Agency for Healthcare Research and Quality, the Department of Veterans Affairs, or any other part of the US Department of Health and Human Services.

Author information

Authors and affiliations.

Department of Population Health Sciences, University of Utah School of Medicine, Salt Lake City, Utah, USA

Justin D. Smith

Center for Prevention Implementation Methodology for Drug Abuse and HIV, Department of Psychiatry and Behavioral Sciences, Department of Preventive Medicine, Department of Medical Social Sciences, and Department of Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA

Center for Prevention Implementation Methodology for Drug Abuse and HIV, Department of Psychiatry and Behavioral Sciences, Feinberg School of Medicine; Institute for Sexual and Gender Minority Health and Wellbeing, Northwestern University Chicago, Chicago, Illinois, USA

Dennis H. Li

Shirley Ryan AbilityLab and Center for Prevention Implementation Methodology for Drug Abuse and HIV, Department of Psychiatry and Behavioral Sciences and Department of Physical Medicine and Rehabilitation, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA

Miriam R. Rafferty

Contributions

JDS conceived of the Implementation Research Logic Model. JDS, MR, and DL collaborated in developing the Implementation Research Logic Model as presented and in the writing of the manuscript. All authors approved of the final version.

Corresponding author

Correspondence to Justin D. Smith.

Ethics declarations

Ethics approval and consent to participate.

Not applicable. This study did not involve human subjects.

Consent for publication

Competing interests.

None declared.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1.

IRLM Fillable PDF form

Additional file 2.

IRLM for Comparative Implementation

Additional file 3.

IRLM for Implementation of an Intervention Across or Linking Two Contexts

Additional file 4.

IRLM for an Implementation Optimization Study

Additional file 5.

IRLM example 1: Faith in Action: Clergy and Community Health Center Communication Strategies for Ending the Epidemic in Mississippi and Arkansas

Additional file 6.

IRLM example 2: Hybrid Type II Effectiveness–Implementation Evaluation of a City-Wide HIV System Navigation Intervention in Chicago, IL

Additional file 7.

IRLM example 3: Implementation, spread, and sustainment of Physical Therapy for Mild Parkinson’s Disease through a Regional System of Care

Additional file 8.

IRLM Quick Reference Guide

Additional file 9.

IRLM Worksheets

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Smith, J.D., Li, D.H. & Rafferty, M.R. The Implementation Research Logic Model: a method for planning, executing, reporting, and synthesizing implementation projects. Implementation Sci 15, 84 (2020). https://doi.org/10.1186/s13012-020-01041-8

Received: 03 April 2020

Accepted: 03 September 2020

Published: 25 September 2020

DOI: https://doi.org/10.1186/s13012-020-01041-8

  • Program theory
  • Integration
  • Study specification

Implementation Science

ISSN: 1748-5908

  • Open access
  • Published: 27 May 2024

Optimizing double-layered convolutional neural networks for efficient lung cancer classification through hyperparameter optimization and advanced image pre-processing techniques

  • M. Mohamed Musthafa 1 ,
  • I. Manimozhi 2 ,
  • T. R. Mahesh 3 &
  • Suresh Guluwadi 4  

BMC Medical Informatics and Decision Making volume 24, Article number: 142 (2024)

Lung cancer remains a leading cause of cancer-related mortality globally, with prognosis significantly dependent on early-stage detection. Traditional diagnostic methods, though effective, often face challenges regarding accuracy, early detection, and scalability, being invasive, time-consuming, and prone to ambiguous interpretations. This study proposes an advanced machine learning model designed to enhance lung cancer stage classification using CT scan images, aiming to overcome these limitations by offering a faster, non-invasive, and reliable diagnostic tool. Utilizing the IQ-OTHNCCD lung cancer dataset, comprising CT scans from various stages of lung cancer and healthy individuals, we performed extensive preprocessing including resizing, normalization, and Gaussian blurring. A Convolutional Neural Network (CNN) was then trained on this preprocessed data, and class imbalance was addressed using Synthetic Minority Over-sampling Technique (SMOTE). The model’s performance was evaluated through metrics such as accuracy, precision, recall, F1-score, and ROC curve analysis. The results demonstrated a classification accuracy of 99.64%, with precision, recall, and F1-score values exceeding 98% across all categories. SMOTE significantly enhanced the model’s ability to classify underrepresented classes, contributing to the robustness of the diagnostic tool. These findings underscore the potential of machine learning in transforming lung cancer diagnostics, providing high accuracy in stage classification, which could facilitate early detection and tailored treatment strategies, ultimately improving patient outcomes.

Introduction

Lung cancer stands as a formidable global health challenge, consistently ranking as one of the leading causes of cancer-related mortality worldwide. It is characterized by the uncontrolled growth of abnormal cells in one or both lungs, typically in the cells lining the air passages. Unlike normal cells, these cancerous cells do not develop into healthy lung tissue; instead, they divide rapidly and form tumors that disrupt the lung’s primary function: oxygen exchange.

The global impact of lung cancer is staggering, with millions of new cases diagnosed annually. Its high mortality rate is primarily due to late-stage detection, where the cancer has progressed to an advanced stage or metastasized to other body parts, significantly diminishing the effectiveness of treatment modalities. Thus, early and accurate diagnosis of lung cancer is paramount in improving patient prognoses, extending survival rates, and enhancing the quality of life for affected individuals.

The primary cause of lung cancer is cigarette smoking, which exposes the lungs to carcinogenic substances that can damage the cells’ DNA and lead to cancer. Other risk factors for lung cancer include exposure to secondhand smoke, radon gas, asbestos, air pollution, and a family history of lung cancer.

Symptoms of lung cancer can vary but may include persistent coughing, chest pain, shortness of breath, hoarseness, coughing up blood, unexplained weight loss, and fatigue. However, lung cancer may not cause symptoms in its initial stages, which is why early detection through screening is crucial for improving outcomes.

Diagnosis of lung cancer typically involves imaging tests such as chest X-rays, CT scans, and PET scans to visualize the lungs and detect any abnormalities. A biopsy, where a small sample of lung tissue is taken and examined under a microscope, is usually needed to confirm the diagnosis.

Treatment options for lung cancer depend on several factors, including the type and stage of the cancer, as well as the patient’s overall health and preferences. Treatment may include surgery to remove the tumor, chemotherapy, radiation therapy, targeted therapy, immunotherapy, or a combination of these approaches.

Lung cancer is a critical condition that necessitates immediate medical care. Detecting it early, along with improvements in treatment methods, has enhanced the prognosis for numerous patients. However, the most effective strategy to avoid lung cancer is to stop smoking and minimize contact with additional risk elements. Figure 1 displays some example images of lung cancer tests.

Figure 1. Sample images of lung cancer

Current diagnostic techniques for lung cancer involve various approaches, such as biopsies, CT scans, chest X-rays, PET scans, and MRI, among others [1]. While these methods are invaluable in the diagnostic process, they come with certain limitations. For instance, biopsies, while definitive, are invasive and carry risks of complications. Less invasive imaging methods such as X-rays or CT scans might produce false positives or negatives, potentially causing unwarranted stress or delays in treatment.

Moreover, the interpretation of these diagnostic tests relies heavily on the expertise of the clinician, introducing a degree of subjectivity and potential for human error. A further challenge is early-stage lung cancer, which often presents very subtle changes that are not always detectable with conventional imaging techniques [2].

This context highlights the critical need for advanced diagnostic tools capable of overcoming these challenges. This study aims to address these issues by developing a machine learning model using Convolutional Neural Networks (CNNs) to enhance the precision and effectiveness of lung cancer stage classification from CT scans. By automating and refining the diagnostic process, the proposed model seeks to mitigate the limitations of traditional methods, offering a faster, non-invasive, and more reliable diagnostic alternative.

The impact of this study is significant: the model’s high accuracy in classifying lung cancer stages promises to revolutionize clinical diagnostics, facilitating early detection and enabling tailored treatment strategies. This advancement has the potential to improve patient outcomes by allowing for timely intervention and more effective management of lung cancer, ultimately contributing to reduced mortality rates and enhanced patient care.

The objectives of this research paper are to:

Develop a machine learning model utilizing Convolutional Neural Networks (CNNs) for lung cancer stage classification based on CT scans.

Bridge existing diagnostic deficiencies by providing clinicians with a tool for expedited and precise decision-making in lung cancer management.

Contribute to improved patient outcomes through enhanced diagnostic accuracy and early detection capabilities.

The paper is organized as follows: Initially, the Literature Review explores existing research on lung cancer diagnostics, highlighting advancements and limitations, and sets the foundation for the proposed methodology. Subsequently, the Materials and methods section describes the dataset, preprocessing steps, model architecture, training process, and evaluation metrics in detail. The Results section then presents the study’s findings, including model performance metrics and comparative analysis with existing methods. This is followed by the Discussion, which interprets the results, discusses implications for clinical practice, addresses limitations, and suggests future research directions. Finally, the Conclusion summarizes the main findings and their relevance within the broader scope of lung cancer diagnostics, supported by a comprehensive list of References to provide credit and enable readers to explore the research background further.

Through this structured approach, the paper aims to contribute meaningful insights to the field of medical imaging and machine learning, offering a novel tool for the early and accurate diagnosis of lung cancer.

Literature review

The literature surrounding lung cancer diagnostics encompasses various methodologies, ranging from traditional imaging techniques to more advanced approaches such as machine learning. This review aims to explore existing research in this area, highlighting both the advancements made and the limitations faced, ultimately setting the foundation for the proposed machine learning-based methodology.

Diagnosis of lung cancer using CT scans

The use of Computed Tomography (CT) scans in lung cancer diagnosis has been a cornerstone in the medical field, offering high-resolution images that are pivotal for detecting and monitoring various stages of lung tumors [3]. Over the years, numerous studies have underscored the importance of CT scans in identifying nodules that could potentially be malignant, with a particular focus on low-dose CT scans, which have become a standard in screening programs, especially for high-risk populations. These studies highlight the superior sensitivity of CT scans in identifying early-stage lung cancer, a significant advancement over other imaging methods like chest X-rays, which may overlook smaller, subtler lesions.

Despite the advancements, the interpretation of CT scans remains a significant challenge. Radiologists need to discern between benign and malignant nodules, an endeavor complicated by the presence of various artifacts and benign conditions like scars or inflammatory diseases, which can mimic the appearance of cancerous nodules [4, 5].

Machine learning approaches in lung cancer detection and classification

The integration of machine learning, particularly deep learning techniques, into the analysis of CT images has established a groundbreaking paradigm in the identification and classification of lung cancer. Convolutional Neural Networks (CNNs) are spearheading this transformation by providing a framework for automated extraction and categorization of features directly from the images. This advancement marks a substantial stride in augmenting the accuracy and effectiveness of lung cancer diagnostics, thus facilitating more precise and timely interventions.

Binary classification models

Early studies primarily focused on binary classification, distinguishing between malignant and non-malignant nodules. CNNs, through their layered architecture, have demonstrated the ability to learn complex patterns in imaging data, surpassing traditional computer vision techniques in accuracy and reliability [6, 7].

Multi-class classification models

Recent advancements have moved towards more nuanced multi-class classification models that categorize nodules into various cancer stages or types. This granularity is crucial for treatment planning and prognosis, offering a more detailed understanding of the disease’s progression [8].

Transfer learning

Given the challenges of assembling large annotated medical imaging datasets, transfer learning has become a popular approach. Models pre-trained on vast, non-medical image datasets are fine-tuned on smaller medical imaging datasets, leveraging learned features to improve performance in the medical domain [9].

Data augmentation

To address limited training data, augmentation strategies such as rotation, scaling, and flipping are commonly employed, artificially expanding the training dataset. These methods bolster the model’s resilience and its ability to generalize from a limited number of examples [10].
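As a minimal illustration of the flipping and rotation strategies described above, the following sketch applies them to a tiny grayscale image stored as a nested list. The function names and the restriction to 90-degree rotations are assumptions made for brevity; they are not taken from any of the reviewed studies, and practical pipelines typically rely on an imaging library and arbitrary rotation angles.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img: Image) -> Image:
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment(img: Image) -> List[Image]:
    """Return the original image plus three transformed copies."""
    return [img, flip_horizontal(img), flip_vertical(img), rotate_90_clockwise(img)]


# Hypothetical 2x2 "scan" used only to show the effect of each transform.
sample = [[1, 2],
          [3, 4]]
variants = augment(sample)
```

Each call to `augment` turns one labeled image into four training examples that share the same label, which is how augmentation enlarges a small dataset without new acquisitions.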

Segmentation models

Deep learning models extend their utility beyond mere classification; they are also employed in segmentation tasks, delineating the precise boundaries of nodules, which is vital for assessing tumor size and growth over time. U-Net, a type of CNN, is particularly noted for its effectiveness in medical image segmentation [11].

Table 1 lists a few of the studies conducted in this field.

Gaps in current research

Despite significant advancements in lung cancer diagnostics, several critical gaps remain in the current research landscape. Many existing models are trained on datasets lacking diversity in demographics, scanner types, and image acquisition parameters, which can limit their generalizability across different populations and clinical settings. This limitation underscores the need for more comprehensive and diverse datasets to enhance the robustness of diagnostic models. Additionally, the “black box” nature of deep learning models poses a challenge for clinical adoption, as there is a growing demand for models that not only predict accurately but also provide insights into the reasoning behind their predictions. This issue of interpretability is crucial for gaining the trust of clinicians and integrating these models into clinical workflows effectively. Furthermore, the transition from research to clinical practice is slow, with models requiring not just technological solutions but also addressing regulatory, ethical, and practical considerations to facilitate their integration into routine medical care. Another critical gap is the need for models capable of longitudinal analysis, which can analyze changes in lung nodules over time, providing a dynamic assessment that aligns more closely with clinical needs. Addressing these gaps, this study introduces a comprehensive CNN model trained on a diverse and extensive dataset, encompassing various stages of lung cancer. The model is designed for multi-class classification, offering detailed insights critical for personalized treatment strategies. Emphasis is placed on the interpretability of the model, aiming to provide clinicians with understandable and actionable information. By demonstrating the model’s effectiveness in a clinical setting, this research contributes to the ongoing effort to integrate advanced machine learning techniques into the realm of lung cancer diagnosis and treatment.

Materials and methods

This section describes the methodology used to construct and validate a convolutional neural network (CNN) model for classifying lung cancer stages using the IQ-OTHNCCD lung cancer dataset. The approach covers dataset acquisition, preprocessing, model architecture, training procedures, and evaluation metrics to ensure a comprehensive and reliable analysis. The workflow of the proposed model is depicted in Fig. 2.

Figure 2. Workflow of the proposed model

Dataset description and preprocessing

The IQ-OTH/NCCD lung cancer dataset used in this study is curated to support the creation and validation of machine learning models that identify and classify lung cancer stages. It provides a large collection of CT scan images for advancing diagnostic capabilities in the field of lung cancer.

The CT scan images cover a diverse range of cases, including benign, malignant, and normal cases. This diversity is essential for training robust models that generalize well across the spectrum of lung cancer manifestations. Table 2 gives a brief description of the dataset.

Fig. 3 presents the information from Table 2 visually.

figure 3

Dataset description

Annotating and labeling each image meticulously, medical professionals from the Iraq-Oncology Teaching Hospital/National Center for Cancer Diseases have ensured the dataset’s reliability. Annotations categorize images into one of three classes: benign, malignant, or normal. Such granular labeling establishes a solid ground truth essential for training and assessing the model, enhancing the dataset’s utility in research and clinical applications.

Characterized by high quality and consistency, the CT scans adhere to standardized imaging protocols, guaranteeing reliability and accuracy. However, variations in image dimensions necessitate preprocessing to standardize inputs for neural networks. This preprocessing ensures that the model processes uniform data, enhancing its performance and generalizability across diverse datasets. The image ratio in the dataset is checked using Eq. 1.

Preprocessing steps are pivotal in preparing data for effective model training, including:

Resizing: Resizing images to a uniform dimension ensures consistency in input size for CNNs, optimizing model performance.

Normalization: Normalizing pixel values to a scale of 0 to 1 expedites model convergence during training, facilitating efficient learning. Normalization is computed using Eq. 2.

Augmentation: Utilizing data augmentation methods like rotation, flipping, and scaling improves the model’s robustness and helps prevent overfitting by effectively enlarging the dataset size.

Splitting: Partitioning the dataset into training, validation, and test sets is crucial for facilitating effective model training and evaluation, thereby ensuring the model’s ability to generalize and perform accurately on unseen data.
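The normalization and augmentation steps listed above can be illustrated with a short plain-Python sketch. It is not the authors' pipeline: the helper names (normalize, hflip, rotate90, augment) and the toy 2 × 3 image are assumptions made for illustration, and 8-bit pixel values are assumed.

```python
# Illustrative sketch only; helper names and the toy image are assumptions.

def normalize(image, max_value=255.0):
    """Scale 8-bit pixel intensities to the range [0, 1]."""
    return [[pixel / max_value for pixel in row] for row in image]

def hflip(image):
    """Horizontal flip: reverse the pixel order within each row."""
    return [row[::-1] for row in image]

def rotate90(image):
    """Rotate 90 degrees clockwise: reverse the rows, then transpose."""
    return [list(column) for column in zip(*image[::-1])]

def augment(image):
    """Return the original image plus two simple augmented variants."""
    return [image, hflip(image), rotate90(image)]

# Toy grayscale "image" with 2 rows and 3 columns (hypothetical values).
toy = [[0, 128, 255],
       [64, 32, 16]]
scaled = normalize(toy)      # every value now lies in [0, 1]
variants = augment(scaled)   # three training samples from one image
```

Augmented variants should be generated from training images only, so that the validation and test sets remain untouched copies of real scans.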

In this process, the CNN is trained on the preprocessed dataset to extract features from CT scan images and classify the stages of lung cancer. The dataset’s diversity and quality are pivotal in enabling the model to learn nuanced features and patterns associated with various lung cancer stages, underscoring its significance in advancing diagnostic accuracy and efficiency.

The IQ-OTH/NCCD lung cancer dataset serves as the cornerstone for developing machine learning models that enhance early detection and classification of lung cancer. Through meticulous curation and rigorous preprocessing, this dataset showcases the transformative potential of AI in healthcare, underscoring its role in improving diagnostic accuracy and efficiency.

  • Image preprocessing

Image preprocessing is a pivotal stage in developing a machine learning model, especially when handling medical imaging data like the IQ-OTH/NCCD lung cancer dataset. This procedure comprises several steps, each tailored to convert the raw CT scan images into a format conducive to effective analysis by a convolutional neural network (CNN).

Initially, image resizing is conducted. Given the inherent variability in the dimensions of CT scans, it is imperative to standardize the size of all images to ensure consistent input to the CNN. Resizing is performed while preserving the aspect ratio to avoid distortion, typically scaling down to a fixed size (e.g., 256 × 256 pixels). This uniformity is vital because the neural network requires a consistent input size to process and learn from the data effectively [21].
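One common way to reach a fixed square size while preserving the aspect ratio is to scale the longer side to the target and pad the shorter side. The sketch below computes the resulting dimensions. The padding step and the function name letterbox_dims are assumptions for illustration; the text above does not say how the remaining space is filled.

```python
# Illustrative sketch only; the padding strategy is an assumption.

def letterbox_dims(height, width, target=256):
    """Return the resized (h, w) and the (top, bottom, left, right) padding
    needed to place the resized image on a target x target canvas."""
    scale = target / max(height, width)        # longer side maps to target
    new_h = max(1, round(height * scale))
    new_w = max(1, round(width * scale))
    pad_h, pad_w = target - new_h, target - new_w
    top, left = pad_h // 2, pad_w // 2         # centre the image
    return (new_h, new_w), (top, pad_h - top, left, pad_w - left)

# Hypothetical 512 x 768 scan scaled to fit a 256 x 256 canvas.
resized, padding = letterbox_dims(512, 768)
```

For this hypothetical 512 × 768 scan, the image is scaled to 171 × 256 and padded by 42 rows above and 43 rows below.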

Fig. 4 shows some of the pre-processed images.

figure 4

Pre-processed images

Following resizing, normalization of pixel values is performed. CT scans, by nature, contain a wide range of pixel intensities, which can adversely affect the training process of a CNN due to the varying scales of image brightness and contrast. Normalization adjusts the pixel values to fall within a specific range, commonly 0 to 1 or -1 to 1. This adjustment is typically achieved by dividing the pixel values by the maximum possible value, which is 255 for 8-bit images. This step ensures that the model trains faster and more effectively, as small, standardized values facilitate quicker convergence during the optimization process.

Gaussian blur is then applied as an additional preprocessing step. This technique, which employs a Gaussian kernel to smooth the image, is instrumental in reducing image noise and mitigating the effects of minor variations and artifacts in the scans. By doing so, the model’s focus is directed toward the salient features relevant to lung cancer classification, rather than being distracted by irrelevant noise or details. Gaussian blur operates by convolving the image with a Gaussian function, effectively averaging the pixel values within a specified radius. This process smoothens the image, reducing high-frequency components and noise, which can otherwise lead to overfitting or distraction during the training of the CNN.

In the context of lung cancer CT scans, Gaussian blur helps to highlight the important structural elements of the lungs and nodules while suppressing irrelevant details that could complicate the model’s learning process. By smoothing the images, Gaussian blur enhances the model’s ability to generalize by focusing on the more significant, lower-frequency features of the image, such as the shape and size of nodules, rather than being confounded by small variations or noise. This is particularly beneficial in medical imaging, where the presence of noise and artifacts can obscure critical diagnostic features.

The application of Gaussian blur can also aid in generalizing the model, preventing overfitting to the high-frequency noise present in the training set. The blur is computed using Eq. 3, and the SMOTE ratio using Eq. 4.
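To make the smoothing step concrete, the sketch below builds a normalized Gaussian kernel and convolves it with a small image in plain Python. The kernel radius, the value of sigma, and the edge-clamping border rule are assumptions for illustration, since the settings used in this study are not stated here.

```python
import math

# Illustrative sketch only; radius, sigma, and border handling are assumptions.

def gaussian_kernel(radius, sigma):
    """Build a (2*radius+1) x (2*radius+1) Gaussian kernel that sums to 1."""
    size = 2 * radius + 1
    kernel = [[math.exp(-((x - radius) ** 2 + (y - radius) ** 2)
                        / (2 * sigma ** 2))
               for x in range(size)] for y in range(size)]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]

def gaussian_blur(image, radius=1, sigma=1.0):
    """Convolve a 2-D image (list of rows) with a Gaussian kernel.
    Coordinates outside the image are clamped to the nearest edge pixel."""
    kernel = gaussian_kernel(radius, sigma)
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    acc += kernel[di + radius][dj + radius] * image[ii][jj]
            out[i][j] = acc
    return out

# A single bright pixel (isolated noise) on a dark 5 x 5 background.
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 1.0
blurred = gaussian_blur(spike)
```

After blurring, the isolated bright pixel is attenuated and its intensity is spread over its neighbours, which is the noise-suppressing behaviour described above.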

These preprocessing steps collectively enhance the quality and consistency of the input data, enabling the CNN to focus on learning meaningful, discriminative features from the CT images [22]. By ensuring that the images are appropriately resized, normalized, and filtered, the model is better equipped to identify the subtle nuances associated with different stages of lung cancer, thereby improving its diagnostic accuracy and reliability. Through meticulous image preprocessing, the foundation is laid for developing a robust machine learning model capable of contributing significantly to the field of medical imaging and diagnostics.

Deep learning model

The model architecture used in this study is a Convolutional Neural Network (CNN), known for its effectiveness in image analysis tasks, particularly in medical contexts such as lung cancer diagnosis from CT scans. In simple terms, it works as follows. First, the input layer takes in grayscale images resized to a standard size of 256 × 256 pixels. This consistency helps the CNN learn efficiently. Next, the first convolutional layer looks for basic patterns such as edges and textures using small 3 × 3 filters. Max pooling then reduces the size of the feature maps while retaining the most prominent features, which helps the model generalize better and ignore noise. This process is repeated with a second convolutional layer to capture more complex patterns. The flattening layer converts the extracted feature maps into a single vector. A fully connected layer then reasons over these features to support the final classification, and the output layer gives probabilities for each class (benign, malignant, or normal). Throughout training, we used the Adam optimizer to adapt learning rates and manage gradients effectively. Additionally, we applied SMOTE to balance the dataset so that the model learned from all classes equally. With this architecture and these steps, we aimed to create a model that can accurately classify lung cancer stages from CT scans.

Input layer : The input layer accepts images resized to 256 × 256 pixels, maintaining a single channel (grayscale), resulting in an input shape of (256, 256, 1).

First convolutional layer : This layer consists of 64 filters of size 3 × 3, using a ReLU (Rectified Linear Unit) activation function. The choice of 64 filters is aimed at capturing a broad array of features from the input image, while the 3 × 3 filter size is standard for capturing spatial relationships in the image data. The equations involved are given in Eqs. 5 and 6.

First max pooling layer : Following the convolutional layer, the model incorporates a max pooling layer with a 2 × 2 pool size. This layer decreases the spatial dimensions of the feature maps, which reduces the computational load and enhances the model’s generalization capabilities. By focusing on the most prominent features, max pooling helps the model avoid overfitting to noise in the training data. Max pooling is computed using Eq. 7.

Second convolutional layer : Another set of 64 filters is applied, as in the first convolutional layer, to further refine the feature extraction. This layer also uses a 3 × 3 kernel and is followed by a ReLU activation. The equations involved are given in Eqs. 8 and 9.

Second max pooling layer : This layer additionally decreases the size of the feature maps, aiding in the prevention of overfitting and lessening the computational burden.

Flattening : The feature maps are flattened into a single vector to prepare for the fully connected layers, facilitating the transition from convolutional layers to dense layers.

Fully connected layer : A dense layer with 16 neurons is used, providing high-level reasoning based on the extracted features. This layer utilizes a linear activation function to allow for a range of linear responses. The equations involved are given in Eqs. 10 and 11.

Output layer : The final layer of the model contains three neurons, each representing one of the classes: benign, malignant, and normal. It uses a SoftMax activation function, which is selected because it provides a probability distribution across these three classes. The equations involved are given in Eqs. 12 and 13.

Optimizer : The Adam optimizer is used due to its effectiveness in managing sparse gradients and its ability to adapt learning rates, which enhances the convergence speed during training. The update rule is given in Eq. 14.
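The Adam update mentioned above can be sketched for a single scalar parameter as follows. The hyperparameter defaults shown (learning rate 0.001, beta1 0.9, beta2 0.999, epsilon 1e-8) are the commonly cited defaults and are an assumption here, not the settings reported for this study. The quadratic objective is a toy stand-in for the network loss.

```python
import math

# Illustrative sketch only; hyperparameter values are assumed defaults.

def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter at time step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad           # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad * grad    # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
x, m, v = 0.0, 0.0, 0.0
first_x = None
for t in range(1, 501):
    grad = 2 * (x - 3)
    x, m, v = adam_step(x, grad, m, v, t, lr=0.05)
    if t == 1:
        first_x = x   # the first step has magnitude of about lr
```

Because each step is scaled by the running estimate of the squared gradient, the first update moves the parameter by roughly the learning rate regardless of the raw gradient magnitude. This is the adaptive behaviour referred to above.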

CNN is chosen for its proven efficacy in image classification tasks, particularly its ability to learn hierarchical patterns in data. In medical imaging, CNNs have demonstrated success in identifying subtle patterns that are indicative of various pathologies, making them ideal for this application. The sequential model with convolutional layers followed by pooling layers allows for the extraction and down sampling of features, which is critical for capturing relevant information from medical images.
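The effect of the convolution and pooling stages on feature-map size can be traced with simple arithmetic. The sketch below follows the layer list given above (256 × 256 × 1 input, two 3 × 3 convolutions with 64 filters, two 2 × 2 max-pooling layers, a 16-unit dense layer, and a 3-unit output). Stride 1 and no padding are assumptions, since the text does not state them, so the resulting shapes and parameter counts are indicative only.

```python
# Illustrative sketch only; stride 1 and "valid" (no) padding are assumptions.

def conv_out(size, kernel=3, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_model(input_size=256, channels=1, filters=64, dense_units=16, classes=3):
    """Return (layer name, output shape, trainable parameters) per layer."""
    layers = []
    size = conv_out(input_size)
    layers.append(("conv1", (size, size, filters), (3 * 3 * channels + 1) * filters))
    size //= 2
    layers.append(("pool1", (size, size, filters), 0))
    size = conv_out(size)
    layers.append(("conv2", (size, size, filters), (3 * 3 * filters + 1) * filters))
    size //= 2
    layers.append(("pool2", (size, size, filters), 0))
    flat = size * size * filters
    layers.append(("flatten", (flat,), 0))
    layers.append(("dense", (dense_units,), (flat + 1) * dense_units))
    layers.append(("output", (classes,), (dense_units + 1) * classes))
    return layers

layers = trace_model()
shapes = {name: shape for name, shape, _ in layers}
total_params = sum(params for _, _, params in layers)
for name, shape, params in layers:
    print(f"{name:8s} {str(shape):16s} {params}")
```

Under these assumptions the spatial size shrinks from 256 to 254, 127, 125, and finally 62, and almost all trainable parameters sit in the dense layer that follows flattening.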

The Synthetic Minority Over-Sampling Technique (SMOTE) is an oversampling strategy devised to address class imbalance within a dataset. Class imbalance poses a substantial risk of biasing the model’s performance, particularly in medical datasets where one class may be underrepresented. SMOTE creates synthetic samples in the feature space of the minority class by interpolating between a minority sample and its nearest minority-class neighbors. This process helps rectify class imbalances and ensures more equitable representation during model training.
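The interpolation at the core of SMOTE can be sketched in a few lines. This is a simplified illustration, not the implementation used in this study: the function name smote_samples, the value of k, and the toy two-dimensional feature vectors are assumptions, and how the CT images are represented as feature vectors for SMOTE is not specified here.

```python
import random

# Illustrative sketch only; k, the seed, and the toy data are assumptions.

def smote_samples(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        neighbour = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + lam * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical minority-class feature vectors.
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.5], [3.0, 3.0]]
new_points = smote_samples(minority, n_new=5, k=2)
```

Each synthetic point lies on the line segment between two real minority samples, so the new samples stay inside the region already occupied by the minority class.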

Filter maps of a sample image are shown in Fig. 5 to support the interpretability of the model.

figure 5

In this research:

Application of SMOTE : SMOTE is applied only to the training data to prevent information leakage and to promote robust generalization on unseen data. It balances the dataset by augmenting the minority classes, ensuring that the model does not become biased toward the majority class.

Impact on model performance : By addressing the class imbalance, SMOTE helps in improving the model’s sensitivity towards the minority class, which is crucial in medical diagnostics, as overlooking a positive case can have serious implications.

Considerations : While SMOTE can significantly improve model performance in cases of class imbalance, it’s essential to monitor for overfitting, as the synthetic samples may cause the model to overgeneralize from the minority class.

The algorithm for the proposed model is presented in Algorithm 1.

figure a

Algorithm 1: Proposed algorithm for the methodology

As per the algorithm, two convolutional layers, each followed by a max-pooling layer, play a pivotal role in feature detection. Utilizing a standard 3 × 3 kernel size allows the model to discern small, localized features within CT scan images. By alternating these convolutional layers with max pooling, the model captures patterns such as edges, textures, and shapes, which are crucial for distinguishing between benign, malignant, and normal lung tissue. The ReLU activation function is employed in these convolutional layers due to its effectiveness in introducing non-linearity, enabling the model to learn complex patterns efficiently. Additionally, max pooling is utilized to downsample the feature maps, reducing computational load and enhancing robustness to image variations, thereby improving translation invariance. Following feature extraction, the model flattens the output and transitions to dense layers, condensing learned information into abstract representations. The final layer consists of three neurons, representing the three classes under consideration, and employs the SoftMax activation function to transform logits into probabilities, thereby providing insights into the model’s confidence regarding each class. For compilation and training, the Adam optimizer and the sparse categorical cross-entropy loss function, given in Eq. 15, are chosen for their adaptive learning rates and suitability for classification objectives, respectively. Validation on an independent dataset is crucial for detecting overfitting and refining hyperparameters.
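The relationship between the SoftMax output and the sparse categorical cross-entropy loss can be shown with a small worked example. The logits below are hypothetical, and the class order (benign, malignant, normal) is an assumption for illustration.

```python
import math

# Illustrative sketch only; logits and class order are hypothetical.

def softmax(logits):
    """Convert logits to probabilities; subtracting the max avoids overflow."""
    shift = max(logits)
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(probs, label, eps=1e-12):
    """Negative log-probability assigned to the true class index."""
    return -math.log(max(probs[label], eps))

# Assumed class order: 0 = benign, 1 = malignant, 2 = normal.
logits = [0.5, 2.0, -1.0]
probs = softmax(logits)
loss_correct = sparse_categorical_crossentropy(probs, label=1)
loss_wrong = sparse_categorical_crossentropy(probs, label=2)
```

The loss is small when the true class receives a high probability and grows as that probability approaches zero. The label is passed as an integer index rather than a one-hot vector, which is what "sparse" refers to.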

In the training phase, SMOTE is strategically applied to create a balanced dataset representative of all classes, crucial for generalizing well across various lung tissue conditions, especially in medical datasets where class imbalance may exist.

Training and validation

Throughout the training and validation phases of the deep learning model, meticulous steps are taken to ensure that the model not only learns effectively from the training data but also demonstrates robust generalization capabilities when presented with new, unseen data. This phase plays a pivotal role in evaluating the model’s proficiency in accurately classifying lung cancer stages from CT scans.

The training process begins by partitioning the dataset into distinct training and validation subsets. The split is stratified so that each subset preserves the class proportions of the full dataset. Such stratification is essential for maintaining consistency and mitigating biases, particularly in light of the class imbalance addressed by SMOTE during training. Approximately 80% of the data is allocated for training, while the remaining 20% is reserved for validation.

Following the data split, training proceeds with a batch size of 8. The selection of a smaller batch size is deliberate, aiming to facilitate more precise and nuanced updates to the model’s weights during each iteration, thereby potentially enhancing generalization. Nonetheless, it is imperative to strike a balance between this granularity and computational efficiency, as smaller batch sizes may prolong the training duration.
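A stratified 80/20 split of the kind described above can be sketched as follows. The function name stratified_split, the seed, and the class counts in the example are hypothetical and do not reflect the actual dataset.

```python
import random
from collections import defaultdict

# Illustrative sketch only; the class counts below are hypothetical.

def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split sample indices so each class keeps roughly the same share
    in the training and validation subsets."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_val = round(len(indices) * val_fraction)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)

labels = ["benign"] * 10 + ["malignant"] * 50 + ["normal"] * 40
train_idx, val_idx = stratified_split(labels)
val_counts = {c: sum(labels[i] == c for i in val_idx) for c in set(labels)}
```

Any oversampling such as SMOTE would then be applied to the training indices only, in line with the statement above that synthetic samples are kept out of the held-out data.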

The number of epochs is set to 12, that is, the learning algorithm makes 12 complete passes over the entire training dataset. This choice balances underfitting against overfitting: too few epochs may hinder the model’s learning process, whereas too many may result in the model memorizing the training data, impairing its ability to generalize. The progression of training and validation loss and accuracy across epochs is shown in Fig. 6.

figure 6

Training and validation loss and accuracy

During training, the model’s performance is continuously evaluated using a comprehensive set of performance metrics assessed against the validation set. These metrics encompass accuracy, precision, recall, and F1-score, all of which are instrumental in comprehending the model’s strengths and weaknesses in classifying each lung cancer stage. Accuracy furnishes a broad overview of the model’s overall performance, while precision and recall delve deeper into its class-specific performance, a critical consideration in medical diagnostics where false negatives and false positives carry significant consequences. The F1-score serves to harmonize precision and recall, furnishing a unified metric to gauge the model’s equilibrium between these two facets.

Moreover, the validation process incorporates a confusion matrix and ROC curves to furnish a more granular analysis of the model’s performance across diverse thresholds and classes. The confusion matrix delineates the model’s true positives, false positives, false negatives, and true negatives, offering a snapshot of its classification capabilities. Meanwhile, ROC curves and the corresponding AUC (Area Under the Curve) provide insights into the model’s capacity to discriminate between classes at varying threshold settings, a crucial consideration for refining the model’s decision boundary.
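For a multi-class model, an ROC curve is typically drawn per class in a one-vs-rest fashion. The corresponding AUC equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes the AUC directly from that definition. The scores and labels are hypothetical, and the pairwise formulation is chosen here for clarity, not as the method used in this study.

```python
# Illustrative sketch only; scores and labels are hypothetical.

def roc_auc(scores, is_positive):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as half."""
    pos = [s for s, flag in zip(scores, is_positive) if flag]
    neg = [s for s, flag in zip(scores, is_positive) if not flag]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# One-vs-rest example: predicted probability of "malignant" for four scans.
scores = [0.9, 0.4, 0.6, 0.2]
is_malignant = [True, True, False, False]
auc = roc_auc(scores, is_malignant)   # 3 of the 4 pairs are ranked correctly
```

An AUC of 1.0 means every positive sample is scored above every negative sample, while 0.5 corresponds to random ranking.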

To maximize the performance of the CNN model for lung cancer classification, we tuned several hyperparameters that shape the learning process and, ultimately, the model’s accuracy: the learning rate, batch size, number of filters in each convolutional layer, filter size, and dropout rate. First, we explored a range of learning rates to find the value that gives swift convergence towards the minimum of the loss function without overshooting. Next, we compared batch sizes to balance training time against the stability of the gradient descent process. We then explored combinations of the number of filters and filter sizes in the convolutional layers to find the configuration best suited to extracting salient features from the CT scan images. To combat overfitting, we also tuned the dropout rate, that is, the proportion of neurons deactivated during training. We used a grid search, systematically traversing predefined sets of values for each hyperparameter and evaluating the model’s performance using cross-validation. This exhaustive search identified the hyperparameter combination that improved both classification accuracy and generalization. The selected hyperparameters were then validated on a distinct validation set. This systematic approach to hyperparameter tuning improved the performance and stability of the lung cancer classification model, strengthening its potential for real-world clinical applications.
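The grid search procedure can be sketched generically as below. The candidate values in the grid and the scoring function toy_score are hypothetical. In practice the score would be the cross-validated accuracy of a model trained with the given settings, which is replaced here by an arbitrary closed-form stand-in so that the example runs on its own.

```python
import math
from itertools import product

# Illustrative sketch only; the grid values and toy_score are hypothetical.

def grid_search(grid, evaluate):
    """Evaluate every combination in the grid and return the best one."""
    names = list(grid)
    best_score, best_params = float("-inf"), None
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_score, best_params = score, params
    return best_params, best_score

def toy_score(params):
    """Stand-in for cross-validated accuracy; NOT a real training run."""
    return (1.0
            - 0.1 * abs(math.log10(params["learning_rate"]) + 3)
            - 0.01 * abs(params["batch_size"] - 16)
            - 0.2 * abs(params["dropout"] - 0.25))

grid = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [8, 16, 32],
    "dropout": [0.25, 0.5],
}
best_params, best_score = grid_search(grid, toy_score)
```

The cost of an exhaustive search grows with the product of the list lengths (18 combinations here), which is why the candidate sets are usually kept small.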

The training and validation phases operate iteratively, with refinements made to the model’s architecture, hyperparameters, or training methodology based on the validation outcomes. This iterative refinement persists until the model achieves a satisfactory equilibrium of accuracy, generalizability, and robustness, thereby ensuring its efficacy and reliability in clinical settings for lung cancer stage classification.

Statistical methods

In the analysis of the IQ-OTH/NCCD lung cancer dataset, various statistical and machine learning techniques were employed to ensure a comprehensive evaluation of the data. The primary focus was on classification metrics to assess the performance of the predictive models.

Confusion matrix : The confusion matrix serves as a pivotal component in our analysis, furnishing a visual representation of the model’s performance. It succinctly presents the counts of true positives, true negatives, false positives, and false negatives, thereby offering a lucid comprehension of the model’s classification accuracy and any instances of misclassification.

Accuracy : The accuracy metric was calculated by dividing the number of correctly predicted observations by the total number of observations, providing a straightforward measure for assessing the model’s overall performance. However, relying solely on accuracy can be deceptive, particularly in datasets with imbalanced class distributions. Therefore, it is imperative to incorporate additional metrics for a more comprehensive evaluation. Accuracy is computed using Eq. 16.

Precision (positive predictive value) : Precision was utilized to assess the accuracy of positive predictions, quantified as the ratio of true positives to the sum of true positives and false positives. This metric bears significant relevance in scenarios where the repercussions of false positives are considerable. Precision is computed using Eq. 17.

Recall (sensitivity or true positive rate) : Recall assesses the model’s ability to detect positive instances, calculated as the ratio of true positives to the sum of true positives and false negatives. This metric holds particular importance in medical diagnostics, where failing to identify a positive case can lead to severe consequences. Recall is computed using Eq. 18.

F1-score : The F1-score, which is the harmonic mean of precision and recall, was used to provide a balance between the two metrics, particularly valuable in situations of class imbalance. It is a more robust measure than accuracy in scenarios where false negatives and false positives have different implications. The F1-score is computed using Eq. 19.

Cohen’s kappa : The Cohen’s Kappa statistic was applied to assess the agreement between observed and predicted classifications, accounting for chance agreement. This statistic offers a nuanced understanding of the model’s performance, which is particularly valuable in scenarios involving imbalanced datasets. Cohen’s kappa is computed using Eq. 20.

Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) : MSE and RMSE were calculated to evaluate the average squared difference and the square root of the average squared difference, respectively, between predicted and actual classification categories. These metrics are instrumental in understanding the variance of prediction errors. MSE and RMSE are computed using Eqs. 21 and 22, respectively.

Mean Absolute Error (MAE) : MAE measures the average magnitude of errors in a set of predictions, regardless of their direction. It is a linear score, meaning that all individual differences are equally weighted in the average. MAE is computed using Eq. 23.

Receiver Operating Characteristic (ROC) Curve and Area Under the Curve (AUC) : The ROC curve graphically illustrates the diagnostic ability of the model by plotting the true positive rate against the false positive rate at various threshold settings. The AUC provides a single scalar value summarizing the overall performance of the model across all possible classification thresholds. The AUC is computed using Eq. 24.

F2-score : The F2-score was calculated to weigh recall higher than precision, useful in scenarios where missing positive cases is more detrimental than raising false positives. The F2-score is computed using Eq. 25.
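The classification metrics defined above can all be derived from a confusion matrix. The sketch below does this for accuracy and per-class precision, recall, and F-beta (beta = 1 gives the F1-score, beta = 2 the F2-score). The 3 × 3 matrix is a toy example with hypothetical counts, not the confusion matrix of this study, and the convention that rows are true classes and columns are predicted classes is an assumption.

```python
# Illustrative sketch only; the confusion matrix holds hypothetical counts.

def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall."""
    denom = beta ** 2 * precision + recall
    return (1 + beta ** 2) * precision * recall / denom if denom else 0.0

def classification_metrics(cm, beta=1.0):
    """cm[i][j] = number of samples of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    per_class = []
    for c in range(n):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, truly another class
        fn = sum(cm[c]) - tp                        # truly c, predicted another class
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        per_class.append({"precision": precision, "recall": recall,
                          "f_beta": f_beta(precision, recall, beta)})
    return accuracy, per_class

toy_cm = [[8, 1, 1],
          [0, 18, 2],
          [1, 0, 19]]
accuracy, per_class = classification_metrics(toy_cm, beta=1.0)
```

As a consistency check on the formula, f_beta(0.9677, 1.0) evaluates to approximately 0.9836, which matches the benign-class precision, recall, and F1-score reported later in this paper.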

These statistical methods and metrics provided a multifaceted evaluation of the model’s performance, ensuring a robust analysis of the predictive capabilities and reliability in classifying the cases within the IQ-OTH/NCCD lung cancer dataset.
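Cohen’s kappa and the error metrics (MSE, RMSE, MAE) can be sketched in the same way. The confusion matrix and label lists below are hypothetical. Treating class labels as integers for the error metrics is an assumption, and the resulting values depend on the order in which classes are numbered.

```python
import math

# Illustrative sketch only; all counts and labels are hypothetical.

def cohens_kappa(cm):
    """Agreement between true and predicted labels, corrected for chance.
    cm[i][j] = number of samples of true class i predicted as class j."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(n)) / total
    expected = sum(sum(cm[i]) * sum(cm[r][i] for r in range(n))
                   for i in range(n)) / total ** 2
    return (observed - expected) / (1 - expected)

def error_metrics(y_true, y_pred):
    """MSE, RMSE, and MAE between integer-encoded class labels."""
    diffs = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(d * d for d in diffs) / len(diffs)
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return mse, math.sqrt(mse), mae

toy_cm = [[8, 1, 1],
          [0, 18, 2],
          [1, 0, 19]]
kappa = cohens_kappa(toy_cm)

# Assumed encoding: 0 = benign, 1 = malignant, 2 = normal.
y_true = [0, 1, 2, 2]
y_pred = [0, 1, 2, 0]
mse, rmse, mae = error_metrics(y_true, y_pred)
```

In the second example a single normal case predicted as benign contributes 4 to the squared-error sum and 2 to the absolute-error sum, because the two classes are two index positions apart under the assumed encoding.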

The evaluation of the IQ-OTH/NCCD lung cancer dataset through our predictive model yielded detailed insights across various statistical metrics, showcasing the model’s efficacy in classifying lung cancer stages. Here we delve into a comprehensive analysis of each metric:

Confusion matrix : The confusion matrix offered a detailed perspective on the model’s classification performance, showing a high count of true positives and true negatives. There were minimal occurrences of false positives and false negatives, underscoring the model’s accuracy in discerning between benign, malignant, and normal cases. The confusion matrix is shown in Fig. 7.

figure 7

Confusion matrix

Accuracy : The overall model accuracy was 99.64%, highlighting the model’s robust capacity to accurately identify and classify instances within the dataset. This accuracy supports the model’s reliability and establishes a solid basis for subsequent validation and potential clinical implementation. Fig. 8 shows examples of correctly classified instances.

figure 8

Correctly classified instances

Precision : The precision metric provided valuable insights into the model’s predictive reliability. It attained a precision of 96.77% for benign cases, signifying a high probability that a case predicted as benign is indeed benign. Moreover, for malignant and normal cases, the precision reached 100%, demonstrating the model’s outstanding ability to predict these categories accurately without any false positives.

Recall : The recall scores were equally remarkable, achieving 100% for both benign and malignant cases, and 99.04% for normal cases. These findings underscore the model’s sensitivity and its capability to detect nearly all true positive cases, thereby mitigating the risk of false negatives, a pivotal consideration in medical diagnostics.

F1-score : The F1-scores, which strike a balance between precision and recall, were 98.36% for benign, 100% for malignant, and 99.52% for normal cases. These scores signify the model’s balanced performance across both the accuracy of positive predictions and the reduction of false negatives. Table 3 summarizes the classification report.

A heatmap of the values in Table 3 is provided in Fig. 9.

figure 9

Classification report

Cohen’s kappa : With a Cohen’s Kappa score of 0.9938, the model exhibited near-perfect agreement with the actual classifications, far surpassing the agreement expected by chance alone. This underscores an elevated level of consistency in the model’s predictions, thus reinforcing its reliability.

Mean Squared Error (MSE) and Root Mean Squared Error (RMSE) : The model reported an MSE of 0.0145 and an RMSE of 0.1206, indicating minimal variance and bias in the prediction errors. These low values suggest that the model’s predictions are consistently close to the actual values, enhancing trust in its predictive power.

Mean Absolute Error (MAE) : With an MAE of 0.0073, the model exhibited minimal average error magnitude in its predictions, signifying high predictive accuracy. This metric further reinforces the model’s suitability for clinical settings where precision is crucial. To visualize the error metrics, a bar chart is given in Fig. 10.

figure 10

Error metrics bar chart

Receiver Operating Characteristic (ROC) Curve and Area Under the Curve (AUC) : The ROC curves and corresponding AUC values were exceptional, achieving AUCs of 1.00 for malignant, benign, and normal cases. These results indicate the model’s outstanding discrimination ability between different classes across various threshold settings. The ROC curves are shown in Fig. 11.

figure 11

F2-score : The F2-score of 0.9964, which places more emphasis on recall, indicates the model’s strong ability to identify positive cases. This is particularly important in the medical field, where failing to detect a condition could have profound consequences. The performance scores are shown in Fig. 12.

figure 12

Performance scores

The detailed results across these metrics provide a comprehensive picture of the model’s performance, highlighting its precision, reliability, and robustness in classifying lung cancer stages from the IQ-OTH/NCCD dataset. The findings demonstrate the model’s potential as a diagnostic tool, supporting its further investigation and potential integration into clinical practice.

The analysis of the IQ-OTH/NCCD lung cancer dataset with our model reveals a groundbreaking level of performance in medical image classification. With an accuracy of 99.64% and exceptional precision and recall metrics across the three categories (benign, malignant, and normal), the model emerges as a highly reliable diagnostic aid. The significance of these results extends beyond the high metric scores; it lies in the model’s capability to accurately distinguish between benign and malignant cases, a critical aspect for patient management and treatment planning.

The high F1-score underscores the model’s balanced consideration of precision and recall, thereby minimizing the risk of misdiagnosis. Additionally, the emphasis on recall in the F2-score holds particular significance in the medical domain, where overlooking a positive case (false negative) can have more severe consequences than erroneously identifying a case as positive (false positive). Table 4 compares the baseline models with the proposed model.

In the realm of lung cancer detection, many existing models focus predominantly on binary classification, often neglecting the nuanced differentiation between benign and malignant cases [37]. Our model’s tri-classification capability sets a new benchmark, offering a more detailed diagnostic tool compared to the binary classifiers. When juxtaposed with existing methods, our model’s performance underscores its advanced detection capabilities, potentially offering a more nuanced and informative diagnostic perspective than currently available tools.

For clinical practice, the integration of such a high-performing model could revolutionize lung cancer diagnostics [22, 38]. It can augment radiologists’ capabilities, reducing diagnostic time and increasing throughput. The ability to accurately classify lung nodules as benign, malignant, or normal could significantly reduce unnecessary interventions, minimizing patient exposure to invasive procedures and associated risks. Additionally, it can streamline the patient pathway, ensuring rapid treatment initiation for malignant cases and appropriate follow-up for benign conditions [39, 40].

While the results are promising, the study’s limitations warrant consideration. The model’s training on a dataset from a specific demographic and geographic area raises questions about its applicability to broader populations. Additionally, the model’s performance in a controlled study environment might not fully translate to the diverse and unpredictable nature of clinical settings. The black-box nature of deep learning models also poses a challenge in clinical contexts, where understanding the rationale behind a diagnosis is as crucial as the diagnosis itself [ 41 ]. Fig. 13 shows some of the misclassified instances.

Fig. 13. Misclassified instances

When evaluating our CNN model’s performance on the lung cancer dataset, we noticed some classification errors, which can arise for several reasons. First, some features in the CT scans look similar in benign and malignant nodules, making them hard for the model to tell apart. Noise and artifacts in the scans can also obscure important details. Although we tried to balance the classes, rare cases can still be difficult for the model to recognize, and early-stage cancer may closely resemble normal tissue. Differences in how scans are acquired can further affect the model’s predictions. Lastly, if the model overfits the training data, it may not perform well on new, unseen images. To address these issues, we plan to improve data preparation, for example through more effective noise removal, and to make the model more robust to different imaging conditions. We also aim to combine multiple models and use more diverse data to improve accuracy. By addressing these challenges, we hope to make our model better at classifying lung cancer stages.

While the IQ-OTH/NCCD lung cancer dataset has been instrumental in developing and validating our model, it is important to recognize its limitations, particularly concerning demographic and geographic diversity. The dataset predominantly represents a specific population, which may not capture the full spectrum of variations seen in global populations. This limitation poses challenges for the model’s generalizability, as differences in demographics, such as age, ethnicity, and underlying health conditions, can influence the presentation of lung cancer in CT scans.

To address these limitations, future research should focus on expanding the dataset to include a more diverse range of CT scan images from various demographic groups and geographic regions. This expansion can be facilitated through collaborations with international medical institutions and accessing publicly available medical imaging repositories. Additionally, incorporating advanced data augmentation techniques that simulate variations in demographic characteristics, such as age and gender, can further enhance the dataset’s diversity. By broadening the dataset, we aim to improve the model’s robustness and ensure its applicability across different populations, ultimately enhancing the utility and reliability of our diagnostic tool in diverse clinical settings. This approach will contribute to developing a more inclusive and universally applicable model for lung cancer diagnosis.

Sensitivity analysis of precision, recall, and F1-score

To assess the performance of our Convolutional Neural Network (CNN) model for lung cancer diagnosis comprehensively, we conducted a sensitivity analysis focusing on precision, recall, and the F1-score. Precision sensitivity involved systematically adjusting the threshold values used for classification to observe their impact on false positive rates and the model’s conservatism in identifying positive cases. As precision increased, indicating a more stringent classification approach, false positives decreased, but the risk of false negatives rose, necessitating a delicate balance in medical diagnostics. Conversely, recall sensitivity entailed modifying the model’s sensitivity to detect positive cases, thereby influencing its ability to minimize false negatives. Heightened recall improved the identification of true positives, crucial for early diagnosis and treatment, albeit with potential increases in false positives, mandating cautious management. Additionally, analyzing the F1-score, the harmonic mean of precision and recall, elucidated its role in balancing false positives and false negatives. Optimizing for a high F1-score underscored a balanced approach, ensuring robust performance across both precision and recall metrics. Overall, the sensitivity analysis underscored the significance of striking a delicate balance between precision, recall, and the F1-score to optimize the model’s performance in clinical settings. By navigating and managing these trade-offs effectively, we can bolster the reliability and efficacy of our model in diagnosing lung cancer, thereby contributing to improved patient outcomes.
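The threshold trade-off described above can be illustrated with a small sweep. This is a sketch under stated assumptions, not the procedure used in the study: it treats the problem as binary (for example, malignant versus not malignant, one-vs-rest), and the scores, labels, and the helper name `precision_recall_at` are all hypothetical.

```python
from typing import List, Tuple


def precision_recall_at(scores: List[float], labels: List[int],
                        threshold: float) -> Tuple[float, float]:
    """Precision and recall when a case is called positive if score >= threshold."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 1:
            fn += 1
    # Convention assumed here: precision is 1.0 when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical predicted probabilities and ground-truth labels (1 = positive).
scores = [0.95, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

for threshold in (0.80, 0.50, 0.25):
    p, r = precision_recall_at(scores, labels, threshold)
    print(f"threshold={threshold:.2f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data, lowering the threshold raises recall from 0.50 to 1.00 while precision falls from 1.00 to about 0.67, which mirrors the trade-off between false negatives and false positives discussed in the text.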

Regulatory considerations for clinical application

Implementing machine learning models in clinical settings involves navigating a complex landscape of regulatory requirements to ensure patient safety, data security, and efficacy. One of the primary regulatory hurdles is obtaining approval from medical device regulatory bodies such as the U.S. Food and Drug Administration (FDA), the European Medicines Agency (EMA), or other relevant national authorities. These regulatory agencies require extensive validation studies to demonstrate the model’s accuracy, reliability, and safety in diagnosing lung cancer. This involves rigorous testing on diverse datasets to ensure the model’s generalizability and performance across different patient populations and clinical scenarios.

Additionally, regulatory guidelines mandate that machine learning models used in healthcare must provide a level of interpretability and transparency. Clinicians need to understand the decision-making process of the model to trust and effectively integrate it into clinical workflows. This requirement for explainability poses a challenge for deep learning models, which are often considered “black boxes.” Therefore, developing methods to elucidate the model’s reasoning, such as feature importance analysis or visual explanations, is crucial for meeting regulatory standards.

Data privacy and security are also significant regulatory concerns, particularly with the implementation of regulations like the General Data Protection Regulation (GDPR) in Europe and the Health Insurance Portability and Accountability Act (HIPAA) in the United States. Ensuring that patient data is anonymized, securely stored, and used ethically is essential for compliance. This includes implementing robust data encryption, access controls, and audit trails to protect sensitive health information from unauthorized access and breaches.

Moreover, post-market surveillance is a critical component of regulatory compliance, requiring continuous monitoring of the model’s performance in real-world clinical settings. This involves tracking the model’s diagnostic accuracy, identifying potential biases, and updating the model as needed to maintain its efficacy and safety over time. Establishing a framework for ongoing evaluation and improvement is essential to meet regulatory requirements and ensure the model’s long-term success in clinical applications.

Addressing these regulatory hurdles necessitates close collaboration between developers, healthcare providers, and regulatory bodies to ensure that machine learning models are safe, effective, and aligned with clinical needs. By adhering to these regulatory frameworks, we can facilitate the successful integration of advanced diagnostic tools into healthcare, ultimately enhancing patient outcomes and advancing the field of medical diagnostics.

Future research directions should focus on external validation of the model across various populations and healthcare settings to ascertain its universality and robustness. Integrating multimodal data, encompassing patient history, genetic information, and other diagnostic results, could enhance the model’s diagnostic precision. Addressing the interpretability of deep learning models could foster greater trust and integration into clinical decision-making processes. Additionally, prospective studies assessing the model’s impact on clinical outcomes, patient satisfaction, and healthcare efficiency would provide invaluable insights into its practical benefits and potential areas for improvement.

This study presented a comprehensive analysis of the IQ-OTH/NCCD lung cancer dataset using a sophisticated machine learning model, which demonstrated exceptional performance in classifying lung cancer stages. Key findings include a near-perfect accuracy rate of 99.64%, alongside impressive precision and recall metrics across benign, malignant, and normal case classifications. The model’s balanced F1-score and the emphasis on recall in the F2-score further highlight its diagnostic precision and sensitivity. These results signify a substantial advancement in the model’s ability to differentiate between nuanced lung cancer stages, providing a critical tool for early and accurate diagnosis.

The implications of these discoveries on the field of lung cancer diagnostics are profound. The model’s precision in classifying lung cancer stages holds the promise of substantially enhancing diagnostic protocols, thereby refining the accuracy and efficiency of lung cancer detection. This advancement has the potential to facilitate earlier treatment interventions, potentially enhancing patient outcomes and survival rates. Moreover, the model’s capability to differentiate between benign and malignant nodules could mitigate the need for unnecessary invasive procedures, consequently reducing patient risk and healthcare expenditures.

Future research should focus on external validation of the model to ensure its effectiveness across diverse populations and clinical settings. The exploration of model interpretability is crucial for clinical adoption, where understanding the basis for diagnostic decisions is essential. Additionally, integrating the model with other diagnostic data and clinical workflows could enhance its utility and impact.

Prospective studies are needed to evaluate the model’s real-world clinical impact, particularly its ability to improve patient outcomes, streamline diagnostic pathways, and reduce healthcare costs. The potential for the model to be adapted or extended to other types of cancers or medical imaging modalities also represents an exciting avenue for future research.

This study highlights the potential of advanced machine learning models to transform lung cancer diagnostics, providing a more precise, effective, and nuanced approach to detecting and classifying lung cancer. The ongoing advancement and incorporation of such models into clinical settings hold the promise of catalyzing substantial progress in patient care and outcomes within the field of oncology.

Availability of data and materials

Data used for the findings are publicly available at https://www.kaggle.com/datasets/hamdallak/the-iqothnccd-lung-cancer-dataset .

Nooreldeen R. Current and future development in lung cancer diagnosis. Int J Mol Sci. 2021;22:8661.

Rea G, et al. Beyond visual interpretation: quantitative analysis and artificial intelligence in interstitial lung disease diagnosis expanding horizons in radiology. Diagnostics. 2023;13:2333.

Rajasekar V, et al. Lung cancer disease prediction with CT scan and histopathological images feature analysis using deep learning techniques. Results Eng. 2023;18:101111.

Lanjewar MG, Panchbhai KG, Charanarur P. Lung cancer detection from CT scans using modified DenseNet with feature selection methods and ML classifiers. Expert Syst Appl. 2023;224:119961.

Raza R, et al. Lung-EffNet: lung cancer classification using EfficientNet from CT-scan images. Eng Appl Artif Intell. 2023;126:106902.

Chaunzwa TL, et al. Deep learning classification of lung cancer histology using CT images. Sci Rep. 2021;11(1):1–12.

Chaturvedi P, Jhamb A, Vanani M, Nemade V. Prediction and Classification of Lung Cancer Using Machine Learning Techniques. IOP Conference Series: Materials Science and Engineering. 2021;1099:012059. https://doi.org/10.1088/1757-899X/1099/1/012059 .

Hong M, et al. Multi-class classification of lung diseases using CNN models. Appl Sci. 2021;11:9289.

Phankokkruad M. Ensemble transfer learning for lung cancer detection. 2021 4th international conference on data science and information technology. 2021.

Ren Z, Zhang Y, Wang S. LCDAE: data augmented ensemble framework for lung cancer classification. Technology Cancer Research Treatment. 2022;21:15330338221124372.

Protonotarios NE, et al. A few-shot U-Net deep learning model for lung cancer lesion segmentation via PET/CT imaging. Biomedical Physics Engineering Express. 2022;8(2):025019.

Heuvelmans MA, van Ooijen PM, Ather S, Silva CF, Han D, Heussel CP, Oudkerk M. Lung cancer prediction by Deep Learning to identify benign lung nodules. Lung Cancer. 2021;154:1–4.

Le NQK, Kha QH, Nguyen VH, Chen YC, Cheng SJ, Chen CY. Machine learning-based radiomics signatures for EGFR and KRAS mutations prediction in non-small-cell lung cancer. Int J Mol Sci. 2021;22(17):9254.

Xie Y, Meng WY, Li RZ, Wang YW, Qian X, Chan C, Leung ELH. Early lung cancer diagnostic biomarker discovery by machine learning methods. Transl Oncol. 2021;14(1):907.

Li Z, et al. Deep Learning Methods for Lung Cancer Segmentation in Whole-Slide Histopathology Images—The ACDC@LungHP Challenge 2019. IEEE J Biomed Health Inform. 2021;25(2):429–40.

Narvekar S, Shirodkar M, Raut T, Vainganka P, Chaman Kumar KM, Aswale S. A Survey on Detection of Lung Cancer Using Different Image Processing Techniques. London, United Kingdom: 2022 3rd International Conference on Intelligent Engineering and Management (ICIEM); 2022. p. 13–8. https://doi.org/10.1109/ICIEM54221.2022.9853190 .

Aharonu M, Kumar RL. Convolutional Neural Network based Framework for Automatic Lung Cancer Detection from Lung CT Images. Bangalore, India: 2022 International Conference on Smart Generation Computing, Communication and Networking (SMART GENCON); 2022. p. 1–7. https://doi.org/10.1109/SMARTGENCON56628.2022.10084235 .

Kavitha BC, Naveen KB. Image Acquisition and Pre-processing for Detection of Lung Cancer using Neural Network. Mandya, India: 2022 Fourth International Conference on Emerging Research in Electronics, Computer Science and Technology (ICERECT); 2022. p. 1–4.

Causey JL, et al. Spatial pyramid pooling with 3D convolution improves lung cancer detection. IEEE/ACM Trans Comput Biol Bioinform. 2022;19(2):1165–72. https://doi.org/10.1109/TCBB.2020.3027744 .

Ahmed I, Chehri A, Jeon G, Piccialli F. Automated Pulmonary Nodule Classification and Detection Using Deep Learning Architecture. IEEE/ACM Trans Comput Biol Bioinform. 2023;20(4):2445–56. https://doi.org/10.1109/TCBB.2022.3192139 .

Thakur A, Gupta M, Sinha DK, Mishra KK, Venkatesan VK, Guluwadi S. Transformative breast Cancer diagnosis using CNNs with optimized ReduceLROnPlateau and Early stopping Enhancements. Int J Comput Intell Syst. 2024;17(1):14.

Albalawi E, Thakur A, Ramakrishna MT, Khan B, Sankaranarayanan S, Almarri SB, Aldhyani T. Oral squamous cell carcinoma detection using EfficientNet on histopathological images. Front Med. 2024;10:1349336.

Shah AA, Malik HAM, Muhammad A, Alourani A, Butt ZA. Deep learning ensemble 2D CNN approach towards the detection of lung cancer. Sci Rep. 2023;13(1):2987.

Alzubaidi MA, Otoom M, Jaradat H. Comprehensive and comparative global and local feature extraction framework for lung cancer detection using CT scan images. IEEE Access. 2021;9:158140–54. https://doi.org/10.1109/ACCESS.2021.3129597 .

Mathio D, Johansen JS, Cristiano S, Medina JE, Phallen J, Larsen KR, Velculescu E. Detection and characterization of lung cancer using cell-free DNA fragmentomes. Nat Commun. 2021;12(1):5060.

Mehmood S, et al. Malignancy detection in lung and colon histopathology images using transfer learning with class selective image processing. IEEE Access. 2022;10:25657–68. https://doi.org/10.1109/ACCESS.2022.3150924 .

Dritsas E, Trigka M. Lung cancer risk prediction with machine learning models. Big Data Cogn Comput. 2022;6(4):139.

Masud M, Sikder N, Nahid AA, Bairagi AK, AlZain MA. A machine learning approach to diagnosing lung and colon cancer using a deep learning-based classification framework. Sensors. 2021;21(3):748.

Naseer S, Akram T, Masood M, Rashid, Jaffar A. Lung cancer classification using modified U-Net based lobe segmentation and nodule detection. IEEE Access. 2023;11:60279–91. https://doi.org/10.1109/ACCESS.2023.3285821 .

Bharathy S, Pavithra R. Lung Cancer Detection using Machine Learning. In 2022 International Conference on Applied Artificial Intelligence and Computing (ICAAIC). 2022. p. 539–43 IEEE.

Kasinathan G, Jayakumar S. Cloud based lung tumor detection and stage classification using deep learning techniques. BioMed Res Int. 2022;2022:4185835.

Das S, et al. Automated prediction of Lung Cancer using Deep Learning algorithms. Applied Artificial Intelligence. CRC; 2023. pp. 93–120.

Tasnim N, et al. A Deep Learning Based Image Processing Technique for Early Lung Cancer Prediction. 2024 ASU International Conference in Emerging Technologies for Sustainability and Intelligent Systems (ICETSIS). 2024. IEEE.

Safta W. Advancing pulmonary nodule diagnosis by integrating Engineered and Deep features extracted from CT scans. Algorithms. 2024;17(4):161.

Khaliq K, et al. LCCNet: a deep learning based Method for the identification of lungs Cancer using CT scans. VFAST Trans Softw Eng. 2023;11(2):80–93.

Nigudgi S. Lung cancer CT image classification using hybrid-SVM transfer learning approach. Soft Comput. 2023;27(14):9845–59.

Diwakar M, Singh P, Shankar A. Multi-modal medical image fusion framework using co-occurrence filter and local extrema in NSST domain. Biomed Signal Process Control. 2021;68:102788. https://doi.org/10.1016/j.bspc.2021.102788 .

Das M, Gupta D, Bakde A. An end-to-end content-aware generative adversarial network-based method for multimodal medical image fusion. Data Analytics Intell Sys. 2024;7(1):7–10. https://doi.org/10.1088/978-0-7503-5417-2ch7 .

Jie Y, Xu Y, Li X, Tan H. TSJNet: a multi-modality target and semantic awareness joint-driven image fusion network. arXiv preprint arXiv:2402.01212. 2024.

Dhaundiyal R, Tripathi A, Joshi K, Diwakar M, Singh P. Clustering based multi-modality medical image fusion. J Phys Conf Ser. 2020;1478(1):012024.

Diwakar M, Singh P, Shankar A, Nayak RS, Nayak J, Vimal S, Sisodia D. Directive clustering contrast-based multi-modality medical image fusion for smart healthcare system. Netw Model Anal Health Inf Bioinf. 2022;11(1):15.

Acknowledgements

Not applicable.

This research received no external funding.

Author information

Authors and affiliations.

Al-Ameen Engineering College (Autonomous), Erode, Tamil Nadu, India

M. Mohamed Musthafa

Department of Computer science and Engineering, East Point College of Engineering & Technology, Bangalore, India

I. Manimozhi

Department of Computer Science and Engineering, JAIN (Deemed-to-be University), Bengaluru, 562112, India

T. R. Mahesh

Adama Science and Technology University, Adama, 302120, Ethiopia

Suresh Guluwadi

Contributions

M.M.M. handled the literature review and methodology. M.T.R. performed the formal analysis, data collection, and investigation. I.M. prepared the initial draft and the statistical analysis. S.G. supervised the overall project. All authors have read and approved the final article.

Corresponding author

Correspondence to Suresh Guluwadi .

Ethics declarations

Ethics approval and consent to participate.

Not applicable. 

Consent for publication

Not applicable, as the work was carried out on a publicly available dataset.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Musthafa, M.M., Manimozhi, I., Mahesh, T.R. et al. Optimizing double-layered convolutional neural networks for efficient lung cancer classification through hyperparameter optimization and advanced image pre-processing techniques. BMC Med Inform Decis Mak 24, 142 (2024). https://doi.org/10.1186/s12911-024-02553-9

Received: 16 April 2024

Accepted: 22 May 2024

Published: 27 May 2024

DOI: https://doi.org/10.1186/s12911-024-02553-9

  • Lung cancer
  • Machine learning
  • Classification
  • Diagnostic accuracy

BMC Medical Informatics and Decision Making

ISSN: 1472-6947

  • Open access
  • Published: 24 May 2024

Cost-effectiveness of differentiated care models that incorporate economic strengthening for HIV antiretroviral therapy adherence: a systematic review

  • Annie Liang 1 ,
  • Marta Wilson-Barthes   ORCID: orcid.org/0000-0002-9845-7142 2 &
  • Omar Galárraga   ORCID: orcid.org/0000-0002-9985-9266 3  

Cost Effectiveness and Resource Allocation volume 22, Article number: 46 (2024)

There is some evidence that differentiated service delivery (DSD) models, which use a client-centered approach to simplify and increase access to care, improve clinical outcomes among people living with HIV (PLHIV) in high HIV prevalence countries. Integrating economic strengthening tools (e.g., microcredit, cash transfers, food assistance) within DSD models can help address the poverty-related barriers to HIV antiretroviral therapy (ART). Yet there is minimal evidence of the cost-effectiveness of these types of multilevel care delivery models, which potentially prohibits their wider implementation.

Using a qualitative systematic review, this article synthesizes the literature surrounding the cost-effectiveness of differentiated service delivery models that employ economic strengthening initiatives to improve HIV treatment adherence in low- and middle-income countries. We searched three academic databases for randomized controlled trials and observational studies conducted in Sub-Saharan Africa and published from January 2000 through March 2024. The quality of each study was scored using a validated appraisal system.

Eighty-nine full texts were reviewed and 3 met all eligibility criteria. Two of the three included articles were specific to adolescents living with HIV. Economic strengthening opportunities varied by care model, and included developmental savings accounts, microenterprise workshops, and cash and non-cash conditional incentives. The main drivers of programmatic and per-patient costs were ART medications, CD4 cell count testing, and economic strengthening activities.

All economic evaluations in this review found that including economic strengthening as part of comprehensive differentiated service delivery was cost-effective at a willingness to pay threshold of at least 2 times the national per capita gross domestic product. Two of the three studies in this review focused on adolescents, suggesting that these types of care models may be especially cost-effective for youth entering adulthood. All studies were from the provider perspective, indicating that additional evidence is needed to inform the potential cost-savings of DSD and economic strengthening interventions to patients and society. Randomized trials testing the effectiveness of DSD models that integrate economic strengthening should place greater emphasis on costing these types of programs to inform the potential for bringing these types of multilevel interventions to scale.

Introduction

Responding to the World Health Organization’s Treat All Policy, low- and middle-income countries (LMICs) are increasingly using differentiated service delivery (DSD) models as a way to rapidly scale up access to life-saving antiretroviral therapy for people living with HIV (PLHIV) [ 1 ]. According to the International AIDS Society, “differentiated service delivery (DSD), previously referred to as differentiated care, is a client-centred approach that simplifies and adapts HIV services across the cascade to reflect the preferences, expectations and needs of people living with and affected by HIV, while reducing unnecessary burdens on the health system” [ 2 ]. DSD models aim to make care “patient-centered” while reducing logistical and administrative burden(s) on traditional, resource-constrained care facilities [ 1 ]. These models have been shown to be effective for increasing treatment adherence, but most do not address the persistent poverty-related barriers to HIV care engagement (e.g., long and costly distances to facilities, food insecurity, HIV stigma). A recent systematic review from 20 LMICs found that economic strengthening interventions such as conditional cash transfers, microcredit, and transportation assistance can improve medication adherence and care-seeking behaviors among persons living with HIV, with more moderate impacts on clinical outcomes [ 3 ]. Two other systematic reviews found that, on their own, differentiated HIV service delivery approaches in Sub-Saharan Africa (SSA) generally cost the same as or less than standard HIV care in terms of the cost per patient per year from a patient perspective [ 1 , 4 ]. For providers and health systems, the available economic evidence suggests that DSD models in SSA are not cost saving compared to more traditional facility-based care models [ 4 ]. A 2017 modeling study found that differentiated service delivery models aiming to increase access to ART in SSA could yield up to a 17.5% reduction in health system costs and health workforce requirements over 5 years [ 5 ]. It remains to be seen whether differentiated service delivery models that additionally aim to address poverty-related barriers to care (e.g., food insecurity, long and costly distances to facilities, restricted access to income-generating opportunities) are cost-effective for patients, providers, or society as a whole [ 6 , 7 ].

The purpose of this systematic review is to (i) summarize the current evidence surrounding the cost and cost-effectiveness of differentiated HIV service delivery models that include economic strengthening compared to differentiated service delivery without economic strengthening and to standard HIV care, and (ii) offer a conceptual framework that can help future researchers understand the key components influencing the incremental cost-effectiveness of these holistic models for patients and providers.

Eligibility criteria

Our review focused on studies of the cost-effectiveness of differentiated HIV care models that incorporated at least one economic strengthening component. Articles were excluded if they were not a randomized controlled trial or observational study, did not include both an economic strengthening and a differentiated care component for promoting ART adherence, or did not report a standard metric for assessing cost-effectiveness of an ART adherence intervention. Economic strengthening included any activity that aimed to generate individual- or household-level income or wealth, such as microfinance groups, social protection programs, savings accounts, or training in financial literacy or entrepreneurship. Articles that were not peer reviewed, not published in English, or not conducted in SSA were also excluded. There were no restrictions on the study population in terms of age, gender, or SSA region. During the abstract round of screening, if a study fit all other criteria (differentiated service delivery in Sub-Saharan Africa with economic strengthening) but did not mention whether a cost analysis was performed, it was included for full-text screening to account for ancillary cost-effectiveness analyses.

Information sources & search strategy

We conducted a literature search of articles in PubMed (National Center for Biotechnology Information, Bethesda, Maryland) and EconLit (American Economic Association, Nashville, Tennessee), supplemented by an Internet search of Google Scholar. Prior reviews indicate that DSD interventions have been implemented since the 2000s. Thus, we searched articles published from January 1, 2000 through March 31, 2024 using the terms “HIV or AIDS”, “antiretroviral therapy”, “economic strengthening”, “differentiated service delivery”, “Sub-Saharan Africa”, “cost analysis”, “cost-effectiveness” and “cost-savings”. Searches in PubMed used MeSH (Medical Subject Headings) controlled vocabulary to select key search terms. The full search strategy implemented for each database is provided in Additional File 1.

Selection process

Initial search results were reviewed by one reviewer (AL). Abstracts and main texts of articles that met all eligibility criteria were double reviewed (AL and MWB), with a third reviewer consulted when necessary (OG).

Data collection process

A data extraction tool was developed to capture the following indicators: study context (e.g., country and region of study), design, population, DSD component(s), economic strengthening activity, costing perspective, main drivers of intervention and per-patient costs, cost-effectiveness metric (e.g., incremental cost-effectiveness ratio), willingness-to-pay (WTP) threshold, and a binary indicator of whether the intervention was shown to be cost-effective (yes/no). Due to significant heterogeneity across studies in terms of effectiveness and cost-effectiveness outcomes, a meta-analysis was not performed. Search findings were reported following the Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) guidelines [ 8 ].
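As a rough illustration of how two of the extracted indicators relate, the sketch below computes an incremental cost-effectiveness ratio (ICER) and compares it with a willingness-to-pay threshold expressed as a multiple of per capita gross domestic product. It is not drawn from any included study: the function names, the cost and effect figures, the GDP value, and the default multiplier of 2 are all hypothetical, and the sketch assumes the intervention is both more costly and more effective than its comparator.

```python
def icer(cost_new: float, cost_old: float,
         effect_new: float, effect_old: float) -> float:
    """Incremental cost per additional unit of health effect (e.g., per QALY)."""
    delta_cost = cost_new - cost_old
    delta_effect = effect_new - effect_old
    if delta_effect <= 0:
        # A ratio is only interpreted here when the intervention adds health benefit.
        raise ValueError("intervention must be more effective than the comparator")
    return delta_cost / delta_effect


def is_cost_effective(icer_value: float, gdp_per_capita: float,
                      multiplier: float = 2.0) -> bool:
    """True if the ICER does not exceed multiplier x per capita GDP (assumed WTP rule)."""
    return icer_value <= multiplier * gdp_per_capita


# Hypothetical per-patient figures in a common currency; not taken from the review.
ratio = icer(cost_new=1200, cost_old=900, effect_new=6.5, effect_old=6.0)
print(f"ICER = {ratio:.0f} per unit of effect")
print("Cost-effective:", is_cost_effective(ratio, gdp_per_capita=800))
```

With these illustrative numbers the ICER is 600 per unit of effect, which falls below the assumed threshold of 1,600, so the binary indicator would be recorded as "yes".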

Quality assessment

Full texts that were standard health economic evaluations were assessed using the validated Quality of Health Economic Studies (QHES) appraisal system developed by Chiou [ 9 , 10 ]. The quality of each full text article was assessed based on the sixteen weighted criteria listed in Additional File 2 . Weighted scores for each criterion were summed to generate an overall quality score ranging from 0 (extremely poor quality) to 100 (excellent quality). Four quality categories (0–25, 25.1–50, 50.1–75, and 75.1–100) were used with scores > 75 indicating high quality studies [ 10 ]. Systematic reviews, micro-costing studies, and qualitative analyses were not scored given our focus on randomized controlled trials (RCTs) and observational studies.
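The scoring rule described above (summing the weights of satisfied criteria and assigning one of four quality categories) can be sketched as follows. The weights below are placeholders chosen only so that sixteen values sum to 100; they are not the published QHES weights, which are listed in Additional File 2. The function names are likewise hypothetical.

```python
from typing import List


def qhes_score(weights: List[float], satisfied: List[bool]) -> float:
    """Sum the weights of the criteria a study satisfies."""
    if len(weights) != len(satisfied):
        raise ValueError("weights and satisfied must have the same length")
    return sum(w for w, ok in zip(weights, satisfied) if ok)


def quality_category(score: float) -> str:
    """Map a 0-100 score to the four quality categories described in the text."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score <= 25:
        return "0-25"
    if score <= 50:
        return "25.1-50"
    if score <= 75:
        return "50.1-75"
    return "75.1-100"  # scores above 75 indicate a high-quality study


# Placeholder weights (NOT the published QHES weights): 12 x 6 + 4 x 7 = 100.
weights = [6] * 12 + [7] * 4

# Hypothetical study that fails the first and the last criterion.
satisfied = [False] + [True] * 14 + [False]

score = qhes_score(weights, satisfied)
print(score, quality_category(score))
```

In this toy case the study loses 6 + 7 points and scores 87, placing it in the highest category.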

Conceptual framework

Drawing on the papers included in the review, we adapted an existing conceptual framework to synthesize the key components that could be understood to drive the incremental cost-effectiveness of HIV differentiated service delivery models for SSA health systems.

Identified articles

Figure  1 documents the flow of articles through the review and reasons for exclusion. Most of the 89 articles were peer-reviewed journal articles (93.2%), followed by preprints (2.2%), and scientific reports (2.2%). Of the 57 articles that included a DSD intervention, the most common differentiated service delivery model was community-based ART support and adherence counseling. Of the 40 articles that included an economic strengthening (ES) component, conditional economic (cash and non-cash) incentives and microfinance engagement were the most common ES activities. The most common reasons for exclusion were no economic strengthening component and no cost-effectiveness analysis. Eleven of the 89 reviewed articles were traditional cost-effectiveness analyses and thus were appraised for quality using the Chiou grading system; those that were not appraised using the grading system included costing, budget impact, or other types of non-cost-effectiveness evaluations. The 11 articles had an average quality score of 80.73 (out of 100), and all satisfied at least 11 of the 16 grading criteria (Additional File 2 ). Of the 89 full text articles that were assessed, three papers met all eligibility criteria and were included in this narrative review.

Fig. 1. Preferred Reporting Items for Systematic Reviews and Meta-analyses (PRISMA) flow diagram

Background and summary of included articles

All three studies scored above 75 (out of 100) on the QHES appraisal system, indicating high quality studies [ 10 ]. Tozan et al. and Ekwunife et al. each scored 85 on the QHES, satisfying the same criteria. Stevens et al. scored 100 on the QHES, satisfying all criteria. Only Stevens et al. displayed a clear economic model, study methods and analysis, and components of numerator and denominator and justified choice of economic model, main assumptions, and limitations of the study. Although all three included studies were of high quality according to the QHES, each provided minimal rationale for their use of a given economic model, which may hinder replicability.

Details of the three included studies are summarized in Table  1 . In brief, Tozan et al. [ 11 ] estimated the incremental costs of providing additional counseling sessions for HIV and ART adherence as well as an incentivized savings account and workshops on asset building to adolescents living with HIV in Uganda. Incremental intervention costs were compared to the cost of providing routine HIV care and social support alone. Ekwunife et al. [ 12 ] estimated the cost-effectiveness of a differentiated care model for young adults living with HIV in Nigeria that included motivational interview sessions and economic incentives based on viral load over 12 months. Stevens et al. [ 13 ] modelled the cost-effectiveness of scaling-up a combination care package in Swaziland, which included SMS reminders for ART adherence, counseling and health commodities for ART adherence (e.g., pillboxes and informational materials), and non-cash financial incentives for adults who newly tested positive for HIV. All included studies utilized a facility-based DSD model. For each study, the additional cost for a given intervention compared to the status quo was $970 [95% CI: $508 − 10,275] per additional patient virally suppressed [ 11 ], $1,419 per additional patient with undetected viral load [ 12 ], and $3,560 per additional quality-adjusted life year (QALY) gained [ 13 ].

Cost-effectiveness of differentiated care with economic strengthening

Table  2 presents the cost-effectiveness outcomes from each included study. All analyses used a provider perspective.

The threshold at which a given intervention was deemed cost-effective varied across studies. Tozan et al. did not report a pre-specified willingness to pay threshold [ 11 ]. Ekwunife et al. specified a willingness to pay threshold of $1,137 per additional QALY gained by the intervention [ 12 ]. Stevens et al. reported a threshold of $9,840 per additional QALY gained (3 x Swaziland’s GDP per capita); the Link4Health combination package yielded an incremental cost-effectiveness ratio (ICER) of $3,560 per additional QALY gained from the health sector perspective, which the authors deemed cost-effective at a willingness to pay threshold of 3 x Swaziland’s per capita GDP in 2018 [ 13 ]. The cost-effectiveness analysis by Ekwunife et al. [ 12 ] found that combining conditional economic incentives and motivational interviewing was not cost-effective compared to standard care at the authors’ pre-defined willingness to pay threshold of 0.51 times Nigeria’s per capita GDP; the intervention was cost-effective at 1 x Nigeria’s per capita GDP in 2021 ($2,027.80). Tozan et al. [ 11 ] did not report the cost-effectiveness of the combined adherence mentoring and incentivized financial savings account intervention in relation to a pre-defined cost-effectiveness threshold; however, the intervention cost less than 2 x Uganda’s per capita GDP ($847.30 in 2021). The respective interventions analyzed by Ekwunife et al. [ 12 ] and Tozan et al. [ 11 ] were cost-effective (compared to standard care) assuming the World Health Organization’s willingness to pay thresholds of 2 to 3 times the national per capita GDP in the trial year. Across the three studies, the main drivers of programmatic and per-patient costs were ART treatment costs, CD4 cell count testing, and economic strengthening activities including the costs to provide non-financial incentives.
In the Nigeria study by Ekwunife et al. [ 12 ], the largest cost drivers for the intervention came from viral load tests, CD4 count testing, and patient transportation. Financial incentives and point of care CD4 testing were the main drivers of the observed cost differences in the analysis of the Link4Health cluster-RCT [ 13 ]. For Tozan et al. [ 11 ], intervention activities including health education sessions, microenterprise workshops, and savings accounts contributed the largest difference in costs between intervention and standard care. All interventions were more expensive than standard care in terms of total cost per patient.
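The threshold comparisons reported above follow one rule: an intervention is deemed cost-effective when its incremental cost per unit of health gain does not exceed a willingness-to-pay threshold, often expressed as a multiple of per capita GDP. The sketch below restates that rule with the figures quoted in this section. The function names are hypothetical, and the sketch compares the reported ratios to the thresholds as the text does, without adjusting for the different outcome units (QALYs versus patients virally suppressed).

```python
def icer(delta_cost: float, delta_effect: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra unit of effect."""
    if delta_effect <= 0:
        raise ValueError("ICER is only interpreted here for a positive health gain")
    return delta_cost / delta_effect


def wtp_threshold(gdp_per_capita: float, multiplier: float) -> float:
    """Willingness-to-pay threshold as a multiple of per capita GDP."""
    return gdp_per_capita * multiplier


def is_cost_effective(ratio: float, threshold: float) -> bool:
    """Cost-effective when the ratio does not exceed the threshold."""
    return ratio <= threshold


# Figures quoted in the text (USD).
stevens = is_cost_effective(3560, 9840)                            # True
ekwunife_own = is_cost_effective(1419, 1137)                       # False
ekwunife_1x = is_cost_effective(1419, wtp_threshold(2027.80, 1))   # True
tozan_2x = is_cost_effective(970, wtp_threshold(847.30, 2))        # True

print(stevens, ekwunife_own, ekwunife_1x, tozan_2x)
```

The same intervention can therefore move from "not cost-effective" to "cost-effective" purely because a different threshold is applied, which is why the choice of threshold is reported alongside each ratio.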

Synthesizing framework

Based on the three papers in this review, we adapted an existing conceptual model originally developed by Kahn and colleagues [ 14 ] to illustrate, from a health system perspective, the key components that can be hypothesized to influence the cost-effectiveness of differentiated service delivery models that incorporate economic strengthening (Fig. 2). Increasing patient access to antiretroviral therapy immediately following diagnosis and sustaining access over time (e.g., by offering community- or home-based care visits; accelerating ART initiation following point of care CD4 cell count testing) can be expected to add costs to the health system via an increased demand for higher drug quantities, follow-up tests, and personnel time. Similarly, providing economic strengthening opportunities that address known poverty-related barriers to ART adherence will almost always increase the incremental costs of these care delivery approaches if the initiatives are not self-sustaining. For example, providing economic incentives conditional on achieving a viral load below an assay’s lower detection limit will incur additional costs to health ministries who wish to offer this incentive scheme as part of a government social protection program. However, economic strengthening interventions have the potential to be cost-neutral to health systems if they can generate economic growth on their own, as in the case of saving and lending microfinance groups [ 15 , 16 ] or no-fee savings accounts [ 11 ]. Averting new HIV infections and decreasing HIV-related morbidity by achieving an undetectable viral load via ART leads to substantial reductions in both disability-adjusted life years and treatment costs. However, as individuals live longer due to ART, they may develop other chronic diseases that incur additional costs to themselves and the health system [ 17 ].
Thus, differentiated service delivery models that integrate economic strengthening and treatment for co-occurring conditions have the potential to further reduce disease burden without substantially increasing treatment costs.

Fig. 2. Conceptual Framework. The conceptual framework was adapted from an existing conceptual model developed by Kahn et al. [ 14 ]. The framework illustrates the key components that can be hypothesized to influence the cost-effectiveness of differentiated HIV care approaches that incorporate economic strengthening activities, from a health system perspective.

All elements of this synthesizing conceptual framework are drawn from the authors’ analyses of the supporting literature. Further research on the cost-effectiveness impact of these mechanisms is required to support their validity.

This systematic narrative review found one of three studies testing a differentiated service delivery model that includes economic strengthening to be cost-effective for providers at the authors’ pre-determined WTP threshold. All three included articles were cost-effective at the WHO willingness to pay threshold of at least 2 times a given country’s per capita GDP. Sensitivity analyses [ 11 , 12 ] and modeling projections [ 13 ] in these papers suggest that the cost-effectiveness of these types of multilevel interventions would increase as these care models are brought to scale. Ekwunife et al. [ 12 ] found that if CD4+ count tests were performed three times a year rather than four, the intervention would become cost-effective. Thus, even minimal adjustments to the differentiated service delivery and ES components could increase the interventions’ cost-effectiveness.

Two of three studies in this review were among adolescents living with HIV. This suggests that cultivating routine medication taking behaviors and establishing positive economic skills (e.g., having a savings account, managing microcredit) may be especially important for lower income adolescents living with HIV who can carry these practices into adulthood. Additionally, two recent feasibility studies did not meet inclusion criteria (i.e., being an RCT or observational study) but were initially screened in this review. Findings from these studies further support the potential of integrating DSD with economic strengthening for improving HIV treatment outcomes along the care continuum (testing, linkage to care, and ART adherence) [ 18 , 19 ].

The World Health Organization’s Treat-All guidance recommends CD4 testing before initiating antiretroviral therapy (ART) and recommends routine viral load monitoring (over CD4 cell count monitoring) for patients on ART [ 20 , 21 ]. Viral load monitoring remains the gold standard for monitoring ART adherence and viral suppression among persons living with diagnosed HIV, even in settings where health systems face financial and resource constraints [ 22 , 23 , 24 ]. Thus, given that the focus of our review is on cost-effectiveness of models for ART adherence among persons with diagnosed HIV, our findings can inform scale-up of DSD models that support the most widely used HIV treatment outcomes.

Recent protocol studies reveal that there remains space in the literature to continue to examine DSD with economic strengthening interventions as an effective and cost-effective method of enhancing ART adherence [ 25 ]. For future research and policymaking, these findings suggest there may be potential for implementing scaled-up DSD with economic strengthening interventions enhancing ART adherence among adolescents and young adults specifically.

Limitations of this systematic review stemmed from the large variability in population, context, and target outcomes across studies, which limited our ability to calculate an overall combined economic effect of these interventions. Additionally, all of the cost-effectiveness analyses in this review calculated cost according to the provider perspective, which limits our ability to quantify the potential economic impact of these combination differentiated care models on patients or society. We aimed to mitigate any potential reviewer bias in the inclusion/exclusion of a quality assessment by using a standardized data extraction tool.

Despite calls for novel cost-effectiveness data of holistic differentiated care models in low- and middle-income countries [ 1 , 6 , 26 , 27 , 28 ], the evidence base surrounding the scale-up potential of DSD interventions and economic strengthening remains sparse. To our knowledge, this is the first review to synthesize the available evidence of poverty-addressing DSD models from a health economics perspective. This evidence is critical for policymakers and health care advocates working to address the economic determinants of HIV treatment adherence with limited resources.

This brief systematic review demonstrated that including economic strengthening tools as part of differentiated service delivery models is effective and largely cost-effective at common thresholds compared to traditional HIV care. Modelling projections suggest that scaling these types of multilevel intervention may improve their cost-effectiveness in the short and medium term. Future research should consider the cost-effectiveness and cost-savings of these comprehensive HIV care models from a patient and societal perspective.

Data availability

Data sharing is not applicable to this article as no new datasets were generated or analyzed during the current study.

Abbreviations

ALHIV: Adolescents Living with HIV

ART: Antiretroviral Therapy

DSD: Differentiated Service Delivery

ES: Economic Strengthening

GDP: Gross Domestic Product

ICER: Incremental Cost-Effectiveness Ratio

IGA: Income-Generating Activity

LMIC: Low- or Middle-Income Country

MI: Motivational Interviewing

PLHIV: People Living with HIV

PRISMA: Preferred Reporting Items for Systematic Reviews and Meta-Analyses

QALY: Quality-Adjusted Life Year

RCT: Randomized Controlled Trial

SSA: Sub-Saharan Africa

WTP: Willingness-to-Pay

Roy M, Bolton Moore C, Sikazwe I, Holmes CB. A review of Differentiated Service Delivery for HIV Treatment: effectiveness, mechanisms, Targeting, and Scale. Curr HIV/AIDS Rep. 2019;16(4):324–34. https://doi.org/10.1007/s11904-019-00454-5 .

Differentiated Service Delivery. International AIDS Society. 2024. Accessed April 24, 2024. https://www.iasociety.org/ias-programme/differentiated-service-delivery .

Swann M. Economic strengthening for retention in HIV care and adherence to antiretroviral therapy: a review of the evidence. AIDS Care. 2018;30(3):99–125. https://doi.org/10.1080/09540121.2018.1479030 .

Rosen S, Nichols B, Guthrie T, Benade M, Kuchukhidze S, Long L. Do differentiated service delivery models for HIV treatment in sub-saharan Africa save money? Synthesis of evidence from field studies conducted in sub-saharan Africa in 2017–2019. Gates Open Res. 2022;5:177. https://doi.org/10.12688/gatesopenres.13458.2 .

Barker C, Dutta A, Klein K. Can differentiated care models solve the crisis in HIV treatment financing? Analysis of prospects for 38 countries in sub-saharan Africa. J Int AIDS Soc. 2017;20(Suppl 4):21648. https://doi.org/10.7448/IAS.20.5.21648 .

Nachega JB, Adetokunboh O, Uthman OA, Knowlton A, Altice FL, Schechter M, et al. Community-based interventions to improve and sustain antiretroviral therapy adherence, Retention in HIV Care and clinical outcomes in low- and Middle-Income Countries for achieving the UNAIDS 90-90-90 targets. Curr HIV/AIDS Rep. 2016;13:241–55. https://doi.org/10.1007/s11904-016-0325-9 .

Munyayi FK, van Wyk B, Mayman Y. Interventions to Improve Treatment outcomes among adolescents on antiretroviral therapy with unsuppressed viral loads: a systematic review. IJERPH. 2022;19(7):3940. https://doi.org/10.3390/ijerph19073940 .

Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. 2021;372:n71. https://doi.org/10.1136/bmj.n71 .

Chiou CF, Hay JW, Wallace JF, Bloom BS, Neumann PJ, Sullivan SD, et al. Development and validation of a grading system for the quality of cost-effectiveness studies. Med Care. 2003;41(1):32–44. https://doi.org/10.1097/00005650-200301000-00007 .

Spiegel BM, Targownik LE, Kanwal F, et al. The quality of published health economic analyses in digestive diseases: a systematic review and quantitative appraisal. Gastroenterology. 2004;127(2):403–11. https://doi.org/10.1053/j.gastro.2004.04.020 .

Tozan Y, Capasso A, Sun S, Neilands TB, Damulira C, Namuwonge F, et al. The efficacy and cost-effectiveness of a family-based economic empowerment intervention (suubi + adherence) on suppression of HIV viral loads among adolescents living with HIV: results from a Cluster Randomized Controlled Trial in southern Uganda. JIAS. 2021;24(6):e25752. https://doi.org/10.1002/jia2.25752 .

Ekwunife OI, Ofomata CJ, Okafor CE, Anetoh MU, Kalu SO, Ele PU, et al. Cost-effectiveness and feasibility of conditional economic incentives and motivational interviewing to improve HIV health outcomes of adolescents living with HIV in Anambra State, Nigeria. BMC Health Serv Res. 2021;21:685. https://doi.org/10.1186/s12913-021-06718-4 .

Stevens ER, Li L, Nucifora KA, Zhou Q, McNairy ML, Gachuhi A, et al. Cost-effectiveness of a combination strategy to enhance the HIV care continuum in Swaziland: Link4Health. PLoS ONE. 2018;13(9):e0204245. https://doi.org/10.1371/journal.pone.0204245 .

Kahn JG, Marseille EA, Bennett R, Williams BG, Granich R. Cost-effectiveness of antiretroviral therapy for prevention. Curr HIV Res. 2011;9(6):405–15. https://doi.org/10.2174/157016211798038542 .

Genberg BL, Wachira J, Steingrimsson JA, Pastakia S, Tran DNT, Said JA, et al. Integrated community-based HIV and non-communicable disease care within microfinance groups in Kenya: study protocol for the Harambee Cluster randomised trial. BMJ Open. 2021;11(5):e042662. https://doi.org/10.1136/bmjopen-2020-042662 .

Pastakia SD, Manyara SM, Vedanthan R, Kamano JH, Menya D, Andama B, et al. Impact of bridging Income Generation with Group Integrated Care (BIGPIC) on hypertension and diabetes in Rural Western Kenya. J Gen Intern Med. 2017;32(5):540–8. https://doi.org/10.1007/s11606-016-3918-5 .

Negin J, Bärnighausen T, Lundgren JD, Mills EJ. Aging with HIV in Africa: the challenges of living longer. AIDS. 2012;26:S1–5. https://doi.org/10.1097/QAD.0b013e3283560f54 .

Kim HY, Inghels M, Mathenjwa T, et al. The impact of a conditional financial incentive on linkage to HIV care: findings from the HITS cluster randomized clinical trial in rural South Africa. Preprint medRxiv. 2024. https://doi.org/10.1101/2024.03.15.24304278 . 2024.03.15.24304278. Published 2024 Mar 18.

Kibel M, Nyambura M, Embleton L, et al. Enabling adherence to treatment (EAT): a pilot study of a combination intervention to improve HIV treatment outcomes among street-connected individuals in western Kenya. BMC Health Serv Res. 2023;23(1):1331. https://doi.org/10.1186/s12913-023-10215-1 . Published 2023 Nov 30.

Guideline on when to start antiretroviral therapy and on pre-exposure prophylaxis for HIV. who.int. Published September 1, 2015. Accessed April 24. 2024. https://www.who.int/publications/i/item/9789241509565 .

Brazier E, Tymejczyk O, Zaniewski E, et al. Effects of National Adoption of treat-all guidelines on Pre-antiretroviral Therapy (ART) CD4 testing and viral load monitoring after ART initiation: a regression discontinuity analysis. Clin Infect Dis. 2021;73(6):e1273–81. https://doi.org/10.1093/cid/ciab222 .

Pham P MD, Nguyen HV, Anderson D, Crowe S, Luchters S. Viral load monitoring for people living with HIV in the era of test and treat: progress made and challenges ahead - a systematic review. BMC Public Health. 2022;22(1):1203. https://doi.org/10.1186/s12889-022-13504-2 . Published 2022 Jun 16.

Okoboi S, Musaazi J, King R, et al. Adherence monitoring methods to measure virological failure in people living with HIV on long-term antiretroviral therapy in Uganda. PLOS Glob Public Health. 2022;2(12):e0000569. https://doi.org/10.1371/journal.pgph.0000569 . Published 2022 Dec 30.

World Health Organization. Updated recommendations on HIV prevention, infant diagnosis, antiretroviral initiation and monitoring. who.int. Published March 17, 2021. Accessed April 24. 2024. https://www.who.int/publications/i/item/9789240022232 .

van Heerden A, Szpiro A, Ntinga X, Celum C, van Rooyen H, Essack Z, Barnabas R. A sequential multiple assignment Randomized Trial of scalable interventions for ART delivery in South Africa: the SMART ART study. Trials. 2023;24(1):32. https://doi.org/10.1186/s13063-022-07025-x .

Decroo T, Rasschaert F, Telfer B, Remartinez D, Laga M, Ford N. Community-based antiretroviral Therapy Programs can overcome barriers to Retention of Patients and Decongest Health Services in Sub-saharan Africa: a systematic review. Int Health. 2013;5(3):169–79. https://doi.org/10.1093/inthealth/iht016 .

Chaiyachati KH, Ogbuoji O, Price M, Suthar AB, Negussie EK, Bärnighausen T. Interventions to improve adherence to antiretroviral therapy: a Rapid systematic review. AIDS. 2014;28:187–204. https://doi.org/10.1097/QAD.0000000000000252 .

Creese A, Floyd K, Alban A, Guinness L. Cost-effectiveness of HIV/AIDS interventions in Africa: a systematic review of the evidence. Lancet. 2002;359(9318):1635–42. https://doi.org/10.1016/S0140-6736(02)08595-1 .

Acknowledgements

We thank the authors of the original source papers, whose work we drew on considerably.

Research reported in this publication was supported by the National Institute of Mental Health of the National Institutes of Health under award number R01MH118075, and by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health through the Providence/Boston Center for AIDS Research (CFAR) (award number P30AI042853). One hundred percent of this research was financed with Federal money. The design of the study and collection, analysis and interpretation of data and writing of the manuscript are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health.

Author information

Authors and affiliations.

Brown University School of Public Health, Providence, RI, USA

Annie Liang

Department of Epidemiology, Brown University School of Public Health, Providence, RI, USA

Marta Wilson-Barthes

Department of Health Services, Policy and Practice; and International Health Institute, Brown University School of Public Health, 121 South Main Street, Box G-S121-2, Providence, RI, USA

Omar Galárraga

Contributions

AL and OG conceived and designed the work. AL led the analysis and interpretation of the data, and drafted the work. MWB contributed to the analysis and interpretation of data. MWB and OG substantively revised the work. All authors approved the submitted version of the manuscript, and agree to be personally accountable for their own contributions and to ensure that questions related to the accuracy or integrity of any part of the work, even ones in which the author was not personally involved, are appropriately investigated, resolved, and the resolution documented in the literature.

Corresponding author

Correspondence to Omar Galárraga .

Ethics declarations

Ethics approval and consent to participate.

Not applicable.

Consent for publication

Competing interests.

The authors declare no competing interests.

Additional information

Publisher’s note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Additional file 1: Search syntax

Additional file 2: Quality assessment of full text articles that were standard health economic evaluations

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Liang, A., Wilson-Barthes, M. & Galárraga, O. Cost-effectiveness of differentiated care models that incorporate economic strengthening for HIV antiretroviral therapy adherence: a systematic review. Cost Eff Resour Alloc 22 , 46 (2024). https://doi.org/10.1186/s12962-024-00557-w

Received : 13 April 2023

Accepted : 19 May 2024

Published : 24 May 2024

DOI : https://doi.org/10.1186/s12962-024-00557-w

  • Differentiated care
  • Differentiated service delivery
  • Economic strengthening
  • Microfinance
  • Conditional cash transfer
  • Cost-effectiveness
  • Antiretroviral therapy

Cost Effectiveness and Resource Allocation

ISSN: 1478-7547

Effectiveness of Hospital Pharmacist Interventions for COPD Patients: A Systematic Literature Review and Logic Model

Affiliations.

  • 1 Institute of Chinese Medical Sciences, University of Macau, Taipa, Macao SAR, People's Republic of China.
  • 2 Department of Public Health and Medicinal Administration, Faculty of Health Sciences, University of Macau, Taipa, Macao SAR, People's Republic of China.
  • PMID: 36317184
  • PMCID: PMC9617520
  • DOI: 10.2147/COPD.S383914

Purpose: This review aimed to summarize empirical evidence about pharmacist-led interventions for chronic obstructive pulmonary disease (COPD) patients in hospital settings and to identify the components of a logic model (including input, interventions, output, outcome and contextual factors) to inform the development of hospital pharmacist's role in COPD management.

Methods: A systematic review of literature retrieved from four English databases (PubMed, Web of Science, Scopus, ScienceDirect) and one Chinese database (CNKI) was conducted to identify eligible studies published from inception to March 2022. Studies concerning pharmacists and COPD were screened to identify randomized controlled studies that focused on pharmacist interventions for COPD in the hospital setting.

Results: Twenty-nine studies were included in this review. The components of interventions identified were categorized according to the six service domains in the International Pharmaceutical Federation's Basel Statements, and mainly concerned prescribing, preparation, administration and monitoring but not procurement and training. Extended interventions were also identified including life guidance, psychological counseling, and respiratory function exercise. The most common outputs reported were improvement in medication adherence, rational drug use, level of knowledge, and inhalation technique. The clinical outcomes (symptomatic control, lung function, rates of hospital readmission, length of hospital stay, and adverse drug reactions), humanistic outcomes (quality of life and patient satisfaction), and economic outcomes (drug costs, hospitalization costs, antibiotic costs, and direct costs) were reported only in some studies. The contextual factors mainly included geographical factors, education level of patients, socio-economic factors, and no-smoking policy.

Conclusion: The evidence for hospital pharmacists' interventions in improving COPD patients' outcome is growing. However, considering the challenges of COPD management, hospital pharmacists should further leverage the advantages of cross-sector and multi-disciplinary collaboration in order to provide more comprehensive support to better address the needs of their patients.

Keywords: chronic obstructive pulmonary disease; hospital pharmacist; intervention; logic model; outcome; output; systematic review.

© 2022 Lin et al.

Publication types

  • Systematic Review
  • Pharmacists*
  • Pulmonary Disease, Chronic Obstructive* / diagnosis
  • Pulmonary Disease, Chronic Obstructive* / drug therapy
  • Quality of Life

Int J Integr Care. v.21(2); Apr-Jun 2021

A Contextual Analysis and Logic Model for Integrated Care for Frail Older Adults Living at Home: The INSPIRE Project

1 Nursing Science (INS), Department of Public Health, University of Basel, Switzerland

Evelyn Huber

2 Institute of Nursing, School of Health Professions, ZHAW Zurich University of Applied Sciences, CH

Samuel Stenz

Leah L. Zullig

3 Department of Population Health Sciences, Duke University Medical Center, USA

Andreas Zeller

4 Centre for Primary Health Care, University of Basel, Switzerland

Sabina M. De Geest

5 Academic Center for Nursing and Midwifery, Department of Public Health and Primary Care, KU Leuven, Belgium

Mieke Deschodt

6 Gerontology and Geriatrics, Department of Public Health and Primary Care, KU Leuven, Belgium

7 Healthcare and Ethics, Faculty of Medicine and Life Sciences, UHasselt, Belgium

the INSPIRE consortium

8 Matthias Briel, Matthias Schwenkglenks, Franziska Zúñiga, Penelope Vounatsou, Carlos Quinto, Eva Blozik, Flaka Siqeca, Maria José Mendieta Jara

Associated Data

The datasets used and/or analysed during the current study are mostly included in this published article and the sources are listed in Table 1 or Additional File 1. Other data files can be requested, provided the identity of the data sources will be kept anonymous.

Introduction:

Implementation science methods and a theory-driven approach can enhance the understanding of whether, how, and why integrated care for frail older adults is successful in practice. In this study, we aimed to perform a contextual analysis, develop a logic model, and select preliminary implementation strategies for an integrated care model in newly created information and advice centers for older adults in Canton Basel-Landschaft, Switzerland.

Methods:

We conducted a contextual analysis to determine factors which may influence the integrated care model and the implementation strategies needed. A logic model depicting the overall program theory, including inputs, core components, outputs and outcomes, was designed using a deductive approach, and included stakeholders’ feedback and preliminary implementation strategies.

Results:

Contextual factors were identified (e.g., lack of integrated care regulations, existing community services, and a care pathway needed). Core components of the care model include screening, referral, assessment, care plan creation and coordination, and follow-up. Outcomes included person-centred coordinated care experiences, hospitalization rate and symptom burden, among others. Implementation strategies (e.g., nurse training and co-developing educational materials) were proposed to facilitate care model adoption.

Conclusion:

Contextual understanding and a clear logic model should enhance the potential for successful implementation of the integrated care model.

Introduction

Frail older adults, often living with multimorbidity and functional and cognitive disabilities [ 1 ], are at higher risk of mortality, hospitalization and institutionalization [ 2 , 3 ]. Care for this population tends to be uncoordinated and fragmented [ 4 ], as frail older adults may require support from several health and social care providers as well as informal care [ 5 , 6 ]. Fragmented care can lead to negative health outcomes such as patient confusion and distress, gaps in information delivery, duplication of services, unnecessary hospitalizations and higher care costs [ 4 ]. These negative outcomes may be overcome by integrated care, a person-centered approach where individual pro-active care is facilitated by continuous, multidisciplinary collaboration and coordination of various care providers [ 7 , 8 ].

In the many studies evaluating integrated care for frail or multimorbid older adults, comprehensive assessments, tailored care plans, multidisciplinary care teams, case management, and a proactive and patient-centered approach are commonly reported as key components [ 9 , 10 , 11 , 12 , 13 , 14 , 15 ]. However, systematic reviews indicate major heterogeneity with respect to the target population, the study outcomes selected, the delivery of the intervention elements, and most importantly, the results found at the patient, provider, and system levels, impeding consistent conclusions [ 9 , 10 , 11 ]. The lack of impact resulting from integrated care initiatives may be related to the outcomes measured and the measures used [ 9 , 10 ], but may also be a result of implementation issues with these complex interventions, potentially low fidelity to the intervention or the intervention lacking contextual fit [ 16 , 17 , 18 ]. This indicates the need for effectiveness studies which include process evaluations, contextual analysis, and measurement of implementation outcomes to determine if, how and why community-based integrated care for frail older adults is successful in practice [ 14 , 19 , 20 ].

Intervention development, implementation, and evaluation can be facilitated by using a theory-driven approach and implementation science methods, ensuring contextual relevance [ 21 , 22 ]. Furthermore, feasibility studies to measure, for example, acceptability and fidelity are needed before evaluating ultimate effectiveness [ 22 , 23 , 24 , 25 ], especially in light of the major challenges recognized in implementing integrated care in practice [ 17 ]. While fidelity has been measured in a small number of studies of integrated care for frail, home-based older adults [ 26 , 27 ], studies rarely include implementation science methods such as stakeholder involvement; use of theories, models and frameworks; contextual analysis; and studying implementation strategies (i.e., the methods used to increase the likelihood of intervention uptake and success [ 28 ]) and implementation outcomes (e.g., acceptability, adoption, and fidelity [ 23 ]) [ 11 , 29 ]. Additionally, logic models, which are recommended when planning an intervention to illustrate how a program will create change [ 25 , 30 ], were not often used [ 11 , 18 ]. Logic models are visual tools that demonstrate an overall program theory, describing and linking the program’s input/resources, activities, expected outcomes and impact [ 31 , 32 ]. They are especially valuable in integrated care initiatives, as it can be challenging to decipher the underlying pathway and which individual components of these complex interventions contribute to the outcomes [ 11 , 14 ]. Logic models have numerous benefits during program planning, monitoring and evaluation, such as communicating the evidence-informed strategies used in the program; detecting gaps in theory; facilitating a shared understanding of the program with stakeholders; identifying what to measure during evaluation; and helping to differentiate between intervention and implementation failure [ 31 , 32 , 33 ]. Applying implementation science methods and creating a logic model when developing an intervention may improve the chances of success, inform future care models and reduce research waste.

Context is a major focus in implementation science [ 21 , 34 , 35 ]. During intervention development, a strong grasp of the context helps to ensure that the intervention components will be well-suited for the context and the actions needed [ 18 ]. Although there are inconsistencies in how the term “context” is formulated in the literature, Pfadenhauer et al.’s (2017) work using a Pragmatic Utility concept analysis helped to refine the conceptualization of context as: “a set of characteristics and circumstances that consist of active and unique factors, within which the implementation is embedded. As such, context … interacts, influences, modifies and facilitates or constrains the intervention and its implementation” [ 36 ]. The Context and Implementation of Complex Interventions (CICI) framework proposed by Pfadenhauer provides a richer assessment of “context”, differentiating it from the “setting” [ 36 ]. Specifically, the “context” dimension includes seven domains: geographical, epidemiological, socio-cultural, socio-economic, ethical, legal and political, while “setting” is defined by the physical place where an intervention takes place [ 36 ]. As a “determinant” framework, CICI provides a solid basis for understanding and analyzing the extensive set of factors within the context which may affect the intervention and implementation outcomes [ 36 , 37 , 38 ]. Accounting for such contextual factors is an essential consideration when planning and evaluating integrated care initiatives [ 7 , 11 , 18 ] and can lead to the selection of appropriate implementation strategies [ 39 , 40 ]. The selection of implementation strategies will be influenced by their proposed effectiveness [ 41 ] but also greatly depends on the context in which an intervention is implemented [ 42 ].

Given their major importance in the development and evaluation of complex care interventions, implementation science methods and a theory-driven approach will be applied in the INSPIRE project (ImplemeNtation of a community-baSed care Program for home dwelling senIoR citizEns) in Canton Basel-Landschaft (BL), Switzerland. A 2018 Cantonal law required the 86 BL municipalities, with an approximate population of 288,000, to re-organize themselves into eight care regions, each of which must develop a care concept including services for outpatient, intermediate, and inpatient care [ 43 ]. The INSPIRE project aims to develop, implement and evaluate an integrated care model for the information and advice centers (IAC), which are required in each of these newly formed care regions [ 43 ]. These community-based centers must include a nurse to provide needs assessments and advice for older adults who are living at home, especially if entry into a nursing home is being considered [ 43 ]. Building on gaps and recommendations in recent studies, the aim of this paper is to report the contextual analysis, logic model development, and preliminary implementation strategies for the INSPIRE integrated care model for home-dwelling older adults in Canton Basel-Landschaft.

Methods

Overall Project Design

The overall INSPIRE project is positioned within phases one to three of the Medical Research Council (MRC) framework for developing and evaluating complex interventions, yet also includes implementation science elements, such as a contextual analysis, stakeholder involvement, mapping of implementation strategies, and using a hybrid implementation-effectiveness evaluation (see Figure 1). This paper specifically addresses the development phase of INSPIRE and aims to:

  • Determine the contextual factors which may influence the INSPIRE integrated care model for the IACs and implementation strategies by collecting information through various sources
  • Develop a logic model to display the overall theory for the INSPIRE care model, including inputs, activities, outputs, anticipated outcomes and assumptions
  • Propose preliminary implementation strategies for the INSPIRE care model

Figure 1. INSPIRE project overview mapped according to the Framework of the Medical Research Council.

Performing the tasks related to these aims is a simultaneous and iterative process as shown in Figure 2 .

Figure 2. The INSPIRE project approach to care model development.

Contextual analysis

We followed Stange and Glasgow’s (2013) approach to assessing context to identify, analyze and report on contextual factors which may influence the INSPIRE care model [ 38 ]. Their approach involves gathering contextual input from various stakeholders and using theories or frameworks to determine relevant ‘domains’ from which quantitative and qualitative information related to contextual factors should be collected, assessed, and reported [ 38 ]. Pfadenhauer’s CICI framework was used to identify which contextual domains to consider (e.g., political, socio-economic, socio-cultural), and how these contextual factors and the setting may interact with the INSPIRE intervention and its implementation [ 36 ]. We used a worked example in Pfadenhauer’s paper [ 36 ] as a template (Additional File 1) to synthesize our collected data related to the context, setting, and anticipated implementation. The data came from a combination of activities initiated by the research team, such as: conducting the INSPIRE cantonal stakeholder meetings, a cross-sectional survey and the context analysis meetings; participating in local stakeholder meetings; conducting the Basel-Landschaft Older Persons Survey [ 44 ]; and reviewing local, national and international reports (e.g., a key document by Threapleton et al. on implementation facilitators and barriers [ 14 ]) ( Table 1 ). We mapped the identified contextual factors according to the CICI framework, and subsequently refined the INSPIRE care model components, the implementation process, and the potential implementation strategies.

Table 1. Data sources used for the contextual analysis.

BL = Basel-Landschaft; IAC = Information and Advice Center.

Development and validation of the logic model

Development.

A logic model describing the input/resources, activities, anticipated outcomes and impact was created to illustrate the overall INSPIRE program theory for how the IAC could function to achieve the desired results in the community. The template and definitions for each logic model component were based on the W.K. Kellogg Foundation [ 32 ], the Canadian Evaluation Society [ 45 ] and the Centers for Disease Control and Prevention [ 30 ]. The one-page logic model illustrates the outcomes chain (i.e., the successive relationship between the immediate, intermediate and long-term outcomes) in the program theory and some of the assumptions about “program factors” (e.g., effective advertising of the IAC), “nonprogram external factors” (e.g., participant factors that can potentially influence the outcomes) and the change process [ 31 , 32 ]. The logic model was built based on a deductive approach to constructing program theory as the ideas were gathered from documentation such as the data sources for the contextual analysis, grey and peer-reviewed literature, and program documents developed by the research team [ 31 ]. As logic model development is not a static process, it continued to evolve as we gathered contextual information, detected gaps in our program theory, and identified additional types of implementation strategies needed (e.g., train and educate stakeholders and engage consumers ) [ 46 , 47 ].

The original core components of the INSPIRE integrated care model were presented during in-person cantonal stakeholder meetings to ensure that the overall model appeared to be appropriate from the perspective of local professionals. To gather stakeholders’ opinions on the program logic model, we undertook a structured activity during a stakeholder meeting attended by 40 stakeholders (e.g., health and social care organizations/providers, cantonal and municipal representatives, patient organizations, umbrella organization for care homes, volunteer organizations, health insurers, etc. [ 48 ]). We showed the stakeholders a condensed German version of the logic model that included the resources, activities, and outputs, but excluded outcomes. We asked stakeholders to work in groups to create a list of outcomes, i.e., the differences they expect to see as a result of the INSPIRE care model. Groups contributed their input via online interactive presentation software. Stakeholders were asked to choose from the long-list the three most relevant outcomes, resulting in a final list. Following the meeting, their input was incorporated into the INSPIRE logic model to create a new version, which was subsequently emailed to the stakeholder group for further input, and to identify any gaps or revisions needed.
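The prioritisation step described above can be sketched as a simple tally. This is a minimal illustration only: the paper does not state how the final list was computed, so the counting rule (one vote per outcome per group, at most three picks each), the function name `shortlist_outcomes`, and the example votes are all assumptions made for this sketch.

```python
from collections import Counter


def shortlist_outcomes(votes_by_group, top_n=3):
    """Tally each group's picks and return outcomes ranked by vote count.

    Ties are broken alphabetically so the ordering is reproducible.
    """
    tally = Counter()
    for picks in votes_by_group.values():
        # Count each outcome at most once per group, keeping at most top_n picks.
        tally.update(set(picks[:top_n]))
    return sorted(tally.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical votes, invented purely for illustration (not project data).
example_votes = {
    "group_a": ["care coordination", "caregiver relief", "costs"],
    "group_b": ["care coordination", "costs", "perception of aging"],
    "group_c": ["caregiver relief", "care coordination", "quality of life"],
}
ranked = shortlist_outcomes(example_votes)
```

In this hypothetical example, "care coordination" ranks first because all three groups selected it.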

Deriving preliminary implementation strategies

Determining implementation strategies which fit the context is a two-step process involving an analysis of the factors which may influence implementation, followed by a selection and tailoring of implementation strategies [ 42 ]. In the current study, we mapped contextual data to the CICI framework, and synthesized this information to derive actions needed in terms of the care model and preliminary implementation strategies. We also reflected on the implications for the intervention or implementation strategies based on the contextual factors.

The implementation strategies were specified according to the Expert Recommendations for Implementing Change (ERIC) compilation [ 28 , 47 , 49 ], and were added to the logic model to indicate the actors and outcomes they intend to influence. This is a preliminary selection of strategies which will be systematically mapped, assessed for their evidence level and reviewed by stakeholders.

Bringing it all together

As shown in Figure 2 , the INSPIRE project team used a unique approach to perform the preparatory work when designing the care model that aligns with O’Cathain’s recommendations [ 50 ]. As a first step, the project team performed a literature review and a context analysis, and involved stakeholders, to develop the underlying program theory for how the intervention could work. As this is a circular process, specific details of the program theory and operationalization of the program progressed through stakeholder feedback or as more empirical data surfaced over time. Likewise, potential implementation strategies emerged from the evidence, context and stakeholder input, as well as through the evolution of the program theory. The program theory was then formulated into a preliminary concept for the care model, accompanied by potential implementation strategies. To operationalize and communicate the program theory, a logic model was drafted and regularly adapted over one year. The final implementation strategies will evolve based on their success or failure, and as new information becomes available.

Ethical considerations

This study was submitted to the Ethikkommission Nordwest- und Zentralschweiz (EKNZ) in Switzerland, EKNZ Project ID Req-2019-00900. The study could proceed because the EKNZ deemed that it complied with the general ethical and scientific standards for research with humans (Art. 51 Abs. 2 HRA) and that it did not meet the definition of a research project requiring further review under Art. 2 of the Human Research Act.

Results

Context analysis

We selected specific contextual domains according to the CICI framework, including: socio-economic, socio-cultural, political, legal, epidemiological and the setting (Additional File 1). Key contextual factors on a macro level included: a lack of national integrated care regulations; the presence of integrated care guidance and indications of political support for integrated care; potentially challenging financing models; and inconsistent IT systems. Additionally, we noted the significant changes in nursing education across Switzerland over the past several decades, which is an important consideration when hiring an appropriate nurse for the IAC. On a meso level we noted: the rapidly growing population of older adults in BL; a cantonal law aiming to improve care for older adults yet not specifying the organization of integrated care; and the numerous organizations involved in the care of older adults. We also found that approximately one quarter of home-based older adults (aged 75+) in Canton BL showed signs of frailty, but that health care professionals likely do not systematically screen for frailty nor do general practitioners (GPs) typically perform a comprehensive geriatric assessment (CGA). On a micro level, we observed that the IAC and the nurse position would be new for the community and therefore new processes and tools, such as a referral pathway, an electronic patient file and communication tools, would be needed for the professionals to work together to deliver person-centered integrated care. In terms of the setting, the function of existing community-based centers that are mainly staffed by social service professionals and provide advice (e.g., social/financial) to older adults, could potentially be morphed into the new IACs required in the care law.

Logic Model

The INSPIRE logic model illustrates the program theory for a care model that integrates health and social care service provision for home-based frail older adults ( Figure 3 ).

Figure 3. The logic model for the INSPIRE care model.

BL = Basel-Landschaft; CGA = Comprehensive Geriatric Assessment; ED = Emergency Department; HC = Health care; HCP = Health care providers; HR-QoL = Health-related Quality of Life; IAC = Information and Advice Center; SSP = Social service providers.

The inputs column lists the resources that will contribute to the operation of the care model within any care region in BL. This covers human resources, such as the people referring older adults to the IAC, IAC employees, and stakeholders involved in decision making/funding, as well as organizational resources, such as the care law which mandates the IAC. It also includes the physical space where the IAC services will be delivered, the costs to run and support implementation of the IAC, and other resources (e.g., tools for screening and conducting a CGA, as well as marketing products).

The activities include the core components of the INSPIRE care model. First, individuals will be screened to identify those at risk of health deterioration who could benefit from in-depth geriatric assessment and coordination of additional services. Screening older adults aged 75+ for a certain geriatric risk profile indicating potential frailty can be performed by older adults themselves, family members or health/social care professionals in the community such as GPs. At-risk older adults can be referred for an appointment at the IAC. The core components of the care model in the IAC include a CGA conducted by a geriatric nurse expert and social worker; an individualized care plan, including evidence-based interventions, created with a multidisciplinary team; and needs-based follow-up. The geriatric nurse expert will act as the care coordinator in close collaboration with the social worker, and the care plan will be rolled out with the older adult and their caregivers, the GP, and health and social services in the community.

The outputs column describes the main products anticipated as a result of both the intervention components and the implementation strategies. Certain aspects that will contribute to the measurement of implementation outcomes (e.g., acceptability, appropriateness, feasibility and fidelity) are reflected in the outputs and outcomes columns. For example, the percent of eligible older adults who receive an individualized care plan will contribute to the measurement of the intervention fidelity, and the IAC nurses’ views on whether the intervention is appropriate for frail older adults will be explored to measure intervention appropriateness. The outcomes columns are grouped temporally based on when we anticipate the change will be seen.
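The fidelity indicator mentioned above (the percent of eligible older adults who receive an individualized care plan) is a simple proportion. The sketch below shows one way it could be computed; the function name `care_plan_fidelity`, the input checks, and the counts are hypothetical and are not taken from the INSPIRE project.

```python
def care_plan_fidelity(eligible, with_care_plan):
    """Return the percentage of eligible older adults who received a care plan."""
    if eligible <= 0:
        raise ValueError("eligible must be positive")
    if not 0 <= with_care_plan <= eligible:
        raise ValueError("with_care_plan must lie between 0 and eligible")
    return 100.0 * with_care_plan / eligible


# Hypothetical counts for illustration only (not project data).
fidelity_pct = care_plan_fidelity(eligible=40, with_care_plan=30)
```

With these assumed counts, 30 of 40 eligible older adults received a care plan, giving a fidelity indicator of 75%.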

Lastly, arrows indicate the links between the activities and outcomes. These links illustrate the sequential outcomes anticipated and are evidence-informed based on clinical expertise, expert opinion, recommendations or previous/current hypotheses. To provide three examples: first, we anticipate that performing a CGA including a care plan and follow-up, which involves multi-disciplinary care professionals, will result in a care plan coordinated by one professional based on the older adults’ needs, being connected to necessary resources and services, and improving person-centered coordinated care. If the health and social needs of the older adults are assessed, we anticipate this will result in appropriate referrals, which may help to reduce the pressure on caregivers. As a second example, the educational meetings held will be instrumental to increase awareness of the IAC, and to determine how care planning can best be coordinated between the IAC and the other health and social service professionals, such as GPs. Thirdly, reviewing the patients’ medication list by the geriatric nurse expert as part of the CGA should help to flag any potentially inappropriate medications, which can be a concern with community-dwelling older adults.

During the validation phase, stakeholders proposed outcomes similar to those anticipated by the project team, and contributed valuable new outcomes. Stakeholders raised no concerns or discrepancies regarding the logic model when the revised version was sent to them.

Implementation strategies

Table 2 presents the implementation strategies which were selected from six different ERIC clusters, namely: Use evaluative and iterative strategies ; adapt and tailor to context ; develop stakeholder interrelationships ; train and educate stakeholders ; support clinicians ; and engage consumers [ 49 ]. For example, the strategy “use advisory boards and workgroups” was operationalized in our project by collaborating with local workgroups, including social service professionals to co-develop the electronic patient file which will be used for the IAC consultations. Meanwhile, given the diversity in the nursing education system in Switzerland, ongoing training is planned for the IAC nurse to increase their self-efficacy in geriatric care planning and to fulfill their role, as marked by the training-related strategies. In terms of educating stakeholders, it will be crucial to provide GPs as well as other providers in the community with information about the IAC, a referral path, as well as communication tools to foster care coordination. We anticipate that additional strategies will be needed as implementation progresses, depending on the resources available.

Table 2. Potential implementation strategies for INSPIRE presented using the Expert Recommendations for Implementing Change (ERIC) compilation [ 49 ].

BL = Basel-Landschaft; IAC = Information and Advice Center; GP = General Practitioner.

Discussion

Given the international desire to establish effective models of integrated care for home-based, frail older adults, this paper described the essential investments made during the development phase before implementing a new integrated care model in Canton BL. The results of this study demonstrate how a rich understanding of the context can help further refine an intervention concept and consider preliminary implementation strategies. Additionally, a contextually-relevant logic model was created to effectively communicate the program theory to INSPIRE project members, stakeholders, and other researchers.

Overall, many of the activities, outputs and outcomes described in our logic model are comparable with those seen in the Social Care Institute for Excellence Logic Model for Integrated Care [ 51 ] and the Logic Model for Patient-Centered Medical Home Models [ 52 ], among others [ 53 , 54 ]. Nevertheless, our logic model is specifically designed for our program and context, incorporates our assumptions, has an operational-level focus, and includes our implementation strategies. By providing a rich description of the contextual factors collected to date, this study addresses a common gap in the literature where the context of interventions is often not reported or only vaguely described. Without the findings emerging from the contextual analysis, necessary actions related to the intervention or implementation strategies would not have been detected. Examples of this include: the future role of IACs in performing a CGA and care coordination based on the current care system; identifying the local health and social service organizations to coordinate care with and to prevent duplication of services; the importance of a marketing plan for the IAC and unique strategies needed to reach family members; and the competencies needed by the IAC nurse and how defined pathways could help them work together with the social worker and GPs. Additionally, the contextual analysis highlighted the importance of early involvement of professionals in the community, such as GPs, to facilitate frailty screening, referral to the IAC and collaborative care planning. However, some of the contextual barriers will remain outside of our control within the project, such as the financing models, incentives for integration or whether electronic records are shared across the whole system [ 14 ].
Awareness of these factors will also allow for a more accurate evaluation of the care model in future and interpretation of the results, and can also support other researchers or professionals who are looking for guidance on how to analyze context and use the findings within their intervention development. Stadnick et al. (2019) recently conducted seven case studies of integrated care initiatives across multiple countries, where they reflected on the shared contextual factors which influenced the implementation of these projects [ 55 ]. Among the inner context factors, several of the important considerations identified such as knowledge, education, training and confidence of service providers; monitoring fidelity; and shaping providers’ roles and responsibilities, will be relevant in the INSPIRE project and can guide where to enhance our efforts [ 55 ]. Establishing a “community-academic partnership” was the main bridging factor they identified [ 55 ], which will remain of great importance during all phases of the INSPIRE care model. By describing and linking the ultimate program goals with the activities that will be done to achieve these goals, the logic model revealed our thinking about what should work and how [ 31 , 32 , 33 ], mitigating the “black box” phenomenon which can otherwise occur when describing an intervention [ 46 , 52 ].

With respect to the overall program theory, the integration of health and social care has been fundamentally endorsed for years [ 13 , 56 , 57 , 58 ], especially for populations with complex needs [ 59 , 60 ]. The program theory encompasses the World Health Organization’s approach to Integrated Care for Older People (ICOPE) at the micro-level [ 60 ], and at the meso-level it incorporates actions deemed essential based on findings from the recent eDelphi study on implementing the ICOPE approach (e.g., conducting comprehensive assessments and training personnel to develop a care plan) [ 61 ]. As the first component in the care model, screening for potential frailty has been promoted as part of a preventative approach and as an effective means to determine the subset of the older population that would benefit from further comprehensive assessment and subsequent interventions [ 62 , 63 , 64 , 65 ]. Given that only a subset of the older population is estimated to be in higher need of IAC services, screening for potential frailty is particularly appropriate to use healthcare resources efficiently, combined with the recognition that frailty is an emerging public health priority [ 66 ]; and that it is likely a major factor predicting admittance to a nursing home [ 3 ]. Although there are different schools of thought on whether and how to screen older people for frailty in different health care settings based on feasibility, evidence gaps and resources required [ 65 , 66 , 67 , 68 , 69 ], frailty detection is essential to determine actions which can help prevent further conditions associated with aging [ 66 ]. If supported by appropriate implementation strategies, we believe screening can be an effective mechanism for identifying older adults most in need of further assessment.

Following screening, the remaining activities included in the program theory (i.e., conducting a CGA; assessing needs; creating and coordinating a care plan; and conducting follow-up) are highlighted as part of an integrated care approach for frailty or multi-morbidity [ 11 , 13 , 14 , 15 , 70 , 71 , 72 ], and are common to many studies of this nature [ 19 , 65 , 73 , 74 , 75 , 76 ]. The core intervention features a CGA at the center, which is considered either beneficial or a gold standard in caring for frail older adults in certain settings [ 66 , 77 , 78 , 79 ]. In a recent scoping review of 27 integrated care programs for older people, the authors found that the 21 different CGA instruments used incorporated three of the dominant principles of integrated care, i.e., comprehensive, multidisciplinary and person-centred care [ 80 ]. However, they proposed that stronger involvement of both social care professionals and older adults could strengthen the CGA process, which will be key in the INSPIRE model [ 12 ]. The present study, together with results from the ongoing systematic review by Briggs et al. assessing the effectiveness of the CGA in community-dwelling, frail older adults [ 81 ], will help add to the body of research testing the CGA as part of an integrated care model to improve outcomes for this population.

The program outcomes presented in the logic model were derived from studies of related care models [ 9 , 10 , 11 , 82 , 83 , 84 , 85 ]; outcomes that have been proposed for integrated care initiatives [ 8 , 86 ]; and the program team’s realistic assumptions and/or stakeholder expectations (e.g., relief and support, coordination, costs and perception of aging). Achievement of these outcomes relies on important assumptions such as trusting relationships and strong communication between providers [ 87 ] as well as “provider commitment to and understanding of the model” [ 88 ]. However, previous authors have questioned whether some of the outcomes hypothesized for integrated care for this population are in fact appropriate or realistic, such as improvements in activities of daily living or quality adjusted life years [ 9 , 10 ]. Focusing on care processes and outcomes that are most important to patients has been emphasized as a priority [ 9 , 10 , 85 ], particularly measuring patients’ care experience as an outcome to reflect the quality of integrated care for multi-morbid individuals [ 89 ] or concentrating on intrinsic capacity and the patient’s individual goals [ 59 ].

With respect to the process of intervention development, O’Cathain et al. (2019) conducted a consensus exercise with experts to offer guidance for intervention development [ 50 ]. We endorse the principles they put forward, as our process illustrated a “dynamic, iterative, creative, open to change and forward looking” approach [ 50, p.2 ]. Our paper provides a practical example of how some of the actions within their framework (e.g., “involving stakeholders, reviewing published research evidence, drawing on existing theories, articulating programme theory, undertaking primary data collection, and understanding context”) can be applied and combined to prepare for a new intervention. It also reflects on the relationship between these steps, with emphasis on logic model development, and adds the element of implementation strategies. While the process and results are specific to our project, the approach and methods we used for the development phase should be broadly generalizable for other researchers, which is a strength of this study.

Methodological Considerations

With regard to contextual analysis, new methods are under development to guide researchers in the field toward a consistent, systematic approach to analyzing context [ 90 ]. As context constantly evolves, the factors present at the time of data collection may change, and it will therefore be important that the implementation strategies adapt along with them. Contextual factors will differ for every setting, which limits the generalizability of the care model; therefore, we have made the contextual factors in our situation transparent for researchers. In essence, the overall methodological approach can guide researchers in assessing their own setting and designing a logic model, and can facilitate the design and evaluation of future care models.

Logic models can also be criticized for not describing “why” activities produce outcomes, which would otherwise be clear through a theory of change [ 91 ]. Another drawback is that some more basic or linear versions may fail to capture context or communicate the true complexity involved in making a complex intervention fit its context [ 92 , 93 ]. However, our format supports program planning and was appropriate for our purposes, as described by Mills et al. [ 92 ]. In addition, the combination of the logic model, the extensive narrative describing the evidence-informed strategies, and the complementary contextual analysis we provided should support interpretation of the logic model and understanding of the development of this complex intervention [ 36 ]. Alternatively, other authors have described innovative methods they used to account for context while developing logic models [ 46 , 93 ], and have proposed a new format for presenting this [ 92 ]. Innovative work by Smith et al. (2020) may support implementation science researchers moving forward, as they have introduced new templates for logic models that link the different frameworks specific to implementation science and can support the various study designs [ 94 ]. For the future evaluation of the care model, a systems thinking approach may be more appropriate to accurately reflect the complexity of the system [ 95 ].

This study has set the foundation for the next steps in the INSPIRE research project: to conduct a feasibility study of the integrated care model and implementation strategies prior to full evaluation of the implementation and intervention outcomes. Based on the insights of previous integrated care studies on older adults, a stronger understanding of context and program theory is needed, especially to develop, implement and evaluate these initiatives, which are yet to yield strong evidence in the field. Investing sufficient effort in program development and stakeholder involvement is essential to ensure a strong fit between the context and the integrated care model, identify the implementation strategies needed, and reduce research waste. Flexibility in the next phases of research and implementation will also be essential, as changes in leadership, policies, and so on are typically inevitable. The approach followed during this study can be used as a basis and adapted when developing future integrated care programs.

Data Accessibility statement

Additional file.

The additional file for this article can be found as follows:

Additional File 1.

Applying the Context and Implementation of Complex Interventions (CICI) framework to the INSPIRE project.

Funding Statement

This study is part of the INSPIRE research project which was funded by Swisslos Fond Baselland, Velux Stiftung/Velux Foundation, the Swiss National Science Foundation (NRP74) and Amt für Gesundheit Kanton Basel-Landschaft/Health Department Canton Basel-Landschaft. Additionally, this project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 801076 (through the SSPH+ Global PhD Fellowship Programme in Public Health Sciences [GlobalP3HS] of the Swiss School of Public Health) and grant agreement No 812656 as it also forms part of the TRANS-SENIOR Project. The funding bodies had no role in the design, execution, analysis and interpretation of the data, or writing of the study.

Tamara Alhambra Borrás, PhD, Polibienestar Research Institute – Universitat de València, Spain.

Anneli Hujala, Senior Researcher, PhD, University of Eastern Finland, Department of Health and Social Management, Kuopio, Finland.

Dr Lynsey Warwick-Giles, Research Associate, Health Organisation, Policy and Economics (HOPE) research group, University of Manchester, UK.

Competing interests.

The authors have no competing interests to declare.
