Purdue Online Writing Lab Purdue OWL® College of Liberal Arts

Writing a Literature Review

Copyright ©1995-2018 by The Writing Lab & The OWL at Purdue and Purdue University. All rights reserved. This material may not be published, reproduced, broadcast, rewritten, or redistributed without permission. Use of this site constitutes acceptance of our terms and conditions of fair use.

A literature review is a document or section of a document that collects key sources on a topic and discusses those sources in conversation with each other (also called synthesis). The lit review is an important genre in many disciplines, not just literature (i.e., the study of works of literature such as novels and plays). When we say “literature review” or refer to “the literature,” we are talking about the research (scholarship) in a given field. You will often see the terms “the research,” “the scholarship,” and “the literature” used mostly interchangeably.

Where, when, and why would I write a lit review?

There are a number of different situations where you might write a literature review, each with slightly different expectations; different disciplines, too, have field-specific expectations for what a literature review is and does. For instance, in the humanities, authors might include more overt argumentation and interpretation of source material in their literature reviews, whereas in the sciences, authors are more likely to report study designs and results in their literature reviews; these differences reflect these disciplines’ purposes and conventions in scholarship. You should always look at examples from your own discipline and talk to professors or mentors in your field to be sure you understand your discipline’s conventions, for literature reviews as well as for any other genre.

A literature review can be a part of a research paper or scholarly article, usually falling after the introduction and before the research methods sections. In these cases, the lit review just needs to cover scholarship that is important to the issue you are writing about; sometimes it will also cover key sources that informed your research methodology.

Lit reviews can also be standalone pieces, either as assignments in a class or as publications. In a class, a lit review may be assigned to help students familiarize themselves with a topic and with scholarship in their field, get an idea of the other researchers working on the topic they’re interested in, find gaps in existing research in order to propose new projects, and/or develop a theoretical framework and methodology for later research. As a publication, a lit review usually is meant to help make other scholars’ lives easier by collecting and summarizing, synthesizing, and analyzing existing research on a topic. This can be especially helpful for students or scholars getting into a new research area, or for directing an entire community of scholars toward questions that have not yet been answered.

What are the parts of a lit review?

Most lit reviews use a basic introduction-body-conclusion structure; if your lit review is part of a larger paper, the introduction and conclusion pieces may be just a few sentences while you focus most of your attention on the body. If your lit review is a standalone piece, the introduction and conclusion take up more space and give you a place to discuss your goals, research methods, and conclusions separately from where you discuss the literature itself.

Introduction:

  • An introductory paragraph that explains what your working topic and thesis are
  • A forecast of key topics or texts that will appear in the review
  • Potentially, a description of how you found sources and how you analyzed them for inclusion and discussion in the review (more often found in published, standalone literature reviews than in lit review sections in an article or research paper)

Body:

  • Summarize and synthesize: Give an overview of the main points of each source and combine them into a coherent whole
  • Analyze and interpret: Don’t just paraphrase other researchers – add your own interpretations where possible, discussing the significance of findings in relation to the literature as a whole
  • Critically evaluate: Mention the strengths and weaknesses of your sources
  • Write in well-structured paragraphs: Use transition words and topic sentences to draw connections, comparisons, and contrasts.

Conclusion:

  • Summarize the key findings you have taken from the literature and emphasize their significance
  • Connect it back to your primary research question

How should I organize my lit review?

Lit reviews can take many different organizational patterns depending on what you are trying to accomplish with the review. Here are some examples:

  • Chronological : The simplest approach is to trace the development of the topic over time, which helps familiarize the audience with the topic (for instance if you are introducing something that is not commonly known in your field). If you choose this strategy, be careful to avoid simply listing and summarizing sources in order. Try to analyze the patterns, turning points, and key debates that have shaped the direction of the field. Give your interpretation of how and why certain developments occurred (as mentioned previously, this may not be appropriate in your discipline — check with a teacher or mentor if you’re unsure).
  • Thematic : If you have found some recurring central themes that you will continue working with throughout your piece, you can organize your literature review into subsections that address different aspects of the topic. For example, if you are reviewing literature about women and religion, key themes can include the role of women in churches and the religious attitude towards women.
  • Methodological : If your sources come from different disciplines or fields that use a variety of research methods, you can compare the results and conclusions that emerge from different approaches. For example:
      • Qualitative versus quantitative research
      • Empirical versus theoretical scholarship
      • Research divided by sociological, historical, or cultural sources
  • Theoretical : In many humanities articles, the literature review is the foundation for the theoretical framework. You can use it to discuss various theories, models, and definitions of key concepts. You can argue for the relevance of a specific theoretical approach or combine various theoretical concepts to create a framework for your research.

What are some strategies or tips I can use while writing my lit review?

Any lit review is only as good as the research it discusses; make sure your sources are well-chosen and your research is thorough. Don’t be afraid to do more research if you discover a new thread as you’re writing. More info on the research process is available in our "Conducting Research" resources.

As you’re doing your research, create an annotated bibliography (see our page on this type of document). Much of the information used in an annotated bibliography can also be used in a literature review, so you’ll not only be partially drafting your lit review as you research but also developing your sense of the larger conversation going on among scholars, professionals, and any other stakeholders in your topic.

Usually you will need to synthesize research rather than just summarizing it. This means drawing connections between sources to create a picture of the scholarly conversation on a topic over time. Many student writers struggle to synthesize because they feel they don’t have anything to add to the scholars they are citing; here are some strategies to help you:

  • It often helps to remember that the point of these kinds of syntheses is to show your readers how you understand your research, to help them read the rest of your paper.
  • Writing teachers often say synthesis is like hosting a dinner party: imagine all your sources are together in a room, discussing your topic. What are they saying to each other?
  • Look at the in-text citations in each paragraph. Are you citing just one source for each paragraph? This usually indicates summary only. When you have multiple sources cited in a paragraph, you are more likely to be synthesizing them (not always, but often).

The most interesting literature reviews are often written as arguments (again, as mentioned at the beginning of the page, this is discipline-specific and doesn’t work for all situations). Often, the literature review is where you can establish your research as filling a particular gap or as relevant in a particular way. You have some chance to do this in your introduction in an article, but the literature review section gives a more extended opportunity to establish the conversation in the way you would like your readers to see it. You can choose the intellectual lineage you would like to be part of and whose definitions matter most to your thinking (mostly humanities-specific, but this goes for sciences as well). In addressing these points, you argue for your place in the conversation, which tends to make the lit review more compelling than a simple reporting of other sources.

What is a Literature Review? How to Write It (with Examples)

A literature review is a critical analysis and synthesis of existing research on a particular topic. It provides an overview of the current state of knowledge, identifies gaps, and highlights key findings in the literature. 1 The purpose of a literature review is to situate your own research within the context of existing scholarship, demonstrating your understanding of the topic and showing how your work contributes to the ongoing conversation in the field. Learning how to write a literature review is a critical skill for successful research. Your ability to summarize and synthesize prior research pertaining to a certain topic demonstrates your grasp of the topic and assists in the learning process.


What is a literature review?

A well-conducted literature review demonstrates the researcher’s familiarity with the existing literature, establishes the context for their own research, and contributes to scholarly conversations on the topic. One of the purposes of a literature review is also to help researchers avoid duplicating previous work and ensure that their research is informed by and builds upon the existing body of knowledge.

What is the purpose of literature review?

A literature review serves several important purposes within academic and research contexts. Here are some key objectives and functions of a literature review: 2  

  • Contextualizing the Research Problem: The literature review provides a background and context for the research problem under investigation. It helps to situate the study within the existing body of knowledge. 
  • Identifying Gaps in Knowledge: By identifying gaps, contradictions, or areas requiring further research, the researcher can shape the research question and justify the significance of the study. This is crucial for ensuring that the new research contributes something novel to the field. 
  • Understanding Theoretical and Conceptual Frameworks: Literature reviews help researchers gain an understanding of the theoretical and conceptual frameworks used in previous studies. This aids in the development of a theoretical framework for the current research. 
  • Providing Methodological Insights: Another purpose of literature reviews is that they allow researchers to learn about the methodologies employed in previous studies. This can help in choosing appropriate research methods for the current study and avoiding pitfalls that others may have encountered. 
  • Establishing Credibility: A well-conducted literature review demonstrates the researcher’s familiarity with existing scholarship, establishing their credibility and expertise in the field. It also helps in building a solid foundation for the new research. 
  • Informing Hypotheses or Research Questions: The literature review guides the formulation of hypotheses or research questions by highlighting relevant findings and areas of uncertainty in existing literature. 

Literature review example

Let’s delve deeper with a literature review example: Let’s say your literature review is about the impact of climate change on biodiversity. You might format your literature review into sections such as the effects of climate change on habitat loss and species extinction, phenological changes, and marine biodiversity. Each section would then summarize and analyze relevant studies in those areas, highlighting key findings and identifying gaps in the research. The review would conclude by emphasizing the need for further research on specific aspects of the relationship between climate change and biodiversity. The following literature review template provides a glimpse into the recommended literature review structure and content, demonstrating how research findings are organized around specific themes within a broader topic. 

Literature Review on Climate Change Impacts on Biodiversity:

Climate change is a global phenomenon with far-reaching consequences, including significant impacts on biodiversity. This literature review synthesizes key findings from various studies: 

a. Habitat Loss and Species Extinction:

Climate change-induced alterations in temperature and precipitation patterns contribute to habitat loss, affecting numerous species (Thomas et al., 2004). The review discusses how these changes increase the risk of extinction, particularly for species with specific habitat requirements. 

b. Range Shifts and Phenological Changes:

Observations of range shifts and changes in the timing of biological events (phenology) are documented in response to changing climatic conditions (Parmesan & Yohe, 2003). These shifts affect ecosystems and may lead to mismatches between species and their resources. 

c. Ocean Acidification and Coral Reefs:

The review explores the impact of climate change on marine biodiversity, emphasizing ocean acidification’s threat to coral reefs (Hoegh-Guldberg et al., 2007). Changes in pH levels negatively affect coral calcification, disrupting the delicate balance of marine ecosystems. 

d. Adaptive Strategies and Conservation Efforts:

Recognizing the urgency of the situation, the literature review discusses various adaptive strategies adopted by species and conservation efforts aimed at mitigating the impacts of climate change on biodiversity (Hannah et al., 2007). It emphasizes the importance of interdisciplinary approaches for effective conservation planning. 

How to write a good literature review

Writing a literature review involves summarizing and synthesizing existing research on a particular topic. A good literature review format should include the following elements. 

Introduction: The introduction sets the stage for your literature review, providing context and introducing the main focus of your review. 

  • Opening Statement: Begin with a general statement about the broader topic and its significance in the field. 
  • Scope and Purpose: Clearly define the scope of your literature review. Explain the specific research question or objective you aim to address. 
  • Organizational Framework: Briefly outline the structure of your literature review, indicating how you will categorize and discuss the existing research. 
  • Significance of the Study: Highlight why your literature review is important and how it contributes to the understanding of the chosen topic. 
  • Thesis Statement: Conclude the introduction with a concise thesis statement that outlines the main argument or perspective you will develop in the body of the literature review. 

Body: The body of the literature review is where you provide a comprehensive analysis of existing literature, grouping studies based on themes, methodologies, or other relevant criteria. 

  • Organize by Theme or Concept: Group studies that share common themes, concepts, or methodologies. Discuss each theme or concept in detail, summarizing key findings and identifying gaps or areas of disagreement. 
  • Critical Analysis: Evaluate the strengths and weaknesses of each study. Discuss the methodologies used, the quality of evidence, and the overall contribution of each work to the understanding of the topic. 
  • Synthesis of Findings: Synthesize the information from different studies to highlight trends, patterns, or areas of consensus in the literature. 
  • Identification of Gaps: Discuss any gaps or limitations in the existing research and explain how your review contributes to filling these gaps. 
  • Transition between Sections: Provide smooth transitions between different themes or concepts to maintain the flow of your literature review. 

Conclusion: The conclusion of your literature review should summarize the main findings, highlight the contributions of the review, and suggest avenues for future research. 

  • Summary of Key Findings: Recap the main findings from the literature and restate how they contribute to your research question or objective. 
  • Contributions to the Field: Discuss the overall contribution of your literature review to the existing knowledge in the field. 
  • Implications and Applications: Explore the practical implications of the findings and suggest how they might impact future research or practice. 
  • Recommendations for Future Research: Identify areas that require further investigation and propose potential directions for future research in the field. 
  • Final Thoughts: Conclude with a final reflection on the importance of your literature review and its relevance to the broader academic community. 

Conducting a literature review

Conducting a literature review is an essential step in research that involves reviewing and analyzing existing literature on a specific topic. It’s important to know how to do a literature review effectively, so here are the steps to follow: 1  

Choose a Topic and Define the Research Question:

  • Select a topic that is relevant to your field of study. 
  • Clearly define your research question or objective. Determine what specific aspect of the topic you want to explore. 

Decide on the Scope of Your Review:

  • Determine the timeframe for your literature review. Are you focusing on recent developments, or do you want a historical overview? 
  • Consider the geographical scope. Is your review global, or are you focusing on a specific region? 
  • Define the inclusion and exclusion criteria. What types of sources will you include? Are there specific types of studies or publications you will exclude? 

Select Databases for Searches:

  • Identify relevant databases for your field. Examples include PubMed, IEEE Xplore, Scopus, Web of Science, and Google Scholar. 
  • Consider searching in library catalogs, institutional repositories, and specialized databases related to your topic. 

Conduct Searches and Keep Track:

  • Develop a systematic search strategy using keywords, Boolean operators (AND, OR, NOT), and other search techniques. 
  • Record and document your search strategy for transparency and replicability. 
  • Keep track of the articles, including publication details, abstracts, and links. Use citation management tools like EndNote, Zotero, or Mendeley to organize your references. 
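As an illustration of the Boolean search strategy described above, synonym groups can be combined into a single reusable search string: terms within a group are joined with OR, and the groups themselves are joined with AND. This is a minimal sketch; the keywords and the `build_query` helper below are hypothetical and should be replaced with terms from your own research question.

```python
# Hypothetical helper: compose a Boolean search string for a database query.
# The keyword groups are placeholders; substitute terms from your own topic.

def build_query(*keyword_groups):
    """Join synonyms with OR inside parentheses, then AND across groups."""
    clauses = ["(" + " OR ".join(group) + ")" for group in keyword_groups]
    return " AND ".join(clauses)

query = build_query(
    ['"climate change"', '"global warming"'],    # synonyms for the cause
    ['"biodiversity"', '"species extinction"'],  # synonyms for the outcome
)
print(query)
# ("climate change" OR "global warming") AND ("biodiversity" OR "species extinction")
```

Recording the exact string you searched (and the date and database) in this form makes your search strategy transparent and replicable, as recommended above.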

Review the Literature:

  • Evaluate the relevance and quality of each source. Consider the methodology, sample size, and results of studies. 
  • Organize the literature by themes or key concepts. Identify patterns, trends, and gaps in the existing research. 
  • Summarize key findings and arguments from each source. Compare and contrast different perspectives. 
  • Identify areas where there is a consensus in the literature and where there are conflicting opinions. 
  • Provide critical analysis and synthesis of the literature. What are the strengths and weaknesses of existing research? 

Organize and Write Your Literature Review:

  • Outline your literature review by themes, chronological order, or methodological approach. 
  • Write a clear and coherent narrative that synthesizes the information gathered. 
  • Use proper citations for each source and ensure consistency in your citation style (APA, MLA, Chicago, etc.). 
  • Conclude your literature review by summarizing key findings, identifying gaps, and suggesting areas for future research. 

The literature review sample and detailed advice on writing and conducting a review will help you produce a well-structured report. But remember that a literature review is an ongoing process, and it may be necessary to revisit and update it as your research progresses. 

Frequently asked questions

A literature review is a critical and comprehensive analysis of existing literature (published and unpublished works) on a specific topic or research question and provides a synthesis of the current state of knowledge in a particular field. A well-conducted literature review is crucial for researchers to build upon existing knowledge, avoid duplication of efforts, and contribute to the advancement of their field. It also helps researchers situate their work within a broader context and facilitates the development of a sound theoretical and conceptual framework for their studies.

A literature review is a crucial component of research writing, providing a solid background for a research paper’s investigation. The aim is to keep professionals up to date by providing an understanding of ongoing developments within a specific field, including the research methods and experimental techniques used in that field, and to present that knowledge in the form of a written report. The depth and breadth of the literature review also emphasize the credibility of the scholar in his or her field. 

Before writing a literature review, it’s essential to undertake several preparatory steps to ensure that your review is well-researched, organized, and focused. This includes choosing a topic of general interest to you and doing exploratory research on that topic, writing an annotated bibliography, and noting major points, especially those that relate to the position you have taken on the topic. 

Literature reviews and academic research papers are essential components of scholarly work but serve different purposes within the academic realm. 3 A literature review aims to provide a foundation for understanding the current state of research on a particular topic, identify gaps or controversies, and lay the groundwork for future research. Therefore, it draws heavily from existing academic sources, including books, journal articles, and other scholarly publications. In contrast, an academic research paper aims to present new knowledge, contribute to the academic discourse, and advance the understanding of a specific research question. Therefore, it involves a mix of existing literature (in the introduction and literature review sections) and original data or findings obtained through research methods. 

Literature reviews are essential components of academic and research papers, and various strategies can be employed to conduct them effectively. If you want to know how to write a literature review for a research paper, here are four common approaches that are often used by researchers.

  • Chronological Review: This strategy involves organizing the literature based on the chronological order of publication. It helps to trace the development of a topic over time, showing how ideas, theories, and research have evolved.
  • Thematic Review: Thematic reviews focus on identifying and analyzing themes or topics that cut across different studies. Instead of organizing the literature chronologically, it is grouped by key themes or concepts, allowing for a comprehensive exploration of various aspects of the topic.
  • Methodological Review: This strategy involves organizing the literature based on the research methods employed in different studies. It helps to highlight the strengths and weaknesses of various methodologies and allows the reader to evaluate the reliability and validity of the research findings.
  • Theoretical Review: A theoretical review examines the literature based on the theoretical frameworks used in different studies. This approach helps to identify the key theories that have been applied to the topic and assess their contributions to the understanding of the subject.

It’s important to note that these strategies are not mutually exclusive, and a literature review may combine elements of more than one approach. The choice of strategy depends on the research question, the nature of the literature available, and the goals of the review. Additionally, other strategies, such as integrative reviews or systematic reviews, may be employed depending on the specific requirements of the research.

The literature review format can vary depending on the specific publication guidelines. However, there are some common elements and structures that are often followed. Here is a general guideline for the format of a literature review:

Introduction:

  • Provide an overview of the topic.
  • Define the scope and purpose of the literature review.
  • State the research question or objective.

Body:

  • Organize the literature by themes, concepts, or chronology.
  • Critically analyze and evaluate each source.
  • Discuss the strengths and weaknesses of the studies.
  • Highlight any methodological limitations or biases.
  • Identify patterns, connections, or contradictions in the existing research.

Conclusion:

  • Summarize the key points discussed in the literature review.
  • Highlight the research gap.
  • Address the research question or objective stated in the introduction.
  • Highlight the contributions of the review and suggest directions for future research.

Both annotated bibliographies and literature reviews involve the examination of scholarly sources. While annotated bibliographies focus on individual sources with brief annotations, literature reviews provide a more in-depth, integrated, and comprehensive analysis of existing literature on a specific topic. 

References 

  • Denney, A. S., & Tewksbury, R. (2013). How to write a literature review. Journal of Criminal Justice Education, 24(2), 218-234. 
  • Pan, M. L. (2016). Preparing literature reviews: Qualitative and quantitative approaches. Taylor & Francis. 
  • Cantero, C. (2019). How to write a literature review. San José State University Writing Center. 


Literature Review: The What, Why and How-to Guide — Introduction


What are Literature Reviews?

So, what is a literature review? "A literature review is an account of what has been published on a topic by accredited scholars and researchers. In writing the literature review, your purpose is to convey to your reader what knowledge and ideas have been established on a topic, and what their strengths and weaknesses are. As a piece of writing, the literature review must be defined by a guiding concept (e.g., your research objective, the problem or issue you are discussing, or your argumentative thesis). It is not just a descriptive list of the material available, or a set of summaries." Taylor, D. The literature review: A few tips on conducting it. University of Toronto Health Sciences Writing Centre.

Goals of Literature Reviews

What are the goals of creating a Literature Review? A literature review could be written to accomplish different aims:

  • To develop a theory or evaluate an existing theory
  • To summarize the historical or existing state of a research topic
  • To identify a problem in a field of research

Baumeister, R. F., & Leary, M. R. (1997). Writing narrative literature reviews. Review of General Psychology, 1(3), 311-320.

What kinds of sources require a Literature Review?

  • A research paper assigned in a course
  • A thesis or dissertation
  • A grant proposal
  • An article intended for publication in a journal

All these instances require you to collect what has been written about your research topic so that you can demonstrate how your own research sheds new light on the topic.

Types of Literature Reviews

What kinds of literature reviews are written?

Narrative review: The purpose of this type of review is to describe the current state of the research on a specific topic and to offer a critical analysis of the literature reviewed. Studies are grouped by research/theoretical categories, and themes and trends, strengths and weaknesses, and gaps are identified. The review ends with a conclusion section that summarizes the findings regarding the state of the research on the specific topic, the gaps identified, and, if applicable, explains how the author's research will address the gaps identified in the review and expand the knowledge on the topic reviewed.

  • Example : Predictors and Outcomes of U.S. Quality Maternity Leave: A Review and Conceptual Framework:  10.1177/08948453211037398  

Systematic review : "The authors of a systematic review use a specific procedure to search the research literature, select the studies to include in their review, and critically evaluate the studies they find." (p. 139). Nelson, L. K. (2013). Research in Communication Sciences and Disorders . Plural Publishing.

  • Example : The effect of leave policies on increasing fertility: a systematic review:  10.1057/s41599-022-01270-w

Meta-analysis: "Meta-analysis is a method of reviewing research findings in a quantitative fashion by transforming the data from individual studies into what is called an effect size and then pooling and analyzing this information. The basic goal in meta-analysis is to explain why different outcomes have occurred in different studies." (p. 197). Roberts, M. C., & Ilardi, S. S. (2003). Handbook of Research Methods in Clinical Psychology. Blackwell Publishing.

  • Example : Employment Instability and Fertility in Europe: A Meta-Analysis:  10.1215/00703370-9164737

Meta-synthesis: "Qualitative meta-synthesis is a type of qualitative study that uses as data the findings from other qualitative studies linked by the same or related topic." (p. 312). Zimmer, L. (2006). Qualitative meta-synthesis: A question of dialoguing with texts. Journal of Advanced Nursing, 53(3), 311-318.

  • Example : Women’s perspectives on career successes and barriers: A qualitative meta-synthesis:  10.1177/05390184221113735

Literature Reviews in the Health Sciences

  • UConn Health subject guide on systematic reviews: an explanation of the different review types used in health sciences literature, as well as tools to help you find the right review type.



Comparing and Contrasting in an Essay | Tips & Examples

Published on August 6, 2020 by Jack Caulfield . Revised on July 23, 2023.

Comparing and contrasting is an important skill in academic writing . It involves taking two or more subjects and analyzing the differences and similarities between them.


Table of contents

  • When should I compare and contrast?
  • Making effective comparisons
  • Comparing and contrasting as a brainstorming tool
  • Structuring your comparisons
  • Frequently asked questions about comparing and contrasting

Many assignments will invite you to make comparisons quite explicitly, as in these prompts.

  • Compare the treatment of the theme of beauty in the poetry of William Wordsworth and John Keats.
  • Compare and contrast in-class and distance learning. What are the advantages and disadvantages of each approach?

Some other prompts may not directly ask you to compare and contrast, but present you with a topic where comparing and contrasting could be a good approach.

For instance, given a prompt about the effects of the Great Depression, one way to approach the essay might be to contrast the situation before the Great Depression with the situation during it, to highlight how large a difference it made.

Comparing and contrasting is also used in all kinds of academic contexts where it’s not explicitly prompted. For example, a literature review involves comparing and contrasting different studies on your topic, and an argumentative essay may involve weighing up the pros and cons of different arguments.


As the name suggests, comparing and contrasting is about identifying both similarities and differences. You might focus on contrasting quite different subjects or comparing subjects with a lot in common—but there must be some grounds for comparison in the first place.

For example, you might contrast French society before and after the French Revolution; you’d likely find many differences, but there would be a valid basis for comparison. However, if you contrasted pre-revolutionary France with Han-dynasty China, your reader might wonder why you chose to compare these two societies.

This is why it’s important to clarify the point of your comparisons by writing a focused thesis statement . Every element of an essay should serve your central argument in some way. Consider what you’re trying to accomplish with any comparisons you make, and be sure to make this clear to the reader.

Comparing and contrasting can be a useful tool to help organize your thoughts before you begin writing any type of academic text. You might use it to compare different theories and approaches you’ve encountered in your preliminary research, for example.

Let’s say your research involves the competing psychological approaches of behaviorism and cognitive psychology. You might make a table to summarize the key differences between them.

Or say you’re writing about the major global conflicts of the twentieth century. You might visualize the key similarities and differences in a Venn diagram.

A Venn diagram showing the similarities and differences between World War I, World War II, and the Cold War.

These visualizations wouldn’t make it into your actual writing, so they don’t have to be very formal in terms of phrasing or presentation. The point of comparing and contrasting at this stage is to help you organize and shape your ideas to aid you in structuring your arguments.

When comparing and contrasting in an essay, there are two main ways to structure your comparisons: the alternating method and the block method.

The alternating method

In the alternating method, you structure your text according to what aspect you're comparing: you cover both of your subjects side by side in terms of one specific point of comparison at a time.

The example paragraph below shows how this approach works.

One challenge teachers face is identifying and assisting students who are struggling without disrupting the rest of the class. In a traditional classroom environment, the teacher can easily identify when a student is struggling based on their demeanor in class or simply by regularly checking on students during exercises. They can then offer assistance quietly during the exercise or discuss it further after class. Meanwhile, in a Zoom-based class, the lack of physical presence makes it more difficult to pay attention to individual students’ responses and notice frustrations, and there is less flexibility to speak with students privately to offer assistance. In this case, therefore, the traditional classroom environment holds the advantage, although it appears likely that aiding students in a virtual classroom environment will become easier as the technology, and teachers’ familiarity with it, improves.

The block method

In the block method, you cover each of the overall subjects you’re comparing in a block. You say everything you have to say about your first subject, then discuss your second subject, making comparisons and contrasts back to the things you’ve already said about the first. Your text is structured like this:

  • Subject 1
      ◦ Point of comparison A
      ◦ Point of comparison B
  • Subject 2
      ◦ Point of comparison A
      ◦ Point of comparison B

The most commonly cited advantage of distance learning is the flexibility and accessibility it offers. Rather than being required to travel to a specific location every week (and to live near enough to feasibly do so), students can participate from anywhere with an internet connection. This allows not only for a wider geographical spread of students but for the possibility of studying while travelling. However, distance learning presents its own accessibility challenges; not all students have a stable internet connection and a computer or other device with which to participate in online classes, and less technologically literate students and teachers may struggle with the technical aspects of class participation. Furthermore, discomfort and distractions can hinder an individual student’s ability to engage with the class from home, creating divergent learning experiences for different students. Distance learning, then, seems to improve accessibility in some ways while representing a step backwards in others.

Note that these two methods can be combined; these two example paragraphs could both be part of the same essay, but it’s wise to use an essay outline to plan out which approach you’re taking in each paragraph.


Some essay prompts include the keywords “compare” and/or “contrast.” In these cases, an essay structured around comparing and contrasting is the appropriate response.

Comparing and contrasting is also a useful approach in all kinds of academic writing : You might compare different studies in a literature review , weigh up different arguments in an argumentative essay , or consider different theoretical approaches in a theoretical framework .

Your subjects might be very different or quite similar, but it’s important that there be meaningful grounds for comparison . You can probably describe many differences between a cat and a bicycle, but there isn’t really any connection between them to justify the comparison.

You’ll have to write a thesis statement explaining the central point you want to make in your essay , so be sure to know in advance what connects your subjects and makes them worth comparing.

Comparisons in essays are generally structured in one of two ways:

  • The alternating method, where you compare your subjects side by side according to one specific aspect at a time.
  • The block method, where you cover each subject separately in its entirety.

It’s also possible to combine both methods, for example by writing a full paragraph on each of your topics and then a final paragraph contrasting the two according to a specific metric.

Caulfield, J. (2023, July 23). Comparing and Contrasting in an Essay | Tips & Examples. Scribbr. Retrieved April 11, 2024, from https://www.scribbr.com/academic-essay/compare-and-contrast/


University of Texas

  • University of Texas Libraries

Literature Reviews

  • What is a literature review?
  • Steps in the Literature Review Process
  • Define your research question
  • Determine inclusion and exclusion criteria
  • Choose databases and search
  • Review Results
  • Synthesize Results
  • Analyze Results
  • Librarian Support

What is a Literature Review?

A literature or narrative review is a comprehensive review and analysis of the published literature on a specific topic or research question. The literature that is reviewed may include books, scholarly articles, conference proceedings, association papers, and dissertations. It covers the most pertinent studies and points to important past and current research and practices. It provides background and context, and shows how your research will contribute to the field.

A literature review should: 

  • Provide a comprehensive and updated review of the literature;
  • Explain why this review has taken place;
  • Articulate a position or hypothesis;
  • Acknowledge and account for conflicting and corroborating points of view.

From Sage Research Methods

Purpose of a Literature Review

A literature review can be written as an introduction to a study to:

  • Demonstrate how a study fills a gap in research
  • Compare a study with other research that's been done

Or it can be a separate work (a research article on its own) which:

  • Organizes or describes a topic
  • Describes variables within a particular issue/problem

Limitations of a Literature Review

Some of the limitations of a literature review are:

  • It's a snapshot in time. Unlike other kinds of reviews, it has a beginning, a middle, and an end. There may be future developments that could make your work less relevant.
  • It may be too focused. Some niche studies may miss the bigger picture.
  • It can be difficult to be comprehensive. There is no way to make sure all the literature on a topic was considered.
  • It is easy to be biased if you stick to top-tier journals. There may be other places where people are publishing exemplary research. Look to open-access publications and conferences for a more inclusive collection, and make sure to include opposing views (not just supporting evidence).

Source: Grant, Maria J., and Andrew Booth. “A Typology of Reviews: An Analysis of 14 Review Types and Associated Methodologies.” Health Information & Libraries Journal, vol. 26, no. 2, June 2009, pp. 91–108. Wiley Online Library, doi:10.1111/j.1471-1842.2009.00848.x.

Meryl Brodsky : Communication and Information Studies

Hannah Chapman Tripp : Biology, Neuroscience

Carolyn Cunningham : Human Development & Family Sciences, Psychology, Sociology

Larayne Dallas : Engineering

Janelle Hedstrom : Special Education, Curriculum & Instruction, Ed Leadership & Policy ​

Susan Macicak : Linguistics

Imelda Vetter : Dell Medical School

For help in other subject areas, please see the guide to library specialists by subject .

Periodically, UT Libraries runs a workshop covering the basics and library support for literature reviews. While we try to offer these once per academic year, we find providing the recording to be helpful to community members who have missed the session. Following is the most recent recording of the workshop, Conducting a Literature Review. To view the recording, a UT login is required.

  • October 26, 2022 recording


Literature Reviews

  • Tools & Visualizations
  • Literature Review Examples
  • Videos, Books & Links


What is a literature review?

A literature review discusses published information in a particular subject area. Often part of the introduction to an essay, research report, or thesis, the literature review is literally a "re-view" or "look again" at what has already been written about the topic: the author analyzes a segment of a published body of knowledge through summary, classification, and comparison of prior research studies, reviews of literature, and theoretical articles. Literature reviews provide the reader with a bibliographic history of the scholarly research in a given field of study. As such, as new information becomes available, literature reviews grow in length or become focused on one specific aspect of the topic.

A literature review can be just a simple summary of the sources, but usually contains an organizational pattern and combines both summary and synthesis. A summary is a recap of the important information of the source, whereas a synthesis is a re-organization, or a reshuffling, of that information. The literature review might give a new interpretation of old material or combine new with old interpretations. Or it might trace the intellectual progression of the field, including major debates. Depending on the situation, the literature review may evaluate the sources and advise the reader on the most pertinent or relevant.

A literature review is NOT:

  • An annotated bibliography – a list of citations to books, articles and documents that includes a brief description and evaluation for each citation. The annotations inform the reader of the relevance, accuracy and quality of the sources cited.
  • A literary review – a critical discussion of the merits and weaknesses of a literary work.
  • A book review – a critical discussion of the merits and weaknesses of a particular book.

What is the difference between a literature review and a research paper?

The focus of a literature review is to summarize and synthesize the arguments and ideas of others without adding new contributions, whereas academic research papers present and develop new arguments that build upon the previously available body of literature.

How do I write a literature review?

There are many resources that offer step-by-step guidance for writing a literature review, and you can find some of them under Other Resources in the menu to the left. Writing the Literature Review: A Practical Guide suggests these steps:

  • Choose a review topic and develop a research question
  • Locate and organize research sources
  • Select, analyze and annotate sources
  • Evaluate research articles and other documents
  • Structure and organize the literature review
  • Develop arguments and supporting claims
  • Synthesize and interpret the literature
  • Put it all together


What is the purpose of writing a literature review?

Literature reviews serve as a guide to a particular topic: professionals can use literature reviews to keep current on their field; scholars can determine credibility of the writer in his or her field by analyzing the literature review.

As a writer, you will use the literature review to:

  • See what has, and what has not, been investigated about your topic
  • Identify data sources that other researchers have used
  • Learn how others in the field have defined and measured key concepts
  • Establish context, or background, for the argument explored in the rest of a paper
  • Explain what the strengths and weaknesses of that knowledge and ideas might be
  • Contribute to the field by moving research forward
  • Keep up to date with current developments in a particular field of study
  • Develop alternative research projects
  • Put your work in perspective
  • Demonstrate your understanding and your ability to critically evaluate research in the field
  • Provide evidence that may support your own findings


Evidence-Based Practice (EBP)

  • The EBP Process
  • Forming a Clinical Question
  • Inclusion & Exclusion Criteria
  • Acquiring Evidence
  • Appraising the Quality of the Evidence
  • Writing a Literature Review
  • Finding Psychological Tests & Assessment Instruments

What Is a Literature Review?

A literature review is an integrated analysis of scholarly writings that are related directly to your research question. Put simply, it's  a critical evaluation of what's already been written on a particular topic . It represents the literature that provides background information on your topic and shows a connection between those writings and your research question.

A literature review may be a stand-alone work or the introduction to a larger research paper, depending on the assignment. Rely heavily on the guidelines your instructor has given you.

What a Literature Review Is Not:

  • A list or summary of sources
  • An annotated bibliography
  • A grouping of broad, unrelated sources
  • A compilation of everything that has been written on a particular topic
  • Literary criticism (think English) or a book review

Why Literature Reviews Are Important

  • They explain the background of research on a topic
  • They demonstrate why a topic is significant to a subject area
  • They discover relationships between research studies/ideas
  • They identify major themes, concepts, and researchers on a topic
  • They identify critical gaps and points of disagreement
  • They discuss further research questions that logically come out of the previous studies

To Learn More about Conducting and Writing a Lit Review . . .

Monash University (in Australia) has created several extremely helpful, interactive tutorials. 

  • The Stand-Alone Literature Review, https://www.monash.edu/rlo/assignment-samples/science/stand-alone-literature-review
  • Researching for Your Literature Review,  https://guides.lib.monash.edu/researching-for-your-literature-review/home
  • Writing a Literature Review,  https://www.monash.edu/rlo/graduate-research-writing/write-the-thesis/writing-a-literature-review

Keep Track of Your Sources!

A citation manager can be a helpful way to work with large numbers of citations. See UMSL Libraries' Citing Sources guide for more information. Personally, I highly recommend Zotero: it's free, easy to use, and versatile. If you need help getting started with Zotero or one of the other citation managers, please contact a librarian.


University Libraries      University of Nevada, Reno

  • Skill Guides
  • Subject Guides

Systematic, Scoping, and Other Literature Reviews: Overview

  • Project Planning

What Is a Systematic Review?

Regular literature reviews are simply summaries of the literature on a particular topic. A systematic review, however, is a comprehensive literature review conducted to answer a specific research question. Authors of a systematic review aim to find, code, appraise, and synthesize all of the previous research on their question in an unbiased and well-documented manner. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) outline the minimum amount of information that needs to be reported at the conclusion of a systematic review project. 

Other types of what are known as "evidence syntheses," such as scoping, rapid, and integrative reviews, have varying methodologies. While systematic reviews originated with and continue to be a popular publication type in medicine and other health sciences fields, more and more researchers in other disciplines are choosing to conduct evidence syntheses. 

This guide will walk you through the major steps of a systematic review and point you to key resources including Covidence, a systematic review project management tool. For help with systematic reviews and other major literature review projects, please send us an email at  [email protected] .

Getting Help with Reviews

Organizations such as the Institute of Medicine recommend that you consult a librarian when conducting a systematic review. Librarians at the University of Nevada, Reno can help you:

  • Understand best practices for conducting systematic reviews and other evidence syntheses in your discipline
  • Choose and formulate a research question
  • Decide which review type (e.g., systematic, scoping, rapid, etc.) is the best fit for your project
  • Determine what to include in a systematic review protocol and where to register it
  • Select search terms and develop a search strategy
  • Identify databases and platforms to search
  • Find the full text of articles and other sources
  • Become familiar with citation management tools (e.g., EndNote, Zotero)
  • Get access to and help using Covidence, a systematic review project management tool

Doing a Systematic Review

  • Plan - This is the project planning stage. You and your team will need to develop a good research question, determine the type of review you will conduct (systematic, scoping, rapid, etc.), and establish the inclusion and exclusion criteria (e.g., you're only going to look at studies that use a certain methodology). All of this information needs to be included in your protocol. You'll also need to ensure that the project is viable - has someone already done a systematic review on this topic? Do some searches and check the various protocol registries to find out. 
  • Identify - Next, a comprehensive search of the literature is undertaken to ensure all studies that meet the predetermined criteria are identified. Each research question is different, so the number and types of databases you'll search - as well as other online publication venues - will vary. Some standards and guidelines specify that certain databases (e.g., MEDLINE, EMBASE) should be searched regardless. Your subject librarian can help you select appropriate databases to search and develop search strings for each of those databases.  
  • Evaluate - In this step, retrieved articles are screened and sorted using the predetermined inclusion and exclusion criteria. The risk of bias for each included study is also assessed around this time. It's best if you import search results into a citation management tool (see below) to clean up the citations and remove any duplicates. You can then use a tool like Rayyan (see below) to screen the results. You should begin by screening titles and abstracts only, and then you'll examine the full text of any remaining articles. Each study should be reviewed by a minimum of two people on the project team. 
  • Collect - Each included study is coded and the quantitative or qualitative data contained in these studies is then synthesized. You'll have to either find or develop a coding strategy or form that meets your needs. 
  • Explain - The synthesized results are articulated and contextualized. What do the results mean? How have they answered your research question?
  • Summarize - The final report provides a complete description of the methods and results in a clear, transparent fashion. 
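As a rough illustration of the Evaluate step above, de-duplicating records by DOI before screening can be sketched in a few lines of Python. This is a minimal sketch only, not part of any guide's recommended workflow; the record fields and titles are hypothetical, and real projects typically rely on a citation manager or a tool like Rayyan or Covidence for this step.

```python
# De-duplicating search results before screening: a common first step when
# combining exports from several databases. Records are treated as duplicates
# when they share a DOI (case-insensitive); records without a DOI are kept.

def deduplicate(records):
    """Keep the first occurrence of each DOI; pass through DOI-less records."""
    seen = set()
    unique = []
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        if doi and doi in seen:
            continue  # duplicate of an earlier record
        if doi:
            seen.add(doi)
        unique.append(rec)
    return unique

results = [
    {"title": "Study A", "doi": "10.1000/xyz123"},
    {"title": "Study A (from another database)", "doi": "10.1000/XYZ123"},
    {"title": "Study B", "doi": "10.1000/abc456"},
    {"title": "Grey literature report", "doi": ""},
]
print(len(deduplicate(results)))  # 3 unique records remain
```

Matching on a normalized DOI is only a first pass; titles and author lists still need manual comparison for records that lack a DOI.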

Adapted from

Types of Reviews

Systematic Review

These types of studies employ a systematic method to analyze and synthesize the results of numerous studies. "Systematic" in this case means following a strict set of steps - as outlined by entities like PRISMA and the Institute of Medicine - so as to make the review more reproducible and less biased. Consistent, thorough documentation is also key. Reviews of this type are not meant to be conducted by an individual but rather a (small) team of researchers. Systematic reviews are widely used in the health sciences, often to find a generalized conclusion from multiple evidence-based studies. 

Meta-Analysis

A systematic method that uses statistics to analyze the data from numerous studies. The researchers combine the data from studies with similar data types and analyze them as a single, expanded dataset. Meta-analyses are a type of systematic review.

Scoping Review

A scoping review employs the systematic review methodology to explore a broader topic or question rather than a specific and answerable one, as is generally the case with a systematic review. Authors of these types of reviews seek to collect and categorize the existing literature so as to identify any gaps.

Rapid Review

Rapid reviews are systematic reviews conducted under a time constraint. Researchers make use of workarounds to complete the review quickly (e.g., only looking at English-language publications), which can lead to a less thorough and more biased review. 

Narrative Review

A traditional literature review that summarizes and synthesizes the findings of numerous original research articles. The purpose and scope of narrative literature reviews vary widely and do not follow a set protocol. Most literature reviews are narrative reviews. 

Umbrella Review

Umbrella reviews are, essentially, systematic reviews of systematic reviews. These compile evidence from multiple review studies into one usable document. 

Grant, Maria J., and Andrew Booth. “A Typology of Reviews: An Analysis of 14 Review Types and Associated Methodologies.” Health Information & Libraries Journal , vol. 26, no. 2, 2009, pp. 91-108. doi: 10.1111/j.1471-1842.2009.00848.x .

Literature Reviews: Writing the Review

Outline of Review Sections


Your Literature Review should not be a summary and evaluation of each article, one after the other. Your sources should be integrated together to create a narrative on your topic.

Consider the following ways to organize your review:

  • By themes, variables, or issues
  • By varying perspectives regarding a topic of controversy
  • Chronologically, to show how the topic and research have developed over time

Use an outline to organize your sources and ideas in a logical sequence. Identify main points and subpoints, and consider the flow of your review. Outlines can be revised as your ideas develop. They help guide your readers through your ideas and show the hierarchy of your thoughts. What do your readers need to understand first? Where might certain studies fit most naturally? These are the kinds of questions that an outline can clarify.

An example outline for a Literature Review might look like this:

Introduction

  • Background information on the topic & definitions
  • Purpose of the literature review
  • Scope and limitations of the review (what is included/excluded)
  • Historical background
  • Overview of the existing research on the topic
  • Principal question being asked

Body

  • Organization of the literature into categories or themes
  • Evaluation of the strengths and weaknesses of each study
  • Combining the findings from multiple sources to identify patterns and trends
  • Insight into the relationship between your central topic and a larger area of study
  • Development of a new research question or hypothesis

Conclusion

  • Summary of the key points and findings in the literature
  • Discussion of gaps in the existing knowledge
  • Implications for future research

Strategies for Writing

Annotated Bibliography

An annotated bibliography collects short descriptions of each source in one place. After you have read each source carefully, set aside some time to write a brief summary. Your summary might be simply informative (e.g. identify the main argument/hypothesis, methods, major findings, and/or conclusions), or it might be evaluative (e.g. state why the source is interesting or useful for your review, or why it is not).

This method is more narrative than the Literature Matrix talked about on the Documenting Your Search page.

Taking the time to write short informative and/or evaluative summaries of your sources while you are researching can help you transition into the drafting stage later on. By making a record of your sources’ contents and your reactions to them, you make it less likely that you will need to go back and re-read many sources while drafting, and you might also start to gain a clearer idea of the overarching shape of your review.

READ EXTANT LIT REVIEWS CLOSELY

As you conduct your research, you will likely read many sources that model the same kind of literature review that you are researching and writing. While your original intent in reading those sources is likely to learn from the studies’ content (e.g. their results and discussion), it will benefit you to re-read these articles rhetorically.

Reading rhetorically means paying attention to how a text is written—how it has been structured, how it presents its claims and analyses, how it employs transitional words and phrases to move from one idea to the next. You might also pay attention to an author’s stylistic choices, like the use of first-person pronouns, active and passive voice, or technical terminology.

See  Finding Example Literature Reviews on the Developing a Research Question page for tips on finding reviews relevant to your topic.

MIND-MAPPING

Creating a mind-map is a form of brainstorming that lets you visualize how your ideas function and relate. Draw the diagram freehand or download software that lets you easily manipulate and group text, images, and shapes (Coggle, FreeMind, MindMaple).

Write down a central idea, then identify associated concepts, features, or questions around that idea. Make lines attaching various ideas, or arrows to signify directional relationships. Use different shapes, sizes, or colors to indicate commonalities, sequences, or relative importance.


This drafting technique allows you to generate ideas while thinking visually about how they function together. As you follow lines of thought, you can see which ideas can be connected, where certain pathways lead, and what the scope of your project might be. By drawing out a mind-map you may be able to see what elements of your review are underdeveloped and will benefit from more focused attention.


Attribution.

Thanks to Librarian Jamie Niehof at the University of Michigan for providing permission to reuse and remix this Literature Reviews guide.

Avoiding Bias

Reporting bias.

This occurs when you are summarizing the literature in an unbalanced, inconsistent, or distorted way.

Ways to avoid:

  • look for literature that supports multiple perspectives, viewpoints, or theories
  • ask multiple people to review your writing for bias

Last Updated: Apr 9, 2024 3:50 PM
URL: https://info.library.okstate.edu/literaturereviews



Compare and Contrast: Research Paper vs. Literature Review

Research papers and literature reviews share some similarities in structure and in the types of evidence they use, but they are distinct genres of academic writing, and understanding their differences matters for effective communication in an academic setting. This article explores the two forms by highlighting their common features and illustrating the essential distinctions between them. It also offers guidelines on how best to incorporate either type into a scholarly project, so that readers can choose the form that fits a particular goal or purpose.

I. Introduction

II. Overview of Research Papers and Literature Reviews
III. Similarities Between Research Papers and Literature Reviews
IV. Differences Between Research Papers and Literature Reviews
V. Assessing Sources for a Research Paper vs. a Literature Review
VI. Structuring an Academic Essay: Argumentative Structure in the Context of Compare & Contrast Writing
VII. Conclusion

The importance of properly introducing your research paper cannot be overstated. The introduction serves as the reader’s first impression of your work, setting up the context and providing necessary background information. It is also where you establish yourself as an authoritative source on your topic by discussing prior literature.

Research Paper vs Literature Review : When researching a particular issue or topic, it can often become overwhelming trying to decide whether you should conduct a research paper or literature review. Research papers involve conducting in-depth studies into one specific subject, while literature reviews examine existing published works related to that same subject matter. A research paper might propose new theories or ideas that could be tested further through empirical evidence; conversely, a literature review provides synthesis between previously established concepts.

Exploring the Literature

As researchers, it is our responsibility to explore all pertinent literature in order to contextualize our research. To this end, we will be taking a closer look at both the research paper and its accompanying literature review.

  • A research paper, simply put, presents original findings from an empirical investigation or scholarly exploration.
  • A literature review, on the other hand, while it may incorporate some of one's own ideas and interpretations of existing material, ultimately provides an overview of already established work, synthesizing past published material into summaries that can inform future studies.

When it comes to academic research, there are a number of similarities between research papers and literature reviews. Both documents require an in-depth analysis of their respective topics, incorporating evidence from multiple sources into the text.

  • Research Paper vs Literature Review
  • Both types of document contain information about a particular topic or issue.
  • In both cases, this information is sourced from reliable materials such as journals, books and articles.

However, while they share these common characteristics, there are some key differences between the two. Research papers focus on providing new insights based on existing theories, whereas literature reviews analyse what has already been established. It is therefore vital to understand which sort of assignment is required before beginning the task.

Research papers and literature reviews are two of the most important elements in academic writing. Both require extensive research, critical thinking, and well-crafted arguments. However, there are some key differences between these two types of scholarly documents.

  • Focus on original research or analysis – Research papers typically present an argument based upon a student’s own findings from primary sources such as interviews, surveys, experiments etc. Students must demonstrate their understanding of existing knowledge by researching the topic thoroughly and analyzing relevant evidence to make new discoveries.
  • Provide a narrow focus – Research papers typically cover one specific aspect or angle related to a larger subject area. This allows students to explore that particular angle in depth while providing comprehensive information within limited word counts.

When it comes to assessing sources for a research paper and a literature review, there are some similarities but also many differences. It is important that students be aware of these distinctions when preparing either type of assignment.

  • Research Paper: Research papers require the student to evaluate source materials in terms of their relevance and accuracy. Sources must be evaluated carefully for bias or misinterpretation as well as for any gaps in information which may need to be addressed with additional resources.


  • Literature Review: Literature reviews involve looking at published works from several different angles; focusing not only on content, but also author perspective, impact within its field, possible implications and connections between ideas presented within one work or multiple works over time. Students should read through each source critically noting both positive aspects (such as clarity of presentation) and any concerns they have about reliability.

This article has provided an overview of the differences between research papers and literature reviews. In conclusion, although both are valuable forms of academic writing, they serve different purposes and require distinct levels of research and evidence gathering. They share some fundamental elements, which can be adapted to the specific requirements of each assignment. When tasked with writing either a research paper or a literature review, it is therefore important to understand which form best suits one's aims and the purpose the piece of writing must fulfill before embarking upon the project.

University of Tasmania, Australia

Literature reviews.

  • Introduction: who will benefit from this guide?
  • Getting started: what is a literature review?
  • How to develop a researchable question
  • How to find the literature
  • How to manage the reading and take notes that make sense
  • How to bring it all together: examples, templates, links, guides

Who will benefit from this guide?

This guide is written for undergraduate and postgraduate coursework students who are doing their first literature review.

Higher degree research candidates and academic researchers, please also refer to the Resources for Researchers library guides for more detailed information on writing theses and systematic reviews. 

What is a literature review?

A literature review is an examination of research in a particular field. 

  • It gathers, critically analyses, evaluates, and synthesises current research literature in a discipline,
  • indicates where there may be strengths, gaps,  weaknesses, and agreements in the current research.

It considers:

  • what has been done,
  • the current thinking,
  • research trends,
  • principal debates,
  • dominant ideas,
  • methods used in researching the topic,
  • gaps and flaws in the research.

  http://libguides.lib.msu.edu/c.php?g=96146&p=904793

Different Types of reviews

You may be asked to complete a literature review that is conducted in a systematic way, similar to a systematic review.

Mostly, the literature review you will be asked to do will be integrative – that is, conclusions are drawn from the literature in order to create something new, such as a new hypothesis to address a question, a solution to a complex problem, a new workplace procedure or training program.

Some elements of what you are asked to do may be like a systematic review, particularly in health fields.

A systematic approach does not mean a systematic review.

A true systematic review is a complex research project:

  • is conducted in a scientific manner,
  • usually involves more than one person,
  • takes a long time to complete,
  • and is generally a project in itself.

For more information have a look at the Systematic Review library guide .


At the core of a literature review is a synthesis of the research. 

While both analysis and synthesis are involved, synthesis goes beyond analysis and is a higher-order thinking skill (Bloom's taxonomy).

As Bloom's taxonomy illustrates, synthesis goes well beyond analysis.

  • Analysis asks you to break something down into its parts and compare and contrast them with other research findings:
      • where they agree and disagree,
      • the major themes, arguments, and ideas in a field,
      • the questions raised and those yet to be answered.
  • This will show the relationships between different aspects of the research findings in the literature.
  • A synthesis is not a summary; rather, it is organised around concepts and themes, with a combining of elements to form something new.

Watch this short clip from Utah State University, which explains how to achieve synthesis.


Last Updated: Apr 10, 2024 11:56 AM
URL: https://utas.libguides.com/literaturereviews

CBE—Life Sciences Education, Vol. 21, No. 3 (Fall 2022)

Literature Reviews, Theoretical Frameworks, and Conceptual Frameworks: An Introduction for New Biology Education Researchers

Julie A. Luft

† Department of Mathematics, Social Studies, and Science Education, Mary Frances Early College of Education, University of Georgia, Athens, GA 30602-7124

Sophia Jeong

‡ Department of Teaching & Learning, College of Education & Human Ecology, Ohio State University, Columbus, OH 43210

Robert Idsardi

§ Department of Biology, Eastern Washington University, Cheney, WA 99004

Grant Gardner

∥ Department of Biology, Middle Tennessee State University, Murfreesboro, TN 37132


To frame their work, biology education researchers need to consider the role of literature reviews, theoretical frameworks, and conceptual frameworks as critical elements of the research and writing process. However, these elements can be confusing for scholars new to education research. This Research Methods article is designed to provide an overview of each of these elements and delineate the purpose of each in the educational research process. We describe what biology education researchers should consider as they conduct literature reviews, identify theoretical frameworks, and construct conceptual frameworks. Clarifying these different components of educational research studies can be helpful to new biology education researchers and the biology education research community at large in situating their work in the broader scholarly literature.

INTRODUCTION

Discipline-based education research (DBER) involves the purposeful and situated study of teaching and learning in specific disciplinary areas ( Singer et al. , 2012 ). Studies in DBER are guided by research questions that reflect disciplines’ priorities and worldviews. Researchers can use quantitative data, qualitative data, or both to answer these research questions through a variety of methodological traditions. Across all methodologies, there are different methods associated with planning and conducting educational research studies that include the use of surveys, interviews, observations, artifacts, or instruments. Ensuring the coherence of these elements to the discipline’s perspective also involves situating the work in the broader scholarly literature. The tools for doing this include literature reviews, theoretical frameworks, and conceptual frameworks. However, the purpose and function of each of these elements is often confusing to new education researchers. The goal of this article is to introduce new biology education researchers to these three elements, which are important in DBER scholarship and the broader educational literature.

The first element we discuss is a review of research (literature reviews), which highlights the need for a specific research question, study problem, or topic of investigation. Literature reviews situate the relevance of the study within a topic and a field. The process may seem familiar to science researchers entering DBER fields, but new researchers may still struggle in conducting the review. Booth et al. (2016b) highlight some of the challenges novice education researchers face when conducting a review of literature. They point out that novice researchers struggle in deciding how to focus the review, determining the scope of articles needed in the review, and knowing how to be critical of the articles in the review. Overcoming these challenges (and others) can help novice researchers construct a sound literature review that can inform the design of the study and help ensure the work makes a contribution to the field.

The second and third highlighted elements are theoretical and conceptual frameworks. These guide biology education research (BER) studies, and may be less familiar to science researchers. These elements are important in shaping the construction of new knowledge. Theoretical frameworks offer a way to explain and interpret the studied phenomenon, while conceptual frameworks clarify assumptions about the studied phenomenon. Despite the importance of these constructs in educational research, biology educational researchers have noted the limited use of theoretical or conceptual frameworks in published work ( DeHaan, 2011 ; Dirks, 2011 ; Lo et al. , 2019 ). In reviewing articles published in CBE—Life Sciences Education ( LSE ) between 2015 and 2019, we found that fewer than 25% of the research articles had a theoretical or conceptual framework (see the Supplemental Information), and at times there was an inconsistent use of theoretical and conceptual frameworks. Clearly, these frameworks are challenging for published biology education researchers, which suggests the importance of providing some initial guidance to new biology education researchers.

Fortunately, educational researchers have increased their explicit use of these frameworks over time, and this is influencing educational research in science, technology, engineering, and mathematics (STEM) fields. For instance, a quick search for theoretical or conceptual frameworks in the abstracts of articles in Educational Research Complete (a common database for educational research) in STEM fields demonstrates a dramatic change over the last 20 years: from only 778 articles published between 2000 and 2010 to 5703 articles published between 2010 and 2020, a more than sevenfold increase. Greater recognition of the importance of these frameworks is contributing to DBER authors being more explicit about such frameworks in their studies.

Collectively, literature reviews, theoretical frameworks, and conceptual frameworks work to guide methodological decisions and the elucidation of important findings. Each offers a different perspective on the problem of study and is an essential element in all forms of educational research. As new researchers seek to learn about these elements, they will find different resources, a variety of perspectives, and many suggestions about the construction and use of these elements. The wide range of available information can overwhelm the new researcher who just wants to learn the distinction between these elements or how to craft them adequately.

Our goal in writing this paper is not to offer specific advice about how to write these sections in scholarly work. Instead, we wanted to introduce these elements to those who are new to BER and who are interested in better distinguishing one from the other. In this paper, we share the purpose of each element in BER scholarship, along with important points on its construction. We also provide references for additional resources that may be beneficial to better understanding each element. Table 1 summarizes the key distinctions among these elements.

TABLE 1. Comparison of literature reviews, theoretical frameworks, and conceptual frameworks

This article is written for the new biology education researcher who is just learning about these different elements or for scientists looking to become more involved in BER. It is a result of our own work as science education and biology education researchers, whether as graduate students and postdoctoral scholars or newly hired and established faculty members. This is the article we wish had been available as we started to learn about these elements or discussed them with new educational researchers in biology.

LITERATURE REVIEWS

Purpose of a literature review.

A literature review is foundational to any research study in education or science. In education, a well-conceptualized and well-executed review provides a summary of the research that has already been done on a specific topic and identifies questions that remain to be answered, thus illustrating the current research project’s potential contribution to the field and the reasoning behind the methodological approach selected for the study ( Maxwell, 2012 ). BER is an evolving disciplinary area that is redefining areas of conceptual emphasis as well as orientations toward teaching and learning (e.g., Labov et al. , 2010 ; American Association for the Advancement of Science, 2011 ; Nehm, 2019 ). As a result, building comprehensive, critical, purposeful, and concise literature reviews can be a challenge for new biology education researchers.

Building Literature Reviews

There are different ways to approach and construct a literature review. Booth et al. (2016a) provide an overview that includes, for example, scoping reviews, which are focused only on notable studies and use a basic method of analysis, and integrative reviews, which are the result of exhaustive literature searches across different genres. Underlying each of these different review processes is attention to the Search process, Appraisal of articles, Synthesis of the literature, and Analysis: SALSA ( Booth et al. , 2016a ). This useful acronym can help the researcher focus on the process while building a specific type of review.

However, new educational researchers often have questions about literature reviews that are foundational to SALSA or other approaches. Common questions concern determining which literature pertains to the topic of study or the role of the literature review in the design of the study. This section addresses such questions broadly while providing general guidance for writing a narrative literature review that evaluates the most pertinent studies.

The literature review process should begin before the research is conducted. As Boote and Beile (2005 , p. 3) suggested, researchers should be “scholars before researchers.” They point out that having a good working knowledge of the proposed topic helps illuminate avenues of study. Some subject areas have a deep body of work to read and reflect upon, providing a strong foundation for developing the research question(s). For instance, the teaching and learning of evolution is an area of long-standing interest in the BER community, generating many studies (e.g., Perry et al. , 2008 ; Barnes and Brownell, 2016 ) and reviews of research (e.g., Sickel and Friedrichsen, 2013 ; Ziadie and Andrews, 2018 ). Emerging areas of BER include the affective domain, issues of transfer, and metacognition ( Singer et al. , 2012 ). Many studies in these areas are transdisciplinary and not always specific to biology education (e.g., Rodrigo-Peiris et al. , 2018 ; Kolpikova et al. , 2019 ). These newer areas may require reading outside BER; fortunately, summaries of some of these topics can be found in the Current Insights section of the LSE website.

In focusing on a specific problem within a broader research strand, a new researcher will likely need to examine research outside BER. Depending upon the area of study, the expanded reading list might involve a mix of BER, DBER, and educational research studies. Determining the scope of the reading is not always straightforward. A simple way to focus one’s reading is to create a “summary phrase” or “research nugget,” which is a very brief descriptive statement about the study. It should focus on the essence of the study, for example, “first-year nonmajor students’ understanding of evolution,” “metacognitive prompts to enhance learning during biochemistry,” or “instructors’ inquiry-based instructional practices after professional development programming.” This type of phrase should help a new researcher identify two or more areas to review that pertain to the study. Focusing on recent research in the last 5 years is a good first step. Additional studies can be identified by reading relevant works referenced in those articles. It is also important to read seminal studies that are more than 5 years old. Reading a range of studies should give the researcher the necessary command of the subject in order to suggest a research question.

Given that the research question(s) arise from the literature review, the review should also substantiate the selected methodological approach. The review and research question(s) guide the researcher in determining how to collect and analyze data. Often the methodological approach used in a study is selected to contribute knowledge that expands upon what has been published previously about the topic (see Institute of Education Sciences and National Science Foundation, 2013 ). An emerging topic of study may need an exploratory approach that allows for a description of the phenomenon and development of a potential theory. This could, but not necessarily, require a methodological approach that uses interviews, observations, surveys, or other instruments. An extensively studied topic may call for the additional understanding of specific factors or variables; this type of study would be well suited to a verification or a causal research design. These could entail a methodological approach that uses valid and reliable instruments, observations, or interviews to determine an effect in the studied event. In either of these examples, the researcher(s) may use a qualitative, quantitative, or mixed methods methodological approach.

Even with a good research question, there is still more reading to be done. The complexity and focus of the research question dictates the depth and breadth of the literature to be examined. Questions that connect multiple topics can require broad literature reviews. For instance, a study that explores the impact of a biology faculty learning community on the inquiry instruction of faculty could have the following review areas: learning communities among biology faculty, inquiry instruction among biology faculty, and inquiry instruction among biology faculty as a result of professional learning. Biology education researchers need to consider whether their literature review requires studies from different disciplines within or outside DBER. For the example given, it would be fruitful to look at research focused on learning communities with faculty in STEM fields or in general education fields that result in instructional change. It is important not to be too narrow or too broad when reading. When the conclusions of articles start to sound similar or no new insights are gained, the researcher likely has a good foundation for a literature review. This level of reading should allow the researcher to demonstrate a mastery in understanding the researched topic, explain the suitability of the proposed research approach, and point to the need for the refined research question(s).

The literature review should include the researcher’s evaluation and critique of the selected studies. A researcher may have a large collection of studies, but not all of the studies will follow standards important in the reporting of empirical work in the social sciences. The American Educational Research Association ( Duran et al. , 2006 ), for example, offers a general discussion about standards for such work: an adequate review of research informing the study, the existence of sound and appropriate data collection and analysis methods, and appropriate conclusions that do not overstep or underexplore the analyzed data. The Institute of Education Sciences and National Science Foundation (2013) also offer Common Guidelines for Education Research and Development that can be used to evaluate collected studies.

Because not all journals adhere to such standards, it is important that a researcher review each study to determine the quality of published research, per the guidelines suggested earlier. In some instances, the research may be fatally flawed. Examples of such flaws include data that do not pertain to the question, a lack of discussion about the data collection, poorly constructed instruments, or an inadequate analysis. These types of errors result in studies that are incomplete, error-laden, or inaccurate and should be excluded from the review. Most studies have limitations, and the author(s) often make them explicit. For instance, there may be an instructor effect, recognized bias in the analysis, or issues with the sample population. Limitations are usually addressed by the research team in some way to ensure a sound and acceptable research process. Occasionally, the limitations associated with the study can be significant and not addressed adequately, which leaves a consequential decision in the hands of the researcher. Providing critiques of studies in the literature review process gives the reader confidence that the researcher has carefully examined relevant work in preparation for the study and, ultimately, the manuscript.

A solid literature review clearly anchors the proposed study in the field and connects the research question(s), the methodological approach, and the discussion. Reviewing extant research leads to research questions that will contribute to what is known in the field. By summarizing what is known, the literature review points to what needs to be known, which in turn guides decisions about methodology. Finally, notable findings of the new study are discussed in reference to those described in the literature review.

Within published BER studies, literature reviews can be placed in different locations in an article. When included in the introductory section of the study, the first few paragraphs of the manuscript set the stage, with the literature review following the opening paragraphs. Cooper et al. (2019) illustrate this approach in their study of course-based undergraduate research experiences (CUREs). An introduction discussing the potential of CUREs is followed by an analysis of the existing literature relevant to the design of CUREs that allows for novel student discoveries. Within this review, the authors point out contradictory findings among research on novel student discoveries. This clarifies the need for their study, which is described and highlighted through specific research aims.

A literature review can also make up a separate section in a paper. For example, the introduction to Todd et al. (2019) illustrates the need for their research topic by highlighting the potential of learning progressions (LPs) and suggesting that LPs may help mitigate learning loss in genetics. At the end of the introduction, the authors state their specific research questions. The review of literature following this opening section comprises two subsections. One focuses on learning loss in general and examines a variety of studies and meta-analyses from the disciplines of medical education, mathematics, and reading. The second section focuses specifically on LPs in genetics and highlights student learning in the midst of LPs. These separate reviews provide insights into the stated research question.

Suggestions and Advice

A well-conceptualized, comprehensive, and critical literature review reveals the understanding of the topic that the researcher brings to the study. Literature reviews should not be so big that there is no clear area of focus; nor should they be so narrow that no real research question arises. The task for a researcher is to craft an efficient literature review that offers a critical analysis of published work, articulates the need for the study, guides the methodological approach to the topic of study, and provides an adequate foundation for the discussion of the findings.

In our own writing of literature reviews, there are often many drafts. An early draft may seem well suited to the study because the need for and approach to the study are well described. However, as the results of the study are analyzed and findings begin to emerge, the existing literature review may be inadequate and need revision. The need for an expanded discussion about the research area can result in the inclusion of new studies that support the explanation of a potential finding. The literature review may also prove to be too broad. Refocusing on a specific area allows for more contemplation of a finding.

It should be noted that there are different types of literature reviews, and many books and articles have been written about the different ways to embark on these types of reviews. Among these different resources, the following may be helpful in considering how to refine the review process for scholarly journals:

  • Booth, A., Sutton, A., & Papaioannou, D. (2016a). Systematic approaches to a successful literature review (2nd ed.). Los Angeles, CA: Sage. This book addresses different types of literature reviews and offers important suggestions pertaining to defining the scope of the literature review and assessing extant studies.
  • Booth, W. C., Colomb, G. G., Williams, J. M., Bizup, J., & Fitzgerald, W. T. (2016b). The craft of research (4th ed.). Chicago: University of Chicago Press. This book can help the novice consider how to make the case for an area of study. While this book is not specifically about literature reviews, it offers suggestions about making the case for your study.
  • Galvan, J. L., & Galvan, M. C. (2017). Writing literature reviews: A guide for students of the social and behavioral sciences (7th ed.). Routledge. This book offers guidance on writing different types of literature reviews. For the novice researcher, there are useful suggestions for creating coherent literature reviews.

THEORETICAL FRAMEWORKS

Purpose of Theoretical Frameworks

As new education researchers may be less familiar with theoretical frameworks than with literature reviews, this discussion begins with an analogy. Envision a biologist, chemist, and physicist examining together the dramatic effect of a fog tsunami over the ocean. A biologist gazing at this phenomenon may be concerned with the effect of fog on various species. A chemist may be interested in the chemical composition of the fog as water vapor condenses around bits of salt. A physicist may be focused on the refraction of light to make fog appear to be “sitting” above the ocean. While observing the same “objective event,” the scientists are operating under different theoretical frameworks that provide a particular perspective or “lens” for the interpretation of the phenomenon. Each of these scientists brings specialized knowledge, experiences, and values to this phenomenon, and these influence the interpretation of the phenomenon. The scientists’ theoretical frameworks influence how they design and carry out their studies and interpret their data.

Within an educational study, a theoretical framework helps to explain a phenomenon through a particular lens and challenges and extends existing knowledge within the limitations of that lens. Theoretical frameworks are explicitly stated by an educational researcher in the paper’s framework, theory, or relevant literature section. The framework shapes the types of questions asked, guides the method by which data are collected and analyzed, and informs the discussion of the results of the study. It also reveals the researcher’s subjectivities, for example, values, social experience, and viewpoint (Allen, 2017). It is essential that a novice researcher learn to explicitly state a theoretical framework, because all research questions are asked from the researcher’s implicit or explicit assumptions about a phenomenon of interest (Schwandt, 2000).

Selecting Theoretical Frameworks

Theoretical frameworks are one of the most contemplated elements in our work in educational research. In this section, we share three important considerations for new scholars selecting a theoretical framework.

The first step in identifying a theoretical framework involves reflecting on the phenomenon within the study and the assumptions aligned with that phenomenon. The phenomenon is the event under study; there are many possibilities, for example, student learning, instructional approach, or group organization. A researcher holds assumptions about how the phenomenon will be affected, influenced, changed, or portrayed. It is ultimately the researcher’s assumption(s) about the phenomenon that aligns with a theoretical framework. An example can help illustrate how a researcher’s reflection on the phenomenon and acknowledgment of assumptions can result in the identification of a theoretical framework.

In our example, a biology education researcher may be interested in exploring how students’ learning of difficult biological concepts can be supported by the interactions of group members. The phenomenon of interest is the interactions among the peers, and the researcher assumes that more knowledgeable students are important in supporting the learning of the group. As a result, the researcher may draw on Vygotsky’s (1978) sociocultural theory of learning and development that is focused on the phenomenon of student learning in a social setting. This theory posits the critical nature of interactions among students and between students and teachers in the process of building knowledge. A researcher drawing upon this framework holds the assumption that learning is a dynamic social process involving questions and explanations among students in the classroom and that more knowledgeable peers play an important part in the process of building conceptual knowledge.

It is important to state at this point that there are many different theoretical frameworks. Some frameworks focus on learning and knowing, while others focus on equity, empowerment, or discourse. Some frameworks are well articulated, and others are still being refined. For a new researcher, it can be challenging to find a theoretical framework. One of the best ways to look for theoretical frameworks is through published works that highlight different frameworks.

When a theoretical framework is selected, it should clearly connect to all parts of the study. The framework should augment the study by adding a perspective that provides greater insights into the phenomenon. It should clearly align with the studies described in the literature review. For instance, a framework focused on learning would correspond to research that reported different learning outcomes for similar studies. The methods for data collection and analysis should also correspond to the framework. For instance, a study about instructional interventions could use a theoretical framework concerned with learning and could collect data about the effect of the intervention on what is learned. When the data are analyzed, the theoretical framework should provide added meaning to the findings, and the findings should align with the theoretical framework.

A study by Jensen and Lawson (2011) provides an example of how a theoretical framework connects different parts of the study. They compared undergraduate biology students in heterogeneous and homogeneous groups over the course of a semester. Jensen and Lawson (2011) assumed that learning involved collaboration and more knowledgeable peers, which made Vygotsky’s (1978) theory a good fit for their study. They predicted that students in heterogeneous groups would experience greater improvement in their reasoning abilities and science achievement, with much of the learning guided by the more knowledgeable peers.

In the enactment of the study, they collected data about the instruction in traditional and inquiry-oriented classes, while the students worked in homogeneous or heterogeneous groups. To determine the effect of working in groups, the authors also measured students’ reasoning abilities and achievement. Each data-collection and analysis decision connected to understanding the influence of collaborative work.

Their findings highlighted aspects of Vygotsky’s (1978) theory of learning. One finding, for instance, was that inquiry instruction, as a whole, resulted in reasoning and achievement gains. This links to Vygotsky (1978), because inquiry instruction involves interactions among group members. A more nuanced finding was that group composition had a conditional effect. Heterogeneous groups performed better with more traditional and didactic instruction, regardless of the reasoning ability of the group members. Homogeneous groups worked better during interaction-rich activities for students with low reasoning ability. The authors attributed the variation to the different types of helping behaviors of students. High-performing students provided the answers, while students with low reasoning ability had to work collectively through the material. In terms of Vygotsky (1978), this finding provided new insights into the learning context in which productive interactions can occur for students.

Another consideration in the selection and use of a theoretical framework pertains to its orientation to the study. This can result in the theoretical framework prioritizing individuals, institutions, and/or policies (Anfara and Mertz, 2014). Frameworks that connect to individuals, for instance, could contribute to understanding their actions, learning, or knowledge. Institutional frameworks, on the other hand, offer insights into how institutions, organizations, or groups can influence individuals or materials. Policy theories provide ways to understand how national or local policies can dictate an emphasis on outcomes or instructional design. These different types of frameworks highlight different aspects in an educational setting, which influences the design of the study and the collection of data. In addition, these different frameworks offer a way to make sense of the data. Aligning the data collection and analysis with the framework ensures that a study is coherent and can contribute to the field.

New understandings emerge when different theoretical frameworks are used. For instance, Ebert-May et al. (2015) prioritized the individual level within conceptual change theory (see Posner et al., 1982). In this theory, an individual’s knowledge changes when it no longer fits the phenomenon. Ebert-May et al. (2015) designed a professional development program challenging biology postdoctoral scholars’ existing conceptions of teaching. The authors reported that the biology postdoctoral scholars’ teaching practices became more student-centered as they were challenged to explain their instructional decision making. According to the theory, the biology postdoctoral scholars’ dissatisfaction with their descriptions of teaching and learning initiated change in their knowledge and instruction. These results reveal how conceptual change theory can explain the learning of participants and guide the design of professional development programming.

The communities of practice (CoP) theoretical framework (Lave, 1988; Wenger, 1998) prioritizes the institutional level, suggesting that learning occurs when individuals learn from and contribute to the communities in which they reside. Grounded in the assumption of community learning, the literature on CoP suggests that, as individuals interact regularly with the other members of their group, they learn about the rules, roles, and goals of the community (Allee, 2000). A study conducted by Gehrke and Kezar (2017) used the CoP framework to understand organizational change by examining the involvement of individual faculty engaged in a cross-institutional CoP focused on changing the instructional practice of faculty at each institution. In the CoP, faculty members were involved in enhancing instructional materials within their department, which aligned with an overarching goal of instituting instruction that embraced active learning. Not surprisingly, Gehrke and Kezar (2017) revealed that faculty who perceived the community culture as important in their work cultivated institutional change. Furthermore, they found that institutional change was sustained when key leaders served as mentors and provided support for faculty, and as faculty themselves developed into leaders. This study reveals the complexity of individual roles in a CoP in supporting institutional instructional change.

It is important to explicitly state the theoretical framework used in a study, but elucidating a theoretical framework can be challenging for a new educational researcher. The literature review can help to identify an applicable theoretical framework. Focal areas of the review or central terms often connect to assumptions and assertions associated with the framework that pertain to the phenomenon of interest. Another way to identify a theoretical framework is self-reflection by the researcher on personal beliefs and understandings about the nature of knowledge the researcher brings to the study (Lysaght, 2011). In stating one’s beliefs and understandings related to the study (e.g., students construct their knowledge, instructional materials support learning), an orientation becomes evident that will suggest a particular theoretical framework. Theoretical frameworks are not arbitrary but purposefully selected.

With experience, a researcher may find expanded roles for theoretical frameworks. Researchers may revise an existing framework that has limited explanatory power, or they may decide there is a need to develop a new theoretical framework. These frameworks can emerge from a current study or the need to explain a phenomenon in a new way. Researchers may also find that multiple theoretical frameworks are necessary to frame and explore a problem, as different frameworks can provide different insights into a problem.

Finally, it is important to recognize that choosing “x” theoretical framework does not necessarily mean a researcher chooses “y” methodology and so on, nor is there a clear-cut, linear process in selecting a theoretical framework for one’s study. In part, the nonlinear process of identifying a theoretical framework is what makes understanding and using theoretical frameworks challenging. For the novice scholar, contemplating and understanding theoretical frameworks is essential. Fortunately, there are articles and books that can help:

  • Creswell, J. W. (2018). Research design: Qualitative, quantitative, and mixed methods approaches (5th ed.). Los Angeles, CA: Sage. This book provides an overview of theoretical frameworks in general educational research.
  • Ding, L. (2019). Theoretical perspectives of quantitative physics education research. Physical Review Physics Education Research , 15 (2), 020101-1–020101-13. This paper illustrates how a DBER field can use theoretical frameworks.
  • Nehm, R. (2019). Biology education research: Building integrative frameworks for teaching and learning about living systems. Disciplinary and Interdisciplinary Science Education Research , 1 , ar15. https://doi.org/10.1186/s43031-019-0017-6 . This paper articulates the need for studies in BER to explicitly state theoretical frameworks and provides examples of potential studies.
  • Patton, M. Q. (2015). Qualitative research & evaluation methods: Integrating theory and practice . Sage. This book also provides an overview of theoretical frameworks, but for both research and evaluation.

CONCEPTUAL FRAMEWORKS

Purpose of a Conceptual Framework

A conceptual framework is a description of the way a researcher understands the factors and/or variables that are involved in the study and their relationships to one another. The purpose of a conceptual framework is to articulate the concepts under study using relevant literature (Rocco and Plakhotnik, 2009) and to clarify the presumed relationships among those concepts (Rocco and Plakhotnik, 2009; Anfara and Mertz, 2014). Conceptual frameworks are different from theoretical frameworks in both their breadth and grounding in established findings. Whereas a theoretical framework articulates the lens through which a researcher views the work, the conceptual framework is often more mechanistic and malleable.

Conceptual frameworks are broader, encompassing both established theories (i.e., theoretical frameworks) and the researchers’ own emergent ideas. Emergent ideas, for example, may be rooted in informal and/or unpublished observations from experience. These emergent ideas would not be considered a “theory” if they are not yet tested, supported by systematically collected evidence, and peer reviewed. However, they do still play an important role in the way researchers approach their studies. The conceptual framework allows authors to clearly describe their emergent ideas so that connections among ideas in the study and the significance of the study are apparent to readers.

Constructing Conceptual Frameworks

Including a conceptual framework in a research study is important, but researchers often opt to include either a conceptual or a theoretical framework. Either may be adequate, but including both provides greater insight into the research approach. For instance, a research team plans to test a novel component of an existing theory. In their study, they describe the existing theoretical framework that informs their work and then present their own conceptual framework. Within this conceptual framework, specific topics portray emergent ideas that are related to the theory. Describing both frameworks allows readers to better understand the researchers’ assumptions, orientations, and understanding of the concepts being investigated. For example, Connolly et al. (2018) included a conceptual framework that described how they applied a theoretical framework of social cognitive career theory (SCCT) to their study on teaching programs for doctoral students. In their conceptual framework, the authors described SCCT, explained how it applied to the investigation, and drew upon results from previous studies to justify the proposed connections between the theory and their emergent ideas.

In some cases, authors may be able to sufficiently describe their conceptualization of the phenomenon under study in an introduction alone, without a separate conceptual framework section. However, incomplete descriptions of how the researchers conceptualize the components of the study may limit the significance of the study by making the research less intelligible to readers. This is especially problematic when studying topics in which researchers use the same terms for different constructs or different terms for similar and overlapping constructs (e.g., inquiry, teacher beliefs, pedagogical content knowledge, or active learning). Authors must describe their conceptualization of a construct if the research is to be understandable and useful.

There are some key areas to consider regarding the inclusion of a conceptual framework in a study. To begin with, it is important to recognize that conceptual frameworks are constructed by the researchers conducting the study (Rocco and Plakhotnik, 2009; Maxwell, 2012). This is different from theoretical frameworks that are often taken from established literature. Researchers should bring together ideas from the literature, but they may be influenced by their own experiences as a student and/or instructor, the shared experiences of others, or thought experiments as they construct a description, model, or representation of their understanding of the phenomenon under study. This is an exercise in intellectual organization and clarity that often considers what is learned, known, and experienced. The conceptual framework makes these constructs explicitly visible to readers, who may have different understandings of the phenomenon based on their prior knowledge and experience. There is no single method to go about this intellectual work.

Reeves et al. (2016) is an example of an article that proposed a conceptual framework about graduate teaching assistant professional development evaluation and research. The authors used existing literature to create a novel framework that filled a gap in current research and practice related to the training of graduate teaching assistants. This conceptual framework can guide the systematic collection of data by other researchers because the framework describes the relationships among various factors that influence teaching and learning. The Reeves et al. (2016) conceptual framework may be modified as additional data are collected and analyzed by other researchers. This is not uncommon, as conceptual frameworks can serve as catalysts for concerted research efforts that systematically explore a phenomenon (e.g., Reynolds et al., 2012; Brownell and Kloser, 2015).

Sabel et al. (2017) used a conceptual framework in their exploration of how scaffolds, an external factor, interact with internal factors to support student learning. Their conceptual framework integrated principles from two theoretical frameworks, self-regulated learning and metacognition, to illustrate how the research team conceptualized students’ use of scaffolds in their learning (Figure 1). Sabel et al. (2017) created this model using their interpretations of these two frameworks in the context of their teaching.

Figure 1. Conceptual framework from Sabel et al. (2017).

A conceptual framework should describe the relationship among components of the investigation (Anfara and Mertz, 2014). These relationships should guide the researcher’s methods of approaching the study (Miles et al., 2014) and inform both the data to be collected and how those data should be analyzed. Explicitly describing the connections among the ideas allows the researcher to justify the importance of the study and the rigor of the research design. Just as importantly, these frameworks help readers understand why certain components of a system were not explored in the study. This is a challenge in education research, which is rooted in complex environments with many variables that are difficult to control.

For example, Sabel et al. (2017) stated: “Scaffolds, such as enhanced answer keys and reflection questions, can help students and instructors bridge the external and internal factors and support learning” (p. 3). They connected the scaffolds in the study to the three dimensions of metacognition and the eventual transformation of existing ideas into new or revised ideas. Their framework provides a rationale for focusing on how students use two different scaffolds, and not on other factors that may influence a student’s success (self-efficacy, use of active learning, exam format, etc.).

In constructing conceptual frameworks, researchers should address needed areas of study and/or contradictions discovered in literature reviews. By attending to these areas, researchers can strengthen their arguments for the importance of a study. For instance, conceptual frameworks can address how the current study will fill gaps in the research, resolve contradictions in existing literature, or suggest a new area of study. While a literature review describes what is known and not known about the phenomenon, the conceptual framework leverages these gaps in describing the current study (Maxwell, 2012). In the example of Sabel et al. (2017), the authors indicated there was a gap in the literature regarding how scaffolds engage students in metacognition to promote learning in large classes. Their study helps fill that gap by describing how scaffolds can support students in the three dimensions of metacognition: intelligibility, plausibility, and wide applicability. In another example, Lane (2016) integrated research from science identity, the ethic of care, the sense of belonging, and an expertise model of student success to form a conceptual framework that addressed the critiques of other frameworks. In a more recent example, Sbeglia et al. (2021) illustrated how a conceptual framework influences the methodological choices and inferences in studies by educational researchers.

Sometimes researchers draw upon the conceptual frameworks of other researchers. When a researcher’s conceptual framework closely aligns with an existing framework, the discussion may be brief. For example, Ghee et al. (2016) referred to portions of SCCT as their conceptual framework to explain the significance of their work on students’ self-efficacy and career interests. Because the authors’ conceptualization of this phenomenon aligned with a previously described framework, they briefly mentioned the conceptual framework and provided additional citations offering more detail for readers.

Within both the BER and the broader DBER communities, conceptual frameworks have been used to describe different constructs. For example, some researchers have used the term “conceptual framework” to describe students’ conceptual understandings of a biological phenomenon. This is distinct from a researcher’s conceptual framework of the educational phenomenon under investigation, which may also need to be explicitly described in the article. Other studies have presented a research logic model or flowchart of the research design as a conceptual framework. These constructions can be quite valuable in helping readers understand the data-collection and analysis process. However, a model depicting the study design does not serve the same role as a conceptual framework. Researchers should avoid conflating these constructs by differentiating, when applicable, the conceptual framework that guides the study from the research design.

Explicitly describing conceptual frameworks is essential in depicting the focus of the study. We have found that being explicit in a conceptual framework means using accepted terminology, referencing prior work, and clearly noting connections between terms. This description can also highlight gaps in the literature or suggest potential contributions to the field of study. A well-elucidated conceptual framework can suggest additional studies that may be warranted. This can also spur other researchers to consider how they would approach the examination of a phenomenon and could result in a revised conceptual framework.

It can be challenging to create conceptual frameworks, but they are important. Below are two resources that could be helpful in constructing and presenting conceptual frameworks in educational research:

  • Maxwell, J. A. (2012). Qualitative research design: An interactive approach (3rd ed.). Los Angeles, CA: Sage. Chapter 3 in this book describes how to construct conceptual frameworks.
  • Ravitch, S. M., & Riggan, M. (2016). Reason & rigor: How conceptual frameworks guide research . Los Angeles, CA: Sage. This book explains how conceptual frameworks guide the research questions, data collection, data analyses, and interpretation of results.

CONCLUDING THOUGHTS

Literature reviews, theoretical frameworks, and conceptual frameworks are all important in DBER and BER. Robust literature reviews reinforce the importance of a study. Theoretical frameworks connect the study to the base of knowledge in educational theory and specify the researcher’s assumptions. Conceptual frameworks allow researchers to explicitly describe their conceptualization of the relationships among the components of the phenomenon under study. Table 1 provides a general overview of these components in order to assist biology education researchers in thinking about these elements.

It is important to emphasize that these different elements are intertwined. When these elements are aligned and complement one another, the study is coherent, and the study findings contribute to knowledge in the field. When literature reviews, theoretical frameworks, and conceptual frameworks are disconnected from one another, the study suffers. The point of the study is lost, suggested findings are unsupported, or important conclusions are invisible to the researcher. In addition, this misalignment may be costly in terms of time and money.

Conducting a literature review, selecting a theoretical framework, and building a conceptual framework are some of the most difficult elements of a research study. It takes time to understand the relevant research, identify a theoretical framework that provides important insights into the study, and formulate a conceptual framework that organizes the findings. In the research process, there is often a constant back and forth among these elements as the study evolves. With an ongoing refinement of the review of literature, clarification of the theoretical framework, and articulation of a conceptual framework, a sound study can emerge that makes a contribution to the field. This is the goal of BER and education research.

REFERENCES

  • Allee, V. (2000). Knowledge networks and communities of learning. OD Practitioner, 32(4), 4–13.
  • Allen, M. (2017). The Sage encyclopedia of communication research methods (Vols. 1–4). Los Angeles, CA: Sage. https://doi.org/10.4135/9781483381411
  • American Association for the Advancement of Science. (2011). Vision and change in undergraduate biology education: A call to action. Washington, DC.
  • Anfara, V. A., Mertz, N. T. (2014). Setting the stage. In Anfara, V. A., Mertz, N. T. (Eds.), Theoretical frameworks in qualitative research (pp. 1–22). Sage.
  • Barnes, M. E., Brownell, S. E. (2016). Practices and perspectives of college instructors on addressing religious beliefs when teaching evolution. CBE—Life Sciences Education, 15(2), ar18. https://doi.org/10.1187/cbe.15-11-0243
  • Boote, D. N., Beile, P. (2005). Scholars before researchers: On the centrality of the dissertation literature review in research preparation. Educational Researcher, 34(6), 3–15. https://doi.org/10.3102/0013189x034006003
  • Booth, A., Sutton, A., Papaioannou, D. (2016a). Systematic approaches to a successful literature review (2nd ed.). Los Angeles, CA: Sage.
  • Booth, W. C., Colomb, G. G., Williams, J. M., Bizup, J., Fitzgerald, W. T. (2016b). The craft of research (4th ed.). Chicago, IL: University of Chicago Press.
  • Brownell, S. E., Kloser, M. J. (2015). Toward a conceptual framework for measuring the effectiveness of course-based undergraduate research experiences in undergraduate biology. Studies in Higher Education, 40(3), 525–544. https://doi.org/10.1080/03075079.2015.1004234
  • Connolly, M. R., Lee, Y. G., Savoy, J. N. (2018). The effects of doctoral teaching development on early-career STEM scholars’ college teaching self-efficacy. CBE—Life Sciences Education, 17(1), ar14. https://doi.org/10.1187/cbe.17-02-0039
  • Cooper, K. M., Blattman, J. N., Hendrix, T., Brownell, S. E. (2019). The impact of broadly relevant novel discoveries on student project ownership in a traditional lab course turned CURE. CBE—Life Sciences Education, 18(4), ar57. https://doi.org/10.1187/cbe.19-06-0113
  • Creswell, J. W. (2018). Research design: Qualitative, quantitative, and mixed methods approaches (5th ed.). Los Angeles, CA: Sage.
  • DeHaan, R. L. (2011). Education research in the biological sciences: A nine decade review (Paper commissioned by the NAS/NRC Committee on the Status, Contributions, and Future Directions of Discipline Based Education Research). Washington, DC: National Academies Press. Retrieved May 20, 2022, from www7.nationalacademies.org/bose/DBER_Meeting2_commissioned_papers_page.html
  • Ding, L. (2019). Theoretical perspectives of quantitative physics education research. Physical Review Physics Education Research, 15(2), 020101.
  • Dirks, C. (2011). The current status and future direction of biology education research. Paper presented at: Second Committee Meeting on the Status, Contributions, and Future Directions of Discipline-Based Education Research, 18–19 October (Washington, DC). Retrieved May 20, 2022, from http://sites.nationalacademies.org/DBASSE/BOSE/DBASSE_071087
  • Duran, R. P., Eisenhart, M. A., Erickson, F. D., Grant, C. A., Green, J. L., Hedges, L. V., Schneider, B. L. (2006). Standards for reporting on empirical social science research in AERA publications: American Educational Research Association. Educational Researcher, 35(6), 33–40.
  • Ebert-May, D., Derting, T. L., Henkel, T. P., Middlemis Maher, J., Momsen, J. L., Arnold, B., Passmore, H. A. (2015). Breaking the cycle: Future faculty begin teaching with learner-centered strategies after professional development. CBE—Life Sciences Education, 14(2), ar22. https://doi.org/10.1187/cbe.14-12-0222
  • Galvan, J. L., Galvan, M. C. (2017). Writing literature reviews: A guide for students of the social and behavioral sciences (7th ed.). New York, NY: Routledge. https://doi.org/10.4324/9781315229386
  • Gehrke, S., Kezar, A. (2017). The roles of STEM faculty communities of practice in institutional and departmental reform in higher education. American Educational Research Journal, 54(5), 803–833. https://doi.org/10.3102/0002831217706736
  • Ghee, M., Keels, M., Collins, D., Neal-Spence, C., Baker, E. (2016). Fine-tuning summer research programs to promote underrepresented students’ persistence in the STEM pathway. CBE—Life Sciences Education, 15(3), ar28. https://doi.org/10.1187/cbe.16-01-0046
  • Institute of Education Sciences & National Science Foundation. (2013). Common guidelines for education research and development. Retrieved May 20, 2022, from www.nsf.gov/pubs/2013/nsf13126/nsf13126.pdf
  • Jensen, J. L., Lawson, A. (2011). Effects of collaborative group composition and inquiry instruction on reasoning gains and achievement in undergraduate biology. CBE—Life Sciences Education, 10(1), 64–73.
  • Kolpikova, E. P., Chen, D. C., Doherty, J. H. (2019). Does the format of preclass reading quizzes matter? An evaluation of traditional and gamified, adaptive preclass reading quizzes. CBE—Life Sciences Education, 18(4), ar52. https://doi.org/10.1187/cbe.19-05-0098
  • Labov, J. B., Reid, A. H., Yamamoto, K. R. (2010). Integrated biology and undergraduate science education: A new biology education for the twenty-first century? CBE—Life Sciences Education , 9 ( 1 ), 10–16. https://doi.org/10.1187/cbe.09-12-0092 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Lane, T. B. (2016). Beyond academic and social integration: Understanding the impact of a STEM enrichment program on the retention and degree attainment of underrepresented students . CBE—Life Sciences Education , 15 ( 3 ), ar39. https://doi.org/10.1187/cbe.16-01-0070 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Lave, J. (1988). Cognition in practice: Mind, mathematics and culture in everyday life . New York, NY: Cambridge University Press. [ Google Scholar ]
  • Lo, S. M., Gardner, G. E., Reid, J., Napoleon-Fanis, V., Carroll, P., Smith, E., Sato, B. K. (2019). Prevailing questions and methodologies in biology education research: A longitudinal analysis of research in CBE — Life Sciences Education and at the Society for the Advancement of Biology Education Research . CBE—Life Sciences Education , 18 ( 1 ), ar9. https://doi.org/10.1187/cbe.18-08-0164 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Lysaght, Z. (2011). Epistemological and paradigmatic ecumenism in “Pasteur’s quadrant:” Tales from doctoral research . In Official Conference Proceedings of the Third Asian Conference on Education in Osaka, Japan . Retrieved May 20, 2022, from http://iafor.org/ace2011_offprint/ACE2011_offprint_0254.pdf
  • Maxwell, J. A. (2012). Qualitative research design: An interactive approach (3rd ed.). Los Angeles, CA: Sage. [ Google Scholar ]
  • Miles, M. B., Huberman, A. M., Saldaña, J. (2014). Qualitative data analysis (3rd ed.). Los Angeles, CA: Sage. [ Google Scholar ]
  • Nehm, R. (2019). Biology education research: Building integrative frameworks for teaching and learning about living systems . Disciplinary and Interdisciplinary Science Education Research , 1 , ar15. https://doi.org/10.1186/s43031-019-0017-6 [ Google Scholar ]
  • Patton, M. Q. (2015). Qualitative research & evaluation methods: Integrating theory and practice . Los Angeles, CA: Sage. [ Google Scholar ]
  • Perry, J., Meir, E., Herron, J. C., Maruca, S., Stal, D. (2008). Evaluating two approaches to helping college students understand evolutionary trees through diagramming tasks . CBE—Life Sciences Education , 7 ( 2 ), 193–201. https://doi.org/10.1187/cbe.07-01-0007 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Posner, G. J., Strike, K. A., Hewson, P. W., Gertzog, W. A. (1982). Accommodation of a scientific conception: Toward a theory of conceptual change . Science Education , 66 ( 2 ), 211–227. [ Google Scholar ]
  • Ravitch, S. M., Riggan, M. (2016). Reason & rigor: How conceptual frameworks guide research . Los Angeles, CA: Sage. [ Google Scholar ]
  • Reeves, T. D., Marbach-Ad, G., Miller, K. R., Ridgway, J., Gardner, G. E., Schussler, E. E., Wischusen, E. W. (2016). A conceptual framework for graduate teaching assistant professional development evaluation and research . CBE—Life Sciences Education , 15 ( 2 ), es2. https://doi.org/10.1187/cbe.15-10-0225 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Reynolds, J. A., Thaiss, C., Katkin, W., Thompson, R. J. Jr. (2012). Writing-to-learn in undergraduate science education: A community-based, conceptually driven approach . CBE—Life Sciences Education , 11 ( 1 ), 17–25. https://doi.org/10.1187/cbe.11-08-0064 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Rocco, T. S., Plakhotnik, M. S. (2009). Literature reviews, conceptual frameworks, and theoretical frameworks: Terms, functions, and distinctions . Human Resource Development Review , 8 ( 1 ), 120–130. https://doi.org/10.1177/1534484309332617 [ Google Scholar ]
  • Rodrigo-Peiris, T., Xiang, L., Cassone, V. M. (2018). A low-intensity, hybrid design between a “traditional” and a “course-based” research experience yields positive outcomes for science undergraduate freshmen and shows potential for large-scale application . CBE—Life Sciences Education , 17 ( 4 ), ar53. https://doi.org/10.1187/cbe.17-11-0248 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Sabel, J. L., Dauer, J. T., Forbes, C. T. (2017). Introductory biology students’ use of enhanced answer keys and reflection questions to engage in metacognition and enhance understanding . CBE—Life Sciences Education , 16 ( 3 ), ar40. https://doi.org/10.1187/cbe.16-10-0298 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Sbeglia, G. C., Goodridge, J. A., Gordon, L. H., Nehm, R. H. (2021). Are faculty changing? How reform frameworks, sampling intensities, and instrument measures impact inferences about student-centered teaching practices . CBE—Life Sciences Education , 20 ( 3 ), ar39. https://doi.org/10.1187/cbe.20-11-0259 [ PMC free article ] [ PubMed ] [ Google Scholar ]
  • Schwandt, T. A. (2000). Three epistemological stances for qualitative inquiry: Interpretivism, hermeneutics, and social constructionism . In Denzin, N. K., Lincoln, Y. S. (Eds.), Handbook of qualitative research (2nd ed., pp. 189–213). Los Angeles, CA: Sage. [ Google Scholar ]
  • Sickel, A. J., Friedrichsen, P. (2013). Examining the evolution education literature with a focus on teachers: Major findings, goals for teacher preparation, and directions for future research . Evolution: Education and Outreach , 6 ( 1 ), 23. https://doi.org/10.1186/1936-6434-6-23 [ Google Scholar ]
  • Singer, S. R., Nielsen, N. R., Schweingruber, H. A. (2012). Discipline-based education research: Understanding and improving learning in undergraduate science and engineering . Washington, DC: National Academies Press. [ Google Scholar ]
  • Todd, A., Romine, W. L., Correa-Menendez, J. (2019). Modeling the transition from a phenotypic to genotypic conceptualization of genetics in a university-level introductory biology context . Research in Science Education , 49 ( 2 ), 569–589. https://doi.org/10.1007/s11165-017-9626-2 [ Google Scholar ]
  • Vygotsky, L. S. (1978). Mind in society: The development of higher psychological processes . Cambridge, MA: Harvard University Press. [ Google Scholar ]
  • Wenger, E. (1998). Communities of practice: Learning as a social system . Systems Thinker , 9 ( 5 ), 2–3. [ Google Scholar ]
  • Ziadie, M. A., Andrews, T. C. (2018). Moving evolution education forward: A systematic analysis of literature to identify gaps in collective knowledge for teaching . CBE—Life Sciences Education , 17 ( 1 ), ar11. https://doi.org/10.1187/cbe.17-08-0190 [ PMC free article ] [ PubMed ] [ Google Scholar ]

Literature Review vs Systematic Review



Definitions

It is common to confuse systematic reviews with literature reviews because both summarize the existing research on a specific topic. Despite this commonality, the two types of review differ significantly. The table below explains each type of review and the differences between them.

Kysh, Lynn (2013): Difference between a systematic review and a literature review [figshare]. Available at: http://dx.doi.org/10.6084/m9.figshare.766364

  • Last Updated: Dec 15, 2023 10:19 AM
  • URL: https://libguides.sjsu.edu/LitRevVSSysRev

Five tips for developing useful literature summary tables for writing review articles

Volume 24, Issue 2

  • Ahtisham Younas (http://orcid.org/0000-0003-0157-5319) 1, 2
  • Parveen Ali (http://orcid.org/0000-0002-7839-8130) 3, 4
  • 1 Memorial University of Newfoundland, St John's, Newfoundland, Canada
  • 2 Swat College of Nursing, Pakistan
  • 3 School of Nursing and Midwifery, University of Sheffield, Sheffield, South Yorkshire, UK
  • 4 Sheffield University Interpersonal Violence Research Group, Sheffield University, Sheffield, UK
  • Correspondence to Ahtisham Younas, Memorial University of Newfoundland, St John's, NL A1C 5C4, Canada; ay6133{at}mun.ca

https://doi.org/10.1136/ebnurs-2021-103417


Introduction

Literature reviews offer a critical synthesis of empirical and theoretical literature to assess the strength of evidence, develop guidelines for practice and policymaking, and identify areas for future research. 1 A review is often essential, and usually the first task, in any research endeavour, particularly in masters or doctoral level education. For effective data extraction and rigorous synthesis in reviews, literature summary tables are of utmost importance. A literature summary table provides a synopsis of an included article: it succinctly presents the article's purpose, methods, findings and other information pertinent to the review. The aim of developing these tables is to give the reader the key information at a glance. Since there are multiple types of reviews (eg, systematic, integrative, scoping, critical and mixed methods) with distinct purposes and techniques, 2 there can be various approaches to developing literature summary tables, making the task complex, especially for novice researchers or reviewers. Here, we offer five tips, relevant to all types of reviews, for creating useful and relevant literature summary tables. We also provide examples from our published reviews to illustrate how useful literature summary tables can be developed and what sort of information should be provided.

Tip 1: provide detailed information about frameworks and methods


Tabular literature summaries from a scoping review. Source: Rasheed et al. 3

The provision of information about conceptual and theoretical frameworks and methods is useful for several reasons. First, in quantitative reviews (reviews synthesising the results of quantitative studies) and mixed reviews (reviews synthesising the results of both qualitative and quantitative studies to address a mixed review question), it allows readers to assess the congruence of the core findings and methods with the adapted framework and tested assumptions. In qualitative reviews (reviews synthesising results of qualitative studies), this information helps readers recognise the underlying philosophical and paradigmatic stance of the authors of the included articles. For example, imagine the authors of an article included in a review used phenomenological inquiry for their research. In that case, the review authors and the readers of the review need to know which philosophical stance (transcendental or hermeneutic) guided the inquiry. Review authors should, therefore, include the philosophical stance in their literature summary for that article. Second, information about frameworks and methods enables review authors and readers to judge the quality of the research, which allows for discerning the strengths and limitations of the article. For example, suppose the authors of an included article intended to develop a new scale and test its psychometric properties, and to achieve this aim they used a convenience sample of 150 participants and performed exploratory (EFA) and confirmatory factor analysis (CFA) on the same sample. Such an approach would indicate a flawed methodology, because EFA and CFA should not be conducted on the same sample. The review authors must include this information in their summary table; omitting it could lead to the inclusion of a flawed article in the review, thereby jeopardising the review's rigour.

Tip 2: include strengths and limitations for each article

Critical appraisal of the individual articles included in a review is crucial for increasing the review's rigour. Although various templates exist for critical appraisal, authors often do not provide detailed information about each reviewed article's strengths and limitations. Merely noting a quality score based on standardised critical appraisal templates is not adequate, because readers should be able to identify the reasons for assigning a weak or moderate rating. Many recent critical appraisal checklists (eg, the Mixed Methods Appraisal Tool) discourage review authors from assigning a quality score and instead recommend noting the main strengths and limitations of included studies. It is also vital to report the methodological and conceptual limitations and strengths of the included articles, because not all review articles include empirical research papers; rather, some reviews synthesise the theoretical aspects of articles. Providing information about conceptual limitations is also important for readers to judge the quality of the foundations of the research. For example, if you included a mixed-methods study in the review, reporting the methodological and conceptual limitations concerning 'integration' is critical for evaluating the study's strength. Suppose the authors only collected qualitative and quantitative data and did not state the intent and timing of integration. In that case, the study is weak: integration occurred only at the level of data collection and may not have occurred at the analysis, interpretation and reporting levels.

Tip 3: write conceptual contribution of each reviewed article

While reading and evaluating review papers, we have observed that many review authors only provide the core results of each article included in a review and do not explain its conceptual contribution. By conceptual contribution we mean a description of how the article's key results contribute towards the development of potential codes, themes or subthemes, or emerging patterns that are reported as the review findings. For example, the authors of a review article noted that one of the research articles included in their review demonstrated the usefulness of case studies and reflective logs as strategies for fostering compassion in nursing students. The conceptual contribution of this research article could be that experiential learning is one way to teach compassion to nursing students, as supported by case studies and reflective logs. This conceptual contribution should be mentioned in the literature summary table. Delineating each reviewed article's conceptual contribution is particularly beneficial in qualitative reviews, mixed-methods reviews, and critical reviews that often focus on developing models and describing or explaining various phenomena. Figure 2 offers an example of a literature summary table. 4

Tabular literature summaries from a critical review. Source: Younas and Maddigan. 4

Tip 4: compose potential themes from each article during summary writing

While developing literature summary tables, many authors use the themes or subthemes reported in the included articles as the key results of their own review. Such an approach prevents the review authors from understanding each article's conceptual contribution, developing a rigorous synthesis and drawing reasonable interpretations of results from individual articles. Ultimately, it hampers the generation of novel review findings. For example, one of the articles about women's healthcare-seeking behaviours in developing countries reported a theme 'social-cultural determinants of health as precursors of delays'. Instead of using this theme as one of the review findings, the reviewers should read and interpret beyond the given description, comparing and contrasting themes and findings across articles to find similarities and differences and to understand and explain the bigger picture for their readers. Therefore, while developing literature summary tables, think twice before using predeveloped themes. Including your own themes in the summary tables (see figure 1) demonstrates to the readers that a robust method of data extraction and synthesis has been followed.

Tip 5: create your personalised template for literature summaries

Templates are often available for data extraction and the development of literature summary tables. They may take the form of a table, chart or structured framework that extracts essential information about every article, commonly the authors, purpose, methods, key results and quality scores. While extracting all relevant information is important, such templates should be tailored to the needs of the individual review. For example, for a review about the effectiveness of healthcare interventions, a literature summary table must include information about the intervention: its type, content, timing, duration, setting, effectiveness, negative consequences, and the receivers' and implementers' experiences of its usage. Similarly, literature summary tables for articles included in a meta-synthesis must include information about the participants' characteristics, research context and conceptual contribution of each reviewed article, so as to help the reader make an informed decision about the usefulness of each individual article and of the review as a whole.
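As a concrete illustration, a personalised template can be as simple as a spreadsheet whose columns mirror the tips above. The sketch below is only an assumption about what such a template might contain (the column names are illustrative, not prescribed by this article); it writes one row per reviewed study to a CSV file:

```python
import csv

# Illustrative columns reflecting Tips 1-4: frameworks and methods,
# strengths and limitations, and the reviewer's own conceptual
# contribution and candidate themes for each article. These names are
# hypothetical; tailor them to your own review question.
COLUMNS = [
    "authors_year", "purpose", "framework", "design_and_methods",
    "sample_and_setting", "key_results", "strengths", "limitations",
    "conceptual_contribution", "candidate_themes",
]

def write_summary_table(path, rows):
    """Write one row per reviewed article to a CSV summary table.

    Missing cells are left empty so partially completed summaries
    remain visible as gaps to fill during data extraction.
    """
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=COLUMNS, restval="")
        writer.writeheader()
        writer.writerows(rows)
```

A spreadsheet built this way keeps the reviewer-generated columns (conceptual contribution, candidate themes) side by side with the extracted ones, which supports the synthesis steps described in Tips 3 and 4.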

In conclusion, narrative or systematic reviews are almost always conducted as part of an educational project (thesis or dissertation) or academic or clinical research. Literature reviews are the foundation of research on a given topic, and robust, high-quality reviews play an instrumental role in guiding research, practice and policymaking. The quality of a review, however, is contingent on rigorous data extraction and synthesis, which require developing literature summaries. We have outlined five tips that can enhance the quality of the data extraction and synthesis process through useful literature summaries.



Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.

Competing interests None declared.

Patient consent for publication Not required.

Provenance and peer review Not commissioned; externally peer reviewed.


Pediaa.Com


Difference Between Literature Review and Systematic Review

Main Difference – Literature Review vs Systematic Review

Literature review and systematic review are two scholarly texts that help to introduce new knowledge to various fields. A literature review, which reviews the existing research and information on a selected study area, is a crucial element of a research study. A systematic review is also a type of literature review. The main difference between a literature review and a systematic review is their focus on the research question; a systematic review is focused on a specific research question, whereas a literature review is not.

This article highlights,

1. What is a Literature Review? – Definition, Features, Characteristics

2. What is a Systematic Review? – Definition, Features, Characteristics


What is a Literature Review

A literature review is an indispensable element of a research study. This is where the researcher shows his or her knowledge of the subject area under investigation. A literature review is a discussion of the already existing material in the subject area and thus requires a collection of published (in print or online) work concerning the selected research area. In simple terms, a literature review is a review of the literature in the related subject area.

A good literature review is a critical discussion, displaying the writer's knowledge of relevant theories and approaches and an awareness of contrasting arguments. A literature review should have the following features (Caulley, 1992):

  • Compare and contrast different researchers’ views
  • Identify areas in which researchers are in disagreement
  • Group researchers who have similar conclusions
  • Criticize the methodology
  • Highlight exemplary studies
  • Highlight gaps in research
  • Indicate the connection between your study and previous studies
  • Indicate how your study will contribute to the literature in general
  • Conclude by summarizing what the literature indicates

The structure of a literature review is similar to that of an article or essay, unlike an annotated bibliography . The information that is collected is integrated into paragraphs based on their relevance. Literature reviews help researchers to evaluate the existing literature, to identify a gap in the research area, to place their study in the existing research and identify future research.


What is a Systematic Review

A systematic review is a type of literature review that is focused on a particular research question. The main purpose of this type of review is to identify, review, and summarize the best available research on a specific research question. Systematic reviews are used mainly because reviewing existing studies is often more practical than conducting a new study. They are used mostly in the health and medical fields, but they are not rare in fields such as the social sciences and environmental science. Given below are the main stages of a systematic review:

  • Defining the research question and identifying an objective method
  • Searching for relevant data from existing research studies that meet certain criteria (research studies must be reliable and valid)
  • Extracting data from the selected studies (data such as the participants, methods, outcomes, etc.)
  • Assessing the quality of this information
  • Analyzing and combining all the data to give an overall result

Literature Review is a critical evaluation of the existing published work in a selected research area.

Systematic Review is a type of literature review that is focused on a particular research question.

Literature Review aims to review the existing literature, identify the research gap, place the research study in relation to other studies, to evaluate promising research methods, and to suggest further research.

Systematic Review aims to identify, review, and summarize the best available research on a specific research question.

Research Question

In Literature Review, a research question is formed after writing the literature review and identifying the research gap.

In Systematic Review, a research question is formed at the beginning of the systematic review.

Research Study

Literature Review is an essential component of a research study and is done at the beginning of the study.

Systematic Review is not followed by a separate research study.

Caulley, D. N. "Writing a critical review of the literature." La Trobe University: Bundoora (1992).

"Animated Storyboard: What Are Systematic Reviews?" cccrg.cochrane.org. Cochrane Consumers and Communication. Retrieved 1 June 2016.



About the Author: Hasa

Hasanthi is a seasoned content writer and editor with over 8 years of experience. Armed with a BA degree in English and a knack for digital marketing, she explores her passions for literature, history, culture, and food through her engaging and informative writing.


  • Open access
  • Published: 23 February 2021

Beta-blocker therapy in patients with COPD: a systematic literature review and meta-analysis with multiple treatment comparison

  • Claudia Gulea   ORCID: orcid.org/0000-0001-9607-5901 1 , 2 ,
  • Rosita Zakeri 3 ,
  • Vanessa Alderman 4 ,
  • Alexander Morgan 5 ,
  • Jack Ross 6 &
  • Jennifer K. Quint 1 , 2 , 7  

Respiratory Research, volume 22, Article number: 64 (2021)

Beta-blockers are associated with reduced mortality in patients with cardiovascular disease but are often under-prescribed in those with concomitant COPD, due to concerns regarding respiratory side-effects. We investigated the effects of beta-blockers on outcomes in patients with COPD and explored within-class differences between different agents.

We searched the Cochrane Central Register of Controlled Trials, Embase, Cumulative Index to Nursing and Allied Health Literature (CINAHL) and Medline for observational studies and randomized controlled trials (RCTs) investigating the effects of beta-blocker exposure versus no exposure or placebo, in patients with COPD, with and without cardiovascular indications. A meta-analysis was performed to assess the association of beta-blocker therapy with acute exacerbations of COPD (AECOPD), and a network meta-analysis was conducted to investigate the effects of individual beta-blockers on FEV1. Mortality, all-cause hospitalization, and quality of life outcomes were narratively synthesized.

We included 23 observational studies and 14 RCTs. In pooled observational data, beta-blocker therapy was associated with an overall reduced risk of AECOPD versus no therapy (HR 0.77, 95% CI 0.70 to 0.85). Among individual beta-blockers, only propranolol was associated with a relative reduction in FEV1 versus placebo, among 199 patients evaluated in RCTs. Narrative syntheses on mortality, all-cause hospitalization and quality of life outcomes indicated a high degree of heterogeneity in study design and patient characteristics but suggested no detrimental effects of beta-blocker therapy on these outcomes.

The class effect of beta-blockers remains generally positive in patients with COPD. Reduced rates of AECOPD, mortality, and improved quality of life were identified in observational studies, while propranolol was the only agent associated with a deterioration of lung function in RCTs.

COPD and cardiovascular disease (CVD) often co-occur, in an interaction characterized by complex biological mechanisms and risk factors such as smoking. Beta-blockers are recommended in treatment regimens of people with heart failure (HF), following myocardial infarction (MI), angina or hypertension, due to proven mortality benefits [1, 2, 3, 4]. Seventeen years after the publication of the first robust meta-analysis demonstrating that beta-blockers do not impair lung function in patients with COPD [5], prescription rates remain lower than for people without COPD, among those with an indication for treatment. This treatment gap is thought to be, in part, due to concerns regarding adverse respiratory effects (such as a decrease in lung function) despite accumulating evidence to the contrary [6]. Concomitant CVD independently affects mortality and hospitalization in patients with COPD, further adding to the clinical burden and complexity of treatment pathways in these patients [7, 8].

COPD guidelines recommend the use of cardioselective beta-blockers when appropriate, reinforced by evidence gathered in a Cochrane review [9]. Data regarding the association of beta-blocker therapy with mortality and acute exacerbations of COPD (AECOPD) are derived mostly from observational studies, and previous reviews have aggregated results for cardioselective and non-cardioselective agents [10, 11]. However, a recent single RCT [12] reported more hospitalizations due to AECOPD in patients treated with metoprolol as compared to placebo, though results on mortality and FEV1 were inconclusive.

Our study expands on previous literature by dissecting the effects of beta-blockers, from both RCTs and observational studies, on a wide range of clinically relevant end points (mortality, AECOPD, FEV1, all-cause hospitalization, and quality of life outcomes such as the St. George's Respiratory Questionnaire (SGRQ), the 12- and 6-minute walking tests (12MWT and 6MWT), and the Short-Form Health Survey Questionnaire (SF-36)), thereby providing a comprehensive assessment of the effects of beta-blocker treatment in COPD. We have two overarching aims: (1) to identify and assess the class effect of beta-blockers and (2) to compare within-class effects of beta-blockers on the aforementioned outcomes. If all studies have a minimum of one intervention in common with another, it is possible to create a network of treatments, allowing both direct and indirect evidence to be used in deriving comparisons between beta-blockers not studied in a head-to-head manner, using a network meta-analysis (NMA). Importantly, we also want to address a current gap in knowledge: we will investigate whether the potential benefits of beta-blockers are limited to those with CVD or may extend to the wider COPD population with or without undiagnosed CVD.

The protocol for this review was previously published [13]. Searches were conducted from inception to January 2021 in MEDLINE, Embase and CINAHL via Ovid, and in the Cochrane Collection Central Register of Clinical Trials, to identify studies that examined the association between beta-blocker use in patients with COPD (defined as a post-bronchodilator FEV1/FVC of < 0.70, or as being in accordance with GOLD guidelines [6]; patients with a clinical diagnosis of COPD) and clinical, safety and quality of life outcomes. To ensure we captured all relevant evidence, we included prospective interventional trials (RCTs) and prospective observational studies (single-arm studies were excluded). At the screening stage, due to a scarcity of prospective observational studies, we decided to also include retrospective observational studies. We required all studies to report on mortality, AECOPD, all-cause hospitalization and quality of life outcomes. We also manually searched the reference lists of previously published reviews. Abstracts were screened for inclusion by two independent reviewers, with any discrepancies resolved through discussion. Full texts of included abstracts were screened by a single investigator, and 25% of articles were additionally validated by a second investigator. The full inclusion/exclusion criteria applied at each stage are available in Additional file 1: Table S1.

Data extraction and quality assessment

For each accepted study, data was extracted on design, characteristics of the study population including comorbidities, inclusion and exclusion criteria, treatment administered and the reported effect of beta-blockers on the included outcomes. Details on planned data extraction are available in the protocol [ 13 ]. Authors were contacted to clarify ambiguously reported data from published reports. Included observational studies were assessed for risk of bias using the ROBINS-I tool [ 14 ] for cohort studies, and RCTs were assessed using the RoB tool [ 15 ]. Bias domains evaluated included confounding, reporting, attrition, and measurement of outcomes. Each domain was assigned a risk category: "low", "moderate", "high" or "unclear" for observational studies and "low", "high" or "some concerns" for RCTs. Additionally, we assessed the certainty of the evidence using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) framework [ 16 ].

Searches identified studies reporting on all-cause mortality, AECOPD, FEV1, all-cause hospitalization, the SGRQ, the 12MWT and 6MWT, and the SF-36. Four researchers extracted data from the included articles, and all extractions were validated by a second researcher.

Data analysis

Where included studies were reasonably statistically and clinically similar, we pooled results using meta-analysis (to investigate the class effect of beta-blocker treatment) or NMA, where data on individual therapeutic compounds was available. Publication bias was assessed using funnel plots if there were at least 10 studies included in a meta-analysis [ 17 ]. For binary outcomes we initially included studies that reported outcomes in any format (hazard ratio [HR], odds ratio [OR], risk ratio, incidence rate); however, the final inclusion list contains only studies reporting HRs, since this was the most common format amongst included studies. Heterogeneity was assessed using the I² statistic [ 18 ].
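The I² statistic referenced above is derived from Cochran's Q. As a minimal illustration (a Python sketch, not the R code actually used for the analyses), assuming inverse-variance weighted estimates on the log scale:

```python
def cochran_q_i2(estimates, std_errors):
    """Cochran's Q and I^2 for a set of study effect estimates
    (e.g. log hazard ratios) with their standard errors.
    I^2 = max(0, (Q - df) / Q) * 100, the percentage of total
    variability attributable to between-study heterogeneity."""
    w = [1.0 / se ** 2 for se in std_errors]          # inverse-variance weights
    pooled = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, estimates))
    df = len(estimates) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2
```

Identical study estimates give I² = 0 (the situation seen later in the AECOPD analysis), while widely scattered estimates push I² toward 100%.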

FEV1—Network meta-analysis of RCTs

We performed a random-effects Bayesian NMA to estimate the mean change in FEV1 between patients who received individual beta-blockers versus (vs.) placebo, with 95% credible intervals (CrI), using the package gemtc [ 19 ] in R v3.6. CrIs represent the 95% probability that the true underlying effect lies in the interval specified. In cases where the standard deviation (SD) for the FEV1 measures was not reported, the SD was extrapolated by averaging the SDs from other studies with similar sample characteristics. Random-effects analyses are widely accepted as the appropriate, more conservative approach when there is heterogeneity across study methods. By contrast, fixed-effect models assume that the effect size associated with an intervention does not vary from study to study, and they may be particularly appropriate when only a few studies are available for analysis. The best model fit for each network was selected based on a review of the deviance information criterion (DIC) and an evaluation of the different model assumptions.
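The fixed- vs. random-effects contrast described above can be illustrated with normalized inverse-variance weights: adding a between-study variance (tau²) to each study's variance pulls the weights toward equality, which is why the random-effects estimate is the more conservative one. A minimal Python sketch (illustrative only, not the Bayesian gemtc model):

```python
def study_weights(std_errors, tau2=0.0):
    """Normalized inverse-variance study weights.
    tau2 = 0 corresponds to a fixed-effect model; tau2 > 0 to a
    random-effects model, where the added between-study variance
    makes the weights more equal across studies."""
    raw = [1.0 / (se ** 2 + tau2) for se in std_errors]
    total = sum(raw)
    return [w / total for w in raw]
```

With tau² = 0 a large, precise study can dominate the pool; with tau² > 0 its relative weight shrinks, spreading influence across smaller studies.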

NMAs include direct and indirect evidence from trials to determine the best available treatment with respect to an outcome of interest. For the results to be valid, NMA assumptions need to be met, including the transitivity and consistency assumptions. For the transitivity assumption to be met, the studies that contribute direct evidence must be similar in the distribution of covariates and effect modifiers across the trial populations. Inconsistency occurs when the indirect evidence in a network differs from the direct evidence. Consistency of data in the network model is assessed implicitly in the package "gemtc", which uses a decision rule, the "node-splitting" method, to choose which comparisons may be potentially inconsistent. Small-study effects were explored by constructing comparison-adjusted funnel plots [ 20 ], and publication bias was assessed by Egger's test among comparisons of beta-blockers and placebo; a value of p < 0.1 indicated significant publication bias. To assess the probability that a treatment is the best within a network, rank probabilities were determined: the probability for each treatment to obtain each possible rank in terms of its relative effects. Interpretation needs to be made with caution, because a treatment may have a high probability of being ranked first, or last, while its benefit over other treatments is of little clinical value [ 21 ]. For this reason, we report a full ranking profile (where each treatment is assigned a probability of being the first, second, and so on, best treatment in the network), derived using the surface under the cumulative ranking curve (SUCRA) [ 22 ].
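The SUCRA value referenced above summarizes a treatment's full rank-probability vector as a single number between 0 (certainly last) and 1 (certainly first): it is the average of the cumulative rank probabilities over the first K-1 ranks. A minimal Python sketch of the calculation (illustrative; the analysis itself used gemtc output):

```python
def sucra(rank_probs):
    """SUCRA for one treatment, from its rank-probability vector
    (probability of ranking 1st, 2nd, ..., Kth in the network).
    Returns the mean of the cumulative rank probabilities over
    the first K-1 ranks: 1.0 = certainly best, 0.0 = certainly worst."""
    k = len(rank_probs)
    cum = 0.0
    total = 0.0
    for p in rank_probs[:-1]:      # cumulative probabilities over ranks 1..K-1
        cum += p
        total += cum
    return total / (k - 1)
```

A treatment certain to rank first scores 1.0; one equally likely to take any rank scores 0.5, which is why a high SUCRA alone does not guarantee a clinically meaningful advantage.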

Sensitivity analyses

We conducted two meta-regressions to establish whether FEV1 measurement at baseline or study duration influenced the main NMA results. These variables were added, separately, as covariates in the main NMA model; FEV1 as a continuous variable and follow-up dichotomised into short follow-up (less than 24 h) vs. long follow-up (more than 24 h). We compared model fit between models with and without covariates using the DIC. Where possible, we analyzed patients with and without CVD separately.

AECOPD—meta-analysis of observational studies

We pooled HRs denoting the association between beta-blocker exposure (vs. no exposure) amongst patients with COPD, using random-effects meta-analysis with the DerSimonian-Laird estimator in the "metafor" package [ 23 ] in R v3.6.
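The DerSimonian-Laird procedure can be sketched as follows (an illustrative Python version of the kind of computation the metafor package performs, not the authors' code): standard errors are recovered from the reported 95% CIs on the log scale, Cochran's Q yields the between-study variance tau², and pooling uses the tau²-adjusted weights.

```python
import math

def dl_pooled_hr(hrs, ci_lowers, ci_uppers):
    """DerSimonian-Laird random-effects pooling of hazard ratios.
    CIs are assumed to be 95% intervals, so SE = (ln(upper) - ln(lower)) / (2 * 1.96)."""
    y = [math.log(hr) for hr in hrs]
    se = [(math.log(u) - math.log(l)) / (2 * 1.96)
          for l, u in zip(ci_lowers, ci_uppers)]
    w = [1.0 / s ** 2 for s in se]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)     # fixed-effect mean
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))   # Cochran's Q
    df = len(y) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0           # DL between-study variance
    w_star = [1.0 / (s ** 2 + tau2) for s in se]              # adjusted weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    return math.exp(pooled)                                   # back to the HR scale
```

With zero heterogeneity (tau² = 0) this reduces to the fixed-effect estimate, consistent with the near-identical random- and fixed-effects AECOPD results reported below.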

Mortality; quality of life—narrative synthesis

If studies were too heterogeneous (I² > 75%), or where outcomes were reported in fewer than three studies per treatment comparison, quantitative analysis was not performed; instead, summary results were graphed on forest plots without pooling (mortality) and/or synthesized qualitatively (quality of life outcomes).

The database search identified 2932 potentially relevant articles, whilst other sources revealed six. After title and abstract screening, 187 articles underwent full-text review. We included 23 observational studies and 14 RCTs that reported on patients with COPD in the systematic literature review. Of the 23 observational studies, 21 reported on mortality [ 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 ], five reported on AECOPD [ 24 , 33 , 35 , 45 , 46 ], three reported on all-cause hospitalization [ 47 , 48 , 49 ], one reported on the SGRQ [ 45 ] and one reported on the SF-36 [ 42 ]. Of the 14 RCTs, 12 reported on FEV1 [ 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 ], two each reported on the 12MWT [ 59 , 62 ] and 6MWT [ 12 , 56 ] and two reported on the SGRQ [ 12 , 56 ] (Fig.  1 ).

figure 1

According to our protocol, we intended to include data on the effect of beta-blockers on AECOPD from RCTs; however, our search strategy revealed only one study of this kind [ 12 ]. Based on a population of 532 patients with moderate to severe COPD, the authors reported no significant difference in time to first AECOPD (of any severity) between metoprolol and placebo; however, the use of the beta-blocker was associated with a higher risk of severe exacerbation (requiring hospitalization). This study could not be included in the quantitative analysis, as there was no other RCT data to corroborate it.

Quantitative analyses

There were five observational studies [ 24 , 33 , 35 , 45 , 46 ] reporting on the effect of beta-blockers on AECOPD in patients from at least five countries across Europe. Follow-up varied from 0.76 [ 46 ] to 7.2 years [ 33 ]. The average age of the patients ranged from 62.8 [ 24 ] to 74 [ 46 ] years and the proportion of males from 49.8% [ 33 ] to 72.3% [ 45 ]. Only two studies reported on smoking status [ 33 , 45 ], which indicated the majority of patients were either current or former smokers. Comorbidities were frequent in all cohorts, specifically CVD, which was reported in all but one study [ 35 ]. Body mass index (BMI) was reported in only two studies and ranged between 25.5 [ 45 ] and 29.9 kg/m² [ 24 ]. All study characteristics are available in Additional file 1: Table S2 and Table S3.

In the presence of low statistical heterogeneity (< 25%), the random-effects and fixed-effects methods for pooling effect estimates give near-identical results. Due to low heterogeneity (I² = 0, owing to the large weight attributed to one study [ 46 ]) and the small overall number of studies, we report both random- and fixed-effects meta-analyses of AECOPD. In the random-effects analysis, the pooled estimate of the risk of AECOPD associated with beta-blocker use, from a total of 27,717 patients, was HR 0.78 [95%CI 0.74–0.82], suggesting a reduction in relative risk in the presence of beta-blockers (Fig.  2, Additional file 1: Table S4 for individual study outcomes). The fixed-effects meta-analysis yielded similar results (Additional file 1: Figure S1). Due to the low number of studies we could not formally assess the extent of publication bias. The GRADE assessment indicated that the overall quality of evidence on which the meta-analysis was based was low (Additional file 1: Table S18).

figure 2

Forest plot illustrating results of the meta-analysis evaluating the impact of beta-blocker therapy versus no beta-blocker therapy on AECOPD in patients with COPD (Estimate: HR hazard ratio, 95% CI confidence interval)

Data from 12 RCTs evaluating FEV1 for seven beta-blockers (atenolol, bisoprolol, carvedilol, celiprolol, metoprolol, propranolol, labetalol) in 199 patients were analyzed [ 50 , 51 , 52 , 53 , 54 , 55 , 57 , 58 , 59 , 60 , 61 , 63 ]. Trial duration varied from 1 hour [ 53 , 59 ] to 3–4 months [ 57 ], and baseline FEV1 measurements ranged between 1.15 [ 59 ] and 2.41 L [ 61 ]. Most patients were over 40 years old, except for one study where the mean age was 39 [ 60 ]. Across all studies, over 50% of the patient population were male, and four studies explicitly included only patients with CVD or hypertension [ 50 , 54 , 55 , 57 ] (Additional file 1: Table S5). A comparison between studies enrolling patients with CVD and those enrolling patients with COPD only is difficult due to the scarcity of reported data. BMI was available in two studies of COPD and CVD [ 55 , 57 ] and in only one study which excluded CVD [ 58 ]. Estimates were, however, similar and denoted overweight, but not obese, patient populations. Celiprolol was the only treatment evaluated exclusively in patients without CVD, in a single trial [ 61 ]. Sample size, age and the proportion of males were similar across all studies.

Figure  3 shows the network of eligible comparisons for FEV1 mean change from baseline to follow-up time point, including seven treatments. All beta-blockers except carvedilol were evaluated in at least one placebo-controlled trial. Individual study FEV1 measurements are presented in Additional file 1: Table S6. Figure  4 and Additional file 1: Table S7 show the NMA results for FEV1. Consistency results are illustrated in Additional file 1: Figure S2. Effects relative to placebo are presented separately for each treatment.

figure 3

Network of beta-blockers used to treat patients with COPD, from RCTs assessing FEV1

figure 4

Network meta-analysis results for mean difference in FEV1 (95% CrI), beta-blockers compared to placebo [measured in liters; CrI credible intervals]

There was no significant difference in FEV1 amongst all beta-blockers except for propranolol, which was the only treatment associated with a decrease in FEV1 (mean difference [MD]: −0.14 L, 95% CrI −0.28 to −0.016). Individual medications were ranked and are presented with estimates of the probability that each is the best treatment (i.e. the probability that the treatment improves lung function). Figure  5 shows that celiprolol had the highest likelihood of being ranked the best treatment, followed by labetalol. For the second rank, the same treatments appear the most likely. Overall, the SUCRA results based on the rankogram values suggest labetalol (86.2%) and celiprolol (80%) are the most likely to be the best treatments to positively affect FEV1, whilst propranolol was the least likely (16.2%) (Additional file 1: Table S9). No publication bias was detected on the comparison-adjusted funnel plot or by Egger's test (p = 0.1286, Additional file 1: Figure S3).

figure 5

Rankogram illustrating probabilities that each treatment is first, second, third…eighth with regards to FEV1 improvement

The meta-regression analyses, with baseline FEV1 measurement and follow-up duration added, in turn, as covariates, showed results similar to the main analysis (model fit did not improve in either model with added covariates, Additional file 1: Figure S4).

Beta-blocker therapy effect on FEV1 in patients with COPD with and without explicit CVD

Data from eight trials evaluating six beta-blockers (atenolol, bisoprolol, carvedilol, celiprolol, metoprolol, and propranolol) in 137 patients with COPD and no explicit CVD were analyzed [ 51 , 52 , 53 , 56 , 58 , 59 , 60 , 61 ]. No significant difference in FEV1 was detected when comparing each of the active treatments with placebo (Additional file 1: Figure S5A). Additional file 1: Figure S6 shows celiprolol was similarly likely to rank first in terms of increasing FEV1, while the second rank was, surprisingly, most likely to be obtained by placebo, then celiprolol. There were four trials investigating six beta-blockers (carvedilol, bisoprolol, atenolol, propranolol, metoprolol, labetalol) in patients with COPD and CVD [ 50 , 54 , 55 , 57 ]. No significant difference in FEV1 was detected when comparing each of the active treatments with placebo (Additional file 1: Figure S5B, Additional file 1: Figure S7).

Narrative synthesis

There were 21 observational studies reporting on mortality [ 24 , 25 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 43 , 44 , 64 ] which evaluated the effect of beta-blocker use vs. no beta-blocker use, in an overall population of 422,552 patients from at least 11 countries (Additional file 1: Table S2). According to inclusion criteria, 15 studies enrolled patients with COPD and a CVD indication [ 25 , 27 , 28 , 29 , 34 , 37 , 39 , 40 , 43 , 64 , 65 ], while the remaining six [ 24 , 26 , 32 , 33 , 35 , 44 ] did not specify whether those with CVD were specifically excluded; however, all studies had varying percentages of CVD comorbidities. Overall, patient characteristics varied: mean age ranged between 62.8 [ 24 ] and 84.6 [ 44 ] years and the proportion of males between 37% [ 26 ] and 100% [ 44 ]. The distribution of comorbidities was mixed, with hypertension being the most widely reported, ranging between 27.5% [ 33 ] and 88.3% [ 37 ]. Smoking status was reported in seven studies [ 25 , 26 , 28 , 31 , 33 , 41 , 44 ], where most patients were recorded as being either current or former smokers; however, data was not available consistently. BMI was reported in only five studies [ 24 , 28 , 29 , 41 , 44 ], ranging between 20.4 [ 29 ] and 29.9 kg/m² [ 24 ]. Follow-up time was also highly variable, ranging from 2 [ 30 ] to 112 months [ 40 ].

Individual adjusted study risk estimates for mortality associated with beta-blocker use vs. no beta-blocker use ranged from HR 0.46 (95%CI 0.19–1.11) [ 29 ] to 1.19 (95%CI 1.04–1.37) [ 26 ] (Fig.  6 ). While age and sex were the most common covariates adjusted for, the majority of studies used a variety of study-specific variables: medications for specific indications (such as hypertension, HF) and other comorbidities or clinical variables (Additional file 1: Table S10). Two studies reported unadjusted analyses [ 27 , 28 ]. Only one study reported an increase in mortality risk associated with beta-blockers (HR: 1.19, 95% CI 1.04–1.37); however, the population assessed in this report consisted of patients with severe COPD who were undergoing long-term oxygen therapy [ 26 ]. There was a very high degree of heterogeneity amongst studies (I² = 99.3%). This was explored by conducting stratified analyses (i.e. stratifying by type of beta-blocker [cardioselective vs. non-cardioselective, Additional file 1: Figure S8]; excluding unadjusted estimates; excluding the only study which exclusively included very severe COPD patients). However, because heterogeneity remained very high (I² > 75%), results from the outcome analysis are presented graphically (Fig.  6 ).

figure 6

Forest plot illustrating the impact of beta-blocker therapy versus no beta-blocker therapy on mortality, in patients with COPD (Estimate: HR hazard ratio, 95% CI confidence interval)

All-cause hospitalization

All-cause hospitalization was reported in three observational studies [ 47 , 48 , 49 ]. One compared cardioselective beta-blockers to non-selective beta-blockers (and presented odds ratios [OR]) [ 48 ]; one compared non-cardioselective beta-blockers to selective beta-blockers (and presented HRs) [ 49 ]; and one compared cardioselective beta-blockers to a lack of beta-blocker treatment (and presented relative risks) [ 47 ]; therefore, no class-effect comparison could be inferred. None of the studies found significant differences in all-cause hospitalization associated with the investigated treatments (Additional file 1: Table S11).

Quality of life

SGRQ was assessed in two RCTs [ 12 , 56 ] and one observational study [ 45 ], but none reported mean change from baseline to follow-up per treatment arm. One RCT [ 66 ] compared metoprolol to placebo and one observational study [ 45 ] assessed any beta-blocker compared to a lack of beta-blocker treatment; both reported no significant difference in SGRQ between the two treatment arms at one-year follow-up (Additional file 1: Table S12).

The 12MWT was investigated in two RCTs [ 59 , 62 ]: one study investigated atenolol and metoprolol vs. placebo but did not report a mean change in score at four weeks' follow-up [ 62 ]; the second study did not find a significant difference in distance walked between patients that received metoprolol vs. propranolol six hours after treatment was administered [ 59 ] (Additional file 1: Table S13).

Data on the 6MWT was reported in two recent RCTs, from 2017 [ 56 ] and 2019 [ 12 ]. The first evaluated the effect of bisoprolol compared to carvedilol and did not present the mean change between treatment groups; however, the calculated estimates suggest both agents decreased distance walked in patients with COPD, with no difference apparent between the two. The second trial [ 12 ] did not identify a significant difference between metoprolol and placebo on the 6MWT (Additional file 1: Table S14).

Data on the SF-36 was available in one observational study [ 42 ]. Whilst overall scores were not available per treatment group, the authors reported no significant association between beta-blocker treatment and individual domains of the quality of life assessment tool, either at baseline or at 6.4 years of follow-up (Additional file 1: Table S15).

Risk of bias

Observational studies were mostly judged to have a moderate risk of bias (23 studies [ 24 , 25 , 26 , 28 , 30 , 31 , 32 , 33 , 34 , 35 , 37 , 38 , 39 , 40 , 41 , 42 , 46 , 47 , 48 , 49 , 64 , 65 ]); two studies [ 25 , 45 ] were considered to be at low risk of bias, one [ 44 ] at serious risk of bias, and one [ 27 ] did not provide enough information for a judgment to be made. The bias domains most often given a "moderate" rating were "bias due to confounding" and "bias in selection of participants into the study", as the majority of studies included patients recruited from databases which did not provide clinical diagnoses and relied on ICD coding (without confirming the validity of diagnoses) (Additional file 1: Table S16). Ten RCTs [ 53 , 55 , 58 , 60 , 61 ] had a moderate risk of bias, denoted by ratings of "some concerns"; two studies [ 56 , 57 ] were deemed at serious risk of bias, both due to a lack of blinding (Additional file 1: Figure S9).

This comprehensive and up-to-date evaluation of the effects of beta-blockers in patients with COPD adds to the previous literature in several ways: we included all studies reporting on any type of beta-blocker treatment in patients with COPD, showing overall beneficial effects on AECOPD and mortality. For the first time, we used a probabilistic approach to evaluate the effect of beta-blockers on FEV1 using direct and indirect evidence from RCTs in an NMA, comparing seven treatments against placebo, and presented results for patients with COPD with and without CVD separately. No beta-blocker affected lung function significantly except propranolol, and the treatments least likely to have a detrimental effect on FEV1 were labetalol (in those with COPD and CVD) and celiprolol (in those with COPD without explicit CVD). Lastly, we found that data on all-cause hospitalization and quality of life endpoints such as the SGRQ, 12MWT, 6MWT and SF-36 were scarcely reported across the literature and did not lend themselves to formal quantitative analysis, suggesting an area of focus for future studies.

Despite heterogeneous elements such as follow-up time, baseline characteristics (including age, sex and comorbidities) and geographical location, individual results from 17 of the 21 studies reporting on mortality suggested beta-blocker therapy was associated with a diminished risk of death in patients with COPD, compared to those not prescribed beta-blockers. However, the quality of this evidence was deemed "low" per the GRADE assessment (Additional file 1: Table S17) and we were not able to quantify the effect of beta-blockers on mortality due to considerable heterogeneity (I² > 75%). Previous reports [ 10 , 11 , 67 ] have provided pooled estimates of reductions in mortality risk associated with beta-blocker treatment; however, all reported degrees of heterogeneity above the Cochrane I² threshold of 75% (89.3% [ 10 ], 83% [ 11 ] and, most recently, 96% [ 67 ]), bringing into question the validity and interpretability of these results as applied to the general COPD population. Reasons for very high heterogeneity in previous meta-analyses include: differences in study populations (i.e. including patients with differing degrees of severity), inaccurate risk of bias assessment and the inclusion of different comparators for the intervention effect of interest (i.e. including studies where comparator arms received calcium channel blockers, despite aiming to assess the effect of beta-blocker treatment vs. lack of treatment) [ 67 ].

In our analysis, most studies were affected by bias, particularly due to confounding: two studies did not adjust for any covariates [ 27 , 55 ], whilst nine did not adjust for COPD severity either directly or indirectly (by including COPD medication regimen/exacerbation history in the final model) [ 25 , 26 , 27 , 28 , 30 , 32 , 34 , 36 , 37 ]. Therefore, these studies may overestimate the prognostic effect of beta-blocker therapy in patients with COPD and may, in turn, skew results to show benefits. One reason for the lack of adjustment for COPD-related variables may be the use of data from either existing drug trials or CVD-specific registries which included data on subgroups of patients with COPD, reiterating the need for trials designed specifically for patients with COPD (with and without additional CVD), which would allow for reliable assessment of the true effect of beta-blockers in these patients. Furthermore, it is not surprising to observe a decrease in mortality, as this could reflect the established effect of beta-blockers on other comorbid conditions (i.e. CVD). A previous study [ 33 ] suggested long-term treatment with beta-blockers improved survival of patients with COPD without CVD; however, future studies are needed to confirm this result and to assess whether beta-blockers provide non-CV mortality benefits.

We found evidence to suggest that patients with COPD who are given beta-blockers are at decreased risk of AECOPD (HR 0.78 [95%CI 0.74–0.82]), replicating findings from Du and colleagues [ 10 ], who report an even larger reduction in risk, of 37% (RR 0.63 [95% CI 0.57–0.71]). However, this previous meta-analysis had methodological limitations inherent to the observational nature of the pooled studies (i.e. residual confounding, immortal time bias), which may limit the generalizability of its results. In addition, the GRADE assessment revealed that the body of observational evidence from which our estimate was derived was of "low" quality (Additional file 1: Table S19). A recent RCT [ 12 ], less likely to be affected by the biases of previous observational studies, found no significant difference between metoprolol and placebo in the time to AECOPD of any severity, but revealed a significant increase in the risk of AECOPD requiring hospitalization in patients with COPD without an indication for beta-blocker treatment, bringing into question the protective effect of this specific beta-blocker agent.

However, this trial did not evaluate other beta-blockers; therefore, future RCTs evaluating multiple regimens are needed to confirm the benefit of these agents. Whether beta-blockers have an indirect effect on exacerbations of COPD could be assessed in clinical trials including patients with COPD and comorbid CVD, allowing assessment of these agents in a more representative COPD population.

FEV1 was assessed in 199 patients enrolled in 12 RCTs, and we found that none of the individual cardioselective beta-blockers included in our NMA (atenolol, bisoprolol, celiprolol, metoprolol) were associated with significant effects on lung function in patients with COPD, regardless of baseline FEV1 or follow-up time. This is in line with a Cochrane review [ 9 ] which concluded that cardioselective beta-blockers, given either as a single dose or for longer durations, do not affect FEV1 in patients with COPD, even in those with the lowest baseline FEV1 measurements. Furthermore, our report extends this to a lack of effect on FEV1 of non-selective beta-blockers such as carvedilol and labetalol. Propranolol was the only medication found to be associated with a reduction in FEV1, of 140 ml (95% CrI −0.28 to −0.016 L), which is larger than the 100 ml change deemed clinically significant by the American Thoracic Society and European Respiratory Society guidelines. This result is based on high-quality evidence according to the GRADE assessment (Additional file 1: Table S19), and thus supports current recommendations not to use this medication in patients with COPD.

For the first time reported in the literature, we aimed to rank beta-blockers with respect to their effect on lung function. Propranolol had the lowest probability of being ranked first (suggesting the worst impact on lung function) compared to all other individual treatments considered in our NMA, including placebo. Labetalol and celiprolol, drugs used in hypertension, were the least likely to negatively impact FEV1 compared to all other beta-blockers; however, neither affected FEV1 with certainty compared to placebo, and the results were inferred from very-low-quality evidence according to GRADE (Additional file 1: Table S18), bringing into question their leading positions in the hierarchy. Since the choice of beta-blocker may be influenced by CVD comorbidity (i.e. carvedilol, metoprolol and bisoprolol are recommended in stable HF; atenolol is more often prescribed in patients with asymptomatic hypertension, while bisoprolol is also used in atrial fibrillation, and propranolol is infrequently used to treat tachyarrhythmias), it is perhaps not surprising that we did not identify a clear "best" beta-blocker to be used in COPD. The fact that the beta-blockers least likely to decrease lung function are mainly used to treat hypertension may simply reflect that this subgroup of patients is less prone to detrimental side effects (i.e. indication bias), compared to others with COPD and more severe comorbidities. Indeed, the prescription of beta-blockers in COPD needs to weigh clinically significant lung function alteration against mortality benefits in those with CVD, particularly MI [ 68 ] and HF [ 69 ].

Whilst CVD is diagnosed in 20 to 60% of patients with COPD [ 70 ], our main analysis included primarily small trials, and only three explicitly included patients with a cardiac comorbidity (one included angina [ 54 ] and two included HF patients [ 55 , 57 ]), while one included patients with hypertension, a common CVD risk factor [ 50 ]. In line with previous research [ 9 ], we report no significant FEV1 treatment effect in patients with COPD and CVD.

The remaining eight trials excluded those with CVD (or simply did not report whether it was present), and the results mirrored those observed for patients with CVD. Whilst results from this subgroup analysis are encouraging, previous clinical data in this subgroup are scarce. A recent single RCT including COPD patients without an indication for beta-blockers (therefore excluding those with HF, previous MI or revascularization) failed to demonstrate clear benefits of metoprolol over placebo. Observational studies have included a more varied breadth of specific beta-blockers; however, they do not present a clear picture: the population-based Rotterdam Study [ 71 ] reported significant decreases in FEV1 associated with both cardioselective and non-cardioselective beta-blockers, while two other studies, one from Scotland [ 35 ] and one from Japan [ 72 ], reported no significant difference in FEV1. Yet these results may be affected by confounding by indication, which may explain the variability of estimates. Additionally, the longer follow-up times in these studies (ranging from 4 to 6 years) may be confounded by the natural FEV1 decline documented in patients with COPD, regardless of CVD comorbidities.

Overall, our FEV1 analysis suggests the beta-blockers included in this review do not affect lung function in patients with COPD regardless of CVD status, and the selectivity of the agent does not appear to have an impact. However, the two treatment networks contained different medications (celiprolol was assessed in one trial excluding CVD, while labetalol was assessed in one trial including CVD), thus we cannot rule out differential results had the full range of beta-blockers been included in both. Finally, we included evidence based on a relatively small population and some of the studies were conducted decades ago; therefore, large clinical studies are needed to assess other agents which may confer lung function benefits across contemporary COPD patients.

The effect of beta-blocker exposure on all-cause hospitalization and quality of life outcomes in patients with COPD could not be quantified, due to a paucity of data. Narrative results from the assessment of studies investigating quality of life outcomes, such as the SGRQ, 12MWT, 6MWT and SF-36, all suggest a non-significant effect of beta-blockers, in both RCTs and observational studies, albeit the data was deemed to be of "very low" quality according to GRADE (Additional file 1: Table S17). Currently, COPD management is focused on preventing exacerbations and improving functioning and health-related quality of life. Clinical studies of beta-blocker treatment in cardiac disease suggest improvements in exercise tolerance and functional status, so whether beta-blockers impair or improve these outcomes in patients with COPD as well is a topic of importance for clinical management. Both randomized trials and, importantly, prospective observational studies with longer follow-up times are needed.

Limitations

There are several limitations to our analysis. First, we included published, peer-reviewed literature only; thus, results may be affected by publication bias, as studies reporting positive results (i.e. that did not find beta-blockers were associated with negative outcomes) are more likely to be published than negative studies. Nevertheless, our data is based on the most recent available evidence and portrays a nuanced picture of specific beta-blocker treatment in patients with COPD, emphasizing the need for targeted treatment of CVD comorbidity in these patients.

We only included stable COPD patients, and whilst we showed that FEV1 reduction (or increase) was not significant according to beta-blocker exposure (apart from propranolol), we could not verify whether these therapeutic agents diminish the response to rescue COPD medication, such as beta-agonists, administered during an exacerbation of COPD. We also did not assess the long-term effects of co-administration of beta-blockers and beta-agonists, or how their interaction may affect outcomes in patients receiving both types of medication.

Another issue is undiagnosed CVD in patients with COPD. Symptoms of ischemic heart disease or HF may be misattributed to, or overlap with, those of COPD, and thus go formally undiagnosed, posing difficulties in disentangling possible non-cardiac effects of beta-blockers independent of their proven cardiac benefits. One advantage of our FEV1 analysis is that we included RCTs only, where concomitant CVD is often ascertained more rigorously; CVD status was therefore known with a greater degree of confidence than may be the case in observational studies.

Furthermore, no statistically significant effect was detected in subgroup analyses stratified by CVD status, which may be due to limited sample size. Future, adequately powered RCTs are needed to assess the effect of beta-blockers in a diverse COPD population, allowing for accurate comparisons based on CVD status to be made.

A recent RCT [ 12 ] comparing metoprolol with placebo failed to find a significant effect on FEV1, but reported worsening of dyspnea and overall COPD symptoms, suggestive of respiratory effects not captured by spirometry. This confirms the need to evaluate a spectrum of respiratory outcomes to fully assess the implications of beta-blocker treatment in patients with COPD, which future studies should address.

Confounding by contraindication is likely to affect the interpretation of results: if clinicians knowingly withheld treatment from patients due to concerns about breathlessness, this may have reduced the sample of COPD patients who would otherwise have been eligible for beta-blocker therapy. Alternatively, doctors may preferentially prescribe beta-blockers to less severe patients, limiting generalizability.

Our AECOPD analysis is also limited by the low number of included studies, all of which were observational; we identified only one RCT (evaluating metoprolol). This reinforces the need for more carefully conducted RCTs to evaluate a range of beta-blockers and their effects on AECOPD, in order to validate the observational data.

Findings from this analysis represent the most comprehensive and up-to-date evidence synthesis assessing the effects of beta-blocker use in patients with COPD, spanning data published over four decades. A reduction in COPD exacerbation risk was inferred from observational data, while clinical trial data were pooled to assess lung function. Mortality and quality of life were described narratively, owing to high heterogeneity and sparsity of data, respectively. FEV1 was significantly impacted by propranolol, but not by atenolol, bisoprolol, carvedilol, celiprolol, labetalol or metoprolol. In the subset of individuals with CVD, no individual beta-blocker was associated with a reduction in lung function. Treatment choice in patients with COPD should be made according to guidelines on the management of CVD comorbidity.

Availability of data and materials

The datasets analyzed during this study are available from the corresponding author upon reasonable request.

Abbreviations

6MWT: 6-Minute walking test

12MWT: 12-Minute walking test

AECOPD: Acute exacerbation due to COPD

BMI: Body mass index

COPD: Chronic obstructive pulmonary disease

CI: Confidence interval

CrI: Credible interval

CVD: Cardiovascular disease

FEV1: Forced expiratory volume in one second

HF: Heart failure

HR: Hazard ratio

NMA: Network meta-analysis

MI: Myocardial infarction

SF-36: Short-Form Health Survey Questionnaire

SGRQ: St. George’s Respiratory Questionnaire

SUCRA: Surface under the cumulative ranking

RCT: Randomized controlled trial

Heidenreich PA, McDonald KM, Hastie T, Fadel B, Hagan V, Lee BK, Hlatky MA. Meta-analysis of trials comparing β-blockers, calcium antagonists, and nitrates for stable angina. JAMA. 1999;281(20):1927–36.

Ponikowski P, Voors AA, Anker SD, Bueno H, Cleland JGF, Coats AJS, et al. 2016 ESC Guidelines for the diagnosis and treatment of acute and chronic heart failure: The Task Force for the diagnosis and treatment of acute and chronic heart failure of the European Society of Cardiology (ESC) Developed with the special contribution of the Heart Failure Association (HFA) of the ESC. Eur Heart J. 2016;37(27):2129–200.

Task Force on the management of ST-segment elevation acute myocardial infarction of the European Society of Cardiology (ESC), Steg PG, James SK, Atar D, Badano LP, Blomstrom-Lundqvist C, et al. ESC Guidelines for the management of acute myocardial infarction in patients presenting with ST-segment elevation. Eur Heart J. 2012;33(20):2569–619.

Whelton PK, Carey RM, Aronow WS, Casey DE Jr, Collins KJ, Dennison Himmelfarb C, et al. 2017 ACC/AHA/AAPA/ABC/ACPM/AGS/APhA/ASH/ASPC/NMA/PCNA Guideline for the Prevention, Detection, Evaluation, and Management of High Blood Pressure in Adults: Executive Summary: a report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines. Hypertension. 2018;71(6):1269–324.

Salpeter SR, Ormiston TM, Salpeter EE, Poole PJ, Cates CJ. Cardioselective beta-blockers for chronic obstructive pulmonary disease: a meta-analysis. Respir Med. 2003;97(10):1094–101.

Vestbo J, Hurd SS, Agusti AG, Jones PW, Vogelmeier C, Anzueto A, et al. Global strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary disease: GOLD executive summary. Am J Respir Crit Care Med. 2013;187(4):347–65.

Morgan AD, Zakeri R, Quint JK. Defining the relationship between COPD and CVD: what are the implications for clinical practice? Ther Adv Respir Dis. 2018;12:1753465817750524.

Rabe KF, Hurst JR, Suissa S. Cardiovascular disease and COPD: dangerous liaisons? Eur Respir Rev 2018;27(149).

Salpeter S, Ormiston T, Salpeter E. Cardioselective beta-blockers for chronic obstructive pulmonary disease. Cochrane Database Syst Rev. 2016;4:CD003566.

Du Q, Sun Y, Ding N, Lu L, Chen Y. Beta-blockers reduced the risk of mortality and exacerbation in patients with COPD: a meta-analysis of observational studies. PLoS ONE. 2014;9(11):e113048.

Etminan M, Jafari S, Carleton B, FitzGerald JM. Beta-blocker use and COPD mortality: a systematic review and meta-analysis. BMC Pulm Med. 2012;12(1):48.

Dransfield MT, Voelker H, Bhatt SP, Brenner K, Casaburi R, Come CE, et al. Metoprolol for the Prevention of Acute Exacerbations of COPD. N Engl J Med. 2019;381(24):2304–14.

Gulea C, Zakeri R, Quint JK. Effect of beta-blocker therapy on clinical outcomes, safety, health-related quality of life and functional capacity in patients with chronic obstructive pulmonary disease (COPD): a protocol for a systematic literature review and meta-analysis with multiple treatment comparison. BMJ Open. 2018;8(11):e024736.

Sterne JA, Hernan MA, Reeves BC, Savovic J, Berkman ND, Viswanathan M, et al. ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions. BMJ. 2016;355:i4919.

Sterne JAC, Savovic J, Page MJ, Elbers RG, Blencowe NS, Boutron I, et al. RoB 2: a revised tool for assessing risk of bias in randomised trials. BMJ. 2019;366:l4898.

Puhan MA, Schunemann HJ, Murad MH, Li T, Brignardello-Petersen R, Singh JA, et al. A GRADE Working Group approach for rating the quality of treatment effect estimates from network meta-analysis. BMJ. 2014;349:g5630.

Higgins JPTJ, Chandler J, Cumpston M, Li T, Page MJ, Welch VA, editors. Cochrane handbook for systematic reviews of interventions. New Jersey: Wiley; 2019.

Higgins JP, Thompson SG. Quantifying heterogeneity in a meta-analysis. Stat Med. 2002;21(11):1539–58.

van Valkenhoef G, Lu G, de Brock B, Hillege H, Ades AE, Welton NJ. Automating network meta-analysis. Res Synth Methods. 2012;3(4):285–99.

Salanti G, Del Giovane C, Chaimani A, Caldwell DM, Higgins JP. Evaluating the quality of evidence from a network meta-analysis. PLoS ONE. 2014;9(7):e99682.

Trinquart L, Attiche N, Bafeta A, Porcher R, Ravaud P. Uncertainty in treatment rankings: reanalysis of network meta-analyses of randomized trials. Ann Intern Med. 2016;164(10):666–73.

Furukawa TA, Salanti G, Atkinson LZ, Leucht S, Ruhe HG, Turner EH, et al. Comparative efficacy and acceptability of first-generation and second-generation antidepressants in the acute treatment of major depression: protocol for a network meta-analysis. BMJ Open. 2016;6(7):e010919.

Viechtbauer W. Conducting meta-analyses in R with the metafor package. J Stat Softw. 2010;36(3):1–48.

Bhatt SP, Wells JM, Kinney GL, Washko GR Jr, Budoff M, Kim YI, et al. Beta-blockers are associated with a reduction in COPD exacerbations. Thorax. 2016;71(1):8–14.

Coiro S, Girerd N, Rossignol P, Ferreira JP, Maggioni A, Pitt B, et al. Association of beta-blocker treatment with mortality following myocardial infarction in patients with chronic obstructive pulmonary disease and heart failure or left ventricular dysfunction: a propensity matched-cohort analysis from the High-Risk Myocardial Infarction Database Initiative. Eur J Heart Fail. 2017;19(2):271–9.

Ekstrom MP, Hermansson AB, Strom KE. Effects of cardiovascular drugs on mortality in severe chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2013;187(7):715–20.

Gottlieb SS, McCarter RJ, Vogel RA. Effect of beta-blockade on mortality among high-risk and low-risk patients after myocardial infarction. N Engl J Med. 1998;339(8):489–97.

Hawkins NM, Huang Z, Pieper KS, Solomon SD, Kober L, Velazquez EJ, et al. Chronic obstructive pulmonary disease is an independent predictor of death but not atherosclerotic events in patients with myocardial infarction: analysis of the Valsartan in Acute Myocardial Infarction Trial (VALIANT). Eur J Heart Fail. 2009;11(3):292–8.

Kubota Y, Asai K, Furuse E, Nakamura S, Murai K, Tsukada YT, et al. Impact of beta-blocker selectivity on long-term outcomes in congestive heart failure patients with chronic obstructive pulmonary disease. Int J Chron Obstruct Pulmon Dis. 2015;10:515–23.

Mentz RJ, Wojdyla D, Fiuzat M, Chiswell K, Fonarow GC, O’Connor CM. Association of beta-blocker use and selectivity with outcomes in patients with heart failure and chronic obstructive pulmonary disease (from OPTIMIZE-HF). Am J Cardiol. 2013;111(4):582–7.

Quint JK, Herrett E, Bhaskaran K, Timmis A, Hemingway H, Wedzicha JA, et al. Effect of beta blockers on mortality after myocardial infarction in adults with COPD: population based cohort study of UK electronic healthcare records. BMJ. 2013;347:f6650.

Rodriguez-Manero M, Lopez-Pardo E, Cordero A, Ruano-Ravina A, Novo-Platas J, Pereira-Vazquez M, et al. A prospective study of the clinical outcomes and prognosis associated with comorbid COPD in the atrial fibrillation population. Int J Chron Obstruct Pulmon Dis. 2019;14:371–80.

Rutten FH, Zuithoff NP, Hak E, Grobbee DE, Hoes AW. Beta-blockers may reduce mortality and risk of exacerbations in patients with chronic obstructive pulmonary disease. Arch Intern Med. 2010;170(10):880–7.

Scrutinio D, Guida P, Passantino A, Ammirati E, Oliva F, Lagioia R, et al. Acutely decompensated heart failure with chronic obstructive pulmonary disease: clinical characteristics and long-term survival. Eur J Intern Med. 2019;60:31–8.

Short PM, Lipworth SI, Elder DH, Schembri S, Lipworth BJ. Effect of beta blockers in treatment of chronic obstructive pulmonary disease: a retrospective cohort study. BMJ. 2011;342:d2549.

Sin DD, McAlister FA. The effects of beta-blockers on morbidity and mortality in a population-based cohort of 11,942 elderly patients with heart failure. Am J Med. 2002;113(8):650–6.

Staszewsky L, Cortesi L, Tettamanti M, Dal Bo GA, Fortino I, Bortolotti A, et al. Outcomes in patients hospitalized for heart failure and chronic obstructive pulmonary disease: differences in clinical profile and treatment between 2002 and 2009. Eur J Heart Fail. 2016;18(7):840–8.

Su TH, Chang SH, Kuo CF, Liu PH, Chan YL. beta-blockers after acute myocardial infarction in patients with chronic obstructive pulmonary disease: a nationwide population-based observational study. PLoS ONE. 2019;14(3):e0213187.

Su VY, Chang YS, Hu YW, Hung MH, Ou SM, Lee FY, et al. Carvedilol, bisoprolol, and metoprolol use in patients with coexistent heart failure and chronic obstructive pulmonary disease. Medicine (Baltimore). 2016;95(5):e2427.

Su VY, Yang YY, Perng DW, Tsai YH, Chou KT, Su KC, Su WJ, Chen PC, Yang KY. Real-world effectiveness of medications on survival in patients with COPD-heart failure overlap. Aging. 2019;11(11):3650.

van Gestel YR, Hoeks SE, Sin DD, Welten GM, Schouten O, Witteveen HJ, et al. Impact of cardioselective beta-blockers on mortality in patients with chronic obstructive pulmonary disease and atherosclerosis. Am J Respir Crit Care Med. 2008;178(7):695–700.

van Gestel YR, Hoeks SE, Sin DD, Stam H, Mertens FW, Bax JJ, van Domburg RT, Poldermans D. Beta-blockers and health-related quality of life in patients with peripheral arterial disease and COPD. Int J Chronic Obstructive Pulm Dis. 2009;4:177.

Wang WH, Cheng CC, Mar GY, Wei KC, Huang WC, Liu CP. Improving outcomes in chronic obstructive pulmonary disease by taking beta-blockers after acute myocardial infarction: a nationwide observational study. Heart Vessels. 2019;34(7):1158–67.

Zeng LH, Hu YX, Liu L, Zhang M, Cui H. Impact of beta2-agonists, beta-blockers, and their combination on cardiac function in elderly male patients with chronic obstructive pulmonary disease. Clin Interv Aging. 2013;8:1157–65.

Maltais F, Buhl R, Koch A, Amatto VC, Reid J, Gronke L, et al. Beta-blockers in COPD: a cohort study from the TONADO Research Program. Chest. 2018;153(6):1315–25.

Rasmussen DB, Bodtger U, Lamberts M, Torp-Pedersen C, Gislason G, Lange P, et al. Beta-blocker use and acute exacerbations of COPD following myocardial infarction: a Danish nationwide cohort study. Thorax. 2020;75(11):928–33.

Brooks TW, Creekmore F, Young DC, Asche CV, Oberg B, Samuelson WM. Rates of hospitalizations and emergency department visits in patients with asthma and chronic obstructive pulmonary disease taking β-blockers. Pharmacotherapy. 2007;27(5):684–90.

Farland MZ, Peters CJ, Williams JD, Bielak KM, Heidel RE, Ray SM. beta-Blocker use and incidence of chronic obstructive pulmonary disease exacerbations. Ann Pharmacother. 2013;47(5):651–6.

Sessa M, Mascolo A, Mortensen RN, Andersen MP, Rosano GMC, Capuano A, et al. Relationship between heart failure, concurrent chronic obstructive pulmonary disease and beta-blocker use: a Danish nationwide cohort study. Eur J Heart Fail. 2018;20(3):548–56.

Adam WR, Meagher EJ, Barter CE. Labetalol, beta blockers, and acute deterioration of chronic airway obstruction. Clin Exp Hypertens A. 1982;4(8):1419–28.

Chang CL, Mills GD, McLachlan JD, Karalus NC, Hancox RJ. Cardio-selective and non-selective beta-blockers in chronic obstructive pulmonary disease: effects on bronchodilator response and exercise. Intern Med J. 2010;40(3):193–200.

Chester EH, Schwartz HJ, Fleming GM. Adverse effect of propranolol on airway function in nonasthmatic chronic obstructive lung disease. Chest. 1981;79(5):540–4.

Sinclair DJ. Comparison of effects of propranolol and metoprolol on airways obstruction in chronic bronchitis. Br Med J. 1979;1(6157):168.

Dorow P, Bethge H, Tönnesmann U. Effects of single oral doses of bisoprolol and atenolol on airway function in nonasthmatic chronic obstructive lung disease and angina pectoris. Eur J Clin Pharmacol. 1986;31(2):143–7.

Hawkins NM, MacDonald MR, Petrie MC, Chalmers GW, Carter R, Dunn FG, et al. Bisoprolol in patients with heart failure and moderate to severe chronic obstructive pulmonary disease: a randomized controlled trial. Eur J Heart Fail. 2009;11(7):684–90.

Jabbal S, Anderson W, Short P, Morrison A, Manoharan A, Lipworth BJ. Cardiopulmonary interactions with beta-blockers and inhaled therapy in COPD. QJM. 2017;110(12):785–92.

Lainscak M, Podbregar M, Kovacic D, Rozman J, von Haehling S. Differences between bisoprolol and carvedilol in patients with chronic heart failure and chronic obstructive pulmonary disease: a randomized trial. Respir Med. 2011;105:S44–9.

Mainguy V, Girard D, Maltais F, Saey D, Milot J, Senechal M, et al. Effect of bisoprolol on respiratory function and exercise capacity in chronic obstructive pulmonary disease. Am J Cardiol. 2012;110(2):258–63.

McGavin CR, Williams IP. The effects of oral propranolol and metoprolol on lung function and exercise performance in chronic airways obstruction. Br J Dis Chest. 1978;72:327–32.

Ranchod SR. The effect of beta-blockers on ventilatory function in chronic bronchitis. South African Med J. 1982;61(12):423–4.

van der Woude HJ, Zaagsma J, Postma DS, Winter TH, van Hulst M, Aalbers R. Detrimental effects of β-blockers in COPD: a concern for nonselective β-blockers. Chest. 2005;127(3):818–24.

Butland RJ, Pang JA, Geddes DM. Effect of beta-adrenergic blockade on hyperventilation and exercise tolerance in emphysema. J Appl Physiol. 1983;54(5):1368–73.

Jabbal S, Lipworth BJ. Tolerability of bisoprolol on domiciliary spirometry in COPD. Lung. 2018;196(1):11–4.

Ellingsen J, Johansson G, Larsson K, Lisspers K, Malinovschi A, Stallberg B, et al. Impact of comorbidities and commonly used drugs on mortality in COPD—Real-world data from a primary care setting. Int J Chron Obstruct Pulmon Dis. 2020;15:235–45.

Sin DD, Anthonisen NR, Soriano JB, Agusti AG. Mortality in COPD: role of comorbidities. Eur Respir J. 2006;28(6):1245–57.

Dransfield MT, Rowe SM, Johnson JE, Bailey WC, Gerald LB. Use of beta blockers and the risk of death in hospitalised patients with acute exacerbations of COPD. Thorax. 2008;63(4):301–5.

Yang YL, Xiang ZJ, Yang JH, Wang WJ, Xu ZC, Xiang RL. Association of β-blocker use with survival and pulmonary function in patients with chronic obstructive pulmonary and cardiovascular disease: a systematic review and meta-analysis. Eur Heart J. 2020;41(46):4415–22.

Hjalmarson Å, Herlitz J, Malek I, Ryden L, Vedin A, Waldenström A, Wedel H, Elmfeldt D, Holmberg S, Nyberg G, Swedberg K. Effect on mortality of metoprolol in acute myocardial infarction: a double-blind randomised trial. Lancet. 1981;318(8251):823–7.

Hjalmarson Å, Goldstein S, Fagerberg B, Wedel H, Waagstein F, Kjekshus J, Wikstrand J, El Allaf D, Vítovec J, Aldershvile J, Halinen M. Effects of controlled-release metoprolol on total mortality, hospitalizations, and well-being in patients with heart failure: the Metoprolol CR/XL Randomized Intervention Trial in congestive heart failure (MERIT-HF). JAMA. 2000;283(10):1295–302.

Loth DW, Brusselle GG, Lahousse L, Hofman A, Leufkens HG, Stricker BH. beta-Adrenoceptor blockers and pulmonary function in the general population: the Rotterdam Study. Br J Clin Pharmacol. 2014;77(1):190–200.

Oda N, Miyahara N, Ichikawa H, Tanimoto Y, Kajimoto K, Sakugawa M, et al. Long-term effects of beta-blocker use on lung function in Japanese patients with chronic obstructive pulmonary disease. Int J Chron Obstruct Pulmon Dis. 2017;12:1119–24.

Funding

CG is funded by an NHLI studentship.

Author information

Authors and affiliations

National Heart and Lung Institute, Imperial College London, Manresa Road, London, UK

Claudia Gulea & Jennifer K. Quint

NIHR Imperial Biomedical Research Centre, London, UK

British Heart Foundation Centre for Research Excellence, King’s College London, London, UK

Rosita Zakeri

Homerton University Hospital NHS Foundation Trust, London, UK

Vanessa Alderman

Epsom and St. Helier University Hospitals NHS Trust, Epsom, UK

Alexander Morgan

Guy’s & St Thomas’ NHS Foundation Trust, London, UK

Royal Brompton & Harefield NHS Foundation Trust, London, UK

Jennifer K. Quint

Contributions

CG, RZ and JKQ made substantial contributions to the conception and design of the study. CG, VA, AM and JR screened abstracts and full-texts and extracted the data. CG carried out statistical analyses and wrote the first draft. CG, RZ, JKQ, VA, AM and JR contributed to data interpretation and provided revisions to the final manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Claudia Gulea .

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Competing interests

CG, RZ, VA, AM and JR have no conflict of interest. JKQ’s research group has received funds from AZ, GSK, The Health Foundation, MRC, British Lung Foundation, IQVIA, Chiesi, and Asthma UK outside the submitted work; grants and personal fees from GlaxoSmithKline, Boehringer Ingelheim, AstraZeneca, Bayer, Insmed outside the submitted work.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Figure S1.

Forest plot illustrating results of the meta-analysis evaluating the impact of beta-blocker therapy vs. no beta-blocker therapy on AECOPD in patients with COPD. Figure S2. Consistency results illustrating no significant difference between direct and indirect evidence across all comparisons that were assessed in the FEV1 network meta-analysis. Figure S3. Comparison-adjusted funnel plot. Figure S4. Network meta-analysis with meta-regression results (long vs. short follow-up). Figure S5. Network meta-analysis results for patients A) with COPD without explicit cardiovascular disease; B) with cardiovascular disease. Figure S6. Rankogram illustrating probabilities of being 1st, 2nd, 3rd…7th with respect to improvement in lung function, for each beta-blocker (and placebo) in patients with COPD without explicit cardiovascular disease. Figure S7. Rankogram illustrating probabilities of being 1st, 2nd, 3rd…7th with respect to improvement in lung function for each beta-blocker (and placebo) in patients with COPD with cardiovascular disease. Figure S8. Forest plot showing hazard ratios associated with A) cardioselective beta-blockers and B) non-cardioselective beta-blockers and mortality in patients with COPD. Figure S9. Risk of bias assessment, RCTs. Table S1. Screening criteria. Table S2. Summary of observational studies. Table S3. Patient characteristics—observational studies. Table S4. AECOPD estimates for beta-blocker versus no beta-blocker use, from individual observational studies. Table S5. Study characteristics—RCTs. Table S6. Baseline characteristics—RCTs. Table S7. FEV1 measurements—RCTs. Table S8. Network meta-analysis results—league table. Table S9. SUCRA ranking probability of being the best treatment. Table S10. Mortality estimates for beta-blocker versus no beta-blocker use, from individual studies. Table S11. All-cause hospitalization results. Table S12. SGRQ results. Table S13. 12MWT results. Table S14. 6MWT results. Table S15. SF-36 results. Table S16. Risk of bias assessment, observational studies. Table S17. GRADE assessment (mortality, quality of life). Table S18. GRADE assessment (AECOPD). Table S19. GRADE assessment for each pair-wise comparison within the NMA network (FEV1 analysis)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article.

Gulea, C., Zakeri, R., Alderman, V. et al. Beta-blocker therapy in patients with COPD: a systematic literature review and meta-analysis with multiple treatment comparison. Respir Res 22 , 64 (2021). https://doi.org/10.1186/s12931-021-01661-8

Received : 21 December 2020

Accepted : 10 February 2021

Published : 23 February 2021

DOI : https://doi.org/10.1186/s12931-021-01661-8

Keywords: Beta-blockers

Respiratory Research

ISSN: 1465-993X

  • Open access
  • Published: 15 February 2023

Literature review of stroke assessment for upper-extremity physical function via EEG, EMG, kinematic, and kinetic measurements and their reliability

  • Rene M. Maura   ORCID: orcid.org/0000-0001-6023-9038 1 ,
  • Sebastian Rueda Parra 4 ,
  • Richard E. Stevens 2 ,
  • Douglas L. Weeks 3 ,
  • Eric T. Wolbrecht 1 &
  • Joel C. Perry 1  

Journal of NeuroEngineering and Rehabilitation volume  20 , Article number:  21 ( 2023 ) Cite this article

Significant clinician training is required to mitigate the subjective nature of clinical assessments and achieve useful reliability between measurement occasions and therapists. Previous research supports that robotic instruments can improve quantitative biomechanical assessment of the upper limb, offering reliable and more sensitive measures. Furthermore, combining kinematic and kinetic measurements with electrophysiological measurements offers new insights to unlock targeted, impairment-specific therapy. This review presents common methods for analyzing biomechanical and neuromuscular data, describing their validity and reporting their reliability measures.

This paper reviews literature (2000–2021) on sensor-based measures and metrics for upper-limb biomechanical and electrophysiological (neurological) assessment, which have been shown to correlate with clinical test outcomes for motor assessment. The search terms targeted robotic and passive devices developed for movement therapy. Journal and conference papers on stroke assessment metrics were selected using PRISMA guidelines. Intra-class correlation values of some of the metrics are recorded, along with model, type of agreement, and confidence intervals, when reported.

A total of 60 articles were identified. The sensor-based metrics assess various aspects of movement performance, such as smoothness, spasticity, efficiency, planning, efficacy, accuracy, coordination, range of motion, and strength. Additional metrics assess abnormal activation patterns of cortical activity and interconnections between brain regions and muscle groups, aiming to characterize differences between the population who have had a stroke and the healthy population.

Range of motion, mean speed, mean distance, normal path length, spectral arc length, number of peaks, and task time metrics have all demonstrated good to excellent reliability, and provide a finer resolution than discrete clinical assessment tests. EEG power features for multiple frequency bands of interest, specifically the bands relating slow and fast frequencies when comparing affected and non-affected hemispheres, demonstrate good to excellent reliability for populations at various stages of stroke recovery. Further investigation is needed to evaluate the metrics that lack reliability information. In the few studies combining biomechanical measures with neuroelectric signals, the multi-domain approaches demonstrated agreement with clinical assessments and provide further information during the relearning phase. Combining the reliable sensor-based metrics into the clinical assessment process will provide a more objective approach, relying less on therapist expertise. This paper suggests future work on analyzing the reliability of metrics, to prevent bias and support selection of the appropriate analysis.

Stroke is one of the leading causes of death and disability in developed countries. In the United States, a stroke occurs every 40 s, ranking stroke as the fifth leading cause of death and the leading cause of disability in the country [ 1 ]. The high prevalence of stroke, coupled with increasing stroke survival rates, puts a growing strain on already limited healthcare resources; the cost of therapy is elevated [ 2 ] and therapy is restricted mostly to a clinical setting [ 3 ], with the result that 50% of survivors who reach the chronic stage experience severe upper-extremity motor disability [ 4 ]. This highlights the need for improved assessment, which can help pair person-specific impairment with appropriately targeted therapeutic strategies.

Rehabilitation typically starts with a battery of standardized tests to assess impairment and function. This initial evaluation serves as a baseline of movement capabilities and usually includes assessment of function during activities of daily living (ADL). Because these clinical assessments rely on trained therapists as raters, the scoring scale is designed to be discrete and, in some cases, bounded. While this improves the reliability of the metric [ 5 ] (i.e., raters more likely to agree), it also reduces the sensitivity of the scale. Furthermore, those assessment scales that are bounded, such as the Fugl-Meyer Assessment (FMA) [ 6 ], Ashworth or Modified Ashworth (MA) Scale [ 7 ], and Barthel Index [ 8 ], suffer from floor/ceiling effects where the limits of the scales become insensitive to the extremes of impairment and function. It is therefore important to develop new clinical assessment methods that are objective, quantifiable, reliable, and sensitive to change over the full range of function and impairment.

Over the last several decades, robotic devices have been designed and studied for administering post-stroke movement therapy, and these devices have begun to be adopted into clinical rehabilitation practice. More recently, researchers have proposed and studied the use of robotic devices to assess stroke-related impairments, as an approach to overcome the previously discussed limitations of existing clinical measures [ 9 , 10 , 11 , 12 ]. Robots may be equipped with sensitive measurement devices that can be used to rate the person’s performance in a predefined task, measuring kinematic (position/velocity), kinetic (force/torque), and/or neuromuscular (electromyography/electroencephalography) output from the subject during the task. Common sensor-based robotic metrics for post-stroke assessment include speed of response, planning time, movement planning, smoothness, efficiency, range, and efficacy [ 13 , 14 ]. Figure 1 demonstrates an example method for comprehensive assessment of a person who has suffered a stroke, with data acquired during robotically administered tests. Furthermore, there is potential for new and more comprehensive knowledge to be gained from a wider array of assessment methods and metrics that combine the benefits of biomechanical (e.g., kinematic and kinetic) and neurological (e.g., electromyographic and electroencephalographic) measures [ 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 ].

figure 1

Example of instrument for upper extremities bilateral biomechanical and neuromuscular assessment. From this data, a wide variety of measures and metrics for assessment of upper-extremity impairment and function may be reported

Biomechanical assessment

Many classical methods of assessing impairment or function involve manual and/or instrumented quantification of performance through measures of motion (i.e., kinematic) and force (i.e., kinetic) capabilities. These classical methods rely on the training of the therapist to evaluate the capabilities of the person through keen observation (e.g., FMA [ 6 ] and MA [ 7 ]). The quality of kinematic and kinetic measures can be improved with the use of electronic-based measurements [ 23 ]. Robotic devices equipped with electronic sensors have the potential to improve the objectivity, sensitivity, and reliability of the assessment process by providing more quantitative, precise, and accurate information [ 9 , 10 , 11 , 12 , 24 , 25 , 26 , 27 , 28 ]. Usually, the electronic sensors on a rehabilitation robotic device are used for control purposes [ 29 , 30 , 31 ], but robots can also measure movement outputs, such as forces or joint velocities, which the clinician may not be able to measure as accurately (or simultaneously) using existing clinical assessment methods [ 23 ]. With accurate and repeatable measurement of forces and joint velocities, sensor-based assessments have the potential to assess the person’s movement in an objective and quantifiable way. This article reviews the validity and reliability of biomechanical metrics in relation to the assessment of upper-extremity motor function.

Electrophysiological features for assessment

Neural signals that originate from the body can be measured using non-invasive methods. Among others, electroencephalograms (EEG) measure cortical electrical activity, and electromyograms (EMG) measure muscle electrical activity. The relatively low cost and noninvasive nature of these technologies make them suitable for studying changes in cortical or muscle activation caused by conditions or injuries of the brain, such as those elicited by stroke lesions [ 32 ].

Initially, EMG/EEG were used strictly as clinical diagnostic tools [ 33 , 34 ]. Recent improvements in signal acquisition hardware and computational processing methods have increased their use as viable instruments for understanding and treating neuromuscular diseases and neural conditions [ 32 ]. Features extracted from these signals are being researched to assess their relationship to motor and cognitive deficits [ 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 ] and delayed ischemia [ 34 , 43 ], as well as to identify different uses of the signals that could aid rehabilitation [ 44 ]. Applications of these features in the context of stroke include: (1) commanding robotic prostheses [ 45 , 46 ], exoskeletons [ 21 , 47 , 48 ], and brain-machine interfaces [ 44 , 49 , 50 , 51 ]; and (2) bedside monitoring for sub-acute patients and thrombolytic therapy [ 52 , 53 , 54 ]. Here we review the validity and reliability of metrics derived from electrophysiological signals in relation to upper-extremity motor assessment after stroke.

Reliability of metrics

Robotic or sensor-based assessment tools have not gained widespread clinical acceptance for stroke assessment. Numerous barriers to their clinical adoption remain, including demonstrating their reliability and providing sufficient validation of robotic metrics with respect to currently accepted assessment techniques [ 55 ]. In the assessment of motor function with sensor-based systems, several literature reviews reveal a wide spectrum of sensor-based metrics for stroke rehabilitation and demonstrate their validity [ 13 , 42 , 56 , 57 , 58 , 59 , 63 , 64 ]. However, in addition to demonstrating validity, new clinical assessments must also demonstrate good or excellent reliability to support their adoption in the clinical field. This is achieved by: (1) comparing multiple measurements on the same subject (test–retest reliability), and (2) checking agreement between multiple raters of the same subject (inter-rater reliability). Reliability quantifies an assessment’s ability to deliver scores that are free from measurement error [ 65 ]. Previous literature reviews have presented limited, if any, information on the reliability of biomechanical robotic metrics. Murphy and Häger [ 66 ], Wang et al. [ 56 ], and Shishov et al. [ 67 ] reviewed reliability but omitted important aspects of the intra-class correlation methods used in the studies (e.g., the model type and/or the confidence interval), which are required when analyzing intra-class correlations for reliability [ 68 ]. If reliability is not properly analyzed and reported, a study runs the risk of producing biased results. Murphy and Häger [ 66 ] also noted a lack of studies determining the reliability of metrics as of 2015. Since electronic-based assessments require a therapist or an operator to administer the test, inter-observer reliability should also be investigated to determine the effect of the test administrator on the assessment process. Therefore, both test–retest and inter-observer reliability of biomechanical and electrophysiological metrics are reviewed here to provide updated information on the current findings of the metrics’ reliability.

Integrated metrics

Over the past 50 years, numerous examples of integrated metrics have provided valuable insight into the inner workings of human arm function. In the 1970s, EMG was combined with kinematic data in patients with spasticity to understand muscle patterns during ballistic arm reach movements [ 69 ] and the effects of pharmacological intervention on spastic stretch reflexes during passive vs. voluntary movement [ 70 ]. In the 1990s, EMG was combined with kinetic data to understand the effects of abnormal synergy patterns on reach workspace when lifting the arm against gravity [ 71 ]. This work dispelled long-standing theories that muscular weakness and spasticity alone were the major contributors to arm impairment. More recently, quantified aspects of processed EEG and EMG signals have been combined with kinematic data to investigate the compensatory role of the contralesional secondary sensorimotor cortex, and its relation to shoulder-related abnormal muscle synergies, in a group of chronic stroke survivors [ 72 ]. These and other works convincingly demonstrate the value of combined metrics and the insights they can uncover that isolated metrics cannot provide alone.

To provide further information on stroke severity and the relearning process during stroke therapy, researchers are investigating multi-modal approaches using biomechanical and neuromuscular features [ 15 , 16 , 18 , 19 , 21 , 22 ]. Combining neuromuscular and biomechanical metrics can provide a comprehensive assessment of the person’s movement, from motor planning to the end of motor execution. Neuromuscular output provides valuable information on feedforward control and the movement planning phase [ 22 ]. However, neuromuscular signals provide little information on movement quality, which is often investigated with movement function tests or biomechanical output [ 21 ]. Neuromuscular data also provide therapists with information on the neurological status and nervous system reorganization of the person that biomechanical information cannot [ 73 ]. This additional information can assist in developing more personalized care for the person with stroke, as well as offer considerable insight into the changes that occur at the physiological level.

Paper overview

This paper reviews published sensor-based methods for biomechanical and neuromuscular assessment of impairment and function after neurological damage, and how the metrics resulting from these assessments, both alone and in combination, may provide further information on the recovery process. Specifically, methods and metrics utilizing digitized kinematic, kinetic, EEG, and EMG data were considered. The “Methods” section explains how the literature review was performed. In the “Measures and methods based on biomechanical performance” section, prevailing robotic assessment metrics are identified and categorized, including smoothness, resistance, efficiency, accuracy, efficacy, planning, range of motion, strength, inter-joint coordination, and intra-joint coordination. In the “Measures and methods based on neural activity using EEG/EMG” section, EEG- and EMG-derived measures are discussed by the primary category of analysis performed to obtain them, including frequency power and coherence analyses. The relationship of each method and metric to stroke impairment and/or function is also discussed. The “Reliability of measures” section discusses the reliability of sensor-based metrics and some of the complications in demonstrating their effectiveness. The “Integrated metrics” section reviews previous studies combining biomechanical and neuromuscular data to provide further information on the changes occurring during assessment and training. Finally, the “Discussions and conclusions” section concludes the paper with a discussion of the advantages of combining multi-domain data, which of the metrics from the earlier sections should be considered in future robotic applications, and which still require more investigation for validity and/or reliability.

A literature review was performed following PRISMA guidelines [ 74 ] on biomechanical and neuromuscular assessment in upper-limb stroke rehabilitation. The review was composed of two independent searches on (1) biomechanical robotic devices, and (2) electrophysiological digital signal processing. Figures  2 and 3 show the selection process for the electrophysiological and biomechanical papers, respectively. Each search applied the following steps. In step 1, each researcher searched Google Scholar for papers published between 2000 and 2021 (see Table 1 for search terms and prompts). In step 2, the resulting titles and abstracts were screened to remove duplicates, articles in other languages, and articles not related to the literature review. In step 3, researchers read the full texts of the articles screened in step 2 and selected the papers qualifying for inclusion under the Literature Review Criteria in Table 1 . Finally, in step 4, the articles selected in each independent review were read by the other researcher. Uncertainties in determining whether a paper should be included/excluded were discussed with the whole research group. Twenty-four papers focus on biomechanical measures (kinematic and kinetic), thirty-three focus on electrophysiological measures (EEG/EMG), and six focus on multimodal approaches combining biomechanical and neuromuscular measures to assess stroke. Three of the six multimodal papers are also reported in the biomechanical section, and three papers were hand-picked. A total of 60 papers are reviewed and reported.

figure 2

PRISMA flowchart on the selection for electrophysiological papers

figure 3

PRISMA flow chart for the selection for biomechanical papers

Measures and methods based on biomechanical performance

This review presents common robotic metrics which have been previously used to assess impairment and function after stroke. Twenty-five biomechanical papers are reviewed, which used both sensor-based and traditional clinical metrics to assess upper-extremity impairment and function. The five most common metrics in the reviewed studies measured the number of velocity peaks (~ 9 studies), path-length ratio (~ 8 studies), maximum arm speed (~ 7 studies), active range of motion (~ 7 studies), and movement time (~ 7 studies). The metrics are often compared to an established clinical assessment to determine their validity. The sensor-based metrics can be categorized by the aspect of movement quality they evaluate, similar to De Los Reyes-Guzmán et al.: smoothness, efficiency, efficacy, accuracy, coordination, or range of motion [ 14 ]. Resistance, movement planning, and strength are included as additional categories since some of the reviewed sensor-based metrics best evaluate those movement aspects. Examples of common evaluation activities and specific metrics that have been computed to quantify movement quality are outlined in Table 2 .

Lack of arm movement smoothness is a key indicator of underlying impairment [ 79 ]. Traditional therapist-administered assessments do not computationally measure smoothness, leaving therapists unable to determine the degree to which disruption to movement smoothness is compromising motor function and, therefore, ADL. Most metrics that have been developed to quantify smoothness are based on features of the velocity profile of an arm movement, such as speed [ 80 , 81 ], speed arc length [ 79 ], local minima of velocity [ 10 ], velocity peaks [ 75 , 76 , 81 ], tent [ 80 ], spectral [ 25 ], spectral arc length [ 25 , 81 ], modified spectral arc length [ 79 ], and mean arrest period ratio [ 76 ]. Table 3 summarizes the smoothness metrics and their corresponding equations with equation numbers for reference. The speed metric is expressed as a ratio between the mean speed and the peak speed (Eq. 1). The speed arc length is the temporal length of the velocity profile (Eq. 2). The local minima of velocity and the velocity peaks metrics are measured by counting the number of minimum (Eq. 3) or maximum (Eq. 4) peaks in the velocity profile, respectively. The tent metric is a graphical approach that divides the area under the velocity curve by the area of a single-peak velocity curve (Eq. 5). The spectral metric is the summation of the maximal Fourier-transformed velocity vector (Eq. 6). The spectral arc-length metric is calculated from the frequency spectrum of the velocity profile by performing a fast Fourier transform operation and then computing the length (Eq. 7). The modified spectral arc length adapts the cutoff frequency according to a given threshold velocity and an upper-bound cutoff frequency (Eq. 8), making it independent of temporal movement scaling. The mean arrest period ratio is the portion of time that movement speed exceeds a given percentage of peak speed (Eq. 9).
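As an illustration, the first of these velocity-profile metrics can be sketched in a few lines of NumPy. This is a hypothetical implementation for clarity, not the cited authors' code; the equation references follow Table 3.

```python
import numpy as np

def speed_metric(v):
    """Ratio of mean speed to peak speed (Eq. 1 style); values closer
    to 1 indicate a flatter, smoother speed profile."""
    v = np.asarray(v, dtype=float)
    return v.mean() / v.max()

def count_velocity_peaks(v):
    """Number of local maxima in the speed profile (Eq. 4 style);
    fewer peaks suggest fewer corrective sub-movements."""
    v = np.asarray(v, dtype=float)
    # A sample is a peak if it strictly exceeds both neighbors.
    return int(np.sum((v[1:-1] > v[:-2]) & (v[1:-1] > v[2:])))

# A single bell-shaped reach vs. one with a superimposed correction.
t = np.linspace(0.0, 1.0, 201)
smooth_speed = np.sin(np.pi * t)
jerky_speed = np.sin(np.pi * t) * (1.0 + 0.3 * np.sin(4.0 * np.pi * t))
```

On these synthetic profiles, the smoother reach yields a single velocity peak and a higher mean-to-peak speed ratio than the modulated one.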

Another commonly used approach is to analyze the jerk (i.e., the derivative of acceleration) profile. The common ways to assess smoothness using the jerk profile are root mean square jerk, mean rectified jerk, normalized jerk, and the logarithm of dimensionless jerk. The root mean square jerk takes the root-mean-square of the jerk that is then normalized by the movement duration [ 82 ] (Eq. 10). The mean rectified jerk (normalized mean absolute jerk) is the mean of the magnitude jerk normalized or divided by the peak velocity [ 80 , 82 ] (Eq. 11). The normalized jerk (dimensionless-squared jerk) is the square of the jerk times the duration of the movement to the fifth power over the length squared (Eq. 12). It is then integrated over the duration and square rooted. The normalized jerk can be normalized by mean speed, max speed, or mean jerk [ 80 ]. The logarithm of dimensionless jerk (Eq. 13) is the logarithm of normalized jerk defined in Eq. 12 [ 81 ].
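A hedged sketch of the log dimensionless jerk follows, using finite differences on sampled 1D position data and the duration^5/length^2 scaling described for Eqs. 12 and 13. This is illustrative only; published implementations differ in the normalization term they choose (e.g., peak velocity vs. path length).

```python
import numpy as np

def log_dimensionless_jerk(pos, dt):
    """Log dimensionless jerk (Eqs. 12-13 style): integrate squared jerk
    over the movement, scale by duration^5 / length^2, take the square
    root, then the log. Larger values indicate less smooth movement."""
    pos = np.asarray(pos, dtype=float)
    vel = np.gradient(pos, dt)       # successive finite differences
    acc = np.gradient(vel, dt)
    jerk = np.gradient(acc, dt)
    duration = dt * (len(pos) - 1)
    length = np.abs(np.diff(pos)).sum()      # 1D path length
    integral = np.sum(jerk ** 2) * dt        # rectangle-rule integration
    return float(np.log(np.sqrt(duration ** 5 / length ** 2 * integral)))

# Minimum-jerk reach vs. the same reach with a tremor-like ripple.
t = np.linspace(0.0, 1.0, 501)
min_jerk = 10 * t**3 - 15 * t**4 + 6 * t**5
rippled = min_jerk + 0.02 * np.sin(8 * np.pi * t)
```

The minimum-jerk profile scores near its analytic value (0.5 ln 360 ≈ 2.94), while the rippled trajectory scores markedly higher, i.e., less smooth.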

It has yet to be determined which smoothness metric is more effective for characterizing recovery of smooth movement. According to Rohrer et al. [ 80 ], the metrics of speed, local minima of velocity, peaks, tent, and mean arrest period ratio showed increases in smoothness for inpatient recovery from stroke, but the mean rectified jerk metric seemed to show a decrease in smoothness as survivors of stroke recovered. Rohrer et al. warned that a low smoothness factor in jerk does not always mean the person is highly impaired. The spectral arc-length metric showed a consistent increase in smoothness as the number of sub-movements decreased [ 25 ], whereas the other metrics showed sudden changes in smoothness. For example, the mean arrest period ratio and the speed metric showed an increase in smoothness with two or more sub-movements, but when two sub-movements started to merge, the smoothness decreased. As a result, the spectral arc-length metric appears to capture change over a wider range of movement conditions in recovery in comparison to other metrics.

The presence of a velocity-dependent hyperactive stretch reflex is referred to as spasticity [ 83 ]. Spasticity results in a lack of smoothness during both passive and active movements and is more pronounced with activities that involve simultaneous shoulder abduction loading and extension of the elbow, wrist, or fingers [ 83 ], which are unfortunately quite common in ADL. A standard approach to assessing spasticity by a therapist involves moving a subject’s passive arm at different velocities and checking for the level of resistance. While this manual approach is subjective, electronic sensors have the potential to assess severity of spasticity in much more objective ways. Centen et al. report a method to assess the spasticity of the elbow using an upper-limb exoskeleton [ 84 ] involving the measurement of peak velocity, final angle, and creep. Sin et al. similarly performed a comparison study between a therapist moving the arm versus a robot moving the arm. An EMG sensor was used to detect the catch and compared with a torque sensor to detect catch angle for the robotic motion [ 85 ]. The robot moving the arm seemed to perform better with the inclusion of either an EMG or a torque sensor than with the therapist moving the arm and the robot simply recording the movement. A related measure that may be correlated with spasticity is the assessment of joint resistance torques during passive movement [ 76 ]. This can provide an assessment of the velocity-dependent resistance to movement that arises following stroke.

Efficiency measures movement fluency in terms of both task completion times and spatial trajectories. In point-to-point reaching, people who have suffered a stroke commonly display inefficient paths in comparison to their healthy side or to unimpaired subjects [ 10 ]. During the early phases of recovery after stroke, subjects may show slow overall movement speed resulting in longer task times. As recovery progresses, overall speed tends to increase and task times decrease, indicating more effective and efficient motor planning and path execution. Therapists usually observe the person’s efficiency in completing a task and then rate the person’s ability to complete the task in a timely manner. Therefore, both task time (or movement time) [ 10 , 76 , 77 , 86 , 87 ] and mean speed [ 25 , 75 , 77 , 81 , 86 ] are effective ways to assess temporal efficiency. Similar measures used by Wagner et al. include peak-hand velocity and time to peak-hand velocity [ 87 ]. To measure spatial efficiency of movement, Colombo et al. [ 75 ], Mostafavi [ 77 ], and Germanotta [ 86 ] all calculated the movement path length and divided it by the straight-line distance between the start and end points. This is known as the path-length ratio.
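The path-length ratio reduces to a few lines of code. The sketch below is illustrative (not the cited authors' implementation) and assumes the hand path is available as an (N, 2) array of planar samples.

```python
import numpy as np

def path_length_ratio(path):
    """Path-length ratio: distance actually travelled divided by the
    straight-line distance from start to end. A value of 1.0 is a
    perfectly straight reach; larger values indicate spatial detours."""
    path = np.asarray(path, dtype=float)
    # Sum of segment lengths along the recorded hand path.
    travelled = np.linalg.norm(np.diff(path, axis=0), axis=1).sum()
    return travelled / np.linalg.norm(path[-1] - path[0])
```

A straight reach scores exactly 1.0, while an L-shaped detour from (0, 0) to (1, 1) via (1, 0) scores sqrt(2) ≈ 1.41.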

Movement planning

Movement planning is associated with feedforward sensorimotor control, elements that occur before the initial phase of movement. A common approach is to use reaction time to assess the duration of the planning phase. In a typical clinical assessment, a therapist can only observe/quantify whether movement can be initiated or not, but has no way to quantify the lag between the signal to initiate movement and initiation of movement. Keller et al., Frisoli et al., and Mostafavi et al. quantified the reaction time to assess movement planning [ 10 , 76 , 77 ] in subjects who have suffered a stroke. Mostafavi assessed movement planning in three additional ways by assessing characteristics of the actual movement: change in direction, movement distance ratio, and maximum speed ratio [ 77 ]. The change in direction is the angular deviation between the initial movement vector and the straight line between the start and end points. The first-movement-distance ratio is the ratio between the distance the hand traveled during the initial movement and the total distance between start and end points. The first-movement-maximum speed ratio is the ratio of the maximum hand speed during the initial phase of the movement divided by the global hand speed for the entire movement task.

Movement efficacy 

Movement efficacy measures the person’s ability to achieve the desired task without assistance. While therapists can assess the number of completed repetitions, they have no means to kinetically quantify the amount of assistance required to perform a given task. Movement efficacy is quantified by robot sensor systems that can measure: (a) person-generated movement, and/or (b) the amount of work performed by the robot to complete the movement (e.g., when voluntary person-generated movement fails to achieve a target). Hence, movement efficacy can involve both kinematic and kinetic measures. A kinematic metric that can represent movement efficacy is the active movement index, which is calculated by dividing the distance the person completes voluntarily by the total target distance for the task [ 75 ]. An example metric based on kinetic data is the amount-of-assistance metric, proposed by Balasubramanian et al. [ 25 ]. It is calculated by estimating the work performed by the robot to assist voluntary movement and dividing it by the work the robot would perform if it completed the task without any voluntary contribution from the person. A similar metric by Germanotta et al. calculates the total work using the movement’s path length, but also calculates the work generated towards the target [ 86 ].
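The two efficacy metrics above can be sketched as simple ratios. The helper names below are hypothetical; in practice the distance and work terms would come from the robot's position and force logs.

```python
def active_movement_index(active_distance, target_distance):
    """Kinematic efficacy in the spirit of [75]: fraction of the target
    distance covered by the person's own voluntary movement."""
    return active_distance / target_distance

def amount_of_assistance(robot_work_assisting, robot_work_alone):
    """Kinetic efficacy in the spirit of [25]: work the robot performed
    to assist voluntary movement, relative to the work it would perform
    moving the limb through the whole task by itself."""
    return robot_work_assisting / robot_work_alone
```

Both ratios fall in [0, 1] for a completed task: an active movement index near 1 means mostly voluntary movement, and an amount of assistance near 0 means the robot contributed little work.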

Movement accuracy

Movement accuracy has been characterized by the error in the end-effector trajectory compared to a theoretical trajectory. It measures the person’s ability to follow a prescribed path, whereas movement efficiency assesses the person’s ability to find the most ideal path to reach a target. Colombo et al. measured movement accuracy in people after stroke by calculating the mean-absolute value of the distance, which is the mean absolute value of the distance between each point on the person’s path and the theoretical path [ 75 ]. Figure  4 demonstrates the difference between the path-length ratio and the mean-absolute value of the distance: the mean-absolute value of the distance computes the error between the desired trajectory and the actual trajectory, whereas the path-length ratio computes the total path length the person’s limb has traveled. Another similar metric is the average inter-quartile range, which quantifies the average “spread” among several trajectories [ 15 ]. Balasubramanian et al. characterized movement accuracy as a measure of the subject’s ability to achieve a target during active reaching. They refer to the metric as movement synergy [ 25 ] and calculate it by finding the distance between the end-effector’s final location and the target location.

figure 4

Difference between path-length ratio and mean absolute value of the distance. A Path-length ratio. \(d_{ref}\) is the theoretical distance the hand should travel between the start and end point. \(d_{total}\) is the total distance the hand travelled from Start to End. B Mean absolute value of the distance. \(d_{i}\) is the distance between the theoretical path and the actual hand path
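The mean-absolute value of the distance of Fig. 4B can be sketched as follows for a planar path. This is an illustrative implementation, not the cited authors' code; it assumes a straight-line theoretical path from start to end.

```python
import numpy as np

def mean_absolute_distance(path, start, end):
    """Mean absolute value of the distance: average perpendicular
    distance of each sample of the actual 2D hand path from the straight
    reference line from start to end (the d_i of Fig. 4B)."""
    path = np.asarray(path, dtype=float)
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    u = (end - start) / np.linalg.norm(end - start)  # unit line direction
    rel = path - start
    # Perpendicular distance via the magnitude of the 2D cross product.
    return float(np.mean(np.abs(rel[:, 0] * u[1] - rel[:, 1] * u[0])))
```

A path lying on the reference line scores 0, and a path offset uniformly by 0.5 scores 0.5.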

Intra-limb coordination

Intra-limb (inter-joint) coordination is a measure of the level of coordination achieved by individual joints of a limb or between multiple joints of the same limb (i.e., joint synergy) when performing a task. Since the upper limb contains kinematic redundancies, the human arm can achieve a desired outcome in multiple ways. For example, a person might choose to move an atypical joint in order to compensate for a loss of mobility in another joint. Frisoli et al. and Bosecker et al. used the shoulder and elbow angles to find a linear correlation between the two angles in a movement task that required multi-joint movement [ 10 , 78 ]. In terms of clinical assessment, joint angle correlations can illustrate typical or atypical contribution of a joint while performing a multi-joint task.

Inter-limb coordination

Inter-limb coordination refers to a person’s ability to appropriately perform bilateral movements with the affected and unaffected arms. Therapists often assess the affected limb by comparing it to the unaffected limb during a matching task, such as position matching. Matching can be accomplished with both limbs moving simultaneously or sequentially, and typically without the use of vision. Dukelow et al. used position matching to obtain measures of inter-limb coordination [ 24 ], including trial-to-trial variability, spatial contraction/expansion, and systematic shifts. Trial-to-trial variability is the standard deviation of the matching hand’s position for each location in x (distal/proximal), in y (anterior/posterior), and combined in x and y in the transverse plane. Spatial contraction/expansion is the ratio of the 2D work area of the target hand to the 2D work area of the matching hand during a matching task. Systematic shifts were found by calculating the mean absolute position error between the target and matching hand for each target location.
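The trial-to-trial variability measure might be sketched as follows. This is a hypothetical implementation, assuming the matching hand's end positions for one target location have already been extracted from the robot logs.

```python
import numpy as np

def trial_to_trial_variability(matched):
    """Trial-to-trial variability for one target location: standard
    deviation of the matching hand's end positions across repeated
    trials, in x, in y, and combined in the transverse plane.
    `matched` is an (n_trials, 2) array of x/y end positions."""
    m = np.asarray(matched, dtype=float)
    sx, sy = m.std(axis=0, ddof=1)       # sample standard deviations
    return {"x": float(sx), "y": float(sy), "xy": float(np.hypot(sx, sy))}
```

For four matching trials ending at the corners of a 2 cm square, the variability is sqrt(4/3) cm per axis and sqrt(8/3) cm combined.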

Semrau et al. analyzed the performance of subjects in their ability to match their unaffected arm with the location of their affected arm [ 88 ]. In the experiment, a robot moved the affected arm to a position and the person then mirrored the position with the unaffected side. The researchers compared the data when the person was able to see the driven limb versus when they were unable to see the driven limb. The initial direction error, path length ratio, response latency, peak speed ratio, and their variabilities were calculated to assess the performance of the person’s ability to perform the task.

Range of motion

Range of motion is a measure of the extent of mobility in one or multiple joints. Traditionally, range of motion is measured with a goniometer [ 89 ], which measures one joint at a time and takes considerable time. Range of motion can be expressed as a 1-DOF angular measure [ 76 , 89 ], a 2-DOF planar measure (i.e., work area) [ 82 ], or a 3-DOF spatial measure (i.e., workspace) [ 77 ]. Individual joints are commonly measured in joint space, whereas measures of area or volume are typically given in Cartesian space. In performing an assessment of work area or workspace with a robotic device, the measure can be estimated either by: (a) measuring individual joint angles with an exoskeleton device and then using these angles to compute the region swept out by the hand, or (b) directly measuring the hand or fingertips with a Cartesian (end-effector) device. The measurement of individual joint range of motion (ROM), as well as overall workspace, has significant clinical importance in assessing both passive (pROM) and active (aROM) range of motion. To measure pROM, the robot drives arm movement while the person remains passive; the pROM is the maximum range of motion the person has with minimal or no pain. For aROM, a robot may place the arm in an initial position/orientation from which the person performs unassisted joint movements to determine the ROM of particular joints [ 76 ], or the area or volume swept by multiple joints. Lin et al. quantified the work area of the elbow and shoulder using potentiometers and derived test–retest reliability [ 89 ]. The potentiometer measurements were then compared to therapist measurements to determine validity.
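For a 2-DOF work area, the region traced by the hand can be estimated with the shoelace formula. This is a sketch under the assumption that a closed boundary of the swept region is already available; an exoskeleton pipeline would first map joint angles to hand positions through the device's kinematics.

```python
import numpy as np

def work_area(boundary):
    """Planar (2-DOF) work area from a closed boundary traced by the
    hand, via the shoelace formula. `boundary` is an (N, 2) array of
    x/y points ordered around the region; closure is implicit."""
    b = np.asarray(boundary, dtype=float)
    x, y = b[:, 0], b[:, 1]
    # Shoelace: half the absolute signed area of the polygon.
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
```

A unit square boundary yields an area of 1.0, and a right triangle with unit legs yields 0.5.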

Measures of strength evaluate a person’s ability to generate a force in a direction or a torque about a joint. Strength measurements may involve single or multiple joints. At the individual joint level, strength is typically measured from a predefined position of a person’s arm and/or hand. The person then applies a contraction to produce a torque at the assessed joint [ 76 , 78 ]. Multi-joint strength may also be measured by assessing strength and/or torque in various directions at distal locations along the arm, such as the hand. Lin et al. compared the grip strength obtained from load cells to a clinical method using precise weights, which showed excellent concurrent validity [ 89 ].

Measures and methods based on neural activity using EEG/EMG

Although much information can be captured and analyzed using the kinematic and kinetic measures listed above, their purview is limited. These measures provide insight into the functional outcomes of neurological system performance but limited perspective on the potential contributing sources of measured impairment [ 90 ]. For a deeper look into the neuromuscular system, measures based on neurological activation are often pursued. As a complement to biomechanical measures, methods based on quantification of neural activity, such as EEG and EMG, have been used to characterize the impact of stroke and its underlying mechanisms of impairment [ 91 , 92 ]. Over the past 20 years, numerous academic research studies have used these measures to explore the effects of stroke, therapeutic interventions, or time on the evolution of abnormal neural activity [ 91 ]. Groups with different levels of neurological health (e.g., chronic/acute/subacute stroke vs. non-impaired, or different impairment levels) or with other specific experimental characteristics (e.g., different rehabilitation paradigms [ 93 , 94 ]) are commonly compared. With this evidence, the validity of these metrics has been tested; however, studies of their reliability are still needed to complete the jump from academic to clinical settings.

Extracting biomarkers from non-invasive neural activity requires careful decomposition and processing of raw EEG and EMG recordings [ 32 ]. Various methods have been used, and the results have produced a growing body of evidence for the validity of these biomarkers in providing insight on the current and future state of motor, cognitive, and language skills in people after stroke [ 38 , 95 ]. Some of the biomarkers derived from EEG signals include: power-related band-specific information [ 34 , 35 , 43 , 47 , 53 , 54 , 96 , 97 , 98 , 99 , 100 , 101 ], band frequency event-related synchronization and desynchronization (ERS/ERD) [ 22 , 51 , 102 , 103 ], intra-cortical coherence or functional connectivity [ 39 , 59 , 73 , 94 , 104 , 105 , 106 , 107 , 108 , 109 ], corticomuscular coherence (CMC) [ 37 , 110 , 111 , 112 , 113 ], among others [ 114 , 115 ]. Biomarkers extracted from EEG can be used to assess residual functional ability [ 38 , 54 , 73 , 97 , 98 , 99 ], derive prognostic indicators [ 34 , 43 , 104 ], or categorize people into groups (e.g., to better match impairments with therapeutic strategies) [ 39 , 47 , 58 , 116 ].

In the following subsections, valid biomarkers derived mostly from EEG signal features (i.e., with an established relationship to motor outcome after stroke) are discussed and introduced theoretically. Distinctions are made about the stage after stroke at which signals were taken. Findings are reported from 33 studies that have examined the relationship between extracted neural features and motor function for different groups of people after stroke. These records are grouped by the quantification method used, including approaches based on measures of frequency spectrum power (n = 9), inter-regional coherence (n = 10 for cortical coherence and n = 9 for CMC), and reliability (n = 5).

Frequency spectrum power

Power measures the amount of activity within a signal that occurs at a specific frequency or range of frequencies. Power can be computed in absolute or relative terms (i.e., with respect to other signals). It is often displayed as a power density spectrum where the magnitudes of signal power can be seen across a range of frequencies. In electro-cognitive research, the representation of power within specific frequency bands has been useful to explain brain activity and to characterize abnormal oscillatory activity due to regional neurological damage [ 32 , 117 ].

Frequency bands in EEG content

Electrical activity in the brain is dominated primarily by frequencies from 0–100 Hz where different frequency bands correspond with different states of activity: Delta (0–4 Hz) is associated with deep sleep, Theta (4–8 Hz) with drowsiness, Alpha (8–13 Hz) with relaxed alertness and important motor activity [ 117 ], and Beta (13–31 Hz) with focused alertness. Gamma waves (> 32 Hz) are also seen in EEG activity; however, their specific relationship to level of alertness or consciousness is still debated [ 32 , 117 ]. Important cognitive tasks have been found to trigger activity in these bands in different ways. Levels of both Alpha and Delta activity have also been shown to be affected by stroke and can therefore be examined as indicators of prognosis or impairment in sub-acute and chronic stroke [ 52 , 100 , 118 ].

Power in acute and sub-acute stroke

For individuals in the early post-stroke (i.e., sub-acute) phase, abnormal power levels can be an indicator of neurological damage [ 98 ]. Attenuation of activity in the Alpha and Beta bands has been observed in the first hours after stroke [ 100 ], preceding the appearance of abnormally high Delta activity. Tolonen et al. reported a high correlation between Delta power and regional Cerebral Blood Flow (rCBF). This relationship appears during the sub-acute stroke phase and has been used to predict clinical, cognitive, and functional outcomes [ 119 ]. Delta activity has also been shown to positively correlate with 1-month National Institutes of Health Stroke Scale (NIHSS) [ 52 ] and 3-month Rankin scale [ 36 ] assessments.

Based on these findings, several QEEG (Quantitative Electroencephalography) metrics involving ratios of abnormal slow (Delta) and abnormal fast (Alpha and Beta) activity have been developed. The Delta-Alpha Ratio (DAR), Delta-Theta Ratio (DTR), and (Delta + Theta)/(Alpha + Beta) Ratio (DTABR, also known as PRI for Power Ratio Index) relate the amount of abnormal slow activity to the activity of faster bands and have been shown to provide valuable insight into prognosis of stroke outcome and thrombolytic therapy monitoring [ 98 ]. Increased DAR and DTABR have repeatedly been found to be the QEEG indices that best predict worse outcomes, as measured by the following: the Functional Independence Measure and Functional Assessment Measure (FIM-FAM) at 105 days [ 53 ], Montreal Cognitive Assessment (MoCA) at 90 days [ 54 ], NIHSS at 1 month [ 35 ], modified Rankin Scale (mRS) at 6 months [ 105 ], NIHSS evolution at multiple times [ 120 ], and NIHSS at 12 months [ 96 ]. DAR was also used to distinguish people in the acute phase of stroke from healthy subjects with an accuracy of 100% [ 58 ].
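The three ratio indices named above are simple functions of the absolute band powers. The sketch below makes their definitions explicit; the band-power values passed in are illustrative placeholders, not patient data.

```python
# Sketch of the QEEG ratio indices (DAR, DTR, DTABR/PRI) from band powers.
def qeeg_ratios(delta, theta, alpha, beta):
    """Return (DAR, DTR, DTABR) from absolute band powers."""
    dar = delta / alpha                       # Delta-Alpha Ratio
    dtr = delta / theta                       # Delta-Theta Ratio
    dtabr = (delta + theta) / (alpha + beta)  # also called PRI
    return dar, dtr, dtabr

# Illustrative values: abnormally high slow activity relative to fast bands.
dar, dtr, dtabr = qeeg_ratios(delta=40.0, theta=20.0, alpha=10.0, beta=10.0)
# Higher DAR/DTABR -> more abnormal slow activity, i.e., worse prognosis
# in the studies cited above.
```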

The ability of basic EEG monitoring to derive useful metrics during the early stage of stroke has made EEG collection desirable for people who have suffered a stroke in intensive care settings. Derived QEEG indices, such as increased DAR [ 43 ] and increased Delta power [ 34 , 118 ], have proven helpful for detecting Delayed Cerebral Ischemia (DCI). However, finding the electrode montage with the fewest electrodes that still reveals the necessary information for prognosis is one of the biggest challenges for this particular use of EEG. Comparing DAR from 19 electrodes on the scalp with 4 electrodes on the frontal cortex suggests that DAR from 4 frontal electrodes may be enough to detect early cognitive and functional deficits [ 53 ]. Studies have explored the possibility of a single-electrode montage over the Fronto-Parietal area (FP1): the DAR and DTR from this electrode may be valid predictors of cognitive function after stroke, as correlated with the MoCA [ 54 ], and relative Theta-band power from this electrode correlated with mRS and modified Barthel Index (mBI) scores 30 and 90 days after stroke [ 121 ].

Power in chronic stroke

The role of power-related QEEG indices during chronic stroke and the progression of motor functional performance has been examined with respect to rehabilitation therapies, since participants have recovered their motion to a certain degree [ 4 ]. Studies have shown that therapy and functional activity improvements correlate with changes in the shape and delay of event-related desynchronization and synchronization (ERD-ERS) for time–frequency power features when analyzing Alpha and Beta bands over the primary motor cortex for the ipsilesional and contralesional hemispheres [ 21 , 22 , 122 ]. Therapies with better outcomes tend to show reduced Delta rhythms and increased Alpha rhythms [ 122 ].
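ERD-ERS features are conventionally quantified as a percentage power change in a band relative to a pre-event baseline window. A minimal sketch of that definition, with illustrative values:

```python
# Sketch: event-related (de)synchronization as percent power change
# relative to a pre-movement baseline (classic band-power definition).
def erd_percent(p_baseline, p_task):
    """Negative values = desynchronization (power drop during the task);
    positive values = synchronization (power rebound)."""
    return 100.0 * (p_task - p_baseline) / p_baseline

# Illustrative numbers: a 40% Alpha/Beta power drop during movement (ERD)
# and a 50% rebound after movement (ERS).
erd = erd_percent(p_baseline=10.0, p_task=6.0)    # -40.0
ers = erd_percent(p_baseline=10.0, p_task=15.0)   # +50.0
```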

Bertolucci [ 47 ] compared starting power spectrum density in different bands for both hemispheres with changes in WMFT and FMA over time. Increased global Alpha and Beta activity was shown to correlate with better WMFT evolution, while an increase in contralesional Beta activity correlated with FMA evolution. Metrics combining slow and fast activity have also been tested in the chronic stage of stroke: a significant negative correlation was found between DTABR (PRI) at the start of therapy and FMA change during robotic therapy [ 99 ]. This finding suggests that DTABR may have promise as a prognostic indicator for all stages of stroke.

The Brain Symmetry Index (BSI) is a generalized measure of "left to right" (affected to non-affected) symmetry of mean spectral power per hemisphere. These inter-hemispheric relationships of power have been used as prognostic measures during all stages of stroke. Baseline BSI (during the sub-acute stage) was found to correlate with the FMA at 2 months [ 73 ] and mRS at 6 months [ 123 ], and Theta-band BSI alone was found to predict FM-UE for patients in the chronic stage [ 124 ]. BSI can be modified to account for the direction of asymmetry; the directed BSI at Delta and Theta bands proved meaningful for describing the evolution of upper limb impairment from acute to chronic stages as measured by FM-UE [ 120 , 125 ]. Table 4 and Table 11 in Appendix 1 summarize the power-derived metrics across different stages of stroke documented in this section and their main reported relationships with motor function. Findings are often reported in terms of correlation with clinical tests of motor function.
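A minimal sketch of a symmetry index of this kind, assuming `psd_left` and `psd_right` hold mean spectral power per hemisphere at matching frequency bins. The exact formulation varies across the cited studies; this follows the common per-bin relative-difference form averaged over frequencies.

```python
# Sketch of a Brain Symmetry Index: mean absolute relative power
# asymmetry between hemispheres, ranging from 0 (symmetric) to 1.
import numpy as np

def bsi(psd_left, psd_right):
    """Mean |R - L| / (R + L) across frequency bins."""
    psd_left = np.asarray(psd_left, dtype=float)
    psd_right = np.asarray(psd_right, dtype=float)
    return float(np.mean(np.abs((psd_right - psd_left) /
                                (psd_right + psd_left))))

# Perfect symmetry -> 0; all power on one side -> 1 (illustrative inputs).
symmetric = bsi([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
asymmetric = bsi([0.0, 0.0], [5.0, 5.0])
```

A directed variant, as mentioned above, would drop the absolute value so that the sign encodes which hemisphere dominates.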

Brain connectivity (cortical coherence)

Brain connectivity is a measure of interaction and synchronization between distributed networks of the brain and allows for a clearer understanding of brain function. Although cortical damage from ischemic stroke is focal, cortical coherence can explain abnormalities in functionality of remote zones that share functional connections to the stroke-affected zone [ 59 ].

Several estimators of connectivity have been proposed in the literature. Coherency, partial coherence (pCoh) [ 125 ], multiple coherence (mCoh), imaginary part of coherence (iCoh) [ 126 ], Phase Lag Index (PLI), weighted Phase Lag Index (wPLI) [ 127 ], and simple ratios of power at certain frequency bands [ 73 ] describe synchronous symmetric activity between ROIs and are referred to as non-directed or functional connectivity [ 128 ]. Estimators based on Granger’s prediction, such as partial directed coherence (PDC) [ 129 , 130 , 131 ] or the Directed Transfer Function (DTF) [ 132 , 133 ] and any of their normalizations, describe causal relationships between variables and are referred to as directed or effective connectivity [ 134 ]. Connectivity also allows the analysis of brain activity as network topologies, borrowing methods from graph theory [ 32 , 134 ]. Network features such as complexity, linearity, efficiency, clustering, path length, node hubs, and more can be derived from graphs [ 128 ]. Comparisons of these network features between groups with impairment and healthy controls have proven to be useful tools for understanding and characterizing motor and functional deficits after stroke [ 108 ].
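Of the non-directed estimators listed above, the PLI is among the simplest to state: it is the absolute mean sign of the instantaneous phase difference between two signals. The sketch below uses the Hilbert transform on synthetic narrow-band signals; a real pipeline would band-pass filter each ROI's signal first (e.g., to the Beta band), and all parameters here are illustrative.

```python
# Sketch of the Phase Lag Index (PLI) between two signals.
import numpy as np
from scipy.signal import hilbert

def pli(x, y):
    """PLI = |mean sign of the phase difference|.
    0 = no consistent lag (or zero lag); 1 = perfectly consistent lag."""
    dphi = np.angle(hilbert(x)) - np.angle(hilbert(y))
    # sin() handles phase wrapping before taking the sign
    return float(np.abs(np.mean(np.sign(np.sin(dphi)))))

fs = 250
t = np.arange(0, 10, 1 / fs)
x = np.sin(2 * np.pi * 20 * t)                  # 20 Hz source
y_lag = np.sin(2 * np.pi * 20 * t - np.pi / 4)  # consistently lagged copy

lagged = pli(x, y_lag)   # near 1: stable non-zero phase lag
zero_lag = pli(x, x)     # exactly 0: zero-lag coupling is discounted,
                         # which is why PLI resists volume conduction
```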

Studies have used intra- and inter-cortical coherence to expand the clinical understanding of the neural reorganization process [ 59 , 106 , 107 , 108 , 109 ], as a clinical motor and cognitive predictor [ 38 , 94 , 104 , 135 , 136 ], and as a tool to predict the efficacy of rehabilitation therapy [ 94 ]. Table 5 and Table 12 in Appendix 2 briefly summarize the main metrics discussed in this section and their results related to motor function assessment. In general, studies have shown that motor deficits in stroke survivors are related to less connectivity to main sensorimotor areas [ 38 , 94 , 104 , 137 ], weak interhemispheric sensorimotor connectivity [ 109 , 138 ], less efficient networks [ 106 , 135 ], and less "small-world" network patterns [ 108 , 134 ] (small-world networks are optimized to integrate specialized processes in the whole network and are known as an important feature of healthy brain networks).

Survivors of stroke tend to exhibit more modular (i.e., more clustered, less integrated) and less efficient networks than non-impaired controls, with the biggest difference occurring in the Beta and Gamma bands [ 106 ]. Modular networks are less "small-world" [ 134 ]. Such a transition to a less small-world network was observed during the acute stage of stroke (first hours after stroke) and documented to be bilaterally decreased in the Delta band and bilaterally increased in the high Alpha band (also known as Alpha2: 10.5–13 Hz) [ 108 ].

Global connectivity with the ipsilesional primary motor cortex (M1) is the most researched biomarker derived from connectivity and has been studied in longitudinal experiments as a plasticity indicator leading to future outcome improvement [ 38 ], as a predictor of motor and therapy gains [ 94 ] and of upper limb gains during the sub-acute stage [ 137 ], and as a feature that characterizes stroke survivors’ cognitive deficits [ 104 ]. Pietro [ 38 ] used iCoh to test the weighted node degree (WND), a measure that quantifies the importance of an ROI in the brain, for M1 and reported that Beta-band features are linearly related to motor improvement as measured by FM-UE and the Nine-Hole Peg Test. Beta-band connectivity to ipsilesional M1, as measured by spectral coherence, can be used as a therapy outcome predictor; moreover, results point strongly toward connectivity between M1 and the ipsilesional frontal premotor area (PM) as the most important variable for predicting therapy gains, and predictions can be further improved by incorporating lesion-related information such as CST or MRI data [ 94 ]. Comparisons between groups of people with impairment and controls showed significant differences in Alpha connectivity involving ipsilesional M1; this value was related to FMA at 3 months for the group with impairment due to stroke [ 104 ].

The relationship between interhemispheric ROI connectivity and motor impairment has also been studied. The normalized interhemispheric strength (nIHS) from PDC was used to quantify the coupling between structures in the brain; Beta- and lower Gamma-band features of this quantity in sensorimotor areas exhibited linear relationships with the degree of motor impairment measured by CST [ 136 ]. A similar PDC-derived measure of interhemispheric ROI importance, named EEG-PDC, was used in [ 109 ]; there, the results show that Mu-band (10–12 Hz) and Beta-band features could be used to explain results for hand motor function from FM-UE. In another study, the Beta-band debiased weighted phase lag index (dwPLI) correlated with outcome measured by the Action Research Arm Test (ARAT) and FM-UE [ 138 ].

Global and local network efficiency for the Beta and Gamma bands seem to be significantly decreased in people who have suffered a stroke compared to healthy controls, as reported in [ 106 ]. Newer results, such as those reported by [ 135 ], found statistically significant relationships between Beta network efficiency, network intradensity derived using a non-parametric method (named Generalized Measure of Association), and functional recovery results given by FM-UE. Global maximal coherence features in the Alpha band have recently been recognized as FM-UE predictors, where coherence was computed using PLI and related to motor outcome by means of linear regression [ 139 ].

Corticomuscular coherence

Corticomuscular coherence (CMC) is a measure of the amount of synchronous activity between signals in the brain (i.e., EEG or MEG) and associated musculature (i.e., EMG) of the body [ 92 ]. Typically measured during voluntary contractions [ 110 ], the presence of coherence demonstrates a direct relationship between cortical rhythms in the efferent motor commands and the discharge of neurons in the motor cortex [ 140 ]. CMC is computed as the correlation between EEG and EMG signals at a given frequency. Early CMC research found synchronous (correlated) activity in the Beta and low Gamma bands [ 40 , 41 , 42 ]. CMC is strongest in the contralateral motor cortex [ 141 ]. This metric seems to be affected by stroke-related lesions, and thus provides an interesting tool to assess motor recovery [ 111 , 142 , 143 , 144 ]. The level of CMC is lower in the chronic stage of stroke than in healthy subjects [ 112 , 145 ], with chronic stroke survivors showing lower peak CMC frequency [ 146 ] and topographical patterns that are more widespread than in healthy people, highlighting a connection to muscle synergies [ 142 , 147 , 148 ]. CMC has been shown to increase with training [ 37 , 112 , 144 ].
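In practice, CMC is usually estimated as the magnitude-squared coherence between an EEG channel and an EMG channel, evaluated per frequency bin. The sketch below uses synthetic signals sharing a hypothetical 22 Hz (Beta-band) drive; real CMC analyses would average over epochs of steady voluntary contraction and typically rectify the EMG first.

```python
# Sketch of corticomuscular coherence via magnitude-squared coherence
# between synthetic "EEG" and "EMG" channels sharing a Beta-band drive.
import numpy as np
from scipy.signal import coherence

fs = 1000
t = np.arange(0, 20, 1 / fs)
rng = np.random.default_rng(1)
drive = np.sin(2 * np.pi * 22 * t)          # shared 22 Hz cortical drive
eeg = drive + rng.standard_normal(t.size)
emg = 0.5 * drive + rng.standard_normal(t.size)

# 1 s windows -> 1 Hz bins; coherence is 0..1 per frequency
freqs, cxy = coherence(eeg, emg, fs=fs, nperseg=fs)
beta = (freqs >= 13) & (freqs <= 31)
peak_freq = freqs[beta][np.argmax(cxy[beta])]   # frequency of peak Beta CMC
peak_cmc = cxy[beta].max()
```

Features such as the peak coherence magnitude and its frequency within the Beta/Gamma range are the quantities the cited studies relate to motor impairment.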

Corticomuscular coherence has been proposed as a tool to: (a) identify the functional contribution of reorganized cortical areas to motor recovery [ 37 , 112 , 141 , 144 , 146 ]; (b) understand functional remapping [ 93 , 142 , 145 ]; and (c) study the mechanisms underlying synergies [ 147 , 148 ]. CMC has shown abnormally increased correlation with deltoid EMG during elbow flexion in people with motor impairment [ 147 ] and has been used to identify the best muscles to target with rehabilitative interventions [ 148 ]. Changes in CMC have been shown to correlate with motor improvement at different stages of stroke, although follow-up scores based on CMC have not shown statistically significant correlations when compared to clinical metrics [ 37 , 93 ]. Results summarizing CMC in stroke can be found in Table 6 and Table 13 in Appendix 3.

Reliability of measures

Each of the aforementioned measures has the potential to be integrated into robotic devices for upper-limb assessment. However, to improve the clinical acceptability of robotic-assisted assessment, the measurements and derived metrics must meet reliability standards in a clinical setting [ 55 ]. Reliability can be defined as the degree of consistency between measurements, or the degree to which a measurement is free of error. A common way to represent the relative reliability of a measurement process is the intraclass correlation coefficient (ICC) [ 150 ]. Koo and Li suggest a guideline for reporting ICC values that includes the ICC value, analysis model (one-way random effects, two-way random effects, two-way fixed effects, or two-way mixed effects), the model type per Shrout and Fleiss (individual trials or mean of k trials), model definition (absolute agreement or consistency), and confidence interval [ 68 ]. Koo and Li also provide a flowchart for selecting the appropriate ICC based on the type of reliability and rater information. An ICC value below 0.5 indicates poor reliability, 0.5 to 0.75 moderate reliability, 0.75 to 0.9 good reliability, and above 0.9 excellent reliability. The reviewed papers are evaluated based on these guidelines. For reporting the ICC, the Shrout and Fleiss convention is used [ 68 ]. Reliability studies are included in the tables if the chosen ICC model, type, definition, and confidence interval are identifiable and the metrics have previously been used in electronic-based metrics. For studies that report multiple ICC scores due to assessment of test–retest reliability for multiple raters, the lowest reported ICC is included to avoid bias in the reported results.

When assessing the reliability of data from robotic sensors, common approaches are to correlate multiple measurements within a single session (intra-session) and to correlate measurements between different sessions (inter-session), i.e., test–retest reliability [ 151 ]. Checking for test–retest reliability determines the repeatability of the robotic metric: the ability to reproduce the same measurements under the same conditions. Table 7 shows the test–retest reliability of several robotic metrics. For test–retest reliability, a two-way mixed-effects model with either single or multiple measurements may be used [ 68 ]. Since the same set of sensors will be used to assess subjects, the two-way mixed model is used. Test–retest reliability should check for absolute agreement. Checking for absolute agreement (y = x) rather than consistency (y = x + b) determines the reliability without a bias or systematic error. For example, in Fig.  5 , for a two-way random effect with a single measurement, checking for agreement gives a score of 0.18, whereas checking for consistency gives an ICC score of 1.00. In other words, the bias has no effect on the ICC score when checking for consistency. Therefore, when performing test–retest reliability, it is important to check for absolute agreement to prevent bias in the test–retest result.

figure 5

Checking agreement versus consistency among ratings. For y = x, the agreement ICC score is 1.00 and the consistency ICC score is 1.00. For y = x + 1, the agreement ICC score is 0.18 and the consistency ICC score is 1.00. For y = 3x, the agreement ICC score is 0.32 and the consistency ICC score is 0.60. For y = 3x + 1, the agreement ICC score is 0.13 and the consistency ICC score is 0.60
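The agreement-versus-consistency effect illustrated in Fig. 5 can be reproduced with a short ICC computation. This is a sketch using the standard two-way single-measurement ICC formulas; the data below are illustrative, not the data behind Fig. 5, so the exact agreement value differs.

```python
# Sketch: two-way single-measurement ICCs from an ANOVA decomposition,
# showing that a constant inter-rater bias leaves consistency ICC at 1.0
# but lowers agreement ICC.
import numpy as np

def icc_two_way(Y):
    """Return (agreement ICC, consistency ICC) for an
    n-subjects x k-raters score matrix Y."""
    Y = np.asarray(Y, dtype=float)
    n, k = Y.shape
    grand = Y.mean()
    msr = k * np.sum((Y.mean(axis=1) - grand) ** 2) / (n - 1)  # subjects
    msc = n * np.sum((Y.mean(axis=0) - grand) ** 2) / (k - 1)  # raters
    resid = Y - Y.mean(axis=1, keepdims=True) - Y.mean(axis=0) + grand
    mse = np.sum(resid ** 2) / ((n - 1) * (k - 1))
    agreement = (msr - mse) / (msr + (k - 1) * mse + k / n * (msc - mse))
    consistency = (msr - mse) / (msr + (k - 1) * mse)
    return agreement, consistency

x = np.arange(1.0, 11.0)            # rater 1 scores for 10 subjects
Y = np.column_stack([x, x + 1.0])   # rater 2 = rater 1 + constant bias
agree, consist = icc_two_way(Y)     # consistency = 1.0, agreement < 1.0
```

The rater bias is invisible to the consistency definition, which is exactly why the text recommends absolute agreement for test–retest reliability.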

Not only should a robotic metric demonstrate repeatability, it should also be reproducible when different operators use the same device. Reproducibility evaluates the change in measurements when conditions have changed. Inter-rater reliability tests determine the effect raters have on measurements when two or more raters perform the same experimental protocol [ 68 ]. To prevent a biased result, raters should have no knowledge of the evaluations given by other raters, ensuring that raters’ measurements are independent of one another. Table 8 shows the reproducibility of several robotic biomechanical metrics. All the included studies used two raters to check for reproducibility. The researchers performed a two-way random effects analysis with either a single measurement or multiple measurements to check for agreement.

Measurement reliability of robotic biomechanical assessment

Of the 24 papers reviewed for biomechanical metrics, 13 reported on reliability: six reported on reproducibility and nine on repeatability. Overall, the metrics seem to demonstrate moderate to good reliability for both repeatability and reproducibility. However, caution should be exercised in determining which robotic metric is more effective in assessing movement quality based on reliability studies. The quality of measurements is highly dependent on the quality of the robotic device and sensors [ 85 ]. Having a completely transparent robot with a sensitive and accurate sensor will further improve assessment of reliability. Also, the researchers have used different versions of the ICC, as seen in Tables 7 and 8 , which complicates direct comparisons of the metrics.

Reliability of electrophysiological signal features

Of the 33 papers reviewed for electrophysiological metrics, five reported on reliability and six on repeatability. The ability to conveniently acquire electrophysiological signals non-invasively is relatively new. Metrics for assessment of upper limb motor impairment in stroke derived from these signals have been shown to be valid in academic settings, but most of these valid metrics have yet to be tested for the intra- and inter-session reliability needed for use in clinical and rehabilitation settings. Few studies found through our systematic search have looked at test–retest reliability of these metrics. Therefore, we found and manually added records reporting on intra- and inter-session reliability of metrics based on the electrophysiological features described in section “Measures and methods based on neural activity using EEG/EMG”, even if reliability was not assessed in people with stroke. Relevant results are presented in Table 9 .

Spectral power features of EEG signals have been tested during rest [ 153 , 154 ] and task (cognitive and motor) conditions for different cohorts of subjects [ 102 , 103 ]. Some of the spectral features observed during these experiments relate to the timed behavior of oscillatory activity in cued experiments, such as event-related desynchronization of the Beta band (ERD and Beta rebound) [ 102 ] and topographical patterns of Alpha activity (R = 0.9302, p < 0.001) [ 103 ].

Test–retest reliability of resting EEG functional connectivity has been explored for a few of the estimators listed in section “Measures and methods based on neural activity using EEG/EMG”: (1) for a cohort of people with Alzheimer’s disease, by means of the amplitude envelope correlation (AEC), phase lag index (PLI), and weighted phase lag index (wPLI) [ 155 ]; (2) in healthy subjects, using iCoh and PLI [ 156 ]; and (3) in infants, by studying differences in inter-session PLI graph metrics such as path length, cluster coefficient, and network “small-worldness” [ 60 ]. Reliability of upper limb CMC has not yet been documented (at least to our knowledge); however, an experiment testing the reliability of CMC during gait reported low CMC reliability across groups of different ages [ 61 ].

EEG and EMG measurements could be combined with kinematic and kinetic measurements to provide additional information about the severity of impairment and decrease the number of false positives from individual measurements [ 21 ]. This could further be used to explain abnormal relationships between brain activation, muscle activation and movement kinematics, as well as provide insight about subject motor performance during therapy [ 15 ]. The availability of EEG and EMG measures can also enhance aspects of biofeedback given during tests or be used to complement other assessments to provide a more holistic picture of an individual’s neurological function.

It has been shown that combining EEG, EMG, and kinematic data using a multi-domain approach can produce correlations with traditional clinical assessments; a summary of some of the reviewed studies is presented in Table 10 . Belfatto et al. assessed ROM for shoulder and elbow flexion and task time, and computed jerk to measure movement smoothness, while EMG was used to measure muscle synergies and EEG detected ERD and a lateralization coefficient [ 21 ]. Comani et al. used task time, path length, normalized jerk, and speed to measure motor performance while observing ERD and ERS during motor training [ 22 ]. Pierella et al. gathered kinematic data from an upper-limb exoskeleton, which assessed the mean tangential velocity, path-length ratio, the number of speed peaks, spectral arc length, the amount of assistance, task time, and percentage of workspace, while observing EEG and EMG activity [ 18 ]. Mazzoleni et al. used the InMotion2 robot system to capture movement accuracy, movement efficiency, mean speed, and the number of velocity peaks, while measuring brain activity with EEG [ 16 ]. However, further research is necessary to determine the effectiveness of the chosen metrics and methods compared to other more promising methods to assess function. Furthermore, greater consensus in the literature is needed to support the clinical use of more reliable metrics. For example, newer algorithms to estimate smoothness, such as spectral arc length, have been shown to provide greater validity and reliability than the commonly used normalized jerk metric. Despite this evidence, normalized jerk remains a widely accepted measure of movement smoothness.
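For the normalized jerk metric mentioned above, a minimal sketch from a sampled speed profile follows. Definitions vary slightly across the reviewed studies; this uses a common dimensionless form (jerk integral scaled by movement duration and distance), and the two profiles below are synthetic illustrations, not study data.

```python
# Sketch of a dimensionless normalized-jerk smoothness metric:
# lower values indicate smoother movement.
import numpy as np

def normalized_jerk(speed, dt):
    """Dimensionless jerk for a 1-D speed profile sampled every dt seconds."""
    jerk = np.gradient(np.gradient(speed, dt), dt)  # 2nd derivative of speed
    duration = dt * (len(speed) - 1)
    length = np.trapz(speed, dx=dt)                 # distance travelled
    return np.sqrt(0.5 * np.trapz(jerk ** 2, dx=dt)
                   * duration ** 5 / length ** 2)

dt = 0.01
t = np.arange(0, 1 + dt, dt)
smooth = np.sin(np.pi * t)                          # single-peaked bell profile
jerky = smooth * (1 + 0.3 * np.sin(8 * np.pi * t)) # same movement + tremor

nj_smooth = normalized_jerk(smooth, dt)
nj_jerky = normalized_jerk(jerky, dt)   # larger: tremor inflates the metric
```

The duration-to-the-fifth-power scaling is one reason normalized jerk can be less reliable than spectral arc length: small timing differences between repetitions change the score substantially.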

Discussions and conclusions

In this paper we reviewed studies that used different sensor-acquired biomechanical and electrophysiological signals to derive metrics related to neuromuscular impairment in stroke survivors; such metrics are of interest for robotic therapy and assessment applications. To assess the ability of a given measure to relate to impairment or motor outcome, we looked for metrics whose results have been demonstrated to correlate with or predict scores from established clinical assessments of impairment and function (validity). Knowing that a metric has some relationship with impairment and function (i.e., that it is valid) is not enough for it to be used in clinical settings if those results are not repeatable (reliable). Thus, we also reviewed the reliability of metrics and related signal features, looking for metrics that produce similar results for the same subject during different test sessions and for different raters. With this information, researchers can aim to use metrics that not only seem to be related to stroke, but that also can be trusted, carry less bias, and have a simpler interpretation. The main conclusions of this review paper are presented as answers to the following research questions.

Which biomechanical-based metrics show promise for valid assessment of function and impairment?

Metrics derived from kinematic (e.g., position & velocity) and kinetic (e.g., force & torque) sensors affixed to robotic and passive mechanical devices have successfully been used to measure biomechanical aspects of upper-extremity function and impairment in people after stroke. The five common metrics included in the reviewed studies measured the number of velocity peaks (~ 9 studies), path-length ratio (~ 8 studies), the maximum speed of the arm (~ 7 studies), active range of motion (~ 7 studies), and movement time (~ 7 studies). The metrics are often compared to an established clinical assessment to determine validity of the metric. According to the review study by Murphy and Häger, the Fugl-Meyer Assessment for Upper Extremity had significant correlation with movement time, movement smoothness, peak velocity, elbow extension, and shoulder flexion [ 66 ]. The movement time and smoothness showed strong correlation with the Action Research Arm Test, whereas speed, path-length ratio, and end-point error showed moderate correlation. Tran et al. reviewed specifically validation of robotic metrics with clinical assessments [ 57 ]. The review found mean speed, number of peak velocities, movement accuracy, and movement duration to be most promising metrics based on validation with clinical assessments. However, the review mentioned that some studies seem to conflict on the correlation between the robotic metric and clinical measures, which could be due to assessment task, subject characteristics, type of intervention, and robotic device. For further information about the validation of sensor-based metrics, please refer to the previously mentioned literature reviews [ 57 , 66 ].

Which biomechanical-based metrics show promise for repeatable assessment?

Repeatable measures, in which measurements taken by a single instrument and/or person produce low variation within a single task, are a critical requirement for assessment of impairment and function. The biomechanical metrics that show the most promise for repeatability are range of motion, mean speed, mean distance, normal path length, spectral arc length, number of peaks, and task time. Two or more studies used each of these metrics and demonstrated good to excellent reliability, which implies the metric is robust against measurement noise and/or disturbances. Since the metrics have been used with different measuring instruments, the sensors’ resolution and signal-to-noise ratio appear to have a minimal impact on reliability. However, more investigation is needed to confirm this robustness. In lieu of more evidence, it is recommended that investigators choose sensors similar or superior in quality to those used in the measuring devices presented in Tables 7 and 8 to achieve the same level of reliability.

What aspects of biomechanical-based metrics lack evidence or require more investigation?

Although many metrics (see previous section) demonstrate good or excellent repeatability across multiple studies, the evidence for reproducibility is limited to single studies. When developing a novel device capable of robotic assistance and assessment, researchers have typically focused their efforts on creating a device capable of repeatable and reliable measurements. However, since the person administering the test uses the device to measure the subject’s performance, the reproducibility of the metric must also be considered. The reproducibility of a metric is affected by the ease of use of the device; if the device is too complicated to set up and use, there is an increased probability that different operators will observe different measurements. The operator’s instructions to the subject also affect reproducibility, especially in the initial sessions, which may lead to different learning effects and different assessment results. More studies are needed across multiple sites and operators to determine the reproducibility of the biomechanical metrics reviewed in this paper.

Which neural activity-based metrics (EEG & EMG) show the most promise for reliable assessment?

Electrical neurological signals such as EEG and EMG have successfully been used to understand changes in motor performance and outcome variability across all stages of post-stroke recovery including the first few hours after onset. Experimental results have shown that metrics derived from slow frequency power (delta power, relative delta power, and theta power), and power ratio between slow and fast EEG frequency bands like DAR and DTABR convey useful information both about current and future motor capabilities, as presented in Table 4 and Table 11 in Appendix 1. Multimodal studies using robotic tools for assessment of motor performance have expanded the study of power signal features in people who suffered a stroke in the chronic recovery stage by studying not only rest EEG activity but also task-related activity [ 19 , 21 , 122 ]; ERD-ERS features like amplitude and latency along with biomechanical measures have been shown to correlate with clinical measures of motor performance and to predict a person’s response to movement therapies. EEG power features in general have been found to have good to excellent reliability for test–retest conditions among different populations, across all frequency bands of interest (see Table 9 ).

Functional connectivity (i.e., non-directed connectivity) expands the investigative capacity of EEG measurements, enabling analysis of the brain as a network system through the interactions between regions of interest while resting or during movement tasks. Inter-hemispheric interactions (interactions between the same ROI in both hemispheres) and global interactions (interactions between the entire brain and an ROI), reported as power or graph indices in the Beta and Gamma bands, have fruitfully been used to explain motor outcome scores. Although results seem promising, connectivity reliability is still debated, with results ranging mostly between moderate and good reliability for only a few connectivity estimators (PLI, wPLI, and iCoh).

Which neural activity-based metrics (EEG and EMG) lack evidence or require more investigation?

EEG and EMG provide useful non-invasive insight into the human neuromuscular system, allowing researchers to make conjectures about its function and structure; however, interpretation of results based solely on these measures must be carefully framed within the experimental conditions. Overall, the field needs more studies involving cohorts of stroke survivors to determine the reliability (test–retest) of metrics derived from EEG and EMG signal features that have already shown validity in academic studies.

Metrics calculated from the power imbalance between hemispheres, such as BSI , pwBSI and PRI [ 62 , 73 , 124 ], offer a promising way to measure how the brain recruits regions outside the affected areas to accomplish tasks. A battery of diverse connectivity estimators, especially those of effective (directed) connectivity, opens the door to investigations into the relationship between abnormal communication between regions of interest and impairment (see Table 5 and Table 12 in Appendix 2). These metrics, although valid, have yet to be tested for reliability in clinical use. Reliability reports for connectivity metrics should specify which estimator was used to derive the metric.
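The interhemispheric power-imbalance idea behind BSI can be made concrete with a simplified version for a single homologous channel pair; the published BSI averages over multiple channel pairs and a specific frequency range, so the band limits and single pair here are simplifying assumptions.

```python
import numpy as np

def bsi(left, right, fs, fmin=1.0, fmax=25.0):
    """Simplified Brain Symmetry Index for one homologous channel pair."""
    freqs = np.fft.rfftfreq(len(left), 1 / fs)
    p_l = np.abs(np.fft.rfft(left)) ** 2
    p_r = np.abs(np.fft.rfft(right)) ** 2
    sel = (freqs >= fmin) & (freqs <= fmax)
    # Mean absolute normalized power difference across frequencies:
    # 0 = perfectly symmetric hemispheres, 1 = maximally asymmetric.
    return np.mean(np.abs((p_r[sel] - p_l[sel]) / (p_r[sel] + p_l[sel] + 1e-12)))

# Example: identical "hemispheres" give BSI ~ 0, while a uniform 2x amplitude
# asymmetry gives (4p - p) / (4p + p) = 0.6 at every frequency.
rng = np.random.default_rng(1)
chan = rng.standard_normal(1024)
```

Unilateral stroke typically increases this index by suppressing fast activity and enhancing slow activity over the lesioned hemisphere.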

CMC is another promising neural activity-based metric that still lacks sufficient supporting evidence. CMC considers and bridges two of the domains most affected during motor execution in the neuromuscular system, making it a good candidate for robot-based therapy and assessment of stroke survivors [ 147 ]. Although features in the Beta and Gamma bands appear to be related to motor impairment, there is still no agreement about which band is most closely related to motor outcomes. Studies reviewed in this paper considered cortical spatial patterns of maximum coherence, peak frequency shift relative to healthy controls, and latency of peak coherence, among others (see Table 6 and Table 13 in Appendix 3). However, when compared to motor outcomes, results are not always significant, and test–retest reliability for this metric has (to our knowledge) yet to be documented for the upper extremity (see [ 61 ] for a lower-extremity study).
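CMC is typically quantified as the magnitude-squared coherence between an EEG channel and an EMG envelope. A minimal segment-averaged estimator, a stand-in for the Welch-style implementations in standard toolboxes, can be sketched as follows; the shared 20 Hz "beta" drive in the example is synthetic.

```python
import numpy as np

def coherence(x, y, fs, nper=256):
    """Magnitude-squared coherence from segment-averaged cross-spectra."""
    nseg = len(x) // nper
    X = np.fft.rfft(x[:nseg * nper].reshape(nseg, nper), axis=1)
    Y = np.fft.rfft(y[:nseg * nper].reshape(nseg, nper), axis=1)
    sxy = np.mean(X * np.conj(Y), axis=0)          # averaged cross-spectrum
    sxx = np.mean(np.abs(X) ** 2, axis=0)          # averaged auto-spectra
    syy = np.mean(np.abs(Y) ** 2, axis=0)
    freqs = np.fft.rfftfreq(nper, 1 / fs)
    return freqs, np.abs(sxy) ** 2 / (sxx * syy + 1e-12)

# Synthetic example: EEG- and EMG-like channels sharing a 20 Hz (beta-band)
# drive plus independent noise -> coherence peaks near 20 Hz only.
fs = 256
t = np.arange(0, 16, 1 / fs)
rng = np.random.default_rng(0)
drive = np.sin(2 * np.pi * 20 * t)
eeg_like = drive + 0.5 * rng.standard_normal(t.size)
emg_like = drive + 0.5 * rng.standard_normal(t.size)
freqs, coh = coherence(eeg_like, emg_like, fs)
```

Averaging over segments is essential: with a single segment the estimator is identically 1 at all frequencies, and the number of segments sets the significance threshold against which CMC peaks are judged.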

What standards should be adopted for reporting biomechanical and neural activity-based metrics and their reliability?

For metrics to be accepted as reliable in the clinical field, researchers should follow the guidelines presented in Koo and Li [ 68 ], which indicate which ICC model to use depending on the type of reliability study and what should be reported (e.g., the software used to compute the ICC and its confidence interval). Some of the papers reviewed investigated the learning effects of the assessment task and checked for consistency rather than agreement (see Table 7 ). However, learning effects between sessions should be minimal in a clinical setting, and potential effects should be taken into consideration during protocol design; common practices to minimize learning effects are to allow practice runs by the patients [ 99 , 122 ] and to remove the first experimental runs [ 81 , 85 ]. By removing this information, signal analysis focuses on the performance of learned tasks with similar associated behaviors. Therefore, to demonstrate test–retest reliability (i.e., repeatability), researchers should check for absolute agreement. Also, as can be seen in Tables 7 and 8 , there does not seem to be a standard for reporting ICC values: some researchers report the confidence interval of the ICC while others do not, and in some studies it was difficult to determine which ICC model was used. A reporting standard is therefore needed to help readers understand the ICC used and to prevent bias (see [ 68 ] for suggested guidelines on reporting ICC scores). In addition, reporting the means of each individual session or rater would provide additional information on the variation of the means between groups. This variation can be shown with a Bland–Altman plot, but readers are still unable to perform other forms of analysis; to help with this, data from studies should be made publicly available to allow results to be verified and to enable further analysis in the future.
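The distinction between consistency and absolute agreement can be made concrete with the two-way ICC formulas from Koo and Li [ 68 ]. The numpy sketch below (single-measurement forms ICC(2,1) and ICC(3,1) only) shows how a constant between-session offset, such as a learning effect, leaves consistency perfect while lowering absolute agreement.

```python
import numpy as np

def icc(data, model="ICC2"):
    """Single-measurement two-way ICCs from ANOVA mean squares.
    data: array of shape (n subjects, k sessions/raters)."""
    n, k = data.shape
    grand = data.mean()
    ms_r = k * np.sum((data.mean(axis=1) - grand) ** 2) / (n - 1)  # subjects
    ms_c = n * np.sum((data.mean(axis=0) - grand) ** 2) / (k - 1)  # sessions
    resid = (data - data.mean(axis=1, keepdims=True)
             - data.mean(axis=0, keepdims=True) + grand)
    ms_e = np.sum(resid ** 2) / ((n - 1) * (k - 1))
    if model == "ICC3":   # two-way mixed effects, consistency
        return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e)
    # ICC(2,1): two-way random effects, absolute agreement
    return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)

# A constant +2 offset in session 2 (a pure learning effect): subjects keep
# their rank order, so consistency (ICC3) is 1, but scores do not agree,
# so absolute agreement (ICC2) drops.
session1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = np.column_stack([session1, session1 + 2.0])
```

This is exactly why checking consistency alone can overstate test–retest reliability when session effects are present.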

When is it advantageous to combine biomechanical and neural activity-based metrics for assessment?

Biomechanical and neural activity-based metrics provide distinct but complementary information about the neuro-musculoskeletal system, potentially offering a more complete picture of impairment and function after stroke. Metrics derived from kinematic/kinetic information assess motor performance based on motor execution; however, compensatory strategies related to stroke (e.g., muscle synergies recruited to complete a given task) may mask underlying neural deficits [ 18 , 21 , 69 , 70 , 71 , 72 , 122 ]. Information about these compensatory strategies can be obtained by analyzing electrophysiological activity, as has been done using connectivity [ 59 , 107 ], CMC [ 147 , 148 ] and brain cortical power [ 91 ].

Combining signals from multiple domains, although beneficial in that it allows a deeper understanding of a subject's motor ability, is still a subject of exploration. Experimental paradigms play an important role in feature selection; increasing the dimensionality of the recorded signals may provide more useful information for analysis, but comes at the expense of experimental cost (e.g., hardware) and time (e.g., subject setup). With all this in mind, merging information from different domains in the hierarchy of the neuro-musculoskeletal system may provide a more comprehensive quantitative profile of a person's impairment and performance. Examples of robotic multidomain methods, such as those in [ 18 , 21 ], highlight the importance of this type of assessment for monitoring and understanding the impact of rehabilitation in chronic stroke survivors. In both cases, these methodologies allowed observed behavioral changes in task execution (i.e., biomechanical data) to be attributed to functional recovery rather than to adopted compensation strategies.

What should be the focus of future investigations of biomechanical and/or neural activity-based metrics?

Determining the reliability and validity of sensor-based metrics requires carefully designed experiments. Future experiments should calculate multiple metrics from multiple sensor and device combinations, allowing the effect of sensor type and quality on a measure's reliability to be quantified. After such experiments conclude, researchers are strongly encouraged to make their anonymized raw data public so that other researchers can compute different ICCs. Comparison studies on the reliability of metrics will produce reliability data to expand Tables 7 , 8 , 9 and improve our ability to compare similar sensor-based metrics. Additional reliability studies should also include neural features of stroke survivors, with an increased focus on modeling the interactions between the biomechanical and neural activity domains. It is also important to understand how to successfully combine data from multimodal experiments; many of the studies reviewed in this paper recorded multidimensional data, but analyzed each domain separately.

Availability of data and materials

Not applicable.

Abbreviations

Activities of daily living

Amplitude envelope correlation

Action research arm test

Active range of motion

Autism spectrum disorder

Box and Blocks test

Brain Symmetry Index

Canonical correlation analysis

Cortico-spinal tract

Delta-alpha ratio

Delayed cerebral ischemia

Direct directed transfer function

Degree of freedom

(Delta + Theta)/(Alpha + Beta)

Directed transfer function

Delta-theta ratio

Electroencephalography

Electromyography

Event related desynchronization

Event related synchronization

Full frequency directed transfer function

Functional independence measure and functional assessment measure

Fugl-Meyer assessment for upper extremity

Generalized Measure of Association

Generalized partial directed coherence

Intra-class correlations

Imaginary part of coherence

Primary motor cortex

Modified Ashworth

Modified Barthel Index

Multiple coherence

Motricity Index

Montreal Cognitive Assessment

Movement related beta desynchronization

Magnetic resonance imaging

Modified Ranking Scale

Normalized interhemispheric strength

National Institutes of Health Stroke Scale

Non-negative matrix factorization algorithm

Principal component analysis

Partial coherence

Partial directed coherence

Phase lag index, weighted phase lag index, debiased weighted phase lag index

Premotor area

Post movement beta rebound

Power Ratio Index

Passive range of motion

Quantitative EEG

Regional cerebral blood flow

Region of interest

Renormalized partial directed coherence

Singular value decomposition

Wolf motor function

Weighted Node Degree Index

References

Stroke Facts. 2020. https://www.cdc.gov/stroke/facts.htm . Accessed 26 Mar 2020.

Ottenbacher KJ, Smith PM, Illig SB, Linn RT, Ostir GV, Granger CV. Trends in length of stay, living setting, functional outcome, and mortality following medical rehabilitation. JAMA. 2004;292(14):1687–95. https://doi.org/10.1001/jama.292.14.1687 .

Lang CE, MacDonald JR, Gnip C. Counting repetitions: an observational study of outpatient therapy for people with hemiparesis post-stroke. J Neurol Phys Ther. 2007;31(1). https://journals.lww.com/jnpt/Fulltext/2007/03000/Counting_Repetitions__An_Observational_Study_of.4.aspx .

Gresham GE, Phillips TF, Wolf PA, McNamara PM, Kannel WB, Dawber TR. Epidemiologic profile of long-term stroke disability: the Framingham study. Arch Phys Med Rehabil. 1979;60(11):487–91.

Duncan EA, Murray J. The barriers and facilitators to routine outcome measurement by allied health professionals in practice: a systematic review. BMC Health Serv Res. 2012;12(1):96.

Sullivan KJ, Tilson JK, Cen SY, Rose DK, Hershberg J, Correa A, et al. Fugl-meyer assessment of sensorimotor function after stroke: standardized training procedure for clinical practice and clinical trials. Stroke. 2011;42(2):427–32.

Ansari NN, Naghdi S, Arab TK, Jalaie S. The interrater and intrarater reliability of the Modified Ashworth Scale in the assessment of muscle spasticity: limb and muscle group effect. NeuroRehabilitation. 2008;23:231–7.

Wade DT, Collin C. The Barthel ADL Index: a standard measure of physical disability? Int Disabil Stud. 1988;10(2):64–7.

Maggioni S, Melendez-Calderon A, van Asseldonk E, Klamroth-Marganska V, Lünenburger L, Riener R, et al. Robot-aided assessment of lower extremity functions: a review. J Neuroeng Rehabil. 2016;13(1):72. https://doi.org/10.1186/s12984-016-0180-3 .

Frisoli A, Procopio C, Chisari C, Creatini I, Bonfiglio L, Bergamasco M, et al. Positive effects of robotic exoskeleton training of upper limb reaching movements after stroke. J Neuroeng Rehabil. 2012;9(1):36. https://doi.org/10.1186/1743-0003-9-36 .

Groothuis-Oudshoorn CGM, Prange GB, Hermens HJ, Ijzerman MJ, Jannink MJA. Systematic review of the effect of robot-aided therapy on recovery of the hemiparetic arm after stroke. J Rehabil Res Dev. 2006;43(2):171.

Harwin WS, Murgia A, Stokes EK. Assessing the effectiveness of robot facilitated neurorehabilitation for relearning motor skills following a stroke. Med Biol Eng Comput. 2011;49(10):1093–102.

Nordin N, Xie SQ, Wünsche B. Assessment of movement quality in robot- assisted upper limb rehabilitation after stroke: a review. J NeuroEngineering Rehabil. 2014;11:137. https://doi.org/10.1186/1743-0003-11-137 .

De Los Reyes-Guzman A, Dimbwadyo-Terrer I, Trincado-Alonso F, Monasterio-Huelin F, Torricelli D, Gil-Agudo A. Quantitative assessment based on kinematic measures of functional impairments during upper extremity movements: a review. Clin Biomech. 2014;29(7):719–27. https://doi.org/10.1016/j.clinbiomech.2014.06.013 .

Molteni E, Preatoni E, Cimolin V, Bianchi AM, Galli M, Rodano R. A methodological study for the multifactorial assessment of motor adaptation: integration of kinematic and neural factors. 2010 Annu Int Conf IEEE Eng Med Biol Soc EMBC’10. 2010;4910–3.

Mazzoleni S, Coscia M, Rossi G, Aliboni S, Posteraro F, Carrozza MC. Effects of an upper limb robot-mediated therapy on paretic upper limb in chronic hemiparetic subjects: a biomechanical and EEG-based approach for functional assessment. 2009 IEEE Int Conf Rehabil Robot ICORR 2009. 2009;92–7.

Úbeda A, Azorín JM, Chavarriaga R, Millán JdR. Classification of upper limb center-out reaching tasks by means of EEG-based continuous decoding techniques. J Neuroeng Rehabil. 2017;14(1):1–14.

Pierella C, Pirondini E, Kinany N, Coscia M, Giang C, Miehlbradt J, et al. A multimodal approach to capture post-stroke temporal dynamics of recovery. J Neural Eng. 2020;17(4): 045002.

Steinisch M, Tana MG, Comani S. A post-stroke rehabilitation system integrating robotics, VR and high-resolution EEG imaging. IEEE Trans Neural Syst Rehabil Eng. 2013;21(5):849–59. https://doi.org/10.1596/978-1-4648-1002-2_Module14 .

Úbeda A, Hortal E, Iáñez E, Perez-Vidal C, Azorín JM. Assessing movement factors in upper limb kinematics decoding from EEG signals. PLoS ONE. 2015;10(5):1–12.

Belfatto A, Scano A, Chiavenna A, Mastropietro A, Mrakic-Sposta S, Pittaccio S, et al. A multiparameter approach to evaluate post-stroke patients: an application on robotic rehabilitation. Appl Sci. 2018;8(11):2248.

Comani S, Schinaia L, Tamburro G, Velluto L, Sorbi S, Conforto S, et al. Assessing Neuromotor Recovery in a stroke survivor with high resolution EEG, robotics and virtual reality. In: Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). 2015. p. 3925–8.

Kwon HM, Yang IH, Lee WS, Yu ARL, Oh SY, Park KK. Reliability of intraoperative knee range of motion measurements by goniometer compared with robot-assisted arthroplasty. J Knee Surg. 2019;32(3):233–8.

Dukelow SP, Herter TM, Moore KD, Demers MJ, Glasgow JI, Bagg SD, et al. Quantitative assessment of limb position sense following stroke. Neurorehabil Neural Repair. 2010;24(2):178–87.

Balasubramanian S, Wei R, Herman R, He J. Robot-measured performance metrics in stroke rehabilitation. In: 2009 ICME International Conference on Complex Medical Engineering, CME 2009. 2009.

Otaka E, Otaka Y, Kasuga S, Nishimoto A, Yamazaki K, Kawakami M, et al. Clinical usefulness and validity of robotic measures of reaching movement in hemiparetic stroke patients. J Neuroeng Rehabil. 2015;12(1):66.

Singh H, Unger J, Zariffa J, Pakosh M, Jaglal S, Craven BC, et al. Robot-assisted upper extremity rehabilitation for cervical spinal cord injuries: a systematic scoping review. Disabil Rehabil Assist Technol. 2018;13(7):704–15. https://doi.org/10.1080/17483107.2018.1425747 .

Molteni F, Gasperini G, Cannaviello G, Guanziroli E. Exoskeleton and end-effector robots for upper and lower limbs rehabilitation: narrative review. PMR. 2018;10(9):174–88.

Jutinico AL, Jaimes JC, Escalante FM, Perez-Ibarra JC, Terra MH, Siqueira AAG. Impedance control for robotic rehabilitation: a robust markovian approach. Front Neurorobot. 2017;11(AUG):1–16.

Li Z, Huang Z, He W, Su CY. Adaptive impedance control for an upper limb robotic exoskeleton using biological signals. IEEE Trans Ind Electron. 2017;64(2):1664–74.

Marchal-Crespo L, Reinkensmeyer DJ. Review of control strategies for robotic movement training after neurologic injury. J NeuroEngineering Rehabil. 2009;6:20. https://doi.org/10.1186/1743-0003-6-20 .

Cohen MX. Analyzing neural time series data: theory and practice. Cambridge: MIT Press; 2014.

Stafstrom CE, Carmant L. Seizures and epilepsy: an overview. Cold Spring Harb Perspect Med. 2015;5(6):65–77.

Machado C, Cuspineda E, Valdés P, Virues T, Llopis F, Bosch J, et al. Assessing acute middle cerebral artery ischemic stroke by quantitative electric tomography. Clin EEG Neurosci. 2004;35(3):116–24.

Finnigan SP, Walsh M, Rose SE, Chalk JB. Quantitative EEG indices of sub-acute ischaemic stroke correlate with clinical outcomes. Clin Neurophysiol. 2007;118(11):2525–31.

Cuspineda E, Machado C, Galán L, Aubert E, Alvarez MA, Llopis F, et al. QEEG prognostic value in acute stroke. Clin EEG Neurosci. 2007;38(3):155–60.

Belardinelli P, Laer L, Ortiz E, Braun C, Gharabaghi A. Plasticity of premotor cortico-muscular coherence in severely impaired stroke patients with hand paralysis. NeuroImage Clin. 2017;14:726–33.

Di PM, Schnider A, Nicolo P, Rizk S, Guggisberg AG. Coherent neural oscillations predict future motor and language improvement after stroke. Brain. 2015;138(10):3048–60.

Chen CC, Lee SH, Wang WJ, Lin YC, Su MC. EEG-based motor network biomarkers for identifying target patients with stroke for upper limb rehabilitation and its construct validity. PLoS ONE. 2017;12(6):1–20. https://doi.org/10.1371/journal.pone.0178822 .

Conway BA, Halliday DM, Farmer SF, Shahani U, Maas P, Weir AI, et al. Synchronization between motor cortex and spinal motoneuronal pool during the performance of a maintained motor task in man. J Physiol. 1995;489(3):917–24.

Salenius S, Portin K, Kajola M, Salmelin R, Hari R. Cortical control of human motoneuron firing during isometric contraction. J Neurophysiol. 1997;77(6):3401–5.

Mima T, Hallett M. Electroencephalographic analysis of cortico-muscular coherence: reference effect, volume conduction and generator mechanism. Clin Neurophysiol. 1999;110(11):1892–9.

Claassen J, Hirsch LJ, Kreiter KT, Du EY, Sander Connolly E, Emerson RG, et al. Quantitative continuous EEG for detecting delayed cerebral ischemia in patients with poor-grade subarachnoid hemorrhage. Clin Neurophysiol. 2004;115(12):2699–710.

Sullivan JL, Bhagat NA, Yozbatiran N, Paranjape R, Losey CG, Grossman RG, et al. Improving robotic stroke rehabilitation by incorporating neural intent detection: preliminary results from a clinical trial. In: 2017 International Conference on Rehabilitation Robotics (ICORR). IEEE; 2017. p. 122–7.

Muralidharan A, Chae J, Taylor DM. Extracting attempted hand movements from EEGs in people with complete hand paralysis following stroke. Front Neurosci. 2011. https://doi.org/10.3389/fnins.2011.00039 .

Nam C, Rong W, Li W, Xie Y, Hu X, Zheng Y. The effects of upper-limb training assisted with an electromyography-driven neuromuscular electrical stimulation robotic hand on chronic stroke. Front Neurol. 2017. https://doi.org/10.3389/fneur.2017.00679 .

Bertolucci F, Lamola G, Fanciullacci C, Artoni F, Panarese A, Micera S, et al. EEG predicts upper limb motor improvement after robotic rehabilitation in chronic stroke patients. Ann Phys Rehabil Med. 2018;61:e200–1.

Cantillo-Negrete J, Carino-Escobar RI, Carrillo-Mora P, Elias-Vinas D, Gutierrez-Martinez J. Motor imagery-based brain-computer interface coupled to a robotic hand orthosis aimed for neurorehabilitation of stroke patients. J Healthc Eng. 2018;3(2018):1–10.

Bhagat NA, Venkatakrishnan A, Abibullaev B, Artz EJ, Yozbatiran N, Blank AA, et al. Design and optimization of an EEG-based brain machine interface (BMI) to an upper-limb exoskeleton for stroke survivors. Front Neurosci. 2016;10(MAR):122.

Biasiucci A, Leeb R, Iturrate I, Perdikis S, Al-Khodairy A, Corbet T, et al. Brain-actuated functional electrical stimulation elicits lasting arm motor recovery after stroke. Nat Commun. 2018;9(1):1–13. https://doi.org/10.1038/s41467-018-04673-z .

Ang KK, Guan C, Chua KSG, Ang BT, Kuah C, Wang C, et al. Clinical study of neurorehabilitation in stroke using EEG-based motor imagery brain-computer interface with robotic feedback. Annu Int Conf IEEE Eng Med Biol. 2010. pp. 5549–52.

Finnigan SP, Rose SE, Walsh M, Griffin M, Janke AL, Mcmahon KL, et al. Correlation of quantitative EEG in acute ischemic stroke with 30-day NIHSS score: comparison with diffusion and perfusion MRI. Stroke. 2004;35(4):899–903.

Schleiger E, Sheikh N, Rowland T, Wong A, Read S, Finnigan S. Frontal EEG delta/alpha ratio and screening for post-stroke cognitive deficits: the power of four electrodes. Int J Psychophysiol. 2014;94(1):19–24. https://doi.org/10.1016/j.ijpsycho.2014.06.012 .

Aminov A, Rogers JM, Johnstone SJ, Middleton S, Wilson PH. Acute single channel EEG predictors of cognitive function after stroke. PLoS ONE. 2017;12(10): e0185841.

Andresen EM. Criteria for assessing the tools of disability outcomes research. Arch Phys Med Rehabil. 2000. https://doi.org/10.1053/apmr.2000.20619 .

Wang Q, Markopoulos P, Yu B, Chen W, Timmermans A. Interactive wearable systems for upper body rehabilitation: a systematic review. J Neuroeng Rehabil. 2017;14(1):1–21.

Tran VD, Dario P, Mazzoleni S. Kinematic measures for upper limb robot-assisted therapy following stroke and correlations with clinical outcome measures: a review. Med Eng Phys. 2018;53:13–31. https://doi.org/10.1016/j.medengphy.2017.12.005 .

Finnigan S, Wong A, Read S. Defining abnormal slow EEG activity in acute ischaemic stroke: Delta/alpha ratio as an optimal QEEG index. Clin Neurophysiol. 2016;127(2):1452–9. https://doi.org/10.1016/j.clinph.2015.07.014 .

Carter AR, Shulman GL, Corbetta M. Why use a connectivity-based approach to study stroke and recovery of function? Neuroimage. 2012;62(4):2271–80.

van der Velde B, Haartsen R, Kemner C. Test-retest reliability of EEG network characteristics in infants. Brain Behav. 2019;9(5):1–10.

Gennaro F, de Bruin ED. A pilot study assessing reliability and age-related differences in corticomuscular and intramuscular coherence in ankle dorsiflexors during walking. Physiol Rep. 2020;8(4):1–12.

Brihmat N, Loubinoux I, Castel-Lacanal E, Marque P, Gasq D. Kinematic parameters obtained with the ArmeoSpring for upper-limb assessment after stroke: a reliability and learning effect study for guiding parameter use. J Neuroeng Rehabil. 2020;17(1):130. https://doi.org/10.1186/s12984-020-00759-2 .

Dewald JPA, Ellis MD, Acosta AM, McPherson JG, Stienen AHA. Implementation of impairment- based neurorehabilitation devices and technologies following brain injury. Neurorehabilitation technology, 2nd edn. 2016. 375–392 p.

Subramanian SK, Yamanaka J, Chilingaryan G, Levin MF. Validity of movement pattern kinematics as measures of arm motor impairment poststroke. Stroke. 2010;41(10):2303–8.

Fayers PM, Machin D. Quality of life: the assessment, analysis and reporting of patient‐reported outcomes . John Wiley & Sons, Incorporated. 2016;3:89-124.

Alt Murphy M, Häger CK. Kinematic analysis of the upper extremity after stroke—how far have we reached and what have we grasped? Phys Ther Rev. 2015;20(3):137–55.

Shishov N, Melzer I, Bar-Haim S. Parameters and measures in assessment of motor learning in neurorehabilitation; a systematic review of the literature. Front Hum Neurosci. 2017. https://doi.org/10.3389/fnhum.2017.00082 .

Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155–63. https://doi.org/10.1016/j.jcm.2016.02.012 .

Angel RW. Electromyographic patterns during ballistic movement of normal and spastic limbs. Brain Res. 1975;99(2):387–92.

McLellan DL. Co-contraction and stretch reflexes in spasticity during treatment with baclofen. J Neurol Neurosurg Psychiatry. 1977;40(1):30–8.

Dewald JPA, Pope PS, Given JD, Buchanan TS, Rymer WZ. Abnormal muscle coactivation patterns during isometric torque generation at the elbow and shoulder in hemiparetic subjects. Brain. 1995;118(2):495–510. https://doi.org/10.1093/brain/118.2.495 .

Wilkins KB, Yao J, Owen M, Karbasforoushan H, Carmona C, Dewald JPA. Limited capacity for ipsilateral secondary motor areas to support hand function post-stroke. J Physiol. 2020;598(11):2153–67. https://doi.org/10.1113/JP279377 .

Agius Anastasi A, Falzon O, Camilleri K, Vella M, Muscat R. Brain symmetry index in healthy and stroke patients for assessment and prognosis. Stroke Res Treat. 2017;30(2017):1–9.

Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gøtzsche PC, Ioannidis JPA, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate healthcare interventions: explanation and elaboration. BMJ. 2009;339: b2700.

Colombo R, Pisano F, Micera S, Mazzone A, Delconte C, Carrozza MC, et al. Assessing mechanisms of recovery during robot-aided neurorehabilitation of the upper limb. Neurorehabil Neural Repair. 2008;22(1):50–63. https://doi.org/10.1177/1545968307303401 .

Keller U, Schölch S, Albisser U, Rudhe C, Curt A, Riener R, et al. Robot-assisted arm assessments in spinal cord injured patients: a consideration of concept study. PLoS One. 2015;10(5):e0126948. https://doi.org/10.1371/journal.pone.0126948 .

Mostafavi SM. Computational models for improved diagnosis and prognosis of stroke using robot-based biomarkers. 2016. http://hdl.handle.net/1974/14563 .

Bosecker C, Dipietro L, Volpe B, Krebs HI. Kinematic robot-based evaluation scales and clinical counterparts to measure upper limb motor performance in patients with chronic stroke. Neurorehabil Neural Repair. 2010;24(1):62–9.

Balasubramanian S, Melendez-Calderon A, Roby-Brami A, Burdet E. On the analysis of movement smoothness. J Neuroeng Rehabil. 2015;12(1):112. https://doi.org/10.1186/s12984-015-0090-9 .

Rohrer B, Fasoli S, Krebs HI, Hughes R, Volpe B, Frontera WR, et al. Movement smoothness changes during stroke recovery. J Neurosci. 2002;22(18):8297–304.

Mobini A, Behzadipour S, Saadat M. Test-retest reliability of Kinect’s measurements for the evaluation of upper body recovery of stroke patients. Biomed Eng Online. 2015;14(1):1–14.

Zariffa J, Myers M, Coahran M, Wang RH. Smallest real differences for robotic measures of upper extremity function after stroke: implications for tracking recovery. J Rehabil Assist Technol Eng. 2018;5:205566831878803. https://doi.org/10.1177/2055668318788036 .

Elovic E, Brashear A. Spasticity : diagnosis and management. New York: Demos Medical; 2011. http://ida.lib.uidaho.edu:2048/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=e000xna&AN=352265&site=ehost-live&scope=site .

Centen A, Lowrey CR, Scott SH, Yeh TT, Mochizuki G. KAPS (kinematic assessment of passive stretch): a tool to assess elbow flexor and extensor spasticity after stroke using a robotic exoskeleton. J Neuroeng Rehabil. 2017;14(1):1–13.

Sin M, Kim WS, Cho K, Cho S, Paik NJ. Improving the test-retest and inter-rater reliability for stretch reflex measurements using an isokinetic device in stroke patients with mild to moderate elbow spasticity. J Electromyogr Kinesiol. 2017;2018(39):120–7. https://doi.org/10.1016/j.jelekin.2018.01.012 .

Germanotta M, Cruciani A, Pecchioli C, Loreti S, Spedicato A, Meotti M, et al. Reliability, validity and discriminant ability of the instrumental indices provided by a novel planar robotic device for upper limb rehabilitation. J Neuroeng Rehabil. 2018;15(1):1–14.

Wagner JM, Rhodes JA, Patten C. Reproducibility and minimal detectable change of three-dimensional kinematic analysis of reaching tasks in people with hemiparesis after stroke. 2008. https://doi.org/10.2522/ptj.20070255 .

Semrau JA, Herter TM, Scott SH, Dukelow SP. Inter-rater reliability of kinesthetic measurements with the KINARM robotic exoskeleton. J Neuroeng Rehabil. 2017;14(1):1–10.

Lin CH, Chou LW, Wei SH, Lieu FK, Chiang SL, Sung WH. Validity and reliability of a novel device for bilateral upper extremity functional measurements. Comput Methods Programs Biomed. 2014;114(3):315–23. https://doi.org/10.1016/j.cmpb.2014.02.012 .

Wolf S, Butler A, Alberts J, Kim M. Contemporary linkages between EMG, kinetics and stroke. J Electromyogr Kinesiol. 2005;15(3):229–39.

Iyer KK. Effective assessments of electroencephalography during stroke recovery : contemporary approaches and considerations. J Neurophysiol. 2017;118(5):2521–5.

Liu J, Sheng Y, Liu H. Corticomuscular coherence and its applications: a review. Front Hum Neurosci. 2019;13(March):1–16.

Pan LLH, Yang WW, Kao CL, Tsai MW, Wei SH, Fregni F, et al. Effects of 8-week sensory electrical stimulation combined with motor training on EEG-EMG coherence and motor function in individuals with stroke. Sci Rep. 2018;8(1):1–10.

Wu J, Quinlan EB, Dodakian L, McKenzie A, Kathuria N, Zhou RJ, et al. Connectivity measures are robust biomarkers of cortical function and plasticity after stroke. Brain. 2015;138(8):2359–69.

Mrachacz-Kersting N, Jiang N, Thomas Stevenson AJ, Niazi IK, Kostic V, Pavlovic A, et al. Efficient neuroplasticity induction in chronic stroke patients by an associative brain-computer interface. J Neurophysiol. 2016;115(3):1410–21.

Bentes C, Peralta AR, Viana P, Martins H, Morgado C, Casimiro C, et al. Quantitative EEG and functional outcome following acute ischemic stroke. Clin Neurophysiol. 2018;129(8):1680–7.

Leon-Carrion J, Martin-Rodriguez JF, Damas-Lopez J, Manuel J, Dominguez-Morales MR. Delta–alpha ratio correlates with level of recovery after neurorehabilitation in patients with acquired brain injury. Clin Neurophysiol. 2009;120(6):1039–45. https://doi.org/10.1016/j.clinph.2009.01.021 .

Finnigan S, van Putten MJAM. EEG in ischaemic stroke: qEEG can uniquely inform (sub-)acute prognoses and clinical management. Clin Neurophysiol. 2013;124(1):10–9.

Trujillo P, Mastropietro A, Scano A, Chiavenna A, Mrakic-Sposta S, Caimmi M, et al. Quantitative EEG for predicting upper limb motor recovery in chronic stroke robot-assisted rehabilitation. IEEE Trans Neural Syst Rehabil Eng. 2017;25(7):1058–67.

Jordan K. Emergency EEG and continuous EEG monitoring in acute ischemic stroke. Clin Neurophysiol. 2004;21(5):341–52.

Comani S, Velluto L, Schinaia L, Cerroni G, Serio A, Buzzelli S, et al. Monitoring neuro-motor recovery from stroke with high-resolution EEG, robotics and virtual reality: a proof of concept. IEEE Trans Neural Syst Rehabil Eng. 2015;23(6):1106–16.



Acknowledgements

The authors would like to thank Stephen Goodwin and Aaron I. Feinstein for their contributions to the collection and organization of references on robotic systems, measurements, and metrics.

Funding

This work was funded by the National Science Foundation (Award #1532239) and the Eunice Kennedy Shriver National Institute of Child Health & Human Development of the National Institutes of Health (Award #K12HD073945). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Science Foundation or the National Institutes of Health.

Author information

Authors and Affiliations

Mechanical Engineering Department, University of Idaho, Moscow, ID, USA

Rene M. Maura, Eric T. Wolbrecht & Joel C. Perry

Engineering and Physics Department, Whitworth University, Spokane, WA, USA

Richard E. Stevens

College of Medicine, Washington State University, Spokane, WA, USA

Douglas L. Weeks

Electrical Engineering Department, University of Idaho, Moscow, ID, USA

Sebastian Rueda Parra


Contributions

RM and SRP drafted the manuscript and performed the literature search. EW, JP, RS, and DW provided concepts, edited, and revised the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Rene M. Maura.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

See Table 11 .

See Table 12 .

See Table 13 .

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ . The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/ ) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article

Cite this article.

Maura, R.M., Rueda Parra, S., Stevens, R.E. et al. Literature review of stroke assessment for upper-extremity physical function via EEG, EMG, kinematic, and kinetic measurements and their reliability. J NeuroEngineering Rehabil 20, 21 (2023). https://doi.org/10.1186/s12984-023-01142-7


Received: 27 May 2021

Accepted: 19 January 2023

Published: 15 February 2023

DOI: https://doi.org/10.1186/s12984-023-01142-7


Keywords

  • Reliability
  • Robot-assisted therapy
  • Exoskeleton
  • Neurological assessment
  • Rehabilitation
  • Motor function

Journal of NeuroEngineering and Rehabilitation

ISSN: 1743-0003



Publics’ views on ethical challenges of artificial intelligence: a scoping review

  • Open access
  • Published: 19 December 2023


  • Helena Machado (ORCID: orcid.org/0000-0001-8554-7619),
  • Susana Silva (ORCID: orcid.org/0000-0002-1335-8648) &
  • Laura Neiva (ORCID: orcid.org/0000-0002-1954-7597)


This scoping review examines the research landscape about publics’ views on the ethical challenges of AI. To elucidate how the concerns voiced by the publics are translated within the research domain, this study scrutinizes 64 publications sourced from PubMed ® and Web of Science™. The central inquiry revolves around discerning the motivations, stakeholders, and ethical quandaries that emerge in research on this topic. The analysis reveals that innovation and legitimation stand out as the primary impetuses for engaging the public in deliberations concerning the ethical dilemmas associated with AI technologies. Supplementary motives are rooted in educational endeavors, democratization initiatives, and inspirational pursuits, whereas politicization emerges as a comparatively infrequent incentive. The study participants predominantly comprise the general public and professional groups, followed by AI system developers, industry and business managers, students, scholars, consumers, and policymakers. The ethical dimensions most commonly explored in the literature encompass human agency and oversight, followed by issues centered on privacy and data governance. Conversely, topics related to diversity, nondiscrimination, fairness, societal and environmental well-being, technical robustness, safety, transparency, and accountability receive comparatively less attention. This paper delineates the concrete operationalization of calls for public involvement in AI governance within the research sphere. It underscores the intricate interplay between ethical concerns, public involvement, and societal structures, including political and economic agendas, which serve to bolster technical proficiency and affirm the legitimacy of AI development in accordance with the institutional norms that underlie responsible research practices.


1 Introduction

Current advances in the research, development, and application of artificial intelligence (AI) systems have yielded a far-reaching discourse on AI ethics, accompanied by calls for AI technology to be democratically accountable and trustworthy from the publics’ perspective [1, 2, 3, 4, 5]. Consequently, several ethics guidelines for AI have been released in recent years; as of early 2020, there were 167 AI ethics guidelines documents around the world [6]. Organizations such as the European Commission (EC), the Organization for Economic Co-operation and Development (OECD), and the United Nations Educational, Scientific and Cultural Organization (UNESCO) recognize that public participation is crucial for ensuring the responsible development and deployment of AI technologies, emphasizing the importance of inclusivity, transparency, and democratic processes to effectively address the societal implications of AI [11, 12]. These efforts were publicly announced as aiming to create a common understanding of ethical AI development and to foster responsible practices that address societal concerns while maximizing AI’s potential benefits [13, 14]. The concept of human-centric AI has emerged as a key principle in many of these regulatory initiatives, with the purposes of ensuring that human values are incorporated into the design of algorithms, that humans do not lose control over automated systems, and that AI is used in the service of humanity and the common good to improve human welfare and human rights [15]. By the same rationale, the opacity and rapid diffusion of AI have prompted debate about how such technologies ought to be governed and which actors and values should be involved in shaping governance regimes [1, 2].

While industry and business have traditionally been seen as having little or no incentive to engage with ethics or in dialogue, AI leaders currently sponsor AI ethics [6, 16, 17]. However, there are concerns that calls for ethics, public participation, and human-centric approaches in areas of high economic and political importance, such as AI, are being used instrumentally by the AI industry. A growing corpus of critical literature has characterized the development of AI ethics as an effort to reduce ethics to another form of industrial capital, or to coopt and capture researchers as part of efforts to control public narratives [12, 18]. According to some authors, one reason ethics is so appealing to many AI companies is that it calms critical voices from the publics; AI ethics is thus seen as a way of gaining or restoring trust, credibility, support, and legitimation, while criticized practices continue and the agenda of industry and science is maintained [12, 17, 19, 20].

Critical approaches also point out that, despite regulatory initiatives explicitly invoking the need to incorporate human values into AI systems, their main objective is to set rules and standards that enable AI-based products and services to circulate in markets [20, 21, 22], and that they might serve to avoid or delay binding regulation [12, 23]. Other critical studies argue that AI ethics fails to mitigate the racial, social, and environmental damage of AI technologies in any meaningful sense [24] and excludes alternative ethical practices [25, 26]. As Su [13] explains, in a paper that considers the promise and perils of international human rights in AI governance, while human rights can serve as an authoritative source for holding AI developers accountable, their application to AI governance in practice shows a lack of effectiveness, an inability to effect structural change, and the problem of cooptation.

In a value analysis of AI national strategies, Wilson [5] concludes that the publics are primarily cast as recipients of AI’s abstract benefits, users of AI-driven services and products, a workforce in need of training and upskilling, or an important element of a thriving democratic society that unlocks AI’s potential. According to the author, when AI strategies articulate a governance role for the publics, it reads more like an afterthought or rhetorical gesture than a clear commitment to putting “society-in-the-loop” into AI design and implementation [5, pp. 7–8]. Another study of how public participation is framed in AI policy documents [4] shows that high expectations are assigned to public participation as a solution to concerns about the concentration of power, increases in inequality, lack of diversity, and bias. In practice, however, this framing thus far gives little consideration to challenges well known to researchers and practitioners of public participation with science and technology, such as the difficulty of achieving consensus among diverse societal views, the high resource requirements of public participation exercises, and the risks of capture by vested interests [4, pp. 170–171]. These studies consistently reveal a noteworthy pattern: while references to public participation in AI governance are prevalent in the majority of AI national strategies, they tend to remain abstract and are often overshadowed by other roles, values, and policy concerns.

Some authors have thus contended that the increasing demand to involve multiple stakeholders in AI governance, including the publics, signifies a discernible transformation within the sphere of science and technology policy. This transformation frequently embraces the framework of “responsible innovation”, which emphasizes alignment with societal imperatives; responsiveness to evolving ethical, social, and environmental considerations; and the participation of the publics as well as traditionally defined stakeholders [3, 28]. Investigating how the conception and promotion of public participation in European science and technology policies have evolved, Macq, Tancoine, and Strasser [29] distinguish between “participation in decision-making” (pertaining to science policy decisions or decisions on research topics) and “participation in knowledge and innovation-making”. They find that “while public participation had initially been conceived and promoted as a way to build legitimacy of research policy decisions by involving publics into decision-making processes, it is now also promoted as a way to produce better or more knowledge and innovation by involving publics into knowledge and innovation-making processes, and thus building legitimacy for science and technology as a whole” [29, p. 508]. Although this shift in science and technology research policies has been noted, there is a noticeable void in the literature regarding how concrete research practices incorporate public perspectives and embrace multistakeholder approaches, inclusion, and dialogue.

While several studies have delved into the framing of the publics’ role within AI governance in several instances (from Big Tech initiatives to hiring ethics teams and guidelines issued from multiple institutions to governments’ national policies related to AI development), discussing the underlying motivations driving the publics’ participation and the ethical considerations resulting from such involvement, there remains a notable scarcity of knowledge concerning how publicly voiced concerns are concretely translated into research efforts [30, pp. 3–4; 31, p. 8; 6]. To address this crucial gap, our scoping review endeavors to analyse the research landscape about the publics’ views on the ethical challenges of AI. Our primary objective is to uncover the motivations behind involving the publics in research initiatives, identify the segments of the publics that are considered in these studies, and illuminate the ethical concerns that warrant specific attention. Through this scoping review, we aim to enhance the understanding of the political and social backdrop within which debates and prior commitments regarding values and conditions for publics’ participation in matters related to science and technology are formulated and expressed [29, 32, 33], and which specific normative social commitments are projected and performed by institutional science [34, p. 108; 35, p. 856].

2 Methods

We followed the guidance for descriptive systematic scoping reviews by Levac et al. [36], based on the methodological framework developed by Arksey and O’Malley [37]. The steps of the review are listed below:

2.1 Stage 1: identifying the research question

The central question guiding this scoping review is the following: What motivations, publics and ethical issues emerge in research addressing the publics’ views on the ethical challenges of AI? We ask:

What motivations for engaging the publics with AI technologies are articulated?

Who are the publics invited?

Which ethical issues concerning AI technologies are perceived as needing the participation of the publics?

2.2 Stage 2: identifying relevant studies

A search of the publications on PubMed® and Web of Science™ was conducted on 19 May 2023, with no restriction set for language or time of publication, using the following search expression: (“AI” OR “artificial intelligence”) AND (“public” OR “citizen”) AND “ethics”. The search was followed by backwards reference tracking, examining the references of the selected publications based on full-text assessment.
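As a sketch, the duplicate-removal step between the two databases can be keyed on DOI when available and on a normalized title otherwise. The query string below is the one reported above, but the record structure and the `dedupe` helper are hypothetical illustrations, not the authors' actual tooling:

```python
# Hypothetical sketch of cross-database duplicate removal; only the boolean
# search expression is taken from the review itself.
QUERY = '("AI" OR "artificial intelligence") AND ("public" OR "citizen") AND "ethics"'

def dedupe(records):
    """Drop cross-database duplicates: match on DOI when present,
    otherwise on a case- and whitespace-normalized title."""
    seen, unique = set(), []
    for rec in records:
        key = rec.get("doi") or " ".join(rec["title"].lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

records = [
    {"title": "AI Ethics and the Public"},                      # PubMed record, no DOI
    {"title": "AI  ethics and the Public"},                     # same paper from WoS
    {"title": "Citizens' Views on AI", "doi": "10.1000/xyz"},
    {"title": "Citizens' views on AI.", "doi": "10.1000/xyz"},  # DOI duplicate
]
print(len(dedupe(records)))  # → 2
```

Matching on normalized titles is a pragmatic fallback when DOIs are missing; in practice reviewers would still eyeball near-matches before discarding them.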

2.3 Stage 3: study selection

The inclusion criteria allowed only empirical, peer-reviewed, original full-length studies, written in English, that explored publics’ views on the ethical challenges of AI as their main outcome. The exclusion criteria disallowed studies focusing on media discourses and texts. The titles of 1612 records were retrieved. After the removal of duplicates, 1485 records were examined. Two authors (HM and SS) independently screened all the papers retrieved, initially based on the title and abstract and afterward based on the full text. Screening decisions were crosschecked and discussed in both phases, and perfect agreement was achieved.

The screening process is summarized in Fig.  1 . Based on title and abstract assessments, 1265 records were excluded because they were neither original full-length peer-reviewed empirical studies nor focused on the publics’ views on the ethical challenges of AI. Of the 220 fully read papers, 54 met the inclusion criteria. After backwards reference tracking, 10 papers were included, and the final review was composed of 64 papers.
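The screening counts reported above are internally consistent, which a quick arithmetic check confirms (all numbers are taken from the text):

```python
# Consistency check of the reported screening flow.
retrieved = 1612                  # titles retrieved from both databases
after_dedup = 1485                # records examined after duplicate removal
excluded_on_title_abstract = 1265
full_text_read = after_dedup - excluded_on_title_abstract
included_from_screening = 54
added_by_backward_tracking = 10
final_review = included_from_screening + added_by_backward_tracking

print(full_text_read, final_review)  # → 220 64
```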

Fig. 1 Flowchart showing the search results and screening process for the scoping review of publics’ views on ethical challenges of AI

2.4 Stage 4: charting the data

A standardized data extraction sheet was initially developed by two authors (HM and SS) and completed by two coders (SS and LN), including both quantitative and qualitative data (Supplemental Table “Data Extraction”). We used MS Excel to chart the data from the studies.

The two coders independently charted the first 10 records, with any disagreements or uncertainties in abstractions being discussed and resolved by consensus. The forms were further refined and finalized upon consensus before completing the data charting process. Each of the remaining records was charted by one coder. Two meetings were held to ensure consistency in data charting and to verify accuracy. The first author (HM) reviewed the results.

Descriptive data for the characterization of studies included information about the authors and publication year, the country where the study was developed, study aims, type of research (quantitative, qualitative, or other), assessment of the publics’ views, and sample. The types of research participants recruited as publics were coded into 11 categories: developers of AI systems; managers from industry and business; representatives of governance bodies; policymakers; academics and researchers; students; professional groups; general public; local communities; patients/consumers; and other (specify).

Data on the main motivations for researching the publics’ views on the ethical challenges of AI were also gathered. Authors’ accounts of their motivations were synthesized into eight categories according to the coding framework proposed by Weingart and colleagues [ 33 ] concerning public engagement with science and technology-related issues: education (to inform and educate the public about AI, improving public access to scientific knowledge); innovation (to promote innovation, the publics are considered to be a valuable source of knowledge and are called upon to contribute to knowledge production, bridge building and including knowledge outside ‘formal’ ethics); legitimation (to promote public trust in and acceptance of AI, as well as of policies supporting AI); inspiration (to inspire and raise interest in AI, to secure a STEM-educated labor force); politicization (to address past political injustices and historical exclusion); democratization (to empower citizens to participate competently in society and/or to participate in AI); other (specify); and not clearly evident.
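Since a study can articulate several motivations, charting of this kind reduces to counting, per category, the studies that carry each code. A small illustrative sketch (the study identifiers and code assignments are invented, not the review's data):

```python
from collections import Counter

# The eight motivation categories from Weingart and colleagues' framework.
MOTIVATIONS = {"education", "innovation", "legitimation", "inspiration",
               "politicization", "democratization", "other", "not clearly evident"}

# Hypothetical charted data: each study mapped to the motivation codes it articulates.
charted = {
    "study_01": {"innovation"},
    "study_02": {"innovation", "legitimation"},
    "study_03": {"legitimation"},
    "study_04": {"education", "democratization"},
}

per_code = Counter(code for codes in charted.values() for code in codes)
single_motivation = sum(1 for codes in charted.values() if len(codes) == 1)

# Every assigned code must belong to the framework.
assert all(code in MOTIVATIONS for codes in charted.values() for code in codes)
print(per_code["innovation"], per_code["legitimation"], single_motivation)  # → 2 2 2
```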

Based on the content analysis technique [ 38 ], ethical issues perceived as needing the participation of the publics were identified through quotations stated in the studies. These were then summarized in seven key ethical principles, according to the proposal outlined by the EC's Ethics Guidelines for Trustworthy AI [ 39 ]: human agency and oversight; technical robustness and safety; privacy and data governance; transparency; diversity, nondiscrimination and fairness; societal and environmental well-being; and accountability.

2.5 Stage 5: collating, summarizing, and reporting the results

The main characteristics of the 64 included studies can be found in Table 1. Studies were grouped by type of research and ordered by year of publication. The findings regarding the publics invited to participate are presented in Fig. 2. The main motivations for engaging the publics with AI technologies and the ethical issues perceived as needing the participation of the publics are summarized in Tables 2 and 3, respectively. The results are presented below in a narrative format, with complementary tables and figures to provide a visual representation of key findings.

Fig. 2 Publics invited to engage with issues framed as ethical challenges of AI

Some methodological limitations of this scoping review should be taken into account when interpreting the results. The use of only two search engines may have excluded relevant studies, although the search was supplemented by scanning the reference lists of eligible studies. An in-depth analysis of the topics explored within each of the seven key ethical principles outlined by the EC’s Ethics Guidelines for Trustworthy AI was not conducted; such an assessment would have provided a more detailed understanding of the publics’ views on the ethical challenges of AI.

3.1 Study characteristics

Most of the studies were published in recent years, with 35 of the 64 studies appearing in 2022 and 2023. Journals were indexed either in the Science Citation Index Expanded (n = 25) or the Social Science Citation Index (n = 23), with fewer journals indexed in the Emerging Sources Citation Index (n = 7) and the Arts and Humanities Citation Index (n = 2). The works covered a wide range of fields, including health and medicine (services, policy, medical informatics, medical ethics, public and environmental health); education; business, management and public administration; computer science; information sciences; engineering; robotics; communication; psychology; political science; and transportation. Beyond general assessments of publics’ attitudes toward, preferences for, and expectations and concerns about AI, the publics’ views on the ethical challenges of AI technologies have been studied mainly in relation to healthcare and public services, and less frequently in relation to autonomous vehicles (AV), education, robotic technologies, and smart homes. Most of the studies (n = 47) were funded by research agencies, with 7 papers reporting conflicts of interest.

Quantitative research approaches assessed the publics’ views on the ethical challenges of AI mainly through online or web-based surveys and experimental platforms, relying on Delphi studies, moral judgment studies, hypothetical vignettes, and choice-based/comparative conjoint surveys. The 25 qualitative studies collected data mainly through semistructured or in-depth interviews. Analysis of publicly available material reporting on AI-use cases, focus groups, a post hoc self-assessment, World Café, participatory research, and practice-based design research were each used once or twice. Multi- or mixed-methods studies relied on surveys with open-ended and closed questions, frequently combined with focus groups, in-depth interviews, literature reviews, expert opinions, examinations of relevant curriculum examples, tests, and reflexive writings.

The studies were performed (where stated) in a wide variety of countries, including the USA and Australia. More than half of the studies (n = 38) were conducted in a single country. Almost all studies used nonprobability sampling techniques. In quantitative studies, sample sizes varied from 2.3 million internet users in an online experimental platform study [ 40 ] to 20 participants in a Delphi study [ 41 ]. In qualitative studies, samples varied from 123 participants in 21 focus groups [ 42 ] to six expert interviews [ 43 ]. In multi- or mixed-methods studies, samples varied from 2036 participants [ 44 ] to 21 participants [ 45 ].

3.2 Motivations for engaging the publics

The qualitative synthesis of the motivations for researching the publics’ views on the ethical challenges of AI is presented in Table 2, ordered by the number of studies referencing them in the scoping review. More than half of the studies (n = 37) addressed a single motivation. Innovation (n = 33) and legitimation (n = 29) had the highest relevance as motivations for engaging the publics in the ethical challenges of AI technologies, with the two articulated together in 15 studies. Additional motivations were rooted in education (n = 13), democratization (n = 11), and inspiration (n = 9). Politicization was mentioned in five studies. Although these were not authors’ motivations, a few studies were found to have educational [ 46 , 47 ], democratization [ 48 , 49 ], and legitimation or inspiration effects [ 50 ].

Considering the publics as a valuable source of knowledge that can add real value to innovation processes in both the private and public sectors was the most frequent motivation mentioned in the literature. The call for public participation is rooted in the aspiration to add knowledge outside “formal” ethics at three interrelated levels. First, at a societal level, by asking what kind of AI we want as a society, based on novel experiments on public policy preferences [ 51 ] and on the study of public perceptions, values, and concerns regarding AI design, development, and implementation in domains such as health care [ 46 , 52 , 53 , 54 , 55 ], public and social services [ 49 , 56 , 57 , 58 ], AV [ 59 , 60 ], and journalism [ 61 ]. Second, at a practical level, the literature provides insights into the perceived usefulness of AI applications [ 62 , 63 ] and into choices between boosting developers’ voluntary adoption of ethical standards and imposing ethical standards via regulation and oversight [ 64 ], as well as suggesting specific guidance for the development and use of AI systems [ 65 , 66 , 67 ]. Finally, at a theoretical level, the literature expands the social-technical perspective [ 68 ] and motivated-reasoning theory [ 69 ].

Legitimation was also a frequent motivation for engaging the publics, underpinned by the need for public trust in, and social licences for, implementing AI technologies. Ensuring the long-term social acceptability of AI as a trustworthy technology [ 70 , 71 ] was perceived as essential to support its use and to justify its implementation. In one study [ 72 ], the authors developed an AI ethics scale to quantify how AI research is accepted in society and which areas of ethical, legal, and social issues (ELSI) people are most concerned with. Public trust in and acceptance of AI are sought by social institutions such as governments, the private sector, industry bodies, and the science community through behaving in a trustworthy manner, respecting public concerns, aligning with societal values, and involving members of the publics in decision-making and public policy [ 46 , 48 , 73 , 74 , 75 ], as well as in the responsible design and integration of AI technologies [ 52 , 76 , 77 ].

Education, democratization, and inspiration had a more modest presence as motivations to explore the publics’ views on the ethical challenges of AI. Considering the emergence of new roles and tasks related to AI, the literature has pointed to the public need to ensure the safe use of AI technologies by incorporating ethics and career futures into the education, preparation, and training of both middle school and university students and the current and future health workforce. Improvements in education and guidance for developers and older adults were also noted. The publics’ views on what needs to be learned and on how this learning may be supported or assessed were perceived as crucial. In one study [ 78 ], the authors developed strategies that promote learning related to AI through collaborative media production, connecting computational thinking to civic issues and creative expression. In another study [ 79 ], real-world scenarios were successfully used as a novel approach to teaching AI ethics. Rhim et al. [ 76 ] provided AV moral behavior design guidelines for policymakers, developers, and the publics by reducing the abstractness of AV morality.

Studies motivated by democratization promoted broader public participation in AI, aiming to empower citizens both to express their understandings, apprehensions, and concerns about AI [ 43 , 78 , 80 , 81 ] and to address ethical issues in AI as critical consumers, (potential future) developers of AI technologies or would-be participants in codesign processes [ 40 , 43 , 45 , 78 , 82 , 83 ]. Understanding the publics’ views on the ethical challenges of AI is expected to influence companies and policymakers [ 40 ]. In one study [ 45 ], the authors explored how a digital app might support citizens’ engagement in AI governance by informing them, raising public awareness, measuring publics’ attitudes and supporting collective decision-making.

Inspiration revolved around three main motivations: to raise public interest in AI [ 46 , 48 ]; to guide future empirical and design studies [ 79 ]; and to promote developers’ moral awareness through close collaboration between all those involved in the implementation, use, and design of AI technologies [ 46 , 61 , 78 , 84 , 85 ].

Politicization was the least frequent motivation reported in the literature for engaging the publics. Recognizing the need to mitigate social biases [ 86 ], public participation to address historically marginalized populations [ 78 , 87 ], and promoting social equity [ 79 ] were the highlighted motives.

3.3 The invited publics

Study participants were mostly the general public and professional groups, followed by developers of AI systems, managers from industry and business, students, academics and researchers, patients/consumers, and policymakers (Fig.  2 ). The views of local communities and representatives of governance bodies were rarely assessed.

Representative samples of the general public were used in five papers related to studies conducted in the USA [ 88 ], Denmark [ 73 ], Germany [ 48 ], and Austria [ 49 , 63 ]. The remaining random or purposive samples from the general public comprised mainly adults and current and potential users of AI products and services, with few studies involving informal caregivers or family members of patients (n = 3), older people (n = 2), and university staff (n = 2).

Samples of professional groups included mainly healthcare professionals (19 out of 24 studies). Educators, law enforcement, media practitioners, and GLAM professionals (galleries, libraries, archives, and museums) were invited once.

3.4 Ethical issues

The ethical issues concerning AI technologies perceived as needing the participation of the publics are depicted in Table  3 . They were mapped by measuring the number of studies referencing them in the scoping review. Human agency and oversight (n = 55) was the most frequent ethical aspect that was studied in the literature, followed by those centered on privacy and data governance (n = 43). Diversity, nondiscrimination and fairness (n = 39), societal and environmental well-being (n = 39), technical robustness and safety (n = 38), transparency (n = 35), and accountability (n = 31) were less frequently discussed.

The concerns regarding human agency and oversight were the replacement of human beings by AI technologies and deskilling [ 47 , 55 , 67 , 74 , 75 , 89 , 90 ]; the loss of autonomy, critical thinking, and innovative capacities [ 50 , 58 , 61 , 77 , 78 , 83 , 85 , 90 ]; the erosion of human judgment and oversight [ 41 , 70 , 91 ]; and the potential for (over)dependence on technology and “oversimplified” decisions [ 90 ] due to the lack of publics’ expertise in judging and controlling AI technologies [ 68 ]. Beyond these ethical challenges, the following contributions of AI systems to empowering human beings were noted: more fruitful and empathetic social relationships [ 47 , 68 , 90 ]; enhancing human capabilities and quality of life [ 68 , 70 , 74 , 83 , 92 ]; improving efficiency and productivity at work [ 50 , 53 , 62 , 65 , 83 ] by reducing errors [ 77 ], relieving the burden of professionals and/or increasing accuracy in decisions [ 47 , 55 , 90 ]; and facilitating and expanding access to safe and fair healthcare [ 42 , 53 , 54 ] through earlier diagnosis, increased screening and monitoring, and personalized prescriptions [ 47 , 90 ]. To foster human rights and allow people to make informed decisions, the final say was held to rest with the person themselves [ 42 , 43 , 46 , 55 , 64 , 67 , 73 , 76 ]. People should determine where and when to use automated functions and which functions to use [ 44 , 54 ], developing “job sharing” arrangements in which machines and humans complement and enrich each other [ 56 , 65 , 90 ]. The literature highlights the need to build AI systems that are under human control [ 48 , 70 ], whether to confirm or to correct the AI system’s outputs and recommendations [ 66 , 90 ]. Proper oversight mechanisms were seen as crucial to ensure accuracy and completeness, with divergent views about who should be involved in public participation approaches [ 86 , 87 ].

Data sharing and/or data misuse were considered the major roadblocks regarding privacy and data governance, with some studies pointing out participants’ distrust related to commercial interests in health data [ 55 , 90 , 93 , 94 , 95 ] and concerns regarding the risks of information getting into the hands of hackers, banks, employers, insurance companies, or governments [ 66 ]. As data are the backbone of AI, secure methods of data storage and protection are understood as needing to be provided from the input to the output data. Recognizing that, in contemporary societies, people’s awareness of the consequences of smartphone use has led to the minimization of privacy concerns [ 93 ], some studies have focused on the impacts of data breaches and loss of privacy and confidentiality [ 43 , 45 , 46 , 60 , 62 , 80 ] in relation to health-sensitive personal data [ 46 , 93 ], potentially affecting more vulnerable populations, such as senior citizens and mentally ill patients [ 82 , 90 ] as well as those at young ages [ 50 ], and when journalistic organizations collect user data to provide personalized news suggestions [ 61 ]. The need to find a balance between widening access to data and ensuring confidentiality and respect for privacy [ 53 ] was often expressed in three interrelated terms: first, the ability of data subjects to be fully informed about how data will be used, to be given the option of providing informed consent [ 46 , 58 , 78 ], and to control personal information about oneself [ 57 ]; second, the need for regulation [ 52 , 65 , 87 ], with one study reporting that AI developers complain about the complexity, slowness, and obstacles created by regulation [ 64 ]; and last, the testing and certification of AI-enabled products and services [ 71 ]. The study by De Graaf et al. [ 91 ] discussed robots’ right to store and process the data they collect, while Jenkins and Draper [ 42 ] explored less intrusive ways in which a robot could use information to report back to carers about a patient’s adherence to healthcare.

Studies discussing diversity, nondiscrimination, and fairness have pointed to the development of AI systems that reflect and reify social inequalities [ 45 , 78 ] through nonrepresentative datasets [ 55 , 58 , 96 , 97 ] and algorithmic bias [ 41 , 45 , 85 , 98 ] that might benefit some more than others. This could have multiple negative consequences for different groups based on ethnicity, disease, physical disability, age, gender, culture, or socioeconomic status [ 43 , 55 , 58 , 78 , 82 , 87 ], from the dissemination of hate speech [ 79 ] to the exacerbation of discrimination, which negatively impacts peace and harmony within society [ 58 ]. As there were cross-country differences and issue variations in the publics’ views of discriminatory bias [ 51 , 72 , 73 ], fostering diversity, inclusiveness, and cultural plurality [ 61 ] was perceived as crucial to ensure the transferability/effectiveness of AI systems in all social groups [ 60 , 94 ]. Diversity, nondiscrimination, and fairness were also discussed as a means to help reduce health inequalities [ 41 , 67 , 90 ], to compensate for human preconceptions about certain individuals [ 66 ], and to promote equitable distribution of benefits and burdens [ 57 , 71 , 80 , 93 ], namely, supporting access by all to the same updated and high-quality AI systems [ 50 ]. In one study [ 83 ], students provided constructive solutions for building an unbiased AI system, such as using a diverse dataset engaging people of different ages, genders, ethnicities, and cultures. In another study [ 86 ], participants recommended diverse approaches to mitigate algorithmic bias, from open disclosure of limitations to consumer and patient engagement, representation of marginalized groups, incorporation of equity considerations into sampling methods and legal recourse, and identification of a wide range of stakeholders who may be responsible for addressing AI bias: developers, healthcare workers, manufacturers and vendors, policymakers and regulators, AI researchers, and consumers.

Impacts on employment and social relationships were considered two major ethical challenges regarding societal and environmental well-being. The literature has discussed tensions between job creation [ 51 ] and job displacement [ 42 , 90 ], and between efficiency [ 90 ] and deskilling [ 57 ]. The concerns regarding future social relationships were the loss of empathy, humanity, and/or sensitivity [ 52 , 66 , 90 , 99 ]; isolation and fewer social connections [ 42 , 47 , 90 ]; laziness [ 50 , 83 ]; anxious counterreactions [ 83 , 99 ]; communication problems [ 90 ]; technology dependence [ 60 ]; plagiarism and cheating in education [ 50 ]; and becoming too emotionally attached to a robot [ 65 ]. To overcome social unawareness [ 56 ] and lack of acceptance [ 65 ] due to financial costs [ 56 , 90 ], ecological burden [ 45 ], fear of the unknown [ 65 , 83 ], and/or moral issues [ 44 , 59 , 100 ], AI systems need to provide public benefit sharing [ 55 ], consider discrepancies between public discourse about AI and the utility of the tools in real-world settings and practices [ 53 ], conform to the best standards of sustainability, and address climate change and environmental justice [ 60 , 71 ]. Successful strategies for promoting the acceptability of robots across contexts included making them look as approachable and friendly as possible, but not too human-like [ 49 , 65 ], and having them work with, rather than in competition with, humans [ 42 ].

The publics were invited to participate in the following ethical issues related to technical robustness and safety: usability, reliability, liability, and quality assurance checks of AI tools [ 44 , 45 , 55 , 62 , 99 ]; validity of big data analytic tools [ 87 ]; the degree to which an AI system can perform tasks without errors or mistakes [ 50 , 57 , 66 , 84 , 90 , 93 ]; and the resources needed to ensure appropriate (cyber)security [ 62 , 101 ]. Other studies approached the need to consider both material and normative concerns of AI applications [ 51 ], namely, assuring that AI systems are developed responsibly with proper consideration of risks [ 71 ] and sufficient proof of benefits [ 96 ]. One study [ 64 ] highlighted that AI developers tend to be reluctant to recognize safety issues, bias, errors, and failures, and when they do, they do so selectively and on their own terms, adopting positive-sounding professional jargon such as “AI robustness”.

Some studies recognized the need for greater transparency to reduce the mystery and opaqueness of AI systems [ 71 , 82 , 101 ] and open their “black box” [ 64 , 71 , 98 ]. Clear insights about “what AI is/is not” and “how AI technology works” (definition, applications, implications, consequences, risks, limitations, weaknesses, threats, rewards, strengths, opportunities) were considered necessary to debunk the myth of AI as an independent entity [ 53 ] and to provide sufficient information and understandable explanations of “what’s happening” to society and individuals [ 43 , 48 , 72 , 73 , 78 , 102 ]. Other studies considered that people, when using AI tools, should be made fully aware that these AI devices are capturing and using their data [ 46 ] and how data are collected [ 58 ] and used [ 41 , 46 , 93 ]. Other transparency issues reported in the literature included the need for more information about the composition of data training sets [ 55 ], how algorithms work [ 51 , 55 , 84 , 94 , 97 ], how AI makes a decision [ 57 ], and the motivations for that decision [ 98 ]. Transparency requirements were also addressed as needing the involvement of multiple stakeholders: one study reported that transparency requirements should be seen as a mediator of debate between experts, citizens, communities, and stakeholders [ 87 ] and cannot be reduced to a product feature, avoiding experiences where people feel overwhelmed by explanations [ 98 ] or “too much information” [ 66 ].

Accountability was perceived by the publics as an important ethical issue [ 48 ], while developers expressed mixed attitudes, from moral disengagement to a sense of responsibility and moral conflict and uncertainty [ 85 ]. The literature has revealed public skepticism regarding accountability mechanisms [ 93 ] and criticism about the shift of responsibility away from the tech industries that develop and own AI technologies [ 53 , 68 ], as it opens space for users to assume their own individual responsibility [ 78 ]. This was the case in studies that explored accountability concerns regarding the assignment of fault and responsibility for car accidents involving self-driving technology [ 60 , 76 , 77 , 88 ]. Other studies considered that more attention is needed to scrutinize each application across the AI life cycle [ 41 , 71 , 94 ], to the explainability of AI algorithms that provide the publics with the causes of AI outcomes [ 58 ], and to regulations that assign clear responsibility concerning litigation and liability [ 52 , 89 , 101 , 103 ].

4 Discussion

Within the realm of research studies encompassed in the scoping review, the contemporary impetus for engaging the publics in ethical considerations related to AI predominantly revolves around two key motivations: innovation and legitimation. This might be explained by the current emphasis on responsible innovation, which values the publics’ participation in knowledge and innovation-making [ 29 ] within a prioritization of the instrumental role of science for innovation and economic return [ 33 ]. Considering the publics as a valuable source of knowledge that should be called upon to contribute to knowledge innovation production is underpinned by the desire for legitimacy, specifically centered around securing the publics’ endorsement of scientific and technological advancements [ 33 , 104 ]. Approaching the publics’ views on the ethical challenges of AI can also be used as a form of risk prevention to reduce conflict and close vital debates in contention areas [ 5 , 34 , 105 ].

A second aspect that stood out in this finding is a shift in the motivations frequently reported as central for engaging the publics with AI technologies. Previous studies analysing AI national policies and international guidelines addressing AI governance [ 3 , 4 , 5 ] and a study analysing science communication journals [ 33 ] highlighted education, inspiration and democratization as the most prominent motivations. Our scoping review did not yield similar findings, which might signal a departure, in science policy related to public participation, from the past emphasis on education associated with the deficit model of public understanding of science and democratization of the model of public engagement with science [ 106 , 107 ].

The underlying motives for the publics’ engagement raise the question of the kinds of publics it addresses, i.e., who are the publics that are supposed to be recruited as research participants [ 32 ]. Our findings show a prevalence of the general public followed by professional groups and developers of AI systems. The wider presence of the general public indicates not only what Hagendijk and Irwin [ 32 , p. 167] describe as a fashionable tendency in policy circles since the late 1990s, and especially in Europe, focused on engaging 'the public' in scientific and technological change but also the avoidance of the issues of democratic representation [ 12 , 18 ]. Additionally, the unspecificity of the “public” does not stipulate any particular action [ 24 ] that allows for securing legitimacy for and protecting the interests of a wide range of stakeholders [ 19 , 108 ] while bringing the risk of silencing the voices of the very publics with whom engagement is sought [ 33 ]. The focus on approaching the publics’ views on the ethical challenges of AI through the general public also demonstrates how seeking to “lay” people’s opinions may be driven by a desire to promote public trust and acceptance of AI developments, showing how science negotiates challenges and reinstates its authority [ 109 ].

While this strategy is based on nonscientific audiences or individuals who are not associated with any scientific discipline or area of inquiry as part of their professional activities, the converse strategy—i.e., involving professional groups and AI developers—is also noticeable in our findings. This suggests that technocratic expert-dominated approaches coexist with a call for more inclusive multistakeholder approaches [ 3 ]. This coexistence is reinforced by the normative principles of the “responsible innovation” framework, in particular the prescription that innovation should include the publics as well as traditionally defined stakeholders [ 3 , 110 ], whose input has become so commonplace that seeking the input of laypeople on emerging technologies is sometimes described as a “standard procedure” [ 111 , p. 153].

In the body of literature included in the scoping review, human agency and oversight emerged as the predominant ethical dimension under investigation. This finding underscores the pervasive significance attributed to human centricity, which is progressively integrated into public discourses concerning AI, innovation initiatives, and market-driven endeavours [ 15 , 112 ]. In our perspective, the importance given to human-centric AI is emblematic of the “techno-regulatory imaginary” suggested by Rommetveit and van Dijk [ 35 ] in their study about privacy engineering applied in the European Union’s General Data Protection Regulation. This term encapsulates the evolving collective vision and conceptualization of the role of technology in regulatory and oversight contexts. At least two aspects stand out in the techno-regulatory imaginary, as they are meant to embed technoscience in societally acceptable ways. First, it reinstates pivotal demarcations between humans and nonhumans while concurrently producing intensified blurring between these two realms. Second, the potential resolutions offered relate to embedding fundamental rights within the structural underpinnings of technological architectures [ 35 ].

Following human agency and oversight, the most frequent ethical issue discussed in the studies contained in our scoping review was privacy and data governance. Our findings evidence additional central aspects of the “techno-regulatory imaginary” in the sense that instead of the traditional regulatory sites, modes of protecting privacy and data are increasingly located within more privatized and business-oriented institutions [ 6 , 35 ] and crafted according to a human-centric view of rights. The focus on secure ways of data storage and protection as in need to be provided from the input to the output data, the testing and certification of AI-enabled products and services, the risks of data breaches, and calls for finding a balance between widening access to data and ensuring confidentiality and respect for privacy, exhibited by many studies in this scoping review, portray an increasing framing of privacy and data protection within technological and standardization sites. This tendency shows how forms of expertise for privacy and data protection are shifting away from traditional regulatory and legal professionals towards privacy engineers and risk assessors in information security and software development. Another salient element to highlight pertains to the distribution of responsibility for privacy and data governance [ 6 , 113 ] within the realm of AI development through engagement with external stakeholders, including users, governmental bodies, and regulatory authorities. It extends from an emphasis on issues derived from data sharing and data misuse to facilitating individuals to exercise control over their data and privacy preferences and to advocating for regulatory frameworks that do not impede the pace of innovation. This distribution of responsibility shared among the contributions and expectations of different actors is usually convoked when the operationalization of ethics principles conflicts with AI deployment [ 6 ]. 
In this sense, privacy and data governance are reconstituted as a “normative transversal” [ 113 , p. 20], which works to stabilize or close controversies while its operationalization does not modify any underlying operations in AI development.

Diversity, nondiscrimination and fairness, societal and environmental well-being, technical robustness and safety, transparency, and accountability were the ethical issues less frequently discussed in the studies included in this scoping review. In contrast to the first two, the ethical issues of technical robustness and safety, transparency, and accountability “are those for which technical fixes can be or have already been developed” and “implemented in terms of technical solutions” [ 12 , p. 103]. The recognition of issues related to technical robustness and safety expresses explicit admissions of expert ignorance, error, or lack of control, which opens space for a politics of “optimization of algorithms” [ 114 , p. 17] while reinforcing “strategic ignorance” [ 114 , p. 89]. In the words of the sociologist Linsey McGoey, strategic ignorance refers to “any actions which mobilize, manufacture or exploit unknowns in a wider environment to avoid liability for earlier actions” [ 115 , p. 3].

According to the analysis of Jobin et al. [ 11 ] of the global landscape of existing ethics guidelines for AI, transparency comprising efforts to increase explainability, interpretability, or other acts of communication and disclosure is the most prevalent principle in the current literature. Transparency gains high relevance in ethics guidelines because this principle has become a pro-ethical condition “enabling or impairing other ethical practices or principles” [Turilli and Floridi 2009, [ 11 ], p. 14]. Our findings highlight transparency as a crucial ethical concern for explainability and disclosure. However, as emphasized by Ananny and Crawford [ 116 , p. 973], there are serious limitations to the transparency ideal in making black boxes visible (i.e., disclosing and explaining algorithms), since “being able to see a system is sometimes equated with being able to know how it works and governs it—a pattern that recurs in recent work about transparency and computational systems”. The emphasis on transparency mirrors Aradau and Blanke’s [ 114 ] observation that Big Tech firms are creating their version of transparency. They are prompting discussions about their data usage, whether it is for “explaining algorithms” or addressing bias and discrimination openly.

The framing of ethical issues related to accountability, as elucidated by the studies within this scoping review, manifests as a commitment to ethical conduct and the transparent allocation of responsibility and legal obligations in instances where the publics encounter algorithmic deficiencies, glitches, or other imperfections. Within this framework, accountability becomes intricately intertwined with the notion of distributed responsibility, as expounded upon in our examination of how the literature addresses challenges in privacy and data governance. Simultaneously, it converges with our discussion of algorithm optimization in relation to ethical concerns about technical robustness and safety, by which AI systems are portrayed as fallible yet eternally evolving towards optimization. As astutely observed by Aradau and Blanke [ 114 , p. 171], “forms of accountability through error enact algorithmic systems as fallible but ultimately correctable and therefore always desirable. Errors become temporary malfunctions, while the future of algorithms is that of indefinite optimization”.

5 Conclusion

This scoping review of how publics' views on ethical challenges of AI are framed, articulated, and concretely operationalized in the research sector shows that ethical issues and publics formation are closely entangled with symbolic and social orders, including political and economic agendas and visions. While Steinhoff [ 6 ] highlights the subordinated nature of AI ethics within an innovation network, drawing on insights from diverse sources beyond Big Tech, we assert that this network is dynamically evolving towards greater hybridity and boundary fusion. In this regard, we extend Steinhoff's argument by emphasizing the imperative for a more nuanced understanding of how this network operates within diverse contexts. Specifically, within the research sector, it operates through a convergence of boundaries, engaging human and nonhuman entities and various disciplines and stakeholders. Concurrently, the advocacy for diversity and inclusivity, along with the acknowledgement of errors and flaws, serves to bolster technical expertise and reaffirm the establishment of order and legitimacy in alignment with the institutional norms underpinning responsible research practices.

Our analysis underscores the growing importance of involving the publics in AI knowledge creation and innovation, both to secure public endorsement and as a tool for risk prevention and conflict mitigation. We observe two distinct approaches: one engaging nonscientific audiences and the other involving professional groups and AI developers, emphasizing the need for inclusivity while safeguarding expert knowledge. Human-centred approaches are gaining prominence, emphasizing the distinction and blending of human and nonhuman entities and embedding fundamental rights in technological systems. Privacy and data governance emerge as the second most prevalent ethical concern, shifting expertise away from traditional regulatory experts to privacy engineers and risk assessors. The distribution of responsibility for privacy and data governance is a recurring theme, especially in cases of ethical conflicts with AI deployment. However, there is a notable imbalance in attention, with less focus on diversity, nondiscrimination and fairness and on societal and environmental well-being than on human-centric AI and on privacy and data governance, which are managed through technical fixes. Last, acknowledging technical robustness and safety, transparency, and accountability as foundational ethics principles reveals an openness to expert limitations, allowing room for the politics of algorithm optimization, framing AI systems as correctable and perpetually evolving.

Data availability

This manuscript has data included as electronic supplementary material. The dataset constructed by the authors, resulting from a search of publications on PubMed® and Web of Science™ and analysed in the current study, is not publicly available, but it is available from the corresponding author on reasonable request.

In this article, we employ the term "publics" rather than the singular "public" to delineate our viewpoint concerning public participation in AI. This choice is meant to acknowledge that there are no uniform, monolithic viewpoints or interests. From our perspective, the term "publics" allows for a more nuanced understanding of the various groups, communities, and individuals who may have different attitudes, beliefs, and concerns regarding AI. This choice may differ from the terminology employed in the referenced literature.

The following examples are particularly illustrative of the multiplicity of organizations emphasizing the need for public participation in AI. The OECD Recommendation of the Council on AI specifically emphasizes the importance of empowering stakeholders, considering their engagement essential to the adoption of trustworthy AI [ 7 , p. 6]. The UNESCO Recommendation on the Ethics of AI emphasizes that public awareness and understanding of AI technologies should be promoted (recommendation 44), and it encourages governments and other stakeholders to involve the publics in AI decision-making processes (recommendation 47) [ 8 , p. 23]. The European Union (EU) White Paper on AI [ 9 , p. 259] outlines the EU's approach to AI, including the need for public consultation and engagement. The Ethics Guidelines for Trustworthy AI [ 10 , pp. 19, 239], developed by the High-Level Expert Group on AI (HLEG) appointed by the EC, emphasize the importance of public participation and consultation in the design, development, and deployment of AI systems.

“Responsible Innovation” (RI) and “Responsible Research and Innovation” (RRI) have emerged in parallel and are often used interchangeably, but they are not the same thing [ 27 , 28 ]. RRI is a policy-driven discourse that emerged from the EC in the early 2010s, while RI emerged largely from academic roots. For this paper, we will not consider the distinctive features of each discourse, but instead focus on the common features they share.

Cath, C., Wachter, S., Mittelstadt, B., Taddeo, M., Floridi, L.: Artificial intelligence and the ‘good society’: the US, EU, and UK approach. Sci. Eng. Ethics 24 , 505–528 (2017). https://doi.org/10.1007/s11948-017-9901-7


Cussins, J.N.: Decision points in AI governance. CLTC white paper series. Center for Long-term Cybersecurity. https://cltc.berkeley.edu/publication/decision-points-in-ai-governance/ (2020). Accessed 8 July 2023

Ulnicane, I., Okaibedi Eke, D., Knight, W., Ogoh, G., Stahl, B.: Good governance as a response to discontents? Déjà vu, or lessons for AI from other emerging technologies. Interdiscip. Sci. Rev. 46 (1–2), 71–93 (2021). https://doi.org/10.1080/03080188.2020.1840220

Ulnicane, I., Knight, W., Leach, T., Stahl, B., Wanjiku, W.: Framing governance for a contested emerging technology: insights from AI policy. Policy Soc. 40 (2), 158–177 (2021). https://doi.org/10.1080/14494035.2020.1855800

Wilson, C.: Public engagement and AI: a values analysis of national strategies. Gov. Inf. Q. 39 (1), 101652 (2022). https://doi.org/10.1016/j.giq.2021.101652

Steinhoff, J.: AI ethics as subordinated innovation network. AI Soc. (2023). https://doi.org/10.1007/s00146-023-01658-5

Organization for Economic Co-operation and Development. Recommendation of the Council on Artificial Intelligence. https://legalinstruments.oecd.org/en/instruments/oecd-legal-0449 (2019). Accessed 8 July 2023

United Nations Educational, Scientific and Cultural Organization. Recommendation on the Ethics of Artificial Intelligence. https://unesdoc.unesco.org/ark:/48223/pf0000381137 (2021). Accessed 28 June 2023

European Commission. On artificial intelligence – a European approach to excellence and trust. White paper. COM(2020) 65 final. https://commission.europa.eu/publications/white-paper-artificial-intelligence-european-approach-excellence-and-trust_en (2020). Accessed 28 June 2023

European Commission. The ethics guidelines for trustworthy AI. Directorate-General for Communications Networks, Content and Technology, EC Publications Office. https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai (2019). Accessed 10 July 2023

Jobin, A., Ienca, M., Vayena, E.: The global landscape of AI ethics guidelines. Nat. Mach. Intell. 1 , 389–399 (2019). https://doi.org/10.1038/s42256-019-0088-2

Hagendorff, T.: The ethics of AI ethics: an evaluation of guidelines. Minds Mach. 30 , 99–120 (2020). https://doi.org/10.1007/s11023-020-09517-8

Su, A.: The promise and perils of international human rights law for AI governance. Law Technol. Hum. 4 (2), 166–182 (2022). https://doi.org/10.5204/lthj.2332


Ulnicane, I.: Emerging technology for economic competitiveness or societal challenges? Framing purpose in artificial intelligence policy. GPPG. 2 , 326–345 (2022). https://doi.org/10.1007/s43508-022-00049-8

Sigfrids, A., Leikas, J., Salo-Pöntinen, H., Koskimies, E.: Human-centricity in AI governance: a systemic approach. Front Artif. Intell. 6 , 976887 (2023). https://doi.org/10.3389/frai.2023.976887

Benkler, Y.: Don’t let industry write the rules for AI. Nature 569 (7755), 161 (2019). https://doi.org/10.1038/d41586-019-01413-1

Phan, T., Goldenfein, J., Mann, M., Kuch, D.: Economies of virtue: the circulation of ‘ethics’ in Big Tech. Sci. Cult. 31 (1), 121–135 (2022). https://doi.org/10.1080/09505431.2021.1990875

Ochigame, R.: The invention of “ethical AI”: how big tech manipulates academia to avoid regulation. Intercept. https://theintercept.com/2019/12/20/mit-ethical-ai-artificial-intelligence/ (2019). Accessed 10 July 2023

Ferretti, T.: An institutionalist approach to AI ethics: justifying the priority of government regulation over self-regulation. MOPP 9 (2), 239–265 (2022). https://doi.org/10.1515/mopp-2020-0056

van Maanen, G.: AI ethics, ethics washing, and the need to politicize data ethics. DISO 1 (9), 1–23 (2022). https://doi.org/10.1007/s44206-022-00013-3

Gerdes, A.: The tech industry hijacking of the AI ethics research agenda and why we should reclaim it. Discov. Artif. Intell. 2 (25), 1–8 (2022). https://doi.org/10.1007/s44163-022-00043-3

Amariles, D.R., Baquero, P.M.: Promises and limits of law for a human-centric artificial intelligence. Comput. Law Secur. Rev. 48 (105795), 1–10 (2023). https://doi.org/10.1016/j.clsr.2023.105795

Mittelstadt, B.: Principles alone cannot guarantee ethical AI. Nat. Mach. Intell. 1 (11), 501–507 (2019). https://doi.org/10.1038/s42256-019-0114-4

Munn, L.: The uselessness of AI ethics. AI Ethics 3 , 869–877 (2022). https://doi.org/10.1007/s43681-022-00209-w

Heilinger, J.C.: The ethics of AI ethics. A constructive critique. Philos. Technol. 35 (61), 1–20 (2022). https://doi.org/10.1007/s13347-022-00557-9

Roche, C., Wall, P.J., Lewis, D.: Ethics and diversity in artificial intelligence policies, strategies and initiatives. AI Ethics (2022). https://doi.org/10.1007/s43681-022-00218-9

Diercks, G., Larsen, H., Steward, F.: Transformative innovation policy: addressing variety in an emerging policy paradigm. Res. Policy 48 (4), 880–894 (2019). https://doi.org/10.1016/j.respol.2018.10.028

Owen, R., Pansera, M.: Responsible innovation and responsible research and innovation. In: Dagmar, S., Kuhlmann, S., Stamm, J., Canzler, W. (eds.) Handbook on Science and Public Policy, pp. 26–48. Edward Elgar, Cheltenham (2019)


Macq, H., Tancoigne, E., Strasser, B.J.: From deliberation to production: public participation in science and technology policies of the European Commission (1998–2019). Minerva 58 (4), 489–512 (2020). https://doi.org/10.1007/s11024-020-09405-6

Cath, C.: Governing artificial intelligence: ethical, legal and technical opportunities and challenges. Philos. Trans. Royal Soc. A. 376 , 20180080 (2018). https://doi.org/10.1098/rsta.2018.0080

Wilson, C.: The socialization of civic participation norms in government?: Assessing the effect of the Open Government Partnership on countries’ e-participation. Gov. Inf. Q. 37 (4), 101476 (2020). https://doi.org/10.1016/j.giq.2020.101476

Hagendijk, R., Irwin, A.: Public deliberation and governance: engaging with science and technology in contemporary Europe. Minerva 44 (2), 167–184 (2006). https://doi.org/10.1007/s11024-006-0012-x

Weingart, P., Joubert, M., Connoway, K.: Public engagement with science - origins, motives and impact in academic literature and science policy. PLoS One 16 (7), e0254201 (2021). https://doi.org/10.1371/journal.pone.0254201

Wynne, B.: Public participation in science and technology: performing and obscuring a political–conceptual category mistake. East Asian Sci. 1 (1), 99–110 (2007). https://doi.org/10.1215/s12280-007-9004-7

Rommetveit, K., Van Dijk, N.: Privacy engineering and the techno-regulatory imaginary. Soc. Stud. Sci. 52 (6), 853–877 (2022). https://doi.org/10.1177/03063127221119424

Levac, D., Colquhoun, H., O’Brien, K.: Scoping studies: advancing the methodology. Implement. Sci. 5 (69), 1–9 (2010). https://doi.org/10.1186/1748-5908-5-69

Arksey, H., O’Malley, L.: Scoping studies: towards a methodological framework. Int. J. Soc. Res. Methodol. 8 (1), 19–32 (2005). https://doi.org/10.1080/1364557032000119616

Stemler, S.: An overview of content analysis. Pract. Asses. Res. Eval. 7 (17), 1–9 (2001). https://doi.org/10.7275/z6fm-2e34

European Commission. European Commission's ethics guidelines for trustworthy AI. https://digital-strategy.ec.europa.eu/en/library/ethics-guidelines-trustworthy-ai (2021). Accessed 8 July 2023

Awad, E., Dsouza, S., Kim, R., Schulz, J., Henrich, J., Shariff, A., et al.: The moral machine experiment. Nature 563 (7729), 59–64 (2018). https://doi.org/10.1038/s41586-018-0637-6

Liyanage, H., Liaw, S.T., Jonnagaddala, J., Schreiber, R., Kuziemsky, C., Terry, A.L., de Lusignan, S.: Artificial intelligence in primary health care: perceptions, issues, and challenges. Yearb. Med. Inform. 28 (1), 41–46 (2019). https://doi.org/10.1055/s-0039-1677901

Jenkins, S., Draper, H.: Care, monitoring, and companionship: views on care robots from older people and their carers. Int. J. Soc. Robot. 7 (5), 673–683 (2015). https://doi.org/10.1007/s12369-015-0322-y

Tzouganatou, A.: Openness and privacy in born-digital archives: reflecting the role of AI development. AI Soc. 37 (3), 991–999 (2022). https://doi.org/10.1007/s00146-021-01361-3

Liljamo, T., Liimatainen, H., Pollanen, M.: Attitudes and concerns on automated vehicles. Transp. Res. Part F Traffic Psychol. Behav. 59 , 24–44 (2018). https://doi.org/10.1016/j.trf.2018.08.010

Couture, V., Roy, M.C., Dez, E., Laperle, S., Belisle-Pipon, J.C.: Ethical implications of artificial intelligence in population health and the public’s role in its governance: perspectives from a citizen and expert panel. J. Med. Internet Res. 25 , e44357 (2023). https://doi.org/10.2196/44357

McCradden, M.D., Sarker, T., Paprica, P.A.: Conditionally positive: a qualitative study of public perceptions about using health data for artificial intelligence research. BMJ Open 10 (10), e039798 (2020). https://doi.org/10.1136/bmjopen-2020-039798

Blease, C., Kharko, A., Annoni, M., Gaab, J., Locher, C.: Machine learning in clinical psychology and psychotherapy education: a mixed methods pilot survey of postgraduate students at a Swiss University. Front. Public Health 9 (623088), 1–8 (2021). https://doi.org/10.3389/fpubh.2021.623088

Kieslich, K., Keller, B., Starke, C.: Artificial intelligence ethics by design. Evaluating public perception on the importance of ethical design principles of artificial intelligence. Big Data Soc. 9 (1), 1–15 (2022). https://doi.org/10.1177/20539517221092956

Willems, J., Schmidthuber, L., Vogel, D., Ebinger, F., Vanderelst, D.: Ethics of robotized public services: the role of robot design and its actions. Gov. Inf. Q. 39 (101683), 1–11 (2022). https://doi.org/10.1016/J.Giq.2022.101683

Tlili, A., Shehata, B., Adarkwah, M.A., Bozkurt, A., Hickey, D.T., Huang, R.H., Agyemang, B.: What if the devil is my guardian angel: ChatGPT as a case study of using chatbots in education. Smart Learn Environ. 10 (15), 1–24 (2023). https://doi.org/10.1186/S40561-023-00237-X

Ehret, S.: Public preferences for governing AI technology: comparative evidence. J. Eur. Public Policy 29 (11), 1779–1798 (2022). https://doi.org/10.1080/13501763.2022.2094988

Esmaeilzadeh, P.: Use of AI-based tools for healthcare purposes: a survey study from consumers’ perspectives. BMC Med. Inform. Decis. Mak. 20 (170), 1–19 (2020). https://doi.org/10.1186/s12911-020-01191-1

Laï, M.C., Brian, M., Mamzer, M.F.: Perceptions of artificial intelligence in healthcare: findings from a qualitative survey study among actors in France. J. Transl. Med. 18 (14), 1–13 (2020). https://doi.org/10.1186/S12967-019-02204-Y

Valles-Peris, N., Barat-Auleda, O., Domenech, M.: Robots in healthcare? What patients say. Int. J. Environ. Res. Public Health 18 (9933), 1–18 (2021). https://doi.org/10.3390/ijerph18189933

Hallowell, N., Badger, S., Sauerbrei, A., Nellaker, C., Kerasidou, A.: “I don’t think people are ready to trust these algorithms at face value”: trust and the use of machine learning algorithms in the diagnosis of rare disease. BMC Med. Ethics 23 (112), 1–14 (2022). https://doi.org/10.1186/s12910-022-00842-4

Criado, J.I., de Zarate-Alcarazo, L.O.: Technological frames, CIOs, and artificial intelligence in public administration: a socio-cognitive exploratory study in Spanish local governments. Gov. Inf. Q. 39 (3), 1–13 (2022). https://doi.org/10.1016/J.Giq.2022.101688

Isbanner, S., O’Shaughnessy, P.: The adoption of artificial intelligence in health care and social services in Australia: findings from a methodologically innovative national survey of values and attitudes (the AVA-AI Study). J. Med. Internet Res. 24 (8), e37611 (2022). https://doi.org/10.2196/37611

Kuberkar, S., Singhal, T.K., Singh, S.: Fate of AI for smart city services in India: a qualitative study. Int. J. Electron. Gov. Res. 18 (2), 1–21 (2022). https://doi.org/10.4018/Ijegr.298216

Kallioinen, N., Pershina, M., Zeiser, J., Nezami, F., Pipa, G., Stephan, A., Konig, P.: Moral judgements on the actions of self-driving cars and human drivers in dilemma situations from different perspectives. Front. Psychol. 10 (2415), 1–15 (2019). https://doi.org/10.3389/fpsyg.2019.02415

Vrščaj, D., Nyholm, S., Verbong, G.P.J.: Is tomorrow’s car appealing today? Ethical issues and user attitudes beyond automation. AI Soc. 35 (4), 1033–1046 (2020). https://doi.org/10.1007/s00146-020-00941-z

Bastian, M., Helberger, N., Makhortykh, M.: Safeguarding the journalistic DNA: attitudes towards the role of professional values in algorithmic news recommender designs. Digit. Journal. 9 (6), 835–863 (2021). https://doi.org/10.1080/21670811.2021.1912622

Kaur, K., Rampersad, G.: Trust in driverless cars: investigating key factors influencing the adoption of driverless cars. J. Eng. Technol. Manag. 48 , 87–96 (2018). https://doi.org/10.1016/j.jengtecman.2018.04.006

Willems, J., Schmid, M.J., Vanderelst, D., Vogel, D., Ebinger, F.: AI-driven public services and the privacy paradox: do citizens really care about their privacy? Public Manag. Rev. (2022). https://doi.org/10.1080/14719037.2022.2063934

Duke, S.A.: Deny, dismiss and downplay: developers’ attitudes towards risk and their role in risk creation in the field of healthcare-AI. Ethics Inf. Technol. 24 (1), 1–15 (2022). https://doi.org/10.1007/s10676-022-09627-0

Cresswell, K., Cunningham-Burley, S., Sheikh, A.: Health care robotics: qualitative exploration of key challenges and future directions. J. Med. Internet Res. 20 (7), e10410 (2018). https://doi.org/10.2196/10410

Amann, J., Vayena, E., Ormond, K.E., Frey, D., Madai, V.I., Blasimme, A.: Expectations and attitudes towards medical artificial intelligence: a qualitative study in the field of stroke. PLoS One 18 (1), e0279088 (2023). https://doi.org/10.1371/journal.pone.0279088

Aquino, Y.S.J., Rogers, W.A., Braunack-Mayer, A., Frazer, H., Win, K.T., Houssami, N., et al.: Utopia versus dystopia: professional perspectives on the impact of healthcare artificial intelligence on clinical roles and skills. Int. J. Med. Inform. 169 (104903), 1–10 (2023). https://doi.org/10.1016/j.ijmedinf.2022.104903

Sartori, L., Bocca, G.: Minding the gap(s): public perceptions of AI and socio-technical imaginaries. AI Soc. 38 (2), 443–458 (2022). https://doi.org/10.1007/s00146-022-01422-1

Chen, Y.-N.K., Wen, C.-H.R.: Impacts of attitudes toward government and corporations on public trust in artificial intelligence. Commun. Stud. 72 (1), 115–131 (2021). https://doi.org/10.1080/10510974.2020.1807380

Aitken, M., Ng, M., Horsfall, D., Coopamootoo, K.P.L., van Moorsel, A., Elliott, K.: In pursuit of socially-minded data-intensive innovation in banking: a focus group study of public expectations of digital innovation in banking. Technol. Soc. 66 (101666), 1–10 (2021). https://doi.org/10.1016/j.techsoc.2021.101666

Choung, H., David, P., Ross, A.: Trust and ethics in AI. AI Soc. 38 (2), 733–745 (2023). https://doi.org/10.1007/s00146-022-01473-4

Hartwig, T., Ikkatai, Y., Takanashi, N., Yokoyama, H.M.: Artificial intelligence ELSI score for science and technology: a comparison between Japan and the US. AI Soc. 38 (4), 1609–1626 (2023). https://doi.org/10.1007/s00146-021-01323-9

Ploug, T., Sundby, A., Moeslund, T.B., Holm, S.: Population preferences for performance and explainability of artificial intelligence in health care: choice-based conjoint survey. J. Med. Internet Res. 23 (12), e26611 (2021). https://doi.org/10.2196/26611

Zheng, B., Wu, M.N., Zhu, S.J., Zhou, H.X., Hao, X.L., Fei, F.Q., et al.: Attitudes of medical workers in China toward artificial intelligence in ophthalmology: a comparative survey. BMC Health Serv. Res. 21 (1067), 1–13 (2021). https://doi.org/10.1186/S12913-021-07044-5

Ma, J., Tojib, D., Tsarenko, Y.: Sex robots: are we ready for them? An exploration of the psychological mechanisms underlying people’s receptiveness of sex robots. J. Bus. Ethics 178 (4), 1091–1107 (2022). https://doi.org/10.1007/s10551-022-05059-4

Rhim, J., Lee, G.B., Lee, J.H.: Human moral reasoning types in autonomous vehicle moral dilemma: a cross-cultural comparison of Korea and Canada. Comput. Hum. Behav. 102 , 39–56 (2020). https://doi.org/10.1016/j.chb.2019.08.010

Dempsey, R.P., Brunet, J.R., Dubljevic, V.: Exploring and understanding law enforcement’s relationship with technology: a qualitative interview study of police officers in North Carolina. Appl. Sci-Basel 13 (6), 1–17 (2023). https://doi.org/10.3390/App13063887

Lee, C.H., Gobir, N., Gurn, A., Soep, E.: In the black mirror: youth investigations into artificial intelligence. ACM Trans. Comput. Educ. 22 (3), 1–25 (2022). https://doi.org/10.1145/3484495

Kong, S.C., Cheung, W.M.Y., Zhang, G.: Evaluating an artificial intelligence literacy programme for developing university students’ conceptual understanding, literacy, empowerment and ethical awareness. Educ. Technol. Soc. 26 (1), 16–30 (2023). https://doi.org/10.30191/Ets.202301_26(1).0002

Street, J., Barrie, H., Eliott, J., Carolan, L., McCorry, F., Cebulla, A., et al.: Older adults’ perspectives of smart technologies to support aging at home: insights from five world cafe forums. Int. J. Environ. Res. Public Health 19 (7817), 1–22 (2022). https://doi.org/10.3390/Ijerph19137817

Ikkatai, Y., Hartwig, T., Takanashi, N., Yokoyama, H.M.: Octagon measurement: public attitudes toward AI ethics. Int J Hum-Comput Int. 38 (17), 1589–1606 (2022). https://doi.org/10.1080/10447318.2021.2009669

Wang, S., Bolling, K., Mao, W., Reichstadt, J., Jeste, D., Kim, H.C., Nebeker, C.: Technology to support aging in place: older adults’ perspectives. Healthcare (Basel) 7 (60), 1–18 (2019). https://doi.org/10.3390/healthcare7020060

Zhang, H., Lee, I., Ali, S., DiPaola, D., Cheng, Y.H., Breazeal, C.: Integrating ethics and career futures with technical learning to promote AI literacy for middle school students: an exploratory study. Int. J. Artif. Intell. Educ. 33 , 290–324 (2022). https://doi.org/10.1007/s40593-022-00293-3

Henriksen, A., Blond, L.: Executive-centered AI? Designing predictive systems for the public sector. Soc. Stud. Sci. (2023). https://doi.org/10.1177/03063127231163756

Nichol, A.A., Halley, M.C., Federico, C.A., Cho, M.K., Sankar, P.L.: Not in my AI: moral engagement and disengagement in health care AI development. Pac. Symp. Biocomput. 28 , 496–506 (2023)

Aquino, Y.S.J., Carter, S.M., Houssami, N., Braunack-Mayer, A., Win, K.T., Degeling, C., et al.: Practical, epistemic and normative implications of algorithmic bias in healthcare artificial intelligence: a qualitative study of multidisciplinary expert perspectives. J. Med. Ethics (2023). https://doi.org/10.1136/jme-2022-108850

Nichol, A.A., Bendavid, E., Mutenherwa, F., Patel, C., Cho, M.K.: Diverse experts’ perspectives on ethical issues of using machine learning to predict HIV/AIDS risk in sub-Saharan Africa: a modified Delphi study. BMJ Open 11 (7), e052287 (2021). https://doi.org/10.1136/bmjopen-2021-052287

Awad, E., Levine, S., Kleiman-Weiner, M., Dsouza, S., Tenenbaum, J.B., Shariff, A., et al.: Drivers are blamed more than their automated cars when both make mistakes. Nat. Hum. Behav. 4 (2), 134–143 (2020). https://doi.org/10.1038/s41562-019-0762-8

Blease, C., Kaptchuk, T.J., Bernstein, M.H., Mandl, K.D., Halamka, J.D., DesRoches, C.M.: Artificial intelligence and the future of primary care: exploratory qualitative study of UK general practitioners’ views. J. Med. Internet Res. 21 (3), e12802 (2019). https://doi.org/10.2196/12802

Blease, C., Locher, C., Leon-Carlyle, M., Doraiswamy, M.: Artificial intelligence and the future of psychiatry: qualitative findings from a global physician survey. Digit. Health 6 , 1–18 (2020). https://doi.org/10.1177/2055207620968355

De Graaf, M.M.A., Hindriks, F.A., Hindriks, K.V.: Who wants to grant robots rights? Front Robot AI 8 , 781985 (2022). https://doi.org/10.3389/frobt.2021.781985

Guerouaou, N., Vaiva, G., Aucouturier, J.-J.: The shallow of your smile: the ethics of expressive vocal deep-fakes. Philos. Trans. R Soc. B Biol. Sci. 377 (1841), 1–11 (2022). https://doi.org/10.1098/rstb.2021.0083

McCradden, M.D., Baba, A., Saha, A., Ahmad, S., Boparai, K., Fadaiefard, P., Cusimano, M.D.: Ethical concerns around use of artificial intelligence in health care research from the perspective of patients with meningioma, caregivers and health care providers: a qualitative study. CMAJ Open 8 (1), E90–E95 (2020). https://doi.org/10.9778/cmajo.20190151

Rogers, W.A., Draper, H., Carter, S.M.: Evaluation of artificial intelligence clinical applications: Detailed case analyses show value of healthcare ethics approach in identifying patient care issues. Bioethics 36 (4), 624–633 (2021). https://doi.org/10.1111/bioe.12885

Tosoni, S., Voruganti, I., Lajkosz, K., Habal, F., Murphy, P., Wong, R.K.S., et al.: The use of personal health information outside the circle of care: consent preferences of patients from an academic health care institution. BMC Med. Ethics 22 (29), 1–14 (2021). https://doi.org/10.1186/S12910-021-00598-3

Allahabadi, H., Amann, J., Balot, I., Beretta, A., Binkley, C., Bozenhard, J., et al.: Assessing trustworthy AI in times of COVID-19: deep learning for predicting a multiregional score conveying the degree of lung compromise in COVID-19 patients. IEEE Trans. Technol. Soc. 3 (4), 272–289 (2022). https://doi.org/10.1109/TTS.2022.3195114

Gray, K., Slavotinek, J., Dimaguila, G.L., Choo, D.: Artificial intelligence education for the health workforce: expert survey of approaches and needs. JMIR Med. Educ. 8 (2), e35223 (2022). https://doi.org/10.2196/35223

Alfrink, K., Keller, I., Doorn, N., Kortuem, G.: Tensions in transparent urban AI: designing a smart electric vehicle charge point. AI Soc. 38 (3), 1049–1065 (2022). https://doi.org/10.1007/s00146-022-01436-9

Bourla, A., Ferreri, F., Ogorzelec, L., Peretti, C.S., Guinchard, C., Mouchabac, S.: Psychiatrists’ attitudes toward disruptive new technologies: mixed-methods study. JMIR Ment. Health 5 (4), e10240 (2018). https://doi.org/10.2196/10240

Kopecky, R., Kosova, M.J., Novotny, D.D., Flegr, J., Cerny, D.: How virtue signalling makes us better: moral preferences with respect to autonomous vehicle type choices. AI Soc. 38 , 937–946 (2022). https://doi.org/10.1007/s00146-022-01461-8

Lam, K., Abramoff, M.D., Balibrea, J.M., Bishop, S.M., Brady, R.R., Callcut, R.A., et al.: A Delphi consensus statement for digital surgery. NPJ Digit. Med. 5 (100), 1–9 (2022). https://doi.org/10.1038/s41746-022-00641-6

Karaca, O., Çalışkan, S.A., Demir, K.: Medical artificial intelligence readiness scale for medical students (MAIRS-MS) – development, validity and reliability study. BMC Med. Educ. 21 (112), 1–9 (2021). https://doi.org/10.1186/s12909-021-02546-6

Papyshev, G., Yarime, M.: The limitation of ethics-based approaches to regulating artificial intelligence: regulatory gifting in the context of Russia. AI Soc. (2022). https://doi.org/10.1007/s00146-022-01611-y

Balaram, B., Greenham, T., Leonard, J.: Artificial intelligence: real public engagement. RSA, London. https://www.thersa.org/globalassets/pdfs/reports/rsa_artificial-intelligence---real-public-engagement.pdf (2018). Accessed 28 June 2023

Hagendorff, T.: A virtue-based framework to support putting AI ethics into practice. Philos Technol. 35 (55), 1–24 (2022). https://doi.org/10.1007/s13347-022-00553-z

Felt, U., Wynne, B., Callon, M., Gonçalves, M.E., Jasanoff, S., Jepsen, M., et al.: Taking European Knowledge Society Seriously. European Commission, Brussels, 1–89 (2007). https://op.europa.eu/en/publication-detail/-/publication/5d0e77c7-2948-4ef5-aec7-bd18efe3c442/language-en

Michael, M.: Publics performing publics: of PiGs, PiPs and politics. Public Underst. Sci. 18 (5), 617–631 (2009). https://doi.org/10.1177/09636625080985

Hu, L.: Tech ethics: speaking ethics to power, or power speaking ethics? J. Soc. Comput. 2 (3), 238–248 (2021). https://doi.org/10.23919/JSC.2021.0033

Strasser, B., Baudry, J., Mahr, D., Sanchez, G., Tancoigne, E.: “Citizen science”? Rethinking science and public participation. Sci. Technol. Stud. 32 (2), 52–76 (2019). https://doi.org/10.23987/sts.60425

De Saille, S.: Innovating innovation policy: the emergence of ‘Responsible Research and Innovation.’ J. Responsible Innov. 2 (2), 152–168 (2015). https://doi.org/10.1080/23299460.2015.1045280

Schwarz-Plaschg, C.: Nanotechnology is like… The rhetorical roles of analogies in public engagement. Public Underst. Sci. 27 (2), 153–167 (2018). https://doi.org/10.1177/0963662516655686

Taylor, R.R., O’Dell, B., Murphy, J.W.: Human-centric AI: philosophical and community-centric considerations. AI Soc. (2023). https://doi.org/10.1007/s00146-023-01694-1

van Dijk, N., Tanas, A., Rommetveit, K., Raab, C.: Right engineering? The redesign of privacy and personal data protection. Int. Rev. Law Comput. Technol. 32 (2–3), 230–256 (2018). https://doi.org/10.1080/13600869.2018.1457002

Aradau, C., Blanke, T.: Algorithmic reason. The new government of self and others. Oxford University Press, Oxford (2022)


McGoey, L.: The unknowers. How strategic ignorance rules the world. Zed, London (2019)

Ananny, M., Crawford, K.: Seeing without knowing: limitations of the transparency ideal and its application to algorithmic accountability. New Media Soc. 20 (3), 973–989 (2018). https://doi.org/10.1177/1461444816676645


Acknowledgements

The authors would like to express their gratitude to Rafaela Granja (CECS, University of Minho) for her insightful support at an early stage of the preparation of this manuscript, and to the AIDA research network for the inspiring debates.

Open access funding provided by FCT|FCCN (b-on). Helena Machado and Susana Silva did not receive funding to assist in the preparation of this work. Laura Neiva received funding from FCT—Fundação para a Ciência e a Tecnologia, I.P., under a PhD Research Studentship (ref. 2020.04764.BD), and under the projects UIDB/00736/2020 (base funding) and UIDP/00736/2020 (programmatic funding).

Author information

Authors and affiliations.

Department of Sociology, Institute for Social Sciences, University of Minho, Braga, Portugal

Helena Machado

Department of Sociology and Centre for Research in Anthropology (CRIA), Institute for Social Sciences, University of Minho, Braga, Portugal

Susana Silva

Institute for Social Sciences, Communication and Society Research Centre (CECS), University of Minho, Braga, Portugal

Laura Neiva


Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by HM, SS, and LN. The first draft of the manuscript was written by HM and SS. All authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Helena Machado .

Ethics declarations

Conflict of interest.

The authors have no relevant financial or non-financial interests to disclose and no competing interests relevant to the content of this article.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (DOCX 20 KB)

Rights and permissions.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Machado, H., Silva, S. & Neiva, L. Publics’ views on ethical challenges of artificial intelligence: a scoping review. AI Ethics (2023). https://doi.org/10.1007/s43681-023-00387-1


Received : 08 October 2023

Accepted : 16 November 2023

Published : 19 December 2023

DOI : https://doi.org/10.1007/s43681-023-00387-1


  • Artificial intelligence
  • Public involvement
  • Publics’ views
  • Responsible research


Covering books and digital resources across all fields of history


ISSN 1749-8155

Black Africans in Renaissance Europe

comparison and literature review

This book might come as a surprise for non-specialists, since black Africans are identified with the slave trade to the Americas, while the Renaissance is regarded as a purely European phenomenon, centred on a largely homogeneous ethnicity. Neither of these assertions is true, and this excellent book helps to deconstruct such historical stereotypes. Europe received black Africans regularly and in significant numbers from the mid-fifteenth century onwards. The Mediterranean was a cross-cultural and inter-ethnic space even before Classical Greece. The Renaissance reflected not only the rediscovery of classical culture, but also the influx of techniques and ideas brought by the Arabs. Intercontinental navigation revealed simultaneous processes of cultural renovation, which helped to reshape Europe.

At the outset of the volume, Kate Lowe defines the editors’ key question: how were the main stereotypes concerning black people established in this period? She provides several examples relating to the main set of prejudices: the African was generally identified as a naked person who would mutilate his/her face and body with scarification, piercings, and tattoos; he/she would be considered as carefree and characterized by immoderate laughter, unaware of his/her condition, lazy and sexually promiscuous, physically strong, a good musician or dancer. Lowe recognizes the existence of noble or ennobled black men in European courts, but she stresses the role of black people as a necessary counter-image in the construction of European whiteness and ‘civilization’ (a notion coined in the eighteenth century). This is a necessary starting point, although some of the chapters develop a more nuanced vision of race relations in this period.

Anne Marie Jordan, for instance, has a fine chapter on slaves in the Lisbon court of Queen Catherine of Austria, where mainly women and children of different ethnic origins were used as musicians, cooks, pastry chefs, housekeepers, pages, or servants in royal apothecaries, kitchens, gardens, and stables. Jordan points out how white Moorish slaves were favoured because of skin colour prejudices, but black slaves were considered trustworthy for religious reasons. The black slaves were a sign of social prestige and distinction in a cosmopolitan court: this feature explains why Catherine spent so much money clothing and offering them as exotic gifts to her favourite ladies and relatives in other European courts. The representation of small black slaves in the portraits of Iberian princesses, as in the painting of Juana de Austria by Cristóvão de Morais, reinforced their image as symbols of empire building.

Jorge Fonseca presents the results of his research on sixteenth-century Southern Portugal, where he estimates that blacks made up six to seven per cent of the population, mainly in urban areas, in contrast with the Northern region, where blacks were scarce. His analysis of the perceptions of black people by Nicholas Cleynaerts, a Flemish scholar who taught in Louvain, Paris, and Salamanca, spending several years in Portugal as tutor of the infante Henry (the future cardinal and Inquisitor General), is less convincing. The scholar is presented as an ‘exotic visitor’, which is misplaced, since he belonged to the international Renaissance elite who circulated between different European countries. Cleynaerts bought young slaves and taught them as assistants. His observation that they were like ‘monkeys’ (meaning capable of imitating but not of creating) is considered by Fonseca as a sign of the contrast between two societies, the Flemish and the Portuguese, the first unaware of black people, the second used to them. Whether Cleynaerts’ classification of the young slaves as ‘monkeys’ was his own, and not influenced by the Portuguese, is open to dispute, and the implicit assumption that the Portuguese were less ‘racist’ than the Flemish is questionable.

Didier Lahon proposes an interesting analysis of the mixed confraternity of Nossa Senhora do Rosário in Lisbon, which split into two branches of white and black members. The conflict between the branches, which lasted more than a century, and the final victory of the white branch in 1646 are interpreted as a shift from a relatively tolerant society, open to manumission (one of the privileges of the confraternity) and to intervention against the bad treatment of slaves, to a more rigid and intolerant society in the seventeenth century. The implementation of the obligatory baptism of slaves throughout the second half of the sixteenth century is also reconstructed in detail. The analysis of the impact of the notion of blood purity in Portugal is much less convincing, with a deficient chronology and huge gaps, while comprehensive studies are ignored. The idea that the Iberian Peninsula dealt with the presence of Moors, Jews, and New Christians as an anomaly from 1350 onwards is simply wrong, as Maria José Ferro Tavares and Maria Filomena Lopes de Barros have demonstrated.

Thomas Earle focuses his study on the work of Afonso Álvares, a mulatto poet and playwright, cautiously alerting the reader to the lack of evidence proving that the poet and the playwright were one and the same person. Álvares is one of the few mixed-race intellectuals in sixteenth-century Europe. He wrote satirical poems and four plays based on saints’ lives, commissioned by the Augustinian canons of São Vicente de Fora in Lisbon. Earle discusses the quality of the plays and convincingly rejects the historical devaluation of the writer, who has been seen as a minor disciple of Gil Vicente. A particularly interesting section concerns the polemic in satirical redondilhas between Afonso Álvares and another poet, António Ribeiro Chiado. Álvares accused Chiado of low birth and immorality. Chiado insulted Álvares in racist terms, accusing him of being a mulatto, son of a black woman, a slave freed by marriage. Álvares underlined the nobility of his father, whose identity was never disclosed; it might have been Dom Afonso de Portugal, bishop of Évora, in whose household Álvares was educated. In his plays, Álvares reflects the dominant anti-Semitic mood. There is sufficient material here for a deeper reflection on the racial prejudices of Portuguese Renaissance society and on the conflicting mechanisms of social promotion among subaltern groups.

Jeremy Lawrence presents a very good overview of Black Africans in Spanish literature, identifying the main ideas: the dehumanization of slaves as chattels, defined by bestiality, nakedness, lascivious vulgarity, burlesque behaviour, and pidgin language. He focuses his study on the ‘habla de negros’, enlarging the already significant bibliography on the subject (the crucial study by Paul Teyssier on Gil Vicente could have been mentioned). The author selects lesser-known texts and provides two excellent critical editions of pliegos in the appendix. The originality and subversive meaning of the poems are brought out clearly in this chapter, since they staged strong black characters with unconventional relations with white women. Baltasar Fra-Molinero is another author who has written extensively on blacks in Spanish literature and has contributed to changing the field. He has shown how this marginal and neglected topic played an important role in the sixteenth and seventeenth centuries. Here he concentrates on Juan Latino, the only black Latinist, scholar, and writer in the European Renaissance, who lived in Granada. He has previously pointed out how Juan Latino reflected on the black condition and rejected a social hierarchy based on skin colour prejudices. Fra-Molinero now analyses the poem Austrias Carmen, dedicated by Juan Latino to Juan de Austria after his victory over the Morisco insurrection in Granada, known as the War of the Alpujarras (1568–1572). In the text, Latino sought to establish the dignity of all black Africans, relating them to biblical Ethiopia and rejecting the idea of natural slavery. He imagines white people subordinated in Ethiopia (a reversed irony) and exalts blackness in the final verses.

Debra Blumenthal addresses a very interesting issue: the role of a black African confraternity in Valencia, founded in 1472 by forty black freedmen, which collected alms and negotiated contracts of manumission on behalf of their fellows in captivity. She knows well the context of the slave trade in Valencia, the variety of the black community in the town, and the functions of the confraternity (‘casa dels negres’) as shelter, hospice, and hospital. She analyses two cases of manumission, concerning Ursola and Johana, examining all the financial, juridical, and social difficulties as well as the subsequent barriers to full integration.

Aurelia Martín Casares, who has written a very good book on slavery in Granada, presents here part of her enquiry into free and freed black Africans in the region. She has identified most of their occupations: men were stable workers, esparto workers, smelters and casters in foundries, carriers and vendors of water or firewood, bakers, butchers, hod carriers, builders, diggers, and pavers; women were housewives, farmers, embroiderers, maids, tavern and inn employees, and sorceresses. The author details the confraternities created by blacks and mulattos in Granada. The notion of blackness and the different types of black people do not become clear in this article, however, since in many cases Moriscos were considered black by the Christian population.

The ‘Italian section’ is one of the most interesting in the book. Paul Kaplan argues that Isabella d’Este and Andrea Mantegna created a new iconographic type: the black attendant to a white European protagonist. In his opinion, Judith’s servant was depicted as black for the first time by Mantegna in a drawing from 1492. As the author points out, this idea of displaying black servants to suggest the universal reach of imperial power had already been used by Frederick II. Kaplan stresses the diffusion of this idea among European rulers, namely the Aragonese kings of Naples and the ruling houses of Ferrara, Mantua, and Milan, in which black servants were used as human accessories and depicted as such. The only problem in this stimulating chapter is its uniform definition of ‘blackness’: in several paintings (see, for example, the Allegory of Virtue by Correggio) there is a gradation of skin colour from black to brown.

John Brackett suggests that Alessandro de’ Medici, the first duke of Florence (1529–1537), was of mixed race, an illegitimate son of Lorenzo de’ Medici, duke of Urbino (and a direct descendant of Lorenzo ‘il Magnifico’ and Cosimo ‘il Vecchio’), and a peasant woman, a freed slave, generally considered a ‘Moor’ but now depicted as a black African. The argument is based not on new documents but on the analysis of the set of images of Alessandro de’ Medici. The problem lies in the final conclusion: the author considers that there was no intellectual racism in the sixteenth century, since the duke was murdered under the accusation of being a tyrant, while his racial status was not used in political debate or in the denigration of his memory, which in his view proves the supremacy of the innate quality of princes. This is an open issue: as the author mentions in his text, the duke was nicknamed ‘the Moor’ and ‘the mule’ of the Medici in his lifetime, which suggests a more complicated picture.

Sergio Tognetti concentrates on the trade in black African slaves in fifteenth-century Florence. The percentage of East European slaves in North Italian cities was quite significant by the end of the fourteenth century, mainly in Genoa (nearly ten per cent) owing to the Genoese trading communities in the Black Sea, but the fall of Constantinople in 1453 ended this commercial exchange. The trade in black African slaves spread throughout the fifteenth century, replacing the previous trade. Networks also changed, from Arab merchants to Portuguese ones. This careful research, based on the account books of the Cambini bank, shows the value of slaves (proving also how whiter skin was more appreciated than darker skin) and the overwhelming control of the market from Lisbon, confirming the role of Bartolomeo Marchionni as the biggest slave trader of the day.

The pastoral care of black Africans in Renaissance Italy is the subject of Nelson Minnich’s chapter. The zigzag policies of the Popes from Martin V to Paul III are well documented, with successive bulls prohibiting the African slave trade (1425) and black slavery (1462), then allowing trade in captive people (1455, 1456, 1493), and finally condemning the enslavement of native American people (1537), while the citizens of Rome were authorized to hold slaves (1548). The creation of black confraternities in Naples, Palermo, and Messina was a result of the activity of different religious orders among slaves and freedmen. The access of black people (slaves and freedmen) to the sacraments of penance, communion, and marriage is well documented, while the ordination of black priests was very rare: one Ethiopian and one Congolese bishop, suggested by the Portuguese king in 1513, were exceptional cases.

Anu Korhonen addresses the crucial proverb ‘washing the Ethiopian white’ in Renaissance England, where it became a metaphor for everything considered useless, irrational, and impossible. The proverb was widespread in England, although references to black people in English literature, while relatively frequent, were brief and stereotyped. Africans were explicitly related to apes and defined by unruly sexuality, a lack of reason, violence, and ugliness (English is the only language in which the same term, fair, is used for beauty and blondness). Although Korhonen quotes an impressive range of sources, some of them from a very early period, it would have been interesting to establish the turning point in the process of construction and diffusion of the stereotype.

Lorenz Seelig studies the fascinating case of the ‘Moor’s Head’ produced circa 1600 by the Nuremberg goldsmith Christoph Jamnitzer. It shows the features of a young African with full lips, broad nose, and curled hair, with a headband chased with eight ‘T’s. It is a heraldic work of art representing the armorial bearings of the Florentine Pucci family, coupled with the coat of arms of the Florentine Strozzi family. This splendid object, made of silver and rock crystal, is also a drinking vessel: the upper part of the head can be taken off, like a cover. Seelig relates the object to the German tradition of drinking vessels, the double sense of the word kopf, and the practice of drinking from human skulls (relics of saints), which is documented until the late eighteenth century. He points out that, outside the ecclesiastical sphere, profane drinking vessels were considered signs of moral decadence, as in the tradition of fools’ head cups. Cups, jugs, or oil lamps were represented as black Africans (Seelig indicates an early example from the workshop of Andrea Ricci, circa 1500, with deformed face, open mouth, and protruding jaw to hold the wick). On the other hand, Seelig points to the statues and cameos of the black Venus and black Diana, and the dignified sculptures of black prisoners and ambassadors (namely by Pietro Tacca, Pietro Francavilla, Francesco Bordoni, Nicolas Cordier, and Francesco Caporale), relating to a notion of a rich Africa which contradicts the ideas of savagery and poverty. The only slippery moment in the article comes when Seelig points out a contradiction between the role of Roberto Pucci as commander of the order of Santo Stefano, responsible for chasing African pirates, and the attractive representation of the African head in his coat of arms. This is exactly the origin of the fashionable heraldry of African heads in many medieval coats of arms in Europe, following the crusades and the naval conflicts in the Mediterranean.

Jean-Michel Massing writes a fascinating article on the representation of lip-plated Africans in Pierre Desceliers’ world map of 1550. In his typical manner of detective research (perhaps inspired by the paradigma indiziario founded by Giovanni Morelli), Massing shows the crucial meaning of two figures of black men with enlarged lips, placed in central Africa, sitting opposite each other, probably bartering a gold nugget for a flowery plant. He reconstructs the first accounts of the enlarged lips found in different parts of Africa, namely by Isidore of Seville, Rabanus Maurus, Vincent of Beauvais, and Alvise da Mosto. He traces the original image of the bartering scene, a woodcut from a Strasbourg edition of Ptolemy’s Geography published in 1522. He rightly interprets the scene as an expression of the notion that ‘such people’ have no idea of the true value of things. But it is at the beginning of the article, when Massing defines the circle of cartographers in Dieppe and the powerful ship-owners like Jean Ango, who created huge friezes in his house and his chapel representing peoples of different continents, that the most interesting hypothesis of the book is produced. Massing maintains that Northern Europeans recorded in their drawings the features and material culture of other peoples of the world (Africans, Indians, or Americans) with greater care than the Southern Europeans, namely the Italians, who were looking for aesthetic solutions and became relatively blind to the rich variety of non-European people. This hypothesis requires further enquiry, but it raises a very interesting issue, related to the idea of the art of describing studied by Svetlana Alpers for a later period, in seventeenth-century Dutch art.

The only problem with this book is the unbalanced space dedicated to Southern and Northern Europe. We have thirteen chapters concerning Southern Europe (Portugal, Spain, and Italy), and three about the rest of Europe (England, France, and Germany). We already have a significant number of books and doctoral theses on black slaves and freedmen in Portugal and Spain (Saunders, Tinhorão, Lahon, Fonseca, Stella, Martín). We need more information on Northern Europe to understand how black Africans circulated in this area and how stereotypes developed. This would enable us to better answer the following questions: why was the theory of races born in Northern Europe between the 1730s and the 1850s (Linnæus, Camper, Cuvier, Gobineau)? What were the precedents of that theory, not only from a colonial point of view, but also through an internal European dynamic of contact with African people?

But we have to be fair to the editors of the volume: the books published on black Africans in Portugal and Spain have not been translated into English, and some of their main authors were invited to participate here. The final result is a truly excellent, well-illustrated set of chapters, which raise new issues and provide much information and analysis.

Libraries are full of books about great cats. This one is special.

Caleb Carr’s memoir, “My Beloved Monster,” is a heart-rending tale of human-feline connection.

Over the years, my wife and I have been blessed with 15 cats, three rescued from the streets of Brooklyn, three from barns near our home in Vermont, one from a Canadian resort and the others from the nearby shelter, where my wife has volunteered as a “cat whisperer” for the most emotionally scarred of its feline inhabitants for years. Twelve of our beloved pets have died (usually in our arms), and we could lose any of our current three cats — whose combined age is roughly 52 — any day now. So, I am either the best person to offer an opinion on Caleb Carr’s memoir, “ My Beloved Monster ,” or the worst.

For the many who have read Carr’s 1994 novel, “The Alienist,” an atmospheric crime story set in 19th-century New York, or watched the Netflix series it inspired, Carr’s new book might come as something of a surprise. “My Beloved Monster” is a warm, wrenching love story about Carr and his cat, a half-wild rescue named Masha who, according to the subtitle of his book, in fact rescued Carr. The author is, by his own admission, a curmudgeon, scarred by childhood abuse, living alone and watching his health and his career go the way of all flesh.

What makes the book so moving is that it is not merely the saga of a great cat. Libraries are filled with books like that, some better than others. It’s the 17-year chronicle of Carr and Masha aging together, and the bond they forged in decline. (As Philip Roth observed, “Old age isn’t a battle; old age is a massacre.”) He chronicles their lives, beginning with the moment the animal shelter begs Carr to bring the young lioness home because the creature is so ferocious she unnerves the staff — “You have to take that cat!” one implores.

Interspersed throughout Carr’s account of his years with Masha are his recollections of all the other cats he has had in his life, going back to his youth in Manhattan. And there are a lot. Cats often provided him comfort after yet another torment his father, the writer Lucien Carr , and stepfather visited upon him. Moreover, Carr identifies so deeply with the species that as a small child he drew a self-portrait of a boy with a cat’s head. He knows a great deal about cats and is eager to share his knowledge, for instance about the Jacobson’s organ in the roof of their mouths that helps them decide if another creature is predator or prey. His observations are always astute: “Dogs tend to trust blindly, unless and until abuse teaches them discretion. … Cats, conversely, trust conditionally from the start.”

Carr, now 68, was a much younger man when he adopted Masha. Soon, however, they were joined at the hip. As the two of them bonded, the writer found himself marveling at what he believed were their shared childhood traumas, which move between horrifying and, in Carr’s hands, morbidly hilarious: “I began to accept my father’s behavior in the spirit with which he intended it … he was trying to kill me.” Man and cat shared the same physical ailments, including arthritis and neuropathy, possibly caused by physical violence in both cases. Carr allowed Masha, a Siberian forest cat, to go outside, a decision many cat owners may decry, but he defends it: “Masha was an entirely different kind of feline,” and keeping her inside “would have killed her just as certainly as any bear or dog.” Indeed, Masha took on fishers and bears (yes, bears!) on Carr’s wooded property in Upstate New York.

But bears and dogs are humdrum fare compared with cancer and old age, which come for both the novelist and his cat. Carr’s diagnosis came first, and his first concern was whether he would outlive Masha. (The existence of the book gives us the answer he didn’t have at the time.) Illness adds new intensity to the human-feline connection: “Coming back from a hospital or a medical facility to Masha was always particularly heartening,” Carr writes, “not just because she’d been worried and was glad to see me, but because she seemed to know exactly what had been going on … and also because she was so anxious to show that she hadn’t been scared, that she’d held the fort bravely.”

Sometimes, perhaps, Carr anthropomorphizes too much and exaggerates Masha’s language comprehension, or gives her more human emotion than she had. But maybe not. Heaven knows, I see a lot behind my own cats’ eyes. Moreover, it’s hard to argue with a passage as beautiful as this: “In each other’s company, nothing seemed insurmountable. We were left with outward scars. … But the only wounds that really mattered to either of us were the psychic wounds caused by the occasional possibility of losing each other; and those did heal, always, blending and dissolving back into joy.”

Like all good memoirs — and this is an excellent one — “My Beloved Monster” is not always for the faint of heart. Because life is not for the faint of heart. But it is worth the emotional investment, and the tissues you will need by the end, to spend time with a writer and cat duo as extraordinary as Masha and Carr.

Chris Bohjalian is the best-selling author of 24 books. His most recent novel, “The Princess of Las Vegas,” was published last month.

My Beloved Monster

Masha, the Half-Wild Rescue Cat Who Rescued Me

By Caleb Carr

Little, Brown. 435 pp. $29

We are a participant in the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for us to earn fees by linking to Amazon.com and affiliated sites.

comparison and literature review

IMAGES

  1. Literature Review Summary Table

    comparison and literature review

  2. Systematic Review and Literature Review: What's The Differences?

    comparison and literature review

  3. Types of literature reviews

    comparison and literature review

  4. 50 Smart Literature Review Templates (APA) ᐅ TemplateLab

    comparison and literature review

  5. The summary-comparison matrix: A tool for writing the literature review

    comparison and literature review

  6. A Complete Guide on How to Write Good a Literature Review

    comparison and literature review

VIDEO

  1. The Literature Review

  2. What is literature review?

  3. Literature Review

  4. Literature Review Writing Part II

  5. Purposes of a Literature Review

  6. Lecture 11: Basics of Literature Review

COMMENTS

  1. How to Write a Literature Review

    Examples of literature reviews. Step 1 - Search for relevant literature. Step 2 - Evaluate and select sources. Step 3 - Identify themes, debates, and gaps. Step 4 - Outline your literature review's structure. Step 5 - Write your literature review.

  2. Writing a Literature Review

    A literature review is a document or section of a document that collects key sources on a topic and discusses those sources in conversation with each other (also called synthesis).The lit review is an important genre in many disciplines, not just literature (i.e., the study of works of literature such as novels and plays).

  3. What is a Literature Review? How to Write It (with Examples)

    A literature review is a critical analysis and synthesis of existing research on a particular topic. It provides an overview of the current state of knowledge, identifies gaps, and highlights key findings in the literature. 1 The purpose of a literature review is to situate your own research within the context of existing scholarship ...

  4. Introduction

    What kinds of literature reviews are written? Narrative review: The purpose of this type of review is to describe the current state of the research on a specific topic/research and to offer a critical analysis of the literature reviewed. Studies are grouped by research/theoretical categories, and themes and trends, strengths and weakness, and gaps are identified.

  5. Ten Simple Rules for Writing a Literature Review

    Literature reviews are in great demand in most scientific fields. Their need stems from the ever-increasing output of scientific publications .For example, compared to 1991, in 2008 three, eight, and forty times more papers were indexed in Web of Science on malaria, obesity, and biodiversity, respectively .Given such mountains of papers, scientists cannot be expected to examine in detail every ...

  6. Comparing and Contrasting in an Essay

    Comparing and contrasting is also used in all kinds of academic contexts where it's not explicitly prompted. For example, a literature review involves comparing and contrasting different studies on your topic, and an argumentative essay may involve weighing up the pros and cons of different arguments.

  7. Writing a literature review

    Writing a literature review requires a range of skills to gather, sort, evaluate and summarise peer-reviewed published data into a relevant and informative unbiased narrative. Digital access to research papers, academic texts, review articles, reference databases and public data sets are all sources of information that are available to enrich ...

  8. Learn how to write a review of literature

    A review is a required part of grant and research proposals and often a chapter in theses and dissertations. Generally, the purpose of a review is to analyze critically a segment of a published body of knowledge through summary, classification, and comparison of prior research studies, reviews of literature, and theoretical articles.

  9. What is a literature review?

    A literature or narrative review is a comprehensive review and analysis of the published literature on a specific topic or research question. The literature that is reviewed contains: books, articles, academic articles, conference proceedings, association papers, and dissertations. It contains the most pertinent studies and points to important ...

  10. Literature Review Overview

    A literature review discusses published information in a particular subject area. Often part of the introduction to an essay, research report or thesis, the literature review is literally a "re" view or "look again" at what has already been written about the topic, wherein the author analyzes a segment of a published body of knowledge through summary, classification, and comparison of prior ...

  11. Literature review as a research methodology: An overview and guidelines

    Literature reviews can also be useful if the aim is to engage in theory development (Baumeister & Leary, 1997; Torraco, 2005). In these cases, a literature review provides the basis for building a new conceptual model or theory, and it can be valuable when aiming to map the development of a particular research field over time.

  12. Writing a Literature Review

    A literature review is an integrated analysis of scholarly writings that are related directly to your research question. Put simply, it's a critical evaluation of what's already been written on a particular topic.It represents the literature that provides background information on your topic and shows a connection between those writings and your research question.

  13. Systematic, Scoping, and Other Literature Reviews: Overview

    A scoping review employs the systematic review methodology to explore a broader topic or question rather than a specific and answerable one, as is generally the case with a systematic review. Authors of these types of reviews seek to collect and categorize the existing literature so as to identify any gaps.

  14. Writing the Review

    Your Literature Review should not be a summary and evaluation of each article, one after the other. Your sources should be integrated together to create a narrative on your topic. Consider the following ways to organize your review: Use an outline to organize your sources and ideas in a logical sequence. Identify main points and subpoints, and ...

  15. Comparing Integrative and Systematic Literature Reviews

    Table 2 presents a comparison of integrative and systematic literature reviews. An integrative literature review "reviews, critiques, and synthesizes representative literature on a topic in an integrated way such that new frameworks and perspectives on the topic are generated" (Torraco, 2005, p. 356).By integrated, it means that it is best used when different communities of practice are ...

  16. Compare and Contrast: Research Paper vs. Literature Review

    Differences between Research Papers and Literature Reviews. Research papers and literature reviews are two of the most important elements in academic writing. Both require extensive research, critical thinking, and well-crafted arguments. However, there are some key differences between these two types of scholarly documents.

  17. Chapter 9 Methods for Literature Reviews

    Literature reviews can take two major forms. The most prevalent one is the "literature review" or "background" section within a journal paper or a chapter in a graduate thesis. This section synthesizes the extant literature and usually identifies the gaps in knowledge that the empirical study addresses (Sylvester, Tate, & Johnstone, 2013).

  18. Comparative Literature Review Essays

    Comparative Literature Review Essays. A may, e.g., require you to examine two schools of thought, two issues, or the positions taken by two persons. You may create a hierarchy of issues and sub-issues to compare and contrast, as suggested by the following general plan. This model lists 3 options for structuring the of the review.

  19. Subject Guides: Literature Reviews: What is a literature review?

    A literature review is an examination of research in a particular field. It gathers, critically analyses, evaluates, and synthesises current research literature in a discipline, and. indicates where there may be strengths, gaps, weaknesses, and agreements in the current research. It considers:

  20. Systematic and scoping reviews: A comparison and overview

    A systematic review is a formalized method to address a specific clinical question by analyzing the breadth of published literature while minimizing bias. Systematic reviews are designed to answer narrow clinical questions in the PICO (population, intervention, comparison, and outcome) format. Alternatively, scoping reviews use a similar ...

  21. Literature Reviews, Theoretical Frameworks, and Conceptual Frameworks

    The first element we discuss is a review of research (literature reviews), which highlights the need for a specific research question, study problem, or topic of investigation. Literature reviews situate the relevance of the study within a topic and a field. The process may seem familiar to science researchers entering DBER fields, but new ...

  22. Literature Review vs Systematic Review

    Despite this commonality, the two types of review differ significantly. The following table provides a detailed explanation of the differences between systematic and literature reviews. Kysh, Lynn (2013): Difference between a systematic review and a literature review.

  23. Five tips for developing useful literature summary tables for writing

    Literature reviews offer a critical synthesis of empirical and theoretical literature to assess the strength of evidence, develop guidelines for practice and policymaking, and identify areas for future research. A literature review is often essential and is usually the first task in any research endeavour, particularly in masters or doctoral level education. For effective data extraction and rigorous synthesis ...

  24. Difference Between Literature Review and Systematic Review

    Literature review and systematic review are two scholarly texts that help to introduce new knowledge to various fields. A literature review, which reviews the existing research and information on a selected study area, is a crucial element of a research study. A systematic review is also a type of literature review.

  25. Beta-blocker therapy in patients with COPD: a systematic literature

    Gulea C, Zakeri R, Quint JK. Effect of beta-blocker therapy on clinical outcomes, safety, health-related quality of life and functional capacity in patients with chronic obstructive pulmonary disease (COPD): a protocol for a systematic literature review and meta-analysis with multiple treatment comparison. BMJ Open. 2018;8(11):e024736.

  26. Literature review of stroke assessment for upper-extremity physical

    This paper reviews literature (2000-2021) on sensor-based measures and metrics for upper-limb biomechanical and electrophysiological (neurological) assessment, which have been shown to correlate with clinical test outcomes for motor assessment. The search terms targeted robotic and passive devices developed for movement therapy.

  27. Publics' views on ethical challenges of artificial intelligence: a

    We followed the guidance for descriptive systematic scoping reviews by Levac et al., based on the methodological framework developed by Arksey and O'Malley. The steps of the review are listed below. Stage 1: identifying the research question. The central question guiding this scoping review is the following: What motivations, publics and ethical issues emerge in research addressing ...

  29. Climate Shocks and the Poor: A Review of the Literature

    Abstract: There is a rapidly growing literature on the link between climate change and poverty. This study reviews the existing literature on whether the poor are more exposed to climate shocks and whether they are more adversely affected. About two-thirds of the studies in our analyzed sample find that the poor are ...
