
Assessing citation networks for dissemination and implementation research frameworks

Ted A. Skolarus

1 Center for Clinical Management Research, VA Ann Arbor Healthcare System, Ann Arbor, MI 48105 USA

2 Dow Division of Health Services Research, Department of Urology, University of Michigan, Ann Arbor, MI 48109 USA

3 Urology Section, VA Ann Arbor Healthcare System, Department of Urology, University of Michigan, Ann Arbor, MI 48109 USA

Todd Lehmann

4 Department of Political Science, College of Literature, Science and the Arts, University of Michigan, Ann Arbor, MI 48109 USA

Rachel G. Tabak

Jenine Harris

5 Prevention Research Center in St. Louis/George Warren Brown School of Social Work at Washington University in St. Louis, St. Louis, MO 63130 USA

6 Maxwell School of Citizenship and Public Affairs, Syracuse University, Syracuse, NY 13244 USA

Anne E. Sales

7 Department of Learning Health Sciences, University of Michigan Medical School, University of Michigan, Ann Arbor, MI 48109 USA

Associated Data

Data for this project is stored on secure servers. Data can be made publicly available upon request.

Background

A recent review of frameworks used in dissemination and implementation (D&I) science described 61 judged to be related either to dissemination, implementation, or both. The current use of these frameworks and their contributions to D&I science more broadly has yet to be reviewed. For these reasons, our objective was to determine the role of these frameworks in the development of D&I science.

Methods

We used the Web of Science™ Core Collection and Google Scholar™ to conduct a citation network analysis for the key frameworks described in a recent systematic review of D&I frameworks (Am J Prev Med 43(3):337–350, 2012). From January to August 2016, we collected framework data including title, reference, publication year, and citations per year and conducted descriptive and main path network analyses to identify those most important in holding the current citation network for D&I frameworks together.

Results

The source article contained 119 cited references, with 50 published articles and 11 documents identified as a primary framework reference. The average citations per year for the 61 frameworks reviewed ranged from 0.7 to 103.3 among articles published from 1985 to 2012. Citation rates from all frameworks are reported with citation network analyses for the framework review article and ten highly cited framework seed articles. The main path for the D&I framework citation network is presented.

Conclusions

We examined citation rates and the main paths through the citation network to delineate the current landscape of D&I framework research, and opportunities for advancing framework development and use. Dissemination and implementation researchers and practitioners may consider frequency of framework citation and our network findings when planning implementation efforts to build upon this foundation and promote systematic advances in D&I science.

Electronic supplementary material

The online version of this article (doi:10.1186/s13012-017-0628-2) contains supplementary material, which is available to authorized users.

The field of dissemination and implementation (D&I) science continues to evolve with contributions from a variety of disciplines, researchers, and institutions across the globe [ 1 ]. Significant advances in our understanding of how to conceptualize D&I research and practice were facilitated by a recent comprehensive review of relevant models, theories, and frameworks [ 2 ]. The review identified 61 frameworks to guide D&I researchers and practitioners in their research-to-practice activities at different socio-ecologic levels within the health care system (individual, organization, community, healthcare system, policy). The goal was to develop a D&I framework inventory to inform selection efforts for researchers and practitioners based on a given framework’s construct flexibility, its predilection for dissemination and/or implementation activities, as well as its socio-ecologic level targeting.

However, better understanding the most frequently cited D&I frameworks and the citation networks surrounding these frameworks can also provide useful information for selection, conceptualization, and resources for operationalization. For example, in cases where several different frameworks might be applicable to a given implementation intervention, identifying the most prominent and commonly applied frameworks in the field could have several advantages. First, it could provide researchers and practitioners with the most supporting literature to inform their effort. Second, accessing this information may increase the chances of intervention success and therefore help the best frameworks emerge. Third, as the framework literature evolves, there will be increasing opportunities to advance D&I science with respect to fidelity of framework use, core framework components, standardized measurement, advantages and disadvantages of a given framework, and ultimately implementation outcomes [ 3 ]. More broadly, mapping D&I framework networks can build upon this foundation to promote systematic advances in D&I science through identifying the common set of assumptions and knowledge that constitutes consensus in the field.

Bibliometric (or citation) analysis is one method to investigate the scholarly landscape surrounding D&I frameworks from the review. This quantitative technique is increasingly applied to measure the impact of academic research and examine relationships using tools such as citation network analysis [ 4 – 6 ]. In general, citation network analysis provides a map of the most highly cited publications within a given research domain, much like the way Google™ uses page rank to identify the most relevant websites [ 7 ]. This approach to understanding the state of scientific advancement has been used across a range of fields, including public administration, public health service systems, physical activity environments, and analytic method development, to discern the degree to which information flows through a scholarly network and identify opportunities for transdisciplinary collaboration and crosstalk [ 8 – 14 ]. Using citation analysis to examine the rapidly evolving D&I field could not only indicate the most frequently cited D&I frameworks but also determine their relationships across time and discipline, and map the emerging knowledge network constituting the D&I framework field.

For these reasons, we conducted a citation network analysis of D&I research frameworks. We created a snapshot of the scientific development of D&I framework research based on carefully selected framework articles followed forward in time as they integrated into the growing body of D&I knowledge. We examined citation rates and the main paths through the citation network to delineate the current landscape of D&I framework research, and opportunities for advancing framework development and use.

Citation network analysis

We used a citation data network collection tool, the Citation Network Analyzer (CNA), to generate the data and conduct our study [ 15 , 16 ]. This tool uses a constrained snowball sampling approach to identify a network of documents (i.e., journal and conference papers, theses and dissertations, academic books, pre-prints, abstracts, technical reports) in Google Scholar™ that can be used for descriptive, main path, and other network analyses via an R software package. In general, a constrained snowball sample of academic publications is created by identifying seed articles, determining the levels of data (articles that cite the seeds, articles that cite those, and so on), and selecting the sampling rate at each level. This vetted, efficient and inclusive networking approach to following citations forward in time is uniquely suited to advance our current understanding of the literature surrounding D&I framework development and use. In addition, the output from the CNA tool can be used to graphically represent the citation network and assign weights to the articles based on their importance in maintaining the network architecture as described below.
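The sampling idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the CNA tool itself: the function name `constrained_snowball`, the `cited_by` mapping, and the toy document identifiers are all hypothetical, and the real tool queries Google Scholar™ rather than an in-memory dictionary.

```python
import random


def constrained_snowball(cited_by, seeds, levels=3, rate=0.5, rng=None):
    """Follow citations forward from the seeds, keeping only a fraction
    (`rate`) of the citing documents at each level.

    cited_by: dict mapping a document to the documents that cite it
              (hypothetical stand-in for a Google Scholar lookup).
    Returns the set of sampled documents and the (citing, cited) links.
    """
    rng = rng or random.Random(0)  # fixed seed so the sketch is repeatable
    sampled = set(seeds)
    links = []
    frontier = list(seeds)
    for _ in range(levels):
        next_frontier = []
        for doc in frontier:
            citers = sorted(cited_by.get(doc, []))
            if not citers:
                continue
            # Constrain growth: sample a share of citers, at least one.
            k = max(1, round(rate * len(citers)))
            for citer in rng.sample(citers, k):
                links.append((citer, doc))
                if citer not in sampled:
                    sampled.add(citer)
                    next_frontier.append(citer)
        frontier = next_frontier
    return sampled, links


# Toy data (hypothetical): "seed" is cited by a and b, and so on.
TOY_CITED_BY = {"seed": ["a", "b"], "a": ["c"], "b": ["c", "d"], "c": ["e"]}

FULL_DOCS, FULL_LINKS = constrained_snowball(TOY_CITED_BY, ["seed"], levels=2, rate=1.0)
HALF_DOCS, HALF_LINKS = constrained_snowball(TOY_CITED_BY, ["seed"], levels=2, rate=0.5)
```

With `rate=1.0` the sketch reduces to an unconstrained snowball over two levels. Lowering the rate trades coverage for a smaller sample, which is the constraint the text describes.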

Our approach of using citation network analysis to conduct structured literature reviews was based on prior work using the CNA tool [ 9 , 10 , 13 , 15 , 17 ]. This approach can lead to a less biased assessment of the academic literature than traditional narrative reviews for at least two reasons. First, a citation analysis approach can avoid the cognitive bias associated with traditional keyword-based literature searches, which may be limited by the researcher's expertise, training, and preferences. Second, the use of Google Scholar™ and a snowball sampling technique based on selected seed articles, rather than Web of Science™ citation tools based on keywords for instance, can survey a broader scope of publications that may be relevant to D&I frameworks, especially given their expansive roots in fields ranging from agriculture, business, and political science to public health and medicine [ 18 , 19 ]. In addition, the CNA tool allows for a constrained approach to snowball sampling, rather than traditional snowball sampling where the sample grows exponentially, in order to limit the articles at each level from the seed article and arrive at empirical findings using a fraction of the data [ 15 ].

As detailed in Additional file 1 , we conducted two analyses using this novel approach. First, we synthesized the literature covered in the framework review article by Tabak et al. [ 2 ] with respect to recent citations and performed a structured literature review of the article itself. Next, we applied a structured literature review to a snowball sample of ten framework articles identified as the most important by the study team, largely based on the Tabak review. Overall, this work allowed us to understand the relevance of the framework review article as a D&I resource and to identify those frameworks forming the current backbone of the D&I framework field (i.e., framework articles in the network’s main path).

Characterizing the Tabak et al. framework review article and its citation network

The Tabak systematic review contained 119 references, with 50 published articles and 11 documents (reports/chapters/books) identified as a primary D&I framework reference ( n  = 61) [ 2 ]. These D&I frameworks were identified first through selecting commonly cited frameworks, then through snowball sampling and expert consultation, including with U.S. National Institutes of Health officials who process and review D&I grants. Frameworks were excluded from the review if they met any of the following criteria: (1) focused on the practitioner rather than the D&I researcher; (2) applied to individual behavior change only (i.e., without ties to local, organizational or community dissemination); (3) intended only for national level use versus local, community, or organizational level; (4) focused only on dissemination after research study completion; or (5) not written in the English language. The frameworks were then judged by the authors to be related either more to dissemination, implementation, or both equally. Each framework’s construct flexibility was rated as broad and flexible versus operational and defined for a given context and activity. Last, the socio-ecologic level (individual, organization, community, healthcare system, policy) targeted by the framework was categorized, with most operating at more than one level.

We extracted the primary citation for each framework. In cases where more than one primary reference was used ( n  = 21), we selected the most relevant reference, usually the oldest, as the primary reference. The primary references for 11 frameworks were reports, chapters, or books. Because peer-reviewed articles were the most common documents cited in this study, we use the term article to denote all documents throughout the remainder of the manuscript.

To better understand the framework articles discussed in the Tabak review, we conducted descriptive analyses to identify the most common journals, authors, and countries of origin for the 61 models. We also examined the citation rates for each framework. We defined a citation rate as the number of citations divided by the number of years since publication. We used the Web of Science™ Core Collection in January 2016 to conduct these descriptive citation analyses and inform our subsequent network analysis described in Additional file 1 .
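The citation rate defined above is a simple ratio, sketched below. The function name and the example numbers are hypothetical, and counting years as a plain calendar-year difference is an assumption, since the text does not say how partial years were handled.

```python
def citation_rate(total_citations, publication_year, as_of_year=2016):
    """Citations per year since publication.

    Assumption: years are counted as a calendar-year difference,
    which may differ from the authors' exact convention.
    """
    years = as_of_year - publication_year
    if years <= 0:
        raise ValueError("as_of_year must be later than publication_year")
    return total_citations / years


# Hypothetical framework article: 120 citations, published in 2006.
EXAMPLE_RATE = citation_rate(120, 2006)  # 120 citations over 10 years
```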

Citation network analysis of selected D&I frameworks

Next, we conducted a citation network analysis of ten carefully selected D&I framework articles we felt reflected the current state of the field. Eight of these were based on citation rates and the Tabak review. However, we also included two additional frameworks given their relevance to implementation science and relatively high citation rates: (1) the Theoretical Domains Framework (TDF) [ 20 , 21 ] and (2) the Knowledge to Action Framework (KTA) [ 22 ], for a total of ten seed articles for our next citation network analysis. Both of these models were developed by researchers outside the USA and were not included in the Tabak review. The details of the D&I framework citation analysis are included in Additional file 1 .

Last, we performed a main path analysis to identify the connectedness and links among the articles considered to be the backbone of the D&I framework citation network. This approach identifies the key articles influencing D&I models based on the selected seed articles. We determined the traversal weights indicating the proportion of network paths that included a given article node in the network [ 23 ]. For instance, a traversal weight of 0.25 for framework X indicates that its article exists in 25% of the citation paths in the network. This traversal weight indicates the importance of any particular node (i.e., article) in the network. We constructed the main path by removing all ties in the network scoring below the 95th percentile for traversal weight value. We normalized the traversal weights according to flow using the Search Path Count method [ 24 ]. All computations were accomplished with Pajek [ 23 ].
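To make the traversal weight concrete, the following is a minimal sketch of Search Path Count weights on a tiny acyclic citation network, followed by a percentile cutoff. It is not the Pajek implementation: the function names, the toy arcs, the orientation of arcs from cited to citing article, and the nearest-rank percentile rule are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def spc_weights(arcs):
    """Normalized Search Path Count weight for each arc of a DAG.

    arcs: list of (cited, citing) pairs (orientation is an assumption).
    The weight of an arc is the share of all source-to-sink paths
    that pass through it.
    """
    succ, pred, nodes = defaultdict(list), defaultdict(list), set()
    for u, v in arcs:
        succ[u].append(v)
        pred[v].append(u)
        nodes.update((u, v))

    # Topological order (Kahn's algorithm).
    indeg = {n: len(pred[n]) for n in nodes}
    queue = sorted(n for n in nodes if indeg[n] == 0)
    order = []
    while queue:
        n = queue.pop()
        order.append(n)
        for m in succ[n]:
            indeg[m] -= 1
            if indeg[m] == 0:
                queue.append(m)
    if len(order) != len(nodes):
        raise ValueError("citation network must be acyclic")

    # Number of paths reaching each node from any source, and
    # number of paths leaving each node toward any sink.
    from_source, to_sink = {}, {}
    for n in order:
        from_source[n] = 1 if not pred[n] else sum(from_source[p] for p in pred[n])
    for n in reversed(order):
        to_sink[n] = 1 if not succ[n] else sum(to_sink[s] for s in succ[n])

    total_paths = sum(to_sink[n] for n in nodes if not pred[n])
    return {(u, v): from_source[u] * to_sink[v] / total_paths for u, v in arcs}


def keep_top_arcs(weights, percentile=95):
    """Keep arcs at or above the given percentile (nearest-rank rule)."""
    values = sorted(weights.values())
    rank = max(1, math.ceil(percentile / 100 * len(values)))
    cutoff = values[rank - 1]
    return sorted(arc for arc, w in weights.items() if w >= cutoff)


# Toy network (hypothetical): two paths, A-B-C-D and A-C-D.
TOY_ARCS = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
WEIGHTS = spc_weights(TOY_ARCS)
MAIN_PATH = keep_top_arcs(WEIGHTS, percentile=95)
```

In the toy network the arc from C to D lies on both paths, so its weight is 1.0, while each of the other arcs lies on one of the two paths and has weight 0.5. Only the C to D arc survives the 95th percentile cutoff.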

All analyses were conducted between January 2016 and August 2016. This study was deemed not regulated by the Institutional Review Board at the University of Michigan.

Tabak framework review article and its citation network

As illustrated in Fig. 1, the Tabak framework review article is an increasingly cited resource. As of January 2016, it had been cumulatively cited 456 times across 388 articles and other source items indexed within Web of Science™ Core Collection. As shown in Table 1, there was a broad distribution of citation numbers and annual citation rates across the 61 framework articles within the Tabak review and our two selected framework articles (KTA and TDF). The average number of citations per year ranged from 0 to 1949 among articles published from 1962 to 2012. The outlier with the highest citation rate was a book reference for Rogers’ Diffusion of Innovations.

Fig. 1. Citation report through 2015 for ‘Bridging Research and Practice Models for Dissemination and Implementation Research’ by Tabak et al. [ 2 ]

Table 1. Citations for D&I frameworks in published articles as of January 2016

a Included as one of ten seed articles for citation network analysis

b Two additional frameworks were included along with the Tabak framework review articles given their relevance to implementation science: the Theoretical Domains Framework (TDF) [ 20 ] and the Knowledge to Action Framework (KTA) [ 22 ]

Based on the structured literature review of the Tabak article using the CNA tool, we identified 239 articles across the network and its three levels of ‘distance.’ This included 17 level-one articles directly referencing the Tabak article, with the remainder of articles residing two and three levels from the Tabak source article. The majority of the documents were journal articles (84%), followed by books (16%). The articles in the Tabak network were published between 2002 and 2016, with 51 articles published prior to the source article year of 2012. The majority (86%) of these were three levels from the Tabak seed article and 35% were book references. We identified 202 unique first authors contributing to this network. Each author contributed 1.18 articles (standard deviation (SD) = 0.58), on average. Most first authors contributed only one article to the network (one = 177; two = 19, three = 3, four = 2, six = 1). We identified 123 unique journals (books excluded) contributing to the Tabak network, each providing an average of 1.62 articles (SD = 2.63). Most journals contributed one article ( n  = 95). The top three journals producing the most articles were: Implementation Science ( n  = 29), Annual Review of Public Health ( n  = 6), and BMC Public Health ( n  = 5). All other journals had four or fewer articles each. The articles in the Tabak network were cited between 0 and 4410 times. The top ten cited articles in the Tabak network are shown in Table 2, none of which served as a primary framework reference. As illustrated in Fig. 2, there were prominent ties in the Tabak network to social care and the law by Aveyard; normalization process and general implementation theory by May; implementation work by Glasgow, Proctor, Neta, and Chambers; a gateway to broader literature via a movement science article by Peters; a Karlin article which ties in psychotherapy; and a 2013 contribution by Straus that was an introduction to knowledge translation in healthcare.

Table 2. Ten most cited articles within the Tabak framework review citation network

Fig. 2. Citation network for ‘Bridging Research and Practice Models for Dissemination and Implementation Research’ by Tabak et al. [ 2 ]. Most first authors contributed only one article (one = 177). Those authors with two articles: Aarons, G; Archambault, P; Bjurlin, M; Blease, CR; Brownson, R; Chambers, D; Chor, K; Davidoff, F; Edwards, N; Gagliardi, A; Kozica, S; May, C; Naci, H; Neta, G; Page, A; Partridge, SR; Rhoades, E; Trevithick, P; Trockel, M; three articles: Aveyard, H; O’Brien, J; Proctor, E; four articles: Glasgow, R and Powell, B; and six articles: Thompson, N
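The first-author summary statistics reported for the Tabak network (202 unique first authors, a mean of 1.18 articles each, SD = 0.58) can be reproduced from the stated distribution (one = 177; two = 19, three = 3, four = 2, six = 1). The short check below is a sketch; using the sample standard deviation is an assumption, since the text does not say which form was used, though both forms round to 0.58 here.

```python
import statistics

# Articles per first author (key) and number of such authors (value),
# taken from the distribution reported in the text.
AUTHORS_BY_ARTICLE_COUNT = {1: 177, 2: 19, 3: 3, 4: 2, 6: 1}

# Expand to one entry per author, e.g. 177 ones, 19 twos, and so on.
ARTICLES_PER_AUTHOR = [
    count for count, n_authors in AUTHORS_BY_ARTICLE_COUNT.items()
    for _ in range(n_authors)
]

N_AUTHORS = len(ARTICLES_PER_AUTHOR)
MEAN_ARTICLES = statistics.mean(ARTICLES_PER_AUTHOR)
SD_ARTICLES = statistics.stdev(ARTICLES_PER_AUTHOR)  # sample SD (assumption)
TOTAL_ATTRIBUTED = sum(ARTICLES_PER_AUTHOR)
```

The distribution accounts for 238 articles, one fewer than the 239 reported for the network. The text does not explain the difference.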

The citation network for our seed articles highlighted in Table 1 included 355 unique documents published between 1996 and 2014. There were 302,472 citation links connecting the articles in this network. The majority of citations was from 323 journal articles (91%), followed by 29 books (8%), and 3 in-proceedings (1%). We identified 274 unique first authors, each contributing 1.30 articles (SD = 0.84), on average. The majority of first authors provided one article to the network with only six authors contributing more than three. We also identified 128 unique journals contributing to this network, each providing an average of 2.52 articles (SD = 4.04). While many journals contributed one article ( n  = 29), the top five journals producing the most articles were: Strategic Management Journal ( n  = 29), Academy of Management Journal ( n  = 25), Implementation Science ( n  = 20), Organization Science ( n  = 15), and Management Science ( n  = 10). All other journals contributed fewer than ten articles each. The top ten cited articles are shown in Table 3, with Szulanski’s Sticky Knowledge as the only primary framework reference from the Tabak review. The remainder of articles tended to focus on business practices and knowledge sharing, collaboration networks, and social and/or intellectual capital. The articles for the D&I framework network contributed between 64 and 12,680 citations, with a median of 489.

Table 3. Ten most cited articles within the D&I framework citation network

As illustrated in Fig. 3, the D&I framework citation network appears centered around the 2004 Greenhalgh et al. article with prominent ties to the Theoretical Domains Framework, the Knowledge to Action Framework, the Promoting Action on Research Implementation in Health Services Framework (PARiHS), the Consolidated Framework for Implementation Research (CFIR), and an article conceptualizing implementation outcomes, among others. A more complete picture of the network’s primary core is offered with the main path analysis, which consists of those ties above the 95th percentile score for traversal weight (0.0106). The main path, illustrated in Fig. 4, comprises the 15 articles listed in Table 4. A simple interpretation of the main path is that these articles are most important in holding the entire D&I framework citation network together. In this case, seven of the ten D&I framework seed articles are part of the main path, along with eight non-seed articles. Visually, one can inspect the main path and observe the chronological flow of influence from earlier to more recent publications. Kitson [ 25 ] and Klein [ 26 ] act as the primary originating sources of influence in the main path, and in turn influence Greenhalgh [ 27 ], Damschroder [ 28 ], and Proctor [ 29 ]. These five articles, along with Glasgow [ 30 ], all converge in Aarons [ 31 ], which acts as a major hub for the remainder of the more recent works on the periphery of the main path.

Fig. 3. D&I framework citation network. The majority of first authors provided only one article to the network, with only six authors contributing more than three: Hansen, M and Pronovost, P (four articles); Michie, S and Rycroft-Malone, J (five articles); Greenhalgh, T (seven articles); and Glasgow, R (eight articles)

Fig. 4. The main path for a D&I framework citation network. A simple interpretation of the citation network main path is that these articles are the most important in holding the entire D&I framework citation network together. In this case, seven of the ten D&I framework seed articles were part of the main path, along with eight non-seed articles

Table 4. Main path articles for leading D&I research frameworks

a Network vertex is a designated point in the network where 1 through 10 indicates a seed article

Using citation analysis, we identified the most frequently cited D&I frameworks and their relationships across time and discipline and mapped the knowledge network constituting the D&I framework field. We discovered that the Tabak framework review has been increasingly cited and that it was included in the periphery of the main D&I framework network path indicating its value as a recognized resource for D&I researchers and practitioners. We identified the leading journals and authors contributing to the D&I framework literature using methods that limit cognitive biases associated with traditional literature searches using keywords. Using the CNA tool to conduct our structured literature review, we were able to identify the main path articles that signify those most important in holding the entire D&I framework citation network together. Overall, D&I researchers and practitioners may consider frequency of citation and this network structure when planning implementation efforts to build upon this foundation and promote systematic advances in D&I science. Further work is necessary to delineate how these frameworks are being used in the literature, framework selection criteria for planning D&I research efforts, the core components of these frameworks, and how framework use relates to improved implementation outcomes [ 3 ].

This study provides insight into at least two aspects of the evolving D&I scientific field. First, it confirms that D&I research has witnessed a surge of frameworks with most developed in the last two decades [ 2 ]. However, we found that the majority of articles were rarely cited, leaving only a few highly cited frameworks. It is difficult to know whether more recent frameworks will be used or not based on this analysis though several recent articles, including the Tabak review, were highly cited. Nonetheless, there does appear to be framework saturation creating an increasing need to delve further into better understanding the current cadre rather than creating new D&I frameworks. Second, taking into consideration citation rates and this network structure may be a key factor to consider when choosing a framework, in addition to the socioecological level, construct flexibility, and location on the D&I spectrum. For example, increasing citations and centrality in the network indicates more literature is available to highlight the advantages and disadvantages of using a given framework. In addition, there may be more operational and measurement resources with increasing centrality. Taking these additional aspects into consideration creates opportunities to scrutinize frameworks, starting with those in the main path, and advance D&I science by examining issues of fidelity, core and adaptive components, measurement, and relationships to implementation outcomes [ 1 ].

We found a broad range of scientific fields contributing to the D&I citation network given our use of Google Scholar™ and its extensive search capabilities [ 7 , 19 ]. This reinforces the need to scan literature outside of health-related fields to discover new guidance for D&I sciences. For example, other than the specialized journal Implementation Science , which focuses specifically on the field, most citations of the Tabak framework review article were from public health journals due in part to it being a narrative review that used snowball sampling methods and focused on health. In addition, the journals other than Implementation Science, which published the highest number of citations in the broader D&I framework network, were all in the management and business fields. This is consistent with a prior review of leading management journals that found a significant degree of knowledge translation and organizational change literature relevant to D&I in healthcare [ 32 ]. While there is some current cross-over among these fields, they are often quite distinct and separate from each other when it comes to research and practice. Taken together, our findings suggest that greater efforts to scan across these journals and fields could provide unique transdisciplinary collaborations and innovation opportunities to hasten D&I research and practice. For that matter, D&I advances could also serve to improve management and business practices.

However, citing a framework does not imply use or specify what its application entails. How to operationalize determinants of practice across frameworks also needs to be better understood to advance D&I science. A recent study examined use of the KTA framework using citation analysis and systematic review to see whether and how the framework was used in practice [ 33 ]. The authors found that it was used with varying degrees of completeness, from a simple reference to integration into the design, delivery, and evaluation of the implementation activities. The latter contributed most to advancing D&I science and the generalizability of outcomes. Similarly, another recent systematic review examined use of the CFIR among empirical studies in the peer-reviewed literature [ 34 ]. Twenty-six articles met inclusion criteria across a breadth of settings and units of analysis. Justification for which CFIR constructs were selected, integration throughout the research study, and relation to outcomes remained poorly articulated, again limiting contributions to D&I research more broadly. Furthermore, systematic efforts to reconcile determinants of healthcare professional practice across 12 different frameworks have generated practical checklists and implementation strategy recommendations to support implementation and quality improvement efforts [ 35 ]. Better understanding of framework use, and of the consolidation and operationalization of framework determinants, not just citations, could yield more to consider when selecting and using D&I frameworks for research and practice.

There are several limitations to our study approach. First, framework citation rates are influenced by a multitude of factors including journal impact factor, the authors’ fame and publication rate, the degree of research in a given field, whether citation is perceived as positive or negative, and do not necessarily indicate the quality of a given publication or framework [ 5 – 7 , 19 ]. Nonetheless, citation rates do serve as an approximation of the impact of a scholarly work. We also used an expert-led review article for seed article identification and a robust network analysis tool, coupled with citation rate data, to provide our snapshot of the scientific development of the D&I framework field with substantial face validity. Second, there could be issues with respect to language and the definition of D&I research leading to ascertainment bias. Using our comprehensive CNA approach in Google Scholar™, rather than keyword searches for example, actually created a broader scope for our study. Last, whether the use of highly cited documents (e.g., textbooks) as seed articles, rather than the journal articles selected as seeds in our study, would dramatically change our findings is unclear. Our network tool was inclusive of such documents although they were the minority of articles in both network analyses. Indeed, publishing frameworks outside of journal articles creates challenges, both in terms of physically obtaining the material and being able to grasp the conceptual and operational components dispersed throughout a given textbook. Perhaps corresponding peer-reviewed articles serving as a book review, preferably in open-access formats to improve dissemination, could help mitigate access and citation issues [ 36 ].

In conclusion, bibliometric analysis is one way to understand how D&I frameworks are used in the development of D&I science. We used a bibliometric citation analysis tool to help identify the most prevalent models influencing D&I. D&I researchers and practitioners may consider frequency of citation and this network structure when planning implementation efforts to build upon this foundation and promote systematic advances in D&I science.

Acknowledgements

Ryan Blake, BS, for administrative and data collection support.

Dr. Skolarus was supported by a VA HSR&D Career Development Award-2 (CDA 12–171) and the Mentored Training for Dissemination and Implementation Research in Cancer (MT-DIRC) Program, National Cancer Institute, 1 R25 CA171994-01A1 during this study. This study did not receive any dedicated funding.

Availability of data and materials

Additional file

Additional file 1: Citation Network Analysis Methods. (DOCX 14 kb)

Authors’ contributions

The individual contributions of the authors are as follows: TS, TL, RT, JH, JL, and AS contributed to the study conception and design. TS, TL, and JL contributed to the acquisition of data. TS, TL, JL, JH, RT, and AS contributed to the analysis and interpretation of data. TS, TL, and AS drafted the manuscript. TS, TL, RT, JH, JL, and AS made critical revisions. All authors read and approved the final manuscript.

Ethics approval and consent to participate

This study was deemed not regulated by the Institutional Review Board at the University of Michigan.

Consent for publication

Competing interests.

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Ted A. Skolarus, Phone: (734) 936-0054, Email: tskolar@med.umich.edu.

Todd Lehmann, Email: tlehmann@umich.edu.

Rachel G. Tabak, Email: rtabak@wustl.edu.

Jenine Harris, Email: harrisj@wustl.edu.

Jesse Lecy, Email: jdlecy@syr.edu.

Anne E. Sales, Email: salesann@med.umich.edu.

Patent citation network analysis: A perspective from descriptive statistics and ERGMs

Roles Conceptualization, Data curation, Formal analysis, Visualization, Writing – original draft

Affiliation Faculty of Informatics, Universitá della Svizzera italiana, Lugano, Switzerland

Roles Conceptualization, Formal analysis, Software, Writing – original draft

Roles Conceptualization, Funding acquisition, Project administration, Writing – original draft, Writing – review & editing

  • Manajit Chakraborty, 
  • Maksym Byshkin, 
  • Fabio Crestani

  • Published: December 3, 2020
  • https://doi.org/10.1371/journal.pone.0241797

Patent Citation Analysis has been gaining considerable traction over the past few decades. In this paper, we collect extensive information on patents and citations and provide a perspective of citation network analysis of patents from a statistical viewpoint. We identify and analyze the most cited patents, the most innovative and the highly cited companies along with the structural properties of the network by providing in-depth descriptive analysis. Furthermore, we employ Exponential Random Graph Models (ERGMs) to analyze the citation networks. ERGMs enable understanding the social perspectives of a patent citation network, which have not been studied earlier. We demonstrate that social properties such as homophily (the inclination to cite patents from the same country or in the same language) and transitivity (the inclination to cite references’ references), together with the technicalities of the patents (e.g., language, categories), have a significant effect on citations. We also provide an in-depth analysis of citations for sectors in patents and how they are affected by the size of those sectors. Overall, our paper delves into European patents with the aim of providing new insights and serves as an account for fitting ERGMs on large networks and analyzing them. ERGMs help us model network mechanisms directly, instead of acting as a proxy for unspecified dependence and relationships among the observations.

Citation: Chakraborty M, Byshkin M, Crestani F (2020) Patent citation network analysis: A perspective from descriptive statistics and ERGMs. PLoS ONE 15(12): e0241797. https://doi.org/10.1371/journal.pone.0241797

Editor: Stefan Cristian Gherghina, The Bucharest University of Economic Studies, ROMANIA

Received: May 3, 2020; Accepted: October 21, 2020; Published: December 3, 2020

Copyright: © 2020 Chakraborty et al. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: Data is publicly available and can be accessed from the following link: http://www.ifs.tuwien.ac.at/imp/marec.shtml and at https://github.com/Manajit89/ERGM-patent-analysis .

Funding: This work was partially supported by The Global Structure for Knowledge Networks project grant under the SNSF National Research Programme 75 "Big Data" (NRP 75). There was no additional external funding received for this study.

Competing interests: The authors have declared that no competing interests exist.

Introduction

A patent is a contract between the inventor or assignee and the state, granting a limited period of time to the inventor to exploit his invention. The reasons for patenting could be myriad, ranging from the elementary need for exclusive rights to a particular technology or invention to building a positive image of an enterprise. In the context where they apply, patents are vital for technological innovation. Patents often incentivize synergistic partnerships between companies and academic institutions. They are known to be used for generating revenues for patent assignees and are often used as tools for competitive market advantages. Patents are also the basis for productive activities within and outside of firms engaged in services and other important business sectors. An invention is a solution to a specific technological problem and is either a product or a process. So, essentially patents are important not only for the protection of intellectual rights but also in solving a wide category of technological problems and promoting innovation. Usually, patents are associated with economic growth, but in certain cases, such as in a time of economic crisis, they can prove to be detrimental to such growth. While there have been a lot of studies and research carried out on academic research papers [ 1 ], the patents have not been the subject of such a rigorous study on the same scale despite the fact that the history of patents dates back to the thirteenth century [ 2 ].

Patent citations are references to already existing technology within either patents or scientific literature based on which the current patent is modeled. They bear a resemblance to references in academic research papers. These references are primarily concerned with older patents (patent-to-patent citations) on which the current one is built to prove novelty or for continuity (“prior art”) and, generally to a lesser extent, with non-patent items (non-patent references, NPRs), particularly academic and scientific publications (scientific non-patent references, SNPRs). The onus of including relevant references in academic and scientific publications is on the authors. However, in the case of patents, both inventors and patent examiners are equally responsible [ 3 ]. Having said that, there are significant differences between patent citations and scientific ones. As pointed out by Meyer [ 4 ], organisational influences, legal and strategic factors, and differences between patent examination offices all dictate which patents are cited. Often a patent not only contains a solution to a problem but also highlights opportunities for applications. In addition, journal articles do not emphasise the deficiencies of earlier undertakings as frequently or as rigorously as patents do. Accounting for all these factors in understanding which patents are cited and why is beyond the scope of our study. Hence, we acknowledge that while there are some parallels between patent and scientific citations, there are several unique differences as well. In this paper, our objective is solely to understand which factors can possibly affect the formation of citations from a network structure point of view, without accounting for extraneous factors.

In the patent application process, patent examiners suggest missing citations to applicants to ensure full coverage of related works so as to avoid patent infringement issues. There are basically two kinds of citations: forward citation and backward citation. Forward citations concern patents which cite a particular patent while backward citations are patents that are cited by a specific patent. Often, citation analysis is performed over citation graphs to identify similar works or to measure the impact factor of journals, researchers etc . With respect to academic citations, numerous methods have been proposed for computing these scores, such as bibliographic coupling [ 5 ], co-citation [ 6 ] or the Hirsch h-index metric [ 7 ]. References to prior patents i.e., patent citations and the state-of-the-art included therein, along with the frequency with which prior documents are cited are regularly used as indicators for estimating the commercial and technological value of a patent. Depending on the nature of the technology, patent citations are often used to identify “key” or pivotal patents.
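The distinction between forward and backward citations can be made concrete with a small sketch. The patent identifiers and the `citation_counts` helper below are hypothetical and serve only as illustration; the input is assumed to be a list of (citing, cited) pairs.

```python
from collections import defaultdict

def citation_counts(citations):
    """Count forward and backward citations from (citing, cited) pairs."""
    forward = defaultdict(int)   # how often a patent is cited by other patents
    backward = defaultdict(int)  # how many patents a given patent cites
    for citing, cited in citations:
        forward[cited] += 1
        backward[citing] += 1
    return dict(forward), dict(backward)

# Hypothetical toy data: P3 cites P1 and P2; P2 cites P1.
toy = [("P3", "P1"), ("P3", "P2"), ("P2", "P1")]
forward, backward = citation_counts(toy)
```

In network terms, the forward citation count of a patent is its in-degree and the backward citation count is its out-degree in the directed citation graph.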

Global trends for transferring technology could also be inferred from patent data. The geographical regions in which a patent is granted demonstrate the wide applicability of the technology as perceived by the inventor. Based on the fact that protection for an invention may be sought in multiple countries, the Organisation for Economic Co-operation and Development (OECD) developed a proxy measure of technology transfer [ 8 ]. This stems from the notion that inventors and organizations would not be interested in filing patent applications in more than one country unless there is a market potential for the technology proposed in the patent in those particular countries.

There are several factors that determine how innovation evolves in a particular geographical location over a period of time, including political, social, environmental, and judicial policies, among others. While it is nearly impossible to chart all the factors and measure their impact on innovation, investigating how innovation grows and affects the knowledge flow across countries and classes, irrespective of such influences, is still very important. Previous studies, like the one by Acs et al. [ 9 ], suggested that patents provide a fairly reliable measure of innovative activity. According to Trajtenberg [ 10 ], apart from serving as indicators, patent citations represent the causal relationships between citing and cited patents, reinforcing the view that innovation is a continuous and incremental process. All of this necessitates a closer look at patent citation networks, especially with respect to how citations are formed and their relevance in imparting knowledge about factors influencing them. The literature on patent citations is vast, and numerous studies have been conducted on different aspects, e.g., ethnicity [ 11 , 12 ], social networks [ 13 ], geographic proximity [ 14 , 15 ] and so on. In this paper, we instead aim to study the patent citation network from the perspective of citation forming mechanisms and the factors that influence them. This is a novel perspective on the problem.

In this paper, we provide an extensive study of the patent citation network from a statistical viewpoint. In particular, we carry out experiments to understand citation formation mechanisms in patents. We extract comprehensive information on patent meta-data such as assignee, language, and country, among others, and employ it in Exponential Random Graph Models (ERGMs) [ 16 ], which are well-established statistical models for the analysis of network data. For our study, we carry out the analysis with the European (EPO) patents from the MAREC dataset ( http://www.ifs.tuwien.ac.at/imp/marec.shtml ). The patents in this collection have been aggregated and curated within the period of 1976-2008 in several languages. The contributions of the paper are as follows:

  • We provide a methodology based on descriptive analysis of the patent citation network to gain a shallow understanding of the network structure and its implications.
  • We carry out an in-depth study of the citation network among top patent applicants to verify whether the “small-world” effect holds true in this context. This also acts as a case study to identify hubs and authorities within the sample network, thus giving us a deeper understanding of how top companies interact among themselves in terms of patent citations.
  • We employ ERGMs on the patent citation network to study the effect of various self-defined covariates on the patent citation forming mechanisms. We posit that since the patent network is a large network consisting of several nodes and edges, ERGMs will be able to estimate parameters effectively. To the best of our knowledge, no study in the literature uses ERGMs to examine patent citation forming mechanisms and the effects of factors such as patent recency, overlapping categorization and so on.
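As an illustration of what identifying hubs and authorities can involve, the sketch below implements the classic HITS power iteration in plain Python. Whether the case study uses exactly this procedure is not stated in this part of the text, so treat it as an assumption; the company labels and the `hits` function name are hypothetical.

```python
def hits(edges, iterations=50):
    """Hub and authority scores for a directed graph of (citing, cited) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of the nodes that cite it.
        auth = {n: 0.0 for n in nodes}
        for citing, cited in edges:
            auth[cited] += hub[citing]
        norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub score: sum of the authority scores of the nodes it cites.
        hub = {n: 0.0 for n in nodes}
        for citing, cited in edges:
            hub[citing] += auth[cited]
        norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth

# Hypothetical toy network: companies A, B and C all cite D; A also cites E.
toy_edges = [("A", "D"), ("B", "D"), ("C", "D"), ("A", "E")]
hub, auth = hits(toy_edges)
top_authority = max(auth, key=auth.get)
top_hub = max(hub, key=hub.get)
```

In this toy network the heavily cited company emerges as the strongest authority, and the company that cites the most widely emerges as the strongest hub.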

The paper is organized as follows: In the section titled Related Works, we discuss past citation network studies with special emphasis on patents and delineate our contributions to the literature. The section titled Exponential Random Graph Models briefly describes the ERGM model while also emphasizing the novelty of the algorithm in this context. In the section titled Data and Analysis Strategies, we describe the dataset used for analysis and the methodology employed. In the section titled Results and Analysis, we present the results and analyses of our experiments, and we finally conclude the paper in the section titled Conclusions.

Related works

The literature on patent data analysis is vast and varied. It is, thus, nearly impossible to list all the important works on patent analysis that deal with citations and other bibliometric measures. In this section, we first discuss a few important works that are relevant to our work in the broader sense. We also devote a section to the relevant works that involve ERGMs, which again is a vast research landscape on its own.

Patent analysis

It has already been established that statistical analysis of international patent documents acts as an invaluable instrument for technological planning and analysis within companies. Patents are a known source of detailed information, providing comprehensive coverage of technologies and countries, a relatively standardized level of invention, and a long time-series of data [ 17 ]. So, it essentially provides us with a technological indicator to measure technological growth, which in turn could be extrapolated to get a better understanding of the relation and mutual dependence of innovation and economics [ 18 , 19 ].

The field of quantitative evaluation of scientific impact is built upon the intrinsic notion that the scientific standard of papers [ 20 ], scholars [ 7 , 21 ], journals [ 22 ], universities [ 23 ] and countries [ 24 ] can be measured by metrics based on the citations received. Bibliometrics has been employed in a variety of scenarios to measure and analyze citations since they provide a rich source of information. Scientific papers and scholarly articles have long been investigated using various bibliometric tools, especially citations [ 25 ]. Since it is nearly impossible to study the characteristics of the complete citation graph of scholarly articles, researchers have chosen to focus on either different aspects of the network or on a subset of the graph [ 26 ]. Patent citation analysis gained traction relatively late (in the 1990s) compared to its scholarly counterpart. One of the early studies to measure technological impact based on patent citations was done by Karki [ 27 ], who proposed several technological indicators based on citations among patents. Criscuolo et al. [ 28 ] investigated the significance of R&D internationalization with respect to host country innovation systems, providing aids in quantifying relative asset augmentation compared to the exploitative nature of foreign-located R&D. Some studies, like the one by Albert et al. [ 29 ], have considered only citation counts as indicators of industrially important patents. The pattern of knowledge flows, as indicated by patent citations between European regions, has been studied by Maurseth and Verspagen [ 30 ]. The authors observed that patent citations are industry-specific and that citation propensity increases between geographical regions that are focused on industrial sectors with specific technological linkages between them. It has also been observed that the frequency of patent citations is high between regions that belong to the same linguistic groups.

Almeida [ 31 ] investigated the contribution patterns of multinational firms in the U.S. semiconductor industry through citation analysis. The study reported that foreign firms also contribute significantly to local technological progress. Hall et al. [ 32 ] found that firm market value, as indicated by Tobin’s Q ratio, was correlated with the citation-weighted patent portfolio of the firms. Carpenter et al. [ 33 ] and Fontana et al. [ 34 ] juxtaposed award-winning inventions in the form of patents against patents belonging to a control group, demonstrating that important patents are cited more. In fact, it was found that the average number of citations received by important patents was about 50% higher than that of other patents. Zhang et al. [ 35 ] proposed to weigh 11 indicators of a patent’s technological value by using Shannon entropy and selected forward citations. Thus, patent analysis spans a multitude of research areas, from patent search [ 36 ] and patent classification [ 37 , 38 ] and categorization to measuring the social impact of patents [ 39 , 40 ]. Patent citation analysis can thus act as a bridge between these overlapping areas while providing a cursory overview of the patent landscape. The primary reason for using citations received as a quality indicator is that citations are able to capture some form of knowledge spillovers. In fact, citations either serve a similar role or allow building new technology from an existing one [ 41 ]. Consequently, citation chains are helpful in tracing technological evolution. In this regard, the centrality of patents in the citation network can be used to assign scores to patents. However, not all measures of centrality are equally applicable in all scenarios. There are situations, for instance, where we would like to quantify the citations received as positive but not necessarily how many citations are spawned.

Also, there are various challenges and limitations to citation analysis of patents, including lack of technical knowledge to process patent citations, geographical constraints and language barriers [ 42 ]. Some problems and critiques of citation analysis are presented in the papers by Fortunato et al. [ 43 ], MacRoberts and MacRoberts [ 44 ] and Garfield et al. [ 45 ]. For patents as well, citations are often used as proxies or indicators of knowledge growth and spillovers. However, as pointed out by Jaffe et al. [ 41 ], this does come with some inherent limitations, especially when studying the mechanisms associated with the movement of knowledge flows. Our research also operates under this set of assumptions. There have been prior studies on patent collaboration networks [ 46 ] for specific fields, but none have focused on the patent citation network from the perspective of categorical sectors (explained later). A number of studies [ 13 , 47 , 48 ] have focused on the sociological aspects of a patent citation network. In particular, Agrawal et al. [ 47 ] observed that knowledge flows to an inventor’s prior location are approximately 50% greater than if they had never lived there, suggesting that social relationships, not just physical proximity, are important for determining knowledge flow patterns. While such sociological factors are equally relevant for our study, it is often difficult to replicate similar findings for other datasets due to differences in patenting processes. The impact of innovation on revenue generation for renowned companies has been studied by Singh et al. [ 49 ]. Recently, Kuhn et al. argued that due to systemic changes in the data generation process, many of the assumptions in patent citations are no longer valid [ 50 ].

In this paper, apart from presenting detailed descriptive analyses of the citation networks, we also use ERGMs to study how technical features of the patents and social processes influence citation formations. Similar to the work done by An and Ding [ 51 ], we model the effects of a list of covariates we extracted from patents distinguishing between receiving and sending citations. In this study, we provide theoretical expectations on the effects of the covariates and discuss how different patent characteristics can matter for citations. We also aim to account for the homophily in citation formations, especially with respect to the same country and the same language.

Exponential random graphs

Exponential Random Graph Models (ERGMs) are statistical models of network structure, permitting inferences about how network ties are patterned [ 52 ]. ERGMs have been applied to several fields such as economics [ 53 ], sociology [ 54 ], political sciences [ 55 ], international relations [ 56 ], medicine [ 57 ] and public health [ 58 ], with varied applications ranging from modeling micro-blog networks [ 59 ] and studying relational coordination among healthcare organizations [ 60 ] to strategic management research [ 61 ]. Social network models too have attracted considerable attention from physicists [ 62 , 63 ] and have been pivotal in the development of interdisciplinary perspectives [ 64 ]. On the other hand, networks have been extensively studied from the perspective of preferential attachment. For instance, Barabási et al. [ 65 ] demonstrated that in a large network such as the World Wide Web, despite its apparent random character, the topology of the graph has a number of universal scale-free characteristics. Jeong et al. [ 66 ] showed that nodes acquire links depending on the node’s degree, offering direct quantitative support for the presence of preferential attachment in scientific citation networks. However, little research exists on studying and understanding patent citation networks from a social and structural perspective. We believe our current work will help researchers gain a preliminary understanding of how treating patent citation networks from this perspective can lead to a more inclusive and interdisciplinary understanding of the impact of patents.

Exponential random graph models

The typical network features are transitivity, degree distribution and homophily [ 16 ]. Statistics for these structural features were proposed by Snijders et al. [ 69 ]. ERGMs are known to account for both endogenous network formation processes and covariate effects. ERGMs can be used to model many different network features simultaneously and do not assume that these features are isolated. In order to find a feature in the network data or to check a hypothesis, researchers may compare the value of a statistic of the network under study with its value in a random network. However, real networks are not random, and instead of comparing them with completely random networks it is better to compare them with a null model that takes into account transitivity, degree distributions, and all the other important network features. The capability of ERGMs to model all these features simultaneously solves the problem of the null model in an elegant way.
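For orientation, the general ERGM family is commonly written in the form sketched below. The symbols are the conventional ones from the ERGM literature and are assumptions here; they are not necessarily the notation or numbering used for the equations of this paper.

```latex
\Pr(X = x \mid \theta)
  = \frac{1}{\kappa(\theta)}
    \exp\Big( \sum_{A} \theta_A \, z_A(x) \Big),
\qquad
\kappa(\theta)
  = \sum_{x' \in \mathcal{X}}
    \exp\Big( \sum_{A} \theta_A \, z_A(x') \Big)
```

Here $x$ is an observed network, each $z_A(x)$ is a network statistic (for example, a count of arcs, transitive triads, or arcs between patents in the same language), $\theta_A$ is the corresponding parameter, and $\kappa(\theta)$ normalizes over the set $\mathcal{X}$ of all possible networks on the same nodes. A positive $\theta_A$ means that configurations of type $A$ occur more often than expected given the other terms in the model.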

Nevertheless, as has been pointed out by An and Ding [ 51 ], studying citation networks using ERGMs poses several challenges. It is a known fact that it is difficult to fit ERGMs on large networks. For large networks composed of thousands of nodes and edges such as the ones studied in this paper, fitting ERGMs may be very slow. This is owing to the fact that ERGMs rely on Markov chain Monte Carlo to simulate networks for estimations. The estimation of parameters of ERGMs by maximization of the likelihood is a computationally expensive procedure. In practice, the size of the largest network for which ERGM parameters may be estimated by this method is limited to a few thousand nodes [ 71 ]. For larger networks, the solution of Eq (3) cannot be found in a reasonable time. Researchers either have to study smaller sub-networks or to use very crude approximations, like for example, contrastive divergence or pseudolikelihood [ 51 , 71 ]. Recently a very efficient algorithm for the solution of Eq (3) was proposed [ 72 , 73 ]. This new algorithm may be considered as a modification of the algorithm proposed by Laurent Younes [ 74 ] and was implemented in open-source software (available at: https://github.com/stivalaa/EstimNetDirected ) for fitting ERGMs to large directed networks [ 75 ]. In this paper we have used this software for the analysis of patent citation networks. Loading data and carrying out computations with large networks demands a lot of memory, and we use a cluster of computers to solve this issue.
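To illustrate why simulation-based fitting becomes slow, the sketch below shows a plain Metropolis-Hastings sampler for a toy directed ERGM with only two statistics: an arc count and a count of arcs between nodes sharing an attribute (homophily). This is not the algorithm implemented in EstimNetDirected; the function name, parameters, and toy attributes are hypothetical, and the model is deliberately simplified.

```python
import math
import random

def sample_ergm(n, attr, theta_edge, theta_match, steps, seed=0):
    """Simulate a directed network from a toy ERGM by toggling one arc at a time."""
    rng = random.Random(seed)
    arcs = set()
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)  # propose a random ordered pair i != j
        # Change in theta . z(x) if the arc i -> j were added.
        delta = theta_edge + theta_match * (attr[i] == attr[j])
        if (i, j) in arcs:
            delta = -delta  # the proposal removes the arc instead
        # Metropolis acceptance: min(1, exp(delta)) for a symmetric proposal.
        if delta >= 0 or rng.random() < math.exp(delta):
            if (i, j) in arcs:
                arcs.remove((i, j))
            else:
                arcs.add((i, j))
    return arcs

# Hypothetical toy run: six patents, three in English and three in German.
languages = ["en", "en", "en", "de", "de", "de"]
network = sample_ergm(6, languages, theta_edge=-1.0, theta_match=1.5, steps=5000)
```

Each step touches only one dyad, so a network with hundreds of thousands of nodes needs a very large number of steps before the simulated statistics stabilise, and maximum likelihood estimation repeats such simulations many times. Statistics such as transitivity make each change computation more expensive still, because they depend on the neighbouring arcs.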

We know in advance that in citation networks the corresponding parameter value θ_date is negative. Following An and Ding [ 51 ], instead of estimating the value of this parameter, we always used a constant value θ_date = −10^10.

Finally, we fit ERGMs only on the sector citation networks in which isolated nodes (that is, patents that do not cite or get cited by any other patents) are discarded [ 78 ]. The results based on the sector networks can be interpreted as capturing citation patterns among the patents from that particular sector. One limitation of this method is that some of the results obtained for sub-networks may not generalize to the complete citation network comprising all the sectors. In general, results become more reliable as the size of sub-networks increases.

Data and analytical strategies

For our experiments, we worked with the European Patent (EP) sub-collection from the MAtrixware REsearch Collection (MAREC). This sub-collection of patents consisted of around 1.2 million (granted) patents in English, German and French for a period of 32 years (1976-2007), provided in XML format. From each patent document, we extracted the relevant bibliographic and meta-information such as Date, Language, Title, Applicant Country, Applicant, List of Inventors, Classification Codes, Patent-Patent (P-P) citations, and Patent-Non-Patent (P-NP) citations. In this study, we focus only on the patent-to-patent citations and thus ignore the patent-to-non-patent citations, such as those to scholarly papers, books, etc. We discarded some patents with missing Applicant Country, Classification Codes and P-P citation fields. The total number of curated patents thus stood at 757,869. For building a citation network from these curated patents, we had to eliminate patent citations outside of the European Patent Office (EPO), since we did not have any information about such cited patents (e.g., patents from the National Patent Offices or the World Intellectual Property Organization (WIPO)) apart from the EPO patents contained in this sub-collection. Also, we had to eliminate citations that belonged to the time period beyond our collection scope, such as the ones from before 1976. The citations to patents that are outside our dataset or older than 1976 are termed “non-relevant” (although we recognize that they might be relevant). So, the initial network consisted of 3,252,497 citations, and after eliminating non-relevant citations, the network reduced to 646,537 citations.
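The elimination of “non-relevant” citations can be sketched as a simple filter. The record layout, the identifiers, and the `filter_relevant` helper are hypothetical; the actual MAREC XML schema is not described here, so the sketch assumes the citations and grant years have already been extracted.

```python
def filter_relevant(citations, collection, grant_year, first_year=1976):
    """Keep (citing, cited) pairs whose cited patent is inside the collection
    and not older than first_year."""
    kept = []
    for citing, cited in citations:
        if cited not in collection:
            continue  # cited patent lies outside the curated EPO sub-collection
        if grant_year[cited] < first_year:
            continue  # cited patent predates the collection period
        kept.append((citing, cited))
    return kept

# Hypothetical toy data.
collection = {"EP1", "EP2", "EP3", "EP0"}
grant_year = {"EP0": 1970, "EP1": 1980, "EP2": 1990, "EP3": 2000}
citations = [("EP3", "EP1"), ("EP3", "US9"), ("EP2", "EP0")]
relevant = filter_relevant(citations, collection, grant_year)
```

Only the citation from EP3 to EP1 survives in this toy example: the second pair points outside the collection and the third points to a patent older than 1976.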

European Patents registered with the European Patent Office (EPO) follow the International Patent Classification (IPC) system ( https://www.wipo.int/classifications/ipc/ipcpub/ ) under which each patent can be broadly classified under one (or more) of the eight classes or sectors from A to H.

  • A : Human Necessities
  • B : Performing Operations; Transporting
  • C : Chemistry; Metallurgy
  • D : Textiles; Paper
  • E : Fixed Constructions
  • F : Mechanical Engineering; Lighting; Heating; Weapons; Blasting
  • G : Physics
  • H : Electricity

Each of these sectors is further classified into four levels of sub-classes or categories. Fig 1 describes one instance of such classification.

Fig 1. https://doi.org/10.1371/journal.pone.0241797.g001

In Tables 1 and 2 , we provide the sector wise and category wise (the third level of the classification hierarchy) distributions of the curated patents respectively. One can notice that the sum of the number of patents in Table 1 is more than 757,869. This is owing to the fact that a single patent can belong to multiple sectors (and categories). In other words, when we counted the number of patents belonging to a particular sector (or category), we considered all the patents that were classified with the sector (or category) label. For instance, in the dataset, patent document EP0000001 has been classified with sector labels B, F, G and H. Thus, while counting for the number of patents for sectors B, F, G and H, we consider EP0000001 to appear in each one of them.
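The multi-label counting described above can be sketched as follows. Only the sector labels of EP0000001 come from the text; the other two patent identifiers and their labels are hypothetical.

```python
from collections import Counter

def sector_counts(patent_sectors):
    """Count patents per sector, counting a patent once for each of its sectors."""
    counts = Counter()
    for sectors in patent_sectors.values():
        counts.update(set(sectors))  # set() guards against repeated labels
    return counts

toy = {
    "EP0000001": ["B", "F", "G", "H"],  # labels as given in the text
    "EP_X": ["B"],                      # hypothetical
    "EP_Y": ["B", "E"],                 # hypothetical
}
counts = sector_counts(toy)
total_memberships = sum(counts.values())
```

With three patents the sector memberships add up to seven, which mirrors why the column total in Table 1 exceeds the number of curated patents.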

Table 1. https://doi.org/10.1371/journal.pone.0241797.t001

Table 2. https://doi.org/10.1371/journal.pone.0241797.t002

In this paper, we study the citation networks built from sectors B and E. Our objective is to compare the results of our experiments on networks of different sizes. The corresponding networks of sectors B and E have 101,128 and 12,256 citations, respectively. The networks for sectors B and E have been made available for public use at https://github.com/Manajit89/ERGM-patent-analysis . For building these citation networks, we retained citations from the original network where both the citing and the cited patent belong to sector B or E. Since sector B has the largest share of patents, we expect significant differences between such a large network and a relatively small network such as the one built from patents in sector E.
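A sector network of this kind can be sketched as a filter over the citation list. The sketch assumes that, for a given sector, both the citing and the cited patent must carry that sector label; the function name and the toy identifiers are hypothetical.

```python
def sector_network(citations, patent_sectors, sector):
    """Return the nodes and arcs of the citation sub-network for one sector."""
    edges = [
        (citing, cited)
        for citing, cited in citations
        if sector in patent_sectors.get(citing, ())
        and sector in patent_sectors.get(cited, ())
    ]
    # Nodes are collected from the retained arcs, so isolated patents
    # (no citations made or received within the sector) never appear.
    nodes = {n for edge in edges for n in edge}
    return nodes, edges

# Hypothetical toy data.
patent_sectors = {"P1": {"B"}, "P2": {"B", "E"}, "P3": {"E"}, "P4": {"B"}}
citations = [("P2", "P1"), ("P3", "P2"), ("P3", "P1")]
nodes_b, edges_b = sector_network(citations, patent_sectors, "B")
```

Building the node set from the retained arcs also discards isolated patents, which matches the restriction to non-isolated nodes when fitting the models.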

Analytical strategies

In the literature, scientific collaborations and structure of science have been studied using citation network methods [ 79 – 81 ]. Particularly, citation networks have been examined to study information diffusion [ 82 , 83 ] and scholarly impact [ 84 – 86 ]. Most of these previous works on studying citation networks have been focused on simply providing descriptive analyses. The work done by An and Ding [ 51 ] was one of the first studies that looked beyond descriptive analyses. ERGMs were employed to investigate how citation formations are affected by technical features of the scientific publications and social processes.

To the best of our knowledge, no such study exists with respect to patent citation networks. Taking a cue from the work by An and Ding [ 51 ], we extracted a host of covariates from patent documents (see later) and modeled their effects, distinguishing between effects of receiver and sender characteristics of citations. We also provide theoretical expectations on the effects of the covariates and demonstrate the influence of patent characteristics on patent citation formation. The analyses presented in this paper also aim to account for the homophilous nature of citation formation [ 87 ]. We anticipate that citations have a tendency to form between patents that are in the same language and patents that belong to the same country.

Further, we also analyze multiple endogenous network formation processes in citations. We explore citation transitivity (i.e., if Y is cited by X and Z is cited by Y, then Z is more likely to be cited by X). We posit that transitivity in citations can occur because inventors (and sometimes even patent examiners) may use a snowball strategy [ 51 ] to find new patents and similar documents by following the references of other such documents. We also examine preferential attachment [ 51 ], or the advantage of citation cumulation, namely, the fact that patents that have historically received a higher number of citations are prone to receive even more citations over time. We anticipate that some patents will receive more citations than other patents, and hence the number of citations received by different patents will vary widely. At the same time, we expect that the variation in outgoing citations might be small. Moreover, our analysis takes into account a distinctive feature of citations, namely that there is no forward referencing in patents: earlier patents cannot cite later ones.
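Two of the structural ideas in this paragraph, transitivity and the absence of forward referencing, can be checked on a citation list with a short sketch. The function names, the toy patents, and the grant years are hypothetical, and the counting convention (one count per ordered triple X, Y, Z) is an assumption rather than the exact statistic used in the models.

```python
def transitive_triples(edges):
    """Count triples where X cites Y, Y cites Z, and X also cites Z."""
    cites = {}
    for citing, cited in edges:
        cites.setdefault(citing, set()).add(cited)
    count = 0
    for x, refs in cites.items():
        for y in refs:
            for z in cites.get(y, ()):
                if z in refs:  # X cites a reference of its own reference
                    count += 1
    return count

def forward_references(edges, year):
    """Return citations in which the citing patent is older than the cited one."""
    return [(citing, cited) for citing, cited in edges if year[citing] < year[cited]]

# Hypothetical toy data: X cites Y and Z, and Y cites Z.
year = {"X": 2005, "Y": 2000, "Z": 1995}
edges = [("X", "Y"), ("Y", "Z"), ("X", "Z")]
n_transitive = transitive_triples(edges)
violations = forward_references(edges, year)
```

In a valid patent citation network the second function should return an empty list, which is the constraint that a fixed, strongly negative date parameter enforces in the model.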

For each patent, we extracted and constructed a series of covariates. These covariates are employed to model a host of possible mechanisms for the formation of citations. For example, a patent may cite another patent because they are written in the same language [ 88 ] or because the cited patent belongs to several categories, indicating a diverse patent. To clarify, given two patents on similar technologies (one in English and the other in German) that could be cited, an applicant tends to lean towards the reference patent written in the language of the citing patent. A recent patent may attract interest in some sectors or categories, and this might result in it accruing more citations, and so on. As described below, each of these citation mechanisms can be quantified and compared using various covariates, and different aspects of the mechanisms may be measured by each of the covariates. Thus, it is hard to differentiate between the mechanisms and the effects of the covariates on the mechanisms.

The patent covariates that we studied in this paper are listed below:

  • Swiss Patent (binary variable, IsSwiss = 1; others = 0): We wanted to study the effect of country of origin of the patent on citation formations. In particular, since this is a study funded by a Swiss National Science Foundation (SNF) project, we were interested in observing the characteristics of the patents when they are filed by applicants belonging to Switzerland. Obviously, we could repeat the study focusing on any other country.
  • Patent Recency (binary variable, IsRecent = 1; 0 otherwise): We expect that how long ago a patent was granted will affect how many citations it attracts. We define recency using either a five-year or a ten-year window, meaning that the patent was granted within the last five or ten years, respectively. The recency effect has already been studied in networks of scientific citations [ 89 ], where it was shown that citing a recent scientific article can sometimes add to both the reputation and the visibility of the cited article over a longer period. In light of that, we wanted to check whether a similar hypothesis holds for patent citation networks.
  • Whether the patent belongs to Multiple Categories (binary variable, IsTrue = 1, 0 otherwise): This parameter takes into account the diversity of a patent. If a patent belongs to four or more categories (as defined in Section Dataset), it is considered as a multi-categorized patent. We expect that patents with more categories will attract more citations by virtue of belonging to several different domains.
  • Whether the patent was filed by a Prolific Applicant (binary variable, IsTrue = 1, 0 otherwise): We wanted to observe the effect of a patent when it is produced by a company or an organization which has a history of filing a large number of patents ( i.e., a prolific company). This definition is supported by Fig 2 , where we show a truncated view (of 5000 companies from Table 4 except the top 100) of the number of companies with the highest number of patent grants within the period 1976-2008. We considered the companies with at least 50 granted patents within the mentioned time period as ‘Prolific Applicant’.
  • Language (binary variable): We expect that the language that a patent is filed in will also affect how citations are modelled. In particular, we study the effect of this parameter with respect to each of the three languages in the dataset, English (EN), German (DE), and French (FR).
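
The covariates listed above could be derived from patent records roughly as follows. This is only a minimal sketch: the `Patent` record, its field names, the country code "CH", and the reference year 2008 are illustrative assumptions and not part of the MAREC schema, and applicant counts are taken over whatever sample is passed in rather than over the full 1976-2008 dataset.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Patent:
    # Hypothetical record layout, chosen for illustration only.
    patent_id: str
    country: str           # applicant country code, e.g. "CH"
    grant_year: int
    categories: frozenset  # classification categories of the patent
    applicant: str
    language: str          # "EN", "DE" or "FR"


def node_covariates(patents, reference_year=2008, recency_window=5,
                    min_categories=4, prolific_threshold=50):
    """Build the binary node covariates described in the list above."""
    grants = Counter(p.applicant for p in patents)  # patents per applicant
    rows = {}
    for p in patents:
        rows[p.patent_id] = {
            "is_swiss": int(p.country == "CH"),
            # Assumption: "recent" means granted within the window, inclusive.
            "is_recent": int(reference_year - p.grant_year <= recency_window),
            "multi_category": int(len(p.categories) >= min_categories),
            "prolific_applicant": int(grants[p.applicant] >= prolific_threshold),
            "language": p.language,
        }
    return rows


sample = [
    Patent("EP1", "CH", 2006, frozenset({"A", "B", "C", "D"}), "Acme", "DE"),
    Patent("EP2", "US", 1990, frozenset({"A"}), "Acme", "EN"),
    Patent("EP3", "FR", 2004, frozenset({"B", "C"}), "Solo", "FR"),
]
# A threshold of 2 is used only so that the toy sample has a prolific applicant.
cov = node_covariates(sample, prolific_threshold=2)
```

Passing `recency_window=10` gives the ten-year variant of the recency covariate.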

The threshold line for Prolific Applicants is marked with a red dashed line.

https://doi.org/10.1371/journal.pone.0241797.g002

Statistics that are based exclusively on the network information shared by a pair of nodes, a and b (also known as a dyad), are referred to as dyadic endogenous network statistics. We create three dyadic variables to represent assortative mixing mechanisms in citation formation.

  • Whether any two patents belong to the same country i.e., they are produced by applicants from the same country (yes = 1; no = 0).
  • Whether any two citing patents are written in the same language (yes = 1; no = 0).
  • Whether there is an overlap between the categories of any two patents (yes = 1; no = 0), i.e., the two patents share some of the common categories.
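
As an illustration, the three dyadic indicators can be computed for any pair of patents as sketched below. The dictionary keys (`country`, `language`, `categories`) and the example category codes are hypothetical names chosen for this sketch.

```python
def dyadic_covariates(a, b):
    """Return the three 0/1 homophily indicators for a pair of patents."""
    return {
        "same_country": int(a["country"] == b["country"]),
        "same_language": int(a["language"] == b["language"]),
        # Overlap means the two patents share at least one category.
        "category_overlap": int(bool(set(a["categories"]) & set(b["categories"]))),
    }


p1 = {"country": "CH", "language": "DE", "categories": {"B01", "B23"}}
p2 = {"country": "CH", "language": "EN", "categories": {"B23", "E04"}}
dyad = dyadic_covariates(p1, p2)
```

Because each indicator is symmetric in the two patents, the same value applies whichever patent is the citing one.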

We employ two methods to analyze the citation data. Firstly, we present descriptive analyses of the citation network in patents. We present the distribution of the number of patents over the years, the most cited patents, the top companies, etc. Also, we describe the primary characteristics of the citation network, including:

  • centralization (the propensity of citations to asymmetrically converge on a few patents),
  • indegree (the number of citations received),
  • outdegree (the number of other patents that a patent cites),

In the context of patents, indegree may be perceived as indicative of a patent’s influence in the field, outdegree can be an indicator for a patent’s interaction with other patents, and betweenness as reflective of a patent’s brokerage power (that is, the concerned patent plays the role of connecting diverse topics and sub-categories).
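
A few of these descriptive measures can be computed directly from an edge list, as in the sketch below. It assumes edges are (citing, cited) pairs, uses Freeman's indegree centralization, and defines reciprocity as the share of edges that are reciprocated; the paper's exact definitions may differ, and betweenness is omitted for brevity.

```python
from collections import Counter


def describe_citation_network(nodes, edges):
    """Basic descriptive statistics for a directed citation network."""
    edge_set = set(edges)
    n = len(nodes)
    indeg = Counter(cited for _, cited in edge_set)     # citations received
    outdeg = Counter(citing for citing, _ in edge_set)  # citations made
    density = len(edge_set) / (n * (n - 1))
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    reciprocity = mutual / len(edge_set) if edge_set else 0.0
    # Freeman indegree centralization: 1.0 for a star where everyone cites one node.
    max_in = max(indeg[v] for v in nodes)
    centralization = sum(max_in - indeg[v] for v in nodes) / ((n - 1) ** 2)
    return {
        "indegree": indeg,
        "outdegree": outdeg,
        "density": density,
        "reciprocity": reciprocity,
        "centralization": centralization,
    }


toy_nodes = ["A", "B", "C", "D"]
toy_edges = [("B", "A"), ("C", "A"), ("D", "A"), ("D", "C")]
stats = describe_citation_network(toy_nodes, toy_edges)
```

In the toy network, patent A receives three of the four citations, so the centralization is high (8/9) even though the density is only 1/3.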

Receiver effects capture whether patents with certain traits are more likely to be cited than patents without those traits. Similarly, sender effects capture whether patents with certain traits are more likely to cite other patents than patents without those traits. The assortativity of the citation mechanisms is measured by homophily effects, which indicate whether patents with the same characteristics ( e.g., belonging to the same category or geographical location) are more inclined to cite one another than patents with different characteristics. We fit this model on the sector citation networks.

Additionally, we incorporate multiple endogenous network formation processes and a variable that indicates forward referencing. We include the geometrically weighted edgewise shared partners term (AltKTriangleT) to account for transitivity in the citations. AltKTriangleT indicates the propensity of citations to form a triangle (that is, if X cites Y and Y cites Z, then X is expected to cite Z). AltTwoPaths represents the propensity of citations to form two-paths that do not close into a triangle (that is, X cites Y and Y cites Z, but X does not cite Z). Generally, a positive coefficient for AltKTriangleT accompanied by a negative coefficient for AltTwoPaths indicates that higher levels of transitivity are present in the citations [ 90 , 91 ].
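
To make the triangle versus two-path distinction concrete, the sketch below counts closed and open two-paths in a small directed citation list. It is a raw count for illustration only, not the geometrically weighted statistic that enters the ERGM, and the function name is our own.

```python
def two_path_closure(edges):
    """Count two-paths X -> Y -> Z that are closed (X -> Z exists) or open."""
    edge_set = set(edges)
    successors = {}
    for citing, cited in edge_set:
        successors.setdefault(citing, set()).add(cited)
    closed = 0
    open_paths = 0
    for x, ys in successors.items():
        for y in ys:
            for z in successors.get(y, ()):
                if z == x:
                    continue  # skip X -> Y -> X, which is not a two-path to a third node
                if (x, z) in edge_set:
                    closed += 1
                else:
                    open_paths += 1
    return closed, open_paths


citations = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
closed, open_paths = two_path_closure(citations)
share_closed = closed / (closed + open_paths)
```

Here only X → Y → Z is closed by the citation X → Z, while Y → Z → W and X → Z → W remain open, so one third of the two-paths are transitive.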

Results and analysis

Descriptive statistics.

We begin by presenting some statistics about the patent citation network. Table 3 presents the patents with the highest number of citations. The citation distribution for the complete network is provided in Fig 3 . From this figure, we can observe that only a few patents have a high citation count, while the majority of the patents receive very few citations.

https://doi.org/10.1371/journal.pone.0241797.t003

https://doi.org/10.1371/journal.pone.0241797.g003

In Table 4 , we list the highly influential companies or organisations with the highest number of granted patents within the aforementioned time period. We did not need to perform any name normalisation, since there were no discrepancies in applicant names in the dataset. This table presents only companies with 10 or more patents, which we label as ‘Prolific Applicant’ and use as a covariate in Section “Analytical Strategies”. Table 5 , on the other hand, describes the country-wise distribution of patents. It is interesting to note that even though the dataset is concerned with patents filed at the European Patent Office, most of the patents originate from the United States of America (U.S.). A summary of the descriptive statistics of the complete citation network is presented in Table 6 . As indicated by the extremely low density, the network is extremely sparse. The low centralization score also indicates that the citations are not concentrated on only a few patents. This is in agreement with the findings of An and Ding [ 51 ] with respect to academic citations. The degree of reciprocity is also very low: only about 1% of citations are mutual, owing to the lack of forward references. The reciprocity that does appear in the network arises from some discrepancies in the dataset, i.e., from the presence of some cycles within the network. Transitivity is also low: about 5% of the patents cite their references’ references.

https://doi.org/10.1371/journal.pone.0241797.t004

https://doi.org/10.1371/journal.pone.0241797.t005

https://doi.org/10.1371/journal.pone.0241797.t006

We can observe that each patent receives about 0.85 citations. There is a high correlation between indegree and betweenness, suggesting the likelihood that patents that are cited across different categories are also the ones with a higher number of citations. There is also a strong correlation between indegree and outdegree, indicating the tendency that highly cited patents often cite more patents.

Fig 4 presents the degree distribution of the full citation network. Both indegree and outdegree show similar patterns, with the majority of the patents having both high indegree and outdegree.

https://doi.org/10.1371/journal.pone.0241797.g004

Fig 5 shows the distribution of degree centrality of all patents. We can observe that a large number of nodes have low centrality, whereas only a few nodes have high centrality.

https://doi.org/10.1371/journal.pone.0241797.g005

Endogenous network formation processes contribute significantly to citation formation, and to network inference in general, and should be taken into account to improve the quality of inference. A good network generating model should fit density, transitivity, and degree distribution in addition to the other effects that researchers want to study. For computational purposes, we chose to work with sector-based citation networks, in which the citations within each sector are preserved, instead of the complete network. In Table 7 we present the covariate effects when the ERGM is fitted on the sector E citation network. “Edges” (arcs) act like regression intercepts and are required to fit the density of the network under study.

https://doi.org/10.1371/journal.pone.0241797.t007

A large coefficient for “AltKTriangleT” implies that a reference’s reference is more likely to be cited. The negative coefficient for “AltTwoPathsTD” indicates that citations that do not form a triangle are less likely to occur. Taken together, “AltKTriangleT” and “AltTwoPathsTD” corroborate that citations tend to be transitive, which could be because inventors generally snowball-sample the literature to discover and learn about existing technologies within the sector. The negative coefficient for “AltInStars” implies that, compared with the other effects, there is no strong preferential attachment.

From Table 7 one can see that, besides transitivity, another strong effect is the overlapping category. The high positive value of the corresponding parameter means that when two patents share the same category, they are about 300 times (e^{5.71} ≈ 300) more likely to cite each other. Also, if the patents were issued in the same country, they are much more likely to cite each other: a patent from a particular country is about 3 times more likely to cite a patent from that same country than one from another country. This finding is in line with the research carried out by Singh and Marx [ 14 ], who showed that there is a 77% greater likelihood of within-country knowledge flow in the U.S. than across national borders. While this may seem derivative, it was particularly interesting to see the same effect in a different dataset using a robust approach like ERGMs. Taken together, the above two observations (patents are more likely to be cited when they belong to the same categories and when they originate from the same country) also indirectly support the findings of Agrawal et al. [ 13 ], who pointed out that both geographical and social proximity have a positive influence on patent citations.
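
The conversion from an ERGM coefficient to the "times more likely" figure is an exponentiation, since coefficients are on the log-odds scale. The short sketch below reproduces the arithmetic for the overlapping-category coefficient of 5.71 quoted above; strictly speaking the result is a ratio of conditional odds. The same-country coefficient is not quoted in the text, so the sketch only shows which value a threefold ratio would imply, as an assumption.

```python
import math


def odds_ratio(coefficient):
    """Multiplicative change in the conditional odds of a citation."""
    return math.exp(coefficient)


def implied_coefficient(ratio):
    """Log-odds coefficient that corresponds to a given odds ratio."""
    return math.log(ratio)


category_overlap_ratio = odds_ratio(5.71)       # roughly 300, as stated in the text
same_country_coef = implied_coefficient(3.0)    # about 1.1, implied value only
```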

We also observe a tendency to cite patents written in the same language. However, compared with “same country” and “overlapping category”, the effect of patent language on the citation pattern is not very strong. In the literature, there have been studies on whether ethnicity plays a role in information diffusion [ 11 ]. For instance, Kerr [ 12 ] compared knowledge production and diffusion between two different ethnic communities in the U.S. and showed that poor access to codified and tacit knowledge regarding new innovations does contribute to slow technology diffusion. While this is very relevant to our study, verifying whether the same holds true in our context was out of scope, since we did not have information regarding the ethnicity of the inventors of patents in the MAREC dataset.

Often it is desirable to identify an important patent as early as possible. A simple measure of the value of a patent is the number of citations the patent receives. However, the number of citations depends on both time and many other factors that we want to study. We measure these effects using the “receiver effect”. The corresponding parameter values in Table 7 measure the effect of different covariates on patent value. It is intuitive that, in general, recent patents will have fewer citations than older ones. However, our results clearly demonstrate (see Receiver Recency effects in Table 7 ) that more recent patents are more likely to be cited. This could be attributed to the fact that newer patents often provide incremental updates on an existing patented technology, and hence they do not necessarily cite older patents.

Currently, a trend toward interdisciplinary research is observed in science. To the best of our knowledge, nobody has yet studied the impact of interdisciplinarity on patent value. We consider a patent interdisciplinary if it has many categories in its classification. The effect is hard to predict intuitively, but the results in Table 7 suggest that interdisciplinarity does not increase patent value. In general, patents with fewer categories are more likely to be cited.

Not surprisingly, the patents from top companies have higher value and are more likely to be cited, and this is confirmed by our results. Besides, our results clearly show that on average Swiss patents are more likely to be cited.

Table 8 presents the ERGM fitted on the Sector B citation network. The number of patents (nodes) in this network is almost 7 times that in Sector E. However, the results presented in Tables 7 and 8 are qualitatively the same. It should be noted that, when analyzing the citation networks of each sector with ERGMs, we operate under the assumption that the sector citation networks are independent of each other.

https://doi.org/10.1371/journal.pone.0241797.t008

Analysis of citation network among top companies

Often in a large network, we can witness the “small-world” effect, where certain nodes can be reached within a few hops of each other. This effect has been studied extensively in the literature [ 92 , 93 ]. Bialonski et al. [ 94 ] showed that small-world characteristics of interaction networks occur due to the spatial sampling of dynamical systems. Ansmann and Lehnertz [ 95 ] proposed the use of surrogate networks, which preserve the strength of the full network, in order to study the network characteristics. Surrogate networks also help to assess the presence of small-world structure, which can sometimes provide additional information about network-specific characteristics and thus aid in their interpretation.

In our case, the citation network among the 20 most prolific companies with regard to patents is presented in Fig 6 . Companies are represented as nodes, and each edge represents the citation counts between companies. The size of a node is proportional to the citations it receives. The network exhibits a core-periphery structure with some specific nodes acting as authorities . For instance, “Siemens AG” has a lot of incoming links but no outgoing links, suggesting that while other companies in the top 20 tend to cite patents from other companies, the leading ones do not necessarily cite others. This represents a hierarchy among prolific companies and makes the citations asymmetric. There are also companies, such as “BASF AG”, that cite other top companies a lot. This could be attributed to the fact that patent citations are often made for legal reasons and for completeness, as ensured by patent examiners. On the other hand, there are companies like “Siemens AG” which are cited a lot while themselves citing very few top companies. This reflects either that such companies work in a niche area and have an umbrella effect on other companies, or that certain patents held by such corporations are essential to innovation and knowledge production in certain areas, hence the large number of citations. Both kinds of companies help disseminate knowledge throughout the network by virtue of citing other companies’ patents. To illustrate this, Fig 7 shows the periphery structure for the topmost applicant ‘SIEMENS AG’, where a significant number of citations point to the central node. Because the full set of nodes and edges cannot be visualized clearly, we do not present the plot for the complete network. However, it is clear from Figs 6 and 7 that there is a peripheral structure in the network.

https://doi.org/10.1371/journal.pone.0241797.g006

https://doi.org/10.1371/journal.pone.0241797.g007

For the whole network we computed the list of hubs and authorities which is presented in Table 9 . It is interesting to note some applicants like “Siemens AG” appear in both the top-10 hubs and authorities list and the same effect is observed in the graph of top applicants. Another interesting observation is that the graph shows a “small-world” effect where any top company can reach another with only a few steps [ 96 ]. To statistically determine this effect we followed the procedure as listed below:

  • We computed the average shortest path length L and the clustering coefficient C of the network.
  • We then generated an ensemble of null-model networks, including an Erdős-Rényi random graph and a Maslov-Sneppen random graph.
  • Next we calculated the mean of the average shortest path length L r over the ensemble of null-model networks and analogously computed C r .

https://doi.org/10.1371/journal.pone.0241797.t009

The values we obtained, λ = 0.897 (≈ 1) and γ = 2.346 (> 1), verify the small-world effect in our network and thus substantiate our hypothesis.
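
The procedure can be sketched as follows, under several assumptions that are ours rather than stated in the text: λ = L / L_r and γ = C / C_r (the usual small-world ratios), the network is treated as undirected, path lengths are averaged over reachable pairs only, and only an Erdős-Rényi style null model with the same number of nodes and edges is generated (the Maslov-Sneppen rewiring is omitted). The ring lattice at the end is a toy input, not the company network.

```python
import random
from collections import deque
from itertools import combinations


def avg_shortest_path(adj):
    """Mean BFS distance over all reachable ordered node pairs."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")


def avg_clustering(adj):
    """Mean local clustering coefficient (0 for nodes with fewer than 2 neighbours)."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs.append(2 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)


def random_graph_like(nodes, n_edges, rng):
    """Null model: random undirected graph with the same node and edge counts."""
    adj = {u: set() for u in nodes}
    for u, v in rng.sample(list(combinations(nodes, 2)), n_edges):
        adj[u].add(v)
        adj[v].add(u)
    return adj


def small_world_ratios(adj, n_null=20, seed=0):
    """Return (lambda, gamma) = (L / L_r, C / C_r) against the null ensemble."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    nulls = [random_graph_like(nodes, n_edges, rng) for _ in range(n_null)]
    L_r = sum(avg_shortest_path(g) for g in nulls) / n_null
    C_r = sum(avg_clustering(g) for g in nulls) / n_null
    return avg_shortest_path(adj) / L_r, avg_clustering(adj) / C_r


# Toy input: ring of 12 nodes, each linked to its 2 nearest neighbours on both sides.
ring = {i: {(i + d) % 12 for d in (-2, -1, 1, 2)} for i in range(12)}
lam, gamma = small_world_ratios(ring)
```

A network is then read as small-world when λ is close to 1 and γ is clearly above 1, which is the pattern reported above.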

Conclusions

In this study, we review patent citations within the scope of citation networks. Following the trend of existing studies, we have presented detailed descriptive analyses by indicating the top patents, the most cited patents, and the properties of the citation networks. In addition, we use ERGMs to carry out detailed statistical analyses of the citation formation mechanisms in the network. We demonstrate that various patent characteristics have an effect on citations. Both technical features and social processes, such as homophily, preferential attachment, and transitivity, can lead to the formation of citations. In particular, our statistical analysis confirms that patents are much more likely to cite each other if the patent applicants are from the same country, if the patents are classified in the same category, or if the patents are written in the same language. We have found that recent patents are more likely to be cited. Finally, we observed that being interdisciplinary had no positive impact on patent citations. We believe that by employing ERGMs on patent citation networks we are opening a new avenue for researchers to investigate the ways in which existing innovation affects future innovation.

Moreover, we provide analyses of the citations among the top companies in the citation network. We observed that companies (or organizations), depending on their network positions, play different roles in the citation network. It corroborated our intuitive hypothesis that the ones receiving more citations tend to be more influential. Through our analyses, we have demonstrated that there is a significant disparity in the number of patents granted and the number of citations received by companies in any particular category. Citations between companies are shown to be asymmetrical. These are signals of hierarchy among the companies within the citation network. Most companies tend to constitute a small world where each company can be reached from another with a few steps in the citation network. Thus, in the citation network, sectors can be effectively characterized by a somewhat polycentric structure. In this structure, there appears to be a high level of cohesion within lower-level categories and a moderate level of cohesion across higher-level categories depending on the level of the IPC hierarchy.

We would like to specify that our study is by no means complete and is limited by several constraints, such as assuming that sector-wise citation networks are independent and not accounting for external factors, like legal issues or the organizational structure of companies, that drive applicants to cite certain patents. Incorporating such factors in our study would have required assembling a much larger dataset with information from various heterogeneous sources, which was out of scope for us. In our study we also did not consider non-EP citations, which was again dictated by the limitations of our dataset: we did not have information on patents either outside of our considered time span or originating from other patent offices. It is still a big challenge in the literature to work with patent datasets originating from multiple patent offices, mostly due to mismatches in the classification systems adopted by each of them ( e.g., IPC vs CPC vs USPC).

Having said that, our current study is an exploratory one, and it paves the way for a set of interesting questions that are worthy of further investigation. Some of them could be directed towards investigation of the proliferation of knowledge from patents to scholarly papers and vice-versa. In particular, we would like to study the knowledge flows occurring between countries across the world and across several sectors while focusing on how the patent production in one country or geographical location affects the other both in terms of patents and scientific publications. Another prospect could be to study the evolution of citations by incorporating temporal aspects in the citation network.

To conclude, this paper provides a descriptive analysis of the patent citation network in order to describe the structural properties of the network and their implications for the patents and their citations. It also delivers a deeper analysis of the “prestige” network of top applicants, stressing the interactions among them and their implications for knowledge flow. Finally, it studies the effects of various self-defined covariates on the patent citation forming mechanisms using ERGMs. The findings of this research substantiate previous work along similar lines, especially in terms of homophily and sociological aspects. However, this is the first study focusing on patent citation forming mechanisms using ERGMs that also deals with the effects of factors such as patent recency, overlapping and multiple categorization, cross-country patent influence, and the interaction of prolific applicants.

  • 2. Stephen Pericles Ladas. Patents, trademarks, and related rights: national and international protection, volume 1. Harvard University Press, 1975.
  • 8. Ivan Haščič, Nick Johnstone, Fleur Watson, and Christopher Kaminker. Climate Policy and Technological Innovation and Transfer: An Overview of Trends and Recent Empirical Results. OECD Environment Working Papers 30, OECD Publishing, 2010.
  • 16. Dean Lusher, Johan Koskinen, and Garry Robins. Exponential random graph models for social networks: Theory, methods, and applications. Cambridge University Press, 2013.
  • 36. Mihai Lupu, Katja Mayer, Noriko Kando, and Anthony J Trippe. Current Challenges in Patent Information Retrieval. Springer Publishing Company, Incorporated, 2nd edition, 2017.
  • 41. Adam B Jaffe, Manuel Trajtenberg, and Michael S Fogarty. The meaning of patent citations: Report on the NBER/Case-Western Reserve survey of patentees. Technical report, National Bureau of Economic Research, 2000.
  • 46. Verena Bauer, Dietmar Harhoff, and Göran Kauermann. A smooth dynamic network model for patent collaboration data. arXiv preprint arXiv:1909.00736, 2019.
  • 49. Mayank Singh, Arindam Pal, Lipika Dey, and Animesh Mukherjee. Innovation and revenue: deep diving into the temporal rank-shifts of fortune 500 companies. In Proceedings of the 7th ACM IKDD CoDS and 25th COMAD, pages 268–274, 2020.
  • 52. Dean Lusher, Johan Koskinen, and Garry Robins. Exponential random graph models for social networks: Theory, methods, and applications. Cambridge University Press, 2013.
  • 53. Matthew O Jackson. Social and economic networks. Princeton University Press, 2010.
  • 54. Noah E Friedkin. A structural theory of social influence, volume 13. Cambridge University Press, 2006.
  • 68. Mark S Handcock, Garry Robins, Tom Snijders, Jim Moody, and Julian Besag. Assessing degeneracy in statistical models of social networks. Technical report, Citeseer, 2003.
  • 70. Erich L Lehmann and George Casella. Theory of point estimation. Springer Science & Business Media, 2006.
  • 74. Laurent Younes. Estimation and annealing for Gibbsian fields. In Annales de l’IHP Probabilités et statistiques, volume 24, pages 269–294, 1988.
  • 77. Thomas Kesselring. Jean Piaget, volume 512. CH Beck, 1999.
  • 83. Shan Jiang. Statistical modeling of multi-dimensional knowledge diffusion networks: An ERGM-based framework, 2015.
  • 85. Blaise Cronin and Cassidy R Sugimoto. Beyond bibliometrics: Harnessing multidimensional indicators of scholarly impact. MIT Press, 2014.
  • 86. Ying Ding, Ronald Rousseau, and Dietmar Wolfram. Measuring scholarly impact. Springer, 2016.
  • 89. M. Singh, A. Jaiswal, P. Shree, A. Pal, A. Mukherjee, and P. Goyal. Understanding the impact of early citers on long-term scientific impact. In 2017 ACM/IEEE Joint Conference on Digital Libraries (JCDL), pages 1–10, 2017.

Review article: Equity in park green spaces: a bibliometric analysis and systematic literature review from 2014-2023

  • 1 Ecological Technique and Engineering College, Shanghai Institute of Technology, Shanghai, China
  • 2 Pan Tianshou College of Architecture, Art and Design, Ningbo University, Ningbo, China

With the global increase in population and the accelerated process of urbanization, the equitable access to park green spaces by diverse communities has become a growing concern. In order to provide an overview of the developmental trends, research focal points, and influencing factors in the study of equity in park green spaces, this paper employs bibliometric analysis and the visualization software CiteSpace to systematically analyze relevant literature in the Web of Science core database from 2014 to December 2023. The findings reveal an increasing emphasis on the research of equity in park green spaces, delineated into two distinct phases: a period of gradual exploration (2014–2018) followed by rapid development (2018 to present). Key nations contributing to research in this domain include China, the United States, and Germany. Currently, the research focus in this field primarily centers on the analysis of park green space equity based on primary social fairness, analysis of park green space equity based on vulnerable groups, and the relationship between park green spaces and health. The influencing factors of park green space equity mainly involve regional economic factors and government planning, as well as residents’ economic capabilities and racial discrimination. Future research directions could include studying park green space equity among different demographic groups, emerging assessment methods and data, park green space equity based on perceived accessibility, and the relationship between park green space equity and surface temperature.

1 Introduction

According to the latest United Nations report, over 56% of the global population currently resides in urban areas, and urban populations are projected to constitute 70% of the world’s total population by 2050 ( UN-Habitat, 2022 ). Throughout the process of urbanization, extensive land development has inflicted severe damage upon the ecological environment, resulting in a reduction of biodiversity and a decline in air quality. Research indicates that urban parks and green spaces yield diverse benefits by providing a secure habitat and favorable living conditions for various species, preserving the ecological balance of urban areas, improving air quality ( Kroeger et al., 2014 ), and mitigating the urban heat island effect ( Donovan and Butry, 2009 ).

As integral components of urban ecosystems and pivotal platforms for recreational activities, park green spaces play a crucial role in ensuring the sustainable development of cities. However, the widening gap between socioeconomic classes has led to conspicuous inequalities in the distribution and utilization of these spaces among different societal groups. The equity of park green spaces has gradually become a significant indicator for measuring social sustainability, attracting widespread attention from academia, government bodies, and even the planning sector. For instance, the European Environment Agency recommends that urban residents should have access to green spaces within a 15-min walk, equivalent to approximately 900–1,000 m ( Stanners and Bourdeau, 1995 ). Some scholars propose the 3-30-300 rule, suggesting that everyone should be able to see at least 3 mature trees from their home; there should be a tree canopy cover of at least 30% nearby; and individuals should reside within a 300-m radius of high-quality public green spaces (covering at least 0.5 ha) ( Browning et al., 2024 ). In the United Kingdom, it is suggested that urban residents have 2 ha of urban green space within a 300-m radius of their residences ( Handley et al., 2003 ). Nevertheless, some of these recommendations for park green space provision may be impractical, and equity standards may need to reflect diverse needs across different cultural backgrounds and regional environments, rendering the evaluation of equity in park green spaces complex and multifaceted.

Scientific knowledge mapping is a visual representation of the development process and structural relationships within a specific field of knowledge. CiteSpace is an information visualization software designed to analyze scientific knowledge maps of literature related to a particular research topic. It generates visual maps by selecting specific information such as authors, institutions, keywords, or co-citations, allowing for the analysis of the current status, hotspots, and forefront directions of research in the field ( Chen, 2017 ). Although scholars have conducted analyses of citation and co-occurrence relationships among relevant literature using CiteSpace, these analyses have been limited to a simple interpretation and factual presentation of scientific knowledge graphs and bibliometric results, with constraints on the detailed content analysis of specific articles. With the increasing emphasis on research regarding park green spaces and their equity, a growing body of research outcomes is emerging. An analysis of the current status, hotspots, influencing factors, and dynamic frontiers of equity in park green spaces is beneficial for understanding the theoretical underpinnings and trends of this field, contributing significantly to the advancement of global sustainable development.

Therefore, this paper utilizes bibliometric analysis and content analysis, employing CiteSpace for literature measurement and visualization, to quantitatively analyze the current state of equity in park green spaces research. It seeks to summarize and categorize research hotspots, influencing factors, and frontiers. The objectives of this study include: 1) providing a comprehensive review and extension of the concept and content of equity in park green spaces; 2) quantifying and elucidating the publication timeline, contributing countries, research hotspots, and themes in equity of park green space research; 3) delving into the influencing factors of equity in park green space; 4) highlighting key research areas within the three aforementioned directions; 5) acknowledging the limitations, future prospects, and challenges of this study. The results aim to offer valuable insights for the future global planning and management of park green spaces, providing theoretical guidance for advancing equity in park green space research.

2 Conceptual

2.1 The concept of equity in park green spaces

2.1.1 Park green spaces

This paper, following the definition of Park Green Spaces outlined in China’s ‘Classification Standard for Urban Green Spaces’ (CJJT85-2017) ( Standard for Classification of Urban Green Space, 2017 ), defines Park green spaces as areas open to the public primarily for recreational purposes, while also serving ecological, landscape, cultural, educational, and emergency evacuation functions, with certain recreational and service facilities.

2.1.2 Equity in park green spaces

In the 1970s, significant inequities emerged in western countries, with growing disparities based on gender, class, and ethnicity. William Lucy’s ( Lucy, 1981 ) “Five Subconcepts” and Bruce E. Wicks’ ( Wicks and Crompton, 1986 ) “Three Criteria” were representative theoretical frameworks in the study of fairness during that period. They argued that the allocation of public resources should consider both quantity and location, ensuring equal opportunities for everyone based on meeting minimum needs standards. Towards the end of the 20th century, in response to the social phenomenon of class stratification, scholars such as Chun Man Cho (2003) ( Cho, 2004 ) proposed that Park green spaces should be distributed equitably among different spatial units, income levels, racial and political groups, with due consideration to the specific needs of marginalized populations. From these scholars’ arguments, we can observe a shift in the concept of fairness from mere equality towards justice. This signifies a deepening understanding of fairness and reflects the current societal development, which involves the differentiation of social group needs.

2.2 Content and development of equity in park green spaces

The public nature of park green spaces determines their recognition as a form of public service. Therefore, this article draws on the research on equity in public services to clarify the content of equity in park green spaces. Table 1 compiles the relevant research findings from scholars:

www.frontiersin.org

TABLE 1 . Scholars’ perception of the equity of park green spaces.

Based on the historical stages of modern societal development, this article categorizes the content of equity in park green spaces into three phases, as illustrated in Figure 1 :

1. Before the 1970s, international mainstream research focused on territorial equality, specifically the comparison of per capita park green space indicators across different regions.

2. From the 1970s to the 1990s, attention shifted to distribution balance, specifically the issue of spatial distribution balance of urban park green spaces in different geographical areas.

3. After the end of the 20th century, research on social equity differentiated into two directions: primary social equity, focusing on the differences in the ability and opportunities of different groups to access park green space services, and advanced social equity, advocating for providing park green spaces precisely based on the needs of different groups.

FIGURE 1 . The main phase of the equity in park green space study.

The article highlights that identifying the components of equity also requires relating them to the historical stages of socio-economic development. Exploring the historical reasons behind the formation of the concept of equity helps cities in different countries and regions to recognize their current equity issues in light of their respective developmental stages. This understanding can assist in formulating practical and actionable strategies for future development. Therefore, it becomes essential to analyze the social context associated with these studies: before the 1970s, western societies were in a period of post-war reconstruction with government-led elite decision-making. The focus was on increasing the quantity and layout of park green spaces. However, since the 1970s, western societies have transitioned towards postmodernity, challenging elite decision-making and giving more importance to the actual needs of the public. Nevertheless, this stage also had its limitations: the understanding of public needs was relatively simplistic, and studies on the accessibility of park green spaces and the formulation of public policies often relied on a basic assumption that easy transportation would meet the public’s needs. Since the end of the 20th century, sociological research has become more complex, with social stratification and cultural diversification leading to diversified demands and the differentiation of needs among different social groups. Identifying the needs of different groups and providing more precise services for them has become the focus of research. Besides historical reasons, the phased development of equity research is also associated with advancements in science and technology. In the 1990s and 2000s, continuous enhancements and optimizations of GIS technology and analytical models, along with the refinement of big data and Internet maps, made spatial fairness analysis feasible across diverse demographics, and scholars began exploring equity from the standpoint of various demographic needs. With the expansion of Chinese cities and an aging population, the Chinese government has proposed a strategy for common prosperity through the equalization of basic public services, incorporating green equity into residential living circle planning. The research on equity has also rapidly progressed from the first and second stages towards the third stage.

3 Materials and methods

3.1 Search methodology

This study draws upon the Web of Science (WOS) Core Collection database as the primary data source. To compile the relevant literature on equity in park green space as comprehensively as possible, a refined search strategy was formulated by incorporating synonymous terms associated with parks, green spaces, and equity. After several adjustments, the final search string was TS= (“park green space*” OR “urban green space*” OR “urban park*” OR “green space*”) AND (“equity*” OR “justice*” OR “accessibility*”). The selected document types were restricted to “Article” and “Review Article”. The temporal scope spans from 2014 to December 2023. The search was conducted on 10 December 2023 and retrieved 1,168 articles.
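As an illustration of how such a search string can be assembled from the two synonym groups, the following minimal Python sketch builds a topic query. The function name `build_topic_query` is hypothetical, and wrapping both groups inside a single `TS=( ... )` is an assumption about the intended grouping, not a statement of how the authors entered the query.

```python
def build_topic_query(space_terms, equity_terms):
    """Join two synonym groups into one topic (TS=) query string."""
    def group(terms):
        # Quote each phrase and combine the synonyms with OR.
        return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"
    # Assumption: both groups are scoped under a single TS= field tag.
    return f"TS=({group(space_terms)} AND {group(equity_terms)})"

# Terms taken from the search string reported in this section.
space_terms = ["park green space*", "urban green space*", "urban park*", "green space*"]
equity_terms = ["equity*", "justice*", "accessibility*"]

query = build_topic_query(space_terms, equity_terms)
print(query)
```

Keeping the synonym groups as lists makes it easy to record each adjustment of the search strategy before the final string is fixed.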

3.2 Article inclusion criteria

To ensure the quality and relevance of the retrieved articles, each of the three authors conducted a detailed examination of the titles and abstracts of every potential article. Articles meeting the following inclusion criteria were then downloaded:

1. The study focused on park green space, encompassing various types such as comprehensive parks, community parks, and recreational gardens (according to China’s ‘Classification Standard for Urban Green Spaces’ CJJT85-2017 (2017), a comprehensive park is a park with an area ≥5 hm², suitable for various outdoor activities, and equipped with comprehensive recreational and supporting management service facilities);

2. The content addressed issues related to fairness, accessibility, spatial justice, or other relevant aspects of park green space;

3. The article was published in English and had full-text availability.

Articles not meeting these criteria were excluded from the analysis. The data collection, screening, and analysis adhered to the outlined evaluation criteria, and any disputed articles were thoroughly discussed among all co-authors until a consensus was reached. Finally, 333 articles met the inclusion criteria ( Figure 2A ), comprising 319 articles and 14 reviews ( Figure 2B ).

FIGURE 2 . Bibliometric analysis. (A) Flowchart for screening retrieved articles. (B) Number and type of articles.

3.3 Bibliometric analysis combined with content analysis

This article will utilize CiteSpace 6.2.4 and employ both bibliometric analysis and content analysis methods to comprehensively illustrate the current research status and development trends of park green space equity from multiple perspectives. The specific structure of this paper is outlined as follows (see Figure 3 ). Firstly, in Section Four, a bibliometric analysis utilizing CiteSpace is employed to examine publication timelines, contributing countries, and high-frequency keywords, accompanied by brief explanations. Subsequently, Sections Five and Six adopt content analysis to systematically review selected literature in terms of research hotspots and influencing factors. Finally, Section Seven analyzes the limitations, prospects, and challenges of the study.

FIGURE 3 . Flowchart of research methodology.

3.4 CiteSpace parameters

The bibliometric analysis of the dataset was conducted using CiteSpace 6.2.4. The source data were exported from the Web of Science (WOS) database with the “Full Record and Cited References” option in “Plain text file” format and then imported into CiteSpace. The parameter settings were configured as follows: “Time Slicing = From 2014, JAN To 2023, DEC” (the time range of the literature), “Node Types = Country OR Keyword” (selecting either country-of-publication analysis or keyword analysis), and “Pruning = Pathfinder AND Pruning sliced networks AND Pruning the merged network” (trimming each slice and the merged network to highlight important network structures). All other parameters were kept at their default values.

4 Bibliometric review of equity in park green spaces

4.1 Bibliometric analysis

4.1.1 Annual publication trend

From 2014 to the present, based on the analysis of publication timelines (see Figure 4 ), the research can be categorized into the following two major phases:

• Exploratory Phase (2014–2018): This period, also referred to as the foundational research phase, witnessed a gradual fluctuation and growth in the annual publication volume of articles related to equity in park green space. Following a substantial period of applied research and foundational consolidation, the fundamental concepts and measurement methodologies of equity in park green space have essentially crystallized.

• Rapid Development Phase (2019-Present): The 264 articles published from 2019 to 2023 amount to nearly four times the 69 articles published from 2014 to 2018. This sharp surge in publication volume owes much to the availability of refined network data from sources such as big data and internet mapping. Scholars have begun to analyze the equity of parks and green spaces from the perspective of resident demand. Grounded in GIS analysis, accessibility evaluation in this phase has come to encompass four major categories of model: container models, coverage models, gravity models, and two-step mobility search models (refer to Table 2 ). The significance of equity in park green space has progressively gained prominence.
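Of the four model families listed above, the two-step search model is the least self-explanatory. Assuming it corresponds to the widely used two-step floating catchment area (2SFCA) method, a minimal Python sketch looks as follows. The function name and the toy park sizes, populations, and distances are hypothetical and are not drawn from any study reviewed here.

```python
def two_step_fca(supply, demand, dist, threshold):
    """Return one accessibility score per residential unit.

    supply[j]  : size of park j (for example, hectares)
    demand[i]  : population of residential unit i
    dist[i][j] : travel distance from unit i to park j
    threshold  : catchment radius, in the same unit as dist
    """
    n_units, n_parks = len(demand), len(supply)
    # Step 1: supply-to-demand ratio of each park within its catchment.
    ratios = []
    for j in range(n_parks):
        served = sum(demand[i] for i in range(n_units) if dist[i][j] <= threshold)
        ratios.append(supply[j] / served if served > 0 else 0.0)
    # Step 2: for each unit, sum the ratios of all parks within reach.
    return [
        sum(ratios[j] for j in range(n_parks) if dist[i][j] <= threshold)
        for i in range(n_units)
    ]

# Hypothetical toy data: 2 parks, 3 residential units, 500 m catchment.
supply = [2.0, 1.0]
demand = [1000, 500, 1500]
dist = [[300, 900], [400, 450], [1200, 200]]
scores = two_step_fca(supply, demand, dist, threshold=500)
print(scores)
```

In this toy case the middle unit reaches both parks and therefore receives the highest score, which is the kind of spatial disparity that equity studies then compare across social groups.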

FIGURE 4 . Annual publications from 2014 to 2023.

TABLE 2 . The summary of accessibility evaluation models.

4.1.2 Study countries

The node types were set to “country” (Node Types = country), and a time-zone chart of publishing countries, highlighting the top 10 contributors, was generated (see Figure 5 ). In this network, N = 57 and E = 73, indicating scholarly contributions from 57 countries and regions globally and 73 collaborative relationships among them. In the time-zone graph, each country’s vertical alignment with the years represents the year of its first publication. The size of a country’s node corresponds to its total number of publications. Figure 5 reveals that several countries, including the United States, the United Kingdom, and Germany, initiated research in this field by publishing articles in 2014, followed by Chinese scholars in 2015. Notably, publications by Chinese scholars have surged since 2017, overtaking those of the United States and reaching a total of 168 (see Figure 6 ). China’s total publication count is three times that of the second-ranking United States (56) and more than five times that of the third-ranking Germany (30). However, as indicated by Table 3 , China demonstrates lower centrality, signifying a deficit in international collaboration. Spain exhibits the highest centrality, maintaining close academic exchanges with nations worldwide, followed by Australia and the United States (the closer the collaboration between one country and others, the higher its centrality).
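To make the notion of centrality concrete, the sketch below computes betweenness centrality on a small collaboration network. It assumes that the centrality reported by CiteSpace is betweenness centrality, and it uses placeholder node names rather than the countries or values in Table 3. The function implements Brandes' algorithm for an undirected, unweighted graph.

```python
from collections import deque

def betweenness(adj):
    """Unnormalized betweenness centrality (undirected, unweighted graph)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of BFS discovery
        preds = {v: [] for v in adj}    # predecessors on shortest paths
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in score.items()}

# Hypothetical network: one hub collaborating with three partners
# that do not collaborate with each other.
adj = {
    "Hub": ["P1", "P2", "P3"],
    "P1": ["Hub"],
    "P2": ["Hub"],
    "P3": ["Hub"],
}
centrality = betweenness(adj)
print(centrality)
```

The hub lies on every shortest path between its partners, so it has the highest centrality even though the measure says nothing about how many papers it publishes. This is why a country with many publications can still show low centrality.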

FIGURE 5 . Time-zone view of contributing countries.

FIGURE 6 . Number of publications annually of the top five countries.

TABLE 3 . Top 10 countries in the field of equity in park green spaces from 2014 to 2023.

4.2 Analysis of research hotspots

4.2.1 Keyword co-occurrence analysis

The node types were designated as ‘keyword’ (Node Types = keyword), and a co-occurrence chart of high-frequency keywords (with a frequency of 15 or more) was generated (see Figure 7 ). The co-occurrence analysis graph primarily exists in the form of nodes and connecting lines. When keywords appear in the same papers, there is a connection between the two nodes. The size of a node indicates the frequency of occurrence of that keyword. The varying sizes of different-colored rings representing a specific keyword correspond to the volume of publications for different years, with the pink ring indicating high centrality of the keyword. The analysis of keyword co-occurrence serves to unveil research hotspots and frontiers within this domain. In addition to thematic terms, high-frequency keywords predominantly encompass the following categories:

• Health and Physical Activity: “health (89 occurrences),” “public health (27 occurrences),” “mental health (17 occurrences),” “healthcare (19 occurrences),” “physical activity (95 occurrences).”

• Ecological Environment: “ecosystem services (65 occurrences),” “environment (24 occurrences),” “exposure (15 occurrences).”

• Human Element: “people (16 occurrences).”
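The counting behind a co-occurrence network of this kind can be sketched in a few lines of Python. This is a simplified illustration, not CiteSpace's implementation, and the three keyword lists are hypothetical rather than taken from the 333 reviewed articles.

```python
from collections import Counter
from itertools import combinations

def keyword_cooccurrence(papers):
    """Count keyword frequencies (node sizes) and co-occurring pairs (links)."""
    freq = Counter()
    pairs = Counter()
    for keywords in papers:
        # Normalize case and drop duplicates within one paper.
        unique = sorted({k.lower() for k in keywords})
        freq.update(unique)
        # Every pair of keywords in the same paper forms one link.
        pairs.update(combinations(unique, 2))
    return freq, pairs

# Hypothetical keyword lists for three papers.
papers = [
    ["health", "physical activity", "accessibility"],
    ["health", "ecosystem services"],
    ["physical activity", "Health"],
]
freq, pairs = keyword_cooccurrence(papers)
print(freq.most_common(2))
print(pairs[("health", "physical activity")])
```

Filtering `freq` by a minimum count (15 in this study) would then select the high-frequency keywords to display.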

FIGURE 7 . Keyword co-occurrence analysis.

These findings indicate that, as research delves deeper, amidst the intensification of urbanization and environmental degradation, future studies are likely to witness an increase in research focused on the relationship between the ecological services of parks and green spaces and public health. Emphasis will particularly be placed on understanding the needs of different demographic groups.

4.2.2 Keyword clustering timeline analysis

Building upon the keyword co-occurrence analysis ( Figure 7 ), we selected ‘K’ and ‘LLR’ to cluster the literature keywords, using the log-likelihood ratio (LLR) algorithm to identify top-ranking terms as cluster labels. The keyword cluster timeline provides an analytical summary of the primary research directions in the development of equity in park green space over time (see Figure 8 ). In this representation, 15 lines correspond to 15 clusters, and the position of each circle denotes the year of the first appearance of the respective keyword. Arc-shaped connecting lines indicate instances where two keywords co-occurred in the same article. The weighted mean silhouette (S = 0.8794) and modularity (Q = 0.7237) obtained from the cluster analysis indicate a significant clustering structure and reasonable results ( Chaomei, 2005 ): an S value greater than 0.5 means that the clustering result is considered reasonable, and a Q value between 0.3 and 1 means that the group structure produced by the clustering is significant, with higher values being better.
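For readers unfamiliar with the Q statistic, the following sketch computes Newman's modularity for an unweighted network and a given partition. It is an illustrative assumption that CiteSpace's Q follows this standard definition; the six-node graph and the cluster labels are hypothetical.

```python
def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over clusters c of [ L_c / m - (d_c / (2 m))^2 ], where m is the
    number of edges, L_c the number of edges inside c, and d_c the total
    degree of the nodes in c.
    """
    m = len(edges)
    internal = {}   # L_c: edges with both endpoints in cluster c
    degree = {}     # d_c: summed degree of cluster c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree.items()
    )

# Hypothetical network: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
community = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = modularity(edges, community)
print(round(q, 4))
```

Here Q equals 5/14, or about 0.357, which just clears the 0.3 threshold cited above. The reported Q = 0.7237 indicates a far more sharply separated cluster structure.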

FIGURE 8 . Keyword clustering timeline map.

From Figure 8 , it is evident that clusters such as #2 environmental equity, #3 environmental justice, #5 walking distance, and #13 green spaces have consistently been the research hotspots in the field of equity in park green space from 2014 to the present. Additionally, clusters like #0 mobile phone data, #1 geographically weighted regression, #4 variable catchment size, and #12 environmental gentrification, although starting later, have sustained ongoing research, indicating scholars’ enduring interest in exploring equity in park green space using various data and models, including the associated phenomenon of gentrification. The cluster #15 land surface temperature appeared later, has the shortest duration, and is still in the early stages of development, not yet forming symbiotic relationships with other clusters.

5 Research hotspots

Based on the focal points identified in the keyword co-occurrence analysis and keyword clustering timeline analysis summarized in Section 4.2 , the research highlights in the field of park green space equity can be categorized into three main areas: analysis of park green space equity based on primary social fairness, analysis of park green space equity focusing on vulnerable groups, and the relationship between park green spaces and health.

5.1 Analysis of park green space equity based on primary social fairness

Based on the content analysis and extension in Section 2.2 , it is evident that the analysis of park green space equity based on primary social fairness primarily focuses on the disparities in the ability and opportunity of different groups to access park green space services. The analytical approach is based on the multidimensional characteristics of park green spaces, such as quantity, quality, and type, as well as the demographic attributes of social groups within spatial units, to assess the ease of access to park green spaces for different social groups (whether different social groups can fairly enjoy access to park green spaces), thus determining equity.

The conclusions can be broadly categorized into two main types (see Table 4 ). The most common finding is the inequity of park green spaces, manifested in lower accessibility for the elderly, children, immigrants, and ethnic minority groups. Areas with higher housing prices tend to have higher accessibility to park green spaces, and park green space accessibility is higher in city centers than in suburbs. However, a small number of studies fail to confirm the existence of park green space inequity. Some scholars attribute this to the different spatial scales used in research. Common research scales include streets, neighborhoods, and residential areas. Different spatial scales within the same city may yield different results, especially when the area of the aggregation units far exceeds residents’ walking distances, which may lead to erroneous conclusions. Therefore, accessibility analysis should be conducted at finer scales whenever possible.

TABLE 4 . Analysis of park green space equity based on primary social fairness.

5.2 Analysis of park green space equity focusing on vulnerable groups

The focus of research on park green space equity has gradually shifted towards socially vulnerable groups such as low-income individuals, migrant workers, as well as physiologically vulnerable groups including children, the elderly, and women. Studies suggest that residents with lower socioeconomic status (SES) spend more time using parks and green spaces ( Shen et al., 2017 ). Moreover, Black residents have access to more parks than White residents, but the per capita park area is lower for Black residents. This phenomenon is attributed to parks in predominantly Black communities being more crowded, especially in metropolitan areas ( Julia et al., 2021 ). Faced with this inequality, if the government constructs a large number of parks and green spaces in areas lacking greenery, it can make surrounding residences more attractive, leading to an increase in housing prices, escalating living costs, and even displacement of original residents, ultimately resulting in gentrification ( Boone et al., 2009 ). Noteworthy examples include the High Line park in the United States ( Brisman, 2012 ) and the Cheonggyecheon Stream restoration project in South Korea ( Lim et al., 2013 ). To address this issue, scholars propose the “Just green enough” strategy, advocating for the construction of small-scale, dispersed parks and green spaces rather than large expanses ( Curran and Hamilton, 2012 ).

Chinese scholars, however, present two divergent views on this matter. One perspective suggests that vulnerable groups are gradually moving away from parks and green spaces. For instance, research indicates that park accessibility is negatively correlated with building age and positively correlated with housing prices ( Yu et al., 2020 ). Community parks built in earlier years exhibit higher accessibility, but in recent housing development, the elderly are increasingly distancing themselves from parks and green spaces. The other perspective contends that compared to ordinary residents, vulnerable groups such as immigrants, the unemployed, and residents of welfare housing are more likely to live in areas with better park usage rights ( Nesbitt et al., 2019 ). Two main reasons account for this: first, it is related to the urban green space planning strategy of Shanghai, which emphasizes even distribution of public green space; second, vulnerable groups reside in high-rise buildings in the city center, and although they have better park green space accessibility, their living conditions are extremely poor due to poor economic conditions. Concurrently, with the increasingly severe aging population in China, the elderly face a serious shortage of park green space usage. Studies reveal that different modes of transportation and varying park green space sizes can influence the degree of fairness in elderly people’s access to these spaces ( Meng et al., 2020 ). Addressing this issue, some scholars advocate for greater government leadership in planning, giving priority to the interests of vulnerable groups ( Hewko et al., 2002 ).

5.3 The relationship between park green space and health

The research is primarily categorized into two main areas: the relationship between park green space accessibility and health, and the relationship between park green space ecological services and health.

According to Table 5 , research on the relationship between park green space accessibility and health has mainly focused on Chinese cities, including large cities such as Beijing and cold-region (“ice and snow”) cities such as Harbin, and on Western countries such as the United States, the United Kingdom, and Canada. Health data are primarily obtained through questionnaire surveys, face-to-face interviews, and publicly available databases. Some scholars have studied the relationship between park green space equity and resident health from the perspectives of accessibility, distance, and coverage. The majority of research findings indicate a positive effect of park green space equity on residents’ psychological and physiological health ( Wolch et al., 2014 ; Dadvand et al., 2016 ; Rigolon, 2016 ). Park green space accessibility is negatively correlated with the incidence of various diseases among residents ( Kaczynski and Henderson, 2007 ), particularly cardiovascular and cerebrovascular diseases ( Haiya, 2011 ). However, a few studies find no correlation between health and park green space accessibility, proportion, or supply-demand ratio.

TABLE 5 . The association between accessibility of PGS and health.

The relationship between park green space ecological services and health has been explored by scholars. Some researchers ( Zhang et al., 2021c ) have conducted a review and summary of international research findings, outlining the promotion benefits of green spaces on health, factors influencing the effectiveness of health impacts, and mechanisms by which green spaces promote health. Others have delved into the physiological, psychological, and social correlations between green spaces and human health, proposing the concept of green medicine and discussing auxiliary approaches to maintaining the physical and mental health of urban residents ( Kaczynski et al., 2008 ).

6 Analysis of factors influencing equity in park green spaces

The factors influencing equity in park green spaces are both significant and complex, primarily involving regional economy and government planning, as well as residents’ economic capacity.

6.1 Regional economy and government planning

The influence of regional economy on the equity of park green spaces is primarily manifested in the level of supply. In China, as a vast country with uneven development, significant disparities exist between urban and rural areas, as well as between eastern and western regions, and between the north and south. In economically developed areas, local governments often possess more financial resources, enabling them to provide more park green space resources, thereby ensuring relatively sufficient supply in these areas ( Li et al., 2018 ). For instance, the central and southeastern coastal regions typically maintain higher levels of park green space coverage. However, in areas with harsh geographical conditions or relatively underdeveloped economies, such as mountainous or desert regions, constraints on park green space construction are considerable, resulting in inadequate supply, and in some cases, levels even below the national average ( Chen and Wang, 2013 ; Zhao et al., 2013 ). Conversely, in Western countries like the United States, park green space construction is typically funded through mechanisms such as property taxes. Nevertheless, due to limited local funds, there is often a need to seek financial support from non-profit organizations, institutions, and federal governments at the national and state levels ( Harnik and Barnhart, 2015 ). This competitive allocation of funds may exacerbate inequalities ( Joassart-Marcelli et al., 2011 ), with affluent areas receiving more support, further aggravating disparities in park green space distribution.

Government planning plays a crucial role in the spatial layout and construction of park green spaces. Unlike Western countries, where park green spaces are typically planned and constructed by the government and open to the public free of charge, in China, traditional planning tends to rely on metrics like per capita park area to gauge the level of supply. Consequently, park green spaces are often densely concentrated in areas with high population density, while areas with lower population density experience shortages ( Liyan et al., 2023 ). Furthermore, some local governments, in pursuit of political achievements and prestige, may excessively prioritize increasing park green space coverage, investing heavily in superficial projects, thereby neglecting the actual needs and usage patterns of park green spaces ( Zhao et al., 2013 ). This detachment between park green space planning and residents’ demands exacerbates the disparities between planning intentions and community needs ( Chen et al., 2017 ).

6.2 The economic capacity of residents

In China, park green spaces are often closely linked to high-quality housing, with high-quality parks typically adjacent to upscale residences, while lower-quality parks are connected to relatively lower-end housing. Higher-income groups usually have the ability to purchase premium housing, thus enjoying access to better-quality park green spaces. Conversely, lower-income groups may only reside in environments with poorer-quality housing, leading to disparities in park green space utilization among residents of different socioeconomic statuses, further exacerbating spatial inequalities in park green space distribution ( Chen et al., 2020 ; Wu and Rowe, 2022 ). Studies in Western countries have also confirmed this observation, showing that in capitalist societies, higher socioeconomic status (SES) groups typically have access to larger and higher-quality park green spaces, while lower SES groups often lack equivalent opportunities ( Zhu and Zhang, 2008 ; Alessandro and Jeremy, 2018 ; 2020 ; Browning et al., 2022 ).

Racial discrimination and segregation are also significant factors contributing to the inequitable distribution of park green spaces ( Wolch et al., 2014 ; Rigolon, 2016 ). Historically, due to the dominant position of white people, public facilities such as parks were predominantly located in affluent areas, while industrial zones and high-density housing were situated in areas inhabited by people of color and low-income individuals, a legacy that continues to have profound effects. Racial discrimination and segregation make it more difficult for communities of color to access park green spaces ( Dai, 2011 ), even when they reside closer to them. Issues such as smaller park areas, poor quality, and environmental pollution discourage residents from utilizing these spaces ( Alessandro and Jeremy, 2018 ), exacerbating the inequities in park green space distribution ( Wolch et al., 2013 ).

7 Prospects and challenges

7.1 Opportunities for future research

(1) Equity of Parks and Green Spaces for Different User Groups: Future research should focus on the fairness of park green space usage among different demographic groups, especially vulnerable populations such as the elderly ( Guo et al., 2019 ), children, and low-income individuals.

(2) Emerging Evaluation Methods and Data: Future studies can further integrate the latest information technologies, such as mobile signaling data ( Xiao et al., 2019 ; Heikinheimo et al., 2020 ), big data, Public Participation Geographic Information Systems (PPGIS) ( Brown et al., 2014 ), and the latest fairness evaluation criteria, such as the 3–30-300 rule ( Browning et al., 2024 ), to accurately simulate the diverse and complex transportation patterns and travel habits of urban residents. This approach will enable precise identification of the needs and behaviors of different population groups.

(3) Park green space equity based on perceived accessibility: Perceived accessibility refers to residents’ subjective perceptions and evaluations of the accessibility of park green spaces, emphasizing their perception of and satisfaction with the level of accessibility ( Sugiyama et al., 2008 ; Mass et al., 2009 ). Studies have shown that actual spatial accessibility does not completely match perceived accessibility ( Wang et al., 2015 ; Crouse et al., 2017 ; Xie et al., 2018 ; Huang and Lin, 2023 ), especially in areas with low perceived accessibility within cities ( Crouse et al., 2017 ). This difference is mainly attributed to factors such as residents’ familiarity with the objective environment, the completeness of accessibility measurement methods, and individual differences among residents. Some scholars have studied perceived accessibility by comparing the time or distance required to reach destinations. For example, some studies have compared perceived distances to actual distances for several different destinations and found that distances to destinations closer to home are often overestimated, meaning that the perceived distance is greater than the actual distance ( Wang et al., 2015 ). In the future, it is important to focus on the subjective perceptions of different groups and further refine research on park green space equity.

(4) Equity in park green space and Surface Temperature: In recent years, many regions globally have faced extreme high-temperature weather. Numerous studies have confirmed that parks and green spaces can effectively alleviate the urban heat island effect. Therefore, providing sufficient and equitable access to parks for residents is considered crucial for sustainable urban development. Scholars have already conducted relevant studies, such as in Dongguan, China, to explore whether the spatially equitable distribution of parks and green spaces contributes to mitigating the urban heat island effect and improving the urban thermal environment (Chao et al., 2022). Other researchers have assessed the carbon reduction potential of 65 urban parks based on the cumulative outdoor carbon reduction model of park surface temperature reduction curves, conducting a network analysis of the spatial accessibility of cooling zones in Da Xi Park (Du et al., 2023). Air temperature might be a more salient measure, even if it is harder to implement. Future research should delve deeper into the specific relationship between parks and green spaces and surface temperature. For instance, studies could investigate the scale of parks required within a given land area to reduce surface temperature effectively, and explore the connection between the equitable distribution of parks and green spaces and surface temperature.
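The 3-30-300 rule mentioned in item (2) is commonly stated as three thresholds: at least 3 trees visible from each home, at least 30% tree canopy cover in each neighbourhood, and no more than 300 m to the nearest park or green space. A minimal sketch of such a check is given below. The `Home` class, its field names, and the `meets_3_30_300` function are hypothetical illustrations, the thresholds are assumed to be inclusive, and the code is not taken from Browning et al. (2024).

```python
from dataclasses import dataclass


@dataclass
class Home:
    """Hypothetical per-dwelling record; field names are illustrative assumptions."""
    visible_trees: int        # number of trees visible from the dwelling
    canopy_cover_pct: float   # neighbourhood tree canopy cover, in percent
    park_distance_m: float    # distance to the nearest park or green space, in metres


def meets_3_30_300(home: Home) -> dict:
    """Evaluate each component of the 3-30-300 rule and the overall result."""
    checks = {
        "3_trees": home.visible_trees >= 3,
        "30_canopy": home.canopy_cover_pct >= 30.0,
        "300_m": home.park_distance_m <= 300.0,
    }
    # The rule as a whole is met only when all three components are met.
    checks["all"] = all(checks.values())
    return checks


if __name__ == "__main__":
    example = Home(visible_trees=4, canopy_cover_pct=25.0, park_distance_m=280.0)
    print(meets_3_30_300(example))
```

Reporting the three components separately, rather than a single pass/fail flag, makes it possible to see which component drives inequity for a given population group.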

7.2 Limitations of the study

• This study has limitations regarding the selection of databases: The literature was derived from the WOS core database. The choice of databases may result in the omission of some relevant literature, posing a risk of literature gaps. Future research could consider expanding the scope of database selection, such as incorporating Scopus, PubMed, and others, to obtain a more comprehensive literature review.

• Due to the large sample size of the literature, specific quantitative analysis of the research results was not feasible. Subsequent studies may consider tightening the literature selection criteria to reduce the research sample, thereby making the analysis results more persuasive.

These limitations should be addressed and rectified in subsequent research to enhance the comprehensiveness and objectivity of the study. Additionally, a more detailed content analysis of relevant literature is warranted to gain a deeper understanding of the current status and trends in the field of equity in park green space.

8 Conclusion

To unveil the research focal points, developmental trends, and influencing mechanisms in the domain of equity in park green space, this study employed CiteSpace software to analyze research spanning from 2014 to December 2023. The key findings are summarized as follows:

(1) The investigation into equity in park green space has gained increasing prominence, delineated into two major phases: a gradual exploration phase (2014–2018) and a rapid development phase (2018 to the present). The primary nations engaged in this research arena include China, the United States, and Germany. Notably, China’s cumulative publication output is nearly three times that of the second-ranked United States (56) and over five times that of the third-ranked Germany (30). However, collaborative efforts with other nations remain limited.

(2) The current research in this field mainly focuses on three key areas: analysis of park green space equity based on primary social fairness, analysis of park green space equity focusing on vulnerable groups, and the relationship between park green spaces and health.

(3) The influencing factors of park green space equity mainly involve two major aspects: regional economy and government planning, and residents’ economic capability and racial discrimination.

(4) In the future, research can be directed towards several areas including park green space equity among different groups, emerging evaluation methods and data, park green space equity based on perceived accessibility, and the relationship between park green space equity and surface temperature.

In conclusion, this study presents novel insights and recommendations for advancing research on equity in park green space. The outcomes offer valuable references for global park green space planning and management, contributing theoretical support for the further development of equity-focused research in this domain.

Author contributions

LY: Funding acquisition, Methodology, Project administration, Writing–review and editing. XJ: Data curation, Software, Writing–original draft. JZ: Writing–review and editing.

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This research was funded by Zhejiang Provincial Department of Education, grant number Y202146282.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

Adhikari, B., Delgado-Ron, J. A., Van Den Bosch, M., Dummer, T., Hong, A., Sandhu, J., et al. (2021). Community design and hypertension: walkability and park access relationships with cardiovascular health. Int. J. Hyg. Environ. Health 237, 113820. doi:10.1016/j.ijheh.2021.113820

Alessandro, R., and Jeremy, N. (2018). What shapes uneven access to urban amenities? Thick injustice and the legacy of racial discrimination in Denver’s parks. J. Plan. Educ. Res. 41 (3), 0739456X1878925. doi:10.1177/0739456X18789251

Alessandro, R., and Jeremy, N. (2020). Green gentrification or ‘just green enough’: do park location, size and function affect whether a place gentrifies or not? Urban Stud. 57 (2), 402–420. doi:10.1177/0042098019849380

Ayala-Azcárraga, C., Diaz, D., and Zambrano, L. (2019). Characteristics of urban parks and their relation to user well-being. Landsc. Urban Plan. 189, 27–35. doi:10.1016/j.landurbplan.2019.04.005

Bao, Z., Bai, Y., and Geng, T. (2023). Examining spatial inequalities in public green space accessibility: a focus on disadvantaged groups in England.

Boone, C. G., Buckley, G. L., Grove, J. M., and Sister, C. (2009). Parks and people: an environmental justice inquiry in Baltimore, Maryland. Ann. Assoc. Am. Geogr. 99 (4), 767–787. doi:10.1080/00045600903102949

Brisman, A. (2012). An elevated challenge to ‘broken windows’: the High Line (New York). Crime, Media, Cult. Int. J. 8 (3), 381. doi:10.1177/1741659012443235

Brown, G., Schebella, M. F., and Weber, D. (2014). Using participatory GIS to measure physical activity and urban park benefits. Landsc. Urban Plan. 121, 34–44. doi:10.1016/j.landurbplan.2013.09.006

Browning, M. H. E. M., Locke, D. H., Konijnendijk, C., Labib, S. M., Rigolon, A., Yeager, R., et al. (2024). Measuring the 3-30-300 rule to help cities meet nature access thresholds. Sci. Total Environ. 907, 167739. doi:10.1016/j.scitotenv.2023.167739

Browning, M. H. E. M., Rigolon, A., Ogletree, S., Wang, R., Klompmaker, J. O., Bailey, C., et al. (2022). The PAD-US-AR dataset: measuring accessible and recreational parks in the contiguous United States. Sci. Data 9 (1), 773. doi:10.1038/s41597-022-01857-7

Chang, Z., Chen, J., Li, W., and Li, X. (2019). Public transportation and the spatial inequality of urban park accessibility: new evidence from Hong Kong. Transp. Res. Part D Transp. Environ. 76, 111–122. doi:10.1016/j.trd.2019.09.012

Chao, X., Guangdong, C., Qianyuan, H., Meirong, S., Qiangqiang, R., Wencong, Y., et al. (2022). Can improving the spatial equity of urban green space mitigate the effect of urban heat islands? An empirical study. Sci. Total Environ. 841, 156687. doi:10.1016/j.scitotenv.2022.156687

Chaomei, C. (2005). CiteSpace II: detecting and visualizing emerging trends and transient patterns in scientific literature. J. Am. Soc. Inf. Sci. Technol. 57 (3), 359–377. doi:10.1002/asi.20317

Chen, B., Nie, Z., Chen, Z., and Xu, B. (2017). Quantitative estimation of 21st-century urban greenspace changes in Chinese populous cities. Sci. Total Environ. 609, 956–965. doi:10.1016/j.scitotenv.2017.07.238

Chen, C. (2017). Science mapping: a systematic review of the literature. J. Data Inf. Sci. 2, 1–40. doi:10.1515/jdis-2017-0006

Chen, W. Y., and Wang, D. T. (2013). Urban forest development in China: natural endowment or socioeconomic product? Cities 35, 62–68. doi:10.1016/j.cities.2013.06.011

Chen, Y., Yue, W., and La Rosa, D. (2020). Which communities have better accessibility to green space? An investigation into environmental inequality using big data. Landsc. Urban Plan. 204, 103919. doi:10.1016/j.landurbplan.2020.103919

Cho, C. M. (2004). Study on effects of resident-perceived neighborhood boundaries on public services accessibility & its relation to utilization: using Geographic Information System, focusing on the case of public parks in Austin, Texas.

Coppel, G., and Wüstemann, H. (2017). The impact of urban green space on health in Berlin, Germany: empirical findings and implications for urban planning. Landsc. Urban Plan. 167, 410–418. doi:10.1016/j.landurbplan.2017.06.015

Crouse, D. L., Pinault, L., Balram, A., Hystad, P., Peters, P. A., Chen, H., et al. (2017). Urban greenness and mortality in Canada's largest cities: a national cohort study. Lancet Planet. Health 1 (7), e289–e297. doi:10.1016/s2542-5196(17)30118-3

Curran, W., and Hamilton, T. (2012). Just green enough: contesting environmental gentrification in Greenpoint, Brooklyn. Local Environ. 17 (9), 1027–1042. doi:10.1080/13549839.2012.729569

Dadvand, P., Bartoll, X., Basagaña, X., Dalmau-Bueno, A., Martinez, D., Ambros, A., et al. (2016). Green spaces and General Health: roles of mental health status, social support, and physical activity. Environ. Int. 91, 161–167. doi:10.1016/j.envint.2016.02.029

Dai, D. (2011). Racial/ethnic and socioeconomic disparities in urban green space accessibility: where to intervene? Landsc. Urban Plan. 102 (4), 234–244. doi:10.1016/j.landurbplan.2011.05.002

Donovan, G. H., and Butry, D. T. (2009). The value of shade: estimating the effect of urban trees on summertime electricity use. Energy Build. 41 (6), 662–668. doi:10.1016/j.enbuild.2009.01.002

Du, C., Jia, W., and Wang, K. (2023). Valuing carbon saving potential of urban parks in thermal mitigation: linking accumulative and accessibility perspectives. Urban Clim. 51, 101645. doi:10.1016/j.uclim.2023.101645

Fang, L., Zhang, D., Liu, T., Yao, S., Fan, Z., Xie, Y., et al. (2021). A multi-level investigation of environmental justice on cultural ecosystem services at a national scale based on social media data: a case of accessibility to Five-A ecological attractions in China. J. Clean. Prod. 286, 124923. doi:10.1016/j.jclepro.2020.124923

Fian, L., White, M. P., Thaler, T., Arnberger, A., Elliott, L. R., and Friesenecker, M. (2023). Inequalities in residential nature and nature-based recreation are not universal: a country-level analysis in Austria. Urban For. Urban Green. 85, 127977. doi:10.1016/j.ufug.2023.127977

Guo, S., Song, C., Pei, T., Liu, Y., Ma, T., Du, Y., et al. (2019). Accessibility to urban parks for elderly residents: perspectives from mobile phone data. Landsc. Urban Plan. 191, 103642. doi:10.1016/j.landurbplan.2019.103642

Haiya, J. (2011). Advance in the equity of spatial distribution of urban public service in western countries.

Handley, J., Slinn, P., Barber, A., Baker, M., Jones, C., and Lindley, S. (2003). Accessible natural green space standards in towns and cities: a review and toolkit for their implementation.

Harnik, P., and Barnhart, K. (2015). Parks as community development: when it comes to gritty cities, conserving pristine land is not the only way to create places. Editor D. T. T Washington.

Heikinheimo, V., Tenkanen, H., Bergroth, C., Järv, O., Hiippala, T., and Toivonen, T. (2020). Understanding the use of urban green spaces from user-generated geographic information. Landsc. Urban Plan. 201, 103845. doi:10.1016/j.landurbplan.2020.103845

Hewko, J., Smoyer-Tomic, K. E., and Hodgson, M. J. (2002). Measuring neighbourhood spatial accessibility to urban amenities: does aggregation error matter? Environ. Plan. A 34 (7), 1185–1206. doi:10.1068/a34171

Hou, Y., Chen, X., Liu, Y., and Xu, D. (2023). Association between UGS patterns and residents' health status: a report on residents' health in China's old industrial areas. Environ. Res. 239, 117199. doi:10.1016/j.envres.2023.117199

Houlden, V., Weich, S., and Jarvis, S. (2017). A cross-sectional analysis of green space prevalence and mental wellbeing in England. BMC Public Health 17 (1), 460. doi:10.1186/s12889-017-4401-x

Huang, W., and Lin, G. (2023). The relationship between urban green space and social health of individuals: a scoping review. Urban For. Urban Green. 85, 127969. doi:10.1016/j.ufug.2023.127969

Iraegui, E., Augusto, G., and Cabral, P. (2020). Assessing equity in the accessibility to urban green spaces according to different functional levels.

Joassart-Marcelli, P., Wolch, J., and Salim, Z. (2011). Building the healthy city: the role of nonprofits in creating active urban parks. Urban Geogr. 32 (5), 682–711. doi:10.2747/0272-3638.32.5.682

Julia, R., Christiane, B., Julia, W., and André, C. (2021). Socioeconomic differences in walking time of children and adolescents to public green spaces in urban areas—results of the German environmental survey (2014–2017). Int. J. Environ. Res. Public Health 18 (5), 2326. doi:10.3390/ijerph18052326

Kabisch, N., and Haase, D. (2014). Green justice or just green? Provision of urban green spaces in Berlin, Germany. Landsc. Urban Plan. 122, 129–139. doi:10.1016/j.landurbplan.2013.11.016

Kaczynski, A. T., and Henderson, K. A. (2007). Environmental correlates of physical activity: a review of evidence about parks and recreation. Leis. Sci. 29 (4), 315–354. doi:10.1080/01490400701394865

Kaczynski, A. T., Potwarka, L. R., and Saelens, B. E. (2008). Association of park size, distance, and features with physical activity in neighborhood parks. Am. J. Public Health 98 (8), 1451–1456. doi:10.2105/ajph.2007.129064

Kim, Y., Corley, E. A., Won, Y., and Kim, J. (2023). Green space access and visitation disparities in the Phoenix metropolitan area. Landsc. Urban Plan. 237, 104805. doi:10.1016/j.landurbplan.2023.104805

Kroeger, T., Escobedo, F. J., Hernandez, J. L., Varela, S., Delphin, S., Fisher, J. R. B., et al. (2014). Reforestation as a novel abatement and compliance measure for ground-level ozone. Proc. Natl. Acad. Sci. 111 (40), E4204–E4213. doi:10.1073/pnas.1409785111

Leng, H., Li, S., Yan, S., and An, X. (2020). Exploring the relationship between green space in a neighbourhood and cardiovascular health in the winter city of China: a study using a health survey for Harbin.

Li, F., Wang, X., Liu, H., Li, X., Zhang, X., Sun, Y., et al. (2018). Does economic development improve urban greening? Evidence from 289 cities in China using spatial regression models. Environ. Monit. Assess. 190 (9), 541. doi:10.1007/s10661-018-6871-4

Li, H., Ta, N., Yu, B., and Wu, J. (2023). Are the accessibility and facility environment of parks associated with mental health? A comparative analysis based on residential areas and workplaces. Landsc. Urban Plan. 237, 104807. doi:10.1016/j.landurbplan.2023.104807

Lim, H., Kim, J., Potter, C., and Bae, W. (2013). Urban regeneration and gentrification: land use impacts of the Cheonggye Stream Restoration Project on the Seoul's central business district. Habitat Int. 39, 192–200. doi:10.1016/j.habitatint.2012.12.004

Lin, D., Sun, Y., Yang, Y., Han, Y., and Xu, C. (2023). Urban park use and self-reported physical, mental, and social health during the COVID-19 pandemic: an on-site survey in Beijing, China. Urban For. Urban Green. 79, 127804. doi:10.1016/j.ufug.2022.127804

Liotta, C., Kervinio, Y., Levrel, H., and Tardieu, L. (2020). Planning for environmental justice - reducing well-being inequalities through urban greening. Environ. Sci. Policy 112, 47–60. doi:10.1016/j.envsci.2020.03.017

Liu, D., Kwan, M.-P., and Kan, Z. (2021). Analysis of urban green space accessibility and distribution inequity in the City of Chicago. Urban For. Urban Green. 59, 127029. doi:10.1016/j.ufug.2021.127029

Liyan, X., Yin, H., and Fang, J. (2023). Evaluating the supply-demand relationship for urban green parks in Beijing from an ecosystem service flow perspective. Urban For. Urban Green. 85, 127974. doi:10.1016/j.ufug.2023.127974

Lucy, W. (1981). Equity and planning for local services. J. Am. Plan. Assoc. 47 (4), 447–457. doi:10.1080/01944368108976526

Maas, H., Verheij, R. A., de Vries, S., Spreeuwenberg, P., Schellevis, F. G., and Groenewegen, P. P. (2009). Morbidity is related to a green living environment. J. Epidemiol. Community Health 63 (12), 967–973. doi:10.1136/jech.2008.079038

Meng, G., Bingxi, L., Yu, T., and Dawei, X. (2020). Equity to urban parks for elderly residents: perspectives of balance between supply and demand. Int. J. Environ. Res. Public Health 17 (22), 8506. doi:10.3390/ijerph17228506

Nesbitt, L., Meitner, M. J., Girling, C., Sheppard, S. R. J., and Lu, Y. (2019). Who has access to urban vegetation? A spatial analysis of distributional green equity in 10 US cities. Landsc. Urban Plan. 181, 51–79. doi:10.1016/j.landurbplan.2018.08.007

Noordzij, J. M., Marielle, A. B., Joost Oude, G., and Frank, J. V. L. (2020). Effect of changes in green spaces on mental health in older adults: a fixed effects analysis. J. Epidemiol. Community Health 74 (1), 48–56. doi:10.1136/jech-2019-212704

Pope, D., Tisdall, R., Middleton, J., Verma, A., Van Ameijden, E., Birt, C., et al. (2018). Quality of and access to green space in relation to psychological distress: results from a population-based cross-sectional study as part of the EURO-URHIS 2 project. Eur. J. Public Health 28 (1), 35–38. doi:10.1093/eurpub/ckx217

Rahimi, A., Davatgar Khorsand, E., Breuste, J., and Karimzadeh, H. (2023). Gender justice in green space use in relation to different socio-economic conditions in Tabriz, Iran. Sustain. Cities Soc. 99, 104973. doi:10.1016/j.scs.2023.104973

Rao, Y., Zhong, Y., He, Q., and Dai, J. (2022). Assessing the equity of accessibility to urban green space: a study of 254 cities in China.

Rigolon, A. (2016). A complex landscape of inequity in access to urban parks: a literature review. Landsc. Urban Plan. 153, 160–169. doi:10.1016/j.landurbplan.2016.05.017

Shen, Y., Sun, F., and Che, Y. (2017). Public green spaces and human wellbeing: mapping the spatial inequity and mismatching status of public green space in the Central City of Shanghai. Urban For. Urban Green. 27, 59–68. doi:10.1016/j.ufug.2017.06.018

Standard for Classification of Urban Green Space (2017). Standard for classification of urban green space. Beijing, China: China Architecture & Building Press.

Stanners, D. A., and Bourdeau, P. F. (1995). Europe's environment: the Dobříš assessment.

Sugiyama, A., Leslie, E., Giles-Corti, B., and Owen, N. (2008). Associations of neighbourhood greenness with physical and mental health: do walking, social coherence and local social interaction explain the relationships? J. Epidemiol. Community Health 62 (5), e9. doi:10.1136/jech.2007.064287

Tamosiunas, A., Grazuleviciene, R., Luksiene, D., Dedele, A., Reklaitiene, R., Baceviciene, M., et al. (2014). Accessibility and use of urban green spaces, and cardiovascular health: findings from a Kaunas cohort study. Environ. Health 13 (1), 20. doi:10.1186/1476-069x-13-20

Tian, M., Yuan, L., Guo, R., Wu, Y., and Liu, X. (2021). Sustainable development: investigating the correlations between park equality and mortality by multilevel model in Shenzhen, China. Sustain. Cities Soc. 75, 103385. doi:10.1016/j.scs.2021.103385

UN-Habitat (2022). World cities report 2022: envisaging the future of cities.

Wang, D., Brown, G., Zhong, G., Liu, Y., and Mateo-Babiano, I. (2015). Factors influencing perceived access to urban parks: a comparative study of Brisbane (Australia) and Zhongshan (China). Habitat Int. 50, 335–346. doi:10.1016/j.habitatint.2015.08.032

Wicks, B. E., and Crompton, J. L. (1986). Citizen and administrator perspectives of equity in the delivery of park services. Leis. Sci. 8 (4), 341–365. doi:10.1080/01490408609513080

Wolch, J., Wilson, J. P., and Fehrenbach, J. (2013). Parks and park funding in Los Angeles: an equity-mapping analysis. Urban Geogr. 26 (1), 4–35. doi:10.2747/0272-3638.26.1.4

Wolch, J. R., Byrne, J., and Newell, J. P. (2014). Urban green space, public health, and environmental justice: the challenge of making cities ‘just green enough’. Landsc. Urban Plan. 125, 234–244. doi:10.1016/j.landurbplan.2014.01.017

Wu, L., and Kim, S. K. (2021). Health outcomes of urban green space in China: evidence from Beijing. Sustain. Cities Soc. 65, 102604. doi:10.1016/j.scs.2020.102604

Wu, L., and Rowe, P. G. (2022). Green space progress or paradox: identifying green space associated gentrification in Beijing. Landsc. Urban Plan. 219, 104321. doi:10.1016/j.landurbplan.2021.104321

Xiao, Y., Wang, D., and Fang, J. (2019). Exploring the disparities in park access through mobile phone data: evidence from Shanghai, China. Landsc. Urban Plan. 181, 80–91. doi:10.1016/j.landurbplan.2018.09.013

Xie, B., An, Z., Zheng, Y., and Li, Z. (2018). Healthy aging with parks: association between park accessibility and the health status of older adults in urban China. Sustain. Cities Soc. 43, 476–486. doi:10.1016/j.scs.2018.09.010

Yang, H., Chen, T., Zeng, Z., and Mi, F. (2022). Does urban green space justly improve public health and well-being? A case study of Tianjin, a megacity in China. J. Clean. Prod. 380, 134920. doi:10.1016/j.jclepro.2022.134920

Yang, H., Wen, J., Lu, Y., and Peng, Q. (2023). A quasi-experimental study on the impact of park accessibility on the mental health of undergraduate students. Urban For. Urban Green. 86, 127979. doi:10.1016/j.ufug.2023.127979

Yu, S., Zhu, X., and He, Q. (2020). An assessment of urban park access using house-level data in urban China: through the lens of social equity. Int. J. Environ. Res. Public Health 17 (7), 2349. doi:10.3390/ijerph17072349

Zhan, D., Zhang, Q., Kwan, M.-P., Liu, J., Zhan, B., and Zhang, W. (2022). Impact of urban green space on self-rated health: evidence from Beijing. Front. Public Health 10, 999970. doi:10.3389/fpubh.2022.999970

Zhang, J. (2023). Inequalities in the quality and proximity of green space exposure are more pronounced than in quantity aspect: evidence from a rapidly urbanizing Chinese city. Urban For. Urban Green. 79, 127811. doi:10.1016/j.ufug.2022.127811

Zhang, J., Cheng, Y., and Zhao, B. (2021a). Assessing the inequities in access to peri-urban parks at the regional level: a case study in China's largest urban agglomeration. Urban For. Urban Green. 65, 127334. doi:10.1016/j.ufug.2021.127334

Zhang, J., Cheng, Y., and Zhao, B. (2021b). How to accurately identify the underserved areas of peri-urban parks? An integrated accessibility indicator. Ecol. Indic. 122, 107263. doi:10.1016/j.ecolind.2020.107263

Zhang, J., Feng, X., Shi, W., Cui, J., Peng, J., Lei, L., et al. (2021c). Health promoting green infrastructure associated with green space visitation. Urban For. Urban Green. 64, 127237. doi:10.1016/j.ufug.2021.127237

Zhang, J., Yu, Z., Cheng, Y., Chen, C., Wan, Y., Zhao, B., et al. (2020). Evaluating the disparities in urban green space provision in communities with diverse built environments: the case of a rapidly urbanizing Chinese city. Build. Environ. 183, 107170. doi:10.1016/j.buildenv.2020.107170

Zhang, L., Chen, P., and Hui, F. (2022a). Refining the accessibility evaluation of urban green spaces with multiple sources of mobility data: a case study in Shenzhen, China. Urban For. Urban Green. 70, 127550. doi:10.1016/j.ufug.2022.127550

Zhang, S., Yu, P., Chen, Y., Jing, Y., and Zeng, F. (2022b). Accessibility of park green space in Wuhan, China: implications for spatial equity in the post-COVID-19 era. Int. J. Environ. Res. Public Health 19, 5440. doi:10.3390/ijerph19095440

Zhang, Y., Van Dijk, T., Tang, J., and Berg, A. E. V. D. (2015). Green space attachment and health: a comparative study in two urban neighborhoods. Int. J. Environ. Res. Public Health 12 (11), 14342–14363. doi:10.3390/ijerph121114342

Zhao, J., Chen, S., Jiang, B., Ren, Y., Wang, H., Vause, J., et al. (2013). Temporal trend of green space coverage in China and its relationship with urbanization over the last two decades. Sci. Total Environ. 442, 455–465. doi:10.1016/j.scitotenv.2012.10.014

Zhu, P., and Zhang, Y. (2008). Demand for urban forests in United States cities. Landsc. Urban Plan. 84 (3), 293–300. doi:10.1016/j.landurbplan.2007.09.005

Keywords: park green space, equity, citespace, China, review

Citation: Yan L, Jin X and Zhang J (2024) Equity in park green spaces: a bibliometric analysis and systematic literature review from 2014-2023. Front. Environ. Sci. 12:1374973. doi: 10.3389/fenvs.2024.1374973

Received: 23 January 2024; Accepted: 07 March 2024; Published: 19 March 2024.

Copyright © 2024 Yan, Jin and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Lijie Yan

Research streams on digital transformation from a holistic business perspective: a systematic literature review and citation network analysis

  • Original Paper
  • Open access
  • Published: 08 November 2019
  • Volume 89, pages 931–963 (2019)

  • J. Piet Hausberg 1 ,
  • Kirsten Liere-Netheler 1 ,
  • Sven Packmohr 2 , 3 ,
  • Stefanie Pakura 4 &
  • Kristin Vogelsang 1  

Digital transformation (DT) has become a buzzword, triggering different disciplines in research and influencing practice, which leads to independent research streams. Scholars investigate the antecedents, contingencies, and consequences of these disruptive technologies by examining the use of single technologies or of digitization in general. Approaches are often very specialized and restricted to their domains. Thus, the immense breadth of technologies and their possible applications leads to a fragmentation of research, impeding a holistic view. With this systematic literature review, we aim to fill this gap by providing an overview of the different disciplines of DT research from a holistic business perspective. We identified the major research streams and clustered them with co-citation network analysis into nine main areas. Our research shows the main fields of interest in digital transformation research, overlaps of the research areas, and fields that are still underrepresented. Within the business research areas, we identified three dominant areas in the literature: finance, marketing, and innovation management. However, research streams also arise in terms of single branches like manufacturing or tourism. This study highlights these diverse research streams with the aim of deepening the understanding of digital transformation in research. Yet, research on DT is still lacking in the areas of accounting, human resource management, and sustainability. The findings were distilled into a framework of the nine main areas to help identify potential research gaps on DT from a business perspective.

1 Introduction

The pervasive influence of digital technologies impacts value creation and value capture (Schwab 2017) as digital products become more the rule than the exception (Brynjolfsson and McAfee 2014). Given the transformational character of these digital products on many levels, the concept of Digital Transformation (DT) receives increasing attention in management research and practice. For our purposes, it helps to understand DT as generally the “disruptive implications of digital technologies” (Nambisan et al. 2019, p. 1). These implications appear at and across various levels, from the individual through the organizational to the societal level (Lepak et al. 2007; Nambisan et al. 2019). The transformation affects organizations as a whole and leads to changes in ways of performing work (Haverkort and Zimmermann 2017), organizing work, and even in the business models of companies (Lucas and Goh 2009; Schallmo et al. 2017).

However, research approaches are often very specialized and restricted to their domains, resulting in a rapidly growing number of publications each year with results from different disciplines and points of view in the field of DT. Due to these different research approaches and domains, the larger field of DT is very complex and hard to comprehend. Researchers do not even agree on a common definition of the term “digital transformation” (cf. Morakanyane et al. 2017), and it is often used interchangeably with terms like “digitization” and “digitalization”. This complexity leads to uncertainty regarding the topic, especially in practice, such that many firms struggle with the development, diffusion, and implementation of new technologies regarding digital transformation (Brynjolfsson and McAfee 2014), and consequently, great opportunities remain wasted (Hirsch-Kreinsen 2015).

In order to improve our understanding of possible implications of DT, it is critical to overcome these uncertainties and to further develop a common understanding of this field. There are already studies in the literature on the implications of DT in businesses (Kane et al. 2015; Matt et al. 2015), which can be used as a basis to foster understanding. Besides many technology-driven studies, additional research approaches from a business perspective are needed (Hirsch-Kreinsen 2015). Changes can be observed in the industry and industrial processes (Pisano and Shih 2012), as well as in areas like smart homes (Risteska Stojkoska and Trivodaliev 2017) or e-health (Ross et al. 2016). Therefore, the topic is of interest to many different disciplines, yet there is a lack of synergy. Cooperation among the disciplines electrical engineering, business administration, computer science, business, and information systems engineering is a necessary feature of this phenomenon (Hirsch-Kreinsen 2015).

Our study aims at structuring existing research and identifying the major current trends, and thus offers an overview of recent research streams and topics in the area of DT from a business perspective. We contribute to the wide field of DT research by providing a theoretical background for subsequent research. Research areas are shown and possible gaps identified. This work may help researchers to identify similarities and differences within areas of DT research. Our findings may ease the comprehension of complementary conclusions from adjacent fields and foster an interdisciplinary understanding. In emerging topics, expertise is important, as is adaptive expertise, which describes the ability of researchers to understand and combine results and procedures from different fields (Boon et al. 2019). Thus, our results can be regarded as the first step towards this ability by showing a holistic approach to DT research. We would welcome a mutual exchange of findings among the corresponding research streams in the future.

There are many different opportunities to study the complex and immense field of DT from a business perspective. To bring these together, we use a citation network analysis (Boyack and Klavans 2010). Unlike other literature review approaches, the network analysis does not focus on a special field within DT research. It is less selective in the first instance and enables the inclusion of a broad literature base, allowing the diverse field to be structured. To gain a broad literature base, we use search terms combining DT with the focused business perspective. The generated database is then used for the citation network analysis, which is executed with the tool Gephi, resulting in clusters representing different research streams. Finally, the most relevant clusters are examined qualitatively to give an overview of major trends and topics studied in these streams.

In the following, we develop the theoretical foundation for the research approach, including the definition of digital transformation and a short introduction to our understanding of the business and technology perspective. Afterward, our method is introduced in detail. Results are presented in general, followed by an overview of the different clusters identified. Moreover, research gaps are shown. We conclude with a summary, limitations, and an outlook for further research.

2 Theoretical foundation

2.1 Digital transformation

The term “digital transformation” (DT) pervades the modern world. However, a generally valid definition for the concept of digital transformation does not yet exist. Some researchers focus on specific technologies to explain an “organizational shift to big data analytics” (Nwankpa and Roumani 2016 , p. 4), while others focus on technology in general as the driver of radical change (Westerman et al. 2014 ). We want to underline, however, that DT does not merely refer to technological changes, but also to the impacts thereof on the organization itself (Hinings et al. 2018 ). It leads to “transformations of key business operations and affects products and processes, as well as organizational structures and management concepts” (Matt et al. 2015 , p. 339). The changes that come along with the digitalization affect people, society, communication and the whole business (Gimpel and Röglinger 2015 ; Jung et al. 2018 ).

Many of the technologies that affect DT are not new. The innovation is about “combinations of information, computing, communication, and connectivity technologies” (Bharadwaj et al. 2013 , p. 471). The major technological areas which enable DT are very diverse and traditionally called “general purpose technologies” (Hirsch-Kreinsen and ten Hompel 2017 ). These include, for example, cyber-physical systems (CPS), (industrial) internet of things (I/IoT), cloud computing (CC), big data (BD), artificial intelligence but also augmented and virtual reality (Cheng et al. 2016 ).

Yet, “organizations struggle with radical change to adopt novel digital institutional arrangements that are radical and transformational” (Hinings et al. 2018 , p. 59). However, many researchers and practitioners see positive effects of the digitalization. They sense the manifold benefits that foster an increase in sales and productivity triggered by innovative forms of value creation and new ways of interaction with customers and suppliers (Downes and Nunes 2013 ; Matt et al. 2015 ; Parviainen et al. 2017 ). For example, the digital interconnection of machines will enable flexible small series (Spath et al. 2013 ) and improve the value creation process (Stock and Seliger 2016 ). Digital communication opportunities and virtual networks change the way of doing business and gaining competitive advantage (Parviainen et al. 2017 ). Moreover, researchers sense positive effects because DT triggers job growth, such as service occupations and robot development (Brynjolfsson and McAfee 2014 ).

In summary, the DT of business leads to three significant changes (Fitzgerald et al. 2014; Liere-Netheler et al. 2018): (1) digitally supported and cross-linked processes, (2) digitally enabled communication, and (3) new ways of value generation based on digital innovations or gained digital data. These major changes can be found worldwide and in all industries. Moreover, DT has spawned new business areas such as e-government, e-banking, e-marketing, e-tourism and the highly innovative field of e-health where two research areas (medicine and information systems) meld.

Despite the gains of DT, more and more researchers see negative effects of digitalization. A significant threat is impending job loss (Brynjolfsson and McAfee 2014). Digital processes and the increased use of robot technologies will lead to employee reduction, mainly in lower-level jobs (Frey and Osborne 2017). Furthermore, risks such as cybersecurity menaces (Greengard 2016) or uncontrolled or errant data (Allcott and Gentzkow 2017) pose threats to businesses. Firms within all branches struggle with the heterogeneous landscape of interfaces and integration standards (Bley et al. 2016). Still, the general expectations towards DT are high. Researchers from different disciplines contribute to an ongoing evolution of DT, its risks, and future applications.

2.2 Business and technology perspective

As described in the previous section, DT is based on technological progress but implies a much broader focus influencing organizations as a whole. So, research in technological areas like informatics and engineering is very important. However, to drive the topic forward, business perspectives are necessary. As the discipline of information systems (IS) unites these views, we regard it as useful for our purpose. Since the development of information systems, their role in the support of management has become increasingly important. Gross and Solymossy (2016) outline three eras in the development of IS: from 1937 to 1962, storage of economic data in central administrations; from 1962 to 1987, adoption of computer hardware and software by companies; and from 1987 to 2012, usage in transactions with stakeholders. The current era, i.e., after 2012, is characterized by digital technologies affecting how companies are run (Fitzgerald et al. 2014). Companies use digital twins, Business-to-Machine Communication, and data-driven business models to deliver value to customers. Looking at Porter’s value chain (Huggins and Izushi 2011), activities move closer together through the use of connected digital devices and information systems.

Within this paper, we will not focus on specific technologies. The aim is to take a holistic view of how the area of DT is evolving (Devaraj and Kohli 2003 ; Karimi and Walter 2015 ). Of course, we will use specific technological terms for our literature search to find relevant articles, but at the same time connect to its usage within organizations. As different research fields arise within DT (see Sect. 2.1 ), the scope of this article is not limited to applications but rather to a non-technological perspective. We aim at topics from a socio-technical view. This includes the acceptance, adoption and use of technologies (Liere-Netheler et al. 2018 ).

The importance and potential of reviews have increased across all academic disciplines (Schryen 2015). To gain an overall understanding, a literature review in the sense of a state of the art has many benefits. Researchers collect and understand what is already known in the specified field of interest. Furthermore, they can identify and name the research gaps. Moreover, it is essential for the foundation of a proposed study (Levy and Ellis 2006) and can also help to generate ideas for practical problems (Okoli and Schabram 2010), thereby serving as the basis for any further research in a specific field (vom Brocke et al. 2015). According to Fink (2005), a literature review has to be systematic in the approach, explicit in procedure, comprehensive in scope, and reproducible. The documentation of the research process has been identified as the crucial part of a successful review (vom Brocke et al. 2009), which is why in the following we will present our procedure in detail.

We followed a three-step research approach similar to other research designs in the literature (Hausberg and Korreck 2018). An overview of the approach can be seen in Fig. 1. The outcome (out) of each step is used to perform the following step and is thus described as an input (in). The individual steps are explained in the further course of this chapter.

figure 1

Research approach

3.1 Identification of literature

As a first step for our study, we identified the database for further analysis. To develop the search terms for our review, we first read articles from the field of interest with special regard to main titles and keywords. We searched from a holistic view, seeking research dealing with DT as an organizational change. With the help of the literature, we deduced a set of relevant buzzwords combining two research streams: digitalization and business research. As the goal was not to focus on a specific technology, we included different technologies within the search terms. Using the list of keywords, we conducted several search loops to adapt the relevant terms iteratively. After each loop, the top ten to twenty results by times cited were checked to make sure the search stream fit our research question. The final terms used can be seen in Table 1. The first column of the table includes synonymous concepts of digitalization like “Industrie 4.0” as well as technologies and inventions linked to DT. Many terms have connections to the field of Information Systems (IS) research and linkage to production systems. The right side of the table mainly presents business areas (e.g., controlling, logistics etc.) and closely linked terms. By combining these two fields, we gain research material dealing with the intended view of DT in business. We are aware that the search terms are theory- and technology-driven rather than impact-driven. As DT is at an evolving stage, we expect the focus of past and current research on theory and technology development to be useful.

We used the ISI Web of Science (WoS) as the database for our search. The different combinations of terms were searched in titles, keywords, or abstracts by using the field ‘topic’. WoS is considered the most comprehensive database and is frequently used in management and IS research (Dahlander and Gann 2010; Schryen 2015; Mian et al. 2016; Albort-Morant and Ribeiro-Soriano 2016). We conducted the search in November 2017 and decided to limit the search period to the last 20 years because DT as used for the purpose of this article (described in the theoretical foundation) emerged as a topic in the 2000s. Nevertheless, we included research back to 1997 so as not to miss important groundwork. Before that time, digital technologies like the Internet had only just surfaced. To stay focussed on the business and technology perspective, we restricted the research areas to operations research and management science, business and economics, international relations, social sciences (other topics), communication, behavioural sciences, social issues, and sociology.

3.2 Citation network analysis

Today, literature reviews face the challenge of a fast-growing number of articles, the majority of which is available online (vom Brocke et al. 2015 ). An analysis with the help of tools makes the large amount of literature manageable. We used the freeware online tool hammer.nailsproject.org to conduct a bibliometric analysis and obtain the co-citation node-edge-files. We imported the data to the software Gephi 0.9.2 to carry out the citation network analysis and visualization of the co-citation network. Citation network analyses assume that with an increasing number of shared citations between two publications, the probability increases that the cited papers share a specialized language and specific worldview (Boyack and Klavans 2010 ). Based on this assumption, we can infer that nodes belonging to the same cluster within such a citation network treat the topic of interest from a similar perspective and with similar argumentative backgrounds and patterns.
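The co-citation assumption described above can be illustrated with a minimal Python sketch. It is not the procedure implemented by the bibliometric tool or by Gephi; the function name `co_citation_counts`, the input layout (one list of cited-reference identifiers per citing article), and the toy records are assumptions made purely for illustration.

```python
from collections import Counter
from itertools import combinations


def co_citation_counts(reference_lists):
    """Count how often each pair of references is cited together.

    reference_lists: iterable of lists, one list of cited-reference IDs
    per citing article (hypothetical input layout).
    """
    pair_counts = Counter()
    for refs in reference_lists:
        # Deduplicate and sort so each unordered pair has one canonical key.
        for a, b in combinations(sorted(set(refs)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Toy data (hypothetical): three citing articles and their references.
records = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "C", "D"],
]
counts = co_citation_counts(records)
```

In this toy example the pairs (A, B) and (B, C) are each co-cited twice, so under the stated assumption they would be more likely to belong to the same research stream than, say, A and D, which are never cited together.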

In a subsequent step, we searched for duplicate entries, for example, those due to errors in the spelling of author names. In our final sample, we had 1876 articles citing an additional 71,368 references, leaving us with a total of 73,244 publications that constituted the nodes of our co-citation network. We filtered out all entries with fewer than two citations to make sure that all included articles were cited more than once, as we regard a single citation as rather random (Boyack and Klavans 2010). This is also in accordance with the goal of bringing together research with at least a few overlaps. In doing so, the network was reduced to a size of 7980 nodes (10.9% of the total network) with 3790 edges, a diameter of 5, and an average path length of 1.598.
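The node counts and the filtering rule reported above can be checked with a few lines of Python. The arithmetic uses only the figures given in the text; the helper `filter_by_citations` and the toy citation counts are hypothetical and serve only to show the threshold rule.

```python
def filter_by_citations(citation_counts, min_citations=2):
    """Keep only publications cited at least `min_citations` times."""
    return {pub: n for pub, n in citation_counts.items() if n >= min_citations}


# Figures reported in the text.
citing_articles = 1876
cited_references = 71_368
total_nodes = citing_articles + cited_references  # all nodes before filtering
retained_nodes = 7980
# Share of the full network that survives the filter, in percent.
retained_share = round(100 * retained_nodes / total_nodes, 1)

# Toy citation counts (hypothetical) to show the filter itself.
toy_counts = {"P1": 5, "P2": 1, "P3": 2, "P4": 0}
kept = filter_by_citations(toy_counts)
```

With these inputs, `total_nodes` is 73,244 and `retained_share` is 10.9, matching the values stated in the paragraph.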

Based on this, we ran a cluster analysis identifying 226 clusters. However, only the top 22 clusters had a meaningful size and included each at least 1.1% of all nodes. We took these clusters as a starting point for our qualitative analysis. We visualize the network in Fig.  2 with the nodes being color-coded according to their common research streams as identified through the cluster analysis. Each article in the analysis is assigned to one cluster.
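Selecting clusters by their share of nodes can be sketched as follows. This is a simplified illustration and not the cluster analysis itself; the function `select_large_clusters`, the node-to-cluster mapping, and the toy data are assumptions, and only the 1.1% threshold is taken from the text.

```python
from collections import Counter


def select_large_clusters(assignment, min_share=0.011):
    """Return {cluster_id: share} for clusters holding at least
    `min_share` of all nodes, ordered from largest to smallest."""
    sizes = Counter(assignment.values())
    total = len(assignment)
    return {
        cid: size / total
        for cid, size in sizes.most_common()
        if size / total >= min_share
    }


# Toy assignment (hypothetical): 100 nodes in clusters of size 60, 39 and 1.
assignment = {f"n{i}": 0 for i in range(60)}
assignment.update({f"n{i}": 1 for i in range(60, 99)})
assignment["n99"] = 2
large = select_large_clusters(assignment, min_share=0.011)
```

In the toy data, the single-node cluster holds 1% of all nodes and therefore falls just below the 1.1% threshold, while the two larger clusters are retained.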

figure 2

Co-citation network graph (largest connected component)

3.3 Qualitative analysis

To study the major topics at the interfaces between business and management research and information systems literature, we sorted the clusters by size (total number of articles within each cluster) and focused on the top ten percent of clusters by number of articles. Thus, for our qualitative analysis, we have a total of 22 clusters ranging from 2887 articles (cluster 1) to 841 (cluster 22).

To proceed with the qualitative reading, we checked which of the clustered articles are available within the ISI Web of Science (WoS). As a result, we conducted a qualitative reading of 728 articles. The qualitative reading followed a threefold approach: First, we examined all articles within each cluster by reading the heading, the abstract, and the keywords, focusing on categorizing the cluster in the field of existing research on DT from a business and management research perspective. Second, using quantitative text mining tools, we took the headings as well as the keywords of the articles and identified the most relevant keywords and topics within each cluster to designate the clusters by main topics and subtopics. The process of cluster naming and definition took place in a two-stage evaluation process carried out by a heterogeneous team of five researchers. To name the clusters, each author first evaluated the cluster individually. Afterward, the individual evaluation results were merged and discussed jointly among the members of the whole research group, before the cluster designations were finally defined and the clusters were named.
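The keyword-frequency part of the second step might look like the following minimal sketch. The text does not specify which text mining tools were used, so the function `top_keywords`, the normalisation rule (lower-casing and trimming), and the toy keyword lists are all assumptions for illustration.

```python
from collections import Counter


def top_keywords(keyword_lists, n=3):
    """Return the `n` most frequent keywords across the articles of a cluster.

    keyword_lists: one list of author keywords per article (hypothetical layout).
    """
    counts = Counter()
    for keywords in keyword_lists:
        # Normalise case and whitespace; count each keyword once per article.
        counts.update({kw.strip().lower() for kw in keywords})
    return counts.most_common(n)


# Toy cluster (hypothetical keyword lists for three articles).
cluster_keywords = [
    ["Big Data", "analytics"],
    ["big data", "Knowledge Management"],
    ["Big data ", "analytics", "performance"],
]
ranking = top_keywords(cluster_keywords, n=2)
```

A ranking of this kind can only support the naming of a cluster; the designation itself still rests on the joint discussion described above.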

In this process, we recognized some articles that did not fit within the topic that constituted the theme of the cluster. This usually happens when articles represent fringe topics or when their citation pattern is at odds with the norm in a specific subfield. After filtering out papers without a clear relation to the research context of the designated cluster, we conducted the third step of our qualitative analysis, a detailed, qualitative reading of each remaining article. To evaluate the clusters, different methods are known in the literature, which are classified into three groups: internal, external and relative validation techniques. These methods are mainly based on distances between objects and are useful to evaluate the algorithms used (Arbelaitz et al. 2013). However, because our goal was to evaluate the consistency of topics within one cluster, we developed our own measurement: the “Cluster Trust Index” (CTI), which we defined as the ratio of the number of articles utilized to further describe the cluster to the total number of articles in the cluster (see Footnote 1). The CTI may provide an indication of the quality of the automated allocation to the clusters. In this last step, we gained deeper insights as we named the main research streams, pointed out the most used theories, presented the key methods and tools, and summarized the main results. Furthermore, we identified the most cited authors in each cluster and concluded with identified research gaps and suggested fields for further research.
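The CTI as defined above is a simple ratio, which the following Python sketch makes explicit. The function name and the example numbers are hypothetical; which article count serves as the denominator in practice follows from the definition in the text and its footnote, and is passed in here by the caller.

```python
def cluster_trust_index(articles_used, articles_in_cluster):
    """Ratio of articles used to describe a cluster to all articles in it.

    Both arguments are plain counts; the result lies between 0 and 1.
    """
    if articles_in_cluster <= 0:
        raise ValueError("articles_in_cluster must be positive")
    if not 0 <= articles_used <= articles_in_cluster:
        raise ValueError("articles_used must lie between 0 and articles_in_cluster")
    return articles_used / articles_in_cluster


# Toy example (hypothetical counts): 30 of 40 articles fit the cluster topic.
cti = cluster_trust_index(30, 40)
```

A value close to 1 would indicate that nearly all articles assigned to the cluster matched its theme, whereas a low value would suggest that the automated allocation grouped together thematically unrelated articles.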

4 Research streams on digital transformation

The identification of the literature base with the help of Web of Science led to 1876 hits. Most articles were published during the last five years, as seen in Fig. 3. We assume that research interest is still growing, as the topic has attracted increasing attention since 2013. More than 300 papers were published in the journal “Expert Systems with Applications”, which focuses on technical solutions and intelligent systems applied in different contexts and is not limited to a specific area. Moreover, many articles were published in “Decision Support Systems” and the “European Journal of Operational Research”. Besides these journals with a business perspective, other journals with a more psychological view were found.

figure 3

Articles per year

The technologies investigated in the analyzed articles (recognized by keywords) can be seen in Fig. 4. Research on big data, in particular, has gained more and more attention during the last 5 years. As big data can be understood as a large amount of data (Chen 2014) as well as the technological challenges associated with these data (Madden 2012), many articles deal with this topic. The number of articles on cloud computing has also risen significantly since 2013. As the Internet of Things was described as a concept by Kevin Ashton in 2009 (Ashton 2009), research grew from that time. Artificial intelligence, machine learning, as well as augmented and virtual reality, seem to be rather steady topics in research.

figure 4

Articles per technology per year

For the identification of clusters and superior research streams, the cited references were included in the analysis. For the qualitative analysis, 22 clusters were analyzed in-depth which represent the most important topics in our database. For an overview of the clusters, see Table  2 . The clusters are further introduced in the following chapters by presenting the research streams identified. This means we merged clusters dealing with similar research issues to one topic. In total, we introduce nine identified streams in the following chapters. The numbering of clusters is based on their size regarding articles found (see # in Table  2 ). During the qualitative analysis, we identified two clusters which were excluded for further examination because they do not fit the business perspective that was intended. One of these was named “methods” as it mainly deals with research methods, especially in statistics and game theory. Moreover, many papers are technology focussed as they deal with programming issues. We also did not investigate the cluster “health care” in further detail because of a missing business perspective.

The size of the clusters can be found in Table  2 . “Total” includes articles from the base sample, as well as references. The column “found” shows only the articles found during the Web of Science search. QA (qualitative analysis) is the number of articles, which were in-depth analysed in the third step. Lastly, the cluster trust index is used to evaluate the quality of the cluster-building process.

The ratio of the size of the clusters, measured by the number of articles, seems to be rather unchanged. A peak of articles can be found between 2011 and 2014 for the innovation and manufacturing cluster (see Fig.  5 ). Yet the topics seem to decline afterwards in the field of DT research leading one to the assumption that these fields are in a more advanced stage than the others from a research perspective. Research on innovation, especially, has been carried out extensively in the last 5 years. Analytics and society, too, have the most articles in 2014. A growing interest in societal questions can be observed as there are more articles in the last few years. The research interest on implications regarding whole societies is getting higher but is still a less mature field of research, e.g. in the field of changing labour markets due to more automation of tasks. Knowledge management, tourism, and marketing seem to be rather steady areas of research. Regarding DT in finance, the interest has decreased a little bit which indicates an advanced stage in this application field of digital technologies. As the total number of papers has grown significantly since 2006, there are no outstanding results before that time.

figure 5

Articles per research stream per year

In the following, the identified research streams are presented by highlighting important results and articles.

4.1 Finance

Within this research stream, three clusters were identified and named credit and risk management (cluster 1), artificial intelligence (AI) methods (cluster 10), and trading of investment certificates (cluster 16). The leading journal in this field is ‘Expert Systems with Applications’. Within the second cluster, the ‘European Journal of Operational Research’ and within the third cluster ‘Quantitative Finance’ are additional sources with a high number of articles related to the field.

In the first cluster, three articles from ‘Expert Systems with Applications’ rank highly, each with more than 150 citations. Regarding the in-degree, these articles are outstanding with values of six and five. Looking at the betweenness centrality, the articles by Tsai and Wu (2008) as well as Min and Lee (2005) show values above 1000. They are also the most cited. As “the performance of multiple classifiers in bankruptcy prediction and credit scoring is not fully understood,” Tsai and Wu (2008) propose to compare a single classifier with multiple classifiers and diversified multiple classifiers by using them on three different datasets.
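For readers unfamiliar with the two network measures used throughout this section, the following Python sketch computes in-degree and unnormalised betweenness centrality on a toy graph. It assumes a directed, unweighted network whose edges point from the citing to the cited publication and uses Brandes' algorithm for betweenness; it is an illustration and not a reproduction of the Gephi computation, and the graph and function names are hypothetical.

```python
from collections import deque


def in_degree(graph):
    """graph: dict mapping each node to a list of its successors."""
    degree = {v: 0 for v in graph}
    for successors in graph.values():
        for w in successors:
            degree[w] += 1
    return degree


def betweenness(graph):
    """Unnormalised betweenness for a directed, unweighted graph (Brandes)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        stack = []
        preds = {v: [] for v in graph}
        sigma = {v: 0 for v in graph}  # number of shortest paths from s
        dist = {v: -1 for v in graph}  # -1 marks "not yet reached"
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


# Toy graph (hypothetical): A and D both point to B, and B points to C.
graph = {"A": ["B"], "B": ["C"], "C": [], "D": ["B"]}
degrees = in_degree(graph)
centrality = betweenness(graph)
```

In the toy graph, B receives two incoming edges and lies on both shortest paths that pass through an intermediate node (A to C and D to C), so it has the highest in-degree and the only non-zero betweenness.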

In the second cluster, two articles from the ‘European Journal of Operational Research’ as well as ‘Information & Management’ have citations above 100. Looking further at in-degree and betweenness centrality the article from the ‘European Journal of Operational Research’ is outstanding with values of 11 as well as 1538. This article is written by Zhang et al. ( 1999 ) and provides a general framework for better understanding artificial neural networks. The authors show the advantage of neural networks over logistic regression and classification rate estimation, relating to the prediction of bankruptcy as well as robustness towards variation in the sample.

In the third cluster, the four most cited articles have between 20 and 30 citations. All are from ‘Expert Systems with Applications’. Looking at the betweenness centrality, two articles show values above 100. Booth et al. (2014) also have a high number of citations. In their work, they use seasonal effects and regularities in financial data to build an expert system based on random forest techniques that derives a trading strategy. The performance of the models is assessed by using data from the German Stock Exchange Index (DAX). In general, using seasonal effects has proven to produce superior results.

Compared to the other two clusters, this third cluster is smaller and the articles are newer. Specific algorithms still need to be applied in this area. Interestingly, Hsu et al. (2016) question the efficiency of financial markets. Views which financial economists have taken on markets for decades, such as Smith’s invisible hand, might have to be adjusted. All in all, the field of finance has already undergone significant changes and developments due to DT; in particular, forecasts that are useful for financial decisions can be made using algorithms. Technology enables the control of complex environments like financial markets. However, many unpredictable events still make forecasting difficult and lead to challenges for DT in the finance sector.

4.2 Marketing

The marketing stream focuses on three aspects: the use of virtual reality (VR) in marketing and sales (cluster 3), the possibilities to work with user-generated content to deduce sentiments and further data (cluster 5) and computer-assisted customer relationship management (cluster 19). For cluster 3, we dismissed topics regarding VR application for pedestrians and mere VR acceptance. The most cited article (288 times with betweenness centrality of 134) of cluster 3 is written by Coyle and Thorson (2001). This work deals with the perceptions towards websites and the influence of the characteristics vividness and interactivity. This work is closely tied to the work about the effects of different technologies on product ratings. Moreover, the ability to use reviews for further marketing and sales purposes is shown in this cluster (Singh et al. 2017; Ordenes et al. 2017; Sodero and Rabinovich 2017).

Cluster 19 is about customer relationship management (CRM) and technical implications using automated responses for service purposes. The analysis of the most used words within the keywords showed an accumulation of the fields of BD, user-generated content, and consumer. Cui et al. (2006) show the highest values of in-degree (3) and betweenness centrality (239) of cluster 19. The text deals with machine learning (ML) for direct marketing response to enable immediate response to customer inquiries.

The work of Das and Chen ( 2007 ) provides the highest in-degree (12) in cluster 5 and a betweenness centrality of 1133. The authors developed a methodology for extracting small investor sentiment from stock message boards. The content analysis of cluster 5 shows: BD, customer, social, marketing, and ML are the most used words of the keywords of cluster 5. In general, cluster 5 deals with articles about user-generated content and text mining systems that are used to gain additional information from the data. The analysis of user- or customer- generated data via reviews and the fast reaction of the enterprises play a vital role in this research stream. We identified several articles in all marketing clusters that focus on that topic and on response modelling (Kim et al. 2008 ). Furthermore, new technologies and opportunities like VR and AR enable new dimensions of online product presentation (Yim et al. 2017 ).

In summary, marketing activities are highly influenced by DT which opens up new possibilities of understanding customer behavior and placement of individually adapted advertising which is possible due to a huge amount of data created by the user or automatically generated data. A further need for research in the field of VR and AR for marketing purposes is identified. These technologies should be developed and enhanced to create a more sensual atmosphere.

4.3 Innovation

The clusters of this stream deal with business model innovation (cluster 18), adoption and diffusion of innovations (cluster 2), impact on the process of innovation and organizational learning (cluster 12) as well as strategic aspects of innovation in terms of, for example, search orientation and capabilities (cluster 20).

Cluster 18 is closely related to the manufacturing clusters for it deals with the industrial internet of things (IIoT). However, rather than investigating primarily manufacturing aspects of IIoT, studies in this cluster investigate the relationship between business model innovation and DT in general as well as IIoT in particular. The article with the highest in-degree (4) and 50 citations examines the effects of business model innovations triggered by the DT on accounting (Bhimani and Willcocks 2014 ). Other articles deal more strictly with the implications of IIoT for business models (Arnold et al. 2016 ) and how the new business models of the digital era can be identified and developed (Pisano et al. 2015 ; Najmaei 2016 ). Of particular interest is the emergence of these new business models in the context of the DT through entrepreneurship (Guo et al. 2017 ), as well as their more sustainable nature (Gerlitz 2016 ; Prause and Atari 2017 ).

While the technological focus of cluster 18 was on IIoT, cloud computing (CC) is the subject of cluster 2. In fact, the study in this cluster with the highest in-degree (7) and over 290 citations investigates determinants of its adoption. Oliveira et al. (2014) find significant differences in the determining factors between manufacturing and service firms. While adoption in manufacturing is driven by the relative advantages and cost savings of CC, service firms are more reluctant to adopt it due to the complexity of CC and require more top management support. In terms of theoretical frameworks, the technology acceptance model (TAM) is the most applied in this cluster (Gangwar 2016). One of the earlier studies integrates the TAM with marketing theory in order to explain firm adoption behavior regarding radical innovations like CC (Bohling et al. 2013). However, some studies also investigate combinations of theories (e.g., TAM and media richness) and technologies (e.g., CC and augmented reality) (Lin and Chen 2015).

Cluster 12 covers managerial challenges of the DT. For example, Khanagha et al. (2013) study the impact of management innovation on the adoption of emerging technologies. They show, based on an in-depth case study, that management innovations can provide the required changes in organizational structures that enable the adoption of emerging core technologies. Most importantly, it is argued that organizational routines that prevent early stage experimentation with the new technology need to be overturned as they can hinder knowledge accumulation. Other studies investigate the role of established management concepts like absorptive capacity (Lam et al. 2017; Trantopoulos et al. 2017) and ambidexterity (Khanagha et al. 2014). The managerial challenges during the innovation process most investigated by studies in this cluster are the changing opportunities and difficulties related to managing the customer and customer communities, in particular, managing customer co-creation and ideation (Hoornaert et al. 2017; Khanagha et al. 2017).

Cluster 20 also covers managerial challenges of the DT, but with a distinct focus on BD. The issues investigated regarding the relationship between management and BD range from human resources (Shah et al. 2017) through new product success (Xu et al. 2016) to firm performance and strategy (Akter et al. 2016; Mazzei and Noble 2017). The article with the highest in-degree (11) received 130 citations on Google Scholar at the time of analysis and uses the resource-based view of the firm to explain the outcome of BD usage for consumer analytics (Erevelles et al. 2016).

In summary, innovation is by nature an important research avenue to pursue in regards to digital transformation because the transformation process has to be innovative itself to be successful. DT implies implementing and using new technologies in combination with a cultural change of the whole organization. Innovation literature can contribute to developing effective ways to apply and utilize DT.

4.4 Knowledge management

The cluster knowledge management (cluster 7) focuses on aspects of knowledge management and strategy in the realm of digitalization. The journal that occurred most often in this cluster is the ‘Journal of Knowledge Management’ with one third of the articles published there, of which 57 percent were published in 2017. The most frequent keywords are big data, analytics and, for the content-related realms, knowledge management, intellectual capital, and performance. The article by Braganza et al. (2017) is the most cited article (in-degree = 2) with the highest betweenness centrality (168). They discuss the management of resources in BD initiatives and how to effectively introduce BD initiatives into companies.

We divided this cluster into two main areas as articles show tendencies towards (1) Knowledge Management as well as (2) Strategy .

(1) Knowledge Management is the primary topic focus of 13 articles. The major part of the cluster consists of articles focussing on digitalization in knowledge management. Among these papers, most (8) deal with BD and its use for knowledge management in companies. Half of the articles take a closer look at specific applications of BD in the realm of knowledge management. On the one hand, Fowler (2000) and Weber et al. (2001) focus more on use cases that involve AI and how it can “contribute to knowledge management solutions” (Weber et al. 2001, p. 17). On the other hand, Murray et al. (2016) as well as Uden and He (2017) take a look at IoT devices and how they can enhance knowledge management systems because of the data that are automatically generated. A strictly theoretical view can be found in Rothberg and Erickson (2017), who aim to bring together the existing theory from knowledge management, competitive intelligence and BD analytics. One article is quite critical of the use of BD and argues that “to describe it [BD in the context of knowledge management] as ‘revolutionary’ is premature” (Tian 2017, p. 113).

(2) Strategy is investigated by eight articles. The strategy topics can be divided into three subareas. Two articles focus heavily on decision making and how BD can be of use (Prescott 2014 ; O’Flaherty and Heavin 2015 ), while another two articles deal with text mining techniques and their impact on business strategy (Li et al. 2012 ; Zhang et al. 2016 ). Moreover, four articles investigate performance aspects of BD in relation to business strategy (Cleary and Quinn 2016 ; Tian 2017 ; Blackburn et al. 2017 ). This performance perspective includes papers that show how BD can help to improve the understanding of purchasing decisions (Tian 2017 ). It can also be seen how BD affects operation models (Roden et al. 2017 ), and whether BD might affect R&D Management (Blackburn et al. 2017 ), as well as “how the use of cloud-based accounting/finance infrastructure affects the business performance of small and medium-sized enterprises” (Cleary and Quinn 2016 , p. 225).

Braganza et al. ( 2017 ) propose to utilize theories drawn from strategy and leadership fields. Deeper insights on how strategies are changing and still need to change are missing. Moreover, as business models are already studied in-depth regarding DT, concrete application scenarios would be useful.

4.5 Analytics and data management

Seventy percent of the articles in the Analytics and Data Management cluster are published in 2017. We further subclassify the publications in four major realms:

(1) Operations and supply chain management: in addition to BD and analytics, the enhancement of supply chain processes and, ultimately, performance are important areas of study. Bag (2017) shows empirically the positive relationship between BD, predictive analytics, and supply chain performance. Rajesh (2016) presents and tests a prediction model to forecast supply chain resilience performance. For an extensive literature review, see Lamba and Singh (2017). Tan et al. (2015) propose an analytic infrastructure to assist firms in capturing the potential of supply chain innovation afforded by data. This is also the article with the second highest values for in-degree (12) and betweenness centrality (764). Ji et al. (2017) present an example of how BD in the food chain can be combined with Bayesian network and deduction graph models to guide production decisions.

The second significant research realm is in the context of (2) innovation and operations management . Furthermore, articles dealing with application and exploitation of BD to create competitive advantage and value in business are studied. For instance, Barton and Court ( 2012 ), also the most cited article in this cluster (in-degree: 26), present a practical perspective on how to improve companies’ performance with advanced analytics. Zhan et al. ( 2017 ) suggest how firms could use BD to facilitate product innovation processes. Moreover, Tan and Zhan ( 2017 ) present three principles related to BD which support new product development.

Another noteworthy topic is (3) analytics to improve decision-making in management. For example, Horita et al. (2017) present a framework that connects decision-making with data sources through an extended modelling notation and modelling process.

The last realm refers to (4) data analytic techniques and quality framework of data management systems. Zhang et al. (2015) discuss specific techniques for modelling BD and analytics in the context of computational efficiency. Others present explicit analytical modelling for designated business fields, such as quality control in manufacturing (He et al. 2016).

We conclude that “successfully introducing analytics requires substantial organizational transformation” (Dremel et al. 2017 ). Management decisions supported by BD analytics depend on the underlying data quality. With the highest values on in-degree (12) and betweenness centrality (3108), the article from Hazen et al. ( 2014 ) contributes to the data quality problem within the supply chain management context. Lamba and Singh ( 2017 ) see a lack of data analytics techniques and works which can suggest the practical implementation of BD. For future research, it is suggested one consult, for example, Sivarajah et al. ( 2017 ). How to analyse and use data effectively is still a topic with growing interest in research and a big challenge for practice.

4.6 Manufacturing

The research stream manufacturing is represented by three sub-clusters that deal with the fields of cloud manufacturing , strategic implications for manufacturing and logistics .

Cluster 4 is quite diverse. We excluded specialized topics in the field of space science (Metzger 2016 ), mobile services (Qi et al. 2014 ) and football robots (Bi et al. 2017 ). Among representative works within this cluster, a visualization platform for IoT to control and monitor wireless sensor networks (Bi et al. 2016 ), resource allocation (Pillai and Rao 2016 ) and resource bundling (Guo et al. 2016 ) are examined. Moreover, strategic issues are discussed (Li et al. 2012 ; Guggenheim 2016 ). One particularly strategic article dealing with information architecture in the context of supply chain management (Xu 2011 ) has a very high betweenness centrality (number six and seven of the whole sample). Xu ( 2011 ) is also cited 124 times.

Cluster 17 has a focus on cloud-manufacturing (also the most mentioned keyword). The ‘International Journal of Computer Integrated Manufacturing’ focuses on topics in this area and published most of the articles of the cluster. Cloud-manufacturing means that the principles of cloud computing are transferred to manufacturing concerns, so that related manufacturing resources are offered as services, which leads to a network for exchanging needed resources and products. This application of DT can optimize processes, as shown in an example of sheet metal processing (Helo and Hao 2017). Frameworks for building a cloud manufacturing solution (Cheng et al. 2016; Lu and Xu 2017) and the design of the network architecture (Škulj et al. 2015) are presented and discussed. Moreover, the communication between machines in different companies is a necessary condition to make cloud-manufacturing a success. Therefore, a scheduling model was developed to efficiently exploit distributed resources (Li et al. 2017).

Cluster 22 is the smallest of all clusters in the sample. It includes articles on manufacturing whereas it exhibits limited focus on logistics topics. Most articles were published in the ‘International Journal of Production Research’. The most cited article of the cluster, with 43 citations, is also the one with the highest betweenness centrality. Reaidy et al. (2015) and Zhong et al. (2017) show that RFID technology is especially useful in warehouses to track resources and to connect objects. Advantages of the aforementioned communication technologies in smart logistics, such as higher safety, are shown (Trab et al. 2017). Moreover, applications of technologies are demonstrated, such as the development of an algorithm to optimize truck docking (Miao et al. 2014).

Smart factories, as well as smart industry (Haverkort and Zimmermann 2017 ), are popular areas of research which are shaped by examples from practical applications. Machines, information systems and workers become more connected. The future factory is decentralized and can produce diverse products in a short time period. The topic of DT is getting more and more important for the manufacturing industry.

4.7 Supply chain management

Two of the identified clusters were allocated to the topic supply chain management (SCM). The importance of the topic was extraordinarily high in the years between 2010 and 2014 when more than 100 articles were published.

The clusters differ especially in their technological focus. These are supply chain and CC for cluster 15 as well as supply chain and BD for cluster 21. Cluster 15 deals with the adoption and usage of one of the central technologies in DT, cloud computing, in the context of supply chain management. Empirical results show a positive effect of the technology on supply chain integration (Bruque Cámara et al. 2015; Bruque-Cámara et al. 2016), which also leads to higher operational performance. This fostering effect on collaborations is also examined by other authors in different contexts like manufacturing and humanitarian organizations (Schniederjans and Hales 2016; Yu et al. 2017). The highest betweenness centrality and the highest total number of times cited can be observed for the article by Cegielski et al. (2012), which deals with the adoption of CC in supply chains. A few other technologies are also discussed in the context of SCM. O’Donnell et al. (2009) develop a generic algorithm to reduce the bullwhip effect, and Cantor (2016) examines effects of work monitoring technologies. The author with the most articles in this cluster is Dara Schniederjans, who published four of the 20 papers.

Cluster 21 has a focus on the use of BD in SCM. Benefits like a higher supply chain visibility and transparency, along with challenges like the balance between humans and analytics management styles are shown (Waller and Fawcett 2013 ; Dutta and Bose 2015 ; Kache and Seuring 2017 ). The article of Waller and Fawcett ( 2013 ) is in total cited 95 times as they give a broad overview of BD in SCM and define critical terms in this area. Two very famous authors in the area of DT also occur in this cluster with an article on BD impacts (McAfee and Brynjolfsson 2012 ). The reputation can be seen by the in-degree of 75 and total times cited of 387.

In sum, collaborations between firms in supply chains are identified as one primary driver of DT (Liere-Netheler et al. 2018 ) as borders between enterprises are known to blur (Lucke et al. 2008 ). This means that technologies should support this change in the supply chains. Two of the significant technologies which lead to more exchange of data are CC and BD. Wieland et al. ( 2016 ) identified BD and analytics as an overestimated research theme in the next 5 years which is in accordance with our findings. Topics like people dimensions, ethical issues, and integration are underestimated as DT also includes a cultural change in companies and the whole supply chain. Moreover, the exchange of data is still an open question. Security and legal aspects are especially unclear (Richey et al. 2016 ).

4.8 Society

Cluster 8 contains 23 articles. An article from Boyd and Crawford ( 2012 ) has the highest betweenness centrality (2727) and the highest in-degree (37). Besides keywords from the digital context (BD, algorithms, and technology), the most frequently used keywords were social, communication, governance and epistemology . Hence, we further sub-classify the articles in three major realms:

(1) Society and communication: Articles in this realm deal with topics like an ‘analytic culture’ (Gano 2015), data-driven urban geographical imaginaries and understandings (Lake 2017; Shelton 2017), ‘datafication’ of daily life (Madsen et al. 2016), and the monetization of user data (Doyle 2015). Other topics include data-journalism (Parasie 2015), data protection (MacDonnell 2015), impacts of socio-technical systems (Carolan 2017), or BD as communication with targeted audiences in a social and cultural context (Holtzhausen 2016). Furthermore, we find articles that take a technical communication perspective, in which BD is found to ignore the crucial roles of interpretation and communication (Frith 2017).

(2) Policy and international: most of the articles in this realm take a critical view on digitalization (Chandler 2015). For example, Sanders and Sheptycki discuss stochastic governance, “defined as the governance of populations and territory using statistical representations based on the manipulation of BD” (2017, p. 2), working towards a critique of the moral economy of neo-liberalism. A considerable number of articles deal with the topic ‘algorithmic governance’/‘datafication-governance’ (e.g. Chandler 2015; Madsen et al. 2016; Rothe 2017). Rothe (2017), for example, highlights the role of visual technologies and discusses the construction of environmental security as a form of ontological politics.

(3) Philosophy and ethics: Lake (2017) integrates an epistemological view and discusses BD and urban governance in a democratic society based on an ontological approach. He concludes that BD leads to atomistic behaviour in management and thus “undermines the contribution of urban complexity as a resource for governance […]” (Lake 2017, p. 1). Furthermore, we find articles that provide critiques of the efficacy of BD approaches (Lowrie 2017) and of the hidden, positivist assumptions behind the movement (labelled techno positivism; e.g., Gano 2015). Criticisms of technological solutions and BD are also discussed, such as surveillance of the population (Heath-Kelly 2017). Furthermore, articles reflecting on how BD affects people as psychological beings are found (Raab 2015). The predicament of living in a networked world and being partly unable to sufficiently grasp the implications thereof is discussed epistemologically (Van Den Eede 2016).

In summary, the cluster provides multidisciplinary approaches on the impact of DT on society, and most of the articles engage with BD and digital technologies from critical positions. In the work of Madsen et al. ( 2016 ), we find a research agenda for future research on BD within international political sociology. An important field for further studies is the importance of theory-driven data production. From a societal point of view, DT needs to be considered as a possibility for advancement but also, and probably more important, risks need to be taken into account so that no people will be left behind.

4.9 Tourism

The cluster tourism deals with research articles in the cross-area of tourism and social media. Starting from the year 2000, there was a peak in 2012 (116 articles), whereas in 2016 only 28 articles were published. A content analysis showed that besides the tourism aspects (tourism, destination, marketing), the most frequently used keywords from the digital context were Facebook, social media and data analytics.

We identified only two journals that provided more than one source: ‘Journal of Destination Marketing & Management’ (5 articles) and the ‘Journal of Tourism Management’ (2 publications). Only one author contributed more than one article (Kwok and Yu 2013 , 2016 ). Both articles deal with the consumer communication via Facebook. Furthermore, the article of Kwok and Yu ( 2013 )—an analysis of restaurant business-to-consumer communications—was one of the most cited articles in this cluster. Only Fuchs et al. ( 2014 ) with six citations and Xiang et al. ( 2015 ) with seven citations provided a higher in-degree. The research is about BD analysis in the field of hotel guest experience.

We aligned the articles to dominant fields of interest: destination management (Fuchs et al. 2014; Raun et al. 2016) and geospatial data (Supak et al. 2015) to improve the touristic attractiveness of an area. A further sub-cluster is the research on the use of forums, customer recommendations and consumer-to-consumer communication. Dominant research focuses on text mining and how user-generated content influences the success of tourism organizations and the feelings of customers (Xiang et al. 2015; Ksiazek 2015; Kim et al. 2017). The last sub-cluster deals with the use of social media for marketing purposes in this field (Buhalis and Foerste 2015; Hornik 2016).

In summary, the influence of consumers and peers has increased due to DT. Digital (user-generated) data is increasingly used for analytical purposes, such as text mining and sentiment analysis. Surprisingly, trust plays no critical role in the field of user-generated content. We assume this topic is linked more closely to specific marketing research. Moreover, DT has led to a change of the whole industry, as a large share of purchasing activities has shifted from travel agencies to online booking.

5 Research agenda for DT

During the analysis of all research streams, two major research directions were present. On the one hand individualization with an increasing influence of individual interaction like customer-created content or individual production is recognized. On the other hand, we sense a shift for widespread technology use where computer-controlled workflows impede human interaction as e.g., in smart production or automated decision support. Though we carefully, and by consensus of the involved researchers, named the clusters and streams by using keywords of related articles, we detect some research deficiencies in the areas of accounting and human resource management, as well as in sustainability in combination with the mentioned fields of interest. This does not necessarily mean that there is no research in this area; rather it indicates research regarding these topics is relatively small concerning our sample. So, the topics are not closely connected in research yet. For example, research streams about the integration of human resource management and IT exist (Bondarouk and Ruël 2009 ). However, a deeper understanding of the consequences of e-human resource on the human resource organization, more particularly an understanding of the phenomenon of e-human resource management and its multilevel consequences within and across organizations, is still lacking (Bondarouk and Ruël 2009 ). Recently, Gepp et al. ( 2018 ) reviewed existing research on BD in accounting and finance supporting our finding that the research stream in auditing is still lagging behind. This indicates future research directions and, as Gepp et al. ( 2018 ) postulate, a greater alignment to practice.

Nearly all research recommendations of the defined clusters appreciate further investigations regarding the future application and impact of digital technologies. Some examples of research gaps, resulting from the analysis of the streams, are presented in Table  3 . Further research in all clusters is required for all technologies associated with DT. We have explicitly identified the need for research in the area of big data analytics in the clusters of marketing, knowledge management, manufacturing and society. For example, a specific linking of data with other applications such as business data or social media, as well as the combination of machine-generated data and customer information, is still new and demanding. These could lead to major efficiency gains and might also simplify lives. To study how these gains can be achieved, empirical research requires more focus. Using in-depth case studies is an appropriate method because case studies can highlight best practices. Both opportunities and threats should be identified, defined and evaluated. Still ethical questions coming along with the accessibility of semi-public or public data for researchers and the other parties (e.g. industry, politics) are not yet sufficiently investigated. Research on the development of mathematical models for the application of BD and for machine learning to support decision making needs to be further focused.

The use of blockchains is also an issue. Many possible use scenarios are still to be discovered and tested. A search in the Web of Science Core Collection with the keyword blockchain, restricted to the areas of business and management and to a time horizon of 2017 and before, shows 32 results. Seventeen of these results are not cited by other sources. “The Truth about Blockchain” (Iansiti and Lakhani 2017), published in HBR in 2017, is cited 41 times, which is the highest number of citations. This might be an indicator of the future importance of this topic in business research.

In general, we emphasize a demand for more case studies describing the benefits, values and weaknesses of DT implementations in all clusters. In order to align the applications of DT with traditional research, the basic models should be tested for their suitability for the new, changed world. Furthermore, researchers advise caution with regard to the security and safety of the data produced and collected. Only the cluster society provided research about possible negative implications. We assume the proclaimed digital revolution is a slow process and certainly not over yet. The implications for culture and society will be enormous, so further work integrating the cultural, technological and business levels would be appreciated. Furthermore, long-term studies will show the real impact of the DT trend. Researchers may answer the major question for all clusters: How much of the enthusiasm is due to the novelty of the technology itself and how great are the long-term benefits? Moreover, the theorization of DT in general is not clear yet. First studies are emerging that collect different definitions (Morakanyane et al. 2017). However, we do not see a conceptualization that is used across disciplines. Besides the definitions, characteristics as well as frameworks on DT are necessary.

6 Conclusion and limitations

In sum, our study gives a holistic overview of topics in DT research. We aimed at identifying major research streams and possible gaps for further research. Nine main streams were discussed by giving an overall picture of the sample. Moreover, all relevant streams were presented in detail to get an overview of the fields. The study is based on a structured literature review, combined with a citation network analysis, which enables us to deal with a huge amount of literature. This work aims at a broad overview of recent research on DT in business. Many articles discuss the application of digital technologies to support or refine business (e.g., VR in tourism, marketing, and manufacturing). The three dominant areas in our database are finance, marketing and innovation management. The technological fields in focus in the articles are the internet of things, big data, cloud computing and artificial intelligence. Especially in the field of finance, new abilities to work with big data and analytics for trading and predicting markets shape the research field. Data management methods and the application of data analysis methods become more important, as they can be used for prediction and prognosis of, e.g., bankruptcy. In the field of the production industry, the topic of cloud manufacturing is gaining more and more attention.

We recognize that our study has limitations. By definition, a literature review rests on existing and accessible research studies. Although we conducted a thorough literature search through the ISI Web of Science to identify all relevant articles according to our search terms, it cannot be excluded that some articles from other leading databases (e.g. Scopus and EBSCO) have been missed. However, WoS is considered the most comprehensive database and is thus frequently used in management and IS research (Schryen 2015). Another limitation lies in the definition of the research objectives and selection terms. It is possible that our systematic literature review cannot cover exhaustively the vast field of research. This possibility is especially relevant as different technologies regarding DT are included in the study. Thus, the findings are limited to these technologies. However, having conducted several search loops, iteratively refining the search terms and checking after each loop that the search stream fits our research question, the authors are quite confident that this research is robust, as every effort was taken to mitigate error. Additionally, the qualitative analysis and cluster descriptions are based on the research team’s interpretation of the selected research articles. By conducting a two-step cluster evaluation process (first cross-checking articles independently, then reviewing clusters in an author team of five heterogeneous researchers), we addressed this embedded bias. Moreover, we use a citation network analysis. Compared to other literature review approaches, the network analysis does not focus on a special field within DT research. Thus, we were able to study the field of DT from a more holistic perspective and provide implications from a broad literature base and an overview of the current state. Moreover, this study points to future directions in the field.

Besides these limitations, we continuously reflected on the procedure during the research process, which resulted in two major questions: (1) How consolidated is the body of literature? (2) How do we consolidate the body of literature in an adequate research procedure?

(1) For the first question, we assume that many clusters arose from the business perspective. However, we also identified clusters with very little connection to management topics such as health care (cluster 14). This cluster contains two management-related articles (Bental et al. 1999; Brown et al. 2015). Therefore, we excluded health care from an in-depth analysis. Other clusters focus on technology or the method (e.g., cluster 1). Therefore, an alternative means of analysis could be to focus on streams of technology instead of streams of business disciplines, or a combined analysis with a matrix approach. Moreover, our research approach is limited due to the search terms used.

(2) For the second question, we chose a combination of quantitative and qualitative approaches to arrive at an appropriate and representative number of articles. Discussions and rounds of consensus within the research team ensured a minimal amount of subjectivity. For the selection of clusters, we decided on an absolute approach and selected the largest 10%. Alternative solutions could include relative approaches, like using k-means (Jain 2010) or other measurements. The cluster trust index showed that most clusters kept over 50 percent of the assigned articles after the manual qualitative analysis. For this reason, we consider the citation network analysis based on the tool Gephi a valuable procedure. In some way, our approach is an example of DT in research, as we worked with a digital-based dataset and presented an exemplary way to work with the rapidly growing amounts of research literature data. With our work, we hope to encourage researchers to recognize the threats, continue the research about DT in business, and examine the advantages of the digital change. Moreover, in showing a holistic approach to DT research, our results can be regarded as the first step to foster researchers’ adaptive expertise to understand and combine results and procedures from different fields (Boon et al. 2019). For future research, we encourage a mutual interchange of findings from corresponding research streams, as we showed with our study.
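The size-based selection of clusters can be sketched as follows. This is only one possible reading of "the largest 10%": it assumes that the share refers to the number of clusters, that the count is rounded up, and that ties are broken by cluster name. None of these details is stated in this section, and the cluster names and sizes are hypothetical.

```python
def largest_clusters(cluster_sizes, percent=10):
    """Return the names of the largest `percent` of clusters, biggest first."""
    # Ceiling division in integers avoids floating-point rounding surprises.
    n_keep = (len(cluster_sizes) * percent + 99) // 100
    ranked = sorted(cluster_sizes, key=lambda name: (-cluster_sizes[name], name))
    return ranked[:n_keep]


# Hypothetical sizes (number of articles) for 20 detected clusters.
sizes = [300, 12, 45, 210, 8, 33, 150, 5, 27, 90,
         3, 61, 18, 7, 110, 22, 4, 75, 9, 14]
cluster_sizes = {f"cluster_{i}": n for i, n in enumerate(sizes, start=1)}

selected = largest_clusters(cluster_sizes)
print(selected)
```

With 20 hypothetical clusters, this rule keeps the two largest ones. A relative approach, as mentioned above, would instead derive the cut-off from the size distribution itself.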

We calculate the cluster trust index (CTI) as CTI = QA/Found. For example, for the cluster “Analytics” this would be: 30/37 = 0.81.
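The CTI calculation can be written as a short sketch. It reads QA as the number of articles that remained in a cluster after the manual qualitative analysis and Found as the number of articles originally assigned to it, which is an interpretation of the abbreviations; the function name and the input checks are illustrative additions.

```python
def cluster_trust_index(qa, found):
    """CTI = QA / Found: share of a cluster's assigned articles kept after review."""
    if found <= 0:
        raise ValueError("found must be a positive number of articles")
    if not 0 <= qa <= found:
        raise ValueError("qa must lie between 0 and found")
    return qa / found


# Worked example from the text: cluster "Analytics" with QA = 30 and Found = 37.
cti_analytics = cluster_trust_index(30, 37)
print(round(cti_analytics, 2))

# The cluster kept more than half of its assigned articles.
print(cti_analytics > 0.5)
```

For the “Analytics” cluster this reproduces the value of 0.81 given above, which lies well above the 50 percent mark mentioned in the discussion of the cluster trust index.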

Akter S, Wamba SF, Gunasekaran A et al (2016) How to improve firm performance using big data analytics capability and business strategy alignment? Int J Prod Econ 182:113–131

Albort-Morant G, Ribeiro-Soriano D (2016) A bibliometric analysis of international impact of business incubators. J Bus Res 69:1775–1779. https://doi.org/10.1016/j.jbusres.2015.10.054

Allcott H, Gentzkow M (2017) Social media and fake news in the 2016 election. J Econ Perspect 31:211–236

Arbelaitz O, Gurrutxaga I, Muguerza J et al (2013) An extensive comparative study of cluster validity indices. Pattern Recogn 46:243–256. https://doi.org/10.1016/j.patcog.2012.07.021

Arnold C, Kiel D, Voigt K-I (2016) How the industrial internet of things changes business models in different manufacturing industries. Int J Innov Manag 20:1640015

Ashton K (2009) That “Internet of Things” thing. RFiD J 22:97–114

Bag S (2017) Big data and predictive analysis is key to superior supply chain performance: a South African experience. Int J Inf Syst Supply Chain Manag 10:66–84. https://doi.org/10.4018/IJISSCM.2017040104

Barton D, Court D (2012) Making advanced analytics work for you. In: Harvard business review. https://hbr.org/2012/10/making-advanced-analytics-work-for-you . Accessed 14 Feb 2018

Bental DS, Cawsey A, Jones R (1999) Patient information systems that tailor to the individual. Patient Educ Couns 36:171–180. https://doi.org/10.1016/S0738-3991(98)00133-5

Bharadwaj A, El Sawy OA, Pavlou PA, Venkatraman N (2013) Digital business strategy: toward a next generation of insights. MIS Q 37(2):471–482

Bhimani A, Willcocks L (2014) Digitisation, ‘Big Data’ and the transformation of accounting information. Account Bus Res 44:469–490

Bi Z, Wang G, Xu LD (2016) A visualization platform for internet of things in manufacturing applications. Internet Res 26:377–401. https://doi.org/10.1108/IntR-02-2014-0043

Bi Z, Wang G, Xu LD et al (2017) IoT-based system for communication and coordination of football robot team. Internet Res 27:162–181. https://doi.org/10.1108/IntR-02-2016-0056

Blackburn M, Alexander J, Legan JD, Klabjan D (2017) Big data and the future of R&D management: the rise of big data and big data analytics will have significant implications for R&D and innovation management in the next decade. Res Technol Manag 60:43–51. https://doi.org/10.1080/08956308.2017.1348135

Bley K, Leyh C, Schäffer T (2016) Digitization of German enterprises in the production Sector-Do they know how “digitized” they are? In: Americas Conference on Information Systems (AMCIS)

Bohling TR, Kumar V, Shah R (2013) Predicting purchase timing, product choice, and purchase amount for a firms adoption of a radically innovative information technology: an analysis of cloud computing services. Serv Sci 5:102–123

Bondarouk TV, Ruël HJM (2009) Electronic human resource management: challenges in the digital era. Int J Hum Resour Manag 20:505–514. https://doi.org/10.1080/09585190802707235

Boon M, van Baalen S, Groenier M (2019) Interdisciplinary expertise in medical practice: challenges of using and producing knowledge in complex problem-solving. Med Teach. https://doi.org/10.1080/0142159X.2018.1544417

Booth A, Gerding E, McGroarty F (2014) Automated trading with performance weighted random forests and seasonality. Expert Syst Appl 41:3651–3661

Boyack KW, Klavans R (2010) Co-citation analysis, bibliographic coupling, and direct citation: which citation approach represents the research front most accurately? J Am Soc Inform Sci Technol 61:2389–2404. https://doi.org/10.1002/asi.21419

Boyd D, Crawford K (2012) Critical questions for big data: provocations for a cultural, technological, and scholarly phenomenon. Inf Commun Soc 15:662–679. https://doi.org/10.1080/1369118X.2012.678878

Braganza A, Brooks L, Nepelski D et al (2017) Resource management in big data initiatives: processes and dynamic capabilities. J Bus Res 70:328–337. https://doi.org/10.1016/j.jbusres.2016.08.006

Brocke J, Simons A, Niehaves B et al (2009) RECONSTRUCTING THE GIANT: ON THE IMPORTANCE OF RIGOUR IN DOCUMENTING THE LITERATURE SEARCH PROCESS. ECIS 2009 Proceedings

Brown NJ, David M, Cuttle L et al (2015) Cost-effectiveness of a nonpharmacological intervention in pediatric burn care. Value Health 18:631–637. https://doi.org/10.1016/j.jval.2015.04.011

Bruque Cámara S, Moyano Fuentes J, Maqueira Marín JM (2015) Cloud computing, Web 2.0, and operational performance: the mediating role of supply chain integration. Int J Logist Manag 26:426–458. https://doi.org/10.1108/IJLM-07-2013-0085

Bruque-Cámara S, Moyano-Fuentes J, Maqueira-Marín JM (2016) Supply chain integration through community cloud: effects on operational performance. J Purch Supply Manag 22:141–153. https://doi.org/10.1016/j.pursup.2016.04.003

Brynjolfsson E, McAfee A (2014) The second machine age: work, progress, and prosperity in a time of brilliant technologies, 1st edn. W. W. Norton & Company, New York

Buhalis D, Foerste M (2015) SoCoMo marketing for travel and tourism: empowering co-creation of value. J Destin Mark Manag 4:151–161. https://doi.org/10.1016/j.jdmm.2015.04.001

Cantor DE (2016) Maximizing the potential of contemporary workplace monitoring: techno-cultural developments, transactive memory, and management planning. J Bus Logist 37:18–25. https://doi.org/10.1111/jbl.12115

Carolan M (2017) Publicising food: big data, precision agriculture, and co-experimental techniques of addition: publicising f ood. Sociol Rural 57:135–154. https://doi.org/10.1111/soru.12120

Cegielski CG, Allison Jones-Farmer L, Wu Y, Hazen BT (2012) Adoption of cloud computing technologies in supply chains: an organizational information processing theory approach. Int J Logist Manag 23:184–211. https://doi.org/10.1108/09574091211265350

Chandler D (2015) A world without causation: big data and the coming of age of posthumanism. Millenn J Int Stud 43:833–851. https://doi.org/10.1177/0305829815576817

Chen Y-F (2014) See you on Facebook: exploring influences on Facebook continuous usage. Behav Inf Technol 33:1208–1218. https://doi.org/10.1080/0144929X.2013.826737

Cheng G, Liu L, Qiang X, Liu Y (2016) Industry 4.0 development and application of intelligent manufacturing. In: 2016 international conference on information system and artificial intelligence (ISAI). pp 407–410

Cleary P, Quinn M (2016) Intellectual capital and business performance: an exploratory study of the impact of cloud-based accounting and finance infrastructure. J Intellect Cap 17:255–278. https://doi.org/10.1108/JIC-06-2015-0058

Coyle JR, Thorson E (2001) The effects of progressive levels of interactivity and vividness in web marketing sites. J Advert 30:65–77

Cui G, Wong ML, Lui H-K (2006) Machine learning for direct marketing response models: Bayesian networks with evolutionary programming. Manag Sci 52:597–612. https://doi.org/10.1287/mnsc.1060.0514

Dahlander L, Gann DM (2010) How open is innovation? Res Policy 39:699–709. https://doi.org/10.1016/j.respol.2010.01.013

Das SR, Chen MY (2007) Yahoo! for Amazon: sentiment extraction from small talk on the web. Manag Sci 53:1375–1388. https://doi.org/10.1287/mnsc.1070.0704

Devaraj S, Kohli R (2003) Performance impacts of information technology: is actual usage the missing link? Manag Sci 49:60–95

Downes L, Nunes P (2013) Big bang disruption. Harvard Bus Rev 91(3):44–56

Doyle K (2015) Facebook, Whatsapp and the commodification of affective labour (APAFT)—informit. Commun Politics Cult 48:51–65

Dremel Christian, Wulf Jochen, Herterich Matthias M, Waizmann Jean-Claude, Brenner Walter (2017) How AUDI AG established big data analytics in its digital transformation. MIS Q Executive 16(2):81–100

Dutta D, Bose I (2015) Managing a big data project: the case of ramco cements limited. Int J Prod Econ 165:293–306. https://doi.org/10.1016/j.ijpe.2014.12.032

Erevelles S, Fukawa N, Swayne L (2016) Big data consumer analytics and the transformation of marketing. J Bus Res 69:897–904

Fink A (2005) Conducting research literature reviews: from the Internet to paper, 2nd edn. Sage Publications, Thousand Oaks

Fitzgerald M, Kruschwitz N, Bonnet D, Welch M (2014) Embracing digital technology: a new strategic imperative. MIT Sloan Manag Rev 55:1–12

Fowler A (2000) The role of AI-based technology in support of the knowledge management value activity cycle. J Strateg Inf Syst 9:107–128. https://doi.org/10.1016/S0963-8687(00)00041-X

Frey CB, Osborne MA (2017) The future of employment: how susceptible are jobs to computerisation? Technol Forecast Soc Chang 114:254–280

Frith J (2017) Big data, technical communication, and the smart city. J Bus Tech Commun 31:168–187. https://doi.org/10.1177/1050651916682285

Fuchs M, Höpken W, Lexhagen M (2014) Big data analytics for knowledge generation in tourism destinations—a case from Sweden. J Destin Mark Manag 3:198–209. https://doi.org/10.1016/j.jdmm.2014.08.002

Gangwar H (2016) Understanding cloud computing adoption: a model comparison approach. Hum Syst Manag 35:93–114

Gano G (2015) Starting with Universe: Buckminster Fuller’s design science now. Futures 70:56–64. https://doi.org/10.1016/j.futures.2014.12.011

Gepp A, Linnenluecke MK, O’Neill TJ, Smith T (2018) Big data techniques in auditing research and practice: current trends and future opportunities. J Account Lit 40:102–115

Gerlitz L (2016) Design management as a domain of smart and sustainable enterprise: business modelling for innovation and smart growth in Industry 4.0. Entrepr Sustain Issues 3:244–268

Gimpel H, Röglinger M (2015) Digital transformation : changes and chances? Insights based on an empirical study. Fraunhofer Institute for Applied Information Technology FIT, Bayreuth

Greengard S (2016) Cybersecurity gets smart. Commun ACM 59:29–31

Gross A, Solymossy E (2016) Generations of business information, 1937–2012: moving from data bits to intelligence. Inf Cult 51:226–248. https://doi.org/10.7560/IC51204

Guggenheim D (2016) The collision of indeterminate environments and porter’s forces: uncertainty fields and their impact on entrepreneurial alertness. Strateg Change 25:239–257. https://doi.org/10.1002/jsc.2058

Guo R, Cai L, Zhang W (2016) Effectuation and causation in new internet venture growth: the mediating effect of resource bundling strategy. Internet Res 26:460–483. https://doi.org/10.1108/IntR-01-2015-0003

Guo L, Wei YS, Sharma R, Rong K (2017) Investigating e-business models’ value retention for start-ups: the moderating role of venture capital investment intensity. Int J Prod Econ 186:33–45. https://doi.org/10.1016/j.ijpe.2017.01.021

Hausberg JP, Korreck S (2018) Business incubators and accelerators: a co-citation analysis-based, systematic literature review. J Technol Transf. https://doi.org/10.1007/s10961-018-9651-y

Haverkort BR, Zimmermann A (2017) Smart industry: how ICT will change the game! IEEE Internet Comput 21:8–10. https://doi.org/10.1109/MIC.2017.22

Hazen BT, Boone CA, Ezell JD, Jones-Farmer LA (2014) Data quality for data science, predictive analytics, and big data in supply chain management: an introduction to the problem and suggestions for research and applications. Int J Prod Econ 154:72–80. https://doi.org/10.1016/j.ijpe.2014.04.018

He Y, Wang L, He Z, Xiao X (2016) Modelling infant failure rate of electromechanical products with multilayered quality variations from manufacturing process. Int J Prod Res 54:6594–6612. https://doi.org/10.1080/00207543.2016.1154215

Heath-Kelly C (2017) Algorithmic autoimmunity in the NHS: radicalisation and the clinic. SECUR DIALOGUE 48:29–45. https://doi.org/10.1177/0967010616671642

Helo P, Hao Y (2017) Cloud manufacturing system for sheet metal processing. Prod Plan Control 28:524–537. https://doi.org/10.1080/09537287.2017.1309714

Hinings B, Gegenhuber T, Greenwood R (2018) Digital innovation and transformation: an institutional perspective. Inf Organ 28:52–61

Hirsch-Kreinsen H (2015) Digitalisierung von Arbeit: Folgen, Grenzen und Perspektiven

Hirsch-Kreinsen H, ten Hompel M (2017) Digitalisierung industrieller Arbeit: Entwicklungsperspektiven und Gestaltungsansätze. In: Vogel-Heuser B, Bauernhansl T, ten Hompel M (eds) Handbuch Industrie 4.0 Bd.3. Springer, Berlin, pp 357–376

Holtzhausen D (2016) Datafication: threat or opportunity for communication in the public sphere? J Commun Manag 20:21–36. https://doi.org/10.1108/JCOM-12-2014-0082

Hoornaert S, Ballings M, Malthouse EC, Van den Poel D (2017) Identifying new product ideas: waiting for the wisdom of the crowd or screening ideas in real time. J Prod Innov Manag 34:580–597

Horita FEA, de Albuquerque JP, Marchezini V, Mendiondo EM (2017) Bridging the gap between decision-making and emerging big data sources: An application of a model-based framework to disaster management in Brazil. Decis Support Syst 97:12–22. https://doi.org/10.1016/j.dss.2017.03.001

Hornik R (2016) Measuring campaign message exposure and public communication environment exposure: some implications of the distinction in the context of social media. Commun Methods Meas 10:167–169. https://doi.org/10.1080/19312458.2016.1150976

Hsu M-W, Lessmann S, Sung M-C et al (2016) Bridging the divide in financial market forecasting: machine learners vs. financial economists. Expert Syst Appl 61:215–234

Huggins R, Izushi H (2011) Competition, competitive advantage, and clusters: the ideas of Michael Porter. Oxford University Press, Oxford

Iansiti M, Lakhani KR (2017) The truth about blockchain. Harv Bus Rev 95:118–127

Jain AK (2010) Data clustering: 50 years beyond K-means. Pattern Recogn Lett 31:651–666. https://doi.org/10.1016/j.patrec.2009.09.011

Ji G, Hu L, Tan KH (2017) A study on decision-making of food supply chain based on big data. J Syst Sci Syst Eng 26:183–198. https://doi.org/10.1007/s11518-016-5320-6

Jung D, Dorner V, Glaser F, Morana S (2018) Robo-advisory: digitalization and automation of financial advisory. Bus Inf Syst Eng 60:81–86. https://doi.org/10.1007/s12599-018-0521-9

Kache F, Seuring S (2017) Challenges and opportunities of digital information at the intersection of Big Data Analytics and supply chain management. Int J Oper Prod Manag 37:10–36. https://doi.org/10.1108/IJOPM-02-2015-0078

Kane GC, Palmer D, Nguyen Phillips A et al (2015) Strategy, not technology, drives digital transformation. MIT Sloan Manag Rev Deloitte Univ Press 14:1–25

Karimi J, Walter Z (2015) The role of dynamic capabilities in responding to digital disruption: a factor-based study of the newspaper industry. J Manag Inf Syst 32:39–81. https://doi.org/10.1080/07421222.2015.1029380

Khanagha S, Volberda H, Sidhu J, Oshri I (2013) Management innovation and adoption of emerging technologies: the case of cloud computing. Eur Manag Rev 10:51–67

Khanagha S, Volberda H, Oshri I (2014) Business model renewal and ambidexterity: structural alteration and strategy formation process during transition to a C loud business model. R&D Manag 44:322–340

Khanagha S, Volberda H, Oshri I (2017) Customer co-creation and exploration of emerging technologies: the mediating role of managerial attention and initiatives. Long Range Plan 50:221–242

Kim D, Lee H, Cho S (2008) Response modeling with support vector regression. Expert Syst Appl 34:1102–1108. https://doi.org/10.1016/j.eswa.2006.12.019

Kim K, Park O, Yun S, Yun H (2017) What makes tourists feel negatively about tourism destinations? Application of hybrid text mining methodology to smart destination management. Technol Forecast Soc Chang 123:362–369. https://doi.org/10.1016/j.techfore.2017.01.001

Ksiazek TB (2015) Civil interactivity: how news organizations’ commenting policies explain civility and hostility in user comments. J Broadcast Electron Media 59:556–573. https://doi.org/10.1080/08838151.2015.1093487

Kwok L, Yu B (2013) Spreading social media messages on Facebook: an analysis of restaurant business-to-consumer communications. Cornell Hosp Q 54:84–94. https://doi.org/10.1177/1938965512458360

Kwok L, Yu B (2016) Taxonomy of Facebook messages in business-to-consumer communications: what really works? Tour Hosp Res 16:311–328. https://doi.org/10.1177/1467358415600214

Lake RW (2017) Big Data, urban governance, and the ontological politics of hyperindividualism. Big Data Soc 4:205395171668253. https://doi.org/10.1177/2053951716682537

Lam SK, Sleep S, Hennig-Thurau T et al (2017) Leveraging frontline employees’ small data and firm-level big data in frontline management: an absorptive capacity perspective. J Serv Res 20:12–28

Lamba K, Singh SP (2017) Big data in operations and supply chain management: current trends and future perspectives. Prod Plan Control 28:877–890. https://doi.org/10.1080/09537287.2017.1336787

Lepak DP, Smith KG, Taylor MS (2007) Value creation and value capture: a multilevel perspective. Acad Manag Rev 32:180–194. https://doi.org/10.5465/amr.2007.23464011

Levy Y, Ellis TJ (2006) A systems approach to conduct an effective literature review in support of information systems research. Inf Sci 9:181–212

Li L, Zhong L, Xu G, Kitsuregawa M (2012) A feature-free search query classification approach using semantic distance. Expert Syst Appl 39:10739–10748. https://doi.org/10.1016/j.eswa.2012.02.191

Li W, Zhu C, Yang LT et al (2017) Subtask scheduling for distributed robots in cloud manufacturing. IEEE Syst J 11:941–950. https://doi.org/10.1109/JSYST.2015.2438054

Liere-Netheler K, Packmohr S, Vogelsang K (2018) Drivers of digital transformation in manufacturing. In: Proceedings of the 51st Hawaii international conference on system sciences. Honululu, USA

Lin H-F, Chen C-H (2015) Design and application of augmented reality query-answering system in mobile phone information navigation. Expert Syst Appl 42:810–820

Lowrie I (2017) Algorithmic rationality: epistemology and efficiency in the data sciences. Big Data Soc 4:205395171770092. https://doi.org/10.1177/2053951717700925

Lu Y, Xu X (2017) A semantic web-based framework for service composition in a cloud manufacturing environment. J Manuf Syst 42:69–81. https://doi.org/10.1016/j.jmsy.2016.11.004

Lucas HC, Goh JM (2009) Disruptive technology: how Kodak missed the digital photography revolution. J Strateg Inf Syst 18:46–55. https://doi.org/10.1016/j.jsis.2009.01.002

Lucke D, Constantinescu C, Westkämper E (2008) Smart factory—a step towards the next generation of manufacturing. In: Mitsuishi M, Ueda K, Kimura F (eds) Manufacturing systems and technologies for the new frontier. Springer, London, pp 115–118

MacDonnell P (2015) The European Union’s proposed equality and data protection rules: an existential problem for insurers? Econ Aff 35:225–239. https://doi.org/10.1111/ecaf.12127

Madden S (2012) From databases to big data. IEEE Internet Comput 16(3):4–6

Madsen AK, Flyverbom M, Hilbert M, Ruppert E (2016) Big data: issues for an international political sociology of data practices: table 1. Int Political Sociol 10:275–296. https://doi.org/10.1093/ips/olw010

Matt C, Hess T, Benlian A (2015) Digital transformation strategies. Bus Inf Syst Eng 57:339–343. https://doi.org/10.1007/s12599-015-0401-5

Mazzei MJ, Noble D (2017) Big data dreams: a framework for corporate strategy. Bus Horiz 60:405–414

McAfee A, Brynjolfsson E (2012) Big data: the management revolution. Harvard Bus Rev 90:60–68

Metzger PT (2016) Space development and space science together, an historic opportunity. Sp Policy 37:77–91. https://doi.org/10.1016/j.spacepol.2016.08.004

Mian S, Lamine W, Fayolle A (2016) Technology business incubation: an overview of the state of knowledge. Technovation 50:1–12

Miao Z, Cai S, Xu D (2014) Applying an adaptive tabu search algorithm to optimize truck-dock assignment in the crossdock management system. Expert Syst Appl 41:16–22. https://doi.org/10.1016/j.eswa.2013.07.007

Min J, Lee Y (2005) Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters. Expert Syst Appl 28:603–614

Morakanyane R, Grace A, O’Reilly P (2017) Conceptualizing digital transformation in business organizations: a systematic review of literature. In: Proceedings of the 30th bled eConference. pp 427–443

Murray A, Papa A, Cuozzo B, Russo G (2016) Evaluating the innovation of the Internet of Things: empirical evidence from the intellectual capital assessment. Bus Process Manag J 22:341–356. https://doi.org/10.1108/BPMJ-05-2015-0077

Najmaei A (2016) How do entrepreneurs develop business models in small high-tech ventures? An exploratory model from Australian IT firms. Entrepr Res J 6:297–343

Nambisan S, Wright M, Feldman M (2019) The digital transformation of innovation and entrepreneurship: progress, challenges and key themes. Res Policy 48:103773

Nwankpa JK, Roumani Y (2016) IT capability and digital transformation: a firm performance perspective. In: Proceedings of the Thirty Seventh International Conference on Information Systems, Dublin

O’Donnell T, Humphreys P, McIvor R, Maguire L (2009) Reducing the negative effects of sales promotions in supply chains using genetic algorithms. Expert Syst Appl 36:7827–7837. https://doi.org/10.1016/j.eswa.2008.11.034

O’Flaherty B, Heavin C (2015) Positioning predictive analytics for customer retention. J Decis Syst 24:3–18. https://doi.org/10.1080/12460125.2015.994353

Okoli C, Schabram K (2010) A guide to conducting a systematic literature review of information systems research. Sprouts: Working Papers on Information Systems, pp 10

Oliveira T, Thomas M, Espadanal M (2014) Assessing the determinants of cloud computing adoption: an analysis of the manufacturing and services sectors. Inf Manag 51:497–510

Ordenes FV, Ludwig S, De Ruyter K et al (2017) Unveiling what is written in the stars: analyzing explicit, implicit and discourse patterns of sentiment in social media. J Consum Res 43:875–894. https://doi.org/10.1093/jcr/ucw070

Parasie S (2015) Data-driven revelation?: epistemological tensions in investigative journalism in the age of “big data”. Digit Journal 3:364–380. https://doi.org/10.1080/21670811.2014.976408

Parviainen P, Tihinen M, Kääriäinen J, Teppola S (2017) Tackling the digitalization challenge: how to benefit from digitalization in practice. Int J Inf Syst Project Manag 5:63–77

Pillai PS, Rao S (2016) Resource allocation in cloud computing using the uncertainty principle of game theory. IEEE Syst J 10:637–648. https://doi.org/10.1109/JSYST.2014.2314861

Pisano GP, Shih WC (2012) Does America really need manufacturing. Harv Bus Rev 90:94–102

Pisano P, Pironti M, Rieple A (2015) Identify innovative business models: can innovative business models enable players to react to ongoing or unpredictable trends? Entrepr Res J 5:181–199

Prause G, Atari S (2017) On sustainable production networks for Industry 4.0. Entrepr Sustain Issues 4:421–431

Prescott ME (2014) Big data and competitive advantage at Nielsen. Manag Decis 52:573–601. https://doi.org/10.1108/MD-09-2013-0437

Qi J, Zhu C, Yang Y (2014) Recommendations based on social relationships in mobile services: recommendations based on social relationships in mobile services. Syst Res Behav Sci 31:424–436. https://doi.org/10.1002/sres.2279

Raab T (2015) DATA DRIVEN NARCISSISM: HOW WILL “Big Data” FEED BACK ON US? J Conscious Stud 22:215–228

Rajesh R (2016) Forecasting supply chain resilience performance using grey prediction. Electron Commer Res Appl 20:42–58. https://doi.org/10.1016/j.elerap.2016.09.006

Raun J, Ahas R, Tiru M (2016) Measuring tourism destinations using mobile tracking data. Tour Manag 57:202–212. https://doi.org/10.1016/j.tourman.2016.06.006

Reaidy PJ, Gunasekaran A, Spalanzani A (2015) Bottom-up approach based on Internet of Things for order fulfillment in a collaborative warehousing environment. Int J Prod Econ 159:29–40. https://doi.org/10.1016/j.ijpe.2014.02.017

Richey RG, Morgan TR, Lindsey-Hall K, Adams FG (2016) A global exploration of Big Data in the supply chain. Int J Phys Distrib Logist Manag 46:710–739. https://doi.org/10.1108/IJPDLM-05-2016-0134

Risteska Stojkoska BL, Trivodaliev KV (2017) A review of Internet of Things for smart home: challenges and solutions. J Clean Prod 140:1454–1464. https://doi.org/10.1016/j.jclepro.2016.10.006

Roden S, Nucciarelli A, Li F, Graham G (2017) Big data and the transformation of operations models: a framework and a new research agenda. Prod Plan Control 28:929–944. https://doi.org/10.1080/09537287.2017.1336792

Ross J, Stevenson F, Lau R, Murray E (2016) Factors that influence the implementation of e-health: a systematic review of systematic reviews (an update). Implement Sci. https://doi.org/10.1186/s13012-016-0510-7

Rothberg HN, Erickson GS (2017) Big data systems: knowledge transfer or intelligence insights? J Knowl Manag 21:92–112. https://doi.org/10.1108/JKM-07-2015-0300

Rothe D (2017) Seeing like a satellite: remote sensing and the ontological politics of environmental security. Secur Dialogue 48:334–353. https://doi.org/10.1177/0967010617709399

Sanders CB, Sheptycki J (2017) Policing, crime and ‘big data’; towards a critique of the moral economy of stochastic governance. Crime Law Soc Change 68:1–15. https://doi.org/10.1007/s10611-016-9678-7

Schallmo D, Williams CA, Boardman L (2017) Digital transformation of business models—best practice, enablers, and roadmap. Int J Innov Manag 21:1740014. https://doi.org/10.1142/S136391961740014X

Schniederjans DG, Hales DN (2016) Cloud computing and its impact on economic and environmental performance: a transaction cost economics perspective. Decis Support Syst 86:73–82. https://doi.org/10.1016/j.dss.2016.03.009

Schryen G (2015) Writing qualitative is literature reviews—guidelines for synthesis, interpretation, and guidance of research. Commun Assoc Inf Syst 37:286–325

Schwab K (2017) The fourth industrial revolution, First U.S. edition. Crown Business, New York

Shah N, Irani Z, Sharif AM (2017) Big data in an HR context: exploring organizational change readiness, employee attitudes and behaviors. J Bus Res 70:366–378

Shelton T (2017) The urban geographical imagination in the age of Big Data. Big Data Soc 4:205395171666512. https://doi.org/10.1177/2053951716665129

Singh JP, Irani S, Rana NP et al (2017) Predicting the “helpfulness” of online consumer reviews. J Bus Res 70:346–355. https://doi.org/10.1016/j.jbusres.2016.08.008

Sivarajah U, Kamal MM, Irani Z, Weerakkody V (2017) Critical analysis of Big Data challenges and analytical methods. J Bus Res 70:263–286. https://doi.org/10.1016/j.jbusres.2016.08.001

Škulj G, Vrabič R, Butala P, Sluga A (2015) Decentralised network architecture for cloud manufacturing. Int J Comput Integr Manuf. https://doi.org/10.1080/0951192X.2015.1066861

Sodero AC, Rabinovich E (2017) Demand and revenue management of deteriorating inventory on the Internet: an empirical study of flash sales markets. J Bus Logist 38:170–183. https://doi.org/10.1111/jbl.12157

Spath D, Ganschar O, Gerlach S, et al (2013) Produktionsarbeit der Zukunft-Industrie 4.0. Fraunhofer Verlag Stuttgart

Stock T, Seliger G (2016) Opportunities of sustainable manufacturing in Industry 4.0. Procedia CIRP 40:536–541. https://doi.org/10.1016/j.procir.2016.01.129

Supak S, Brothers G, Bohnenstiehl D, Devine H (2015) Geospatial analytics for federally managed tourism destinations and their demand markets. Journal of Destination Marketing & Management 4:173–186. https://doi.org/10.1016/j.jdmm.2015.05.002

Tan KH, Zhan Y (2017) Improving new product development using big data: a case study of an electronics company: a case study of an electronics company. R&D Management 47:570–582. https://doi.org/10.1111/radm.12242

Tan KH, Zhan Y, Ji G et al (2015) Harvesting big data to enhance supply chain innovation capabilities: an analytic infrastructure based on deduction graph. Int J Prod Econ 165:223–233. https://doi.org/10.1016/j.ijpe.2014.12.034

Tian X (2017) Big data and knowledge management: a case of déjà vu or back to the future? J Knowl Manag 21:113–131. https://doi.org/10.1108/JKM-07-2015-0277

Trab S, Bajic E, Zouinkhi A et al (2017) A communicating object’s approach for smart logistics and safety issues in warehouses. Concurr Eng 25:53–67. https://doi.org/10.1177/1063293X16672508

Trantopoulos K, von Krogh G, Wallin MW, Woerter M (2017) External knowledge and information technology: implications for process innovation performance. MIS Q 41:287–300

Tsai C-F, Wu J-W (2008) Using neural network ensembles for bankruptcy prediction and credit scoring. Expert Syst Appl 34:2639–2649

Uden L, He W (2017) How the Internet of Things can help knowledge management: a case study from the automotive domain. J Knowl Manag 21:57–70. https://doi.org/10.1108/JKM-07-2015-0291

Van Den Eede Y (2016) The (Im)possible grasp of networked realities: disclosing Gregory Bateson’s work for the study of technology. Hum Stud 39:601–620. https://doi.org/10.1007/s10746-016-9400-x

vom Brocke J, Simons A, Riemer K et al (2015) Standing on the shoulders of giants: challenges and recommendations of literature search in information systems research. Commun Assoc Inf Syst 37(1):9

Waller MA, Fawcett SE (2013) Data science, predictive analytics, and big data: a revolution that will transform supply chain design and management. J Bus Logist 34:77–84. https://doi.org/10.1111/jbl.12010

Weber R, Aha DW, Becerra-Fernandez I (2001) Intelligent lessons learned systems. Expert Syst Appl 20:17–34. https://doi.org/10.1016/S0957-4174(00)00046-4

Westerman G, Bonnet D, McAfee A (2014) The nine elements of digital transformation. MIT Sloan Manag Rev 55:1–6

Wieland A, Handfield RB, Durach CF (2016) Mapping the landscape of future research themes in supply chain management. J Bus Logist 37:205–212. https://doi.org/10.1111/jbl.12131

Xiang Z, Schwartz Z, Gerdes JH, Uysal M (2015) What can big data and text analytics tell us about hotel guest experience and satisfaction? Int J Hosp Manag 44:120–130. https://doi.org/10.1016/j.ijhm.2014.10.013

Xu LD (2011) Information architecture for supply chain quality management. Int J Prod Res 49:183–198. https://doi.org/10.1080/00207543.2010.508944

Xu Z, Frankwick GL, Ramirez E (2016) Effects of big data analytics and traditional marketing analytics on new product success: a knowledge fusion perspective. J Bus Res 69:1562–1566

Yim MY-C, Chu S-C, Sauer PL (2017) Is augmented reality technology an effective tool for E-commerce? An interactivity and vividness perspective. J Interact Mark 39:89–103. https://doi.org/10.1016/j.intmar.2017.04.001

Yu Y, Cao RQ, Schniederjans D (2017) Cloud computing and its impact on service level: a multi-agent simulation model. Int J Prod Res 55:4341–4353. https://doi.org/10.1080/00207543.2016.1251624

Zhan Y, Tan KH, Ji G et al (2017) A big data framework for facilitating product innovation processes. Bus Process Manag J 23:518–536. https://doi.org/10.1108/BPMJ-11-2015-0157

Zhang G, Hu M, Patuwo B, Indro D (1999) Artificial neural networks in bankruptcy prediction: general framework and cross-validation analysis. Eur J Oper Res 116:16–32

Zhang Q-T, Liu Y, Zhou W, Yang Z-W (2015) A sequential regression model for Big Data with attributive explanatory variables. J Oper Res Soc China 3:475–488. https://doi.org/10.1007/s40305-015-0109-8

Zhang Y, Zhang G, Chen H et al (2016) Topic analysis and forecasting for science, technology and innovation: methodology with a case study focusing on big data research. Technol Forecast Soc Change 105:179–191. https://doi.org/10.1016/j.techfore.2016.01.015

Zhong RY, Xu C, Chen C, Huang GQ (2017) Big Data Analytics for physical internet-based intelligent manufacturing shop floors. Int J Prod Res 55:2610–2621. https://doi.org/10.1080/00207543.2015.1086037

Download references

Acknowledgements

Open access funding provided by Malmö University.

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Author information

Authors and affiliations.

Osnabrück University, Rolandstraße 8, 49078, Osnabrück, Germany

J. Piet Hausberg, Kirsten Liere-Netheler & Kristin Vogelsang

Malmö University, Nordenskiöldsgatan 1, 21119, Malmö, Sweden

Sven Packmohr

Bahçeşehir University, Çırağan Caddesi 4-6, 34353, Beşiktaş/Istanbul, Turkey

Hamburg University, Von-Melle-Park 9, 20146, Hamburg, Germany

Stefanie Pakura

Corresponding authors

Correspondence to J. Piet Hausberg or Sven Packmohr.

Ethics declarations

Conflict of interest.

The author(s) declare that they have no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

About this article

Hausberg, J.P., Liere-Netheler, K., Packmohr, S. et al. Research streams on digital transformation from a holistic business perspective: a systematic literature review and citation network analysis. J Bus Econ 89, 931–963 (2019). https://doi.org/10.1007/s11573-019-00956-z

Published: 08 November 2019

Issue Date: December 2019

DOI: https://doi.org/10.1007/s11573-019-00956-z

  • Citation-network analysis
  • Digital transformation
  • Systematic review

JEL Classification

Published on 18.3.2024 in Vol 26 (2024)

Reducing Loneliness and Social Isolation of Older Adults Through Voice Assistants: Literature Review and Bibliometric Analysis

Authors of this article:

  • Rachele Alessandra Marziali 1*, MSc
  • Claudia Franceschetti 1*, MEng
  • Adrian Dinculescu 2*, PhD
  • Alexandru Nistorescu 2*, PhD
  • Dominic Mircea Kristály 3*, PhD
  • Adrian Alexandru Moșoi 4*, MSc
  • Ronny Broekx 5*, BSB
  • Mihaela Marin 2*, MSc
  • Cristian Vizitiu 2,3*, PhD
  • Sorin-Aurel Moraru 3*, PhD
  • Lorena Rossi 1*, MEng
  • Mirko Di Rosa 6*, PhD

1 Centre for Innovative Models for Aging Care and Technology, IRCCS INRCA-National Institute of Health and Science on Aging, Ancona, Italy

2 The Space Applications and Technologies Laboratory, Institute of Space Science – Subsidiary of INFLPR (National Institute for Laser, Plasma and Radiation Physics), Magurele, Romania

3 Department of Automatics and Information Technology, Faculty of Electrical Engineering and Computer Science, Transilvania University of Brasov, Brasov, Romania

4 Department of Psychology and Education Sciences, Faculty of Psychology and Education Sciences, Transilvania University of Brasov, Brasov, Romania

5 Innovation Department, ePoint, Hamont, Belgium

6 Centre for Biostatistics and Applied Geriatric Clinical Epidemiology, IRCCS INRCA-National Institute of Health and Science on Aging, Ancona, Italy

*all authors contributed equally

Corresponding Author:

Claudia Franceschetti, MEng

Centre for Innovative Models for Aging Care and Technology, IRCCS INRCA-National Institute of Health and Science on Aging

Via Santa Margherita 5

Ancona, 60124

Phone: 39 0718004788

Email: [email protected]

Background: Loneliness and social isolation are major public health concerns for older adults, with severe mental and physical health consequences. New technologies may have a great impact in providing support to the daily lives of older adults and addressing the many challenges they face. In this scenario, technologies based on voice assistants (VAs) are of great interest and potential benefit in reducing loneliness and social isolation in this population, because they could overcome existing barriers with other digital technologies through easier and more natural human-computer interaction.

Objective: This study aims to investigate the use of VAs to reduce loneliness and social isolation of older adults by performing a systematic literature review and a bibliometric cluster mapping analysis.

Methods: We searched PubMed, Embase, and Scopus databases for articles that were published in the last 6 years, related to the following main topics: voice interface, VA, older adults, isolation, and loneliness. A total of 40 articles were found, of which 16 (40%) were included in this review. The included articles were then assessed through a qualitative scoring method and summarized. Finally, a bibliometric analysis was conducted using VOSviewer software (Leiden University’s Centre for Science and Technology Studies).

Results: Of the 16 articles included in the review, only 2 (13%) were considered of poor methodological quality, whereas 9 (56%) were of medium quality and 5 (31%) were of high quality. Through bibliometric analysis, 221 keywords were extracted, of which 36 (16%) were selected. The most important keywords, ranked by number of occurrences and by total link strength, were then presented, together with the results of the analysis performed using the Association Strength normalization method and default values. The final bibliometric network consisted of 36 selected keywords, which were grouped into 3 clusters related to 3 main topics (ie, VA use for social isolation among older adults, the significance of age in the context of loneliness, and the impact of sex factors on well-being). For most of the selected articles, the effect of VA on social isolation and loneliness of older adults was a minor theme. However, more investigations were done on user experience, obtaining preliminary positive results.

Conclusions: Most articles on the use of VAs by older adults to reduce social isolation and loneliness focus on usability, acceptability, or user experience. Nevertheless, studies directly addressing the impact that using a VA has on the social isolation and loneliness of older adults find positive and promising results and provide important information for future research, interventions, and policy development in the field of geriatric care and technology.

Introduction

Nowadays, the aging of the population presents new challenges that require consideration and response [ 1 ]. Loneliness and social isolation are among the major public health concerns regarding older adults [ 2 ].

In fact, social networks seem to decrease with age and the prevalence of loneliness is estimated to increase as the population ages [ 2 ], to the extent that Valtorta and Hanratty [ 3 ] define loneliness and isolation as being “increasingly part of the experience of growing old.”

Social isolation and loneliness have severe consequences for older adults’ mental and physical health, including depressive symptoms [ 4 ], dementia [ 5 ], coronary heart disease and stroke [ 6 ], and mortality [ 7 ]. Moreover, social isolation and loneliness also have adverse outcomes concerning the use of health services, increasing emergency department and physician visits, hospital readmissions, and long-term care admissions [ 8 ].

New technologies may have a great impact on providing support in the daily lives of older people, especially in the areas of health monitoring, security, and comfort [ 9 ]. Therefore, they could be valuable tools to respond to the many challenges that older adults face.

In this scenario, technologies based on voice assistants (VAs) are of great interest and have potential benefits. VAs are systems based on artificial intelligence techniques that are programmed to be activated at a specific wake word to capture the user’s voice, process and interpret the command via a server, and respond back with a voice response or completed task [ 10 ].

VA systems have the potential to support behavioral interventions using everyday life technologies such as smartphones, tablets, and smart speakers [ 9 ]. The strength of voice-based technology, which has reached a significant stage of maturity, is closely related to the concept of ubiquitous computing ( Figure 1 ), introduced by Weiser in 1991 to describe a paradigm of technology that adapts to the human environment and vanishes into the background [ 11 ]. Indeed, VA technology is physically intangible; it does not force the user to be physically at a particular place to operate, and it provides interaction using natural language [ 9 ].

Concerning the application to older people, this easy and natural human-computer interaction gives VA systems the potential to overcome possible barriers existing with other digital technologies, which appears particularly promising and appropriate [ 9 ].

In light of this, the objective of this study is to investigate the use of VAs to reduce loneliness and social isolation of older adults by performing a literature review and a bibliometric analysis.

Database Creation

A literature search of scientific articles published from January 1, 2018, to April 4, 2023, was conducted. This time range was chosen because VA technology had not reached a significant stage of maturity in earlier years, especially in its application for social purposes.

The PubMed, Embase, and Scopus databases were searched to extend the range of eligible articles. In particular, the search was performed by setting up the “Title/Abstract” field in PubMed, the “Title or Abstract” field in Embase, and the “Title, Abstract, Keywords” field in Scopus.

The search was performed using an appropriate sequence of keywords, based on the research objectives. The first part of the search string was focused on synonyms for VA, whereas the second part specified the application for isolation and loneliness in older adults. The search string used was as follows: ((voice interface) OR (voice assistant) OR (vocal interface) OR (vocal assistant) OR (speech agent) OR (vocal agent)) AND (olde* OR elder*) AND (isolation OR loneliness).
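As an illustration only, the two-part structure of the search string can be sketched in Python. The helper names (`or_group`, `build_query`) are hypothetical and are not part of any database API; the terms are copied from the string reported above, and no database-specific field tags are added.

```python
# Synonym groups copied from the search string reported in the text.
VA_TERMS = ["voice interface", "voice assistant", "vocal interface",
            "vocal assistant", "speech agent", "vocal agent"]
AGE_TERMS = ["olde*", "elder*"]
OUTCOME_TERMS = ["isolation", "loneliness"]


def or_group(terms, wrap_each=False):
    """Join terms with OR inside parentheses; optionally parenthesize each term."""
    parts = [f"({t})" if wrap_each else t for t in terms]
    return "(" + " OR ".join(parts) + ")"


def build_query():
    """AND together the three concept groups (hypothetical helper)."""
    return " AND ".join([
        or_group(VA_TERMS, wrap_each=True),   # part 1: synonyms for VA
        or_group(AGE_TERMS),                  # part 2a: older adults
        or_group(OUTCOME_TERMS),              # part 2b: isolation or loneliness
    ])


QUERY = build_query()
print(QUERY)
```

Keeping each concept as a separate list makes it easy to see which synonyms were covered and to rerun the same logic against each database's own field syntax.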

We collected a total of 40 publications: 34 from Scopus, 4 from PubMed, and 2 from Embase.

Study Selection

The selection of the eligible studies was performed according to the following principles:

  • Including only publications in English language: no documents were excluded.
  • Removal of overlaps between the different databases: 3 overlapping documents were identified.
  • Excluding papers in which the title and abstract were not relevant to the research question: 12 papers were excluded.
  • Removal of articles not retrieved: 1 article was excluded.
  • Excluding articles not pertinent to the research question: 8 documents were excluded.

The studies were assessed independently by 3 authors (CF, RAM, and AD). Any disagreements and uncertainties in the study selection were resolved by discussion. In particular, 2 authors conducted the first assessment, and a third resolved any disagreements.

Multimedia Appendix 1 [ 12 - 19 ] reports the list of excluded articles concerning eligibility assessment and details about the motivations for their exclusion.

The final database was composed of 40% (16/40) of the collected documents.
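The selection arithmetic can be checked with a short sketch. The counts are those reported in the text; the step labels and the function name `screening_flow` are illustrative assumptions.

```python
# Records identified per database, as reported in the text.
IDENTIFIED = {"Scopus": 34, "PubMed": 4, "Embase": 2}

# (step label, records removed), in the order of the selection principles.
EXCLUSIONS = [
    ("not in English", 0),
    ("duplicates across databases", 3),
    ("title/abstract not relevant", 12),
    ("full text not retrieved", 1),
    ("not pertinent to the research question", 8),
]


def screening_flow(identified, exclusions):
    """Return the running record count after each exclusion step."""
    remaining = sum(identified.values())
    flow = [("identified", remaining)]
    for label, removed in exclusions:
        remaining -= removed
        flow.append((label, remaining))
    return flow


FLOW = screening_flow(IDENTIFIED, EXCLUSIONS)
for label, remaining in FLOW:
    print(f"{label:40s} {remaining:3d}")

included, total = FLOW[-1][1], FLOW[0][1]
print(f"included: {included}/{total} = {100 * included // total}%")
```

The running counts are 40, 40, 37, 25, 24, and 16, which agrees with the 16 of 40 (40%) documents retained.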

Figure 2 reports the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram [ 20 ], summarizing the identification, screening, and inclusion procedures performed.

Quality Scoring

As systematic reviews are comprehensive and rigorous assessments of existing literature on a specific research question and they aim to synthesize the available evidence to provide a reliable and unbiased summary, the “Tool for Scoring Quality of Non-Empirical Data Sources” [ 21 ], owned by the Aerospace Medicine Systematic Review Group, was used to assess the quality of individual studies included in this review. In total, 2 authors (RAM and CF) performed this evaluation independently, solving any disagreements or doubts through discussion. It is important to note that the purpose of quality scoring in systematic reviews is not to exclude studies but rather to provide an evaluation of their methodological strengths and weaknesses. The scoring process helps reviewers assess the overall risk of bias in the body of evidence and inform their conclusions and recommendations.

Data Extraction

To perform the synthesis of findings, a data extraction from the 16 selected articles was conducted. The extraction consisted of a further evaluation of the full text of the articles. In total, 2 authors (MDR and CF) independently extracted information from the selected studies, including reference, population, technological solution, environment, study design, outcomes, and main results. The assessors made the information homogeneous and analyzed the articles together in the case of doubts or missing data. The data extracted were reported in the corresponding section of the synthesis of findings table ( Table 1 ).

a VA: voice assistant.

b PACS: postacute COVID-19 syndrome.

c HRQoL: health-related quality of life.

d DASS-21: Depression Anxiety Scale-21.

e CD-RISC-25: Connor-Davidson Resilience Scale-25.

f EQ-5D-5L: EuroQol-5 Dimensions-5 Levels.

g ISI: Insomnia Severity Index.

h SF-36: 36-Item Short Form Health Survey.

i NGD: normalized Google distance.

j N/A: not applicable.

k SSPA: Social Skills Performance Assessment.

l ADL: activities of daily living.

Bibliometric Analysis

A bibliometric analysis was also conducted to construct a map of the selected articles using VOSviewer software (version 1.6.19; Leiden University’s Centre for Science and Technology Studies). This tool represents one of the most popular programs for bibliometric cluster mapping [ 38 ].

To illustrate the keyword co-occurrence network, keywords were extracted from the list of the 16 included articles.

During map creation, the authors chose co-occurrence analysis of keywords and selected full counting as the counting method. The minimum number of occurrences of a keyword was set at 2. All keywords meeting this threshold were displayed, regardless of their total link strength. At the keyword verification step, the authors merged similar words by creating a thesaurus file, which included one column of similar keywords and another column with the keyword that replaces each of them. In the final step, the selected keywords were analyzed using the Association Strength normalization method and default values. For clustering, the default values of resolution (ie, 1.00), minimum cluster size (ie, 1), and the merge small clusters option were used.
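The map-creation steps can be approximated outside VOSviewer with a minimal Python sketch. This is not VOSviewer's implementation: the article keyword lists and thesaurus entries below are hypothetical toy data, keywords are assumed to be de-duplicated within each article, and the normalization uses one commonly cited form of association strength, 2·m·a_ij / (k_i·k_j), where a_ij is the link weight, k_i is the total link strength of keyword i, and m is the total weight of all links.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each inner list is one article's keyword list.
ARTICLES = [
    ["social isolation", "elderly", "voice assistant"],
    ["loneliness", "older adults", "voice assistants"],
    ["social isolation", "loneliness", "aged"],
    ["older adult", "voice assistant", "covid-19"],
]

# Hypothetical thesaurus: variant label -> label it is replaced by.
THESAURUS = {
    "elderly": "older adults",
    "older adult": "older adults",
    "voice assistants": "voice assistant",
}

MIN_OCCURRENCES = 2  # threshold reported in the text


def normalize(keywords, thesaurus):
    """Lowercase, apply the thesaurus, and de-duplicate within one article."""
    return sorted({thesaurus.get(k.lower(), k.lower()) for k in keywords})


def cooccurrence(articles, thesaurus, min_occurrences):
    """Count occurrences and pairwise links; full counting adds 1 per co-occurrence."""
    docs = [normalize(a, thesaurus) for a in articles]
    occurrences = Counter(k for doc in docs for k in doc)
    kept = {k for k, n in occurrences.items() if n >= min_occurrences}
    links = Counter()
    for doc in docs:
        for a, b in combinations([k for k in doc if k in kept], 2):
            links[(a, b)] += 1
    return {k: occurrences[k] for k in kept}, links


def association_strength(links):
    """Normalize each link weight as 2*m*a_ij / (k_i*k_j) (assumed formulation)."""
    strength = Counter()
    for (a, b), w in links.items():
        strength[a] += w
        strength[b] += w
    m = sum(links.values())
    return {pair: 2 * m * w / (strength[pair[0]] * strength[pair[1]])
            for pair, w in links.items()}


OCC, LINKS = cooccurrence(ARTICLES, THESAURUS, MIN_OCCURRENCES)
NORMALIZED = association_strength(LINKS)
print(OCC)
print(NORMALIZED[("older adults", "voice assistant")])
```

In this toy example, "aged" and "covid-19" occur only once and are dropped by the threshold, while the thesaurus merges "elderly" and "older adult" into "older adults" before counting.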

In the following sections, the synthesis of the findings and results of the bibliometric analysis and qualitative scoring of the 16 selected articles are presented.

Synthesis of Findings

The selected articles were assessed with regard to population, technological solution, environment, study design, outcomes, and main results. Table 1 presents a synthesis of the findings.

In summary, the population most frequently involved in the selected studies is older adults. In some cases, informal caregivers [ 22 ], geriatric experts [ 29 ], the medical community, the general public [ 35 ], or formal caregivers working in a day-care facility with experience in caring for people with dementia [ 36 ] are also involved. All but 31% (5/16) of the articles [ 26 , 31 , 32 , 34 , 35 ] detail the total number of people engaged. The remaining articles involve a minimum of 7 and a maximum of 109 older adults. Among the selected articles, the age of the population varies widely, including people aged >50 [ 22 , 24 , 26 , 30 ], >60 [ 23 , 33 ], >65 [ 27 , 29 ], and >75 years [ 28 ]. Naturally, professionals are younger, ranging from 21 [ 29 ] to 33 [ 36 ] years. However, for some articles [ 25 , 31 , 34 - 37 ], there is no information on the age of the population involved. The sex of the participants is specified in only 56% (9/16) of the articles [ 24 , 26 - 30 , 32 , 33 , 36 ], in which most participants are female.

In addition, 25% (4/16) of the articles consider participants’ familiarity with technology, involving only people with no experience with VA technology [ 26 ] and digital devices [ 31 ], involving only people with low technology use [ 32 ], or specifying people’s technological abilities [ 27 ]. In addition, some studies consider clinical conditions: 6% (1/16) of the articles [ 22 ] included people with diabetes or long-term health conditions, whereas others include people with postacute COVID-19 syndrome [ 24 ]; with normative cognitive functioning [ 28 ]; with no severe visual or hearing impairment and no moderate to severe cognitive impairment [ 30 ]; with mild difficulties in social skills, depression and anxiety symptoms, and nonverbal impairment [ 33 ]; and without dementia [ 36 ].

Technological Solution

Regarding VA technology solutions, 44% (7/16) of the articles [ 22 , 24 , 25 , 28 , 29 , 32 , 34 ] report the use of commercially available VAs, for example, Google Assistant, Amazon Alexa, Apple Siri, and Microsoft Cortana. Some studies specify the design of new VA systems developed using the Amazon Alexa platform and Alexa Voice services [ 36 ] or implementing the Google Voice Android Software Development Kit on a tablet [ 27 ]. In other studies, the newly designed VA is embedded in a mobile app [ 23 ], a PC application [ 26 ], or even embodied as a household potted flower [ 35 ]. A total of 13% (2/16) of the articles [ 31 , 32 ] describe the design and the testing of a new VA-based digital intelligent platform. Finally, 1 (6%) article [ 33 ] presents a web-based automated version of a VA designed to improve communication skills, whereas another one [ 37 ] involves a personalized and expressive VA.

Environment

The environment in most of the articles [ 22 , 24 , 25 , 29 - 31 , 33 , 37 ] is the home, which is alternated, in the study by Pradhan et al [ 32 ], with the older adult living community and, in the studies by Bravo et al [ 23 ] and Simpson et al [ 35 ], with the retirement home. Instead, the environments in other articles are the laboratory [ 26 , 27 ], the independent living facility [ 28 ], the older adult care center [ 34 ], and the day-care facility [ 36 ]. Thus, the selected articles concerning the use of a VA for social isolation and loneliness address both older adults living independently at home and those living in a facility.

Study Design

Regarding the study design, among the 16 selected studies, 4 (25%) are quantitative, including 1 (6%) evaluation test [ 23 ], 1 (6%) pre-post study [ 24 ], 1 (6%) development and user test [ 26 ], and 1 (6%) VAs test [ 34 ]. Qualitative studies include 1 (6%) service evaluation [ 22 ], 1 (6%) evaluation test [ 29 ], and 1 (6%) pre-post study [ 32 ]. Then, there are 5 (31%) mixed studies, including both qualitative and quantitative methods, of which 1 (6%) is an evaluation test [ 27 ], 1 (6%) is a single-group quasi-experimental study [ 28 ], 1 (6%) is a pre-post study [ 30 ], 1 (6%) is a randomized controlled trial [ 33 ], and 1 (6%) is a usability study [ 36 ]. Finally, the remaining studies include 1 (6%) mini review [ 25 ], 2 (13%) conference speeches [ 35 , 37 ], and 1 (6%) study protocol [ 31 ]. More detailed information on the methodology results is presented in the Quality Scoring section.

Among the outcomes, only 31% (5/16) of the articles [ 22 , 25 , 28 , 31 , 35 ] consider loneliness or social isolation. Of these 16 studies, only 1 (6%) [ 28 ] uses a standardized instrument—the 8-item University of California, Los Angeles (UCLA) Loneliness Scale—to assess the perception of loneliness. Instead, most articles (9/16, 56%) [ 22 - 24 , 27 , 29 , 30 , 32 , 35 , 36 ] focus on topics related to the acceptability, user experience, satisfaction, and usability of the technological solution, whereas a smaller number (2/16, 13%) [ 26 , 34 ] focuses on its technical performance. To evaluate these aspects, 5-point Likert scales are used only by 19% (3/16) of the articles [ 23 , 27 , 36 ].

Further outcomes addressed are verbal and nonverbal behavior in social communication [ 33 ], definition of project objectives, scientific and technological goals and actions [ 37 ], program impact on health and care trajectories [ 31 ], codes and overarching themes [ 29 ], interaction anthropomorphic aspects [ 28 ], and psychological and physical aspects such as frailty and quality of life [ 24 , 31 ].

Main Results

Turning to the main results of using a VA, the impact on loneliness and social isolation is positive, leading to an improvement in users’ perceptions. Specifically, the participants in 13% (2/16) of the studies [ 22 , 24 ] report that the VA helped them cope with loneliness, whereas another study (1/16, 6%) [ 28 ] finds a significant reduction in perceived loneliness after 4 weeks of use and that the relational greetings from the user to the VA predict this reduction. Moreover, the loneliness experienced by the person forecasts the number of greetings he or she makes to the VA. Finally, a mini review (1/16, 6%) [ 25 ] outlines that the use of VA in older adults improves social connectedness and reduces loneliness.

Other benefits obtained include a positive impact on health and social well-being [ 22 ]; improvement in postacute COVID-19 syndrome symptoms, frailty, and health-related quality of life at 6 months follow-up [ 24 ]; sedentary life changes [ 24 ]; and significant improvement in eye contact and facial expressivity [ 33 ].

Regarding the VA, it is considered useful [ 24 ], satisfying [ 23 , 27 ], and interesting [ 36 ], and it obtains good results in acknowledgment (the ability to recognize user contextual information) and engagement (the ability to maintain a coherent conversation) performance [ 34 ]. In addition, among participants in the study by Pech et al [ 30 ], 63% have a positive opinion of the system used, and in the study by Striegl et al [ 36 ], both older adults and formal caregivers report that the VA used is highly feasible for supporting people with dementia in activities of daily living.

The main results also include technical information about the VA. For example, in 1 (6%) study [ 26 ], the VA achieves a correct-answer rate of >75% for all commands; another (1/16, 6%) study [ 29 ] identifies 8 major themes as possible VA beneficial functions; and another (1/16, 6%) study [ 32 ] presents crucial information for VA development, whereas in another (1/16, 6%) study [ 35 ], the device prototype is developed. Finally, critical issues emerge: VA interruptions when the person pauses for too long [ 27 ], older adults’ resistance to change, unplanned workload for a formal caregiver, specific technological obstacles [ 30 ], and poor results in the ability to suggest and perform some related activities at the end of the interaction [ 34 ]. The proposed improvements include facilitated access to professionals, communication at community events, late-night pharmacy service, customized activity proposals, and videoconferencing [ 30 ].

For 13% (2/16) of the articles [ 31 , 37 ], it is not applicable to define the main results.

Along with the bibliometric analysis, the authors built a thesaurus file containing the words that can be replaced, considering their very close meaning. The thesaurus file is presented in Table 2 .

The bibliometric analysis extracted 221 keywords from the included articles, of which 36 (16%) met the threshold of 2 occurrences. The keyword list is presented in Table 3 , in descending order of occurrence, showing the number of occurrences and the total link strength.

As can be observed in Table 3 , the most used keywords by occurrence were as follows: “social isolation” (n=8), “human” (n=6), “older adults” (n=6), “aged” (n=5), “covid-19” (n=5), “loneliness” (n=5), “human computer interaction” (n=4), and “voice assistant” (n=4).

The most used keywords by total link strength, as shown in Table 3 , were as follows: “human” (n=53), “aged” (n=44), “loneliness” (n=44), “social isolation” (n=42), “covid-19” (n=42), “pandemics” (n=29), “very elderly” (n=29), “older adults” (n=28), “prospective study” (n=25), “quality of life” (n=25).
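To make the distinction between occurrences and total link strength concrete, the following sketch computes, for each keyword, the number of links and the total link strength (the sum of the weights of all links attached to that keyword). The link weights are hypothetical and are not the study's data; the function name `link_summary` is an illustrative assumption.

```python
from collections import defaultdict

# Hypothetical link weights between keyword pairs (not the study's data).
LINKS = {
    ("human", "aged"): 3,
    ("human", "loneliness"): 2,
    ("aged", "loneliness"): 2,
    ("loneliness", "voice assistant"): 1,
}


def link_summary(links):
    """Return {keyword: (number_of_links, total_link_strength)}."""
    n_links = defaultdict(int)
    strength = defaultdict(int)
    for (a, b), weight in links.items():
        for keyword in (a, b):
            n_links[keyword] += 1        # one more neighbor
            strength[keyword] += weight  # add this link's weight
    return {k: (n_links[k], strength[k]) for k in strength}


SUMMARY = link_summary(LINKS)
for keyword, (n, s) in sorted(SUMMARY.items(), key=lambda kv: -kv[1][1]):
    print(f"{keyword:16s} links={n} total_link_strength={s}")
```

A keyword can therefore rank high by total link strength even with few occurrences, provided it co-occurs repeatedly with many other keywords, which is why the two rankings in Table 3 differ.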

The bibliometric network is illustrated in Figure 3 and consists of 3 clusters of 36 keywords. The clusters are presented in more detail in Table 4 , where each keyword from a cluster is shown in descending order by occurrence.

According to the scoring tool, 13% (2/16) of the documents were assessed as being of poor quality in terms of the methodology. In the study by Simpson et al [ 35 ], it is unclear what the methodological information is based on, how it is presented, and if it is in line with other sources. The document is based on a conference speech on methods for the design-thinking approach. Instead, in the study by Torres et al [ 37 ], most of the information is not clearly sourced; it is unclear what the methodological information is based on and if it is in line with other sources. In addition, this paper is based on a speech at a conference on the objectives, goals, and actions of a research and innovation project.

A total of 56% (9/16) of the documents were considered medium quality. Specifically, 44% (7/16) articles [ 22 , 23 , 25 - 27 , 29 , 31 ] contain clear sources, methodological quality, and information value, presenting findings in line with the literature. Nevertheless, study designs were not of very high quality, representing mostly multiple case reports and case studies, whereas the study by Corbett et al [ 25 ] is a literature review.

A total of 13% (2/16) of the articles [ 24 , 34 ] have instead a more rigorous approach in the study design, representing a qualitative study and a single-group quasi-experimental study, respectively. However, the former is an abstract document lacking bibliographic references, while in the latter, it is unclear what the methodological information is based on. In both cases, the information presented is not clearly linked with the literature findings.

Finally, 31% (5/16) of the documents were deemed of high quality, considering that the information presented and the methodological information are clearly referenced. Among these, 1 (6%) article [ 33 ] is a randomized controlled study, while the remaining 25% (4/16) [ 28 , 30 , 32 , 36 ] are descriptive or observational studies.
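The reported shares can be reproduced with a small sketch. It assumes percentages are rounded to whole numbers with halves rounded up, which is what makes 2/16 (12.5%) appear as 13%; Python's built-in `round` would give 12 because it rounds halves to the nearest even number. The function name `percent_half_up` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

TOTAL_INCLUDED = 16
QUALITY_COUNTS = {"poor": 2, "medium": 9, "high": 5}  # counts reported in the text


def percent_half_up(count, total):
    """Whole-number percentage, rounding .5 upward (2/16 = 12.5% -> 13%)."""
    value = Decimal(count) * 100 / Decimal(total)
    return int(value.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# The three quality categories should account for every included article.
assert sum(QUALITY_COUNTS.values()) == TOTAL_INCLUDED

SHARES = {k: percent_half_up(n, TOTAL_INCLUDED) for k, n in QUALITY_COUNTS.items()}
print(SHARES)
```

The same helper reproduces the other proportions quoted in the article, such as 1/16 (6%) and 36/221 (16%).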

Multimedia Appendix 2 [ 22 - 37 ] provides details of the quality scoring performed on the selected articles.

Principal Findings

The purpose of this study is to synthesize knowledge about the use of VAs to reduce loneliness and social isolation among older adults.

Initially, after conducting the literature search, the quality of the selected articles is investigated, focusing on the strengths and weaknesses of the methodologies used. Of the 16 articles included in the review, only 2 (13%) articles [ 35 , 37 ] are considered poor quality, 9 (56%) articles [ 22 - 27 , 29 , 31 , 34 ] are medium quality, and 5 (31%) articles are high quality [ 28 , 30 , 32 , 33 , 36 ]. In summary, although recent publications in the literature on the use of VA by older adults for the reduction of loneliness and social isolation are not numerous, most of them are of medium to high methodological quality in terms of study design, authenticity, clear methodological quality, clear informational value, and representativeness of available primary sources.

After assessing the methodological quality of the selected articles, the findings are summarized, focusing on population, technological solution, environment, study design, outcomes, and main results for a more detailed overview. Among the 16 articles presented, most focus on the evaluation of acceptability, user experience, satisfaction, usability, or performance of the VA, while only 5 (31%) papers address the theme of social isolation and loneliness in depth. Of these studies, 1 (6%) [ 31 ] has no available results, as it is a study protocol, and another (6%) [ 35 ] reached the development stage of a VA prototype. Therefore, 3 (19%) articles remain that investigate the possible effect of the use of a VA on social isolation and loneliness by older adults.

The first paper [ 22 ], a service evaluation study, found that using a VA for 2 months at home helped people with diabetes or other long-term health conditions (such as multiple sclerosis, dementia, and depression) combat loneliness. This is particularly relevant because it seems that social isolation increases the risk of mortality through physiological upregulation of chronic inflammation. This impact is significant even for middle-aged people, but is greater for older adults, particularly men [ 39 ]. Thus, the results obtained from the use of VAs are particularly relevant considering the population the study targeted, but a formal assessment of loneliness would be needed to determine the actual impact of VA use on this dimension.

The second paper, a single-group quasi-experimental study [ 28 ], reported a significant reduction in perceived loneliness, assessed through the 8-item UCLA Loneliness Scale, after older adults living in an independent living facility used a VA for 4 weeks. Thus, loneliness decreased among these older adults while they used a VA. Moreover, the loneliness perceived by participants at the beginning of the intervention predicts the number of greetings to the VA (such as “Good morning” or “Alexa, I’m home”), and these relational greetings in turn predict the reduction in loneliness during the month of use. Therefore, according to the authors, VA anthropomorphization might have a role in combating loneliness in older adults.

Finally, the results of a mini review [ 25 ] suggest that the VA reduces loneliness among older adults and increases their connectedness. Older adults perceive the VA as a “companion,” especially those who live alone or have solitary lives for most of the day.

These studies show encouraging results about the potential of a VA in reducing social isolation and loneliness in older adults, in line with the suggestion from a systematic review [ 40 ] that new technologies can be promising opportunities to reduce social isolation and loneliness in this population. For example, 1 study found that the use of technology by older adults predicts less loneliness, which has in turn been associated with, on the one hand, better self-reported health and subjective well-being and, on the other hand, fewer chronic diseases and less depression [ 41 ]. Therefore, these are preliminary results suggesting that the association between technology use and physical and mental health may be mediated by loneliness.

VAs have the potential to be used by older adults to reduce their social isolation and loneliness, and the results presented go in that direction; however, they cannot be exhaustive nor conclusive.

Finally, the bibliometric cluster mapping analysis provides valuable insights into the relationships between keywords in the included articles. The generated keyword co-occurrence network revealed 3 distinct clusters, each representing a specific theme or concept in the literature.

Cluster 1, represented by keywords such as “social isolation,” “elderly people,” “voice assistant,” and “human computer interaction,” highlights the relevance of VA technology in combating social isolation among older adults. A VA could be a promising tool for facilitating social interactions, promoting well-being, and addressing the challenges faced by older people regarding social isolation. The relevance of VAs in addressing social isolation among older adults aligns with the findings of Portet et al [ 9 ] on the design and evaluation of a smart home VA for older adults. This cluster also corresponds to the authors’ focus on the use of quality scoring to evaluate the methodological strengths and weaknesses of the studies, as the inclusion of studies exploring the effectiveness of VAs in combating social isolation would be of particular interest. It emphasizes the importance of designing user-friendly interfaces and incorporating natural language generation and recognition for effective human-computer interaction. This cluster aligns with the literature on ambient assisted living, assistive technology, and artificial intelligence, and it is supported by the work presented in 1 article [ 10 ] on VAs and their applications, as well as in another article [ 8 ] that discusses technological solutions for addressing social isolation and loneliness in primary care.

Cluster 2 emphasizes the significance of age in the context of loneliness. Keywords such as “loneliness,” “human,” and “quality of life” indicate the importance of understanding the psychological and emotional aspects of loneliness, considering the diverse experiences of individuals across different demographics. This is supported by the works presented by Valtorta and Hanratty [ 3 ] and Holt-Lunstad et al [ 7 ], who discuss the association between loneliness, social isolation, and health outcomes in older adults, emphasizing the importance of considering demographic factors in understanding and addressing these issues. Cluster 2 is also relevant in the context of the COVID-19 pandemic, as it includes keywords such as “COVID-19,” “pandemics,” and “digital divide,” which illustrates the impact of the pandemic on social isolation and the need for technological solutions, such as VAs, to bridge the digital divide and ensure connectivity and support for older adults during times of crisis. A study [ 6 ] on the association between social isolation, loneliness, and health outcomes in the context of coronary heart disease and stroke further emphasizes the significance of addressing social isolation during pandemics.

Cluster 3 encapsulates a range of keywords related to sex, clinical research, and well-being. The presence of keywords, such as “adult,” “female,” and “male,” along with “clinical article” and “well-being” underscores the importance of understanding how sex-specific factors can significantly impact overall well-being. This cluster likely refers to studies and investigations that explore the intersection of sex-related variables with clinical research outcomes, shedding light on how these factors can influence health and well-being differently among various demographic groups. Moreover, Cluster 3 may offer valuable insights into the evolving landscape of clinical research and its focus on addressing sex-specific health concerns, thus promoting a more comprehensive approach to well-being across diverse populations.

These clusters shed light on important topics related to social isolation, loneliness, and the use of VAs in addressing these issues among older adults. The findings underlined here can inform future research, interventions, and policy development in the field of geriatric care and technology.

Strengths and Limitations

The study provides a comprehensive exploration of voice assistance systems used by older individuals, highlighting popular examples such as Amazon Alexa, Google Assistant, Apple Siri, Microsoft Cortana, Samsung Bixby, and Huawei HiVoice. The study examines the strengths and limitations of these systems.

One of the notable strengths of this study is its investigation into the use of VAs to alleviate loneliness and social isolation among older adults. This topic is fairly recent, but its relevance is growing in both the scientific and technological communities.

Moreover, this investigation is supported by both a literature review and a bibliometric analysis to gather as much knowledge as possible on the role of technology in combating loneliness and social isolation in older adults.

In addition, the selection of studies included in the article underwent an independent evaluation process by the authors, with any disagreements or uncertainties being resolved through discussion.

Another strength is the consideration of scientific articles published since 2018. This choice was driven by the fact that VAs are a relatively new and continually advancing technology. Furthermore, the application of this technology among older individuals is not yet widespread, so only a limited number of studies are available on the topic. Despite this limitation, the potential benefits of VA solutions for older adults are highly intriguing, and this study aims to shed light on possible applications and their impact on older users.

This study also has limitations that need to be pointed out. First, the number of publications in the systematic review is small because the topic has gained relevance only recently. Nevertheless, the authors proceeded with the bibliometric analysis to aid interpretation, even though few papers address the use of VAs to reduce loneliness and social isolation among older adults. Further limitations are that 1 (6%) article [42] could not be retrieved and that the synthesis of findings is neither comprehensive, as only the abstract was available for 1 article [24], nor complete, as the main results could not be defined for 13% (2/16) of the articles [31,37]. Moreover, the selected studies were highly heterogeneous, and only 6% (1/16) of the studies [33] had a control group and 6% (1/16) [28] had follow-up. Regarding the population, the studies do not specify whether participants lived alone, which could limit considerations regarding social isolation and loneliness. Finally, most articles collected qualitative data without using quantitative instruments to assess the actual impact of VA use.

Future Directions

On the basis of this literature review and bibliometric analysis, several priorities for future research can be identified. First, the keywords in clusters 1 and 2 show that “loneliness” and “social isolation” have a huge impact on older people [43]. In our literature review, however, authors were more interested in aspects such as system use and acceptability [30], acceptance and user experience [22], and system usability [36]. Although “loneliness” and “social isolation” are the main points, we found only 1 study [28] aimed at reducing perceived loneliness in older adults. Thus, the use of VAs to address social isolation and loneliness among older adults appears to be underinvestigated compared with user experience aspects, which are explored more deeply in the scientific literature.

Similarly, we encourage researchers to include questionnaires that measure loneliness and social isolation in future studies, for use with VA systems based on artificial intelligence techniques or other related systems to improve the life expectancy of older people. Examples are the Revised UCLA Loneliness Scale [44], the De Jong Gierveld Loneliness Scale [45,46], the Steptoe Social Isolation Index for social isolation [44], and the Cornwell Perceived Isolation Scale for perceived isolation [47]. For further information about these questionnaires, refer to Social Isolation and Loneliness in Older Adults: Opportunities for the Health Care System [48]. Second, this work shows that the terms social isolation and loneliness are still often treated as interchangeable, although they are related but distinct concepts [3].

In fact, loneliness now tends to be defined as a subjective negative feeling arising from a perceived lack of a social network or a desired companion, whereas social isolation is the objective lack or scarcity of social contacts and interactions with family, friends, or community [3]. Therefore, it would be particularly relevant for future studies to define clearly which dimensions they measure, as mentioned in the preceding section. Third, future research should examine the large heterogeneity within the older adult population. Some of the selected articles described different characteristics of the population, but none examined whether the impact of VA use differs in relation to these variables. Future studies should explore the effects of using a VA on the social isolation and loneliness of older adults, investigating possible differences by sex, socioeconomic background, familiarity with technology, and living conditions.

Conclusions

This paper presents a literature review and a bibliometric analysis of the use of VAs among older adults to reduce social isolation and loneliness. The findings indicate that most studies focus on the usability, acceptability, or user experience of the VA. However, the studies that directly address the impact of VA use on the social isolation and loneliness of older adults report positive results and provide important information for future research, interventions, and policy development in the field of geriatric care and technology.

Acknowledgments

This study has been developed within the framework of the EMILIO (Increase Self Management and Counteract Social Isolation Using a Voice Assistant Enabled Virtual Concierge) project (AAL-2021-8-120-CP), cofinanced under the Ambient Assisted Living Joint Programme of the European Commission [49] and the National Funding Agencies of Belgium, the Netherlands, Italy, and Switzerland.

The authors are grateful to all consortium partners: Italian National Institute of Health and Science on Aging (IRCCS INRCA), Solving Team SRL, ICT Factory GmbH, Erdmann Design AG, Magicview, ePoint, Vulpia VZW, Institute of Space Science, INFLPR Subsidiary, Transilvania University of Brasov.

The project website is available on the internet [50].

Authors' Contributions

RAM, CF, AD, AN, MM, and CV contributed to the methodology, investigation, writing of the original draft, and reviewing and editing. DMK, AAM, and S-AM were responsible for the investigation, writing of the original draft, reviewing, and editing. RB conducted reviewing and editing. LR was involved in conceptualization and funding acquisition, whereas MDR was involved in methodology, project administration, conceptualization, supervision, funding acquisition, reviewing, and editing.

Conflicts of Interest

None declared.

Excluded articles and motivations for the exclusion.

Quality scoring of selected articles.

References

  1. Active ageing: a policy framework. World Health Organization. 2002. URL: https://apps.who.int/iris/handle/10665/67215 [accessed 2023-06-01]
  2. Holt-Lunstad J. The potential public health relevance of social isolation and loneliness: prevalence, epidemiology, and risk factors. Public Policy Aging Rep. 2017;27(4):127-130.
  3. Valtorta N, Hanratty B. Loneliness, isolation and the health of older adults: do we need a new research agenda? J R Soc Med. Dec 2012;105(12):518-522.
  4. Van As BA, Imbimbo E, Franceschi A, Menesini E, Nocentini A. The longitudinal association between loneliness and depressive symptoms in the elderly: a systematic review. Int Psychogeriatr. Apr 14, 2021;34(7):657-669.
  5. Kuiper JS, Zuidersma M, Oude Voshaar RC, Zuidema SU, van den Heuvel ER, Stolk RP, et al. Social relationships and risk of dementia: a systematic review and meta-analysis of longitudinal cohort studies. Ageing Res Rev. Jul 2015;22:39-57.
  6. Valtorta NK, Kanaan M, Gilbody S, Hanratty B. Loneliness, social isolation and risk of cardiovascular disease in the English longitudinal study of ageing. Eur J Prev Cardiol. Sep 2018;25(13):1387-1396.
  7. Holt-Lunstad J, Smith TB, Baker M, Harris T, Stephenson D. Loneliness and social isolation as risk factors for mortality: a meta-analytic review. Perspect Psychol Sci. Mar 2015;10(2):227-237.
  8. Freedman A, Nicolle J. Social isolation and loneliness: the new geriatric giants: approach for primary care. Can Fam Physician. Mar 2020;66(3):176-182.
  9. Portet F, Vacher M, Golanski C, Roux C, Meillon B. Design and evaluation of a smart home voice interface for the elderly: acceptability and objection aspects. Pers Ubiquit Comput. Oct 2, 2011;17(1):127-144.
  10. Hoy MB. Alexa, Siri, Cortana, and more: an introduction to voice assistants. Med Ref Serv Q. Jan 12, 2018;37(1):81-88.
  11. Weiser M. The computer for the 21st century. Sci Am. Sep 1991;265(3):94-104.
  12. Chen J, Yang YT, Zhu X, Zhu Z. Share and care: a senior-friendly family interaction application. In: Proceedings of the IEEE MIT Undergraduate Research Technology Conference (URTC). 2020. Presented at: URTC 2020; October 9-11, 2020; Cambridge, MA. URL: https://ieeexplore.ieee.org/document/9668885
  13. Eimontaite I, Voinescu A, Alford C, Caleb-Solly P, Morgan P. The impact of different human-machine interface feedback modalities on older participants’ user experience of CAVs in a simulator environment. In: Proceedings of the International Conference on Human Factors in Transportation. 2019. Presented at: AHFE 2019; July 24-28, 2019; Washington, DC. URL: https://link.springer.com/chapter/10.1007/978-3-030-20503-4_11
  14. Eirale A, Martini M, Tagliavini L, Gandini D, Chiaberge M, Quaglia G. Marvin: an innovative omni-directional robotic assistant for domestic environments. Sensors (Basel). Jul 14, 2022;22(14):1-22.
  15. Martin-Hammond A, Vemireddy S, Rao K. Exploring older adults' beliefs about the use of intelligent assistants for consumer health information management: a participatory design study. JMIR Aging. Dec 11, 2019;2(2):e15381.
  16. Méndez JI, Mata O, Ponce P, Meier A, Peffer T, Molina A. Multi-sensor system, gamification, and artificial intelligence for benefit elderly people. In: Ponce H, Martínez-Villaseñor L, Brieva J, Moya-Albor E, editors. Challenges and Trends in Multimodal Fall Detection for Healthcare. Cham, Switzerland. Springer; 2020.
  17. Restyandito, Febryandi, Nugraha KA, Sebastian D. Mobile social media interface design for elderly in Indonesia. In: Proceedings of the HCI International 2020 – Late Breaking Posters. 2020. Presented at: HCII 2020; July 19-24, 2020; Copenhagen, Denmark. URL: https://link.springer.com/chapter/10.1007/978-3-030-60703-6_10
  18. Syeda MZ, Park M, Kim Y, Kwon YM. Tangible social content service system: making digital technology easier to use by elderly and its usability evaluation. In: Proceedings of the 12th International Conference on Complex, Intelligent, and Software Intensive Systems. 2018. Presented at: CISIS 2018; July 4-6, 2018; Matsue, Japan.
  19. Zhou D, Barakova EI, An P, Rauterberg M. Assistant robot enhances the perceived communication quality of people with dementia: a proof of concept. IEEE Trans Human Mach Syst. Jun 2022;52(3):332-342.
  20. Page MJ, McKenzie JE, Bossuyt PM, Boutron I, Hoffmann TC, Mulrow CD, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. BMJ. Mar 29, 2021;372:n71.
  21. Laws JM, Winnard A. Tool for scoring the quality of non-empirical data sources- E.G: technical reports. Aerospace Medicine and Rehabilitation Laboratory, Northumbria University. 2019. URL: https://www.researchgate.net/publication/331385312_Tool_for_Scoring_the_Quality_of_Non-Empirical_Data_Sources-_EG_Technical_Reports [accessed 2024-02-23]
  22. Balasubramanian GV, Beaney P, Chambers R. Digital personal assistants are smart ways for assistive technology to aid the health and wellbeing of patients and carers. BMC Geriatr. Nov 15, 2021;21(1):643.
  23. Bravo SL, Herrera CJ, Valdez EC, Poliquit KJ, Ureta J, Cu J, et al. CATE: an embodied conversational agent for the elderly. In: Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART. 2020. Presented at: ICAART 2020; February 22-24, 2020; Valletta, Malta. URL: https://www.scitepress.org/Link.aspx?doi=10.5220/0009174009410948
  24. Caselgrandi A, Milić J, Motta F, Belli M, Venuta M, Aprile E, et al. Voice assistance to develop a participatory research and action to improve health trajectories of people with PACS. Antivir Ther. Dec 1, 2021;26(1_suppl):13-14.
  25. Corbett CF, Wright PJ, Jones K, Parmer M. Voice-activated virtual home assistant use and social isolation and loneliness among older adults: mini review. Front Public Health. 2021;9:742012.
  26. Farías-Barraza B, Reyes-Rogget M, López FA, López-Martínez IN, Contreras-Bolton C, Linfati R. Low-cost voice assistant design and testing for older adults. In: Proceedings of the Computer Information Systems and Industrial Management. 2022. Presented at: CISIM 2022; July 15-17, 2022; Barranquilla, Colombia.
  27. Garcia-Mendez S, de Arriba-Perez F, Gonzalez-Castano FJ, Regueiro-Janeiro JA, Gil-Castineira F. Entertainment chatbot for the digital inclusion of elderly people without abstraction capabilities. IEEE Access. May 17, 2021;9:75878-75891.
  28. Jones VK, Hanus M, Yan C, Shade MY, Blaskewicz Boron J, Maschieri Bicudo R. Reducing loneliness among aging adults: the roles of personal voice assistants and anthropomorphic interactions. Front Public Health. 2021;9:750736.
  29. O'Brien K, Light SW, Bradley S, Lindquist L. Optimizing voice-controlled intelligent personal assistants for use by home-bound older adults. J Am Geriatr Soc. May 2022;70(5):1504-1509.
  30. Pech M, Gbessemehlan A, Dupuy L, Sauzéon H, Lafitte S, Bachelet P, et al. Lessons learned from the SoBeezy program for older adults during the COVID-19 pandemic: experimentation and evaluation. JMIR Form Res. Nov 24, 2022;6(11):e39185.
  31. Pérès K, Zamudio-Rodriguez A, Dartigues JF, Amieva H, Lafitte S. Prospective pragmatic quasi-experimental study to assess the impact and effectiveness of an innovative large-scale public health intervention to foster healthy ageing in place: the SoBeezy program protocol. BMJ Open. Apr 29, 2021;11(4):e043082.
  32. Pradhan A, Findlater L, Lazar A. "Phantom friend" or "Just a box with information": personification and ontological categorization of smart speaker-based voice assistants by older adults. Proc ACM Hum Comput Interact. Nov 07, 2019;3(CSCW):1-21.
  33. Razavi SZ, Schubert LK, van Orden K, Ali MR, Kane B, Hoque E. Discourse behavior of older adults interacting with a dialogue agent competent in multiple topics. ACM Trans Interact Intell Syst. Jul 23, 2022;12(2):1-21.
  34. Reis A, Paulino D, Paredes H, Barroso I, Monteiro MJ, Rodrigues V, et al. Using intelligent personal assistants to assist the elderlies: an evaluation of Amazon Alexa, Google Assistant, Microsoft Cortana, and Apple Siri. In: Proceedings of the 2nd International Conference on Technology and Innovation in Sports, Health and Wellbeing (TISHW). 2018. Presented at: TISHW; June 20-22, 2018; Thessaloniki, Greece. URL: https://ieeexplore.ieee.org/document/8559503/authors#authors
  35. Simpson J, Gaiser F, Macík M, Breßgott T. Daisy: a friendly conversational agent for older adults. In: Proceedings of the 2nd Conference on Conversational User Interfaces. 2020. Presented at: CUI '20; July 22-24, 2020; Bilbao, Spain.
  36. Striegl J, Gollasch D, Loitsch C, Weber G. Designing VUIs for social assistance robots for people with dementia. In: Proceedings of Mensch und Computer 2021. 2021. Presented at: MuC '21; September 5-8, 2021; Ingolstadt, Germany.
  37. Torres MI, Chollet G, Montenegro C, Tenorio-Laranga J, Gordeeva O, Esposito A, et al. EMPATHIC, Expressive, Advanced Virtual Coach to Improve Independent Healthy-Life-Years of the Elderly. Presented at: 4th International Conference on Advances in Speech and Language Technologies for Iberian Languages, IberSPEECH 2018; November 21-23, 2018; Barcelona, Spain. p. 172-173.
  38. van Eck NJ, Waltman L. Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics. Aug 2010;84(2):523-538.
  39. Yang YC, McClintock MK, Kozloski M, Li T. Social isolation and adult mortality: the role of chronic inflammation and sex differences. J Health Soc Behav. Jun 2013;54(2):183-203.
  40. Poscia A, Stojanovic J, La Milia DI, Duplaga M, Grysztar M, Moscato U, et al. Interventions targeting loneliness and social isolation among the older people: an update systematic review. Exp Gerontol. Feb 2018;102:133-144.
  41. Chopik WJ. The benefits of social technology use among older adults are mediated by reduced loneliness. Cyberpsychol Behav Soc Netw. Sep 2016;19(9):551-556.
  42. Chen S, Nakamura M. Generating personalized dialogues based on conversation log summarization and sentiment analysis. In: Proceedings of the 23rd International Conference on Information Integration and Web Intelligence. 2021. Presented at: iiWAS2021; November 29-December 1, 2021; Linz, Austria.
  43. OʼSúilleabháin PS, Gallagher S, Steptoe A. Loneliness, living alone, and all-cause mortality: the role of emotional and social loneliness in the elderly during 19 years of follow-up. Psychosom Med. 2019;81(6):521-526.
  44. Steptoe A, Shankar A, Demakakos P, Wardle J. Social isolation, loneliness, and all-cause mortality in older men and women. Proc Natl Acad Sci U S A. Apr 09, 2013;110(15):5797-5801.
  45. de Jong-Gierveld J, Kamphuis F. The development of a Rasch-type loneliness scale. Appl Psychol Meas. Jul 27, 2016;9(3):289-299.
  46. Gierveld JD, Tilburg TV. A 6-item scale for overall, emotional, and social loneliness: confirmatory tests on survey data. Res Aging. 2006;28(5):582-598.
  47. Cornwell EY, Waite LJ. Social disconnectedness, perceived isolation, and health among older adults. J Health Soc Behav. Mar 2009;50(1):31-48.
  48. National Academies of Sciences, Engineering, and Medicine, Division of Behavioral and Social Sciences and Education, Health and Medicine Division, Board on Behavioral, Cognitive, and Sensory Sciences, Board on Health Sciences Policy, Committee on the Health and Medical Dimensions of Social Isolation and Loneliness in Older Adults. Social Isolation and Loneliness in Older Adults: Opportunities for the Health Care System. Washington, DC. National Academies Press; 2020.
  49. Ageing well in the digital world. Active Assisted Living Programme. URL: https://www.aal-europe.eu/ [accessed 2024-02-14]
  50. Emilio–personal assistant. Active Assisted Living Programme. URL: https://www.emilio-aal.eu/ [accessed 2024-02-26]

Edited by T de Azevedo Cardoso; submitted 04.07.23; peer-reviewed by V Jones, F Yang; comments to author 26.09.23; revised version received 13.10.23; accepted 24.11.23; published 18.03.24.

©Rachele Alessandra Marziali, Claudia Franceschetti, Adrian Dinculescu, Alexandru Nistorescu, Dominic Mircea Kristály, Adrian Alexandru Moșoi, Ronny Broekx, Mihaela Marin, Cristian Vizitiu, Sorin-Aurel Moraru, Lorena Rossi, Mirko Di Rosa. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 18.03.2024.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.
