A systematic review of Web engineering research

Title: Web Engineering

Abstract: Web Engineering is the application of systematic, disciplined and quantifiable approaches to the development, operation, and maintenance of Web-based applications. It is both a proactive approach and a growing collection of theoretical and empirical research in Web application development. This paper gives an overview of Web Engineering by addressing the questions: a) why is it needed? b) what is its domain of operation? c) how does it help, and what should it do to improve Web application development? and d) how should it be incorporated in education and training? The paper discusses the significant differences between Web applications and conventional software, the taxonomy of Web applications, the progress made so far, open research issues, and the experience of creating a specialisation at the master's level. The paper concludes that Web Engineering at this stage is a moving target, since Web technologies are constantly evolving, making new types of applications possible, which in turn may require innovations in how they are built, deployed and maintained.


Accessibility engineering in web evaluation process: a systematic literature review

1 Department of Electrical Engineering and Information Systems, University of Pannonia, Egyetem u. 10, Veszprem, 8200 Hungary

Cecilia Sik-Lanyi

Arpad Kelemen

2 Department of Organizational Systems and Adult Health, University of Maryland Baltimore, 655 W. Lombard St #455B, Baltimore, MD 21201 USA

Numerous studies have contributed to the web evaluation process in recent years, promoting digital inclusion by addressing various accessibility guidelines, methods, processes, and techniques. Researchers have investigated how the web evaluation process can incorporate accessibility issues to obtain inclusive and accessible solutions that improve the user experience and increase user satisfaction. Three systematic literature reviews (SLRs) have previously been conducted with such a research focus. This paper presents a new SLR on accessibility in the web evaluation process, covering the period from 2010 to 2021. The review of 92 primary studies shows how publications contribute to the different phases of the web evaluation process, highlighting in particular the significant studies on framework design and testing. To the best of our knowledge, this is the first study of the web accessibility literature that reports the engineering assets for the evaluation of new accessible and inclusive web-based solutions (e.g., websites). In addition, this study aims to give web designers and developers a new direction through an updated view of the processes, methods, techniques, tools, and other crucial aspects that contribute to enriching the accessible process, and to depict the gaps and challenges that may be worth investigating in the future. The findings of this SLR introduce a new dimension in web accessibility research by determining and helping to mitigate the research gaps in web accessibility for web designers, developers, and other practitioners.

Introduction

In recent years, various aspects have motivated researchers to study digital accessibility. The extension and increased availability of the web for multiple purposes (e.g., information search), the variety of content representations (e.g., video, audio), and the emergence of new platforms (e.g., the Internet of Things) and technologies (e.g., mobile phones, computers, tablets) are significant reasons to investigate the digital information platform. In particular, since the beginning of the digital revolution, digital resources have become the fundamental source for citizens to access information on education, health care, government, and news, as well as entertainment and sports [1, 2].

According to the World Wide Web Consortium (W3C) and the Web Accessibility Initiative (WAI), accessibility is a broad and extensible term associated with people who have disabilities, limited skills, or situationally induced impairments [3]. The objective of this initiative is to ensure accessibility, which means that people with special needs should be able to access, navigate, interact with, and contribute to the information available on the Web/Internet, in electronic resources and materials, and on computers. The current mission of WAI is to coordinate international, technical, and human efforts to improve web accessibility [4]. With this mission in mind, WAI published a set of accessibility guidelines called the Web Content Accessibility Guidelines (WCAG) [5, 6]. A detailed description of WCAG is given in Sect. 2.

The scientific research community has recognized that web design and development must address the diverse requirements of citizens across the population, including users with special needs and elderly citizens. Earlier, researchers treated accessibility checking as a supplementary requirement in the evaluation phase of application development. In recent years, however, researchers have suggested that accessibility requirements should be followed from the very beginning of application design and development. Ignoring accessibility issues during design and development may introduce violations of accessibility guidelines and, consequently, of the basic rights of people with disabilities. A large volume of literature addresses accessibility guidelines in the design and development of web platforms [7, 8]. More recently, a few studies have highlighted the importance of, and the emerging need for, considering accessibility throughout the web development life cycle [9, 10].

A few studies have discussed the value of systematic literature review (SLR) approaches for presenting true insights into a particular topic and highlighting directions for future improvement [11–13]. Campoverde-Molina et al. [14] described an SLR as a process of synthesizing past studies, published in different scientific databases, that focus on a particular issue. An SLR reviews past literature in a specific domain to determine its effectiveness and to find research gaps and new research areas. It helps to identify how knowledge can be improved, promotes the development of new theories, and reveals newly investigated areas that need attention. Therefore, an SLR focusing on web accessibility engineering assets is essential to determine how to promote an accessible web platform according to WCAG standards.

Emphasizing the necessity of the SLR approach in the web accessibility context, Akram and Sulaiman [14] and Campoverde-Molina et al. [15] conducted SLRs analyzing the accessibility of educational institutions' websites within specific periods. The first SLR covered the period between 2009 and 2017; the second covered 2009 to 2020. In 2021, Campoverde-Molina et al. [16] extended their previous work to update the results of the earlier SLR, broadening the period to 2002–2020. In general, an SLR aggregates knowledge about a particular research domain around a set of research questions and solutions; the SLR process should therefore be as unbiased as possible [17]. The selected SLRs are auditable and have significant impact. However, their focus on engineering assets such as processes, development techniques, and technologies is limited, which is a drawback of these SLRs.

This paper presents an extensive SLR on accessibility in the web evaluation process to identify engineering processes that improve the accessibility of web platforms. This study will help a wide array of people (developers, designers, inventors, leaders, researchers, and users) and facilitate the accessible web design and evaluation process. The paper is organized as follows: Sect. 2 presents accessibility concepts, their importance, and related work. Section 3 describes how the SLR was conducted. Section 4 presents the results of the SLR and discusses the main findings. Section 5 concludes the paper.

Background and related work

Digital accessibility is the process of ensuring that online tools and content are available to all users [13]. Its prime objective is to make online platforms accessible, operable, and interactive so that people with disabilities have equal opportunities to access information [18, 19]. Several factors can create barriers to implementing accessible digital platforms, tools, or content, such as limited knowledge of accessibility and its guidelines. Organizational barriers and parameters, such as organization size, capital, and cost, also influence accessibility. Addressing these issues, governments and organizations in several countries have declared various guidelines, standards, and conformance levels for stakeholders [20]. By following these guidelines, the responsible authorities can overcome critical issues and ensure digital accessibility.

Accessibility standards

To develop accessible solutions (e.g., applications, websites, software), several accessibility guidelines have been introduced by governments and by various public and private institutions; WCAG, Section 508, EN 301 549, YD/T 1761-2012, WAI-ARIA, BITV, ISO 9241, and ATAG are prominent examples. The Web Content Accessibility Guidelines (WCAG) were introduced by the Web Accessibility Initiative of the World Wide Web Consortium and define success criteria under 13 guidelines. Section 508 is a set of accessibility rules published by the US Government to make digital resources accessible. EN 301 549 is a European accessibility requirement suitable for the public procurement of ICT products and services in Europe. YD/T 1761-2012 is the Chinese technical requirements standard for web accessibility, which primarily focuses on ensuring accessibility on digital platforms. The WAI-ARIA standard was published by the W3C to define a set of HTML attributes that improve semantic accessibility. BITV is a German standard based on WCAG 2.0 that aims to make websites and applications accessible to people with disabilities by enforcing the perceivable, operable, understandable, and robust principles. Similarly, ISO 9241 provides requirements for accessible development throughout the application development life cycle, covering both hardware and software components of interactive design and development. The Authoring Tool Accessibility Guidelines (ATAG) build on WCAG and the User Agent Accessibility Guidelines to provide instructions for the design and development of accessible web content.
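As a brief illustration of the WAI-ARIA attributes mentioned above, the following sketch (a hypothetical example, not drawn from any of the reviewed studies) shows how a custom control built from a generic element can be exposed to assistive technologies; the element id, label text, and event handling are placeholders, and a DOM environment is assumed.

```typescript
// Hypothetical example: a toggle control built from a <div> is given a role,
// an accessible name, and a state so screen readers can announce it correctly.
const toggle = document.createElement("div");
toggle.id = "notifications-toggle";                        // placeholder id
toggle.setAttribute("role", "button");                     // expose the div as a button
toggle.setAttribute("aria-label", "Enable notifications"); // accessible name
toggle.setAttribute("aria-pressed", "false");              // current toggle state
toggle.tabIndex = 0;                                       // make it keyboard focusable

toggle.addEventListener("click", () => {
  // Keep the ARIA state in sync with the visual state on every activation.
  const pressed = toggle.getAttribute("aria-pressed") === "true";
  toggle.setAttribute("aria-pressed", String(!pressed));
});

document.body.appendChild(toggle);
```

Native HTML elements such as a real button element already carry this semantic information by default, which is why ARIA attributes are mainly needed for custom widgets.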

Among these guidelines, WCAG is the most widely used accessibility standard. WCAG is a documented guideline that explains the accessibility criteria and gives step-by-step recommendations for implementing, improving, and measuring accessibility to provide a better user experience, especially for people with disabilities. The W3C's WAI first developed the WCAG standard to make the web accessible [3]. As of July 2022, WAI had published five versions of the WCAG standard: WCAG 1.0, WCAG 2.0, WCAG 2.1, WCAG 2.2, and WCAG 3.0 (draft version). WCAG 3.0 is the most sophisticated standard, currently available as a working draft, for web developers (front and back end) and designers developing accessible and usable web content [21].

In 1999, the first version, WCAG 1.0, was released by the W3C with three priorities, 14 guidelines, and 65 checkpoints [22]. In 2008, the W3C released the second version of the standard, comprising 61 success criteria and 12 guidelines under four principles (perceivable, operable, understandable, and robust) and three conformance levels: Level A, Level AA, and Level AAA [23]. In 2018, the W3C published an updated version of the WCAG 2.0 principles, the WCAG 2.1 standard [6]. It retains the principles, guidelines, success criteria, and conformance levels of WCAG 2.0 but adds one new guideline and 17 new success criteria. Therefore, conformance to WCAG 2.1 also ensures conformance to WCAG 2.0 while addressing additional accessibility concerns. The significant update in WCAG 2.1 concerns the 'Operable' principle, under which a new guideline with six success criteria was added.

In 2021, the W3C extended WCAG 2.1 and published WCAG 2.2 as a draft update [24]. In this version, three new success criteria were added to the Operable principle under guideline 2.4. In December 2021, the latest version of WCAG (3.0, working draft) was published and remains in progress, awaiting the final guidelines [21]. Figure 1 shows the WCAG standard with its principles, success criteria, and conformance levels; for details about the success criteria and conformance levels, we refer the reader to [24]. All versions of WCAG use the three conformance levels A, AA, and AAA to classify web content. By following the WCAG standard, developers and designers can make digital content accessible to a wide range of people with disabilities, including blindness, low vision or vision impairments, deafness and hearing loss, limited movement and dexterity, speech disabilities, sensory disorders, cognitive and learning disabilities, photosensitivity, and combinations of these [25]. Ensuring an accessible web and improving the user experience is now crucial for web engineers, researchers, and developers. According to researchers' opinions, more research needs to be carried out in the coming years to improve the accessibility of digital platforms [26]. Therefore, a detailed and updated SLR is important for understanding web accessibility in depth.

Fig. 1 Overview of web content accessibility guidelines (WCAG) version 2.0, 2.1 and 2.2
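To make the principles above concrete, the sketch below (a hypothetical example, not taken from the reviewed studies) shows the kind of simple rule an automated tool can apply for WCAG success criterion 1.1.1 (non-text content): every non-decorative image should carry a text alternative. A browser-like DOM environment is assumed, and the reporting format is illustrative only.

```typescript
interface AltTextViolation {
  location: string;  // an illustrative locator for the failing element
  reason: string;    // why the element fails the check
}

// Flag <img> elements with no alt attribute; images explicitly marked as
// decorative (role="presentation" or alt="") are skipped, as WCAG allows.
function checkImageAltText(root: Document): AltTextViolation[] {
  const violations: AltTextViolation[] = [];
  root.querySelectorAll("img").forEach((img, index) => {
    const alt = img.getAttribute("alt");
    const decorative = img.getAttribute("role") === "presentation" || alt === "";
    if (alt === null && !decorative) {
      violations.push({
        location: img.getAttribute("src") ?? `image #${index + 1}`,
        reason: "missing alt attribute (WCAG success criterion 1.1.1)",
      });
    }
  });
  return violations;
}

// Usage: checkImageAltText(document) returns a list of suspect images; a human
// reviewer still has to confirm them, since automated checks cannot judge
// whether existing alt text is actually meaningful.
```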

Our investigation found seven SLRs published between 2010 and 2021 in the areas of digital accessibility (two), web accessibility (three), and web-based image and game accessibility (two). The main focus of these seven SLRs is making digital content accessible to people with disabilities, which is also a prime objective of the digital accessibility consortium. A detailed discussion of the three SLRs concerning web accessibility is given in the following subsection (2.2), and a comparison of our SLR with the seven earlier SLR studies is presented in the discussion section.

Related SLR studies

In the web accessibility context, the first selected SLR was carried out by Akram and Sulaiman [14] to identify web accessibility issues on Saudi Arabian university webpages from a web engineering point of view. The SLR followed three research questions: (1) what are the main principles of the Web Content Accessibility Guidelines 2.0 (WCAG 2.0) proposed by the W3C to improve web accessibility, (2) what is the compliance level of university and government websites with WCAG 2.0 globally, and (3) what is the compliance level of Saudi Arabian university and government websites with WCAG 2.0. To search past literature, they considered ten scientific sources: Google Scholar, the Google search engine, EBSCOhost, IEEE Xplore, Science Direct, Elsevier, Springer Link, the ACM Digital Library, Wiley, and Emerald, and found 15 studies from 2009 to 2017. Their systematic literature review concluded that 87% of past research employed automatic accessibility testing tools to evaluate university websites. Their SLR also revealed that the most frequently used automatic accessibility tools are Bobby, AChecker, eXaminator, TAW, Total Validator, EvalAccess, Cynthia Says, Magenta, Site Analyzer, MAUVE, FAE, WAVE, Valet, and the W3C validator service. In addition, they examined manual evaluation processes (e.g., interviews, questionnaire-based assessment). The manual investigation showed that the majority of past work emphasized improving a few accessibility issues, such as navigation errors, orientation issues, timing errors, text equivalents for graphics, the validity of hypertext markup language (HTML) and cascading style sheets (CSS), the use of HTML5, interface design, content, and scripting. However, they concluded that most universities in Saudi Arabia do not follow World Wide Web Consortium guidelines.

The work by Akram and Sulaiman is important for representing accessibility insights from several aspects. However, to validate the reported statistics on automatic accessibility testing and the tools used, and to identify other possible techniques for validating accessibility, Campoverde-Molina et al. [15] carried out the second SLR and presented empirical results on the accessibility evaluation of educational websites. They considered 25 past studies from 2009 to 2019 to answer ten research questions, examining the selected papers from both a bibliometric analysis and a literature review perspective. The SLR determined that 80% of past studies performed automatic analysis with automatic accessibility evaluation tools, 8% involved users, and 12% applied hybrid approaches combining expert invitation, user involvement, and automated tools. This SLR concluded that the selected websites did not satisfy any version of the WCAG standard or its conformance levels, which shows the need to correct errors by adopting automated tools and manual observation during website construction.

Following their first SLR, Campoverde-Molina et al. [16] extended their previous SLR to the period from 2002 to 2020 to investigate more research works and represent accessibility insights in depth. This recent SLR aimed to analyze past literature on the accessibility analysis of university websites. They investigated 42 selected papers obtained from three scientific databases (Web of Science, Scopus, and IEEE Xplore), focusing on accessibility standards and accessibility evaluation methodologies. Across the 42 papers, 38,416 university webpages had been evaluated in past years. Their results show that all the evaluated websites were from Asia and that most of the existing research examined university homepages. The past literature followed two standards, ISO/IEC 40500:2012 and Section 508, to analyze the accessibility of web pages. They also concluded that around 90.47% of past studies used automatic evaluation tools to validate university web pages; the most frequently used accessibility testing tools are AChecker, WAVE, Bobby, and TAW. However, this SLR found that most of the investigated university websites violated accessibility guidelines, most commonly those related to adaptability, compatibility, distinguishability, input assistance, keyboard accessibility, navigability, predictability, readability, and text alternatives, which points to important accessibility issues.

The three selected systematic literature reviews represent current insights into the web in detail with respect to accessibility. Despite their importance, these SLRs pay little attention to past research domains and lack consideration of engineering approaches and methods. This lack of engineering methods is a shortcoming of the past SLRs and motivates a more detailed future SLR. The SLR presented in this paper differs from the other three systematic literature reviews: we consider a wide range of existing literature with the intention of determining the engineering approaches that can initiate future research and mitigate the current research gap.

Research methodology

This study conducts a systematic literature review following the SLR process guidelines of Kitchenham and Charters [27]. The research involves three steps: (i) planning the SLR process, (ii) conducting the SLR, and (iii) reporting the review findings. Figure 2 shows the flowchart of our SLR process.

Fig. 2 Flowchart of the proposed systematic literature review (SLR)

Planning the SLR process

The main sub-activities related to planning the SLR are (i) research question specification, (ii) search string formulation, and (iii) database selection. All these sub-activities are described below.

Research questions

The first step of a literature review is to develop the research questions. Therefore, we developed the research questions according to our research focus. The two research questions are the following:

Research Question-1: What are the available methods, techniques, processes, and approaches to support the evaluation of the accessible web?

Research Question-2: What are the current engineering assets (tools, technologies, etc.) to support the evaluation of the accessible web?

Search string

To select appropriate search strings, we defined a set of keywords according to our research questions in the accessibility and website domain. We tested this set of keywords by searching different scientific databases manually and refined it based on the relevance of the output to the research objective. The final set of keywords, combined using Boolean operators, is the following:

{(Web engineering) or (Website accessibility) or (Web page accessibility) or (Universal accessibility design) or (Accessibility evaluation) or (Accessibility framework) or (Web accessibility methods and algorithms) or (Accessibility measuring software) or (Current accessibility violations)}.
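Purely as an illustration (not part of the original methodology), the snippet below shows how such a keyword set could be assembled into a single OR-joined query string; the exact query syntax accepted by Scopus, Web of Science, IEEE Xplore, and the other databases differs, so this is a sketch rather than the query actually submitted.

```typescript
// Keyword phrases taken from the search string above.
const keywords: string[] = [
  "Web engineering",
  "Website accessibility",
  "Web page accessibility",
  "Universal accessibility design",
  "Accessibility evaluation",
  "Accessibility framework",
  "Web accessibility methods and algorithms",
  "Accessibility measuring software",
  "Current accessibility violations",
];

// Quote each phrase and join the phrases with the Boolean OR operator.
const searchString = keywords.map((k) => `"${k}"`).join(" OR ");
console.log(searchString);
// "Web engineering" OR "Website accessibility" OR ... OR "Current accessibility violations"
```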

Database selection

Database selection is crucial for identifying the most relevant and up-to-date literature. Many scientific databases are available, so choosing the appropriate ones is critical. Considering the opinions of other researchers, we selected seven popular databases that provide high-quality literature and scientific publications. These databases use advanced search algorithms to extract the literature most related to the user's interest. The seven databases used in this SLR are Scopus, Web of Science, Science Direct, the ACM Digital Library, Google Scholar, IEEE Xplore, and PubMed.

Conducting the systematic literature review

This phase describes the review activities: (i) database searching and literature extraction, (ii) application of inclusion and exclusion criteria, and (iii) data extraction and quality assessment. These sub-activities are described in detail in the following subsections. Figure 3 shows the flowchart of the review overview.

Fig. 3 Flowchart of the review overview

Database searching and literature extraction

We tested the search strings in seven databases to extract past literature. These databases are accepted by scientific committees for scientific publishing. Most of the literature is open access. These databases have advanced search algorithms and semantic technology to retrieve the appropriate literature according to the search strings.

In total, 152 papers were found for the period from 2010 to November 2021 (Scopus: 30, Web of Science: 28, IEEE Xplore: 8, PubMed: 5, Science Direct: 20, ACM Digital Library: 16, and Google Scholar: 45). Five additional studies were found from other sources and included in the preliminary screening process. These five papers were found on ResearchGate based on the suggestions of digital accessibility experts (3 papers) and other colleagues' recommendations (2 papers); they were not available in the seven databases used in this work. The five papers make substantial contributions to web accessibility and contain significant observations, which justified their inclusion in this systematic literature review. Figure 4 shows the number of papers selected from each database through the search query; Scopus, Web of Science, and Google Scholar returned a wider array of literature than the other databases. From the 157 papers, we selected the papers most relevant to this review through inclusion and exclusion criteria (described in the next section).

Fig. 4 The number of selected literature per database

Inclusion and exclusion criteria

The extracted literature was evaluated to include the most relevant studies in this research, and we excluded the literature that did not meet the inclusion criteria. The inclusion criteria were: written in English, published in a peer-reviewed journal or conference (i.e., not a book), published between 2010 and 2021, and describing accessibility improvement, development, or assessment.

The exclusion process eliminated papers from this review. The exclusion criteria were: duplicate papers, non-English papers, papers not directly related or irrelevant to our focus, papers that are not freely accessible, and items that are not research papers, such as posters, letters, theses, and editorials. After applying the inclusion and exclusion criteria to the 157 papers, we found that 12 papers were duplicates, 11 were not in English, 29 were not directly related to our research focus, and 7 were not research papers. In total, we excluded 59 papers in the primary screening. After eliminating these papers, we conducted the proposed SLR process on the selected 98 papers (including 6 past literature reviews). The literature selection was performed following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) technique; the PRISMA flow diagram of the study selection is shown in Fig. 5.

Fig. 5 Study selection through PRISMA approach

Data extraction and quality analysis

Our search was conducted during 10–15 January 2022 and returned 157 papers. Data extraction and quality assessment are essential to identify high-quality papers, and several earlier literature reviews followed this technique for the primary evaluation of selected studies. Therefore, we followed assessment guidelines to identify quality papers, read the papers in full, and answer our research questions. Table 1 shows the assessment criteria used to evaluate the selected studies.

Questionnaire for quality assessments of primary selected studies

For each assessment question, the score is 0 or 1: a paper receives 1 for each positive answer and 0 if the question is not satisfied. Papers published in Q1-ranked journals receive an additional +0.50; the extra points for Q2-, Q3-, and Q4-ranked journals are +0.40, +0.30, and +0.20, respectively. We used Equation 1 and Equation 2 to calculate the final score and the normalized score that estimate the quality of each selected paper. After the quality analysis, we retained only those studies that passed at least four quality assessment questions and achieved a normalized score of α ≥ 0.4. Among the 98 selected studies, six were excluded from this SLR (as shown in the PRISMA diagram, Fig. 5) based on the quality assessment criteria. Table 2 shows the quality assessment result for the 92 papers that qualified for this review.
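Since Equations 1 and 2 are not reproduced in this text, the following sketch only illustrates one plausible reading of the scoring scheme described above; the normalization (final score divided by the maximum attainable score) and the use of the Q1 bonus as that maximum are assumptions made for illustration, not the authors' exact formulas.

```typescript
interface StudyAssessment {
  answers: boolean[];               // one yes/no answer per assessment question
  journalQuartile?: 1 | 2 | 3 | 4;  // quartile of the publishing journal, if any
}

// Bonus points per journal quartile, as described in the text.
const QUARTILE_BONUS: Record<number, number> = { 1: 0.5, 2: 0.4, 3: 0.3, 4: 0.2 };

function finalScore(study: StudyAssessment): number {
  const questionScore = study.answers.filter(Boolean).length;   // 0 or 1 per question
  const bonus = study.journalQuartile ? QUARTILE_BONUS[study.journalQuartile] : 0;
  return questionScore + bonus;
}

function normalizedScore(study: StudyAssessment): number {
  // Assumed maximum: all questions answered positively plus the Q1 bonus.
  const maxScore = study.answers.length + 0.5;
  return finalScore(study) / maxScore;
}

// Selection rule stated in the text: at least four questions passed and alpha >= 0.4.
function isIncluded(study: StudyAssessment): boolean {
  const passed = study.answers.filter(Boolean).length;
  return passed >= 4 && normalizedScore(study) >= 0.4;
}
```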

Quality assessment result of the selected studies

Reporting the findings

In general, the selected papers were related to web development, web accessibility, and information and communications technology (ICT) tools. The statistics of past research show that existing SLRs focused on a few criteria, but other aspects also need to be considered. This study therefore builds on previous SLR results and adds new findings from our investigation that were not highlighted in the earlier SLRs. Earlier SLRs considered accessibility requirements, standards, frequent violations, and improvement suggestions; however, accessible development criteria, the development of evaluation tools and their engineering methods, and updated validation and testing procedures also need to be highlighted to identify new research areas. According to Durdu and Yerlikaya [28], before an accessible web can be ensured, web developers and designers should consider the standard guidelines and the requirements of people with disabilities. Bradbard and Peters [29] shared the same observation: they highlighted that the majority of developers and designers have inadequate knowledge of the accessibility requirements of people with disabilities and of accessible web application development. Thus, accessibility specialists now suggest checking accessibility criteria during the development and testing process through automatic accessibility testing tools as well as user and expert testing. Past works introduced various aspects of developing an effective webpage, but recent studies revealed that accessibility issues are closely aligned with user satisfaction and usability. Therefore, governments of different countries and public and private organizations have initiated guidelines concerning accessibility and usability criteria [30], which opened a new research area aimed at making development easier and barrier-free. In the following, we describe our findings and the analysis of the selected literature in the context of the two research questions.

RQ-1: What are the available methods, techniques, processes, and approaches to support the evaluation of the accessible web?

To answer the first research question, we analyzed the 92 selected studies. The selected papers were classified into seven groups/processes: (i) accessibility requirements (AR), (ii) challenges (C), (iii) improvement directions (ID), (iv) framework design (FD), (v) framework implementations (FI), (vi) testing (T), and (vii) evaluation (E). All these phases are described in detail in the following subsections. Figure 6 presents the seven processes with the number of papers accounted for each process. Furthermore, nineteen studies emphasized two activities, as presented in the Venn diagram of Fig. 7: 2 (AR & E) + 1 (AR & T) + 2 (AR & FI) + 1 (C & ID) + 1 (C & FD) + 4 (ID & T) + 1 (ID & E) + 1 (ID & FD) + 2 (FD & E) + 4 (T & E). The Venn diagram represents the number of papers that have multiple focuses rather than a single focus or objective; in total, 19 unique papers with multiple focuses were found. Figure 7 shows the number of papers and their associated activities via the blue arrows. For example, for 'accessibility requirements,' 2 papers focused on accessibility requirements and evaluation, 2 on accessibility requirements and implementation, and 1 on accessibility requirements and testing. Figure 7 gives the complete view of the number of papers per multi-focus area. Moreover, the results show that past research mostly emphasized the technical processes, especially improvement directions, testing, and evaluation.

Fig. 6 Percentage of studies considering each process related to web evaluation and accessible web applications

Fig. 7 Venn diagram representing the number of studies for certain activities and multiple activities

Accessibility requirements (AR)

This section describes the accessibility and usability requirements, together with new methods for imposing them on the current web. Among 92 papers, nine (9) were related to accessibility requirements (representing 9.7% of the total literature) and emphasized ensuring compliance with the accessibility guidelines. These studies could be grouped into three main topics of interest, as presented in Table 3.

The nine studies related to accessibility requirements (AR), grouped by three topics of interest

In the context of accessibility requirements, Bai [31] and Henry et al. [32] described the importance of accessibility and usability criteria in web and mobile software applications. They added that improving web accessibility is essential for users both with and without disabilities, and they indicated a significant gap between the strategies needed and the solutions developed for people with disabilities, including auditory, cognitive, neurological, physical, speech, and visual impairments. The requirements of people with disabilities should therefore be acknowledged during development, as accessible technology is essential for equal access and interaction in today's digital world. Riley-Huff [34] pointed out that the first step in developing an accessible website is following the web accessibility guidelines/standards. One possible way to meet higher accessibility standards is to improve accessibility and usability together [35]; thus, automatic accessibility testing tools are essential, and for usability the authors mentioned a few existing models worth analyzing. Another study, by Wu et al. [33], investigated data visualization (chart type, chart embellishment, and data continuity) for people with intellectual and developmental disabilities. They emphasized that people with intellectual and developmental disabilities process information differently, which complicates data visualization for them. They therefore suggested considering all the potential requirements of people with disabilities during development to improve data visualization and accessibility.

Sauer et al. [36] identified three criteria, accessibility, usability, and user experience, that are essential for making the internet platform accessible and convenient for people with and without disabilities. They suggested several methods to ensure accessibility (checklists, cognitive barrier walkthroughs, automatic checking), usability (user testing, observation, questionnaires, interviews, focus groups, heuristic evaluation, cognitive walkthrough, and data logging), and user experience, and proposed that accessibility and usability should be imposed during development to improve the user experience. Vu et al. [37] noted that low-quality web designs often lead to user frustration and may cause users to abandon a site; they highlighted several usable web design and evaluation components and methods that improve website usability and thereby the user experience.

Furthermore, Almeida and Baranauskas [38] pointed out that web accessibility requirements for people with disabilities are crucial and that the difficulty of understanding accessibility guidelines is the prime cause of inaccessible design and development. They added that developers and designers are not accessibility experts and have limited knowledge of accessibility requirements. Therefore, they proposed an inclusive web-based collaborative tool to evaluate and adapt the guidelines according to universal and accessible design and development principles, which helps represent the accessibility guidelines more effectively. Gaggi and Pederiva [39] developed a tool for designers and developers addressing the same issue; they emphasized the importance of accessibility measurement and provided complete directions about the guidelines that need attention during the web design and development phase.

Challenges (C)

This section describes the accessibility challenges that are generally responsible for the current inaccessible web platform. Among 92 papers, four (4) studies were related to accessibility challenges (representing 4.3% of the total literature). These investigated studies could be grouped into three main topics of interest, as presented in Table 4.

The four studies related to challenges (C), grouped by three topics of interest

Researchers have been trying to ensure an accessible web, including digital content, websites, user-machine interfaces, and software, for more than a decade. Acosta-Vargas et al. [40] pointed out that researchers have encountered several challenges in implementing an accessible web. They specified that accessible web page development requires adequate knowledge and demands financial investment, such as manufacturing and maintenance costs, testing costs, and quality assurance costs. These considerations are crucial to improving the accessibility of the developed system, but they depend on the organization's size, capital, opportunities, and similar factors, so meeting these necessities is comparatively challenging. Inal et al. [41] conducted a user survey about digital accessibility practices to identify the challenges of creating an accessible system. They invited user experience (UX) professionals to find the most common challenges, which were associated with time constraints, lack of training, cost constraints, work overload, accessibility not being a requirement of the organization, accessibility not being a customer requirement, and people with disabilities or special needs not being included as target users. Inal et al. highlighted that such challenges act as barriers to taking accessibility requirements seriously and are responsible for the current inaccessible web.

Another study, by Brajnik and Vigo [42], addressed some crucial challenges that need to be considered to achieve an accessible web. The most pressing ones are the validity, reliability, sensitivity, and adequacy of user-tailored metrics. Challenges with validity are associated with the different systems used to validate metrics; for example, there is no gold standard against which to validate the output. The reliance on tools and their limited coverage, completeness, and correctness are heterogeneous issues that arise when validating metric results. The reliability of evaluation performance metrics (human judgment, automatic evaluation, etc.) depends on the transparency of the evaluation metric and on reproducible and comparable results. Brajnik and Vigo argue that the actual cause of low reliability is the sampling method adopted to evaluate the pages, such as the accessibility violation criteria, the data identified, and the formulae or methods used to compute the final score. Sensitivity and adequacy relate to the meaningfulness and suitability of the scores generated by the metrics. User-tailored metrics depend on the user's abilities, as all users have different needs and accessibility barriers affect users with different abilities in different ways. The aspects addressed by Brajnik and Vigo therefore need to be considered in future research.

Furthermore, Palaskar et al. [43] revealed that existing automated accessibility testing tools cover only around 50% of the Web Content Accessibility Guidelines. Although most of the rules are easy to understand, it is often challenging to implement all the natural-language rules in an automatic system. They also claimed that some rules are insufficient for ensuring accessibility and others are inappropriate, for example, the rules for checking the accessibility of color schemes and image captions. Considering only some specific aspects is not enough to develop an accessible website; accessibility checking sometimes requires more than the rules considered. Thus, validating the appropriate rules and incorporating all the guidelines is the major challenge for current web-based accessibility research.

Improvement directions (ID)

This section describes directions for future improvement of accessible development. Among 92 papers, ten (10) studies were related to improvement directions (representing 10.8% of the total literature). These investigated studies could be grouped into two main topics of interest, as presented in Table 5.

The ten studies related to improvement directions (ID), grouped by two topics of interest

In the context of improvement directions, a few studies focused on the technological aspects of accessible development. Edelberg and Verhulsdonck [44] noted that web developers and associated authorities choose colors and fonts based on the organization's identity. Through this process, however, it is not always possible to address accessibility issues such as inaccessible colors, contrast, and fonts, which matter for people with disabilities such as users with low vision. They suggested that colors, fonts, and supporting elements should be chosen so that they are perceived correctly, and that development should be encoded according to the content management system (CMS) to enhance the user experience of a wider audience. Brajnik and Vigo [42] pointed out the significant progress made on accessibility metrics in recent decades, but noted that immaturity is still present in modern development and that further research is needed. Based on their observations, they added a few improvement directions: for example, implementation should follow Agile, an iterative development model, to keep track of accessibility issues, and a hybrid approach involving human judgments from different levels of expertise and different users (e.g., by disability type) might improve the accessibility of the development. Miesenberger et al. [45] presented accessibility challenges related to cognitive disability together with improvement directions: individual, user-centered, and personal-services-based design and development should be ensured; accessibility requirements should be tested in development cycles with several testing tools (keyboard/mouse logging, eye tracking, etc.); an advanced development framework or platform for R&D should be incorporated to improve usability design; and development should follow a process model (e.g., Waterfall, Iterative, Spiral, Agile). In addition, Alismail and Chipidza [46] recommended following the WCAG 2.0 and 2.1 guidelines to develop accessible websites that address potential accessibility issues. They also emphasized user testing involving people with disabilities, integrating assistive technologies during web accessibility evaluation, incorporating accessibility requirements during the design, development, and maintenance phases, and arranging training for web developers and designers to spread accessibility awareness.

In the context of accessible prototype design, Bhagat and Joshi [47] presented a few technical recommendations to overcome accessibility challenges and noted that all accessibility requirements should be checked and validated by the website's quality assurance (QA) team. Ojha et al. [48] provided guidelines, based on their detailed study, for improving website accessibility together with readability. Their readability suggestions concern structural components of the website, such as hyperlinks and image alt text, which should be handled by a variable weight-based approach for the different elements of web pages; website dynamism should also be considered in the readability score computation to improve readability in terms of the accessibility of the website. Furthermore, Morris et al. [52] emphasized providing alt text for visual content for screen reader users. They articulated design guidelines for the representation of visual content, with prototype design requirements especially for people with vision impairments, to facilitate and improve the accessibility of visual content.

Framework design (FD)

This section describes several frameworks designed to contribute to the web evaluation process to facilitate web platform accessibility. Among 92 papers, seventeen (17) were related to framework design (representing 18.4% of the total literature). These investigated studies could be grouped into three main topics of interest, as presented in Table 6.

The seventeen studies related to framework design (FD), grouped by three topics of interest

To contribute to accessible user-centric design, Alahmadi [53] proposed a state-of-the-art framework for web accessibility evaluation that facilitates accessibility measurement and identifies violations of accessibility standards. The proposed model ensures user-centered design (UCD) based on usability and accessibility guidelines for deaf, visually impaired, and deaf-blind people. Kaur and Gupta [54] proposed a quality index evaluation framework to evaluate website design and ensure the quality of web design and development. Hassouna et al. [55] addressed significant issues for users with visual impairment and, concerning their accessibility requirements, designed an accessible web page prototype. A few studies focused on ontology design. For example, Sapna and Mohanty [56] proposed a large-scale test scenario management process using ontology modeling with the Web Ontology Language (OWL) to make software and web development and testing faster and more reliable. Kourtiche et al. [57] designed an ontology of user profiles that considers the user's disability context to capture various user requirements during accessible web development. Another study, by Fayzrahmanov et al. [58], developed a user interface that improves web navigability, taking into account the requirements of users with visual impairment.

In the context of web accessibility evaluation, Li et al. [59] designed an interactive web accessibility evaluation system based on the Chinese government guidelines; the framework combines automatic tools and human inspection to make evaluation feasible for large web pages. Alsaeedi [60] proposed a novel framework for evaluating the performance of two accessibility testing protocols in webpage evaluation. Song et al. [61] designed a crowdsourcing-based web accessibility evaluation framework to validate pages against WCAG; it generates an automatic accessibility score for each evaluated webpage according to the weight of each checkpoint. Giovanna et al. [62] developed an open accessibility evaluation support tool to improve automatic accessibility support following accessibility conformance testing (ACT) rules. Sanchez-Gordon and Luján-Mora [64] proposed an agile-environment-based accessibility evaluation framework to improve evaluation results based on automated tools, simulators, and expert and user testing. Song et al. [65] addressed the complexity of accessibility evaluation methods and the shortage of experts in this field, which make the accessibility evaluation process difficult and reduce its significance; they therefore proposed a crowdsourcing-based web accessibility evaluation system that uses decision strategies such as the golden set strategy and the time-based golden set strategy. Palaskar et al. [43] claimed that most existing Americans with Disabilities Act (ADA) tools detect only 50% to 60% of accessibility violations because the rules are not understandable, and they developed an API to test websites according to the WCAG 2.0 guidelines and the A, AA, and AAA conformance levels. Additionally, Acosta-Vargas et al. [66] designed a heuristic method for measuring the accessibility of websites to ensure an accessible and inclusive web platform.
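To illustrate the general idea of a weighted checkpoint score like the one described for the framework of Song et al. [61], the sketch below aggregates per-checkpoint results into a single score; the checkpoint names, weights, and aggregation rule are assumptions made for illustration, not the published formula.

```typescript
interface CheckpointResult {
  id: string;        // e.g., a WCAG success criterion
  weight: number;    // relative importance assigned to the checkpoint
  passed: boolean;   // outcome of the (automatic or crowdsourced) check
}

// Weighted score in [0, 1]: the share of the total checkpoint weight that passed.
function accessibilityScore(results: CheckpointResult[]): number {
  const totalWeight = results.reduce((sum, r) => sum + r.weight, 0);
  if (totalWeight === 0) return 0;
  const passedWeight = results
    .filter((r) => r.passed)
    .reduce((sum, r) => sum + r.weight, 0);
  return passedWeight / totalWeight;
}

// Example: failing a heavily weighted checkpoint lowers the score more than
// failing a lightly weighted one (weights here are placeholders).
const page: CheckpointResult[] = [
  { id: "1.1.1 Non-text content", weight: 3, passed: false },
  { id: "1.4.3 Contrast (minimum)", weight: 2, passed: true },
  { id: "2.4.4 Link purpose", weight: 1, passed: true },
];
console.log(accessibilityScore(page).toFixed(2)); // "0.50"
```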

Additionally, Won [ 67 ] developed a color tool to understand website color meaning for accessible design practice. The proposed approach can evaluate webpage HTML design prototypes and provide a clear understanding of product-specific colors, cross-cultural color meanings, and color preference. It assists designers in making better color decisions during the design and development phase.

Framework implementations (FI)

This section describes several studies that implemented different approaches to contribute to evaluating an accessible web platform. Among 92 papers, seventeen (17) were related to implementation purposes (representing 18.4% of the total literature). These investigated studies could be grouped into three main topics of interest, as presented in Table 7.

The seventeen studies related to framework implementation (FI), grouped by three topics of interest

Concerning web accessibility evaluation, many research studies have proposed decision support systems, evaluation tools, algorithms, frameworks, models, and interfaces. Mohamad et al. [68] developed a decision support system for large-scale compliance assessment against web accessibility recommendations and legislation; the architecture aims to provide scalable, interoperable, and integrated web accessibility assessment in the context of user-centric design for accessible web and mobile applications. Li et al. [69] proposed the EDBA decision support system for website accessibility evaluation at lower cost. Among other scientific studies, Žuliček et al. [70] developed an accessibility evaluation tool that evaluates a whole webpage, including subpages, to provide detailed analysis and simplified code refinement. Oliveira et al. [71] developed an accessibility assessment tool that analyzes the strengths and weaknesses of a website following the Web Content Accessibility Guidelines. Rashida et al. [72] developed an automated web-based tool that assesses the quality of academic websites by considering their information content, loading time, and overall performance metrics. Lim et al. [73] proposed an open-source, customized automated accessibility testing tool based on the existing Axe accessibility testing engine to scale up the accessibility testing process. Gaggi and Pederiva [39] developed an automatic tool that helps designers and developers understand which development aspects should be considered to produce an accessible website. In addition, Duarte et al. [74] developed an algorithm that automatically identifies the semantic similarity between web content and its textual description in the context of web accessibility evaluation guidelines or rules. Wu et al. [75] developed a semi-supervised regression algorithm that combines manual evaluation (webpage sampling) and automatic accessibility testing to generate the overall evaluation result for a website. Almeida and Baranauskas [38] developed a framework following universal design (UD) accessibility guidelines to help designers overcome accessibility barriers in web-based systems. Morato et al. [76] proposed a framework for automatic website accessibility checking in the context of readability, using a linguistic characteristics analyzer to identify the best linguistic features for detecting text readability.
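The customized tool of Lim et al. [73] is not reproduced here, but as a hedged illustration of how a check can be scripted on top of the axe engine that such tools build on, the sketch below uses the publicly available puppeteer and @axe-core/puppeteer packages; the target URL and the selected rule tags are placeholders.

```typescript
import puppeteer from "puppeteer";
import { AxePuppeteer } from "@axe-core/puppeteer";

async function auditPage(url: string): Promise<void> {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto(url, { waitUntil: "networkidle0" });

  // Restrict the run to WCAG 2.0/2.1 level A and AA rules.
  const results = await new AxePuppeteer(page)
    .withTags(["wcag2a", "wcag2aa", "wcag21a", "wcag21aa"])
    .analyze();

  // Each violation lists the failing rule and the affected DOM nodes.
  for (const violation of results.violations) {
    console.log(`${violation.id} (${violation.impact ?? "n/a"}): ${violation.help}`);
    console.log(`  affected elements: ${violation.nodes.length}`);
  }

  await browser.close();
}

auditPage("https://example.org").catch(console.error);
```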

In the context of accessibility evaluation for visually impaired people, Michailidou et al. [78] implemented an open-source web accessibility prediction model that predicts and visualizes the complexity of web pages in the form of a pixelated heat map. Another work, by Bonacin et al. [79], developed an adaptive interface with automatic recoloring facilities, focused on the requirements of people with Color Vision Deficiency (CVD), to facilitate their interaction with the web.

Additionally, for accessible prototype development, Matošević et al. [82] developed a machine learning-based expert knowledge system that classifies web pages, or parts of web pages, to improve compliance with search engine optimization (SEO) guidelines.

Testing (T)

This section describes the studies associated with the testing purpose for accessibility validation of web platforms. Among 92 papers, thirty (30) were related to accessibility testing (32.6% of the total literature). These investigated studies could be grouped into five main topics of interest, as presented in Table 8.

The thirty studies related to testing (T), grouped by five topics of interest

In the context of testing, many research studies focused on accessibility testing tools that validate website accessibility for several disabilities. A few studies tested accessibility issues using automated accessibility testing tools. Martins et al. [83] tested eHealth websites using a single accessibility testing tool to identify accessibility issues. Addressing the effectiveness of multiple automatic testing tools, Padure and Pribeanu [84] applied six accessibility evaluation tools to their selected websites and suggested that a single testing tool is not enough to identify all the accessibility issues of a website. Similarly, Marino and Alfonzo [35] claimed that automatic tools are inadequate for identifying all the accessibility issues of websites, so further manual observation is required. Accordingly, Hassouna et al. [85] introduced a semi-automated evaluation process that combines an automatic tool and human observation to evaluate website design prototypes. Their evaluation tool is effective because it identifies problems at the design stage: when it detects an error, it redirects to the design stage to show the problem so that design issues can be repaired without modifying the original code. Bhagat and Joshi [47] observed that a lack of awareness of assistive technologies and global accessibility standards is responsible for less inclusive and less accessible website design and development; they therefore conducted an experimental procedure combining automatic and user testing to help service providers, government divisions, and ministries ensure maximum accessibility of online platforms. Rysavy and Michalak [92] evaluated library tools and services in terms of accessibility and usability with open-source tools, emphasizing the involvement of blind student workers to validate the transparency of the results.

In contrast, a few studies evaluated websites using several tools and techniques to measure accessibility, usability, readability, and quality. Akgül [93] evaluated website accessibility, usability, quality, and readability using several tools and techniques. The author employed online open-source tools for accessibility testing, and visual and manual inspection for usability testing considering several design standards and Google search results. For quality performance, Akgül incorporated webpage monitoring software considering download time, page size, and objects per website. Finally, the author evaluated readability considering text alignment, webpage language, and all-caps text. Grant et al. [97] examined web accessibility and user experience using a hidden code optimization technique, aiming to motivate better web development practices and improve the overall holistic user experience. Ajuji et al. [96] and Kumar et al. [98] observed that, although web interactivity is increasing visibly, people with disabilities still find the web difficult to access, and they highlighted that websites have non-compliance issues against the W3C guidelines. Thus, to evaluate websites' conformance to the WCAG, Ajuji et al. implemented an automatic accessibility testing tool to evaluate websites in terms of being Perceivable, Operable, Understandable, and Robust, while Kumar et al. used a simulator to visualize the accessibility issues for different types of disabilities. Burkard et al. [99] countered that the importance and awareness of digital accessibility are often not recognized during web development and that, due to the complexity of the guidelines, people are not motivated to follow them. Therefore, automatic accessibility barrier checking, identification, and fixing is an important issue; they considered several accessibility monitoring systems to validate websites and compared tools in terms of completeness and correctness. Also, Alshamari [51] evaluated the accessibility of E-commerce websites through multiple accessibility evaluation tools to generate evaluation reports, locate potential errors, and issue warnings to help in accessible website design and development. Furthermore, Król et al. [105] evaluated the quality of websites through automatic testing tools considering website performance, SEO quality, website availability, and mobile friendliness.

In the context of a better user experience, Bai [31] emphasized joint accessibility and usability observation, as website accessibility and usability are highly correlated, and chose the most frequently used automatic conformance testing tools together with several usability testing models. Another study, by Yi [106], claimed that most websites are not accessible to people with visual impairment and are often not even readable by screen readers. This happens because websites have too many menus and frames and lack alternative text. Thus, Yi proposed a web accessibility evaluation process using questionnaire-based user testing with people with visual impairment. All the users tested websites using assistive technologies, such as screen readers, and shared their opinions by answering questions about the websites' accessibility.

Evaluation (E)

This section describes several accessibility evaluation methods and techniques. Among 92 papers, nineteen (19) were related to accessibility evaluation (20.6% of the total literature). These studies can be grouped into three main topics of interest, as presented in Table 9.

The nineteen studies related to evaluation (E), grouped by three topics of interest

Here, we focus on the studies performed for evaluation purposes. For accessibility evaluation, a few studies focused on questionnaire- and expert-based evaluation. Hassouna et al. [55] and Moreno et al. [107] argued that the web is less accessible for people with vision impairments. They utilized questionnaire-based evaluation of accessibility prototypes with the participation of people with visual impairment, and for the descriptive analysis of the questionnaire results they used statistical techniques to observe the relationship between the questionnaire items and the dependent variables. Another study, by Hadadi [110], stated that designers are not careful about considering the requirements of disabilities such as color blindness, and that accessibility criteria that should be integrated into design tools are therefore overlooked. This work evaluated the accessibility of widely used design tools through user feedback, aiming to increase accessibility awareness and encourage product designers to design and develop accessible solutions. Alcaraz Martínez et al. [114] noted that statistical charts on websites are valuable for representing information, but unfortunately such charts are often not accessible to people with low vision and CVD. Thus, they performed a heuristic accessibility evaluation of statistical charts focusing on the needs of people with low vision and CVD to find usability problems in user interface design. In another study, Giovanna et al. [62] conducted a quantitative and qualitative analysis of user feedback, measuring task completion time and success rate. In addition, some existing literature focused on the performance and effectiveness of automatic testing validators. Krawiec and Dudycz [108] evaluated the performance of automatic accessibility testing validators considering supported standards, the number of pages they can validate, user interface interactivity, software updates, and whether they are free or commercial; such an assessment helps to identify the most effective tool for specific requirements. Kous et al. [102] reinforced that several statistical methods based on quantitative data analysis are valuable for validating automatic web accessibility testing results. Grantham et al. [109] claimed that low literacy and numeracy skills can affect users' access to and understanding of website content, and that following accessibility guidelines and incorporating advanced assessment criteria against international legal accessibility requirements should be considered to ensure an accessible web.

Considering readability, Kimmons [111] claimed that most websites have accessibility issues with content understanding (readability) and structural elements. These issues introduce serious accessibility problems and are a leading cause of reduced accessibility. Another work was conducted by Sun et al. [113] to assess the accessibility of e-textbooks. They investigated accessibility considering reading time and accuracy on content-related questions, and evaluated the experimental results through composite, average, and weighted average scores to examine user experience and performance. In addition, Ojha et al. [48] addressed a wide array of accessibility and readability evaluation metrics for online content based on machine learning and statistical language modeling techniques.
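As a concrete illustration of the kind of readability metric used in such studies, the following is a minimal sketch of the Gunning FOG index, 0.4 * (words/sentences + 100 * complex_words/words), where complex words have three or more syllables; the syllable counter is a rough vowel-group heuristic, so the scores are only indicative.

```python
# A minimal sketch of the Gunning FOG readability index. The syllable counter is
# a rough vowel-group heuristic; production readability tools are more careful.
import re

def count_syllables(word: str) -> int:
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def gunning_fog(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    complex_words = [w for w in words if count_syllables(w) >= 3]
    if not sentences or not words:
        return 0.0
    return 0.4 * (len(words) / len(sentences) + 100 * len(complex_words) / len(words))

sample = ("Accessibility evaluation identifies barriers that prevent people with "
          "disabilities from perceiving, understanding, and operating web content.")
print(f"FOG grade level: {gunning_fog(sample):.1f}")
```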

In the context of usability evaluation, Radcliffe et al. [112] conducted an m-Health app evaluation considering accessibility and usability concerns. The evaluation was performed through rapid user testing and quantified usability feedback, which were validated through several standardized evaluation methods for inclusive design requirements specification. Wu et al. [33] argued that web designers and developers should focus on usability criteria rather than user experience alone, as ensuring usability improves accessibility. Thus, their paper presents several methods and techniques for usability and accessibility evaluation of web design, such as naturalistic observation, participatory evaluation, web-based methods, prototyping, usability inspections, and usability laboratory testing. Giraud et al. [116] indicated that filtering out redundant and irrelevant information, as sighted users do naturally, is crucial for people with visual impairments and improves the accessibility of the web. Therefore, to improve website usability, some specific needs of users with visual impairment must be considered. They conducted experiments with users with visual impairment to compare the accessibility of web content with and without irrelevant and redundant information filtered out. Cognitive load, performance, and participants' satisfaction were also investigated through the dual-task paradigm.

Figure 8 represents the number of papers on each topic of interest according to the seven processes. The figure shows that proposed approaches for accessibility testing to identify accessibility issues are more frequent than approaches for other processes such as development, implementation, and evaluation. The observation for research question 1 is that relatively few approaches were proposed for the development and implementation of accessible web evaluation, which remains an open concern for web researchers.

Fig. 8 Number of studies on each topic of research interest according to the seven processes/phases

RQ-2: What are the current engineering assets (tools, technologies, etc.) to support the evaluation of accessible web?

We analyzed the selected papers and identified several groups of interest considering our seven processes. Table 10 summarizes the 22 groups of topics of interest related to the seven processes, including a variety of methods, tools, and techniques, to answer our second research question.

Distribution of all the groups of the topics of interest in selected papers related to the seven processes

Asset description

From the 7 process groups and 22 topics of interest (Table 10), we highlight the main assets related to the engineering aspects supporting the technical processes that we found in our SLR, as offered by past researchers. These findings will help developers, web engineers, accessibility researchers, and associated authorities support the accessible design and development process. The identified assets are listed and described below.

Assets of accessibility requirements (AR)

(AR1.) Assets for the importance of accessibility and usability guidelines: (1) explanation of higher accessibility standards in website evaluation [31]; (2) explanation of the importance of accessibility guidelines and user requirements for people with disabilities [32]; (3) sets of usability requirements for the design of conventional visualization elements for people with cognitive barriers [33]; (4) explanation of the web usability and accessibility requirements of the WCAG 2.0 and ISO 9241 standards [34].

Therefore, in this group, we identified two subgroups of assets: s-group-1: 3 studies for the explanation, and s-group-2: 1 study for requirements.

(AR2.) Assets for accessibility, usability, and user experience improvement methods: (1) methods to improve accessibility, usability, and user experience [36]; (2) methods to understand user perception to improve usability and user experience [37].

Therefore, in this group, we identified one subgroup of assets: s-group-1: 2 studies for methods.

(AR3.) Assets for accessibility requirements specification: (1) Faware is a framework for accessibility requirements representation and implementation in visualization elements design and development [38]; (2) WCAG4All is a tool for understanding accessibility requirements following standards guidelines [39].

Therefore, in this group, we identified two subgroups of assets: s-group-1: 1 study for framework and s-group-2: 1 study for tools.

Assets of challenges (C)

(C1.) Assets of limited resource adequacy: (1) cost for maintenance, testing, and quality assurance is challenging and depends on organization size, capital, and opportunities [40]; (2) opportunities such as training programs and learning materials are not sufficient for accessibility knowledge improvement [56]; (3) practical experience and advanced knowledge of UX professionals from different countries are limited [41].

Therefore, in this group, we identified three subgroups of assets: s-group-1: 1 study for cost, s-group-2: 1 study for opportunities, and s-group-3: 1 study for experience and knowledge.

(C2.) Assets of success criteria validation: (1) metrics for accessibility evaluation concerning validity, reliability, sensitivity, and adequacy are challenging to ensure [42].

Therefore, in this group, one subgroup of assets was found: s-group-1: 1 study for metrics.

(C3.) Assets of rules optimization: (1) guidelines are not sufficient or appropriate, and are even difficult to incorporate in automated systems or web development processes [43].

Therefore, we found one subgroup of assets in this group: s-group-1: 1 study for guidelines.

Assets of improvement directions (ID)

(ID1.) Assets for technological aspects: (1) guidelines for accessible and functional prototype design and development [44]; (2) directions for accessible development [42]; (3) directions for cognitive disabilities and their particular accessibility barriers in recent development [45]; (4) suggestions for development that would be facilitated and tested during the design and development phase [46].

Therefore, in this group, we identified three subgroups of assets: s-group-1: 1 study for guidelines, s-group-2: 2 studies for directions, and s-group-3: 1 study for suggestions.

(ID2.) Assets for accessible prototype design: (1) directions for accessible prototype design and development [47]; (2) guidelines for improving website readability by ensuring proper structural components and website dynamism [48]; (3) suggestions for spreading awareness, organizing training, and focusing on accessible prototype design to make websites accessible to all, including people with special needs [49]; (4) suggestions for accessible prototype design to ensure advanced multimedia components [50]; (5) suggestions for some potential features that should be taken into consideration during feature development [51]; (6) guidelines for visual content representation and accessible prototype design for screen reader users [52].

Therefore, in this group, we identified three subgroups of assets: s-group-1: 2 studies for guidelines, s-group-2: 1 study for directions, and s-group-3: 3 studies for suggestions.

Assets of framework design (FD)

(FD1.) Assets for accessible user-centric design practice: (1) a state-of-the-art framework for university website accessibility evaluation for students with hearing and visual impairment [53]; (2) evaluation of web page prototype design considering blind users' requirements [55]; (3) OUPIP is a user profile-based ontological model for designers and developers to develop applications and devices considering users' needs, disability type, and dynamic context [57]; (4) a multi-axial serialization framework for users with visual impairment to understand and find the required information on a webpage [58]; (5) an ontology for the test management process to provide detailed knowledge about the specific domain and captured requirements for testing [56]; (6) a tool for quantitative measurement that evaluates website HTML code to identify the quality of the website design [54].

Therefore, in this group, we identified five subgroups of assets: s-group-1: 2 studies for framework, s-group-2: 1 study for evaluation, s-group-3: 1 study for models, s-group-4: 1 study for tool, and s-group-5: 1 study for ontology.

(FD2.) Assets for web accessibility evaluation: (1) a cost-effective crowdsourcing framework for web accessibility evaluation considering 25 checkpoints and 5 conformance levels [59]; (2) a framework to evaluate well-known automatic accessibility tools in terms of webpage accessibility through proposed measurement metrics [60]; (3) a crowdsourcing framework for web accessibility evaluation against web accessibility content guidelines checkpoints [61]; (4) an open and flexible accessibility testing tool to support single- and multi-page validation [62]; (5) WUAM is a framework for website usability and accessibility evaluation to improve website performance [63]; (6) a framework for web accessibility improvement following ISTQB in agile contexts [64]; (7) a crowdsourcing framework for website accessibility evaluation to identify accessibility barriers and determine the overall accessibility level [65]; (8) an API-based website accessibility testing tool following ADA guidelines to identify potential errors and violations, even without prior knowledge [43]; (9) a framework for website accessibility barrier measurement according to several variable magnitude techniques [50]; (10) a heuristic method to determine the level of accessibility of highly ranked websites [66].

Therefore, in this group, we identified four subgroups of assets: s-group-1: 7 studies for framework, s-group-2: 1 study for method, s-group-3: 1 study for tool, and s-group-4: 1 study for API.

(FD3.) Assets for accessible color design: (1) an accessible color suggestion tool for designers to improve their color judgment ability and increase their inspiration for accessible design practice [67].

Therefore, in this group, we identified one subgroup of assets: s-group-1: 1 study for tools.
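Accessible color design assets of this kind typically build on the WCAG contrast computation. The following is a minimal sketch of that computation, not the tool from [67]: relative luminance per the WCAG 2.x definition and the (L1 + 0.05)/(L2 + 0.05) contrast ratio, with arbitrary example colors.

```python
# A minimal sketch of the WCAG 2.x color-contrast computation that accessible
# color design tools typically rely on. Example colors are arbitrary.
def channel(c: int) -> float:
    c = c / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def relative_luminance(rgb: tuple[int, int, int]) -> float:
    r, g, b = (channel(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

ratio = contrast_ratio((119, 119, 119), (255, 255, 255))  # grey text on white
# WCAG 2.1 success criterion 1.4.3 requires at least 4.5:1 for normal text.
print(f"{ratio:.2f}:1 ->", "passes AA" if ratio >= 4.5 else "fails AA")
```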

Assets of framework implementation (FI)

(FI1.) Assets for web accessibility evaluation systems: (1) a user-centric holistic decision support environment system for web and mobile application accessibility evaluation [68]; (2) a cost-effective task assignment-based decision support system for web accessibility evaluation [69]; (3) a module for automatically analyzing, identifying, and solving accessibility issues [70]; (4) an automated website readability assessment model to improve the accessibility and readability of websites [76]; (5) ShoppingForAll is a tool for evaluating and identifying the strengths and weaknesses of a website in terms of user satisfaction and accessibility criteria [71]; (6) an algorithm for semantic similarity improvement of website content from the web accessibility perspective [74]; (7) a tool for quality assessment of university websites by assessing website source code [72]; (8) FAware is a tool to report accessibility issues and available suggestions [38]; (9) a semi-supervised model to evaluate and predict website accessibility [75]; (10) an open-source, industry-standard tool to address the shortcomings of current accessibility testing tools in the local government context [73]; (11) WCAG4All is a tool for consulting web designers and developers about accessibility guidelines [39]; (12) WAccess is an open-source browser extension accessibility testing tool to evaluate websites against WCAG guidelines [77].

Therefore, in this group, we identified five subgroups of assets: s-group-1: 2 studies for systems, s-group-2: 1 study for modules, s-group-3: 2 studies for models, s-group-4: 6 studies for tools, and s-group-5: 1 study for algorithms.

(FI2.) Assets for web accessibility evaluation for visually impaired users: (1) ViCRAM is a tool to predict the visual complexity of web pages associated with accessibility issues for people with visual impairment or low vision [78]; (2) FAIBOUD is a framework to facilitate the interaction of CVD people with the web [79]; (3) an automatic system for identifying website drop-down menu widgets [80].

Therefore, in this group, we identified three subgroups of assets: s-group-1: 1 study for tool, s-group-2: 1 study for framework, and s-group-3: 1 study for the system.

(FI3.) Assets for accessible prototype improvements: (1) a method to improve accessibility by modifying faulty code into correct code to make content management system-based websites more accessible [81]; (2) an expert knowledge system to detect web page SEO quality [82].

Therefore, in this group, we identified two subgroups of assets: s-group-1: 1 study for method and s-group-2: 1 study for the expert system.

Assets of testing (T)

(T1.) Assets for automatic detection of accessibility issues: (1) ACCESSWEB is an automated validator for accessibility evaluation considering different accessibility guidelines [83]; (2) TAW is an automated validator for evaluating web pages against web content standards [35]; (3) Total Validator is an automated validator to validate accessibility against standards guidelines [86]; (4) a semi-automated process is used to evaluate website design prototypes and repair them without modifying the original page code [85]; (5) the AChecker and TAW automated validators are used to validate the accessibility of websites and identify the associated issues that violate accessibility guidelines [87]; (6) automatic testing with the AChecker, Total Validator, WAVE, and HTML/CSS/ARIA automated validators for the evaluation of higher educational institute websites [88]; (7) a hybrid accessibility testing process with the AChecker, WAVE, and aXe automatic accessibility testing tools and the JAWS and Non-Visual Desktop Access screen reader applications [47]; (8) WAVE is an automated validator to indicate accessibility issues and related accessibility features [89]; (9) AChecker, Cynthia Says, Mauve, TAW, Total Validator, and WAVE are automated validators used to identify accessibility issues and compare their results to understand the effectiveness of each system [84]; (10) AChecker, WAVE, and SortSite are automated validators to identify the shortcomings of websites [46]; (11) AChecker, Cynthia Says, EIII Checker, MAUVE, SortSite, TAW, Tenon, and WAVE are automated validators used to assess the effectiveness of results considering coverage completeness, correctness, specificity, inter-reliability and intra-reliability, validity, efficiency, and capacity [90]; (12) multi-tool accessibility assessment through automated validators such as AChecker, Cynthia Says, Tenon, WAVE, Mauve, and Hera to perform a comparative analysis of websites and identify the most effective testing tool [49].

Therefore, in this group, we identified two subgroups of assets: s-group-1: 10 studies for the automated validator, and s-group-2: 2 studies for the process.

(T2.) Assets for content evaluation for osteoarthritis: (1) SMOG and FOG are two automated validators to determine webpage content readability, considering informative images and relevant video [91].

Therefore, we identified one subgroup of assets in this group: s-group-1: 1 study for the automated validator.

(T3.) Assets for accessibility evaluation for blind users: (1) WAVE is an online automated validator for identifying accessibility issues of library tools and services for blind users [92].

Therefore, in this group, we identified one subgroup of assets: s-group-1: 1 study for the automated validator.

(T4.) Assets for accessibility evaluation: (1) a hybrid evaluation approach for improving user experience [97]; (2) a hybrid evaluation process for accessibility, usability, quality, and readability testing [93]; (3) a semi-automated evaluation process incorporating AChecker, Total Validator, WAVE, and expert opinion to examine webpage code [94]; (4) AChecker is an automated validator used to analyze education cooperative websites and determine their accessibility considering disabilities [95]; (5) TAW is an automated validator to validate websites against WCAG 2.0 conformance [96]; (6) a simulator for visual, hearing, and mobility impairment to visualize the accessibility issues associated with a particular disability [98]; (7) a semi-automated process considering axe Monitor, Pope Tech, Siteimprove, and ARC with user feedback to validate website accessibility [99]; (8) WAVE is an automated validator to validate the accessibility of COVID-19 vaccine registration portals [100]; (9) accessibility evaluation through comparative analysis using automatic accessibility testing protocols and statistical observation [101]; (10) AChecker is an automated validator to evaluate website accessibility [102]; (11) AChecker, Cynthia Says, and TAW are automated validators to validate website e-accessibility [103]; (12) a comparative analysis using the Webaccessibility automated accessibility validator and statistical techniques to validate websites against WCAG 2.1 conformance guidelines [104]; (13) Google PageSpeed Insights, Blink Audit Tool, Backlink Checker, WAVE, and Bulk are automated validators to assess and evaluate website quality [105]; (14) AChecker, TAW, EvalAccess, MAUVE, and FAE are automated validators to identify the accessibility issues of the selected websites [51].

Therefore, in this group, we identified five subgroups of assets: s-group-1: 2 studies for evaluation, s-group-2: 7 studies for the automated validator, s-group-3: 1 study for the simulator, s-group-4: 2 studies for analysis, and s-group-5: 2 studies for the process.

(T5.) Assets for better user experience: (1) FAE, Nielsen's 10-item metric, and Baker's six dimensions are automated validators for accessibility and usability testing for a better user experience [31]; (2) questionnaire-based user assessment to identify accessibility incompatibilities with screen reader applications [106].

Therefore, in this group, we identified two subgroups of assets: s-group-1: 1 study for automated validator and s-group-2: 1 study for assessment.

Assets of evaluation (E)

(E1.) Assets for accessibility evaluation methods: (1) manual assessment through assistive technology with users and experts in this field [83]; (2) questionnaire-based assessment for people with visual impairment through several data analysis techniques [85]; (3) questionnaire-based evaluation for discovering the navigation strategies of people with low vision that cause them to experience accessibility barriers [107]; (4) an automatic assessment system to identify the most effective validator for accessibility testing [108]; (5) statistical data analysis to validate the reliability of questionnaire results [55]; (6) quantitative data analysis using statistical analysis methods [102]; (7) manual assessment criteria for accessibility assessment of Australian private and governmental websites against DDA standards [109]; (8) user evaluation of Adobe online design platform tools with the help of mix panel data analysis [110]; (9) statistical evaluation for quality analysis of websites [105].

Therefore, in this group, we identified three subgroups of assets: s-group-1: 4 studies for assessment, s-group-2: 3 studies for evaluation, and s-group-3: 2 studies for analysis.

(E2.) Assets for readability evaluation tools and techniques: (1) metrics/tools for website content readability measurement to make website content universally accessible [48]; (2) descriptive evaluation of university homepages to validate readability [111].

Therefore, in this group, we identified two subgroups of assets: s-group-1: 1 study for metrics/tool and s-group-2: 1 study for evaluation.

(E3.) Assets for usability evaluation methods: (1) statistical techniques for usability testing of an m-Health application [112]; (2) questionnaire-based evaluation for user experience testing [113]; (3) quantitative and qualitative analysis considering user performance, task completion time, and correct task completion ratio [62]; (4) quantitative and qualitative analysis with statistical measurements to evaluate user perceptions [115]; (5) statistical analysis to determine the relationship between web accessibility and usability [31]; (6) user evaluation to improve website accessibility and interface usability by reducing the cognitive load of people with blindness [116]; (7) a hybrid evaluation process to identify the effectiveness of usability and interface design [33]; (8) quantitative and qualitative analysis to evaluate user perceptions of interactive user interface design [114].

Therefore, in this group, we identified three subgroups of assets: s-group-1: 1 study for technique, s-group-2: 3 studies for evaluation, and s-group-3: 4 studies for analysis.

Figure 9 shows a graphical representation of the assets identified in this SLR. The result for research question 2 (Fig. 9) illustrates that automated validators, tools, and frameworks are the main research assets in the investigated area. It demonstrates that most past researchers and the scientific community contributed to accessibility research using existing automated validators. Recently, researchers have focused on developing accessibility testing tools and designing frameworks to contribute to accessibility practice, though the number of developed tools and frameworks is still limited. In addition, a small group of researchers has conducted studies on other aspects of the accessibility context.

Fig. 9 Identified assets of the research outcome

Research context’s investigation results

This section highlights the contexts we focused on in this SLR. The first context of the discussion is the investigated domain of past studies. Figure 10 shows the number of papers in each research area found in this SLR. Most studies focused on education, such as government and higher education institute websites, while few studies focused on other areas such as libraries, health care, electronic materials (e.g., eBooks, visual charts), tourism, and E-commerce. Accessibility research has made a significant contribution to national and international legislation for developing accessible software or websites in different domains. However, more investigation of accessibility measurement should be carried out in other areas to present accessible systems within a broad scope of future research. Besides, during the COVID-19 pandemic, accessible healthcare websites were significantly valuable and a crucial requirement for the world community [117]. However, the observation results show that the number of studies focusing on the healthcare domain is not adequate, which represents a current research gap in this particular domain. This finding exposes the necessity of devoting continued effort to investigating the healthcare domain in future research.

Fig. 10 Number of studies in each area of research considering the accessibility domain

Figure  11 shows that according to the investigated platforms, most of the selected studies focused on web systems (75 studies), four (4) studies focused on tools and applications, and six (6) studies presented platform-independent approaches.

Fig. 11 Number of investigated studies for each platform

Regarding guidelines, most of the selected studies followed the WCAG standards to evaluate and develop the web or software applications. Figure 12 shows that WCAG is the dominant and accepted standard referenced for primary accessibility guidelines in accessible solution and prototype design as well as user-centric design. WCAG is also extensively used as a reference guideline in the development of accessibility assessment and testing tools. However, although WCAG is widely incorporated, some issues are difficult to solve by imposing this standard alone. Thus, a wide variety of supporting resources and other guidelines or standards is crucial to help web developers and designers address accessibility issues and overcome current accessibility limitations.

Fig. 12 Number of studies according to the focused guideline

Regarding programming languages, the languages most frequently used to implement the proposed methods, tools, and frameworks were JavaScript (object-oriented), Python (high-level programming language), HTML (markup language), CSS/SCSS (style description languages), PHP (scripting language), C++ (case-sensitive language), OWL (knowledge representation language), and SWRL (Semantic Web Rule Language). The most frequently noted engineering tools were the Apache and MySQL web servers, the Oracle database, JavaScript (React), FontAwesome, and the Axe, Chrome, and HTML Code Sniffer accessibility evaluation libraries. Frequently applied Application Programming Interfaces (APIs) are Clarifai (for image and video), Indico (for semantic matching), Swoogle, AATT, and REST APIs for Windows and Linux operating systems. Most tested websites were built on content management systems such as WordPress, Joomla, and Drupal. Test reports are represented in the Extensible Markup Language (XML), the Evaluation and Report Language (EARL), and the Portable Document Format (PDF). Selenium Web Browser Automation and ChromeDriver tools, together with the WebDriver and MutationObserver APIs, are also effective web engineering tools.
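Building on the tools just mentioned, the following is a minimal sketch of driving a browser with Selenium WebDriver and ChromeDriver to inspect the rendered DOM, assuming the Selenium 4 Python bindings; the URL is hypothetical and the single alt-text rule is only illustrative, not a full accessibility audit.

```python
# A minimal sketch of a Selenium/ChromeDriver session that inspects the rendered
# DOM of a JavaScript-heavy page, which static HTML checkers would miss.
# Headless mode and the single rule below are illustrative choices.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")
driver = webdriver.Chrome(options=options)
try:
    driver.get("https://example.org")
    # Inspect the DOM after client-side rendering has finished.
    for img in driver.find_elements(By.TAG_NAME, "img"):
        if not img.get_attribute("alt"):
            print("Image missing alt text:", img.get_attribute("src"))
finally:
    driver.quit()
```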

Generally, the effectiveness and performance of the web concerning accessibility have been assessed through automatic testing (accessibility and usability) and human observation. Figure 13 shows that the most frequently implemented testing tools are WAVE, AChecker, and TAW. Earlier studies also addressed other accessibility and usability testing tools such as Mauve, Cynthia Says, Total Validator, aXe Monitor, Tenon, Siteimprove, and SortSite. Among the many available automatic testing tools, only a few specific tools have been implemented frequently in the past literature. Despite the availability of a wide array of accessibility testing tools (approximately 75 according to the W3C), most tools are underused, and many web designers and developers are unaware of these tools and their effectiveness [118]. In the investigated literature, only three studies compared multiple automatic accessibility testing tools to evaluate their effectiveness. This limited number of comparative analyses is not sufficient to show the usefulness of the existing automated tools. Thus, it is crucial to devote continued effort to further comparative analyses of the benefits of automatic testing tools in future accessibility research.

Fig. 13 Number of studies considering implemented testing tools

Concerning accessibility and usability evaluation and the validation of results, SPSS, Microsoft Excel, and STATISTICA were the most used statistical analysis tools. Frequently used statistical techniques are the standard deviation (SD), Pearson's correlation analysis, one-way ANOVA, the System Usability Scale (SUS), Tierney's 7-min accessibility assessment and app rating system, z-score calculation, the Kolmogorov–Smirnov test, the Shapiro–Wilk test, the Wilcoxon signed-rank test, and the computation of the arithmetic mean, median, coefficient of variation, and minimum and maximum values. According to past literature, these statistical techniques are effective in accessibility evaluation and validation practice.
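As a concrete illustration, the following is a minimal sketch of two of the techniques listed above: SUS scoring and Pearson's correlation between SUS scores and the number of violations reported by an automated validator. The sketch assumes SciPy is available, and all questionnaire responses and violation counts are invented for illustration.

```python
# A minimal sketch of SUS scoring and Pearson's correlation. All data are invented.
from scipy.stats import pearsonr

def sus_score(responses: list[int]) -> float:
    """responses: ten 1-5 Likert answers in standard SUS questionnaire order."""
    odd = sum(responses[i] - 1 for i in range(0, 10, 2))   # items 1,3,5,7,9
    even = sum(5 - responses[i] for i in range(1, 10, 2))  # items 2,4,6,8,10
    return (odd + even) * 2.5                              # 0-100 scale

sus = [sus_score(r) for r in ([4, 2, 4, 1, 5, 2, 4, 2, 4, 2],
                              [3, 3, 2, 4, 3, 3, 2, 4, 3, 3],
                              [5, 1, 5, 2, 4, 1, 5, 1, 4, 2])]
violations = [12, 41, 7]  # accessibility errors reported by an automated validator
r, p = pearsonr(sus, violations)
print(f"SUS scores: {sus}, Pearson r = {r:.2f} (p = {p:.2f})")
```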

Concerning publication frequency, the observation results show that between 2010 and 2021, seven (7) studies were published per year on average. Figure 14 shows that the number of published studies was low until 2017. Since then, the number of published works has grown, and between 2020 and 2021 it showed tremendous growth. This significant growth indicates that web researchers are increasingly concerned about the importance of an accessible web and about ensuring the accessibility of digital platforms.

Fig. 14 Number of publications per study year for the SLR

Considering our seven processes, we classified the selected papers into three periods: 2010–2013, 2014–2017, and 2018–2021. As shown in Fig. 15, the number of publications between 2018 and 2021 was much higher in each of the 7 processes compared to the earlier periods. This increase was greatest in testing, while the rise of articles in the implementation, evaluation, and design areas is also remarkable. These statistics indicate that concern about digital accessibility has increased in recent years. Compared with other processes, accessibility requirements, challenges, and improvement directions are under-investigated topics in accessibility research. In addition, the number of papers on development methods (development and implementation) is also limited. This observation highlights the importance of devoting continued effort to future research concerning accessibility requirements, challenges, improvement directions, and development methods.

Fig. 15 Number of publications in the seven processes of the SLR considering three time periods

As the prime objective of accessibility research is to ensure that online platforms are accessible to people with disabilities, in this SLR we classified the past studies according to their focused disability type. Almost one-third of the selected studies did not focus on any group of disabilities (see Fig. 16). A prominent number of studies addressed all types of disabilities, and the number of studies focused on visual impairment is also noticeable. However, compared to these three criteria (AI (area independent), AD (all types of disabilities), and VD (visual impairment)), few studies considered cognitive, sensory, and physical disability issues. Apart from the investigated disability types, it is crucial to continue the effort for other cases, such as hearing disabilities, motor disabilities, children with special needs, and autism.

Fig. 16 Publications by focused disability group

Despite the importance of applications that support the web development process in ensuring accessible application development, studies related to application development for accessibility are still limited compared to studies on web accessibility evaluation. This result shows the importance of putting effort into methods, tools, and assets to support the development of accessible websites and web applications, considering the engineering features of these platforms.

Web accessibility in past studies

In our search for past studies, we found seven SLRs addressing web accessibility. Najadat et al. [119] indicated that research on web accessibility has grown since 2007; however, the development of accessibility evaluation tools, metrics, and standards was addressed poorly in past literature. They showed the most common web metrics regarding design, speed, size, diagnostic tools, and metrics for better provision of services. Following this, an SLR carried out by Muniandy and Sulaiman [120] showed that, over the years, accessible application design for visually impaired people, including mobile applications, computer applications, and online web applications, has gained immense popularity. Research conducted by Baldwin and Ching [121] identified that user-centric web prototype design would help improve accessibility for people with disabilities in upcoming development.

Addressing these issues, an SLR carried out by Akram and Sulaiman [14] indicated that many studies published between 2009 and 2017 were devoted to the development of automated tools to validate technical aspects against accessibility conformance or guidelines. Despite the importance of automatic accessibility testing tools, the lack of advanced techniques to develop them means that human observation of how people with disabilities interact with interactive systems is still required. With the same focus, an SLR carried out by Campoverde-Molina et al. [15] stated that a synthesis study is crucial to determine web accessibility standards and evaluation methods, and they also indicated that the testing process remains the main focus of current web research. In another SLR, Campoverde-Molina et al. [16] added that the majority of the examined websites have potential accessibility issues, which calls for further investigation and more research in this field.

In our findings, we identified a few studies related to accessible design patterns for rich internet applications (RIA), accessibility guidelines visualization, and user interface design. Compared with the previous SLR studies proposed by Akram and Sulaiman and by Campoverde-Molina et al., our study also identifies the importance and growth of accessibility requirements elicitation. They added that research on accessible development and evaluation techniques, user-centric design, and the requirements of users with disabilities should be considered.

Further, an SLR conducted by Oh et al. [122] indicated that web accessibility research in the areas of web image analysis and web-based gamification or game development has increased. They added that understanding visual information (e.g., images) is a critical challenge for people with low vision. Another SLR, by Salvador-Ullauri et al. [123], showed that web-based games are helpful for teaching and learning for people with disabilities; web and game developers and designers are interested in implementing accessible features, as accessibility guidelines are not limited to a particular group of people. However, from the comparative analysis of previous SLRs (Table 11), we can observe that most past SLR studies lacked consideration of the development and implementation approaches for web evaluation that are necessary to include in our SLR process.

Evaluation of past SLRs considering the seven processes of the proposed SLR

Observation of research

Among the seven processes considered in the investigated studies, challenges and accessibility requirements were covered by the least literature. The primary reason might be the current research focus: the majority of the research concentrated on the development of evaluation and testing methods, although addressing accessibility challenges during web development and emphasizing the importance of accessibility guidelines is also important [124]. Without demonstrating the challenges that might arise during the development process and their associated solutions, it is barely possible to ensure accessibility for digital sources (e.g., websites, software, etc.). To improve this, more attention should be given to identifying the major challenges associated with developing accessible solutions and to demonstrating the accessibility guidelines and their advancements. Besides, the literature on framework design and development/implementation is not significant compared to other processes (e.g., testing). There was also limited investigation of evaluation metrics for the correlation between experimental results and user (e.g., people with disabilities) perceptions, which introduces an urgent need to investigate accessibility result validation systems. In addition, our SLR results illustrate that most of the research relied on automatic accessibility testing tools to investigate the accessibility of web platforms, while largely neglecting engineering asset development. Therefore, our SLR shows the importance of future research on updated methods, techniques, processes, and approaches to support ensuring an accessible web.

However, a positive finding observed in this SLR was the rapid growth in the number of studies in the accessibility context. Improving accessibility means developing accessible applications and solutions to help users with various disabilities. This perspective emphasizes that developed systems should focus on user requirements (especially those of users with special needs) to ensure user-centric design, considering user involvement and global accessibility design guidelines for digital inclusion. To enable accessible development in companies and governmental organizations, several governments, for instance in the United Kingdom, the European Union, and China, as well as other public and private organizations, have proposed rules to improve the accessibility of digital services. Despite several new digital content accessibility guidelines, investigating new processes, tools, and techniques remains a significant challenge, which underlines the importance of future and state-of-the-art research.

This paper presented a systematic literature review considering accessibility in the context of web evaluation processes. We attempted to take a small step toward contributing to this research by pointing to new directions for future goals and considerations.

This study showed that automatic accessibility testing and evaluation have been the focus of research in the last decade for ensuring the inclusion of accessible web content. There was a great increase in the number of published works after 2017 compared to previous years.

In the past, most of the literature focused on visual impairment, and very few papers discussed other disabilities, such as hearing, physical, and cognitive disabilities. In this SLR, we found requirements, challenges, engineering techniques, ontologies, frameworks, APIs, algorithms, and testing tools addressing different levels of satisfaction associated with disabilities, but especially with visual impairment. Therefore, we identified and reported a research gap regarding other disabilities.

Unfortunately, there are few reference architectures for accessible web design, development, and evaluation processes; examples are a framework for accessibility improvement for people with color vision deficiency [79], an approach for automatically identifying widgets [80], and an accessibility testing and refinement tool for the early design phase [110]. It would be beneficial to develop other reference architectures focusing on other contributing areas to solve three problems: (i) a framework for developers to identify and implement accessibility features to improve accessibility issues, (ii) easy methods to understand and ensure accessibility requirements concerning every type of disability during the development phase, and (iii) updated automatic accessibility testing protocols incorporating the latest WCAG rules. To overcome these problems, developing new methods and tools could be a research topic in the upcoming years.

Considering the accessibility of current web platforms in general, currently available web resources (websites, web-based games, web/mobile applications, etc.) are not accessible. Recently, the governments of many countries have imposed accessibility-related laws (e.g., based on WCAG) to ensure accessibility requirements. Furthermore, the methods and tools available to solve accessibility problems have limitations, which directs future research toward the development of engineering approaches.

For current accessibility research, there are many challenges in incorporating the updated WCAG. Regarding automatic accessibility testing protocols, several studies focused on only a limited number of guidelines and disability requirements, and studies on the design and development of accessibility testing protocols are limited. Thus, automatic accessibility testing protocol development concerning different disabilities and elderly users' requirements could be a research area in the upcoming years.

Finally, consideration of several methodologies and open-source developments for ensuring accessibility is significantly important. Recently, several researchers and companies have been developing web-based solutions by adopting accessibility requirements. They develop open-source software that has an essential role for end-users and corporations. Accessibility is a crucial technological aspect of developing a new solution for any domain.

Author’s contribution

JA and CS-L conceptualized the study. JA undertook the investigation and statistical analyses and managed the data with support from CS-L and AK and wrote the first draft of the paper. All authors critically inputted into the draft and reviewed it and agreed on the final version. All authors have read and agreed to the published version of the manuscript.

Open access funding provided by University of Pannonia. The authors declare that this article is their research and was not financially supported.

Declarations

The authors declare that there is no conflict of interest.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Contributor Information

Jinat Ara, Email: [email protected] .

Cecilia Sik-Lanyi, Email: lanyi@almos.uni-pannon.hu.

Arpad Kelemen, Email: kelemen@umaryland.edu.


Higher Education Compass

Web Engineering (full time), Master of Science

Master Degree

Standard period of study: 4 semesters

expired (Germans and inhabitants)

expired (EU), expired (Non-EU) Please enquire

Overview and admission

Admission semester.

Summer and Winter Semester

Area of study

  • Computer Science
  • Software Engineering

Distributed Systems, Software Methods, Databases

Admission modus

open admission

Admission requirements

The beginning of studies for new students is usually the winter semester. Subject to the agreement of the examination board, enrolment is also possible for the summer semester, although this may lead to the standard period of study being extended.

Lecture period

  • 09.10.2023 - 02.02.2024
  • 02.04.2024 - 12.07.2024

Application deadlines

Winter semester (2023/2024)

Deadlines for international students from the European Union

Registration is via uni-assist.de.

Deadlines for international students from countries that are not members of the European Union

Application deadline for Germans and inhabitants

Extension period for enrolment: to 06.10.2023

Enrollment deadline for Germans and foreign students

Summer semester (2024).

Registration is via uni-assist.de. Applications are only accepted for the first study semester.

Deadline Extension for enrolment: until 05.04.2024

Languages of instruction

Main language.


When Software Engineering Meets Quantum Computing


Over the last few decades, quantum computing (QC) has intrigued scientists, engineers, and the public across the globe. Quantum computers use quantum superposition to perform many computations, in parallel, that are not possible with classical computers, resulting in tremendous computational power. 7 By exploiting such power, QC and quantum software enable many applications that are typically out of the reach of classical computing, such as drug discovery and faster artificial intelligence (AI) techniques.

Quantum computers are currently being developed with a variety of technologies, such as superconducting and ion trapping. Private companies, such as Google and IBM, are building their own quantum computers, while public entities are investing in quantum technologies. For example, the European Union Commission is spending €1 billion on quantum technologies (“EU’s Quantum Flagship Project’s Website” a ). Currently, the key goal for quantum computers is to reduce hardware errors that limit their practical uses. Regardless of the eventual technology that wins the quantum hardware race, the key enabler for building QC applications is quantum software (see Figure 1 ).


Quantum software needs to be supported by a quantum software stack, ranging from operating systems to compilers and programming languages (see examples in Table 1), as postulated by Bertels et al. from the University of Porto. 3 Quantum software engineering (QSE) enables the cost-effective and scalable development of dependable quantum software to build revolutionary quantum software applications in many domains—for example, finance, chemistry, healthcare, and agriculture (see Figure 1 and Table 1). However, effective quantum software applications cannot be developed with classical software engineering methods due to quantum computing’s inherent characteristics—for instance, superposition and entanglement. Thus, we need to build novel QSE methodologies (with tool support) that cover different phases of QSE, possibly including requirements engineering, modeling, coding, testing, and debugging as shown in Figure 1.


In this article, we first present a general view of quantum computing’s potential impact, followed by some highlights of EU-level QC initiatives. We then argue the need for QSE, present the state of the art of QSE from multiple aspects (testing, for example) by comparing quantum computers with their classical counterparts, and shed light on possible research directions.

The Impact of Quantum Computing

Quantum computing is primed to solve a broad spectrum of computationally expensive societal and industrial problems. Notable examples include accelerated drug discovery and vaccine development in healthcare, portfolio management and optimization in finance, and complex simulations in physics to better understand our universe. As a result, QC’s success will inevitably and significantly impact our day-to-day lives and revolutionize most industries across many domains. Such impact must be realized via quantum software, the development of which should be systematically powered by QSE. Scientifically speaking, QSE will open new areas of research to develop real applications by fostering research communities across disciplines (such as computer science, software engineering, mathematics, and physics) and interactions with other fields such as medicine, chemistry, and finance. Table 1 summarizes various dimensions of QC with examples.

EU-level quantum initiatives. Efforts to build quantum computers in Europe are increasing. VTT–Technical Research Centre of Finland, together with IQM, aims to build Finland’s first 25-qubit, fully functional quantum computer by 2024. In Sweden, the Wallenberg Centre for Quantum Technology at Chalmers is building a superconducting quantum computer capable of up to 100 qubits. The Future and Emerging Technologies’ Quantum Technologies Flagship program also funds projects to build quantum computers. For example, AQTION is building Europe’s first ion-trapped quantum computer, while OpenSuperQ is focused on building a 100-qubit superconducting QC. To boost research on the development of novel QC applications in Germany, Fraunhofer installed an IBM Quantum System One to provide access to organizations interested in developing QC applications. Finally, NordiQuEst is a new collaborative effort between four Nordic countries and Estonia to build a dedicated Nordic-Estonian QC ecosystem that will integrate various quantum computers and emulators and make them accessible to the Nordic-Estonian region to accelerate QC research, development, and education.

Building practical and real-life QC applications requires the implementation of quantum algorithms as software. Learning from the classical computing realm, developing dependable software entails following a software development life cycle (SDLC), which typically includes requirements engineering, architecture and design, development, testing, debugging, and maintenance phases.

Given that quantum software development is relatively new, an SDLC for quantum software does not yet exist. However, quantum programming languages are available to implement quantum algorithms (see examples in Table 1). In their current state, these languages allow programming at a low level—for instance, as quantum circuits consisting of quantum gates. Figure 2 shows a quantum program example in IBM's Qiskit performing quantum entanglement, its equivalent quantum circuit in the middle, and the execution result on the right.
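As an illustration of what such a program looks like, the following is a minimal sketch of a two-qubit entanglement (Bell-state) circuit in Qiskit. It is not the exact program shown in Figure 2, and it assumes the qiskit and qiskit-aer packages are installed.

```python
# Minimal Bell-state sketch in Qiskit (assumes qiskit and qiskit-aer).
from qiskit import QuantumCircuit
from qiskit_aer import AerSimulator

circuit = QuantumCircuit(2, 2)
circuit.h(0)               # put qubit 0 into superposition
circuit.cx(0, 1)           # entangle qubit 0 with qubit 1
circuit.measure([0, 1], [0, 1])

# Run on a local simulator; counts concentrate on '00' and '11'.
counts = AerSimulator().run(circuit, shots=1024).result().get_counts()
print(counts)
```

Even this small example requires knowing which gates (Hadamard, CNOT) realize the intended quantum behavior, which illustrates the background barrier discussed next.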


Programming quantum circuits is challenging, as evidenced in the example, because it requires a specialized background in quantum physics, including an understanding of how quantum gates work. Unfortunately, classical computing programmers do not often possess such a background, thus making it difficult for them to program quantum computers. Moreover, in the context of quantum SDLC, quantum programming is just one aspect; attention must be given to other SDLC phases, such as requirements, design, and architecture; verification and validation; and maintenance.

Quantum software requirements engineering and quantum software modeling. Due to the increasing complexity of software application domains, requirements engineering is critical, as it is the process of eliciting, specifying/modeling, and managing requirements, among other activities. During the process, various stakeholders—domain experts and software developers, for example—interact, and the resulting requirements serve as key artifacts to drive software design and development. From this perspective, we consider that requirements engineering in the quantum world aligns with requirements engineering of its classical computing counterpart (see Table 2 for details). However, QC brings new challenges.


First, its software engineers often find its application domains—for instance, radiotherapy optimization or drug discovery—hard to comprehend. Second, quantum software engineers must also equip themselves with basic knowledge about quantum mechanics, linear algebra, algorithms and their analysis, and more. Therefore, requirements engineering is very important for easing communication among various stakeholders while raising the level of abstraction in understanding the domain and linking the domain to analysis, design, and implementation. To the best of our knowledge, requirements engineering for quantum software is an uncharted area of research. There has not been any publication in this area yet. We argue that, as with classical software, quantum software engineering requires the development of novel elicitation, specification, modeling, analysis, and verification methods.

Quantum finite-state machines, and the study of their formal properties, have been investigated in the literature. 8 However, their application to quantum software modeling remains unstudied. Recently, there has been an increasing interest in extending the Unified Modeling Language (UML) to model quantum software, mainly in the classical software engineering community as highlighted by European researchers. 2 More research is needed, though, to determine whether extending UML is sufficient or more domain-specific modeling solutions are required. In general, there are many opportunities for quantum software modeling, such as developing novel and intuitive quantum modeling notations and methodologies, verification and validation with quantum software models, and empowering code/circuit generation.

Quantum software testing. It is important to ensure quantum programs are correct—that is, they can deliver their intended functionalities. Testing quantum programs is difficult compared to classical software due to their inherent characteristics, including their probabilistic nature; computations in superpositions; the use of advanced features, such as entanglement; a difficulty in reading or estimating quantum program states in superposition; and a lack of precise test oracles. Thus, there is a need for novel, automated, and systematic methods for testing quantum programs. Quantum software testing is garnering increased attention, and several papers have recently been published, with significant contributions from European researchers. 1 , 6 , 9 , 10

Several research areas need to be explored, such as how to define and check (with relevant statistics, for instance) quantum test oracles without destroying superposition, and how to cost-effectively find test data that can break a quantum program. Given hardware noise in quantum computers, testing techniques must also be noise-aware. In general, we foresee the need to build theoretical foundations of quantum software testing, including coverage criteria, test models, and test strategies. Test strategies consist of test oracles, test data, and test cases, and can be designed by considering fault types, metamorphic testing to deal with test oracle issues, and mutation analysis. To maximize the benefit, all test techniques are expected to be independent of a quantum programming language. Furthermore, we need practical applications, extensive empirical evaluations of testing techniques, and the creation of benchmarks for the community. Several automated quantum software testing tools—for example, Quito, Muskit, QuSBT, QsharpCheck, and QuCAT—have recently been developed, with major contributions by European researchers.
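To make the idea of a statistical test oracle concrete, below is a minimal sketch (not one of the tools named above) that compares observed measurement counts against the output distribution a program's specification prescribes, using a chi-square goodness-of-fit test. The expected distribution, significance level, and example counts are illustrative assumptions.

```python
# Sketch of a statistical test oracle for a quantum program's output counts.
from scipy.stats import chisquare

def check_output_distribution(counts, expected_probs, shots, alpha=0.01):
    """Return True if observed counts are consistent with the specification."""
    if any(o not in expected_probs for o in counts):
        return False  # an outcome the specification forbids was observed
    outcomes = sorted(expected_probs)
    observed = [counts.get(o, 0) for o in outcomes]
    expected = [expected_probs[o] * shots for o in outcomes]
    _, p_value = chisquare(observed, f_exp=expected)
    return p_value >= alpha

# Example: a Bell state should yield '00' and '11' with probability 0.5 each.
ok = check_output_distribution({'00': 498, '11': 526},
                               {'00': 0.5, '11': 0.5}, shots=1024)
print("pass" if ok else "fail")
```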

Quantum software debugging. Observed failures in quantum programs—for instance, found with testing—need to be diagnosed with the debugging process to isolate and patch the code to fix the failure. This process typically comprises multiple tactics usually found in debuggers, such as relying on print statements in code to achieve interactive debugging. Similarly, we need quantum software debugging tactics, implemented in debuggers, to cost-effectively diagnose and resolve quantum software failures. However, the development of effective debugging techniques faces several challenges as discussed in a key debugging work: 4 an inability to directly monitor quantum software states in superposition; the understanding of quantum software states, when possible (for instance, in quantum computer simulators), can be unintuitive; and a lack of best practices, in general, to perform debugging.

Several research opportunities exist for debugging quantum programs: tailoring classical debugging tactics (backtracking and cause elimination, for example) to debug programs on quantum simulators and developing novel tactics to debug on real quantum computers; novel visualization approaches to inspect values without the need to measure quantum states, with intuitive visualizations comprehensible by humans; and novel ways to infer quantum software states using statistical 4 and projection-based assertions (for example, see Li et al. 5 ), in addition to developing novel assertion types.
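As a sketch of what a projection-style assertion might look like on a simulator (an illustration of the general idea only, not the assertion schemes of the cited works), one can check that a circuit assigns no probability to basis states the specification forbids. The helper name and tolerance below are assumptions, and the check only works where the full statevector is accessible, that is, on simulators rather than real hardware.

```python
# Sketch of a projection-style debugging assertion on a simulator.
from qiskit import QuantumCircuit
from qiskit.quantum_info import Statevector

def assert_no_amplitude(circuit, forbidden_states, tol=1e-9):
    """Fail if the circuit assigns probability to any forbidden basis state."""
    probs = Statevector.from_instruction(circuit).probabilities_dict()
    for basis in forbidden_states:
        assert probs.get(basis, 0.0) < tol, f"unexpected amplitude on |{basis}>"

bell = QuantumCircuit(2)
bell.h(0)
bell.cx(0, 1)
assert_no_amplitude(bell, ["01", "10"])  # a Bell state has no weight here
```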

Quantum computing is on the rise and will, no doubt, revolutionize many technologies. It will transform our understanding of and the way we deal with complex problems and challenges. Quantum software engineering is key to the systematic and cost-effective creation of tomorrow’s powerful, reliable, and practical QC applications.

Compared with classical computing, QC's inherent complexity and its complex application domains—drug discovery, for example—present new multidimensional challenges that emphasize the significance of QSE. Motivated by this observation, we presented in this article the key highlights of QC activities in Europe, key QSE innovations (when compared with classical software engineering), and open QSE research directions. This is the time to embrace QC and form the QSE community in Europe and globally.




Raghunathan receives Bement Award for groundbreaking AI research


WEST LAFAYETTE, Ind. — Anand Raghunathan, the Silicon Valley Professor of Electrical and Computer Engineering, has been chosen to receive Purdue University’s 2023 Arden L. Bement Jr. Award. The award is given annually to a university researcher who has made highly significant and impactful contributions in pure and applied sciences and engineering. Raghunathan is being recognized for his pioneering work in making AI systems more energy-efficient through specialized hardware architectures for AI workloads and the design paradigm of approximate computing.

Raghunathan’s achievement will be recognized on April 22, when Karen Plaut, Purdue’s executive vice president for research, will host the Excellence in Research Awards and Lectures event, 2-5 p.m. in the North Ballroom of Purdue Memorial Union. Raghunathan will present a lecture titled, “AI’s Energy Challenge and Four A’s to Address It.” Faculty, staff, students and the public are encouraged to attend. Event details and registration information can be found here . The award will be presented at an event in May.

A fellow of the Institute of Electrical and Electronics Engineers and the Association for Computing Machinery, Raghunathan has been a Purdue faculty member since 2008. He is a founding co-director of the Purdue-led Center for Secure Microelectronics Ecosystem and co-director of the Center for the Co-Design of Cognitive Systems, which is funded by the Semiconductor Research Corp. and the U.S. Defense Advanced Research Projects Agency.

“Anand was one of the very first researchers to realize that machine learning and data analytics would drive the future of computing platforms and their underlying hardware fabrics. His group created some of the first hardware accelerators for AI workloads, in the process recognizing the need for new design paradigms to create such hardware,” said Kaushik Roy, the Edward G. Tiedemann, Jr. Distinguished Professor of Electrical and Computer Engineering, in nominating Raghunathan. “This led him to his pioneering work in approximate computing, which has deeply influenced subsequent efforts in academia and industry and has been recognized with best paper and test-of-time awards.”

“I am deeply honored to be chosen to receive this award,” Raghunathan said. “I am indebted to all my mentors, collaborators and students over the years, and to Purdue for providing an amazing environment in which to pursue my research.”

Looking ahead to future challenges, Raghunathan has clearly defined his priorities: “Artificial intelligence has fundamentally altered the trajectory of demand for computing. Our ability to address the AI compute efficiency challenge will shape the future of AI and many other fields. I hope to tackle this challenge through my work.”

“Professor Raghunathan’s pioneering research in utilizing approximate computing techniques for improving the efficiency of AI hardware — specifically, trained quantization of deep neural networks — has been foundational to enabling transformative advances in AI systems in recent years across the industry,” said Vivek De, fellow and director of circuit technology research at Intel.

Among many accolades, Raghunathan was cited as one of the world’s top 35 innovators under the age of 35 by MIT Technology Review magazine. He has received nine best paper awards, a ten-year retrospective most influential paper award and a best design contest award at premier conferences in his field. Before joining Purdue, he received a Patent of the Year Award and two Technology Commercialization Awards from NEC Corp. for his work that shaped multiple generations of semiconductor products. At Purdue, he has received the College of Engineering Faculty Excellence Award for Research, the Qualcomm Faculty Award and the IBM Faculty Award.

The Arden L. Bement Jr. Award was established in 2015 by Distinguished Professor Emeritus Arden Bement and his wife, Louise Bement, to annually recognize a Purdue faculty member for recent outstanding accomplishments in pure and applied sciences and engineering. Winners of the award are nominated by colleagues, recommended by a faculty committee and approved by the executive vice president for research and the university president.


Microsoft Research Blog

Learning from Interaction with Microsoft Copilot (Web)

Published March 27, 2024

By Scott Counts, Senior Principal Research Manager; Jennifer Neville, Partner Research Manager; Mengting Wan, Senior Applied Research Scientist; Ryen W. White, General Manager and Deputy Lab Director; Longqi Yang, Principal Applied Research Manager



AI systems like Bing and Microsoft Copilot (web) are as good as they are because they continuously learn and improve from people’s interactions. Since the early 2000s, user clicks on search result pages have fueled the continuous improvements of search engines. Recently, reinforcement learning from human feedback (RLHF) brought step-function improvements to response quality of generative AI models. Bing has a rich history of success in improving its AI offerings by learning from user interactions. For example, Bing pioneered the idea of improving search ranking and personalizing search using short- and long-term user behavior data.

With the introduction of Microsoft Copilot (web), the way that people interact with AI systems has fundamentally changed from searching to conversing and from simple actions to complex workflows. Today, we are excited to share three technical reports on how we are starting to leverage new types of user interactions to understand and improve Copilot (web) for our consumer customers. [1]

How are people using Copilot (web)?

One of the first questions we asked about user interactions with Copilot (web) was, “How are people using Copilot (web)?” Generative AI can perform many tasks that were not possible in the past, and it’s important to understand people’s expectations and needs so that we can continuously improve Copilot (web) in the ways that will help users the most.


A key challenge of understanding user tasks at scale is to transform unstructured interaction data (e.g., Copilot logs) into a meaningful task taxonomy. Existing methods heavily rely on manual effort, which is not scalable in novel and under-specified domains like generative AI. To address this challenge, we introduce TnT-LLM (Taxonomy Generation and Text Prediction with LLMs), a two-phase LLM-powered framework that generates and predicts task labels end-to-end with minimal human involvement (Figure 1).

Figure 1. Three text-mining frameworks compared: a labor-intensive human-in-the-loop approach (highly interpretable but not very scalable), conventional unsupervised text clustering (highly scalable but less interpretable), and TnT-LLM, which uses LLMs for both taxonomy generation and labeling and scores well on both dimensions.
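As a rough sketch of the two-phase idea (not Microsoft's implementation), the pipeline can be thought of as one prompt that proposes a taxonomy from sampled conversation summaries and a second prompt that applies that taxonomy as a classifier. The `call_llm` helper below is a hypothetical stand-in for whatever chat-completion API is available, and the prompt wording and label budget are illustrative assumptions.

```python
# Rough sketch of a TnT-LLM-style two-phase pipeline (illustrative only).
def call_llm(prompt: str) -> str:
    """Hypothetical helper: send `prompt` to any chat-completion API."""
    raise NotImplementedError("wire this to your LLM provider of choice")

def generate_taxonomy(conversation_summaries, max_labels=15):
    """Phase 1: ask the LLM to propose a label taxonomy from sampled data."""
    prompt = (
        "You will be given summaries of user-AI conversations.\n"
        f"Propose at most {max_labels} task labels, each with a one-line "
        "definition, that together cover these conversations.\n\n"
        + "\n".join(f"- {s}" for s in conversation_summaries)
    )
    return call_llm(prompt)

def classify(conversation_summary, taxonomy):
    """Phase 2: use the generated taxonomy as a rubric to label new data."""
    prompt = (
        f"Taxonomy:\n{taxonomy}\n\n"
        f"Conversation summary:\n{conversation_summary}\n\n"
        "Answer with the single best-matching label from the taxonomy."
    )
    return call_llm(prompt)
```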

We conducted extensive human evaluation to understand how TnT-LLM performs. In discovering user intent and domain from Copilot (web) conversations, taxonomies generated by TnT-LLM are significantly more accurate than existing baselines (Figure 2).

Figure 2. Accuracy of generated taxonomies for user intent and domain, as evaluated by GPT-4. GPT-4 with TnT-LLM outperforms GPT-3.5-turbo with TnT-LLM and the embedding-based baselines (ada2 or Instructor-XL paired with GPT-4 or GPT-3.5-turbo).

We applied TnT-LLM to a large sample of fully de-identified Copilot (web) conversations and traditional Bing Search sessions. The results (Figure 3) suggest that people use Copilot (web) for knowledge work tasks in domains such as writing and editing, data analysis, programming, science, and business. Further, tasks done in Copilot (web) are generally more complex and more knowledge work-oriented than tasks done in traditional search engines. Generative AI’s emerging capabilities have evolved the tasks that machines can perform, to include some that humans have traditionally had to do without assistance. Results demonstrate that people are doing more complex tasks, frequently in the context of knowledge work, and show that this type of work is being newly assisted by Copilot (web).

Figure 3. Per-domain comparison of Copilot (web) conversations and Bing Search sessions by the share classified as complex and the share classified as knowledge work. Most search sessions are low on both dimensions, while many Copilot domains are high on both.

Estimating and interpreting user satisfaction

To effectively learn from user interactions, it is equally important to classify user satisfaction and to understand why people are satisfied or dissatisfied while trying to complete a given task. Most important, this will allow system developers to identify areas of improvement and to amplify and suggest successful use cases for broader groups of users.

People give explicit and implicit feedback when interacting with AI systems. In the past, user feedback was in the form of clicks, ratings, or survey verbatims. When it comes to conversational systems like Copilot (web), people also give feedback in the messages they send during the conversations (Figure 4).

Figure 4. The continuous improvement loop: an example conversation in which the user is dissatisfied with the AI assistant's response is fed into an improvement process, which produces better responses for the same conversation, leaving the user satisfied.

To capture this new category of feedback signals, we propose our Supervised Prompting for User Satisfaction Rubrics (SPUR) framework (Figure 5). It’s a three-phase prompting framework for estimating user satisfaction with LLMs:

  • The supervised extraction prompt extracts diverse in situ textual feedback from users interacting with Copilot (web).
  • The summarization rubric prompt identifies prominent textual feedback patterns and summarizes them into rubrics for estimating user satisfaction.
  • Based on the summarized rubrics, the final scoring prompt takes a conversation between a user and the AI agent and rates how satisfied the user was.

Figure 5. The SPUR framework: an LLM first explains user satisfaction or dissatisfaction from user utterances, then summarizes the reasons into SAT and DSAT rubrics, and finally applies those rubrics to rate how satisfied a user was with the AI agent's responses.
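A rough sketch of those three phases as a chain of prompts is shown below. It reuses the hypothetical `call_llm` helper from the earlier sketch, and the prompt wording, rubric size, and rating scale are illustrative assumptions rather than the published prompts.

```python
# Rough sketch of the three SPUR phases as a chain of prompts (illustrative).
def extract_feedback(conversation: str) -> str:
    """Phase 1: pull out in situ satisfaction/dissatisfaction signals."""
    return call_llm(
        "Quote every place in this conversation where the user signals "
        "satisfaction or dissatisfaction with the assistant:\n\n" + conversation
    )

def summarize_rubrics(feedback_samples: list) -> str:
    """Phase 2: distill recurring patterns into SAT and DSAT rubrics."""
    return call_llm(
        "From the following feedback observations, summarize ten recurring "
        "satisfaction (SAT) patterns and ten dissatisfaction (DSAT) patterns "
        "as a scoring rubric:\n\n" + "\n---\n".join(feedback_samples)
    )

def score_conversation(conversation: str, rubric: str) -> str:
    """Phase 3: apply the rubric to rate user satisfaction."""
    return call_llm(
        f"Rubric:\n{rubric}\n\nConversation:\n{conversation}\n\n"
        "Using the rubric, rate how satisfied the user was on a 1-5 scale "
        "and briefly justify the rating."
    )
```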

We evaluated our framework on fully de-identified conversations with explicit user thumbs up/down in Copilot (web) (Table 1). We find that SPUR outperforms other LLM-based and embedding-based methods, especially when only limited human annotations of user satisfaction are available. Open-source reward models used for RLHF cannot serve as a proxy for user satisfaction, because reward models are usually trained with auxiliary human feedback that may differ from the feedback from the user who was involved in the conversation with the AI agent.

Another critical feature of SPUR is its interpretability. It shows how people express satisfaction or dissatisfaction (Figure 6). For example, we see that users often give explicit positive feedback by clearly praising the response from Copilot (web). Conversely, they express explicit frustration or switch topics when encountering mistakes in the response from Copilot (web). This presents opportunities for providing a customized user experience at critical moments of user satisfaction and dissatisfaction, such as context and memory reset after switching topics.

Figure 6. Distributions of the ten-item SAT rubric (left) and the ten-item DSAT rubric (right) in Copilot (web), showing how often each summarized satisfaction or dissatisfaction pattern occurs.

In the user task classification discussed earlier, we know that people are using Copilot (web) for knowledge work and more complex tasks. As we further apply SPUR for user satisfaction estimation, we find that people are also more satisfied when they complete or partially complete cognitively complex tasks. Specifically, when regressing the SPUR-derived summary user-satisfaction score on task complexity, we find generally increasing coefficients for increasing levels of task complexity, relative to the lowest level (Remember) as a baseline, provided the task was at least partially completed (see Table 2). For instance, partially completing a Create-level task, which is the highest level of task complexity, leads to an increase in user satisfaction that is more than double the increase from partially completing an Understand-level task. Fully completing a Create-level task leads to the largest increase in user satisfaction.

Table 2. Regression of user satisfaction on task complexity, task completion, and the number of user messages, with interaction terms between task completion and the other two predictors. The coefficients indicate that satisfaction increases when users complete more complex tasks.

These three reports present a comprehensive and multi-faceted approach to dynamically learning from conversation logs in Copilot (web) at scale. As AI’s generative capabilities increase, users are finding new ways to use the system to help them do more and shift from traditional click reactions to more nuanced, continuous dialogue-oriented feedback. To navigate this evolving user-AI interaction landscape, it is crucial to shift from established task frameworks and relevance evaluations to a more dynamic, bottom-up approach to task identification and user satisfaction evaluation.

Key Contributors

Reid Andersen, Georg Buscher, Scott Counts, Deepak Gupta, Brent Hecht, Dhruv Joshi, Sujay Kumar Jauhar, Ying-Chun Lin, Sathish Manivannan, Jennifer Neville, Nagu Rangan, Chirag Shah, Dolly Sobhani, Siddharth Suri, Tara Safavi, Jaime Teevan, Saurabh Tiwary, Mengting Wan, Ryen W. White, Xia Song, Jack W. Stokes, Xiaofeng Xu, and Longqi Yang.

[1] The research was performed only on fully de-identified interaction data from Copilot (web) consumers. No enterprise data was used per our commitment to enterprise customers. We have taken careful steps to protect user privacy and adhere to strict ethical and responsible AI standards. All personal, private or sensitive information was scrubbed and masked before conversations were used for the research. The access to the dataset is strictly limited to approved researchers. The study was reviewed and approved by our institutional review board (IRB).

Related publications

  • Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models
  • TnT-LLM: Text Mining at Scale with Large Language Models
  • The Use of Generative Search Engines for Knowledge Work and Complex Tasks
  • Using Large Language Models to Generate, Validate, and Apply User Intent Taxonomies




NYU Tandon @ The Yard

An Emerging Media Playground

Be a part of NYU's newest motion and volumetric capture studio and help transform NYC into the emerging media capital of the world.


NYU Tandon @ The Yard leverages the Emerging Media expertise of NYU Tandon School of Engineering with industry and media partners across the country to create a new media hub and must-visit facility within the Brooklyn Navy Yard . Our goals are to advance integrative research in AR/VR/XR, virtual production, and other topics relevant to experiential computing; provide our industry partners with first-hand contact with both NYU faculty and students and cultural and media organizations in New York City and beyond; and bring research-grade emerging technology within reach of media, entertainment, and cultural sectors of the city, allowing companies, museums and other arts organizations, and independent creatives to have an equitable platform in which to engage in new tools and techniques.

NYU Tandon @ The Yard is a 14,000 square foot production and research facility, consisting of two production studios as well as other systems for working with emerging media technology.

Motion Capture Studio


Our motion capture stage is the second largest in New York City: 40'x35' (30' tall), equipped with a 24-camera OptiTrack Prime 13 motion capture system along with facial and hand tracking equipment. The stage has a high-definition projection system as well as a large virtual production “kit” consisting of a 30-foot LED wall, camera equipment, and other equipment developed in partnership with Final Pixel. The stage has a control booth, green room, and other amenities to support professional capture.

Volumetric Capture Studio


Our volumetric capture stage consists of an 8’ diameter capture area for a 21-camera Evercoast system, supporting both streaming and cloud-based rendering of volumetric capture data up to 30fps.

Post-production Facilities


The facility supports the cleaning and post-production of motion capture and volumetric data, as well as production integration into the Unreal Engine. 

Other Equipment

NYU Tandon @ The Yard also has equipment to support development and research in emerging media, including:

  • a robotic motion base for prototyping cyber-physical interactions; 
  • a full suite of current XR equipment for virtual and augmented reality, including social VR; 
  • a Dolby Atmos system for working with spatial audio; 
  • a “Belfast-method” style broadcast system for working with real-time streaming technologies; 
  • desk production space for ~60 MS/PhD students, as well as meeting and touchdown space for faculty, staff, and industry partners.

NYU Tandon @ The Yard hosts both graduate-level studio courses in emerging media as part of the MS degree in Integrated Design & Media (IDM) and Professional Education programs, including our bundle of Virtual Production courses, developed with support from Epic Games.

NYU Tandon @ The Yard hosts regular events, meetups, and workshops open to the public. Drop us a line and we’ll put you on our mailing list.

Production and event rentals

NYU Tandon @ The Yard is available for production rentals and consultations with NYU Tandon students working as production technicians and consultants. We are also available to host meetups, corporate offsites, and other events. Discounted rates are available for non-profit organizations to use our facilities. If you’re working on a production and think we can help, please get in touch.

NYU Tandon @ The Yard is the only facility of its kind on the East Coast that can represent the human form in Extended Reality (XR) in so many ways (volumetric, mocap, live performance). Unlike most emerging media facilities at universities, we prototype across all industries, not just film. We are the only emerging media facility at this scale housed within an R1 engineering school, with the ability to engage meaningfully with sponsored research as well as industry.

NYU Tandon @ The Yard is actively interested in working with industry and cultural partners working in emerging technologies. We’d love to hear from you and see how we can work together. Get in touch: [email protected]

Partnerships

Studio Partners

209 Group

Client Partners

American Ballet Theater

Building Meta’s GenAI Infrastructure


  • Marking a major investment in Meta’s AI future, we are announcing two 24k GPU clusters. We are sharing details on the hardware, network, storage, design, performance, and software that help us extract high throughput and reliability for various AI workloads. We use this cluster design for Llama 3 training.
  • We are strongly committed to open compute and open source. We built these clusters on top of Grand Teton , OpenRack , and PyTorch and continue to push open innovation across the industry.
  • This announcement is one step in our ambitious infrastructure roadmap. By the end of 2024, we’re aiming to continue to grow our infrastructure build-out that will include 350,000 NVIDIA H100 GPUs as part of a portfolio that will feature compute power equivalent to nearly 600,000 H100s.

To lead in developing AI means leading investments in hardware infrastructure. Hardware infrastructure plays an important role in AI’s future. Today, we’re sharing details on two versions of our 24,576-GPU data center scale cluster at Meta. These clusters support our current and next generation AI models, including Llama 3, the successor to Llama 2 , our publicly released LLM, as well as AI research and development across GenAI and other areas .

A peek into Meta’s large-scale AI clusters

Meta’s long-term vision is to build artificial general intelligence (AGI) that is open and built responsibly so that it can be widely available for everyone to benefit from. As we work towards AGI, we have also worked on scaling our clusters to power this ambition. The progress we make towards AGI creates new products, new AI features for our family of apps , and new AI-centric computing devices. 

While we’ve had a long history of building AI infrastructure, we first shared details on our AI Research SuperCluster (RSC) , featuring 16,000 NVIDIA A100 GPUs, in 2022. RSC has accelerated our open and responsible AI research by helping us build our first generation of advanced AI models. It played and continues to play an important role in the development of Llama and Llama 2 , as well as advanced AI models for applications ranging from computer vision, NLP, and speech recognition, to image generation , and even coding .


Under the hood

Our newer AI clusters build upon the successes and lessons learned from RSC. We focused on building end-to-end AI systems with a major emphasis on researcher and developer experience and productivity. The efficiency of the high-performance network fabrics within these clusters, some of the key storage decisions, combined with the 24,576 NVIDIA Tensor Core H100 GPUs in each, allow both cluster versions to support models larger and more complex than could be supported in RSC and pave the way for advancements in GenAI product development and AI research.

At Meta, we handle hundreds of trillions of AI model executions per day. Delivering these services at a large scale requires a highly advanced and flexible infrastructure. Custom designing much of our own hardware, software, and network fabrics allows us to optimize the end-to-end experience for our AI researchers while ensuring our data centers operate efficiently. 

With this in mind, we built one cluster with a remote direct memory access (RDMA) over converged Ethernet (RoCE) network fabric solution based on the Arista 7800 with Wedge400 and Minipack2 OCP rack switches. The other cluster features an NVIDIA Quantum2 InfiniBand fabric. Both of these solutions interconnect 400 Gbps endpoints. With these two, we are able to assess the suitability and scalability of these different types of interconnect for large-scale training, giving us more insights that will help inform how we design and build even larger, scaled-up clusters in the future. Through careful co-design of the network, software, and model architectures, we have successfully used both RoCE and InfiniBand clusters for large, GenAI workloads (including our ongoing training of Llama 3 on our RoCE cluster) without any network bottlenecks.

Both clusters are built using Grand Teton , our in-house-designed, open GPU hardware platform that we’ve contributed to the Open Compute Project (OCP). Grand Teton builds on the many generations of AI systems that integrate power, control, compute, and fabric interfaces into a single chassis for better overall performance, signal integrity, and thermal performance. It provides rapid scalability and flexibility in a simplified design, allowing it to be quickly deployed into data center fleets and easily maintained and scaled. Combined with other in-house innovations like our Open Rack power and rack architecture, Grand Teton allows us to build new clusters in a way that is purpose-built for current and future applications at Meta.

We have been openly designing our GPU hardware platforms beginning with our Big Sur platform in 2015 .

Storage plays an important role in AI training, and yet is one of the least talked-about aspects. As the GenAI training jobs become more multimodal over time, consuming large amounts of image, video, and text data, the need for data storage grows rapidly. The need to fit all that data storage into a performant, yet power-efficient footprint doesn’t go away though, which makes the problem more interesting.

Our storage deployment addresses the data and checkpointing needs of the AI clusters via a home-grown Linux Filesystem in Userspace (FUSE) API backed by a version of Meta’s ‘Tectonic’ distributed storage solution optimized for Flash media. This solution enables thousands of GPUs to save and load checkpoints in a synchronized fashion (a challenge for any storage solution) while also providing a flexible and high-throughput exabyte scale storage required for data loading.
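To illustrate what rank-synchronized checkpointing means in practice, here is a minimal PyTorch sketch (not Meta's Tectonic/FUSE deployment). The mount path and file naming are illustrative assumptions, and real large-scale jobs would typically shard state and overlap I/O rather than write one file per rank like this.

```python
# Minimal sketch of synchronized per-rank checkpointing with PyTorch distributed.
import torch
import torch.distributed as dist

def save_checkpoint(model, optimizer, step, root="/mnt/checkpoints"):
    rank = dist.get_rank()
    state = {
        "model": model.state_dict(),
        "optim": optimizer.state_dict(),
        "step": step,
    }
    torch.save(state, f"{root}/step{step:08d}_rank{rank:05d}.pt")
    dist.barrier()  # every rank finishes writing before training resumes
```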

We have also partnered with Hammerspace to co-develop and land a parallel network file system (NFS) deployment to meet the developer experience requirements for this AI cluster. Among other benefits, Hammerspace enables engineers to perform interactive debugging for jobs using thousands of GPUs, as code changes are immediately accessible to all nodes within the environment. When paired together, the combination of our Tectonic distributed storage solution and Hammerspace enables fast iteration velocity without compromising on scale.

The storage deployments in our GenAI clusters, both Tectonic- and Hammerspace-backed, are based on the YV3 Sierra Point server platform, upgraded with the latest high-capacity E1.S SSD we can procure in the market today. Aside from the higher SSD capacity, the number of servers per rack was customized to achieve the right balance of throughput capacity per server, rack count reduction, and associated power efficiency. Utilizing the OCP servers as Lego-like building blocks, our storage layer is able to flexibly scale to future requirements in this cluster as well as in future, bigger AI clusters, while being fault-tolerant to day-to-day infrastructure maintenance operations.

Performance

One of the principles we have in building our large-scale AI clusters is to maximize performance and ease of use simultaneously without compromising one for the other. This is an important principle in creating the best-in-class AI models. 

As we push the limits of AI systems, the best way we can test our ability to scale-up our designs is to simply build a system, optimize it, and actually test it (while simulators help, they only go so far). In this design journey, we compared the performance seen in our small clusters and with large clusters to see where our bottlenecks are. In the graph below, AllGather collective performance is shown (as normalized bandwidth on a 0-100 scale) when a large number of GPUs are communicating with each other at message sizes where roofline performance is expected. 

Our out-of-box performance for large clusters was initially poor and inconsistent, compared to optimized small-cluster performance. To address this, we made several changes to how our internal job scheduler schedules jobs with network topology awareness – this resulted in latency benefits and minimized the amount of traffic going to upper layers of the network. We also optimized our network routing strategy in combination with NVIDIA Collective Communications Library (NCCL) changes to achieve optimal network utilization. These changes pushed our large clusters to achieve the same great, expected performance as our small clusters.
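For readers who want to reproduce this kind of measurement at small scale, below is a minimal sketch of timing all_gather bandwidth with PyTorch distributed over NCCL. It assumes one process per GPU launched with torchrun, and the message size, iteration count, and bandwidth formula are illustrative rather than Meta's benchmark.

```python
# Minimal all_gather bandwidth sketch (launch with: torchrun --nproc_per_node=N this.py)
import os
import time
import torch
import torch.distributed as dist

dist.init_process_group("nccl")
torch.cuda.set_device(int(os.environ.get("LOCAL_RANK", 0)))

world_size = dist.get_world_size()
numel = 64 * 1024 * 1024                       # 256 MiB of fp32 per rank
local = torch.randn(numel, device="cuda")
gathered = [torch.empty_like(local) for _ in range(world_size)]

dist.barrier()
torch.cuda.synchronize()
start = time.time()
iters = 10
for _ in range(iters):
    dist.all_gather(gathered, local)
torch.cuda.synchronize()
elapsed = (time.time() - start) / iters

if dist.get_rank() == 0:
    moved_gb = numel * 4 * (world_size - 1) / 1e9   # rough per-rank traffic
    print(f"all_gather: ~{moved_gb / elapsed:.1f} GB/s per rank")
dist.destroy_process_group()
```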


In addition to software changes targeting our internal infrastructure, we worked closely with teams authoring training frameworks and models to adapt to our evolving infrastructure. For example, NVIDIA H100 GPUs open the possibility of leveraging new data types such as 8-bit floating point (FP8) for training. Fully utilizing larger clusters required investments in additional parallelization techniques and new storage solutions provided opportunities to highly optimize checkpointing across thousands of ranks to run in hundreds of milliseconds.

We also recognize debuggability as one of the major challenges in large-scale training. Identifying a problematic GPU that is stalling an entire training job becomes very difficult at a large scale. We’re building tools such as desync debug, or a distributed collective flight recorder, to expose the details of distributed training and help identify issues in a much faster and easier way.

Finally, we’re continuing to evolve PyTorch, the foundational AI framework powering our AI workloads, to make it ready for tens, or even hundreds, of thousands of GPU training. We have identified multiple bottlenecks for process group initialization, and reduced the startup time from sometimes hours down to minutes. 

Commitment to open AI innovation

Meta maintains its commitment to open innovation in AI software and hardware. We believe open-source hardware and software will always be a valuable tool to help the industry solve problems at large scale.

Today, we continue to support open hardware innovation as a founding member of OCP, where we make designs like Grand Teton and Open Rack available to the OCP community. We also continue to be the largest and primary contributor to PyTorch , the AI software framework that is powering a large chunk of the industry.

We also continue to be committed to open innovation in the AI research community. We’ve launched the Open Innovation AI Research Community , a partnership program for academic researchers to deepen our understanding of how to responsibly develop and share AI technologies – with a particular focus on LLMs.

An open approach to AI is not new for Meta. We’ve also launched the AI Alliance , a group of leading organizations across the AI industry focused on accelerating responsible innovation in AI within an open community. Our AI efforts are built on a philosophy of open science and cross-collaboration. An open ecosystem brings transparency, scrutiny, and trust to AI development and leads to innovations that everyone can benefit from that are built with safety and responsibility top of mind. 

The future of Meta’s AI infrastructure

These two AI training cluster designs are a part of our larger roadmap for the future of AI. By the end of 2024, we’re aiming to continue to grow our infrastructure build-out that will include 350,000 NVIDIA H100s as part of a portfolio that will feature compute power equivalent to nearly 600,000 H100s.

As we look to the future, we recognize that what worked yesterday or today may not be sufficient for tomorrow’s needs. That’s why we are constantly evaluating and improving every aspect of our infrastructure, from the physical and virtual layers to the software layer and beyond. Our goal is to create systems that are flexible and reliable to support the fast-evolving new models and research. 


Empirical Research Methods in Web and Software Engineering


  • Claes Wohlin 3 ,
  • Martin Höst 4 &
  • Kennet Henningsson 3  


Web and software engineering are not only about technical solutions. They are to a large extent also concerned with organisational issues, project management and human behaviour. For disciplines like Web and software engineering, empirical methods are crucial, since they allow for incorporating human behaviour into the research approach taken. Empirical methods are common practice in many other disciplines. This chapter provides a motivation for the use of empirical methods in Web and software engineering research. The main motivation is that it is needed from an engineering perspective to allow for informed and well-grounded decisions. The chapter continues with a brief introduction to four research methods: controlled experiments, case studies, surveys and post-mortem analyses. These methods are then put into an improvement context. The four methods are presented with the objective to introduce the reader to the methods to a level where it is possible to select the most suitable method at a specific instance. The methods have in common that they all are concerned with quantitative data. However, several of them are also suitable for qualitative data. Finally, it is concluded that the methods are not competing. On the contrary, the different research methods can preferably be used together to obtain more sources of information that hopefully lead to more informed engineering decisions in Web and software engineering.

  • Controlled experiment
  • Post-mortem analysis
  • Empirical investigation
  • Engineering discipline

A previous version of this chapter has been published in Empirical Methods and Studies in Software Engineering: Experiences from ESERNET, pp 7–23, editors Reidar Conradi and Alf Inge Wang, Lecture Notes in Computer Science 2765, Springer-Verlag, Germany, 2003. This chapter has been adapted by Emilia Mendes.



Author information

Authors and affiliations.

Dept. of Systems and Software Engineering, School of Engineering, Blekinge Institute of Technology, Box 520, SE-372 25, Ronneby, Sweden

Claes Wohlin ( Professor ) & Kennet Henningsson ( Dr. )

Dept. of Communication Systems, Lund Institute of Technology, Lund University, Box 118, SE-221 00, Lund, Sweden

Martin Höst ( Dr. )


Editor information

Editors and affiliations.

Computer Science Department, University of Auckland, Private Bag 92019, Auckland, New Zealand

Emilia Mendes

MetriQ (NZ) Ltd., 19A Clairville Crescent, Wai-O-Taiki Bay, Auckland, New Zealand

Nile Mosley


Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this chapter

Wohlin, C., Höst, M., Henningsson, K. (2006). Empirical Research Methods in Web and Software Engineering. In: Mendes, E., Mosley, N. (eds) Web Engineering. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-28218-1_13

Download citation

DOI : https://doi.org/10.1007/3-540-28218-1_13

Publisher Name : Springer, Berlin, Heidelberg

Print ISBN : 978-3-540-28196-2

Online ISBN : 978-3-540-28218-1



University of Rhode Island


College of Engineering


E-Week 2024! Join Us!


An end-of-year event to celebrate and showcase what engineering at URI is all about!

Most events take place in The Fascitelli Center for Advanced Engineering (FCAE) located on the URI Kingston Campus. Visitors should park at the Welcome Center (GPS address: 75 Briar Lane, Kingston, RI 02881) and register their vehicle by selecting “Campus Tour Parking Pass Lot”. There is no charge for parking.

RSVP: Join us for eWeek and be a part of an inspiring celebration of innovation, collaboration, and excellence in engineering! RSVP here.

If you have any questions, please contact Associate Dean Gindy at [email protected] .


Monday, April 15, 2024

IEP Social Hour | 12:30-1:30 PM | Engineering Quad (rain date Wednesday 4/17) International Engineering Program is hosting a social hour for all IEP students. 

IEP Block Party | 1:30-3:30 PM | Engineering Quad (rain date Wednesday 4/17) The International Engineering Program (IEP) is hosting this fun event to kick off eWeek for all engineering students, faculty and staff.  Stop by for lawn games, DIY snack packs to grab & go (while supplies last), and maybe even play with the URI therapy dog.  There will also be a raffle of IEP T-shirts!

Endowed Professor Lecture by Dr. Arun Shukla | 4:00-5:00 PM | FCAE 010/025

Research Showcase | 5:00-7:00 PM | Toray Commons, FCAE Curious about the groundbreaking research being conducted by engineering students at the College of Engineering?  At the Research Showcase, you’ll have the opportunity to engage with poster presentations and dynamic 3-minute talks delivered by a variety of students. From first-year undergraduates supported by the URISE program (Undergraduate Research in Science and Engineering) to doctoral candidates pushing the boundaries of innovation, our students will showcase their passion and expertise across a multitude of disciplines.  Don’t miss the chance to network during the reception, where light refreshments will be served at 5 pm.  Come and be inspired by the next generation of engineering leaders!

Tuesday, April 16th, 2024

Student Organizations Recognition Night | 5:30-7:30 PM | Toray Commons, FCAE Join us for an evening dedicated to celebrating the outstanding achievements and contributions of our vibrant engineering student organizations! At the Engineering Student Organizations Recognition Night, we shine a spotlight on the remarkable efforts and impactful activities carried out by our diverse range of student groups.

Wednesday, April 17, 2024

Evening with Industry | 5:30-7:30 PM | Toray Commons, FCAE Get ready for an unforgettable opportunity to connect with industry professionals and alumni!  At Evening with Industry, students will have the chance to engage in meaningful conversations, gain firsthand knowledge about different career paths, and expand their professional network.

Thursday, April 18, 2024

CCRI Meet-and-Greet | 10:00-10:30 AM | Bliss Hall 410 Join us for a special Meet-and-Greet event aimed at fostering connections with CCRI students considering a transfer to our engineering programs. This is a great opportunity for CCRI students to learn more about our programs and experience the vibrant engineering community at URI.

College of Engineering Picnic | 11:30 AM-1:00 PM | Engineering Quad Get ready to soak up the sun (fingers crossed!) and enjoy a time filled with fun, food, and festivities at our Annual College Picnic on the Quad! This cherished tradition brings together students, faculty, staff, alumni, and friends of the College for a memorable outdoor gathering that celebrates the spirit of our engineering community. Please RSVP to ensure we have a proper headcount for ordering food. In case of rain, the picnic will be held in the Toray Commons.

RISICA Lecture and Reception | 5:00-7:00 PM | FCAE 010/025 We are honored to welcome back Chemical Engineering alum (’89) and National Academy of Engineering member (’23) Dave Parrillo to give the Annual RISICA Lecture on Leading the Materials Industry through Disruption. Dave Parrillo is currently the Vice President of Research & Development (R&D) for Dow’s Packaging & Specialty Plastics division. This $28 billion revenue division of Dow is key to Dow’s decarbonize-and-grow strategy and includes Dow’s Hydrocarbons & Energy business. In 2023, Dave was inducted into the highly prestigious National Academy of Engineering for the development and commercialization of innovative new processes and products for consumer and industrial applications.

Abstract: Materials and chemicals impact daily life: they enable modern healthcare, the preservation of food, and the performance and efficiency of transportation and infrastructure. The need for sustainable solutions, including transforming waste into valuable resources and reducing greenhouse gas emissions, is demanding new innovations and collaborations across industry partners. This talk will detail the technology breakthroughs and cross-industry partnerships needed to transition an industry to one of low-to-zero carbon and circular solutions that fight climate change towards a sustainable future.

Friday, April 19, 2024

College of Engineering Advisory Council Meeting | 8:00 AM-4:00 PM | Bliss Hall 410 (Not Open to Public)

Capstone Design Showcase | 1:00-3:30 PM* | Toray Commons, FCAE Prepare to be inspired as we showcase the culmination of a year-long experience of creativity, collaboration, and innovation at our Capstone Design Showcase! This event brings together graduating engineering seniors to present their design projects in a dynamic and engaging exhibition. Capstone design is the pinnacle of the undergraduate engineering experience, where students demonstrate their mastery of technical skills, problem-solving abilities, and real-world application of engineering principles. From sustainable energy solutions to robotics, the projects on display represent a diverse range of disciplines and address pressing challenges faced by society today. *Mechanical Engineering will start at 12 PM, Biomedical Engineering is from 12:30-4:30 PM, and all other programs will start at 1:00 PM.

IMAGES

  1. Web engineering is multidisciplinary

  2. (PDF) A systematic review of Web engineering research

  3. Web engineering is multidisciplinary

  4. (PDF) Web engineering: An introduction

  5. (PDF) Web Engineering: A New Discipline for Development of Web-Based

  6. Web Engineering

VIDEO

  1. Coding a Web Server in 25 Lines

  2. web Engineering in Software Engineering(Tamil)

  3. CSS

  4. UW CSE 481m demos from Spring 11

  5. Introduction to Basic Web Engineering (Part-I)

  6. Engineering Coffee Break: Front End Engineering & Design

COMMENTS

  1. JOURNAL OF WEB ENGINEERING Home

    Implementation and evaluation of a resource-based learning recommender based on learning style and web page features. Mohammad Tahmasebi. Department of Computer Engineering, Faculty of Engineering, University of Qom and Yazd University, Faranak Fotouhi Ghazvini. Department of Computer Engineering and IT, Faculty of Engineering, University of Qom,

  2. Journal of Web Engineering

    Model-Driven Web Engineering approaches have become an attractive research and technology solution for Web application development. However, after more than 20 years of development, the industry has not adopted them due to the mismatch between technical ...

  3. Web engineering: an introduction

    The emerging field of Web engineering fulfils these needs. It uses scientific, engineering, and management principles and systematic approaches to successfully develop, deploy, and maintain high-quality Web systems and applications. It aims to bring the current chaos in Web based system development under control, minimize risks, and enhance Web ...

  4. 15930 PDFs

    Web Engineering - Science topic. Explore the latest full-text research PDFs, articles, conference papers, preprints and more on WEB ENGINEERING. Find methods information, sources, references or ...

  5. The Need for Web Engineering: An Introduction

    The objective of this chapter is three-fold. First, it provides an overview of differences between Web and software development with respect to their development processes, technologies, quality factors, and measures. Second, it provides definitions for terms used throughout the book. Third, it discusses the need for empirical investigations in ...

  6. Web Engineering

    She is the principal investigator in the Tukutuku Research project, which aims to collect data about Web projects and use it to develop Web cost estimation models and to benchmark productivity across and within Web Companies. She is the director of the WETA (Web Engineering, Technology and Applications) research group.

  7. A systematic review of Web engineering research

    This paper uses a systematic literature review as a means of investigating the rigor of claims arising from Web engineering research. Rigor is measured using criteria combined from software engineering research. We reviewed 173 papers and the results have shown that only 5% would be considered methodologically rigorous. In addition to presenting our results, we also provide suggestions for ...

  8. WEB ENGINEERING

    Web Engineering is the application of systematic, disciplined and quantifiable approaches to development, operation, and maintenance of Web-based applications. It is both a pro-active approach and a growing collection of theoretical and empirical research in Web application development.

  9. Web Engineering: Introduction and Perspectives

    The first two papers originate from the early exposure of the authors to Web developmental activities, which helped to define the field of Web Engineering and to bring a focus on areas that are not regarded as part of the traditional domains of computer science, information systems and software engineering.

  10. Web Engineering: Modelling and Implementing Web Applications

    In the Web engineering research area [6], MDWE approaches [5,3,7] have become an attractive solution for building Web applications as they raise the level of abstraction and simplify the Web ...

  11. [cs/0306108] Web Engineering

    Abstract: Web Engineering is the application of systematic, disciplined and quantifiable approaches to development, operation, and maintenance of Web-based applications. It is both a pro-active approach and a growing collection of theoretical and empirical research in Web application development.

  12. Web Engineering: A Multidisciplinary Field For Web Development

    Web Engineering, an emerging new discipline, advocates a process and a systematic approach to development of high quality Web-based systems or applications. It promotes the establishment and use ...

  13. Web Engineering: The Discipline of Systematic Development of Web

    This book presents a new discipline called Web Engineering taking a rigorous interdisciplinary approach to the development of Web applications, covering Web development concepts, methods, tools and techniques. ... Siegfried Reich is director of Salzburg Research, the non-profit research organization owned by the County of Salzburg. Werner ...

  14. A Systematic Review of Web Engineering Research

    Web engineering research based on lessons learnt by the software engineering community. Introduction: The term "Web engineering" was first published in 1996 in a conference paper by Gellersen et al. [13]. Since then this term has been

  15. PDF Student Research in Web Engineering: An International Perspective on

    research in web engineering: Elluminate [6]: A web conferencing tool that can connect remote groups together with sound and video connections, shared presentations, white boards, and web tours. Skype [7]: This has helped connect undergraduate researchers and faculty across the globe with affordable and powerful video conferencing software.

  16. Accessibility engineering in web evaluation process: a systematic

    Introduction. In recent years, various aspects have motivated researchers to conduct studies about digital accessibility. The extension and increased availability of the web for multiple purposes (e.g., information search), the representation of the content (e.g., video, audio), and the emergence of new platforms (e.g., Internet of Things) and technologies (e.g., mobile, computer, tablets) are ...

  17. Journal of Web Engineering

    Top authors and change over time. The top authors publishing in Journal of Web Engineering (based on the number of publications) are: 幸雄 田村 (23 papers) absent at the last edition,; 武司 大熊 (17 papers) absent at the last edition,; Akashi Mochida (16 papers) absent at the last edition,; 一喜 日比 (15 papers) absent at the last edition,; 禎秀 富永 (15 papers) absent at the ...

  18. Journal of Engineering Research

    Journal of Engineering Research (JER) is an international, peer reviewed journal which publishes full length original research papers, reviews and case studies related to all areas of Engineering such as: Civil, Mechanical, Industrial, Electrical, Computer, Chemical, Petroleum, Aerospace, Architectural, etc. JER is intended to serve a wide range of educationists, scientists, specialists ...

  19. Study "Web Engineering" in Germany

    Registration is via uni-assist.de. Applications are only accepted for the first study semester. Application deadline for Germans and residents: 18.12.2023 - 15.03.2024 (expired). Deadline extension for enrolment: until 05.04.2024. Enrolment deadline for Germans and foreign students: 18.12.2023 - 15.03.2024.

  20. When Software Engineering Meets Quantum Computing

    Therefore, requirements engineering is very important for easing communication among various stakeholders while raising the level of abstraction in understanding the domain and linking the domain to analysis, design, and implementation. To the best of our knowledge, requirements engineering for quantum software is an uncharted area of research.

  21. Raghunathan receives Bement Award for groundbreaking AI research

    At Purdue, he has received the College of Engineering Faculty Excellence Award for Research, the Qualcomm Faculty Award and the IBM Faculty Award. The Arden L. Bement Jr. Award was established in 2015 by Distinguished Professor Emeritus Arden Bement and his wife, Louise Bement, to annually recognize a Purdue faculty member for recent ...

  22. Learning from interaction with Microsoft Copilot (web)

    Microsoft Research Blog. Learning from interaction with Microsoft Copilot (web) AI systems like Bing and Microsoft Copilot (web) are as good as they are because they continuously learn and improve from people's interactions. Since the early 2000s, user clicks on search result pages have fueled the continuous improvements of search engines.

  23. NYU Tandon @ The Yard

    NYU Tandon @ The Yard leverages the Emerging Media expertise of NYU Tandon School of Engineering with industry and media partners across the country to create a new media hub and must-visit facility within the Brooklyn Navy Yard. Our goals are to advance integrative research in AR/VR/XR, virtual production, and other topics relevant to experiential computing; provide our industry partners with ...

  24. Building Meta's GenAI Infrastructure

    Building Meta's GenAI Infrastructure. Marking a major investment in Meta's AI future, we are announcing two 24k GPU clusters. We are sharing details on the hardware, network, storage, design, performance, and software that help us extract high throughput and reliability for various AI workloads. We use this cluster design for Llama 3 training.

  25. Empirical Research Methods in Web and Software Engineering

    Web and software engineering are not only about technical solutions. They are to a large extent also concerned with organisational issues, project management and human behaviour. For disciplines like Web and software engineering, empirical methods are crucial, since they allow for incorporating human behaviour into the research approach taken.

  26. Lessons From the Early Web

    The Impact of SSL. The introduction of SSL by Netscape was a game-changer for the internet: E-Commerce Enablement: By addressing concerns of security and trust, SSL laid the groundwork for e ...

  27. Accessibility engineering in web evaluation process: a systematic

    This paper presents a new SLR approach concerning accessibility in the web evaluation process, considering the period from 2010 to 2021. The review of 92 primary studies showed the contribution of ...

  28. Synthetic material could improve ease and cut cost of gut microbiome

    A synthetic material recently developed by Penn State scientists could lower the difficulty and barrier to entry for researchers studying how microorganisms interact with the gastrointestinal system and potentially improve labs' ability to screen drugs that impact gut health.

  29. E-Week 2024! Join Us!

    eWeek 2024! An end-of-year event to celebrate and showcase what engineering is all about! eWeek is a week-long series of events dedicated to highlighting the incredible achievements and innovations of our engineering students. Highlights include: International Engineering Program block party, inspiring talks, research showcase, student ...