Studies in Language Testing (SiLT)

Studies in Language Testing (SiLT) is a series of academic volumes edited by Professor Lynda Taylor and Dr Nick Saville. It is published jointly by Cambridge English and Cambridge University Press (CUP).

The series addresses a wide range of important issues and new developments in language testing and assessment, and is an indispensable resource for test users, developers and researchers. There are currently over 50 titles available; a full list of these, plus content summaries, is provided below. For a reader's overview of the series, including a thematic categorisation and extracts from reviews, please see the essay kindly contributed to Cambridge English by visiting professor Xiangdong Gu: Download Studies in Language Testing Essay by Xiangdong Gu (PDF)

Copies of the volumes are available from booksellers or can be ordered directly from the Cambridge University Press website.

SiLT volumes

Volume 54 - On Topic Validity in Speaking Tests (Khabbazbashi 2021)

On Topic Validity in Speaking Tests, Nahal Khabbazbashi (2021)

Topics are often used as a key speech elicitation method in performance-based assessments of spoken language, and yet the validity and fairness issues surrounding topics are surprisingly under-researched. Are different topics ‘equivalent’ or ‘parallel’? Can some topics bias against or favour individuals or groups of individuals? Does background knowledge of topics have an impact on performance? Might the content of test taker speech affect their scores – and perhaps more importantly, should it? Grounded in the real-world assessment context of IELTS, this volume draws on original data as well as insights from empirical and theoretical research to address these questions against the backdrop of one of the world’s most high-stakes language tests.

This volume provides:

  • an up-to-date review of theoretical and empirical literature related to topic and background knowledge effects on second language performance
  • an accessible and systematic description of a mixed methods research study with explanations of design, analysis, and interpretation considerations at every stage
  • a comprehensive and coherent approach for building a validity argument in a given assessment context.

The volume also contributes to critiques of recent models of communicative competence that over-rely on linguistic features at the expense of more complex aspects of communication, arguing for an expansion of current definitions of the speaking construct to emphasise the content of speech as an important, yet often neglected, feature.

This volume will be a valuable resource for postgraduate students, those working professionally in the field of speaking assessment, such as personnel in examination boards, item writers and curriculum developers, and anyone seeking to better understand and improve the fairness and validity of topics used in assessments.

Download a free full PDF version of this volume

Volume 53 - Insights into Assessing Academic Listening: The Case of IELTS

Insights into Assessing Academic Listening: The Case of IELTS

Opening with an overview of studies that investigate the listening test component of the International English Language Testing System (IELTS), this volume proposes and illustrates a new line of enquiry for academic listening assessment: a better understanding of the cognitive processes underlying everyday listening events can provide a framework for recognising what is distinctive about the skill when applied to an English for Academic Purposes (EAP) or professional context. The outcome is a set of validation criteria against which a reviewer can measure to what degree a given test does or does not represent the academic or professional experience, which can be applied across various features of a listening test and to the design of all similar tests in this field.

The volume provides:

  • an up-to-date review of relevant literature on assessing academic listening
  • a clear and detailed specification of the construct of academic listening, with an evaluation of how this is used for assessment purposes
  • a consideration of the nature of academic listening in a digital age, and its implications for assessment research and test development

As test developers need to support score validity claims with a sound theoretical framework which guides their coverage of appropriate language ability constructs, this volume will be a rich resource for examination boards and other institutions, as well as researchers and graduate students in the field of language assessment, and teachers preparing students for IELTS or involved in EAP programmes.

Volume 51 - Research and Practice in Assessing Academic Reading: The Case of IELTS (Weir and Chan 2019)

Research and Practice in Assessing Academic Reading: The Case of IELTS, Cyril J Weir and Sathena Chan (2019)

This volume describes differing approaches to understanding academic reading ability that have emerged in recent decades and goes on to develop an empirically grounded framework for validating tests of this skill. The framework is then applied to the IELTS Academic Reading module to investigate a number of different validity perspectives that reflect the socio-cognitive nature of any assessment event. The authors demonstrate how a systematic understanding and application of the framework and its components can help test developers to operationalise their tests so as to fulfil the validity requirements for an academic reading test.

The book provides:

  • an up-to-date review of the relevant literature on assessing academic reading
  • a clear and detailed specification of the construct of academic reading and evaluation of how this is used for assessment purposes
  • a consideration of the nature of academic reading in a digital age and its implications for assessment research and test development.

The volume is a rich source of information on all aspects of testing academic reading ability. Examination boards and other institutions who need to validate their own academic reading tests in a systematic and coherent manner, or who wish to develop new instruments for measuring academic reading, will find it a useful resource, as will researchers and graduate students in the field of language assessment, and teachers preparing students for IELTS (and similar tests) or involved in English for Academic Purposes (EAP) programmes.

Volume 50 - Lessons and Legacy: A Tribute to Professor Cyril J Weir (1950-2018) (Edited by Taylor and Saville 2020)

Lessons and Legacy: A Tribute to Professor Cyril J Weir (1950-2018), edited by Lynda Taylor and Nick Saville (2020)

Written by a selection of his friends and collaborators, this volume pays tribute to the academic achievements of the late Professor Cyril J Weir. His passing in September 2018 leaves an eclectic legacy in the field of language testing and assessment, and the chapters contained herein, part of a series he guided and often wrote for, honour and illuminate his lessons.

Professor Weir’s chronicling of the history and evolution of language testing is reflected in chapters on his role in assessment reform and the origins of his socio-cognitive framework; authors also reflect on the impact of this model on test validity and validation. He was also a vital influence in putting these ideas into action, as reported in chapters on test operationalisation and the establishment of the Centre for Research in English Language Learning and Assessment (CRELLA).

By drawing on a rich range of voices in language assessment, from China to the UK to the Middle East, and from Professor Weir’s earliest to most recent collaborators, the volume illustrates the breadth and depth of his impact on language testing and assessment, and shows how his lessons continue to be relevant to the present day.

Volume 49 — Applying the Socio-cognitive Framework to the BioMedical Admissions Test: Insights from language assessment (Cheung, McElwee and Emery 2017)

Applying the Socio-cognitive Framework to the BioMedical Admissions Test: Insights from language assessment, edited by Kevin Y F Cheung, Sarah McElwee and Joanne Emery (2017)

This volume takes a framework for validating tests that was developed in language testing, and applies it to an admissions test used for biomedical courses. The framework is used to consider validity in the BioMedical Admissions Test (BMAT). Each chapter focuses on a different aspect of validity and also presents research that has been conducted with the test. By addressing all of the validity aspects identified as important by language testers, this volume presents a comprehensive evaluation of BMAT's validity. The processes of evaluation used in the book also promote a cross-disciplinary approach to assessment research, by demonstrating how effectively language testing frameworks can be used in different educational contexts. The authors of the chapters include Cambridge Assessment staff and medical education experts, from a wide range of subject backgrounds. Psychologists, clinicians, linguists and assessment experts have all contributed to the volume, making it an example of multidisciplinary collaboration.

The volume includes the following chapters:

  • The Cambridge approach to admissions testing (Nick Saville)
  • Considering the test taker in test development and research (Devine, Taylor and Cross)
  • Cognitive validity (Cheung and McElwee)
  • Building fairness and appropriacy into testing contexts: Tasks and administrations (Shannon, Crump and Wilson)
  • Scoring validity (Elliott and Gallacher)
  • Criterion-related validity (Fyfe, Devine and Emery)
  • Consequential validity (McElwee, Fyfe and Grant)
  • Conclusions and Recommendations (Cheung)

Volume 48 — Second Language Assessment and Action Research (Burns and Khalifa 2017)

Second Language Assessment and Action Research, edited by Anne Burns and Hanan Khalifa (2017)

Volume 47 — Examining Young Learners: Research and practice in assessing the English of school-age learners (Papp and Rixon 2018)

Examining Young Learners: Research and practice in assessing the English of school-age learners, Szilvia Papp and Shelagh Rixon (2018)

This volume gives a state-of-the-art account of the unique area of children's and teenagers' second language development and assessment. By linking research carried out within the educational, academic and testing communities, it explores common issues in cognitive psychology, child second language (L2) acquisition studies, recent research on adolescents, and language assessment.

The volume reflects on how learners’ L2 development between the ages of 6 and 16 can be coherently described and their L2 assessment defined in terms of socio-cognitive validity. There is particular focus on the theoretical foundations, language competence model, development and validation framework, and evaluation and review processes to provide evidence for the validity of the Cambridge English family of assessments for children and teenagers.

Academics, assessment professionals and postgraduate researchers of L2 development in children and teenagers will find great value in the volume’s theoretical insight, while policy-makers and teachers will gain rigorous practical advice for the young language learner’s classroom and assessment.

Volume 46 — Advancing the Field of Language Assessment: Papers from TIRF doctoral dissertation grantees (Christison and Saville 2016)

Advancing the Field of Language Assessment: Papers from TIRF doctoral dissertation grantees, edited by MaryAnn Christison and Nick Saville (2016)

Since 2002, the International Research Foundation for English Language Education (TIRF) has supported students in completing their doctoral research on topics related to the foundation’s priorities. Each year applicants who have been advanced to candidacy in legitimate PhD or EdD programmes are invited to submit proposals for Doctoral Dissertation Grants (DDGs).

This volume brings together a set of 11 TIRF-related research papers on English language assessment. As a member of the TIRF Board of Trustees, Cambridge English wishes to support the foundation in achieving its aims in disseminating and influencing language testing policies. 

The volume:

  • focuses on the applied nature of research in language assessment
  • discusses the implications of such research, and
  • presents its findings from a global perspective.

This volume can serve as a core or supplemental text for graduate seminars in English language assessment in applied linguistics, education, TESOL, and TEFL, and it is useful for scholars of L2 methodology, curriculum design, and teacher development in ELT, as well as for courses on language assessment. As a reference volume, it is appropriate for individual scholars, test developers, graduate and undergraduate students, and researchers.

Volume 45 — Learning Oriented Assessment: A systemic approach (Jones and Saville 2016)

Learning Oriented Assessment: A systemic approach, Neil Jones and Nick Saville (2016)

The learning-oriented approach to assessment developed in this book seeks to exploit the commonality as well as the complementarity of formal assessment and classroom assessment. It proposes a Learning Oriented Assessment (LOA) model which presents a systemic, ecological approach in which all kinds of assessment contribute positively to their two major educational purposes: promoting better learning, and providing measurement that contributes to a meaningful interpretation of learning outcomes.

The volume poses three key questions central to LOA: ‘What is learning?’, ‘What is to be learned?’ and ‘What is to be assessed?’, and discusses how a focus on these fundamental aspects of learning and assessment can support learners, teachers and assessment professionals. The volume also focuses on the use of evidence and on how it can be collected and used to feed back into learning. It gives an overview both of large-scale assessment as practised by Cambridge English and of learning-oriented classroom assessment practices, where learning interactions take place. The volume concludes with a look at implementing LOA in practice.

This volume is a rich source of information on key issues, principles and practices in the area of LOA. It provides fresh insights into current knowledge and understanding of the role of assessment in supporting learning, as well as useful guidance on good practice. As such, it will be of considerable interest to assessment practitioners, teachers and academics, educational policy-makers and examination board personnel.

Volume 44 — Language Assessment for Multilingualism: Proceedings of the ALTE Paris Conference, April 2014 (Docherty and Barker 2016)

Language Assessment for Multilingualism: Proceedings of the ALTE Paris Conference, April 2014, edited by Coreen Docherty and Fiona Barker (2016)

This volume explores the role of multilingualism in social, educational and practical contexts. It brings together a collection of edited papers based on presentations given at the 5th International Conference of the Association of Language Testers in Europe (ALTE) held in Paris, France, in April 2014.

The selected papers focus on several core strands addressed during the conference. 

Section 1 deals with frameworks in social contexts and focuses on their role in migration and multilingual policy and practice. It addresses how recent education reforms aim to increase both social mobility and intercultural communication. Section 2 focuses on the response of language assessment providers to the rise of linguistic diversity. Section 3 then discusses the role of intercultural professionalisation of language assessors. Finally, Section 4 reflects on the approach of various institutes to achieve fairness and quality in test provision.

Key features of the volume include:

  • insights on the effect of multilingualism on international mobility
  • discussion of how multilingualism can address the challenge of increasing linguistic diversity
  • reflection on the impact of intercultural communication on linguistic competence
  • advice on how to ensure fairness and quality in language assessment.

With its broad coverage of key issues and combination of theoretical insights and practical advice, this volume is a valuable reference work for academics, employers and policy-makers in Europe and beyond. It is also a useful resource for postgraduate students of language testing and for practitioners, and anyone else seeking to understand the policies, procedures and challenges encountered in the application of multilingualism.

Volume 43 — Second Language Assessment and Mixed Methods Research (Moeller, Creswell and Saville 2016)

Second Language Assessment and Mixed Methods Research, edited by Aleidine J Moeller, John W Creswell and Nick Saville (2016)

Test developers have a responsibility to ensure that the assessments they develop meet the needs of test users and provide a fair assessment in educational and social contexts. Mixed methods research plays an important role in providing a set of different but complementary research tools which can be used to underpin the assessment validation process and add value to assessment research.

The purpose of this volume is to create a deeper understanding of the role of mixed methods in language assessment, and to provide essential information needed to conduct and publish mixed methods research within the context of language assessment. Mixed methods language assessment studies on topics such as community-based participatory test development, investigating test impact, and developing new test tasks and rating scales illustrate first-hand the benefits and added value of mixed methods to the language testing and assessment field.

The volume provides:

  • theoretical insights and practical guidance on the use of mixed methods research
  • advice on the essential components for conducting and publishing mixed methods research
  • case studies from language assessment to demonstrate how mixed methods research can be rigorously and systematically applied in a specific context. 

This is the first volume of its kind to comprehensively illustrate the application of the principles of mixed methods research in language assessment and to combine theoretical insights and practical illustrations of good practice. As such, it is a valuable reference work for academics, postgraduate students and practitioners, and anyone else seeking to understand the purpose, design and application of mixed methods research.

Volume 42 — Assessing Language Teachers’ Professional Skills and Knowledge (Wilson and Poulter 2015)

Assessing Language Teachers’ Professional Skills and Knowledge, edited by Rosemary Wilson and Monica Poulter (2015)

The growth in English language teaching worldwide, and the related increase in teacher training programmes, have made accountability in the assessment of teachers more important than ever. Formal, summative assessment has taken on greater importance in many teacher training programmes and requires procedures which do not always sit easily with the development process. Meanwhile, transparency of assessment procedures is also increasingly demanded by the candidates themselves.

This edited volume discusses key issues in assessing language teachers’ professional skills and knowledge, and provides case study illustrations of how teacher knowledge and teaching skills are assessed at pre-service and in-service levels within the framework of the Cambridge English Teaching Qualifications.

The volume provides:

  • discussion of ways in which the changing nature of English language teaching has impacted on teacher education and assessment
  • useful illustrations of specific assessment procedures for both teaching knowledge and practical classroom skills
  • real-life examples of the ways in which the Cambridge English Teaching Qualifications have been integrated into and adapted to work in local contexts.

This is the first volume of its kind wholly dedicated to language teacher assessment. As such, it will be of interest not only to researchers and postgraduate students but also language teachers and teacher educators.

Volume 41 — Validating Second Language Reading Examinations (Wu 2014)

Validating Second Language Reading Examinations: Establishing the validity of the GEPT through alignment with the Common European Framework of Reference, Rachel Yi-fen Wu (2014)

Validating Second Language Reading Examinations describes the development of an empirical framework for test validation and comparison of reading tests at different proficiency levels through a critical evaluation of alignment with the Common European Framework of Reference (CEFR). It focuses on contextual parameters, cognitive processing operations and test results, and identifies parameters for the description of different levels of reading proficiency examinations. The volume explores procedures for linking tests to the CEFR and proposes both qualitative and quantitative methods that complement the procedures recommended in the Council of Europe’s Relating Language Examinations to the Common European Framework of Reference for Languages (CEFR): A Manual, piloted in 2003 and revised in 2009.

Key features of the book include:

  • a detailed review of the literature on CEFR alignment, vertical scaling, test specifications and test comparability
  • a comprehensive and coherent approach to the validation of reading tests
  • an accessible and systematic description of procedures for collecting validity evidence
  • a case study comparing different testing systems targeting the same CEFR level.

This volume will be a valuable resource for academic researchers and postgraduate students interested in using CEFR alignment procedures and methodology to demonstrate differentiation across different levels of a testing system and equivalence between different examinations that target a particular CEFR level. It will be of particular relevance to exam boards who wish to validate their reading tests in terms of differentiation across test levels and external criteria. It will also be a useful reference for teachers and curriculum designers who wish to reflect real-life reading activities when they prepare reading tasks for language learning.

Volume 40 — Multilingual Frameworks: The construction and use of multilingual proficiency frameworks (Jones 2014)

Multilingual Frameworks: The construction and use of multilingual proficiency frameworks, Neil Jones (2014)

This volume describes 20 years of work at Cambridge English to develop multilingual assessment frameworks and presents useful guidance on good practice. It covers the development of the ALTE Framework and ‘Can Do’ project, the Common European Framework of Reference (CEFR) and the linking of the Cambridge English exam levels to it, Asset Languages – a major educational initiative for UK schools – and the European Survey on Language Competences, co-ordinated by Cambridge English for the European Commission. It proposes a model for the validity of assessment within a multilingual framework and, while illustrating the constraints which determined the approach taken to each project, makes clear recommendations on methodological good practice. It also explores and looks forward to the further extension of assessment frameworks to encompass a model for multilingual education.

Key features of the volume include:

  • a clear and comprehensive explanation of several major multilingual projects
  • combination of theoretical insights and practical advice
  • discussion of the interpretation and use of the CEFR.

Multilingual Frameworks is a rich source of information on key issues in the development and use of multilingual proficiency frameworks. As such, it will be a valuable reference work for academics, education policy-makers and examination board personnel. It is also a useful resource for postgraduate students of language assessment and for practitioners, and any stakeholders seeking to gain a clearer picture of the issues involved with cross-language assessment frameworks.

Volume 39 — Testing Reading through Summary: Investigating summary completion tasks for assessing reading comprehension ability (Taylor 2013)

Testing Reading through Summary: Investigating summary completion tasks for assessing reading comprehension ability, Lynda Taylor (2013)

Testing Reading through Summary explores the use of summary tasks as an effective means of assessing reading comprehension ability. It focuses in particular on text-removed summary completion as a task type that offers a way of addressing more directly the reader’s mental representation of text for reading assessment purposes.

The volume describes a series of empirical studies that investigated the development of text-removed summary completion tasks, their trialling and validation with results from an independent measure of reading ability. Findings from the project suggested that it is possible to develop a satisfactory summary of a text which will be consistent with most readers’ mental representation if their reading of the text is adequately contextualised within some purposeful activity.

Key features of the book include:

  • an in-depth discussion of the nature of reading comprehension and approaches to assessing reading comprehension ability
  • a comprehensive empirical report and practical guidance on the development, trialling and validation of summary completion tasks
  • fresh insights into current knowledge and understanding of the assessment of reading ability.

This volume will be a valuable resource for those working professionally in the field of reading assessment such as key personnel in examination agencies and those with an academic interest in language testing/examining. It will also be a useful resource for postgraduate students of language testing and for practitioners, i.e. teachers, teacher educators, curriculum developers, materials writers, and anyone seeking to better understand the nature of reading comprehension ability and how it can be assessed most effectively.

Volume 38 — Cambridge English Exams – The First Hundred Years (Hawkey and Milanovic 2013)

Cambridge English Exams – The First Hundred Years: A history of English language assessment from the University of Cambridge 1913-2013, Roger Hawkey and Michael Milanovic (2013)

The first Cambridge English examination for non-native speakers was taken by three candidates in 1913. Today, the exams are taken by nearly four million people a year in 130 countries and cover a wide range of needs, from English for young learners to specific qualifications for university entrance and professional use.

Throughout their history, the Cambridge English exams have been designed to meet the changing needs of learners, teachers, universities, employers and official bodies, and to deliver educational and social benefits. They have benefited from, and contributed to, research in education, language learning and assessment to ensure that they offer valid, reliable and fair qualifications.

This book traces the history of the exams through their first hundred years, setting them in the context of wider educational and academic developments. The authors pay particular attention to the dedicated individuals in Cambridge and around the world who have contributed to the success of the exams and to their positive educational impact. It will be of interest to anyone interested in language teaching and assessment, applied linguistics or educational history, and to the thousands of people who are part of the wider Cambridge English network.

Volume 37 — Measured Constructs: A history of Cambridge English language examinations 1913-2012 (Weir, Vidaković and Galaczi 2013)

Measured Constructs: A history of Cambridge English language examinations 1913-2012, Cyril J Weir, Ivana Vidaković and Evelina D Galaczi (2013)

This volume sheds light on how approaches to measuring English language ability evolved worldwide and at Cambridge over the last 100 years. The volume takes the reader from the first form of the Certificate of Proficiency in English, offered to three candidates in 1913 as a serendipitous hybrid of legacies in language teaching from the previous century, up to the current Cambridge approach to language examinations, in which the language construct to be measured is seen as the product of the interactions between a targeted cognitive ability based on an expert user model, a highly specified context of use and a performance level based on explicit and appropriate criteria of description.

This volume:

  • chronicles the evolution of constructs in English language teaching and assessment over the last century
  • provides an accessible and systematic analysis of changes in the way constructs were measured in Cambridge English exams from 1913 to 2012
  • includes copies of past Cambridge English exams, from the original exams to the current ones, as well as previously unpublished archive material.

Measured Constructs is a rich source of information on how changes in language pedagogy, together with wider socio-economic factors, have shaped the development of English language exams in Cambridge over the last century.  As such, it will be of considerable interest to researchers, practitioners and graduate students in the field of language assessment.  This volume complements previous historical volumes in the series on the development of Cambridge English exams, as well as titles which investigate language ability constructs underlying current Cambridge English exams.

Volume 36 — Exploring Language Frameworks: Proceedings of the ALTE Kraków Conference, July 2011 (Galaczi and Weir 2013)

Exploring Language Frameworks: Proceedings of the ALTE Kraków Conference, July 2011, edited by Evelina D Galaczi and Cyril J Weir (2013)

This volume explores the role of language frameworks in social, educational and practical contexts. It brings together a collection of 21 edited papers based on presentations given at the 4th International Conference of the Association of Language Testers in Europe (ALTE) held in Kraków in July 2011. The selected papers focus on several core strands addressed during the conference. Section one deals with frameworks in social contexts and focuses on their role in migration and multilingual policy and practice. Section two addresses the use of frameworks in educational contexts, covering issues such as defining an inclusive framework for languages, the use of frameworks in test and course development and their role in guiding test users. Section three focuses on practical issues associated with the application of frameworks and presents studies associated with rating scales, the use of frameworks in test development and validation, and the role of statistical procedures as part of quality assurance.

Key features of the volume include:

  • insights into the influence of language frameworks on social policy and practice
  • up-to-date information on the application of frameworks in a variety of learning and teaching contexts worldwide 
  • accounts of recent projects involving the practical role of frameworks in addressing assessment issues.

With its broad coverage of key issues and combination of theoretical insights and practical advice, this volume is a valuable reference work for academics, employers and policy-makers in Europe and beyond. It is also a useful resource for postgraduate students of language testing and for practitioners, and anyone else seeking to understand the policies, procedures and challenges encountered in the application of language frameworks.

Volume 35 — Examining Listening: Research and practice in assessing second language listening (Geranpayeh and Taylor 2013)

Examining Listening: Research and practice in assessing second language listening, edited by Ardeshir Geranpayeh and Lynda Taylor (2013)

This volume develops a theoretical framework for validating tests of second language listening ability. The framework is then applied through an examination of the tasks in Cambridge English listening tests from a number of different validity perspectives that reflect the socio-cognitive nature of any assessment event. The authors show how an understanding and analysis of the framework and its components can assist test developers to operationalise their tests more effectively, especially in relation to the key criteria that differentiate one proficiency level from another.

The volume provides:

  • an up-to-date review of the relevant literature on assessing listening
  • an accessible and systematic description of the different proficiency levels in second language listening
  • a comprehensive and coherent basis for validating tests of listening.

This volume is a rich source of information on all aspects of examining listening ability. As such, it will be of considerable interest to examination boards who wish to validate their own listening tests in a systematic and coherent manner, as well as to academic researchers and graduate students in the field of language assessment more generally. This is a companion volume to the previously published Examining Writing (2007), Examining Reading (2009) and Examining Speaking (2011).

"Geranpayeh and Taylor have put together a collection that will undoubtedly become a significant addition to the literature on a still very under-represented skill." Luke Harding (2015), Language Testing 31, 121-124.

Volume 34 — IELTS Collected Papers 2: Research in reading and listening assessment (Taylor and Weir 2012)

IELTS Collected Papers 2: Research in reading and listening assessment, edited by Lynda Taylor and Cyril J Weir (2012)

IELTS (International English Language Testing System) serves as a high-stakes proficiency test to assess the English language skills of international students wishing to study, train or work in English-speaking environments. The test has been regularly revised in light of findings from ongoing research and validation studies to ensure that it remains a valid and reliable measure.

This volume brings together a set of research studies conducted between 2005 and 2010, sponsored under the auspices of the British Council/IELTS Australia Joint-funded Research Program, which provides annual grant funding to encourage research activity among IELTS test stakeholders around the world. The eight studies – four on reading and four on listening assessment – provide valuable test validity evidence and directly inform the continuing development of the IELTS Reading and Listening tests.

The volume chronicles the evolution of the Reading and Listening tests in ELTS and IELTS from 1980 to the present day. It explains the rationale for revising these tests at various points in their history and the role played in this by research findings. The editors comment on the specific contribution of each study in this volume to the ongoing process of IELTS Reading and Listening test design and development.

This is a companion volume to the previously published IELTS Collected Papers on IELTS speaking and writing assessment. It will be of particular value to language testing researchers interested in IELTS as well as to institutions and professional bodies who use IELTS test scores. It will also be relevant to students, lecturers and researchers working more broadly in the field of English for Academic Purposes.

Volume 33 — Aligning Tests with the CEFR: Reflections on using the Council of Europe’s draft Manual (Martyniuk 2010)

Aligning Tests with the CEFR: Reflections on using the Council of Europe’s draft Manual, edited by Waldemar Martyniuk (2010)

This volume contains 12 case studies that piloted the Council of Europe’s preliminary Manual for Relating Language Examinations to the Common European Framework of Reference for Languages (CEFR), released in 2003. The case studies were presented at a 2-day colloquium held in Cambridge in December 2007, an event which helped to inform the Manual revision project during 2008/2009. As well as describing their studies and reporting on their findings, contributors to the volume reflect and comment on their experience of using the draft Manual. A clear and comprehensive introductory chapter explains the development of the CEFR and the draft Manual for linking tests, discussing its relevance for the future. The volume will be of particular interest to examination boards, language test developers and educational policy makers, as well as to academic lecturers, researchers and graduate students interested in the principles and practice of aligning tests with the CEFR.

‘This volume … is another excellent book in the Studies in Language Testing (SiLT) series … This volume of papers will serve as an excellent resource for professionals around the world who wish to learn how to go about the difficult task of aligning their assessments with the CEFR.’ Craig Deville (2012), Language Testing 29 (2), 312–314.

Volume 32 — Components of L2 Reading: Linguistic and processing factors in the reading test performances of Japanese EFL learners (Shiotsu 2010)

Components of L2 Reading: Linguistic and processing factors in the reading test performances of Japanese EFL learners, Toshihiko Shiotsu (2010)

This volume investigates the linguistic and processing factors that underpin the reading comprehension performance of Japanese learners of English. It describes a comprehensive and rigorous empirical study to identify the main candidate variables that affect reading performance and to develop appropriate research instruments to investigate these. The study explores the contribution to successful reading comprehension of factors such as syntactic knowledge, vocabulary breadth and reading speed in the second language. Key features of the book include: an up-to-date review of the literature on the development and assessment of L1 and L2 reading ability; practical guidance on how to investigate the L2 reading construct using multiple methodologies; and fresh insights into interpreting test data and statistics, and into understanding the nature of L2 reading proficiency. This volume will be a valuable resource for academic researchers and postgraduate students interested in investigating reading comprehension performance, as well as for examination board staff concerned with the design and development of reading assessment tools. It will also be a useful reference for curriculum developers and textbook writers involved in preparing syllabuses and materials for the teaching and learning of reading.

Volume 31 — Language Testing Matters: Investigating the wider social and educational impact of assessment – Proceedings of the ALTE Cambridge Conference, April 2008 (Taylor and Weir 2009)

Language Testing Matters: Investigating the wider social and educational impact of assessment – Proceedings of the ALTE Cambridge Conference, April 2008, edited by Lynda Taylor and Cyril J Weir (2009)

This volume explores the social and educational impact of language testing and assessment by bringing together a collection of 20 edited papers given at the 3rd international conference of the Association of Language Testers in Europe (ALTE). Section One considers new perspectives on testing for specific purposes, including the role played by language assessment in the aviation industry, the legal system, and migration and citizenship policy. Section Two contains insights on testing policy and practice in the context of language teaching and learning in different parts of the world, including Africa, Europe, North America and Asia. Section Three offers reflections on the impact of testing among differing stakeholder constituencies, such as the individual learner, educational authorities, and society in general. With its broad coverage of key issues, this volume is a valuable reference work for academics, employers and policy makers in Europe and beyond. It is also a useful resource for postgraduate students of language testing and for practitioners, i.e. teachers, teacher educators, curriculum developers and materials writers.

Volume 30 — Examining Speaking: Research and practice in assessing second language speaking (Taylor 2011)

Examining Speaking: Research and practice in assessing second language speaking, edited by Lynda Taylor (2011)

This edited volume develops a theoretical framework for validating tests of second language speaking ability. The framework is then applied through an examination of the tasks in Cambridge English Speaking tests from a number of different validity perspectives that reflect the socio-cognitive nature of any assessment event. The chapter authors show how an understanding and analysis of the framework and its components can assist test developers to operationalise their Speaking tests more effectively, especially in relation to the key criteria that differentiate one proficiency level from another. As well as providing an up-to-date review of relevant literature on assessing speaking, the volume also offers an accessible and systematic description of the different proficiency levels in second language speaking, and a comprehensive and coherent basis for validating tests of speaking. The volume will be of interest to examination boards who wish to validate their own Speaking tests in a systematic and coherent manner, as well as to academic researchers and students in the field of language assessment more generally.

“This edited volume provides useful information on how to apply a socio-cognitive theoretical framework of validity by illustrating research on Cambridge ESOL exams…[it] provides a broad picture with constructive examples for future researchers who want to apply this validity framework.” Youngshin Chi (2013), Language Assessment Quarterly 10, 476-479.

Volume 29 — Examining Reading: Research and practice in assessing second language reading (Khalifa and Weir 2009)

Examining Reading: Research and practice in assessing second language reading, Hanan Khalifa and Cyril J Weir (2009)

This volume develops a theoretical framework for validating tests of second language reading ability. The framework is then applied through an examination of tasks in Cambridge English Reading tests from a number of different validity perspectives that reflect the socio-cognitive nature of any assessment event. The authors show how an understanding and analysis of the framework and its components can assist test developers to operationalise their tests more effectively. As well as providing an up-to-date review of relevant literature on assessing reading, it also offers an accessible and systematic description of the key criteria that differentiate one proficiency level from another when assessing second language reading. The volume will be of interest to examination boards who wish to validate their own reading tests in a systematic and coherent manner, as well as to academic researchers and students in the field of language assessment more generally.

‘The book offers the field another splendid exposition on second language (L2) reading. This work is unique, however, in that it was written by two scholars who are quite familiar with the Cambridge suite of examinations, and they make extensive use of their knowledge of these tests to demonstrate how the Cambridge ESOL examinations implement theory and research in practice … This volume represents an important contribution to the field in terms of both theory and practice, its timeliness regarding several topics (e.g. alignment with the CEFR, computerized testing, among others), and its appeal to and relevance for multiple audiences.’ Craig Deville (2011), The Modern Language Journal 95, 334–335.

SiLT 29 was nominated as a runner-up in the prestigious Sage/ILTA 2012 award for the best book on language testing.

Volume 28 — Examining FCE and CAE: Key issues and recurring themes in developing the First Certificate in English and Certificate in Advanced English exams (Hawkey 2009)

Examining FCE and CAE: Key issues and recurring themes in developing the First Certificate in English and Certificate in Advanced English exams, Roger Hawkey (2009)

This volume examines two of the best-known Cambridge English examinations – Cambridge English: First, also known as First Certificate in English (FCE), and Cambridge English: Advanced, also known as Certificate in Advanced English (CAE). It starts with the introduction of FCE (then the Lower Certificate in English) in 1939 and traces subsequent developments, including the renaming to FCE in 1975 and the introduction of CAE in 1991, as well as the regular projects to modify and update both tests. Key issues addressed are: test constructs; proficiency levels; principles and practice in test development, validation and revision; organisation and management; and stakeholders and partnerships. The book includes a unique set of facsimile copies of FCE and CAE test versions, from the original tests in 1939 and 1991 through various revision projects to the updated formats of 2008. The volume will be of interest to language testing researchers, academic lecturers, postgraduate students and educational policy makers, as well as to teachers, directors of studies, school owners and other stakeholders involved in preparing students for the Cambridge exams. This title complements previous historical volumes on CPE, BEC, CELS and IELTS.

Volume 27 — Multilingualism and Assessment: Achieving transparency, assuring quality, sustaining diversity – Proceedings of the ALTE Berlin Conference, May 2005 (Taylor and Weir 2008)

Multilingualism and Assessment: Achieving transparency, assuring quality, sustaining diversity – Proceedings of the ALTE Berlin Conference, May 2005, edited by Lynda Taylor and Cyril J Weir (2008)

This collection of edited papers, based on presentations given at the 2nd ALTE Conference, explores the impact of multilingualism on language testing and assessment. The 20 papers consider ways of describing and comparing language qualifications to establish common levels of proficiency, balancing the need to set shared standards and ensure quality, and at the same time sustain linguistic diversity. The contributions come from authors within and beyond Europe and address substantive issues in assessing language ability today. Key features of the volume include: advice on quality management processes in test development and administration; discussion of the role of language assessment in migration and citizenship; and guidance on linking examinations to the CEFR, including some case studies. This volume is a valuable reference for academics and policy makers both within Europe and beyond, as well as a useful resource for practitioners seeking to define language proficiency levels in relation to the CEFR and similar frameworks.

“Overall the book provides well-selected papers with wide-ranging subject matters from the European community, which allows a glance into the challenging tasks the member countries are facing as they are adjusting to the concept of shared standards in language proficiency. The book will serve as timeless reference for testing professionals as it chronicles the tasks that have to be undertaken when 46 countries are involved in a task of this magnitude … The papers are important not only for the European member organisations (and the five observing countries: Canada, the Holy See, Japan, Mexico and the United States) but also for the assessment community in general, because they illustrate that with a clear mission and with dedicated researchers guided globalization can be beneficial to all.” Zsuzsa Cziraky Londe (2010), Language Assessment Quarterly 7 (3), 280–283.

Volume 26 — Examining Writing: Research and practice in assessing second language writing (Shaw and Weir 2007)

Examining Writing: Research and practice in assessing second language writing, Stuart D Shaw and Cyril J Weir (2007)

This volume describes the theory and practice of the Cambridge English approach to assessing second language writing ability. A comprehensive test validation framework is used to examine the tasks in Cambridge English Writing tests from a number of different validity perspectives that reflect the socio-cognitive nature of any assessment event. The authors show how an understanding and analysis of the framework and its components can assist test developers to operationalise their tests more effectively. As well as providing an up-to-date review of relevant literature on assessing writing, it also offers an accessible and systematic description of the different proficiency levels in second language writing. The volume will be of interest to examination boards who wish to validate their own Writing tests in a systematic and coherent manner, as well as to academic researchers and students in the field of language assessment more generally.

‘… it should be of interest to a wider audience as well for at least two reasons: (1) it provides a coherent, up-to-date summary of research on writing as a phenomenon in itself, as well as on the assessment of writing; and (2) it presents a great deal of practical information based on solid research that will be helpful in assisting others who are designing, evaluating, or wishing to improve upon their own assessment practices.’ Sara Cushing Weigle (2010), Language Testing 27 (1), 141–144.

Volume 25 — IELTS Washback in Context: Preparation for academic writing in higher education (Green 2007)

IELTS Washback in Context: Preparation for academic writing in higher education, Anthony Green (2007)

Based upon a PhD dissertation completed in 2003, this volume reports an empirical study to investigate the washback of the IELTS Writing subtest on English for Academic Purposes (EAP) provision. The study examines dedicated IELTS preparation courses alongside broader programmes designed to develop the academic literacy skills required for university study. Using a variety of data collection methods and analytical techniques, the research explores the complex relationship that exists between teaching and learning processes and their outcomes. The role of IELTS in EAP provision is evaluated, particularly in relation to the length of time and amount of language support needed by learners to meet minimally acceptable standards for English-medium tertiary study. This volume will be of direct interest to providers and users of general proficiency and EAP tests, as well as academic researchers and graduate students interested in investigating test washback and impact. It will also be relevant to teachers, lecturers and researchers concerned with the development of EAP writing skills.

Volume 24 — Impact Theory and Practice: Studies of the IELTS test and Progetto Lingue 2000 (Hawkey 2006)

Impact Theory and Practice: Studies of the IELTS test and Progetto Lingue 2000, Roger Hawkey (2006)

This book describes two recent case studies to investigate test impact in specific educational contexts: one analyses the impact of IELTS (International English Language Testing System), while the second focuses on a major national language teaching reform programme introduced by the Ministry of Education in Italy. With its combination of theoretical overview and practical advice, this volume is a useful manual on how to conduct impact studies and will be of particular interest to language test researchers and students of language testing. It will also be relevant to those who are concerned with the process of curriculum and examination reform.

Volume 23 — Assessing Academic English: Testing English proficiency, 1950–1989 – the IELTS solution (Davies 2008)

Assessing Academic English: Testing English proficiency, 1950–1989 – the IELTS solution, Alan Davies (2008)

This volume presents an authoritative account of academic language proficiency testing in the UK. It chronicles the early development and use of the English Proficiency Test Battery (EPTB) in the 1960s, followed by the creation and implementation of the revolutionary English Language Testing Service (ELTS) in the 1970s and 1980s, and the introduction of the International English Language Testing System (IELTS) in 1989. The book offers a coherent socio-cultural analysis of the changes in language testing and an explanation of why history matters as much in this field as elsewhere. It discusses the significant factors which impact on language test design, development, implementation and revision, and presents historical documents relating to the language tests discussed in the volume, including facsimile copies of original test versions. The volume will be of interest to language test developers and policy makers, as well as teachers, lecturers and researchers interested in assessing English for Academic Purposes (EAP) and in the role played by ELTS and IELTS.

Volume 22 — The Impact of High-stakes Testing on Classroom Teaching: A case study using insights from testing and innovation theory (Wall 2005)

The Impact of High-stakes Testing on Classroom Teaching: A case study using insights from testing and innovation theory, Dianne Wall (2005)

This volume gives an account of one of the first data-based studies of examination ‘washback’. Through a detailed analysis of the impact of examination reform in one specific educational setting, it considers the effects of a test which was meant to serve as a lever for change, and describes how the intended outcome was shaped by factors in the test itself, as well as by features of the context, teachers and learners. The volume provides a helpful model for researching washback and impact as well as practical guidelines for the planning and management of change within an educational context. It is of particular relevance to all who are involved in the process of curriculum and examination reform, and to academic researchers, university lecturers, graduate students and practising teachers.

Volume 21 — Changing Language Teaching through Language Testing: A washback study (Cheng 2005)

Changing Language Teaching through Language Testing: A washback study, Liying Cheng (2005)

This volume presents a study of how the introduction in 1996 of a high-stakes public examination impacted on classroom teaching and learning in Hong Kong secondary schools. The washback effect was observed among different stakeholder groups within the local educational context, and also in terms of teachers’ attitudes, teaching content and classroom interactions. The volume is of particular relevance to language test developers and researchers interested in the consequential validity of tests, as well as to teachers, curriculum designers, policy makers and others concerned with the interface between language testing and teaching practices.

Volume 20 — Testing the Spoken English of Young Norwegians: A study of test validity and the role of ‘smallwords’ in contributing to pupils’ fluency (Hasselgreen 2004)

Testing the Spoken English of Young Norwegians: A study of test validity and the role of ‘smallwords’ in contributing to pupils’ fluency, Angela Hasselgreen (2004)

This volume reports on a study to validate a test of spoken English for secondary school pupils in Norway. The study included a corpus-based investigation of how conversational fillers or ‘smallwords’ contribute to spoken fluency. Findings from this work informed the development of rating scale descriptors for assessing fluency levels. The volume will be of particular interest to those concerned with the design and validation of spoken language tests, as well as those interested in features of spoken communication and in how classroom practice can help develop learners’ fluency.

Volume 19 — IELTS Collected Papers: Research in speaking and writing assessment (Taylor and Falvey 2007)

IELTS Collected Papers: Research in speaking and writing assessment, edited by Lynda Taylor and Peter Falvey (2007)

This book brings together 10 research studies conducted between 1995 and 2001 under the auspices of the British Council/IELTS Australia Joint-funded Research Program. The studies – four on speaking and six on writing assessment – provided valuable test validity evidence and directly informed the revised IELTS Speaking and Writing tests introduced in 2001 and 2005. Volume 19 chronicles the evolution of the Writing and Speaking tests in ELTS/IELTS from 1980 to the present day and discusses the role of research in their development. In addition, it evaluates a variety of research methods to provide helpful guidance for novice and less experienced researchers. This collection of studies will be of particular value to language testing researchers interested in IELTS as well as to institutions and professional bodies who make use of IELTS test scores; it will also be relevant to students, lecturers and researchers working more broadly in the field of English for Academic Purposes.

“It is really a book which anyone concerned with performance testing should read and benefit from. At the very least, the literature reviews under each topic and the detailed explanations, then critique, of methods are excellent contributions to the field.” Wayne Rimmer (2010) Modern English Teacher 19 (1), 91–92.

Volume 18 — European Language Testing in a Global Context: Proceedings of the ALTE Barcelona Conference July 2001 (Milanovic and Weir 2004)

European Language Testing in a Global Context: Proceedings of the ALTE Barcelona Conference July 2001 Edited by Michael Milanovic and Cyril Weir (2004)

The ALTE Conference, European Language Testing in a Global Context, was held in Barcelona in 2001 in support of the European Year of Languages. The contents of this volume represent a small subset of the many presentations made at that event; papers were selected to provide a flavour of the issues the conference addressed, which included: technical dimensions of language testing; matters of fairness and ethics in assessment; aspects of education and language policy in the European context; and reports of recently completed research studies and work in progress.

Volume 17 — Issues in Testing Business English: The revision of the Cambridge Business English Certificates (O’Sullivan 2006)

Issues in Testing Business English: The revision of the Cambridge Business English Certificates Barry O’Sullivan (2006)

This book explores the testing of language for specific purposes (LSP) from a theoretical and practical perspective, with a particular focus on the testing of English for business purposes. A range of tests – both past and present – is reviewed, and the development of Business English testing at Cambridge English is discussed. The description of the revision of Cambridge English: Business Certificates, also known as Business English Certificates (BEC), in 2002 forms a major part of the book and offers a unique insight into an approach to large-scale ESP test development and revision. The volume will be of particular relevance to test developers and researchers interested in language testing for specific purposes and contexts of use; it will also be of interest to ESP teachers, especially those teaching English for business, as well as to lecturers and postgraduates working in the field of LSP.

Volume 16 — A Modular Approach to Testing English Language Skills: The development of the Certificates in English Language Skills (CELS) examinations (Hawkey 2004)

A Modular Approach to Testing English Language Skills: The development of the Certificates in English Language Skills (CELS) examinations Roger Hawkey (2004)

This volume documents in some detail the development of the Cambridge English Certificates in English Language Skills (CELS), a suite of modular examinations first offered in 2002. The book traces the history of various important English language exams offered by UCLES and other examination boards which significantly influenced the development of CELS including: the Communicative Use of English as a Foreign Language (CUEFL) exams; the Certificates in Communicative Skills in English (CCSE); the English language tests of reading and writing produced by the University of Oxford Delegacy of Local Examinations; and the Oral English exams offered by the Association of Recognised English Language Schools (ARELS) Examinations Trust.

Volume 15 — Continuity and Innovation: Revising the Cambridge Proficiency in English Examination 1913–2002 (Weir and Milanovic 2003)

Front cover of Studies in Language Testing – Volume 15

Continuity and Innovation: Revising the Cambridge Proficiency in English Examination 1913–2002 Edited by Cyril Weir and Michael Milanovic (2003)

This volume documents in some detail the most recent revision of Cambridge English: Proficiency, also known as Certificate of Proficiency in English (CPE), which took place from 1991 to 2002. CPE is the oldest of the Cambridge suite of English as a Foreign Language (EFL) examinations and was originally introduced in 1913. Since that time the test has been regularly revised and updated to bring it into line with current thinking in language teaching, applied linguistics and language testing theory and practice. The volume provides a full account of the revision process, the questions and problems faced by the revision teams, and the solutions they came up with. It is also an attempt to encourage in the public domain greater understanding of the complex thinking, processes and procedures which underpin the development and revision of all the Cambridge English tests, and as such it will be of interest and relevance to a wide variety of readers.

“An invaluable case book for training language testers and teachers … Makes explicit the developing philosophy of good testing practice … With its wealth of illustrative examples and detailed statistics, this study clearly presents an exceptional case study of a well-managed and professionally-serviced English language test … An important study, showing the possibilities of good language testing.” Bernard Spolsky (2004) ELT Journal, 58 (3), 305–309.

Volume 14 — A Qualitative Approach to the Validation of Oral Language Tests (Lazaraton 2002)

A Qualitative Approach to the Validation of Oral Language Tests Anne Lazaraton (2002)

Language testers have generally come to recognise the limitations of traditional statistical methods for validating oral language tests, and have begun to consider more innovative approaches to test validation which can illuminate the assessment process itself, rather than just assessment outcomes (i.e. test scores). One such approach is conversation analysis (or CA), a rigorous empirical methodology developed by sociologists, which employs inductive methods in order to discover and describe the recurrent, systematic properties of conversation. This book aims to provide language testers with a background in the conversation analytic framework, and a fuller understanding of what is entailed in using conversation analysis in the specific context of oral language test validation.

“… this book provides an excellent, and clearly written, introduction to the use of discourse analysis, especially CA, in examining the functioning of oral language tests … I would recommend this book to teachers or test developers who might be developing oral language tests as well as those who are intending to carry out research using discourse analytic techniques. Finally, also, it must be said, the book was enjoyable to read; in particular I found Lazaraton’s discussion of the literature on oral interview research to be well-organised and clear, and her discussion of CA theory to be extremely accessible.” Annie Brown (2005) Language Assessment Quarterly 2 (4), 309–313.

Volume 13 — The Equivalence of Direct and Semi-direct Speaking Tests (O’Loughlin 2001)

The Equivalence of Direct and Semi-direct Speaking Tests Kieran O’Loughlin (2001)

This book documents a comparability study of direct (face-to-face) and semi-direct (language laboratory) versions of the Speaking component of the access: test, an English language test designed in the 1990s by the Language Testing Research Centre (University of Melbourne) as part of the selection process for immigration to Australia. The study gathered a broad range of quantitative and qualitative evidence to investigate the issue of test equivalence, and this multi-layered approach yields a complex and richly textured perspective on the comparability of the two kinds of Speaking tests. The findings have important implications for the use of direct and semi-direct Speaking tests in various high-stakes contexts such as immigration and university entrance. As such, the book will be of interest to policy makers and administrators as well as language teachers and language testing researchers.

‘... this book makes an important contribution to the language testing literature … For its insights and multifaceted approach to examining test equivalence, it is a valuable resource to language test developers, researchers, graduate students, and even language programs considering using either of these test formats ... a very readable tale of two tests and the complexity needed to unravel what actually happens in them.’ Lindsay Brooks (2006) Language Assessment Quarterly 3 (4), 369–373.
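As a rough illustration of the kind of quantitative evidence a comparability study of this sort draws on, the Python sketch below compares paired scores from a direct and a semi-direct administration using a correlation and a paired t-test. The data are invented and the analysis deliberately simplified; it is not O’Loughlin’s actual multi-layered design, which also drew on qualitative evidence.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Invented paired scores: each of 60 candidates takes both the direct
# (face-to-face) and the semi-direct (tape-mediated) version.
ability = rng.normal(6.0, 1.0, size=60)
direct = ability + rng.normal(0.0, 0.4, size=60)
semi_direct = ability + rng.normal(0.1, 0.4, size=60)  # small format effect

r, _ = stats.pearsonr(direct, semi_direct)
t, p = stats.ttest_rel(direct, semi_direct)
print(f"correlation between formats: r = {r:.2f}")
print(f"paired t-test for a format effect: t = {t:.2f}, p = {p:.3f}")

Score-level checks like these are typically only the starting point for an equivalence claim; the volume’s argument rests on much richer evidence.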

Volume 12 — An Empirical Investigation of the Componentiality of L2 Reading in English for Academic Purposes (Weir, Huizhong and Yan 2000)

An Empirical Investigation of the Componentiality of L2 Reading in English for Academic Purposes Edited by Cyril J Weir, Yang Huizhong and Jin Yan (2000)

This volume describes the development and validation of an advanced level test for evaluating expeditious (skimming, search reading and scanning) and careful EAP reading abilities at tertiary level in China. It reports on the methodological procedures which led to the development of the test and discusses the results of empirical investigations carried out to establish its validity both a priori and a posteriori. It is of particular interest and value to teachers, researchers and test developers.

“... this book is a systematic presentation of the authors’ dual-purpose pioneering work in EFL reading. On the one hand, they focus on the research question of the componentiality of academic EFL reading ... On the other hand, the researchers’ experimental work has rewarded them with a unique academic EFL reading test, whose development process is a wonderful model for other test developers to follow.” Ning Chen (2006) Language Assessment Quarterly 3 (1), 81–86.

Volume 11 — Experimenting with Uncertainty: Essays in honour of Alan Davies (Elder, Brown, Grove, Hill, Iwashita, Lumley, McNamara, O’Loughlin 2001)

Experimenting with Uncertainty: Essays in honour of Alan Davies Edited by C Elder, A Brown, E Grove, K Hill, N Iwashita, T Lumley, T McNamara, K O’Loughlin (2001)

This festschrift brings together 28 invited papers surveying the state of the art in language testing from a perspective which combines technical and broader applied linguistics insights. The papers, written by key figures in the field of language testing, cover issues ranging from test construct definition to the design and application of language tests, including their importance as a means of exploring larger issues in language teaching, language learning and language policy. The volume locates work in language assessment in a context of social, political and ethical issues at a time when testing is increasingly expected to be publicly accountable.

"The breadth of perspectives of [Experimenting with Uncertainty: Essays in honour of Alan Davies, Studies in Language Testing 11, Elder et al (Eds) (2001), CUP/UCLES] is wide enough, providing critically informative commentaries on the issues that language testers should be aware of, particularly in these times when assessment and accountability are increasingly valued in overall circles of education as well as the field of language testing … Providing a readable introduction … this book will guide … readers in how to grapple with thorny issues that language testing researchers may encounter in their professional career.” Hyeong-Jong Lee (2005) Language Testing 22 (4), 533–545.

Volume 10 — Issues in Computer-Adaptive Testing of Reading Proficiency: Selected papers (Chalhoub-Deville 1999)

Issues in Computer-Adaptive Testing of Reading Proficiency: Selected papers Edited by Micheline Chalhoub-Deville (1999)

This volume is an important resource for those interested in research on and development of computer-adaptive (CAT) instruments for assessing the receptive skills, mainly reading. It includes selected papers from a conference on the computer-adaptive testing of reading held in Bloomington, Minnesota, in 1996, as well as a number of specially written papers.

"For those interested in developing and appreciating CAT for reading measurement, the volume [Issues in Computer-Adaptive Testing of Reading Proficiency, Studies in Language Testing 10, Chalhoub-Deville (Ed.) (1999), CUP/UCLES] has, to date, had no parallel in its value as an excellent resource book.” Jungok Bae (2005) Language Assessment Quarterly 2 (2), 169–173.

“[T]he chapters in this book represent state-of-the-art thinking in computer-adaptive language testing. The book will remain a key volume in the field for many years to come.” Glenn Fulcher (2000) Language Testing 17 (3), 361–367.
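For readers new to the technique, the core of a computer-adaptive test is a loop: the candidate’s ability estimate is updated after every response, and the next item is chosen to be maximally informative at that estimate. The Python sketch below is a minimal invented illustration under a Rasch model, not a procedure taken from the volume; a real CAT would also handle content balancing, exposure control and stopping rules.

import math
import random

def p_correct(theta: float, b: float) -> float:
    """Rasch model: probability that a candidate of ability theta
    answers an item of difficulty b correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def next_item(theta, bank, used):
    """Unused item whose difficulty is closest to theta
    (the maximum-information choice under the Rasch model)."""
    return min((i for i in range(len(bank)) if i not in used),
               key=lambda i: abs(bank[i] - theta))

def update_theta(theta, responses, bank, lr=0.5, steps=25):
    """Crude maximum-likelihood re-estimate via gradient ascent,
    clamped to a plausible ability range."""
    for _ in range(steps):
        grad = sum(x - p_correct(theta, bank[i]) for i, x in responses)
        theta = max(-4.0, min(4.0, theta + lr * grad / len(responses)))
    return theta

random.seed(1)
bank = [random.uniform(-2, 2) for _ in range(50)]  # item difficulties
true_theta = 0.8
theta, used, responses = 0.0, set(), []
for _ in range(10):  # administer a 10-item adaptive test
    i = next_item(theta, bank, used)
    used.add(i)
    correct = 1 if random.random() < p_correct(true_theta, bank[i]) else 0
    responses.append((i, correct))
    theta = update_theta(theta, responses, bank)
print(f"estimated ability after 10 items: {theta:.2f} (true value {true_theta})")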

Volume 09 — Fairness and Validation in Language Assessment: Selected papers from the 19th Language Testing Research Colloquium, Orlando, Florida (Kunnan 2000)

Fairness and Validation in Language Assessment: Selected papers from the 19th Language Testing Research Colloquium, Orlando, Florida Edited by Antony John Kunnan (2000)

Fairness of language tests and testing practices has always been a concern among test developers and test users. In the past decade educational and language assessment researchers have begun to focus directly on fairness and related matters such as test standards, test bias and equity and ethics for testing professionals. The 19th annual Language Testing Research Colloquium held in 1997 in Orlando, Florida, brought this overall concern into sharp focus by having ‘Fairness in Language Testing’ as its theme. The conference presentations and discussions attempted to understand the concept of fairness, define the scope of the concept and connect it with the concept of validation of test score interpretation. The papers in this volume offer a first introduction to fairness and validation in the field of language assessment.

Volume 08 — Learner Strategy Use and Performance on Language Tests: A structural equation modeling approach (Purpura 1999)

Front cover of Studies in Language Testing – Volume 08

Learner Strategy Use and Performance on Language Tests: A structural equation modeling approach James E Purpura (1999)

This volume investigates the relationship between learner strategy use and performance on second language tests, by examining the construct validity of two questionnaires designed within a model of information processing that measures test takers’ self-reported cognitive and metacognitive strategy use. The book investigates how learner strategy use influences test performance, and how high performers use strategies differently from low performers.
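For readers unfamiliar with the modeling approach, the toy path analysis below is a very simple special case of structural equation modeling, with invented data and invented coefficients; it is not Purpura’s actual model. It shows the basic idea of estimating structural (path) coefficients from self-reported cognitive and metacognitive strategy use to a test performance score.

import numpy as np

rng = np.random.default_rng(42)
n = 200

# Invented standardised scores for the two strategy-use factors.
metacognitive = rng.normal(0.0, 1.0, n)
cognitive = 0.5 * metacognitive + rng.normal(0.0, 1.0, n)  # correlated factors
performance = 0.2 * cognitive + 0.4 * metacognitive + rng.normal(0.0, 1.0, n)

# Ordinary least squares recovers the structural (path) coefficients.
X = np.column_stack([np.ones(n), cognitive, metacognitive])
beta, *_ = np.linalg.lstsq(X, performance, rcond=None)
print(f"path coefficients: cognitive = {beta[1]:.2f}, metacognitive = {beta[2]:.2f}")

Full SEM additionally models the latent factors behind individual questionnaire items and assesses overall model fit, which is where the construct validity evidence in a study like this comes from.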

Volume 07 — Dictionary of Language Testing (Davies, Brown, Elder, Hill, Lumley and McNamara 1999)

Dictionary of Language Testing Alan Davies, Annie Brown, Cathie Elder, Kathryn Hill, Tom Lumley and Tim McNamara (1999)

This volume constitutes a valuable resource for anyone seeking a better understanding of the terminology and concepts used in language testing. It contains some 600 entries, each listed under a headword with extensive cross-referencing and suggestions for further reading. The selection of headwords is based on advice from specialists in language testing around the world, combined with the scanning of current textbooks in this field and of dictionaries and encyclopaedias in adjacent fields (e.g. psychometrics, applied linguistics, statistics).

"Multilingual Glossary of Language Testing Terms (Studies in Language Testing 6, ALTE [1998], CUP and UCLES) and Dictionary of Language Testing (Studies in Language Testing 7, Davies et al [1999], CUP/UCLES) are monumental works in the field of language testing.” Yoshinori Watanabe (2005) Language Assessment Quarterly 2 (1), 69–75. “... the book can act as a specific point of reference for language testing terminology and concepts, and students will find it increasingly useful as their understanding within the field develops.” Roger Barnard (2000) Modern English Teacher 9 (3), 89–90.

Volume 06 — Multilingual Glossary of Language Testing Terms (ALTE Members 1998)

Multilingual Glossary of Language Testing Terms Prepared by ALTE members (1998)

A multilingual glossary has a particularly significant role to play in encouraging the development of language testing in less widely taught languages by establishing terms which may be new alongside their well-known equivalents in the commonly used languages. The glossary contains entries in 10 languages: Catalan, Danish, Dutch, English, French, German, Irish, Italian, Portuguese and Spanish. This volume will be of use to many working in the context of European languages who are involved in testing and assessment.

“… exploration of the MG reveals it, in my opinion, to be of real value in its own right, both as a working glossary of language testing terms, and, perhaps more importantly, as an invaluable aid to speakers of the ten represented languages … represents an invaluable resource for the tester and student of testing alike.” Barry O’Sullivan (2002) Applied Linguistics 23 (2), 273–275.

Volume 05 — Verbal Protocol Analysis in Language Testing Research: A handbook (Green 1998)

Verbal Protocol Analysis in Language Testing Research: A handbook Alison Green (1998)

Verbal protocol analysis (VPA) is a methodology that is being used extensively by researchers. Recently, individuals working in the area of testing, and in language testing in particular, have begun to appreciate the roles VPA might play in the development and evaluation of assessment instruments. This book aims to provide potential practitioners of VPA with the background to the technique and a good understanding of what is entailed in using VPA in the context of language testing and assessment. Tutorial exercises are presented which enable the reader to try out each of the different steps involved in VPA.

"The book is successful in providing a practical guide for graduate students and researchers wishing a better understanding of VPA in language testing … it fulfils the need for a basic introduction to the application of VPA … a stimulating guide for researchers interested in language testing.” Abdoljavad Jafarpur (1999) Language Testing 16 (4), 483–486.

Volume 04 — The Development of IELTS: A study of the effect of background knowledge on reading comprehension (Clapham 1996)

The Development of IELTS: A study of the effect of background knowledge on reading comprehension Caroline Clapham (1996)

This book investigates the ESP claim that tertiary level ESL students should be given reading proficiency tests in their own academic subject areas, and studies the effect of background knowledge on reading comprehension. It is set against a background of recent research into reading in a first and second language, and emphasises the impact schema theory has had on this. The book is a useful resource for those involved with IELTS and others interested in the testing of English for academic purposes.

“Caroline Clapham has written a major, seminal book. She has examined a dangerous field of landmines, detected them, and disarmed them. This book will serve as a map of that minefield for years to come. Higher-education language departments … who are seriously considering special-fields testing should read this book carefully.” Fred Davidson (1998) Language Testing 15 (2), 289–301.

Volume 03 — Performance Testing, Cognition and Assessment: Selected papers from the 15th Language Testing Research Colloquium, Cambridge and Arnhem (Milanovic and Saville 1996)

Performance Testing, Cognition and Assessment: Selected papers from the 15th Language Testing Research Colloquium, Cambridge and Arnhem Edited by Michael Milanovic and Nick Saville (1996)

This book contains a selection of research papers presented at the 15th Annual Language Testing Research Colloquium (LTRC). The Colloquium was jointly hosted by the University of Cambridge Local Examinations Syndicate (UCLES) in Cambridge and CITO in Arnhem, the Netherlands. At the Cambridge venue, papers were presented on the theme of performance testing, and at Arnhem they covered aspects of communication in relation to cognition and assessment. A selection of papers has been made in order to achieve a balanced coverage of these themes.

“The book thus provides a valuable resource for readers interested in a variety of approaches to investigating and understanding L2 performance assessment … a useful collection of research summaries and a source for relevant ideas.” John Norris (1999) Language Testing 16 (1), 121–125.

Volume 02 — Test Taker Characteristics and Test Performance: A structural modeling approach (Kunnan 1995)

Test Taker Characteristics and Test Performance: A structural modeling approach Antony John Kunnan (1995)

This book investigates the influence of test taker characteristics on test performance in tests of English as a foreign language by exploring the relationships between these two groups of variables. Data from a test taker questionnaire and performance on several tests including Cambridge English: First, also known as First Certificate in English (FCE), and the TOEFL were used for the study.

Volume 01 — An Investigation into the Comparability of Two Tests of English as a Foreign Language: The Cambridge TOEFL Comparability Study (Bachman, Davidson, Ryan and Choi 1995)

An Investigation into the Comparability of Two Tests of English as a Foreign Language: The Cambridge TOEFL Comparability Study Lyle F Bachman, Fred Davidson, Katherine Ryan and Inn-Chull Choi (1995)

This book documents a major study, which compares Cambridge English: First , also known as First Certificate in English (FCE) , with the Test of English as a Foreign Language (TOEFL) and investigates similarities in test content, candidature and use.

PhD Theses in Language Testing

BYU ScholarsArchive

Linguistics Theses and Dissertations

Theses/Dissertations from 2022

Temporal Fluency in L2 Self-Assessments: A Cross-Linguistic Study of Spanish, Portuguese, and French, Mandy Case

Biblical Hebrew as a Negative Concord Language, J. Bradley Dukes

Revitalizing the Russian of a Heritage Speaker, Aaron Jordan

Analyzing Patterns of Complexity in Pre-University L2 English Writing, Zachary M. Lambert

Prosodic Modeling for Hymn Translation, Michael Abraham Peck

Interpretive Language and Museum Artwork: How Patrons Respond to Depictions of Native American and White Settler Encounters--A Thematic Analysis, Holli D. Rogerson

Theses/Dissertations from 2021

Trademarks and Genericide: A Corpus and Experimental Approach to Understanding the Semantic Status of Trademarks, Richard B. Bevan

First and Second Language Use of Case, Aspect, and Tense in Finnish and English, Torin Kelley

Lexical Aspect in -sha Verb Chains in Pastaza Kichwa, Azya Dawn Ladd

Text-to-Speech Systems: Learner Perceptions of its Use as a Tool in the Language Classroom, Joseph Chi Man Mak

The Effects of Dynamic Written Corrective Feedback on the Accuracy and Complexity of Writing Produced by L2 Graduate Students, Lisa Rohm

Mental Contrasting with Implementation Intentions as Applied to Motivation in L2 Vocabulary Acquisition, Lindsay Michelle Stephenson

Linguistics of Russian Media During the 2016 US Election: A Corpus-Based Study, Devon K. Terry

Theses/Dissertations from 2020

Portuguese and Chinese ESL Reading Behaviors Compared: An Eye-Tracking Study, Logan Kyle Blackwell

Mental Contrasting with Implementation Intentions to Lower Test Anxiety, Asena Cakmakci

The Categorization of Ideophone-Gesture Composites in Quichua Narratives, Maria Graciela Cano

Ranking Aspect-Based Features in Restaurant Reviews, Jacob Ling Hang Chan

Praise in Written Feedback: How L2 Writers Perceive and Value Praise, Karla Coca

Evidence for a Typology of Christ in the Book of Esther, L. Clayton Fausett

Gender Vs. Sex: Defining Meaning in a Modern World through use of Corpora and Semantic Surveys, Mary Elizabeth Garceau

The attributive suffix in Pastaza Kichwa, Barrett Wilson Hamp

An Examination of Motivation Types and Their Influence on English Proficiency for Current High School Students in South Korea, Euiyong Jung

Experienced ESL Teachers' Attitudes Towards Using Phonetic Symbols in Teaching English Pronunciation to Adult ESL Students, Oxana Kodirova

Evidentiality, Epistemic Modality and Mirativity: The Case of Cantonese Utterance Particles Ge3, Laak3, and Lo1, Ka Fai Law

Application of a Self-Regulation Framework in an ESL Classroom: Effects on IEP International Students, Claudia Mencarelli

Parsing an American Sign Language Corpus with Combinatory Categorial Grammar, Michael Albert Nix

An Exploration of Mental Contrasting and Social Networks of English Language Learners, Adam T. Pinkston

A Corpus-Based Study of the Gender Assignment of Nominal Anglicisms in Brazilian Portuguese, Taryn Marie Skahill

Developing Listening Comprehension in ESL Students at the Intermediate Level by Reading Transcripts While Listening: A Cognitive Load Perspective, Sydney Sohler

The Effect of Language Learning Experience on Motivation and Anxiety of Foreign Language Learning Students, Josie Eileen Thacker

Identifying Language Needs in Community-Based Adult ELLs: Findings from an Ethnography of Four Salvadoran Immigrants in the Western United States, Kathryn Anne Watkins

Theses/Dissertations from 2019

Using Eye Tracking to Examine Working Memory and Verbal Feature Processing in Spanish, Erik William Arnold

Self-Regulation in Transition: A Case Study of Three English Language Learners at an IEP, Allison Wallace Baker

"General Conference talk": Style Variation and the Styling of Identity in Latter-day Saint General Conference Oratory, Stephen Thomas Betts

Implementing Mental Contrasting to Improve English Language Learner Social Networks, Hannah Trimble Brown

Comparing Academic Vocabulary List (AVL) Frequency Bands to Leveled Biology and History Texts, Lynne Crandall

A Comparison of Mobile and Computer Receptive Language ESL Tests, Aislin Pickett Davis

Yea, Yea, Nay, Nay: Uses of the Archaic, Biblical Yea in the Book of Mormon, Michael Edward De Martini

L1 and L2 Reading Behaviors by Proficiency Level: An English-Portuguese Eye-Tracking Study, Larissa Grahl

Immediate Repeated Reading has Positive Effects on Reading Fluency for English Language Learners: An Eye-tracking Study, Jennifer Hemmert Hansen

Perceptions of Malaysian English Teachers Regarding the Importation of Expatriate Native and Nonnative English-speaking Teachers, Syringa Joanah Judd

Sociocultural Identification with the United States and English Pronunciation Comprehensibility and Accent Among International ESL Students, Christinah Paige Mulder

The Effects of Repeated Reading on the Fluency of Intermediate-Level English-as-a-Second-Language Learners: An Eye-Tracking Study, Krista Carlene Rich

Verb Usage in Egyptian Movies, Serials, and Blogs: A Case for Register Variation, Michael G. White

Theses/Dissertations from 2018

Factors Influencing ESL Students' Selection of Intensive English Programs in the Western United States, Katie Briana Blanco

Pun Strategies Across Joke Schemata: A Corpus-Based Study, Robert Nishan Crapo

ESL Students' Reading Behaviors on Multiple-Choice Items at Differing Proficiency Levels: An Eye-Tracking Study, Juan M. Escalante Talavera

Backward Transfer of Apology Strategies from Japanese to English: Do English L1 Speakers Use Japanese-Style Apologies When Speaking English?, Candice April Flowers

Cultural Differences in Russian and English Magazine Advertising: A Pragmatic Approach, Emily Kay Furner

An Analysis of Rehearsed Speech Characteristics on the Oral Proficiency Interview—Computer (OPIc), Gwyneth Elaine Gates

Predicting Speaking, Listening, and Reading Proficiency Gains During Study Abroad Using Social Network Metrics, Timothy James Hall

Navigating a New Culture: Analyzing Variables that Influence Intensive English Program Students' Cultural Adjustment Process, Sherie Lyn Kwok

Second Language Semantic Retrieval in the Bilingual Mind: The Case of Korean-English Expert Bilinguals, Janice Si-Man Lam

Evaluating the Effectiveness of a Korean Heritage-Speaking Interpreter, Yoonjoo Lee

Reading Idioms: A Comparative Eye-Tracking Study of Native English Speakers and Native Korean Speakers, Sarah Lynne Miner

Applying the Developmental Path of English Negation to the Automated Scoring of Learner Essays, Allen Travis Moore

Performance Self-Appraisal Calibration of ESL Students on a Proficiency Reading Test, Jodi Mikolajcik Petersen

Switch-Reference in Pastaza Kichwa, Alexander Harrison Rice

The Effects of Metacognitive Listening Strategy Instruction on ESL Learners' Listening Motivation, Corbin Kalanikiakahi Rivera

The Effects of Teacher Background on How Teachers Assess Native-Like and Nonnative-Like Grammar Errors: An Eye-Tracking Study, Wesley Makoto Schramm

Rubric Rating with MFRM vs. Randomly Distributed Comparative Judgment: A Comparison of Two Approaches to Second-Language Writing Assessment, Maureen Estelle Sims

Investigating the Perception of Identity Shift in Trilingual Speakers: A Case Study, Elena Vasilachi

Theses/Dissertations from 2017

Preparing Non-Native English Speakers for the Mathematical Vocabulary in the GRE and GMAT, Irina Mikhailovna Baskova

Eye Behavior While Reading Words of Sanskrit and Urdu Origin in Hindi, Tahira Carroll

An Acoustical Analysis of the American English /l, r/ Contrast as Produced by Adult Japanese Learners of English Incorporating Word Position and Task Type, Braden Paul Chase

The Rhetoric Revision Log: A Second Study on a Feedback Tool for ESL Student Writing, Natalie Marie Cole

Quizlet Flashcards for the First 500 Words of the Academic Vocabulary List, Emily R. Crandell

The Impact of Changing TOEFL Cut-Scores on University Admissions, Laura Michelle Decker

A Latent Class Analysis of American English Dialects, Stephanie Nicole Hedges

Comparing the AWL and AVL in Textbooks from an Intensive English Program, Michelle Morgan Hernandez

Faculty and EAL Student Perceptions of Writing Purposes and Challenges in the Business Major, Amy Mae Johnson

Multilingual Trends in Five London Boroughs: A Linguistic Landscape Approach, Shayla Ann Johnson

Nature or Nurture in English Academic Writing: Korean and American Rhetorical Patterns, Sunok Kim

Differences in the Motivations of Chinese Learners of English in Different (Foreign or Second Language) Contexts, Rui Li

Managing Dynamic Written Corrective Feedback: Perceptions of Experienced Teachers, Rachel A. Messenger

Spanish Heritage Bilingual Perception of English-Specific Vowel Contrasts, John B. Nielsen

Taking the "Foreign" Out of the Foreign Language Classroom Anxiety Scale, Jared Benjamin Sell

Creole Genesis and Universality: Case, Word Order, and Agreement, Gerald Taylor Snow

Idioms or Open Choice? A Corpus Based Analysis, Kaitlyn Alayne VanWagoner

Applying Corpus-Assisted Critical Discourse Analysis to an Unrestricted Corpus: A Case Study in Indonesian and Malay Newspapers, Sara LuAnne White

Investigating the effects of Rater's Second Language Learning Background and Familiarity with Test-Taker's First Language on Speaking Test Scores, Ksenia Zhao

Theses/Dissertations from 2016

The Influence of Online English Language Instruction on ESL Learners' Fluency Development, Rebecca Aaron

The Effect of Prompt Accent on Elicited Imitation Assessments in English as a Second Language, Jacob Garlin Barrows

A Framework for Evaluating Recommender Systems, Michael Gabriel Bean

Program and Classroom Factors Affecting Attendance Patterns For Hispanic Participants In Adult ESL Education, Steven J. Carter

A Longitudinal Analysis of Adult ESL Speakers' Oral Fluency Gains, Kostiantyn Fesenko

Rethinking Vocabulary Size Tests: Frequency Versus Item Difficulty, Brett James Hashimoto

The Onomatopoeic Ideophone-Gesture Relationship in Pastaza Quichua, Sarah Ann Hatton

A Hybrid Approach to Cross-Linguistic Tokenization: Morphology with Statistics, Logan R. Kearsley

Getting All the Ducks in a Row: Towards a Method for the Consolidation of English Idioms, Ethan Michael Lynn

Expecting Excellence: Student and Teacher Attitudes Towards Choosing to Speak English in an IEP, Alhyaba Encinas Moore

Lexical Trends in Young Adult Literature: A Corpus-Based Approach, Kyra McKinzie Nelson

A Corpus-Based Comparison of the Academic Word List and the Academic Vocabulary List, Jacob Andrew Newman

A Self-Regulated Learning Inventory Based on a Six-Dimensional Model of SRL, Christopher Nuttall

The Effectiveness of Using Written Feedback to Improve Adult ESL Learners' Spontaneous Pronunciation of English Suprasegmentals, Chirstin Stephens

Pragmatic Quotation Use in Online Yelp Reviews and its Connection to Author Sentiment, Mary Elisabeth Wright

Theses/Dissertations from 2015

Conditional Sentences in Egyptian Colloquial and Modern Standard Arabic: A Corpus Study, Randell S. Bentley

A Corpus-Based Analysis of Russian Word Order Patterns, Stephanie Kay Billings

English to ASL Gloss Machine Translation, Mary Elizabeth Bonham

The Development of an ESP Vocabulary Study Guide for the Utah State Driver Handbook, Kirsten M. Brown


Language Testing Research Centre

Celebrating 30 years

Established in 1990, the Language Testing Research Centre (LTRC) at the University of Melbourne has become an international leader in research and development in language assessment and language program evaluation.

Dr Ute Knoch, Director of the Language Testing Research Centre

Based in the School of Languages and Linguistics in the Faculty of Arts, the LTRC comprises a team of internationally renowned researchers in language assessment and language program evaluation.

Our work focuses on:

  • research and validation of language tests.
  • test development, consultancies and industry linkages.
  • licensing a variety of language tests for use by external clients.
  • a Professional Certificate in Language Assessment.
  • training in language assessment through workshops for teachers and researchers.
  • supervision of Masters and PhD students.

To read about the history of the LTRC, our mission, purpose and leadership, go to About us.

In 2020 we celebrated the 30-year anniversary of the LTRC and the 10-year anniversary of the Association for Language Testing and Assessment Australia and New Zealand (ALTAANZ) with a joint online celebratory event and an LTRC symposium.

Read about the celebratory event


The LTRC provides services in the evaluation of language programs, including bilingual, English as a Second Language, and foreign language programs in schools, colleges and universities.

Government projects

The LTRC has expertise in school-based assessment in English and other languages. Government funded projects carried out in Australian schools include the evaluation of assessment tools and frameworks for speakers of English as a second or foreign language and measurement of school achievements in Asian languages.

Industry projects

The LTRC specialises in measuring the language proficiency skills required for the workplace or for academic study purposes. Projects undertaken with industry partners have focussed on a range of sectors including health, defence, interpreting and translating, aviation and higher education.

The Language Testing Research Centre (LTRC) runs workshops on language assessment on topics relevant to client needs. LTRC staff teach into the Professional Certificate in Language Assessment and the Master of Applied Linguistics program and are actively involved in supervising graduate students.

Workshops on Language Testing and Assessment

The LTRC offers training to develop your skills in language assessment and/or language program evaluation. Custom-made workshops can be created on demand.

Professional Certificate in Language Assessment

Master of Applied Linguistics

The LTRC is actively involved in supervising Masters theses on language assessment / testing. More information on the Master of Applied Linguistics...

Doctor of Philosophy – Arts

LTRC staff are willing to supervise PhD theses on topics relating to language assessment. More information on the Doctor of Philosophy – Arts...

We work with clients to develop new language tests tailored to their specific needs or to evaluate assessment tools and frameworks that are currently in use. We also offer a number of ready-made tests.

Custom test development: DELA & Placement tests

Tests that were either developed at the LTRC or created with major involvement by LTRC staff.

Ready-made tests: AEST & Language Placement Tests

Tests that are currently available either offline or online.

The Language Testing Research Centre (LTRC) has established links with educational institutions and assessment agencies in many countries and regions including Japan, China, Chile, Hong Kong, Korea, Singapore, New Zealand, UK and US. Consultancies undertaken for these agencies include research on high-stakes admissions tests such as the International English Language Testing System (IELTS) and Test of English as a Foreign Language (TOEFL), test development and analysis, assessor training and professional development.

Partner with us

International projects

The LTRC is internationally renowned for its prolific research (both commissioned and grant-funded) on the assessment of language proficiency in the context of migration, as well as in various educational and workplace contexts. We have a strong record of publication in both peer-reviewed academic journals and other outlets. We are active in mentoring graduate students in language testing through thesis supervision, studentships and via a regular seminar series on language testing and assessment for Doctor of Philosophy – Arts students and staff.

Research projects

Our research projects showcased.

Publications

Our academic publications, presentations, project reports and submissions. Journals: Studies in Language Assessment (SiLA, formerly PLTA) & MPLT

View the LTRC videos.

Language Assessment Seminar Series

View the recordings of the seminar series.


Completed PhDs in Language Testing

Rebecca Sickinger – An exploration of comparative judgement for evaluating writing performances of the Austrian Year 8 test for English as a Foreign Language (2023)

Glyn Jones – Replicating the validation of the CEFR illustrative descriptors (2023)

Noor Asbahan Shahizan – Validating a group oral task in a university entry test: Interactional competence as a target construct in an academic context (2023)

Kathrin Eberharter – An investigation into the rater cognition of novice raters and the impact of cognitive attributes when assessing speaking (2021)

Olena Rossi – Item writing skills and their development: Insights from an induction item writer training course (2021)

Salomé Villa Larenas – An investigation of the language assessment literacy of teacher educators in Chile: Knowledge, practices, learning, beliefs, and context (2020)

Franz Holzknecht – Double play in listening assessment (2020)

Anuchit Toomaneejinda – An exploratory study of expressing disagreement in ELF academic group discussion (2019)

Pucheng Wang – Investigating test-takers’ cognitive processes while completing an integrated reading-to-write task: Evidence from eye-tracking, stimulated recall and questionnaire (2018)

Theresa Weiler – Investigating the construct tested through four item types used to assess lexicogrammatical competence in English as a foreign language (2018)

Sahar Alkhelaiwi – Cognitive processes, sub-skills and strategies in academic lecture listening at a Saudi Arabian university: A needs analysis study (2017)

Doris Froetscher – An investigation into the washback of a standardized national exam on the classroom testing of reading (2017)

Charalambos Kollias – Virtual standard setting in language testing: Exploring the feasibility and appropriateness of setting cut scores in synchronous audio and audio-visual environments (2017)

Aaron Batty – The impact of visual cues on item response in video-mediated tests of foreign language listening comprehension (2017)

Anchana Rukthong – Investigating the listening construct underlying listening-to-summarize tasks (2016)

Diana Mazgutova – Linguistic and cognitive development of L2 writing during an intensive English for Academic Purposes programme (2015)

Margherita Calderon Lopez – Writing across home and school: The literacy practices and beliefs of 7 to 10 year-old Chilean children, and their relationship with writing (2015)

Zahra Al-Lawati – An investigation of the characteristics of language test specifications and item writer guidelines and their effect on item development (2014)

Karen Dunn – What makes L2 words difficult to know? Using Explanatory Item Response Theory to model the difficulty of vocabulary test items for learners of English as a Second Language (2014)

Gareth McCray – Statistical modelling of cognitive processing in reading comprehension in the context of language testing (2014)

Sharon McCulloch – Critical engagement with source material in ESL postgraduate reading to write (2014)

Paul Underwood – Japanese teachers’ beliefs and intentions regarding the teaching of English under national curriculum reforms (2014)

Giles Witton-Davies – The study of fluency and its development in monologue and dialogue (2014)

Tania Horak – An investigation into the effect of the Skills for Life Strategy on assessment and classroom practices in ESOL teaching in England (2012)

Geraldine Ludbrook – Investigating the English language needs of CLIL teachers in Italian secondary school science classrooms (2012)

Hiroko Usami – The application of corpora to language testing – Identifying and improving problematic multiple choice grammar questions in Japanese university entrance exams (2012)

Chihiro Inoue – Task parallelness: Investigating the difficulty of two spoken narrative tasks (2011)

Simon Kinzley – The impact of a university pre-sessional course on the academic writing behaviours of a group of Chinese undergraduate students studying for a degree in media and cultural studies (2011)

Sang-Bok Park – Using motivation theories to analyse students’ perceptions of an examination and their inclination to study for it (2011)

Jun-Shik Kim – Developmental patterns of Korean learners corresponding to morphosyntactic items (2010)

Mahmood Moradi – The washback effect of the specialised English test (SPE) (2010)

Yu-Hua Chen – Investigating lexical bundles across learner writing development (2009)

Junko Hondo – Constructing Knowledge in SLA: The impact of timing in form-focused intervention (2009)

Liyan Huang – Washback on teacher beliefs and behaviour: investigating the process from a social psychology perspective (2009)

Heyon Oak – Exploring EFL reading instruction in high school classrooms in Korea: the pedagogic life of the grammar translation method (2008)

R Al-Zadjali – The integrated assessment of reading and writing: Investigating test products and processes in an EFL context

Spiros Papageorgiou – Setting standards in Europe: The judges’ contribution to relating language examinations to the Common European Framework of Reference (2007)

Dina Tsagari – Investigating the washback effect of a high-stakes EFL exam in the Greek context: Participants’ perceptions, material design and classroom applications (2007)

Alistair Van Moere – Group oral tests: How does task affect candidate performance and test scores? (2007)

Philip Glover – Examination influence on how teachers teach: A study of teacher talk (2006)

Jayanti Banerjee – Interpreting and using proficiency test scores (2003)

Chieko Kawauchi – The effect of pre-task activities on L2 oral performance (2003)

Vita Kalnberzina – The interaction between affect and metacognition in language use: the case of foreign language anxiety (2002)

Antony Karl Heinz Erben – Student-teachers’ use of microteaching activity to construct sociolinguistic knowledge within a Japanese immersion initial teacher education program in Australia (2001)

G Perrin – The effect of multiple-choice foreign language tests of listening and reading on teacher behaviour and student attitudes

Rolande Parel – Lexical inferencing strategies of low proficiency second language learners (2000)

Junko Yamashita – Reading in a first and a foreign language: A study of reading comprehension in Japanese (the L1) and English (the L2) (1999)

Dianne Wall – The impact of high-stakes examinations on classroom teaching: A case study using insights from testing and innovation theory (1999)

Sayyed Mohammed Alavi – An investigation of the usefulness of rhetorical structure theory in testing reading comprehension (1997)

Yoshinori Watanabe – The washback effects of the Japanese university entrance examinations on English-classroom based research (1997)

Jo Lewkowicz – Investigating authenticity in language testing (1996)

Marian Tyacke – Dealing with cognitive diversity in language teaching and testing: A three-phase investigation of the relationships between learning style, learner behaviour and reading task performance (1996)

Frank Bonkowski – Teacher use and interpretation of textbook materials in the secondary ESL classroom in Quebec (1995)

Caroline Clapham – The effect of background knowledge on EAP reading test performance (1994)

Glenn Fulcher – The construction and validation of rating scales for oral tests in English as a Foreign Language (1993)

Alastair Allan – EFL reading comprehension test validation: Investigating aspects of process approaches (1992)

Gary Buck – The testing of second language listening comprehension (1990)

Jeanne Marie Kattan – The construction and validation of an EAP test for second year English and Nursing majors at Bethlehem University (1990)

Dejenie Leta – Achievement, washback and proficiency in school leaving examination: A case of innovation in an Ethiopian setting (1990)

Lorna Rowsell – Classroom factors pertaining to dropout among adult ESL students: A constructivist analysis (1990)

Pauline M. Rea – The relationship between grammatical abilities and aspects of communicative competence with special reference to the testing of grammar (1988)


The University of Edinburgh

Linguistics and English Language PhD

Awards: PhD

Funding opportunities

Programme website: Linguistics and English Language

“The transformative supervision meetings, labs, professional training and departmental research seminars are all conducive to a thriving linguistics training. I am exposed to cutting-edge training and research. This inspiring environment allows me to conduct world-leading research in bilingualism.” Katerina Pantoula, current PhD student in Linguistics & English Language


Research profile

We have an outstanding international reputation in many areas of Linguistics and English Language research.

Linguistics & English Language is rated 3rd in the UK by Times Higher Education for the quality and breadth of its research in the latest Research Excellence Framework (REF 2021).

We can offer expert supervision across a wide range of topics, including:

  • Applied Linguistics
  • Developmental linguistics, including first and second language acquisition
  • Discourse and conversation analysis
  • Historical English linguistics, including syntax, morphology, and phonology from the earliest periods to the present day
  • Language evolution
  • Linguistic fieldwork
  • Morphology, including word formation
  • Multilingualism
  • Phonetics and phonology, including diachronic phonology and the phonology of varieties of English, Scots and their history
  • Sociolinguistics
  • Speech technology
  • Syntax and semantics, including theoretical syntax, descriptive syntax of English, diachronic syntax and both lexical and formal semantics
  • Varieties of English, both British and international

Research groups

Our expertise clusters in a number of research groups and research centres:

  • Developmental Linguistics
  • English Language
  • Language Evolution & Computation (LEC)
  • Language in Context
  • Language Variation and Change (LVC)
  • Meaning and Grammar
  • Phonetics and Phonology
  • Centre for Language Evolution
  • Bilingualism Matters

Research in speech technology is carried out at the Centre for Speech Technology Research, a collaboration between the School of Philosophy, Psychology and Language Sciences and the School of Informatics:

  • Centre for Speech Technology Research

Training and support

You will be supervised by at least two members of academic staff, who will meet regularly with you to discuss your progress and wider issues in your field of study.

This may include:

  • discussion of relevant literature (for example, journal articles and book chapters)
  • firming up of your research proposal
  • preparation for fieldwork and data collection
  • discussion of draft chapters of your thesis
  • preparation for conference presentations

Research students are assigned to research groups, each of which hosts regular research activities.

The department also has a visiting speaker series (the Linguistic Circle), and you are encouraged to participate in the School’s Language at Edinburgh research network.

The unrivalled holdings of the University and National Libraries and the National Archives of Scotland make study of this subject at Edinburgh especially attractive.

Our students become part of one of the biggest communities of linguists in the United Kingdom.

We have state-of-the-art technical and laboratory facilities:

  • School resources
  • Find out more about our community

The School of Philosophy, Psychology and Language Sciences is home to a large, supportive and active student community, hosting events and activities throughout the year which you can join. As a postgraduate student, you will have access to a range of research resources, state-of-the-art facilities, research seminars and reading groups.

Career opportunities

While many of our PhD graduates choose to remain in academia as lecturers and researchers, going on to post-doctoral opportunities or progressing into faculty positions, some pursue employment and careers in other sectors.

Important application information

Find a research opportunity that matches your interests.

  • View our main research areas

Write a research proposal

Your research proposal will be used to consider whether the proposed research is feasible and can be supervised by our staff members, so it is important that your theoretical and methodological preparation for it is clear.

We understand that it can be difficult to formulate research plans well in advance of carrying out the work, but we encourage you to articulate your ideas as clearly as possible. You should draft your proposal several times and, ideally, seek comments on it from other people (perhaps from your referees or former lecturers) before submitting it.

It is recommended that you contact your planned supervisor(s) well in advance of the deadline to identify a suitable topic for your research proposal.

You should then draft the research proposal independently and then discuss it with your planned supervisor(s), revising it based on their comments and suggestions.

Each PhD thesis contains several theoretical and empirical chapters. Your proposal should focus on the empirical work, laying out plans for at least two empirical studies (further plans can be worked out as you progress). Ideally, each of the studies should be a publishable journal article; students are strongly encouraged to publish their work in collaboration with their supervisors.

Your proposal must not exceed 1,000 words, excluding references; the panel may not read any part of your proposal that exceeds the limit.

Your proposal should include:

  • A title for the project
  • A brief background for the planned research question(s)
  • A compelling, brief rationale for the studies, including the specific research questions/hypotheses
  • A description of the methodology for addressing these questions/hypotheses, which generally includes:
    • sufficiently large sample(s) of participants (allowing for appropriate statistical power) and measurement/experimental procedures
    • if using existing data (e.g. data from large cohort studies or biobanks, imaging data sets, etc.), a description of the data sets
    • your data analytical approach (e.g. suitable statistical models)
    • if using qualitative data such as interviews, a description of your methods and analytical approach
  • Note that the methodology should be realistic within the resources and time-scales available to you and your supervisor(s), allowing the necessary time for writing the thesis
  • An indication of how your proposed work fits with and contributes to the research programme of your planned supervisor(s).

A PhD thesis typically means teamwork, involving the student and one or two supervisors, and often also other members of the research group(s) of the supervisor(s); a student receives training and help from the team, but can also contribute to the team with their research. Applicants who can show a good fit with a supervising team have an advantage.

We may ask for a brief (Zoom or MS Teams) interview with you if we have further questions.

If your application is successful, we expect that your research will develop. It is likely that your supervisor(s) or those reviewing the work will suggest changes or developments to your research as your studies progress.

Therefore, you will not be held to the ideas that you explain in your proposal during the course of your research.

  • How to write a good PG research proposal

Contact potential supervisors prior to making an application

We strongly encourage you to get in touch with a potential supervisor, and to include their name in your application.

When contacting a potential supervisor, please include a draft proposal and CV as this will provide the starting point for discussion. You can introduce yourself by explaining why their work interests you.

Please note that our academic staff are very busy and it may take time for them to respond to your enquiry.

  • View our staff profiles and contact details

Get ready to apply

In order to ensure full consideration of your application, we ask that you submit your complete application including all supporting documentation.

You will be asked to add contact details for your referees. We will email them with information on how to upload their reference directly to your online application. Please allow plenty of time, as we can only consider your application once it is complete, including your references.

  • Find out more about the application process

Consider your funding options

There are a number of funding opportunities both within the University and externally. Funding is highly competitive at PhD level.

  • More information on funding

Pre-application Checklist

To receive a pre-arrival checklist to help you with your application, please email the PPLS Postgraduate Office at

Please complete this checklist to keep track of your application preparations. Please submit the completed checklist as an additional document to your application.

Language Sciences at Edinburgh

Entry requirements

These entry requirements are for the 2024/25 academic year and requirements for future academic years may differ. Entry requirements for the 2025/26 academic year will be published on 1 Oct 2024.

A UK 2:1 honours degree, or its international equivalent, in English language, linguistics, or a related subject.

Your academic achievements will be assessed by a panel of academics along with the research proposal submitted as part of your application.

(Revised 19 February 2024 to clarify entry requirements and assessment methods.)

International qualifications

Check whether your international qualifications meet our general entry requirements:

  • Entry requirements by country
  • English language requirements

Regardless of your nationality or country of residence, you must demonstrate a level of English language competency at a level that will enable you to succeed in your studies.

English language tests

We accept the following English language qualifications at the grades specified:

  • IELTS Academic: total 7.0 with at least 6.5 in each component. We do not accept IELTS One Skill Retake to meet our English language requirements.
  • TOEFL-iBT (including Home Edition): total 100 with at least 23 in each component. We do not accept TOEFL MyBest Score to meet our English language requirements.
  • C1 Advanced (CAE) / C2 Proficiency (CPE): total 185 with at least 176 in each component.
  • Trinity ISE: ISE III with passes in all four components.
  • PTE Academic: total 70 with at least 62 in each component.

Your English language qualification must be no more than three and a half years old from the start date of the programme you are applying to study, unless you are using IELTS, TOEFL, Trinity ISE or PTE, in which case it must be no more than two years old.

Degrees taught and assessed in English

We also accept an undergraduate or postgraduate degree that has been taught and assessed in English in a majority English speaking country, as defined by UK Visas and Immigration:

  • UKVI list of majority English speaking countries

We also accept a degree that has been taught and assessed in English from a university on our list of approved universities in non-majority English speaking countries (non-MESC).

  • Approved universities in non-MESC

If you are not a national of a majority English speaking country, then your degree must be no more than five years old* at the beginning of your programme of study. (*Revised 05 March 2024 to extend degree validity to five years.)

Find out more about our language requirements:

  • Fees and costs

Read our general information on tuition fees and studying costs:

Scholarships and funding

Only applications received by the Round 1 deadline will be considered for University of Edinburgh based funding.

You may be able to secure external funding outside of this deadline.

Featured funding

  • Scottish Graduate School for Arts and Humanities Funding
  • Scottish Graduate School of Social Science Funding
  • [College of Arts, Humanities and Social Sciences Research Awards](https://www.ed.ac.uk/ppls/linguistics-and-english-language/prospective/postgraduate/funding-research-students/arts-humanities-soc-sci-research-awards)
  • [Edinburgh Doctoral College Scholarships](https://www.ed.ac.uk/student-funding/postgraduate/international/other-funding/doctoral-college)

UK government postgraduate loans

If you live in the UK, you may be able to apply for a postgraduate loan from one of the UK’s governments.

The type and amount of financial support you are eligible for will depend on:

  • your programme
  • the duration of your studies
  • your tuition fee status

Programmes studied on a part-time intermittent basis are not eligible.

  • UK government and other external funding

Other funding opportunities

Search for scholarships and funding opportunities:

  • Search for funding

Further information

  • PPLS Postgraduate Office
  • Phone: +44 (0)131 651 5002
  • Contact: [email protected]
  • Dugald Stewart Building
  • 3 Charles Street
  • Central Campus
  • Programme: Linguistics and English Language
  • School: Philosophy, Psychology & Language Sciences
  • College: Arts, Humanities & Social Sciences

This programme is not currently accepting applications. Applications for the next intake usually open in October.

Start date: September

Application deadlines


We operate a gathered field approach to PhD applications.

This means that all complete applications which satisfy our minimum entry requirements will be held until the next deadline. The admissions panel will then meet to consider all applications received together after that date.

Applications are held for processing over two deadlines:

  • How to apply

Please read through the ‘Important application information’ section on this page before applying.

Find out more about the general application process for postgraduate programmes:

  • Open access
  • Published: 18 May 2022

Reliability of measuring constructs in applied linguistics research: a comparative study of domestic and international graduate theses

  • Kioumars Razavipour (ORCID: orcid.org/0000-0002-6533-2968)
  • Behnaz Raji

Language Testing in Asia, volume 12, Article number: 16 (2022)


The credibility of conclusions arrived at in quantitative research depends, to a large extent, on the quality of data collection instruments used to quantify language and non-language constructs. Despite this, research into data collection instruments used in Applied Linguistics and particularly in the thesis genre remains limited. This study examined the reported reliability of 211 quantitative instruments used in two samples of domestic and international theses in Applied Linguistics. The following qualities in measuring instruments were used to code the data: the instrument origin, instrument reliability, reliability facets examined, reliability computation procedures utilized, and the source of reliability reported (i.e., primary or cited). It was found that information about instrument origin was provided in the majority of cases. However, for 93 instruments, no reliability index was reported and this held true for the measurement of both language and non-language constructs. Further, the most frequently examined facet of reliability was internal consistency estimated via Cronbach’s alpha. In most cases, primary reliability for the actual data was reported. Finally, reliability was more frequently reported in the domestic corpus than in the international corpus. Findings are discussed in light of discursive and sociomaterial considerations and a few implications are suggested.

Introduction

In the educational measurement literature and in language testing, confidence in measurements depends on their consistency and validity. For an instrument to be valid, it has to be consistent (though consistency is the more precise term, in this paper we use consistency and reliability interchangeably). That said, whereas in educational measurement and in language testing much attention has been paid to investigating the reliability and validity of tests used for selection and achievement purposes, the quality of measuring instruments used for research purposes in Applied Linguistics and language teaching remains underexplored. Such studies are warranted on the grounds that they carry immediate implications for practitioners, policy makers, and researchers. For the practitioners who rely on research findings to improve their language teaching practices, it is imperative that such research is based on sound measurements of constructs. Additionally, in action research, the effectiveness of educational interventions can only be examined through sound measurements of key variables. Sound measurements are also crucial for education policy makers who rely on research findings to choose, adapt, and implement language education policies. If the research informing policies is founded on inconsistent measurements, it is likely to derail proper policy making, with grave consequences for language teachers, learners, and the wider society. Finally, proper measurements are of utmost importance for the progress of research and the production of knowledge in the field of Applied Linguistics and language teaching. Threats to the consistency and validity of measurements would potentially derail future research that depends on the incremental accumulation of research evidence and findings. Given the mutual exchange of ideas and insights between Applied Linguistics and language testing (see Bachman & Cohen, 1998 and Winke & Brunfaut, 2021), the quality of research in different areas of AL influences research directions and decisions in language testing.

Despite the noted implications that reliable assessments hold for policy and practice, whether and the extent to which Applied Linguistics researchers examine or maximize the consistency of their measuring instruments remains underexplored. More specifically, the current literature on research instrument quality in AL is mostly focused on published research papers. Indeed, we are aware of no published work on the reliability of measuring instruments in theses or dissertations in AL. We believe that, as a distinct genre which operates under different sociomaterial circumstances and is written for a different audience, the thesis warrants closer scrutiny in terms of measurement quality, because of the consequences and implications that the quality of this genre has for academia and the wider society. This study intends to narrow the noted gap by investigating the reliability with which variables are measured in a corpus of theses and dissertations in Applied Linguistics across several academic settings. In the remainder of this paper, we first examine research quality in quantitative research in Applied Linguistics. We then zero in on issues of instrument validity and reliability within current theories of validity, particularly those of Messick and Kane.

Research quality and measurement

The fact that a good deal of Applied Linguistics research depends on the production and collection of quantitative data makes the quality of measuring instruments of crucial importance (Loewen & Gass, 2009 ). Unreliable data generates misleading statistical analyses, which, in turn, weakens or defeats the entire argument of quantitative and mixed methods studies. Subsequently, the quality of measuring instruments affects the internal validity of research studies (Plonsky & Derrick, 2016 ), which in turn compromises the credibility of research findings.

In the social sciences and Applied Linguistics, concern with the reliability and validity of measuring instruments is a perennial problem that can "neither be avoided nor resolved" (Lather, 1993, p. 674) because, unlike metric systems in physics, which are of almost universal value and credibility, measuring instruments in AL do not satisfy the principle of measurement invariance (Markus & Borsboom, 2013). That is, the properties of measuring instruments depend on the properties of the object of measurement (i.e., research participants, context of use, etc.). Hence, every time a test or a questionnaire is used in a research study, its reliability and validity should be examined.

Given the centrality of measurement invariance, Douglas (2014) uses the "rubber ruler" metaphor to refer to this property of measuring instruments in AL research. Just as a rubber ruler stretches or shrinks with changes in temperature, so the intervals between units of measurement fluctuate with the context of use. The quality of measuring instruments (MIs) in AL research is therefore subject to contextual fluctuations, and examining and maximizing the reliability of measuring instruments is crucial. The following quote from Kerlinger (1986, cited in Thompson, 1988) captures the significance of instrument reliability in quantitative research.

Since unreliable measurement is measurement overloaded with error, the determination of relations becomes a difficult and tenuous business. Is an obtained coefficient of determination between two variables low because one or both measures are unreliable? Is an analysis of variance F ratio not significant because the hypothesized relation does not exist or because the measure of the dependent variable is unreliable? ...High reliability is no guarantee of good scientific results but there can be no good scientific results without reliability. (p. 415)

The above quote goes back to almost half a century ago, yet problems with MIs continue to persist in Applied Linguistics and SLA (Purpura et al., 2015 ).

In language teaching research, concern with how researchers handle quantitative data has recently increased. As such, several studies have addressed the quality of quantitative analyses (Khany & Tazik, 2019 ; Lindstromberg, 2016 ; Plonsky et al., 2015 ), researchers’ statistical literacy (Gonulal, 2019 ; Gonulal et al., 2017 ), and quality of instrument reporting (Derrick, 2016 ; Douglas, 2001 ; Plonsky & Derrick, 2016 ). Douglas ( 2001 ) states that researchers in SLA often do not examine indexes of performance consistency for the MIs they use.

Recently, inquiry into the quality of research studies has spurred interest in the evaluation of MIs, in particular their reliability and performance consistency (Derrick, 2016; Plonsky & Derrick, 2016) in published research articles. A common theme in both of the noted studies is that current practices in reporting the reliability of measuring instruments are less than satisfactory; that is, inadequate attention is often given to the reliability of MIs in Applied Linguistics research. The current slim literature on research instrument quality is largely about the research article (RA) genre and comes almost exclusively from the academic north of the globe (Ryen & Gobo, 2011). Indeed, we are aware of no published research on how the reliability of quantitative instruments is handled and reported in the thesis genre in Applied Linguistics. Given the culture- and context-bound nature of research methodology and hence of assessment methods (Chen, 2016; Ryen & Gobo, 2011; Stone & Zumbo, 2016), studying MIs in other contexts is warranted. In addition, theses are not subject to the same space limitations as the research paper; thus, one would expect detailed accounts of data elicitation instruments in a thesis. For these reasons, this study examines the quality of data elicitation instruments in a sample of theses in Applied Linguistics. We hope that the findings will encourage graduate students and early career researchers to exercise more care and seek more rigor in their choice of MIs and the inferences they make from them, which would enhance the credibility of research findings. In the remainder of this paper, we first briefly discuss validity in Applied Linguistics and language testing, in order to situate issues of reliability and consistency in the broader context of validity, which is the ultimate criterion of data and inference quality. We then present our study, along with a discussion of the findings and the implications they might carry for research in Applied Linguistics.

Quality of measurements: validity and reliability

In psychometrics and educational measurement, as well as in Applied Linguistics research, the quality of measuring instruments is often captured by the term validity. In more traditional yet still quite common definitions, validity refers to the extent to which a measuring instrument measures what it purports to measure, and reliability is about how consistently it does so (Kruglanski, 2013). From this perspective, reliability is considered a necessary but insufficient precondition for validity; that is, an instrument can be reliable without being valid (Grabowski & Oh, 2018), which implies that an instrument may demonstrate consistency in the kind of data it yields without actually tapping what it purports to tap. In recent conceptualizations of validity, however, reliability is integrated within the domain of validity (Kane, 2006; Newton & Shaw, 2014; Purpura et al., 2015; Weir, 2005). Largely thanks to Messick's legacy, validity is defined as an overall evaluative judgment of the degree to which empirical evidence and theoretical rationale justify the inferences and actions that are made on the basis of test scores (Messick, 1989). Viewed from this holistic approach to validity, reliability is considered one source of validity evidence that should be used to support the inferences that are to be made of test scores. Whereas this conceptualization of validity as argument is increasingly embraced in educational measurement and language testing, it has yet to permeate the broader literature on Applied Linguistics research in general and TEFL in particular (Purpura et al., 2015). In fact, some scholars believe that lack of knowledge about how to effectively measure L2 proficiency is the main reason for the failure of the field of SLA to make real progress in explaining development and growth in an L2 (Ellis, 2005, cited in Chapelle, 2021).

While we are mindful of the importance of validity, in this paper we focus exclusively on reliability, for two reasons. First, we believe that despite the theoretical unification of aspects of validity evidence (Bachman & Palmer, 2010; Chapelle, 2021; Kane, 2013), reliability still serves as a good heuristic for examining measurement quality. This is evident even in Kane's argument-based validity: in going from data to claims, the first argument that must be supported in argument-based validation is evaluation, which refers to how verbal or non-verbal data elicited via a quantitative measure are converted to a quantity, and unless this argument is adequately supported, the rest of the validity chain cannot be sustained. Second, despite the noted theoretical shift, scholars continue to draw the distinction between validity and reliability, perhaps because, for practitioners, both Messick's unified approach and Kane's argument-based validation are difficult to translate into the practice of evaluating their measuring instruments. For these reasons, we thought that imposing a theoretical framework of validity that is incompatible with current practices would not be helpful.

Reliability of data collected via quantitative data collection instruments

Concern with the quality of measurements in Applied Linguistics research is not new. More than two decades ago, Bachman and Cohen edited a volume on how insights from SLA and language testing can assist in improving measurement practices in the two fields. More recently, several studies have investigated the reliability and consistency of quantitative instruments across disciplines (Plonsky, 2013; Plonsky & Derrick, 2016; Vacha-Haase et al., 1999). Al-Hoorie and Vitta (2019) investigated the psychometric issues of validity and reliability, inferential testing, and assumption checking in 150 papers sampled from 30 Applied Linguistics journals. Concerning reliability, they found that "almost one in every four articles would have a reliability issue" (p. 8).

Taken together, the common theme in most studies is that the current treatment of quantitative measures and instruments is far from ideal (Larson-Hall & Plonsky, 2015). That said, the findings of past studies are mixed, with the proportion of studies reporting reliability ranging from six percent to 64% (Plonsky & Derrick, 2016). This loose treatment of quantitative data collection tools seems to be common in other social science disciplines such as psychology (Meier & Davis, 1990; Vacha-Haase et al., 1999).

Compared to research articles, much less work has been done on how the quality of MIs is addressed in other research genres such as theses and dissertations. Evaluating the research methodology of dissertations and published papers, Thompson (1988) identified seven methodological errors, one of which was the use of instruments with inadequate psychometric integrity. Likewise, Wilder and Sudweeks (2003) examined 106 dissertations that had used the Behavioral Assessment System for Children and found that only nine studies reported reliability for the subpopulation they had studied, while the majority only cited reliability from the test manual. Such practices in treating reliability likely arise from the misconception that reliability or consistency is an attribute of a measurement tool. However, given that reliability, in its basic definition, is the proportion of true score variance to the observed score variance in the data, it follows that reliability depends on the data collection occasion, context, and participants; change the context of use, and both observed variance and true variance change. That said, perhaps because of discursive habits, reliability is often invoked as a property of the instrument, not a property of the data that are gathered via the instrument.
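
In classical test theory notation, that basic definition can be written out explicitly. The following formulation is a standard one, added here for reference rather than taken from the study:

```latex
% An observed score decomposes as X = T + E (true score plus error).
% Reliability is the share of observed-score variance due to true scores:
\rho_{XX'} = \frac{\sigma_T^2}{\sigma_X^2} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2}
```

Because \(\sigma_T^2\) and \(\sigma_E^2\) are properties of a particular administration (the participants, occasion, and context of use), the ratio necessarily changes when the context changes, which is precisely the point made above.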

In sum, the above brief review points to a gap in research into the reliability of MIs in Applied Linguistics research. The current study intends to narrow this gap in the hope of raising further awareness of the detriments of poor research instruments. Our review of the literature showed that writers of RAs sometimes fail to provide full details regarding their MIs (Derrick, 2016), a practice which has repercussions for future research. Given the differences between the RA and thesis genres noted above, it is important to see how the quality of measuring instruments is addressed in theses. The literature also suggests that reliability is underreported in RAs; in addition to addressing this in the thesis genre, we delve further into the facets of reliability that are given attention. Given that in the discourse around reliability in Applied Linguistics, reliability is often attributed to the instrument rather than to the data, we further inquire into the extent to which this discourse affects the way researchers report the reliability of their own data or choose to rely on reliability evidence reported in the literature. In addition, to our knowledge, the extant literature has not touched upon a possible relationship between reliability reporting behavior and the nature of the constructs measured, a further issue we address in this study. Finally, given the situated nature of knowledge and research, it is important to know how the quality of quantitative research instruments is treated across contexts. These objectives translate into the following research questions.

How frequently are the origins of research instruments reported?

How frequently is reliability reported? And when it is reported, which reliability facets are addressed and which estimation procedures are used to compute it?

What is the source of reliability (i.e., primary, cited, or both) that is reported?

Do reliability reporting practices differ across the types of constructs measured (language vs. non-language constructs) and across geographical regions?

We believe that these questions are important because the insights gained can contribute to our collective assessment literacy (Harding & Kremmel, 2021), which "has the capacity to reverse the deterioration of confidence in academic standards" (Medland, 2019, p. 565), for research that relies on instruments of suspect consistency adds noise to the body of scholarship and can mislead and misinform future research.

To answer the research questions, a corpus of 100 theses and dissertations from 40 universities in 16 countries was collected. Roughly half of the theses were chosen from Iran, and the other half were selected from 39 universities based mostly in American and European countries. Theses from universities in the USA had the highest frequency (15), followed by those in the Netherlands (6), Canada (5), and England (4). Given that at the time of data collection we knew of no comprehensive repository accommodating theses from all universities across the globe, a random sample of theses could not be secured. We therefore do not claim that the corpus of theses examined in this study is representative of the universe of theses across the globe; yet it is diverse enough to provide relevant insights.

For international theses, the most popular database is ProQuest (https://pqdtopen.proquest.com). Yet its search mechanism does not allow the user to search theses by country, and keyword searches mostly return theses written at North American universities, especially in the USA. To diversify the corpus and make it more representative of theses completed at other universities around the world, we also searched http://www.dart-europe.eu, which gives the user the option of limiting the search to a given country. All the international theses collected were saved as PDF files.

Our only inclusion criterion was whether a thesis had made use of quantitative measures such as language tests, surveys, questionnaires, rating scales, and the like. To make inclusion decisions, the abstract and the Methods section of each thesis were carefully examined. To determine whether and how reliability was treated in each thesis in the domestic corpus, the abstract, the Methods chapter, and in some cases the Results and Findings chapter were closely examined; for the international theses, the entire Methods chapter was checked. In cases where we could not find information about reliability in the noted sections, we used the search option in Acrobat Reader with the following search terms: reliability, consistency, agreement, alpha, Cronbach, valid, and KR (i.e., KR-20 and KR-21).
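
As an aside for readers wanting to replicate this screening step programmatically rather than through Acrobat's search box, a sketch along the following lines would work. It is our illustration, not the authors' procedure; it assumes the pypdf package, and the theses/ folder name is hypothetical:

```python
# Screen a folder of thesis PDFs for reliability-related search terms.
# Requires: pip install pypdf. The "theses/" path is illustrative.
from pathlib import Path
from pypdf import PdfReader

TERMS = ["reliability", "consistency", "agreement", "alpha",
         "cronbach", "valid", "kr-20", "kr-21"]

def terms_found(pdf_path: Path) -> set[str]:
    """Return which search terms occur anywhere in the PDF's extracted text."""
    reader = PdfReader(pdf_path)
    text = " ".join((page.extract_text() or "") for page in reader.pages).lower()
    return {term for term in TERMS if term in text}

for pdf in sorted(Path("theses").glob("*.pdf")):
    hits = terms_found(pdf)
    print(pdf.name, ", ".join(sorted(hits)) if hits else "no reliability terms found")
```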

Our unit of analysis was the measuring instrument, not the thesis. Two hundred and eleven MIs, including 110 language tests, 82 questionnaires, nine rating scales, eight coding schemes, and two content-area tests (e.g., math), had been used in the corpus of theses we examined. The most frequently tested aspects of language were overall language proficiency (22), vocabulary (13), writing (12), and reading comprehension (11). As for the questionnaires, the most frequently measured constructs were learning strategies (8), motivation (4), and teacher beliefs (4).

The coding process was mainly informed by the research questions, which concerned instrument origin, reliability facet, reliability source, and reliability estimation method. In addition, coding schemes used in similar studies, such as Plonsky and Derrick (2016) and Derrick (2016), were reviewed. Coding thus began with the major categories highlighted in the research questions. We coded the MIs used in the first 30 theses and, whenever a new category emerged, refined the coding scheme to accommodate it. Therefore, though we started with a set of a priori categories, the actual coding was emergent, cyclic, and iterative. Once we settled on the final coding scheme, the entire corpus was coded again from scratch. To minimize the subjectivity inherent in coding, a sample of the theses was also coded by the second author. The Kappa agreement rate was 96%, and the few cases of disagreement were resolved through discussion between the authors. Table 1 shows the final coding system used.
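
For readers unfamiliar with the statistic, a two-coder agreement check of this kind can be reproduced in a few lines. The codings below are invented stand-ins for the study's scheme (we have no access to the actual coder data), and the estimator is scikit-learn's implementation of Cohen's kappa:

```python
# Inter-coder agreement via Cohen's kappa (illustrative, invented codings).
from sklearn.metrics import cohen_kappa_score

coder_a = ["adopted", "designed", "adapted", "designed", "adopted", "not reported"]
coder_b = ["adopted", "designed", "adapted", "designed", "adapted", "not reported"]

kappa = cohen_kappa_score(coder_a, coder_b)
print(f"Cohen's kappa = {kappa:.2f}")  # corrects raw percent agreement for chance
```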

Finally, to analyze the data generated with our coding scheme, we mainly used descriptive statistics such as raw frequencies, percentages, and graphic representation via bar graphs. In cases where we needed to compare the domestic with the international theses, we used the Pearson chi-square test of independence, a non-parametric analytic procedure (see Pallant, 2010, p. 113). These procedures were deemed appropriate because of the nominal and discrete nature of the data in this study.
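
To make this analytic step concrete, a Pearson chi-square test of independence on a 2×2 table of counts can be run with SciPy as sketched below. The counts are invented purely to show the mechanics; they are not the study's frequencies:

```python
# Chi-square test of independence: corpus (rows) by reliability reporting (columns).
# The observed counts are illustrative only.
import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[70, 40],    # domestic:      reported, not reported
                     [48, 53]])   # international: reported, not reported

# correction=False yields the plain Pearson statistic (no Yates correction).
chi2, p, dof, expected = chi2_contingency(observed, correction=False)
print(f"chi2({dof}, N = {observed.sum()}) = {chi2:.2f}, p = {p:.3f}")
```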

In this section, we first report the findings on the origin of MIs. Next, we present findings regarding the facets of reliability reported, followed by the results related to the reliability estimation procedures used in the corpus. The source of the reliability estimate, along with reliability reporting across construct types, comes next. Finally, we report findings pertaining to reliability across the domestic and international corpora of theses.

Our first research question was about the origin of the measuring instruments used. That is, we looked for information about whether a measurement tool had been adopted from a previous work, designed by the researcher, adapted, adapted and translated, compiled from various measures and then adapted to the study context, or whether its origin was not specified in the thesis. As Fig. 1 displays, in 12 cases the authors failed to give information regarding the origin of their MIs. Roughly half of the MIs had been designed by the researchers, and a third had been adopted from previous studies. The remaining instruments had been either adapted (n = 15), compiled and then adapted (n = 6), or adapted and then translated (n = 4).

[Figure 1. Frequency of reporting instrument origin]

The second research question concerned the facets of reliability (Grabowski & Oh, 2018) that were addressed and the estimation procedures used for computing reliability. According to Fig. 2, for 93 MIs the authors did not provide any information about the reliability of the instruments they used. In cases where reliability was reported, internal consistency was the most commonly examined facet (n = 75), followed by inter-rater reliability (n = 8), inter-rater reliability combined with internal consistency (n = 7), and the test–retest method (n = 6). For a further 18 instruments, a reliability check was reported but the thesis writers did not specify which facet of reliability they had examined.

[Figure 2. Frequency of reporting each reliability facet]

As to the reliability estimation procedures used in the corpus, Fig. 3 shows that Cronbach's alpha stands out with a frequency of 65, followed by Pearson correlation (n = 7). The two Kuder-Richardson formulas, with frequencies of six and five respectively, come next. Other, less frequently used procedures are Spearman correlation, kappa, Pearson chi-square, Cohen's K, and the paired-samples t-test. It bears noting that in 19 cases the estimation procedure was not specified; in other words, the thesis writers did not say how they had arrived at the reliability coefficient they reported.

[Figure 3. Methods of estimating reliability]
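
For reference, Cronbach's alpha can be computed directly from an items-by-persons score matrix using the textbook formula; the sketch below runs it on simulated data of our own, not the study's (for dichotomous 0/1 items, the identical formula yields KR-20):

```python
# Cronbach's alpha = k/(k-1) * (1 - sum of item variances / variance of total score).
# Rows are test takers, columns are items; the matrix here is simulated.
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    k = scores.shape[1]                         # number of items
    item_vars = scores.var(axis=0, ddof=1)      # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

rng = np.random.default_rng(0)
ability = rng.normal(size=(200, 1))                       # latent trait
scores = ability + rng.normal(scale=1.0, size=(200, 10))  # 10 noisy items
print(f"alpha = {cronbach_alpha(scores):.2f}")
```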

The third research question was about the source of the reported reliability estimate. We sought to know whether and the extent to which researchers report the reliability of their own data (i.e., primary reliability), report a reliability index from a previous study (i.e., cited reliability), or report both primary and cited reliability. The results showed that in the majority of cases ( n  = 96), primary reliability was reported. In four cases, both primary and cited reliabilities were reported and for 10 MIs, a reliability estimate from another study was reported (i.e., cited reliability).

Our fourth research question concerned whether the type of construct measured by MIs (i.e., language vs. non-language constructs) moderates the frequency with which reliability is reported (see Figs.  3 and 4 ).

[Figure 4. Frequency of reporting reliability for language and non-language measures]

To determine whether there is any association between the construct type measured and the extent to which reliability is reported, a Pearson chi-square test of independence (see Pallant, 2010, p. 113) was run. Reliability reporting did not vary significantly across construct types, χ²(1, N = 211) = 0.23, p = .62.

Finally, we sought to know whether reliability reporting practices vary across the domestic and international corpus. Table  2 gives the frequency of reporting reliability in the domestic and international theses.

As Table 2 displays, reliability seemed to be more frequently reported in the domestic corpus of theses. To see whether the apparent difference in frequency is significant, another Pearson chi-square test of independence was conducted, which showed that the difference is significant, χ²(1, N = 211) = 4.59, p = .02.

Conclusions and discussion

The credibility of knowledge and of research findings continues to spark debate, confusion, and controversy. Across research paradigms, the question of whether and how truth can be established has been answered differently. In Applied Linguistics, the question of truth and credibility is often addressed through the notion of research validity, which can be threatened or compromised by various sources, including inconsistencies in evidence arising from temporal, spatial, and social factors. The issue of consistency is treated by examining reliability: it is assumed that when consistency is not established, claims of truth or validity cannot be made (Chapelle, 2020). In this study, we examined whether and the extent to which the reliability of the instruments used to measure variables in research is addressed. More specifically, we probed reliability reporting practices in a corpus of domestic and international theses in Applied Linguistics.

Overall, our findings indicate that in a considerable number of cases the researchers failed to examine the reliability of their research instruments, and this held constant across language and non-language measuring instruments, which echoes the findings of similar studies of published papers such as Plonsky and Gass (2011), Plonsky and Derrick (2016), and Purpura et al. (2015). It was also found that reliability was often treated in a ritualistic manner whereby, by default, researchers opt for examining the internal consistency of their instruments without providing a rationale for choosing this facet to the exclusion of other reliability facets. This finding accords with those of several studies across a number of fields (Douglas, 2001; Dunn et al., 2014; Hogan et al., 2000; Plonsky & Derrick, 2016). Finally, it was observed that reliability is more frequently reported in the domestic corpus of theses than in the international corpus. In the remainder of this section, we try to explain the observed findings by drawing on a socio-material frame of thought (see Canagarajah, 2018; Coole & Frost, 2010) and the sociology of knowledge (Dant, 2013).

More specifically, our finding that, compared to research articles, reliability is more frequently reported in theses might have to do with space constraints as a dimension of material considerations, or with disciplinary conventions (Harding & Kremmel, 2021). Likewise, the dominant tendency to choose Cronbach's alpha as an index of reliability is likely due to logistical and practicality concerns, as alpha is the default reliability estimate in most statistical packages. Socio-material considerations are also at play when researchers treat reliability in a post hoc manner, after they have already conducted their main study. In such cases, if the reliability of the data turns out to be low, researchers may prefer to skip reporting reliability (Grabowski & Oh, 2018) rather than start over, modify their instruments, and collect new data.

Other aspects of the findings can be accounted for by drawing on the sociology of knowledge, particularly by invoking issues of genre and convention within Applied Linguistics as a discourse community. For instance, contrary to our expectations, we found more frequent reporting of reliability in the domestic corpus. We tend to think that this might have to do with a certain discourse around reliability that is dominant in the Iranian Applied Linguistics community, where the common sense meaning of reliability and its psychometric meaning are possibly conflated. As Ennis (1999) notes, reliable data does not mean good data, nor does it mean data we can rely on; these are common sense meanings of the term. In the educational measurement and psychometric discourse community, by contrast, reliable data means only data that are consistent across some test method facets. When researchers take reliable data to mean good data, they give it more value and report it more frequently as a perceived index of research rigor.

Another observation that can be made sense of by invoking discursive realities has to do with the origin of MIs, which in many cases were designed by the researchers themselves. Measurement in the social sciences continues to be a source of controversy (Lather, 1993). Some believe that all measurements in psychometrics and education are flawed because they conflate statistical analysis with measurement, for the very objects of measurement fail to satisfy the ontological conditions of quantification (see Michell, 1999, 2008). Lather even goes so far as to say that validity as a mechanism "to discipline the disciplines" is in fact the problem, not the solution. Yet, despite all the complexities around measurement, it is not uncommon in Applied Linguistics to observe simplistic approaches in which any set of assembled items is taken to serve as a measuring instrument. It is for this reason that language testing scholars believe that designing a measuring instrument demands expertise and assessment literacy (Harding & Kremmel, 2021; Phakiti, 2021; Purpura et al., 2015), which is often in short supply in the academic south of the world (Oakland, 2009).

A further discursive myth regarding reliability that is somewhat common in Applied Linguistics community is that reliability is a characteristic of the measuring instrument (Grabowski & Oh, 2018 ; Larson-Hall & Plonsky, 2015 ; Vacha-Haase, 1998 ). This myth explains our finding that in many cases, some thesis writers rely on a reported reliability in the literature rather than examining the reliability of their own data. As Rowley ( 1976 ) states “It needs to be established that an instrument itself is neither reliable nor unreliable…A single instrument can produce scores which are reliable, and other scores which are unreliable” (p. 53).

Relatedly, some measuring conventions and reliability practices seem to have become dogmatized, at least in some communities of social science and Applied Linguistics. One such dogma is the status that Cronbach's alpha has come to enjoy. Some methodologists maintain that repeated use of alpha has become dogmatized, routinized, and ingrained in the culture of research in the social sciences and humanities (Dunn et al., 2014), and despite the heavy scrutiny that alpha has recently come under, recommendations from statistics experts have yet to penetrate social science, psychology, and Applied Linguistics research (McNeish, 2018). Alpha, like many other statistics, makes certain assumptions about the data, which are often ignored by researchers (Dunn et al., 2014; McNeish, 2018); these assumptions have, moreover, been shown to be unrealistic and difficult to meet (Dunn et al., 2014). Because of these flaws, scholars have called for more robust ways of assessing reliability, such as those based on exploratory and confirmatory factor analysis. Yet there seems to be a prevailing reluctance on the part of most researchers to go beyond Cronbach's alpha, perhaps because of the technical knowledge necessary for the proper use, implementation, and interpretation of factor-analytic methods. A further limitation to bear in mind is that alpha is essentially a parametric statistic assuming continuous data and non-skewed distributions (Grabowski & Oh, 2018). However, in much Applied Linguistics research, the score interpretations made of quantitative data are criterion-referenced in nature, with positively or negatively skewed distributions, which would require reliability estimation procedures different from those commonly used for norm-referenced interpretations (Bachman, 2004; Brown, 2005; Brown & Hudson, 2002).

Implications

In this study, we claimed that sociomaterial and discursive considerations account for current practices and approaches to measuring instruments and their reliability in theses written in Applied Linguistics. As noted above, some of the pitfalls in measuring language and non-language constructs stem from the rigid disciplinarity that characterizes the current structure of higher education. This insulation of disciplines leaves us unaware of insights and progress made in neighboring disciplines. As Long and Richards (1998, p. 27) maintain, "advances in language testing" remain "a closed book" for some, if not many, Applied Linguistics researchers (Chapelle, 2021). Perhaps this is partly due to the further compartmentalization that has transpired within Applied Linguistics, as a result of which the sub-disciplines of the field are hardly aware of each other's advances (Cook, 2015).

Therefore, more inter- and cross-disciplinary dialogue and research holds the potential to deepen our understanding of sound measurement of constructs in Applied Linguistics. Some scholars go even further and suggest that Applied Linguistics must be seen as an epistemic assemblage, which would strip the established sub-disciplines of Applied Linguistics of their ontological status as disciplines (Pennycook, 2018). Accordingly, to increase research rigor, we would like to call for further cross-fertilization among SLA, language teaching, language testing, and even the broader field of measurement in the social and physical sciences.

One curious observation we made in this study was that, in some cases, high alpha indexes were reported for proficiency tests that had been used to ensure the homogeneity of a sample of participants, often with the conclusion that the sample turned out to be homogeneous. Given that the parametric assumptions of alpha are violated with a homogeneous sample of participants, such high alpha values are almost impossible to obtain, and how they were produced remains an open question. The implication of awareness of such malpractices is that Cronbach's alpha and other reliability estimation procedures make assumptions about the data; unless there is evidence that those assumptions have been met, one is not justified in using the chosen reliability estimation method (Grabowski & Oh, 2018). Therefore, to foster research rigor, ritualistic reporting of a high alpha coefficient is not adequate. Rather, both common sense and expertise in language assessment must be drawn upon to judge MI quality.
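
The authors' suspicion here is easy to check by simulation: restricting a sample to a narrow ability band shrinks true-score variance and, under standard classical-test-theory assumptions, drives alpha down sharply. The sketch below is our own illustration with invented data, not the authors' analysis:

```python
# Alpha on a full-range sample vs. a homogeneous (range-restricted) subsample.
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    k = scores.shape[1]
    return k / (k - 1) * (1 - scores.var(axis=0, ddof=1).sum()
                          / scores.sum(axis=1).var(ddof=1))

rng = np.random.default_rng(1)
ability = rng.normal(size=5000)
scores = ability[:, None] + rng.normal(size=(5000, 20))  # 20 noisy items

full = cronbach_alpha(scores)
narrow = cronbach_alpha(scores[np.abs(ability) < 0.25])  # keep the middle band only
print(f"alpha, full sample: {full:.2f}; homogeneous sample: {narrow:.2f}")
# Same instrument, very different reliability: roughly 0.95 vs. about 0.3 here.
```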

The other implication is that investigating and maximizing reliability must not be guided solely by practical considerations and statistical analysis. Instead, theoretical and substantive considerations should inform the process. As every research context is likely to be different, it falls on the researcher to predict and explain all the possible internal and external factors bearing on the consistency of the data collected via quantitative instruments (Grabowski & Oh, 2018 ). It is this context-bound nature of reliability that makes it difficult to prescribe any rule that would work across contexts for all instruments.

We would like to support the call for more rigor and conservatism in designing, adopting, and adapting measurement instruments in Applied Linguistics research. Graduate students and early career professors should not shy away from deep reflection on, and involvement in, the foundations of research design and data collection methods. The critique of research in education made four decades ago by Pedhazur (1992, p. 368) still holds true.

There is a curious mythology about understanding and mastery of the technical aspects of research. Statistics is often called “mere statistics,” and many behavioral researchers say they will use a statistician and a computer expert to analyze their data.

An artificial dichotomy between problem conception and data analysis is set up.

To think that a separate group of experts is responsible for the design and development of proper measurements, and that the job of the research practitioner is merely to use those instruments, is to perpetuate the noted artificial dichotomy between research practice and theoretical conception.

In sum, measurement is a tricky business even in physics. In the social sciences, where we work with humans, language, and discourse within complex socio-political structures, isolating, defining, and measuring constructs is very complicated. If this statement sounds radical, it is only because we in Applied Linguistics are insulated from serious debates about the ontology and epistemology of measurement (see Michell, 1999; Markus & Borsboom, 2013; Chapelle, 2020). Furthermore, the massification of higher education and the publish-or-perish regime in academia have generated a mindset that takes a superficial and simplistic approach to testing complex social constructs. To improve this situation, the fast-food approach to research production (Pourmozafari, 2020) should be discouraged and countered.

Availability of data and materials

Data can be supplied upon request.

Abbreviations

MI: Measurement instrument

EFL: English as a Foreign Language

RA: Research article

KR: Kuder-Richardson

SLA: Second language acquisition

Al-Hoorie, A. H., & Vitta, J. P. (2019). The seven sins of L2 research: a review of 30 journals' statistical quality and their CiteScore, SJR, SNIP, and JCR impact factors. Language Teaching Research, 23(6), 727–744.

Bachman, L. F. (2004). Statistical analyses for language assessment book . Cambridge: Cambridge University Press.

Bachman, L. F., & Palmer, A. (2010). Language assessment in practice: developing language assessments and justifying their use in the real world . Oxford: Oxford University Press.

Bachman, L. F., & Cohen, A. D. (Eds.). (1998). Interfaces between second language acquisition and language testing research . Cambridge: Cambridge University Press.

Brown, J. D. (2005). Testing in language programs: a comprehensive guide to English language assessment. New York: McGraw-Hill.

Brown, J. D., & Hudson, T. (2002). Criterion-referenced language testing . Cambridge: Cambridge University Press.

Canagarajah, S. (2018). Materializing ‘competence’: Perspectives from international STEM scholars. The Modern Language Journal , 102 (2), 268–291.

Chapelle, C. A. (2020). Argument-based validation in testing and assessment . Los Angeles: Sage.

Chapelle, C. A. (2021). Validity in language assessment. In P. Winke & T. Brunfaut (Eds.), The Routledge handbook of second language acquisition and language testing (pp. 11–20). New York: Routledge.

Chen, X. (2016). Challenges and strategies of teaching qualitative research in China. Qualitative Inquiry, 22 (2), 72–86.

Cook, G. (2015). Birds out of dinosaurs: the death and life of applied linguistics. Applied linguistics, 36 (4), 425–433.

Coole, D., & Frost, S. (2010). Introducing the new materialisms. In D. Coole & S. Frost (Eds.), New materialisms: Ontology, agency, and politics (pp. 1–43). Durham: Duke University Press.

Dant, T. (2013). Knowledge, ideology & discourse: a sociological perspective . London: Routledge.

Derrick, D. J. (2016). Instrument reporting practices in second language research. TESOL Quarterly, 50 (1), 132–153.

Douglas, D. (2001). Performance consistency in second language acquisition and language testing research: a conceptual gap. Second Language Research, 17 (4), 442–456.

Douglas, D. (2014). Understanding language testing . London: Routledge.

Dunn, T. J., Baguley, T., & Brunsden, V. (2014). From alpha to omega: a practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology, 105 (3), 399–412.

Ennis, R. H. (1999). Test reliability: a practical exemplification of ordinary language philosophy. Philosophy of Education Yearbook

Gonulal, T. (2019). Statistical knowledge and training in second language acquisition: the case of doctoral students. ITL-International Journal of Applied Linguistics, 17 (1), 62–89.

Gonulal, T., Loewen, S., & Plonsky, L. (2017). The development of statistical literacy in applied linguistics graduate students. ITL-International Journal of Applied Linguistics, 168 (1), 4–32.

Grabowski, K. C., & Oh, S. (2018). Reliability analysis of instruments and data coding. In A. Phakit, P. De Costa, L. Plonsky, & S. Starfield (Eds.), The Palgrave handbook of applied linguistics research methodology (pp. 541–565). London: Springer.

Harding, L., & Kremmel, B. (2021). SLA researcher assessment literacy. In P. Winke & T. Brunfaut (Eds.), The Routledge handbook of second language acquisition and language testing . New York: Routledge.

Hogan, T. P., Benjamin, A., & Brezinski, K. L. (2000). Reliability methods: a note on the frequency of use of various types. Educational and Psychological Measurement, 60 (4), 523–531.

Kane, M. T. (2006). Validation. In R. L. Brennan (Ed.), Educational measurement. Westport, Conn: Praeger.

Kane, M. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50 (1), 1–73.

Khany, R., & Tazik, K. (2019). Levels of statistical use in applied linguistics research articles: from 1986 to 2015. Journal of Quantitative Linguistics, 26 (1), 48–65. https://doi.org/10.1080/09296174.2017.1421498 .

Kruglanski, A. W. (2013). Lay epistemics and human knowledge: cognitive and motivational bases . New York: Plenum Press.

Larson-Hall, J., & Plonsky, L. (2015). Reporting and interpreting quantitative research findings: what gets reported and recommendations for the field. Language Learning, 65 (S1), 127–159.

Lather, P. (1993). Fertile obsession: validity after poststructuralism. The Sociological Quarterly, 34 (4), 673–693.

Lindstromberg, S. (2016). Inferential statistics in language teaching research: a review and ways forward. Language Teaching Research, 20 (6), 741–768.

Loewen, S., & Gass, S. (2009). The use of statistics in L2 acquisition research. Language Teaching, 42 (2), 181–196.

Long, M. H., & Richards, J. C. (1998). Series editors' preface. In L. F. Bachman & A. D. Cohen (Eds.), Interfaces between second language acquisition and language testing research (pp. 27–28). Cambridge: Cambridge University Press.

Markus, K. A., & Borsboom, D. (2013). Frontiers of test validity theory: measurement, causation, and meaning . New York: Routledge.

McNeish, D. (2018). Thanks coefficient alpha, we’ll take it from here. Psychological Methods, 23 (3), 412.

Medland, E. (2019). ‘I’m an assessment illiterate’: towards a shared discourse of assessment literacy for external examiners. Assessment and Evaluation in Higher Education, 44 (4), 565–580.

Meier, S. T., & Davis, S. R. (1990). Trends in reporting psychometric properties of scales used in counseling psychology research. Journal of Counseling Psychology, 37 (1), 113.

Messick, S. (1989). Meaning and values in test validation: The science and ethics of assessment. Educational Researcher , 18 (2), 5–11.

Michell, J. (1999). Measurement in psychology: a critical history of a methodological concept . Cambridge: Cambridge University Press.

Michell, J. (2008). Is psychometrics pathological science? Measurement, 6, 7–24.

Newton, P., & Shaw, S. (2014). Validity in educational and psychological assessment . California: Sage.

Oakland, T. (2009). How universal are test development and use. In E. Grigorenko (Ed.), Multicultural psychoeducational assessment (pp. 1–40). New York: Springer.

Pallant, J. (2010). SPSS survival manual (4th ed.). Maidenhead: Open University Press.

Pedhazur, E. J. (1992). In Memoriam—Fred N. Kerlinger (1910–1991). Educational Researcher , 21 (4), 45–45.

Pennycook, A. (2018). Applied linguistics as epistemic assemblage. AILA Review, 31 (1), 113–134.

Phakiti, A. (2021). Likert-type scale construction. In P. Winke & T. Brunfaut (Eds.), The Routledge handbook of second language acquisition and language testing. New York: Routledge.

Plonsky, L. (2013). Study quality in SLA: an assessment of designs, analyses, and reporting practices in quantitative L2 research. Studies in Second Language Acquisition, 35 (4), 655–687.

Plonsky, L., & Derrick, D. J. (2016). A meta-analysis of reliability coefficients in second language research. The Modern Language Journal, 100 (2), 538–553.

Plonsky, L., & Gass, S. (2011). Quantitative research methods, study quality, and outcomes: the case of interaction research. Language Learning, 61 (2), 325–366. https://doi.org/10.1111/j.1467-9922.2011.00640.x .

Plonsky, L., Egbert, J., & Laflair, G. T. (2015). Bootstrapping in applied linguistics: assessing its potential using shared data. Applied Linguistics, 36 (5), 591–610.

Pourmozafari, D. (2020). Personal communication .

Purpura, J. E., Brown, J. D., & Schoonen, R. (2015). Improving the validity of quantitative measures in applied linguistics research 1. Language Learning, 65 (S1), 37–75.

Rowley, G. L. (1976). Notes and comments: the reliability of observational measures. American Educational Research Journal, 13 (1), 51–59.

Ryen, A., & Gobo, G. (2011). Editorial: managing the decline of globalized methodology. International journal of Social Research Methodology, 14, 411–415.

Stone, J., & Zumbo, B. D. (2016). Validity as a pragmatist project: A global concern with local application. In V. Aryadoust & J. Fox (Eds.), Trends in language assessment research and practice (pp. 555–573). Newcastle: Cambridge Scholars Publishing.

Thompson, B. (1988). Common methodology mistakes in dissertations: improving dissertation quality . Louisville, KY: Paper presented at the annual meeting of the Mid-South Educational Research Association.

Vacha-Haase, T. (1998). Reliability generalization: Exploring variance in measurement error affecting score reliability across studies. Educational and Psychological Measurement, 58 (1), 6–20.

Vacha-Haase, T., Ness, C., Nilsson, J., & Reetz, D. (1999). Practices regarding reporting of reliability coefficients: a review of three journals. The Journal of Experimental Education, 67 (4), 335–341.

Weir, C. J. (2005). Language testing and validation . Hampshire: Palgrave McMillan.

Wilder, L. K., & Sudweeks, R. R. (2003). Reliability of ratings across studies of the BASC. Education and Treatment of Children, 26 (4), 382–399.

Winke, P., & Brunfaut, T. (Eds.). (2021). The Routledge handbook of second language acquisition and language testing . New York: Routledge.


Acknowledgements

We thank the Editors and Reviewers of Language Testing in Asia for their timely review and feedback.

We received no funding for conducting this study.

Author information

Authors and affiliations

Department of English Language and Literature, College of Letters and Humanities, Shahid Chamran University of Ahvaz, Ahvaz, Iran

Kioumars Razavipour & Behnaz Raji


Contributions

This paper is partly based on Ms. Raji's MA thesis, which was supervised by Kioumars Razavipour. The paper, however, was written solely by Kioumars Razavipour. The author(s) read and approved the final manuscript.

Corresponding author

Correspondence to Kioumars Razavipour.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article

Cite this article

Razavipour, K., Raji, B. Reliability of measuring constructs in applied linguistics research: a comparative study of domestic and international graduate theses. Lang Test Asia 12, 16 (2022). https://doi.org/10.1186/s40468-022-00166-5


Received: 06 October 2021

Accepted: 26 April 2022

Published: 18 May 2022


Keywords

  • Reliability
  • Consistency
  • Data elicitation instrument
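The article above is centrally concerned with the reliability of data elicitation instruments, which is most often quantified with an internal-consistency coefficient such as Cronbach's alpha. As a purely illustrative sketch, not code from the study (the function, the toy Likert-type data and the use of NumPy are all assumptions of this example), alpha can be computed from a respondents-by-items score matrix as follows:

import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a (respondents x items) score matrix."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of respondents' total scores
    return (n_items / (n_items - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical scores: five test takers on a four-item Likert-type instrument.
responses = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [4, 4, 5, 4],
]
print(f"alpha = {cronbach_alpha(responses):.2f}")  # alpha = 0.93

For the toy matrix this prints alpha = 0.93, which would conventionally be read as high internal consistency; studies of the kind the article surveys report such a coefficient for each instrument used, since low values indicate that observed scores are dominated by measurement error.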


Department of Teaching and Learning, Policy and Leadership (TLPL)

Applied Linguistics and Language Education, Ph.D.

Faculty research interests in the Applied Linguistics area of focus include classroom discourse, conversational analysis, dual language learner education, language and literacy teacher development, language assessment policy, language contact and multilingualism, language diversity, language in school contexts, language planning and policy, peer interaction, second language teaching, sociocultural approaches to second language acquisition, teacher collaboration, codeswitching, and translanguaging. The doctoral program is primarily focused on language education in pre-kindergarten through high school settings in the US.

The program provides competitive financial support packages for all admitted students.

Applied Linguistics and Language Education (ALLE) faculty and doctoral students run an important center on campus, called the Multilingual Research Center (MRC). The MRC is committed to promoting research and outreach related to multilingualism, multilingual communities, and the education of multilingual populations. It aims to increase the quality and number of TESOL, World Language, and dual language programs and teachers in Maryland, the nation, and the world through outreach; to sponsor and conduct research which illuminates our understanding of multilingualism and multilingual communities; and to disseminate research results to teachers, school systems, and national and international research communities. The MRC uses its financial resources to support faculty and student research, sponsor prominent outside speakers and visitors, and provide faculty and doctoral students with generous support to attend national and international conferences. Learn more about the MRC.

The University of Maryland is the state's flagship university and one of the nation's preeminent public research universities. A global leader in research, entrepreneurship and innovation, the university is home to more than 37,000 students, 9,000 faculty and staff, and 250 academic programs. Its faculty includes three Nobel laureates, two Pulitzer Prize winners, and 49 members of the national academies. It is a member of the Association of American Universities and competes athletically as a member of the Big Ten Conference. The College of Education at the University of Maryland is consistently ranked as one of the country's leading education schools by US News. TLPL's Division of Language, Literacy, and Social Inquiry is home to the Multilingual Research Center, which seeks to create an infrastructure for practice and research in the broader community.

UMD is the nation’s premier institution for language-related research.  It is home to over 200 language scientists in 17 different departments and centers. The campus-wide Maryland Language Science Center coordinates and creates opportunities for collaborations across disciplines and perspectives, and sponsors a wide range of talks, mini-conferences, and workshops.  Students in the LLSI program are encouraged to take full advantage of program flexibility to draw on the university’s wide range of intellectual resources in this area.

Primary Program Faculty

Shenika Hankerson (PhD, Michigan State University): African American Language; race, equity, language, and literacy; second language writing; language policies and language rights; critical discourse studies. Email [email protected]

Jeff MacSwan (PhD, UCLA): Bilingualism; codeswitching; applied linguistics; the role of language in schooling; language assessment policy. Email [email protected]

Laura Mahalingappa (PhD, The University of Texas at Austin): Teacher preparation and development for marginalized students; linguistically responsive pedagogy; first and additional language acquisition; critical language pedagogies; language awareness for teachers and learners. Email [email protected].

Melinda Martin-Beltrán (PhD, Stanford University): Sociocultural approaches to second language acquisition focusing on dual language learners (ESOL students); peer interaction; language exchange; and teacher learning to build upon students’ linguistic and cultural diversity. Email [email protected]

Nihat Polat (PhD, University of Texas at Austin): Applied linguistics; individual differences (e.g., motivation, identity) in additional language acquisition (e.g., writing, syntax) and pedagogy (e.g., SIOP); teacher education (e.g., cognition, dispositions); the education of minoritized multilingual learners (e.g., emergent bilinguals, Muslim students in the U.S.). Email [email protected].

Megan Madigan Peercy (PhD, University of Utah): Pedagogies of teacher education; preparation and development of teachers throughout their careers and as they work with language learners; theory-practice relationship in language teacher education; teacher collaborative relationships and learning. Email [email protected]

Kellie Rolstad (PhD, UCLA): Language of schooling; language diversity; second language teaching; unschooling; democratic education. Email [email protected].

Participating Faculty

Peter Afflerbach (PhD, State University of New York at Albany): Reading comprehension strategies and processes, especially related to new literacies; the verbal reporting methodology; reading in Internet and hypertext environments; reading assessment.

Ayanna Baccus (PhD, University of Maryland): Reading and literacy instruction.

Perla Blejer (EdD, George Washington University): Second language acquisition; foreign language education methodology; language program administration in higher education; issues of equal opportunity for at-risk students and disadvantaged populations.

Drew Fagan (EdD, Teachers College, Columbia University): Influence of teacher talk on language learning opportunities in classroom discourse; conversation analysis and second/foreign language classroom interactions; factors affecting teachers; preparing mainstream teachers for working with English Language Learners.

Loren Jones (PhD, University of Miami): Literacy and language instruction to support culturally and linguistically diverse students; writing development of English learners (ELs); translanguaging to promote literacy development; teacher preparation for working with ELs across content areas. 

Sarah C. K. Moore (PhD, Arizona State University): Language policy; equity and access for minoritized language communities; educator professional development and preparation around language teaching and learning; online and virtual educator preparation.

John O'Flahavan (PhD, University of Illinois, Urbana-Champaign): PK-12 literacy teaching and learning; the discourses involved in teaching and learning in schools; comprehensive school-wide literacy programs; sustainable school improvement.

Olivia Saracho (PhD, University of Illinois, Urbana-Champaign): Emergent literacy; family literacy; cognitive style and play.

Ebony Terrell Shockley (PhD, University of Maryland, College Park): Teacher preparation for culturally and linguistically diverse learners,  primarily in STEM and literacy contexts; written language assessment bias for bidialectal and multilingual learners; preparing teachers for speakers of African American Language; Black English Learners and the achievement gap; English Learners in Special Education.

Wayne Slater (PhD, University of Minnesota): Persuasion in reading comprehension and written communication, with a focus on biased assimilation and stasis theory.

Jennifer Turner (PhD, Michigan State University): Culturally responsive approaches to elementary reading instruction; vision as a conceptual and practical tool for preparing reading teachers for diversity; literacy as an indicator of college and career readiness; diverse students’ multimodal representations of future professional identities and workplace literacies.

Peggy Wilson (PhD, University of Maryland): Secondary literacy, writing, and grammar.

Affiliated Program Faculty

Donna Christian (PhD, Georgetown University): Dual language education; bilingual education; dialects and education; heritage language education; language and public policy; second/foreign language learning; sociolinguistics. Dr. Christian is a Senior Research Fellow and past President/CEO of the Center for Applied Linguistics.

Elisa Gironzetti (PhD, Texas A&M University-Commerce; PhD, Universidad de Alicante): Applied linguistics; second language and heritage language pedagogy; instructional pragmatics; humor; multimodal discourse analysis. An assistant professor in the School of Languages, Literatures, and Cultures, Dr. Gironzetti is director of the Spanish Language Program at UMD.

Francis M. Hult  (PhD, University of Pennsylvania; Docent, University of Jyväskylä): Discourse studies; educational linguistics; ethnography; language policy and planning; linguistic landscapes; multilingual education; nexus analysis; sociolinguistics; sustainability; and transdisciplinarity.  Dr. Hult is Professor of Education at UMBC.

Manel Lacorte (PhD, University of Edinburgh): Applied linguistics; second language and heritage language pedagogy, teacher education, classroom interaction and contexts; sociopolitical issues in second language and heritage language teaching and learning. 

Minglang Zhou (PhD, Michigan State University): Chinese as a second/global language; bilingualism and bilingual education; language identity; language contact; the relationship between language, ethnicity, and nation-state in China. Dr. Zhou is director of the Chinese Language Program and an associate professor in the School of Languages, Literatures, and Cultures at UMD.

The PhD focus in Applied Linguistics and Language Education (ALLE) provides competitive funding packages for all admitted full-time students. As a general rule, the program anticipates that all its students will devote themselves full time to graduate study, and will not have significant employment outside of the university for the duration of the program. This permits the ALLE community to function as a community of practice in which students not only attend classes but are also socialized into a scholarly community. While doctoral programs traditionally focus on a domain (the subject matter or body of knowledge), little attention is generally given to creating a community that permits routine interaction around the construction of professional practice. ALLE faculty believe that a successful program must substantially focus on building a strong sense of community among students, extending into the larger intellectual community of faculty within the home department and throughout the university, and providing ample opportunity for participants to engage in their principal craft in spaces outside of traditional classrooms.

These are some of the specific resources ALLE provides to its doctoral students to help build a community of practice:

A shared space. All ALLE doctoral students are assigned a desk space with other area doctoral students. This shared space gives students an opportunity to interact intellectually around course content, program expectations, and research collaborations.

The Multilingual Research Center. ALLE is home to the Multilingual Research Center (MRC), which engages in research and outreach activities in support of linguistic diversity. The MRC provides research funding support, generous conference travel support for students and faculty, and hosts exciting speaker and brown bag events on campus. Learn more about the MRC.

The broader intellectual community. ALLE participates in the Maryland Language Science Center (MLSC), a campus-wide consortium of over 200 language scientists and scholars from numerous departments across campus. The MLSC hosts conferences, talks, and research collaboration events throughout the year. Learn more about the MLSC.

Student-faculty research collaboration. Students and faculty actively collaborate on a wide range of research projects. Our goal is to involve every student hands-on in research activity, leading to research conference presentations and co-authored publications. While these publications typically involve faculty participation, students also sometimes collaborate with one another on research. Review a list of recent coauthored student-faculty publications.

Typical applicants to the Applied Linguistics and Language Education (ALLE) focus in Language, Literacy, and Social Inquiry have completed a prior master's degree and will need to complete an additional 60 credits of coursework at the University of Maryland for the PhD. (In unusual cases, we may admit students who have not yet completed a master's degree; in that case, an additional 30 credits are required.) Students complete six major components of coursework, as follows:

  • TLPL794 Foundations of Educational Research I (3 credits). An introduction to the "contested terrain" of education research. It examines major conceptual, methodological and political issues embedded in efforts to carry out education research and focuses on the development of the analytic dispositions and communication skills required to carry out research that meets the variously defined quality, utility and significance standards of scholarship in the field.
  • TLPL795 Foundations of Educational Research II (3 credits). Students engage in the process of conceptualizing and completing a rigorous review of a body of literature in their area of specialization.
  • Students in the specialization in Applied Linguistics and Language Education (ALLE) are required to take at least one course in Literacy or Reading Education (3 credits) as a Breadth Requirement.
  • TLPL740 Language and Education (3 credits). Dialects and language varieties in school settings; historical and current perspectives on the role of language in learning; theories of school achievement and consequences for language assessment.
  • TLPL743 Teaching English Language Learners: Current and Future Research Directions (3 credits). Research on the preparation of generalists and specialists teaching English Language Learners. Current research and future research directions.
  • TLPL744 Research Foundations of Second Language Education: Examining Linguistically Diverse Student Learning (3 credits). Critically examines theories of second language acquisition and research in applied linguistics relevant to linguistically diverse students and learners of English as an additional language. Analysis of research from linguistic, psycholinguistic, sociolinguistic and sociocultural perspectives, with an emphasis on the social contexts of second language learning and teaching.
  • TLPL788 Foundations of Applied Linguistics Research (3 credits). Explores the interdisciplinary field of Applied Linguistics, drawing upon a wide range of theoretical and methodological approaches.
  • Students choose four Research Methods courses (12 credits). Courses may be selected from a wide range of options in qualitative and quantitative research methods and may include TLPL793 Discourse Analysis.
  • In consultation with the advisor, students choose six courses (18 credits) as Electives. The elective provision gives students access to the full range of relevant graduate courses throughout the university.
  • While working on the dissertation, students enroll in 12 credits of Dissertation Research.

The Comprehensive Exam. Students write a comprehensive exam after the fourth or fifth semester of their program, often in the intervening summer. The comprehensive exam provides an opportunity for students to review a body of literature relevant to their developing dissertation project interest. The comprehensive exam is evaluated according to a rubric by at least two program faculty. View Comprehensive Exam Rubric.

The Dissertation Proposal. Typically in the third year, students work closely with an advisor to develop a detailed research plan for the dissertation, called a Dissertation Proposal. The proposal presents a rationale for the study, reviews prior relevant research, and details the research plan, generally building on the work completed for the Comprehensive Exam. A dissertation committee meets with the student for a Proposal Defense before the student moves on to the dissertation research.

The Dissertation. Students produce a final dissertation based on the research plan developed in the Dissertation Proposal. The results of the study are presented at a Dissertation Final Defense with the student's dissertation committee. Family members and other members of the public are welcome to attend.

Typical Course Sequence

By design, students will complete the program in four years.  A typical course sequence is shown in the table below.

For more information about the program, contact any of the primary program faculty. We welcome campus visits for students considering applying to the program and routinely hold information events where students can learn more in person about the program.

For information about applying, contact Kay Moon, TLPL Graduate Coordinator, at (301) 405-3118 or [email protected].

University of Cambridge


PhD Programmes in Linguistics


Please see the Applying to MMLL page for information on applications and funding.

PhDs in Theoretical & Applied Linguistics

In British universities the PhD ('Doctor of Philosophy') is traditionally awarded solely on the basis of a dissertation, a substantial piece of writing which reports original research into a closely defined area of enquiry. Candidates for the PhD in Cambridge are guided by a Supervisor, though they will normally also discuss their work with a number of other experts in their field. The nature of the work depends on the topic.

Within linguistics, some PhD students may do most of their work in libraries, others spend part of their time collecting and analysing data, and others carry out experiments in the phonetics laboratory or psycholinguistics laboratory. The dissertation must make a significant contribution to learning, for example through the discovery of new knowledge, the connection of previously unrelated facts, the development of new theory or the revision of older views. The completion of a PhD dissertation is typically expected to take three to four years full-time, or five to seven years part-time.

PhD in Theoretical and Applied Linguistics  (Course Code: MLAL212)

The PhD in Theoretical and Applied Linguistics is a PhD track for students whose research interests span the field of linguistics more widely. Proposals from a broad range of linguistic subdisciplines are welcomed.

PhD in Computation, Cognition and Language (Course Code: MLAL211)

The PhD in Computation, Cognition and Language is a PhD track for students who conduct basic and applied research in the computational study of language, communication, and cognition, in humans and machines. This research is interdisciplinary in nature and draws on methodology and insights from a range of disciplines that are now critical for the further development of language sciences, including (but not limited to) Linguistics, Cognitive Science, Computer Science, Engineering, Psychology and Neuroscience. A variety of PhD topics that fall within this remit are accepted.

Please direct any enquiries regarding entry requirements and academic matters to the Postgraduate Administrative Assistant in the MMLL Graduate Office ([email protected]), and any enquiries regarding the technicalities of applying to the Postgraduate Admissions Office.


Edinburgh Research Archive


Linguistics and English Language PhD thesis collection


Using Arabic (L1) in testing reading comprehension in English (L2) as a foreign language



Assessing Chinese Learners of English, pp. 270–286

The Power of General English Proficiency Test on Taiwanese Society and Its Tertiary English Education

  • Shwu-Wen Lin  


Reacting to government policies that seek to raise Taiwanese university students' international competitiveness by improving their English proficiency, universities in Taiwan have in recent years set an English proficiency graduation requirement. This chapter reports on how the implementation of this requirement has affected university students and the English curriculum. Because the requirement accepts scores from various English proficiency tests as proof of proficiency, rather than from one particular test, the findings of this study have implications for what determines which test exerts the strongest washback when multiple tests coexist and compete for influence.

Keywords

  • Proficiency Test
  • English Proficiency
  • English Learning
  • Language Test
  • English Language Proficiency




Editor information

Editors and Affiliations

University of Bristol, UK

Shanghai Jiao Tong University, China

Copyright information

© 2016 Shwu-Wen Lin

About this chapter

Cite this chapter

Lin, SW. (2016). The Power of General English Proficiency Test on Taiwanese Society and Its Tertiary English Education. In: Yu, G., Jin, Y. (eds) Assessing Chinese Learners of English. Palgrave Macmillan, London. https://doi.org/10.1057/9781137449788_13


Publisher Name: Palgrave Macmillan, London

Print ISBN: 978-1-349-55397-6

Online ISBN: 978-1-137-44978-8




COMMENTS

  1. Studies in Language Testing (SiLT)

    The Studies in Language Testing (SiLT) series of academic volumes address new developments in language testing and assessment. ... Based upon a PhD dissertation completed in 2003, ... graduate students, and even language programs considering using either of these test formats ... a very readable tale of two tests and the complexity needed to ...

  2. Linguistics Theses and Dissertations

    Theses/Dissertations from 2021. PDF. Trademarks and Genericide: A Corpus and Experimental Approach to Understanding the Semantic Status of Trademarks, Richard B. Bevan. PDF. First and Second Language Use of Case, Aspect, and Tense in Finnish and English, Torin Kelley. PDF. Lexical Aspect in-sha Verb Chains in Pastaza Kichwa, Azya Dawn Ladd.

  3. (PDF) Emergent Trends and Research Topics in Language Testing and

    Abstract and Figures. This study, which is of descriptive nature, aims to explore the emergent trends and research topics in language testing and assessment that have attracted increasing ...

  4. Topic and background knowledge effects on performance in speaking

    Towards a model of performance in oral language testing. (Unpublished PhD thesis). University of Reading, UK. Google Scholar. O'Sullivan B., Green A. (2011). Test taker characteristics. In Taylor L. (Ed.), Examining speaking: Research and practice in assessing second language speaking (pp. 36-64). Studies in Language Testing 30.

  5. Research in language assessment

    Since its inception in 1990, the Language Testing Research Centre (LTRC) at the University of Melbourne has earned an international reputation for its work in the areas of language assessment and testing as well as program evaluation. The mission of the centre is: (1) to carry out and promote research and development in language testing; (2) to ...

  6. Review of doctoral research in language assessment in Canada (2006-2011)

    Dimensions of lexical proficiency in writing summaries for an English as a foreign language test (doctoral dissertation). Retrieved from Theses Canada (33748472). ... Towards defining a valid assessment criterion of punctuation proficiency in non-native English-speaking graduate students (doctoral dissertation). Retrieved from ProQuest (MR24877).

  7. PDF Integrating diagnostic assessment into curriculum: a theoretical

    fine-grained attributes a test taker would use in a content domain, how the attributes develop, and how test takers of higher proficiency differ from those of lower proficiency (Mislevy et al., 2003). Over the past decade, the profusion of CDA research in the field of language testing has led to a more comprehensive conceptualization ... (A minimal sketch of such a cognitive diagnosis model appears after this list.)

  8. PDF Language testing

    Language testing development of a dissertation writing support program for ESL graduate research students. English for Specific Purposes (Exeter, UK), 17,2 (1998), 199-217. Despite an explosion in the number of students writing graduate theses in a language other than their first, there are very few accounts either of research into the

  9. Home

    We have a strong record of publication in both peer-reviewed academic journals and other outlets. We are active in mentoring graduate students in language testing through thesis supervision, studentships and via a regular seminar series on language testing and assessment for Doctor of Philosophy - Arts students and staff.

  10. Postgraduate students' conception of language assessment

    Introduction. Assessment is an inseparable part of English language learning-teaching. It is defined as "any act of interpreting information about student performance, collected through any of a multitude of means" (Brown and Hirschfeld 2008, p. 4). Language educators need to depend on assessment for several reasons.

  11. Linguistics and English Language PhD thesis collection

    Blankinship, Brittany (The University of Edinburgh, 2023-03-21) The overarching aim of this thesis is to explore the question of what role the knowledge and use of multiple languages plays in ageing. To answer this question two approaches were taken: first a natural history perspective ...

  12. Completed PhDs in Language Testing

    Gary Buck - The testing of second language listening comprehension (1990) Jeanne Marie Kattan - The construction and validation of an EAP test for second year English and Nursing majors at Bethlehem University (1990) Dejenie Leta - Achievement, washback and proficiency in school leaving examination: A case of innovation in an Ethiopian ...

  13. Linguistics and English Language PhD

    A PhD thesis typically means teamwork, involving the student and one or two supervisors, and often also other members of the research group(s) of the supervisor(s); a student receives training and help from the team, but can also contribute to the team with their research. ... English language tests. We accept the following English language ...

  14. Cognitive Diagnosis in Language Assessment: A Thematic Review

    Along with the surging demand for diagnostic feedback in large-scale language tests, an increasing number of CDA studies have emerged primarily for the purpose of facilitating language teaching and learning. In this paper, we conducted a thematic review of 35 empirical studies on cognitive diagnosis in language assessment during the years 2009 ...

  15. Reliability of measuring constructs in applied linguistics research: a

    The credibility of conclusions arrived at in quantitative research depends, to a large extent, on the quality of data collection instruments used to quantify language and non-language constructs. Despite this, research into data collection instruments used in Applied Linguistics and particularly in the thesis genre remains limited. This study examined the reported reliability of 211 ...

  16. ELT Theses and Dissertations

    Analyses of the English language testing and evaluation course in English language teaching programs in Turkey: A language testing and assessment literacy study. Çiler Hatipoğlu. 2018. PhD. Akşit, Zeynep. Validating aspects of a reading test. Çiler Hatipoğlu. 2018. PhD. Altınbaş, Mehmet Emre. The Use of Multiplayer Online Computer Games ...

  17. Applied Linguistics and Language Education, Ph.D.

    The doctoral program is primarily focused on language education in pre-kindergarten through high school settings in the US. The program provides competitive financial support packages for all admitted students. Applied Linguistics and Language Education (ALLE) faculty and doctoral students run an important center on campus, called the ...

  18. PhD Programmes in Linguistics

    The dissertation must make a significant contribution to learning, for example through the discovery of new knowledge, the connection of previously unrelated facts, the development of new theory or the revision of older views. ... The PhD in Computation, Cognition and Language is a PhD track for students who conduct basic and applied research ...

  19. Language testing and assessment (Part I)

    Unpublished PhD dissertation, University of Edinburgh, Edinburgh. Khaniyah, T. R. (1990b). The washback effect of a textbook-based test. Edinburgh Working Papers in Applied Linguistics, 1, 48 ... Investigating authenticity in language testing. Unpublished PhD dissertation, Lancaster University, Lancaster.

  20. Language Tests for PhD Study

    The guides below introduce some of the tests that are suitable for doctoral research, including the IELTS and TOEFL for English as well as specific tests for German, French, Mandarin and other languages. You can also find out more about how language tests work for PhD study, below. English Language Tests for PhD Study.

  21. Using Arabic (L1) in testing reading comprehension in English (L2) as a

    The use of Arabic in the English reading comprehension tests did not improve the performance of students. Interview responses were mixed, but with no consensus in favour of Arabic. Limitations of this study are discussed, and recommendations for further research in testing reading comprehension in English as a foreign language are presented.

  22. Language Education and Multilingualism, PhD

    TOEFL minimum score is 250 for the computer-based test, 600 for the paper-based test and 96 for the Internet-based test; IELTS minimum score is 7.0 overall; PTE minimum score is 55 overall; Financial documentation: International graduate applicants must document their ability to pay for all costs incurred while studying in the U.S.

  23. The Power of General English Proficiency Test on Taiwanese ...

    The General English Proficiency Test. In L. Cheng & A. Curtis, eds. English language assessment and the Chinese learner (pp. 158-172). New York: Routledge. Google Scholar Wall, D. (1996). Introducing new tests into traditional systems: Insights from general education and from innovation theory. Language Testing, 13(3), 334-354.
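Several of the sources above (see items 7 and 14) concern cognitive diagnostic assessment, in which a Q-matrix records which fine-grained attributes each test item requires. The sketch below is a minimal illustration only: it assumes the DINA model with made-up slip and guess parameters and a hypothetical Q-matrix, none of which comes from the cited studies. It shows how a test taker's attribute-mastery profile translates into expected item performance:

import numpy as np

# Hypothetical Q-matrix: 4 items x 3 attributes (1 = the item requires that attribute).
Q = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
])
slip = np.array([0.10, 0.15, 0.10, 0.20])   # P(incorrect despite mastering all required attributes)
guess = np.array([0.20, 0.10, 0.25, 0.10])  # P(correct without the required attributes)

def response_probs(alpha):
    """Per-item P(correct) under the DINA model for a mastery profile alpha."""
    alpha = np.asarray(alpha)
    mastered = (Q <= alpha).all(axis=1)  # an item counts as mastered iff every attribute it requires is held
    return np.where(mastered, 1 - slip, guess)

# A test taker who has mastered the first two attributes but not the third:
print(response_probs([1, 1, 0]))  # [0.9  0.85 0.25 0.1 ]

The diagnostic report is then the inferred mastery profile itself rather than a single total score, which is what distinguishes cognitive diagnosis from conventional scoring.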