essay on the multi store model of memory

Multi-Store Model of Memory

Last updated 22 Mar 2021

Atkinson and Shiffrin (1968) developed the Multi-Store Model of memory (MSM), which describes the flow of information between three permanent structural stores of memory: the sensory register (SR), short-term memory (STM) and long-term memory (LTM).

The SR is where information from the senses is stored, but only for a duration of approximately half a second before it is forgotten. It is modality-specific, i.e. whichever sense is registered will match the way it is consequently held (for instance, a taste held as a taste).

However, if attended to, sensory information moves into the STM for temporary storage, where it is encoded visually (as an image), acoustically (as a sound) or, less often, semantically (through its meaning). STM is thought to have a capacity of 5-9 items and a duration of approximately 30 seconds. This capacity can be increased through ‘chunking’ (grouping a string of items into a smaller number of larger ‘chunks’, e.g. the number 343565787 becomes 343 565 787).
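
The chunking example above can be sketched in a few lines of Python. This is only an illustration of the grouping step, not a model of memory: the function name `chunk` and the chunk size of three are assumptions chosen to match the 343 565 787 example.

```python
def chunk(items: str, size: int = 3) -> list[str]:
    """Split a string of items into consecutive chunks of at most `size` characters."""
    return [items[i:i + size] for i in range(0, len(items), size)]


digits = "343565787"
chunks = chunk(digits)

# Nine separate digits exceed nothing here, but they fill far more of a
# 5-9 item store than the three chunks they can be grouped into.
print(len(digits), "digits grouped into", len(chunks), "chunks:", " ".join(chunks))
```

Held as single digits, the string takes up nine items; held as chunks, it takes up three, which sits comfortably within the 5-9 item capacity described above.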

Rehearsing information via the rehearsal loop helps to retain it in the STM and to consolidate it into LTM, where it is predominantly encoded semantically. Information in LTM can be stored and retrieved over a potentially unlimited duration, and the store likewise appears to have an unlimited capacity.

Evaluation of the MSM

Strengths:

  • There is a large base of research that supports the idea of distinct STM and LTM systems (e.g. brain-damaged case study patient KF’s STM was impaired following a motorcycle accident, but his LTM remained intact).
  • It makes sense that memories in the LTM are encoded semantically – i.e. you might recall the general message put across in a political speech, rather than all of the words as they were heard.
  • The MSM was a pioneering model of memory that inspired further research and consequently other influential models, such as the Working Memory Model.

Weaknesses:

  • Some research into STM duration has low ecological validity, as the stimuli participants were asked to remember bear little resemblance to items learned in real life, e.g. Peterson and Peterson (1959) used nonsense trigrams such as ‘XQF’ to investigate STM duration.
  • The model is arguably over-simplified, as evidence suggests that there are multiple short-term and long-term memory stores, e.g. ‘LTM’ can be split into episodic, procedural and semantic memory.
  • It does not make much sense to think of procedural memory (a type of LTM) as being encoded semantically, i.e. knowing how to ride a bike through its meaning.
  • It is only assumed that LTM has an unlimited capacity, as research has been unable to measure this accurately.

The Multi-Store Model Of Memory

March 5, 2021 - paper 1 introductory topics in psychology | memory.

Before we focus on the key characteristics of the Multi-Store Model of Memory it is important to develop an understanding of the definition of ‘memory.’

AO1, Definition of ‘Memory’: The process by which we retain information about events that have happened in the past. This includes fleeting (short term) memories as well as memories that last for longer (long term). Research has identified a number of key differences between short-term memory (STM) and long-term memory (LTM) in terms of the way these types of memory work.

A model of memory is a theory of how the memory system operates: the various parts that make up the system and how those parts work together. The Multi-Store Model of Memory, developed by Atkinson and Shiffrin, describes the key components of memory: the sensory store, the short-term memory store and the long-term memory store.

AO1, Description of the Multi-Store Model (MSM) of Memory: Atkinson and Shiffrin

The most well-known and influential model of memory was put forward by Atkinson and Shiffrin in 1968. They proposed that human memory involved:

  • A flow of information through an information processing system.
  • A system divided into 3 stages or storage components, i.e. Sensory Register (SR), Short-term Memory (STM) and Long-term Memory (LTM).
  • Information passing from one stage to another in a fixed sequence.
  • Constraints (or restrictions) at each stage in terms of capacity (size, i.e., how much each stage can hold), duration (length of time the memory stays in each stage) and coding (the way the information is stored at each stage e.g., visual images or sounds).

There are three limitations (or constraints) to the separate memory stores which are:

(1) Coding: The way that information is represented in the memory store (e.g., by sound [auditory], meaning [semantic] or image [visual]).

(2) Duration: The length of time that memories can be held within the memory store.

(3) Capacity: The amount of information that can be held in the memory store at any one time.

The Multi-Store model of Memory (MSM), AO1, Description:

According to Atkinson and Shiffrin:

  • The sensory store is constantly receiving information from the environment. Most of this information receives no attention and so is lost. However, if the information that enters the sensory store is attended to (paid attention), it is encoded and passes through to the short term memory (STM).
  • Once in the STM, information (if not rehearsed) can be lost through either displacement (because the STM has a capacity of 7 +/- 2 items) or decay (as the STM only has a duration of 0-18 seconds).
  • If information is elaboratively rehearsed (over and over) and is understood, then it will be transferred/encoded into the long term memory (LTM).
  • The LTM can hold information for an unlimited amount of time and has an unlimited capacity.
  • When stored information is needed, it can be retrieved from the LTM back to the STM.
  • Atkinson and Shiffrin proposed a direct link between rehearsal in the STM and the strength of the long term memory.
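
The flow described in the bullet points above can be written as a toy information-processing sketch. This is an illustration only, not Atkinson and Shiffrin's own formalisation: the class and method names are hypothetical, displacement of the oldest item first is an assumption of the sketch, and time-based decay is left out for brevity.

```python
from collections import deque


class MultiStoreSketch:
    """Toy illustration of the flow SR -> STM -> LTM; not a fitted cognitive model."""

    def __init__(self, stm_capacity: int = 7):
        self.stm = deque()                # limited-capacity short-term store
        self.ltm = set()                  # treated here as unlimited in capacity
        self.stm_capacity = stm_capacity  # 7 reflects the "7 +/- 2" figure above
        self.lost = []                    # items that were unattended or displaced

    def _enter_stm(self, item: str) -> None:
        if len(self.stm) >= self.stm_capacity:
            # Displacement: the oldest item is pushed out (assumption of this sketch).
            self.lost.append(self.stm.popleft())
        self.stm.append(item)

    def sense(self, item: str, attended: bool) -> None:
        """Unattended input is lost; attended input passes into STM."""
        if attended:
            self._enter_stm(item)
        else:
            self.lost.append(item)

    def rehearse(self, item: str) -> None:
        """Rehearsing an item that is currently in STM transfers it to LTM."""
        if item in self.stm:
            self.ltm.add(item)

    def retrieve(self, item: str) -> bool:
        """Retrieval brings an item from LTM back into STM, if it was stored."""
        if item not in self.ltm:
            return False
        if item not in self.stm:
            self._enter_stm(item)
        return True


memory = MultiStoreSketch(stm_capacity=3)  # small capacity to show displacement
memory.sense("noise", attended=False)      # never reaches STM
for word in ["cat", "dog", "sun"]:
    memory.sense(word, attended=True)
memory.rehearse("cat")                     # "cat" is now also in LTM
memory.sense("map", attended=True)         # STM is full, so "cat" is displaced
retrieved = memory.retrieve("cat")         # "cat" returns to STM from LTM
print(list(memory.stm), memory.ltm, retrieved)
```

In this run, "cat" is displaced from STM by "map" but can still be retrieved because it was rehearsed first, whereas the unattended "noise" is simply lost.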

AO3, Evaluation of the Multi-Store Model of Memory

Strengths:

(1) Point: Further research using brain-scanning techniques has supported the Multi-Store Model of memory and the idea of separate memory stores (i.e. a short term memory store and a long term memory store). Evidence: Squire et al (1992) used brain-scanning techniques and found that STM can be associated with activity in the prefrontal cortex and that LTM can be associated with activity in the hippocampus. Evaluation: This is a strength because it provides biological evidence that the different types of memory are processed by different parts of the brain and that the memory stores are distinct, as the Multi-Store Model suggests.

(2) Point: Case studies of brain-damaged patients (e.g. KF) have also offered support for the Multi-Store Model of memory. Evidence: Shallice and Warrington (1970) reported the case of KF, who was brain damaged as a result of a motorcycle accident. His STM was severely impaired; however, his LTM remained intact. Evaluation: This supports the view that STM and LTM are separate and distinct stores, and therefore supports the proposals of the Multi-Store Model of memory, as it shows that it is possible to damage only one store in memory.

(3) Point: The main strength of the MSM comes from support for the idea that at least two separate memory stores exist (i.e. STM and LTM). Evidence: Murdock’s (1962) serial position effect (laboratory experiment): Murdock argued that no matter how many words a person is shown and then asked to recall, items at the beginning of the list are recalled better than those in the middle, while words at the end are recalled better than those at either the beginning or the middle. Better recall of words at the beginning of the list is referred to as the primacy effect; better recall of words at the end of the list is referred to as the recency effect. Evaluation: This supports the MSM because the better recall of words at the beginning of the list can be explained by these words having been rehearsed and starting to pass into the LTM (as suggested by the MSM). Words in the middle of the list are not remembered as well because they are not rehearsed and are therefore lost through displacement. Finally, as suggested by the MSM, the words at the end are remembered well because we can hold words in our STM without rehearsal for up to 30 seconds.
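
The rehearsal-based explanation of the serial position effect can be made concrete with a small deterministic sketch. It is a simplification, not Murdock's (1962) procedure or data: the function name is hypothetical, and it assumes that rehearsal is shared equally among the items currently held in a fixed-size buffer and that the oldest item is displaced first.

```python
from fractions import Fraction


def rehearsal_profile(list_length: int, buffer_size: int):
    """Return (rehearsal received by each list position, positions still held at the end)."""
    buffer = []                                # positions currently held in STM
    rehearsal = [Fraction(0)] * list_length
    for position in range(list_length):
        if len(buffer) == buffer_size:
            buffer.pop(0)                      # displacement of the oldest item (assumption)
        buffer.append(position)
        share = Fraction(1, len(buffer))       # one unit of rehearsal split equally (assumption)
        for held in buffer:
            rehearsal[held] += share
    return rehearsal, buffer


rehearsal, still_in_stm = rehearsal_profile(list_length=12, buffer_size=4)
for position, amount in enumerate(rehearsal, start=1):
    marker = " (still in STM)" if position - 1 in still_in_stm else ""
    print(f"word {position:2d}: rehearsal = {float(amount):.2f}{marker}")
```

Under these assumptions the first words receive the most rehearsal (primacy), because they initially share the buffer with few other items; the last words receive little rehearsal but are still in the buffer at recall (recency); and the middle words have neither advantage.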

Weaknesses:

(4) Point: Case studies of brain-damaged patients also criticise the MSM. Evidence: The case of KF demonstrated that his deficit in STM was for verbal information and that his STM for visual material was normal. Evaluation: This is a weakness because it demonstrates that it is possible to damage only part of the STM, which goes against the MSM idea that STM is unitary (suggesting that there may be more than one type of STM).

From short-term store to multicomponent working memory: The role of the modal model

  • Published: 26 November 2018
  • Volume 47, pages 575–588 (2019)

  • Alan D. Baddeley, Graham J. Hitch & Richard J. Allen

The term “modal model” reflects the importance of Atkinson and Shiffrin’s paper in capturing the major developments in the cognitive psychology of memory that were achieved over the previous decade, providing an integrated framework that has formed the basis for many future developments. The fact that it is still the most cited model from that period some 50 years later has, we suggest, implications for the model itself and for theorising in psychology more generally. We review the essential foundations of the model before going on to discuss briefly the way in which one of its components, the short-term store, had influenced our own concept of a multicomponent working memory. This is followed by a discussion of recent claims that the concept of a short-term store be replaced by an interpretation in terms of activated long-term memory. We present several reasons to question these proposals. We conclude with a brief discussion of the implications of the longevity of the modal model for styles of theorising in cognitive psychology.

Introduction

Some 50 years after its first publication, the paper by Atkinson and Shiffrin (1968) has been cited over 10,000 times (as of October 2018, source: Google Scholar) and continues to be influential in the development of cognitive psychology. We reflect on why this is the case, and what lessons can be learned regarding theory development in our field. As Atkinson and Shiffrin point out, their paper falls into two parts, the first of which comprises “a fairly comprehensive theoretical framework for memory which emphasises the role of control processes – processes under the voluntary control of the subject such as rehearsal, coding and search strategies” (pp. 190-191). The second part describes a series of models developed using this general approach.

Two aspects of their framework influenced our own views and will form the bulk of our discussion. The first of these is their postulation of a short-term store of limited capacity, and the second is their proposal that this acts as a ‘working memory’, playing a crucial role in performing a wide range of cognitive activities. Their initial section is followed by a detailed account of the development and testing of a series of models concerned with the role of a rehearsal buffer in long-term learning. Importantly, they describe this development not as a general theory, but as exploring “a sub class of possible models that can be generated by the framework proposed”, emphasising that a range of other approaches are feasible within the general framework (Atkinson & Shiffrin, p. 191). The resulting system is comprehensive enough to provide a good general account of research on human memory in terms of a framework that is simple and coherent but open to more detailed exploration and subsequent modification without the need to abandon the framework when unexpected results emerge.

The Atkinson and Shiffrin (A & S) framework became known as the modal model, although this term appears to have been originally proposed by Murdock (1967) in a paper that summarises a range of memory results and interprets them within a less developed information-processing model than that proposed by A & S. A & S summarise the many advances made in the study of memory over the previous decade, presenting them within a coherent broad framework that we will argue has stood the test of time. An important feature of the framework is its differentiation between memory structures and fluid ‘control processes’, which manipulate information within those structures. Finally, it attempts to link the model to the world beyond the laboratory, although this is more by implication than by empirical investigation, proposing that the short-term store (STS) within their model acts as a ‘working memory’. As they acknowledge, this was at the heart of Broadbent’s (1958) attempt to link attention and short-term memory, a tradition that we ourselves have attempted to carry on, as have many others.

Assumptions of the Atkinson and Shiffrin (1968) framework

In thinking about this 50-year-old model, it is tempting to limit consideration to the simplified representation that has occurred within text books ever since, and to ignore the many underlying assumptions that have proved to be robust and important, allowing the framework to continue to be productive. We discuss these basic assumptions before going on to consider aspects of the model that were less successful, observing that, rather than leading to an abandonment of the model, as some approaches to theorisation might suggest, they proved to be growing points that allowed further extension and enrichment of the basic framework proposed.

In an article that is highly critical of the lack of theory in current psychology, Gigerenzer (2010) stresses the importance of being aware of the assumptions underpinning theoretical development, contrasting psychology unfavourably with physics and economics. The latter is perhaps an unfortunate choice given the fallibility of its complex theoretical structures based on assumptions such as human rationality and the perfection of the market. As Keynes remarked, “it is better to be roughly right than precisely wrong”. Gigerenzer’s criticism cannot, however, be levelled at Atkinson and Shiffrin, who explicitly list the basic assumptions of their research framework, together with the evidence on which they are based. Some 50 years later, we can revisit them and see how well they have withstood the test of time. They are broadly as follows:

Atkinson and Shiffrin (A & S) propose a “general theoretical framework for human memory”.

Their system distinguishes between permanent structural features and readily modifiable programmable control processes. We regard this as an important distinction sometimes lost in later tendencies to theorise in terms of memory as a by-product of ‘processing’, objecting to the term ‘store’ as implying passive maintenance of the original experience (e.g. Craik & Lockhart, 1972). We ourselves suggest the need for both storage and processing; processes are certainly important but require some form of continuing maintenance over time, for which the term ‘storage’ is helpful.

A & S assume three structural components: a bank of sensory registers, a short-term store (STS) and a long-term store (LTS). They defend this on the basis of earlier research, notably including information from neuropsychological single case studies. This assumption has subsequently been contested, particularly on the basis of neuroimaging studies. We return to this issue later. Our own view, however, is that this separation has continued to be well supported, although subsequent work has led to further fractionation of the three systems (see Baddeley, Eysenck, & Anderson, 2015). The sensory registers are assumed to differ across modalities and link to further analysis and investigation of the role of both storage and processing within the relevant perceptual systems. The STS concept has been elaborated into a more complex working memory system (see below), while long-term memory has also been fractionated into semantic/episodic and implicit/explicit systems.

A & S accept that memory is likely to operate across a number of modalities, but focus on what they term the ‘audio-visual-linguistic system’, linking it directly to their proposed STS. This emphasis on verbal memory is understandable given that the vast bulk of experimental and theoretical work on human memory has involved such material. We would argue, however, that it is perhaps unfortunate that more effort has not, over the years, been made to explore the generality of results of verbal studies, other than simply regarding nonverbal memory as providing further potentially helpful features, as in Paivio’s (1971) dual coding hypothesis. This imbalance has recently begun to change, principally through investigators interested in vision, often influenced by attempts to develop automatic object recognition systems (Brady, Konkle, & Alvarez, 2011; Isola, Xiao, Parikh, Torralba, & Oliva, 2014). Such theorisation has, however, tended to focus on stimulus characteristics rather than the activities of the rememberer, although recent work has attempted to combine research from the verbal and the visual memory traditions (see, e.g., Baddeley & Hitch, 2017; Evans & Baddeley, 2018).

A & S’s proposed framework assumes pathways from the sensory registers to STS and between STS and LTS, and emphasises the importance of control processes in modifying the flow of information through them, stressing their potential complexity and dependence on LTS. However, in practice A & S focused on the particular control process of verbal rehearsal. While this can be readily demonstrated using an appropriate paradigm, it is far from optimal as a mechanism for long-term learning (Hyde & Jenkins, 1969), and in particular underestimates the role of more complex encoding strategies such as those demonstrated in levels of processing studies (Craik & Lockhart, 1972; Craik & Tulving, 1975).

It is important to note, however, that more complex methods of rehearsal remain entirely plausible within their system, which emphasises the flexibility and importance of the strategies adopted by participants, as exemplified by subsequent models within this tradition (e.g. Lehman & Malmberg, 2013; Raaijmakers & Shiffrin, 1981). While strategy has not been extensively studied, it has continued to be accepted as potentially important within cognitive psychology and typically controlled by requiring a sequence of experiments that carefully constrain potential processing strategies. Unfortunately, this has been much less common in neuroimaging studies where the trouble and expense of running a series of experiments has tended to encourage reliance on the simplistic assumption that a single task reflects a single underlying concept, whereas few if any tasks are in fact sufficiently process-pure to justify this assumption. As A & S (1968, p. 101) point out, “Both STS and LTS are active in both STS and LTS experiments”.

This issue is reflected in A & S’s distinction between the concepts of short-term memory (STM) and their proposed short-term store (STS). In their account, STM refers to a range of paradigms whereby small amounts of information are maintained over a limited period, whereas the term STS refers to a hypothetical storage system that may be involved to a greater or lesser extent in such STM paradigms. Hence, as Keppel and Underwood (1962) showed, the Peterson and Peterson (1959) task involving the retention of consonant triplets over delays of up to 18 s, initially regarded as a classic STM task, does in fact depend heavily on LTS, although the STS is also involved (Baddeley & Scott, 1971). Similarly, recency effects, initially regarded as a hallmark of the STS (Glanzer & Cunitz, 1966), can be found across a range of LTM and STM paradigms, and can better be seen as reflecting the application of a recency-based retrieval control strategy to primed representations within a range of different storage systems (Baddeley & Hitch, 1993; see also Lehman & Malmberg, 2013).

It is of course entirely valid to ask a time-based question such as what is happening to information stored over a brief time interval, as for example in the analysis of ongoing processing in speech comprehension. It is, however, important in doing so to accept that this is likely to involve a number of potentially separable processes, and that a tendency to conflate STM and STS is likely to lead to theoretical confusion (Jeneson & Squire, 2012; Waugh & Norman, 1965). LTS does influence storage of information over the first few seconds, and relevant theories of LTS such as those based on Estes (1950) are likely to be relevant in accounting for processing over this interval (e.g. Nairne, 2002). They are not, however, the whole story, and effects of LTS need to be carefully controlled if a system such as Atkinson and Shiffrin’s STS is to be investigated.

The assumption made by A & S that is most central to our own work is that “the short-term store is the subject’s working memory; it receives selected input from the sensory register and also from long-term memory” (A & S, 1968, p. 97). They propose further that it yields hypotheses that are linked to thinking, problem solving and a range of other complex cognitive activities, while accepting that “the framework raises more questions than it answers” (A & S, 1968, p. 97). The multicomponent model of working memory stemmed from an attempt to use the STS component to answer some of these questions, resulting in the need to extend and elaborate this aspect of the modal model. This provides the focus of what follows.

STS as a working memory

We began our first grant at a time when the intense interest in STM was beginning to fade. The gold rush days when everyone seemed to have their own paradigm and a mathematical model to fit were fading, overtaken by interest in semantic memory and levels of processing. This in fact proved fortunate, since instead of worrying about how our work fitted in with everyone else’s, we could focus on the model produced by A & S, a ‘modal model’ in the sense that it encompassed and reflected much of the work that had gone on during the previous decade and presented it in a manner that invited further exploration. Both Baddeley and Hitch had completed their graduate training at the MRC Applied Psychology Unit in Cambridge (now the Cognition and Brain Sciences Unit) under Broadbent’s directorship, and were influenced by the Unit’s remit of combining basic and applied psychology (Baddeley, 2018).

We decided that the first question we should ask was whether the STS did indeed serve as a general working memory. We did so by attempting to manipulate its available storage capacity, observing the effect on three different cognitive activities: reasoning, comprehension and learning. We based our approach on a concurrent task method, requiring participants to perform the relevant cognitive activities at the same time as repeating random digit sequences varying in length. Performance declined as the length of the concurrent sequence was increased on each of our three cognitive activities of reasoning, comprehension and learning, suggesting that the STS did indeed serve as some kind of working memory. However, the decrements were far less than anticipated. The STS was indeed relevant, but not nearly as important as the modal model would seem to suggest.

We decided to modify the modal model, taking into account both our own results and some neuropsychological evidence that had just been published (Shallice & Warrington, 1970). This reported a newly discovered patient who appeared to have a grossly impaired STS with a digit span of two, together with an apparently normal LTS and no evidence of the very general cognitive disruption that would be expected if the STS served as a working memory. How could both the neuropsychological and our own data be reconciled with the modal model?

Our new model comprised three components, one of which, the phonological loop, involved a verbal/acoustic system, similar in nature to A & S’s STS, in which material could be maintained and if necessary transferred to LTM via subvocal rehearsal. We also postulated a broadly equivalent visuo-spatial system, although this was mentioned only briefly in our original paper (Baddeley & Hitch, 1974). We were already beginning to investigate visual STM (Baddeley, Grant, Wight, & Thomson, 1975; Baddeley & Lieberman, 1980; Phillips & Baddeley, 1971), and although included in the 1974 proposal, we only began to actively incorporate the visuo-spatial sketchpad into the overall model some years later. The most marked difference from the modal model, however, was the explicit distinction between a structurally-defined short-term verbal/acoustic store and a separate attentional control system, the central executive. We initially termed the verbal/acoustic store the articulatory loop, emphasizing its function as a control process, as did A & S. We did, however, later decide that this term did not do justice to its basic storage function, adopting the term phonological loop, although without wishing to be precise about the linguistic processes underpinning it. We return later to the structure versus processing distinction.

We started by focusing on the phonological loop since we regarded it as the simplest and most tractable subcomponent of the system. This proved to be the case, allowing us to separate and analyse both the storage system, principally using phonological similarity as a marker, and the subvocal rehearsal system, principally using articulatory suppression to disrupt rehearsal (e.g. Allen, Baddeley & Hitch, 2006; Baddeley, Chincotta & Adlam, 2001). The precise nature of forgetting within the phonological store remains controversial, however. We presented evidence that we felt suggested time-based trace decay (Baddeley et al. 1975), while others presented both counter evidence (Lovatt, Avons & Masterson, 2000) and evidence in favour (Mueller, Seymour, Kieras & Meyer, 2003). The issue remains hotly disputed (Barrouillet & Camos, 2014; Hulme, Surprenant, Bireta, Stuart, & Neath, 2004), and the nature of short-term forgetting remains an important question but is fortunately not crucial for the overall concept of a phonological loop.

Although the three-component system could account for a wide range of experimental results, it had difficulty in handling data based on prose recall as used, for example, in the working memory span task that Daneman and Carpenter (1980) had shown to be such a good predictor of individual differences, not only as originally proposed in prose comprehension, but also in a wide range of other complex cognitive tasks including reasoning and performance on standard intelligence tests (Conway et al., 2008). In the face of these and other related problems, a fourth component was proposed, the episodic buffer (Baddeley, 2000), a multidimensional interface that was assumed to be capable of binding information, either within or between systems, into episodes that were then available for conscious awareness. As such it provided an essential component of our revised working memory system.

Much of the last decade has been concerned with attempting to use the concept of an episodic buffer productively, hence avoiding the danger that it may simply become a convenient way of explaining unwanted anomalies. Our initial assumption was that the binding of features such as colour and shape into objects, or of words into meaningful phrases, was directly dependent on the buffer. However, a series of studies systematically manipulating the various components of working memory consistently argued against this hypothesis. Syntactic and semantic binding appears to occur relatively automatically based on language skills within LTM (Baddeley, Hitch, & Allen, 2009), while the binding of visual features into objects appears to occur at a level prior to accessing the episodic buffer (Allen et al., 2006). We concluded, therefore, that it is essentially a passive system for combining information from a range of dimensions and cognitive subsystems and making it available to conscious awareness, but that it does not itself serve a binding function (see Baddeley, 2012; Baddeley, Allen, & Hitch, 2011), although maintaining such representations against trace decay or interference does appear to be attentionally dependent (Allen, Baddeley, & Hitch, 2014).

In recent years there has been a dramatic increase in interest in visual short-term and working memory, principally coming from investigators with interests in visual perception and visual attention. We ourselves have become involved in the area, principally focused on supplementing the initially relatively narrow range of methodologies applied to studying visual working memory with methods that had already proved theoretically productive in the study of verbal working memory. These include manipulating attentional capacity by concurrent tasks (Allen et al., 2006; Baddeley et al., 2009), investigating the role of strategy by instruction to focus on subsamples of the visual stimuli (Atkinson, Baddeley, & Allen, 2018; Hu, Hitch, Baddeley, Zhang, & Allen, 2014) and moving from simultaneous presentation of an array of visual stimuli to sequential presentation of individual items (Allen et al., 2006, 2014). This also allowed us to study effects of visual suffixes, noting that their capacity to disrupt STM depended not only on their visual characteristics but also on whether they might or might not potentially have formed part of the relevant test set or came from a different set of broadly similar items (Hu et al., 2014; Ueno, Allen, Baddeley, Hitch, & Saito, 2011).

By pursuing these lines of research and combining them, we found ourselves focusing on the nature of attention and its control, an issue we had initially avoided as being too difficult. Our current results suggest that visual working memory depends on two pools of attentional capacity, both of limited extent. One is concerned with attentional control and can broadly be seen as an aspect of our proposed central executive. It is sensitive to concurrent attentional load, regardless of modality. The other is concerned with the intake of perceptual information rather than executive control (Allen et al., 2014 ; Hu et al., 2014 ; Hu, Allen, Baddeley, & Hitch, 2016 ). Our conclusions have turned out to be broadly similar to those of colleagues approaching the same issue often using different methods from within the attentional field (Chun, Golomb & Turk-Browne, 2011 ; Lavie et al., 2004 ; Posner, 1980 ; Yantis, 2000 ).

Much of this work, including our own, is limited to studying the retention of simple stimuli such as colored shapes. Such an approach has the advantage of allowing methods from visual attention and its neurobiological basis to be directly applied, and precise and detailed models to be developed. A good example of this is provided by the controversy as to whether the limitation in visual STM is best modeled using the concept of a limited number of storage locations or in terms of limited but flexible storage capacity (e.g. Bays, Catalao & Husain, 2009; Ma, Husain, & Bays, 2014; Zhang & Luck, 2008). This in turn has led to the development of new continuous response measures based on precision rather than categorical error rate. Such detailed modeling occurs explicitly or implicitly within a broader framework, and it is encouraging to see this in the case of visual working memory, as in the recent proposal by Van der Stigchel and Hollingworth (2018) that visuo-spatial working memory plays a fundamental role in the operation of the eye movement control system.

We thus regard our own work as part of an attempt to explain the way in which attention and memory interact in allowing us to perform a wide range of cognitive activities. We see our work as part of an ongoing enterprise that extends from Broadbent ( 1958 ), through the Atkinson and Shiffrin modal model to a very wide range of studies of working memory across both cognitive psychology and cognitive neuroscience. It is of course important to bear in mind that studies using the concept of working memory reflect many different approaches to the topic, with studies in neuroscience in particular often applying the term ‘working memory’ to simple STM tasks.

STS as activated LTM?

However, while the broad framework produced by A & S, with its emphasis on separate structures for STS and LTS, has been very influential for over 50 years, in recent years it has been seriously challenged by the claim that short-term storage is simply activated LTM. This could be regarded as perhaps the most substantial objection to our own multicomponent model and as such merits careful consideration. We should begin by stressing that we do not suggest that LTM plays no role in working memory. Even a basic digit span task will depend on knowledge of digit names and frequency of digit sequences (Jones & Macken, 2015), and be much reduced when the digits come from a non-native language, while if presented visually, span will depend on the familiarity of spatial configuration (Darling, Allen, & Havelka, 2017) and learned capacity to turn the visual symbols into sounds. This and many other tasks will also be influenced by strategy, with reliance on phonological coding tending to be abandoned as sequences become longer (Hall, Wilson, Humphreys, Tinzmann, & Bowyer, 1983; Salame & Baddeley, 1986) or when semantic coding proves feasible, as in sentence span (Baddeley, Hitch, & Allen, 2009). As material becomes more complex, the inter-relation with LTM is itself likely to increase in complexity.

We suggest therefore that the crucial question is not whether working memory depends on LTM, but how long-term and working memory interact and indeed whether it is necessary to assume separate long-term and temporary storage systems. The strongest evidence for this, tentatively accepted by A & S, comes from neuropsychology, with some patients showing grossly impaired LTM but preserved STM (Baddeley & Warrington, 1970 ; Milner, 1966 ), while others show the opposite pattern of preserved LTM and grossly impaired STM (Shallice & Warrington, 1970 ; Vallar & Baddeley, 1984 ).

Cowan (1988, p. 182) has suggested an alternative view of the neuropsychological evidence, suggesting that the patient described by Shallice and Warrington may have had “a deficiency in one or more of the control processes used to enhance short-term storage (e.g. overt articulation)”. There is, however, no evidence for this; such patients can have excellent language production skills combined with a substantial verbal STM deficit (Vallar & Baddeley, 1984). It could be argued that this is only one possibility, but to propose a model with a range of potential but unspecified control processes that might possibly explain the result does not seem to offer a clear way forward when compared with a well-supported and specified alternative.

A more direct criticism of the neuropsychological evidence for separate visual and verbal STM systems is provided by Morey’s (2018) proposal that the concept of a separate short-term visual store is unnecessary. Morey’s case rests principally on questioning two sources of evidence for a short-term visual system. The first of these concerns the neuropsychological evidence and in particular case ELD, initially identified as a case of long-term learning deficit for faces (Hanley, Pearson, & Young, 1990; Hanley, Young, & Pearson, 1991), but which subsequently proved to offer a visual analogue to the type of verbal STM deficit first reported by Shallice and Warrington (1970) that formed the basis for the concept of a phonological loop (Vallar & Baddeley, 1984). Morey (2018) criticizes both of these studies, but this appears to depend on a number of misreports and/or misinterpretations of the original studies, as pointed out by Hanley and Young (in press). In particular, Morey reports ELD’s face memory as normal when this applies only to already familiar faces, whereas her retention of unfamiliar faces was grossly impaired, a pattern resembling verbal STM patient PV’s good retention of words but impaired STM for new phonological information in the form of nonwords (Baddeley, Papagno, & Vallar, 1988). The suggestion of a specific visual STM deficit led to a number of new hypotheses, including the prediction that ELD’s pattern of deficits would extend beyond faces, with impaired performance on a range of visual STM tasks including the Corsi block-tapping task together with normal digit span and impaired performance on the visual but not the verbal components of the Brooks tasks (Brooks, 1967). These, together with her difficulty in remembering new but not familiar faces, provide a clear double-dissociation when combined with the equivalent pattern for patients with verbal STM deficits (see Baddeley & Hitch, 2018). Such a dissociation is not open to Morey’s claim that one type of task is simply harder than the other.

As Morey points out, a double dissociation in which one patient shows a deficit in A but not B while a second shows the opposite pattern, while providing stronger evidence for two separate systems than a single dissociation, is not conclusive, especially in a system with more than two components (Baddeley, 2003 ; Dunn & Kirsner, 2003 ). A triple or quadruple dissociation for a three- or four-component system, however, becomes rapidly impractical, forcing the investigator to rely on the method of converging operations whereby the same question is asked using a range of different methods and different populations, only accepting the result when there is extensive agreement (Garner, Hake, & Ericsson, 1956 ). This is the approach we have consistently taken in developing the multi-component model.

The second major theme of Morey’s review is to reject the hypothesis of separate visual and verbal short-term stores by conducting an extensive meta-analysis of studies in which visual and verbal tasks must be performed simultaneously, finding clear evidence of costs above those expected from such tasks when performed alone. This is suggested to provide evidence against the assumption of separate visual and verbal STM. This is not, however, a valid prediction from our multicomponent model, which would assume at least two additional central executive costs. The first comes from the role of the central executive in maintaining information over the short-term even under single-task conditions. This would be expected to be reduced with verbal information for which articulatory subvocalisation provides a method of maintaining small amounts of information at a relatively low attentional cost, although this cost is likely to increase with longer sequences. In the case of visual STM, we assume that even small loads will require some form of rehearsal by refreshing (Barrouillet & Camos, 2014), an attentionally demanding process. Secondly, there is clear evidence that dual or multi-tasking places a specific additional demand on the central executive (Baddeley, Logie, Bressi, Della Sala, & Spinnler, 1986; Logie, Cocchini, Della Sala, & Baddeley, 2004). We would therefore predict some cost of performing visual and verbal tasks simultaneously, although this would be less than combining two tasks that both involve visual or verbal short-term storage. The degree of interference is likely to depend on precisely which tasks are combined, leading to the pattern of results that Morey observed.

We would argue that although it is not possible to conduct any single experiment that leads to an unequivocal conclusion, the balance of evidence across studies favors our proposal of separate visual and verbal storage maintained by a common executive control system. Demonstrating this within a single experimental study is very demanding, as shown by the attempt to rule out all potential objections to the proposal of separate visual and spatial contributions to STM by Klauer and Zhao ( 2004 ), in which they review the literature, finding none of the studies totally convincing, and attempt to test each possible objection across multiple experiments before concluding that the distinction is valid.

One advantage of attempting to apply a model such as our own across a wide range of differing situations is that it does provide potentially converging ways of attempting to conceptualize such a model. The best example of this is provided by the concept of the phonological loop. As we have already mentioned, we began by assuming that the loop was based purely on the process of articulation, as Cowan suggested, but moved gradually to a more nuanced approach that assumes separate contributions from both storage and from an optional articulatory rehearsal strategy. Fortunately, it is possible to disrupt rehearsal by articulatory suppression, repeatedly uttering an irrelevant word such as ‘the – the – the’ (Baddeley, Lewis & Vallar, 1984; Murray, 1968). This impairs span, eliminates the word length effect and interferes with long-term learning of new phonological material while leaving semantically-based learning unaffected (Baddeley, Gathercole, & Papagno, 1998), experimentally induced effects that resemble those typically shown by STM-deficit patients. These effects are, however, substantially reduced in magnitude, relative to those shown by patients. Thus, suppression reduces span by about two items, leaving performance well above the 1- to 2-item span in patients (Vallar & Shallice, 1990), suggesting that span depends on substantially more than the capacity of the rehearsal system. Dyslexia and related developmental reading problems tend also to be associated with reduced span, a finding that Shankweiler, Liberman, Mark, Fowler and Fischer (1979) attributed to failure to use the articulatory loop, since they observed an apparent absence of phonological coding in their poor readers. However, people tend to abandon phonological coding strategy when sequence lengths begin to exceed span and error rates build up (Salame & Baddeley, 1986). This proves to be the case when poor readers with reduced spans are tested at a level that is sufficient to tax the capacity of normal reading-control children. When tested at appropriately shorter lengths, the poor readers showed typical phonological similarity effects, suggesting that the absence of phonological coding in poor readers is strategic rather than structurally-based (Hall, Wilson, Humphreys, Tinzmann, & Bowyer, 1983). Converging evidence comes from other groups selected as being more severely dyslexic who, when tested at appropriate lengths, show evidence of both phonological similarity and word length effects, together with memory error patterns that resemble those of younger children, consistent with an interpretation of the Shankweiler et al. (1979) results as a strategic response to their limited storage capacity (Baddeley, Logie, & Ellis, 1988).

An attempt to study the role of the phonological loop in reading comprehension using lexical decision suggested that the store itself can best be considered as reflecting two components, one articulatory that allows the continued maintenance and manipulation of material, and a second acoustic that allows simple judgements to be made under suppression but does not allow manipulation (Baddeley & Lewis, 1981 ; Besner, 1987 ; Besner & Davelaar, 1982 ), a conclusion extended in a recent study by Norris, Butterfield, Hall and Page ( 2018 ).

The assumption that rehearsal is an optional strategy does not of course deny the interest in and importance of this process, which, as Cowan has shown, can be divided in children between time to retrieve the articulated items and time needed to articulate them, suggesting a two-stage process (Cowan et al., 2003 ; Jarrold, Hewes, & Baddeley, 2000 ). Unfortunately, separating these two depends on measuring inter-item gaps in the stream of overtly spoken rehearsal, which is possible in children but not in fluent adults for whom retrieval and articulation appear to overlap (Mattys, Baddeley, & Trenkic, 2018 ).

There is evidence, furthermore, that articulation need not involve overt speech movements. A locked-in patient who had lost all capacity for peripheral muscle control, including that of speech, nevertheless showed good STM capacity and clear evidence of both phonological similarity and word length effects (Baddeley & Wilson, 1985 ), implying a preserved capacity for internal subvocal rehearsal. These are only a sample of the relevant literature, but illustrate the way in which the initial simple phonological loop model has been used to investigate a wide range of situations and populations. It is not clear that the more general and less constrained concept of subvocal rehearsal as one of an unspecified number of control processes has been, or promises to be, nearly so fruitful.

The purpose of the previous discussion was not to refute Cowan’s reasonable speculation, but rather to point out the value of having a relatively specified and simple system that can be tested by being applied across a wide range of differing situations. We do indeed assume that our views have much in common with those of Cowan, noting that Cowan and Chen ( 2008 ) propose that “although the mechanisms of short-term memory are separate from those of long-term memory they are closely related” (p. 104), going on to elaborate with a suggestion that a “phonologically-based storage and rehearsal mechanism such as the phonological loop mechanism (Baddeley, 1986 ) may come into play primarily when items have to recalled in the correct serial order” (p. 94). We also agree with his suggestion that “Baddeley’s ( 2000 ) episodic buffer is possibly the same as the information saved in Cowan’s focus of attention or at least is a closely similar concept” (Cowan, 2005 , p. 11). We regard our concepts of a central executive interacting with an episodic buffer as essentially equivalent to Cowan’s more intensively studied attentional approach. We see ourselves as differing principally in the greater emphasis on our more detailed analysis of processes and systems involved in visual and auditory short-term storage. Our principal point of disagreement thus concerns the way in which long-term and working memory interact and in particular whether it is helpful to assume separate short-term systems.

The case for the importance of temporary storage systems has been made recently by Norris (2017), who combined the evidence from behavioral studies, neuropsychology, neuroimaging and computational modeling to question the claim that activated LTM provides an adequate basis for working memory, criticizing in particular the tendency for brain imaging studies to conclude that, because working memory tasks are typically associated with brain areas that are also linked to LTM, activated LTM is sufficient to account for short-term storage (e.g. Acheson, Hamidi, Binder, & Postle, 2011; Cameron, Haarmann, Grafman & Ruchkin, 2005; Lewis-Peacock & Postle, 2008). The latter claim in their abstract that: “This result implies that activated long-term memory provides a representational basis for semantic verbal short-term memory, and hence supports theories that postulate that short-term and long-term stores are not separate”. Similarly, Öztekin, Davachi and McElree (2010) state in their abstract that “these findings support single store accounts that assume there are similar operating principles across WM and LTM representations” (Öztekin et al., 2010, p. 1123). However, as we have already noted, A & S (1968, p. 101) point out that “both STS and LTS are active in both STS and LTS experiments”. The modal model and many other models of memory assume close links between WM and LTM, hence demonstrating a positive association is inconclusive in deciding whether one or two systems are involved.

Norris goes on to argue that models that rely on activation of existing representations in LTM, with no temporary short-term component, may founder on the ‘problem of two’ (Norris, 2017, p. 1003). This refers to the long-standing issue of serial recall where an item may be used in a sequence more than once, or may need to be recalled more than once, as for the digit 1 in recalling the sequence 971312. If such a sequence does not already occupy a specific representation in LTM, it will require a separate representation to be created in some other store. Given that we can handle limitless repetitive sequences of novel items, it is implausible to assume that all of these already exist in LTM. A temporary STS of some kind solves this problem. Such a store could indeed contain pointers rather than copies of the original items, but although “STM would indeed depend on LTM representations, all of the heavy lifting would be done by processes outside LTM itself” (Norris, 2017, p. 1003). Cowan (1999) accepts this problem but proposes that it can be handled by the rapid formation of new LTM representations. However, while extensive research has shown that adequate models of the storage and retrieval of serial order have been developed with the aid of a separate short-term store, Norris claims that detailed modelling of how this might be achieved without such temporary storage is currently absent. Given the importance of the capacity to create and maintain serial order, this is a major omission.
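The ‘problem of two’ can be made concrete with a toy sketch. The code below is purely illustrative and is not taken from Norris (2017) or from any model cited here; the names LTM_DIGITS, encode_by_activation and encode_with_pointers are hypothetical. It assumes, for simplicity, that ‘activated LTM’ amounts to marking existing representations as active, and that a temporary store holds one position-tagged pointer per presented item.

```python
# Toy illustration only: hypothetical names, not a published model.

# Stand-in for pre-existing long-term representations of the ten digits.
LTM_DIGITS = {str(d): f"digit-{d}" for d in range(10)}


def encode_by_activation(sequence):
    """Mark LTM items as active. An item is either active or not,
    so a repeated item leaves no extra trace and order is lost."""
    return {item for item in sequence if item in LTM_DIGITS}


def encode_with_pointers(sequence):
    """Temporary store: one (position, pointer) token per presented item.
    The pointer is just the key of the LTM representation, not a copy."""
    return [(pos, item) for pos, item in enumerate(sequence) if item in LTM_DIGITS]


def recall_from_pointers(store):
    """Read the tokens back out in positional order."""
    return "".join(item for _, item in sorted(store))


sequence = "971312"
active = encode_by_activation(sequence)   # five distinct digits, unordered
store = encode_with_pointers(sequence)    # six tokens, the digit 1 appears twice
recalled = recall_from_pointers(store)    # reproduces the original sequence
```

In this sketch the activation-only encoding cannot distinguish 971312 from 97132, whereas the pointer store keeps both occurrences of the digit 1 and their positions. The store still depends on LTM for the items themselves, which is the sense in which the ordering work is done outside LTM.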

Of course, the question of how serial order is stored also occurs within working memory as in the case of the phonological loop. This has, however, been recognised and has led to extensive and detailed modelling, with a range of different approaches (some though not all based on the multicomponent model), both in the case of verbal recall (e.g. Burgess & Hitch, 1999, 2006; Page & Norris, 1998, 2009), and visuospatial STM (Hurlstone & Hitch, 2015, 2018). Happily, a coherent set of principles appears to be emerging from the literature with a growing degree of agreement (Hurlstone et al., 2013).

Approaches to theorizing in psychology

It is relevant at this point to provide a brief discussion of the implications of the success and longevity of the modal model for the wider issue of theorizing within psychology. Although there is currently justifiable concern with methodological issues such as transparency and replicability, it appears to be no longer fashionable within cognitive psychology to discuss philosophy of science; we should instead simply concentrate on getting our papers in high citation journals, preferably with a neuroscience flavor. In this connection it is perhaps worth noting another quote from the great economist John Maynard Keynes, who observed that: “Practical men who believe themselves to be free of any intellectual influence are typically the slaves to some defunct economist.” Could that also be true of science? If so, what might be the implicit theories within experimental psychology for example?

In the middle years of the last century, the philosophy of science was a topic of some general interest, with the dominant view probably being that of Popper ( 1959 ), who was part of a general movement originating in Vienna sometimes termed ‘falsificationism’. This approach was applied to both philosophy and science, and proposed that for a theory to be useful, it had to make clear and falsifiable predictions; if these were not supported, the theory should be abandoned. This tended to be backed up by reference to Newtonian physics with its clear postulates and precise predictions (Braithwaite, 1953 ). Its clearest instantiation in psychology was through Clark Hull’s ( 1943 ) Principles of Behavior, which attempted to explain learning in the white rat, and by implication more generally, in terms of a series of postulates linked by precise equations. An alternative view was that proposed by Toulmin ( 1953 ), who viewed theories as resembling maps, useful as far as they represent what is known, as accurately and elegantly as possible, providing a tool for further exploration. The outcome of such exploration was then likely to involve elaboration of the earlier map rather than its total abandonment, unless of course a different and better map was produced.

Observations as to how scientists actually behave, however, suggest yet another approach, that presented by Kuhn (1962) with his concept of scientific paradigms. These reflect the dominant questions and methods operating in a particular science at a given moment. ‘Normal science’ involves operating within the current paradigm, leading occasionally to a paradigm shift when the old paradigm is abandoned and a new one taken up. This certainly captures the extent to which science responds to fashions, very reasonably in the sense that an exciting new technique or finding will attract people from areas that were showing little progress. Unfortunately, in the hands of philosophers and sociologists it has sometimes been interpreted as suggesting that science is simply a matter of what is fashionable (Dawkins, 1998; Sokal, 1996).

A rather more constructive development came with the proposal by Popper’s colleague Lakatos ( 1976 ) that theories should not be decided on the success or otherwise of precise predictions, but by how productive they are. This does not refer simply to the number of subsequent papers and citations, as fashionable questions are by no means always theoretically or practically productive, but rather to how effective a theory is in creating a framework that captures existing knowledge in a way that leads to further questions that in turn generate new findings or extend existing findings to new fields. He distinguishes such theories from those principally concerned with protecting themselves from attack from further evidence, which he describes as ‘degenerative’. The broad framework proposed by A & S clearly fits more comfortably into the approach advocated by Lakatos, as indeed does our own theoretical approach.

That is not of course to say that more precise theories are not necessary. In its original form, our model had no means of storing information in serial order, a problem raised by Lashley (1951). A number of mechanisms have been suggested, but in order to decide between them it has proved necessary to have much more precise models and carefully focused empirical studies concerned, for example, not only with how serial order is maintained but also with the issue of whether it differs from one modality to another, or whether a common ordering mechanism applies across modalities (see Hurlstone et al., 2013, for further discussion).

As in the case of geographical maps, the most appropriate form of theorizing will depend on the scale of the enterprise. We need both broadly-based maps of countries and regions together with more detailed maps of towns and cities with yet more detail when precisely delineating each individual’s property. It is also important to accept that we need different maps for different purposes; a map of the London tube system is not very helpful in finding your way when walking, although it will broadly mirror the street map. Similarly, theories based on behavior and on neuroscience are likely to have different emphases but to ultimately be broadly compatible. Furthermore, a cognitive framework that is based on well-controlled experiments within the laboratory becomes more productive if it can also be applied beyond the laboratory. This criterion of generality is not by any means the only criterion of a productive theory. Equally important is its capacity to generate questions that then allow the framework to be extended or remodeled, a process that ideally should be combined with more precise attempts to cover individual areas within the broad model. It would of course be very nice to have a model that did both, and this, we assume, is behind the recent attempt to provide ‘benchmarks’ across the various phenomena that are agreed to be characteristic of working memory (Oberauer et al., 2018 ), presumably with the aim of creating a broad but also precise model of working memory. However, with a total selection of 20 ‘major’ and 31 ‘minor’ phenomena to fit, we suspect modeling them might be a little premature and potentially may have the undesirable effect of limiting further exploration as likely to further complicate an already daunting task (see Logie, 2018 ).

Evaluating the concept’s productivity

So how should we evaluate the concept proposed by A & S of a working memory? One approach is that proposed by Lakatos, in terms of productivity. A simple estimate might come from the frequency of the term ‘working memory’ in journal titles. Within psychology or psychology-related fields, this has increased from six in the year 1980 to 40 in 1990, 306 in 2000, 604 in 2010 and 845 for the year 2016 (Source: Web of Science). Of course, this refers to a wide range of different uses of the term and an overall increase in the range and number of publications, and should therefore be interpreted with some caution. Nevertheless, when calculated as a percentage of the number of articles with the broader term ‘memory’ in the title, a clear increase can be observed across this same time period (1% in 1980; 4% in 1990; 16% in 2000; 21% in 2010; and 24% in 2016). However, as mentioned earlier, the simple popularity of a concept does not necessarily mean that it is scientifically fruitful; it could simply reflect unproductive controversy.
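The caution about overall publication growth can be checked with a little arithmetic on the figures quoted above. The sketch below uses only those counts and percentages; the implied totals for ‘memory’ titles are back-calculated from rounded percentages and are therefore approximate (the 1980 figure especially, since 1% may stand for anything from roughly 0.5% to 1.5%). The function names are illustrative.

```python
# Counts and rounded percentages as quoted in the text (source: Web of Science).
wm_titles = {1980: 6, 1990: 40, 2000: 306, 2010: 604, 2016: 845}
wm_share_pct = {1980: 1, 1990: 4, 2000: 16, 2010: 21, 2016: 24}


def implied_memory_titles(year):
    """Approximate number of titles containing 'memory', back-calculated
    as count / share. Only as accurate as the rounded percentage."""
    return round(wm_titles[year] / (wm_share_pct[year] / 100))


def fold_increase(series, start=1980, end=2016):
    """Ratio of the last value to the first value in a yearly series."""
    return series[end] / series[start]


count_growth = fold_increase(wm_titles)     # growth in raw title counts
share_growth = fold_increase(wm_share_pct)  # growth in share of 'memory' titles

for year in wm_titles:
    print(year, wm_titles[year], implied_memory_titles(year))
print(f"count x{count_growth:.0f}, share x{share_growth:.0f}")
```

On these figures the raw count rises far more steeply than the share, the difference reflecting growth in the ‘memory’ literature as a whole, which is why the percentage is the more conservative indicator.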

A more informative way of evaluating the productivity of the working memory concept is to consider concrete examples of its use. We ourselves share Broadbent’s original commitment to link theory with its application beyond the laboratory, and have been pleased to see the working memory concept applied across an increasingly wide range of fields. One major development has been through its application to the field of individual differences by Kyllonen and Christal (1990), relating it to the earlier concept of general intelligence, an approach that has been further developed by a range of groups (Barrouillet & Camos, 2014; Conway, Cowan, & Bunting, 2001; Engle, Tuholski, Laughlin, & Conway, 1999; Miyake, Friedman, Emerson, Witzki, Howerter, & Wager, 2000), with extensive studies relating to the development of working memory in childhood (Cowan et al., 2003; Hitch, Towse, & Hutton, 2001). The concept of working memory as a mental workspace has extended beyond psychology, a good example being its extension to paleoarcheology by Coolidge and Wynn (2005), who propose that working memory may have proved the crucial advantage held by Homo sapiens over Neanderthal man. This suggestion was based on the study of remaining artefacts and their implications for the cognitive abilities they reflect, a claim that is taken sufficiently seriously within the field to merit extensive discussion in the journal Science (see Balter, 2010).

An advantage of the multicomponent model over the hypothesis of a single unitary attentional workspace is that it allows more detailed but constrained hypotheses to be proposed and tested. While A & S focus on one particular control process in verbal rehearsal, it is not clear to us that this has led to fruitful extension to other control processes or to practical applications. This has, however, proved possible with the fractionation of the A & S STS into the three-component working memory. One important feature of our early model is that, like the original A & S model, it could readily be understood without a very precise knowledge of cognitive psychology. This, together with a series of relatively simple tools for identifying and separating the three components, has led to its being widely adopted as a means of investigating the role of working memory across a range of populations and situations. An obvious application is within the field of education (Pickering, 2006), typified by the work of Gathercole and colleagues in developing measures of the components of working memory across the school years (Gathercole & Pickering, 2000a, b; Gathercole, Pickering, Knight, & Stegmann, 2004), identifying different components associated principally with vocabulary (Gathercole & Baddeley, 1989), reading (Swanson & Berninger, 1995) and language development more generally (Baddeley, Gathercole, & Papagno, 1998). A somewhat different pattern emerges in the study of mathematics where a visuo-spatial rather than phonological component tends to dominate (Bull, Johnston, & Roy, 1999; Hitch & McAuley, 1991). The multicomponent model has also begun to be used widely within the field of second language learning (Wen, Mota, & McNeill, 2015), while a recent meta-analysis of individual differences in the rate of second language acquisition, based on a wide range of studies involving a total of 3,707 learners, showed clear and substantial separable contributions from the central executive and phonological loop (Linck, Osthus, Koeth, & Bunting, 2014).

Application of the model to special populations has also been fruitful, with Morris (1984) reporting a central executive deficit in Alzheimer’s disease, followed by a demonstration that this group has a particular problem in dual-task performance, proposed by Baddeley (1996) as one component of the executive (Baddeley et al., 1986; Logie et al., 2004). Dual-task performance has also proved to offer a sensitive genetic marker of a familial form of early-onset Alzheimer’s disease, allowing family members with the gene to be identified before the onset of other major symptoms (Parra, Abrahams, Logie, Mendez Lopera, & Della Sala, 2010). The multicomponent model has also been applied successfully in a twin study of language disorder by Bishop, North, and Donlan (1996), who found evidence for the heritability of an underlying phonological loop component. Clear genetically-based differences have also been shown between people with Down syndrome, who tend to have a phonological loop deficit, and those with Williams syndrome, for whom the sketchpad appears to be clearly impaired (Jarrold, Baddeley, & Hewes, 1999; Wang & Bellugi, 1994). The model also proved useful in further analysis of familial cognitive deficit, with Schulze, Vargha-Khadem and Mishkin (2018) identifying a phonological loop deficit as crucial in a family showing marked impairment in normal language development. These are simply some examples of the application of the concept of a multi-component working memory across a range of fields that operate well beyond the bounds of the psychological laboratory. It is hard to see the concept of working memory simply as activated LTM proving to be equally productive.

Conclusions

So, why has the modal model been so influential? We suggest first of all that it survived while many other models have been forgotten because it attempted to provide a broad framework within which further detail could be developed. The separation between structure and processing has also stood the test of time. Less successful was the modal model’s reliance on the most widely studied memory tasks at the time, based largely on the short-term retention of acoustic/linguistic material. As a result, the need to account for remembering and processing visuo-spatial information was comparatively neglected. A further consequence of the over-reliance on verbal materials was an initial oversimplification of the processes whereby information is transferred from STS to LTS. The emphasis of the model on simple maintenance was called into question shortly afterwards by evidence for the importance of deeper and more elaborative processing for long-term retention (Craik & Lockhart, 1972). Relatedly, the assumption that the STS serves as the gateway to LTS was challenged by the existence of neuropsychological patients with impaired STM but normal LTM.

We suggest that our own multicomponent working memory concept forms an extension and elaboration of the STS component of the modal model that has avoided these latter difficulties. In particular, we suggest that it has proved fruitful to separate the attentional control processes that we termed the central executive from temporary storage, and to suggest that more than one storage modality is likely to be involved. In addition, by postulating the concept of an episodic buffer, we explicitly link the system to hypotheses about conscious awareness. We suggest that our broad framework is compatible with a range of other more detailed proposals regarding specific components of the system. Such development and elaboration are of course essential if the overall framework is to continue to be fruitful.

The original A&S model’s well-deserved longevity stems in part from its capacity to crystallize the major advances made in the previous decade in the understanding of human memory and combine them within a well-justified theoretical framework, one broad enough to encompass modifications and additions in the face of new evidence. We see the multicomponent model of working memory as an extension of this approach, exploring further the nature of their proposed STS by focusing on its capacity to function as part of a more general working memory system.

References

Acheson, D. J., Hamidi, M., Binder, J. R., & Postle, B. R. (2011). A common neural substrate for language production and verbal working memory. Journal of Cognitive Neuroscience, 23 , 1358-1367. https://doi.org/10.1162/jocn.2010.21519

Allen, R., Baddeley, A. D., & Hitch, G. J. (2006). Is the binding of visual features in working memory resource-demanding? Journal of Experimental Psychology: General, 135 , 298-313.

Allen, R. A., Baddeley, A. D., & Hitch, G. J. (2014). Evidence for two attentional components in visual working memory. Journal of Experimental Psychology. Learning, Memory, and Cognition, 40 , 1499-1509. https://doi.org/10.1037/xlm0000002

Atkinson, A. L., Baddeley, A. D., & Allen, R. J. (2018). Remember some or remember all? Ageing and strategy effects in visual working memory. Quarterly Journal of Experimental Psychology, 71(7), 1561-1573. https://doi.org/10.1080/17470218.2017.1341537

Atkinson, R. C., & Shiffrin, R. M. (1968). Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation: Advances in research and theory (Vol. 2, pp. 89-195). New York: Academic Press.

Baddeley, A. (1986). Working memory . New York: Oxford University Press.

Baddeley, A. (2012). Working memory: Theories, models, and controversies. Annual Review of Psychology, 63, 1-29.

Baddeley, A., Gathercole, S., & Papagno, C. (1998). The phonological loop as a language learning device. Psychological Review, 105 , 158-173.

Baddeley, A. D. (1996). Exploring the central executive. Quarterly Journal of Experimental Psychology, 49A , 5-28.

Baddeley, A. D. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences, 4 , 417-423.

Baddeley, A. D. (2003). Double dissociation: Not magic but still useful. Cortex, 39, 129-131.

Baddeley, A. D. (2018). Working memories: Postmen, divers and the cognitive revolution . Hove: Routledge.

Baddeley, A. D., Allen, R. J., & Hitch, G. J. (2011). Binding in visual working memory: The role of the episodic buffer. Neuropsychologia, 49 , 1393-1400.

Baddeley, A. D., Chincotta, D., & Adlam, A. (2001). Working memory and the control of action: Evidence from task switching. Journal of Experimental Psychology: General, 130 , 641-657.

Baddeley, A. D., Eysenck, M., & Anderson, M. C. (2015). Memory (2nd ed.). Hove: Psychology Press.

Baddeley, A. D., Grant, S., Wight, E., & Thomson, N. (1975). Imagery and visual working memory. In P. M. A. Rabbitt & S. Dornic (Eds.), Attention and performance V (pp. 205-217). London: Academic Press.

Baddeley, A. D., & Hitch, G. J. (1974). Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation (Vol. 8, pp. 47-89). New York: Academic Press.

Baddeley, A. D., & Hitch, G. J. (1993). The recency effect: Implicit learning with explicit retrieval? Memory and Cognition, 21 , 146-155.

Baddeley, A. D., & Hitch, G. J. (2017). Is the levels of processing effect language-limited? Journal of Memory & Language, 92, 1-13. https://doi.org/10.1016/j.jml.2016.05.001

Baddeley, A. D., & Hitch, G. J. (2018). The phonological loop as a buffer store: An update. Cortex . https://doi.org/10.1016/j.cortex.2018.05.015

Baddeley, A. D., Hitch, G. J., & Allen, R. J. (2009). Working memory and binding in sentence recall. Journal of Memory and Language, 61 , 438-456.

Baddeley, A. D., & Lewis, V. J. (1981). Inner active processes in reading: The inner voice, the inner ear and the inner eye. In A. M. Lesgold & C. A. Perfetti (Eds.), Interactive processes in reading (pp. 107-129). Hillsdale, N.J.: Lawrence Erlbaum.

Baddeley, A. D., Lewis, V. J., & Vallar, G. (1984). Exploring the articulatory loop. Quarterly Journal of Experimental Psychology, 36 , 233-252.

Baddeley, A. D., & Lieberman, K. (1980). Spatial working memory. Attention and Performance VIII , 521-539.

Baddeley, A. D., Logie, R., Bressi, S., Della Sala, S., & Spinnler, H. (1986). Dementia and working memory. Quarterly Journal of Experimental Psychology, 38A , 603-618.

Baddeley, A. D., Logie, R. H., & Ellis, N. C. (1988). Characteristics of developmental dyslexia. Cognition, 29 , 197-228.

Baddeley, A. D., Papagno, C., & Vallar, G. (1988). When long-term learning depends on short-term storage. Journal of Memory and Language, 27 , 586-595.

Baddeley, A. D., & Scott, D. (1971). Short-term forgetting in the absence of proactive interference. Quarterly Journal of Experimental Psychology, 23, 275-283.

Baddeley, A. D., Thomson, N., & Buchanan, M. (1975). Word length and the structure of short-term memory. Journal of Verbal Learning and Verbal Behavior, 14 , 575-589.

Baddeley, A. D., & Warrington, E. K. (1970). Amnesia and the distinction between long- and short-term memory. Journal of Verbal Learning and Verbal Behavior, 9 , 176-189.

Baddeley, A. D., & Wilson, B. (1985). Phonological coding and short-term memory in patients without speech. Journal of Memory and Language, 24 , 490-502.

Balter, M. (2010). Did working memory spark creative culture? Science, 328 , 160-163.

Barrouillet, P., & Camos, V. (2014). Working memory: Loss and reconstruction. Hove: Psychology Press.

Bays, P. M., Catalao, R. F., & Husain, M. (2009). The precision of visual working memory is set by allocation of a shared resource. Journal of Vision, 9 , 1-11. https://doi.org/10.1167/9.10.7

Besner, D. (1987). Phonology, lexical access in reading and articulatory suppression: A critical review. Quarterly Journal of Experimental Psychology, 39A, 467-478.

Besner, D., & Davelaar, E. (1982). Basic processes in reading: Two phonological codes. Canadian Journal of Psychology, 36 , 701-711.

Bishop, D. V. M., North, T., & Donlan, C. (1996). Genetic basis of specific language impairment: Evidence from a twin study. Developmental Medicine and Child Neurology, 37 , 56-71.

Brady, T. F., Konkle, T., & Alvarez, G. A. (2011). A review of visual memory capacity: Beyond individual items and toward structured representations. Journal of Vision, 11 , 1-4.

Braithwaite, R. B. (1953). Scientific explanation . Cambridge: Cambridge University Press.

Broadbent, D. E. (1958). Perception and communication . London: Pergamon Press.

Brooks, L. R. (1967). The suppression of visualization by reading. Quarterly Journal of Experimental Psychology, 19 , 289-299.

Bull, R., Johnston, R. S., & Roy, J. A. (1999). Exploring the roles of the visual-spatial sketch pad and central executive in children’s arithmetical skills: Views from cognition and developmental neuropsychology. Developmental Neuropsychology, 15 , 421-442.

Burgess, N., & Hitch, G. J. (1999). Memory for serial order: A network model of the phonological loop and its timing. Psychological Review, 106 , 551-581.

Burgess, N., & Hitch, G. J. (2006). A revised model of short-term memory and long-term learning of verbal sequences. Journal of Memory and Language, 55 , 627-652.

Cameron, K. A., Haarmann, H. J., Grafman, J., & Ruchkin, D. (2005). Long-term memory is the representational basis for semantic verbal short-term memory. Psychophysiology, 42 , 643-653.

Chun, M. M., Golomb, J. D., & Turk-Browne, N. B. (2011). A taxonomy of external and internal attention. Annual Review of Psychology, 62 , 73-101. https://doi.org/10.1146/annurev.psych.093008.100427

Conway, A. R. A., Cowan, N., & Bunting, M. F. (2001). The cocktail party phenomenon revisited: The importance of working memory capacity. Psychonomic Bulletin & Review, 8 , 331-335.

Conway, A. R. A., Jarrold, C., Kane, M. J., Miyake, A., & Towse, J. N. (2008). Variation in working memory . New York: Oxford University Press.

Coolidge, F. L., & Wynn, T. (2005). Working memory, its executive functions, and the emergence of modern thinking. Cambridge Archaeological Journal, 15, 5-26.

Cowan, N. (1988). Evolving conceptions of memory storage, selective attention, and their mutual constraints within the human information processing system. Psychological Bulletin, 104 , 163-191.

Cowan, N. (1999). An embedded-processes model of working memory. In A. Miyake & P. Shah (Eds.), Models of working memory (pp. 62-101). Cambridge, UK: Cambridge University Press.

Cowan, N. (2005). Working memory capacity. Hove: Psychology Press.

Cowan, N., & Chen, Z. (2008). How chunks form in long-term memory and affect short-term memory limits. In M. Page & A. Thorn (Eds.), Interactions between short-term and long-term memory in the verbal domain (pp. 86-107). Hove, UK: Psychology Press.

Cowan, N., Towse, J. N., Hamilton, Z., Saults, J. S., Elliott, E. M., Lacey, J. F., ... Hitch, G. J. (2003). Children’s working memory processes: A response-timing analysis. Journal of Experimental Psychology: General, 132, 113-132.

Craik, F. I. M., & Lockhart, R. S. (1972). Levels of processing: A framework for memory research. Journal of Verbal Learning & Verbal Behavior, 11, 671-684. https://doi.org/10.1016/S0022-5371(72)80001-X

Craik, F. I. M., & Tulving, E. (1975). Depth of processing and the retention of words in episodic memory. Journal of Experimental Psychology: General, 104 (3), 268-294. https://doi.org/10.1037/0096-3445.104.3.268

Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19, 450-466.

Darling, S., Allen, R. J., & Havelka, J. (2017). Visuospatial bootstrapping: When visuospatial and verbal memory work together. Current Directions in Psychological Science, 26 , 3-9. https://doi.org/10.1177/0963721416665342

Dawkins, R. (1998). Postmodernism disrobed. Nature, 394 , 141-143.

Dunn, J. C., & Kirsner, K. (2003). What can we infer from double dissociations? Cortex, 39, 1-7.

Engle, R. W., Tuholski, S. W., Laughlin, J. E., & Conway, A. R. A. (1999). Working memory, short-term memory, and general fluid intelligence: A latent-variable approach. Journal of Experimental Psychology: General, 128 , 309-331.

Estes, W. K. (1950). Toward a statistical theory of learning. Psychological Review, 57 , 94-107. https://doi.org/10.1037/h0058559

Evans, K., & Baddeley, A. D. (2018). Intention, attention and long-term memory for visual scenes: It all depends on the scenes. Cognition, 180, 24-37. https://doi.org/10.1016/j.cognition.2018.06.022

Garner, W. R., Hake, H. W., & Eriksen, C. W. (1956). Operationism and the concept of perception. Psychological Review, 63, 149-159.

Gathercole, S. E., & Baddeley, A. D. (1989). Evaluation of the role of phonological STM in the development of vocabulary in children: A longitudinal study. Journal of Memory & Language, 28 , 200-213.

Gathercole, S. E., & Pickering, S. J. (2000a). Assessment of working memory in six- and seven-year-old children. Journal of Educational Psychology, 92 , 377-390.

Gathercole, S. E., & Pickering, S. J. (2000b). Working memory deficits in children with low achievements in the national curriculum at seven years of age. British Journal of Educational Psychology, 70 , 177-194.

Gathercole, S. E., Pickering, S. J., Knight, C., & Stegmann, Z. (2004). Working memory skills and educational attainment: Evidence from National Curriculum assessments at 7 and 14 years of age. Applied Cognitive Psychology, 40 , 1-16.

Gigerenzer, G. (2010). Personal reflections on theory and psychology. Theory & Psychology, 20 , 733-743.

Glanzer, M., & Cunitz, A. R. (1966). Two storage mechanisms in free recall. Journal of Verbal Learning and Verbal Behavior, 5 , 351-360.

Hall, J. W., Wilson, K. P., Humphreys, M. S., Tinzmann, M. B., & Bowyer, P. M. (1983). Phonemic similarity effects in good vs. poor readers. Memory and Cognition, 11 , 520-527.

Hanley, J. R., Pearson, N. A., & Young, A. W. (1990). Impaired memory for new visual forms. Brain, 113 , 1131-1148.

Hanley, J. R., & Young, A. W. (in press). ELD revisited: A second look at a neuropsychological impairment of working memory affecting retention of visuo-spatial material. Cortex.

Hanley, J. R., Young, A. W., & Pearson, N. A. (1991). Impairment of the visuo-spatial sketch pad. Quarterly Journal of Experimental Psychology Section A, 43 , 101-125. https://doi.org/10.1080/14640749108401001

Hitch, G. J., & McAuley, E. (1991). Working memory in children with specific arithmetical learning difficulties. British Journal of Psychology, 82 , 375-386.

Hitch, G. J., Towse, J. N., & Hutton, U. (2001). What limits children’s working memory span? Theoretical accounts and applications for scholastic development. Journal of Experimental Psychology: General, 130 , 184-188.

Hu, Y., Allen, R. J., Baddeley, A. D., & Hitch, G. J. (2016). Executive control of stimulus-driven and goal-directed attention in visual working memory. Attention, Perception & Psychophysics, 78 , 2164-2175. https://doi.org/10.3758/s13414-016-1106-7

Hu, Y., Hitch, G. J., Baddeley, A. D., Zhang, M., & Allen, R. J. (2014). Executive and perceptual attention play different roles in visual working memory: Evidence from suffix and strategy effects. Journal of Experimental Psychology: Human Perception and Performance, 40 , 1665-1678.

Hull, C. L. (1943). The principles of behaviour . New York: Appleton-Century.

Hulme, C., Surprenant, A. M., Bireta, T. J., Stuart, G., & Neath, I. (2004). Abolishing the word-length effect. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 98-106.

Hurlstone, M. J., & Hitch, G. J. (2015). How is the serial order of a spatial sequence represented? Insights from transposition latencies. Journal of Experimental Psychology: Learning, Memory, and Cognition, 42, 295-324.

Hurlstone, M. J., & Hitch, G. J. (2018). How is the serial order of a visual sequence represented? Insights from transposition latencies. Journal of Experimental Psychology: Learning, Memory, and Cognition, 44, 167-192.

Hurlstone, M. J., Hitch, G. J., & Baddeley, A. D. (2013). Memory for serial order across domains: An overview of the literature and directions for future research. Psychological Bulletin. Advance online publication. https://doi.org/10.1037/a0034221

Hyde, T. S., & Jenkins, J. J. (1969). Differential effects of incidental tasks on the organization of recall of a list of highly associated words. Journal of Experimental Psychology, 82 , 472-481. https://doi.org/10.1037/h0028372

Isola, P., Xiao, J., Parikh, D., Torralba, A., & Oliva, A. (2014). What makes a photograph memorable? IEEE Transactions on Pattern Analysis and Machine Intelligence, 36, 1469-1482.

Jarrold, C., Baddeley, A. D., & Hewes, A. K. (1999). Genetically dissociated components of working memory: Evidence from Down’s and Williams syndrome. Neuropsychologia, 37 , 637-651.

Jarrold, C., Hewes, A., & Baddeley, A. D. (2000). Do two separate speech measures constrain verbal short-term memory in children? Journal of Experimental Psychology, 26 , 1626-1637.

Jeneson, A., & Squire, L. R. (2012). Working memory, long-term memory, and medial temporal lobe function. Learning & Memory, 19, 15-25.

Jones, G., & Macken, B. (2015). Questioning short-term memory and its measurement: Why digit span measures long-term associative learning. Cognition, 144, 1-13.

Keppel, G., & Underwood, B. J. (1962). Proactive inhibition in short-term retention of single items. Journal of Verbal Learning & Verbal Behavior, 1 , 153-161.

Klauer, K. C., & Zhao, Z. (2004). Double dissociations in visual and spatial short-term memory. Journal of Experimental Psychology: General, 133 , 355-381.

Kuhn, T. (1962). The structure of scientific revolutions. Chicago: University of Chicago Press.

Kyllonen, P. C., & Christal, R. E. (1990). Reasoning ability is (little more than) working memory capacity. Intelligence, 14 , 389-433.

Lakatos, I. (1976). Proofs and refutations. Cambridge: Cambridge University Press.

Lashley, K. S. (1951). The problem of serial order in behavior. In L. A. Jeffress (Ed.), Cerebral mechanisms in behavior: The Hixon symposium . New York: John Wiley.

Lavie, N., Hirst, A., de Fockert, J. W., & Viding, E. (2004). Load theory of selective attention and cognitive control. Journal of Experimental Psychology: General, 133 , 339-354.

Lehman, M., & Malmberg, K. J. (2013). A buffer model of memory encoding and temporal correlations in retrieval. Psychological Review, 120 , 155-189.

Lewis-Peacock, J. A., & Postle, B. R. (2008). Temporary activation of long-term memory supports working memory. The Journal of Neuroscience, 28, 8765-8771.

Linck, J. A., Osthus, P., Koeth, J. T., & Bunting, M. F. (2014). Working memory and second language comprehension and production: A meta-analysis. Psychonomic Bulletin & Review, 21 , 861-883. https://doi.org/10.3758/s13423-013-0565-2

Logie, R. H. (2018, in press). Scientific advance and theory integration in working memory: Commentary on Oberauer et al. (2018), Benchmarks for models of short-term and working memory. Psychological Bulletin. Advance online publication.

Logie, R. H., Cocchini, G., Della Sala, S., & Baddeley, A. (2004). Is there a specific capacity for dual task co-ordination? Evidence from Alzheimer’s Disease. Neuropsychology, 18 (3), 504-513.

Lovatt, P., Avons, S. E., & Masterson, J. (2000). The word length effect and disyllabic words. Quarterly Journal of Experimental Psychology, 53A (1), 1-22.

Ma, W. J., Husain, M., & Bays, P. M. (2014). Changing concepts of working memory. Nature Neuroscience, 17, 347-356. https://doi.org/10.1038/nn.3655

Mattys, S., Baddeley, A. D., & Trenkic, D. (2018). Is the superior verbal memory span of Mandarin speakers due to faster rehearsal? Memory & Cognition, 46, 361-369. https://doi.org/10.3758/s13421-017-0770-8

Milner, B. (1966). Amnesia following operation on the temporal lobes. In C. W. M. Whitty & O. L. Zangwill (Eds.), Amnesia (pp. 109-133). London: Butterworths.

Miyake, A., Friedman, N. P., Emerson, M. J., Witzki, A. H., Howerter, A., & Wager, T. D. (2000). The unity and diversity of executive functions and their contributions to complex "frontal lobe" tasks: A latent variable analysis. Cognitive Psychology, 41 , 49-100.

Morey, C. (2018). The case against specialized visual-spatial short-term memory. Psychological Bulletin, 144 , 849-883.

Morris, R. G. (1984). Dementia and the functioning of the articulatory loop system. Cognitive Neuropsychology, 1 , 143-157.

Mueller, S. T., Seymour, T. L., Kieras, D. E., & Meyer, D. E. (2003). Theoretical implications of articulatory duration, phonological similarity, and phonological complexity in verbal working memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29(6), 1353-1380.

Murdock, B. B. (1967). Recent developments in short-term memory. British Journal of Psychology, 58, 421-433.

Murray, D. J. (1968). Articulation and acoustic confusability in short-term memory. Journal of Experimental Psychology, 78 , 679-684.

Nairne, J. S. (2002). Remembering over the short-term: The case against the standard model. Annual Review of Psychology, 53 , 53-81.

Norris, D. (2017). Short-term memory and long-term memory are still different. Psychological Bulletin, 143 , 992-1009.

Norris, D., Butterfield, S., Hall, J., & Page, M. P. A. (2018). Phonological recoding under articulatory suppression. Memory & Cognition, 46 , 173-180.

Oberauer, K., Lewandowsky, S., Awh, E., Brown, G. D. A., Conway, A., Cowan, N., ... Ward, G. (2018). Benchmarks for models of short-term and working memory. Psychological Bulletin, 144, 885-958. https://doi.org/10.1037/bul0000153

Öztekin, I., Davachi, L., & McElree, B. (2010). Are representations in working memory distinct from representations in long-term memory? Neural evidence in support of a single store. Psychological Science, 21, 1123-1133.

Page, M. P. A., & Norris, D. (1998). The primacy model: A new model of immediate serial recall. Psychological Review, 105 , 761-781.

Page, M. P. A., & Norris, D. (2009). A model linking immediate serial recall, the Hebb repetition effect and the learning of phonological word forms. Philosophical Transactions of the Royal Society B: Biological Sciences, 364, 3737-3753.

Paivio, A. (1971). Imagery and verbal processes . London: Holt Rinehart and Winston.

Parra, M. A., Abrahams, S., Logie, R. H., Mendez, L. G., Lopera, F., & Della Sala, S. (2010). Visual short-term memory binding deficits in familial Alzheimer's disease. Brain, 133, 2702-2713. https://doi.org/10.1093/brain/awq148

Peterson, L. R., & Peterson, M. J. (1959). Short-term retention of individual verbal items. Journal of Experimental Psychology, 58 , 193-198.

Phillips, W. A., & Baddeley, A. D. (1971). Reaction time and short-term visual memory. Psychonomic Science, 22 , 73-74.

Pickering, S. J. (Ed.) (2006). Working memory and education . London: Elsevier Press.

Popper, K. (1959). The logic of scientific discovery. London: Hutchinson.

Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32, 3-25. https://doi.org/10.1080/00335558008248231

Raaijmakers, J. G. W., & Shiffrin, R. M. (1981). Search of associative memory. Psychological Review, 88 , 93-134.

Salame, P., & Baddeley, A. D. (1986). Phonological factors in STM: Similarity and the unattended speech effect. Bulletin of the Psychonomic Society, 24 , 263-265.

Schulze, K., Vargha-Khadem, F., & Mishkin, M. (2018). Phonological working memory and FOXP2. Neuropsychologia, 108, 147-152. https://doi.org/10.1016/j.neuropsychologia.2017.11.027

Shallice, T., & Warrington, E. K. (1970). Independent functioning of verbal memory stores: A neuropsychological study. Quarterly Journal of Experimental Psychology, 22 , 261-273.

Shankweiler, D., Liberman, I. Y., Mark, L. S., Fowler, C. A., & Fischer, F. W. (1979). The speech code and learning to read. Journal of Experimental Psychology: Human Learning and Memory, 5, 531-545.

Sokal, A. (1996). A physicist experiments with cultural studies. Lingua Franca, 6, 62-64.

Swanson, H. L., & Berninger, V. (1995). The role of working memory in skilled and less skilled readers’ comprehension. Intelligence, 21, 83-108.

Toulmin, S. (1953). The philosophy of science. London: Hutchinson.

Ueno, T., Allen, R. J., Baddeley, A. D., Hitch, G. J., & Saito, S. (2011). Disruption of visual feature binding in working memory. Memory & Cognition, 39 , 12-23.

Vallar, G., & Baddeley, A. D. (1984). Fractionation of working memory: Neuropsychological evidence for a phonological short-term store. Journal of Verbal Learning & Verbal Behavior, 23 , 151-161.

Vallar, G., & Shallice, T. (Eds.). (1990). Neuropsychological impairments of short-term memory . Cambridge: Cambridge University Press.

Van der Stigchel, S., & Hollingworth, A. (2018). Visuo-spatial working memory as a fundamental component of the eye movement system. Current Directions in Psychological Science, 27 , 136-143.

Wang, P. P., & Bellugi, U. (1994). Evidence from two genetic syndromes for a dissociation between verbal and visual-spatial short-term memory. Journal of Clinical and Experimental Neuropsychology, 16 , 317-322.

Waugh, N. C., & Norman, D. A. (1965). Primary memory. Psychological Review, 72 , 89-104.

Wen, Z., Mota, M. B., & McNeill, A. (Eds.). (2015). Working memory in second language acquisition and processing . Bristol: Multilingual Matters.

Yantis, S. (2000). Goal directed and stimulus driven determinants of attentional control. In S. Monsell & J. Driver (Eds.), Control of cognitive processes: Attention and performance (Vol. XVIII, pp. 73-103). Cambridge, MA: MIT Press.

Zhang, W., & Luck, S. J. (2008). Discrete fixed-resolution representations in visual working memory. Nature, 453 , 233–235.

Acknowledgements

We are grateful to Nelson Cowan, David Huber and Ian Neath for their helpful comments on an earlier draft.

Author information

Authors and affiliations.

Department of Psychology, University of York, Heslington, York, YO10 5DD, UK

Alan D. Baddeley & Graham J. Hitch

School of Psychology, University of Leeds, Leeds, UK

Richard J. Allen

Corresponding author

Correspondence to Alan D. Baddeley .

About this article

Baddeley, A.D., Hitch, G.J. & Allen, R.J. From short-term store to multicomponent working memory: The role of the modal model. Mem Cogn 47 , 575–588 (2019). https://doi.org/10.3758/s13421-018-0878-5

Published : 26 November 2018

Issue Date : 15 May 2019

DOI : https://doi.org/10.3758/s13421-018-0878-5

  • Short-term memory
  • Working memory
  • Modal model
  • Long-term memory
  • Philosophy of science

Example essay: Contrast two models of memory

Travis Dixon November 9, 2021 Cognitive Psychology , Revision and Exam Preparation

Of the command terms for IB Psychology essays, “contrast” is the hardest to write. Here is an example essay that contrasts two models of memory. Please note – this essay is not written with the intention that you will memorize it. That is a highly inefficient way to study. It’s written so you can get ideas on how to  structure  a contrast essay.

Travis Dixon

Travis Dixon is an IB Psychology teacher, author, workshop leader, examiner and IA moderator.

IBDP Psychology

Website by John Crane

Updated 30 April 2024

Types of memory

Researchers distinguish between different types of memory. This is important because it appears that different types of memory may be stored in different parts of the brain.

Declarative memory (“knowing what”) is the memory of facts and events and refers to those memories that can be consciously recalled. There are two subsets of declarative memory:

Episodic memory contains the memory of specific events that have occurred at a given time and in a given place.

Semantic memory contains general knowledge of facts and people, for example, concepts and schemas, and it is not linked to time and place.

Procedural memory (“knowing how”) is the unconscious memory of skills and how to do things.

The Multi-store Model

Atkinson and Shiffrin (1968) were among the first to suggest a basic structure of memory with their Multi-store Model (MSM) of memory. Although this model seems rather simplistic today, it sparked much research based on the idea that humans are information processors.

The Multi-store model was suggested in the 1960s and is clearly inspired by computer science. The model is based on a number of assumptions. First, the model argues that memory consists of a number of separate locations in which information is stored. Second, those memory processes are sequential. Third, each memory store operates in a single, uniform way. In this model, short-term memory (STM) serves as a gateway by which information can gain access to long-term memory. The various memory stores are seen as components that operate in conjunction with the permanent memory store (LTM) through processes such as attention, coding, and rehearsal. You need to pay attention to something in order to remember information. According to this model, rehearsal is vital to keeping material active in STM by repeating it until it can be stored in LTM.

The model suggests that sensory information from the world enters sensory memory, which is modality specific, that is, related to different senses such as hearing and vision. The most important stores in the model are the visual store (iconic memory) and the auditory store (echoic memory). Information stays in the sensory store for a few seconds, and only a very small amount of it continues into the short-term memory (STM) store.

The capacity of STM has traditionally been assumed to be limited to around seven items (7 +/- 2) and its duration is normally about 6–18 seconds. With rehearsal, information may stay in STM for up to 30 seconds. Information in STM is quickly lost if not rehearsed. Information may also be displaced from STM by new information. For example, imagine that you are rehearsing a phone number to order a pizza when someone calls out your name. When your attention is taken away from the information in your STM, it is displaced and no longer available. The rehearsal of material in STM plays a key role in determining what is stored in long-term memory in the multi-store model of memory.
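The capacity limit and displacement described above can be sketched as a toy buffer. This is only an illustration, not part of Atkinson and Shiffrin's model: the class name `ToyMultiStore`, the default capacity of 7, and the rule that a single rehearsal copies an item to LTM are all simplifying assumptions.

```python
from collections import deque


class ToyMultiStore:
    """Toy STM/LTM pair: STM has a fixed capacity, LTM is unlimited."""

    def __init__(self, stm_capacity=7):
        # A deque with maxlen silently drops its oldest entry when full,
        # which stands in for displacement by new information.
        self.stm = deque(maxlen=stm_capacity)
        self.ltm = set()

    def attend(self, item):
        """Attended information enters STM."""
        self.stm.append(item)

    def rehearse(self, item):
        """Rehearsal only works on items that are still in STM."""
        if item in self.stm:
            self.ltm.add(item)
            return True
        return False


store = ToyMultiStore()
for position in range(1, 8):
    store.attend(f"digit{position}")   # a 7-digit phone number fills STM

store.attend("someone calls your name")  # new input displaces the oldest item
lost = store.rehearse("digit1")          # False: digit1 is no longer in STM
kept = store.rehearse("digit2")          # True: digit2 is copied to LTM
```

In this sketch the interruption costs exactly one digit; real forgetting is not this tidy, which is one reason the model is described as simplistic.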

Miller's Magic Number 7 (1956)

After running tests to see how many numbers an individual can recall in a sequence of numbers, Miller (1956) proposed the "Magic Number 7" - plus or minus two.  According to Miller, the average memory span is between 5 and 9 items. Think about numbers that we are asked to remember - zip codes, passport numbers, social security numbers, telephone numbers - and you will see that they fall between 5 and 9 numbers. 

Numbers are one thing, but is all information the same?  Does it all fit in these 9 "slots?"

  • 3 1 9 0 2 5

And so on.  But Cowan argues that this type of task sets the participant up to employ "processing strategies" that do not reflect how we actually use our short-term memory on a day-to-day basis. 

Instead, Cowan used a "running span procedure": participants listened to a list of numbers without knowing in advance how long the list would be. He found that participants recalled 3 to 5 digits, not 5 to 9.

This is a good example of the problem of using artificial procedures in laboratory experiments. The original research by Miller had low ecological validity, and today's research challenges the belief that STM can hold up to 9 digits. How many of you cannot remember your telephone number?

The long-term memory (LTM) store is conceptualized as a vast storehouse of information. This storehouse is believed to be of indefinite duration and potentially unlimited capacity, although psychologists do not know exactly how much information can be stored there. The material is not an exact replica of events or facts but is stored in some outline form. Memories may be distorted when they are retrieved because we fill in the gaps to create a meaningful memory. This is exactly what is predicted by schema theory.

Evidence in support of the model

In the biological chapter, we looked at the case study of HM (Milner, 1966). This is one example of biological evidence that STM and LTM are separate stores in the brain. HM had anterograde amnesia: he could not transfer new information to long-term memory, although he still had access to many of the memories formed before his surgery. At the same time, the fact that he could create new procedural memories shows that memory may be more complex than the MSM predicts.

Glanzer and Cunitz (1966) used free recall of 15-item lists combined with an interference task to show that two processes are involved in retrieving information. The researchers presented fifteen lists of 15 words, one word at a time. They used a repeated measures design in which participants were asked to recall the words with no delay, after a 10-second delay, or after a 30-second delay. With no delay, the first five and last three words were recalled best. With a 10- or 30-second delay, during which the participant counted backward, recall of the words at the beginning of the list was little affected, but recall of the later items was poor. This suggests that the later words were held in short-term storage and were lost due to interference, whereas the earlier words had been passed to long-term storage. Better recall of words at the beginning of the list, because they have already been transferred to long-term memory, is called the primacy effect. Better recall of the words presented last, because they are still in short-term memory, is called the recency effect.
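A serial position curve of the kind this study relies on can be computed from free-recall data as in the minimal sketch below. The function name `serial_position_curve` and the recall data are invented purely for illustration; they are not Glanzer and Cunitz's materials or results.

```python
def serial_position_curve(trials, list_length):
    """Return, for each list position, the proportion of trials recalling it."""
    counts = [0] * list_length
    for recalled_positions in trials:
        for position in set(recalled_positions):
            counts[position] += 1
    return [count / len(trials) for count in counts]


# Made-up immediate-recall data: each set holds the positions (0-14) of the
# words one participant recalled from a 15-word list. Illustrative only.
immediate_recall = [
    {0, 1, 2, 7, 12, 13, 14},
    {0, 1, 3, 13, 14},
    {0, 2, 4, 12, 13, 14},
    {0, 1, 2, 9, 13, 14},
]

curve = serial_position_curve(immediate_recall, list_length=15)
primacy = sum(curve[:5]) / 5    # first five words
middle = sum(curve[5:12]) / 7   # middle seven words
recency = sum(curve[-3:]) / 3   # last three words
```

With data shaped like this, `primacy` and `recency` both exceed `middle`, giving the U-shaped curve. In the delay conditions described above, only the recency advantage would be expected to shrink.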

Evaluation of the multi-store model

Today the multi-store model is considered to be too simplistic. It reflects the knowledge available in the 1960s but it is an important model all the same because it has influenced our understanding of memory.  First of all, it presents a good account of the basic mechanisms in memory processes (encoding, storage, and retrieval). Secondly, several experiments support the assumption of multiple memory stores. There is also supporting evidence from case studies of patients with brain damage, such as HM suffering from amnesia, who have impaired long-term memory but intact short-term memory. This clearly points toward multiple memory stores. 

The assumption that STM is simply a gateway to LTM has been challenged by Logie (1999). He argues that information in STM is not simply passed into LTM through rehearsal. Instead, there must be an interaction between STM and LTM in which the information is interpreted with regard to previously stored knowledge and past experience. Short-term memory is therefore not part of a sequential system but rather a 'workstation' that handles and computes information coming from the sensory store together with knowledge already stored in LTM. This is also what schema theory would predict.

Strengths and limitations of the Multi-Store Model

Strengths of the MSM

  • There is significant research to support the theory of separate memory stores - both in experimental research and biological case studies of patients with brain damage.
  • The model is of historical importance. It gave psychologists a way to talk about memory and much of the research which followed was based on this model.

Limitations of the MSM

  • The model is over-simplified. It assumes that each of the stores works as an independent unit.
  • The model does not explain memory distortion.
  • The model does not explain why some things may be learned with a minimal amount of rehearsal. For example, once bitten by a dog, that memory is quite vivid in spite of the lack of rehearsal.
  • There are several times that we rehearse a lot to remember information and it is not transferred to LTM.

Checking for understanding

Which is true about Short Term Memory?

  • It is limited in capacity, but limitless in duration.
  • It is limited in both capacity and duration.
  • It is limitless in capacity, but limited in duration.
  • It is limitless in both capacity and duration.

According to Atkinson and Shiffrin, the first step in placing information into memory storage is

  • Reconstruction
  • Short-term memory
  • Sensory memory

Which of the following determines whether information moves from sensory memory to short-term memory?

  • The size of the information
  • Selective attention
  • The modality of the stimulus - whether it is auditory or visual

Remembering the first and last items of a list better than the items in the middle of a list can be explained by

  • Rehearsal of the early words - and the fact that the last words are still in STM.
  • Displacement of information from the STM store.
  • Brain damage
  • The limit of only 7 plus or minus 2 items in STM.

Which of the following is not a limitation of the MSM?

  • It does not explain the role of emotion in memory.
  • It is not supported by biological evidence.
  • It is an overly simplistic explanation of memory.
  • It does not explain how memory can be distorted.

What is the key difference between Miller's (1956) and Cowan's (2010) research on STM capacity?

  • Cowan did not let the participants know the length of the list before he read it to them.
  • Cowan made use of fMRIs to watch how many pieces of data were put into memory.
  • Cowan found that STM capacity was greater than Miller predicted.
  • Cowan gave explicit directions not to chunk the data.

Cowan made use of a "running span procedure", which meant that participants were less likely to develop memorization strategies as they approached the task. He called these "processing strategies."

Long-Term Memory in Psychology: Types, Capacity & Duration

Saul Mcleod, PhD

Editor-in-Chief for Simply Psychology

BSc (Hons) Psychology, MRes, PhD, University of Manchester

Saul Mcleod, PhD., is a qualified psychology teacher with over 18 years of experience in further and higher education. He has been published in peer-reviewed journals, including the Journal of Clinical Psychology.

Olivia Guy-Evans, MSc

Associate Editor for Simply Psychology

BSc (Hons) Psychology, MSc Psychology of Education

Olivia Guy-Evans is a writer and associate editor for Simply Psychology. She has previously worked in healthcare and educational sectors.

Long-term memory (LTM) is the final stage of the multi-store memory model proposed by Atkinson and Shiffrin, providing the lasting retention of information and skills.

Theoretically, long-term memory capacity could be unlimited, the main constraint on recall being accessibility rather than availability.

Duration might be a few minutes or a lifetime.  Suggested encoding modes are semantic (meaning) and visual (pictorial) in the main but can be acoustic also.

Using the computer analogy, the information in your LTM would be like the information you have saved on the hard drive. It isn’t there on your desktop (your short-term memory), but you can pull up this information when you want it, at least most of the time.

Types of Long-Term Memory

Long-term memory is not a single store and is divided into two types: explicit (knowing that) and implicit (knowing how).

One of the earliest and most influential distinctions of long-term memory was proposed by Tulving (1972).  He proposed a distinction between episodic, semantic, and procedural memory.

Procedural Memory

Procedural memory is a part of the implicit long-term memory responsible for knowing how to do things, i.e., memory of motor skills.

It does not involve conscious thought (i.e., it is unconscious and automatic) and is not declarative. For example, procedural memory would involve knowledge of how to ride a bicycle.

Semantic Memory

Semantic memory is a part of the explicit long-term memory responsible for storing information about the world.  This includes knowledge about the meaning of words, as well as general knowledge.

For example, London is the capital of England. It involves conscious thought and is declarative.

The knowledge that we hold in semantic memory focuses on “knowing that” something is the case (i.e. declarative).  For example, we might have a semantic memory for knowing that Paris is the capital of France.

Episodic Memory

Episodic memory is a part of the explicit long-term memory responsible for storing information about events (i.e. episodes) that we have experienced in our lives.

It involves conscious thought and is declarative. An example would be a memory of our first day at school.

The knowledge that we hold in episodic memory focuses on “knowing that” something is the case (i.e. declarative).  For example, we might have an episodic memory of knowing that we caught the bus to college today.

Cohen and Squire (1980) drew a distinction between declarative knowledge and procedural knowledge.

Procedural knowledge involves “knowing how” to do things. It includes skills such as playing the piano, riding a bike, tying your shoes, and other motor skills.

It does not involve conscious thought (i.e. it’s unconscious – automatic).  For example, we brush our teeth with little or no awareness of the skills involved.

Declarative knowledge involves “knowing that”, for example London is the capital of England, zebras are animals, your mum’s birthday etc.

Recalling information from declarative memory involves some degree of conscious effort – information is consciously brought to mind and “declared”.

Evidence for the distinction between declarative and procedural memory has come from research on patients with amnesia. Typically, amnesic patients have great difficulty retaining episodic and semantic information following the onset of amnesia.

Their memory for events and knowledge acquired before the onset of the condition tends to remain intact, but they can’t store new episodic or semantic memories. In other words, it appears that their ability to retain declarative information is impaired.

However, their procedural memory appears to be largely unaffected. They can recall skills they have already learned (e.g. riding a bike) and acquire new skills (e.g. learning to drive).

Bahrick et al. (1975)

Bahrick, Bahrick, and Wittlinger (1975) investigated what they called very long-term memory (VLTM). Nearly 400 participants aged 17–74 were tested.

Participants were asked to list the names they could remember of those in their graduating class in a free recall test.

There were various conditions including: a free recall test, where participants tried to remember names of people in a graduate class; a photo recognition test, consisting of 50 pictures; a name recognition test for ex-school friends.

Results of the study showed that participants who were tested within 15 years of graduation were about 90% accurate in identifying names and faces. After 48 years, accuracy was about 80% for verbal (name) recognition and 70% for visual (photo) recognition.

Participants were better at photo recognition than free recall. Free recall was about 60% accurate after 15 years and 30% accurate after 48 years.

They concluded that long-term memory has a potentially unlimited duration.

A strength of this study is that it used meaningful stimuli. Bahrick et al. tested people’s memories from their own lives by using high school yearbooks. The study has higher external validity when compared to studies using meaningless pictures (where recall rates tend to be lower).

However, the study did not control for confounding variables (participants may have rehearsed their memory of the photos over the years), so any real-world application should be made with caution.

Bahrick, H. P., Bahrick, P. O., & Wittlinger, R. P. (1975). Fifty years of memory for names and faces: A cross-sectional approach. Journal of Experimental Psychology: General, 104, 54–75.

Cohen, N. J., & Squire, L. R. (1980). Preserved learning and retention of pattern analyzing skill in amnesia: Dissociation of knowing how and knowing that. Science, 210, 207–209.

Tulving, E. (1972). Episodic and semantic memory. In E. Tulving & W. Donaldson (Eds.), Organization of Memory (pp. 381–403). New York: Academic Press.

  • Today, we’re introducing Meta Llama 3, the next generation of our state-of-the-art open source large language model.
  • Llama 3 models will soon be available on AWS, Databricks, Google Cloud, Hugging Face, Kaggle, IBM WatsonX, Microsoft Azure, NVIDIA NIM, and Snowflake, and with support from hardware platforms offered by AMD, AWS, Dell, Intel, NVIDIA, and Qualcomm.
  • We’re dedicated to developing Llama 3 in a responsible way, and we’re offering various resources to help others use it responsibly as well. This includes introducing new trust and safety tools with Llama Guard 2, Code Shield, and CyberSec Eval 2.
  • In the coming months, we expect to introduce new capabilities, longer context windows, additional model sizes, and enhanced performance, and we’ll share the Llama 3 research paper.
  • Meta AI, built with Llama 3 technology, is now one of the world’s leading AI assistants that can boost your intelligence and lighten your load—helping you learn, get things done, create content, and connect to make the most out of every moment. You can try Meta AI here .

Today, we’re excited to share the first two models of the next generation of Llama, Meta Llama 3, available for broad use. This release features pretrained and instruction-fine-tuned language models with 8B and 70B parameters that can support a broad range of use cases. This next generation of Llama demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning. We believe these are the best open source models of their class, period. In support of our longstanding open approach, we’re putting Llama 3 in the hands of the community. We want to kickstart the next wave of innovation in AI across the stack—from applications to developer tools to evals to inference optimizations and more. We can’t wait to see what you build and look forward to your feedback.

Our goals for Llama 3

With Llama 3, we set out to build the best open models that are on par with the best proprietary models available today. We wanted to address developer feedback to increase the overall helpfulness of Llama 3 and are doing so while continuing to play a leading role on responsible use and deployment of LLMs. We are embracing the open source ethos of releasing early and often to enable the community to get access to these models while they are still in development. The text-based models we are releasing today are the first in the Llama 3 collection of models. Our goal in the near future is to make Llama 3 multilingual and multimodal, have longer context, and continue to improve overall performance across core LLM capabilities such as reasoning and coding.

State-of-the-art performance

Our new 8B and 70B parameter Llama 3 models are a major leap over Llama 2 and establish a new state-of-the-art for LLM models at those scales. Thanks to improvements in pretraining and post-training, our pretrained and instruction-fine-tuned models are the best models existing today at the 8B and 70B parameter scale. Improvements in our post-training procedures substantially reduced false refusal rates, improved alignment, and increased diversity in model responses. We also saw greatly improved capabilities like reasoning, code generation, and instruction following making Llama 3 more steerable.

*Please see evaluation details for setting and parameters with which these evaluations are calculated.

In the development of Llama 3, we looked at model performance on standard benchmarks and also sought to optimize for performance for real-world scenarios. To this end, we developed a new high-quality human evaluation set. This evaluation set contains 1,800 prompts that cover 12 key use cases: asking for advice, brainstorming, classification, closed question answering, coding, creative writing, extraction, inhabiting a character/persona, open question answering, reasoning, rewriting, and summarization. To prevent accidental overfitting of our models on this evaluation set, even our own modeling teams do not have access to it. The chart below shows aggregated results of our human evaluations across these categories and prompts against Claude Sonnet, Mistral Medium, and GPT-3.5.

Preference rankings by human annotators based on this evaluation set highlight the strong performance of our 70B instruction-following model compared to competing models of comparable size in real-world scenarios.

Our pretrained model also establishes a new state-of-the-art for LLM models at those scales.

To develop a great language model, we believe it’s important to innovate, scale, and optimize for simplicity. We adopted this design philosophy throughout the Llama 3 project with a focus on four key ingredients: the model architecture, the pretraining data, scaling up pretraining, and instruction fine-tuning.

Model architecture

In line with our design philosophy, we opted for a relatively standard decoder-only transformer architecture in Llama 3. Compared to Llama 2, we made several key improvements. Llama 3 uses a tokenizer with a vocabulary of 128K tokens that encodes language much more efficiently, which leads to substantially improved model performance. To improve the inference efficiency of Llama 3 models, we’ve adopted grouped query attention (GQA) across both the 8B and 70B sizes. We trained the models on sequences of 8,192 tokens, using a mask to ensure self-attention does not cross document boundaries.
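The post does not say how the document-boundary mask is built. One common way to express the idea, sketched here under the assumption that each token in a packed training sequence carries a document id, is to combine the usual causal mask with a same-document check. The function name `document_causal_mask` and the `doc_ids` input are hypothetical.

```python
def document_causal_mask(doc_ids):
    """Build a boolean attention mask for a packed sequence.

    mask[i][j] is True when token i may attend to token j, i.e. when j is
    not in the future (causal) and both tokens belong to the same document.
    """
    n = len(doc_ids)
    return [
        [j <= i and doc_ids[j] == doc_ids[i] for j in range(n)]
        for i in range(n)
    ]


# Two documents packed into one 5-token sequence: three tokens, then two.
doc_ids = [0, 0, 0, 1, 1]
mask = document_causal_mask(doc_ids)
```

Here the first token of the second document (row 3) can attend only to itself, so nothing from the first document leaks across the boundary. A real implementation would build this as a tensor rather than nested lists.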

Training data

To train the best language model, the curation of a large, high-quality training dataset is paramount. In line with our design principles, we invested heavily in pretraining data. Llama 3 is pretrained on over 15T tokens that were all collected from publicly available sources. Our training dataset is seven times larger than that used for Llama 2, and it includes four times more code. To prepare for upcoming multilingual use cases, over 5% of the Llama 3 pretraining dataset consists of high-quality non-English data that covers over 30 languages. However, we do not expect the same level of performance in these languages as in English.

To ensure Llama 3 is trained on data of the highest quality, we developed a series of data-filtering pipelines. These pipelines include using heuristic filters, NSFW filters, semantic deduplication approaches, and text classifiers to predict data quality. We found that previous generations of Llama are surprisingly good at identifying high-quality data, hence we used Llama 2 to generate the training data for the text-quality classifiers that are powering Llama 3.

We also performed extensive experiments to evaluate the best ways of mixing data from different sources in our final pretraining dataset. These experiments enabled us to select a data mix that ensures that Llama 3 performs well across use cases including trivia questions, STEM, coding, historical knowledge, etc.

Scaling up pretraining

To effectively leverage our pretraining data in Llama 3 models, we put substantial effort into scaling up pretraining. Specifically, we have developed a series of detailed scaling laws for downstream benchmark evaluations. These scaling laws enable us to select an optimal data mix and to make informed decisions on how to best use our training compute. Importantly, scaling laws allow us to predict the performance of our largest models on key tasks (for example, code generation as evaluated on the HumanEval benchmark—see above) before we actually train the models. This helps us ensure strong performance of our final models across a variety of use cases and capabilities.

We made several new observations on scaling behavior during the development of Llama 3. For example, while the Chinchilla-optimal amount of training compute for an 8B parameter model corresponds to ~200B tokens, we found that model performance continues to improve even after the model is trained on two orders of magnitude more data. Both our 8B and 70B parameter models continued to improve log-linearly after we trained them on up to 15T tokens. Larger models can match the performance of these smaller models with less training compute, but smaller models are generally preferred because they are much more efficient during inference.
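The figures quoted above can be checked with a few lines of arithmetic. The sketch uses only the numbers stated in the paragraph (8B parameters, ~200B Chinchilla-optimal tokens, 15T training tokens); the variable names are illustrative.

```python
import math

params = 8_000_000_000                # 8B parameter model
chinchilla_tokens = 200_000_000_000   # ~200B tokens, as quoted in the text
trained_tokens = 15_000_000_000_000   # 15T tokens, as quoted in the text

# Tokens per parameter implied by the quoted Chinchilla-optimal figure.
tokens_per_param = chinchilla_tokens / params

# How far past the Chinchilla-optimal point training continued.
overtraining_ratio = trained_tokens / chinchilla_tokens
orders_of_magnitude = math.log10(overtraining_ratio)
```

The ratio works out to 75 times the Chinchilla-optimal token count, a little under two full orders of magnitude, which is consistent with the rounded wording in the text.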

To train our largest Llama 3 models, we combined three types of parallelization: data parallelization, model parallelization, and pipeline parallelization. Our most efficient implementation achieves a compute utilization of over 400 TFLOPS per GPU when trained on 16K GPUs simultaneously. We performed training runs on two custom-built 24K GPU clusters. To maximize GPU uptime, we developed an advanced new training stack that automates error detection, handling, and maintenance. We also greatly improved our hardware reliability and detection mechanisms for silent data corruption, and we developed new scalable storage systems that reduce overheads of checkpointing and rollback. Those improvements resulted in an overall effective training time of more than 95%. Combined, these improvements made Llama 3 training about three times more efficient than Llama 2 training.

Instruction fine-tuning

To fully unlock the potential of our pretrained models in chat use cases, we innovated on our approach to instruction-tuning as well. Our approach to post-training is a combination of supervised fine-tuning (SFT), rejection sampling, proximal policy optimization (PPO), and direct preference optimization (DPO). The quality of the prompts that are used in SFT and the preference rankings that are used in PPO and DPO has an outsized influence on the performance of aligned models. Some of our biggest improvements in model quality came from carefully curating this data and performing multiple rounds of quality assurance on annotations provided by human annotators.

Learning from preference rankings via PPO and DPO also greatly improved the performance of Llama 3 on reasoning and coding tasks. We found that if you ask a model a reasoning question that it struggles to answer, the model will sometimes produce the right reasoning trace: The model knows how to produce the right answer, but it does not know how to select it. Training on preference rankings enables the model to learn how to select it.
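The post names DPO but does not give its training objective. As a rough illustration of how a preference ranking becomes a training signal, the sketch below implements the standard per-pair DPO loss from the published DPO method, not Meta's actual training code. The function name, the `beta` value, and the example log-probabilities are assumptions chosen for illustration.

```python
import math


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Per-pair DPO loss from sequence log-probabilities.

    Each argument is the log-probability of a full response (chosen or
    rejected by the annotator) under the policy being trained or under a
    frozen reference model. The loss is -log(sigmoid(beta * margin)).
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    x = beta * margin
    # Numerically stable form of -log(sigmoid(x)) = log(1 + exp(-x)).
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


# Hypothetical log-probabilities, for illustration only.
no_preference = dpo_loss(-2.0, -2.0, -2.0, -2.0)    # policy equals reference
prefers_chosen = dpo_loss(-1.0, -3.0, -2.0, -2.0)   # policy favors the chosen answer
prefers_rejected = dpo_loss(-3.0, -1.0, -2.0, -2.0) # policy favors the rejected answer
```

The loss falls as the policy raises the chosen response's likelihood relative to the rejected one, which matches the idea above of teaching the model to select an answer it can already produce.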

Building with Llama 3

Our vision is to enable developers to customize Llama 3 to support relevant use cases and to make it easier to adopt best practices and improve the open ecosystem. With this release, we’re providing new trust and safety tools including updated components with both Llama Guard 2 and Cybersec Eval 2, and the introduction of Code Shield—an inference time guardrail for filtering insecure code produced by LLMs.

We’ve also co-developed Llama 3 with torchtune, the new PyTorch-native library for easily authoring, fine-tuning, and experimenting with LLMs. torchtune provides memory efficient and hackable training recipes written entirely in PyTorch. The library is integrated with popular platforms such as Hugging Face, Weights & Biases, and EleutherAI and even supports Executorch for enabling efficient inference to be run on a wide variety of mobile and edge devices. For everything from prompt engineering to using Llama 3 with LangChain, we have a comprehensive getting started guide that takes you from downloading Llama 3 all the way to deployment at scale within your generative AI application.

A system-level approach to responsibility

We have designed Llama 3 models to be maximally helpful while ensuring an industry leading approach to responsibly deploying them. To achieve this, we have adopted a new, system-level approach to the responsible development and deployment of Llama. We envision Llama models as part of a broader system that puts the developer in the driver’s seat. Llama models will serve as a foundational piece of a system that developers design with their unique end goals in mind.

Instruction fine-tuning also plays a major role in ensuring the safety of our models. Our instruction-fine-tuned models have been red-teamed (tested) for safety through internal and external efforts. Our red teaming approach leverages human experts and automation methods to generate adversarial prompts that try to elicit problematic responses. For instance, we apply comprehensive testing to assess risks of misuse related to Chemical, Biological, Cyber Security, and other risk areas. All of these efforts are iterative and used to inform safety fine-tuning of the models being released. You can read more about our efforts in the model card.

Llama Guard models are meant to be a foundation for prompt and response safety and can easily be fine-tuned to create a new taxonomy depending on application needs. As a starting point, the new Llama Guard 2 uses the recently announced MLCommons taxonomy, in an effort to support the emergence of industry standards in this important area. Additionally, CyberSecEval 2 expands on its predecessor by adding measures of an LLM’s propensity to allow for abuse of its code interpreter, offensive cybersecurity capabilities, and susceptibility to prompt injection attacks (learn more in our technical paper ). Finally, we’re introducing Code Shield which adds support for inference-time filtering of insecure code produced by LLMs. This offers mitigation of risks around insecure code suggestions, code interpreter abuse prevention, and secure command execution.

With the speed at which the generative AI space is moving, we believe an open approach is an important way to bring the ecosystem together and mitigate these potential harms. As part of that, we’re updating our Responsible Use Guide (RUG) that provides a comprehensive guide to responsible development with LLMs. As we outlined in the RUG, we recommend that all inputs and outputs be checked and filtered in accordance with content guidelines appropriate to the application. Additionally, many cloud service providers offer content moderation APIs and other tools for responsible deployment, and we encourage developers to also consider using these options.

Deploying Llama 3 at scale

Llama 3 will soon be available on all major platforms including cloud providers, model API providers, and much more. Llama 3 will be everywhere.

Our benchmarks show the tokenizer offers improved token efficiency, yielding up to 15% fewer tokens compared to Llama 2. Also, grouped query attention (GQA) has now been added to Llama 3 8B as well. As a result, we observed that despite the model having 1B more parameters compared to Llama 2 7B, the improved tokenizer efficiency and GQA contribute to maintaining the inference efficiency on par with Llama 2 7B.

For examples of how to leverage all of these capabilities, check out Llama Recipes which contains all of our open source code that can be leveraged for everything from fine-tuning to deployment to model evaluation.

What’s next for Llama 3?

The Llama 3 8B and 70B models mark the beginning of what we plan to release for Llama 3. And there’s a lot more to come.

Our largest models are over 400B parameters and, while these models are still training, our team is excited about how they’re trending. Over the coming months, we’ll release multiple models with new capabilities including multimodality, the ability to converse in multiple languages, a much longer context window, and stronger overall capabilities. We will also publish a detailed research paper once we are done training Llama 3.

To give you a sneak preview of where these models are today as they continue training, we thought we could share some snapshots of how our largest LLM is trending. Please note that this data is based on an early checkpoint of Llama 3 that is still training and these capabilities are not supported as part of the models released today.

We’re committed to the continued growth and development of an open AI ecosystem for releasing our models responsibly. We have long believed that openness leads to better, safer products, faster innovation, and a healthier overall market. This is good for Meta, and it is good for society. We’re taking a community-first approach with Llama 3, and starting today, these models are available on the leading cloud, hosting, and hardware platforms with many more to come.

Try Meta Llama 3 today

We’ve integrated our latest models into Meta AI, which we believe is the world’s leading AI assistant. It’s now built with Llama 3 technology and it’s available in more countries across our apps.

You can use Meta AI on Facebook, Instagram, WhatsApp, Messenger, and the web to get things done, learn, create, and connect with the things that matter to you. You can read more about the Meta AI experience here.

Visit the Llama 3 website to download the models and reference the Getting Started Guide for the latest list of all available platforms.

You’ll also soon be able to test multimodal Meta AI on our Ray-Ban Meta smart glasses.

As always, we look forward to seeing all the amazing products and experiences you will build with Meta Llama 3.

Meta © 2024

IMAGES

  1. The multi-store model of memory (Atkinson and Shiffrin, 1968)

  2. The multi store model of memory [AQA ALevel]

  3. AQA GCSE Psychology

  4. Describe and evaluate the Multi Store Model of memory Essay Example

  5. the multi store model of memory

  6. Essay: Multi-Store Model of Memory

VIDEO

  1. Unit 5: Memory #2 (Multi-Store Model) (AP Psychology)

  2. Memory Power Part 1

  3. Multi-store model

  4. multi-store model

  5. Multi store model of memory

  6. Multi topic essay for 2nd year

COMMENTS

  1. Multi-Store Memory Model: Atkinson and Shiffrin

    The multi-store model is an explanation of memory proposed by Atkinson and Shiffrin which assumes there are three unitary (separate) memory stores, and that information is transferred between these stores in a linear sequence. The three main stores are the sensory memory, short-term memory (STM) and long-term memory (LTM).

  2. Multi-Store Model of Memory

    Atkinson and Shiffrin (1968) developed the Multi-Store Model of memory (MSM), which describes flow between three permanent storage systems of memory: the sensory register (SR), short-term memory (STM) and long-term memory (LTM). The SR is where information from the senses is stored, but only for a duration of approximately half a second ...

  3. The multi-store model of memory (Atkinson and Shiffrin, 1968)

    The multi-store model of memory (the MSM) is a product of the cognitive revolution of the 1950s and '60s. This produced a new wave of experimental research into memory. Before this time, the dominant movement was "behaviorism," which used the scientific method to study observable actions.

  4. 2.1.4 Multi-Store Model of Memory

    What is the multi-store model of memory? It is a cognitive model of memory written in very much the same way that information processing models in computing are designed. It involves the forward flow of information from sensory input, through to sensory memory (SM), then to short-term memory (STM), to long-term memory (LTM) and finally as an output.

  5. The multi store model in memory psychology

    Atkinson and Shiffrin (1968-71) developed the multi store model to explain their theory as to how we process information. It is sometimes called the dual model of memory due to its focus on short and long term memory stores. Their theory was that one's memory involves a sequence of three stages; sensory memory (SM ...

  6. The Multi-Store Model Of Memory

    (1) Point: Further research from brain scanning techniques has supported the Multi-Store Memory model and the idea of separate memory stores (i.e. a short term memory store and a long term memory store). Evidence: Squire et al (1992) used brain-scanning techniques and found that STM can be associated with activity in the prefrontal cortex and that LTM can be associated with activity in the ...

  7. Memory: Neurobiological mechanisms and assessment

    MULTI-STORE MODEL OF MEMORY. Richard Atkinson and Richard Shiffrin put forth a model of memory which is known as "The multi-store model or modal model." It states that memory consists of three distinct elements: "a sensory register, a short-term store, and a long-term store." The data from the environment and our senses goes into the ...

  8. The Multistore Model of Memory

    Atkinson and Shiffrin's (1968) multi-store model of memory (MSM) distinguishes between the separate stores of sensory, short-term and long-term memory. Likely features include: It is a structural model. STM and LTM are unitary stores. Information passes from store to store in a linear way.

  9. What is the Multi-Store Model of Memory

    The MSM. The multi-store model of memory (MSM) was devised by Atkinson & Shiffrin (1968) as a way of conceptualising the processes by which memories are encoded and stored; The model encapsulates the flow i.e. from one storage facility to the next, whereby information reaches the senses and is then translated into memories; The model is a linear representation of the ways in which information ...

  10. From short-term store to multicomponent working memory: The role of the

    The term "modal model" reflects the importance of Atkinson and Shiffrin's paper in capturing the major developments in the cognitive psychology of memory that were achieved over the previous decade, providing an integrated framework that has formed the basis for many future developments. The fact that it is still the most cited model from that period some 50 years later has, we suggest ...

  11. Example essay: Contrast two models of memory

    Contrast two models of memory. Two models of memory that will be contrasted in this essay are Atkinson and Shiffrin's multi-store model of memory and Craik and Lockhart's levels of processing model. The primary difference in these two models is that one focuses on the structures of memory (the MSM), while the other focuses on processes (the ...

  12. PDF Essay Plans

    Discuss the Working Model of Memory. Include strengths and limitations in your answer. (16 MARKS) AO1 The Working Model of Memory was proposed by Baddeley and Hitch (1974) as an alternative to Atkinson and Shiffrin's Multi Store Model of Memory. It was developed because, due to the dual task effect, they believed STM was not a unitary store. The dual

  13. Outline and evaluate the Multi Store Model of memory

    Atkinson and Shiffrin (1968) proposed the structural model of memory, known as the Multi Store Model. The model is linear meaning the information passes from one store to another in a fixed sequence. This model explained their theory of memory in 3 main separate stores; sensory memory; short term memory and long-term memory. ...

  14. DP Psychology: Multi-store Model

    The Multi-store model was suggested in the 1960s and is clearly inspired by computer science. The model is based on a number of assumptions. First, the model argues that memory consists of a number of separate locations in which information is stored. Second, those memory processes are sequential.

  15. [PDF] Multi-store model memory

    Sensory register: The first store which holds the sensory information received through all the senses for a brief period of time. Examples include iconic (visual) and echoic (sound) memory. Short-term memory: The memory for immediate events. These memories tend not to last for more than a minute or two, usually shorter, and disappear unless they are rehearsed. Capacity is limited to 7 plus or ...

  16. Understanding the Multi Store Model of Memory

    This model suggests that memory is composed of three main stores: sensory memory, short-term memory, and long-term memory. Information is processed and transferred between these stores, allowing us to retain and retrieve memories. In this essay, I will discuss the strengths and weaknesses of the multi store model and its application to real ...

  17. Psychology Memory Essay

    Atkinson and Shiffrin (1968) developed the Multi-Store Model of memory (MSM), which describes flow between three permanent storage systems of memory: the sensory register, short-term memory, and long-term memory. It suggests that the sensory store, short-term and long-term memory are stores of information.

  18. Long-Term Memory In Psychology: Types, Capacity & Duration

    Long-term memory (LTM) is the final stage of the multi-store memory model proposed by Atkinson-Shiffrin, providing the lasting retention of information and skills. Theoretically, long-term memory capacity could be unlimited, the main constraint on recall being accessibility rather than availability. Duration might be a few minutes or a lifetime.

  19. The Multi-Store Memory Model vs. The Working Memory Model; How does

    Please use one of the following formats to cite this article in your essay, paper or report: APA. Huggins-Cooper, Anthoni. (2023, May 10). The Multi-Store Memory Model vs.

  20. Multi Store Model AO1 AO2 AO3

    Evaluate the Multi Store Model of memory. (8 marks) An 8-mark "evaluate" question awards 4 marks for AO1 (Describe) and 4 marks for AO3 (Evaluate). MSM is credible because it is supported by case studies of people like H.M. and Clive Wearing. Because of brain damage, these people have amnesia and cannot make new memories.

  21. Essay about Multi-Store Model of Memory vs. Working Memory Model

    This essay will firstly briefly describe the theories and important facts about the original multi-store model of memory (MSM) and the working memory model (WMM). This essay will then evaluate the key studies within these two models and explain the strengths and weaknesses of the main theories. The final part of this essay will be to examine ...

  22. Memory essay plans

    The multi-store memory model was put forward by Atkinson and Shiffrin. It consists of 3 memory stores which are each linked in order to transfer information from one to another. The sensory register is where information is taken in from the senses. It is coded iconically and echoically, and has an unlimited capacity, the

  23. Introducing Meta Llama 3: The most capable openly available LLM to date

    We made several new observations on scaling behavior during the development of Llama 3. For example, while the Chinchilla-optimal amount of training compute for an 8B parameter model corresponds to ~200B tokens, we found that model performance continues to improve even after the model is trained on two orders of magnitude more data.

  24. Outline and evaluate the MSM of memory (16 marks)

    The multi-store model depicts the STM and LTM as being unitary (single units). However, evidence from a case study suggests that there are many components within the STM and LTM. The study of Clive Wearing (2007) supports this because he has selective impairment (where only a certain part of the semantic memory is impaired).