Reply to Couper (2020)

In Speak Out! Issue 61 (Sept 2019) we published Teaching the underlying systems of English pronunciation as Motor Skills, available here.


Graeme Couper responded to this in Issue 62 in A response to Messum and Young's argument for an Articulatory Approach.


We were allowed a 200-word reply in the issue (which you can read here), and a link to the longer reply, which you can download or read below.



Limitations of written communication

Couper mainly discusses our work in terms of the teaching of sounds, and we mentioned above that his understanding of what we do for this is incorrect in some ways. Because we have described the teaching of sounds elsewhere (e.g. Messum and Young 2012; Young 2018) we wanted to give over as much of this Speak Out! article as possible[1] to its main theme: how we teach the underlying systems of English pronunciation (stress, reduction, etc) as motor skills[2].


We only wrote five lines near the end of the article dealing with how to teach sounds:


“We use the PronSci English Rectangle chart as our phonemic map. We gradually increase the number of sounds in play, making sure the students begin to develop a distinctive articulatory gesture for each one and giving them the chance to try these gestures out in various contexts. At this point we are not looking for perfection; we are happy as long as the students are out of the grip of L1.”


We should have included references to the other articles in this paragraph. Without them, Couper was reconstructing what we actually do for sounds in the classroom from very little information. For someone who hasn’t experienced it, imagining a classroom where the teacher never models for the students would be hard enough even with a detailed description. Unsurprisingly, he appears to have ended up with a mental image that doesn’t reflect what actually happens.


[1] Speak Out! articles are expected to be around 3000 words, and we’re grateful to the editor for allowing us to go over this limit for the original article.

[2] This was the subject of our presentation at the 2019 PronSIG Pre Conference Event in Liverpool, which in turn built on a presentation that Piers gave at PronSIG’s Accentuate conference in 2015 called “What to teach before you teach sounds”, available at .


Two responses to Couper: major themes and detailed points

For the record, we do feel the need to point out where Couper is mistaken about the Articulatory Approach (AA). However, this will not be of interest to all readers, so we first respond to his thoughts on two major themes in the discussion, and then to what for him were the three key difficulties with our approach.


In a separate document, we respond to his critique point-by-point.


Two major themes

1 Pronunciation as a motor skill

Couper attributes to us an “implicit theory of language that leads to the conclusion that pronunciation involves no more than the articulation of phonemes and the combination of those phonemes and consequently that it can be treated solely as a motor skill.”


Nobody could believe this, and we don’t either. Pronunciation is more than just a motor skill, but it is still a motor skill.


So for some of the time, but not all of it, it is helpful to consider solely the motor skill aspect of pronunciation. Most current pronunciation teaching completely ignores this need or does not address it properly. (As we described in the article, Catford pointed out that articulatory instruction as it is often implemented is not the way to coach the motor skill of pronunciation. Underhill (2012) makes the same point more specifically about classroom teaching.)


Consider either learning to play a musical instrument or learning a sport. Neither is solely a motor skill, but in learning both it is necessary and useful to dedicate time (in fact, a lot of time) to their motor skill aspects, without consideration during that time of public performance or playing a match or game.


Spoken L2 communication is ‘playing a language game’. Acquiring the motor skills to play this game is best done, as in music or sport, in periods of practice where the game itself is not being played. The students can concentrate on the motor activities that create rhythmic and melodic strings of sounds in L2 without reference to meaning or grammar, which are distractions during this motor skill practice. Motor skill practice in music or sport turns deliberate controlled movements into automatic ballistic movements which can be executed when the player’s attention is elsewhere. This soon improves the musician’s performance or the sportsman’s game, and exactly the same applies in pronunciation.


The challenge for the teacher is to keep motor skill practice intrinsically interesting and relevant, i.e. clearly carrying a potential for high performance in ‘the game’. When motor skill practice has this characteristic, students are happy to spend time on it, as sportspeople are with well-run practice sessions.


We asserted that most current pronunciation teaching completely ignores the need to teach the motor aspects of pronunciation in the way that motor skills should be taught. If we are right, then language teachers are like a flute teacher who doesn’t dedicate time to teaching fingering to a pianist who is starting to learn the flute (“Oh, he already knows about fingering”).


One difference is that the pianist will know that he has to learn fingering for the flute and will do it by himself despite his misguided teacher, but few students of an L2 have any idea of what developing a new set of articulatory gestures actually entails. There is a need for a teacher to coach them in this process. We believe that the lack of this work is the major reason why the results of current pronunciation teaching are so disappointing.


As we said in the main article, it is often useful to divorce pronunciation from meaning when a student has a pronunciation problem, but then it gets reunited with meaning a few minutes later once the motor or conceptual difficulty has been resolved. Our example of the back chaining of A quarter to two was one illustration of this process but it is systematic in our teaching.


2 Creating concepts by doing

Concepts within production

Couper quotes Brown approvingly when she says (2000, pp 6-7) that, “successful acquisition of phonological representations requires accurate perception of phonemic contrasts in the input.” In itself, this assertion is neutral as to whether accurate perception precedes production or if production has been developed at the same time or even earlier. However, Couper seems to be taking it as support for the first of these possibilities, which is also the basis of the mainstream view on how to teach segments, as Pennington and Rogerson-Revell confirm:


“In relation to teaching individual phonemes, perception-based phonetic training has long been promoted as essential to establish the foundations of pronunciation.” (2019:199)


However, Gattegno and teachers who have worked in the Silent Way tradition have demonstrated that the idea that perception of L2 contrasts via native speaker models must precede production is wrong. Gattegno taught pronunciation very successfully while remaining completely silent.


The mainstream view assumes that successful perception leads to successful production via an auditory ‘matching-to-target’ learning process undertaken by the student. This implies a further assumption that a single phonological lexicon serves both perception and production. This view is peculiar to speech researchers, constrained by their belief that speech sounds are primarily developed in both L1 and L2 through imitative processes. It is not shared by psychologists or neuroscientists, who believe that speech is supported by two phonological lexicons, one for input and one for output. (See Messum and Howard (2015) for discussion and references.) We find two-lexicon accounts of speech more plausible than single-lexicon ones.


So we can agree with Brown if we take her to be referring to the successful acquisition of input phonological representations. But this says nothing of the output phonological representation and how and when it is developed.


Couper does not engage with one of our key points: that concepts can be created by doing things first, not just by first perceiving things. He does not seem to have appreciated that we are doing no more than invoking the ubiquitous human learning mechanism of the action-perception loop[3]. In practice, the doing of something also involves the perception of the result, and in a class of people doing vocal things, that means that learners perceive the results of their vocal actions alongside the doing. Both sides of the concept are developed simultaneously, rather than the perceptual side of a concept being developed before production is turned to, the sequence of events that is practised in current mainstream pronunciation teaching.


We understand the superficial plausibility of the idea that learning the concept of a new motor skill first requires one to find it in others. People often imagine, for example, that children learn to walk by seeing others doing it and then trying to copy them. But how do children learn to crawl? Many young children learn this without ever seeing another baby crawling. And learning to crawl certainly includes developing the concept of crawling.


Elbers and Wijnen (1992:341) argue for taking this perspective when considering language development:


“[A] ‘production-based’ approach has the important advantage of bringing together language learning and other kinds of learning that occur in childhood. For instance, no one would seriously defend the idea that a child learns how to build with blocks primarily by analyzing the block constructions produced by others. Rather, one would assume that the child learns from his or her own constructive operations and their outcome …. Yet theories of language acquisition, of whatever signature, mainly acknowledge the role of input in the learning process, not that of children’s constructive production.”


And following this, to the best of our knowledge, no one in the field of child speech believes that children develop vocal motor schemes[4] through discovering a sound in the environment and then trying to develop it for themselves. Vocal motor schemes are the building blocks for first words: McCune and Vihman (2001) found that, “children based virtually all stable words on their own specific VMS consonants (92% vs. 8% for other consonants).”


If children are able to start speech by developing motor/sound concepts that initially have no relation to language but come to have such a relationship over time, why can adults not be encouraged to do the same for L2 pronunciation? This is not the final state, but no one could argue, surely, that this is an unnatural stepping stone for teachers to provide.


Fully developed concepts can arise from doing things, and observing their results in perception. This is the action-perception loop which is considered to be a foundational mechanism in human learning. If there is a source of learnèd feedback that informs students of the linguistic status of their production of segments or strings of segments, then these concepts can be fully linguistic without students having to listen to native speaker models at all. (It will become useful to listen to NS models later, just as it is for the child using building blocks … but later.)

[3] This is variously called the action-perception loop, the action-perception cycle, the perception-action loop or the perception-action cycle. However, the three main elements are always (1) action, (2) perception of the outcome, (3) cognitive activity, usually some kind of conclusion drawn and a hypothesis or prediction to be tested. The learning part of this cycle starts with an action.

[4] A vocal motor scheme (VMS) is the name given to the first stable sounds that children produce. The researchers who coined the name describe them alternatively as, “well practiced and longitudinally stable vocal productions” or, “generalized action patterns that yield consistent phonetic forms” (McCune and Vihman 2001:671 & 673).


Concepts within perception

Couper thinks that concept building in pronunciation has to start with perception. So he must have been surprised that we said so little about perception, just a single sentence in the section headed Two birds with one stone:


“The experience the students have given themselves [creating and listening to a new noise] is exactly the kind of evidence that their minds need to create a new concept, linked in production and perception ...”


From this, Couper interprets us as suggesting that, “if you teach the production then the perception will take care of itself,” which is an overstatement of our view but does have some truth in it. Now that we have some extra space, we can digress from what we saw as the main theme of the article and say something about concept formation in perception. (Later we will deal with perception again when we address our supposed, “denial of any role for perception”.)


As we said earlier:


“in a class of people doing vocal things ... learners perceive the results of their vocal actions alongside the doing. Both sides of the concept are developed simultaneously, rather than the perceptual side of a concept being developed before production is turned to.”


Support for this simultaneous action/perception learning paradigm comes from, for example, the classic experiments performed by Richard Held more than 50 years ago which he summarised as demonstrating that, “the correlation entailed in the sensory feedback accompanying movement—reafference—plays a vital role in perceptual adaptation” (Held 1965). In Held and Hein (1963) he had demonstrated that kittens who were given an identical perceptual experience to others having a particular motor/perceptual experience failed to develop perceptual representations that could guide behaviour. In Held and Mikaelian (1964) he had shown that humans readily adapt to prismatic glasses when they can move voluntarily within a scene but not when they are moved within it by means of a wheelchair.


The idea that perception is educated by the motor system (by the sensorimotor contingencies that voluntary movement produces) is also central to more modern work, e.g. O’Regan and Noë (2001). Pure perceptual learning (via exafference) is possible, but much less efficient and effective than learning which invokes the action-perception loop.


Not only does the learning paradigm within the Articulatory Approach work well in practice, but it is also a ubiquitous and natural general mechanism for learning, and thus for concept development in perception as well as production.


Finally, let us reiterate a point we made in the original article, that work on production is going to be needed even if students do first get trained to identify/discriminate problematic sounds and other features. So it’s more efficient to start with production straightaway if, as predicted by theory and as we have found in practice, perception improves in tandem with it.


Couper’s three key difficulties

Couper finds three key difficulties in our approach, which he describes as, “the denial of any role for perception in L2 pronunciation learning, the refusal to allow for any sort of role for a model, and the lack of any empirical evidence to support their case.”


Refusal to allow a role for a model

We will deal with Couper’s second difficulty first: our supposed, “refusal to allow for any sort of role for a model.” As he later puts it, we do indeed, “argue strongly against the teacher providing a model on the basis that this will distract the learner from focusing on their own actions,” although we don’t in fact make the stronger claim he attributes to us, that “any sort of model of the target pronunciation is detrimental to learning.”[5]

We believe that the following points in learning psychology and classroom dynamics help to explain why we eschew models.

[5] See the section below called ‘When a model becomes useful’.


Attention and presence

When a teacher gives students a model to copy, she is directing them to place their presence in their ear in order to capture what is said to them. Most will keep their presence there because of the nature of the task the teacher is asking of them:


       to hold onto the fast-fading auditory image of the model

       to capture what they themselves say

       to compare the two auditory images.


Their attempt to match the model will use their existing powers of imitation, drawing on the automatisms that create L1 speech sounds in the case of most learners, and on a more general capacity for imitative noise production in the case of better learners. For students in both groups, doing something novel with their articulators is not the natural response because their presence is not in their mouths but anchored in their ears.


Exceptional students do transcend this. They move their presence to their articulators and begin exploring and experimenting even though the exercise proposed to them has not invited them to do so.

To turn all students into exceptional students, it is only necessary to draw them into exploring and experimenting from the beginning. Gattegno saw that the best way to do this is for the teacher not to provide a model. This leaves the students with no alternative other than to become present to their articulators and to listen closely to what they produce. The teacher’s role is to provide them with three types of supportive feedback: the evaluation of their performance (which they themselves are not yet in a position to make), technical coaching in what they should be doing with themselves physically (“Try relaxing your lips”), and encouragement to continue exploring.


Another pedagogical benefit of the teacher’s ‘silence’ (which is not to say that she is mute, but that she doesn’t model what the students are attempting to produce) is that the students cooperate more with each other and become less self-conscious and less judgmental of themselves. One reason for this is that students don’t know exactly what they are aiming for. So a ‘no’ from the teacher cannot be construed as a failure; it is just part of the process of discovering the target. If a teacher does provide a model, her ‘no’ does indicate failure, and for many students this discourages exploration and experimentation.


Put differently, we think that providing a model misleads the typical student into thinking that simple imitation is all that is required of him, and that this will lead to improvement in his pronunciation.

What is actually required is the development of a new motor use of himself. This is very different from simple imitation, and he will direct his attention and efforts differently depending upon which activity he thinks he is undertaking. He needs to direct his attention and effort in the way required when learning any motor skill, which, in the case of learning pronunciation, means to his articulators.


In the article, we mentioned Gilbert Ryle’s description of ‘thinking’ in learning as the kind of engagement that a mountaineer has with a difficult path that he is trying to negotiate in bad weather: he is then both walking the path and learning to walk this particular path at the same time. If you ask him to look at the view, it is a huge distraction from his real task, to the extent that he will have to stop walking if he is going to comply. Similarly, demanding attention to an expert model is a distraction for someone trying to create a new use of his articulators to produce a new sound. It drives him back to using sound-making routines from L1 or another part of his past, and precludes the development of new ones.


Exposure ≠ model

Couper states that, “It is also unrealistic to expect that learners will not have been exposed to models elsewhere anyway, so what would be the point in denying their existence.”


Clearly many learners of English will have been exposed to huge amounts of the language. This does not mean that they have related to any of what they have heard as a model. Language is only a pronunciation model when a learner puts the meaning aside and concentrates on the pronunciation form. For most learners, this rarely if ever happens outside work in class on pronunciation.


Five years of living in a country can give thousands of hours of exposure to a language but zero hours of models if the person never relates to the form of what is said rather than the content. We shouldn’t conflate exposure to the language with engagement with the language as a model. 


Misapprehensions about imitating speech

Within L1, there is a range of imitative activities that most people can readily perform, including:


       Putting on regional or ‘foreign’ accents

       Affecting a speech impediment, like a lisp

       Mocking someone, by copying a particular intonation contour they have just used.


No one doubts their own capacity to do these things. Many teachers and learners imagine that learning L2 pronunciation is comparable to these activities, and when students are asked to imitate an L2 model both teachers and students assume that this too should be within their capability.


We are not experts in voice coaching, but it seems to us that one might accomplish the activities above as follows:


       To put on an accent: by modifying one’s L1 articulatory setting in just one or two ways, at which point everything about the accent falls into place. For example, to create an Australian accent, try speaking with the teeth held more closely together than normal. (“To keep the flies out” is the advice that is sometimes given ...)

       To affect a lisp: explore within one’s L1 articulatory setting what will give the effect required.

       To copy a marked intonation contour: choose, and modify as necessary, a contour one already has within one’s own inventory. For example, we all have whines which we can use to reflect someone else’s whine back to him or her.


Notice that in all these cases, the speaker can hear the effect required because it is taking place within L1. (I.e. Trubetskoy’s ‘sieve’ doesn’t mask it.)


However, developing a distinctive articulatory gesture for a new sound in L2 is not like any of these genuinely imitative activities in L1. It is not an imitative task but the development of a new motor skill which will involve both a new primary tongue gesture and the adjustment of all the components of the L1 articulatory setting (the tongue’s basis of articulation, the use of the respiratory system, the muscle tone in the articulators, etc) to what is required for the L2 setting.


Are our models even models?

We should keep in mind that when we provide models for our students in pronunciation classes, we are presenting them with the results of our actions and not the actions themselves, since most of the actions involved are hidden inside the mouth. In this way, teaching pronunciation is very unlike teaching most skills or activities where a ‘model’ actually shows the learner what the demonstrator is doing.


If Tiger Woods were teaching us to drive a golf ball by simply striking 300-yard shots off the tee, we don't think we'd learn as much as if he gave us advice on how to improve our own swings. But at least we'd pick up something from watching him in action.


Now imagine if he were hitting those 300-yard drives while standing behind a tarpaulin, so we couldn't see what he was doing and could only see the result: a ball sailing down the fairway, every time he produced a ‘model’. Then we don't think we'd get very much at all from the experience, and we'd actually find it rather boring and dispiriting.


When we provide 'models' of sounds for our students, we're similarly giving them the results of our actions, not insights into the actions themselves. Unsurprisingly, students often find this rather boring and dispiriting.


When a model does become useful

All this said, there is a time when hearing an expert model in class is useful for students: once they can hear the model using L2 filters that they have developed through the type of work on production that we describe and can thus learn something from a comparison process.


Even then, we don’t allow them to copy a model. But, for example, we might make use of the ‘Human Computer’ technique from Community Language Learning, where the teacher repeats the student’s phrase after him in good L2, allowing the student to notice what differences there are. If the student wants to try again, he has to wait 20 seconds so that the teacher’s model has faded from his mind and he is genuinely saying the phrase, not copying it. Students appreciate the value of working this way.


Denial of any role for perception

Couper takes issue with our supposed “denial of any role for perception in L2 pronunciation learning.”


We don’t issue our students with ear plugs! Of course perception plays a great role in teaching pronunciation. The question is, what should learners be listening to? As we explained earlier, we suggest that they should start by listening to their own production (as part of the action-perception loop) and that of their fellow students.


The conventional view, of course, is that students should listen to models of correct pronunciation. In the previous section, we gave several reasons why this is problematic. We can now add a reason why it is preferable for students to attend to themselves and their fellow students when they are working on pronunciation.


In a figure-skating competition, it's very instructive to watch the lower-placed contestants. When the champions perform, the commentator tells us that they have just done a wonderful triple Salchow, or a double Axel, and we, at least, cannot tell the difference. All we see is this: they steady themselves, they leap, they twist and they land. It's so smooth that we can never see whatever it is that they are doing which makes it a Salchow or an Axel.


When watching the lower-placed contestants, it's quite different. They don't have the smoothness and the grace, and suddenly the movements are much easier to see.


This applies to language learning, too. Learners can more easily detect what other learners do to pronounce sounds, words and sentences than what fluent, expert native speakers do. They learn more from watching, listening to, and being inspired by the achievements of other students than from trying to copy the teacher.


Thus, if this type of listening is combined with the teacher constantly giving feedback on the students’ trials, then there are two main advantages.


  1. By listening to himself, each student learns what muscular configurations lead to what sounds. He can change what he does with his muscles, he knows what he has done and he can hear what effect this has on the sound he hears. Being asked to consciously do something different with an articulator primes a student to expect to hear something different as a result. He listens more carefully than he would normally, with a heightened attentional set. His perception is sharpened. This is an efficient way for a student to develop both production and perception criteria for his own speech.
  2. By watching other students, listening to what they produce and attending to the evaluation of this that the teacher provides, he develops further perceptual criteria which serve his L2 listening but which can also inform and inspire further work on his own production.

In practice there doesn't seem to be a need for specific teacher intervention to work on perception once work on production has been done well. Our experience is that when a student's pronunciation contains a sound or some other feature in production (or is even on its way to containing it), the student seems to be able to hear it in perception.


In summary, what we find in practice is that class work on production in the Articulatory Approach is, at the same time, effective work on perception.


Critical Listening and High Variability Phonetic Training (HVPT)

Couper (2015) commends Critical Listening as a classroom activity. Drawing on the work of Fraser (2009), he describes it as follows:


“Critical Listening involves the learner in listening for the contrast between two productions: one that is acceptable and one that is not. Typically there should be a meaningful difference, and ideally it would involve comparing the learner’s production when it is acceptable with when it is not …. ” (p. 426)


As should be clear by now, in classes taught using the Articulatory Approach, a very productive form of Critical Listening is happening all the time.


Couper suggests this implementation of Critical Listening:     


“In practice this might involve learners recording themselves and then listening to their recording and comparing it with a model in conjunction with getting feedback from peers or the teacher.”


But we prefer student models for the reasons we have explained.

HVPT is another technique commended in the literature (Pennington and Rogerson-Revell 2019:199). In our classes, the students are hearing themselves every time they say something, and they're hearing everyone else's experiments in the class, too. There's a huge amount of variability generated, and plentiful feedback on how acceptable everyone's output is. They're discovering what's acceptable and what's close but not acceptable, in lots of different voices.


This is HVPT, but with student rather than native speaker models.


Lack of any empirical evidence to support our case

We share with Couper a desire that classroom teaching of pronunciation be improved, not just in our own classes but around the world, in all classes. The question is how to achieve this.

Couper regrets that we do not present any empirical evidence in support of the Articulatory Approach. We assume that he has in mind comparative classroom studies of the type seen in journals of applied linguistics.[6]


It would be wonderful if any approach or technique could be proven in this way—to teachers’ satisfaction—to be better than any others. We’re happy that applied linguists are attempting to do this, but we’re sceptical about their chances of success. Even when the non-trivial problem of control groups of students is addressed reasonably well, there remains the problem of teacher variability which we rarely see addressed in the ISLA literature.


Teachers vary enormously in their skill. The same teacher can vary in the skill she displays over the course of any day. The first time she presents some material she is likely to do it very much less effectively than the fifth or fifteenth time. Learning is not just a result of what the teacher imagines in the lesson plan or proposes in the class; it also happens in the ‘dark matter’ of a class (Underhill 2014) and this is completely dependent upon the skill of each individual teacher. And so on. The issue of teacher variability is more acute the more broad and significant the question being asked.

Unless they address these control issues relating to the teacher, comparative educational research methods are not reliable and rarely convince us.


[6] If Couper is in fact advocating ‘research without control groups’ or qualitative research, or something similar, then the problem here is that such studies are even less convincing to anyone not already in agreement with the starting point (or indeed less likely even to be read by them). Too much researcher bias is always assumed by the reader.


Academic research vs. teacher-led change

Our approach to improving classroom teaching is fivefold.


  1. We are explicit about the general model of learning we espouse, enabling us to discuss the how and why of learning with confidence. (We have given this a book length treatment in Young and Messum 2011.)
  2. In our articles and guides to the use of materials in the Articulatory Approach, we include descriptions of how and why we think student learning will take place. (We find that teachers appreciate and relate to a plausible account of the learning moves that will take place when using a given technique.)
  3. We describe our methods in enough detail for other teachers to try them for themselves. (This is unlike most reports of academic research studies; see Pennington and Rogerson-Revell 2019:219.)
  4. We train teachers, and invite them to observe our classes if they wish.
  5. We make videos of us teaching available for viewing (e.g. Young 2015).

In other words, for proof of the efficacy of the Articulatory Approach and its techniques, we make our work as accessible as possible to teachers and invite them to try it.


More eminent people than us have expressed a similar view: that change within the EFL world is rarely the result of academic research. See, for example, Maley (2016), especially the section headed “Where have new ideas in TESOL come from?”, and Medgyes (2017).


Furthermore, we work on L1 pronunciation acquisition (e.g. Messum 2007; Messum 2008; Messum and Howard 2012; Messum and Howard 2015) because we believe that L2 pronunciation teaching practices are heavily (if often implicitly) influenced by what teachers believe to be ‘natural’ in this field. It is presently assumed, without evidence or sufficient critical scrutiny, that children learn key features of L1 pronunciation by imitation. If they do not, then a better understanding of how they do learn to pronounce L1 will certainly help to inform better L2 teaching.



In this document, we have now responded to Couper’s thoughts on two major themes in the discussion: pronunciation seen as a motor skill and the development of concepts. We have also responded to what for him were the three key difficulties with our approach: our supposed refusal to allow for models, our supposed denial of any role for perception, and the lack of empirical evidence for the Articulatory Approach presented in an academic format.


Clearly our classes differ from Couper’s, but we’d like to finish by reassuring him that they do exemplify what he considers to be best practice (Couper 2006:59):


“[E]ffective pronunciation teaching involves:

  • making learners aware that there is a difference between what they say and what native speakers say
  • helping learners to hear the difference and practise it
  • finding the right metalanguage
  • helping learners to discover useful patterns and rules
  • giving feedback and providing opportunities for further practice.”


The Articulatory Approach does do these things differently, taking production as a starting point, but it does them all well.


In a second document, “A point-by-point reply to Couper (2020)”, we respond at a more detailed level to some individual points that Couper made. This is available here.



Caudrelier, T., Ménard, L., Perrier, P., Schwartz, J.-L., Gerber, S., Vidou, C., & Rochet-Capellan, A. (2019). Transfer of sensorimotor learning reveals phoneme representations in preliterate children. Cognition, 192, 103973.

Couper, G. (2006). The short and long-term effects of pronunciation instruction. Prospect, 21(1), 46–66.

Foote, J. A., Trofimovich, P., Collins, L., & Urzúa, F. S. (2013). Pronunciation teaching practices in communicative second language classes. The Language Learning Journal, 1–16.

Held, R. (1965). Plasticity in Sensory-Motor Systems. Scientific American, 213(5), 84–94.

Held, R., & Hein, A. (1963). Movement-produced stimulation in the development of visually guided behavior. Journal of Comparative and Physiological Psychology, 56(5), 872–876.

Held, R., & Mikaelian, H. (1964). Motor-Sensory Feedback versus Need in Adaptation to Rearrangement. Perceptual and Motor Skills, 18(3), 685–688.

Henderson, A., Curnick, L., Frost, D., Kautzsch, A., Kirkova-Naskova, A., Levey, D., Tergujeff, E., & Waniek-Klimczak, E. (2015). The English Pronunciation Teaching in Europe Survey: Factors inside and outside the Classroom. In J. A. Mompean & J. Fouz-González (Eds.), Investigating English Pronunciation (pp. 260–291). Palgrave Macmillan UK.

Levis, J. (2016). The interaction of research and pedagogy. Journal of Second Language Pronunciation, 2(1), 1–7.

Maley, A. (2016). ‘More Research is Needed’ – A Mantra Too Far? Humanising Language Teaching, 18(3).

Medgyes, P. (2017). The (ir)relevance of academic research for the language teacher. ELT Journal, 71(4), 491–498.

Messum, P. R. (2008). Embodiment, not imitation, leads to the replication of timing phenomena. In Acoustics 08 (pp. 2405–2410). SFA/ASA/EAA.

Messum, P. R., & Howard, I. S. (2012). Speech development: Toddlers don’t mind getting it wrong. Current Biology, 22(5), R160–R161.

Messum, P. R., & Howard, I. S. (2015). Creating the cognitive form of phonological units: The speech sound correspondence problem in infancy could be solved by mirrored vocal interactions rather than by imitation. Journal of Phonetics, 53, 125–140.

Messum, P., & Young, R. (2012). Non-imitative ways of teaching pronunciation: Why and how. Pronunciation Science Ltd.

O’Regan, J. K., & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. The Behavioral and Brain Sciences, 24, 939–1031.

Pennington, M. C., & Rogerson-Revell, P. (2019). English Pronunciation Teaching and Research: Contemporary Perspectives. Palgrave Macmillan.

Underhill, A. (2012). 2 The physicality of pronunciation & proprioception.

Underhill, A. (2014). Training for the unpredictable. EJALTEFL, 3(2), 59–69.

Underhill, A., Messum, P., & Young, R. (2019). Language is for expression before it is for communication. In T. Pattison (Ed.), 2018 Brighton Conference Selections (pp. 36–38). IATEFL.

Young, R. (1995). Caleb Gattegno’s ‘Silent Way’: Some of the reasons why. In E. Scheiner (Ed.), Methoden der Fremdsprachenvermittlung (Vol. 40, pp. 55–74). University of Mainz.

Young, R. (2015). Working on Pronunciation in English Part 1: Increasing students’ sensitivity to their mouth.

Young, R. (2018). How to use a chart and a pointer for teaching pronunciation. Speak Out! (Whitstable, IATEFL), 59, 20–26.

Young, R., & Messum, P. R. (2011). How We Learn and How We Should be Taught: An Introduction to the Work of Caleb Gattegno. Duo Flumina.