A Systematic Review: Exploring L2 Vowel Production from Revised Speech Learning Theory Perspective

Abstract

The speech learning model (SLM) aims to explain the variables that contribute to differences in L2 phonetic production. Most previous studies comparing L2 and L1 vowel production attribute the differences to mother tongue interference, as the SLM also proposes. However, past transfer studies report a number of discrepant findings, even for the production of the same L2 vowel. This systematic review therefore collected past studies that compared L2 vowel production with L1 vowel production in order to understand the causes of the discrepant findings. Relevant articles published from 2000 onward were searched in online databases with key words such as “L2 accented English”, “vowel space”, “L2 Formants”, “Chinese-accented English”, and “comparing Chinese English”. The initial key word search returned 120 articles. After two rounds of screening, first on titles and abstracts and then against the inclusion and exclusion criteria, 14 articles were retained for review. A further search of the reference lists of the selected articles added 2 more, for a total of 16 articles reviewed in this paper. The review begins with an overview of the speech learning model, followed by a synthesis and analysis of the collected articles. Pedagogical implications and recommendations for future language transfer studies are then discussed.

Liu, Z. (2023) A Systematic Review: Exploring L2 Vowel Production from Revised Speech Learning Theory Perspective. Open Journal of Social Sciences, 11, 452-463. doi: 10.4236/jss.2023.116029.

1. Introduction

The revised speech learning theory

The speech learning model (SLM), originally developed by Flege in 1995, focuses on the acquisition of L2 sounds, both vowels and consonants. Its subjects are sequential bilinguals, that is, L2 learners who have already mastered their L1 phonetic system before they begin learning the L2. The model assumes that L2 phonetic learning is affected by the perceived relationships between L2 and L1 sounds, and it proposes that an automatic, subconscious cognitive process links L1 and L2 sounds perceptually. When L2 learners first encounter L2 sounds, they are assumed to interpret them subconsciously by referring back to the established L1 phonetic system (Flege, 1995). However, the SLM did not specify the amount of L2 input that learners need in order to build up stable patterns of interlingual identification.

The revised speech learning model (SLM-r) still focuses on sequential bilinguals but adds some new premises. In total, the revised model proposes eleven premises:

1) The SLM-r has abandoned the question of whether highly proficient L2 learners will come to master L2 sounds in a native-like way, because the input in L1 acquisition differs from the input in L2 learning.

2) The SLM-r recognizes that segmental production and perception coevolve.

3) An L2 sound that differs from the closest L1 sound might lead to a composite L1-L2 phonetic category, depending on the input from the two languages.

4) The processes and mechanisms used in L1 development remain intact and accessible for L2 learning.

5) Ongoing input gradually shapes the development of new L2 phonetic categories.

6) Phonetic factors: the formation of a new L2 sound category is determined by three factors, namely the perceived phonetic dissimilarity from the closest L1 sound, the quality and amount of L2 input received in meaningful conversations, and how precisely the closest L1 category is specified at the moment of L2 learning.

7) L1 category precision: learners with relatively precise L1 phonetic categories are better able to distinguish L2 sounds from L1 sounds, which eventually leads to better production of the L2 sounds.

8) L1 phonetic category differences: L1 categories differ across individual speakers because of the L1 input received during L1 speech development and the precision of the L1 categories.

9) Endogenous factors: individual factors, such as differences in auditory working memory and early-stage auditory processing, influence the mastery of L2 sounds.

10) Inter-subject variability: this premise also concentrates on individual differences, such as the accuracy of L2 pronunciation and perception, the degree to which a learner can distinguish an L2 sound from the closest L1 sound, and the quality and quantity of L2 phonetic input.

11) Continuous learning: the mastery of L2 sounds is a life-span process (Wayland, 2021).

The revised speech learning model has thus shifted its focus significantly toward individual differences.

Studies comparing L2 vowel production

A large body of empirical research has also compared L2 vowel production with L1 vowel production to learn more about the characteristics of vowels produced by L2 English learners. Drawing on the SLM, most studies hypothesized that differences between the L1 and L2 phonetic systems make it difficult for L2 learners to master the phonemes that differ. However, studies comparing vowels produced by Chinese English speakers with those of native speakers report discrepant findings, and almost all of the differences in vowel production between L2 learners and native speakers are explained by differences between the English phonetic system and the phonetic system of the learners’ mother tongue. For instance, research on the acoustic properties of vowels produced by Chinese English speakers has yielded different findings. Some studies reported that L2 Chinese English speakers produced vowels with higher F1 and F2 values than native speakers (Chen & Robb, 2000). However, another study of vowel production by L2 Chinese English speakers showed that the F1 and F2 values of English vowels produced by Mandarin-speaking learners were generally smaller and more centralized than those of native speakers (Zhang & Chen, 2008). A further study of L2 vowel production by Chinese English learners (Ma, 2016) showed that the F1 and F2 values of English vowels produced by Mandarin-speaking learners displayed a pattern of both convergence and divergence. Another study concluded that Chinese English learners with higher proficiency levels came closer to native-like vowel targets in terms of F1, F2, and duration (Wu & Shih, 2012). These discrepant findings deserve more attention.

To inform future research on L2 vowel production, it is important to review the past literature systematically in order to understand what has been done and how, and what is still missing from the perspective of the revised speech learning model. Hence, the objectives of this systematic review were to evaluate past research comparing the L2 vowel production of Chinese English speakers with that of native English speakers from the perspective of the revised speech learning theory, and to understand the reasons for the discrepancies found in the existing studies.

In line with the objectives, the following research questions are addressed.

1) What are the possible reasons that led to the discrepant findings in the existing studies?

2) Are there any other factors to be added to the SLM-r theory?

2. Methodology

2.1. Literature Search and Data Sources

The relevant studies included in this review were identified through multiple searches. First, articles were searched with key words in online databases, such as SAGE JOURNALS ONLINE, SCIENCE DIRECT, SCOPUS, SPRINGERLINK, and JSTOR. Because articles comparing L2 and L1 vowel production use different terms, the search terms included L2 accented English, vowel space, L2 Formants, Chinese-accented English, and comparing Chinese English speakers L2. A further search was conducted by checking the reference lists of the selected relevant articles.

2.2. Inclusion and Exclusion Criteria

Studies included in this systematic review had to meet three criteria. First, they are peer-reviewed articles. Second, they are empirical studies conducted from 2000 onward. Third, they contain all the sections expected of an empirical study and include a comparison between the L2 vowel production of Chinese English speakers and the vowel production of native speakers. Table 1 illustrates the inclusion criteria of this review.

2.3. Data Extraction

Figure 1 shows how the articles included in this review were selected. The articles obtained from the online databases were first checked by title and abstract to determine whether they were appropriate for inclusion; 30 articles were kept after this first check. The second check applied the inclusion criteria, which resulted in 14 articles being included. A second search, based on the reference lists of these 14 articles, added another two articles. The total number of articles included in this review is 16.
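The selection steps described here can be tallied in a few lines. The sketch below is only an illustration: the stage labels and the `screening_flow` helper are hypothetical names, not taken from Figure 1, and the counts are those reported in this review (the 120 initial records come from the abstract).

```python
# Counts reported in this review; the stage labels are illustrative only.
IDENTIFIED = 120           # records returned by the initial key word search
AFTER_TITLE_ABSTRACT = 30  # kept after the title and abstract check
AFTER_CRITERIA = 14        # kept after applying the inclusion criteria
FROM_REFERENCE_LISTS = 2   # added from the reference lists of the 14 articles


def screening_flow():
    """Return the records excluded at each stage and the final total."""
    return {
        "excluded_on_title_abstract": IDENTIFIED - AFTER_TITLE_ABSTRACT,
        "excluded_on_criteria": AFTER_TITLE_ABSTRACT - AFTER_CRITERIA,
        "total_included": AFTER_CRITERIA + FROM_REFERENCE_LISTS,
    }


flow = screening_flow()
for stage, count in flow.items():
    print(f"{stage}: {count}")
```

Under these counts, 90 records are excluded at the first check and a further 16 at the second, leaving 14 articles plus the 2 found through reference lists.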

2.4. Articles Included in This Systematic Review

Table 2 summarizes the articles included in this review.

Table 1. Inclusion criteria of the systematic review.

Figure 1. Flow chart of data extraction.

Table 2. Studies included in this review.

3. Findings

1) Participants in the examined studies

Table 3 describes the L2 Chinese English speakers who participated in the examined studies. According to the table, 14 studies (87.5%) explored the vowel production of bilingual Chinese English speakers. All participants in these 14 articles are fluent in both English and Chinese. The other two articles (12.5%) explored the vowel production of trilingual participants.

The table also reports the gender of participants in the examined studies. Eight articles (50%) explored vowel production with gender-balanced participants. Five articles (31.2%) examined L2 vowel production with gender-unbalanced participants. The remaining three articles (18.8%) did not specify whether the participants were gender balanced.

Table 4 shows five individual factors concerning the L2 Chinese English speaker participants. Eight articles (50%) reported the participants’ English proficiency. Two articles indicated the age at which the L2 Chinese English speakers started learning English. Only four articles (25%) mentioned the dialect spoken by the Chinese English speaker participants. Only one article (around 6.25%) mentioned how long the participants had been learning English. The last individual factor is the length of stay in an English-speaking country: 12 articles (about 75%) specifically reported how long the participants had lived in an English-speaking country.

Table 5 presents information about the native speaker participants in the examined studies. Only one article (6.2%) employed bilingual native speakers as English vowel baseline producers. The great majority (15 articles, 93.8%) did not report whether the native speakers were bilingual or monolingual. Regarding gender, most studies (12 articles, around 75%) examined native speakers’ vowel production with gender-balanced participants. The remaining 4 articles did not mention whether the native speaker participants were gender balanced.
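The percentages reported for the participant tables are shares of the 16 reviewed articles. As a minimal sketch of that arithmetic, the snippet below recomputes them from the counts given in the text; the `share` helper and the dictionary labels are hypothetical names introduced only for this illustration.

```python
TOTAL_STUDIES = 16  # number of articles included in this review


def share(count, total=TOTAL_STUDIES):
    """Percentage of the reviewed studies represented by `count`."""
    return 100 * count / total


# Counts taken from the text on Tables 3 to 5; the labels are illustrative.
counts = {
    "bilingual L2 participants": 14,
    "trilingual L2 participants": 2,
    "gender-balanced L2 groups": 8,
    "gender-unbalanced L2 groups": 5,
    "L2 gender balance not reported": 3,
    "native speakers reported as bilingual": 1,
    "native speaker language background not reported": 15,
}

for label, n in counts.items():
    print(f"{label}: {n}/{TOTAL_STUDIES} = {share(n):.2f}%")
```

The exact values for 5, 3, 1, and 15 articles are 31.25%, 18.75%, 6.25%, and 93.75%, which the text reports rounded to one decimal place.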

2) Vowel elicitation methods

Table 6 presents the methods used in the examined articles to elicit vowel production from L1 and L2 speakers. The elicitation methods also vary. The majority of the examined studies (12 articles, 75%) adapted the procedure proposed by Hillenbrand in 1995. Another three articles elicited vowel production with self-developed methods. Only one article used sentence reading to extract vowels produced by L1 and L2 speakers.

3) Vowel normalization

Because vocal tracts differ between male and female speakers, and to minimize the effect of individual differences among participants, the F1 and F2 values of the two groups of participants need to be normalized before they are compared. Table 7 shows that only six articles (37.5%) employed a vowel normalization strategy. Most of the examined studies (10 articles, 62.5%) did not apply any. Even among the six articles that normalized vowels, no two applied the same method.
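To make the idea of normalization concrete, the sketch below applies one widely used speaker-intrinsic method, Lobanov-style z-scoring, in which each formant is standardized within each speaker. It is an illustration only: the reviewed studies did not share a method, and nothing here is claimed to be the procedure any of them used. The function name `lobanov_normalize`, the speaker labels, and the formant values are hypothetical, and using the population standard deviation is an assumption of this sketch.

```python
from statistics import mean, pstdev


def lobanov_normalize(tokens):
    """Z-score F1 and F2 within each speaker (Lobanov-style normalization).

    tokens: list of dicts with keys "speaker", "vowel", "f1", "f2" (Hz).
    Returns new dicts with added "f1_z" and "f2_z" keys, grouped by speaker.
    """
    by_speaker = {}
    for token in tokens:
        by_speaker.setdefault(token["speaker"], []).append(token)

    normalized = []
    for items in by_speaker.values():
        # Per-speaker mean and (population) standard deviation of each formant.
        stats = {}
        for key in ("f1", "f2"):
            values = [t[key] for t in items]
            stats[key] = (mean(values), pstdev(values))
        for t in items:
            out = dict(t)
            for key in ("f1", "f2"):
                mu, sigma = stats[key]
                out[key + "_z"] = (t[key] - mu) / sigma if sigma else 0.0
            normalized.append(out)
    return normalized


# Hypothetical formant values in Hz, invented for illustration only.
tokens = [
    {"speaker": "F01", "vowel": "i", "f1": 310.0, "f2": 2790.0},
    {"speaker": "F01", "vowel": "a", "f1": 850.0, "f2": 1220.0},
    {"speaker": "F01", "vowel": "u", "f1": 370.0, "f2": 950.0},
    {"speaker": "M01", "vowel": "i", "f1": 270.0, "f2": 2290.0},
    {"speaker": "M01", "vowel": "a", "f1": 730.0, "f2": 1090.0},
    {"speaker": "M01", "vowel": "u", "f1": 300.0, "f2": 870.0},
]

for row in lobanov_normalize(tokens):
    print(row["speaker"], row["vowel"], round(row["f1_z"], 2), round(row["f2_z"], 2))
```

After this step each speaker's F1 and F2 values have mean 0 and standard deviation 1, so a female and a male speaker can be compared on a common scale rather than in raw Hz.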

Table 3. L2 participants in the examined studies.

Table 4. L2 participants’ individual factors.

Table 5. Information about the baseline native speakers in the examined articles.

Table 6. Vowel elicitation method.

Table 7. Vowel normalization.

4. Discussion

Homogeneity of Participants

To answer the first research question, the potential reasons for the discrepancies in the findings of past studies are various. Based on this review, the homogeneity of participants seems to be the most important factor behind the differences. Table 3 shows that some participants in the examined studies are bilingual, whereas others are trilingual. The influence of being bilingual or trilingual needs to be scrutinized. Furthermore, individual differences among participants are critical to the different findings. Only half of the examined studies reported the participants’ English proficiency. However, one study has already shown that proficient Chinese English speakers show little or no difference in L2 vowel production in comparison with native speakers (Xie & Jaeger, 2020). Therefore, evaluating the L2 participants’ English proficiency specifically and precisely might be essential.

Dialects Background

Another factor that possibly contributes to the varied findings is the dialect spoken by the L2 participants. Only 25% of the examined articles mentioned the participants’ dialect background, although it might have a significant influence on the discrepant findings. Studies have shown variation in vowel production among Chinese English speakers with different dialect backgrounds. For instance, Siqi and Sewell (2012) explored a number of phonological features produced by twelve students from different locations in China and showed that the substitution of the palato-alveolar fricative /ʒ/ depends on where the students originally come from. Similarly, differences in diphthong production have been found between speakers of standard Mandarin and Shanghai-accented Mandarin (Li & Wang, 2003). Wen and Jia (2016) likewise found that /a/ produced by Changsha English learners differs significantly from /a/ produced by native American speakers, and that the difference is attributable largely to the Changsha dialect. English speakers whose dialect is Cantonese are also reported to produce vowels differently from Mandarin English speakers (Ji & Jiang, 2022), which further indicates that dialects influence vowel production significantly. Hence, rigorous control of the L2 participants’ dialects is likely to have a great impact on the findings.

L2 Onset Learning Age

Another factor is the L2 onset learning age, which is mostly neglected by the examined articles. Only two articles mentioned the age at which L2 learning began. Studies have shown age-related differences in English vowel production among Chinese English speakers. For instance, among Chinese English speakers in China, older learners performed better in vowel production overall. However, among participants who had been in the United States for a few years, those who started learning English at a younger age performed better in vowel production (Jia, Strange, Wu, Collado, & Guan, 2006). Therefore, it is highly recommended to specify the age at which participants started learning the L2, since it might influence their performance significantly. Two further possible contributing factors are the length of L2 learning, which is neglected by almost all of the examined studies, and the length of stay in an English-speaking country. Longer L2 learning and a longer stay in an English-speaking country tend to lead to better proficiency in general. Therefore, transfer studies should scrutinize how long participants have been learning the L2 and how long they have stayed in an English-speaking country.

Methodology Issues

Other factors contributing to the discrepant findings of past studies are methodological. One of the most important is vowel normalization: 62.5% of the studies examined in this review did not normalize the vowels produced by the L1 and L2 speakers. Normalization is a necessary step because it minimizes differences caused by individual factors, such as the difference in vocal tract size between male and female speakers. Another methodological issue is the gender balance of participants. Gender-balanced participant groups help to reduce noise caused by gender differences. However, half of the examined studies did not use, or did not report using, gender-balanced participants, which could partially explain the discrepant findings of past studies.

Baseline English Native Providers

Whether the baseline providers are bilingual or monolingual native speakers may also contribute to the differences in the findings of past studies. The majority of the studies examined in this review did not mention whether the baseline native speakers are bilingual or monolingual. However, this should be scrutinized in transfer studies, since the influence of the L2 on L1 production has been widely demonstrated in different language contexts. For instance, a Portuguese speaker was judged by native Portuguese listeners to speak Portuguese with an American English accent after staying in the US for four months (Sancier & Fowler, 1997). The effect of the L2 on the L1 has also been shown in a great number of studies (e.g., Kartushina et al., 2015; Liu & Chen, 2021; Wang & Munro, 2018; Wang & Munro, 2019; Zhang, 2019). Therefore, it matters whether the native speakers are bilingual or monolingual, since being bilingual could influence their L1 vowel production.

The issues mentioned above might explain the discrepant findings of past studies. Regarding the second research question of this review, the participants’ dialect backgrounds should be considered and added to the SLM-r. Even though the revised model focuses extensively on individual factors, the dialects of participants should receive more emphasis, especially in a context like China, where there are 56 ethnic groups. In most families, Mandarin might not be the children’s first language. Dialects are often the first variety used in the home domain. Mandarin is used as the formal language in most formal contexts, such as schools, whereas dialects are mostly used among friends and family. In some cases, participants might be more fluent in their dialects than in standard Mandarin. Therefore, the influence of dialects on L2 learning should be given more attention in transfer studies. Most transfer studies attribute the discrepancies between the vowel production of Chinese English speakers and that of native speakers to L1 influence without precisely considering and controlling for the participants’ dialect background (e.g., Evanini & Huang, 2013; Olagbaju, Barkana, & Gupta, 2010; Jin & Liu, 2013).

5. Implications

Further studies exploring the reasons for the differences found when comparing L1 and L2 vowel production should consider the dialects spoken by the L2 participants, since the discrepancies might be caused by their dialects. Further empirical studies might focus more on comparing the phonetic systems of particular dialects with the English phonetic system. Doing so would provide a better understanding of the causes of the differences and, eventually, shed light on English teaching in particular locations as well. Furthermore, proficiency in the dialects should be evaluated. Evaluating proficiency in both the dialects and standard Chinese might provide insight into the influence on the L2, because it helps to determine whether the influence comes from Mandarin or from the dialects spoken by the participants.

Other individual factors of participants in a transfer study, such as L2 onset learning age, length of L2 learning, L2 proficiency, and length of stay in an English-speaking country, should all be carefully examined, since each of these variables potentially has an impact on vowel production. Therefore, studies comparing L2 and L1 phonetic production should strictly control for these variables. Further studies are also encouraged to explore the relationships among these variables.

Pedagogical Implications

Teachers should pay more attention to the dialects spoken by their students when teaching phonetics and phonology. Being aware of students’ dialects enables teachers to adjust their teaching and to help students master L2 phonemes more effectively.

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Barkana, B. D., & Patel, A. (2020). Analysis of Vowel Production in Mandarin/Hindi/American-Accented English for Accent Recognition Systems. Applied Acoustics, 162, Article ID: 107203.
https://doi.org/10.1016/j.apacoust.2019.107203
[2] Chen, Y., & Robb, M. (2000). Acoustic Features of Vowel Production in Mandarin Speakers of English. In Proceedings of 6th International Conference on Spoken Language Processing (ICSLP 2000) (Vol. 2, pp. 587-590). ISCA.
https://doi.org/10.21437/ICSLP.2000-337
[3] Chen, Y., Robb, M., Gil, H., & Lerman, J. (2001). Vowel Production by Mandarin Speakers of English. Clinical Linguistics & Phonetics, 15, 427-440.
https://doi.org/10.1080/02699200110044804
[4] Evanini, K., & Huang, B. (2013). Production of English Vowels by Speakers of Mandarin Chinese with Prolonged Exposure to English. Proceedings of Meetings on Acoustics, 18, Article ID: 060004.
https://doi.org/10.1121/1.4793560
[5] Flege, J. (1995). Second Language Speech Learning: Theory, Findings, and Problems. In W. Strange (Ed.), Speech Perception and Linguistic Experience: Issues in Cross-Language Research (pp. 233-273). York Press.
[6] Guo, J., Feng, H., & Jia, Y. (2022). Influence of Chinese Er Suffixation on American English R-Colored Vowels by Northeast Chinese EFLL Learners. In 2022 International Conference on Asian Language Processing (IALP). IEEE.
https://doi.org/10.1109/IALP57159.2022.9961268
[7] Hardman, J. (2014). Accentedness and Intelligibility of Mandarin-Accented English for Chinese, Koreans, and Americans. In Proceedings of the International Symposium on the Acquisition of Second Language Speech (Vol. 5, pp. 240-260). COPAL.
[8] Ji, A., Berry, J. J., & Johnson, M. T. (2013). Vowel Production in Mandarin Accented English and American English: Kinematic and Acoustic Data from the Marquette University Mandarin Accented English Corpus. Proceedings of Meetings on Acoustics, 19, Article ID: 060221.
https://doi.org/10.1121/1.4800290
[9] Ji, J., & Jiang, A. (2022). The Influence of Dialect on the Perception and Production of Lax-Tense Vowel Distinction in English Learning. International Journal of Language and Linguistics, 10, 103-110.
https://doi.org/10.11648/j.ijll.20221002.16
[10] Jia, G., Strange, W., Wu, Y., Collado, J., & Guan, Q. (2006). Perception and Production of English Vowels by Mandarin Speakers: Age-Related Differences Vary with Amount of L2 Exposure. The Journal of the Acoustical Society of America, 119, 1118-1130.
https://doi.org/10.1121/1.2151806
[11] Jin, S.-H., & Liu, C. (2013). The Vowel Inherent Spectral Change of English Vowels Spoken by Native and Non-Native Speakers. The Journal of the Acoustical Society of America, 133, EL363-EL369.
https://doi.org/10.1121/1.4798620
[12] Kartushina, N., Hervais-Adelman, A., Frauenfelder, U. H., & Golestani, N. (2015). The Effect of Phonetic Production Training with Visual Feedback on the Perception and Production of Foreign Speech Sounds. The Journal of the Acoustical Society of America, 138, 817-832.
https://doi.org/10.1121/1.4926561
[13] Li, A., & Wang, X. (2003). A Contrastive Investigation of Standard Mandarin and Accented Mandarin. In 8th European Conference on Speech Communication and Technology (Eurospeech 2003) (pp. 2345-2348). ISCA.
https://doi.org/10.21437/Eurospeech.2003-647
[14] Liu, C., & Chen, H. (2021). The Production of English Vowel Contrasts by Mandarin-Speaking Learners: The Role of Phonetic Training and Individual Differences. Journal of Second Language Pronunciation, 7, 190-219.
[15] Liu, C., Jin, S.-H., & Chen, C.-T. (2013). Durations of American English Vowels by Native and Non-Native Speakers: Acoustic Analyses and Perceptual Effects. Language and Speech, 57, 238-253.
https://doi.org/10.1177/0023830913507692
[16] Ma, Q. (2016). Acoustic Analysis of English Vowels Produced by Chinese Learners at Different Proficiency Levels. Language Teaching Research, 20, 35-53.
[17] Olagbaju, Y., Barkana, B. D., & Gupta, N. (2010). English Vowel Production by Native Mandarin and Hindi Speakers. In 2010 7th International Conference on Information Technology: New Generations. IEEE.
https://doi.org/10.1109/ITNG.2010.220
[18] Robb, M. P., & Chen, Y. (2008). Vowel Space in Mandarin-Accented English. Asia Pacific Journal of Speech, Language and Hearing, 11, 175-188.
https://doi.org/10.1179/136132808805297179
[19] Sancier, M. L., & Fowler, C. A. (1997). Gestural Drift in a Bilingual Speaker of Brazilian Portuguese and English. Journal of Phonetics, 25, 421-436.
https://doi.org/10.1006/jpho.1997.0051
[20] Siqi, L., & Sewell, A. (2012). Phonological Features of China English. Asian Englishes, 15, 80-101.
https://doi.org/10.1080/13488678.2012.10801331
[21] Van Anh Le, N., & Kitahara, M. (2020). English Vowel Duration Affected by Voicing Contrast in Chinese, Korean, Japanese, and Vietnamese. In The Asian Conference on Language 2020: Official Conference Proceedings. IAFOR.
https://doi.org/10.22492/issn.2435-7030.2020.3
[22] Wang, H., & van Heuven, V. J. (2013). Mutual Intelligibility of American, Chinese and Dutch-Accented Speakers of English Tested by SUS and SPIN Sentences. In Interspeech 2013 (pp. 431-435). ISCA.
https://doi.org/10.21437/Interspeech.2013-129
[23] Wang, H., & van Heuven, V. J. (2006). Acoustical Analysis of English Vowels Produced by Chinese, Dutch and American Speakers. Linguistics in the Netherlands, 23, 237-248.
[24] Wang, J., & Munro, M. J. (2018). The Production of English Vowels by Mandarin-Speaking Learners: The Effect of Language Experience on Vowel Duration. Journal of Phonetics, 71, 372-383.
[25] Wang, J., & Munro, M. J. (2019). The Production of English Vowels by Mandarin-Speaking Learners: A Critical Review of Research. Journal of Second Language Pronunciation, 5, 245-269.
[26] Wayland, R. (2021). Second Language Speech Learning: Theoretical and Empirical Progress. Cambridge University Press.
https://doi.org/10.1017/9781108886901
[27] Wen, X., & Jia, Y. (2016). Joint Effect of Dialect and Mandarin on English Vowel Production: A Case Study in Changsha EFL Learners. In Interspeech 2016 (pp. 185-189). ISCA.
https://doi.org/10.21437/Interspeech.2016-1022
[28] Wu, C.-H., & Shih, C. (2012). A Corpus Study of Native and Non-Native Vowel Quality. In Speech Prosody 2012 (pp. 234-237). ISCA.
http://www.isca-speech.org/archive
[29] Xie, X., & Jaeger, T. F. (2020). Comparing Non-Native and Native Speech: Are L2 Productions More Variable? The Journal of the Acoustical Society of America, 147, 3322-3347.
https://doi.org/10.1121/10.0001141
[30] Zhang, J. (2019). Perception and Production of English Vowel Contrasts by Mandarin-Speaking Learners of English: A Review. Journal of Second Language Pronunciation, 5, 214-244.
[31] Zhang, J., & Chen, M. (2008). F1 and F2 Variations in Mandarin-Accented English. Journal of Phonetics, 36, 664-679.

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.