Analysis of Text Readability of College English Course Books

Baohua Zhang

doi:10.4236/oalib.1107817

Open Access Library Journal > Vol.8 No.9, September 2021

Zhejiang Yuexiu University, Shaoxing, China.

Abstract

Course books are an important source of knowledge input in classroom teaching. Text difficulty is one of the key factors in choosing course books, and text readability serves as an important indicator of text difficulty. However, there have been few studies on the text readability of college English course books. This study uses the WE Research platform and adopts Flesch Reading Ease and other readability formulas as detection tools to study how text readability develops across the course books. The researcher analyzes the surface text characteristics of word length and sentence length, as well as readability index 1 and readability index 2, of the texts in the 4 volumes of New Target College English Integrated Course to verify whether the compilation of the course books follows the rule that text difficulty should develop from low to high. This article first summarizes the background and discusses the necessity of understanding text readability in college English teaching. Second, it reviews the development and study of text readability both at home and abroad. Third, it describes the methodology of the study and discusses the data collected from the platform. It is hoped that this study can provide some guidance for editors, publishers, teachers, and students.

Keywords

Text Readability, College English, Course Books, Text Difficulty

Share and Cite:

Zhang, B.H. (2021) Analysis of Text Readability of College English Course Books. Open Access Library Journal, 8, 1-14. doi: 10.4236/oalib.1107817.

1. Introduction

Course books play an indispensable role in classroom teaching. They provide students with various texts, explanations, activities, and exercises, and thus affect the quality of teaching and learning to a certain extent. In January 2020, the National Textbook Committee issued “The National Planning for the Construction of Textbooks in Universities, Middle Schools, and Primary Schools (2019-2022)” [1], and the Ministry of Education published four management measures, including “The Management Measures on Course Books for Universities”, calling for more scientific and standardized management of college course books. Due to the expansion of college enrollment, more and more students of different levels are enrolled, and college education has made a transition from elite education to mass education. The admission rate of the college entrance examination rose from about 5% in 1977, when China restored the examination, to about 90% in 2021. Nowadays most high school students can go on to colleges or vocational schools for further education. Course books serve as the medium for delivering knowledge in this process, so more and more attention has been paid to compiling and choosing them. In college English teaching for non-English majors, the teaching content mainly covers the vocabulary, syntax, and themes of the texts in each unit. Therefore, the difficulty level of the course books and the readability of the texts can directly influence students’ understanding of English knowledge and acquisition of skills.

2. Literature Review

Reading a text in a foreign language presents readers with many challenges, such as unfamiliar meanings and usages of particular vocabulary, unknown syntactic features, and unfamiliar topics. Alderson considered text readability a variable that influences the reading of a text [2]. Thus, choosing reading materials with appropriate text readability is crucial. According to the Longman Dictionary of Language Teaching & Applied Linguistics, text readability is defined as the degree to which a text is easy to read. The factors that affect readability include sentence length, number of new words, and grammatical complexity [3].

2.1. Studies on Text Readability Abroad

Studies on readability originated in America and can be traced back to the early 20th century. Thorndike, the famous psychologist, first introduced a text readability method in his book “Teacher’s Word Book” [4]. The method was to compile the common words in a course book into a vocabulary table according to their frequencies and then use the difficulty level of the table to judge text readability. This laid the groundwork for subsequent research in this paradigm, which uses words as a variable. Text readability research has continued all over the world ever since.

Rudolf Franz Flesch was a representative scholar in exploring how to effectively measure the readability of English text. Flesch proposed a readability formula in his article “A New Readability Yardstick”, published in 1948 [5]. This formula helps readers calculate text readability from influencing factors such as vocabulary, the length of words in syllables, and syntax. Later, many other readability formulas were put forward, including the Automated Readability Index proposed by Senter and Smith [6], the SMOG readability formula proposed by G. Harry McLaughlin [7], and the Flesch-Kincaid Grade Level readability formula proposed by Kincaid and Flesch [8].

Representative studies on English text readability include those of Dale and Chall and of Kasule. Dale and Chall considered text readability one of the factors that decide readers’ success in understanding reading material [9]. Kasule discussed the correlation between text readability and second language learning, noting that greater teacher awareness of text readability could help develop learners’ reading ability [10].

Studies on text readability abroad mainly focus on how to measure it effectively and on its influencing factors. The formulas and ideas proposed have had a profound influence on studies at home.

2.2. Studies on Text Readability at Home

Early domestic readability research mainly introduced foreign research results and applied them empirically to domestic English course books, with the purpose of verifying that foreign English readability formulas could be applied to domestic course books, where English is taught as a second language. Le Meiyun tested 16 texts of different styles and different levels of difficulty from 5 course books with the help of the American scholar Fry’s Reading Difficulty Assessment Indicators [11]. The results showed that Fry’s assessment formula was also valid for the texts in domestic English course books.

Recently, more attention has been paid to analyzing the readability of domestic course books in the context of new curriculum standards and diversified teaching materials. Domestic research on the readability of English course books can be divided mainly into two categories. The first analyzes text readability, identifies problems, and offers constructive suggestions for the compilation of course books. For example, Yin Fan and Zhang Yanbin found that the shortcomings of college English course books in most independent colleges lay in readability, practicability, and interest. The study also suggested that independent colleges should adopt course books according to the needs of their students and take readability, practicability, and interest into consideration in compiling course books [12]. The second uses corpus language analysis software to calculate the readability value of the texts in middle school, vocational school, and college English course books. For example, Pan Xiao used Coh-Metrix to quantitatively analyze the language difficulty of the reading part of the CET-4 and of college English course books and identified the differences in language difficulty between the reading tests and the currently used course books [13]. The study also found that the readability of the CET-4 reading materials was much higher than that of Volumes 1 and 2 of the course books and that there was no obvious difference in difficulty between the two volumes.

3. Methodology

3.1. Research Questions

Most foreign research on course book readability focuses on the formulas for determining readability and on the influencing factors, while domestic research on the text readability of English course books mainly focuses on how to use effective formulas and corpus analysis software to calculate the readability value. Comparisons between different series of course books, or between different volumes of the same series, can be found, but there is a lack of research comparing text readability among different units within a course book.

This study analyzes and compares the readability of texts in the 4 volumes of New Target College English Integrated Course and intends to answer the following questions:

1) What are the features of the readability of texts in New Target College English Integrated Course?

2) What is the result of the comparison of the text readability among different units in each volume and that among different volumes?

3) Does the development of text difficulty between consecutive units and consecutive volumes follow the compilation rule of gradual increase?

3.2. The Corpora

The research corpora in this study are Volumes 1 - 4 of New Target College English Integrated Course, published by Shanghai Foreign Language Education Press with Liu Zhengguang as the chief editor. This series of course books was jointly compiled by many well-known experts from different universities. It covers both basic language skills and cultural connotations in order to cultivate students’ humanistic spirit and accomplishment through teaching. It also appeals to a wide range of teachers and students with its blend of traditional and communicative activities. There is a rich array of learning activities in each unit; the study selects text A of each unit in the 4 volumes because text A is the main component used in class. Each volume has a total of 8 texts, so the corpus contains 32 texts in total.

3.3. Method

The examination of the readability of the texts in this study was based on the evaluation of text complexity from the perspective of vocabulary and syntax. The WE Research platform was used to analyze text readability and calculate the average word length and sentence length of each text. Six commonly used readability indicators were also summarized by the platform, namely Flesch Reading Ease, Flesch-Kincaid Grade Level, Automated Readability Index, Coleman-Liau Readability Score, Gunning Fog, and the SMOG Readability Index. The WE platform is a one-stop digital service platform created by Shanghai Foreign Language Education Press to serve college teachers and students nationwide. It provides multiple teaching and research resource libraries and is designed to meet users’ needs for teaching, researching, training, and testing.

Flesch Reading Ease is based on the statistical method of Dr. Rudolf Flesch. It is regarded as one of the more accurate measures for school texts. The average number of words per sentence and the average number of syllables per word are used to calculate the value, which generally ranges from 0 to 100. The higher the score, the easier the text is to read. Its calculation formula is:

$206.835 - 1.015 \left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6 \left(\frac{\text{total syllables}}{\text{total words}}\right)$

The difficulty degree can be divided into 8 levels. More information can be found in Table 1.
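As a minimal sketch of the Flesch Reading Ease formula above, the following Python function computes the score from raw counts. The function name and the example counts are illustrative assumptions; how the WE Research platform counts words, sentences, and syllables is not described in this study.

```python
def flesch_reading_ease(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Flesch Reading Ease from raw counts (higher score = easier text)."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("total_words and total_sentences must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Hypothetical text: 100 words, 5 sentences, 150 syllables.
score = flesch_reading_ease(100, 5, 150)
print(f"{score:.1f}")  # 59.6
```

With 20 words per sentence and 1.5 syllables per word, the two penalty terms are 20.3 and 126.9, which gives a score of about 59.6.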

Flesch-Kincaid Grade Level, which roughly corresponds to the grade level of U.S. primary and secondary education required to understand a text, consists of 12 levels. The calculation formula is:

$0.39 \left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8 \left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59$
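The Flesch-Kincaid Grade Level formula can be sketched in the same way. The function name and the sample counts below are assumptions made for illustration only.

```python
def flesch_kincaid_grade(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Flesch-Kincaid Grade Level from raw counts (higher value = harder text)."""
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("total_words and total_sentences must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


# Hypothetical text: 100 words, 5 sentences, 150 syllables.
print(f"{flesch_kincaid_grade(100, 5, 150):.2f}")  # 9.91
```

It uses the same two ratios as Flesch Reading Ease but with positive weights, so longer sentences and longer words raise the grade level instead of lowering the score.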

Similar to the Flesch-Kincaid Grade Level, the Automated Readability Index also roughly corresponds to the reading level of U.S. primary and secondary schools. It outputs the approximate grade level needed to understand the text. For example, U.S. grade level 6 corresponds to the understanding of 11- to 12-year-old 6th grade students. Its calculation formula is:

$4.71 \left(\frac{\text{characters}}{\text{words}}\right) + 0.5 \left(\frac{\text{words}}{\text{sentences}}\right) - 21.43$
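A minimal sketch of the Automated Readability Index follows. The function name and the example counts are hypothetical, and whether the platform counts only letters or also digits as characters is not stated in the study.

```python
def automated_readability_index(characters: int, words: int, sentences: int) -> float:
    """Automated Readability Index from raw counts (approximate U.S. grade level)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43


# Hypothetical text: 450 characters, 100 words, 5 sentences.
print(f"{automated_readability_index(450, 100, 5):.1f}")  # 9.8
```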

Table 1. Flesch reading ease value.

Meri Coleman and T. L. Liau designed the Coleman-Liau Readability Score. Its calculation formula is:

$CLI = 0.0588 L - 0.296 S - 15.8$

L stands for the average number of letters per 100 words. S stands for the average number of sentences per 100 words. The score approximates the U.S. grade level needed to understand the text and is based on characters rather than on syllables.
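The Coleman-Liau score can be sketched as below, deriving L and S from raw counts as defined above. The function name and the sample counts are illustrative assumptions.

```python
def coleman_liau_index(letters: int, words: int, sentences: int) -> float:
    """Coleman-Liau index from raw counts (approximate U.S. grade level)."""
    if words <= 0:
        raise ValueError("words must be positive")
    letters_per_100_words = letters / words * 100      # L in the formula
    sentences_per_100_words = sentences / words * 100  # S in the formula
    return 0.0588 * letters_per_100_words - 0.296 * sentences_per_100_words - 15.8


# Hypothetical text: 450 letters, 100 words, 5 sentences, so L = 450 and S = 5.
print(f"{coleman_liau_index(450, 100, 5):.2f}")  # 9.18
```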

The Gunning Fog readability index is based on daily newspapers and magazines. It roughly reflects the number of years of formal education required to understand a given text. For example, a Fog Index value of 14 means that a U.S. college sophomore (about 20 years old) can understand the text. Texts with a Fog Index of 7 - 8 are considered ideal, and a score above 12 is too hard for most people. Its calculation formula is:

$0.4 \left[\left(\frac{\text{words}}{\text{sentences}}\right) + 100 \left(\frac{\text{complex words}}{\text{words}}\right)\right]$
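A minimal sketch of the Gunning Fog formula is given below. The count of complex words is taken as an input, so the sketch does not depend on how complex words are identified; the function name and the example counts are assumptions.

```python
def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning Fog index from raw counts (approximate years of formal education)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    average_sentence_length = words / sentences
    percent_complex_words = 100 * complex_words / words
    return 0.4 * (average_sentence_length + percent_complex_words)


# Hypothetical text: 100 words, 5 sentences, 10 complex words.
print(f"{gunning_fog(100, 5, 10):.1f}")  # 12.0
```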

G. Harry McLaughlin designed the SMOG readability index. It also roughly reflects how many years of formal education are needed to understand a text. Its calculation formula is:

$\text{grade} = 1.0430 \sqrt{\text{number of polysyllables} \times \frac{30}{\text{number of sentences}}} + 3.1291$

Polysyllables are words that contain three or more syllables.
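The SMOG formula can be sketched as follows. The function name and the sample counts are hypothetical; the factor 30 / sentences scales the polysyllable count to a 30-sentence sample, as in the formula above.

```python
import math


def smog_grade(polysyllables: int, sentences: int) -> float:
    """SMOG grade from the number of polysyllabic words and sentences."""
    if sentences <= 0:
        raise ValueError("sentences must be positive")
    return 1.0430 * math.sqrt(polysyllables * 30 / sentences) + 3.1291


# Hypothetical text: 16 polysyllabic words in 30 sentences.
print(f"{smog_grade(16, 30):.2f}")  # 7.30
```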

3.4. Data Collection and Data Analysis

The researcher read, organized, and analyzed the features of the text readability of the corpora and applied statistical and comparative research methods to analyze New Target College English Integrated Course from the perspectives of vocabulary, syntax, readability index 1, and readability index 2.

Firstly, a total of 32 independent corpora were developed, consisting of text A in each unit of Volumes 1 - 4 of New Target College English Integrated Course. Secondly, the researcher input each corpus into the WE Research platform [14]. Thirdly, the researcher entered the data into Excel to obtain the average of the values across the units of each of the 4 volumes and then calculated variances for comparative analysis.
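The averaging and variance step can be sketched in Python instead of a spreadsheet. The unit values below are hypothetical placeholders, not data from the study, and the use of population variance is an assumption, since the study does not state which variance was calculated.

```python
from statistics import fmean, pvariance


def summarize_volume(unit_values: list[float]) -> tuple[float, float]:
    """Return (mean, population variance) of one measure across a volume's units."""
    if not unit_values:
        raise ValueError("unit_values must not be empty")
    return fmean(unit_values), pvariance(unit_values)


# Hypothetical average word lengths for the 8 texts of one volume.
example_units = [4.3, 4.5, 4.6, 4.4, 4.7, 4.5, 4.6, 4.4]
mean_value, variance_value = summarize_volume(example_units)
print(f"mean={mean_value:.2f} variance={variance_value:.3f}")  # mean=4.50 variance=0.015
```

A larger variance indicates that the measure fluctuates more from unit to unit within the volume.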

Output data include six commonly used readability formulas and eight types of text features. More information can be found in Figure 1.

Figure 1. Output data from WE Research Platform.

The eight types of text features are: the total numbers of words, sentences, syllables, and letters, the average word length, the average sentence length, complex words (words with three or more syllables), and the average number of syllables per word. The six readability formulas are: Flesch Reading Ease, Flesch-Kincaid Grade Level, Automated Readability Index, Coleman-Liau Readability Score, Gunning Fog, and SMOG. Of all these values, the average word length and the average sentence length were chosen as the two surface characteristics to be analyzed because they play an important role in text readability. The researcher adopted Flesch Reading Ease as Readability Index 1. The remaining five readability formulas (Flesch-Kincaid Grade Level, Automated Readability Index, Coleman-Liau Readability Score, Gunning Fog, and SMOG) all relate to the length of the reader’s formal education or to U.S. school grades, so the average of the five values in each unit was defined as Readability Index 2.
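Readability Index 2, as defined above, is the mean of the five grade-level values for a text. A minimal sketch follows; the function name and the five example values are hypothetical.

```python
from statistics import fmean


def readability_index_2(flesch_kincaid: float, ari: float, coleman_liau: float,
                        gunning_fog: float, smog: float) -> float:
    """Mean of the five grade-level readability values for one text."""
    return fmean([flesch_kincaid, ari, coleman_liau, gunning_fog, smog])


# Hypothetical grade-level values for a single text.
print(f"{readability_index_2(9.9, 9.8, 9.2, 12.0, 7.3):.2f}")  # 9.64
```

Because all five inputs are on a grade-level or years-of-education scale, their mean can be read on the same scale, whereas Flesch Reading Ease runs in the opposite direction and is kept separate as Readability Index 1.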

4. Results and Discussion

4.1. The Average Word Length

Each text was imported into the readability detection tool on WE Research platform and the average word length value of each text was calculated. Based on the data, the researcher calculated the average word length of each volume and the variance of them and then made a line chart which can visually demonstrate the variation trend of the average word length among texts and volumes. More information can be found in Table 2 and Figure 2.

According to Table 2 and Figure 2, the following conclusions can be drawn.

First, the average word length is 4.51 in Volume 1, 4.89 in Volume 2, 4.83 in Volume 3, and 4.87 in Volume 4. Although the average word length of Volumes 1, 3, and 4 generally goes from low to high, the average word length of Volume 2, used in the second semester, is higher than that of Volume 3, used in the third semester, and that of Volume 4, used in the fourth semester, and some units show a trend of change from high to low.

Table 2. Value of average word length.

Figure 2. Variation trend of average word length.

Second, the average word length of the texts in the 4 volumes is uneven and does not show a trend of changing from low to high. Taking the variance into account, the average word length is relatively stable in Volumes 1 and 3, while that of Volumes 2 and 4 fluctuates slightly.

Based on the analysis, from the perspective of average word length, the difficulty setting of Volumes 1, 3, and 4 generally follows the rule of difficulty development from low to high. Volume 2 does not obey the rule: it has the highest average word length. Furthermore, the average word length across the units within each volume is uneven and does not follow the rule of difficulty development from low to high.

4.2. The Average Sentence Length

Each text was imported into the readability detection tool on WE Research platform and the average sentence length value of each text was calculated. Based on the data, the researcher calculated the average sentence length of each volume and the variance of them and then made a line chart which can visually demonstrate the variation trend of the average sentence length among texts and volumes. More information can be found in Table 3 and Figure 3.

According to Table 3 and Figure 3, the following conclusions can be drawn.

First, the average sentence length increases from 19.06 in Volume 1 to 21.69 in Volume 4, but there is a decrease from 19.49 in Volume 2 to 19.14 in Volume 3. Meanwhile, the change is larger between Volume 3 and Volume 4 and much smaller between Volume 1 and Volume 2 and between Volume 2 and Volume 3.

Second, the average sentence length of the texts in the 4 volumes is uneven and does not show a trend of changing from low to high. Moreover, taking the variance into account, the average sentence length fluctuates slightly in Volume 2, more in Volumes 3 and 4, and most in Volume 1.

Based on the analysis, from the perspective of average sentence length, the difficulty setting of Volumes 1, 2, and 4 generally follows the rule of difficulty development from low to high. Volume 3 does not obey the rule: its average sentence length is lower than that of Volume 2, and the increase in difficulty from Volume 1 to Volume 3 is too small. Furthermore, the average sentence length across the units within each volume is uneven and does not follow the rule of difficulty development from low to high.

Table 3. Value of average sentence length.

Figure 3. Variation trend of average sentence length.

4.3. The Readability Index 1 Value

Each text was imported into the readability detection tool on WE Research platform and the Flesch Reading Ease value of each text was calculated. Based on the data, the researcher calculated the average readability index 1 value of each volume and the variance of them and then made a line chart which can visually demonstrate the variation trend of the readability index 1 value among texts and volumes. More information can be found in Table 4 and Figure 4.

According to Table 4 and Figure 4, the following conclusions can be drawn.

First, the average readability index 1 value is 63.16 in Volume 1, 50.35 in Volume 2, 54.89 in Volume 3, and 49.28 in Volume 4. Although the average readability index 1 value of the 4 volumes generally goes from high to low, the value of Volume 2, used in the second semester, is lower than that of Volume 3, used in the third semester, and nearly as low as that of Volume 4, used in the fourth semester, and some units show a trend of change from low to high.

Table 4. Readability index 1 value.

Figure 4. Variation trend of readability index 1 value.

Second, the highest readability score in Volume 1 belongs to the fourth text (78.11), considered Fairly Easy. The first text (51.37) has the lowest readability score, considered Fairly Difficult. The highest readability score in Volume 2 belongs to the sixth text (65.76), considered Standard. The third text (34.99) has the lowest readability score, considered Difficult. In Volume 3, the fifth text has the highest readability score (68.32), considered Standard, and the eighth text has the lowest (46.99), considered Difficult. In Volume 4, the second text has the highest readability score (62.35), considered Standard, and the third text has the lowest (32.28), considered Difficult. Moreover, taking the variance into account, the readability index 1 value fluctuates slightly in Volume 3, more in Volumes 1 and 4, and most in Volume 2.

Based on the analysis, from the perspective of the average readability index 1 value, the difficulty setting of Volumes 1, 3, and 4 generally follows the rule of difficulty development from low to high. Volume 2 does not obey the rule: its value is lower than that of Volume 3 and approximately as low as that of Volume 4. Furthermore, the readability index 1 value across the units within each volume is uneven and does not follow the rule of difficulty development from low to high.

4.4. The Readability Index 2 Value

Each text was imported into the readability detection tool on the WE Research platform, and the 5 readability values of Flesch-Kincaid Grade Level, Automated Readability Index, Coleman-Liau Readability Score, Gunning Fog, and SMOG were calculated for each text. Based on the data, the researcher calculated the average readability index 2 value of each volume and its variance and then made a line chart that visually demonstrates the variation trend of the readability index 2 value among texts and volumes. More information can be found in Table 5 and Figure 5.

According to Table 5 and Figure 5, the following conclusions can be drawn.

First, the average readability index 2 value is 10.13 in Volume 1, 12.19 in Volume 2, 11.68 in Volume 3, and 12.83 in Volume 4. Although the average readability index 2 value of Volumes 1, 3, and 4 generally goes from low to high, the value of Volume 2, used in the second semester, is higher than that of Volume 3, used in the third semester, and some units show a trend of change from high to low.

Table 5. Readability index 2 value.

Figure 5. Variation trend of readability index 2 value.

Second, the highest readability index 2 value in Volume 1 belongs to the first text (13.21), corresponding to a U.S. college freshman. The fourth text (6.54) has the lowest value, corresponding to U.S. 6th grade students. The highest value in Volume 2 belongs to the third text (14.48), corresponding to a U.S. college sophomore. The second text (9.51) has the lowest value, corresponding to a U.S. high school freshman. In Volume 3, the third and eighth texts have the highest value (12.89), corresponding to U.S. high school senior students, and the fifth text has the lowest (8.51), corresponding to U.S. 8th grade students. In Volume 4, the third text has the highest value (15.70), corresponding to U.S. college junior or senior students, and the second text has the lowest (10.05), corresponding to a U.S. high school sophomore. In addition, taking the variance into account, the readability index 2 value fluctuates slightly in Volume 3 and more in Volumes 1, 2, and 4.

Based on the analysis, from the perspective of the average readability index 2 value, the difficulty setting of Volumes 1, 3, and 4 generally follows the rule of difficulty development from low to high. Volume 2 does not obey the rule: its value is higher than that of Volume 3. Furthermore, the readability index 2 value across the units within each volume is uneven and does not follow the rule of difficulty development from low to high.

5. Conclusion

This study demonstrates that the readability scores differ considerably among the texts in the four volumes. This section summarizes the major findings and the limitations of the study and puts forward suggestions for further study.

5.1. Findings

Firstly, in terms of average word length, the value goes from 4.51 in Volume 1 to 4.89 in Volume 2, 4.83 in Volume 3, and 4.87 in Volume 4. Volumes 1, 3, and 4 generally show a trend of increasing difficulty from low to high, but Volume 2 has the highest value. That is to say, the texts in Volume 2 have the longest average word length. In terms of average sentence length, the value goes from 19.06 in Volume 1 to 19.49 in Volume 2, 19.14 in Volume 3, and 21.69 in Volume 4. Volumes 1, 2, and 4 generally show a trend of increasing difficulty from low to high, but Volume 3 has a lower value than Volume 2. That is to say, the texts in Volume 3 have a shorter average sentence length than those in Volume 2.

Secondly, from the perspective of readability index 1, the value goes from 63.16 in Volume 1 to 50.35 in Volume 2, 54.89 in Volume 3, and 49.28 in Volume 4. Volumes 1, 3, and 4 generally show a trend of increasing text difficulty from low to high, but Volume 2 has a much lower value than Volume 1 and is nearly as low as Volume 4. That is to say, the texts in Volume 2 are far more difficult than those in Volume 1 and Volume 3 and are almost as difficult as those in Volume 4. From the perspective of readability index 2, the value goes from 10.13 in Volume 1 to 12.19 in Volume 2, 11.68 in Volume 3, and 12.83 in Volume 4. Volumes 1, 3, and 4 generally show a trend of increasing text difficulty from low to high, but Volume 2 has a higher value than Volume 3 and is nearly as high as Volume 4. That is to say, the texts in Volume 2 suit students of a higher grade than those in Volume 3 and of nearly the same grade as those in Volume 4.

Thirdly, from the data collected the researcher concludes that, although New Target College English Integrated Course shows an overall upward trend in text difficulty from Volume 1 to Volume 4, the text difficulty setting of Volume 2 is unscientific and does not follow a linear development from easy to difficult. Of the eight texts in Volume 2, four are regarded as difficult, two as fairly difficult, and the remaining two as standard. In addition, the difficulty setting of the texts within the four volumes is uneven and does not follow the rule of difficulty development from low to high. Among all the texts, the difficulty fluctuations among the texts of Volumes 1 and 3 are relatively small, and those among the texts of Volumes 2 and 4 are relatively large.
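The volume-level comparison in these findings can be checked with a short sketch that flags every pair of consecutive volumes where difficulty decreases. The volume means are the ones reported in this study; the function name and the `higher_is_harder` flag are illustrative assumptions. Readability index 1 (Flesch Reading Ease) is treated as inverted, since a lower score means a harder text.

```python
# Volume means reported in the findings (Volumes 1 to 4).
VOLUME_MEANS = {
    "average word length": ([4.51, 4.89, 4.83, 4.87], True),
    "average sentence length": ([19.06, 19.49, 19.14, 21.69], True),
    "readability index 1": ([63.16, 50.35, 54.89, 49.28], False),  # lower = harder
    "readability index 2": ([10.13, 12.19, 11.68, 12.83], True),
}


def difficulty_drops(values: list[float], higher_is_harder: bool = True) -> list[tuple[int, int]]:
    """Return (from_volume, to_volume) pairs where difficulty decreases."""
    drops = []
    for i in range(len(values) - 1):
        change = values[i + 1] - values[i]
        if not higher_is_harder:
            change = -change  # a falling score means rising difficulty
        if change < 0:
            drops.append((i + 1, i + 2))
    return drops


for measure, (means, higher_is_harder) in VOLUME_MEANS.items():
    print(measure, difficulty_drops(means, higher_is_harder))
```

On all four measures the only drop in difficulty falls between Volume 2 and Volume 3, which matches the observation that Volume 2 departs from the low-to-high progression.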

5.2. Limitations

This study investigates the readability of the texts in Volumes 1 - 4 of New Target College English Integrated Course. With the help of the online WE Research platform, 32 texts are analyzed. However, the limitations of the present research must be acknowledged. Firstly, the sample size is the main limitation. Only 32 texts are examined in the current study, and further study should expand the sample size to make the results more accurate. Secondly, the detection tools used in this research are limited. Although Flesch Reading Ease and the other five formulas are popular in analyzing text difficulty, they also have shortcomings. Further study should combine them with other analysis tools such as Coh-Metrix and Read-X. Thirdly, the values chosen for discussion are also limited. Average word length and average sentence length are only two surface characteristics. Further study should adopt other characteristics such as semantic units and the complexity of syntactic structures.

Nonetheless, the present study helps identify and analyze text features of college English texts. The potential limitations do not affect the reliability and validity of the result. On the contrary, they demonstrate there is much more work lying ahead.

Conflicts of Interest

The author declares no conflicts of interest.

References

[1]	http://www.moe.gov.cn/jyb_xwfb/xw_zt/moe_357/jyzt_2020n/2020_zt04/baodao/202004/t20200409_441835.html
[2]	Alderson, J.C. (2000) Assessing Reading. Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9780511732935
[3]	Richards, J.C., Platt, J. and Platt, H. (2000) Longman Dictionary of Language Teaching and Applied Linguistics (English-English—English-Chinese Bilingual). Foreign Language Teaching and Research Press, Beijing.
[4]	Thorndike, E.L. (1921) The Teacher’s Word Book. Teachers College, Columbia University, New York.
[5]	Flesch, R. (1948) A New Readability Yardstick. Journal of Applied Psychology, 32, 221-233. https://doi.org/10.1037/h0057532
[6]	Senter, R.J. and Smith, E.A. (1967) Automated Readability Index. Wright-Patterson Air Force Base. Aerospace Medical Division, Ohio.
[7]	McLaughlin, G.H. (1969) SMOG Grading—A New Readability Formula. Journal of Reading, 12, 639-646.
[8]	Flesch-Kincaid Readability Tests. https://en.wikipedia.org/wiki/Flesch%E2%80%93Kincaid_readability_tests#Flesch%E2%80%93Kincaid_grade_level
[9]	Dale, E. and Chall, J.S. (1948) A Formula for Predicting Readability. Educational Research Bulletin, 27, 37-54.
[10]	Kasule, D. (2011) Textbook Readability and ESL Learner. Reading & Writing, 2, 63-76. https://doi.org/10.4102/rw.v2i1.13
[11]	Le, M. (1983) Introducing a Scientific Method of Measuring the Difficulty of English Textbooks. Foreign Language Teaching and Research, 4, 47-49.
[12]	Yin, F. and Zhang, Y. (2009) On the Construction of College English Course Books in Independent Colleges. Science & Technology Information, 32, 3-4.
[13]	Pan, X. (2019) Comparison of Readability of College English Intensive Reading Textbook and CET4 Reading Materials. Culture and Education Data, 2, 208-210+205.
[14]	https://we.sflep.com/research/ReadingEase.aspx
