TITLE:
Typos Correction in Overseas Chinese Learning Based on Chinese Character Semantic Knowledge Graph
AUTHORS:
Jing Xiong, Xue Zhai, Zhan Zhang, Feng Gao
KEYWORDS:
Chinese Character Meaning, Knowledge Graph, Typos Correction, OpenHowNet, Semantic Relevancy
JOURNAL NAME:
Journal of Data Analysis and Information Processing, Vol.11 No.2, May 31, 2023
ABSTRACT: In recent years, a growing number of foreign learners have begun to
study Chinese characters, but they often make typos when writing Chinese. The
fundamental reason is that they learn characters mainly through glyph and
pronunciation without mastering their semantics. If learners understand the
meanings of Chinese characters and group characters with related meanings into
knowledge clusters, they can learn more efficiently. We pursue this goal by
building a Chinese character semantic
knowledge graph (CCSKG). The graph was built using the semantic computing
capability of HowNet, and 104,187 association edges were established for 6752
Chinese characters. With the development of deep learning, OpenHowNet has
released the core data of HowNet and provides useful APIs for calculating the
sememe-based similarity between two words. Our method therefore combines the
advantages of data-driven and knowledge-driven approaches. It treats Chinese
sentences as subgraphs of the CCSKG and uses graph algorithms to correct
Chinese typos, achieving good
results. Experiments show that, compared with keras-bert and pycorrector +
ernie, our method reduces the false acceptance rate by 38.28% and improves the
recall rate by 40.91% in the field of learning Chinese as a foreign language.
The CCSKG can help promote overseas Chinese communication and international
education.
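
The idea of treating a sentence as a subgraph of a character graph can be illustrated with a minimal sketch. This is not the authors' algorithm or data: the edge weights, the confusion set, the `margin` threshold, and the function names below are all hypothetical, chosen only to show how semantic edges between characters in a sentence could expose a character that does not fit.

```python
from typing import Dict, FrozenSet, List

# Hypothetical toy edges with semantic relevance weights in [0, 1].
# These values are invented for illustration; they are not taken from the CCSKG.
EDGES: Dict[FrozenSet[str], float] = {
    frozenset(("喝", "茶")): 0.9,  # drink - tea
    frozenset(("想", "喝")): 0.3,  # want - drink
    frozenset(("渴", "茶")): 0.2,  # thirsty - tea
    frozenset(("想", "渴")): 0.1,  # want - thirsty
}

# Hypothetical confusion set: characters that learners tend to mix up.
CONFUSION: Dict[str, List[str]] = {"渴": ["喝"], "喝": ["渴"]}


def weight(a: str, b: str) -> float:
    """Return the edge weight between two characters, or 0.0 if no edge exists."""
    return EDGES.get(frozenset((a, b)), 0.0)


def support(chars: List[str], i: int, candidate: str) -> float:
    """Sum the edge weights between `candidate` and every other character
    in the sentence, i.e. its connectivity inside the sentence subgraph."""
    return sum(weight(candidate, c) for j, c in enumerate(chars) if j != i)


def correct(sentence: str, margin: float = 0.2) -> str:
    """Replace a character with a confusable one only when the replacement
    is connected to the rest of the sentence more strongly by `margin`."""
    chars = list(sentence)
    for i, ch in enumerate(chars):
        best, best_score = ch, support(chars, i, ch) + margin
        for cand in CONFUSION.get(ch, []):
            score = support(chars, i, cand)
            if score > best_score:
                best, best_score = cand, score
        chars[i] = best
    return "".join(chars)


print(correct("我想渴茶"))  # prints 我想喝茶
print(correct("我想喝茶"))  # prints 我想喝茶 (already correct, left unchanged)
```

In this sketch the `margin` guards against replacing a correct character whose connectivity is only slightly lower than that of a confusable alternative. A real system would draw edge weights from the knowledge graph rather than a hand-written table.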