TITLE:
Next Words Prediction and Sentence Completion in Bangla Language Using GRU-Based RNN on N-Gram Language Model
AUTHORS:
Afranul Hoque, Busrat Jahan, Shaikat Chandra Paul, Zinat Ara Zabu, Rakhi Mondal, Papeya Akter
KEYWORDS:
Bangla Language, Words Prediction, Sentence Completion, GRU, RNN, Corpus, N-Gram
JOURNAL NAME:
Journal of Data Analysis and Information Processing, Vol.11 No.4, November 6, 2023
ABSTRACT: We use many devices in daily life to communicate with others.
People today use email, Facebook, Twitter, and many other social networking
sites to exchange information. They lose valuable time misspelling and retyping
words, and some are reluctant to type long sentences because they run into
unnecessary words or grammatical issues. Word prediction systems therefore help
people exchange textual information more quickly, easily, and comfortably.
These systems predict the most probable next words and let the user choose the
needed word from the suggestions. Word prediction can help the writer by
predicting the next word and helping complete the sentence correctly. This
research aims to predict the most suitable next word to complete a sentence for
any given context. We work on the Bangla language and present a process that
predicts the most probable and appropriate next words and suggests a complete
sentence using the predicted words. A GRU-based RNN is trained on an N-gram
dataset to develop the proposed model. We collected a large Bangla dataset from
multiple sources and compared the model with other approaches that have been
used, such as LSTM and Naive Bayes. The proposed approach provides better
accuracy than the others. The average accuracy is 88.22% for the Unigram model,
99.24% for the Bi-gram model, 97.69% for the Tri-gram model, and 99.43% and
99.78% for the 4-gram and 5-gram models, respectively. We think that our
proposed method can have a profound impact on Bangla search engines.
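
The abstract does not spell out how the N-gram dataset is constructed, so the following is only a minimal sketch of one common reading: each sentence is split into (context, next word) pairs, where the context is the previous n words. The function names (`make_ngram_pairs`, `build_counts`, `predict_next`), the toy corpus, and the whitespace tokenization are all assumptions made for illustration. A plain frequency count stands in for the GRU-based RNN here, so this shows the shape of the data and the suggestion step, not the authors' model.

```python
from collections import Counter, defaultdict


def make_ngram_pairs(sentence, n):
    """Split a sentence into (context, next_word) pairs.

    The context is a tuple of n consecutive words (assumed whitespace
    tokenization); next_word is the word that follows it.
    """
    tokens = sentence.split()
    return [
        (tuple(tokens[i:i + n]), tokens[i + n])
        for i in range(len(tokens) - n)
    ]


def build_counts(corpus, n):
    """Count how often each next word follows each n-word context."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for context, next_word in make_ngram_pairs(sentence, n):
            counts[context][next_word] += 1
    return counts


def predict_next(counts, context, k=3):
    """Return up to k candidate next words, most frequent first.

    Frequency ranking is a stand-in for the trained model's output;
    an unseen context yields an empty list.
    """
    followers = counts.get(tuple(context), Counter())
    return [word for word, _ in followers.most_common(k)]


# Hypothetical toy corpus, not taken from the paper's dataset.
corpus = [
    "আমি ভাত খাই",
    "আমি ভাত খাই",
    "আমি ভাত রান্না করি",
    "তুমি ভাত খাও",
]

counts = build_counts(corpus, n=2)
suggestions = predict_next(counts, ["আমি", "ভাত"])
print(suggestions)
```

In a neural setup, the same (context, next word) pairs would typically be mapped to integer IDs and fed to the recurrent model as inputs and targets; the suggestion list would then come from the model's highest-probability outputs rather than from raw counts.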