Expert System for the Diagnosis and Prognosis of Common Dental Diseases Using Bayes Network
1. INTRODUCTION
Expert systems apply human knowledge to solve problems that usually require human intelligence [1]. In the medical field, the aim of an expert system is to support the doctor’s diagnostic process. It considers the facts and symptoms associated with an ailment to provide a diagnosis; that is, an expert system uses knowledge of a disease and the facts or history of a patient stored in its database to propose a diagnosis [2]. Expert systems are defined as intelligent systems that emulate the decision-making ability of a human expert. They are designed to solve complex problems by reasoning through bodies of knowledge, represented mainly as if-then rules rather than through conventional procedural code [3]. The term expert system can be applied to any computer program that is able to draw conclusions and make decisions based on knowledge represented in its database.
Oral health plays a major role in human existence [4]. Oral diseases therefore have to be treated effectively and promptly to avert the acute pain and discomfort associated with them. These diseases are caused by several factors. Oral disease has become one of the most common diseases in the world and carries serious health and economic burdens, greatly reducing the quality of life of those affected [5].
Dental caries, tooth wear, traumatic injuries, developmental defects, aesthetic corrections and gum disease, among others, are conditions that cause tooth defects [6]. Beyond these conditions, several terms describe specific tooth defects: attrition (wear of the incisal or occlusal surfaces of teeth due to frictional contact between opposing teeth), abrasion (tooth surface loss due to force between the teeth and external objects), erosion (tooth loss due to chemical/mechanical action) and enamel hypoplasia (defective formation or calcification of enamel), among others. Even though the tooth is a small part of the human body, its importance is high: it supports mastication of food, maintenance of aesthetics, proper speech and protection of supporting tissues, all of which contribute to the overall wellbeing of a person.
The main focus of this research, however, is to develop an expert system capable of diagnosing the most common dental problems, namely: halitosis (bad breath), dental cavity (caries), gingivitis and periodontitis (gum disease), oral carcinoma (oral cancer), mouth sores, tooth erosion, dentinal sensitivity and dental pain, as well as tooth urgencies.
The rest of the paper is organized as follows: Section 2 reviews related work. Section 3 presents the methodology employed in this research. Section 4 explains the experimental setup, Section 5 discusses the results, and Section 6 concludes the paper and identifies areas of future work.
2. RELATED WORK
Ali et al. [7] proposed a new fuzzy clustering algorithm based on neutrosophic orthogonal matrices for the segmentation of dental X-ray images. The algorithm transforms image data into a neutrosophic set and computes the inner products of the cutting matrix of the input. Pixels are then segmented by the orthogonal principle to form clusters. Experimental validation on real dental X-ray image datasets from Hanoi Medical University Hospital, Vietnam showed that the proposed method outperformed the relevant fuzzy clustering schemes in terms of clustering quality and achieved better validity index values. Future research is to improve the method using the idea of a Boole matrix and to reduce the computational time with a parallel strategy.
Amer and Aqel [8] presented a method to extract wisdom teeth automatically from panoramic images. It consists of three stages: pre-processing, extraction and post-processing. The results showed that the method could successfully extract the wisdom teeth. The segmented images can then be used to classify the extracted teeth according to a specific problem. Future work is associated with implementation of the algorithm.
Oladele and Yetunde [9] developed an expert system with the intention of solving real-life problems. The system is a desktop-based medical expert system for the diagnosis and prediction of dental diseases. It is an open-loop system operated by a dentist, who selects the symptoms associated with the patient’s condition; the cause, prevention and diagnosis are generated after the symbolic rules are processed. The system was developed using a coactive neuro-fuzzy model, but it is limited by the small number of symptoms used for diagnosis.
Khosravi et al. [10] presented an expert system for the diagnosis of oral cancer and the suggestion of a treatment plan. The system receives input from the user, analyses it and reforms it. It is able to diagnose oral cancer and generate an appropriate treatment. However, the system lacks a clinical review to ascertain the correctness of its results: it acts only on the user’s answers and cannot check whether those answers are correct.
A decision support and training system for the management of endodontically treated teeth already exists [11]. One important attribute of the system is that it trains users to think holistically, like an expert, while solving a problem and planning treatment. It is a functional prototype of a clinical decision support system for the restoration of endodontically treated teeth. The tool can be incorporated into a class curriculum to supplement traditional teaching methods, which can be helpful to students as well as less experienced clinicians. Given its scalable design, the system can be developed further to support other challenging sub-areas within restorative dentistry as well as areas within other dental disciplines. However, it is limited by the small number of symptoms used for diagnosis.
Chattopadhyay et al. [12] designed a methodology for dental decision making based exclusively on toothaches. The focus of that work is to mathematically identify dental diseases (D) from a set of pain parameters (P) using Bayesian probabilistic modeling. A hill-climbing search algorithm was used to train the classifier and compute the conditional probability table (CPT) entries. However, more diseases and symptoms need to be added to this work in the future.
3. METHODOLOGY
For the purpose of this research, interviews and observations were carried out to obtain knowledge from dental doctors and to learn how they reason with that knowledge. The knowledge acquired was stored in a knowledge base and translated into a computer-usable language, together with an inference engine (a reasoning structure) that uses the knowledge appropriately. The inference engine manipulates the dental knowledge acquired from the dental expert to derive new knowledge. This manipulation of the stored knowledge is likened to the reasoning of the human dental expert, termed diagnosis.
The steps involved in data set acquisition were:
· Choosing what knowledge is needed;
· Obtaining the knowledge from the human dental expert;
· Analyzing the obtained knowledge;
· Storing the obtained knowledge in a knowledge base.
3.1. Dataset
The dataset consists of the constituents of a diagnosis: the dental diseases (Table 1) and the symptoms (Table 2) from which a possible diagnosis is determined. The result of a diagnosis is the presence or absence of a disease.
The inference engine applies the logical rules in the knowledge base to deduce new information for diagnosis. The knowledge base is made up of:
· Rules;
· Mathematical models;
· Symptom/Disease Descriptions.
The rule base contains the IF-THEN constructs accompanied by the vital signs associated with a particular ailment. The mathematical model base contains Bayes’ rule alongside descriptions of the parameters that the system uses to obtain the probability of a disease given the symptoms. The symptom/disease description base contains the analysis of all the parameters that the inference engine uses to arrive at a conclusion. The system was implemented in the Visual Basic .NET (VB.NET) programming language. Microsoft SQL Server (MS SQL) was used to create and manage the knowledge base, while Crystal Reports was used to generate and print reports.
3.2. System Algorithm
Step 1. Start;
Step 2. Input: Diseases, Symptoms;
Step 3. Select symptoms;
Step 4. Get related diseases;
Step 5. Insert parameters into Bayes’ formula, $P(D \mid S) = \frac{P(S \mid D)\,P(D)}{P(S)}$, to get the posterior value for each related disease and selected symptom;
Step 6. Sum up the posterior values of each disease;
Step 7. Compare the posterior values of all related diseases;
Step 8. Choose the disease with the maximum posterior value;
Step 9. Compare the disease with the maximum posterior value against the disease with the maximum number of symptoms;
Step 10. If the disease with the maximum posterior value = the disease with the maximum number of symptoms, then set Specific disease = that disease;
Else set Specific disease = the disease with the maximum number of symptoms;
End.
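The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the VB.NET implementation: the function name `diagnose`, the dictionary layout, the symptom names, the likelihood values P(S|D) and the symptom probabilities P(S) are all hypothetical. Only the three prevalence values are taken from the results section.

```python
from collections import defaultdict


def diagnose(selected, prevalence, likelihood, symptom_prob):
    """Sketch of Steps 3-10; returns (specific, posterior sums, symptom counts)."""
    posterior_sum = defaultdict(float)  # Step 6: summed posterior per disease
    symptom_count = defaultdict(int)    # matching symptoms per disease

    for disease, symptoms in likelihood.items():
        for s in selected:
            if s in symptoms:  # Step 4: the disease is related to this symptom
                # Step 5: P(D|S) = P(S|D) * P(D) / P(S)
                posterior_sum[disease] += (
                    symptoms[s] * prevalence[disease] / symptom_prob[s]
                )
                symptom_count[disease] += 1

    if not posterior_sum:
        return None, {}, {}  # no disease is related to the selected symptoms

    # Steps 7-8: disease with the largest summed posterior
    # (ties go to the first disease listed, an arbitrary choice of this sketch).
    by_posterior = max(posterior_sum, key=posterior_sum.get)
    # Step 9: disease matching the largest number of selected symptoms.
    by_count = max(symptom_count, key=symptom_count.get)
    # Step 10: if both agree keep that disease, otherwise the symptom count wins.
    specific = by_posterior if by_posterior == by_count else by_count
    return specific, dict(posterior_sum), dict(symptom_count)


# Prevalence values as quoted in the results section.
prevalence = {"gingivitis": 0.754, "periodontitis": 0.154, "dental caries": 0.355}
# Hypothetical P(S|D) values, invented for illustration only.
likelihood = {
    "gingivitis": {"bleeding gums": 0.9, "bad breath": 0.5},
    "periodontitis": {"bleeding gums": 0.6, "loose teeth": 0.8, "bad breath": 0.6},
    "dental caries": {"toothache": 0.9, "bad breath": 0.3},
}
# Hypothetical P(S) values, invented for illustration only.
symptom_prob = {"bleeding gums": 0.8, "bad breath": 0.5,
                "loose teeth": 0.2, "toothache": 0.4}

specific, sums, counts = diagnose(
    ["loose teeth", "bleeding gums", "bad breath"],
    prevalence, likelihood, symptom_prob,
)
print(specific, sums, counts)
```

With these made-up inputs the high prevalence of gingivitis gives it the largest summed posterior, while periodontitis matches more of the selected symptoms, so Step 10 selects periodontitis. Note that, as the steps are written, both branches of Step 10 return the disease with the most matching symptoms; the posterior comparison only records whether the two criteria agree.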
4. EXPERIMENTAL SETUP
Bayes’ rule was used in this experiment for accurate diagnosis. It is a conditional probability rule: it describes the probability of an event occurring given that some other associated event has already occurred. For two events A and B, Bayes’ theorem relates a conditional probability to its reverse form, and is written thus:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \tag{1}$$
An important part of Bayesian inference is the establishment of parameters and models. Models are the mathematical formulations of observed events. Parameters are the factors in the models that affect the observed data. Applying Bayes’ theorem, the posterior probability of a disease given one or more symptoms is computed thus:
$$P(D \mid S) = \frac{P(S \mid D)\,P(D)}{P(S)} \tag{2}$$
· where P is the probability of occurrence;
· Symptom (S) represents all the vital signs of a patient;
· Disease (D) represents an abnormal health state;
· P(D|S) is the conditional probability that disease D exists given the symptoms S. It is also called the posterior probability because it depends on the specified value of the symptoms;
· P(S|D) is the reverse conditional of P(D|S); in other words, it is the likelihood of the symptom S given the disease D;
· P(D) is the prior or marginal probability of D. It is prior in the sense that it does not take into account any information about S;
· P(S) is the prior or marginal probability of S and acts as a normalizing constant.
With the implementation of Bayes’ theorem, the system gathers all available information from the patient, computes the posterior probability of each disease and chooses the disease with the highest probability.
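As a worked illustration of Equation (2), take the gingivitis prevalence reported in the results, P(D) = 0.754, and assume, purely for illustration, a likelihood P(S|D) = 0.8 and a symptom probability P(S) = 0.7. Neither of these two assumed values comes from the study.

$$P(D \mid S) = \frac{P(S \mid D)\,P(D)}{P(S)} = \frac{0.8 \times 0.754}{0.7} = \frac{0.6032}{0.7} \approx 0.862$$

Under these assumed values, observing the symptom raises the probability of the disease from the prior 0.754 to a posterior of about 0.862.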
4.1. System Input
The inputs were diseases and symptoms. A rectangular probability matrix was created from the disease prevalence values and symptom scores, connecting each disease to its respective symptoms. The values are converted to percentages to show the severity of each symptom in a disease.
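A disease-by-symptom matrix of this kind might be laid out as in the sketch below. The helper names `build_matrix` and `to_percent`, the symptom names and the score values are hypothetical, and the sketch covers only the symptom scores, since the way prevalence enters the matrix is not detailed here.

```python
def build_matrix(diseases, symptoms, scores):
    """Rows are diseases, columns are symptoms; unlinked pairs get 0.0."""
    return [[scores.get((d, s), 0.0) for s in symptoms] for d in diseases]


def to_percent(matrix):
    """Convert decimal scores to percentages, rounded to one decimal place."""
    return [[round(value * 100, 1) for value in row] for row in matrix]


diseases = ["gingivitis", "periodontitis"]
symptoms = ["bleeding gums", "loose teeth", "bad breath"]
# Hypothetical symptom scores keyed by (disease, symptom); invented values.
scores = {
    ("gingivitis", "bleeding gums"): 0.9,
    ("gingivitis", "bad breath"): 0.5,
    ("periodontitis", "bleeding gums"): 0.6,
    ("periodontitis", "loose teeth"): 0.8,
    ("periodontitis", "bad breath"): 0.6,
}

matrix = build_matrix(diseases, symptoms, scores)
percent_matrix = to_percent(matrix)
for disease, row in zip(diseases, percent_matrix):
    print(disease, row)
```

Keeping a zero for every unlinked disease and symptom pair keeps the matrix rectangular, so each row can be read against the same list of symptoms.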
4.2. Output Generation
The output (related diseases/specific disease) is obtained by applying Bayes’ rule ($F$) to the input data (symptoms/signs):
$$D = F(S)$$
where:
D = disease;
S = symptoms;
F = Bayes’ rule.
5. RESULTS
Disease Prevalence Values
Table 3 is the disease prevalence table for a sample population of 150. The sample consists of adults, irrespective of age. The table contains the disease prevalence values converted from percentages to decimal values for mathematical application. For example, Diseases 1, 2 and 3 in the table represent gingivitis, periodontitis and dental caries respectively. Their prevalence in the sample population is 75.4%, 15.4% and 35.5% respectively.
The results of the proposed system are tabulated in Table 4. The table comprises the selected symptoms, the diseases related to the selected symptoms, the specific disease resulting from the diagnosis, the disease prevalence value and the Bayes posterior value. The existing system was evaluated on the accuracy of the classifier, that is, on the test set tuples that were correctly classified by the system. The tuples were termed positive tuples (presence of disease) and negative tuples (absence of disease). True positives are the positive tuples that are correctly labeled by the classifier, while true negatives are the negative tuples that are correctly labeled. False positives are negative tuples that are incorrectly labeled (the disease is absent but the classifier labels it as present), while false negatives are positive tuples that are incorrectly labeled (the disease is present but the classifier predicts it as absent). The existing system used 14 pain parameters, of which only 6 were significant and the rest were redundant. The proposed system was evaluated in the same way as the existing system. In the case of the proposed system, the parameters used correctly diagnosed the relevant disease, as validated by experts. Figure 1 and Figure 2 show the records and diagnosis menus of the proposed system.
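The evaluation terms defined above can be made concrete with a short sketch. The function names and the five test tuples are hypothetical and are not results from this study; accuracy is taken here as the fraction of tuples that are correctly classified.

```python
def confusion_counts(actual, predicted):
    """Count TP, TN, FP, FN; True means the disease is present."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if a and p:
            tp += 1  # disease present, labeled present
        elif not a and not p:
            tn += 1  # disease absent, labeled absent
        elif not a and p:
            fp += 1  # disease absent, labeled present
        else:
            fn += 1  # disease present, labeled absent
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of tuples that were correctly classified."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


# Hypothetical test tuples, invented for illustration.
actual = [True, True, False, False, True]
predicted = [True, False, False, True, True]

counts = confusion_counts(actual, predicted)
print(counts, accuracy(*counts))
```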
Table 3. Disease prevalence values.
Table 4. Results of improved system.
6. CONCLUSION AND FUTURE WORK
A dental disease diagnostic tool was developed based on Bayes’ rule. Bayes’ rule was able to generate the probability of a given dental disease from a set of symptom parameters; however, it has its shortcomings. The parameters used were disease prevalence (the number of people with the disease in the given sample population) and disease symptom scores (the numeric value of a symptom, which determines its effect on the disease). To achieve a high rate of accuracy, Bayes’ theorem was augmented with an additional criterion, the “Number of Symptoms (NS)”. NS closed the gap left by Bayes’ theorem in this research. The shortcomings of Bayes’ theorem resulted from the prevalence values used to calculate the posterior value (the probability of a disease). Because prevalence values differ between locations, they affected the posterior results; NS was therefore inserted into the system to give a more accurate result. In the future, the prototype implementation will be tested for computational complexity and time consumption.