https://doi.org/10.31449/inf.v46i1.3269 Informatica 46 (2022) 107–119 107 An Illustration of Rheumatoid Arthritis Disease Using Decision Tree Algorithm Uma Ramasamy and Santhoshkumar Sundar E-mail: seen.uma25@gmail.com, santhoshkumars@alagappauniversity.ac.in Alagappa University, Karaikudi, Tamil Nadu, India Student paper Keywords: rheumatoid arthritis, decision tree, entropy, information gain, gain ratio Received: November 13, 2020 The Data Mining domain integrates several partitions of the computer science and analytics field. Data mining focuses on mined data from a repository of the dataset to identify patterns, discover knowledge, additionally to predict probable outcomes. Decision tree belongs to classification techniques is a well- known method appropriate for medical diagnosis. Iterative Dichotomiser 3 (ID3) is the general significant algorithm to construct a decision tree. C4.5 is the successor of ID3 that handles dataset contains different numerical attributes. Although many studies have described and compared different decision tree algorithms, some studies have confined paper with analysis and comparison of the decision tree algorithm without the output of the decision tree. One of the inflammatory diseases is Rheumatoid Arthritis (RA) caused by specific autoantibodies with the destruction of synovial joint autoantibodies. Medical dataset applied to construct a decision tree as output has become seldom study. This study elucidates to explore the medical dataset with the decision tree approach and exhibit the derived decision tree output from the RA dataset. The objective of this paper is to construct a decision tree and display the prominent features that predict RA from the RA dataset using the decision tree algorithm. Povzetek: Za predstavitev bolezni revmatoidnega artritisa so uporabili metodo za gradnjo odločitvenih dreves. 1 Introduction Rheumatoid Arthritis (RA) is a rheumatic disease. The word 'Rheumatoid' implies 'rheumatism' relates to a musculoskeletal illness, 'arthr' means 'to joints,' and 'itis' denotes 'inflammation.' It is an inflammatory disorder that mostly impairs the joints, as well as other organs like the skin and lungs. Well-defined and reliable estimation of RA symptoms circumvents durable destruction to the patient’s joints and bones if treated earlier, or else it affects the patient’s quality of life. The research gap has found in the field of Rheumatoid Arthritis using data mining [1, 2]. A dataset is an indispensable component in the discussion of the classification algorithm. The dataset features or attributes are qualitative (nominal) and quantitative (numeric). Many researchers have applied various datasets [3-6] on different classification algorithms and have processed different results based on it. The dataset was utilized as a training set. From the training set decision tree is built. 'Playing tennis’ is often used dataset in the decision tree illustration [7-10]. Preferably the next used dataset in the decision tree example is the student performance [11]. Similarly, dataset like 'a dog represents a risk for citizens [7, 12],' 'reservoir inflow forecasting [13],' 'PEP (Portfolio Evaluation Plan) [14],' 'rainfall forecasting [15],’ and 'college scholarship evaluation [16],' are some illustrations in the classification algorithm that rarely handled by many research authors. A few authors only have examined and published medical datasets for the decision tree illustration. The medical dataset created for this study is named the 'RA dataset.' The RA dataset was obtained from a new approach of the 2010 ACR/EULAR (American College of Rheumatology / European League Against Rheumatism) classification criteria of RA, which was formed, by two active groups of the ACR and the EULAR [17]. It contains qualitative attributes in a binary category (yes / no). This dataset aims to diagnose whether the patients have Rheumatoid Arthritis or do not have Rheumatoid Arthritis. Most RA patients experience abhor pain on the joints of the hands, legs, hip, spine, and shoulder. It would be beneficial for medical practitioners to predict the prominent features responsible for RA disease. The feasible attributes to identify RA patients are displayed in Figure 3. Among these feasible attributes, the optimal attributes for the RA patient are predicted in this study using the RA dataset. Information gain was determined to find the dominant attributes from the dataset to build the decision tree for the iterative dichotomiser 3 (ID3) algorithm. C4.5 is another algorithm to construct the decision tree by calculating the gain ratio. Decision tree algorithms such as ID3 and C4.5 (modified version of ID3) are popular and efficiently used classifiers for RA prediction from a RA dataset. Only a few authors practiced the decision tree illustration with 108 Informatica 46 (2022) 107–119 U. Ramasamy et al. medical datasets [18,19]. Although many authors have described and compared the decision tree algorithms, some confined their papers without the relevant decision tree result. 2 Related study Data mining is the method to classify models from massive databases, that being broad, applied to learn and analyze, and obtain information [20-23]. The decision tree algorithm falls under the type of supervised learning. It is the most familiar data mining technique used frequently to build the classification model. They are used to solve both regression and classification problems. All classification model, function with the classifier, which is a supervised learner that automatically perform the learning process for the training dataset, to predict its target attribute. Data mining techniques are widely used for classification and prediction of the healthcare domain so that it can be an aid for the doctors to identify complex diseases precisely and design a more reliable Decision Support System (DSS) [24]. The Electronic Health Record (EHR) of RA patients were studied for early prediction and diagnosis of the RA disease. Moreover, the comparative study made on several machine learning algorithms identifies which algorithm suites well for the prediction of RA disease [25, 26]. rheumatoid factor (RF), anti-cyclic citrullinated peptide (Anti-CCP), swollen joint count (SJC), and erythrocyte sedimentation rate (ESR) are four essential judging factors for rheumatoid arthritis [27]. Once a patient is diagnosed with RA, the probability of getting heart failure is higher compared to the non-RA patient [28-31]. Medical research and biological research are the ever-growing fields where many biological data are collected, classified, estimated, predicted, associated, clustered, and finally visualized through reports and patterns using data mining techniques [32, 33]. The application of data mining is always in the progress of continuous development. The ID3 algorithm has some issues to handle multi-valued attributes and requires a high amount of computational complexity. A novel approach has been introduced to split attributes in the ID3 algorithm [34]. In the field of bioinformatics, data mining has some challenges like sequencing technologies and data analysis skills. Under analysis estimation instruments, a review of data mining methods performs with the combination of examination tools suitable in research tasks. The literature review finalized the merits and demerits of data mining in bioinformatics [35]. After simulation analysis, ID3 decision tree classification accuracy was higher 6-7 percentage compared to other classifiers. The author proposed an optimized ID3 algorithm that constructs a tree with a minimum node so that it can improve the efficiency and reduce the error rate [36]. Using the Gaussian mixture model, the analysis done using different clinical and laboratory data displayed results with various distributions. The patient global assessment (PGA) and health assessment questionnaire (HAQ) collected after three months of RA diagnosis, SJC, and tender joint count (TJC) considered being the functional attribute for RA diagnosis [37]. Regarding Arthritis disease, women are affected at a higher rate when compared to men [38]. The RA prediction and the RA diagnosis development done by the machine learning approach, it is mandatory to diagnose the essential features for RA prediction [39, 40]. The earlier study practiced the decision tree computation technique to investigate the selection of the second-line drug DMARDs (Disease Modifying Antirheumatic Drug) by rheumatologists which depend on the factor of disease rigor to treat RA patients after the failure of Methotrexate [41]. A few years back the immune suppression effects of DMARDs are systematic and lead to various side effects. Medical experts improved autoimmune response produced from RA by customizing a good care plan and predicting the prognosis of the disease [42]. A recent study was made to support clinical RA treatments using the decision support system to predict a model that can support medical people to give suitable decisions in the early stage of RA disease [43]. The specific proteomic biomarkers have identified for RA diagnosis using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) combined with weak cationic exchange (WCX) magnetic beads. The classification tree model has been considered an innovative diagnostic tool for RA [44]. The combination of proteomic fingerprint technology and magnetic beads obtained efficient biomarkers and discovered the diagnostic patterns for RA. The biomarker C-C motif chemokine 24 (CCL24) has considered as a significant diagnostic indicator for RA [45]. The author states anti-citrullinated protein antibodies (ACPAs) are specific for RA and, RF was observed in health and elder people with other autoimmune diseases, which indicate immune response for RA development. The shared epitope alleles dwell in the major histocompatibility complex (MHC) class II region involved in a genetic risk factor for RA development. ACPA is the spectrum of autoantibodies that aims for posttranslational modification (PTM) [46]. The authors declare that in the future machine learning (ML) will support rheumatologists to analyze and predict the development of the disease and discover significant disease agents. Furthermore, the authors affirm ML will perform treatment propositions and evaluate their predicted outcome. The shared decision-making combines the patient's viewpoint, rheumatologist's suggestion, and also machine-learned evidence in the future [47]. The general methodologies applied to examine the intensity of RA are the clinical, laboratory, and physical examinations. The authors proposed a hybrid optimization strategy called rheumatoid arthritis disease using weighted decision tree approach (REACT), which combines the features of ID3 and particle swarm optimization (PSO) for feature selection and classification of RA to improve the efficiency and reliability of RA diagnosis [48]. It is necessary to develop therapies for RA patient's treatment at each stage of the disease progress using pathological mechanisms that urge the deterioration of RA progress in individuals. Several modern pharmacologic An Illustration of Rheumatoid Arthritis Disease Using... Informatica 46 (2022) 107–119 109 therapies play a vital role in disease relief without joint deformity. The RA pathogenesis, disease-modifying drugs, and views on next-generation therapeutics for RA have been discussed in this review [49]. Though joint connection, serology, levels of acute-phase reactants, and the duration of the symptoms are marked to be the primary diagnosis classification criteria for RA, yet the diagnosis requires well trained specialists who can discern early symptoms of RA from additional pathology [50]. The paper [51] developed a model for the flare prediction on the RA patients, with reduced intake of biological disease-modifying anti-rheumatic drugs (bDMARDs) in sustained remission. This proposed model used nested cross-validation and optimal hyper- parameters for a suitable model selection approach with machine learning algorithms like Logistic Regression, k- Nearest Neighbors, Naïve Bayes and Random Forest. A dose reduction, feature was selected to be the predominant flare predictor attribute. A new method [52] focused to promotes the treatment selection in RA patients using GUIDE (Generalized, Unbiased, Interaction Detection and Estimation) decision tree, which matches with predefined rules to predict treatment response to sarilumab and adalimumab. The result classified the presence of Anti-CCP and C-reactive protein (CRP) with a threshold greater than 12.3mg/l exposed as a biomarker pattern to predict response to sarilumab. Since RA diagnosis is prominently challenging because of reliable biomarkers, the authors [53] identified nine hub genes namely CFL1 (Cofilin 1), COTL1 (Coactosin Like F-Actin Binding Protein 1), ACTG1 (Actin Gamma 1), PFN1 (Profilin-1), LCP1 (Lymphocyte Cytosolic Protein 1), LCK (lymphocyte- specific protein tyrosine kinase), HLA-E(Major Histocompatibility Complex, Class E), FYN (Proto- oncogene tyrosine-protein kinase), and HLA-DRA (Human Leukocyte Antigen – DR isotype) biomarkers that probably distinguished the RA samples out of 52 differentially expressed genes (DEGs) from 112 RA patients. Further, Machine Learning models namely logistic regression and random forest were applied based on the identified genes. This paper [54] presents a review that summarized the healing treatment for RA. The objective was to highlight, polypeptides, small intermediate or end products of metabolism, and epigenetics regulators as the new targets for healing RA. And prominent molecular targets for medication design were identified, which lessen the early RA and determine nonresponses followed by the partial responses and severe effects for modern DMARDs. Algorithm Pipeline Development and Validation Study were conducted on this paper [55] using EHR to identify patients with RA. Patients' records who had their first visits were suggested as input from EHRs, and Natural Language Processing (NLP) text processing was applied from randomly selected EHRs. Moreover, Six Machine Learning Methods were utilized in the training and 10- fold cross-validation dataset to identify patients with rheumatoid arthritis from format-free text fields of EHRs. In this paper [56] dataset taken from The Korean College of Rheumatology Biology (KOBIO) Registry, nearly 1204 RA patients were treated with biologic disease-modifying anti-rheumatic drugs (bDMARDs). To predict remission machine learning techniques included Lasso, Ridge, SVM, Random Forest, and Xgboost and explainable artificial intelligence (XAI) were used to identify the essential clinical features correlated with remission. The accuracy and area under the receiver operating characteristic (AUROC) curve were analysed for prediction. Treatment guideline for RA patients is given in this paper [57], many references and research work associated with vaccination were collected from precise literature reviews formed by ACR guidelines to deal with RA. These studies recommend services to assist the clinician and patient decision-making and relieve them from RA disease anxiety. In this study, let us analyze the RA dataset using the decision tree model and predict the efficient features that diagnose the disease. 3 About decision tree A tree structure classifier is the decision tree with a decision node or internal node, a branch, and a leaf node. The test of the attribute has denoted by each internal node.Each leaf node predicts the target classification. Each branch corresponds to the attribute value. To classify training dataset using the decision tree, begin from the root node, follow the suitable decision branches corresponding to the attribute values, and finally reach a leaf-node predicted with the target class. The conjunction of attribute tests corresponds to each path from the root to the leaf. Further, as a whole, the disjunction of these conjunctions represents the tree [58]. The dominant attribute is the best attribute classifier from the training set. The internal node represents the dominant attribute that supports to build the decision tree. The dominant attribute is the attribute with the highest information gain and gain ratio, which is discussed in sections 4.2 and 4.3. 3.1 Algorithm 3.1.1 ID3 A set of training examples are processed to learn and construct the decision tree. Furthermore, with the learned classifier, the decision tree classifies the new training examples. The algorithm technique employed is from the basic top-down greedy approach. The fundamental algorithm to build the decision tree is the ID3 algorithm developed by Quinlan in 1973 based on the Concept Learning System (CLS) algorithm. ID3 finds the dominant attribute that classifies the training examples by applying a greedy search and never backtrack [58], [59] (p.55). 3.1.2 C4.5 ID3 cannot handle practical issues such as attributes with missing values in the training dataset and attributes with continuous values. Additional problems to handle are a small sample of data leads to overfitting, to select an 110 Informatica 46 (2022) 107–119 U. Ramasamy et al. attribute for the decision node, one feature tested at the moment is time-consuming, and it is sensitive with a greater number of attribute values. Practical issues in ID3 overcome by the C4.5 algorithm, stated by Ross Quinlan, create the decision tree. C4.5 is a continuation of Quinlan’s earlier ID3 algorithm [59] (p.55). 4 Metrics of ID3 and C4.5 Decision tree metrics are a set of measurement support to draw a decision tree with some parameters quantitative assessment derived from the dataset. 4.1 Entropy Figure 1: Entropy function relative to binary classification. S is the sample of training examples (size =10). In the S dataset, positive proportion examples denoted as 'p,' and negative proportion examples denoted as 'n.' Entropy(S) is zero, if the proportion of positive examples (10+, 0-) is the same as the size of the training examples, similarly if the proportion of negative examples (0, 10-) is the same as the size of the training examples. Suppose, positive and negative examples are of equal size (5+, 5-), the impurity in the dataset S is maximum, i.e., entropy is one as shown in Figure 1. Therefore, it is distinct that the impurity of dataset S is measured by entropy. Entropy(S) is the expected number of bits needed to encode class (true or false, + or -, yes or no, low or medium or high) of randomly drawn members of S. A novel way to assign −log 2 𝑝 bits to messages having probability ‘p’ introduced in the Information Theory concept of optimal length code [58]. So the expected numbers of bits to encode (yes or no, true or false, + or -) a random member of S is−𝑝 log 2 𝑝 − 𝑛 log 2 𝑛 , where positive examples proportion denoted as 'p,' and negative examples proportion denoted as 'n.' Entropy characterizes the impurity of a collection and measures the information content from the sample of training examples. If the number of unique target feature values assigned as m, then the entropy of S w.r.t n-wise classification is equated as 𝐸𝑛𝑡𝑟𝑜𝑝𝑦 (𝑆 ) = − ∑ 𝑝 𝑖 𝑛 𝑖 =1 log 2 𝑝 𝑖 (1) Where, 𝑝 𝑖 − 𝑝𝑟𝑜𝑝𝑜𝑟𝑡𝑖𝑜𝑛 𝑜𝑓 𝑆 𝑏𝑒𝑙𝑜𝑛𝑔𝑖𝑛𝑔 𝑡𝑜 𝑐𝑙𝑎𝑠𝑠 𝑐 𝑖 4.2 Information Gain Let S be the sample of the training examples with A 1, A 2, ... , A n are the non-target attributes. All the features in the dataset calculated using the information gain formula as shown in Equation 2. Attribute with the highest information gain is the best classifier because the expected reduction is laid out by the information gain in entropy formed by partitioning the records of the dataset using the attribute. How effectively an attribute classifies the training examples according to their target classification has been defined in the information gain measure [59] (p.57-58). WA(A) defines the weighted sum of the information content of each subset of the examples partitioned by the possible values of the attribute. It measures the total disorder or in-homogeneity of the leaf nodes. The minimum WA (A) or maximum information gain(S, A) shows attribute A as the best attribute at a node [58]. The best attribute to select in growing the tree using each step of the ID3 algorithm, a precise measure is the information gain. The calculation of information gain is briefly described in Section 7. 𝐺𝑎𝑖𝑛 (𝑆 , 𝐴 ) = 𝐸𝑛𝑡𝑟𝑜𝑝𝑦 (𝑆 ) − 𝑊𝐴 (𝐴 ) = 𝐸𝑛𝑡𝑟𝑜𝑝𝑦 (𝑆 ) − ∑ ( 𝑆 𝑣 𝑆 ) 𝑣 ∈𝑉𝑎𝑙𝑢𝑒𝑠 (𝐴 ) 𝐸𝑛𝑡𝑟𝑜𝑝𝑦 (𝑆 𝑣 ) (2) Where, 𝑆 𝑣 − 𝑠𝑢𝑏𝑠𝑒𝑡 𝑓𝑟𝑜𝑚 𝑆 𝑓𝑜𝑟 𝑤 ℎ𝑖𝑐 ℎ 𝑎𝑡𝑡𝑟𝑖𝑏𝑢𝑡𝑒 𝐴 ℎ𝑎𝑠 𝑣𝑎𝑙𝑢𝑒 𝑣 𝑉𝑎𝑙𝑢𝑒 (𝐴 ) − 𝑠𝑒𝑡 𝑜𝑓 𝑎𝑙𝑙 𝑝𝑜𝑠𝑠𝑖𝑏𝑙𝑒 𝑣𝑎𝑙𝑢𝑒𝑠 𝑓𝑜𝑟 𝑎𝑡𝑡𝑟𝑖𝑏𝑢𝑡𝑒 𝐴 4.3 Gain ratio The gain ratio is a ratio between information gain and the split information. Rather than considering the entropy(S) on the target attribute, entropy(S) is concerned about all possible values of the attribute A defined to be the split information [59] (p.73-74). Information Gain Ratio is the fundamental information from the required decrease in entropy. The purpose of Quinlan to introduce this was to overcome bias on multi-valued features by considering the count of branches when choosing an attribute [60-62]. Section 7 discussed to implement gain ratio with an example. 𝐺𝑅 (𝑆 , 𝐴 ) = 𝐼𝐺 (𝑆 ,𝐴 ) 𝐼𝑉 (𝑆 ,𝐴 ) (3) 𝐼𝑉 (𝑆 , 𝐴 ) = ∑ |𝑆 𝑖 | |𝑆 | 𝑐 𝑖 =1 log 2 |𝑆 𝑖 | |𝑆 | (4) Where, 𝐺𝑅 (𝑆 , 𝐴 ) − 𝐼𝑛𝑓𝑜𝑟𝑚𝑎𝑡𝑖𝑜𝑛 𝐺𝑎𝑖𝑛 𝑅𝑎𝑡𝑖𝑜 𝑎𝑓𝑡𝑒𝑟 𝑠𝑝𝑙𝑖𝑡𝑡𝑖𝑛𝑔 𝑠𝑒𝑡 𝑆 𝑜𝑛 𝑎𝑡𝑡𝑟𝑖𝑏𝑢𝑡𝑒 𝐴 𝐼𝐺 (𝑆 , 𝐴 ) − 𝐼𝑛𝑓𝑜𝑟𝑚𝑎𝑡𝑖𝑜𝑛 𝐺𝑎𝑖𝑛 𝑎𝑓𝑡𝑒𝑟 𝑠𝑝𝑙𝑖𝑡𝑡𝑖𝑛𝑔 𝑠𝑒𝑡 𝑆 𝑜𝑛 𝑎𝑡𝑡𝑟𝑖𝑏𝑢𝑡𝑒 𝐴 𝐼𝑉 (𝑆 , 𝐴 ) − 𝐼𝑛𝑡𝑟𝑖𝑛𝑠𝑖𝑐 𝑉𝑎𝑙𝑢𝑒 𝑜𝑟 𝑆𝑝𝑙𝑖𝑡 𝐼𝑛𝑓𝑜𝑟𝑚𝑎𝑡𝑖𝑜𝑛 𝑣𝑎𝑙𝑢𝑒 𝑎𝑓𝑡𝑒𝑟 𝑠𝑝𝑙𝑖𝑡𝑡𝑖𝑛𝑔 𝑠 𝑒𝑡 𝑆 𝑜𝑛 𝑎𝑡𝑡𝑟𝑖𝑏𝑢𝑡𝑒 𝐴 , 𝑤 ℎ𝑒𝑟𝑒 𝑆 𝑖 𝑡 ℎ𝑟𝑜𝑢𝑔 ℎ 𝑆 𝑐 𝑎𝑟𝑒 𝑡 ℎ𝑒 𝑐 𝑠𝑢𝑏𝑠𝑒𝑡𝑠 𝑜𝑓 𝑒𝑥𝑎𝑚𝑝𝑙𝑒𝑠 𝑟𝑒𝑠𝑢𝑙𝑡𝑖𝑛𝑔 𝑓𝑟𝑜𝑚 𝑝𝑎𝑟𝑡𝑖𝑡𝑖𝑜𝑛𝑖𝑛𝑔 𝑆 𝑏𝑦 𝑡 ℎ𝑒 𝑐 − 𝑣𝑎𝑙𝑢𝑒𝑑 𝑎𝑡𝑡𝑟𝑖𝑏𝑢𝑡𝑒 𝐴 5 Work flow model for proposed illustration The proposed illustration workflow model consists of a tree algorithm for RA [17], which is further converted to An Illustration of Rheumatoid Arthritis Disease Using... Informatica 46 (2022) 107–119 111 a relational database as shown in Table 1. The resultant RA dataset is applied to computational techniques such as ID3 and C4.5 decision tree classifier to obtain decision tree and classification rule. The RA dataset contains all feasible features necessary to identify RA patients, whereas the final result of the decision tree predicts only the optimal features mandatory to predict RA patients. 6 About dataset As mentioned in the workflow model (Figure 2), the conversion of RA Tree Structure (Figure 3) to RA dataset (Table 1) is done by following each path from the root node to the leaf node. The shape of the root node and the intermediate node is a rectangle, whereas the leaf node is in a circle (Figure 3). Each path represents each row in the RA dataset. There are 60 paths (in Figure 3), so the RA dataset consists of 60 rows. The root node in Figure 3 is '>10 joints (at least one small joint)', and the leaf nodes in the Figure 3 are 'RA' and 'crossed RA' (not RA). The root node and the intermediate node indicate the features/attributes, and the leaf node implies the class label of the RA dataset. The aim of this dataset is to Classifying patients by diagnosis of Rheumatoid Arthritis or not Rheumatoid Arthritis. The source of our dataset is from the tree flowchart for classifying distinct Rheumatoid Arthritis (RA) given in the 2010 RA classification criteria. Two active groups of the ACR and the EULAR join together to form a new approach for the 2010 ACR/EULAR classification criteria of RA [17]. The number of instances (rows) of the RA dataset – 60. The number of features (columns) of the RA dataset – 9. Number of Classes (unique values of the target feature) – 2. Number of missing values – 0. The attributes used to diagnosis RA are mixed of both phenotype and genotype. They are '> 10 joints (at least one small joint)', '4-10 small joints', '1-3 small joints', and '2 – 10 large (no small) joints' are four features of phenotype. 'Serology +' (low positive RF or low positive ACPA), 'Serology ++' (high positive RF or high positive ACPA), and 'APR (Acute phase reactants) Abnormal' (abnormal C-reactive protein (CRP) or abnormal ESR) are three features of genotype and the last attribute is ‘Duration of symptoms >=6 weeks’. In Table 1, the features name is followed with a score value to classify RA patients. The cumulative score value of each attribute per record is less than 6 out of 10. Such a score is not classifiable to diagnose RA. Those scores status is yet to be evaluated, and the criteria might be later fulfilled [17]. 7 Illustration of RA dataset with ID3 and C4.5 classifiers RA [Rheumatoid Arthritis] dataset contains the data field of the qualitative binary asymmetric attribute. Binary data has two conditions such as, 'yes or no,' 'affected or unaffected,' ' true or false.' Asymmetric defines binary values are not equally important. Both the predictor (non- target attribute) and response (target attribute) variable in the RA dataset is binary and categorical. Two response variables 'ra' and 'no ra' suggest, diagnosis of rheumatoid arthritis and not rheumatoid arthritis. 7.1 Step-by-step illustration of ID/C4.5 algorithm using RA dataset Step 1: Find the Entropy for the current RA dataset, S. In RA dataset ‘ra’ and ‘no ra’, two classes are present with the count of 26 and 34, total instances in the dataset are 60. The 'ra' target value informs the patient diagnosed with Figure 2: Work flow Model for Proposed Illustration. 112 Informatica 46 (2022) 107–119 U. Ramasamy et al. Rheumatoid Arthritis, whereas the 'no ra' target value reveals the patient diagnosed with no Rheumatoid Arthritis. To draw the decision tree initial step is to measure the uncertainty for this dataset, i.e., the Entropy of dataset, S denoted as E(S). Calculate E(S) using Equation 1 discussed in Section 4. E(S) = − 26 60 log 2 26 60 − 34 60 log 2 34 60 = 0.9871 Step 2: Find Information Gain by applying Equation 2 for each feature value in the RA dataset. 𝐼𝑛𝑓𝑜𝑟𝑚𝑎𝑡𝑖𝑜𝑛 𝐺𝑎𝑖𝑛 (𝑆 , > 10 𝑗𝑜𝑖𝑛𝑡𝑠 ) = 𝐸 (𝑆 ) − ∑[𝑝 (𝑆 , > 10 𝑗𝑜𝑖𝑛𝑡𝑠 ). 𝐸 (𝑆 , > 10 𝑗𝑜𝑖𝑛𝑡𝑠 )] = 𝐸 (𝑆 ) − [𝑝 (𝑆 | > 10 𝑗𝑜𝑖𝑛𝑡𝑠 = 𝑦𝑒𝑠 ) ∗ 𝐸 (𝑆 , > 10 𝑗𝑜𝑖𝑛𝑡𝑠 = 𝑦𝑒𝑠 ) + 𝑝 (𝑆 | > 10 𝑗𝑜𝑖𝑛𝑡𝑠 = 𝑛𝑜 ) ∗ 𝐸 (𝑆 , > 10 𝑗𝑜𝑖𝑛𝑡𝑠 = 𝑛𝑜 )] 𝐸 (𝑆 , > 10 𝑗𝑜𝑖𝑛𝑡𝑠 = 𝑦𝑒𝑠 ) = − 14 16 log 2 14 16 − 2 16 log 2 2 16 = 0.5436 𝐸 (𝑆 , > 10 𝑗𝑜𝑖𝑛𝑡𝑠 = 𝑛𝑜 ) = − 12 44 log 2 12 44 − 32 44 log 2 32 44 = 0.8454 𝐼𝑛𝑓𝑜𝑟𝑚𝑎𝑡𝑖𝑜𝑛 𝐺𝑎𝑖𝑛 (𝑆 , > 10 𝑗𝑜𝑖𝑛𝑡𝑠 ) = 0.9871 − ( 16 60 ∗ 0.5436 + 44 60 ∗ 0.8454) = 0.9871 − 0.7649 = 0.2222 Furthermore, obtain the information gain for enduring all feature values of the examples. Step 3: Pick the feature which has the highest information gain. The attribute’> 10 joints’ have the highest information gain, as shown in Table 2, ‘>10 joints’ is the best classifier and determined as the root node as shown in Figure 4. Calculate split information for each attribute using Equation 4. 𝑆𝑝𝑙𝑖𝑡𝐼𝑛𝑓𝑜𝑟𝑚𝑎𝑡𝑖𝑜𝑛 (𝑆 , > 10 𝑗𝑜𝑖𝑛𝑡𝑠 ) Figure 3 : Tree Algorithm for RA (Rheumatoid Arthritis). An Illustration of Rheumatoid Arthritis Disease Using... Informatica 46 (2022) 107–119 113 = − 16 60 log 2 16 60 − 44 60 log 2 44 60 = 0.8366 Calculate Gain Ratio for each attribute using Equation 3. 𝐺𝑎𝑖𝑛𝑅𝑎𝑡𝑖𝑜 (𝑆 , > 10 𝑗𝑜𝑖𝑛𝑡𝑠 ) = 0.2222 0.8366 = 0.2656 Now the decision tree node (root node) is the '>10 joints' attribute with a maximum of information gain (in the Table 2 it is represented as Info Gain). Since the RA dataset is categorical and not in continuous attribute, the decision tree built is the same for the ID3 and C4.5 algorithms. So, here the gain ratio measure is necessary to construct the decision tree using the C4.5 algorithm. Step 4: Each branch from the attribute '>10 joints' partition the set S into subsets corresponds to the attribute value 'yes' and 'no.' From the root node '>10 joints', the 'yes' branch of the subset has 14 ‘RA’ and 2 ‘NO RA’ examples obtained. Though we can grow a tree further from the 'yes' branch, we have stopped with the target class RA, to avoid overfitting in the decision tree. This approach followed to stop growing the tree earlier before it attains the level to classify the training data perfectly [59] (p.68). Now recurse (from step 2 to step 3) on the subset (from the root node '>10 joints', the ‘no’ branch of the subset has 12 ‘RA’ and 32 ‘NO RA’ mentioned as ‘?’ in Figure 4) until the ID3 algorithm satisfies the stopping criteria [63] or by following the first-class approach to avoid overfitting [59] (p.68). Step 5: The classification rule is generated from the decision tree. 7.2 Top-down generalization approach for the decision tree Figure 5 illustrates the decision tree built from Table 1, which depicts the RA dataset, after applying the ID3 algorithm [58], [59] (p. 56). The basic steps for the algorithm as follows: - Dmat  dominant attribute for root (initial) / non-leaf node - Set Dmat as dominant attribute for the node - Every unique value of Dmat form new descendant - Classify the dataset records to the leaf node corresponding to the dominant attribute value of the branch - If complete dataset records are ideally classified (target feature has identical values) stop, else iterate over new leaf node The dominant attribute (decision attribute) is the best attribute classifier from the training set. Table 1 : A Sample Dataset of RA [Rheumatoid Arthritis] Derived from Figure 3. S.No. >10 Joints (atleast 1 Small Joint) (5) 4 - 10 Small Joints (3) 1 - 3 Small Joints (2) 2 - 10 Large Joints (1) Serology + (2) Serology ++ (3) APR: Abnorm al (1) Duration : >=6 Weeks (1) Class Label 1 no no no no no yes yes yes no ra 2 no no no no yes no yes yes no ra 3 no no no no no no yes yes no ra 4 no no no no no yes no yes no ra 5 no no no no yes no no yes no ra … … … … … … … … … … 56 yes no no no no no yes no ra 57 yes no no no no no no yes ra 58 yes no no no no no yes yes ra 59 yes no no no no no no yes ra 60 yes no no no no no yes yes ra Figure 4 : Root node of the ID3/C4.5 decision tree using RA dataset. 114 Informatica 46 (2022) 107–119 U. Ramasamy et al. ID3 algorithm follows with following input and output. Input: Datts A set of non-target attributes, R  target attribute and D  training examples. Output: returns a decision tree. ID3(Datts, R, D) Step 1: If D is null, return a single node with value Failure Step 2: If D holds the records of the same class, it returns a single leaf node with that value. Step 3: If Datts is null, then return a node with the value of the most frequent value of R in D. Step 4: Begin 4.1: Dmat  the attribute from Atts that best* classifies D 4.2: tree  a new decision tree with root test Dmat 4.3: for each value v j of Dmat do 4.3.1: D j  subset of D with Dmat= v j 4.3.2: subt  ID3(Datts-Dmat, R, D j) 4.3.3: Add a branch to the tree with label v j and subtree subt 4.4: return tree. * The highest information gain is the best attribute defined in Equation 2. 7.3 Extracting classification rule from decision tree algorithm ID3 & C.5 - Classification rules outline the information in the pattern of IF-Then rules - Single rule is built for each way starting from the root node to a leaf node - Each attribute-value pair along a path makes an association - The leaf node contains the predicted class [64] Classification Rule extracted from Figure 5 decision tree: Rule 1: If > 10 joints =”yes” then class = “RA” *Rule 2: If >10 joints =”no” AND 4-10 small joints = ”yes” AND Serology++ =”yes” then class=”RA” Rule 3: If >10 joints =”no” AND 4-10 small joints =”yes” AND Serology++ =”no” AND Serology+ =”yes” then class=”RA” *Rule 4: If >10 joints =”no” AND 4-10 small joints =”yes” AND Serology++ =”no” AND Serology+ =”no” then class=” NO RA” Rule 5: If >10 joints =”no” AND 4-10 small joints=”no” AND 1-3 small joints =”no” then class=”NO RA” Table 2: A sample of Information gain and gain ratio for RA dataset. Features Features Values ra no ra Tot. freq. count E(t) p(t)*E(t) Info Gain Split Info Gain Ratio >10 Joints (atleast 1 Small Joint) (5) yes 14 2 16 0.5436 0.7649 0.2222 0.8366 0.2656 no 12 32 44 0.8454 4 - 10 Small Joints (3) yes 7 5 12 0.9799 0.9708 0.0163 0.7219 0.0226 no 19 29 48 0.9685 1 - 3 Small Joints (2) yes 4 8 12 0.9183 0.9797 0.0074 0.7219 0.0103 no 22 26 48 0.995 2 - 10 Large Joints (1) yes 1 7 8 0.5436 0.9382 0.0489 0.5665 0.0863 no 25 27 52 0.9989 Serology + (2) yes 8 8 16 1 0.9824 0.0047 0.8366 0.0056 no 18 26 44 0.976 Serology ++ (3) yes 12 8 20 0.971 0.9464 0.0407 0.9183 0.0443 no 14 26 40 0.9341 APR: Abnormal (1) yes 16 14 30 0.9968 0.9576 0.0295 1 0.0295 no 10 20 30 0.9183 Duration: >=6 Weeks (1) yes 16 14 30 0.9968 0.9576 0.0295 1 0.0295 no 10 20 30 0.9183 An Illustration of Rheumatoid Arthritis Disease Using... Informatica 46 (2022) 107–119 115 Rule 6: If >10 joints =”no” AND 4-10 small joints=”no” AND 1-3 small joints =”yes” AND Serology++ =”yes” then class=”RA” Rule 7: If >10 joints =”no” AND 4-10 small joints=”no” AND 1-3 small joints =”yes” AND Serology++ =”no” then class=”NO RA” * in rule denotes Rule obtained from pure class Table 3 : Levelwise Leaf Node Membership for ID3 & C4.5 obtained from Figure 5 Level Leaf Node Class Membership Level 1 [14,2] Level 2 - Level 3 [4, 0] [1,19] Level 4 [3,1] [0,4] [3,1] [1,7] 8 Illustration analysis report The RA dataset consists of all possible feasible features from a RA patient. The predicted optimal features for RA disease are obtained using the classifier ID3 and C4.5. The Figure 5, describes the first predictor variable, '>10 joints' is achieved from level 1, the second predictor variable, ' 4- 10 small joints' is identified from level 2, the third and fourth predictor variables namely ‘serology ++’ and ‘1-3 small joints’ exhibited from level 3 and finally, the fifth predictor variable, ‘serology +’ is obtained from level 4. Therefore, five optimal features (predictor variables) are ‘>10 joints’, ‘4-10 small joints’, ‘serology ++’, ‘1-3 small joints’, and ‘serology +’ plays a vital role to predict RA patients. The accuracy is 90% (54/60) for both ID3 and C4.5 decision tree. The performance is identical for both ID3 and C4.5 because the RA dataset contains categorical data. As shown in Table 2 (first level), for all the remaining levels in the decision tree, the information gain and gain ratio are simultaneously highest as displayed in Figure 6. Table 4 : Performance of Class rulesets for ID3 & C4.5 in RA Dataset. Class Generalized Rule Pure Rule Instances covered False Positive False Negative RA 4 1 24 0 4 NO RA 3 1 30 2 0 Figure 5: ID3 and C4.5 Final decision tree for RA dataset. 116 Informatica 46 (2022) 107–119 U. Ramasamy et al. 9 Conclusion The tree-structured data is converted to a relational database (RA dataset), to identify all feasible features for RA disease. Furthermore, the RA dataset is fed into the decision tree algorithm to obtain optimal features for RA disease. Therefore, we have explored the medical dataset to elucidate with the decision tree approach, and derived decision tree and classification rule as the output from the RA dataset. To summarize the work, ID3 and C4.5 decision tree algorithms construct the same decision tree with a classifier accuracy level of 90% for the RA dataset derived from the tree flowchart for diagnosing precise Rheumatoid Arthritis given in the 2010 RA classification criteria. ID3 and C4.5 classifiers result are equal in performances when considered with RA dataset. Acknowledgment This article has been published under AURF Start-up Grant 2018, Alagappa University, Karaikudi, Tamil Nadu, India, Dt. 23.03.2018. This article has been published under RUSA Phase 2.0 grant sanctioned vide letter No. F. 24-51/2014-U, Policy (TN Multi-Gen), Dept. of Edn. Govt. of India, Dt. 09.10.2018. R efer ence s [1] Zahra Shiezadeh, Hedieh Sajedi, and Elham Aflakie, “Diagnosis of Rheumatoid Arthritis using an Ensemble Learning Approach,” Computer Science and Information Technology. © CS & IT-CSCP 2015, pp. 139–148. https://doi.org/10.5121/csit.2015.51512 [2] Rohini Handa, Rao, U. R. K., Juliana F. M. Lewis, Gautam Rambhad, Susan Shiff, and Canna J. Ghia, “Literature review of rheumatoid arthritis in India International Journal of Rheumatic Diseases,” vol. 19, pp. 440-451, 2016. https://doi.org/10.1111/1756- 185x.12621 [3] Jena, Monalisa, and Satchidananda Dehuri. "DecisionTree for Classification and Regression: A State-of-the Art Review." Informatica 44, no. 4 (2020). https://doi.org/10.31449/inf.v44i4.3023. [4] Yang, Fen. "Decision tree algorithm-based university graduate employment trend prediction." Informatica 43, no. 4 (2019). https://doi.org/10.31449/inf.v43i4.3008. [5] Kalyani, G., MVP Chandra Sekhara Rao, and B. Janakiramaiah. "Decision tree-based data reconstruction for privacy preserving classification rule mining." Informatica 41, no. 3 (2017). https://doi.org/10.1007/s13369-017-2834-2. [6] Wang, Xi. "Research on Recognition and Classification of Folk Music Based on Feature Extraction Algorithm." Informatica 44, no. 4 (2020). https://doi.org/10.31449/inf.v45i4.3819. [7] Halima ELAIDI, Zahra BENABBOU, Hassan ABBAR, A comparative study of algorithms constructing decision trees: ID3 and C4.5, LOPAL’18, May 2-5, 2018, 1-5, Rabat, Morocco © 2018 Association for Computing Machinery, https://doi.org/10.1145/3230905.3230916 [8] Badr HSSINA, Abdelkarim MERBOUHA, Hanane EZZIKOURI, Mohammed ERRITALI, A comparative study of decision tree ID3 and C4.5, International Journal of Advanced Computer Science and Applications, pp: 13-19, 2014. https://doi.org/10.14569/specialissue.2014.040203 [9] Sonia Singh, Manoj Giri, Comparative Study Id3, Cart and C4.5 Decision Tree Algorithm: A Survey, International Journal of Advanced Information Science and Technology (IJAIST) ISSN: 2319:2682 3(7): 47-52, July 2014, https://doi.org/10.15693/ijaist/2014.v3i7.47-52. Figure 6 : Highest Information Gain and Gain Ratio for RA optimal features An Illustration of Rheumatoid Arthritis Disease Using... Informatica 46 (2022) 107–119 117 [10] Elaidi, Zahra Benabbou, Hassan Abbar, Using Game Theory to Handle Missing Data at Prediction Time of ID3 and C4.5 Algorithms, (IJACSA) International Journal of Advanced Computer Science and Applications, 9(12): 218-224, 2018. https://doi.org/10.14569/ijacsa.2018.091232 [11] Kalpesh Adhatrao, Aditya Gaykar, Amiraj Dhawan, Rohit Jha and Vipul Honrao, Predicting Students' Performance Using ID3 and C4.5 Classification Algorithms, International Journal of Data Mining & Knowledge Management Process (IJDKP) 3(5) , 39- 52, September 2013. https://doi.org/10.5121/ijdkp.2013.3504 [12] Y.Wang, Y.Li, Y.Song, X.Rong, S.Zhang, Improvement of ID3 Algorithm Based on Simplified Information Entropy and Coordination Degree, Algorithms journal, 10 (4): 124, 2017. https://doi.org/10.3390/a10040124 [13] Pattama Charoenporn, Reservoir Inflow Forecasting Using ID3 and C4.5 Decision Tree Model, 2017 IEEE 3rd International Conference on Control Science and Systems Engineering, 978-1-5386-0484-7/17/$31.00 ©2017 IEEE, 698-701. https://doi.org/10.1109/ccsse.2017.8088023 [14] Sudrajat, I. Irianingsih, D. Krisnawan, Analysis of data mining classification by comparison of C4.5 and ID algorithms,, IOP Conf. Series: Materials Science and Engineering 166: 1-9, 2017. https://doi.org/10.1088/1757-899x/166/1/012031 [15] Joko Azhari Suyatno, Fhira Nhita, Aniq Atiqi Rohmawati, Rainfall Forecasting in Bandung Regency using C4.5 Algorithm, 2018 6th International Conference on Information and Communication Technology (ICoICT), ISBN: 978-1- 5386-4571-0 (c) 2018 IEEE, 324-328. https://doi.org/10.1109/icoict.2018.8528725 [16] X. Wanga, C. Zhoua, X. Xub, Application of C4.5 decision tree for scholarship evaluations, The 10th International Conference on Ambient Systems, Networks and Technologies (ANT), The Authors. Published by Elsevier Ltd. ScienceDirect, Procedia Computer Science 151 (2019) 179–184. https://doi.org/10.1016/j.procs.2019.04.027 [17] Daniel Aletaha, Tuhina Neogi, Alan J. Silman, Julia Funovits, David T. Felson, Clifton O. Bingham., … Gillian Hawker(2010). “2010 Rheumatoid Arthritis Classification Criteria,” Arthritis & Rheumatism, (62)9: 2569-2581, 2010. https://doi.org/10.1002/art.27584 [18] Lakshmi.B.N, Dr.Indumathi.T.S, Dr.Nandini Ravi, A study on C.5 Decision Tree Classification Algorithm for Risk Predictions during Pregnancy, International Conference on Emerging Trends in Engineering, Science and Technology (ICETEST - 2015), Elsevier Ltd., ScienceDirect, Procedia Technology 24: 1542– 1549, 2016. https://doi.org/10.1016/j.protcy.2016.05.128 [19] Yanwei Xing, Jie Wang and Zhihong Zhao, Yonghong Gao, Combination data mining methods with new medical data to predicting outcome of coronary heart disease, 2007 International Conference on Convergence Information Technology,204:868-872,2007. https://doi.org/10.1109/iccit.2007.204 [20] Begona Garcia-Zapirian, Yolanda Garcia-Chimeno, and Heather Rogers(2015), ”Machine Learning Techniques for Automatic Classification of Patients with Fibromyalgia and Arthritis,” International Journal of Computer Trends and Technology, 25(3):2231-2803,2015. https://doi.org/10.14445/22312803/ijctt-v25p129 [21] Begum Cigsar and Deniz Unal, Comparison of Data Mining Classification Algorithms Determining the Default Risk, Hindawi Scientific Programming Feb. 2019, Article ID 8706505. [22] Nikita Jain, Vishal Srivastava, Data Mining Techniques: A Survey Paper, International Journal of Research in Engineering and Technology, 2(11), eISSN: 2319-1163, pISSN: 2321-7308, 2013. https://doi.org/10.15623/ijret.2013.0211019 [23] S. Umadevi, and K. S. Jeen Marseline, A Survey on Data Mining Classification algorithms, International Conference on Signal Processing and Communication (ICSPC’ 17), July 2017. https://doi.org/10.1109/cspc.2017.8305851 [24] Shanmugam, S., & Preethi, J., “Design of Rheumatoid Arthritis Predictor Model Using Machine Learning Algorithms,” SpringerBriefs in Applied Sciences and Technology, https://doi.org/10.1007/978-981-10-6698-6_7 [25] Vaishali S. Parsania, Krunal Kamani, and Gautam J Kamani, “Comparative Analysis of Data Mining Algorithms on EHR of Rheumatoid Arthritis of Multiple Systems of Medicine International,” Journal of Engineering Research and General Science, 3 (1): 344-350, 2015. [26] Beau Norgeot, MS; Benjamin S. Glicksberg, Laura Trupin, Dmytro Lituiev, Milena Gianfrancesco, Boris Oskotsky, Gabriela Schmajuk, Jinoos Yazdany, and Atul J. Butte, Assessment of a Deep Learning Model Based on Electronic Health Record Data to Forecast Clinical Outcomes in Patients with Rheumatoid Arthritis, JAMA Network Open. 2019,2(3). https://doi.org/10.1001/jamanetworkopen.2019.0606 [27] Jihyung Yoo, Mi Kyoung Lim, Chunhwa Ihm, Eun Soo Choi and Min Soo Kang (2017). A Study on Prediction of Rheumatoid Arthritis Using Machine Learning. International Journal of Applied Engineering Research, 12 (20). ISSN 0973-4562, pp. 9858-9862, 2017. 118 Informatica 46 (2022) 107–119 U. Ramasamy et al. [28] Catia Sofia Tadeu Botas (2017), “Feature analysis to predict treatment outcome in Rheumatoid Arthritis,” Instituto Superior Tecnico, Lisboa, Portugal. pp. 1- 10, 2017. [29] Elena Myasoedova, John M. Davis Ill, Eric L.Matteson, Sara J. Achenbach, Soko Setoguchi, Shannon M. Dunlay, Veronique L. Roger, Sherine E. Gabriel, and Cynthia S. Crowson, ” Increased hospitalization rates following heart failure diagnosis in rheumatoid arthritis as compared to the general population,” Seminars in Arthritis and Rheumatism, 50:25-29, 2020 © Elsevier Inc. https://doi.org/10.1016/j.semarthrit.2019.07.006 [30] Cynthia S Crowson, Katherine P Liao, John M Davis, Daniel H Solomon, Eric L Matteson, Keith L Knutson, Mark A Hlatky, and Sherine E Gabriel, Rheumatoid Arthritis and Cardiovascular Disease, NIH Public Access Am Heart J. 166(4):622–628, 2013. https://doi.org/10.1016/j.ahj.2013.07.010 [31] Usman Khalid, Alexander Egeberg, Ole Ahlehoff, Deirdre Lane, Gunnar H. Gislason, Gregory Y. H. Lip and Peter R. Hansen, Incident Heart Failure in Patients with Rheumatoid Arthritis: A Nationwide Cohort Study, Journal of the American Heart Association 2018. https://doi.org/10.1161/jaha.117.007227 [32] Khalid Raza, Application of Data Mining in Bioinformatics, Indian Journal of Computer Science and Engineering 1(2):114-118, 2012, ISSN: 0976- 5166. [33] Xing-Ming Zhao, Data Mining in Systems Biology, IEEE/ACM Transactions on Computational Biology and Bioinformatics 2016, vol 13(6), 1003-1003. https://doi.org/10.1109/tcbb.2016.2617698 [34] Zijing Wang, Yo Liu and Li Liu,” A New Way to Choose Splitting Attribute in ID3 Algorithm,” 978- 1-5090-6414-4/17 ©2017 IEEE. https://doi.org/10.1109/itnec.2017.8284813 [35] Audu Musa Mabu, Rajesh Prasad, Raghav Yadav, Suleiman S Jauro,” A Review of Data Mining Methods in Bioinformatics,” Recent Advances of Engineering, Technology and Computational Sciences, 978-1-5386-1686-4/18 ©2018 IEEE. https://doi.org/10.1109/raetcs.2018.8443785 [36] He Zhang and Runjing Zhou,” The Analysis and Optimization of Decision Tree Based on ID3 Algorithm,” The 9 th International Conference on Modeling, identification and Control, 2017. https://doi.org/10.1109/icmic.2017.8321588 [37] Jorn Lotsch, Lars Alfredsson, Jon Lampa, “Machine- Learning-based knowledge discovery in rheumatoid arthritis-related registry data to identify predictors of persistent pain,” The International Association for the Study of Pain Research Paper – PAIN, 161:114-126, 2020. https://doi.org/10.1097/j.pain.0000000000001693 [38] Tiffany D. Pan, Beth A. Mueller, Carin E. Dugowson, Michael L. Richardson, and J. Lee Nelson, Disease progression in relation to pre-onset parity among women with rheumatoid arthritis, Seminars in Arthritis and Rheumatism 2019, 0049-0172/© 2019 Elsevier Inc. https://doi.org/10.1016/j.semarthrit.2019.06.011 [39] Ho Sharon, I. Elamvazuthi, CK. Lu, S. Parasuraman and Elango Natarajan, Classification of Rheumatoid Arthritis using Machine Learning Algorithms, IEEE Student Conference on Research and Development (SCOReD) :345-350, 2019. https://doi.org/10.1109/scored.2019.8896344 [40] Ho Sharon, Irraivan Elamvazuthi, Cheng-Kai Lu, S. Parasuraman and Elango Natarajan, Development of Rheumatoid Arthritis Classification From Electronic Image Sensor Using Ensemble Method, Sensors 2020, 20, 167. https://doi.org/10.3390/s20010167 [41] Fautrel, B. Guillemin, F. Meyer, O. Bant, M.D. et al., Choice of Second-Line Diases-Modifying Antirheumatic Drugs After Failure of Methotrexate Therapy for Rheumatoid Arthritis: A Decision Tree for Clinical Practice Based on Rheumatologists’ Preferences. Arthritis Rheumatol. 61:425–434, 2009. https://doi.org/10.1002/art.24588 [42] Keyser, F.D. Choice of biologic therapy for patients with rheumatoid arthritis: The infection perspective. Curr. Rheumatol. Rev., 7: 77-87, 2011. https://doi.org/10.2174/157339711794474620 [43] Wu, C.-T.; Lo, C.-L.; Tung, C.-H.; Cheng, H.-L. Applying Data Mining Techniques for Predicting Prognosis in Patients with Rheumatoid Arthritis. Healthcare,8:85,2020. https://doi.org/10.3390/healthcare8020085 [44] Li Y, Sun X, Zhang X, Liu Y, Yang Y, Li R, Liu X, Jia R, Li Z. Establishment of a decision tree model for diagnosis of early rheumatoid arthritis by proteomic fingerprinting. Int J Rheum Dis. 18(8):835-41, 2015. PMID: 26249836. https://doi.org/10.1111/1756- 185x.12595 [45] Ma Dan, Liang Nana, Zhang Liyun. Establishing Classification Tree Models in Rheumatoid Arthritis Using Combination of Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry and Magnetic Beads. Frontiers in Medicine. 2021; 8:190. ISSN:2296-858X. https://doi.org/10.3389/fmed.2021.609773 [46] Hans Ulrich Scherer, Thomas Häupl, Gerd R. Burmester,The etiology of rheumatoid arthritis, Journal of Autoimmunity, Volume 110,2020,102400, ISSN 0896-8411, https://doi.org/10.1016/j.jaut.2019.102400 [47] Maria Hugle, Patrick Omoumi, Jacob M. van Laar, Joschka Boedecker and Thomas Hugle. Applied machine learning and artificial intelligence in An Illustration of Rheumatoid Arthritis Disease Using... Informatica 46 (2022) 107–119 119 rheumatology.Rheumatology Advances in Practice 20;0:1–10,2020. https://doi.org/10.1093/rap/rkaa005 [48] Shanmugam, S., Preethi, J. Improved feature selection and classification for rheumatoid arthritis disease using weighted decision tree approach (REACT). J Supercomput 75, 5507–5519 (2019). https://doi.org/10.1007/s11227-019-02800-1 [49] Guo Q, Wang Y, Xu D, Nossent J, Pavlos NJ, Xu J. Rheumatoid arthritis: pathological mechanisms and modern pharmacologic therapies. Bone Res. 6:15, 2018. PMID: 29736302; PMCID: PMC5920070. https://doi.org/10.1038/s41413-018-0016-9 [50] Maria Kourilovitch, Claudio Galarza-Maldonado, Esteban Ortiz-Prado. Diagnosis and classification of rheumatoid arthritis. Journal of autoimmunity, 2014. https://doi.org/10.1016/j.jaut.2014.01.027 [51] Vodencarevic, A., Tascilar, K., Hartmann, F. et al. Advanced machine learning for predicting individual risk of flares in rheumatoid arthritis patients tapering biologic drugs, Arthritis Res Ther, 23, 67 (2021). https://doi.org/10.1186/s13075-021-02439-5 [52] Rehberg, M., Giegerich, C., Praestgaard, A. et al., Identification of a Rule to Predict Response to Sarilumab in Patients with Rheumatoid Arthritis Using Machine Learning and Clinical Trial Data, Rheumatol Ther 8, 1661–1675 (2021), https://doi.org/10.1007/s40744-021-00361-5 [53] Liu, J., Chen, N. A 9 mRNAs-based diagnostic signature for rheumatoid arthritis by integrating bioinformatic analysis and machine-learning, J Orthop Surg Res 16, 44 (2021). https://doi.org/10.1186/s13018-020-02180-w [54] Huang Jie, Fu Xuekun, Chen Xinxin, Li Zheng, Huang Yuhong, Liang Chao, Promising Therapeutic Targets for Treatment of Rheumatoid Arthritis, Frontiers in Immunology 2021, Vol.12, ISSN:1664- 3224, https://doi.org/10.3389/fimmu.2021.686155 [55] Maarseveen TD, Meinderink T, Reinders MJT, Knitza J, Huizinga TWJ, Kleyer A, Simon D, van den Akker EB, Knevel R, Machine Learning Electronic Health Record Identification of Patients with Rheumatoid Arthritis: Algorithm Pipeline Development and Validation Study, JMIR Med Inform, 8(11):2020, e23930, PMID: 33252349, PMCID: 7735897, https://doi.org/10.2196/23930 [56] Koo, B.S., Eun, S., Shin, K. et al. Machine learning model for identifying important clinical features for predicting remission in patients with rheumatoid arthritis treated with biologics. Arthritis Res Ther 23, 178, 2021. https://doi:10.1186/s13075-021- 02567-y [57] Fraenkel L, Bathon JM, England BR, et al. 2021 American College of Rheumatology Guideline for the Treatment of Rheumatoid Arthritis, Arthritis Rheumatol,73(7):1108-1123,2021. https://doi:10.1002/art.41752 [58] nptelhrd. (2008, October 16). Lecture – 35 Rule Induction and Decision Trees – I. Retrieved from https://www.youtube.com/watch?v=WfsRaLmh8js& t=547s. [59] Tom M. Mitchell. Machine Learning McGraw-Hill Science/Engineering/Math. 1997; pp. 52-76. [60] Wikipedia Website. [Online]. Available: https://en.wikipedia.org/wiki/Information_gain_ratio [61] Seema Sharma, Jitendra Agrawal, Sanjeev Sharma, Classification Through Machine Learning Technique: C4.5 Algorithm based on Various Entropies, International Journal of Computer Applications, 82(16):0975-8887, 2013. https://doi.org/10.5120/14249-2444 [62] R. Sudrajat, I. Irianingsih, D. Krisnawan, Analysis of data mining classification by comparison of C4.5 and ID algorithms, IOP Conf. Series: Materials Science and Engineering 166, 2017, 012031, https://doi.org/10.1088/1757-899x/166/1/012031 [63] Wikipedia Website. [Online]. Available: https://en.wikipedia.org/wiki/ID3_algorithm. [64] Poonkuzhali. S, Saravanakumar. C. Data Warehousing & Data Mining. Charulatha Publications 2008. 1 st Ed. pp: 6.12. 120 Informatica 46 (2022) 107–119 U. Ramasamy et al.