https://doi.org/10.31449/inf.v47i6.4607 Informatica 47 (2023) 115–130

A Deep Learning-Fuzzy Based Hybrid Ensemble Approach for Aspect Level Sentiment Classification

Tanu Sharma¹, Kamaldeep Kaur²
¹,²University School of Information, Communication and Technology, GGSIPU, Delhi, India
E-mail: tanu.phd.132.usict2018@ipu.ac.in, kdkaur99@ipu.ac.in

Keywords: aspect based sentiment analysis, fuzzy logic, ensemble, deep neural network, natural language processing

Received: January 10, 2023

Aspect level sentiment classification (ALSC) has gained high importance in the era of an e-commerce-based economy. It allows manufacturers to improve the designs of their products based on users' feedback. However, only a few datasets of limited domains are available for the ALSC task. To push forward research in automated ALSC, this study contributes a car dataset from the automobile domain. A novel fuzzy ensemble technique is also proposed, based on a mathematical analysis of the confidence scores of base deep neural networks. The proposed approach allows the misclassifications of the base deep learners to be corrected through a reward and penalization strategy. Experimental results on five benchmark datasets show that the proposed approach outperforms the constituent base deep neural networks and several other important baselines. Based on the Friedman and Nemenyi tests, the proposed fuzzy ensemble also performs on par with the most recent graph convolutional neural networks.

Povzetek (translated from Slovenian): This paper describes two novelties: (1) a car dataset for aspect-level sentiment classification and (2) a new fuzzy ensemble method for combining the classifications of deep neural networks.

1 Introduction

Aspect-based sentiment analysis (ABSA) predicts the polarity of sentiment towards a specific entity or target, thus providing more detailed information than general sentiment analysis. Aspect level sentiment classification (ALSC) specifically handles the sentiment classification task of ABSA [1]. Recently, research in the field of ALSC has been driven by well-performing deep learning methods, and most researchers leverage deep learning (DL) methods to achieve better accuracy on benchmark datasets. However, a benchmarking study [2] of the latest deep learning methods revealed that their good performance comes at the cost of long training times. Moreover, despite the effectiveness of the latest ALSC methods, it is challenging to apply them to real-world applications because labeled data is unavailable in many domains.

So far, multiple datasets have been proposed for ALSC, including SemEval's restaurant14 [3], laptop14 [3], restaurant15 [4], restaurant16 [5], MAMS [6], and Twitter [7]. Although these datasets are studied as benchmarks in almost every ALSC study, they lack domain diversity: most belong to the restaurant domain, with the exception of the laptop14 and Twitter datasets. It is well known that supervised approaches such as deep learning rely on properly labeled training data. However, datasets that conform to the SemEval guidelines are not available for many categories of products. Thus, the applicability of well-performing methods in product domains other than mobile phones and laptops is not well tested and hence doubtful.
To fill this gap in ALSC research, this study provides a dataset in the automobile domain that conforms to the SemEval guidelines. The availability of labeled data in the automobile domain will help researchers and other stakeholders of the automobile domain to automate the ALSC process efficiently. Another advantage of the proposed dataset is that it facilitates cross-domain transfer learning in the field of ALSC.

Ensemble learning is a paradigm in which the decision scores of multiple base learners are used collectively to predict the outcome for a given input sample. An ensemble model aims to capture the salient features of its base methods and thus promises better results than any of its base learners. The ensemble is constructed by taking prediction decisions from various base learners: some pre-defined weight is allocated to each contributing learner and the outcome is calculated. However, such methods pay no attention to the confidence level of the predictions made by the base learners; most ensemble techniques simply ignore it. In this study, the confidence score (probability) of the predictions of each base learner is considered, and this score is used to calculate the final prediction for each sample.

The contributions of this study are twofold:

• A dataset in the automobile domain, specifically for the ALSC task, is developed.
• An efficient ensemble technique for the ALSC task is proposed. To build the ensemble, a thorough investigation of the latest deep learning methods is performed to select methods that are diverse and computationally efficient. After the selection of three base DL methods, a fuzzy rank-based logic using two non-linear functions is developed to aggregate the ensemble outputs. The functions used for fuzzy ranking have different concavities; hence, both penalization and reward strategies are leveraged in the proposed fuzzy ensemble.

To develop the benchmark dataset, the car reviews were first collected from Ganesan et al. [32]. Co-reference resolution was then applied to the review sentences to ensure that all the aspects discussed in the reviews are covered in the dataset. The sentences were then manually annotated based on the guidelines released for the benchmark datasets by SemEval [3]. An extensive evaluation of the proposed fuzzy rank-based ensemble approach is performed on five benchmark datasets, including the newly proposed car dataset.

With the intention of advancing research in the field of ALSC, the major contributions of this study are:

1. Manual annotation of car data for ALSC in the automobile domain. The release of this data will push forward cross-domain transfer learning and automated ABSA research in the automobile domain.
2. A novel fuzzy-based ensemble approach using three base deep learning (DL) methods that are diverse and efficient in terms of computational time. Further, the systematic penalization and reward strategy supports the prediction of the correct class by the proposed ensemble.
3. Experimental results demonstrating that the proposed ensemble approach performs significantly better than the base learners as well as other state-of-the-art deep learning methods.
4. Case studies showing that the proposed ensemble can predict correctly even when all the base learners give wrong predictions.
The organization of this study is as follows. Section 2 describes the related work. Section 3 describes the approach used to develop the dataset. Section 4 presents the hybrid fuzzy ensemble approach proposed in this study. Section 5 provides experimental details, statistical test results, and an ablation study. Section 6 discusses the case studies. Finally, section 7 concludes this work.

2 Related work

2.1 Deep learning for ALSC

In recent years, the ALSC literature has been dominated by methods deploying deep neural networks. This dominance is mainly due to their capability of learning features automatically, without any external feature engineering effort [8] [9]. Additionally, such methods have shown remarkably better performance than traditional machine learning methods [10]. The initial attempts at deep learning based ALSC utilized sequence networks like LSTM in their architectures [11] [12]. Tang et al. [11] proposed the very first LSTM based model, known as target-dependent LSTM (TD-LSTM), which can capture the context on both sides of the aspect or target in a sentence. Later, Wang et al. [12] proposed an attention-based model with LSTM as the underlying network. Inspired by the popularity of the attention mechanism in the field of NLP, they were the first to leverage attention in the area of ALSC. Following their work, various attention-based models using GRU, CNN, and memory networks [13] were proposed for the ALSC task. Tang et al. [14] developed a network known as MemNet that utilizes a deep memory network to generate aspect-specific features and updates the memory using the attention mechanism.

With their capability of capturing local features efficiently, convolutional neural network (CNN) based models have also demonstrated promising performance on the ALSC task [10]. Researchers have leveraged both plain CNNs and hybrid CNN architectures to achieve promising results. Xue et al. [15] proposed a CNN-based method that uses a gating mechanism to efficiently control the flow of information. Li et al. [16] proposed TNet, a hybrid transformation network based on both CNN and LSTM; in TNet, the LSTM captures contextual information whereas the CNN captures local features.

There are various other attempts in which interactive attention-based networks are proposed [17] [18] [19]. The intuition behind interactive attention is to capture the relationship between the context and the aspect of the sentence; the authors of [17] [18] argued that simple attention is not sufficient for this. In another attempt, Fan et al. [19] proposed a coarse-grained and fine-grained attention mechanism referred to as a multi-grained attention network.

Recently, graph neural networks (GNNs) have also gained importance for the ALSC task. Researchers in the area of ALSC leverage GNNs to incorporate the syntactic knowledge of the sentence obtained from its dependency tree. Syntactic knowledge plays a crucial role in handling long-range dependencies between the aspect and the relevant context words, and the graph-like structure of the dependency tree facilitates the usage of GNNs for this task. However, the architecture of these GNN-based methods is quite complex, which makes them computationally expensive. The first attempts in this line were made by Zhang et al. [20] and Huang et al. [21].
The more recent works leveraging GNNs are [22] [23] [24] [25]. Initially, only the node information of the sentence was captured using GNN-based architectures. However, the edges of the dependency tree also carry important and meaningful information. Thus, in various works [26] [27], the edge information is also taken into consideration to generate a better representation of the sentence.

Table 1: Summary of the related work

| Method | Year | Description | Average acc¹ | Advantages | Limitations |
|---|---|---|---|---|---|
| TD-LSTM | 2016 | Uses two LSTMs to handle the left and right context of the target | 71.83 | Simple architecture, less computational time | Low accuracy |
| AEContextAvg | 2019 | A simple feed-forward network that takes an average of aspect and context embeddings as input | 74.36 | Simple architecture, less computational time | Moderate accuracy |
| ATAE-LSTM | 2016 | Appends the aspect embeddings to the LSTM input along with an attention layer | 70.66 | Simple architecture, moderate computational time | Very low accuracy |
| CNN | 2019 | Simple network based on CNN to extract local features efficiently | 70.80 | Simple architecture, less computational time | Very low accuracy |
| MemNet | 2016 | Based on a memory network where context words act as external memory | 71.02 | Moderate computational time | Low accuracy |
| RAM | 2017 | Based on GRU and Bi-LSTM for generating aspect and context representations respectively | 71.40 | Moderate computational time | Low accuracy |
| IAN | 2017 | Based on two LSTMs along with interactive attention | 72.33 | Moderate computational time | Low accuracy |
| TNet | 2018 | Combines the embeddings generated by Bi-LSTM and CNN with a transformation module | 75.39 | Moderate accuracy | Complex architecture, high computational time |
| ASGCN | 2019 | Converts syntactic information into an undirected graph and then applies GCN | 81.29 | Very high accuracy | Complex architecture, high computational time |
| DualGCN | 2022 | Based on two GCNs: syntactic and semantic | 81.59 | Very high accuracy | Complex architecture, high computational time |
| SSEGCN | 2022 | Generates attention scores using two different types of attention and passes them to a GCN layer | 82.30 | Very high accuracy | Complex architecture, high computational time |
| Ensemble majority vote | 2022 | Base learners: TD-LSTM, AEContextAvg, and CNN; ensemble created using the majority voting method | 72.30 | Simple ensemble computation | Low accuracy |
| Stacking based ensemble | 2022 | Base learners: TD-LSTM, AEContextAvg, and CNN; ensemble created using the meta-learning approach with a random forest classifier | 76.74 | Simple architecture of base learners | Moderate accuracy; meta-learning increases the computational time |
| EO based ensemble | 2022 | Optimization approach applied to select base deep learners from a pool of ten DL methods; ensemble created using the meta-learning approach with a random forest classifier | 78.05 | High accuracy | Complex computation for ensemble creation |
| Proposed Fuzzy Ensemble | 2023 | Base learners: TD-LSTM, AEContextAvg, and CNN; ensemble created using simple mathematical fuzzy logic | 79.61 | Simple architecture of base learners, simple logic for ensemble, high accuracy | Requires future research on the scalability of the method across many more corpora of products and services |

¹ Very low: acc below 71; low: 71 ≤ acc < 74; moderate: 74 ≤ acc < 77; high: 77 ≤ acc < 80; very high: acc above 80.
2.2 Ensemble learning in the context of ALSC

Ensemble learning is a popular technique that has attracted researchers in most domains throughout the years. However, very few ensemble approaches have been proposed for the ALSC task so far. Mohammadi et al. [30] were the first to propose an ensemble-based technique for the ALSC task. The authors used simple CNN, Bi-LSTM, LSTM, and GRU networks as the base learners, and their ensemble approach was based on the meta-learning principle, fusing the predictions of the base learners to obtain the final prediction of the ensemble. However, their work has two major limitations. First, the authors selected simple deep neural networks and did not emphasize aspect-specific information in any of the base learners. Second, the authors demonstrated the performance of the ensemble using the macro-precision metric only, whereas accuracy and F1 score are considered better metrics for evaluating the performance of classification models.

Sharma et al. [28] proposed another meta-learning ensemble technique for ALSC. The authors used three base learners, TD-LSTM, AEContextAvg, and CNN, with a random forest in the meta-learning phase to generate the final predictions. In another similar attempt, the authors of [29] proposed an ensemble approach based on the principle of classifier ensemble reduction: the selection of base learner methods for the ensemble is transformed into a classifier reduction problem, and a physics-based optimization algorithm known as the Equilibrium Optimizer (EO) is used to select the base learners from a pool of ten different DL methods. The EO-based ensemble also obtained good performance. However, the selection of base learners using an optimization algorithm can be a time-consuming and tedious process. Therefore, the aim of this study is to propose an ensemble that is efficient and less complex in terms of both computation and time.

A brief description of the various DL methods in the ALSC literature, along with their advantages and limitations, is provided in Table 1; the computational times of these methods can be obtained from the work of Sharma & Kaur [31]. It can be seen from the table that most of the methods with simple architectures and low computational times could not achieve high accuracy, while the methods attaining high accuracies either have complex architectures with high computational times or complex ensemble construction methods. Thus, this work aims to propose a method with improved accuracy and less complex computations: three simple base learners are selected for the ensemble, and a novel mathematical logic based on the fuzzy principle is proposed.

3 Data annotation

3.1 Data collection

In this study, a new domain, automobiles, is explored for ALSC research. The data collection and annotation process is carried out in the same manner as for the SemEval datasets. The sentences are annotated from the car review dataset collected by Ganesan et al. [32], which contains car reviews from the Edmunds.com website. The full reviews of one model per car company are selected from the dataset. Before beginning the annotation process, co-reference resolution is applied to the reviews so that all the mentioned aspects can be considered. Fig. 1 shows the steps followed for the dataset construction and preprocessing task.
Figure 1: Data annotation process

3.2 Co-reference resolution

In ALSC, annotation is carried out at the sentence level. If the sentence contains an explicit target mention, then the target is annotated; otherwise, if the target is referred to implicitly using pronouns, the sentence is not considered for annotation, and because of this some valuable opinions may be missed. Co-reference resolution is the task of matching the expressions that refer to the same entity in a text; it links the actual noun phrase mention with its pronoun in the sentence. In this study, co-reference resolution is applied to the car review data collected from [32]. This step ensures that all third-person pronouns like 'it', 'them', etc. are replaced with their actual mentions. An example showing the utility of co-reference resolution is given in Fig. 1. The task is performed using the NeuralCoref library available in Python (a minimal usage sketch is given at the end of this section). The reviews are then split into sentences and the data annotation task is performed.

3.3 Data annotation

In this step, each sentence is parsed to find the relevant aspect and its respective sentiment. For annotating the sentences, the annotation guidelines released by the organizers of the SemEval workshop [3] are followed, which ensures that the annotated data is on par with the other benchmark datasets of the ALSC task. The annotation was carried out by two annotators, the authors of this paper. Both annotators first annotated a subset of sentences based on the guidelines released by SemEval (available at http://alt.qcri.org/semeval2014/task4/data/uploads/). A review session was then conducted to discuss the annotations and any doubts. After three such rounds of discussion, all doubts of both annotators were resolved. The remaining review sentences were then allotted equally to the two annotators, and the annotated review sentences were finally combined to form the final dataset. Sample annotated sentences from the dataset are shown in Table 2.

After the annotation process is complete, the data is transformed into the standard XML format used by the benchmark datasets of the ALSC literature. This step ensures that the prepared dataset can be used just like the other benchmark datasets by researchers working in the area of ALSC. A random split is applied to obtain the train and test data. The total number of sentences in the dataset is 5478, and the total number of samples is 7541.

Table 2: Sample sentences from the dataset

| Review Sentence | Aspect Term | Polarity |
|---|---|---|
| Unfortunately, after rolling just 19k, transmission crapped out. | Transmission | -1 |
| My only complaint is rear seats are not comfortable on back for a long car trip. | Rear seats | -1 |
| Get the factory navigation system if you can. | Navigation system | 0 |
| Get back up camera for sure. | Back up camera | 0 |
| Fit and finish inside and out is fantastic. | Fit and finish | 1 |
| The interior is well layed out with easy to read gauges. | Interior | 1 |

The statistics of the final released dataset are given in Table 3.

Table 3: Data statistics

| Car Dataset | Positive | Negative | Neutral | Total samples | Number of sentences |
|---|---|---|---|---|---|
| Train | 3253 | 1004 | 795 | 5052 | 4404 |
| Test | 1603 | 494 | 392 | 2489 | 1074 |
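The co-reference resolution step of section 3.2 uses the NeuralCoref library; the sketch below shows minimal usage. It assumes a spaCy 2.x pipeline (NeuralCoref does not support spaCy 3), and the model name and review sentence are illustrative, not taken from the dataset.

```python
# Minimal sketch of the co-reference resolution step with NeuralCoref
# (requires spaCy 2.x; model name and sentence are illustrative).
import spacy
import neuralcoref

nlp = spacy.load("en_core_web_sm")
neuralcoref.add_to_pipe(nlp)  # register the coref component on the pipeline

doc = nlp("The transmission felt rough at first, but it smoothed out later.")
if doc._.has_coref:
    # Pronouns are replaced by the main mention of their cluster, so implicit
    # aspect mentions like "it" become annotatable noun phrases.
    print(doc._.coref_resolved)
    # e.g. "The transmission felt rough at first, but The transmission smoothed out later."
```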
4 Methodology

In this section, first, the task definition and the preliminaries related to ALSC and the deep learning methods are explained. Second, the different deep learning methods used as base classifiers in the proposed fuzzy ensemble approach are described. Lastly, the proposed fuzzy ensemble approach based on the selected DL methods is explained.

4.1 Preliminaries

The aspect sentiment classification task differs from the general sentiment classification task. The major reason for this difference is the presence of different polarity words for the different aspects present in a single sentence. For example, in Fig. 2, "The seats are wonderfully comfortable but the mileage is poor", the sentiment polarity is positive for the aspect "seats" and negative for the aspect "mileage". Thus, the ALSC task deals with predicting the polarity class for a given sentence-aspect pair (S, A). In deep learning-based ALSC, the sentence S is converted into an output vector while taking the aspect A into consideration. The construction of this output vector, which is also the final representation of the input sentence, varies depending on the underlying architecture of the deep learning method. Finally, this output vector is treated as the final feature and is fed into a softmax layer for sentiment prediction of the aspect A. Fig. 3 shows the overall architecture of the proposed fuzzy ensemble approach.

Figure 2: Preliminaries for the ALSC task

4.2 Selection of base deep learning methods

The performance of any ensemble method depends largely on the selection of the base learners. Time efficiency plays a crucial role, as the objective is to build an ensemble with better accuracy without compromising on time. Further, to ensure that the selected base learners are diverse, the different base learners and their characteristics discussed in [2] were thoroughly studied. After closely analysing the limitations of the current approaches, three deep learning methods that are diverse and time efficient were selected: TD-LSTM, AEContextAvg, and CNN.

CNNs have the capability to extract local features efficiently; in most deep learning architectures of the ALSC literature, a convolutional layer is placed after the input embedding layer to generate local features from the text. Thus, CNN is chosen as one of the base learners in the proposed ensemble. Another base learner, TD-LSTM, performs well on long sentences as it considers both the right and left context of the aspect term to predict the sentiment polarity. To handle simple and short sentences, AEContextAvg is chosen; it has a very simple architecture and performs decently on short sentences.

Figure 3: The architecture of the proposed fuzzy-based ensemble

These three methods are briefly explained below.

Target dependent LSTM

TD-LSTM handles the input in an aspect-oriented way by splitting the input sentence around the aspect into a left context and a right context. In this way, TD-LSTM takes the aspect position into account while generating the final feature vector of the sentence. As shown in Fig. 3, two separate inputs are provided to two LSTMs: LSTM_L takes the left context (C_L) + aspect as input, whereas LSTM_R takes the aspect + right context (C_R) as input. Finally, the hidden vector representations from both LSTMs are combined to form the final vector, which is fed into a softmax layer for prediction.
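As a concrete illustration of this split-context design, the following is a minimal PyTorch sketch of a TD-LSTM-style classifier. Layer sizes follow the hyperparameters in Table 4, but the class and argument names are ours, and it is a sketch rather than the original implementation.

```python
import torch
import torch.nn as nn

class TDLSTM(nn.Module):
    """Sketch of a TD-LSTM-style classifier: one LSTM reads the left context
    plus the aspect, a second reads the aspect plus the right context
    (reversed), and their last hidden states are concatenated and classified."""

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=300, num_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)  # GloVe-initialized in the paper
        self.lstm_l = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.lstm_r = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, left_ctx_aspect, aspect_right_ctx_rev):
        # left_ctx_aspect:      token ids of C_L + aspect, read left to right
        # aspect_right_ctx_rev: token ids of aspect + C_R, reversed so the
        #                       LSTM reads towards the aspect from the right
        _, (h_l, _) = self.lstm_l(self.embed(left_ctx_aspect))
        _, (h_r, _) = self.lstm_r(self.embed(aspect_right_ctx_rev))
        features = torch.cat([h_l[-1], h_r[-1]], dim=-1)  # final sentence vector
        return torch.softmax(self.fc(features), dim=-1)   # per-class confidences
```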
AEContextAvg

AEContextAvg [10] utilizes a simple feed-forward neural network that takes both the aspect and the sentence as input. As shown in Fig. 3, the averages of the aspect and sentence vectors are first concatenated, and this concatenated vector is then fed into a softmax layer for prediction. The architecture of AEContextAvg is simple yet efficient because it considers the aspect and the sentence together.

Convolutional neural network

CNN is a deep learning network that acquires its power from convolution filters and pooling operations, and it has proven efficient at automatically extracting features from text. The first layer of the CNN is an embedding layer that takes the word embeddings of the sentence as input. The output is then fed into multiple convolution filters, as shown in Fig. 3. Finally, a softmax layer generates the confidence score of each class for the given input.

4.3 Proposed fuzzy ensemble

In the proposed fuzzy ensemble, the confidence score (class probability) of the predictions of each base learner is considered, and this score is used to calculate the final prediction for each sample. A fuzzy rank is calculated for each class using two non-linear functions: tanh and the modified Weibull function [33]. The two chosen functions have different concavities, which helps in maintaining the equilibrium between the reward and penalization strategies. These functions determine the fuzzy rank of the various classes from the confidence scores obtained by each class from each base deep learning method.

Algorithm 1: Fuzzy rank-based ensemble
Input: probability scores for each class obtained by each base DL classifier
Output: final predicted class
1: Let p be the number of classes and q the number of base DL classifiers.
2: Let i index the i-th class and j the j-th base DL classifier.
3: Initialize the p × q list PBS_j^i with the confidence scores obtained for each class i by each base DL classifier j.
4: Initialize FZR1_j^i and FZR2_j^i to store the two fuzzy ranks obtained for each class i by each base DL classifier j.
5: Initialize CFR_j^i to store the combined fuzzy score obtained from FZR1_j^i and FZR2_j^i.
6: Initialize FFR^i to store the final fuzzy rank obtained by each class.
7: for each class i and base DL classifier j do
8:   Use Eq. (1) and Eq. (2) to calculate FZR1_j^i and FZR2_j^i respectively.
9:   Use Eq. (3) to calculate the combined fuzzy score CFR_j^i.
10:  Calculate the final fuzzy rank FFR^i using Eq. (4).
11: end for
12: return final predicted class = arg min_{i ∈ [1, p]} FFR^i

The mathematical model used in this work is discussed next. Let p be the number of classes and q the number of base DL classifiers. Initialize the p × q list PBS_j^i to store the confidence score C_j^i obtained for the i-th class by the j-th base DL classifier. For each classifier, the two fuzzy ranks FZR1_j^i and FZR2_j^i are calculated using Eq. (1) and Eq. (2), based on the hyperbolic tangent and Weibull functions respectively.

FZR1_j^i = 1 − tanh((C_j^i − 1)² / 2),  for i ∈ [1, p]   (1)

FZR2_j^i = exp(−2 (C_j^i)²) / 2,  for i ∈ [1, p]   (2)

Both calculated fuzzy ranks are then multiplied to obtain the combined fuzzy rank CFR_j^i, as in Eq. (3).

CFR_j^i = FZR1_j^i × FZR2_j^i   (3)

The final aggregated fuzzy rank for each class i is obtained using Eq. (4), where q is the number of base DL classifiers.

FFR^i = Σ_{j=1}^{q} CFR_j^i   (4)

Finally, Eq. (5) gives the final class predicted by the ensemble.

Final predicted class = arg min_{i ∈ [1, p]} FFR^i   (5)

The detailed steps of the proposed fuzzy rank-based ensemble are given in Algorithm 1.
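The whole aggregation reduces to a few array operations. The following NumPy sketch transcribes Eqs. (1)–(5) and Algorithm 1 directly; the function and variable names are of our choosing.

```python
import numpy as np

def fuzzy_rank_ensemble(scores):
    """Fuzzy rank-based fusion of Algorithm 1.

    scores: q x p array of softmax confidence scores, one row per base DL
    classifier (q classifiers, p classes). Returns the index of the predicted
    class and the final fuzzy ranks FFR.
    """
    c = np.asarray(scores, dtype=float)
    fzr1 = 1.0 - np.tanh((c - 1.0) ** 2 / 2.0)  # Eq. (1): tanh-based rank
    fzr2 = np.exp(-2.0 * c ** 2) / 2.0          # Eq. (2): Weibull-based rank
    cfr = fzr1 * fzr2                           # Eq. (3): combined fuzzy rank
    ffr = cfr.sum(axis=0)                       # Eq. (4): sum over classifiers
    return int(np.argmin(ffr)), ffr             # Eq. (5): minimum rank wins
```

A lower aggregated rank indicates stronger collective support for a class, which is why Eq. (5) selects the minimum.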
5 Experiments

5.1 Experimental settings

The proposed work is implemented using the PyTorch framework. The hyperparameter details are shown in Table 4; the other model-specific parameter settings are kept the same as in [2]. Accuracy and macro F1 score are used as the evaluation metrics.

Table 4: Hyperparameter settings

| Hyperparameter | Value |
|---|---|
| GloVe embedding dimension | 300 |
| Hidden state vector dimension | 300 |
| Batch size | 64 |
| Learning rate | 0.01 |
| Regularization | L2 |
| Dropout rate | 0.1 |
| Optimizer | Adam |
| Initialization of weight matrix | U(-0.01, 0.01) |

Throughout this paper, acc refers to accuracy and F1 score refers to the macro F1 score. The reliability of the results is ensured by averaging over 5 runs with randomly initialized values.

5.2 Datasets

In this study, the evaluation of the various methods is performed on the 5 datasets listed in Table 5. The first four datasets (restaurant14, laptop14, restaurant15, and restaurant16) were released by SemEval, whereas the Car dataset is developed in this study itself.

Table 5: Details of the datasets

| Dataset | Positive (Train / Test) | Negative (Train / Test) | Neutral (Train / Test) |
|---|---|---|---|
| Laptop14 | 987 / 341 | 866 / 128 | 460 / 169 |
| Restaurant14 | 2164 / 728 | 805 / 196 | 633 / 196 |
| Restaurant15 | 1198 / 454 | 403 / 346 | 53 / 45 |
| Restaurant16 | 1657 / 611 | 749 / 204 | 101 / 44 |
| Car | 3253 / 1603 | 1004 / 494 | 795 / 392 |
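All five datasets, including the released car dataset, use the SemEval XML format described in section 3.3. The following minimal loader sketch (not the authors' code) reads sentence-aspect-polarity triples from that format; the element and attribute names follow the SemEval-2014 schema.

```python
# Minimal loader sketch for SemEval-2014-style ALSC XML (assumed schema:
# <sentence><text>...</text><aspectTerms><aspectTerm term=".." polarity=".."/>...).
import xml.etree.ElementTree as ET

def load_alsc_samples(xml_path):
    samples = []
    for sentence in ET.parse(xml_path).getroot().iter("sentence"):
        text = sentence.findtext("text")
        terms = sentence.find("aspectTerms")
        if terms is None:          # sentence with no annotated aspect terms
            continue
        for term in terms.iter("aspectTerm"):
            samples.append((text, term.get("term"), term.get("polarity")))
    return samples  # list of (sentence, aspect term, polarity) triples
```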
5.3 Experimental results

In this section, the experimental results obtained by the various methods are presented. The proposed ensemble approach is compared with various state-of-the-art baselines, including the latest GNNs and the ensemble methods of the ALSC literature. It can be observed from Table 6 that the proposed fuzzy ensemble outperforms most of the compared methods, with few exceptions.

Table 6: Experimental results obtained for various methods

| Methods | Restaurant14 (Acc / F1) | Laptop14 (Acc / F1) | Restaurant15 (Acc / F1) | Restaurant16 (Acc / F1) | Car (Acc / F1) | Average (Acc / F1) |
|---|---|---|---|---|---|---|
| CNN [10] | 73.75 / 60.3 | 61.75 / 53.06 | 64.3 / 40.43 | 77.18 / 47.54 | 77.02 / 71.91 | 70.8 / 54.65 |
| AEContextAvg [10] | 70.71 / 56.99 | 63.79 / 55.71 | 82.7 / 51.2 | 78.57 / 51.3 | 76.01 / 70.41 | 74.35 / 57.12 |
| TD-LSTM [11] | 71.78 / 58.05 | 63.0 / 56.98 | 65.79 / 45.36 | 79.39 / 52.31 | 79.21 / 72.09 | 71.83 / 56.96 |
| ATAE-LSTM [12] | 70.98 / 54.61 | 63.0 / 52.52 | 66.15 / 42.74 | 74.24 / 52.77 | 78.92 / 71.02 | 70.66 / 54.73 |
| MemNet [14] | 70.35 / 56.77 | 61.59 / 50.94 | 63.9 / 43.3 | 78.92 / 49.21 | 80.32 / 72.48 | 71.02 / 54.54 |
| IAN [18] | 70.71 / 56.49 | 62.53 / 57.08 | 69.23 / 46.11 | 79.39 / 54.77 | 79.8 / 67.23 | 72.33 / 56.34 |
| RAM [13] | 70.26 / 55.94 | 60.5 / 52.61 | 68.16 / 45.42 | 79.04 / 50.2 | 79.46 / 72.01 | 71.48 / 55.24 |
| TNet [16] | 78.75 / 67.54 | 71 / 64.9 | 60.15 / 57.71 | 86.13 / 68.82 | 80.92 / 74.21 | 75.39 / 66.64 |
| ASGCN [20] | 81.69 / 73.76 | 75.02 / 70.79 | 78.96 / 60.71 | 87.71 / 67.83 | 83.07 / 76.32 | 81.29 / 69.88 |
| DualGCN [22] | 83.24 / 75.22 | 76.61 / 72.96 | 80.88 / 65.32 | 84.61 / 69.05 | 83.11 / 76.02 | 81.69 / 71.71 |
| SSEGCN [23] | 83.35 / 76.03 | 78.01 / 73.21 | 81.27 / 64.46 | 85.3 / 68.9 | 83.55 / 75.68 | 82.30 / 71.66 |
| Ensemble majority vote | 73.21 / 62.01 | 65.42 / 59.02 | 68.85 / 50.11 | 78.03 / 51.55 | 75.98 / 67.09 | 72.30 / 57.96 |
| Stacking based ensemble [28] | 81.07 / 76.32 | 70.23 / 68.11 | 73.3 / 70.71 | 80.08 / 61.4 | 79.01 / 72.35 | 76.74 / 69.78 |
| EO based ensemble [29] | 81.89 / 77.65 | 71.07 / 69.34 | 77.53 / 71.01 | 81.47 / 61.6 | 78.3 / 72.41 | 78.05 / 70.41 |
| Proposed Fuzzy Ensemble | 82.51 / 77.89 | 72.03 / 70.01 | 78.88 / 72.13 | 82.5 / 62.07 | 82.12 / 75.01 | 79.61 / 71.42 |

• The three base learners in the proposed model (TD-LSTM, CNN, and AEContextAvg) have simple architectures without any attention mechanism or complex graph neural networks. Since TD-LSTM and CNN do not employ any attention mechanism, their performance is relatively poor compared to the other DL methods. However, the third base learner, AEContextAvg, attains moderate accuracy even without an attention mechanism.
• The proposed fuzzy ensemble outperforms its three base learners, TD-LSTM, AEContextAvg, and CNN, by 14.4%, 13.7%, and 14.2% respectively in terms of F1 score. This good performance relative to the base learners supports the principle that weak and diverse base learners contribute to a good ensemble, and thus justifies the selection of base deep learning classifiers in this study.
• Other state-of-the-art methods like ATAE-LSTM, MemNet, IAN, and RAM deploy different types of attention mechanisms in their architectures. Nevertheless, their performance is almost similar to (or only slightly better than) TD-LSTM and CNN, while our proposed fuzzy logic-based ensemble attains better results than these methods even with weak base learners like TD-LSTM and CNN.
• As per the table, the performance of the proposed ensemble in terms of F1 score is better than that of the state-of-the-art methods ATAE-LSTM, MemNet, IAN, and RAM by 16.4%, 16.6%, 14.8%, and 15.9% respectively.
• TNet, a state-of-the-art method in the ALSC literature, attains good performance. The architecture of TNet is quite complex, whereas our proposed ensemble uses simple DL methods as base learners. Nevertheless, our proposed ensemble outperforms TNet by 4.3%.
• The methods compared above do not incorporate syntactic knowledge for the ALSC task. Syntactic knowledge plays a crucial role in mapping the correct opinion words to the aspect, so the performance of methods without it is considerably lower than that of methods with syntactic knowledge like ASGCN, DualGCN, and SSEGCN. Graph neural networks are the most suitable networks for incorporating syntactic knowledge because the dependency tree of the sentence can be fed to them directly as a graph. ASGCN, DualGCN, and SSEGCN utilize several GCN layers and have very complex architectures; their performance is good, but their computational time is high. Even with simple base learners, our proposed ensemble reaches comparable performance (if not better) with such GNN based methods.
• The proposed fuzzy ensemble is also compared with three other types of ensemble methods proposed in the previous ALSC literature. The experimental results show that the proposed fuzzy ensemble outperforms all of them — the majority voting, stacking-based, and EO-based ensembles — with the same base learners.
• The proposed fuzzy ensemble outperforms the simple majority voting ensemble with a difference of 13.2%. The majority voting ensemble works directly on the predicted classes, and no emphasis is given to the confidence score obtained by each class.
In contrast, our proposed fuzzy ensemble works directly on the confidence scores, thereby leading to better performance.
• The stacking-based ensemble is based on the confidence scores of the predictions, which are combined using the principle of stacking. In [28], a random forest classifier computes the final predicted class, with the confidence scores obtained by the base learners as features. This stacking or meta-learning approach increases the overall complexity of the ensemble. In contrast, our proposed ensemble is based on fuzzy logic, where simple mathematical steps yield the final predictions. Nevertheless, the performance of the latter is better, with a difference of 1.5% and 2.9% for F1 score and accuracy respectively.
• The EO-based ensemble works on the principle of heuristic optimization: the base learners are selected using the EO algorithm, and a random forest classifier is then applied in the same manner as in the stacking-based ensemble. The overall complexity of this ensemble is therefore even higher than that of the stacking-based ensemble. In contrast, our proposed fuzzy ensemble is simple yet efficient and clearly outperforms the EO-based ensemble.
• A common issue with ensemble learning models is high computational complexity; this work proposes an ensemble aggregation approach that is simple yet computationally efficient.
• The better performance of the proposed ensemble also demonstrates that its ranking strategy is quite efficient at predicting the correct class, even when all the base learners give wrong predictions. For a better understanding of this phenomenon, two case studies are presented in section 6.

5.4 Statistical test

In this study, statistical testing is performed to validate the proposed ensemble approach. The test applied is the Friedman test, a non-parametric counterpart of ANOVA, with the Nemenyi test as its post-hoc test; the Nemenyi test is conducted after the rejection of the null hypothesis of the Friedman test. The null and alternate hypotheses of the Friedman test are:

Null hypothesis (H0): The performance of the compared methods is not significantly different in terms of F1 score and accuracy.
Alternate hypothesis (H1): At least two of the compared methods have significant differences in F1 score and accuracy.

The statistical tests are performed using the R tool. The average ranks of the various methods across all datasets are used to conduct the Friedman test. As per the Friedman test results, the null hypothesis is rejected for both evaluation metrics, F1 score and accuracy. Therefore, the post-hoc Nemenyi test is conducted to find the methods whose performance differs to a statistically significant degree. Figs. 4(a) and 4(b) show the results of the post-hoc test: the compared methods are plotted against the average rank obtained by each method. DualGCN and SSEGCN are the best-ranked methods for F1 score and accuracy respectively. The average rank of the proposed fuzzy ensemble is 3.4 for F1 score and 4.2 for accuracy.

Figure 4: Post-hoc test results

Methods whose lines fall within the grey area do not differ to a statistically significant degree. Therefore, it can be concluded that even though the proposed fuzzy ensemble could not outperform the DualGCN or SSEGCN methods, it is no worse than DualGCN, SSEGCN, and ASGCN based on these tests. Thus, the proposed ensemble, built from simple deep learning classifiers, achieves performance comparable to the top-performing complex GNN-based methods.
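The tests above were run in R; the same analysis can be reproduced in Python with scipy and scikit-posthocs, as the hedged sketch below shows. The accuracy values are taken from Table 6 for three of the compared methods; this is an illustration of the procedure, not the authors' scripts.

```python
import numpy as np
from scipy.stats import friedmanchisquare
import scikit_posthocs as sp

# Rows = datasets (blocks), columns = methods (groups); accuracies from Table 6
# for CNN, TD-LSTM, and the proposed fuzzy ensemble.
acc = np.array([
    [73.75, 71.78, 82.51],   # restaurant14
    [61.75, 63.00, 72.03],   # laptop14
    [64.30, 65.79, 78.88],   # restaurant15
    [77.18, 79.39, 82.50],   # restaurant16
    [77.02, 79.21, 82.12],   # car
])

stat, p = friedmanchisquare(*acc.T)          # one measurement vector per method
print(f"Friedman chi-square = {stat:.3f}, p = {p:.4f}")
if p < 0.05:                                  # null rejected -> run post-hoc test
    print(sp.posthoc_nemenyi_friedman(acc))   # pairwise Nemenyi p-values
```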
5.5 Ablation study

Ablation experiments were carried out to demonstrate that the proposed work performs better than either the ensemble of any two of the base classifiers or the individual classifiers. The three classifiers were combined in every possible way for this study. The accuracy and F1 score of each combination, calculated using the proposed ensemble model on all five datasets, are shown in Table 7.

Table 7: Ablation experiment results

| Methods | Restaurant14 (Acc / F1) | Laptop14 (Acc / F1) | Restaurant15 (Acc / F1) | Restaurant16 (Acc / F1) | Car (Acc / F1) |
|---|---|---|---|---|---|
| CNN | 73.75 / 60.3 | 61.75 / 53.06 | 64.3 / 40.43 | 77.18 / 47.54 | 77.02 / 71.91 |
| AEContextAvg | 70.71 / 56.99 | 63.79 / 55.71 | 82.7 / 51.2 | 78.57 / 51.3 | 76.01 / 70.41 |
| TD-LSTM | 71.78 / 58.05 | 63.01 / 56.98 | 65.79 / 45.36 | 79.39 / 52.31 | 79.21 / 72.09 |
| CNN + TD-LSTM | 75.43 / 73.31 | 67.58 / 64.02 | 69.04 / 67.34 | 80.82 / 58.87 | 79.85 / 72.69 |
| CNN + AEContextAvg | 76.61 / 73.06 | 66.51 / 65.23 | 71.32 / 68.09 | 79.07 / 58.32 | 79.47 / 73.24 |
| TD-LSTM + AEContextAvg | 77.02 / 74.89 | 68.34 / 64.51 | 72.46 / 67.65 | 80.55 / 59.03 | 80.01 / 73.03 |
| Fuzzy ensemble (CNN + TD-LSTM + AEContextAvg) | 82.51 / 77.89 | 72.03 / 70.01 | 78.88 / 72.13 | 82.5 / 62.07 | 82.12 / 74.01 |

It can be readily inferred from the results that applying the proposed ensemble logic to combine the three base classifiers produces the best results out of all potential combinations, supporting the rationale for their selection.

5.6 Learning curves

The learning curves of the three base deep learning methods on the car dataset are presented in this section. Figs. 5(a) and 5(b) show the learning curves for F1 score and accuracy respectively. Early stopping is applied when training all three base deep learning methods: training is stopped once model performance stops improving on the validation set. As per Figs. 5(a) and 5(b), TD-LSTM, AEContextAvg, and CNN took 20, 17, and 18 epochs respectively to converge. It can be observed that the accuracy of TD-LSTM is better than that of both CNN and AEContextAvg, whereas the three methods are quite competitive in terms of F1 score. It can also be observed that TD-LSTM, AEContextAvg, and CNN obtained their best results at epochs 11, 12, and 10 respectively.

Figure 5: Learning curves of the base deep learning methods
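The early-stopping criterion described above can be sketched as follows. train_one_epoch and evaluate_macro_f1 are hypothetical helpers standing in for the usual PyTorch training and evaluation loops, and patience = 5 is an assumed value that the paper does not state.

```python
import copy

# Hedged early-stopping sketch: train_one_epoch and evaluate_macro_f1 are
# hypothetical helpers; patience = 5 is an assumed value, not from the paper.
def train_with_early_stopping(model, train_loader, val_loader,
                              max_epochs=50, patience=5):
    best_f1, best_state, wait = 0.0, None, 0
    for epoch in range(max_epochs):
        train_one_epoch(model, train_loader)        # one pass over training data
        f1 = evaluate_macro_f1(model, val_loader)   # macro F1 on validation set
        if f1 > best_f1:                            # improvement: save checkpoint
            best_f1, best_state, wait = f1, copy.deepcopy(model.state_dict()), 0
        else:
            wait += 1
            if wait >= patience:                    # no improvement for `patience`
                break                               # epochs: stop training
    if best_state is not None:
        model.load_state_dict(best_state)           # restore best checkpoint
    return best_f1
```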
6 Case study

The intuition behind the fuzzy ensemble approach is to give weight to the confidence score attained by each class, irrespective of the final prediction. In this way, the aim is to predict the class correctly even if all three base learners fail to do so. The usage of non-linear functions of different concavities helps in restrictively penalizing the classes. For a comprehensive understanding of the proposed approach, two case studies are presented in this section. The ranks FZR1 and FZR2 are calculated for each class using Eq. (1) and Eq. (2) (as defined in section 4.3) respectively; FZR1 rewards a high confidence score, whereas FZR2 penalizes it. Next, CFR (Eq. (3) of section 4.3) is calculated from FZR1 and FZR2. Finally, the CFR values obtained for each class by each base classifier are summed to get the FFR score (Eq. (4) of section 4.3), and the class with the minimum FFR value is taken as the predicted class.

Case study 1: One base learner predicts the correct class and two base learners predict an incorrect class.

In the example shown in Fig. 6, one base learner predicts correctly with a moderate confidence score, whereas the two other classifiers predict incorrectly with lower confidence scores. In such a scenario, the proposed approach facilitates the prediction of the correct class on the basis of the confidence score attained by the correct class from each base learner, whereas a majority voting ensemble fails. Fig. 6 shows the step-by-step calculation of the prediction made by the proposed approach. In this example, CNN predicts the correct class with a confidence score of 65 percent, while the other two methods predict class 2 with confidence scores in the range of 35-45 percent. Further, these two methods give weights in the range of 30-35 percent to class 1, which is the correct prediction. The proposed approach successfully predicts the correct class using the principle of restrictive penalization, rewarding class 1 on the basis of the confidence score it obtains from each base learner. This example shows the utility of the proposed approach when the majority of the base learners predict the wrong class and the correct class loses by a very small margin. Similarly, the proposed approach also has the potential to correct misclassified samples even when all the base learners fail to predict correctly, as discussed in case study 2.

Figure 6: Case study 1

Case study 2: All three base learners predict incorrectly.

In this case, the correct class 2 obtains confidence scores in the range of 39-45 percent but misses the first rank from each base learner by a very slight margin. The calculations of the proposed approach in Fig. 7 show that the rewarding strategy of the approach, based on the tanh function (Eq. (1) in section 4.3), helps class 2 obtain a better final score than the other two classes.

Figure 7: Case study 2
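As a numerical illustration, the fuzzy_rank_ensemble sketch from section 4.3 reproduces the case-study-1 behaviour. The scores below are assumed values chosen to match the ranges described above (about 0.65 for the correct class from one learner, 0.30-0.35 from the other two), not the exact numbers of Fig. 6.

```python
# Illustrative confidence scores matching the case-study-1 ranges (assumed
# values, not the exact numbers from Fig. 6); class 1 is the correct class.
scores = [
    [0.65, 0.25, 0.10],  # CNN: predicts class 1 correctly
    [0.32, 0.40, 0.28],  # TD-LSTM: narrowly prefers class 2
    [0.33, 0.42, 0.25],  # AEContextAvg: narrowly prefers class 2
]
pred, ffr = fuzzy_rank_ensemble(scores)
print(pred)  # 0 -> class 1: the fuzzy ranks override the 2-to-1 majority vote
print(ffr)   # approx. [0.83, 0.91, 0.94]; the minimum selects class 1
```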
7 Conclusions

In this study, a novel fuzzy-based ensemble technique is proposed for the ALSC task. In addition, a Car dataset for the automobile domain is developed as per the SemEval guidelines for benchmark datasets. The conclusions of this study are as follows:

• A car dataset is developed in the automobile domain to facilitate automated and cross-domain learning in the ALSC literature. The proposed car dataset contains around 7500 samples, which is larger than the other benchmark datasets available for the ALSC task. Since the training of neural networks requires large datasets, this car data will also facilitate better training and testing of the various deep learning methods proposed for the ALSC task.
• A novel fuzzy-based ensemble is proposed which utilizes the confidence scores of the classes predicted by each base learner. The proposed fuzzy ensemble is based on mathematical logic in which two functions of different concavities are used to calculate the final predicted class. The case studies presented in section 6 validate the significance of the restrictive rewarding and penalization strategy used in this work.
• The performance of the proposed fuzzy ensemble technique is either better than or on par with other top-performing deep learning-based methods.

In the future, fine-tuning of the hyperparameters of the base learners can be performed to get better results. It is also suggested to incorporate more advanced deep neural architectures in the ensemble.

References

[1] K. Schouten and F. Frasincar, "Survey on Aspect-Level Sentiment Analysis," IEEE Transactions on Knowledge & Data Engineering, vol. 28, no. 3, pp. 813-830, March 2016. https://doi.org/10.1109/tkde.2015.2485209
[2] T. Sharma and K. Kaur, "Benchmarking Deep Learning Methods for Aspect Level Sentiment Classification," Applied Sciences, vol. 11, no. 22, p. 10542, November 2021. https://doi.org/10.3390/app112210542
[3] M. Pontiki, D. Galanis, J. Pavlopoulos, H. Papageorgiou, I. Androutsopoulos and S. Manandhar, "SemEval-2014 Task 4: Aspect Based Sentiment Analysis," in 8th International Workshop on Semantic Evaluation (SemEval 2014), Dublin, Ireland, 2014. https://doi.org/10.3115/v1/s14-2004
[4] M. Pontiki, D. Galanis, H. Papageorgiou, S. Manandhar and I. Androutsopoulos, "SemEval-2015 Task 12: Aspect Based Sentiment Analysis," in 9th International Workshop on Semantic Evaluation (SemEval 2015), 2015. https://doi.org/10.18653/v1/s15-2082
[5] M. Pontiki, D. Galanis, H. Papageorgiou, I. Androutsopoulos, S. Manandhar and A. S. Mohammad, "SemEval-2016 Task 5: Aspect Based Sentiment Analysis," in 10th International Workshop on Semantic Evaluation (SemEval-2016), 2016. https://doi.org/10.18653/v1/S16-1002
[6] Q. Jiang, L. Chen, R. Xu, X. Ao and M. Yang, "A Challenge Dataset and Effective Models for Aspect-Based Sentiment Analysis," in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019. https://doi.org/10.18653/v1/d19-1654
[7] L. Dong, F. Wei, C. Tan, D. Tang, M. Zhou and K. Xu, "Adaptive Recursive Neural Network for Target-Dependent Twitter Sentiment Classification," in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Baltimore, Maryland, USA, 2014. https://doi.org/10.3115/v1/p14-2009
[8] W. Etaiwi, D. Suleiman and A. Awajan, "Deep Learning Based Techniques for Sentiment Analysis: A Survey," Informatica, vol. 45, no. 7, pp. 89-95, 2021. https://doi.org/10.31449/inf.v45i7.3674
[9] S. Al-Otaibi and A. Al-Rasheed, "A Review and Comparative Analysis of Sentiment Analysis Techniques," Informatica, vol. 46, no. 6, pp. 33-44, 2022. https://doi.org/10.31449/inf.v46i6.3991
[10] J. Zhou, J. X. Huang, Q. Chen, Q. V. Hu, T. Wang and L. He, "Deep Learning for Aspect-Level Sentiment Classification: Survey, Vision and Challenges," IEEE Access, vol. 7, 2019. https://doi.org/10.1109/access.2019.2920075
[11] D. Tang, B. Qin, X. Feng and T. Liu, "Effective LSTMs for Target-Dependent Sentiment Classification," in Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, 2016.
[12] Y. Wang, M. Huang, L. Zhao and X. Zhu, "Attention-based LSTM for Aspect-level Sentiment Classification," in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016. https://doi.org/10.18653/v1/d16-1058
[13] P. Chen, L. Bing, Z. Sun and W. Yang, "Recurrent Attention Network on Memory for Aspect Sentiment Analysis," in Conference on Empirical Methods in Natural Language Processing, 2017. https://doi.org/10.18653/v1/d17-1047
[14] D. Tang, B. Qin and T. Liu, "Aspect Level Sentiment Classification with Deep Memory Network," in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016. https://doi.org/10.18653/v1/d16-1021
[15] W. Xue and T. Li, "Aspect Based Sentiment Analysis with Gated Convolutional Networks," in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018. https://doi.org/10.18653/v1/p18-1234
[16] X. Li, L. Bing, W. Lam and B. Shi, "Transformation Networks for Target-Oriented Sentiment Classification," in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018. https://doi.org/10.18653/v1/p18-1087
[17] B. Huang, Y. Ou and K. M. Carley, "Aspect Level Sentiment Classification with Attention-over-Attention Neural Networks," in International Conference on Social Computing, Behavioral-Cultural Modeling and Prediction and Behavior Representation in Modeling and Simulation (SBP-BRiMS 2018), 2018. https://doi.org/10.1007/978-3-319-93372-6_22
[18] D. Ma, S. Li, X. Zhang and H. Wang, "Interactive Attention Networks for Aspect-Level Sentiment Classification," in IJCAI'17: Proceedings of the 26th International Joint Conference on Artificial Intelligence, 2017. https://doi.org/10.48550/arXiv.1709.00893
[19] F. Fan, Y. Feng and D. Zhao, "Multi-grained Attention Network for Aspect-level Sentiment Classification," in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, 2018. https://doi.org/10.18653/v1/d18-1380
[20] C. Zhang, Q. Li and D. Song, "Aspect-based Sentiment Classification with Aspect-specific Graph Convolutional Networks," in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019. https://doi.org/10.18653/v1/d19-1464
[21] B. Huang and K. M. Carley, "Syntax-Aware Aspect Level Sentiment Classification with Graph Attention Networks," in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019. https://doi.org/10.18653/v1/d19-1549
[22] R. Li, H. Chen, F. Feng, Z. Ma, X. Wang and E. Hovy, "Dual Graph Convolutional Networks for Aspect-based Sentiment Analysis," in Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), Online, 2021. https://doi.org/10.18653/v1/2021.acl-long.494
[23] Z. Zhang, Z. Zhou and Y. Wang, "SSEGCN: Syntactic and Semantic Enhanced Graph Convolutional Network for Aspect-based Sentiment Analysis," in Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Seattle, United States, 2022. https://doi.org/10.18653/v1/2022.naacl-main.362
[24] H. Wu, C. Huang and S. Deng, "Improving Aspect-based Sentiment Analysis with Knowledge-aware Dependency Graph Network," Information Fusion, vol. 92, pp. 289-299, 2022. https://doi.org/10.1016/j.inffus.2022.12.004
[25] B. Liang, H. Su, L. Gui, E. Cambria and R. Xu, "Aspect-based Sentiment Analysis via Affective Knowledge Enhanced Graph Convolutional Networks," Knowledge-Based Systems, vol. 235, p. 107643, 2022. https://doi.org/10.1016/j.knosys.2021.107643
[26] X. Bai, P. Liu and Y. Zhang, "Investigating Typed Syntactic Dependencies for Targeted Sentiment Classification Using Graph Attention Neural Network," IEEE/ACM Transactions on Audio, Speech, and Language Processing, pp. 503-514, 2021. https://doi.org/10.1109/taslp.2020.3042009
[27] X. Zhu, L. Zhu, J. Guo, S. Liang and S. Dietze, "GL-GCN: Global and Local Dependency Guided Graph Convolutional Networks for Aspect-based Sentiment Classification," Expert Systems With Applications, vol. 186, 2021. https://doi.org/10.1016/j.eswa.2021.115712
[28] T. Sharma and K. Kaur, "An Ensemble Approach for Aspect Level Sentiment Classification Using Deep Learning Methods," in 3rd International Conference on Data Analytics & Management (ICDAM-2022), 2022. https://doi.org/10.1007/978-981-19-7615-5_69
[29] T. Sharma and K. Kaur, "An Equilibrium Optimizer Based Ensemble for Aspect Level Sentiment Classification," presented at the International Conference on Advances and Applications of Artificial Intelligence and Machine Learning (ICAAAIML), 2022.
[30] A. Mohammadi and A. Shaverizade, "Ensemble Deep Learning for Aspect-based Sentiment Analysis," International Journal of Nonlinear Analysis and Applications, vol. 12, Special Issue (Winter and Spring 2021), pp. 29-38, 2021. https://doi.org/10.22075/IJNAA.2021.4769
[31] T. Sharma and K. Kaur, "Aspect Sentiment Classification Using Syntactic Neighbour Based Attention Network," Journal of King Saud University - Computer and Information Sciences, vol. 35, no. 2, pp. 612-625, 2023. https://doi.org/10.1016/j.jksuci.2023.01.005
[32] K. Ganesan and C. Zhai, "Opinion-based Entity Ranking," Information Retrieval, vol. 15, no. 2, pp. 116-150, 2012. https://doi.org/10.1007/s10791-011-9174-8
[33] H. Rinne, The Weibull Distribution: A Handbook, 1st ed., Chapman & Hall, 2020.