https://doi.org/10.31449/inf.v47i9.5222 Informatica 47 (2023) 157–164

An Algorithm for Data Management of Higher Education Based on Fuzzy Set Theory - Association Rule Mining Algorithm

Youmeng Guan
Xingtai Open University, Xingtai 054000, China
E-mail: guayou71546@163.com

Keywords: fuzzy set, higher education, association rule, data-based management

Received: September 22, 2023

Data management can enhance the efficiency of higher education management. When combined with data mining and other technologies, it can provide a sound foundation for making management decisions. This article combined the Apriori algorithm for association rules with fuzzy set theory and optimized the FCM algorithm to mine fuzzy association rules from student course grades. The results indicated that the improved FCM algorithm clustered the Iris dataset more effectively, and its outcomes were closer to the actual values. Applying this method to fuzzy association rule mining could reveal the connections between students' course grades. For instance, when students achieved excellent grades in computer application fundamentals, their performance in principles of computer composition was also good. Furthermore, if they obtained excellent grades in computer application fundamentals and good grades in principles of computer composition, their grades in operating systems were also excellent. The experimental results validate the reliability of the fuzzy association rule mining algorithm, which allows associations between different courses to be discovered. Consequently, it provides valuable support for education and teaching.

Povzetek: Članek obravnava izboljšanje upravljanja visokega šolstva z upravljanjem podatkov in rudarjenjem. Uporabljen je izboljšan algoritem FCM za odkrivanje povezav med ocenami študentov.
1 Introduction

Under the influence of technological development, the field of education and teaching is increasingly shifting towards digitization and data-driven approaches [1]. However, as colleges and universities expand and their systems continue to operate, the volume of data stored in the management systems grows annually, putting significant pressure on system performance. As a result, data mining techniques have been widely adopted for tasks such as predicting students' grades and evaluating the quality of teaching and learning [2]. Currently, research in this field has mainly focused on predicting student performance and grades, with relatively little analysis of the interrelationships between courses. However, course arrangement and design are crucial aspects of teaching in higher education. To assess how reliably data mining techniques support course correlation analysis, this paper studied the application of association rule mining algorithms. To enhance mining efficiency, a fuzzy association rule algorithm incorporating fuzzy set theory was designed to uncover correlations between students' grades in various courses. The goal of this study is to provide theoretical guidance for organizing educational and teaching work. The study results confirm that the proposed method effectively extracts useful data from education management systems, improves data-based management efficiency, enhances decision-making and management capabilities in colleges and universities, and contributes to the overall improvement of education and teaching quality. It is also beneficial for further application of data mining technology in educational work.

2 Related works

Current research on data mining techniques in educational teaching work is presented in Table 1.

Table 1: A summary table of related works

| Literature | Method | Dataset | Result |
| --- | --- | --- | --- |
| Baruah et al. [3] | The MapReduce framework based on the proposed fractional competitive multi-verse optimization-based deep neuro-fuzzy network | Performance data of students | The mean squared error, root mean squared error, and mean absolute error are 0.3383, 0.5817, and 0.3915, respectively. |
| Sandoval et al. [4] | A prediction model based on low-cost variables and a sophisticated algorithm | Performance data of students | They improved the model by up to 12.28% in terms of root-mean-square error. |
| Joshi et al. [5] | CatBoost - an ensemble machine learning model | Performance data of students | The accuracy is 92.27%. |
| Fan et al. [6] | A deep learning method of recommending MOOCs to students based on a multi-attention mechanism comprising learning records attention, word-level review attention, sentence-level review attention, and course description attention | Real-world data consisting of the learning records of 6,628 students for 1,789 courses and 65,155 reviews | MOOC platforms must fully utilize the information implied in course reviews to extract personalized learning preferences. |

3 Association rule mining algorithm

3.1 Association rules

Association rule mining algorithms were initially employed in supermarket shopping to analyze customers' purchasing patterns, enabling retailers to optimize product placement and increase sales [7]. Over time, the continuous advancement of association rule mining algorithms has led to their widespread adoption in data analysis across various domains, including healthcare [8] and engineering control [9].

Association rules can be written as $X \Rightarrow Y$, meaning that there is a high probability that a record containing $X$ will also include $Y$ ($X$ is called the former term, or antecedent, and $Y$ is called the latter term, or consequent). Taking supermarket shopping as an example, suppose that 90% of the customers who purchased coke also purchased potato chips (and that 30% of all customers purchased both coke and chips).
In this example, "coke" is the former term and "potato chips" is the latter term; "90%" is the confidence level of the rule, while "30%" is its support level. According to this rule, coke and potato chips can be displayed close to each other, thus increasing sales.

Suppose that in database $D$, the set of all items is $I = \{I_1, I_2, \cdots, I_n\}$, and each record in $D$ is a set of some items, $T = \{t_1, t_2, \cdots, t_n\} \subseteq I$. If $A \subset I$, $B \subset I$, and $A \cap B = \emptyset$, then an association rule is obtained: $A \Rightarrow B$. The following definitions apply to association rules.

(1) Support level: the proportion of records in $D$ that contain both $A$ and $B$ among all records, which is written as:

$$supp(A \Rightarrow B) = \frac{|\{T : A \cup B \subseteq T, T \in D\}|}{|D|}. \tag{1}$$

(2) Confidence level: the proportion of records in $D$ that contain both $A$ and $B$ among all records that contain $A$, which is written as:

$$conf(A \Rightarrow B) = \frac{|\{T : A \cup B \subseteq T, T \in D\}|}{|\{T : A \subseteq T, T \in D\}|}. \tag{2}$$

(3) Frequent itemset: if $supp\{A\} \geq minsupp$, where $minsupp$ is the minimum support level, then $\{A\}$ is called a frequent itemset.

(4) Strong rule: if $supp(A \Rightarrow B) \geq minsupp$ and $conf(A \Rightarrow B) \geq minconf$, where $minconf$ is the minimum confidence level, then $A \Rightarrow B$ is called a strong rule.

3.2 Apriori algorithm

The Apriori algorithm is a classical association rule mining algorithm [10]. For database $D$, the mining process of the Apriori algorithm proceeds as follows.

(1) Database $D$ is traversed to obtain the one-dimensional candidate itemset.
(2) Infrequent itemsets in the one-dimensional candidate itemset are pruned to obtain the one-dimensional frequent itemset.
(3) The one-dimensional frequent itemset is self-connected to obtain the two-dimensional candidate itemset.
(4) Infrequent itemsets in the two-dimensional candidate itemset are pruned to obtain the two-dimensional frequent itemset.
(5) The k-dimensional frequent itemset is self-connected to obtain the (k+1)-dimensional candidate itemset.
(6) Infrequent itemsets in the (k+1)-dimensional candidate itemset are pruned to obtain the (k+1)-dimensional frequent itemset.
(7) When no new frequent itemset is found, mining ends.

Taking a simple dataset as an example, Figure 1 illustrates the mining process of the Apriori algorithm with the minimum support threshold set at 2.

Figure 1: The mining process of the Apriori algorithm

As shown in Figure 1, the Apriori algorithm repeatedly scans the dataset and prunes the candidates, eventually obtaining the 3-dimensional frequent itemset {B, C, E}, at which point the algorithm terminates.

However, the main drawback of association rule mining with the Apriori algorithm is that it can only deal with discrete data [11], and dividing continuous attributes into discrete intervals easily produces overly hard boundaries. For example, student grades are generally divided as follows.

0 ≤ x < 60 = failed
60 ≤ x < 70 = pass
70 ≤ x < 80 = moderate
80 ≤ x < 90 = good
90 ≤ x ≤ 100 = excellent

According to this division, the difference between two students whose grades are 59 and 60 is very small; however, they are placed in different sets. In addition, many real-life attributes are non-numerical, which makes this mining algorithm inapplicable. Therefore, to better apply association rules for mining higher education data, this paper combines them with fuzzy set theory.

4 Fuzzy association rules incorporating fuzzy set theory

4.1 Fuzzy set theory

In real life, many things have no sharp boundaries, such as tall versus short, far versus near, or good versus bad.
Fuzzy set theory [12] introduces the concept of the membership function and uses the notion of gradual transition to handle such things in a fuzzy way; it has achieved remarkable results in areas such as fuzzy control [13] and pattern recognition [14]. Suppose that $A$ is a mapping from domain $X$ to $[0,1]$, written as $A: X \to [0,1]$; then $A$ is called a fuzzy set on $X$, and $A(x)$ is the membership degree of $x$ in the fuzzy set.

4.2 Fuzzy association rule algorithm

By combining fuzzy set theory with association rules, fuzzy association rules can be obtained [15]. Before the data are mined, the attributes must first be discretized, for which the fuzzy C-means (FCM) algorithm is used [16]. Suppose there is a dataset $X = \{x_1, x_2, \cdots, x_n\}$, $j = 1, 2, \cdots, n$. The purpose of the FCM algorithm is to divide $X$ into $c$ classes and obtain the clustering center set $V = \{v_1, v_2, \cdots, v_c\}$, $i = 1, 2, \cdots, c$. The membership degree of the $j$-th data point in the $i$-th class is written as $u_{ij}$, with $u_{ij} \in [0,1]$ and $\sum_{i=1}^{c} u_{ij} = 1$. The objective function of the FCM algorithm can be written as:

$$J(U, V) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} \|x_j - v_i\|^2, \tag{3}$$

where $U = [u_{ij}]_{c \times n}$ is the membership matrix and $m$ is the fuzzy factor. The steps of the FCM algorithm are shown below.

(1) The clustering centers are initialized.
(2) $u_{ij}$ is calculated: $u_{ij} = \dfrac{1}{\sum_{r=1}^{c} \left( \|x_j - v_i\| / \|x_j - v_r\| \right)^{2/(m-1)}}$.
(3) The clustering centers are updated: $v_i = \dfrac{\sum_{j=1}^{n} (u_{ij})^{m} x_j}{\sum_{j=1}^{n} (u_{ij})^{m}}$.
(4) The objective function $J$ is calculated and compared with its value in the previous iteration. If the change in $J$ is less than or equal to the termination threshold $\varepsilon$, or the specified number of iterations is reached, the algorithm proceeds to (5); otherwise, it returns to (2).
(5) The clustering result $(V, U)$ is output.

After attribute discretization, assume that there is an arbitrary set of fuzzy attributes $X = \{x_1, x_2, \cdots, x_p\}$. The fuzzy support level of the $i$-th record in fuzzy database $D_f$ for $X$ is $Fsupp_i(X)$. Then, the fuzzy support level of $X$ in $D_f$ is:

$$Fsupp(X) = \frac{\sum_{i=1}^{n} Fsupp_i(X)}{|D_f|}. \tag{4}$$

The fuzzy association rule is written as $X_f \Rightarrow Y_f$, and its support level is written as:

$$Fsupp(X_f \Rightarrow Y_f) = \frac{\sum_{i=1}^{n} Fsupp_i(X_f \cup Y_f)}{|D_f|}. \tag{5}$$

Its confidence level is the ratio of the support level of the rule to the support level of its former term:

$$Fconf(X_f \Rightarrow Y_f) = \frac{Fsupp(X_f \Rightarrow Y_f)}{Fsupp(X_f)} = \frac{\sum_{i=1}^{n} Fsupp_i(X_f \cup Y_f)}{\sum_{i=1}^{n} Fsupp_i(X_f)}. \tag{6}$$

Fuzzy association rules are then mined with the same procedure as in the Apriori algorithm, using the fuzzy support and confidence levels in place of their crisp counterparts.

The FCM algorithm has low complexity, is easy to implement, and is the most widely used fuzzy clustering method. However, randomly determining the initial clustering centers in the FCM algorithm may negatively affect the results [17], making it difficult to guarantee their accuracy or to determine whether the solution obtained is globally optimal. To address this problem and increase the reliability of fuzzy association rules, this paper uses a density function to determine the initial clustering centers. The FCM algorithm only considers distance measurements in its computation process. A density function additionally provides a measure of the spatial distribution of the sample points. If a point in a dataset is surrounded by many other data points, it has a high density and is more likely to be a clustering center; choosing such points reduces the effect of isolated noise points on clustering. The density function is:

$$M_i = \sum_{j=1, j \neq i}^{n} \frac{1}{\|x_i - x_j\|}, \quad 1 \leq i \leq n. \tag{7}$$

The larger the value of $M_i$ is, the more data points are distributed around $x_i$ and the larger the density is. The denser points are used as the initial clustering centers, and a region length is introduced to separate the cluster centers so that these points do not fall in the same cluster:

$$D = \frac{D_{max}}{c}, \tag{8}$$

where $D_{max}$ is the distance between the two farthest samples in the dataset and $c$ is the number of categories.
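As a minimal sketch, the density function of Eq. (7) and the region length of Eq. (8) can be evaluated as follows. The function names `density` and `region_length` and the toy one-dimensional data are hypothetical and chosen only for illustration; skipping coincident points to avoid division by zero is an assumption that the paper does not address.

```python
import math

def density(points):
    """Eq. (7): for each point, sum the inverse distances to all other points."""
    dens = []
    for i, p in enumerate(points):
        total = 0.0
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)
            # Assumption: coincident points are skipped to avoid dividing by zero.
            if d > 0.0:
                total += 1.0 / d
        dens.append(total)
    return dens

def region_length(points, c):
    """Eq. (8): largest pairwise distance divided by the number of classes c."""
    d_max = max(math.dist(p, q) for p in points for q in points)
    return d_max / c

# Toy data (hypothetical): three nearby points and one isolated point.
points = [(0.0,), (1.0,), (2.0,), (10.0,)]
M = density(points)
D = region_length(points, c=2)
print(M, D)
```

In this toy example the middle point of the dense group receives the highest density, while the isolated point at 10.0 receives the lowest, which is the behavior that makes noise points unlikely to be selected as initial centers.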
The procedure for determining the initial clustering centers in the improved FCM is shown below.

(1) The region length $D$ is calculated for the dataset, and all sample points are marked as searchable.
(2) Among the searchable points, sample point $x_i$ with the highest density is used as an initial clustering center.
(3) All sample points in the region where $x_i$ is located are marked as unsearchable.
(4) Steps (2) and (3) are repeated until $c$ initial clustering centers have been found.

With this improvement, the influence of randomly generated initial cluster centers on the clustering results can be avoided, which prevents isolated noise points from being selected as cluster centers and reduces bias in the results.

5 Student achievement analysis results

5.1 Experimental setup

The grades of 11 courses for 1,068 students from the Computer Science Department in the class of 2019 were exported from the student grade management system of Xingtai Open University for fuzzy association rule mining. Incomplete and abnormal values were eliminated. Finally, the data of 1,052 students were retained, and some of them are shown in Table 2.

Table 2: Student course grades (unit: point)

| Code | Course name | 1 | 2 | 3 | ...... | 1052 |
| --- | --- | --- | --- | --- | --- | --- |
| A | Computer application fundamentals | 85 | 91 | 87 | ...... | 65 |
| B | Principles of computer composition | 87 | 92 | 89 | ...... | 64 |
| C | Operating systems | 77 | 93 | 86 | ...... | 61 |
| D | Data structure | 76 | 89 | 87 | ...... | 62 |
| E | C programming | 85 | 95 | 88 | ...... | 69 |
| F | Discrete mathematics | 77 | 85 | 81 | ...... | 63 |
| G | Software engineering | 67 | 77 | 71 | ...... | 49 |
| H | Computer network | 78 | 85 | 81 | ...... | 55 |
| I | Database application technology | 67 | 78 | 73 | ...... | 55 |
| J | WEB development basics | 65 | 77 | 74 | ...... | 56 |
| K | Computer network security technology | 60 | 70 | 65 | ...... | 48 |

The fuzzification results of students' course grades obtained using trapezoidal membership functions are illustrated in Figure 1.
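A trapezoidal fuzzification of this kind can be sketched as follows. The breakpoints in `GRADE_SETS` are an assumption read off the axis ticks of the figure (57, 63, 68, 73, 78, 83, 88, 93); the paper does not state the exact membership functions, and the names `trapezoid`, `GRADE_SETS`, and `fuzzify` are hypothetical.

```python
def trapezoid(x, a, b, c, d):
    """Membership that rises on [a, b], equals 1 on [b, c], and falls on [c, d]."""
    if b <= x <= c:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if c < x < d:
        return (d - x) / (d - c)
    return 0.0

# Assumed breakpoints (a, b, c, d) per grade level, inferred from the axis ticks;
# they are illustrative and not taken from the paper's text.
GRADE_SETS = {
    "excellent": (88, 93, 100, 100),
    "good": (78, 83, 88, 93),
    "moderate": (68, 73, 78, 83),
    "pass": (57, 63, 68, 73),
    "failed": (0, 0, 57, 63),
}

def fuzzify(score):
    """Return the membership degree of a score in every grade level."""
    return {label: trapezoid(score, *abcd) for label, abcd in GRADE_SETS.items()}

print(fuzzify(60))  # a borderline score belongs partly to "failed" and partly to "pass"
print(fuzzify(85))
```

Under these assumed breakpoints, a score of 60 is split evenly between "failed" and "pass" instead of being forced into a single crisp interval, which is the hard-boundary problem that the fuzzy approach is meant to soften.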
Figure 1: Fuzzification of students' course grades (membership degree from 0 to 1 over the grade axis, with ticks at 57, 63, 68, 73, 78, 83, 88, and 93; fuzzy sets: Failed = 5, Pass = 4, Medium = 3, Good = 2, Excellent = 1)

The numbers 1 to 5 were used to code the students' course grades, from excellent (1) to failed (5). After coding, the data were all written in the form of A1, B2, C3, etc., which facilitated the subsequent mining of fuzzy association rules.

5.2 Clustering performance analysis

To determine the effectiveness of the improved FCM algorithm, experiments were conducted on the UCI Iris dataset and Seeds dataset [18]. The UCI repository is a commonly used source of standard benchmarks for machine learning, and this article selected two datasets from it to validate the clustering performance of the improved FCM algorithm.

The Iris dataset contains 150 samples, which fall into three classes and have four attributes (sepal length, sepal width, petal length, and petal width; see Table 3). They were clustered using the traditional FCM and improved FCM algorithms, and the results were compared with the actual clustering centers given in the literature [19]. The comparison results are presented in Table 3.

Table 3: Comparison of clustering results for the Iris dataset

| Method | Cluster | Sepal length/cm | Sepal width/cm | Petal length/cm | Petal width/cm |
| --- | --- | --- | --- | --- | --- |
| Literature [19] | 1 | 5.00 | 3.42 | 1.46 | 0.24 |
| Literature [19] | 2 | 5.93 | 2.77 | 4.26 | 1.32 |
| Literature [19] | 3 | 6.58 | 2.97 | 5.55 | 2.02 |
| Traditional FCM algorithm | 1 | 5.37 | 3.46 | 1.54 | 0.33 |
| Traditional FCM algorithm | 2 | 6.16 | 3.14 | 4.56 | 1.31 |
| Traditional FCM algorithm | 3 | 7.03 | 3.16 | 5.88 | 1.98 |
| Improved FCM algorithm | 1 | 5.01 | 3.41 | 1.44 | 0.25 |
| Improved FCM algorithm | 2 | 5.99 | 2.81 | 4.31 | 1.29 |
| Improved FCM algorithm | 3 | 7.03 | 2.94 | 5.57 | 1.99 |

From Table 3, it can be found that the gap between the clustering results obtained by the traditional FCM algorithm and the actual results was large. For example, in cluster 1, the sepal length obtained by the traditional FCM algorithm was 5.37 cm, while the actual result was 5.00 cm, a difference of 0.37 cm. In contrast, the clustering results obtained by the improved FCM algorithm were much closer to the actual results. The study on the Iris dataset showed that the improved FCM algorithm had better clustering results and can be applied in fuzzy association rules.

The Seeds dataset consists of 210 samples, which fall into three classes and have seven attributes. The traditional FCM and improved FCM algorithms were used to cluster the dataset, and the results were compared with the actual clustering centers provided by the dataset. The results are presented in Table 4.

Table 4: Comparison of clustering results for the Seeds dataset

| Method | Cluster | Area | Perimeter | Compactness | Length of kernel | Width of kernel | Asymmetry coefficient | Length of kernel groove |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Actual value | 1 | 0.76 | 0.79 | 0.69 | 0.73 | 0.77 | 0.37 | 0.76 |
| Actual value | 2 | 0.12 | 0.18 | 0.38 | 0.19 | 0.16 | 0.50 | 0.28 |
| Actual value | 3 | 0.38 | 0.42 | 0.67 | 0.36 | 0.47 | 0.26 | 0.32 |
| Traditional FCM algorithm | 1 | 0.71 | 0.69 | 0.72 | 0.75 | 0.71 | 0.39 | 0.67 |
| Traditional FCM algorithm | 2 | 0.16 | 0.21 | 0.31 | 0.18 | 0.15 | 0.55 | 0.31 |
| Traditional FCM algorithm | 3 | 0.31 | 0.45 | 0.68 | 0.37 | 0.51 | 0.24 | 0.29 |
| Improved FCM algorithm | 1 | 0.75 | 0.78 | 0.68 | 0.73 | 0.75 | 0.39 | 0.77 |
| Improved FCM algorithm | 2 | 0.13 | 0.18 | 0.37 | 0.21 | 0.17 | 0.55 | 0.28 |
| Improved FCM algorithm | 3 | 0.41 | 0.42 | 0.67 | 0.35 | 0.49 | 0.27 | 0.32 |

From Table 4, it can be observed that, as with the Iris dataset, there was a significant discrepancy between the clustering results obtained by the traditional FCM algorithm and the actual cluster centers. Taking cluster 1 as an example, the traditional FCM algorithm yielded a length of kernel groove of 0.67, which differed greatly from the actual value of 0.76. In comparison, the clustering results of the improved FCM algorithm aligned better with the actual values. The tests on the two datasets demonstrated that the improved FCM algorithm performed better in clustering tasks.
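The FCM iteration compared in this section can be sketched as follows, assuming the standard membership and center updates. This is not the author's implementation: the function name `fcm`, the toy data, and the handling of a point that coincides with a center are assumptions made for illustration. The initial centers are passed in explicitly, so they may come from random selection or from a density-based choice.

```python
import math

def fcm(points, centers, m=2.0, eps=1e-6, max_iter=100):
    """Fuzzy C-means on tuples of floats, starting from the given centers."""
    c, n, dim = len(centers), len(points), len(points[0])
    prev_j = None
    for _ in range(max_iter):
        # Membership update: u[i][j] = 1 / sum_r (d_ij / d_rj)^(2/(m-1)).
        u = [[0.0] * n for _ in range(c)]
        for j, x in enumerate(points):
            dists = [math.dist(x, v) for v in centers]
            if 0.0 in dists:
                # Assumption: a point lying exactly on a center belongs to it fully.
                u[dists.index(0.0)][j] = 1.0
                continue
            for i in range(c):
                u[i][j] = 1.0 / sum(
                    (dists[i] / dists[r]) ** (2.0 / (m - 1.0)) for r in range(c)
                )
        # Center update: mean of the points weighted by u_ij^m.
        new_centers = []
        for i in range(c):
            w = [u[i][j] ** m for j in range(n)]
            total = sum(w)
            new_centers.append(
                tuple(sum(w[j] * points[j][k] for j in range(n)) / total
                      for k in range(dim))
            )
        centers = new_centers
        # Objective value; stop when it changes by no more than eps.
        j_val = sum(
            u[i][j] ** m * math.dist(points[j], centers[i]) ** 2
            for i in range(c) for j in range(n)
        )
        if prev_j is not None and abs(prev_j - j_val) <= eps:
            break
        prev_j = j_val
    return centers, u

# Toy one-dimensional data (hypothetical) with two well-separated groups.
data = [(1.0,), (1.2,), (0.8,), (5.0,), (5.2,), (4.8,)]
final_centers, memberships = fcm(data, centers=[(0.0,), (6.0,)])
print(final_centers)
```

On well-separated data such as this, the result depends little on the starting centers; the benefit of a better initialization is expected mainly on overlapping or noisy data, where plain FCM can settle in a poor local optimum.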
5.3 Fuzzy association rule analysis

Suppose min-supp = 0.3 and min-conf = 0.7. Fuzzy association rule mining was performed on students' course grades, and the partial results obtained are shown in Table 5.

Table 5: Fuzzy association rules

| Former term | Latter term | Support level | Confidence level |
| --- | --- | --- | --- |
| A1 | B1 | 0.35 | 0.97 |
| J2 | G2 | 0.36 | 0.95 |
| K4 | G4 | 0.32 | 0.88 |
| A1, B2 | C1 | 0.31 | 0.76 |
| C2, D2 | I1 | 0.33 | 0.85 |
| E2, F2 | I2 | 0.37 | 0.77 |
| H1, J2 | G3 | 0.32 | 0.91 |
| I3, J3 | K4 | 0.31 | 0.81 |
| A1, C1 | G1 | 0.33 | 0.84 |
| A2, B2 | K3 | 0.32 | 0.71 |

According to Table 5, the rules obtained are as follows.

(1) (Computer application fundamentals = excellent) ⇒ (principles of computer composition = excellent)
(2) (WEB development basics = good) ⇒ (software engineering = good)
(3) (Computer network security technology = pass) ⇒ (software engineering = pass)
(4) (Computer application fundamentals = excellent, principles of computer composition = good) ⇒ (operating systems = excellent)
(5) (Operating systems = good, data structure = good) ⇒ (database application technology = excellent)
(6) (C programming = good, discrete mathematics = good) ⇒ (database application technology = good)
(7) (Computer network = excellent, WEB development basics = good) ⇒ (software engineering = moderate)
(8) (Database application technology = moderate, WEB development basics = moderate) ⇒ (computer network security technology = pass)
(9) (Computer application fundamentals = excellent, operating systems = excellent) ⇒ (software engineering = excellent)
(10) (Computer application fundamentals = good, principles of computer composition = good) ⇒ (computer network security technology = moderate)

The above rules were analyzed. Taking Rule 1 as an example, 35% of the students in the database satisfy this rule, i.e., they performed excellently in both computer application fundamentals and principles of computer composition. Moreover, 97% of the students who performed excellently in computer application fundamentals also performed excellently in principles of computer composition. This shows that the two courses have something in common and that students who master computer application fundamentals are also able to master principles of computer composition.

Taking Rule 4 as an example, 31% of the students in the database satisfy this rule, i.e., they have excellent performance in computer application fundamentals, good performance in principles of computer composition, and excellent performance in operating systems. Additionally, 76% of the students with excellent performance in computer application fundamentals and good performance in principles of computer composition demonstrate excellent performance in operating systems. This shows that computer application fundamentals and principles of computer composition serve as the foundation for understanding operating systems. If students perform well in these two courses, they are likely to achieve high grades in their operating systems course.

Taking Rule 8 as an example, 31% of the students in the database satisfy this rule, i.e., their performance in database application technology and in WEB development basics is moderate, and their performance in computer network security technology is a pass. Of the students with moderate performance in database application technology and WEB development basics, 81% obtain a pass in computer network security technology. This indicates that struggling with database application technology and WEB development basics makes it more difficult for students to learn computer network security technology.

Some correlations between courses can be identified from the above rules to provide guidance for the school's subsequent curriculum arrangement.
For example, computer application fundamentals and principles of computer composition should precede the courses in operating systems and computer network security technology, in order to lay a good foundation. Similarly, the courses on operating systems and data structure should be arranged prior to database application technology to help students better understand its content.

6 Discussion

The widespread application of data mining technology in educational teaching has provided scientific and reliable support for the management and decision-making of educational teaching work, effectively improving the efficiency of educational instruction. Considering the limitations of current data mining technology in course correlation analysis, this paper examined the usability of fuzzy association rule mining for analyzing correlations among student course grades in higher education and proposed a method based on an improved FCM algorithm.

The improved FCM algorithm introduced in this paper performed better in clustering by incorporating a density function, as demonstrated through experiments on two standard test sets. The clustering results obtained were closer to the actual cluster centers of the data, indicating the reliability of the improved FCM algorithm in cluster analysis. Furthermore, when applied to fuzzy association rule mining for student course grades, this method identified certain associations between courses. For example, the performance in database application technology and WEB development basics affects the grades in computer network security technology. Similarly, the performance in computer application fundamentals has a certain impact on the grades in principles of computer composition.
By analyzing these fuzzy association rules, we can understand which courses should be arranged as foundational courses earlier in the curriculum and which courses should be scheduled after the foundational coursework is completed. This provides theoretical support for course planners and can also be applied to student course selection systems to help students choose suitable courses for better learning outcomes, thereby avoiding the decline in interest and poor learning effectiveness caused by sudden jumps in difficulty during their studies. In general, using the fuzzy association rule mining algorithm to analyze the correlation between courses can help optimize teaching plans and improve the performance of course selection systems.

Because data mining techniques are currently applied mainly to predicting and evaluating students' performance, there has been limited research on course association rule mining. Consequently, valuable data in educational databases have not been fully utilized. This study serves as a starting point for applying fuzzy association rule mining in higher education data management, demonstrating the effectiveness of this method in analyzing course correlations. However, there is still room for improvement. For example, the experiment covered relatively few courses, and a limited number of rules were discovered. Additionally, other data mining algorithms that can be applied in this field need to be discussed. In future work, it is necessary to further improve the efficiency of mining and uncover more useful rules for education, providing guidance for educational teaching.

7 Conclusion

This paper primarily focuses on data-based management in higher education. Fuzzy association rule mining algorithms were utilized to analyze the association between students' course grades. The results indicated that the improved FCM algorithm exhibited a better clustering effect compared to the traditional FCM algorithm.
Additionally, the fuzzy association rule mining based on the improved FCM algorithm yielded satisfactory outcomes, revealing rules that were more meaningful and that accurately reflected the connections between courses. These findings provide a solid foundation for actual course arrangement.

References

[1] Xie B (2020). Construction of Teacher Culture in Applied Colleges under the Background of Educational Informationization. Microprocessors and Microsystems, 2020, pp. 103486. https://doi.org/10.1016/j.micpro.2020.103486
[2] Ma Y L, Cui C, Nie X, Yang G, Shaheed K, Yin Y (2019). Pre-course student performance prediction with multi-instance multi-label learning. Science China (Information Sciences), 62, pp. 200-205. https://doi.org/10.1007/s11432-017-9371-y
[3] Baruah A J, Baruah S (2021). Data Augmentation and Deep Neuro-fuzzy Network for Student Performance Prediction with MapReduce Framework. International Journal of Automation and Computing, 18, pp. 981-992. https://doi.org/10.1007/s11633-021-1312-1
[4] Sandoval A, Gonzalez C, Alarcon R, Pichara K, Montenegro M (2018). Centralized student performance prediction in large courses based on low-cost variables in an institutional context. The Internet and Higher Education, 37, pp. 76-89. https://doi.org/10.1016/j.iheduc.2018.02.002
[5] Joshi A, Saggar P, Jain R, Sharma M, Gupta D, Khanna A (2021). CatBoost - An Ensemble Machine Learning Model for Prediction and Classification of Student Academic Performance. Advances in Data Science and Adaptive Analysis: Theory and Applications, 13, pp. 1-28. https://doi.org/10.1142/S2424922X21410023
[6] Fan J, Jiang Y, Liu Y, Zhou Y (2022). Interpretable MOOC recommendation: a multi-attention network for personalized learning behavior analysis. Internet Research: Electronic Networking Applications and Policy, 32, pp. 588-605. https://doi.org/10.1108/INTR-08-2020-0477
[7] Prahartiwi L I, Dari W (2019). Algoritma Apriori untuk Pencarian Frequent itemset dalam Association Rule Mining. PIKSEL Penelitian Ilmu Komputer Sistem Embedded and Logic, 7, pp. 143-152. https://doi.org/10.33558/piksel.v7i2.1817
[8] Diamond B J, Happawana K A (2022). Association rule learning in neuropsychological data analysis for Alzheimer's disease. Journal of Neuropsychology, 16, pp. 116-130.
[9] Fu L, Wang X, Zhao H, Zhao H, Li M (2022). Interactions among safety risks in metro deep foundation pit projects: An association rule mining-based modeling framework. Reliability Engineering & System Safety, 221, pp. 1-16. https://doi.org/10.1016/j.ress.2022.108381
[10] Liu X, Sang X, Chang J, Zheng Y, Han Y (2021). The water supply association analysis method in Shenzhen based on kmeans clustering discretization and apriori algorithm. PLoS ONE, 16, pp. 1-21. https://doi.org/10.1371/journal.pone.0255684
[11] Erişti B, Yildirim O, Eristi H, Demir Y (2018). A New Embedded Power Quality Event Classification System Based on The Wavelet Transform. International Transactions on Electrical Energy Systems, 28, pp. e2597. https://doi.org/10.1002/etep.2597
[12] Zaki M J, Parthasarathy S, Li W, Ogihara M (1997). Evaluation of sampling for data mining of association rules. Proceedings Seventh International Workshop on Research Issues in Data Engineering, pp. 42-50.
[13] Zoghby H M E, Ramadan H S (2022). Enhanced dynamic performance of steam turbine driving synchronous generator emulator via adaptive fuzzy control. Computers & Electrical Engineering, 97, pp. 107666. https://doi.org/10.1016/j.compeleceng.2021.107666
[14] Naranjo R, Santos M (2019). A fuzzy decision system for money investment in stock markets based on fuzzy candlesticks pattern recognition. Expert Systems with Applications, 133, pp. 34-48. https://doi.org/10.1016/j.eswa.2019.05.012
[15] Yavari A, Rajabzadeh A, Abdali-Mohammadi F (2021). Profile-Based Assessment of Diseases Affective Factors Using Fuzzy Association Rule Mining Approach: A Case Study in Heart Diseases. Journal of Biomedical Informatics, 116, pp. 103695. https://doi.org/10.1016/j.jbi.2021.103695
[16] Zare M, Koch M (2018). Groundwater level fluctuations simulation and prediction by ANFIS- and hybrid Wavelet-ANFIS/Fuzzy C-Means (FCM) clustering models: Application to the Miandarband plain. Journal of Hydro-environment Research, 18, pp. 63-76. https://doi.org/10.1016/j.jher.2017.11.004
[17] Siringoringo R, Jamaluddin J (2019). Initializing the Fuzzy C-Means Cluster Center With Particle Swarm Optimization for Sentiment Clustering. IOP Publishing Ltd, pp. 1-6.
[18] https://archive.ics.uci.edu/ml/datasets/Iris
[19] Bezdek J C, Hathaway R J, Sabin M J, Tucker W T (1987). Convergence theory for fuzzy c-means: Counterexamples and repairs. IEEE Transactions on Systems Man & Cybernetics, 17, pp. 873-877. https://doi.org/10.1109/TSMC.1987.6499296