https://doi.org/10.31449/inf.v48i11.5675                                                                                      Informatica 48 (2024) 97–112  97 
Research on Intelligent Mining Methods of Multimedia Teaching 
Resources in Colleges and Universities in the Age of Big Data 
Qi Yue¹, Zhu Xuan*²
¹ College of Chemistry, Changchun Normal University, Changchun 130032, China
² College of Literature, Changchun Normal University, Changchun 130032, China
E-mail: zhuxuan-486@163.com
* Corresponding author
Keywords: big data; teaching resources; focused crawler; feature extraction; BP neural network; data mining
Received:  January 30, 2024
To address the difficulty of accessing college multimedia resources in the era of big data, this
research proposes an intelligent mining method for college multimedia teaching resources.
Multimedia teaching resource data are collected from the Internet with a focused crawler, which
offers topic-driven crawling and URL ranking. The crawled data are processed by stop-word
removal, word segmentation, and word frequency statistics, and features are then extracted from
the processed data. The extracted features are classified by clustering analysis, and the number
of categories is used as the number of BP neural networks to combine. The momentum method and
an adaptive learning rate adjustment strategy are introduced to improve the combined BP neural
network. The extracted features are input into the improved combined BP neural network, which
outputs the intelligent mining results for university multimedia resources. The experimental
results indicate that the focused crawler collects multimedia education resources in colleges and
universities efficiently and that data preprocessing effectively reduces data redundancy. The
double-feature extraction method significantly improves the recall and accuracy of data mining.
The method achieves classified mining of multimedia teaching resources in colleges and
universities and displays the classified mining results for each discipline.
Povzetek (Summary): The research introduces an intelligent method for mining multimedia teaching
resources that uses focused crawlers, data processing, and an improved BP neural network, which
increases the efficiency and accuracy of data mining.
 
1 Introduction 
Big data [1], or massive data, refers to information whose volume is so large that
mainstream software tools cannot capture, manage, process, and organize it within a
reasonable time into a form that helps enterprises make better business decisions [2].
Data mining is a hot topic in artificial intelligence [3], databases, and other fields [4].
Data mining refers to the non-trivial process of revealing hidden [5], previously unknown,
and potentially valuable information from a large amount of data in a database. The data
can be structured [6], semi-structured [7], or even heterogeneous [8]. The object of data
mining can be any type of data source. It can be a relational database [9], which contains
structured data; it can also be a data warehouse, text [10], multimedia data [11], spatial
data, time series data, or Web data, which contain semi-structured or even heterogeneous
data. Data mining is a decision support process based mainly on artificial intelligence [12],
machine learning, pattern recognition, statistics, databases, and visualization technology.
It analyzes enterprise data in a highly automated way, performs inductive reasoning, mines
potential patterns, and helps decision makers adjust market strategies, reduce risks, and
make correct decisions. The process of knowledge discovery consists of three stages: data
preparation, data mining, and result expression and interpretation.
Colleges and universities have adopted multimedia teaching, but its current state is far
from ideal. One important reason is that the multimedia teaching resources available within
colleges and universities are scarce and lack variety, while the sheer volume of data on the
Internet makes it difficult to find rich multimedia teaching resources there.
Many scholars have studied the mining of teaching resources. Varde A. Stichude proposed
computational estimation for scientific data mining based on classical methods, automating
the learning strategies of scientists [13] by integrating the clustering of complex teaching
data into a framework. Knowledge from an existing experimental database serves as the basis
for estimation. The challenges include maintaining domain semantics in clustering, finding
the right strategy in classification, balancing refinement and simplicity when presenting
estimated results according to the needs of the target user, and obtaining objective metrics
that capture users' subjective preferences. In this way, the mining of teaching data is
completed. Abu MM et al. designed an educational data mining system to predict and improve
the programming ability of college students [14]. The system includes two main modules:
classification and learning process. The classification module predicts the current state of
students and mines programming data on the Internet, and the learning process module
generates corresponding suggestions and feedback to improve student performance. For the
classification module, a real dataset related to the task was prepared and used to evaluate
six key machine learning (ML) algorithms: support vector machines (SVM), decision trees,
artificial neural networks, random forests (RF), k-clusters, and naive Bayesian classifiers,
using accuracy-related performance metrics and goodness of fit. The system thereby mines
programming teaching resources and evaluates students' programming ability. Zhao G et al.
proposed a retrieval algorithm for image network teaching resources based on deep hashing
[15]. They constructed a pixel big data detection model for multi-view attribute coded image
network teaching resources, reconstructed the pixel information collected from these
resources, and extracted the fuzzy information feature components of the multi-view
attribute coded images. Edge contour distribution images are combined, and the distributed
fusion result of the image edge contours of the network-supervised resource view is used to
construct a view feature parameter set. A gray matrix invariant feature analysis method is
used for information encoding. The deep hashing algorithm is then applied to retrieve the
multi-view attribute coded image network teaching resources and to produce their hash coding
results, and the encoded image network is used for information reorganization to improve
fusion. Finally, the retrieval and mining of image teaching resources are completed. The
above educational resource mining methods all have shortcomings, such as long data
collection time, insufficient breadth of data collection, low mining accuracy, and long
mining running time.
Data mining can handle big data and discover hidden relationships in data. Therefore, this
paper proposes an intelligent mining method for multimedia teaching resources in colleges
and universities in the era of big data. The method uses a focused crawler to collect
multimedia teaching resources, processes the data with word frequency statistics, extracts
data features, and finally uses an improved BP neural network to mine the data features.
2 Intelligent mining method of multimedia teaching resources in colleges and universities
2.1 Collection of college multimedia teaching resources based on web crawler
Because the Internet holds a massive amount of data, quickly and accurately finding the
required multimedia teaching information on it has become increasingly important. A search
engine is an online information retrieval tool [16], and a web crawler is a key part of a
search engine. A search engine usually includes four parts: an information collector
(Robot), a data indexer (Indexer), a query searcher (Server), and a result sorter (Ranker).
The workflow of the search engine is presented in Figure 1. It includes the following steps:
(1) Find and obtain web page information through web crawlers [17];
(2) Organize and process the information and create an index database;
(3) Process and sort the search results;
(4) Return the retrieval results.
 
Figure 1: Search engine workflow 
However, searching for multimedia education resources manually is very inefficient.
Therefore, web crawlers can be used to collect multimedia education resources. This paper
uses a focused crawler, also known as a theme crawler (or professional crawler); that is, a
"theme-oriented" web crawler program. Unlike a general-purpose (universal) crawler, a
focused crawler is driven by a target theme and crawls selectively. During web crawling,
pages are screened against the college multimedia teaching theme so that, as far as
possible, only web page information related to this theme is captured. According to the
established target theme, the focused crawler selectively accesses the relevant pages on the
Web. It pursues the precision of network information rather than the coverage of network
resources.
The workflow of a focused crawler is more complex than that of an ordinary web crawler. A
web page analysis algorithm filters out links that are irrelevant to the topic, and the
useful links are retained and placed in the queue of URLs waiting to be crawled. The crawler
then selects the next page URL [18] from the queue according to a search strategy and
repeats this process until a stopping condition of the system is reached. The focused web
crawler workflow is shown in Figure 2. In addition, all web pages captured by the crawler
are stored in the system [19], analyzed, and filtered, and college multimedia teaching
indexes are built for users to query and retrieve later. For focused crawlers, the analysis
results obtained in this process may also provide feedback and guidance for subsequent
crawling. The focused crawler structure used in this paper is shown in Figure 3. A crawler
with this structure first assigns theme-related pages to the predefined categories of theme
samples and uses the predefined college multimedia teaching theme samples to train a
classifier. A classifier-based focused crawler has two main parts. The first is a web page
classifier, which learns the features of the crawling targets, calculates the relevance of
web pages, and filters out irrelevant pages [20]. The second is a web page selector, which
calculates the importance of each web page and dynamically determines the order in which the
crawler visits pages according to that importance.
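The interplay of the page classifier and the page selector can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the in-memory `TOY_WEB` stands in for real HTTP fetching, the keyword-overlap `relevance` function stands in for the trained classifier, and a link simply inherits the relevance of the page it was found on as its priority. The paper does not specify these details.

```python
import heapq

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
TOY_WEB = {
    "u/home": ("college multimedia teaching portal", ["u/video", "u/sports", "u/slides"]),
    "u/video": ("multimedia teaching video lecture", ["u/slides"]),
    "u/slides": ("teaching slides courseware", []),
    "u/sports": ("football scores and news", ["u/gossip"]),
    "u/gossip": ("celebrity gossip", []),
}

# Hypothetical theme vocabulary standing in for a trained page classifier.
TOPIC_TERMS = {"multimedia", "teaching", "courseware", "lecture"}


def relevance(text):
    """Fraction of words on the page that belong to the theme vocabulary."""
    words = text.split()
    return sum(w in TOPIC_TERMS for w in words) / len(words) if words else 0.0


def focused_crawl(seed, fetch, threshold=0.2, max_pages=10):
    """Visit pages in order of estimated importance and keep only relevant ones."""
    frontier = [(-1.0, seed)]  # heapq is a min-heap, so scores are negated
    seen = {seed}
    kept = []
    while frontier and len(kept) < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = fetch(url)
        score = relevance(text)
        if score < threshold:
            continue  # page classifier rejects the page; its links are not followed
        kept.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                # Page selector: a link inherits the relevance of its parent page.
                heapq.heappush(frontier, (-score, link))
    return kept


print(focused_crawl("u/home", TOY_WEB.__getitem__))
```

In this toy run the off-topic page `u/sports` is fetched and rejected, so the page it links to is never queued, which is the selective behavior that distinguishes a focused crawler from a general-purpose one.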
 
Figure 2: Focused crawler workflow
 
Figure 3: Focused crawler structure
2.2 Data processing of college multimedia teaching resources
Multimedia teaching resources in colleges and universities are huge in scale, so traditional
general-purpose crawlers collect them inefficiently, whereas focused crawlers obtain the
required teaching resources more quickly and accurately and improve collection efficiency.
The collected resources still contain data with low correlation to the theme, which leads to
serious data redundancy if left unprocessed. Deleting such data reduces redundancy and
improves the preprocessing effect. The college multimedia teaching resources collected by
the focused web crawler are uniformly mapped into text form and then undergo stop-word
removal and word segmentation. On this basis, word frequency statistics are used to complete
the processing of college multimedia data.
2.2.1 Stop-word removal
A complete resource text contains a large number of meaningless words, such as the modal
particles "ah", "na", and "ni". These words are collectively called stop words. To remove
them, all texts are traversed against a stop-word list, and every word in the text that
appears in the list is deleted.
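As a small illustration of this filtering step, the sketch below assumes the text has already been split into tokens and uses a hypothetical three-entry stop-word list; a real list would be far larger and loaded from a file.

```python
# Hypothetical stop-word list built from the examples in the text.
STOP_WORDS = {"ah", "na", "ni"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not in the stop-word list, preserving order."""
    return [tok for tok in tokens if tok not in stop_words]


print(remove_stop_words(["multimedia", "ah", "teaching", "ni", "resources"]))
```

Storing the list as a set keeps each membership check at constant time, which matters when every token of a large corpus is tested.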
2.2.2 Word segmentation 
Word segmentation divides the data text of college multimedia teaching resources into
thousands of words or phrases. The bidirectional matching method is used here. The process
is as follows.
Step 1: Let the input string to be segmented be $D$ and the output segmentation result be $C$;
Step 2: Execute the forward maximum matching algorithm to obtain the segmentation result
$C_1$. The specific process is as follows:
(1) Initialize the dictionary Dic and set the maximum segment length $c_{max}$;
(2) Check whether $D$ is empty. If it is, output the segmentation result $C_1$; otherwise,
go to the next step;
(3) Compare $c_{max}$ with the length of $D$, take the smaller of the two, and record it as $F$;
(4) Cut a substring of length $F$ from the head of $D$ and denote it $\hat{D}$;
(5) Check whether $\hat{D}$ is in the dictionary. If not, go to the next step; otherwise, go to (7);
(6) Remove the rightmost character of $\hat{D}$ and check whether $\hat{D}$ is now a single
character. If it is, go to the next step; otherwise, go back to (5);
(7) Set $C_1 = C_1 + \hat{D} + \text{"/"}$ and $D = D - \hat{D}$, and go back to (2).
Step 3: Execute the reverse maximum matching algorithm to obtain the segmentation result $C_2$;
Step 4: Check whether $C_1$ and $C_2$ are the same. If they are, set $C = C_1$ or $C = C_2$
and jump to Step 6; otherwise, proceed to the next step;
Step 5: Check whether $C_1$ and $C_2$ have the same length (number of segments). If they do,
set $C = C_2$, taking the result of the reverse maximum matching algorithm; otherwise,
assign the shorter result to $C$. Then go to Step 6;
Step 6: Output the segmentation result $C$.
This completes the word segmentation of multimedia teaching resource data in colleges and
universities.
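The six steps above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, segmentation results are returned as lists instead of "/"-joined strings, the "length" compared in Step 5 is taken to be the number of segments, and Latin letters stand in for Chinese characters in the toy dictionary.

```python
def forward_max_match(text, dictionary, max_len):
    """Forward maximum matching: greedily take the longest dictionary word from the head."""
    tokens = []
    while text:
        piece = text[:min(max_len, len(text))]
        # Shrink from the right until the piece is a word or a single character.
        while len(piece) > 1 and piece not in dictionary:
            piece = piece[:-1]
        tokens.append(piece)
        text = text[len(piece):]
    return tokens


def reverse_max_match(text, dictionary, max_len):
    """Reverse maximum matching: the same idea, scanning from the tail."""
    tokens = []
    while text:
        piece = text[-min(max_len, len(text)):]
        # Shrink from the left until the piece is a word or a single character.
        while len(piece) > 1 and piece not in dictionary:
            piece = piece[1:]
        tokens.append(piece)
        text = text[:-len(piece)]
    tokens.reverse()
    return tokens


def bidirectional_match(text, dictionary, max_len):
    """Combine both directions following Steps 4 and 5 of the procedure."""
    fwd = forward_max_match(text, dictionary, max_len)
    rev = reverse_max_match(text, dictionary, max_len)
    if fwd == rev:
        return fwd
    if len(fwd) == len(rev):
        return rev  # equal segment counts: prefer the reverse result
    return fwd if len(fwd) < len(rev) else rev  # otherwise prefer fewer segments


toy_dict = {"ab", "abc", "cd", "d"}
print(bidirectional_match("abcd", toy_dict, max_len=3))
```

For the toy input, forward matching yields `abc` + `d` and reverse matching yields `ab` + `cd`; both have two segments, so the reverse result is returned.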
2.2.3 Statistical processing of word frequency 
In a text document of university multimedia teaching resource data, words of each frequency
are distributed according to certain rules. The word frequency statistics method uses
statistics to describe these rules. Zipf's law and Booth's law are two laws with
far-reaching influence on word frequency statistics [21]. Scholars in various fields have
studied the statistical laws of word frequency in depth for many years, and many favor this
method for its simplicity and practicality.
Zipf's law is described as follows [22]. Given a text document $d$ of university multimedia
teaching resource data, let $L$ denote the length of $d$ ($L$ is large enough), $N_{diff}$
the total number of different words appearing in $d$, $TF_n$ the word frequency of words in
$d$ ($n$ is the number of occurrences of a word in the text), $TR_n$ the word rank
corresponding to $TF_n$, and $f_n$ the relative frequency of words, $f_n = TF_n / L$. Then:
$$f_n \times TR_n = K \tag{1}$$
$$TF_n \le f_n \times L < TF_n + 1 \tag{2}$$
According to Zipf's law, the number of words $NTIF_n$ that share the word frequency $TF_n$ is:
$$NTIF_n = \frac{K \cdot L}{TF_n \cdot (TF_n + 1)} \tag{3}$$
Formula (3) does not apply to every value of $TF_n$. It is based on Zipf's law, which does
not reflect well the distribution of words with extremely low word frequency
($TF_n = 1, 2$), where the fluctuation is particularly obvious. Therefore, the maximum value
method is used for words with $TF_n = 1, 2$.
Based on the maximum value method, the number of words $NTIF_n$ that share the word
frequency $TF_n = 1, 2$ is:
$$NTIF_n = \frac{N_{diff}}{TF_n \times (TF_n + 1)}, \quad n = 1, 2 \tag{4}$$
where $N_{diff}$ is the total number of different words appearing in document $d$.
Combining formulas (3) and (4) gives the number of words with the same frequency, $NTIF_n$.
The complete expression is:
$$NTIF_n = \begin{cases} \dfrac{K \times L}{TF_n \times (TF_n + 1)}, & n > 2 \\[2ex] \dfrac{N_{diff}}{TF_n \times (TF_n + 1)}, & n = 1, 2 \end{cases} \tag{5}$$
where $K = \dfrac{1}{\ln N_{diff} + \beta}$ ($\beta$ is the Euler constant).
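The piecewise expression (5) is straightforward to evaluate numerically. The sketch below assumes that $TF_n = n$, that the denominator is $TF_n (TF_n + 1)$, and that the Euler constant $\beta$ is the Euler-Mascheroni constant (about 0.5772); the function names are hypothetical.

```python
import math

EULER_CONSTANT = 0.5772156649015329  # Euler-Mascheroni constant, written beta in the text


def zipf_constant(n_diff):
    """K = 1 / (ln N_diff + beta)."""
    return 1.0 / (math.log(n_diff) + EULER_CONSTANT)


def ntif(tf, n_diff, length):
    """Estimated number of distinct words that occur exactly `tf` times (formula 5)."""
    denominator = tf * (tf + 1)
    if tf in (1, 2):
        # Low-frequency branch: maximum value method.
        return n_diff / denominator
    # General branch derived from Zipf's law.
    return zipf_constant(n_diff) * length / denominator


# Example: a document with 50000 tokens and 1000 distinct words.
for tf in (1, 2, 3, 10):
    print(tf, round(ntif(tf, n_diff=1000, length=50000), 1))
```

With these inputs the low-frequency branch predicts that half of the distinct words (500) occur exactly once, which is the behavior the maximum value method is meant to capture.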
The text preprocessing method for college multimedia teaching resource data based on the
statistical rule of word frequency is as follows:
Step 1: Initialize the dictionaries $dict1$, $dict2$, and $dict3$, which store the words
with word frequency $TF_n = 1$, $TF_n = 2$, and $TF_n > 2$ respectively, the corresponding
counters $count1$, $count2$, and $count3$, which record the number of words at each word
frequency, the word list $TermList$, and the counter $word\_count$;
Step 2: Perform word segmentation and record the word frequency of every word;
Step 3: Classify the words by word frequency and record the number of words at each
frequency;
Step 4: Preprocess the data based on word frequency statistics;
Step 5: Select the words with low frequency and low correlation and delete them;
Step 6: Output the word sets with different word frequencies, the total number of words
corresponding to each set, and the preprocessed list.
This completes the data processing of multimedia teaching resources in colleges and
universities.
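The steps above can be sketched with the standard library. The names `dict1`, `dict2`, and `dict3` follow the text; the function name and the `min_tf` cut-off are hypothetical. The paper deletes words that are both low in frequency and low in correlation but does not define the correlation test, so this sketch filters on frequency alone.

```python
from collections import Counter


def preprocess_by_frequency(tokens, min_tf=2):
    """Bucket words by frequency and drop tokens rarer than `min_tf`."""
    freq = Counter(tokens)
    dict1 = {w for w, n in freq.items() if n == 1}  # words with TF = 1
    dict2 = {w for w, n in freq.items() if n == 2}  # words with TF = 2
    dict3 = {w for w, n in freq.items() if n > 2}   # words with TF > 2
    counts = (len(dict1), len(dict2), len(dict3))
    # Assumption: only the frequency criterion is applied here.
    kept = [tok for tok in tokens if freq[tok] >= min_tf]
    return dict1, dict2, dict3, counts, kept


tokens = ["a", "b", "b", "c", "c", "c", "a", "d"]
dict1, dict2, dict3, counts, kept = preprocess_by_frequency(tokens)
print(counts, kept)
```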
2.3 Data feature extraction of college multimedia teaching resources
Because each existing method for extracting text data features has its own shortcomings,
this paper uses both BNS (Bi-Normal Separation) and Odds (odds ratio) to extract features
from the preprocessed college multimedia teaching resource data [23]. Using the two methods
together lets each compensate for the shortcomings of the other and further improves the
accuracy of feature extraction.
Feature extraction for multimedia teaching resources in colleges and universities extracts
key information and removes low-frequency vocabulary, which yields accurate and effective
features that express the characteristics of the multimedia data more fully. It reduces data
complexity and redundancy, improves the accuracy and efficiency of mining, and provides
reliable technical support for the intelligent mining of multimedia teaching resources in
colleges and universities.
2.3.1 Data feature extraction method of college multimedia teaching resources
(1) BNS method
BNS is a feature extraction algorithm suited to the text classification of multimedia
teaching resource data in colleges and universities. It uses probability statistics to
measure and compare the significance of a term with respect to the category distribution.
The formula is:
$$BNS(t) = F^{-1}(t_{pr}) - F^{-1}(f_{pr}), \quad t_{pr} = \frac{t_p}{t_p + f_n}, \quad f_{pr} = \frac{f_p}{f_p + t_n} \tag{6}$$
where $t$ denotes a term and $BNS(t)$ is the BNS feature value of this term; $F(\cdot)$ is
the distribution function of the standard normal distribution and $F^{-1}$ is its inverse;
$t_p$ is the number of texts in the class that contain the term; $f_p$ is the number of
texts outside the class that contain the term; $f_n$ is the number of texts in the class
that do not contain the term; and $t_n$ is the number of texts outside the class that do not
contain the term. When $t_{pr}$ or $f_{pr}$ is 0, define $F^{-1}(0) = 0.0005$.
The BNS algorithm models the features in each college multimedia teaching resource data text
with a normal distribution curve and uses the tail area of the normal curve, bounded by an
endpoint, as a measure of how strongly a feature term is correlated with the class. The
stronger the correlation of a feature term with the class, the farther the endpoint for the
positive class lies from the endpoint for the anti-class (as shown in Figure 4). The BNS
value measures the difference between the two endpoints.
[Figure 4 labels: positive class correlation, anti class correlation, BNS value]
Figure 4: BNS feature extraction algorithm endpoint separation 
The BNS algorithm can solve the problem of data set 
skew of multimedia teaching resources in colleges and 
universities and performs well in multi-step or combined 
feature selection, but the accuracy of classification results 
using the features obtained from the BNS algorithm could 
be better. 
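A BNS score can be computed from the four document counts with the Python standard library alone. The sketch below rests on two assumptions that go beyond the text. First, it takes the absolute value of the difference, as in Forman's original definition of Bi-Normal Separation, so the score does not depend on which class is labeled positive. Second, instead of assigning a value to $F^{-1}(0)$, it clips both rates into $[0.0005, 0.9995]$ before inverting, which is the usual way to keep the inverse normal CDF finite. The function name `bns` is hypothetical.

```python
from statistics import NormalDist

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def bns(tp, fp, fn, tn, floor=0.0005):
    """Bi-Normal Separation score of a term from its four document counts.

    tp: in-class texts containing the term      fp: out-of-class texts containing it
    fn: in-class texts without the term         tn: out-of-class texts without it
    """
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)

    def clip(rate):
        # Assumption: clip rates so that inv_cdf never receives exactly 0 or 1.
        return min(max(rate, floor), 1.0 - floor)

    return abs(_STANDARD_NORMAL.inv_cdf(clip(tpr)) - _STANDARD_NORMAL.inv_cdf(clip(fpr)))


print(round(bns(80, 20, 20, 80), 4))  # term concentrated in the class
print(round(bns(50, 50, 50, 50), 4))  # term spread evenly across classes
```

A term that appears equally often inside and outside the class scores zero, and the score grows as the two rates move apart.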
(2) Odds method
Odds mainly reflects the ratio between the odds of a term in the positive class and its odds
in the negative class in the text classification of multimedia teaching resource data in
colleges and universities. The formula is:
$$Odds(t) = \log \frac{P(W|pos)\,(1 - P(W|neg))}{P(W|neg)\,(1 - P(W|pos))} \tag{7}$$
where $t$ denotes a term and $Odds(t)$ is the Odds feature value of this term; $P(W|pos)$ is
the conditional probability that term $W$ occurs within the class, and $P(W|neg)$ is the
conditional probability that term $W$ occurs outside the class.
The Odds feature extraction algorithm does not treat all classes equally. It cares only
about the target class and identifies as many positive-class terms as possible, without
regard to the anti-class. It is suitable for binary classifiers. According to Mladenic and
Grobelnik, the Odds feature extraction algorithm helps to recover information that other
algorithms miss, so it can effectively compensate for their shortcomings when combined with
them.
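Formula (7) can be evaluated from the same four document counts used for BNS. In this sketch the conditional probabilities are estimated as simple document fractions, and a small hypothetical constant `eps` keeps the logarithm finite when a term never occurs, or always occurs, in one of the classes; neither choice is specified in the text.

```python
import math


def odds_ratio(tp, fp, fn, tn, eps=1e-6):
    """Log odds ratio of a term, with probabilities estimated from document counts."""
    p_pos = tp / (tp + fn)  # estimate of P(W | pos)
    p_neg = fp / (fp + tn)  # estimate of P(W | neg)
    # Assumption: clip away from 0 and 1 so neither the ratio nor the log blows up.
    p_pos = min(max(p_pos, eps), 1.0 - eps)
    p_neg = min(max(p_neg, eps), 1.0 - eps)
    return math.log(p_pos * (1.0 - p_neg) / (p_neg * (1.0 - p_pos)))


print(round(odds_ratio(80, 20, 20, 80), 4))
print(round(odds_ratio(50, 50, 50, 50), 4))
```

The sign shows the asymmetry described above: terms favoring the positive class score above zero, while terms favoring the anti-class score below zero and would be ranked last.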
2.3.2 Data feature extraction process of college multimedia teaching resources
This paper uses the two algorithms together so that they complement each other in extracting
the text features of college multimedia teaching resource data. While maintaining the
accuracy of feature extraction, this addresses the skew of the college multimedia teaching
resource data set [24] and finally yields a text feature set with fewer dimensions. The
method flow is shown in Figure 5.
Figure 5: Algorithm flow 
The specific steps of the algorithm are as follows:
Input: Processed college multimedia teaching resource data.
Output: Term vectors in the vector space model composed of feature terms.
Description: A text object is $t(tName, *W, tNumber)$ and a term object is
$w(wName, wNumber, tp, fp, tn, fn, BNS, Odds, BOS)$, where $*T$ is a pointer to the array of
text objects, $*W$ is a pointer to the array of term objects, $tName$ is the text name,
$wName$ is the content of the term, $tNumber$ is the number of text objects, and $wNumber$
is the number of terms.
Step 1: Initialize $*T$;
Step 2: Write the input college multimedia teaching resource data into $*T$, where $T_n$
represents the $n$-th class, and calculate the $tp$ and $fn$ values, building for each class
a hash table keyed by term;
Step 3: Calculate the $fp$ and $tn$ values;
Step 4: Use the BNS and Odds feature extraction algorithms to calculate the BNS feature
value and Odds feature value of each term, and calculate the BNS value variance, Odds value
variance, BNS value weight, and Odds value weight of each category;
Step 5: Extract the text features of college multimedia teaching resource data according to
the calculated feature values and write them into the vector space model.
2.4 Data mining model of college multimedia teaching resources based on improved BP neural network
2.4.1 BP Neural network 
A BP neural network is a multi-layer feedforward neural network trained with the error
backpropagation algorithm [25]. The model can learn and store many input-output mappings
without requiring the mathematical formulas that describe these mappings to be specified in
advance. The structure of the BP neural network model is shown in Figure 6.
[Figure 6 labels: input layer, hidden layer, output layer]
 
Figure 6: Model structure of BP neural network 
[Figure 5 flow labels: processed multimedia educational resource data of colleges and universities; BNS feature extraction and Odds feature extraction; eigenvectors]
As shown in Figure 6, the BP neural network model outputs the final mining results through a
three-layer architecture of input layer, hidden layer, and output layer. The single-layer
neuron structure of this model is expressed mathematically in formula (8):
$$y = f\left(\sum_{x=1}^{n} w_x x - \theta\right) \tag{8}$$
where $y$ is the output value of the model; $f$ is the activation function of the model; $x$
is the initial index variable substituted into the model; $w_x$ is the weight of the model
neuron; and $\theta$ is the model threshold.
The activation function in formula (8) is calculated as shown in formula (9):
$$f(x) = \frac{2}{1 + e^{-x}} - 1, \quad (-1 < f(x) < 1) \tag{9}$$
where $e$ is the natural constant. The model is trained with the initial learning rate set
to 0.1. Through repeated training, the output error of the model results is controlled
between 1 and 5.
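A single neuron of this kind can be sketched in a few lines. The sketch assumes the bipolar sigmoid $f(x) = 2/(1 + e^{-x}) - 1$, whose outputs lie in $(-1, 1)$ as the text states, and evaluates it through the identity $f(x) = \tanh(x/2)$ to avoid overflow for large negative inputs. The function names are hypothetical.

```python
import math


def bipolar_sigmoid(x):
    """f(x) = 2 / (1 + exp(-x)) - 1, computed as tanh(x / 2) for numerical stability."""
    return math.tanh(x / 2.0)


def neuron_output(inputs, weights, theta):
    """y = f(sum of w_i * x_i minus theta) for one neuron."""
    net = sum(w * x for w, x in zip(weights, inputs)) - theta
    return bipolar_sigmoid(net)


print(neuron_output([1.0, 2.0], [0.5, 0.25], theta=1.0))
print(round(neuron_output([1.0, 2.0], [0.5, 0.25], theta=0.0), 4))
```

In the first call the weighted sum exactly equals the threshold, so the net input is zero and the neuron sits at the midpoint of its output range.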
On this basis, the network dynamically adjusts its weights and thresholds through
backpropagation to minimize the sum of squared errors. The BP algorithm uses the gradient
method to build a minimum quadratic performance index function and seeks the minimum of the
objective function by processing samples one by one or in batches. The function is
expressed as:
$$E = \sum_{k=1}^{m} E_k \tag{10}$$
where $E_k$ is a local error function.
Assuming there are $N$ hidden layers in total, the connection weight from unit $i$ of the
$R$-th hidden layer to output unit $k$ is updated as follows:
$$W_{i,k}^{R} = W_{i,k-1}^{R} + \Delta W_{i,k}^{R}$$
$$\Delta W_{i,k}^{R} = \alpha \varepsilon_{i,k}^{R} h_{k}^{R+1}, \quad (i = 1, 2, \cdots, N) \tag{11}$$
Applying this learning algorithm to the data mining of multimedia teaching resources in
colleges and universities can effectively uncover the hidden patterns of these resources.
2.4.2 Fuzzy clustering 
C-means and K-means are commonly used clustering algorithms [26]. What they have in common
is that the cluster centers are refined through repeated iteration and the Euclidean
distance is used to judge the membership of samples. When a specified condition or threshold
is reached, the iteration ends and the classification is complete. However, the K-means
algorithm depends heavily on the initial cluster centers and its classification results lack
stability, so the C-means algorithm remains the mainstream choice. In this paper, the
C-means fuzzy clustering algorithm is used to cluster the extracted data features of college
multimedia teaching resources into $c$ categories, and correspondingly $c$ BP neural
networks are selected and combined to complete the mining of college multimedia teaching
resources.
Given a sample set of features of multimedia teaching resources in colleges and
universities, $A = \{x_1, x_2, \cdots, x_n\}$, and the number of clusters $c$, the objective
function is given by formula (10):
$$\min J = \sum_{i=1}^{c} \sum_{k=1}^{n} u_{ik}^{m} d_{ik}^{2}, \quad d_{ik} = \| s_k - z_i \| \tag{10}$$
Here $u_{ik}$ is the membership of the $k$-th sample in the $i$-th class, and $u_{ik}$
satisfies formula (11):
$$\sum_{i=1}^{c} u_{ik} = 1, \ \forall k; \qquad n > \sum_{k=1}^{n} u_{ik} > 0, \ \forall i \tag{11}$$
where $s_k$ is the position of the $k$-th sample, $z_i$ is the center of class $i$, and
$d_{ik}$ is the distance from the $k$-th sample to the center of class $i$.
The purpose of the C-means algorithm is to obtain the optimal solution of the objective
function, subject to the following two conditions:
$$z_i^{(q)} = \frac{\sum_{k=1}^{n} \left(u_{ik}^{(q)}\right)^{m} x_k}{\sum_{k=1}^{n} \left(u_{ik}^{(q)}\right)^{m}}, \quad (i = 1, 2, 3, \cdots, c) \tag{12}$$
$$u_{ik}^{(q+1)} = \frac{1}{\sum_{j=1}^{c} \left(\dfrac{d_{ik}}{d_{jk}}\right)^{\frac{2}{m-1}}}, \quad \forall i, \forall k \tag{13}$$
where $m$ is the weighting exponent of membership.
The basic flow of the algorithm is as follows:
Step 1: Set the initial parameters $m$ and $c$ (the value of $m$ is generally taken as 2)
and calculate the initial cluster centers $z$;
Step 2: Use formula (12) and formula (13) to compute the corrected $z$ and $u$;
Step 3: Given a tolerance $\epsilon$ and a suitable matrix norm, stop if
$\| U^{(q+1)} - U^{(q)} \| < \epsilon$; otherwise, return to Step 2.
The clustering results give the number of clusters $c$ and the cluster centers $z$.
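The iteration between formulas (12) and (13) can be sketched in plain Python. Several choices here are assumptions made for illustration: the initial centers are taken deterministically from evenly spaced samples, the matrix norm is the largest absolute change in any membership value, and a sample that coincides with a center is given full membership in it (split evenly if several centers coincide) to avoid division by zero. The function name is hypothetical.

```python
import math


def fuzzy_c_means(samples, c, m=2.0, tol=1e-5, max_iter=100):
    """Fuzzy C-means on a list of equal-length tuples; returns (centers, memberships)."""
    n, dim = len(samples), len(samples[0])
    # Assumption: deterministic initial centers taken from evenly spaced samples.
    centers = [samples[(i * n) // c] for i in range(c)]
    u = None
    for _ in range(max_iter):
        # Membership update, formula (13).
        new_u = [[0.0] * n for _ in range(c)]
        for k, x in enumerate(samples):
            d = [math.dist(x, z) for z in centers]
            on_center = [i for i, dist in enumerate(d) if dist == 0.0]
            if on_center:
                for i in on_center:
                    new_u[i][k] = 1.0 / len(on_center)
                continue
            for i in range(c):
                new_u[i][k] = 1.0 / sum((d[i] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
        # Center update, formula (12).
        centers = []
        for i in range(c):
            w = [new_u[i][k] ** m for k in range(n)]
            total = sum(w)
            centers.append(tuple(sum(w[k] * samples[k][a] for k in range(n)) / total
                                 for a in range(dim)))
        # Stopping test: largest change in any membership value.
        converged = u is not None and max(
            abs(new_u[i][k] - u[i][k]) for i in range(c) for k in range(n)) < tol
        u = new_u
        if converged:
            break
    return centers, u


data = [(0.0,), (0.2,), (0.1,), (10.0,), (10.2,), (10.1,)]
centers, memberships = fuzzy_c_means(data, c=2)
print([round(z[0], 2) for z in sorted(centers)])
```

Each column of the membership matrix sums to one, so a sample between two clusters is shared between them rather than forced into a single class.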
2.4.3 Implementation of data mining of college multimedia teaching resources based on BP neural network
The combined BP neural network takes the number of clusters $c$ produced by fuzzy clustering
and combines $c$ BP networks to complete the data mining of college multimedia teaching
resources. All BP networks adopt a three-layer structure with input, hidden, and output
layers. In addition, the heuristic improved BP algorithm Heuristicbp is used to enhance the
efficiency and accuracy of the overall combined BP network; that is, the momentum method and
an adaptive learning rate adjustment strategy are applied.
The traditional BP algorithm uses the approximate steepest descent method to update the weight and offset values as follows:
$$\Delta V(k) = -\lambda S_{m} \left(a_{m-1}\right)^{G} \tag{14}$$
$$\Delta b(k) = -\lambda S_{m} \tag{15}$$
where $\Delta V$ represents the weight correction; $\Delta b$ represents the offset correction; and $m$ represents the characteristic attribute.
Adding the momentum coefficient $\gamma$ to the above formulas gives the momentum-improved backpropagation formulas:
$$\Delta V(k) = \gamma \Delta V(k-1) - (1-\gamma) \lambda S_{m} \left(a_{m-1}\right)^{G} \tag{16}$$
$$\Delta b(k) = \gamma \Delta b(k-1) - (1-\gamma) \lambda S_{m} \tag{17}$$
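As a rough illustration of the momentum update, the sketch below applies one step to a single layer. Several readings are assumptions rather than statements from the paper: $S_m$ is taken to be the layer's sensitivity (error gradient) vector, $a_{m-1}$ the output of the preceding layer, the superscript $G$ a transpose, and $\lambda$ the learning rate. The function name `momentum_step` and the default values are hypothetical.

```python
import numpy as np

def momentum_step(prev_dV, prev_db, s, a_prev, lam=0.1, gamma=0.8):
    """One momentum update for a single layer.

    prev_dV, prev_db: corrections from step k-1
    s: assumed sensitivity vector S_m of the layer
    a_prev: assumed output a_{m-1} of the preceding layer
    """
    grad_V = np.outer(s, a_prev)  # S_m (a_{m-1})^G, reading G as a transpose
    dV = gamma * prev_dV - (1.0 - gamma) * lam * grad_V
    db = gamma * prev_db - (1.0 - gamma) * lam * s
    return dV, db
```

With `gamma=0` the function reduces to the plain steepest-descent step, and with a constant gradient the correction settles at that same value, which is the smoothing behaviour the momentum term is meant to provide.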
The improved BP neural network structure is seen in 
Figure 7.
 
Figure 7: Combined BP neural network 
The data mining process for college multimedia teaching resources using the improved combined BP neural network is shown in Figure 8.
 
Figure 8: The BP neural network data mining process 
[Diagram labels for Figure 7: sample input; fuzzy clustering; BP neural network predictors 1 to n; predictive output]
Input: untrained BP neural network, characteristics of college multimedia teaching resources, initial learning rate, momentum coefficient $\gamma$, learning factor and correction threshold.
Output: mining results of multimedia teaching resources in colleges and universities.
Step 1: Set the initial weight and offset values, and set $k = 0$;
Step 2: Select an input vector and a target output vector;
Step 3: Compute the output of each unit of the hidden layer and the output layer;
Step 4: Compute the mean square error $E$ between the expected output and the actual output;
Step 5: Judge whether the error meets the requirement. If it does, output the data mining results; if not, continue by calculating the weight gradient;
Step 6: Complete the learning correction of the weights and offsets;
Step 7: Calculate the updated mean square error $E$;
Step 8: Determine whether the mean square error has increased. If not, the weight update is accepted and the non-zero value of the momentum coefficient is restored. If it has, judge whether the increase is within the correction threshold: if yes, the weight update is accepted and the non-zero value of the momentum coefficient is restored; if not, the weight update is cancelled;
Step 9: Set $k = k + 1$, then select an input vector and a target output vector again.
This completes the mining of college multimedia resources.
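The accept-or-cancel logic of Steps 5 to 8 can be sketched as a small training loop. This is a hedged illustration under stated assumptions, not the paper's code: the three-layer network uses a tanh hidden layer and a linear output, training is full-batch, and the names and values `zeta` (correction threshold), `rho` (learning rate decrease factor), `eta` (learning rate increase factor) and `goal` are hypothetical choices. Setting the momentum to zero after a cancelled update is also an assumption based on the common variable-learning-rate heuristic.

```python
import numpy as np

def mse_and_grads(params, x, t):
    """Forward pass plus gradients of the mean square error."""
    W1, b1, W2, b2 = params
    h = np.tanh(x @ W1 + b1)          # hidden layer output (Step 3)
    y = h @ W2 + b2                   # linear output layer (Step 3)
    err = y - t
    mse = float(np.mean(err ** 2))    # mean square error E (Step 4)
    g_y = 2.0 * err / err.size
    g_h = (g_y @ W2.T) * (1.0 - h ** 2)
    return mse, [x.T @ g_h, g_h.sum(axis=0), h.T @ g_y, g_y.sum(axis=0)]

def train(x, t, hidden=4, lam=0.05, gamma=0.8, zeta=0.04, rho=0.7,
          eta=1.05, goal=1e-3, max_epochs=500, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1: initial weights and offsets.
    params = [rng.normal(0, 0.5, (x.shape[1], hidden)), np.zeros(hidden),
              rng.normal(0, 0.5, (hidden, t.shape[1])), np.zeros(t.shape[1])]
    deltas = [np.zeros_like(p) for p in params]
    g = gamma
    mse, grads = mse_and_grads(params, x, t)
    history = [mse]
    for _ in range(max_epochs):
        if mse <= goal:               # Step 5: error meets the requirement
            break
        # Step 6: momentum-based correction of weights and offsets.
        new_deltas = [g * d - (1.0 - g) * lam * gr
                      for d, gr in zip(deltas, grads)]
        trial = [p + d for p, d in zip(params, new_deltas)]
        new_mse, new_grads = mse_and_grads(trial, x, t)   # Step 7
        # Step 8: cancel the update if the error grew beyond the threshold.
        if new_mse > mse * (1.0 + zeta):
            lam *= rho                # assumed: shrink the learning rate
            g = 0.0                   # assumed: switch momentum off
            continue
        if new_mse < mse:
            lam *= eta                # assumed: grow the learning rate
        g = gamma                     # restore the non-zero momentum value
        params, deltas, mse, grads = trial, new_deltas, new_mse, new_grads
        history.append(mse)
    return params, history
```

For example, with `x = np.linspace(-1, 1, 20).reshape(-1, 1)` and `t = x ** 2`, `train(x, t)` returns the fitted parameters and the list of accepted error values, in which no accepted step raises the error by more than the threshold.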
3 Experimental analyses 
3.1 Experimental objects 
To conduct intelligent mining of multimedia teaching resources in colleges and universities, the search engine developed by Baidu is used to collect data. Baidu Search is the world's leading Chinese search engine. Baidu was founded in January 2000 by Li Yanhong and Xu Yong in Zhongguancun, Beijing, and is committed to providing people with "simple and reliable" access to information. Its search service is built on its core "hyperlink analysis" technology.
Baidu has one of the largest databases of Chinese web pages in the world. As of 2010, it had indexed over 20 billion Chinese web pages, and the number grows by tens of millions every day. In addition, Baidu's servers, which are distributed throughout China, can return search results to domestic users directly from the nearest server, giving them very fast search and transmission speeds.
3.2 Experimental data 
A summary of related works is shown in Table 1. 
Table 1: Summary of related works
| Contrast index | The method in this paper | Reference [13] method | Reference [14] method | Reference [15] method |
|---|---|---|---|---|
| Correlation data | Intelligent mining performance of multimedia teaching resources in colleges and universities | Intelligent mining performance of multimedia teaching resources in colleges and universities | Intelligent mining performance of multimedia teaching resources in colleges and universities | Intelligent mining performance of multimedia teaching resources in colleges and universities |
| Method | Inputs the extracted features into the improved combined BP neural network and outputs the intelligent mining results of multimedia resources in colleges and universities. | Based on classical methods, scientific data mining performs computational estimation and automates the learning strategies of scientists. | Classification process and learning process | Uses a deep hash algorithm to realize the retrieval of multi-view attribute coding image network teaching resources |
| Research results | Improves the recall and accuracy of data mining and realizes classified mining of multimedia teaching resources in colleges and universities. | Strikes a good balance between refinement and simplicity. | Using performance metrics and goodness of fit related to accuracy, completes the mining of programming teaching resources and the evaluation of students' programming ability | Effectively completes the image retrieval and mining of teaching resources. |
To verify the effect of the focused crawler data collection method adopted in this paper, the amount of college multimedia teaching resource data it collects is compared with that of an ordinary crawler over the same period. The results are shown in Figure 9.
 
Figure 9: Comparison of data acquisition of two crawlers 
As can be seen from Figure 9, the focused crawler collects multimedia educational resources in colleges and universities much more efficiently than the ordinary crawler. The focused crawler not only selects topics when crawling data but also applies analysis and evaluation methods, so it achieves both a larger collection amount of college multimedia educational resources and a higher data correlation than the ordinary crawler.
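The idea of topic selection combined with URL sorting can be illustrated with a toy priority-queue crawler. This is a minimal sketch under assumptions, not the crawler used in the experiments: the "web" is a hypothetical in-memory dictionary, relevance is a simple keyword-overlap score, and a link inherits the relevance of the page that contains it as its priority. The names `relevance`, `focused_crawl`, `threshold` and `limit` are illustrative.

```python
import heapq

def relevance(text, topic_words):
    """Fraction of topic keywords that appear in the page text."""
    words = set(text.lower().split())
    return len(words & topic_words) / len(topic_words)

def focused_crawl(pages, seed, topic_words, threshold=0.3, limit=10):
    """pages: hypothetical in-memory web, url -> (text, outgoing links)."""
    frontier = [(-1.0, seed)]   # max-heap emulated with negated priority
    seen = {seed}
    collected = []
    while frontier and len(collected) < limit:
        _, url = heapq.heappop(frontier)
        text, links = pages[url]
        score = relevance(text, topic_words)
        if score < threshold:
            continue            # off-topic page: skip it and its links
        collected.append(url)
        for link in links:
            if link in pages and link not in seen:
                seen.add(link)
                # URL sorting: the link's priority is its parent's relevance.
                heapq.heappush(frontier, (-score, link))
    return collected

pages = {
    "a": ("multimedia teaching resources course", ["b", "c"]),
    "b": ("sports news today", ["d"]),
    "c": ("teaching video course resources", ["d"]),
    "d": ("multimedia course video", []),
}
topic = {"multimedia", "teaching", "course", "video", "resources"}
print(focused_crawl(pages, "a", topic))  # ['a', 'c', 'd']
```

An ordinary breadth-first crawler would also have stored page "b"; the focused version discards it because none of the topic keywords appear in its text.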
To verify the data processing results of this method, word frequency statistics are computed for the college multimedia teaching resource data collected by the web crawler. The results are presented in Figure 10.
 
Figure 10: Reuters-21578 data set word frequency acquisition results
[Figure 10 plot: proportion of same-frequency words (%) for the categories Acq, Crude, Earn, Grain and Interest; series NTIF1, NTIF2 and NTIFn; data labels 58.33, 62.73, 54.34, 56.73, 64.43; 16.46, 18.03, 17.91, 20.06, 15.5; 25.21, 19.24, 25.75, 29.21, 20.07]
[Figure 9 plot: data collection quantity (pieces, 0 to 2500) against time (minutes, 10 to 60); series: focused crawler, common crawler]
As seen from Figure 10, after word frequency statistics, NTIF1 accounts for the highest proportion of word frequency, more than 50% and up to 64.43%. Its relevance is low, so deleting it can significantly decrease data redundancy and enhance the data preprocessing effect of college multimedia teaching resources.
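The word frequency statistics can be sketched with `collections.Counter`. The interpretation used here is an assumption: NTIF1, NTIF2 and so on are read as the share of distinct words that occur exactly 1, 2, ... times, and the deletion step is read as dropping the words in the frequency-1 group. The function names are hypothetical.

```python
from collections import Counter

def same_frequency_proportions(tokens):
    """Share of distinct words that occur exactly k times, for each k."""
    counts = Counter(tokens)
    by_freq = Counter(counts.values())   # k -> number of distinct words
    vocab = len(counts)
    return {k: n / vocab for k, n in sorted(by_freq.items())}

def drop_frequency(tokens, k=1):
    """Remove every token whose word occurs exactly k times."""
    counts = Counter(tokens)
    return [t for t in tokens if counts[t] != k]

tokens = "data mining data course video course data lecture".split()
print(same_frequency_proportions(tokens))  # {1: 0.6, 2: 0.2, 3: 0.2}
print(drop_frequency(tokens))  # ['data', 'data', 'course', 'course', 'data']
```

In this toy input, three of the five distinct words occur only once, so removing that group shrinks the vocabulary from five words to two.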
To demonstrate the data processing effect of this method, data features are extracted from both the original college multimedia teaching resources and the processed college multimedia teaching resources. The feature extraction results are provided in Table 2.
Table 2: Feature extraction results of two groups of data
| Data | Number of BNS features | Number of Odds features |
|---|---|---|
| Unprocessed data | 6424 | 5276 |
| Processed data | 2486 | 2057 |
 
It can be seen from Table 2 that a large number of features are obtained from the original, unprocessed college multimedia teaching data after feature extraction using the method in this paper. Because the data has not been processed, it contains a large amount of noise, so the extracted features contain a large amount of redundancy and many useless features. In contrast, the processed college multimedia teaching resources have undergone word segmentation and word frequency statistics, and the low-frequency words with low correlation have been removed, so the extracted features are efficient and accurate.
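The size of the reduction can be read directly from Table 2. The short calculation below uses only the four counts reported there; the helper name `reduction_percent` is illustrative.

```python
def reduction_percent(before, after):
    """Percentage of features removed by preprocessing."""
    return 100.0 * (before - after) / before

bns = reduction_percent(6424, 2486)
odds = reduction_percent(5276, 2057)
print(f"BNS: {bns:.2f}%  Odds: {odds:.2f}%")  # BNS: 61.30%  Odds: 61.01%
```

Both feature types shrink by roughly 61% after preprocessing.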
To verify the effect of feature extraction in this method, the BNS features, the Odds features and their combination are each input into the improved combined BP neural network to conduct data mining experiments. The recall rate, accuracy rate and running time are used to evaluate the data mining results for the three feature sets. The data mining results are presented in Table 3.
Table 3: Results of mining three kinds of characteristic data
| Index | Recall rate/% | Accuracy rate/% | Running time/s |
|---|---|---|---|
| BNS feature | 53.82 | 87.47 | 637 |
| Odds feature | 50.26 | 79.86 | 406 |
| BNS+Odds feature | 78.67 | 97.53 | 512 |
 
Table 3 shows that when only one feature is used for data mining, the accuracy and recall rates are lower than when both features are used. Although data mining with the Odds features has the shortest running time, it sacrifices recall and accuracy. Therefore, the dual feature extraction method used in this paper is an effective feature extraction method for college multimedia teaching resource mining.
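The gain from combining the two features can be quantified from the values in Table 3. The snippet below is a simple arithmetic check in percentage points; the dictionary layout and the helper name `gain_over` are illustrative.

```python
results = {
    "BNS": {"recall": 53.82, "accuracy": 87.47, "time_s": 637},
    "Odds": {"recall": 50.26, "accuracy": 79.86, "time_s": 406},
    "BNS+Odds": {"recall": 78.67, "accuracy": 97.53, "time_s": 512},
}

def gain_over(single, metric):
    """Percentage-point gain of the combined features over one feature."""
    return round(results["BNS+Odds"][metric] - results[single][metric], 2)

for single in ("BNS", "Odds"):
    print(single, gain_over(single, "recall"), gain_over(single, "accuracy"))
# BNS 24.85 10.06
# Odds 28.41 17.67
```

The combined features raise recall by about 25 to 28 percentage points and accuracy by about 10 to 18 percentage points, while the running time of 512 s lies between the two single-feature runs.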
The training effects of the combined BP neural network before and after the improvement are compared to analyze the data mining results of this method. The training results are shown in Figure 11.
 
Figure 11: Improved BP neural network training results 
[Figure 11 plot: error (0.0 to 1.0) against training times (0 to 250); series: combined BP neural network, improved combined BP neural network, goal]
As can be seen from Figure 11, the error of the combined BP neural network without improvement converges slowly: it only reaches the target after 190 training iterations, and falls below the target value only with further training. In contrast, the improved combined BP neural network reaches the predetermined target error after only about 50 training iterations, and with continued training its error falls far below the target. Its error after about 100 training iterations is lower than that of the unimproved combined BP neural network after more than 200 iterations. The experimental results show that the improved combined BP neural network has a clear training advantage and can obtain the data mining results of multimedia educational resources in colleges and universities more quickly.
To verify the practical effect of the proposed method, it is used to collect college multimedia education resource data and conduct data mining. The data mining results are shown in Table 4.
Table 4: Partial data mining results
| Category | Index | ID | Multimedia teaching resources |
|---|---|---|---|
| Engineering | Computer Science and Technology | J01 | Parallel programming |
| Engineering | Computer Science and Technology | J02 | Algorithm (II) |
| Engineering | Civil engineering | T01 | Engineering Problems (II) |
| Engineering | Civil engineering | T02 | Civil engineering analysis |
| Engineering | Electrical engineering | E01 | Primary transistor circuit |
| Engineering | Electrical engineering | E02 | Introduction to Electrical Engineering |
| Science | Organic chemistry | C01 | Organic Chemistry (I) |
| Science | Organic chemistry | C02 | Organic synthesis method |
| Science | Applied mathematics | M01 | Probability and engineering applications |
| Science | Applied mathematics | M02 | Complex Analysis (II) |
| Science | Theoretical physics | P01 | Basis of quantum information |
| Science | Theoretical physics | P02 | Intermediate Quantum Mechanics (II) |
| Economics | Theoretical economics | WE01 | Game theory |
| Economics | Theoretical economics | WE02 | Mathematical tools for economists |
| Economics | Theoretical economics | WE03 | Macroeconomic Theory (II) |
 
It can be seen from Table 4 that this method can effectively mine multimedia teaching resources in colleges and universities by classification, clearly shows the classification mining results for the various disciplines, and requires a short data mining time.
To sum up, the intelligent mining method for multimedia teaching resources in colleges and universities can improve the effect and quality of education and teaching through four components: a focused crawler that collects resources efficiently, preprocessing that eliminates redundancy and improves data processing efficiency, accurate feature extraction, and effective classification. The improved combined BP neural network trains quickly, obtains mining results faster, and shows the classification results of multimedia teaching resources, which provides an important reference for teaching management and teaching improvement in colleges and universities.
4 Conclusion 
The significance of studying the intelligent mining method of multimedia teaching resources in colleges and universities lies in improving the effect and quality of education and teaching. Through intelligent mining, the following results are obtained:
(1) The focused crawler collects multimedia educational resources in colleges and universities much more efficiently than the ordinary crawler.
(2) NTIF1, which accounts for the highest proportion of word frequency, has a low degree of correlation. After deleting it, data redundancy is clearly reduced and a better data preprocessing effect is obtained for multimedia teaching resources in colleges and universities.
(3) After word segmentation and word frequency statistics are applied to the multimedia teaching resources in colleges and universities and the low-frequency words with low correlation are removed, the extracted features are efficient and accurate.
(4) The double feature extraction method adopted in this paper is an effective feature extraction method for multimedia teaching resource mining in colleges and universities.
(5) The improved combined BP neural network reaches the predetermined target error after only about 50 training iterations, and with continued training its error falls far below the target. Its error after about 100 training iterations is lower than that of the combined BP neural network before improvement after more than 200 iterations, so it obtains the data mining results of multimedia educational resources in colleges and universities more quickly.
(6) This method can effectively classify and mine multimedia teaching resources in colleges and universities, clearly shows the classified mining results for the various disciplines, and requires a short data mining time.
Future research can further explore the following directions:
(1) Multimodal data mining: Combining text, 
images, audio, video and other diverse teaching resources, 
the association and interaction between different data 
modes are used in the mining process to improve the 
accuracy and comprehensiveness of mining results. 
(2) Emotional analysis and learning modeling: By 
digging out students' emotional expressions in the learning 
process, such as emotion and motivation, an emotional 
learning model is constructed, and the motivation and 
influencing factors behind learning are deeply explored to 
provide more accurate guidance and suggestions for 
teaching. 
(3) Mining of open education resources: With the 
increase of open education resources, such as MOOC 
courses and online learning platforms, how to effectively 
mine these resources to better serve the teaching needs is 
an important research direction. 
(4) Privacy protection and data security: When 
mining multimedia teaching resources, we should pay 
attention to students' privacy protection and data security, 
and formulate appropriate data processing and sharing 
norms to ensure the safety and reliability of the mining 
process. 
Availability of data and materials 
The datasets used in this paper are available from the 
corresponding author upon request.  
Conflicts of interest 
The authors declared that they have no conflicts of interest regarding this work.
Authorship contribution statement 
Zhu Xuan: Writing-Original draft preparation, 
Conceptualization, Supervision, Project administration. 
Qi Yue: Language review, Methodology, Software 
Declarations 
Not applicable 
References 
[1] A. Silik, M. Noori, W. A. Altabey, J. Dang, R. 
Ghiasi, and Z. Wu, “Optimum wavelet selection 
for nonparametric analysis toward structural 
health monitoring for processing big data from 
sensor network: A comparative study,” Struct 
Health Monit, vol. 21, no. 3, pp. 803–825, 2022. 
https://doi.org/10.1177/14759217211010261 
[2] B. Seidl, R. Schuhmacher, and C. Bueschl, 
“CPExtract, a software tool for the automated 
tracer-based pathway specific screening of 
secondary metabolites in LC-HRMS data,” Anal 
Chem, vol. 94, no. 8, pp. 3543–3552, 2022. 
https://doi.org/10.1021/acs.analchem.1c04530 
[3] S. C. Pal, D. Ruidas, A. Saha, A. R. M. T. Islam, 
and I. Chowdhuri, “Application of novel data-
mining technique-based nitrate concentration 
susceptibility prediction approach for coastal 
aquifers in India,” J Clean Prod, vol. 346, p. 
131205, 2022. 
https://doi.org/10.1016/j.jclepro.2022.131205 
[4] A. K. Sleiti, “Isobaric Expansion Engines 
Powered by Low‐Grade Heat—Working Fluid 
Performance and Selection Database for Power 
and Thermomechanical Refrigeration,” Energy 
Technology, vol. 8, no. 11, p. 2000613, 2020. 
https://doi.org/10.1002/ente.202000613 
[5] M. Das, S. K. Ghosh, V. M. Chowdary, P. Mitra, 
and S. Rijal, “Statistical and Machine Learning 
Models for Remote Sensing Data Mining—
Recent Advancements,” Remote Sensing, vol. 14, 
no. 8. MDPI, p. 1906, 2022. 
https://doi.org/10.3390/rs14081906 
[6] B. Robson, S. Boray, and J. Weisman, “Mining 
real-world high dimensional structured data in 
medicine and its use in decision support. Some 
different perspectives on unknowns, 
interdependency, and distinguishability,” Comput 
Biol Med, vol. 141, p. 105118, 2022. 
https://doi.org/10.1016/j.compbiomed.2021.105118
[7] A. C. Anadiotis et al., “Graph integration of 
structured, semistructured and unstructured data 
for data journalism,” Inf Syst, vol. 104, p. 101846, 
2022. https://doi.org/10.1016/j.is.2021.101846 
[8] A. Ghose, S. Singh, V. Kulaharia, L. Dokara, S. 
Maity, and S. Dey, “PySchedCL: Leveraging 
Concurrency in Heterogeneous Data-Parallel 
Systems,” IEEE Transactions on Computers, vol. 
71, no. 9, pp. 2234–2247, 2021. 
https://doi.org/10.1109/TC.2021.3125792 
[9] Y. Shin, J. Ahn, and D.-H. Im, “Join optimization 
for inverted index technique on relational database 
management systems,” Expert Syst Appl, vol. 198, 
p. 116956, 2022. 
https://doi.org/10.1016/j.eswa.2022.116956 
[10] B. Mounica and K. Lavanya, “Real time traffic 
prediction based on social media text data using 
deep learning,” Journal of Mobile Multimedia, pp. 
373–392, 2022. 
[11] S. Keshary, K. Bekiroglu, S. Seshadhri, and S. 
Srinivasan, “Multimedia Data-Based Artificial 
Pancreas for Type 2 Diabetes,” IEEE MultiMedia, 
vol. 29, no. 1, pp. 18–27, 2022. 
https://doi.org/10.1109/MMUL.2022.3154534 
[12] D. Shin, “Embodying algorithms, enactive 
artificial intelligence and the extended cognition: 
You can see as much as you know about 
algorithm,” J Inf Sci, vol. 49, no. 1, pp. 18–31, 
2023. 
https://doi.org/10.1177/0165551520985495 
[13] A. S. Varde, “Computational estimation by 
scientific data mining with classical methods to 
automate learning strategies of scientists,” ACM 
Transactions on Knowledge Discovery from Data 
(TKDD), vol. 16, no. 5, pp. 1–52, 2022. 
https://doi.org/10.1145/3502736 
[14] M. A. Marjan, M. P. Uddin, and M. Ibn Afjal, “An 
Educational Data Mining System For Predicting 
And Enhancing Tertiary Students’ Programming 
Skill,” Comput J, vol. 66, no. 5, pp. 1083–1101, 
2023. https://doi.org/10.1093/comjnl/bxab214 
[15] G. Zhao and J. Ding, “Image Network Teaching 
Resource Retrieval Algorithm Based on Deep 
Hash Algorithm,” Sci Program, vol. 2021, pp. 1–
7, 2021. https://doi.org/10.1155/2021/9683908 
[16] L. J. Sankpal and S. H. Patil, “Rider-rank 
algorithm-based feature extraction for Re-ranking 
the webpages in the search engine,” Comput J, 
vol. 63, no. 10, pp. 1479–1489, 2020. 
https://doi.org/10.1093/comjnl/bxaa032 
[17] I. Bifulco, S. Cirillo, C. Esposito, R. Guadagni, 
and G. Polese, “An intelligent system for focused 
crawling from Big Data sources,” Expert Syst 
Appl, vol. 184, p. 115560, 2021. 
https://doi.org/10.1016/j.eswa.2021.115560 
[18] S. Rajiv and C. Navaneethan, “Keyword weight 
optimization using gradient strategies in event 
focused web crawling,” Pattern Recognit Lett, 
vol. 142, pp. 3–10, 2021. 
https://doi.org/10.1016/j.patrec.2020.12.003 
[19] K. A. Apoorva and S. Sangeetha, “Analysis of 
uniform resource locator using boosting 
algorithms for forensic purpose,” Comput 
Commun, vol. 190, pp. 69–77, 2022. 
https://doi.org/10.1016/j.comcom.2022.04.002 
[20] D. Dia, G. Kahn, F. Labernia, Y. Loiseau, and O. 
Raynaud, “A closed sets based learning classifier 
for implicit authentication in web browsing,” 
Discrete Appl Math (1979), vol. 273, pp. 65–80, 
2020. https://doi.org/10.1016/j.dam.2018.11.016 
[21] G. De Marzo, F. S. Labini, and L. Pietronero, 
“Zipf’s law for cosmic structures: How large are 
the greatest structures in the universe?,” Astron 
Astrophys, vol. 651, p. A114, 2021. 
https://doi.org/10.1051/0004-6361/202141081 
[22] M. Huang, H. Ma, C. Ma, P. A. Garber, and P. 
Fan, “Male gibbon loud morning calls conform to 
Zipf’s law of brevity and Menzerath’s law: 
insights into the origin of human language,” Anim 
Behav, vol. 160, pp. 145–155, 2020. 
https://doi.org/10.1016/j.anbehav.2019.11.017 
[23] F. Xue and D. Połap, “Detail feature inpainting of 
art images in online educational videos based on 
double discrimination network,” Mobile Networks 
and Applications, pp. 1–14, 2023. 
https://doi.org/10.1007/s11036-023-02191-x 
[24] S. H. Syed and V. Muralidharan, “Feature 
extraction using Discrete Wavelet Transform for 
fault classification of planetary gearbox–A 
comparative study,” Applied Acoustics, vol. 188, 
p. 108572, 2022. 
https://doi.org/10.1016/j.apacoust.2021.108572 
[25] J. Peng and W. Yu, “The algorithm of current 
prediction based on multi-dimensional Long Short 
Term Memory networks,” Energy Reports, vol. 7, 
pp. 1114–1120, 2021. 
https://doi.org/10.1016/j.egyr.2021.09.158 
[26] Q.-T. Bui et al., “SFCM: a fuzzy clustering 
algorithm of extracting the shape information of 
data,” IEEE Transactions on Fuzzy Systems, vol. 
29, no. 1, pp. 75–89, 2020. 
https://doi.org/10.1109/TFUZZ.2020.3014662