https://doi.org/10.31449/inf.v48i10.5931 Informatica 48 (2024) 89–102 89 
The Application of Integrating Data Mining and IoT Management 
Technology in Enterprise Supply Chain Information Management 
Ling Gong 
Accounting Institute, Chongqing College of Finance and Economics, Chongqing 402160, China 
E-mail: cqyc20000219@126.com 
Keywords: data mining, apriori algorithm, internet of things management technology, enterprise supply chain 
information management, radio frequency identification 
Received: March 19, 2024 
The increasing use of the Internet of Things presents both challenges and opportunities for managing 
enterprise supply chains. This study proposes a method that integrates data mining and Internet of 
Things management technology for enterprise supply chain information management. By collecting 
and analyzing the large volumes of data generated by Internet of Things devices, the method provides 
real-time and accurate supply chain information, thereby helping enterprises achieve supply 
chain optimization and collaborative management. This study introduces the architecture and 
implementation of a supply chain information management system that integrates data mining 
and Internet of Things management, and verifies the effectiveness of the method through case analysis. 
The experiments showed that the average accuracy rate of the enterprise supply chain information 
management system was 93.14%, the average sensitivity was 91.05%, and the average specificity was 
91.89%. The average bit error rate was 7.06%, the average delay time was 0.84 s, and the average 
accuracy was 93.21%. These results indicate that the system performs well and that the method has 
practical value for improving supply chain efficiency and accuracy. 
Povzetek (Summary): The study combines data mining and Internet of Things (IoT) management 
technology for managing enterprise supply chain information. By analyzing the data generated by 
Internet of Things devices, the system provides real-time and accurate information on supply chain 
management.
1 Introduction 
With the intensification of global market competition, 
Enterprise Supply Chain Information Management 
(ESCIM) has become an important factor in enterprise 
competitiveness. However, ordinary Supply Chain 
Information Management (SC-IM) methods rely mainly on 
manual data collection and processing, and therefore suffer 
from inaccurate and delayed information [1]. The 
rise of Internet of Things (IoT) technology has brought 
new opportunities to SC-IM. IoT technology can collect 
real-time data from the various links in the supply chain 
through sensors and other devices, enabling real-time 
monitoring of the supply chain. Data Mining Techniques 
(DMT) can analyze these data and discover hidden 
patterns and correlations [2-3]. Applying DMT to IoT 
data can provide enterprises with more accurate and 
timely supply chain information, helping them make more 
reasonable decisions, reduce inventory costs, and 
improve supply chain efficiency [4-5]. Therefore, the 
integration and application of Data Mining and IoT 
Management (DM-IoTm) technology in ESCIM is of great 
significance: it can help enterprises better manage supply 
chain information, improve supply chain efficiency and 
collaboration, reduce costs, enhance competitiveness, and 
achieve sustainable development. The remainder of this 
paper is organized in four parts. Part 1 is a literature review that 
discusses and analyzes the current research status of data 
mining, IoT management technology, and ESCIM both 
domestically and internationally. Part 2 proposes an 
ESCIM system that integrates data mining and IoT. Part 3 
verifies the effectiveness and performance of the system 
through experiments. Part 4 summarizes the research 
findings. 
2 Related works 
In recent years, data mining and IoT technology have 
developed rapidly and have been widely applied in 
various fields. One method to address the high turnover 
rate in the education technology market is to predict the 
turnover rate. Kiguchi et al. proposed a customer churn 
prediction model based on decision trees and random 
forest data mining models. The proposed method was 
effective in determining and predicting customer churn in 
digital game learning [6]. In response to the widespread 
application of DMT in the financial services field, 
Plotnikova et al. proposed a standardized process for 
enterprise management data mining projects based on 
logistic regression. This method could meet the data 
analysis needs of financial enterprises [7]. Ganesh and 
Kalpana proposed an artificial intelligence-based model 
for analyzing social media data, which identifies and 
evaluates risk factors as the most important stage in 
supply chain risk management. The important role of data 
analysis in achieving accurate decision-making provided 
valuable insights into contemporary sustainable 
development issues [8]. To help university students 
improve their physical health, He et al. proposed an 
IoT-based physical exercise and health management 
system for college students. The IoT-based physical 
exercise framework could improve the students' 
physical condition [9]. Jiang proposed an engineering cost 
accounting system that combines chaotic data processing 
methods with IoT technology. This system had good data 
processing performance [10]. To address the issues of 
energy drift and deformation caused by environmental 
temperature and internal structure of prefabricated 
buildings, as well as high construction energy 
consumption costs, Wang and Jiang designed an energy 
management and control system for prefabricated 
buildings based on IoT. The root mean square error 
of energy management in this system was 
shown to converge well [11]. 
The rapid growth of IoT, big data, and cloud computing 
has presented ESCIM with numerous challenges and 
opportunities. Understanding and studying the barriers to 
adopting an environmental impact assessment system in 
the supply chain is crucial for better management of the 
supply chain. Deepu and Ravi proposed a grey-based 
decision-making experiment and evaluation laboratory 
method. This study provided decision-makers with a key 
framework for adopting barriers in environmental impact 
information systems to achieve effective environmental 
impact information systems and better management [12]. 
The current cost management model was still stuck in 
ordinary management methods, lacking more intelligent 
big data analysis methods. Therefore, Mao and Chen 
proposed a series of practical operational methods for 
exploring supply chain culture dissemination enterprises 
using big data technology. This method could help 
e-commerce enterprises reduce supply chain management 
costs and obtain higher profit margins [13]. Because the 
enterprise resource planning system, after long-term 
operation, had become the platform and guarantee for 
completing management processes, Li and Wu proposed 
an enterprise logistics information management system. 
This system had important application value for 
supply chain enterprises and could effectively 
improve customer satisfaction by 86.7% [14]. Due 
to the negative correlation between the scale of risk 
transmission and operational robustness and flexibility, 
market information completeness, and immunity rate, 
Wang et al. proposed using an epidemic model to study 
supply chain risk transmission. This study was of great 
theoretical and practical significance for this field [15]. In 
a dual channel supply chain, manufacturers sold products 
through their online channels and offline retailers at 
optimal retail prices and service levels. In response to this 
issue, Yang et al. proposed incorporating the reference 
price effect into the hotel's utility function to determine 
the competitive relationship between retail prices and 
service levels. Numerical examples have verified the 
feasibility of the theoretical results [16]. 
Table 1 summarizes the main findings and limitations of the reviewed literature. 

Table 1: Summary of the main findings and limitations of the review literature 

| Author | Key findings | Limitations |
|---|---|---|
| Kiguchi et al. [6] | Decision trees and random forest models are effective for customer churn prediction in digital game learning | There is no broader application involved |
| Plotnikova et al. [7] | Logistic regression model can meet the data analysis needs of financial enterprises only in the field of financial services | No other industries are considered |
| Ganesh and Kalpana [8] | The value of AI-based social media data analytics for supply chain risk management is limited to social media data | Not all risk factors have been fully assessed |
| He et al. [9] | IoT technology can improve the physical health level of college students | The broader application of health management is not considered |
| Jiang [10] | The engineering cost accounting system combining chaotic data processing and IoT technology has a good effect | Other aspects of supply chain management are not covered |
| Wang and Jiang [11] | The IoT technology has a good effect on the energy management of prefabricated buildings | Broader supply chain management issues are not covered |
| Deepu and Ravi [12] | Provide a structural framework for barriers to the adoption of environmental impact information systems | Focus only on the environmental impact assessment system |
| Mao and Chen [13] | Big data technology can reduce supply chain management costs | Slow data processing |
| Li and Wu [14] | ERP-based logistics information management system can improve customer satisfaction | A data security problem exists |
| Wang et al. [15] | Epidemic model has theoretical and practical significance for supply chain risk management | Lack of practical application cases |
| Yang et al. [16] | The competitive relationship between retail price and service level can be determined by considering the reference price effect | Weak privacy protection ability |

Table 1 summarizes the main findings of the literature 
and highlights the limitations of each study, 
demonstrating the need for further research within the 
existing body of knowledge. The table shows that 
DM-IoTm technology has significant potential for 
application in ESCIM. However, this type of ESCIM also 
faces some challenges, such as data security, data 
processing, and privacy protection. Therefore, this study 
further integrates DM-IoTm technology and explores its 
application in ESCIM in depth. 
3 Construction of ESCIM system 
integrating data mining and IoT 
This study uses the Apriori algorithm to mine 
ESCIM data, manages supply chain information through 
IoT technology, and integrates the two (DM-IoTm) to 
construct an ESCIM system that achieves more efficient 
SC-IM. 
 
 
 
3.1 ESCIM data mining based on Apriori 
algorithm 
In enterprise supply chain management, DMT can help 
companies discover hidden patterns and 
correlations in the supply chain. The Apriori algorithm is 
a commonly used Association Rule Mining (ARM) 
algorithm that can discover Frequent Itemsets (FI) and 
Association Rules (AR) from large volumes of transaction data. 
Therefore, applying the Apriori algorithm to ESCIM can 
help enterprises discover correlations in the supply chain 
and guide enterprise management and decision-making. ARM 
is a DMT used to discover meaningful associations in 
datasets [17-18]. The strength of an AR can be measured by 
two metrics: support and confidence. Support is 
the percentage of transactions that contain both itemsets X and 
Y. Confidence is the proportion of transactions 
containing itemset X that also contain itemset Y. By 
setting thresholds for support and confidence, AR with 
sufficient strength can be filtered out. The principle of 
ARM is shown in Figure 1. 
 
Figure 1: Principles of association rules mining (flow diagram with the elements: enter the transaction record database; find all frequent item collections; association rule; output a collection of related rules; the user) 
 
Common ARM algorithms include Apriori and the 
frequent pattern growth algorithm. Apriori mainly 
has two stages. Stage 1 generates Candidate Itemsets 
(CI), and Stage 2 finds the FI. In Stage 1, the 
Apriori algorithm generates CI through a layer-by-layer 
search. First, the set of frequent 1-itemsets is identified and 
denoted as L1. Then, L1 is used to generate a candidate set, 
from which the frequent 2-itemsets, denoted as L2, are identified, 
and the process continues until no further FI can be found. In Stage 2, the 
Apriori algorithm scans the database and calculates the 
support of each CI to identify the FI. The Apriori algorithm is 
thus a method based on FI, which finds FI and AR by 
generating CI and calculating support. After mining 
all the FI from database D, it is relatively easy to obtain the 
corresponding AR, that is, to generate strong AR that 
meet the minimum support and confidence. The support 
of an AR is calculated by equation (1). 
$$\mathrm{Support}(A \rightarrow B) = P(A, B) \tag{1}$$
The confidence of an AR is calculated by equation (2). 

$$\mathrm{Confidence}(A \rightarrow B) = P(B \mid A) = \frac{\mathrm{support\_count}(A \cup B)}{\mathrm{support\_count}(A)} \tag{2}$$
In equation (2), $\mathrm{support\_count}(A \cup B)$ represents the 
number of transaction records containing itemset $A \cup B$, and 
$\mathrm{support\_count}(A)$ represents the number of 
transaction records containing itemset $A$. Equation (3) defines 
the lift of an AR as the ratio of the confidence of the rule to the 
support of the consequent. It compares the probability of $B$ 
occurring under the condition that $A$ occurs with the 
probability of $B$ occurring overall. 
 
$$\mathrm{Lift}(A \rightarrow B) = \frac{P(A \mid B)}{P(A)} = \frac{\mathrm{Confidence}(A \rightarrow B)}{P(B)} \tag{3}$$
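
As a minimal sketch of equations (1) to (3), the following Python snippet computes support, confidence, and lift directly from a list of transactions. The toy transactions and item names are hypothetical and are not taken from the study's dataset. Confidence is computed as the share of transactions containing the antecedent that also contain the consequent.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support_count(A u B) / support_count(A), as in equation (2)."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence(A -> B) / P(B), as in equation (3)."""
    return (confidence(transactions, antecedent, consequent)
            / support(transactions, consequent))


# Hypothetical toy data: each transaction is a set of item names.
TRANSACTIONS = [
    frozenset({"pallet", "label"}),
    frozenset({"pallet", "label", "wrap"}),
    frozenset({"pallet", "wrap"}),
    frozenset({"label"}),
]

SUPPORT_PL = support(TRANSACTIONS, {"pallet", "label"})
CONF_PL = confidence(TRANSACTIONS, {"pallet"}, {"label"})
LIFT_PL = lift(TRANSACTIONS, {"pallet"}, {"label"})
```

A lift below 1, as in this toy example, would suggest that the antecedent makes the consequent less likely than its baseline frequency.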
The conditional probabilities in equations (2) and (3) are calculated 
from the support counts of the itemsets. For each FI $l$, 
all non-empty subsets of $l$ are generated. Each non-empty 
subset $s$ of $l$ yields an output rule if it satisfies equation 
(4). 

$$\frac{\mathrm{support\_count}(l)}{\mathrm{support\_count}(s)} \geq \mathit{minconf} \tag{4}$$

In equation (4), $\mathit{minconf}$ is the minimum confidence 
threshold. If the non-empty subset $s$ satisfies equation 
(4), the rule $s \rightarrow (l - s)$ is output. The mining of AR is based 
on the support and confidence of the itemset, as displayed 
in Figure 2. 
 
Figure 2: Association rule and output rule (flow diagram with the elements: data set D; frequent item set; association rule; strong rule; users; thresholds minsup and minconf) 
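
The rule-generation condition in equation (4) can be illustrated with a short Python sketch. It assumes that the support counts of a frequent itemset and of its subsets are already available in a dictionary; the counts and the threshold of 0.8 below are hypothetical. Only proper non-empty subsets are used as antecedents so that the consequent $l - s$ is never empty.

```python
from itertools import combinations


def rules_from_itemset(itemset, support_count, min_conf):
    """Return (antecedent, consequent, confidence) for every rule
    s -> (l - s) whose confidence meets `min_conf` (equation (4))."""
    l = frozenset(itemset)
    rules = []
    for size in range(1, len(l)):  # proper, non-empty subsets only
        for subset in combinations(sorted(l), size):
            s = frozenset(subset)
            conf = support_count[l] / support_count[s]
            if conf >= min_conf:
                rules.append((s, l - s, conf))
    return rules


# Hypothetical support counts for one frequent itemset and its subsets.
SUPPORT_COUNT = {
    frozenset({"A"}): 6,
    frozenset({"B"}): 7,
    frozenset({"A", "B"}): 5,
}

RULES = rules_from_itemset({"A", "B"}, SUPPORT_COUNT, min_conf=0.8)
```

With these counts, only the rule A -> B (confidence 5/6) passes the threshold; B -> A (confidence 5/7) is discarded.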
 
In Figure 2, dataset D is the input data. The minimum 
support is set to obtain the frequent item sets. Based on 
these results and the confidence setting, the strong AR that meet 
the requirements are inferred and aggregated for validation, 
completing the mining process. In this process, the 
parameters can be set according to actual needs to guide 
the mining, and the values of both thresholds can be 
adjusted until the user is satisfied. First, the entire database 
is scanned to obtain the first set of candidates. The next step 
is to search for the frequent $(k+1)$-itemsets $L_{k+1}$: the 
frequent $k$-itemsets $L_k$ are first joined with themselves to 
generate the candidate $(k+1)$-itemsets $C_{k+1}$, with the items 
in each itemset sorted. If the first $(k-1)$ items of two itemsets 
are the same, the itemsets are joined, as 
shown in equation (5). 

$$(I_1[1] = I_2[1]) \wedge (I_1[2] = I_2[2]) \wedge \cdots \wedge (I_1[k-1] = I_2[k-1]) \wedge (I_1[k] < I_2[k]) \tag{5}$$

In equation (5), $I_1$ and $I_2$ are itemsets in $L_k$; when the 
condition holds, $I_1$ and $I_2$ can be joined. The Apriori 
process is shown in Figure 3. 
 
Figure 3: Process of the Apriori algorithm (flowchart with the elements: initiate; define min_sup and let k = 1; scan data set; computational support; generate frequent k-item sets; decision "frequent k-item set is empty" (Y/N); merging frequent k-item sets produces candidate (k+1)-item sets; pruning; let k = k + 1; end and output the result) 
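
The join condition of equation (5), followed by the pruning step that appears in Figure 3, can be sketched as follows. This is an illustrative sketch rather than the study's implementation: itemsets are represented as sorted tuples, and the sample frequent 2-itemsets are hypothetical.

```python
from itertools import combinations


def generate_candidates(frequent_k):
    """Join frequent k-itemsets that share their first k-1 items,
    then prune candidates that have an infrequent k-subset."""
    prev = sorted(tuple(sorted(s)) for s in frequent_k)
    prev_set = set(prev)
    candidates = []
    for i, a in enumerate(prev):
        for b in prev[i + 1:]:
            # Join condition of equation (5): equal prefix, a[k] < b[k].
            if a[:-1] == b[:-1] and a[-1] < b[-1]:
                cand = a + (b[-1],)
                # Prune: every k-subset of the candidate must be frequent.
                if all(sub in prev_set
                       for sub in combinations(cand, len(a))):
                    candidates.append(cand)
    return candidates


# Hypothetical frequent 2-itemsets.
L2 = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]
C3 = generate_candidates(L2)
```

Here ("B", "C") and ("B", "D") satisfy the join condition, but the resulting candidate is pruned because ("C", "D") is not frequent, so only ("A", "B", "C") remains.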
 
Apriori is an ARM algorithm used to discover FI in 
data, generate AR based on the FI, and calculate their support. 
The AR are filtered according to the support and confidence 
thresholds, and the FI and AR are output. The Apriori algorithm requires 
multiple scans of the dataset and a large amount of 
computation, but it can effectively discover FI and AR in 
the data. 
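
The level-wise loop summarized above can be sketched compactly in Python. This is a simplified illustration under stated assumptions, not the system's code: candidates are formed as unions of two frequent itemsets rather than by the sorted-prefix join, the database is rescanned once per level, and the sample transactions and the 0.6 threshold are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def frequent(candidates):
        counts = defaultdict(int)
        for t in transactions:  # one database scan per level
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        return {c: k / n for c, k in counts.items() if k / n >= min_support}

    level = frequent({frozenset([i]) for t in transactions for i in t})
    result = dict(level)
    k = 1
    while level:
        k += 1
        # Candidate k-itemsets: unions of two frequent (k-1)-itemsets.
        candidates = {a | b for a, b in combinations(list(level), 2)
                      if len(a | b) == k}
        # Prune candidates that contain an infrequent (k-1)-subset.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level
                             for s in combinations(c, k - 1))}
        level = frequent(candidates)
        result.update(level)
    return result


# Hypothetical transactions.
SAMPLE = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
FREQUENT = apriori(SAMPLE, min_support=0.6)
```

On this sample, all single items and all pairs are frequent, while {A, B, C} appears in only 2 of 5 transactions and is rejected, which ends the loop.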
 
 
 
 
 
3.2 ESCIM system based on IoT 
management technology 
The integration of the IoT technology in the ESCIM 
system has brought unprecedented opportunities for 
enterprises, enabling every link of the supply chain to 
obtain real-time monitoring and data transmission [19]. 
The introduction of this technology not only enhances the 
transparency of the supply chain, but also greatly 
improves its efficiency and synergy [20]. The ESCIM 
based on the IoT management technology is shown in 
Figure 4. 
 
Figure 4 content (stages and their descriptions): 
Data acquisition and transmission: IoT devices collect information about items in the supply chain, such as temperature, humidity, and location, and transmit the data to servers in real time via a wireless network. 
Data mining and analysis: Big data technology and data mining algorithms are used to analyze the collected supply chain data and find hidden rules and correlations. 
Supply chain information visualization: The analysis results are displayed to supply chain managers in visual form to facilitate real-time understanding of supply chain conditions and decision-making. 
Intelligent decision support: Based on the data mining results, the system provides intelligent decision support for enterprises, such as inventory management, logistics optimization, and demand forecasting. 
IoT device deployment: IoT devices, such as sensors and RFID, are deployed in all aspects of the supply chain to achieve real-time monitoring and tracking of supply chain items. 

Figure 4: Enterprise supply chain information management 
 
In an ESCIM system, IoT devices, such as Radio 
Frequency Identification (RFID) readers and sensors, 
continuously collect data, including crucial information such 
as the product's location, temperature, humidity, and 
vibration. These data streams are first aggregated in an 
intermediate data acquisition layer and then cleaned and 
preprocessed to remove noise and outliers in preparation for 
subsequent data mining and analysis. 
With the help of IoT technology, the architecture of the ESCIM 
system has become more intelligent and modular. The whole 
system can be roughly divided into three modules: the perception 
layer, the network layer, and the application layer. The perception 
layer is mainly composed of various IoT devices, such as RFID 
tags and sensors, which collect supply chain data in real time. 
The network layer is mainly responsible for data transmission: the 
data collected by the perception layer is transmitted to the 
data center or cloud platform through various 
communication technologies. The application layer is the 
core of the ESCIM system; it receives data from the 
network layer and provides valuable insights and decision 
support to the enterprise through data mining, analysis, 
and visualization tools. 
RFID technology in the perception layer uses radio signals to 
automatically identify target objects and read the relevant 
data. RFID offers a long reading distance, strong penetration, 
resistance to interference, high efficiency, and a large 
information capacity. It can recognize a 
single specific object and process multiple labels at the 
same time. Anti-collision algorithms for RFID tags are 
mainly used to resolve the conflicts that occur 
when multiple RFID tags send signals simultaneously. 
This study proposes an anti-collision algorithm based on 
label parallel recognition technology, which combines 
Pseudo ID Logistic Code (PILD) and Deterministic Finite 
State Automata (DFSA). By combining 
pseudo-ID code grouping with tag parallel identification, 
the algorithm increases the throughput of RFID tag 
identification. As a result, it improves the 
transmission performance of RFID multi-tag 
identification systems while occupying the same 
transmission band, and it effectively reduces the probability 
of a single label being starved by repeated 
collisions. The steps of the PILD algorithm are shown in 
Figure 5. 
 
Figure 5: Steps of PILD algorithm (flowchart with the elements: initiate; predict the number of labels n; the reader sends n to the tags; each label randomly generates a number from 1 to n as its pseudo-ID code; a = 0, i = 0; the reader sends i to the tag; decision "Is there a tag response?"; parallel identification; tag collision; a = a + 1; a = a + (m/F)(1 − 1/M)^(m/F − 1); i = i + 1; decision "a < n && i ≤ n?"; finish) 
 
Assume that the reader generates a series of pseudo-ID codes 
based on the estimated number of labels, and that each label randomly 
selects one of them, within the range of values, as its own 
recognition flag. The probability that a given pseudo-ID code is 
selected simultaneously by $m$ labels is given by equation (6). 

$$P(L, n, m) = C_n^m \left(\frac{1}{L}\right)^{m} \left(1 - \frac{1}{L}\right)^{n-m} \tag{6}$$

In equation (6), $L$ represents the number of pseudo-ID 
codes generated by the reader, $n$ represents the total 
number of tags to be identified within the recognition 
range, and $m$ is the number of tags that select the current 
pseudo-ID code. When the ratio of the expected and total 
number of pseudo-ID codes selected for a single label is 
taken to the limit, the result is calculated by equation (7). 

$$P(L, n, m) = C_n^m \left(\frac{1}{L}\right)^{m} \left(1 - \frac{1}{L}\right)^{n-m} \tag{7}$$
In equation (7), the number of successfully identified 
pseudo-ID codes reaches its maximum value. In practice, 
$n$ is very large, so the 1 can be 
ignored and $L = n$ is taken. Before using the Logistic-DFSA 
algorithm to recognize labels, the reader first 
estimates the total number of labels $n$ within the 
recognition range and then sends $n$ to the labels. 
The random number generator of each tag generates a 
number between 1 and $n$ as its pseudo-ID code. At this 
point, there are three scenarios for a pseudo-ID code. 
No label selects the pseudo-ID code, i.e., $m = 0$. 
Exactly one tag selects the pseudo-ID code, i.e., 
$m = 1$. Two or more labels select the 
pseudo-ID code, i.e., $m \geq 2$, in which case 
the collision seen in ordinary algorithms occurs. 
When $m = 0$, the probability of an empty ID 
code appearing during the recognition process is shown 
in equation (8). 

$$P_{(m=0)} = \left(1 - \frac{1}{L}\right)^{n} \tag{8}$$
When $m = 1$, the probability of successfully identifying the 
pseudo-ID code is shown in equation (9). 

$$P_{(m=1)} = C_n^1 \left(\frac{1}{L}\right)^{1} \left(1 - \frac{1}{L}\right)^{n-1} = \frac{n}{L} \left(1 - \frac{1}{L}\right)^{n-1} \tag{9}$$
When $m \geq 2$, the pseudo-ID code is the usual 
collision ID code, and the collision probability $P_m$ of the 
label when $m \geq 2$ is shown in equation (10). 

$$P_m = 1 - P_{(m=0)} - P_{(m=1)} = C_n^m \left(\frac{1}{n}\right)^{m} \left(1 - \frac{1}{n}\right)^{n-m} \tag{10}$$
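
To make equations (6), (8), and (9) concrete, the sketch below evaluates the probability that a pseudo-ID code is empty, selected by exactly one tag, or in collision. The collision probability is taken as the complement $1 - P_{(m=0)} - P_{(m=1)}$, following the first equality of equation (10). The values L = n = 100 are hypothetical and only illustrate the case $L = n$ mentioned above.

```python
from math import comb


def p_select(L, n, m):
    """Equation (6): probability that exactly m of n tags choose
    one particular pseudo-ID code out of L equally likely codes."""
    return comb(n, m) * (1.0 / L) ** m * (1.0 - 1.0 / L) ** (n - m)


def slot_outcomes(L, n):
    """Return (empty, single, collision) probabilities for one code."""
    p_empty = p_select(L, n, 0)    # equation (8)
    p_single = p_select(L, n, 1)   # equation (9)
    p_collision = 1.0 - p_empty - p_single
    return p_empty, p_single, p_collision


# Hypothetical setting with as many pseudo-ID codes as tags (L = n).
P_EMPTY, P_SINGLE, P_COLLISION = slot_outcomes(L=100, n=100)
```

With L = n, the single-selection probability reduces to $(1 - 1/n)^{n-1}$, which stays a little above 1/e as n grows.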
When a pseudo-ID code is selected by $m$ tags 
simultaneously and a conflict occurs, the tag parallel 
recognition algorithm is used to resolve the conflict among 
the $m$ tags, and the number of tag queries and the 
throughput are counted. Logistic-DFSA is a parallel 
recognition algorithm based on Logistic mapping and 
the DFSA algorithm. It combines the chaotic 
characteristics of Logistic mapping with the determinacy 
and finiteness of DFSA to achieve efficient parallel 
pattern recognition. Logistic mapping is a nonlinear 
dynamic system with chaotic characteristics, in which small 
changes in the initial conditions may lead to completely different 
results. This chaotic characteristic gives Logistic mapping 
an advantage in dealing with complex pattern recognition 
problems. The logistic regression function is shown in Figure 6. 
 
Figure 6: Logistic regression function (q(z) plotted against z for z from -5 to 5, with q(z) ranging from 0 to 1) 
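
Figure 6 plots the S-shaped logistic function $q(z)$, whereas the sensitivity to initial conditions described above is usually illustrated with the logistic map recurrence $x_{t+1} = r x_t (1 - x_t)$. The paper does not state which parameters it uses, so the sketch below is only an illustration under assumptions: the standard sigmoid $q(z) = 1/(1 + e^{-z})$, the commonly used chaotic setting r = 4, and two hypothetical starting values that differ by $10^{-9}$.

```python
import math


def sigmoid(z):
    """Standard logistic function, the S-shaped curve q(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def logistic_map(x0, r=4.0, steps=60):
    """Iterate x_{t+1} = r * x_t * (1 - x_t) and return the trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


# Two hypothetical seeds that differ only in the ninth decimal place.
TRAJ_A = logistic_map(0.2)
TRAJ_B = logistic_map(0.2 + 1e-9)

# Largest gap between the two trajectories over the whole run.
MAX_GAP = max(abs(a - b) for a, b in zip(TRAJ_A, TRAJ_B))
```

Although the two seeds are almost identical, the gap between the trajectories grows by many orders of magnitude within a few dozen iterations, which is the chaotic behaviour the text refers to.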
 
DFSA is a finite state automaton that performs operations 
based on a defined set of rules and state transitions. The 
advantages of DFSA are determinism and finiteness, 
where each input leads to a clear output and the state 
transition is finite. This feature makes DFSA efficient and 
predictable in dealing with pattern recognition problems. 
The Logistic-DFSA algorithm uses Logistic mapping to 
preprocess input data to generate a set of chaotic patterns. 
Then, these chaotic patterns are used as inputs for DFSA 
for pattern recognition. Due to the chaotic nature of 
Logistic mapping and the determinacy and finite nature 
of DFSA, Logistic-DFSA algorithm can achieve efficient 
pattern recognition in parallel environments. Overall, the 
Logistic-DFSA algorithm is an effective and 
parallelizable pattern recognition algorithm. It combines 
the advantages of Logistic mapping and DFSA 
algorithms, can handle complex pattern recognition 
problems, and is predictable. The number of labels in 
each time slot of the Logistic-DFSA algorithm is given by equation 
(11). 

$$n_{slot} = \frac{n}{F} \tag{11}$$
In equation (11), $F$ is the initial frame length, and in 
general, $F$ is taken as 8. The probability of identifying 
$m$ labels in a certain time slot is given by 
equation (12). 

$$P(m, M) = \frac{C_M^1 (M-1)^{m-1}}{M^m} = \left(1 - \frac{1}{M}\right)^{m-1} \tag{12}$$
When the spreading code is $M$, the number of 
recognizable labels is given by equation (13). 

$$n_{cg} = n_{sx} \left(1 - \frac{1}{M}\right)^{n_{sx}-1} = \frac{n}{F} \left(1 - \frac{1}{M}\right)^{\frac{n}{F}-1} \tag{13}$$
In equation (13), $n_{cg}$ represents the number of 
recognizable labels, and $n_{sx}$ represents the number of 
labels. The interaction between IoT components and the 
data mining process is key to ESCIM systems. First, the 
data provided by IoT devices is the basis for data mining. 
This data is preprocessed and fed into various data 
mining algorithms to uncover patterns, trends, and 
associations. At the same time, the results of data mining 
can also be fed back to IoT components to optimize their 
data acquisition strategies. In summary, the integration of 
IoT technology in the ESCIM system enables a close 
interaction between data flow, system architecture, and 
data mining processes. This provides enterprises with 
more intelligent, efficient, and accurate solutions for 
supply chain management. 
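
Equations (11) to (13) can be evaluated numerically with a few lines of Python. This is a sketch under assumptions: F = 8 follows the value quoted for equation (11), while the tag count n = 64 and the spreading code value M = 4 are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def tags_per_slot(n, F=8):
    """Equation (11): average number of labels per time slot."""
    return n / F


def p_identified(m, M):
    """Equation (12): (1 - 1/M) ** (m - 1)."""
    return (1.0 - 1.0 / M) ** (m - 1)


def p_identified_counting(m, M):
    """Left-hand form of equation (12): C(M,1) * (M-1)**(m-1) / M**m."""
    return M * (M - 1) ** (m - 1) / M ** m


def recognizable_labels(n, F, M):
    """Equation (13): n_cg = (n/F) * (1 - 1/M) ** (n/F - 1)."""
    n_slot = tags_per_slot(n, F)
    return n_slot * (1.0 - 1.0 / M) ** (n_slot - 1)


# Hypothetical setting: 64 tags, frame length 8, spreading code M = 4.
N_SLOT = tags_per_slot(64, 8)
N_CG = recognizable_labels(64, 8, 4)
```

Both forms of equation (12) give the same value, and in this hypothetical setting about one of the eight labels sharing a slot is expected to be recognized.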
4 Analysis of ESCIM system 
integrating DM-IoTm technology 
This study conducted experiments using actual enterprise 
supply chain data to verify the effectiveness of the 
ESCIM system based on Apriori algorithm and 
Logistic-DFSA algorithm, and conducted a 
comprehensive analysis of the ESCIM system. 
4.1 Performance analysis of Apriori 
algorithm and Logistic-DFSA algorithm 
To analyze the enterprise supply chain data, the first step 
was to prepare a transaction dataset. The dataset was 
derived from the actual transaction records of a large 
retail enterprise and covered the entire chain of 
transactions from suppliers to final consumers. The 
dataset comprised hundreds of thousands of transaction 
records, each detailing a complete transaction, including 
the item name, quantity, transaction time, and location. 
To ensure data quality and consistency, a series of 
operations, namely data cleaning, deduplication, missing 
value filling, and outlier detection, were carried out in the 
data pre-processing stage. A unified coding process was 
applied to ensure that each item had a unique 
identifier, so that the Apriori algorithm could accurately 
identify each item set and calculate its support and confidence 
in the subsequent data mining process. 
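
The pre-processing steps listed above can be sketched as a small Python routine. The paper does not give its cleaning rules, so everything specific here is an assumption made for illustration: the record fields (`transaction_id`, `item`, `quantity`), filling a missing quantity with 1, treating non-positive quantities as outliers, and assigning integer item identifiers in order of first appearance.

```python
from collections import defaultdict


def preprocess(records):
    """Clean raw records and group them into transactions of item IDs."""
    seen = set()
    item_ids = {}
    baskets = defaultdict(set)
    for rec in records:
        # Cleaning: normalise the item name and drop empty names.
        name = (rec.get("item") or "").strip().lower()
        if not name:
            continue
        # Missing-value filling (assumed default quantity of 1).
        qty = rec.get("quantity")
        if qty is None:
            qty = 1
        # Outlier detection (assumed rule: quantity must be positive).
        if qty <= 0:
            continue
        # Deduplication within a transaction.
        key = (rec["transaction_id"], name)
        if key in seen:
            continue
        seen.add(key)
        # Unified coding: one integer identifier per distinct item.
        item_id = item_ids.setdefault(name, len(item_ids))
        baskets[rec["transaction_id"]].add(item_id)
    return dict(baskets), item_ids


# Hypothetical raw records.
RECORDS = [
    {"transaction_id": 1, "item": "Pallet ", "quantity": 2},
    {"transaction_id": 1, "item": "pallet", "quantity": 2},
    {"transaction_id": 1, "item": "Label", "quantity": None},
    {"transaction_id": 2, "item": "label", "quantity": -5},
    {"transaction_id": 2, "item": "", "quantity": 1},
    {"transaction_id": 2, "item": "Wrap", "quantity": 3},
]
BASKETS, ITEM_IDS = preprocess(RECORDS)
```

The resulting baskets of item identifiers are in the form that an association rule miner such as Apriori expects as input.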
When implementing the Apriori algorithm, the research 
focused on two parameters: minimum 
support and minimum confidence. The minimum support 
was a threshold used to measure how frequently an item 
set appeared in the data set. An item set was considered an FI only if 
its support was above this threshold. For this 
experiment, a minimum support of 0.05 was set, 
meaning that an item set had to appear in at least 5% of 
transactions to be considered frequent. Minimum 
confidence was a threshold used to evaluate the strength 
of an AR. A rule was considered a "strong rule" only when its 
confidence level was above this threshold. In 
this experiment, the minimum confidence level was set to 
0.7, meaning that the probability of the rule consequent 
appearing given the rule antecedent needed to be at 
least 70%. With appropriate minimum support and 
minimum confidence thresholds, the algorithm generated 
only FI and AR with practical significance. 
The Apriori algorithm was implemented as follows. First, the 
algorithm scanned the entire database, calculated the 
support of each item, and selected the items that 
met the minimum support to form the frequent 
1-item sets. Then, based on the frequent 1-item sets, the 
algorithm generated candidate 2-item sets. Next, the 
algorithm scanned the database again, calculated the 
support of the candidate 2-item sets, and selected the item 
sets that met the minimum support to form the 
frequent 2-item sets. At the same time, according to the 
Apriori property, the candidate set was pruned to reduce 
unnecessary calculations. The algorithm then continued 
this process, generating candidate 3-item sets, candidate 
4-item sets, and so on, until no 
larger frequent item sets could be generated. After finding all frequent item 
sets, the algorithm generated AR based on these item sets 
and calculated their confidence levels. Only rules with 
higher than minimum confidence were retained. Under 
this experimental setup, the performance of the 
Apriori algorithm and the Logistic-DFSA algorithm was 
compared in this study, with accuracy, sensitivity, 
specificity, bit error rate, and delay time selected as the 
evaluation indicators. The comparison results 
are shown in Table 2. 
 
 
 
 
 
 
 
 
 
 
Table 2: Index evaluation of enterprise supply chain information management data system

| Serial number | Accuracy rate (%) | Sensitivity (%) | Specificity (%) | Bit error rate (%) | Delay time (s) | Accuracy (%) |
|---|---|---|---|---|---|---|
| 1 | 89.1 | 92.3 | 84.7 | 12.9 | 0.6 | 91.2 |
| 2 | 91.3 | 88.1 | 87.3 | 8.7 | 0.3 | 93.5 |
| 3 | 93.9 | 79.6 | 93.2 | 6.1 | 0.8 | 89.7 |
| 4 | 97.6 | 93.4 | 96.8 | 2.4 | 0.2 | 93.1 |
| 5 | 92.8 | 91.9 | 97.5 | 7.2 | 1.3 | 94.2 |
| 6 | 96.2 | 92.6 | 91.1 | 3.8 | 1.5 | 91.5 |
| 7 | 89.5 | 94.7 | 89.9 | 10.5 | 0.7 | 94.9 |
| 8 | 93.7 | 96.1 | 92.3 | 6.3 | 1.2 | 88.7 |
| 9 | 92.0 | 95.5 | 94.7 | 8.0 | 0.4 | 93.6 |
| 10 | 95.3 | 86.3 | 90.4 | 4.7 | 0.5 | 94.3 |
| Mean value | 93.14 | 91.05 | 91.89 | 7.06 | 0.84 | 93.21 |
 
In Table 2, the average accuracy rate, sensitivity, and
specificity of the ESCIM system based on Apriori and
Logistic-DFSA were 93.14%, 91.05%, and 91.89%,
respectively. The average bit error rate was 7.06%, the
average delay time was 0.84 s, and the average accuracy
was 93.21%. This indicates that the performance of the
Apriori-based ESCIM system is excellent. This study
further analyzed and compared the total number of time
slots and the throughput of the Logistic-DFSA algorithm,
with simulation verification in Matlab. The comparison
algorithms were the logistic regression algorithm and the
decision tree algorithm, as shown in Figure 7.
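For readers less familiar with the indicators in Table 2, the following sketch shows how accuracy, sensitivity, and specificity are conventionally computed from the four cells of a confusion matrix. The function name `classification_metrics` and the counts are hypothetical, and treating the error rate as the complement of accuracy is an assumption; the source does not state how its bit error rate was measured.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard confusion-matrix metrics, returned as fractions in [0, 1]."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "error_rate": 1 - accuracy,     # assumed: complement of accuracy
    }


# Hypothetical counts for one test run (not taken from the study).
metrics = classification_metrics(tp=90, fp=15, tn=85, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```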
 
[Figure 7 panels: (a) Total time slots versus number of labels; (b) Throughput rate versus number of labels. Curves: Logistic-DFSA, logistic regression algorithm, decision tree algorithm.]
 
Figure 7: Comparison of total time slots and throughput 
 
Figure 7 (a) compares the total time slots of the three
algorithms. Logistic-DFSA required fewer time slots than
the other two algorithms, indicating better performance.
Figure 7 (b) compares their throughput rates:
Logistic-DFSA achieved a higher throughput rate than the
other two algorithms and remained stable at around 96%.
The Logistic-DFSA algorithm was computationally efficient
and simple, requiring only prefix judgment on the reader
without significantly increasing label overhead, which
makes it a practical and feasible method. To evaluate the
performance of the ESCIM data system based on Apriori and
Logistic-DFSA more intuitively, this study compared five
systems: the ESCIM system based on these two algorithms
(Method 1), and SC-IM data systems based on the support
vector data description algorithm (Method 2), the error
backpropagation algorithm (Method 3), the genetic
algorithm (Method 4), and the particle swarm optimization
(PSO) algorithm (Method 5). The average accuracy of these
five methods is compared in Figure 8.
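To make the two quantities plotted in Figure 7 concrete, the sketch below simulates a plain dynamic framed slotted ALOHA (DFSA) reader and counts total time slots and throughput (identified labels divided by slots used). It is not the Logistic-DFSA algorithm of this study, whose details are not reproduced here: the function name `simulate_dfsa`, the initial frame size of 16, and the frame-size update using Schoute's estimate of about 2.39 labels per collided slot are assumptions chosen for illustration, so the numbers it prints should not be expected to match the reported results.

```python
import random
from collections import Counter


def simulate_dfsa(num_tags, initial_frame=16, seed=0):
    """Simulate plain DFSA; return (total_slots, throughput)."""
    rng = random.Random(seed)
    remaining = num_tags      # labels not yet identified
    frame = initial_frame     # slots in the current frame
    total_slots = 0
    while True:
        total_slots += frame
        # Each unidentified label picks one slot in the frame at random.
        picks = Counter(rng.randrange(frame) for _ in range(remaining))
        singles = sum(1 for c in picks.values() if c == 1)    # successful reads
        collisions = sum(1 for c in picks.values() if c > 1)  # collided slots
        remaining -= singles
        if collisions == 0:
            break  # no collided slot means every label has been read
        # Schoute's estimate: roughly 2.39 labels behind each collided slot.
        frame = max(1, round(2.39 * collisions))
    return total_slots, num_tags / total_slots


# Label counts follow the x-axis of Figure 7.
for n in (250, 500, 750, 1000):
    slots, throughput = simulate_dfsa(n)
    print(n, slots, round(throughput, 3))
```

The reader only observes empty, single, and collided slots, so the loop terminates on the first frame without a collision rather than by inspecting the hidden label count.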
 
[Figure 8 plot: average accuracy (0.5 to 1.0) versus serial number (0 to 10) for Methods 1 to 5.]
 
Figure 8: Comparison of average accuracy of five algorithms 
 
In Figure 8, the average accuracy of Methods 2, 3, 4, and
5 was 84%, 91%, 89%, and 90%, respectively. The average
accuracy of Method 1 was 96%, higher than that of the
other four methods. Therefore, the ESCIM system based on
the Apriori and Logistic-DFSA algorithms performed better.
4.2 Performance testing analysis of ESCIM system
To analyze the practical application effect of the
proposed ESCIM system, this study compared the overall
performance of the ESCIM system integrating DM-IoTm
technology with that of a general international
transportation and logistics information management
system. Performance and cost-effectiveness were used as
comparative indicators, and both systems were tested
under the same conditions and on the same dataset. The
experimental environment comprised a hardware
environment and a software environment; Table 3 lists the
specific parameters.
 
 
Table 3: Specific experimental environment of enterprise supply chain information management system performance test

| Environmental classification | Item | Description | Configuration |
|---|---|---|---|
| Hardware environment | Server | Used to deploy the SC-IM system | CPU: Intel Xeon Silver 4216; Memory: 128 GB DDR4; Storage: 1 TB NVMe SSD |
| Hardware environment | Network equipment | Ensures the stability and speed of the network connection | Gigabit Ethernet switches, routers, firewalls, etc. |
| Hardware environment | Terminal equipment | Used to test the response and interaction performance of the system | Multiple computers and mobile devices with different configurations |
| Software environment | Operating system | Used for server operation and management | CentOS 8.2 |
| Software environment | Database management system | Used to store and manage supply chain information data | MySQL 8.0.23 |
| Software environment | Middleware | Used to support the operation and interaction of the system | Apache Tomcat 9.0, Redis 6.0 |
| Software environment | Development tools and environment | Used for system development and testing | Java 11, Python 3.8, IntelliJ IDEA, Git, etc. |
 
Table 3 details the experimental environment
configuration, including hardware devices and software
environments. This configuration information is crucial
for conducting performance testing and evaluating the
actual application effectiveness of the system. Figure 9
shows the performance of the two systems in this
environment on key performance indicators such as
response time and CPU usage.
 
[Figure 9 panels: (a) Comparison of CPU usage: CPU usage (%) versus running time (h); (b) Response time comparison: response time (ms) versus online population. Curves: this paper's information management system, common system.]
 
Figure 9: Performance test comparison between the two platforms 
 
In Figure 9 (a), the CPU usage of the proposed ESCIM
system fluctuated between 15% and 30% over 24 hours,
whereas that of the ordinary SC-IM system fluctuated
between 45% and 75%. In Figure 9 (b), with 600 users
online, the response time of the proposed ESCIM system
was 85 ms, compared with 99.5 ms for the ordinary SC-IM
system, a reduction of 14.5 ms (about 15%). The
constructed ESCIM system thus had lower CPU usage and a
shorter response time than the ordinary information
management system, reflecting the platform's high
computing speed and high network bandwidth. After
verifying the performance of the ESCIM system, this study
also analyzed its economic benefits, mainly comparing the
total investment cost and benefits. Figure 10 shows the
comparison results for the economic benefit indicators of
the ESCIM system.
 
[Figure 10 panels: (a) Development cost (w); (b) Operating cost (w); (c) Advertising revenue (w); (d) Download revenue (w); each plotted against time (d).]
 
Figure 10: Comparison of economic benefit index of enterprise SC-IM system 
  
In Figure 10 (a), the development cost of the ESCIM
system increased over time. In Figure 10 (b), its
operating cost decreased over time. In Figure 10 (c), its
advertising revenue showed an upward trend, and in
Figure 10 (d), its download revenue generally increased.
These results indicate that the advertising and download
revenue of the ESCIM system grow over time, so the system
can be expected to achieve good cost-effectiveness. Next,
the scalability of the system in different enterprise
environments was analyzed, with the indicators
normalized, as shown in Figure 11.
 
[Figure 11 plot: normalized value (0.0 to 1.0) of each scalability index (handling capacity, response time, resource utilization rate) for small, medium-sized, and large enterprises.]
 
Figure 11: Scalability analysis of systems in different enterprise environments 
 
Figure 11 shows that the scalability indicators, including
throughput, response time, and resource utilization, were
all above 0.8 for small, medium-sized, and large
enterprises. This suggests that the system is feasible in
various enterprise environments. Finally, a statistical
significance test was carried out on the decision space
and preference space of the system, as shown in Figure
12.
 
[Figure 12 plot: satisfaction (%) for cost and time in the decision space, non-use system versus use system; both differences marked P<0.05.]
 
Figure 12: Statistical significance test 
 
 
In Figure 12, the differences in time and cost before and
after using the SC-IM system are statistically significant
(P<0.05), indicating that the system has brought
substantial improvement to the enterprise.
 
 
5 Discussion 
First, in terms of accuracy and the other evaluation
indicators, the ESCIM system based on the Apriori and
Logistic-DFSA algorithms shows excellent performance. In
particular, the Logistic-DFSA algorithm outperforms the
other algorithms in throughput because of its high
computational efficiency, its simplicity, and the absence
of any obvious increase in label cost. This result is
similar to that obtained by Gao et al. in their study of
an ESCIM system [21]. In addition, the comparison of
average accuracy across different algorithms further
supports the superiority of the SC-IM system based on the
Apriori and Logistic-DFSA algorithms. This study also
compares the performance of the ESCIM system integrating
data mining and IoT technology with that of a general
international transportation logistics information
management system. The results show that the ESCIM system
outperforms the latter in key performance indicators such
as CPU utilization and response time. This shows that the
proposed ESCIM system offers higher computing speed and
better network performance in practical application and
can better meet the needs of enterprises, which is
consistent with the conclusion reported in [22]. The
economic benefit analysis shows that although the
development cost of the ESCIM system increases over time,
its operating cost gradually decreases, while its
advertising revenue and download revenue gradually
increase. This shows that the system can achieve good
cost efficiency in long-term operation and bring real
economic benefits to enterprises. Zhang et al. reached a
similar conclusion in their research on an ESCIM system
[23]. In the scalability analysis, this study normalizes
the scalability indicators under different enterprise
environments and finds that throughput, response time,
and resource utilization are satisfactory for small,
medium-sized, and large enterprises. This shows that the
proposed ESCIM system performs well and scales to
enterprise environments of different sizes, which is
similar to the results obtained by Sawe's team in 2021
[24].
To sum up, this study makes a novel contribution. By
considering multiple evaluation indicators, comparing
algorithm performance, analyzing practical application
effects and economic benefits, and conducting scalability
analysis, it validates the superiority of the ESCIM
system based on the Apriori and Logistic-DFSA algorithms
and provides new ideas and methods for related fields.
These results are significant both theoretically and
practically, as they promote the development and
optimization of SC-IM systems. They also help enterprises
improve their supply chain management, reduce costs,
improve competitiveness, and achieve sustainable
development.
6 Conclusion 
To provide more accurate and real-time supply chain
information for enterprises and improve their
competitiveness, this study explored the application of
integrated DM-IoTm technology in ESCIM, covering supply
chain data analysis, inventory management, order
management, and supplier management. The results
indicated that the Logistic-DFSA algorithm required fewer
total time slots than the other algorithms and achieved a
higher throughput, which remained stable at around 96%.
The CPU usage of the proposed ESCIM system fluctuated
between 15% and 30% over 24 hours, and its response time
was 85 ms with 600 users online. This indicated that the
constructed ESCIM system had lower CPU usage and a
shorter response time than ordinary information
management systems, reflecting the platform's high
computing speed and high network bandwidth. The study
also has limitations. Because supply chain data come from
complex sources, data quality can be uneven. Supply chain
data are also highly sensitive, and leakage or abuse can
cause significant losses to enterprises. At the same
time, with the widespread application of IoT technology,
its potential security vulnerabilities bring new risks to
enterprise data. Several improvements can address these
issues. First, data encryption and anonymization
processes should be enhanced. Second, a rigorous data
access control mechanism should be implemented. Third, to
deal with the security vulnerabilities of IoT devices,
enterprises should regularly conduct security risk
assessments and vulnerability scans to discover and
repair potential security problems in a timely manner,
and should cooperate closely with IoT equipment suppliers
to jointly address security threats and ensure enterprise
data security. Finally, enterprises need to strengthen
the data security awareness training of employees.
References
 
[1] Y. Li, R. K. Shyamasundar, and X. Wang, “Special
issue on computational intelligence for social media 
data mining and knowledge discovery,” 
Computational Intelligence, vol. 37, no. 2, pp. 
658-659, 2021. https://doi.org/10.1111/coin.12457 
[2] A. W. Al-Khatib, “Internet of things, big data 
analytics and operational performance: the 
mediating effect of supply chain visibility,” Journal 
of Manufacturing Technology Management, vol. 34, 
no. 1, pp.1-24, 2023. 
https://doi.org/10.1108/JMTM-08-2022-0310 
[3] Z. Shao, S. Yuan, J. Xu, and Y. Wang, “A statistical 
feature data mining framework for constructing 
scholars' career trajectories in academic data,” 
Applied Soft Computing, vol. 118, no. 1, pp. 
108550-108561, 2022. 
https://doi.org/10.1016/j.asoc.2022.108550 
[4] T. Wang, B. Ren, C. Li, K. Guo, J. Leng, and P. Zhou, 
“Monolithic tapered Yb-doped fiber chirped pulse 
amplifier delivering 126 μJ and 207 MW
femtosecond laser with near diffraction-limited 
beam quality,” Frontiers of Optoelectronics, vol. 16, 
no. 3, pp. 30-30, 2023. 
[5] F. H. Awad, and M. M. Hamad, “Big data clustering 
techniques challenged and perspectives,” 
Informatica, vol. 47, no. 6, pp. 203-218, 2023. 
https://doi.org/10.31449/inf.v47i6.4445 
[6] M. Kiguchi, W. Saeed, and I. Medi, “Churn prediction 
in digital game-based learning using data mining 
techniques: Logistic regression, decision tree, and 
random forest,” Applied Soft Computing, vol. 118, 
no. 1, pp. 108491-108511, 2022. 
https://doi.org/10.1016/j.asoc.2022.108491 
[7] V. Plotnikova, M. Dumas, and F. P. Milani, 
“Applying the CRISP-DM data mining process in 
the financial services industry: Elicitation of 
adaptation requirements,” Data and Knowledge 
Engineering, vol. 139, no. May, pp. 
102013.1-102013.17, 2022. 
https://doi.org/10.1016/j.datak.2022.102013 
[8] A. D. Ganesh, and P. Kalpana, “Supply chain risk 
identification: a real-time data-mining approach,” 
Industrial Management and Data Systems, vol. 122, 
no. 5, pp. 1333-1354, 2022. 
https://doi.org/10.1108/IMDS-11-2021-0719 
[9] L. He, Y. Cao, and J. Mao, “Exploring college 
students' fitness and health management based on 
Internet of Things technology,” Journal of 
High-Speed Networks, vol. 28, no. 1, pp. 65-73, 
2022. https://doi.org/10.3233/JHS-220679 
[10] Y. Jiang, “Project cost accounting based on internet 
of things technology,” Journal of Interconnection 
Networks, vol. 22, no. 3, pp. 1-20, 2022. 
https://doi.org/10.1142/S0219265921450122 
[11] L. Wang, and D. Jiang, “Energy management control 
system of prefabricated construction based on 
internet of things technology,” International Journal 
of Internet Protocol Technology, vol. 14, no. 2, pp. 
86-92, 2021. 
https://doi.org/10.1504/ijipt.2021.116256 
[12] T. S. Deepu, and V. Ravi, “Modelling of 
interrelationships amongst enterprise and 
inter-enterprise information system barriers affecting 
digitalization in electronics supply chain,” Business 
Process Management Journal: Developing 
Re-Engineering Towards Integrated Process 
Management, vol. 28, no. 1, pp. 178-207, 2022. 
[13] H. Mao, and L. Chen, “E-Commerce enterprise 
supply chain cost control under the background of 
big data,” Complexity, vol. 2021, no. 6, pp. 1-11, 
2021. https://doi.org/10.1155/2021/6653213 
[14] Q. Li, and G. Wu, “ERP system in the logistics 
information management system of supply chain 
enterprises,” Hindawi Limited, vol. 2021, 2021. 
https://doi.org/10.1155/2021/7423717
[15] J. Wang, H. Zhou, and X. Jin, “Risk transmission in 
complex supply chain network with multi-drivers,” 
Chaos Solitons and Fractals, vol. 143, no. 5439, pp. 
110259-110269, 2021. 
https://doi.org/10.1016/j.chaos.2020.110259 
[16] H. Yang, S. Zhao, and J. Peng, “Optimal retail price 
and service level in a dual-channel supply chain with 
reference price effect,” Journal of Industrial and 
Management Optimization, vol. 19, no. 6, pp. 
3883-3912, 2023. 
https://doi.org/10.3934/jimo.2022115 
[17] G. Liu, C. Li, W. Wei, W. Li, and H. Zhen, “Data 
mining analysis of gene prognostic markers of 
metastatic skin cancer based on the elastic network 
method,” Mathematical Problems in Engineering, 
vol. 25, no. 1, pp. 6636058.1-6636058.12, 2021. 
https://doi.org/10.1155/2021/6636058 
[18] H. Wang, “Analysis and prediction of CET4 scores 
based on data mining algorithm,” Complexity, vol. 
2021, no. 12, pp. 1-11, 2021. 
https://doi.org/10.1155/2021/5577868 
[19] G. Mehdi, H. Hooman, Y. Liu, S. Peyman, and R. 
Arif, “Data mining techniques for web mining: A 
survey,” Artificial Intelligence and Applications, vol. 
1, no. 1, pp. 3-10, 2022. 
https://doi.org/10.47852/bonviewAIA2202290 
[20] M. N. Faisal, “Role of Industry 4.0 in circular supply 
chain management: a mixed-method analysis,”
Journal of Enterprise Information Management, vol. 
36, no. 1, pp. 303-322, 2023. 
https://doi.org/10.1108/JEIM-07-2021-0335 
[21] Q. Gao, S. Guo, X. Liu, G. Manogaran, N. 
Chilamkurti, and S. Kadry, “Simulation analysis of 
supply chain risk management system based on IoT 
information platform,” Enterprise Information 
Systems, vol. 14, no. 9, pp. 1354-1378, 2020. 
https://doi.org/10.1080/17517575.2019.1644671 
[22] M. Saratchandra, and A. Shrestha, “The role of cloud
computing in knowledge management for small and
medium enterprises: a systematic literature review,”
Journal of Knowledge Management, vol. 26, no. 10, 
pp. 2668-2698, 2022. 
https://doi.org/10.1108/JKM-06-2021-0421 
[23] X. Zhang, P. Sun, J. Xu, X. Wang, J. Yu, Z. Zhao, 
and Y. Dong, “Blockchain-based safety management 
system for the grain supply chain,” IEEE Access, 
vol. 8, no. 3, pp. 36398-36410, 2020. 
https://doi.org/10.1109/ACCESS.2020.2975415 
[24] F. B. Sawe, A. Kumar, J. A. Garza‐Reyes, and R. 
Agrawal, “Assessing people‐driven factors for 
circular economy practices in small and 
medium‐sized enterprise supply chains: Business 
strategies and environmental perspectives,” Business 
Strategy and the Environment, vol. 30, no. 7, pp. 
2951-2965, 2021. https://doi.org/10.1002/bse.2781