https://doi.org/10.31449/inf.v48i7.5775
Informatica 48 (2024) 113–122

Interpolation Analysis of Industrial Big Data Based on KDR Knowledge Recognition Algorithm Considering Singular Value Decomposition Theory

Cenglin Yao 1,2, Yongzhou Li 1*
1 Evergrande School of Management, Wuhan University of Science and Technology, Wuhan, Hubei 430081, China
2 College of Mechanical and Electrical Engineering, Wuhan Business University, Wuhan, Hubei 430056, China
E-mail: yl200808@126.com, 20150511@wbu.edu.cn
* Corresponding author

Keywords: singular value decomposition, KDR algorithm, industrial big data, interpolation analysis

Received: February 27, 2024

Although various algorithms have made some progress in current research on industrial big data interpolation, most of them are only suitable for static KDR operation methods. Most data is not collected all at once but accumulates incrementally, for example growing over time. To keep KDR calculation results consistent under dynamic conditions during data collection, the shared and differing information in the old and new data must be merged to disperse the dynamic data. Based on the incremental nature of data in industrial big data analysis, a dynamic KDR operation model is established by considering singular value decomposition (SVD) theory. To ensure consistency before and after static separation, a rough set method based on Manilkara is used. Under the influence of Yalo's singular value decomposition (SVD) theory, the conventional interval is divided into two parts, the core and the blank, to express the unstable interval. The method is based on the middle interval: by dividing the middle interval again, the intervals of the old and the new data are combined.
Povzetek: In research on industrial big data, a dynamic KDR model was introduced that uses singular value decomposition (SVD) theory and the Manilkara rough set to ensure consistency between dynamic and static data.

1 Introduction

KDR was first formally introduced in 1987. To solve the problem of incomplete numerical properties, Wang and Chiu gave the equal-frequency EFKDR algorithm, and the equal-width EWKDR algorithm was introduced in the same year. Since then, KDR operation methods have developed in many directions according to the problem being addressed. For example, in 1991 Huang and Chiu proposed a KDR computation based on the maximum direct coefficient, so that the number of intervals can be obtained automatically from the characteristics of the data during KDR processing. In 2018, Hacibeyoglu and Ibrahim proposed a KDR algorithm for Euonymus based on the EF method.

In maximum likelihood estimation, the marginal distribution is used to estimate an uncertain parameter, usually with the expectation maximization (EM) method. In theory, the EM method requires a large amount of data to ensure the asymptotic normality of the estimate. Although the EM algorithm can find a locally optimal solution quickly, its convergence is slow and its operation is quite intricate.

Other studies assume that missing values are themselves valuable rather than simple gaps, so the data are analyzed without preprocessing. For example, if a user records the attribute "obesity," the item "weight" is not added but appears directly as an "empty number". Such methods analyze the original data directly, without any preprocessing. The most typical ones are Bayesian networks and neural networks. A Bayesian network describes the probability of association between features, which reveals the association between data and provides a natural expression for it. In a Bayesian network, features are represented as nodes, and the correlations between attributes are represented as edges. Building a Bayesian network requires an understanding of the data collection domain and a clear picture of the correlation between attributes. Without it, the features must first be analyzed, or all of them must be added to the model, which makes the Bayesian network complex, and the complexity grows geometrically with the number of attributes. Although the artificial neural network is also a very popular machine learning technique at present, it needs more work to interpolate missing data [1-6].

In the traditional KDR operation method, the interval cut-off is solved by a game method. In an actual KDR operation, a single point cannot describe the boundary of an interval. Even within the same period the boundary of an interval can differ, and an interval can have multiple uncertain cut points. Based on this situation, we give an interval representation that takes into account the theory of singular value decomposition. Using the idea of a boundary interval among three intervals, the uncertain region is extracted to form a boundary interval. During dynamic fusion, the intervals in the image need to be re-segmented to delay the decision on the boundary, which makes the KDR operation on the image possible. The three intervals can not only express the uncertainty of the intervals but also delay the boundary intervals in the dynamic KDR operation [10-14].

2 Related work

In machine learning, clustering is one of the main methods used to analyze non-monitored data, but it still has many problems to be solved. When clustering, objects close to the edge of the cluster cannot be efficiently classified.
Yu gave three clustering classification methods under the influence of singular value decomposition theory. Compared with the traditional cluster represented by a single set, the 3D group is a new representation of clusters. In three subgroups, the objects of a cluster are divided into two groups, and the universe of a cluster is divided into a core domain, an edge domain, and an irrelevant domain. Taking into account the singular value decomposition theory, the central and boundary objects of different categories can be compared separately. On this basis, the topic has been discussed in depth and some results have been obtained. Given the uncertainty of objects and clusters in multi-view clustering, Yu et al. proposed a low-order ternary principal component classification method, which can not only reflect the correlation between objects and clusters but also effectively raise the accuracy of multidimensional clustering. To solve the large-scale clustering problem, Yu et al. introduced the fusion architecture of three clusters in 2019, which combined clustering methods so as to ensure the quality of the clusters while reducing the time cost of calculation. In 2016, Yu et al. proposed singular value decomposition (SVD) theory based on different vertex distribution characteristics to achieve different vertex representations. Under the influence of the singular value decomposition theory, Liu et al. proposed a centroid-based approach to address the dynamic problem of overlapping communities [7-9]. Table 1 summarizes the previous literature.

Table 1: Summary of the literature review

| Author | Objective | Findings |
| [18] | The manuscript explores these concepts and provides a case study that demonstrates the implementation of new intelligent hybrid algorithms for Industry 4.0 applications with limited data. | An extremely accurate deterministic model that fits the real data was generated by applying the suggested approach. We also employed the UKF technique to strengthen the model's resistance to uncertainty. |
| [19] | Big data analytics and related applications in smart grids are introduced in this study. The initial section covers the features of big data, smart grids, and huge data collection to illustrate the goal and potential advantages of incorporating advanced data analytics in smart grids. | The result focuses on the complex applications of different data analytics in smart grids. The current power system may gain a lot from handling massive amounts of data from geographical information systems, meteorological information systems, and energy networks, among other sources. In the big data era, this will also enhance customer service and societal welfare. |
| [20] | The work presented a novel method known as the interpretable kernel DR algorithm (I-KDR), which converts information from the feature space into a lower-dimensional space where classes are closer together and there is less overlap. | The dimensions are created by the algorithm based on the local contributions of the data samples, which facilitates their interpretation by class labels. Furthermore, we effectively combine the feature selection task with the DR to identify the original space's most pertinent characteristics for the discriminative purpose. |
| [21] | This paper presents a new technique, the interpretable kernel DR algorithm (I-KDR), that transfers data from the feature space to a lower-dimensional space where classes are closer together and there is less overlap. | The approach produces the dimensions based on the data samples' local contributions, which makes it easier to comprehend the data by class labels. |
| [22] | For an improved intrusion detection system, the suggested system uses a Hybrid Deep Learning (HDL) network made up of Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN). The study used PySpark, which offers Python support for Apache Spark, in the Google Colab environment. | The CIDDS-001 data set was used to evaluate the model in multiclass mode, and the UNSW-NB15 data set was used to evaluate the model in binary mode. |
| [23] | The fusion of conventional machine learning algorithms with big data technology has created novel and captivating obstacles in domains such as social media and social networks. Data processing, data storage, data representation, and the use of data for pattern mining, user behavior analysis, data visualization, and data tracking are among the primary issues addressed by these new challenges. | Big data has emerged as a significant topic for several study fields, including social networks, data mining, machine learning, computational intelligence, information fusion, and the Semantic Web. The development of several big data frameworks for huge data processing based on the MapReduce paradigm, like Apache Hadoop and, more recently, Spark, has made it possible to use data mining techniques and machine learning algorithms effectively across a variety of domains. |

3 Research methods

In general, statistical principles and machine learning techniques are used to interpolate the missing values to obtain a complete data set. At present, the following interpolation methods are usually used in data processing:

3.1 Manual interpolation

This method is done manually based on human experience. Practitioners rely on their years of experience in interpolating this type of data, so manual interpolation is usually more accurate. However, with a large number of missing values, it can take a lot of time and effort.
$$\overline{R}(X) = \bigcup \{\, Y_i \mid Y_i \in U/IND(R) \wedge Y_i \cap X \neq \emptyset \,\}, \qquad \underline{R}(X) = \bigcup \{\, Y_i \mid Y_i \in U/IND(R) \wedge Y_i \subseteq X \,\} \tag{1}$$

where $U/IND(R) = \{\, X \mid X \subseteq U \wedge \forall x \in X,\, y \in X,\, a \in R\ (a(x) = a(y)) \,\}$ represents the partition of the universe $U$ induced by the indiscernibility relation $R$. $Y_i$ represents a set of objects in $U/IND(R)$, which can be regarded as an equivalence class. When the intersection of $Y_i$ with the target set $X$ is not empty, $Y_i$ satisfies the requirement of the upper approximation set. According to the theory of singular value decomposition (SVD), two classes are divided into three classes. In the case of ambiguity, the ambiguous information is expressed explicitly, which makes it possible to process it.

Manual interpolation is based on human experience and is frequently regarded as correct. However, it is labor- and time-intensive, especially with big datasets. In an industrial big data scenario where efficiency and accuracy are critical, the volume of the data may make manual interpolation impractical.

3.2 Specific value imputation

This algorithm uses a special value to interpolate the missing data, independent of other eigenvalues. If the missing data were all labeled "empty," a completely different set of data would be produced, which would be highly biased and generally not beneficial.

Although the sequential clustering method is used to obtain the boundary of the original adjacent intervals, some uncertainty remains. Two adjacent regions can either be merged or split. If the center distance between the two intervals is small, the samples in them are highly similar, and the stable value between the two intervals is still high, then the two intervals are merged. Otherwise, a new segment is formed by splitting off the intersecting part of the two intervals.

$$\lambda_a = \frac{|V_a|}{F_a} \tag{2}$$

It represents a stable region, and the higher its value, the less stable it will be. A parameter represents a range.
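The upper and lower approximations of equation (1) can be illustrated with a short sketch. It is not the authors' implementation: the toy decision table, the attribute name `c`, and the helper names `equivalence_classes` and `approximations` are all hypothetical, chosen only to show how the partition U/IND(R) splits a target set X into a certain part, a possible part, and a boundary.

```python
from collections import defaultdict


def equivalence_classes(universe, attributes, value_of):
    """Partition U/IND(R): group objects that agree on every attribute in R."""
    classes = defaultdict(set)
    for obj in universe:
        key = tuple(value_of(obj, a) for a in attributes)
        classes[key].add(obj)
    return list(classes.values())


def approximations(universe, attributes, value_of, target):
    """Return the (lower, upper) approximations of the target set X."""
    lower, upper = set(), set()
    for block in equivalence_classes(universe, attributes, value_of):
        if block <= target:   # Y_i is a subset of X: certainly in X
            lower |= block
        if block & target:    # Y_i intersects X: possibly in X
            upper |= block
    return lower, upper


# Hypothetical toy table (not taken from the paper).
table = {
    "u1": {"c": 1}, "u2": {"c": 1}, "u3": {"c": 2},
    "u4": {"c": 3}, "u5": {"c": 3},
}
X = {"u1", "u2", "u5"}
lower, upper = approximations(set(table), ["c"], lambda o, a: table[o][a], X)
boundary = upper - lower  # objects that cannot be decided from attribute c alone
```

Here `u4` and `u5` share the same value of `c` but only `u5` belongs to X, so both end up in the boundary, which is the kind of uncertain region that the three-part interval representation is meant to express.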
The total number of samples in the data represents the width of the total sample value of the individual number. Since the distribution density of each data is different, a parameter is adopted to balance the relationship between the sampling frequency and the width of the interval for each attribute. The relationship is as follows:

$$\varphi_a(d_i) = \lambda_a \frac{F_{d_i}}{F} + (1 - \lambda_a) \frac{W_{d_i}}{W}, \qquad \lambda_a \in [0, 1] \tag{3}$$

The stability of the ordered cluster can be found by using this equation. The stability increases as the stable value decreases. The formula describes the attribute KDR operation degree of numerical values. The closer the value is to 1, the smaller the attribute KDR operation is, and the higher the attribute KDR operation is.

Regardless of other eigenvalues, specific value imputation gives missing data a unique value. Although it is simple, it can introduce bias if the assigned value is not indicative of the missing data. Furthermore, it might not fully represent the intricacy of industrial data, where missing values could have a big influence on later studies.

3.3 Average imputation

In the data set, the eigenvalues are divided into continuous and discontinuous values, and the average interpolation is carried out according to their type. If the missing value is of continuous type, it is interpolated with the average value of the eigenvalue. If the missing value is of discontinuous type, it is interpolated with the mode, that is, the most frequent value of the eigenvalue.
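The two cases of average imputation can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `impute_column`, the use of `None` as the missing-value marker, and the sample columns are assumptions made for the example, not part of the paper.

```python
from statistics import mean, mode


def impute_column(values, continuous):
    """Fill None entries with the mean (continuous) or the mode (discontinuous)."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)  # nothing observed, so nothing to estimate from
    fill = mean(observed) if continuous else mode(observed)
    return [fill if v is None else v for v in values]


# Hypothetical columns with one missing entry each.
weights = [60.0, None, 80.0, 70.0]
levels = ["low", "high", None, "high"]

filled_weights = impute_column(weights, continuous=True)
filled_levels = impute_column(levels, continuous=False)
```

Because every gap in a column receives the same value, this approach shrinks the variance of the column, which is one reason it can be a poor fit for heterogeneous data.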
A similar idea is used in conditional mean imputation. The missing value is again replaced by an average, but the average is not taken over all objects in the data set. It is taken only over the objects whose judged eigenvalue is consistent with that of the target. The basic idea of both algorithms is to insert the most probable value for the missing data, but the specific implementation differs depending on the data. Using the average eigenvalue as a guide, this method is easy to use and straightforward. However, it assumes that the dataset is homogeneous, which may not be the case in industrial settings where data can be varied and heterogeneous.

3.4 Hot card insertion

Hot card interpolation looks in the original data collection for the object closest to the one with the missing value and uses that object's characteristics for interpolation. The missing data are thus estimated through the interconnection between data objects. Its deficiency is that similarity is hard to define and there is much subjective influence.

In this way, a single numerical feature can be assigned to the evaluation object. The maximum deviation value represents a numerical value as an attribute, and the magnitude of the value determines the importance of the element. Represents the number of data samples. Is for the number characteristic in a pair. This is a special ability. Compared with the principal component analysis (PCA) method, the variance-based maximum deviation method is more convenient, but it is not universal and does not involve the interaction between attributes.
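The hot card idea described at the start of this subsection, borrowing the value from the closest complete object, can be sketched as below. The choice of unscaled Euclidean distance, the function name `hot_deck_impute`, and the sample records are all assumptions for illustration. The paper does not specify a similarity measure, and that choice is precisely where the subjectivity mentioned above enters.

```python
import math


def hot_deck_impute(records, target):
    """Fill a missing `target` value from the nearest complete record (the donor).

    Assumes every other field is numeric and present in all records.
    """
    donors = [r for r in records if r[target] is not None]
    filled = []
    for rec in records:
        if rec[target] is not None or not donors:
            filled.append(dict(rec))
            continue
        features = [k for k in rec if k != target]

        def distance(donor):
            # Unscaled Euclidean distance over the shared features (an assumption).
            return math.sqrt(sum((rec[k] - donor[k]) ** 2 for k in features))

        donor = min(donors, key=distance)
        filled.append({**rec, target: donor[target]})
    return filled


# Hypothetical sensor records; the third one is missing "power".
records = [
    {"temp": 20.0, "load": 0.5, "power": 10.0},
    {"temp": 35.0, "load": 0.9, "power": 30.0},
    {"temp": 21.0, "load": 0.6, "power": None},
]
completed = hot_deck_impute(records, "power")
```

In this example the third record is closest to the first, so it inherits that record's `power`. Rescaling `temp` and `load` before measuring distance could change which donor is chosen.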
MIC, the maximal information coefficient, is used to determine the correlation between attributes. It is a parameter-free, information-based measure that can detect different kinds of relationships between attributes and so reflect their importance. Conventional MIC only studies a single data type.

$$IMIC(a, A) = \sum_{b \in A - a} MIC(a, b) \tag{4}$$

Based on MIC, an IMIC algorithm for determining the importance of attributes is proposed. The MIC of each attribute with every other attribute is summed to reflect the correlation between the attributes, so that the importance of each attribute can be determined without supervision.

In the practical case, we assume that the values are normally distributed. On this basis, the sampled values of each interval follow a normal distribution, whose peak is regarded as the center point of the interval. Based on the sample size represented by the characteristic values of the statistical data, the interval with the largest sample size in a certain region is regarded as the center of the cluster.

$$KL_i = \frac{f(d_i) - f(d_{i-1})}{d_i - d_{i-1}}, \qquad KR_i = \frac{f(d_i) - f(d_{i+1})}{d_i - d_{i+1}}, \qquad K_i = KL_i \times KR_i \tag{5}$$

Hot card insertion uses the properties of neighboring values in the dataset to approximate missing data. However, defining similarity and deciding which values are closest can be subjective, which could produce biased findings, particularly in intricate industrial datasets with a variety of features.

3.5 K-adjacent interpolation

This algorithm is based on the K sampling values nearest to the object with the missing data and uses the weighted average of those K values as the estimate [3]. K-nearest neighbor interpolation uses the hierarchical clustering pattern to estimate the class of the missing data and interpolates the average of this class. The idea is to try candidate values for the missing data and see which inserted value achieves the best interpolation result.
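The weighted-average estimate from the K nearest samples can be sketched as follows. Inverse-distance weighting, Euclidean distance, the function name `knn_impute`, and the sample rows are assumptions made for this illustration; the paper does not state which weights or distance it uses.

```python
import math


def knn_impute(rows, query, target_index, k=3):
    """Estimate query[target_index] from the k nearest complete rows.

    Neighbours are weighted by inverse Euclidean distance over the other columns.
    """
    feature_idx = [i for i in range(len(query)) if i != target_index]

    def dist(row):
        return math.sqrt(sum((row[i] - query[i]) ** 2 for i in feature_idx))

    complete = [r for r in rows if r[target_index] is not None]
    neighbours = sorted(complete, key=dist)[:k]
    # A small constant avoids division by zero for an exact match.
    weights = [1.0 / (dist(r) + 1e-9) for r in neighbours]
    weighted_sum = sum(w * r[target_index] for w, r in zip(weights, neighbours))
    return weighted_sum / sum(weights)


# Hypothetical rows of (feature, target); the query is missing its target.
rows = [(1.0, 10.0), (2.0, 20.0), (3.0, 30.0), (10.0, 100.0)]
estimate = knn_impute(rows, (2.5, None), target_index=1, k=2)
```

With `k=2` the two nearest rows are equally distant from the query, so the estimate is the plain average of their targets. Each estimate requires a scan of all complete rows, which is one reason the method becomes expensive on large data sets.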
Such an algorithm is a good solution when the amount of data is small. If the data set is large and many values are missing, however, a great deal of data is needed to test the candidate values. The approach relies on the data being regular, that is, clustered. From the degree of concentration of the data, the average value can be found. In this paper, statistics and analysis of the Sat image data set from UCI are presented. On this basis, the maximum deviation method is used to determine attribute importance. The larger the deviation value, the higher the uncertainty of the attribute, and the higher its importance. The specific algorithm is as follows:

$$Y_{a_i} = \sum_{j=1}^{n} \sum_{k=1}^{n} \sqrt{\left( \frac{v_j^{i} - v_k^{i}}{W_{a_i}} \right)^{2}}, \qquad i = 1, 2, \ldots, k \tag{6}$$

Using the closest K sampling values, K-adjacent interpolation predicts missing data. Although it provides a mathematical methodology, its efficacy may differ based on the distribution and features of industrial big data.

3.6 A comprehensive approach

This method also tests candidate values for the missing data, with the difference that the inserted value is the best one from the final attribute reductions. It improves accuracy at the cost of increased computational complexity. In the case of big data many values are missing, so this algorithm can obtain good interpolation accuracy, but it costs a lot of time. Most of the commonly used methods for interpolating missing information are regression-based. However, all regression methods are built on complete data, so the data must be pre-processed. When data are missing, the known feature quantities are used as predictors, and the fitted model predicts the missing data. This paper also focuses on the interpolation of missing data.
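The regression-based idea, fitting a model on the complete records and using it to predict the missing values, can be sketched with ordinary least squares on a single predictor. This is a deliberately small stand-in: the function names `fit_line` and `regression_impute` and the sample pairs are assumptions, and the paper does not say which regression model it has in mind.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)  # assumed non-zero (xs not all equal)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regression_impute(pairs):
    """Fit on the complete (x, y) pairs, then predict y wherever it is None."""
    complete = [(x, y) for x, y in pairs if y is not None]
    slope, intercept = fit_line([x for x, _ in complete], [y for _, y in complete])
    return [(x, y if y is not None else slope * x + intercept) for x, y in pairs]


# Hypothetical data following y = 2x + 1, with one missing y.
pairs = [(1.0, 3.0), (2.0, 5.0), (3.0, None), (4.0, 9.0)]
filled = regression_impute(pairs)
```

The model is fitted only on rows where the target is present, which reflects the point above that regression methods are built on complete data.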
MIC is computed by a parameter-free search over the data. According to the experimental results of reference [66], the larger the MIC value, the greater the correlation between the two attributes. If the MIC of two attributes is 1, there is a strong functional (for example, linear) relationship between them. The method divides the two-dimensional data into grids and accumulates the mutual information in each grid to obtain the initial mutual information. The largest normalized value obtained over the different grid assignments is the MIC value, see formula (7). MIC can be used to find the direct relationship between attributes, as well as their internal relationship, to reflect the importance of each attribute.

$$MIC(a, b) = \max_{n \times m < B} \left( \frac{\sum_{x \in X,\, y \in Y} P(x, y) \log \dfrac{P(x, y)}{\sum_{x \in X} P(x, y) \sum_{y \in Y} P(x, y)}}{\log \min\{n, m\}} \right) \tag{7}$$

Each interval in the core interval set is represented by the traditional interval representation, and the cut points of each interval are obtained by the static singular value decomposition (SVD) algorithm from the data samples. These cut points are supported by the data, and the corresponding attribute values can be found in the original data samples [15-17].

$$ED_a = \{\, [p_{\min}, p_{1,3}),\ [p_{1,3}, p_{2,3}),\ \ldots,\ [p_{k-1,3}, p_{\max}] \,\} \tag{8}$$

$$BD_a = \{\, (p_{1,2}, p_{2,1}),\ (p_{2,2}, p_{3,1}),\ \ldots,\ (p_{k-1,2}, p_{k,1}) \,\} \tag{9}$$

Each interval in the blank interval set is represented by the traditional interval representation, and each is based on the interval segments not included in the corresponding core interval set, as shown in Table 2 below:

Table 2: Case table of singular value decomposition theory

| u1 | b | c | d |
| u2 | 0.8 | 2 | 1 |
| u3 | 1 | 0.5 | 0 |
| u4 | 1.3 | 3 | 0 |
| u5 | 1.4 | 1 | 1 |
| u6 | 1.4 | 2 | 1 |
| u7 | 1.6 | 3 | 1 |

Data are assumed to arrive incrementally over time, and numerical data are assumed to follow a normal distribution.
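The grid-based score of equation (7) and the IMIC sum of equation (4) can be sketched together. This is a simplification and not a full MIC implementation: it only tries equal-width grids, whereas the complete MIC procedure also optimizes where the grid lines are placed. The function names, the default `budget` (playing the role of B), and the sample columns are assumptions for illustration.

```python
import math
from collections import Counter


def grid_score(xs, ys, n, m):
    """Mutual information on an equal-width n x m grid, divided by log(min(n, m))."""
    def bin_index(v, lo, hi, bins):
        if hi == lo:
            return 0
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)

    lo_x, hi_x, lo_y, hi_y = min(xs), max(xs), min(ys), max(ys)
    cells = Counter(
        (bin_index(x, lo_x, hi_x, n), bin_index(y, lo_y, hi_y, m))
        for x, y in zip(xs, ys)
    )
    total = len(xs)
    px, py = Counter(), Counter()
    for (i, j), count in cells.items():
        px[i] += count
        py[j] += count
    mi = sum(
        (c / total) * math.log((c / total) / ((px[i] / total) * (py[j] / total)))
        for (i, j), c in cells.items()
    )
    return mi / math.log(min(n, m))


def mic_sketch(xs, ys, budget=16):
    """Largest grid score over all equal-width grids with n * m < budget."""
    best = 0.0
    for n in range(2, budget):
        for m in range(2, budget):
            if n * m < budget:
                best = max(best, grid_score(xs, ys, n, m))
    return best


def imic(name, columns, budget=16):
    """Equation (4): sum of the scores between one attribute and all the others."""
    return sum(
        mic_sketch(columns[name], columns[other], budget)
        for other in columns if other != name
    )


# Hypothetical columns: b depends linearly on a, c is constant.
a = [float(i) for i in range(20)]
columns = {"a": a, "b": [2.0 * v + 1.0 for v in a], "c": [5.0] * 20}
importance_a = imic("a", columns)
```

In this toy example `a` and `b` reach the maximum score on the 2 x 2 grid, while the constant column `c` contributes nothing, so the importance of `a` comes entirely from its relationship with `b`.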
If a data set containing numeric characteristic data is entered at a certain time, the continuity between all characteristics of the data is taken into account, so that the resulting data intervals have a certain order, that is, a priority in the case of sequence clusters. On this basis, the numeric feature set is first extracted to determine the importance of the numeric attributes, and the corresponding attribute set is then obtained by sorting the attributes by importance. The expected KDR digit feature is regarded as the net ranking of each data in the sequence cluster, and the sequence clustering method is used to sort the unlabeled information to obtain the three interval sets of the initial KDR operation. The uncertain regions in the middle are then re-evaluated and segmented, and the KDR operation is applied to the original data. Finally, the unlabeled data set after the KDR operation is obtained by successive KDR operations on the numeric attribute group. These features include the features of the KDR operation, and they are introduced into the newly separated features. The dynamic KDR operation method is adopted, and the combination process is carried out to obtain a new KDR operation information system. Although it requires more computing power and complexity, this method combines several approaches to increase interpolation accuracy. In industrial settings where precision is crucial, this gain might make up for the extra computing burden.

4 Result analysis

In Table 3 below, statistics of the NB algorithm are computed on the data after the KDR operation. In general, the NB algorithm performs as well as the Euonymus algorithm. Considered individually, this method is better than EF in 11 cases, EW in 9 cases, and EF_Unique in 7 cases. In total, the results of 27 out of 33 trials were better than those of the control method, reaching 81.82%.
The average accuracy of the proposed method was 82.53%, ranking first. Compared with 81.07%, which ranked second, this is an increase of 1.46 percentage points. The statistical analysis of the suggested approach, which combines the KDR operation with the NB, KNN, and C4.5 algorithms, yields encouraging results. With 81.82% of trials producing better outcomes than the control approaches, the suggested strategy shows a significant improvement over the other methods. With respect to recall rates and precision, the suggested approach is superior in 78.79% of cases, indicating its efficacy in classification tasks. Its advantages are particularly evident when compared to rule-based systems, as demonstrated by the lower number of leaf nodes produced by the C4.5 process. These results highlight how important it is to combine different methods to improve classification efficiency and accuracy. Moreover, the technique's enhanced performance in comparison to well-known algorithms like Euonymus, EF, EW, and EF_Unique highlights its potential for real-world uses. The statistical tests validate the importance of these results, indicating the robustness and dependability of the suggested approach in handling classification problems.

In this method, the KNN and NB methods were used to calculate KDR operation data, and C4.5 was used to analyze KDR operation data. Considered individually, this method is better than the EF algorithm in 6 cases, the EW method in 8 cases, and the Euonymus method in 6 cases. In total, it was better in 20 cases, a proportion of 60.61%. As a rule-based method, C4.5 produces incomplete rules when the real rules are generated, which makes some problems difficult to classify accurately in experiments.
Therefore, we also compute the average recall rate for each data set. The comparison shows that this method is better than the control method in 26 cases, reaching 78.79%. On this basis, the average precision and average recall of the proposed algorithm have reached a good level. The number of leaf nodes produced by the C4.5 process is illustrated. As can be seen in Table 3 and Figure 1, the proposed method produces fewer leaf nodes than the others.

Table 3: Schedule of dynamic singular value decomposition theory

| Time (ms) | 80 | 60 | 50 |
| HTRU2 | 354743 | 228578 | 195755 |
| A villa | 205466 | 135768 | 124841 |
| B an | 3950123 | 1528104 | 1481980 |
| Shuttle | 2306469 | 2007489 | 1477111 |

Figure 1: Scheduling performance of dynamic SVD theory

The performance of KNN, KDR, PLS, and SVD-KDR at different missing percentages is shown in Table 4 and Figure 2. At 5% missing data, KNN reaches 0.1 accuracy and KDR reaches 0.8. At 10%, KDR falls to 0.7 but stays well ahead of KNN. PLS and SVD-KDR are the most robust: PLS stays at 0.8 or above and SVD-KDR at 0.9 or above at every missing percentage.

Table 4: Outcomes of interpolation method

| Missing percentage | KNN | KDR | PLS | SVD-KDR |
| 5 | 0.1 | 0.8 | 0.9 | 1.0 |
| 10 | 0 | 0.7 | 0.8 | 0.9 |
| 15 | 0.1 | 0.3 | 0.9 | 1.0 |
| 20 | 0.0 | 0.1 | 0.9 | 1.0 |
| 25 | 0 | 0.1 | 0.9 | 1.0 |

Figure 2: Comparison of interpolation method

Three algorithms, KNN, NB, and C4.5, are used in the experiments. The above studies show that the algorithm is effective for unsupervised data processing. Better results are obtained with the NB and C4.5 methods. For the industrial big data environment, and based on the characteristics of industrial data, this paper proposed a decentralized problem method based on industrial big data.
In this paper, the definition of KDR computation and the research status at home and abroad are described in detail. The paper focuses on the basic principle of bidirectional selection and related theories at home and abroad. At present, most KDR calculation methods are only used to deal with static data and do not pay attention to the dynamic change of data. Therefore, this paper further discusses the kinetic characteristics of KDR and industrial big data analysis from the perspective of kinetics. The main aspects of this study are:

4.1 A dynamic model of KDR operation is established by using a three-branch decision

In the case of industrial imputation data, due to the dynamic nature of the original data, the relationship between the original data and the data from a new time point cannot be guaranteed when the KDR operation is carried out. A big difference between the new and old data causes an unsatisfactory separation. However, most current KDR computational theories ignore the dynamic properties of the data. To solve this problem, a dynamic data analysis method based on the KDR operation is established by taking into account the singular value decomposition theory. The model has two parts: static and dynamic. Firstly, the static KDR operation method is used to perform the initial KDR processing at each time point. On this basis, the interval method is used to fuse the KDR operation information from each time point. A delay decision method that considers singular value decomposition (SVD) theory is introduced, and the original interval is replaced by a three-branch interval. In dynamic fusion, only the core region is fused, the existing blank region is discarded, and the delay method is used to segment the blank region.
Lastly, the accuracy of the suggested analysis algorithm is confirmed using the UCI test results. According to the experimental results, the technique has a promising future for KDR computational processing of dynamic data.

4.2 Three-branch interval method taking into account singular value decomposition theory

According to the kinetic characteristics of industrial imputation data, the three-branch interval analysis method is given. To solve the problem that the boundary of the KDR operation interval is uncertain in dynamic data, this paper redefines the interval by considering singular value decomposition (SVD), so that it can not only express the boundary of the region but also update the edge of the dynamic region in real time. After the space interval is described, it is used to distribute the space appropriately, to achieve dynamic adjustment of the KDR operation interval for massive information, and to solve the attribute KDR operation problem of incremental big data. The gap between the three root intervals can not only show the change in the region but also dynamically adjust the region according to the size of the space.

5 Discussion

The time measurements in milliseconds (ms) for the tasks completed on HTRU2, A villa, B an, and Shuttle are shown in Table 3. Each row represents a separate data set, and each column represents a different time condition: 80, 60, and 50 milliseconds. Under the 80 ms condition, HTRU2 took 354,743 ms, A villa took 205,466 ms, B an took 3,950,123 ms, and Shuttle took 2,306,469 ms. Under the 60 ms condition, HTRU2 took 228,578 ms, A villa took 135,768 ms, B an took 1,528,104 ms, and Shuttle took 2,007,489 ms. Under the 50 ms condition, HTRU2 took 195,755 ms, A villa took 124,841 ms, B an took 1,481,980 ms, and Shuttle took 1,477,111 ms.
Our study incorporates singular value decomposition (SVD) theory into a dynamic KDR operating model for industrial big data interpolation that takes the incremental nature of data accumulation into account. Notably, our strategy departs from the static techniques that are common in the literature and that frequently ignore how data change over time. We ensure the consistency of KDR calculation results under dynamic conditions by combining SVD theory with a rough set approach based on Manilkara. Contrasting our findings with previous research, we find that most algorithms are intended for static KDR processes and are therefore not built to handle the incremental nature of industrial data. By closing this gap and offering a foundation for dynamic interpolation, our methodology makes a novel contribution. Moreover, it enables a more precise depiction of changing data trends, which improves the robustness and dependability of industrial big data analysis. This comparative analysis shows the distinctive qualities and benefits of the proposed model and opens the door to further developments in the area.

5.1 Limitations

The proposed dynamic KDR operation model, which integrates the rough set method with SVD theory, has drawbacks. Its computational complexity might make it difficult to use with very large datasets. Additionally, problems with data quality or outliers may make the combination of old and new data less effective. Dependence on theoretical frameworks such as SVD may restrict its application to a variety of commercial datasets and makes rigorous cross-domain validation necessary.

6 Conclusion

Although there has been research on the non-monitoring nature of interpolation in industrial big data, the problems caused by interpolation have not been completely solved. In the processing of industrial big data, there is often a lack of data.
Relevant theories and methods for dynamic KDR operation on such incomplete data are still lacking at home and abroad. The processing of missing data often leads to further loss of data and thus affects the analysis. Exploring the KDR operation with singular value decomposition (SVD) theory in mind is therefore a very promising solution. The study of industrial big data interpolation is still beset with numerous issues. Statistics of industrial data show that the data are extremely uneven. The root cause of the problem is that the consistency between the pre-KDR and the interval is guaranteed by using a wide range of cell merging. Therefore, the correctness of the algorithm can be tested by using UCI data. Experiments show that the algorithm can not only handle non-monitored static data effectively but also separate non-monitored dynamic data effectively, which provides a useful reference for the subsequent KDR computation of dynamic data.