https://doi.org/10.31449/inf.v48i6.5450
Informatica 48 (2024) 19–34

A Micro-Class Teaching Data Retrieval Method of Business English Based on Network Information Classification

Wang Guifang
E-mail: 20072026@zyufl.edu.cn
College of English, Zhejiang Yuexiu University, Shaoxing, Zhejiang 312000, China

Keywords: network information classification, business English, micro-class teaching, data retrieval, optimised support vector machine, improved artificial bee colony algorithm

Received:

To quickly extract the required business English micro-class teaching data from a fragmented network information environment, this paper proposes a micro-class teaching data retrieval method based on network information classification. The method builds a network information classifier on an optimised SVM model whose parameters are tuned and trained by an improved artificial bee colony algorithm. The optimised SVM then classifies the online teaching information of the various business English micro-classes. Within the classified network information set, big data techniques perform targeted retrieval of network teaching resources: clustering identifies the teaching data most similar to the user's query, which completes the targeted retrieval of micro-class teaching data. The experimental results show that the retrieval delay of the proposed method is less than 1 s, the number of correct retrievals is high, and wrong retrievals are rare.

Povzetek (Summary): The method for fast retrieval of business English lecture data is based on the classification of web information with an optimised SVM model.

1 Introduction

For business English courses, the content is complex.
As a key course in the major, it is important to highlight teaching effectiveness [1], [2]. It is therefore worth integrating micro-class teaching with business English, applying the core functions of micro-class teaching to business English teaching and drawing out its essence. In the new era, the attempt at "micro-class" teaching of business English has its own value. Based on the current application status of the micro-class teaching mode, its teaching value can be analysed from the perspectives of fragmented content, interactive learning and diversified resources [3], which provides the necessary conditions for teaching innovation. The core resources for implementing micro-class teaching are micro-video courses. Each teaching video usually lasts 6-8 minutes, with a clear focus and specific content [4], [5]. Micro-class teaching makes full use of these convenient features. It supports teaching practice with diversified teaching resources, which has a multifaceted and comprehensive impact on students. Supplementing and expanding the resources makes the content fuller, more specific and more attractive, and improves students' comprehensive ability to expand and optimise resources [6]. At the same time, in the micro-class teaching of business English, the diversified teaching resources also pose a challenge to teaching data retrieval. Quickly extracting the required teaching data from diversified teaching resources is one of the difficulties in the current micro-class teaching of business English [7]. In reference [8], Wang et al. applied deep learning to the cross-modal retrieval of network information, which improved the retrieval accuracy by about 19% compared with the manual feature method.
However, the parameter initialisation of the deep learning network during training needs to be improved, because it limits the speed of cross-modal retrieval of network information; this is left for future work. Reference [9] offered a geospatial data extraction method based on machine learning, which classifies and extracts geospatial data mainly through KNN classification. This research provided a reference idea for the present study. However, the method classifies data by measuring the distance between feature values. When the network information data is complex, and especially when the samples are unbalanced, its classification and extraction performance suffers. Reference [10] proposed a zero-sample cross-modal retrieval method based on deep supervised learning, addressing the problem that category matching and correspondence matching are not considered in current zero-sample cross-modal retrieval research. The method performs well in retrieving text by image and images by text. Still, the study itself points out at the end that the retrieval performance of the method depends directly on the training effect of the supervised deep learning network.
Based on the above analysis, the key contributions and results of existing research on teaching data retrieval and network information classification are summarized as follows:

| Paper | Key contributions | Result |
|---|---|---|
| [8] Wang et al. | Cross-modal data retrieval using deep learning techniques | Compared with manual feature methods, retrieval accuracy improved by approximately 19% |
| [9] Zhu et al. | Proposed a machine-learning-based geospatial data extraction method | Classification and extraction of geospatial data completed through the KNN classification method |
| [10] Zeng et al. | Proposed a zero-sample cross-modal retrieval method based on deep supervised learning | Good performance in retrieving text by image and images by text |

Further analysis of the table shows that, although deep learning achieved good results in cross-modal data retrieval in [8], the parameter initialization during the training phase needs to be improved, which may limit the speed of cross-modal retrieval of network information. In [9], the machine-learning-based geospatial data extraction method provides a reference idea, but it classifies and extracts data by measuring the distance between feature values, which may reduce its effectiveness on complex network information data or imbalanced samples. In [10], the zero-sample cross-modal retrieval method based on deep supervised learning performs well in retrieving text by image and images by text. However, its retrieval performance depends directly on the training effect of the deep supervised learning network, which implies certain limitations. In summary, state-of-the-art teaching data retrieval methods still have gaps or limitations in parameter initialization, in handling data complexity and sample imbalance, and in their dependence on the training effectiveness of deep supervised learning networks.
Given the issues above, and because network information classification can improve the retrieval of business English micro-class teaching data, this topic has broad research value and has become a key research topic for many scholars [11], [12], [13], [14]. Therefore, this paper proposes a micro-class data retrieval method of business English based on network information classification. The method has two main parts: network information classification, and retrieval of business English micro-class teaching data. The teaching data retrieval is carried out within the classified network information set. This design improves the order and regularity of business English micro-class teaching data, and thus the data retrieval effect.

2 The micro-class data retrieval method of business English

A large number of micro-videos and micro-courses are distributed across network platforms, and it is difficult for learners to concentrate on learning. They pay attention to the quantity of learning content and ignore its quality [15], [16], [17], [18], [19], [20]. In fragmented learning, information resources of all kinds are mixed, the learner's ability to discriminate between them is limited, and the quality of the learned content is not guaranteed. In addition, fragmented learning time is uncertain, and learners cannot always complete their learning in the limited time available. Content is browsed quickly and superficially, which makes it difficult to absorb and incorporate into the learner's own knowledge system. Neither the quality nor the quantity of learning is assured, which reduces learning quality and efficiency [21], [22], [23]. Therefore, this paper designs a data retrieval method for the business English micro-class.
2.1 Establishment of network information classification model based on SVM

Support vector machines (SVM) can accurately classify network information [24]. The idea of SVM is based on the optimal classification surface under linear separability, as illustrated in Fig 1.

Figure 1: Schematic diagram of optimal classification surface of SVM.

The optimal classification surface is defined in multi-dimensional space; in two dimensions it reduces to the optimal classification line. It must classify the networked business English micro-class teaching data samples accurately while maximising the margin between classes, that is, the classification interval. The principle is as follows. The training set of network information is set as $(a_j, b_j)$, and the linear function $f(a)$ of SVM in high-dimensional feature space is described by:

$$f(a) = \varpi^{\mathrm{T}} \phi(a) + c \tag{1}$$

where $\varpi$ is the weight vector of the network information characteristic, $c$ is the offset of the network information characteristic, and $\phi(a)$ is the feature-space mapping of the input network information sample $a$. According to the goal of risk minimisation $\min I(\varpi, d)$, the above equation can be described as a constrained optimisation problem:

$$\min I(\varpi, d) = \frac{1}{2}\varpi^{\mathrm{T}}\varpi + \frac{D}{2}\sum_{j=1}^{m} d_j^{2} \quad \text{s.t.} \quad b_j = \varpi^{\mathrm{T}}\phi(a_j) + c + d_j, \; j = 1, 2, 3, \ldots, m \tag{2}$$

where $I(\varpi, d)$ is the structural risk value of the SVM model, which indicates the model's effectiveness; $D$ is the penalty coefficient; $d_j$ is the classification error; $b_j$ is the target (class label) of the $j$-th network information training sample; and $m$ is the total number of network information training samples. By introducing the Lagrange function with Lagrange multipliers, the above constrained optimisation problem can be converted into an unconstrained optimisation problem in dual space.
The equation is as expressed below:

$$L(\varpi, c, d, \beta) = I(\varpi, d) - \sum_{j=1}^{m} \beta_j \left( \varpi^{\mathrm{T}}\phi(a_j) + c + d_j - b_j \right) \tag{3}$$

where $\beta_j$ are the Lagrange multipliers. The following linear system is obtained from the KKT conditions:

$$\begin{bmatrix} 0 & \mathbf{1}^{\mathrm{T}} \\ \mathbf{1} & K + D^{-1}J \end{bmatrix} \begin{bmatrix} c \\ \beta \end{bmatrix} = \begin{bmatrix} 0 \\ b \end{bmatrix} \tag{4}$$

where $J$ is the identity matrix, $\mathbf{1}$ is the all-ones vector, and $K$ is the kernel matrix with entries $K_{ji} = \phi(a_j)^{\mathrm{T}}\phi(a_i)$. In the process of network information classification, the radial basis function is chosen as the SVM kernel function [25]. In the network information classification method based on the optimised SVM model, the SVM model is then:

$$f(a) = \operatorname{sgn}\left[ \sum_{j=1}^{m} b_j \exp\left( -\frac{\left\| a_j - a_i \right\|^{2}}{2\mu} \right) + c \right] \tag{5}$$

where $\mu$ is the width coefficient of the radial basis function, and $a_j$ and $a_i$ are the $j$-th and $i$-th network training samples, respectively. According to the classification principle of SVM [26], the learning performance of SVM is determined by $D$ and $\mu$. Poorly chosen values of these two parameters cause the model to over-fit or under-fit, so the SVM parameters must be optimised.

2.2 Improved artificial bee colony algorithm

There are three kinds of bees in the algorithm: employed bees, on-looker bees and scout bees. They cooperate to find better honey sources. First, the employed bees search for new honey sources around the existing ones and, once the search is complete, share the honey source location, honey quantity and other information with the on-looker bees. Then, based on the information shared by the employed bees, each on-looker bee chooses a honey source to continue mining; the more honey a source holds, the higher its probability of being selected. Finally, if a honey source has not been updated for many consecutive searches, it is abandoned, and the scout bee searches randomly for a new honey source to replace it.
It is worth noting that the number of employed bees, the number of on-looker bees and the number of honey sources are the same, and there is only one scout bee. A honey source corresponds to a candidate solution of the SVM optimisation problem, that is, a pair of values for the parameters $D$ and $\mu$ of the SVM model. The honey amount of a honey source corresponds to the fitness value of the SVM optimisation problem. The classical ABC algorithm starts the iterative search from a randomly initialised population. Suppose the number of honey sources, each representing a candidate solution of the SVM optimisation problem, is $Rm$, and the $j$-th honey source is $Y_j$; then:

$$Y_j = \left( y_{j,1}, y_{j,2}, \ldots, y_{j,E} \right) \tag{6}$$

where $y_{j,i}$ is the value of the $j$-th honey source in dimension $i$ and $E$ is the number of dimensions.

$$y_{j,i} = y_i^{\min} + rand \cdot \left( y_i^{\max} - y_i^{\min} \right) \tag{7}$$

where $j = 1, 2, \ldots, Rm$ and $i = 1, 2, \ldots, E$; $y_i^{\max}$ is the upper bound of the $i$-th dimension of the SVM model parameters and $y_i^{\min}$ is the corresponding lower bound; $rand$ is a random number uniformly distributed between 0 and 1. During initialisation, Equation (7) generates all dimension values of each honey source. After population initialisation, the whole population enters the search phases of the employed bees, on-looker bees and scout bee, and iterates through these three phases until the termination condition of the algorithm is reached. The details of the three stages are as follows:

(1) Employed bee stage

At this stage, each employed bee generates a new honey source $U_j = \left( u_{j,1}, u_{j,2}, \ldots, u_{j,E} \right)$ near its corresponding honey source $Y_j$ according to Equation (8):

$$u_{j,i} = y_{j,i} + \varphi_{j,i} \left( y_{j,i} - y_{s,i} \right) \tag{8}$$

where $y_{s,i}$ is the $i$-th dimension of a honey source selected at random from the population, and $\varphi_{j,i}$ is a random number uniformly distributed between 0 and 1.
Under the greedy selection mechanism, if the candidate honey source $U_j$ has more honey, that is, a better fitness value, it replaces the honey source $Y_j$.

(2) On-looker bee stage

After all the employed bees complete their search, the on-looker bees select a honey source to continue mining according to the information received. The probability of the $j$-th honey source being chosen is:

$$q_j = \frac{\mathit{fit}_j}{\sum_{i=1}^{Rm} \mathit{fit}_i} \tag{9}$$

where $\mathit{fit}_j$ is the fitness of the $j$-th honey source:

$$\mathit{fit}_j = \begin{cases} \dfrac{1}{1 + \min I(\varpi, d)}, & \min I(\varpi, d) \ge 0 \\ 1 + \left| \min I(\varpi, d) \right|, & \text{else} \end{cases} \tag{10}$$

where $\min I(\varpi, d)$ is the objective function value of the solution. The higher the fitness, the greater the probability that the honey source is selected.

(3) Scout bee stage

After the above two stages, if a honey source has not been updated for many consecutive searches, the honey source is considered exhausted. It is then discarded and replaced by a new honey source generated according to Equations (6) and (7).

In the traditional artificial bee colony method, employed bees decide where to search next by greedily comparing the fitness values of the honey source in the two most recent searches. The food source search mode determines whether bees can find new honey sources quickly and accurately. However, the search does not take into account how good the positions before and after the iteration are relative to the whole population, so global optimisation is insufficient and the search ability of the method is deficient. The main disadvantages are large randomness between iterations, slow updates, and a tendency to fall into local optima. To resolve this issue, a global search factor is introduced.
In each search, the information of the current honey source with the best fitness is added to the next location update:

$$Y_j^{new} = Y_j + rand \cdot \left( Y_{nj} - Y_{hj} \right) + rand \cdot \left( Y_{best,j} - Y_j \right) \tag{11}$$

where $Y_{nj}$ and $Y_{hj}$ are two different honey sources; $h$ and $n$ are randomly chosen indices that are not equal to each other and neither of which equals $j$; $rand$ is a random number uniformly distributed between 0 and 1; and $Y_{best,j}$ is the honey source with the highest food abundance (fitness value) at present. In the first cycle of the artificial bee colony optimisation, it is the honey source with the highest fitness value among the initial $M$ honey sources. The improved artificial bee colony algorithm gives the bees' search a direction and accelerates the convergence of the algorithm.

2.3 Optimisation process of SVM parameters

When the improved artificial bee colony algorithm is used for optimisation, the SVM parameters to be optimised serve as the honey sources. The three types of bees (employed bees, on-looker bees and scout bees) optimise the honey sources according to their own tasks to obtain the best honey source. Employed bees evaluate honey sources against an evaluation criterion (such as prediction accuracy or another model performance indicator) and select the better honey source based on the result. On-looker bees continue searching for and discovering new honey sources, which increases the effectiveness of the global search. The scout bee, in turn, has a high exploration ability and can jump out of a local optimum, which prevents the algorithm from becoming trapped in it.
Through the above design, the improved artificial bee colony algorithm can find the optimal SVM parameters more accurately, thereby improving the performance and prediction accuracy of the SVM model. The optimal honey source is then used to build the network information classification model based on the optimised SVM model. Fig 2 is the flow chart of this model.

Figure 2: Operation process of network information classification model based on optimised SVM model

(1) Initialise the control parameters of the algorithm, mainly the size of the bee colony, the number of honey sources, the maximum number of honey source cycles, and the maximum number of iterations.
(2) Set the fitness function of the algorithm. The fitness function is calculated using Equation (10).
(3) According to their respective tasks, the three types of bees optimise the honey sources, calculate the fitness value using the fitness function, and optimise all candidate solutions found.
(4) Check whether the number of cycles for which a honey source has gone without improvement exceeds the maximum. If it does, a newly generated honey source replaces the original honey source. The best honey source found so far is recorded, and the termination condition is checked against the maximum number of iterations.
(5) The global optimal honey source obtained, that is, the optimal parameters, is used to construct the SVM model.
(6) After the SVM model is constructed, Equation (5) is used to complete the network information classification.
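Steps (1) to (5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names, the default control parameters, the clipping of candidates to the bounds, and the use of Equation (11) in both the employed and on-looker stages are choices made here. The objective `toy_risk` is a hypothetical stand-in for the SVM structural risk $\min I(\varpi, d)$, which in practice would require training the SVM for each candidate $(D, \mu)$.

```python
import random


def fitness(objective_value):
    """Equation (10): map an objective value to a fitness value."""
    if objective_value >= 0:
        return 1.0 / (1.0 + objective_value)
    return 1.0 + abs(objective_value)


def improved_abc(objective, lower, upper, n_sources=10, limit=20,
                 max_iter=100, seed=0):
    """Minimise `objective` over the box [lower, upper]; needs n_sources >= 3."""
    rng = random.Random(seed)
    dim = len(lower)

    def random_source():
        # Equation (7): uniform random position inside the bounds.
        return [lower[i] + rng.random() * (upper[i] - lower[i])
                for i in range(dim)]

    # Step (1): initialise honey sources and their stagnation counters.
    sources = [random_source() for _ in range(n_sources)]
    values = [objective(y) for y in sources]
    trials = [0] * n_sources
    j_best = min(range(n_sources), key=lambda j: values[j])
    best, best_value = sources[j_best][:], values[j_best]

    def search_around(j):
        nonlocal best, best_value
        # Equation (11): two distinct random sources n, h (both != j)
        # plus a pull towards the best source found so far.
        n, h = rng.sample([k for k in range(n_sources) if k != j], 2)
        candidate = []
        for i in range(dim):
            v = (sources[j][i]
                 + rng.random() * (sources[n][i] - sources[h][i])
                 + rng.random() * (best[i] - sources[j][i]))
            candidate.append(min(max(v, lower[i]), upper[i]))  # assumed clipping
        value = objective(candidate)
        # Greedy selection: keep the candidate only if its fitness is better.
        if fitness(value) > fitness(values[j]):
            sources[j], values[j], trials[j] = candidate, value, 0
            if value < best_value:
                best, best_value = candidate[:], value
        else:
            trials[j] += 1

    for _ in range(max_iter):
        # Step (3): employed bees, one per honey source.
        for j in range(n_sources):
            search_around(j)
        # Step (3): on-looker bees pick sources by roulette wheel, Equation (9).
        weights = [fitness(v) for v in values]
        for _ in range(n_sources):
            j = rng.choices(range(n_sources), weights=weights, k=1)[0]
            search_around(j)
        # Step (4): the single scout bee replaces the most stagnant source
        # once it has failed to improve more than `limit` times.
        j_old = max(range(n_sources), key=lambda j: trials[j])
        if trials[j_old] > limit:
            sources[j_old] = random_source()
            values[j_old] = objective(sources[j_old])
            trials[j_old] = 0
            if values[j_old] < best_value:
                best, best_value = sources[j_old][:], values[j_old]
    # Step (5): the best honey source gives the parameters (D, mu).
    return best, best_value


def toy_risk(params):
    """Hypothetical stand-in for the SVM structural risk, not the real one."""
    D, mu = params
    return (D - 10.0) ** 2 + (mu - 0.5) ** 2


best_params, best_risk = improved_abc(toy_risk, [0.1, 0.01], [100.0, 10.0])
print(best_params, best_risk)
```

Replacing `toy_risk` with a function that trains the SVM for a given $(D, \mu)$ and returns its validation error would turn the sketch into a parameter search; the parameter bounds shown are illustrative only.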
2.4 Directional retrieval method of online teaching resources based on big data technology

Once the classification of network information resources is completed, big data technology is used to perform directional retrieval of network information resources. The retrieval process is presented in Fig 3.

Figure 3: Directional retrieval of network information resources on the basis of big data technology

As presented in Fig 3, big data technology is used to cluster and analyse the classified network information set against the user search keywords. Since the user search keywords may belong to multiple categories, each of which may contain matching teaching data, the network information resources are taken as the vertices of a graph. Weights are assigned according to the correlation between the network information resources and the user search keywords. This yields an undirected weighted graph, which transforms the clustering analysis of business English micro-class teaching data into a graph partitioning problem. The undirected weighted graph can be expressed as:

$$P = \langle \theta, \xi, \rho \rangle \tag{12}$$

where $P$ is the undirected weighted graph used for the retrieval and analysis of business English micro-class data; $\theta$ is the vertex set of the undirected weighted graph, that is, the classified network information sample set; $\xi$ is the edge weight of the undirected weighted graph, that is, the correlation between the classified network information samples and the search keywords; and $\rho$ is a symmetric matrix. The key to cluster analysis of network information resources with an undirected weighted graph is the derivation of $\xi$, which amounts to calculating the correlation between the classified network information resource set and the retrieval keywords.
The calculation equation is:

𝜉 = 𝑔 𝛿 𝑔 𝑒 𝑃 ma x 𝑔 𝜉 lg 𝜆 𝜒 (13)

where $g_\delta$ is the frequency of the user search keywords in the classified network information set; $g_e$ is the number of network data samples containing the user search keywords; $g_\xi$ is the number of user keywords contained in the classified network information set; $\lambda$ is the amount of teaching data in the classified network information set; and $\chi$ is the key value of the user search keywords in the classified network information set, which ranges from -1 to 1. Equation (13) is used to calculate the correlation between each classified network information sample and the search keywords, sort the network information by degree of correlation, and establish the cluster set $L$. Based on this correlation, each sample is checked to determine whether it contains only one feature class. If so, it is taken as a subgraph of the undirected weighted graph. After all the network information samples in the cluster set $L$ have been evaluated, $n$ subgraphs are obtained, which gives the vertex set of the undirected weighted graph. On this basis, a semantic concept tree is established, the teaching data features are classified according to the attributes of each vertex in the undirected weighted graph, and the results are fused. The equation is:

𝛤 ( 𝜀 𝑚 ) = 𝜉𝑟 ( 𝜀 𝑚 ) 𝐺 ( 𝜀 𝑚 ) − 𝑟 ( 𝜀 𝑚 ) (14)

where $\Gamma(\varepsilon_m)$ is the result of teaching resource sample fusion, that is, the business English micro-class teaching data resources that meet the user's retrieval requirements; $r(\varepsilon_m)$ is the effective probability of retrieving the network data sample $\varepsilon_m$; and $G(\varepsilon_m)$ is the joint distribution probability of the network information sample $\varepsilon_m$.
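The ranking step described above (score each classified sample against the search keywords, then sort by correlation to form the cluster set $L$) can be sketched as follows. The scoring function is an assumption: it uses a generic TF-IDF-style weight as a stand-in for Equation (13) and is not the paper's formula. The function name `rank_by_relevance`, the whitespace tokenisation and the sample resources are likewise hypothetical.

```python
import math


def rank_by_relevance(documents, keywords):
    """Rank documents by a TF-IDF-style score for the query keywords.

    `documents` maps a resource id to its text. The score is an assumed
    stand-in for the correlation of Equation (13), not the paper's formula.
    """
    tokens = {doc_id: text.lower().split() for doc_id, text in documents.items()}
    n_docs = len(tokens)
    scores = {doc_id: 0.0 for doc_id in tokens}
    for word in (k.lower() for k in keywords):
        # Number of documents that contain the keyword at least once.
        doc_freq = sum(1 for words in tokens.values() if word in words)
        if doc_freq == 0:
            continue  # keyword absent everywhere: contributes nothing
        idf = math.log10(n_docs / doc_freq)
        for doc_id, words in tokens.items():
            if words:
                scores[doc_id] += words.count(word) / len(words) * idf
    # Highest correlation first; ties keep their input order.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical classified resources and a one-keyword query.
resources = {
    "video_01": "business english email writing email etiquette",
    "ppt_02": "business negotiation skills",
    "text_03": "travel english phrases",
}
ranking = rank_by_relevance(resources, ["email"])
print(ranking[0][0])
```

The sorted list plays the role of the cluster set $L$; the subsequent subgraph construction and fusion of Equation (14) are not covered by this sketch.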
The fused network information resources are output as the directed retrieval results, which completes business English teaching data retrieval based on network information classification.

3 Experimental analysis

Experiments are carried out to verify the effectiveness of the proposed method. Firstly, the business English micro course teaching data retrieval method based on network information classification is integrated as a new module of the education platform. This module interacts with the existing platform databases through APIs and provides a user interface for retrieving and browsing business English micro course resources. Then, the business English micro course teaching data is imported into the database of the education platform, and processed and transformed according to the structure of the existing data to ensure consistency with it. Secondly, a data retrieval module for business English micro course teaching is developed, including search interfaces, query algorithms, and result display functions; APIs are defined to exchange data and transmit query requests with the existing platform. Finally, a user-friendly search interface is designed that provides keyword search, advanced filtering, and sorting. The user interface is kept consistent with the appearance and functionality of the existing platform so that learners can use it seamlessly. The dataset selected for the experiment is business English micro course teaching data sourced from a certain website. Before the experiment, the experimental data was normalized so that the processed values range from -1 to 1. The online business English micro course teaching data used in the experiment is shown in Table 1.
Table 1: Specific information of network information data

| Business English micro-class data set type | Number of information types |
|---|---|
| Video | 2 |
| Picture | 2 |
| PPT | 2 |
| Characters | 2 |
| Data | 2 |

The dataset shown in Table 1 contains different types of information, including videos, pictures, PPTs, characters, and data, so that business English courses can be learned and understood from different perspectives and at different levels. Learners can acquire knowledge by watching videos, browsing pictures, reading PPTs, reading character texts, and analyzing data files. All kinds of information are well organized, with a certain degree of structure. By studying these materials, learners can improve their listening, speaking, reading, writing, and other business English skills, and prepare for future business scenarios. The dataset in Table 1 is therefore diverse, rich in resources, comprehensive, structured, and practical, which makes it suitable for verifying the retrieval performance of different methods. The related parameters of the SVM model used in this method are set as follows: the maximum number of iterations is 100. The method in this paper uses the network information classification model based on the optimised SVM model to classify the business English micro-class teaching data in Table 1. To show the classification effect of the method intuitively, the image teaching data in Table 1 is taken as an example. The distributions of the classification results of the image teaching data before and after the SVM adopts the improved artificial bee colony algorithm are shown in Fig 4 and Fig 5.

Figure 4: Distribution details of classification samples before improved artificial bee colony algorithm

Figure 5: Distribution details of classification samples after improved artificial bee colony algorithm

Comparing Fig 4 and Fig 5, it can be seen that before the method of this paper classifies the network information of the business English micro-class, the network information samples are disordered. There is no obvious boundary between the samples, which degrades image data retrieval for the business English micro-class. After the method of this paper classifies the network information of the business English micro-class, the image network information samples are accurately separated by the best classification plane, which shows that the method is able to classify the network information of the business English micro-class. Fig 6 shows how the training of the support vector machine converges with the artificial bee colony algorithm before and after the improvement. According to the parameter optimisation process in Fig 6, the improved artificial bee colony algorithm converges faster than the traditional artificial bee colony algorithm and can jump out of local optima, so the improvement is necessary.

(a) Before improvement (b) After improvement
Figure 6: Change of training convergence of support vector machine

The method in this paper uses the network information classification model based on the optimised SVM model. After the five types of network information data in Table 1 are classified, the classification effect is measured by the Conditional Log-Likelihood Loss (CLL-loss) index, which is a classification effect evaluation index.
The equation is as follows:

$$\text{CLL-loss} = -\sum_{h=1}^{5} \lg\left( \frac{1}{Q\left( A(h) \mid y(h) \right)} \right) \tag{15}$$

For the test data sample $y(h)$, when the classification probability of the correct type $A(h)$ is close to 1, the value of CLL-loss is minimal, and when the classification probability of the correct type $A(h)$ is close to 0, the value of CLL-loss is maximal. Fig 7 shows the classification effect on the five kinds of micro-class teaching data after applying the method in this paper and the methods in reference [8], reference [9] and reference [10].

Figure 7: Classification effect of five kinds of network information data

As presented in Fig 7, after applying the method of this article, the CLL-loss value of each of the five network information classification results is greater than 0.95, which is higher than the maximum CLL-loss value of 0.9 obtained with the three comparison methods. This indicates that, with the method of this paper, the classification probability of the correct type $A(h)$ of network information is close to 1, and the classification accuracy is high. After the network information of business English micro-class teaching has been classified, the ability of the method to retrieve micro-class teaching data is tested. Each group of experiments performs 10 searches in total, the number of retrieval files is 100, and the amount of network information resources is increased in steps of 100. To make the retrieval results more convincing, the methods of reference [8], reference [9] and reference [10] are used for comparison, and comparative experiments are carried out under the same conditions. Table 2 shows the retrieval results of network information resources by four methods.
Table 2: Retrieval results of micro-class teaching network information resources by four methods (all values in pieces)

| Number of network information samples | This paper: correct | This paper: error | Reference [8]: correct | Reference [8]: error | Reference [9]: correct | Reference [9]: error | Reference [10]: correct | Reference [10]: error |
|---|---|---|---|---|---|---|---|---|
| 100 | 100 | 0 | 98 | 2 | 95 | 5 | 91 | 9 |
| 200 | 199 | 1 | 198 | 2 | 195 | 5 | 192 | 8 |
| 300 | 299 | 1 | 298 | 2 | 295 | 5 | 289 | 11 |
| 400 | 399 | 1 | 398 | 2 | 395 | 5 | 391 | 9 |
| 500 | 499 | 1 | 498 | 2 | 495 | 5 | 492 | 8 |
| 600 | 599 | 1 | 598 | 2 | 595 | 5 | 589 | 11 |
| 700 | 699 | 1 | 698 | 2 | 695 | 5 | 688 | 12 |
| 800 | 799 | 1 | 798 | 2 | 795 | 5 | 786 | 14 |
| 900 | 899 | 1 | 898 | 2 | 895 | 5 | 865 | 35 |
| 1000 | 999 | 1 | 998 | 2 | 995 | 5 | 978 | 22 |

As the data in Table 2 show, as the number of network information samples rises, the method in this paper makes at most one erroneous retrieval of business English micro-class teaching data, whereas the methods of reference [8], reference [9] and reference [10] make more. This shows that the method in this article has an advantage in retrieval performance: its number of correct retrievals is high and wrong retrievals are rare, while the comparison methods produce noticeably more wrong retrievals. The above experiments verify that the approach in this article can retrieve micro-class teaching data of business English. To analyse further whether the retrieval results of the method contain duplicate or redundant results, the experiment adopts a high-precision retrieval index.
If the correct results appear in first place in the retrieval results of business English micro-class teaching data, and the retrieval results contain no redundancy, the index is higher. Table 3 shows the index values before and after the method in this paper is used to retrieve business English micro-class teaching data, under the retrieval conditions of different data types.

Table 3: Test results of the high-precision retrieval index for business English micro-class teaching data

| Business English micro-class teaching data type | Number of retrieved samples/piece | Before use | After use |
|---|---|---|---|
| Video | 10 | 0.87 | 0.98 |
| Video | 100 | 0.78 | 0.98 |
| Picture | 10 | 0.88 | 0.98 |
| Picture | 100 | 0.79 | 0.98 |
| PPT | 10 | 0.84 | 0.98 |
| PPT | 100 | 0.77 | 0.98 |
| Text | 10 | 0.91 | 0.98 |
| Text | 100 | 0.89 | 0.98 |
| Data | 10 | 0.92 | 0.98 |
| Data | 100 | 0.91 | 0.98 |

As shown in Table 3, under the retrieval conditions of different data types, the MAP index obtained after using the method does not change with the number of samples and is 0.98 in every case, whereas before use it lies between 0.77 and 0.92 and falls as the number of samples grows. This shows that the retrieval results of the method in this paper contain no duplicate or redundant items and that the correct results are displayed in first place.

In the network environment, the retrieval efficiency of business English micro-class teaching data is also very important, and the scale of network information affects this efficiency to a certain extent. To determine whether the method in this paper has an advantage in this respect, the retrieval delay of business English micro-class teaching resources is tested for the method in this paper and for the methods in references [8], [9] and [10] in the same experimental environment. This delay mainly reflects the time interval between resource retrieval and the feedback to the user. The test results are shown in Figs. 8 to 11.

Figure 8: Retrieval delay of the method in this paper

Figure 9: Retrieval delay of the method in reference [8]

Figure 10: Retrieval delay of the method in reference [9]

Figure 11: Retrieval delay of the method in reference [10]

Figs. 8, 9, 10 and 11 show that the scale of network information affects the retrieval efficiency of teaching data to a certain extent, but its impact on the retrieval efficiency of the method in this paper is insignificant. As the number of retrievals of business English micro-class teaching data rises, the retrieval delay of this method does not exceed 1 s. In contrast, in the same experimental environment, the retrieval delays of the methods in references [8], [9] and [10] exceed that of the method in this paper. The method in this paper therefore has an advantage in the retrieval efficiency of large-scale business English micro-class data, because it classifies the network information before data retrieval, which keeps the information ordered and reduces the difficulty and complexity of the subsequent retrieval.

The recall rate measures the ratio between the number of relevant documents retrieved by the system and the total number of documents that are actually relevant:

recall = number of relevant documents retrieved / total number of relevant documents

The higher the recall rate, the better the system finds relevant documents and the more comprehensive the information it provides. The F1 score combines two indicators, recall and precision, to evaluate the overall performance of a classification algorithm or information retrieval system. Precision (referred to here as accuracy) measures the ratio between the relevant documents retrieved by the system and all retrieved documents:

accuracy = number of relevant documents retrieved / total number of documents retrieved
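To make the definitions above concrete, the following minimal Python sketch computes recall, precision (the ratio the text calls accuracy) and the F1 score, taken as their harmonic mean, for a single query. The function name `precision_recall_f1` and the document identifiers are hypothetical and serve only as illustration; this is not the evaluation code used in the experiments.

```python
def precision_recall_f1(retrieved, relevant):
    """Return (precision, recall, f1) for one query.

    retrieved: iterable of document ids returned by the retrieval system
    relevant:  iterable of document ids that are actually relevant
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    # Relevant documents that the system actually returned.
    hits = len(retrieved_set & relevant_set)

    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0

    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical query: 4 documents retrieved, 5 actually relevant, 3 in common.
p, r, f1 = precision_recall_f1(
    ["d1", "d2", "d3", "d7"],
    ["d1", "d2", "d3", "d4", "d5"],
)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Sets are used so that a document appearing twice in a result list is counted once; a system in which repeated results are meaningful would need a different counting rule.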
The F1 score is the harmonic mean of recall and precision (accuracy) and is used to balance the two:

F1 score = 2 × (accuracy × recall) / (accuracy + recall)

The F1 score ranges from 0 to 1; the closer the value is to 1, the better the balance the system achieves between recall and precision. The test results of the method in this paper and of the methods in references [8], [9] and [10] are shown in Table 4.

Table 4: Test results of recall rate and F1 score

| Method | Recall | F1 score |
|---|---|---|
| Method in this paper | 0.90 | 0.85 |
| Method in reference [8] | 0.82 | 0.78 |
| Method in reference [9] | 0.84 | 0.80 |
| Method in reference [10] | 0.78 | 0.75 |

Table 4 shows that the method in this paper achieves the best recall (0.90) and F1 score (0.85). It retrieves resources related to business English micro-courses more accurately and provides more comprehensive, higher-quality search results. The methods in references [8], [9] and [10] all score lower on both measures, with recall between 0.78 and 0.84 and F1 scores between 0.75 and 0.80. Taking recall and F1 score together, the optimised method performs best and can provide more accurate and comprehensive results for data retrieval tasks in business English micro-course teaching.

In application, the method can be deployed in four steps. First, the business English micro-course teaching data retrieval method based on network information classification is integrated as a new module of the education platform. This module interacts with the existing platform databases through APIs and provides a user interface for retrieving and browsing business English micro-course resources. Next, the business English micro-course teaching data are imported into the database of the education platform and are processed and transformed according to the structure of the existing data, so that they remain consistent with it. Then, a data retrieval module for business English micro-course teaching is developed, including the search interface, the query algorithm and the result display functions; APIs are defined for data exchange and for passing query requests to and from the existing platform. Finally, a user-friendly search interface is designed that provides keyword search, advanced filtering and sorting. The user interface should be consistent with the appearance and functionality of the existing platform so that learners can use it seamlessly.

4 Conclusion

Starting from an analysis of why research on business English micro-class data retrieval is necessary, this paper studies a micro-class data retrieval method for business English based on network information classification. The method uses a network information classification model built on the optimised SVM model and, while preserving the performance of the support vector machine model, accurately classifies the micro-class network information resources of business English. This step ensures the order and diversity of teaching data within the planned and complex network information resources. The method then applies directed retrieval of network teaching resources based on big data technology: the retrieval operation is carried out within the classified network information set, which completes the directed retrieval of teaching data. Finally, comparative experiments verify that the method in this paper has an advantage over similar retrieval methods in business English teaching data retrieval.

Over time, the collection of business English micro-course teaching data continues to grow and includes various forms of data such as video, audio and text.
As the size of the data increases, the system needs to process more data, which increases both query complexity and dataset size. When facing larger datasets or more complex queries, caching technology can be used to store frequently accessed data, query results or computational results in memory or in a cache, reducing the latency of subsequent queries.

Data availability

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Conflicts of interest

The author declares no conflicts of interest regarding this work.

Acknowledgement

This research was supported by the Project of Higher Education Reform of China (ZJKY5284) and the Virtual Simulation Experimental Teaching Project for Universities of Zhejiang Province “13th Five-Year Plan”.

Authorship contribution statement

Wang Guifang: Writing-Original draft preparation, Conceptualization, Supervision, Project administration, Methodology, Software, Validation.

References

[1] L. Ma, “An immersive context teaching method for college English based on artificial intelligence and machine learning in virtual reality technology,” Mobile Information Systems, vol. 2021, pp. 1–7, 2021.
[2] Y. Yin, “Microclassroom design based on English embedded grammar compensation teaching,” Math Probl Eng, vol. 2021, pp. 1–9, 2021.
[3] E. S. Darowski, E. Helder, and N. D. Patson, “Explicit writing instruction in synthesis: Combining in-class discussion and an online tutorial,” Teaching of Psychology, vol. 49, no. 1, pp. 57–63, 2022.
[4] S. Wang et al., “Research on PBL teaching of immunology based on network teaching platform,” Procedia Comput Sci, vol. 183, pp. 750–753, 2021.
[5] H. Zhao and L. Guo, “Design of intelligent computer aided network teaching system based on web,” Comput Des Appl, vol. 19, pp. 12–23, 2021.
[6] T. Jiao, “Mobile English teaching information service platform based on edge computing,” Mobile Information Systems, vol. 2021, pp. 1–10, 2021.
[7] H. Chen and J. Huang, “Research and application of the interactive English online teaching system based on the internet of things,” Sci Program, vol. 2021, pp. 1–10, 2021.
[8] Y. Wang, H. Wang, J. Yang, and J. Chen, “Cross-model retrieval with deep learning for business application,” in Journal of Physics: Conference Series, IOP Publishing, 2021, p. 032035.
[9] F. Ma, T. Sun, L. Liu, and H. Jing, “Detection and diagnosis of chronic kidney disease using deep learning-based heterogeneous modified artificial neural network,” Future Generation Computer Systems, vol. 111, pp. 17–26, 2020.
[10] X. Xu, J. Tian, K. Lin, H. Lu, J. Shao, and H. T. Shen, “Zero-shot cross-modal retrieval by assembling autoencoder and generative adversarial network,” ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), vol. 17, no. 1s, pp. 1–17, 2021.
[11] X. Cheng and K. Liu, “Application of multimedia networks in business English teaching in vocational college,” J Healthc Eng, vol. 2021, 2021.
[12] D. Jing and X. Jiang, “Optimization of computer-aided English teaching system realized by VB software,” Comput Aided Des Appl, vol. 19, no. S1, pp. 139–150, 2021.
[13] Y. Shu, “Experimental data analysis of college English teaching based on computer multimedia technology,” Comput Aided Des Appl, vol. 17, no. S2, pp. 46–56, 2020.
[14] L. Wang and Z. Xu, “An English learning system based on mobile edge computing constructs a wireless distance teaching environment,” Mobile Information Systems, vol. 2021, pp. 1–9, 2021.
[15] D. Amalia, Igaamo. IGAAMOka, V. Septiani, and M. R. Fazal, “Designing of Mikrokontroler E-Learning Course: Using Arduino and TinkerCad,” Journal of Airport Engineering Technology (JAET), vol. 1, no. 1, pp. 8–14, 2020.
[16] R. M. Baker, M. E. Leonard, and B. H. Milosavljevic, “The sudden switch to online teaching of an upper-level experimental physical chemistry course: challenges and solutions,” J Chem Educ, vol. 97, no. 9, pp. 3097–3101, 2020.
[17] L. Liu, “Research on IT English flipped classroom teaching model based on SPOC,” Sci Program, vol. 2021, pp. 1–9, 2021.
[18] D. A. Wild, A. Yeung, M. Loedolff, and D. Spagnoli, “Lessons learned by converting a first-year physical chemistry unit into an online course in 2 weeks,” J Chem Educ, vol. 97, no. 9, pp. 2389–2392, 2020.
[19] F. Yang, “Design of Traditional Teaching Method of Micro-teaching Based on Blended Learning,” in e-Learning, e-Education, and Online Training: 6th EAI International Conference, eLEOT 2020, Changsha, China, June 20-21, 2020, Proceedings, Part I 6, Springer, 2020, pp. 159–170.
[20] F. Zhao, O. I. Fashola, T. I. Olarewaju, and I. Onwumere, “Smart city research: A holistic and state-of-the-art literature review,” Cities, vol. 119, p. 103406, 2021.
[21] A. Al-Hasan, “Effects of social network information on online language learning performance: A cross-continental experiment,” in Research Anthology on Applying Social Networking Strategies to Classrooms and Libraries, IGI Global, 2023, pp. 1574–1591.
[22] J. Sun, L. Wang, J. Li, F. Li, J. Li, and H. Lu, “Online oil debris monitoring of rotating machinery: A detailed review of more than three decades,” Mech Syst Signal Process, vol. 149, p. 107341, 2021.
[23] Z. Xiao et al., “Big data driven vessel trajectory and navigating state prediction with adaptive learning, motion modeling and particle filtering techniques,” IEEE Transactions on Intelligent Transportation Systems, vol. 23, no. 4, pp. 3696–3709, 2020.
[24] W. Zhang and Z. Wu, “Optimal hybrid framework for carbon price forecasting using time series analysis and least squares support vector machine,” J Forecast, vol. 41, no. 3, pp. 615–632, 2022.
[25] M. Sehad and S. Ameur, “A multilayer perceptron and multiclass support vector machine based high accuracy technique for daily rainfall estimation from MSG SEVIRI data,” Advances in Space Research, vol. 65, no. 4, pp. 1250–1262, 2020.
[26] X. Yu and H. Wang, “Support vector machine classification model for color fastness to ironing of vat dyes,” Textile Research Journal, vol. 91, no. 15–16, pp. 1889–1899, 2021.