https://doi.org/10.31449/inf.v48i7.5205                                                                                                     Informatica 48 (2024) 53–62   53 
Design of Neural Network-Based Online Teaching Interactive System in 
the Context of Multimedia-Assisted Teaching 
 
Shanshan Cheng, Qianchen Yang*, Huan Luo

College of Computer and Information Engineering, Guangxi Vocational Normal University, China, 530007
E-mail: sunny122621@163.com
*Corresponding author
 
Keywords: English teaching, multimedia, deep learning (DL), enhanced multilayer perceptron integrated spiking neural 
network (EMLP-SNN), education development strategy 
Received: September 20, 2023
As the pace of global integration increases, so does the demand for English language courses. Due to the
scarcity of English-language learning resources in China, students of the language often need help to
improve their spoken English. Advances in artificial intelligence technology and language education
approaches have opened a new phase of language teaching and learning, and deep learning (DL)
technology can be employed to address this issue. Speech recognition software is the foundation of spoken
communication instruction and is also used as an evaluation tool. More hardware, software, and algorithms
are needed to analyze speech signals because of the complexity of pronunciation variations, the
quantity of speech signal data, the number of speech feature parameters, and the scale of speech
recognition and assessment computation. However, it is challenging to increase the accuracy and speed of
conventional speech recognition algorithms, since they have run into unprecedented bottlenecks.
To address these issues, this article examines the impact of college English multimedia instruction.
The EMLP-SNN technique, which integrates an enhanced multilayer perceptron with a spiking
neural network, is proposed for recognizing oral English pronunciation. The experimental results
show that the proposed algorithm achieves an accuracy of 97.5%, which can help students
identify discrepancies between their pronunciation and the standard and correct pronunciation mistakes,
leading to improved oral English learning performance.
Povzetek (summary): The study introduces the EMLP-SNN technique for improving the identification of
English pronunciation with 97.5% accuracy, which enables students to improve their learning of spoken English.
 
1   Introduction 
In the realm of multimedia education, there is a growing 
trend toward the adoption of interactive systems fueled by 
neural networks. These systems leverage artificial 
intelligence and machine learning algorithms to create 
personalized learning experiences, allowing students to 
progress at their own pace and in a manner that aligns with 
their individual learning styles [1]. Recent years have 
witnessed a significant shift towards integrating 
multimedia into education, with various educational 
institutions embracing technology to enhance the overall 
quality of the learning environment. The incorporation of 
diverse multimedia resources, including movies, 
animations, and interactive simulations, has proven 
effective in augmenting student engagement and 
comprehension of complex subject matter. Despite the
manifold benefits associated with multimedia integration,
there persist certain challenges that demand attention, such 
as the need for customized instruction and the difficulty in 
real-time monitoring of students' academic progress [2]. In 
this context, the utilization of online educational 
interactive systems relying on neural networks proves 
highly advantageous. These systems leverage artificial 
intelligence and machine learning algorithms to analyze 
real-time student achievement data. This capability enables 
dynamic adjustments in the learning experience [3]. For 
example, if a student encounters difficulties with a specific 
topic, the system can offer additional resources or modify 
the lesson's pace to ensure comprehensive understanding. 
Furthermore, these technologies provide instantaneous 
feedback on student progress, facilitating a more precise 
assessment of learning outcomes [4]. The development of a 
neural network-based interactive online teaching platform 
is an intricate process spanning various stages. The initial
phase involves carefully selecting learning goals and the 
information to be integrated into the system. Subsequently, 
the focus shifts to determining the fundamental concepts 
and skills students need to acquire, along with the selection 
of multimedia resources for instructing these foundational 
ideas [5]. 
The subsequent phase involves crafting the architecture of 
neural networks designed to scrutinize student 
achievement data and adapt the learning environment. This 
intricate task demands an iterative approach, where the 
selection of appropriate machine learning algorithms and 
the precise definition of input and output data for the 
system are paramount [6-8]. Ensuring the neural network's 
proficiency in accurately analyzing and responding to 
student actions necessitates extensive training on 
substantial datasets reflective of students' performance [9]. 
The subsequent step, which occurs after the structure of the 
neural network has been created and trained, is to 
incorporate it into the platform used for online instruction. 
Developing a user interface that gives students the chance 
to communicate with the system and obtain individualized 
feedback and resources is required for this step. Also, the 
system needs to be able to scale and must be able to 
manage massive amounts of student data [10-11]. 
Finally, the system has to be assessed to determine whether
it is effective in enhancing students' learning outcomes.
This requires compiling and analyzing data on student
performance and comparing it with more conventional
approaches to teaching. Additionally, the system needs to
be continuously updated and enhanced based on the
comments and suggestions of instructors and students. The
construction of an online teaching interactive system based
on neural networks is a complicated process that requires
knowledge of artificial intelligence, machine learning, and
multimedia-assisted teaching. Yet these systems can
transform the way in which students are educated by
providing learning experiences that are individualized,
interactive, and engaging, and that are tuned to the specific
requirements of each student. Provided that they continue
to be developed and improved, such systems have the
potential to improve learning results for students all over
the world. Figure 1 presents an overview of speech
recognition technology.
 
Figure 1: Speech recognition technology. 
The main contributions of this paper are as follows:
• A multimedia-based impact evaluation strategy for
English instruction is proposed. It enables students
to distinguish their pronunciation from that of
native speakers, correct pronunciation errors, and
improve the quality of spoken English learning.
• To compensate for missing features and increase
the recognition rate, an enhanced multilayer
perceptron integrated spiking neural network
(EMLP-SNN) is proposed.
• Finally, comparative experiments demonstrate the
superiority of this method.
 
2   Related work 
Table 1: Related works
| Reference | Objectives | Findings | Limitations |
|---|---|---|---|
| [12] | They provided an innovative platform for online intelligent English teaching using deep learning to assist students in becoming more proficient in English in accordance with their levels of knowledge and character development. | The results demonstrated that the system had the potential to increase students' productivity in the classroom and to contextualize their studies. | Limited to essential content, excluding advanced or interactive elements online |
| [13] | They constructed artificial and human neural networks with the objective of comparing and contrasting their capabilities. | According to the results, machine learning systems were recommended to be designed with the ability to provide explanations for their decisions. | Limited interpersonal interaction, hindered practical experiences, varied learning environments impact. |
| [14] | To assist teachers in understanding student performance in the classroom, they developed a system with dual capabilities. | The outcomes indicated that DNN could be successfully implemented. | Limited bandwidth, slow access, challenges in real-time engagement, potential disruptions |
| [15] | They examined the challenges associated with the present state of education and the constraints of the current system. Those were achieved by integrating an analysis of the current state of teaching and research in both domestic and foreign supplementary education systems, with a focus on computer-based teaching and self-directed English learning facilitated by computer networks. | They utilized a comparative analysis approach to examine the similarities and differences between offline exams written with pencil and paper and online exams taken on a computer and to draw conclusions about the effects of each. | Limited real-time practice, lacks face-to-face communication, hinders immediate feedback |
| [16] | They employed "Advanced Multimedia Technology (AMT)" for Teaching Assessment in Engineering Teaching to analyze the current state of advancement in physical education teaching methods. | The results of the experiments showed that the proposed methods could accurately classify student evaluation tasks. | Limited student engagement, tech issues, dependency on internet, distractions |
| [17] | They suggested an innovative web-English situational guidance scenario system based on multimedia and sensing. | They evaluate the proposed approach with the existing one. The finding demonstrates the model's reliability. | Latency, bandwidth, congestion affect online teaching interactive network routing efficiency |
| [18] | They provided a comprehensive analysis of CALL's evolution and the current state of use, and they made suggestions for how the field could benefit from applying educational model theory to its development and implementation. | An experimental investigation was conducted to evaluate the CAPT system's usefulness, with the results showing that CAPT-adopted classrooms exhibited superior language cognitive abilities. | Tech glitches, limited student engagement, hinder online teaching interactivity. |
| [19] | They presented a framework for adaptive learning on mobile devices. | The findings illustrated how their approach handled a variety of learning scenarios generated by students. | Varied technology access, engagement levels hinder online teaching interactivity |
| [20] | They developed the foundations of a system based on intelligent human-computer cooperation by analyzing the literature and conducting case studies in artificial intelligence technology and the visual communication design process. | They concluded that the system had a net positive effect on designers and society at large, analyzed the system's future directions, and stated that design was the system's primary mode of human-machine collaboration. | Varied subjects challenge cohesive engagement in online interactive teaching |
 
3   Materials and method 
3.1 Enhanced multilayer perceptron integrated spiking neural network (EMLP-SNN)
EMLP-SNN's innovative approach enhances real-time 
adaptability, facilitating a more dynamic and efficient 
online learning experience by mimicking the brain's 
spiking behavior. 
The SNN-based models first receive the extracted
frame-based features. These features are commonly
assumed to be stationary over the brief duration of a
segmented frame, because the frames are short and speech
signals vary slowly. The integrate-and-fire (IF) neuron
model with reset by subtraction is used in this work because
it handles such stationary frame-based data efficiently while
requiring minimal processing effort. Although IF neurons
do not precisely imitate the complicated temporal dynamics
of real neurons, they are well suited to the neural models
used in this research, where spike timing has a minimal
impact.
At each step $s$ of a discrete-time simulation with a total of
$M_t$ time steps, the spikes arriving at neuron $i$ in layer $k$
are converted into a synaptic current as follows.

$$y_i^k(s) = \sum_j x_{ij}^{k-1}\,\theta_j^{k-1}(s) + a_i^k \tag{1}$$

Here, $\theta_j^{k-1}(s)$ indicates that an input spike from
presynaptic neuron $j$ occurred at time step $s$. The synaptic
weight connecting presynaptic neuron $j$ in layer $k-1$ to
neuron $i$ is denoted by $x_{ij}^{k-1}$, and the bias $a_i^k$
can be thought of as a constant injecting current.

Equation (2) shows how neuron $i$ integrates the input current
$y_i^k(s)$ into its membrane potential $U_i^k(s)$. A unitary
membrane resistance is assumed without loss of generality.
Assuming that all synaptic weights are normalized with respect
to the firing threshold, we use the same firing threshold
$\vartheta$ for all experiments; according to Equation (3), an
output spike is generated whenever $U_i^k(s)$ reaches or
exceeds $\vartheta$.

$$U_i^k(s) = U_i^k(s-1) + y_i^k(s) - \vartheta\,\theta_i^k(s-1) \tag{2}$$

$$\theta_i^k(s) = \Theta\big(U_i^k(s) - \vartheta\big), \quad \text{with } \Theta(w) = \begin{cases} 1, & \text{if } w \ge 0 \\ 0, & \text{otherwise} \end{cases} \tag{3}$$

Using Equations (1) and (2), the free aggregate membrane
potential of neuron $i$ in layer $k$, that is, the potential
accumulated when firing is ignored, can be written as follows.

$$U_i^{k,e} = \sum_j x_{ij}^{k-1}\, d_j^{k-1} + a_i^k\, M_t \tag{4}$$

where $d_j^{k-1}$ is the input spike count from presynaptic
neuron $j$ in layer $k-1$, as given by Equation (5).

$$d_j^{k-1} = \sum_{s=1}^{M_t} \theta_j^{k-1}(s) \tag{5}$$

The quantity $U_i^{k,e}$ sums the membrane-potential
contributions of the input spikes from presynaptic neurons
while ignoring their temporal structure. The section on the
tandem learning architecture gives more detail on this
intermediate quantity, which links the SNN layers to the
coupled ANN layers for parameter optimization.
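Equations (1) to (5) can be illustrated with a short simulation. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function name `simulate_if_layer`, the default threshold of 1.0, and the zero initial membrane potential are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate_if_layer(spikes_in, weights, bias, threshold=1.0):
    """Simulate one layer of integrate-and-fire neurons with reset by subtraction.

    spikes_in: (M_t, J) binary array of input spikes theta_j^{k-1}(s)
    weights:   (I, J) array of synaptic weights x_ij^{k-1}
    bias:      (I,) array of biases a_i^k
    threshold: firing threshold (1.0 is an assumed value, not from the paper)
    Returns the output spikes (M_t, I) and the free aggregate potential (I,).
    """
    n_steps = spikes_in.shape[0]
    n_out = weights.shape[0]
    potential = np.zeros(n_out)      # assumed initial membrane potential
    prev_spikes = np.zeros(n_out)    # no spikes before the first step
    spikes_out = np.zeros((n_steps, n_out))
    for s in range(n_steps):
        current = weights @ spikes_in[s] + bias                    # Eq. (1)
        potential = potential + current - threshold * prev_spikes  # Eq. (2)
        prev_spikes = (potential >= threshold).astype(float)       # Eq. (3)
        spikes_out[s] = prev_spikes
    spike_counts = spikes_in.sum(axis=0)                           # Eq. (5)
    free_potential = weights @ spike_counts + bias * n_steps       # Eq. (4)
    return spikes_out, free_potential
```

Because Equation (4) depends only on spike counts, the free aggregate potential equals the sum of the per-step currents from Equation (1), regardless of when the input spikes occur.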
A multilayer perceptron (MLP) is a nonlinear function
created by concatenating several layers of nodes. Each layer
receives either the MLP's input features or the previous
layer's output. The i-th node of layer k calculates a weighted
sum of all its inputs as

$$y_i^k = \sum_{j=1}^{J} X_{ji}\, x_j^k + x_i \tag{6}$$

where $X_{ji}$ and $x_i$ denote the connection weights and
the bias, respectively. A nonlinear function is then applied to
the output of the node. The most prevalent choice is the
sigmoid function

$$\hat{y}_i = \mathrm{sig}(y_i) = \frac{1}{1 + \exp(-y_i)} \tag{7}$$
where, for simplicity, the layer index has been dropped. A
generic MLP can have an arbitrary number of layers, which
makes it a highly intricate nonlinear function. It is
consequently challenging to calculate how an MLP
transforms a Gaussian variable, such as one derived from a
Wiener filter with zero-mean Gaussian priors for speech and
noise. Nevertheless, the problem of propagating a random
variable through an MLP has already been covered in the
literature, mainly for the sensitivity analysis of MLPs against
noise. Unfortunately, the relevant expectations have no
known closed-form solutions. Many papers on MLP
performance approximate the sigmoid with a Taylor series.
These approximations, however, are local, so they are only
applicable at low uncertainty levels and produce large errors
in all other cases.
An approach that combines the acoustic transformation with
the Taylor series expansion avoids the locality issue.
However, this method does not scale to the sizes used in
ASR systems and is not suitable for multilayer perceptrons.
Some simplifications must be made in order to arrive at a
general model for propagation through an MLP. MLPs used
in ASR are often relatively large, since they cover the whole
acoustic space given a stream of features. There are typically
between 300 and 1000 nodes. Additionally, there is evidence
that an MLP's node outputs have a minimal statistical
dependence. Given these characteristics, one way to simplify
the situation is to assume that the weighted sum at each node
can be treated as an independent Gaussian variable, as a
result of the central limit theorem. With this assumption, the
problem reduces to propagating a Gaussian variable through
the sigmoid function. Additionally, only the first two
moments need to be propagated when passing the output of
one layer to the next, which significantly streamlines the
procedure.
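As an illustration of propagating a Gaussian variable through the sigmoid, the sketch below compares a Monte Carlo estimate of the first two moments with a commonly used closed-form approximation of the mean, based on replacing the sigmoid with a scaled probit function. This particular approximation is an assumption chosen for illustration; the text does not state which approximation is used, and the function names are hypothetical.

```python
import numpy as np

def sigmoid(y):
    """Logistic sigmoid, as in Equation (7)."""
    return 1.0 / (1.0 + np.exp(-y))

def sigmoid_mean_probit(mu, var):
    """Approximate E[sig(y)] for y ~ N(mu, var).

    Uses the probit-based rescaling sig(mu / sqrt(1 + pi * var / 8)).
    This is an illustrative choice, not necessarily the paper's method.
    """
    return sigmoid(mu / np.sqrt(1.0 + np.pi * var / 8.0))

def sigmoid_moments_mc(mu, var, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the mean and variance of sig(y), y ~ N(mu, var)."""
    rng = np.random.default_rng(seed)
    out = sigmoid(rng.normal(mu, np.sqrt(var), size=n_samples))
    return out.mean(), out.var()

if __name__ == "__main__":
    mu, var = 1.0, 2.0
    print("closed-form mean:", sigmoid_mean_probit(mu, var))
    print("Monte Carlo mean, variance:", sigmoid_moments_mc(mu, var))
```

Unlike a local Taylor expansion around the mean, the rescaling takes the input variance into account, so the estimate of the mean degrades more gracefully as the uncertainty grows.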
 
Figure 2: Flowchart of EMLP-SNN
An enhanced multilayer perceptron integrated spiking neural
network (EMLP-SNN) is a type of neural network that
combines the spiking activity of biological neural networks
with the benefits of conventional artificial neural networks,
such as backpropagation (see Figure 2). Its training
procedure consists of the following steps:
1. Initialization: Set the network's weights and biases
to small random values.
2. Input Encoding: Convert the input data into a
spike-based form that the network can process.
Different encoding techniques, such as rate coding
or temporal coding, can be used to accomplish
this.
3. Feedforward: Using the input spikes, the current
weights, and the current biases, calculate the
activation of each neuron in the network.
4. Spike Generation: Use a spiking function to
transform the neurons' activation values into spike
trains.
5. Spike Propagation: Propagate spikes throughout
the network, updating each neuron's activity in
accordance with the incoming spikes and the most
recent weights and biases.
6. Learning: Based on the discrepancy between the
expected and actual output, use a learning
technique, such as backpropagation, to modify the
network's weights and biases.
7. Repeat: Repeat steps 3-6 until convergence or
until a predetermined number of epochs has
passed.
The choice of learning method, spiking function, and
encoding scheme are essential factors to take into account
when training an EMLP-SNN. Furthermore, the network's
spiking activity might add complexity and call for
specialized training methods such as spike-timing-dependent
plasticity.
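The input encoding step listed above (step 2) can be sketched with simple rate coding, in which each normalized feature value is used as the per-step probability of emitting a spike. This is only one possible encoding under stated assumptions: the function name `rate_encode`, the Bernoulli sampling scheme, and the requirement that features are scaled to [0, 1] are illustrative and are not taken from the paper.

```python
import numpy as np

def rate_encode(features, n_steps, seed=0):
    """Bernoulli rate coding of a 1-D feature vector.

    features: 1-D array of values assumed to be scaled to [0, 1]
    n_steps:  number of discrete time steps in the spike train
    Returns a (n_steps, n_features) binary array of spikes.
    """
    rng = np.random.default_rng(seed)
    probs = np.clip(features, 0.0, 1.0)  # guard against out-of-range values
    # A spike is emitted when a uniform draw falls below the feature value.
    return (rng.random((n_steps, probs.shape[0])) < probs).astype(np.float32)

if __name__ == "__main__":
    frame = np.array([0.1, 0.5, 0.9])    # hypothetical normalized features
    spikes = rate_encode(frame, n_steps=1000)
    print("empirical firing rates:", spikes.mean(axis=0))
```

With enough time steps, the empirical firing rate of each input neuron approaches the corresponding feature value, which is what allows a spike count to stand in for the original feature.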
4   Results and discussion  
The participants in this study are undergraduates from our
institution. There are a total of 67 of them, 62 of whom are
male and 25 of whom are female. The recording software
Cool Edit was used to record the subjects at a sampling rate
of 18 kHz and a coding depth of 18 bits. The recorded
material consists of ten sentences that are widely used in
spoken English.
The computer system in question is equipped with an M1 
Pro chip, featuring a 14-core GPU with a power 
consumption of 15 watts and an 8-core CPU capable of 
fluctuating between 0 and 22 watts. The system supports 
NVIDIA CUDA version 418.163, a parallel computing 
platform. Notably, the GPU operates under Rosetta 2, 
indicating that it is running software translated from a 
different architecture. The entire setup is powered by the 
macOS operating system (Table 2). 
Table 2: Experimental setup
| Equipment | Model |
|---|---|
| CUDA | NVIDIA CUDA 418.163 |
| GPU | M1 Pro 14-Core GPU, 15 W, Rosetta 2 |
| CPU | M1 Pro 8-Core, 0 W / 22 W |
| OS | macOS |
 
 
To validate the claimed superiority of the algorithm
described in this research, we compared it with the ANN,
CNN, and PNN approaches in the same experimental
environment. Table 3 compares the recognition rates of the
four methods. As shown in Table 3 and Figure 3, the
proposed EMLP-SNN method has a recognition rate of
97.52 percent, which is higher than the recognition rates of
the other models. The method described in this work is
therefore both reasonable and accurate, and it can be used to
evaluate the effectiveness of multimedia instruction in
college English classes.
Table 3: Accuracy comparison of existing methods with our proposed method
| Model | Accuracy (%) |
|---|---|
| ANN | 91 |
| CNN | 90.56 |
| PNN | 93.22 |
| EMLP-SNN | 97.52 |

Figure 3: Accuracy comparison
As can be seen in Figures 4 and 5, the proposed EMLP-SNN
method reaches a loss function value lower than 0.2,
whereas the ANN never goes below approximately 0.5. The
loss function value of the CNN method drops to around 0.3
and then stops decreasing, while that of the PNN method
drops to about 0.4. This demonstrates that our model
converges better than the others. In the experiments, the
algorithm performs significantly better than the other
approaches in terms of both effectiveness and speed of
convergence, as detailed in Tables 4 and 5.
Table 4: Comparison results of convergence performance
| ANN epoch | ANN loss function | PNN epoch | PNN loss function | CNN epoch | CNN loss function | EMLP-SNN epoch | EMLP-SNN loss function |
|---|---|---|---|---|---|---|---|
| 1.457543 | 1.337217 | 4.389439 | 1.309587 | 2.094912 | 1.077588 | 3.624597 | 1.077588 |
| 1.457543 | 1.337217 | 9.488389 | 1.128023 | 2.094912 | 1.077588 | 3.624597 | 0.90348 |
| 11.65544 | 1.286782 | 16.11702 | 0.984175 | 9.488389 | 0.822784 | 10.25323 | 0.729371 |
| 18.28408 | 1.193369 | 16.11702 | 0.984175 | 16.75439 | 0.59561 | 10.25323 | 0.729371 |
| 22.61818 | 1.085044 | 21.98082 | 0.810066 | 16.75439 | 0.59561 | 16.75439 | 0.545175 |
| 25.55008 | 0.968825 | 38.04251 | 0.656131 | 25.55008 | 0.444306 | 16.75439 | 0.545175 |
| 25.55008 | 0.968825 | 72.46042 | 0.560525 | 44.67114 | 0.376329 | 27.84461 | 0.328526 |
| 32.94356 | 0.860501 | 72.46042 | 0.560525 | 44.67114 | 0.376329 | 27.84461 | 0.328526 |
| 40.20956 | 0.762263 | 113.507 | 0.514915 | 73.22526 | 0.315808 | 32.94356 | 0.467111 |
| 40.20956 | 0.762263 | 173.5471 | 0.494741 | 73.22526 | 0.315808 | 41.73925 | 0.399134 |
| 59.33062 | 0.674112 | 224.1542 | 0.484654 | 106.241 | 0.293003 | 41.73925 | 0.399134 |
| 93.74853 | 0.618415 | 257.9347 | 0.507459 | 106.241 | 0.293003 | 50.53493 | 0.371067 |
| 134.0302 | 0.578067 | 288.6559 | 0.502635 | 152.3865 | 0.285547 | 50.53493 | 0.371067 |
| 134.0302 | 0.578067 | 288.6559 | 0.502635 | 200.699 | 0.300458 | 55.63388 | 0.194765 |
| 169.9778 | 0.563156 | 324.6035 | 0.494741 | 261.504 | 0.315808 | 71.69558 | 0.167136 |
| 204.3957 | 0.560525 | 335.5662 | 0.520177 | 261.504 | 0.315808 | 71.69558 | 0.167136 |
| 238.0488 | 0.557894 | 335.5662 | 0.520177 | 304.0802 | 0.323263 | 98.08264 | 0.134244 |
| 268.1326 | 0.555262 | 383.2414 | 0.520177 | 333.3992 | 0.335982 | 98.08264 | 0.134244 |
| 297.4516 | 0.563156 | 383.2414 | 0.520177 | 382.4766 | 0.33335 | 142.0611 | 0.106614 |
| 320.1419 | 0.557894 | 437.4177 | 0.520177 | 434.4859 | 0.343437 | 189.7363 | 0.101351 |
| 345.7641 | 0.545175 | 437.4177 | 0.520177 | 434.4859 | 0.343437 | 232.185 | 0.096527 |
| 380.9469 | 0.545175 | 495.2908 | 0.517546 | 496.8205 | 0.340806 | 232.185 | 0.096527 |
| 421.2286 | 0.545175 | 495.2908 | 0.517546 | 496.8205 | 0.340806 | 273.9964 | 0.08644 |
 
 
Figure 4: Comparison results of convergence performance. 
Table 5: Comparison results of convergence speed
| ANN epochs | ANN loss function | PNN epochs | PNN loss function | CNN epochs | CNN loss function | EMLP-SNN epochs | EMLP-SNN loss function |
|---|---|---|---|---|---|---|---|
| -0.97633 | 1.465114 | 4.215425 | 1.424389 | 1.936116 | 1.062891 | 0.543205 | 0.976413 |
| 12.95278 | 1.441986 | 15.10546 | 0.79139 | 9.280557 | 0.877868 | 0.543205 | 0.976413 |
| 18.01791 | 1.329364 | 21.69013 | 0.834629 | 12.19301 | 0.698879 | 10.04033 | 0.814518 |
| 18.01791 | 1.227803 | 27.51503 | 0.782843 | 12.19301 | 0.522404 | 14.34569 | 0.540001 |
| 18.01791 | 1.225289 | 36.25238 | 0.687315 | 23.84281 | 0.426876 | 18.01791 | 0.415312 |
| 23.08304 | 1.121213 | 36.25238 | 0.687315 | 23.84281 | 0.426876 | 30.42748 | 0.314253 |
| 28.90794 | 0.976413 | 47.14242 | 0.603351 | 36.25238 | 0.360509 | 45.62288 | 0.175486 |
| 28.90794 | 0.976413 | 47.14242 | 0.603351 | 45.62288 | 0.285092 | 45.62288 | 0.175486 |
| 36.25238 | 0.869321 | 66.13666 | 0.580223 | 61.70467 | 0.238836 | 63.85735 | 0.123197 |
| 54.36023 | 0.779827 | 66.13666 | 0.580223 | 85.76404 | 0.24487 | 63.85735 | 0.123197 |
| 70.44202 | 0.710443 | 81.33205 | 0.557096 | 103.9985 | 0.233306 | 86.52381 | 0.11465 |
| 70.44202 | 0.710443 | 101.7192 | 0.557096 | 130.2106 | 0.221742 | 86.52381 | 0.11465 |
| 98.80676 | 0.672735 | 101.7192 | 0.557096 | 152.7504 | 0.233306 | 107.5441 | 0.097556 |
| 130.2106 | 0.65564 | 122.8661 | 0.540001 | 174.6571 | 0.24487 | 107.5441 | 0.097556 |
| 161.4878 | 0.638043 | 122.8661 | 0.540001 | 193.5247 | 0.256434 | 138.1882 | 0.106103 |
| 180.482 | 0.635026 | 141.1006 | 0.548548 | 193.5247 | 0.256434 | 165.9198 | 0.085992 |
| 197.1969 | 0.554079 | 157.1824 | 0.533968 | 218.3439 | 0.273528 | 203.7816 | 0.094539 |
| 197.1969 | 0.554079 | 157.1824 | 0.533968 | 248.2281 | 0.282578 | 203.7816 | 0.094539 |
| 218.3439 | 0.554079 | 176.05 | 0.540001 | 248.2281 | 0.282578 | 243.7962 | 0.091522 |
| 250.3808 | 0.551565 | 182.6347 | 0.456037 | 280.8982 | 0.282578 | 243.7962 | 0.091522 |
| 278.7456 | 0.551565 | 197.1969 | 0.432909 | 315.9743 | 0.288109 | 294.8274 | 0.091522 |
| 310.7825 | 0.554079 | 197.1969 | 0.432909 | 359.661 | 0.296656 | 347.2515 | 0.097556 |
| 338.5141 | 0.542515 | 205.1745 | 0.464584 | 359.661 | 0.296656 | 400.4353 | 0.094539 |
| 354.4693 | 0.479165 | 219.7368 | 0.481679 | 394.6104 | 0.296656 | 452.0997 | 0.085992 |
| 386.6329 | 0.481679 | 236.5783 | 0.467601 | 440.4499 | 0.299673 | 496.5462 | 0.088506 |
| 403.3478 | 0.487712 | 261.2709 | 0.464584 | 495.7865 | 0.311237 | 496.5462 | 0.088506 |
 
 
Figure 5: Comparison results of convergence speed 
Precision is the ratio of the number of samples correctly
predicted as positive to the total number of samples
predicted as positive. Table 6 and Figure 6 illustrate a
comparison between the precision of the proposed
methodology and that of the existing methods. The existing
techniques ANN, CNN, and PNN have 90%, 86.4%, and
93.8%, respectively, while the new EMLP-SNN strategy has
98.1%. It demonstrates that the proposed method is superior
to others in terms of precision.
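The precision measure described above can be computed as in the following minimal sketch. The function name `precision` and the sample labels are hypothetical and serve only to illustrate the ratio; they are not data from the experiments.

```python
def precision(y_true, y_pred, positive=1):
    """Precision = true positives / (true positives + false positives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if p == positive and t == positive)
    fp = sum(1 for t, p in pairs if p == positive and t != positive)
    # Return 0.0 when nothing was predicted positive, to avoid division by zero.
    return tp / (tp + fp) if (tp + fp) else 0.0

if __name__ == "__main__":
    # Hypothetical labels: 1 = pronunciation judged correct, 0 = incorrect.
    y_true = [1, 0, 1, 1]
    y_pred = [1, 1, 1, 0]
    print(precision(y_true, y_pred))
```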
Table 6: Precision
| Methods | Precision (%) |
|---|---|
| ANN | 0.65 |
| CNN | 0.73 |
| PNN | 0.69 |
| EMLP-SNN | 0.94 |
 
 
Figure 6: Precision 
4.1 Discussion 
Our proposed method, the Enhanced Multilayer Perceptron 
integrated with a Spiking Neural Network (EMLP-SNN), 
represents a novel approach to addressing the limitations of 
existing methods such as ANNs, CNNs, and PNNs in the 
context of online teaching interactive systems for 
multimedia-assisted teaching. Compared to ANNs, EMLP-
SNN reduces the demand for large labeled datasets by 
utilizing the benefits of spiking neural networks, which can 
process temporal input effectively and perform better with 
smaller datasets. When compared to computationally 
expensive CNNs, EMLP-SNN provides a more resource-
efficient alternative, allowing for real-time interactions in 
online education without sacrificing performance. The 
application of spiking neural networks improves 
interpretability by offering a more transparent decision-
making process, which is critical for instructional feedback 
scenarios and overcomes the black-box aspect of PNNs. 
Furthermore, EMLP-SNN shows enhanced adaptation to 
changes in teaching approaches and content, overcoming 
the limits of standard neural networks. This novel 
technique also includes measures to reduce overfitting,
improve generalization capabilities, and ensure fair and
unbiased evaluations. EMLP-SNN also focuses on data
security and privacy while adhering to ethical 
considerations in educational technology applications. 
5   Conclusion  
The conventional speech recognition techniques currently in
use have hit unprecedented bottlenecks, and it has become
difficult to improve either their accuracy or their speed. In
response to these concerns, this study focuses on analyzing
the results of implementing multimedia instruction in
college English classes. To evaluate how English words are
pronounced, an enhanced multilayer perceptron integrated
spiking neural network (EMLP-SNN) has been developed.
The suggested algorithm has been tested, and it assists
students in distinguishing their pronunciation from the
standard pronunciation, identifying and correcting
pronunciation mistakes, and improving the quality of oral
English learning. Online teaching interactive systems that
make use of EMLP-SNN have several limitations: the
requirement for technical skills and infrastructure, high
implementation costs, the possible challenge of engaging
students, restricted feedback, the challenge of adapting to
individual learners' needs, and the chance of technological
faults. In addition, integrating spiking neural networks with
traditional multilayer perceptrons may introduce additional
parameters and intricacies in the training process, requiring
careful optimization and more computational resources. It is
vital to keep these limitations in mind when using these
systems, so as to make the most of their benefits while
minimizing the impact of their drawbacks. Despite these
limitations, such systems have the potential to revolutionize
teaching and learning.
In the future, we can expect online teaching interactive 
systems using the enhanced multilayer perceptron integrated 
spiking neural network (EMLP-SNN) to become more accessible, 
adaptable, and engaging for students. With the use of data 
analytics, artificial intelligence, and virtual and augmented 
reality technologies, these systems will provide more 
personalized and immersive learning experiences. As a 
result, we can anticipate their increased adoption and 
integration into traditional educational methods, leading to 
a more effective and efficient learning experience for 
students. 
 
Acknowledgement 
This work was supported by the 2021 Guangxi Higher 
Education Undergraduate Teaching Reform Project (key 
project): The Path Research of Ideological and Political 
Theory Courses into Civil Engineering Courses – A Case 
Study of Civil Engineering Courses of Guangxi Vocational 
Normal University. 
References 
[1] Gong, Y., 2022. Study on Machine Translation 
Teaching Model Based on Translation Parallel Corpus 
and Exploitation for Multimedia Asian Information 
Processing. ACM Transactions on Asian and Low-
Resource Language Information Processing. 
[2] Zhou, Y., 2022. Research on innovative strategies of 
college students’ English teaching under the 
background of artificial intelligence. Applied 
Mathematics and Nonlinear Sciences. 
[3] Han, B., 2022. Big Data-Based Behavior Analysis of 
Autonomous English Learning in Distance 
Education. International Journal of Emerging 
Technologies in Learning, 17(13). 
[4] Wang, H., 2022. A survey of multimedia-assisted 
English classroom teaching based on statistical 
analysis. Journal of Mathematics, 2022. 
[5] Deng, W. and Wang, L., 2021, May. Research on 
English teaching based on multimedia-assisted 
teaching. In 2021 2nd International Conference on 
Computers, Information Processing and Advanced 
Education (pp. 1365-1368). 
[6] Kocak, O., 2022. A systematic literature review of 
web-based student response systems: Advantages and 
challenges. Education and Information 
Technologies, 27(2), pp.2771-2805. 
[7] He, X. and Cao, Y., 2021. The problem of art teaching 
based on interactive multimedia assisted instruction 
platform. The International Journal of Electrical 
Engineering & Education, p.0020720921996602. 
[8] Lin, R., 2021, September. Integration of university 
foreign language teaching and multimedia resources 
in artificial intelligence vision. In 2021 4th 
International Conference on Information Systems and 
Computer Aided Education (pp. 1214-1217). 
[9] Yue, Q., 2022. Construction of English-Assisted 
Teaching Mode Based on Multimedia Technique in 
Network Environment. Wireless Communications and 
Mobile Computing, 2022. 
[10] Alzahrani, A., Adnan, M., Aljohani, M., Alarood, 
A.A. and Uddin, M.I., 2022. Memory Load and 
Performance-based Adaptive Smartphone E-learning 
Framework for E-commerce Applications in Online 
Learning. Journal of Internet Technology, 23(6), 
pp.1353-1365. 
[11] Zhang, Y., 2022. Multimedia-assisted oral English 
teaching system based on B/S 
architecture. International Journal of Continuing  
Education and Life Long Learning, 32(6), pp.663-
680. 
[12] Sun, Z., Anbarasan, M. and Praveen Kumar, D.J.C.I., 
2021. Design of online intelligent English teaching 
platform based on artificial intelligence 
techniques. Computational Intelligence, 37(3), 
pp.1166-1180. 
[13] Webb, M.E., Fluck, A., Magenheim, J., Malyn-Smith, 
J., Waters, J., Deschênes, M. and Zagami, J., 2021. 
Machine learning for human learners: opportunities, 
issues, tensions and threats. Educational Technology 
Research and Development, 69, pp.2109-2130. 
[14] Lee, C.A., Tzeng, J.W., Huang, N.F. and Su, Y.S., 
2021. Prediction of student performance in massive 
open online courses using deep learning system based 
on learning behaviors. Educational Technology & 
Society, 24(3), pp.130-146. 
[15] Tian, M., Fu, R. and Tang, Q., 2022. Research on the 
Construction of English Autonomous Learning Model 
based on computer Network-assisted 
instruction. Computational Intelligence and 
Neuroscience, 2022. 
[16] Fu, J., 2022. Innovation of engineering teaching 
methods based on multimedia assisted 
technology. Computers and Electrical 
Engineering, 100, p.107867. 
[17] Yu, J., 2020, January. Data sensing based inventive 
system for multimedia and sensing of web 
applications. In 2020 Fourth International Conference 
on Inventive Systems and Control (ICISC) (pp. 437-
440). IEEE. 
[18] Liu, X. and Li, N., 2022. Optimization of Multimedia 
English Teaching Computer-Aided System based on 
Internet of Things. 
[19] Alzahrani, A., Adnan, M., Aljohani, M., Alarood, 
A.A. and Uddin, M.I., 2022. Memory Load and 
Performance-based Adaptive Smartphone E-learning 
Framework for E-commerce Applications in Online 
Learning. Journal of Internet Technology, 23(6), 
pp.1353-1365. 
[20] Huang, L. and Zheng, P., 2022. Human-Computer 
Collaborative Visual Design Creation Assisted by 
Artificial Intelligence. ACM Transactions on Asian and 
Low-Resource Language Information Processing.