https://doi.org/10.31449/inf.v48i6.5375                                                                                    Informatica 48 (2024) 141–156   141 
Analysis of Media Content Recommendation in the New Media Era 
Considering Scenario Clustering Algorithm 
Lei Tian 
Department of Animation Art, Zibo Vocational Institute, Zibo, Shandong, China, 255314
E-mail: 15264327540@163.com 
 
Keywords: scenario clustering algorithm, content recommendation, new media era, communication by 
microblogging 
 
Received: October 26, 2023 
In the era of new media, the abundance of internet information makes it difficult for users to find
media that is both relevant and engaging. Although recommendation technologies have made
significant progress, they still face hurdles related to data confidentiality and
algorithmic bias. With the continuous progress of the social economy, new media and
micro media keep emerging in many forms, and the ways of accessing media content
have diversified as well. However, the diversity of media content in the
era of big data also means that users spend excessive time selecting useful content. In response to these
demands and defects, this paper introduces a scenario clustering algorithm. Taking media
content recommendation as the breakthrough point, it builds a clustering model that represents the
effective distribution of events by analyzing the network structure and the distribution of media
content, and it compares cross-content events to
achieve effective clustering and analysis of media content. The results of the simulation experiment
indicate that the proposed scenario clustering algorithm is effective and can support the
analysis of media content recommendation in multiple dimensions, thereby providing high-quality media.
Povzetek: Predstavljen je algoritem za gručenje scenarijev za izboljšanje priporočil medijskih vsebin, 
ob upoštevanju zasebnosti in algoritmične pristranskosti, z analizo mrežne strukture. 
 
 
1   Introduction 
Media content recommendation technologies utilize sophisticated algorithms and
analysis of user behavior to create tailored
recommendations, hence improving user engagement
and satisfaction. Given the abundance of choices 
available, customers are increasingly looking for 
personalized experiences. Consequently, the importance 
of media content recommendations in enabling 
discovery and applicability becomes crucial. With the 
continuous progress of the social economy and the 
popularity of mobile Internet and other devices, 
people’s access to corresponding media content in the 
new media era has also become diversified, with typical 
terminals such as cell phones, tablets, computers, and 
TVs [1-2]. Owing to the continuous and in-depth
application of big data, media content is no longer
released only by traditional fixed institutions or individuals;
ordinary users have become both consumers and creators of
media content [3-4]. How to extract useful information
from a huge amount of data is a
critical research direction. In the specific new media
and micro-media environment, it is a time-consuming 
and laborious process to distinguish excellent or poor 
data produced by ordinary users effectively; that is, it is 
highly challenging to manage and detect the data 
effectively. In particular, implementing effective 
measurement of data similarity, cluster analysis, and so 
on is a prerequisite for ensuring accurate, effective, and 
healthy media content [5-6]. Efficient media content 
recommendation algorithms are crucial in assisting 
users in navigating through a massive amount of 
information by providing personalized and pertinent 
suggestions that are customized to their specific tastes, 
hobbies, and consumption habits. These 
recommendation methods utilize advanced algorithms, 
machine learning methods, and analysis of user data to 
understand user behavior and preferences, resulting in 
improved user experience and engagement. Industry
experts have also carried out many studies on data
management of media content, mainly relying on effective
categorization and clustering of data.
Typically, cluster analysis, the bag-of-words model, and similar methods are adopted
to extract data features and
apply them to clusters of news, online articles, and other content.
Compared with traditional news, new media content has two
obvious characteristics: ordinary users create the content,
and transmission is two-way. "Ordinary users create content"
means that most of the
content in new media is created by users themselves,
is rooted in and drawn from the grassroots,
and tracks topics and
trends in public opinion closer to real time. Through two-way transmission,
users become both the receivers and the creators of
new media content. Some studies have suggested that
users bring their personal preferences,
opinions, and relative personal influence to the
new media content they create [7-8]. As a result,
in the process of transmission,
users form a certain relationship network and become an
essential part of the new media era.
However, due to the lowering of the threshold for the 
creation of media content, the quality of media content 
gradually varies, and it is extremely important to extract 
high-quality media content effectively. To address these 
demands and defects, a scenario clustering algorithm is 
introduced in this paper. Attempts are made to explore 
the construction of a recommendation model by sorting 
out the business logic of media content 
recommendation in the new media era. Feature analysis 
and clustering analysis are carried out from the 
perspectives of network structure and media content. 
Through the comparison of cross-content events, the 
effective clustering and analysis of media content is 
implemented to achieve the effective recommendation 
of media content. 
 
Methodology | Findings | Limitation
Study [9] proposed an innovative latent genre-aware micro-video recommending algorithm. | The methodology overcomes current methods by taking into account user interests and micro-video content features, leading to recommendations that were more tailored and pertinent. | The computing capabilities necessary for developing and implementing the neural recommendation network might provide practical constraints in real-world scenarios.
Research [10] examined the elements that influence the spread of news material in the digital realm, with a specific emphasis on audience engagement on social media networks including Facebook and Twitter. | The analysis identified specific news concepts and themes that had a higher potential of eliciting audience comments on Facebook and Twitter. | The investigation was confined to examining news content disseminated on Facebook and Twitter, possibly neglecting other social media sites.
Author [11] analyzed the origins of "user-generated content (UGC)" on social media, primarily focusing on strong-tie resources, weak-tie resources, and tourism-tie resources, and their influence on tourist experience within the region. | The findings could imply that UGC sources had an indirect impact on visitor satisfaction by influencing demands, which were then contrasted with visitors' real impressions. | The analysis was dependent on user-generated content sourced from social media, which might introduce bias and fail to consider persons who were not actively involved in social media operations.
Article [12] presented a multi-agent framework that employs an innovative mechanism to rate news articles based on the user's interests obtained from social media. | The method, which they had developed, yields a 28% improvement in outcomes compared to the recommendation systems used by existing news websites. | The efficacy of sentiment analysis in screening news items might fluctuate based on the precision of the assessment algorithm.
Paper [13] investigated the correlation between the user and the API and utilized these associations together with collaborative learning approaches to provide API recommendations. | The experiments were conducted using a dataset from the actual world. The findings of the experiments demonstrate that their models outperform all other methods that were evaluated. | The effectiveness of the models was highly dependent on the quality and expressive capability of the vector depictions provided by doc2vec. However, it was possible that these representations might not fully capture all the subtleties of user-API interactions.
 
 
 
1.1 Data collection 
We gathered the dataset from Kaggle:
https://www.kaggle.com/datasets/pypiahmad/social-recommendation-data.
The Social Recommendation
Database consists of ratings combined with social or 
trusted connections among users from two distinct 
networks - LibraryThing (a website for book reviews) 
and Epinions (a platform for general consumer reviews). 
The collection combines review data with social 
relationships between users, offering a distinct chance 
to examine the impact of networking sites on rating 
patterns and vice versa. Table 1 shows the dataset 
description. 
 
Table 1: Dataset description 
Statistic | LibraryThing | Epinions
Number of users | 73,882 | 116,260
Number of items | 337,561 | 41,269
Number of ratings/feedback | 979,053 | 181,394
Number of social relations | 120,536 | 181,304
 
2. Data preprocessing 
2.1 Data cleaning 
The data cleaning procedure encompasses the elimination of duplicates, the handling
of missing values, and the maintenance of data integrity
across platforms. More precisely, it involves
removing duplicate entries, filling in missing ratings or
user connections when feasible, and standardizing user
identifiers. Furthermore, outlier detection can be
used to eliminate atypical ratings or associations in the
data.
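The cleaning steps above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the study: the row layout `(user_id, item_id, rating)`, the function name `clean_ratings`, the mean-fill rule, and the z-score cutoff `z_max` are all assumptions introduced here.

```python
from statistics import mean, pstdev

def clean_ratings(rows, z_max=3.0):
    """rows: list of (user_id, item_id, rating-or-None) tuples (hypothetical layout)."""
    # Standardize user identifiers and drop duplicate (user, item) entries,
    # keeping the first occurrence.
    seen, unique = set(), []
    for user, item, rating in rows:
        key = (str(user).strip().lower(), item)
        if key not in seen:
            seen.add(key)
            unique.append((key[0], item, rating))
    known = [r for _, _, r in unique if r is not None]
    if not known:
        return unique
    # Fill missing ratings with the mean of the known ratings (assumed rule).
    mu, sigma = mean(known), pstdev(known)
    filled = [(u, i, mu if r is None else r) for u, i, r in unique]
    if sigma == 0:
        return filled
    # Drop atypical ratings whose z-score exceeds z_max (assumed outlier rule).
    return [(u, i, r) for u, i, r in filled if abs(r - mu) / sigma <= z_max]
```

For example, two spellings of the same user ("Ann" and "ann ") rating the same item collapse into one row, and a missing rating is replaced by the average of the known ones.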
 
Z-score normalization
Z-score normalization is a widely used
normalization technique that transforms all input
values into standardized values with a mean of zero
and a standard deviation of one.
For every attribute $A$, the mean and standard deviation are
computed. The normalization process then uses
the calculated mean and standard deviation to
standardize each value $v_i$ of attribute $A$. The transformation is given by

$$v_i' = \frac{v_i - \operatorname{mean}(A)}{\operatorname{std}(A)} \tag{1}$$

The notation $\operatorname{mean}(A)$ represents the average value of
the attribute $A$, while $\operatorname{std}(A)$ represents its
standard deviation. The benefit of the approach lies in
its ability to mitigate the impact of anomalous values
on the data.
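A minimal sketch of the z-score transformation for a single attribute is given below. The function name `z_score` is hypothetical, and the population standard deviation is assumed because the text does not say whether the sample or population form is used.

```python
from statistics import mean, pstdev

def z_score(values):
    """Standardize one attribute to zero mean and unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)  # population std is an assumption
    if sigma == 0:
        # A constant attribute carries no spread; map every value to zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

Applied to the ratings `[2.0, 4.0, 6.0]`, the result is centered on zero and has unit spread, so attributes measured on different scales become comparable.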
 
2.2 Scenario clustering algorithm 
The so-called scenario clustering algorithm is designed 
in response to the new media environment to implement 
the conversion of media content data and contact data, 
support the clustering analysis of content and structure 
from the complex network, and perform the similarity 
calculation process. A Scenario Clustering Algorithm 
classifies data into clusters according to similarities in 
predefined situations. The process generally includes 
procedures such as extracting features, measuring 
similarity, and utilizing clustering techniques including 
k-means or hierarchical clustering. The system assesses 
patterns and correlations among scenarios to categorize 
them properly, facilitating comprehension of data 
structures and enabling predictions. By employing 
iterative refinement, it improves both accuracy and 
efficiency, making it applicable in many domains 
including data processing, pattern recognition, and 
decision-making. The versatility of this tool enables it 
to be easily adapted to a wide range of datasets, hence 
facilitating problem-solving and the finding of insights 
in difficult settings. Its specific process is shown in
Figure 1.
 
  
Figure 1: Network event similarity model based on the 
network structure entropy and the content distribution 
entropy. 
Starting with the data content, the similarity of media
content can be measured in two aspects:
the structural entropy of the network and
the distribution entropy of the
content. The structural entropy of the network measures
the similarity of the topological chains formed by
communication in the new media era, while the distribution
entropy of the content mainly measures the similarity of
the content changes during the communication process.
The similarity measure based on the network structure
can be evaluated quantitatively from the
complexity and similarity of the network during
propagation, whereas the similarity measure of the
content is computed quantitatively based on a
fixed model.
2.3 Similarity measure based on entropy 
In the new media era, media content is
disseminated under the guidance of important nodes, and
the number of ordinary nodes involved gradually
decreases over time, as shown in Figure 2.
 
 
Figure 2: Evolution of the communication of new media 
events. 
(1) Similarity based on the entropy of network 
structure 
In essence, the measurement of network structure
entropy is the calculation and analysis of probability
distributions over nodes. In this paper, divergence is introduced to
conduct the quantitative analysis, which is
calculated according to equation (2).

$$D_{KL}(p \,\|\, q) = \sum_{i=1}^{N} p(x_i) \log \frac{p(x_i)}{q(x_i)} \tag{2}$$
In the above equation, p and q denote the probability
distributions over the given dimensions.
Kullback-Leibler (KL) divergence, also known as
relative entropy, measures how far one distribution is from
another; the smaller its value, the more similar the two
distributions [14-15].
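As an illustration, the KL divergence of two discrete distributions can be computed as sketched below. The function name `kl_divergence` is hypothetical, and the natural logarithm is assumed because the text does not state the base.

```python
import math

def kl_divergence(p, q):
    """KL divergence D(p || q) for two discrete distributions of equal length."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of outcomes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi <= 0:
                # p puts mass where q has none: the divergence is unbounded.
                return math.inf
            total += pi * math.log(pi / qi)
    return total
```

Note that the measure is not symmetric: `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ, which is one reason symmetric variants are often preferred when comparing events.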
After the similarity of network size and network
topology in the transmission process are taken into
comprehensive consideration, the quantitative
calculation is carried out as shown in equation (3).

$$D_n(g_1, g_2) = w_1\left(1 - \frac{EMD_{g_1,g_2}}{\log 2}\right) + w_2\left|NND(g_1) - NND(g_2)\right| \tag{3}$$
The discrete point index NND for
the MVD network is calculated as shown in equation (4).

$$NND(G) = \frac{J(P_1, P_2, \ldots, P_N)}{\log(d+1)} \tag{4}$$

Normalized conversion is carried out based on equation
(4), and the details are shown in equation (5).

$$J(P_1, P_2, \ldots, P_N) = \frac{1}{N}\sum_{i,j} p_i(j) \log \frac{p_i(j)}{\mu_j} \tag{5}$$

The calculation of $\mu_j$ is shown in equation (6).

$$\mu_j = \frac{\sum_{i=1}^{N} p_i(j)}{N} \tag{6}$$
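The chain of definitions for the dispersion index (the average distribution, the divergence J of the node distributions from that average, and its normalization by log(d + 1)) can be sketched as follows. The function names are hypothetical, each node distribution is assumed to be a list of probabilities of equal length, and `d` is passed in as a given parameter because the text does not define it.

```python
import math

def jensen_shannon(dists):
    """J(P_1..P_N) = (1/N) * sum over i, j of p_i(j) * log(p_i(j) / mu_j)."""
    n = len(dists)
    width = len(dists[0])
    # mu_j is the average of p_i(j) over all N node distributions.
    mu = [sum(p[j] for p in dists) / n for j in range(width)]
    total = 0.0
    for p in dists:
        for j in range(width):
            if p[j] > 0:  # terms with p_i(j) = 0 contribute nothing
                total += p[j] * math.log(p[j] / mu[j])
    return total / n

def nnd(dists, d):
    """NND(G) = J(P_1..P_N) / log(d + 1); d is taken as given."""
    return jensen_shannon(dists) / math.log(d + 1)
```

When all node distributions are identical, J is zero and so is the index; the more the node distributions differ from their average, the larger the value.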
(2) Similarity based on the entropy of content 
distribution 
Similarity measurement by structural
entropy measures event similarity from the
topology of the event; here the probability
distributions of individual nodes, rather than of the entire
transmission network, are used and corrected based on
the practical scale of the media content.
The network similarity model
based on entropy is calculated as shown in equation (7).

$$D(g_1, g_2) = w_1 \cdot D_l(g_1, g_2) + w_2 \cdot D_n(g_1, g_2) \tag{7}$$
3. Event clustering model based on NRL and K-means
From the similarity calculation model proposed in this 
paper, the distance between events or the distance 
matrix between multiple events can be derived directly. 
The basic attributes of an event are defined as shown
in equation (8).

$$E_i = \{M_i, I_i\} \tag{8}$$

In the above equation, the network event is denoted by
$E_i$, and the numbers of media are denoted by
$M_i$ and $I_i$, respectively; all of these parameters
need to be normalized.
3.1 Improved K-means algorithm 
The K-means algorithm requires the number of
clustering centers to be known in advance.
However, in most cases, the
exact number is not known before clustering. If the number of
clustering centers is chosen poorly, the error of the
clustering results increases [16-17]. Therefore, before clustering,
a coarse clustering of the historical operation scenarios based on
the Canopy algorithm is performed to determine the
number of clustering centers. The Canopy algorithm
does not require the number of clusters to be specified in
advance, so it can be applied to the data
in the preprocessing stage. The clustering results
can then be refined by processing the data more precisely according to the
coarse clustering results. The specific steps of the
Canopy algorithm are as follows:
Step 1: Input the set List composed of the original data, and
set the distance thresholds T1 and T2, with T1 > T2.
Step 2: Select a data point P' from the List at random,
take P' as the center of the first Canopy, and remove
it from the List.
Step 3: Take a point Q from the List and calculate the
distance from Q to the centers of all Canopies
generated so far. If the distance from Q to a
Canopy is less than T2, add Q to that Canopy and
remove it from the List; that is, Q is considered
too close to that Canopy to become the center of another
Canopy. If the distance from Q to every Canopy is
greater than T1, Q becomes a new Canopy
and is removed from the List. If the distance from Q
to some Canopy is between T2 and T1, Q is added
to that Canopy but is not removed from
the List and continues to take part in the
subsequent calculations.
Step 4: Repeat Step 3 for the other points in the List
until the List is empty.
The number of coarse clusters output
by the Canopy algorithm is taken as the input parameter
of K-means clustering to obtain the final clustering
result.
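The two-stage idea can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the implementation used in the paper: it uses the common formulation of Canopy in which each new center is drawn from the remaining list, it picks the first remaining point instead of a random one so that the result is reproducible, it uses Euclidean distance, and the function names `canopy` and `k_means` are hypothetical.

```python
import math

def canopy(points, t1, t2):
    """Coarse clustering; returns a list of (center, members) pairs."""
    if t1 <= t2:
        raise ValueError("t1 must be greater than t2")
    remaining = list(points)
    canopies = []
    while remaining:
        center = remaining.pop(0)  # deterministic pick (the text picks randomly)
        # Every point within T1 of the center joins this canopy.
        members = [center] + [q for q in remaining if math.dist(center, q) < t1]
        canopies.append((center, members))
        # Points within T2 are bound to this canopy and leave the list;
        # points between T2 and T1 stay and may join or start other canopies.
        remaining = [q for q in remaining if math.dist(center, q) >= t2]
    return canopies

def k_means(points, centers, iterations=20):
    """Plain K-means seeded with the given initial centers."""
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            tuple(sum(x) / len(cl) for x in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters
```

A typical use is `coarse = canopy(points, t1, t2)` followed by `k_means(points, [c for c, _ in coarse])`, so that both the number of clusters and the initial centers come from the coarse pass.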
3.2 Generation of typical scenarios 
Due to the uncertainty of the scenario, it is necessary to
cluster the historical output curves of the new energy
power sources over a year to obtain typical output
curves. However, if the improved clustering
method analyzes each historical output curve
separately, it may yield a different number of
clusters for each power source, which
increases the complexity and computational effort of the
subsequent analysis. Hence, it is necessary to define
power supply operation scenarios, that is, scenarios
of power output characteristics obtained by taking all
new energy sources in the system as a whole. In the
clustering process, Canopy
coarse clustering is first carried out for each power supply to
obtain the optimal number of clusters; then K-means
clustering is performed on the power supply operation
scenarios to obtain the typical operation scenarios of the new
energy power supply. The specific process is described
below.
Firstly, the historical curves of n new energy power
sources are analyzed by Canopy coarse clustering, and
the number $k_i$ ($1 \le i \le n$) of coarse clustering centers
of each power source is obtained. The
cluster count that occurs most often
among all the sources is identified, and this
number is used as the optimal cluster
number (k) for the typical operation scenarios of the
power supply. Subsequently, k is used as the input
parameter of the K-means clustering method to
perform uniform scenario clustering of the power supply
operation scenarios and obtain the typical operation
scenarios of the power supplies. At the same time, the
occurrence probability (P) of each typical scenario can
be obtained from the number of historical scenarios
contained in each typical scenario. As the clustering
process of typical load scenarios is similar to that of
the power supplies, it is not described in detail
here [18-19].
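The selection of k and the scenario probabilities described above reduce to simple counting. A minimal sketch follows, assuming the per-source Canopy counts and the final cluster label of each historical scenario are already available; both function names are hypothetical.

```python
from collections import Counter

def optimal_k(coarse_counts):
    """Return the Canopy cluster count k_i that occurs most often across sources."""
    # On a tie, Counter.most_common keeps the count that was seen first.
    return Counter(coarse_counts).most_common(1)[0][0]

def scenario_probabilities(labels):
    """Probability of each typical scenario = share of historical scenarios in it."""
    counts = Counter(labels)
    total = len(labels)
    return {scenario: c / total for scenario, c in counts.items()}
```

For instance, if five sources yield coarse counts `[3, 4, 3, 5, 3]`, then k = 3 is used for the joint K-means step.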
After the clustering results of power (assumed to contain
m typical scenarios) and load (assumed to contain
$n_0$ typical scenarios) are obtained
separately, the typical scenarios of power and load
are combined pairwise, which yields
$m \times n_0$ typical system
operation scenarios and the occurrence probability
$P_i P_j$ ($1 \le i \le m$, $1 \le j \le n_0$) of each system operation scenario. The
process of generating typical system operation scenarios
is shown in Figure 3. These scenarios
cover the possible power supply operation scenarios,
including the source-load matching scenarios
of the power output characteristics of each new energy
source.
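The pairwise combination can be sketched as a Cartesian product of the two scenario sets. The sketch assumes that power and load scenarios are independent, so that each joint probability is the product of the two individual probabilities; the function name and the dictionary layout are hypothetical.

```python
from itertools import product

def combine_scenarios(power_probs, load_probs):
    """Pair every power scenario with every load scenario.

    Both arguments map a scenario label to its occurrence probability.
    Independence of power and load is assumed, so probabilities multiply.
    """
    return {
        (i, j): p_i * p_j
        for (i, p_i), (j, p_j) in product(power_probs.items(), load_probs.items())
    }
```

With m = 2 power scenarios and three load scenarios, the result contains 6 system scenarios whose probabilities sum to one.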
 
Figure 3: Flow chart for the generation of typical 
scenarios. 
The power system is partitioned, and
flexibility assessment is conducted within and between
its regions, respectively. The
assessment indexes are shown in Figure 4.
The intra-regional flexibility assessment
indexes include the partitioned supply and demand
upward and downward flexibility deficiency index and
the partitioned grid flexibility deficiency index, and the
inter-regional flexibility assessment index is the
partitioned transmission channel flexibility deficiency
index.
 
Figure 4: Schematic diagram of flexibility assessment 
indexes. 
3.3 Flexibility indexes for upward and downward adjustment of supply and demand by region
The flexibility index for supply and demand by
region determines whether the flexibility
resources in the region meet the flexibility
demand. The schematic diagram of the flexibility
resources in the region is shown in Figure 5, in
which wind power, PV, and hydropower are
uncontrollable units, and thermal power and energy
storage are controllable units. In the calculation of
flexibility demand, the fluctuation of the output of
uncontrollable units and load is taken into consideration.
In the calculation of flexibility supply, the flexibility
supply capacity provided by controllable units is taken
into consideration [20-21].
The power demand and
supply generated by the uncontrollable units and loads
in the partition at moment t are calculated as shown in equation (9) and
equation (10).

$$P_{uncon\_demand}(t) = P_{load}(t) + P(t) \tag{9}$$

$$P_{uncon\_supply}(t) = P_{wind}(t) + P_{PV}(t) + P_{hydro}(t) \tag{10}$$
                                                                        
When $P(t) > 0$, the partition is considered to
deliver power to the outside world, which increases the
power demand; when $P(t) < 0$, the
partition is considered to receive power from the outside world, which
decreases the power demand.
The upper and lower limits of the power demand
variation in the uncontrollable part are calculated
as shown in equation (11) and equation (12).

$$P_{uncon\_demand\_max}(t) = (1+\lambda) P_{uncon\_demand}(t) - (1-\lambda) P_{uncon\_supply}(t) \tag{11}$$

$$P_{uncon\_demand\_min}(t) = (1-\lambda) P_{uncon\_demand}(t) - (1+\lambda) P_{uncon\_supply}(t) \tag{12}$$

In the above equations, λ stands for the power
fluctuation coefficient; the larger λ is, the greater the
power fluctuation.
Based on the variation range of power demand
described above, the flexibility demand of the partition
is calculated as follows.
When $P_{uncon\_demand\_min}(t) > P_{uncon\_demand}(t-1)$, only the
upward flexibility demand is calculated,
as shown in equation (13).

$$P_{demand\_up}(t) = P_{uncon\_demand\_max}(t) - P_{uncon\_demand}(t-1) \tag{13}$$

When $P_{uncon\_demand\_max}(t) \ge P_{uncon\_demand}(t-1) \ge P_{uncon\_demand\_min}(t)$,
there are both upward and downward flexibility
demands, which are calculated according to equation
(14).

$$\begin{cases} P_{demand\_up}(t) = P_{uncon\_demand\_max}(t) - P_{uncon\_demand}(t-1) \\ P_{demand\_down}(t) = P_{uncon\_demand}(t-1) - P_{uncon\_demand\_min}(t) \end{cases} \tag{14}$$

When $P_{uncon\_demand}(t-1) > P_{uncon\_demand\_max}(t)$, only the
downward flexibility demand is calculated,
according to equation (15).

$$P_{demand\_down}(t) = P_{uncon\_demand}(t-1) - P_{uncon\_demand\_min}(t) \tag{15}$$
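The three cases can be sketched as a small function. This is an illustrative reading of the definitions, not a reference implementation: the upper limit is taken as (1 + λ) times demand minus (1 − λ) times supply, the lower limit as (1 − λ) times demand minus (1 + λ) times supply, and the strictness of the comparisons at the case boundaries is an assumption. Function and parameter names are hypothetical.

```python
def demand_bounds(demand, supply, lam):
    """Upper and lower limits of the uncontrollable net demand at moment t."""
    upper = (1 + lam) * demand - (1 - lam) * supply
    lower = (1 - lam) * demand - (1 + lam) * supply
    return upper, lower

def flexibility_demand(prev_demand, upper, lower):
    """Return (upward, downward) flexibility demand at moment t.

    prev_demand is the uncontrollable demand at moment t-1.
    """
    if lower > prev_demand:
        # The whole range lies above the previous value: upward demand only.
        return upper - prev_demand, 0.0
    if upper < prev_demand:
        # The whole range lies below the previous value: downward demand only.
        return 0.0, prev_demand - lower
    # The previous value lies inside the range: both directions are needed.
    return upper - prev_demand, prev_demand - lower
```

For example, with limits of 74 and 46 and a previous demand of 60, the partition needs 14 units of upward and 14 units of downward flexibility.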
 
Figure 5: Diagram of partition flexibility resources. 
The upward and downward flexibility supply by region
is provided by the controllable units, and it can be
calculated according to equation (16) and equation (17).

$$P_{supply\_up}(t) = P_{con\_gen\_max} - P_{con\_gen}(t) \tag{16}$$

$$P_{supply\_down}(t) = P_{con\_gen}(t) - P_{con\_gen\_min} \tag{17}$$

In the above equations, $P_{con\_gen}(t)$ stands for the output
value of the controllable units; $P_{con\_gen\_max}$ and
$P_{con\_gen\_min}$ stand for the maximum output value and
minimum output value of all controllable units,
respectively.
Based on the upward and downward flexibility
demand and supply described above, the upward and
downward poor flexibility indexes can be
calculated according to equation (18) and equation (19):

$$F_{up}(t) = \frac{P_{demand\_up}(t)}{P_{supply\_up}(t)} \tag{18}$$

$$F_{down}(t) = \frac{P_{demand\_down}(t)}{P_{supply\_down}(t)} \tag{19}$$

The upward and downward poor flexibility indexes
indicate the capacity of the flexibility resources to meet
the upward and downward flexibility demands.
When $F_{up}(t) < 1$, $P_{supply\_up}(t) > P_{demand\_up}(t)$, and
the flexibility resources have a certain margin.
When $F_{up}(t) > 1$, $P_{supply\_up}(t) < P_{demand\_up}(t)$, and
the flexibility resources may not be able to meet the
demand of the grid. Hence, it is necessary to take
measures such as allocating controllable units, energy
storage, or new energy, or shedding load, to keep the
supply and demand of flexibility resources in balance.
Similarly, when $F_{down}(t) > 1$, the flexibility resources
fail to meet the demand of the grid, and it is necessary to
take measures such as curtailing new energy output and
increasing load to keep the supply and demand of
flexibility resources in balance [22-23].
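A short sketch of the supply side and of the resulting index is given below. It assumes that each index is the ratio of flexibility demand to flexibility supply, so that values above 1 signal a shortfall and values below 1 a margin; the function names and the handling of zero supply are hypothetical choices made for illustration.

```python
def flexibility_supply(p_gen, p_max, p_min):
    """Upward and downward flexibility supply of the controllable units at moment t."""
    return p_max - p_gen, p_gen - p_min

def deficiency_index(demand, supply):
    """Ratio of flexibility demand to flexibility supply (assumed definition)."""
    if supply == 0:
        # No headroom at all: any positive demand cannot be met.
        return float("inf") if demand > 0 else 0.0
    return demand / supply
```

For controllable units running at 70 with limits of 100 and 30, the upward supply is 30; an upward demand of 14 then gives an index below 1 (margin available), while an upward demand of 34 gives an index above 1 (shortfall).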
3.4 Poor flexibility index of partition grids 
Access to a large number of new energy sources can
affect the power flow distribution of the system, and the
partition grid flexibility index is an indicator
used to determine whether the grid structure and line
transmission capacity in the region can accommodate that power flow
distribution. It is the weighted average of the calculated load factors
of the N branches
with the largest calculated load factor in the network at
moment t [24-25]. Thus, the poor grid flexibility index
at moment t can be calculated according to equation (20).

$$Flex\_net(t) = \sum_{i=1}^{N} \omega_i L_i(t) \tag{20}$$

In the above equation, i stands for an arbitrary branch;
$\omega_i$ stands for the flexibility weighting factor; and $L_i(t)$
stands for the calculated load factor. The detailed
calculation is shown in equation (21) and equation (22):

$$\omega_i = \frac{\sigma_i^2}{\sum_{i=1}^{N} \sigma_i^2} = \frac{\sum_{t=1}^{T} \left(L_i(t) - \bar{L}_i\right)^2}{\sum_{i=1}^{N} \sum_{t=1}^{T} \left(L_i(t) - \bar{L}_i\right)^2} \tag{21}$$

$$L_i(t) = \frac{S_i(t)}{S_{i\max}} \tag{22}$$

In the above equations, $\sigma_i^2$ stands for the variance of the
calculated load factor; $\bar{L}_i$ stands for the average of the
calculated load factor over all moments; T stands for the
maximum number of moments; $S_i(t)$ stands for the
transmission capacity; and $S_{i\max}$ stands for the
maximum transmission capacity. Comparing the branch circuits
horizontally based on $\omega_i$ makes it possible to
identify the "defects" that restrict the ability of the
network frame to resist the fluctuation of flexibility
resources. When $L_i(t) > 1$, overload can
occur in actual operation, and measures such as
wind curtailment, solar curtailment, and load shedding
should be taken. When $L_i(t) \le 1$, the branch circuit
can operate normally.
According to the definition, $Flex\_net(t) \ge 0$.
When $Flex\_net(t) > 1$, one or more branches can be
overloaded in actual operation, and it is necessary to take
appropriate measures to cope with the situation. Hence,
the smaller $Flex\_net(t)$ is, the better the flexibility of
the grid.
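The grid index can be sketched as three small steps: load factors per branch, variance-based weights, and the weighted sum at one moment. The sketch assumes that the weight of a branch is its share of the total load-factor variance and that the branches passed in are already the N most heavily loaded ones; the function names and the equal-weight fallback for zero variance are hypothetical.

```python
def load_factors(flows, capacities):
    """L_i(t) = S_i(t) / S_i_max for every branch i and moment t."""
    return [[s / cap for s in branch] for branch, cap in zip(flows, capacities)]

def variance_weights(factors):
    """Weight of branch i = its share of the summed squared deviations."""
    squared = []
    for branch in factors:
        avg = sum(branch) / len(branch)
        squared.append(sum((x - avg) ** 2 for x in branch))
    total = sum(squared)
    if total == 0:
        # No branch fluctuates at all: fall back to equal weights (assumption).
        return [1 / len(factors)] * len(factors)
    return [s / total for s in squared]

def flex_net(factors, weights, t):
    """Weighted sum of the branch load factors at moment t."""
    return sum(w * branch[t] for w, branch in zip(weights, factors))
```

With this weighting, a branch whose loading fluctuates strongly dominates the index, which matches the idea of locating the branches that limit the ability of the grid to absorb fluctuations.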
3.5 Poor flexibility index for partition transmission channel
Some regions are connected with many new energy
sources but little load, so the generated power
cannot be consumed within the region, and a large
volume of electric power needs to be transmitted to
other regions. In this regard, the flexibility index of the
transmission channel is defined to determine whether
the transmission capacity of the transmission channel
can accommodate the outgoing power.
According to the power flow relationship between the partitions,
the two ends of the transmission are
divided into the sending end and the receiving end [26-27].
It is assumed that there are n transmission lines
between the sending end and the receiving end, and
the active power of lines 1 to n is defined as
$P_1, P_2, \ldots, P_n$,
respectively; these lines constitute the transmission channel,
as shown in Figure 6.
 
 
 
Figure 6: Schematic diagram of the power transmission 
channel. 
At the sending end, the
sending power at moment t is calculated as shown in equation (23):

$$P_{supply}(t) = P_{wind}(t) + P_{PV}(t) + P_{hydro}(t) + P_{con\_gen}(t) - P_{load}(t) - P(t) \tag{23}$$
(1) Normal operating conditions 
When the transmission loss is ignored, the power
delivered at the sending end equals the transmission
power of the inter-regional transmission channel, as
shown in equation (24):
$$P_{\mathrm{supply}}(t) = P_{\mathrm{trans}}(t) = P_1(t) + P_2(t) + \cdots + P_n(t) \tag{24}$$
The power distribution coefficients of lines 1 to n in the
transmission channel are defined as $r_1, r_2, \ldots, r_n$,
respectively, and they satisfy equation (25):
$$\begin{aligned} P_{\mathrm{trans}}(t) &= P_1(t) + P_2(t) + \cdots + P_n(t) \\ &= r_1 P_{\mathrm{trans}}(t) + r_2 P_{\mathrm{trans}}(t) + \cdots + r_n P_{\mathrm{trans}}(t) = \sum_{i=1}^{n} r_i P_{\mathrm{trans}}(t) \end{aligned} \tag{25}$$
In the above equation, $r_i$ stands for the power
distribution coefficient of the ith transmission line, and
$\sum_{i=1}^{n} r_i = 1$.
In different operation scenarios of the system, the power
distribution of the transmission lines is different.
However, within the same operation scenario, the power
distribution coefficient of each transmission line is
considered to be constant. The coefficient $r_i$ is
defined and calculated as shown in equation (26):
$$r_i = \frac{\bar{P}_i}{\sum_{j=1}^{n} \bar{P}_j} \tag{26}$$
In the above equation, $\bar{P}_i$ stands for the average power of
the ith transmission line over all T moments, and it is
calculated as shown in equation (27):
$$\bar{P}_i = \frac{1}{T} \sum_{t=1}^{T} P_i(t) \tag{27}$$
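As a minimal sketch of equations (26) and (27), the coefficients can be computed from sampled line power as follows. The function names and the sample values are hypothetical and serve only to illustrate the arithmetic; they are not taken from the paper's data.

```python
from typing import List, Sequence


def average_line_power(series: Sequence[float]) -> float:
    """Equation (27): mean active power of one line over its T sampled moments."""
    if not series:
        raise ValueError("power series must not be empty")
    return sum(series) / len(series)


def distribution_coefficients(line_series: Sequence[Sequence[float]]) -> List[float]:
    """Equation (26): share of each line's mean power in the channel total."""
    means = [average_line_power(s) for s in line_series]
    total = sum(means)
    if total == 0:
        raise ValueError("total average power is zero, coefficients are undefined")
    return [m / total for m in means]


# Hypothetical active power samples (MW) for three lines in one scenario.
line_power = [
    [100.0, 120.0, 110.0, 110.0],
    [200.0, 220.0, 230.0, 230.0],
    [50.0, 60.0, 55.0, 55.0],
]
r = distribution_coefficients(line_power)
print([round(x, 4) for x in r])
```

By construction the coefficients sum to 1, which matches the constraint stated for equation (25).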
Among all transmission lines in the transmission
corridor, transmission line i reaches its upper power
limit for transmission when equation (28) holds:
$$r_i P_{\mathrm{trans}}(t) = P_i^{\max} \tag{28}$$
In this case, the total transmission power between the
sending end and the receiving end is considered to have
reached its maximum value. The transmission capacity of
the transmission channel is calculated for the case in
which each line in turn reaches its upper power limit,
and the minimum value is taken as the upper limit of the
transmission capacity of the transmission channel, as
shown in equation (29):
$$P_{\mathrm{normal\_max}} = \min_{1 \le i \le n} \frac{P_i^{\max}}{r_i} \tag{29}$$
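Equation (29) can be illustrated with a short sketch. The function name, the line ratings, and the coefficients below are hypothetical. Skipping lines whose coefficient is zero is an assumption made here (such lines carry no power and so cannot be the binding limit); the text does not discuss that case.

```python
from typing import Sequence


def channel_capacity_normal(p_max: Sequence[float], r: Sequence[float]) -> float:
    """Equation (29): the channel limit is the smallest P_i^max / r_i over all lines."""
    if len(p_max) != len(r) or not p_max:
        raise ValueError("p_max and r must be non-empty and of equal length")
    # Assumption: a line with r_i == 0 carries no power and is never binding.
    ratios = [pm / ri for pm, ri in zip(p_max, r) if ri > 0]
    if not ratios:
        raise ValueError("at least one coefficient must be positive")
    return min(ratios)


# Hypothetical line ratings (MW) and power distribution coefficients.
p_max = [300.0, 500.0, 200.0]
r = [0.25, 0.5, 0.25]
capacity = channel_capacity_normal(p_max, r)
print(capacity)
```

In this example the third line is the bottleneck: it reaches its 200 MW rating when the channel as a whole carries 800 MW, even though the other two lines still have headroom.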
The maximum output power at the sending end is defined
and calculated as shown in equation (30):
$$P_{\mathrm{source\_max}} = \max_t\left(P_{\mathrm{wind}}(t) + P_{\mathrm{PV}}(t) + P_{\mathrm{hydro}}(t)\right) + \min_t\left(P_{\mathrm{con\_gen}}(t)\right) - \min_t\left(P_{\mathrm{load}}(t)\right) - \min_t\left(P_{\Delta}(t)\right) \tag{30}$$
In the above equation, $P_{\Delta}(t)$ stands for the exchange
power between the sending end and other partitions
(excluding the receiving end).
Under normal operating conditions, the poor flexibility
index of the transmission channel is calculated as
shown in equation (31):
$$F_{\mathrm{channel}} = \frac{P_{\mathrm{source\_max}}}{P_{\mathrm{normal\_max}}} \tag{31}$$
If $F_{\mathrm{channel}} > 1$, the transmission channel
is not flexible enough to meet the transmission demand
of the maximum output of the power supply at the
sending end, and new transmission lines are required. If
$F_{\mathrm{channel}} = 1$, the transmission channel just
meets the transmission demand of the maximum power
output at the sending end. If $F_{\mathrm{channel}} < 1$,
the transmission channel is flexible enough and the
power output at the sending end is not blocked.
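The three cases above can be sketched as a small classification routine. The function names, the tolerance `tol`, and the numeric inputs are hypothetical; the maximum sending-end output and the channel capacity are taken as given inputs rather than recomputed.

```python
def channel_inflexibility(p_source_max: float, p_capacity: float) -> float:
    """Equation (31): ratio of maximum sending-end output to channel capacity."""
    if p_capacity <= 0:
        raise ValueError("channel capacity must be positive")
    return p_source_max / p_capacity


def interpret(f_channel: float, tol: float = 1e-9) -> str:
    """Map the index to the three cases described in the text."""
    if f_channel > 1 + tol:
        return "insufficient: new transmission lines are required"
    if f_channel < 1 - tol:
        return "sufficient: sending-end output is not blocked"
    return "just sufficient"


# Hypothetical values in MW.
f_tight = channel_inflexibility(900.0, 800.0)
f_loose = channel_inflexibility(600.0, 800.0)
print(f_tight, interpret(f_tight))
print(f_loose, interpret(f_loose))
```

The tolerance only guards against floating-point noise around the boundary value of 1 and is not part of the index definition.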
(2) n-1 operating condition 
If the k-th line in the transmission channel fails and is
withdrawn from operation, only n-1 transmission lines
operate between the sending end and the receiving
end, as shown in equation (32):
$$\begin{aligned} P_{\mathrm{trans}}(t) &= P_1(t) + P_2(t) + \cdots + P_n(t) - P_k(t) \\ &= r_1 P_{\mathrm{trans}}(t) + r_2 P_{\mathrm{trans}}(t) + \cdots + r_n P_{\mathrm{trans}}(t) - r_k P_{\mathrm{trans}}(t) = \sum_{i=1,\, i \ne k}^{n} r_i P_{\mathrm{trans}}(t) \end{aligned} \tag{32}$$
If one transmission line i among the n-1 transmission
lines in operation reaches its upper power limit, the total
transmission power between the sending and receiving
ends is considered to have reached the maximum value.
The transmission capacity of the transmission channel is
calculated for the case in which each remaining line in
turn reaches its upper power limit, and the minimum value
is taken as the upper limit of the transmission capacity of
the transmission channel [28], as shown in equation (33):
$$P_{k\_\mathrm{fault\_max}} = \min_{1 \le i \le n,\, i \ne k} \frac{P_i^{\max}}{r_i} \tag{33}$$
In the above equation, $P_{k\_\mathrm{fault\_max}}$ stands for the
maximum value of the power transmitted by the
transmission channel when the k-th line fails.
In each operation scenario, the power that can be
transmitted by the remaining transmission lines is
calculated for the failure of each transmission line in
the transmission channel in turn, and the minimum of these
values is taken as the upper limit of the power transmitted
by the transmission channel under n-1 operating
conditions between these two partitions, as shown in
equation (34):
$$P_{\mathrm{fault\_max}} = \min_{1 \le k \le n} P_{k\_\mathrm{fault\_max}} \tag{34}$$
Under n-1 operating conditions, the poor flexibility
index of the transmission channel is calculated as shown
in equation (35):
$$F_{\mathrm{channel\_fault}} = \frac{P_{\mathrm{source\_max}}}{P_{\mathrm{fault\_max}}} \tag{35}$$
The meaning of $F_{\mathrm{channel\_fault}}$ is similar to that of
$F_{\mathrm{channel}}$, except that $F_{\mathrm{channel\_fault}}$
applies to n-1 operating conditions, whereas
$F_{\mathrm{channel}}$ applies to normal operating
conditions.
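The n-1 calculation in equations (33) to (35) can be sketched as below. All names and numbers are hypothetical. By default the coefficients $r_i$ are used exactly as they appear in equation (33). The optional `rescale` flag is an assumption that goes beyond the text: it rescales the remaining coefficients so that they sum to 1 after line k is removed, which is one way to read equation (32).

```python
from typing import Sequence


def capacity_with_outage(p_max: Sequence[float], r: Sequence[float], k: int,
                         rescale: bool = False) -> float:
    """Equation (33): channel limit when line k (0-based index) is out of service."""
    if len(p_max) != len(r) or len(r) < 2:
        raise ValueError("need matching inputs for at least two lines")
    keep = [i for i in range(len(r)) if i != k and r[i] > 0]
    if not keep:
        raise ValueError("no remaining line carries power")
    # Assumption (only when rescale=True): remaining shares are renormalized to sum to 1.
    share = sum(r[i] for i in keep) if rescale else 1.0
    return min(p_max[i] * share / r[i] for i in keep)


def capacity_n_minus_1(p_max: Sequence[float], r: Sequence[float],
                       rescale: bool = False) -> float:
    """Equation (34): worst case over every single-line outage."""
    return min(capacity_with_outage(p_max, r, k, rescale) for k in range(len(r)))


def fault_inflexibility(p_source_max: float, p_max: Sequence[float],
                        r: Sequence[float], rescale: bool = False) -> float:
    """Equation (35): maximum sending-end output divided by the n-1 capacity."""
    return p_source_max / capacity_n_minus_1(p_max, r, rescale)


# Hypothetical line ratings (MW), coefficients, and maximum sending-end output.
p_max = [300.0, 500.0, 200.0]
r = [0.25, 0.5, 0.25]
as_written = capacity_n_minus_1(p_max, r)
rescaled = capacity_n_minus_1(p_max, r, rescale=True)
print(as_written, rescaled, fault_inflexibility(600.0, p_max, r, rescale=True))
```

With the coefficients used as written, the n-1 limit coincides with the normal-condition limit whenever the bottleneck line survives at least one outage, so the rescaled variant is shown alongside it for comparison.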
3.6 Flexibility Assessment Process 
Several types of typical scenarios are generated with the
improved K-means algorithm, the flexibility
evaluation results of each type of scenario are analyzed,
and the comprehensive flexibility evaluation index is
obtained by weighting these results by the probability of
occurrence of each type of scenario [29-30]. The specific
process is shown in Figure 7.
 
Figure 7: Flow chart of flexibility assessment. 
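The weighting step of the assessment process can be sketched as follows. This is an illustration under stated assumptions: scenario probabilities are estimated here as the share of samples assigned to each cluster, and the function names, labels, and index values are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, Mapping, Sequence


def scenario_probabilities(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Estimate each scenario's probability as its share of the clustered samples."""
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    return {scenario: count / len(labels) for scenario, count in counts.items()}


def comprehensive_index(index_by_scenario: Mapping[Hashable, float],
                        prob_by_scenario: Mapping[Hashable, float]) -> float:
    """Probability-weighted sum of the per-scenario evaluation results."""
    if set(index_by_scenario) != set(prob_by_scenario):
        raise ValueError("both mappings must cover the same scenarios")
    return sum(index_by_scenario[s] * prob_by_scenario[s] for s in prob_by_scenario)


# Hypothetical cluster labels for ten samples and per-scenario index values.
labels = [0, 0, 0, 0, 0, 1, 1, 2, 2, 2]
probs = scenario_probabilities(labels)
indices = {0: 0.8, 1: 1.1, 2: 0.95}
overall = comprehensive_index(indices, probs)
print(probs, round(overall, 3))
```

A scenario that occurs rarely but has a poor index therefore moves the comprehensive value less than a frequent scenario with the same index.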
4. Example analysis of the model 
4.1 Overview of the Events 
Based on the demand of the model for the relevant data, 
the average number of media, the average influence, 
and the average amount of original content in each blog 
post within the event are calculated, and descriptive 
statistics are carried out. The results are shown in
Figure 8 and Figure 9.
 
 
 
 
 
Figure 8: Overview of the basic attributes of events 
(partial). 
 
Figure 9: Descriptive statistics of the basic event 
attributes. 
From the results, it can be observed that the amount of
media content is stable at around one item per tweet
(microblogging message), although it varies considerably
depending on the event or topic.
4.2 Similarity Measure Results Based on 
Entropy 
The network structure entropy and event content
entropy (that is, the NND measurement index) of each
event in the dataset are shown in Figure 10.
 
Figure 10: Entropy of the event network structure 
versus the entropy of the content distribution. 
It can be observed from the results in Figure 10
that the structural EMD and the text distribution
(content distribution) EMD are generally consistent.
However, there are relatively significant differences in
individual events.
5. Results of event clustering  
5.1 Clustering results in the experimental 
group 
The SSE (sum of squared errors) of the clustering results
is plotted against the number of clusters to obtain the
optimal number of clustering categories. The details are
shown in Figure 11.
 
Figure 11: Number of clustering categories of the SSE 
index in the experimental group. 
At k=4, the SSE index has decreased rapidly, so k=4 is
taken as the more realistic number of clustering
categories.
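The SSE-based choice of k can be sketched with a plain K-means run for several values of k. This is an illustration only: it uses standard K-means with a deterministic farthest-point initialization rather than the paper's improved K-means, the threshold rule in `pick_elbow` is an assumed heuristic, and the two-dimensional points are synthetic.

```python
from typing import Dict, List, Sequence

Point = Sequence[float]


def sq_dist(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_sse(points: List[Point], k: int, iters: int = 100) -> float:
    """Run K-means and return the final sum of squared errors."""
    centers = [list(points[0])]
    while len(centers) < k:  # farthest-point initialization (assumption)
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        clusters: List[List[Point]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
            clusters[nearest].append(p)
        updated = [[sum(col) / len(c) for col in zip(*c)] if c else centers[j]
                   for j, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def pick_elbow(curve: Dict[int, float], min_gain: float = 0.1) -> int:
    """Smallest k whose next step reduces SSE by less than min_gain of the first SSE."""
    ks = sorted(curve)
    base = curve[ks[0]]
    for k, k_next in zip(ks, ks[1:]):
        if curve[k] - curve[k_next] < min_gain * base:
            return k
    return ks[-1]


# Synthetic data: four compact groups of four points each.
offsets = [(-0.5, 0.0), (0.5, 0.0), (0.0, -0.5), (0.0, 0.5)]
groups = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
points = [(gx + dx, gy + dy) for gx, gy in groups for dx, dy in offsets]
curve = {k: kmeans_sse(points, k) for k in range(1, 7)}
best_k = pick_elbow(curve)
print(curve, best_k)
```

On this synthetic data the SSE falls steeply up to k=4 and barely changes afterwards, which is the pattern that the elbow reading of an SSE curve looks for.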
By observing the raw data, the typical
events and characteristics of each category in the final
clustering results can be obtained, as shown in Figure
12.
 
Figure 12: Description of the clustering categories in 
the experimental group. 
The NND values in the table are the normalized values
of the mean NND values in each class. It can be
observed from the results that the propagation structure
of category 1 is relatively homogeneous, and the local text
cannot characterize the whole well. Category 2 includes a
large number of events with relatively complex
dissemination. In category 3, local information can
characterize the whole relatively well. In category 4, local
information cannot characterize the whole well;
these events have triggered many controversies and
discussions, and the structure of the dissemination
network is irregular.
5.2 Clustering results in the control group 
In the control group, the same SSE index is used to 
identify the optimal number of clustering categories, 
and the results are shown in Figure 13 as follows. 
 
Figure 13: Number of clustering categories of the SSE 
index in the control group. 
Logistic regression makes it possible to forecast
qualitative outcomes and to analyze the relationships
among parameters. The purpose of this analysis is
to evaluate how well the scenario clustering algorithm
improves the media content recommendation
system. Logistic regression provides a quantitative
framework for assessing the algorithm's influence on
recommending pertinent information, which advances
the understanding of the algorithm's effectiveness in
the new media environment. The results of the logistic
regression analysis of media content recommendation
are displayed in Figure 14.
 
Figure 14: Logistic regression analysis results 
 
6   Discussion 
Distinct characteristics can be identified in the 
clustering findings of the control group, which vary 
between various event classifications. Category-1 
events demonstrate the utmost level of participation, 
with a substantial number of people actively 
participating in discussions and a corresponding wealth 
of media, such as images and videos. Category-2 events
are characterized by a significant level of engagement
in conversations, although they have fewer supporting
media elements than Category-1 events.
Category-3 events are characterized by a significant 
number of individuals actively involved in 
conversations and a significant diversity of material, 
particularly images and videos. Finally, Category-4 
events feature fewer individuals but make up for it with 
a somewhat larger use of visual media. The results 
indicate that there are different levels of involvement 
and use of media in different event classifications. This 
suggests that there may be subtle differences in the 
environment and patterns of discussions and media 
consumption within every classification [31]. These 
findings can provide insights for developing strategies 
to engage participants and create content for future 
events. 
The entropy clustering technique proposed in this
study is a new strategy that supplements
conventional category classification techniques. The
entropy clustering model comprehensively considers
several features, such as event generation, network
structure, and text distribution. This is in contrast to
traditional techniques, which mainly focus on shallow data
features and have difficulty accurately distinguishing
among events because of significant information
variations between categories. This approach can
capture the subtle attributes of events,
enabling a more accurate similarity measure and
cluster analysis across different fields without
compromising the relevance of the information. It
facilitates efficient content recommendation by
taking into account a wider range of characteristics and
connections within the data. Overall, entropy
clustering enhances the ability
to identify commonalities between events, resulting in
more precise categorization and improved
recommendation systems that can accommodate a wide
range of user preferences and requirements.
The results of the simulation experiment indicate that 
the scenario clustering algorithm proposed in this paper 
is effective and can support the analysis of media 
content recommendation in the new media era. 
 
7   Conclusions 
Traditional online event similarity calculation models
and clustering models are constrained by the
surface features of events. Hence, it is challenging to
build a unified similarity measurement index across
events. The continuous progress and optimization of
new media technologies have made the ways and
means of accessing media content more abundant and
diversified. In this paper, the scenario clustering
algorithm is used to sort out the business logic of the
network structure and media content distribution
according to the demands of media content
recommendation, and a specific clustering model is
established to represent events effectively.
Through the comparison of cross-event content, the
media content is effectively recommended and analyzed.
The results of the simulation experiment indicate that
the scenario clustering algorithm proposed in this paper
is effective and can support the recommendation
analysis of media content in multiple dimensions, to
provide high-quality media services to users. Moreover,
the ever-evolving structure of new media networks and
fast-changing content environments pose continuing
difficulties for developing efficient and user-friendly
recommendation algorithms. Future work on
media content recommendation in the new media
era involves the advancement of algorithms,
the incorporation of user feedback techniques, the
enhancement of privacy protections, and the utilization
of emerging methods such as AI and machine learning
to personalize recommendations and enhance user 
experiences. 
 
Data Availability  
The data used to support the findings of this study are 
available from the corresponding author upon request. 
 
Conflicts of Interest 
The author declares no conflicts of interest.
 
Funding Statement 
This study did not receive any funding in any form. 
References 
[1] Jiang L, Yang C C. User recommendation in healthcare social media by assessing user similarity in heterogeneous network. Artificial Intelligence in Medicine, 2017, 81(9):63-77.
[2] Yu Z, Wang C, Bu J, et al. Friend recommendation with content spread enhancement in social networks. Information Sciences, 2015, 309(3):102-118.
[3] Liu C L, Chen Y C. Background music recommendation based on latent factors and moods. Knowledge-Based Systems, 2018, 159(1):158-170.
[4] Daphne, Reinau, Christoph, et al. Skin Cancer Prevention, Tanning, and Vitamin D: A Content Analysis of Print Media in Germany and Switzerland. Dermatology, 2016, 4(2):1-8.
[5] Rehman F, Khalid O, Madani S A. A comparative study of location-based recommendation systems. Knowledge Engineering Review, 2017, 32(3):1-9.
[6] Song H, Moon N. Eye-tracking and social behavior preference-based recommendation system. Journal of Supercomputing, 2019, 75(4):1990-2006.
[7] Sermpezis P, Spyropoulos T, Vigneri L, et al. Femto-Caching with Soft Cache Hits: Improving Performance through Recommendation and Delivery of Related Content. IEEE Journal on Selected Areas in Communications, 2018, 4(99):1-8.
[8] Middleton S E, Krivcovs V. Geoparsing and Geosemantics for social media: Spatiotemporal Grounding of Content Propagating Rumors to Support Trust and Veracity Analysis during Breaking News. ACM Transactions on Information Systems, 2016, 34(3):1-26.
[9] Ma J, Li G, Zhong M, Zhao X, Zhu L, Li X. LGA: latent genre aware micro-video recommendation on social media. Multimedia Tools and Applications, 2018, 77:2991-3008.
[10] García-Perdomo V, Salaverría R, Brown DK, Harlow S. To share or not to share: The influence of news values and topics on popular social media content in the United States, Brazil, and Argentina. Journalism Studies, 2018, 19(8):1180-201.
[11] Narangajavana Kaosiri Y, Callarisa Fiol LJ, Moliner Tena MÁ, Rodríguez Artola RM, Sánchez García J. User-generated content sources in social media: A new approach to explore tourist satisfaction. Journal of Travel Research, 2019, 58(2):253-65.
[12] Ashraf M, Tahir GA, Abrar S, Abdulaali M, Mushtaq S, Mukthar H. Personalized news recommendation based on multi-agent framework using social media preferences. In: 2018 International Conference on Smart Computing and Electronic Enterprise (ICSCEE), 2018, 1-7. IEEE.
[13] Stubb C, Colliander J. “This is not sponsored content” – The effects of impartiality disclosure and e-commerce landing pages on consumer responses to social media influencer posts. Computers in Human Behavior, 2019, 98:210-22.
[14] A. C, Faleye, A. A, et al. Media Portrayal of Teen Suicide: A Narrative Content Analysis of Netflix Series 13 Reasons Why. Science of the Total Environment, 2019, 3(5):19-27.
[15] Minson S, Mukerji M, Rankine J. G26 Social media support for parents and young people with food allergy – an analysis of facebook content. Archives of Disease in Childhood, 2016, 101(1):1-19.
[16] Bach N X, Hai N D, Phuong T M. Personalized recommendation of stories for commenting in forum-based social media. Information Sciences, 2016, 352(3):48-60.
[17] Ohsawa T. Symmetry and Conservation Laws in Semiclassical Wave Packet Dynamics. Journal of Mathematical Physics, 2015, 56(3):103-110.
[18] Chakrapani S K, Barnard D J, Dayal V. Influence of fiber orientation on the inherent acoustic nonlinearity in carbon fiber reinforced composites. Journal of the Acoustical Society of America, 2015, 137(2):617-627.
[19] Dong E K, Su C P, Dong I Y, et al. Enhanced critical heat flux by capillary driven liquid flow on the well-designed surface. Applied Physics Letters, 2015, 107(2):1004-1010.
[20] Mayara, V, Damasceno, et al. Effects of resistance training on neuromuscular characteristics and pacing during 10-km running time trial. European Journal of Applied Physiology, 2015, 3(2):1-9.
[21] Rawshan F, Park Y. Fault-tolerable and SLA-supportive architecture for TWDM-PON systems. Photonic Network Communications, 2015, 30(2):143-149.
[22] Reeves S L, Fullerton H J, Dombkowski K J, et al. Physician attitude, awareness, and knowledge regarding guidelines for transcranial Doppler screening in sickle cell disease. Clinical Pediatrics, 2015, 54(4):336-345.
[23] Jiao Y, Liu Z, Victora R H. Renormalized anisotropic exchange for representing heat assisted magnetic recording media. Journal of Applied Physics, 2015, 117(17):2417-2432.
[24] Toyoura K, Ohta M, Nakamura A, et al. First-principles study on phase transition and ferroelectricity in lithium niobate and tantalate. Journal of Applied Physics, 2015, 118(6):103-110.
[25] Huang T J. Acoustic tweezers: Manipulating particles, cells, and fluids using sound waves. Journal of the Acoustical Society of America, 2015, 137(4):2222-2229.
[26] Tinakiche N, Annou R. Oscillating two-stream instability in a magnetized electron-positron-ion plasma. Physics of Plasmas, 2015, 22(4):101-110.
[27] Song H, Moon N. Eye-tracking and social behavior preference-based recommendation system. The Journal of Supercomputing, 2019, 75(4):1990-2006.
[28] Zhang Y, Zhang L, Gai S, et al. Cloning and expression analysis of the R2R3-PsMYB1 gene associated with bud dormancy during chilling treatment in the tree peony (Paeonia suffruticosa). Plant Growth Regulation, 2015, 75(3):667-676.
[29] Mcclure J, Morton C, Yarusevych S. Flow development and structural loading on dual step cylinders in laminar shedding regime. Physics of Fluids, 2015, 27(6):477-539.
[30] Kieselmann J, Rosselet A, Scheib S, et al. SU-E-J-118: A Systematic Analysis of Rigid Image Registration Using Patient CTs and Simulated Setup Images with a Unique Gold Standard Registration. Medical Physics, 2015, 42(6):3291-3292.
[31] Xu Y, Zhang H, Gao H, et al. Preference discovery from wireless social media data in APIs recommendation. Wireless Networks, 2021, 5(4):1-8.