https://doi.org/10.31449/inf.v48i13.6171 Informatica 48 (2024) 111–126 111

Financial Investment Optimization by Integrating Multifactors and GA Improved UCB Algorithm

Zhe Guo*, Guang Kang
Economic and Trade Department, Shijiazhuang College of Applied Technology, Shijiazhuang 050081, China
E-mail: guozhe24@163.com

Keywords: multifactor model, genetic algorithm, investment portfolio, UCB algorithm, multi-armed bandit

Received: May 9, 2024

In complex financial markets, controlling risks while achieving high returns is a challenge for investors. Faced with market uncertainty and complexity, traditional investment strategies often struggle to meet the needs of modern investors. To address this issue, a new investment portfolio strategy was proposed by integrating the multifactor model with the upper confidence bound (UCB) algorithm. A genetic algorithm (GA) was then used to optimize the UCB-based weight allocation of the investment portfolio. The results showed that the cumulative return of GA-UCB was 187.4%, which was 68.3 percentage points higher than the cumulative return of 119.1% on the Shanghai and Shenzhen 300 Index. The maximum drawdown rate of the GA-UCB model was the lowest at 13.5%, which was 5.0, 0.7, and 4.8 percentage points lower than that of UCB, the Equal weight combination, and the Shanghai and Shenzhen 300 Index, respectively. On the Shanghai Stock Exchange 50 stock pool, the backtested cumulative return of GA-UCB was 163.5%, which was 57.9 percentage points higher than the cumulative return of 105.6% on the Shanghai Stock Exchange 50 Index. As the penalty coefficient decreased from 0.3 to 0.1, cumulative returns and annualized returns increased by 13.9% and 21.7%, respectively. In summary, financial investment optimization that integrates multifactors with the GA improved UCB effectively improves returns while controlling risks, providing a new perspective and tool for financial market investors.
Povzetek: A model for optimizing sustainable building performance is proposed, based on sensitive multi-objective decision-making. The use of intelligent algorithms improves the accuracy and efficiency of the design and reduces the number of required changes.

1 Introduction

With the deepening development of economic globalization, the volatility and uncertainty of financial markets are intensifying. This makes diversified investment the preferred strategy for the vast majority of investors [1]. The selection of investment portfolios is a fundamental research question in the field of financial investment. It mainly concerns how to scientifically allocate funds and investment ratios between different assets [2]. With the continuous development of network technology, applying online learning algorithms to financial portfolio selection has become an important research method. The Multi-armed Bandit (MAB) was proposed and applied by international scholars to the selection of financial portfolios in the early 1990s, and it has since become a widely used online learning algorithm [3]. Although research in China started relatively late, domestic scholars have also made significant achievements in this field. For example, a method that optimizes weight allocation in investment portfolios using passive-aggressive algorithms and predicted stock returns achieves relatively high cumulative returns [4]. He et al. designed an online investment portfolio strategy by integrating the views of active experts using weak ensemble algorithms, which had significant competitive performance [5]. With the advancement of market integration and financial globalization, processing and analyzing massive amounts of financial data has become a challenge in formulating effective financial investment portfolio strategies.
As a key online learning algorithm, MAB can learn from the feedback of each decision and apply this information to subsequent decisions, thereby finding the optimal strategy through continuous trial and error [6]. The multifactor model can predict the future returns and risks of a company's stock by analyzing its various characteristics, providing a reliable method for estimating the reward function for MAB [7]. Therefore, this study innovatively combines multifactor models with MAB to construct a financial investment portfolio model. On the basis of MAB, GA is used to improve the Upper Confidence Bound (UCB) to achieve better investment decision performance.

As an important branch of online learning algorithms, MAB is widely applied in online recommendation systems. Numerous experts and scholars have conducted research on MAB and achieved significant results. Morijiri et al. used numerical methods to analyze bandit decision-making in order to solve large-scale MAB problems. The bandit decision was adjusted by controlling the chaotic time waveform generated by a semiconductor laser to generate bias control. This method achieved a scaling index of 97%, demonstrating its feasibility [8]. Takeuchi et al. used MAB combined with the chaotic oscillation time series of semiconductor lasers to improve communication quality and decision-making efficiency. The channel selection effect was verified through adaptive dynamic channels. This method significantly improved the quality and efficiency of channel selection [9]. Hasegawa et al. proposed an efficient channel allocation method combining MAB with large-scale heterogeneous Internet of Things networks to relieve network congestion. This method had a higher frame success rate than conventional methods [10]. Li et al. proposed a wireless network-based spectrum scheduling algorithm based on MAB to allocate resources reasonably in frequency bands.
This algorithm fully utilized uncertain resources for wireless spectrum scheduling modeling, and its effectiveness was confirmed [11]. Yang et al. proposed a maritime network architecture that used MAB to select edge services in order to reduce energy consumption and latency in the ship's Internet of Things. Reward and cost constraint networks were introduced into the architecture. This method significantly reduced energy consumption costs and latency [12].

The multifactor model, as a quantitative investment model for predicting stock returns and risks, plays an important role in financial investment optimization research. De Nard et al. proposed adding a factor model to the covariance matrix to address the difficulty of modeling under time-varying conditions. A new covariance estimator was combined with the time-varying conditional structure of high-dimensional residuals. This method was tested on the detection of cross-sectional anomalies in stock returns, confirming its effectiveness [13]. Ta et al. proposed using a recurrent neural network for sequence prediction to make accurate stock predictions. The investment portfolio was constructed by combining factor models and a long short-term memory network. Multiple portfolio optimization techniques, such as weight modeling and mean-variance optimization, were used to improve portfolio performance. The return on investment of this method was significantly improved [14]. Hao et al. proposed a model for predicting stock price index trends by combining factor models and multi-time scale feature learning on the basis of end-to-end hybrid neural networks, so that trend prediction for stock price series could be improved. This model used a convolutional neural network to extract features at different time scales and was validated on a dataset, confirming its effectiveness [15]. Chung et al.
proposed using factor models combined with a multi-channel convolutional neural network to improve the accuracy of stock index volatility forecasting. On the basis of the multi-channel convolutional neural network, network topology optimization was carried out to improve the model's performance. This method effectively improved the prediction accuracy of stock market indices [16]. Chen et al. proposed using factor models for portfolio optimization to allocate funds reasonably and achieve excess returns while controlling risks. This model used the mean to improve the maximum drawdown rate. A multi-stage constrained multi-objective evolutionary algorithm using orthogonal learning was used to handle the complex constraints in the prediction model. This method has a competitive advantage [17]. The summary of related works is shown in Table 1.

Table 1: Summary of related works

| Author | Method | Result |
|---|---|---|
| Morijiri et al. | Numerical study using semiconductor lasers | A 97% scaling index is achieved. |
| Takeuchi et al. | Multi-armed bandit combined with semiconductor laser | The quality and efficiency of channel selection are significantly improved. |
| Hasegawa et al. | An efficient channel allocation method combining multi-armed bandit with large-scale heterogeneous Internet of Things networks | Compared to conventional methods, the method has a higher frame success rate. |
| Li et al. | Multi-armed bandit spectrum scheduling algorithm based on wireless network | Uncertain resources can be fully utilized for wireless spectrum scheduling modeling. |
| Yang et al. | A maritime network architecture method combining multi-armed bandit | Energy costs and latency are significantly reduced. |
| De Nard et al. | A method of combining factor models with a new covariance matrix estimator | Cross-sectional anomalies in stock returns are effectively detected. |
| Ta et al. | A method for constructing investment portfolios using recurrent neural networks combined with factor models and long short-term memory networks | Portfolio performance and return on investment are improved. |
| Hao et al. | A predictive model combining factor models and multi-time scale feature learning | The method is effective in predicting stock price index trends. |
| Chung et al. | A stock index volatility prediction method using factor models combined with multi-channel convolutional neural networks | The prediction accuracy of stock market indices is effectively improved. |
| Chen et al. | A multi-objective evolutionary algorithm combining factor models with multi-stage constraints | Competitive advantage in achieving excess returns and maximum drawdown rates. |

In summary, both MAB and multifactor models have strong application potential and research value in their respective fields, providing new ideas and methods for financial investment. Therefore, this study innovatively combines the two, integrating multifactors with a GA improved UCB for financial investment optimization, to achieve better investment decision-making performance.

2 A Financial investment optimization model integrating multifactors and improved UCB

This study combines multifactor models with UCB in MAB to construct a financial investment portfolio model that integrates multifactors with UCB. On this basis, a Genetic Algorithm (GA) is used to optimize the parameter configuration.

2.1 Construction of a financial investment portfolio model that integrates multifactors and UCB

The core issue of financial investment lies in how to optimize the allocation of financial assets to achieve optimal returns. As a key method of online learning, MAB provides a unique solution. The inspiration for MAB comes from a slot machine with multiple arms.
Each arm has a certain chance of generating returns when pulled [18]. UCB is an effective decision-making method in MAB. When facing multiple choices, the probability and return of each choice are unknown. UCB aims to maximize the overall return while minimizing regret [19]. As a quantitative investment model, the multifactor model can predict the returns and risks of stocks, providing a scientific basis for investment decisions [20]. Therefore, this study combines multifactor models and MAB to optimize financial investment portfolios and achieve better investment decision-making performance in the financial market. Figure 1 shows the financial investment portfolio process based on the fusion of multifactors and UCB.

Figure 1: Financial investment portfolio process based on fusion of multifactors and UCB algorithm (flowchart boxes: determine candidate factors; factor IC analysis; factor return analysis; multifactorial model; predicting portfolio returns; return covariance matrix; investment portfolio to be selected; calculate Sharpe ratio; UCB algorithm; portfolio weight vector)

In Figure 1, the financial investment portfolio process based on the fusion of multifactors and UCB starts with identifying candidate factors. Potential investment portfolios and their corresponding Sharpe ratios can be identified by selecting factor models. This process transforms the portfolio selection problem into a MAB problem. The Sharpe ratio is used as the expected reward function to guide selection. On this basis, UCB is used to allocate weights to different stock portfolios, and UCB is further optimized through GA to refine the weight allocation of investment portfolios. The process finally outputs the weight vector of the financial investment portfolio that achieves the maximum expected return.

In a multifactor model, factors refer to key variables that can predict changes in stock returns.
A comprehensive factor pool is constructed to capture market dynamics and potential investment opportunities. Figure 2 shows the overall factor pool.

Figure 2: Overall factor pool (labels: fundamental factor; quantity and price factors; technical factors; momentum reversal factor; statistical indicator factors; disk type factor; liquidity factor; volatile factors; valuation factors; debt capacity factor; financial risk factors; operational efficiency factor; liquidity factor; profitability factor)

In Figure 2, the factors that affect the performance of financial investment portfolios are mainly divided into two categories: quantity and price factors, and fundamental factors. Quantity and price factors are mainly based on historical prices, trading volume, and other available market data. Their core advantage is that they reflect market changes in real time [21]. Fundamental factors focus on the financial health of a company. Typical fundamental factors include various financial ratios extracted from the company's income statement, balance sheet, and cash flow statement [22]. To select effective candidate factors and determine the research object, it is necessary to analyze the effectiveness of each factor through the Information Coefficient (IC). IC was initially defined as the correlation coefficient between the predicted yield and the actual yield [23]. In current practice, however, IC analysis relies more on the correlation between factor exposure and actual returns. The IC value of a factor is represented by equation (1).

$$ IC_i = Corr\left(x_{i,t},\ ret_{t+1}\right) \tag{1} $$

In equation (1), $IC_i$ represents the IC value of the $i$-th factor. $x_{i,t}$ represents the factor value of the $i$-th factor during period $t$. $Corr$ represents a correlation coefficient. $ret_{t+1}$ represents the yield vector of the stocks in the next period. There are two calculation methods for the correlation coefficient in equation (1).
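Of the ways to compute the correlation in equation (1), a rank-based (Spearman) coefficient is a common choice for IC analysis. The following is a minimal sketch, assuming no tied values so that the closed form based on squared rank differences applies. The function names `ranks`, `spearman_ic`, and `ic_mean_std`, as well as the sample numbers, are hypothetical and chosen for illustration only.

```python
import statistics

def ranks(values):
    """Return 1-based ranks of the values (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_ic(factor_values, next_returns):
    """Rank correlation between factor exposures at period t and
    stock returns at period t+1: 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(factor_values)
    rx, ry = ranks(factor_values), ranks(next_returns)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))

def ic_mean_std(ic_series):
    """Summarize a series of single-period IC values."""
    return statistics.mean(ic_series), statistics.stdev(ic_series)

# Hypothetical exposures of four stocks and their next-period returns.
exposures = [0.1, 0.4, 0.2, 0.3]
returns_next = [0.01, 0.03, 0.04, 0.02]
ic = spearman_ic(exposures, returns_next)
```

Because a single-period IC is noisy, a helper such as `ic_mean_std` would typically be applied to the IC values collected over many periods.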
The Spearman correlation coefficient is used for calculation and analysis, represented by equation (2).

$$ Corr_S(x, y) = 1 - \frac{6\sum_{j=1}^{n} d_j^2}{n\left(n^2 - 1\right)} \tag{2} $$

In equation (2), $Corr_S(x, y)$ represents the Spearman correlation coefficient between the sequences $x$ and $y$. $n$ represents the length of the sequence. $d_j$ represents the difference between the ranks of the $j$-th pair of values. The IC value of a factor is a single-period indicator and cannot be used directly to measure the factor's ability to predict returns [24]. Therefore, when conducting factor validity analysis, it is also necessary to refer to indicators such as the average and standard deviation of IC to comprehensively evaluate the stability and predictive value of factors. Figure 3 shows the average IC of the testing factors for the Shanghai and Shenzhen 300 (HS300) industries.

Figure 3: Average IC values of the HS300 industry testing factors (vertical axis: IC value, from -0.100 to 0.125; industries: traditional Chinese medicine, industrial metals, real estate, coal mining, power, bank, security, IT services, software development, chemical pharmaceuticals)

The core objective of a multifactor model is to identify technical and fundamental factors closely related to stock returns and apply these factors to predict stock returns. The multifactor model is represented by equation (3).

$$ r_i = \sum_{j=1}^{K} x_{ij} f_j + \varepsilon_i \tag{3} $$

In equation (3), $r_i$ represents the yield of the $i$-th financial asset. $K$ represents the total number of factors. $x_{ij}$ represents the degree of exposure of asset $i$ to factor $j$. $f_j$ represents the return of factor $j$. $\varepsilon_i$ represents the characteristic return rate of the asset. The return of the investment portfolio is represented by equation (4).

$$ R_I = \sum_{i=1}^{N} \omega_i r_i \tag{4} $$

In equation (4), $R_I$ represents the return rate of the investment portfolio.
$N$ represents the quantity of stocks in the investment portfolio. $\omega_i$ represents the weight of stock $i$. Substituting equation (3) into equation (4), the actual return of an investment portfolio can be expressed in terms of the returns of the $K$ factors, as represented by equation (5).

$$ R_I = \sum_{j=1}^{K}\sum_{i=1}^{N} \omega_i x_{ij} f_j + \sum_{i=1}^{N} \omega_i \varepsilon_i \tag{5} $$

The covariance matrix of factor returns is represented by equation (6).

$$ F = \begin{pmatrix} Var(f_1) & Cov(f_1, f_2) & \cdots & Cov(f_1, f_K) \\ Cov(f_2, f_1) & Var(f_2) & \cdots & Cov(f_2, f_K) \\ \vdots & \vdots & \ddots & \vdots \\ Cov(f_K, f_1) & Cov(f_K, f_2) & \cdots & Var(f_K) \end{pmatrix} \tag{6} $$

In equation (6), $F$ represents the covariance matrix of factor returns. $f_1, \ldots, f_K$ represent the yields of the $K$ factors.

UCB is a classic method for solving MAB problems. UCB estimates the potential value of each choice by setting an optimistic upper limit on the expected reward of each option, i.e., by calculating an upper confidence bound on its average reward. UCB is represented by equation (7).

$$ UCB_i(t) = \mu_i + \sqrt{\frac{2\ln m}{m_i}} \tag{7} $$

In equation (7), $UCB_i$ represents the UCB of the $i$-th arm. $m$ represents the number of times the arms have been operated, and $m_i$ the number of times the $i$-th arm has been operated. $\mu_i$ represents the average reward of the $i$-th arm. The principle followed by UCB is to treat the arm with the highest upper confidence bound on the expected reward as the current optimal choice in each round of selection [26]. The optimal arm is represented by equation (8).

$$ i_t = \arg\max_{i}\ UCB_i(t) \tag{8} $$

In the construction of financial investment portfolio strategy models, the average reward of each arm needs to be measured by the normalized Sharpe ratio to balance the returns and risks of the investment portfolio. The normalized Sharpe ratio is represented by equation (9).

[Equation (9): definition of the normalized Sharpe ratio $r_K(t)$ in terms of the Sharpe index $S$; the formula is not legible in the source.]

In equation (9), $r_K(t)$ represents the normalized Sharpe ratio. $S$ represents the Sharpe index.
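The arm-selection rule described above can be sketched as follows. This is a minimal illustration, assuming each arm is a candidate portfolio whose reward is a Sharpe ratio rescaled to [0, 1]. The min-max rescaling in `min_max_normalize` is an assumption made for this example and is not claimed to match the paper's normalization; the function names and the sample Sharpe ratios are hypothetical.

```python
import math

def min_max_normalize(sharpe_ratios):
    """Rescale Sharpe ratios to [0, 1] (assumed normalization)."""
    lo, hi = min(sharpe_ratios), max(sharpe_ratios)
    if hi == lo:
        return [0.5 for _ in sharpe_ratios]
    return [(s - lo) / (hi - lo) for s in sharpe_ratios]

def ucb_index(mean_reward, arm_pulls, total_pulls):
    """Average reward plus the exploration bonus sqrt(2 ln m / m_i)."""
    return mean_reward + math.sqrt(2.0 * math.log(total_pulls) / arm_pulls)

def run_ucb(reward_fn, n_arms, n_rounds):
    """Pull each arm once, then always pull the arm with the largest index."""
    counts = [0] * n_arms
    means = [0.0] * n_arms
    history = []
    for t in range(1, n_rounds + 1):
        if t <= n_arms:
            arm = t - 1  # initial round-robin so every arm has one sample
        else:
            arm = max(range(n_arms),
                      key=lambda i: ucb_index(means[i], counts[i], t - 1))
        reward = reward_fn(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # running average
        history.append(arm)
    return counts, means, history

# Hypothetical Sharpe ratios of three candidate portfolios.
rewards = min_max_normalize([0.2, 0.5, 0.8])
counts, means, history = run_ucb(lambda arm: rewards[arm], n_arms=3, n_rounds=300)
```

With deterministic rewards as in this toy run, the arm with the largest normalized Sharpe ratio receives most of the pulls, while the other arms are still revisited occasionally because their exploration bonus grows with the logarithm of the total number of pulls.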
The Sharpe ratio is a risk-adjusted return indicator designed to consider the potential returns and risks of an investment portfolio simultaneously. The Sharpe ratio can effectively identify the optimal investment portfolio, which assumes the lowest risk under predetermined expected return conditions.

2.2 Design of investment portfolio model based on GA improved UCB

This study combines a multifactor model with UCB to achieve an effective balance between returns and risks in investment portfolios. However, in practical situations, UCB still has certain limitations. For example, UCB may present the same investment strategy when selecting investment portfolio strategies for different investors. This is because UCB does not make use of information about environmental changes or consider factors such as individual investor preferences and purchasing power. However, different investors have varying levels of risk tolerance, investment preferences, and acceptance of the same investment portfolio in different market environments. Therefore, this study uses GA to improve UCB. The pseudocode of GA is shown in Figure 4.

    Procedure GA
    Begin
        t = 0;
        initialize P(t);
        evaluate P(t);
        while not finished do
        begin
            t = t + 1;
            select P(t) from P(t-1);
            reproduce pairs in P(t);
            evaluate P(t);
        end
    End

Figure 4: Pseudocode of GA

GA is a powerful optimization tool well suited to complex combinatorial optimization problems. GA can adjust and optimize the parameters of investment portfolio models by simulating the processes of natural selection and genetics. This ensures that the optimal parameters for different investors are selected based on their investment preferences [27]. Figure 5 is the GA flowchart.

Figure 5: GA flowchart (boxes: start; initialization; calculate fitness value; selection, crossover, variation; generate offspring; terminate? with yes and no branches; recombination; output optimal; end)

In Figure 5, the search space for parameter optimization is first defined in GA. An encoding method is chosen to generate the initial population. Then, the performance of each individual is evaluated by calculating its fitness value. After the fitness assessment is completed, a new generation of chromosomes is generated through genetic operations such as selection, crossover, and mutation [28]. Finally, if the termination condition is met, the algorithm ends and outputs the current optimal chromosome, which is the optimal parameter configuration. If the termination condition is not met, the algorithm returns to calculating the fitness value and proceeds to the next iteration. The selection operation of GA is represented by equation (10).

$$ K = \frac{f(y_i)}{\sum_{j=1}^{n} f(y_j)} \tag{10} $$

In equation (10), $K$ represents the selection operation. $f(y_i)$ represents the fitness value of individual $i$. The crossover operation is represented by equation (11).

[Equation (11): crossover $H$ as a function of the total iterations $R$ and the current iteration $r$; the formula is not legible in the source.]

In equation (11), $H$ represents the crossover. $R$ represents the total iterations. $r$ represents the current iteration. The probability of individual selection is represented by equation (12).

[Equation (12): selection probability $P(y_i)$ as a function of the fitness $f(y_i)$, the selection operation $K$, and a strength control parameter; the formula is not legible in the source.]

In equation (12), $P$ represents the probability of selection. The equation also contains a strength control parameter. The larger the parameter value, the higher the fitness, and the higher the probability of selection. Individual similarity is represented by equation (13).

[Equation (13): individual similarity $A$ as a function of $H(y)$; the formula is not legible in the source.]

In equation (13), $A$ represents individual similarity. The utility function of the investment portfolio model based on GA improved UCB is represented by equation (14).

$$ U_{\theta,t} = U\left(R_{\theta,t},\ \sigma^2_{\theta,t},\ \omega_{\theta,t}\right) \tag{14} $$

In equation (14), $U$ represents the utility function of the investment portfolio model. $\theta$ represents the investment portfolio. $R_{\theta,t}$ and $\sigma^2_{\theta,t}$ represent the returns and variances of the investment portfolio $\theta$ at time $t$, respectively.
$\omega_{\theta,t}$ represents the weight. In the utility evaluation of investment portfolios, the shortest time frame necessary to construct an effective investment portfolio is the Minimum Formation Period (MFP). The time length within this cycle is represented by $n$. The time length in the stock utility $D_{\theta,t-m}$ is represented by $m$ and set as $m \le n$ [29]. The loss function of the investment portfolio model is represented by equation (15).

$$ L_{\theta,t} = \left(D_{\theta,t-m}^{T} D_{\theta,t-m} + \lambda\right)^{-1} D_{\theta,t-m}^{T}\, U_{\theta,t} \tag{15} $$

In equation (15), $L_{\theta,t}$ represents the loss function, $D_{\theta,t-m}$ represents the stock utility, and $\lambda$ represents a penalty coefficient. When GA improves UCB, the confidence upper bound parameters and penalty coefficients in UCB are used as the chromosome encoding for fitness calculation. By comparing the fitness of each individual in the overall environment, the best preservation method is used to select the next generation of individuals to enter the genetic operation stage [30]. Figure 6 shows the operation of the investment portfolio model based on GA improved UCB.

Figure 6: Running process of investment portfolio model based on GA improved UCB algorithm (timeline from 0 to T with the points t-m-n, t-n, t-m, t, and t+m, the MFP and MTP windows, and GA-UCB applied in each window)

In Figure 6, in the investment portfolio model process based on GA improved UCB, each investment time is divided into multiple MFPs. An estimate of the current investment portfolio needs to be made based on historical data during each MFP. When the strategy model runs from time $t-m$ to $t$, the elapsed time $m$ is called the Minimum Testing Period (MTP). During the MTP, the investment portfolio model uses the optimal parameters obtained from GA for simulation back testing [31].
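The parameter search described above can be sketched as a small genetic algorithm over two real-valued genes. This is an illustrative sketch under stated assumptions, not the study's implementation: the chromosome is taken to be a pair (exploration coefficient, penalty coefficient), selection is fitness-proportional, the best individual is copied unchanged into the next generation, and the arithmetic crossover, Gaussian mutation, bounds, and `toy_fitness` function are all hypothetical choices. In practice the fitness would come from back testing the strategy over a formation period.

```python
import random

def roulette_probabilities(fitness):
    """Selection probability of each individual: f(y_i) / sum_j f(y_j)."""
    total = sum(fitness)
    return [f / total for f in fitness]

def clip(value, lo, hi):
    return max(lo, min(hi, value))

def evolve(fitness_fn, bounds, pop_size=20, generations=30, seed=0):
    """Search for the chromosome with the highest (positive) fitness."""
    rng = random.Random(seed)
    pop = [tuple(rng.uniform(lo, hi) for lo, hi in bounds)
           for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        fit = [fitness_fn(ind) for ind in pop]
        best_idx = max(range(pop_size), key=fit.__getitem__)
        best_history.append(fit[best_idx])
        probs = roulette_probabilities(fit)
        children = [pop[best_idx]]  # best preservation (elitism)
        while len(children) < pop_size:
            p1, p2 = rng.choices(pop, weights=probs, k=2)
            a = rng.random()  # arithmetic crossover weight
            child = [a * x + (1.0 - a) * y for x, y in zip(p1, p2)]
            # Gaussian mutation, kept inside the search bounds.
            child = [clip(g + rng.gauss(0.0, 0.05 * (hi - lo)), lo, hi)
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(tuple(child))
        pop = children
    fit = [fitness_fn(ind) for ind in pop]
    best_idx = max(range(pop_size), key=fit.__getitem__)
    best_history.append(fit[best_idx])
    return pop[best_idx], best_history

def toy_fitness(chromosome):
    """Hypothetical stand-in for a back-tested score; always positive."""
    c, lam = chromosome
    return 1.0 / (1.0 + (c - 1.5) ** 2 + (lam - 0.2) ** 2)

bounds = [(0.5, 3.0), (0.0, 0.5)]  # assumed ranges for the two genes
best, history = evolve(toy_fitness, bounds)
```

Because the best individual is always carried over, the best fitness recorded in `history` never decreases from one generation to the next, which makes the search easy to monitor.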
Based on the optimal parameters, the weights for the next period and the optimal investment portfolio for the next MTP back testing can be calculated. By repeating this loop until the procedure terminates, the optimal investment portfolio strategy can be obtained.

3 Validation of a financial investment optimization model integrating multifactors and improved UCB

This study first selected and processed the experimental data. It then analyzed the model results under different risk preferences and the applicability of the investment portfolio model that integrates multifactors with the improved UCB.

3.1 Data Selection and processing

The sample data cover the stock market and financial data of the constituent stocks of the Shanghai Stock Exchange 50 (SZ50) Index and the HS300 Index from December 2017 to December 2021. The stock market data cover daily and minute-frequency opening prices, highest prices, lowest prices, and closing prices. The financial data include the three major financial statements released every quarter. The data processing is divided into three stages: handling missing values, handling outliers, and standardizing data. For missing stock market data, this study chose to exclude the relevant stock data directly. For missing financial data, the previous period's report data were used to fill in the gaps. The data obtained from stock market data suppliers may contain outliers. To identify and handle outliers, the median absolute deviation method and the box plot method were used. To allow comparative analysis of the factors in later steps, the factor data must first be standardized. Considering that factor data have a certain mean and variance on the cross-section, this study used the z-score method to standardize them so that the processed data had a mean of 0 and a standard deviation of 1.
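The outlier handling and standardization steps can be illustrated with a short sketch. It assumes that outliers are clipped to a band of three median absolute deviations around the median; the threshold of 3, the use of clipping rather than removal, and the function names are assumptions made for this example, and the box plot method is not shown.

```python
import statistics

def mad_clip(values, n_mad=3.0):
    """Clip values to median +/- n_mad * MAD (median absolute deviation)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    lo, hi = med - n_mad * mad, med + n_mad * mad
    return [min(max(v, lo), hi) for v in values]

def zscore(values):
    """Standardize a cross-section to mean 0 and standard deviation 1."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / std for v in values]

# Hypothetical cross-section of one factor with a single extreme value.
raw = [1.0, 2.0, 3.0, 4.0, 100.0]
clipped = mad_clip(raw)
standardized = zscore(clipped)
```

Clipping before standardizing keeps a single extreme observation from inflating the standard deviation and compressing all other z-scores toward zero.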
3.2 Analysis of the results of integrating multifactors and improved UCB investment portfolio model

This study conducted strategy back testing on data from the HS300 stock pool to validate the investment portfolio model that integrates multifactors with the improved UCB. The model was compared with the back testing results of the original UCB investment portfolio model, the HS300 Index, and equal weight portfolios. Figure 7 shows the back testing results of the investment portfolio model for the HS300 stock pool. As of December 2021, the cumulative returns of GA-UCB, UCB, and the Equal weight combination in back testing were 187.4%, 153.6%, and 128.9%, respectively. These were 68.3, 34.5, and 9.8 percentage points higher, respectively, than the cumulative return of 119.1% on the HS300 Index. In summary, the investment portfolio model that integrates multifactors with the improved UCB can improve the cumulative return of investment portfolios.

Figure 7: Analysis of back testing results of investment portfolio models for the CSI 300 (HS300) stock pool (horizontal axis: date, 2019-12 to 2021-12; vertical axis: cumulative return, 0.8 to 2.2; series: GA-UCB, UCB, Equal weight combination, HS300)

To observe the investment portfolio model's effectiveness more clearly, the evaluation indicators for the HS300 stock pool are compared in Table 2. The evaluation indicators of GA-UCB were superior to those of the other models. The cumulative return of GA-UCB was 187.4%, which was 33.8, 58.5, and 68.3 percentage points higher than UCB, the Equal weight combination, and HS300, respectively. In terms of annualized returns, GA-UCB was 12.4, 25.7, and 30.5 percentage points higher than UCB, the Equal weight combination, and HS300, respectively.
The Sharpe ratio of GA-UCB was 1.78, which was 0.58, 1.07, and 1.34 higher than the other three methods, respectively. The maximum drawdown rate of GA-UCB was the lowest at 13.5%, which was 5.0, 0.7, and 4.8 percentage points lower than the other three methods, respectively. In summary, GA-UCB performs excellently in terms of returns and risk control.

Table 2: Comparison of evaluation indicators for the Shanghai and Shenzhen 300 stock pool

| Model | Cumulative return (%) | Annualized rate of return (%) | Sharpe ratio | Maximum drawdown rate (%) |
|---|---|---|---|---|
| GA-UCB | 187.4 | 42.2 | 1.78 | 13.5 |
| UCB | 153.6 | 29.8 | 1.20 | 18.5 |
| Equal weight combination | 128.9 | 16.5 | 0.71 | 14.2 |
| HS300 | 119.1 | 11.7 | 0.44 | 18.3 |

To further validate the investment portfolio model that integrates multifactors with the improved UCB, strategy back testing was conducted on the SZ50 stock pool. Figure 8 shows the back testing results of the investment portfolio model for the SZ50 stock pool. As of December 2021, the cumulative returns of GA-UCB, UCB, and the Equal weight combination in back testing were 163.5%, 145.8%, and 115.0%, respectively. These were 57.9, 40.2, and 9.4 percentage points higher, respectively, than the cumulative return of 105.6% on the SZ50 Index. In summary, the GA improved UCB model still has a strong ability to generate returns within the SZ50 stock pool.

Figure 8: Analysis of back testing results of the investment portfolio model for the Shanghai Stock Exchange 50 stock pool (horizontal axis: date, 2019-12 to 2021-12; vertical axis: cumulative return, 0.8 to 1.8; series: GA-UCB, UCB, Equal weight combination, SZ50)

To observe the investment portfolio model more clearly, the evaluation indicators for the SZ50 stock pool are compared in Table 3. The performance indicators of GA-UCB were still optimal in the SZ50 stock pool.
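The four indicators reported in these comparisons can be computed from a series of periodic returns. The sketch below uses conventional textbook definitions, which are an assumption here because the exact formulas used in the back tests are not stated; the 252 trading days per year, the zero default risk-free rate, and the function names are likewise illustrative choices.

```python
import math
import statistics

def cumulative_return(returns):
    """Compound periodic returns into a total return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0

def annualized_return(returns, periods_per_year=252):
    """Geometric annualization of the compounded return."""
    growth = 1.0 + cumulative_return(returns)
    return growth ** (periods_per_year / len(returns)) - 1.0

def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized mean excess return divided by annualized volatility."""
    excess = [r - risk_free / periods_per_year for r in returns]
    return (statistics.mean(excess) / statistics.stdev(excess)
            * math.sqrt(periods_per_year))

def max_drawdown(returns):
    """Largest peak-to-trough decline of the net asset value."""
    nav = peak = 1.0
    worst = 0.0
    for r in returns:
        nav *= 1.0 + r
        peak = max(peak, nav)
        worst = max(worst, 1.0 - nav / peak)
    return worst

# Hypothetical three-period return series.
sample = [0.10, -0.20, 0.05]
total = cumulative_return(sample)
drawdown = max_drawdown(sample)
```

For the sample series the net asset value moves from 1.10 to 0.88 and then to 0.924, so the total return is negative and the maximum drawdown is the 20% fall from the first peak.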
The cumulative return of this model was 163.5%, which was 17.7, 48.5, and 57.9 percentage points higher than UCB, the Equal weight combination, and SZ50, respectively. The annualized yield of GA-UCB was 31.9%, which was 8.4, 22.8, and 26.7 percentage points higher than the other three methods, respectively. The Sharpe ratio of GA-UCB was 1.35, which was 0.31, 1.02, and 1.23 higher, respectively. The maximum drawdown rate of this model was 14.3%, which was 1.1, 1.8, and 9.1 percentage points lower, respectively. In summary, GA-UCB brings more returns while stabilizing risks.

Table 3: Comparison of evaluation indicators for the Shanghai Stock Exchange 50 stock pool

| Model | Cumulative return (%) | Annualized rate of return (%) | Sharpe ratio | Maximum drawdown rate (%) |
|---|---|---|---|---|
| GA-UCB | 163.5 | 31.9 | 1.35 | 14.3 |
| UCB | 145.8 | 23.5 | 1.04 | 15.4 |
| Equal weight combination | 115.0 | 9.1 | 0.33 | 16.1 |
| SZ50 | 105.6 | 5.2 | 0.12 | 23.4 |

3.3 Analysis of model results for different risk preferences

Different investors have their own risk preferences when choosing investment portfolio strategies. This experiment validated the performance of the financial investment optimization model that integrates multifactors with the improved UCB under strong volatility, targeting different risk preferences. This study expanded the sample interval to June 2017 to June 2021. The penalty coefficient values in the model parameters were set to 0.1, 0.2, and 0.3. MFP and MTP were set to 14 and 10, respectively. Therefore, the model could record the investment portfolio for 14 trading days. Figure 9 shows the investment strategy results under different penalty coefficients. As the penalty coefficient decreased, the cumulative returns of the investment strategies gradually increased. The cumulative returns obtained from the investment strategies with different penalty coefficients were all higher than the actual cumulative returns of HS300.
This indicated that the back testing results of the model were consistent with actual investment logic. The smaller the investor's aversion to risk, the greater the risk the investor can bear, and therefore the higher the cumulative return obtained. In summary, the financial investment optimization model that integrates multifactors and the improved UCB can recommend different investment strategies for investors with different risk preferences, and all of these strategies improve returns.

[Figure 9: Comparison chart of investment strategy results under different penalty coefficients. Axes: Date (2017-06 to 2021-06) and Cumulative Return. Series: Penalty coefficient 0.1, Penalty coefficient 0.2, Penalty coefficient 0.3, HS300.]

To observe the evaluation indicators of the investment strategy results under different penalty coefficients more clearly, the strategy model for each penalty coefficient was run five times and the mean values were recorded. Table 4 shows the resulting evaluation indicators. As the penalty coefficient decreased from 0.3 to 0.1, the cumulative return and annualized return increased by 13.9% and 21.7%, respectively. Although the differences between penalty coefficients were modest, the cumulative return of every back-tested strategy was at least 57.2 percentage points higher than the actual cumulative return of HS300. For the maximum drawdown rate, the strategies with penalty coefficients of 0.3, 0.2, and 0.1 were 4.2, 1.6, and 8.0 percentage points lower than HS300, respectively.
In summary, in financial markets with strong volatility, this model can provide targeted investment strategy portfolios for investors with different risk preferences, steadily improving returns while controlling risks.

Table 4: Comparison of evaluation indicators for investment strategy results under different penalty coefficients

| Investment strategy | Cumulative return (%) | Annualized rate of return (%) | Sharpe ratio | Maximum drawdown rate (%) |
|---|---|---|---|---|
| Penalty coefficient 0.1 | 126.4 | 35.9 | 0.76 | 24.6 |
| Penalty coefficient 0.2 | 118.6 | 28.6 | 0.66 | 31.0 |
| Penalty coefficient 0.3 | 112.5 | 27.8 | 0.64 | 28.4 |
| HS300 | 55.3 | 14.2 | 0.37 | 32.6 |

To further validate the investment portfolio model, the portfolio selected by the model with a penalty coefficient of 0.1, which yielded the highest return, was compared with the actual optimal investment portfolio. Figure 10 is a histogram of the actual optimal investment portfolio and the model-selected investment portfolio. More than half of the portfolios chosen by the model had the same direction of return as the actual optimal investment portfolio. After June 2019, the gap between the return of the selected portfolio and that of the actual optimal investment portfolio gradually narrowed. This indicated that, after a period of learning, the model became better at selecting strong investment portfolios.
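The effect of the penalty coefficient discussed in this subsection can be illustrated with a small sketch. The scoring rule below (mean reward plus a UCB1-style exploration bonus, minus the penalty coefficient times the reward volatility) is an assumption made for illustration and is not taken from the model definition. The function names `penalized_ucb_scores` and `select_arm`, the `exploration` parameter, and the sample rewards are likewise hypothetical.

```python
import math
from statistics import mean, pstdev


def penalized_ucb_scores(reward_history, penalty, exploration=2.0):
    """Score each arm as mean + exploration bonus - penalty * volatility.

    reward_history: one list of observed rewards per arm (assumed layout).
    penalty: risk-aversion weight; larger values punish volatile arms more.
    """
    total_pulls = sum(len(rewards) for rewards in reward_history)
    scores = []
    for rewards in reward_history:
        if not rewards:
            scores.append(math.inf)  # unexplored arms are tried first
            continue
        bonus = math.sqrt(exploration * math.log(total_pulls) / len(rewards))
        risk = pstdev(rewards) if len(rewards) > 1 else 0.0
        scores.append(mean(rewards) + bonus - penalty * risk)
    return scores


def select_arm(reward_history, penalty):
    """Return the index of the arm with the highest penalized score."""
    scores = penalized_ucb_scores(reward_history, penalty)
    return max(range(len(scores)), key=scores.__getitem__)


# Made-up rewards: arm 0 has a higher mean but is volatile, arm 1 is steady.
history = [[0.10, -0.06, 0.11, -0.05], [0.010, 0.012, 0.009, 0.011]]
print(select_arm(history, penalty=0.1))  # 0: the volatile arm wins
print(select_arm(history, penalty=0.3))  # 1: the steadier arm wins
```

With these made-up rewards, a penalty of 0.1 favors the volatile arm with the higher mean, whereas a penalty of 0.3 favors the steadier arm. This mirrors the qualitative pattern in Table 4, where a smaller penalty coefficient goes with a higher cumulative return.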
[Figure 10: Comparison of histograms of actual optimal investment portfolio and model selected investment portfolio. Axes: Date (2017-06 to 2021-06) and Investment income. Series: Actual optimal combination, Select investment portfolio.]

3.4 Applicability analysis of integrating multifactors and improved UCB investment portfolio model

To verify the applicability of this model in other financial markets, this study used funds of different styles as control groups and compared their back testing results with those of the model's investment portfolio. Table 5 shows the details of the funds. The control group covers stock, exponential (index), bond, and mixed fund styles in order to verify the model's generalization performance in the financial market.

Table 5: Details of funds with different styles

| Fund style | Fund name | Fund code | Fund ranking |
|---|---|---|---|
| Stock type | Credit Suisse Reform Dividend | 000592 | 29/316 |
| Exponential type | Enhanced research on the Shanghai and Shenzhen 300 Index | 000176 | 9/44 |
| Bond type | Minsheng Bank Convertible Bond Selection A | 000067 | 327/625 |
| Mixed type | Mixed vitality of emerging Chinese businesses | 001933 | 105/1683 |

Figure 11 shows the back testing results of funds with different styles. In the fund field, the trend of the back testing results of the model that fuses multifactors with the improved UCB was consistent with the return trends of the four funds. In terms of rate of return, the cumulative return of this model exceeded that of the stock fund, supporting the applicability of this model in other financial markets.
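The evaluation indicators used throughout these back tests (cumulative return, annualized rate of return, Sharpe ratio, and maximum drawdown rate) can be computed from a series of per-period returns. The sketch below is a minimal illustration under stated assumptions: simple returns, 252 trading periods per year, a zero risk-free rate, and cumulative return defined as net gain. The exact conventions used in the experiments are not given here, and the function name `evaluate` is hypothetical.

```python
import math


def evaluate(returns, periods_per_year=252, risk_free=0.0):
    """Compute back-test indicators from simple per-period returns.

    Assumes at least two returns and that wealth never reaches zero.
    """
    wealth, peak, max_drawdown = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r                      # compound the net value
        peak = max(peak, wealth)               # running maximum of net value
        max_drawdown = max(max_drawdown, 1.0 - wealth / peak)

    n = len(returns)
    cumulative = wealth - 1.0                  # net gain over the whole sample
    annualized = wealth ** (periods_per_year / n) - 1.0

    mean_r = sum(returns) / n
    variance = sum((r - mean_r) ** 2 for r in returns) / (n - 1)
    volatility = math.sqrt(variance * periods_per_year)
    if volatility > 0:
        sharpe = (mean_r * periods_per_year - risk_free) / volatility
    else:
        sharpe = float("nan")                  # undefined for constant returns

    return {
        "cumulative_return": cumulative,
        "annualized_return": annualized,
        "sharpe_ratio": sharpe,
        "max_drawdown": max_drawdown,
    }


# Toy example: net value goes 1.10 -> 0.55 -> 0.66, so the worst
# peak-to-trough loss is 50% and the net gain is -34%.
metrics = evaluate([0.10, -0.50, 0.20], periods_per_year=3)
print(metrics)
```

Maximum drawdown is tracked against the running peak of the net value, so it captures the worst peak-to-trough loss rather than the single worst period.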
[Figure 11: Comparison of back testing results for different style funds. Axes: Date (2019-12 to 2021-12) and Cumulative Return. Series: GA-UCB, 000592, 000176, 001933, 000067.]

Table 6 compares the evaluation indicators of the investment strategy results for funds with different styles. The investment portfolio strategy that integrates multifactors and the improved UCB outperformed the four funds. For cumulative returns, GA-UCB was 61.9, 95.5, 161.0, and 84.7 percentage points higher than the stock, exponential, bond, and mixed type funds, respectively. For the maximum drawdown rate, GA-UCB was 12.3, 2.2, and 6.6 percentage points lower than the stock, exponential, and mixed type funds, respectively, and 17.2 percentage points higher than the bond fund. In summary, the investment portfolio model that integrates multifactors and the improved UCB is applicable in different financial fields.

Table 6: Comparison of evaluation indicators for investment strategy results of funds with different styles

| Fund style | Cumulative return (%) | Annualized rate of return (%) | Sharpe ratio | Maximum drawdown rate (%) |
|---|---|---|---|---|
| Stock type | 125.2 | 40.2 | 0.87 | 38.8 |
| Exponential type | 91.6 | 24.3 | 68.4 | 28.7 |
| Bond type | 26.1 | 8.0 | 20.2 | 9.3 |
| Mixed type | 102.4 | 26.5 | 65.1 | 33.1 |
| GA-UCB | 187.1 | 44.3 | 1.02 | 26.5 |

Back testing experiments were conducted on the investment portfolio models in bear market, bull market, and oscillation interval datasets to verify the robustness of the investment portfolio model. The cumulative returns of the different investment portfolio models in these datasets are compared in Table 7.
In the bear market dataset, the GA-UCB investment portfolio model still maintained a relatively high cumulative return of 18.6%, which was 3.2, 45.0, and 36.5 percentage points higher than those of UCB, the Equal weight combination, and HS300, respectively. The cumulative return of the GA-UCB model in the oscillation interval dataset was 30.1%, which was 19.9, 32.2, and 29.7 percentage points higher than those of the other three models, respectively. In the bull market dataset, the cumulative return of the GA-UCB model reached 113.2%, which was 36.8, 58.6, and 81.2 percentage points higher than those of the other three models, respectively. Overall, the GA-UCB investment portfolio model achieved high cumulative returns in the bear, bull, and oscillation interval datasets, indicating that the model is robust.

Table 7: Comparison of cumulative return results of different investment portfolio models in different datasets

| Model | Bull market cumulative return (%) | Bear market cumulative return (%) | Oscillation interval cumulative return (%) |
|---|---|---|---|
| GA-UCB | 113.2 | 18.6 | 30.1 |
| UCB | 76.4 | 15.4 | 10.2 |
| Equal weight combination | 54.6 | -26.4 | -2.1 |
| HS300 | 32.0 | -17.9 | 0.4 |

4 Discussion

This study combines the multifactor model with the upper confidence bound (UCB) algorithm from the MAB framework to construct a financial investment portfolio model, and on this basis designs a portfolio strategy that seeks optimal returns in complex financial markets. GA is then used to improve the parameter configuration. In previous studies, Morijiri et al. [8] used numerical methods to analyze decision-making in large-scale MAB problems and achieved a scaling index of 97%, demonstrating the feasibility of MAB in their field. This study likewise applied MAB to large-scale data analysis. The results showed that the cumulative return of GA-UCB was 187.4%, which was 68.3 percentage points higher than the 119.1% of the HS300 Index.
This indicated that the GA-UCB algorithm had a stronger ability to optimize returns in the financial market. The method of Hasegawa et al. [10] demonstrated high frame success rates in large-scale heterogeneous Internet of Things networks. The GA-UCB model studied here also demonstrated efficiency and stability in predicting financial market volatility and optimizing investment portfolios, with a back testing result of 163.5%, well above the 105.6% of the SZ50 Index. The advantage of the model in this study was more pronounced in terms of stability. In addition, the MAB spectrum scheduling algorithms surveyed by Li et al. [11] made full use of uncertain resources. The GA-improved UCB algorithm in this study also performed well in handling uncertain market resources, with cumulative return and annualized return increasing by 13.9% and 21.7%, respectively. The model in this study therefore performed better when dealing with uncertain data. Meanwhile, the maritime network architecture method studied by Yang et al. [12] significantly reduced energy costs and latency. The GA-UCB algorithm in this study was efficient in computing resources and time costs and maintained high returns in different market environments, demonstrating robustness in particular in the bull, bear, and oscillation interval datasets. The factor model studied by De Nard et al. [13] was effective in detecting cross-sectional anomalies in stock returns. The model used in this study was not only effective for the cross-section of stock returns but also had advantages in predicting market trends and optimizing investment returns. These results indicated that the model in this study has broader applicability. In summary, previous research has only improved performance within its own research field or in a single aspect.
However, the model used in this study performs well not only in portfolio returns and stability but also in predicting market volatility. It has wider applicability and can meet diverse market decision-making needs. Therefore, compared to previous research results, this model is more adaptable to complex financial stock markets.

5 Conclusion

In a complex and ever-changing financial environment, portfolio optimization is key to improving returns and managing risks. To balance investment risks and increase returns, this study combines a multifactor model with UCB and uses GA to improve and optimize it. The results confirmed that the cumulative return of GA-UCB was 187.4%, which was 68.3 percentage points higher than the 119.1% cumulative return of the HS300 Index. The maximum drawdown rate of GA-UCB was the lowest at 13.5%, which was 5.0, 0.7, and 4.8 percentage points lower than those of UCB, the Equal weight combination, and HS300, respectively. The back testing result of the GA-UCB model on the SZ50 stock pool was 163.5%, which was 57.9 percentage points higher than the 105.6% cumulative return of the SZ50 Index. As the penalty coefficient decreased from 0.3 to 0.1, the cumulative return and annualized return increased by 13.9% and 21.7%, respectively. For the maximum drawdown rate, the strategies with penalty coefficients of 0.3, 0.2, and 0.1 were 4.2, 1.6, and 8.0 percentage points lower than HS300, respectively. For cumulative returns, GA-UCB was 61.9, 95.5, 161.0, and 84.7 percentage points higher than the stock, exponential, bond, and mixed type funds, respectively. For the maximum drawdown rate, GA-UCB was 12.3, 2.2, and 6.6 percentage points lower than the stock, exponential, and mixed type funds, respectively, and 17.2 percentage points higher than the bond fund. In summary, the financial investment optimization method that integrates multifactors and the GA-improved UCB effectively improves returns while controlling investment risks.
The financial market is complex and ever-changing, and this study has not conducted adaptive verification of the model across different market environments, so the results are not yet comprehensive. Future research can further explore the adaptability of the model in different market environments, thereby better serving investors' diverse market decision-making needs.

References

[1] K. Oshima, T. Onishi, S. J. Kim, J. Ma, and M. Hasegawa, "Efficient wireless network selection by using multi-armed bandit algorithm for mobile terminals," Nonlinear Theory and Its Applications, IEICE, vol. 11, no. 1, pp. 68-77, 2020. https://doi.org/10.1587/nolta.11.68
[2] Z. Duan, A. Li, N. Okada, Y. Ito, N. Chauvet, M. Naruse, and M. Hasegawa, "User pairing using laser chaos decision maker for NOMA systems," Nonlinear Theory and Its Applications, IEICE, vol. 13, no. 1, pp. 72-83, 2022. https://doi.org/10.1587/nolta.13.72
[3] G. O’Brien and J. D. Yeatman, "Bridging sensory and language theories of dyslexia: Toward a multifactorial model," Developmental Science, vol. 24, no. 3, pp. 13039, 2021. https://doi.org/10.1111/desc.13039
[4] X. Y. Liu and D. J. Huang, "Online portfolio selection based on quadratic smoothing grey prediction," Journal of East China Normal University: Natural Science Edition, vol. 2020, no. 6, pp. 115-128, 2020. https://doi.org/10.3969/j.issn.1000-5641.201921020
[5] J. A. He, B. Wang, and J. X. Lin, "Online investment portfolio strategy based on active expert opinions," Journal of Guangdong University of Technology, vol. 37, no. 4, pp. 59-64, 2020. https://doi.org/10.12052/gdutxb.190091
[6] N. Rouf, M. B. Malik, T. Arif, S. Sharma, S. Singh, S. Aich, and H. C. Kim, "Stock market prediction using machine learning techniques: a decade survey on methodologies, recent developments, and future directions," Electronics, vol. 10, no. 21, pp. 2717, 2021. https://doi.org/10.3390/electronics10212717
[7] Z. Hu, Y. Zhao, and M. Khushi, "A survey of forex and stock price prediction using deep learning," Applied System Innovation, vol. 4, no. 1, pp. 9, 2021. https://doi.org/10.3390/asi4010009
[8] K. Morijiri, T. Mihana, K. Kanno, M. Naruse, and A. Uchida, "Decision making for large-scale multi-armed bandit problems using bias control of chaotic temporal waveforms in semiconductor lasers," Scientific Reports, vol. 12, no. 1, pp. 8073, 2022. https://doi.org/10.1038/s41598-022-12155-y
[9] S. Takeuchi, M. Hasegawa, K. Kanno, A. Uchida, N. Chauvet, and M. Naruse, "Dynamic channel selection in wireless communications via a multi-armed bandit algorithm using laser chaos time-series," Scientific Reports, vol. 10, no. 1, pp. 1574, 2020. https://doi.org/10.1038/s41598-020-58541-2
[10] S. Hasegawa, R. Kitagawa, A. Li, S. J. Kim, Y. Watanabe, Y. Shoji, and M. Hasegawa, "Multi-armed-bandit based channel selection algorithm for massive heterogeneous internet of things networks," Applied Sciences, vol. 12, no. 15, pp. 7424, 2022. https://doi.org/10.3390/app12157424
[11] F. Li, D. Yu, H. Yang, J. Yu, H. Karl, and X. Cheng, "Multi-armed-bandit-based spectrum scheduling algorithms in wireless networks: A survey," IEEE Wireless Communications, vol. 27, no. 1, pp. 24-30, 2020. https://doi.org/10.1109/MWC.001.1900280
[12] T. Yang, S. Gao, J. Li, M. Qin, X. Sun, R. Zhang, and X. Li, "Multi-armed bandits learning for task offloading in maritime edge intelligence networks," IEEE Transactions on Vehicular Technology, vol. 71, no. 4, pp. 4212-4224, 2022. https://doi.org/10.1109/TVT.2022.3141740
[13] G. De Nard, O. Ledoit, and M. Wolf, "Factor models for portfolio selection in large dimensions: The good, the better and the ugly," Journal of Financial Econometrics, vol. 19, no. 2, pp. 236-257, 2021. https://doi.org/10.1093/jjfinec/nby033
[14] V. D. Ta, C. M. Liu, and D. A. Tadesse, "Portfolio optimization-based stock prediction using long-short term memory network in quantitative trading," Applied Sciences, vol. 10, no. 2, pp. 437, 2020. https://doi.org/10.3390/app10020437
[15] Y. Hao and Q. Gao, "Predicting the trend of stock market index using the hybrid neural network based on multiple time scale feature learning," Applied Sciences, vol. 10, no. 11, pp. 3961, 2020. https://doi.org/10.3390/app10113961
[16] H. Chung and K. Shin, "Genetic algorithm-optimized multi-channel convolutional neural network for stock market prediction," Neural Computing and Applications, vol. 32, no. 12, pp. 7897-7914, 2020. https://doi.org/10.1007/s00521-019-04236-3
[17] Y. Chen, L. Ye, R. Li, and X. Zhao, "A multi-period constrained multi-objective evolutionary algorithm with orthogonal learning for solving the complex carbon neutral stock portfolio optimization model," Journal of Systems Science and Complexity, vol. 36, no. 2, pp. 686-715, 2023. https://doi.org/10.1007/s11424-023-2406-3
[18] M. Rahiminezhad Galankashi, F. Mokhatab Rafiei, and M. Ghezelbash, "Portfolio selection: a fuzzy-ANP approach," Financial Innovation, vol. 6, no. 1, pp. 17, 2020. https://doi.org/10.1186/s40854-020-00175-4
[19] O. Momen, A. Esfahanipour, and A. Seifi, "A robust behavioral portfolio selection model with investor attitudes and biases," Operational Research, vol. 20, no. 1, pp. 427-446, 2020. https://doi.org/10.1007/s12351-017-0330-9
[20] K. Bisht and A. Kumar, "Stock portfolio selection hybridizing fuzzy base-criterion method and evidence theory in triangular fuzzy environment," Operations Research Forum, vol. 3, no. 4, pp. 53, 2022. https://doi.org/10.1007/s43069-022-00167-3
[21] Z. Li, Y. Dou, K. Yang, and M. Li, "System portfolio selection based on GRA method under hesitant fuzzy environment," Journal of Systems Engineering and Electronics, vol. 33, no. 1, pp. 120-133, 2022. https://doi.org/10.23919/JSEE.2022.000013
[22] Z. Y. Guo, "Stochastic multifactor models in risk management of energy futures," Journal of Futures Markets, vol. 40, no. 12, pp. 1918-1934, 2020. https://doi.org/10.1002/fut.22154
[23] X. Kang, H. Ri, M. N. A. Khalid, and H. Iida, "Addictive games: Case study on multi-armed bandit game," Information, vol. 12, no. 12, pp. 521, 2021. https://doi.org/10.3390/info12120521
[24] N. Narisawa, N. Chauvet, M. Hasegawa, and M. Naruse, "Arm order recognition in multi-armed bandit problem with laser chaos time-series," Scientific Reports, vol. 11, no. 1, pp. 4459, 2021. https://doi.org/10.1038/s41598-021-83726-8
[25] H. PH and A. Rishad, "An empirical examination of investor sentiment and stock market volatility: evidence from India," Financial Innovation, vol. 6, no. 1, pp. 34, 2020. https://doi.org/10.1186/s40854-020-00198-x
[26] R. T. Gdeeb, "Detecting Breast Cancer in X-RAY images using image segmentation algorithm and neural networks," Informatica, vol. 47, no. 9, pp. 1-10, 2023. https://doi.org/10.31449/inf.v47i9.4995
[27] A. Wang, W. Zhang, Y. Guo, W. Cross, V. Plummer, L. Lam, and J. Zhang, "Resilience-based multifactorial model of depression among people who lost an only-child in China," Journal of Central South University. Medical Sciences, vol. 46, no. 1, pp. 75-83, 2021. https://doi.org/10.11817/j.issn.1672-7347.2021.190301
[28] Z. He, "Improved genetic algorithm in multi-objective cargo logistics loading and distribution," Informatica, vol. 47, no. 2, pp. 253-260, 2023. https://doi.org/10.31449/inf.v47i2.3958
[29] K. Liu, Y. Sun, and D. Yang, "The administrative center or economic center: Which dominates the regional green development pattern? A case study of Shandong Peninsula urban agglomeration, China," Green and Low-Carbon Economy, vol. 1, no. 3, pp. 110-120, 2023. https://doi.org/10.47852/bonviewGLCE3202955
[30] T. Mihana, K. Kanno, M. Naruse, and A. Uchida, "Photonic decision making for solving competitive multi-armed bandit problem using semiconductor laser networks," Nonlinear Theory and Its Applications, IEICE, vol. 13, no. 3, pp. 582-597, 2022. https://doi.org/10.1587/nolta.13.582
[31] N. Luo, H. Yu, Z. You, Y. Li, T. Zhou, and N. Han, "Fuzzy logic and neural network-based risk assessment model for import and export enterprises: A review," Journal of Data Science and Intelligent Systems, vol. 1, no. 1, pp. 2-11, 2023. https://doi.org/10.47852/bonviewJDSIS32021078