Strojniški vestnik - Journal of Mechanical Engineering 66(2020)6, 385-394
© 2020 Journal of Mechanical Engineering. All rights reserved.
DOI:10.5545/sv-jme.2020.6546
Original Scientific Paper
Received for review: 2020-01-05
Received revised form: 2020-04-28
Accepted for publication: 2020-05-18

A Transient Feature Learning-Based Intelligent Fault Diagnosis Method for Planetary Gearboxes

Bo Qin1 - Zixian Li1 - Yan Qin2*
1 Inner Mongolia University of Science & Technology, School of Mechanical Engineering, China
2 Singapore University of Technology and Design, Engineering Product Development Pillar, Singapore

Sensitive and accurate fault features from the vibration signals of planetary gearboxes are essential for fault diagnosis, in which extreme learning machine (ELM) techniques have been widely adopted. To increase the sensitivity of the extracted features fed into the ELM, a novel feature extraction method is put forward, which takes advantage of the transient dynamics and the reconstructed high-dimensional data from the original vibration signal. First, based on fast kurtosis analysis, the frequency range of the transient dynamics of a vibration signal is located. Next, the signal is decomposed by variational mode decomposition into a series of intrinsic mode functions; those that fall within the range corresponding to the maximum kurtosis value are selected as transient features. A hierarchical ELM model is then trained on the transient features for fault classification. Furthermore, a denoising auto-encoder is used to optimize the input weights and thresholds of the hidden-layer nodes of the ELM so that they satisfy the orthogonality condition, realizing the layering of its hidden layers. Finally, a numerical case and an experiment are conducted to verify the performance of the proposed method. In comparison with its counterparts, the proposed method achieves better classification accuracy with the aid of transient features.
Keywords: transient features, kurtosis information, extreme learning machine, variational mode decomposition, fault diagnosis for planetary gearbox

Highlights
• VMD is employed to decompose the signal into components.
• Kurtosis information is used to identify transient features in the decomposed components.
• A high-dimensional feature vector is constructed using multiscale permutation entropy.
• A traditional extreme learning machine is optimized by introducing a denoising auto-encoder.
• Comprehensive comparisons are given to show the efficacy of the proposed method, including both a numerical case and a practical planetary gearbox platform.

0 INTRODUCTION

With the advantages of compact structure, high transmission efficiency, and strong carrying capacity, the planetary gearbox has been widely adopted in the transmission systems of powered devices, such as crawler vehicles, ships, and wind-driven generators [1]. In practice, the transmission system typically works in adverse environments and under continuously varying load. As a crucial component of a transmission system, the planetary gear is more prone to failure in poor working conditions. If faults in the planetary gear are not detected in time, the whole transmission system may be disturbed and degraded, leading to major safety threats. Therefore, prompt and reliable fault diagnosis for planetary gearboxes has received extensive attention and has become an active research field.

With ever-increasing developments in sensor and data storage technologies in industrial fields [2] and [3], massive amounts of data have become available and affordable. For the planetary gearbox, vibration sensors have been widely installed, and the collected data contain important features that indicate its health state.
Data-driven methods are superior in fault diagnosis to mechanism model-based methods, for which a priori process knowledge is necessary but difficult to obtain. Developing a data-driven fault diagnosis model commonly involves two sequential steps: fault feature extraction and development of the diagnosis model. Correspondingly, related research studies are reviewed from these two aspects.

*Corr. Author's Address: Singapore University of Technology and Design, Engineering Product Development Pillar, 487372 Singapore, neuqinyan@163.com

With respect to feature extraction, the wavelet transform [4] was used in the early stages; however, it faces the difficulty of selecting proper basis functions. Also, once a basis function is determined, it cannot be adjusted in subsequent analysis, leading to a non-optimal solution. After that, empirical mode decomposition (EMD) [5] and [6] was proposed; it decomposes the original measurements into several orthogonal components called intrinsic mode functions (IMFs). Each IMF corresponds to a specific frequency, and the IMFs are independent of each other. To overcome the problem of mode mixing, ensemble EMD (EEMD) [7] was proposed, which adds Gaussian white noise to the signal to be decomposed to improve the distribution of its extrema. For instance, Zhang et al. [8] used EEMD to decompose the transmission error signal into several IMFs to extract high-quality fault features with less noise. The experimental results showed that spallation faults and crack faults of gears could be identified. Lv et al. [9] used EEMD decomposition and reconstructed the signal according to the calculated correlation coefficient and kurtosis in order to extract weak features of early faults in rotating machinery. Pang et al.
[10] used EEMD to suppress the interference of signal noise to a certain extent, and their experiments showed that the EEMD denoising method fully retains the fault feature information and effectively improves the fault detection rate of the compound gear train gearbox. Furthermore, variants of EEMD have been reported, such as complementary EEMD [11] and complete EEMD [12]. However, tuneable parameters, including the amplitude of the added noise and the number of screening (sifting) iterations, exert non-negligible influences on the performance of EEMD. These parameters are set manually in current research studies, which leads to inaccurate EEMD results. To solve the above-mentioned problems, variational mode decomposition (VMD) [13] was proposed. Specifically, the centre frequency and bandwidth of each IMF component are updated iteratively to search for the optimal solution of the constrained variational model, achieving an adaptive subdivision of the signal frequency band.

Following feature extraction, a reliable and accurate fault diagnosis model must be developed. Taking advantage of the abundance of data, artificial intelligence (AI) technology has been developed and widely applied to improve fault identification ability. Among various AI methods, the deep belief network (DBN) [14], convolutional neural network (CNN) [15], and auto-encoder (AE) [16] have been widely studied. Although DBN removes the dependence on tedious signal pre-processing techniques, it is seldom applied in fault diagnosis since it may fail to capture useful features. For CNN, the input data need to have a two-dimensional structure. As a result, it is not suitable for the feature recognition of vibration signals [17]. Compared with DBN and CNN, AE is more suitable for feature classification since it requires only a small number of samples for training.
Furthermore, with proper feature extraction, AE can achieve high fault diagnosis accuracy, demonstrating its strong feature extraction ability and robustness.

Generally, the vibration signal of a planetary gearbox is complex and shows strong non-stationarity and modulation characteristics, which makes feature extraction more difficult. In fact, transient features in the vibration signal are sensitive to fault information. Correspondingly, if the transient features in the vibration signal of the planetary gearbox can be properly captured, it is possible to further improve fault diagnosis accuracy with advanced AI methods.

To achieve more accurate fault diagnosis performance, an intelligent fault diagnosis method is proposed for a planetary gearbox in this paper, which integrates the advantages of fast spectral kurtosis, VMD, improved multiscale permutation entropy (MPE), and denoising AE (DAE) optimization. First, the vibration signal of the planetary gearbox is analysed with the fast kurtosis diagram and decomposed with VMD. In this way, the IMFs whose centre frequencies correspond to the sensitive transient impacts are captured. Next, an extreme learning machine (ELM) method is used to construct an initial fault diagnosis model with the extracted kurtosis features. After that, DAE is used to optimize the input weights and thresholds of the ELM hidden-layer nodes so that they satisfy the orthogonality condition, realizing the hierarchical hidden layer. In this way, the number of input and output samples is equal, improving the classification accuracy of the planetary gearbox fault diagnosis model with DAE-ELM. Experiments on real data show that the proposed method has higher diagnosis accuracy.

The rest of the paper is organized as follows. The preliminaries are briefly reviewed in Section 1. Section 2 introduces the proposed method. The experimental results and discussions are given in Section 3. Conclusions are drawn in Section 4.
1 PRELIMINARY

1.1 Fast Spectral Kurtosis

Kurtosis is sensitive to transient shocks and can therefore be used to locate the transient frequency band of a signal in a planetary gearbox. Antoni et al. [18] proposed a fast spectral kurtosis algorithm based on FIR band-pass filters, in which the full band is split with a 1/3-binary tree structure. The signal $X(k)$ is decomposed into a pre-defined number of layers. After obtaining the filtering result of each layer, the kurtosis values of all frequency segments are calculated as

$$K(f) = \frac{\left\langle \left| c_m^i(k) \right|^4 \right\rangle}{\left\langle \left| c_m^i(k) \right|^2 \right\rangle^2} - 2, \quad i = 0, 1, \ldots, 2^m - 1, \tag{1}$$

where $f$ is the frequency of the signal; $c_m^i(k)$ is the filtering result obtained by the $i$th filter of the $m$th layer; $|\cdot|$ denotes the modulus; and $\langle \cdot \rangle$ stands for the expected value. The defined $K(f)$ is a measure of the peakedness of the signal probability density function at a certain frequency. The centre frequency $f_c$ and the bandwidth $B_w$ corresponding to the maximum kurtosis value $K_{max}$ of $X(k)$ are then determined.

1.2 VMD Decomposition

To overcome modal aliasing and other drawbacks of both EMD and EEMD, VMD was proposed by Dragomiretskiy and Zosso [13]. It decomposes a signal in a variational framework and uses iteration to find the optimal solution of the constrained variational model. For an input signal $f$, the IMFs are derived from the constrained variational model below,

$$\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f, \tag{2}$$

where $f$ is the original signal, $u_k(t)$ is the $k$th IMF component, $\omega_k$ is the centre frequency of the $k$th component, $\delta(t)$ is the Dirac function, and $t$ is the time index. Eq. (2) is transformed into an unconstrained problem and solved iteratively with the alternating direction method of multipliers. Through the above process, $K$ IMF components $u = [u_1, u_2, \ldots, u_K]$ and the corresponding centre frequencies $\omega = [\omega_1, \omega_2, \ldots, \omega_K]$ are obtained.
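As an illustration of Eq. (1), the following Python sketch evaluates the kurtosis of every band at one layer of a dyadic band split and picks the band with the largest value. It is a minimal sketch under stated assumptions, not the authors' implementation: an ideal FFT mask stands in for the FIR filter bank of the fast spectral kurtosis algorithm, only a single layer of $2^m$ equal-width bands is evaluated, the expected value is replaced by the sample mean, and the function names (`band_envelope`, `spectral_kurtosis`, `kurtogram_level`) as well as the test signal are hypothetical.

```python
import numpy as np

def band_envelope(x, fs, f_lo, f_hi):
    """Complex band-limited version of x on [f_lo, f_hi) Hz.

    Assumption: an ideal (brick-wall) FFT mask replaces the FIR band-pass
    filters. Only positive frequencies are kept, so the result is complex
    and its modulus is the envelope of the band.
    """
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), d=1.0 / fs)
    mask = (freqs >= f_lo) & (freqs < f_hi)
    return np.fft.ifft(np.where(mask, spectrum, 0.0))

def spectral_kurtosis(c):
    """Eq. (1): <|c|^4> / <|c|^2>^2 - 2, with <.> taken as the sample mean."""
    power = np.abs(c) ** 2
    return np.mean(power ** 2) / np.mean(power) ** 2 - 2.0

def kurtogram_level(x, fs, level):
    """Kurtosis of each of the 2**level equal-width bands covering [0, fs/2)."""
    n_bands = 2 ** level
    width = fs / 2.0 / n_bands
    centres = (np.arange(n_bands) + 0.5) * width
    k = np.array([
        spectral_kurtosis(band_envelope(x, fs, i * width, (i + 1) * width))
        for i in range(n_bands)
    ])
    return centres, width, k

# Hypothetical test signal: a 200 Hz tone, ten decaying 3 kHz bursts per
# second, and weak white noise (fixed seed for repeatability).
fs = 10_000
n = np.arange(fs)
t = n / fs
tau = (n % 1000) / fs                      # time since the last burst onset
rng = np.random.default_rng(0)
x = (np.cos(2 * np.pi * 200 * t)
     + np.exp(-1000 * tau) * np.sin(2 * np.pi * 3000 * tau)
     + 0.05 * rng.standard_normal(t.size))

centres, width, k = kurtogram_level(x, fs, level=2)
f_c = centres[np.argmax(k)]                # centre of the most impulsive band
print(f"f_c = {f_c:.0f} Hz, B_w = {width:.0f} Hz")
```

In this sketch a constant-modulus component such as a pure tone gives a value close to -1, whereas a band dominated by repeated short bursts gives a large positive value, which is why the maximum of $K(f)$ points to the transient band.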
2 METHODOLOGY

In this section, an intelligent method is proposed to perform fault feature extraction and online diagnosis for a planetary gearbox. The basic structure and framework of the proposed method are given in Fig. 1, which includes three parts. First, in the data acquisition stage, the state monitoring system collects the historical data and online data of the planetary gearbox through the sensor and acquisition device. Then, in the feature extraction stage, fault feature components and their corresponding feature vectors are obtained from the historical data through VMD and fast spectral kurtosis (FSK) analysis, and an improved feature enhancement method based on multiscale permutation entropy is used to filter the feature vector sets. After that, based on the DAE-optimized ELM algorithm, an intelligent state identification model is constructed by learning the feature vector sets of the historical data. Similarly, high-quality feature vector sets are obtained from the online data through VMD, fast spectral kurtosis analysis, and the improved MPE method. The high-quality feature vector set of the online data is further used to train the DAE-ELM state identification model built from historical data, so as to improve the fault classification accuracy of this model and achieve online diagnosis.

Fig. 1. Framework for the proposed method

2.1 Construction of Feature Set

2.1.1 Extraction of IMFs

Fig. 2 shows the details of the construction of the feature set. First, the vibration signal $X(k)$ is decomposed by VMD to obtain $n$ IMF components, and the centre frequency $f_i$ of each component is calculated. The vibration signal $X(k)$ is also analysed using FSK to obtain the centre frequency $f_c$ and bandwidth $B_w$ corresponding to the maximum kurtosis value.
Then, according to whether the centre frequency $f_i$ of the $i$th IMF component lies within the frequency range $[f_c - B_w/2,\ f_c + B_w/2]$, a subset of the IMFs is selected as the fault feature components, which is defined as a set Q. The remaining IMF components are discarded since they carry little fault information. Finally, the IMF components in the set Q are added and reconstructed into a new time-domain waveform, and the IMPE value of each IMF component is calculated to construct the fault feature set T, as specified in the following subsections.

Fig. 2. Signal decomposition and its feature vector construction

2.1.2 Improved Multiscale Permutation Entropy-based Feature Enhancement

The multiscale permutation entropy (MPE) algorithm [19] was designed to capture fine-grained dynamics in various signals, including ECG, vibration, and speech signals. However, it has two problems in learning the details of abrupt changes. First, the samples used during coarse granulation are asymmetric. Second, for a specific time series $X(i)$, as the scale $s$ increases, the number of samples contained in the coarse-grained time series $y^{(s)}(j)$ decreases rapidly, resulting in large fluctuations in the calculated entropy value. To solve these problems, Azami and Escudero [20] refined the coarse granulation of $X(i)$ at each scale factor $s$ and averaged the corresponding entropy values. The specific steps are as follows:

(1) Coarse granulation under multiscale conditions. $X(i)$ is coarsely granulated into $y^{(s)}(j)$, and the result is given below,

$$y^{(s)}(j) = \frac{1}{s} \sum_{i=(j-1)s+1}^{js} x(i). \tag{3}$$

(2) $y^{(s)}(j)$ is further transformed into $s$ different coarse-grained sequences below,

$$z_i^{(s)} = \left\{ y_{i,1}^{(s)}, y_{i,2}^{(s)}, \ldots \right\} \quad (i = 1, 2, \ldots, s), \tag{4}$$

where $y_{i,j}^{(s)}$ is as follows,

$$y_{i,j}^{(s)} = \frac{1}{s} \sum_{l=0}^{s-1} x\big(l + i + (j-1)s\big),$$

where $X(i)$ is the time series of length $N$; $s$ is the time scale factor; and $y^{(s)}(j)$ is the coarse-grained sequence at scale $s$, $j = 1, 2, \ldots, [N/s]$.

(3) For each scale $s$, the permutation entropy of each coarse-grained sequence $z_i^{(s)}$ is calculated and averaged below,

$$\mathrm{IMPE} = \frac{1}{s} \sum_{i=1}^{s} \mathrm{PE}\left(z_i^{(s)}\right), \tag{5}$$

where $\mathrm{PE}(\cdot)$ is the function that calculates the permutation entropy.

2.2 Intelligent Diagnosis Model Construction

ELM trains quickly and yields a globally optimal solution. However, the input weights and thresholds of the hidden-layer nodes are randomly generated, resulting in low accuracy and poor robustness. To solve this problem, DAE is employed to train the ELM by adding local corruption noise, which gives a more robust network. The numbers of input and output samples are set equal to achieve unsupervised learning. Also, the randomly generated weights A and thresholds B of the hidden-layer nodes are made to satisfy the orthogonality condition, and the weights and thresholds of the hidden layer of the ELM are thereby optimized to improve classification accuracy.

Once the orthogonal hidden-layer parameters A and B are generated in DAE-ELM, the input sample set is mapped to the high-dimensional space by Eq. (6) as follows,

$$h = g(ax + b) \quad \text{s.t.} \quad A^{T}A = I, \; B^{T}B = 1, \tag{6}$$

where $g(\cdot)$ is the activation function; $A = [a_1, a_2, \ldots, a_N]$ are the orthogonal weights that connect the input layer and the hidden-layer nodes; $B = [b_1, b_2, \ldots, b_N]$ are the orthogonal thresholds, in which $a$ and $b$ belong to individual nodes in the hidden layer; and $H$ is the output matrix of the hidden layer.

The output weight $\beta$ is the learned transformation from the feature space to the input data, calculated by Eq. (7) below,

$$\beta = \left( \frac{I}{C} + H^{T}H \right)^{-1} H^{T} X, \tag{7}$$

where $C$ is the regularization coefficient. The specific process of the algorithm is shown in Fig. 3.

Fig. 3. Flowchart of DAE optimized ELM

3 RESULTS

In this section, the performance of the proposed method is illustrated with two cases: a numerical case and an industrial one. Specifically, the first case verifies the decomposition result of signals, in comparison with that of EEMD. The second case focuses on analysing fault diagnosis performance with the proposed transient fault features.

3.1 Numerical Simulation

A simulated digital signal $X(k)$ is constructed from three independent components $X_1(k)$, $X_2(k)$, and $X_3(k)$, which are given in Eq. (8) below,

$$X(k) = X_1(k) + X_2(k) + X_3(k) = e^{-1000k}\sin(5000\pi k) + \cos(1000\pi k)\cdot\sin(150\pi k) + \cos(400\pi k), \tag{8}$$

where $X_1(k)$ is a periodic exponential decay shock signal with a frequency of 2500 Hz; $X_2(k)$ is a periodic frequency modulation signal; and $X_3(k)$ is a cosine signal with a frequency of 200 Hz. In Fig. 4, $X(k)$ and its components following Eq. (8) are plotted with a length of 1000 samples.

Fig. 4. The time domain waveform of X(k) and its components

$X(k)$ is decomposed based on a five-layer fast kurtosis diagram, and the corresponding results are shown in Fig. 5. It is observed that the colour in the frequency range [2500 Hz, 5000 Hz] is the deepest, which can be used to infer the centre frequency and bandwidth. According to Eq. (1), the centre frequency $f_c$ is determined as 3750 Hz, and the bandwidth $B_w$ is 2500 Hz.

Fig. 5. Result of fast kurtosis diagram of simulated signal

Next, the decomposition results of $X(k)$ based on VMD are shown in Fig. 6, in which three IMFs are extracted, i.e. IMF_VMD1, IMF_VMD2, and IMF_VMD3. By comparing Fig. 6 with Fig. 4, it is observed that the extracted signals are similar to the real components: IMF_VMD3 is similar to $X_1(k)$; IMF_VMD2 is similar to $X_2(k)$; and IMF_VMD1 is similar to $X_3(k)$. Therefore, the efficacy of VMD in signal decomposition is well illustrated, which provides a foundation for the subsequent construction of the feature set.

Fig. 6. VMD decomposition results of X(k)

Further, EEMD is employed for comparison. Five IMFs are retained for EEMD, and the corresponding results are shown in Fig. 7. It is observed that the second component IMF_EEMD2 and the third component IMF_EEMD3 are mixed with each other. Besides, IMF_EEMD2 is mixed with the high-frequency component of IMF_EEMD1, and the residual component cannot be decomposed. Therefore, it is concluded that the proposed method provides more powerful feature extraction than the competing method.

Fig. 7. EEMD decomposition results of X(k)

3.2 Experiments on Planetary Gearbox

In order to verify the effectiveness of the above algorithm, a practical test rig, shown in Fig. 8, is established. It contains a SIEMENS-LMS multi-channel data acquisition instrument and a DDS power transmission comprehensive fault simulation platform produced by Spectra Quest. For testing, four kinds of faults (broken tooth, missing tooth, wear, and crack) occurring during the operation of the first-stage planetary wheel of the planetary gearbox are employed for analysis.

Fig. 8. The constructed platform for fault diagnosis of planetary gearbox

Fig. 9. Examples of typical faults and the corresponding signals in typical fault states of the planetary gearbox for a) broken tooth, b) crack, c) missing tooth, and d) wear of tooth surface
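The feature vectors used in the experiments are improved multiscale permutation entropy values, following Eqs. (3) to (5) of Section 2.1.2. The Python sketch below shows one way these values could be computed for a single signal. It is a minimal sketch, not the authors' code: the embedding order and delay of the permutation entropy are not stated in the paper, so the defaults used here (order 3, delay 1), the natural logarithm without normalization, and all function names are assumptions.

```python
import numpy as np

def permutation_entropy(x, order=3, delay=1):
    """Shannon entropy (natural log) of the ordinal patterns of x.

    Assumption: order and delay are illustrative defaults; the entropy is
    not normalized, so its maximum is log(order!).
    """
    x = np.asarray(x, dtype=float)
    n = len(x) - (order - 1) * delay
    if n <= 0:
        raise ValueError("series too short for the chosen order and delay")
    # Row t holds the embedded vector x[t], x[t+delay], ..., x[t+(order-1)*delay].
    idx = np.arange(n)[:, None] + delay * np.arange(order)[None, :]
    patterns = np.argsort(x[idx], axis=1, kind="stable")
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / n
    return float(-np.sum(p * np.log(p)))

def coarse_grain(x, scale, offset):
    """Eq. (4): means of consecutive blocks of `scale` points.

    `offset` is the 0-based start index, i.e. i - 1 in the notation of Eq. (4).
    """
    x = np.asarray(x, dtype=float)[offset:]
    m = len(x) // scale
    return x[: m * scale].reshape(m, scale).mean(axis=1)

def impe(x, scale, order=3, delay=1):
    """Eq. (5): average permutation entropy over the `scale` shifted sequences."""
    values = [permutation_entropy(coarse_grain(x, scale, i), order, delay)
              for i in range(scale)]
    return float(np.mean(values))

def impe_features(x, max_scale=12, order=3, delay=1):
    """Feature vector [IMPE at scale 1, ..., IMPE at scale max_scale]."""
    return np.array([impe(x, s, order, delay) for s in range(1, max_scale + 1)])

# Hypothetical input: white noise with the same length as one recorded group
# (15,360 points); a selected IMF would be used in its place.
rng = np.random.default_rng(1)
features = impe_features(rng.standard_normal(15_360), max_scale=12)
print(features.shape)
```

Averaging over the $s$ shifted sequences uses every sample of the series at each scale, which is what reduces the fluctuation of the entropy estimate at large scales compared with a single coarse-grained sequence.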
During signal acquisition, a PCB356A16 accelerometer is used to collect the vibration signals in the vertical radial, horizontal radial, and axial directions at the measuring point. The sampling frequency is 15,360 Hz, the motor speed is 2100 r/min, and 60 sets of data are collected in each status. Each group of data is collected for 1 second; that is, each group contains 15,360 sampling points. Typical examples of these faults are given in Fig. 9, together with a signal corresponding to each status.

3.2.1 Extraction of Sensitive Fault Features and Construction of Feature Set

Taking the broken tooth signal as an example, the fast kurtosis algorithm is first used to obtain the centre frequency corresponding to the maximum kurtosis value of the broken tooth signal; the result is shown in Fig. 10.

Fig. 10. Result of fast kurtosis diagram of broken tooth signal

The centre frequency is identified as $f_c$ = 2400 Hz, and the associated frequency band ranges from 2240 Hz to 2560 Hz. Then, the obtained signal is further decomposed by VMD, and the first six IMFs are selected as candidate features, as shown in Fig. 11. Finally, IMF3 is selected as the sensitive fault component since it lies in the frequency band from 2240 Hz to 2560 Hz. Similarly, the same procedure is conducted on the other three fault signals and the normal signal.

Fig. 11. Result of VMD on broken tooth signals

In the experiment, the sampling length is one second, and 60 sets of signals under each status are collected. Next, the improved MPE algorithm is used to calculate the entropy values of the above sixty groups of selected IMFs with twelve scales to construct the feature vector set T. Because of the page limit, Table 1 summarizes only part of the entropy values of the fault signals and the normal signal.

Table 1. The results of entropy values for each status in planetary gearbox

                            Eigenvector (enhanced multiscale entropy)
Status                 No.  PE1     PE2     PE3     ...  PE10    PE11    PE12
Missing tooth failure  60   3.1822  4.3058  4.4948  ...  5.5471  5.7055  4.8742
                            3.1703  4.2487  4.5769  ...  5.5075  5.7462  4.9564
Normal                 60   3.0615  4.0058  4.4398  ...  5.8262  5.7132  5.6157
                            3.0403  3.9515  4.4486  ...  5.7056  5.8031  5.6923
Broken tooth failure   60   3.5286  4.8606  4.8038  ...  5.2683  5.6255  6.1774
                            3.5388  4.8242  4.8155  ...  5.2321  5.6212  6.1801
Crack failure          60   3.9995  4.8846  5.2658  ...  5.9002  5.2612  6.1351
                            3.9950  4.8680  5.2514  ...  5.9345  5.2756  6.1354
Wear failure           60   4.3728  5.3764  5.4617  ...  6.2277  5.1601  6.2675
                            4.2759  5.3332  5.4744  ...  6.2652  5.0579  6.2135

3.2.2 Diagnosis of Planetary Gearbox Faults

For each status in Table 1, 40 sets of eigenvectors are randomly selected as training samples, and the remaining twenty sets of data are used as testing data. The DAE-ELM intelligent diagnosis model for the planetary gear is developed following the steps given in the methodology. In Fig. 12, the X-axis indicates the assignment of testing samples in each status. The Y-axis indicates the type of fault, in which 1 is the missing tooth, 2 stands for normal status, 3 means a broken tooth, 4 is a crack, and 5 is wear. It is easy to see that there is only one misclassified sample in the crack fault. As a result, the classification accuracy of the DAE-ELM intelligent diagnosis model reached 99 %.

Fig. 12. Fault classification results of planetary gearbox using DAE-ELM

For comparison, the feature vector set T extracted in Subsection 3.2.1 is fed into KELM [21] and SVM [22] based diagnosis models, respectively. The results of these two methods are shown in Figs. 13 and 14, respectively. In Fig. 13, two samples of wear fault are misclassified as crack fault, and one sample of crack fault is misclassified as wear fault. Also, one sample of missing tooth fault is misclassified as another fault. As a result, the accuracy of the KELM-based algorithm is 96 %. In Fig. 14, two samples of wear fault are misclassified as crack fault and four samples of crack fault are misclassified, resulting in an average diagnosis accuracy of 95 %. Therefore, the proposed DAE-ELM algorithm achieves the best performance by optimizing the hidden layer of ELM using DAE.

Table 2. Accuracy comparison between the proposed method and its counterpart under each status

Method   Missing tooth [%]  Normal [%]  Broken tooth [%]  Crack [%]  Wear [%]  Average [%]
DAE-ELM  99                 100         100               100        100       99
KELM     95                 100         100               95         90        96
SVM      95                 95          100               100        85        95

Fig. 13. Fault classification results of planetary gearbox using KELM

Fig. 14. Fault classification results of planetary gearbox using SVM

4 CONCLUSIONS

This paper constructs a sensitive feature set and a DAE-ELM intelligent diagnosis model for planetary gearboxes.
Through the comparative analysis of the simulated signal and the experimental signal, the efficacy of the proposed feature set construction and the superiority of the DAE-ELM-based intelligent diagnosis model are verified. In comparison with the diagnosis methods based on KELM and SVM, the results show that the proposed improved MPE and DAE-ELM methods not only effectively extract sensitive transient characteristics of planetary gearbox vibration signals but also increase the classification accuracy of the state identification model by 3 % and 4 %, respectively.

5 ACKNOWLEDGEMENTS

This research was supported in part by the National Natural Science Foundation of China (No. 61903327), the Natural Science Fund of Inner Mongolia (No. 2017MS0509), and the Inner Mongolia Scientific Research Projects of Colleges and Universities (No. NJZY19298).

6 REFERENCES

[1] Teng, W., Wang, F., Zhang, K.L., Liu, Y.B., Ding, X. (2014). Pitting fault detection of a wind turbine gearbox using empirical mode decomposition. Strojniški vestnik - Journal of Mechanical Engineering, vol. 60, no. 1, p. 12-20, DOI:10.5545/sv-jme.2013.1295.
[2] Qiao, Z., Lei, Y., Li, N. (2019). Applications of stochastic resonance to machinery fault detection: A review and tutorial. Mechanical Systems and Signal Processing, vol. 122, p. 502-536, DOI:10.1016/j.ymssp.2018.12.032.
[3] Yin, A., Lu, J., Dai, Z., Li, J., Ouyang, Q. (2016). Isomap and deep belief network-based machine health combined assessment model. Strojniški vestnik - Journal of Mechanical Engineering, vol. 62, no. 12, p. 740-750, DOI:10.5545/sv-jme.2016.3694.
[4] Saxena, A., Wu, B., Vachtsevanos, G. (2005). A methodology for analyzing vibration data from planetary gear systems using complex Morlet wavelets. Proceedings of the American Control Conference, p. 4730-4735, DOI:10.1109/ACC.2005.1470743.
[5] Feng, Z., Zuo, M.J. (2013). Fault diagnosis of planetary gearboxes via torsional vibration signal analysis. Mechanical Systems and Signal Processing, vol. 36, no. 2, p. 401-421, DOI:10.1016/j.ymssp.2012.11.004.
[6] Feng, Z., Lin, X., Zuo, M.J. (2016). Joint amplitude and frequency demodulation analysis based on intrinsic time-scale decomposition for planetary gearbox fault diagnosis. Mechanical Systems and Signal Processing, vol. 72-73, p. 223-240, DOI:10.1016/j.ymssp.2015.11.024.
[7] Wu, Z., Huang, N.E. (2009). Ensemble empirical mode decomposition: A noise assisted data analysis method. Advances in Adaptive Data Analysis, vol. 1, no. 1, p. 1-41, DOI:10.1142/S1793536909000047.
[8] Zhang, W.B., Pu, Y.S., Zhu, J.X., Su, Y.P. (2013). Gear fault diagnosis method using EEMD sample entropy and grey incidence. Advanced Materials Research, vol. 694-697, p. 1151-1154, DOI:10.4028/www.scientific.net/AMR.694-697.1151.
[9] Lv, Z.-L., Tang, B.-P., Zhou, Y., Zhou, C.-D. (2015). A novel fault diagnosis method for rotating machinery based on EEMD and MCKD. International Journal of Simulation Modelling, vol. 14, no. 3, p. 438-449, DOI:10.2507/IJSIMM14(3)6.298.
[10] Pang, X., Cheng, B., Yang, Z., Li, F. (2019). A fault feature extraction method for a gearbox with a composite gear train based on EEMD and translation-invariant multiwavelet neighboring coefficients. Strojniški vestnik - Journal of Mechanical Engineering, vol. 65, no. 1, p. 3-11, DOI:10.5545/sv-jme.2018.5441.
[11] Chen, X.H., Cheng, G., Li, H.Y., Li, Y. (2019). Research of planetary gear fault diagnosis based on multiscale fractal box dimension of CEEMD and ELM. Strojniški vestnik - Journal of Mechanical Engineering, vol. 63, no. 1, p. 45-55, DOI:10.5545/sv-jme.2016.3811.
[12] Wang, L.M., Shao, Y.M. (2020). Fault feature extraction of rotating machinery using a reweighted complete ensemble empirical mode decomposition with adaptive noise and demodulation analysis. Mechanical Systems and Signal Processing, vol. 138, DOI:10.1016/j.ymssp.2019.106545.
[13] Dragomiretskiy, K., Zosso, D. (2014). Variational mode decomposition. IEEE Transactions on Signal Processing, vol. 62, no. 3, p. 531-544, DOI:10.1109/TSP.2013.2288675.
[14] Tao, J., Liu, Y., Yang, D. (2016). Bearing fault diagnosis based on deep belief network and multisensor information fusion. Shock and Vibration, vol. 2016, p. 1-9, DOI:10.1155/2016/9306205.
[15] Lu, C., Wang, Z., Zhou, B. (2017). Intelligent fault diagnosis of rolling bearing using hierarchical convolutional network based health state classification. Advanced Engineering Informatics, vol. 32, p. 139-151, DOI:10.1016/j.aei.2017.02.005.
[16] Zhang, Q., Yang, L.T., Chen, Z. (2016). Deep computation model for unsupervised feature learning on big data. IEEE Transactions on Services Computing, vol. 9, no. 1, p. 161-171, DOI:10.1109/TSC.2015.2497705.
[17] Ren, H., Qu, J.F., Chai, Y., Tang, Q., Ye, X. (2017). Research status and challenges of deep learning in the field of fault diagnosis. Control and Decision, vol. 32, no. 8, p. 1345-1358, DOI:10.13195/j.kzyjc.2016.1625.
[18] Antoni, J., Randall, R.B. (2006). The spectral kurtosis: application to the vibratory surveillance and diagnostics of rotating machines. Mechanical Systems and Signal Processing, vol. 20, no. 2, p. 308-331, DOI:10.1016/j.ymssp.2004.09.002.
[19] Aziz, W., Arif, M. (2005). Multiscale permutation entropy of physiological time series. Pakistan Section Multitopic Conference, p. 1-6, DOI:10.1109/INMIC.2005.334494.
[20] Azami, H., Escudero, J. (2016). Improved multiscale permutation entropy for biomedical signal analysis: Interpretation and application to electroencephalogram recordings. Biomedical Signal Processing and Control, vol. 23, p. 28-41, DOI:10.1016/j.bspc.2015.08.004.
[21] Li, K., Su, L., Wu, J., Wang, H., Chen, P. (2007). A rolling bearing fault diagnosis method based on variational mode decomposition and an improved kernel extreme learning machine. Applied Sciences, vol. 7, no. 10, 1004, DOI:10.3390/app7101004.
[22] Cortes, C., Vapnik, V. (1995). Support vector networks. Machine Learning, vol. 20, no. 3, p. 273-297, DOI:10.1007/BF00994018.