ELEKTROTEHNIŠKI VESTNIK 91(1-2): 67-71, 2024
OVERVIEW PROFESSIONAL PAPER

A review of fault types and analytical and hardware redundancy for the sensor functional safety

Helena Domajnko
University of Ljubljana, Faculty of Electrical Engineering, Tržaška 25, 1000 Ljubljana, Slovenia
E-mail: hd2150@student.uni-lj.si

Abstract. Sensors are part of medical, industrial, automotive, and other applications and contribute to the overall safety of the whole system. Safety standards are imposed for different sectors, and safety should be addressed as early as the sensor design level. There are several approaches to sensor fault detection, each with its own advantages and disadvantages. The paper reviews the literature on the main sensor fault types, on analytical and hardware redundancy, and on approaches to fault diagnosis and identification. More specifically, sensor bias, drift, gain faults, noise, hardware failures, random faults, and complete failures are listed. For analytical redundancy, the paper describes the parity space approach (PSA), principal component analysis (PCA), independent component analysis (ICA), factor analysis, and the minimum mean square error (MMSE) method. The steps of each method and its applications are briefly explained. Virtual sensors and decision by majority are proposed as possible alternatives to hardware redundancy. Choosing the most appropriate method for a particular use can enhance performance without changing the hardware. An in-depth comparison of the methods is expected to help optimise the sensor selection.

Keywords: functional safety, sensor fault, analytical redundancy, hardware redundancy

A review of faults and of analytical and hardware redundancy for the functional safety of sensors

Sensors contribute significantly to system safety, so their correct operation must be ensured. This paper reviews the most common fault types and several analytical and hardware redundancy methods for detecting these faults. Choosing the most suitable technique can improve sensor reliability without changing the hardware.

1 INTRODUCTION

With the growing integration of robotic systems and automation into everyday life, the need for their safety and reliability continues to increase. The systems are becoming more collaborative and are spreading into industrial, healthcare, and domestic applications. To guarantee an adequate level of functional safety, robots must adhere to safety standards. According to [1], the IEC 61508 standard and the related ISO 26262 and ISO 13849 standards apply to the robotics industry. Today, safety is mandatory in any newly developed device and is, therefore, a regular subject of research.

In the automotive industry, vehicles with a high level of automation are being developed for both industrial and public-sector applications. To ensure user safety, automated vehicles have to comply with the ISO 26262 standard (Road Vehicles – Functional Safety), which lays down safety goals, hazard analysis, risk assessment, and functional safety requirements [2]. Standardised functional safety requirements are necessary for advanced driver-assistance systems (ADAS) and autonomous driving (AD), as they strongly affect the dynamics of the vehicle. In these applications, the commonly used safety-related sensors are radar and ultrasound sensors, lidars, and cameras [3].

Collaborative industrial systems present an extra challenge in providing safety for physical human-robot interaction (HRI), especially regarding mechanical hazards in a shared working environment [4]. Besides mechanical and electrical technologies, sensor technologies are used in industrial robots to create a safe workspace for humans. A safety-related system (SRS) in industrial robotics usually comprises sensors, a logic subsystem (processors and microcontrollers), software algorithms, and actuators [5].

In the medical industry, medical robots deal with patient lives.
Their safety requirements are, therefore, more rigorous than those for industrial robots. Safety can be ensured using sensors such as positioning, limit, and redundancy sensors. Nevertheless, monitoring the motor encoder as a method for positioning the applicator (the tool in contact with the patient) can be unreliable. This is because the encoder output can be incorrect, and the motor and applicator movements are not necessarily coupled [6].

Received 9 February 2024
Accepted 19 March 2024

These examples highlight the significant role sensors play in enforcing safety measures. The aim of the paper is to review approaches to sensor safety in order to improve the safety of the whole system. The second section discusses sensor faults and their types. The third section describes several analytical redundancy techniques for detecting and identifying faults. The fourth section presents hardware redundancy as an option for fault detection. The fifth section gives conclusions and perspectives for future work.

2 SENSOR FAULTS

According to [7], the ISO 26262 adaptation of the IEC 61508 standard defines functional safety as “The absence of unreasonable risk due to hazards caused by a malfunctioning behaviour of electrical/electronic systems.” These malfunctions can be divided into two groups, namely systematic and random failures of electrical/electronic components. Systematic failures are introduced during production processes and maintenance, whereas random failures relate to hardware that fails due to ageing or random faults. Safety management can handle systematic failures, while different safety mechanisms are needed to detect and control random failures.

When operating with faults, sensors as system components can lower the performance, shut down the process completely, or even cause a fatal accident.
Proper techniques for instrument fault detection and identification (IFDI) can reduce the effects of such faulty sensors [8]. After a faulty sensor is detected, the fault type can be identified.

2.1 Sensor fault types

In [9], six main types of sensor fault are presented: sensor bias, sensor drift, sensor gain fault, sensor noise, short and open circuits, and sensor random faults. According to [10], a complete failure can be added to the list of the main fault types.

2.1.1 Sensor bias

Sensor bias is a constant deviation from the true value that occurs abruptly [11]. This type of fault is difficult to detect and is one of the most common. It is often the result of a bias voltage or a bias current [9].

2.1.2 Sensor drift

Sensor drift is an offset that varies with time and develops gradually [11]. As drift is usually hard to notice at the beginning, early detection is important. A proposed solution is regular calibration of the sensor. Environmental conditions can be one of the causes of drift [9].

2.1.3 Sensor gain fault

The sensor gain fault is a multiplicative fault [10]. It means that the rate of change of the measurement is incorrect: a change in the input is multiplied by a certain positive constant and reported as the output [9].

2.1.4 Sensor noise

Sensors can operate under harsh working conditions and are subject to internal and external noise. Some noise is foreseen, and methods can be applied to reduce it. However, if the noise signal is too strong, it can cause the sensor data to be inaccurate [9].

2.1.5 Short circuits and open circuits

Short circuits and open circuits are hardware failures inside the sensor. They can happen, for example, due to poor contacts inside the sensor or due to environmental conditions. Such failures are mostly easy to detect and are resolved by replacement or repair [9].
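As a minimal sketch, the fault types described so far can be modelled as simple transformations of a fault-free signal. The function name `inject_fault`, its parameters, and the specific models (a linear ramp for drift, Gaussian noise, and an output stuck at a fixed level for a short or open circuit) are illustrative assumptions and are not taken from the cited works.

```python
import random

# Hypothetical labels for the fault types of Sections 2.1.1 to 2.1.5.
FAULT_KINDS = ("bias", "drift", "gain", "noise", "stuck")


def inject_fault(true_signal, kind, start, magnitude, seed=0):
    """Return a copy of true_signal with a simulated fault from index `start` on.

    The fault models are simplifying assumptions made for illustration only.
    """
    if kind not in FAULT_KINDS:
        raise ValueError(f"unknown fault kind: {kind}")
    rng = random.Random(seed)  # fixed seed keeps the noise case repeatable
    faulty = [float(v) for v in true_signal]
    for k, i in enumerate(range(start, len(faulty))):
        if kind == "bias":
            # Constant offset that appears abruptly.
            faulty[i] += magnitude
        elif kind == "drift":
            # Offset that is zero at onset and grows by `magnitude` per sample
            # (a linear ramp is assumed; real drift may follow another shape).
            faulty[i] += magnitude * k
        elif kind == "gain":
            # Multiplicative fault: the reading is scaled by a constant.
            faulty[i] *= magnitude
        elif kind == "noise":
            # Excessive zero-mean Gaussian noise (assumed distribution).
            faulty[i] += rng.gauss(0.0, magnitude)
        else:
            # "stuck": short or open circuit modelled as a fixed output level.
            faulty[i] = magnitude
    return faulty


# Usage on a short ramp signal; the fault starts at the fourth sample.
true_signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
biased = inject_fault(true_signal, "bias", start=3, magnitude=0.5)
drifted = inject_fault(true_signal, "drift", start=3, magnitude=0.1)
scaled = inject_fault(true_signal, "gain", start=3, magnitude=2.0)
```

Synthetic faults of this kind are one way to exercise a detection method before real fault data is available, although the chosen magnitudes and onset times remain arbitrary.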
2.1.6 Sensor random faults

The impact of environmental conditions can result in random faults as well. Often, the sensor readings return to their expected values once the conditions normalise. These conditions should be considered already during the sensor design to prevent random faults from developing [9].

2.1.7 Complete failure

In a complete failure, the sensor information is lost. The sensor instead outputs a constant value, the noise floor, or a constant value combined with the noise floor. Irrespective of the magnitude of the failure, all true information is lost [10].

2.2 Fault diagnosis and identification (FDI)

FDI can be accomplished both by regular preventive maintenance and by automated techniques. These methods can be divided into two main groups: analytical and hardware redundancy [8]. In [10], sensor validation is explored in the steps of fault detection, fault isolation, fault type identification, and fault quantification.

3 ANALYTICAL REDUNDANCY

Analytical redundancy can be divided into direct (static) and temporal (dynamic) redundancy. In direct redundancy, the components are observed at a single time instant, while in temporal redundancy, the observation is made over a specific time period. Direct redundancy allows sensor fault detection but cannot reveal an actuator fault, as only the sensor outputs are observed. In temporal redundancy, the sensor outputs are observed with respect to the actuator inputs and can, therefore, potentially be used for actuator fault detection [12].

There are several sensor validation techniques [10]. This section focuses on the following methods: the parity space approach, principal component analysis (PCA), independent component analysis (ICA), factor analysis, and minimum mean square error (MMSE) estimation.

3.1 Parity space approach

Essentially, the result of this method is a vector which is either zero or non-zero.
In the event of a fault, the vector is non-zero. Based on this vector, the fault type can be identified as well [13]. The parity space approach is usually suitable for simulation but is highly sensitive to noise.

First, a state space model is set up for the system. It includes known inputs, unknown fault inputs, unknown disturbances, and the initial state. Diagnostic calculations are then made recursively and provide a residual vector. A non-zero residual vector can result from either a fault or noise, so further steps are needed to distinguish between the two. Equations for all steps are provided in [13]. The method can also be used without a state space model if the model can be reconstructed from the given data [13].

3.2 Principal component analysis

Principal component analysis can be used when no model knowledge is available. The input and output data are separated into the model and the residual. This is done using vectors and matrices [13]. The fault identification is again computed using the residuals. Since no model is used at the beginning, some faults may be more difficult to isolate [13].

3.3 Independent component analysis

Compared to PCA, independent component analysis (ICA) is a higher-order statistical method and can provide more meaningful information from given data. By extracting more information with the analysis tools, the performance of the instrument can potentially be increased without changing the system hardware. ICA finds independent sources of observations in a data mixture from unknown sources [14].

Observations of statistically independent sources are put together in a linear system. In ICA, matrix calculations [14] are used to approximate the original sources from this data. The actual number of independent sources can differ from the number of principal components used. Therefore, ICA validation is needed [14].

3.4 Factor analysis

Factor analysis is a statistical technique based on a fault model.
The model comprises separate individual faults and a measurement. The impact of each fault on the measurement is described, as well as its severity. The measurement noise and other inherent variations are also included in the model. The aim is to determine the number of individual faults and to identify and eliminate their causes. The method is explained in [15].

3.5 Minimum mean square error estimation

Another method that does not rely on a model is minimum mean square error (MMSE) estimation. It works directly with the data but, in turn, is less time-efficient. The number of sensors in the system must be sufficient to achieve redundancy, which means it must exceed the number of active modes [16]. The signal of each sensor can then be estimated directly using the remaining sensors of the network. This provides a conditional distribution for each sensor, and further tests are applied for fault detection [17].

4 HARDWARE REDUNDANCY

In the hardware redundancy approach, multiple sensors are used to measure the same value [10]. Redundant sensors are a standard safety mechanism for detecting and replacing a faulty sensor [7]. With a higher number of redundant sensors, sensor repair becomes less urgent [18]. Extra sensors, however, increase the cost of the system [7]. For static (direct) redundancy, the number of sensors must be larger than the number of system states, while dynamic (temporal) redundancy usually requires fewer sensors [10].

4.1 Virtual sensors

As an alternative to multiple physical sensors, virtual sensors can be used to ensure redundancy. In case of a fault, the parallel virtual sensor replaces the real sensor. Furthermore, the observer of an observable system can provide better performance than the sensor itself [7].

First, a model of the observed system is constructed. Then, an observer is designed based on the model.
There are several observer estimation approaches, for example Kalman filters (KF), extended Kalman filters (EKF), and Luenberger observers [8]. Estimation with EKFs is shown in [7].

Next, sensor faults are detected and isolated. This is done by assigning a KF to each sensor and comparing the measured values with their calculated estimates. When the deviation exceeds a threshold, a fault is detected and must be isolated [7].

4.2 Decision by majority

Reference [19] introduces the concept of decision by majority among sensors. It has been used in computers but has not been thoroughly explored for sensors used in control systems. The difficulty in such use is the presence of noise and the distribution of each individual sensor's measurements of the same value. Nevertheless, decision by majority among redundant sensors is suggested as a possible tolerance technique for sensor failures.

Independent redundant sensors provide their measurements, of which the one provided by the majority is used. In the event of a failure of one of the sensors, the faulty measurement is prevented from propagating. The performance of the whole system thus remains at the expected level [19].

For multiple sensor failures, this method is not effective. Even with a single faulty sensor, decision by majority does not always yield a correct value from the remaining sensors [20].

5 CONCLUSION

There are multiple approaches to sensor fault detection and identification. To choose the most appropriate method, the available information about the sensor and its measurements should be considered, i.e. the nature of the data and the existence or absence of a data model. The sensor working environment and fault tolerance should be examined as well. The problem should be analysed thoroughly and mathematically.

Information on some of the techniques may be difficult to find, as some methods are more commonly used than others.
Most papers describe the application of one technique in a specific case study. Steps and equations for conducting an analysis are available, but most papers assume that the method has already been selected. A thorough comparison of the options helps in choosing the optimal technique for a specific case. Applying the right method can improve performance without changing the hardware.

REFERENCES

[1] B. Pilkington, “Functional Safety Standards for Robotic Systems,” AZO Robotics, May 2022. Accessed: Nov. 21, 2023. [Online]. Available: https://www.azorobotics.com/Article.aspx?ArticleID=518
[2] T. Stolte, G. Bagschik, and M. Maurer, “Safety goals and functional safety requirements for actuation systems of automated vehicles,” in 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), IEEE, Nov. 2016. Accessed: Nov. 10, 2023. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/7795910
[3] M. Gerstmair, A. Melzer, A. Onic, and M. Huemer, “On the Safe Road Toward Autonomous Driving: Phase Noise Monitoring in Radar Sensors for Functional Safety Compliance,” IEEE Signal Processing Magazine, Sep. 10, 2019. Accessed: Nov. 10, 2023. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/8827996
[4] L. Gualtieri, E. Rauch, and R. Vidoni, “Emerging research fields in safety and ergonomics in industrial collaborative robotics: A systematic literature review,” Robot Comput Integr Manuf, vol. 67, Feb. 2021. Accessed: Nov. 26, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S073658452030209X
[5] V. C. Kumar, Y. Yang, and T. Schneider, “Optimizing functional safety for industrial robots,” Texas Instruments, Jun. 2022. Accessed: Nov. 21, 2023. [Online]. Available: https://www.ti.com/lit/wp/spry347/spry347.pdf?ts=1699477469720&ref_url=https%253A%252F%252Fwww.google.com%252F
[6] B. Fei, W. S. Ng, S. Chauhan, and C. K. Kwoh, “The safety issues of medical robotics,” Reliab Eng Syst Saf, Aug. 2001. Accessed: Nov. 22, 2023. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0951832001000370
[7] S. Schmidt, J. Oberrath, and P. Mercorelli, “A Sensor Fault Detection Scheme as a Functional Safety Feature for DC-DC Converters,” Sensors, vol. 21, no. 19, Sep. 2021. Accessed: Jan. 03, 2024. [Online]. Available: https://www.mdpi.com/1424-8220/21/19/6516
[8] N. Mehranbod, M. Soroush, and C. Panjapornpon, “A method of sensor fault detection and identification,” J Process Control, vol. 15, no. 3, Apr. 2005. Accessed: Jan. 03, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/abs/pii/S0959152404000812
[9] D. Li, Y. Wang, J. Wang, C. Wang, and Y. Duan, “Recent advances in sensor fault diagnosis: A review,” Sens Actuators A Phys, vol. 309, Jul. 2020. Accessed: Jan. 03, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0924424719308635
[10] J. Kullaa, “Detection, identification, and quantification of sensor fault in a sensor network,” Mech Syst Signal Process, vol. 40, no. 1, Oct. 2013. Accessed: Jan. 03, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0888327013002057#bib1
[11] X. Dai, F. Qin, Z. Gao, K. Pan, and K. Busawon, “Model-based on-line sensor fault detection in Wireless Sensor Actuator Networks,” in 2015 IEEE 13th International Conference on Industrial Informatics (INDIN), Cambridge: IEEE, Jul. 2015. Accessed: Jan. 03, 2024. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/7281794
[12] H. M. Odendaal and T. Jones, “Actuator fault detection and isolation: An optimised parity space approach,” Control Eng Pract, vol. 26, pp. 222–232, May 2014. Accessed: Jan. 09, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0967066114000434#s0025
[13] A. Hagenblad, F. Gustafsson, and I. Klein, “A comparison of two methods for stochastic fault detection: the parity space approach and principal components analysis,” IFAC Proceedings Volumes, vol. 36, no. 16, Sep. 2003. Accessed: Jan. 10, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S147466701734898X
[14] M. Kermit and O. Tomic, “Independent component analysis applied on gas sensor array measurement data,” IEEE Sens J, vol. 3, no. 2, Apr. 2003. Accessed: Jan. 10, 2024. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/1202947
[15] D. W. Apley and J. Shi, “A Factor-Analysis Method for Diagnosing Variability in Multivariate Manufacturing Processes,” Technometrics, vol. 43, no. 1, pp. 84–95, Feb. 2001. Accessed: Jan. 11, 2024. [Online]. Available: https://www.jstor.org/stable/1270860?seq=1
[16] J. Kullaa, “Sensor validation using minimum mean square error estimation,” Mech Syst Signal Process, vol. 24, no. 5, Jul. 2010. Accessed: Jan. 10, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0888327009003847
[17] J. Kullaa, “Distinguishing between sensor fault, structural damage, and environmental or operational effects in structural health monitoring,” Mech Syst Signal Process, vol. 25, no. 8, Nov. 2011. Accessed: Jan. 11, 2024. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0888327011002196
[18] N. E. Elhady and J. Provost, “A Systematic Survey on Sensor Failure Detection and Fault-Tolerance in Ambient Assisted Living,” Sensors, vol. 18, no. 7, Jun. 2018. Accessed: Jan. 09, 2024. [Online]. Available: https://www.mdpi.com/1424-8220/18/7/1991
[19] K. Suyama, “A new type of reliable control system using decision by majority,” Electronics and Communications in Japan (Part II: Electronics), vol. 81, no. 2, pp. 46–54, Feb. 1998. Accessed: Jan. 09, 2024. [Online]. Available: https://onlinelibrary.wiley.com/doi/epdf/10.1002/%28SICI%291520-6432%28199802%2981%3A2%3C46%3A%3AAID-ECJB6%3E3.0.CO%3B2-5
[20] K. Suyama, “Functional safety analysis of reliable control systems using decision by majority,” in Proceedings of the 1999 American Control Conference, San Diego: IEEE, Jun. 1999. Accessed: Nov. 10, 2023. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/782902

Helena Domajnko is a bachelor's degree student of Electronics at the Faculty of Electrical Engineering, University of Ljubljana, Slovenia. She has spent half a year as an Erasmus+ exchange student at the Technische Universität München in Munich, Germany.