Issue 
Security and Safety
Volume 1, 2022



Article Number  2022004  
Number of page(s)  29  
Section  Industrial Control  
DOI  https://doi.org/10.1051/sands/2022004  
Published online  08 August 2022 
A note on diagnosis and performance degradation detection in automatic control systems towards functional safety and cyber security
Institute for Automatic Control and Complex Systems (AKS), University of Duisburg-Essen, Bismarckstr. 81 BB, 47057 Duisburg, Germany
^{*} Corresponding author (email: steven.ding@unidue.de)
Received: 8 February 2022
Revised: 7 March 2022
Accepted: 14 March 2022
This note addresses diagnosis and performance degradation detection issues from an integrated viewpoint of functionality maintenance and cyber security of automatic control systems. It calls for more research attention on three aspects: (i) application of the control and detection unified framework to enhancing the diagnosis capability of feedback control systems, (ii) projection-based fault detection, and complementary and explainable applications of projection- and machine learning-based techniques, and (iii) system performance degradation detection, which is of elemental importance for today’s automatic control systems. Some ideas and conceptual schemes are presented and illustrated by means of examples, serving as convincing arguments for research efforts in these aspects. They would contribute to the future development of capable diagnosis systems for functionally safe and cyber-secure automatic control systems.
Key words: Diagnosis in automatic control systems / Cyber security in industrial cyber physical systems / Unified framework of control and detection / Projection-based diagnosis / Explainable application of ML-methods / Performance degradation detection
Citation: Ding SX. A note on diagnosis and performance degradation detection in automatic control systems towards functional safety and cyber security. Security and Safety 2022; 1: 2022004. https://doi.org/10.1051/sands/2022004
© The Author(s) 2022. Published by EDP Sciences and China Science Publishing & Media Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
In the era of Industry 4.0, automatic control systems as the centrepiece of industrial cyber physical systems (CPSs) are fully equipped with intelligent sensors, actuators and an excellent information infrastructure. It is a logical consequence of ever-increasing demands for system performance and production efficiency that today’s automatic control systems exhibit an extremely high degree of integration, automation and complexity. Maintaining reliable and safe operations of automatic control systems is of elemental importance for optimally managing industrial CPSs over the whole operation life cycle. As an indispensable maintenance functionality, real-time monitoring and diagnosis are widely integrated in automatic control systems and run parallel to the embedded control systems.
In a traditional automatic control system, monitoring and diagnosis were mainly dedicated to maintaining functionalities of sensors and actuators as the key components embedded in the system [1, 2]. As a response to wide networking in modern automatic control systems, monitoring and diagnosis of networked control systems as a whole have received considerable attention as well in recent years [3]. Over the past three decades, numerous capable diagnosis schemes have been developed with various specifications, for instance, detecting abrupt component failures [4], identifying and predicting functionality loss caused by ageing in system components [5, 6], and detecting intermittent faults that depend on system operation conditions [7]. Recently, a new type of malfunction, the so-called cyberattacks on automatic control systems, has drawn attention to the urgent need for developing new monitoring and diagnosis strategies [8–11]. Cyberattacks can not only considerably affect functionalities of sensors and actuators, but also impair communications among the system components and subsystems, which may cause immense damage during system operations [12–15]. In addition, different from technical faults, cyberattacks are artificially created and could be designed by attackers in such a way that they cannot be detected using the existing diagnosis techniques. Such cyberattacks are called stealthy [11]. A further type of cyberattack is the so-called eavesdropping attack. Although such attacks cause neither changes in system dynamics nor performance degradation, they enable an adversary to gain system knowledge which can be used to design, for instance, stealthy attacks. In a nutshell, the management of cyberattacks, besides functionality maintenance, raises cyber security issues in the framework of monitoring and diagnosis in automatic control systems.
The objective of this note is to address monitoring and diagnosis issues from an integrated viewpoint of functionality maintenance and cyber security of automatic control systems. We would like to draw the reader’s attention to the following three aspects:

application of the control and detection unified framework [16] to enhancing the diagnosis capability of feedback control systems,

an alternative technique of detecting faults in dynamic systems towards complementary and explainable applications of model- and machine learning (ML)-based methods to diagnosis, and

system performance degradation detection issues,
which are, to the best of our knowledge, not the current research mainstream in the relevant thematic fields. We will report ideas and research efforts, present conceptual schemes, and illustrate, also by means of examples, why research efforts in these three aspects could contribute to the development of capable monitoring and diagnosis methods towards enhancing functional safety and cyber security of automatic control systems.
This note is motivated by our observations and research experiences in the field of fault diagnosis in technical systems and its industrial applications over the past years. Reviewing publications on fault diagnosis in automatic control systems gives a clear picture of research efforts: they were mainly devoted to the development of fault diagnosis functionality as a separate system running in parallel to the control system. With the increasing complexity of control systems under consideration, from single-loop feedback control systems to networked control systems and recently CPSs, the set of investigated diagnosis issues has been continuously extended, and correspondingly capable but often complicated diagnosis methods have been developed, without paying attention to technical specifications and configurations of controllers embedded in the control system. For instance, successful solutions of detecting the so-called covert, zero dynamics and replay cyberattacks are achieved by extending the well-established observer-based detection scheme with a moving target or an auxiliary system [17–19] or by injecting watermark signals [20–22]. On the other hand, the unified control and detection framework [16] not only highlights the common information basis of control and detection, but also gives a functionalization of a control system, which enables an integrated configuration of control and detection functionality with enhanced diagnosis capacity. Our recent work demonstrates successful applications of the unified framework to uniform detection of covert, zero dynamics and replay cyberattacks without adding additional systems or signals [23].
Thanks to the close relations of observers and controllers, observer-based diagnosis is the most popular technique applied for fault detection in automatic control systems [1, 2]. Observing the recent development in the thematic field of monitoring and diagnosis in industrial systems and processes, it can be clearly identified that ML-based methods form the mainstream of research. A detailed survey of publications on ML-based diagnosis in automatic control systems reveals obvious deficits in making use of system knowledge, which is no doubt available, since most plants, partially or as a whole, are engineering systems. In fact, most ML-based diagnosis methods are, in their core, based on the principle of reconstructing process variables or simply modelling fault-free system operations. Thanks to the learning capacity of ML algorithms, in particular neural networks (NNs), and on the assumption of availability of rich data, ML-methods are potential technical solutions. Nevertheless, such diagnosis solutions could be far from optimal with respect to diagnosis performance, also because diagnostic specifications often are not or could not be integrated into the existing ML algorithms. In comparison, model-based diagnosis methods, especially the observer-based ones, are fully based on the dynamic model of the system under consideration, and pursue optimal diagnosis performance. To approach this objective, advanced methods of control theory serve as major investigation tools. On the other hand, these methods, compared with ML-based ones, are less capable of dealing with a huge amount of data and, above all, lack the learning ability. From these observations, a reasonable question arises: is it possible to efficiently integrate the model- and ML-based diagnosis methods to significantly enhance diagnosis performance? Our recent work on the so-called projection-based fault detection strategy is motivated by this question [24].
The first results showcase that complementary applications of model- and ML-based methods result in enhanced detection performance. The proposed projection-based fault detection method not only provides us with an alternative and more capable model-based solution than the observer-based ones, but also leads to explainable applications of ML-based methods.
It can be well observed that the major attention of the existing diagnosis methods has been dedicated to faults in hardware components of automatic control systems like sensors and actuators. We call those corresponding diagnosis methods component-oriented diagnosis (COD). In the recent decade, considerable efforts have been made in the automation industry to increase the component reliability and, more recently, to enhance the degree of intelligence of those key system components. Smart sensors and actuators are nowadays state of the art. In addition, the new generation of smart system components are capable of self-diagnosis and self-repair. In an industrial CPS, COD is an issue to be addressed both at the process level and locally. At the system level, due to the extremely high degree of automation and complexity, the system performance is often susceptible to variations of operation and environmental conditions. Moreover, it could considerably suffer not only from faults in subsystems, but also from, for example, mismatching of coupled and networked control loops and controller parameters, interferences in the system information infrastructure, and cyberattacks as well. This calls for research endeavour to develop new strategies of monitoring and detecting performance degradations, called performance-oriented diagnosis (POD) [25].
The remainder of this note consists of three main sections, respectively dedicated to the three topics, (i) the unified control and detection framework towards enhancing the diagnosis capability of feedback control systems, (ii) projection-based detection of faults in dynamic systems and complementary, explainable applications of model- and ML-based methods, and (iii) study on POD issues. We would like to emphasize that the main intention of this note is to report ideas, research efforts, and conceptual schemes for the development of capable monitoring and diagnosis methods towards enhancing functional safety and cyber security of automatic control systems. So far, no comparison study or survey of relevant publications is included. Concerning related issues, only representative works will be cited if needed. In order to keep the descriptions easily understandable, we avoid rigorous control theoretical and mathematical formulations when there is no risk of misleading interpretation or confusion.
2. Unified control and detection framework towards enhancing the diagnosis capability of feedback control systems
As the methodological basis of our subsequent discussion, we first introduce the unified framework of control and detection. On this basis, we present functionalization of a control system and its applications for enhancing the diagnosis capability of feedback control systems.
Throughout this note, standard notations known in linear algebra and advanced control theory are adopted. In addition, ℛℋ_{∞} is used to denote the set of all stable systems. In the context of cyberattacks, when signal ξ is attacked, it is denoted by ξ^{a}, and the corresponding (injected) attack signal by a_{ξ}, i.e. ξ^{a} = ξ + a_{ξ}.
2.1. System representations and controller parameterization
2.1.1. System factorizations, observer-based residual generation, and signal subspaces
In automatic control engineering, transfer functions are a standard model form for system input–output dynamics, which is written as
$$\begin{array}{c}\hfill y(z)=G(z)u(z),\phantom{\rule{0.166667em}{0ex}}y(z)\in {\mathcal{C}}^{m},\phantom{\rule{0.166667em}{0ex}}u(z)\in {\mathcal{C}}^{p}\end{array}$$(1)
with u and y as the plant input and output vectors, respectively. It is assumed that G(z) is a proper real-rational matrix and its minimal state space realization is given by the following discrete-time linear time invariant (LTI) system,
$$\begin{array}{cc}\hfill x(k+1)& =Ax(k)+Bu(k),\phantom{\rule{0.166667em}{0ex}}x(0)={x}_{0},\hfill \\ \hfill y(k)& =Cx(k)+Du(k),\hfill \end{array}$$(2,3)
where x ∈ ℛ^{n} is the state vector and x_{0} is the initial condition of the system. Matrices A, B, C, D are appropriately dimensioned real constant matrices. By means of the well-established coprime factorization, G(z) can be further factorized as
$$\begin{array}{c}G(z)={\widehat{M}}^{-1}(z)\widehat{N}(z)=N(z){M}^{-1}(z)\end{array}$$(4)
with $(\widehat{M}(z),\widehat{N}(z))$ and (M(z),N(z)) as left and right coprime pairs (LCP and RCP), which lead to alternative system representations,
$$\begin{array}{c}{r}_{y}(z):={K}_{G}(z)\left[\begin{array}{c}u(z)\\ y(z)\end{array}\right],\quad {K}_{G}(z)=\left[\begin{array}{cc}-\widehat{N}(z)& \widehat{M}(z)\end{array}\right],\quad {r}_{y}(z)=0,\\ \left[\begin{array}{c}u(z)\\ y(z)\end{array}\right]={I}_{G}(z)v(z),\quad {I}_{G}(z)=\left[\begin{array}{c}M(z)\\ N(z)\end{array}\right]\end{array}$$(5, 6)
for some signal v(z). Their state space realizations are given, respectively, by
$$\begin{array}{rl}\widehat{x}(k+1)& =(A-LC)\widehat{x}(k)+(B-LD)u(k)+Ly(k),\\ \widehat{y}(k)& =C\widehat{x}(k)+Du(k),\quad {r}_{y}(k)=y(k)-\widehat{y}(k),\\ x(k+1)& =(A+BF)x(k)+Bv(k),\\ \left[\begin{array}{c}u(k)\\ y(k)\end{array}\right]& =\left[\begin{array}{c}F\\ C+DF\end{array}\right]x(k)+\left[\begin{array}{c}I\\ D\end{array}\right]v(k).\end{array}$$(7,8,9,10)
System (7) is a state observer and builds, together with (8) (equivalently with (5)), an observer-based residual generator with residual vector r_{y} as its output. If $\widehat{x}(0)\ne {x}_{0}$ or there exist uncertainties in the system, r_{y}(k) will deviate from zero. In other words, r_{y}(k) is an indicator for uncertainties in the system. In system (9)–(10), the input vector u(k)=Fx(k)+v(k) can be interpreted as a state feedback controller with v as reference signal. Corresponding to these interpretations, matrices F and L are called state feedback gain and observer gain matrices and are selected such that A + BF and A − LC are Schur matrices. Systems K_{G} in (5) and I_{G} in (6) are also called stable kernel and image representations (SKR and SIR) of system (1).
Remark 1 Hereafter, we may drop the domain variable z or k when there is no risk of confusion.
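To illustrate the residual generator (7)–(8), the following Python sketch simulates the plant (2)–(3) together with the observer and records r_{y}. The second-order system matrices, the observer gain L, the random input and the additive sensor fault are illustrative assumptions and are not taken from this note; the function name simulate_residual is hypothetical.

```python
import numpy as np

# Illustrative second-order plant (assumed values, not from the note).
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
L = np.array([[0.5], [0.1]])  # A - L C has eigenvalues 0.5 and 0.6 (Schur)


def simulate_residual(steps=200, fault_at=100, fault_size=0.5, seed=0):
    """Run plant (2)-(3) and observer (7)-(8); return the residual r_y(k)."""
    rng = np.random.default_rng(seed)
    x = np.array([[1.0], [-1.0]])  # true initial state x_0
    x_hat = np.zeros((2, 1))       # observer starts with a wrong estimate
    r = np.zeros(steps)
    for k in range(steps):
        u = rng.standard_normal((1, 1))
        y = C @ x + D @ u
        if k >= fault_at:
            y = y + fault_size     # additive sensor fault (assumed fault model)
        y_hat = C @ x_hat + D @ u
        r[k] = (y - y_hat).item()
        # Observer update (7) and plant update (2).
        x_hat = (A - L @ C) @ x_hat + (B - L @ D) @ u + L @ y
        x = A @ x + B @ u
    return r


r = simulate_residual()
```

In this sketch the residual first decays because of the initial estimation error, stays at zero in fault-free operation, and deviates from zero once the sensor fault is injected.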
SKR and SIR are two alternative representations of dynamic systems, based on which the following definitions of kernel and image subspaces are introduced [26].
Definition. Given the model (1) and the corresponding LCP and RCP, $(\widehat{M},\widehat{N})$ and (M,N), the subspaces 𝒦_{G} and ℐ_{G} defined by
$$\begin{array}{rl}{\mathcal{K}}_{G}& =\left\{\left[\begin{array}{c}u\\ y\end{array}\right]:\left[\begin{array}{cc}-\widehat{N}& \widehat{M}\end{array}\right]\left[\begin{array}{c}u\\ y\end{array}\right]=0\right\},\\ {\mathcal{I}}_{G}& =\left\{\left[\begin{array}{c}u\\ y\end{array}\right]:\left[\begin{array}{c}u\\ y\end{array}\right]=\left[\begin{array}{c}M\\ N\end{array}\right]v\right\}\end{array}$$(11, 12)
are called the kernel and image subspaces of G, respectively.
It is evident that 𝒦_{G} and ℐ_{G} are subspaces in the (m+p)-dimensional data space and have the following properties:

𝒦_{G} = ℐ_{G},

ℐ_{G} is uniquely generated by the p-dimensional signal v, and thus

vector v can be understood as a latent (hidden) variable.
These properties enable applications of the projection-based technique to deal with fault diagnosis issues and hence build a bridge between the model- and ML-based methods. This promises the development of more efficient and capable methods for fault diagnosis, performance degradation monitoring and detection of cyberattacks, as will be discussed in the remainder of this note.
It follows from the definition of coprime factorization that there exist an RCP and an LCP, $(\widehat{X},\widehat{Y})$ and (X,Y), such that the so-called Bezout identity holds [26, 27],
$$\begin{array}{c}\left[\begin{array}{cc}X(z)& Y(z)\\ -\widehat{N}(z)& \widehat{M}(z)\end{array}\right]\left[\begin{array}{cc}M(z)& -\widehat{Y}(z)\\ N(z)& \widehat{X}(z)\end{array}\right]=\left[\begin{array}{cc}I& 0\\ 0& I\end{array}\right].\end{array}$$(13)
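The property 𝒦_{G} = ℐ_{G} can be checked numerically. The sketch below generates data (u,y) from the image representation (9)–(10) and passes them through the kernel representation (7)–(8); for data in the image subspace the residual is zero, whereas data moved out of the subspace give a non-zero residual. The system matrices, the gains F and L, the output offset and the function names are illustrative assumptions.

```python
import numpy as np

# Illustrative second-order plant and gains (assumed values, not from the note).
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
F = np.array([[-0.1, -0.2]])  # A + B F is Schur
L = np.array([[0.5], [0.1]])  # A - L C is Schur


def image_data(v):
    """Generate (u, y) from the image representation (9)-(10), x(0) = 0."""
    x = np.zeros((2, 1))
    u_seq, y_seq = [], []
    for vk in v:
        vk = np.array([[vk]])
        u_seq.append(F @ x + vk)
        y_seq.append((C + D @ F) @ x + D @ vk)
        x = (A + B @ F) @ x + B @ vk
    return u_seq, y_seq


def kernel_residual(u_seq, y_seq):
    """Feed (u, y) into the kernel representation (7)-(8), x_hat(0) = 0."""
    x_hat = np.zeros((2, 1))
    r = []
    for u, y in zip(u_seq, y_seq):
        r.append((y - C @ x_hat - D @ u).item())
        x_hat = (A - L @ C) @ x_hat + (B - L @ D) @ u + L @ y
    return np.array(r)


rng = np.random.default_rng(1)
u_seq, y_seq = image_data(rng.standard_normal(100))  # latent variable v
r_in_subspace = kernel_residual(u_seq, y_seq)

# Data pushed out of the image subspace by a constant output offset (assumed).
y_off = [y + 0.1 for y in y_seq]
r_off_subspace = kernel_residual(u_seq, y_off)
```

Here the random sequence v plays the role of the latent variable: it determines (u,y) completely, and the kernel representation maps every such pair to zero.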
It is of considerable interest to note their special state space realizations as controllers, i.e. an observer-based state feedback controller and its input–output dynamics [16],
$$\begin{array}{rl}\widehat{x}(k+1)& =(A+BF)\widehat{x}(k)+L{r}_{y}(k),\\ \left[\begin{array}{c}u(k)\\ y(k)\end{array}\right]& =\left[\begin{array}{c}F\\ C+DF\end{array}\right]\widehat{x}(k)+\left[\begin{array}{c}0\\ I\end{array}\right]{r}_{y}(k)\\ \Longleftrightarrow \left[\begin{array}{c}u(z)\\ y(z)\end{array}\right]& =\left[\begin{array}{c}-\widehat{Y}(z)\\ \widehat{X}(z)\end{array}\right]{r}_{y}(z),\end{array}$$
as well as an observer-based state feedback controller and a closed-loop “residual generator”,
$$\begin{array}{rl}\widehat{x}(k+1)& =(A-LC)\widehat{x}(k)+(B-LD)u(k)+Ly(k),\\ v(k)& =u(k)-F\widehat{x}(k)\\ \Longleftrightarrow v(z)& =X(z)u(z)+Y(z)y(z).\end{array}$$
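As a numerical sanity check of the Bezout identity (13), the sketch below evaluates the eight transfer matrices at a given point z and multiplies the two block matrices. The transfer matrices are read off from the state space realizations above, with the sign convention that the lower-left block is −N̂ and the upper-right block is −Ŷ. The example system, the gains and the helper names tf and bezout_product are illustrative assumptions.

```python
import numpy as np

# Illustrative second-order plant and gains (assumed values, not from the note).
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
F = np.array([[-0.1, -0.2]])  # A + B F is Schur
L = np.array([[0.5], [0.1]])  # A - L C is Schur
p, m = 1, 1                   # numbers of inputs and outputs
A_F, A_L = A + B @ F, A - L @ C


def tf(a, b, c, d, z):
    """Evaluate d + c (zI - a)^{-1} b at the (complex) point z."""
    return d + c @ np.linalg.solve(z * np.eye(a.shape[0]) - a, b)


def bezout_product(z):
    """Left-hand side of the Bezout identity (13) evaluated at z."""
    I_p, I_m = np.eye(p), np.eye(m)
    M = tf(A_F, B, F, I_p, z)                # from (9)-(10)
    N = tf(A_F, B, C + D @ F, D, z)
    M_hat = tf(A_L, -L, C, I_m, z)           # from (7)-(8)
    N_hat = tf(A_L, B - L @ D, C, D, z)
    X = tf(A_L, -(B - L @ D), F, I_p, z)     # v = u - F x_hat
    Y = tf(A_L, -L, F, np.zeros((p, m)), z)
    X_hat = tf(A_F, L, C + D @ F, I_m, z)    # y = (C + D F) x_hat + r_y
    Y_hat = tf(A_F, -L, F, np.zeros((p, m)), z)  # u = F x_hat = -Y_hat r_y
    left = np.block([[X, Y], [-N_hat, M_hat]])
    right = np.block([[M, -Y_hat], [N, X_hat]])
    return left @ right


product = bezout_product(np.exp(0.3j))  # a point on the unit circle
```

For this example the product agrees with the identity matrix up to rounding at every tested point, which is what (13) requires.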
2.1.2. Parameterization of stabilising controllers and basics of the unified control and detection framework
It is a wellknown result that, given plant model (1), all stabilizing controllers are parameterized by
$$\begin{array}{rl}K(z)& ={\widehat{V}}^{-1}(z)\widehat{U}(z)=U(z){V}^{-1}(z),\\ \widehat{V}(z)& =X(z)-Q(z)\widehat{N}(z),\quad \widehat{U}(z)=-Y(z)-Q(z)\widehat{M}(z),\\ V(z)& =\widehat{X}(z)-N(z)Q(z),\quad U(z)=-\widehat{Y}(z)-M(z)Q(z),\end{array}$$(14, 15, 16)
with the parameter system Q(z)∈ℛℋ_{∞}, where the RCPs and LCPs (M,N), $(\widehat{X},\widehat{Y})$ and $(\widehat{M},\widehat{N})$, (X,Y) are given before and satisfy the Bezout identity (13). The parameterization expression (14)–(15) is called Youla parameterization [27]. It follows from (5)–(6) and the Bezout identity [16, 28] that any (stabilizing) output feedback controller,
$$\begin{array}{c}\hfill u(z)=K(z)y(z)+v(z),\end{array}$$(17)
with v(z) being the reference signal can be equivalently written as
$$\begin{array}{c}u(z)=F\widehat{x}(z)-Q(z){r}_{y}(z)+\overline{v}(z),\quad \overline{v}(z)=\widehat{V}(z)v(z),\end{array}$$(18)
where $\widehat{x}$ is the state estimate delivered by the observer (7). In other words, any output feedback controller is an observer-based controller and driven by the residual signal r_{y}. In [16], a further parameterization form of all stabilizing controllers,
$$\begin{array}{c}\hfill u(z)={K}_{0}(z)y(z)+{Q}_{0}(z){r}_{y}(z)+\overline{v}(z),{Q}_{0}(z)\in {\mathcal{RH}}_{\infty},\end{array}$$(19)
is introduced, where K_{0} is an output stabilizing controller, and Q_{0} denotes the parameterization system. Consequently, also those widely used industrial controllers like PI controllers can be written in the form of (19), as far as they stabilize the control loops.
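The step from the output feedback form (17) to the residual-driven form (18) can be made explicit. The following short derivation is a sketch under the sign convention $\widehat{V}=X-Q\widehat{N}$, $\widehat{U}=-Y-Q\widehat{M}$ and $K={\widehat{V}}^{-1}\widehat{U}$, which is assumed here. Multiplying (17) from the left by $\widehat{V}$ gives

$$\widehat{V}(z)u(z)-\widehat{U}(z)y(z)=\widehat{V}(z)v(z)=\overline{v}(z).$$

Substituting $\widehat{V}$ and $\widehat{U}$ and collecting the terms in Q yields

$$\left(X(z)u(z)+Y(z)y(z)\right)+Q(z)\left(\widehat{M}(z)y(z)-\widehat{N}(z)u(z)\right)=\overline{v}(z).$$

Since $Xu+Yy=u-F\widehat{x}$ by the realization of (X,Y), and $\widehat{M}y-\widehat{N}u={r}_{y}$ by (5), it follows that

$$u(z)=F\widehat{x}(z)-Q(z){r}_{y}(z)+\overline{v}(z),$$

which is (18).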
Figure 1. Feedback control loop under consideration 
2.2. Mapping from the signal space to residual space
Consider the feedback control loop sketched in Figure 1 with the plant model (1) and controller (17). It turns out,
$$\begin{array}{rl}\left[\begin{array}{c}\overline{v}(z)\\ {r}_{y}(z)\end{array}\right]& =\left[\begin{array}{cc}\widehat{V}(z)& -\widehat{U}(z)\\ -\widehat{N}(z)& \widehat{M}(z)\end{array}\right]\left[\begin{array}{c}u(z)\\ y(z)\end{array}\right]\Longleftrightarrow \\ \left[\begin{array}{c}u(z)\\ y(z)\end{array}\right]& =\left[\begin{array}{c}M(z)\\ N(z)\end{array}\right]\overline{v}(z)+\left[\begin{array}{c}U(z)\\ V(z)\end{array}\right]{r}_{y}(z).\end{array}$$(20, 21)
From (21), it is obvious that the system signal pair (u,y) consists of two terms: the first one reflects the feedforward control and the second one the response to the feedback control driven by the residual signal. Denoting uncertainties related to the controller by r_{u}, which may, for instance, be caused by attacks on actuators such as the injection of an unknown signal, we have
$$\begin{array}{c}\left[\begin{array}{c}{r}_{u}(z)\\ {r}_{y}(z)\end{array}\right]=\left[\begin{array}{cc}\widehat{V}(z)& -\widehat{U}(z)\\ -\widehat{N}(z)& \widehat{M}(z)\end{array}\right]\left[\begin{array}{c}u(z)\\ y(z)\end{array}\right]-\left[\begin{array}{c}\overline{v}(z)\\ 0\end{array}\right].\end{array}$$(22)
Relation (22) gives a one-to-one mapping between the signal pairs (u,y) and (r_{u},r_{y}) (for given $\overline{v}$). While (u,y) are the system measurement variables and represent the system dynamics, (r_{u},r_{y}) build an information (residual) space and act as indicators for uncertainties in the system, including not only disturbances and parameter variations, but also faults and cyberattacks when present. Hence, (22) can serve as a residual generator for detecting faults, performance degradation and cyberattacks. Recall that the core of feedback control is residual-driven. That implies that the feedback of residuals is sufficient for the control purpose. In this context, system (22) can be interpreted as an encoder that delivers the residuals (r_{u},r_{y}) as code. It is noteworthy that, on the one hand, an identification of the system dynamics by means of the code (r_{u},r_{y}) is generally impossible, and on the other hand, the cyberattacks can be identified using the residual pair (r_{u},r_{y}) under certain conditions [23].
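The one-to-one character of the mapping (22) can be illustrated at a single frequency point. The sketch below encodes a signal pair (u,y) into (r_{u},r_{y}) and decodes it again via (21). It assumes the sign convention $\widehat{V}=X-Q\widehat{N}$, $\widehat{U}=-Y-Q\widehat{M}$, $V=\widehat{X}-NQ$, $U=-\widehat{Y}-MQ$, a static parameter Q, and an illustrative second-order plant; the function names factors, encode and decode are hypothetical.

```python
import numpy as np

# Illustrative plant, gains and static Youla parameter (assumed values).
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
F = np.array([[-0.1, -0.2]])
L = np.array([[0.5], [0.1]])
Q = np.array([[0.3]])
A_F, A_L = A + B @ F, A - L @ C
I1, Z1 = np.eye(1), np.zeros((1, 1))


def tf(a, b, c, d, z):
    """Evaluate d + c (zI - a)^{-1} b at the (complex) point z."""
    return d + c @ np.linalg.solve(z * np.eye(a.shape[0]) - a, b)


def factors(z):
    """Coprime factors and Youla-parameterized controller factors at z."""
    M, N = tf(A_F, B, F, I1, z), tf(A_F, B, C + D @ F, D, z)
    M_hat, N_hat = tf(A_L, -L, C, I1, z), tf(A_L, B - L @ D, C, D, z)
    X, Y = tf(A_L, -(B - L @ D), F, I1, z), tf(A_L, -L, F, Z1, z)
    X_hat, Y_hat = tf(A_F, L, C + D @ F, I1, z), tf(A_F, -L, F, Z1, z)
    V_hat, U_hat = X - Q @ N_hat, -Y - Q @ M_hat
    V, U = X_hat - N @ Q, -Y_hat - M @ Q
    return M, N, M_hat, N_hat, V_hat, U_hat, V, U


def encode(z, u, y, v_bar):
    """Residual pair (r_u, r_y) according to (22)."""
    _, _, M_hat, N_hat, V_hat, U_hat, _, _ = factors(z)
    return V_hat @ u - U_hat @ y - v_bar, -N_hat @ u + M_hat @ y


def decode(z, r_u, r_y, v_bar):
    """Recover (u, y) from the residual pair by means of (21)."""
    M, N, _, _, _, _, V, U = factors(z)
    return M @ (v_bar + r_u) + U @ r_y, N @ (v_bar + r_u) + V @ r_y


rng = np.random.default_rng(2)
z0 = np.exp(0.7j)
u0, y0, v0 = (rng.standard_normal((1, 1)) + 1j * rng.standard_normal((1, 1))
              for _ in range(3))
r_u0, r_y0 = encode(z0, u0, y0, v0)
u1, y1 = decode(z0, r_u0, r_y0, v0)
```

The decoding step needs the factors M, N, U and V, that is, knowledge of the plant model and of Q; the code (r_{u},r_{y}) alone does not reveal (u,y).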
2.3. Functionalization of all stabilizing feedback controllers
In light of the observer-based realization of stabilizing controllers given in (18), a feedback controller can be divided into several functional modules [16]:

an observer and an observer-based residual generator, as given in (7)–(8), which serve as an information provider for the controller and diagnostic system, and deliver a state estimation, $\widehat{x},$ as well as the primary residual, ${r}_{y}=y-\widehat{y},$

the control law,
$$\begin{array}{c}u(z)=F\widehat{x}(z)-Q(z){r}_{y}(z)+\widehat{V}(z)v(z),\end{array}$$
including a feedback controller, $F\widehat{x}-Q{r}_{y},$ and a feedforward controller, $\widehat{V}v,$ and in addition,

for the detection purpose, a detector R(z)r_{y}(z) with R(z) as a stable postfilter.
This modular structure provides us with a clear parameterization of the functional modules: the state observer is parameterized by the observer gain L, the feedback controller by F, Q, the feedforward controller by $\widehat{V},$ and the detector by R. Although all five parameters are available for the design and online optimization objectives, they have evidently different functionalities, as summarized below [16]:

state feedback and observer gains determine the stability and eigendynamics of the closedloop,

R and $\widehat{V}$ have no influence on the system stability; R serves for the optimization of the detectability, while $\widehat{V}$ serves for the tracking behavior, and

Q is used to enhance the system robustness and control performance. The design and update of Q will have influence on the system dynamics and stability, when parameter uncertainties or degradations are present in the system.
It is evident that the above five parameters have to be, due to their different functionalities, treated with different priorities. Recall that system stability and eigendynamics are the fundamental requirement on an automatic control system. This requires that the system stability should be guaranteed, also in case of cyberattacks. Differently, Q, R and $\widehat{V}$ are used to optimize control or detection performance. In case that a temporary system performance degradation is tolerable, the realtime demand and the priority for an online optimization of $Q,R,\widehat{V}$ are relatively low.
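The modular structure can be sketched in a few lines of Python. The closed loop below combines the three modules: observer-based residual generation, the control law with feedback part $F\widehat{x}-Q{r}_{y}$ and feedforward part $\widehat{V}v$, and the detector R r_{y}. For simplicity, Q, $\widehat{V}$ and R are taken as static gains and D = 0 is assumed; these choices, the plant and the function name run_loop are illustrative assumptions rather than the configuration of this note.

```python
import numpy as np

# Illustrative plant with D = 0 and static design parameters (assumed values).
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
F = np.array([[-0.1, -0.2]])  # state feedback gain, A + B F is Schur
L = np.array([[0.5], [0.1]])  # observer gain, A - L C is Schur
V_HAT = np.array([[1.0]])     # feedforward controller (static)
R = np.array([[2.0]])         # post-filter of the detector (static)


def run_loop(q, steps=300, v=1.0):
    """Simulate the closed loop; return plant output and detector output."""
    x = np.array([[1.0], [-1.0]])
    x_hat = np.zeros((2, 1))
    y_log, d_log = [], []
    for _ in range(steps):
        y = C @ x
        r_y = y - C @ x_hat                    # module 1: primary residual
        u = F @ x_hat - q @ r_y + V_HAT * v    # module 2: control law
        d = R @ r_y                            # module 3: detector
        x_hat = (A - L @ C) @ x_hat + B @ u + L @ y
        x = A @ x + B @ u
        y_log.append(y.item())
        d_log.append(d.item())
    return np.array(y_log), np.array(d_log)


y_a, d_a = run_loop(np.array([[0.3]]))
y_b, d_b = run_loop(np.array([[-0.5]]))
```

In this nominal setting the closed-loop eigenvalues are those of A + BF and A − LC for any static Q, so both runs settle at the same output value; Q only shapes the response to the residual, which is consistent with treating F and L with higher priority than Q, R and $\widehat{V}$.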
When an automatic control system is integrated into a CPS, cyber security becomes a critical issue. In this context, the unified framework and the functionalization of controllers offer a useful design tool towards a cyber security-conscious system configuration. To delineate potential applications, consider the controller in its original form and in the observer-based realization form, respectively,
$$\begin{array}{c}\text{controller (17):}\ u(z)=K(z)y(z)+v(z)\quad \text{vs.}\quad \text{controller (18):}\ u(z)=F\widehat{x}(z)-Q(z){r}_{y}(z)+\overline{v}(z),\end{array}$$
and suppose that the plant is networked with a control station (refer to Figure 2 as an example). It is clear that for the implementation of the controller in its original form, i.e. (17), the system data (u,y) should be transmitted over the network in real time. Moreover, for any optimization or degradation recovery effort, controller K(z) should be updated, which may yield unexpected dynamic behaviour. In contrast, for the implementation of the observer-based controller (18), an observer and an observer-based residual generator can be implemented on the plant side. This offers several benefits:

transmission of the residual r_{y} from the plant (local) side to the control station and of $\overline{v}(z)-Q(z){r}_{y}(z)$ from the control station to the plant, which prevents an adversary from gaining system knowledge by means of eavesdropping attacks [23],

when performance optimization or degradation recovery is needed, real-time tuning of Q(z) is an effective way, as reported in [29], which can run in the control station,

updating feedback gain and observer gain matrices, F and L, which will be performed only in very critical operation situations (and thus occasionally) and in the control station. Their transmission to the plant should be well encrypted [30].
As reported in our recent work [23], the modules of the observer-based controller (18) together with the Bezout identity (13) can serve as encoders and decoders distributed at the plant and control station sides. It is noteworthy that the observer-based controller form (18) can be viewed as “control sharing”, which is similar to the secret sharing scheme well-known in cryptography [30]. This additional function enables efficient detection of cyberattacks and enhances the cyber security of automatic control systems, which are, for instance, implemented in the form of cloud-based control [30].
Figure 2. The original configuration of the automatic control system under consideration 
In the following example, we introduce a conceptual configuration of an encrypted control system based on the above controller functionalization.
Example 1 Consider a networked automatic control system schematically sketched in Figure 2. The plant is modelled by (1), equipped with a (local) feedback controller,
$$\begin{array}{c}\hfill {u}_{0}(z)={K}_{0}(z)y(z),\end{array}$$
and networked with a control and monitoring system (CMS). It receives signal $\overline{u}$ from CMS,
$$\begin{array}{c}\hfill \overline{u}(z)=v+Q(z)y(z),\end{array}$$(23)
where v is the reference signal and Q(z)y(z) represents a correction of the control signal, for instance, to recover control performance degradation [16]. A natural procedure to realize the control law (23) is, as shown in Figure 2, as follows: (i) the plant sends the measurement data y to CMS, and (ii) CMS computes $\overline{u}$ and sends it to the plant. Suppose that integrity cyberattacks could be executed on the system I/O interface via the network. Now, we introduce a conceptual reconfiguration of the systems on both network sides, on the basis of the unified control and detection framework, aiming at:

a reliable detection of integrity cyberattacks, and

preventing attackers from gaining system knowledge by means of system identification using the transmitted data $(\overline{u},y).$
Moreover, it is required that the local controller K_{0} should not be changed. For our purpose, consider the control signal,
$$\begin{array}{c}\hfill u(z)={u}_{0}(z)+\overline{u}(z)={K}_{0}(z)y(z)+Q(z)y(z)+v(z).\end{array}$$
Following the functionalization of control systems, u_{0} and u can be equivalently written into
$$\begin{array}{rl}{u}_{0}(z)& ={F}_{0}\widehat{x}(z)-{Q}_{0}(z){r}_{y}(z),\\ u(z)& ={F}_{0}\widehat{x}(z)-{Q}_{1}(z){r}_{y}(z)+\widehat{V}(z)v(z)\end{array}$$
for some Q_{0}(z),Q_{1}(z)∈ℛℋ_{∞}. It turns out
$$\begin{array}{c}\overline{u}(z)=({Q}_{0}(z)-{Q}_{1}(z)){r}_{y}(z)+\widehat{V}(z)v(z).\end{array}$$(24)
Run the following residual generation algorithm on the plant side,
$$\begin{array}{c}\left[\begin{array}{c}{r}_{u}(z)\\ {r}_{y}(z)\end{array}\right]=\left[\begin{array}{cc}X(z)& Y(z)\\ -\widehat{N}(z)& \widehat{M}(z)\end{array}\right]\left[\begin{array}{c}{u}^{a}(z)\\ y(z)\end{array}\right]-\left[\begin{array}{c}{\overline{u}}^{a}(z)-{Q}_{0}(z){r}_{y}(z)\\ 0\end{array}\right],\end{array}$$(25)
where
$$\begin{array}{c}\hfill {u}^{a}(z)=u(z)+{a}_{u}(z)={u}_{0}(z)+{\overline{u}}^{a}(z),{\overline{u}}^{a}(z)=\overline{u}(z)+{a}_{u}(z)\end{array}$$
with a_{u} denoting integrity cyberattacks on the actuators. It yields, recalling (22),
$$\begin{array}{c}{r}_{u}(z)=(X(z)-I){a}_{u}(z).\end{array}$$
Thus, attack a_{u} can be detected. In the attack-free case, r_{y} is sent to the CMS; otherwise, an alarm is triggered. On the CMS side, a detection algorithm is applied to check whether the residual signal received from the plant side is corrupted by an attack signal a_{y}, i.e.
$$\begin{array}{c}\hfill {r}_{y}^{a}(z)={r}_{y}(z)+{a}_{y}(z).\end{array}$$
In case of no attack, $\overline{u}$ computed using algorithm (24) is sent to the plant side. Figure 3 shows the above described control system schematically.
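The relation r_{u} = (X − I)a_{u} can be simulated directly, since X has the state space realization of the closed-loop “residual generator” in Section 2.1.1, so that (X − I)a_{u} is the output −F_{0}ξ of the system ξ(k+1) = (A − LC)ξ(k) + (B − LD)a_{u}(k). The sketch below applies a step attack on the actuator channel and evaluates r_{u} against a threshold. The plant, the gains, the attack signal, the threshold value and the function name attack_residual are illustrative assumptions and not part of Example 1.

```python
import numpy as np

# Illustrative plant and gains (assumed values, not from the note).
A = np.array([[0.9, 0.2], [0.0, 0.7]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
F0 = np.array([[-0.1, -0.2]])  # local state feedback gain
L = np.array([[0.5], [0.1]])   # observer gain, A - L C is Schur


def attack_residual(a_u):
    """Time response r_u = (X - I) a_u for a scalar attack sequence a_u."""
    xi = np.zeros((2, 1))
    r_u = np.zeros(len(a_u))
    for k, a in enumerate(a_u):
        r_u[k] = (-F0 @ xi).item()
        xi = (A - L @ C) @ xi + (B - L @ D) * a
    return r_u


# Step attack injected at k = 50 (assumed attack scenario).
a_u = np.zeros(200)
a_u[50:] = 1.0
r_u = attack_residual(a_u)

threshold = 0.1  # hypothetical detection threshold
alarm_index = int(np.argmax(np.abs(r_u) > threshold))
```

In this sketch the residual stays at zero before the attack and exceeds the threshold one sampling instant after the attack starts, because the realization of X − I has no direct feedthrough.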
Figure 3. Reconfiguration of the automatic control system under consideration 
We would like to summarize the main results of this example as follows:

the proposed control system is capable of reliable attack detection thanks to the use of the residual pair (r_{u},r_{y}),

system (24) and residual generator (25) serve simultaneously as encoders, and

the control system remains stable also in the case of an interrupted communication between the plant and the CMS.
It should moreover be mentioned that the control system located at the plant side runs based only on its local controller parameters, including K_{0}(z), and without knowledge of Q_{1}(z), which is set by the CMS to enhance the control performance.
With the following remarks we would like to conclude this section.

The control and detection unified framework forms a methodical basis for the development of advanced diagnosis methods aiming at maintaining system functionality and enhancing cyber security of automatic control systems. It deals with the implementation of control, detection and monitoring algorithms. In this context, the information infrastructure for the configuration of automatic control systems plays an essential role. For instance, the networked system in Figure 3 could alternatively be configured using a cloud-based system structure, in which the CMS is realized by means of cloud computing.

Although only LTI systems are addressed in this note, an extension of the unified control and detection framework to linear time-varying (LTV) systems is straightforward using the well-established system coprime factorization methods and Youla parameterization of LTV control systems [31]. Concerning nonlinear control systems, corresponding results have been reported in [16, 32, 33].

In our example, the application of the unified framework to the detection of cyber-attacks is illustrated briefly and schematically. The reader is referred to [23] for a more systematic and detailed description of this application. In a nutshell, this work results in the detection of those stealthy cyber-attacks which cannot be detected using the existing observer-based detection methods [34]. These include the so-called covert, zero-dynamics and replay cyber-attacks [8–11].
3. Projection-based diagnosis methods and their ML-aided explainable realization
In this section, we introduce a new framework for fault diagnosis in dynamic control systems. The theoretical foundation of this framework is the alternative system representations SIR and SKR and the associated image and kernel subspaces, as well as the orthogonal projection technique. Although this framework has been developed in the model-based fashion [24], the associated concepts, algorithms and diagnosis approaches can be realized in data-driven form and using ML-based methods.
In this section, the following notation is adopted. ℒ_{2} = ℒ_{2}(−∞,0] ⊕ ℒ_{2}[0,∞) is the time-domain space of all square-summable Lebesgue signals (signals with bounded energy) [35]. For a transfer matrix G(z), G^{*}(z)=G^{T}(z^{−1}). 𝒫_{𝒦} is the orthogonal projection operator onto subspace 𝒦, whose norm is denoted by ‖𝒫_{𝒦}‖. ${\mathcal{P}}_{\mathcal{K}}^{*}$ is the adjoint of 𝒫_{𝒦}. 𝒦^{⊥} represents the orthogonal complement of 𝒦.
3.1. A general framework of projection-based diagnosis methods
3.1.1. Basic idea
The basic idea of (orthogonal) projection-based fault detection is illustrated schematically in Figure 4. Given a system subspace as the nominal system model, which can be represented in model-based form (in terms of SIR or SKR), in data-driven form, or by means of an NN, the measurement vector $\left[\begin{array}{c}u\\ y\end{array}\right]$ is (orthogonally) projected onto the system subspace. The distance between the measurement vector and its projection then indicates whether the measurement vector belongs to the nominal system operation or is faulty. To this end, the following mathematical concepts and work are necessary:

definition and computation of orthogonal projection operator,

computation of the distance $\mathrm{dist}(\left[\begin{array}{c}u\\ y\end{array}\right],\mathcal{K}),$

online realization algorithms towards constructing a fault detection system, and

determination of threshold for decision making.
Figure 4. Schematic description of projection-based classification (${\mathcal{P}}_{\mathcal{K}}\alpha $ denotes the projection of $\alpha $ onto 𝒦)
3.1.2. Orthogonal projection: mathematical preliminaries
An orthogonal projection on a subspace 𝒱, denoted by 𝒫_{𝒱}, in Hilbert space endowed with the inner product,
$$\begin{array}{c}\hfill \langle x,y\rangle =\sum _{k=-\infty}^{\infty}{x}^{T}(k)y(k),\phantom{\rule{0.166667em}{0ex}}x,y\in {\mathcal{L}}_{2},\end{array}$$(26)
is a linear operator satisfying [36]
$$\begin{array}{c}\hfill x,y\in \mathcal{V},\phantom{\rule{0.166667em}{0ex}}{\mathcal{P}}_{\mathcal{V}}^{2}={\mathcal{P}}_{\mathcal{V}},\phantom{\rule{0.166667em}{0ex}}\langle {\mathcal{P}}_{\mathcal{V}}x,y\rangle =\langle x,{\mathcal{P}}_{\mathcal{V}}y\rangle .\end{array}$$(27)
The following well-known properties and definitions of an orthogonal projection are of importance for our subsequent study [36]:

given y ∈ ℒ_{2}, ∀x ∈ 𝒱 ⊂ ℒ_{2},
$$\begin{array}{c}\hfill \langle y-x,y-x\rangle ={\Vert y-x\Vert }_{2}^{2}\ge {\Vert y-{\mathcal{P}}_{\mathcal{V}}y\Vert }_{2}^{2},\end{array}$$(28)

given a closed subspace 𝒱 ⊂ ℒ_{2} and a vector y ∈ ℒ_{2}, the distance between y and 𝒱, dist(y,𝒱), is defined as
$$\begin{array}{c}\hfill \mathrm{dist}(y,\mathcal{V})=\underset{x\in \mathcal{V}}{\inf}{\Vert y-x\Vert }_{2},\end{array}$$
which, following (28), can be computed as
$$\begin{array}{c}\hfill \mathrm{dist}(y,\mathcal{V})={\Vert (\mathcal{I}-{\mathcal{P}}_{\mathcal{V}})y\Vert }_{2}={\Vert {\mathcal{P}}_{{\mathcal{V}}^{\perp}}y\Vert }_{2}.\end{array}$$
Here, ℐ is the unit operator.
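The properties above can be checked numerically on a finite-dimensional stand-in. The following is a minimal sketch, assuming a subspace of ℝ^6 spanned by the columns of a random matrix (the names `basis`, `P` and the dimensions are illustrative assumptions, not taken from the source); it is not the ℒ_{2} setting itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical finite-dimensional stand-in: V is the column space of a 6x2 matrix.
basis = rng.standard_normal((6, 2))

# Orthogonal projector onto V: P = B (B^T B)^{-1} B^T.
P = basis @ np.linalg.solve(basis.T @ basis, basis.T)

y = rng.standard_normal(6)

# Defining properties of an orthogonal projection, cf. (27).
idempotent = np.allclose(P @ P, P)      # P^2 = P
self_adjoint = np.allclose(P, P.T)      # <Px, y> = <x, Py>

# dist(y, V) computed from the projection residual (I - P) y.
dist_proj = np.linalg.norm(y - P @ y)

# The same distance obtained by solving min_c ||y - B c||_2 directly.
coeffs, *_ = np.linalg.lstsq(basis, y, rcond=None)
dist_lstsq = np.linalg.norm(y - basis @ coeffs)

# Any other element of V is at least as far from y as P y, cf. (28).
x_other = basis @ rng.standard_normal(2)
dist_other = np.linalg.norm(y - x_other)
```

Here `dist_proj` and `dist_lstsq` agree up to rounding, which is the finite-dimensional counterpart of computing dist(y,𝒱) via the projection onto the orthogonal complement.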
In order to measure the distance between two (closed) subspaces in Hilbert space, the concept of gap metric is established [36]. Given two closed subspaces 𝒱, 𝒰 ⊂ ℒ_{2}, the gap metric between them is defined by
$$\begin{array}{cc}\hfill \delta (\mathcal{V},\mathcal{U})& =\max\{\overrightarrow{\delta}(\mathcal{V},\mathcal{U}),\overrightarrow{\delta}(\mathcal{U},\mathcal{V})\},\hfill \\ \hfill \overrightarrow{\delta}(\mathcal{V},\mathcal{U})& =\underset{\begin{array}{c}x\in \mathcal{V}\\ {\Vert x\Vert }_{2}=1\end{array}}{\sup}\mathrm{dist}(x,\mathcal{U})=\Vert (\mathcal{I}-{\mathcal{P}}_{\mathcal{U}}){\mathcal{P}}_{\mathcal{V}}\Vert =\underset{x\in \mathcal{V}}{\sup}\underset{y\in \mathcal{U}}{\inf}\frac{{\Vert x-y\Vert }_{2}}{{\Vert x\Vert }_{2}}.\hfill \end{array}$$(29, 30)
Here, $\overrightarrow{\delta}(\mathcal{V},\mathcal{U})$ is called the directed gap. The following properties are well-known [36] and useful for our subsequent investigation:
$$\begin{array}{cc}\hfill 0& \le \delta (\mathcal{V},\mathcal{U})\le 1,\hfill \\ \hfill \text{for}\phantom{\rule{0.333333em}{0ex}}\delta (\mathcal{V},\mathcal{U})& <1,\phantom{\rule{0.333333em}{0ex}}\overrightarrow{\delta}(\mathcal{V},\mathcal{U})=\overrightarrow{\delta}(\mathcal{U},\mathcal{V})=\delta (\mathcal{V},\mathcal{U}).\hfill \end{array}$$
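For finite-dimensional subspaces, the directed gap in (29, 30) reduces to the spectral norm of a matrix product and is easy to compute. The sketch below assumes two subspaces of ℝ^5 given by basis matrices; the helper names `projector`, `directed_gap` and `gap` are hypothetical and only serve the illustration.

```python
import numpy as np

def projector(basis):
    """Orthogonal projector onto the column space of `basis`."""
    q, _ = np.linalg.qr(basis)          # orthonormal basis of the subspace
    return q @ q.T

def directed_gap(basis_v, basis_u):
    """Directed gap ||(I - P_U) P_V|| (spectral norm)."""
    p_v, p_u = projector(basis_v), projector(basis_u)
    return np.linalg.norm((np.eye(p_v.shape[0]) - p_u) @ p_v, 2)

def gap(basis_v, basis_u):
    """Gap metric: the larger of the two directed gaps."""
    return max(directed_gap(basis_v, basis_u), directed_gap(basis_u, basis_v))

rng = np.random.default_rng(1)
v = rng.standard_normal((5, 2))
u = v + 0.1 * rng.standard_normal((5, 2))   # slightly perturbed subspace

g = gap(v, u)
g_vu = directed_gap(v, u)
g_uv = directed_gap(u, v)
```

In this example the two directed gaps coincide and lie in [0, 1], in line with the properties listed above, and `gap(v, v)` is zero up to rounding.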
3.1.3. Orthogonal projection onto image subspace and its system realizations
In our subsequent study on the projection-based fault diagnosis framework, the so-called normalized SKR and SIR play an important role. They are denoted by K_{N} and I_{N} and defined by
$$\begin{array}{cc}\hfill {K}_{N}(z){K}_{N}^{\ast}(z)& ={\widehat{N}}_{0}(z){\widehat{N}}_{0}^{\ast}(z)+{\widehat{M}}_{0}(z){\widehat{M}}_{0}^{\ast}(z)=I,\hfill \\ \hfill {I}_{N}^{\ast}(z){I}_{N}(z)& ={M}_{0}^{\ast}(z){M}_{0}(z)+{N}_{0}^{\ast}(z){N}_{0}(z)=I,\hfill \end{array}$$
where $({\widehat{M}}_{0},{\widehat{N}}_{0})$ and (M_{0},N_{0}) are LCP and RCP with special settings of the observer and state feedback gain matrices using the known algorithms, for example, given in [37]. It is a known result that the orthogonal projection onto the image subspace ℐ_{G} is given by
$$\begin{array}{c}\hfill {p}_{{\mathcal{I}}_{G}}={\mathcal{P}}_{{\mathcal{I}}_{G}}\left[\begin{array}{c}u\\ y\end{array}\right]={I}_{N}{I}_{N}^{\ast}\left[\begin{array}{c}u\\ y\end{array}\right].\end{array}$$(31)
Correspondingly, the difference between $\left[\begin{array}{c}u\\ y\end{array}\right]$ and p_{ℐ_G} is subject to
$$\begin{array}{c}\hfill {r}_{{\mathcal{I}}_{G}}:=\left[\begin{array}{c}u\\ y\end{array}\right]-{p}_{{\mathcal{I}}_{G}}=(\mathcal{I}-{\mathcal{P}}_{{\mathcal{I}}_{G}})\left[\begin{array}{c}u\\ y\end{array}\right]=(I-{I}_{N}{I}_{N}^{\ast})\left[\begin{array}{c}u\\ y\end{array}\right],\end{array}$$(32)
and called projection-based residual. Due to the relation,
$$\begin{array}{c}\hfill {I}_{N}{I}_{N}^{\ast}+{K}_{N}^{\ast}{K}_{N}=I,\end{array}$$
projection-based residual generation (32) can be equivalently written as
$$\begin{array}{c}\hfill {r}_{{\mathcal{I}}_{G}}=(I-{I}_{N}{I}_{N}^{\ast})\left[\begin{array}{c}u\\ y\end{array}\right]={K}_{N}^{\ast}{K}_{N}\left[\begin{array}{c}u\\ y\end{array}\right].\end{array}$$(33)
The ℓ_{2}-norm of r_{ℐ_G},
$$\begin{array}{c}\hfill {\Vert {r}_{{\mathcal{I}}_{G}}\Vert }_{2}=\mathrm{dist}(\left[\begin{array}{c}u\\ y\end{array}\right],\phantom{\rule{0.166667em}{0ex}}{\mathcal{I}}_{G}),\end{array}$$(34)
is the distance from $\left[\begin{array}{c}u\\ y\end{array}\right]$ to ℐ_{G}. Moreover, the fact that K_{N} is a normalized SKR leads to the following implementation form of the residual vector,
$$\begin{array}{c}\hfill {\Vert {r}_{{\mathcal{I}}_{G}}\Vert }_{2}={\Vert {K}_{N}^{\ast}{K}_{N}\left[\begin{array}{c}u\\ y\end{array}\right]\Vert }_{2}={\Vert {K}_{N}\left[\begin{array}{c}u\\ y\end{array}\right]\Vert }_{2}={\Vert {r}_{y}\Vert }_{2}.\end{array}$$(35)
That means, for the detection purpose with the residual evaluation function ${\Vert {r}_{{\mathcal{I}}_{G}}\Vert }_{2}$, the needed online computation is the observer-based residual generator (7)–(8) or, equivalently, the SKR (5) with the observer gain setting for a normalized SKR.
Next, on the assumption that the system dynamics with uncertainty is described by
$$\begin{array}{cc}\hfill G=N{M}^{-1}& =({N}_{0}+{\mathrm{\Delta}}_{N}){({M}_{0}+{\mathrm{\Delta}}_{M})}^{-1}={\widehat{M}}^{-1}\widehat{N}={({\widehat{M}}_{0}+{\mathrm{\Delta}}_{\widehat{M}})}^{-1}({\widehat{N}}_{0}+{\mathrm{\Delta}}_{\widehat{N}}),\hfill \\ \hfill {I}_{G}& =\left[\begin{array}{c}M\\ N\end{array}\right]=\left[\begin{array}{c}{M}_{0}+{\mathrm{\Delta}}_{M}\\ {N}_{0}+{\mathrm{\Delta}}_{N}\end{array}\right]={I}_{N}+{\mathrm{\Delta}}_{I},\hfill \\ \hfill {K}_{G}& =\left[\begin{array}{cc}-\widehat{N}& \widehat{M}\end{array}\right]=\left[\begin{array}{cc}-{\widehat{N}}_{0}-{\mathrm{\Delta}}_{\widehat{N}}& {\widehat{M}}_{0}+{\mathrm{\Delta}}_{\widehat{M}}\end{array}\right]={K}_{N}+{\mathrm{\Delta}}_{K},\hfill \\ \hfill & \sup{\Vert {\mathrm{\Delta}}_{I}\Vert }_{\infty}={\delta}_{{\mathrm{\Delta}}_{I}}<1,\phantom{\rule{0.333333em}{0ex}}\sup{\Vert {\mathrm{\Delta}}_{K}\Vert }_{\infty}={\delta}_{{\mathrm{\Delta}}_{K}}<1,\hfill \end{array}$$(36)
the threshold is to be determined. Considering that the idea of setting a threshold is to avoid false alarms caused by model uncertainty during fault-free operations, a basic requirement on the threshold is that
$$\begin{array}{cc}\hfill \forall \left[\begin{array}{c}u\\ y\end{array}\right]& \in {\mathcal{I}}_{G},\phantom{\rule{0.333333em}{0ex}}{J}_{\mathit{th}}(u,y)=\underset{{\Vert {\mathrm{\Delta}}_{I}\Vert }_{\infty}\le {\delta}_{{\mathrm{\Delta}}_{I}}}{\sup}{\Vert {r}_{{\mathcal{I}}_{G}}\Vert }_{2},\hfill \\ \hfill {\mathcal{I}}_{G}& =\{\left[\begin{array}{c}u\\ y\end{array}\right]:\left[\begin{array}{c}u\\ y\end{array}\right]=\left[\begin{array}{c}M\\ N\end{array}\right]v\},\hfill \end{array}$$(37)
which is obviously different from ℐ_{G_0},
$$\begin{array}{c}\hfill {\mathcal{I}}_{{G}_{0}}=\{\left[\begin{array}{c}u\\ y\end{array}\right]:\left[\begin{array}{c}u\\ y\end{array}\right]=\left[\begin{array}{c}{M}_{0}\\ {N}_{0}\end{array}\right]v\}.\end{array}$$
In [24], it is proved that the threshold setting problem (37) is equivalent to
$$\begin{array}{c}\hfill {J}_{\mathit{th}}(u,y)=\underset{{\Vert {\mathrm{\Delta}}_{I}\Vert }_{\infty}\le {\delta}_{{\mathrm{\Delta}}_{I}}}{\sup}{\Vert {r}_{{\mathcal{I}}_{G}}\Vert }_{2}=\underset{\left[\begin{array}{c}u\\ y\end{array}\right]\in {\mathcal{I}}_{G}}{\sup}\delta ({\mathcal{I}}_{G},{\mathcal{I}}_{{G}_{0}}){\Vert \left[\begin{array}{c}u\\ y\end{array}\right]\Vert }_{2}\end{array}$$
with δ(ℐ_{G},ℐ_{G_0}) denoting the gap metric between ℐ_{G_0} and ℐ_{G}. It leads to
$$\begin{array}{c}{J}_{\mathit{th}}(u,y)={\delta}_{{\mathrm{\Delta}}_{K}}{\left({\Vert {r}_{y}\Vert }_{2}^{2}+{\Vert {\mathcal{P}}_{{\mathcal{I}}_{G}}\left[\begin{array}{c}u\\ y\end{array}\right]\Vert }_{2}^{2}\right)}^{1/2}\Longleftrightarrow \\ {J}_{\mathit{th}}(u,y)=\frac{{\delta}_{{\mathrm{\Delta}}_{K}}}{\sqrt{1-{\delta}_{{\mathrm{\Delta}}_{K}}^{2}}}{\Vert {\mathcal{P}}_{{\mathcal{I}}_{G}}\left[\begin{array}{c}u\\ y\end{array}\right]\Vert }_{2}=\frac{{\delta}_{{\mathrm{\Delta}}_{K}}}{\sqrt{1-{\delta}_{{\mathrm{\Delta}}_{K}}^{2}}}{\left({\Vert \left[\begin{array}{c}u\\ y\end{array}\right]\Vert }_{2}^{2}-{\Vert {r}_{y}\Vert }_{2}^{2}\right)}^{1/2}.\end{array}$$(38)
Compared with the well-established threshold setting for observer-based fault detection schemes [38], threshold (38) has the significant advantage that it is considerably robust against uncertainties and sensitive to faulty operations. In fact, this point becomes more apparent when the threshold and the residual are normalized as follows:
$$\begin{array}{cc}\hfill {J}_{th,N}(u,y)& =\frac{{J}_{\mathit{th}}(u,y)}{{\Vert \left[\begin{array}{c}u\\ y\end{array}\right]\Vert }_{2}}=\frac{{\delta}_{{\mathrm{\Delta}}_{K}}}{\sqrt{1-{\delta}_{{\mathrm{\Delta}}_{K}}^{2}}}{(1-{\Vert {r}_{y,N}\Vert }_{2}^{2})}^{1/2},\hfill \\ \hfill {r}_{y,N}& =\frac{1}{{\Vert \left[\begin{array}{c}u\\ y\end{array}\right]\Vert }_{2}}{r}_{y}.\hfill \end{array}$$
It can be seen that the threshold J_{th, N}(u,y) reaches its maximal value during fault-free operations, and becomes smaller when the system is in faulty operation. In this way, the robustness and fault detectability are remarkably enhanced.
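The behaviour of the normalized threshold can be made concrete with a few numbers. The sketch below evaluates J_{th,N} for increasing normalized residual levels under an assumed uncertainty bound δ_{Δ_K} = 0.2; the function names and the numerical values are illustrative assumptions only.

```python
import numpy as np

def normalized_threshold(r_y_norm, signal_norm, delta):
    """J_th,N = delta / sqrt(1 - delta^2) * sqrt(1 - ||r_{y,N}||^2)."""
    r_n = r_y_norm / signal_norm                 # ||r_{y,N}||_2
    return delta / np.sqrt(1.0 - delta**2) * np.sqrt(max(0.0, 1.0 - r_n**2))

def is_faulty(r_y_norm, signal_norm, delta):
    """Detection logic: normalized residual norm exceeds normalized threshold."""
    return r_y_norm / signal_norm > normalized_threshold(r_y_norm, signal_norm, delta)

delta = 0.2                                      # assumed uncertainty bound, < 1
levels = [0.0, 0.1, 0.3, 0.6]                    # ||r_y||_2 for ||[u; y]||_2 = 1
thresholds = [normalized_threshold(r, 1.0, delta) for r in levels]
decisions = [bool(is_faulty(r, 1.0, delta)) for r in levels]
```

The threshold is largest for a vanishing residual and shrinks as the residual grows. With these formulas, the decision ‖r_{y,N}‖_{2} > J_{th,N} is algebraically the same as ‖r_{y,N}‖_{2} > δ_{Δ_K}, which can be seen by squaring both sides.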
Example 2 In this example, we introduce a data-driven realization of the projection-based detection scheme. Departing from the system model (2)–(3), the system dynamics can be written as
$$\begin{array}{c}{y}_{s}\left(k\right)={\mathrm{\Gamma}}_{s}x(k-s)+{H}_{u,s}{u}_{s}\left(k\right),\\ {\mathrm{\Gamma}}_{s}=\left[\begin{array}{c}C\\ CA\\ \vdots \\ C{A}^{s}\end{array}\right]\in {\mathcal{R}}^{(s+1)m\times n},{H}_{u,s}=\left[\begin{array}{cccc}D& 0& & \\ CB& \ddots & \ddots & \\ \vdots & \ddots & \ddots & 0\\ C{A}^{s-1}B& \cdots & CB& D\end{array}\right]\in {\mathcal{R}}^{(s+1)m\times (s+1)p},\end{array}$$(39)
where y_{s}(k),u_{s}(k) are signal vectors of the data format
$$\begin{array}{c}\hfill {\varsigma}_{s}(k)=\left[\begin{array}{c}\varsigma (k-s)\\ \vdots \\ \varsigma (k)\end{array}\right],\end{array}$$
and s is an integer giving the length of the time interval [k−s,k] of interest. To simplify our study, assume that the system is stable and that x(k−s) is negligible. By defining the orthogonal projection,
$$\begin{array}{c}\hfill {\mathcal{P}}_{\mathcal{I}}=\left[\begin{array}{c}I\\ {H}_{u,s}\end{array}\right]{\left(\left[\begin{array}{cc}I& {H}_{u,s}^{T}\end{array}\right]\left[\begin{array}{c}I\\ {H}_{u,s}\end{array}\right]\right)}^{-1}\left[\begin{array}{cc}I& {H}_{u,s}^{T}\end{array}\right],\end{array}$$
a projection-based residual vector is constructed as follows:
$$\begin{array}{c}\hfill {r}_{\mathcal{I}}(k)=\left[\begin{array}{c}{u}_{s}(k)\\ {y}_{s}(k)\end{array}\right]-{\mathcal{P}}_{\mathcal{I}}\left[\begin{array}{c}{u}_{s}(k)\\ {y}_{s}(k)\end{array}\right].\end{array}$$
Note that
$$\begin{array}{c}\hfill {r}_{s}(k)={\mathrm{\Pi}}^{1/2}({y}_{s}(k)-{H}_{u,s}{u}_{s}(k))\end{array}$$
builds a residual vector and can be interpreted as a data-driven realization of an observer-based residual generator. Moreover, it holds
$$\begin{array}{c}\hfill I-\left[\begin{array}{c}I\\ {H}_{u,s}\end{array}\right]{\left(\left[\begin{array}{cc}I& {H}_{u,s}^{T}\end{array}\right]\left[\begin{array}{c}I\\ {H}_{u,s}\end{array}\right]\right)}^{-1}\left[\begin{array}{cc}I& {H}_{u,s}^{T}\end{array}\right]=\left[\begin{array}{c}-{H}_{u,s}^{T}\\ I\end{array}\right]{\left(\left[\begin{array}{cc}-{H}_{u,s}& I\end{array}\right]\left[\begin{array}{c}-{H}_{u,s}^{T}\\ I\end{array}\right]\right)}^{-1}\left[\begin{array}{cc}-{H}_{u,s}& I\end{array}\right],\\ \hfill {\mathrm{\Pi}}^{1/2}\left[\begin{array}{cc}-{H}_{u,s}& I\end{array}\right]\left[\begin{array}{c}-{H}_{u,s}^{T}\\ I\end{array}\right]{\mathrm{\Pi}}^{1/2}=I,\phantom{\rule{0.166667em}{0ex}}\mathrm{\Pi}={\left(\left[\begin{array}{cc}-{H}_{u,s}& I\end{array}\right]\left[\begin{array}{c}-{H}_{u,s}^{T}\\ I\end{array}\right]\right)}^{-1}.\end{array}$$
It turns out that
$$\begin{array}{c}\hfill \Vert {r}_{\mathcal{I}}(k)\Vert =\Vert {\mathrm{\Pi}}^{1/2}\left[\begin{array}{cc}-{H}_{u,s}& I\end{array}\right]\left[\begin{array}{c}{u}_{s}(k)\\ {y}_{s}(k)\end{array}\right]\Vert =\Vert {r}_{s}(k)\Vert .\end{array}$$
Suppose that Δ_{H_{u,s}} represents the uncertainty in the system,
$$\begin{array}{c}\hfill {y}_{s}(k)=({H}_{u,s}+{\mathrm{\Delta}}_{{H}_{u,s}}){u}_{s}(k),\phantom{\rule{0.333333em}{0ex}}{\Vert {\mathrm{\Pi}}^{1/2}{\mathrm{\Delta}}_{{H}_{u,s}}\Vert }_{2}={\sigma}_{max}\left({\mathrm{\Pi}}^{1/2}{\mathrm{\Delta}}_{{H}_{u,s}}\right)\le {\delta}_{{\mathrm{\Delta}}_{K}}<1.\end{array}$$
Define the residual evaluation function,
$$\begin{array}{c}\hfill J({u}_{s},{y}_{s})=\Vert {r}_{\mathcal{I}}(k)\Vert =\Vert {r}_{s}(k)\Vert .\end{array}$$
It follows from (38) that the threshold is set equal to
$$\begin{array}{c}\hfill {J}_{\mathit{th}}({u}_{s},{y}_{s})=\frac{{\delta}_{{\mathrm{\Delta}}_{K}}}{\sqrt{1-{\delta}_{{\mathrm{\Delta}}_{K}}^{2}}}{\left({\Vert \left[\begin{array}{c}{u}_{s}\\ {y}_{s}\end{array}\right]\Vert }^{2}-{\Vert {r}_{s}\Vert }^{2}\right)}^{1/2}.\end{array}$$
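The data-driven scheme of Example 2 can be traced step by step in a few lines of NumPy. The sketch below is a minimal illustration under stated assumptions: the system matrices, the window length s, the bound δ_{Δ_K} = 0.1 and the additive sensor offset used as a fault are all hypothetical, and x(k−s) is set to zero as in the example.

```python
import numpy as np

def toeplitz_markov(A, B, C, D, s):
    """Build H_{u,s} of (39) from the Markov parameters D, CB, CAB, ..."""
    m, p = D.shape
    markov = [D] + [C @ np.linalg.matrix_power(A, i) @ B for i in range(s)]
    H = np.zeros(((s + 1) * m, (s + 1) * p))
    for row in range(s + 1):
        for col in range(row + 1):
            H[row * m:(row + 1) * m, col * p:(col + 1) * p] = markov[row - col]
    return H

def inv_sqrt_spd(S):
    """Symmetric inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

# Hypothetical stable single-input single-output system.
A = np.array([[0.5, 0.1], [0.0, 0.3]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
s = 5
H = toeplitz_markov(A, B, C, D, s)
n_y, n_u = H.shape

I_s = np.vstack([np.eye(n_u), H])                  # image representation [I; H]
P_I = I_s @ np.linalg.solve(I_s.T @ I_s, I_s.T)    # orthogonal projection
K_s = np.hstack([-H, np.eye(n_y)])                 # kernel representation [-H  I]
Pi_half = inv_sqrt_spd(K_s @ K_s.T)                # Pi^{1/2}, Pi = (H H^T + I)^{-1}

def residuals(u_s, y_s):
    z = np.concatenate([u_s, y_s])
    r_I = z - P_I @ z                              # projection-based residual
    r_s = Pi_half @ (y_s - H @ u_s)                # observer-like residual
    return r_I, r_s

def threshold(u_s, y_s, r_s, delta):
    z_sq = u_s @ u_s + y_s @ y_s
    return delta / np.sqrt(1 - delta**2) * np.sqrt(max(z_sq - r_s @ r_s, 0.0))

delta = 0.1                                        # assumed uncertainty bound
rng = np.random.default_rng(2)
u_s = rng.standard_normal(n_u)
y_nominal = H @ u_s
y_faulty = y_nominal + 3.0                         # hypothetical sensor offset

r_I_nom, r_s_nom = residuals(u_s, y_nominal)
r_I_flt, r_s_flt = residuals(u_s, y_faulty)
J_nom, J_flt = np.linalg.norm(r_s_nom), np.linalg.norm(r_s_flt)
alarm_nom = J_nom > threshold(u_s, y_nominal, r_s_nom, delta)
alarm_flt = J_flt > threshold(u_s, y_faulty, r_s_flt, delta)
```

In this sketch the two residuals have different dimensions but the same Euclidean norm, so evaluating the shorter vector r_{s}(k) is sufficient online. The nominal data raise no alarm, while the offset data do.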
Remark 2 At the end of this subsection, we would like to give an interpretation of the orthogonal projection 𝒫_{ℐ_G} in the context of reconstructing the system variables (u,y) and its relation to the latent variable v. It is apparent that
$$\begin{array}{c}\hfill \left[\begin{array}{c}\widehat{u}\\ \widehat{y}\end{array}\right]:={\mathcal{P}}_{{\mathcal{I}}_{G}}\left[\begin{array}{c}u\\ y\end{array}\right]={I}_{N}{I}_{N}^{\ast}\left[\begin{array}{c}u\\ y\end{array}\right]\end{array}$$
is an estimation of (u,y) for the nominal operations. Note that ${I}_{N}^{*}$ is the conjugate of I_{N}. Let the state space representation of ${I}_{N}^{*}$ be denoted by
$$\begin{array}{cc}\hfill \xi (k-1)& =\overline{A}\xi (k)+\overline{B}\left[\begin{array}{c}u(k)\\ y(k)\end{array}\right],\xi \in {\mathcal{R}}^{n},\left[\begin{array}{c}u(k)\\ y(k)\end{array}\right]\in {\mathcal{L}}_{2},\hfill \\ \hfill \varsigma (k)& =\overline{C}\xi (k)+\overline{D}\left[\begin{array}{c}u(k)\\ y(k)\end{array}\right]\in {\mathcal{R}}^{p}.\hfill \end{array}$$
It is known that the above system is dual to I_{N} and that its output can be interpreted as a reconstruction of the input variable of I_{N}, i.e. v [16]. In other words, the reconstruction of (u,y) is achieved by an estimation of the latent variable v. This interpretation is helpful for extending the projection-based detection method to nonlinear control systems. To this end, the so-called Hamiltonian extension of nonlinear systems and its application to the construction of normalized (nonlinear) image representations provide useful tools [16, 39]. Moreover, aided by this interpretation, we will introduce, in the next subsection, explainable ML-based fault diagnosis methods.
3.2. Complementary and explainable application of model-based and ML-based methods
In this subsection, we would like to discuss a complementary and explainable application of model-based and machine learning methods to enhancing the capability of fault diagnosis systems. To this end, we will demonstrate the realization of the projection-based fault diagnosis schemes using the so-called autoencoder method, a well-established ML technique.
3.2.1. Autoencoder technique: preliminaries
Figure 5. Basic configuration of an autoencoder 
As sketched in Figure 5, the essential function of an autoencoder (AE) is to reconstruct (estimate) the system variables under consideration using NNs and learning mechanisms. In Figure 5, 𝒩𝒩_{en} and 𝒩𝒩_{de} represent two neural networks serving as encoder and decoder, respectively, and their parameters, θ_{en} and θ_{de}, are, roughly speaking, learnt using sufficient measurement data, (u,y), by minimizing the loss function
$$\begin{array}{cc}\hfill \mathcal{L}({\theta}_{\mathit{en}},{\theta}_{\mathit{de}})& =\Vert \left[\begin{array}{c}u\\ y\end{array}\right]-\left[\begin{array}{c}\widehat{u}\\ \widehat{y}\end{array}\right]\Vert =\Vert \left[\begin{array}{c}u\\ y\end{array}\right]-{\mathcal{NN}}_{\mathit{de}}({\theta}_{\mathit{de}},h)\Vert \hfill \\ & =\Vert \left[\begin{array}{c}u\\ y\end{array}\right]-{\mathcal{NN}}_{\mathit{de}}({\theta}_{\mathit{de}},{\mathcal{NN}}_{\mathit{en}}({\theta}_{\mathit{en}},\left[\begin{array}{c}u\\ y\end{array}\right]))\Vert \hfill \end{array}$$
with respect to θ_{en} and θ_{de}. The basic idea of applying an AE to fault detection can be schematically described as follows. Under the assumption that the AE is well trained using fault-free operation data, the minimum value of ℒ(θ_{en},θ_{de}) can be adopted as the threshold,
$$\begin{array}{c}\hfill {J}_{\mathit{th}}=\underset{{\theta}_{\mathit{en}},{\theta}_{\mathit{de}}}{\min}\mathcal{L}({\theta}_{\mathit{en}},{\theta}_{\mathit{de}}).\end{array}$$
Running the trained AE online to generate the projection-based residual r and computing the evaluation function J,
$$\begin{array}{cc}\hfill r& :=\left[\begin{array}{c}u\\ y\end{array}\right]-{\mathcal{NN}}_{\mathit{de}}({\theta}_{\mathit{de}},{\mathcal{NN}}_{\mathit{en}}({\theta}_{\mathit{en}},\left[\begin{array}{c}u\\ y\end{array}\right])),\hfill \\ \hfill J& :=\Vert r\Vert =\Vert \left[\begin{array}{c}u\\ y\end{array}\right]-{\mathcal{NN}}_{\mathit{de}}({\theta}_{\mathit{de}},{\mathcal{NN}}_{\mathit{en}}({\theta}_{\mathit{en}},\left[\begin{array}{c}u\\ y\end{array}\right]))\Vert ,\hfill \end{array}$$
fault detection is then achieved by the detection logic,
$$\begin{array}{c}\hfill \left\{\begin{array}{c}J\le {J}_{\mathit{th}}:\phantom{\rule{0.333333em}{0ex}}\text{fault-free}\hfill \\ J>{J}_{\mathit{th}}:\phantom{\rule{0.333333em}{0ex}}\text{faulty}\hfill \end{array}\right..\end{array}$$
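The encode, decode, evaluate and compare pipeline can be illustrated with a deliberately simple stand-in. The sketch below replaces the two neural networks by a linear autoencoder obtained in closed form from an SVD, uses hypothetical static data y = Gu + noise, and sets the threshold to the largest per-sample reconstruction error on the fault-free training data. All three choices are assumptions made for the illustration and are not the training procedure of the text.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical fault-free data: y = G u + small noise, stacked as z = [u; y].
G = np.array([[1.0, 0.5], [-0.3, 2.0], [0.7, 0.1]])
U = rng.standard_normal((500, 2))
Y = U @ G.T + 0.01 * rng.standard_normal((500, 3))
Z = np.hstack([U, Y])

# Linear AE with a 2-dimensional hidden variable, "trained" in closed form:
# the leading right-singular vectors minimise the reconstruction error.
_, _, Vt = np.linalg.svd(Z, full_matrices=False)
W = Vt[:2].T                                  # shape (5, 2)

def encode(z):                                # stand-in for NN_en
    return z @ W

def decode(h):                                # stand-in for NN_de
    return h @ W.T

def evaluate(z):
    """Evaluation function J = ||z - decode(encode(z))||."""
    return np.linalg.norm(z - decode(encode(z)), axis=-1)

# Assumed threshold: worst reconstruction error seen on fault-free data.
J_th = evaluate(Z).max()

z_ok = np.hstack([U[0], G @ U[0]])            # noise-free nominal sample
z_fault = z_ok.copy()
z_fault[2] += 1.0                             # hypothetical offset on output 1

J_ok, J_fault = evaluate(z_ok), evaluate(z_fault)
alarm_ok, alarm_fault = J_ok > J_th, J_fault > J_th
```

The hidden variable here has the same dimension as u, which anticipates its interpretation as a latent variable driving (u,y).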
It is well-known that the hidden variable h in an AE plays a central role as the information carrier of the system under consideration and, more importantly, in the context of the so-called information bottleneck [40, 41]. Unfortunately, this aspect has rarely been taken into account in AE applications to fault diagnosis. Typically, the hidden variable is viewed as a set of features, taken as it is generated by the optimization (training) process, without any explainable interpretation with regard to the system and the fault diagnosis problem under consideration. This motivates the work presented in the next subsection.
3.2.2. AE-aided realization of projection-based fault detection and estimation
The basic idea of applying the AE technique to realize a projection-based fault detection consists in training the NNs to follow the major properties of an orthogonal projection onto the system image subspace. In the sequel, we briefly describe the conceptual realization of the idea by means of two examples. For our purpose, recurrent neural networks are used for the realization of dynamic systems, denoted by ℛ𝒩𝒩_{en} and ℛ𝒩𝒩_{de} for the encoder and decoder, respectively.
Example 3 AE-aided realization of projection-based fault detection. Let 𝒫_{AE} defined by
$$\begin{array}{cc}\hfill \left[\begin{array}{c}\widehat{u}\\ \widehat{y}\end{array}\right]& :={\mathcal{P}}_{\mathit{AE}}\left[\begin{array}{c}u\\ y\end{array}\right],\hfill \\ \hfill {\mathcal{P}}_{\mathit{AE}}\left[\begin{array}{c}u\\ y\end{array}\right]& :={\mathcal{RNN}}_{\mathit{de}}({\theta}_{\mathit{de}},h)={\mathcal{RNN}}_{\mathit{de}}({\theta}_{\mathit{de}},{\mathcal{RNN}}_{\mathit{en}}({\theta}_{\mathit{en}},\left[\begin{array}{c}u\\ y\end{array}\right]))\hfill \end{array}$$
be an AE. Suppose that M batches of system data are available for the training purpose, and each of them includes N system data,
$$\begin{array}{c}\hfill {\mathcal{B}}^{(i)}:=\{\left[\begin{array}{c}{u}^{(i)}\left({k}_{j}\right)\\ {y}^{(i)}\left({k}_{j}\right)\end{array}\right],\phantom{\rule{0.166667em}{0ex}}j=1,\cdots ,N\},i=1,\cdots ,M.\end{array}$$
Given vectors $\alpha \left({k}_{j}\right),\beta \left({k}_{j}\right)\in {\mathcal{R}}^{\kappa},\phantom{\rule{0.166667em}{0ex}}j=1,\cdots ,N,$ let
$$\begin{array}{c}\hfill {\Vert \alpha \Vert }_{2}^{2}=\sum _{j=1}^{N}{\alpha}^{T}\left({k}_{j}\right)\alpha \left({k}_{j}\right),\phantom{\rule{0.166667em}{0ex}}\langle \alpha ,\beta \rangle =\sum _{j=1}^{N}{\alpha}^{T}\left({k}_{j}\right)\beta \left({k}_{j}\right).\end{array}$$
For training purposes, a cost function consisting of three or four terms is defined,
$$\begin{array}{c}\hfill \mathcal{L}({\theta}_{\mathit{en}},{\theta}_{\mathit{de}})=\sum _{i=1}^{4}{w}_{i}{\mathcal{L}}_{i}({\theta}_{\mathit{en}},{\theta}_{\mathit{de}}),\phantom{\rule{0.166667em}{0ex}}{w}_{i}>0\text{, weighting factors.}\end{array}$$
In addition to the basic term,
$$\begin{array}{cc}\hfill {\mathcal{L}}_{1}({\theta}_{\mathit{en}},{\theta}_{\mathit{de}})& =\frac{1}{M}\sum _{i=1}^{M}{\Vert \left[\begin{array}{c}{u}^{(i)}\\ {y}^{(i)}\end{array}\right]-\left[\begin{array}{c}{\widehat{u}}^{(i)}\\ {\widehat{y}}^{(i)}\end{array}\right]\Vert }_{2}^{2}\hfill \\ \hfill & =\frac{1}{M}\sum _{i=1}^{M}{\Vert \left[\begin{array}{c}{u}^{(i)}\\ {y}^{(i)}\end{array}\right]-{\mathcal{RNN}}_{\mathit{de}}({\theta}_{\mathit{de}},{\mathcal{RNN}}_{\mathit{en}}({\theta}_{\mathit{en}},\left[\begin{array}{c}{u}^{(i)}\\ {y}^{(i)}\end{array}\right]))\Vert }_{2}^{2},\hfill \end{array}$$
the following regularization terms are added:

realization of idempotent operator 𝒫_{AE} (refer to (27)),
$$\begin{array}{c}{\mathcal{P}}_{\mathit{AE}}{\mathcal{P}}_{\mathit{AE}}={\mathcal{P}}_{\mathit{AE}}\Longrightarrow \\ \frac{1}{M}\sum _{i=1}^{M}{\Vert \left[\begin{array}{c}{\widehat{u}}^{(i)}\\ {\widehat{y}}^{(i)}\end{array}\right]-{\mathcal{RNN}}_{\mathit{de}}({\theta}_{\mathit{de}},{\mathcal{RNN}}_{\mathit{en}}({\theta}_{\mathit{en}},\left[\begin{array}{c}{\widehat{u}}^{(i)}\\ {\widehat{y}}^{(i)}\end{array}\right]))\Vert }_{2}^{2};\end{array}$$(40, 41)

realization of self-adjoint operator 𝒫_{AE},
$$\begin{array}{c}\hfill \langle {\mathcal{P}}_{\mathit{AE}}\left[\begin{array}{c}{u}^{(i)}\\ {y}^{(i)}\end{array}\right],\left[\begin{array}{c}{u}^{(j)}\\ {y}^{(j)}\end{array}\right]\rangle =\langle \left[\begin{array}{c}{u}^{(i)}\\ {y}^{(i)}\end{array}\right],{\mathcal{P}}_{\mathit{AE}}\left[\begin{array}{c}{u}^{(j)}\\ {y}^{(j)}\end{array}\right]\rangle \Longrightarrow \\ \hfill \frac{1}{{M}^{2}}\sum _{i=1}^{M}\sum _{j=1}^{M}{\left(\langle \left[\begin{array}{c}{\widehat{u}}^{(i)}\\ {\widehat{y}}^{(i)}\end{array}\right],\left[\begin{array}{c}{u}^{(j)}\\ {y}^{(j)}\end{array}\right]\rangle -\langle \left[\begin{array}{c}{u}^{(i)}\\ {y}^{(i)}\end{array}\right],\left[\begin{array}{c}{\widehat{u}}^{(j)}\\ {\widehat{y}}^{(j)}\end{array}\right]\rangle \right)}^{2};\end{array}$$

(optional) realization of the normalized SIR,
$$\begin{array}{c}\hfill {\Vert \left[\begin{array}{c}u\\ y\end{array}\right]\Vert }_{2}={\Vert \left[\begin{array}{c}{M}_{0}\\ {N}_{0}\end{array}\right]v\Vert }_{2}={\Vert v\Vert }_{2}\Longrightarrow \\ \hfill \frac{1}{M}\sum _{i=1}^{M}{\left({\Vert \left[\begin{array}{c}{\widehat{u}}^{(i)}\\ {\widehat{y}}^{(i)}\end{array}\right]\Vert }_{2}-{\Vert {h}^{(i)}\Vert }_{2}\right)}^{2}\\ \hfill =\frac{1}{M}\sum _{i=1}^{M}{\left({\Vert \left[\begin{array}{c}{\widehat{u}}^{(i)}\\ {\widehat{y}}^{(i)}\end{array}\right]\Vert }_{2}-{\Vert {\mathcal{RNN}}_{\mathit{en}}({\theta}_{\mathit{en}},\left[\begin{array}{c}{u}^{(i)}\\ {y}^{(i)}\end{array}\right])\Vert }_{2}\right)}^{2}.\end{array}$$
It follows from the projection-based fault detection method that the (online) residual evaluation function and the threshold are defined by
$$\begin{array}{cc}\hfill {J}_{N}(u,y)& =\frac{{\Vert \left[\begin{array}{c}u\\ y\end{array}\right]-{\mathcal{RNN}}_{\mathit{de}}({\theta}_{\mathit{de}},{\mathcal{RNN}}_{\mathit{en}}({\theta}_{\mathit{en}},\left[\begin{array}{c}u\\ y\end{array}\right]))\Vert }_{2}^{2}}{{\Vert \left[\begin{array}{c}u\\ y\end{array}\right]\Vert }_{2}^{2}},\hfill \\ \hfill {J}_{th,N}(u,y)& =\frac{\delta}{1-\delta}\left(1-\frac{{\Vert {\mathcal{RNN}}_{\mathit{de}}({\theta}_{\mathit{de}},{\mathcal{RNN}}_{\mathit{en}}({\theta}_{\mathit{en}},\left[\begin{array}{c}u\\ y\end{array}\right]))\Vert }_{2}^{2}}{{\Vert \left[\begin{array}{c}u\\ y\end{array}\right]\Vert }_{2}^{2}}\right),\hfill \end{array}$$
where δ denotes the value
$$\begin{array}{c}\hfill \delta =\underset{{\theta}_{\mathit{en}},{\theta}_{\mathit{de}}}{\min}\frac{1}{M}\sum _{i=1}^{M}{\Vert \left[\begin{array}{c}{u}^{(i)}\\ {y}^{(i)}\end{array}\right]-\left[\begin{array}{c}{\widehat{u}}^{(i)}\\ {\widehat{y}}^{(i)}\end{array}\right]\Vert }_{2}^{2}\end{array}$$
achieved by training.
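To see what the four cost terms of Example 3 measure, they can be evaluated for any given encoder/decoder pair. The sketch below does this with linear maps as a stand-in for the recurrent networks, on random batches of shape (M, N, n). The function `ae_losses`, the matrices `Q`, `E`, `C` and all dimensions are hypothetical, and no training is performed.

```python
import numpy as np

def batch_inner(a, b):
    """<a, b> = sum_j a(k_j)^T b(k_j) per batch; a, b have shape (M, N, n)."""
    return np.einsum("mjn,mjn->m", a, b)

def ae_losses(Z, encode, decode):
    """Return (L1, L2, L3, L4): reconstruction, idempotence,
    self-adjointness and normalization terms."""
    H = encode(Z)                       # hidden variable h
    Z_hat = decode(H)                   # P_AE z
    Z_hat2 = decode(encode(Z_hat))      # P_AE P_AE z
    L1 = np.mean(batch_inner(Z - Z_hat, Z - Z_hat))
    L2 = np.mean(batch_inner(Z_hat - Z_hat2, Z_hat - Z_hat2))
    # cross[i, j] = <z_hat^(i), z^(j)>, so cross.T[i, j] = <z^(i), z_hat^(j)>.
    cross = np.einsum("ikn,jkn->ij", Z_hat, Z)
    L3 = np.mean((cross - cross.T) ** 2)
    norm_hat = np.sqrt(batch_inner(Z_hat, Z_hat))
    norm_h = np.sqrt(batch_inner(H, H))
    L4 = np.mean((norm_hat - norm_h) ** 2)
    return L1, L2, L3, L4

rng = np.random.default_rng(4)
M, N, n, k = 4, 20, 5, 2
Z = rng.standard_normal((M, N, n))

# Orthogonal projection with an orthonormal basis Q: encode z -> Q^T z.
Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
orth = ae_losses(Z, lambda z: z @ Q, lambda h: h @ Q.T)

# Oblique projection: E Q = I still holds, but Q E is not symmetric.
C = rng.standard_normal((k, n))
E = Q.T + C @ (np.eye(n) - Q @ Q.T)
oblique = ae_losses(Z, lambda z: z @ E.T, lambda h: h @ Q.T)
```

For the orthogonal projection the three regularization terms vanish up to rounding. The oblique projection is still idempotent, so only the self-adjointness term separates it from an orthogonal one, which is why that term is needed in addition to the idempotence term.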
This example clearly demonstrates that

the objective of the construction and, in particular, the training of the AE is the realization of the projection-based optimal fault detection;

the hidden variable h can be interpreted as the so-called reference signal v in the context of SIR and image subspace, and this information is fully integrated in the training process. Considering that during fault-free operations the system variables (u,y) are uniquely determined by v and thus can be fully recovered using v without any redundancy, such an AE is optimal in the context of information bottleneck [40, 41];

the trained AE is also embedded in the residual evaluation and threshold computation, which is not the case in most AE-based fault detection schemes.
As a next example, we present a conceptual scheme of optimal fault estimation in dynamic systems. To this end, the fault estimation problem is firstly formulated in a general form: considering system dynamics described by
$$\begin{array}{c}\hfill y=\mathcal{G}f,y\in {\mathcal{L}}_{2}^{m},f\in {\mathcal{L}}_{2}^{p},p>m,\end{array}$$(42)
find an estimator
$$\begin{array}{c}\hfill \widehat{f}={\mathcal{E}}_{f}y,\end{array}$$(43)
where operator 𝒢 represents the system dynamics, operator ℰ_{f} is a dynamic estimator, y is an m-dimensional measurement vector, and f denotes a p-dimensional unknown input vector that is called the fault vector, but could also represent cyber-attack signals or disturbances. It is well-known that the solution of (42) is not unique. We are interested in solving the above estimation problem in the data-driven fashion, that is, instead of the system model 𝒢, sufficient data, (y^{(i)}(k_{j}),f^{(i)}(k_{j})), j = 1, ⋯, N, i = 1, ⋯, M, are available and used for the estimation purpose. Moreover, the estimate should be the so-called least squares (LS) estimation ${\widehat{f}}_{\mathit{LS}}$, i.e.
$$\begin{array}{c}\hfill \forall \widehat{f}\phantom{\rule{0.333333em}{0ex}}\text{satisfying}\phantom{\rule{0.333333em}{0ex}}y=\mathcal{G}\widehat{f},\phantom{\rule{0.333333em}{0ex}}{\Vert {\widehat{f}}_{\mathit{LS}}\Vert }_{2}\le {\Vert \widehat{f}\Vert }_{2}\end{array}$$
with a specified confidence.
In the sequel, we first briefly introduce the model-based LS solution, which serves as the basis for our AE-based algorithm. Let
$$\begin{array}{c}\hfill \mathcal{G}={\mathcal{G}}_{\mathit{co}}\circ {\mathcal{G}}_{\mathit{ci}}\end{array}$$
be a co-inner-outer factorization of 𝒢 [16]. Here, 𝒢_{co} and 𝒢_{ci} are co-outer and co-inner operators, respectively, satisfying
$$\begin{array}{c}\hfill {\mathcal{G}}_{\mathit{ci}}\circ {\mathcal{G}}_{\mathit{ci}}^{\ast}=\mathcal{I},\phantom{\rule{0.333333em}{0ex}}\mathcal{Q}={\mathcal{G}}_{\mathit{co}}^{-1}\phantom{\rule{0.333333em}{0ex}}\text{being stable and causal},\end{array}$$
with ${\mathcal{G}}_{\mathit{ci}}^{*}$ as conjugate of 𝒢_{ci}. It is well known that
$$\begin{array}{c}\hfill {\widehat{f}}_{\mathit{LS}}={\mathcal{G}}_{\mathit{ci}}^{\ast}\circ \mathcal{Q}y={\mathcal{G}}_{\mathit{ci}}^{\ast}\circ {\mathcal{G}}_{\mathit{ci}}f\end{array}$$(44)
is the LS estimate of f. Furthermore, the estimation error,
$$\begin{array}{c}\hfill \eta =f-{\widehat{f}}_{\mathit{LS}}=(\mathcal{I}-{\mathcal{G}}_{\mathit{ci}}^{\ast}\circ {\mathcal{G}}_{\mathit{ci}})f,\end{array}$$(45)
is defined as a specified confidence whose distribution and certain norm indicate the estimation performance.
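A static analogue may help to read (44) and (45). For a wide constant matrix G with full row rank, a QR decomposition of G^{T} yields a factorization G = G_{co}G_{ci} with an invertible G_{co} and a G_{ci} with orthonormal rows, which plays the role of the co-inner factor. The sketch below is only this matrix analogue under assumed dimensions m = 2 and p = 4; it does not implement the factorization of dynamic operators.

```python
import numpy as np

rng = np.random.default_rng(5)
m, p = 2, 4                              # p > m: more unknowns than measurements
G = rng.standard_normal((m, p))          # hypothetical static "system"

# QR of G^T gives G^T = Q_r R, hence G = R^T Q_r^T.
Q_r, R = np.linalg.qr(G.T)
G_co = R.T                               # invertible factor (co-outer analogue)
G_ci = Q_r.T                             # G_ci G_ci^T = I (co-inner analogue)

f = rng.standard_normal(p)               # unknown input (fault) vector
y = G @ f

# LS (minimum-norm) estimate, cf. (44): f_LS = G_ci^T G_co^{-1} y.
f_ls = G_ci.T @ np.linalg.solve(G_co, y)

# Estimation error, cf. (45): eta = (I - G_ci^T G_ci) f.
eta = f - f_ls
```

In this analogue `f_ls` coincides with the pseudo-inverse solution `np.linalg.pinv(G) @ y`, it reproduces the measurement exactly, and the error `eta` lies in the null space of G, i.e. it is the part of f that cannot be seen from y.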
Example 4 Optimal fault estimation in dynamic systems. An AEbased realization of the dynamic estimator (44) is schematically described in this example. As delineated in Figure 6, ${\widehat{f}}_{\mathit{LS}}$ is achieved by means of two recurrent neural networks ℛ𝒩𝒩_{𝒬}(θ_{𝒬}) and ℛ𝒩𝒩_{de}(θ_{de}), where ℛ𝒩𝒩_{de}(θ_{de}) is the decoder trained in the AE for constructing ${\mathcal{G}}_{\mathit{ci}}^{*}$. The AE is trained using the data set (y,f), (y^{(i)}(k_{j}),f^{(i)}(k_{j})), j = 1, ⋯, N, i = 1, ⋯, M, while the confidence η is generated based on the AE. To train the NNs, the total loss function ℒ(θ_{𝒬},θ_{en},θ_{de}) consists of three terms and is set as follows:

ℒ_{1}(θ_{𝒬}):
$$\begin{array}{c}\hfill {\mathcal{L}}_{1}\left({\theta}_{\mathcal{Q}}\right)=\frac{1}{M}\sum _{i=1}^{M}{\left({\Vert {\mathcal{RNN}}_{\mathcal{Q}}({\theta}_{\mathcal{Q}},{y}^{(i)})\Vert }_{2}-{\Vert {f}^{(i)}\Vert }_{2}\right)}^{2},\end{array}$$
that minimises
$$\begin{array}{c}\hfill \mathcal{Q}y={\mathcal{G}}_{\mathit{co}}^{-1}\circ {\mathcal{G}}_{\mathit{co}}\circ {\mathcal{G}}_{\mathit{ci}}f\Longrightarrow {\Vert \mathcal{Q}y\Vert }_{2}-{\Vert f\Vert }_{2};\end{array}$$

ℒ_{2}(θ_{𝒬}, θ_{en}, θ_{de}):
$$\begin{array}{c}\hfill {\mathcal{L}}_{2}({\theta}_{\mathcal{Q}},{\theta}_{\mathit{en}},{\theta}_{\mathit{de}})=\frac{1}{M}\sum _{i=1}^{M}{\left\Vert \begin{array}{c}{\mathcal{RNN}}_{\mathit{de}}({\theta}_{\mathit{de}},{\mathcal{RNN}}_{\mathit{en}}({\theta}_{\mathit{en}},{f}^{(i)}))\\ -{\mathcal{RNN}}_{\mathit{de}}({\theta}_{\mathit{de}},{\mathcal{RNN}}_{\mathcal{Q}}({\theta}_{\mathcal{Q}},{y}^{(i)}))\end{array}\right\Vert }_{2}^{2},\end{array}$$
which minimises
$$\begin{array}{c}\hfill r={\mathcal{G}}_{\mathit{ci}}^{\ast}\circ {\mathcal{G}}_{\mathit{ci}}f-{\widehat{f}}_{\mathit{LS}}={\mathcal{G}}_{\mathit{ci}}^{\ast}\circ {\mathcal{G}}_{\mathit{ci}}f-{\mathcal{G}}_{\mathit{ci}}^{\ast}\circ \mathcal{Q}y;\end{array}$$

ℒ_{3}(θ_{en}, θ_{de}): realization of an AE-based orthogonal projection presented in the previous subsection.
The specified confidence could be computed using the (sample) distribution or a certain norm of variable η.
Figure 6. Schematic configuration of the fault estimator 
3.2.3. A critical remark
The current enthusiasm for ML and big data technologies is significantly influencing developments in the diagnosis research and engineering domains. It is a logical consequence that most of the existing ML methods and concepts have been introduced into this thematic field. Reviewing the course of this development, it seems to be turning into a competition to publish applications of newly developed ML-methods and algorithms to fault diagnosis. The consequence of this “copy-and-paste” style of research is that essential engineering requirements on diagnosis in automatic control systems have not been, or cannot be, fully considered in the use of ML-methods and algorithms. The reason is simple: the construction of most popular learning machines like deep NNs is hardly explainable, in particular in the context of diagnosis in dynamic systems. This issue becomes even more critical when such methods are applied for the purpose of functional safety and cyber security. It is remarkable that explainability and interpretability currently form a highly topical research focus in the ML-community [42]. This research endeavour is helpful for applying ML-based methods to diagnosis in automatic control systems. On the other hand, it should be kept in mind that, although enormously powerful and capable, ML-technology is a tool, and its engineering applications should meet technical requirements and be explainable in the engineering context. In this regard, considerable efforts should be made to achieve diagnosis-oriented explainable applications of ML-based methods. Our discussion and the examples in this subsection have plainly documented that complementary and explainable application of model- and ML-based methods is a convincing way to develop advanced diagnosis methods towards enhancing functional safety and cyber security.
4. Performance degradation monitoring towards functional safety and cyber security
Control performance monitoring is an application-driven research area and has its applications mainly in the process industry [43]. Roughly speaking, the essential tasks of control performance monitoring consist of assessment of control loop performance, detection of performance degradation and diagnosis of (component) faults [25]. Recently, new research efforts on POD can be observed [29, 44], in which the performance of automatic control systems is assessed at the system level and under various aspects like energy consumption, system reliability, safety, etc. Moreover, different from the traditional efforts focused on recovering performance degradation caused by component faults [45–47], advanced methods for control performance degradation monitoring and loop performance recovery have been reported [25, 44, 48].
In this section, we address POD issues with a focus on residual-centred modelling and detection of system performance degradation.
4.1. Residual-centred system model
In [16], a so-called observer-based input–output model is introduced, which models the input–output dynamics of any LTI automatic control system and is expressed, given the system nominal model (1)–(3), by
$$\begin{array}{cc}\hfill \widehat{x}(k+1)& =A\widehat{x}\left(k\right)+Bu(k)+Lr(k),\hfill \\ \hfill y(k)& =r(k)+C\widehat{x}(k)+Du(k).\hfill \end{array}$$(46, 47)
It is evident that the centrepiece of the above model is a state observer. Different from the state space model (2)–(3) that solely represents the nominal system dynamics, model (46)–(47) gives the system input–output dynamics also for the case that uncertainties exist in the system. As illustrated in [16], the influences of any uncertainties in the system are showcased by residual vector r, which is available and accessible in the model (46)–(47). Moreover, in light of the observer-based and residual-driven realization of any feedback controllers introduced in Section 2,
$$\begin{array}{c}\hfill u(z)=K(z)y(z)+v(z)=F\widehat{x}(z)-Q(z){r}_{y}(z)+\overline{v}(z),\end{array}$$(48)
any standard control loop shown in Figure 1 can be equivalently represented by the model (46)–(48), which is called residual-centred system model to underline the role of the residual vector in the model. Figure 7 showcases the equivalence between the standard control loop and its residual-centred model, in which Δ is used to denote system uncertainties schematically.
Figure 7. From the standard model to the observer-based I/O-model: a schematic description 
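A minimal simulation can make the role of the residual in (46)–(47) concrete. The sketch below assumes a scalar plant with D = 0, a square-wave input, and a hypothetical observer gain; the function name `run_loop` and all numerical values are illustrative assumptions, not taken from [16]. It shows that the model output r + C x̂ + D u reproduces the measured y whether or not the plant deviates from its nominal model, while the residual itself is zero only in the uncertainty-free case.

```python
def run_loop(a_true, a_nom=0.8, b=1.0, c=1.0, d=0.0, l_gain=0.5, steps=30):
    """Simulate a scalar plant next to the observer-based I/O model (46)-(47)."""
    x, x_hat = 1.0, 1.0                    # plant state and observer state
    residuals, model_error = [], []
    for k in range(steps):
        u = 1.0 if k % 10 < 5 else -1.0    # square-wave excitation (assumption)
        y = c * x + d * u                  # measured plant output
        r = y - c * x_hat - d * u          # residual generated by the observer
        y_model = r + c * x_hat + d * u    # output equation (47)
        residuals.append(r)
        model_error.append(abs(y - y_model))
        x_hat = a_nom * x_hat + b * u + l_gain * r   # state equation (46)
        x = a_true * x + b * u             # true plant, possibly uncertain
    return residuals, model_error

r_nominal, err_nominal = run_loop(a_true=0.8)      # plant equals nominal model
r_uncertain, err_uncertain = run_loop(a_true=0.7)  # plant with model uncertainty

print("max |r|, nominal:  ", max(abs(r) for r in r_nominal))
print("max |r|, uncertain:", max(abs(r) for r in r_uncertain))
print("max |y - y_model|: ", max(err_uncertain))
```

Only the nominal parameters and the measured signals enter the model, which is why all of its variables stay accessible even though the true plant parameter is unknown.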
The advantages of the residual-centred system model are evident:

all system variables in the model, independent of the existence of any uncertainties, are accessible (for further computations),

the implementation of the model is numerically reliable and stable, since only stable dynamics are concerned in the model, and

with the embedded residual vector, the model is equipped with a capable indicator for the existence of uncertainties in the system.
The last function can be further grounded using the projection-based method introduced in the previous section. According to (35), the l_{2}-norm of the residual vector generated by the normalized SKR (and the corresponding observer) is the distance of the measurement data (u,y) to the system image subspace and thus an indicator for the intensity of the uncertainty in the system. Accordingly,
$$\begin{array}{c}\hfill {\Vert {r}_{y}\Vert }_{2}={\left\Vert {K}_{N}\left[\begin{array}{c}u\\ y\end{array}\right]\right\Vert }_{2}\end{array}$$(49)
is an indicator for the quality of the residual-centred model as well as system operation performance. It can, for instance, substitute the numerically involved algorithm for online estimation of gap metric and system stability margin adopted in [29].
Example 5 In this example, we introduce a conceptual configuration of automatic control systems, which consists of four functional layers and is schematically sketched in Figure 8. The “Information layer” is the core of the multilayer configuration, whose centrepiece is the observer-based input–output model (46)–(47). Besides providing the online information needed for real-time control and diagnosis, various additional functionalities, in particular safety- and cyber security-related ones, can be well integrated in this layer, for instance, serving as

a fusing algorithm of sensor data,

soft sensors for estimation of plant key variables,

an encoder for encrypting the plant data as described in Section 2,

an indicator for system uncertainties as given by (49).
In the “Real-time control and diagnosis layer”, the standard (feedback) control and diagnosis algorithms described in Section 2.3 are performed. The “Performance monitoring and optimization layer” includes advanced performance degradation detection and recovery algorithms, for instance those reported in [25, 29, 44, 48] or described below. In the “Learning and adaptation layer”, ML-algorithms like the AEs introduced in Section 3.2 run with the aim of updating the functional layers to match changes in the system.
Figure 8. Schematic configuration of a multilayer automatic control system 
4.2. Functionality-oriented performance degradation monitoring
Consider system (1)–(3). Associated with it, the following Lyapunov equation provides us with a basic form of performance models for the system functionality and control,
$$\begin{array}{c}\hfill {S}^{T}PS-P+Q=0,P>0,Q\ge 0,\phantom{\rule{0.166667em}{0ex}}S\phantom{\rule{0.333333em}{0ex}}\text{is Schur}.\end{array}$$(50)
Here, matrices S, Q ∈ ℛ^{n × n} are functions of the system matrices (A,B,C) and state feedback gain matrix F, which are given corresponding to the following (representative) system functionalities and controller configuration:

for
$$\begin{array}{c}\hfill S=A,Q=B{B}^{T},\end{array}$$(51)
P as the solution of (50) is the controllability gramian that indicates the capability of the actuators,

for
$$\begin{array}{c}\hfill S={A}^{T},Q={C}^{T}C,\end{array}$$(52)
P is the observability gramian indicating the capability of the sensors,

for either (51) or (52), the ℋ_{2}-norm of transfer function C(zI−A)^{−1}B as performance can be assessed as follows:
$$\begin{array}{c}\hfill {\Vert C{(zI-A)}^{-1}B\Vert }_{2}={\mathrm{tr}}^{1/2}\left(CP{C}^{T}\right)\phantom{\rule{0.333333em}{0ex}}\text{or}\phantom{\rule{0.333333em}{0ex}}{\Vert C{(zI-A)}^{-1}B\Vert }_{2}={\mathrm{tr}}^{1/2}\left({B}^{T}PB\right),\end{array}$$

for
$$\begin{array}{c}\hfill S=A+BF,Q={Q}_{0}+{F}^{T}RF,R>0,{Q}_{0}\ge 0,\end{array}$$
performance of an LQ state feedback controller, u = Fx,
$$\begin{array}{c}\hfill J\left(k\right)=\sum _{i=k}^{\infty}({x}^{T}(i){Q}_{0}x(i)+{u}^{T}\left(i\right)Ru\left(i\right))={x}^{T}(k)Px(k),\end{array}$$
is assessed.
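The performance model (50) can be checked numerically on a small example. The sketch below assumes a hypothetical 2 × 2 Schur matrix S and Q = I, and computes P by summing the series P = Σ (S^T)^i Q S^i, which converges for Schur S and satisfies (50). The helper names (`solve_dlyap`, `quad_form`, and so on) are illustrative; in practice a dedicated Lyapunov solver from a numerical library would normally be used instead of a truncated series.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def solve_dlyap(s, q, terms=200):
    """Approximate the solution of S^T P S - P + Q = 0 by a truncated series."""
    st = transpose(s)
    term = [row[:] for row in q]
    p = [row[:] for row in q]
    for _ in range(terms):
        term = mat_mul(mat_mul(st, term), s)   # (S^T)^i Q S^i
        p = mat_add(p, term)
    return p

def quad_form(x, m):
    n = len(x)
    return sum(x[i] * m[i][j] * x[j] for i in range(n) for j in range(n))

S = [[0.5, 0.1], [0.0, 0.3]]    # hypothetical Schur matrix
Q = [[1.0, 0.0], [0.0, 1.0]]
P = solve_dlyap(S, Q)

# Left-hand side of (50); it should vanish up to rounding errors.
minus_p = [[-v for v in row] for row in P]
lyap_residual = mat_add(mat_add(mat_mul(mat_mul(transpose(S), P), S), minus_p), Q)

# The cost sum of x^T(i) Q x(i) along x(k+1) = S x(k) equals x^T(k) P x(k).
x = [1.0, -2.0]
cost_from_p = quad_form(x, P)
cost_simulated = 0.0
for _ in range(200):
    cost_simulated += quad_form(x, Q)
    x = [sum(S[i][j] * x[j] for j in range(2)) for i in range(2)]

print("max |S^T P S - P + Q| =", max(abs(v) for row in lyap_residual for v in row))
print("x^T P x =", cost_from_p, " simulated cost =", cost_simulated)
```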
There exist several strategies to monitor the above-described system performance. Assume that the system dynamics is governed by
$$\begin{array}{c}\hfill x(k+1)=Sx(k),\end{array}$$
and x(k) is measurable. Define
$$\begin{array}{c}\hfill J\left(k\right)=\sum _{i=k}^{\infty}{x}^{T}(i)Qx(i)={x}^{T}(k)Px(k).\end{array}$$
It holds
$$\begin{array}{c}\hfill J(k+1)+{x}^{T}(k)Qx(k)-J\left(k\right)=0,\end{array}$$(53)
during degradation-free operations. Hence, introducing performance residual r_{p} defined by
$$\begin{array}{c}\hfill {r}_{p}(k):={x}^{T}(k+1)Px(k+1)+{x}^{T}(k)(Q-P)x(k),\end{array}$$
performance degradation can be detected using standard residual-based detection schemes. Unfortunately, this approach remains largely a theoretical concept and is often of little use in practical applications, owing to its limited detection capability and strict constraints on the system dynamics. Aiming at improving the detection performance, [49] proposed a more sophisticated detection scheme, which is briefly described in the sequel.
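The behaviour of the performance residual r_{p} can be sketched for a scalar system, for which (50) gives P = Q/(1 − S²) in closed form. The values of S, the degraded dynamics and the threshold below are hypothetical and serve only to show that r_{p} vanishes during degradation-free operation and becomes nonzero once the dynamics deviate from the nominal S.

```python
def performance_residuals(s_actual, p, q, x0=1.0, steps=10):
    """r_p(k) = x(k+1) P x(k+1) + x(k) (Q - P) x(k) for a scalar system."""
    x, out = x0, []
    for _ in range(steps):
        x_next = s_actual * x              # measured state trajectory
        out.append(p * x_next ** 2 + (q - p) * x ** 2)
        x = x_next
    return out

s_nominal, q = 0.5, 1.0                    # hypothetical nominal dynamics
p = q / (1.0 - s_nominal ** 2)             # scalar solution of (50)

r_nominal = performance_residuals(s_nominal, p, q)
r_degraded = performance_residuals(0.7, p, q)     # slower decay than nominal

threshold = 1e-3                           # illustrative decision threshold
alarm_nominal = any(abs(r) > threshold for r in r_nominal)
alarm_degraded = any(abs(r) > threshold for r in r_degraded)
print("alarm in nominal operation: ", alarm_nominal)
print("alarm in degraded operation:", alarm_degraded)
```

Since r_{p}(k) scales with x²(k), the residual fades as the state decays, which is one way to see the limited detection capability mentioned above.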
By means of a vectorization of matrix P, rewrite the performance model
$$\begin{array}{c}J\left(k\right)={x}^{T}\left(k\right)Qx\left(k\right)+J(k+1)\Longrightarrow \\ {x}^{T}\left(k\right)Px\left(k\right)-{x}^{T}(k+1)Px(k+1)={x}^{T}\left(k\right)Qx\left(k\right)\end{array}$$(54)
as
$$\begin{array}{c}\hfill \left({x}^{T}(k)\otimes {x}^{T}(k)-{x}^{T}(k+1)\otimes {x}^{T}(k+1)\right){D}_{n}hvec\left(P\right)={x}^{T}(k)Qx(k).\end{array}$$(55)
In the above equation, hvec(P) denotes a half-vectorization of the symmetric matrix P ∈ ℛ^{n × n}, represents the $\frac{n(n+1)}{2}$ parameters to be identified (considering P = P^{T}) and satisfies
$$\begin{array}{c}\hfill {D}_{n}hvec\left(P\right)=vec\left(P\right),hvec\left(P\right)\in {\mathcal{R}}^{\frac{n(n+1)}{2}},{D}_{n}\in {\mathcal{R}}^{{n}^{2}\times \frac{n(n+1)}{2}}\end{array}$$
with D_{n} being the so-called duplication matrix [50]. Notation ⊗ stands for the Kronecker product. Suppose that a sufficient number of data, x(k + i), i = 0, ⋯, N, are collected, which enables us to write (55) into
$$\begin{array}{c}\mathrm{\Psi}\mathrm{hvec}\left(P\right)=\varphi ,\\ \mathrm{\Psi}=\left[\begin{array}{c}{x}^{T}\left(k\right)\otimes {x}^{T}\left(k\right)-{x}^{T}(k+1)\otimes {x}^{T}(k+1)\\ \vdots \\ {x}^{T}(k+N-1)\otimes {x}^{T}(k+N-1)-{x}^{T}(k+N)\otimes {x}^{T}(k+N)\end{array}\right]{D}_{n},\\ \varphi =\left[\begin{array}{c}{x}^{T}\left(k\right)Qx\left(k\right)\\ \vdots \\ {x}^{T}(k+N-1)Qx(k+N-1)\end{array}\right].\end{array}$$(56)
As a result, on the assumption of sufficient excitation, matrix P can be identified using, for example, a standard LS estimation algorithm. If the difference between the identified and the nominal P goes beyond a decision threshold, performance degradation is declared. Considering that the solution of (50) is a symmetric positive definite (SPD) matrix, the Riemannian metric method [16, 49] can be applied to achieve an efficient degradation detection. In [16], variations of the above algorithm are provided to solve similar performance degradation problems using system output data y(k) instead of the state variable x(k).
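The identification step behind (55)–(56) can be sketched for n = 2, where x^T P x = p_{11}x_1² + 2p_{12}x_1x_2 + p_{22}x_2², so that each regressor row has three entries. The sketch makes several assumptions that are not part of the scheme in [49]: noise-free state data, a few short runs from different initial states to ensure sufficient excitation, a plain normal-equation LS solver, and a simple element-wise deviation with a hypothetical threshold in place of the Riemannian metric. All function names and numerical values are illustrative.

```python
def step(s, x):
    return [s[0][0] * x[0] + s[0][1] * x[1], s[1][0] * x[0] + s[1][1] * x[1]]

def quad_row(x):
    """(x^T kron x^T) D_n for n = 2: coefficients of [p11, p12, p22]."""
    return [x[0] * x[0], 2.0 * x[0] * x[1], x[1] * x[1]]

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small square system."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            fac = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= fac * m[col][c]
    sol = [0.0] * n
    for i in reversed(range(n)):
        sol[i] = (m[i][n] - sum(m[i][j] * sol[j] for j in range(i + 1, n))) / m[i][i]
    return sol

def identify_hvec_p(s, q, initial_states, steps=3):
    """Build Psi and phi as in (56) from simulated data and solve by LS."""
    psi, phi = [], []
    for x in initial_states:
        for _ in range(steps):
            x_next = step(s, x)
            psi.append([u - v for u, v in zip(quad_row(x), quad_row(x_next))])
            qx = step(q, x)                          # Q x(k)
            phi.append(x[0] * qx[0] + x[1] * qx[1])  # x^T(k) Q x(k)
            x = x_next
    n = 3
    ata = [[sum(r[i] * r[j] for r in psi) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * v for r, v in zip(psi, phi)) for i in range(n)]
    return solve_linear(ata, atb)                    # [p11, p12, p22]

S_nominal = [[0.5, 0.1], [0.0, 0.3]]    # hypothetical nominal dynamics
S_degraded = [[0.7, 0.1], [0.0, 0.3]]   # hypothetical degraded dynamics
Q = [[1.0, 0.0], [0.0, 1.0]]
starts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]

hvec_nominal = identify_hvec_p(S_nominal, Q, starts)
hvec_online = identify_hvec_p(S_degraded, Q, starts)
deviation = max(abs(u - v) for u, v in zip(hvec_nominal, hvec_online))
threshold = 0.1                         # illustrative decision threshold
degradation_detected = deviation > threshold
print("hvec(P) nominal:", hvec_nominal)
print("hvec(P) online: ", hvec_online)
print("degradation detected:", degradation_detected)
```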
Note that the detection schemes presented above are limited to the case that u = Fx. Although extensions have been proposed in [16], a general solution for arbitrary input u remains an open issue. In the following example, we present a conceptual solution for performance degradation detection.
Example 6 For simplicity, we only consider controllability gramian as functionality performance with the system model
$$\begin{array}{c}\hfill x(k+1)=Ax(k)+Bu(k),A\phantom{\rule{0.333333em}{0ex}}\text{is Schur}\end{array}$$
and a function
$$\begin{array}{c}\hfill J\left(k\right)=\sum _{i=k}^{\infty}({x}^{T}(i)Qx(i)+{\overline{u}}^{T}\left(i\right)R\overline{u}\left(i\right)),Q=B{B}^{T},\overline{u}\left(i\right)=\{\begin{array}{c}u(k),\phantom{\rule{0.166667em}{0ex}}i=k,\hfill \\ 0,\phantom{\rule{0.166667em}{0ex}}i>k.\hfill \end{array}\end{array}$$
It yields
$$\begin{array}{c}\hfill J\left(k\right)={x}^{T}(k)Qx(k)+{u}^{T}\left(k\right)Ru\left(k\right)+{x}^{T}(k+1)Px(k+1)\\ \hfill =\left[\begin{array}{cc}{x}^{T}(k)& {u}^{T}\left(k\right)\end{array}\right]\left[\begin{array}{cc}{A}^{T}PA+Q& {A}^{T}PB\\ {B}^{T}PA& R+{B}^{T}PB\end{array}\right]\left[\begin{array}{c}x(k)\\ u\left(k\right)\end{array}\right]\\ \hfill =\left[\begin{array}{cc}{x}^{T}(k+1)& 0\end{array}\right]\left[\begin{array}{cc}{A}^{T}PA+Q& {A}^{T}PB\\ {B}^{T}PA& R+{B}^{T}PB\end{array}\right]\left[\begin{array}{c}x(k+1)\\ 0\end{array}\right]\\ \hfill +\left[\begin{array}{cc}{x}^{T}(k)& {u}^{T}\left(k\right)\end{array}\right]\left[\begin{array}{cc}Q& 0\\ 0& R\end{array}\right]\left[\begin{array}{c}x(k)\\ u\left(k\right)\end{array}\right],\\ \hfill {A}^{T}PA-P+Q=0,\phantom{\rule{0.166667em}{0ex}}P>0,\end{array}$$
which can be further written into
$$\begin{array}{c}\left[\begin{array}{cc}{x}^{T}\left(k\right)& {u}^{T}\left(k\right)\end{array}\right]\mathrm{\Phi}\left[\begin{array}{c}x\left(k\right)\\ u\left(k\right)\end{array}\right]-\left[\begin{array}{cc}{x}^{T}(k+1)& 0\end{array}\right]\mathrm{\Phi}\left[\begin{array}{c}x(k+1)\\ 0\end{array}\right]\\ =\left[\begin{array}{cc}{x}^{T}\left(k\right)& {u}^{T}\left(k\right)\end{array}\right]\left[\begin{array}{cc}Q& 0\\ 0& R\end{array}\right]\left[\begin{array}{c}x\left(k\right)\\ u\left(k\right)\end{array}\right],\\ \mathrm{\Phi}=\left[\begin{array}{cc}{A}^{T}PA+Q& {A}^{T}PB\\ {B}^{T}PA& R+{B}^{T}PB\end{array}\right].\end{array}$$(57, 58)
Note that (57) has the same form as (54). Consequently, applying the same procedure as in (55)–(56), matrix Φ can be identified, which then enables a reliable performance degradation detection. It is noteworthy that Φ contains more information than P, which can be used for monitoring other system performance as well. For instance, given $Q={C}_{s}^{T}{C}_{s}$ and R, the value
$$\begin{array}{c}\hfill q={\mathrm{tr}}^{1/2}(\widehat{\mathrm{\Phi}}(2,2)-R)\end{array}$$
with $\widehat{\mathrm{\Phi}}(2,2)=R+{B}^{T}PB$ denoting the identified subblock of matrix Φ, gives an estimation of the ℋ_{2}-norm of transfer function C_{s}(zI−A)^{−1} B, i.e.
$$\begin{array}{c}\hfill q={\Vert {C}_{s}{(zI-A)}^{-1}B\Vert }_{2},\end{array}$$
which could, for example, represent the system dynamics from u to a certain sensor block modelled by C_{s} x.
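The relation between the subblock Φ(2,2) and the ℋ_{2}-norm can be verified on a scalar example. The sketch below assumes hypothetical scalar values for A, B, C_{s} and R, and computes Φ(2,2) directly from the model as a stand-in for the identified block. The result is compared with the ℋ_{2}-norm obtained from the squared impulse response of C_{s}(zI−A)^{−1}B.

```python
import math

a, b, c_s, r_weight = 0.5, 2.0, 3.0, 1.0   # hypothetical scalar A, B, C_s, R

# Scalar solution of A^T P A - P + Q = 0 with Q = C_s^T C_s.
p = c_s ** 2 / (1.0 - a ** 2)

# Stand-in for the identified subblock Phi(2,2) = R + B^T P B.
phi_22 = r_weight + b * p * b
q_est = math.sqrt(phi_22 - r_weight)       # tr^{1/2}(Phi(2,2) - R)

# Reference value: the Markov parameters C_s A^j B, j = 0, 1, ..., are square
# summed (series truncated after 200 terms).
h2_norm = math.sqrt(sum((c_s * a ** j * b) ** 2 for j in range(200)))

print("q from Phi(2,2):", q_est)
print("H2-norm from impulse response:", h2_norm)
```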
Remark 3 Even though only LTI systems are considered in the schemes introduced above, the ideas can be well adopted to address performance degradation monitoring of nonlinear control systems. Below, we schematically outline the conceptual steps of approaching solutions. Let the system performance under monitoring be
$$\begin{array}{c}\hfill J\left(k\right)=\sum _{i=k}^{\infty}q(x(i),u(i)).\end{array}$$
Analogous to (53), it holds
$$\begin{array}{c}\hfill J(k+1)+q(x(k),u(k))-J\left(k\right)=0.\end{array}$$(59)
On the assumption that J(k) as solution of (59) could be approximated by
$$\begin{array}{c}\hfill J\left(k\right)=\sum _{i=1}^{N}{w}_{i}{\varphi}_{i}(x(k),u(k)),\end{array}$$(60)
where {ϕ_{i}(x(k),u(k)),i=1,⋯,N} is a set of basis functions and w_{i}, i = 1, ⋯, N, are weights [51], difference equation (59) is rewritten into
$$\begin{array}{c}\hfill \sum _{i=1}^{N}{w}_{i}({\varphi}_{i}(x(k),u(k))-{\varphi}_{i}(x(k+1),u(k+1)))=q(x(k),u(k)).\end{array}$$(61)
Equation (61) is similar to (54) and can serve as a performance model. During online operations, the system performance can be assessed by an online identification of weights w_{i}, i = 1, ⋯, N, and computation of J(k) according to (60). It is noteworthy that the performance value function J(k) can be generally approximated using NNs [52].
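The weight identification based on (61) can be sketched on a deliberately simple case in which the approximation (60) is exact. The sketch assumes an autonomous scalar system x(k+1) = a x(k) without input, a non-quadratic stage cost q(x) = x² + x⁴, and the two basis functions ϕ_1 = x², ϕ_2 = x⁴. These choices, the function name `identify_weights` and the numerical values are hypothetical and only illustrate the LS step.

```python
def identify_weights(a, initial_states, steps=3):
    """LS estimate of (w1, w2) from the performance model (61)."""
    rows, rhs = [], []
    for x in initial_states:
        for _ in range(steps):
            x_next = a * x
            # phi_i(x(k)) - phi_i(x(k+1)) for phi_1 = x^2 and phi_2 = x^4
            rows.append((x ** 2 - x_next ** 2, x ** 4 - x_next ** 4))
            rhs.append(x ** 2 + x ** 4)        # stage cost q(x(k))
            x = x_next
    # 2 x 2 normal equations solved with Cramer's rule
    a11 = sum(r[0] * r[0] for r in rows)
    a12 = sum(r[0] * r[1] for r in rows)
    a22 = sum(r[1] * r[1] for r in rows)
    b1 = sum(r[0] * v for r, v in zip(rows, rhs))
    b2 = sum(r[1] * v for r, v in zip(rows, rhs))
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

a = 0.5
w1, w2 = identify_weights(a, initial_states=[1.0, 2.0])

# Performance value according to (60), compared with the directly summed cost.
x0 = 1.0
j_from_weights = w1 * x0 ** 2 + w2 * x0 ** 4
j_direct = sum((a ** i * x0) ** 2 + (a ** i * x0) ** 4 for i in range(200))
print("weights:", w1, w2)
print("J from (60):", j_from_weights, " J summed directly:", j_direct)
```

For this example the exact weights are 1/(1 − a²) and 1/(1 − a⁴). A change in a would shift the identified weights, and this shift is what an online monitoring scheme would evaluate.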
At the end of this subsection, we would like to draw the reader’s attention to the fact that application of the aforementioned schemes requires knowledge of the system state vector x(k), which is, unfortunately, not available in most practical applications. It is an open and challenging issue to realize those performance degradation monitoring schemes using system data (u,y) instead of the state vector x. In [16], this issue has been investigated.
4.3. Performance degradation monitoring in the probabilistic setting
Considering that the performance degradation detection schemes presented in the previous subsection are based on the assumption of ideal system models without uncertainty, adaptations are needed before they can be applied efficiently in practice. Although their extensions to systems with normally distributed process and measurement noises have been addressed in [16], efficient handling of model uncertainties remains an open issue. Recently, [53–55] have proposed applying the so-called distributionally robust optimization (DRO) technique [56, 57] to enhance the robustness of fault detection systems against model uncertainties. In particular, it is advantageous that the DRO technique enables problem handling and solutions in a probabilistic setting. In this subsection, we briefly introduce the ideas of applying the DRO technique to performance degradation detection by means of two examples.
In the sequel, notation Ξ is adopted for the support and ℙ for probability. ℙ_{ξ} and 𝔼_{ℙ_{ξ}} represent the probability distribution of ξ and the expectation taken with respect to ξ following ℙ_{ξ}.
Example 7 In this example, we delineate a data-driven realization of performance indicator (49) in the probabilistic setting. Departing from the system model,
$$\begin{array}{cc}\hfill x(k+1)& =Ax(k)+Bu(k)+\omega (k),\hfill \\ \hfill y(k)& =Cx(k)+Du(k)+\upsilon (k).\hfill \end{array}$$
with ω(k),υ(k) being the process and measurement noise vectors, the system dynamics are written as
$$\begin{array}{c}\hfill {y}_{s}(k)={\mathrm{\Gamma}}_{s}x(k-s)+{H}_{u,s}{u}_{s}(k)+{H}_{\omega ,s}{\omega}_{s}(k)+{\upsilon}_{s}(k),\end{array}$$(62)
where y_{s}(k),u_{s}(k),Γ_{s}, H_{u, s} are given in Example 2, and H_{ω, s}, ω_{s}(k), υ_{s}(k) are as follows:
$$\begin{array}{cc}\hfill {H}_{\omega ,s}& =\left[\begin{array}{cccc}0& 0& \phantom{\rule{0.166667em}{0ex}}& \\ C& \ddots & \ddots & \\ \vdots & \ddots & \ddots & 0\\ C{A}^{s-1}& \cdots & C& 0\end{array}\right]\in {\mathcal{R}}^{(s+1)m\times (s+1)n},\hfill \\ \hfill {\omega}_{s}(k)& =\left[\begin{array}{c}\omega (k-s)\\ \vdots \\ \omega (k)\end{array}\right],{\upsilon}_{s}(k)=\left[\begin{array}{c}\upsilon (k-s)\\ \vdots \\ \upsilon (k)\end{array}\right].\hfill \end{array}$$
To simplify our study, assume that the system is stable, x(k−s) is a random vector and ϕ_{s}(k) is a wide sense stationary (w.s.s) stochastic process. We then further write (62) into
$$\begin{array}{cc}\hfill {y}_{s}(k)& ={H}_{u,s}{u}_{s}(k)+{\varphi}_{s}(k),\hfill \\ \hfill {\varphi}_{s}(k)& ={\mathrm{\Gamma}}_{s}x(k-s)+{H}_{\omega ,s}{\omega}_{s}(k)+{\upsilon}_{s}(k).\hfill \end{array}$$(63)
Using the results presented in Example 2, the projection-based residual vector and the corresponding evaluation function are equivalently realized as follows:
$$\begin{array}{cc}\hfill {r}_{\mathcal{I}}(k)& =\left[\begin{array}{c}{u}_{s}(k)\\ {y}_{s}(k)\end{array}\right]-{\mathcal{P}}_{\mathcal{I}}\left[\begin{array}{c}{u}_{s}(k)\\ {y}_{s}(k)\end{array}\right],\hfill \\ \hfill {r}_{s}(k)& ={\mathrm{\Pi}}^{1/2}({y}_{s}(k)-{H}_{u,s}{u}_{s}(k)),\phantom{\rule{0.333333em}{0ex}}\Vert {r}_{\mathcal{I}}(k)\Vert =\Vert {r}_{s}(k)\Vert .\hfill \end{array}$$
Note that r_{s}(k) can be written as
$$\begin{array}{c}\hfill {r}_{s}(k)={\mathrm{\Pi}}^{1/2}(\mathrm{\Delta}{H}_{u,s}{u}_{s}(k)+{\varphi}_{s}(k))=:{\mathrm{\Pi}}^{1/2}{\overline{\varphi}}_{s},\end{array}$$
where ΔH_{u, s} represents uncertainty in the system, which leads to
$$\begin{array}{c}\hfill {\Vert {r}_{\mathcal{I}}(k)\Vert }^{2}={\Vert {r}_{s}(k)\Vert }^{2}={\overline{\varphi}}_{s}^{T}\mathrm{\Pi}{\overline{\varphi}}_{s}.\end{array}$$
Suppose that the distribution of unknown random vector ${\overline{\varphi}}_{s}$ belongs to the moment-based ambiguity set [56],
$$\begin{array}{c}\hfill \mathcal{D}({\gamma}_{1},{\gamma}_{2})=\left\{{\mathbb{P}}_{{\overline{\varphi}}_{s}}\left|\begin{array}{c}\mathbb{P}({\overline{\varphi}}_{s}\in \mathrm{\Xi})=1\\ {({\mathbb{E}}_{{\mathbb{P}}_{{\overline{\varphi}}_{s}}}\left\{{\overline{\varphi}}_{s}\right\}-{\mu}_{0})}^{T}{\mathrm{\Sigma}}_{0}^{-1}({\mathbb{E}}_{{\mathbb{P}}_{{\overline{\varphi}}_{s}}}\left\{{\overline{\varphi}}_{s}\right\}-{\mu}_{0})\le {\gamma}_{1}\\ {\mathbb{E}}_{{\mathbb{P}}_{{\overline{\varphi}}_{s}}}\left\{({\mathbb{E}}_{{\mathbb{P}}_{{\overline{\varphi}}_{s}}}\left\{{\overline{\varphi}}_{s}\right\}-{\mu}_{0}){({\mathbb{E}}_{{\mathbb{P}}_{{\overline{\varphi}}_{s}}}\left\{{\overline{\varphi}}_{s}\right\}-{\mu}_{0})}^{T}\right\}\le {\gamma}_{2}{\mathrm{\Sigma}}_{0}\end{array}\right.\right\},\end{array}$$
where vector μ_{0}, matrix Σ_{0}, and constants γ_{1} ≥ 0, γ_{2} ≥ 1 are estimated using a sufficient number of collected data and thus assumed to be known. It is obvious that the threshold setting
$$\begin{array}{c}\hfill {J}_{\mathit{th}}=\underset{{\overline{\varphi}}_{s}}{\mathrm{sup}}{\overline{\varphi}}_{s}^{T}\mathrm{\Pi}{\overline{\varphi}}_{s}\end{array}$$
would result in considerably conservative performance degradation detection. A more reasonable setting can be achieved in the probabilistic setting as follows:
$$\begin{array}{c}\hfill \forall {\mathbb{P}}_{{\overline{\varphi}}_{s}}\in \mathcal{D}({\gamma}_{1},{\gamma}_{2}),\mathbb{P}({\overline{\varphi}}_{s}^{T}\mathrm{\Pi}{\overline{\varphi}}_{s}>{J}_{\mathit{th}})\le \alpha ,\end{array}$$
where α is a tolerable upper bound of false alarm rate. In this context, the probabilistic performance degradation problem is formulated as: given α ∈ (0,1), solve
$$\begin{array}{c}\mathrm{min}\phantom{\rule{0.166667em}{0ex}}\beta =:{J}_{\mathrm{th}},\\ \underset{{\mathbb{P}}_{{\overline{\varphi}}_{s}}\in \mathcal{D}\left({\gamma}_{1},{\gamma}_{2}\right)}{\mathrm{sup}}\mathbb{P}\left({\overline{\varphi}}_{s}^{T}\mathrm{\Pi}{\overline{\varphi}}_{s}>\beta \right)\le \alpha \end{array}$$(64, 65)
for the threshold J_{th}. The DRO problem (64)–(65) can be solved using well-established DRO techniques, see for example [53, 56].
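To give a feeling for distribution-free thresholds, the sketch below computes a simple feasible value for β in (65). It is not the DRO solution of [53, 56]. It is a Markov-inequality bound under the simplifying assumptions μ_{0} = 0, Π ⪰ 0 and a second-moment bound 𝔼{φ̄φ̄^T} ⪯ γ_{2}Σ_{0}. Under these assumptions, 𝔼{φ̄^TΠφ̄} ≤ γ_{2} tr(ΠΣ_{0}), so that ℙ(φ̄^TΠφ̄ > β) ≤ γ_{2} tr(ΠΣ_{0})/β for every admissible distribution. The resulting threshold satisfies (65) but is in general larger than the minimum in (64). The matrices, the Gaussian test data and the function name `markov_threshold` are hypothetical.

```python
import random

def markov_threshold(pi, sigma0, gamma2, alpha):
    """beta = gamma2 * tr(Pi Sigma0) / alpha, a feasible (conservative) value in (65)."""
    n = len(pi)
    trace = sum(pi[i][j] * sigma0[j][i] for i in range(n) for j in range(n))
    return gamma2 * trace / alpha

def quad_form(x, m):
    n = len(x)
    return sum(x[i] * m[i][j] * x[j] for i in range(n) for j in range(n))

Pi = [[2.0, 0.0], [0.0, 1.0]]        # hypothetical weighting matrix
Sigma0 = [[1.0, 0.0], [0.0, 4.0]]    # hypothetical nominal covariance
alpha, gamma2 = 0.05, 1.0
J_th = markov_threshold(Pi, Sigma0, gamma2, alpha)

# Empirical check with one admissible distribution (zero-mean Gaussian).
random.seed(0)
samples, alarms = 20000, 0
for _ in range(samples):
    phi = [random.gauss(0.0, 1.0), random.gauss(0.0, 2.0)]
    if quad_form(phi, Pi) > J_th:
        alarms += 1
false_alarm_rate = alarms / samples
print("threshold:", J_th, " empirical false alarm rate:", false_alarm_rate)
```

The empirical false alarm rate lies far below α for Gaussian data, which illustrates why solving (64)–(65) for the smallest admissible threshold is worthwhile.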
Example 8 Consider the observer-based input–output model (46)–(47). Suppose that $u(k)=F\widehat{x}(k),$ and the residual vector is a w.s.s. stochastic process over the time interval [k − s, k), and its (unknown) distribution belongs to the moment-based ambiguity set,
$$\begin{array}{cc}\hfill \mathcal{D}({\gamma}_{1},{\gamma}_{2})& =\left\{{\mathbb{P}}_{{r}_{s-1}}\left|\begin{array}{c}\mathbb{P}({r}_{s-1}\in \mathrm{\Xi})=1\\ {\mathbb{E}}_{{\mathbb{P}}_{{r}_{s-1}}}^{T}\left\{{r}_{s-1}\right\}{\mathrm{\Sigma}}_{0}^{-1}{\mathbb{E}}_{{\mathbb{P}}_{{r}_{s-1}}}\left\{{r}_{s-1}\right\}\le {\gamma}_{1}\\ {\mathbb{E}}_{{\mathbb{P}}_{{r}_{s-1}}}\left\{{\mathbb{E}}_{{\mathbb{P}}_{{r}_{s-1}}}\left\{{r}_{s-1}\right\}{\mathbb{E}}_{{\mathbb{P}}_{{r}_{s-1}}}^{T}\left\{{r}_{s-1}\right\}\right\}\le {\gamma}_{2}{\mathrm{\Sigma}}_{0}\end{array}\right.\right\},\hfill \\ \hfill {r}_{s-1}& =\left[\begin{array}{c}r(k-s)\\ \vdots \\ r(k-1)\end{array}\right],\hfill \end{array}$$
where s is a sufficiently large integer so that ${A}_{F}^{s}\approx 0$. We would like to draw the reader’s attention to random vector r_{s − 1}. As described in Section 4.1, it represents uncertainties in the system, including noises and model uncertainty. Define cost function for control performance assessment as
$$\begin{array}{c}J\left(k\right)=\mathbb{E}\sum _{i=k}^{\infty}{\widehat{x}}^{T}\left(i\right)Q\widehat{x}\left(i\right)={\widehat{x}}^{T}\left(k\right)P\widehat{x}\left(k\right),\\ {A}_{F}^{T}P{A}_{F}-P+Q=0,P>0.\end{array}$$(66, 67)
It follows from (46) that
$$\begin{array}{cc}\hfill \widehat{x}\left(k\right)& ={A}_{F}^{s}\widehat{x}(k-s)+{r}_{x}(k)\approx {r}_{x}(k),\hfill \\ \hfill {r}_{x}(k)& =\mathrm{\Theta}{r}_{s-1},\mathrm{\Theta}=\left[\begin{array}{cccc}{A}_{F}^{s-1}L& \cdots & {A}_{F}L& L\end{array}\right].\hfill \end{array}$$(68)
Assume that Θ is of full row-rank. The moment-based ambiguity set of r_{x} is given by
$$\begin{array}{cc}\hfill {\mathcal{D}}_{{r}_{x}}({\gamma}_{3},{\gamma}_{4})& =\left\{{\mathbb{P}}_{{r}_{x}}\left|\begin{array}{c}\mathbb{P}({r}_{x}\in \mathrm{\Xi})=1\\ {\mathbb{E}}_{{\mathbb{P}}_{{r}_{x}}}^{T}\left\{{r}_{x}\right\}{\overline{\mathrm{\Sigma}}}_{0}^{-1}{\mathbb{E}}_{{\mathbb{P}}_{{r}_{x}}}\left\{{r}_{x}\right\}\le {\gamma}_{3}\\ {\mathbb{E}}_{{\mathbb{P}}_{{r}_{x}}}\left\{{\mathbb{E}}_{{\mathbb{P}}_{{r}_{x}}}\left\{{r}_{x}\right\}{\mathbb{E}}_{{\mathbb{P}}_{{r}_{x}}}{\left\{{r}_{x}\right\}}^{T}\right\}\le {\gamma}_{4}{\overline{\mathrm{\Sigma}}}_{0}\end{array}\right.\right\},\hfill \\ \hfill {\overline{\mathrm{\Sigma}}}_{0}& =\mathrm{\Theta}{\mathrm{\Sigma}}_{0}{\mathrm{\Theta}}^{T},\hfill \end{array}$$
where γ_{3}, γ_{4} and ${\overline{\mathrm{\Sigma}}}_{0}$ are known. The probabilistic performance degradation detection problem is then formulated as: given α ∈ (0,1), solve
$$\begin{array}{c}\mathrm{min}\phantom{\rule{0.166667em}{0ex}}\beta =:{J}_{\mathrm{th}},\\ \underset{{\mathbb{P}}_{{r}_{x}}\in {\mathcal{D}}_{{r}_{x}}\left({\gamma}_{3},{\gamma}_{4}\right)}{\mathrm{sup}}\mathbb{P}\left({r}_{x}^{T}P{r}_{x}>\beta \right)\le \alpha \end{array}$$(69, 70)
for the threshold J_{th}.
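The approximation in (68) can be checked with a scalar simulation. The sketch below assumes hypothetical scalar values for A_{F} and L, a window length s = 20 and Gaussian residual samples as a stand-in for r_{s−1}. It builds Θ as in (68), compares Θ r_{s−1} with the simulated observer state, and evaluates the cost (66) with the scalar solution of (67).

```python
import random

def theta_row(a_f, l_gain, s):
    """Theta = [A_F^(s-1) L, ..., A_F L, L] for a scalar system."""
    return [a_f ** (s - 1 - j) * l_gain for j in range(s)]

random.seed(1)
a_f, l_gain, s = 0.5, 0.4, 20              # hypothetical A_F, L and window
residuals = [random.gauss(0.0, 1.0) for _ in range(s)]   # r(k-s), ..., r(k-1)

# Simulate the observer state: x_hat(i+1) = A_F x_hat(i) + L r(i) (u = F x_hat).
x_hat_start = 3.0                          # x_hat(k-s), arbitrary start value
x_hat = x_hat_start
for r in residuals:
    x_hat = a_f * x_hat + l_gain * r

r_x = sum(t * r for t, r in zip(theta_row(a_f, l_gain, s), residuals))
neglected = a_f ** s * x_hat_start         # term A_F^s x_hat(k-s) dropped in (68)

q = 1.0
p = q / (1.0 - a_f ** 2)                   # scalar solution of (67)
cost_estimate = p * r_x ** 2               # r_x^T P r_x used in (69)-(70)

print("x_hat(k) =", x_hat, " Theta r_{s-1} =", r_x)
print("neglected term:", neglected)
print("cost estimate:", cost_estimate)
```

With A_{F}^{s} ≈ 0, the observer state and hence the cost are determined by the residual window alone, which is what allows the ambiguity set of r_{s−1} to be mapped to that of r_{x}.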
The above two examples showcase that the DRO technique can serve as a powerful tool to deal with performance degradation detection issues efficiently. It is noteworthy that various ambiguity sets are investigated in the DRO framework [56], which enables us to handle different types of model uncertainties and study performance degradation detection issues both in model-based and data-driven fashions. A further aspect is to address safety issues in a probabilistic setting [58]. For instance, let
$$\begin{array}{c}\hfill {\mathcal{S}}_{x}=\left\{x\mid {g}_{i}(x)\le 0,i=1,\cdots ,\kappa \right\},\end{array}$$
denote the set of the system state variables that are in the safe region defined by the safety requirements g_{i}(x)≤0, i = 1, ⋯, κ. Then, the probability,
$$\begin{array}{c}\hfill \mathbb{P}(x\in {\mathcal{S}}_{x})>\beta \gg 0,\end{array}$$(71)
can be, as a constraint, embedded in a probabilistic performance degradation detection and recovery problem.
5. Conclusion
In this note, we have discussed diagnosis and performance degradation detection issues from an integrated viewpoint of functionality maintenance and cyber security of automatic control systems. Three aspects have been addressed:

application of a control and detection unified framework to enhancing the diagnosis capability of feedback control systems, in which the functionalization of the control system plays an essential role. It is showcased that rational utilization of the residual signal as an information provider and cyber security-oriented configuration of functional units of the control system promise enhanced capacity for detecting technical faults and cyber attacks, and for preventing attackers from gaining system knowledge by means of system identification using the transmitted data;

projection-based technique of detecting faults in dynamic systems, which is based on an orthogonal projection of the system data onto the system image and kernel subspaces. This technique is more capable than the well-established observer-based schemes in detecting faults in dynamic systems. In addition, and more importantly, it enables explainable applications of ML-based techniques like AE methods to diagnosis. It is illustrated that complementary application of model- and ML-based methods is the future of the diagnosis technique for industrial automatic control systems;

system performance degradation detection, which is of elemental importance for industrial CPSs and, unfortunately, has received less attention in the research domain. The residual-centred model form for dynamic systems is a useful system tool to deal with performance degradation detection issues. Moreover, some performance degradation monitoring schemes are introduced, whose core, roughly speaking, is modelling of system performance and online identification of the associated model parameters. It is demonstrated that by means of the DRO technique, performance degradation detection can be handled in a probabilistic setting, which enables an efficient and more reliable degradation detection.
We have reported ideas, presented conceptual schemes, and illustrated by means of examples why research efforts in these three aspects could contribute to the future development of capable monitoring and diagnosis methods towards enhancing functional safety and cyber security of automatic control systems. We would like to mention that a number of the basic design schemes and algorithms reported in this note have been successfully tested on laboratory systems, including

application of the control and detection unified framework to cyber-attack detection in a three-tank control system [23],

projection-based fault detection in a three-tank control system [24],

DRO technique-based fault detection in a three-tank control system [53, 55],

performance degradation monitoring and recovery of a vision-based inverted pendulum control system [59].
The focus of this note is on diagnosis and performance degradation detection issues. Key maintenance technologies like condition monitoring (CM), prognostics and health management (PHM), performance degradation recovery (PDR) or fault-tolerant control (FTC) have not been addressed. The interested reader is referred to [5, 16, 25, 60–63] and references cited therein. We would like to emphasize two aspects of fault diagnosis and performance degradation monitoring in automatic control systems. On the one hand, they form the technical basis and an indispensable part of technologies like CM, PHM, PDR and FTC. Consequently, their development is strongly shaped by progress in these technologies. On the other hand, as a basic function of today’s automatic control systems, fault diagnosis and performance monitoring should match ongoing developments in automatic control systems. CPS, internet of things (IoT) and cloud computing as a service are the key technologies that will decisively impact the evolution of automatic control systems in the era of Industry 4.0. In this context, integrated study on functional safety and cyber security of automatic control systems is of essential importance. Our work reported in this note is a contribution to this study.
Conflict of Interest
The author declares no conflict of interest.
Data Availability
The original data are available from the corresponding author upon reasonable request.
Acknowledgments
The author is very grateful to Dr.-Ing. L. Li for the collaborative work on the unified framework of control and detection as well as on the projection-based detection methods, to Dr.-Ing. Z. Chen for the valuable contributions to ML-methods and the AE-based realization of projection methods, and to Dr. D. Zhao for the intensive and valuable discussions on cyber security issues. The author is also thankful to the anonymous reviewers for their valuable and constructive comments and suggestions.
Funding
This research did not receive any funding.
References
[1] Frank PM. Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy - A survey. Automatica 1990; 26: 459–74.
[2] Frank PM and Ding X. Survey of robust residual generation and evaluation methods in observer-based fault detection systems. J Process Contr 1997; 7: 403–24.
[3] Ding SX, Zhang P and Yin S et al. An integrated design framework of fault-tolerant wireless networked control systems for industrial automatic control applications. IEEE Trans Ind Inform 2013; 9: 462–71.
[4] Gao ZW, Cecati C and Ding SX. A survey of fault diagnosis and fault-tolerant techniques, part I: Fault diagnosis with model-based and signal-based approaches. IEEE Trans Ind Electron 2015; 62: 3757–67.
[5] Hwang I, Kim S and Kim Y et al. A survey of fault detection, isolation, and reconfiguration methods. IEEE Trans Contr Syst Tech 2010; 18: 636–53.
[6] Wen CL, Lv FY and Bao ZJ et al. A review of data driven-based incipient fault diagnosis. Acta Automat Sin 2016; 42: 1285–99.
[7] Zhou DH, Zhao Y and Wang Z et al. Review on diagnosis techniques for intermittent faults in dynamic systems. IEEE Trans Ind Electron 2020; 67: 2337–47.
[8] Dibaji SM, Pirani M and Flamholz DB et al. A systems and control perspective of CPS security. Ann Rev Contr 2019; 47: 394–411.
[9] Ding D, Han QL and Xiang Y et al. A survey on security control and attack detection for industrial cyber-physical systems. Neurocomputing 2018; 275: 1674–83.
[10] Giraldo J, Urbina D and Cardenas A et al. A survey of physics-based attack detection in cyber-physical systems. ACM Comput Surv 2018; 51: 76.
[11] Pasqualetti F, Doerfler F and Bullo F. Attack detection and identification in cyber-physical systems. IEEE Trans Automat Contr 2013; 58: 2715–29.
[12] Tan S, Guerrero JM and Xie P et al. Brief survey on attack detection methods for cyber-physical systems. IEEE Syst J 2020; 14: 5329–39.
[13] Yan W, Mestha LK and Abbaszadeh M. Attack detection for securing cyber physical systems. IEEE Internet Things J 2019; 6: 8471–81.
[14] Zhang D, Wang QG and Feng G et al. A survey on attack detection, estimation and control of industrial cyber-physical systems. ISA Trans 2021; 116: 1–16.
[15] Zhou C, Hu B and Shi Y et al. A unified architectural approach for cyberattack-resilient industrial control systems. Proc IEEE 2021; 109: 517–41.
[16] Ding SX. Advanced Methods for Fault Diagnosis and Fault-tolerant Control. Berlin: Springer-Verlag, 2020.
[17] Griffioen P, Weerakkody S and Sinopoli B. A moving target defense for securing cyber-physical systems. IEEE Trans Automat Contr 2021; 66: 2016–31.
[18] Schellenberger C and Zhang P. Detection of covert attacks on cyber-physical systems by extending the system dynamics with an auxiliary system. In: 2017 IEEE 56th Annual Conference on Decision and Control (CDC), Melbourne, Australia, 2017, 1374–9.
[19] Weerakkody S and Sinopoli B. Detecting integrity attacks on control systems using a moving target approach. In: 2015 54th IEEE Conference on Decision and Control (CDC), Osaka, Japan, 2015, 5820–6.
[20] Ferrari RMG and Teixeira AMH. A switching multiplicative watermarking scheme for detection of stealthy cyberattacks. IEEE Trans Automat Contr 2021; 66: 2558–73.
[21] Mo Y, Weerakkody S and Sinopoli B. Physical authentication of control systems: Designing watermarked control inputs to detect counterfeit sensor outputs. IEEE Contr Syst Mag 2015; 35: 93–109.
[22] Porter M, Hespanhol P and Aswani A et al. Detecting generalized replay attacks via time-varying dynamic watermarking. IEEE Trans Automat Contr 2021; 66: 3502–17.
[23] Ding SX, Li L and Zhao D et al. Application of the unified control and detection framework to detecting stealthy integrity cyberattacks on feedback control systems. Automatica 2022; 142: 110352.
[24] Ding SX, Li L and Liu T. An alternative paradigm of fault diagnosis in dynamic systems: Orthogonal projection-based methods. ArXiv preprint [arXiv:2202.08108], 2022.
[25] Ding SX and Li L. Control performance monitoring and degradation recovery in automatic control systems: A review, some new results, and future perspectives. Contr Eng Pract 2021; 111: 104790.
[26] Vinnicombe G. Uncertainty and Feedback: H∞ Loop-Shaping and the ν-Gap Metric. London, UK: World Scientific, 2000.
[27] Zhou K. Essentials of Robust Control. Englewood Cliffs, NJ: Prentice-Hall, 1998.
[28] Ding SX, Yang G and Zhang P et al. Feedback control structures, embedded residual signals and feedback control schemes with an integrated residual access. IEEE Trans Contr Syst Tech 2010; 18: 352–67.
[29] Li L, Luo H and Ding SX et al. Performance-based fault detection and fault-tolerant control for automatic control systems. Automatica 2019; 99: 309–16.
[30] Schulze DM, Alexandru AB and Quevedo DE et al. Encrypted control for networked systems: An illustrative introduction and current challenges. IEEE Contr Syst Mag 2021; 41: 58–78.
[31] Feintuch A. Robust Control Theory in Hilbert Space. New York: Springer-Verlag, 1998.
[32] Han H, Yang Y and Li L et al. Control performance-based fault detection and fault-tolerant control schemes for a class of nonlinear systems. Int J Robust Nonlinear Control 2019; 30: 1431–50.
[33] Han H, Yang Y and Li L et al. Performance-based fault detection and fault-tolerant control for nonlinear systems with T-S fuzzy implementation. IEEE Trans Cybern 2021; 51: 801–14.
[34] Ding SX. Model-Based Fault Diagnosis Techniques - Design Schemes, Algorithms, and Tools. Berlin: Springer-Verlag, 2008.
[35] Francis BA. A Course in H-Infinity Control Theory. Berlin - New York: Springer-Verlag, 1987.
[36] Kato T. Perturbation Theory for Linear Operators. Berlin: Springer-Verlag, 1995.
[37] Hoffmann JW. Normalized coprime factorizations in continuous and discrete time - a joint state-space approach. IMA J Math Contr Inform 1996; 13: 359–84.
[38] Li L and Ding SX. Gap metric techniques and their application to fault detection performance analysis and fault isolation schemes. Automatica 2020; 118: 109029.
[39] Van der Schaft A. L2-Gain and Passivity Techniques in Nonlinear Control. London: Springer, 2000.
[40] Bengio Y, Courville A and Vincent P. Representation learning: A review and new perspectives. IEEE Trans Pattern Anal Mach Intell 2013; 35: 1798–1828.
[41] Geiger BC. On information plane analyses of neural network classifiers - a review. IEEE Trans Neural Netw Learn Syst 2021, in press. https://doi.org/10.1109/TNNLS.2021.3089037.
[42] Burkart N and Huber MF. A survey on the explainability of supervised machine learning. J Artif Intell Res 2021; 70: 245–317.
[43] Bauer M, Horch A and Xie L et al. The current state of control loop performance monitoring, a survey of application in industry. J Process Contr 2016; 38: 1–10.
[44] Li L and Ding SX. Performance supervised fault detection schemes for industrial feedback control systems and their data-driven implementation. IEEE Trans Ind Inform 2020; 16: 2849–58.
[45] Perez T, Goodwin GC and Seron MM. Performance degradation in feedback control due to constraints. IEEE Trans Automat Contr 2003; 48: 1381–5.
[46] Zhang Y and Jiang J. Fault tolerant control system design with explicit consideration of performance degradation. IEEE Trans Aerosp Electron Syst 2003; 39: 838–48.
[47] Zhang Y, Jiang J and Theilliol D. Incorporating performance degradation in fault tolerant control system design with multiple actuator failures. J Contr Automat Syst 2008; 6: 327–38.
[48] Li L, Ding SX and Luo H et al. Performance-based fault-tolerant control approaches for industrial processes with multiplicative faults. IEEE Trans Ind Inform 2020; 16: 4759–68.
[49] Li L, Li S and Ding SX et al. Riemannian metric based performance monitoring and diagnosis for a class of feedback control systems. Acta Automat Sin 2022, in press. https://doi.org/10.16383/j.aas.c210027.
[50] Magnus JR. Linear Structures. Oxford, UK: Oxford University Press, 1988.
[51] Parr R, Li L and Taylor G et al. An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning. In: Proceedings of the 25th International Conference on Machine Learning. ICML '08, 2008. Association for Computing Machinery, New York, NY, USA, 752–9.
[52] Al-Tamimi A, Lewis FL and Abu-Khalaf M. Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof. IEEE Trans Syst Man Cybern Part B (Cybern) 2008; 38: 943–9.
[53] Shang C, Ding SX and Ye H. Distributionally robust fault detection design and assessment for dynamical systems. Automatica 2021; 125: 109434.
[54] Wan Y, Ma Y and Zhong M. Distributionally robust trade-off design of parity relation based fault detection systems. Int J Robust Nonlinear Contr 2021; 31: 9149–74.
[55] Xue T, Zhong M and Li L et al. An optimal data-driven approach to distribution independent fault detection. IEEE Trans Ind Inform 2020; 16: 6826–36.
[56] Lin F, Fang X and Gao Z. Distributionally robust optimization: A review on theory and applications. Numer Algeb Contr Optim 2022; 12: 159–212.
[57] Rahimian H and Mehrotra S. Distributionally robust optimization: A review. ArXiv preprint [arXiv:1908.05659], 2019.
[58] Yang I. A dynamic game approach to distributionally robust safety specifications for stochastic systems. Automatica 2018; 94: 94–101.
[59] Xu Y, Ding SX and Yin S et al. Performance degradation monitoring and recovery of vision-based control systems. IEEE Trans Contr Syst Technol 2021; 29: 2712–9.
[60] Lei Y, Li N and Guo L et al. Machinery health prognostics: A systematic review from data acquisition to RUL prediction. Mech Syst Signal Process 2018; 104: 799–834.
[61] Liao L and Köttig F. Review of hybrid prognostics approaches for remaining useful life prediction of engineered systems, and an application to battery life prediction. IEEE Trans Reliabil 2014; 63: 191–207.
[62] Si X, Ren Z and Hu X et al. A novel degradation modeling and prognostic framework for closed-loop systems with degrading actuator. IEEE Trans Ind Electron 2020; 67: 9635–47.
[63] Yin S, Xiao B and Ding SX et al. A review on recent development of spacecraft attitude fault-tolerant control system. IEEE Trans Ind Electron 2016; 63: 3311–20.
Steven Ding received his Ph.D. degree in electrical engineering from the Gerhard-Mercator University of Duisburg, Germany, in 1992. From 1992 to 1994, he was an R&D engineer at Rheinmetall GmbH. From 1995 to 2001, he was a professor of control engineering at the University of Applied Science Lausitz in Senftenberg, Germany, and served as vice president of this university during 1998–2000. Since 2001, he has been a chair professor of control engineering and the head of the Institute for Automatic Control and Complex Systems (AKS) at the University of Duisburg-Essen, Germany. His research interests are model-based and data-driven fault diagnosis, control and fault-tolerant systems as well as their applications in industry with a focus on automotive systems, chemical processes and renewable energy systems.
All Figures
Figure 1. Feedback control loop under consideration
Figure 2. The original configuration of the automatic control system under consideration
Figure 3. Reconfiguration of the automatic control system under consideration
Figure 4. Schematic description of projection-based classification ($\mathcal{P}_{\mathcal{K}}\alpha$ denotes the projection of $\alpha$ onto $\mathcal{K}$)
Figure 5. Basic configuration of an autoencoder
Figure 6. Schematic configuration of the fault estimator
Figure 7. From the standard model to the observer-based I/O-model: a schematic description
Figure 8. Schematic configuration of a multilayer automatic control system