A few days ago, while I was chatting informally with one of the main speakers at a security group conference, he stated one of the biggest truths I have ever heard: everything in life is about trust. Humans need to trust each other, and humans also need trustworthy systems: systems we can rely on.
However, how much can we rely on the systems that are part of our lives? How trustworthy is the train you take every morning while you hold and turn the pages of your favourite book? Automated vehicles may eventually be widely adopted, but can the safety of automated driving be validated at all?
There is no doubt: we are living in an increasingly hazardous world. Everybody can make mistakes. Everything can go wrong. Hardware or software faults sometimes lead to failures that have no severe consequences, and at other times to failures that are catastrophic. In certain safety-critical applications (e.g. in automotive or aerospace), these failures may result in life-threatening situations.
Hence, dependability, which Laprie defined as the ability to deliver services that can justifiably be trusted, must be attained in safety-critical applications and consequently requires strict regulation, validation, and monitoring of the development processes.
Fault injection is a dependability validation technique based on conducting controlled experiments. In detail, faults are deliberately introduced into the system so that its behaviour in the presence of faults can be observed. Researchers and engineers have created many novel methods to inject faults, which can be implemented at different design levels and are commonly used to pursue the following objectives:
- Understanding a system’s behaviour in the presence of real faults
- Verifying the fault tolerance mechanisms included in the target system, so that design faults can be removed
- Forecasting the faulty behaviour of a target system by measuring the coverage (or efficiency) and the latency of its fault tolerance mechanisms
- Exploring the effects of different workloads on the effectiveness of fault tolerance mechanisms
- Identifying weak links or single points of failure (a single fault leading to severe consequences) in the design
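To make the coverage and latency measurements in the list above more concrete, here is a minimal sketch of how they could be computed from the records of a fault injection campaign. The `Experiment` record layout and the sample values are assumptions made for illustration; they are not taken from any particular tool.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Experiment:
    """Outcome of one fault injection run (hypothetical record layout)."""
    injected_at: float            # time the fault was injected, in seconds
    detected_at: Optional[float]  # time a mechanism reacted, None if it never did


def coverage(experiments: List[Experiment]) -> float:
    """Fraction of injected faults that a fault tolerance mechanism handled."""
    if not experiments:
        raise ValueError("no experiments recorded")
    handled = sum(1 for e in experiments if e.detected_at is not None)
    return handled / len(experiments)


def mean_latency(experiments: List[Experiment]) -> Optional[float]:
    """Mean time from injection to detection, over detected faults only."""
    latencies = [e.detected_at - e.injected_at
                 for e in experiments if e.detected_at is not None]
    return mean(latencies) if latencies else None


# Made-up campaign: four injected faults, one of them never detected.
campaign = [
    Experiment(injected_at=1.0, detected_at=1.5),
    Experiment(injected_at=2.0, detected_at=2.25),
    Experiment(injected_at=3.0, detected_at=None),
    Experiment(injected_at=4.0, detected_at=4.75),
]

print(coverage(campaign))      # 0.75
print(mean_latency(campaign))  # 0.5
```

Undetected faults are excluded from the latency figure on purpose: they already count against coverage, and they have no detection time to average.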
Fault injection has been investigated in depth by both academia and industry. Functional safety standards such as IEC 61508 mandate the use of this technique for system development, while the automotive functional safety standard ISO 26262 highly recommends it.
Although it is quite a mature technique, its application in early design phases, during which only models of the system exist, is still uncommon.
With the advance of more complex systems, such as autonomous robots or automated road vehicles, novel challenges in dependability assessment arise. Because the number of potentially hazardous situations has increased, traditional safety and security analysis techniques are no longer sufficient. To manage this new complexity, extending traditional analysis techniques with simulation-based approaches is a promising way to virtually validate a system during the early design phases.
The Sabotage simulation-based fault injection framework is such a solution, as it combines model-based design with fault injection simulation tests for an early safety assessment. Furthermore, it allows a virtual vehicle or a robot to be included in the testing loop in order to measure the effects of faults at system level in different domains.
In the automotive domain, this framework can be used to determine (human) controllability and the fault tolerant time interval (FTTI) during early design phases. FTTI is defined as the time span in which a fault can be present in a system before a hazardous event occurs. Moreover, fault injection can be applied to dimension monitoring functions by determining a system’s maximum response time before a hazardous event occurs, as required, for instance, in highly automated driving.
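As a rough illustration of how an FTTI could be estimated in simulation, the sketch below injects a permanent fault into a toy lateral-control model and measures the time until the hazardous event. Everything here is an assumption made for the example: the vehicle model, the `time_to_hazard` helper and all parameter values are hypothetical and do not reflect the Sabotage implementation.

```python
def time_to_hazard(drift_rate, lane_limit, inject_step, dt=0.01, max_steps=10_000):
    """Return the time from fault injection to the hazardous event, or None.

    Toy model (an assumption for illustration): a healthy controller pulls
    the lateral offset back to zero, while the faulty one lets the vehicle
    drift sideways at `drift_rate` metres per second. The hazardous event
    is leaving the lane, i.e. |offset| > lane_limit.
    """
    offset = 0.0
    for step in range(max_steps):
        faulty = step >= inject_step
        # Healthy: proportional correction. Faulty: constant sideways drift.
        rate = drift_rate if faulty else -2.0 * offset
        offset += rate * dt
        if abs(offset) > lane_limit:
            # Elapsed simulated time since the fault became active.
            return (step + 1 - inject_step) * dt
    return None  # no hazardous event within the simulated horizon


# Fault injected after 1 s of healthy driving (100 steps of 10 ms).
ftti_estimate = time_to_hazard(drift_rate=0.5, lane_limit=1.0, inject_step=100)
print(f"estimated FTTI: {ftti_estimate:.2f} s")
```

With a drift of 0.5 m/s and a 1 m margin, the estimate comes out at roughly two seconds. A monitoring function would then have to detect the fault and bring the system to a safe state within that interval.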
The Sabotage framework has been developed to set up, configure, execute and analyse simulations. The configuration process includes setting up a virtual environment (driving circuit or robot environment) and providing the information needed to build faulty Systems Under Test (SUTs). The SUT (a Simulink or SCADE behavioural model) is extended with extra blocks called Saboteurs (Simulink S-functions), which reproduce the faulty behaviour of different components such as a sensor or an Electronic Control Unit.
Adding signal injectors or saboteurs at the inputs, together with read-out blocks or monitors at the outputs, is a viable way to conduct complex fault injection campaigns. Fault models can thereby be selected by identifying potential prototypical failure modes (e.g. too high, too low and too late). In brief, the following information is considered:
- Where should the faults be injected?
- What is the most appropriate fault model representing the functional failure modes?
- How should the faults be triggered within the system?
- Where should the fault effect be observed?
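The four questions above map naturally onto a small fault description plus a saboteur that applies it to a signal. The following is a minimal sketch in plain Python rather than a Simulink S-function; the `FaultSpec` fields, the signal names and the default `offset` and `delay` values are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FaultSpec:
    """One fault injection experiment (hypothetical layout)."""
    location: str    # where the fault is injected (signal name)
    model: str       # fault model: "too_high", "too_low" or "too_late"
    start_step: int  # trigger: first sample at which the fault is active
    end_step: int    # trigger: first sample at which it is inactive again
    observe: str     # where the fault effect is observed (output name)


def saboteur(samples: List[float], spec: FaultSpec,
             offset: float = 5.0, delay: int = 2) -> List[float]:
    """Return a copy of `samples` with the fault applied while triggered."""
    out = []
    for i, value in enumerate(samples):
        active = spec.start_step <= i < spec.end_step
        if not active:
            out.append(value)                        # pass through unchanged
        elif spec.model == "too_high":
            out.append(value + offset)               # value above the real one
        elif spec.model == "too_low":
            out.append(value - offset)               # value below the real one
        elif spec.model == "too_late":
            out.append(samples[max(0, i - delay)])   # stale, delayed value
        else:
            raise ValueError(f"unknown fault model: {spec.model}")
    return out


speed = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
spec = FaultSpec(location="wheel_speed_sensor", model="too_late",
                 start_step=2, end_step=5, observe="brake_command")
print(saboteur(speed, spec))  # [10.0, 11.0, 10.0, 11.0, 12.0, 15.0]
```

Keeping the description separate from the saboteur means that the same model can be reused across a whole campaign by swapping only the `FaultSpec`.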
By comparing fault-free simulations (Golden SUT) with faulty ones (Faulty SUT), tests can be automated and results obtained. The results of the simulation experiments complete the safety analysis and help to dimension the safety concept by considering the system’s fault tolerant time intervals. They thus assist in determining the required level of fault tolerance (e.g. redundancy or graceful degradation), identifying hazards (e.g. the vehicle does not turn when it should) and ranking the failure modes with respect to fault occurrence.
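The golden-versus-faulty comparison can be automated along these lines. This is a sketch under stated assumptions: the toy braking `controller`, the `stuck_at_far` fault and the sensor trace are all invented for the example and stand in for a real behavioural model.

```python
def controller(distance_m):
    """Toy SUT: request braking when the obstacle is closer than 20 m."""
    return 1.0 if distance_m < 20.0 else 0.0


def run(distances, fault=None):
    """Run the SUT over a sensor trace, optionally through a fault function."""
    return [controller(fault(i, d) if fault else d)
            for i, d in enumerate(distances)]


def first_deviation(golden, faulty, tolerance=0.0):
    """Index of the first sample where the outputs differ, or None."""
    for i, (g, f) in enumerate(zip(golden, faulty)):
        if abs(g - f) > tolerance:
            return i
    return None


def stuck_at_far(i, distance_m):
    """Hypothetical fault: from sample 2 on, the sensor is frozen at 50 m."""
    return 50.0 if i >= 2 else distance_m


trace = [50.0, 40.0, 30.0, 19.0, 10.0, 5.0]
golden = run(trace)                # Golden SUT: no fault injected
faulty = run(trace, stuck_at_far)  # Faulty SUT: same trace, fault active

print(first_deviation(golden, faulty))  # 3
```

Here the faulty run never requests braking, and the first deviation at sample 3 marks the point where the fault effect becomes visible at the output.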
On the path towards autonomous driving, fault injection could further be used to inject errors by generating erroneous patterns, such as flipped images from image sensors. In this way, such input tests can evaluate the residual risk arising from real-life situations that could trigger hazardous behaviour of the system once it is integrated in the vehicle.
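As a simple illustration of such an erroneous pattern, the sketch below flips a camera frame represented as a nested list of pixel values. The frame layout and function names are assumptions for the example; a real pipeline would work on the sensor's native image format.

```python
def flip_horizontal(image):
    """Mirror each row, as if the sensor delivered a left-right swapped frame."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Reverse the row order, as if the frame arrived upside down."""
    return [list(row) for row in reversed(image)]


# Tiny 2x3 grey-scale frame with made-up pixel values.
frame = [
    [1, 2, 3],
    [4, 5, 6],
]

print(flip_horizontal(frame))  # [[3, 2, 1], [6, 5, 4]]
print(flip_vertical(frame))    # [[4, 5, 6], [1, 2, 3]]
```

Feeding the flipped frames to the perception function and comparing its output with the output for the original frame shows whether the error is detected, tolerated or silently propagated.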
Alongside these safety concerns, security is also tightly related to dependability. Sabotage could moreover be extended to reproduce the effects of security attacks and evaluate how those attacks affect system safety. The main goal of an early analysis of the resistance against fault attacks is to allow designers to easily identify the weakest point of their design and, consequently, protect it with appropriate countermeasures.
In essence, safety is paramount when developing any vehicle, whether driven by a person or a computer. As such, simulation can be an effective means of testing dangerous or uncommon driving conditions. However, how much can we trust simulation results? To increase the confidence level of those fault injection simulations, designs and tests should be compared with the ones implemented in a model car. Furthermore, by using several model cars, both local failures (sensor, Electronic Control Unit, actuator) and failures related to communication between cars can be reproduced.
This topic is tackled in the AMASS (Architecture-driven, Multi-concern and Seamless Assurance and Certification of Cyber-Physical Systems) ECSEL initiative, which contributes to the development of an ecosystem for assurance and certification of Cyber-Physical Systems (CPS) in the largest industrial vertical markets, including automotive, railway, aerospace, space and energy.
There is still much to do when it comes to safe autonomous systems. However, the presented solution sheds light on the growing complexity of trustworthy autonomous systems by providing a method for the early validation of safety concepts.
Now you are probably wondering: how trustworthy are the fault injection experiments? That's a matter of art - the art of fault injection.