53 Cards in this Set
- Front
- Back
System Model |
How we view our system. |
|
Fault Model |
What are the potential problems? |
|
Specification |
Oracle for system behavior. |
|
System |
A set of elements that work together to provide a service. |
|
Element |
An entity that provides a predefined service and is able to communicate with other elements. |
|
Service and Trustworthy Service |
The behavior of a system as perceived by its user at the system interface. A trustworthy service is both correct and timely. |
|
Failure |
Occurs when the services provided by a system observably deviate from specification. |
|
Error |
An error is an erroneous system state that can lead to a failure. May be detectable and may not necessarily lead to a failure. |
|
Fault |
An undesirable event, hypothesized cause of error. Can be permanent, transient or intermittent. |
|
Fault Prevention |
Focus on eliminating the conditions for faults to be activated. e.g. Modular Software Design, Software Development Methods |
|
Fault Removal |
Three stage process: validation, diagnosis, correction. e.g. debugging, code reviews. |
|
Fault Forecasting |
Concerned with estimating the number of faults, their likelihood of activation and their wider consequences. A quantitative estimate of how good the software is, e.g. fault injection. |
|
Fault Tolerance |
Actively handling the occurrence of faults/errors such that a system is still able to meet its specification. |
|
Reliability |
The probability that a system provides correct service over a specified time period. |
|
Fail-safe |
System becomes safe when it cannot operate. Ensures safety. |
|
Fail-operational |
System continues to operate when it fails. Ensures liveness, sometimes violates safety. |
|
Fail-secure |
Fail-secure systems maintain maximum security when they cannot operate. Potentially large safety violations. |
|
Fail-silent |
System either functions correctly or stops functioning after an internal failure is detected. |
|
Fault-tolerant |
System continues to operate correctly when subsystems operate incorrectly. |
|
Availability |
The probability a service provided by a system is operating correctly at a given time. |
|
Safety |
The extent to which a system can operate without damaging or endangering its environment. |
|
Confidentiality |
Concerned with the non-disclosure of undue information to unauthorized entities. |
|
Integrity |
The capacity of a computer system to ensure the absence of improper system alterations, with regard to the withholding, modification and deletion of information. |
|
Maintainability |
The probability that a failed computer system will be repaired within time t. |
|
Fail-Safe Fault Tolerance |
Program satisfies safety only. |
|
Non-masking Fault Tolerance |
Program satisfies liveness only. |
|
Fault Intolerant |
Program satisfies neither safety nor liveness. |
|
Masking Fault Tolerance |
Program satisfies both safety and liveness. |
|
Detector |
A class of program components that asserts the validity of a predicate in a running program. Necessary and sufficient for fail-safe fault tolerance. |
|
Corrector |
A class of program components that imposes a given predicate on a running program. Necessary and sufficient to design non-masking fault tolerance. |
|
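The detector and corrector cards can be illustrated with a minimal sketch. The invariant (`level_in_range`), the repair action (`clamp_level`) and the state layout are hypothetical, chosen only for illustration.

```python
from typing import Callable, Dict

State = Dict[str, int]


def detector(predicate: Callable[[State], bool], state: State) -> bool:
    """Report whether the predicate holds; never modifies the state."""
    return predicate(state)


def corrector(predicate: Callable[[State], bool],
              repair: Callable[[State], None],
              state: State) -> bool:
    """Impose the predicate: apply the repair action if it does not hold."""
    if not predicate(state):
        repair(state)
    return predicate(state)


# Hypothetical invariant: a level reading must stay within 0..100.
def level_in_range(state: State) -> bool:
    return 0 <= state["level"] <= 100


# Hypothetical repair action: clamp the reading back into range.
def clamp_level(state: State) -> None:
    state["level"] = max(0, min(100, state["level"]))


state = {"level": 250}  # erroneous state, e.g. after a transient fault
detected = not detector(level_in_range, state)
restored = corrector(level_in_range, clamp_level, state)
```

The detector alone is enough to stop before an unsafe action (fail-safe); the corrector alone eventually restores the predicate but does not prevent the bad state from being observed (non-masking).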
Phases of Fault Tolerance |
|
|
Run-time Checks |
Replication Checks, Timing Checks, Reversal Checks, Coding Checks, Reasonableness Checks, Structural Checks, Validity Checks |
|
Exception Handlers |
Interface exception, local/internal exception, failure exception |
|
Forward error recovery |
Upon detection of an error, the program attempts to move forward into a state that is no longer erroneous, without rolling back. e.g. exception handling |
|
Backward error recovery |
Upon detection of error, program rolls back to a previously “recorded” good point from which it can restart execution. e.g. checkpoints |
|
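A minimal sketch of backward error recovery with a single in-memory checkpoint. The class `CheckpointedCounter` and its "total must not be negative" error check are hypothetical, not taken from the card set.

```python
import copy


class CheckpointedCounter:
    """Toy component that records a checkpoint and rolls back on error."""

    def __init__(self) -> None:
        self.state = {"total": 0}
        self._checkpoint = copy.deepcopy(self.state)

    def checkpoint(self) -> None:
        # Record the current state as a known-good recovery point.
        self._checkpoint = copy.deepcopy(self.state)

    def rollback(self) -> None:
        # Discard the current state and restore the recorded one.
        self.state = copy.deepcopy(self._checkpoint)

    def add(self, value: int) -> bool:
        self.state["total"] += value
        if self.state["total"] < 0:  # error detection (assumed invariant)
            self.rollback()
            return False
        return True


c = CheckpointedCounter()
c.add(5)
c.checkpoint()      # good point: total == 5
c.add(3)            # total == 8, not checkpointed
ok = c.add(-20)     # violates the invariant, so the state is rolled back
```

After the rollback the total is 5, not 8: work done since the last checkpoint is lost, which is the usual cost of this approach.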
Dependability Analysis |
Tries to identify acceptable levels of safety and reliability. |
|
Hazard |
A hazard is a state or set of conditions of a system that, together with environmental conditions, will lead to an accident. Important to know the severity and potential frequency. |
|
Hazard level |
Combination of hazard severity and likelihood of occurrence. |
|
Risk |
The hazard level combined with the likelihood of the hazard leading to accidents and the hazard exposure or duration. |
|
Safety Case |
Needed as part of certification. Should communicate a clear, comprehensive argument that a system is acceptably safe to operate in a particular context. |
|
Failure Modes and Effects Analysis |
Considers the failure of a single component and assesses its effect at system level to detect hazardous situations. + Can detect hazardous situations arising from single failures. - Does not consider multiple failures. - Expensive to perform exhaustively. |
|
Failure Modes, Effects and Criticality Analysis |
Similar to FMEA, except the importance of each failure is ranked by accounting for its consequences and likely frequency of occurrence. + FMECA permits meaningful cost-benefit analysis. |
|
Hazard and Operability Studies |
+ Effective for situations where "What if ...?" questions are constrained. - Time consuming, especially for the experts used in question generation. |
|
Event Trees |
Starting from events that can affect system, tracking forward to determine their effects. + Allows outcomes of events to be determined. - Exponential Complexity |
|
Fault Trees |
Fault trees track backwards. Commonly used for safety-critical systems. + Resultant trees are simplified, compared to event trees. - Identification of high-level hazards is challenging. |
|
Cut Set |
Combinations of elements whose simultaneous failure would lead to a system failure. Minimal cut sets provide a lower bound on reliability. |
|
Tie Set |
Combinations of elements whose simultaneous operation would lead to a correctly working system. Minimal tie sets provide an upper bound on reliability. |
|
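The cut set and tie set bounds can be checked on a small example. This sketch assumes independent component failures and uses the standard minimal-cut / minimal-path product bounds; the three-element system (A in series with B and C in parallel) and the 0.9 reliabilities are hypothetical.

```python
from itertools import product
from math import prod

# Hypothetical system: A in series with (B parallel C).
reliability = {"A": 0.9, "B": 0.9, "C": 0.9}
tie_sets = [{"A", "B"}, {"A", "C"}]   # minimal tie (path) sets
cut_sets = [{"A"}, {"B", "C"}]        # minimal cut sets


def lower_bound(cuts, r):
    # Every cut set must contain at least one working element.
    return prod(1 - prod(1 - r[e] for e in cut) for cut in cuts)


def upper_bound(ties, r):
    # At least one tie set must be fully working.
    return 1 - prod(1 - prod(r[e] for e in tie) for tie in ties)


def exact(ties, r):
    # Enumerate every up/down combination and sum the working ones.
    names = sorted(r)
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        up = {n for n, s in zip(names, states) if s}
        if any(tie <= up for tie in ties):
            total += prod(r[n] if n in up else 1 - r[n] for n in names)
    return total


lo = lower_bound(cut_sets, reliability)
ex = exact(tie_sets, reliability)
hi = upper_bound(tie_sets, reliability)
```

Here the cut sets share no elements, so the lower bound coincides with the exact value (0.891), while the tie sets share A and the upper bound is looser (0.9639).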
Recovery Line and Consistency |
A set of checkpoints across all processes to which the programs can be rolled back, in the event of failure, to ensure consistent error-free state of the system. Said to be consistent if there are no messages that originate after the line and terminate before it. |
|
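The consistency condition for a recovery line can be written as a small check. The `Message` record, the use of per-process local event indices, and the example values are assumptions made for this sketch.

```python
from typing import Dict, List, NamedTuple


class Message(NamedTuple):
    sender: str
    send_time: int   # local event index at the sender
    receiver: str
    recv_time: int   # local event index at the receiver


def is_consistent(line: Dict[str, int], messages: List[Message]) -> bool:
    """line maps each process to the event index of its checkpoint."""
    for m in messages:
        sent_after = m.send_time > line[m.sender]
        received_before = m.recv_time <= line[m.receiver]
        if sent_after and received_before:
            # The receiver remembers a message the sender never sent.
            return False
    return True


messages = [Message("P1", 4, "P2", 5)]
bad_line = {"P1": 3, "P2": 5}    # send is after P1's checkpoint
good_line = {"P1": 4, "P2": 5}   # send is included in P1's checkpoint

bad = is_consistent(bad_line, messages)
good = is_consistent(good_line, messages)
```

With `bad_line`, rolling back would leave P2 having received a message that P1 has not yet sent, so the line is rejected.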
Recovery Blocks |
One hardware channel and N software components. Secondary components perform a similar function but in a different way. An acceptance test decides whether the results are acceptable. If the primary component fails the acceptance test, we reload a checkpoint and execute a secondary from a “clean” state. Represents redundancy in space and time. |
|
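A minimal sketch of the recovery block scheme, assuming in-memory state and a deliberately faulty primary. The function names (`recovery_block`, `faulty_primary`, `secondary`, `is_sorted`) are hypothetical.

```python
import copy


def recovery_block(state, alternates, acceptance_test):
    """Try each alternate from a clean copy of the checkpointed state."""
    checkpoint = copy.deepcopy(state)
    for alternate in alternates:
        candidate = copy.deepcopy(checkpoint)  # reload the checkpoint
        try:
            result = alternate(candidate)
        except Exception:
            continue  # a crash counts as a failed alternate
        if acceptance_test(result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def faulty_primary(data):
    data.sort(reverse=True)  # bug: sorts in descending order
    return data


def secondary(data):
    return sorted(data)      # different implementation of the same function


def is_sorted(result):
    return all(a <= b for a, b in zip(result, result[1:]))


result = recovery_block([3, 1, 2], [faulty_primary, secondary], is_sorted)
```

The primary mutates only its own copy, so the secondary still starts from the checkpointed input `[3, 1, 2]`.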
N-Version Programming |
N hardware channels and N software channels. |
|
NVP Axiom |
NVP is the most efficient when version failures occur independently. |
|
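N-version programming is usually paired with a voter over the version outputs. The sketch below assumes a simple majority voter and runs the versions sequentially in one process rather than on N hardware channels; the three versions are hypothetical.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of versions."""
    value, count = Counter(outputs).most_common(1)[0]
    if count > len(outputs) // 2:
        return value
    raise RuntimeError("no majority among version outputs")


def run_versions(versions, x):
    return majority_vote([version(x) for version in versions])


# Three hypothetical versions of "square"; the third has a fault.
versions = [
    lambda x: x * x,
    lambda x: x ** 2,
    lambda x: x * x + 1,
]

voted = run_versions(versions, 4)
```

The voter masks the single faulty version. If two versions failed in the same way, the wrong value would win the vote, which is why independent version failures matter.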
Fault Injection Analysis |
Fault injection analysis involves subjecting a system to abnormal conditions, i.e., introducing faults/errors, so that its behaviour can be assessed. |
|
Coverage |
The coverage of a dependability mechanism is the ratio of the number of test cases passed successfully to the number of test cases run. |
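
Coverage can be estimated with a small fault injection campaign. This sketch assumes the mechanism under test is a single even-parity bit over an 8-bit word and that a test case "passes" when the injected fault is detected; the word value and fault sets are hypothetical.

```python
from itertools import combinations

WIDTH = 9  # 8 data bits plus 1 even-parity bit


def encode(data: int) -> int:
    """Append an even-parity bit as the least significant bit."""
    parity = bin(data).count("1") % 2
    return (data << 1) | parity


def parity_ok(word: int) -> bool:
    return bin(word).count("1") % 2 == 0


def inject(word: int, positions) -> int:
    """Flip the bits at the given positions (the injected fault)."""
    for p in positions:
        word ^= 1 << p
    return word


def coverage(data: int, fault_sets) -> float:
    word = encode(data)
    detected = sum(1 for f in fault_sets if not parity_ok(inject(word, f)))
    return detected / len(fault_sets)


single = [(p,) for p in range(WIDTH)]          # 9 single-bit flips
double = list(combinations(range(WIDTH), 2))   # 36 double-bit flips

single_cov = coverage(0b10110010, single)
mixed_cov = coverage(0b10110010, single + double)
```

Parity detects every single-bit flip (coverage 1.0) but no double-bit flip, so the measured coverage drops to 9/45 = 0.2 once both fault classes are injected. The figure depends heavily on which faults the campaign includes.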