What Is Fault Tolerant Computing?

I. INTRODUCTION
The performance of present computing systems has increased at the cost of considerably higher power consumption. The increased power consumption either reduces the operating time of battery-powered systems, such as hand-held mobile systems, or generates an extreme amount of heat and requires expensive, sophisticated packaging and cooling technologies, especially for complex systems that consist of several processing units. The generated heat, if not removed efficiently, can also reduce system reliability, since the hardware failure rate increases with temperature [1][2]. In multiprocessor systems, such as space-based control systems or life maintenance systems, where a failure may cause catastrophic results, reliability …
During the execution of an application, a fault may occur for various reasons, such as hardware failures, software errors, and electromagnetic effects. Fault tolerance is therefore an inherent requirement of systems that must produce accurate results even when faults occur. In the fault-tolerance area, redundancy is employed to mask or otherwise work around these faults, thereby preserving a desired level of functionality. Generally, redundancy is defined as the deployment of spare resources (spatial redundancy) or spare time (temporal redundancy) for the application. Permanent faults are generally tolerated by hardware redundancy, also known as modular redundancy (MR), in which cloned tasks run concurrently on multiple processing units. Broadly, three techniques are used to implement temporal-redundancy-based fault tolerance in task scheduling: checkpointing, recovery block, and recovery through re-execution.
1) In checkpointing, the state of the system is checked and correct states are saved to stable storage. When a fault is detected, execution is rolled back to the most recent correct checkpoint and the faulty section is recomputed by exploiting temporal redundancy. With a very large number of checkpoints, the time overhead of this method may become unaffordable.
2) The recovery block approach provides a task with one or more backups. Once the original copy of the program fails, the system switches to the execution of a backup [15][12][13]. The execution times of the original task and its backups may differ.
3) Recovery through re-execution tolerates transient faults by re-executing the original task when a fault occurs. As soon as a fault is detected, the system restores its state to a previous safe state and the recovery task is sent out in the form of re-execution.
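
The three techniques above can be illustrated with a minimal Python sketch. It is not taken from the essay or from any particular scheduler. The names `TransientFault`, `make_flaky`, `run_with_checkpoints`, `recovery_block` and `re_execute` are hypothetical, an in-memory deep copy stands in for stable storage, and the acceptance test used to decide whether a variant "failed" is an assumption borrowed from the classic recovery-block scheme.

```python
import copy


class TransientFault(Exception):
    """Hypothetical exception standing in for a detected transient fault."""


def make_flaky(fail_times, fn):
    """Demo helper (assumption): wrap fn so its first fail_times calls raise a fault."""
    remaining = [fail_times]

    def step(state):
        if remaining[0] > 0:
            remaining[0] -= 1
            raise TransientFault()
        return fn(state)

    return step


def run_with_checkpoints(steps, state, max_retries=3):
    """Checkpointing: save a correct state after each step and, on a fault,
    roll back to the latest checkpoint and recompute only the faulty step."""
    checkpoint = copy.deepcopy(state)  # in-memory stand-in for stable storage
    for step in steps:
        for _ in range(max_retries + 1):
            try:
                state = step(copy.deepcopy(checkpoint))
                break
            except TransientFault:
                continue  # roll back: drop partial work, retry from checkpoint
        else:
            raise RuntimeError("fault persisted after all retries")
        checkpoint = copy.deepcopy(state)  # the new state becomes the checkpoint
    return state


def recovery_block(primary, backups, acceptance_test, inputs):
    """Recovery block: run the primary, then switch to each backup in turn
    until one produces a result that passes the acceptance test."""
    for variant in [primary, *backups]:
        try:
            result = variant(inputs)
        except Exception:
            continue  # this variant failed outright; try the next one
        if acceptance_test(result):
            return result
    raise RuntimeError("primary and all backups failed")


def re_execute(task, safe_state, max_retries=3):
    """Re-execution: restore the previous safe state and rerun the same task."""
    for _ in range(max_retries + 1):
        try:
            return task(copy.deepcopy(safe_state))
        except TransientFault:
            pass  # transient fault: restore the safe state and try again
    raise RuntimeError("fault persisted after all retries")
```

For example, `run_with_checkpoints([make_flaky(1, f), g], state)` reruns only `f` after its single injected fault and then continues with `g`. The trade-off noted in item 1 shows up in the sketch as well: every additional step adds one more `copy.deepcopy` of the state, so frequent checkpoints increase the time overhead.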
