Implementation of a Fault Tolerant Computing Testbed PDF Download

Are you looking for read ebook online? Search for your book and save it on your Kindle device, PC, phones or tablets. Download Implementation of a Fault Tolerant Computing Testbed PDF full book. Access full book title Implementation of a Fault Tolerant Computing Testbed by David C. Summers. Download full books in PDF and EPUB format.

Implementation of a Fault Tolerant Computing Testbed

Implementation of a Fault Tolerant Computing Testbed PDF Author: David C. Summers
Publisher:
ISBN: 9781423536611
Category :
Languages : en
Pages : 185

Book Description
With spacecraft designs placing more emphasis on reduced cost, faster design time, and higher performance, it is easy to understand why more commercial-off-the-shelf (COTS) devices are being used in space based applications. The COTS devices offer spacecraft designers shorter design-to- orbit times, lower system costs, orders of magnitude better performance, and a much better software availability than their radiation hardened (radhard) counterparts. The major drawback to using COTS devices in space is their increased susceptibility to the effects of radiation, single event upsets (SEUs) in particular. This thesis will focus on the implementation of a fault tolerant computer system. The hardware design presented here has two different benefits. First, the system can act as a software testbed, which allows testing of software fault tolerant techniques in the presence of radiation induced SEUs. This allows the testing of the software algorithms in the environment they were designed to operate in without the expense of being placed in orbit. Additionally, the design can be used as a hybrid fault tolerant computer system. By combining the masking ability of the hardware with supporting software, the system can mask out and reset processor errors in real time. The design layout will be presented using OrCAD schematics.

Implementation of a Fault Tolerant Computing Testbed

Implementation of a Fault Tolerant Computing Testbed PDF Author: David C. Summers
Publisher:
ISBN: 9781423536611
Category :
Languages : en
Pages : 185

Book Description
With spacecraft designs placing more emphasis on reduced cost, faster design time, and higher performance, it is easy to understand why more commercial-off-the-shelf (COTS) devices are being used in space based applications. The COTS devices offer spacecraft designers shorter design-to- orbit times, lower system costs, orders of magnitude better performance, and a much better software availability than their radiation hardened (radhard) counterparts. The major drawback to using COTS devices in space is their increased susceptibility to the effects of radiation, single event upsets (SEUs) in particular. This thesis will focus on the implementation of a fault tolerant computer system. The hardware design presented here has two different benefits. First, the system can act as a software testbed, which allows testing of software fault tolerant techniques in the presence of radiation induced SEUs. This allows the testing of the software algorithms in the environment they were designed to operate in without the expense of being placed in orbit. Additionally, the design can be used as a hybrid fault tolerant computer system. By combining the masking ability of the hardware with supporting software, the system can mask out and reset processor errors in real time. The design layout will be presented using OrCAD schematics.

Fault Tolerant Computing Testbed

Fault Tolerant Computing Testbed PDF Author: John C. Payne, Jr.
Publisher:
ISBN: 9781423554882
Category :
Languages : en
Pages : 184

Book Description
Operating computers in space requires the use of very expensive radiation hardened microelectronics devices. Unfortunately, the United States radiation hardened market is rapidly shrinking and makes up a very small percentage of the commercial market. For these reasons, and the fact that commercial-off-the-shelf (COTS) devices are cheaper, more capable, readily available, and software availability is much greater, the use of COTS devices in future space systems is fast becoming a reality. A significant disadvantage of COTS devices is their susceptibility to radiation induced single event upsets (SEUs), among other radiation effects which are detrimental to electronic systems. This thesis focuses on the board level design of a tool which enables the analysis of fault tolerant computing techniques in a laboratory environment in the presence of radiation induced SEUs. When implemented, this tool will be beneficial to the study of using COTS devices in space. The tool will provide the capability to analyze the performance of hardware redundancy techniques and software algorithms intended to improve the performance of COTS microprocessors in this environment prior to their use in designs intended for actual space applications. Cadence Concept(TM) design schematics, associated Verilog(registered) code and simulation results are presented to develop this concept.

Fault Tolerant Computing Testbed

Fault Tolerant Computing Testbed PDF Author: John C. Payne
Publisher:
ISBN:
Category :
Languages : en
Pages : 184

Book Description
Operating computers in space requires the use of very expensive radiation hardened microelectronics devices. Unfortunately, the United States radiation hardened market is rapidly shrinking and makes up a very small percentage of the commercial market. For these reasons, and the fact that commercial-off-the-shelf (COTS) devices are cheaper, more capable, readily available, and software availability is much greater, the use of COTS devices in future space systems is fast becoming a reality. A significant disadvantage of COTS devices is their susceptibility to radiation induced single event upsets (SEUs), among other radiation effects which are detrimental to electronic systems. This thesis focuses on the board level design of a tool which enables the analysis of fault tolerant computing techniques in a laboratory environment in the presence of radiation induced SEUs. When implemented, this tool will be beneficial to the study of using COTS devices in space. The tool will provide the capability to analyze the performance of hardware redundancy techniques and software algorithms intended to improve the performance of COTS microprocessors in this environment prior to their use in designs intended for actual space applications. Cadence Concept(TM) design schematics, associated Verilog(registered) code and simulation results are presented to develop this concept.

Software Fault Tolerance Techniques and Implementation

Software Fault Tolerance Techniques and Implementation PDF Author: Laura L. Pullum
Publisher: Artech House
ISBN: 1580531377
Category : Computers
Languages : en
Pages : 358

Book Description
Look to this innovative resource for the most-comprehensive coverage of software fault tolerance techniques available in a single volume. It offers you a thorough understanding of the operation of critical software fault tolerance techniques and guides you through their design, operation and performance. You get an in-depth discussion on the advantages and disadvantages of specific techniques, so you can decide which ones are best suited for your work.

Fault-Tolerant Computing Systems

Fault-Tolerant Computing Systems PDF Author: Mario Dal Cin
Publisher: Springer
ISBN:
Category : Computers
Languages : en
Pages : 446

Book Description
A survey of the state of research, development and applications of all aspects of fault tolerance and dependability in computing and automation systems (hardware and software) with the topics: reconfiguration and recovery, system level diagnosis, voting and agreement, testing, fault-tolerant circuits, array testing, modelling, applied fault tolerance, fault-tolerant arrays and systems, interconnection networks, fault-tolerant software.

A Testbed for Evaluation of Fault-tolerant Routing in Multiprocessor Interconnection Networks

A Testbed for Evaluation of Fault-tolerant Routing in Multiprocessor Interconnection Networks PDF Author: Aniruddha S. Vaidya
Publisher:
ISBN:
Category : Computer networks
Languages : en
Pages : 20

Book Description
Abstract: "With parallel machines increasingly taking on critical and complex applications, it is important to make them dependable to ensure their commercial success. Fault-tolerance in the network to accommodate link and node failures is an important step towards this goal. This can be achieved by employing cost-effective fault-tolerant algorithms. However, despite substantial efforts on the theoretical front in developing fault-tolerant routing techniques and architectures, these ideas have not manifested themselves in many commercial platforms. The ramifications of providing fault-tolerant routing in terms of cost and performance is still not clear to the computer architect. Such an insight can only be gained through detailed analysis of a design with realistic workloads. Since no current evaluation platform supports this, previous research on fault-tolerant routing has used synthetic workloads for analyzing performance. This paper presents a comprehensive evaluation testbed for interconnection networks and routing algorithms using real applications. The testbed is flexible enough to implement any network topology and fault-tolerant routing algorithm, and allows the system architect to study the cost versus performance tradeoffs for a range of network parameters. We illustrate its use with one fault-tolerant algorithm and analyze the performance of four shared memory applications with different fault conditions. We also show how the testbed can be used to drive future research in fault-tolerant routing algorithms and architectures, by proposing and evaluating novel architectural enhancements to the network router, called path selection heuristics (PSH). We propose three such schemes and the Least Recently Used (LRU) PSH is shown to give the best performance in the presence of faults."

Completion and Testing of a TMR Computing Testbed and Recommendations for a Flight-Ready Follow-on Design

Completion and Testing of a TMR Computing Testbed and Recommendations for a Flight-Ready Follow-on Design PDF Author: Damen O. Hofheinz
Publisher:
ISBN: 9781423532552
Category :
Languages : en
Pages : 172

Book Description
This thesis focuses on the completion and hardware testing of a fault tolerant computer system utilizing Triple Modular Redundancy (TMR). Due to the radiation environment in space, electronics in space applications must be designed to accommodate single event phenomena. While radiation hardened processors are available, they offer lower performance and higher cost than commercial off the shelf processors. In order to utilize non-hardened devices, a fault tolerance scheme such as TMR may be implemented to increase reliability in a radiation environment. The design that was completed in this effort is one such implementation. The completion of the hardware design consisted of programming logic devices, implementing hardware design corrections, and the design of an overall system controller. The testing effort included basic power and ground verification checks to programming, executing, and evaluating programs in read only memory. During this phase, additional design changes were implemented to correct design flaws. This thesis also evaluated the preliminary design changes required for a space implementation of this TMR design. This included design changes due to size, power, and weight restrictions. Additionally, a detailed analysis of component survivability was performed based on past radiation testing.

Fault-Tolerance Techniques for High-Performance Computing

Fault-Tolerance Techniques for High-Performance Computing PDF Author: Thomas Herault
Publisher: Springer
ISBN: 3319209434
Category : Computers
Languages : en
Pages : 325

Book Description
This timely text presents a comprehensive overview of fault tolerance techniques for high-performance computing (HPC). The text opens with a detailed introduction to the concepts of checkpoint protocols and scheduling algorithms, prediction, replication, silent error detection and correction, together with some application-specific techniques such as ABFT. Emphasis is placed on analytical performance models. This is then followed by a review of general-purpose techniques, including several checkpoint and rollback recovery protocols. Relevant execution scenarios are also evaluated and compared through quantitative models. Features: provides a survey of resilience methods and performance models; examines the various sources for errors and faults in large-scale systems; reviews the spectrum of techniques that can be applied to design a fault-tolerant MPI; investigates different approaches to replication; discusses the challenge of energy consumption of fault-tolerance methods in extreme-scale systems.

Methods, Models and Tools for Fault Tolerance

Methods, Models and Tools for Fault Tolerance PDF Author: Michael Butler
Publisher: Springer
ISBN: 3642008674
Category : Computers
Languages : en
Pages : 350

Book Description
The growing complexity of modern software systems increases the di?culty of ensuring the overall dependability of software-intensive systems. Complexity of environments, in which systems operate, high dependability requirements that systems have to meet, as well as the complexity of infrastructures on which they rely make system design a true engineering challenge. Mastering system complexity requires design techniques that support clear thinking and rigorous validation and veri?cation. Formal design methods help to achieve this. Coping with complexity also requires architectures that are t- erant of faults and of unpredictable changes in environment. This issue can be addressed by fault-tolerant design techniques. Therefore, there is a clear need of methods enabling rigorous modelling and development of complex fault-tolerant systems. This bookaddressessuchacuteissues indevelopingfault-tolerantsystemsas: – Veri?cation and re?nement of fault-tolerant systems – Integrated approaches to developing fault-tolerant systems – Formal foundations for error detection, error recovery, exception and fault handling – Abstractions, styles and patterns for rigorousdevelopment of fault tolerance – Fault-tolerant software architectures – Development and application of tools supporting rigorous design of depe- able systems – Integrated platforms for developing dependable systems – Rigorous approaches to speci?cation and design of fault tolerance in novel computing systems TheeditorsofthisbookwereinvolvedintheEU(FP-6)projectRODIN(R- orous Open Development Environment for Complex Systems), which brought together researchers from the fault tolerance and formal methods communi- 1 ties. In 2007 RODIN organized the MeMoT workshop held in conjunction with the Integrated Formal Methods 2007 Conference at Oxford University.

Fault Tolerance

Fault Tolerance PDF Author: Peter A. Lee
Publisher: Springer Science & Business Media
ISBN: 370918990X
Category : Computers
Languages : en
Pages : 326

Book Description
The production of a new version of any book is a daunting task, as many authors will recognise. In the field of computer science, the task is made even more daunting by the speed with which the subject and its supporting technology move forward. Since the publication of the first edition of this book in 1981 much research has been conducted, and many papers have been written, on the subject of fault tolerance. Our aim then was to present for the first time the principles of fault tolerance together with current practice to illustrate those principles. We believe that the principles have (so far) stood the test of time and are as appropriate today as they were in 1981. Much work on the practical applications of fault tolerance has been undertaken, and techniques have been developed for ever more complex situations, such as those required for distributed systems. Nevertheless, the basic principles remain the same.