A Survey on Fault Detection, Isolation and
Recovery (FDIR) Module in Satellite Onboard
Software
Fatemeh SalarKaleji
Satellite Research Institute
Tehran, Iran
f.salar@sri.ac.ir
fatemeh.salaail.com
Abstract- The complexity of the avionic systems in satellites is
rising as space missions become increasingly sophisticated.
This complexity emphasizes the need for more dependable
systems with minimal anomalies. As satellite manufacturers seek
to convert many hardware-implemented functionalities into
software, the On-Board Software (OBSW) is becoming a major
component in every satellite. Notably, more tasks for Fault
Detection, Isolation and Recovery (FDIR) are being implemented
in software, which creates the need for a well-defined software
architecture that supports a cost-effective implementation of the
FDIR functions. FDIR is a key functionality of the OBSW.
Not all failures are subject to onboard identification, and not
all failures are subject to onboard recovery. The FDIR concept
to be worked out for the spacecraft during the engineering phase
follows some basic requirements and principles, implements a
certain failure hierarchy (specifying on which level each failure
is to be fixed), and finally implements a consistent approach
for transferring the spacecraft to Safe Mode and for recovering
from there.
Since an FDIR concept usually follows a hierarchical approach,
this paper presents an example of an FDIR and safeguarding
hierarchy. Within this structure we indicate the levels of
failures that are handled by the unit internally, the subsystem
software, the satellite system software, the onboard computer
hardware reconfiguration unit, and ground. We also explain the
FDIR hierarchy in the Safe Mode implementation in more detail.
In this paper we consider FDIR technologies in the on-board
software of a satellite. Several methodologies and frameworks
have been proposed to address software-based FDIR.
We analyze the functionalities of an FDIR Module implemented
in an OBSW framework. We also survey the FDIR
hierarchies and their relationship to the Packet Utilization
Standard (PUS) services.
Keywords- Fault Detection, Isolation and Recovery (FDIR),
Software FDIR, On-Board Software (OBSW), Satellites,
Frameworks, Packet Utilization Standard (PUS), On-Board
architectures.
I. INTRODUCTION
"Failure Detection, Isolation and Recovery" (FDIR) is a
key functionality of the On-Board
978-1-4673-6396-9/13/$31.00 ©2013 IEEE 545
Aboulfazl Dayyani
Satellite Research Institute
Tehran, Iran
Day aniail.com
Software (OBSW). Not all failures are subject
to onboard identification, and not all failures are subject to
onboard recovery. The FDIR concept to be worked out for the
spacecraft during the engineering phase follows some basic
requirements and principles, implements a certain failure
hierarchy (specifying on which level each failure
is to be fixed), and finally implements a consistent approach
for transferring the spacecraft to Safe Mode
and for recovering from there. A properly defined Safe Mode
with full satellite observability is essential for FDIR operations.
The Safe Mode must also assure a proper balance between the
resources the satellite produces and consumes (mainly power),
since the diagnosis of failures plus recovery in most cases will
not be possible within one ground contact (in particular not for
polar orbiting Earth observation satellites). The following
sections survey FDIR in more detail.
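The failure hierarchy mentioned above, in which each failure is assigned a level on which it is to be fixed, can be pictured with a minimal Python sketch. The five level names follow the hierarchy listed in the abstract. The failure names, the HANDLED_AT table and the escalation rule (pass a failure upward until some level can fix it, with ground as the last resort) are illustrative assumptions and are not taken from any specific OBSW.

```python
from enum import IntEnum


class FdirLevel(IntEnum):
    """FDIR levels, ordered from most local to most global."""
    UNIT_INTERNAL = 0
    SUBSYSTEM_SOFTWARE = 1
    SYSTEM_SOFTWARE = 2
    OBC_RECONFIGURATION_UNIT = 3
    GROUND = 4


# Hypothetical assignment of failure types to the level that can fix them.
HANDLED_AT = {
    FdirLevel.UNIT_INTERNAL: {"memory_bit_flip"},
    FdirLevel.SUBSYSTEM_SOFTWARE: {"sensor_out_of_limits"},
    FdirLevel.SYSTEM_SOFTWARE: {"attitude_lost"},
    FdirLevel.OBC_RECONFIGURATION_UNIT: {"obc_watchdog_timeout"},
}


def resolve_level(failure: str, detected_at: FdirLevel) -> FdirLevel:
    """Escalate from the detecting level upward until a level can fix it."""
    for level in FdirLevel:
        if level < detected_at:
            continue  # lower levels have already been passed
        if failure in HANDLED_AT.get(level, set()):
            return level
    # Assumption: anything no on-board level handles is left to ground.
    return FdirLevel.GROUND


print(resolve_level("attitude_lost", FdirLevel.SUBSYSTEM_SOFTWARE).name)
```

In this sketch a failure that is detected on a level above the one assigned to it, or that is unknown on board, ends up with ground, which mirrors the point that not all failures are subject to onboard recovery.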
II. FDIR REQUIREMENTS
Typical requirements for FDIR design at the beginning of the
satellite system engineering phase request that:
• A clear hierarchy is to be defined, stating which type of
failure is to be identified and managed on which
FDIR level.
• The satellite must be able to reach its Safe Mode
autonomously.
• The Safe Mode, if triggered, shall not limit ground in
any way with regard to spacecraft observability and
commandability.
• Ground may also be allowed to submit commands
which are blocked for the OBSW or are not allowed
in that sequence for the OBSW.
• Ground must be able to perform a detailed status
analysis and failure event history analysis for unique
failure identification.
• Ground may alter operational limits to avoid future
Safe Mode transitions, e.g. in cases of failures triggered by
equipment degradation.
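Several of the requirements above (autonomous transition to Safe Mode, a failure event history for ground analysis, and operational limits that ground may alter) can be illustrated with a minimal Python sketch. The class FdirMonitor, its methods, the parameter name battery_voltage and all numeric limits are hypothetical and chosen only for illustration; a real OBSW would typically expose such functions through telecommands and telemetry rather than direct method calls.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    NOMINAL = "nominal"
    SAFE = "safe"


@dataclass
class Limit:
    low: float
    high: float


@dataclass
class FdirMonitor:
    """Hypothetical on-board limit monitor, not taken from a real OBSW."""
    limits: dict[str, Limit]
    mode: Mode = Mode.NOMINAL
    history: list[tuple[int, str, float]] = field(default_factory=list)

    def check(self, time_s: int, parameter: str, value: float) -> Mode:
        """Compare one sample to its limit; enter Safe Mode on violation."""
        limit = self.limits[parameter]
        if not (limit.low <= value <= limit.high):
            # Record the event so ground can analyse the failure history.
            self.history.append((time_s, parameter, value))
            # Autonomous transition: no ground command is required.
            self.mode = Mode.SAFE
        return self.mode

    def set_limit(self, parameter: str, low: float, high: float) -> None:
        """Ground command: alter an operational limit (accepted in any mode)."""
        self.limits[parameter] = Limit(low, high)

    def recover(self) -> None:
        """Ground command: leave Safe Mode after the failure analysis."""
        self.mode = Mode.NOMINAL


monitor = FdirMonitor(limits={"battery_voltage": Limit(24.0, 33.0)})
monitor.check(100, "battery_voltage", 28.1)       # within limits
monitor.check(160, "battery_voltage", 23.5)       # violation, Safe Mode
monitor.set_limit("battery_voltage", 23.0, 33.0)  # ground widens the limit
monitor.recover()
monitor.check(220, "battery_voltage", 23.5)       # tolerated after the update
```

The usage lines at the end follow the degradation case from the last requirement: a battery voltage that once triggered Safe Mode is tolerated after ground has widened the limit, while the earlier event stays in the history for later analysis.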