The development of the prototype of distributed onboard computing system
The development of the prototype of distributed fault tolerant onboard computing system for satellite control system and the complex of scientific equipment
Tech Area / Field
- INF-COM/High Performance Computing and Networking/Information and Communications
8 Project completed
Senior Project Manager
Rybakova T A
Keldysh Institute of Applied Mathematics, Russia, Moscow
- Russian Academy of Sciences / Space Research Institute, Russia, Moscow
- Deutsches Zentrum für Luft- und Raumfahrt e.V. / Institute of Space Sensor Technology and Planetary Exploration, Germany, Berlin
- Fraunhofer Institut Rechnerarchitektur und Softwaretechnik, Germany, Berlin
Project summary
This project proposes the development of a prototype distributed fault-tolerant on-board computing system for controlling a satellite and its complex of scientific equipment. The system is intended for a satellite carrying a complex for hyperspectral atmospheric monitoring.
The on-board control complex and the scientific equipment control system can meet the requirements of the satellite's scientific tasks only if a powerful and flexible communication and computing infrastructure is available. To support a wide range of operations, the computing functions and computing power must adapt to the dynamically changing requirements of the project.
The payload management system must be able to interact closely with the on-board control complex, as required by each specific operation. The on-board satellite control complex must provide a high degree of autonomous operation and, at the same time, treat the survival of the satellite as its highest priority.
The on-board computing system (OBCS) should be realized as a distributed fault-tolerant multicomputer system that executes all control, telemetry, and monitoring tasks as well as the application-dependent tasks of the traditional payload computer. Unifying the different on-board computing functions in a single highly redundant system allows close cooperation between the tasks and flexible use of the redundant computing resources, so that both the varying, service-dependent performance requirements and the required level of fault tolerance can be met.
Technologically, the on-board computing system should benefit from the newest VLSI technology. The high integration density of modern microprocessors and memory devices makes it possible to build powerful, highly functional computing systems from only a small number of components. The extremely low power dissipation of low-voltage components, which are also used in battery-powered devices (notebooks, cameras, mobile phones), results in minimal requirements for space, weight, cooling, and power supply, all of which are limited resources on small satellites.
However, to take advantage of these benefits we have to ensure that the use of the newest VLSI technology does not diminish the reliability and availability of the OBCS. In general, thanks to manufacturing processes optimized for high-volume production, the reliability of modern VLSI components has already reached a very high standard, provided they are operated within the expected environmental conditions. Except for radiation, the environmental conditions on board a satellite in low Earth orbit (LEO) are comparable to the operating conditions assumed for industrial versions of VLSI components. Therefore we will implement special hardware and software measures in each node of the system to handle the mission-critical radiation problems.
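The summary does not say which radiation countermeasures are planned. Purely as an illustration, one widely used software technique against radiation-induced bit flips is to keep three copies of critical data and take a bitwise majority vote on every read. The sketch below assumes that approach; the names `majority_vote` and `TripleStoredWord` are hypothetical and do not come from the project.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority over three copies of the same word."""
    return (a & b) | (a & c) | (b & c)


class TripleStoredWord:
    """Hypothetical container that keeps three copies of one integer value."""

    def __init__(self, value: int):
        self.copies = [value, value, value]

    def read(self) -> int:
        # Vote, then rewrite all copies so that single upsets do not accumulate.
        voted = majority_vote(*self.copies)
        self.copies = [voted, voted, voted]
        return voted


word = TripleStoredWord(0b10110010)
word.copies[1] ^= 0b00001000   # simulate a single bit flip in one copy
restored = word.read()         # the vote masks the flipped bit
```

A vote of this kind masks an upset in any one copy; it cannot recover the value if the same bit is corrupted in two copies before the next read.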
The on-board computer is a homogeneous, symmetric multicomputer system: it consists of 3 to 16 identical node computers connected by a redundant bus system. In principle, each node can execute all tasks and can access all I/O buses. The overall hardware structure ensures that the system has no single point of failure. The nodes are therefore completely separate units, i.e. they share no components except the fault-tolerant communication system and the I/O buses to the devices of the satellite. The physical realization of the communication and I/O buses will ensure that a faulty unit cannot prevent the transfer of information among the other units. A special design of the interface logic guarantees that an inactive or faulty component is completely isolated from the buses, even in the case of a stuck-at error.
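The property that every node can execute every task can be illustrated with a small sketch of task reassignment after a node failure. The least-loaded placement policy and the names `Node`, `Cluster`, `assign`, and `fail` are assumptions made for this example; the summary does not describe the actual reconfiguration scheme.

```python
from dataclasses import dataclass, field

MIN_NODES, MAX_NODES = 3, 16  # node count range stated in the summary


@dataclass
class Node:
    node_id: int
    healthy: bool = True
    tasks: list = field(default_factory=list)


class Cluster:
    def __init__(self, node_count: int):
        if not MIN_NODES <= node_count <= MAX_NODES:
            raise ValueError("node count must be between 3 and 16")
        self.nodes = [Node(i) for i in range(node_count)]

    def healthy_nodes(self):
        return [n for n in self.nodes if n.healthy]

    def assign(self, task: str) -> int:
        # Assumed policy: place the task on the least-loaded healthy node,
        # breaking ties by the lowest node id.
        candidates = self.healthy_nodes()
        if not candidates:
            raise RuntimeError("no healthy node available")
        target = min(candidates, key=lambda n: (len(n.tasks), n.node_id))
        target.tasks.append(task)
        return target.node_id

    def fail(self, node_id: int) -> None:
        # Mark the node as faulty and move its tasks to the remaining nodes.
        node = self.nodes[node_id]
        node.healthy = False
        orphaned, node.tasks = node.tasks, []
        for task in orphaned:
            self.assign(task)


cluster = Cluster(4)
for name in ["attitude_control", "telemetry", "payload", "monitor"]:
    cluster.assign(name)
cluster.fail(0)  # "attitude_control" moves to another healthy node
```

Because the nodes are identical, the sketch needs no per-node capability table: any healthy node is a valid target for any task.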
The software architecture follows the same approach as the hardware architecture: it must be modular, distributed, and highly redundant. Within each node, a small operating system kernel (presumably based on the Linux operating system) must provide the basic functionality for preemptive multitasking, priority-based and real-time scheduling, memory management, and communication.
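As a rough illustration of preemptive, priority-based scheduling, the following sketch simulates a single node in discrete ticks. It is not the kernel's scheduler: the tick model, the tuple layout, and the function name `simulate` are assumptions for this example, and a lower priority number is taken to mean a more urgent task.

```python
import heapq


def simulate(tasks):
    """Simulate preemptive priority scheduling on one node.

    tasks: list of (name, arrival_tick, priority, duration), duration >= 1.
    Returns the name of the task run at each tick (None when idle).
    """
    pending = sorted(tasks, key=lambda t: t[1])  # order by arrival tick
    ready = []       # heap of (priority, arrival, name, remaining)
    timeline = []
    tick = 0
    i = 0
    while i < len(pending) or ready:
        # Admit every task that has arrived by the current tick.
        while i < len(pending) and pending[i][1] <= tick:
            name, arrival, priority, duration = pending[i]
            heapq.heappush(ready, (priority, arrival, name, duration))
            i += 1
        if ready:
            # Run the most urgent ready task for one tick; a newly arrived
            # task with a smaller priority number preempts at the next tick.
            priority, arrival, name, remaining = heapq.heappop(ready)
            timeline.append(name)
            if remaining > 1:
                heapq.heappush(ready, (priority, arrival, name, remaining - 1))
        else:
            timeline.append(None)
        tick += 1
    return timeline


timeline = simulate([
    ("telemetry", 0, 2, 3),         # less urgent, arrives first
    ("attitude_control", 1, 0, 2),  # more urgent, arrives one tick later
])
```

In this run the telemetry task starts first, is preempted when the more urgent attitude control task arrives, and resumes once that task has finished.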