Tiger teams have long been used for problem-solving or responding to opportunities (Laakso et al., 1999). A tiger team is not just another team with a set of resources: it is formed differently, used for episodic action, and then released when the task is complete. In most cases, tiger teams are used to solve difficult problems in a timely way. For this reason, these teams are formed with a purpose and bring specific skills (Pavlak, 2004). They are also created to be self-sufficient and complete in terms of skills and capacity.
As a manager researching flight management system (FMS) issues, I would bring together a group of resources with the required skill sets and, more importantly, experience. The skills needed would include hardware, software programming, testing, and automation. I would also include subject matter experts in flight control, navigation, avionics, navigation data, and computation. The team would also comprise resources from external partners for the hardware, the operating system, and any other sub-components that were externally sourced. I would further ensure that escalation paths are clearly identified within each of the external organizations, in case such escalations were required to ensure priority and timeliness in those organizations.
Once the team has been formed, the interaction cadence is important: when, where, and how often the team meets, and who leads it. Establishing a team leader helps with this process. Setting up a ‘war room’ will be essential to collate the necessary design artifacts and incident reports that will help troubleshoot the issue. A flight management system simulator will also need to be set up; the simulator will help in reproducing the navigation guidance issue(s).
Robust system design “ensures that future systems continue to meet user expectations despite rising levels of underlying disturbances” (Mitra, 2010). The intent of robust design is to allow a system to function reliably, perhaps with reduced capability, despite errors in input or computation. Systems are designed to accommodate a vast range of operating conditions and inputs; despite that, every system has limits, and outside of that operating envelope the system does not have the intelligence to handle the situation (Atkins et al., 2006). In the given FMS situation, without more detail, it is hard to conclude whether the system is robust or not. Regardless of how holistic and intelligent a system is, once outside its programmed boundary the system would not know how to handle a specific problem. In 2008, Qantas flight 72 suffered an uncommanded loss of altitude now associated with a bit being flipped in the flight management system by ionizing radiation (see Baraniuk, 2022, for more on computer bit flips caused by solar radiation associated with solar flares). This is an example of the ‘yet to be known’ factors that can undermine the robustness of a system.
Typically, redundancy is used as a mitigation in safety-critical systems; having backup systems is an effective strategy for dealing with insufficient robustness (Mitra, 2010). Multiple input sensors allow comparison models to detect discrepancies and, when supplemented with a tertiary sensor, allow for triangulation and therefore detection of an impending problem (Bijjahalli et al., 2020). In software systems, where complex logic can be the cause of errors, exhaustive testing of all code branches and automated testing of multiple programmatic paths are typically used to prevent an isolated line of code from causing an error (Huhns & Holderfield, 2002). Data is another cause of non-robust behavior, because systems are only as precise as the data provided; data verification and validation are the mitigations for this cause.
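To make the redundancy idea concrete, the following is a minimal sketch, in Python, of how a triple-redundant sensor arrangement might select a value and flag a disagreeing sensor. The tolerance, readings, and function names are hypothetical illustrations, not taken from any cited avionics implementation.

```python
# Hypothetical sketch of triple-redundant sensor voting: select the median
# reading and flag any sensor that disagrees with it beyond a tolerance.
from statistics import median

DISAGREEMENT_TOLERANCE = 2.0  # assumed units match the sensed quantity


def vote(readings):
    """Return (selected_value, suspect_sensor_indices) for three redundant readings."""
    if len(readings) != 3:
        raise ValueError("expected exactly three redundant readings")
    selected = median(readings)
    suspects = [i for i, r in enumerate(readings)
                if abs(r - selected) > DISAGREEMENT_TOLERANCE]
    return selected, suspects


# Example: the third sensor has drifted and is flagged for exclusion or monitoring.
value, suspects = vote([101.2, 100.9, 57.4])
print(value, suspects)  # 100.9 [2]
```

The mid-value (median) selection is what the tertiary sensor buys: with only two sensors a disagreement can be detected but not attributed to either one.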
Ideally, systems should be built such that, if a system is unable to compute an answer within its defined operating envelope, it signals that failure to the crew and allows them to resolve the situation. That said, disagreeing input computations led the autopilot on Air France 447 to disconnect at 35,000 feet over the Atlantic, leaving the airplane in the crew’s hands (Admiral Cloudberg, 2021). In the final analysis, the crew stalled the airplane, and all data indicates that the controls were held in a high-pitch position all the way to the final impact. This indicates that reverting to manual control as a means of robust design may not be the best answer for all situations.
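As an illustration of that design choice, the sketch below is a hypothetical Python example, not the actual FMS or AF447 logic, with placeholder envelope limits: the computation raises an explicit out-of-envelope condition for the caller to annunciate to the crew rather than silently returning a degraded value.

```python
# Hypothetical sketch: signal an out-of-envelope condition instead of
# silently producing an unvalidated guidance value.
class OutsideOperatingEnvelope(Exception):
    """Raised when inputs fall outside the validated computation envelope."""


def compute_guidance(airspeed_kts: float, altitude_ft: float) -> float:
    # Envelope limits below are placeholders for illustration only.
    if not (100.0 <= airspeed_kts <= 400.0) or not (0.0 <= altitude_ft <= 45000.0):
        raise OutsideOperatingEnvelope(
            f"inputs airspeed={airspeed_kts} kts, altitude={altitude_ft} ft not validated")
    return airspeed_kts * 0.1  # stand-in for the real guidance computation


try:
    target = compute_guidance(airspeed_kts=55.0, altitude_ft=35000.0)
except OutsideOperatingEnvelope as condition:
    # Annunciate to the crew rather than degrading silently.
    print(f"GUIDANCE UNAVAILABLE: {condition}")
```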
Human input is a common problem, and protecting a system from erroneous input is perhaps the most significant challenge for designers (Atkins et al., 2006). Anticipating the range of inputs that a system’s many users could enter is an extraordinary challenge for any designer. American Airlines flight 965 to Cali, Colombia impacted terrain following an erroneous input into the FMS (Ladkin, 1996). Rushing through an approach at an airport without operational radar, accepting a different approach than the one earlier planned and programmed, clearing the programming from the FMS to execute a visual approach, and rushing through FMS waypoint entry without verification against the associated charts are reported to be the most probable causes (Pérez-Chávez & Psenka, 2001). More causes came together, as explained by Reason’s Swiss cheese model, to contribute to the crash (Reason et al., 2006). Other factors included a single-letter identifier for a navaid 150 miles away, which, with some diligence and attention, could have been easily detected had the crew not been rushing. A single letter, R, matching two different navaids, ROMEO and ROZO, caused the airplane to make a sharp left turn and head straight into the terrain. Preventing erroneous input is typically used to maintain the operational boundaries of a system; however, this can prove limiting in itself.
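One way to picture this kind of input protection, following the Cali example above, is a cross-check that refuses to silently accept an ambiguous identifier resolving to a distant fix. The sketch below is a hypothetical Python illustration; the database, coordinates, and 30 NM threshold are invented for the example.

```python
# Hypothetical sketch of erroneous-input protection: an ambiguous navaid
# identifier is cross-checked against distance from the current position,
# and distant matches are routed to crew review instead of being accepted.
from math import hypot

# Invented database: identifier -> candidate fixes with (x, y) in NM on a local grid.
NAVAID_DB = {
    "R": [("ROZO", (2.0, 5.0)), ("ROMEO", (120.0, 90.0))],
}

MAX_REASONABLE_OFFSET_NM = 30.0


def resolve_fix(identifier, current_position):
    """Split candidate fixes into auto-accepted and crew-review lists by distance."""
    accepted, needs_review = [], []
    for name, (x, y) in NAVAID_DB.get(identifier, []):
        distance = hypot(x - current_position[0], y - current_position[1])
        (accepted if distance <= MAX_REASONABLE_OFFSET_NM else needs_review).append((name, distance))
    return accepted, needs_review


accepted, needs_review = resolve_fix("R", current_position=(0.0, 0.0))
print(accepted)      # ROZO, roughly 5.4 NM away: plausible, accepted
print(needs_review)  # ROMEO, 150 NM away: requires explicit crew confirmation
```

The point of the sketch echoes the last sentence of the paragraph above: such a guard maintains the operational boundary, but an over-strict threshold could also reject legitimate entries, which is how preventing erroneous input can prove limiting.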
The first task for the team will be to reproduce and narrow down the errors as much as possible, which allows for a close study of the problem. Documenting the causes in fishbone diagrams allows the team to list every factor that could have led to the issues; working each cause individually to closure would then lead to resolution of the issue. The benefit of using a tiger team for this purpose is the undistracted bandwidth to focus on the issues at hand.
There is no perfect answer to designing robust behavior. Building a complex, comprehensive design is as much an art as it is a science. Achieving a perfect design is an ongoing challenge, and it is worth noting that automation and human factors issues remain a serious concern for the aviation industry even today.
References:
Admiral Cloudberg. (2021, October 9). The long way down: The crash of Air France flight 447. Medium. https://admiralcloudberg.medium.com/the-long-way-down-the-crash-of-air-france-flight-447-8a7678c37982
Atkins, E. M., Portillo, I. A., & Strube, M. J. (2006). Emergency flight planning applied to total loss of thrust. Journal of Aircraft, 43(4), 1205–1216. https://doi.org/10.2514/1.18816
Baraniuk, C. (2022, October 12). The computer errors from outer space. BBC. https://www.bbc.com/future/article/20221011-how-space-weather-causes-computer-errors
Bijjahalli, S., Sabatini, R., & Gardi, A. (2020). Advances in intelligent and autonomous navigation systems for small UAS. Progress in Aerospace Sciences, 115, 100617.
Huhns, M. N., & Holderfield, V. T. (2002). Robust software. IEEE Internet Computing, 6(2), 80–82.
Laakso, M., Takanen, A., & Röning, J. (1999, June). The vulnerability process: A tiger team approach to resolving vulnerability cases. Proceedings of the 11th FIRST Conference on Computer Security Incident Handling and Response.
Ladkin, P. (1996). AA965 Cali accident report. University of Bielefeld.
Mitra, S. (2010). Robust system design. 2010 23rd International Conference on VLSI Design, 434–439. IEEE. https://doi.org/10.1109/VLSI.Design.2010.77
Pavlak, A. (2004). Modern tiger teams.
Pérez-Chávez, A., & Psenka, C. (2001). Systems accidents and epistemological limitations: The case of American Airlines’ flight 965 in Cali, Colombia. Practicing Anthropology, 23(4), 33–38.
Reason, J., Hollnagel, E., & Paries, J. (2006). Revisiting the Swiss cheese model of accidents. Journal of Clinical Engineering, 27(4), 110–115.