The Case Against AI-Generated Safety Recommendations in Aviation

As artificial intelligence systems demonstrate impressive capabilities in pattern recognition and data analysis, a pressing question follows: should these systems also generate safety recommendations from the analytics they produce?

This question requires careful examination of both AI's technical capabilities and the organizational dynamics that govern safety improvement. 


Overview

  1. Distinguishing AI Capabilities: Analysis vs. Recommendation

  2. The Accountability Gap in High-Risk Contexts

  3. System Complexity and the Limits of Pattern Recognition

  4. Trust Dynamics and Professional Relationships

  5. Organizational and Implementation Realities

  6. Toward Appropriate AI Integration

  7. Human Intelligence in Safety-Critical Industries is Safety-Critical


Distinguishing AI Capabilities: Analysis vs. Recommendation

The debate benefits from precision about what AI can and cannot do reliably. AI systems excel at specific analytical tasks: processing large datasets, identifying statistical patterns, and flagging anomalies for human review. These capabilities have legitimate applications in aviation safety, from predictive maintenance to identifying clusters of similar incidents. However, generating actionable safety recommendations requires a qualitatively different form of reasoning.

Modern AI systems exhibit characteristics that challenge traditional safety certification: they are data-intensive, often opaque in how they reach their outputs, and can behave unpredictably under novel conditions. These challenges intensify when moving from pattern identification to recommendation generation. An AI system might correctly identify that certain procedural deviations correlate with incidents, but determining whether to recommend procedural changes, additional training, technology modifications, or organizational restructuring requires contextual judgment that extends beyond statistical inference.

Research in high-stakes decision-making emphasizes a key principle: for critical applications, recommendations should only come from analysis when the relationships between variables are fully understood. Yet even interpretable AI faces limitations in recommendation generation. Statistical relationships identified in historical data may not hold under changed conditions, and recommendations must account for implementation feasibility, organizational capacity, regulatory constraints, and unintended consequences that AI systems cannot reliably anticipate. AI lacks the contextual understanding and accountability structures that safety-critical recommendations require.


The Accountability Gap in High-Risk Contexts

Aviation safety operates under stringent accountability frameworks precisely because failures carry catastrophic consequences. Safety-critical systems undergo extensive testing, evaluation, verification, and validation before certification. Yet the aviation industry lacks well-established procedures for certifying data-intensive AI systems, reflecting genuine uncertainty about how to validate systems whose outputs emerge from complex pattern recognition rather than explicit logical rules.

The accountability problem has both technical and organizational dimensions. Unlike conventional software that follows predefined rules, modern AI systems learn, adapt, and generate responses dynamically. This makes their decision-making process less predictable and creates challenges in pinpointing responsibility when mistakes occur. When a flawed recommendation contributes to an incident, the question "who is responsible" becomes difficult to answer. Is it the algorithm developers, the organization that deployed the system, the safety professionals who reviewed (or failed to review) the recommendation, or some combination of these parties?

Regulatory frameworks increasingly recognize AI systems as high-risk, requiring strong oversight, transparency, and continuous monitoring. This is particularly critical for learning models that can evolve in unpredictable ways, potentially introducing new vulnerabilities. These requirements acknowledge the fundamental challenge: in systems that learn and adapt, behavior can drift from initial specifications in ways that are difficult to detect until failures occur.

However, accountability concerns extend beyond regulatory compliance to the practical dynamics of safety improvement. Safety recommendations succeed or fail based on organizational buy-in, resource allocation, and sustained implementation effort. When recommendations come from human experts, there is someone to call when implementation encounters obstacles, someone who can clarify intent, adapt approaches, and troubleshoot unanticipated problems. AI-generated recommendations lack this ongoing accountability relationship, creating a disconnect between recommendation and implementation that undermines effectiveness.

It bears acknowledging that human experts also face accountability challenges. Experts can err, recommendations can prove ill-conceived, and human judgment is subject to biases. The difference lies in the nature of accountability: human experts can be questioned, can explain their reasoning in dialogue, can be held professionally responsible for their advice, and can learn from failures in ways that inform future practice. These accountability mechanisms remain underdeveloped for AI systems.


System Complexity and the Limits of Pattern Recognition

Aviation safety analytics must account for deeply interconnected sociotechnical systems where surface patterns can mislead and context shapes interpretation. Expert judgment proves essential for unraveling these complexities, drawing on structured analysis techniques, uncertainty quantification, and collaborative problem-solving. This expertise involves more than recognizing patterns in data; it requires understanding how operational procedures, organizational culture, regulatory frameworks, technology systems, and human factors interact to produce safety outcomes.

This expertise involves more than recognizing patterns in data; it requires understanding how operational procedures, organizational culture, regulatory frameworks, technology systems, and human factors interact to produce safety outcomes.
— Savannah Vlasman, Founder and Chief Scientific Officer, Sociometri


Consider a concrete example: survey data might reveal that compliance with certain procedures correlates negatively with safety incidents. An AI system trained on this pattern might recommend stricter enforcement or additional training. However, expert analysis might reveal that the procedures in question are poorly designed for actual operational conditions, and that high compliance correlates with fewer incidents only because it reflects organizations with stronger overall safety cultures, not because the specific procedures are effective. The appropriate recommendation might be procedure redesign rather than enforcement, a conclusion that requires contextual understanding of operations, not just pattern recognition in survey data.
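
To make the confounding dynamic concrete, the following minimal Python sketch uses entirely invented, simulated data: a latent "safety culture" factor drives both procedure compliance and incident rates, so compliance looks protective until the confounder is adjusted for. The variable names and coefficients are hypothetical and chosen only for illustration.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5000

    # Hypothetical latent safety-culture score for each organization-month.
    culture = rng.normal(0.0, 1.0, n)

    # Compliance rises with culture; incidents fall with culture.
    # The procedure itself contributes nothing to incident risk in this toy model.
    compliance = 0.8 * culture + rng.normal(0.0, 1.0, n)
    incidents = -0.8 * culture + rng.normal(0.0, 1.0, n)

    # Naive view: compliance appears clearly protective.
    print("corr(compliance, incidents):",
          round(float(np.corrcoef(compliance, incidents)[0, 1]), 2))

    # Regress the confounder out of both variables; the apparent
    # protective effect largely disappears.
    resid_c = compliance - np.polyval(np.polyfit(culture, compliance, 1), culture)
    resid_i = incidents - np.polyval(np.polyfit(culture, incidents, 1), culture)
    print("corr after adjusting for culture:",
          round(float(np.corrcoef(resid_c, resid_i)[0, 1]), 2))

In real operations the confounder is rarely measured this cleanly, which is precisely why expert judgment about what to measure, and what to control for, matters more than the pattern itself.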

In safety-critical environments, software alone cannot replicate professional investigative thinking. This limitation reflects the difference between pattern matching and causal reasoning. AI systems identify correlations in historical data, but safety recommendations must be based on causal understanding of how interventions will produce desired effects. Current AI capabilities in causal inference remain limited, particularly in complex organizational contexts where interventions interact with existing practices, culture, and systems in ways that historical data may not adequately capture.

The customization challenge amplifies these limitations. Each aviation organization has unique characteristics: different fleet compositions, operational procedures, organizational cultures, regulatory environments, and resource constraints. Effective safety recommendations must account for these specifics, because technology and human elements work together to determine safety outcomes. Human consultants engage in dialogue to understand these contextual factors and tailor recommendations accordingly. AI systems trained on aggregated data may miss crucial organization-specific factors that determine whether recommendations are appropriate and feasible.


That said, this does not mean AI has no role in supporting this work. AI could assist human experts by identifying patterns deserving deeper investigation, processing large volumes of survey responses to flag themes, or providing visualizations that make complex data relationships more accessible. The key distinction is that AI serves as an analytical tool supporting human judgment rather than supplanting it.
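
As one hypothetical illustration of that supporting role, the short Python sketch below (using the scikit-learn library and invented example comments) groups free-text survey responses into rough candidate themes for an analyst to review. The grouping is a starting point for human investigation, not a recommendation.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    # Invented example comments; real inputs would come from the
    # organization's own safety reporting and survey channels.
    comments = [
        "checklist steps skipped during quick turnarounds",
        "quick turnaround pressure leads to missed checklist items",
        "fatigue on back-to-back night rotations",
        "night shift fatigue affecting handover quality",
        "ground crew unclear on the revised de-icing procedure",
    ]

    # Convert comments to TF-IDF vectors and group them into rough themes.
    vectors = TfidfVectorizer(stop_words="english").fit_transform(comments)
    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(vectors)

    # Present the groupings for human review; the analyst decides what,
    # if anything, they mean and whether any action is warranted.
    for theme in sorted(set(labels)):
        print(f"Candidate theme {theme}:")
        for text, label in zip(comments, labels):
            if label == theme:
                print("  -", text)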



Trust Dynamics and Professional Relationships

The adoption of safety recommendations depends not only on their technical merit but on trust relationships between advisors and clients. Research on human-AI trust reveals important dynamics that apply directly to aviation safety consulting. Trust in AI tends to be lower in domains traditionally dominated by human expertise, and trust in AI systems takes longer to build while being lost more quickly when problems occur.

This trust gap has practical consequences. Safety recommendations often require significant organizational changes, resource investments, or procedural modifications that create resistance. Successful implementation depends on credibility and relationships. Aviation safety professionals need to feel that recommendations come from advisors who understand operational pressures, appreciate organizational constraints, and genuinely prioritize safety over what an algorithm happens to optimize. Users may recognize AI's technical capabilities but still withhold trust when they question the system's intentions or alignment with their goals.


The relational dimension extends beyond initial adoption to ongoing implementation support. Research consistently shows that professionals prefer human interaction for scenarios requiring deep empathy or complex decision-making. In aviation safety, recommendations may touch on sensitive organizational issues, expose problematic practices, or challenge established procedures. The consultative relationship allows for candid discussion, iterative refinement, and the kind of organizational problem-solving that implementation requires.


The explainability challenge compounds trust issues. For AI systems to be trustworthy, their decision-making processes must be understandable, traceable, and auditable. Even with explainability techniques, AI-generated recommendations may fail to answer the specific questions that safety professionals need addressed. Why does this recommendation make sense given our specific operational context? What second-order effects might we anticipate? How should we adapt the recommendation to our organizational constraints? These questions require dialogue and contextual reasoning that current AI capabilities cannot provide.

For AI systems to be trustworthy, their decision-making processes must be understandable, traceable, and auditable.
— Savannah Vlasman, Founder and Chief Scientific Officer, Sociometri


However, the trust argument requires nuance. Human experts also face trust challenges, particularly when recommendations conflict with organizational interests or established practices. Human bias, whether from professional backgrounds, consulting incentives, or limited exposure to diverse operational contexts, can skew recommendations. The advantage of human expertise lies not in perfect objectivity but in the ability to engage transparently about reasoning, respond to challenges, and revise recommendations through dialogue. These interactive trust-building mechanisms remain unavailable for AI-generated recommendations.




Organizational and Implementation Realities

The effectiveness of safety recommendations depends heavily on organizational dynamics that extend beyond technical accuracy. Organizations vary in their capacity to implement changes, their receptiveness to external advice, and their ability to integrate new practices into existing operational frameworks. Human consultants navigate these dynamics through ongoing engagement, relationship management, and adaptive problem-solving as implementation unfolds.


Research on safety-critical systems consistently emphasizes that technology must be embedded within appropriate organizational practices. AI applications in aviation remain in early developmental stages, with certification frameworks still being outlined. This immaturity suggests that rather than rushing to deploy AI for recommendation generation, the field needs sustained work on understanding how AI can appropriately support human decision-making in safety contexts.


The maturity gap has practical implications. Organizations implementing AI-generated recommendations would face challenges in validation, customization, and troubleshooting. Without established frameworks for evaluating AI recommendations, safety professionals must either accept recommendations on faith or develop ad-hoc validation approaches. Both paths introduce risks: uncritical acceptance of potentially flawed recommendations, or inconsistent validation practices that undermine systematic safety improvement.




Toward Appropriate AI Integration

This analysis does not suggest that AI has no place in aviation safety analytics. Rather, it argues for clear boundaries around appropriate use. AI can valuably support safety analytics by processing large datasets, identifying patterns for human investigation, visualizing complex relationships, and flagging anomalies. These applications leverage AI's strengths in pattern recognition while preserving human judgment at the recommendation stage.

The principle is straightforward: AI should augment professional expertise, not automate safety-critical judgments.
— Savannah Vlasman, Founder and Chief Scientific Officer, Sociometri

The principle is straightforward: AI should augment professional expertise, not automate safety-critical judgments. Safety investigation, root cause analysis, and recommendation development demand expertise, critical thinking, and accountability that only trained professionals can provide. This suggests a division of labor: AI handles data-intensive analytical tasks while human experts handle interpretation, contextualization, and recommendation development.
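
To ground that division of labor, here is one more minimal, hypothetical sketch of a "data-intensive analytical task": a simple statistical screen that flags unusually high monthly event counts for expert investigation, without proposing any intervention. The counts and threshold are invented for illustration.

    import statistics

    # Hypothetical monthly counts of one event type for a single fleet.
    monthly_counts = [4, 3, 5, 4, 6, 3, 4, 12, 5, 4]

    mean = statistics.mean(monthly_counts)
    stdev = statistics.stdev(monthly_counts)

    # Flag months more than two standard deviations above the mean
    # as candidates for human investigation; nothing more.
    for month, count in enumerate(monthly_counts, start=1):
        if stdev > 0 and (count - mean) / stdev > 2:
            print(f"Month {month}: count of {count} flagged for expert review")

Why the flag appears, and what (if anything) to do about it, remains a question for the safety professionals who understand the operation behind the numbers.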



Human Intelligence in Safety-Critical Industries is Safety-Critical

The limitations of AI for generating safety recommendations in aviation reflect both current technological boundaries and deeper challenges inherent in the nature of safety-critical decision-making. While AI capabilities will undoubtedly improve, several core challenges resist purely technical solutions: the accountability gap when responsibility for recommendations becomes diffuse, the contextual understanding required to translate patterns into appropriate organizational interventions, and the trust relationships necessary for effective implementation.

The case for human expertise in aviation safety recommendations rests on practical realities about how safety improvement actually occurs. Recommendations succeed when they combine technical understanding with contextual judgment, when they account for organizational realities, and when they are backed by ongoing consultative relationships that support implementation. Human experts, despite their limitations and biases, can provide these elements in ways that current AI systems cannot.

This conclusion does not preclude AI use in aviation safety analytics. AI can valuably support data analysis, pattern identification, and information synthesis. The appropriate boundary lies between analytical support and recommendation generation, between tools that enhance human judgment and systems that attempt to replace it. As regulatory frameworks mature and AI capabilities develop, these boundaries may shift.

For now, prudence suggests maintaining human judgment at the center of safety recommendation processes while exploring how AI can appropriately enhance the analytical work that informs those recommendations.

Cover photo by Brent Vlasman, taken at the Hello Stavanger 2025 Conference, showing the placement of both “Transportation Systems” and “Safety” in the High Risk category for AI usage, according to keynote speaker and AI expert Richard Campbell in his address, “Beyond the AI Hype: What's Real, What's Next.” Available here in full (link).
