When Workflows Stay Stable but Meaning Moves in Agentic Analyst Pipelines
Abstract
Agentic analyst pipelines can keep a workflow object stable while its operational meaning shifts underneath. A coded event, alert, or dashboard tile may persist unchanged while the reporting frame moves from accident to attack, or from outage to state-linked operation. This semantic movement can reach the analyst undetected, degrading situation awareness while the interface appears organized. This research proposes a measurement and governance layer for detecting semantic frame shifts in agentic analyst workflows, using 1032 headline and snippet records organized into 37 adjudicated event-window states across 10 event families. Three findings bear directly on human-agentic workflow design. First, human coders outperformed all tested LLM coding agents on a binary frame-shift endpoint: the best LLM condition reached kappa 0.549 against adjudication, compared to kappa 0.784 for human coders, confirming that human semantic judgment is a more reliable reference in analyst triage workflows. Second, our proposed embedding-based metrics outperformed every LLM condition as computational detectors: translational drift reached AUROC 0.894, neighborhood rewiring AUROC 0.841, and a combined reframing-pressure score AUROC 0.909. These metrics classify workflow states into stable interpretation, smooth directional drift, semantic churn, and major reframing. Third, LLM coding agents varied materially by model and prompt policy, paralleling the variation observed across human coders. Consensus among human judgment, semantic metrics, and LLM signals therefore cannot be assumed and must be treated as an active governance variable in workflow design. Together these findings support an orchestration layer that routes high-reframing or high-disagreement cases to human review while allowing semantically stable outputs to pass forward. As a demonstration, the proposed metrics and approach are implemented in an open source intelligence system.
Keywords: human-agentic workflows, agentic AI, semantic drift, LLM-as-judge, situation awareness, analyst pipelines
DOI: 10.54941/ahfe1007675
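
The abstract names its agreement statistic and metrics without defining them, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. Cohen's kappa is computed in its standard two-rater form. Reading "translational drift" as the cosine distance between the embedding centroids of two consecutive event windows is an assumption, as are the function names and the routing thresholds, which are hypothetical placeholders rather than values from the paper.

```python
import math
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Standard Cohen's kappa for two equal-length label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(u, v):
    """1 - cosine similarity; raises if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("zero-norm vector")
    return 1.0 - dot / norm


def translational_drift(prev_window, next_window):
    """ASSUMPTION: drift = cosine distance between window centroids."""
    return cosine_distance(centroid(prev_window), centroid(next_window))


def route(drift, coder_disagreement,
          drift_threshold=0.3, disagreement_threshold=0.5):
    """Hypothetical routing rule; both thresholds are illustrative only."""
    if drift >= drift_threshold or coder_disagreement >= disagreement_threshold:
        return "human_review"
    return "pass_forward"
```

In this sketch, a window whose centroid moves sharply, or whose coders disagree strongly, is escalated to a human reviewer, while everything else passes forward. A real deployment would need thresholds calibrated against adjudicated labels, and the paper's neighborhood-rewiring and reframing-pressure scores are not modeled here.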


AHFE Open Access