Multiple Agents Interacting via Probability Flows on Factor Graphs
Open Access
Article
Conference Proceedings
Authors: Francesco Palmieri, Krishna Pattipati, Giovanni Di Gennaro, Amedeo Buonanno, Martina Merola
Abstract: Research on expert team decision-making shows that effective teams have shared goals, use shared mental models to coordinate with minimal communication, establish trust through cross-training, and match task structures through planning. Two key questions follow: Do the best practices of human teams translate to hybrid human-AI agent teams, or to teams of autonomous agents alone? Is there a mathematical framework for studying shared goals and mental models? We propose factor graphs as a framework for studying multi-agent interaction and agile cooperative planning. One promising avenue for modeling interacting agents in real environments is the stochastic approach, in which probability distributions describe uncertainties and imperfect observations. Stochastic dynamic programming provides a framework for modeling multiple agents as scheduled, interacting Markov Decision Processes (MDPs), wherein each agent has partial information about the other agents in the team. Each agent acts by accounting, even if only implicitly, for both its own objectives and the anticipated behaviors of the others. We have shown that dynamic programming, maximum likelihood, maximum entropy, and free-energy-based methods for stochastic control are special cases of probabilistic message propagation rules on factor graphs. Now we show how multiple agents, modeled as multiple interacting factor graphs, exchange probability distributions that carry partial mutual knowledge. We demonstrate the ideas with agents moving on a discrete grid with obstacles and pre-defined semantic areas (grassy areas, pathways), where each agent has a different destination (goal). The scheduling of agents is either fixed a priori or changes over time, and the forward-backward flow for each agent's MDP is computed at every time step, with additional branches that send probability distributions to, and receive them from, the MDPs of the other agents.
These interactions avoid collisions among agents and enable dynamic planning, because each agent accounts for estimates of the posterior probabilities of the other agents' states at future times, with adjustable precision and timing. The simulations included a small number of interacting agents (three) on a small rectangular discrete grid, with starting points and destination goals, obstacles in various positions, narrow passages, small mazes, destinations that require coordination, etc. The solutions provided by the probabilistic model are interesting because, solely as a result of the probability distributions flowing through the system of interacting agents, agents that encounter potential conflicts in some regions autonomously adapt their strategies, for example by waiting to let others pass or by taking different paths. The information available to each agent is a combination of the rewards received from the environment and inferences about the other agents. Previously, we described a scheme based on a hierarchy (prioritized order) of agents and a unique value function for each agent. Now, we propose a different, tunable interaction, wherein each agent dynamically transmits the posterior probability of its position at future time steps to the other agents. The new framework allows flexibility in tuning the information that each agent has about the others, ranging from complete knowledge of the others' goals and positions to a limited probabilistic awareness, in both precision and time, of where the others may be located at future time steps. This framework systematically addresses questions such as the minimal amount of information needed for effective team coordination in the face of changes in goals, communication bandwidth, grid parameters, and agent status.
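As a rough illustration of the forward-backward flow described in the abstract, the following minimal sketch computes the posterior over one agent's position at each time step on a small grid, and optionally injects another agent's predicted occupancy as an extra factor. It is not the authors' implementation: the uniform random-walk transition model, the sum-product rule, the multiplicative factor `1 - q_t(s)` used for collision avoidance, and all names (`position_posteriors`, `other`, `free`) are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Set, Tuple

Cell = Tuple[int, int]
Dist = Dict[Cell, float]

def neighbors(cell: Cell, free: Set[Cell]) -> List[Cell]:
    """Cells reachable in one step (stay, up, down, left, right) that are free."""
    r, c = cell
    moves = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
    return [(r + dr, c + dc) for dr, dc in moves if (r + dr, c + dc) in free]

def normalize(d: Dist) -> Dist:
    z = sum(d.values())
    if z == 0.0:
        raise ValueError("no feasible trajectory under the given constraints")
    return {s: v / z for s, v in d.items()}

def position_posteriors(free: Set[Cell], start: Cell, goal: Cell, horizon: int,
                        other: Optional[List[Dist]] = None) -> List[Dist]:
    """Posterior over the agent's cell at t = 0..horizon.

    `other[t]` (hypothetical) is another agent's predicted occupancy at time t.
    It enters as the assumed factor 1 - other[t][s] for t >= 1.
    """
    def avoid(t: int, s: Cell) -> float:
        return 1.0 - other[t].get(s, 0.0) if other else 1.0

    # Forward messages: start from the known initial cell.
    fwd: List[Dist] = [{s: 1.0 if s == start else 0.0 for s in free}]
    for t in range(1, horizon + 1):
        msg = {s: 0.0 for s in free}
        for s, p in fwd[-1].items():
            nbrs = neighbors(s, free)
            for n in nbrs:
                msg[n] += p / len(nbrs)  # assumed uniform random-walk prior
        fwd.append(normalize({s: v * avoid(t, s) for s, v in msg.items()}))

    # Backward messages: start from the goal constraint at the final time.
    bwd: List[Dist] = [dict() for _ in range(horizon + 1)]
    bwd[horizon] = {s: 1.0 if s == goal else 0.0 for s in free}
    for t in range(horizon - 1, -1, -1):
        msg = {}
        for s in free:
            nbrs = neighbors(s, free)
            msg[s] = sum(bwd[t + 1][n] * avoid(t + 1, n) for n in nbrs) / len(nbrs)
        bwd[t] = normalize(msg)

    # Posterior at each time is the normalized product of the two messages.
    return [normalize({s: fwd[t][s] * bwd[t][s] for s in free})
            for t in range(horizon + 1)]

# Usage: a 2x3 grid with no obstacles; the other agent is known to be
# at cell (0, 1) at time 1, so this agent must wait or detour.
free_cells = {(r, c) for r in range(2) for c in range(3)}
other_agent = [{}, {(0, 1): 1.0}, {}, {}, {}]
post = position_posteriors(free_cells, (0, 0), (0, 2), 4, other_agent)
```

In this toy setting the posterior at time 1 places no mass on the cell claimed by the other agent, so the remaining mass is split between waiting in place and stepping aside. Replacing the hard occupancy in `other_agent` with a spread-out distribution gives a softer, tunable version of the same constraint.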
Keywords: Interacting Multiple Agents, Factor Graphs, Path Planning, Probability Flows
DOI: 10.54941/ahfe1003761