A Hybrid Neurosymbolic and Reinforcement Learning Approach to Dynamic Decision-Making in Quest-Driven Narrative Worlds
— *Authored by **Diksha Shrivastava** (diksha-shrivastava13.github.io)*
This preliminary proposal, written in response to the project idea by Red Hen Labs, details my approach to modelling dynamic decision-making by combining neurosymbolic AI with reinforcement learning (RL). My aim is to create a framework that embodies the “wayfinding” model—one that integrates ambiguity, uncertainty, and dynamic goal setting into a richly detailed narrative environment. This proposal is strongly influenced by insights from McCubbins and Turner’s work, which emphasises the fluid, context-dependent construction of selves and the trade-offs between cognitive processing and action.
I aim to model a text-based, quest-driven fictional world—similar to those in Dune, The Three-Body Problem, Lord of the Rings, or Harry Potter—to provide a dynamic, evolving environment in which agents can make decisions, act, and learn from reward models. A quest-driven fictional world allows for dynamic goal setting, with both a higher-level objective and multiple immediate priorities. Unlike a rigid environment, a fictional world lets agents form hypotheses, blend their perceptions, and have their decisions play out over the long term; such delayed rewards are a typical challenge in reinforcement learning.
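As a minimal illustration of the delayed-reward setting (a hedged sketch for intuition only, not part of the proposed framework), the discounted return below shows how a reward that arrives many steps after a decision still shapes the value of that decision:

```python
def discounted_return(rewards, gamma=0.99):
    """Compute the discounted return G = sum_k gamma^k * r_k.

    A reward that arrives many steps after an action still contributes
    to that action's value, which is why quest outcomes that resolve
    late in the narrative can shape early decisions.
    """
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Example: a quest that pays off only at the final step.
print(discounted_return([0.0] * 20 + [1.0]))  # ~0.82 with gamma=0.99
```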
At every point, the choices available to the characters depend on their position in the fictional world’s timeline. This is strictly governed by the neurosymbolic module, which sets the priors, rules, and available actions at any given time. The module also shapes each agent’s decision-making process, depending on whether the agent can afford to think or to act.
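One possible shape for this module is sketched below (a hypothetical sketch; the class and method names are my own assumptions, not a committed design). The symbolic layer maps a position in the timeline to the rules, priors, and admissible actions an agent sees at that moment:

```python
from dataclasses import dataclass, field

@dataclass
class WorldState:
    """Position in the fictional world's timeline plus the facts known so far."""
    timestep: int
    facts: set = field(default_factory=set)

@dataclass
class Rule:
    """A symbolic rule: if all preconditions hold, the action is allowed."""
    action: str
    preconditions: frozenset

class SymbolicModule:
    """Hypothetical neurosymbolic layer: derives priors, rules, and
    admissible actions from the current timeline position."""

    def __init__(self, rules, priors):
        self.rules = rules      # list[Rule]
        self.priors = priors    # dict: action -> prior preference

    def available_actions(self, state: WorldState):
        """Return actions whose preconditions are satisfied by the known facts."""
        return [r.action for r in self.rules if r.preconditions <= state.facts]

    def action_priors(self, state: WorldState):
        """Attach priors to the admissible actions only."""
        return {a: self.priors.get(a, 0.0) for a in self.available_actions(state)}

# Example: early in the timeline, only some quest actions are admissible.
rules = [
    Rule("enter_forbidden_forest", frozenset({"has_permission"})),
    Rule("ask_mentor_for_help", frozenset()),
]
module = SymbolicModule(rules, priors={"ask_mentor_for_help": 0.7})
state = WorldState(timestep=3, facts=set())
print(module.available_actions(state))  # ['ask_mentor_for_help']
```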
Using LLMs as agents or “characters” in the fictional world would allow regulation of the cost of thinking and acting, enable the agent to simulate various scenarios before taking action, and support the blending of selves—from past memories, perceptions of other characters, and the environment at the moment of decision. Developing an explainable and interpretable model is thus necessary for the success of this project.
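A minimal sketch of how an LLM-backed character could weigh thinking against acting under a budget follows; every name, the budget mechanism, and the placeholder policy are illustrative assumptions rather than the proposal’s final design:

```python
import random

class LLMCharacter:
    """Hypothetical LLM-backed agent that pays a cost to deliberate
    (simulate scenarios) and a cost to act in the world."""

    def __init__(self, llm, think_cost=1.0, act_cost=2.0, budget=10.0):
        self.llm = llm              # any callable: prompt -> text
        self.think_cost = think_cost
        self.act_cost = act_cost
        self.budget = budget
        self.memory = []            # past events; one source of the blended self

    def blended_context(self, observation):
        """Blend past memories with the perception of the current scene."""
        return {"memories": self.memory[-5:], "observation": observation}

    def step(self, observation, available_actions):
        context = self.blended_context(observation)
        # Deliberate only while the budget allows it; otherwise act immediately.
        if self.budget >= self.think_cost + self.act_cost:
            self.budget -= self.think_cost
            hypothesis = self.llm(
                f"Simulate outcomes of {available_actions} given {context}"
            )
            self.memory.append(("hypothesis", hypothesis))
        self.budget -= self.act_cost
        action = random.choice(available_actions)  # placeholder policy
        self.memory.append(("action", action))
        return action
```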
I present pseudocode in the next section to illustrate how these modules interact.
The project’s foundation will be a simulated narrative world with a novel-like setting (imagine a universe similar to Harry Potter or a broader epic like Lord of the Rings). This environment will:
To bridge the gap between raw sensory input and high-level reasoning:
To capture the dynamic “blending of selves” and foster hypothesis generation: