
AI Agents from Scratch Part 2: Giving the Brain a Body


In Part 1, we laid the technical foundation of MAIS (Multi-Agent Interoperability Systems). We successfully wired up the 'Brain', built a robust Adapter using LiteLLM, locked down our API keys with IRIS Credentials, and finally cracked the code on the Python interoperability puzzle.

However, right now our system is merely a raw pipe to an LLM. It processes text, but it lacks identity.

Today, in Part 2, we will define the Anatomy of an Agent. We will move from simple API calls to structured Personas. We will learn how to wrap the LLM in a layer of business logic, giving it a name, a role, and, most importantly, the ability to know its neighbors.

Let’s build the "Soul" of our machine.

The Anatomy of an Agent: More Than Just a Prompt

Now that we have a connection to the "Brain" (the LLM), we need to grant it a personality. A common misconception is that an Agent is simply a system prompt, e.g., "You are a helpful assistant." That’s just a chatbot.

True Agentic AI stands out because it does not require a babysitter. It combines autonomy with a serious drive to complete the job. It looks ahead, e.g., verifying inventory before booking a sale, and if it runs into a roadblock, it figures out a workaround instead of just giving up.

To encapsulate this complexity within IRIS, I developed the dc.mais.adapter.Agent class. It acts as a "Persona Definition," effectively wrapping the raw LLM within strict operational boundaries.

Every agent is built on a specific configuration set. We always start with the Name and Role to establish a unique identifier and expertise domain (e.g., a "French Cuisine Expert"). To prevent hallucinations or scope creep, we set a hard Goal and a detailed checklist of Tasks. We also enforce communication standards via OutputInstructions, telling the agent to be concise or avoid specific characters, and provide the Tools (JSON definitions) it is authorized to execute.

Why "Target" Matters

No matter how you look at it, the most critical component in this setup is the Target. This property enables what I call a Decentralized Handoff.

Instead of routing everything through the central supervisor, the Target property provides a comma-separated list of valid next steps in the chain. For example, a MenuExpert agent knows its job is to help choose food. Yet, thanks to the Target property, it also understands that once the user says "I want the bill," it must pass the ball to the CashierAgent.

It creates a "reasoning engine" where the LLM comprehends its own boundaries: "I am the food expert, but I am not allowed to process payments. I need to call the Cashier."

Below, you can see how the class looks so far:

Class dc.mais.adapter.Agent Extends Ens.OutboundAdapter
{

/// Controls which properties are visible in Production settings
Parameter SETTINGS = "Name:Basic,Role:Basic,Goal:Basic,Tasks:Basic:textarea?rows=5&cols=50,OutputInstructions:Basic:textarea?rows=5&cols=50,Tools:Basic:textarea?rows=5&cols=50,Target:Basic,Model:Basic,MaxIterations,Verbose";

/// Unique identifier for the agent
Property Name As %String;

/// Primary objective the agent is designed to achieve
Property Goal As %String(MAXLEN = 100);

/// Description of the agent's function and expertise, e.g. "This assistant is knowledgeable, helpful, and suggests follow-up questions."
Property Role As %String(MAXLEN = 350);

/// Guidelines for how the agent should format and present responses
Property OutputInstructions As %String(MAXLEN = 1000);

/// Ordered list of responsibilities and actions the agent must perform
Property Tasks As %String(MAXLEN = 1000);

/// List of callable functions available to the agent
Property Tools As %String(MAXLEN = 10000);

/// Names of the next agents allowed to receive a handoff
/// (comma-separated, e.g., "OrderTaker,OrderSubmitter"; could be extended to JSON for extensibility)
Property Target As %String(MAXLEN = 1000);

/// LLM model name (allows different agents to use different models)
Property Model As %String;

/// Maximum number of tool-calling iterations before stopping
Property MaxIterations As %Integer [ InitialExpression = 3 ];

/// Enables verbose diagnostic logging (referenced in the SETTINGS list above)
Property Verbose As %Boolean [ InitialExpression = 0 ];

}

This layering is the secret sauce. It elevates us from a generic "I hope the AI understands" approach to a structured "Plan, Assign, and Monitor" system. Essentially, we are wrapping the raw unpredictability of an LLM in a safety blanket of business logic.

Still, having these properties in a database class is not sufficient. We also need to translate these strict configurations into natural-language instructions that the LLM respects.

This is where Dynamic Prompt Engineering enters the picture.

I implemented a method called GetAgentInstructions that acts as a factory for the agent's personality. It does not simply concatenate strings; it constructs the whole mental model for the AI layer by layer.

The "Knowledge of Neighbors" (Handoff)

Pay attention to the logic inside the If (..Target '= "") block since it is the glue that holds the network together. In fact, we are telling the agent exactly who its neighbors are.

This functions as an "Allow List." It prevents the MenuExpert from attempting to transfer a customer to a non-existent ParkingAttendant. It enforces the business process flow at the prompt level. While the actual transfer mechanism belongs to the Orchestrator (which we will cover soon), the awareness of the transfer starts here.

Defensive Prompting

You should also notice the section on Tool Usage Guidelines. We explicitly command the model: "Do NOT guess or invent data."

It is defensive programming applied to English. We are pre-emptively stopping the model from hallucinating a menu or faking an order confirmation and forcing it to utilize the native tools we provide.

Check out the implementation below:

Method GetAgentInstructions(Output oPrompt As %String) As %Status
{
    Set tSC = $$$OK
    Set oPrompt = "" // Ensures it's not null
    Try {
        Set prompt = "You are "_..Name_", a specialized agent."_$C(10)
        Set:(..Role '= "") prompt = prompt_"## Your Role: "_..Role_$C(10)
        Set:(..Goal '= "") prompt = prompt_"## Your Goal: "_..Goal_$C(10)
        Set:(..Tasks '= "") prompt = prompt_"## Your Tasks: "_..Tasks_$C(10)
        Set:(..OutputInstructions '= "") prompt = prompt_"## Output Instructions: "_..OutputInstructions_$C(10)

        // --- Handoff Logic: Introducing the neighbors ---
        If (..Target '= "") {
            Set prompt = prompt_"## Handoff Capabilities:"_$C(10)
            Set prompt = prompt_"- You can transfer the conversation ONLY to the following agents: "_..Target_$C(10)
            Set prompt = prompt_"- Use the 'handoff_to_agent' tool with one of these exact names."_$C(10)
        }

        // --- Defensive Prompting for Tools ---
        If (..Tools '= "") {
            Set prompt = prompt_"## Tool Usage Guidelines:"_$C(10)
            Set prompt = prompt_"- You have access to functions (tools) to get real data."_$C(10)
            Set prompt = prompt_"- You MUST call the function natively when needed."_$C(10)
            Set prompt = prompt_"- Do NOT guess or invent data. Use the function."_$C(10)
            Set prompt = prompt_"- NEVER write the function call JSON in the response text. Just trigger the function."_$C(10)
        }

        Set prompt = prompt_"# Remember: You are part of a multi-agent system."
        Set oPrompt = prompt

    } Catch ex {
        Set tSC=ex.AsStatus()
        $$$LOGERROR("Error generating instructions: "_ex.DisplayString())
    }
    Return tSC
}
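
To see the factory in action, below is a minimal sketch of how you could exercise the method from an IRIS terminal session. The property values are purely illustrative, not the exact Bistro configuration we will use later:

// Quick sanity check: build a throwaway adapter instance and print its generated persona
Set agent = ##class(dc.mais.adapter.Agent).%New()
Set agent.Name = "Greeter"
Set agent.Role = "Welcome customers and provide initial menu information"
Set agent.Goal = "Make customers feel welcomed"
Set agent.Target = "MenuExpert"
Set tSC = agent.GetAgentInstructions(.prompt)
Write prompt,!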

Now that our agents have their orders and recognize their neighbors, we need a Commander to ensure they actually stick to the script. So, let’s enter the Orchestrator.

The Nervous System: Orchestrating the Bistro Crew

It is time for us to move on to the nervous system: the Orchestrator.

I find it significantly easier to grasp the Orchestrator by creating a tangible project rather than discussing abstract theory. So, let’s head to our dc.samples package and build a "Bistro Crew" to validate our framework.

The concept is straightforward: we will establish a team of attendants for a small bistro. We will need a Greeter to welcome guests and a Menu Expert to handle the culinary details.

1. Hiring the Staff (Configuring Agents)

Since we designed our dc.mais.operation.Agent class for reusability, we do not need to write new code for these agents. We should simply add them to the Production and configure the settings.

[Caption: Setting up the crew: Adding a reusable Agent Operation to the Production. No new code required, just configuration.]

Let’s add the first one, Agent.Greeter. In the Basic Parameters, we should define its soul:

Name: Greeter
Role: Welcome customers and provide initial menu information
Goal: Make customers feel welcomed and guide them to the appropriate specialist
Tasks: 
- Welcome customers with a warm, professional greeting
- Provide brief overview of restaurant specialties
- Identify customer needs (menu info, ordering, or general questions)
- Handoff to MenuExpert when customer wants detailed menu information
OutputInstructions: 
- Greet customers warmly and professionally
- Keep responses concise and inviting (2-3 sentences max)
- Always end with a question to engage the customer


We will return to the MenuExpert shortly. First, let’s make sure they have a brain. Just as we did for the Agent, I created a generic Business Operation dc.mais.operation.LLM using the adapter we built earlier. Simply add it to the production, and we are ready to go.
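
For reference, here is a stripped-down sketch of what such an operation could look like. The class, adapter, message, and method names in this sketch are placeholders standing in for whatever Part 1 actually exposes:

/// Illustrative sketch only: a generic Business Operation that delegates to the LLM adapter.
Class dc.samples.sketch.LLMOperation Extends Ens.BusinessOperation
{

/// The LiteLLM adapter built in Part 1 (class name assumed here)
Parameter ADAPTER = "dc.mais.adapter.LLM";

Method Chat(pRequest As dc.mais.msg.LLMRequest, Output pResponse As dc.mais.msg.LLMResponse) As %Status
{
    Set tSC = $$$OK
    Try {
        Set pResponse = ##class(dc.mais.msg.LLMResponse).%New()
        // Delegate the actual LLM call to the adapter; the 'Chat' method and its
        // arguments are placeholders for the real adapter interface from Part 1.
        Set tSC = ..Adapter.Chat(pRequest.Content, pRequest.UserContent, .answer)
        If $$$ISOK(tSC) Set pResponse.Content = answer
    } Catch ex {
        Set tSC = ex.AsStatus()
    }
    Return tSC
}

XData MessageMap
{
<MapItems>
  <MapItem MessageType="dc.mais.msg.LLMRequest">
    <Method>Chat</Method>
  </MapItem>
</MapItems>
}

}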

Great!!

2. Building the Flow (The BPL)

Now, let's create a Business Process named Orchestrator.

This process requires a Request message containing the following:

  • Sender: The name of the agent sending the message (if any).
  • Assignee: The specific agent we want to target.
  • Content: The user’s interaction text.

For the response, a simple Content property to hold the agent's reply is sufficient.
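
A minimal sketch of these two message classes might look like the following (the package and class names are mine; adapt them to your project):

/// Sketch of the orchestration request message (names are illustrative)
Class dc.samples.msg.AgentRequest Extends Ens.Request
{

/// Name of the agent sending the message, if any
Property Sender As %String;

/// The specific agent we want to target
Property Assignee As %String;

/// The user's interaction text
Property Content As %String(MAXLEN = "");

}

/// Sketch of the orchestration response message
Class dc.samples.msg.AgentResponse Extends Ens.Response
{

/// The agent's reply
Property Content As %String(MAXLEN = "");

}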

I prefer using Context Variables to keep the state clean. So, the first thing we do in the BPL is assign the incoming Request properties to the Context.

The "Cold Start" Logic: If this is the very first execution, the Assignee will be empty. We need to decide who initiates the conversation. For that reason, I added a simple If condition: if Assignee is empty, set the Target to 'greeter'.

The Routing: At this point, we trace the route using a Switch based on the Target.

  • Case 'greeter': Call Agent.Greeter.
  • Case 'menu_expert': Call Agent.MenuExpert.

Important: We are making synchronous calls here (disable the Async flag). Why? Because we are not asking the Agent to answer the user yet. We are requesting the Agent Operation to return its System Prompt (its personality).

Remember to save this result in context.CurrentSystemPrompt.

The Synapse: Finally, after the route is determined and we have the correct prompt, we can call the LLM.

  • Request.Content: context.CurrentSystemPrompt (The Rules)
  • Request.UserContent: The actual text from the user.

With this simple flow — Router -> Get Persona -> Call LLM — we can run our first test.

I sent a "Hello" to the process, and...

Et voilà! The Greeter responded with a warm, professional welcome, exactly as configured. It is alive!

Giving Agents Hands: Tools and the ReAct Loop

Let’s get back to our crew. We left off with the Greeter, who is charming but, frankly, a bit useless when it comes to the actual food.

Enter the MenuExpert.

The primary distinction between this agent and the Greeter is that the MenuExpert does not merely rely on its training data (which hallucinates prices). It requires access to real-time data. It needs a Tool.

To keep things simple for this example, I created a standard Business Operation called Tool.GetMenu.
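
As an illustration, a minimal version of that operation could look like the sketch below. The class name, message types, and menu contents are placeholders; a real implementation would read the menu from a table:

/// Illustrative sketch: a tool operation that returns the menu as JSON
Class dc.samples.sketch.GetMenu Extends Ens.BusinessOperation
{

Method OnMessage(pRequest As Ens.StringRequest, Output pResponse As Ens.StringResponse) As %Status
{
    Set tSC = $$$OK
    Try {
        Set pResponse = ##class(Ens.StringResponse).%New()
        // Hard-coded menu for the demo; swap in an SQL query for real data
        Set menu = {"Coq au Vin": 28.00, "French Onion Soup": 12.50, "Ratatouille": 19.00}
        Set pResponse.StringValue = menu.%ToJSON()
    } Catch ex {
        Set tSC = ex.AsStatus()
    }
    Return tSC
}

}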

At this stage, in the MenuExpert configuration settings, under Tools, we should paste the function definition that follows the standard OpenAI JSON schema:

[{
    "type": "function",
    "function": {
        "name": "get_menu",
        "description": "Get the full bistro menu",
        "parameters": {
            "type": "object",
            "properties": {},
            "required": []
        }
    }
}]

It acts as the API documentation for the brain. We are basically telling the LLM: "If you need the menu, there is a function called get_menu that takes no arguments. Use it."
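
The 'handoff_to_agent' tool mentioned in GetAgentInstructions presumably follows the same schema. Here is a plausible sketch with an assumed agent_name parameter (the real definition belongs to the Orchestrator we build in Part 3):

[{
    "type": "function",
    "function": {
        "name": "handoff_to_agent",
        "description": "Transfer the conversation to another agent",
        "parameters": {
            "type": "object",
            "properties": {
                "agent_name": {
                    "type": "string",
                    "description": "Exact name of the target agent, taken from the allowed list"
                }
            },
            "required": ["agent_name"]
        }
    }
}]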

A Quick Break: The ReAct Paradigm

Prior to implementing the wiring, it is worth understanding the theory behind it. The entire system hinges on a framework known as ReAct (short for Reason plus Act). Introduced in a 2022 paper, it fundamentally changed the way we build AI agents.

Before ReAct, LLMs were exclusively text completion engines. With ReAct, we force the model to alternate between verbal reasoning (Thinking) and actions (Tools). In a nutshell, it looks similar to the following internal monologue:

Thought: The user wants the price of Coq au Vin. I don't know it.
Action: get_menu()
Observation: {"Coq au Vin": 28.00}
Thought: I have the price. I can answer now.
Final Answer: "The Coq au Vin costs $28.00."

Think of Memory as "What I remember" (context history) and ReAct as "How I solve problems" (the loop of reasoning and acting). Without ReAct, the agent is just a passive observer. With it, it becomes an active problem solver.

The Missing Piece: The Nervous System

We have covered considerable ground. We have a secure connection to the LLM (Part 1). We have also defined our Agents with strict Roles, Goals, and Tools, and explored the ReAct theory that drives their reasoning (Part 2).

However, after reviewing our code, we have identified a problem: It is all static.

We have the definitions of the MenuExpert and the get_menu tool, but nothing connects them. There is no loop to catch the tool call, execute the SQL, and feed the result back to the brain. There is also no mechanism to handle the Handoff when the agent says, "I need help."

We have the talent (the Actors) and the instructions (the Script), but there is no Stage for them to perform on yet.

In the final Part 3, we will construct the Nervous System. We will do the following:

  1. Implement the Orchestrator using InterSystems BPL.
  2. Build the Double Loop Architecture to manage autonomous lifecycles.
  3. Execute the tools and handle the "Handoff" signal dynamically.
  4. Utilize Visual Tracing to watch our agents thinking in real time.

Get your coffee ready since the next part will bring it all to life!
