How to Build an Agentic AI RAG Application: A Step-by-Step Guide |

Article

Muhammad Waseem · Apr 5 6m read

Open Exchange

#Artificial Intelligence (AI) #ChatGPT #Embedded Python #Large Language Model (LLM) #Vector Search #InterSystems IRIS for Health #Open Exchange

Hi Community,

Traditional keyword-based search struggles with nuanced, domain-specific queries. Vector search, however, leverages semantic understanding, enabling AI agents to retrieve and generate responses based on context—not just keywords.

This article provides a step-by-step guide to creating an Agentic AI RAG (Retrieval-Augmented Generation) application.

Implementation Steps:

Create Agent Tools
- Add Ingest functionality: Automatically ingests and index documents (e.g., InterSystems IRIS 2025.1 Release Notes).
- Implement Vector Search Functionality
Create Vector Search Agent
Handoff to Triage (Main Agent)
Run The Agent

1. Create Agent Tools

1.1 - Implement Document Ingestion: Automated ingestion and indexing of documents

The following code implements the ingestion tool:

    def ingestDoc(self):
        #Check if document is defined, by selecting from table
        #If not defined then INGEST document, Otherwise back
        embeddings = OpenAIEmbeddings()	
        #Load the document based on the fle type
        loader = TextLoader("/irisdev/app/docs/IRIS2025-1-Release-Notes.txt", encoding='utf-8')      
        
        documents = loader.load()        
        text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=0)
        
        texts = text_splitter.split_documents(documents)
                       
        #COLLECTION_NAME = "rag_document"
        db = IRISVector.from_documents(
            embedding=embeddings,
            documents=texts,
            collection_name = self.COLLECTION_NAME,
            connection_string=self.CONNECTION_STRING,
        )

        db = IRISVector.from_documents(embedding=embeddings,documents=texts, collection_name = self.COLLECTION_NAME, connection_string=self.CONNECTION_STRING,)

The Vector Search Agent automatically ingests and indexes the New in InterSystems IRIS 2025.1 from the specified repository folder) into the IRIS Vector Store, performing this operation only when the data isn't already present.

Run the below query to fetch the required data from the vector store

SELECT
id, embedding, document, metadata
FROM SQLUser.AgenticAIRAG

1.2 - Implement Vector Search Functionality

The following code provides the search capability for the Agent:

 def ragSearch(self,prompt):
        #Check if collections are defined or ingested done.
        # if not then call ingest method
        embeddings = OpenAIEmbeddings()	
        db2 = IRISVector (
            embedding_function=embeddings,    
            collection_name=self.COLLECTION_NAME,
            connection_string=self.CONNECTION_STRING,
        )
        docs_with_score = db2.similarity_search_with_score(prompt)
        relevant_docs = ["".join(str(doc.page_content)) + " " for doc, _ in docs_with_score]
        
        #Generate Template
        template = f"""
        Prompt: {prompt}
        Relevant Docuemnts: {relevant_docs}
        """
        return template

The Triage Agent handles incoming user queries and delegates them to the Vector Search Agent, which performs semantic search operations to retrieve the most relevant information.

2 - Create Vector Store Agent

The following code implements the vector_search_agent with:

Custom handoff_descriptions for agent coordination
Clear operational instructions
The iris_RAG_search Tool (leveraging irisRAG.py for document ingestion and vector search operations)

@function_tool
    @cl.step(name = "Vector Search Agent (RAG)", type="tool", show_input = False)
    async def iris_RAG_search():
            """Provide IRIS Release Notes details,IRIS 2025.1 Release Notes, IRIS Latest Release Notes, Release Notes"""
            if not ragOprRef.check_VS_Table():
                 #Ingest the document first
                 msg = cl.user_session.get("ragclmsg")
                 msg.content = "Ingesting Vector Data..."
                 await msg.update()
                 ragOprRef.ingestDoc()
            
            if ragOprRef.check_VS_Table():
                 msg = cl.user_session.get("ragclmsg")
                 msg.content = "Searching Vector Data..."
                 await msg.update()                 
                 return ragOprRef.ragSearch(cl.user_session.get("ragmsg"))   
            else:
                 return "Error while getting RAG data"
    vector_search_agent = Agent(
            name="RAGAgent",
            handoff_description="Specialist agent for Release Notes",
            instructions="You provide assistance with Release Notes. Explain important events and context clearly.",
            tools=[iris_RAG_search]
    )

3 - Handoff to Triage (Main Agent)

The following code implements the handoff protocol to route processed queries to the Triage Agent (main coordinator):

 triage_agent = Agent(
        name="Triage agent",
        instructions=(
            "Handoff to appropriate agent based on user query."
            "if they ask about Release Notes, handoff to the vector_search_agent."
            "If they ask about production, handoff to the production agent."
            "If they ask about dashboard, handoff to the dashboard agent."
            "If they ask about process, handoff to the processes agent."
            "use the WebSearchAgent tool to find information related to the user's query and do not use this agent is query is about Release Notes."
            "If they ask about order, handoff to the order_agent."
        ),
        handoffs=[vector_search_agent,production_agent,dashboard_agent,processes_agent,order_agent,web_search_agent]
    )

4 - Run The Agent

The following code:

Accepts user input
Invokes the triage_agent
Routes queries to the Vector_Search_Agent for processing

@cl.on_message
async def main(message: cl.Message):
    """Process incoming messages and generate responses."""
    # Send a thinking message
    msg = cl.Message(content="Thinking...")
    await msg.send()

    agent: Agent = cast(Agent, cl.user_session.get("agent"))
    config: RunConfig = cast(RunConfig, cl.user_session.get("config"))

    # Retrieve the chat history from the session.
    history = cl.user_session.get("chat_history") or []
    
    # Append the user's message to the history.
    history.append({"role": "user", "content": message.content})
    
    # Used by RAG agent
    cl.user_session.set("ragmsg", message.content)
    cl.user_session.set("ragclmsg", msg)

    try:
        print("\n[CALLING_AGENT_WITH_CONTEXT]\n", history, "\n")
        result = Runner.run_sync(agent, history, run_config=config)
               
        response_content = result.final_output
        
        # Update the thinking message with the actual response
        msg.content = response_content
        await msg.update()

        # Append the assistant's response to the history.
        history.append({"role": "developer", "content": response_content})
        # NOTE: Here we are appending the response to the history as a developer message.
        # This is a BUG in the agents library.
        # The expected behavior is to append the response to the history as an assistant message.
    
        # Update the session with the new history.
        cl.user_session.set("chat_history", history)
        
        # Optional: Log the interaction
        print(f"User: {message.content}")
        print(f"Assistant: {response_content}")
        
    except Exception as e:
        msg.content = f"Error: {str(e)}"
        await msg.update()
        print(f"Error: {str(e)}")

See It in Action:

For more details, please visit iris-AgenticAI open exchange application page.

Thanks