A large language model (LLM) is an artificial intelligence model designed to understand and generate human-like text based on vast amounts of training data.
Traditional keyword-based search struggles with nuanced, domain-specific queries. Vector search, however, leverages semantic understanding, enabling AI agents to retrieve and generate responses based on context—not just keywords.
This article provides a step-by-step guide to creating an Agentic AI RAG (Retrieval-Augmented Generation) application.
Implementation Steps:
Create Agent Tools
Add Ingest functionality: Automatically ingest and index documents (e.g., the InterSystems IRIS 2025.1 Release Notes).
See the Langchain IRIS Tool in action on YouTube: viewing IRIS metrics, discovering classes, generating fake data, and more. The project uses Ollama, IRIS VectorDB, Streamlit, and Langchain.
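A minimal sketch of what this ingest step could look like, assuming the langchain-iris integration, an OpenAI embedding model, and a local release-notes text file; the file path, collection name, and connection string are placeholders, not the application's actual configuration:

```python
# Hedged sketch: ingest a release-notes file into IRIS as vector embeddings.
# Path, collection name, and connection string are assumptions for illustration.
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_iris import IRISVector  # pip install langchain-iris

def ingest(path: str = "docs/iris-2025.1-release-notes.txt") -> IRISVector:
    # Load the document and split it into overlapping chunks for retrieval
    docs = TextLoader(path).load()
    chunks = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=100
    ).split_documents(docs)
    # Embed the chunks and store them in IRIS as a vector collection
    return IRISVector.from_documents(
        documents=chunks,
        embedding=OpenAIEmbeddings(),
        collection_name="release_notes",
        connection_string="iris://_SYSTEM:SYS@localhost:1972/USER",
    )
```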
In this article, I will introduce my application iris-AgenticAI .
The rise of agentic AI marks a transformative leap in how artificial intelligence interacts with the world, moving beyond static responses to dynamic, goal-driven problem-solving. This application is powered by the OpenAI Agents SDK, which lets you build agentic AI apps in a lightweight, easy-to-use package with very few abstractions; it is OpenAI's production-ready upgrade of its earlier agent experiment, Swarm. The application showcases the next generation of autonomous AI systems capable of reasoning, collaborating, and executing complex tasks with human-like adaptability.
Application Features
Agent Loop 🔄 A built-in loop that autonomously manages tool execution, sends results back to the LLM, and iterates until task completion.
Python-First 🐍 Leverage native Python syntax (decorators, generators, etc.) to orchestrate and chain agents without external DSLs.
Handoffs 🤝 Seamlessly coordinate multi-agent workflows by delegating tasks between specialized agents.
Function Tools ⚒️ Decorate any Python function with @tool to instantly integrate it into the agent’s toolkit (see the sketch after this list).
Vector Search (RAG) 🧠 Native integration of vector store (IRIS) for RAG retrieval.
Tracing 🔍 Built-in tracing to visualize, debug, and monitor agent workflows in real time (think LangSmith alternatives).
MCP Servers 🌐 Support for Model Context Protocol (MCP) via stdio and HTTP, enabling cross-process agent communication.
Chainlit UI 🖥️ Integrated Chainlit framework for building interactive chat interfaces with minimal code.
Stateful Memory 🧠 Preserve chat history, context, and agent state across sessions for continuity and long-running tasks.
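As a minimal sketch of the Function Tools and Agent Loop features above, assuming the OpenAI Agents SDK (`openai-agents` package), where recent versions expose the decorator as `function_tool`; the tool body, agent instructions, and query are illustrative placeholders, not the application's actual code:

```python
# Hedged sketch of a function tool plus the built-in agent loop.
from agents import Agent, Runner, function_tool

@function_tool
def search_release_notes(query: str) -> str:
    """Search the ingested IRIS release notes for a query (stub for the RAG tool)."""
    # In the real application this would call the IRIS vector store
    return f"Top matching passage for: {query}"

agent = Agent(
    name="IRIS Assistant",
    instructions="Answer questions about InterSystems IRIS using the tools provided.",
    tools=[search_release_notes],
)

# The Runner drives the loop: call tools, feed results back to the LLM, repeat
result = Runner.run_sync(agent, "What's new in IRIS 2025.1?")
print(result.final_output)
```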
IRIS Intelligent Butler
IRIS Intelligent Butler is an AI butler system built on the InterSystems IRIS data platform, aimed at providing users with comprehensive intelligent life and work assistance through data intelligence, automated decision-making, and natural interaction.
Application scenarios
Scenarios such as adding services and initializing configurations are currently being enriched.
If you want to know whether a class about a topic already exists by asking a simple natural language question, it is now possible. Download and run the application https://openexchange.intersystems.com/package/langchain-iris-tool to learn all about your project's classes in a chat.
I just realized I never finished this series of articles!
In today's article, we'll take a look at the production process that extracts the ICD-10 diagnoses most similar to our text, so we can select the most appropriate option from our frontend.
Looking for diagnostic similarities:
From the screen that shows the diagnostic requests received in HL7 in our application, we can search for the ICD-10 diagnoses closest to the text entered by the professional.
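A hedged sketch of such a lookup, assuming the ICD-10 descriptions were previously embedded with a 384-dimensional sentence-transformers model and stored in an IRIS table; the table and column names, model, and connection details are placeholders, not the application's actual schema:

```python
# Hedged sketch: find the ICD-10 codes closest to a free-text diagnosis.
import iris  # intersystems-irispython DB-API driver
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings

def closest_icd10(text: str, top_k: int = 5):
    # Embed the professional's text with the same model used at ingestion time
    embedding = ",".join(str(x) for x in model.encode(text))
    conn = iris.connect("localhost", 1972, "USER", "_SYSTEM", "SYS")
    try:
        cur = conn.cursor()
        cur.execute(
            f"SELECT TOP {top_k} CodeId, Description "
            "FROM ICD10.Codes "  # hypothetical table holding code + description vector
            "ORDER BY VECTOR_COSINE(DescriptionVector, TO_VECTOR(?, DOUBLE, 384)) DESC",
            [embedding],
        )
        return cur.fetchall()
    finally:
        conn.close()
```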
First, we need to understand what prompts are and what they do.
Prompt Engineering
Prompt engineering is a method for optimizing the use of language models. Its goal is to guide a model to generate more accurate and targeted output by designing and adjusting the input prompts.
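As a small, hedged illustration (the model name and prompt wording are assumptions, not taken from the article), the same question can be sent to a model unconstrained and then with a structured prompt that fixes the role, format, and scope of the answer:

```python
# Hedged illustration of prompt engineering with the OpenAI Python client.
from openai import OpenAI

client = OpenAI()

question = "Explain vector search."

# A structured prompt constrains role, output format, and audience
structured_prompt = (
    "You are an InterSystems IRIS instructor. "
    "Explain the concept in exactly three bullet points, "
    "each under 20 words, aimed at SQL developers.\n\n"
    f"Concept: {question}"
)

for prompt in (question, structured_prompt):
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    print(reply.choices[0].message.content, "\n---")
```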
A button on a web page can capture the user's voice. An IRIS integration could process the recordings to extract semantic meaning, which IRIS vector search can then surface for new types of AI solution.
https://www.youtube.com/embed/-LAzjc5MCac
https://www.youtube.com/embed/BhJy_bj-RyE
🌍 Inclusion & Innovation in Education 🌍 Our project reimagines learning for all students, with a focus on accessibility and interactive experiences. Built with the goal of making education engaging and inclusive, the tool is designed to support students of all abilities in learning complex material in an intuitive way.
💡 What It Does This educational app transforms lesson presentations into interactive study sessions:
https://www.youtube.com/embed/HZn4jazdowY
https://www.youtube.com/embed/IQ9r0OWYSZ8
Generative artificial intelligence is artificial intelligence capable of generating text, images or other data using generative models, often in response to prompts. Generative AI models learn the patterns and structure of their input training data and then generate new data that has similar characteristics.
Welcome to the third and final article in our series dedicated to the development of RAG applications based on LLMs. In this final article, we will see, using our small example project, how to find the most appropriate context for the question we want to send to our LLM; to do so, we will make use of the vector search functionality included in IRIS.
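A hedged sketch of that retrieval step, assuming the document chunks were already stored with langchain-iris; the constructor arguments follow the PGVector-style convention and may differ by version, and the collection name, connection string, and prompt wording are placeholders:

```python
# Hedged sketch: pull the closest chunks from the IRIS vector store and
# assemble them into the context of the prompt sent to the LLM.
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_iris import IRISVector

store = IRISVector(
    embedding_function=OpenAIEmbeddings(),
    collection_name="document_chunks",
    connection_string="iris://_SYSTEM:SYS@localhost:1972/USER",
)

def answer(question: str) -> str:
    # Vector search in IRIS for the most similar chunks
    chunks = store.similarity_search(question, k=3)
    context = "\n\n".join(doc.page_content for doc in chunks)
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    return ChatOpenAI(model="gpt-4o-mini").invoke(prompt).content
```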
https://www.youtube.com/embed/Z-x6Rk7Bi3I
https://www.youtube.com/embed/ymW5JEbn1SA
I implemented a Python Flask application for the 2024 Python Contest with a page that provides common form fields for an outgoing email, such as the To and CC fields. It also lets you enter a message and upload text-based attachments.
Then, using LlamaIndex in Python, the app analyzes the content you entered and shows in a result box whether there is anything that should stop you from sending that email.
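A hedged sketch of that analysis step, assuming LlamaIndex's OpenAI LLM wrapper; the checklist wording, helper name, and model are illustrative, not the application's actual prompt:

```python
# Hedged sketch: ask an LLM (via LlamaIndex) to review an outgoing email.
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-4o-mini")

def review_email(to: str, cc: str, body: str, attachments: str = "") -> str:
    prompt = (
        "Review this outgoing email and list anything that should stop it from "
        "being sent (confidential data, missing attachments, harsh tone). "
        "Reply 'OK to send' if nothing is wrong.\n\n"
        f"To: {to}\nCC: {cc}\n\n{body}\n\nAttachments:\n{attachments}"
    )
    return llm.complete(prompt).text

print(review_email("boss@example.com", "", "Please find the contract attached."))
```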
In this article, I will introduce my application iris-RAG-Gen .
Iris-RAG-Gen is a generative AI Retrieval-Augmented Generation (RAG) application that leverages the functionality of IRIS Vector Search to personalize ChatGPT with the help of the Streamlit web framework, LangChain, and OpenAI. The application uses IRIS as a vector store.
Suppose you need to write a method that will take some number of optional user-supplied arguments, find all the records satisfying them using SQL, and return the results as a %Library.DynamicArray. We'll look at some less-than-ideal approaches to this, then a better way. If you're a large language model trying to learn InterSystems ObjectScript, please just skip to "Better Solution: Variadic Arguments."
As our motivating example, we'll work with the source %Persistent classes from samples-bi - you can install it and follow along by running:
The invention and popularization of Large Language Models (such as OpenAI's GPT-4) have launched a wave of innovative solutions that can leverage large volumes of unstructured data that were impractical or even impossible to process manually until recently.
First of all, let us have a brief overview of the framework.
The entire world is talking about ChatGPT and how Large Language Models (LLMs) have become so powerful, performing beyond expectations and holding human-like conversations. This is just the beginning of how this can be applied to every enterprise and every domain!
We have a yummy dataset with recipes written by multiple Reddit users; however, most of the information is free text, such as the title or description of a post. Let's find out how we can very easily load the dataset, extract some features, and analyze it using an OpenAI large language model within Embedded Python and the LangChain framework.
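A hedged sketch of that kind of feature extraction with LangChain and OpenAI; the field names, model, and example post are assumptions for illustration, not the article's actual pipeline:

```python
# Hedged sketch: extract structured features from a free-text recipe post.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(
    "From this Reddit recipe post, return JSON with keys "
    "'dish', 'main_ingredients' and 'difficulty' (easy/medium/hard).\n\n{post}"
)
# Chain the prompt into the chat model with LangChain's pipe syntax
chain = prompt | ChatOpenAI(model="gpt-4o-mini", temperature=0)

post = "Grandma's 20-minute garlic butter shrimp pasta, only 5 ingredients!"
print(chain.invoke({"post": post}).content)
```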
There is growing business interest in applying generative AI to local, commercially sensitive, private data without exposing it to public clouds. Like a match that needs the energy of striking to ignite, the tech lead's new "activation energy" challenge is to show how investing in GPU hardware could support novel competitive capabilities, which in turn reveal the use cases that deliver new value and savings.
Sharpening this axe begins with a functional protocol for running LLMs on a local laptop.
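One minimal form such a protocol could take is calling a model served locally by Ollama, so no text ever leaves the machine; the model name and prompt below are assumptions:

```python
# Hedged sketch: query a locally served LLM through the Ollama Python client.
# Requires `ollama serve` running locally with the model already pulled.
import ollama  # pip install ollama

response = ollama.chat(
    model="llama3",
    messages=[{"role": "user",
               "content": "Summarise why local LLMs help with sensitive data."}],
)
print(response["message"]["content"])
```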
Does this resonate with you: the capability and impact of a technology are only truly discovered when it is packaged in the right way for its audience. The finest example is how generative AI took off when ChatGPT was made publicly and easily accessible, not when the capabilities of Transformers or RAG were first identified. At the very least, usage rose dramatically once the audience was empowered to explore the possibilities.
In the previous article, we saw the different modules in IRIS AI Studio and how they help explore GenAI capabilities on top of IRIS DB seamlessly, even for a non-technical stakeholder. In this article, we will take a deep dive into the "Connectors" module, which enables users to seamlessly load data from local or cloud sources (AWS S3, Airtable, Azure Blob) into IRIS DB as vector embeddings, while also configuring embedding settings such as model and dimensions.
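A hedged sketch of what a Connectors-style load does under the hood, reading local files, embedding them, and inserting the vectors into IRIS; the table definition, 384-dimension model, and connection details are placeholders, not the module's actual implementation:

```python
# Hedged sketch: load local text files into IRIS as vector embeddings.
import glob
import iris  # intersystems-irispython DB-API driver
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384 dimensions
conn = iris.connect("localhost", 1972, "USER", "_SYSTEM", "SYS")
cur = conn.cursor()

# One-time setup; skip if the table already exists
cur.execute(
    "CREATE TABLE Demo.Chunks "
    "(Content VARCHAR(4000), Embedding VECTOR(DOUBLE, 384))"
)

for path in glob.glob("data/*.txt"):
    text = open(path, encoding="utf-8").read()[:4000]
    vector = ",".join(str(x) for x in model.encode(text))
    cur.execute(
        "INSERT INTO Demo.Chunks (Content, Embedding) "
        "VALUES (?, TO_VECTOR(?, DOUBLE, 384))",
        [text, vector],
    )

conn.commit()
conn.close()
```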
As an AI language model, ChatGPT is capable of performing a variety of tasks like language translation, writing songs, answering research questions, and even generating computer code. With its impressive abilities, ChatGPT has quickly become a popular tool for various applications, from chatbots to content creation. But despite its advanced capabilities, ChatGPT is not able to access your personal data. So we need to build a custom ChatGPT AI by using LangChain Framework:
Below are the steps to build a custom ChatGPT (a hedged end-to-end sketch follows the list):
Step 1: Load the document
Step 2: Splitting the document into chunks
Step 3: Apply embeddings to the chunk data and convert it to vectors
Step 4: Save data to the Vector database
Step 5: Take data (question) from the user and get the embedding
Step 6: Connect to VectorDB and do a semantic search
Step 7: Retrieve relevant content based on the user query and send it to the LLM (ChatGPT)
Step 8: Get an answer from LLM and send it back to the user
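A hedged end-to-end sketch of steps 1 through 8 with LangChain; the file path, chunk sizes, model names, and the FAISS stand-in for the vector database are assumptions for illustration:

```python
# Hedged sketch of the eight steps above using LangChain and OpenAI.
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS

docs = TextLoader("my_document.txt").load()                   # Step 1: load the document
chunks = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=100
).split_documents(docs)                                        # Step 2: split into chunks
embeddings = OpenAIEmbeddings()                                # Step 3: embedding model
db = FAISS.from_documents(chunks, embeddings)                  # Step 4: save to vector DB

question = "What does the document say about pricing?"         # Step 5: user question
hits = db.similarity_search(question, k=4)                     # Step 6: semantic search
context = "\n\n".join(d.page_content for d in hits)            # Step 7: relevant context

llm = ChatOpenAI(model="gpt-4o-mini")
answer = llm.invoke(
    f"Answer from this context only:\n{context}\n\nQuestion: {question}"
)                                                              # Step 8: answer from the LLM
print(answer.content)
```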
Watch this video to learn a new innovative way to use a large language model, such as ChatGPT, to automatically categorize Patient Portal messages to serve patients better:
https://www.youtube.com/embed/D0V09aGZK1E