Article
· Oct 9 1m read

Tutorial: Build a RAG AI Chatbot That Gives Users Up-to-Date, Accurate Answers Using IRIS Vector Search

Hello, developers!

In this article, I'd like to introduce a new tutorial that has been added to the Developer Hub: RAG using InterSystems IRIS Vector Search. (No setup required; you can try it with just a browser!)

This tutorial lets you experience how vector search and Retrieval Augmented Generation (RAG) can improve the accuracy of generative AI applications.

Specifically, you will use the provided sample code and the vector search capabilities of InterSystems IRIS to build a knowledge base for a generative AI chatbot.

You will also run a chatbot built with Streamlit and watch how the generative AI's answers change as you add information to the knowledge base.

No account creation or login is required; you can get started just by clicking the button 👍

You can also reach the tutorial from the Developer Community Resources page!

Please give it a try!

Announcement
· Oct 9

Security & AI Meetup for Developers and Startups

Join our next in-person Developer Meetup in Boston to explore Security & AI for Developers and Startups.

This event is hosted at CIC Venture Cafe.

Talk 1: When Prompts Become Payloads
Speaker: Mark-David McLaughlin, Director, Corporate Security, InterSystems

Talk 2: Serial Offenses: Common Vulnerability Types
Speaker: Jonathan Sue-Ho, Senior Security Engineer, InterSystems

>> Register here
 

⏱ Day and Time: October 21, 5:30 p.m. to 7:30 p.m.
📍CIC Venture Café in Cambridge, MA

Save your seat now!

Food, beverages, and networking opportunities will be provided as always.
Join our Discord channel to connect with developers from the InterSystems developer ecosystem.

Article
· Oct 9 6m read

Enhancing FHIR Data Exploration with Local LLMs: Integrating IRIS and Ollama

Introduction

In my previous article, I introduced the FHIR Data Explorer, a proof-of-concept application that connects InterSystems IRIS, Python, and Ollama to enable semantic search and visualization over healthcare data in FHIR format. The project is currently participating in the InterSystems External Language Contest.

In this follow-up, we’ll see how I integrated Ollama for generating patient history summaries directly from structured FHIR data stored in IRIS, using lightweight large language models (LLMs) running locally, such as Llama 3.2:1B or Gemma 2:2B.

The goal was to build a completely local AI pipeline that can extract, format, and narrate patient histories while keeping data private and under full control.

All patient data used in this demo comes from FHIR bundles, which were parsed and loaded into IRIS via the IRIStool module. This approach makes it straightforward to query, transform, and vectorize healthcare data using familiar pandas operations in Python. If you’re curious about how I built this integration, check out my previous article Building a FHIR Vector Repository with InterSystems IRIS and Python through the IRIStool module.

Both IRIStool and FHIR Data Explorer are available on the InterSystems Open Exchange — and part of my contest submissions. If you find them useful, please consider voting for them!

1. Setup with Docker Compose

To make the setup simple and reproducible, everything runs locally via Docker Compose.
A minimal configuration looks like this:

services:
  iris:
    container_name: iris-patient-search
    build:
      context: .
      dockerfile: Dockerfile
    image: iris-patient-search:latest  
    init: true
    restart: unless-stopped
    volumes:
      - ./storage:/durable
    ports:
      - "9092:52773"  # Management Portal / REST APIs
      - "9091:1972"   # SuperServer port
    environment:
      - ISC_DATA_DIRECTORY=/durable/iris
    entrypoint: ["/opt/irisapp/entrypoint.sh"]

  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    pull_policy: always
    tty: true
    restart: unless-stopped
    ports:
      - 11424:11434
    volumes:
      - ./ollama_entrypoint.sh:/entrypoint.sh
    entrypoint: ["/entrypoint.sh"]

You can find all the configurations on the GitHub project page.
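Once both containers are running, it can help to verify that Ollama is reachable on the remapped host port before wiring the rest together. A minimal sketch in Python, assuming the port mapping above (host port 11424) and Ollama's standard /api/tags endpoint, which lists the locally available models:

import requests

# Health check against the Ollama container (host port 11424 per the compose file)
try:
    resp = requests.get("http://localhost:11424/api/tags", timeout=5)
    resp.raise_for_status()
    models = [m["name"] for m in resp.json().get("models", [])]
    print("Ollama is up; models available:", models)
except requests.RequestException as exc:
    print("Ollama is not reachable yet:", exc)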

2. Integrating Ollama into the Workflow

Ollama provides a simple local REST API for running models efficiently on CPU, which makes it ideal for healthcare applications where privacy and performance matter.

To connect IRIS and Streamlit to Ollama, I implemented a lightweight Python class for streaming responses from the Ollama API:

import requests, json

class ollama_request:
    """Minimal client that streams chat completions from the Ollama REST API."""

    def __init__(self, api_url: str):
        self.api_url = api_url

    def get_response(self, content, model):
        # Ollama's chat endpoint streams one JSON object per line
        payload = {
            "model": model,
            "messages": [
                {"role": "user", "content": content}
            ]
        }
        response = requests.post(self.api_url, json=payload, stream=True)

        if response.status_code == 200:
            for line in response.iter_lines(decode_unicode=True):
                if line:
                    try:
                        json_data = json.loads(line)
                        # Each chunk carries a fragment of the reply under message.content
                        if "message" in json_data and "content" in json_data["message"]:
                            yield json_data["message"]["content"]
                    except json.JSONDecodeError:
                        yield f"Error decoding JSON line: {line}"
        else:
            yield f"Error: {response.status_code} - {response.text}"

This allows real-time streaming of model output, giving users the feeling of “watching” the AI write clinical summaries live in the Streamlit UI.
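Using the class looks like this (a sketch; the URL assumes Ollama's /api/chat endpoint behind the host port mapped in the compose file above, and the model must already be pulled):

# Point the client at the Ollama chat endpoint
client = ollama_request("http://localhost:11424/api/chat")

# Consume the generator: each chunk is a fragment of the model's reply
for chunk in client.get_response("Summarize: 45-year-old male with hypertension.", model="gemma2:2b"):
    print(chunk, end="", flush=True)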

3. Preparing Patient Data for the LLM

Before sending anything to Ollama, data must be compact, structured, and clinically relevant.
For this, I wrote a class that extracts and formats the patient’s most relevant data — demographics, conditions, observations, procedures, and so on — into YAML, which is both readable and LLM-friendly.

Here’s the simplified process:

  1. Select the patient row from IRIS via pandas
  2. Extract demographics and convert them into YAML
  3. Process each medical table (Conditions, Observations, etc.)
  4. Remove unnecessary or redundant fields
  5. Output a concise YAML document used as the LLM prompt context.

This string is then passed directly to the LLM prompt, forming the structured context from which the model generates the patient’s narrative summary.
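The real class lives in the project repository, but its core idea can be sketched in a few lines. This is a simplified illustration rather than the actual implementation; the function, column, and category names here are hypothetical:

import yaml
import pandas as pd

def build_patient_context(patient: dict, tables: dict) -> str:
    """Condense demographics plus per-category medical tables into YAML.

    'patient' holds demographics as a plain dict; 'tables' maps category
    names (e.g. 'Conditions', 'Observations') to pandas DataFrames.
    """
    context = {"demographics": patient}
    for name, df in tables.items():
        # Drop fields the LLM does not need, then normalize datatypes so
        # timestamps and nulls serialize cleanly into YAML records
        trimmed = df.drop(columns=["internal_id"], errors="ignore")
        context[name] = trimmed.astype(str).to_dict(orient="records")
    return yaml.safe_dump(context, sort_keys=False)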


4. Why Limit the Number of Records?

While building this feature, I noticed that passing all medical records often led small LLMs to become confused or biased toward older entries, losing focus on recent events.

To mitigate this, I decided to:

  • Include only a limited number of records per category, in reverse chronological order (most recent first)
  • Use concise YAML formatting instead of raw JSON
  • Normalize datatypes (timestamps, nulls, etc.) for consistency

This design helps small LLMs focus on the most clinically relevant data, avoiding “prompt overload”.
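In pandas terms, the limiting step boils down to something like the following sketch; the 'date' column, the toy category table, and the cutoff are illustrative:

import pandas as pd

# Toy 'Conditions' table (column names are illustrative)
conditions_df = pd.DataFrame({
    "condition": ["Hypertension", "Type 2 diabetes", "Influenza"],
    "date": pd.to_datetime(["2015-06-01", "2019-02-10", "2024-01-05"]),
})

N = 2  # cap per category
# Reverse chronological order, then keep only the N most recent records
recent = conditions_df.sort_values("date", ascending=False).head(N)
print(recent)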


💬 5. Generating the Patient History Summary

Once the YAML-formatted data is ready, the Streamlit app sends it to Ollama with a simple prompt like:

“You are a clinical assistant. Given the following patient data, write a concise summary of their medical history, highlighting relevant conditions and recent trends.”

The output is streamed back to the UI line by line, allowing the user to watch the summary being written in real time.
Each model produces a slightly different result, even with the same prompt — revealing fascinating differences in reasoning and style.
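In the app, this is a thin wrapper around the streaming class from section 2. A minimal sketch, assuming the ollama_request class defined above and a patient_yaml string produced by the formatting step in section 3:

import streamlit as st

PROMPT = (
    "You are a clinical assistant. Given the following patient data, "
    "write a concise summary of their medical history, highlighting "
    "relevant conditions and recent trends.\n\n"
)

# ollama_request is the streaming client from section 2; patient_yaml is the
# YAML context built in section 3 (both assumed to be in scope here)
client = ollama_request("http://localhost:11424/api/chat")

# st.write_stream consumes the generator and renders tokens as they arrive
summary = st.write_stream(client.get_response(PROMPT + patient_yaml, model="gemma2:2b"))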


🧠 6. Comparing Local LLMs

To evaluate the effectiveness of this approach, I tested three lightweight open models available through Ollama:

Model         Parameters  Summary Style          Notes
Llama 3.2:1B  1B          Structured, factual    Highly literal and schema-like output
Gemma 2:2B    2B          Narrative, human-like  Most coherent and contextually aware
Gemma 3:1B    1B          Concise, summarizing   Occasionally omits details but very readable

You can find example outputs in this GitHub folder. Each patient summary highlights how model size and training style influence the structure, coherence, and detail level of the narrative.

Here’s a comparative interpretation of their behavior:

  • Llama 3.2:1B tends to reproduce the data structure verbatim, almost as if performing a database export. Its summaries are technically accurate but lack narrative flow — resembling a structured clinical report rather than natural text.
  • Gemma 3:1B achieves better linguistic flow but still compresses or omits minor details. 
  • Gemma 2:2B strikes the best balance. It organizes information into meaningful sections (conditions, risk factors, care recommendations) while maintaining a fluent tone.

In short:

  • Llama 3.2:1B = factual precision
  • Gemma 3:1B = concise summaries
  • Gemma 2:2B = clinical storytelling

Even without fine-tuning, thoughtful data curation and prompt design make small, local LLMs capable of producing coherent, contextually relevant clinical narratives.


🔒 7. Why Local Models Matter

Using Ollama locally provides:

  • Full data control — no patient data ever leaves the environment
  • Predictable performance — stable latency on CPU
  • Lightweight deployment — works even without GPU
  • Modular design — easy to switch between models or adjust prompts

This makes it an ideal setup for hospitals, research centers, or academic environments that want to experiment safely with AI-assisted documentation and summarization.


🧭 Conclusion

This integration demonstrates that even small local models, when properly guided by structured data and clear prompts, can yield useful, human-like summaries of patient histories.

With IRIS managing data, Python handling transformations, and Ollama generating text, we get a fully local, privacy-first AI pipeline for clinical insight generation.

Article
· Oct 9 4m read

Building a FHIR Vector Repository with InterSystems IRIS and Python through the IRIStool module

Introduction

In a previous article, I presented the IRIStool module, which seamlessly integrates the pandas Python library with the IRIS database. Now, I'll explain how we can use IRIStool to leverage InterSystems IRIS as a foundation for intelligent, semantic search over healthcare data in FHIR format.

This article covers what I did to create the database for another of my projects, the FHIR Data Explorer. Both projects are candidates in the current InterSystems contest, so please vote for them if you find them useful.

You can find them on the Open Exchange.

In this article we'll cover:

  • Connecting to InterSystems IRIS database through Python
  • Creating a FHIR-ready database schema
  • Importing FHIR data with vector embeddings for semantic search

Prerequisites

Install IRIStool from the IRIStool and Data Manager GitHub page.

1. IRIS Connection Setup

Start by configuring your connection through environment variables in a .env file:

IRIS_HOST=localhost
IRIS_PORT=9092
IRIS_NAMESPACE=USER
IRIS_USER=_SYSTEM
IRIS_PASSWORD=SYS

Connect to IRIS using IRIStool's context manager:

from utils.iristool import IRIStool
import os
from dotenv import load_dotenv

load_dotenv()

with IRIStool(
    host=os.getenv('IRIS_HOST'),
    port=os.getenv('IRIS_PORT'),
    namespace=os.getenv('IRIS_NAMESPACE'),
    username=os.getenv('IRIS_USER'),
    password=os.getenv('IRIS_PASSWORD')
) as iris:
    # IRIStool manages the connection automatically
    pass

2. Creating the FHIR Schema

First, create a table to store the raw FHIR data. Then, while extracting data from the FHIR bundles, create tables with vector search capabilities for each of the extracted FHIR resources (such as Patient, Observation, etc.).

IRIStool simplifies table and index creation!

FHIR Repository Table

# Create main repository table for raw FHIR bundles
if not iris.table_exists("FHIRrepository", "SQLUser"):
    iris.create_table(
        table_name="FHIRrepository",
        columns={
            "patient_id": "VARCHAR(200)",
            "fhir_bundle": "CLOB"
        }
    )
    iris.quick_create_index(
        table_name="FHIRrepository",
        column_name="patient_id"
    )

Patient Table with Vector Support

# Create Patient table with vector column for semantic search
if not iris.table_exists("Patient", "SQLUser"):
    iris.create_table(
        table_name="Patient",
        columns={
            "patient_row_id": "INT AUTO_INCREMENT PRIMARY KEY",
            "patient_id": "VARCHAR(200)",
            "description": "CLOB",
            "description_vector": "VECTOR(FLOAT, 384)",
            "full_name": "VARCHAR(200)",
            "gender": "VARCHAR(30)",
            "age": "INTEGER",
            "birthdate": "TIMESTAMP"
        }
    )
    
    # Create standard indexes
    iris.quick_create_index(table_name="Patient", column_name="patient_id")
    iris.quick_create_index(table_name="Patient", column_name="age")
    
    # Create HNSW vector index for similarity search
    iris.create_hnsw_index(
        index_name="patient_vector_idx",
        table_name="Patient",
        column_name="description_vector",
        distance="Cosine"
    )

3. Importing FHIR Data with Vectors

Generate vector embeddings from FHIR patient descriptions and insert them into IRIS easily:

from sentence_transformers import SentenceTransformer

# Initialize transformer model
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

# Example: Process patient data
patient_description = "45-year-old male with hypertension and type 2 diabetes"
patient_id = "patient-123"
# Generate vector embedding
vector = model.encode(patient_description, normalize_embeddings=True).tolist()

# Insert patient data with vector
iris.insert(
    table_name="Patient",
    patient_id=patient_id,
    description=patient_description,
    description_vector=str(vector),
    full_name="John Doe",
    gender="male",
    age=45,
    birthdate="1979-03-15"
)

4. Performing Semantic Search

Once your data is loaded, you can perform similarity searches:

# Search query
search_text = "patients with diabetes"
query_vector = model.encode(search_text, normalize_embeddings=True).tolist()

# define sql query
query = """
        SELECT TOP 5
            patient_id,
            full_name,
            description,
            VECTOR_COSINE(description_vector, TO_VECTOR(?)) as similarity
        FROM Patient
        ORDER BY similarity DESC
    """
# define query parameters
parameters = [str(query_vector)]

# Find similar patients using vector search
results = iris.query(query, parameters)

# print each matching patient with its similarity score
if not results.empty:
    for _, row in results.iterrows():
        print(f"{row['full_name']}: {row['similarity']:.3f}")

Conclusion

  • IRIStool simplifies IRIS integration with intuitive Python methods for table and index creation
  • IRIS supports hybrid SQL + vector storage natively, enabling both traditional queries and semantic search (see the sketch just after this list)
  • Vector embeddings enable intelligent search across FHIR healthcare data using natural language
  • HNSW indexes provide efficient similarity search at scale
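As an illustration of the hybrid SQL + vector point above, here is a sketch that combines an ordinary relational filter with vector similarity in a single query. It reuses the iris connection and the sentence-transformer model from the earlier sections; the age filter is arbitrary:

# Hybrid query: relational filter (age) plus vector similarity in one pass
search_text = "patients with diabetes"
query_vector = model.encode(search_text, normalize_embeddings=True).tolist()

hybrid_query = """
    SELECT TOP 5
        patient_id,
        full_name,
        age,
        VECTOR_COSINE(description_vector, TO_VECTOR(?)) AS similarity
    FROM Patient
    WHERE age > 40
    ORDER BY similarity DESC
"""

# iris.query returns a pandas DataFrame, as in the search example above
results = iris.query(hybrid_query, [str(query_vector)])
print(results)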

This approach demonstrates how InterSystems IRIS can serve as a powerful foundation for building intelligent healthcare applications with semantic search capabilities over FHIR data.

Announcement
· Oct 9

Developer Community AI Is Taking a Break

Hi Community,

It seems our Developer Community AI has decided to take a coffee break ☕️ (probably after answering one too many tricky ObjectScript questions).

The importance of the coffee break

For now, it’s gone mysteriously silent and refuses to generate answers. We suspect it might be rethinking its life choices after reading one too many deeply philosophical ObjectScript questions.

We’ll let you know as soon as our digital colleague is back online, refreshed, and ready to assist you again. In the meantime, if you notice it suddenly waking up and replying again, please do let us know in the comments before it changes its mind.

Thanks for your patience, and remember: even AI needs a break sometimes. 😄
