
From FHIR Events to Explainable Agentic AI: Building a Clinical Follow‑Up Demo with InterSystems IRIS for Health

10:47 AM — Jose Garcia's creatinine test results arrive at the hospital FHIR server.
2.1 mg/dL — a 35% increase from last month.

What happens next?

  • Most systems: ❌ The result sits in a queue until a clinician reviews it manually — hours or days later.
  • This system: 👍 An AI agent evaluates the trend, consults clinical guidelines, and generates evidence-based recommendations — in seconds, automatically.

No chatbot. No manual prompts. No black-box reasoning.

This is event-driven clinical decision support with full explainability:


  • Triggered automatically by FHIR events
  • Multi-agent reasoning (context, guidelines, recommendations)
  • Complete audit trail in SQL (every decision, every evidence source)
  • FHIR-native outputs (DiagnosticReport published to server)

Built with:
- InterSystems IRIS for Health — Orchestration, FHIR, persistence, vector search
- CrewAI — Multi-agent framework for structured reasoning

You'll learn: 🖋️ How to orchestrate agentic AI workflows within production-grade interoperability systems — and why explainability matters more than accuracy alone.

https://www.youtube.com/embed/43Vl7cU_uNY?si=o3NZ3AqPOdFkCn9w


🎬 What This Demo Produces

When Jose's abnormal creatinine observation arrives, the system automatically generates:

INPUT: FHIR Observation (creatinine 2.1 mg/dL, status: HIGH)

OUTPUT: FHIR DiagnosticReport containing:

  • Risk Level: Medium-High (confidence: 85%)
  • Recommendations:
    • ⚠️ Repeat creatinine in 7–14 days
    • 💊 Review nephrotoxic medications (currently on Ibuprofen)
    • 📊 Monitor renal function closely
  • Evidence Used:
    • Patient context: CKD Stage 3 + progressive creatinine rise (>30%)
    • Clinical guidelines: KDIGO section on AKI management in CKD
    • Lab trend analysis: 1.6 → 1.9 → 2.1 mg/dL over 3 months

AUDIT TRAIL: Every decision, recommendation, and evidence citation persisted in SQL tables for compliance and review.
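
For reference, here is a minimal sketch of posting that triggering Observation to the IRIS FHIR endpoint. The base URL, credentials, resource IDs, and timestamp are placeholders; the demo repository ships its own data-loading scripts.

# Hedged sketch: POST a high creatinine Observation to the IRIS FHIR server.
# Endpoint, credentials, and resource references below are placeholders.
import requests

FHIR_BASE = "http://localhost:52773/fhir/r4"  # assumed endpoint; check the repo's docker-compose

observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {"coding": [{"system": "http://loinc.org", "code": "2160-0",
                         "display": "Creatinine [Mass/volume] in Serum or Plasma"}]},
    "subject": {"reference": "Patient/1"},
    "valueQuantity": {"value": 2.1, "unit": "mg/dL"},
    "interpretation": [{"coding": [{
        "system": "http://terminology.hl7.org/CodeSystem/v3-ObservationInterpretation",
        "code": "H", "display": "High"}]}],
    "effectiveDateTime": "2026-01-08T10:47:00Z",
}

resp = requests.post(f"{FHIR_BASE}/Observation", json=observation,
                     headers={"Content-Type": "application/fhir+json"},
                     auth=("_SYSTEM", "SYS"))          # placeholder credentials
print(resp.status_code, resp.headers.get("Location"))  # expect 201 Created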


🎯 What Problem Does This Solve?

Most AI demos in healthcare focus on:
- Chat interfaces for asking questions
- Unstructured text outputs
- Opaque reasoning ("trust the AI")

In real clinical environments, what matters is:

  • Reacting to clinical events automatically
  • Understanding complete patient context
  • Providing explainable recommendations with evidence
  • Persisting decisions for audit and compliance

This demo answers a simple but realistic question:

What happens when a new abnormal lab result arrives — and how can we automate the initial clinical assessment while maintaining transparency?


🧪 Demo Scenario: CKD + Rising Creatinine

The demo is based on a common healthcare use case:

Patient: Jose Garcia (MRN-1000001)
- Conditions: Chronic Kidney Disease (CKD Stage 3), Hypertension
- Medications: Ibuprofen (NSAID), Lisinopril
- Lab history:
  - 3 months ago: 1.6 mg/dL
  - 1 month ago: 1.9 mg/dL
  - Today: 2.1 mg/dL ← triggers workflow

The >30% progressive increase requires clinical follow-up.
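
The trigger condition itself is simple arithmetic. A quick illustrative check (the function name and threshold here are mine, not the repository's):

# Illustrative trend check: flag a rise of more than 30% versus the oldest value.
def significant_creatinine_rise(values_mg_dl, threshold=0.30):
    baseline, latest = values_mg_dl[0], values_mg_dl[-1]
    return (latest - baseline) / baseline > threshold

print(significant_creatinine_rise([1.6, 1.9, 2.1]))  # True: (2.1 - 1.6) / 1.6 ≈ 31%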

Instead of waiting for manual review, the system automatically:

  1. Detects the event (FHIR Observation POST)
  2. Retrieves patient context (conditions, medications, lab history)
  3. Consults clinical guidelines via RAG (vector search)
  4. Performs agentic reasoning across three specialized agents
  5. Produces explainable recommendations with evidence citations

⏱️ From Event to Evidence: The Complete Journey

Follow a single lab result through the system:

  • FHIR Observation posted to IRIS server
  • Interoperability Production triggered
  • Context Agent queries patient history from FHIR
  • Guidelines Agent searches vector database (clinical documents)
  • Reasoning Agent synthesizes 3 recommendations
  • Results persisted to SQL (Cases, CaseRecommendations, CaseEvidences)
  • FHIR DiagnosticReport published to server
  • Complete — Full audit trail available for review

From event to actionable recommendations.


🧠 Architecture Overview

Key Principle

InterSystems IRIS for Health is the orchestrator and system of record.

The AI agents are external capabilities that are governed, triggered, and integrated by the IRIS platform. IRIS owns the data, the workflow, and the audit trail — the agents provide specialized reasoning.

High-Level Flow

Key steps:

  • FHIR Observation → POSTed to IRIS FHIR server
  • Interaction Strategy → Detects clinical event
  • Interoperability Production → Orchestrates workflow
  • Business Operation → Calls Agentic AI REST service (FastAPI)
  • Agents Execute → Context retrieval, guideline search, reasoning
  • Results Return → Structured JSON back to IRIS
  • Persistence → SQL tables store cases, recommendations, evidence
  • Publishing → FHIR DiagnosticReport created and stored

Visual Components

The demo includes a Gradio web UI for interactive demonstration:

  • Post lab values and trigger the workflow
  • Watch real-time agent progress
  • View recommendations and evidence citations
  • Query SQL audit tables
  • Access IRIS Production message viewer

This makes the complete flow visible and understandable.


🤖 Why CrewAI? Understanding Multi-Agent Architecture

CrewAI is a multi-agent orchestration framework that enables specialized AI agents to collaborate on complex tasks.

In this demo, three agents work sequentially:

1. Context Agent

Role: Gather patient clinical history from FHIR server

Action:
- Fetch patient demographics and conditions
- Retrieve historical lab results (creatinine trends)
- Collect active medications
- Identify risk factors (NSAID use + CKD)

Output: Structured patient context for reasoning
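
A rough sketch of the FHIR searches such a context-gathering tool might issue. The base URL, credentials, and patient ID are placeholders; the actual tool code lives in the repository.

# Hedged sketch of the FHIR searches behind the Context Agent.
import requests

FHIR_BASE = "http://localhost:52773/fhir/r4"  # assumed endpoint
AUTH = ("_SYSTEM", "SYS")                     # placeholder credentials

def fetch_entries(resource, **params):
    """GET a FHIR search Bundle and return its entry list."""
    r = requests.get(f"{FHIR_BASE}/{resource}", params=params, auth=AUTH)
    r.raise_for_status()
    return r.json().get("entry", [])

patient_id = "1"
conditions  = fetch_entries("Condition", patient=patient_id)          # CKD stage 3, hypertension
medications = fetch_entries("MedicationRequest", patient=patient_id)  # Ibuprofen, Lisinopril
creatinine  = fetch_entries("Observation", patient=patient_id,
                            code="2160-0", _sort="date")              # 1.6 → 1.9 → 2.1 mg/dL trend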


2. Guidelines Agent

Role: Search clinical knowledge base using RAG (Retrieval-Augmented Generation)

Action:
- Query IRIS vector database with semantic search
- Find relevant guideline sections (clinical protocols, etc.)
- Retrieve evidence chunks with similarity scores
- Provide citations for recommendations

Output: Evidence-based clinical guidance
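
A hedged sketch of what that lookup could look like against an IRIS vector table from Python. The table and column names, embedding model, and connection details are assumptions; the demo defines its own schema and loader.

# Hedged sketch: semantic search over guideline chunks stored in IRIS.
# Table/column names and connection settings are illustrative only.
import iris                                        # intersystems-irispython driver
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")    # 384-dimensional embeddings
query_vector = model.encode("progressive creatinine rise in CKD stage 3").tolist()

conn = iris.connect("localhost", 1972, "USER", "_SYSTEM", "SYS")  # placeholder connection
cur = conn.cursor()
cur.execute("""
    SELECT TOP 3 GuidelineId, Excerpt,
           VECTOR_COSINE(Embedding, TO_VECTOR(?, DOUBLE, 384)) AS Similarity
    FROM clinicalai_data.GuidelineChunks
    ORDER BY Similarity DESC
""", [str(query_vector)])

for guideline_id, excerpt, similarity in cur.fetchall():
    print(f"{guideline_id}  {similarity:.2f}  {excerpt[:80]}...")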


3. Reasoning Agent

Role: Synthesize recommendations from context + guidelines

Action:
- Analyze lab trends (>30% increase = significant)
- Identify risk factors (CKD + NSAID + progressive rise)
- Apply clinical decision rules
- Generate structured recommendations with confidence levels

Output: Risk assessment + actionable follow-up plan


Why Multi-Agent Instead of Single LLM Call?

Agentic workflows provide:

  • Better structured reasoning — each agent has a focused responsibility
  • Tool use — agents can query FHIR, search vector databases, analyze trends
  • Explainable decision chains — each step is traceable
  • Separation of concerns — Context ≠ Guidelines ≠ Reasoning

Critical: IRIS orchestrates the agents — CrewAI is used as a library, not the platform. IRIS owns persistence, orchestration, FHIR integration, and audit trails.
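
For readers new to CrewAI, this is roughly how a sequential three-agent crew is wired up as a library. The roles, goals, and kickoff inputs below are illustrative rather than the repository's exact definitions.

# Illustrative CrewAI wiring: three agents, three tasks, sequential execution.
from crewai import Agent, Task, Crew, Process

context_agent = Agent(
    role="Clinical Context Analyst",
    goal="Summarize the patient's conditions, medications, and creatinine trend",
    backstory="Reads the FHIR record and produces structured clinical context.")
guidelines_agent = Agent(
    role="Guidelines Researcher",
    goal="Retrieve the most relevant guideline excerpts from the vector store",
    backstory="Searches the clinical knowledge base and cites every excerpt used.")
reasoning_agent = Agent(
    role="Clinical Reasoner",
    goal="Produce a risk level and follow-up recommendations with evidence",
    backstory="Combines patient context and guideline evidence into a plan.")

tasks = [
    Task(description="Gather clinical context for {patient_ref}",
         expected_output="Structured patient context", agent=context_agent),
    Task(description="Find guideline evidence for the observed creatinine trend",
         expected_output="Cited guideline excerpts", agent=guidelines_agent),
    Task(description="Synthesize a risk assessment and follow-up recommendations",
         expected_output="Risk level, recommendations, evidence", agent=reasoning_agent),
]

crew = Crew(agents=[context_agent, guidelines_agent, reasoning_agent],
            tasks=tasks, process=Process.sequential)
result = crew.kickoff(inputs={"patient_ref": "Patient/1"})

Each agent can also be given tools (FHIR queries, vector search) so the steps above stay grounded in retrieved data rather than in the model's memory.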


🔄 Interoperability Production

The workflow is managed by three IRIS components:

  • Business Service (FHIRObservationIn)
    Triggered automatically when FHIR Observation is POSTed

  • Business Process (FollowUpAI)
    Orchestrates three-step workflow:

    1. Call agent service
    2. Persist results to SQL
    3. Publish DiagnosticReport
  • Business Operations

    • ClinicalAgenticOperation → REST call to FastAPI/CrewAI (see the sketch after this list)
    • ClinicalAiPersistence → SQL table writes
    • ClinicalReportPublisher → FHIR DiagnosticReport POST
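
A minimal sketch of the REST surface that ClinicalAgenticOperation might call. The route name and payload/response schemas are assumptions, not the repository's exact API; the hard-coded response only illustrates the shape that IRIS persists.

# Hedged sketch of the agent-side FastAPI service called by the Business Operation.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class EvaluationRequest(BaseModel):
    patient_ref: str      # e.g. "Patient/1"
    observation_ref: str  # e.g. "Observation/14"

class EvaluationResponse(BaseModel):
    case_id: str
    risk_level: str
    confidence: str
    recommendations: list[dict]
    evidences: list[dict]

@app.post("/evaluate", response_model=EvaluationResponse)
def evaluate(req: EvaluationRequest) -> EvaluationResponse:
    # In the demo this is where the CrewAI crew runs; here we return an
    # illustrative payload with the structure IRIS stores and publishes.
    return EvaluationResponse(
        case_id="CSE-DEMO-001",
        risk_level="medium-high",
        confidence="high",
        recommendations=[{"action": "repeat_creatinine", "timeframe": "7-14 days"}],
        evidences=[{"guideline_id": "ckd_creatinine_guideline_demo", "similarity": 0.66}],
    )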

🔍 Explainability: Proving the AI's Reasoning

One of the most critical aspects of clinical AI is proving why a recommendation was made.

IRIS persists everything in a minimal, queryable SQL model:

  • Cases — What happened (patient, observation, risk level, confidence)
  • CaseRecommendations — What to do (action type, description, timeframe)
  • CaseEvidences — Why (guideline citations, similarity scores, text excerpts)

Example Queries

"What cases were evaluated today?"

SELECT
  CaseId,
  PatientRef,
  RiskLevel,
  Confidence,
  ReasoningSummary
FROM clinicalai_data.Cases
WHERE CreatedAt >= CURRENT_DATE
ORDER BY CreatedAt DESC

Result:

CaseId: CSE-20260108-001
PatientRef: Patient/1 (Jose Garcia)
RiskLevel: medium-high
Confidence: high
ReasoningSummary: The patient with stage 3 chronic kidney disease and hypertension demonstrates a sustained and progressive increase in serum creatinine over 90 days...

"Why did the agent recommend nephrotoxic medication review?"

SELECT
  e.GuidelineId,
  e.Similarity,
  e.Excerpt
FROM clinicalai_data.CaseEvidences e
WHERE e.CaseId = 'b344f121-db68-4cd6-8877-1855c3d547ff'
ORDER BY e.Similarity DESC

Result:

GuidelineId: ckd_creatinine_guideline_demo
Similarity: 0.66
Excerpt: "Recommended actions include repeat serum creatinine testing within 7–14 days, review of current medications for nephrotoxicity, assessment of contributing factors, and close monitoring of renal function."

Every recommendation has:
- The clinical context used
- The guidelines consulted
- The similarity scores showing relevance
- The reasoning chain from data to decision

You can answer "Why did the AI recommend this?" with SQL queries and evidence citations.


🩺 Publishing Results as FHIR DiagnosticReport

The final step closes the loop: AI outputs become part of the clinical record.

The system publishes a FHIR DiagnosticReport containing:

  • Subject: Patient reference (Jose Garcia)
  • Result: Link to triggering Observation (creatinine 2.1)
  • Conclusion: Risk level + reasoning summary
  • PresentedForm: Human-readable recommendations (Base64-encoded)
  • Extensions: Case ID, confidence score, model metadata
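
Roughly what building and posting such a resource can look like from Python; the extension URLs and IDs here are illustrative, while the field names follow FHIR R4.

# Hedged sketch: assemble and POST the DiagnosticReport produced by the workflow.
import base64
import requests

recommendations_text = ("1) Repeat creatinine in 7-14 days\n"
                        "2) Review nephrotoxic medications\n"
                        "3) Monitor renal function closely")

report = {
    "resourceType": "DiagnosticReport",
    "status": "final",
    "code": {"text": "AI clinical follow-up assessment"},
    "subject": {"reference": "Patient/1"},
    "result": [{"reference": "Observation/14"}],
    "conclusion": "Risk level medium-high: progressive creatinine rise in CKD stage 3.",
    "presentedForm": [{
        "contentType": "text/plain",
        "data": base64.b64encode(recommendations_text.encode()).decode(),
    }],
    "extension": [
        {"url": "http://example.org/fhir/StructureDefinition/case-id",       # illustrative URLs
         "valueString": "CSE-20260108-001"},
        {"url": "http://example.org/fhir/StructureDefinition/ai-confidence",
         "valueString": "high"},
    ],
}

resp = requests.post("http://localhost:52773/fhir/r4/DiagnosticReport",  # assumed endpoint
                     json=report, headers={"Content-Type": "application/fhir+json"},
                     auth=("_SYSTEM", "SYS"))                             # placeholder credentials
print(resp.status_code)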

This makes the AI output:
- Interoperable — Standard FHIR resource
- Consumable — Accessible via FHIR API by EHRs, portals, apps
- Auditable — Part of the permanent clinical record
- Queryable — GET /DiagnosticReport?result=Observation/14

The DiagnosticReport is not a separate "AI system output" — it's a first-class clinical document that follows the same standards as lab reports and radiology findings.


🚀 Try It Yourself

Quick Start (15 minutes):

  1. Clone the repository

    git clone https://github.com/intersystems-ib/iris-health-fhir-agentic-demo
    cd iris-health-fhir-agentic-demo
    
  2. Start IRIS container

    docker-compose up -d
    
  3. Load sample patient data (Jose Garcia with CKD history)
    Follow the README setup instructions

  4. Run the Gradio UI

    python run_ui.py
    

    Open browser to http://localhost:7860

  5. POST an abnormal lab value and watch:

    • Real-time agent progress
    • Evidence retrieval from vector database
    • Recommendations generated with confidence scores
    • SQL audit trail queries
  6. Query the results using IRIS SQL Explorer or Management Portal

💬 Questions or feedback? Reply to this post — I'd love to hear about your use cases.


🎯 What You've Learned

If you've followed along, you now understand how to:

  • Trigger AI workflows from FHIR events — no manual initiation required
  • Orchestrate multi-agent systems with CrewAI — Context, Guidelines, Reasoning agents
  • Build explainable AI with SQL audit trails — every decision is traceable
  • Publish AI outputs as FHIR resources — interoperable clinical documents
  • Integrate agentic AI with IRIS Interoperability — production-grade orchestration


🔮 Beyond Lab Results: What Else Can You Automate?

This pattern applies to many clinical scenarios:

  • Medication reconciliation alerts — Detect drug-drug interactions or contraindications
  • Care gap identification — Missing screenings based on age, conditions, guidelines
  • Risk stratification triggers — Identify high-risk patients for intervention
  • Clinical trial matching — Find eligible patients based on inclusion criteria

The architecture is the same: event → context → evidence → reasoning → action.


🚀 Conclusion

This demo shows how Agentic AI can be safely and effectively integrated into real clinical workflows using InterSystems IRIS for Health.

By combining:
- Event-driven interoperability — React to clinical events automatically
- Agentic reasoning — Multi-agent collaboration with tool use
- SQL persistence — Full audit trails for compliance
- FHIR-native outputs — Standard clinical documents

We move from AI experiments to platform-grade clinical AI.

Next Steps:

Star the repo: https://github.com/intersystems-ib/iris-health-fhir-agentic-demo

🧪 Try the demo with your own clinical guidelines

💬 Share your use case — What clinical event would you automate first?
