How do I stream process and ETL terabyte-scale HL7/FHIR data using InterSystems IRIS Interoperability?
We’re ingesting high-volume HL7 messages and converting them to FHIR in near-real-time. How do we design a streaming ETL pipeline using interoperability (productions) that scales horizontally?
To design a scalable HL7-to-FHIR ETL pipeline in InterSystems IRIS Interoperability, follow these steps:
Data Ingestion and Processing:
- Use the InterSystems IRIS Interoperability production framework. Its Business Services connect to external data sources and ingest HL7 data over several protocols (e.g., TCP/MLLP, HTTP, SOAP, FTP, or file adapters).
- Configure Business Processes to route messages and define the message-handling workflow [1].
- Map HL7 v2 segments to FHIR resources with the native Data Transformation Language (DTL) and the other IRIS interoperability tooling.
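The service, process, and operation roles described above can be pictured as three stages joined by FIFO queues. The following is only a toy sketch in plain Python, not IRIS code: `MiniProduction` and its method names are hypothetical, the queues are in-memory rather than persistent, and the routing rule (forward only ADT messages) is an assumption chosen for illustration.

```python
from collections import deque


class MiniProduction:
    """Toy stand-in for a production: three stages joined by FIFO queues."""

    def __init__(self):
        self.to_process = deque()    # queue between service and process
        self.to_operation = deque()  # queue between process and operation
        self.delivered = []          # stands in for the external target

    def service(self, raw: str) -> None:
        """Inbound stage: accept one raw HL7 v2 message and queue it."""
        self.to_process.append(raw)

    def process(self) -> None:
        """Routing stage: read MSH-9 and forward only ADT messages."""
        while self.to_process:
            raw = self.to_process.popleft()
            msh = raw.split("\r")[0].split("|")
            # After splitting on "|", MSH-9 (message type) sits at index 8.
            message_type = msh[8] if len(msh) > 8 else ""
            if message_type.startswith("ADT"):
                self.to_operation.append(raw)

    def operation(self) -> None:
        """Outbound stage: hand each queued message to the target."""
        while self.to_operation:
            self.delivered.append(self.to_operation.popleft())


production = MiniProduction()
production.service(
    "MSH|^~\\&|SENDER|HOSP|RECEIVER|HOSP|20240101120000||ADT^A01|MSG0001|P|2.5\r"
    "PID|1||12345^^^HOSP^MR||DOE^JOHN||19800101|M"
)
production.service(
    "MSH|^~\\&|SENDER|HOSP|RECEIVER|HOSP|20240101120500||ORU^R01|MSG0002|P|2.5"
)
production.process()
production.operation()
```

In a real production each stage is a configured business host and the queues are persisted by the platform, which is what allows recovery after a restart; the sketch only shows the shape of the flow.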
FHIR Transformation:
- Enable HL7-to-FHIR transformation with the built-in FHIR interoperability components in IRIS. Configure the FHIR Interoperability Adapter to transform HL7 v2 messages into FHIR resources in real time or in batches [2].
- Use the Summary Document Architecture (SDA) as an intermediate format: HL7 v2 is first converted to SDA, and SDA is then converted to FHIR, which simplifies the transformation [2].
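To make the two-stage idea concrete, here is a minimal Python sketch of converting an HL7 v2 PID segment to a neutral intermediate record and then to a FHIR Patient resource. It is an illustration under stated assumptions, not the IRIS implementation: `IntermediatePatient` is a hypothetical stand-in and does not reflect the real SDA schema, only the first repetition of each field is read, and the field positions follow the standard PID layout (PID-3 identifier, PID-5 name, PID-7 birth date, PID-8 sex).

```python
from dataclasses import dataclass


@dataclass
class IntermediatePatient:
    """Hypothetical format-neutral record; NOT the real SDA schema."""
    mrn: str
    family_name: str
    given_name: str
    birth_date: str  # ISO 8601 (YYYY-MM-DD), or "" if unknown
    sex: str         # HL7 v2 administrative sex code, e.g. "M"


def hl7_to_intermediate(message: str) -> IntermediatePatient:
    """Stage 1: pull patient demographics out of the PID segment."""
    for segment in message.split("\r"):
        fields = segment.split("|")
        if fields[0] != "PID":
            continue
        fields += [""] * (9 - len(fields))  # pad so PID-8 always exists
        name = fields[5].split("^") + [""]  # guarantee a given-name slot
        dob = fields[7]
        iso = f"{dob[:4]}-{dob[4:6]}-{dob[6:8]}" if len(dob) >= 8 else ""
        return IntermediatePatient(
            mrn=fields[3].split("^")[0],
            family_name=name[0],
            given_name=name[1],
            birth_date=iso,
            sex=fields[8],
        )
    raise ValueError("message has no PID segment")


# FHIR Patient.gender allows: male, female, other, unknown.
GENDER = {"M": "male", "F": "female", "O": "other", "U": "unknown"}


def intermediate_to_fhir(patient: IntermediatePatient) -> dict:
    """Stage 2: build a FHIR Patient resource from the neutral record."""
    resource = {
        "resourceType": "Patient",
        "identifier": [{"value": patient.mrn}],
        "name": [{"family": patient.family_name, "given": [patient.given_name]}],
        "gender": GENDER.get(patient.sex, "unknown"),
    }
    if patient.birth_date:
        resource["birthDate"] = patient.birth_date
    return resource


SAMPLE = (
    "MSH|^~\\&|SENDER|HOSP|RECEIVER|HOSP|20240101120000||ADT^A01|MSG0001|P|2.5\r"
    "PID|1||12345^^^HOSP^MR||DOE^JOHN||19800101|M"
)
fhir_patient = intermediate_to_fhir(hl7_to_intermediate(SAMPLE))
```

The benefit of the intermediate step is that each source format needs only one mapping into the neutral record, and each target needs only one mapping out of it, instead of a separate mapping for every source and target pair.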
FHIR Repository Configuration:
Horizontal Scaling:
- Scale out by running multiple instances of the production's components (services, processes, and operations) so that high message volumes are handled concurrently. Note that parallel processing can deliver messages out of order, so keep order-sensitive flows on a single ordered path.
- IRIS supports distributed deployments and clustering, which balance workloads and provide failover resilience [4].
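One general way to combine parallel workers with per-patient ordering is to partition messages by a stable key, so that every message for a given patient lands in the same ordered lane. The sketch below is plain Python and not an IRIS feature: `partition_for`, `patient_key`, and `assign_lanes` are hypothetical names, the key is assumed to be the first component of PID-3, and the lane count of 4 is arbitrary.

```python
import hashlib
from collections import defaultdict


def partition_for(key: str, partitions: int) -> int:
    """Map a key to a lane using a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


def patient_key(raw: str) -> str:
    """Return the first component of PID-3, or "" if there is no PID segment."""
    for segment in raw.split("\r"):
        fields = segment.split("|")
        if fields[0] == "PID" and len(fields) > 3:
            return fields[3].split("^")[0]
    return ""


def control_id(raw: str) -> str:
    """Return MSH-10 (message control ID), which sits at index 9 after split."""
    return raw.split("\r")[0].split("|")[9]


def assign_lanes(messages, partitions: int) -> dict:
    """Distribute messages across lanes while keeping arrival order per lane."""
    lanes = defaultdict(list)
    for raw in messages:
        lanes[partition_for(patient_key(raw), partitions)].append(raw)
    return lanes


SAMPLE_MESSAGES = [
    "MSH|^~\\&|S|H|R|H|20240101120000||ADT^A01|A1|P|2.5\rPID|1||111^^^HOSP^MR",
    "MSH|^~\\&|S|H|R|H|20240101120001||ADT^A01|B1|P|2.5\rPID|1||222^^^HOSP^MR",
    "MSH|^~\\&|S|H|R|H|20240101120002||ADT^A08|A2|P|2.5\rPID|1||111^^^HOSP^MR",
    "MSH|^~\\&|S|H|R|H|20240101120003||ADT^A08|B2|P|2.5\rPID|1||222^^^HOSP^MR",
    "MSH|^~\\&|S|H|R|H|20240101120004||ADT^A03|A3|P|2.5\rPID|1||111^^^HOSP^MR",
]
lanes = assign_lanes(SAMPLE_MESSAGES, 4)
```

Each lane can then be drained by a single worker, so different patients are processed in parallel while the admit, update, and discharge events for any one patient stay in sequence. Python's built-in `hash()` is avoided here because string hashing is randomized per process, which would send the same patient to different lanes on different nodes.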
Tools for Efficiency and Maintenance:
For comprehensive data pipeline management:
- Visualize and trace message flows efficiently using the IRIS Management Portal.
- Leverage interoperability features like transaction resilience (FIFO queue preservation), message persistence, and automatic recovery [4][1].
With this design, terabyte-scale HL7 streams can be reliably converted to FHIR and stored, while the pipeline scales horizontally across nodes to keep performance stable.
Sources: