IRIS/Ensemble as an ETL
IRIS and Ensemble are designed to act as an ESB/EAI. This mean they are build to process lots of small messages.
But some times, in real life we have to use them as ETL. The down side is not that they can't do so, but it can take a long time to process millions of row at once.
To improve performance, I have created a new SQLOutboundAdaptor who only works with JDBC.
BatchSqlOutboundAdapter
Extend EnsLib.SQL.OutboundAdapter to add batch batch and fetch support on JDBC connection.
Benchmark
Benchmarks released on Postgres 11.2 with 1 000 000 rows fetched and 100 000 rows inserted on 2 columns.
Prerequisites
Can be used on IRIS or Ensemble 2017.2+.
Installing
Clone this repository
git clone https://github.com/grongierisc/BatchSqlOutboundAdapter.git
Use Grongier.SQL.SqlOutboundAdapter adaptor.
New methods from the adaptor
- Method ExecuteQueryBatchParmArray(ByRef pRS As Grongier.SQL.GatewayResultSet, pQueryStatement As %String, pBatchSize As %Integer, ByRef pParms) As %Status
- pRS is the ResultSet can be use as any EnsLib.SQL.GatewayResultSet
- pQueryStatement is the SQL query you like to execute
- pBatchSize is the fetch size JDBC parameter
- Method ExecuteUpdateBatchParamArray(Output pNumRowsAffected As %Integer, pUpdateStatement As %String, pParms...) As %Status
- pNumRowsAffected is the number of row inserted
- pUpdateStatement is teh update/insert SQL statement
- pParms is Caché Multidimensional Array
- pParms indicate the number of row in batch
- pParms(integer) indicate the number of parameters in the row
- pParms(integer,integerParam) indicate the value of the parameter whose position is integerParam.
- pParms(integer,integerParam,"SqlType") indicate the SqlType of the parameter whose position is integerParam, by default it will be $$$SqlVarchar
Example
- Grongier.Example.SqlSelectOperation show an example of ExecuteQueryBatchParmArray
- Grongier.Example.SqlSelectOperation show an example of ExecuteUpdateBatchParamArray
Content of this project
This adaptor include :
- Grongier.SQL.Common
- No modification, simple extend of EnsLib.SQL.Common
- Grongier.SQL.CommonJ
- No modification, simple extend of EnsLib.SQL.CommonJ
- Grongier.SQL.GatewayResultSet
- Extension of EnsLib.SQL.GatewayResultSet to gain the ablility to use fetch size.
- Grongier.SQL.JDBCGateway
- Use to allow compilation and support on Ensemble 2017.1 and lower
- Grongier.SQL.OutboundAdapter
- The new adaptor with :
- ExecuteQueryBatchParmArray allow SQL query a distant database and specify the JDBC fetchSize
- ExecuteUpdateBatchParamArray allow insertion in a distant database with JDBC addBatch and executeBatch
- Grongier.SQL.Snapshot
- Extend of EnsLib.SQL.Snapshot to handle Grongier.SQL.GatewayResultSet and the fetch size property
Adaptor now aviable on zpm :
Perfect, @Guillaume Rongier!!
Thank you!
I am looking at an interface written in Health Connect that inserts one row at a time using Outbound SQL Adapter. I have observed inserting 200k rows took 4 hours. Does this article say, if I use this adapter properly, I can reduce insert time a lot?
Yes, you can easily gain a factor from 5 to 10 in insert time.
I your case, you can expect your batch to be processed in less than 30 minutes or so.
Example of code can be found here :
https://github.com/grongierisc/BatchSqlOutboundAdapter/blob/master/src/C...
BTW : This adaptor only works with JDBC drivers.
I have tried to adapt your example to run on HealthShare 2019.1. I get an error when I call ..Adapter.ExecuteUpdateBatchParamArray from SQLInsertOperation.
I debugged the OutboundAdapter code. The error happens on this line:
..%ConnHandle is not an Object. pSQLStatement looks okay.
My code is here:
https://github.com/oliverwilms/etl/blob/main/src/Oliver/SQL/OutboundAdap...(..%25ConnHandle%2CpSQLStatement)
Any idea what is the problem?
hmmm 4 hours is a lot.
What's the DB you insert it ? If external, jdbc / odbc ?
It is a jdbc to sql server (on VA network)
Social networks
InterSystems resources
Log in or sign up
Log in or create a new account to continue
Log in or sign up
Log in or create a new account to continue
Log in or sign up
Log in or create a new account to continue