JDBC batch insert

Hi, I'm looking for a way to implement the JDBC pattern "executeBatch" in ObjectScript: https://www.tutorialspoint.com/jdbc/jdbc-batch-processing.htm

Since Ensemble 2017.2.2 there is a new method called executeParametersBatch : https://docs.intersystems.com/ens20172/csp/docbook/relnotes/index.html#D...

Even in the Ensemble 2018.1 change log, mention a patch on this method: https://docs.intersystems.com/ens20181/csp/docbook/relnotes/index.html#J...

Do you have any examples or tracks to use the executeParametersBatch method of %Net.Remote.Java.JDBCGateway?
Otherwise have you found other workaround for batch insert in JDBC through an EnsLib.SQL.OutboundAdapter?

Thank you in advance for your answers.

  • 0
  • 0
  • 190
  • 3
  • 1

Answers

Hi Guillaume,

Here's some rough code showing how to use executeParametersBatch. I put this together from a few other working examples I had but I haven't tested this actual code and it may have bugs.

As always, this is provided as sample code only and is not meant for production use :)

-Marc
 

set pSQLStatement="INSERT INTO Test.Table (Field1,Field2) VALUES (?,?)"

// JDBCGwy is an instance of the JDBC Gateway object. EnsLib.SQL.OutboundAdapter instantiates this automatically and stores a reference to it in ..%Connection.%JGProxy

// Prepare the SQL statement
set pHS=JDBCGwy.prepareStatement(ConnHandle,pSQLStatement)

// executeParametersBatch expects tArgs to be a $LIST, with the following format:
//     ParamCount, ParamSets, Type1, Param1, Type2, Param2, Type3, Param3, Type11,Param11… TypeNN,ParamNN
// 
// ParamCount is the number of parameters the query expects (in this example 2) 
// ParamSets is the number of rows we will be inserting in this batch
// Type1, Type2, ..., TypeN is an integer indicating the JDBC data type for the corresponding Param value (e.g. Param1, Param2, ..., ParamN)
// Param1, Param2, ..., ParamN is the value for the query parameter

set $LIST(tArgs,1)=2  // The query has two parameters ("?") in it
set $LIST(tArgs,2)=3  // We will insert 3 rows in this batch

// Row 1 -------------------------------------------------------------

//    Set the value for the first column (Field1VarChar)
set $LIST(tArgs,3)=12 // The JDBC data type for varchar is 12
set $LIST(tArgs,4)="String value 1" // Value for column Field1VarChar

//    Set the value for the second column (Field2Integer)
set $LIST(tArgs,5)=4 // The JDBC data type for integer is 4
set $LIST(tArgs,6)=7 // Value for column Field2Integer

// Row 2 -------------------------------------------------------------

//    Set the value for the first column (Field1VarChar)
set $LIST(tArgs,7)=12 // The JDBC data type for varchar is 12
set $LIST(tArgs,8)="String value 2" // Value for column Field1VarChar

//    Set the value for the second column (Field2Integer)
set $LIST(tArgs,9)=4 // The JDBC data type for integer is 4
set $LIST(tArgs,10)=123 // Value for column Field2Integer

// Row 3 -------------------------------------------------------------

//    Set the value for the first column (Field1VarChar)
set $LIST(tArgs,11)=12 // The JDBC data type for varchar is 12
set $LIST(tArgs,12)="String value 3" // Value for column Field1VarChar

//    Set the value for the second column (Field2Integer)
set $LIST(tArgs,13)=4 // The JDBC data type for integer is 4
set $LIST(tArgs,14)=54 // Value for column Field2Integer


// Perform the batch insert
// tResultCodes is a $LIST of integers indicating success/failure for each row in the batch
set tResultCodes=JDBCGwy.executeParametersBatch(pHS,tArgs)
 

Thanks Marc,

Your code work great.

We gain a factor ten with this implementation.

Below our current implementation.

/// Structure of pParms is Caché Multidimensional Array
/// Where :
/// pParms indicate the number of row in batch
/// pParms(integer) indicate the number of parameters in the row
/// pParms(integer,integerParam) indicate the value of the parameter whose position is integerParam.
Method ExecuteUpdateBatchParmArray(Output pNumRowsAffected As %Integer, pUpdateStatement As %String, pParms...) As %Status
{

set tStatus = $$$OK

try{

set pSQLStatement=pUpdateStatement

// JDBCGwy is an instance of the JDBC Gateway object. EnsLib.SQL.OutboundAdapter instantiates this automatically and stores a reference to it in ..%Connection.%JGProxy

// Prepare the SQL statement
Set pHS=..%JGProxy.prepareStatement(..%ConnHandle,pSQLStatement)

// executeParametersBatch expects tArgs to be a $LIST, with the following format:
// ParamCount, ParamSets, Type1, Param1, Type2, Param2, Type3, Param3, Type11,Param11… TypeNN,ParamNN
// 
// ParamCount is the number of parameters the query expects (in this example 2) 
// ParamSets is the number of rows we will be inserting in this batch
// Type1, Type2, ..., TypeN is an integer indicating the JDBC data type for the corresponding Param value (e.g. Param1, Param2, ..., ParamN)
// Param1, Param2, ..., ParamN is the value for the query parameter

set nbParam = pParms(1,1)
set nbBatch = pParms(1)

set $LIST(tArgs,1)=nbParam // How many parameters ("?") in it
set $LIST(tArgs,2)=nbBatch // We will insert nbBatch rows in this batch

set = 2

for k=1:1:nbBatch {



for l=1:1:pParms(1,k){

set = +1
set $LIST(tArgs,i)=12 // The JDBC data type for varchar is 12
set = +1
set $LIST(tArgs,i)=pParms(1,k,l) // Value for column Field1VarChar

}

}


// Perform the batch insert
// tResultCodes is a $LIST of integers indicating success/failure for each row in the batch
set tResultCodes = ..%JGProxy.executeParametersBatch(pHS,tArgs)

//Todo, Read list set pNumRowsAffected = $LISTLENGTH(tResultCodes)
set pNumRowsAffected = nbBatch 


// Remove Statement to avoid CURSOR Leaks
set sc = ..%JGProxy.removeStatement(pHS)

tArgs

}

catch exp{

Set tStatus = exp.AsStatus()
}

Quit tStatus
}

Comments

Hi Guillaume,

I'm not sure what you're trying to get at. Our core JDBC driver supports batch processing through exactly the mechanism described in the tutorial you referenced, so that should work fine using default JDBC methods on the Java side. The JDBC SQL Adapter in EnsLib on the other hand was designed for a message-by-message processing and therefore the Adapter doesn't expose a batching mechanism. 

Maybe you can share a little more on the actual use case you're implementing? Buffering up messages for batch insertion or does the message carry a lot of data that deserves a batch insert by itself? Or am I totally on the wrong track here? :-)

Thanks,
benjamin

Hi Benjamin,

In some cases we have to use Ensemble as an ETL not an ESB.

So we extract lot of data, transform and load them throw the JDBC SQL Adapter in EnsLib. To do so, we dont do it line by line throw messages but ResultSet by ResultSet (one ResultSet can have more than 500 000 lines and 30 colones). This pattern work well.

But we have start to have time treatment issue. Our process took more than 8 hours. When we analyze it, what it cost time is the select and insert treatment.

We solved the problem of select treatment with this post : https://community.intersystems.com/post/jdbc-large-query-optimisation

So now, we are looking for a way to improve insert time. Our guess is to implement throw JDBC SQL Adapter in EnsLib the java pattern with PrepareStatement.addBatch() then executeBatch().

Do you have any idea to improve the insert treatment ?