Question
Scott Roth · Oct 21, 2021

Optimizing throughput of XML to MS SQL (JDBC) inserts

Once a week we attempt to load an XML file from Workday into an MS SQL table using JDBC and stored procedures. There are approximately 102,999 records in this XML file, and we are struggling to process the entire file within a reasonable amount of time. We feed the XML through a BPL that populates the parameter values of a stored procedure, then call the stored procedure through a Business Operation. I have tried splitting the work across two Business Operations, but we still see an issue loading the XML into MS SQL. I have also gone through the JDBC driver properties to see if that would help, and it doesn't make a difference.

Any suggestions on how we can process this file faster?

Product version: Ensemble 2018.1
$ZV: Cache for UNIX (IBM AIX for System Power System-64) 2018.1.3 (Build 414U) Mon Oct 28 2019 11:24:02 EDT
Discussion (4)

Where do you struggle? Is it in the BPL, in the Operation, or during the insert statement to SQL Server?

If the bottleneck is in the insert statement, check out this community SQL adapter (it works only with JDBC).
https://openexchange.intersystems.com/package/ETL-Interoperability-Adapter

This adapter gives you access to batch insert statements, which can speed up inserts by 10x or even 100x.

The GitHub repository has some examples of how to use it:
https://github.com/grongierisc/BatchSqlOutboundAdapter/blob/master/src/C...
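Under the hood, a batch adapter amounts to preparing one parameterized INSERT and shipping many parameter rows per round trip (in JDBC terms, `PreparedStatement.addBatch()`/`executeBatch()`). As a rough illustration of why that helps, here is a minimal Python sketch using the stdlib `sqlite3` module as a stand-in for SQL Server; the table name, columns, and record count are invented for the example.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; over JDBC the same idea is
# PreparedStatement.addBatch()/executeBatch() to cut network round trips.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE worker (id INTEGER, name TEXT, dept TEXT)")

# Simulate ~103k records parsed out of the weekly Workday XML file.
rows = [(i, f"name{i}", f"dept{i % 10}") for i in range(103_000)]

# Row-by-row: one statement execution (and, over JDBC, one network
# round trip) per record.
for row in rows[:1000]:
    conn.execute("INSERT INTO worker VALUES (?, ?, ?)", row)

# Batched: the statement is prepared once and all parameter rows are
# sent together, which is where the 10x-100x speedup comes from.
conn.executemany("INSERT INTO worker VALUES (?, ?, ?)", rows[1000:])
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM worker").fetchone()[0]
print(count)  # 103000
```

The ratio between the two styles grows with network latency, so against a remote SQL Server the batched path gains far more than this local sketch suggests.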

I believe the bottleneck is in the BPL, given the sheer quantity/format of records.

I have done several BPLs that take either flat-file data or HL7 messages and insert them into SQL tables using a JDBC stored procedure call.

Looking over the adapter you mention, how is that different from just calling a stored procedure to do the insert? It appears you are building a "dynamic" SQL statement in the class file to be sent and executed. In the past I have tried making "dynamic" SQL statements through a JDBC call, and it slows the process down, because the SQL statement has to be validated and built on the SQL side before it executes. With stored procedures, the SQL statement already lives on the MS SQL database, so you just pass the variables and skip that extra step of validating and building the SQL query before execution.
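One nuance worth separating here: the "build the SQL each time" cost applies to string-concatenated SQL, while a *parameterized* dynamic statement is parsed once and only the bound values vary per row, which is close to the per-call cost of a stored procedure. A small hedged sketch of the distinction, again using Python's stdlib `sqlite3` as a stand-in (names invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE worker (id INTEGER, name TEXT)")

# String-concatenated "dynamic" SQL: a new statement text per row, so the
# server must validate and plan each one (and it invites injection bugs).
for i in range(100):
    conn.execute(f"INSERT INTO worker VALUES ({i}, 'name{i}')")

# Parameterized statement: one statement text, prepared once; only the
# bound parameters change per execution, much like a stored-procedure call.
stmt = "INSERT INTO worker VALUES (?, ?)"
for i in range(100, 200):
    conn.execute(stmt, (i, f"name{i}"))

count = conn.execute("SELECT COUNT(*) FROM worker").fetchone()[0]
print(count)  # 200
```

Either form can then be batched; the batching, not the stored-procedure-vs-insert choice, is usually the dominant factor at 100k rows.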

The difference between a stored procedure and plain SQL inserts is that the business logic stays on the application side rather than the database side, keeping the layers (storage/logic/presentation) separate.

Nevertheless, the problem doesn't seem to be at the SQL level but rather at the BPL level, so maybe it's due to the conversion of XML to objects?
If that is the case, use a different technique to parse the XML, such as a SAX parser, which avoids loading the whole XML document into memory.
https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls...
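With ~103,000 records, a DOM-style import materializes the whole document as objects before any record is processed; a SAX (event-driven) parser instead streams records one at a time in roughly constant memory. A minimal sketch of the pattern using Python's stdlib `xml.sax`, with invented Workday-style element names; in the real interface, the "hand off" step would be where each record's values go to the stored-procedure call:

```python
import io
import xml.sax

class RecordHandler(xml.sax.ContentHandler):
    """Streams the document: each <Worker> is handled as its events
    arrive, so the full 100k-record file never sits in memory at once."""

    def __init__(self):
        self.records = []
        self._current = None
        self._text = []

    def startElement(self, name, attrs):
        if name == "Worker":            # invented record element name
            self._current = {}
        elif self._current is not None:
            self._text = []             # start collecting a field's text

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        if name == "Worker":
            # Hand off one complete record (e.g. to a batched insert).
            self.records.append(self._current)
            self._current = None
        elif self._current is not None:
            self._current[name] = "".join(self._text)

doc = io.BytesIO(
    b"<Workers>"
    b"<Worker><ID>1</ID><Name>Ann</Name></Worker>"
    b"<Worker><ID>2</ID><Name>Bob</Name></Worker>"
    b"</Workers>"
)
handler = RecordHandler()
xml.sax.parse(doc, handler)
print(handler.records)
# [{'ID': '1', 'Name': 'Ann'}, {'ID': '2', 'Name': 'Bob'}]
```

Caché/Ensemble has its own SAX support (%XML.SAX.Parser with a content-handler class), so the same event-driven shape applies there; this Python version only illustrates the parsing pattern, not the product API.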

I am confused, so bear with me... We currently use EnsLib.EDI.XML.Service.FileService to read the data and send it to a BPL, where we simply map the data elements into the parameters for the stored procedure call. Are you saying we need to add another step in between? Is the SAX parser only part of IRIS? We are still on HealthShare 2018.1.3 for the time being.

The documentation doesn't always give a clear view of how this should be set up within the GUI. I tried looking for an example in ENSDEMO but did not see one. Do you have an example?