Question
Tom Philippi · Jan 8, 2018

Processing multiple SQL-rows at the same time using SQL Inbound adapter

I have a database which needs to be synced to another system and considering to use the SQL Inbound adapter to 'watch' the database. The other system, however, has a bulk API so I'd prefer to send multiple SQL rows in a single message to that system. The SQL Inbound adapter seems to trigger a single call (to a process or business operation) per SQL-row. However, the documentation here: http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=... in example 5 seems to suggest that if I exclude a KeyFieldName I get multiple rows in my EnsLib.SQL. snapshot and should be able to send multiple rows to the bulk API down the line. Is this a correct interpretation of the documentation?

Secondly, If I cannot process multiple rows using the SQL Inbound Adapter, I think I should use a business process to call the sql outbound adapter (which can certainly query multiple rows) and then call the bulk API. But how do a trigger this business process? Preferably i'd like to call it every 10 seconds or something. Thoughts?

0
0 575
Discussion (3)1
Log in or sign up to continue

Not really. KeyFieldName only affects the tracking of processed records.  Example 5 shows configuration where each row is passed each time the service querying the database is ran.

SQL Inbound Adapter always passes rows one by one.

Your options are:

  1. Create a buffer and check if this is a last row. If it's a last row - call bulk API. If it's not a last row then add row to buffer.
  2. Call bulk API after every N (i.e. 100) of rows.
  3. Extend SQL  Inbound Adapter to pass a whole resultset or to pass batches of rows.
  4. Write changes locally row by row. Create a separate service that calls bulk API from local data.

What option to choose? Here are several questions, which may help you:

  1. What are the throughout rates (per second/minute/day/peaks?)?
    • New rows in the source system
    • How much rows Ensemble can process
    • How much requests can API process (does it matter if you pass several rows? For example you can pass 100 individual requests per second or 10 bulk requests with 1000 rows each)
  2. Administrative limits for the above-mentioned external systems?
  3. Can you normalize some of these rates? (For example you receive updates during the work hours, but send the 24/7) 
  4. What's the bulk error strategy (good to bad)?
    • You get an error list which you can easily match to individual records
    • You get an error list but you're unable to match it to the individual records
    • You get one (first/last/random) error
  5. Should you group your records before sending (can be useful for visual tracing, pinpointing error source)?
    • By some criteria (hospital, country, day, etc)
  6. What's the resend strategy in cause of errors?
    • Let's say you sent a batch of 1000 records: 999 ok, 1 failed. Can you easily resend only one record? Would sending a whole batch again break data consistency or just fail altogether? 
  7. Can you send the records individually?

Hey,

How do we know it's the last record in Snapshot as it is processing all records one by one

If I had to put a trace - This is the last record 

What syntax would I use?