Question
· Nov 15, 2023

What is the best practice for parsing RecordMap (batch) when DTL is insufficient?

Hello,

For a particular problem we were trying to parse a relatively large *.csv file with a RecordMap. We are doing this from a BPL, where we start with a REST call to acquire the file. The file needs a slight transformation, which we tried to do in a DTL. However, DTLs seem to be incapable of parsing larger files.

We see two alternatives: the first is to write custom code to parse the file; the second is to write SQL queries and write the output to a CSV file. We were wondering what the best practice is for parsing larger files, transforming the data, and filtering the data to different locations based on fields in the *.csv, and we would appreciate any input.
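
For reference, the custom-code option we have in mind would look roughly like the following. This is only a minimal sketch: it assumes the routing field is the third comma-separated column and that values never contain quoted commas, and the method and variable names are placeholders.

ClassMethod SplitByField(pInput As %Stream.Object) As %Status
{
    Set tSC = $$$OK
    Try {
        // Keep the header row so each split output starts with it
        Set tHeader = pInput.ReadLine()
        While 'pInput.AtEnd {
            Set tLine = pInput.ReadLine()
            Continue:tLine=""
            // Placeholder: route on the third column; real code would handle quoted commas
            Set tKey = $PIECE(tLine, ",", 3)
            If $DATA(tOut(tKey)) {
                Set tStream = tOut(tKey)
            } Else {
                Set tStream = ##class(%Stream.GlobalCharacter).%New()
                Do tStream.WriteLine(tHeader)
                Set tOut(tKey) = tStream
            }
            Do tStream.WriteLine(tLine)
        }
        // ... hand each tOut(key) stream to its target, e.g. wrapped in an Ens.StreamContainer
    } Catch ex {
        Set tSC = ex.AsStatus()
    }
    Quit tSC
}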

Thanks in advance!

Discussion (7)

Hello,

Thank you for the reply. However, I don't see this working for our case. We already have an element in the BPL that uses a RecordMap to make records out of the CSV and send them to the DTL. We can only get the CSV via a REST call, so using the Business Service for record maps is not really an option. If the best way to do this is without a record map, we would like to know that as well.
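
For context, the acquisition step looks roughly like the following. This is a minimal sketch from inside a class method; the host, path, and SSL configuration name are placeholders, not our real values.

    // Fetch the CSV over HTTPS and keep the response body as a stream for the BPL
    Set tRequest = ##class(%Net.HttpRequest).%New()
    Set tRequest.Server = "files.example.org"        // placeholder host
    Set tRequest.Https = 1
    Set tRequest.SSLConfiguration = "DefaultSSL"     // placeholder SSL config name
    Set tSC = tRequest.Get("/exports/data.csv")      // placeholder path
    If $$$ISOK(tSC) {
        // The response body is already a stream; wrap it so it can travel through the production
        Set tCsvStream = tRequest.HttpResponse.Data
        Set tContainer = ##class(Ens.StreamContainer).%New(tCsvStream)
    }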

Hi,

I am not sure I know exactly what the issue is. Is it speed, DB writes, or memory (FRAMESTACK) errors?

In the meantime, here are some things we've done to help with large files.

We have BPLs that process CSV files with 100k rows, and these things helped.

Hi Thom,

I suggest you don't use the record mapper at all, nor DTLs; neither is good for very wide CSVs. Use a file reader and save the input to a QuickStream. Pass the stream ID to a custom process which walks the QuickStream, changes each line, and saves the result to a new QuickStream, then pass the final QuickStream to the outbound. This prevents sending individual lines to processes or DTLs (which, if your CSV is huge, would be a massive overhead). You can contact me and I'd be happy to help.
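
Roughly, the custom process step would be something like the following. This is only a minimal sketch: a plain %Stream.GlobalCharacter stands in for the QuickStream, and the semicolon-to-comma swap is just a placeholder for whatever per-line change you actually need.

ClassMethod RewriteCsv(pInStreamId As %String, Output pOutStreamId As %String) As %Status
{
    Set tSC = $$$OK
    Try {
        Set tIn = ##class(%Stream.GlobalCharacter).%OpenId(pInStreamId)
        If '$ISOBJECT(tIn) {
            Set tSC = $$$ERROR($$$GeneralError, "input stream not found")
            Quit
        }
        Set tOut = ##class(%Stream.GlobalCharacter).%New()
        While 'tIn.AtEnd {
            Set tLine = tIn.ReadLine()
            // Placeholder per-line change: swap semicolons for commas
            Do tOut.WriteLine($REPLACE(tLine, ";", ","))
        }
        Set tSC = tOut.%Save()
        If $$$ISOK(tSC) Set pOutStreamId = tOut.%Id()
    } Catch ex {
        Set tSC = ex.AsStatus()
    }
    // Pass pOutStreamId to the outbound instead of sending line-by-line messages
    Quit tSC
}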

Hey Paul,

I half agree, assuming the OP's requirements turn out to be a file-to-file use case.

If not, I wanted to defend the record mapper solution a little so as not to put off other readers from this approach.

As an example, I recently implemented a solution that had a million-line CSV file. This would generate around 300,000 messages from a complex mapping. Each record was individually passed into a DTL, which in turn produced a master file update message. These were then pushed into the local EPR (as per the EPR's requirements).

Yes, that is many messages, but not what I would call an overhead when taken in the context of the frequency of, and need for, M16 messages.

Bottom line: the solution I selected was the most maintainable one. There was almost zero glue code to go wrong, and all maintenance is managed inside DTLs. This is exactly what the record mapper was designed for.