Question Evgeny Shvarov · Jan 15, 2023

CSV to CSV data transformation using Interoperability

#InterSystems IRIS #CSV #Interoperability #Key Question

Hi folks!

Have a question for those who are masters of interoperability.

I have a basic task: I have one CSV file with some data, and I need to transform one column of the initial dataset and get a new CSV with the same structure.

What's the best approach with Interoperability?

Should I use the Record Mapper?

Should I use streams or objects?

What is the best practice?

Discussion (12)

Comments

Oliver Wilms · Jan 15, 2023

I would not create a record map unless you want to make sure the data is in the correct format. If you already have record maps defined, I see no issue with using them.

Evgeny Shvarov Jan 15, 2023 to Oliver Wilms

I don't plan to use record maps at all. The idea is to use DTL for every record.

Oliver Wilms Jan 15, 2023 to Evgeny Shvarov

I created a BPL that reads one line at a time. I tried updating the data line in code, without DTL, but I get an error instead of an output file. Please take a look at the git repo here (just start the production and look for messages from the File Passthrough Service):

https://github.com/oliverwilms/interoperability-update-datafile

Robert Barbiaux · Jan 16, 2023

If you need to process the entire file without filtering lines, I would use the pass-through file service (EnsLib.File.PassthroughService) to send a stream container (Ens.StreamContainer) to either a message router (EnsLib.MsgRouter.RoutingEngine) or a custom process (BPL or code). A transform (a class extending Ens.DataTransform) then converts the source stream container into a target stream container, which is sent to the pass-through file operation (EnsLib.File.PassthroughOperation) for output.

I would choose a custom process over a message router if the transform needs data sources other than the input file (e.g. a response from another process or operation). The transform can pick a suitable target stream class (extending %Stream.Object) to hold in the Ens.StreamContainer, depending on where you want to store the data (database vs. file system, …).
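
To make the whole-file idea concrete, here is a minimal sketch in Python rather than ObjectScript. It is not InterSystems code: transform_stream is a hypothetical function, the two in-memory streams stand in for the source and target stream containers, and the sample data assumes the CSV starts with a header row.

```python
import csv
import io


def transform_stream(source, target, column, func):
    """Copy a CSV stream to another stream, rewriting one named column."""
    reader = csv.reader(source)
    writer = csv.writer(target, lineterminator="\n")
    header = next(reader, None)
    if header is None:
        return  # empty input: nothing to write
    idx = header.index(column)  # assumes the first row holds column names
    writer.writerow(header)
    for row in reader:
        row[idx] = func(row[idx])  # only this column changes
        writer.writerow(row)


# Hypothetical sample data standing in for the input file.
src = io.StringIO("id,name\n1,ann\n2,bob\n")
dst = io.StringIO()
transform_stream(src, dst, "name", str.upper)
result = dst.getvalue()
```

The whole file travels as one unit here, which mirrors one message per file; the per-column change is the only logic the transform has to carry.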

HTH

Evgeny Shvarov Jan 18, 2023 to Robert Barbiaux

@Robert Barbiaux, this is very cool!

In fact, what I plan to do is introduce the idea of data transformation to newcomers in the simplest possible manner.

I wanted every line to be a message containing data that will be transformed via a rule.

I understand that in real-life interoperability cases one message should be one file/stream, but the purpose here is to explain how the engine works.

Robert Barbiaux Jan 21, 2023 to Evgeny Shvarov

For a simple message transformation flow example, I would go for a record map. You get:

a ready-made service (EnsLib.RecordMap.Service.FileService) and operation (EnsLib.RecordMap.Operation.FileOperation) from the library to read and write CSV files, and
a mechanism that generates an appropriate message class from a declarative definition (see Using the Record Mapper | Developing Productions | InterSystems IRIS Data Platform 2022).

So you can focus on the DTL, and the whole flow can be built from the administration portal: look ma, no code ;-)
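
For readers who want to see the shape of the per-record flow outside the portal, here is a rough Python analogy, not InterSystems code. PersonRecord plays the role of the generated message class, transform plays the role of the DTL, and parse_line / format_line stand in for the file service and operation. All names and the three-column layout are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PersonRecord:
    """Stand-in for the message class generated from a record map."""
    id: str
    name: str
    city: str


def parse_line(line: str) -> PersonRecord:
    # Naive split: assumes no quoted fields containing commas.
    id_, name, city = line.rstrip("\n").split(",")
    return PersonRecord(id_, name, city)


def transform(source: PersonRecord) -> PersonRecord:
    # Copy every field from the source and change a single column.
    return replace(source, city=source.city.upper())


def format_line(record: PersonRecord) -> str:
    return ",".join([record.id, record.name, record.city])


# One message per line: parse, transform, write.
lines = ["1,Ann,paris", "2,Bob,oslo"]
output = [format_line(transform(parse_line(line))) for line in lines]
```

Each line becomes its own message, so the transform only ever sees one record at a time.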

Evgeny Shvarov Jan 21, 2023 to Robert Barbiaux

Indeed, it works! Thank you, Robert!

One issue though: the operation produces one new file per message/record in the source file. Is there any way to make the production write all records from the initial file into a single output file?

Robert Barbiaux Jan 22, 2023 to Evgeny Shvarov

The record map file operation appends records to the output file. The initial value of the 'Filename' setting is '%Q', hence you get one file per timestamp.

If you set 'Filename' to '%f', the output file name will be the same as the input file name, and all records from one input file will be appended to a single output file with that name.
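
A small Python sketch (not InterSystems code) of why the naming pattern decides how many files you get: each record is appended to whichever file its name resolves to. The two lambdas are hypothetical stand-ins for naming by timestamp and naming by source file, and the records and file name are made up for illustration.

```python
from collections import defaultdict


def write_records(records, name_for):
    """Append each record's line to the output 'file' chosen by name_for."""
    outputs = defaultdict(list)  # file name -> appended lines
    for rec in records:
        outputs[name_for(rec)].append(rec["line"])
    return dict(outputs)


# Two records read from the same (hypothetical) input file.
records = [
    {"source": "people.csv", "ts": "2023-01-22_13.10.49.784", "line": "1,Ann"},
    {"source": "people.csv", "ts": "2023-01-22_13.10.49.801", "line": "2,Bob"},
]

by_timestamp = write_records(records, lambda r: r["ts"])   # one file per record
by_source = write_records(records, lambda r: r["source"])  # one file per input
```

Naming by timestamp yields a new name for nearly every record, while naming by source file makes all records of one input land in the same output.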

Evgeny Shvarov Jan 22, 2023 to Robert Barbiaux

Thank you, Robert!

This could work, but for some reason '%f' doesn't work with the record mapper:

I get a <NOTOPEN> error if the setting is just '%f',

and if I use the default FileOperation setting, '%f_%Q%!+(_a)', I get a file name that starts with the '_' symbol and looks like this:

_2023-01-22_13.10.49.784

Maybe there is a way to update this setting on the fly somehow? E.g. with a callback?

Evgeny Shvarov Jan 23, 2023 to Robert Barbiaux

Never mind.

It turned out I wasn't copying %Source from the source record to the target record in the transformation, so there was no file name available for the resulting file.

The only question left: how do I handle the header line in such a production, if that is possible?
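
The symptoms above are consistent with an empty source file name, as this simplified Python sketch suggests. It is not the actual InterSystems implementation: expand_filename is a hypothetical helper that handles only '%f' and '%Q', simply drops the '%!+(_a)' uniqueness suffix, and uses a made-up file name 'people.csv'.

```python
from datetime import datetime


def expand_filename(spec, source_name, now):
    """Expand a simplified subset of the file name specification."""
    timestamp = now.strftime("%Y-%m-%d_%H.%M.%S.") + f"{now.microsecond // 1000:03d}"
    spec = spec.replace("%!+(_a)", "")  # uniqueness suffix ignored in this sketch
    return spec.replace("%f", source_name).replace("%Q", timestamp)


now = datetime(2023, 1, 22, 13, 10, 49, 784000)

# Source file name carried through the transformation.
with_source = expand_filename("%f_%Q%!+(_a)", "people.csv", now)
# Source file name lost: '%f' expands to nothing.
without_source = expand_filename("%f_%Q%!+(_a)", "", now)
only_f_without_source = expand_filename("%f", "", now)
```

With no source name, the default pattern yields a name beginning with '_', and a bare '%f' yields an empty name, which would plausibly fail to open.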

Robert Barbiaux Jan 24, 2023 to Evgeny Shvarov

For simple headers (and footers), you can use the 'batch class' feature of the record map batch file service (EnsLib.RecordMap.Service.BatchFileService) and operation (EnsLib.RecordMap.Operation.BatchFileOperation), together with a batch class such as EnsLib.RecordMap.SimpleBatch, to specify a header string.
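
Conceptually, a batch keeps the header apart from the records, so the header is read once and written once while each record is still transformed on its own. A minimal Python sketch of that idea follows; it is not InterSystems code, and read_batch, write_batch and the sample data are hypothetical.

```python
def read_batch(text, has_header=True):
    """Split file content into a header line and a list of record lines."""
    lines = text.splitlines()
    if has_header and lines:
        return lines[0], lines[1:]
    return None, lines


def write_batch(header, records):
    """Write the header once (if any), followed by every record."""
    lines = ([header] if header is not None else []) + list(records)
    return "".join(line + "\n" for line in lines)


# Hypothetical input with a header line.
header, records = read_batch("id,name\n1,ann\n2,bob\n")
# Transform records only; the header is passed through untouched.
output = write_batch(header, [record.upper() for record in records])
```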

Developer Community Admin · Feb 15, 2023

💡 This question is considered a Key Question. More details here.
