Question
· Oct 20, 2017

Extract data values from Ens.StreamContainer object

Hi,

I am receiving a CSV file as Ens.StreamContainer using pRequest As Ens.StreamContainer

In the studio code, pRequest.Stream.ReadLine()  returns following records when read in a while loop using pRequest.Stream.AtEnd = 0

Appt ID , Patient ID , Patient Name    , Program ,   Appt Date , Appt Start

       102 ,            1002 ,    "Gene, Jill"       ,  HTR          , 06/06/2017 , 10:00

       103 ,            1044 ,    "Smith, John" ,  HTR          , 06/10/2017 , 08:00

Question:  How do I extract individual data values from the pRequest.Stream  Stream Container? As in 102, 1002, "Gene,Jill" as 3 separate data elements in the studio code

Thanks

Rajesh

Discussion (2)0
Log in or sign up to continue

Hi Rajesh,

It looks like there are few things going on here:

  1. you need to parse the csv file from the stream and it looks like the csv file contains quoted fields, and lost of trailing and leading spaces (or tabs)
  2. you need to be able to extract the relevant data element into some array or global for later processing

The following code should give some guidance in achieving what you want. 

The code is based on a simple csv file based on your question. this is the contents on the sample.csv file

Appt ID , Patient ID , Patient Name    , Program ,   Appt Date , Appt Start
102 ,              "10""02" ,    "Gene, Jill"       ,  HTR          , 06/06/2017 , 10:00
103 ,            1044 ,    "Smith, John" ,  HTR          , 06/10/2017 , 08:00
104 ,            1002 ,    "Gene, Jill"       ,  HTR          , 06/06/2017 , 10:00
105 ,            1044 ,    "Smith, John" ,  HTR          , 06/10/2017 , 08:00
106 ,            1002 ,    "Gene, Jill"       ,  HTR          , 06/06/2017 , 10:00
107 ,            1044 ,    "Smith, John" ,  HTR          , 06/10/2017 , 08:00
108 ,            1002 ,    "Gene, Jill"       ,  HTR          , 06/06/2017 , 10:00
109 ,            1044 ,    "Smith, John" ,  HTR          , 06/10/2017 , 08:00
110 ,            1002 ,    "Gene, Jill"       ,  HTR          , 06/06/2017 , 10:00

Notice in the first line the second field (10""02) contains a quote within the cell. 

Notice that the name field contains a "," within the cell

Here is the Code:

LoadCSVFile(file)
    n stream,count,rec
    s stream = ##class(%FileCharacterStream).%New()
     s stream.Filename = file
     s count = 0,rec="" f  d  q:stream.AtEnd
     . s rec   =  $$ParseCSV(stream.ReadLine())
     . i count s ^SCRATCH(count) = rec
     . s count = count + 1
     q
     ;
ParseCSV(str,delim) 
    i $g(delim)="" s delim =$c(0)
     n i,qual,nstr,fstr,on
     s qual="""",on=0,nstr="",fstr=""
     f i=1:1:$l(str) d
     . i $e(str,i)'=qual,'on s nstr=nstr_$replace($e(str,i),",",delim) q
     . i $e(str,i)=qual,'on  s on=1 s nstr=nstr_$e(str,i) q
     . i $e(str,i)'=qual,on  s nstr=nstr_$e(str,i) q
     . i $e(str,i)=qual,on   s on=0 s nstr=nstr_$e(str,i)
     f i=1:1:$l(nstr,delim) d 
     . s $p(nstr,delim,i) = $$trim($p(nstr,delim,i))
     . i $e($p(nstr,delim,i))=qual,$e($p(nstr,delim,i),$l($p(nstr,delim,i)))=qual d  q
     .. s fstr=fstr_delim_$e($p(nstr,delim,i),2,($l($p(nstr,delim,i))-1)) q
    . s fstr=fstr_delim_$p(nstr,delim,i)
    q $e($replace(fstr,"""""",""""),($l(delim)+1),$l(fstr))
    ;
trim(t) ;trim
     n i,j f i=1:1:$l(t) q:" "_$c(9)'[$e(t,i)
     f j=$l(t):-1:1 q:" "_$c(9)'[$e(t,j)
     q $e(t,i,j)
    ;

Calling this Function should give us an output that looks like this :

 
USER>k ^SCRATCH
 
USER>D LoadCSVFile("SAMPLE.CSV")
 
USER>zw ^SCRATCH
^SCRATCH(1)="102"_$c(0)_"10""02"_$c(0)_"Gene, Jill"_$c(0)_"HTR"_$c(0)_"06/06/2017"_$c(0)_"10:00"
^SCRATCH(2)="103"_$c(0)_"1044"_$c(0)_"Smith, John"_$c(0)_"HTR"_$c(0)_"06/10/2017"_$c(0)_"08:00"
^SCRATCH(3)="104"_$c(0)_"1002"_$c(0)_"Gene, Jill"_$c(0)_"HTR"_$c(0)_"06/06/2017"_$c(0)_"10:00"
^SCRATCH(4)="105"_$c(0)_"1044"_$c(0)_"Smith, John"_$c(0)_"HTR"_$c(0)_"06/10/2017"_$c(0)_"08:00"
^SCRATCH(5)="106"_$c(0)_"1002"_$c(0)_"Gene, Jill"_$c(0)_"HTR"_$c(0)_"06/06/2017"_$c(0)_"10:00"
^SCRATCH(6)="107"_$c(0)_"1044"_$c(0)_"Smith, John"_$c(0)_"HTR"_$c(0)_"06/10/2017"_$c(0)_"08:00"
^SCRATCH(7)="108"_$c(0)_"1002"_$c(0)_"Gene, Jill"_$c(0)_"HTR"_$c(0)_"06/06/2017"_$c(0)_"10:00"
^SCRATCH(8)="109"_$c(0)_"1044"_$c(0)_"Smith, John"_$c(0)_"HTR"_$c(0)_"06/10/2017"_$c(0)_"08:00"
^SCRATCH(9)="110"_$c(0)_"1002"_$c(0)_"Gene, Jill"_$c(0)_"HTR"_$c(0)_"06/06/2017"_$c(0)_"10:00"

Few key points to keep in mind :

  • The ParseCSV function takes a second argument, which the delimiter that you want fields to be separated by. the default is $c(0) if you don't pass anything ( $c(0) works with csv's as it's not a printable character ) but you can choose whatever delimiter you want
  • We are skipping the first line of the CSV file because it's the header and who needs that
  • You don't have to store the result in a global obviously, so you can store it in an array instead. simply instead of:
       . i count s ^SCRATCH(count) = rec

                 you would do something like

       . i count s ARRAY(count) = rec
  • You don't have to save the record as a counter, you can simple store the id field of the record instead. something like this: (granted that you make sure it's not empty first or you'll get a subscript error)
       . i count,$p(rec,$c(0))]"" s ARRAY($p(rec,$c(0)) = rec
  • Also notice that the ParseCSV function took care of the quoted fields, the quote in 10""2 and the leading and trailing spaces. 

Hope this helps. 

Ahmed

You need to parse CSV. Here's how to do that in Caché:

Provided you have this csv:

car,2000,100.51,27.10.2016,
phone,2003,65.8,15.01.2017,

You can import it into class Test.CSV:

1. Generate a persistent class Test.CSV

set rowtype = "name VARCHAR(50),year INTEGER,amount NUMERIC(9,2),date DATE"
set filename = "D:\data.csv"
do ##class(%SQL.Util.Procedures).CSVTOCLASS(2, .rowtype, filename,,,,"Test.CSV")

2. Import file or files

do ##class(Test.CSV).Import(2,filename)

Usually you can't import your CSV right away - the dates are in a different format, etc. You need to modify Import method and property definitions. For example I often:

  • Add FORMAT=4 property parameter for dates to import dates in dd/mm/yyyy format.
  • Find&replace Library.
  • Add this else line in Import method:
if $$$ISOK(tStatus) { set tCounter = tCounter + 1 } else { w $System.Status.GetErrorText(tStatus) return}

If you have a tab separated file, you need to change Import method signature from:

pDelimiter As %String = ","

to:

pDelimiter As %String = {$c(9)}

You may need to additionally modify Import method to accept streams instead of files.