Replies by Ahmed Razek for InterSystems Developer Community

Ahmed Razek · Oct 20, 2017

Hi Rajesh,

It looks like there are few things going on here:

you need to parse the csv file from the stream and it looks like the csv file contains quoted fields, and lost of trailing and leading spaces (or tabs)
you need to be able to extract the relevant data element into some array or global for later processing

The following code should give some guidance in achieving what you want.

The code is based on a simple csv file based on your question. this is the contents on the sample.csv file

Appt ID , Patient ID , Patient Name    , Program ,   Appt Date , Appt Start
102 ,              "10""02" ,    "Gene, Jill"       ,  HTR          , 06/06/2017 , 10:00
103 ,            1044 ,    "Smith, John" ,  HTR          , 06/10/2017 , 08:00
104 ,            1002 ,    "Gene, Jill"       ,  HTR          , 06/06/2017 , 10:00
105 ,            1044 ,    "Smith, John" ,  HTR          , 06/10/2017 , 08:00
106 ,            1002 ,    "Gene, Jill"       ,  HTR          , 06/06/2017 , 10:00
107 ,            1044 ,    "Smith, John" ,  HTR          , 06/10/2017 , 08:00
108 ,            1002 ,    "Gene, Jill"       ,  HTR          , 06/06/2017 , 10:00
109 ,            1044 ,    "Smith, John" ,  HTR          , 06/10/2017 , 08:00
110 ,            1002 ,    "Gene, Jill"       ,  HTR          , 06/06/2017 , 10:00

Notice in the first line the second field (10""02) contains a quote within the cell.

Notice that the name field contains a "," within the cell

Here is the Code:

LoadCSVFile(file)
    n stream,count,rec
    s stream = ##class(%FileCharacterStream).%New()
     s stream.Filename = file
     s count = 0,rec="" f  d  q:stream.AtEnd
     . s rec   =  $$ParseCSV(stream.ReadLine())
     . i count s ^SCRATCH(count) = rec
     . s count = count + 1
     q
     ;
ParseCSV(str,delim) 
    i $g(delim)="" s delim =$c(0)
     n i,qual,nstr,fstr,on
     s qual="""",on=0,nstr="",fstr=""
     f i=1:1:$l(str) d
     . i $e(str,i)'=qual,'on s nstr=nstr_$replace($e(str,i),",",delim) q
     . i $e(str,i)=qual,'on  s on=1 s nstr=nstr_$e(str,i) q
     . i $e(str,i)'=qual,on  s nstr=nstr_$e(str,i) q
     . i $e(str,i)=qual,on   s on=0 s nstr=nstr_$e(str,i)
     f i=1:1:$l(nstr,delim) d 
     . s $p(nstr,delim,i) = $$trim($p(nstr,delim,i))
     . i $e($p(nstr,delim,i))=qual,$e($p(nstr,delim,i),$l($p(nstr,delim,i)))=qual d  q
     .. s fstr=fstr_delim_$e($p(nstr,delim,i),2,($l($p(nstr,delim,i))-1)) q
    . s fstr=fstr_delim_$p(nstr,delim,i)
    q $e($replace(fstr,"""""",""""),($l(delim)+1),$l(fstr))
    ;
trim(t) ;trim
     n i,j f i=1:1:$l(t) q:" "_$c(9)'[$e(t,i)
     f j=$l(t):-1:1 q:" "_$c(9)'[$e(t,j)
     q $e(t,i,j)
    ;

Calling this Function should give us an output that looks like this :

 
USER>k ^SCRATCH
 
USER>D LoadCSVFile("SAMPLE.CSV")
 
USER>zw ^SCRATCH
^SCRATCH(1)="102"_$c(0)_"10""02"_$c(0)_"Gene, Jill"_$c(0)_"HTR"_$c(0)_"06/06/2017"_$c(0)_"10:00"
^SCRATCH(2)="103"_$c(0)_"1044"_$c(0)_"Smith, John"_$c(0)_"HTR"_$c(0)_"06/10/2017"_$c(0)_"08:00"
^SCRATCH(3)="104"_$c(0)_"1002"_$c(0)_"Gene, Jill"_$c(0)_"HTR"_$c(0)_"06/06/2017"_$c(0)_"10:00"
^SCRATCH(4)="105"_$c(0)_"1044"_$c(0)_"Smith, John"_$c(0)_"HTR"_$c(0)_"06/10/2017"_$c(0)_"08:00"
^SCRATCH(5)="106"_$c(0)_"1002"_$c(0)_"Gene, Jill"_$c(0)_"HTR"_$c(0)_"06/06/2017"_$c(0)_"10:00"
^SCRATCH(6)="107"_$c(0)_"1044"_$c(0)_"Smith, John"_$c(0)_"HTR"_$c(0)_"06/10/2017"_$c(0)_"08:00"
^SCRATCH(7)="108"_$c(0)_"1002"_$c(0)_"Gene, Jill"_$c(0)_"HTR"_$c(0)_"06/06/2017"_$c(0)_"10:00"
^SCRATCH(8)="109"_$c(0)_"1044"_$c(0)_"Smith, John"_$c(0)_"HTR"_$c(0)_"06/10/2017"_$c(0)_"08:00"
^SCRATCH(9)="110"_$c(0)_"1002"_$c(0)_"Gene, Jill"_$c(0)_"HTR"_$c(0)_"06/06/2017"_$c(0)_"10:00"

Few key points to keep in mind :

The ParseCSV function takes a second argument, which the delimiter that you want fields to be separated by. the default is $c(0) if you don't pass anything ( $c(0) works with csv's as it's not a printable character ) but you can choose whatever delimiter you want
We are skipping the first line of the CSV file because it's the header and who needs that
You don't have to store the result in a global obviously, so you can store it in an array instead. simply instead of:

       . i count s ^SCRATCH(count) = rec

you would do something like

       . i count s ARRAY(count) = rec

You don't have to save the record as a counter, you can simple store the id field of the record instead. something like this: (granted that you make sure it's not empty first or you'll get a subscript error)

       . i count,$p(rec,$c(0))]"" s ARRAY($p(rec,$c(0)) = rec

Also notice that the ParseCSV function took care of the quoted fields, the quote in 10""2 and the leading and trailing spaces.

Hope this helps.

Ahmed