Reading a file and translating the content from UTF8 to 8-bit

Hi,

I need to read a UTF8 encoded text file and translate the content to 8-bit.

Using the %File class and $ZCVT(TXT,"I","UTF8") works, but if the content is longer than the maximum string length (32000) and we cut it into max-string chunks, we can get a <TRANSLATE> error when a cut falls at the "wrong" point, for example in the middle of a multi-byte character.

Is there a better way to do this task?

My code looks like this:

    S file=##class(%File).%New(..LocalFileName)
    D file.Open("R")
    While 'file.AtEnd {    
        S Line=$ZCVT(Line,"I","UTF8")
    }
    D file.Close()

and an example of such an error:

USER>s str=$C(215)
USER>w $ZCVT(str,"I","UTF8")
W $ZCVT(str,"I","UTF8")
^
<TRANSLATE>

Regards,

Nael

Answers

I would recommend using a more suitable class for this: %Stream.FileCharacter, where you can set the TranslateTable property.

Set stream=##class(%Stream.FileCharacter).%New()
Set sc=stream.LinkToFile("c:\myfile.txt")
Set stream.TranslateTable = "UTF8"
While 'stream.AtEnd {
	Set line=stream.Read()
	; Process the chunk here
}

And you don't need any conversions after that.
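A slightly fuller sketch of the same approach, with status checks and an explicit chunk size. The class name Demo.Utf8Reader, the method name ReadFile and the 16000-character chunk size are assumptions made for illustration; only %Stream.FileCharacter, LinkToFile, TranslateTable, AtEnd and Read come from the answer above.

```objectscript
/// Hypothetical helper class, for illustration only.
Class Demo.Utf8Reader Extends %RegisteredObject
{

/// Reads a UTF-8 file in chunks; the stream does the decoding.
ClassMethod ReadFile(path As %String) As %Status
{
    Set stream = ##class(%Stream.FileCharacter).%New()
    Set sc = stream.LinkToFile(path)
    If $System.Status.IsError(sc) { Quit sc }

    // Let the stream translate from UTF-8 while reading.
    Set stream.TranslateTable = "UTF8"

    While 'stream.AtEnd {
        // Assumed chunk size; anything below the max string length works.
        Set len = 16000
        Set chunk = stream.Read(.len, .sc)
        // Argumentless Quit only leaves the loop; sc is returned below.
        If $System.Status.IsError(sc) { Quit }
        // Process the already-translated chunk here.
    }
    Quit sc
}

}
```

It could then be called as Set sc=##class(Demo.Utf8Reader).ReadFile("c:\myfile.txt"). Because the stream decodes as it reads, the calling code never has to deal with a multi-byte character split between two chunks.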

Enable long strings in the System Management Portal to get strings of up to about 3.4 MB:

System > Configuration > Memory and Startup

Thank you Robert,

We are aware of this parameter, and we have changed it on some of our servers.

But right now we need to write code that works even when it is not enabled.

Regards,

Nael

Nael,

I think you need to use the 4th argument of $ZCONVERT:

Set file=##class(%File).%New(..LocalFileName)
Do file.Open("R")
Set handle=""
While 'file.AtEnd {
    Set Line=$ZCVT(file.Read(), "I", "UTF8", handle)
    // do something with Line
}
Do file.Close()

Handle "contains the remaining portion of string that could not be converted at the end of $ZCONVERT, and supplies this remaining portion to the next invocation of $ZCONVERT."

Please see the reference for $ZCONVERT.
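A small sketch of what the handle is meant to do, using the same lead byte as the error example in the question. The byte pair 215,144 is the UTF-8 encoding of U+05D0 (Hebrew alef). The variable names are made up for illustration, and the behaviour described in the comments is what the quoted documentation implies rather than verified output. What the decoded character looks like also depends on whether the instance is Unicode or 8-bit, and on its locale.

```objectscript
    // UTF-8 bytes of U+05D0, split as if a chunk boundary fell between them.
    Set bytes = $Char(215, 144)
    Set handle = ""

    // Only the lead byte is available. With a handle, the incomplete
    // sequence should be kept in handle instead of raising <TRANSLATE>.
    Set part1 = $ZConvert($Extract(bytes, 1), "I", "UTF8", handle)

    // The continuation byte arrives; the pending byte held in handle
    // is prepended and the character can be completed.
    Set part2 = $ZConvert($Extract(bytes, 2), "I", "UTF8", handle)

    Write "part1 length: ", $Length(part1), !
    Write "part2 length: ", $Length(part2), !
```

Without the fourth argument, the first call is the same situation as the $C(215) example in the question, which ends in <TRANSLATE>.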

I like that!

BUT: in the original request I seem to miss the file.Read() to fill Line somewhere in the loop.

Thanks Alexander!

That's a very useful tip. I should have read the whole $ZCONVERT documentation.

Still, I think the most elegant solution to this specific problem is Dmitry's suggestion: not using $ZCVT at all, but using the TranslateTable property of %Stream.FileCharacter.

Regards,

Nael