"garbled text" due to incorrect character encoding or decoding.
I am receiving garbled text caused by incorrect encoding or decoding. I tried to use the $ZCONVERT function to convert it back into normal text, but it did not work. Can anybody suggest what I should use to convert it into normal text?
Example: Garbled text that I am getting is "canââ¬â¢t , theyââ¬â¢re".
Comments
Where are you receiving that text? In a business service? From a REST API?
Yes, that's correct. It's coming from the REST API in a JSON payload.
To declare a specific charset for a REST API (I guess you are using a class that extends %CSP.REST), you have to define the following parameter:
Parameter CHARSET = "utf-8";
In my case I was receiving UTF-8 text; you should configure it with your specific charset.
I tried this, but it did not make any difference. When I tried to convert the string "theyââ¬â¢re" to normal text with $ZCONVERT, I got the result below.
Example:
set rawString="theyââ¬â¢re"
USER>Set fixedString = $ZCONVERT(rawString, "I", "UTF8")
USER>write fixedString
theyâ?re
So my question is how to remove the â? and get the exact text. This is not the only text that arrives this way; we can get other characters with the same encoding problem as well.
In my opinion, the problem is that your text is already UTF-8 but contains the wrong characters because of an earlier incorrect decode, so you can't fix it with a single $ZCONVERT. You should define the charset of the data in the class that receives the JSON, so that the JSON is decoded properly.
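To illustrate the point about an earlier wrong decode, here is a minimal Python sketch (Python rather than ObjectScript, only so it can be run anywhere). It assumes the wrong charset applied upstream was Windows-1252; the thread does not say which charset was actually used, and the helper names `mis_decode` and `undo_mis_decode` are made up for this example.

```python
# Assumption: the upstream component decoded UTF-8 bytes as Windows-1252.
# The thread does not confirm which charset was really involved.

def mis_decode(text: str, wrong_charset: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong charset."""
    return text.encode("utf-8").decode(wrong_charset)


def undo_mis_decode(text: str, wrong_charset: str = "cp1252") -> str:
    """Reverse mis_decode: recover the original bytes, then decode as UTF-8."""
    return text.encode(wrong_charset).decode("utf-8")


original = "they\u2019re"          # they’re, with a curly apostrophe (U+2019)
garbled = mis_decode(original)     # the apostrophe becomes three characters
repaired = undo_mis_decode(garbled)

print(garbled)   # theyâ€™re
print(repaired)  # they’re
```

The repair only works if the same wrong charset is used in reverse, which is why fixing the charset at the point where the JSON is received is the more reliable option.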
It appears the inbound text is double-encoded UTF-8 - the problem character is the fancy quote (’). I've seen this in IRIS pipelines where UTF-8 data is read into a character stream without setting the TranslateTable, then exported through a UTF-8 encoder (e.g. a REST call).
%SYS>s x="Can’t"
%SYS>zzdump x
0000: 0043 0061 006E 2019 0074                                Can’t
%SYS>s y=$ZCVT(x,"O","UTF8")
%SYS>zw y
y="Canâ"_$c(128,153)_"t"
%SYS>w y
Can�t
%SYS>s z=$ZCVT(y,"O","UTF8")
%SYS>zw z
z="CanÃ¢Â"_$c(128)_"Â"_$c(153)_"t"
%SYS>w z
Can�t
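The session above can be reproduced outside IRIS, and the same reasoning suggests a repair: if the text was UTF-8 encoded twice, decoding it twice should bring it back. The Python sketch below is only an approximation of what `$ZCVT(x,"O","UTF8")` and `$ZCVT(x,"I","UTF8")` do (one character per UTF-8 byte, modelled here with Latin-1); the function names are hypothetical, and it assumes the data really went through exactly two encoding passes.

```python
# Sketch only: models $ZCVT(...,"O","UTF8") / $ZCVT(...,"I","UTF8") by
# mapping each UTF-8 byte to the character with the same code point (Latin-1).

def utf8_encode_as_chars(text: str) -> str:
    """Approximate $ZCVT(text,"O","UTF8"): one character per UTF-8 byte."""
    return text.encode("utf-8").decode("latin-1")


def utf8_decode_from_chars(text: str) -> str:
    """Approximate $ZCVT(text,"I","UTF8"): treat each character as one byte."""
    return text.encode("latin-1").decode("utf-8")


x = "Can\u2019t"                 # Can’t
y = utf8_encode_as_chars(x)      # encoded once, like y in the session
z = utf8_encode_as_chars(y)      # encoded twice, like z in the session

# Undo both passes, innermost last.
repaired = utf8_decode_from_chars(utf8_decode_from_chars(z))
print(repaired == x)  # True
```

In ObjectScript terms this corresponds to calling `$ZCVT(...,"I","UTF8")` twice on the received string. It will not help if one of the passes went through a different charset such as Windows-1252, so check a hex dump of the incoming data first.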