Member since Apr 15, 2016
Posts:
Mike has not published any posts yet.
Replies:
Mike Henderson · Apr 9, 2025

It appears the inbound text is double-encoded UTF-8; the problem character is the fancy quote (’). I've seen this in IRIS pipelines where UTF-8 data is read into a character stream without setting the TranslateTable and is then exported through a UTF-8 encoder (e.g. a REST call). The terminal session below reproduces the effect by encoding the same string twice:

%SYS>s x="Can’t"

%SYS>zzdump x
0000: 0043 0061 006E 2019 0074                                Can’t
%SYS>s y=$ZCVT(x,"O","UTF8")

%SYS>zw y
y="Canâ"_$c(128,153)_"t"
%SYS>w y
Can�t
%SYS>s z=$ZCVT(y,"O","UTF8")

%SYS>zw z
z="CanâÂ"_$c(128)_"Â"_$c(153)_"t"
%SYS>w z
Can�t
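
If the data really has been encoded twice, as in the session above, one possible repair is to decode it twice with $ZCONVERT in input mode. This is only a sketch: it assumes z still holds the double-encoded value and x the original string from the session, and the variable names once and fixed are made up for illustration. Applying it to text that was encoded only once would damage it, so check a sample first.

```objectscript
 // Assumes x and z are the variables from the terminal session above
 Set once = $ZCONVERT(z, "I", "UTF8")       // undo the outer encoding; same value as y
 Set fixed = $ZCONVERT(once, "I", "UTF8")   // undo the inner encoding; same value as x
 Write fixed = x                             // writes 1 when the original string is restored
```

This only cleans up data that is already damaged. The cause described above is still the missing TranslateTable when the stream is read, so that is probably the better place to fix it.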
Mike Henderson · Feb 17, 2025

For this specific case I recommend using the %SQL_Util.CSV stored procedure, since it handles CRLF on Linux correctly. Note that RFC 4180 specifies CRLF as the line terminator.

In situations where you need to read the file line by line yourself, simply trim trailing whitespace, including CR:

While 'file.AtEnd {
  Set line = file.ReadLine()
  // "<>W" strips leading and trailing whitespace; $C(13) adds CR to the characters removed
  Set line = $ZStrip(line, "<>W", $C(13))
  // ...
}
Mike Henderson · Mar 18, 2024

FWIW, I was able to pull this successfully on an Apple M2 running macOS 14.4 with Docker Engine 25.0.3.
