Replace Non-ASCII Characters with a Value
Hi All
I am writing a class that takes a general HL7 message and replaces the non-ASCII characters in all of its segments to construct another message. The following solution works to an extent; however, it inserts \r at the end of the segments that have been transformed. Firstly, is this the correct approach, and secondly, how do I get rid of the \r at the end of each affected segment?
Thank you for your help.
ClassMethod Transform(source As EnsLib.HL7.Message, Output target As EnsLib.HL7.Message) As %Status
{
    Set $ZT="Trap",tSC=$$$OK
    Do {
        Try {
            #Dim tMSHin As EnsLib.HL7.Segment
            #Dim tSegIn As EnsLib.HL7.Segment
            #Dim tSegOut As EnsLib.HL7.Segment
            #Dim tmp As EnsLib.HL7.Segment
            #Dim tSegString as %String
            #Dim NonASCIIChar as %String="©"
            #Dim ASCIIChar as %String="(c)"
            Set target = ##class(EnsLib.HL7.Message).%New()
            Set:source.DocType'="" target.DocType=source.DocType
            Set tMSHin = source.GetSegmentAt(1, .tSC)
            For i = 1:1:tMSHin.Count
            {
                Set tSC = target.SetValueAt(tMSHin.GetValueAt(i),"1:"_i)
            }
            //get the data lookup characters
            //check each segment for each of the character
            For i = 2:1:source.SegCount
            {
                Set tSegIn = source.GetSegmentAt(i,.tSC)
                Set tSegOut = tSegIn.%ConstructClone()
                //get all the fields within the segment
                set tIndex =0
                while $FIND(tSegIn.OutputToString(),"©",tIndex) > 0 {
                    set mystr =$Replace(tSegIn.OutputToString(),"©","(C)")
                    set tSegOut =##class(EnsLib.HL7.Segment).ImportFromString(mystr,.tSC)
                    do tSegIn.SetValueAt(tSegOut,i)
                    set tIndex = $FIND(tSegIn.OutputToString(),"©",tIndex)+1
                }
                set tSC = target.AppendSegment(tSegOut)
            }
        }
        Catch ex
        {
            Set tSC = ex.AsStatus()
        }
    } while (0)
Exit
    Quit tSC
Trap
    Set $ZT="",tSC=$$$EnsSystemError
    Goto Exit
}
Kind Regards,
Salma
Comments
Converting the segment to a string, applying $Replace(), and then rebuilding the segment with EnsLib.HL7.Segment.ImportFromString() is a reasonable approach. It is consistent with the cited examples, one of which uses ImportFromString() to create an HL7 segment from a string. [1]
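As a minimal sketch of that round trip, assuming the same calls the question already uses (OutputToString() with no arguments and ImportFromString(string, .status)): $Replace() substitutes every occurrence in one pass, so the $FIND loop is not needed. The method name ReplaceInSegment is hypothetical.

```objectscript
/// Hypothetical helper: returns a copy of pSegIn with every "©" replaced by "(C)".
ClassMethod ReplaceInSegment(pSegIn As EnsLib.HL7.Segment, Output pStatus As %Status) As EnsLib.HL7.Segment
{
    Set pStatus = $$$OK
    Set tText = pSegIn.OutputToString()
    // The [ operator tests whether the left string contains the right string.
    If tText [ "©" {
        // $Replace handles all occurrences at once; no search loop is required.
        Set tText = $Replace(tText, "©", "(C)")
        Set tSegOut = ##class(EnsLib.HL7.Segment).ImportFromString(tText, .pStatus)
    } Else {
        // Nothing to change: hand back an independent copy of the input.
        Set tSegOut = pSegIn.%ConstructClone()
    }
    Quit tSegOut
}
```

Inside the segment loop this would be called as Set tSegOut = ..ReplaceInSegment(tSegIn, .tSC) before AppendSegment().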
The \r you are seeing at the end of the segment string is expected in HL7. Carriage return (CR, ASCII 13) is the normal HL7 segment terminator, and a related discussion confirms that seeing \r as the segment terminator is normal HL7 behavior. [1][2]
As for removing that trailing \r, the cited sources do not document a supported method for stripping it from the result of OutputToString() in this use case. They establish only that CR is the normal segment terminator and that ImportFromString() is used to reconstruct a segment from a string. [1]
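If the terminator does need to be dropped from a serialized segment string before further processing, one option is ordinary string handling. The sketch below is plain ObjectScript, not an HL7 class feature, and the method name StripTrailingCR is hypothetical. Whether removing the terminator is appropriate depends on what consumes the string afterwards.

```objectscript
/// Hypothetical helper: removes a single trailing carriage return, if present.
ClassMethod StripTrailingCR(pText As %String) As %String
{
    // $Extract(pText, *) is the last character; $Char(13) is CR.
    If $Extract(pText, *) = $Char(13) {
        // Keep everything up to, but not including, the last character.
        Set pText = $Extract(pText, 1, *-1)
    }
    Quit pText
}
```

For example, Set mystr = ..StripTrailingCR(mystr) could be applied to the string before it is passed to ImportFromString().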
A safer alternative described in the cited sources is to avoid rebuilding the full segment text where possible and instead iterate through the fields, components, and subcomponents using GetValueAt() and SetValueAt(). That avoids working directly with the full serialized segment string, including its terminator. [1]
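A rough sketch of the field-by-field variant follows. It assumes, as in the question's MSH loop, that Count gives the number of fields and that GetValueAt(n) returns field n. It works at field level only, so a field's components and subcomponents are replaced together as one string. The method name ReplaceInFields is hypothetical, and the behaviour on MSH (whose first fields hold the separators) has not been checked here.

```objectscript
/// Hypothetical helper: replaces pFind with pReplace in every field of a segment.
ClassMethod ReplaceInFields(pSegIn As EnsLib.HL7.Segment, pFind As %String, pReplace As %String, Output pStatus As %Status) As EnsLib.HL7.Segment
{
    Set pStatus = $$$OK
    // Work on a clone so the source segment is left untouched.
    Set tSegOut = pSegIn.%ConstructClone()
    For tField = 1:1:tSegOut.Count {
        Set tValue = tSegOut.GetValueAt(tField)
        // Only write back fields that actually contain the character.
        If tValue [ pFind {
            Set pStatus = tSegOut.SetValueAt($Replace(tValue, pFind, pReplace), tField)
            // Stop at the first error and let the caller inspect pStatus.
            Quit:$$$ISERR(pStatus)
        }
    }
    Quit tSegOut
}
```

Because the segment is never serialized as a whole, the segment terminator does not enter the picture with this approach.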
The cited sources also warn that if your target is still an HL7 message, you should be careful not to blindly unescape or otherwise alter HL7 syntax in a way that makes the message invalid. [3]
If the goal is only to replace a literal non-ASCII data character such as © with (C), using $Replace() on the relevant data value is consistent with the cited examples. [1][3][4]
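Where more than one character has to be substituted, the same $Replace() call can be driven from a small mapping. The sketch below is illustrative only: the method name ReplaceChars and the mapping entries are assumptions, and in practice the pairs might come from a lookup table instead of being hard-coded.

```objectscript
/// Hypothetical helper: applies a set of character replacements to one data value.
ClassMethod ReplaceChars(pValue As %String) As %String
{
    // Illustrative mapping only; extend it with the characters you expect.
    Set tMap($Char(169)) = "(C)"   // copyright sign
    Set tMap($Char(174)) = "(R)"   // registered sign
    Set tChar = ""
    For {
        // Three-argument $Order also returns the node's value in tReplacement.
        Set tChar = $Order(tMap(tChar), 1, tReplacement)
        Quit:tChar=""
        Set pValue = $Replace(pValue, tChar, tReplacement)
    }
    Quit pValue
}
```

The helper operates on a single data value, so it can be combined with either the whole-segment or the field-by-field approach.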
Sources: