Replacing character groups

What's the best way to replace character groups?

I don't want to remove character groups, as $zstrip does, but to replace them with spaces.

$translate needs an explicit character list.

Effectively, I want to remove any characters except letters, numbers, and a small (known) subset of punctuation characters, replacing everything else with spaces.

A little bit of $translate(), a little bit of $zstrip(), and the job is done (in the example below, I want to keep the parentheses):

set str="abcd(123)/,op(56)*&^%$987ABC"

write $tr(str,$zstrip(str,"*AN","()"),$j("",$l(str)))
what to keep in general ---^^^
what to keep extra --------------^^


By the way, the above works for Unicode characters too, unless your pattern match table is outdated ;-))
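
If the one-liner is needed in more than one place, it could be wrapped in a small helper. This is only a sketch: the method name ReplaceOthers and its keep parameter are made up for illustration, and it assumes that $zstrip's third argument (the extra characters to strip) behaves as in the example above.

```objectscript
ClassMethod ReplaceOthers(str As %String, keep As %String) As %String
{
    // Hypothetical helper: strip the alphanumerics and the characters in keep,
    // so that only the characters we want to replace are left over
    set drop = $zstrip(str, "*AN", keep)
    // Map each leftover character to a space
    return $translate(str, drop, $justify("", $length(drop)))
}
```

Called with the string from the example and "()" as keep, it should give the same result as the one-liner.
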
Another solution is to use a loop and a regular expression:

set str="abcd(123)/,op(56)*&^%$987ABC", i=0
set remove="[:punct:]|[:symbol:]"
set keep = "()"

while $locate(str,remove,i,i,c) { set:keep'[c $e(str,i-1)=" " }

Hi Ed!

This is $Translate again :)

Assume you need all the ASCII codes from 33 to 126 - letters, numbers, punctuation.


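ClassMethod onlygood(str) As %String
{
    // all = $C(0)..$C(255), so $C(i) sits at position i+1
    set all="" for i=0:1:255 set all=all_$C(i)

    // good = $C(33)..$C(126)
    set good="" for i=33:1:126 set good=good_$C(i)

    // adding spaces for "no good" symbols: 33 in front (codes 0-32), 129 behind (codes 127-255)
    set good=$J("",33)_good_$J("",129)

    return $tr(str,all,good)
}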


I've been thinking about it but I think there are more letter characters. Unicode is a big place.

A solution with no $TRANSLATE:

 set str="aN d.ef123$eR=xx?,yWz"
 for i=1:1:$l(str) s:$e(str,i)'?1(1AN,1".",1" ",1",") $e(str,i)=" "
 write str

 aN d.ef123 eR xx ,yWz


USER> Set str = "abcd(123)/,op(56)*&^%$987ABC"
USER> Write ##class(%Regex.Matcher).%New("[^\w\d()]", str).ReplaceAll(" ")
abcd(123)  op(56)     987ABC

USER> Write ##class(%Regex.Matcher).%New("[^\w\d()]+", str).ReplaceAll(" ")
abcd(123) op(56) 987ABC
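
One caveat with the pattern above: \w already covers digits, so \d is redundant, and \w also matches the underscore. If only letters and numbers (in any script) plus the parentheses should survive, a variant with Unicode property classes might be closer to the original question. This is a sketch that assumes the \p{L} and \p{N} classes of the ICU regular expression syntax that %Regex.Matcher is built on; the sample string is made up.

```objectscript
USER> Write ##class(%Regex.Matcher).%New("[^\p{L}\p{N}()]+", "a_b(1)*c").ReplaceAll(" ")
```

Here the underscore should be replaced by a space as well, whereas the \w version would keep it.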