Question
· Oct 30, 2017

Zstrip to clean string

I'm extracting text from HTML (more on how here), and the extracted text has two problems:

  • Lots of $c(10) control characters
  • Runs of multiple whitespace characters

Here's an example of the text extracted from an HTML page:

set text = " "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" Word1"_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" Word2 "_$c(10)_"Word3 "_$c(10,10,10,10)_" "_$c(10)_" "_$c(10)_" © 2017 "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)_" "_$c(10)

I want to remove the control characters and collapse the repeated whitespace in this string, and the $zstrip function can do that:

write $zstrip($zstrip(text, "*C"), "<=>P")
>Word1 Word2 Word3 © 2017
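
For reference, this is how I understand the two passes when they are split into separate steps. It is only a sketch for a routine or method body, and the variable names noControl and clean are made up for illustration:

```objectscript
    // Pass 1: "*" means "everywhere in the string", "C" is the control-character mask,
    // so every control character (including each $c(10)) is removed
    set noControl = $zstrip(text, "*C")
    // Pass 2: "<" and ">" trim leading and trailing characters that match the mask,
    // "=" collapses repeated matching characters; "P" is punctuation, which includes blank spaces
    set clean = $zstrip(noControl, "<=>P")
    write clean
```

Since the P mask also matches punctuation marks, the second pass would trim leading or trailing punctuation too; the W mask (whitespace only) may be a closer fit if that matters.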

But this requires calling $zstrip twice. Is there any way to remove control characters and collapse multiple whitespaces with a single $zstrip call?
