Flávio Lúcio Na... · Aug 18, 2021

How to convert a string to HTML entity codes?

Good morning everybody,

In HTML we sometimes need to use entity codes to avoid display problems in browsers around the world. For example, 'á' is converted to '&aacute;' and 'é' is converted to '&eacute;'.

I have a question: how can I convert a string like my name to this type of code? For example, "Flávio" would become "Fl&aacute;vio". I tried this:

USER>write $zconvert("Flávio","I","HTML")

but it didn't work :(

Best Regards.

Product version: Caché 2018.1

A couple of things...

You'll want to use "O" (output) mode instead of "I" (input) mode:
write $zconvert("Flávio","O","HTML")

However, I see that this still doesn't replace the "á". But if I add an "&", it is replaced with "&amp;" as expected:
write $zconvert("Flávio&","O","HTML")

The table at the bottom of this section in the documentation lists which characters are encoded into entities in HTML mode.

Interesting that it works in the opposite direction:

USER> w $zconvert("Fl&aacute;vio","I","HTML")

I saw the table, but isn't there another function to do this conversion?

I think the $zconvert() function covers only the necessary entities. But you can use a simple method to convert characters to the currently known(*) entities.

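ClassMethod ToHTML(str)
{
   // Walk backwards so that replacing a character with a longer entity does not shift unvisited positions
   for i=$length(str):-1:1 set c=$ascii(str,i) set:$data(^entityChars(0,c),c) $extract(str,i)=c
   quit str
}

ClassMethod FromHTML(str)
{
   set i=0
   while $locate(str,"&[A-Za-z]+;",i,j,v) {
      // v is the matched entity, j the position just after it
      set:$data(^entityChars(1,v),c) s=$length(v), $extract(str,j-s,j-1)=$c(c), j=j-s+1
      set i=j
   }
   quit str
}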

I have a table (the ^entityChars() global) which contains more than 1,400 entities. You can download the above class, together with the table, from my FTP server (file: DC.Entity.xml):

Usr: dcmember
Psw: member-of-DC

A sample output:

USER>write ##class(DC.Entity).ToHTML("Flávio Lúcio Naves Júnior")
Fl&aacute;vio L&uacute;cio Naves J&uacute;nior
USER>write ##class(DC.Entity).FromHTML("Fl&aacute;vio L&uacute;cio Naves J&uacute;nior")
Flávio Lúcio Naves Júnior

(*) Currently known, because (1) I do not have all the currently known entities in my table and (2) with each new day, the W3C and the Unicode consortium can extend the current entity list.
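For readers who cannot fetch the file, the layout of ^entityChars() can be inferred from how the two methods read it: node (0,charCode) holds the entity text, and node (1,entity) holds the character code. The following is only a minimal sketch under that assumption. The method name LoadSample and the three sample entries are hypothetical and are not taken from DC.Entity.xml.

```objectscript
/// Hypothetical loader with three sample entries only.
/// The subscript layout is inferred from ToHTML/FromHTML, not copied from the real table.
ClassMethod LoadSample()
{
    for pair = "225:aacute", "233:eacute", "250:uacute" {
        // Split "code:name" into the numeric character code and the entity text
        set code = +$piece(pair, ":", 1)
        set entity = "&" _ $piece(pair, ":", 2) _ ";"
        // Forward lookup used by ToHTML: character code -> entity
        set ^entityChars(0, code) = entity
        // Reverse lookup used by FromHTML: entity -> character code
        set ^entityChars(1, entity) = code
    }
}
```

With only these three entries loaded, ToHTML and FromHTML would handle á, é and ú and leave every other character or entity untouched.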

Thanks. I'm trying to access your FTP server but I can't get in. Could you post the file here in the answer?

You have to use the login data (as specified, this is not an anonymous FTP). On Windows: command line (cmd.exe), on Linux: terminal. Then (on both systems): ftp, enter username and password, then: get DC.Entity.xml.

Do you really want to see some 1470 lines of code here?

Why not put it in a GitHub repo, or at least a gist?

Why not use numeric codes?

$ascii("á") = 225

set s1=$zconvert("Fl&aacute;vio","I","HTML"), s2=$zconvert("Fl&#225;vio","I","HTML")
write s1,$select(s1=s2:" = ",1:" <> "),s2
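The numeric-code idea needs no lookup table, so it can be sketched in a few lines. This is an illustration only: ToNumericHTML is a hypothetical name, not an existing method of DC.Entity or of the product, and the sketch assumes every character above code 127 should become a decimal reference.

```objectscript
/// Hypothetical helper: replaces every character above code 127
/// with a decimal numeric character reference such as &#225;.
ClassMethod ToNumericHTML(str As %String) As %String
{
    set out = ""
    for i=1:1:$length(str) {
        set code = $ascii(str, i)
        // ASCII characters pass through unchanged; all others become &#NNN;
        set out = out _ $select(code > 127: "&#" _ code _ ";", 1: $char(code))
    }
    quit out
}
```

Since $ascii("á") is 225, calling this with "Flávio" should give "Fl&#225;vio". The sketch does not escape "&", "<" or ">"; if that is needed as well, one option is to run $zconvert(str,"O","HTML") first and pass its result to this method, because doing it in the other order would escape the "&" of the references that were just produced. Characters outside the Basic Multilingual Plane are not handled.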