Question
Natasa Klenovsek Arh · Jan 19, 2017

Import data with special characters (čšž)

Does anyone have any experience importing data into Caché that contains special characters like ščž?

 

I have tried several options, but nothing really works. :)

 

thanks


Please add more details: how you tried to do it, and the $zversion of your instance.

Natasha,

we need many more details.

What exactly did you try? How do you import the data? What's the source of the data? What do you mean by "nothing works"?

What version of Caché do you have (exact $zv)? What locale does this instance have?

I have no problems importing this data from a UTF-8 file.

USER>set f = ##class(%Stream.FileCharacter).%New()

USER>write f.LinkToFile("c:\temp\demo.txt")
1
USER>set line = f.Read()

USER>write line
ščž

The Caché version is 2016.2.1.803; the source of the data is a CSV file exported from MySQL.

 

I tried importing with different charset types, and I tried importing with $SYSTEM.SQL.DDLImport. I also tried the same as you, but I get ??? instead of ščž.

 

%SQL.Import.Mgr

This version number is not enough; please show the output of this command:

write $zv

It looks like your installation is in 8-bit mode instead of Unicode. In that case such behavior is possible. That's why we are asking for the full version string from $zv.

Aha, I see :)

Output of write $zv: Cache for Windows (x86-64) 2016.2.1 (Build 803U_SU)

Thanks.

803U means that the installation is Unicode.

1) What locale do you have? You can check it in Management Portal -> System Administration -> Configuration -> National Language Settings -> Locale Definitions.

2) Please double-check that the file is indeed in UTF-8 format.

3) Can you reproduce this problem with a small file?
For example, create a small text file with just three symbols: ščž
Save it as UTF-8.
Then read it as in my example above. Does it work?

Thank you.

You were right about the first point: the language was set to English :)

Thank you again, it's working now.

This seems like a pretty peculiar case, at least to me. It sounds like Caché is supplied with an English Unicode locale that does not support all Unicode characters. What for?

The enuw locale has RAW as its translation table for reading from files.
Some other locales, for example rusw, have UTF8 as that translation table.

So when reading UTF-8 in the enuw locale, you need to specify the translation table explicitly, or use a locale whose default file translation table is UTF8.
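
For anyone who would rather keep the enuw locale, here is a minimal sketch of specifying the table explicitly on the stream. It assumes the TranslateTable property of %Stream.FileCharacter and reuses the example path from earlier in the thread; the error handling is only illustrative.

```objectscript
    // Sketch: read a UTF-8 file on an instance whose locale defaults to RAW for files.
    set f = ##class(%Stream.FileCharacter).%New()
    // Same example file as in the earlier terminal session.
    set sc = f.LinkToFile("c:\temp\demo.txt")
    if $system.Status.IsError(sc) { do $system.Status.DisplayError(sc) }
    // Override the locale's default file translation table for this stream only.
    set f.TranslateTable = "UTF8"
    set line = f.Read()
    write line
```

This affects only the one stream, so other code that relies on the locale default is left alone. Changing the locale, as was done above, changes the default for all file reads instead.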

Hi, Natasa!

If Alexander's answer fits the question for you, would you please mark it as "Accepted"?

Thank you in advance!