Translate a number to text

Hi all,

I'm pleased to announce this personal project to convert a number to text in Spanish, English, Catalan and Russian.

The aim of this function is to convert numbers into text. It accepts numbers of up to 15 digits.

Overview

The translation is available in several languages. The allowed language codes are:

  • es: Spanish
  • en: English
  • ca: Catalan
  • ru: Russian

The function can also render numbers of 10^9 (milliards) in the format used in English-speaking countries, where 10^9 is called a billion. See the Wikipedia article on Billion and the last two examples below.

w ##class(NumberTranslate.NumberTranslate).GetText(123,.tSc)

one hundred and twenty-three

w ##class(NumberTranslate.NumberTranslate).GetText(123,.tSc,"es")

ciento veintitres

w ##class(NumberTranslate.NumberTranslate).GetText(123,.tSc,"ca")

cent vint-i-tres

w ##class(NumberTranslate.NumberTranslate).GetText(123,.tSc,"ru")

Сто двадцать три 

w ##class(NumberTranslate.NumberTranslate).GetText(1000000000,.tSc,"en",1) 

one billion

w ##class(NumberTranslate.NumberTranslate).GetText(1000000000,.tSc,"es",0) 

mil millones

Please have a look at the project at the following link:

https://openexchange.intersystems.com/index.html#!/package/CosNumberTranslate

How to install

Open the link to the latest version (1.1.2): CosNumberTranslation_v1.1.2.xml

Right click and select "Save as..."

Download the .xml file

Load it from the terminal in your namespace (e.g. USER)

USER> do $System.OBJ.Load("c:\temp\CosNumberTranslation_v1.1.2.xml","cs")

Check a number

USER> w ##class(NumberTranslate.NumberTranslate).GetText(123,.tSc)

one hundred and twenty-three
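
Every call above passes a second argument by reference (.tSc). The post does not say what it receives; assuming it is a %Status value (the name suggests this, but it is only an assumption), a call could be checked from the terminal with a sketch like this one:

```objectscript
USER> set text = ##class(NumberTranslate.NumberTranslate).GetText(123,.tSc,"en")
USER> if $System.Status.IsError($get(tSc,1)) { do $System.Status.DisplayError(tSc) } else { write text }
```

$get(tSc,1) treats an unset tSc as success, so the sketch still prints the text if the function never assigns the variable.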

I hope it is useful for your development.

Best regards,

Francisco López

Comments

Hi, Francisco!

Great stuff!

Would it be possible to add support for other languages to your solution, e.g. Russian?

Que bueno! :)

But I would be happy even if you just explained how to contribute to your solution in order to add support for another language.

Hi Evgeny,

I'll publish another article explaining how it works and how other languages are added. I'll have to take etymology and grammar classes to understand how some languages are structured.

Hola Francisco,
You motivated me to do something similar for German.
It's a straightforward .int routine, and you are welcome to add the code to your project.
GermanNumberToText 
It handles numbers up to 10e21, negatives, and unlimited decimals (except what is cut off by internal limits).
I tried to catch all the irregular structures of the language, such as singular/plural, varying genders, and upper/lower case,
and tried to keep the output readable:

 w $$^zahl(-190000103201101.3903)
Minus einhundertneunzig Billionen einhundertdrei Millionen zweihundertein Tausend einhunderteins Komma drei neun null drei

For quick copy:
Updated to avoid the switch from integer to floating-point format for large numbers (2018-06-28 16:34 UTC)

zahl(num="",gen="") Public {
 ;;; convert number as German text
 ;;; w $$^zahl(-1123.505) >>>> Minus ein Tausend einhundertdreiundzwanzig Komma fünf null fünf

 set dec=$p(num,".",2),dec=$s(dec?1.N:$$dec(dec),1:"")
 if num=0 quit "null"_dec
 set neg=$S(num<0:"Minus ",1:"")

 if $l(neg) set num=$tr(num,"-")
 if num=1 set gen=$zcvt(gen,"U") quit neg_"ein"_$case(gen,"W":"e","S":"es","M":"",:"s")
 if num<10e23 quit neg_$$trd($p(num,"."))_dec
 quit "*** Zahl zu groß ***"
}
 ;
dec(num) {
 set dec=" Komma"
 for p=1:1:$l(num) set dec=dec_" "_$$zig($e(num,p))
 quit dec
}
zig(num) {
 if num<10 quit $li($lb("null","eins","zwei","drei","vier","fünf","sechs","sieben","acht","neun"),num+1)
 if num<20 quit $li($lb("zehn","elf","zwölf","dreizehn","vierzehn","fünfzehn","sechzehn","siebzehn","achtzehn","neunzehn"),num-9)
 set zig=$e(num,*-1),zn=$e(num,*)
 set res=$s(zig=3:"dreißig"
           ,1:$li($lb(,"zwan",,"vier","fünf","sech","sieb","acht","neun"),zig)_"zig")
 if zn set res=$s(zn=1:"ein",1:$$zig(zn))_"und"_res
 quit res 
}
hun(num) {
 set hun=$e(num,*-2),zig=$e(num,*-1,*),res="",m="hundert"
 set res=$s(hun=1:"ein"_m
           ,hun>1:$$zig(hun)_m
           ,1:"" )
 quit $replace(res_$$zig(zig),"null","")
}
ein(res) {
 if $e(res,*-3,*)="eins" set res=$e(res,1,*-1)
 quit $replace(res,"null","")
}
tsd(num) { ;1,000 = 10^3
 set tsd=$e(num,*-5,*-3),hun=$e(num,*-2,*),res=""
 if tsd set res=$$ein($$hun(tsd))_" Tausend "
 quit res_$$hun(hun)
}
mio(num) { ;1,000,000 = 10^6
 set mio=$e(num,*-8,*-6),tsd=$e(num,*-5,*),m=" Million"
 ; +mio forces a numeric compare, the group may be zero-padded ("001")
 set res=$s(+mio=1:"eine"_m_" "
           ,mio>1:$$ein($$hun(mio))_m_"en "
           ,1:"")
 quit res_$$tsd(tsd)
}
mrd(num) { ;1,000,000,000 = 10^9
 set mrd=$e(num,*-11,*-9),mio=$e(num,*-8,*),m=" Milliarde"
 ; +mrd forces a numeric compare, the group may be zero-padded ("001")
 set res=$s(+mrd=1:"eine"_m_" "
           ,mrd>1:$$ein($$hun(mrd))_m_"n "
           ,1:"" )
 quit res_$$mio(mio)
}
bio(num) { ;1,000,000,000,000 = 10^12
 set bio=$e(num,*-14,*-12),mrd=$e(num,*-11,*),m=" Billion"
 ; +bio forces a numeric compare, the group may be zero-padded ("001")
 set res=$s(+bio=1:"eine"_m_" "
           ,bio>1:$$ein($$hun(bio))_m_"en "
           ,1:"" )
 quit res_$$mrd(mrd)
}
brd(num) { ;1,000,000,000,000,000 = 10^15
 set brd=$e(num,*-17,*-15),bio=$e(num,*-14,*),res="",m=" Billiarde"
 ; +brd forces a numeric compare, the group may be zero-padded ("001")
 set res=$s(+brd=1:"eine"_m_" "
           ,brd>1:$$ein($$hun(brd))_m_"n"_" "
           ,1:"" )
 quit res_$$bio(bio)
}
tri(num) { ;1,000,000,000,000,000,000 = 10^18
 set tri=$e(num,*-20,*-18),brd=$e(num,*-17,*),m=" Trillion"
 ; +tri forces a numeric compare, the group may be zero-padded ("001")
 set res=$s(+tri=1:"eine"_m_" "
           ,tri>1:$$ein($$hun(tri))_m_"en"_" "
           ,1:"" )
 quit res_$$brd(brd)
}   
trd(num) { ;1,000,000,000,000,000,000,000 = 10^21
 set trd=$e(num,*-23,*-21),tri=$e(num,*-20,*),m=" Trilliarde"
 ; +trd forces a numeric compare, the group may be zero-padded ("001")
 set res=$s(+trd=1:"eine"_m_" "
           ,trd>1:$$ein($$hun(trd))_m_"n"_" "
           ,1:"" )
 quit res_$$tri(tri)
}
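
Reading the zahl entry point in the listing, the optional second argument gen only matters for the number 1, where it selects the ending appended to "ein". The letters presumably stand for the German grammatical genders; that reading is an assumption, as the post does not spell it out. Going by the $case in the listing, calls entered one at a time at the terminal should behave like this:

```objectscript
 write $$^zahl(1)        ; eins        (no gen given, default ending "s")
 write $$^zahl(1,"M")    ; ein
 write $$^zahl(1,"w")    ; eine        (gen is upper-cased before the $case)
 write $$^zahl(1,"S")    ; eines
 write $$^zahl(-1,"W")   ; Minus eine
 write $$^zahl(2,"W")    ; zwei        (gen is ignored for any number other than 1)
```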

Good one!

You can actually use 10e21 in COS code, it's a valid number format.

You are right.
However, because of the internal limits, I would avoid the \ and # operations in favor of $E() in the next version,
as numeric normalization causes some strange effects:

write $$^zahl(1.3400) >>>> eins Komma drei vier
write $$^zahl("1.3400") >> eins Komma drei vier null null

With large numbers exceeding 64-bit integers, the logic based on integer division (\) and modulo (#)
produced wrong results, so I changed it to pure string interpretation.

Recommendation:
Pass all numbers as strings to avoid numeric normalization.

NOT  write $$^zahl(102100900002103201200301.6123100)
einhundertzwei Trilliarden einhundert Trillionen neunhundert Billiarden zwei Billionen einhundertdrei Milliarden zweihundertein Millionen zweihundert Tausend

BUT  write $$^zahl("102100900002103201200301.6123100")
einhundertzwei Trilliarden einhundert Trillionen neunhundert Billiarden zwei Billionen einhundertdrei Milliarden zweihundertein Millionen zweihundert Tausend dreihunderteins Komma sechs eins zwei drei eins null null
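
The difference arises because ObjectScript converts a numeric literal to its canonical form before the routine ever sees it, while a quoted string is passed through unchanged. A minimal sketch of that effect, using only built-in functions and entered one command at a time at the terminal:

```objectscript
 write 1.3400                    ; 1.34    (trailing zeros dropped from the numeric literal)
 write "1.3400"                  ; 1.3400  (string kept exactly as typed)
 write +"1.3400"                 ; 1.34    (unary plus forces numeric interpretation)
 write $piece(1.3400,".",2)      ; 34      (what the routine sees as decimals for a number)
 write $piece("1.3400",".",2)    ; 3400    (what it sees for a string)
```

This is why the decimal part is extracted with $piece on the string form: the digits survive only if the caller quoted the value.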

If you think this is exaggerated, think about banking calculations for countries with low-valued currencies.

Hi Francisco,

I've downloaded the latest version, 1.1.1, to add Brazilian Portuguese (pt-br) and European Portuguese (pt). There are some differences: Brazil uses short-scale notation, while Portugal uses long-scale notation. There are also slight spelling differences.

The first problem I encountered with versions 1.1 and 1.1.1 was that both were exported from IRIS. Caché currently does not recognize the <Export generator="IRIS" ...> tag and fails to import the file. Editing the XML file and replacing IRIS with CACHE solved the import issue.

It would be nice to have the XML exported as Caché so everyone will be able to import it.
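
For anyone who prefers not to edit the file by hand, the replacement described above could be scripted. The following is only a sketch for a routine or class method: the file names are hypothetical, the replacement value CACHE simply follows this post, and it assumes the <Export ...> tag sits within the first 32000 bytes of the file.

```objectscript
 ; Hypothetical file names; adjust them to your download location.
 set src="c:\temp\CosNumberTranslation_v1.1.1.xml"
 set dst="c:\temp\CosNumberTranslation_v1.1.1_cache.xml"
 ; Binary streams copy the bytes unchanged, so the file encoding is not touched.
 set in=##class(%Stream.FileBinary).%New()
 set sc=in.LinkToFile(src)
 set out=##class(%Stream.FileBinary).%New()
 set sc=out.LinkToFile(dst)
 set first=1
 while 'in.AtEnd {
     set chunk=in.Read(32000)
     ; Patch only the first occurrence in the first chunk, where the <Export> tag is expected.
     if first set chunk=$replace(chunk,"generator=""IRIS""","generator=""CACHE""",1,1),first=0
     do out.Write(chunk)
 }
 set sc=out.%Save()
```

The patched copy can then be loaded with $System.OBJ.Load in the same way as the original file.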

Some numbers, such as 12345, return an <INVALID OREF> error because the OREF "obj" was killed. Large numbers in the trillion range, such as 123456789123456, return wrong results.

I have fixed these issues and have also added the code for Portuguese (pt-br and pt). I'll send the changed class to you for analysis.

Best Regards,

Ernesto

Hi, Ernesto! This is great!

You can make a pull request to the repo so that everyone can review and comment on the changes too.

Speaking about IRIS export - yes, it exports with <Export generator="IRIS"> which makes it complicated to import into Caché and Ensemble without manually changing the file.

I'd suggest that @Francisco.López1549 export releases for IRIS and Caché/Ensemble separately for now, and I invite @Stefan Wittmann and @Benjamin DeBoe to share guidelines on the best way to develop on IRIS and make the code available for Caché/Ensemble too.