Simple Arithmetics | InterSystems Developer Community | Caché ObjectScript

Simple Arithmetics

Can somebody explain this behaviour of Caché (and of many other calculating machines as well):

WRITE 1/3*12

4

WRITE 1/3*5

1.666666666666666667

W 1/3*18

5.999999999999999999

W 1/3*21

6.999999999999999999

W 1/3*24

7.999999999999999999

How can we make calculations more accurate in Caché?


Hi Uri,

this comes down to the way computers represent numbers. Since computers are based on a binary system, (some) numbers have to be approximated. This leads to things like .3 actually being .29999999999999998889 (in IEEE binary double-precision floating-point representation). This comes up from time to time, so I took the liberty of reusing some of the examples I have gathered over time:
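If you want to see that stored value for yourself, here is a quick illustration in Python (used here purely for illustration; its float is the same IEEE 754 binary double that Caché's $DOUBLE and Java's double use). Converting a float to Decimal exposes the exact binary value:

```python
from decimal import Decimal

# Decimal(float) prints the exact value the binary double really stores,
# not the rounded form the REPL usually shows.
print(Decimal(0.3))
# prints 0.29999999999999998889... (the full exact expansion)
```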

Various languages use different limits for rounding and/or display of numbers, and moving between languages sometimes brings these artifacts to the surface. One common example is moving from Java's double to Caché's decimal types.

Caché's decimal representation has almost 19 digits of decimal precision, and it has a range whose largest positive number is exactly 9.223372036854775807E+145 and whose smallest non-zero positive number is exactly 1E-128.
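You can mimic that 19-digit decimal arithmetic with Python's stdlib decimal module; this is a rough sketch of the behaviour, not of Caché's actual implementation, but it shows where the results in the question come from:

```python
from decimal import Decimal, getcontext

getcontext().prec = 19              # roughly Caché's 19 significant decimal digits
third = Decimal(1) / Decimal(3)     # stored as 0.3333333333333333333

print(third * 12)   # the exact product 3.99...96 rounds up to 4.000000000000000000
print(third * 24)   # 7.999999999999999999 stays just below 8
```

Caché strips the trailing zeros, so the first result displays simply as 4, matching `WRITE 1/3*12` in the question.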

Caché's binary representation, as well as Java's "double" type (just as an example, since you asked about other languages as well), uses the 64-bit double-precision representation defined by IEEE Std 754-1985 (the IEEE Standard for Binary Floating-Point Arithmetic). This representation has a precision of 53 binary bits, which is almost 16 digits of decimal precision. Its largest positive number is approximately 1.797693134862315708E+308 and its smallest non-zero positive number is approximately 2.22507385850720138E-308. The set of decimal fractions that can be stored exactly in binary floating-point representation (i.e., without using an approximation) is very limited. It includes 0.5, 0.25, 0.75, 0.125, 0.375, 0.625 and 0.875; in fact, that list contains all the one-, two- and three-digit decimal fractions that are represented exactly in binary. The remaining three-digit decimal fractions (992 out of a total of 999) must be approximated.
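That count of 7 exact fractions is easy to verify mechanically. A small Python check (again leaning on the fact that Python's float is the same IEEE binary double; Fraction(f) recovers the exact binary value of a float):

```python
from fractions import Fraction

# A three-digit decimal fraction i/1000 survives the trip into binary
# only if the stored double converts back to exactly i/1000.
exact = [i for i in range(1, 1000) if Fraction(i / 1000) == Fraction(i, 1000)]
print([i / 1000 for i in exact])
# [0.125, 0.25, 0.375, 0.5, 0.625, 0.75, 0.875]
```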

So looking at another example:

The decimal value 4.2 is approximated in IEEE double-precision binary floating-point representation by the decimal value 4.2000000000000001776356839400250464678... . The internal hex value is 4010CCCCCCCCCCCD. The next smaller binary value has hex value 4010CCCCCCCCCCCC and is approximately 4.1999999999999992894572642398998141289... . Since the larger value is closer to 4.2, that is the IEEE double-precision binary floating-point value used to approximate the decimal value 4.2. Both Caché and Java will use the same "double" value to approximate 4.2. The Caché decimal floating-point representation, on the other hand, can represent the decimal value 4.2 exactly, with no approximation necessary.

The COS function $DOUBLE(x) can be used to force the value "x" into the IEEE binary floating-point representation by generating the best approximation. The COS function $DECIMAL(x) can be used to force the value "x" into the Caché decimal floating-point representation. The COS function $DECIMAL(x,n) can be used to convert the numeric value "x" into a string representation using "n" significant digits (where "n" is limited to between 1 and 38). Consider the following:

USER>set x=4.2

USER>set xd=$DOUBLE(x)

USER>write x
4.2

USER>write xd
4.2000000000000001776

USER>write $DECIMAL(xd,30)
4.20000000000000017763568394003

USER>write $DECIMAL(xd,16)
4.2

USER>write $DECIMAL(xd,17)
4.2000000000000002

Caché converts the numeric value $DOUBLE(4.2) to a default string representation with 20 significant digits. Asking for 30 significant digits shows that $DOUBLE(4.2) has more than 20 digits in its decimal representation. Asking for 16 and 17 decimal digits shows that $DOUBLE(4.2) approximates the decimal value 4.2 with an accuracy of about 16 decimal digits. Consider the following computation, which removes the leading 4 and 2. (It also multiplies by 10, which causes a little round-off error, but you would get a lot more round-off error if you computed xd-4.2. Computing xd-4.2 causes so much round-off error that all significance is lost and the answer is 0.0.)

USER>WRITE ((xd-4)*10)-2
.0000000000000017763568394002504646
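The same residue can be reproduced in Python, whose float is the identical IEEE double. It even comes out as an exact power of two (2^-49), because every step of this particular computation happens to be exact in binary:

```python
x = 4.2                       # actually stores 4.2000000000000001776..., as above
residue = ((x - 4) * 10) - 2  # peel off the leading 4 and 2

print(residue)             # 1.7763568394002505e-15
print(residue == 2 ** -49) # True
```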

You can try the above computation in Java using its "double" type and you should get the same answer (with a few fewer decimal digits printed). Doing this in Java gives you another way (besides using the BigDecimal package) to demonstrate that "double xd=4.2;" does not produce an exact representation of 4.2 in Java.

Another question is why Caché prints $DOUBLE(4.2) with as many digits as 4.2000000000000001776. The reason is that Caché can exactly represent other nearby decimal values such as 4.200000000000000177 and 4.200000000000000178. These are adjacent values in the Caché decimal floating-point representation, and the value $DOUBLE(4.2) falls between them. If we convert $DOUBLE(4.2) to decimal then we get the larger of the two, because that is the closest Caché decimal floating-point value.

USER>WRITE $DECIMAL($DOUBLE(4.2))

4.200000000000000178

This conversion from binary to decimal involves an approximation as the following comparisons show:

USER>write $DECIMAL($DOUBLE(4.2))=$DOUBLE(4.2)

0

USER>write $DECIMAL($DOUBLE(4.2))>$DOUBLE(4.2)

1

and the default conversions of these values to string representation show why the comparisons give these results.

USER>write $DECIMAL($DOUBLE(4.2)),!,$DOUBLE(4.2)

4.200000000000000178

4.2000000000000001776

You should note that 4.2 and $DOUBLE(4.2) are not really close to each other: Caché can represent another 177 decimal floating-point values between 4.2 and $DOUBLE(4.2). The default conversion of a $DOUBLE value to a string will usually have 20 significant digits. If the default representation has fewer than 20 significant digits, then that $DOUBLE value exactly equals the corresponding decimal value represented by the string. If you want to format Caché $DOUBLE values as strings so that they look like Java conversions to strings, consider using $DECIMAL(xd,15) or $FNUMBER(xd,"G",14), which are two examples of formatting functions that only print 15 significant digits.

If you take a Java "double" value, move it into a Caché database, and later extract the value to send it back to a Java "double" variable, then the value originally sent to Caché will be identical to the value that is returned. As long as the value is between 9.22E+145 and 1E-110 in magnitude, it will not matter whether the value stored inside Caché uses decimal or binary representation: the approximations involved in converting between binary and decimal will not be large enough to change the "double" value. If the Java "double" data involves values outside this range, then you must be careful to use the %Double type in Caché in order to eliminate conversions that might cause overflow or underflow.

We usually recommend that customers use the default decimal representation in Caché and avoid both the %Double type and the COS $DOUBLE function. The default decimal representation rarely involves unexpected approximations. The only time to use $DOUBLE(), %Double and binary floating-point is when the values are not directly entered and read by humans but instead involve a machine-to-machine transfer of data stored using the representation defined by IEEE Std 754.
One additional note: when you write 1.9f in Java, the suffix "f" means you are asking for the 32-bit single-precision binary floating-point representation defined by IEEE Std 754-1985. It has a much smaller range and only 24 bits of precision. Its approximation of decimal fractions is only good to about 7 decimal digits. This smaller precision explains the results you are seeing with this value.
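For anyone without a Java compiler handy, a quick way to watch that single-precision loss is to round-trip a value through the 32-bit format with Python's struct module (a sketch for illustration; "f" is the IEEE single-precision format code):

```python
import struct

# Pack 1.9 into IEEE single precision ("f") and read it back as a double.
single = struct.unpack("f", struct.pack("f", 1.9))[0]

print(single)        # roughly 1.8999999761..., only ~7 good decimal digits
print(single == 1.9) # False
```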

So all in all, you are not really asking for more accurate calculations; you are rather asking for less accuracy, since you only want to see the rounded values. As shown in the examples above, you can use $DOUBLE for some of these calculations.

-Fab

My colleague provided a very detailed answer to your first request; that is the "Why" part. For the "How" part: rather than rounding away the apparent error that you don't understand, there is a whole branch of mathematics dealing with transforming your math so that it works well on computers. While many programming languages optimize code to correct inefficient logic, few (including COS) will fix your math.

Some simple examples are:

(1) When adding a vector of floating-point numbers, it is wise to sort the numbers by their absolute value and add the small numbers to the accumulator first.
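Here is a sketch of point (1) in Python (IEEE doubles there as well). Starting with the big value first, every tiny addend falls below half a unit in the last place of the running sum and is rounded away; summing small-to-large lets the tiny values accumulate first:

```python
vals = [1.0] + [1e-16] * 10

naive = 0.0
for v in vals:                    # big value first: every 1e-16 is rounded away
    naive += v

careful = 0.0
for v in sorted(vals, key=abs):   # small values first: they accumulate
    careful += v

print(naive)    # exactly 1.0 -- the ten small addends vanished
print(careful)  # slightly above 1.0
```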

(2) When evaluating a polynomial like a·x³+b·x²+c·x+d, rewrite it as ((a·x+b)·x+c)·x+d. Since COS evaluates operators strictly left to right, this can conveniently be written without any parentheses: SET ans=a*x+b*x+c*x+d
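Point (2) is Horner's rule. A generic Python version (the helper name is just for illustration) has the same shape as the COS one-liner:

```python
def horner(coeffs, x):
    """Evaluate a polynomial given its coefficients, highest power first."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c   # same shape as ((a*x+b)*x+c)*x+d
    return acc

# 2x^3 - 3x^2 + 4x + 5 at x = 10
print(horner([2.0, -3.0, 4.0, 5.0], 10.0))   # 1745.0
```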

(3) Your problem: divide last, unless it would cause an overflow. That is, if your general equation is (x÷y)·z, why not rewrite it as (x·z)÷y? If (x·z) would cause an overflow, then use either (x÷y)·z or (z÷y)·x, whichever gives the better answer. A simple way to decide which answer is better is to see which one has the fewest digits after the decimal point. Here is some code, in case avoiding rounding matters to you.

MATH    ; SRS 2016-11-03
    FOR {
        READ !,"x/y*z enter: ",t  QUIT:t=""
        SET x=$PIECE(t,"/"),t=$PIECE(t,"/",2)
        SET y=$PIECE(t,"*"),z=$PIECE(t,"*",2)
        IF x'=+x||(y'=+y)||(z'=+z) { SET ans=$CHAR(9785) }
        ELSE {
            TRY { SET ans=x*z/y }
            CATCH e {
                TRY {
                    SET xovery=x/y,zovery=z/y
                    IF $LENGTH($PIECE(xovery,".",2))<$LENGTH($PIECE(zovery,".",2)) {
                        SET ans=xovery*z
                    } ELSE {
                        SET ans=zovery*x
                    }
                }
                CATCH e {
                    SET ans=$CHAR(8734)
                }
            }
        }
        WRITE " = ",ans
    }
    QUIT