﻿ Replies by Ray Fucillo for InterSystems Developer Community

Hi Sean,

OK. I don't know of any direct way to access a variable's type.  Last little bit of food for thought...

Even if there were such a function, though, I'd consider it an internal detail that wouldn't necessarily be reliable.  Take as a trivial example 'set x="1234",x=x+0'.  Today, under the covers, x starts out as a string and then changes to an integer when it gets assigned the result of the addition operation.  You could imagine a future where a compile- or run- time optimization notices that it can just leave x unchanged as its string type 1234.  This is entirely an implementation detail and the optimization wouldn't violate any rules of the language.  Note that in the case of "set x="0.5",x=x+0", we would be obligated to leave x as having value ".5", not "0.5" due to the canonicalization rules, but even then we're not obligated to internally make it a floating point type rather than a string type.

Would we ever really do this?  I don't know.  Unfortunately because there are things like \$LB and \$ZHEX that expose bits of these internal details in some fashion, you'd worry about compatibility implications.  But fundamentally, the internal type is just a detail for the Caché virtual machine to manage internally in doing whatever it needs to do to present the typeless COS language to the application.

Sean, I think your post reveals a couple misunderstandings that relate to this problem.  Let me comment on a couple, though at this point, I'm not sure how helpful I'm being to you...

If "1.5"=1.5 is true, then arguably "0.5"=0.5 should also be true, but it is not. This means that developers should be wary of automatic equality coercion on floating point numbers.

It's very important to understand what's going on here because it's central to your question.  "1.5"=1.5 because 1.5 is a number in canonical form.  "0.5" does not equal 0.5 because 0.5 is a numeric literal, and so that literal 0.5 gets canonicalized before being evaluated in the equals.  This is exactly expected and well-defined and not really arguable.  Literals are one thing, but programs are going to most likely get both sides of the equality from some calculation, string extraction, or user input.  If one side of the equality was either a numeric literal or came through some numeric operation, then it is canonicalized, whereas the other side may or may not be, thus possibly failing the equality check unless you explicitly use the unary +.

To make things a little more interesting, a persistent object will automatically coerce a %Float property to a true number value when saved. That's fine, but what if the developer is unaware that he / she is assigning a stringy float value and later performs a dirty check between another stringy float value and the now saved true float number. The code could potentially be tripped up into processing some dirty object logic when nothing has changed.

I understand exactly what you're saying here, but I want to make sure that this behavior doesn't seem mysterious.  All that's going on here is that saving an object invokes %Normalize for all the object properties before saving.  You can do the same any time you want if you have a need to do so.  Remember though that COS is a typeless language so developers should absolutely NOT expect to need to manage the type of their data.  Consider that I store an integer as second comma-delimited piece of a string.  Now I have a %Integer method where I'll return that piece.  All is well and I do not need to use the unary +.  However, your sample assert method would generate a false positive failure because the number I returned in this way internally has string type.  That's not correct though, and you should not be writing code to try to expose the internal type of local variables.  The fact that certain special operations must expose the internal type (like the internal \$listbuild structure, \$zhex, and this dynamic array typing stuff) is a detail specific to those particular functions and shouldn't be considered a backdoor to imposing types on COS, which is typeless.  (BTW, I'm not 100% convinced that it's correct for "1" to become a string in these dynamic arrays, but I'm not going to get into that!)

If I can interpret your goals more generally, it sounds like you're trying to impose a coding convention that at certain places in your application, you want certain value to have been already normalized through the appropriate normalization for their datatype class, so that evaluation with the = operator can be used for logical equality.  You're using %Float as a specific example of that which is interesting in that it gets into how the language canonicalizes number.  But, one could easily imagine wanting the same thing for any arbitrary data type for which only the %Normalize method will do.  If that's what you're really after, then you could easily write an AssertNormalizedValue(value,datatype) which generates an asssertion failure if value'=\$classmethod(datatype,"%Normalize",value)... or something like that.

Sorry for all the duplicate replies... It won't seem to let me place the comment in the right place!

Sorry for all the duplicate replies... It won't seem to let me place the comment in the right place!

See my other comment above, but I don't think relying on what the dynamic array implementation picks for a type to convert to is a great idea. I'd like to see you find a solution in the core of the (typeless) language. If you really are just trying to implement AssertNumericEquals(actual,expected), then that's simply 'if +actual'=+expected { FAILED }'.  This will pass any value 'actual' that would evaluate in an arithimatic operation as the value 'expected' would.  Similarly, if you are trying to implement AssertEqualsCanonicalNumber(actual,expected), then it's 'if actual'=+expected { FAILED }'. That one will pass only the value 'actual' if it exactly is the canonicalized expected value (and thus could be compared to that number with the = operator).  If you want AssertIsCanonical(actual), that's 'if actual'=+actual { FAILED }'.  That one, of course will pass any number in its canonical form.

I'm not sure that I have a precise definition of what you are trying to achieve.  If you can define it, I might be able to help more.  However, there is some confusion in your example that I think needs clarification.

What you're dealing with here is the rules about canonical numbers.  (x=+x) will indeed evaluate whether a number is in canonical form because the equals operator test for exactly equal strings and the unary + converts a value to a number in canonical form.  The reason your first example above returns true is just that you set x equal to a numeric literal, so it got converted to canonical form before it even got set into the variable.  (if you look at the value in x, it would not have had a leading zero)

If you haven't read it before, this portion of the doc (along with the linked references) is a pretty good treatment of this subject. http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=...

String-to-Number Conversion

If you log the error with ^%ETN (as in DO BACK^%ETN, or LOG^%ETN, or exceptionobject.Log()), the SETs to the ^ERRORS global are done with the transaction "suspended", so that it does not roll back.  In the future, we will be exposing this functionality for use in applications.  These get recorded in the journal so that they are recovered upon a system crash or restored in a journal restore, but they are omitted from rollback.  As others have said, ^%NOJRN is not the answer because it is ignored for mirrored databases.

I hesitate to comment on this because you know the answer, but it seems that if you're trying determine if a value is a number in canonical form, it's hard to beat testing that (x=+x).

I don't think we should be so excited about the suggestion for sorts after \$c(0), because that introduces dependencies on the the current local collation strategy.  Whatever answer you choose, I think you should require it to be invariant

Again, "1.0" is not a canonical number; "2.2" is.  Both are valid numbers, but only one is in canonical form.  So exactly what you quoted here is the reason for this behavior.

Since both are valid numbers, you don't have to use + for any function that evaluates them as numbers or as boolean.  You do have to use + any time you desire conversion to canonical form (like equality, array sorting, etc).