Article
· Apr 13, 2017 3m read

Compilation gotchas and a request for change

On the back of my recent post on writing bug-less code I wanted to raise a few suggestions (to ISC) that would help prevent certain types of bugs at compile time. I've probably missed a few, but these are the main ones in my mind. Please contribute more suggestions.

Btw, these also serve as potential gotchas for new COS developers.

I except that introducing these types of changes can cause legacy code problems. Particularly where developers do some interesting overloading of method arguments. Therefore I would see these as an option that would need to be enabled and that it won't prevent compilation (it would warn not error).

As an alternative I could see these being implemented as a Lint tool. This is something I (and perhaps the community) would be willing to do if we had an open source AST tree as a starting block (if anyone is aware of one in existence for COS).

Apologies if many of these have been raised before, but it never does any harm to keep raising them. They might seem like small issues, but the accumulative time saved by the community in tracking down these bugs will be in the thousands of man days.

1. Warn on potential <COMMAND> errors

If a method has a return type then the compiler should warn of any quit code that does not return a value.

ClassMethod HelloWorld() As %String
{
    if 1 quit                                     // <- this will cause a command error
    quit "Hello World"
}

(Its interesting that studio will complain when you try and quit a for loop with a value, but it wont complain when you try and quit with nothing on the method.)

The same problem happens in reverse, if a method does not return a value then the compiler should warn if the calling code is trying to use a return value

ClassMethod HelloWorld() 
{
    quit   //with nothing
}

Elsewhere...

set foovar=##class(Example).HelloWorld()     // <-​ the compiler should warn about this, as it will cause a command error

 

2. Warn on incorrect return type

If a method has a return type (e.g. As %String) then the compiler should warn of any quit code that returns the wrong type.


3. Warn on incorrect argument type

Arguments in a method can have a declared type. Unfortunately, these type declarations are not enforced in any way. If the calling code passes in the wrong type then it can cause a) crashes, b) worse, silent problems.

ClassMethod HelloWorld(pObject As Foo.Bar) As %String
{
    Write !, pObject.foo    // this will fall over if passed a string
}

Elsewhere...

Do ##class(Example).HelloWorld("Hello")  // this should cause a compiler warning

 

4. Warn on incorrect number of parameters

Calling code should supply the correct number of parameters unless the Formal Spec of the arguments include a Variadic argument, e.g.

ClassMethod HelloWorld(...) As %String

 

However, this would also require some kind of notation that would allow arguments to be optional, for instance TypeScript fixes this problem in JavaScript by implementing a question mark after the argument.

ClassMethod HelloWorld(pRequired As %String, pAlsoRequired As %String, pOptional? As %String) As %String


intializers and optional's would not be allowed to work together.

ClassMethod HelloWorld(pRequired As %String, pAlsoRequired As %String, pOptionalBad? As %String="") As %String


As an alternative, just an empty intializer could be used to mark an argument as optional

 

5. Enable auto suggestion for methods that return objects as an output of the arguments.

Do ##class(Foo.Bar).DoThing(pArg1, .pArg2)


Currently if pArg2 is an object then we have to add a #dim pArg2 to the code to get the auto suggest to work
 

6. Warn on properties and methods that are not members of the object.

Currently the compiler will error on trying to access a non existent property or method of this class, e.g.

Do ..ThisDoesNotExist()  //causes compilation error

 
But if I do the same on an object instance, then I will not get any type of warning...

ClassMethod HelloWorld(pObject As Foo.Bar) As %String
{
     Do pObject.ThisDoesNotExist()  //wont fail until run time 
}
Discussion (24)0
Log in or sign up to continue

From CacheQuality web site:

Pricing
We have different pricing model accordingly with your needs, starting at a rate of 4.200$

To me this is a way to expensive just to know that method or property doesn't exist. Moreover, I want to know it immediately at project compile time. An external tool could parse the code during the night and tell some complex statistics but simple things must tell compiler

I've tried it with Cache 2017.1.0. The first impressions are:
- It completely ignored all "blockers" in the class with 300+ methods, while there was a many of them it should not pass by: 'not procedureblock', legacy flow control, etc; maybe some internal limit was exceeded.

- On the smaller class it popped rather funny messages, e.g. 

  1 blocker
        1 Method declared as "ProcedureBlock = 0"
User.test.cls(+6): ClassMethod RestVars(zzzzzzsource As %String, bOref As %Boolean) As %Status [ ProcedureBlock = 0 ]
  2 major
       2 Usage of QUIT to exit a method
User.test.cls(+24): quit $$$OK
User.test.cls(+28): quit $$$ERROR($$$CacheError, $ze)

Well, 'ProcedureBlock = 0' can be considered not a good style, but what if I need it in some special case? This very method restores local variables stored in a global, so it would loose all its sense be it 'procedureblock'. IMHO, there should be a facility to protect such special case methods.

'Usage of QUIT to exit a method': QUIT is standardized command to exit methods/routines/etc. IMHO it's better than RETURN as it prevents coders to exit their methods from inside the loops and try/catches constructs, so it encourages the modular coding: each module (method) should have one enter and one exit.

So, many of such rules seem to be subjective. Without a facility to customize the rules this linter seems to be no more than a demo of some commercial product.

P.S. As I noticed later, it never clears 'mgr/Temp/' sub-folders it creates.

... studio will complain when you try and quit a for loop with a value

QUIT from inside a loop is considered quitting a loop rather then a function, so it should always be without a value.

...but it wont complain when you try and quit with nothing on the method

According to language definition, each function can be called as a function (set x=$$function(...)) or as a routine (do function(...)). The call type can be recognized inside the function using a construct:

   quit:$quit ReturnValue quit  ; $quit=1 if called as a $$function()

Methods are compiled to functions and behave the same way.

My point was if the compiler will go as far as protecting the developer from this type of quit misshap

It seems that this kind of checkup would be difficult to implement because of variety of methods how the code could be branched depending on its call type. E.g. 

 if $quit {
     ...
     quit rc
 } else {
     ...
     quit
 }

or 

 quit:$quit rc
 ...
 ...
 quit

Of course, both code fragments demonstrate not very good coding style, but they are semantically correct. If one gets many false positives from a (hypothetical) code checker, he would likely drop it.

It may depend on local coding traditions. Just for example, quick search using Studio's <Ctrl-F> through one of our apps (comprised 1000+ classes) showed: 

 q:$q <lvn> q            ; Found 4501 occurrence/s in 90 file/s
 quit:$quit <lvn> quit   ; Found 126 occurrence/s in 28 file/s.
 if $quit                ; Not found.

So, it turned that about 10% of our classes used $QUIT. Not too many, but not negligible few.

I just wanted to add that some occurrences of `quit:$quit value  quit` idiom was probably inherited from legacy coding practices, as Caché allows calling methods/functions using DO command (thus ignoring the value being returned).

From the other hand, using of `quit:$quit ""  quit` construct looks like an attempt to amend erroneous $$ calls of methods/routines, which did not return any value by their initial design. One might say that such design was initially wrong, as any method should return something, at least %Status which is always $$$OK.

HI,

   The first though I have is how, for any reference outside of the routine or class being compiled, could the compiler be sure of what the routine and package mappings are going to be like at run time.  Most likely in a different environment.

   For example, even in a single instance of Cache, it is possible that for one user an argument type, argument, or return type would be correct and for another it would be incorrect.  This is because they are each using different mappings, maybe running is different namespaces, at run time.  You will notice that if you DO a non-existent routine or label in another routine it will compile, however if you call a non-existent label within the current routine you get a <NOLINE> at compile time.

    Having this only work within the current item seems to me to be of limited value.  So I am thinking that this would be a show-stopper for doing this at compile-time.  You would need something, like a 'lint', that could be run in a full test environment or even in production.

Clark

Hi Clark,

Can you provide a more concrete example.

In particular, why would mappings change the formal spec or return type of a method.

Are you using mappings in some kind of environmental polymorphism?

For me the implementation should never be affected by mappings. Mappings are just a convenience, not a logical influence.

I'm struggling to understand the limitation that you are suggesting?

Sean,

   Mapping was just the first example that came to my mind, and probably not the best one.  From two different namespaces with different mappings you could end up using different versions of some routines or classes.  I'm not saying it's good practice, just that it can happen.

   In addition to mappings there are a number of other things to consider, like the fact that the target can be changed after compilation.  There are object code only routines, or routines where, because something has not been re-compiled, the source and object do not match, so which does the compiler use?  What if the target is changed after the caller was compiled and now the call is incorrect?  What if the target does not exist on the development or build machine, and will only work when combined with object code from other modules of a product in a testing environment?

   One likely case where the caller and function are not both available is where hooks are implemented (like ^ZJRNFILT or ^%ZSTART in the kernel) and a product calls a method or routine in a call frame specified by it, if it exists.  Then someone else is free to write the function so long as it conforms to what is expected.  The module making the call would either be getting compile errors, or they would be compiling using some sort of dummy to eliminate the errors.  In any case the check is useless for the developer of the caller, and the developer of the target is unlike to re-compile after each upgrade unless it is specifically documented that the call had changed. 

   At best, in any of these cases you would get numerous "false" errors that are going to make it easier to miss "real" compilation errors.  I am not sure you really want to validate the external calls in the same step as compilation.  Wouldn't you want to do this once the module is complete and connected with any other modules required for the entire package to work?  This way you can be sure that all the inter-object references are correct in the environment being tested.

Clark

All of the uses cases are around the implementation of instance methods and static methods.

In all cases class A either extends or uses class B. For class A to compile, class B must be present and compiled first, no matter what module or namespace it belongs to. If class B implements a generic interface, then all variances of class B should adhere to the same interface (despite being a manual verification check at the moment).

The types of examples you have raised sound like you are hot calling functional code, in which case I agree it would be very hard to lint this style of coding.

Hi,

your suggestions #2 and #3 would work against the weak typed character of the language. I.e. a method like this:

ClassMethod test() as %String { q 23}

Will work perfectly fine, no matter if you use the return value a string or integer. You can actually also return values from methods that don't define a return type at all.
As you already point out, changing any of this behaviour would likely break quite a lot of existing software, so it's not really feasible.

Hi Tomas,

Its a good point, but performance doesn't have to be an issue.

In the main, the lint tool would only need to check the class that is being edited and compiled at that time. If coded optimally (e.g. memoize dictionary lookups etc) then is should only add milliseconds to a compilation cycle. This would hardly be noticed by the developer.

There is a use case where the edited class could break implementations of that class. In this instance a "lint all" process could be hooked into various other non disruptive steps, such as running a full unit test.

Losing milliseconds at compile time is a small trade off to the collective time lost on supporting these types of errors.