Article
· Feb 3, 2016 11m read

Introducing new JSON capabilities in Caché 2016.1

This post is intended to guide you through the new JSON capabilities that we introduced in Caché 2016.1. JSON has emerged as a serialization format used in many places. The web started it, but nowadays it is used everywhere. We've got plenty to cover, so let's get started.

Warning: Some of the features and syntax documented here were subsequently changed in 2016.2, so will only work on 2016.1. See this other article for details.

Until now, our JSON support was tightly coupled with the ZEN framework, our own framework for building web applications. Over time, we recognized a rising demand for accessing the JSON functionality outside of ZEN. Although it could be used outside of ZEN, doing so was confusing, and the old JSON support had a couple of shortcomings that required the introduction of a new API. I want to describe the new API in this post and provide you with some background information on why we made certain design choices.

 

The first change you will notice is the introduction of two new classes, %Object and %Array, both extending %AbstractObject. These classes are your primary interface for handling dynamic data, that is, data with a dynamic structure like JSON. Both classes are located in the %Library package as they implement general-purpose functionality.

 

%Object allows you to create what we call a dynamic object. You will not implement a subclass, but rather you will instantiate a dynamic object and add properties you require at runtime. You can also remove properties and we provide an iterator interface for introspection and discovery of existing properties. Properties are always unordered and we will not guarantee any order.

 

USER>set object = ##class(%Object).$new()

USER>set object.name = "Stefan Wittmann"

USER>set object.lastSeriesSeen = "Daredevil"

USER>set object.likes = "Galaxy"  

 

The above code sample creates a new dynamic object with three properties name, lastSeriesSeen and likes.

 

%Array is an ordered collection of values. You can add new values to an array, remove them and iterate over it. I think you get the picture. We call these dynamic arrays. Arrays support sparseness, which means you can assign a value to slot 100 while slots 0 to 99 are left unassigned, and we only allocate space for one value instead of 101.

 

USER>set array = ##class(%Array).$new()

USER>do array.$push(1)

USER>do array.$push("This is a string.")

USER>do array.$push(object)

 

The above example creates a dynamic array with three values. The first is a number, the second is a string and the last is the dynamic object from our previous example. We are building a dense dynamic array by pushing new values to the array. A sparse array can be built by setting values at a specific key. But let's save sparseness for later.

 

Producing JSON

 

If you want to serialize a dynamic entity to JSON you just call the method $toJSON() on it. The method is pretty smart and returns a string if it fits into a single string and a stream object otherwise. If it is used in a DO statement it will send the output to the current device. The output will be assigned to the left-hand variable if it is used in a SET statement.

 

USER>do object.$toJSON()

{"name":"Stefan Wittmann","lastSeriesSeen":"Daredevil","likes":"Galaxy"}

USER>do array.$toJSON()

[1,"This is a string.",{"name":"Stefan Wittmann","lastSeriesSeen":"Daredevil","likes":"Galaxy"}]
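
The SET form described above can be sketched as follows, reusing the object from the earlier example. The variable name json is only illustrative. Because $toJSON() may return either a string or a stream, the sketch checks the result with $isobject() before writing it out.

```objectscript
USER>set json = object.$toJSON()

USER>if $isobject(json) { do json.OutputToDevice() } else { write json }
```

For small objects like this one the result is a plain string, so the else branch is the one that runs.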
 

Consuming JSON

 

The other direction is pretty simple as well. There are plenty of ways to receive a JSON object, but almost always it will end up in a variable, either as a string or a stream, depending on the size. Just call $fromJSON() and pass in your JSON string or stream. We will take care of the rest.

 

USER>set someJSONstring = "{""firstname"":""Stefan"",""lastname"":""Wittmann""}"

USER>set consumedJSON = ##class(%AbstractObject).$fromJSON(someJSONstring)

USER>write consumedJSON.$size()
2
USER>write consumedJSON.$toJSON()
{"firstname":"Stefan","lastname":"Wittmann"} 

 

The above code example makes use of the method $size() to get the number of properties for the consumedJSON object. $size() is also available for arrays and returns the number of assigned values in the array.

 

Moving on

 

Now that you have seen the basics for the new API for dynamic objects and arrays, let's explore some of the more advanced features and topics. As the data structures are defined at runtime it is very important to provide proper tooling for discovering content. The most important utility is an iterator that allows you to loop through the properties of a dynamic object.

 

USER>set iter = object.$getIterator()

USER>while iter.$getNext(.key,.value) { write "key "_key_":"_value,! }
key name:Stefan Wittmann
key lastSeriesSeen:Daredevil
key likes:Galaxy

 

A very important aspect while designing the API was consistency. A generic functionality like an iterator should behave the same for objects and arrays.

 

USER>set iter = array.$getIterator()

USER>while iter.$getNext(.key,.value) { write "key "_key_":"_value,! }
key 0:1
key 1:This is a string.
key 2:2@%Library.Object

 

The iterator allows us to easily introspect the content of an array and, therefore, we can now discuss sparse arrays. Let's assume we set another value at the positional index 10. Keep in mind that arrays are zero-based.

 

USER>do array.$set(10,"This is a string in a sparse array")

USER>write array.$toJSON()
[1,"This is a string.",{"name":"Stefan Wittmann","lastSeriesSeen":"Daredevil","likes":"Galaxy"},null,null,null,null,null,null,null,"This is a string in a sparse array"] 

 

You can observe that the 11th value is set to the string “This is a string in a sparse array” and all slots from index 3 to 9 are serialized as null values. These null values do not exist in memory; they are only serialized as null because JSON does not support undefined values. You can easily prove this by iterating over the array, as the following code snippet shows.

 

USER>set iter = array.$getIterator()

USER>while iter.$getNext(.key,.value) { write "key "_key_":"_value,! }
key 0:1
key 1:This is a string.
key 2:2@%Library.Object
key 10:This is a string in a sparse array

 

As you can see only 4 keys are defined and keys 3 to 9 are not set to any value. The code sample demonstrates this concept with an array, but the same is true for dynamic objects. It is very important for various environments to handle sparse data efficiently and you will see some references to this in blog posts I will share later.

Error handling and some sugar

 

Another important fact is that we are throwing exceptions in the case of an error instead of returning a %Status value.  Let’s see what happens if we try to parse an invalid JSON string.

 

USER>set invalidObject = ##class(%AbstractObject).$fromJSON("{,}")

<THROW>zfromJSON+24^%Library.AbstractObject.1 *%Exception.General Parsing error 3 Line 1 Offset 2 

 

You can see that the thrown exception includes enough information to conclude that the second character on the first line is invalid.  Therefore, any code that makes use of the new JSON API should be surrounded with a try/catch block at some level. If you think about it, this makes sense as we are dealing with dynamic data and the data may not fit your assumptions.
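
A minimal sketch of such a try/catch block is shown here. The method name ParseSafely and its parameter are hypothetical, and returning an empty string on failure is just one possible convention.

```objectscript
/// Hypothetical helper: returns the parsed dynamic entity, or "" if parsing fails.
ClassMethod ParseSafely(pJSON As %String) As %AbstractObject
{
    set result = ""
    try {
        // $fromJSON() throws an exception instead of returning a %Status
        set result = ##class(%AbstractObject).$fromJSON(pJSON)
    } catch ex {
        // DisplayString() returns a readable description of the exception
        write "Could not parse JSON: ", ex.DisplayString(), !
    }
    quit result
}
```

A caller can then test the return value with $isobject() before touching any properties.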

 

There are multiple benefits to using exceptions as the error-reporting mechanism, but the most important one is that each method is free to return a reference to its output, which allows methods to be chained:

 

USER>do array.$push(11).$push(12).$push(13)

USER>write array.$toJSON()
[1,"This is a string.",{"name":"Stefan Wittmann","lastSeriesSeen":"Daredevil","likes":"Galaxy"},null,null,null,null,null,null,null,"This is a string in a sparse array",11,12,13]

Tight COS integration

 

All the code samples I provided so far created the dynamic objects and arrays explicitly. We called the constructor of the corresponding class - %Object or %Array – and started to manipulate the in-memory object.

With the new API there is an even simpler way to create dynamic objects and arrays by implicitly creating them with the JSON syntax:

 

USER>set object = {"name":"Stefan Wittmann","lastMovieSeen":"The Martian","likes":"Writing Blogs"}

USER>write object.$toJSON()
{"name":"Stefan Wittmann","lastMovieSeen":"The Martian","likes":"Writing Blogs"}
USER>set array = [1,2,3,[4,5,6],true,false,null]

USER>write array.$toJSON()
[1,2,3,[4,5,6],true,false,null] 

 

Isn’t that exciting? It is a very clear, compact and human-readable way of describing what you want to create.  You may have realized that the array is initialized with the JSON values true, false and null.  These values are not directly accessible in COS, but they can be used within the JSON syntax.

 

We did not stop there. To really make this useful and dynamic we allow values to be COS expressions. Consider this example:

 

USER>set name = "Stefan"

USER>set subObject = {"nationality":"German","favoriteColors":["yellow","blue"]}

USER>set object = {"name":name,"details":subObject,"lastUpdate":$ZD($H,3)}

USER>write object.$toJSON()
{"name":"Stefan","details":{"nationality":"German","favoriteColors":["yellow","blue"]},"lastUpdate":"2016-01-31"}

 

This allows you to easily produce and alter JSON structures on the server. There is one thing you should consider, though: Accessing values will always produce a COS friendly value.  Let’s explore an example to understand what this actually means:

 

USER>set array = [1,2,3,[4,5,6],true,false,null]

USER>set iter = array.$getIterator()

USER>while iter.$getNext(.key,.value) { write "key "_key_":"_value,! }          
key 0:1
key 1:2
key 2:3
key 3:5@%Library.Array
key 4:1
key 5:0
key 6: 

 

The output up to key 3 is as expected.  Key 4 returns the COS value 1 for the JSON value true.  Similarly, key 5 returns the COS value 0 for the JSON value false.  Probably less obvious is that key 6 returns an empty string for the JSON value null.

 

The reason for this is that we want to return COS friendly values that can directly be used in COS conditionals. By mapping true to 1 and false to 0 you can directly test for truthiness and falseness in an if-statement.  An empty string is as close as you can get to a JSON null.

 

But how do you distinguish the JSON value pairs true and 1, false and 0 and null and “” from each other? You do so by checking the type. The method you want to use for this is $getTypeOf().

 

USER>w array.$getTypeOf(5)
boolean
USER>w array.$getTypeOf(6)
null 
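
The type check combines naturally with the iterator shown earlier. The following sketch, which uses only the calls introduced so far, prints the type next to each value of the same array:

```objectscript
USER>set iter = array.$getIterator()

USER>while iter.$getNext(.key,.value) { write "key "_key_" ("_array.$getTypeOf(key)_"):"_value,! }
```

This makes it possible to tell the number 1 at key 0 apart from the boolean true at key 4, even though both come back as the COS value 1.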

 

To close the gap you can pass in a type in the setter for dynamic objects and arrays to specify which type the value represents:

 

USER>do array.$set(7,1)

USER>write array.$toJSON()
[1,2,3,[4,5,6],true,false,null,1]
USER>do array.$set(7,1,"boolean")

USER>write array.$toJSON()
[1,2,3,[4,5,6],true,false,null,true]

 

First we set the key 7 to the value 1, which obviously translates to the JSON number 1. If we want to set the value true instead, we can pass the type “boolean” to the setter. I leave the exercise with the JSON value null to the reader.
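
One possible approach to the null case is sketched below. It assumes that the setter accepts the type name "null" (the same name that $getTypeOf() reports) together with an empty string as the value; that assumption is not confirmed in this article, so check it against the class documentation.

```objectscript
USER>do array.$set(8,"","null")   // assumption: "null" is accepted as a type name

USER>write array.$getTypeOf(8)
```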

Behind the scenes

 

Congratulations on getting this far. That was a lot of information and you are still reading. There are more API calls to learn, but for those I would like to refer you to the documentation as well as the class documentation. There are also more advanced topics we could cover, and I would like to cover two of them.

Performance

 

Performance is a difficult word. It means so many things and you have to be very careful to state what you mean when you make use of it. One of the shortcomings of the old API, which includes the zenProxyObject, is its non-linear runtime behavior for serializing and deserializing JSON. While consuming smaller JSON content is just fine, consuming raw JSON content as large as 100MB on disk could take a couple of minutes. One reason for this is that parsing is implemented in COS, which is not the most efficient language for parsing character streams. In addition, the complete object graph had to be constructed in memory.

 

We took great care to ensure that we addressed this issue with the new API. Parsing is directly implemented in the kernel and we invented a highly optimized in-memory structure to manage dynamic entities. The following table shows the results from a performance test we conducted recently.

 

JSON performance comparison table  

 

We tested the zenProxyObject on Caché 2015.1, the new JSON support on Caché 2016.1 and, as a third-party comparison, we ran the same test against Node.JS. We loaded a 10.8 MB file with 1,000 JSON companies and several embedded objects, and you can see that the numbers went down from 28,000ms to 94ms with the new JSON support. Even better, the new JSON support is as fast as running the same operation in Node. Scaling up the test makes the improvement even clearer. Consuming a file with 10,000 JSON companies improved from 386,700ms to 904ms, which is on par with Node again. In-memory operations are pretty fast and linear, but as expected Node is a clear winner here, as it has native support for in-memory JSON objects.

Overall we are pretty happy with the numbers. Let us know about your experience!
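
If you want to take a rough measurement on your own data, a sketch along the following lines may help. This is not the benchmark code behind the numbers above; the method name TimeParse and its parameter are hypothetical, and the sketch only times a single $fromJSON() call on a file stream.

```objectscript
/// Hypothetical helper: times one $fromJSON() call on the file pFile.
ClassMethod TimeParse(pFile As %String)
{
    set stream = ##class(%Stream.FileCharacter).%New()
    set sc = stream.LinkToFile(pFile)
    if $system.Status.IsError(sc) { do $system.Status.DisplayError(sc) quit }

    // $zhorolog returns elapsed seconds with fractional precision
    set start = $zhorolog
    set data = ##class(%AbstractObject).$fromJSON(stream)
    set elapsed = $zhorolog - start

    write "Parsed ", data.$size(), " top-level entries in ", elapsed*1000, " ms", !
}
```

A single run is noisy, so repeat the call a few times before drawing conclusions.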

 

System Methods

 

This is the last topic I want to shed some light on before I conclude. You may have wondered why all method names start with a dollar character: $new(), $set(), $pop(), $size() and so on. This is a new category of methods we introduced, called system methods. There are two different types of system methods, similar to standard methods: instance and class methods.

 

System methods cannot be overridden by standard methods as they live in a separate namespace. If you’ve worked with the zenProxyObject, you are probably aware that we reserved some property names. The reason is that the zenProxyObject defined some internal properties and methods that allowed it to manage its state and provide the API. These stepped on the user's namespace and had to be reserved.

 

With the introduction of system methods we got rid of this problem and there are no reserved property names. Your JSON data can be truly dynamic from now on. System methods are always prefixed with a dollar character.
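
As an illustration, the following sketch uses property names that happen to match system method names. The property names and values are made up for this example; the point is that object.size refers to the property, while object.$size() calls the system method.

```objectscript
USER>set object = {"size":"XL","toJSON":"just a string"}

USER>write object.size

USER>write object.$size()

USER>write object.$toJSON()
```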

Conclusion

 

We think the new JSON support is a huge step forward and will help you to build better interfaces faster. Expect to see some benefits for handling huge sets of data as the API gets adopted by other parts of our stack, e.g. DeepSee. If you haven’t had a chance to experience the new API yet, make sure to give it a try. I hope you are as excited about it as we are, as this is the very foundation for another feature that I will be discussing in a separate blog post soon. Stay tuned.

 

I am looking forward to your feedback.

 

Stefan

Discussion (50)

Has any information about system methods reached the documentation yet? On my 2016.2 FT (build 585) Documatic shows the methods of %Library.Object in a new section headed "SystemMethods" and summarized in a table titled "System Methods". The methods start with keywords systemmethod and systemclassmethod but a DocBook search on either of those terms doesn't get me any extra information.

Also, in Stefan's article the section about system methods says that they "live in a separate namespace" (my emphasis). I don't think he means namespace in the sense of %SYS, SAMPLES, USER etc, but rather that your own method called "new" can coexist with the systemmethod called "new".

I have double checked this topic with our documentation group. The documentation talks about the concept of system methods in general here:

/csp/docbook/DocBook.UI.Page.cls?KEY=GJSON_intro#GJSON_intro_dao

We do not document formally how to define system methods, as this is reserved for InterSystems. See also the following quote from the above documentation link:

Note that there is no supported mechanism for customers to create system methods.

Nevertheless, we have to introduce the concept, so that you are aware what problem system methods are solving and how you can call them (using the dot dollar syntax).

Thanks,

Stefan

I just tested this in a 2016.2 FieldTest terminal session:

USER>set object = {"":"test"}
 
USER>w object.$toJSON()
{"":"test"}
USER>set object."" = "one more test"
 
USER>w object.$toJSON()
{"":"one more test"}

So the answer is, yes we do support empty keys.

We have and continue to test our JSON implementation heavily. If you come across anything that looks incomplete, incorrect or just behaves in unexpected ways, let us know. We are happy to take a look at it.

Many thanks, Stefan

Absolutely. If you are using the embedded JSON-style constructor you can directly make use of special values:

USER>set object = {"boolean":false,"numeric":2.2,"nullValue":null}
 
USER>write object.$toJSON()
{"boolean":false,"numeric":2.2,"nullValue":null}

If you want to manipulate or add special values like null and true/false, you have to use the setter method $set and specify the type with the optional third argument:

USER>do object.$set("anotherBoolean",1,"boolean")
 
USER>write object.$toJSON()
{"boolean":false,"numeric":2.2,"nullValue":null,"anotherBoolean":true}
If I hadn't specified the type in the above sample, the value would have been set to the numeric value 1.
HTH,
Stefan

Hello,

That is not what I mean, sorry. What I meant to ask is whether there is a way to store "primitive" JSON values, not only objects and arrays. For instance, can I:

set xxx = ##class(JsonValue).number(2.3)
set xxx = ##class(JsonValue).null()
// or .true(), .false(), .string("some string constant"), etc
w xxx.getType() // would return either of object, array, number, boolean or null

We have no way to represent these special JSON values in Caché Object Script. When you access these special values they are automatically converted to a friendly Caché Object Script value. Here is some more code to describe this based on the snippets I used before:

USER>write object.$getTypeOf("boolean")
boolean
USER>write object.$getTypeOf("nullValue")
null
USER>write object.$getTypeOf("numeric")
number
USER>write object.boolean
0
USER>write object.anotherBoolean
1

You can see, that I can retrieve the type for each property using the $getTypeOf() method. The boolean property returns 0 while the anotherBoolean property returns 1. Both are Caché Object Script-friendly and allow embedding in if-statements.

We would have lost that capability if we had introduced special Caché Object Script values to reference special JSON values. In addition, keep in mind that we plan to introduce more serialization formats in the future, so we may not only be talking about special JSON values here.

Does that make sense?

I see. Serializing registered and persistent objects to JSON is a new feature in 2016.2. $toJSON() allows you to serialize these objects using a standard projection algorithm that we provide. You can also specify your own projection logic in case the standard logic is insufficient for your needs.

$compose() is a new method that lets you project dynamic entities (%Object and %Array instances) to registered and persistent objects (and vice-versa).

The $compose functionality is using the same standard projection logic. Later versions will allow you to specify and store projection rules for specific needs.

Thanks for the great article! It made a good introduction to the new JSON capabilities in Caché for me.

But it seems like there is one super strange issue with $toJSON method.

Check this simple REST application:

Class JSON.Mystery Extends %CSP.REST
{

XData UrlMap
{
<Routes>
   <Route Url="/test" Method="GET" Call="Test"/>
</Routes>
}

ClassMethod Test() As %Status
{
set a = {"test":1}
do a.$toJSON()
quit $$$OK
}

}

Guess what you get by querying "http://hostname/webAppName/test"? You get nothing. And now, by adding a pointless change to the Test method:

ClassMethod Test() As %Status
{
set a = {"test":1}
write ""  // Right. This one.
do a.$toJSON()
quit $$$OK
}

Everything works as expected, string '{"test":1}' is outputted. Tested on Caché 2016.2.0.590.0.

Any ideas on why this happens? Doesn't $toJSON function initialize the output as write does?

do a.$toJSON() does not work properly with I/O redirection. This is a known issue and will be fixed in future releases. The workaround is very simple: use write a.$toJSON(), or write something else to the stream before (like you did in your second example).

Personally, I prefer to be explicit in my REST methods and use the write command when I want to output something to the response stream. So this code snippet will work in your REST class:

ClassMethod Test() As %Status
{
set a = {"test":1}
write a.$toJSON()
quit $$$OK
}

That's why I use this code in my projects
    set stream=##class(%Stream.TmpCharacter).%New()
    do Result.$toJSON(stream)
    
    while 'stream.AtEnd {
        write stream.Read()
    }

and such write, instead of simple
do stream.OutputToDevice()

because sometimes I got an unreadable response in that case. I don't know why; maybe because of gzip.

That is correct. If you expect to serve larger JSON content you should make use of the stream interface as Dmitry has pointed out. Here is a snippet that copies the content of a dynamic object to a local file on Windows:

ClassMethod WriteObjectToFile(pObject As %Object)
{

set stream=##class(%Stream.TmpCharacter).%New()
set filestream=##class(%Stream.FileCharacter).%New()
set sc=filestream.LinkToFile("c:\Temp\jsonfile.txt")
do pObject.$toJSON(stream)

do filestream.CopyFrom(stream)
do filestream.%Save()

}

Why does this not work for me (Caché 2016.2.2)?

ClassMethod WriteObjectToFile(pObject As %DynamicObject)
{
SET filename="/ps/pctran/Temp/ETIM.json"
SET stream = ##class(%Stream.TmpCharacter).%New()
SET filestream = ##class(%Stream.FileCharacter).%New()
SET sc = filestream.LinkToFile(filestream)
do pObject.%ToJSON(stream)

DO filestream.CopyFrom(stream)
DO filestream.%Save()
}

I get an <STORE> error

What is the problem?

The $toJSONFormat() method provided a way to output a formatted (pretty-printed) JSON string/stream. There is an effort involved in making sure such a method works properly on all supported platforms (which it didn't) and in addition, there are various options that users would ask for (like omitting null values, indent or no indent, indent with spaces, etc.). We had a similar experience with the previous JSON API.

We decided to put our initial efforts into the machinery and not into pretty-printing. For that reason, we do not produce any output that is irrelevant for machine processing, which is the major task for this output as JSON is a data-interchange format.

It is very simple to post-process a JSON file or string and pretty-print or minify it. There are online tools like

http://www.freeformatter.com/json-formatter.html

and there is functionality/plugins available for popular text editors like Sublime 3:

https://github.com/dzhibas/SublimePrettyJson

Also, there are many node.js packages available that pretty-print or minify JSON files and that can be automated with a grunt/gulp task if you have to automate this for some reason. 

Personally, I just copy/paste my JSON content into Sublime3 and pretty print it.

1) ewd-document-store is primarily designed for use by the JavaScript developer who wants to use / access a Cache database without needing to learn or use COS, or for anyone wanting to migrate their logic to JavaScript whilst retaining Cache as a database.

2) it works with any version of Cache that supports the cache.node interface (at least 2012.x onwards)

3) It can be used against existing/legacy  global storage as well as new storage.

4) It's free Open Source (Apache 2 licensed)

Read the documentation for more information.

excellent - thanks !...

But what about collections, say, a property 'b' that is a collection (with b1, and b2 keys)

>set objFromJSON = {}.$fromJSON("{""a"":""1"",""b"":[{""b1"":""x""},{""b2"":""y""}]}")
>write objFromJSON.a
1
>write objFromJSON.b
24@%Library.Array

>set arr=objFromJSON.b

 

I can only get to each item in 'b' by instantiating an iterator (using arr.$getIterator()), and looping through the list with the $getNext() method of the resulting iterator.   I can remove, get the last, add to the end and set an item in the collection.

I'm assuming there is no concept of getting the item #1 from the collection - using '1' as the key, indicating the first in the collection, or getting #2, indicating the second item - something like

set bObject=arr[1] or  set bObject=arr.GetAt(1) or bObject=arr.Get(1) ?

Steve