Sean Connelly · Jun 8, 2017

Thanks Rubens, the port looks really good.

I agree with the extra JSON use case for legacy code.

It will need some thought to achieve similar functionality and performance. Perhaps a just-in-time code generator that gets cached...

Sean Connelly · Jun 7, 2017

Agreed, especially as there are a hundred-plus classes pending for release.

I have an existing sync tool that will automatically export into folders, but I am in the middle of hacking it to work with UDL and a couple of other new features. Until I know it's production ready I will do these next few releases of Cogs manually into the same folder (next week or so).

Sean Connelly · Jun 6, 2017

Hi Javier,

COS does not have a Generics implementation, mainly because it's a loosely/duck-typed language.

You can however write generic code without needing Generics.

Type your property as a base class of your Info classes; this can be %RegisteredObject...

Class Response Extends %RegisteredObject
{
    Property Code As %String;
    Property Info As %RegisteredObject;
}


You can now assign any valid object to that property at run time.

You won't be able to assign a plain string to this property, so create a wrapper class with a single property of type %String and assign that instead.

Try that and if you get stuck with the JSON serialisation then post back the code that is not working.

Sean.

Sean Connelly · Jun 2, 2017

What about...

&sql(select RuntimeType into :qRuntimeType from %Dictionary.CompiledProperty where ID1='Foo.MyClass||MyProperty')
Sean Connelly · Jun 2, 2017

Hi Rubens,

Is this what you are after...

set rs=statement.%Execute()
set meta=rs.%GetMetadata()
set colmeta=meta.columns.GetAt(1)
set runtimeType=colmeta.property.RuntimeType

Sean.

Sean Connelly · Jun 2, 2017

Great answer Rubens.

The class documentation makes no mention of the second parameter and I was not aware that it existed.

Fortunately I've only had to deal with documents under the large string size to date, and did wonder how I might need to work around that limitation at some point.

Question: the length the XML writer uses is set to 12000. Would this solution work for 12001, or does the chunk size have to be divisible by 3? I'm wondering because 3 bytes are represented by 4 characters in base64.
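The alignment concern is easy to sketch in Node (illustrative only, not the Caché internals): base64 encodes each 3-byte group as 4 characters, so independently encoded chunks only concatenate cleanly when every chunk length is a multiple of 3.

```javascript
// Illustrative sketch: why chunk sizes divisible by 3 matter for base64.
const data = Buffer.alloc(24000, 120); // 24000 bytes of 'x'

const whole = data.toString('base64');

// 12000 % 3 === 0, so chunked encoding matches encoding the whole buffer
const aligned = data.subarray(0, 12000).toString('base64')
              + data.subarray(12000).toString('base64');

// 12001 % 3 === 1, so the first chunk ends in '=' padding mid-stream
const misaligned = data.subarray(0, 12001).toString('base64')
                 + data.subarray(12001).toString('base64');

console.log(aligned === whole);    // true
console.log(misaligned === whole); // false
```

So a 12001-byte chunk would break the concatenation, because the padding characters would appear in the middle of the stream.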

Sean.

Sean Connelly · Jun 2, 2017

It might sound a bit of a far-fetched idea...

After recent (minor) database corruptions caused by VM host activities, I did wonder if a future Caché version could be made to self-heal by using its mirror member.

Sean Connelly · Jun 1, 2017

100,000 per second is a synthetic benchmark. A for loop in a terminal window will only just do 100,000 global sets a second, and that is without any data validation, data loops, referential integrity etc.

You also don't mention if this is done via the API or over the network; I would only be interested in the over-the-network benchmarks.

What I would be really interested in are real-world benchmarks that track the number of HTTP requests handled per second. So not some tight benchmark loop, but real end-to-end HTTP requests from the browser, federated through Node, to cache.node and Caché and back again.

Plus, I am not really interested in global access from Node. I want to work with objects everywhere and gain the performance of letting optimised queries run on Caché without shuffling data back and forth unnecessarily.

I know cache.node does handle objects, but it just doesn't fit my needs; I'm not a fan of the API and it is missing some functionality that I need.

Fundamentally, there is a mismatch between the CoffeeTable framework that I have developed and the cache.node API.

Basically, it just didn't seem like a good idea to end up using cache.node as nothing more than a message forwarder with potential overhead that I can't see. What I ended up with is a lean 142 lines of Node code that is practically idling in the benchmarks I have done so far.

I also have concerns over the delays I have read about with cache.node versions keeping up with the latest Node.js version.

The other thing is: where is its open source home? I looked and couldn't find it. It would have been nice to inspect the code, see how it works and fill in the gaps that the documentation does not go deep enough into.

Ultimately, why not have alternatives: different solutions for different needs.

Sean Connelly · Jun 1, 2017

One small caveat to consider: whilst JSON does not have a date type, there is a mismatch between the preferred W3C date format that most people use and the internal date format of Caché.

You will find with both of the suggestions that you will still need to do a last-minute translation of these dates before you call %Save(), otherwise you will get a save error.
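For illustration, the translation can equally be done on the client before the JSON is sent. This is a hypothetical helper (the function name is mine, and it targets the ODBC-style "yyyy-mm-dd hh:mm:ss" format that a Caché %TimeStamp property accepts), not part of either suggestion:

```javascript
// Hypothetical client-side helper: rewrite a W3C / ISO 8601 timestamp
// into the ODBC-style format ("yyyy-mm-dd hh:mm:ss") that a Caché
// %TimeStamp property accepts, before the object reaches %Save().
function toOdbcTimestamp(w3c) {
  const d = new Date(w3c);
  const pad = n => String(n).padStart(2, '0');
  return d.getUTCFullYear() + '-' + pad(d.getUTCMonth() + 1) + '-' + pad(d.getUTCDate())
       + ' ' + pad(d.getUTCHours()) + ':' + pad(d.getUTCMinutes()) + ':' + pad(d.getUTCSeconds());
}

console.log(toOdbcTimestamp('2017-06-01T09:30:00Z')); // "2017-06-01 09:30:00"
```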

Sean Connelly · Jun 1, 2017

It's a very simple JSON-RPC wire protocol. The JSON is stripped of formatting, then delimited with ASCII 13+10 (CR+LF), which are already escaped inside the JSON. Nothing more complicated than that.
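A sketch of that framing in Node (function names are illustrative, this is not the actual connector code):

```javascript
// JSON.stringify never emits a literal CR or LF - inside string values
// they are escaped to \r and \n - so ASCII 13+10 is safe as a delimiter.
function frame(msg) {
  return JSON.stringify(msg) + '\r\n';
}

// Accumulate raw TCP data and split out complete messages;
// returns whatever partial message is left over.
function deframe(buffer, onMessage) {
  let idx;
  while ((idx = buffer.indexOf('\r\n')) !== -1) {
    onMessage(JSON.parse(buffer.slice(0, idx)));
    buffer = buffer.slice(idx + 2);
  }
  return buffer;
}

const wire = frame({ method: 'find', params: { note: 'line1\r\nline2' } });
const rest = deframe(wire + '{"partial', msg => console.log(msg.method)); // "find"
console.log(rest); // '{"partial'
```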

> How do you deal with license usage? How much does it escalates with a fair amount of users and how do you manage all of that?

I can only refer to benchmarks at the moment, which is why the Node connector is still marked as experimental.

The set-up was a single 3-year-old commodity desktop machine running a stress tool, Node, Caché and about 10 other open applications.

The stress tool would simulate 50 users sending JSON-RPC requests over HTTP to a Node queue, a single Caché process would collect these requests over TCP, unpack the JSON, perform a couple of database operations, create a response object, serialise it and pass it all the way back.

With one single Caché process consuming one single licence, I recorded an average of 1,260 requests per second.

Sean Connelly · Jun 1, 2017

As requested, here are some snippets of the ORM library, which works in both the browser and Node.js. These are taken from some of the 30,000 unit tests that I built on top of the Northwind database data.

The solution starts with a Caché class that extends the Cogs.Store class, this is just a normal %Persistent class with extra methods.

Class Cogs.CoffeeTable.Tests.Northwind.Customers Extends Cogs.Store
{

Parameter DOMAIN = "northwind";

Property CustomerID As %String;

Property CompanyName As %String;

Property ContactName As %String;

Property ContactTitle As %String;

Property Address As %String;

Property City As %String;

Property Region As %String;

Property PostalCode As %String;

Property Country As %String;

Property Phone As %String;

Property Fax As %String;

Index CustomerIDIndex On CustomerID [ IdKey, PrimaryKey, Unique ];

}

There are then two approaches to develop in JavaScript. The first is to include a client API script that is generated on the fly; this includes a promise polyfill and an HTTP request wrapper. This is a good approach for small to medium projects.

In this instance there will be a global object called northwind that contains a set of database objects, each with a set of CRUD methods.

A basic example of using find...

northwind.customers.find().then( function(data) { console.log(data) } )

The second approach uses TypeScript and Browserify, following a modern ES6 approach.

A code generator produces a TypeScript Customer schema class...

import {Model} from 'coffeetable/Model';

export class CustomerSchema extends Model {

    static _uri : string = '/northwind/customers';

    static _pk : string = 'CustomerID';

    static  _schema = {
        Address : 'string',
        City : 'string',
        CompanyName : 'string',
        ContactName : 'string',
        ContactTitle : 'string',
        Country : 'string',
        Fax : 'string',
        Phone : 'string',
        PostalCode : 'string',
        Region : 'string',
        CustomerID : 'string'
    };

    CustomerID : string;
    Address : string;
    City : string;
    CompanyName : string;
    ContactName : string;
    ContactTitle : string;
    Country : string;
    Fax : string;
    Phone : string;
    PostalCode : string;
    Region : string;

}

as well as a model class which can then be extended without affecting the generated class...

import {CustomerSchema} from '../schema/Customer';

export class Customer extends CustomerSchema {

    //extend the proxy client class here

}

Now I can develop a large-scale application around these proxy objects and benefit from schema validation and auto type conversions, as well as having object auto-complete inside IDEs such as WebStorm.

Create and save a new object...

import {Customer} from "./model/Customer";

var customer = new Customer();
//Each one of these properties auto completed
customer.CustomerID = record[0];
customer.CompanyName = record[1];
customer.ContactName = record[2];
customer.ContactTitle = record[3];
customer.Address = record[4];
customer.City = record[5];
customer.Region = record[6];
customer.PostalCode = record[7];
customer.Country = record[8];
customer.Phone = record[9];
customer.Fax = record[10];
customer.save().then( (savedCustomer : Customer) => {
    console.log(savedCustomer)
}).catch( err => {
    console.log(err)
})

Open it...

Customer.open('ALFKI').then( customer => {
    console.log(customer.CompanyName);    
})

Search...

Customer.find({
    where : "City = 'London' AND ContactTitle = 'Sales Representative'"
}).then( customers => {
    console.log(customers);
});

The last example returns a managed collection of objects. In this instance the second approach includes a more sophisticated client library to work with the collection, such that you can filter and sort the local array without needing to go back to the server.

customers.sort("Country")

This triggers a change event on the customers collection, which would have been scoped to a view. For instance, you might have a React component that subscribes to the change event and sets its state when the collection changes.

Motivation

I needed to develop an application that could run on existing customer databases (Ensemble -> Caché, Mirth -> PostgreSQL, as well as MongoDB), such that the database can be swapped in and out without changing a line of client code.

I looked at adapting one of the existing ORM libraries such as Sequelize or Sails, but it was easier to start from scratch to leverage Caché without needing lots of duct tape to get it working.

This new solution required a JSON-RPC interface and more JSON functionality from Caché, hence re-engineering some old JSON libs and building out the Cogs library.

Moving forward, the plan is to release CoffeeTable as a separate NPM library, and Cogs will essentially be a server-side adapter to it.

Probably the wrong forum to talk about GT.M, but I have a long-standing internal library that was designed for this eventual abstraction, and it will be one of the databases added to CoffeeTable down the line.

Sean Connelly · May 31, 2017

I ended up writing my own solution in the end.

It's a TCP wire-based solution that uses JSON-RPC messages as the main protocol.

Node starts up a concurrent TCP listener and then Caché jobs off as many client connections as required.

It's surprisingly simple on the Node side: minimal glue to bind HTTP requests to TCP messages with zero blocking.

I did quite a lot of testing on it at the time I wrote it and found that I could get twice as many RPC messages into Caché via Node than I could via CSP. My guess is that the RPC route does not have to deal with all the HTTP protocol overhead.

I then wrapped the same event emitter used for the HTTP requests with a small promise caller and was able to do some testing of proxy objects inside Node itself. It's a little bit experimental on the Node side, but I am able to run the 30,000 browser unit tests (lots of automated ones in there) over the ORM library and it just works.

Not sure I would want to put it into production until it's been kicked around some more.

Sean Connelly · May 31, 2017

Hi Alexy,

You've fished out a property that is of type Cogs.Lib.Types.Json.

In its property state the JSON is stored as a pure string, hence the odd escaping you're seeing.

When it's serialised back out to JSON it will be correctly escaped, which you can see in the JSON dump I posted before it.

This provides the best of both worlds, schema driven properties that can have one or more non schema properties for generic data storage.

By the way, Cogs includes JSON classes for serialising and deserialising to and from arrays and globals as well. Interestingly, they are only 50 lines of code each, so it will be interesting to compare them.

Sean.

Sean Connelly · May 31, 2017

Thanks Rubens and Alexander. I've not even released the code yet and I'm already getting good ideas to improve things; open source at its best.

Given that return types can also be applied to methods, I am now weighing up native vs annotations.

Any preferences?

Sean Connelly · May 31, 2017

Excellent, just tried it and the value is accessible via ReturnTypeParams in the %Dictionary.CompiledMethod table.

Sean Connelly · May 31, 2017

OK, excellent, thanks for that. I seem to remember hitting a brick wall trying to get this to work many moons ago.

Sean Connelly · May 31, 2017

Hi Alexander,

Unless I am missing a cool trick, you can't do this directly...

Property DateOfBirth As %Date(JSONNAME = "BirthDate");

You would need to extend %Date with your own class type and add the JSONNAME parameter to it, which means you end up with...

Property DateOfBirth As Cogs.Lib.Types.Date(JSONNAME = "BirthDate");

Which for me feels much more cumbersome, not to mention that developers are forced to change all of their existing code, as well as amend any existing overridden data types that they use.

Unless I am missing another trick, I'm pretty sure you can't add these parameters to complex types, which, if I am right, is a show stopper anyway.

Annotations are just much easier to work with. I need them for methods as well, so it seems more in keeping to do it all this way.

Sean. 

Sean Connelly · May 31, 2017

Hi Rubens,

I designed the solution around the real life use cases that I hit in my mainstream work.

In most instances I am handling JSON to and from a browser, and I have never had a use case where the JSON exceeds Caché's long string limit of 3,641,144 characters.

That's with the exception of wanting to post a file along with JSON. In that instance I have some boilerplate code that sends them as multiparts and joins them back together after the main JSON parse.

With those decisions made, it was just a matter of writing very efficient COS code that processes long strings. A couple of years ago the serialiser and deserialiser classes stacked up pretty big; in this latest version they are an uber-efficient 90 and 100 lines of code each.

There is no AST magic going on, just projection compilation with inspection of the dictionary: a small lib to abstract the annotations, plus various code generator tricks to bake in type handlers and delegators.

Where data might go over 3,641,144 characters is passing data back and forth with Node.js or another Caché server. In this instance the data is almost always going to be an array of results or an array of objects. For the latter, there is a large array helper class I am working on that will split individual objects out of a stream and then handle them as long strings. This will be part of the Node package.

In the few fringe cases where someone might be generating single objects larger than 3,641,144 characters, it wouldn't be too hard to have stream variants. I used to have these, but dropped them because they were never used. I would still keep the string handler variants as the primary implementations, as they prove very quick.

As for older Caché instances, I was supporting JSON as long as 8 years ago and still see the need for backwards compatibility.

Sean.

Sean Connelly · May 30, 2017

As Daniel has said.

Plus, you might want to unit test them, in which case you need to create an instance of your web service class and call its instance method, e.g.

set service=##class(MyApp.MyService).%New()
set result=service.Test()

where MyApp.MyService is the name of your web service class, and Test() is the instance method you want to call.

Sean Connelly · May 30, 2017

Hi Ranjith,

This is a very good question, and the opposite of one asked a week ago...

https://community.intersystems.com/post/xml-json-ensemble

Caché has varying degrees of support for JSON which will depend on the version of Caché that you have.

Firstly, you will not find a one-step solution to your problem inside of Caché.

It's important to note that there is an impedance mismatch between JSON and XML that can produce different results in a one-step solution. If you really don't care about this, or about the exact format of the XML, then I can point you towards...

http://www.newtonsoft.com/json/help/html/ConvertingJSONandXML.htm

which is a .NET solution. You could create a simple .NET object that wraps and calls this conversion utility. You can then bind to that .NET object using these instructions...

http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY…

making the utility feel as if it were a local Caché object / function.

There are Java alternatives which you can google for, for which you would use the Java binding...

http://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY…

Alternatively, a two-step conversion will require a little more coding, but will enable you to control your XML output exactly as you want it.

First you will need to convert the JSON into an internal object. If you are on 2016.1 or greater then please take a look at this article...

https://community.intersystems.com/post/introducing-new-json-capabiliti…

You can use the %Object to ingest JSON into a generic object.

From here you will need a concrete class that will be used to generate your XML. Make sure it extends %XML.Adaptor. It will then be a process of mapping each property from the generic object to the concrete object. Finally, call its XML-to-string / stream method and you will have well-formed and consistent XML.

If you are on an older version of Caché then take a look at the %ZEN.Auxiliary.jsonProvider class, which has a %ConvertJSONToObject() method. Apparently it's much slower than the newer object, which might factor into your solution. I've never used this method myself, but I would think you will end up with a very similar solution to the newer %Object.

Sean.

Sean Connelly · May 30, 2017

Hi Shobha,

Ensemble logs a great deal of metrics that you can use to determine all sorts of timings.

If you look at the header of any message you will see the time it was created and the time it was processed. This will give you the individual times taken for that message in its service or process to complete.

To get an end to end time you will need to know when the operation completed its task.

In your instance you can enable "Archive IO", which you will find under "Development and Debugging" in your file-out operation settings. This will record the time received and the time responded. If you take one of these times and subtract the created time of the service message, then you will have an overall benchmark.

Note that in dev you will probably see very little or no lag between each message stage. However, when you get to a live environment this lag can increase from milliseconds to seconds depending on load. Therefore it's best to take any benchmark on dev as just a best-case scenario.

Sean.
 

Sean Connelly · May 26, 2017

Just for good measure, I benchmarked Vitaliy's last example and it completes the same test in 0.344022 seconds, so for out-and-out performance a solution built around this approach is going to be the quickest.

Sean Connelly · May 26, 2017

OK, I got the third example working; I needed to stash the dirs as they were getting lost.

Here are the timings...

Recursive ResultSet  =  2.678719

Recycled ResultSet  =  2.6759

Recursive SQL.Statement  =  15.090297

Recycled SQL.Statement  =  15.073955

I've tried it with shallow and deep folders with different file counts and the differential is about the same for all three.

The recycled objects surprisingly only shave off a small amount of time. I think this is because of bottlenecks elsewhere that overshadow the milliseconds saved.

SQL.Statement being 6-7x slower than ResultSet is a surprise, but then the underlying implementation is not doing a database query, which is where you would expect it to be the other way around.

The interesting thing now would be to benchmark one of the command line examples that have been given to compare.

Sean Connelly · May 26, 2017

> I don't recommend opening %ResultSet instances recursively.

Agreed, but maybe splitting hairs if it is only used once per process.

> It's more performatic if you open a single %SQL.Statement  and reuse that.

Actually, it's MUCH slower; not sure why. I just gave it a quick test, see for yourself...

ClassMethod GetFileTree(pFolder As %String, pWildcards As %String = "*", Output oFiles, ByRef pState = "") As %Status
{
    if pState="" set pState=##class(%SQL.Statement).%New()
    set sc=pState.%PrepareClassQuery("%File", "FileSet")
    set fileset=pState.%Execute(##class(%File).NormalizeDirectory(pFolder),pWildcards,,1)
    while $$$ISOK(sc),fileset.%Next(.sc) {
        if fileset.%Get("Type")="D" {
            set sc=..GetFileTree(fileset.%Get("Name"),pWildcards,.oFiles,.pState)
        } else {
            set oFiles(fileset.%Get("Name"))=""
        }    
    }
    quit sc
}

** EDITED **

This example recycles the FileSet (see comments below regarding performance)

ClassMethod GetFileTree3(pFolder As %String, pWildcards As %String = "*", Output oFiles, ByRef fileset = "") As %Status
{
    if fileset="" set fileset=##class(%ResultSet).%New("%Library.File:FileSet")
    set sc=fileset.Execute(##class(%File).NormalizeDirectory(pFolder),pWildcards,,1)
    while $$$ISOK(sc),fileset.Next(.sc) {
        if fileset.Get("Type")="D" {
            set dirs(fileset.Get("Name"))=""
        } else {
            set oFiles(fileset.Get("Name"))=""
        }    
    }
    set dir=$order(dirs(""))
    while dir'="" {
        set sc=..GetFileTree3(dir,pWildcards,.oFiles,.fileset)        
        set dir=$order(dirs(dir))
    }
    quit sc
}
Sean Connelly · May 26, 2017

I've removed the recycled resultset example; it was not working correctly. It might not work at all as a recycled approach. I will look at it further and run more timing tests if I get it working.

In the meantime, my original example without recycling the resultset, on a nest of folders with 10,000+ files, takes around 2 seconds, whereas the recycled SQL.Statement example takes around 14 seconds.