This is an important point. Depending on the size of each node, this $order loop could be touching virtually every block in the global. If you read the 4GB test global after setting it, you're reading from a warm buffer pool, whereas the 40GB production global is less likely to be buffered; hence the more-than-10x difference in time.

I don't have a good suggestion for how to make this loop run faster; $prefetchon might help a little. Instead, depending on why you need to perform this operation, I'd either cache the record count (which could then become a hot spot of its own) or maintain an index global (possibly a bitmap).
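
To make those options concrete, here's the shape of the counting loop next to a write-time counter; ^Data, ^DataCount, key, and value are hypothetical names:

; counting by traversal: touches a block for nearly every node
set count = 0, key = ""
for {
    set key = $order(^Data(key))
    quit:key=""
    set count = count + 1
}

; counting at write time: no traversal, but ^DataCount can become
; a hot spot under concurrent inserts
set ^Data(key) = value
set newCount = $increment(^DataCount)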

Can you elaborate on what it would mean to support Parquet? The diagram appears to have come from this post:

https://blog.openbridge.com/how-to-be-a-hero-with-powerful-parquet-googl...

In that context, the query run time refers to acting on the file directly with something like Amazon Athena. Do you have in mind importing the file (which is what an IRIS user would currently do with CSV), or somehow querying the file directly?

Since Caché 2016.1, the %SYSTEM.Process class contains a TimeZone() method that lets you alter the value of the TZ environment variable, thereby changing the local time:

https://docs.intersystems.com/irislatest/csp/documatic/%25CSP.Documatic....

Depending on your situation, you can either change the current process, or isolate the change by jobbing off a background process.

Once the time zone is set appropriately, you can use dformat -3 of the $zdatetime and $zdatetimeh functions to convert between UTC and local. It would be cleaner if the time zone were an explicit argument, rather than an implicit part of the environment.
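
A minimal sketch (the zone name is illustrative, and I'm assuming that TimeZone() without an argument returns the current setting):

set oldtz = $system.Process.TimeZone()            ; remember the current TZ
do $system.Process.TimeZone("America/New_York")   ; change local time for this process
set utc = $ztimestamp                             ; current UTC date/time, $H format
set local = $zdatetimeh(utc, -3)                  ; UTC -> local, $H format
write $zdatetime(local), !                        ; display the local date/time
set utc2 = $zdatetime(local, -3)                  ; local -> UTC round trip
do $system.Process.TimeZone(oldtz)                ; restore the original TZ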

No, the VALIDATE MODEL statement does not do cross validation. It calculates validation metrics for the given trained model and dataset. As described in the "Model Selection Process" section of the documentation, however, the TRAIN MODEL statement does this to some extent for classification models when using the AutoML provider:

These scoring metrics are then computed for each model using Monte Carlo cross validation, with three training/testing splits of 70%/30%, to determine the best model.

I also believe that the DataRobot provider incorporates cross validation into its training. I'm not sure about H2O.
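
For reference, the statements in question look like this; the model and table names are made up, and I believe the resulting metrics can then be queried from INFORMATION_SCHEMA.ML_VALIDATION_METRICS:

TRAIN MODEL demo_model FROM train_table
VALIDATE MODEL demo_model FROM holdout_table
SELECT * FROM INFORMATION_SCHEMA.ML_VALIDATION_METRICS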

This new error is indeed a breaking change in InterSystems IRIS 2019.3, as described in the 2020.1 release notes. Since I unfortunately can't link to them at the moment, I'll quote the relevant section from the upgrade compatibility checklist:

3.2.26 System — Class Compiler Validates Global Name Length Limit

This release adds compiler validation to ensure that the global names defined for DataLocation, IdLocation, IndexLocation, StreamLocation, CounterLocation, and VersionLocation are all within the global name length supported by the system (currently 31 characters). In previous releases, these global names were silently truncated to 31 characters. This could cause collisions between global names in the same class. These collisions would cause what appears to be references to separate globals to actually reference the same global.

During class compilation, an error will be reported if any of the global names used for the class are longer than the supported length. For example:

Compiling class %UnitTest.Result.TestInstance

ERROR #9101: Global name 'UnitTest.Result.TestInstanceStream' for 'StreamLocation' is too long, must be no more than 31 characters in length. [ConstructAddresses+60^%ocsExtentCache:BUILDSYS]
  > ERROR #5030: An error occurred while compiling class '%UnitTest.Result.TestInstance' [compile+59^%occClass:BUILDSYS]

This change will most likely result in classes that need to be modified upon upgrade. This more commonly occurs on systems that were converted from Caché or Ensemble using the in-place conversion if the classes were first defined using %CacheStorage and the class names were fairly long. (first in 2019.3)

I don't believe that you can disable this validation.
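
If the error turns up in your own classes after an upgrade, the fix I'd expect is to shorten the offending global name in the storage definition. A hypothetical sketch, showing only the relevant element (the global name is made up; it just has to be 31 characters or fewer):

Storage Default
{
<StreamLocation>^My.Short.StreamS</StreamLocation>
}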

The Python documentation is pretty good. I haven't been too impressed by the Python books I've looked at. I didn't get much out of Learning Python, and even less out of Programming Python. Effective Python is okay, but not great. If I have a favorite, it's Fluent Python by Luciano Ramalho, although it probably shouldn't be your first book on the subject. A second edition is underway; you may need a Safari subscription to read it.

I don't know whether this is documented, but you can use array accessors in the property expression. To create a property for the productID of the first entry in the OrderContent array, you should be able to use an expression like this in a %CreateProperty() call: $.OrderContent[0].productID.
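
A minimal sketch, assuming a DocDB database named "orders" (the database and property names here are made up):

set db = ##class(%DocDB.Database).%GetDatabase("orders")
do db.%CreateProperty("firstProductID", "%String", "$.OrderContent[0].productID")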

Unfortunately, this doesn't address the general case of searching an array. I don't think you can create a collection property in DocDB, nor do the %FindDocuments() operators seem to be of much use. You might try poking around in your generated class to see whether you can use an existing property as a template for creating your own computed property that aggregates the productID values. If that works, you may still find the %FindDocuments() operators inadequate, but the property would at least be accessible to SQL queries.

I took the spec. from your reply to Dmitriy's question, changed "query" to "path", smooshed it into one line, then pasted it into a terminal:

USER>s swagger={...}

USER>s status=##class(%REST.API).CreateApplication("neerav",swagger)

USER>zw status
... /* ERROR #8722: Path parameter country is not used in url /Demo of route with GETDemo in class: neerav.spec

I then changed "/Demo" to "/Demo/{country}":

USER>s swagger={ "swagger":"2.0","info":{ "title":"O","version":"1" },"schemes":["http","https"],"consumes":["application/json]"],"produces":["application/json]"],"paths":{ "/Demo/{country}":{ "get":{ "summary":"Get Demo ","description":"Demo ","operationId":"GETDemo","x-ISC_CORS":true,"consumes":["application/json]","text/html]"],"produces":["application/json]","text/html]"],"parameters":[{ "name":"country","type":"string","in":"path","description":"Country Name","allowEmptyValue": true } ],"responses":{ "200":{ "description":"Returns Country","schema": { "type":"string" } } } } } } }

USER>s status=##class(%REST.API).CreateApplication("neerav",swagger)

USER>zw status
status=1

I'm curious to know in what version this previously worked. I get the same error in 2019.1.2.

I wasn't sure whether the Caché ODBC driver works with IRIS. Even if it does now, it's not guaranteed to do so in the future. I did find that the C++ binding SDK appears to include everything you need to generate proxy classes. You just need to call Generator::gen_classes() in generator.cpp.

Again, everything I've described is an unsupported hack. If a C++ binding application is holding up an upgrade from Caché to IRIS, let your Support or Sales Engineering rep. know.

You are correct that callin is not a replacement for the C++ binding. Callin is a lower-level API that only works for a local instance. The C++ binding has two flavors: the light C++ binding (LCB), which depends on callin, and the full C++ binding, which depends on the ODBC driver.

It's not clear to me from your questions how you're currently using C++. Do you actually have an existing C++ binding application, or are you writing something new with a requirement to use C++? I suggest that you get in touch with your friendly InterSystems representative in Support or Sales Engineering to help you find the best path forward.

That said, I was curious about whether the C++ binding from Caché can be used with InterSystems IRIS. Although I've never used this API on purpose, I was able to get it working after a fashion with a few modifications. I presume this is not supported by InterSystems, and I would hesitate to use it in production. The following overview describes my experiment using the C++ binding SDK from a Caché 2018.1.4 kit for lnxrhx64 with the Community edition of IRIS 2020.3.0 in a container. Apologies for any copy-paste mistakes. I started this on a lark, and didn't take very good notes.

I installed g++, make, and vim in the container:

$ docker exec -it --user root cppbind bash
root:/home/irisowner# apt update && apt install -y g++ make vim
...

I extracted cppsdk-lnxrhx64.tar.gz, and found that the C++ binding uses dlsym() and dlopen() to call entry points in the ODBC driver library. (Search for the init_pfn macro in odbc_common.cpp.) These are still present in the IRIS library, as you can see with the nm command:

$ docker exec -it cppbind bash
irisowner:~$ nm /usr/irissys/bin/libirisodbciw35.so |grep cpp_
000000000005cf90 T cpp_binding_alloc_messenger
000000000005dd40 T cpp_binding_clear_odbc_cache
...

I changed the ODBCLIBNAME symbol in libcppbind.mak from the libcacheodbciw Caché driver library to the libirisodbciw35 IRIS driver library.

I changed the Caché "CacheUni" symbols to the IRIS "ServerUni" symbols. For those three symbols, instead of

*((void**)(&(func))) = dlsym(dll_handle, #func);

you want something like this, where name is the new name of the symbol in the IRIS driver:

*((void**)(&(func))) = dlsym(dll_handle, #name);

I did the same for the ODBC 2.x SQLColAttributesW() function, which has been replaced by the ODBC 3.x SQLColAttributeW() function.

I fixed an apparent bug in obj_types.h by marking get_cs_handle() const and cs_handle mutable. I didn't spend much time investigating, so I'm not certain that this is the best fix.

I then built the C++ binding:

irisowner:~/dev/cpp/src/modules/cpp/lnxrhx64$ make libcppbind.so

It appears that the cpp_generator utility is only shipped as a binary, which obviously doesn't have any of these changes, so it won't work with an IRIS driver library. However, I was able to do a simple test using Dyn_obj:

#include <iostream>
#include "cppbind.h"

using namespace InterSystems;

int
main()
{
    auto conn = tcp_conn::connect(
        "localhost[1972]:USER", "_system", "SYS");
    Database db{conn};
    d_string res;
    Dyn_obj::run_class_method(&db, L"%SYSTEM.Util", L"CreateGUID",
        nullptr, 0, &res);
    std::cout << res.value() << std::endl;

    return 0;
}

I compiled and ran the test program like so:

$ docker exec -it cppbind bash
irisowner:~$ g++ -o cppbindtest cppbindtest.cpp -I dev/cpp/include \
    -L dev/cpp/src/kit/lnxrhx64/release/bin -lcppbind
irisowner:~$ export LD_LIBRARY_PATH=/home/irisowner/dev/cpp/src/kit/lnxrhx64/release/bin:/usr/irissys/bin
irisowner:~$ ./cppbindtest 
7325F26A-21E2-11EB-BB36-0242AC110002

I don't know what the TRACE macro does, but I would add that you have a couple of ways to visualize the contents of a dynamic object:

USER>read json
{"foo":"bar"}
USER>s object={}.%FromJSON(json)

USER>do object.%ToJSON()
{"foo":"bar"}
USER>write object.%ToJSON()     
{"foo":"bar"}
USER>zwrite object
object={"foo":"bar"}  ; <DYNAMIC OBJECT>

If object.%Get("forename") or object.forename returns "", you can try object.%IsDefined("forename") to see whether that property is in fact defined.
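
For example, with the object above:

USER>write object.%IsDefined("forename")
0
USER>write object.%IsDefined("foo")
1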