No, the VALIDATE MODEL statement does not do cross validation. It calculates validation metrics for the given trained model and dataset. As described in the "Model Selection Process" section of the documentation, however, the TRAIN MODEL statement does this to some extent for classification models when using the AutoML provider:

These scoring metrics are then computed for each model using Monte Carlo cross validation, with three training/testing splits of 70%/30%, to determine the best model.

I also believe that the DataRobot provider incorporates cross validation into its training. I'm not sure about H2O.
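
To make the quoted description concrete, here is a minimal Python sketch of Monte Carlo cross validation with three random 70%/30% splits. It is not the AutoML provider's code; the function names and the fit/score callbacks are my own assumptions for illustration.

```python
import random

def monte_carlo_splits(n_rows, n_splits=3, train_fraction=0.7, seed=0):
    """Yield (train_indices, test_indices) for repeated random splits."""
    rng = random.Random(seed)
    n_train = int(round(n_rows * train_fraction))
    for _ in range(n_splits):
        indices = list(range(n_rows))
        rng.shuffle(indices)  # each split is an independent random shuffle
        yield indices[:n_train], indices[n_train:]

def cross_validate(rows, fit, score, **split_options):
    """Average a score over the random splits.

    fit(train_rows) returns a model and score(model, test_rows) returns a
    number; both are hypothetical callbacks supplied by the caller.
    """
    scores = []
    for train_idx, test_idx in monte_carlo_splits(len(rows), **split_options):
        model = fit([rows[i] for i in train_idx])
        scores.append(score(model, [rows[i] for i in test_idx]))
    return sum(scores) / len(scores)
```

Unlike k-fold cross validation, the test sets of different splits may overlap, because each split is drawn independently.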

I don't know whether this is documented, but you can use array accessors in the property expression. To create a property for the productID of the first entry in the OrderContent array, you should be able to use an expression like this in a %CreateProperty() call: $.OrderContent[0].productID.

Unfortunately, this doesn't address the general case of searching an array. I don't think you can create a collection property in DocDB, nor do the %FindDocuments() operators seem to be of much use. You might try poking around in your generated class to see if you can use an existing property as a template for creating your own computed property that aggregates the productID. If that works, you may still find the %FindDocuments() operators to be inadequate, but the property would then be accessible to SQL queries.
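
To illustrate the difference between the two cases, here is a small Python sketch of what the path expression selects versus what a general array search would need. It is plain Python, not the DocDB API, and the sample document and its field values are hypothetical.

```python
import json

# Hypothetical document with the OrderContent array from the question.
doc = json.loads("""
{"OrderContent": [
    {"productID": "A100", "quantity": 2},
    {"productID": "B200", "quantity": 1}
]}
""")

def first_product_id(document):
    """What $.OrderContent[0].productID selects: one scalar value."""
    content = document.get("OrderContent") or []
    return content[0].get("productID") if content else None

def all_product_ids(document):
    """The general case: every productID, which needs an aggregate property."""
    return [item.get("productID") for item in document.get("OrderContent") or []]

print(first_product_id(doc))
print(all_product_ids(doc))
```

The first function maps naturally onto a single-valued property; the second returns a collection, which is the part I don't think DocDB properties can express directly.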

I took the spec from your reply to Dmitriy's question, changed "query" to "path", smooshed it into one line, then pasted it into a terminal:

USER>s swagger={...}

USER>s status=##class(%REST.API).CreateApplication("neerav",swagger)

USER>zw status
... /* ERROR #8722: Path parameter country is not used in url /Demo of route with GETDemo in class: neerav.spec

I then changed "/Demo" to "/Demo/{country}":

USER>s swagger={ "swagger":"2.0","info":{ "title":"O","version":"1" },"schemes":["http","https"],"consumes":["application/json]"],"produces":["application/json]"],"paths":{ "/Demo/{country}":{ "get":{ "summary":"Get Demo ","description":"Demo ","operationId":"GETDemo","x-ISC_CORS":true,"consumes":["application/json]","text/html]"],"produces":["application/json]","text/html]"],"parameters":[{ "name":"country","type":"string","in":"path","description":"Country Name","allowEmptyValue": true } ],"responses":{ "200":{ "description":"Returns Country","schema": { "type":"string" } } } } } } }

USER>s status=##class(%REST.API).CreateApplication("neerav",swagger)

USER>zw status
status=1

I'm curious to know in what version this previously worked. I get the same error in 2019.1.2.
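
If you want to catch this before calling CreateApplication(), the rule behind the error can be approximated with a short Python check. This is only my reading of the error message, not the actual %REST.API implementation, and the function name is made up.

```python
import re

def unused_path_parameters(spec):
    """Return (url, operationId, name) for each "in": "path" parameter
    whose name does not appear as {name} in the route's url."""
    problems = []
    for url, operations in spec.get("paths", {}).items():
        placeholders = set(re.findall(r"\{([^{}]+)\}", url))
        for operation in operations.values():
            if not isinstance(operation, dict):
                continue  # skip non-operation entries such as shared parameter lists
            for param in operation.get("parameters", []):
                if param.get("in") == "path" and param.get("name") not in placeholders:
                    problems.append((url, operation.get("operationId"), param.get("name")))
    return problems
```

With the original spec this would report ("/Demo", "GETDemo", "country"); with "/Demo/{country}" it returns an empty list.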

You're getting a "Datatype validation" error, which suggests that the XML parsed fine, but that the value "/shared/BENANDERSON" is not valid for the PArray property of the ExecuteResult class.

If PArray is an array, I don't remember how that's typically projected to XML, and I can't find an example in the documentation. Lists are projected as a container element with the contents nested as elements within. Are you reading in something that was written using an %XML.Writer, or are you designing a class to read an XML file from somewhere else?

It might help to see more context in the XML, and the relevant definitions from the ExecuteResult class.

FIPS 180-4 describes SHA-512 et al., FIPS 198-1 describes HMAC, and PKCS #5 describes PBKDF2, which depends on HMAC-SHA. As for NIST, special publication 800-132 (now ten years old) states: "This Recommendation approves PBKDF2 as the PBKDF using HMAC with any approved hash function as the PRF." For more recent guidance, consider special publication 800-63B.

As I understand it, none of the weaknesses in SHA affect HMAC or PBKDF2. However, if SHA-1 is no longer FIPS approved, the NIST guidance would indicate replacing it with, say, SHA-2 or SHA-3.

In terms of strength, PBKDF2 essentially has two parameters: the hash function and the iteration count. For the hash function, bigger is usually slower, and therefore stronger. For the iteration count, PKCS #5 and NIST 800-132 both suggest a minimum of 1,000. NIST 800-63B states: "the iteration count SHOULD be as large as verification server performance will allow, typically at least 10,000 iterations."
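
To show where those two parameters enter, here is a minimal Python sketch using the standard library's hashlib.pbkdf2_hmac. The wrapper name and its defaults (SHA-512, 10,000 iterations) are my own choices for illustration, not a recommendation from any of the documents above.

```python
import hashlib
import os

def derive_key(password, salt, iterations=10_000, hash_name="sha512", length=None):
    """PBKDF2 with HMAC-<hash_name> as the PRF.

    password and salt are bytes; length=None yields the hash's digest size.
    """
    return hashlib.pbkdf2_hmac(hash_name, password, salt, iterations, length)

salt = os.urandom(16)  # random per-password salt
key = derive_key(b"correct horse battery staple", salt)
print(len(key))  # 64 bytes for SHA-512
```

Raising the iteration count or switching to a slower hash increases the cost per guess for an attacker and for your own verification server alike, which is the trade-off the NIST wording describes.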

One thing I've done to split machine learning datasets is to use an auxiliary table that maps IDs to a random number. I write a stored procedure that returns a random number. I create a table with two columns, one for the ID of the source table, and one to hold a random number. I populate the column for the source IDs:

insert into random_table (source_id)
select id from source_table

I then populate the column for the random number:

update random_table
set random_number = MySchema.MySP_Random(1E9)

Then I can select with an ORDER BY clause on the random number:

select top 10 source_id
from random_table
order by random_number, source_id

It depends on your use case whether this will be appropriate for a source table with millions of rows. It's an expensive way to select just one row.
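
Here is the same idea as a self-contained Python sketch against an in-memory SQLite database, so that it can be run anywhere. The table contents are made up, Python's random module stands in for the stored procedure, and SQLite uses LIMIT where the query above uses TOP.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source table with 100 rows.
conn.execute("create table source_table (id integer primary key, label text)")
conn.executemany("insert into source_table (id, label) values (?, ?)",
                 [(i, f"row{i}") for i in range(1, 101)])

# Auxiliary table: one column for the source ID, one for the random number.
conn.execute("create table random_table (source_id integer primary key, random_number integer)")
conn.execute("insert into random_table (source_id) select id from source_table")

# Populate the random column (stand-in for MySchema.MySP_Random(1E9)).
rng = random.Random(42)
ids = [row[0] for row in conn.execute("select source_id from random_table")]
conn.executemany("update random_table set random_number = ? where source_id = ?",
                 [(rng.randrange(10**9), source_id) for source_id in ids])

# Select a random sample by ordering on the random number.
sample = [row[0] for row in conn.execute(
    "select source_id from random_table order by random_number, source_id limit 10")]
print(sample)
```

Because the random numbers are stored, the same ordering can be reused to carve out disjoint training and test sets without reshuffling.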

I agree that the most likely path to a Rust binding is to wrap a C or C++ API. If you're content with a local client, callin and/or callout is the place to start. As you said, it shouldn't be too hard to write a callout library in Rust. Callin, on the other hand (and callback from a callout library), is a bit more involved, requiring a lot of unsafe code.

If you want a remote client, you could look at wrapping the C or C++ binding, but that's a dead end that is not supported in IRIS. You might also look into relational access or an ORM. Diesel looks promising, but I don't know whether it (or Rust in general) works well with ODBC.

It appears that your link is to a Docker image of the application installed on YottaDB, a fork of GT.M. I followed the link at the bottom of the page to the FOIA release page:

https://code.osehra.org/journal/journal/view/1576

After downloading the latest copy and skimming the documentation, I found that this release includes a CACHE.DAT database and extensive installation and configuration instructions for Caché. If you want to run the application on InterSystems IRIS, you would do better to start there than with the YottaDB-based image.

For a container deployment, there are three things you'll want to be aware of: durable %SYS, bind mounts, and ports.

I believe that durable %SYS is covered in the documentation. Basically, it stores all of the database and mapping configuration that you set up in a location outside of the container.

You'll need a bind mount for durable %SYS, and for the RPMS database.

You'll want to publish the web server port so that you can access the management portal.

One other suggestion: if you're new to InterSystems IRIS and/or containers, you may want to start with Docker locally on your system before tackling a cloud deployment.

Good luck, and have fun.

Whereas the digest methods in the %SYSTEM.Encryption class return binary strings, which are documented in terms of their byte length, it is indeed conventional to display them using hexadecimal. Instead of $select, I more usually see $translate and $justify:

s h0x=h0x_$tr($j(chr,2)," ",0)

If %xsd.hexBinary is convenient, though, I'd say use that.

For a sanity check, you can compare whatever implementation you choose with the zzdump command:

USER>s digest=$system.Encryption.MD5Hash("12345678")

USER>w ##class(%xsd.hexBinary).LogicalToXSD(digest)
25D55AD283AA400AF464C76D713C07AD
USER>zzdump digest

0000: 25 D5 5A D2 83 AA 40 0A F4 64 C7 6D 71 3C 07 AD
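
For a cross-check outside of ObjectScript, the same per-byte, zero-padded conversion can be sketched in Python. The to_hex name is mine; the input and expected digest are the ones from the terminal session above.

```python
import hashlib

def to_hex(data):
    """Render each byte as two uppercase hex digits, zero-padded."""
    return "".join(format(byte, "02X") for byte in data)

digest = hashlib.md5(b"12345678").digest()  # 16 raw bytes
print(to_hex(digest))  # 25D55AD283AA400AF464C76D713C07AD
```

The zero padding is the part that is easy to get wrong: a byte such as 0x0A has to come out as "0A", not "A".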

It's documented here:

https://cedocs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?K...

"The hashes are calculated using the PBKDF2 algorithm with the HMAC-SHA-1 pseudorandom function, as defined in Public Key Cryptography Standard #5 v2.1: 'Password-Based Cryptography Standard.' The current implementation uses 1024 iterations, 64 bits of salt, and generates 20 byte hash values."

So the input is Windows-1252, and the output is Windows-1252 in which certain characters are mapped to their numerical escape sequence? You could do this with XSLT 2.0 using character maps.

Given this input (presented here as UTF-8 for visibility on the forum):

<?xml version="1.0"?>
<Recordset>
• coffee €5,• tea €4
</Recordset>

This stylesheet will escape the bullets and euro signs:

<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:character-map name="a">
    <xsl:output-character character="€" string="&amp;#128;"/>
    <xsl:output-character character="•" string="&amp;#149;"/>
  </xsl:character-map>
  <xsl:output encoding="Windows-1252" indent="yes" use-character-maps="a"/>
  <xsl:template match="/">
    <Recordset>
      <xsl:value-of select="/Recordset"/>
    </Recordset>
  </xsl:template>
</xsl:stylesheet>

Output:

<?xml version="1.0" encoding="Windows-1252"?>
<Recordset>
&#149; coffee &#128;5,&#149; tea &#128;4
</Recordset>
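
If XSLT 2.0 isn't available to you, the same mapping is easy to sketch in a general-purpose language. The Python below is an assumption-laden illustration, not a replacement for the stylesheet: the function name is made up, and it derives each numeric reference from the character's Windows-1252 byte value, which is what the character map above hard-codes.

```python
def escape_as_cp1252_refs(text, characters="€•"):
    """Replace each listed character with &#N;, where N is its
    single-byte value in Windows-1252 (not its Unicode code point)."""
    table = {ord(ch): "&#%d;" % ch.encode("cp1252")[0] for ch in characters}
    return text.translate(table)

escaped = escape_as_cp1252_refs("• coffee €5,• tea €4")
print(escaped)  # &#149; coffee &#128;5,&#149; tea &#128;4
```

Note that &#128; and &#149; are not the Unicode code points for these characters, so a conforming XML parser will not read them back as a euro sign and a bullet; this only makes sense if the consumer expects Windows-1252 byte values.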

The first $zf(-100) call doesn't work, because you're trying to redirect with the /STDOUT flag and the ">>" operator. You can do one or the other, but not both.

If you add the /LOGCMD flag to the second $zf(-100) call, you should see something like the following in messages.log:

    $ZF(-100) cmd=type "" file1.txt

I suggest that you not put an empty string in your options array.