Replies by Jon Willeke for InterSystems Developer Community

Jon Willeke · Jun 28, 2024

To be clear, the recommendation is to add "-d", not "d". "d" displays output (the default); "-d" suppresses output. If that is indeed the bug you're running into, it is fixed in newer wheels.

go to post

Jon Willeke · Jun 20, 2024

If I understand correctly, you want the Status property of a given task. In that case, this is going to be similar to your other question about fetching the gmheap property, except that you'll call a slightly different method to open the object:

task = native.classMethodValue('%SYS.Task', '%OpenId', id)
task.get('Status')

Since you're able to call RunNow(), I assume that you know the ID of the task. The same idea will work for the Error property.

If you're stuck with the ancient 1.0.0 wheel for some reason, you'll have to write helper class methods of your own.

go to post

Jon Willeke · Jun 9, 2023

If you're downloading from evaluation.intersystems.com, take another look. I see a radio button for "Red Hat 9" that does not appear in a screenshot that John Murray posted last week, so this may have been fixed since you last checked.

go to post

Jon Willeke · Aug 10, 2021

No, the VALIDATE MODEL statement does not do cross validation. It calculates validation metrics for the given trained model and dataset. As described in the "Model Selection Process" section of the documentation, however, the TRAIN MODEL statement does this to some extent for classification models when using the AutoML provider:

These scoring metrics are then computed for each model using Monte Carlo cross validation, with three training/testing splits of 70%/30%, to determine the best model.

I also believe that the DataRobot provider incorporates cross validation into its training. I'm not sure about H2O.

go to post

Jon Willeke · May 12, 2021

Are you looking for something like this?

select top 1 contacttype
from table_name
order by datefrom desc

go to post

Jon Willeke · Mar 1, 2021

I don't know whether this is documented, but you can use array accessors in the property expression. To create a property for the productID of the first entry in the OrderContent array, you should be able to use an expression like this in a %CreateProperty() call: $.OrderContent[0].productID.

Unfortunately, this doesn't address the general case of searching an array. I don't think you can create a collection property in DocDB, nor do the %FindDocuments() operators seem to be of much use. You might try poking around in your generated class to see if you can use an existing property as a template for creating your own computed property that aggregates the productID. If that works, you may still find the %FindDocuments() operators to be inadequate, but the property would then be accessible to SQL queries.

go to post

Jon Willeke · Jan 21, 2021

I took the spec. from your reply to Dmitriy's question, changed "query" to "path", smooshed it into one line, then pasted it into a terminal:

USER>s swagger={...}

USER>s status=##class(%REST.API).CreateApplication("neerav",swagger)

USER>zw status
... /* ERROR #8722: Path parameter country is not used in url /Demo of route with GETDemo in class: neerav.spec

I then changed "/Demo" to "/Demo/{country}":

USER>s swagger={ "swagger":"2.0","info":{ "title":"O","version":"1" },"schemes":["http","https"],"consumes":["application/json]"],"produces":["application/json]"],"paths":{ "/Demo/{country}":{ "get":{ "summary":"Get Demo ","description":"Demo ","operationId":"GETDemo","x-ISC_CORS":true,"consumes":["application/json]","text/html]"],"produces":["application/json]","text/html]"],"parameters":[{ "name":"country","type":"string","in":"path","description":"Country Name","allowEmptyValue": true } ],"responses":{ "200":{ "description":"Returns Country","schema": { "type":"string" } } } } } } }

USER>s status=##class(%REST.API).CreateApplication("neerav",swagger)

USER>zw status
status=1

I'm curious to know in what version this previously worked. I get the same error in 2019.1.2.

go to post

Jon Willeke · Sep 9, 2020

It has been a while since I've looked at the Python binding, but my recollection is that it is based on the C binding (not to be confused with the C callin API), which is based on the C++ binding, which does not have to be on the same system as the Caché server. However, you do need to satisfy those dependencies on the client.

go to post

Jon Willeke · Aug 31, 2020

You're getting a "Datatype validation" error, which suggests that the XML parsed fine, but that the value "/shared/BENANDERSON" is not valid for the PArray property of the ExecuteResult class.

If PArray is an array, I don't remember how that's typically projected to XML, and I can't find an example in the documentation. Lists are projected as a container element with the contents nested as elements within. Are you reading in something that was written using an %XML.Writer, or are you designing a class to read an XML file from somewhere else?

It might help to see more context in the XML, and the relevant definitions from the ExecuteResult class.

go to post

Jon Willeke · Jul 20, 2020

FIPS 180-4 describes SHA-512 et al., FIPS 198-1 describes HMAC, and PKCS #5 describes PBKDF2, which depends on HMAC-SHA. As for NIST, special publication 800-132 (now ten years old) states: "This Recommendation approves PBKDF2 as the PBKDF using HMAC with any approved hash function as the PRF." For more recent guidance, consider special publication 800-63B.

As I understand it, none of the weaknesses in SHA affect HMAC or PBKDF2. However, if SHA-1 is no longer FIPS approved, the NIST guidance would indicate replacing it with, say, SHA-2 or SHA-3.

In terms of strength, PBKDF2 essentially has two parameters, the hash function, and the iteration count. For the hash function, bigger is usually slower, therefore stronger. For the iteration count, PKCS #5 and NIST 800-132 both suggest a minimum of 1,000. NIST 800-63B states: "the iteration count SHOULD be as large as verification server performance will allow, typically at least 10,000 iterations."

go to post

Jon Willeke · May 20, 2020

One thing I've done to split machine learning datasets is to use an auxiliary table that maps IDs to a random number. I write a stored procedure that returns a random number. I create a table with two columns, one for the ID of the source table, and one to hold a random number. I populate the column for the source IDs:

insert into random_table (source_id)
select id from source_table

I then populate the column for the random number:

update random_table
set random_number = MySchema.MySP_Random(1E9)

Then I can select with an ORDER BY clause on the random number:

select top 10 source_id
from random_table
order by random_number, source_id

It depends on your use case whether this will be appropriate for a source table with millions of rows. It's an expensive way to select just one row.

go to post

Jon Willeke · Apr 24, 2020

I agree that the most likely path to a Rust binding is to wrap a C or C++ API. If you're content with a local client, callin and/or callout is the place to start. As you said, it shouldn't be too hard to write a callout library in Rust. Callin, on the other hand, (and callback from a callout library) is a bit more involved, requiring a lot of unsafe code.

If you want a remote client, you could look at wrapping the C or C++ binding, but that's a dead end that is not supported in IRIS. You might also look into relational access or an ORM. Diesel looks promising, but I don't know whether it (or Rust in general) works well with ODBC.

go to post

Jon Willeke · Nov 5, 2019

It appears that your link is to a Docker image of the application installed on YottaDB, a fork of GT.M. I followed the link at the bottom of the page to the FOIA release page:

https://code.osehra.org/journal/journal/view/1576

After downloading the latest copy and skimming the documentation, this release includes a CACHE.DAT database and extensive installation and configuration instructions for Caché. If you want to run the application on InterSystems IRIS, you would do better to start there than the YottaDB-based image.

For a container deployment, there are three things you'll want to be aware of: durable %SYS, bind mounts, and ports.

I believe that durable %SYS is covered in the documentation. Basically, it will store all of the database and mapping configuration that you do outside of the container.

You'll need a bind mount for durable %SYS, and for the RPMS database.

You'll want to export the web server port so that you can access the management portal.

One other suggestion: if you're new to InterSystems IRIS and/or containers, you may want to start with Docker locally on your system before tackling a cloud deployment.

Good luck, and have fun.

go to post

Jon Willeke · Oct 28, 2019

Whereas the digest methods in the %SYSTEM .Encryption class return binary strings, which are documented in terms of their byte length, it is indeed conventional to display them using hexadecimal. Instead of $select, I more usually see $translate and $justify:

s h0x=h0x_$tr($j(chr,2)," ",0)

If %xsd.hexBinary is covenient, though, I'd say use that.

For a sanity check, you can compare whatever implementation you choose with the zzdump command:

USER>s digest=$system.Encryption.MD5Hash("12345678")

USER>w ##class(%xsd.hexBinary).LogicalToXSD(digest)
25D55AD283AA400AF464C76D713C07AD
USER>zzdump digest

0000: 25 D5 5A D2 83 AA 40 0A F4 64 C7 6D 71 3C 07 AD

go to post

Jon Willeke · Sep 30, 2019

If you've specified the language as "basic", then I would expect the Right() function to be available. Otherwise, for "objectscript", you'll want to use something like $extract(string,*-10,*).

Edit: if there's some reason to prefer using the methods supplied by the wizard, you could also do something like this: ..SubString(string,..Length(string)-10)

go to post

Jon Willeke · Sep 26, 2019

It looks like the accented "é" is causing problems with the directory software. The UTF-8 encoding of \u00e9 is \xc3\xa9, which is URL encoded as %c3%a9, which has been URL encoded again as %25c3%25a9. Try this:

https://openexchange.intersystems.com/package/Cach%C3%A9Quality

My browser displays this as:

https://openexchange.intersystems.com/package/CachéQuality

go to post

Jon Willeke · Aug 21, 2019

It's documented here:

https://cedocs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?K...

"The hashes are calculated using the PBKDF2 algorithm with the HMAC-SHA-1 pseudorandom function, as defined in Public Key Cryptography Standard #5 v2.1: 'Password-Based Cryptography Standard.' The current implementation uses 1024 iterations, 64 bits of salt, and generates 20 byte hash values."

go to post

Jon Willeke · Jun 17, 2019

Unfortunately, deployment of the Hibernate dialect for InterSystems IRIS is still a bit of a work in progress:

https://irisdocs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page...

There's an old dialect for Caché in the Hibernate distribution itself, but the IRIS dialect lives in a handful of files that you can get from the WRC.

go to post

Jon Willeke · Apr 9, 2019

So the input is Windows-1252, and the output is Windows-1252 in which certain characters are mapped to their numerical escape sequence? You could do this with XSLT 2.0 using character maps.

Given this input (presented here as UTF-8 for visibility on the forum):

<?xml version="1.0"?>
<Recordset>
• coffee €5,• tea €4
</Recordset>

This stylesheet will escape the bullets and euro signs:

<?xml version="1.0"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:character-map name="a">
    <xsl:output-character character="€" string="&amp;#128;"/>
    <xsl:output-character character="•" string="&amp;#149;"/>
  </xsl:character-map>
  <xsl:output encoding="Windows-1252" indent="yes" use-character-maps="a"/>
  <xsl:template match="/">
    <Recordset>
      <xsl:value-of select="/Recordset"/>
    </Recordset>
  </xsl:template>
</xsl:stylesheet>

Output:

<?xml version="1.0" encoding="Windows-1252"?>
<Recordset>
&#149; coffee &#128;5,&#149; tea &#128;4
</Recordset>

go to post

Jon Willeke · Apr 9, 2019

My hunch is that XSLT is not the right tool for the job, but it's not clear to me what you're trying to do. What is the input encoding, what is the desired output encoding, and what do you mean by non-standard characters? Are these characters that don't exist in the output encoding?