Replies by Jon Willeke for InterSystems Developer Community

Jon Willeke · Mar 27

@Murray Oldfield posted "Decoding Intel processor models reported by Windows" a while back. Perhaps the wmic command is along the lines of what you're looking for. The properties that look interesting include name, description, caption, and processorid:

C:\>wmic cpu get name
Name
Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz

C:\>wmic cpu get caption
Caption
Intel64 Family 6 Model 85 Stepping 7

C:\>wmic cpu get processorid
ProcessorId
BFEBFBFF00050657

I don't know if any of these alone is enough to determine whether the processor supports AVX and/or BMI, but it should be enough to find specifications for the processor.
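As a rough illustration of why ProcessorId alone probably isn't enough: as far as I know, Windows builds that value from the EDX (high half) and EAX (low half) registers returned by CPUID leaf 1. EAX carries the family/model/stepping signature, which is how you can cross-check it against the caption. The AVX flag lives in ECX of leaf 1 and the BMI flags in leaf 7, neither of which is part of ProcessorId. The sketch below assumes that layout; the function name is made up for this example.

```python
def decode_processor_id(processor_id: str) -> dict:
    """Decode a Win32 ProcessorId string, assuming it is EDX:EAX of CPUID leaf 1."""
    value = int(processor_id, 16)
    eax = value & 0xFFFFFFFF          # processor signature
    edx = value >> 32                 # leaf 1 feature flags (EDX only)

    stepping = eax & 0xF
    base_model = (eax >> 4) & 0xF
    base_family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended fields only apply to certain base family values.
    family = base_family + ext_family if base_family == 0xF else base_family
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model

    return {"family": family, "model": model, "stepping": stepping, "edx": edx}


info = decode_processor_id("BFEBFBFF00050657")
print(info["family"], info["model"], info["stepping"])
```

For the value shown above, this agrees with the caption (Family 6 Model 85 Stepping 7), so the signature is at least enough to look up the processor's specifications.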

Jon Willeke · Jun 28, 2024

To be clear, the recommendation is to add "-d", not "d". "d" displays output (the default); "-d" suppresses output. If that is indeed the bug you're running into, it is fixed in newer wheels.

Jon Willeke · Jun 20, 2024

This is another case in which you're going to be happier using one of the newer wheels that comes with 2023.1.2. You're likely getting some kind of error on the server that the old 1.0.0 wheel cannot report back to you cleanly.

One thing you might try, if for some reason you're stuck on 1.0.0, is to add "-d" to the qspec argument. There's a really old bug/limitation, and I don't remember whether it was fixed on the client or server side, having to do with unexpected writes by a class method. "-d" would suppress that.

Jon Willeke · Jun 20, 2024

If I understand correctly, you want the Status property of a given task. In that case, this is going to be similar to your other question about fetching the gmheap property, except that you'll call a slightly different method to open the object:

task = native.classMethodValue('%SYS.Task', '%OpenId', id)
task.get('Status')

Since you're able to call RunNow(), I assume that you know the ID of the task. The same idea will work for the Error property.
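If you want both properties in one go, a small helper along these lines might do. It uses only the two calls shown above; the helper name and the `fields` parameter are my own invention, and `native` is assumed to be the same connected object as in the snippet above. It does no checking for a task ID that fails to open.

```python
def read_task_fields(native, task_id, fields=("Status", "Error")):
    """Open a %SYS.Task by ID and return the requested properties as a dict."""
    # Same call as above: open the task object by its ID.
    task = native.classMethodValue('%SYS.Task', '%OpenId', task_id)
    # Read each requested property from the opened object.
    return {name: task.get(name) for name in fields}
```

You would call it as read_task_fields(native, id) and look at the 'Status' and 'Error' keys of the result.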

If you're stuck with the ancient 1.0.0 wheel for some reason, you'll have to write helper class methods of your own.

Jon Willeke · Jun 20, 2024

1.0.0 is very old. 2023.1.2 shipped with the 3.2.0 and 4.1.0 wheels in the dev/python directory. You're going to be happier using one of them, which support the answer I gave originally. A big difference between the two is that 3.2.0 is pure Python, whereas 4.1.0 includes platform-specific binaries like 1.0.0 did.

Jon Willeke · Jun 14, 2024

What version of IRIS are you using, and what version of the Native API wheel? This works for me in 2024.1.0 using both 3.2.0 and 4.2.0.

Jon Willeke · Jun 13, 2024

It's not a dumb idea, but I would balance the risk you're introducing against the value. Refactor if you're already in the code, but if it ain't broke ...

The main thing to look out for is that the control flow is different. One pitfall of try-catch is that a try block is a context for quit, so if you have a quit command in a loop, then introduce a try-catch block, the quit may no longer do what you expect:

for i=1:1:10 {
   if i>4 {
      quit
   }
}

for i=1:1:10 {
   try {
      if i>4 {
         quit // oops
      }
   } catch {}
}

Otherwise, I find try-catch easier to visualize than $ztrap. Unfortunately, there is no finally block.

Looking through some of my old code, I see a lot of $ztrap for restoring the previous namespace on error. A lot of those can now be replaced with new $namespace.

Jon Willeke · Jun 13, 2024

You're trying to call gmheap as a class method, but it's a property. Here's one way to do it:

cfg = native.classMethodValue('Config.config', 'Open')
cfg.get('gmheap')

Jon Willeke · Aug 7, 2023

This is an important point. Depending on the size of each node, this $order loop could be touching virtually every block in the global. If you read the 4GB test global after setting it, you're reading from a warm buffer pool, whereas the 40GB production global is less likely to be buffered—hence the greater than 10x difference in time.

I don't have a good suggestion for how to make this loop run faster. $prefetchon might help a little. Rather, depending on why you need to perform this operation, I'd either cache the record count (which could then become a hot spot of its own), or maintain an index global (possibly a bitmap).

Jon Willeke · Jun 9, 2023

If you're downloading from evaluation.intersystems.com, take another look. I see a radio button for "Red Hat 9" that does not appear in a screenshot that John Murray posted last week, so this may have been fixed since you last checked.

Jon Willeke · Oct 25, 2021

Can you elaborate on what it would mean to support Parquet? The diagram appears to have come from this post:

https://blog.openbridge.com/how-to-be-a-hero-with-powerful-parquet-googl...

In that context, the query run time refers to acting on the file directly with something like Amazon Athena. Do you have in mind importing the file (which is what an IRIS user would currently do with CSV), or somehow querying the file directly?

Jon Willeke · Oct 25, 2021

Since Caché 2016.1, the %SYSTEM.Process class contains a TimeZone() method that lets you alter the value of the TZ environment variable, thereby changing the local time:

https://docs.intersystems.com/irislatest/csp/documatic/%25CSP.Documatic....

Depending on your situation, you can either change the current process, or isolate the change by jobbing off a background process.

Once the time zone is set appropriately, you can use dformat -3 of the $zdatetime and $zdatetimeh functions to convert between UTC and local. It would be cleaner if the time zone were an explicit argument, rather than an implicit part of the environment.

Jon Willeke · Aug 10, 2021

No, the VALIDATE MODEL statement does not do cross validation. It calculates validation metrics for the given trained model and dataset. As described in the "Model Selection Process" section of the documentation, however, the TRAIN MODEL statement does this to some extent for classification models when using the AutoML provider:

These scoring metrics are then computed for each model using Monte Carlo cross validation, with three training/testing splits of 70%/30%, to determine the best model.

I also believe that the DataRobot provider incorporates cross validation into its training. I'm not sure about H2O.
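For anyone unfamiliar with the term, Monte Carlo cross validation as quoted above (three random 70%/30% splits) can be sketched roughly as follows. This is only an illustration of the general technique in plain Python, not the AutoML provider's actual code; the function names, the `fit` and `score` callables, and the fixed seed are all assumptions of mine.

```python
import random
from statistics import mean


def monte_carlo_splits(n_rows, n_splits=3, train_fraction=0.7, seed=0):
    """Yield (train_indices, test_indices) for repeated random splits."""
    rng = random.Random(seed)
    n_train = int(round(n_rows * train_fraction))
    for _ in range(n_splits):
        indices = list(range(n_rows))
        rng.shuffle(indices)          # a fresh random permutation per split
        yield indices[:n_train], indices[n_train:]


def cross_validate(rows, fit, score, **split_options):
    """Average the score of a model over all Monte Carlo splits."""
    scores = []
    for train_idx, test_idx in monte_carlo_splits(len(rows), **split_options):
        model = fit([rows[i] for i in train_idx])
        scores.append(score(model, [rows[i] for i in test_idx]))
    return mean(scores)
```

Unlike k-fold cross validation, the test sets of different splits can overlap, because each split is drawn independently.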

Jon Willeke · Aug 10, 2021

This new error is indeed a breaking change in InterSystems IRIS 2019.3, as described in the 2020.1 release notes. Since I unfortunately can't link to them at the moment, I'll quote the relevant section from the upgrade compatibility checklist:

3.2.26 System — Class Compiler Validates Global Name Length Limit

This release adds compiler validation to ensure that the global names defined for DataLocation, IdLocation, IndexLocation, StreamLocation, CounterLocation, and VersionLocation are all within the global name length supported by the system (currently 31 characters). In previous releases, these global names were silently truncated to 31 characters. This could cause collisions between global names in the same class. These collisions would cause what appears to be references to separate globals to actually reference the same global.

During class compilation, an error will be reported if any of the global names used for the class are longer than the supported length. For example:

Compiling class %UnitTest.Result.TestInstance

ERROR #9101: Global name 'UnitTest.Result.TestInstanceStream' for 'StreamLocation' is too long, must be no more than 31 characters in length. [ConstructAddresses+60^%ocsExtentCache:BUILDSYS] > ERROR #5030: An error occurred while compiling class '%UnitTest.Result.TestInstance' [compile+59^%occClass:BUILDSYS]

This change will most likely result in classes that need to be modified upon upgrade. This more commonly occurs on systems that were converted from Caché or Ensemble using the in-place conversion if the classes were first defined using %CacheStorage and the class names were fairly long. (first in 2019.3)

I don't believe that you can disable this validation.
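If you want a rough idea of which names would trip the check before you upgrade, something like the following sketch might help. It assumes you have already collected the candidate global names into a list by some other means, and that the 31-character limit is counted without the leading caret, as in the error message quoted above. The function names are made up, and the second name in the usage example is hypothetical.

```python
MAX_GLOBAL_NAME_LENGTH = 31  # limit quoted in the checklist above


def find_long_names(names, limit=MAX_GLOBAL_NAME_LENGTH):
    """Return the names that exceed the limit and would now fail to compile."""
    return [name for name in names if len(name) > limit]


def find_truncation_collisions(names, limit=MAX_GLOBAL_NAME_LENGTH):
    """Group names that older releases would have truncated to the same global."""
    groups = {}
    for name in names:
        groups.setdefault(name[:limit], []).append(name)
    return {prefix: group for prefix, group in groups.items() if len(group) > 1}


names = [
    "UnitTest.Result.TestInstanceStream",     # from the quoted error message
    "UnitTest.Result.TestInstanceStreamIdx",  # hypothetical second name
]
print(find_long_names(names))
print(find_truncation_collisions(names))
```

The second function shows the collision the checklist warns about: both names share the same first 31 characters, so before 2019.3 they would have silently referred to the same global.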

Jon Willeke · Jun 2, 2021

The Python documentation is pretty good. I haven't been too impressed by the Python books I've looked at. I didn't get much out of Learning Python, and even less out of Programming Python. Effective Python is okay, but not great. If I have a favorite, it's Fluent Python by Luciano Ramalho, although it probably shouldn't be your first book on the subject. A second edition is underway; you may need a Safari subscription to read it.

Jon Willeke · May 13, 2021

Unless pg_bulkload or SimpleMover are doing some kind of magic, I don't see any evidence of indexes in the GitHub repo. gaia_source.pgsql contains a single CREATE TABLE statement, and gaia_source.irissql contains a list of field names.

I don't think the repo is sufficient to reproduce your results.

Jon Willeke · May 13, 2021

This line also looks suspect:

set tID = tSeg.GetValueAt(pField_"("_tRepCount_")"_"."_pSubField)

Do you want tRep, instead of tRepCount?

Jon Willeke · May 12, 2021

Am I reading this right: 99 GB and not a single index? The IRIS numbers are incredible under those circumstances, but surely one would put some effort into the schema before issuing non-trivial queries against a large data set. There's a lot of low-hanging fruit here, at least for PostgreSQL.

Jon Willeke · May 12, 2021

Are you looking for something like this?

select top 1 contacttype
from table_name
order by datefrom desc

Jon Willeke · Mar 4, 2021

That file uses the irisnative library, but I can't find the library itself.
