Replies by Pete Greskoff for InterSystems Developer Community

Pete Greskoff · Oct 11, 2021

You should review your license agreement with your account team. Licenses are typically per-user (where a user at a different IP counts as a separate user), but there are also connection limits for each user (each 'job' is a new connection as long as it is still around), and exceeding them can cause one user/IP to consume many licenses.

Pete Greskoff · Oct 8, 2021

If creating processes is slow, it might be worth looking into using Job Servers. This is a pool of processes that wait to handle 'job' requests.

Pete Greskoff · Oct 8, 2021

This is likely an issue with the OS - it shouldn't take that long to create a new process. It might be worth gathering an OS-level trace of your process while you do this to narrow down where the time is spent. Depending on platform, the trace can be gathered with strace, truss, Windows process monitor, or others.

Pete Greskoff · Jul 27, 2021

Are you doing something like 'new myarr' in mytag? That would behave as you described.

Pete Greskoff · Jul 1, 2021

From the documentation: "The best strategies for backing up databases are external backup and online backup." External backup involves external scripts (examples are included in the documentation, but they do not show the backup itself being taken, as that is done by third-party technology). This is generally the best way to take backups. Online backups can be configured and run from within Caché. All the various backup strategies, and details about how to use them, are available here:

https://docs.intersystems.com/latest/csp/docbook/Doc.View.cls?KEY=GCDI_b...

Pete Greskoff · Mar 3, 2021

This is best handled by the WRC. Please contact support@intersystems.com and provide the full cconsole.log as well as the timing of your attempted startup and details of what you deleted, how, and when.

For what it's worth, my best hypothesis based on the snippets you provided is that you have custom startup code that isn't completing (possibly due to whatever you deleted), which is blocking startup from continuing.

Pete Greskoff · Feb 18, 2021

Just noting for anyone who refers to this later: this will be fixed in IRIS 2019.1.2 as well as 2020.1.1.

Pete Greskoff · Feb 17, 2021

Calling into IRIS won't work if the instance is hung, so the only way to detect that is something external to the instance. Take a look at 'iris qlist'. You can get more information from 'iris help qlist', but here are the basics:

Syntax:
iris qlist
Description:
Quick list InterSystems IRIS registry information for all instances, in a format suitable for parsing in command scripts.
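
As a rough illustration of the external check described above, here is a minimal Python sketch that runs 'iris qlist' from outside the instance and parses the result. The caret-delimited field order, the field names in FIELDS, and the "running" prefix in the status field are assumptions based on my reading of 'iris help qlist'; verify them against the help output for your own version before relying on this. The helper names (parse_qlist, run_qlist, is_running) are hypothetical.

```python
import subprocess

# ASSUMPTION: field order taken from the layout printed by 'iris help qlist'.
# Check it against your own version; later fields are ignored here.
FIELDS = [
    "name", "directory", "version", "status", "config_file",
    "superserver_port", "webserver_port", "jdbc_port", "state",
]


def parse_qlist(output):
    """Turn raw qlist text into a list of dicts, one per instance."""
    instances = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        parts = line.split("^")
        # zip() stops at the shorter sequence, so extra fields are dropped.
        instances.append(dict(zip(FIELDS, parts)))
    return instances


def run_qlist(iris_cmd="iris", timeout=30):
    """Run 'iris qlist' as an external command and parse its output."""
    result = subprocess.run(
        [iris_cmd, "qlist"], capture_output=True, text=True, timeout=timeout
    )
    result.check_returncode()
    return parse_qlist(result.stdout)


def is_running(instance):
    # ASSUMPTION: the status field starts with "running" for a started instance.
    return instance.get("status", "").startswith("running")
```

A monitoring script could call run_qlist() on a schedule and alert when is_running() is false or when the state field is anything other than what you expect. The timeout on the subprocess call matters here, since the point is to avoid blocking on an instance that is not responding.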

Pete Greskoff · Dec 3, 2020

To add to this, given that you're concerned with security and want to use TLS 1.2, you should strongly consider upgrading, as 2012.1.2 has a number of security issues that have been fixed over the years.

Pete Greskoff · Oct 29, 2020

I just want to note that 2015 kits are not available for download anymore, even for supported customers. If anyone has a specific need for a 2015 kit, please contact the WRC.

Pete Greskoff · Oct 19, 2020

Are the two mirror members functioning as primary and backup? No members will even attempt to connect to the arbiter until you have a primary and an active backup (because that is the only situation where the arbiter is used). If you do have a primary and active backup, and they aren't connecting to the arbiter, I'd suggest contacting the WRC so that we can take a look at this with you.

Pete Greskoff · Aug 12, 2020

Is the database journaled? Remote non-mirrored databases on mirrored ECP database servers can only be mounted read-write on the ECP application server (in this case the reporting async) if the database is NOT journaled on the database server. This is documented here: https://docs.intersystems.com/latest/csp/docbook/Doc.View.cls?KEY=GHA_mi...

"Select the database you want to access over this ECP channel from the list of remote databases. You can select both mirrored databases (databases listed as :mirror:mirror_name:mirror_DB_name) and nonmirrored databases (databases listed as :ds:DB_name); only mirrored databases remain accessible to the application server in the event of mirror failover. When the data server is a failover member, mirrored databases are added as read-write, and nonmirrored databases are added as read-only, if journaled, or read-write, if not journaled; when the data server is a DR async member, all databases are added as read-only."

Pete Greskoff · Jul 30, 2020

I don't know the answer to this, but since you haven't gotten any responses here, I'd suggest opening a case with the WRC so someone can investigate it fully. Please include the $zv string from both instances involved, as well as the location of the installation (as I suspect that is relevant to the problem you're seeing).

Pete Greskoff · May 21, 2020

This isn't enough information to answer definitively, as we don't really know whether there is a limitation on the client side or the server side. You should check IRIS' messages.log file. My best guess is that the errors may be due to running out of licenses. Either way, I think it's worth engaging the WRC on this, as a lot of setup-specific information is going to be required to get to the bottom of it.

Pete Greskoff · May 20, 2020

That routine ultimately uses the same API under the covers: ##class(SYS.Mirror).BecomePrimary(), which forces down the original primary.

Pete Greskoff · May 8, 2020

I just want to note that, although this script works at the basic level, using systemd with Caché/IRIS is not going to be a perfect experience. systemd relies on the fact that it's the only method being used for stopping and starting the service. If the instance is stopped or started via another method, such as a direct 'iris start/stop/force' by a user, or an 'iris force' invoked by the ISCAgent during a mirror failover, systemd will lose track of the actual status of the instance. Certainly, if you want to use this on a test system for basic functionality, you can, but I would definitely not recommend this on a live system.

Pete Greskoff · May 6, 2020

This is what reporting async members are for. You can run an archiving task on the async to move the data to another database (or different globals in the same database, however you want to do it), then purge the data on the primary. As long as the reporting process knows where to look for the data, you're all set.

Pete Greskoff · May 5, 2020

Your real problem here is the memory usage on the system. It may or may not be Caché using up all the memory, and that's where your investigation should focus, but I wanted to give a technical explanation here for why the write daemon specifically is getting killed.

Most of the memory used by Caché is allocated at instance startup, and is a 'shared memory segment', which you can see with 'ipcs'. Other (Caché and non-Caché) processes allocate memory for individual processing, but the vast majority of memory used by Caché on a running system is this shared memory segment. The largest chunk of that shared memory segment is almost always global buffers (where database blocks are stored for access by Caché processes). Anytime a database block is updated, it is updated in global buffers, and the write daemon will need to access that block in memory and write it to disk. Therefore, the write daemon ends up touching a huge amount of memory on a system, although almost all of that memory is shared. The Linux out of memory killer doesn't prioritize processes using individual memory vs. accessing shared memory segments, so the write daemon is almost always its first target (as it has accessed the most memory), even though killing that process doesn't actually free up much memory for the system (since that shared memory segment doesn't get freed until all other Caché processes detach from it).
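
To see this split for yourself, here is a minimal Python sketch that breaks a process's resident memory into private and shared parts. It assumes a Linux kernel that reports RssAnon, RssFile, and RssShmem in /proc/<pid>/status (newer kernels do; older ones only show VmRSS), and the helper names are hypothetical. It is only a way to look at the numbers, not a statement of how the out of memory killer computes its score.

```python
WANTED = {"VmRSS", "RssAnon", "RssFile", "RssShmem"}


def parse_rss_breakdown(status_text):
    """Extract resident-memory counters (in kB) from /proc/<pid>/status text."""
    breakdown = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in WANTED:
            # Lines look like "RssShmem:    800 kB"; keep the number only.
            breakdown[key] = int(rest.split()[0])
    return breakdown


def shared_fraction(breakdown):
    """Fraction of resident memory that is shared memory (0.0 if unknown)."""
    total = breakdown.get("VmRSS", 0)
    if not total:
        return 0.0
    return breakdown.get("RssShmem", 0) / total


def read_status(pid):
    """Read the raw status text for a process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return handle.read()
```

Running shared_fraction(parse_rss_breakdown(read_status(pid))) against the write daemon's pid on a busy system should show a value close to 1, which matches the explanation above: the process looks large, but nearly all of what it has touched is the shared memory segment.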

Pete Greskoff · Apr 17, 2020

All you need to do is shut down the instance on the primary. This is documented here:

https://cedocs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY=GHA_mirror#GHA_mirror_set_autofail_outages_primary

If your Veeam backup is short enough (on the order of a few seconds), there's really no need to do this, but if it's on the order of minutes, not seconds, that may be a good plan.

Pete Greskoff · Feb 20, 2020

The method you are trying to use is not documented: if you look in the class reference, it is not there. This data is internal. As @Alexander Koblov suggested, you should use the SYS.ECP class. If you want the bytes-transferred information, that is available in mgstat. If there's a particular piece of information you need that isn't available with a supported method (or you can't find it), contact the WRC.