Pete Greskoff · Oct 11, 2021

You should review your license agreement with your account team. Licenses are typically per-user (where a user at a different IP counts as a separate user), but there are also connection limits for each user (each 'job' counts as a new connection for as long as it exists), and exceeding them can cause one user/IP to consume many licenses.

Pete Greskoff · Oct 8, 2021

If creating processes is slow, it might be worth looking into using Job Servers. These are a pool of processes that wait to handle 'job' requests, so a new process doesn't have to be created for each one.

Pete Greskoff · Oct 8, 2021

This is likely an issue with the OS; it shouldn't take that long to create a new process. It might be worth gathering an OS-level trace of your process while you do this to narrow down where the time is spent. Depending on the platform, the trace can be gathered with strace, truss, Windows Process Monitor, or similar tools.
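Before gathering a full trace, a rough baseline of how long the OS takes to create a trivial process can help confirm that the slowness is outside Caché. The following is a minimal sketch, not an InterSystems tool: the function name `time_process_creation` is made up for illustration, and each timing also includes Python interpreter startup, so only the order of magnitude is meaningful.

```python
import subprocess
import sys
import time


def time_process_creation(runs=5):
    """Return the wall-clock seconds taken by each of `runs` trivial child processes."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # Spawn a child that does nothing and wait for it to exit.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    for seconds in time_process_creation():
        print(f"{seconds:.3f}s")
```

If these numbers are also unexpectedly large, that points at the OS rather than the instance, and the trace is the next step.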

Pete Greskoff · Jul 27, 2021

Are you doing something like 'new myarr' in mytag? That would behave as you described.

Pete Greskoff · Jul 1, 2021

From the documentation: "The best strategies for backing up databases are external backup and online backup." External backup involves external scripts (examples are included in the documentation, but do not show actually taking the backup, as that is done by 3rd party technology). This is generally the best way to take backups. Online backups can be configured and run from within Caché. All the various backup strategies, and details about how to use them, are available here:

https://docs.intersystems.com/latest/csp/docbook/Doc.View.cls?KEY=GCDI_…

Pete Greskoff · Mar 3, 2021

This is best handled by the WRC. Please contact support@intersystems.com and provide the full cconsole.log as well as the timing of your attempted startup and details of what you deleted, how, and when.

For what it's worth, my best hypothesis based on the snippets you provided is that you have custom startup code that isn't completing (possibly due to whatever you deleted), which is blocking startup from continuing.

Pete Greskoff · Feb 18, 2021

Just noting for anyone who refers to this later that this will also be fixed in IRIS 2019.1.2, in addition to 2020.1.1.

Pete Greskoff · Feb 17, 2021

Calling into IRIS won't work if the instance is hung, so the only way to detect that is something external to the instance. Take a look at 'iris qlist'. You can get more information from 'iris help qlist', but here are the basics:

Syntax:
        iris qlist
Description:
        Quick list InterSystems IRIS registry information for all instances, in a format suitable for parsing in command scripts.
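As a rough illustration of the external check described above, here is a minimal sketch of a monitor that runs 'iris qlist' and parses its output. It assumes the output is caret-delimited with the instance name in field 1, the status in field 4, and the state in field 9; confirm those positions with 'iris help qlist' on your version. The function names and the "needs attention" rule are made up for illustration.

```python
import subprocess


def parse_qlist(output):
    """Split caret-delimited `iris qlist` lines into small dicts.

    The field positions used here are an assumption; check `iris help qlist`.
    """
    instances = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split("^")
        # Pad short lines so the indexing below cannot raise IndexError.
        fields += [""] * (9 - len(fields))
        instances.append({
            "name": fields[0],
            "status": fields[3],
            "state": fields[8],
        })
    return instances


def needs_attention(instance):
    """Illustrative policy: flag anything not running or not in the 'ok' state."""
    running = instance["status"].startswith("running")
    return not running or instance["state"] != "ok"


def check_instances(timeout=30):
    """Run `iris qlist` and return the instances that look unhealthy."""
    # The timeout guards against the command itself stalling.
    result = subprocess.run(
        ["iris", "qlist"],
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    return [i for i in parse_qlist(result.stdout) if needs_attention(i)]
```

Because this runs outside the instance, it can be scheduled from cron or an external monitoring agent and still report something when the instance itself is not responding.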

Pete Greskoff · Dec 3, 2020

To add to this, given that you're concerned with security and want to use TLS 1.2, you should strongly consider upgrading, as 2012.1.2 has a number of security issues that have been fixed over the years.

Pete Greskoff · Oct 29, 2020

I just want to note that 2015 kits are not available for download anymore, even for supported customers. If anyone has a specific need for a 2015 kit, please contact the WRC.

Pete Greskoff · Oct 19, 2020

Are the two mirror members functioning as primary and backup? No members will even attempt to connect to the arbiter until you have a primary and an active backup (because that is the only situation where the arbiter is used). If you do have a primary and active backup, and they aren't connecting to the arbiter, I'd suggest contacting the WRC so that we can take a look at this with you.

Pete Greskoff · Aug 12, 2020

Is the database journaled? Remote non-mirrored databases on mirrored ECP database servers can only be mounted read-write on the ECP application server (in this case the reporting async) if the database is NOT journaled on the database server. This is documented here: https://docs.intersystems.com/latest/csp/docbook/Doc.View.cls?KEY=GHA_m…

"Select the database you want to access over this ECP channel from the list of remote databases. You can select both mirrored databases (databases listed as :mirror:mirror_name:mirror_DB_name) and nonmirrored databases (databases listed as :ds:DB_name); only mirrored databases remain accessible to the application server in the event of mirror failover. When the data server is a failover member, mirrored databases are added as read-write, and nonmirrored databases are added as read-only, if journaled, or read-write, if not journaled; when the data server is a DR async member, all databases are added as read-only."

Pete Greskoff · Jul 30, 2020

I don't know the answer to this, but since you haven't gotten any responses here, I'd suggest opening a case with the WRC so someone can investigate it fully. Please include the $zv string from both instances involved, as well as the location of the installation (as I suspect that is relevant to the problem you're seeing).

Pete Greskoff · May 21, 2020

This isn't enough information to answer definitively, as we don't really know whether there is a limitation on the client side or the server side. You should check IRIS' messages.log file. My best guess is that the errors may be due to running out of licenses. Either way, I think it's worth engaging the WRC on this, as a lot of setup-specific information is going to be required to get to the bottom of it.

Pete Greskoff · May 20, 2020

That routine ultimately uses the same API under the covers: ##class(SYS.Mirror).BecomePrimary(), which forces down the original primary.

Pete Greskoff · May 8, 2020

I just want to note that, although this script works at the basic level, using systemd with Caché/IRIS is not going to be a perfect experience. systemd relies on the fact that it's the only method being used for stopping and starting the service. If the instance is stopped or started via another method, such as a direct 'iris start/stop/force' by a user, or an 'iris force' invoked by the ISCAgent during a mirror failover, systemd will lose track of the actual status of the instance. Certainly, if you want to use this on a test system for basic functionality, you can, but I would definitely not recommend this on a live system.

Pete Greskoff · May 6, 2020

This is what reporting async members are for. You can run an archiving task on the async to move the data to another database (or different globals in the same database, however you want to do it), then purge the data on the primary. As long as the reporting process knows where to look for the data, you're all set.

Pete Greskoff · May 5, 2020

Your real problem here is the memory usage on the system. It may or may not be Caché using up all the memory, and that's where your investigation should focus, but I wanted to give a technical explanation here for why the write daemon specifically is getting killed. 

Most of the memory used by Caché is allocated at instance startup, and is a 'shared memory segment', which you can see with 'ipcs'. Other (Caché and non-Caché) processes allocate memory for individual processing, but the vast majority of memory used by Caché on a running system is this shared memory segment. The largest chunk of that shared memory segment is almost always global buffers (where database blocks are stored for access by Caché processes). Anytime a database block is updated, it is updated in global buffers, and the write daemon will need to access that block in memory and write it to disk. Therefore, the write daemon ends up touching a huge amount of memory on a system, although almost all of that memory is shared. The Linux out of memory killer doesn't prioritize processes using individual memory vs. accessing shared memory segments, so the write daemon is almost always its first target (as it has accessed the most memory), even though killing that process doesn't actually free up much memory for the system (since that shared memory segment doesn't get freed until all other Caché processes detach from it).

Pete Greskoff · Feb 20, 2020

The method you are trying to use is not documented; if you look in the class reference, it is not there. This data is internal. As @Alexander Koblov suggested, you should use the SYS.ECP class. If you want the bytes-transferred information, that is available in mgstat. If there's a particular piece of information you need that isn't available with a supported method (or you can't find it), contact the WRC.

Pete Greskoff · Feb 19, 2020

You shouldn't need to give the ISCAgent root permissions to get it to work. You should contact the WRC so we can take a look at this system and avoid this workaround.

Pete Greskoff · Feb 18, 2020

Hansel,

A common mistake when setting up mirroring for the first time is adding the IP address intended for use as the Virtual IP directly to the machine before creating the mirror. That IP address needs to be free (unassigned) when you create the mirror, so that mirroring itself can assign the IP to the interface.

Pete Greskoff · Sep 18, 2019

The real question is: why is that a problem? There are quite a few processes involved in shadowing:

1) Processes retrieving/writing journal files

2) Dejournal reader process which reads those files and queues up records for sets/kills into database

3) Dejournal worker process which actually does those sets/kills

4) Dejournal prefetchers which fetch blocks from disk so 3) doesn't need to do disk reads itself

Most likely, you have 16+ dejournal prefetchers, which are there to help shadowing keep up with your source system.

Pete Greskoff · Sep 13, 2019

It might be best to open a WRC issue for this, but this is a fairly common error message that actually has nothing to do with SSL. It is generally a sign of one of two problems, depending on your platform:

Unix: Incorrect permissions on the cuxagent binary on the PRIMARY mirror member. The permissions on that file should look like:

[root@RH7-64-001 bin]# ls -l /intersystems/CACHE/bin/cuxagent
-r-sr-x--- 1 root iscagent 27468 Sep 13 13:40 /intersystems/CACHE/bin/cuxagent

Windows: Generally a problem with the ISCAgent being unable to actually find the instance or access the cache.cpf file. If you look in C:\Windows\system32\iscagent.log, you should see the reason for the problem.

If this doesn't point you to the solution, I definitely suggest contacting support. 
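For the Unix case, the comparison against the listing above can be scripted. This is a minimal sketch, assuming the expected mode is exactly the -r-sr-x--- shown (octal 4550) with owner root and group iscagent, as in the example output; the path and group name may differ on your installation, and the function names are made up for illustration.

```python
import os
import stat

# -r-sr-x---: setuid bit, owner read/execute, group read/execute, nothing for others.
EXPECTED_MODE = 0o4550


def mode_problems(st_mode):
    """Return a list of ways `st_mode` differs from the expected cuxagent mode."""
    problems = []
    if not stat.S_ISREG(st_mode):
        problems.append("not a regular file")
    if not st_mode & stat.S_ISUID:
        problems.append("setuid bit is not set")
    if stat.S_IMODE(st_mode) != EXPECTED_MODE:
        problems.append(f"mode is {stat.filemode(st_mode)}, expected -r-sr-x---")
    return problems


def check_cuxagent(path, group="iscagent"):
    """Check mode, owner, and group of the binary at `path` (Unix only)."""
    import grp  # Unix-only module, imported here so mode_problems works anywhere

    info = os.stat(path)
    problems = mode_problems(info.st_mode)
    if info.st_uid != 0:
        problems.append("owner is not root")
    try:
        actual_group = grp.getgrgid(info.st_gid).gr_name
    except KeyError:
        actual_group = str(info.st_gid)
    if actual_group != group:
        problems.append(f"group is {actual_group}, expected {group}")
    return problems
```

Calling check_cuxagent("/intersystems/CACHE/bin/cuxagent") on the primary returns an empty list when everything matches the listing, and a list of differences otherwise.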

Pete Greskoff · Aug 26, 2019

Please contact support@intersystems.com and provide your phone number. We can help walk you through this, and explain exactly what's going on here. This is a discussion better suited for a phone call.

Pete Greskoff · Aug 23, 2019

You should use $zf(-100) to execute the operating system command that runs the EXE. There are examples of running .EXE files in the $zf(-100) documentation.