Are you asking about the KMIP server and how it works?  If so, I don't think this is the right place to ask, and would recommend you talk to people who know more about your KMIP system.

Or are you asking about how Cache handles key rotation for encrypted databases and/or managed key encryption?   If so, this is mostly up to the user.  There isn't automatic re-keying of databases on a schedule.

I've done a tremendous number of system restores over the years, and generally think that structuring your restore process to require manually removing the WIJ isn't a preferred design choice.  It's too easy to get into the habit of removing it, and then do it at a time when it causes problems.  

I assume you're asking about doing this following a restore of all databases, including CACHESYS,  since if only some databases were restored, removing the WIJ could lead to problems with the databases which were not restored.    I'm also assuming you're talking about designing a backup restore process, not an actual down system you're trying to get back up.  (If you're talking about a down system you need help with right now, please call the WRC.)

In a full system restore, you would restore the databases and WIJ from the older time.   Since you're already restoring all databases, have you considered treating it like a full system restore and including the older WIJ, which would let you avoid the need to remove the current WIJ?   This would also mean that the system would know which journal restore point to start from, and could automatically start journal restore for you at startup, assuming the journals are all available before you start up.

The error sounds like it could be coming from a lower level (e.g., the network), but I would suggest you start by collecting the SSH debugging log information to see if it helps.  You can find how to do that in the last section of this page:  https://community.intersystems.com/post/using-and-debugging-netsshsession-ssh-connections

This will connect and works for testing, but for a production configuration you should also edit the configuration so that it checks the server's certificate.  If you don't, the configuration will connect without an error even if someone is pretending to be the server you're trying to connect to.  Since you're setting up TLS, that's probably not what you want.

To do this, change the "Server certificate verification" setting from 'none' to 'require' and then fill in the name of the file which contains the certificate authority (CA) certificate for the server you're connecting to in the "File containing trusted Certificate Authority certificate(s)" field.  The certificate should be in PEM format, and the file may contain multiple certificates if you want to include more than one. 
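
As an illustration of what the 'none' versus 'require' setting changes, here is a minimal sketch using Python's standard ssl module rather than the portal.  It is only an analogy for the behavior described above, not how the product implements it, and the function names are made up for this example.

```python
import ssl

def make_verifying_context(ca_file=None):
    """Client context that requires and verifies the server's certificate."""
    # PROTOCOL_TLS_CLIENT turns on CERT_REQUIRED and hostname checking by default.
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if ca_file is not None:
        # ca_file is a PEM file; it may hold several CA certificates back to back.
        context.load_verify_locations(cafile=ca_file)
    return context

def make_unverified_context():
    """Client context that accepts any server certificate (testing only)."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Hostname checking has to be switched off before verification can be disabled.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    return context
```

A connection made with the second context succeeds against any server, including an impostor, which is the situation the 'none' setting leaves you in.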

Are you trying to have the same user have different login namespaces on different systems?  If so, for your InterSystems IRIS instances, have you looked at the "Authorization group ID" and "Authorization Instance ID" which are part of each LDAP configuration?   You can use these to make each instance (or group of instances) look for a different group to define the namespace.    

It can be tricky to get the exact form of the username right on a Linux client connecting to a Windows AD server.  If you're familiar with using an LDAP browser, you might be able to use one to manually look for the user object you're trying to find, and see what the account name shows up as.   That might let you check for any details you might not be exactly matching.

Unfortunately I don't know of a one size fits all solution, since each AD server is set up differently.

There are a lot of possible reasons this search could be failing, but they mostly boil down to not looking in the right place for the user or not being able to identify the user when you find it.  Here are a few things to try:

For a Windows AD server, you will almost certainly want sAMAccountName as the unique search attribute. 

Check to make sure your base DN includes the location of the user you're trying to authenticate.  You may want to test with a high-level or generic base DN to make sure it matches the user account.  For example, try DC=intersystems, DC=com instead of a longer base DN like OU=Boston, OU=Users, DC=intersystems, DC=com.  This will mean you search a larger part of the tree (which is slower) but will let you make sure you're searching an area which includes your user.

Take a look at exactly what DN is being used for the failing user.  This should be in the detailed output of the test connection.  Is the base DN being appended to the full DN that you gave as the username?  If so, you may not want to use the full DN as the username, and instead just the value of the account name.
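
The two DN checks above can be sketched in a few lines of Python.  This is a simplified illustration, not anything the product runs: it splits on commas and ignores escaped characters, and the CN=jsmith and OU=London values are made up for the example.

```python
def split_dn(dn):
    """Split a DN into lowercase (attribute, value) pairs.

    Naive: assumes no escaped commas or equals signs inside values.
    """
    pairs = []
    for part in dn.split(","):
        attr, _, value = part.partition("=")
        pairs.append((attr.strip().lower(), value.strip().lower()))
    return pairs

def is_under_base(user_dn, base_dn):
    """True if user_dn ends with base_dn, i.e. a search from base_dn can reach it."""
    user, base = split_dn(user_dn), split_dn(base_dn)
    return len(user) >= len(base) and user[len(user) - len(base):] == base

def has_doubled_base(dn, base_dn):
    """True if base_dn appears twice at the end of dn (base appended to a full DN)."""
    parts, base = split_dn(dn), split_dn(base_dn)
    return len(parts) >= 2 * len(base) and parts[len(parts) - 2 * len(base):] == base + base
```

For instance, is_under_base("CN=jsmith, OU=Boston, OU=Users, DC=intersystems, DC=com", "DC=intersystems, DC=com") is true, while the same user checked against a base DN starting with OU=London is not.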

If you use Kerberized telnet with encryption, the connection will be encrypted.  To quote the documentation:

"Kerberos with Encryption — Kerberos manages initial authentication, ensures the integrity of all communications, and also encrypts all communications. This involves end-to-end encryption for all messages in each direction between the user and Caché."

Are you trying to add TLS (aka SSL) to your connection, or use WS-Security?  Which steps to take next depends on which of those you're trying to set up.

If you're trying to add TLS, you'll want to create an SSL/TLS configuration, and use that in the operation.   You can find information about these configurations in the documentation.  You'll likely also want to read up on certificates in order to understand which files to put in the configuration.  It sounds like you already have some certificates (PFX, also known as PKCS #12, is a container format that can hold certificates and private keys), but I'm not certain whether that is your certificate or the CA certificate for the other side.

The TLS configuration information above assumes your operation is one end of the connection.  That's likely since you're sending the data instead of receiving it.  If you were receiving it, you'd want to understand whether your webserver was the endpoint.

If you can predict when it's going to happen and enable logging beforehand, SetTraceMask could help.  It will log information if the issue is related to the SSH layer.  The problem might or might not be in that layer, but it would be good to check.  I would use the same 511 flags if you can since we don't know where the problem is yet.

Are you using a version earlier than 2018.1?  There was an issue where connections could get interrupted, which was fixed in that version.  If you're seeing that issue, strace, truss, or the equivalent tool for your OS is the best way to confirm it.  You would see the client process get a SIGUSR1 signal right before the connection drops.

The "Error: 20, unable to get local issuer certificate" message means your server's trusted CA file is missing one or more of the certificates needed to verify the client's certificate.  It looks like the certificate for "/C=NL/ST=Noord-Holland/L=Amsterdam/O=TERENA/CN=TERENA SSL CA 3" probably isn't in the DigiCertCA.pem file.  It's possible that you don't want to trust that CA (and therefore this is being correctly rejected) or that you do want to trust it and that certificate should be added to the file.
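
If you want to check which CAs a PEM file actually contains, one option is to load it with Python's standard ssl module and list the subject common names.  This is a sketch for inspecting the file outside the product; the function names are made up for this example.

```python
import ssl

def common_names(cert_dicts):
    """Collect commonName values from dicts shaped like get_ca_certs() output."""
    names = []
    for cert in cert_dicts:
        # "subject" is a tuple of RDNs, each a tuple of (key, value) pairs.
        for rdn in cert.get("subject", ()):
            for key, value in rdn:
                if key == "commonName":
                    names.append(value)
    return names

def trusted_ca_names(ca_file):
    """Load a PEM CA file and return the common names of the CAs in it."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.load_verify_locations(cafile=ca_file)
    return common_names(context.get_ca_certs())
```

With this, "TERENA SSL CA 3" in trusted_ca_names("DigiCertCA.pem") tells you whether that intermediate CA is present in the file.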

Are the other connection attempts you see with empty hostnames ones where Cache is the server, or ones where it is the client?  I wouldn't be surprised if the hostname isn't displayed when Cache is the server since in that case you don't pass a hostname to the open command.

Have you also enabled SuperServer SSL/TLS? You can do this in the portal at the System Administration > Security > System Security > System-Wide Security settings page.   

If that doesn't help, I would recommend running the REDEBUG routine on the server to enable network debugging, then trying the connection from the gateway again. You can turn the network debugging on like this:

%SYS>d ^REDEBUG
Old flag values = FF
New flag values (in Hex): FFFFFFFF
Done
%SYS>

The logging information will be in the cconsole.log file.  Remember to set the flags back to FF when you're done to disable collection.

The CSP gateway event log may also be useful.  You can access this through the gateway management page.  It may be helpful to clear the log, then try the connection again, so that you see only the messages related to your test.

Finally, I see that you've edited the enabled protocols.  I would recommend you use the defaults unless you have a specific reason why you need to enable a protocol with known problems, such as SSLv3.   

There are two different connections here - one from the browser to the webserver, and one from the CSP gateway to Cache. Either or both can use SSL and they are configured separately.

You said you want HTTPS. This would be used on the connection between the browser and the webserver. It does not involve Cache or the CSP gateway at all. It is configured entirely in the web server configuration. For example, if you are using Apache, it is configured in the httpd.conf and related Apache conf files.

The settings you've shown above are for the CSP gateway to Cache connection. If the gateway and Cache are on separate machines, you may also want to configure SSL for this connection. The gateway will be connecting to the SuperServer on Cache, so you will want to follow the instructions for configuring SuperServer SSL if you want to get this part working. The instructions for that are here:

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY...

You definitely can do this. These parts of the documentation might be useful. This is on configuring AD group based LDAP authentication:

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY...

LDAP authentication is straightforward to set up, but does need specific group names. If you can't use those group names for some reason, you might want to look at delegated authentication:

https://docs.intersystems.com/latest/csp/docbook/DocBook.UI.Page.cls?KEY...

With delegated, you'll have to write code to handle the authentication and authorization. It can be more flexible, but is also more responsibility.

Either of those options will let your users get permissions based on their AD groups. You'll also need to make the CSP page check for the particular access granted by that role, probably by doing a check for a permission on a resource.

It sounds like you're connecting from a browser to a webpage handled by your mirrorset. If this is the case, then you may want to start debugging the problem with a network trace, such as one collected with Wireshark.

Remember that this TLS connection is between the browser and the webserver. The connection between the web gateway and InterSystems IRIS is a separate connection which will be negotiated separately. Because of this, there aren't any settings inside InterSystems IRIS which will affect the browser to webserver connection.

If this isn't the type of connection you're doing, can you describe where you're connecting from and to?

You might be interested in this page:

https://docs.intersystems.com/ens20181/csp/docbook/DocBook.UI.Page.cls?K...

Or potentially some of the documents on this one:

https://www.intersystems.com/gt/

Database encryption uses AES. You select the key size when creating the key; 128, 192, and 256 bits are all options.
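
To make the key sizes concrete, here is a small Python sketch that generates random key material of each AES size.  It only illustrates the sizes; it is not how the product creates or stores its database encryption keys, and the function name is made up for this example.

```python
import secrets

AES_KEY_BITS = (128, 192, 256)  # the three key sizes AES defines

def new_aes_key(bits=256):
    """Return random key material of the requested AES key size."""
    if bits not in AES_KEY_BITS:
        raise ValueError(f"AES key size must be one of {AES_KEY_BITS}, got {bits}")
    # 128, 192 and 256 bits correspond to 16, 24 and 32 bytes.
    return secrets.token_bytes(bits // 8)
```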

If you have a specific question about standards not covered there, I would recommend contacting the WRC.