Question
· Jul 4, 2022

Web terminal: lost connection with server (code 1006)

Hi,

I have a problem with the recent update 4.9.4 of the WebTerminal.

This message appeared after the page loaded:

New update is available. Click here to install it now. Changelist:
4.9.4: No longer require /terminal to be at the root of the URL

So I installed the new version. Unfortunately, after refreshing the page, I get this message again and again:

WebTerminal lost connection with server (code 1006).
Attempting to restore session in 10 seconds...
Please, refresh the web page to start a new session.

WebTerminal lost connection with server (code 1006).
Attempting to restore session in 10 seconds...

I tried to reinstall the previous version we were using, 4.9.3, but that doesn't solve the problem (I also tried 4.9.2, with the same result).

The web application under System Administration -> Security -> Applications -> Web Applications seems to be identical to the one on another environment where WebTerminal works.

It seems to be a settings issue, but I don't know where to look.

Any ideas on how to solve this? :)

Thanks in advance!

Product version: IRIS 2021.1
$ZV: IRIS for Windows (x86-64) 2021.1 (Build 215U) Wed Jun 9 2021 09:39:22 EDT
Discussion (19)

Indeed, I also compared these two, and they are identical.

However, I noticed that a file, web.config, appeared at the root of the CSP folders. It seems to contain no settings whatsoever, since this is all there is in it:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
</configuration>

This is the only difference I found so far between the two environments.

Also, I should clarify that the 2 environments are on the same server, so two instances of IRIS are running at the same time on the same machine (same version, of course). Since one of them still works, I assume it is an InterSystems settings issue.

There might indeed be something happening here:

License Unit Use             Local  Distributed
Current License Units Used      35           35
Maximum License Units Used      35           65
License Units Authorized        64           64
Current Connections             39           43
Maximum Connections             39           71

I'm not sure about the difference between local and distributed, but it seems the maximum in the Distributed column (65) exceeds the number of license units authorized (64). Could that be the problem?

If it is, how can we solve it, given that the problem wasn't there before the update? I assume there might be some license management we can do in this case.
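
To make the comparison explicit, here is a small sketch that checks the figures from the table above against the authorized count. The numbers are copied from the table; the function name `over_limit` and the dictionary layout are only illustrative and are not part of any InterSystems tooling.

```python
# Figures copied from the license usage table above.
AUTHORIZED = 64
usage = {
    "local": {"current": 35, "maximum": 35},
    "distributed": {"current": 35, "maximum": 65},
}


def over_limit(usage, authorized):
    """Return (scope, kind, units) for every figure above the authorized count."""
    return [
        (scope, kind, units)
        for scope, figures in usage.items()
        for kind, units in figures.items()
        if units > authorized
    ]


for scope, kind, units in over_limit(usage, AUTHORIZED):
    print(f"{scope} {kind} units used: {units} > {AUTHORIZED} authorized")
```

Only the distributed maximum is flagged, which suggests the limit was reached at some point in the past rather than at the moment the table was captured.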

I'm not sure about the meaning of the Distributed column, but this does indeed look suspicious. If you have that option, I'd stop and start IRIS (which should clear existing licenses), and try the web terminal again. If it now works, you have at least found the cause.

As to a more permanent solution, I don't know. I don't use web terminal all that often, but I would hope that, when you log in, it adds a connection to a license that login may already have. Perhaps the web socket connection plays a part in this. Like I said, I haven't really investigated this; I just mentioned it so you'd have something to check.

I've been troubleshooting issues with WebTerminal that resulted in either long delays after login before the IRIS prompt displayed, or license count exhaustion (which generated the lost connection errors you've experienced). The change that made the most immediate difference was to switch the MPM module used by Apache from event to worker. The latter allows pre-allocation of resources that seem to better support websockets.

On Red Hat Linux, the MPM module is configured through /etc/httpd/conf.modules.d/00-mpm.conf:

Replace

LoadModule mpm_event_module modules/mod_mpm_event.so

with

LoadModule mpm_worker_module modules/mod_mpm_worker.so
ServerLimit     20
StartServers    10
MaxRequestWorkers 500
MinSpareThreads 75
MaxSpareThreads 250
ThreadsPerChild 25

The values above are likely overkill for many purposes; they're borrowed from @Mark Bolinsky's excellent article on HS web server configuration found here: https://community.intersystems.com/post/apache-httpd-web-server-configuration-healthshare

It does increase httpd's memory footprint by 30-40% over the "stock" configuration, but unless you're very tight on resources it shouldn't be an issue.
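
If you adjust these numbers for a smaller server, it helps to keep them consistent with each other. The sketch below is my own sanity check, not something from the article: it assumes the usual worker MPM relationships, namely that MaxRequestWorkers cannot exceed ServerLimit x ThreadsPerChild and should be a multiple of ThreadsPerChild. The function name `check_worker_settings` is hypothetical.

```python
# Values from the mpm_worker block above.
settings = {
    "ServerLimit": 20,
    "StartServers": 10,
    "MaxRequestWorkers": 500,
    "MinSpareThreads": 75,
    "MaxSpareThreads": 250,
    "ThreadsPerChild": 25,
}


def check_worker_settings(s):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    # Upper bound on simultaneous request threads across all child processes.
    capacity = s["ServerLimit"] * s["ThreadsPerChild"]
    if s["MaxRequestWorkers"] > capacity:
        problems.append(
            f"MaxRequestWorkers {s['MaxRequestWorkers']} exceeds "
            f"ServerLimit x ThreadsPerChild = {capacity}"
        )
    if s["MaxRequestWorkers"] % s["ThreadsPerChild"] != 0:
        problems.append("MaxRequestWorkers is not a multiple of ThreadsPerChild")
    if s["StartServers"] > s["ServerLimit"]:
        problems.append("StartServers exceeds ServerLimit")
    if s["MinSpareThreads"] > s["MaxSpareThreads"]:
        problems.append("MinSpareThreads exceeds MaxSpareThreads")
    return problems


print(check_worker_settings(settings) or "settings are consistent")
```

With the values above, 20 x 25 gives exactly 500, so MaxRequestWorkers sits right at the ceiling.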

I also heard from @John Murray that there will be another WebTerminal release shortly that addresses the IRIS/Caché version interrogation issue along with being kinder to your license count.

As long as the XML import compiled the classes, there's nothing else you need to do in order to benefit from the update.

I suggest you stop using WebTerminal for at least 15 minutes, then check in Portal's Web Sessions page (under System Operation) that no /terminalsocket sessions remain.

Next, launch a WebTerminal. Confirm it reports being version 4.9.5. Check that the Web Sessions page shows one /terminalsocket entry. Now close your WebTerminal browser. Refresh the Web Sessions page. The /terminalsocket entry should no longer be there.

I tried these operations.

There was indeed no /terminalsocket session on the Web Sessions page before I launched a new session. However, this session does not disappear when I close the page. It even creates another session for each attempt it makes, so one every 10 seconds. After waiting a bit (5 to 10 minutes), some begin to disappear.

It shows nothing more than the error message saying it can't connect, but it does seem to be using version 4.9.5; I found this in the index.js source file:

I notice the user is UnknownUser. Could this have an impact?

UnknownUser is normal for the /terminalsocket session.

Does the problem also occur if you start from an incognito/InPrivate instance of your web browser?

Previously you reported that WebTerminal is working OK on another of your InterSystems environments. Is this still the case? Is that one using 4.9.5 yet?

For the one that fails to make the websocket connection I suggest you use F12 in your browser to open Developer Tools, and make sure network tracing is active before you go to the /terminal/ URL to open a WebTerminal. Look at the network trace messages, and compare them to equivalent messages from a different web browser session that connects successfully to WebTerminal on your other environment.
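
When reading the trace, it may help to know that 1006 is not something WebTerminal invented: it is one of the standard WebSocket close codes from RFC 6455, and it means the connection ended without a proper Close frame (so the real cause usually has to be found elsewhere, such as in the handshake response or server-side logs). Here is a minimal lookup sketch; the helper name `describe_close_code` is just for illustration.

```python
# Close codes defined in RFC 6455, section 7.4.1.
CLOSE_CODES = {
    1000: "Normal Closure",
    1001: "Going Away",
    1002: "Protocol Error",
    1003: "Unsupported Data",
    1005: "No Status Received (reserved, never sent on the wire)",
    1006: "Abnormal Closure (reserved, connection lost without a Close frame)",
    1007: "Invalid Frame Payload Data",
    1008: "Policy Violation",
    1009: "Message Too Big",
    1010: "Mandatory Extension",
    1011: "Internal Error",
    1015: "TLS Handshake Failure (reserved, never sent on the wire)",
}


def describe_close_code(code):
    """Map a WebSocket close code to a short description."""
    if code in CLOSE_CODES:
        return CLOSE_CODES[code]
    if 3000 <= code <= 3999:
        return "Registered for libraries, frameworks and applications"
    if 4000 <= code <= 4999:
        return "Private use (application-defined)"
    return "Unknown or unassigned"


print(1006, "->", describe_close_code(1006))
```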

I'm not clear whether WebTerminal ever worked on this InterSystems instance, e.g. with WebTerminal 4.9.3.

Maybe also worth checking what the InterSystems security audit log is showing around the time you fail to connect with WebTerminal. You may need to turn this auditing on, and perhaps enable it for some additional event types.

I found what the issue was.

There was indeed something happening in the audit log. Each connection attempt from the page was adding an entry there:

I added the %All role to the /terminalsocket application and now the error is gone. I have no clue why this happened though; your package doesn't seem to change this kind of setting (or maybe I missed it).

In any case, thank you for your help!