Question
Jules Pontois · Jul 4

Web terminal: lost connection with server (code 1006)

Hi,

I have a problem with the recent update 4.9.4 of the WebTerminal.

This message appeared after the page loaded:

New update is available. Click here to install it now. Changelist:
4.9.4: No longer require /terminal to be at the root of the URL

So I installed the new version. Unfortunately, after refreshing the page, I get this message again and again:

WebTerminal lost connection with server (code 1006).
Attempting to restore session in 10 seconds...
Please, refresh the web page to start a new session.

WebTerminal lost connection with server (code 1006).
Attempting to restore session in 10 seconds...

I tried to reinstall the version we were using before, 4.9.3, but that didn't solve the problem (I also tried 4.9.2, with the same result).

The web application under System Administration -> Security -> Applications -> Web Applications seems to be identical to the one on another environment where WebTerminal works.

It seems to be a settings issue, but I don't know where to look.

Any ideas on how to solve this? :)

Thanks in advance!

Product version: IRIS 2021.1
$ZV: IRIS for Windows (x86-64) 2021.1 (Build 215U) Wed Jun 9 2021 09:39:22 EDT
Discussion (19)

Thanks for the suggestion, I opened an issue on their GitHub.

Yes, I can confirm that I use this kind of URL. It doesn't work with any of these:

  • Apache like this : fxhfgsevfapp57:56773/terminal/
  • HTTP like this : this.is.url.com:56773/terminal/
  • HTTPS like this : this.is.url.com/terminal/

When comparing between an environment that works and the one that doesn't, did you also check the settings of the /terminalsocket application? If not, please do that.

Indeed, I also compared these two, and they are identical.

However, I noticed that a web.config file has appeared at the root of the CSP folders. It seems to contain no settings at all, since this is all there is in it:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
</configuration>

This is the only difference I found so far between the two environments.

Also, maybe I should specify that the two environments are on the same server, so two instances of IRIS are running at the same time on the same machine (same version, of course). Since one of them still works, I assume it is an InterSystems settings issue.

I've seen this happen on a CE docker instance; in this case the cause was that licenses ran out. I did not investigate further, but it appeared each start of a web terminal caused a new license to be consumed. Perhaps this could be your problem?

So far I don't think our license is exhausted, since we can still use the entire product without any limitation. Or did I misunderstand the meaning of "license to be consumed"?

Also, we're not using the Docker version of InterSystems IRIS. And the two environments I mentioned are on the same machine, so there are two instances of IRIS running on a Windows server.

You can check whether you still have licenses available in the management portal: System Operation -> License Usage. If Current License Units Used equals License Units Authorized, web terminal can't allocate a new license, which it does seem to need. If not, yours is a different problem.

There might be something happening here indeed :

License Unit Use              Local  Distributed
Current License Units Used       35           35
Maximum License Units Used       35           65
License Units Authorized         64           64
Current Connections              39           43
Maximum Connections              39           71

I'm not sure about the difference between Local and Distributed, but it seems like the maximum units used exceeded the units authorized in the Distributed column. Could that be the problem?

If it is, how can we solve that, given that the problem wasn't there before the update? I assume there might be some license management we can do in this case.
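
For reference, the rule described earlier in the thread (no new WebTerminal session once the units used reach the units authorized) can be applied to the figures above with a minimal sketch. The class and method names are hypothetical, and the sketch only reads the numbers as posted; it does not settle what the Distributed column means.

```python
from dataclasses import dataclass


@dataclass
class LicenseUsage:
    """One column of the License Usage page (names are illustrative)."""
    current_used: int
    maximum_used: int
    authorized: int

    def exhausted_now(self) -> bool:
        # No unit is free at this moment.
        return self.current_used >= self.authorized

    def peak_reached_limit(self) -> bool:
        # At some point since startup, usage hit or passed the limit.
        return self.maximum_used >= self.authorized


# Figures copied from the table posted above.
local = LicenseUsage(current_used=35, maximum_used=35, authorized=64)
distributed = LicenseUsage(current_used=35, maximum_used=65, authorized=64)

for name, usage in (("Local", local), ("Distributed", distributed)):
    print(name, "exhausted now:", usage.exhausted_now(),
          "| peak reached limit:", usage.peak_reached_limit())
```

Read this way, neither column is exhausted at the moment of the snapshot (35 of 64), but the Distributed peak of 65 is above the 64 authorized units, which is consistent with the limit having been hit at some earlier point.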

I'm not sure about the meaning of the Distributed column, but this does indeed look suspicious. If you have that option, I'd stop and start IRIS (which should clear existing licenses), and try the web terminal again. If it now works, you have at least found the cause.

As to a more permanent solution, I don't know. I don't use web terminal all that often, but I would hope that, when you log in, it adds a connection to a license that login may already have. Perhaps the web socket connection plays a part in this. Like I said, I haven't really investigated this; I just mentioned it so you'd have something to check.

I've been troubleshooting issues with WebTerminal that resulted in either long delays after login before the IRIS prompt displayed, or license count exhaustion (which generated the lost connection errors you've experienced). The change that made the most immediate difference was to switch the MPM module used by Apache from event to worker. The latter allows pre-allocation of resources, which seems to better support websockets.

On RedHat Linux, the MPM module is configured through /etc/httpd/conf.modules.d/00-mpm.conf:

Replace

LoadModule mpm_event_module modules/mod_mpm_event.so

with

LoadModule mpm_worker_module modules/mod_mpm_worker.so
ServerLimit     20
StartServers    10
MaxRequestWorkers 500
MinSpareThreads 75
MaxSpareThreads 250
ThreadsPerChild 25

The values above are likely overkill for many purposes; they're borrowed from @Mark Bolinsky's excellent article on HealthShare web server configuration, found here: https://community.intersystems.com/post/apache-httpd-web-server-configuration-healthshare
This configuration does increase httpd's memory footprint by 30-40% over the "stock" configuration, but unless you're very tight on resources it shouldn't be an issue.
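
As a quick sanity check on the worker values quoted above, here is a minimal sketch of the arithmetic that ties them together. It assumes the usual worker MPM relationships from the Apache documentation (MaxRequestWorkers bounded by ServerLimit x ThreadsPerChild and a multiple of ThreadsPerChild; MaxSpareThreads at least MinSpareThreads + ThreadsPerChild). The function names are made up for illustration.

```python
# Values copied from the worker MPM configuration quoted above.
config = {
    "ServerLimit": 20,
    "StartServers": 10,
    "MaxRequestWorkers": 500,
    "MinSpareThreads": 75,
    "MaxSpareThreads": 250,
    "ThreadsPerChild": 25,
}


def check_worker_config(cfg: dict) -> list:
    """Return a list of human-readable problems; empty means consistent."""
    problems = []
    capacity = cfg["ServerLimit"] * cfg["ThreadsPerChild"]
    if cfg["MaxRequestWorkers"] > capacity:
        problems.append(
            f"MaxRequestWorkers exceeds ServerLimit x ThreadsPerChild ({capacity})")
    if cfg["MaxRequestWorkers"] % cfg["ThreadsPerChild"] != 0:
        problems.append("MaxRequestWorkers is not a multiple of ThreadsPerChild")
    if cfg["MaxSpareThreads"] < cfg["MinSpareThreads"] + cfg["ThreadsPerChild"]:
        problems.append("MaxSpareThreads is below MinSpareThreads + ThreadsPerChild")
    return problems


def startup_threads(cfg: dict) -> int:
    """Threads created at startup: one batch of ThreadsPerChild per server."""
    return cfg["StartServers"] * cfg["ThreadsPerChild"]


print("problems:", check_worker_config(config))
print("threads at startup:", startup_threads(config))
```

With these numbers, 20 servers x 25 threads gives exactly the 500 request workers configured, and 10 servers at startup pre-allocate 250 threads, which is where the extra memory goes. Scaling StartServers and ServerLimit down together is the obvious knob if memory is tight.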

I also heard from @John Murray that there will be another WebTerminal release shortly that addresses the IRIS/Caché version interrogation issue along with being kinder to your license count.

We actually are short on memory, so this might not be a solution for us.

I guess we're gonna wait for this new version while investigating the network protection further, then.

Thanks for the insights !

WebTerminal 4.9.5 is now available. After updating to this you should no longer see /terminalsocket web sessions hanging around for 15 minutes after WebTerminal browser windows have been closed.

This should help with license starvation issues.

Sorry for the late response, I didn't see the notification.

Unfortunately, the update didn't solve the problem. Is there anything particular I need to do besides importing the new package (still in the USER namespace?)?

As long as the XML import compiled the classes, there's nothing else you need to do in order to benefit from the update.

I suggest you stop using WebTerminal for at least 15 minutes, then check in Portal's Web Sessions page (under System Operation) that no /terminalsocket sessions remain.

Next, launch a WebTerminal. Confirm it reports being version 4.9.5. Check that the Web Sessions page shows one /terminalsocket entry. Now close your WebTerminal browser. Refresh the Web Sessions page. The /terminalsocket entry should no longer be there.

I tried these operations.

There was indeed no /terminalsocket entry on the Web Sessions page before I launched a new session. However, this session does not disappear when I close the page. It even creates another session for each attempt it makes, so one every 10 seconds. After waiting a bit (5 to 10 minutes), some begin to disappear.

It doesn't show anything more than the error message saying it can't connect, but it does seem to be using version 4.9.5; I found this in the index.js source file:

I notice the user is UnknownUser. Could this have an impact?

UnknownUser is normal for the /terminalsocket session.

Does the problem also occur if you start from an incognito/InPrivate instance of your web browser?

Previously you reported that WebTerminal is working OK on another of your InterSystems environments. Is this still the case? Is that one using 4.9.5 yet?

For the one that fails to make the websocket connection I suggest you use F12 in your browser to open Developer Tools, and make sure network tracing is active before you go to the /terminal/ URL to open a WebTerminal. Look at the network trace messages, and compare them to equivalent messages from a different web browser session that connects successfully to WebTerminal on your other environment.
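
For anyone who would rather script this comparison than read the browser trace, here is a minimal sketch that sends a WebSocket opening handshake with Python's standard library and reports the HTTP status the server answers with. Close code 1006 only means the connection ended without a close frame, so the HTTP status of the upgrade request is often more informative. The function names are hypothetical, the status descriptions are generic HTTP semantics rather than anything WebTerminal-specific, and the websocket path must be taken from your own network trace.

```python
import base64
import http.client
import os


def build_upgrade_headers() -> dict:
    """Headers for a WebSocket opening handshake (RFC 6455, section 4.1)."""
    key = base64.b64encode(os.urandom(16)).decode("ascii")  # 16 random bytes
    return {
        "Upgrade": "websocket",
        "Connection": "Upgrade",
        "Sec-WebSocket-Key": key,
        "Sec-WebSocket-Version": "13",
    }


def describe_status(status: int) -> str:
    """Rough, generic reading of the handshake's HTTP status."""
    if status == 101:
        return "upgrade accepted"
    if status in (401, 403):
        return "refused: authentication or authorization"
    if status == 404:
        return "path not found"
    return f"unexpected HTTP status {status}"


def probe(host: str, port: int, path: str, timeout: float = 5.0) -> str:
    """Send the handshake over plain HTTP and describe the answer."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path, headers=build_upgrade_headers())
        return describe_status(conn.getresponse().status)
    finally:
        conn.close()
```

Calling `probe(host, port, path)` against the working and the failing instance, with the host, port and websocket path copied from the browser trace, gives two results to compare. For an HTTPS front end, `http.client.HTTPSConnection` would be needed instead.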

I'm not clear whether WebTerminal ever worked on this InterSystems instance, e.g. with WebTerminal 4.9.3.

Maybe also worth checking what the InterSystems security audit log is showing around the time you fail to connect with WebTerminal. You may need to turn this auditing on, and perhaps enable it for some additional event types.

I found what the issue was.

There was indeed something happening in the audit log. Each connection attempt by the page was adding a line to it:

I added the %All role to the /terminalsocket application and now the error is gone. I have no clue why this happened though; your package didn't seem to change this kind of parameter (or maybe I missed it).

In any case, thank you for your help!

Instead of adding the %All role to the /terminalsocket web app I suggest you add %DB_IRISLIB which should be sufficient to solve your issue.

My guess is, this environment used to give public %DB_IRISLIB:R but then someone tightened security by removing this, around the time you upgraded WebTerminal.

Such a change ought to show up in the audit log.