Question
Scott Roth · Jul 19

Interoperability - Interface Maps

When I try to run Interoperability -> Interface Maps in 2022.1 on a very large namespace, I keep getting timeout errors. Even when I filter by Category, Text Search, etc., it still errors out no matter what. However, if it is run in one of our smaller namespaces, it runs just fine. WRC told us the namespace is too big, but the number of services, processes, and operations you have running should not matter.

  • Has anyone had this issue with Interoperability - Interface Maps timing out because of the size of the namespace?
  • Does anyone have a way around it?
  • Or does anyone else have another answer for the issue?

Right now this is in our DEV system, but I would like to get this running for the rest of my team so we can meet certain CMS/JACO requirements for documentation and for knowing where data is flowing.

Thanks

Scott

Product version: IRIS 2022.1
$ZV: IRIS for UNIX (Red Hat Enterprise Linux 8 for x86-64) 2022.1 (Build 209U) Tue May 31 2022 12:13:24 EDT [HealthConnect:3.5.0] [HealthConnect:3.5.0]

Is the timeout occurring while waiting for the System Management Portal web page to return HTML content to the web browser?

In the "Web Gateway" configuration, on the "Default Parameters" page, the "Server Response Timeout" setting defaults to 60 seconds.

Also, if you are using the internal Apache web server for SMP (install dir/httpd/conf/httpd.conf),
the default "Timeout" is 300 seconds.

The lower of these values is applied. So you could start by increasing the Web Gateway timeout setting to see if that gives the page more time to return content.

I did increase the timeout to 180, 300, and 1800. Each time I got an HTTP 500 error.

Hey Scott.

I'm not sure if this will work at all, but have you tried extending your timeout for your CSP gateway?

Management Portal -> System Administration -> Configuration -> Web Gateway Management is the route to this. The username you'll then need is "CSPSystem" and the password should be the default password used when installing the system.

From within here you can navigate to "Default Parameters" and increase the Server Response Timeout parameter.

Below is the error I am getting: "Executing your search took longer than the timeout for requests....." This seems to be more of a System Management Portal error while waiting for a response. Is there a way to adjust that, or to make the search run more efficiently when you have a large namespace?

Believe you are looking for System Configuration->Security->Applications->Web Applications, then click on the link for the webapp that looks like: /csp/healthshare/devclin and adjust the Session Timeout setting.

Default is 900 seconds I believe but we often increase ours to 1800 (30 minutes).

However, I'm not sure this will address it for you, as I'm working with a rather large Production as well and my result still comes back within a few seconds. You mentioned an HTTP 500, which is a generic error. I would have expected a 408 (Request Timeout) or 504 (Gateway Timeout) if it were truly a session timeout issue.

Wonder if you can see any error in the Application Error log.

I made the suggested change to the Web Application but still received...

I did not see anything in the Application log, but I did run a trace on the Web Gateway and found this... But it appears the Web Gateway returned a 200?

HTTP/1.1 200 OK
Content-Type: application/x-csp-hyperevent
Set-Cookie: CSPSESSIONID-SP-52773-UP-csp-healthshare-=000000010000NDWvnIIgxIxABp0JuJLbDWSBd89wThY$r1wuVK; path=/csp/healthshare/; httpOnly; sameSite=lax;
Cache-Control: no-cache
Date: Mon, 25 Jul 2022 11:46:01 GMT
Expires: Thu, 29 Oct 1998 17:04:19 GMT
Pragma: no-cache
Set-Cookie: CSPWSERVERID=hA035JK3; path=/; httpOnly;
Content-Length: 614 000000010000NDWvnIIgxIxABp0JuJLbDWSBd89wThY$r1wuVK
#R
// InvokeInstanceMethod:
try {
}
catch(ex) {
var text = 'A JavaScript error occurred while invoking a server instance method.\nClass: EnsPortal.InterfaceMaps\nMethod: PrepareCancel\n';
zenExceptionHandler(ex,arguments,text);
}
// %EndChangeTracking: sync client with server changes
try {
}
catch(ex) {
zenExceptionHandler(ex,arguments,'A JavaScript error occurred in %EndChangeTracking.');
}
delete _zenProxyIndex[-2]; if (zenPage && zenPage.onServerMethodReturn) {
zenPage.onServerMethodReturn('PrepareCancel');
}
#OK
1172932

Let's see how much time it takes to run in a non-web context. There would be no timeouts in that case.

Could you please run this in your interoperability namespace:

set start=$h,rs=##class(Ens.InterfaceMaps.Utils).EnumeratePathsFunc(),end=$h
write "Time in seconds: ", $p(end, ",", 2)-$p(start, ",", 2)

If you run the query in SMP with parameters, set them here too (see the EnumeratePaths query docs).

Also, after you obtain the result set, you can output it to the terminal using the %Display method, or to a file using the %DisplayFormatted method.

Agree with this approach; I was typing it up as well. I don't believe you're really seeing a request timeout; I think it's masking the real issue.

Node: int-lxiris-vd01.unix.osumc.edu, Instance: DEV

USER>zn "DEVCLIN"

DEVCLIN>set start=$h,rs=##class(Ens.InterfaceMaps.Utils).EnumeratePathsFunc(),end=$h write "Time in seconds: ", $p(end, ",", 2)-$p(start, ",",2)
Killed

It timed out in the cache session as well.

I am not seeing much in the logs. I tried executing it from the GUI again and this popped up in the logs...

alerts.log
07/25/22-11:07:30:783 (1170818) 2 [Generic.Event] Process terminated abnormally (pid 1407120, jobid 0x000501f1)

messages.log
07/25/22-11:07:30:783 (1170818) 2 [Generic.Event] Process terminated abnormally (pid 1407120, jobid 0x000501f1)
07/25/22-11:07:30:802 (1170818) 0 [Generic.Event] cleaned dead job, pid: 1407120, jobid 0x000501f1
07/25/22-11:08:29:835 (1407553) 2 [Utility.Event] ResourceCleanup: Dead job cleanup from '' job '1407120' of 'gbl:^IRIS.Temp.cspServer(1407120)'

csp.log
local-time="Mon Jul 25 11:04:14 2022" wg-build="RT 2201.1823 (linux/apapi:srv=2.4.52/apr=1.7.0/apu=1.6.1/mpm=worker)" wg-log-level=-1 when="2022-07-25 15:04:14.745" level=WARNING event=WebGateway.ProcessRequest pid=1170851 thread-id=139680926398208 csp-connection-no=0 csp-server="LOCAL" csp-server-pid=1407120 csp-request-id=6f7 csp-session-id=oShZWGY9SY csp-remote-addr=10.95.70.50 csp-page="POST /csp/healthshare/devclin/%CSP.Broker.cls" text="Error Condition" details="CSP application closed the connection before sending a complete response"
 

Since messages.log mentions IRIS.Temp.cspServer does it mean there is an issue with globals?

Those are InterSystems IRIS logs. You need the OS logs, namely syslog (check /var/log/syslog or /var/log/messages). Search for log entries about the "killed process" around the time your process was killed.

To be extra sure, run write $job before calling the query to get the process ID. It should match the ID of the killed process.

After you establish that your process was killed by the OOM killer, you need to report that to WRC, as a fix probably requires, if not access to, then at least an overview of the production configuration that causes it.
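
If it helps, here is a minimal sketch of that log search in Python. It assumes the kernel writes its usual "Killed process <pid> (<name>)" message into the syslog file; the helper name find_oom_kills and the file path in the usage comment are only illustrative, not part of any InterSystems tooling.

```python
import re

# Matches the "Killed process <pid> (<name>)" wording the Linux kernel logs
# when the OOM killer terminates a process.
OOM_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")

def find_oom_kills(lines, pid=None):
    """Return (pid, process_name, line) for each OOM-kill entry.

    If pid is given (for example the value printed by `write $job`),
    only entries for that process ID are returned.
    """
    hits = []
    for line in lines:
        m = OOM_RE.search(line)
        if m and (pid is None or int(m.group(1)) == pid):
            hits.append((int(m.group(1)), m.group(2), line.rstrip("\n")))
    return hits

# Example usage (path is an assumption; reading it usually needs root):
# with open("/var/log/messages", errors="replace") as f:
#     for found_pid, name, line in find_oom_kills(f, pid=12345):
#         print(line)
```

An empty result for your PID would suggest the process was not killed by the OOM killer, or that the kernel messages are going to a different log on your system.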

Jul 25 13:21:31 int-lxiris-vd01 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-2001341.slice/session-1349.scope,task=irisdb,pid=1411729,uid=327
Jul 25 13:21:31 int-lxiris-vd01 kernel: Out of memory: Killed process 1411729 (irisdb) total-vm:64515128kB, anon-rss:40949392kB, file-rss:216kB, shmem-rss:21504kB, UID:327 pgtables:100648kB oom_score_adj:0
 

Yeah, the process had about 64.5 GB of virtual memory allocated and about 40 GB actually used.

Not surprised OOM killed that.
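
For anyone who wants to check the numbers, a small sketch of the arithmetic on the kernel line above. It assumes the kernel's "kB" counters are in units of 1024 bytes, so the results are in GiB; oom_sizes_gib is just a made-up helper name.

```python
import re

def oom_sizes_gib(line):
    """Extract every '<name>:<n>kB' field from an OOM-kill line, converted to GiB."""
    fields = re.findall(r"([\w-]+):(\d+)kB", line)
    # kB here is taken as KiB (1024 bytes), so divide by 1024*1024 for GiB.
    return {name: int(kb) / (1024 * 1024) for name, kb in fields}

# The "Out of memory" line from the syslog excerpt in this thread.
line = ("Out of memory: Killed process 1411729 (irisdb) total-vm:64515128kB, "
        "anon-rss:40949392kB, file-rss:216kB, shmem-rss:21504kB, UID:327 "
        "pgtables:100648kB oom_score_adj:0")

for name, gib in oom_sizes_gib(line).items():
    print(f"{name}: {gib:.2f} GiB")
```

In binary units that works out to roughly 61.5 GiB of virtual memory and about 39 GiB of resident anonymous memory for the single irisdb process.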

I went ahead and opened a WRC per your suggestion.

Thanks

Just FYI..... At least one, and probably many, of my BPLs have 30+ IF statements. Interface Maps has a limit on how many IF statements it can handle. It was suggested that I split apart my BPLs or wait for a fix from Development; a request has been entered into the internal InterSystems system.