go to post Benjamin De Boe · Dec 19 Hi @Scott Roth , the %MANAGE_FOREIGN_SERVER privilege was only just introduced with 2024.2, as part of finalizing the full production support for Foreign Servers (see also release notes). I'm not sure though why it wouldn't appear after you created it. Can you confirm whether it's still there right after the CREATE SERVER command, whether you're using the same user for both connections, and whether or not you can CREATE FOREIGN TABLEs with that server (before logging off and / or after logging back in). I understand upgrading may not be straightforward, but the most logical explanation would be that the initial, crude privilege checking (that we replaced in 2024.2 as advertised) has a hole in it. thanks,benjamin
go to post Benjamin De Boe · Dec 19 No, I would leave out the semicolon at the end of that query. It's typically used as a statement separator, but not really part of query syntax itself. IRIS (as of 2023.2) will tolerate it at the end of a statement, but it doesn't seem that Spark really does anything with it as it wraps what you sent to dbtable with further queries, causing the error you saw. You may also want to apply .option(“pushDownLimit”, false)
go to post Benjamin De Boe · Dec 6 You can install any Python distribution you like (such as Anaconda), but the deep integration with IRIS is only available from 2022.1 onwards, so it'd be running "external" to IRIS and have to talk to IRIS through a client such as the Native API or DB-API.
go to post Benjamin De Boe · Dec 3 I'm afraid your IRIS version does not support Embedded Python. It was only introduced in 2022.1: https://docs.intersystems.com/iris20221/csp/docbook/DocBook.UI.Page.cls?...
go to post Benjamin De Boe · Dec 3 Hi, while others have already addressed your specific XML projection question, I'm wondering why you aren't considering to use IRIS' built-in support for Bulk FHIR exports: https://docs.intersystems.com/irisforhealthlatest/csp/docbook/DocBook.UI...
go to post Benjamin De Boe · Dec 3 See also my response to Vitaliy's post -- we intend to refresh the driver distribution page on GitHub to reflect the new distribution channel shortly.
go to post Benjamin De Boe · Dec 3 Indeed, as of 3.10.1, we're publishing our JDBC drivers directly to Maven when needed, offering bugfixes as well as enhancements independently of IRIS releases. This significantly increases our ability to address customer feedback. For convenience, we'll continue to ship jar files with IRIS, using the version that is current at the time of the IRIS release.
go to post Benjamin De Boe · Nov 18 Hi David, AutoParallel is based on a comprehensive formula weighing the cost (setup cost for the coordination work, writing and then recombining per-process results) against the benefits (independent work that can run in parallel). For queries not doing any aggregation, iow when the result rows correspond directly to rows in the table being queried (and especially if there are no JOINs), having to write and then read the per-process results, even when done in parallel, does not add value if there's no significant compute cost associated with for example validating a filter predicate. For the SELECT COUNT(*), the optimizer is satisfying that from the index (you don't seem to have a bitmap extent index, but that index on age is still plenty small), which is very cheap to read so 1M rows still don't weigh up against the setup costs. Anyway, AutoParallel works very well for most of our customers. It's based on a cost formula that incorporates a few constants representing the cost of IO and computations that generalize across infrastructure, so for some environments there may be cases where it over-estimates or under-estimates the exact cost, leading to edge cases where the other option might have yielded a slightly faster query, but generally the formula holds well and every now and then we review whether the constants need to be tuned (wrt newer standards for hardware). As for the particular example with 0.06 vs 0.24s, I think there may be something different at play there. The (presumed!) non-parallel case does 600k grefs whereas the parallel one only needs 318. Even if your result should only have 300 rows, I would expect it to need at least twice as many grefs (index lookup + master map), so I'd recommend giving that another try after ensuring table stats are up to date and comparing the query plans (for the %PARALLEL, %NOPARALLEL, and query without either hint). A possible explanation might be that your query tool (the SMP?) at the UI level was only retrieving the first 100 rows for the first query, and all of them for the second test.
go to post Benjamin De Boe · Nov 15 Then you'd need to define those four subscript levels as part of your IDKEY and have your data map look like this: <Data name="ModifyDDDD"> <Delimiter>":"</Delimiter> <Piece>1</Piece> </Data> Please see the examples attached to the series of articles I referenced earlier
go to post Benjamin De Boe · Nov 15 But what represents your master map and row level? is that are these 14 entries all corresponding to one and the same row, presumably with some sort of IDKEY consisting of "AC" and "MD"? We cannot say much meaningful without understanding that broader picture.
go to post Benjamin De Boe · Nov 15 The data in your sample globals is kind of terse :-) Can you please clarify what you expect the SQL projection to yield as rows and columns? Please also see Brendan's excellent series of articles about custom storage mappings: https://community.intersystems.com/post/art-mapping-globals-classes-1-3
go to post Benjamin De Boe · Nov 15 This is indeed expected behaviour. It's not the SELECT * itself, but the fact that this query is not applying any filtering or doing any other calculations that are worth parallelizing. So the query is asking to return all rows as-is, passing them back through a single connection/process. Therefore the optimizer argues there's no benefit in parallelizing that work, as the work of collating the per-process results back into a single resultset is pure overhead. The actual formula being applied is a little more subtle (a WHERE clause that is expected to only filter out a small fraction of the rows would not be enough to parallelize either) and as has been suggested you still need to hit the AutoParallel threshold for the process mgmt code not to outweigh the benefits (e.g. if there's only a few hundred rows). Thanks,benjamin
go to post Benjamin De Boe · Nov 7 I have only this morning updated my settings about notifications, which had a few things switched off that I didn't know existed as separate options. I'm wondering if some recent enhancement to the DC refining control on what to subscribe to was a little conservative in starting with notifications disabled. I believe I should be all set now, as I got a notification for your post right away now.
go to post Benjamin De Boe · Nov 7 Hi Andreas, thanks for your patience. I have now finally been able to reproduce your issue using the released 2024.2 kit, connecting to it from a JDBC client rather than the shell. I do not quite understand why the THROUGH command is throwing an error in that client-to-server context and not when invoked from the shell in code that is creating a "nested" server-to-server JDBC connection to the Foreign Server, but those seem to be the facts. This said, the good news is the exact same scenario works fine on 2024.3. I have tried sneaking the 2024.3 JDBC driver into the 2024.2 kit, but that doesn't seem to make a difference so it's a server-side code that fixed this. We're just two weeks away from 2024.3 going GA and there's a totally stable developer preview version of it available, so I hope that gets you by. Thanks,benjamin
go to post Benjamin De Boe · Oct 29 Hi @Andreas Schneider , I have just downloaded the released 2024.2 container as well as the latest 2024.3, set up foreign server connections from the former to the latter, and all THROUGH SERVER commands work as expected: [SQL]USER>>through server myhost select $zv 2. through server myhost select $zv | Expression_1 | | -- | | IRIS for UNIX (Ubuntu Server LTS for x86-64 Containers) 2024.3 (Build 206U) Fri Oct 11 2024 16:59:56 EDT | 1 Rows(s) Affected statement prepare time(s)/globals/cmds/disk: 0.0485s/6,382/43,267/0ms execute time(s)/globals/cmds/disk: 0.0028s/40/576/0ms query class: %SQL.PassthroughResult.Dynamic --------------------------------------------------------------------------- [SQL]USER>>through server myself select $zv 3. through server myself select $zv | Expression_1 | | -- | | IRIS for UNIX (Ubuntu Server LTS for x86-64 Containers) 2024.2 (Build 247U) Tue Jul 16 2024 09:29:04 EDT | 1 Rows(s) Affected statement prepare time(s)/globals/cmds/disk: 0.1686s/6,388/43,347/0ms execute time(s)/globals/cmds/disk: 0.0031s/40/576/0ms query class: %SQL.PassthroughResult.Dynamic --------------------------------------------------------------------------- [SQL]USER>>select $zv 4. select $zv | Expression_1 | | -- | | IRIS for UNIX (Ubuntu Server LTS for x86-64 Containers) 2024.2 (Build 247U) Tue Jul 16 2024 09:29:04 EDT | 1 Rows(s) Affected statement prepare time(s)/globals/cmds/disk: 0.0852s/35,278/160,395/0ms execute time(s)/globals/cmds/disk: 0.0001s/0/411/0ms query class: %sqlcq.USER.cls24 --------------------------------------------------------------------------- [SQL]USER>> I'm afraid I have no other suggestion than to re-check whether your 2024.2 instance and the JDBC driver it uses are in sync and there are no stale classpath entries in your SQL Gateway connection or SQL Data Lens config. FWIW, Foreign Tables use new xDBC Gateway infrastructure that only relies on the SQL Gateway connection definition for getting the server URL and credentials, so the "test" button in the SQL Gateway definition screen is not necessarily representative for xDBC Gateway connectivity. This is a temporary situation until the IRIS Wallet (secure credential score) is ready for the new Gateway (and Foreign Tables) to adopt, but we've never run into any differences in behaviour like the one you are facing. To validate whether the xDBC Gateway can connect and possibly glean more detail from the error message, please try this, replacing the host/port/credentials as appropriate for your setup: w ##class(%XDBC.Gateway.Connection).%GetConnection({ "host": "host.docker.internal", "port": 51774, "databasename": "USER", "datasource": "com.intersystems.jdbc.IRISDataSource", "user":"_SYSTEM", "password":"SYS","type":"jdbc" },.sc) d $system.OBJ.DisplayError(sc)
go to post Benjamin De Boe · Oct 24 hi @Andreas Schneider , sorry about the embarrassingly late response! It seems my DC email notifications aren't coming through anymore. The error you're seeing may well be because you might still be using an older JDBC or ODBC driver version. The THROUGH keyword at the start of the statement is brand new and was added to the JDBC or ODBC driver versions that ship with IRIS 2024.2 (v3.9.0 for JDBC), but if you're using a query tool such as DBeaver, they may still be using something older. As you already found / tried, you shouldn't add any (single) quotes around the statement to be sent to the foreign server. The <UNDEFINED> error is puzzling. This hasn't been reported by any of the customers we know are using THROUGH today (we've worked closely with at least one!), but it seems from your $zv string you are using one of the early Developer Preview kits and not the final 2024.2.0 release. Can you please try upgrading to the official release (build 247)? I can see in our internal tracking system at least one more related change went in between the DP build you listed and the final release, and have reasonable confidence that'll sort things. Apologies for the delayed response
go to post Benjamin De Boe · Oct 23 If you're on that release I'm afraid you'll have to consider legacy options you may have found in the doc already: %SQL.Import.Mgr and %SQL.Export.Mgr. They lack comprehensive documentation, but from the example in their shared superclass, you can derive pretty much what you'd need. I have not specifically tested my own SQL toolbox on 2021.1, but it doesn't rely on LOAD DATA for everything and may offer some suggestions on where to look next. Something you'll likely need when trying to import row IDs from one instance to another is to temporarily override the InsertIdentity setting. See the INSERT reference doc for pointers on how to enable manually specifying RowID values.
go to post Benjamin De Boe · Aug 28 You can tune the reconnection behaviour using SQL outbound adapter settings as documented here. This is intended behaviour we're not planning to change, though interested if other DC members have run into this and want to share their experiences. Also curious to hear if someone tried to keep a shell / telnet connection open for 4 months :-)
go to post Benjamin De Boe · Aug 8 Great stuff! This was really easy to get going with once I got the right OpenAI API key. I was curious how it'd do on a different (more complex) schema and it seemed to hold up quite well. Two things I noted so far (but I'm not done with poking around) is that it seems to add LIMIT n to queries every now and then, which is not valid IRIS SQL syntax (and likely you can suggest that in the prompt), and that it does weird things when you try to put your own SQL in the execute box (it restores the previous query - just a UI thing I guess). You guys really earned that first place in the contest!!