Replies by Benjamin De Boe for InterSystems Developer Community

Benjamin De Boe · Jan 3, 2022

You can configure your Java Gateway's JVM arguments through the [Gateways] section in the CPF file or, preferably, through the API documented here

go to post

Benjamin De Boe · Jul 29, 2021

The iFind search portal demo includes a simple class query to find similar documents within a single iFind index. It's only pretty basic and somewhat picky (assuming the demo setup), building on the dominance score for each entity, and may not guard against that difference in length issue you're seeing with BM25. There is a similar method in iKnow when your data would already be in an iKnow domain.

There would indeed be value in providing %SIMILARITY support for iFind indexed fields, leveraging the standard/enhanced algorithm on top of word tokens. I'll log that as an enhancement request and we can follow up internally. Obviously, I'm interested in experiences or advice of other DC members here

go to post

Benjamin De Boe · Jan 28, 2021

IRIS will automatically create an auto-incrementing RowID and leverage it as the IdKey for you, so unless you want anything other than the default, you shouldn't define such a field or index explicitly.

If all you want is control the name, take a look at the SqlRowIdName class parameter. If you need control over its behaviour, what you've set up is appropriate and you can leverage SqlRowIdPrivate to get rid of the default additional projection of the RowID. However, unless there's a good reason for controlling the IdKey, you should try to avoid overriding it as it may deprive you of certain practical features and efficiencies such as bitmap indices and an extent index.

go to post

Benjamin De Boe · Jan 11, 2021

How do you define "first 12 beds"? Is that based on some identifier or other field in the database? Otherwise %VID may help you.

FWIW: We're currently working on a more comprehensive implementation of window functions and ROW_NUMBER(), but that'll be for a release probably around this summer.

go to post

Benjamin De Boe · Jan 5, 2021

You can expose this information through setting the IFINDMAPPINGS parameter to 1:

[class_name]_[index_name]_WordRec: stores which words appear in each record in this index. See also %iFind.Index.AbstractWordRec.
[class_name]_[index_name]_WordSpread: stores the total number of records in which this word appears in this index. See also %iFind.Index.AbstractWordSpread.
[class_name]_[index_name]_WordPos stores which word occurs at which position in a record, so it can be joined to the AttributePos table. See also

So by doing a COUNT() on the WordPos table, you should find what you're looking for IFF it corresponds to an actual word. If you're using wildcards, you might combine with %iFind.FindWords() as a TVF, but that'd still be looking for individual words only:

SELECT COUNT(*) 
FROM Test_IFind.IFind_IF_WordPos 
WHERE WordId IN (SELECT WordId FROM %iFind.FindWords('ab*'))

If you want to count any kind of match, your highlight trick is probably the nicest way to get at it.

(as an aside: with the introduction of the %iFind.Index.Minimal index type in 2020.1, which BTW was Eduard's suggestion ;-), it seems the class reference for the IFINDMAPPINGS projections added by the .Basic class but not in .Minimal got lost. We'll fix that shortly!)

go to post

Benjamin De Boe · Oct 28, 2020

Are you asking about the web interface or the underlying API?

We are currently working on a new all-SQL loader that would be easy to use from any application or prompt (or to build a GUI on :-) )

go to post

Benjamin De Boe · Sep 11, 2020

Hi Mark,

can you be more specific on the nature of the data and possibly application that's going to be migrated? HealthShare is more a suite of applications with standardized data models underneath, running on top of InterSystems IRIS for Health. To ingest data into HealthShare means transforming it through its APIs, whereas IRIS for Health is more comparable to a relational database like DB2.

SQL Gateways allow mapping tables in remote non-InterSystems databases (such as DB2) to IRIS so they can be accessed as if they were local. That can be helpful during migrations as well as heterogeneous production environments. TSQL is a specific set of extensions of the SQL language used by Sybase and MS SQL Server that we support to ease migration scenarios for new customers with applications written in TSQL.

hope this helps,
benjamin

go to post

Benjamin De Boe · Aug 26, 2020

Class Test.Person Extends (%Persistent, %XML.Adaptor)
{ 
Property Name As %String; 
Property Id As %Integer [ Identity ]; 
}

go to post

Benjamin De Boe · Aug 25, 2020

This is not possible today, but happens to be something due for 2020.4 or 2021.1 in the form of expression indices.

The documentation may be slightly confusing here as the BuildValueArray() mechanism applies to non-collection properties and not "array of" or "list of" properties. The section that describes it just happens to be placed right after the section describing how to use the ELEMENTS trick to index them in their entirety. Note that you can use BuildValueArray() for %ArrayOfObjects properties, but those aren't projected as a child table.

A possible workaround would be through a separate property that's just there as an intermediary for your index:

Property AR As array Of Sample.Address;
Property AllStates As %String [ Calculated, ReadOnly, SqlComputeCode = {set {*} = {%%ID}}, SqlComputed ];
Index AllStatesIDX On AllStates(ELEMENTS);
ClassMethod AllStatesBuildValueArray(value, ByRef valueArray) As %Status
{
  kill valueArray
  set tObj = ..%OpenId(value), tKey = ""
  for {
    set tAddress = tObj.AR.GetNext(.tKey)
    quit:tKey=""
    set valueArray(tKey) = tAddress.State
  }
  quit $$$OK
}

Then you can include it in queries:

select * from sample.person where FOR SOME %ELEMENT(AllStates) (%VALUE = 'KY')

go to post

Benjamin De Boe · Jul 15, 2020

Hi David, maybe you can elaborate a little more on the particular latency challenge you faced?

As Dmitriy mentioned, sharding will spread your data and corresponding query workload across multiple nodes in order to achieve higher efficiencies on very large datasets and is especially fit for read-mostly workloads. ECP, when used in a typical application server setup, is meant to distribute user-bound workload across multiple servers, so serving a slightly different goal. Depending on your use case, either of those (or a combination of them) can be more appropriate. See also this overview and introductory video for more info.

go to post

Benjamin De Boe · Jul 9, 2020

Maybe this Getting Started guide is also a worthwhile reference

go to post

Benjamin De Boe · Jul 9, 2020

if that column is currently in a string format (you can tell from the column/resultset metadata), you can use a different pattern to check if it meets the raw format:

SELECT CASE WHEN '01/01/2020' %PATTERN '1.2N1"/"1.2N1"/"4N' THEN 'valid' ELSE 'invalid' END

if you also want to check if that resolves to a valid date, you can use CAST or TO_DATE, but those will throw errors if they don't resolve to a valid date, so a small custom function in ObjectScript is perhaps not a bad idea

go to post

Benjamin De Boe · May 11, 2020

Depends on how you're set up. There is a setting to allow specifying those values that is meant for this type of bulk load scenario.

@Daniel Buxton : the multi-row insert syntax you're asking about is currently not supported by IRIS SQL (as you figured by now :-)). Robert's approach using UNION all is a creative workaround, but depending on where your big INSERT is coming from you might also just be able to generate / rewrite it into separate INSERTs? Once you have that in place, you can use the SQL Shell's RUN command or the ObjectScript $SYSTEM.SQL.DDLImport() method to execute your script file.

go to post

Benjamin De Boe · Mar 13, 2020

and if you prefer something more database-independent, you can use the standard INFORMATION_SCHEMA package:

SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE COLUMN_NAME LIKE 'PATID%'

go to post

Benjamin De Boe · Mar 11, 2020

FWIW, you can just put as many conditions as you want in the same SQL query. InterSystems IRIS includes an excellent SQL optimizer that will consider all the conditions (and other clauses) in your query and decide on the most efficient path to access the data.

go to post

Benjamin De Boe · Feb 20, 2020

Hi Darko,

LONGVARCHAR is actually storing the text as a stream, so to make LIKE work, you'll need to convert it to a string, e.g. using SUBSTRING(), in the expression you're serving to the LIKE operator. The following works fine for me:

SELECT SUBSTRING(text, 1, 999999) FROM table WHERE SUBSTRING(text, 1, 9999999) LIKE '%abc%'

This looks a little clumsy, but streams are really meant for huge data that wouldn't fit in a VARCHAR. Note that you can get a massive amount of text in VARCHAR (%String) columns these days, so maybe that's worth considering.

Separately, when working with iFind, that'll provide faster search results and it also transparently takes care of the stream aspect.

go to post

Benjamin De Boe · Jan 9, 2020

Also, which versions are you using (on the server and client sides)?

go to post

Benjamin De Boe · Jan 2, 2020

In addition to the suggestions made earlier (to provide more context such as full query & plan), you may also want to consider POSIXTIME as the data type for your date / time columns. It encodes dates as a simple 64-bit integer, which makes for efficient storage, indices and queries.

go to post

Benjamin De Boe · Jan 2, 2020

Could you share the rest of the error message (hidden behind the alert) and more specifics on the ODBC connection type?

And is there anything special about these tables or do you get this for each and every table?

go to post

Benjamin De Boe · Nov 29, 2019

No, we indeed don't support that syntax feature. Like Evgeny said, this recursion is something we'd typically try to wrap inside ObjectScript methods, which you could then expose as a stored procedure.