Benjamin De Boe · Aug 8, 2018

Hi Robert,

in 2018.2, we're introducing a feature called "coordinated backup", which basically allows adding a checkpoint in the journal files of all participating instances so you can roll them back to a synchronized state. We were just working on the docs for that feature the other week and it's four pages if you'd want the comprehensive answer to your question, so this is just a simplified version :-)

Please note that we currently do not support cross-shard transactions on sharded tables. It's not a common requirement for the types of use cases our sharding implementation was designed for (typically more analytical queries), but we're happy to discuss specific scenarios in the context of a POC to see what guarantees can be provided through appropriate application & schema design.

thanks,

benjamin

Benjamin De Boe · Jul 30, 2018

Note that in InterSystems IRIS 2018.2, you'll be able to save a PMML model straight into InterSystems IRIS from SparkML, through a simple iscSave() method we added to the PipelineModel interface. You can already try it for yourself in the InterSystems IRIS Experience using Spark.

Also, besides this point-and-click batch test page, you can invoke PMML models stored in IRIS programmatically from your applications and workflows as explained in the documentation. We have a number of customers using it in production, for example to score patient risk models for current inpatient lists at HBI Solutions.

Benjamin De Boe · Jul 6, 2018

Yes, we maintain an adoption guide that covers exactly that purpose. In order to be able to properly follow up on any questions you might have, we're making it available through your technical account team (sales engineer or TAM) rather than shipping it with the product.

Benjamin De Boe · Jun 18, 2018

Hi Eduard,

for this sort of querying (and many other uses outside straight API calls), you can use the SQL projections generated for your domain, as documented in %iKnow.Tables.Utils. That'll generate a column for your Views metadata field on the table containing Source information, which you can then join to the Part (entity occurrence) table to filter the ones containing the requested entity.

Hope this helps,
benjamin

Benjamin De Boe · Jun 14, 2018

Hi Eduard,

looking at your code, there seem to be a few small things that may each contribute to not seeing the results you were expecting:

  1. the MaintenanceAPI:GetBlackListElements() call returns its results as result(n) = $lb(id, string) with n just an incrementing integer representing the row number. At the other end, the ContainsEntityFilter expects array(string) or a $listbuild(string1, string2, ...). So your filter might be selecting sources containing the strings "1", "2", etc.
  2. SourceAPI:GetByDomain() returns result(n) = $lb(sourceID, externalID). That source ID is an internally generated integer ID that has no links to your source table Text.Data. The external ID is typically composed of what you selected as group field and identifier field when loading from a SQL table. So depending on how you set up your domain, that may indeed be the ID field of your Text.Data table. It looks like you have the "simple external IDs" feature switched on, which is why your external IDs only consist of the identifier field, making things indeed easier (but usually only useful/safe when loading from a single table!). Note that this is slightly different for DeepSee-managed domains, where the source ID equals the external ID and corresponds to DeepSee's fact ID, but ignore this confusing comment when not using DeepSee.
  3. Finally, and likely irrelevant, you're passing in $$$YES when initializing filterNot. I'm not sure where you're loading that macro from, but that should be a %Boolean with a value of 1 to work as expected, where a string value would translate to a %Boolean with value 0.

Hope this helps,

benjamin

Benjamin De Boe · May 4, 2018

OK, thanks for the feedback. We're indeed looking into those additional windowing functions to go beyond our %FOREACH SQL extension, but it's not (yet) on the short-term agenda. Customer demand like yours of course helps us properly prioritize what should go on there.

Benjamin De Boe · May 4, 2018

We currently don't support analytic windowing functions (PARTITION BY syntax), but have been looking into it for a future release. MATCH_RECOGNIZE is certainly one of the more advanced ones in that bucket. Is this the very one you would need or do you have scenarios that would be served by core windowing functionality, excluding the pattern matching piece?

Or is it the pattern matching and not as much the windowing you're looking for?

Benjamin De Boe · May 4, 2018

I'm afraid we don't support the SQL PIVOT command, so unless you can enumerate the response codes as columns explicitly, you can only organise them as rows. If you control the application code, you could of course first have a query selecting all response codes and then generating the lengthy SQL call that includes separate columns for each response code. Something like SUM(CASE bRecord.ResponseCode WHEN 'response code 1' THEN 1 ELSE 0 END) AS ResponseCode1Count should work fairly well.
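To make that two-step approach concrete, here is a minimal sketch in Python. It uses the standard library's sqlite3 module purely as a stand-in for the real database, and the bRecord table layout, the Batch grouping column and the sample rows are assumptions for illustration, not your actual schema. The response code values are bound as query parameters and the column aliases are generated, so no data is spliced into the SQL text.

```python
import sqlite3


def build_pivot_sql(codes):
    """Return a query with one SUM(CASE ...) column per response code."""
    columns = [
        # One placeholder per code; the alias is generated from its position.
        f"SUM(CASE ResponseCode WHEN ? THEN 1 ELSE 0 END) AS ResponseCode{i}Count"
        for i, _ in enumerate(codes, start=1)
    ]
    return (
        "SELECT Batch, " + ", ".join(columns)
        + " FROM bRecord GROUP BY Batch ORDER BY Batch"
    )


# Hypothetical sample data standing in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bRecord (Batch TEXT, ResponseCode TEXT)")
conn.executemany(
    "INSERT INTO bRecord VALUES (?, ?)",
    [("A", "OK"), ("A", "OK"), ("A", "FAIL"), ("B", "FAIL"), ("B", "RETRY")],
)

# Step 1: find out which response codes exist in the data.
codes = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT ResponseCode FROM bRecord ORDER BY ResponseCode"
    )
]

# Step 2: generate the wide query and run it with the codes as parameters.
sql = build_pivot_sql(codes)
rows = conn.execute(sql, codes).fetchall()

for row in rows:
    print(row)
```

With these sample rows, each batch comes back as a single row with one count column per response code, in the order the codes were discovered in step 1.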

Benjamin De Boe · Mar 27, 2018

Horita-san,

The SetParameter() method requires that your domain has an ID assigned, which it gets automatically as soon as you call %Save() for the first time. Note that it should be returning an error in the sequence of commands you pasted, but it went unnoticed because of the do syntax.

In general, it should be more convenient to work with Domain Definitions rather than the %iKnow.Domain API directly.

Thanks,
benjamin

Benjamin De Boe · Feb 1, 2018

Hi Dmitry,

Zen is indeed no longer a central piece of our application development strategy. We'll support it for some time to come (your Zen app still works on IRIS), but our focus is on providing a fast and scalable data management platform rather than GUI libraries. In that sense, you may already have noticed that recent courses we published on the topic of application development focus on leveraging the right technologies to connect to the backend (e.g. REST) and suggest using best-of-breed third-party technologies (e.g. Angular) for web development.

InterSystems IRIS is a new product where we're taking advantage of our Caché & Ensemble heritage. It's meant to address today's challenges when building critical applications and we've indeed leveraged a number of capabilities from those products, but also added a few significant new ones like containers, cloud & horizontal scalability. We'll shortly be providing an overview of elements to check (e.g. differences in supported platforms) for Caché & Ensemble customers who would like to migrate to InterSystems IRIS, but please don't consider this as merely an upgrade. You may already have noticed the installer doesn't support upgrading anyhow.

Thanks,
benjamin 

Benjamin De Boe · Feb 1, 2018

Hi Robert,

DocBook has now moved fully online, which is what the mgmt portal will link to: http://docs.intersystems.com/iris

SAMPLES included quite a few outdated examples and was also not appropriate for many non-dev deployments, so we've also moved to a different model there, posting the most relevant ones on GitHub, giving us more flexibility to provide updates and new ones: https://github.com/intersystems?q=samples

JDBC driver: to what extent is this different from the past? It's always just been available as a jarfile, as is customary for JDBC drivers. We do hope to be able to post it through Maven repositories in the near future though.

Small icons: yeah, to make our installer and (more importantly) the container images more lightweight, we had to economize on space. Next to the removal of DocBook and Samples, using smaller icons also reduces the size in bytes ;) ;)

InterSystems IRIS is giving us the opportunity to adopt a contemporary deployment model, where we were somewhat restricted by long-term backwards compatibility commitments with Caché & Ensemble. Some of these changes will indeed catch your eye and might even feel a little strange at first, but we really believe the new model makes developing and deploying applications easier and faster. Of course, we're open to feedback on all of these evolutions and this is a good channel to hear from you.

Thanks!
benjamin

Benjamin De Boe · Jan 12, 2018

If you have a global structure that you mapped a class to afterwards, that data is already in one physical database and therefore not sharded or shardable. Sharding really is a layer in between your SQL accesses and the physical storage, and it expects you not to touch that physical storage directly. So yes, you can still picture what that global structure looks like and, under certain circumstances (and when we're not looking ;-) ), read from those globals, but new records have to go through INSERT statements (or %New in a future version) and can never be written to the global directly.

We currently only support sharding for %CacheStorage. There have been so many improvements in that model over the past 5-10 years that there aren't many reasons left to choose %CacheSQLStorage for new SQL/Object development. The only likely reason would be that you still have legacy global structures to start from, but as explained above, that's not a scenario we can support with sharding. Maybe a nice reference in this context is that of one of our early adopters who was able to migrate their existing SQL-based application to InterSystems IRIS in less than a day without any code changes, so they could use the rest of the day to start sharding a few of their tables and were ready to scale before dinner, so to speak.

Benjamin De Boe · Jan 12, 2018

Hi Warlin,

I'm not sure whether you have something specific in mind, but it sort of works the other way around. You shard a table and, under the hood, invisible to application code, the table's data gets distributed to globals in the data shards. You cannot shard globals.

thanks,
benjamin

Benjamin De Boe · Jan 11, 2018

Hi Herman,

We're supporting SQL only in this first release, but are working hard to add Object and other data models in the future. Sharding any globals is unfortunately not possible as we need some level of abstraction (such as SQL tables or Objects) to hook into in order to automate the distribution of data and work to shards. This said, if your SQL (or soon Object) based application has the odd direct global reference to a "custom" global (not related to a sharded table), we'll still support that by just mapping those to the shard master database.

Thanks,
benjamin

Benjamin De Boe · Sep 29, 2017

iKnow was written to analyze English rather than ObjectScript, so you may see a few odd results coming out of code blocks. I believe you can add a where clause excluding those records from the block table to avoid them.

Benjamin De Boe · Sep 21, 2017

I had something simple running on my laptop a long time ago, but the internal discussion on how to package it proved a little more complicated. Among other things, an iFind index requires an iKnow-enabled license (and more space!), which meant you couldn't simply include it in every kit.

Also, for the ranking of docbook results, applying proper weights based on the type of content (title / paragraph / sample / ...) was at least as important as the text search capabilities themselves. That latter piece has been well-addressed in 2017.1, so docbook search is in pretty good shape now. Blending in an easily-deployable iFind option as Konstantin published can only add to this!

Thanks,
benjamin

Benjamin De Boe · Sep 19, 2017

Hi Steve,

hadn't seen this question until just now, but I have to admit we're a bit storage-hungry with iKnow. If you generate the default full set of indices, for a moderately-sized domain you'll need up to 25x the original dataset size measured as raw text to fit everything. This can drop to half that size (12x) if you forsake all non-essential indices, but that will prevent a number of queries from running smoothly or, in some cases, disable them completely.

For iFind, the numbers are dependent on the type of index. Count on factors 2x, 7x and 15x for Basic, Semantic and Analytic indices, respectively. Of course there's a difference in functionality between all these options and it's best to start from a set of functional requirements and then look at which particular approach covers those.

These numbers are somewhat conservative maximums and, as Eduard already suggested, you may see different (lower) numbers depending on the nature of your data. A more detailed sizing guide is available on request.
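As a rough worked example of those multipliers, the sketch below turns them into upper-bound storage estimates. The factors are the ones quoted above; the 4 GB raw text size, the dictionary labels and the function name are hypothetical and only there for illustration.

```python
# Multipliers quoted above: conservative maximums relative to raw text size.
SIZING_FACTORS = {
    "iKnow (full set of indices)": 25,
    "iKnow (essential indices only)": 12,
    "iFind Basic": 2,
    "iFind Semantic": 7,
    "iFind Analytic": 15,
}


def estimate_storage_gb(raw_text_gb):
    """Return an upper-bound storage estimate in GB for each option."""
    return {
        option: raw_text_gb * factor
        for option, factor in SIZING_FACTORS.items()
    }


# Hypothetical dataset: 4 GB of raw text.
estimates = estimate_storage_gb(4)
for option, gb in estimates.items():
    print(f"{option}: up to {gb} GB")
```

Treat the output as a ceiling for capacity planning rather than a prediction, since the actual footprint depends on the nature of the data.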

Thanks,
benjamin

Benjamin De Boe · Sep 18, 2017

Hi Konstantin,

thanks for sharing your work, a nice application of iFind technology! If I can add a few ideas to make this more lightweight:

  • Rather than creating a domain programmatically, the recommended approach for a few versions now has been to use Domain Definitions. They allow you to declare a domain in an XML format (not unlike the %Installer approach) and avoid a number of inconveniences in managing your domain in a reproducible way.
  • From reading the article, I believe you're just using the iKnow domain for that one EntityAPI:GetSimilar() call to generate search suggestions. iFind has a similar feature, also exposed through SQL, through %iFind.FindEntities() and %iFind.FindWords(), depending on what kind of results you're looking for. See also this iFind demo. With that in place, you may even be able to skip those domains altogether :-)

thanks,
benjamin

Benjamin De Boe · Jul 3, 2017

Thanks John,

indeed, you'd need a proper license in order to work with iKnow. If the method referred to above returns 0, please contact your sales representative to request a temporary trial license and appropriate assistance for implementing your use case.

Also, iKnow doesn't come as a separate namespace. You can create (regular) namespaces as you prefer and use them to store iKnow domain data. You may need to enable your web application for iKnow, which is disabled by default for security reasons in the same way DeepSee is. See this paragraph here for more details.

Benjamin De Boe · Jun 26, 2017

Hi Eduard,

you can define iFind indices for calculated fields, so if you point your field calculation to a function that strips out the HTML, you should be fine. The HTML converter in iKnow was built for a slightly different purpose, but can be used here:

Property HtmlText As %String(MAXLEN="");

Property PlainText As %String(MAXLEN="") [ Calculated, ReadOnly, SqlComputed, SqlComputeCode = { set {PlainText} = ##class(%iKnow.Source.Converter.Html).StripTags({HtmlText}) } ];

Regards,
benjamin

Benjamin De Boe · May 8, 2017

Hi Evgeny,

nice work!

Maybe you can enhance the interface by also adding an iKnow-based KPI to the dashboard, exposing the similar or related entities for the concept clicked in the heat map. You can subclass this generic KPI and specify the query you want it to invoke, and then use it as the data source for a table widget. Let me know if I can help.

thanks,
benjamin

Benjamin De Boe · Apr 19, 2017

After posting the initial article, I realized the sample code's use of ^CacheTemp.* globals implied a risk of iKnow.SyncedDefinition subclasses with the same name in different namespaces overwriting one another's data. The revised code now uses the namespace and domain ID as subscripts in ^CacheTemp, which should be safe.

The update also fixes the sample table's CreateTime column to be of type %DeepSee.Datatype.dateTime rather than %Date.

Benjamin De Boe · Apr 10, 2017

Cool stuff!

I believe you're using matching dictionaries for identifying those sentiment markers, which is indeed convenient from an API perspective. However, you might want to take advantage of sentiment attributes, which will allow you to not just detect occurrences of your marker terms, but also which parts of the sentence they apply to. I'm not sure how that is covered in your current app (didn't dig that deep into the code), but especially in the recent versions that improved our attribute expansion accuracy, it may improve the precision of your application too. See this article for more details.

Separately, leveraging domain definitions may also simplify the methods you're using to set up your domain. There's an option to load dictionary content from a table or file, leveraging <external> tags inside the <matching> section. It's not (yet) supported through the Architect, but you can add it when updating the class through Studio.

Thanks for sharing this!

benjamin

Benjamin De Boe · Apr 3, 2017

Hi Max,

the connector we're building is meant to be a smarter alternative to regular JDBC, pushing down filtering work from the Spark side to Caché SQL and leveraging parallelism where possible. So that means you can still use any Spark programming language (Scala, Java, Python or R) while enjoying the optimized connection. However, as it's an implementation of Spark's DataSource API, it's meant to go from Spark to "a data source" and not the other way round, i.e. submit a Spark job from Caché. On the other hand, that'd be something you could probably build without much effort through the Java Gateway. Do you have a particular example or use case in mind? Perhaps that would make an interesting code sample to post on the Developer Community.

Thanks,
benjamin

Benjamin De Boe · Mar 10, 2017

Hi Andreas,

we don't have a release date yet, but we'll certainly be demonstrating it at the Global Summit in September. If you are already using Spark in your organisation today and would be interested in seeing how it may help you make better use of the underlying Caché database, please drop me an email.

Thanks,
benjamin

Benjamin De Boe · Mar 1, 2017

yes, the two-word feature called "executing COS" would probably be quite a step up. It was more a loose idea than something I've researched thoroughly, but maybe the authors of the Caché Web Terminal have some clues on how the connectivity should work (JDBC won't pull it). 

Benjamin De Boe · Feb 27, 2017

Nice article Andreas!

Have you perhaps also looked into creating a more advanced interpreter, rather than just leveraging the JDBC one? I know that's probably a significantly more elaborate thing to do, but a notebook-style interface for well-documented scripting would nicely complement the application development focus of Atelier / Studio.

Thanks,
benjamin