Announcement
Michelle Spisak · Oct 22, 2019

New Videos! High Speed, Multi-Model Capabilities of InterSystems IRIS™

The Learning Services Online Learning team has posted new videos to help you learn the benefits of InterSystems IRIS. Take a peek to see what you stand to gain from making the switch to InterSystems IRIS!

Why Multi-Model? Stefan Wittmann presents use cases for the multi-model data access of the InterSystems IRIS data platform. He shows the multi-model architecture that allows you to use the data model that best fits each task in your application — relational, object, or even direct/native access — all accessible through the language of your choice.

The Speed and Power of InterSystems IRIS: InterSystems IRIS powers many of the world’s most powerful applications — applications that require both speed and power for ingesting massive amounts of data, in real time, at scale. Learn about these features and more in this video!
Article
Peter Steiwer · Jan 10, 2020

Understanding Missing Relationship Build Errors in InterSystems IRIS Business Intelligence

When using Related Cubes in InterSystems IRIS BI, cubes must be built in the proper order. The One side must be built before the Many side. This is because during build time for the Many side, it looks up the record on the One side and creates a link. If the referenced record is not found on the One side, a Missing Relationship build error is generated.

The One side is the independent side of the relationship, that is, the side of the relationship that is referenced by the Many side, or Dependent cube. For example: Patients contain a reference to their Doctor. The Doctor does not contain references to each of their Patients. Doctors is the One, or Independent, side. Patients is the Many, or Dependent, side. For more information about setting up Cube Relationships, please see the documentation.

WARNING: If you rebuild the One side without rebuilding the Many side, the Many side may point to the wrong record. It is not guaranteed that a record in your cube will always have the same ID, and the relationship link that is created is based on ID. YOU MUST REBUILD THE MANY SIDE AFTER BUILDING THE ONE SIDE.

To ensure your cubes are always built in the proper order, you can use the Cube Manager. When debugging Build Errors, please also debug them in the Build Order. Errors can cascade, and you don't want to spend time debugging an error just to find out it happened because a different error happened first.

Understanding the Missing Relationship Build Error Message

SAMPLES>do ##class(%DeepSee.Utils).%PrintBuildErrors("RELATEDCUBES/PATIENTS")
1 Source ID: 1 Time: 01/03/2020 15:30:42
ERROR #5001: Missing relationship reference in RelatedCubes/Patients: source ID 1 missing reference to RxPrimaryCarePhysician 1744

Here is an example of what the Missing Relationship build error looks like. We will extract some of these values from the message to understand what is happening:

Missing relationship reference in [Source Cube]: source ID [Source ID] missing reference to [Related Cube Reference] [Related Source ID]

In our error message, we have the following values:
Source Cube = RelatedCubes/Patients
Source ID = 1
Related Cube Reference = RxPrimaryCarePhysician
Related Source ID = 1744

Most of these are pretty straightforward except for the Related Cube Reference. Sometimes the name is obvious, other times it is not. Either way, we can do a little bit of work to find the cube this reference points to.

Step 1) Find the Fact Class for the Source Cube:
SAMPLES>w ##class(%DeepSee.Utils).%GetCubeFactClass("RelatedCubes/Patients")
BI.Model.RelCubes.RPatients.Fact

Step 2) Run an SQL query to get the Fact Class the Related Cube Reference is pointing to:
SELECT Type FROM %Dictionary.PropertyDefinition WHERE ID='[Source Cube Fact Class]||[Related Cube Reference]'
For example:
SELECT Type FROM %Dictionary.PropertyDefinition WHERE ID='BI.Model.RelCubes.RPatients.Fact||RxPrimaryCarePhysician'
which returns a value of:
BI.Model.RelCubes.RDoctors.Fact

Step 3) Now that we have the Related Cube Fact Class, we can run an SQL query to check whether this Related Source ID has an associated fact in our Related Cube Fact Table:
SELECT * FROM BI_Model_RelCubes_RDoctors.Fact WHERE %SourceId=1744
Please note that we had to use the SQL table name instead of the class name here. The table name can typically be derived by replacing every "." except the one before "Fact" with "_". In this case, 0 rows were returned, which means the required related fact still does not exist in the related cube.
Sometimes, after spending the time to get to this point, a synchronize may already have pulled the new data in. At that point the Build Error may no longer be true, but it has not yet been cleared out of the Build Errors global. Regular synchronization does not clean entries in this global that have been fixed. The only way to clean the Build Errors global is to run a Build against the cube, or to run the following method:

Do ##class(%DeepSee.Utils).%FixBuildErrors("CUBE NAME WITH ERRORS")

If we now had data for the previous SQL query, the %FixBuildErrors method should fix the record and clear the error.

Step 4) Since we do not have this record in our Related Cube Fact Table, we should check the Related Cube Source Table to see if the record exists. First we have to find the Related Source Class by viewing the SOURCECLASS parameter of the Related Cube Fact Class:
SAMPLES>w ##class(BI.Model.RelCubes.RDoctors.Fact).#SOURCECLASS
BI.Study.Doctor

Step 5) Now that we have the Related Source Class, we can query the Related Source Table to see if the Related Source ID exists:
SELECT * FROM BI_Study.Doctor WHERE %ID=1744

If this query returns results, you should determine why this record does not exist in the Related Cube Fact Table. This could simply be because it has not yet synchronized, or because an error occurred while building this fact. If that is the case, remember to diagnose all Build Errors in the proper Build Order; it can often be the case that lots of errors cascade from one error.

If this query does not return results, you should determine why this record is missing from the Related Source Table. Perhaps some records have been deleted on the One side, but records on the Many side have not yet been reassigned or deleted. Perhaps the Cube Relationship is configured incorrectly, the Related Source ID is not the correct value, and the Cube Relationship definition should be changed.

This guide is a good place to start, but please feel free to contact the WRC. The WRC can help debug and diagnose this with you.
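If you find yourself walking through these steps often, they can be chained into one helper. The sketch below is only an illustration: the method name CheckMissingReference is made up (it is not part of any InterSystems API), and it uses nothing beyond the calls already shown in this article (%GetCubeFactClass, the %Dictionary.PropertyDefinition query, and the fact-table query). Treat it as a starting point, not a supported tool.

/// Hypothetical helper: investigate one Missing Relationship build error.
/// pCube      - Source Cube, e.g. "RelatedCubes/Patients"
/// pReference - Related Cube Reference from the error, e.g. "RxPrimaryCarePhysician"
/// pRelatedId - Related Source ID from the error, e.g. 1744
ClassMethod CheckMissingReference(pCube As %String, pReference As %String, pRelatedId As %String) As %Status
{
    // Step 1: fact class of the source cube
    Set tFactClass = ##class(%DeepSee.Utils).%GetCubeFactClass(pCube)
    Write "Source fact class: ", tFactClass, !

    // Step 2: fact class that the relationship reference points to
    Set tSQL = "SELECT Type FROM %Dictionary.PropertyDefinition WHERE ID = ?"
    Set tRS = ##class(%SQL.Statement).%ExecDirect(, tSQL, tFactClass_"||"_pReference)
    If 'tRS.%Next() {
        Write "Reference ", pReference, " not found on ", tFactClass, !
        Quit $$$OK
    }
    Set tRelatedFact = tRS.%Get("Type")
    Write "Related fact class: ", tRelatedFact, !

    // Step 3: table name = class name with "." replaced by "_", except before "Fact"
    Set tTable = $Replace($Extract(tRelatedFact, 1, *-5), ".", "_")_".Fact"
    Set tSQL = "SELECT COUNT(*) AS Cnt FROM "_tTable_" WHERE %SourceId = ?"
    Set tRS = ##class(%SQL.Statement).%ExecDirect(, tSQL, pRelatedId)
    Do tRS.%Next()
    Write "Facts in ", tTable, " for related source ID ", pRelatedId, ": ", tRS.%Get("Cnt"), !

    // Step 4/5 (manual): if the count is 0, check the related source table
    // named by the #SOURCECLASS parameter, as shown above.
    Quit $$$OK
}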
Announcement
Anastasia Dyubaylo · Sep 4, 2019

[September 18, 2019] Upcoming Webinar: InterSystems MLToolkit: AI Robotization

Hey Developers! We are pleased to invite you to the upcoming webinar "InterSystems MLToolkit: AI Robotization" on the 18th of September at 10:00 (GMT+3)! The Machine Learning (ML) Toolkit is a set of extensions for implementing machine learning and artificial intelligence on the InterSystems IRIS Data Platform. In this webinar, InterSystems Sales Engineers @Sergey Lukyanchikov and @Eduard Lebedyuk will present an approach to the robotization of these tasks, i.e. to ensuring their autonomous, adaptive execution within the parameters and rules you specify. Self-learning neural networks, self-monitoring analytical processes, and the agency of analytical processes are the main subjects of this webinar. The webinar is aimed both at experts in Data Science, Data Engineering, and Robotic Process Automation, and at those who are just discovering the world of artificial intelligence and machine learning. We look forward to seeing you at our event!

Date: 18 September, 10:00 – 11:00 (GMT+3).
Note: The language of the webinar is Russian.
Register for FREE today!
Announcement
Anastasia Dyubaylo · Sep 13, 2019

New Video: JSON and XML persistent data serialization in InterSystems IRIS

Hi Everyone! A new video, recorded by @Stefan.Wittmann, is already on InterSystems Developers YouTube: JSON and XML persistent data serialization in InterSystems IRIS.

Need to work with JSON or XML data? InterSystems IRIS supports multiple inheritance and provides several built-in tools to easily convert between XML, JSON, and objects as you go. Learn more about the multi-model development capabilities of InterSystems IRIS on the Learning Services sites. Enjoy watching the video!

Can confirm that the %JSON.Adaptor tool is extremely useful! This was such a great addition to the product. In Application Services, we've used it to build a framework which allows us to not only expose our persistent classes via REST but also authorize different levels of access for different representations of each class (for example, all the properties vs. just the Name and the Id). The "Mappings and Parameters" feature is especially useful: https://irisdocs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=GJSON_adaptor
Also, @Stefan, are you writing backwards while you talk? That's impressive.

Anyone who is doubting multiple inheritance is insane. Although calling this kind of inheritance 'mixin-classes' helps, I've noticed, mixing in additional features.

https://hackaday.com/tag/see-through-whiteboard/
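To make the comments above a bit more concrete, here is a minimal sketch of how the adaptors combine through multiple inheritance. The class and property names are invented for illustration; %JSON.Adaptor, %XML.Adaptor and the %JSONFIELDNAME property parameter are the documented pieces referenced in the video and the link above.

/// Hypothetical example: one persistent definition, serializable to both JSON and XML.
Class Demo.Person Extends (%Persistent, %JSON.Adaptor, %XML.Adaptor)
{

/// %JSONFIELDNAME remaps how the property is named in the JSON projection
Property Name As %String(%JSONFIELDNAME = "name");

Property DOB As %Date(%JSONFIELDNAME = "dateOfBirth");

}

With a class like this, do person.%JSONExport() writes the JSON projection of an instance to the current device and do person.XMLExport() writes the XML one; see the %JSON.Adaptor documentation linked in the comment above for the import counterparts and the full set of mapping parameters.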
Announcement
Fabiano Sanches · Apr 26, 2023

Get Alerts, Advisories and other Product News directly from InterSystems

Stay in touch with InterSystems and receive alerts, advisories, and product news quickly. The process is really simple:

Click on this link: https://www.intersystems.com/support/product-alerts-advisories/
Fill in the form with your contact information, and you're all set!

As you can see, it takes less than a minute to stay informed about the news!
Announcement
Anastasia Dyubaylo · Jun 14, 2017

Video of the Week: InterSystems iKnow Technology. A Cure for Clinician Frustration

Hi Community! Enjoy the video of the week about InterSystems iKnow Technology: A Cure for Clinician Frustration.

In this video, learn why iKnow capabilities are critical for getting the most out of your investments in electronic health records and improving information access for clinicians. You are very welcome to watch all the videos about iKnow in the dedicated iKnow playlist on the InterSystems Developers YouTube Channel. Enjoy!
Article
Константин Ерёмин · Sep 18, 2017

Search InterSystems documentation using iKnow and iFind technologies

The InterSystems DBMS has a built-in technology for working with non-structured data called iKnow and a full-text search technology called iFind. We decided to take a dive into both and make something useful. As a result, we have DocSearch — a web application for searching InterSystems documentation using iKnow and iFind.

How Caché Documentation works

Caché documentation is based on the Docbook technology. It has a web interface (which includes a search that uses neither iFind nor iKnow). The articles themselves are stored in Caché classes, which allows us to run queries against this data and, of course, to create our own search tool.

What are iKnow and iFind

InterSystems iKnow is a technology for analyzing unstructured data, which provides access to this data by indexing the sentences and entities in it. To start the analysis, you first need to create a domain (a storage for unstructured data) and load text into it. The iFind technology is a module of the Caché DBMS for performing full-text search in Caché classes. iFind uses many iKnow classes for intelligent text search. To use iFind in your queries, you need to add a special iFind index to your Caché class. There are three types of iFind indexes, each offering all the functions of the previous type, plus some additional ones:

The main index (%iFind.Index.Basic): supports the search for words and word combinations.
The semantic index (%iFind.Index.Semantic): supports the search for iKnow objects.
The analytic index (%iFind.Index.Analytic): supports all iKnow functions of the semantic search, as well as information about paths and word proximity.

Since documentation classes are stored in a separate namespace, the installer also performs mapping of packages and globals to make those classes available in our namespace.

Installer code for mapping:

XData Install [ XMLNamespace = INSTALLER ]
{
<Manifest>
  // Specify the name of the namespace
  <IfNotDef Var="Namespace">
    <Var Name="Namespace" Value="DOCSEARCH"/>
    <Log Text="Set namespace to ${Namespace}" Level="0"/>
  </IfNotDef>
  // Check if the namespace exists
  <If Condition='(##class(Config.Namespaces).Exists("${Namespace}")=1)'>
    <Log Text="Namespace ${Namespace} already exists" Level="0"/>
  </If>
  // Creating the namespace
  <If Condition='(##class(Config.Namespaces).Exists("${Namespace}")=0)'>
    <Log Text="Creating namespace ${Namespace}" Level="0"/>
    // Creating a database
    <Namespace Name="${Namespace}" Create="yes" Code="${Namespace}" Ensemble="" Data="${Namespace}">
      <Log Text="Creating database ${Namespace}" Level="0"/>
      // Map the specified classes and globals to the new namespace
      <Configuration>
        <Database Name="${Namespace}" Dir="${MGRDIR}/${Namespace}" Create="yes" MountRequired="false" Resource="%DB_${Namespace}" PublicPermissions="RW" MountAtStartup="false"/>
        <Log Text="Mapping DOCBOOK to ${Namespace}" Level="0"/>
        <GlobalMapping Global="Cache*" From="DOCBOOK" Collation="5"/>
        <GlobalMapping Global="D*" From="DOCBOOK" Collation="5"/>
        <GlobalMapping Global="XML*" From="DOCBOOK" Collation="5"/>
        <ClassMapping Package="DocBook" From="DOCBOOK"/>
        <ClassMapping Package="DocBook.UI" From="DOCBOOK"/>
        <ClassMapping Package="csp" From="DOCBOOK"/>
      </Configuration>
      <Log Text="End creating database ${Namespace}" Level="0"/>
    </Namespace>
    <Log Text="End creating namespace ${Namespace}" Level="0"/>
  </If>
</Manifest>
}

The domain required for iKnow is built upon the table containing the documentation. Since we use a table as the data source, we'll use SQL.Lister.
The content field contains the documentation text, so let's specify it as the data field. The rest of the fields will be described in the metadata.

Installer code for creating a domain:

ClassMethod Domain(ByRef pVars, pLogLevel As %String, tInstaller As %Installer.Installer) As %Status
{
    #Include %IKInclude
    #Include %IKPublic
    set ns = $Namespace
    znspace "DOCSEARCH"
    // Create the domain or open it if it exists
    set dname="DocSearch"
    if (##class(%iKnow.Domain).Exists(dname)=1) {
        write "The ",dname," domain already exists",!
        zn ns
        quit
    } else {
        write "The ",dname," domain does not exist",!
        set domoref=##class(%iKnow.Domain).%New(dname)
        do domoref.%Save()
    }
    set domId=domoref.Id
    // Lister is used for searching for sources corresponding to the records in query results
    set flister=##class(%iKnow.Source.SQL.Lister).%New(domId)
    set myloader=##class(%iKnow.Source.Loader).%New(domId)
    // Building a query
    set myquery="SELECT id, docKey, title, bookKey, bookTitle, content, textKey FROM SQLUser.DocBook"
    set idfld="id"
    set grpfld="id"
    // Specifying the fields for data and metadata
    set dataflds=$LB("content")
    set metaflds=$LB("docKey", "title", "bookKey", "bookTitle", "textKey")
    // Putting all data into the Lister
    set stat=flister.AddListToBatch(myquery,idfld,grpfld,dataflds,metaflds)
    if stat '= 1 {
        write "The lister failed: ",$System.Status.DisplayError(stat)
        quit
    }
    // Starting the analysis process
    set stat=myloader.ProcessBatch()
    if stat '= 1 {
        quit
    }
    set numSrcD=##class(%iKnow.Queries.SourceQAPI).GetCountByDomain(domId)
    write "Done",!
    write "Domain contains ",numSrcD," source(s)",!
    zn ns
    quit
}

To search the documentation, we use the %iFind.Index.Analytic index:

Index contentInd On (content) As %iFind.Index.Analytic(LANGUAGE = "en", LOWER = 1, RANKERCLASS = "%iFind.Rank.Analytic");

where contentInd is the name of the index and content is the name of the field that we are creating an index for.
The LANGUAGE = "en" parameter sets the language of the text.
The LOWER = 1 parameter turns off case sensitivity.
The RANKERCLASS = "%iFind.Rank.Analytic" parameter allows the use of the TF-IDF result ranking algorithm.

After adding and building such an index, it can be used in SQL queries. The general syntax for using iFind in SQL is:

SELECT * FROM TABLE WHERE %ID %FIND search_index(indexname,'search_items',search_option)

After creating the %iFind.Index.Analytic index with these parameters, several SQL procedures named [table name]_[index name][procedure name] are generated. In our project, we use two of them:

DocBook_contentIndRank — returns the result of the TF-IDF ranking algorithm for a request. The procedure has the following syntax:

SELECT DocBook_contentIndRank(%ID, 'SearchString', 'SearchOption') Rank FROM DocBook WHERE %ID %FIND search_index(contentInd, 'SearchString', 'SearchOption')

DocBook_contentIndHighlight — returns the search results, where the searched words are wrapped into the specified tag:

SELECT DocBook_contentIndHighlight(%ID, 'SearchString', 'SearchOption', 'Tags') Text FROM DocBook WHERE %ID %FIND search_index(contentInd, 'SearchString', 'SearchOption')

I will go into more detail later in the article.

What do we have in the end?

Autocomplete in the search field. As you start entering text into the search field, the system will suggest possible query variants to help you find the necessary information quicker. These suggestions are generated on the basis of the word (or its beginning) that you type. The system shows the ten best matching words or phrases.
This process uses iKnow, specifically the %iKnow.Queries.Entity.GetSimilar method.

Fuzzy string search

iFind supports fuzzy search for finding words that almost match the search query. This is achieved by measuring the Levenshtein distance between two words. The Levenshtein distance is the minimal number of one-character changes (insertions, removals or replacements) necessary for turning one word into another. It can be used for correcting typos, small variations in writing, and different grammatical forms (plural and singular, for example). In iFind SQL queries, the search_option parameter is responsible for the fuzzy search: search_option = 3 denotes a Levenshtein distance of 2. To set a Levenshtein distance equal to n, you need to set the search_option parameter to '3:n'.

The documentation search uses a Levenshtein distance of 1, so let's demonstrate how it works. Let's type "ifind" in the search field. Now let's try a fuzzy search by intentionally making a typo. As we can see, the search corrected the typo and found the necessary articles.

Complex searches

Thanks to the fact that iFind supports complex queries with brackets and AND, OR, NOT operators, we were able to implement complex search functionality. Here's what you can specify in your query: a word, a word combination, one of several words, or exceptions. Fields can be filled in one by one, or all at once. For example, let's find articles containing the word "iknow", the combination "rest api", and those that contain either "domain" or "UI". We can see that there are two such articles. Please note that the second one mentions Swagger UI, so we can modify the query so that it excludes the ones that do not contain the word Swagger. As a result, we will only find one article.

Search results highlighting

As stated above, the use of an iFind index creates the DocBook_contentIndHighlight procedure. Let's use the following:

SELECT DocBook_contentIndHighlight(%ID, 'search_items', '0', '<span class="Illumination">', 0) Text FROM DocBook

to get the resulting text wrapped into a <span class="Illumination"> tag. This helps you visually mark search results on the front end.

Search results ranking

iFind is capable of ranking results using the TF-IDF algorithm. TF-IDF is often used in text analysis and search tasks, for example as a criterion of the relevance of a document to a search query. As a result of the SQL query, the Rank field will contain the weight of the word, which is proportional to the number of times the word is used in an article, and inversely proportional to the frequency of the word's occurrence in other articles.

SELECT DocBook_contentIndRank(%ID, 'SearchString', 'SearchOption') Rank FROM DocBook WHERE %ID %FIND search_index(contentInd, 'SearchString', 'SearchOption')

Integration with the official documentation search

After installation, a "Search using iFind" button is added to the official documentation search. If the "Search words" field is filled in, you will be taken to the search results page after clicking the "Search using iFind" button. If the field is empty, you will be taken to the new search page.
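To tie the pieces above together, here is a small terminal-style sketch (ObjectScript with dynamic SQL) that combines the fuzzy option with the ranking procedure. It assumes the DocBook table, the contentInd index and the generated DocBook_contentIndRank procedure described in this article, runs in the DOCSEARCH namespace, and uses 'ifind' with '3:1' (Levenshtein distance 1) purely as a sample; adjust the names if your index or table differs.

Set tSQL = "SELECT TOP 10 title, DocBook_contentIndRank(%ID, 'ifind', '3:1') Rank FROM DocBook WHERE %ID %FIND search_index(contentInd, 'ifind', '3:1') ORDER BY Rank DESC"
Set tRS = ##class(%SQL.Statement).%ExecDirect(, tSQL)
While tRS.%Next() { Write tRS.%Get("Rank"), "  ", tRS.%Get("title"), ! }

This prints the ten highest-ranked documentation titles for the (possibly misspelled) search term, which is essentially what the DocSearch results page does behind the scenes.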
Installation

Download the Installer.xml file from the latest release available on the corresponding page. Import the downloaded Installer.xml file into the %SYS namespace and compile it. Then enter the following command in the terminal in the %SYS namespace:

do ##class(Docsearch.Installer).setup(.pVars)

After that, the search will be available at the following address: localhost:[port]/csp/docsearch/index.html

Demo

An online demo of the search is available here.

Conclusion

This project demonstrates interesting and useful capabilities of the iFind and iKnow technologies that make data search more relevant. Any comments or suggestions will be highly appreciated. The entire source code with the installer and the deployment guide is available on GitHub.

Hi Konstantin, thanks for sharing your work, a nice application of iFind technology! If I can add a few ideas to make this more lightweight: Rather than creating a domain programmatically, the recommended approach for a few versions now has been to use Domain Definitions. They allow you to declare a domain in an XML format (not much unlike the %Installer approach) and avoid a number of inconveniences in managing your domain in a reproducible way. From reading the article, I believe you're just using the iKnow domain for that one EntityAPI:GetSimilar() call to generate search suggestions. iFind has a similar feature, also exposed through SQL, through %iFind.FindEntities() and %iFind.FindWords(), depending on what kind of results you're looking for. See also this iFind demo. With that in place, you may even be able to skip those domains altogether :-) thanks, benjamin

Thank you, Benjamin. I will keep your ideas in mind. Thanks, Konstantin Eremin.

Thanks for posting this Konstantin. For a long time I have been wondering why InterSystems hadn't done this already. I've had something simple running on my laptop already a long time ago, but the internal discussion on how to package it proved a little more complicated. Among other things, an iFind index requires an iKnow-enabled license (and more space!), which meant you couldn't simply include it in every kit. Also, for the ranking of docbook results, applying proper weights based on the type of content (title / paragraph / sample / ...) was at least as important as the text search capabilities themselves. That latter piece has been well-addressed in 2017.1, so docbook search is in pretty good shape now. Blending in an easily-deployable iFind option as Konstantin published can only add to this! Thanks, benjamin

Hi, Konstantin! I tried to search for the word $Case and it is found, but a strange option shows up in the dropdown list of the search field. See the screenshot. What does it mean?

Hi, Evgeny! I used iKnow entities as the words in the dropdown list of the search field. iKnow thinks "$case( $extract( units, 1" is an entity, which is why it looks strange. But I would like to switch to %iFind.FindEntities() (an idea from Benjamin DeBoe's first comment) for the words in the dropdown list of the search field before long. I think it will fix this.

iKnow was written to analyze English rather than ObjectScript, so you may see a few odd results coming out of code blocks. I believe you can add a where clause excluding those records from the block table to avoid them.

Now I use %iFind.FindEntities to get the words for the dropdown list of the search field. Installation has become faster than before, because I don't use the domain building process.

Hi, Konstantin! The problem with strange suggestions is fixed, but it doesn't suggest anything for $CASE now ) Did you introduce $CASE in a blacklist? I think suggestions for all COS commands and functions would be a good option for the search field (if possible, of course).

Hi, Evgeny! Yes, I agree with you about COS commands in the dropdown list of the search field. I had some problems with COS commands and functions, but now I have fixed it:

Hi Konstantin, can we install this project on Caché 2016.2 or does it need 2017? I tried to install offline (because my server cannot get through to GITHUB(443)) and the installation failed with several errors. Maybe I need more specific instructions for an offline install? Uri

Hi Uri! You need Caché 2017. Konstantin

Hi, Konstantin! When I search documentation with your online tool, what version of the documentation does it work with? Would you please add the version of the product in the results or somewhere? Thanks in advance!

Hi, Evgeny! I will add the version of the product in the results in the near future.

Hi, Evgeny! I added the version of the documentation in the results. Konstantin

Thanks, Konstantin! And here is the link to the demo. Do you want to add an option to share the search? E.g. introduce a share-results button in the UI which would provide a URL with the search options added to it? It would be very handy if you want to share search results with a colleague.

Good day, I would very much like to install this example on my local instance. However, I cannot find installer.xml on the "corresponding page". Which is the "corresponding page", please? I downloaded the solution from GitHub, but there is no installer.xml there either. I will appreciate it if you can point me to the "corresponding page" where the installer.xml is, please. Thank you in advance.

Hi Elize! It's in releases
Announcement
Evgeny Shvarov · Sep 15, 2017

Join InterSystems Developer Meetup on 17th of October in UK, Birmingham!

Hi, Community! We are pleased to invite you to the InterSystems UK Developer Community Meetup on the 17th of October!

The UK Developer Community Meetup is an informal meeting of developers, engineers, and devops to discuss successes and lessons learnt from those building and supporting solutions with InterSystems products. An excellent opportunity to meet and discuss new solutions with like-minded peers and to find out what's new in InterSystems technology. The Meetup will take place on the 17th of October from 5pm to 8pm at The Belfry, Sutton Coldfield, with food and beverages supplied. Your stories are very welcome! Here is the current agenda for the event:

Time | Session | Presenter | Site
5:00 pm | Dependencies and Complexity | @John.Murray | georgejames.com
5:30 pm | Developing modern web applications with Caché, Web Components & JSON-RPC | @Sean.Connelly | memcog.com
6:00 pm | Networking Coffee break | |
6:30 pm | Up Arrow Redux: Persistence as a Language Feature | @Rob.Tweed | mgateway.com
7:00 pm | First class citizens of the container world | @Luca.Ravazzolo | InterSystems Product Manager

If you want to be a presenter, please comment on this post below and we'll contact you. All sessions are now filled. Attendees are also invited to join us the following day for the UK Technology Summit - the annual gathering of the InterSystems community to discuss the technologies, strategies, and methodologies that will leverage what matters – competitive advantage and business growth. Register for the Meetup here (link to http://www3.intersystems.com/its2017/registration) and select UK Developer Community Meet Up.

The topic from @rob.tweed is introduced. We have one free slot available! And we will have a session regarding containers from @Luca.Ravazzolo, InterSystems Product Manager. Come to the InterSystems Data Platform UK Meetup and the InterSystems UK Summit!

We will have a live stream in two hours. Join!

We are live now! If you have any questions for the presenters, you can ask them online.

To accompany the YouTube video, I have posted the slide deck for my talk (the first one) here.

Slides from @Luca.Ravazzolo's session are available here.

The slide deck for my presentation on "data persistence as a language feature" is here: https://www.slideshare.net/robtweed/data-persistence-as-a-language-feature

Slide #7. You touched on a very sore subject. As I understand you!

My presentation made reference to a Google V8 API bottleneck issue. Here's the link to the bug tracker report: https://bugs.chromium.org/p/v8/issues/detail?id=5144#c1 and the detailed benchmark tests that illustrate the problem: https://bugs.chromium.org/p/v8/issues/attachmentText?aid=240024

Here are the slides from the DeepSee Web session
Article
Developer Community Admin · Oct 21, 2015

High Availability Strategies for InterSystems Caché, Ensemble, and HealthShare Foundation

Introduction

This document is intended to provide a survey of various High Availability (HA) strategies that can be used in conjunction with InterSystems Caché, Ensemble, and HealthShare Foundation. This document also provides an overview of the various types of system outages that can occur, as well as how each strategy would handle a given outage, with the goal of helping you choose the right strategy for your specific deployment.

The strategies surveyed in this document are based on three different HA technologies:
Operating System Failover Clusters
Virtualization-Based HA
Caché Database Mirroring
Article
Developer Community Admin · Oct 21, 2015

The European Space Agency: Charting the Galaxy with the Gaia Satellite and InterSystems Caché

Abstract

The European Space Agency (ESA) has chosen InterSystems Caché as the database technology for the AGIS astrometric solution that will be used to analyze the celestial data captured by the Gaia satellite.

The Gaia mission is to create an accurate phase-map of about a billion celestial objects. During the mission, the AGIS solution will iteratively refine the accuracy of Gaia's spatial observations, ultimately achieving accuracies that are on the order of 20 microarcseconds.

In preparation for the extreme data requirements of this project, InterSystems recently engaged in a proof-of-concept project which required 5 billion discrete Java objects of about 600 bytes each to be inserted into the Caché database within a span of 24 hours. Running on one 8-core Intel 64-bit processor with Red Hat Enterprise Linux 5.5, Caché successfully ingested all the data in 12 hours and 18 minutes, at an average insertion rate of 112,000 objects/second.
Question
Tom Philippi · Jan 31, 2018

DSN does not show up on InterSystems Ensemble SQL Gateway configuration.

I am running InterSystems Ensemble 2016.2 on Ubuntu and trying to connect to a remote MS SQL Server database. So far, I have successfully configured my Ubuntu machine to connect to the remote MS SQL Server database using unixODBC. That is:

A Telnet connection works.
A tsql (test sql) connection works.
The isql command successfully connects to SQL Server and I am able to execute queries on Ubuntu.

The DSNs for the isql command are defined in /etc/odbc.ini and /etc/odbcinst.ini and should be available system-wide. The DSN in odbcinst.ini uses the Microsoft ODBC Driver 13 for SQL Server for Linux. However, when I access the SQL Gateway in the Management Portal, the DSN configured in /etc/odbc.ini does not show up. Does anyone know how I can expose my DSN defined in /etc/odbc.ini to Ensemble? I already tried creating a shortcut in the /intersystems/mgr directory named cacheodbc.ini (as described here: https://groups.google.com/forum/#!topic/intersystems-public-cache/4__XchiaCQU), but so far no success :(.

The first thing I'd check are the permissions on these files. If you created them as root, they might not be readable for other users?
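As a related quick check, you can test whether the instance itself can reach the DSN from a Caché/Ensemble terminal, independently of the portal page. This is only a sketch: "MyMSSQL", "sqluser" and "sqlpassword" are placeholders for the DSN name from /etc/odbc.ini and its credentials, and %SQLGatewayConnection is the standard ODBC gateway class.

Set gc = ##class(%SQLGatewayConnection).%New()
Set sc = gc.Connect("MyMSSQL", "sqluser", "sqlpassword", 0)
Write $Select(sc=1:"Connected through the ODBC gateway", 1:$System.Status.GetErrorText(sc)),!
If sc=1 Do gc.Disconnect()

If this connects, the driver and DSN are visible to the instance and the problem is likely limited to what the portal's SQL Gateway page reads (e.g. the cacheodbc.ini location or file permissions mentioned above); if it fails, the error text usually points at the driver or DSN definition itself.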
Question
Tom Philippi · Apr 20, 2017

Enabling SSL / TLS on an InterSystems (soap) web service, part 2

We are in the process of enabling SSL on a SOAP web service exposed via InterSystems, but are running into trouble. We have installed our certificates on our web server (Apache 2.4) and enabled SSL over the default port 57772. However, we now get an error when sending a SOAP message to the web service (it used to work over HTTP). Specifically, the CSP gateway refuses to route the message to the SOAP web service:

<SOAP-ENV:Envelope SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:s="http://www.w3.org/2001/XMLSchema">
  <SOAP-ENV:Body>
    <SOAP-ENV:Fault>
      <faultcode>SOAP-ENV:Server</faultcode>
      <faultstring>CSP Gateway Error (version:2016.1.2.209.0 build:1601.1554e)</faultstring>
      <detail>
        <error xmlns="http://tempuri.org">
          <special>Systems Management</special>
          <text>Invalid Request : Cannot identify application path</text>
        </error>
      </detail>
    </SOAP-ENV:Fault>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>

Probably either the CSP gateway or the web server is misconfigured. Does anyone have an idea which direction we might look in? (BTW, accessing the management port now returns the same error, as does using SSL port 443.)

PS: this issue was also submitted to the WRC.

Tom, I presume by now you've had this answered by the WRC, but the issue is most likely that the private Apache web server that ships with Caché/Ensemble does not currently support SSL. In order to configure SSL, you would need to configure a full Apache or IIS web server, which is typically recommended for any public-facing, production-level deployment anyway. -Steve
Announcement
Steve Brunner · Sep 4, 2018

InterSystems IRIS Data Platform 2018.1.2 Maintenance Release

InterSystems is pleased to announce the availability of the InterSystems IRIS Data Platform 2018.1.2 maintenance release. For information about the corrections in this release, refer to the release notes. This release is supported on the same platforms as InterSystems IRIS 2018.1.1. You can see details, including the cloud platforms and Docker containers supported, in this Supported Platforms document. The build corresponding to this release is 2018.1.2.609.0. If you have not visited our Learning Services site recently, I encourage you to try the InterSystems IRIS sandbox and Experiences.
Article
Vasiliy Bondar · Oct 14, 2018

Configuring LDAP authentication in InterSystems Caché using Microsoft Active Directory

From the first glance, the task of configuring LDAP authentication in Caché is not hard at all – the manual describes this process in just 6 paragraphs. On the other hand, if the LDAP server uses Microsoft Active Directory, there are a few non-evident things that need to be configured on the LDAP server side. Those who don’t do anything like that on a regular basis may get lost in Caché settings. In this article, we will describe the step-by-step process of setting up LDAP authentication and cover the diagnostic methods that can be used if something doesn’t work as expected.

Configuration of the LDAP server

1. Create a user in Active Directory that we will use to connect to Caché and search for information in the LDAP database. This user must be located in the domain’s root.
2. Let’s create a special unit for users who will be connecting to Caché and call it ldapCacheUsers.
3. Register users there.
4. Let’s test the availability of the LDAP database using a tool called ldapAdmin. You can download it here.
5. Configure the connection to the LDAP server:
6. All right, we are connected now. Let’s take a look at how it all works:
7. Since users that will be connecting to Caché are in the ldapCacheUsers unit, let’s limit our search to this unit only.

Settings on the Caché side

8. The LDAP server is ready, so let’s proceed to configuring the settings on the Caché side. Go to Management Portal -> System Administration -> Security -> System Security -> LDAP Options. Let’s clear the “User attribute to retrieve default namespace”, “User attribute to retrieve default routine” and “User attribute to retrieve roles” fields, since these attributes are not in the LDAP database yet.
9. Enable LDAP authentication in System Administration -> Security -> System Security -> Authentication/CSP Session Settings.
10. Enable LDAP authentication in services. The %Service_CSP service is responsible for connecting web applications; %Service_Console handles connections through the terminal.
11. Configure LDAP authentication in web applications.
12. For the time being, and for testing the connection, let’s configure everything so that new users in Caché have full rights. To do this, assign the %All role to the user _PUBLIC. We will address this aspect later.
13. Let’s try opening the configured web application; it should open without problems.
14. The terminal also opens.
15. After connecting, LDAP users will appear in the Caché users list.
16. The truth is, this configuration gives all new users complete access to the system. To close this security hole, we need to modify the LDAP database by adding an attribute that we will use to store the name of the role that will be assigned to users after connecting to Caché. Prior to that, we need to make a backup copy of the domain controller to ensure that we don’t break the entire network if something goes wrong with the configuration process.
17. To modify the Active Directory schema, let’s install the Active Directory snap-in on the server where Active Directory is installed (it is not installed by default). Read the instructions here.
18. Let’s create an attribute called intersystems-Roles, OID 1.2.840.113556.1.8000.2448.2.3, a case-sensitive string, a multi-value attribute.
19. Then add this attribute to the class “user”.
20. Let’s now make it so that when we view the list of unit users, we can see a “Role in InterSystems Cache” column. To do that, click Start -> Run and type “adsiedit.msc”. We are connecting to the “Configuration” naming context.
21. Let’s go to the CN=409, CN=DisplaySpecifiers, CN=Configuration container and choose a container type that will show additional user attributes when we view it. Let’s choose unit-level display (OU) provided by the organizationalUnit-Display container. We need to find the extraColumns attribute in its properties and change its value to “intersystems-Roles, Role in IntersystemsCache,1,200,0”. The rule for composing the attribute is as follows: attribute name, name of the destination column, display by default or not, column width in pixels, reserved value. One more comment: CN=409 denotes a language code (CN=409 for the English version, CN=419 for the Russian version of the console).
22. We can now fill out the name of the role that will be assigned to all users connecting to Caché. If your Active Directory is running on Windows Server 2003, you won’t have any built-in tools for editing this field. You can use a tool called ldapAdmin (see item 4) for editing the value of this attribute. If you have a newer version of Windows, this attribute can be edited in the “Additional functions” mode – the user will see an additional tab for editing attributes.
23. After that, let’s specify the name of this attribute in the LDAP options of the Caché Management Portal.
24. Let’s create an ldapRole with the necessary privileges.
25. Remove the %All role from the user _PUBLIC.
26. Everything is set up, so let’s try connecting to the system.
27. If it doesn’t work right away, enable and set up an audit.
28. Audit settings.
29. Look at the error log in the Audit Database.

Conclusion

In reality, it often happens that the configuration of different roles for different users is not required for working in an application. If you only need to assign a particular set of permissions to users logging in to a web application, you can skip steps 16 through 23. All you will need to do is to add these roles and remove all types of authentication except for LDAP on the “Application roles” tab in the web application settings. In this case, only users registered on the LDAP server can log in. When such a user logs in, Caché automatically assigns the roles required for working in this application.

I wanted to add that you certainly can create an attribute to list a user's roles as described here, and some sites do, but it's not the only way to configure LDAP authentication. Many administrators find the group-based behavior enabled by the "Use LDAP Groups for Roles/Routine/Namespace" option easier to configure, so you should consider that option if you're setting up LDAP authentication. If you do use that option, many of the steps here will be different, including at least steps 17-23 where the attribute is created and configured.

Yes, I agree. Thanks for the addition.

Thank you for sharing. Good job.
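When steps 27 to 29 point you at the audit log, it can be quicker to pull the recent login failures with a query instead of browsing the portal. This is a hedged sketch only: it assumes the audit log is projected as the %SYS.Audit SQL table (run it in the %SYS namespace with access to the audit database), that the "LoginFailure" event is being audited, and that the column names match your version; adjust as needed.

Set tSQL = "SELECT TOP 20 UTCTimeStamp, Username, Description FROM %SYS.Audit WHERE Event = 'LoginFailure' ORDER BY UTCTimeStamp DESC"
Set tRS = ##class(%SQL.Statement).%ExecDirect(, tSQL)
While tRS.%Next() { Write tRS.%Get("UTCTimeStamp"), "  ", tRS.%Get("Username"), "  ", tRS.%Get("Description"), ! }

For LDAP problems, the Description of the failure usually tells you whether the bind to the directory failed, the user was not found in the search base (the ldapCacheUsers unit), or a required attribute was missing.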
Article
Niyaz Khafizov · Oct 8, 2018

Record linkage using InterSystems IRIS, Apache Zeppelin, and Apache Spark

Hi all. We are going to find duplicates in a dataset using Apache Spark Machine Learning algorithms.

Note: I have done the following on Ubuntu 18.04, Python 3.6.5, Zeppelin 0.8.0, Spark 2.1.1

Introduction

In previous articles we have done the following:
The way to launch Jupyter Notebook + Apache Spark + InterSystems IRIS
Load a ML model into InterSystems IRIS
K-Means clustering of the Iris Dataset
The way to launch Apache Spark + Apache Zeppelin + InterSystems IRIS

In this series of articles, we explore Machine Learning and record linkage. Imagine that we merged the databases of neighboring shops. Most probably there will be records that are very similar to each other. Some records will be of the same person, and we call them duplicates. Our purpose is to find the duplicates. Why is this necessary? First of all, to combine data from many different operational source systems into one logical data model, which can then be subsequently fed into a business intelligence system for reporting and analytics. Secondly, to reduce data storage costs. There are some additional use cases as well.

Approach

What data do we have? Each row contains different anonymized information about one person. There are family names, given names, middle names, dates of birth, several documents, etc.

The first step is to look at the number of records, because we are going to make pairs. The number of pairs equals n*(n-1)/2. So, if you have 5,000 records, the number of pairs is 12,497,500. That is not that many, so we can pair every record with every other. But if you have 50,000, 100,000 or more records, the number of pairs is more than a billion (see the quick arithmetic check at the end of this section). That many pairs are hard to store and work with. So, if you have a lot of records, it would be a good idea to reduce this number. We will do it by selecting potential duplicates. A potential duplicate is a pair that might be a duplicate. We will detect them based on several simple conditions. A specific condition might look like:

(record1.familyName == record2.familyName) & (record1.givenName == record2.givenName) & (record1.dateOfBirth == record2.dateOfBirth)

but keep in mind that you can miss duplicates because of overly strict logical conditions. I think the optimal solution is to choose important conditions and use no more than two of them with the & operator. But you should convert each feature into one standard shape beforehand. For example, there are several ways to store dates: 1985-10-10, 10/10/1985, etc.; convert them all to 10-10-1985 (month-day-year).

The next step is to label part of the dataset. We will randomly choose, for example, 5,000-10,000 pairs (or more, if you are sure that you can label all of them). We will save them to IRIS and label these pairs in Jupyter (unfortunately, I didn't find an easy and convenient way to do it; you can also label them in the PySpark console or wherever you want).

After that, we will make a feature vector for each pair. During the labeling process you probably noticed which features are important and what their values look like. So, test different approaches to creating feature vectors. Then test different machine learning models. I chose a random forest model based on tests (accuracy/precision/recall/etc.). You can also try decision trees, Naive Bayes, or other classification models and choose the one that works best. Test the result. If you are not satisfied with the result, try changing the feature vectors or the ML model. Finally, feed all pairs into the model and look at the result.
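A quick arithmetic check of the figures quoted above (nothing new, just the pair-count formula evaluated):

n*(n-1)/2 pairs: for n = 5,000 that is 5,000 * 4,999 / 2 = 12,497,500; for n = 50,000 it is 50,000 * 49,999 / 2 = 1,249,975,000 (already over a billion); for n = 100,000 it is 100,000 * 99,999 / 2 = 4,999,950,000, i.e. about 5 billion. This is why the candidate-pair reduction step matters as soon as the dataset grows beyond a few thousand records.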
Implementation Load a dataset: %pysparkdataFrame=spark.read.format("com.intersystems.spark").option("url", "IRIS://localhost:51773/******").option("user", "*******").option("password", "*********************").option("dbtable", "**************").load() Clean the dataset. For example, null (check every row) or useless columns: %pysparkcolumns_to_drop = ['allIdentityDocuments', 'birthCertificate_docSource', 'birthCertificate_expirationDate', 'identityDocument_expirationDate', 'fullName']droppedDF = dataFrame.drop(*columns_to_drop) Prepare the dataset for making pairs: %pysparkfrom pyspark.sql.functions import col# rename columns namesreplacements1 = {c : c + '1' for c in droppedDF.columns}df1 = droppedDF.select([col(c).alias(replacements1.get(c, c)) for c in droppedDF.columns])replacements2 = {c : c + '2' for c in droppedDF.columns}df2 = droppedDF.select([col(c).alias(replacements2.get(c, c)) for c in droppedDF.columns]) To make pairs we will use join function with several conditions. %pysparktestTable = (df1.join(df2, (df1.ID1 < df2.ID2) & ( (df1.familyName1 == df2.familyName2) & (df1.givenName1 == df2.givenName2) | (df1.familyName1 == df2.familyName2) & (df1.middleName1 == df2.middleName2) | (df1.familyName1 == df2.familyName2) & (df1.dob1 == df2.dob2) | (df1.familyName1 == df2.familyName2) & (df1.snils1 == df2.snils2) | (df1.familyName1 == df2.familyName2) & (df1.addr_addressLine1 == df2.addr_addressLine2) | (df1.familyName1 == df2.familyName2) & (df1.addr_okato1 == df2.addr_okato2) | (df1.givenName1 == df2.givenName2) & (df1.middleName1 == df2.middleName2) | (df1.givenName1 == df2.givenName2) & (df1.dob1 == df2.dob2) | (df1.givenName1 == df2.givenName2) & (df1.snils1 == df2.snils2) | (df1.givenName1 == df2.givenName2) & (df1.addr_addressLine1 == df2.addr_addressLine2) | (df1.givenName1 == df2.givenName2) & (df1.addr_okato1 == df2.addr_okato2) | (df1.middleName1 == df2.middleName2) & (df1.dob1 == df2.dob2) | (df1.middleName1 == df2.middleName2) & (df1.snils1 == df2.snils2) | (df1.middleName1 == df2.middleName2) & (df1.addr_addressLine1 == df2.addr_addressLine2) | (df1.middleName1 == df2.middleName2) & (df1.addr_okato1 == df2.addr_okato2) | (df1.dob1 == df2.dob2) & (df1.snils1 == df2.snils2) | (df1.dob1 == df2.dob2) & (df1.addr_addressLine1 == df2.addr_addressLine2) | (df1.dob1 == df2.dob2) & (df1.addr_okato1 == df2.addr_okato2) | (df1.snils1 == df2.snils2) & (df1.addr_addressLine1 == df2.addr_addressLine2) | (df1.snils1 == df2.snils2) & (df1.addr_okato1 == df2.addr_okato2) | (df1.addr_addressLine1 == df2.addr_addressLine2) & (df1.addr_okato1 == df2.addr_okato2) ))) Check the size of returned dataframe: %pysparkdroppedColumns = ['prevIdentityDocuments1', 'birthCertificate_docDate1', 'birthCertificate_docNum1', 'birthCertificate_docSer1', 'birthCertificate_docType1', 'identityDocument_docDate1', 'identityDocument_docNum1', 'identityDocument_docSer1', 'identityDocument_docSource1', 'identityDocument_docType1', 'prevIdentityDocuments2', 'birthCertificate_docDate2', 'birthCertificate_docNum2', 'birthCertificate_docSer2', 'birthCertificate_docType2', 'identityDocument_docDate2', 'identityDocument_docNum2', 'identityDocument_docSer2', 'identityDocument_docSource2', 'identityDocument_docType2'] print(testTable.count())testTable.drop(*droppedColumns).show() # I dropped several columns just for show() function Randomly take a part of the dataframe: %pysparkrandomDF = testTable.sample(False, 0.33, 0)randomDF.write.format("com.intersystems.spark").\option("url", 
"IRIS://localhost:51773/DEDUPL").\option("user", "*****").option("password", "***********").\option("dbtable", "deduplication.unlabeledData").save() Label pairs in Jupyter Run the following (it will widen the cells). from IPython.core.display import display, HTMLdisplay(HTML("<style>.container { width:100% !important; border-left-width: 1px !important; resize: vertical}</style>")) Load dataframe: unlabeledDF = spark.read.format("com.intersystems.spark").option("url", "IRIS://localhost:51773/DEDUPL").option("user", "********").option("password", "**************").option("dbtable", "deduplication.unlabeledData").load() Return all the elements of the dataset as a list: rows = labelledDF.collect() The convenient way to display pairs: from IPython.display import clear_outputfrom prettytable import PrettyTablefrom collections import OrderedDict def printTable(row): row = OrderedDict((k, row.asDict()[k]) for k in newColumns) table = PrettyTable() column_names = ['Person1', 'Person2'] column1 = [] column2 = [] i = 0 for key, value in row.items(): if key != 'ID1' and key != 'ID2' and key != "prevIdentityDocuments1" and key != 'prevIdentityDocuments2' and key != "features": if (i < 20): column1.append(value) else: column2.append(value) i += 1 table.add_column(column_names[0], column1) table.add_column(column_names[1], column2) print(table) List where we will store rows: listDF = [] The labeling process: from pyspark.sql import Rowfrom IPython.display import clear_outputimport time# 3000 - 4020for number in range(3000 + len(listDF), len(rows)): row = rows[number] if (len(listDF) % 10) == 0: print(3000 + len(listDF)) printTable(row) result = 0 label = 123 while True: result = input('duplicate? y|n|stop') if (result == 'stop'): break elif result == 'y': label = 1.0 break elif result == 'n': label = 0.0 break else: print('only y|n|stop') continue if result == 'stop': break tmp = row.asDict() tmp['label'] = label newRow = Row(**tmp) listDF.append(newRow) time.sleep(0.2) clear_output() Create a dataframe again: newColumns.append('label')labelledDF = spark.createDataFrame(listDF).select(*newColumns) Save it to IRIS: labeledDF.write.format("com.intersystems.spark").\option("url", "IRIS://localhost:51773/DEDUPL").\option("user", "***********").option("password", "**********").\option("dbtable", "deduplication.labeledData").save() Feature vector and ML model Load a dataframe into Zeppelin: %pysparklabeledDF = spark.read.format("com.intersystems.spark").option("url", "IRIS://localhost:51773/DEDUPL").option("user", "********").option("password", "***********").option("dbtable", "deduplication.labeledData").load() Feature vector generation: %pysparkfrom pyspark.sql.functions import udf, structimport stringdistfrom pyspark.sql.types import StructType, StructField, StringType, IntegerType, DateType, ArrayType, FloatType, DoubleType, LongType, NullTypefrom pyspark.ml.linalg import Vectors, VectorUDTimport roman translateMap = {'A' : 'А', 'B' : 'В', 'C' : 'С', 'E' : 'Е', 'H' : 'Н', 'K' : 'К', 'M' : 'М', 'O' : 'О', 'P' : 'Р', 'T' : 'Т', 'X' : 'Х', 'Y' : 'У'} column_names = testTable.drop('ID1').drop('ID2').columnscolumnsSize = len(column_names)//2 def isRoman(numeral): numeral = numeral.upper() validRomanNumerals = ["M", "D", "C", "L", "X", "V", "I", "(", ")"] for letters in numeral: if letters not in validRomanNumerals: return False return True def differenceVector(params): differVector = [] for i in range(0, 3): if params[i] == None or params[columnsSize + i] == None: differVector.append(0.0) elif params[i] == 
'НЕТ' or params[columnsSize + i] == 'НЕТ': differVector.append(0.0) elif params[i][:params[columnsSize + i].find('-')] == params[columnsSize + i][:params[columnsSize + i].find('-')] or params[i][:params[i].find('-')] == params[columnsSize + i][:params[i].find('-')]: differVector.append(0.0) else: differVector.append(stringdist.levenshtein(params[i], params[columnsSize+i])) for i in range(3, columnsSize): # snils if i == 5 or i == columnsSize + 5: if params[i] == None or params[columnsSize + i] == None or params[i].find('123-456-789') != -1 or params[i].find('111-111-111') != -1 \ or params[columnsSize + i].find('123-456-789') != -1 or params[columnsSize + i].find('111-111-111') != -1: differVector.append(0.0) else: differVector.append(float(params[i] != params[columnsSize + i])) # birthCertificate_docNum elif i == 10 or i == columnsSize + 10: if params[i] == None or params[columnsSize + i] == None or params[i].find('000000') != -1 or params[i].find('000000') != -1 \ or params[columnsSize + i].find('000000') != -1 or params[columnsSize + i].find('000000') != -1: differVector.append(0.0) else: differVector.append(float(params[i] != params[columnsSize + i])) # birthCertificate_docSer elif i == 11 or i == columnsSize + 11: if params[i] == None or params[columnsSize + i] == None: differVector.append(0.0) # check if roman or not, then convert if roman else: docSer1 = params[i] docSer2 = params[columnsSize + i] if isRoman(params[i][:params[i].index('-')]): docSer1 = str(roman.fromRoman(params[i][:params[i].index('-')])) secPart1 = '-' for elem in params[i][params[i].index('-') + 1:]: if 65 <= ord(elem) <= 90: secPart1 += translateMap[elem] else: secPart1 = params[i][params[i].index('-'):] docSer1 += secPart1 if isRoman(params[columnsSize + i][:params[columnsSize + i].index('-')]): docSer2 = str(roman.fromRoman(params[columnsSize + i][:params[columnsSize + i].index('-')])) secPart2 = '-' for elem in params[columnsSize + i][params[columnsSize + i].index('-') + 1:]: if 65 <= ord(elem) <= 90: secPart2 += translateMap[elem] else: secPart2 = params[columnsSize + i][params[columnsSize + i].index('-'):] break docSer2 += secPart2 differVector.append(float(docSer1 != docSer2)) elif params[i] == 0 or params[columnsSize + i] == 0: differVector.append(0.0) elif params[i] == None or params[columnsSize + i] == None: differVector.append(0.0) else: differVector.append(float(params[i] != params[columnsSize + i])) return differVector featuresGenerator = udf(lambda input: Vectors.dense(differenceVector(input)), VectorUDT()) %pysparknewTestTable = testTable.withColumn('features', featuresGenerator(struct(*column_names))) # all pairsdf = df.withColumn('features', featuresGenerator(struct(*column_names))) # labeled pairs Split labeled dataframe into training and test dataframes: %pysparkfrom pyspark.ml import Pipelinefrom pyspark.ml.classification import RandomForestClassifierfrom pyspark.ml.feature import IndexToString, StringIndexer, VectorIndexerfrom pyspark.ml.evaluation import MulticlassClassificationEvaluator # split labelled data into two sets(trainingData, testData) = df.randomSplit([0.7, 0.3]) Train a RF model: %pysparkfrom pyspark.ml.classification import RandomForestClassifier rf = RandomForestClassifier(labelCol='label', featuresCol='features') pipeline = Pipeline(stages=[rf]) model = pipeline.fit(trainingData) # Make predictions.predictions = model.transform(testData)# predictions.select("predictedLabel", "label", "features").show(5) Test the RF model: %pysparkTP = int(predictions.select("label", 
"prediction").where((col("label") == 1) & (col('prediction') == 1)).count())TN = int(predictions.select("label", "prediction").where((col("label") == 0) & (col('prediction') == 0)).count())FP = int(predictions.select("label", "prediction").where((col("label") == 0) & (col('prediction') == 1)).count())FN = int(predictions.select("label", "prediction").where((col("label") == 1) & (col('prediction') == 0)).count())total = int(predictions.select("label").count()) print("accuracy = %f" % ((TP + TN) / total))print("precision = %f" % (TP/ (TP + FP))print("recall = %f" % (TP / (TP + FN)) How it looks: Use the RF model on all the pairs: %pysparkallData = model.transform(newTestTable) Check how many duplicates are found: %pysparkallData.where(col('prediction') == 1).count() Or look at the dataframe: Conclusion This approach is not ideal. You can make it better by experimenting with feature vectors, a model or increasing the size of labeled dataset. Also, you can do the same to find duplicates, for example, in shops database, historical research, etc... Links Apache Zeppelin Jupyter Notebook Apache Spark Record Linkage ML models The way to launch Jupyter Notebook + Apache Spark + InterSystems IRIS Load a ML model into InterSystems IRIS K-Means clustering of the Iris Dataset The way to launch Apache Spark + Apache Zeppelin + InterSystems IRIS GitHub