Apache Spark has rapidly become one of the most exciting technologies for big data analytics and machine learning. Spark is a general-purpose data processing engine created for use in clustered computing environments. At its heart is the Resilient Distributed Dataset (RDD), which represents a distributed, fault-tolerant collection of data that can be operated on in parallel across the nodes of a cluster. Spark is implemented in a combination of Java and Scala, and so comes as a library that can run on any JVM.


Hi all.

I want to insert my dataframe into InterSystems IRIS. So, I tried to do this:

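df = spark.read.load("/home/imported-openssh-key/zeppelin-0.8.0-bin-all/bin/resultData3/DF.json", format="json")
# The writer call is missing from the snippet as posted; the format name below is an assumption.
df.write.format("com.intersystems.spark").\
option("url", "IRIS://localhost:51773/DEDUPL").\
option("user", "********").option("password", "********").\
option("dbtable", "try.test1").save()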

And got this error:


Last week, we announced the InterSystems IRIS Data Platform, our new and comprehensive platform for all your data endeavours, whether transactional, analytic or both. We've included many of the features our customers know and love from Caché and Ensemble, but in this article we'll shed a little more light on one of the platform's new capabilities: SQL Sharding, a powerful new feature in our scalability story.


Keywords:  PyODBC, unixODBC, IRIS, IntegratedML, Jupyter Notebook, Python 3



A few months ago I wrote a brief note on "Python JDBC connection into IRIS", and since then I have referred to it more often than my own scratchpad hidden deep in my PC. Hence, here is another 5-minute note on how to make a "Python ODBC connection into IRIS".


With the release of InterSystems IRIS, we're also making available a nifty bit of software that allows you to get the best out of your InterSystems IRIS cluster when working with Apache Spark for data processing, machine learning and other data-heavy fun. Let's take a closer look at how we're making your life as a Data Scientist easier, as you're probably facing tough big data challenges already, just from the influx of job offers in your inbox!

Niyaz Khafizov · Jul 27, 2018 4m read
Load a ML model into InterSystems IRIS

Hi all. Today we are going to upload an ML model into IRIS Manager and test it.

Note: I have done the following on Ubuntu 18.04, Apache Zeppelin 0.8.0, Python 3.6.5.


These days, many different data mining tools enable you to develop predictive models and analyze your data with unprecedented ease. InterSystems IRIS Data Platform provides a stable foundation for your big data and fast data applications, offering interoperability with modern data mining tools.



Keywords:  Anaconda, Jupyter Notebook, Tensorflow GPU, Deep Learning,  Python 3 and HealthShare    

1. Purpose and Objectives

This "Part I" is a quick, step-by-step record of how to set up a "simple" but popular deep learning demo environment with a Python 3 binding to a HealthShare 2017.2.1 instance. I used the Win10 laptop I had at hand, but the approach works the same on macOS and Linux.


Last week saw the launch of the InterSystems IRIS Data Platform in sunny California.

For the engaging eXPerience Labs (XP-Labs) training sessions, my first customer and favourite department (Learning Services) was working hard assisting and supporting us all behind the scenes.


The last time I created a playground for experimenting with machine learning using Apache Spark and an InterSystems data platform (see Machine Learning with Spark and Caché), I installed and configured everything directly on my laptop: Caché, Python, Apache Spark, Java, and some Hadoop libraries, to name a few. It required some effort, but eventually it worked.



Keywords: Jupyter Notebook, Tensorflow GPU, Keras, Deep Learning, MLP, and HealthShare


1. Purpose and Objectives

In the previous "Part I" we set up a deep learning demo environment. In this "Part II" we will test what we can do with it.

Many people my age started with the classic MLP (Multi-Layer Perceptron) model. It is intuitive and hence conceptually easier to start with.

James Breen · Aug 30, 2018
Machine Learning 101 Presentation

View the Machine Learning 101 recording at: https://videos.intersystems.com/detail/video/5827774460001/machine-learning-101?autoStart=true&q=machine%20learning.

In addition to our webinar on machine learning (https://community.intersystems.com/post/rescheduled-webinar-its-machine-learning-not-rocket-science-july-31-1100-am-edt), we are pleased to announce a presentation by @Donald Woodlock, InterSystems VP of HealthShare Platforms, that gives a basic introduction to machine learning and an overview of the basic algorithms.

Robert Cemper · Sep 22, 2018 3m read
Sharding evaluation #1

IRIS brought us a new WOW feature: SHARDING!
Definitely a great thing!
But how can I find out if it suits my actual applications?
Is there a practical advantage in going for it with my well-cooked transactional application?
Or is it just for new, still-to-be-designed applications?


Keywords:  IRIS, IntegratedML, Machine Learning, Covid-19, Kaggle 


Recently I noticed a Kaggle dataset for predicting whether a Covid-19 patient will be admitted to ICU. It is a spreadsheet of 1925 encounter records with 231 columns of vital signs and observations, where the last column, "ICU", is 1 for Yes or 0 for No. The task is to predict whether a patient will be admitted to ICU based on known data.
