Announcement
Janine Perkins · May 24, 2017
Take this course to learn how data flows from HealthShare Information Exchange to Health Insight, along with the details of that data flow. Learn how to:
- Relate a clinical scenario supported by Health Insight to its internal data structures and processes.
- Identify the main data management components of HealthShare Information Exchange and Health Insight.
- Describe the details of the data flow between HealthShare Information Exchange and Health Insight.
- Differentiate between HL7 and CCD data handling in HealthShare Information Exchange.
- Recognize configuration points in the system and how they affect system performance.
- Define the HealthShare Information Exchange internal data structures and how they are used.

Audience: HealthShare customers. This course is for anyone who customizes or supports Health Insight, as well as power users of Health Insight who need an understanding of its technical details. Learn More.
Announcement
Evgeny Shvarov · Sep 11, 2017
Hi, Community!
The Global Summit 2017 Keynotes session will start in two hours, at 9:00 AM (PT).
Here is the link for live streaming.
Join!
Text your questions now to get answers on the Global Summit 2017 Keynotes at this number: +16179968827

Is there a recording of this for people who couldn't watch in real time?

Hi, Mike! It will be posted on YouTube in a few days; we will make an announcement here.

Any update on when the videos will be posted to YouTube, specifically ones like the keynotes?

You can also find the InterSystems Global Summit Keynote presentations in a dedicated Global Summit 2017 playlist on the InterSystems Developers YouTube Channel: InterSystems Global Summit Keynote - Part 1, InterSystems Global Summit Keynote - Part 2. Enjoy!
Article
Murray Oldfield · Apr 8, 2016
This post will guide you through the process of sizing shared memory requirements for database applications running on InterSystems data platforms. It will cover key aspects such as global and routine buffers, gmheap, and locksize, providing you with a comprehensive understanding. Additionally, it will offer performance tips for configuring servers and virtualizing IRIS applications. Please note that when I refer to IRIS, I include all the data platforms (Ensemble, HealthShare, iKnow, Caché, and IRIS).
[A list of other posts in this series is here](https://community.intersystems.com/post/capacity-planning-and-performance-series-index)
When I first started working with Caché, most customer operating systems were 32-bit, and memory for an IRIS application was limited and expensive. Commonly deployed Intel servers had only a few cores, and the only way to scale up was to go with big iron servers or use ECP to scale out horizontally. Now, even basic production-grade servers have multiple processors, dozens of cores, and minimum memory is hundreds of GB or TB. For most database installations, ECP is forgotten, and we can now scale application transaction rates massively on a single server.
A key feature of IRIS is the way it uses data in shared memory, usually referred to as the database cache or global buffers. The short story is that if you can right-size and allocate 'more' memory to global buffers, you will usually improve system performance - data in memory is much faster to access than data on disk. Back in the day, when 32-bit systems ruled, the answer to the question _how much memory should I allocate to global buffers?_ was simple - _as much as possible!_ There wasn't that much available anyway, so sums were done diligently: calculate the OS requirements, the number and size of OS and IRIS processes, and the real memory used by each, then allocate the remainder to as large a global buffer pool as possible.
## The tide has turned
If you are running your application on a current-generation server, you can allocate huge amounts of memory to an IRIS instance, and a laissez-faire attitude often applies because memory is now "cheap" and plentiful. However, the tide has turned again: pretty much all but the very largest systems I see deployed now are virtualized. So, while 'monster' VMs can have large memory footprints if needed, the focus still comes back to right-sizing systems. To make the most of server consolidation, capacity planning is required to make the best use of available host memory.
# What uses memory?
Generally, there are four main consumers of memory on an IRIS database server:
* Operating System, including filesystem cache.
* If installed, other non-IRIS applications.
* IRIS processes.
* IRIS shared memory (includes global and routine buffers and GMHEAP).
At a high level, the amount of physical memory required is found by simply adding up the requirements of each of the items on the list. All of the above use real memory, but they can also use virtual memory. A key part of capacity planning is to size a system with enough physical memory so that paging does not occur or is minimized - or at least so that hard page faults, where memory has to be brought back from disk, are minimized or eliminated.
In this post I will focus on sizing IRIS shared memory and some general rules for optimising memory performance. The operating system and kernel requirements vary by operating system but will be several GB in most cases. Filesystem cache varies and will be whatever is available after the other items on the list take their allocation.
IRIS is mostly processes - if you look at operating system statistics while your application is running, you will see the IRIS processes (e.g. iris or iris.exe). So a simple way to observe your application's memory requirements is to look at operating system metrics - for example with `vmstat` or `ps` on Linux, or Process Explorer on Windows - and total the amount of real memory in use, extrapolating for growth and peak requirements. Be aware that some metrics report virtual memory, which includes shared memory, so be careful to gather real memory requirements.
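As a back-of-the-envelope sketch of this approach, assuming RSS samples (in MB) gathered with `ps` or Process Explorer, and illustrative growth and peak factors that you would replace with your own observations:

```python
def total_real_memory_mb(process_rss_mb, growth_factor=1.2, peak_factor=1.5):
    """Sum per-process resident set sizes (real memory) and extrapolate
    for expected growth and peak activity. Factors are illustrative."""
    baseline = sum(process_rss_mb)
    return baseline * growth_factor * peak_factor

# e.g. RSS samples (MB) for a handful of application processes
samples = [12, 15, 11, 30, 22]
print(round(total_real_memory_mb(samples)))  # 90 MB observed -> plan for 162 MB
```

The point is not the exact factors but the habit: measure real memory at a known workload, then scale for growth and peaks rather than guessing.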
## Sizing Global buffers - A simplified way
One of the capacity planning goals for a high transaction database is to size global buffers so that as much of the application database working set is in memory as possible. This will minimise read IOPS and generally improve the application's performance. We also need to strike a balance so that other memory users, such as the operating system and IRIS processes, are not paged out and there is enough memory for the filesystem cache.
I showed an example of what can happen if reads from disk are excessive in [Part 2 of this series.](https://community.intersystems.com/post/intersystems-data-platforms-and-performance-–-part-2) In that case, high reads were caused by a bad report or query, but the same effect can be seen if global buffers are too small, forcing the application to be constantly reading data blocks from disk. As a sidebar, it's also worth noting that the landscape for storage is always changing - storage is getting faster and faster with advances in SSDs and NVMe, but data in memory close to the running processes is still best.
Of course, every application is different, so it's important to say, "Your mileage may vary" but there are some general rules which will get you started on the road to capacity planning shared memory for your application. After that you can tune for your specific requirements.
### Where to start?
Unfortunately, there is no magic answer. However, as I discussed in previous posts, a good practice is to size the system CPU capacity so that for a required peak transaction rate, the CPU will be approximately 80% utilized at peak processing times, leaving 20% headroom for short-term growth or unexpected spikes in activity.
For example, when I am sizing TrakCare systems I know CPU requirements for a known transaction rate from benchmarking and reviewing customer site metrics, and I can use a broad rule of thumb for Intel processor-based servers:
`Rule of thumb:` Physical memory is sized at _n_ GB per CPU core for servers running IRIS.
- For example, for TrakCare database servers, a starting point of _n_ is 8 GB. But this can vary, and servers may be right-sized after the application has been running for a while -- you must monitor your systems continuously and do a formal performance review, for example, every six or 12 months.
`Rule of thumb:` Allocate _n_% of memory to IRIS global buffers.
- For small to medium TrakCare systems, n% is 60%, leaving 40% of memory for the operating system, filesystem cache, and IRIS processes. You may vary this, say to 50%, if you need a lot of filesystem cache or have a lot of processes. Or make it a higher percentage as you use very large memory configurations on large systems.
- This rule of thumb assumes only one IRIS instance on the server.
For example, if the application needs 10 CPU cores, the VM would have 80 GB of memory, 48 GB for global buffers, and 32 GB for everything else.
Memory sizing rules apply to physical or virtualized systems, so the same 1 vCPU: 8 GB memory ratio applies to TrakCare VMs.
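The two rules of thumb and the worked example can be sketched as a small calculator. The defaults below (8 GB per core, 60% to global buffers) are the TrakCare starting points from the text, not universal values; substitute your own _n_:

```python
def size_memory(cores, gb_per_core=8, global_buffer_pct=0.60):
    """Return (total memory, global buffers, everything else) in GB,
    following the rules of thumb; defaults are TrakCare starting points."""
    total = cores * gb_per_core
    global_buffers = total * global_buffer_pct
    return total, global_buffers, total - global_buffers

total, gbuf, rest = size_memory(10)
print(f"{total} GB total: {gbuf:.0f} GB global buffers, {rest:.0f} GB other")
# 80 GB total: 48 GB global buffers, 32 GB other
```

The same function covers the variations mentioned above, e.g. `size_memory(10, global_buffer_pct=0.50)` for a system needing more filesystem cache.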
### Tuning global buffers
There are a few items to observe to see how effective your sizing is. You can observe free memory outside IRIS with operating system tools. Set up as per your best calculations, then observe memory usage over time, and if there is always free memory, the system can be reconfigured to increase global buffers or to right-size a VM.
Another key indicator of good global buffer sizing is having read IOPS as low as possible, which means IRIS cache efficiency will be high. You can observe the impact of different global buffer sizes on PhyRds and RdRatio with mgstat; an example of looking at these metrics is in Part 2 of this series. Unless you have your entire database in memory, there will always be some reads from disk; the aim is simply to keep reads as low as possible.
Remember your hardware food groups and get the balance right. More memory for global buffers will lower read IOPS but possibly increase CPU utilization because your system can now do more work in a shorter time. Lowering IOPS is pretty much always a good thing, and your users will be happier with faster response times.
_See the section below for applying your requirements to __physical memory__ configuration._
For virtual servers, plan never to oversubscribe your production VM memory. This is especially true for IRIS shared memory; more on this below.
Is your application's sweet spot 8GB of physical memory per CPU core? I can't say, but see if a similar method works for your application, whether 4GB or 10GB per core. If you have found another method for sizing global buffers, please leave a comment below.
### Monitoring Global Buffer usage
The IRIS utility `^GLOBUFF` displays statistics about what your global buffers are doing at any point in time. For example to display the top 25 by percentage:
do display^GLOBUFF(25)
For example, output could look like this:
Total buffers: 2560000 Buffers in use: 2559981 PPG buffers: 1121 (0.044%)
Item Global Database Percentage (Count)
1 MyGlobal BUILD-MYDB1 29.283 (749651)
2 MyGlobal2 BUILD-MYDB2 23.925 (612478)
3 CacheTemp.xxData CACHETEMP 19.974 (511335)
4 RTx BUILD-MYDB2 10.364 (265309)
5 TMP.CachedObjectD CACHETEMP 2.268 (58073)
6 TMP CACHETEMP 2.152 (55102)
7 RFRED BUILD-RB 2.087 (53428)
8 PANOTFRED BUILD-MYDB2 1.993 (51024)
9 PAPi BUILD-MYDB2 1.770 (45310)
10 HIT BUILD-MYDB2 1.396 (35727)
11 AHOMER BUILD-MYDB1 1.287 (32946)
12 IN BUILD-DATA 0.803 (20550)
13 HIS BUILD-DATA 0.732 (18729)
14 FIRST BUILD-MYDB1 0.561 (14362)
15 GAMEi BUILD-DATA 0.264 (6748)
16 OF BUILD-DATA 0.161 (4111)
17 HISLast BUILD-FROGS 0.102 (2616)
18 %Season CACHE 0.101 (2588)
19 WooHoo BUILD-DATA 0.101 (2573)
20 BLAHi BUILD-GECKOS 0.091 (2329)
21 CTPCP BUILD-DATA 0.059 (1505)
22 BLAHi BUILD-DATA 0.049 (1259)
23 Unknown CACHETEMP 0.048 (1222)
24 COD BUILD-DATA 0.047 (1192)
25 TMP.CachedObjectI CACHETEMP 0.032 (808)
This could be useful in several ways, for example, to see how much of your working set is kept in memory. If you find this utility is useful please make a comment below to enlighten other community users on why it helped you.
## Sizing Routine Buffers
Routines your application is running, including compiled classes, are stored in routine buffers. The goal of sizing shared memory for routine buffers is for all your routine code to be loaded and stay resident in routine buffers. As with global buffers, it is expensive and inefficient to read routines off disk. The maximum size of the routine buffer pool is 1023 MB. As a rule, allocate more routine buffers than you need, as there is always a big performance gain in having routines cached.
Routine buffers come in different sizes. By default, IRIS determines the number of buffers of each size; at install time, the 2016.1 defaults are 4, 16, and 64 KB. It is possible to change the allocation of memory to the different sizes; however, to start your capacity planning, it is recommended to stay with the IRIS defaults unless you have a special reason for changing them. For more information, see the "config" appendix of the Parameter File Reference and "Memory and Startup Settings" in the "Configuring IRIS" chapter of the System Administration Guide in the [IRIS documentation](https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=RACS_routines).
As your application runs, routines are loaded off disk and stored in the smallest buffer the routine will fit. For example, if a routine is 3 KB, it will ideally be stored in a 4 KB buffer. If no 4 KB buffers are available, a larger one will be used. A routine larger than 32 KB will use as many 64 KB routine buffers as needed.
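The bucket selection described above can be sketched as follows. This is an idealized model that ignores buffer availability (a routine may land in a larger buffer if its ideal size is full); the 4/16/64 KB sizes are the 2016.1 install defaults:

```python
import math

BUCKETS_KB = (4, 16, 64)  # default routine buffer sizes at install time

def buffers_needed(routine_kb):
    """Return (buffer size in KB, number of buffers) for a routine,
    assuming its ideal bucket is available."""
    if routine_kb > 32:
        # routines over 32 KB span as many 64 KB buffers as needed
        return 64, math.ceil(routine_kb / 64)
    for size in BUCKETS_KB:
        if routine_kb <= size:
            return size, 1

print(buffers_needed(3))    # (4, 1)  - fits the smallest bucket
print(buffers_needed(100))  # (64, 2) - spans two 64 KB buffers
```

Note that an 18 KB routine, say, goes straight to a 64 KB buffer, since there is no intermediate bucket between 16 and 64 KB by default.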
### Checking Routine Buffer Use
#### mgstat metric RouLas
One way to understand if the routine buffer is large enough is the mgstat metric RouLas (routine loads and saves). A RouLas is a fetch from or save to disk. A high number of routine loads/saves may show up as a performance problem; in that case, you can improve performance by increasing the number of routine buffers.
#### cstat
If you have increased routine buffers to the maximum of 1023 MB and still see high RouLas, a more detailed examination is available: the `cstat` command lets you see which routines are in the buffers and how much space is used.
ccontrol stat cache -R1
This will produce a listing of routine metrics including a list of routine buffers and all the routines in cache. For example a partial listing of a default IRIS install is:
Number of rtn buf: 4 KB-> 9600, 16 KB-> 7200, 64 KB-> 2400,
gmaxrouvec (cache rtns/proc): 4 KB-> 276, 16 KB-> 276, 64 KB-> 276,
gmaxinitalrouvec: 4 KB-> 276, 16 KB-> 276, 64 KB-> 276,
Dumping Routine Buffer Pool Currently Inuse
hash buf size sys sfn inuse old type rcrc rtime rver rctentry rouname
22: 8937 4096 0 1 1 0 D 6adcb49e 56e34d34 53 dcc5d477 %CSP.UI.Portal.ECP.0
36: 9374 4096 0 1 1 0 M 5c384cae 56e34d88 13 908224b5 %SYSTEM.WorkMgr.1
37: 9375 4096 0 1 1 0 D a4d44485 56e34d88 22 91404e82 %SYSTEM.WorkMgr.0
44: 9455 4096 0 0 1 0 D 9976745d 56e34ca0 57 9699a880 SYS.Monitor.Health.x
2691:16802 16384 0 0 7 0 P da8d596f 56e34c80 27 383da785 START
etc
etc
"rtns/proc" on the 2nd line above is saying that 276 routines can be cached at each buffer size as default.
Using this information, another approach to sizing routine buffers is to run your application and list the running routines with `cstat -R1`. You could then calculate the routine sizes in use - for example, put the list in Excel, sort by size, and see exactly which routines are in use. If you are not using all the buffers of each size, then you have enough routine buffers; if you are using all of a given size, then you need to increase routine buffers, or you can be more precise about configuring the number of each bucket size.
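As a rough sketch of that spreadsheet exercise, the buffer-pool lines of a `cstat -R1` listing could be tallied per bucket size. The column layout here is assumed from the sample output above and may differ between versions, so treat this as a starting point, not a parser for every release:

```python
from collections import Counter

# Default routine buffer sizes in bytes, per the install defaults above
SIZES_BYTES = {4096, 16384, 65536}

def count_buffers_by_size(listing):
    """Count in-use routine buffers per bucket size (KB) in a
    cstat-style buffer pool listing (format assumed from the sample)."""
    counts = Counter()
    for line in listing.splitlines():
        for field in line.split():
            # the 'size' column is the first field matching a known bucket size
            if field.isdigit() and int(field) in SIZES_BYTES:
                counts[int(field) // 1024] += 1
                break
    return dict(counts)

# Two lines borrowed from the sample listing above
sample = """22: 8937 4096 0 1 1 0 D 6adcb49e 56e34d34 53 dcc5d477 %CSP.UI.Portal.ECP.0
2691:16802 16384 0 0 7 0 P da8d596f 56e34c80 27 383da785 START"""
print(count_buffers_by_size(sample))  # {4: 1, 16: 1}
```

Comparing these counts against the configured buffer counts on the first line of the listing tells you which bucket sizes, if any, are exhausted.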
## Lock table size
The locksiz configuration parameter is the size (in bytes) of memory allocated for managing locks for concurrency control to prevent different processes from changing a specific element of data at the same time. Internally, the in-memory lock table contains the current locks, along with information about the processes that hold those locks.
Since memory used to allocate locks is taken from GMHEAP, you cannot use more memory for locks than exists in GMHEAP. If you increase the size of locksiz, increase the size of GMHEAP to match as per the formula in the GMHEAP section below. Information about application use of the lock table can be monitored using the system management portal (SMP), or more directly with the API:
set x=##class(SYS.Lock).GetLockSpaceInfo()
This API returns three values: "Available Space, Usable Space, Used Space". Check Usable Space and Used Space to roughly calculate suitable values (some lock space is reserved for the lock structure). Further information is available in the [IRIS documentation.](https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=RACS_locksiz)
Note: If you edit the locksiz setting, changes take place immediately.
## GMHEAP
The GMHEAP (the Generic Memory Heap) configuration parameter is defined as: Size (in kilobytes) of the generic memory heap for IRIS. This is the allocation from which the Lock table, the NLS tables, and the PID table are also allocated.
Note: Changing GMHEAP requires an IRIS restart.
To assist you in sizing for your application, information about GMHEAP usage can be checked using the API:
%SYSTEM.Config.SharedMemoryHeap
This API also provides the ability to get available generic memory heap and recommends GMHEAP parameters for configuration. For example, the DisplayUsage method displays all memory used by each of the system components and the amount of available heap memory. Further information is available in the [IRIS documentation](https://docs.intersystems.com/irislatest/csp/docbook/DocBook.UI.Page.cls?KEY=RACS_gmheap).
write $system.Config.SharedMemoryHeap.DisplayUsage()
The `RecommendedSize` method can give you an idea of GMHEAP usage and recommendations at any point in time. However, you will need to run this multiple times to build up a baseline and recommendations for your system.
write $system.Config.SharedMemoryHeap.RecommendedSize()
`Rule of thumb:` Once again your application mileage will vary, but somewhere to start your sizing could be:
(Minimum of 128 MB) or (64 MB * number of cores) or (2 x locksiz), whichever is larger.
Remember GMHEAP must be sized to include the lock table.
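The rule of thumb above amounts to taking the largest of the three terms, which can be sketched as:

```python
def gmheap_start_mb(cores, locksiz_mb):
    """Starting point for GMHEAP per the rule of thumb: the largest of
    a 128 MB minimum, 64 MB per core, or twice locksiz. All values in MB."""
    return max(128, 64 * cores, 2 * locksiz_mb)

print(gmheap_start_mb(cores=8, locksiz_mb=16))   # 512 - the per-core term wins
print(gmheap_start_mb(cores=2, locksiz_mb=512))  # 1024 - the lock table dominates
```

Because GMHEAP must contain the lock table, the `2 x locksiz` term ensures a large lock table never starves the other GMHEAP consumers.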
# Large/Huge pages
The short story is that huge pages on Linux have a positive effect on increasing system performance. However, the benefits will only be known if you test your application with and without huge pages. The benefits of huge pages for IRIS database servers are more than just performance -- which may only be ~10% improvement at best. There are other reasons to use huge pages; _When IRIS uses huge pages for shared memory, you guarantee that the memory is available for shared memory and not fragmented._
Note: By default, when huge/large pages are configured, InterSystems IRIS attempts to utilize them on startup. If there is not enough space, InterSystems IRIS reverts to standard pages. However, you can use the memlock parameter to control this behavior and fail at startup if huge/large page allocation fails.
As a sidebar for TrakCare, we do not automatically specify huge pages for non-production servers/VMs with small memory footprints (for example, less than 8 GB) or for utility servers (for example, print servers) running IRIS, because allocating memory for huge pages may end up orphaning memory. Worse, a bad calculation that undersizes huge pages can mean IRIS starts without using huge pages at all. As per our docs, when using huge pages, configure and start IRIS without huge pages first, look at the total shared memory at startup, and then use that figure to calculate the huge page allocation. See [Configuring Huge and Large Pages](https://docs.intersystems.com/iris20242/csp/docbook/DocBook.UI.Page.cls?KEY=ARES#ARES_memory_plan_pages).
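Assuming the common 2 MB x86-64 huge page size and an arbitrary safety margin (both are assumptions for illustration, not documented values), the calculation the docs describe looks like this:

```python
import math

def huge_pages_needed(shared_mem_mb, page_mb=2, headroom_mb=64):
    """Huge pages to reserve for the IRIS shared memory segment observed
    at startup. page_mb=2 assumes x86-64 huge pages; headroom is arbitrary."""
    return math.ceil((shared_mem_mb + headroom_mb) / page_mb)

# e.g. IRIS reports roughly 49,200 MB of shared memory at startup
print(huge_pages_needed(49200))  # 24632 pages, e.g. for vm.nr_hugepages on Linux
```

Undersizing this number is exactly the failure mode described above: IRIS silently falls back to standard pages, so it pays to err on the generous side.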
## Danger! Windows Large Pages and Shared Memory
IRIS uses shared memory on all platforms and versions, and it's a great performance booster, including on Windows, where it is always used. However, there are particular issues unique to Windows that you need to be aware of.
When IRIS starts, it allocates a single, large chunk of shared memory to be used for the database cache (global buffers), the routine cache (routine buffers), the shared memory heap, journal buffers, and other control structures. On IRIS startup, shared memory can be allocated using small or large pages. On Windows 2008 R2 and later, IRIS uses large pages by default; however, if a system has been running for a long time, fragmentation may mean contiguous memory cannot be allocated at IRIS startup, and IRIS can instead start using small pages.
Unexpectedly starting IRIS with small pages can cause it to start with less shared memory than defined in the configuration, or it may take a long time to start or fail to start. I have seen this happen on sites with a failover cluster where the backup server has not been used as a database server for a long time.
`Tip:` One mitigation strategy is periodically rebooting the offline Windows cluster server. Another is to use Linux.
# Physical Memory
Physical memory configuration is dictated by what is optimal for the processor. A bad memory configuration can significantly impact performance.
## Intel Memory configuration best practice
This information applies to __Intel__ processors only. Please confirm with vendors what rules apply to other processors.
Factors that determine optimal DIMM performance include:
- DIMM type
- DIMM rank
- Clock speed
- Position to the processor (closest/furthest)
- Number of memory channels
- Desired redundancy features.
For example, on Nehalem and Westmere servers (Xeon 5500 and 5600) there are three memory channels per processor and memory should be installed in sets of three per processor. For current processors (for example, E5-2600), there are four memory channels per processor, so memory should be installed in sets of four per processor.
When memory is not installed in sets of three or four per processor, or when DIMMs are different sizes, the resulting unbalanced configuration can impose a 23% memory performance penalty.
Remember that one of the features of IRIS is in-memory data processing, so getting the best performance from memory is important. It is also worth noting that for maximum bandwidth, servers should be configured for the fastest memory speed. For Xeon processors, maximum memory performance is only supported at up to 2 DIMMs per channel, so the maximum memory configuration for a common 2-CPU server is dictated by factors including CPU frequency and DIMM size (8 GB, 16 GB, etc.).
`Rules of thumb:`
- Use a balanced platform configuration: populate the same number of DIMMs for each channel and each socket
- Use identical DIMM types throughout the platform: same size, speed, and number of ranks.
- For physical servers, round up the total physical memory in a host server to the natural break points—64GB, 128GB, and so on—based on these Intel processor best practices.
# VMware Virtualisation considerations
I will follow up in a future post with more guidelines for when IRIS is virtualized. However, the following key rule should be considered for memory allocation:
`Rule:` Set VMware memory reservation on production systems.
As we have seen above when IRIS starts, it allocates a single, large chunk of shared memory to be used for global and routine buffers, GMHEAP, journal buffers, and other control structures.
You want to avoid any swapping of shared memory, so set your _production database VMs'_ memory reservation to at least the size of IRIS shared memory plus memory for IRIS processes and operating system and kernel services. If in doubt, reserve the full production database VM's memory.
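As a minimal sketch of that reservation arithmetic (all component figures below are assumptions for illustration, not recommendations):

```python
def min_reservation_gb(shared_gb, process_gb, os_gb=4):
    """Minimum VMware memory reservation for a production database VM:
    IRIS shared memory + IRIS processes + OS/kernel services (figures assumed)."""
    return shared_gb + process_gb + os_gb

# e.g. 48 GB shared memory, 16 GB of IRIS processes, 4 GB OS overhead
print(min_reservation_gb(shared_gb=48, process_gb=16))  # reserve at least 68 GB
```

If the components are hard to pin down, the simpler and safer choice stated above is to reserve the full configured memory of the VM.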
As a rule, if you mix production and non-production servers on the same systems, do not set memory reservations on non-production systems; let non-production servers fight over whatever memory is left ;). VMware often calls VMs with more than 8 CPUs 'monster VMs'. High-transaction IRIS database servers are often monster VMs. There are other considerations for setting memory reservations on monster VMs; for example, if a monster VM is to be migrated for maintenance or restarted by a High Availability trigger, the target host server must have sufficient free memory. There are strategies to plan for this, and I will talk about them in a future post along with other memory considerations, such as planning to make best use of NUMA.
# Summary
This is a start to capacity planning memory, a messy area - certainly not as clear cut as sizing CPU. If you have any questions or observations please leave a comment.
As this entry is posted I am on my way to Global Summit 2016. If you are attending this year, I will be talking about performance topics in two presentations, or I am happy to catch up with you in person in the developers area.

Thank you for the excellent articles, Murray. We use a slightly different approach for memory planning. Our app mostly runs as a set of concurrent user sessions, one process per user. It's known that average memory per process is 10 MB; we multiply it by 3 * N_concurrent_users, where the first multiplier (3) leaves a gap for memory spikes. So the result is the memory we leave for user processes. We try to leave as much memory as possible for the routine buffer cache, up to 1 GB. The global buffer memory is usually calculated as 30% of the 3-year-old database size for a given kind of customer. Usually it comes to 24-64 GB of global cache for medium to large hospitals and provides an RdRatio in the thousands (or tens of thousands). On the whole, we usually get numbers close to your 60/40 proportion, though my global buffer calculation method is not as precise as yours, and I feel I need a better calculation base for it.

Thanks for adding your experience. Yes, your method for sizing per-user processes makes perfect sense, and that is how I did it with client/server applications. I spend a lot of time now with a CSP (web) application, which has fewer server processes per user, so the calculations per user are different.
The same with memory so plentiful now: 1023 MB is often the default choice for the routine buffer, but smaller sites or small VMs may be adjusted down.
The 60/40 split came about from the need to size a new site, but I also like the idea of using a percentage for expected active data. In the end, the best path is to start in the ballpark with the rules we have and, with constant monitoring, adjust if and when needed.
Thanks again.
MO

Currently, we are running 2010.2 and I am following this to review our performance. In doing so, I read above, "The maximum size of routine buffers is 1023 MB." I was wondering if you could clarify what this means, as I'm finding that the maximum is 65,535. Thanks in advance.
Announcement
Janine Perkins · Jan 17, 2017
Take this online course to learn the foundations of the Caché ObjectScript language especially as it relates to use in creating variables and objects in Caché. You will learn about variables, commands, and operators as well as how to find more information using the InterSystems DocBooks when needed. This course contains instructional videos and exercises.Learn More.
Article
Evgeny Shvarov · May 7, 2017
Hi, Community!

Hope you know and use the Developer Community Analytics page, which is built with InterSystems DeepSee and DeepSee Web. We are playing with InterSystems iKnow analytics against Developer Community posts and have introduced a new dashboard, which shows the top 60 concepts for all posts. Or play with the filters and see the top concepts for the Atelier tag, or the top concepts of a particular member. Click on a concept on the left and see the related articles on the Developer Community. Here is a small gif of how it works. Next we plan to introduce concepts on Answers too, and to fix tags and introduce new tags according to the concept stats. Your ideas and feedback would be much appreciated.

Hi Evgeny, nice work! Maybe you can enhance the interface by also including an iKnow-based KPI on the dashboard, exposing the similar or related entities for the concept clicked in the heat map. You can subclass this generic KPI and specify the query you want it to invoke, and then use it as the data source for a table widget. Let me know if I can help. Thanks, Benjamin

Thank you, Benjamin! Yes, similar and related concepts are on the roadmap too; thanks for the useful link!
Article
Luca Ravazzolo · Sep 21, 2017
Last week saw the launch of the InterSystems IRIS Data Platform in sunny California.
For the engaging eXPerience Labs (XP-Labs) training sessions, my first customer and favourite department (Learning Services) was working hard assisting and supporting us all behind the scenes.
Before the event, Learning Services set up the most complicated part of public cloud :) "credentials-for-free" for a smooth and fast experience for all our customers at the summit. They did extensive testing before the event so that we could all spin up cloud infrastructures to test the various new features of the new InterSystems IRIS data platform without glitches.
The reason why they were so agile, nimble & fast in setting up all those complex environments is that they used technologies we provided straight out of our development furnace. OK, I'll be honest, our Online Education Manager, Douglas Foster and his team have worked hard too and deserve a special mention. :-)
Last week, at our Global Summit 2017, we had nine XP-Labs over three days. More than 180 people had the opportunity to test-drive new products & features.
The labs were repeated each day of the summit and customers had the chance to follow the training courses with a BYOD approach as everything worked (and works in the online training courses that will be provided at https://learning.intersystems.com/) inside a browser.
Here is the list of the XP-Labs given and some facts:
1) Build your own cloud
Cloud is about taking advantage of on-demand resources and the scalability, flexibility, and agility they offer. The XP-Lab focused on the process of quickly defining and creating a multi-node infrastructure on GCP. Using InterSystems Cloud Manager, students provisioned a multi-node infrastructure with a dynamically configured InterSystems IRIS data platform cluster that they could test by running a few commands. They also had the opportunity to unprovision it all with one single command, without having to click through a time-consuming web portal. I think it is important to understand that each student was actually creating her/his own virtual private cloud (VPC) with dedicated resources and dedicated InterSystems IRIS instances. Everybody was independent of each other; every student had her/his own cloud solution. There was no sharing of resources.
Numbers: we had more than a dozen students per session, each with her/his own VPC of 3 compute-nodes. With the largest class of 15 people, we ended up with 15 individual clusters; that is a total of 45 compute-nodes provisioned during the class, with 45 InterSystems IRIS instances running & configured in small shard clusters. There were a total of 225 storage volumes: respecting our best practices, we provide default volumes for the sharded DB, the JRN & WIJ files, and the Durable %SYS feature (more on this in another post later), plus the default boot OS volume.
2) Hands-On with Spark
Apache Spark is an open-source cluster-computing framework that is gaining popularity for analytics, particularly predictive analytics and machine learning. In this XP-Lab students used InterSystems' connector for Apache Spark to analyze data that was spread over a multi-node sharded architecture of the new InterSystems IRIS data platform.
Numbers: 42 Spark clusters were pre-provisioned by one person (thank you, Douglas, again). Each cluster consisted of 3 compute-nodes, for a total of 126 node instances. There were 630 storage volumes, for a total of 6.3 TB of storage used. The InterSystems person who pre-provisioned the clusters ran multiple InterSystems Cloud Manager instances in parallel to pre-provision all 42 clusters. The same Cloud Manager tool was also used to reset the InterSystems IRIS containers (stop/start/drop_table) and, at the end of the summit, to unprovision/destroy all clusters so as to avoid unnecessary charges.
3) RESTful FHIR & Messaging in Health Connect.
Students used Health Connect messaging and FHIR data models to transform and search for clinical data, applying various transformations to different message types.
Numbers: two paired containers per student were used for this class: one container ran the web-based Eclipse Orion editor, and the other ran the actual Health Connect instance. The containers ran over 6 different nodes managed by the Docker Swarm orchestrator.
Q&A
So how did our team achieve all of the above? How were they able to run all those training labs on the Google Cloud Platform? Did you know there was a backup plan (you never know in the cloud) to run on AWS? And did you know we could just as easily run on Microsoft Azure? How could all those infrastructures and instances be provisioned and configured so quickly, within practical lab sessions of no more than 20 minutes? Furthermore, how can we quickly and efficiently remove hundreds or thousands of resources without wasting hours clicking on web portals?
As you must have gathered by now, our Online Education team used the new InterSystems Cloud Manager to define, create, provision, deploy, and unprovision the cloud infrastructures and the services running on top of them. Secondly, everything customers saw, touched, and experienced ran in containers. What else these days? :-)
Summary
InterSystems Cloud Manager is a public, private and on-premises cloud tool that allows you to provision the infrastructure + configure + run InterSystems IRIS data platform instances.
Out of the box, Cloud Manager supports the top three public IaaS providers:
AWS
GCP and
Azure
but it can also assist you with a private and/or on-premises solution, as it supports
the VMware vSphere API and
Pre-Existing server nodes (either virtual or physical)
When I said "out of the box" above, I did not lie :) InterSystems Cloud Manager comes packaged in a container, so you do not have to install anything, configure any software, or set any variables in your environment. You just run the container, and you're ready to provision your cloud. Don't forget your credentials, though ;-)
The InterSystems Cloud Manager, although in the infancy of its MVP (minimum viable product) version, has already proven itself. It allows us to quickly run on and test various IaaS providers, provision a solution on-premises, or just carve out a cloud infrastructure according to our definition.
I like to define it as a "batteries included but swappable" solution. If you already have an installation and configuration solution built with configuration management (CM) tools (Ansible, Puppet, Chef, Salt, or others), and perhaps you want to test an alternative cloud provider, Cloud Manager allows you to create just the cloud infrastructure while you continue to build your systems with your CM tool. Just be careful of the unavoidable system drift over time. On the other hand, if you want to start embracing a more DevOps-style approach, appreciate the difference between the build phase and the run phase of your artifact, become more agile, and support multiple deliveries and possibly deployments per day, you can use InterSystems' containers together with the Cloud Manager.
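To give a feel for what "defining" a deployment looks like, ICM is driven by JSON configuration files: one for settings shared across all nodes and one for the node types and counts. The fragment below is purely a shape illustration under my own assumptions; field values, role names, and file contents vary by ICM version and provider, so treat nothing here as a tested configuration:

```json
{
    "_comment": "defaults-style settings shared by every node (illustrative values)",
    "Provider": "GCP",
    "Label": "XPLAB",
    "Tag": "DEMO",
    "DataVolumeSize": "10",

    "_definitions_comment": "definitions-style entries name a role and a count",
    "Definitions": [
        { "Role": "DM", "Count": "1" },
        { "Role": "DS", "Count": "2" }
    ]
}
```

With files like these in place, the workflow the students followed boils down to a provision step, a deploy/run step, and a single unprovision command at the end.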
The tool can provision and configure both the new InterSystems IRIS data platform sharded cluster and traditional architectures (ECP client-servers + Data server with or without InterSystems Mirroring).
At the summit, we also had several technical sessions on Docker containers and two on the Cloud Manager tool itself. All sessions registered a full house, and I heard that many other sessions were packed too. I was particularly impressed with the Docker container introductory session on Sunday afternoon, where I counted 75 people. I don't think we could have fitted anybody else in the room. I thought people would have gone to the swimming pool :) Instead, I think we had a clear sign that our customers like innovation and are keen to learn.
Below is a picture depicting how our Learning Services department allowed us to test-drive the Cloud Manager at the XP-Lab. They ran a container based on the InterSystems Cloud Manager and added an nginx web server so that we could connect to it over HTTP. The web server delivers a simple single page that loads a browser-based editor (Eclipse Orion); at the bottom of the screen, the student is connected directly to the shell (GoTTY via websocket) of the same container, so that he or she can run the provisioning and deployment commands. This training container, with all these goodies :) runs on a cloud -of course- and, thanks to the pre-installed InterSystems Cloud Manager, students can provision and deploy a cluster solution on any cloud (just provide credentials).
To learn more about InterSystems Cloud Manager
here is an introductory video: https://learning.intersystems.com/course/view.php?id=756
and the global summit session: https://learning.intersystems.com/mod/page/view.php?id=2864
and InterSystems & Containers
here are some of the sessions from GS2017:
https://learning.intersystems.com/course/view.php?id=685
https://learning.intersystems.com/course/view.php?id=696
https://learning.intersystems.com/course/view.php?id=737
--
When will the Experience Labs be available on learning.intersystems.com?

We are currently working to convert the experience labs into online experiences. The FHIR Experience will be ready in the next two weeks, closely followed by the Spark Experience. We have additional details to work out for the Build Your Own Cloud experience, as it runs by building in our InterSystems cloud and can consume a lot of resources, but we expect to get that worked out in the next 4-6 weeks.

Thanks Luca for the mention above, but it was a large team effort with several people from the Online Learning team as well as product managers, sales engineers, etc.

@Eduard Lebedyuk & all: you can find a "getting started" course at https://learning.intersystems.com/ HTH
Announcement
Steve Brunner · Jan 31, 2018
InterSystems is pleased to announce that InterSystems IRIS Data Platform 2018.1.0 is now released. This press release was issued this morning.

InterSystems IRIS is a complete, unified data platform that makes it faster and easier to build real-time, data-rich applications. It mirrors our design philosophy that software should be interoperable, reliable, intuitive, and scalable.

For information about InterSystems IRIS, please visit our website here. You'll find out why InterSystems IRIS is the first complete data platform suitable for transactional, analytic, and hybrid transactional-analytic applications. See how our InterSystems Cloud Manager enables open and rapid deployment on public and private clouds, bare metal, or virtual machines. Review our vertical and horizontal scaling to ensure high performance regardless of your workloads, data sizes, or concurrency needs. Discover how to use familiar tools and languages like Java and REST to interoperate with all your data.

To interactively learn more about InterSystems IRIS, we've introduced the InterSystems IRIS Experience, a new environment that allows guided and open access to InterSystems IRIS Data Platform for exploring its powerful features. The Experience offers you the ability to solve challenges and build solutions using real-world data sets. With compelling scenarios, such as big data analytics and predicting patterns of financial fraud, developers and organizations can experience the power of InterSystems IRIS firsthand.

You can also review the complete online documentation.

What happens with Zen in IRIS? The %ZEN classes are hidden in the Class Reference and there is nothing in the documentation. Is it deprecated? Did anything else also disappear since Caché/Ensemble?
- no DOCBOOK
- no SAMPLES
- no JDBC driver in the install, but there is C:\InterSystems\IRIS\dev\java\lib\JDK18\isc-jdbc-3.0.0.jar .....
- and a very small icon for csystray.exe in Win

Hi Robert,

DocBook has now moved fully online, which is what the mgmt portal will link to: http://docs.intersystems.com/iris

SAMPLES included quite a few outdated examples and was also not appropriate for many non-dev deployments, so we've also moved to a different model there, posting the most relevant ones on GitHub, giving us more flexibility to provide updates and new ones: https://github.com/intersystems?q=samples

JDBC driver: to what extent is this different from the past? It's always just been available as a jarfile, as is customary for JDBC drivers. We do hope to be able to post it through Maven repositories in the near future though.

Small icons: yeah, to make our installer and (more importantly) the container images more lightweight, we had to economize on space. Next to the removal of DocBook and Samples, using smaller icons also reduces the size in bytes ;) ;)

InterSystems IRIS is giving us the opportunity to adopt a contemporary deployment model, where we were somewhat restricted by long-term backwards compatibility commitments with Caché & Ensemble. Some of these will indeed catch your eye and might even feel a little strange at first, but we really believe the new model makes developing and deploying applications easier and faster. Of course, we're open to feedback on all of this evolution, and this is a good channel to hear from you.

Thanks!
benjamin

Hi Dmitry,

Zen is indeed no longer a central piece of our application development strategy. We'll support it for some time to come (your Zen app still works on IRIS), but our focus is on providing a fast and scalable data management platform rather than GUI libraries.
In that sense, you may already have noticed that recent courses we published on the topic of application development focus on leveraging the right technologies to connect to the backend (i.e. REST) and suggest using best-of-breed third-party technologies (i.e. Angular) for web development.

InterSystems IRIS is a new product where we're taking advantage of our Caché & Ensemble heritage. It's meant to address today's challenges when building critical applications, and we've indeed leveraged a number of capabilities from those products, but also added a few significant new ones like containers, cloud & horizontal scalability. We'll shortly be providing an overview of elements to check for Caché & Ensemble customers that would like to migrate to InterSystems IRIS (i.e. differences in supported platforms), but please don't consider this merely an upgrade. You may already have noticed the installer doesn't support upgrading anyhow.

Thanks,
benjamin

Hi Ben,
- I like the idea of external samples. That definitely offers more flexibility.
- DOCUMATIC is unchanged and works locally! That's important. OK
- JDBC: it isn't visible in Custom Install. You only see xDBC -> ODBC. Not an issue, rather a surprise. The .jar files are where they used to be before.
I'm really happy that we can finally get out of the old chains imposed by 40 years (DSM-11 and others) of backward compatibility.
Robert

Hooray :) What date is the Zen end-of-life planned for (i.e. when is it supported until and when will it be removed)?

Hi Ben,
I just installed a few samples as in the description. CONGRATULATIONS!
- not only do I get what I want and leave the rest aside,
- I also can split samples by subject into several DBs and namespaces without being trapped by (hidden) dependencies.
I think this makes life for training and teaching significantly easier! And it allows central bug fixing independent of core release dates. A great idea! Thanks!

And anybody can even offer their own examples. Right!! A major step forward!
Regarding Samples, see also from the InterSystems IRIS documentation: Downloading Samples for Use with InterSystems IRIS

What about Health Connect clients? Does IRIS include components that are part of HealthShare Elite?
Thanks, Yury

Hi, I need IRIS. How do I download it?

Do you have login credentials for WRC Online at http://wrc.intersystems.com/? Once you have logged in, use the "Online Distributions" link. You'll need to contact your InterSystems account manager to request a license key. Alternatively, get your hands on InterSystems IRIS in the cloud here.
Article
Niyaz Khafizov · Jul 6, 2018
Hi all. Yesterday I tried to connect Apache Spark, Apache Zeppelin, and InterSystems IRIS. During the process, I ran into trouble connecting it all together and did not find a useful guide, so I decided to write my own.
Introduction
Let's see what Apache Spark and Apache Zeppelin are and how they work together. Apache Spark is an open-source cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance, so it is very useful when you need to work with Big Data. Apache Zeppelin is a notebook that provides a nice UI for working with analytics and machine learning. Together, they work like this: IRIS provides data, Spark reads the provided data, and in a notebook we work with the data.
Note: I have done the following on Windows 10.
Apache Zeppelin
Now we will install all the necessary programs. First of all, download Apache Zeppelin from the official site. I used zeppelin-0.8.0-bin-all.tgz; it includes Apache Spark, Scala, and Python. Unzip it to any folder. After that, you can launch Zeppelin by calling \bin\zeppelin.cmd from the root of your Zeppelin folder. Wait until the "Done, zeppelin server started" string appears, then open http://localhost:8080 in your browser. If everything is okay, you will see the "Welcome to Zeppelin!" message.
Note: I assume that InterSystems IRIS is already installed. If not, download and install it before the next step.
Apache Spark
So, we have Zeppelin open in the browser. In the upper-right corner click on anonymous, and then click on Interpreter. Scroll down and find spark.
Next to spark, find the edit button and click on it. Scroll down and add dependencies on intersystems-spark-1.0.0.jar and intersystems-jdbc-3.0.0.jar. I installed InterSystems IRIS to the C:\InterSystems\IRIS\ directory, so the artifacts I need to add are at:
My files are here:
And save it.
Check that it works
Let us check it. Create a new note, and in a paragraph paste the following code:
var dataFrame = spark.read.format("com.intersystems.spark")
  .option("url", "IRIS://localhost:51773/NAMESPACE")
  .option("user", "UserLogin")
  .option("password", "UserPassword")
  .option("dbtable", "Sample.Person")
  .load()
// dbtable - name of your table
URL - the IRIS address. It is formed as follows: IRIS://ipAddress:superserverPort/namespace:
the IRIS protocol is a JDBC connection over TCP/IP that also offers a Java shared memory connection;
ipAddress — The IP address of the InterSystems IRIS instance. If you are connecting locally, use 127.0.0.1 instead of localhost;
superserverPort — the superserver port number for the IRIS instance, which is not the same as the webserver port number. To find the superserver port number, in the Management Portal go to System Administration > Configuration > System Configuration > Memory and Startup;
namespace — an existing namespace in the InterSystems IRIS instance. In this demo, we connect to the USER namespace.
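To keep those three pieces straight, here is a tiny, hypothetical Python helper (my own illustration, not part of the connector API) that assembles a connection URL in the format described above:

```python
def iris_url(host, superserver_port, namespace):
    """Build a Spark connector URL of the form IRIS://host:port/namespace.

    host             - address of the IRIS instance (e.g. "127.0.0.1")
    superserver_port - the superserver port, NOT the webserver port
    namespace        - an existing namespace, e.g. "USER"
    """
    return "IRIS://{}:{}/{}".format(host, superserver_port, namespace)

# Example: the URL used in the paragraph above
print(iris_url("localhost", 51773, "NAMESPACE"))  # IRIS://localhost:51773/NAMESPACE
```

The helper only concatenates strings; the connector itself parses the URL when you pass it as the "url" option.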
Run the paragraph. If everything is okay, you will see FINISHED.
My notebook:
Conclusion
In conclusion, we found out how Apache Spark, Apache Zeppelin, and InterSystems IRIS can work together. In my next articles, I will write about data analysis.
Links
The official site of Apache Spark
Apache Spark documentation
IRIS Protocol
Using the InterSystems Spark Connector
💡 This article is considered an InterSystems Data Platform Best Practice.
Announcement
Evgeny Shvarov · Aug 21, 2018
Hi Community!

This year we will have a special section for Flash Talks, which gives you an opportunity to introduce your tool or solution at InterSystems Global Summit 2018! What is a Flash Talk? It's a 15-minute session you have on the Technology Exchange stage: 10 minutes for your pitch, 5 minutes for Q&A. The session WILL BE live streamed on the Developer Community YouTube Channel.

Developer Community Flash Talks!

Today, 10/02, Flash Talks Stage @ InterSystems Global Summit 2018!
2 pm Open source approaches to work with Documents @Eduard.Lebedyuk, InterSystems
2-15 InterSystems IRIS on Kubernetes by @Dmitry.Maslennikov
2-30 Visual Studio Code IDE for InterSystems Data Platforms by @John.Murray, GeorgeJames Software
2-45 Static Analysis for ObjectScript with CacheQuality by @Daniel.Tamajon, Lite Solutions
3-00 InterSystems Open Exchange by @Evgeny.Shvarov, InterSystems
3-15 Q&A Session on Developer Community, Global Masters, and Open Exchange

Well! We already have two slots taken!

One is: Static Analysis for ObjectScript with CacheQuality. It's a CI-compatible static analysis tool for ObjectScript codebases which finds bugs and enforces code guidelines using a set of managed rules from Lite Solutions (which would now perhaps sound better as IRISQuality, @Daniel.Tamajon? ;)

Another slot is a secret topic, to be announced just before the GS. So we have two more slots: book your session on InterSystems IRIS Flash Talks from the Developer Community!

Yet another topic for InterSystems IRIS Flash Talks: InterSystems IRIS on Kubernetes by @Dmitry.Maslennikov

A few days before Global Summit 2018! And I can announce yet another presenter: @Eduard.Lebedyuk, InterSystems. Title: Open source approaches to work with Documents.

And we have the day and time! Find the Developer Community Flash Talks "Share Your InterSystems IRIS Solution!" on Tuesday, the 2nd of October, on the Flash Talks Stage from 2 pm to 3-30 pm.

And we have another flash talk topic: "Visual Studio Code IDE for InterSystems Data Platforms" from @John.Murray! So many exciting topics!
Looking forward!

And one more topic from me: InterSystems Open Exchange, a marketplace of solutions, tools, and adapters for InterSystems Data Platforms! The updated agenda is here and in the topic:

Tuesday 10/02, at Tech Exchange, Global Summit 2018!
2 pm Open source approaches to work with Documents @Eduard Lebedyuk, InterSystems
2-15 InterSystems IRIS on Kubernetes by @Dmitry Maslennikov
2-30 Visual Studio Code IDE for InterSystems Data Platforms by @John Murray, GeorgeJames Software
2-45 Static Analysis for ObjectScript with CacheQuality by @Daniel Tamajon, Lite Solutions
3-00 InterSystems Open Exchange by @Evgeny Shvarov, InterSystems
3-15 Q&A Session on Developer Community, Global Masters and Open Exchange

This is a live broadcast recording from Developer Community Flash Talks.

Great stuff. Thanks. Where can I get the sample files and/or some instructions on the use of Kubernetes as demonstrated?

I bet @Dmitry.Maslennikov can share the info. Dmitry?

I have done Kubernetes deployments on a couple of projects, but both of them are private, so I can't share them. If you need some help, you can contact me directly and I can share some ideas about how it can be done.

Thanks Dmitry.
Announcement
Steve Brunner · Jun 5, 2018
InterSystems is pleased to announce the release of InterSystems IRIS Data Platform 2018.1.1
This is our first follow-up to InterSystems IRIS 2018.1.0, released earlier this year. InterSystems IRIS is a unified data platform for building scalable multi-workload, multi-model data management applications with native interoperability and an open analytics platform. As a reminder, you can learn more about InterSystems IRIS Data Platform from these Learning Services resources.
Hundreds of fixes have been made in this release:
Many changes that will greatly reduce future compatibility issues.
Improvements to the documentation, user interface, and packaging to standardize naming of InterSystems IRIS components.
Significant reliability improvements to our sharded query processing architecture, as well as unifying all application code into a single database. This unification eliminates redundancy and the potential for inconsistency, and provides a foundation for sharding and distributed processing across all our data models.
Please read this document for complete details about the supported platforms, including cloud platforms and docker containers.
For those of you who want to get hands-on quickly, we urge you to try out our InterSystems IRIS Experience. You’ll notice there is a brand new Java experience.
The build corresponding to this release is 2018.1.1.643.0

To all the InterSystems engineers: great job, and I am happy to use this product soon in my solutions.

Great, but where is Caché 2018 (or at least the field test)?

Are there any release notes? I can't seem to find any.
Regards,
George
www.georgejames.com

Caché and Ensemble 2018.1 Field Test is coming soon this summer.

Unlike most maintenance releases, we've not created separate release notes for this release. Relevant information has been added directly to the main documents.

GREAT! We're waiting for Caché 2018 to upgrade all of our instances, because 2017 has issues with ODBC.

"...2017 has issues with ODBC" - Hi Kurt, could you drop a few words on the subject: what kind of issues does it have? (Just started moving the clients to 2017.2.1...)

Hello, we're having issues with our Delphi client where it receives empty datasets. Response from WRC: the problem was that Delphi switches from read uncommitted to read committed mode, and we had a problem there with joins on null values. The fix will be included in the next maintenance kits for 2017.2.x. The change already went into 2018.1, so it will be in all versions 2018+.

Has InterSystems yet published any guidance to help existing Caché or Ensemble users move their code onto IRIS? I haven't yet found that kind of document at https://docs.intersystems.com/

Yes, we maintain an adoption guide that covers exactly that purpose. In order to be able to properly follow up on questions you'd have, we're making it available through your technical account team (sales engineer or TAM) rather than shipping it with the product.

Maybe worth stating that in the product docs?

Hi, what issues are you having with ODBC? We have a lot of clients connecting to our Cache DB through ODBC.
Article
Niyaz Khafizov · Aug 3, 2018
Hi all. Today we are going to install Jupyter Notebook and connect it to Apache Spark and InterSystems IRIS.
Note: I have done the following on Ubuntu 18.04, Python 3.6.5.
Introduction
If you are looking for a well-known, widespread notebook that is especially popular among Python users, instead of Apache Zeppelin you should choose Jupyter Notebook. Jupyter Notebook is a very powerful and great data science tool, with a lot of additional software and integrations. It allows you to create and share documents that contain live code, equations, visualizations, and narrative text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. And most importantly, it has a big community that will help you solve the problems you face.
Check requirements
If something doesn't work, look at the "Possible problems and solutions" section at the bottom.
First of all, ensure that you have Java 8 (java -version returns "1.8.x"). Next, download Apache Spark and unzip it. Then run the following in the terminal:
pip3 install jupyter
pip3 install toree
jupyter toree install --spark_home=/path_to_spark/spark-2.3.1-bin-hadoop2.7 --interpreters=PySpark --user
Now, open the terminal and run vim ~/.bashrc. Paste the following environment variables at the bottom:
export JAVA_HOME=/usr/lib/jvm/installed java 8
export PATH="$PATH:$JAVA_HOME/bin"
export SPARK_HOME=/path to spark/spark-2.3.1-bin-hadoop2.7
export PATH="$PATH:$SPARK_HOME/bin"
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
And run source ~/.bashrc.
Check that it works
Now, let us launch Jupyter notebook. Run pyspark in the terminal.
Open the returned URL in your browser. It should look something like the image below:
Click on New, choose Python 3, and paste the following code into a paragraph:
import sys
print(sys.version)
sc
Your output should look like this:
Stop Jupyter using Ctrl-C in the terminal.
Note: to add custom jars, just move the desired jars into $SPARK_HOME/jars.

So, we want to work with intersystems-jdbc and intersystems-spark (we will also need the jpmml library). Let us copy the required jars into Spark. Run the following in the terminal:
sudo cp /path to intersystems iris/dev/java/lib/JDK18/intersystems-jdbc-3.0.0.jar /path to spark/spark-2.3.1-bin-hadoop2.7/jars
sudo cp /path to intersystems iris/dev/java/lib/JDK18/intersystems-spark-1.0.0.jar /path to spark/spark-2.3.1-bin-hadoop2.7/jars
sudo cp /path to jpmml/jpmml-sparkml-executable-version.jar /path to spark/spark-2.3.1-bin-hadoop2.7/jars
Ensure that it works. Run pyspark in the terminal again and run the following code (from the previous article):
from pyspark.ml.linalg import Vectors
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import KMeans
from pyspark.ml import Pipeline
from pyspark.ml.feature import RFormula
from pyspark2pmml import PMMLBuilder
dataFrame = spark.read.format("com.intersystems.spark")\
    .option("url", "IRIS://localhost:51773/NAMESPACE")\
    .option("user", "dev")\
    .option("password", "123")\
    .option("dbtable", "DataMining.IrisDataset").load()  # load iris dataset
(trainingData, testData) = dataFrame.randomSplit([0.7, 0.3])  # split the data into two sets
assembler = VectorAssembler(inputCols=["PetalLength", "PetalWidth", "SepalLength", "SepalWidth"], outputCol="features")  # add a new column with features
kmeans = KMeans().setK(3).setSeed(2000) # clustering algorithm that we use
pipeline = Pipeline(stages=[assembler, kmeans])  # first, data will run through the assembler, then through kmeans
modelKMeans = pipeline.fit(trainingData)  # pass training data
pmmlBuilder = PMMLBuilder(sc, dataFrame, modelKMeans)
pmmlBuilder.buildFile("KMeans.pmml")  # create pmml model
My output:
The output file is a jpmml KMeans model. Everything works!
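Since the exported PMML file is plain XML, a quick way to sanity-check the export is to parse it and look at the root element. This small helper is my own addition, not part of jpmml or pyspark2pmml:

```python
import xml.etree.ElementTree as ET

def pmml_root_tag(path):
    """Parse a PMML file and return its root tag without the XML namespace.

    A successful parse with a root tag of "PMML" suggests the export
    produced a well-formed document.
    """
    root = ET.parse(path).getroot()
    # Tags of namespaced XML look like "{http://...}PMML"; keep the local part.
    return root.tag.rsplit("}", 1)[-1]

# Usage, after running the notebook above:
# print(pmml_root_tag("KMeans.pmml"))  # expected: PMML
```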
Possible problems and solutions
command not found: 'jupyter':
vim ~/.bashrc;
add at the bottom: export PATH="$PATH:~/.local/bin";
in terminal source ~/.bashrc.
If it doesn't help, reinstall pip3 and jupyter.
env: 'jupyter': No such file or directory:
In ~/.bashrc export PYSPARK_DRIVER_PYTHON=/home/.../.local/bin/jupyter.
TypeError: 'JavaPackage' object is not callable:
Check that the required .jar file is in /.../spark-2.3.1-bin-hadoop2.7/jars;
Restart notebook.
Java gateway process exited before sending the driver its port number:
Your Java version should be 8 (probably works with Java 6/7 too, but I didn't check it);
echo $JAVA_HOME should return the Java 8 path. If not, change the path in ~/.bashrc;
Paste sudo update-alternatives --config java in the terminal and choose a proper java version;
Paste sudo update-alternatives --config javac in the terminal and choose a proper java version.
PermissionError: [Errno 13] Permission denied: '/usr/local/share/jupyter'
Add --user at the end of your command in the terminal
Error executing Jupyter command 'toree': [Errno 2] No such file or directory
Run the command without sudo.
A specific error may appear if you use system variables like PYSPARK_SUBMIT_ARGS or other spark/pyspark variables, or because of changes to /.../spark-2.3.1-bin-hadoop2.7/conf/spark-env.sh.
Delete these variables and check spark-env.sh.
Links
Jupyter
Apache Toree
Apache Spark
Load a ML model into InterSystems IRIS
K-Means clustering of the Iris Dataset
The way to launch Apache Spark + Apache Zeppelin + InterSystems IRIS
💡 This article is considered an InterSystems Data Platform Best Practice.
Announcement
Anastasia Dyubaylo · Aug 14, 2018
Hi Community!
New video "Deploying Shards Using InterSystems Cloud Manager" is available now on InterSystems Developers YouTube:
In this demonstration, you will learn how to deploy a sharded cluster using InterSystems Cloud Manager (ICM), including defining the deployment, provisioning the architecture, and deploying and managing services.
What is more?
You are very welcome to watch all the videos about ICM in a dedicated InterSystems Cloud Manager playlist.
Don't forget to subscribe to our InterSystems Developers YouTube Channel.
Enjoy and stay tuned!
Announcement
Anastasia Dyubaylo · Nov 2, 2018
Hi Developers!

New video from Global Summit 2018 is available now on the InterSystems Developers YouTube Channel: Unit Test Coverage in InterSystems ObjectScript

InterSystems is sharing a tool they use for measuring test coverage. Watch this session recording and learn how you can measure the effectiveness of your existing unit tests, identify weak areas, enable improvement, and track results over time.

Takeaway: I know how to use the tool InterSystems provides to improve my unit tests.

Presenter: @Timothy.Leavitt, Developer on the HealthShare Development Team.

Note: the Test Coverage Tool is also available on InterSystems Open Exchange.

And... content related to this session, including slides, video and additional learning content, can be found here.

Don't forget to subscribe to our InterSystems Developers YouTube Channel. Enjoy and stay tuned!
Article
Erik Hemdal · Jul 22, 2019
One of my colleagues at InterSystems encountered an unexpected issue when running InterSystems IRIS on a Macintosh in a container using Docker for Mac. I'd like to share what we found, so you might avoid running into similar issues.

The Problem

The task at hand was running a Java application with XEP to do a large data load into IRIS. When running the data load, the write daemon hung soon after starting the job, with messages like these in messages.log:

05/21/19-14:57:50:625 (757) 2 Process terminated abnormally (pid 973, jobid 0x00050016) (was a global updater)
05/21/19-14:58:52:990 (743) 2 CP: Pausing users because the Write Daemon has not shown signs of activity for 301 seconds. Users will resume if Write Daemon completes a pass or writes to disk (wdpass=98).

This problem was completely reproducible and very mysterious, so Support got involved.

What we found

We were able to start the SystemPerformance utility while reproducing the problem and discovered the issue readily. In the iris.cpf file, the cache for 8KB databases was set to 4GB:

globals=0,0,4096,0,0,0

That looked reasonable for an instance running on a machine with 8GB of memory, and since this was a test, the Mac was otherwise not heavily loaded. However, not all of that system memory was actually available to IRIS, as we saw in the output of the Linux free command inside the container (values in MB):

total  used  free  shared  buff/cache  available  swap total  swap used  swap free
 1998   331   322     513        1344       1003        1023         11       1012
 1998   340   312     513        1345        994        1023         11       1012
 . . .
 1998   272    72    1563        1653         44        1023        105        918
 . . .
 1998   123    67    1770        1807         12        1023        870        153
 . . .
 1998   135    54    1777        1809         14        1023       1023          0

Only about 2GB was actually available.
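The arithmetic is worth making explicit: the third field of the globals= line above is the 8KB buffer pool in MB (4096), while free showed only about 2GB actually available inside the VM. A back-of-the-envelope check like the following (my own illustrative sketch, not an InterSystems sizing rule; the 512MB headroom figure is an assumption) would have flagged the overallocation:

```python
def globals_pool_fits(available_mb, globals_8k_mb, headroom_mb=512):
    """Rough check: does the configured 8KB global buffer pool fit in the
    memory actually available to the container, leaving some headroom for
    routine buffers, gmheap, and processes? Thresholds are illustrative."""
    return globals_8k_mb + headroom_mb <= available_mb

# The scenario from the article: 4096MB of buffers vs ~1998MB actually available
print(globals_pool_fits(1998, 4096))  # False: the instance is overallocated
```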
During the heavy data load, IRIS rapidly consumed the database cache until all available memory and swap space was exhausted, at which point the instance hung.

The Cause

Docker relies heavily on some key Linux technologies, particularly cgroups and namespaces, that aren't available natively on platforms like Macintosh and Windows. On these platforms, Docker uses a Linux virtual machine internally: in the case of the Macintosh, this is provided by HyperKit. And as we found, it is possible to overallocate memory on this platform and configure IRIS with more memory than is actually available. If you are using Docker for Mac as your development platform, keep this internal VM in mind and size memory appropriately.

I think it would be good to add a screenshot like this, to show how to configure memory limits in macOS. On Windows it should be quite similar, I think.

Thanks Dmitry! It looks like you did it.
Announcement
Anastasia Dyubaylo · Jul 29, 2019
Hi Everyone!
Please watch the new video on InterSystems Developers YouTube, recorded by @Sourabh.Sethi6829 in a new format called "Coding Talks":
A SOLID Design in InterSystems ObjectScript
In this session, we will discuss the SOLID principles of programming and implement them in an example. We have used the Caché object programming language for the examples. We will go step by step: understanding the requirement, the common mistakes we tend to make while designing, each principle in turn, and finally the complete design with its implementation via Caché objects.
Additional resources:
CodeSet
Presentation
Also, check out the first part of "Locking in InterSystems ObjectScript" Coding Talk.
If you have any questions or suggestions, please write to @Sourabh.Sethi6829 at sethisourabh.hit@gmail.com.
Enjoy watching this video!