Hi all. We are going to find duplicates in a dataset using Apache Spark Machine Learning algorithms.

Note: I have done the following on Ubuntu 18.04, Python 3.6.5, Zeppelin 0.8.0, Spark 2.1.1

Introduction

In previous articles we have done the following:

0 0
1 677

We’re now less than a month away from our annual conference, the InterSystems Global Summit. This year, we’ll be descending on the beautiful outskirts of San Antonio, a city worth visiting for its wonderful river walkway and its 18th century Spanish Mission, even if it hadn’t been the location of this year’s InterSystems event. Leaving the tourist guidance to the tourist guides, let’s take a closer look at what the conference has in stock for you, including a dedicated post-summit symposium on AI and ML on Wednesday October 3!

4 0
0 299

The source class of a DeepSee cube has a property referencing a different class:

Class ClassA Extends %Persistent {
     Property P1 As ClassB;
}

When records in class B change, the ^OBJ.DSTIME global for Class A will not be automatically updated. This means that synchronization of cubes based on source class A will not reflect the changes occurred to property P1.
This post will help you determine the best way to achieve synchronization of properties referencing a different class

4 2
2 458
Article
· Jul 27, 2018 4m read
Load a ML model into InterSystems IRIS

Hi all. Today we are going to upload a ML model into IRIS Manager and test it.

Note: I have done the following on Ubuntu 18.04, Apache Zeppelin 0.8.0, Python 3.6.5.

Introduction

These days many available different tools for Data Mining enable you to develop predictive models and analyze the data you have with unprecedented ease. InterSystems IRIS Data Platform provide a stable foundation for your big data and fast data applications, providing interoperability with modern DataMining tools.

6 2
2 1.3K

I was approached recently by and end use who wanted to perform analysis of their databases and see how they could save some space by picking data good for deletion without harming the application. As part of investigation, they wanted to know sizes of globals within datasets. This can be achieved by various means but all of them provide data in text form only.

I thought I might be a good tool for database administrators in general - to see global sizes in a graphical way.

2 1
0 362
Announcement
· May 21, 2018
How is your code health

As a developer, usually I'm concerned about how my code health is, and how the other coders code can affect to my own work. And I'm quite sure most of us feel very similar.

In our company we use a Static Code Analysis tool to analyze code for different languages to ensure we are writing high quality and easily maintainable code by following a few best practices in terms of code structure and content. And the question was: why should be different for Caché ObjectScript language?

6 3
4 1.1K

The following post outlines a more flexible architectural design for DeepSee. As in the previous example, this implementation includes separate databases for storing the DeepSee cache, DeepSee implementation and settings, and synchronization globals. This example introduces one new databases to store the DeepSee indices. We will redefine the global mappings so that the DeepSee indices are not mapped together with the fact and dimension tables.

2 1
0 580

The following post outlines an architectural design of intermediate complexity for DeepSee. As in the previous example, this implementation includes separate databases for storing the DeepSee cache, DeepSee implementation and settings. This post introduces two new databases: the first to store the globals needed for synchronization, the second to store fact tables and indices.

5 5
0 735

Hi Everyone!

New session recording from Global Summit 2017 is already on InterSystems Developers YouTube:

Predicting Your Manhattan Cab Ride Fare

https://www.youtube.com/embed/E3o87dMxamE
[This is an embedded link, but you cannot view embedded content directly on the site because you have declined the cookies necessary to access it. To view embedded content, you would need to accept all cookies in your Cookies Settings]

1 0
0 223

Hi, Community!

I’m sure you are using Developer Community analytics built with InterSystems Analytics technology DeepSee:

You can find DC analytics n InterSystems->Analytics menu.

DC Analytics shows interactive dashboards on key figures of DC entities: Posts, Comments, and Members.

Since the last week, this analytics project is available for everyone with source code and data on DC Github!

2 1
1 579

Hi, Community!

This is the 3rd part of DeepSee Web story - Angular base UI for DeepSee Dashboards, see the beginning here.

By design, DSW provides an implementation for every widget in DeepSee library. But there are some extra features in DSW which make solutions built with DSW dashboards more functional. This article describes it.

2 0
1 876

With the release of InterSystems IRIS, we're also making available a nifty bit of software that allows you to get the best out of your InterSystems IRIS cluster when working with Apache Spark for data processing, machine learning and other data-heavy fun. Let's take a closer look at how we're making your life as a Data Scientist easier, as you're probably already facing tough big data challenges already, just from the influx of job offers in your inbox!

2 2
0 1.6K

There are several options how to deliver user interface(UI) for DeepSee BI solutions. The most common approaches are:

  • use native DeepSee Dashboards, get web UI in Zen and deliver it in your web apps.
  • use DeepSee REST API, get and build your own UI widgets and dashboards.

The 1st approach is good because of the possibility to build BI dashboards without coding relatively fast, but you are limited with preset widgets library which is expandable but with a lot of development efforts.

The 2nd provides you the way to use any comprehensive js framework (D3, Highcharts, etc) to visualize your DeepSee data, but you need to code widgets and dashboards on your own.

Today I want to tell you about yet another approach which combines both listed above and provides Angular based web UI for DeepSee Dashboards - DeepSee Web library.

3 16
5 2.3K

Hi, Community!

Please find a new session recording from Global Summit 2017:

An InterSystems Guide to the Data Galaxy

https://www.youtube.com/embed/cudBxwI1xTA
[This is an embedded link, but you cannot view embedded content directly on the site because you have declined the cookies necessary to access it. To view embedded content, you would need to accept all cookies in your Cookies Settings]

2 0
0 250

Hi guys,

I'm trying to immigrate some of my HealthInsight dashboards and pivot tables to another HS instance.

In some pivot tables, I defined them with a set of calculated dimensions defined in the analyzer, e.g as below:

Then when I exported the cubes and pivot tables in used to my new envirmonment. When I open my pivot tables again, the calculated dimensions are missing and hence my pivot tables no longer work:

0 6
0 434

Learning Services Live Webinars are back!

At this year’s Global Summit, InterSystems debuted InterSystems IRIS Data Platform™, a single, comprehensive product that provides capabilities spanning data management, interoperability, transaction processing, and analytics. InterSystems IRIS sets a new level of performance for the rapid development and deployment of data-rich and mission-critical applications. Now is your chance to learn more!

3 1
0 574

Last week, we announced the InterSystems IRIS Data Platform, our new and comprehensive platform for all your data endeavours, whether transactional, analytics or both. We've included many of the features our customers know and loved from Caché and Ensemble, but in this article we'll shed a little more light on one of the new capabilities of the platform: SQL Sharding, a powerful new feature in our scalability story.

13 11
1 1.5K

System Monitor is a flexible and highly configurable tool supplied with Caché (Ensemble, HealthShare), which collects the essential metrics of the operating system and Caché itself. System Monitor also notifies administrators about issues with Caché and the operating system, when one or several parameters reach the admin-defined thresholds.

1 2
2 1.3K

Hi, Community!

This week we have two videos.

Please find the second Developer Community Video of the week on InterSystems Developers YouTube Channel:

Turning Accountants into Explorers

https://www.youtube.com/embed/2wG17vzXTy4
[This is an embedded link, but you cannot view embedded content directly on the site because you have declined the cookies necessary to access it. To view embedded content, you would need to accept all cookies in your Cookies Settings]

1 0
0 280
Question
· Aug 3, 2017
DeepSee time series - season

Hello,

I have a series of data organized by time (year and month) so I can use a time dimension to drill down data. So far so good.

However, I need to display the data not by calendar years and months but rather by seasons. The season has 12 calendar months but starts in September. So I'd like to see the data from September / Year N to August / Year N+1 using the same hierarchy as normal time dimension.

Has anyone done something similar?

Obviously, the season can start by any month, not only September :)

TIA!

1 3
0 282
Article
· May 25, 2017 2m read
The Interns are Coming!

The Data Platforms department here at InterSystems is gearing up for this year's crop of interns, and I for one am very excited to meet them all next week!

We've got folks from top technical colleges with diverse specialties from hard core engineers to pure computer scientists to mathematicians to business professionals. They come from countries around the world like Vietnam, China, and Finland and they all come with impressive backgrounds. We're sure they will do very well this summer.

6 0
0 498