#Big Data

1 Follower · 46 Posts

Big data is a field that treats ways to analyze and systematically extract information from data sets that are too large or complex for traditional data-processing software. Big data challenges include capturing data, data storage, data analysis, search, sharing, transfer, visualization, querying, updating, information privacy and data source.

Announcement Anastasia Dyubaylo · Dec 21, 2020

Vote for the best app in the InterSystems Analytics Contest!

Hey Developers,

This week is a voting week for the InterSystems Analytics Contest! So, it's time to give your vote to the best solutions built with InterSystems IRIS.

🔥 You decide: VOTING IS HERE 🔥

How to vote?

Please meet the new voting engine and algorithm for the Experts and Community nomination:

#InterSystems IRIS #IRIS contest #Open Exchange #Artificial Intelligence (AI) #Analytics #Big Data #Contest #Machine Learning (ML)

8 3

0 356

Announcement Anastasia Dyubaylo · Nov 19, 2020

InterSystems Analytics Contest 2020

Hey Developers!

We're pleased to invite you all to the next competition for creating open-source solutions using InterSystems IRIS! Please join:

🏆 InterSystems Analytics Contest 🏆

Duration: December 7 - 27, 2020

#InterSystems IRIS #IRIS contest #Open Exchange #Artificial Intelligence (AI) #Analytics #Big Data #Contest #Events #Machine Learning (ML)

8 16

2 1831

Announcement Anastasia Dyubaylo · Dec 5, 2020

InterSystems Analytics Contest Kick-off Webinar

Hi Community!

We are pleased to invite all developers to the upcoming InterSystems Analytics Contest Kick-off Webinar! This webinar is dedicated to the Analytics contest.

In this webinar, we’ll demo the iris-analytics-template and answer questions on how to develop, build, and deploy Analytics applications using InterSystems IRIS.

Date & Time: Monday, December 7 — 12:00 PM EDT

Speakers:
🗣 @Carmen Logue, InterSystems Product Manager - Analytics and AI
🗣 @Evgeny Shvarov, InterSystems Developer Ecosystem Manager

#InterSystems IRIS #IRIS contest #Open Exchange #Artificial Intelligence (AI) #Analytics #Big Data #Contest #Events #Machine Learning (ML) #Webinar

5 4

0 312

Article Alexey Maslov · Oct 20, 2020 11m read

Parallel Processing of Multi-Model Data in InterSystems IRIS and Caché

As we all know, InterSystems IRIS has an extensive range of tools for improving the scalability of application systems. In particular, much has been done to facilitate the parallel processing of data, including the use of parallelism in SQL query processing and the most attention-grabbing feature of IRIS: sharding. However, many mature developments that started back in Caché and have been carried over into IRIS actively use the multi-model features of this DBMS, which are understood as allowing the coexistence of different data models within a single database. For example, the HIS qMS database contains semantic relational (electronic medical records), traditional relational (interaction with PACS), and hierarchical data models (laboratory data and integration with other systems). Most of the listed models are implemented using SP.ARM's qWORD tool (a mini-DBMS that is based on direct access to globals). Therefore, unfortunately, it is not possible to use the new capabilities of parallel query processing for scaling, since these queries do not use IRIS SQL access.

Meanwhile, as the size of the database grows, most of the problems inherent to large relational databases become relevant to non-relational ones as well. This is a major reason why we are interested in parallel data processing as one of the tools that can be used for scaling.

In this article, I would like to discuss those aspects of parallel data processing that I have been dealing with over the years when solving tasks that are rarely mentioned in discussions of Big Data. I am going to be focusing on the technological transformation of databases, or, rather, technologies for transforming databases.

#Caché #InterSystems IRIS #Big Data #DevOps

12 4

3 992

Announcement Anastasia Dyubaylo · Oct 13, 2020

New Video: Big Data in InterSystems IRIS

Hi Community!

Enjoy watching the new video on InterSystems Developers YouTube:

⏯ Big Data in InterSystems IRIS

#InterSystems IRIS #Big Data #Machine Learning (ML) #Sharding #Video

4 0

0 348

Announcement Denis Yuzhanin · Jul 3, 2020

Task for AI/ML contest. Recognize image coordinates. Constructor.

Hi everyone.
We are a team from the company "Constructor", and we develop cutting-edge cartographic systems. Recently the amount of image data has skyrocketed, so we want to give our users the ability to tie images to places automatically. For that, we want to use AI/ML technologies, and we have a cool task for you.

https://cloud.mail.ru/public/pHbC/4r7Z58m6f/

There are three collections of datasets, and each contains:
An image from the original camera with no position information, and a set of images taken from different points of view near this original camera with position information (list_files_info.

#Other #Artificial Intelligence (AI) #Big Data #Contest #Machine Learning (ML)

1 0

0 325

Announcement Anastasia Dyubaylo · May 29, 2020

New Video: Automated InterSystems IRIS Cloud Scaling

Hi Community,

The new video from Global Summit 2019 is already on InterSystems Developers YouTube:

⏯ Automated InterSystems IRIS Cloud Scaling

#InterSystems IRIS #AWS #Azure #Big Data #Cloud #Containerization #Deployment #Google Cloud Platform (GCP) #Global Summit 2019 #Video

0 0

0 427

Article Niyaz Khafizov · Jul 6, 2018 3m read

The way to launch Apache Spark + Apache Zeppelin + InterSystems IRIS

Hi all. Yesterday I tried to connect Apache Spark, Apache Zeppelin, and InterSystems IRIS. During the process, I had trouble connecting it all together and could not find a useful guide. So, I decided to write my own.

Introduction

What are Apache Spark and Apache Zeppelin, and how do they work together? Apache Spark is an open-source cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance. So, it is very useful when you need to work with Big Data.

#InterSystems IRIS #Archive #Artificial Intelligence (AI) #Beginner #Best Practices #Big Data #Machine Learning (ML)

10 1

1 1986

Article sween · Nov 7, 2019 5m read

Export InterSystems IRIS Data to BigQuery on Google Cloud Platform

Loading your IRIS data into your Google Cloud BigQuery data warehouse and keeping it current can be a hassle with bulky commercial third-party off-the-shelf ETL platforms, but it is made dead simple by the iris2bq utility.

Let's say IRIS is contributing to the workload of a hospital system: routing DICOM images, ingesting HL7 messages, posting FHIR resources, or pushing CCDAs to the next provider in a transition of care. Natively, IRIS persists these objects in various stages of the pipeline via the nature of the business processes and anything you included along the way. Let's send that up to Google BigQuery to augment and complement the rest of our data warehouse data, and ETL (Extract Transform Load) or ELT (Extract Load Transform) to our heart's desire.

A reference architecture diagram may be worth a thousand words, but 3 bullet points may work out a little bit better:

It exports the data from IRIS into DataFrames
It saves them into GCS as .avro to keep the schema along with the data: this avoids having to specify or create the BigQuery table schema beforehand.
It starts BigQuery jobs to import those .avro into the respective BigQuery tables you specify.
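
The three steps above can be sketched as a simple per-table plan. This is only an illustration of the flow, not the iris2bq implementation or its configuration format: the `ExportPlan` class, the `plan_exports` function, the naming rule, and the bucket and dataset names are all hypothetical, and the sketch performs no I/O against IRIS, GCS, or BigQuery.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExportPlan:
    """One table's path through the three steps (all names are illustrative)."""
    iris_table: str      # step 1: source table read into a DataFrame
    avro_uri: str        # step 2: where the .avro file would land in GCS
    bigquery_table: str  # step 3: destination of the BigQuery import job


def plan_exports(tables, bucket, dataset):
    """Build one ExportPlan per IRIS table; performs no I/O."""
    plans = []
    for table in tables:
        # Assumed naming rule: "Demo.Patient" becomes "demo_patient".
        safe_name = table.replace(".", "_").lower()
        plans.append(ExportPlan(
            iris_table=table,
            avro_uri=f"gs://{bucket}/{safe_name}.avro",
            bigquery_table=f"{dataset}.{safe_name}",
        ))
    return plans


if __name__ == "__main__":
    # Hypothetical table, bucket, and dataset names.
    for plan in plan_exports(["Demo.Patient", "Demo.Encounter"], "my-bucket", "iris_export"):
        print(plan)
```

Because the .avro file carries its own schema, the plan needs only a source, a file location, and a destination table; no BigQuery column definitions appear anywhere in it.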

#InterSystems IRIS #InterSystems IRIS for Health #Best Practices #Big Data #Cloud #Google Cloud Platform (GCP) #integration-required

Open Exchange

5 3

0 1345

Article Mark Bolinsky · Mar 3, 2020 11m read

InterSystems IRIS and Intel Optane DC Persistent Memory

InterSystems and Intel recently conducted a series of benchmarks combining InterSystems IRIS with 2nd Generation Intel® Xeon® Scalable Processors, also known as “Cascade Lake”, and Intel® Optane™ DC Persistent Memory (DCPMM). The goals of these benchmarks are to demonstrate the performance and scalability capabilities of InterSystems IRIS with Intel’s latest server technologies in various workload settings and server configurations. Along with various benchmark results, three different use-cases of Intel DCPMM with InterSystems IRIS are provided in this report.

#HealthShare #InterSystems IRIS #InterSystems IRIS for Health #TrakCare #Big Data #HL7 #Interoperability #InterSystems Business Solutions and Architectures #Sharding #Testing

5 5

0 1170

Question Yunier Gonzalez · Oct 31, 2019

Working with Data: System in production

Greetings, community. I would like to know how to migrate a production DB to a local environment. When I have a system in production (a SQL Server DB), what we do is mount a local copy to analyze the data without consuming the resources of the production system. My question is: how do you do this with InterSystems technology? I already tested the Power BI connector and it looks great, but that's where the question came up.

#Caché #InterSystems IRIS #InterSystems IRIS for Health #Backup #Big Data #Databases #SQL

0 2

0 369

Article Niyaz Khafizov · Jul 27, 2018 4m read

Load a ML model into InterSystems IRIS

Hi all. Today we are going to upload an ML model into IRIS Manager and test it.

Note: I have done the following on Ubuntu 18.04, Apache Zeppelin 0.8.0, Python 3.6.5.

Introduction

These days, the many tools available for data mining enable you to develop predictive models and analyze the data you have with unprecedented ease. InterSystems IRIS Data Platform provides a stable foundation for your big data and fast data applications, providing interoperability with modern data mining tools.

In this series of articles, we explore the data mining capabilities available with InterSystems IRIS.

#InterSystems IRIS #Artificial Intelligence (AI) #Analytics #API #Beginner #Best Practices #Big Data #Machine Learning (ML) #Python

6 2

2 1555

Question Paul Riker · Mar 29, 2019

Ensemble as a Data lake

We have been storing raw messages in a MySQL database for DR and ad hoc purposes. We are thinking of using an Ensemble instance as our data lake instead. We could segregate the source data by namespace or by global. But either way we'll want a custom global to index the data for data retrieval performance purposes.

Anyone else taking this approach? Any feedback?

#Ensemble #Big Data #Databases #Indexing

0 2

0 579

Article Benjamin De Boe · Jan 31, 2018 4m read

Introducing the InterSystems IRIS Connector for Apache Spark

With the release of InterSystems IRIS, we're also making available a nifty bit of software that allows you to get the best out of your InterSystems IRIS cluster when working with Apache Spark for data processing, machine learning and other data-heavy fun. Let's take a closer look at how we're making your life as a Data Scientist easier, as you're probably already facing tough big data challenges, just from the influx of job offers in your inbox!

#InterSystems IRIS #Artificial Intelligence (AI) #Analytics #Big Data #Distributed Data Management #Java #Machine Learning (ML) #Sharding

2 2

0 1853

Article David E Nelson · Mar 9, 2017 9m read

Machine Learning with Spark and Caché

Apache Spark has rapidly become one of the most exciting technologies for big data analytics and machine learning. Spark is a general data processing engine created for use in clustered computing environments. Its heart is the Resilient Distributed Dataset (RDD), which represents a distributed, fault-tolerant collection of data that can be operated on in parallel across the nodes of a cluster. Spark is implemented using a combination of Java and Scala and so comes as a library that can run on any JVM.
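
To make the idea of a partitioned collection operated on in parallel more concrete, here is a minimal sketch in plain Python. It is not the Spark API: the `partition` and `parallel_map_reduce` names are hypothetical, threads on one machine stand in for cluster nodes, and there is no fault tolerance or lineage tracking.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partition(data, n):
    """Split data into n roughly equal contiguous partitions."""
    size, extra = divmod(len(data), n)
    parts, start = [], 0
    for i in range(n):
        # The first `extra` partitions take one additional element.
        end = start + size + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts


def parallel_map_reduce(data, map_fn, reduce_fn, initial, n_partitions=4):
    """Map and reduce each partition independently, then combine the results.

    `initial` must be an identity value for `reduce_fn` (e.g. 0 for addition),
    because it is used once per partition and once more in the final combine.
    """
    def process(part):
        # Each partition is handled on its own, like a task on one node.
        return reduce(reduce_fn, (map_fn(x) for x in part), initial)

    with ThreadPoolExecutor(max_workers=n_partitions) as pool:
        partials = list(pool.map(process, partition(data, n_partitions)))
    # Combine the per-partition results into the final answer.
    return reduce(reduce_fn, partials, initial)


if __name__ == "__main__":
    # Sum of squares of 1..10, computed over four partitions.
    print(parallel_map_reduce(list(range(1, 11)), lambda x: x * x, lambda a, b: a + b, 0))
```

The result is independent of how the data is split only because addition is associative; the same constraint applies to any combining function used this way.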

#Caché #Artificial Intelligence (AI) #Analytics #Big Data #JDBC #Machine Learning (ML) #Python #Vector Search

11 5

1 2861

Article Fabian Haupt · Jan 20, 2017 8m read

Visualizing the data jungle -- Part I. Let's make a graph

This is the first article of a series diving into visualization tools and analysis of time series data. Obviously, we are most interested in looking at performance-related data we can gather from the Caché family of products. However, as we'll see down the road, we are absolutely not limited to that. For now we are exploring Python and the libraries/tools available within that ecosystem.

The series ties in closely with Murray's excellent series about Caché performance and monitoring (see here) and more specifically this article.

#Caché #Big Data #Object Data Model #Python #Tools #Visualization

9 4

1 1827
