Big Data | InterSystems Developer Community

Question

Yunier Gonzalez · Oct 31, 2019

Greetings, community. I would like to know how to migrate a production database to a local environment. When we have a system in production (a SQL Server database), we mount a local copy to analyze the data without consuming the production system's resources. My question is: how do you do this with InterSystems technology?

#Backup #Big Data #Databases #SQL #Caché #InterSystems IRIS #InterSystems IRIS for Health

Article

Niyaz Khafizov · Jul 27, 2018 4m read

Load a ML model into InterSystems IRIS

Hi all. Today we are going to upload an ML model into IRIS Manager and test it.

Note: I have done the following on Ubuntu 18.04, Apache Zeppelin 0.8.0, Python 3.6.5.

Introduction

These days, many data mining tools let you develop predictive models and analyze your data with unprecedented ease. InterSystems IRIS Data Platform provides a stable foundation for your big data and fast data applications, along with interoperability with modern data mining tools.

#AI #Analytics #API #Beginner #Best Practices #Big Data #Machine Learning #Python #InterSystems IRIS

Question

Paul Riker · Mar 29, 2019

Ensemble as a Data lake

We have been storing raw messages in a MySQL database for DR and ad hoc purposes. We are thinking of using an Ensemble instance as our data lake instead. We could segregate the source data by namespace or by global. Either way, we'll want a custom global that indexes the data so that retrieval performs well.

Anyone else taking this approach? Any feedback?

#Big Data #Databases #Indexing #Ensemble

Article

Benjamin De Boe · Jan 31, 2018 4m read

Introducing the InterSystems IRIS Connector for Apache Spark

With the release of InterSystems IRIS, we're also making available a nifty bit of software that allows you to get the best out of your InterSystems IRIS cluster when working with Apache Spark for data processing, machine learning and other data-heavy fun. Let's take a closer look at how we're making your life as a Data Scientist easier, as you're probably facing tough big data challenges already, just from the influx of job offers in your inbox!

#AI #Analytics #Big Data #Distributed Data Management #Java #Machine Learning #Sharding #InterSystems IRIS

Article

David E Nelson · Mar 9, 2017 9m read

Machine Learning with Spark and Caché

Apache Spark has rapidly become one of the most exciting technologies for big data analytics and machine learning. Spark is a general data processing engine created for use in clustered computing environments. At its heart is the Resilient Distributed Dataset (RDD), which represents a distributed, fault-tolerant collection of data that can be operated on in parallel across the nodes of a cluster. Spark is implemented using a combination of Java and Scala, so it comes as a library that can run on any JVM.
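To make the idea of operating on a partitioned collection in parallel more concrete, here is a minimal sketch in plain Python. It is not Spark and does not use the Spark API: it runs threads on a single machine instead of nodes in a cluster, and it has none of an RDD's fault tolerance. The names `partition` and `parallel_map_reduce` are hypothetical helpers made up for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def partition(data, num_partitions):
    """Split a list into contiguous chunks of nearly equal size."""
    size, extra = divmod(len(data), num_partitions)
    chunks, start = [], 0
    for i in range(num_partitions):
        # The first `extra` chunks take one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def parallel_map_reduce(data, map_fn, reduce_fn, initial, num_partitions=4):
    """Map every element within its partition, then combine the partial results.

    `initial` must be an identity value for `reduce_fn` (0 for addition),
    because it is applied once per partition and once more at the end.
    """
    def process(chunk):
        return reduce(reduce_fn, (map_fn(x) for x in chunk), initial)

    # Each partition is handled by its own worker thread.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partials = list(pool.map(process, partition(data, num_partitions)))
    return reduce(reduce_fn, partials, initial)


# Sum of squares of 0..9, computed partition by partition.
total = parallel_map_reduce(list(range(10)), lambda x: x * x, lambda a, b: a + b, 0)
print(total)
```

Spark applies the same split, transform and combine pattern, but distributes the partitions across cluster nodes and can recompute a lost partition.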

#AI #Analytics #Big Data #JDBC #Machine Learning #Python #Caché

Article

Fabian Haupt · Jan 20, 2017 8m read

Visualizing the data jungle -- Part I. Let's make a graph

This is the first article in a series diving into visualization tools and the analysis of time series data. Obviously, we are most interested in looking at the performance-related data we can gather from the Caché family of products. However, as we'll see down the road, we are absolutely not limited to that. For now, we are exploring Python and the libraries and tools available within that ecosystem.

#Big Data #Object Data Model #Python #Tools #Visualization #Caché
