Data Model

Syndicate content 13 

In the previous parts (1, 2) we talked about globals as trees. In this article, we will look at them as sparse arrays.

A sparse array - is a type of array where most values assume an identical value.

In practice, you will often see sparse arrays so huge that there is no point in occupying memory with identical elements. Therefore, it makes sense to organize sparse arrays in such a way that memory is not wasted on storing duplicate values.

In some programming languages, sparse arrays are part of the language - for example, in J, MATLAB. In other languages, there are special libraries that let you use them. For C++, those would be Eigen and the like.

Globals are good candidates for implementing sparse arrays for the following reasons:

Last comment 23 May 2019
+ 6   0 5
631

views

+ 6

rating

How Tax Service, OpenStreetMap, and InterSystems IRIS
could help developers get clean addresses

 

Pieter Brueghel the Younger, Paying the Tax (The Tax Collector), 1640

 

In my previous article, we just skimmed the surface of objects. Let's continue our reconnaissance. Today's topic is a tough one. It's not quite BIG DATA, but it's still the data not easy to work with: we're talking about fairly large amounts of data. It won't all fit into RAM at once, and some of it won't even fit on the drive (not due to lack of space, but because there's a lot of junk). The name of our subject is FIAS DB: the Federal Information Address System database - the databases of addresses in Russia. The archive is 5.5 GB. And it's a compressed XML file. After extraction, it will be a full 53 GB (set aside 110 GB for extraction). And when you start to parse and convert it, that 110 GB won't be enough. There won't be enough RAM either.

+ 2   2 1
0

comments

150

views

+ 2

rating

Headache-free stored objects: a simple example of working with InterSystems Caché objects in ObjectScript and Python

Neuschwanstein Castle

Tabular data storages based on what is formally known as the relational data model will be celebrating their 50th anniversary in June 2020. Here is an official document – that very famous article.  Many thanks for it to Doctor Edgar Frank Codd. By the way, the relational data model is on the list of the most important global innovations of the past 100 years published by Forbes.

On the other hand, oddly enough, Codd viewed relational databases and SQL as a distorted implementation of his theory.  For general guidance, he created 12 rules that any relational database management system must comply with (there are actually 13 rules). Honestly speaking, there is zero DBMS's on the market that observes at least Rule 0. Therefore, no one can call their DBMS 100% relational :) If you know any exceptions, please let me know.

+ 4   3 1
0

comments

297

views

+ 4

rating

Beginning - see Part 1.

 

3. Variants of structures when using globals

 

A structure, such as an ordered tree, has various special cases. Let's take a look at those that have practical value for working with globals.

 

 

 

 

3.1 Special case 1. One node without branches

 

Last comment 8 July 2017
+ 6   0 5
825

views

+ 6

rating

In the prior part of this series we have provided introduction to Google MapReduce approach, but still not covered their possible ObjectScript implementation. Which we will start to explain today.

Last comment 28 March 2017
+ 3   0 4
346

views

+ 3

rating

In part I of this series we have introduced MapReduce as a generic concept, and in part II we started to approach Caché ObjectScript implementation via introducing abstract interfaces. Now we will try to provide more concrete examples of applications using MapReduce.

Last comment 28 March 2017
+ 4   0 4
601

views

+ 4

rating

This post is the direct result of working with an InterSystems customer who came to me with the following problem:

SELECT COUNT(*) FROM MyCustomTable

Takes 0.005 seconds, total 2300 rows.  However:

Last comment 1 November 2016
+ 10   0 11
985

views

+ 10

rating

Process-private Globals  can be used as a data global in storage definition. That way, each process can have its own objects for the class with ppg storage. For example lets define a pool, which can:

  • add elements to a pool (ignoring duplicates)
  • check if an element exists in the pool

Here's the class:

Last comment 16 August 2016
+ 4   0 3
412

views

+ 4

rating

Introduction

The field test of Caché 2016.2 has been available for quite some time and I would like to focus on one of the substantial features that is new in this version: the document data model. This model is a natural addition to the multiple ways we support for handling data including Objects, Tables and Multidimensional arrays. It makes the platform more flexible and suitable for even more use cases.

Last comment 15 June 2016
+ 13   0 11
1868

views

+ 13

rating

I am pleased to announce the next 2016.2 field test kit, 2016.2.0.605.0.

There are about a hundred changes from the previous field test kit, including the following fixes to problems found by those using the kits in the field:

0   0 2
0

comments

132

views

0

rating

I am pleased to announce the next 2016.2 field test kit, 2016.2.0.595.0.

It may look like a slow week, with less than fifty changes having been checked in, but this kit includes the following fixes to problems found by you, the ones running the kits in the field:

+ 3   0 3
0

comments

151

views

+ 3

rating

Is there someone that has developped a program in order to create a 
"decisiontree"? Depending The answer to a question leads to another question, and so on, 
and so on, and there is an option to return to another point in the decisiontree.

Best regards,

Simon.

p.s. I've already got something, but it's not workable. But to get an idea:

Last comment 14 January 2016
+ 2   0 4
108

views

+ 2

rating