Article
· Dec 21, 2024 13m read

Taking up Collections in IRIS

Imagine you’re walking down the street on a nice summer’s day, and someone walks up to you and says “Hey, you work at InterSystems, right? I’ve been hearing more and more about InterSystems IRIS lately. I know IRIS has its own programming language called ObjectBook? or InstaScript? OK, I admit it, I know it’s called ObjectScript! I know IRIS also supports Python. I’m a Python developer, so that sounds great to me. But I’m also interested in ObjectScript. For example, Python and other languages support collections. Does ObjectScript support collections?”

You’d answer “Of course!”

And then your new friend might get excited and start firing off more questions:

  • How many kinds of collections does ObjectScript support?
  • Can ObjectScript use Python collections?
  • Can Python use ObjectScript collections?
  • Which collection is best?

How would you answer? Well, you don’t have to worry about answering. All you’d have to do is send your new friend the URL of this long page.

Let’s say you have 20, 500, or 10000 items that you want to manipulate programmatically. You may want to sort them, iterate through them, perform mathematical operations on them, or do something else with them. Using a collection is an obvious choice. Let's explore the IRIS collection collection.

There are 12 kinds of collections available in IRIS. Yes, 12. Why 12? Because new collection or collection-like features have been added over time. In the beginning, there were 2 kinds of collections. Then $lists were added. Then collection classes were added. Then dynamic objects and arrays were added for handling JSON. Then the ability to use Python collections was added. Finally vectors were added. In ObjectScript, it's possible to use all 12. In Python, as of v2024.3, you can use 9 (sparse arrays, $list, and $vector are not available).

I've divided the collections into 3 groups. There are simple code examples for each collection (Python first, then ObjectScript, where both apply).

  • 7 collections that use integer keys.
  • 4 collections that allow any value to be the key; these are key-value collections.
  • The Python set collection only, which has no keys.

At the bottom, after the examples of the collections, there are a few more fun facts about collections as well as a poll. Plus, there might be a treat or joke down there at the end.

The one question this article does not answer is: "Which collection performs best, especially when the number of items in the collection is large?" I intend to write some code which will time things like computations across values ($vector should win that one!), creation, searching for values, accessing values, and anything else I come up with. And I'll add that code and the results to this posting. Maybe one of you will finish writing that code and add it to this posting before I do.

Collections with Integer Keys

1. Delimited strings

You might not think of this as a collection, but it can easily be used as one. This is a collection of pieces. The keys are the 1-based (for ObjectScript) or 0-based (for Python) positions of the pieces in the string.

# NOTE: since access to pieces uses split(),
#       which returns a Python list,
#       you'd probably use a Python list instead
#
# create
>>> c = "10,20,30,20,10"
# iterate and access
>>> i = 0
>>> for val in c.split(","):
...    print(i, "–", val)
...    i += 1
0 – 10
1 – 20
2 – 30
3 – 20
4 – 10
# return the key for a particular value (30)
>>> print(c.split(",").index('30'))
2
// create
USER>set c = "10,20,30,20,10"
// iterate and access
USER>for i = 1:1:$length(c, ",") { write !, i, " – ", $piece(c, ",", i) }
1 – 10
2 – 20
3 – 30
4 – 20
5 – 10
// returning the key for a particular value (30)
// is awkward; use a different collection if this is required
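
For Python readers who want the 1-based feel of $piece, here is a minimal sketch. The helper names piece() and find_piece() are hypothetical (they are not part of IRIS or Python); they only wrap split() to mimic the 1-based positions used on the ObjectScript side, including a value lookup.

```python
def piece(text, delimiter, n):
    """Return the n-th delimited piece (1-based), or "" if n is out of range."""
    parts = text.split(delimiter)
    if 1 <= n <= len(parts):
        return parts[n - 1]
    return ""


def find_piece(text, delimiter, value):
    """Return the 1-based position of the first piece equal to value, or 0."""
    for position, part in enumerate(text.split(delimiter), start=1):
        if part == value:
            return position
    return 0


c = "10,20,30,20,10"
print(piece(c, ",", 3))           # 30
print(find_piece(c, ",", "30"))   # 3
print(find_piece(c, ",", "99"))   # 0
```

Note that the pieces are strings, so the lookup compares against "30" rather than the integer 30.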

 

2.    $list

This is a $list collection of values. The keys are the 1-based positions of the values in the $list. $list is not available in Python. However, you can convert a $list into a Python list.

# assuming "lb" is a $list returned by a call
# to an ObjectScript method
# $listbuild(10,20,30,20,10)
# convert it to a Python list
>>> c = iris._SYS.Python.ToList(lb)
>>> print(c)
[10, 20, 30, 20, 10]
// create
USER>set c = $listbuild(10,20,30,20,10)
// iterate and access
USER>for i = 1:1:$listlength(c) { write !, i, " – ", $list(c, i) }
1 – 10
2 – 20
3 – 30
4 – 20
5 – 10
// return the key for a particular value (30)
USER>write $listfind(c, 30)
3

 

3.    %DynamicArray object (JSON array)

This is a JSON array collection of values. The keys are the 0-based positions of the values. This is like a Python list. Searching for a value and returning its key requires iterating (not shown below). 

# create
>>> c = iris._Library.DynamicArray._New()
>>> _ = c._Set(0, 10)
>>> _ = c._Set(1, 20)
>>> _ = c._Set(2, 30)
>>> _ = c._Set(3, 20)
>>> _ = c._Set(4, 10)
# iterate and access
>>> for i in range(c._Size()):
...    print(i, "-", c._Get(i))
0 - 10
1 - 20
2 - 30
3 - 20
4 - 10
// create
USER>set c = [10,20,30,20,10]
// iterate and access
USER>for i = 0:1:(c.%Size() - 1) { write !, i, " – ", c.%Get(i) }
0 – 10
1 – 20
2 – 30
3 – 20
4 – 10
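
Searching a JSON array for a value means walking it by index. The sketch below shows one way that loop might look. So that it runs outside IRIS, it uses StandInDynamicArray, a hypothetical stand-in class that exposes only the _Size() and _Get() calls used in the Python example above; find_key() is likewise a made-up helper name.

```python
class StandInDynamicArray:
    """Hypothetical stand-in with only _Size() and _Get(), for running outside IRIS."""

    def __init__(self, values):
        self._values = list(values)

    def _Size(self):
        return len(self._values)

    def _Get(self, index):
        return self._values[index]


def find_key(array, target):
    """Return the 0-based key of the first element equal to target, or -1."""
    for i in range(array._Size()):
        if array._Get(i) == target:
            return i
    return -1


c = StandInDynamicArray([10, 20, 30, 20, 10])
print(find_key(c, 30))  # 2
print(find_key(c, 99))  # -1
```

The -1 "not found" result is a choice made for this sketch; returning None or raising an exception would work equally well.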

 

4.    %ListOfDataTypes object

This is a list collection of values. The keys are the 1-based positions of the values. Note: when using this collection as a property in a class (Property C as list of %String), the class definition is %Collection.ListOfDT instead (all functionality is the same).

# create
>>> c = iris._Library.ListOfDataTypes._New()
>>> _ = c.Insert(10)
>>> _ = c.Insert(20)
>>> _ = c.Insert(30)
>>> _ = c.Insert(20)
>>> _ = c.Insert(10)
# iterate and access
>>> for i in range(c.Size()):
...    print((i+1), "-", c.GetAt(i+1))
1 - 10
2 - 20
3 - 30
4 - 20
5 - 10
# return the key for a particular value (30)
>>> print(c.Find(30))
3
// create
USER>set c = ##class(%ListOfDataTypes).%New()
USER>do c.Insert(10), c.Insert(20), c.Insert(30)
USER>do c.Insert(20), c.Insert(10)
// iterate and access
USER>for i = 1:1:c.Size { write !, i, " – ", c.GetAt(i) }
1 – 10
2 – 20
3 – 30
4 – 20
5 – 10
// return the key for a particular value (30)
USER>write c.Find(30)
3

 

5.    $vector

This is a vector of values of a specific declared type (integers in the example below). The keys are the 1-based positions of the values in the vector. $vector is not available in Python.

// create
USER>set $vector(c,1,"int") = 10
USER>set $vector(c,2,"int") = 20
USER>set $vector(c,3,"int") = 30
USER>set $vector(c,4,"int") = 20
USER>set $vector(c,5,"int") = 10
// iterate and access
USER>for i = 1:1:$vectorop("count",c) { write !, i, " – ", $vector(c, i) }
1 – 10
2 – 20
3 – 30
4 – 20
5 – 10
// return the key for a particular value (30)
// by creating a bitstring for matching values
// and finding the 1 bit
USER>set b = $vectorop("=", c, 30)
USER>write $bitfind(b, 1)
3
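
If the bitstring trick looks unfamiliar, the same idea can be sketched in plain Python. This is only an analogy, since $vector itself is not available in Python: a list of booleans plays the role of the bitstring, and the position of the first True is reported 1-based to match the ObjectScript result.

```python
values = [10, 20, 30, 20, 10]

# Element-wise comparison: one flag per value, like the bitstring of matches.
mask = [value == 30 for value in values]

# Position of the first set flag, reported 1-based; 0 means "no match".
key = mask.index(True) + 1 if True in mask else 0

print(mask)  # [False, False, True, False, False]
print(key)   # 3
```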

 

6.    Python list (Used in ObjectScript via Builtins())

 This is a Python list of values used within ObjectScript. The keys are the 0-based positions of the values. Since ObjectScript doesn't recognize the Python [ ] syntax, you must call the __getitem__() method.

# create
>>> c = [10, 20, 30, 20, 10]
# iterate and access
>>> for i in range(len(c)):
...    print(i, "-", c[i])
0 - 10
1 - 20
2 - 30
3 - 20
4 - 10
# return the key for a particular value (30)
>>> print(c.index(30))
2
// create
USER>set b = $system.Python.Builtins()
USER>set c = b.list()
USER>do c.append(10), c.append(20), c.append(30)
USER>do c.append(20), c.append(10)
// display
USER>zwrite c
c=10@%SYS.Python  ; [10, 20, 30, 20, 10]  ; <OREF>
// iterate and access
USER>for i = 0:1:(b.len(c) - 1) { write !, i, " – ", c."__getitem__"(i) }
0 – 10
1 – 20
2 – 30
3 – 20
4 – 10
// return the key for a particular value (30)
USER>write c.index(30)
2

 

7.    Python tuple (Used in ObjectScript via Builtins())

This is a Python tuple of values used within ObjectScript. The keys are the 0-based positions of the values. Since ObjectScript doesn't recognize the Python [ ] syntax, you must call the __getitem__() method. Once created, a tuple is immutable. 

# create
>>> t = (10, 20, 30, 20, 10)
# iterate and access
>>> for i in range(len(t)):
...    print(i, "-", t[i])
0 - 10
1 - 20
2 - 30
3 - 20
4 - 10
# return the key for a particular value (30)
>>> print(t.index(30))
2
// first, create a list and add values
USER>set b = $system.Python.Builtins()
USER>set c = b.list()
USER>do c.append(10), c.append(20), c.append(30)
USER>do c.append(20), c.append(10)
// convert it to a tuple
USER>set t = b.tuple(c)
// display
USER>zwrite t
t=7@%SYS.Python  ; (10, 20, 30, 20, 10)  ; <OREF>
// iterate and access
USER>for i = 0:1:(b.len(t) - 1) { write !, i, " – ", t."__getitem__"(i) }
0 – 10
1 – 20
2 – 30
3 – 20
4 – 10
// return the key for a particular value (30)
USER>write t.index(30)
2
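
To see what "immutable" means in practice, here is a small plain-Python sketch. Assigning to a position raises TypeError, so a "change" really means building a new tuple (the variable names changed and t2 are just for illustration).

```python
t = (10, 20, 30, 20, 10)

try:
    t[2] = 99  # tuples do not support item assignment
    changed = True
except TypeError:
    changed = False

# "Changing" a tuple means building a new one from slices of the old one.
t2 = t[:2] + (99,) + t[3:]

print(changed)  # False
print(t)        # (10, 20, 30, 20, 10)
print(t2)       # (10, 20, 99, 20, 10)
```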

 

Key-Value Collections

Note: The examples in this section all use a mixture of integer, floating point, and string keys. This mixture would typically not occur in real code; it's used to demonstrate how the keys are treated by each kind of collection.

1.    Sparse array

This is a subscripted variable. Any string or numeric value is allowed as a subscript except for the empty string (""). The keys are the subscripts. The array is automatically sorted by the subscripts (numerically and alphabetically). Searching for a value requires iterating (not shown below). Sparse arrays are not available in Python. However, you can convert a Python dict into a sparse array reference, in order to pass it as an argument to an ObjectScript method.

# create
>>> c = {-1: 10, 2: 20, 8.5: 30, 'george': 20, 'thomas': 10}
# convert it to a sparse array
>>> sa = iris.arrayref(c)
>>> print(sa.value)
{'-1': 10, '2': 20, '8.5': 30, 'george': 20, 'thomas': 10}
# "sa" could be passed to an ObjectScript method
# that accepts a sparse array as an argument
// create (adding with keys intentionally out of order to show auto sorting)
USER>set c(-1) = 10, c(8.5) = 30, c(2) = 20
USER>set c("thomas") = 10, c("george") = 20
// iterate and access
USER>set key = ""
USER>while 1 {set key = $order(c(key)) quit:(key = "")  write !, key, " – ", c(key) }
-1 – 10
2 – 20
8.5 – 30
george – 20
thomas – 10
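
The sort order shown above (numbers first in numeric order, then strings) can be approximated in plain Python. This is a simplified sketch of that ordering only, not the actual IRIS collation rules; subscript_sort_key() is a hypothetical helper name.

```python
def subscript_sort_key(key):
    """Order numbers numerically and ahead of strings; order strings by character."""
    if isinstance(key, (int, float)) and not isinstance(key, bool):
        return (0, key, "")
    return (1, 0, str(key))


# Same keys as the example, deliberately out of order.
keys = [-1, 8.5, 2, "thomas", "george"]
ordered = sorted(keys, key=subscript_sort_key)
print(ordered)  # [-1, 2, 8.5, 'george', 'thomas']
```

A plain sorted(keys) would fail here, because Python refuses to compare numbers with strings; the two-part key sidesteps that.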

 

2.    %DynamicObject object (JSON object)

This is a JSON object collection of keys and values. This is like a Python dict. Searching for a value requires iterating (not shown below).

# create
>>> c = iris._Library.DynamicObject._New()
>>> _ = c._Set(-1, 10)
>>> _ = c._Set(2, 20)
>>> _ = c._Set(8.5, 30)
>>> _ = c._Set("george", 20)
>>> _ = c._Set("thomas", 10)
# iterate and access
>>> key = iris.ref()
>>> val = iris.ref()
>>> i = c._GetIterator()
>>> while i._GetNext(key, val):
...    print(key.value, "-", val.value)
-1 - 10
2 - 20
8.5 - 30
george - 20
thomas - 10
// create
USER>set c = {"-1":10, "2":20, "8.5":30, "george":20, "thomas":10}
// iterate and access
USER>set i = c.%GetIterator()
// call %GetNext() passing key and val BY REFERENCE (preceded with period)
USER>while i.%GetNext(.key, .val) { write !, key, " – ", val }
-1 – 10
2 – 20
8.5 – 30
george – 20
thomas – 10
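
Since a JSON object behaves like a Python dict, the "search by iterating" idea can be sketched with Python's own json module. Note the assumption: this uses a plain dict parsed from JSON text as a stand-in, not a %DynamicObject, and keys_for_value() is a hypothetical helper name.

```python
import json

# Same data as the example; JSON object keys are always strings.
c = json.loads('{"-1": 10, "2": 20, "8.5": 30, "george": 20, "thomas": 10}')


def keys_for_value(mapping, target):
    """Return every key whose value equals target, in insertion order."""
    return [key for key, value in mapping.items() if value == target]


print(keys_for_value(c, 30))  # ['8.5']
print(keys_for_value(c, 20))  # ['2', 'george']
print(keys_for_value(c, 99))  # []
```

Returning a list rather than a single key reflects that several keys can share one value, as 20 does here.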

 

3.    %ArrayOfDataTypes object

This is an array collection of keys and values. Any string or numeric value is allowed as a key except for the empty string (""). The collection is automatically sorted on the keys (numerically and alphabetically). Note: when using this collection as a property in a class (Property C as array of %String), the class definition is %Collection.ArrayOfDT instead (all functionality is the same).

# create (adding with keys intentionally out of order
# to show auto sorting)
>>> c = iris._Library.ArrayOfDataTypes._New()
>>> _ = c.SetAt(10, -1)
>>> _ = c.SetAt(30, 8.5)
>>> _ = c.SetAt(20, 2)
>>> _ = c.SetAt(10, "thomas")
>>> _ = c.SetAt(20, "george")
# iterate and access
>>> key = iris.ref("")
>>> while True:
...    val = c.GetNext(key)
...    if (key.value == ""):
...        break
...    print(key.value, "-", val)
-1 - 10
2 - 20
8.5 - 30
george - 20
thomas - 10
# return the key for a particular value (30)
>>> print(c.Find(30))
8.5
// create (adding with keys intentionally out of order to show auto sorting)
USER>set c = ##class(%ArrayOfDataTypes).%New()
USER>do c.SetAt(10, -1), c.SetAt(30, 8.5), c.SetAt(20, 2)
USER>do c.SetAt(10, "thomas"), c.SetAt(20, "george")
// iterate and access
USER>set key = ""
// call GetNext() passing key BY REFERENCE (preceded with period)
USER>while 1 { set val = c.GetNext(.key) quit:(key = "")  write !, key, " - ", val}
-1 - 10
2 - 20
8.5 - 30
george - 20
thomas - 10
// return the key for a particular value (30)
USER>write c.Find(30)
8.5

 

4.    Python dict (Used in ObjectScript via Builtins())

This is a Python dict of values used within ObjectScript. Any string or numeric value is allowed as a key. It's included here for completeness, as it is technically possible to use it within ObjectScript. But it might seem strange to do so, given the lack of the "for keys,values in…" syntax in ObjectScript. Searching for a value requires iterating (not shown below).

# create
>>> c = {"-1":10, "2":20, "8.5":30, "george":20, "thomas":10}
# iterate and access
>>> for key, val in c.items():
...    print(key, "-", val)
-1 - 10
2 - 20
8.5 - 30
george - 20
thomas - 10
// create
USER>set b = $system.Python.Builtins()
USER>set c = b.dict()
USER>do c.setdefault(-1, 10), c.setdefault(2, 20), c.setdefault(8.5, 30)
USER>do c.setdefault("george", 20), c.setdefault("thomas", 10)
// display
USER>zwrite c
c=15@%SYS.Python  ; {-1: 10, 2: 20, 8.5: 30, 'george': 20, 'thomas': 10}  ; <OREF>
// iterate (using try/catch) and access
USER>set iter = b.iter(c)
USER>try { while 1 { set key = b.next(iter) write !, key, " – ", c.get(key)} } catch ex {}
-1 – 10
2 – 20
8.5 – 30
george – 20
thomas – 10
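
The ObjectScript loop above works because next() signals the end of iteration with an exception. For comparison, here is a plain-Python sketch of the same pattern written out by hand, which is roughly what a Python for loop does behind the scenes (the list name seen is just for illustration).

```python
c = {-1: 10, 2: 20, 8.5: 30, "george": 20, "thomas": 10}

iterator = iter(c)  # iterating a dict yields its keys
seen = []
while True:
    try:
        key = next(iterator)
    except StopIteration:
        break  # the iterator is exhausted
    seen.append((key, c.get(key)))

print(seen)
# [(-1, 10), (2, 20), (8.5, 30), ('george', 20), ('thomas', 10)]
```

Unlike the empty catch block in the ObjectScript version, this sketch catches only StopIteration, so any other error would still surface.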

 

Python Set Collection

1.    Python set (Used in ObjectScript via Builtins())

This is a Python set of values used within ObjectScript. A set is non-ordered, non-indexed, and doesn't allow duplicate values. Set operations such as union and intersection are supported. 

# create
>>> c1 = {10, 20, 30, 20, 10}
>>> c2 = {25, 30, 20, 75}
# display (no duplicate values allowed)
>>> print(c1)
{10, 20, 30}
>>> print(c2)
{25, 75, 20, 30}
# create union and intersection of the sets
>>> union = c1.union(c2)
>>> inter = c1.intersection(c2)
# display
>>> print(union)
{20, 25, 10, 75, 30}
>>> print(inter)
{20, 30}
// create
USER>set b = $system.Python.Builtins()
USER>set c1 = b.set(), c2 = b.set()
USER>do c1.add(10), c1.add(20), c1.add(30), c1.add(20), c1.add(10)
USER>do c2.add(25), c2.add(30), c2.add(20), c2.add(75)
// display (no duplicate values allowed)
USER>zwrite c1, c2
c1=4@%SYS.Python  ; {10, 20, 30}  ; <OREF>
c2=8@%SYS.Python  ; {25, 75, 20, 30}  ; <OREF>
// create the union and intersection of the sets
USER>set union = c1.union(c2)
USER>set inter = c1.intersection(c2)
// display
USER>zwrite union
union=11@%SYS.Python  ; {20, 25, 10, 75, 30}  ; <OREF>
USER>zwrite inter
inter=9@%SYS.Python  ; {20, 30}  ; <OREF>

 

Fun Facts

  • All 12 collections can be properties of IRIS objects. However, the 4 Python collections and sparse arrays cannot be saved as part of an IRIS persistent object.
  • The %Collection.ArrayOfDT collection is used for an array property of an object. If the object is persistent, when saving the object, the collection keys and values are automatically normalized into a child table.
  • The %Collection.ListOfDT collection is used for a list property of an object. If the object is persistent, when saving the object, the collection keys and values can optionally be automatically normalized into a child table.
  • 11 of the collections allow different types of values in the same collection, such as strings, integers, and doubles. The sole exception: the values of a $vector must all be the same declared type.
  • 7 of the collections allow object references as values. Delimited strings, $list, $vector, %ListOfDataTypes, and %ArrayOfDataTypes do not allow object references. For the latter two, %ListOfObjects and %ArrayOfObjects are alternatives that allow object references as values, along with the %Collection.ListOfObj and %Collection.ArrayOfObj classes for list or array properties of objects.
  • In IRIS, globals are sparse arrays on disk. Although sparse arrays are not available in Python, using iris.gref() it is possible to manipulate globals in Python.

I hope you found this posting useful. As a treat for reading this far, here's a photo from part of my childhood rock collection, purchased for $2.25 in the early 1970s. Item #11 is uranium! The helpful description states "It is a source of radio active material, used in construction of atomic bombs." Unfortunately, there are no safe handling warnings included. I'll put it back in the attic now.

 

Which kinds of collections have you used already?