Map, Reduce and Filter Collections

Inspired by the article "Declarative development in Caché", which is still trending on the Developer Community. The OP explored a functional style of iterating over a collection. A comment today suggested "Caché would need syntax support for anonymous functions".

With macros you can get something close to anonymous-function syntax using dot notation.

This is not production code, but it does work. First the macros...

#Define foreach(%c,%l) for i=1:1:%c.Size set %l=%c.GetAt(i) do
#Define map(%oc,%nc,%l) set %nc=##class(%ListOfDataTypes).%New() for i=1:1:%oc.Size set %l=%oc.GetAt(i) do  do %nc.Insert(fpl8349312378)
#Define filter(%oc,%nc,%l) set %nc=##class(%ListOfDataTypes).%New() for i=1:1:%oc.Size set %l=%oc.GetAt(i) set fpl8349312378="" do  if fpl8349312378'="" do %nc.Insert(fpl8349312378)
#Define reduce(%oc,%res,%l) set %res="" for i=1:1:%oc.Size set %l=%oc.GetAt(i) do  set %res=fpl8349312378
#Define return(%val) set fpl8349312378=(%val) quit

Then in COS code...

Map all items in UPPERCASE to a new, automatically created collection, then foreach over the new list...

  $$$map(originalCollection,newCollection,item)
  .$$$return($ZCONVERT(item,"U"))

  $$$foreach(newCollection,item)
  .write !,item
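
To see why this works, here is roughly what the map call above should expand to. This is a hand expansion of the macros for illustration, not actual compiler output, so treat it as a sketch.

```objectscript
  // Hand-expanded $$$map(originalCollection,newCollection,item):
  // the argumentless DO runs the dotted line once per item, then the
  // second DO inserts whatever $$$return left in fpl8349312378.
  set newCollection=##class(%ListOfDataTypes).%New() for i=1:1:originalCollection.Size set item=originalCollection.GetAt(i) do  do newCollection.Insert(fpl8349312378)
  .set fpl8349312378=($ZCONVERT(item,"U")) quit
```

Note that the loop counter i and the result variable fpl8349312378 are fixed names shared by every macro, so nesting one of these macros inside another would most likely clobber them.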

Filter items into a new, automatically created collection, keeping only those that pattern match one or more alpha characters...

  $$$filter(originalCollection,newCollection,item)
  .if item?1.A $$$return(item)

Reduce down all items in a collection to get a total...

  $$$reduce(originalCollection,total,item)
  .$$$return(total+item)

It's a quick 5 minute hack, but it's kind of interesting how macros can simplify code (and reduce potential bugs) if done in the right way. The exception, of course, is when you end up creating macro soup!

One baked in language feature that would be nice is a real return statement (still wishing).

Sean.

 


Comments

Hi, Sean! It is really great, except for the 'dot syntax'. Is it mandatory?

I wouldn't promote 'dot syntax' though it looks elegant here.

> One baked in language feature that would be nice is a real return statement (still wishing).

There is a 'return' statement in COS. Or is it not real?

I was just thinking the same thing lol.

I spent 10 years waiting for a return statement and it passed me by without noticing.

Reminds me to read the release notes a bit more often!
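
For anyone else who missed it, a minimal sketch of the difference. The method name FirstAlpha is made up for illustration; the point is that return leaves the method from inside the loop, whereas a quit at that spot would only end the for loop.

```objectscript
/// Hypothetical example: returns the first all-alpha item, or "" if none.
ClassMethod FirstAlpha(list As %ListOfDataTypes) As %String
{
  for i=1:1:list.Count() {
    set item=list.GetAt(i)
    // return exits the whole method here, not just the for block
    if item?1.A { return item }
  }
  return ""
}
```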

Lol )

I would also suggest another way "to be in touch": visit the InterSystems Global Summits, e.g. the next one, which will take place in September 2017. We have a really good crowd of tough engineers there every year.

I agree, the dot syntax is a bit old school.

At the moment it's the only way that I can think of for passing code into the context of a map reduce function.

It doesn't look so bad when part of a wider COS code block...

ClassMethod Test2()
{
  set originalCollection = ##class(%ListOfDataTypes).%New()
  do originalCollection.Insert("Sean")
  do originalCollection.Insert("Mark")
  do originalCollection.Insert("Bob")

  $$$map(originalCollection,newCollection,item)
  .$$$return($ZCONVERT(item,"U"))

  $$$foreach(newCollection,item)
  .write !,item

}

If COS implemented lambda syntax using arrow functions then it would look a lot cleaner.

It wouldn't be hard for the COS compiler to implement. The inner code block would be scoped off to its own underlying M function, with its return value being a quit back to the output of the macro or classmethod call.

ClassMethod Test2()
{
  set originalCollection = ##class(%ListOfDataTypes).%New()
  do originalCollection.Insert("Sean")
  do originalCollection.Insert("Mark")
  do originalCollection.Insert("Bob")

  set newCollection=$$$map(originalCollection, (item) => {
    return $ZCONVERT(item,"U")
  })

  $$$foreach(newCollection, (item) => {
    write !,item
  })

}
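
The idea of scoping the inner block off to its own function can be approximated by hand today by passing a method name and calling it with $CLASSMETHOD. This is only a sketch of that workaround, not the lambda syntax proposed above; the class Demo.Functional and its methods Map and Upper are hypothetical names.

```objectscript
Class Demo.Functional Extends %RegisteredObject
{

/// Hypothetical helper: applies class method cls.mth to each item of list.
ClassMethod Map(list As %ListOfDataTypes, cls As %String, mth As %String) As %ListOfDataTypes
{
  set result=##class(%ListOfDataTypes).%New()
  for i=1:1:list.Count() {
    // $CLASSMETHOD calls the named class method with the item as its argument
    do result.Insert($CLASSMETHOD(cls,mth,list.GetAt(i)))
  }
  return result
}

/// Hypothetical callback standing in for the body of the lambda.
ClassMethod Upper(item As %String) As %String
{
  return $ZCONVERT(item,"U")
}

}
```

It would be called as set newCollection=##class(Demo.Functional).Map(originalCollection,"Demo.Functional","Upper"). The trade-off is that the callback has to live in a named method away from the call site, which is exactly what a lambda would avoid.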

 

Thanks for the reminder, Evgeny!
[Hope the people who managed to miss the return statement will not miss this series :) ]

Good article, but not actually related.

MapReduce on big data is not the same as map, reduce and filter on small data collections.

Let me be crystal clear and honest - this is horrible

[I thought so more than 2 years ago when Max originally published this approach in Russian and still think so today]

When you write code, you write it not for yourself, and not to be modern and trendy; you write it "for another guy" who will visit it tomorrow. You need to write it as simply as possible, using the most obvious approach.

If you could write something using the same or a similar amount of code but without tricky macros, then you should write it the simpler way, without macros (as here) or without tricky iterators (as in Max's case). This complexity is just not worth the time your team will lose debugging such code.

Please do not get me wrong: I love JavaScript/TypeScript and all modern things. And I would love to use constructs as expressive as closures in JavaScript or lambdas in C++ (hmm). But they are not here (yet) in ObjectScript. Many of us have lobbied for the addition of closures for ages, but the gods of COS had no interest in them.

Though, in my personal opinion, implementing handy closure support would not be much harder than the dotted DO statement (and might be based on the same VM token implementation). But I might be wrong in my estimate of the complexity.

> Let me be crystal clear and honest - this is horrible

In Russia, we say "Критикуя, предлагай!" ("If you criticize, offer an alternative!"). What is your example of better syntactic sugar? ;)

Though I agree that closures in COS would encourage some better coding practices for Caché.

> Let me be crystal clear and honest - this is horrible

LOL, well, let's crack open Ensemble and explore some macro code...

In all honesty, this post was not an advocacy but an exploration.

Map, Reduce and Filter are functions that I use every day in other languages but never think to emulate in COS. Seeing the original post got me thinking: why can't we have them in COS as well?

It's good to explore these ideas, particularly as other languages are outpacing COS in a very big way. How else would they end up in the core language?

But I really miss some nice ForEach sugar... This:

ForEach MyVar(key) 
{ 
    Write !,key 
} 

Is much better to read than:

Set key=""
For 
{ 
    Set key=$Order(MyVar(key))
    Quit:key=""

    Write !,key 
} 

I mean... We write code like this all the time, right? Locals and globals are so important to us... Why not give them some sugar?

Of course, the ForEach command would only iterate over the last subscript, just as $Order works today. So instead of:

Set key1=""
For 
{ 
    Set key1=$Order(MyVar(key1))
    Quit:key1=""
 
    Write !,key1

    Set key2=""
    For 
    {
         Set key2=$Order(MyVar(key1, key2))
         Quit:key2=""

         Write !,$C(9),key2 
     }
} 

We would have:

ForEach MyVar(key1)
{ 
     Write !,key1
     ForEach MyVar(key1, key2) 
     { 
          Write !,$C(9), key2 
     }
} 

Much clearer and nicer to read, don't you think? And if you really want to be fancy:

ForEach MyVar(key: value) 
{ 
    Write !,"The value for key ",key," is: ", $Get(value)
} 

Where value could be <UNDEFINED> if that local/global node ends up having no value defined (that's why the $Get on value).
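
For comparison, the closest thing available today is probably the three-argument form of $Order, which copies the node's value into a target variable as it steps. A minimal sketch, assuming MyVar is the same array as in the examples above:

```objectscript
  set key=""
  for {
    // Clear value first: $Order leaves the target untouched when the
    // node it finds has no data, so a stale value could leak through.
    kill value
    set key=$Order(MyVar(key),1,value)
    quit:key=""
    write !,"The value for key ",key," is: ",$Get(value)
  }
```

It saves a second lookup of the node, but the boilerplate around the loop is still there, which is the part a ForEach would remove.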

I know it would be a lot of work, but, on a side note, it would be awesome if we could open up our virtual machine and give people the tools necessary so that the community could implement other languages for it. We could move our VBScript and TSQL implementations to this new, open framework and have people use them as templates to build their own languages or language improvements. I know building a compiler is not an easy thing and there are things that you have to hammer on the code. But it would be an interesting challenge and investigation project.

Agreed, Amir! ForEach would be a great start.

Or my preference, for in...

for k1 in ^foo {
  for k2 in ^foo(k1) {
    //...
  }
}

A good precursor to having map, reduce and filter.