Declarative development in Caché

Article

Maks Atygaev · Dec 11, 2016 4m read

#Object Data Model #ObjectScript #Tips & Tricks #Caché

Caché offers a number of methods for going through a collection and doing something with its elements. The easiest method uses a while-loop and lets you fulfill the task in an imperative manner. The developer needs to take care of the iterator, jumping to the next element and checking if the loop is within the collection.

But is that really what a developer should be concerned with? A developer should focus on solving the problem at hand quickly and on producing high-quality code. It would be great to be able to just take a collection and apply a function to it that performs the necessary operations on each element. No need to perform boundary checks, no need to create an iterator, no need to manually call a function for each element. This approach is called declarative programming.

Declarative programming is when you write your code in such a way that it describes what you want to do, and not how you want to do it.
(c) 1800-information

Let’s now think how to solve the task declaratively, using built-in tools and capabilities of Caché.

In languages that support higher-order functions (like JavaScript), you can describe a function for processing a collection element and pass it as a parameter to another function to apply the passed function to each element.

[2, 3, 5, 7, 11, 13, 17].forEach(function(i) {
    console.log(i);
});

In this case, an anonymous function is created that outputs an element to the console. This function is passed as an argument to another function - forEach.
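To make the idea of "passing a function to a function" concrete, here is a minimal sketch of a hand-written higher-order function. The helper name `each` and the `seen` array are hypothetical names chosen for illustration; the built-in `forEach` does the same job and is what you would normally use.

```javascript
// Hypothetical helper: a hand-written equivalent of Array.prototype.forEach.
// It owns the loop; the caller only supplies what to do with each element.
function each(items, fn) {
    for (let index = 0; index < items.length; index++) {
        fn(items[index], index);
    }
}

const primes = [2, 3, 5, 7, 11, 13, 17];
const seen = [];

// The anonymous function is the "what"; each() hides the "how".
each(primes, function (value) {
    seen.push(value);
});

console.log(seen.join(" ")); // prints "2 3 5 7 11 13 17"
```

The loop counter and the bounds check live in one place, and every caller is reduced to a single function argument.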

Unfortunately, Caché doesn’t support higher-order functions that would provide a way to laconically accomplish your task. But let’s think how we can implement this concept using Caché’s standard means.

For starters, let’s take a look at a primitive implementation of a task that requires a loop to go through a collection and output each element.

set i = collection.Next("")

while (i '= "") {
    set item = collection.GetAt(i)

    w item,!

    set i = collection.Next(i)
}

To begin with, let’s recall that Caché ObjectScript supports the OOP paradigm. And since it does, we should take a look at standard design patterns and try applying them to solve our problem. We need to go through the entire collection and perform an action with each element. This makes me think about the Visitor pattern.

Let’s define the fp.Function class with one abstract “execute” method.

Class fp.Function [ Abstract ] {
    Method execute(item As %Numeric) [ Abstract ] {}
}

Let us now define the implementation of this “interface” — the fp.PrintlnFunction class.

Class fp.PrintlnFunction Extends (fp.Function, %RegisteredObject) {
    Method execute(item As %Numeric) {
        w item,!
    }
}

Okay, let’s edit our original code a bit.

set function = ##class(fp.PrintlnFunction).%New()

set i = list.Next("")

while (i '= "") {
    set item = list.GetAt(i)

    do function.execute(item)

    set i = list.Next(i)
}

Let’s now encapsulate the collection traversal algorithm. Let’s create the IterableStream class.

Class fp.IterableStream Extends %RegisteredObject {
    Property iterator As %Collection.AbstractIterator [ Private ];

    Method %OnNew(iterator As %Collection.AbstractIterator) As %Status [ Private, ServerOnly = 1 ] {
        set ..iterator = iterator

        return $$$OK
    }

    Method forEach(function As Function) {
        set i = ..iterator.Next("")

        while (i '= "") {
            set item = ..iterator.GetAt(i)

            do function.execute(item)

            set i = ..iterator.Next(i)
        }
    }
}

The solution can now be presented in the following way:

do ##class(fp.IterableStream).%New(list).forEach(##class(fp.PrintlnFunction).%New())

You can also encapsulate the creation of the wrapper around the while-loop. To do this, let’s create the Streams class.

Class fp.Streams {
    ClassMethod on(iterator As %Collection.AbstractIterator) As IterableStream {
        return ##class(IterableStream).%New(iterator)
    }
}

You can then rewrite the solution in the following way:

do ##class(fp.Streams).on(list).forEach(##class(fp.PrintlnFunction).%New())

So, the problem has been solved declaratively. Yes, we have new classes. Yes, we have more code now. It must be noted, though, that the resulting code is more concise and transparent. It doesn’t contain distractions but helps us concentrate on the problem being worked on.

If you imagine classes like Function, Streams, IterableStream in the place of standard Caché classes, you will only need to create the PrintlnFunction class.

And that was my two cents on declarative programming in Caché. Happy coding, everyone!

Rubens Silva · Jul 4, 2017

~~As long as there's an abstract API for parsing, lexing, transpiling and serializing. It would be possible to even port any FP or FRP language to Caché.~~

Since you demoed JavaScript, it seems that programming in a functional way could be possible if we could simply pass methods as parameters.

Well, that's actually the core rule for a language that supports functional paradigms.

Since there's currently no support for such a paradigm, maybe we could wrap it using indirection or xecutes?

set array = []

do array.%Push({ "value": "some value to be replaced" })

set result = ##class(FP.Functor).%New().From(array).Map($this, "...ValueWithIndex", scopeParam)

Method ValueWithIndex(item As %DynamicObject, index As %Integer, scopeParams... As %String)
{
    // Second core rule: always keep it pure. Map should always Clone the item, which could be implicit for Map.
    // But for this case I'll demonstrate it manually.

    set clonedItem = item.Clone() // Or %ConstructClone if possible.
    do clonedItem.%Set("value", "modified with "_index)
    return clonedItem
}

Please note that this still doesn't provide the possibility to use higher-order functions. ~~The closest we could have I think is embedding subroutines within your context method. Which could also be reproduced as:~~

~~// Can also be Method.~~

ClassMethod YourContextMethod() As %String
{

~~set scopeParam = "blahblahblah"~~

// Now assume we're using %ZEN.proxyObject. Omitted for brevity.
set result = ##class(FP.Functor).From(array).Map("$$HOMapWithScopeParam", scopeParam)

HOMapWithScopeParam(result, item, value, scopeParam...)

// Now Implicitly cloned into result param.
set result.value = scopeParam(0) // Could be improved.
quit result
}

Nope, I forgot that procedures are exclusive for the subroutine that's defining them.

Mike Kadow · Dec 12, 2016

The definition of Declarative Programming I found is:

In computer science, declarative programming is a programming paradigm—a style of building the structure and elements of computer programs—that expresses the logic of a computation without describing its control flow.

Maybe my ignorance is showing here, but this seems like so much "pie in the sky", and I cannot relate it in a meaningful way to anything. Sorry folks, I guess this is just above my pay-grade.

Alok Saldanha · Dec 12, 2016

Hi Mike,

Based on the definitions I have seen, declarative programming is pretty broad and encompasses everything from SQL to more "functional programming" constructs such as what Maks is proposing. Most people are familiar with the pros and cons of using SQL vs implementing queries in code. I have worked on large projects organized around functional programming principles, and in my experience the benefits are:

Easily tested code. Because all functions take explicit arguments and return values without side effects, they are easily tested.
Safer refactoring. Typically variables and data structures are immutable, so most ways of reordering the code will either throw undefined errors or continue to work as intended. It will not introduce subtle bugs.
Easier refactoring. Because statements are mostly used to define variables, you can take an arbitrary set of statements and make them into a method, or conversely take a method and inline it. This makes it easy to extract functionality that you want to reuse in other places, or change the balance of what is done in a method call vs in the current method.

The above combine to make the code easily changeable with lower risk. The ability to reorganize the code without introducing bugs is the main benefit, and that then enables you to quickly deliver on your commitments. There are other potential benefits, such as it being easier to parallelize computation since there are no mutable data structures to synchronize, but I haven't needed or tried to realize such benefits in practice. I have heard functional programming in Cache could incur a significant performance penalty due to how function calls are implemented, but I would test to see what the impact is, since the benefits may outweigh the performance penalty. This may all seem very "pie in the sky", but in practice it is a useful tool to have that is no more fanciful than SQL.
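As a small illustration of the "easily tested" point, here is a sketch of a pure function in JavaScript. The names `doubled`, `input` and `output` are hypothetical; the input is frozen only to show that the function never needs to modify it.

```javascript
// Pure: the result depends only on the argument, and the argument is not modified.
function doubled(values) {
    return values.map(function (value) {
        return value * 2;
    });
}

// Freezing the input makes any accidental mutation fail instead of going unnoticed.
const input = Object.freeze([1, 2, 3]);
const output = doubled(input);

console.log(output.join(", ")); // prints "2, 4, 6"
console.log(input.join(", "));  // prints "1, 2, 3"
```

A test for such a function needs no setup and no cleanup: it passes a value in and compares the value that comes out.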

Alok, maybe I have not been in on the right projects the last 30 years. But I am mostly concerned in getting code to work efficiently. "Refactoring, parallelize computation, mutable data structures to synchronize?" You are talking over my head, and I think over the heads of most programmers in the trenches. I don't have time to try to understand what you are trying to say, I have real work to do. If you want to have a real impact, say things that really can help me.

Maybe you really have something valuable here, I don't know. But the way it is presented is a major put-off. As always, it is not what you say, but how you say it.

Optimizing code can make it more difficult to understand and change later. Sometimes performance requirements dictate that you need to write the code as optimized as possible. Another approach is to make the code easy to understand so that it is easier to safely change later. That's what refactoring means, and that's where I found functional-style programming to have some benefit. The idea behind optimizing for changeability is that you can profile the code later if necessary and optimize for performance where it matters.

I haven't done much with parallel computation, just a couple of exploratory projects with Apache Spark, which makes use of immutable distributed data sets. The fact that the data sets themselves don't change means that they can be copied around without worrying about other nodes changing them. It also means a given node can throw away any data that is not used as input to a subsequent function. When writing code for Apache Spark, you construct a data set that is never actually instantiated. Instead, you define a series of transformations on it, which are distributed to the nodes that have the data; those nodes execute the transformations, throw away the intermediate results and send only the results of aggregation back to the master node. It's a lot easier to use than it sounds. It's worth looking into if you want to do map-reduce style things. Apache Spark is not particularly efficient, but it lets you throw lots of cheap machines at the problem and get answers quickly, which makes it popular for ad-hoc analysis of very large datasets.
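The "define transformations first, compute only when a result is requested" idea can be sketched in a single process with JavaScript generators. This is only an analogy and not Spark's API; `lazyMap`, `lazyFilter` and the `evaluated` counter are hypothetical names used for illustration.

```javascript
// Each transformation is a generator: it does no work until it is iterated.
function* lazyMap(source, fn) {
    for (const item of source) {
        yield fn(item);
    }
}

function* lazyFilter(source, predicate) {
    for (const item of source) {
        if (predicate(item)) {
            yield item;
        }
    }
}

let evaluated = 0; // counts how many times the mapping function actually ran

const squares = lazyMap([1, 2, 3, 4, 5, 6], function (n) {
    evaluated++;
    return n * n;
});
const evenSquares = lazyFilter(squares, function (n) {
    return n % 2 === 0;
});

// The pipeline is only described so far; nothing has been computed.
const evaluatedBeforeAction = evaluated;

// The "action": iterating pulls items through the pipeline one at a time,
// so no intermediate array of squares is ever built.
let total = 0;
for (const n of evenSquares) {
    total += n;
}

console.log(evaluatedBeforeAction, evaluated, total); // prints "0 6 56"
```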

You have probably been on great projects, but in my experience the Bard's words ring true: "There are more things in heaven and Earth, Horatio, / Than are dreamt of in your philosophy". May you find someone better to explain declarative functional programming. I agree that it is hard to explain these things. I have had much better success gaining converts by simply using the techniques on a real project, so perhaps I should let this thread go.

Maks Atygaev · Dec 13, 2016

The main idea of the post is this: just imagine how exciting it would be if we could write code this way! :-)

Just imagine that Caché ObjectScript allowed us to write code this way.

You could traverse a collection in one line! No auxiliary classes, interfaces or anything like that! :)

My goal is to show another way to write code.

Chris Sprague · Jul 4, 2017

Declarative programming is not 'pie in the sky' programming, though I can understand how one might be doubtful, as great promises are often made without much useful proof.

The 'killer app' for this kind of programming is manipulating collections. Once you've used declarative programming in this context, it will become clear what the power really is. Luckily, 8 of the top 10 popular languages support this style of programming (https://www.tiobe.com/tiobe-index/). [Edit, changed to 8. Pretty sure C and ASM don't have anything like this.]

I don't want to say this is you, but some people have no room for innovation in their work lives, and will never accept new things. That said, this style of programming is at least 40 years old, and comes from Massachusetts!

(I'm assuming "this style of programming" is originating from Scheme, which is debatable at best...)

Jiri Svoboda · Dec 12, 2016

I certainly miss functional programming features like closures, higher-order functions and lazy evaluation every time I come back to ObjectScript from other languages.

Your code provides a nice example of implementing collection methods so we can simulate some FP constructs in COS. Thanks!

Jiri

Maks Atygaev · Dec 12, 2016

Thank you! I'm going to write one more article on the subject.

Rob Tweed · Dec 24, 2016

...of course you could just write your logic in JavaScript and access Cache as a document database: http://www.mgateway.com

Rini van de Vos... · Dec 31, 2016

Hi Alok,
Yes, it is always easier to explain and/or convince someone with a real-life case at hand.
But when you succeed in elevating a solution into a more structural form, like a software pattern, then many more projects will benefit. Even when a solution appears (or actually is) more complex at first sight, the solution to a given problem (aka a software pattern) will become more common and thus less unfamiliar or strange.

Please keep up the good work,
Happy New Year,
Rini

Alok Saldanha · Jan 3, 2017

Good point Rini. Also, by recognizing software patterns you can influence the development of language features. For example, because functional programming has become more common in the JavaScript community, language features such as "fat arrow" notation have been added to ES6 to make it less cumbersome/strange/etc.
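For readers who have not seen it, here is a short side-by-side sketch of the same transformation written with a classic function expression and with ES6 arrow notation. The variable names are hypothetical.

```javascript
const numbers = [1, 2, 3, 4];

// Pre-ES6 style: a full function expression with an explicit return.
const viaFunction = numbers.map(function (n) {
    return n * 2;
});

// ES6 "fat arrow" style: same behavior here, far less ceremony.
const viaArrow = numbers.map(n => n * 2);

console.log(viaFunction.join(", ")); // prints "2, 4, 6, 8"
console.log(viaArrow.join(", "));    // prints "2, 4, 6, 8"
```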

Chris Sprague · Mar 28, 2017

For this to be interesting to production developers, Caché would need syntax support for anonymous functions with compiler support to generate objects (similar to the example) with fields for each free variable to implement the closure. Map, filter and reduce implementations in the built-in libraries would be nice but could also be implemented by the programmer.

Interestingly, I would think this sort of thing would be incredibly useful in Caché because of how prominently arrays are used to represent complex data. Certainly, the first step is to get people interested and asking for it.

This post reminded me of something... To paraphrase Anton van Straaten, closures are the poor man's object; objects are the poor man's closure.

http://people.csail.mit.edu/gregs/ll1-discuss-archive-html/msg03277.html
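The correspondence between closures and objects can be sketched in JavaScript as follows. `makeAdder` and the `Adder` class are hypothetical names; the `execute` method simply mirrors the naming used in the article's fp.Function example. The free variable `increment` captured by the closure becomes a field on the object, which is roughly what a compiler would have to generate.

```javascript
// Closure version: `increment` is a free variable captured by the inner function.
function makeAdder(increment) {
    return function (value) {
        return value + increment;
    };
}

// Object version: the captured variable becomes an explicit field,
// and the function body becomes a method.
class Adder {
    constructor(increment) {
        this.increment = increment;
    }

    execute(value) {
        return value + this.increment;
    }
}

const addFive = makeAdder(5);
const addFiveObject = new Adder(5);

console.log(addFive(10), addFiveObject.execute(10)); // prints "15 15"
```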

Sean Connelly · Mar 28, 2017

Interesting idea, Haskell is certainly influencing other languages, so why not COS.

As an alternative to the OP's code...

1. Create an include file with...

#Define foreach(%c,%l) for i=1:1:%c.Size set %l=%c.GetAt(i) do

2. Then execute code as a semi anonymous function of foreach...

$$$foreach(newCollection,item)
.write !,item

I did a little experiment and here's the result:

ClassMethod testing(item)
{
  set array = ##class(%ListOfObjects).%New()
  for i=1:1:10 {
    set proxy = ##class(%ZEN.proxyObject).%New()
    set proxy.value = i
    do array.Insert(proxy)
  }

  set DoubleItemValueSumTwo = $classname()_":DoubleItemValueSumTwo"
  set Odds = $classname()_":Odds"
  set BiggerThanFive = $classname()_":BiggerThanFive"

  set result = ##class(FP.Functor).From(array).Map(DoubleItemValueSumTwo).Filter(Odds).Every(BiggerThanFive).Result()
  quit result
}

ClassMethod DoubleItemValueSumTwo(
item As %ZEN.proxyObject,
i)
{
  set item.value = (item.value * 2) + $random(2)
  quit item
}

ClassMethod Odds(
item As %ZEN.proxyObject,
i)
{
  quit (item.value # 2 '= 0)
}

ClassMethod BiggerThanFive(
item As %ZEN.proxyObject,
i As %Integer)
{
  quit item.value > 5
}

There is room for a lot of improvement, I think. It's not exactly what you would call a performant implementation. But it's a beginning.

I just posted my implementation on Git for analysis.

https://github.com/rfns/cache-fp-poc