Question
· Apr 15, 2019

Iterate over %List/$lb in C

I'm using callin to get global values.

Here's a simple function to get a string value from a global and return it:

int GetGlobalStr(char *global, CACHE_EXSTRP result)
{
    // Push the global name onto the argument stack.
    int rc = CACHEPUSHGLOBAL((int)strlen(global), global);
    if (rc != CACHE_SUCCESS) return ZF_FAILURE;

    // narg: number of subscript expressions pushed onto the argument stack.
    int narg = 0;

    // flag: indicates behavior when the global reference is undefined:
    // 0: returns CACHE_ERUNDEF
    // 1: returns CACHE_SUCCESS but the return value is an empty string.
    int flag = 1;

    rc = CACHEGLOBALGET(narg, flag);
    if (rc != CACHE_SUCCESS) return ZF_FAILURE;

    // Pop the value from the argument stack into result.
    rc = CACHEPOPEXSTR(result);
    if (rc != CACHE_SUCCESS) return ZF_FAILURE;

    return ZF_SUCCESS;
}

I get the global value in result successfully. However, the value is a $lb and I need to iterate over its elements. How can I do that?


The structure of $LB() is rather simple: it is a binary string in which the elements follow one another.

-----------element--------------
TotalLength = 1, 3, 7 bytes depending on size    *corrected*
Type = 1 byte  (check in JSON converter for codes, or just check with ZZDUMP)
Content : size = TotalLength-1-size of length field
-----------element--------------
TotalLength = 1, 3, 7 bytes depending on size    *corrected*
Type = 1 byte  (check in JSON converter for codes, or just check with ZZDUMP)
Content : size= TotalLength-1-size of length field
-----------element--------------

...

This is why concatenating $lb() values is so easy.
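To make the point about concatenation concrete, here is a minimal sketch in C. It assumes the 1/3/7-byte length layout described above; the function names (lb_element_size, lb_count, lb_concat) are made up for this illustration and are not part of any InterSystems API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Total size in bytes of the element starting at p, or 0 if it does not fit
   in the remaining `avail` bytes. Assumes the 1/3/7-byte length layout. */
size_t lb_element_size(const uint8_t *p, size_t avail)
{
    size_t size;
    if (avail < 1) return 0;
    if (p[0] != 0) {
        size = p[0];                              /* 1-byte length field */
    } else {
        if (avail < 3) return 0;
        size_t n = (size_t)p[1] | ((size_t)p[2] << 8);
        if (n != 0) {
            size = n + 3;                         /* 3-byte length field */
        } else {
            if (avail < 7) return 0;
            n = (size_t)p[3] | ((size_t)p[4] << 8)
              | ((size_t)p[5] << 16) | ((size_t)p[6] << 24);
            size = n + 7;                         /* 7-byte length field */
        }
    }
    return size <= avail ? size : 0;
}

/* Count the elements in a list; returns -1 if the buffer is malformed. */
int lb_count(const uint8_t *list, size_t len)
{
    int count = 0;
    size_t i = 0;
    while (i < len) {
        size_t size = lb_element_size(list + i, len - i);
        if (size == 0) return -1;
        i += size;
        count++;
    }
    return count;
}

/* Concatenate two lists by copying their bytes back to back.
   Returns the combined length, or 0 if `out` is too small. */
size_t lb_concat(uint8_t *out, size_t cap,
                 const uint8_t *a, size_t alen,
                 const uint8_t *b, size_t blen)
{
    assert(out != NULL);
    if (alen + blen > cap) return 0;
    memcpy(out, a, alen);
    memcpy(out + alen, b, blen);
    return alen + blen;
}
```

For example, passing the bytes of $lb(1) (03 04 01) and $lb("a") (03 01 61) to lb_concat gives a 6-byte buffer that lb_count reports as a 2-element list, with no header to rewrite.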

Here's a sample C code to iterate over $lb structure.

#include <math.h>
#include <stdint.h>
#include <string.h>

/// Convert unsigned integer bytes to integer
int64_t makeint(const char *buff, size_t offset, size_t offsetinint, size_t len)
{
    union
    {
        int64_t i64;
        uint8_t u8[8];
    }d64;

    offsetinint = offsetinint & 7;
    memset(&d64, 0, sizeof(d64));

    memcpy(&d64.u8[offsetinint], buff + offset, len > (8 - offsetinint) ? (8 - offsetinint) : len);
    return d64.i64;
}

/// Get the smallest power of 2 that is greater than or equal to v
int64_t next2(int64_t v)
{
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    v |= v >> 32;
    v++;
    return v;
}

/// List types. NONE and NONE2 are placeholders
/// See %CACHE_HOME%\dev\Cache\callout\demo\czf.pdf (Section "Lists")
enum ListTypes {NONE, STRING, USTRING, NONE2, INTP, INTN, DOUBLEP, DOUBLEN, FLOAT};

/// This function iterates over all elements in a list
void ListToTuple(CACHE_EXSTRP result)
{
    // $lb structure
    char* list = result->str.ch;
    int listLength = result->len;

    // current element
    int num = 0;

    // current byte position
    int i=0;

    // length of current element
    int l = 0;

    // datatype
    int type = 0;

    while (i<listLength) {

        // Calculate length of current element - START
        // headerSize = bytes used by the length field plus the type byte
        int headerSize = 2;
        if (0 == (l = (list[i]&255))) {

            // First BYTE is 0, length is in following 2 BYTEs
            if (listLength - i < 3) break; // truncated list
            size_t t_n = ((list[i+1]&255)|((list[i+2]&255)<<8));
            if (t_n != 0) {
                l = (int)t_n + 3;
                headerSize = 4;
            } else {
                // 4 Byte length
                if (listLength - i < 7) break; // truncated list
                l = (int)((list[i+3]&255) | ((list[i+4]&255) << 8) | ((list[i+5]&255) << 16) | ((unsigned)(list[i+6]&255) << 24)) + 7;
                headerSize = 8;
            }
        }
        // Calculate length of current element - END

        // Stop if the element runs past the end of the buffer
        if (l < 0 || l > listLength - i) break;

        // An element too short to hold a type byte is an undefined element, e.g. $lb()
        if (l < headerSize) {
            i += l;
            num++;
            continue;
        }

        // Calculate data position - START
        type = list[i + headerSize - 1];
        int dataStart = i + headerSize;
        int dataLength = l - headerSize;
        // Calculate data position - END
        // Calculate data position - END
        
        if (type==STRING) {
            // 8-bit string: dataLength bytes, not NUL-terminated
            const char* value = list + dataStart;
        } else if (type==USTRING) {
            // Unicode string: dataLength bytes, i.e. dataLength/2 two-byte characters
            const char* value = list + dataStart;
            int charCount = dataLength / 2;
        } else if (type == INTP) {
            int64_t value = makeint(list, dataStart, 0, dataLength);
        } else if (type==INTN) {
            // Stored value is the number plus 2^(8*dataLength);
            // an empty data part therefore decodes to -1.
            int64_t value = makeint(list, dataStart, 0, dataLength);
            if (dataLength < 8) {
                value -= (int64_t)1 << (dataLength * 8);
            }
        } else if (type==DOUBLEP && dataLength >= 1) {
            // First data byte is the signed decimal exponent, the rest is the mantissa
            int64_t temp = makeint(list, dataStart+1, 0, dataLength-1);
            signed char exp = list[dataStart];
            double value = temp*pow(10, exp);
        } else if (type==DOUBLEN && dataLength >= 1) {
            // Same layout, with the mantissa encoded like a negative integer
            int64_t temp = makeint(list, dataStart+1, 0, dataLength-1);
            if (dataLength - 1 < 8) {
                temp -= (int64_t)1 << ((dataLength - 1) * 8);
            }
            signed char exp = list[dataStart];
            double value = temp*pow(10, exp);
        } else if (type==FLOAT) {
            double value;
            memcpy(&value, list+dataStart, 8);
        }
        
        i += l;
        num++;
    }
}

There are some issues with the above C code for manipulating a string of $LIST elements.

I do not believe the above code will work on a big-endian platform, such as PowerPC running the AIX operating system.  Conversions from $LIST representation to numeric representation will order the bytes backwards on big-endian hardware.
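One way around the byte-order problem is to assemble the value with shifts instead of copying bytes into an integer. The following is only a sketch: the function names are made up for this post, and the negative-integer rule (missing high-order bytes are treated as 0xFF) is an assumption based on how the sample code treats an empty data part as -1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble `len` little-endian bytes (len <= 8) into an unsigned 64-bit value.
   Shifts operate on values, not memory, so host byte order does not matter. */
uint64_t le_to_u64(const unsigned char *p, size_t len)
{
    assert(len <= 8);
    uint64_t v = 0;
    for (size_t k = 0; k < len; k++) {
        v |= (uint64_t)p[k] << (8 * k);
    }
    return v;
}

/* Negative integers (assumption): the bytes that were not stored are 0xFF. */
int64_t le_to_negative_i64(const unsigned char *p, size_t len)
{
    assert(len <= 8);
    uint64_t v = le_to_u64(p, len);
    if (len < 8) {
        v |= ~(uint64_t)0 << (8 * len);   /* fill the missing high bytes */
    }
    return (int64_t)v;
}
```

With this approach the single data byte FE decodes to -2 and an empty data part decodes to -1 on both little-endian and big-endian hosts.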

Decimal floating-point values in IRIS and Caché have almost 19 decimal digits of precision while the above code translates these numbers to IEEE binary double-precision floating-point values which have less than 16 decimal digits of precision.  This means that $LISTBUILD(0.3) will be converted by the above code into the value 0.29999999999999998889... .  The above code also introduces a double-round error so that decoding $listbuild(x) and $listbuild($double(x)) will not always be equal because the $double(x) function in IRIS/Caché will do the conversion from decimal to binary without a double round.
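To avoid the precision loss, a decoder can keep the mantissa and the decimal exponent as they are stored and never go through a C double. The sketch below shows the idea; ListDecimal and decimal_to_string are hypothetical names for this illustration, and the output uses a C-style leading zero rather than the canonical ObjectScript form.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for a decoded decimal element:
   value = mantissa * 10^exponent. */
typedef struct {
    int64_t mantissa;
    int exponent;
} ListDecimal;

/* Write the exact decimal text of d into out.
   Returns 0 on success, -1 if out is too small. */
int decimal_to_string(ListDecimal d, char *out, size_t cap)
{
    assert(out != NULL);
    char digits[24];
    /* Work on the magnitude as unsigned so INT64_MIN is handled too. */
    uint64_t mag = d.mantissa < 0 ? 0 - (uint64_t)d.mantissa
                                  : (uint64_t)d.mantissa;
    int n = snprintf(digits, sizeof digits, "%" PRIu64, mag);

    int zeros_after = (d.exponent > 0 && mag != 0) ? d.exponent : 0;
    int frac = d.exponent < 0 ? -d.exponent : 0;  /* digits after the point */
    int lead_zeros = frac >= n ? frac - n : 0;    /* zeros right after "0." */
    int int_digits = frac >= n ? 0 : n - frac;    /* digits before the point */

    size_t need = (size_t)((d.mantissa < 0 ? 1 : 0)
                + (int_digits ? int_digits : 1)
                + (frac ? 1 + frac : 0)
                + zeros_after + 1);
    if (need > cap) return -1;

    char *w = out;
    if (d.mantissa < 0) *w++ = '-';
    if (int_digits == 0) *w++ = '0';
    memcpy(w, digits, (size_t)int_digits);
    w += int_digits;
    if (frac > 0) {
        *w++ = '.';
        memset(w, '0', (size_t)lead_zeros);
        w += lead_zeros;
        memcpy(w, digits + int_digits, (size_t)(n - int_digits));
        w += n - int_digits;
    }
    memset(w, '0', (size_t)zeros_after);
    w += zeros_after;
    *w = '\0';
    return 0;
}
```

A mantissa of 3 with exponent -1 comes out as the text 0.3, with no binary rounding involved.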

The above code is inconsistent in the resulting type of integer values.  Consider:
USER>set L5A=$LISTBUILD(5),L5B=$LISTBUILD(50/10)
USER>WRITE $LISTSAME(L5A,L5B)
1

The $LIST elements in L5A and L5B contain the same value, 5.  However, the above code will convert L5A to the C int64_t type while it will convert L5B to the C double type.  If the value of the integer is greater than 2**53 then the above C code can convert the identical integer values into different integer values.

Also, the above code does not correctly handle all the special cases when (type==FLOAT) is true.

The IRIS/Caché $LIST representation is not as simple as most people think.  There are some unusual rules that must be followed if you want to get the same results as the $LISTxxx functions get in IRIS/Caché.