Article
· Jun 1, 2024 2m read

Searching for the limits of the new datatype VECTOR

Translated from the Spanish Community Article Contest.

Following the latest programming contest on OEX, I made a surprising observation.
Almost all of the applications were based on AI in combination with pre-cooked Python modules.
But when I dug deeper, all the examples used the same technical pieces of IRIS.

Seen from the point of view of IRIS, it was pretty much the same whether they were searching
for text, for images, or for other patterns. It all ended in almost interchangeable methods.

This reminds me of my private situation at home. My wife and my daughter maintain an
(to me) incredibly huge collection of skirts, shirts and all other clothes. But at the end
of the day it's my wife and my daughter I talk to and live with,
no matter what wrapping is applied.

Back to the contest:
A lot of fancy wrapping for more or less the same technical IRIS content.
Everyone was running down the same highway. No one ever touched any limit.

So I tried to dig deeper and find the limits of the datatype VECTOR.
All vectors have 2 base parameters:
- static DATATYPE: "integer" (or "int"), "double", "decimal", "string", or "timestamp".
- semi-dynamic LEN(gth): > 0, often also referred to as POSITION; a pure integer.
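
A minimal sketch of how these two parameters show up in practice. The variable name vec is only illustrative, and the "count" operation of $vectorop should be checked against the documentation of your IRIS version:

```objectscript
 kill vec
 ; the data type is passed as the third argument of the SET form
 set $vector(vec,1,"int")=10
 ; assigning to a higher position extends the vector;
 ; positions 2 and 3 are left undefined
 set $vector(vec,4,"int")=40
 ; read back a single position
 write $vector(vec,4),!
 ; number of defined positions (assumed here: the "count" aggregate)
 write $vectorop("count",vec),!
```

The data type is fixed per vector, while the length simply grows with the highest position you assign, which is why only the position has a limit worth probing.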

This LEN/POSITION parameter is the equivalent of what you know as the mathematical dimensions of a vector.
Of course, within Einstein's universe you may need just 4 dimensions or fewer,
based on his Theory of Relativity.
Even cosmological string theory, which came up in the 1960s, doesn't go beyond 11 or 12 dimensions in its common variants.
But all the nice pre-cooked text analysis solution packages use 238, 364, >1200, ...
dimensions and probably more.

So: What limit does IRIS set on the possible positions?
The official documentation has no answer.
So I took my terminal window and tried:

for i=1:1 set $vector(test,i,"int")=i
;; very fast
<VECTOR>
zwrite i
i=65537

I tried with all data types: the limit is always 65536 positions.
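
The same probe can be wrapped in try/catch so that the error is captured instead of stopping the loop at the prompt. This is only a sketch for a routine or method body. The names test, last and ex are illustrative, and it assumes that the loop is ended by the <VECTOR> error seen above:

```objectscript
 kill test
 set last=0
 try {
     // open-ended loop: it only ends when the SET raises an error
     for i=1:1 {
         set $vector(test,i,"int")=i
         set last=i
     }
 } catch ex {
     // for a system error, ex.Name holds the error name, e.g. <VECTOR>
     write "stopped by ",ex.Name," after position ",last,!
 }
```

Replacing "int" with another data type (and a fitting value) repeats the test for that type.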

OK. For the numeric types, element size * 65536 is clearly under the magic <MAXSTRING> limit, which is bigger than 3 MB.

BUT: What happens with type "string" if each element has a significant size?

The impressive result:
I succeeded with 65536 positions and a STRING of 3,600,000 bytes in each of them.
The test_string is a few kB under <MAXSTRING>.
And yet! That is 225,000 MB in total in a single VECTOR!
I fail to imagine how this could be done.
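
The total can be verified with a little arithmetic at the prompt. The variable names are illustrative, and the figure covers only the string payload, not any internal overhead of the vector:

```objectscript
 set positions=65536
 set bytesPerString=3600000
 set total=positions*bytesPerString
 write total,!                 ; 235929600000 bytes
 write total/(1024*1024),!     ; 225000 MB, counting 1024*1024 bytes per MB
```

In decimal units that is roughly 236 GB.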

No doubt handling this unusual giant takes time, and you have to wait quite a while for any access.
But it demonstrates that the datatype VECTOR is able to serve all practical requirements
without being limited by design.

I wish you much success working with VECTORs.
