Article
· Apr 27, 2024 3m read

Geo Vector Search #2

Technical surprises using VECTORs

Building my technology example led me to a number of findings that I want to share.
The first vectors I touched appeared with text analysis and had more than 200 dimensions.
I have to confess that I feel at home in Einstein's 4-dimensional world.
The 7 to 15 dimensions that populate string theory are somewhat across the border.
But 200 and more is definitely far beyond my mathematical horizon.

Translator's note: I share with Robert the difficulty of abstracting a large number of dimensions, which, for me, makes the following example very relevant.

So I turned to our planet and found that a 2-dimensional (latitude, longitude) vector was enough for testing.
A handy table of capitals was found and provided sample test data (shortened).

CAPITAL COUNTRY LATITUDE LONGITUDE
Kabul Afghanistan 34.28N 69.11E
Tirana Albania 41.18N 19.49E
Algiers Algeria 36.42N 03.08E
Pago Pago American Samoa 14.16S 170.43W
Andorra la Vella Andorra 42.31N 01.32E
Luanda Angola 08.50S 13.15E
Saint John's Antigua and Barbuda 17.127N 61.846W
Buenos Aires Argentina 36.30S 60.00W
Yerevan Armenia 40.10N 44.31E
Oranjestad Aruba 12.32N 70.02W
Canberra Australia 35.15S 149.08E
Vienna Austria 48.12N 16.22E
Baku Azerbaijan 40.29N 49.56E
Nassau Bahamas 25.05N 77.20W
Manama Bahrain 26.10N 50.30E
Dhaka Bangladesh 23.43N 90.26E
Bridgetown Barbados 13.05N 59.30W
Minsk Belarus 53.52N 27.30E
Brussels Belgium 50.51N 04.21E

#1 Loading this tab-separated text file with LOAD DATA (SQL) worked perfectly.

#2 Transforming the geographic coordinates into INT was a minor coding exercise.
It resulted in a ClassMethod projected as an SQL procedure and used in an UPDATE over the table.

#3 As geographic coordinates refer to (0°N,0°W), somewhere in the Atlantic, this is only a theoretical base for my vectors.
IRIS supports some VECTOR functions, but I found no AddVector() or SubtractVector() function,
so I did this "manually" from the input coordinates.
The coordinates have to be transformed relative to a useful base point so that the vectors can be compared later.
So there are static BASE coordinates and active WORK coordinates.

Getting the vector's values is easy in SQL using the %EXTERNAL() function,
while in ObjectScript I ended up with

        set vectorvalues=##class(%Vector).LogicalToOdbc(vectorvariable) 

This was less impressive for working with vectors.

#4
Similarity is calculated with the VECTOR_COSINE() function.
It takes the angle between 2 vectors and returns its cosine, a value between +1 and -1.
The input needs 2 vectors of the same type and the same dimension.
The examples in the documentation work fine if you compose your SQL string as suggested,
and TO_VECTOR(?,type,size) is OK with %SQL.Statement for execution.
BUT:

I tried it with embedded SQL.
Code checking signalled some disagreement, but compilation went through without problems.
At runtime it turned out that host variables in TO_VECTOR(:myvec,INT,2) failed,
whatever combination of quotes, braces, ... I tried.
So be warned. I returned to %SQL.Statement to get my VCOS done.

#5 I was surprised to learn how widely VECTOR_COSINE spreads.
Checking the vector Paris >> Bucuresti traced half of the Middle East and East Asia.
So limiting the results to > 0.999 is a good practice in this scenario.

Article
· Apr 26, 2024 3m read

Geo Vector Search #2

Technical surprises using VECTORs
>>> UPDATED

Building my tech example provided me with a bunch of findings that I want to share.
The first vectors I touched appeared with text analysis and had more than 200 dimensions.
I have to confess that I feel comfortable in Einstein's 4-dimensional world.
The 7 to 15 dimensions populating string theory are somewhat across the border.
But 200 and more is definitely far beyond my mathematical horizon.

So I looked to our globe and found that a (latitude,longitude) vector with 2 dimensions is enough for testing.
A handy table of capitals was found and provided sample test data (shortened).

CAPITAL COUNTRY LATITUDE LONGITUDE
Kabul Afghanistan 34.28N 69.11E
Tirana Albania 41.18N 19.49E
Algiers Algeria 36.42N 03.08E
Pago Pago American Samoa 14.16S 170.43W
Andorra la Vella Andorra 42.31N 01.32E
Luanda Angola 08.50S 13.15E
Saint John's Antigua and Barbuda 17.127N 61.846W
Buenos Aires Argentina 36.30S 60.00W
Yerevan Armenia 40.10N 44.31E
Oranjestad Aruba 12.32N 70.02W
Canberra Australia 35.15S 149.08E
Vienna Austria 48.12N 16.22E
Baku Azerbaijan 40.29N 49.56E
Nassau Bahamas 25.05N 77.20W
Manama Bahrain 26.10N 50.30E
Dhaka Bangladesh 23.43N 90.26E
Bridgetown Barbados 13.05N 59.30W
Minsk Belarus 53.52N 27.30E
Brussels Belgium 50.51N 04.21E

#1
Loading that TAB-separated text file with LOAD DATA (SQL) worked perfectly.

#2
Transforming the geo coordinates into INT was a minor coding exercise.
It resulted in a ClassMethod projected as an SQL Procedure and used in an UPDATE over the table.
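
The ClassMethod itself is not shown in the article. As a rough illustration only, the conversion might look like the following Python sketch. It assumes that the numeric part is simply multiplied by 100 and that S and W become negative; the function name geo_to_int is hypothetical and the real ObjectScript method may use a different encoding.

```python
from decimal import Decimal


def geo_to_int(text: str) -> int:
    """Convert a value such as '34.28N' or '170.43W' to a signed integer.

    Assumption (not confirmed by the article): the numeric part is scaled
    by 100, and the southern / western hemispheres become negative.
    """
    text = text.strip()
    value, hemisphere = text[:-1], text[-1].upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere in {text!r}")
    # Decimal avoids binary floating-point surprises when scaling by 100.
    scaled = int((Decimal(value) * 100).to_integral_value())
    return -scaled if hemisphere in ("S", "W") else scaled


# Kabul from the sample table: 34.28N 69.11E
print(geo_to_int("34.28N"), geo_to_int("69.11E"))
```

Working with Decimal rather than float keeps values like 34.28 exact before they are scaled.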

#3
As geo coordinates refer to (0°N,0°W), somewhere in the Atlantic, this is just a theoretical base for my vectors.
IRIS supports some VECTOR functions, but I found no AddVector() or SubtractVector() function,
so this was done "manually" from the input coordinates.
The need arises from transforming coordinates to a useful base point for comparing vectors later.
So you see static BASE coordinates and active WORK coordinates.
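
The "manual" subtraction can be pictured with a small Python sketch. It is not the article's ObjectScript code. The helper name rebase is hypothetical, and the integer pairs assume the table values scaled by 100.

```python
def rebase(work, base):
    """Return the vector pointing from `base` to `work`.

    Both arguments are (latitude, longitude) integer pairs. This is a plain
    component-wise subtraction, done by hand because no vector subtraction
    function was found.
    """
    return (work[0] - base[0], work[1] - base[1])


# Sample rows from the table, scaled by 100 (assumed encoding).
brussels = (5051, 421)   # 50.51N 04.21E, used here as the BASE location
vienna = (4812, 1622)    # 48.12N 16.22E

print(rebase(vienna, brussels))
```

The BASE coordinates stay untouched, and the WORK coordinates are recomputed whenever a different base location is chosen.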

Getting the vector's values is easy in SQL using the %EXTERNAL() function,
while in ObjectScript I ended up with
        set vectorvalues=##class(%Vector).LogicalToOdbc(vectorvariable) 
This was less impressive for working with vectors.

#4
Similarity is calculated with the VECTOR_COSINE() function.
It takes the angle between 2 vectors, and the cosine maps it to a value between +1 and -1.
The input needs 2 vectors of the same type and the same dimension.
The examples in the documentation work fine if you compose your SQL string as suggested,
and TO_VECTOR(?,type,size) is OK with %SQL.Statement for execution.
BUT:
I tried it with embedded SQL.
Code checking signalled some disagreement, but it compiled without problems.
At runtime it turned out that host variables in TO_VECTOR(:myvec,INT,2) failed,
whatever combination of quotes, braces, ... I tried.
So be warned. I returned to %SQL.Statement to get my VCOS done.

UPDATE: TO_VECTOR(:myvec,INT,2) does work in embedded SQL when the host variable is set like this:
     set myvec="1314,-7979" 

Nothing special: just a plain string with comma-separated values.
It seems I couldn't believe that such a simple approach would work.
My apologies to ISC Engineering.

#5
It was a surprise to learn how widely VECTOR_COSINE spreads.
Checking the vector Paris >> Bucuresti traced half of Middle and East Asia.
So limiting the results to > 0.999 is a good practice in this scenario.
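
How a threshold narrows the result can be sketched in Python. The sketch recomputes the cosine by hand instead of calling VECTOR_COSINE(), and it treats the coordinates as plain planar values scaled by 100. The function names and the choice of Brussels and Vienna are illustrative assumptions, using only rows from the sample table.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two 2-D vectors."""
    norm = math.hypot(*a) * math.hypot(*b)
    if norm == 0:
        raise ValueError("a zero-length vector has no direction")
    return (a[0] * b[0] + a[1] * b[1]) / norm


def best_matches(base, target, capitals, threshold=0.999):
    """Return (name, score) pairs pointing in nearly the same direction
    as the vector base -> target, best score first."""
    ref = (target[0] - base[0], target[1] - base[1])
    hits = []
    for name, (lat, lon) in capitals.items():
        vec = (lat - base[0], lon - base[1])
        if vec == (0, 0):
            continue  # the base location itself has no direction
        score = cosine(ref, vec)
        if score > threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Sample rows from the table, scaled by 100 (assumed encoding).
CAPITALS = {
    "Vienna": (4812, 1622),
    "Minsk": (5352, 2730),
    "Tirana": (4118, 1949),
    "Baku": (4029, 4956),
    "Kabul": (3428, 6911),
}
BRUSSELS = (5051, 421)

for name, score in best_matches(BRUSSELS, CAPITALS["Vienna"], CAPITALS):
    print(f"{name}: {score:.4f}")
```

Even with this tiny sample, a far-away capital such as Baku lies almost exactly on the Brussels >> Vienna line. Lowering the threshold to 0.99 lets Kabul in as well, which shows the same wide spread.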


Article
· Apr 26, 2024 2m read

Geo Vector Search #1

Geographic use of vector search

The basic idea is to use vectors in the mathematical sense.
I used geographic coordinates. These are of course only 2-dimensional,
but much easier to follow than the vectors in text analysis with >200 dimensions.

The example loads a list of worldwide capitals with their coordinates.
The coordinates are interpreted as vectors from the geographic point 0°N/0°W
(some very wet spot in the Gulf of Guinea, >400 km from the African coast).
Finding common directions from that spot is a rather theoretical case,
so an adjustment to your preferred starting point is implemented.
Now finding similar directions for some target city makes sense.
It's a mathematical use of the VECTOR_COSINE() function, other than text search.

And as this is just 2-dimensional, COSINE is just what we (hopefully) learned at school.
So the results are far easier to understand:

  • 1 = total match, same direction, 0° deviation from the original
  • 0 = no match at all, direction points 90° away from the original
  • -1 = totally opposite direction, pointing backward by 180° from the original
  • ~0.999 = quite close to the original

You only get information on the direction, not on the length.
So your vector from Paris to Budapest also points to Minsk or someplace in Asia.
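
The values in the list above can be checked with the school formula in a few lines of Python. This is only a sketch of the math, not a call to VECTOR_COSINE(), and the helper deviation_degrees is a hypothetical name.

```python
import math


def cosine(a, b):
    """Dot product divided by the product of the two vector lengths."""
    norm = math.hypot(*a) * math.hypot(*b)
    if norm == 0:
        raise ValueError("a zero-length vector has no direction")
    return (a[0] * b[0] + a[1] * b[1]) / norm


def deviation_degrees(score):
    """Angle in degrees that corresponds to a cosine score."""
    # Clamp to [-1, 1] so tiny rounding errors cannot break acos().
    return math.degrees(math.acos(max(-1.0, min(1.0, score))))


print(cosine((1, 0), (3, 0)))    # same direction, different length
print(cosine((1, 0), (0, 2)))    # perpendicular
print(cosine((1, 0), (-4, 0)))   # opposite direction
print(deviation_degrees(0.999))  # angle behind a score of 0.999
```

The first call shows that the length of a vector does not matter: only the direction enters the score. A score of 0.999 corresponds to a deviation of roughly 2.6°.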

The demo is controlled by a rather simple menu:

  Use Geographc Vectors
=========================
     1 - Initialize Tables
     2 - Import Data
     3 - Set Base Location
     4 - Generate Vectors
     5 - Select Target Location
     6 - Show Best Matches
Select Function or * to exit :

For multiple retries, you always restart at

  • #3 set your starting location
  • #4 adjust coordinates to your selected base
  • #5 set your target location, which defines your base vector
  • #6 see what's in between or in front of your vector
    • adjust tolerance from -1...+1

Question
· Apr 18, 2024

initial user account

I installed a local Docker container instance from here:   intersystemsdc/iris-community

I'm trying to log in:    http://localhost:52773/csp/sys/UtilHome.csp

I thought if I used SYSTEM as the initial username, I could log in, but I get #822 Access Denied.

Is that the correct username? Maybe there's a better location to pull the Docker instance from.

***************Never mind, figured it out.

Question
· Apr 11, 2024

Question about InterSystems API Manager (IAM) install from tar file with IRIS running locally

I downloaded IAM-3.4.2.0-5604.tar.gz from the Online Distribution site this morning, with the intention of installing it on our Development environment to see if it is a viable solution. Following the instructions, I have run into an issue trying to make sure I am entering the information into the prompts correctly.

I have IRIS HealthShare Health Connect 2024.1 running locally using a Local Web Server, so when prompted I entered the IP address and port 443. Is that correct?

:>iam-setup.sh
Welcome to the InterSystems IRIS and InterSystems API Manager (IAM) setup script.
This script sets the ISC_IRIS_URL environment variable that is used by the IAM container to get the IAM license key from InterSystems IRIS.
Enter the full image repository, name and tag for your IAM docker image:
intersystems/iam:3.4.1.0
Enter the IP address for your InterSystems IRIS instance. The IP address has to be accessible from within the IAM container, therefore, do not use "localhost" or "127.0.0.1" if IRIS is running on your local machine. Instead use the IP address of your local machine. If IRIS is running in a container, use the IP address of the host environment, not the IP address of the IRIS container:
xxx.xxx.xxx.xxx
Enter the web server port for your InterSystems IRIS instance:
443
Enter the password for the IAM user for your InterSystems IRIS instance:
Re-enter your password:
If local policy requires that HTTPS be used for communication, please provide the full path to your CA Certificate file now. Otherwise hit "Return":
/etc/pki/ca-trust/source/anchors/OSUWMC_CA.pem
If your InterSystems IRIS instance is only accessible via its CSPConfigName URL prefix, please provide the prefix with a trailing slash (/) now. Otherwise hit "Return":

Your inputs are:
Full image repository, name and tag for your IAM docker image: intersystems/iam:3.4.1.0
IP address for your InterSystems IRIS instance: xxx.xxx.xxx.xxx
Web server port for your InterSystems IRIS instance: 443
CA Certificate for HTTPS: /etc/pki/ca-trust/source/anchors/OSUWMC_CA.pem
CSPConfigName URL prefix:
Would you like to continue with these inputs (y/n)?
y
Getting IAM license using your inputs...

Couldn't reach InterSystems IRIS at xxx.xxx.xxx.xxx:443. One or both of your IP and Port are incorrect.

I have verified that...

  • IAM user is enabled
  • /api/iam is enabled

What port should be specified if you are running a Local Web Server/Web Gateway?

Thanks

Scott
