Article
· May 13 3m read

Vector Search: Application to Demonstrate Vector Search and Generative AI Capabilities

Hi Community,

In this article, I will introduce my application IRIS-VectorLab, along with a step-by-step guide to performing vector operations.

IRIS-VectorLab is a web application that demonstrates Vector Search with the help of Embedded Python. It uses the Python framework SentenceTransformers to generate state-of-the-art sentence embeddings.

Application Features

  • Text-to-embeddings translation
  • VECTOR-typed data insertion
  • View vector data
  • Perform vector search using the VECTOR_DOT_PRODUCT and VECTOR_COSINE functions
  • Demonstrate the difference between normal and vector search
  • Hugging Face text generation with the GPT-2 LLM (Large Language Model) and a Hugging Face pipeline
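
As a rough illustration of the two similarity measures named in the feature list, the sketch below computes a dot product and a cosine similarity in plain Python. It is not how IRIS implements VECTOR_DOT_PRODUCT or VECTOR_COSINE; the helper names `dot_product`, `cosine_similarity`, and `normalize` are hypothetical, and the vectors are toy values rather than real embeddings. It shows that the two measures agree once the vectors are scaled to unit length.

```python
import math

def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Dot product divided by the product of the two vector lengths."""
    return dot_product(a, b) / (math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b)))

def normalize(a):
    """Scale a vector to unit length."""
    length = math.sqrt(dot_product(a, a))
    return [x / length for x in a]

# Toy 3-dimensional vectors (assumption: real embeddings have far more dimensions)
v1 = [1.0, 2.0, 2.0]
v2 = [2.0, 0.0, 1.0]

raw_dot = dot_product(v1, v2)                         # depends on vector lengths
cosine = cosine_similarity(v1, v2)                    # depends only on direction
unit_dot = dot_product(normalize(v1), normalize(v2))  # dot product of unit vectors

print(raw_dot, cosine, unit_dot)
```

For unit-length vectors, `cosine` and `unit_dot` are the same number, which is why a dot product is enough for ranking when the embeddings are normalized.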

To demonstrate the functionality of Vector Search, we will follow the steps below:

  • Step 1: Generate embeddings and save vector data to IRIS
  • Step 2: View vector data
  • Step 3: Perform vector search

So let us start.

Step 1: Generate embeddings and save vector data to IRIS

The Python method below generates the embedding for a description and saves the vector data to IRIS:

// Save vector data
ClassMethod SaveData(desc As %String) As %String [ Language = python ]
{
    # Required to call ObjectScript methods from Python
    import iris
    import pandas as pd
    from sentence_transformers import SentenceTransformer

    # Step 1: Prepare the data
    documents = [desc]
    # Convert to a DataFrame for data manipulation and name the column
    df = pd.DataFrame(documents, columns=['description'])

    # Step 2: Generate document embeddings
    model = SentenceTransformer('all-MiniLM-L6-v2')
    document_embeddings = model.encode(documents)
    # Store each embedding as a list in a new DataFrame column
    df['description_vector'] = document_embeddings.tolist()

    # Step 3: Call the SaveVector method of the same class for each row
    for index, row in df.iterrows():
        iris.cls(__name__).SaveVector(row['description'], str(row['description_vector']))
}
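// Method to save vector data
ClassMethod SaveVector(desc As %String, descvec As %String) As %Status
{
    // Insert the description and its vector into the VectorLab table
    &sql(INSERT INTO SQLUser.VectorLab VALUES (:desc, TO_VECTOR(:descvec)))
    if SQLCODE '= 0 {
        write !, "Insert failed, SQLCODE= ", SQLCODE, !, $get(%msg)
        // Return an error status instead of an argumentless QUIT, since the method is declared As %Status
        return $$$ERROR($$$SQLError, SQLCODE, $get(%msg))
    }
    return $$$OK
}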
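
SaveData hands the embedding to SaveVector as a string built with `str()` on a Python list. The short sketch below shows what that string looks like, using a toy three-element list in place of a real embedding. It assumes, as the code above does, that TO_VECTOR accepts this bracketed, comma-separated form; the variable names are only for illustration.

```python
import json

# Toy embedding standing in for one row of document_embeddings.tolist()
embedding = [0.25, -0.5, 0.125]

# str() on a list of floats yields a bracketed, comma-separated string
vector_string = str(embedding)
print(vector_string)

# The same text is valid JSON, so it can be parsed back as a sanity check
round_trip = json.loads(vector_string)
```

Printing the string before inserting it is a cheap way to check what is actually being sent to the database.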

 

Step 2: View vector data

The method below returns either the plain description or the vector data for a given ID:


// View vector data by ID
ClassMethod ViewData(id As %Integer, opt As %Integer) As %String
{
    // If opt = 1, return the plain-text description
    if opt = 1 {
        &sql(SELECT description INTO :desc FROM SQLUser.VectorLab WHERE ID = :id)
        // SQLCODE < 0 is an error; SQLCODE = 100 means no row matched the ID
        if SQLCODE '= 0 { write "SQLCODE error ", SQLCODE, " ", $get(%msg)  return "" }
        return desc
    }
    // If opt = 2, return the vector data
    if opt = 2 {
        &sql(SELECT description_vector INTO :desc FROM SQLUser.VectorLab WHERE ID = :id)
        if SQLCODE '= 0 { write "SQLCODE error ", SQLCODE, " ", $get(%msg)  return "" }

        // Count the number of elements in the vector
        set count = $vectorop("count", desc)
        set vectorStr = ""
        // Iterate over all elements and concatenate them into a comma-separated string
        for i = 1:1:count {
            if (i = 1) {
                set vectorStr = $vector(desc, i)
            } else {
                set vectorStr = vectorStr_", "_$vector(desc, i)
            }
        }
        return vectorStr
    }
    // Any other value of opt returns an empty string
    return ""
}

 

Vector data can also be viewed from the Management Portal.

 

Step 3: Perform vector search

The method below embeds the search phrase, runs a vector search against the stored vectors, and prints the five closest matches.

ClassMethod VectorSearch(aurg As %String) As %String [ Language = python ]
{
    # Import Python libraries
    import iris
    import pandas as pd
    from sentence_transformers import SentenceTransformer

    # Assign the model
    model = SentenceTransformer('all-MiniLM-L6-v2')

    # Convert the search phrase into a normalized embedding, serialized as a string for TO_VECTOR
    search_vector = str(model.encode(aurg, normalize_embeddings=True).tolist())

    # Prepare and execute the SQL statement: top 5 rows ranked by dot product with the search vector
    stmt = iris.sql.prepare("SELECT TOP 5 id, description FROM SQLUser.VectorLab ORDER BY VECTOR_DOT_PRODUCT(description_vector, TO_VECTOR(?)) DESC")
    results = stmt.execute(search_vector)
    results_df = pd.DataFrame(results)
    print(results_df.head())
}
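
To make the ranking step more concrete, here is a minimal plain-Python sketch of what ordering by a dot product in descending order does, contrasted with an ordinary substring match. It does not touch IRIS or SentenceTransformers: the rows, the three-dimensional vectors, the query vector, and the helper names `keyword_search` and `vector_search` are all made up for illustration.

```python
# Toy rows: (id, description, made-up 3-dimensional unit-length vector)
rows = [
    (1, "The cat sleeps on the sofa", [1.0, 0.0, 0.0]),
    (2, "A kitten naps on the couch", [0.8, 0.6, 0.0]),
    (3, "Stock prices fell sharply", [0.0, 0.0, 1.0]),
]

def dot_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def keyword_search(rows, term):
    """Plain text match: ids of rows whose description contains the term."""
    return [row_id for row_id, desc, _ in rows if term.lower() in desc.lower()]

def vector_search(rows, query_vector, top=5):
    """Rank rows by dot product with the query vector, highest first."""
    ranked = sorted(rows, key=lambda r: dot_product(r[2], query_vector), reverse=True)
    return [row_id for row_id, _, _ in ranked[:top]]

# Made-up query vector standing in for the embedding of the word "cat"
query_vector = [1.0, 0.0, 0.0]

keyword_hits = keyword_search(rows, "cat")              # only the literal match
vector_hits = vector_search(rows, query_vector, top=2)  # closest vectors first

print(keyword_hits, vector_hits)
```

With these toy values, the substring match finds only the row that literally contains "cat", while the vector ranking also surfaces the "kitten" row because its vector points in a similar direction.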

For more details, please visit the IRIS-VectorLab Open Exchange application page.

Thanks
