#Vector Search


Vector search is a method used in information retrieval and machine learning to find similar items based on their mathematical representations as vectors. In this approach, each item is represented as a high-dimensional vector, with each dimension corresponding to a feature or characteristic of the item. Vector search algorithms then compare these vectors to find similar items, meaning items that share features or lie close together in the vector space.
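As a rough illustration of the idea, the sketch below ranks a few items by cosine similarity to a query vector. The three-dimensional vectors and the item names are made up for the example; real embeddings usually have hundreds of dimensions, and this is plain Python rather than any particular product's search engine.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, items, top_k=2):
    """Return the names of the top_k items most similar to the query."""
    ranked = sorted(
        items.items(),
        key=lambda pair: cosine_similarity(query, pair[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:top_k]]


# Hypothetical toy vectors: each dimension stands for one feature.
catalog = {
    "apple": [0.9, 0.1, 0.0],
    "pear": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.1, 0.9],
}

print(nearest([1.0, 0.0, 0.0], catalog))  # ['apple', 'pear']
```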

Article Nicole Raimundo · May 15, 2024 9m read

DNA Similarity and Classification was developed as a REST API utilizing InterSystems Vector Search technology to investigate genetic similarities and efficiently classify DNA sequences. This is an application that utilizes artificial intelligence techniques, such as machine learning, enhanced by vector search capabilities, to classify genetic families and identify known similar DNAs from an unknown input DNA.

K-mer Analysis: Fundamentals in DNA Sequence Analysis

Fragmentation of a DNA sequence into k-mers is a fundamental technique in genetic data processing.
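A minimal sketch of k-mer fragmentation, assuming the usual definition of overlapping substrings of length k taken with a sliding window of step 1. The function names are illustrative and are not taken from the application itself.

```python
from collections import Counter


def kmers(sequence, k):
    """Split a DNA sequence into its overlapping substrings of length k."""
    sequence = sequence.upper()
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def kmer_counts(sequence, k):
    """Count how often each k-mer occurs; a common input for classifiers."""
    return Counter(kmers(sequence, k))


print(kmers("ATGCGA", 3))  # ['ATG', 'TGC', 'GCG', 'CGA']
```

A sequence shorter than k yields an empty list, since no full window fits.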

Article José Pereira · May 14, 2024 11m read

TL;DR

This article introduces using the langchain framework supported by IRIS for implementing a Q&A chatbot, focusing on Retrieval Augmented Generation (RAG). It explores how IRIS Vector Search within langchain-iris facilitates storage, retrieval, and semantic search of data, enabling precise and up-to-date responses to user queries. Through seamless integration and processes like indexing and retrieval/generation, RAG applications powered by IRIS enable the capabilities of GenAI systems for InterSystems developers.

Announcement Henrique Dias · May 15, 2024

Hello developers, 

Our project was designed to optimize patient clinical outcomes by reducing hospitalization time and supporting the development of resident and novice physicians. Additionally, it contributes to lowering financial waste in the healthcare system by improving the monitoring of pregnant patients, thereby decreasing risks and enhancing their safety.

Using the most accessible tool, the smartphone, was the obvious choice to make patients' lives easier.

Article Lucas Fernandes · May 16, 2024 2m read

The introduction of InterSystems' "Vector Search" marks a paradigm shift in data processing. This cutting-edge technology employs an embedding model to transform unstructured data, such as text, into structured vectors, resulting in significantly enhanced search capabilities. Inspired by this breakthrough, we've developed a specialized search engine tailored to companies.

We harness generative artificial intelligence to generate comprehensive summaries of these companies, delivering users a powerful and informative tool.

Announcement Ikram Shah · May 16, 2024

Hi Community,

Here is a brief walkthrough of the capabilities of the IRIS AI Studio platform. It covers one complete flow, from loading data into IRIS DB as vector embeddings to retrieving information through 4 different channels (search, chat, recommender and similarity). The latest release adds Docker support for local installation and a live version to explore.

Article Ikram Shah · May 12, 2024 5m read

Problem

Does this resonate with you? The capability and impact of a technology are truly discovered only when it is packaged in the right way for its audience. The finest example is how Generative AI took off when ChatGPT was made publicly and easily accessible, not when the capabilities of Transformers and RAG were first identified. At least, much higher usage came once the audience was empowered to explore the possibilities.

Article Ikram Shah · May 15, 2024 6m read

In the previous article, we saw the different modules in IRIS AI Studio and how they can help explore GenAI capabilities out of IRIS DB seamlessly, even for a non-technical stakeholder. In this article, we will take a deep dive into the "Connectors" module, which enables users to seamlessly load data from local or cloud sources (AWS S3, Airtable, Azure Blob) into IRIS DB as vector embeddings, while also configuring embedding settings like model and dimensions.

New Updates  ⛴️ 

Article shan yue · May 15, 2024 2m read

Hi Community,

In this article, I will introduce my application iris-image-vector-search.
The image vector retrieval demo uses IRIS Embedded Python and the OpenAI CLIP model to convert images into 512-dimensional vector data. Through the new Vector Search feature, VECTOR_COSINE is used to calculate similarity and display the most similar images.

Application direction of image retrieval  

Image retrieval has important application scenarios in the medical field, and using image retrieval can greatly improve work efficiency.

Article Muhammad Waseem · May 13, 2024 3m read

Hi Community,
In this article, I will introduce my application iris-VectorLab along with a step-by-step guide to performing vector operations.

IRIS-VectorLab is a web application that demonstrates the functionality of Vector Search with the help of Embedded Python. It leverages the functionality of the Python framework SentenceTransformers for state-of-the-art sentence embeddings.

Application Features

  • Text to Embeddings Translation.
  • VECTOR-typed Data Insertion.
  • View Vector Data
  • Perform Vector Search by using VECTOR_DOT_PRODUCT and VECTOR_COSINE functions.
  • Demonstrate the difference between normal and vector search
  • HuggingFace text generation with the help of the GPT2 LLM (Large Language Model) and a Hugging Face pipeline
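To make the difference between the two similarity measures named in the list concrete, here is a small plain-Python sketch of the underlying math. It does not call the IRIS SQL functions; it only assumes the standard definitions of dot product and cosine similarity, and the sample vectors are invented.

```python
import math


def dot_product(a, b):
    """Sum of element-wise products; grows with vector length."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Dot product divided by both lengths; depends only on direction."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(dot_product(v, v))
    return [x / norm for x in v]


a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice as long

print(dot_product(a, b))  # 50.0
print(cosine(a, b))       # 1.0
```

For unit-length vectors the two measures coincide, which is why normalized embeddings are often compared with the cheaper dot product.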
Announcement Anastasia Dyubaylo · Apr 8, 2024

Hey Community,

We have more exciting news! The new InterSystems online programming contest dedicated to Generative AI, Vector Search and Machine Learning is starting very soon! 

🏆 InterSystems Vector Search, GenAI and ML Contest 🏆

Duration: April 22 - May 19, 2024

Prize pool: $14,000


Article xuanyou du · May 11, 2024 1m read

Principle: After the article uploaded by the user is divided into sentences using Python, the embedding of each sentence is obtained and stored in the IRIS database. Then the similarity between sentences is compared through IRIS vector search, and the result is finally displayed on the front-end page.

The installation steps can be viewed in the readme file. Note that the BERT model used in the example has some memory requirements. If the application hangs for a long time during testing, other models such as MiniLM (which is used in the online demo) can be considered.
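The principle described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `toy_embedding` is a hypothetical bag-of-words stand-in for a real model such as BERT or MiniLM, and the comparison runs in memory instead of in the IRIS database.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def toy_embedding(sentence):
    """Hypothetical stand-in for a real embedding model: word counts."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(c1, c2):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(c1[w] * c2[w] for w in c1.keys() & c2.keys())
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    if n1 == 0 or n2 == 0:
        return 0.0
    return dot / (n1 * n2)


def most_similar_pair(text):
    """Return (score, sentence_a, sentence_b) for the closest sentence pair."""
    sentences = split_sentences(text)
    vectors = [toy_embedding(s) for s in sentences]
    best = None
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            score = cosine(vectors[i], vectors[j])
            if best is None or score > best[0]:
                best = (score, sentences[i], sentences[j])
    return best


article = "Cats chase mice. Dogs chase cats. The sky is blue."
print(most_similar_pair(article))
```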

Article Robert Cemper · Apr 26, 2024 2m read

Geographic use of vector search

The basic idea is to use Vectors in the mathematical sense.
I used geographic coordinates. These are of course only 2-dimensional
but much easier to follow than the vectors in text analysis with >200 dimensions.

The example loads a list of worldwide capitals with their coordinates.
The coordinates are interpreted as vectors from the geographic point 0°N/0°W
(some very wet spot in the Gulf of Guinea, >400 km from the African coast).
Finding common directions from that spot is a rather theoretical case,
so adjustment to your preferred starting point is implemented.
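A minimal sketch of this idea in plain Python, assuming latitude and longitude are treated as flat 2-dimensional coordinates (which ignores the curvature of the Earth). The function names are illustrative, and the city coordinates are approximate values chosen for the example.

```python
import math


def direction_vector(point, origin=(0.0, 0.0)):
    """Vector from origin to point; both are (latitude, longitude) pairs."""
    lat, lon = point
    lat0, lon0 = origin
    return (lon - lon0, lat - lat0)


def cosine(u, v):
    """Cosine of the angle between two 2-dimensional vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm_u = math.hypot(u[0], u[1])
    norm_v = math.hypot(v[0], v[1])
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Approximate (latitude, longitude) values, for illustration only.
vienna = (48.2, 16.4)
berlin = (52.5, 13.4)
canberra = (-35.3, 149.1)

# Seen from 0°N/0°W, Vienna and Berlin lie in almost the same direction.
print(cosine(direction_vector(vienna), direction_vector(berlin)))

# Seen from Vienna, Berlin and Canberra lie in roughly opposite directions.
print(cosine(direction_vector(berlin, origin=vienna),
             direction_vector(canberra, origin=vienna)))
```

Moving the origin changes every direction vector, which is why the same pair of cities can look similar from one starting point and dissimilar from another.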

Article Robert Cemper · May 4, 2024 3m read

Most examples I've seen so far in OEX or DC left the impression that VECTORs
are just something available in SQL through the 3 functions around Vector Search:
* TO_VECTOR()
* VECTOR_DOT_PRODUCT()
* VECTOR_COSINE()
There is a very useful summary hidden in the iris-vector-search demo package.
From there you can find everything you need, over several links and corners.
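As a hedged sketch of how these SQL functions typically appear together, the helper below only builds a query string with a `?` placeholder for the search vector; it does not connect to a database. The table and column names are hypothetical, and the exact TO_VECTOR arguments (here a comma-separated string and the DOUBLE type) are an assumption that should be checked against the documentation of your IRIS version.

```python
def vector_literal(values):
    """Render numbers as the comma-separated string passed to TO_VECTOR."""
    return ",".join(str(float(v)) for v in values)


def build_similarity_query(table, vector_column, top_k=3):
    """Build a top-k similarity query with a '?' for the search vector."""
    # Identifiers cannot be bound as parameters, so validate them instead.
    for name in table.split(".") + [vector_column]:
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"SELECT TOP {int(top_k)} * FROM {table} "
        f"ORDER BY VECTOR_COSINE({vector_column}, TO_VECTOR(?, DOUBLE)) DESC"
    )


# Hypothetical table and column names, for illustration only.
sql = build_similarity_query("Demo.Docs", "embedding", top_k=5)
param = vector_literal([0.1, 0.2, 0.3])
print(sql)
print(param)  # 0.1,0.2,0.3
```

Passing the vector as a bound parameter, rather than pasting a variable name into the SQL text, keeps the statement valid regardless of how the embedding was produced.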

Article Robert Cemper · Apr 26, 2024 3m read

Technical surprises using VECTORs
>>> UPDATED

Building my technical example provided me with a bunch of findings that I want to share.
The first vectors I touched appeared with text analysis and had more than 200 dimensions.
I have to confess that I feel comfortable with Einstein's 4-dimensional world.
The 7 to 15 dimensions populating String Theory are somewhat across the border.
But 200 and more is definitely far beyond my mathematical horizon.

Article Robbie Luman · Jan 12, 2024 7m read

With the advent of Embedded Python, a myriad of use cases are now possible from within IRIS directly using Python libraries for more complex operations. One such operation is the use of natural language processing tools such as textual similarity comparison.

Setting up Embedded Python to Use the Sentence Transformers Library

Note: For this article, I will be using a Linux system with IRIS installed.

Announcement Evgeny Shvarov · Apr 18, 2024

Hi Developers!

Here are the technology bonuses for the InterSystems Vector Search, GenAI, and ML contest 2024 that will give you extra points in the voting:

  • Vector Search usage - 5
  • IntegratedML usage - 3
  • Embedded Python - 3
  • LLM AI or LangChain usage: ChatGPT, Bard, and others - 3
  • Questionnaire - 2
  • Docker container usage - 2 
  • ZPM Package deployment - 2
  • Online Demo - 2
  • Implement InterSystems Community Idea - 4
  • Find a bug in Vector Search, or Integrated ML, or Embedded Python - 2
  • First Article on Developer Community - 2
  • Second Article On DC - 1
  • First Time Contribution - 3
  • Video on YouTube - 3
  • Suggest a new idea - 1

See the details below.

Article Luis Angel Pérez Ramos · Mar 27, 2024 6m read

As you have seen in the latest community publications, InterSystems IRIS has supported vector data types in its database since version 2024.1, and vector searches have been implemented on top of this data type. These new features reminded me of an article I published a while ago about facial recognition using Embedded Python.

Introduction

For those of you who don't remember what that article was about, it is linked at the end of this article.

Question Kim Trieu · Mar 26, 2024

Using VECTOR_COSINE() in SQL query to perform a text similarity search on existing embeddings in a %VECTOR column.

Code is below.

Commented out sql query returns this error: SQLCODE: -29  Field 'NEW_EMBEDDING_STR' not found in the applicable tables^ SELECT TOP ? maxID , activity , outcome FROMMain .AITest ORDER BY VECTOR_COSINE ( new_embedding_str ,

Sql query as written returns ERROR #5002: ObjectScript error: <PYTHON EXCEPTION> *<class 'OSError'>: isc_stdout_write: PyArg_ParseTuple failed!

Discussion Muhammad Waseem · Mar 12, 2024

Hi Community!
As an AI language model, ChatGPT is capable of performing a variety of tasks like language translation, writing songs, answering research questions, and even generating computer code. With its impressive abilities, ChatGPT has quickly become a popular tool for various applications, from chatbots to content creation.
But despite its advanced capabilities, ChatGPT is not able to access your personal data. So we need to build a custom ChatGPT AI by using LangChain Framework:
Below are the steps to build a custom ChatGPT:

  • Step 1: Load the document 

  • Step 2: Splitting the document into chunks

  • Step 3: Use Embedding against Chunks Data and convert to vectors

  • Step 4: Save data to the Vector database

  • Step 5: Take data (question) from the user and get the embedding

  • Step 6: Connect to VectorDB and do a semantic search

  • Step 7: Retrieve relevant responses based on user queries and send them to LLM(ChatGPT)

  • Step 8: Get an answer from LLM and send it back to the user

  For more details, please read this article.
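The eight steps can be sketched end to end in plain Python. This is a toy illustration and not the LangChain API: `toy_embedding` is a hypothetical word-count stand-in for a real embedding model, an in-memory list plays the role of the vector database, and the final call to the LLM is left out, so the sketch stops at building the prompt.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=40):
    """Step 2: split the loaded document into chunks of chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]


def toy_embedding(text):
    """Steps 3 and 5: hypothetical stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(c1, c2):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(c1[w] * c2[w] for w in c1.keys() & c2.keys())
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def build_store(chunks):
    """Step 4: keep (chunk, vector) pairs; a list stands in for the vector DB."""
    return [(chunk, toy_embedding(chunk)) for chunk in chunks]


def search(store, question, top_k=1):
    """Step 6: semantic search, ranking chunks by similarity to the question."""
    q = toy_embedding(question)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(question, context_chunks):
    """Step 7: combine the retrieved chunks and the question for the LLM."""
    context = "\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Step 1: a tiny in-line "document" instead of a loaded file.
document = "IRIS stores vectors in SQL tables. Bananas are yellow and sweet."
store = build_store(split_into_chunks(document, chunk_size=6))
question = "Where does IRIS keep vectors?"
print(build_prompt(question, search(store, question)))
```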

InterSystems Official Fabiano Sanches · Mar 14, 2024

The 2024.1 release of InterSystems IRIS Data Platform is now Generally Available (GA).

Release Highlights

In this release, you can expect a host of exciting updates, including:

  1. Using vectors in ObjectScript: A powerful capability for optimizing data manipulation.
  2. Vector Search (experimental): A cutting-edge feature for efficient data retrieval.
  3. Multi-Volume Database: Enhancing scalability and storage management.
  4. Fast Online Backup (experimental): Streamlining backup processes.
  5. Multiple Super Server Ports: Providing flexibility in network configuration.
  6. and much more!
InterSystems Official Fabiano Sanches · Feb 28, 2024

InterSystems announces its fourth preview, as part of the developer preview program for the 2024.1 release. This release will include InterSystems IRIS®, InterSystems IRIS® for Health™, and HealthShare® Health Connect.

Highlights

Many updates and enhancements have been added in 2024.1 and there are also brand-new capabilities, such as Using vectors in ObjectScript,  Vector Search (experimental), Multi-Volume Database, the ability to use Fast Online Backup (experimental), and the introduction of Multiple Super Server Ports.

InterSystems Official Fabiano Sanches · Jan 31, 2024

InterSystems announces its second preview, as part of the developer preview program for the 2024.1 release. This release will include InterSystems IRIS®, InterSystems IRIS® for Health™, and HealthShare® Health Connect.

Highlights

Many updates and enhancements have been added in 2024.1 and there are also brand-new capabilities, such as Using vectors in ObjectScript,  Vector Search (experimental), Multi-Volume Database, the ability to use Fast Online Backup (experimental), and the introduction of Multiple Super Server Ports.

InterSystems Official Fabiano Sanches · Feb 15, 2024

InterSystems announces its third preview, as part of the developer preview program for the 2024.1 release. This release will include InterSystems IRIS®, InterSystems IRIS® for Health™, and HealthShare® Health Connect.

Highlights

Many updates and enhancements have been added in 2024.1 and there are also brand-new capabilities, such as Using vectors in ObjectScript,  Vector Search (experimental), Multi-Volume Database, the ability to use Fast Online Backup (experimental), and the introduction of Multiple Super Server Ports.

Article Dmitry Maslennikov · Sep 18, 2023 7m read

Nowadays there is so much noise around LLMs, AI, and so on. Vector databases are part of it, and there are already many different implementations of vector support in the world outside of IRIS.

Why Vector?

  • Similarity Search: Vectors allow for efficient similarity search, such as finding the most similar items or documents in a dataset. Traditional relational databases are designed for exact match searches, which are not suitable for tasks like image or text similarity search.
  • Flexibility: Vector representations are versatile and can be derived from various data types, such as text (via embeddings like Word2Vec, BERT), images (via deep learning models), and more.
  • Cross-Modal Searches: Vectors enable searching across different data modalities. For instance, given a vector representation of an image, one can search for similar images or related texts in a multimodal database.

And many other reasons.

So, for this Python contest, I decided to try to implement this support. Unfortunately, I did not manage to finish it in time; below I'll explain why.

InterSystems Official Fabiano Sanches · Jan 18, 2024

InterSystems announces its first preview, as part of the developer preview program for the 2024.1 release. This release will include InterSystems IRIS®, InterSystems IRIS® for Health™, and HealthShare® Health Connect.

Highlights

Many updates and enhancements have been added in 2024.1 and there are also brand-new capabilities, such as Using vectors in ObjectScript,  Vector Search (experimental), Multi-Volume Database, the ability to use Fast Online Backup (experimental), and the introduction of Multiple Super Server Ports.

Article Luis Angel Pérez Ramos · Dec 29, 2023 6m read

It seems like yesterday when we did a small project in Java to test the performance of IRIS, PostgreSQL and MySQL (you can review the article we wrote back in June at the end of this article). If you remember, IRIS was superior to PostgreSQL and clearly superior to MySQL in insertions, with no big difference in queries.

Well, shortly after @Dmitry Maslennikov told me "Why don't you test it from a Python project?" Well, here is the Python version of the tests we previously performed using the JDBC connections.

Article Guillaume Rongier · Dec 18, 2023 13m read

1. IRIS RAG Demo


This demo showcases the powerful synergy between IRIS Vector Search and RAG (Retrieval Augmented Generation), providing a cutting-edge approach to interacting with documents through a conversational interface. Utilizing InterSystems IRIS's newly introduced Vector Search capabilities, this application sets a new standard for retrieving and generating information based on a knowledge base. The backend is written in Python and leverages IRIS and IoP; the LLM is orca-mini, served by the ollama server. The frontend is a chatbot written with Streamlit.
