Article
· May 18, 2024 3m read

IRIS AI Studio: Playground to explore RAG capabilities on top of IRIS DB vector embeddings

In the previous article, we saw in detail about Connectors, that let user upload their file and get it converted into embeddings and store it to IRIS DB. In this article, we'll explore different retrieval options that IRIS AI Studio offers - Semantic Search, Chat, Recommender and Similarity. 

New Updates  ⛴️ 

  • Added installation through Docker. Run `./build.sh` after cloning to get the application & IRIS instance running in your local
  • Connect via InterSystems Extension in vsCode - Thanks to @Evgeny Shvarov 
  • Added FAQ's in the home page that covers the basic info for new users

Semantic Search

Semantic Search is a data retrieval method that aims to understand the user's intent and the context of the query rather than just matching keywords. It considers the relationship between words and concepts, going beyond simple keyword matching.

Building upon the index creation and data loading into the IRIS DB covered in the previous article's Connectors code, we now start with the index and run the query engine:

Ref. to `query_llama.py`

query_engine = index.as_query_engine()
response = query_engine.query(query)

 

Chat Engine

The Chat Engine augments the loaded content and acts as a stateful version of the previous query_engine. It keeps track of the conversation history and can answer questions with past context in mind.

Ref. to `chat_llama.py`

chat_engine = index.as_chat_engine(chat_mode='condense_question')
response = chat_engine.chat(user_message)

 

Recommender 

The Recommender is similar to the recommendation systems used on e-commerce sites, where similar products are displayed below the product we look for. The Recommender and Similarity RAG falls in similar lines, but in the Recommender, you can choose the LLM ranking model to be used.

Ref. to `reco_llama.py`

if reco_type == 'cohere_rerank':
    cohere_rerank = CohereRerank(api_key=cohere_api_key, top_n=results_count)
    query_engine = index.as_query_engine(
        similarity_top_k=10,
        node_postprocessors=[cohere_rerank],
    )
    response = query_engine.query(query)

elif reco_type == 'openai_rerank':
    rankgpt_rerank = RankGPTRerank(
        llm=OpenAI(
            model="gpt-3.5-turbo-16k",
            temperature=0.0,
            api_key=openai_api_key,
        ),
        top_n=results_count,
        verbose=True,
    )
    query_engine = index.as_query_engine(
        similarity_top_k=10,
        node_postprocessors=[rankgpt_rerank],
    )
    response = query_engine.query(query)

Similarity

The Similarity feature returns a similarity score, which helps evaluate the quality of a question-answering system via semantic similarity. You can filter the results based on the minimum similarity score and the number of similar items to retrieve from the DB.

Ref. to `similarity_llama.py`

retriever = index.as_retriever(similarity_top_k=top_k)
nodes = retriever.retrieve(query)

 

By leveraging these different retrieval options at IRIS AI Studio, including Semantic Search, Chat Engine, Recommender, and Similarity, users can explore the potential of their data. These features enable advanced information retrieval, contextual question answering, personalized recommendations, and semantic similarity analysis, empowering users to derive valuable insights and make data-driven decisions (at least to build similar ones in their domain).

🚀 Vote for this application in Vector Search, GenAI and ML contest, if you find it promising!

Feel free to share any feedback or inputs you may have. Thank you.

Discussion (0)0
Log in or sign up to continue