
Hi Community,
In this article, I will introduce my application iris-RAG-Gen.
iris-RAG-Gen is a generative AI Retrieval-Augmented Generation (RAG) application that leverages IRIS Vector Search to personalize ChatGPT, built with the Streamlit web framework, LangChain, and OpenAI. The application uses IRIS as its vector store.

Application Features
- Ingest Documents (PDF or TXT) into IRIS
- Chat with the selected Ingested document
- Delete Ingested Documents
- Generate answers with OpenAI ChatGPT
Ingest Documents (PDF or TXT) into IRIS
Follow the steps below to ingest a document:
- Enter OpenAI Key
- Select Document (PDF or TXT)
- Enter Document Description
- Click on the Ingest Document Button

The Ingest Document functionality inserts the document details into the rag_documents table and creates a 'rag_document + id' table (where id is the row id from rag_documents) to store the vector data.
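The two-table layout can be sketched with sqlite3 standing in for IRIS (a conceptual sketch only; the real application uses SQLAlchemy against IRIS, and LAST_IDENTITY() is approximated here by sqlite's lastrowid):

```python
import sqlite3

# In-memory stand-in for the IRIS database
conn = sqlite3.connect(":memory:")

# Catalog table: one row per ingested document
conn.execute("CREATE TABLE rag_documents (description VARCHAR(255), docType VARCHAR(50))")
cur = conn.execute(
    "INSERT INTO rag_documents (description, docType) VALUES (?, ?)",
    ("sample document", "application/pdf"),
)
doc_id = cur.lastrowid  # plays the role of LAST_IDENTITY() in IRIS

# Per-document vector table, named after the new row id
vector_table = f"rag_document{doc_id}"
conn.execute(
    f"CREATE TABLE {vector_table} (id INTEGER, embedding TEXT, document TEXT, metadata TEXT)"
)
print(vector_table)  # rag_document1
```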

The Python code below splits the selected document into chunks and saves the resulting vectors into IRIS:
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.document_loaders import PyPDFLoader, TextLoader
from langchain_iris import IRISVector
from langchain_openai import OpenAIEmbeddings
from sqlalchemy import create_engine, text

class RagOpr:
    def ingestDoc(self, filePath, fileDesc, fileType):
        embeddings = OpenAIEmbeddings()
        # Pick the loader that matches the uploaded file type
        if fileType == "text/plain":
            loader = TextLoader(filePath)
        elif fileType == "application/pdf":
            loader = PyPDFLoader(filePath)
        documents = loader.load()
        # Split the document into 400-character chunks
        text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=0)
        texts = text_splitter.split_documents(documents)
        # Register the document and get its per-document collection name
        COLLECTION_NAME = self.get_collection_name(fileDesc, fileType)
        # Embed the chunks and store them in IRIS
        db = IRISVector.from_documents(
            embedding=embeddings,
            documents=texts,
            collection_name=COLLECTION_NAME,
            connection_string=self.CONNECTION_STRING,
        )

    def get_collection_name(self, fileDesc, fileType):
        # Create the rag_documents table if it does not exist yet
        with self.engine.connect() as conn:
            with conn.begin():
                sql = text("""
                    SELECT *
                    FROM INFORMATION_SCHEMA.TABLES
                    WHERE TABLE_SCHEMA = 'SQLUser'
                    AND TABLE_NAME = 'rag_documents';
                """)
                result = []
                try:
                    result = conn.execute(sql).fetchall()
                except Exception as err:
                    print("An exception occurred:", err)
                    return ''
                if len(result) == 0:
                    sql = text("""
                        CREATE TABLE rag_documents (
                            description VARCHAR(255),
                            docType VARCHAR(50) )
                    """)
                    try:
                        result = conn.execute(sql)
                    except Exception as err:
                        print("An exception occurred:", err)
                        return ''
        # Insert the document details and build the collection name from the new row id
        with self.engine.connect() as conn:
            with conn.begin():
                sql = text("""
                    INSERT INTO rag_documents
                    (description, docType)
                    VALUES (:desc, :ftype)
                """)
                try:
                    result = conn.execute(sql, {'desc': fileDesc, 'ftype': fileType})
                except Exception as err:
                    print("An exception occurred:", err)
                    return ''
                sql = text("""
                    SELECT LAST_IDENTITY()
                """)
                try:
                    result = conn.execute(sql).fetchall()
                except Exception as err:
                    print("An exception occurred:", err)
                    return ''
        return "rag_document" + str(result[0][0])
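The chunk_size and chunk_overlap parameters used above can be illustrated with a minimal character-window splitter (a simplified stand-in only; RecursiveCharacterTextSplitter additionally tries to break on separators such as paragraphs and words):

```python
def naive_split(text, chunk_size=400, chunk_overlap=0):
    """Greedy character-window splitter: emit chunk_size-character
    windows, stepping forward chunk_size - chunk_overlap each time."""
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

doc = "word " * 200                         # 1000 characters of sample text
chunks = naive_split(doc)
print(len(chunks))                          # 3
print(all(len(c) <= 400 for c in chunks))   # True
```

Each chunk is embedded separately, so chunk_size controls the granularity at which the document can later be retrieved.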
Run the SQL command below in the Management Portal to retrieve the vector data (here for the document with id 2):
SELECT TOP 5 id, embedding, document, metadata
FROM SQLUser.rag_document2

Chat with the selected Ingested document
Select the document from the select chat option section and type your question. The application reads the vector data and returns the relevant answer.

The Python code below reads the vector data for the selected document and generates the answer:
from langchain_iris import IRISVector
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.chains import ConversationChain
from langchain.chains.conversation.memory import ConversationSummaryMemory

class RagOpr:
    def ragSearch(self, prompt, id):
        # The collection name follows the "rag_document" + id convention
        COLLECTION_NAME = "rag_document" + str(id)
        embeddings = OpenAIEmbeddings()
        # Connect to the existing vector collection in IRIS
        db2 = IRISVector(
            embedding_function=embeddings,
            collection_name=COLLECTION_NAME,
            connection_string=self.CONNECTION_STRING,
        )
        # Retrieve the chunks most similar to the question
        docs_with_score = db2.similarity_search_with_score(prompt)
        relevant_docs = ["".join(str(doc.page_content)) + " " for doc, _ in docs_with_score]
        # Ask the LLM to answer using the retrieved chunks as context
        llm = ChatOpenAI(
            temperature=0,
            model_name="gpt-3.5-turbo"
        )
        conversation_sum = ConversationChain(
            llm=llm,
            memory=ConversationSummaryMemory(llm=llm),
            verbose=False
        )
        template = f"""
            Prompt: {prompt}
            Relevant Documents: {relevant_docs}
        """
        resp = conversation_sum(template)
        return resp['response']
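Under the hood, similarity_search_with_score compares the question's embedding against the stored chunk embeddings. A toy sketch of that ranking with 2-D vectors (illustration only: real embeddings have hundreds or thousands of dimensions, and this assumes cosine distance, a common default for vector stores):

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity: smaller means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

# Toy "embeddings": the query points in nearly the same direction as doc1
query = [1.0, 0.0]
docs = {"doc1": [0.9, 0.1], "doc2": [0.1, 0.9]}

# Rank documents by distance to the query, closest first
ranked = sorted(docs, key=lambda d: cosine_distance(query, docs[d]))
print(ranked[0])  # doc1
```

The top-ranked chunks are then pasted into the prompt as context, which is what grounds the ChatGPT answer in the ingested document.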
For more details, please visit the iris-RAG-Gen Open Exchange application page.
Thanks