
Ollama AI with IRIS

In this article I will discuss using an alternative LLM for generative AI. OpenAI is commonly used; here I will show you how to use Ollama instead, and the advantages it brings.

In the generative AI usage model that we are used to, we have the following flow:

  • we take text from a data source (a file, for example) and embed that text into vectors
  • we store the vectors in an IRIS database.
  • we call an LLM (Large Language Model) that accesses these vectors as context to generate responses in human language.

We have great examples of this in this community, such as IRIS Vector Search (https://openexchange.intersystems.com/package/iris-vector-search) and IRIS AI Studio (https://openexchange.intersystems.com/package/IRIS-AI-Studio-2). In these examples, the LLM is a service you must subscribe to (like OpenAI) in order to obtain a usage key. In short, we call the OpenAI REST API, passing the vectorized data as context, and it returns a response based on that context.

In this article, instead of OpenAI, I suggest using the Ollama LLM. Unlike OpenAI, Ollama runs locally. I see two main advantages in this:

  • More security, as we do not need to transfer data to a third-party API.
  • Lower cost, as there is no subscription fee for calling the service. Keep in mind that merely sending the vectors stored in IRIS to OpenAI already exceeds its free-tier quota; when testing with OpenAI we received error 429 - "You exceeded your current quota, please check your plan and billing details."

The disadvantage is that Ollama demands resources from your own machine (with less than 16 GB of RAM it will be difficult to run).

To use Ollama, download and install it on your computer from https://ollama.com/download. After that, simply tell the Python llama_index library to use Ollama instead of OpenAI with this command:

Settings.llm = Ollama(model="llama3.2", request_timeout=360.0)
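Note that llama_index also defaults to OpenAI for the embedding step, so to keep everything local you can route embeddings through Ollama as well. A minimal sketch, assuming recent llama-index packages; "nomic-embed-text" is an assumed model choice, not something the article prescribes:

    from llama_index.core import Settings
    from llama_index.llms.ollama import Ollama
    from llama_index.embeddings.ollama import OllamaEmbedding

    # Route generation through the local Ollama server
    Settings.llm = Ollama(model="llama3.2", request_timeout=360.0)

    # Also embed locally; "nomic-embed-text" is an assumed choice - any
    # embedding model you have pulled with "ollama pull" works
    Settings.embed_model = OllamaEmbedding(model_name="nomic-embed-text")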

The Open Exchange application https://openexchange.intersystems.com/package/ollama-ai-iris has more details about the code.

So first we load a text (the data_example directory of https://github.com/RodolfoPscheidtJr/ollama-ai-iris contains the text used as an example) into IRIS in vector form.
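A minimal sketch of this loading step, modeled on the llama-iris vector store used by the IRIS Vector Search example; the connection string, table name, and embedding dimension below are assumptions to adapt to your environment:

    from llama_index.core import SimpleDirectoryReader, StorageContext, VectorStoreIndex
    from llama_iris import IRISVectorStore

    # Read the sample text from the repository's data_example directory
    documents = SimpleDirectoryReader("./data_example").load_data()

    # Connection and table details are assumptions for illustration
    vector_store = IRISVectorStore.from_params(
        connection_string="iris://demo:demo@localhost:1972/USER",
        table_name="ollama_example",
        embed_dim=768,  # must match your embedding model's vector size
    )
    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    # Embed the documents and persist the vectors in IRIS
    index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)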

Then we use this vectorized text as context to ask Ollama some questions. Asking "What did the author do?" and "Does the author like paintings?" returns answers in natural language grounded in the loaded text.
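Continuing the sketch above, the queries would look like this:

    # Build a RAG query engine on top of the index; it retrieves the relevant
    # vectors from IRIS and passes them to Ollama as context
    query_engine = index.as_query_engine()

    print(query_engine.query("What did the author do?"))
    print(query_engine.query("Does the author like paintings?"))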
