I just realized I never finished this series of articles!
In today's article, we'll take a look at the process that, in production, extracts the ICD-10 diagnoses most similar to our text, so we can select the most appropriate option from our frontend.
Looking for diagnostic similarities
From the screen that shows the diagnostic requests received in HL7 in our application, we can search for the ICD-10 diagnoses closest to the text entered by the professional.
To speed up the search process, we store the vectorized text of each received diagnosis in our database at the moment the HL7 message is captured. To do this, we implemented a simple BPL that extracts the diagnosis code from the message and sends it to a method that generates the vector:
And here is the code that vectorizes the diagnosis received:
ClassMethod GetEncoding(sentence As %String) As %String [ Language = python ]
{
import sentence_transformers
# create the model and form the embeddings
model = sentence_transformers.SentenceTransformer('/iris-shared/model/')
embeddings = model.encode(sentence, normalize_embeddings=True).tolist() # Convert search phrase into a vector
# convert the embeddings to a string
return str(embeddings)
}
This way, our diagnoses are vectorized once, and we avoid having to vectorize them again every time we need to search on them. As you can see, we're using the sentence_transformers library to generate the vector with the downloaded model.
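If you want to check what actually gets stored, here is a minimal standalone sketch (the model path is the one used throughout the article; the sample diagnosis text is just an illustration) that reproduces what GetEncoding produces:

from sentence_transformers import SentenceTransformer

# Load the same model directory used by GetEncoding
model = SentenceTransformer('/iris-shared/model/')

# Encode a sample diagnosis exactly as the production method does
embedding = model.encode("acute bronchitis", normalize_embeddings=True).tolist()

# What gets stored is simply this list serialized as a string,
# e.g. "[0.0123, -0.0456, ...]", ready to be cast back with TO_VECTOR() in SQL
print(len(embedding))        # the model's embedding dimensionality
print(str(embedding)[:80])   # beginning of the stored representation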
With all our received diagnoses vectorized and stored in our database, we only need to execute a SELECT query to extract the ICD-10 diagnoses closest to each received diagnosis.
Let's look at the code for the method published in our web service that will return the 25 most similar diagnoses:
ClassMethod GetCodeOptions(idRequest As %String) As %Status
{
    set ret = $$$OK
    try {
        // compare the vectorized request against every ICD-10 code and keep the best matches
        set sql = "SELECT TOP 25 * FROM (SELECT C.CodeId, C.Description, VECTOR_DOT_PRODUCT(C.VectorDescription, R.VectorDescription) AS Similarity FROM ENCODER_Object.Codes C, ENCODER_Object.CodeRequests R WHERE R.ID = ?) WHERE Similarity > 0.5 ORDER BY Similarity DESC"
        set statement = ##class(%SQL.Statement).%New()
        $$$ThrowOnError(statement.%Prepare(sql))
        set rs = statement.%Execute(idRequest)
        set array = []
        while rs.%Next() {
            do array.%Push({
                "CodeId": (rs.%Get("CodeId")),
                "Description": (rs.%Get("Description")),
                "Similarity": (rs.%Get("Similarity"))
            })
        }
        set %response.Status = ..#HTTP200OK
        write array.%ToJSON()
    } catch ex {
        set %response.Status = ..#HTTP400BADREQUEST
        set ret = ex.AsStatus()
    }
    quit ret
}
If you look at the query, we're limiting the search to 25 results, and only to diagnoses that exceed a similarity level of 0.5. For the similarity calculation, we chose the VECTOR_DOT_PRODUCT function, although we could have used VECTOR_COSINE. I haven't found any substantial differences between them in this case, which is to be expected: since we generate every embedding with normalize_embeddings=True, the vectors have unit length, and for unit-length vectors the dot product and the cosine similarity yield the same value.
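A quick numpy sketch (not part of the application code) to convince yourself of that equivalence:

import numpy as np

# two arbitrary vectors, normalized to unit length like our embeddings
a = np.random.rand(384)
a /= np.linalg.norm(a)
b = np.random.rand(384)
b /= np.linalg.norm(b)

dot = float(np.dot(a, b))
cosine = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(abs(dot - cosine) < 1e-12)   # True: identical once the vectors are normalized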
Here we have the search result:
Using an LLM to identify diagnoses
So far we've only done a simple search for perfectly identified diagnoses, but... could we identify diagnoses directly from free text?
Let's try it!
For this feature, we'll use Ollama, which provides an API for sending our questions to an LLM of our choice. If you take a look at our docker-compose.yml file, you'll see the Ollama container declaration:
## llm locally installed
ollama:
build:
context: .
dockerfile: ollama/Dockerfile
container_name: ollama
volumes:
- ./ollama/shared:/ollama-shared
ports:
- "11434:11434"
In the container deployment, we specified that the llama3.2 LLM should be downloaded. The reason? Well, it seemed to me to be the one that performed best in my tests.
This is the content of the entrypoint.sh file that runs when the container is deployed:
#!/bin/bash
echo "Starting Ollama server..."
ollama serve &
SERVE_PID=$!

echo "Waiting for Ollama server to be active..."
# 'ollama list' prints its header line once the server is responding
while ! ollama list | grep -q 'NAME'; do
    sleep 1
done

# download the model that our business process will use
ollama pull llama3.2

wait $SERVE_PID
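Once the container is up, you can check that the model answers correctly before wiring it into production. Here is a minimal sketch using the same /api/generate endpoint and payload format that our business process will use (note that the hostname ollama only resolves inside the Docker network; from the host machine you would use localhost and the published port):

import requests

# Same endpoint and payload shape that AnalyzeText will use later on
url = "http://localhost:11434/api/generate"
data = {
    "model": "llama3.2",
    "prompt": "Extract only the diagnoses from the following text, separating them with commas: patient presents chest pain and acute bronchitis",
    "stream": False   # ask for a single JSON response instead of a token stream
}

response = requests.post(url, json=data)
print(response.json()["response"])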
To take advantage of Ollama's capabilities, I modified the screen that analyzes free text so you can indicate that the LLM should be used to extract diagnoses from the entered text.
We also modified the business process, adding a method that builds the prompt llama3.2 needs to extract the diagnoses directly:
Method AnalyzeText(text As %String, analysisId As %String, language As %String) As %String [ Language = python ]
{
import sentence_transformers
import iris
import requests

try:
    # ask the LLM to extract the diagnoses from the free text as a comma-separated list
    url = "http://ollama:11434/api/generate"
    data = {
        "model": "llama3.2",
        "prompt": "Extract only the diagnoses from the following text, separating them with , and without adding interpretations: " + text,
        "stream": False
    }
    response = requests.post(url, json=data)
    analyzedText = response.json()

    # vectorize each extracted diagnosis and store its closest ICD-10 matches
    model = sentence_transformers.SentenceTransformer('/iris-shared/model/')
    phrases = analyzedText['response'].split(",")
    for phraseToAnalyze in phrases:
        if phraseToAnalyze.strip() != "":
            embedding = model.encode(phraseToAnalyze, normalize_embeddings=True).tolist()
            stmt = iris.sql.prepare("INSERT INTO ENCODER_Object.TextMatches (CodeId, Description, Similarity, AnalysisId, RawText) SELECT TOP 50 * FROM (SELECT CodeId, Description, VECTOR_DOT_PRODUCT(VectorDescription, TO_VECTOR(?, DECIMAL)) AS Similarity, ?, ? FROM ENCODER_Object.Codes) WHERE Similarity > 0.65 ORDER BY Similarity DESC")
            rs = stmt.execute(str(embedding), analysisId, phraseToAnalyze)
except Exception as err:
    iris.cls("Ens.Util.Log").LogInfo("ENCODER.BP.AnalyzeTextProcess", "AnalyzeText", repr(err))
    return repr(err)
return "Success"
}
The prompt is very simple and could certainly be improved and fine-tuned as needed; I'll leave that up to you (see the sketch below for one possible variation). This method takes the LLM's comma-separated response and stores the vectorized diagnoses it finds in our database. Here's an example of the result:
In the lower right corner, you can see all the LLM findings regarding the text in the lower left corner.
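As for improving the prompt, here is one possible variation (my own suggestion, not something I validated as part of this article) that constrains the output format a bit more:

# A hypothetical, more constrained variant of the prompt used in AnalyzeText
data = {
    "model": "llama3.2",
    "prompt": "You are a clinical coding assistant. List only the diagnoses mentioned "
              "in the following text as a single comma-separated line, with no "
              "explanations and no additional text: " + text,
    "stream": False
}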
Well, we now have an application that uses LLM models to help us code diagnoses. As you've seen, it's not at all complicated to implement, and it can serve as a good foundation for building more complex and comprehensive solutions.
Thank you very much for your time!