Article
· Mar 25 1m read

How to write application logs to the ^ERRORS global

InterSystems FAQ rubric

This can be done with TRY-CATCH:

 #dim ex As %Exception.AbstractException
 TRY {
     // Code that causes an error
 }
 CATCH ex {
     do ex.Log()
 }

If you use ^%ETN, call it through its BACK entry point (BACK^%ETN).

See also the related article: How to get application errors (^ERRORS) using a command

Question
· Mar 25

How to tokenize a text using SentenceTransformer?

Hi all.

I'm trying to create an indexed table with a vector field so I can search by the vector value.
While investigating, I found that to get the vector value from the text (token), you use a Python method like the following:

ClassMethod TokenizeData(desc As %String) As %String [ Language = python ]
{
    import iris
    # Step 2: Generate Document Embeddings
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer('/opt/irisbuild/all-MiniLM-L6-v2')

    # Generate embeddings for each document
    document_embeddings = model.encode(desc)

    return document_embeddings.tolist()
}

The model all-MiniLM-L6-v2 was downloaded from https://ollama.com/library/all-minilm and installed in my Docker instance.

When I tried to test this method (from Visual Studio), it threw the following error:

<THROW>DebugStub+40^%Debugger.System.1 *%Exception.PythonException <PYTHON EXCEPTION> 246 <class 'OSError'>: It looks like the config file at '/opt/irisbuild/all-MiniLM-L6-v2/config.json' is not a valid JSON file.  

I then replaced the contents of config.json with a valid JSON document (just a pair of curly braces) and repeated the test, but got a new error:

<THROW>DebugStub+40^%Debugger.System.1 *%Exception.PythonException <PYTHON EXCEPTION> 246 <class 'safetensors_rust.SafetensorError'>: Error while deserializing header: HeaderTooSmall  
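Both messages point at the files in the model folder rather than at the method itself, so one way to narrow things down is to inspect that folder directly. The following is a minimal sketch, assuming the folder is expected to contain a config.json and a model.safetensors file, and relying on the safetensors layout of an 8-byte little-endian header length followed by a JSON header. The function name check_model_dir is hypothetical, and the file names are assumptions that may differ for other model layouts.

```python
import json
import os
import struct


def check_model_dir(model_dir):
    """Return a list of human-readable problems found in a local model folder."""
    problems = []

    # config.json must parse as JSON and should not be an empty object.
    config_path = os.path.join(model_dir, "config.json")
    try:
        with open(config_path, "r", encoding="utf-8") as fh:
            config = json.load(fh)
        if not isinstance(config, dict) or not config:
            problems.append("config.json is valid JSON but empty")
    except FileNotFoundError:
        problems.append("config.json is missing")
    except (json.JSONDecodeError, UnicodeDecodeError):
        problems.append("config.json is not valid JSON")

    # Assumption: weights are stored as model.safetensors. Such a file starts
    # with an 8-byte little-endian header length, then that many bytes of JSON.
    weights_path = os.path.join(model_dir, "model.safetensors")
    if not os.path.exists(weights_path):
        problems.append("model.safetensors is missing")
    else:
        size = os.path.getsize(weights_path)
        with open(weights_path, "rb") as fh:
            prefix = fh.read(8)
        if len(prefix) < 8:
            problems.append("model.safetensors is shorter than 8 bytes")
        else:
            (header_len,) = struct.unpack("<Q", prefix)
            if header_len > size - 8:
                problems.append("model.safetensors header length exceeds file size")

    return problems
```

Calling check_model_dir('/opt/irisbuild/all-MiniLM-L6-v2') inside the container would list anything that looks wrong. A config.json containing only curly braces and a weights file shorter than 8 bytes would both be reported, which would suggest the download is not a complete model in the layout SentenceTransformer expects.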

Does anyone know how to fix this problem?
Is there any other way to create the vector value so I can index it?
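On the indexing side, note that the method above is declared As %String while tolist() returns a Python list. One option, sketched here under assumptions, is to return the embedding as a comma-separated string, which as far as I know is a form the IRIS SQL function TO_VECTOR() accepts. The helper name to_vector_string is hypothetical and not part of any library.

```python
import math


def to_vector_string(values):
    """Format a sequence of numbers as a comma-separated string."""
    floats = [float(v) for v in values]
    if not floats:
        raise ValueError("embedding is empty")
    if not all(math.isfinite(f) for f in floats):
        raise ValueError("embedding contains NaN or infinity")
    # repr() keeps full double precision for each component.
    return ",".join(repr(f) for f in floats)
```

In the class method this would replace the last line with something like return to_vector_string(document_embeddings.tolist()), assuming the model loads correctly.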

Best regards.

Question
· Mar 25

Walking a virtual document's structure

Is there a generic process for "walking" the structure of a virtual document, e.g. an HL7 message (EnsLib.HL7.Message) or an XML document (EnsLib.EDI.XML.Document)?

At a minimum, we'd want to visit every "node" (HL7 fields or sub-fields, XML nodes) in the virtual document and work out or generate its Property Path, so that we could call GetValueAt.

We can just about come up with something generic for HL7, since it only nests down to 4 levels within each segment, though we're using numeric Property Paths at that point rather than symbolic ones (MSH:1.3 etc).
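To make the four-level idea concrete, here is a minimal sketch in Python that walks the raw text of an HL7 v2 message rather than an EnsLib.HL7.Message object. It assumes the default delimiters (| ~ ^ &) and writes paths in a segment:field(repetition).component.subcomponent shape with a numeric segment index. The function walk_hl7 is hypothetical, and whether GetValueAt accepts each generated path unchanged is an assumption that would need checking.

```python
def walk_hl7(message):
    """Yield (path, value) for every non-empty leaf of a raw HL7 v2 message."""
    # Segments are separated by carriage returns; tolerate newlines too.
    segments = [s for s in message.replace("\n", "\r").split("\r") if s]
    for seg_index, segment in enumerate(segments, start=1):
        fields = segment.split("|")
        if fields[0] == "MSH":
            # MSH:1 is the field separator itself and MSH:2 holds the
            # encoding characters, so neither is split any further.
            yield f"{seg_index}:1", "|"
            yield f"{seg_index}:2", fields[1]
            numbered = list(enumerate(fields[2:], start=3))
        else:
            numbered = list(enumerate(fields[1:], start=1))
        for field_no, field in numbered:
            for rep_no, repetition in enumerate(field.split("~"), start=1):
                for comp_no, component in enumerate(repetition.split("^"), start=1):
                    for sub_no, sub in enumerate(component.split("&"), start=1):
                        if sub:
                            path = f"{seg_index}:{field_no}({rep_no}).{comp_no}.{sub_no}"
                            yield path, sub
```

Each yielded pair is a candidate path plus the value found there, so the output could be compared against what GetValueAt returns for the same message.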

The documentation has not shed any light so far. Any pointers welcome. Thanks.
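For the XML side, the same kind of walk can be sketched with Python's standard xml.etree.ElementTree, again on the raw document text rather than on an EnsLib.EDI.XML.Document object. The paths it emits are positional, XPath-like strings such as /a/b[2]. Treating those as usable property paths is an assumption, and the function name walk_xml is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def walk_xml(xml_text):
    """Yield (path, text) for every element, using positional XPath-like paths."""
    root = ET.fromstring(xml_text)

    def visit(element, path):
        yield path, (element.text or "").strip()
        totals = Counter(child.tag for child in element)
        seen = Counter()
        for child in element:
            seen[child.tag] += 1
            # Add a 1-based index only when siblings share a tag name.
            suffix = f"[{seen[child.tag]}]" if totals[child.tag] > 1 else ""
            yield from visit(child, f"{path}/{child.tag}{suffix}")

    yield from visit(root, f"/{root.tag}")
```

Attributes and namespaces are not handled here. ElementTree reports namespaced tags as {uri}name, so those paths would need extra mapping before use.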
