Hi Henry,
The embedding class can be extended.
I give an example in the demo application: https://openexchange.intersystems.com/package/toot
My parameters were:
- modelName
- tokenizerPath
- HotStart
- modelPath
The config
The new config needs to exist before you can compile a custom embedding class. The "toot" app above shows this in the file "iris.script", i.e. how you can set this up when building in Docker from scratch.
Adding the config via SQL is currently not available at this point of a Dockerfile build, hence the object insert:
Set embedConf=##class(%Embedding.Config).%New()
Set embedConf.Name="toot-v2-config"
Set embedConf.Configuration="{""modelName"": ""toot-v2-config"",""modelPath"":""/opt/hub/toot/"",""tokenizerPath"":""/opt/hub/toot/tokenizer_tune.json"",""HotStart"":1}"
Set embedConf.EmbeddingClass="TOOT.Data.Embedding2"
Set embedConf.VectorLength=384
Set embedConf.Description="an embedding model provided by Alex Woodhead"
Set tSC=embedConf.%Save()
If you have an already installed instance, you can also use SQL to add a new embedding config or update an existing one.
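As a sketch, an SQL insert could look like the following. This assumes the %Embedding.Config table columns mirror the object properties used above; check the table definition on your version before relying on the exact column names:

```sql
-- Sketch: insert an embedding config via SQL on an installed instance
INSERT INTO %Embedding.Config
  (Name, Configuration, EmbeddingClass, VectorLength, Description)
VALUES
  ('toot-v2-config',
   '{"modelName": "toot-v2-config","modelPath":"/opt/hub/toot/","tokenizerPath":"/opt/hub/toot/tokenizer_tune.json","HotStart":1}',
   'TOOT.Data.Embedding2',
   384,
   'an embedding model provided by Alex Woodhead')
```

An UPDATE against the same table (keyed on Name) would modify an existing config.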
Custom embedding class
In the "toot" app, look at the source file "/src/XML/TOOT_Data_Embedding2_CLS.xml"; it shows how the additional parameters are consumed by the custom embedding class.
Discussion
I started a discussion article; if you learn some new tricks, please consider sharing any new challenges there as well:
https://community.intersystems.com/post/vector-embeddings-feedback
Hope this helps
Thanks for your time and the many suggestions.
I have also been brainstorming about what is already in the platform that I could leverage to speed up this functionality.
The approach is to loop over the extent to find the next record whose set contains the most additional numbers not previously selected.
I came up with using bit strings instead of IRIS lists at the language level.
This allows efficient bit operations via the $BITLOGIC "OR" operator.
Storing the bit strings in a computed property on record insert and update avoids recalculating a bit string from the source string each time the extent is iterated looking for the next best record.
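As a minimal terminal sketch of the bit operations involved (the bit positions here are made up):

```objectscript
// Build two small bit strings and combine their coverage
Set a="",b=""
Set $Bit(a,3)=1,$Bit(a,5)=1
Set $Bit(b,5)=1,$Bit(b,8)=1
// OR merges the two sets of flags; $BITCOUNT reports how many are set
Set c=$BITLOGIC(a|b)
// Bits 3, 5, and 8 are now set, so this prints 3
Write $BITCOUNT(c,1)
```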
Finally, I wrapped this up in a class query.
Class TOOT.Data.Instrument Extends %Persistent
{

Property NoteList As %String(MAXLEN = 8000, TRUNCATE = 1);

Property NoteListEmbedding As %Embedding(MODEL = "toot-v2-config", SOURCE = "NoteList");

Property NoteListBit As %Binary [ SqlComputed, SqlComputeOnChange = NoteList ];

/// Calculates and stores a bit string during record insert or update
ClassMethod NoteListBitComputation(cols As %Library.PropertyHelper) As %Binary
{
	Set bitString=""
	Set numlist=cols.getfield("NoteList")
	Set numsLen=$L(numlist,",")
	For i=1:1:numsLen {
		Set val=$P(numlist,",",i)
		// Skip values below the range of interest
		Continue:val<6
		// Skip duplicate values already seen in this record
		Continue:$D(found(val))
		Set found(val)=""
		Set $Bit(bitString,val)=1
	}
	Return bitString
}

/// Callable query for getting the best records based on bit flags
Query BestNoteList(Top As %Integer = 5, Accumulate As %Boolean = 0) As %Query(ROWSPEC = "ID:%String,Count:%Integer") [ SqlProc ]
{
}

ClassMethod BestNoteListExecute(ByRef qHandle As %Binary, Top As %Integer = 5, Accumulate As %Boolean = 0) As %Status
{
	Set:Top<1 Top=5
	// qHandle = $LB(max rows, rows returned so far, accumulated bit string)
	Set qHandle=$LB(Top,0,"")
	Quit $$$OK
}

ClassMethod BestNoteListFetch(ByRef qHandle As %Binary, ByRef Row As %List, ByRef AtEnd As %Integer = 0) As %Status [ PlaceAfter = BestNoteListExecute ]
{
	Set qHandle=$Get(qHandle)
	If qHandle="" {
		// Defensive: signal end of data rather than fetching forever
		Set Row="",AtEnd=1
		Return $$$OK
	}
	Set Top=$LI(qHandle,1)
	Set Counter=$LI(qHandle,2)
	Set BitString=$LI(qHandle,3)
	Set Counter=Counter+1
	If Counter>Top {
		Set Row=""
		Set AtEnd=1
		Quit $$$OK
	}
	// Scan the whole extent for the record that adds the most new bits
	Set statement=##class(%SQL.Statement).%New()
	Set tSC=statement.%PrepareClassQuery("TOOT.Data.Instrument","Extent")
	Set tResult=statement.%Execute()
	Set MaxCount=$BITCOUNT(BitString,1)
	Set MaxBitStr=""
	Set MaxId=0
	While tResult.%Next() {
		Set tmpId=tResult.%Get("ID")
		Set tmpBit=##class(TOOT.Data.Instrument).%OpenId(tmpId,0).NoteListBit
		// OR this record's bits into the coverage so far
		Set tmpBit=$BITLOGIC(BitString|tmpBit)
		Set tmpCount=$BITCOUNT(tmpBit,1)
		If tmpCount>MaxCount {
			Set MaxCount=tmpCount
			Set MaxBitStr=tmpBit
			Set MaxId=tmpId
		}
	}
	Do tResult.%Close()
	If MaxId'=0 {
		Set Row=$LB(MaxId,MaxCount)
		Set AtEnd=0
		Set $LI(qHandle,2)=Counter
		Set $LI(qHandle,3)=MaxBitStr
	} Else {
		// No record adds new coverage, so the result set is complete
		Set Row=""
		Set $LI(qHandle,2)=Counter
		Set AtEnd=1
	}
	Return $$$OK
}

ClassMethod BestNoteListClose(ByRef qHandle As %Binary) As %Status [ PlaceAfter = BestNoteListFetch ]
{
	Set qHandle=""
	Quit $$$OK
}

}
Calling from the Management Portal:
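For example, from the SQL shell in the Management Portal the query can be invoked as a stored procedure. The name below follows the usual Package_Class_QueryName projection convention; adjust it if your projection differs:

```sql
-- Return up to 5 records that best extend bit-flag coverage
CALL TOOT_Data.Instrument_BestNoteList(5,0)
```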
Here ID is the record ID and Count is the increasing coverage of bit flags with each iteration of appending a new record.
I temporarily added logging to the compute method to confirm it is not called while the query is running.

Through working on this and other projects I have really come to appreciate all the heavy lifting that the Gradio framework does. Most of my time is simply focused on requirements, application logic, and model development goals.
The Gradio UI framework automatically generates functional machine-callable APIs. By using HuggingFace Spaces to host Gradio apps plus the GPU model remotely, the API is directly available to be securely consumed from both my local unit-test scripts and benchmark scripts.
All without any redesign or parallel API effort:
See: Gradio view-api-page
Many legacy systems can interoperate with HTTP(S).
So from MCP, the Python / Node.js client, or simply using curl, the API inference service is available.
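Because it is just HTTP(S), even an IRIS instance can consume such a service directly. The sketch below is hypothetical: the host name, route, and JSON payload are placeholders, and the actual route and payload shape should be taken from the Space's view-api page:

```objectscript
// Hypothetical sketch: call a remote Gradio inference endpoint from ObjectScript
Set req=##class(%Net.HttpRequest).%New()
Set req.Server="example-space.hf.space"   // placeholder host
Set req.Https=1
Set req.SSLConfiguration="Default"        // assumes an SSL config named "Default"
Set req.ContentType="application/json"
Do req.EntityBody.Write("{""data"":[""hello""]}")
Set tSC=req.Post("/call/predict")         // placeholder route from view-api page
If $$$ISOK(tSC) Write req.HttpResponse.Data.Read()
```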
Gradio conveniently provides work queues, a neat configuration lever for separating contention: server-side business logic that can happily run, say, 10 parallel requests is balanced independently from the GPU inference requests.
On HuggingFace you can implement two-tier (separate CPU app and GPU Model inference instances) in a way that passes a session cookie from the client back to the GPU Model.
This adds support for per-client request queuing, stopping individual clients from degrading the experience of all the other users.
This demo is "Running on Zero", the "stateless" infrastructure Beta offering from HuggingFace, i.e. the Gradio app is transparently hosted without managing a specific named instance type, and by annotating GPU functions, inference is offloaded transparently to powerful cloud GPUs.
Updating application settings for the model or files automatically triggers a Docker image rebuild (if necessary), relaunches the API, and invalidates and reloads the model (if needed).
Another aspect of the Gradio display framework is being able to embed a Gradio app client-side into an existing web application, instead of implementing a direct server-side integration.
See: Gradio embedding-hosted-spaces
Gradio has been great for local iterative development, in that it can be launched to monitor its own source files and automatically reload the running application in the open web browser after a source file is (re)saved.
In summary, using Gradio (with Python) has set quite high expectations for what is achievable from a single application framework: offloading responsibility for on-demand infrastructure (via HuggingFace), minimal user-interface code, and an automatic API, leaving me to simply focus on business-domain outputs and iterate faster.
A good yardstick for feedback on other frameworks.