How to convert an embeddings vector to be saved as %Vector
I get an embedding by calling GetEmbedding, but I have not managed to convert the result into a %Vector value.
Here is a simplified sample class:
Class User.myclass Extends %Persistent
{
Property myVECTOR As %Vector(CAPTION = "Vector");
Property myProperty As %String(MAXLEN = 40) [ Required ];
}
Here is the GetEmbedding part of User.mymethods:
...
ClassMethod GetEmbedding(sentences As %String) As %String [ Language = python ]
{
    import sentence_transformers
    model = sentence_transformers.SentenceTransformer('C:/InterSystems/IRIS/lib/python/Lib/site-packages/sentence_transformers/models/all-MiniLM-L6-v2')
    embeddings = model.encode(sentences)
    embeddings_list = [str(embedding.tolist()) for embedding in embeddings]
    return embeddings_list
}
...
I then try to save VECTOR into myVECTOR, but this fails because the value is not in the right format:
set VECTOR=##class(User.mymethods).GetEmbedding("this is my text")
/// here I need to convert the VECTOR, but I do not know the right command
set data=##class(User.myclass).%New()
set data.myProperty ="anything"
set data.myVECTOR=VECTOR
set ok=data.%Save()
Question: Is there a method to convert the vector into the right format so that it can be saved as %Vector?
Comments
Hello @Ditmar Tybussek, long time no hear from you!
See my article Using VECTORs in ObjectScript.
Regarding your code:
A simple Set vector=... doesn't work.
You have to use Set $VECTOR(vectorname,pos,type)=value; the article has a very clear example.
The type can never be changed once it is set;
the position is less sensitive.
In your case, the SQL function TO_VECTOR() might be quite comfortable,
as it does all the checks for you under the covers.
I also used all-MiniLM-L6-v2 (as a Python method), starting from label sc3 in the example
in my OEX package Vector-inside-IRIS.
Greetings from Vienna
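As a rough illustration of the TO_VECTOR() route: the Python side could return one comma-separated string instead of a list of strings, which is the kind of input TO_VECTOR() is documented to accept. This is only a sketch under assumptions. The helper name embedding_to_csv is hypothetical, and the short list stands in for the output of model.encode(...); it is not the code from the article or the package.

```python
def embedding_to_csv(values):
    """Format a 1-D sequence of numbers as one comma-separated string."""
    # float() also accepts NumPy scalars, so a 1-D array returned by
    # model.encode() should be handled the same way as a plain list.
    return ",".join(repr(float(v)) for v in values)


# Stand-in for model.encode("this is my text") (assumption: 1-D result)
sample = [0.5, -1.25, 2]
print(embedding_to_csv(sample))
```

On the ObjectScript side, the returned string could then be handed to TO_VECTOR() in an SQL INSERT rather than assigned to the property directly.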
Thanks, Robert,
is TO_VECTOR() (SQL) the "only" way to convert an array, or is there an alternative for someone who likes to see "what's under the covers" and wants to set the vector while creating a record?
Ditmar
Not really, since, as you know, there is always ObjectScript underneath most SQL functions.
But as mentioned, I found TO_VECTOR() the most comfortable option.
Without SQL it might look like this, or worse:
 ;; assume a $LB structure of %Integer as input
 set intlist=$LB(...........)
 ;; fill vector
 set maxvecsize=22 ; just an assumption
 for i=1:1:maxvecsize {
     set val=$li(intlist,i)
     if '(val\1=val) continue
     set $vector(vec,i,"int")=val
 }
 write $isvector(vec)
 write vec
 zwrite vec
I think the difference is obvious.
Cheers
Try this example
Class User.myclass Extends %Persistent
{

Property myVECTOR As %Vector(CAPTION = "Vector", DATATYPE = "INTEGER");

Property myProperty As %String(MAXLEN = 40) [ Required ];

ClassMethod GetEmbedding1(sentences As %String) As %String
{
    q "2,4,6,8"
}

ClassMethod GetEmbedding2(sentences As %String) As %DynamicObject
{
    q [1,3,5,7,9]
}

ClassMethod Test()
{
    d ..%KillExtent()

    set data=##class(User.myclass).%New()
    set data.myProperty="anything 1"
    set data.myVECTOR=##class(User.myclass).myVECTORDisplayToLogical(##class(User.myclass).GetEmbedding1("this is my text"))
    d $system.OBJ.DisplayError(data.%Save())

    ; OR

    set data=..%New()
    set data.myProperty="anything 2"
    set data.myVECTOR=data.myVECTORDisplayToLogical(..GetEmbedding2("this is my text"))
    d $system.OBJ.DisplayError(data.%Save())

    zw ^User.myclassD
}

}
USER>d ##class(User.myclass).Test()
^User.myclassD=2
^User.myclassD(1)=$lb("",{"type":"integer", "count":4, "length":4, "vector":[2,4,6,8]} ; <VECTOR>,"anything 1")
^User.myclassD(2)=$lb("",{"type":"integer", "count":5, "length":5, "vector":[1,3,5,7,9]} ; <VECTOR>,"anything 2")