Sean Connelly · Aug 26, 2024

The results can be good, but there's room for improvement.

One of my go-to LLM tests is to provide a zero-shot prompt for ObjectScript code that calculates the distance between two points, as seen in this example (https://community.intersystems.com/ask-dc-ai?question_id=238163), which has some issues.

Most LLMs get close to answering this question, but they often fail on operator precedence and/or the square root function, either misspelling $ZSQR as $SQRT or hallucinating a math library that doesn't even exist, such as ##class(%SYSTEM.Math).Sqrt().

The problem stems from the token volume of other languages outweighing that of ObjectScript in the gradient descent during training. This causes solutions and functions from other languages to bleed into the ObjectScript code.

RAG powered by IRIS vector search is a good approach to address this, but it looks like it can be improved.

Generally, the LLM has the right idea of how to answer the question, but not always the correct details. To use an analogy, the LLM already has millions of recipes, but these might not match the ingredients in the kitchen. If we can tell it what ingredients to use, it will do a far better job of adapting a relevant recipe.

One strategy is to first ask the LLM to break down a question into smaller questions, such as "How do you calculate the square root of a number?" However, the DC AI still struggles with these simple atomic questions about the ObjectScript language:

https://community.intersystems.com/ask-dc-ai?question_id=239065
https://community.intersystems.com/ask-dc-ai?question_id=239114
https://community.intersystems.com/ask-dc-ai?question_id=239125

At a minimum, the vector database should also include these atomic facts about IRIS and ObjectScript.
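To make the operator precedence pitfall concrete, here is a minimal Python sketch (not ObjectScript) that imitates strict left-to-right evaluation for the sum of squares in the distance formula. The helper `eval_left_to_right` and the sample values are hypothetical and exist only for illustration.

```python
import math
import operator

# Binary operators supported by this toy evaluator.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def eval_left_to_right(tokens):
    """Evaluate a flat [number, op, number, op, ...] list strictly left to right."""
    result = tokens[0]
    for i in range(1, len(tokens), 2):
        result = OPS[tokens[i]](result, tokens[i + 1])
    return result

dx, dy = 3, 4

# Without brackets, dx*dx+dy*dy is read as ((dx*dx)+dy)*dy.
unbracketed = eval_left_to_right([dx, "*", dx, "+", dy, "*", dy])

# With brackets, each square is computed on its own before the addition.
bracketed = eval_left_to_right([dx, "*", dx]) + eval_left_to_right([dy, "*", dy])

print(unbracketed, math.sqrt(unbracketed))  # 52, so the "distance" is wrong
print(bracketed, math.sqrt(bracketed))      # 25, distance 5.0
```

An LLM that carries over the precedence rules of other languages will tend to produce the unbracketed form, which is why stating the left-to-right rule as an explicit fact helps.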
One approach I've experimented with is to produce a compact document of atomic facts and either include the entire text in the prompt or have the LLM first select which facts it thinks it needs from a keyword list. This results in a prompt re-engineered by the LLM itself:

"InterSystems IRIS ObjectScript Reference: ObjectScript has strict left-to-right operator precedence, use brackets to ensure correct operator precedence, $ZSQR(num) is used to calculate the square root of the given number, $ZPOWER(num,exponent) is used to calculate the value of a number raised to a specified power. Task: How do you calculate the distance between two points."

With this approach, we see that the DC AI gives a much better response: https://community.intersystems.com/ask-dc-ai?question_id=239152

    ClassMethod CalculateDistance(x1, y1, x2, y2) As %Float
    {
        // Calculate the differences
        Set dx = x2 - x1
        Set dy = y2 - y1
        // Calculate the distance using the Pythagorean theorem
        Set distance = $ZSQR($ZPOWER(dx, 2) + $ZPOWER(dy, 2))
        Return distance
    }

I'm enthusiastic about this technology's potential. If you need a beta tester for future versions, please reach out. I'm also happy to contribute my ideas further if it's of use.
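The atomic facts approach described in this post could be sketched roughly as follows, in Python. This is only an illustration under stated assumptions: the `FACTS` dictionary, its keywords, and the `build_prompt` function are hypothetical names, and the step where the LLM picks keywords is replaced by a plain list passed in by the caller. The fact texts themselves are copied from the prompt quoted above.

```python
# Hypothetical fact store: keyword -> atomic fact about ObjectScript.
# The keywords are illustrative; the fact texts come from the quoted prompt.
FACTS = {
    "precedence": "ObjectScript has strict left-to-right operator precedence, "
                  "use brackets to ensure correct operator precedence",
    "square root": "$ZSQR(num) is used to calculate the square root of the given number",
    "power": "$ZPOWER(num,exponent) is used to calculate the value of a number "
             "raised to a specified power",
}

def build_prompt(task, selected_keywords):
    """Prepend the selected atomic facts to the task as a reference section."""
    # Unknown keywords are ignored rather than treated as errors.
    facts = [FACTS[k] for k in selected_keywords if k in FACTS]
    if not facts:
        return "Task: " + task
    reference = "InterSystems IRIS ObjectScript Reference: " + ", ".join(facts) + "."
    return reference + " Task: " + task

# In the real flow the keyword list would come back from the LLM;
# here it is hard-coded for the distance question.
prompt = build_prompt(
    "How do you calculate the distance between two points.",
    ["precedence", "square root", "power"],
)
print(prompt)
```

Keeping the facts short and keyed by keyword means the whole list of keywords can fit in a first prompt, and only the selected facts need to be sent with the actual task.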
Sean Connelly · Jan 14, 2024

Hi Scott,

It's probably best to avoid modifying the Ens.MessageHeader properties, as doing so might affect trace logging and potentially lead to unexpected side effects.

Here are a few alternative ideas:

Modify the MSH Segment: In the normalization process, tweak the sender or receiver field in the MSH segment to include a unique identifier that corresponds to the source service name. Then use the MSH to route the message.

Use a Utility Method: Develop a utility method within a class that inherits from Ens.Util.FunctionSet. This method would read the source config name from the first message header in the session. You can then use this method in your router logic, as it will be automagically included.

Separate Normalization Processes: A config-only option would be to create a normalization process for each service and then use that process name in the router logic.
Sean Connelly · Nov 16, 2023

Hey Paul,

Half agree, if the OP's requirements turn out to be a file-to-file use case.

If not, I wanted to defend the record mapper solution a little so as not to put other readers off this approach.

As an example, I recently implemented a solution that had a million-line CSV file. This would generate around 300,000 messages from a complex mapping. Each record was individually passed into a DTL, which in turn produced a master file update message. These were then pushed into the local EPR (as per the EPR's requirements).

Yes, many messages, but not what I would call an overhead when taken in the context of the frequency and need for M16 messages.

Bottom line, the solution I selected was the most maintainable one. Almost zero glue code to go wrong, and all maintenance managed inside DTLs. This is exactly what the record mapper was designed for.
Sean Connelly · Nov 16, 2023

Hi Thom,

What are you trying to do with the data?

Rereading your original post, you say you want to make a slight transformation to the file, which almost suggests the end result is another CSV file?

Perhaps if you could expand on the requirements a little, it will be easier to point you in the simplest direction.
Sean Connelly · Nov 15, 2023

Hi,

You can use EnsLib.RecordMap.Service.FileService. This will ingest large files and generate many small messages that you can then send to a router / DTL.

Here is a video explaining how it's done:

https://learning.intersystems.com/course/view.php?id=1094

Good luck!
Sean Connelly · Nov 15, 2023

Looks like there are some conflicts in the points...

Java Native has 2 and then 3 points.

Here the LLM has 3, 4 and 6 points...

LLM AI or LangChain usage: Chat GPT, Bard and others - 3
LLM AI or LangChain usage: Chat GPT, Bard and others - 4 points
Collect 6 bonus expert points for building a solution that uses LangChain libs or Large Language Models (LLM)
Sean Connelly · Oct 24, 2023

I recommend using two terminals to experiment with locking and unlocking various globals. By observing the lock table during this process, you'll gain a clearer understanding of lock behavior across different processes.

Next, consider what you're aiming to lock. In other words, identify what you're trying to safeguard and against which potential issues. For instance, is the variable "loc" unique? Could two processes obtain the same value for "loc"? Without seeing the preceding code, it's challenging to discern if "loc" was assigned manually or via `$Increment`. Remember, using `$Increment` eliminates the need for locks in most cases.

Also, reevaluate your decision to use transactions for a single global write. If your goal is transactional integrity, think about which additional global writes should be encompassed in that transaction. For example, if "obj" is defined just before this code with a `%Save()` method, then include that save in the same transaction. Otherwise, a system interruption between the two actions can lead to an unindexed object, compromising data integrity.

I strongly advise revisiting the documentation multiple times and actively experimenting with these concepts. While these techniques offer significant advantages, their efficacy diminishes if not executed properly.
Sean Connelly · Oct 23, 2023

Hi Rochdi,

As mentioned, always ensure locks are explicitly released after use.

Reading between the lines a little, you might find the following points useful...

Locks don't actually "seal" a global node; they're advisory. Any rogue process can still write to a locked global.

Another process is only deterred from writing to a global if it also attempts to obtain a lock and fails. The developer is responsible for implementing this and for handling failed locks in every location where a write happens.

Without a timeout, a process can hang indefinitely waiting on a lock. You could argue it's good practice to always use a timeout.

If you implement a timeout, always verify the value of $test to ensure you've acquired the lock and not just timed out.

$Increment() is useful for creating sequenced ordinal IDs that are always unique, without the need for locks. This holds even for high-concurrency solutions.

(If you're using locks only to make the key "inc" unique, then consider using $increment and forgo the locks.)
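The advisory nature of locks and the timeout check described above can be sketched in Python. This is an analogy rather than ObjectScript: `threading.Lock` stands in for the lock on a global node, the `store` dictionary stands in for the global, checking the return value of `acquire` plays the role of checking $test, and the function names are hypothetical.

```python
import threading

store = {}               # stands in for the global being protected
lock = threading.Lock()  # stands in for the lock on that global

def guarded_write(key, value, timeout=1.0):
    """Write only if the lock is acquired within the timeout; report success."""
    acquired = lock.acquire(timeout=timeout)
    if not acquired:
        # Timed out: the caller must handle this, not assume the write happened.
        return False
    try:
        store[key] = value
        return True
    finally:
        # Always release explicitly after use.
        lock.release()

def rogue_write(key, value):
    """Ignores the lock entirely; nothing stops it from writing."""
    store[key] = value

lock.acquire()                          # simulate another holder of the lock
print(guarded_write("loc", 1, 0.1))     # False: timed out, nothing written
rogue_write("loc", 99)                  # still succeeds, the lock is advisory
lock.release()
print(guarded_write("loc", 2))          # True: lock was free
```

The lock only protects the data if every writer goes through the guarded path, which is the point made above about handling failed locks in every location where a write happens.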
Sean Connelly · Oct 23, 2023

Expanding on this a little, your service / process that calls the operation will get back an Ens.StreamContainer. If at this point you need access to the JSON, then convert the stream to a dynamic object, something like...

    set obj = ##class(%DynamicAbstractObject).%FromJSON(response.StreamGet())
Sean Connelly · Oct 23, 2023

Hi Yone,

I would keep it simple: avoid unpacking JSON here and make pResponse a generic Ens.StreamContainer.

Something like this should do it...

    set tSC=httpRequest.Post(URL,0)
    //no need to throw, the director will handle tSC
    if $$$ISERR(tSC) return tSC
    set pResponse=##class(Ens.StreamContainer).%New()
    return pResponse.StreamSet(tResponse.Data)
Sean Connelly · Oct 3, 2023

I wonder if MIN or MAX would also work, assuming the values are all the same...

    MAX(a) as No_Urut
Sean Connelly · Jul 2, 2023

Hi Irina,

I've been trying to submit DevBox into the competition, but I'm not getting a submit option for it.
Sean Connelly · Aug 12, 2022

    ClassMethod OnPage() As %Status [ ServerOnly = 1 ]
    {
        //just the query string...
        set qs=%request.CgiEnvs("QUERY_STRING")

        //SOLUTION 1: $piece only
        set externalCenterCode=$p(qs,":")
        set startDateRange=$p($p(qs,":authoredOn=le",2),":")
        set endDataRange=$p($p(qs,":authoredOn=ge",2),":")

        //SOLUTION 2: generic solution if params grow
        for i=1:1:$l(qs,":") {
            set nvp=$p(qs,":",i),name=$p(nvp,"=",1),value=$p(nvp,"=",2)
            //fix the quirks
            //the first piece has no "=", so treat it as the value of "ecc"
            if value="" set value=name,name="ecc"
            //split the le / ge prefix off the date and use it as the name
            if name="authoredOn" set name=$e(value,1,2),value=$e(value,3,*)
            set params(name)=value
        }

        //SOLUTION 3: regex(ish) solution
        set code=$p(qs,":")
        set loc=$locate(qs,"le\d{4}-\d{2}-\d{2}")
        set start=$e(qs,loc+2,loc+11)
        set loc=$locate(qs,"ge\d{4}-\d{2}-\d{2}")
        set end=$e(qs,loc+2,loc+11)

        //some helper code to dump the variables into the CSP page
        write !,"<pre>"
        zwrite
        //use this to take a good look at the request object...
        zwrite %request
        write !,"</pre>"
        quit $$$OK
    }

Here are three solutions and a couple of inline tips, including the regex example you asked for.

I wouldn't worry too much about using $piece; it's very common to use it in this way.

Eduard's comment above also has a fourth suggestion, to use $lfs (list from string), which is also commonly used as a way of piecing out data.
Sean Connelly · Jul 18, 2022

Hi Community and Experts,

Many thanks for all the votes!

I have big plans for Kozo Pages, with lots of work still to do, as well as integration with CloudStudio. Not sure I will have time for any more competitions for some while :)

I also have an exciting second part to the Kozo Pages solution that I have not revealed yet, so lots more to come!
Sean Connelly · Jun 20, 2022

Hi Luc. Thanks for the update.

I've only been able to test in the latest versions of Chrome and Edge so far.

I was wondering if anyone has tried Safari yet? I suspect it will probably fail until I add direct support for it.
Sean Connelly · Jun 16, 2022

Thanks again for the help getting this working Robert, very much appreciated!
Sean Connelly · Jun 13, 2022

For any early adopters who want to provide agile input / feedback into the project, I have set up a Discord channel. Everyone is welcome...

https://discord.gg/ZnvdMywsjP