Article
· Feb 14, 2024 4m read

Data Tagging in IRIS Using Embedded Python and the OpenAI API

The invention and popularization of Large Language Models (such as OpenAI's GPT-4) have launched a wave of innovative solutions that can leverage large volumes of unstructured data that were impractical or even impossible to process manually until recently. Such applications include data retrieval (see Don Woodlock's ML301 course for a great intro to Retrieval Augmented Generation), sentiment analysis, and even fully autonomous AI agents, just to name a few!

In this article, I want to demonstrate how the Embedded Python feature of IRIS can be used to directly interface with the Python OpenAI library, by building a simple data tagging application that will automatically assign keywords to the records we insert into an IRIS table. These keywords can then be used to search and categorize the data, as well as for data analytics purposes. I will use customer reviews of products as an example use case.

Prerequisites

  • A running instance of IRIS
  • An OpenAI API key (which you can create here)
  • A configured development environment (I will be using VS Code for this article)

The Review Class

Let us start by creating an ObjectScript class that defines the data model for our customer reviews. To keep things simple, we will define only four %String fields: the customer's name, the product name, the body of the review, and the keywords we will generate. The class should extend %Persistent so that we can save its objects to disk.

Class DataTagging.Review Extends %Persistent
{
Property Name As %String(MAXLEN = 50) [ Required ];
Property Product As %String(MAXLEN = 50) [ Required ];
Property ReviewBody As %String(MAXLEN = 300) [ Required ];
Property Keywords As %String(MAXLEN = 300) [ SqlComputed, SqlComputeOnChange = ReviewBody ];
}

Since we want the Keywords property to be computed automatically whenever a row is inserted or its ReviewBody is updated, I am marking it as SqlComputed with SqlComputeOnChange = ReviewBody. You can learn more about computed values here.

The KeywordsComputation Method

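We now want to define a method that computes the keywords from the review body. Because the method is named KeywordsComputation (the property name followed by "Computation"), IRIS uses it as the compute code for the Keywords property. We can use Embedded Python to interact directly with the official openai Python package, but first we need to install it. To do so, run the following shell command: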

<your-IRIS-installation-path>/bin/irispip install --target <your-IRIS-installation-path>/Mgr/python openai

We can now use OpenAI's chat completion API to generate the keywords:

ClassMethod KeywordsComputation(cols As %Library.PropertyHelper) As %String [ Language = python ]
{
    '''
    This method is used to compute the value of the Keywords property
    by calling the OpenAI API to generate a list of keywords based on the review body.
    '''
    from openai import OpenAI

    client = OpenAI(
        # Defaults to os.environ.get("OPENAI_API_KEY")
        api_key="<your-api-key>",
    )

    # Set the prompt; use few-shot learning to give examples of the desired output
    user_prompt = "Generate a list of keywords that summarize the content of a customer review of a product. " \
                + "Output a JSON array of strings.\n\n" \
                + "Excellent watch. I got the blue version and love the color. The battery life could've been better though.\n\nKeywords:\n" \
                + "[\"Color\", \"Battery\"]\n\n" \
                + "Ordered the shoes. The delivery was quick and the quality of the material is terrific!.\n\nKeywords:\n" \
                + "[\"Delivery\", \"Quality\", \"Material\"]\n\n" \
                + cols.getfield("ReviewBody") + "\n\nKeywords:"
    # Call the OpenAI API to generate the keywords
    chat_completion = client.chat.completions.create(
        model="gpt-4",  # Change this to use a different model
        messages=[
            {
                "role": "user",
                "content": user_prompt
            }
        ],
        temperature=0.5,  # Controls how "creative" the model is
        max_tokens=1024,  # Controls the maximum number of tokens to generate
    )

    # Return the array of keywords as a JSON string
    return chat_completion.choices[0].message.content
}
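The method above passes the API key as a string literal. As the comment in the code notes, the client falls back to the OPENAI_API_KEY environment variable when no key is given. Below is a minimal sketch of reading that variable explicitly and failing early when it is missing. The helper name load_api_key is hypothetical, and the sketch assumes the variable is visible to the process that runs Embedded Python, which depends on how your IRIS instance is started.

```python
import os


def load_api_key(env=None):
    """Return the OpenAI API key from the environment, or raise if it is missing.

    `env` is any mapping; it defaults to os.environ and exists mainly
    so the function can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        # Fail before any network call is attempted
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key
```

Inside KeywordsComputation, the literal could then be replaced with api_key=load_api_key(), which keeps the key out of the class source.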

Notice how in the prompt, I first specify the general instructions of how I want GPT-4 to "generate a list of keywords that summarize the content of a customer review of a product," and then I give two example inputs along with the desired outputs. I then insert cols.getfield("ReviewBody") and end the prompt with the word "Keywords:", nudging it to complete the sentence by providing the keywords in the same format as the examples I gave it. This is a simple example of the Few-Shot Prompting technique.
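The concatenated prompt string can also be assembled from a list of examples, which makes it easier to add or swap few-shot examples later. The following is a sketch only: INSTRUCTIONS and build_prompt are hypothetical names, and the layout simply mirrors the string built in the method above.

```python
import json

# Same general instructions as in the method above
INSTRUCTIONS = (
    "Generate a list of keywords that summarize the content of a customer "
    "review of a product. Output a JSON array of strings."
)


def build_prompt(examples, review_body):
    """Build a few-shot prompt.

    `examples` is a list of (review_text, keywords) pairs; each pair is
    rendered as the review followed by "Keywords:" and a JSON array.
    The prompt ends with the new review and an open "Keywords:" cue.
    """
    parts = [INSTRUCTIONS]
    for text, keywords in examples:
        parts.append(f"{text}\n\nKeywords:\n{json.dumps(keywords)}")
    parts.append(f"{review_body}\n\nKeywords:")
    return "\n\n".join(parts)
```

With this helper, the method body would call build_prompt(examples, cols.getfield("ReviewBody")) instead of concatenating strings by hand.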

I chose to store the keywords as a JSON string for the sake of simplicity of presentation; a better way to store them in production could be a DynamicArray, but I will leave this as an exercise for the reader.
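Whichever storage format you choose, keep in mind that the model's reply is free text and is not guaranteed to be a valid JSON array. Here is a minimal sketch of a validation step that could run before the value is returned or consumed. The helper parse_keywords is hypothetical, and returning an empty list on bad input is just one possible policy.

```python
import json


def parse_keywords(raw):
    """Turn the model's reply into a clean list of keyword strings.

    Returns an empty list when the reply is not a JSON array.
    Non-string items, blanks, and case-insensitive duplicates are dropped.
    """
    try:
        value = json.loads(raw.strip())
    except (json.JSONDecodeError, AttributeError):
        # Not valid JSON, or raw was not a string at all
        return []
    if not isinstance(value, list):
        return []

    seen = set()
    keywords = []
    for item in value:
        if not isinstance(item, str):
            continue
        word = item.strip()
        if word and word.lower() not in seen:
            seen.add(word.lower())
            keywords.append(word)
    return keywords
```

Calling json.dumps(parse_keywords(reply)) would then always yield a well-formed JSON array, even when the model adds stray text.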

Generating Keywords

We can now test our data tagging application by inserting a row into our table using the following SQL query through the Management Portal:

INSERT INTO DataTagging.Review (Name, Product, ReviewBody)
VALUES ('Ivan', 'BMW 330i', 'Solid car overall. Had some engine problems but got everything fixed under the warranty.')

Querying the table afterwards shows that the Keywords column of the new row was populated automatically with four keywords. Well done!

Conclusions

To summarize, the ability of InterSystems IRIS to embed Python code opens up a wide range of possibilities when dealing with unstructured data. Leveraging the power of OpenAI for automated data tagging is just one example of what one can achieve with this powerful feature. Automating the tagging step can reduce the errors and time involved in labeling records by hand.

Question
· Feb 8, 2024

help with TLS on 2016 version

Hi,

I am trying to connect to another server using %Net.HttpRequest.

I keep getting this error: SSL23_GET_SERVER_HELLO:unsupported protocol.

My guess is that the site I am trying to reach uses TLS 1.3, which is not supported in the 2016 version, but I can't ask my client to upgrade right now.

Is it possible to work around this? Could I install some kind of patch, or a more recent version of OpenSSL, on the server?

Thanks

Amiram

Announcement
· Feb 8, 2024

Seeking Exam Design Feedback for InterSystems IRIS Developer Professional Exam

Hello Everyone,

The Certification Team of InterSystems Learning Services is developing an InterSystems IRIS Developer Professional certification exam, and we are reaching out to our community for feedback that will help us evaluate and establish the contents of this exam.

Note: This exam will replace the current InterSystems IRIS Core Solutions Developer Specialist exam when it is released. Please note from the target role description below that the focus of the new exam will be more on developer best practices and a lot less on the ObjectScript programming language.

How do I provide my input? Complete our Job Task Analysis survey (JTA)! We will present you with a list of job tasks, and you will rate them on their importance as well as other factors.

How much effort is involved? It takes about 20-30 minutes to fill out the survey. You can be anonymous or identify yourself and ask us to get back to you.

How can I access the survey? You can access it here

  • Survey does not work well on mobile devices - you can access it, but it will involve a lot of scrolling
  • Survey can be resumed if you return to it on the same device in the same browser - answers save with the Save/Next button
  • Survey will close on March 8, 2024

 

What’s in it for me? You get to weigh in on the exam topics for our new developer exam AND you will be entered in a raffle where 15 lucky winners will be given a $50 Tango* card (available for US-based participants; InterSystems and VA employees are not eligible).

* Tango cards are a popular digital reward platform that provides a wide selection of e-gift cards from various retailers.

Here are the exam title and the definition of the target role:

InterSystems IRIS Developer Professional

A back-end software developer who:

  • writes and executes efficient, scalable, maintainable, and secure code on (or adjacent to) InterSystems IRIS using best practices for the development lifecycle,
  • effectively communicates development needs to systems and operations teams (e.g., database architecture strategy),
  • integrates InterSystems IRIS with modern development practices and patterns, and
  • is familiar with the different data models and modes of access for InterSystems IRIS (ObjectScript, Python, SQL, JDBC/ODBC, REST, language gateways, etc.).

At least 2 years of experience developing with InterSystems IRIS is recommended. Any code samples that include InterSystems IRIS classes will have methods displayed in both ObjectScript and Python (or SQL). 

Announcement
· Feb 7, 2024

InterSystems Certification is looking for question writers for our upcoming InterSystems TrakCare Reports exam

The InterSystems Certification Team is building an InterSystems TrakCare Reports certification exam and is looking for Subject Matter Experts (SMEs) from our community to help write and review questions. You, as a valued InterSystems community member, know the challenges of working with our technology and what it takes to be successful at your job. A work assignment will typically involve writing 15 assigned questions and reviewing 15 assigned directly to you.

Proposed Project Work Dates: The work assignments will be assigned by the Certification Team through September 15, 2024.

Here are the details:

Action Item

Details

Contact InterSystems Certification

Write to certification@intersystems.com to express your interest in the Certification Subject Matter Expert Program. Tell us that you are interested in being an InterSystems TrakCare Reports SME (an individual with at least one year of experience with InterSystems TrakCare Reports tasks).

Complete project profile - External Participants

If you are an external volunteer looking to participate, a team member will send you a profile form to determine if your areas of expertise align with an open project.

Accept

If you are selected for an exam development opportunity, a team member will email you a Non-Disclosure Agreement requiring your signature.

Train

After receiving your signed document, and before beginning to write questions, you will be asked to watch a short training video on question-item writing.

Participate

Once onboarded, the Certification Team will send you information regarding your first assignment. This will include:

  • an invitation to join Certiverse, our new test delivery platform, as an item writer and reviewer
  • an item writing assignment, which usually consists of the submission of 15 scenario-based questions
  • an alpha testing assignment, which usually consists of reviewing 15 items written by your peers

You will typically be given one month to complete the assignment.

Subject Matter Experts are eligible for a SME badge based on successful completion of their exam development participation. SMEs are also awarded the InterSystems TrakCare Reports certification if they write questions for all KSA Groups and their questions are accepted.

Interested in participating? Email certification@intersystems.com now!

KSA Groups, KSAs, and Target Items

KSA Group 1. Creates InterSystems Reports using Logi Designer within TrakCare

  KSA 1. Describes what the specification is saying
    1. Recalls what data sources and procedures are, and how to access the sources of data
    2. Identifies what parameters are used from the specification
    3. Distinguishes between different page report component types (e.g. cross tabs, banded objects, normal tables)

  KSA 2. Identifies the components of InterSystems Reports
    1. Distinguishes between catalogues and reports
    2. Recalls the features of a catalogue
    3. Catalogues connections and terms
    4. Accesses the catalogue manager in the designer
    5. Identifies which data source types are used in reporting
    6. Identifies the data source connection and how to modify it
    7. Identifies what is required to use a JDBC connection
    8. Recalls what a stored procedure is
    9. Recalls when and why to update a stored procedure
    10. Distinguishes between different data sources and their use cases
    11. Recalls the importance of binding parameters
    12. Manages catalogues using reference entities
    13. Recalls how to change the SQL type of a database field (e.g. dates)
    14. Identifies how to reuse sub-reports
    15. Recalls the different use cases for sub-reports
    16. Describes how to use parameters within a sub-report
    17. Recalls how to configure the parameters that the sub-report requires
    18. Recalls how to link a field on a row to filter sub-reports
    19. Recalls the potential impact of updating stored procedures on the settings

  KSA 3. Uses Logi Designer to design and present data
    1. Distinguishes between the different formats of reports
    2. Determines when and how to use different kinds of page report component types
    3. Recalls the meaning of each band and where they appear (e.g. page header vs banded page header)
    4. Recalls how to add groups and work with single vs multiple groups
    5. Differentiates between the types of summaries
    6. Uses tools to manage, organize and group data and pages including effectively using page breaks
    7. Identifies when to use formulas
    8. Uses formulas to format data and tables
    9. Determines how to best work with images including using dynamic images
    10. Uses sub-reports effectively
    11. Inserts standard page headers and footers into report
    12. Recalls how to embed fonts into report
    13. Applies correct formatting, localization, and languages

KSA Group 2. Integrates InterSystems reporting within TrakCare

  KSA 1. Understands TrakCare report architecture
    1. Applies correct formatting, localization, and languages
    2. Recalls how many user-inputted parameters can be used in TrakCare
    3. Recalls how to set up a menu for a report and how to add a menu to a header
    4. Recalls what a security group is and adds menus to security group access
    5. Configures TrakCare layout webcommon.report
    6. Differentiates between different types of layout fields

KSA Group 3. Supports InterSystems Reports

  KSA 1. Verifies printing setup
    1. Debugs using menu or preview button
    2. Tests the report by making sure it runs as expected
    3. Demonstrates how to run reports with different combinations of parameters
    4. Tests report performance with a big data set
    5. Identifies error types

  KSA 2. Uses print history
    1. Identifies use cases for the print history feature
    2. Recalls the steps to retry printing after a failed print
    3. Uses print history to verify parameters are correctly passed to the parameters in the stored procedure
    4. Recalls how to identify a report was successfully previewed or if it encountered errors

Announcement
· Feb 6, 2024

Seeking Exam Design Feedback for InterSystems TrakCare Technical Integration Specialist Exam

Hello Everyone,

The Certification Team of InterSystems Learning Services is in the process of developing an exam focusing on creating and working with TrakCare Integration, and we need input from our InterSystems TrakCare community. Your input will be used to evaluate and establish the contents of the exam.

How do I provide my input? We will provide a list of job tasks. You will rate them on their importance as well as other factors.

How much effort is involved? It takes about 15-20 minutes to fill out the survey.

How can I access the survey? You can access it here: InterSystems TrakCare Technical Integration Specialist 

  • Survey does not work well on mobile devices - you can access it, but it will involve a lot of scrolling
  • Survey can be resumed if you return to it on the same device in the same browser - answers save with the Save/Next button
  • Survey will close on March 22, 2024

Here are the exam title and the definition of the target role:

InterSystems TrakCare Technical Integration Specialist

An IT specialist who has at least 6-12 months of full-time experience working with TrakCare integrations and is experienced with:

  • general TrakCare fundamentals,
  • the TrakCare data model,
  • industry-standard integration messaging formats (HL7v2/FHIR/SDA3/IHE), and
  • the HealthCare Messaging Framework (HMF).

Thank you,

InterSystems Certification
