1. IRIS RAG Demo

This demo showcases the powerful synergy between IRIS Vector Search and RAG (Retrieval Augmented Generation), providing a cutting-edge approach to interacting with documents through a conversational interface. Utilizing InterSystems IRIS's newly introduced Vector Search capabilities, this application sets a new standard for retrieving and generating information based on a knowledge base.
The backend is written in Python and builds on IRIS and IoP; the LLM is orca-mini, served by an ollama server.
The frontend is a chatbot written with Streamlit.
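
The retrieval step of a RAG pipeline like the one described above can be sketched in a few lines. This is a minimal illustration only, not the demo's actual code: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for IRIS Vector Search, and all function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Assemble the retrieved context and the question into one LLM prompt."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Hypothetical knowledge base, for illustration only.
docs = [
    "IRIS stores vectors and supports vector search",
    "Streamlit builds chat interfaces in Python",
    "Recipes often list ingredients and steps",
]
top = retrieve("how does vector search work in IRIS", docs, k=1)
prompt = build_prompt("how does vector search work in IRIS", docs, k=1)
```

In a real deployment the prompt would be sent to the LLM server, and the similarity ranking would be done by the database rather than in Python.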

We have a yummy dataset of recipes written by multiple Reddit users; however, most of the information is free text in the title or description of a post. Let's find out how easily we can load the dataset, extract some features, and analyze it using an OpenAI large language model within Embedded Python and the Langchain framework.

With the release of InterSystems IRIS, we're also making available a nifty bit of software that allows you to get the best out of your InterSystems IRIS cluster when working with Apache Spark for data processing, machine learning and other data-heavy fun. Let's take a closer look at how we're making your life as a Data Scientist easier, as you're probably facing tough big data challenges already, just from the influx of job offers in your inbox!

In this article, I try to identify several areas where features can be developed using Python and machine learning.

Every hospital is constantly trying to improve its quality of service and efficiency using technology and services.

The healthcare sector is one of the largest and broadest service areas, and Python is one of the best technologies for machine learning.

People come to every hospital with feelings; if technology can understand those feelings, there is a chance to provide better service.

As you have seen in the latest community publications, since version 2024.1 InterSystems IRIS has supported vector data types in its database, and vector searches have been implemented on top of this data type. These new features reminded me of the article I published a while ago that was based on facial recognition using Embedded Python.

A few months ago, I read this interesting article from MIT Technology Review, explaining how the COVID-19 pandemic is posing challenges to IT teams worldwide regarding their machine learning (ML) systems.

That article inspired me to think about how to deal with performance issues after an ML model has been deployed.

Article
· Jul 27, 2018 4m read
Load a ML model into InterSystems IRIS

Hi all. Today we are going to upload an ML model into IRIS Manager and test it.

Note: I have done the following on Ubuntu 18.04, Apache Zeppelin 0.8.0, Python 3.6.5.

Introduction

These days, the many tools available for data mining enable you to develop predictive models and analyze the data you have with unprecedented ease. InterSystems IRIS Data Platform provides a stable foundation for your big data and fast data applications, offering interoperability with modern data mining tools.

Keywords: Jupyter Notebook, Tensorflow GPU, Keras, Deep Learning, MLP, and HealthShare

1. Purpose and Objectives

In the previous "Part I" we set up a deep learning demo environment. In this "Part II" we will test what we can do with it.

Many people my age started with the classic MLP (Multi-Layer Perceptron) model. It is intuitive and hence conceptually easier to start with.

Artificial intelligence is not limited only to generating images through text with instructions or creating narratives with simple directions.

You can also make variations of a picture or add a special background to an already existing one.

Additionally, you can obtain the transcription of audio regardless of its language and the speed of the speaker.

So, let's analyze how the file management works.

Diabetes can be discovered from some parameters well known to the medical community. In this way, in order to help the medical community and computerized systems, especially AI, the National Institute of Diabetes and Digestive and Kidney Diseases published a very useful dataset for training ML algorithms in the detection/prediction of diabetes. This publication can be found on the largest and best known data repository for ML, Kaggle at https://www.kaggle.com/datasets/mathchi/diabetes-data-set.

Keywords: IRIS, IntegratedML, Machine Learning, Covid-19, Kaggle

Purpose

Recently I noticed a Kaggle dataset for the prediction of whether a Covid-19 patient will be admitted to ICU. It is a spreadsheet of 1925 encounter records of 231 columns of vital signs and observations, with the last column of "ICU" being 1 for Yes or 0 for No. The task is to predict whether a patient will be admitted to ICU based on known data.
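
The shape of the task described above (many observation columns, with a final "ICU" column as the 0/1 label) can be sketched as follows. This is only an illustration of separating features from the label and computing a majority-class baseline; it is not the article's IntegratedML approach. The sample rows and the column names other than "ICU" are made up for the example.

```python
def split_features_label(rows, label="ICU"):
    """Separate each record into a feature dict and its 0/1 label."""
    features, labels = [], []
    for row in rows:
        features.append({k: v for k, v in row.items() if k != label})
        labels.append(int(row[label]))
    return features, labels


def majority_baseline(labels):
    """Accuracy of always predicting the most common class.

    Any trained model should beat this number to be worth using.
    """
    ones = sum(labels)
    return max(ones, len(labels) - ones) / len(labels)


# Made-up records for illustration; the real dataset has 1925 rows and 231 columns.
sample = [
    {"HEART_RATE": 110, "TEMPERATURE": 38.9, "ICU": 1},
    {"HEART_RATE": 72, "TEMPERATURE": 36.6, "ICU": 0},
    {"HEART_RATE": 80, "TEMPERATURE": 37.0, "ICU": 0},
]
X, y = split_features_label(sample)
baseline = majority_baseline(y)
```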

Article
· Apr 8, 2019 4m read
Should we use computers?

The titular question was quite relevant and often discussed some thirty years ago. The thought went: “Sure, there are industries where computers are the norm, but in my industry we have gotten by just fine so far; the benefits are questionable, the problems innumerable and unsolved. Can we continue as before or should we embrace this new technology?”

Today, everyone asks the same question but about Machine Learning and Artificial Intelligence. The doubts are the same – lack of expertise, lack of known path, perceived irrelevancy to the industry.

Yet, as before, the correct, even the only possible answer is a resounding yes. Read on to find out why.

Fixing the terminology

A robot is not expected to be either huge or humanoid, or even material (in disagreement with Wikipedia, although the latter softens the initial definition in one paragraph and admits a virtual form of robot). From an algorithmic viewpoint, a robot is an automaton for the autonomous (algorithmic) execution of concrete tasks. A light detector that triggers street lights at night is a robot. Email software that separates e-mails into “external” and “internal” is also a robot. Artificial intelligence (in an applied and narrow sense, with Wikipedia again interpreting it differently) is a set of algorithms for extracting dependencies from data. It will not execute any tasks on its own; for that, one would need to implement it as concrete analytic processes (input data, plus models, plus output data, plus process control). The analytic process acting as an “artificial intelligence carrier” can be launched by a human or by a robot. It can be stopped, and managed, by either of the two as well.

Challenges of real-time AI/ML computations

We will start with examples that we have faced as the Data Science practice at InterSystems:

  • A “high-load” customer portal is integrated with an online recommendation system. The plan is to reconfigure promo campaigns at the level of the entire retail network (we will assume that a “segment-tactic” matrix will be used instead of a “flat” promo campaign master). What will happen to the recommender mechanisms? What will happen to data feeds and updates into the recommender mechanisms (the volume of input data having increased 25,000 times)? What will happen to the recommendation rule generation setup (the rule filtering threshold needing to be reduced 1,000 times due to a thousandfold increase in the volume and “assortment” of the rules generated)?
  • An equipment health monitoring system uses “manual” data sample feeds. Now it is connected to a SCADA system that transmits thousands of process parameter readings each second. What will happen to the monitoring system (will it be able to handle equipment health monitoring on a second-by-second basis)? What will happen once the input data receives a new block of several hundred columns with readings from sensors recently added to the SCADA system (will it be necessary, and for how long, to shut down the monitoring system to integrate the new sensor data into the analysis)?
  • A complex of AI/ML mechanisms (recommendation, monitoring, forecasting) depends on each other’s results. How many man-hours will it take every month to adapt the functioning of those AI/ML mechanisms to changes in the input data? What is the overall “delay” in supporting business decision making by the AI/ML mechanisms (the refresh frequency of supporting information against the feed frequency of new input data)?

Article
· Jun 14, 2023 2m read
LangChain Ghost in the PDF

Posing a question to consider during the current Grand Prix competition.

I wanted to share an observation about using PDFs with LangChain.

When loading the text out of a PDF, I noticed there was an artifact of gaps within some of the words extracted.

For example (highlighted in red)

Demonstration example for the current Grand Prix contest for use of a more complex Parameter template to test the AI.

Interview Questions

There is documentation. A recruitment consultant wants to quickly challenge candidates with some technical questions relevant to a role.

Can they automate making a list of questions and answers from the available documentation?

Interview Answers and Learning

One of the most effective ways to cement new facts into accessible long term memory is with phased recall.

As an AI language model, ChatGPT is capable of performing a variety of tasks like language translation, writing songs, answering research questions, and even generating computer code. With its impressive abilities, ChatGPT has quickly become a popular tool for various applications, from chatbots to content creation.
But despite its advanced capabilities, ChatGPT is not able to access your personal data. So in this article, I will demonstrate the steps below to build a custom ChatGPT AI using the LangChain framework:

Keywords: PyODBC, unixODBC, IRIS, IntegratedML, Jupyter Notebook, Python 3

Purpose

A few months ago I wrote a brief note on "Python JDBC connection into IRIS", and since then I have referred to it more frequently than to my own scratchpad hidden deep in my PC. Hence, here comes another 5-minute note on how to make a "Python ODBC connection into IRIS".
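
As a rough sketch of what such a connection can look like, the snippet below assembles an ODBC connection string and hands it to `pyodbc.connect`. The helper names are hypothetical, and the driver name, host, port, namespace, and credentials are placeholder assumptions: the driver name in particular must match whatever is registered in your unixODBC `odbcinst.ini`.

```python
def iris_connection_string(host, port, namespace, user, password,
                           driver="InterSystems ODBC"):
    """Build an ODBC connection string.

    The default driver name is an assumption; use the name registered
    in your own ODBC driver configuration.
    """
    parts = {
        "DRIVER": "{" + driver + "}",
        "SERVER": host,
        "PORT": str(port),
        "DATABASE": namespace,
        "UID": user,
        "PWD": password,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items())


def connect(conn_str):
    """Open the connection; pyodbc is imported lazily so the string
    builder above works even where the driver is not installed."""
    import pyodbc
    return pyodbc.connect(conn_str)


# Placeholder values for illustration only.
conn_str = iris_connection_string("localhost", 1972, "USER", "_SYSTEM", "SYS")
```

With a working driver installed, `connect(conn_str).cursor()` returns a cursor on which SQL statements can be executed.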

Kidney Disease can be discovered from some parameters well known to the medical community. In this way, in order to help the medical community and computerized systems, especially AI, the scientist Akshay Singh published a very useful dataset for training ML algorithms in the detection/prediction of kidney disease. This publication can be found on the largest and best known data repository for ML, Kaggle at https://www.kaggle.com/datasets/akshayksingh/kidney-disease-dataset.

Problem

In a fast-paced clinical environment, where quick decision-making is crucial, the lack of streamlined document storage and access systems poses several obstacles. While storage solutions for documents exist (e.g., FHIR), accessing those documents and searching them meaningfully for specific patient data can be a significant challenge.
