Article Jorge Jaramillo Herrera · 14 hr ago 7m read

This article presents a straightforward approach to automatically and efficiently tune hyperparameters for machine learning models using Optuna as the optimisation framework. We explore how to use both Optuna’s native storage options and InterSystems IRIS as a database backend to track the progress of hyperparameter searches. We also show how MLflow can be used to monitor experiments and manage models through its tracking and model registry UI.

This article is based on this Kaggle Notebook, which you can run and directly edit yourself.

Article Jorge Jaramillo Herrera · May 5 19m read

This article introduces SHAP explainability methods as a way to understand the reasons behind the predictions of black-box machine learning models. It also includes a simple Jupyter notebook that you can use and modify to gain hands-on experience with these concepts:

https://www.kaggle.com/code/jorgeivnjh/explainability-in-ml-models

https://github.com/JorgeIvanJH/Explainability-in-ML-models

We will leverage these concepts for a future implementation in our Continuous Training Pipeline: https://community.intersystems.com/post/complementing-iris-mlflow-continuous-training-ct-pipeline
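
To make the idea behind SHAP more concrete, here is a minimal sketch of the Shapley values it builds on, computed exactly by enumerating every feature ordering. It does not use the `shap` library, which relies on faster approximations; `toy_model`, `instance`, and `baseline` are hypothetical names and values chosen only for illustration.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction.

    Features are switched from their baseline value to the instance value
    one at a time, in every possible order; each feature's attribution is
    its average marginal effect on the prediction across all orderings.
    """
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = instance[i]
            value = predict(current)
            totals[i] += value - previous
            previous = value
    return [t / len(orderings) for t in totals]


def toy_model(x):
    # Hypothetical model: linear terms plus an interaction between x[0] and x[1].
    return 2.0 * x[0] + 1.0 * x[1] + 3.0 * x[0] * x[1] - 0.5 * x[2]


instance = [1.0, 2.0, 4.0]   # the prediction we want to explain
baseline = [0.0, 0.0, 0.0]   # assumed reference input ("feature absent")

phi = shapley_values(toy_model, instance, baseline)
print(phi)  # [5.0, 5.0, -2.0]
```

The attributions add up to the difference between the prediction for `instance` and the prediction for `baseline` (8.0 here), and the interaction term is split evenly between the two features involved. Exact enumeration grows factorially with the number of features, so it is only practical for very small examples like this one.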

Article Jorge Jaramillo Herrera · Mar 30 7m read

A Continuous Training (CT) pipeline formalises a Machine Learning (ML) model developed through data science experimentation, using the data available at a given point in time. It prepares the model for deployment while enabling autonomous updates as new data becomes available, along with robust performance monitoring, logging, and model registry capabilities for auditing purposes.

InterSystems IRIS already provides nearly all the components required to support such a pipeline. However, one key element is missing: a standardised tool for model registry.
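
As a rough illustration of the cycle a CT pipeline automates (monitor, retrain, validate, register), here is a minimal sketch in plain Python. It uses neither IRIS nor MLflow: the registry is a plain list standing in for a real model registry, the "model" is a toy one-dimensional threshold classifier, and every name (`train`, `accuracy`, `ct_step`, `min_accuracy`) is hypothetical.

```python
from statistics import mean


def train(samples):
    """Fit a toy 1-D classifier: a threshold halfway between the class means."""
    pos = [x for x, y in samples if y == 1]
    neg = [x for x, y in samples if y == 0]
    return (mean(pos) + mean(neg)) / 2


def accuracy(threshold, samples):
    """Share of samples where 'x > threshold' matches the label."""
    return mean(1.0 if (x > threshold) == (y == 1) else 0.0 for x, y in samples)


def ct_step(registry, new_data, min_accuracy=0.9):
    """One CT iteration: monitor, and if degraded, retrain, validate, promote."""
    current = registry[-1]  # latest registered version acts as production
    if accuracy(current["threshold"], new_data) >= min_accuracy:
        return "kept"
    # Alternate samples between a training split and a validation holdout.
    train_part, holdout = new_data[::2], new_data[1::2]
    candidate = train(train_part)
    candidate_acc = accuracy(candidate, holdout)
    if candidate_acc > accuracy(current["threshold"], holdout):
        registry.append({
            "version": current["version"] + 1,
            "threshold": candidate,
            "accuracy": candidate_acc,
        })
        return "promoted"
    return "rejected"


# Initial model trained on the data available at a given point in time.
initial = [(0.0, 0), (1.0, 0), (2.0, 0), (4.0, 1), (5.0, 1), (6.0, 1)]
registry = [{"version": 1, "threshold": train(initial), "accuracy": 1.0}]

# Later data has drifted upwards, so the old threshold misclassifies negatives.
drifted = [(3.0, 0), (4.0, 0), (7.0, 1), (8.0, 1),
           (5.0, 0), (6.0, 0), (9.0, 1), (10.0, 1)]

first = ct_step(registry, drifted)   # degraded model is retrained and replaced
second = ct_step(registry, drifted)  # new model passes monitoring, nothing to do
print(first, second, registry[-1]["version"])
```

Keeping every version in the registry, rather than overwriting the production model, is what makes the auditing described above possible: each promotion leaves a record of which model was in use and how it scored at validation time.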

Discussion Jorge Jaramillo Herrera · Feb 23

Hello everyone,
I’m looking to implement Continuous Training (CT) as part of an MLOps strategy for some data science projects in IRIS. I want to automate the full cycle:

- Monitoring model performance & accuracy degradation.
- Retraining models automatically.
- Validating and updating production models.

I’ve looked into IntegratedML, but it seems more focused on the SQL interface for training (AutoML). Even with the new Custom Models (beta), which allow more flexibility with Python, it doesn't seem to provide the "Continuous" orchestration out of the box.

I’d like to know:

Article Jorge Jaramillo Herrera · Jan 9 9m read

A single command is all that is required to bring up an entire IRIS instance for data science projects. This article then uses that instance to compare the speed of three query methods: Dynamic SQL, Pandas Query, and Globals.

Before joining InterSystems, I worked as a data scientist in a team of web developers. Most of my day-to-day work involved training ML models and embedding them in Python-based backend applications through microservices, mainly built with the Django framework and using PostgreSQL as the data source.
