Article
· May 30, 2022 7m read

Web App to Predict Diabetes using IRIS IntegratedML

Diabetes can be detected from a set of clinical parameters that are well known to the medical community. To support clinicians and computerized systems, especially AI, the National Institute of Diabetes and Digestive and Kidney Diseases published a dataset for training ML algorithms to detect and predict diabetes. It is available on Kaggle, one of the largest and best-known data repositories for ML, at https://www.kaggle.com/datasets/mathchi/diabetes-data-set.

The diabetes dataset has the following metadata information (source: https://www.kaggle.com/datasets/mathchi/diabetes-data-set):

  • Pregnancies: Number of times pregnant
  • Glucose: Plasma glucose concentration at 2 hours in an oral glucose tolerance test
  • BloodPressure: Diastolic blood pressure (mm Hg)
  • SkinThickness: Triceps skin fold thickness (mm)
  • Insulin: 2-Hour serum insulin (mu U/ml)
  • BMI: Body mass index (weight in kg/(height in m)^2)
  • DiabetesPedigreeFunction: Diabetes pedigree function (It provided some data on diabetes mellitus history in relatives and the genetic relationship of those relatives to the patient. This measure of genetic influence gave us an idea of the hereditary risk one might have with the onset of diabetes mellitus - source: https://machinelearningmastery.com/case-study-predicting-the-onset-of-di...)
  • Age: Age (years)
  • Outcome: Class variable (0 or 1)

Number of Instances: 768

Number of Attributes: 8 plus class

For Each Attribute: (all numeric-valued)

  1. Number of times pregnant
  2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test
  3. Diastolic blood pressure (mm Hg)
  4. Triceps skin fold thickness (mm)
  5. 2-Hour serum insulin (mu U/ml)
  6. Body mass index (weight in kg/(height in m)^2)
  7. Diabetes pedigree function
  8. Age (years)
  9. Class variable (0 or 1)

Missing Attribute Values: Yes

Class Distribution: (class value 1 is interpreted as "tested positive for diabetes")

Get the Diabetes data from Kaggle

The Diabetes data from Kaggle can be loaded into an IRIS table using the Health-Dataset application: https://openexchange.intersystems.com/package/Health-Dataset. To do this, add it as a dependency (a ModuleReference for Health Dataset) in your project's module.xml:

 
Module.xml with Health Dataset application reference

Web Frontend and Backend Application to Predict Diabetes

Go to the Open Exchange app page (https://openexchange.intersystems.com/package/Disease-Predictor) and follow these steps:

  1. Clone or pull the repo into any local directory:
$ git clone https://github.com/yurimarx/predict-diseases.git
  2. Open a terminal in this directory and run:
$ docker-compose build
  3. Run the IRIS container:
$ docker-compose up -d
  4. Open the Execute Query page in the Management Portal to train the AI model: http://localhost:52773/csp/sys/exp/%25CSP.UI.Portal.SQL.Home.zen?$NAMESPACE=USER
  5. Create the view used for training:
CREATE VIEW DiabetesTrain AS SELECT Outcome, age, bloodpressure, bmi, diabetespedigree, glucose, insulin, pregnancies, skinthickness FROM dc_data_health.Diabetes
  6. Create the AI model from the view:
CREATE MODEL DiabetesModel PREDICTING (Outcome) FROM DiabetesTrain
  7. Train the model:
TRAIN MODEL DiabetesModel
  8. Go to http://localhost:52773/disease-predictor/index.html to use the Disease Predictor frontend and predict diseases.
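
Before opening the frontend, it can be worth confirming in the same SQL page that the data loaded and the model trained as expected. The statements below are a minimal sketch and not part of the original steps. The row count should match the 768 instances listed in the dataset metadata, and VALIDATE MODEL together with the INFORMATION_SCHEMA.ML_VALIDATION_METRICS view are IntegratedML features. Validating against the training view itself is an assumption made for brevity; it tends to give optimistic metrics, so a held-out subset would be a fairer check. Run the statements one at a time.

```sql
-- Sanity check: the source dataset lists 768 instances
SELECT COUNT(*) AS TotalRows FROM dc_data_health.Diabetes

-- Compute validation metrics for the trained model
-- (assumption: reuses the training view, which overstates quality)
VALIDATE MODEL DiabetesModel FROM DiabetesTrain

-- Inspect the metrics recorded by the validation run
SELECT METRIC_NAME, METRIC_VALUE
FROM INFORMATION_SCHEMA.ML_VALIDATION_METRICS
WHERE MODEL_NAME = 'DiabetesModel'
```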

Behind the scenes

Backend ClassMethod to predict Diabetes

InterSystems IRIS lets you run a SELECT statement that returns a prediction from the model trained in the previous steps.
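
As a minimal sketch of such a query (not the application's actual ClassMethod, whose source is in the repository), the IntegratedML PREDICT function can be applied to existing rows or to a single set of input values. The column names follow the DiabetesTrain view created during setup. The patient values in the second query are illustrative only, and supplying them through a subquery is an assumption about how ad hoc inputs might be passed.

```sql
-- Compare predictions with the recorded outcome for existing rows
SELECT TOP 10 PREDICT(DiabetesModel) AS PredictedOutcome, Outcome
FROM DiabetesTrain

-- Predict for one hypothetical patient (illustrative values)
SELECT PREDICT(DiabetesModel) AS PredictedOutcome
FROM (SELECT 50 AS age, 72 AS bloodpressure, 33.6 AS bmi,
             0.627 AS diabetespedigree, 148 AS glucose, 0 AS insulin,
             6 AS pregnancies, 35 AS skinthickness)
```

A backend method would typically bind the request parameters into the second form and return PredictedOutcome to the caller.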


Now any web application can consume the prediction and show the results. See the source code in the frontend folder of the predict-diseases application.
