Web App to Predict Kidney Disease using IRIS IntegratedML

Yuri Marx · May 31, 2022 9m read

#Artificial Intelligence (AI) #IntegratedML #InterSystems IRIS

Kidney disease can be detected from a set of clinical parameters that are well known to the medical community. To support clinicians and computerized systems, especially AI, Akshay Singh published a useful dataset for training ML algorithms to detect and predict kidney disease. The dataset is available on Kaggle, one of the best-known data repositories for ML, at https://www.kaggle.com/datasets/akshayksingh/kidney-disease-dataset.

About the Dataset

The kidney disease dataset has the following metadata information (source: https://www.kaggle.com/datasets/akshayksingh/kidney-disease-dataset):

It has 400 rows with 25 features such as red blood cells, pedal edema, sugar, etc.
The aim is to classify whether a patient has chronic kidney disease or not.
The classification is based on an attribute named 'classification', which is either 'ckd' (chronic kidney disease) or 'notckd'.
The dataset author cleaned the dataset, which included mapping text to numbers and some other changes. After the cleaning, the author performed some EDA (Exploratory Data Analysis), split the dataset into training and testing sets, and applied the models to them. The initial classification results were not satisfactory. So, instead of dropping the rows with NaN values, the author used a lambda function to replace them with the mode of each column, split the dataset into training and testing sets again, and reapplied the models. This time the results were better: random forest and decision trees were the best performers, with an accuracy of 1.0 and 0 misclassifications. Classification performance was measured by printing the confusion matrix, classification report, and accuracy.
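
The mode replacement described above can be illustrated with a minimal Python sketch. This is not the dataset author's notebook code. It assumes rows are plain dictionaries and that missing values are represented as None; the function name fill_missing_with_mode and the sample rows are hypothetical.

```python
from collections import Counter


def fill_missing_with_mode(rows):
    """Return new rows where each None is replaced by that column's most common value."""
    # Collect column names in first-seen order.
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    # Compute the mode of every column from the values that are present.
    modes = {}
    for col in columns:
        present = [row[col] for row in rows if row.get(col) is not None]
        if present:
            modes[col] = Counter(present).most_common(1)[0][0]

    # Build new rows; the input rows are left untouched.
    return [
        {col: (row[col] if row.get(col) is not None else modes.get(col)) for col in columns}
        for row in rows
    ]


# Made-up sample rows, for illustration only.
rows = [
    {"rbc": "normal", "al": 0},
    {"rbc": None, "al": 0},
    {"rbc": "normal", "al": None},
    {"rbc": "abnormal", "al": 4},
]
filled = fill_missing_with_mode(rows)
```

Filling with the mode keeps all 400 rows available for training, which matters for a dataset this small; dropping every row with a missing value would shrink it further.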

Data Set Information (source: https://archive.ics.uci.edu/ml/datasets/chronic_kidney_disease):

We use the following abbreviations for the dataset attributes:
age - age
bp - blood pressure
sg - specific gravity
al - albumin
su - sugar
rbc - red blood cells
pc - pus cell
pcc - pus cell clumps
ba - bacteria
bgr - blood glucose random
bu - blood urea
sc - serum creatinine
sod - sodium
pot - potassium
hemo - hemoglobin
pcv - packed cell volume
wc - white blood cell count
rc - red blood cell count
htn - hypertension
dm - diabetes mellitus
cad - coronary artery disease
appet - appetite
pe - pedal edema
ane - anemia
class - class

Attribute Information (source: https://archive.ics.uci.edu/ml/datasets/chronic_kidney_disease):

We use 24 attributes + class = 25 (11 numeric, 14 nominal):
1. Age (numerical): age in years
2. Blood Pressure (numerical): bp in mmHg
3. Specific Gravity (nominal): sg - (1.005, 1.010, 1.015, 1.020, 1.025)
4. Albumin (nominal): al - (0, 1, 2, 3, 4, 5)
5. Sugar (nominal): su - (0, 1, 2, 3, 4, 5)
6. Red Blood Cells (nominal): rbc - (normal, abnormal)
7. Pus Cell (nominal): pc - (normal, abnormal)
8. Pus Cell Clumps (nominal): pcc - (present, notpresent)
9. Bacteria (nominal): ba - (present, notpresent)
10. Blood Glucose Random (numerical): bgr in mg/dL
11. Blood Urea (numerical): bu in mg/dL
12. Serum Creatinine (numerical): sc in mg/dL
13. Sodium (numerical): sod in mEq/L
14. Potassium (numerical): pot in mEq/L
15. Hemoglobin (numerical): hemo in gms
16. Packed Cell Volume (numerical)
17. White Blood Cell Count (numerical): wc in cells/cumm
18. Red Blood Cell Count (numerical): rc in millions/cmm
19. Hypertension (nominal): htn - (yes, no)
20. Diabetes Mellitus (nominal): dm - (yes, no)
21. Coronary Artery Disease (nominal): cad - (yes, no)
22. Appetite (nominal): appet - (good, poor)
23. Pedal Edema (nominal): pe - (yes, no)
24. Anemia (nominal): ane - (yes, no)
25. Class (nominal): class - (ckd, notckd)
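
The attribute list above can also serve as a simple input check before a record is sent for prediction. The following Python sketch is only an illustration under assumptions: the names NUMERIC_ATTRIBUTES, CODED_ATTRIBUTES, TEXT_ATTRIBUTES and validate_record are hypothetical and are not part of the dataset or of the application. The class attribute is left out because it is the value to be predicted.

```python
# The 11 numeric attributes from the list above.
NUMERIC_ATTRIBUTES = ("age", "bp", "bgr", "bu", "sc", "sod", "pot", "hemo", "pcv", "wc", "rc")

# Nominal attributes whose categories are written as numbers.
CODED_ATTRIBUTES = {
    "sg": {1.005, 1.010, 1.015, 1.020, 1.025},
    "al": {0, 1, 2, 3, 4, 5},
    "su": {0, 1, 2, 3, 4, 5},
}

# Nominal attributes whose categories are written as text.
TEXT_ATTRIBUTES = {
    "rbc": {"normal", "abnormal"},
    "pc": {"normal", "abnormal"},
    "pcc": {"present", "notpresent"},
    "ba": {"present", "notpresent"},
    "htn": {"yes", "no"},
    "dm": {"yes", "no"},
    "cad": {"yes", "no"},
    "appet": {"good", "poor"},
    "pe": {"yes", "no"},
    "ane": {"yes", "no"},
}


def _as_float(value):
    """Return value as a float, or None when it cannot be converted."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def validate_record(record):
    """Return a list of problems found in record; an empty list means it looks valid."""
    errors = []
    for name in NUMERIC_ATTRIBUTES:
        if _as_float(record.get(name)) is None:
            errors.append(f"{name}: expected a number")
    for name, allowed in CODED_ATTRIBUTES.items():
        if _as_float(record.get(name)) not in allowed:
            errors.append(f"{name}: expected one of {sorted(allowed)}")
    for name, allowed in TEXT_ATTRIBUTES.items():
        if record.get(name) not in allowed:
            errors.append(f"{name}: expected one of {sorted(allowed)}")
    return errors
```

A caller would reject the input, or ask the user to correct it, whenever validate_record returns a non-empty list.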

Get the Kidney data from Kaggle

The Kidney data from Kaggle can be loaded into an IRIS table using the Health-Dataset application: https://openexchange.intersystems.com/package/Health-Dataset. To do this, add a dependency on it (a ModuleReference for Health Dataset) in your project's module.xml.

Figure: module.xml with the Health Dataset application reference

Web Frontend and Backend Application to Predict Kidney Disease

Go to the Open Exchange app page (https://openexchange.intersystems.com/package/Disease-Predictor) and follow these steps:

Clone or git pull the repo into any local directory:

$ git clone https://github.com/yurimarx/predict-diseases.git

Open a Docker terminal in this directory and run:

$ docker-compose build

Run the IRIS container:

$ docker-compose up -d

Open the Execute Query page in the Management Portal to train the AI model: http://localhost:52773/csp/sys/exp/%25CSP.UI.Portal.SQL.Home.zen?$NAMESPACE=USER
Create the view used for training:

CREATE VIEW KidneyDiseaseTrain AS SELECT 
age, al, ane, appet, ba, bgr, bp, bu, cad, classification, dm, hemo, htn, pc, pcc, pcv, pe, pot, rbc, rc, sc, sg, sod, su, wc
FROM dc_data_health.KidneyDisease

Create the AI Model using the view:

CREATE MODEL KidneyDiseaseModel PREDICTING (classification) FROM KidneyDiseaseTrain

Train the model:

TRAIN MODEL KidneyDiseaseModel
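
If you prefer to script these three statements instead of pasting them into the Management Portal, a minimal Python sketch could look like the following. It assumes you already have a cursor object that follows the Python DB-API (PEP 249) and is connected to the USER namespace; how that connection is obtained is outside this sketch, and the name run_training is hypothetical. The SQL text is copied from the steps above.

```python
# The statements from the steps above, in the order they must run.
TRAINING_STATEMENTS = [
    "CREATE VIEW KidneyDiseaseTrain AS SELECT "
    "age, al, ane, appet, ba, bgr, bp, bu, cad, classification, dm, hemo, htn, "
    "pc, pcc, pcv, pe, pot, rbc, rc, sc, sg, sod, su, wc "
    "FROM dc_data_health.KidneyDisease",
    "CREATE MODEL KidneyDiseaseModel PREDICTING (classification) FROM KidneyDiseaseTrain",
    "TRAIN MODEL KidneyDiseaseModel",
]


def run_training(cursor, statements=TRAINING_STATEMENTS):
    """Execute each statement in order on a DB-API style cursor; return how many ran."""
    for sql in statements:
        cursor.execute(sql)
    return len(statements)
```

The order matters: the model definition refers to the view, and training refers to the model. Running the script a second time would likely fail at the first statement, because the view already exists.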

Go to http://localhost:52773/disease-predictor/index.html to use the Disease Predictor frontend and predict diseases.

Behind the scenes

Backend ClassMethod to predict Kidney Disease

InterSystems IRIS lets you run a SELECT statement that makes predictions with the model created earlier.

/// Predict Kidney Disease
ClassMethod PredictKidneyDisease() As %Status
{
    Try {
        // Parse the JSON payload sent in the request body
        Set data = {}.%FromJSON(%request.Content)
        Set %response.Status = 200
        Set %response.Headers("Access-Control-Allow-Origin")="*"

        // Build the prediction query. The request values are concatenated
        // directly into the SQL text: numeric fields as-is, text fields
        // wrapped in single quotes.
        Set qry = "SELECT PREDICT(KidneyDiseaseModel) As PredictedKidneyDisease, "
            _"age, al, ane, appet, ba, bgr, bp, bu, cad, dm, "
            _"hemo, htn, pc, pcc, pcv, pe, pot, rbc, rc, sc, sg, sod, su, wc "
            _"FROM (SELECT "_data.age_" AS age, "
            _data.al_" As al, "
            _"'"_data.ane_"'"_" AS ane, "
            _"'"_data.appet_"'"_" AS appet, "
            _"'"_data.ba_"'"_" As ba, "
            _data.bgr_" As bgr, "
            _data.bp_" AS bp, "
            _data.bu_" AS bu, "
            _"'"_data.cad_"'"_" As cad, "
            _"'"_data.dm_"'"_" As dm, "
            _data.hemo_" AS hemo, "
            _"'"_data.htn_"'"_" AS htn, "
            _"'"_data.pc_"'"_" As pc, "
            _"'"_data.pcc_"'"_" As pcc, "
            _data.pcv_" AS pcv, "
            _"'"_data.pe_"'"_" AS pe, "
            _data.pot_" As pot, "
            _"'"_data.rbc_"'"_" As rbc, "
            _data.rc_" AS rc, "
            _data.sc_" AS sc, "
            _data.sg_" As sg, "
            _data.sod_" As sod, "
            _data.su_" AS su, "
            _data.wc_" AS wc)"

        // Prepare and run the query
        Set tStatement = ##class(%SQL.Statement).%New()
        Set qStatus = tStatement.%Prepare(qry)
        If qStatus'=1 {WRITE "%Prepare failed:" DO $System.Status.DisplayError(qStatus) QUIT}
        Set rset = tStatement.%Execute()
        Do rset.%Next()

        // Copy the prediction and the input values into the JSON response
        Set Response = {}
        Set Response.classification = rset.PredictedKidneyDisease
        Set Response.age = rset.age
        Set Response.al = rset.al
        Set Response.ane = rset.ane
        Set Response.appet = rset.appet
        Set Response.ba = rset.ba
        Set Response.bgr = rset.bgr
        Set Response.bp = rset.bp
        Set Response.bu = rset.bu
        Set Response.cad = rset.cad
        Set Response.dm = rset.dm
        Set Response.hemo = rset.hemo
        Set Response.htn = rset.htn
        Set Response.pc = rset.pc
        Set Response.pcc = rset.pcc
        Set Response.pcv = rset.pcv
        Set Response.pe = rset.pe
        Set Response.pot = rset.pot
        Set Response.rbc = rset.rbc
        Set Response.rc = rset.rc
        Set Response.sc = rset.sc
        Set Response.sg = rset.sg
        Set Response.sod = rset.sod
        Set Response.su = rset.su
        Set Response.wc = rset.wc
        Write Response.%ToJSON()
        Return 1
    } Catch err {
        write !, "Error name: ", ?20, err.Name,
            !, "Error code: ", ?20, err.Code,
            !, "Error location: ", ?20, err.Location,
            !, "Additional data: ", ?20, err.Data, !
        Return 0
    }
}
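
To make the shape of the generated query easier to see, here is a minimal Python sketch that assembles the same SELECT text. It is not the application's code; the names sql_literal and build_predict_query are hypothetical. As an added assumption, it converts numeric fields through float() and doubles single quotes in text fields, so that a request value cannot change the structure of the SQL statement.

```python
import math

# Fields the backend method inserts without quotes, and fields it wraps in quotes.
NUMERIC_FIELDS = ("age", "al", "bgr", "bp", "bu", "hemo", "pcv", "pot",
                  "rc", "sc", "sg", "sod", "su", "wc")
TEXT_FIELDS = ("ane", "appet", "ba", "cad", "dm", "htn", "pc", "pcc", "pe", "rbc")


def sql_literal(name, value):
    """Render one request value as a SQL literal, rejecting unsafe numerics."""
    if name in NUMERIC_FIELDS:
        number = float(value)  # raises ValueError for text such as "48 OR 1=1"
        if not math.isfinite(number):
            raise ValueError(f"{name} must be a finite number")
        return repr(number)
    # Text value: double any single quote so it stays inside the literal.
    return "'" + str(value).replace("'", "''") + "'"


def build_predict_query(data):
    """Build the PREDICT query text for one record given as a dict."""
    fields = sorted(NUMERIC_FIELDS + TEXT_FIELDS)
    inner = ", ".join(f"{sql_literal(f, data[f])} AS {f}" for f in fields)
    return (
        "SELECT PREDICT(KidneyDiseaseModel) AS PredictedKidneyDisease, "
        + ", ".join(fields)
        + f" FROM (SELECT {inner})"
    )
```

The inner SELECT has no FROM clause: it turns the submitted values into a single one-row result set, and the outer SELECT applies PREDICT to that row.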

Now, any web application can consume the prediction and show the results. See the source code in the frontend folder of the predict-diseases application.
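
As an illustration of such a consumer outside the browser, the following Python sketch posts a record as JSON and reads the classification field from the response. The endpoint URL is a placeholder, because the article does not state the backend route, and the use of POST is an assumption based on the method reading the request body. The names API_URL, build_request and predict are hypothetical.

```python
import json
import urllib.request

# Placeholder only: replace with the real route of the backend method.
API_URL = "http://localhost:52773/your-api-path/kidney"


def build_request(payload, url=API_URL):
    """Create a JSON POST request carrying the patient record."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(payload, url=API_URL, timeout=10):
    """Send the record and return the 'classification' value from the JSON reply."""
    with urllib.request.urlopen(build_request(payload, url), timeout=timeout) as resp:
        reply = json.loads(resp.read().decode("utf-8"))
    return reply["classification"]
```

The payload keys are expected to match the attribute names the backend reads (age, al, ane and so on), and the reply echoes those inputs next to the predicted classification.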