Question
· Jul 7

How to "really" train a machine learning model

Hi all,

A few days ago, I saw a YouTuber explaining how to create a neural network (sorry, the video is in Spanish):

https://www.youtube.com/embed/iX_on3VxZzk

In short, he uses a neural network to learn how to convert degrees Celsius to degrees Fahrenheit.
Degrees Fahrenheit = (degrees Celsius × 9/5) + 32
In the video, he uses Python to create the neural network and builds a table with values in degrees Celsius and degrees Fahrenheit.
Then he trains the model for 1000 iterations. When he asks for a prediction for a value that is not in the training table, it gives a correct (or fairly close) value.
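
For readers who cannot watch the video, here is a minimal sketch of the idea in plain Python. It is not the code from the video: it assumes a single linear "neuron" (one weight, one bias) trained by gradient descent, and the function name `train`, the learning rate, and the input standardisation are illustrative choices of mine. The training table is the one used later in this post.

```python
# Training table from this post: (Celsius, Fahrenheit) pairs.
DATA = [(-40, -40), (-10, 14), (0, 32), (8, 46), (15, 59), (22, 72), (38, 100)]


def train(data, epochs=1000, lr=0.1):
    """Fit fahrenheit = slope * celsius + intercept by gradient descent.

    Hypothetical helper for illustration only (not from the video).
    """
    n = len(data)
    xs = [c for c, _ in data]
    ys = [f for _, f in data]

    # Standardise the input so one learning rate suits both parameters.
    mean = sum(xs) / n
    std = (sum((x - mean) ** 2 for x in xs) / n) ** 0.5
    zs = [(x - mean) / std for x in xs]

    w, b = 0.0, 0.0
    for _ in range(epochs):
        errors = [w * z + b - y for z, y in zip(zs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * z for e, z in zip(errors, zs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b

    # Convert the parameters back to the original Celsius scale.
    return w / std, b - w * mean / std


slope, intercept = train(DATA)
prediction = slope * 100 + intercept  # 100 °C is not in the training table
print(f"F = {slope:.3f} * C + {intercept:.3f}; 100 C -> {prediction:.1f} F")
```

Because some Fahrenheit values in the table are rounded, the fitted slope and intercept should land close to, but not exactly on, 1.8 and 32.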

Well, I wanted to reproduce the same example using IRIS and InterSystems IntegratedML, so I created a table with the two values, and following the instructions in "Introduction to machine learning

Class St.MLL.celsiusFahrenheit Extends %Persistent
{

/// Value of Celsius
Property Celsius As %Decimal;
/// Value of Fahrenheit
Property Fahrenheit As %Decimal;
/// Populate table
ClassMethod Populate() As %Status
{
    &sql(INSERT INTO St_MLL.celsiusFahrenheit VALUES(-40,-40))
    &sql(INSERT INTO St_MLL.celsiusFahrenheit VALUES(-10,14))
    &sql(INSERT INTO St_MLL.celsiusFahrenheit VALUES(0,32))
    &sql(INSERT INTO St_MLL.celsiusFahrenheit VALUES(8,46))
    &sql(INSERT INTO St_MLL.celsiusFahrenheit VALUES(15,59))
    &sql(INSERT INTO St_MLL.celsiusFahrenheit VALUES(22,72))
    &sql(INSERT INTO St_MLL.celsiusFahrenheit VALUES(38,100))
    Return $$$OK
}

ClassMethod Training() As %Status
{
    write "Creating model celsiusFahrenheitModel",!
    &sql(CREATE MODEL celsiusFahrenheitModel PREDICTING (Fahrenheit) FROM St_MLL.celsiusFahrenheit)
    write "Training model",!
    for i=1:1:100
    {
        &sql(TRAIN MODEL celsiusFahrenheitModel As FirstModel)
        write "Step "_i_" of 100",!
    }
    write "Validate model celsiusFahrenheitModel",!
    &sql(VALIDATE MODEL celsiusFahrenheitModel FROM St_MLL.celsiusFahrenheit)
    // The method is declared As %Status, so return a status value
    Return $$$OK
}

}

I've done the same, training the model 100 times.

I've created another table with the values to test the model:

Class St.MLL.celsiusTest Extends %Persistent
{

/// Value of Celsius
Property Celsius As %Decimal;
/// Value of Fahrenheit
Property Fahrenheit As %Decimal;
/// Populate table
ClassMethod Populate() As %Status
{
    &sql(INSERT INTO St_MLL.celsiusTest VALUES(10,0))
    &sql(INSERT INTO St_MLL.celsiusTest VALUES(20,0))
    &sql(INSERT INTO St_MLL.celsiusTest VALUES(30,0))
    &sql(INSERT INTO St_MLL.celsiusTest VALUES(40,0))
    &sql(INSERT INTO St_MLL.celsiusTest VALUES(50,0))
    &sql(INSERT INTO St_MLL.celsiusTest VALUES(60,0))
    &sql(INSERT INTO St_MLL.celsiusTest VALUES(70,0))
    Return $$$OK
}
}

But it doesn't seem to work, because it always returns the same prediction value.

USER > do ##class(St.MLL.celsiusFahrenheit).Populate()
USER > do ##class(St.MLL.celsiusFahrenheit).Training()
Creating model celsiusFahrenheitModel
Training model
Step 1 of 100
Step 2 of 100
Step 3 of 100
Step 4 of 100
......
Step 99 of 100
Step 100 of 100
Validate model celsiusFahrenheitModel
USER > do ##class(St.MLL.celsiusTest).Populate()
USER >

I was expecting:

Celsius  Fahrenheit  Prediction
10       0           50
20       0           68
30       0           86
40       0           104
50       0           122
60       0           140
70       0           158
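
For reference, the expected predictions follow directly from the conversion formula given earlier. A small Python check (the helper name `celsius_to_fahrenheit` is only illustrative):

```python
def celsius_to_fahrenheit(celsius):
    """Degrees Fahrenheit = (degrees Celsius × 9/5) + 32."""
    return celsius * 9 / 5 + 32


# Celsius values stored in the test table St_MLL.celsiusTest.
test_celsius = [10, 20, 30, 40, 50, 60, 70]
expected = [celsius_to_fahrenheit(c) for c in test_celsius]

for c, f in zip(test_celsius, expected):
    print(f"{c:>3} C -> {f:>5.0f} F")
```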

I believed that, once the model was trained, it could predict the corresponding Fahrenheit value, since that value is unknown in the test table.
What am I doing wrong? Am I trying to do something that is not possible? That is, I want my model to learn the pattern, so that I only have to ask it for a prediction given a value in degrees Celsius.

Best regards

Product version: IRIS 2023.3
$ZV: IRIS for UNIX (Ubuntu Server LTS for x86-64 Containers) 2023.3 (Build 254U) Wed Nov 8 2023 13:03:30 EST

Hi Kurro!

Thanks for your article and for trying out IntegratedML. To hopefully point you in the right direction:

1. IntegratedML is not "just neural networks", but rather an autoML pipeline (see AutoML Guide) that first tests several ML methods on a subset of the data, then performs a training run on the full data with the ML method (neural networks, logistic regression, random forests, etc.) that performed best on that subset. In fact, by default, for regression problems like this, we only use XGBRegressor -- so in this case the method that IntegratedML uses is not a neural network at all!

2. "TRAIN MODEL" only needs to be called once per training dataset. Looping over the examples is handled inside that call.

3. This is potentially too small a dataset to produce reliable results. IntegratedML splits the data internally into training and testing subsets, so you would probably get better output if you have at least 100 random examples.
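
To illustrate point 3, here is a rough Python sketch of generating 100 random examples and of what an internal train/test split does to a 7-row table. The helper names, the Celsius range, the seed, and the 80/20 ratio are all assumptions made for illustration; the split that IntegratedML actually applies is not stated here.

```python
import random


def make_examples(count=100, low=-50.0, high=100.0, seed=42):
    """Generate (celsius, fahrenheit) rows from the conversion formula."""
    rng = random.Random(seed)  # fixed seed so the rows are reproducible
    rows = []
    for _ in range(count):
        celsius = round(rng.uniform(low, high), 1)
        fahrenheit = round(celsius * 9 / 5 + 32, 2)
        rows.append((celsius, fahrenheit))
    return rows


def split(rows, test_fraction=0.2):
    """Hold out the last test_fraction of the rows for testing."""
    test_count = round(len(rows) * test_fraction)
    cut = len(rows) - test_count
    return rows[:cut], rows[cut:]


rows = make_examples()
train_rows, test_rows = split(rows)
print(len(train_rows), "training rows,", len(test_rows), "testing rows")

# The same split applied to a table with only 7 rows.
small_train, small_test = split(rows[:7])
print(len(small_train), "training rows,", len(small_test), "testing rows")
```

Under these assumptions, a 7-row table leaves only a single row for testing, which gives very little to judge a model by. The generated rows could then be inserted into St_MLL.celsiusFahrenheit in place of the seven hand-written ones.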

Kind Regards,

Thomas