Overview
InterSystems IRIS IntegratedML lets you build, train, and manage machine learning models directly in the database using SQL. This article covers IntegratedML configuration and then shows, with SQL examples, how it is used in practical situations.
IntegratedML Configuration
An ML configuration defines the machine learning provider that performs the training, along with any other settings the provider needs. IntegratedML ships with a default configuration, %AutoML, which is already active after InterSystems IRIS is installed.
Creating an ML Configuration
To create a new ML configuration, use either the Management Portal or SQL commands.
Creating an ML configuration via SQL:
CREATE ML CONFIGURATION MeuMLConfig PROVIDER AutoML USING {"verbosity": 1};
To set this configuration as default:
SET ML CONFIGURATION MeuMLConfig;
To view past training runs and their details:
SELECT * FROM INFORMATION_SCHEMA.ML_TRAINING_RUNS;
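The USING clause carries provider settings as a JSON object, so when the statement is generated from application code it can help to build that part with a JSON serializer rather than by hand. A minimal Python sketch, assuming a hypothetical helper named `build_create_config`; it only assembles the statement text and is not part of any InterSystems API:

```python
import json
import re

# Hypothetical helper: assembles the text of a CREATE ML CONFIGURATION statement.
# It does not talk to IRIS; pass the result to whatever SQL client you use.
def build_create_config(name, provider, settings=None):
    # Allow only plain identifiers so a name cannot smuggle in extra SQL.
    for ident in (name, provider):
        if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", ident):
            raise ValueError(f"invalid identifier: {ident!r}")
    statement = f"CREATE ML CONFIGURATION {name} PROVIDER {provider}"
    if settings:
        # json.dumps emits double-quoted keys, i.e. valid JSON.
        statement += " USING " + json.dumps(settings)
    return statement

print(build_create_config("MeuMLConfig", "AutoML", {"verbosity": 1}))
```

Building the settings from a dictionary keeps the JSON well-formed as more provider options are added.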
IntegratedML Application
Creating a predictive model to estimate the amount of energy generated by a consumer unit:
CREATE MODEL PredicaoEnergia PREDICTING (quantidade_generada) FROM UnidadeConsumidora;
Training the model:
TRAIN MODEL PredicaoEnergia;
Making predictions:
SELECT quantidade_generada, PREDICT(PredicaoEnergia) AS predicao FROM UnidadeConsumidora WHERE id = 1001;
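The statements in this section form one workflow: define the model, train it, then query predictions. A minimal Python sketch of issuing them in order. The function name `run_energy_model` and its `execute` parameter are illustrative; inside IRIS Embedded Python, `iris.sql.exec` is one candidate executor (an assumption here), and the sketch takes the executor as an argument so it can be tried without an IRIS instance:

```python
# Statements from this section, in the order they must run.
WORKFLOW = [
    "CREATE MODEL PredicaoEnergia PREDICTING (quantidade_generada) FROM UnidadeConsumidora",
    "TRAIN MODEL PredicaoEnergia",
    "SELECT quantidade_generada, PREDICT(PredicaoEnergia) AS predicao "
    "FROM UnidadeConsumidora WHERE id = 1001",
]

def run_energy_model(execute):
    """Run each statement with `execute` and return the last result.

    `execute` is any callable that accepts a SQL string.
    """
    result = None
    for statement in WORKFLOW:
        result = execute(statement)
    return result

# Stand-in executor so the sketch runs anywhere: it only records the SQL.
executed = []
run_energy_model(executed.append)
print(len(executed), "statements issued")
```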
Implementation: Machine Learning in Solar Energy
1. Data Integration with IRIS
We extracted essential data from several tables to build the dataset. For example, the equipment table:
SELECT PSID, CHNNLID, TYPENAME, DEVICESN, DEVICETYPE, FACTORYNAME, STATUS FROM datafabric_solar_bd.EQUIPAMENTS;
2. Predictive Maintenance Model Training
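Using Embedded Python in IRIS to train a predictive maintenance model:
import iris  # available inside IRIS Embedded Python
from sklearn.ensemble import RandomForestClassifier
# Load data into a pandas DataFrame
sql_query = "SELECT PSID, DEVSTATUS, ALARMCOUNT FROM datafabric_solar_bd.USINAS"
data = iris.sql.exec(sql_query).dataframe()
# Normalize column names in case they come back in lowercase
data.columns = data.columns.str.upper()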
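# Train the model
# Assumption: DEVSTATUS is the label to predict and ALARMCOUNT the input feature;
# PSID only identifies the plant, so it is left out of the model.
model = RandomForestClassifier()
model.fit(data[['ALARMCOUNT']], data['DEVSTATUS'])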
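A classifier trained on maintenance data is only useful if it beats the trivial strategy of always predicting the most common label, which matters when most devices are healthy. A minimal sketch of that baseline in plain Python; the label values are made up for illustration, and `majority_baseline_accuracy` is a hypothetical helper name:

```python
from collections import Counter

def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most frequent training label."""
    if not train_labels or not test_labels:
        raise ValueError("both label lists must be non-empty")
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    hits = sum(1 for label in test_labels if label == most_common_label)
    return hits / len(test_labels)

# Made-up labels: 1 = device healthy, 0 = device faulty (illustrative only).
train = [1, 1, 1, 0, 1, 1, 0, 1]
test = [1, 0, 1, 1]
print(majority_baseline_accuracy(train, test))  # 0.75
```

Comparing this number with the classifier's accuracy on the same held-out rows shows whether the model has learned anything beyond the class balance.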
3. Forecasting Energy Production
Using time series analysis to forecast daily energy production:
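import iris  # available inside IRIS Embedded Python
import pandas as pd
from prophet import Prophet  # the package was named fbprophet before version 1.0
# Prepare dataset
df = iris.sql.exec("SELECT STARTTIMESTAMP, PRODDAYPLANT FROM datafabric_solar_bd.POINTMINUTEDATA").dataframe()
# Prophet expects the columns to be named ds (timestamp) and y (value)
df.columns = ['ds', 'y']
df['ds'] = pd.to_datetime(df['ds'])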
# Train forecasting model
model = Prophet()
model.fit(df)
future = model.make_future_dataframe(periods=30)
forecast = model.predict(future)
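`make_future_dataframe(periods=30)` extends the series in daily steps by default, so it can help to reduce minute-level rows to one value per day before fitting. A minimal sketch in plain Python, assuming PRODDAYPLANT is a running total that resets each day (an assumption about this dataset, not something the schema confirms), so the last reading of each day is that day's production. The helper name `daily_totals` and the sample readings are illustrative:

```python
from datetime import datetime

def daily_totals(rows):
    """Collapse (timestamp, running_total) readings into one value per day.

    Assumes the running total resets daily, so the latest reading of a day
    is that day's production. Returns (date, value) pairs sorted by date.
    """
    latest = {}
    for stamp, value in rows:
        day = stamp.date()
        # Keep the reading with the latest timestamp seen for this day.
        if day not in latest or stamp > latest[day][0]:
            latest[day] = (stamp, value)
    return [(day, latest[day][1]) for day in sorted(latest)]

# Made-up readings for illustration.
readings = [
    (datetime(2024, 1, 1, 9, 0), 1.5),
    (datetime(2024, 1, 1, 17, 0), 12.0),
    (datetime(2024, 1, 2, 12, 0), 6.3),
]
for day, total in daily_totals(readings):
    print(day.isoformat(), total)
```

The resulting pairs map directly onto Prophet's `ds` and `y` columns.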
4. Identifying Areas of High Solar Irradiance
Analyzing geospatial data identifies the areas with the greatest potential for solar energy generation, which helps direct resources to the most productive sites.
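One simple way to make this concrete is to average the irradiance readings per location and rank the locations. A minimal sketch in plain Python; the site names and irradiance values are invented for illustration and do not come from the dataset described here, and `rank_sites_by_irradiance` is a hypothetical helper:

```python
from collections import defaultdict
from statistics import mean

def rank_sites_by_irradiance(readings):
    """Return (site, mean irradiance) pairs, highest mean first.

    `readings` is an iterable of (site, irradiance) pairs in a single,
    consistent unit.
    """
    by_site = defaultdict(list)
    for site, irradiance in readings:
        by_site[site].append(irradiance)
    averages = [(site, mean(values)) for site, values in by_site.items()]
    return sorted(averages, key=lambda pair: pair[1], reverse=True)

# Invented sample values, for illustration only.
sample = [("site_a", 5.0), ("site_b", 6.0), ("site_a", 5.5), ("site_b", 6.5)]
for site, avg in rank_sites_by_irradiance(sample):
    print(site, avg)
```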
Conclusion
IntegratedML makes it easier to implement machine learning in InterSystems IRIS by allowing models to be trained and applied directly with SQL. In addition, applying machine learning to predictive maintenance and energy generation forecasting can help solar plants operate more efficiently.