IntegratedML Orientation
This introduction to InterSystems IRIS Cloud IntegratedML shows how you can use the service to derive insights from your data through machine learning. In it, you will create a model that predicts the tip amount for a given New York City taxi ride. Steps 1 and 2 help you create a deployment equipped with IntegratedML and load sample data into its database before the remaining steps guide you through the basic machine learning tasks in the IntegratedML workflow:
- Create a model
- Train it as a predictive model
- Validate the trained model
- Predict values using the trained model
As you proceed, click the information icons for more information about these tasks.
Note: If you want to use a Cloud IntegratedML deployment you have already created, skip to Step 2.
to open the Create InterSystems IRIS Deployment page with most of the fields prefilled.
- Enter your password at the Password and Confirm Password prompts.
- Scroll to and expand Deployment Name so you can name your deployment.
- Finally, expand Review and click Create. Deployment creation takes a minute or two; when the status on the deployment's tile changes to Running, click the tile to display the deployment's Overview page.
Choose Add and Manage Files on the left-hand menu and log into the deployment using the password you provided. On the Add and Manage Files page:
- Click External Transfer, select the pre-loaded samples external storage location, and click List Files.
- Select yellow_tripdata.sql, yellow_tripdata_15k-train.csv, and yellow_tripdata_15k-validate.csv, then click Transfer to add these files to the Files added to the deployment list.
- Select DDL or DML statement(s) as the file type and click Next.
- Click Next again (with InterSystems IRIS selected as the platform) to display the available files.
- Select yellow_tripdata.sql, then click Import Files and confirm. This creates two tables: one for the data to train your model on, and one for the data used to validate the model and make predictions.
- Click Done.
- Select CSV data as the file type and click Next.
- Select the yellow_tripdata_15k-train.csv file and click Next.
- Select the SQLUser.yellow_tripdata_train table from the dropdown list. This loads the contents of the CSV file into the training data table.
- Select the Import file has header row checkbox and click Import File. The Result column shows 15,276 rows updated.
- Repeat the CSV import process for the other CSV file (yellow_tripdata_15k-validate.csv), selecting the corresponding table and confirming successful import of 15,637 rows.
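If you have a local copy of a CSV file, you can cross-check the reported row count before importing. The following is a minimal sketch, not part of the service: it counts data rows while skipping the header row, mirroring the Import file has header row checkbox. The sample text and the fare_amount column are made up for illustration; only tip_amount is a column named in this orientation.

```python
import csv
import io


def count_data_rows(csv_text: str, has_header: bool = True) -> int:
    """Count the data rows in CSV text, skipping the header row if present."""
    reader = csv.reader(io.StringIO(csv_text))
    rows = [row for row in reader if row]  # ignore blank lines
    if has_header and rows:
        return len(rows) - 1
    return len(rows)


# Illustrative sample only; fare_amount is a hypothetical column name.
sample = "fare_amount,tip_amount\n12.5,2.0\n8.0,1.5\n"
print(count_data_rows(sample))
```

For a real file, read its contents with open(path, newline="") and pass the text to count_data_rows; the result should match the number of rows shown in the Result column.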
Now that your data is imported, you can create a model that targets a specific field for predictions. Choose IntegratedML Tools on the left-hand menu, then expand the Create section and complete the following steps:
- Enter TipPrediction as the name of your model.
- Select SQLUser.yellow_tripdata_train as the table to train on.
- Select tip_amount NUMERIC as the field to predict. The Create model SQL statement displays.
- Click Create model to execute the generated SQL and create the model.
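For reference, the statement the page generates follows the IntegratedML CREATE MODEL syntax. The sketch below assembles such a statement from the three values entered above. The helper name and the identifier check are illustrative assumptions, and the exact text shown on the page may differ.

```python
import re

# Accept plain or schema-qualified identifiers such as SQLUser.yellow_tripdata_train.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$")


def create_model_sql(model: str, label: str, table: str) -> str:
    """Build a CREATE MODEL statement (hypothetical helper, not a service API)."""
    for name in (model, label, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a simple SQL identifier: {name!r}")
    return f"CREATE MODEL {model} PREDICTING ({label}) FROM {table}"


print(create_model_sql("TipPrediction", "tip_amount", "SQLUser.yellow_tripdata_train"))
```

Rejecting anything that is not a simple identifier keeps the sketch from building malformed SQL out of free-form input.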
Collapse the Create panel and expand Train. To train the model on the relationships between the fields in the yellow_tripdata_train data set, ultimately enabling it to predict the tip_amount field, complete the following steps:
- Select TipPrediction predicting tip_amount for training. The Train model SQL statement displays.
- Click Train model to begin training the model. The Training Runs panel opens, showing the run's status. Training time varies with deployment size and can take up to 4 to 5 minutes on the smallest deployment. Once the Run Status changes to completed, you can hide the panel.
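The training step corresponds to the IntegratedML TRAIN MODEL statement, followed by waiting until the run reports completed. The sketch below shows that wait as a simple polling loop. The get_status callable is a hypothetical stand-in for however you read the run status, and the running and failed status values are assumptions; only completed appears in this orientation.

```python
import time

# The statement behind the Train model button (IntegratedML syntax).
TRAIN_SQL = "TRAIN MODEL TipPrediction"


def wait_for_run(get_status, poll_seconds=10.0, timeout_seconds=600.0, sleep=time.sleep):
    """Poll get_status() until the run finishes or the timeout elapses."""
    waited = 0.0
    while True:
        status = get_status()
        if status in ("completed", "failed"):  # "failed" is an assumed status value
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"run still {status!r} after {waited:.0f} s")
        sleep(poll_seconds)
        waited += poll_seconds


# Simulated status source so the sketch runs without a deployment.
fake_statuses = iter(["running", "running", "completed"])
result = wait_for_run(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(result)
```

Passing sleep as a parameter lets the loop be exercised instantly with simulated statuses, as in the example.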
Collapse the Train panel and expand Validate. To validate your trained model and see metrics confirming its accuracy and validity, complete the following steps:
- Select your trained model, TipPrediction_t1 using TipPrediction, for validation. (You may need to refresh the dropdown.)
- Select SQLUser.yellow_tripdata_validate as the table to validate from. The Validate model SQL statement displays.
- Click Validate model to begin validating the model. The Validation runs panel opens, showing the run's status.
- When the run is completed, examine the validation metrics by hiding the panel and clicking Show validation metrics. Note that the R² value is very close to 1, meaning the model accounts for nearly all of the variation in tip_amount in the validation data, which indicates a strong predictive model.
- Click the eye icon under Chart to see a visualization of your validation results.
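To make the R² metric concrete, the sketch below computes the standard coefficient of determination, 1 minus the ratio of the residual sum of squares to the total sum of squares, on a handful of made-up values. The numbers are illustrative only and are not results from the taxi data. The VALIDATE_SQL string shows the general IntegratedML form of the validation statement; the text the page generates may differ.

```python
# General form of the validation statement (IntegratedML syntax).
VALIDATE_SQL = (
    "VALIDATE MODEL TipPrediction USE TipPrediction_t1 "
    "FROM SQLUser.yellow_tripdata_validate"
)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


# Made-up tip amounts for illustration, not output from the model.
actual_tips = [1.0, 2.0, 3.0, 4.0]
predicted_tips = [1.1, 1.9, 3.2, 3.8]
print(round(r_squared(actual_tips, predicted_tips), 2))
```

A value of exactly 1 would mean every prediction matched the actual value, while a value near 0 would mean the model does no better than always predicting the mean.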
Collapse the Validate panel and expand Predict. The validation subset of the taxi data was not included in your model's training, making it ideal for testing predictions. To predict the tip_amount field, complete the following steps:
- Select your trained model, TipPrediction_t1 using TipPrediction, for prediction. (You may need to refresh the dropdown.)
- Select SQLUser.yellow_tripdata_validate as the table to predict on.
- Enter 100 as the number of rows to predict.
- Click Generate SQL to generate the SQL prediction statement.
- Click Execute on SQL Query Tools to proceed to the SQL Query Tools page and execute the generated SQL.
On the SQL Query Tools page, click Execute to execute the generated SQL statement, which creates predictions for the tip_amount field. The displayed results include both the predicted and actual tip amounts for each row, so you can compare your model's predictions with the actual tip amount for each NYC taxi ride.
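As a rough sketch of what happens here, the code below builds a prediction query using the IntegratedML PREDICT function and then summarizes how far predictions fall from actual values using the mean absolute error. The helper names, the predicted_ column alias, and the sample rows are assumptions for illustration; the SQL the page generates may be worded differently.

```python
def prediction_sql(model: str, trained_model: str, label: str, table: str, rows: int) -> str:
    """Build a query returning predicted and actual values (hypothetical helper)."""
    if not isinstance(rows, int) or rows <= 0:
        raise ValueError("rows must be a positive integer")
    return (
        f"SELECT TOP {rows} PREDICT({model} USE {trained_model}) AS predicted_{label}, "
        f"{label} FROM {table}"
    )


def mean_absolute_error(pairs):
    """Average absolute difference over (predicted, actual) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no rows to compare")
    return sum(abs(predicted - actual) for predicted, actual in pairs) / len(pairs)


query = prediction_sql(
    "TipPrediction", "TipPrediction_t1", "tip_amount",
    "SQLUser.yellow_tripdata_validate", 100,
)
print(query)

# Made-up (predicted, actual) rows for illustration, not query results.
sample_rows = [(2.1, 2.0), (0.0, 0.5), (3.4, 3.0)]
print(mean_absolute_error(sample_rows))
```

The mean absolute error is expressed in the same unit as tip_amount, which makes it an easy complement to a side-by-side reading of the result rows.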
Continue learning about IRIS Cloud IntegratedML by:
- Transferring and importing the starexportddl.sql, starclassification_train.csv, and starclassification_validate.csv files from external storage (see Step 2), then using them to create, train, and validate a model that uses the class field as its label. This model predicts whether the astronomical object described by each row is a star, quasar, or galaxy. It is therefore a classification model, which predicts discrete values, rather than a regression model like TipPrediction, which predicts continuous numeric values.
- Exploring Using IntegratedML and the AutoML Reference.
- Watching What is Machine Learning? and What is IntegratedML?.