Create ML Model#
MLModel.Create
MLModel.CreateTemplate
MLModel.CreateFromTemplate
Overview#
This Data Platform entity is used to create an ML Model. You can also create a template using a similar procedure, or use a template to create a new object (for example, by importing or exporting an XML file that contains the object settings). You choose between these options in the ML Model menu.
Setup#
No specific setup is required other than to meet the preconditions of the transaction.
Preconditions#
- The ML Model name must be unique.
Sequence of Steps#
There are several ways to create a new versioned object. Depending on the level, follow these steps to get started:
- Entity - On the landing page of this entity type in the Business Data menu, or on the details page of an existing entity of the same type, select New on the top ribbon. For more information, see Creating Entity Objects.
- Revision - To create a new revision, go to the New dropdown button on the top ribbon and select Revision. For more information, see Revisions.
- Version - To create a version associated with an existing revision, go to the New dropdown button on the top ribbon and select Version. For more information, see Versions.
Step 1: Change Set#
- Select an existing Change Set or select Create to create a new one. If the system is configured to support implicit Change Sets, you can also check the Automatic Change Set option.
- Optionally, select an Approval Role.
Step 2: General Data#
Data Set#
In this tab, select the Data Set to be used as a data source.
- Choose a Data Set that will be used as the source of data for this ML Model.
- Choose the field used to sort the Data Set. To preview the data retrieved through the Data Set, select the Preview button to the right of the Order By selection field and review the information displayed in the grid below.
- Select Next to continue.
Features#
In this tab, you can view a summary of the data and edit the properties and transformations applied to the features associated with each of the fields retrieved through the selected Data Set.
You can set the following properties depending on the Field Type:
| Feature Field Type | Editable properties |
|---|---|
| Dimension | Mark as Label; Replace Nulls with Most Frequent; Encoding (One Hot or Ordinal) |
| Numeric | Mark as Label; Replace Nulls with Mean; Normalize Min-Max; Remove Outliers (also specify the Sigma threshold) |
| Timestamp | None |
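To illustrate what the Dimension properties in the table above typically mean, here is a minimal Python sketch of most-frequent imputation, ordinal encoding, and one-hot encoding. It is not the platform's implementation: the function names, the use of `None` for nulls, and the sorted category order are all assumptions made for the example.

```python
from collections import Counter

def fill_most_frequent(values):
    """Replace None entries with the most frequent non-null value."""
    present = [v for v in values if v is not None]
    if not present:
        return list(values)  # nothing to impute from
    mode = Counter(present).most_common(1)[0][0]
    return [mode if v is None else v for v in values]

def ordinal_encode(values):
    """Map each category to an integer (assumption: sorted category order)."""
    mapping = {cat: i for i, cat in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping

def one_hot_encode(values):
    """Produce one 0/1 column per category, in sorted category order."""
    categories = sorted(set(values))
    rows = [[1 if v == cat else 0 for cat in categories] for v in values]
    return rows, categories

# Hypothetical dimension field with one null value.
colors = ["red", None, "blue", "red"]
filled = fill_most_frequent(colors)        # null becomes "red"
codes, mapping = ordinal_encode(filled)    # "blue" -> 0, "red" -> 1
rows, categories = one_hot_encode(filled)  # columns: ["blue", "red"]
```

Ordinal encoding keeps a single column but implies an order between categories, while one-hot encoding avoids that implied order at the cost of one column per category.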
Only features that meet specific criteria are available for editing. The system ignores properties (typically known as "features" in Machine Learning vocabulary) that have a single value (no cardinality) or very high cardinality (example: all values are different).
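The kind of cardinality check described above can be sketched as follows. The exact criteria used by the system are not documented here, so the function name `is_usable_feature` and the `max_distinct_ratio` threshold are hypothetical, and the sketch is most meaningful for categorical values.

```python
def is_usable_feature(values, max_distinct_ratio=0.9):
    """Return False for single-valued or (nearly) all-distinct columns.

    max_distinct_ratio is a hypothetical threshold chosen for illustration.
    """
    non_null = [v for v in values if v is not None]
    distinct = len(set(non_null))
    if distinct <= 1:
        return False  # single value (or empty): no cardinality
    # Share of non-null values that are distinct; 1.0 means all differ.
    return distinct / len(non_null) < max_distinct_ratio

print(is_usable_feature(["a", "a", "b", "b"]))  # two categories, repeated
print(is_usable_feature(["a", "a", "a"]))       # single value
print(is_usable_feature(["a", "b", "c", "d"]))  # every value different
```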
From the recommended features you may now:
- Mark a property as a label (a label is the target we want the machine learning model to learn).
- Replace null cells with the mean value.
- Normalize the data by applying a MinMax Scaler.
- Remove outliers from the dataset.
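The numeric transformations listed above can be sketched in plain Python as shown below. This is an illustration under stated assumptions, not the platform's implementation: the order of the steps, the use of the population standard deviation, and the reading of the Sigma threshold as "number of standard deviations from the mean" are all assumed.

```python
from statistics import mean, pstdev

def fill_mean(values):
    """Replace None entries with the mean of the non-null values."""
    present = [v for v in values if v is not None]
    m = mean(present)
    return [m if v is None else v for v in values]

def remove_outliers(values, sigma=3.0):
    """Keep values within `sigma` standard deviations of the mean."""
    m = mean(values)
    sd = pstdev(values)  # assumption: population standard deviation
    if sd == 0:
        return list(values)
    return [v for v in values if abs(v - m) <= sigma * sd]

def min_max(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: avoid division by zero
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical numeric field with one null and one extreme reading.
readings = [10.0, None, 12.0, 11.0, 13.0, 95.0]
filled = fill_mean(readings)
kept = remove_outliers(filled, sigma=2.0)  # drops the 95.0 reading
scaled = min_max(kept)
```

Removing outliers before normalizing keeps a single extreme value from compressing the remaining values into a narrow part of the [0, 1] range.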
Info
For supervised learning, one of the features must have the Mark as Label property selected.
For unsupervised learning, do not select any label.
Data Splitting#
In this tab, choose the weight of each data split (Train, Validate, and Test) that the ML Model will use. You can set the weights by adjusting the sliders, by entering the proportion of records to be included in each set, or by combining both.
| Data Split | Minimum Weight | Maximum Weight |
|---|---|---|
| Train | 50% | 80% |
| Validate | 10% | 40% |
| Test | 10% | 25% |
The weights must add up to 100%, and each weight must stay within the limits listed in the table above.
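The constraints on the weights can be expressed as a short check. The sketch below encodes the limits from the table; the function names and the rounding rule used to turn weights into record counts are assumptions for illustration only.

```python
# Minimum and maximum weight (in percent) per split, taken from the table.
LIMITS = {"train": (50, 80), "validate": (10, 40), "test": (10, 25)}

def validate_split(weights):
    """Return a list of problems; an empty list means the split is valid."""
    errors = []
    for name, (lo, hi) in LIMITS.items():
        w = weights.get(name)
        if w is None:
            errors.append(f"{name}: missing weight")
        elif not lo <= w <= hi:
            errors.append(f"{name}: {w}% is outside {lo}%-{hi}%")
    total = sum(weights.values())
    if total != 100:
        errors.append(f"weights add up to {total}%, expected 100%")
    return errors

def split_counts(weights, n_records):
    """Records per split; rounding leftovers go to the test set (assumption)."""
    train = n_records * weights["train"] // 100
    validate = n_records * weights["validate"] // 100
    return {"train": train, "validate": validate,
            "test": n_records - train - validate}

weights = {"train": 70, "validate": 15, "test": 15}
print(validate_split(weights))      # no problems reported
print(split_counts(weights, 1000))  # 700 / 150 / 150 records
```

For example, a 90/5/5 split adds up to 100% but is still rejected, because every weight falls outside its allowed range.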
After you have finished the configuration, select Create to complete the operation.
Info
It is good practice to split the validation and test sets evenly, as represented in the picture. All sets should also present a similar data distribution. If you are not sure, use the default values.



