AI Trainer

Purpose

The AI Trainer Processor trains one or more AI models using message data collected from a Workflow. It reads attribute values from incoming messages, assembles them into a training dataset, trains a machine learning model using the Weka library, and stores the trained model in the cluster's AI Storage.

The trained model is then used by an AI Classifier Processor to classify new messages.

Use this Processor to:

Train a classification model from historical or sample message data
Collect training data over time (time/message-based mode) or in batches (stream-based mode)
Store a trained model in AI Storage that can be deployed for real-time classification

Figure: AI Storage, browsing trained model versions

Note: Each training run creates a new version of the model in AI Storage, identified by a running number. Use `:latest` to reference the most recent version, or `:<version-number>` to reference a specific version.
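As an illustration of how such a reference could be resolved, here is a minimal Python sketch. It assumes the selector is appended to the storage path (for example `models/my-classifier:latest`) and that the most recent version is the one with the highest running number. The function name `resolve_model_version` and the list of available versions are hypothetical and not part of the product.

```python
def resolve_model_version(reference: str, available_versions: list[int]) -> tuple[str, int]:
    """Split '<path>:<selector>' and resolve the selector to a concrete version.

    Hypothetical helper: the real lookup is performed by AI Storage.
    """
    path, sep, selector = reference.rpartition(":")
    if not sep or not path or not selector:
        raise ValueError(
            f"expected '<path>:latest' or '<path>:<version-number>', got {reference!r}"
        )
    if not available_versions:
        raise LookupError(f"no versions stored under {path!r}")
    if selector == "latest":
        # Assumption: versions are running numbers, so the highest is the newest.
        return path, max(available_versions)
    version = int(selector)  # raises ValueError for a non-numeric selector
    if version not in available_versions:
        raise LookupError(f"version {version} not found under {path!r}")
    return path, version
```

Calling `resolve_model_version("models/my-classifier:latest", [1, 2, 3])` picks version 3 under these assumptions, while `models/my-classifier:2` pins version 2.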

Prerequisites

This processor requires:

An AI Model Resource that defines the model's input/output attributes, model type (e.g., J48 Decision Tree, Multilayer Perceptron), and hyperparameters
A path in AI Storage where the trained model will be stored
Incoming messages with attribute values that can be mapped to the model's input schema

How Training Works

The Trainer accumulates message data as training samples and then runs a Weka model training algorithm. The steps are:

Collect — read attribute values from each incoming message using the configured input attribute mappings
Aggregate — store collected samples until the training trigger conditions are met
Train — run the selected Weka training algorithm on the collected dataset
Store — save the trained model to the configured path in AI Storage

Each training run creates a new version of the model. The path combined with the version number uniquely identifies each model version in AI Storage.
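The four steps can be pictured with the following Python sketch. It is not the Processor's implementation: a majority-class counter stands in for the Weka algorithm, an in-memory dict stands in for AI Storage, and messages are flat dicts. All function names are hypothetical. The sketch also assumes that the class label is read through the same mappings as the inputs and that version numbers start at 1.

```python
from collections import Counter


def collect(message: dict, mappings: dict[str, str]) -> dict:
    """Collect: read one value per mapped attribute from a message."""
    return {attribute: message[key] for attribute, key in mappings.items()}


def train_majority_class(samples: list[dict], label: str) -> dict:
    """Train: stand-in for a Weka algorithm; it only remembers the most common label."""
    most_common_label, _ = Counter(s[label] for s in samples).most_common(1)[0]
    return {"predicts": most_common_label, "trained_on": len(samples)}


def store(storage: dict[str, list], path: str, model: dict) -> int:
    """Store: append the model under its path and return the new version number."""
    versions = storage.setdefault(path, [])
    versions.append(model)
    return len(versions)  # assumed running number: 1, 2, 3, ...


def run_training(messages, mappings, label, storage, path) -> int:
    samples = [collect(m, mappings) for m in messages]  # Collect and Aggregate
    model = train_majority_class(samples, label)        # Train
    return store(storage, path, model)                  # Store
```

Running `run_training` twice against the same path yields two versions under that path, which mirrors the one-version-per-run behavior described above.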

Configuration

Name & Description

Name: Name of the Asset. Spaces are not allowed in the name.

Description: Enter a description.

Input Ports

This Processor can only have one Input Port from which it receives data to process.

A port has a name and an optional description. The name is mandatory.

You cannot delete the only Input Port.

Output Ports

This Processor can only have one Output Port, through which it sends messages onward in the Workflow.

A port has a name and an optional description. The name is mandatory.

You cannot delete the only Output Port.

Training Settings

Training mode

Controls how and when training data is collected and when training is triggered.

Option	Description
Stream based	All messages from the same stream are collected as one training batch. Training starts when the stream ends (onCommit). Use this when you have a complete labeled dataset in one stream.
Time / Message based	Messages are collected continuously over time. Training begins when at least one of the minimum conditions (message count or duration) is satisfied, and ends when at least one of the maximum conditions (message count or duration) is reached. Use this for continuous data collection with periodic retraining.

Training set limits (Time / Message based only)

Configure the boundaries for when training starts and stops:

Minimum training set:

Field	Description
Minimum number of training messages	The minimum number of messages required before training can be performed. If fewer messages are received than this threshold, training will not run.
Minimum training duration [s]	The minimum duration (in seconds) that training data must be streamed before training can be performed.

Maximum training set:

Field	Description
Maximum number of training messages	The maximum number of messages to collect before training ends.
Maximum training duration [s]	The maximum duration (in seconds) before training ends, regardless of message count.

Training begins when at least one minimum condition is satisfied, and ends when at least one maximum condition is reached. Stream-based mode does not use these parameters — all incoming messages are used for training regardless of count or duration.
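The "at least one" rule for minimum and maximum limits can be expressed as two small predicates. The Python sketch below is only an illustration: the class `TrainingSetLimits` and its field names are hypothetical, and representing an unset maximum as `None` is an assumption about how missing values could be modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainingSetLimits:
    min_messages: int
    min_duration_s: float
    max_messages: Optional[int] = None      # None models an unset maximum (assumption)
    max_duration_s: Optional[float] = None  # None models an unset maximum (assumption)

    def minimum_met(self, count: int, elapsed_s: float) -> bool:
        """True when at least one minimum condition is satisfied."""
        return count >= self.min_messages or elapsed_s >= self.min_duration_s

    def maximum_reached(self, count: int, elapsed_s: float) -> bool:
        """True when at least one configured maximum condition is reached."""
        by_count = self.max_messages is not None and count >= self.max_messages
        by_time = self.max_duration_s is not None and elapsed_s >= self.max_duration_s
        return by_count or by_time
```

With `TrainingSetLimits(1000, 3600, max_messages=10000)`, the minimum is met either at 1000 messages or after 3600 seconds, and the maximum is reached only by message count because no maximum duration is configured.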

Figure: Training settings, Time / Message based mode with minimum and maximum training set limits

Models to train

The models table lists all AI Model Resources to train and where to store each trained model in AI Storage.

Click + ADD MODEL to add a new model configuration. Each row has:

Column	Description
Model name	A human-readable name for this trained model (e.g., `VoiceClassifier-v2`)
Path in AI Storage	The path in AI Storage where the trained model will be stored (e.g., `models/my-classifier`). Supports macros for per-environment values.
AI Model	Reference to an existing AI Model Resource in the Project — defines the input/output schema, model type, and hyperparameters
Model type	The Weka algorithm to use (read from the selected AI Model Resource, e.g., `J48 Decision Tree`, `Multilayer Perceptron`)

Figure: AI Model Selector, path and input attribute mappings

The AI Model Resource defines:

Input attributes: the attribute names and types that the model expects as input
Output attribute: the attribute that the model will predict (the class label)
Model type: the Weka algorithm, either a decision tree (e.g., J48 Decision Tree) or a neural network (e.g., Multilayer Perceptron)
Hyperparameters: algorithm-specific settings (e.g., for J48: unpruned tree, minimum number of objects per leaf; for Multilayer Perceptron: learning rate, momentum, hidden layers)

Input attribute mappings

For each selected AI Model, the input attribute mappings define how incoming message data maps to the model's input attributes.

Each row maps a Data Dictionary attribute (from the AI Model Resource's input schema) to a message accessor:

Column	Description
Name	The attribute name from the AI Model Resource's input schema (read-only)
Type	The attribute's data type from the Data Dictionary (read-only)
Message accessor	A message accessor expression that reads the value from the current message

The values read through these mappings are the features that are fed to the Weka model during training.
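To make the mapping concrete, the Python sketch below reads dotted accessors from a message modeled as nested dicts. This is a simplification under stated assumptions: the real message accessor syntax may support more than dotted paths, and the helper names `read_accessor` and `extract_features` are hypothetical.

```python
from typing import Any


def read_accessor(message: dict, accessor: str) -> Any:
    """Follow a dotted accessor such as 'Detail.D2_05.CALL_TYPE_IND' through nested dicts."""
    value: Any = message
    for part in accessor.split("."):
        value = value[part]  # raises KeyError if the path does not exist
    return value


def extract_features(message: dict, mappings: dict[str, str]) -> dict[str, Any]:
    """Build one training sample: model attribute name -> value read from the message."""
    return {name: read_accessor(message, accessor) for name, accessor in mappings.items()}
```

Each key of `mappings` corresponds to a row's Name column and each value to its Message accessor column, so one call to `extract_features` produces one training sample.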

Behavior

Stream Based Mode

Each message in the stream is added to the training batch as a training sample
The input attribute mappings are used to extract feature values from each message
When the stream ends (onCommit), the collected batch is used to train the model
The trained model is stored in AI Storage at the configured path
The training set limits do not apply in this mode: every message in the stream is used for training, regardless of count or duration
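The collect-then-train-on-commit flow can be sketched in Python as an event-driven collector. The class and method names are hypothetical, the training function is passed in as a placeholder for the Weka algorithm, and the sketch leaves out any checks on batch size. Emptying the batch on commit is an assumption made so that the next stream starts clean.

```python
from typing import Callable


class StreamBatchCollector:
    """Hypothetical sketch: gathers samples per stream and trains when the stream commits."""

    def __init__(self, train: Callable[[list[dict]], object]) -> None:
        self._train = train
        self._batch: list[dict] = []

    def on_message(self, sample: dict) -> None:
        # Every message of the stream becomes one training sample.
        self._batch.append(sample)

    def on_commit(self) -> object:
        # End of stream: hand the whole batch to the trainer and start a fresh batch.
        batch, self._batch = self._batch, []
        return self._train(batch)
```

A caller would invoke `on_message` once per extracted sample and `on_commit` once when the stream ends, then pass the returned model to whatever stores it.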

Time / Message Based Mode

Messages are collected continuously over time
Each message adds to the training batch
Training starts when at least one minimum condition is satisfied (message count OR time elapsed)
Training ends when at least one maximum condition is reached (message count OR time elapsed)
If neither maximum is set, training never triggers automatically (manual reset required)
After training, the batch is cleared and collection begins again for the next training cycle

Inheritance

All settings support inheritance — a child Asset can override individual models or fields while inheriting the rest from its parent.

Example

A Workflow receives usage records and needs to train a classification model to categorize records by type.

In this Workflow, a JavaScript Processor placed upstream of the AI Trainer extracts and formats the relevant attributes from each record before passing them to the Trainer.

AI Trainer configuration:

Setting	Value
Training mode	`Time / Message based`
Minimum number of training messages	`1000`
Maximum number of training messages	`10000`

Models to train:

Model name	Path in AI Storage	AI Model	Model type
`UsageClassifier-v2`	`models/usage-classifier-v2`	`UsageClassifier`	J48 Decision Tree

Input attribute mappings:

Attribute	Message accessor
`call_type_ind`	`Detail.D2_05.CALL_TYPE_IND`
`call_destination_ind`	`Detail.D2_05.CALL_DESTINATION_IND`
`rate_scenario_cd`	`Detail.D2_05.RATE_SCENARIO_CD`
...	`Detail.D2_05.*`

What happens at runtime:

Messages arrive with raw attributes from the source
The JavaScript Processor extracts and formats the relevant fields
The AI Trainer collects the formatted messages into a training batch
After 1000+ messages have been collected, the Trainer assembles the feature data into a Weka dataset
The J48 Decision Tree algorithm trains on the dataset
The trained model is stored in AI Storage at `models/usage-classifier-v2`
The model is now available for the AI Classifier to load and use for classification
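For readers unfamiliar with Weka datasets, the Python sketch below renders a few samples in ARFF, Weka's plain-text dataset format. Whether the Trainer uses ARFF internally is not stated here, so treat this purely as an illustration of what a Weka dataset contains. The helper `to_arff`, the class attribute `record_type`, and all sample values are made up, and values are assumed to be simple tokens without spaces or commas.

```python
def to_arff(relation: str, rows: list[dict[str, str]]) -> str:
    """Render rows of nominal values as ARFF text (illustrative helper, not a product API)."""
    names = list(rows[0])
    lines = [f"@relation {relation}", ""]
    for name in names:
        # A nominal attribute lists every value it can take.
        values = sorted({row[name] for row in rows})
        lines.append(f"@attribute {name} {{{','.join(values)}}}")
    lines += ["", "@data"]
    lines += [",".join(row[name] for name in names) for row in rows]
    return "\n".join(lines) + "\n"


# Made-up samples: one input attribute plus a hypothetical class label.
rows = [
    {"call_type_ind": "M", "record_type": "voice"},
    {"call_type_ind": "N", "record_type": "sms"},
]
arff_text = to_arff("usage", rows)
```

The header declares each attribute with its allowed values, and each line after `@data` is one training sample. The class label is placed last here, which is a common but not mandatory arrangement.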
