LTM – SYNTASA™

Description

The LTM Process is one step in building the training dataset from the Lookback process dataset.

It applies reductions, transformations, and filters to cleanse the data further.

Process Configuration

The LTM Process Configuration screen has three tabs: Parameters, Filters, and Output.

Each tab and its fields are described below.

Parameters

Three types of parameters are available for configuration: Reductions, Transformations, and Outliers.

Reductions

Method - defines the reduction method to apply to the data (3 available)
- Tfidf (term frequency-inverse document frequency) - weights how important a term is to a document relative to the rest of the dataset; configured with a minDF and a number of features, with the option to tokenize on a specific delimiter and to enable an n-gram sequence on a specified field
- Tf (term frequency) - weights a term by how often it occurs; configured with a number of features on a defined field, with the option to tokenize on a specific delimiter and to enable an n-gram sequence on a specified field
- One Hot Encoding - transforms categorical features into a format that works better with classification and regression algorithms by binarizing the features
Select Fields - drop-down to define the field(s) within the Lookback dataset that should have the reduction method applied
minDF - minimum document frequency; terms that appear in fewer documents than this value are ignored
Features - total number of features to consider
Delimiter - toggle on to define the character that delimits the values in the concatenated string
n-gram - toggle on to define the n-gram length, that is, the number of items in each sequence
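
To make the Reductions settings more concrete, the following is a minimal sketch in plain Python of how a delimiter, an n-gram length, and minDF might interact when computing Tfidf weights, along with a simple one-hot encoding. It is not SYNTASA's implementation: the function names (`tokenize`, `tfidf`, `one_hot`), the smoothed IDF formula, the token-level n-grams, and the sample rows are all assumptions for illustration, and the Features cap is omitted.

```python
import math
from collections import Counter


def tokenize(value, delimiter="|", n=1):
    """Split a concatenated string on the delimiter, then build token n-grams."""
    tokens = [t for t in value.split(delimiter) if t]
    if n <= 1:
        return tokens
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(rows, delimiter="|", n=1, min_df=1):
    """Return one {term: weight} dict per row, skipping terms found in fewer than min_df rows."""
    docs = [tokenize(row, delimiter, n) for row in rows]
    # Document frequency: the number of rows that contain each term at least once.
    df = Counter(term for doc in docs for term in set(doc))
    total = len(docs)
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({
            # Assumption: one common smoothed IDF form, log((N + 1) / (df + 1)).
            term: count * math.log((total + 1) / (df[term] + 1))
            for term, count in tf.items()
            if df[term] >= min_df
        })
    return weights


def one_hot(values):
    """Binarize a categorical field: one 0/1 column per distinct category (sorted)."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


# Hypothetical Lookback field holding delimited page names.
rows = ["home|search|cart", "home|search", "home|checkout"]
result = tfidf(rows, delimiter="|", n=1, min_df=2)
encoded = one_hot(["red", "blue", "red"])
```

With `min_df=2`, the terms `cart` and `checkout` are dropped because each appears in only one row, and `home` receives a weight of zero because it appears in every row.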

Transformations

Method - defines the transformation method to apply to the data (4 available)
- Max Abs - scales each feature individually so that the maximal absolute value of each feature in the training set is 1.0; the data is not shifted or centered
- Bucketed Lsh (locality-sensitive hashing) - reduces dimensionality by mapping similar items to the same 'buckets' with high probability
- Min Max - a normalization that rescales a variable so that its minimum maps to 0 and its maximum maps to 1
- Z Score - converts all indicators to a common scale with a mean of zero and a standard deviation of one
Select Field(s) - drop-down to define the field or fields the transformation should be applied to
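
The three scaling methods above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the product's code: the function names and sample values are hypothetical, and the Z Score sketch uses the population standard deviation, which may differ from what the LTM process uses.

```python
import statistics


def max_abs(values):
    """Divide by the largest absolute value so every result falls in [-1, 1]."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values] if peak else list(values)


def min_max(values):
    """Map the minimum to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant field: nothing to rescale
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Center on the mean and divide by the standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


# Hypothetical numeric field from the Lookback dataset.
data = [-4.0, 0.0, 2.0, 6.0]
scaled_max_abs = max_abs(data)
scaled_min_max = min_max(data)
scaled_z = z_score(data)
```

Max Abs keeps zero values at zero, which preserves sparsity, whereas Min Max and Z Score shift the data.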

Outliers

Method - defines how outliers are handled, if desired
- Std Dev - handles outliers by specifying the number of standard deviations within which values are considered for the training dataset
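
As a rough illustration of the Std Dev method, the sketch below keeps only the values that lie within a chosen number of standard deviations of the mean. It assumes that values outside that band are dropped (rather than clipped) and that the population standard deviation is used; the function name `within_std_dev` and the sample values are hypothetical.

```python
import statistics


def within_std_dev(values, num_std_dev):
    """Keep values inside mean +/- num_std_dev * standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # assumption: population standard deviation
    lower = mean - num_std_dev * std
    upper = mean + num_std_dev * std
    # Assumption: out-of-band values are removed rather than capped.
    return [v for v in values if lower <= v <= upper]


# Hypothetical field with one extreme value.
order_values = [10, 12, 11, 13, 12, 11, 10, 95]
kept = within_std_dev(order_values, 2)
```

Here the value 95 falls more than two standard deviations above the mean, so it is excluded from `kept`.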

Filters

The Filters tab filters the dataset (that is, applies a WHERE clause) so that only certain data is included.

To create a filter, click the green plus button; the filter editor screen appears. Multiple filters can be applied, so make sure they are combined with the proper AND/OR logic.
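
The choice between AND and OR changes which rows survive. The sketch below shows the difference on a few made-up rows; the field names, values, and helper functions are hypothetical, and the comments show the equivalent WHERE clauses.

```python
# Hypothetical rows from a Lookback dataset.
rows = [
    {"country": "US", "device": "mobile", "visits": 3},
    {"country": "US", "device": "desktop", "visits": 9},
    {"country": "UK", "device": "mobile", "visits": 7},
]


def is_us(row):
    return row["country"] == "US"


def is_mobile(row):
    return row["device"] == "mobile"


# WHERE country = 'US' AND device = 'mobile'
and_rows = [r for r in rows if is_us(r) and is_mobile(r)]

# WHERE country = 'US' OR device = 'mobile'
or_rows = [r for r in rows if is_us(r) or is_mobile(r)]
```

With AND, only the first row is kept; with OR, all three rows are kept, because each one satisfies at least one of the two conditions.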

Output

The Output tab sets the Display Name of the treatment model as it appears on the graph canvas.

Expected Output

The LTM process outputs a treatment model, stored in the Base Path, that is applied to the history (lookback) dataset and then referenced in the Featurize Process.