Evaluator – SYNTASA™

Description

Once the learning and scoring processes have completed, the Evaluator Process is used to produce a dataset for analytical use.

This dataset is used for viewing the accuracy of the model.

Process Configuration

Two screens need to be configured for this process type: Parameters and Output.

Below are details of each screen and descriptions of each of the fields.

Parameters

There are three variables that need to be configured:

Prediction Column - used to select the prediction column passed from the scoring process
Label Column - used to select the field that is associated with the label element of the scoring process
Evaluator Type - there are four types available to select from the drop-down:
- Binary Class Evaluator - compares two assignments of a binary attribute: the known label (the standard) and the model's prediction (the one being investigated)
- Multi Class Evaluator - evaluates models that classify instances into one of three or more classes
- Ranking Class Evaluator - sums up the quality of ranked predictions into a single number or score
- Regression Evaluator - assesses whether the numerical results of a regression analysis, which quantify hypothesized relationships between variables, are acceptable as descriptions of the data
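
As an illustration of what a binary evaluation measures, the following is a minimal Python sketch that derives precision, recall, and an F1 score from a label column and a prediction column. It is not Syntasa's implementation; the function name `binary_metrics` and the sample values are hypothetical and serve only to show the arithmetic.

```python
from typing import Sequence


def binary_metrics(labels: Sequence[int], predictions: Sequence[int]) -> dict:
    """Compute precision, recall and F1 for 0/1 labels and predictions."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    # Count the confusion-matrix cells that the three metrics depend on.
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical values standing in for the Label Column and Prediction Column.
example_labels = [1, 0, 1, 1, 0, 0]
example_predictions = [1, 0, 0, 1, 1, 0]
example_result = binary_metrics(example_labels, example_predictions)
```

In this sketch, precision is the share of positive predictions that were correct and recall is the share of actual positives that were found.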

Output

The Output tab provides the ability to change the default Table Name and Display Name values on the graph canvas. It also lets you select where the output is loaded: BigQuery (BQ) if in the Google Cloud Platform (GCP), Redshift or RDS if in Amazon Web Services (AWS), or simply HDFS if using on-premise Hadoop.

Expected Output

The expected output of the Evaluator Process is the table below, created in the environment in which the data is processed (e.g. AWS, GCP, on-premise Hadoop):

Table Name (default value: tb_test_evaluation) - table using Syntasa-defined column names
Display Name (default value: Test Evaluation) - display name of the node on the canvas

This table can be queried directly using an enterprise-provided query engine.

The resulting table contains the following fields:

precision
recall
measure
numBins
areaUnderROC
areaUnderPR
partitionDate
calcTime
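
As a rough sketch of querying the output table, the Python example below uses an in-memory SQLite database as a stand-in for the enterprise query engine. The column types, the ISO date format for partitionDate, the helper name `latest_evaluation`, and every inserted value are assumptions made for illustration; only the table name and column names come from the description above.

```python
import sqlite3

# In-memory SQLite stands in for the enterprise query engine (assumption).
conn = sqlite3.connect(":memory:")

# Subset of the documented columns; the column types are assumed.
conn.execute(
    """
    CREATE TABLE tb_test_evaluation (
        "precision" REAL,
        recall REAL,
        areaUnderROC REAL,
        areaUnderPR REAL,
        partitionDate TEXT
    )
    """
)

# Made-up rows, present only so that the query has something to return.
conn.executemany(
    "INSERT INTO tb_test_evaluation VALUES (?, ?, ?, ?, ?)",
    [
        (0.80, 0.75, 0.88, 0.79, "2018-11-12"),
        (0.82, 0.77, 0.90, 0.81, "2018-11-13"),
    ],
)


def latest_evaluation(connection: sqlite3.Connection) -> tuple:
    """Return the metrics row for the most recent partition date."""
    return connection.execute(
        'SELECT partitionDate, "precision", recall, areaUnderROC '
        "FROM tb_test_evaluation "
        "ORDER BY partitionDate DESC LIMIT 1"
    ).fetchone()


latest_row = latest_evaluation(conn)
```

The same SELECT statement would need to be adapted to the dialect of the engine in use, for example BigQuery or Redshift.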
