Description
Once the Learning Process has completed, the Score Process lets you configure the scoring code and process it within a Syntasa Composer workflow.
Once the process is configured and tested, it can be deployed to production and scheduled for Syntasa to run on a recurring basis.
Process Configuration
The Score Process Configuration screen has three tabs that define the mapping of the desired data, the structure of the schema, any required filtering, and the storage rules for the files. Click the Score node to open the editor.
Below are details of each tab and descriptions of each of its fields.
Mapping
The Mapping screen defines the field names and lets you specify the Feature(s), Identifier(s), and Partition(s).
Actions
For Score, there are five options available: Add, Add All, Clear, Import, and Export.
- Add - selects specific fields from the input table.
- Add All - selects all fields from the input table.
- Clear - clears all selected fields from the mapping canvas.
- Import - used when the client has JSON data available to provide custom mappings.
- Export - exports the existing mapping schema in .csv format, which can assist in editing or manipulating the schema. The updated file can then be used to load an updated schema into the dataset.
Mapping Output
- Order - column ordering
- Field Name - specified name of the column
- Feature - field specified as a feature - click on the corresponding cell and select the checkbox
- Identifier - field to aggregate on - click on the corresponding cell and select the checkbox
- Partition - column to partition the data on
To switch a field to Feature, Identifier or Partition, click in the corresponding cell and select the checkbox.
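As a rough illustration of what the mapping canvas captures, the sketch below models each row as a small record with an order, a field name, and the three checkboxes. The class, the helper function, and all field names are hypothetical and chosen only for this example; they are not part of Syntasa.

```python
from dataclasses import dataclass


@dataclass
class MappedField:
    """One row of the mapping canvas (hypothetical model, not a Syntasa class)."""
    order: int
    field_name: str
    feature: bool = False
    identifier: bool = False
    partition: bool = False


# Hypothetical input fields; a real mapping depends on the input table.
mapping = [
    MappedField(1, "customer_id", identifier=True),
    MappedField(2, "visits_last_30d", feature=True),
    MappedField(3, "avg_order_value", feature=True),
    MappedField(4, "event_date", partition=True),
]


def columns_with(fields, role):
    """Return field names, in column order, whose checkbox for `role` is ticked."""
    ordered = sorted(fields, key=lambda f: f.order)
    return [f.field_name for f in ordered if getattr(f, role)]


features = columns_with(mapping, "feature")
identifiers = columns_with(mapping, "identifier")
partitions = columns_with(mapping, "partition")
print(features, identifiers, partitions)
```

Each field carries at most one role in this sketch, which keeps the example simple; whether a field may hold several roles at once is not stated here.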
Filters
The Filters tab provides the ability to filter the dataset (that is, apply a Where clause) so that only certain data is included.
To create a filter, click the green plus button; the filter editor screen will appear. Multiple filters can be applied, so make sure they are combined with the proper AND/OR logic.
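The choice between AND and OR changes which rows survive the filter. The plain-Python sketch below shows the difference on a handful of made-up rows; the column names, values, and the two conditions are assumptions for illustration and do not reflect the filter editor's own syntax.

```python
# Hypothetical rows standing in for the input dataset.
rows = [
    {"country": "US", "visits": 12},
    {"country": "US", "visits": 2},
    {"country": "DE", "visits": 9},
    {"country": "FR", "visits": 1},
]


def is_us(row):
    """First filter: country equals 'US'."""
    return row["country"] == "US"


def is_active(row):
    """Second filter: at least 5 visits."""
    return row["visits"] >= 5


# AND keeps only rows that satisfy both filters.
and_result = [r for r in rows if is_us(r) and is_active(r)]

# OR keeps rows that satisfy either filter.
or_result = [r for r in rows if is_us(r) or is_active(r)]

print(len(and_result), len(or_result))
```

With these rows, AND keeps a single row while OR keeps three, so choosing the wrong operator can silently widen or narrow the scored dataset.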
Output
The Output tab provides the ability to name the table and set its display name on the graph canvas, along with selecting whether to load to BigQuery (BQ) if in the Google Cloud Platform (GCP), load to Redshift or RDS if in Amazon Web Services (AWS), or simply write to HDFS if using on-premise Hadoop.
Expected Output
The expected output of the Score process is the table below, created within the environment where the data is processed (e.g., AWS, GCP, on-premise Hadoop):
- Table Name <tb_training_predictions> or <tb_test_predictions> default value - table using Syntasa-defined column names
- Display Name <Training Predictions> or <Test Predictions> default value - display name of node on canvas
This table can be queried directly using an enterprise-provided query engine.
Additionally, the table can serve as the foundation for building other processes within the Syntasa Composer environment.
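As a sketch of what querying the output table directly might look like, the example below uses Python's built-in sqlite3 module as a local stand-in for the enterprise query engine. It assumes the default test table name is spelled tb_test_predictions; the column names (customer_id, prediction, event_date) and the sample rows are hypothetical, since the actual columns are defined by Syntasa.

```python
import sqlite3

# Assumed default table name; adjust to the name set on the Output tab.
TABLE_NAME = "tb_test_predictions"

# In-memory database used only as a stand-in for BigQuery, Redshift, etc.
conn = sqlite3.connect(":memory:")
conn.execute(
    f"CREATE TABLE {TABLE_NAME} (customer_id TEXT, prediction REAL, event_date TEXT)"
)

# Hypothetical scored rows.
conn.executemany(
    f"INSERT INTO {TABLE_NAME} VALUES (?, ?, ?)",
    [
        ("c1", 0.91, "2024-01-01"),
        ("c2", 0.15, "2024-01-01"),
        ("c3", 0.67, "2024-01-01"),
        ("c4", 0.80, "2024-01-02"),
    ],
)

# Fetch the two highest-scoring identifiers for a single partition value.
top = conn.execute(
    f"SELECT customer_id, prediction FROM {TABLE_NAME} "
    "WHERE event_date = ? ORDER BY prediction DESC LIMIT 2",
    ("2024-01-01",),
).fetchall()
conn.close()

print(top)
```

Restricting the query to one partition value, as done here with event_date, is a common way to limit the amount of data scanned, though the exact SQL dialect depends on the query engine in use.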