The Generic Event Enrich process applies functions to the data, joins lookup datasets, and writes the result to an event-level dataset. This dataset is the foundation for building the session, product, and visitor datasets, and can also be used directly for analysis and for user-defined analytics datasets.
Process Configuration
The Generic Event Enrich editor is organized into a series of screens that provide the ability to join multiple datasets, map raw fields to the schema, apply filters, and control where the output data is written. Each screen and its fields are described below.
Click on the Generic Event Enrich node to access the editor.
General
Process Name - Provide a descriptive name for the process, ideally beginning with a verb.
Join
This screen captures the information Syntasa needs when more than one dataset will be joined; a sketch of the resulting join logic follows the field list below.
Joins
To create a join, click the green plus button.
- Join Type - currently either a left join or an inner join
- Dataset selector - choose the dataset that will be joined to the first dataset
- Alias - type a table alias if a different name is desired or required
- Left Value - choose the field from the first dataset that links it to the joined dataset
- Operator - select how the left value should be compared with the right value; for a join this is typically the equals sign (=)
- Right Value - select the field from the joined dataset that is compared with the left value
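For illustration only, a left join configured with an alias of lkp and an equality operator behaves roughly like the HiveQL below; the table and column names are hypothetical, chosen purely for the sketch.

```sql
-- Hypothetical sketch of the join produced by the configuration above.
-- raw_events is the first dataset; geo_lookup is the joined dataset,
-- aliased as lkp. Left Value = e.ip_address, Operator = '=',
-- Right Value = lkp.ip_address.
SELECT e.*,
       lkp.country,
       lkp.region
FROM raw_events e
LEFT JOIN geo_lookup lkp
  ON e.ip_address = lkp.ip_address;
```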
Mapping
This screen is where the raw data schema is mapped into the Syntasa schema and where user-defined or other labels are applied to the resulting columns; a sketch of a typical mapping expression follows the field list below.
Syntasa has a growing set of custom functions that can be applied along with any Hive functions to perform data transformations.
It is recommended to consult Syntasa professional services with any questions before applying anything other than the default functions.
- Name - the fixed Syntasa table column names
- Label - customizable, user-friendly names for the columns
- Function - the predefined or custom function that maps a raw file field into the Syntasa column
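A Function entry is essentially a Hive expression over the raw fields. As a hypothetical illustration (the raw field names and conversions below are assumptions, not defaults), two mappings might look like this when expressed as HiveQL:

```sql
-- Hypothetical mapping expressions; raw field names are assumed.
-- Name: event_ts -> parse a raw timestamp string into a standard form
-- Name: page_url -> normalize the raw URL field
SELECT
  from_unixtime(unix_timestamp(raw_hit_time, 'yyyy-MM-dd HH:mm:ss')) AS event_ts,
  lower(trim(raw_page_url))                                          AS page_url
FROM raw_events;
```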
Actions
For Generic Event Enrich, two options are available: Import and Export. Import is used to load a custom mapping schema that the user has created as a .csv file (e.g. in Excel). Export downloads the existing mapping schema as a .csv file, which can be edited or manipulated and then re-imported to update the schema in the dataset.
To perform Import:
- Click the Actions button
- Click Import
- Click the green paperclip icon and browse to the desired file to import
- Once a file is selected, click Open
- Click Apply
- Wait about 60 seconds to ensure the mappings and labels have finished loading
- Use the scroll, order, and search options to locate the Cust fields and Cust metrics fields and confirm that all of the report suite's custom eVars, props, and events have been mapped
To perform Export:
- Click the Actions button
- Click Export
- A file named syntasa_mapping_export.csv will be created and downloaded; a hypothetical sketch of its contents is shown below
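The exported file mirrors the mapping grid. The layout below is only a hypothetical illustration; the actual column headers and values may differ by version, so use a real export from your environment as the editing template.

```csv
Name,Label,Function
page_url,Page URL,lower(trim(raw_page_url))
cust_field_1,Campaign Code,evar1
```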
Filters
Filters provide the user with the ability to restrict the dataset (i.e. apply a WHERE clause); a sketch of the resulting logic follows the steps below.
To create a filter:
- Click the green plus button
- The filter editor screen will appear
- Ensure the proper logic (AND/OR) is applied
- Select the appropriate Left Value from the drop-down list, or click --Function Editor-- to create and apply a custom function
- Select the appropriate Operator from the drop-down list
- Select the desired Right Value for the filter from the drop-down list, or click --Function Editor-- to create and apply a custom function
- Multiple filters can be created and applied
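Conceptually, two filters combined with AND translate to a WHERE clause like the one below; the table and column names are hypothetical.

```sql
-- Hypothetical sketch: two filters combined with AND.
-- Filter 1: Left Value = country,  Operator = '=',           Right Value = 'US'
-- Filter 2: Left Value = page_url, Operator = 'is not null'
SELECT *
FROM raw_events
WHERE country = 'US'
  AND page_url IS NOT NULL;
```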
Outputs
The Outputs screen provides the ability to name the output table and to define how the output should be labeled on the app graph; a sketch of the resulting table shape follows the field descriptions below.
Dataset
Table Name - This defines the name of the database table where the output data will be written. Ensure that the table name is unique among all tables within the defined event store; otherwise, data previously written by another process will be overwritten.
Display Name - The label shown on the process output icon on the app graph canvas.
Partition Scheme - Defines how the output table should be stored in a segmented fashion. Options are Daily, Hourly, and None. Daily is typically chosen.
File Format - Defines the format of the output file. Options are Avro, Orc, Parquet, and Textfile.
Field Delimiter - Only available when Textfile is selected as the file format; this defines how the fields should be separated, e.g. \t for tab-delimited.
Load To BQ - This option is only relevant to Google Cloud Platform deployments. BQ stands for BigQuery, and this option allows a BigQuery table to be created from the output. On AWS deployments, the option appears as Load To Redshift; on-premise installations normally write data to HDFS and do not display a Load To option.
Location - Automatically generated and not editable; derived from the paths and settings of the event store the app is created in, the key of the app, and the table name given above.
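As a rough illustration, an output configured with a Daily partition scheme and Parquet file format corresponds to a table shaped like the HiveQL below; the column list, partition column name, and location path are assumptions made only for the sketch.

```sql
-- Hypothetical shape of the output table; columns, partition column,
-- and location are assumed for illustration.
CREATE EXTERNAL TABLE tb_event (
  event_ts   STRING,
  visitor_id STRING,
  page_url   STRING
)
PARTITIONED BY (event_date STRING)  -- Daily partition scheme
STORED AS PARQUET                   -- File Format = Parquet
LOCATION 's3://event-store/apps/my_app/tb_event';  -- auto-generated Location
```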
Expected Output
The expected output of the Generic Event Enrich process is the following pair of tables, created within the environment where the data is processed (e.g. AWS, GCP, on-premise Hadoop):
- tb_event - event-level table using Syntasa-defined column names
- vw_event - view built on tb_event, providing user-friendly labels
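For example, downstream processes typically read the underlying table, while analysts can query either one; the column name below is hypothetical.

```sql
-- Count events per day from the output table (column name assumed).
SELECT event_date,
       COUNT(*) AS events
FROM tb_event
GROUP BY event_date;
```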