The Session Enrich process applies functions to the data generated by the Event Enrich process, joins lookups, and writes the results into a session-level dataset, which can be thought of as event data aggregated at the visit level. The data is then partitioned by day, enabling faster and easier analysis over specified time periods.
Many of the settings are similar to those of Event Enrich.
The Session Enrich process includes four screens providing the ability to join multiple datasets, map to the schema, apply desired filters, and understand where the data is being written. Below are details of each screen and descriptions of each of the fields.
This section provides the information Syntasa needs if more than one set of data will be joined.
To create a join, click the green plus button.
- Primary Source - The first dataset connected on the graph appears by default.
- Alias - Type a table alias if a different name is desired or required.
- Join Type - Select a left or inner join.
- Dataset selector - Choose the dataset that will be joined with the first dataset.
- Alias - Type a table alias if a different name is desired or required.
- Left Value - Choose the field from the first dataset that links to the joined dataset (e.g., customer ID when joining a CRM dataset).
- Operator - Select how the left value should be compared with the right value; for joins this is typically an = sign.
- Right Value - Select the field from the joined dataset to compare with the left value.
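The join configured through these fields is conceptually equivalent to a SQL join. The sketch below is illustrative only (it does not show Syntasa internals) and assumes an events dataset aliased ev joined to a CRM dataset aliased crm on customer ID:

```python
import sqlite3

# Illustrative sketch: an in-memory database standing in for the two datasets.
# Table and column names (ev, crm, customer_id) are assumptions for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ev (customer_id TEXT, page TEXT);
CREATE TABLE crm (customer_id TEXT, segment TEXT);
INSERT INTO ev VALUES ('c1', 'home'), ('c2', 'checkout');
INSERT INTO crm VALUES ('c1', 'gold');
""")
rows = conn.execute("""
    SELECT ev.customer_id, ev.page, crm.segment
    FROM ev
    LEFT JOIN crm                  -- Join Type: left
      ON ev.customer_id            -- Left Value
       = crm.customer_id           -- Operator / Right Value
    ORDER BY ev.customer_id
""").fetchall()
# A left join keeps every row from the primary source; rows with no CRM
# match get NULL (None) for the joined fields.
print(rows)
```

With an inner join, the unmatched row ('c2', 'checkout') would be dropped instead of kept with a NULL segment.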
The table is where the event data is defined and labeled according to the Syntasa schema.
- Name - Fixed Syntasa table column labels; some names are editable.
- Label - Some labels are customizable and user-friendly.
- Function - Where the enrichment logic is written. This can be one of the following:
- Custom logic, such as a regex or a case statement.
- Combining several source columns into one column. For example, the customer name might be stored as cust_name in one report suite and first_name in another; these can be unified into a single column (e.g., customer_name) for both report suites by typing report_name.field_name_in_source.
- Note - The user can add custom notes.
- Delete - The user can delete a customized row from the table.
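The two kinds of Function logic above can be sketched in plain Python. This is a hypothetical illustration, not Syntasa's execution model; the column names (cust_name, first_name) and report names (report_a, report_b) are made up:

```python
import re

# Illustrative only: coalesce differently named source columns into one
# unified column, in the spirit of the report_name.field_name_in_source
# convention described above.
def customer_name(row):
    return row.get("report_a.cust_name") or row.get("report_b.first_name")

# Illustrative only: custom regex logic, analogous to a CASE statement
# classifying a raw field into an enriched value.
def device_type(user_agent):
    return "mobile" if re.search(r"iPhone|Android|Mobile", user_agent) else "desktop"

print(customer_name({"report_a.cust_name": "Ada Lovelace"}))
print(device_type("Mozilla/5.0 (iPhone; CPU iPhone OS 16_0)"))
```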
For Unified Session Enrich there are two options available: Import and Export. Import is used to provide a custom mapping schema created as a .csv file (e.g., in Excel). Export downloads the existing mapping schema in .csv format so it can be edited or manipulated; the updated file can then be imported to apply the revised schema to the dataset.
To perform Import:
- Click Actions button
- Click Import
- Click on the green paperclip icon to browse to the desired file to import
- Once the file is selected, click Open
- Click Apply
- Wait 60 seconds to ensure the process of pulling in mappings and labels is complete
- Use the scroll, order, and search options to locate the cust_fields and cust_metrics fields to ensure all the report suite custom fields have been mapped
To perform Export:
- Click Actions button
- Click Export
- syntasa_mapping_export.csv will be created and downloaded for the user
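The export-edit-import round trip can be sketched with standard CSV tooling. The column layout below (name, label, function) is an assumption for illustration; the real syntasa_mapping_export.csv columns may differ:

```python
import csv
import io

# Hypothetical exported mapping file contents (layout is an assumption).
exported = (
    "name,label,function\n"
    "event_id,Event ID,event_id\n"
    "cust_field_1,Campaign,report_a.campaign\n"
)

# Read the export, edit a label offline, and write it back out —
# the edited file is what would be re-imported via Actions > Import.
rows = list(csv.DictReader(io.StringIO(exported)))
rows[1]["label"] = "Campaign Name"

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["name", "label", "function"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```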
Filters limit the dataset to only the rows that match specified conditions.
To create a filter, click the green (+) button, which brings up the filter editor screen. Multiple filters can be combined; however, take care to choose the correct (AND/OR) logic between them.
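The AND/OR distinction matters because it changes which rows survive. A minimal sketch, with made-up field names and conditions:

```python
# Illustrative only: two filters over a session row. With AND, a row must
# pass every filter; with OR, passing any one filter is enough.
filters = [
    lambda row: row["country"] == "US",   # filter 1
    lambda row: row["page_views"] > 1,    # filter 2
]

def passes_and(row):
    return all(f(row) for f in filters)

def passes_or(row):
    return any(f(row) for f in filters)

row = {"country": "DE", "page_views": 5}
print(passes_and(row))  # fails the country filter
print(passes_or(row))   # passes the page_views filter
```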
The Output tab provides the ability to name tables and displayed names on the graph canvas, along with selecting whether to load to Big Query (BQ) if in the Google Cloud Platform (GCP), load to Redshift or RDS/Athena if in Amazon Web Services (AWS), or simply write to HDFS if using on-premise Hadoop.
- Table Name - Defines the name of the database table where the output data will be written. Ensure that the table name is unique among all tables within the defined Event Store; otherwise, data previously written by another process will be overwritten.
- Display Name - The label of the process output icon displayed on the app graph canvas.
- Partition Scheme - Defines how the output table should be stored in a segmented fashion. Options are Daily, Hourly, and None. Daily is typically chosen.
- File Format - Defines the format of the output file. Options are Avro, Orc, Parquet, Textfile
- Load To BQ - This option is only relevant to Google Cloud Platform deployments; BQ stands for Big Query, and this option creates a Big Query table. AWS deployments instead show a Load To Redshift option, while on-premise installations normally write data to HDFS and do not display a Load To option.
- Location - The storage bucket or HDFS location where the Session Enrich process output will be written.
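With a Daily partition scheme, output is typically laid out under date-based subdirectories so queries can prune to the requested days. The sketch below is illustrative only; the bucket name and partition key (event_date) are assumptions, not Syntasa's actual layout:

```python
from datetime import date

# Illustrative only: build the storage path for one daily partition.
def partition_path(base, day):
    return f"{base}/event_date={day.isoformat()}"

print(partition_path("gs://my-bucket/tb_session", date(2024, 3, 1)))
```

A query restricted to a date range then only needs to read the matching event_date directories rather than the whole table.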
The expected output of the Session Enrich process is the following tables within the environment where the data is processed (e.g., AWS, GCP, on-prem Hadoop):
- tb_session - visit-level table using Syntasa-defined column names
- vw_session - view built off tb_session providing user-friendly labels