Identity Graph is an app that maps persistent IDs for an individual (e.g. Customer ID) across multiple sources (Web Analytics, CRM, Enterprise, Email, Product, Orders, etc.). Mapping persistent IDs from multiple sources provides an understanding of a customer's behavior across different systems.
Article Summary:
- Prerequisites: what is required in order to set up a new ID Graph app
- Creating a new app
- Configure: How to configure the Register Identity process
- Testing the configuration in development
- Running and scheduling a job in production
Prerequisites
The following screens need to be populated before configuring this app:
- Infrastructure - All required fields populated with the environment details of your on-premise, Google Cloud Platform, or AWS environment.
- ID Store - Where data will reside in your cloud environment or HDFS cluster.
App creation
- Click on the menu icon and, under Apps, select "Workflow" from the sub-menu.
- Click the green "+" (plus sign) button on the top right of your screen and select New App from the dropdown.
- Fill in the New App screen.
- Name = display name of your new ID Graph application.
- Key = will automatically populate based on the name you enter.
- App prefix = will automatically populate based on the name you enter. You can change this if you prefer something else.
- Template = Choose your app template, in this case, ID Graph.
- Description = Purely informational text field.
- Event Store = Drop down where you can choose your pre-configured event store.
- Override Icon = Toggle button. Leave it unchanged unless you would like to use a custom icon.
- Click 'Create'.
Configure ID Graph
- Now find your new app and click on it to open.
- The workflow will look like the screenshot below. We will then drag in our data sources, which will be tables generated by other apps.
- Click the lock icon on the top-left to unlock the workflow.
- From the left side menu, under Stores, drag an Event Store onto the workflow.
- Click on the Event Store on the workflow and populate the two drop-downs:
- Select the Event Store. If the list is long, type the name to filter it.
- Select the Dataset. Again, if the list is long, type the name to filter it.
- Now save the changes by clicking the tick on the right.
- Click on the edge connection of the Event Store (its color will change to purple) and drag it to the Register Identity node.
- The Register Identity process is now enabled for editing. Click on it.
- Populate the 3 required drop-down fields in the Parameters Tab:
- Select an available Local ID from the field. The Local ID is typically an anonymous ID such as a cookie field (for example, in Adobe Analytics this is visitor_id/mcvisid).
- Select an available Universal ID from the field. The Universal ID is a persistent ID for known users (for example, a Customer ID).
- Select a First Seen Time from the field. This helps determine the first time the ID pairing was found.
- Click on the Output tab on the left, and check:
- The table name. You can update this if you have a specific naming convention.
- The Display name on the workflow.
- The settings under Configurations. (These are default settings; please don't modify them without consulting Syntasa.)
- Partition Schema: Daily.
- File Format: Parquet.
- Load to Big Query: toggled ON (by default).
- Now save the changes by clicking the tick on the right.
- If you have multiple sources you would like to add to the Register Identity app, repeat steps 4 to 10 until you've added all your sources.
- On the top right, click 'Save & Lock'.
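The three Register Identity parameters (Local ID, Universal ID, First Seen Time) can be pictured as a pairing table that keeps the earliest time each ID pair was observed. The following is only an illustrative sketch of that idea, not Syntasa's implementation: the function name `register_identity`, the field names, and the sample rows are all hypothetical.

```python
from datetime import datetime


def register_identity(events, local_id_field, universal_id_field, time_field):
    """Return {(local_id, universal_id): first_seen_time} for the given rows.

    Hypothetical helper for illustration only; the real process runs inside
    the app and its internals are not described in this article.
    """
    first_seen = {}
    for event in events:
        local_id = event.get(local_id_field)
        universal_id = event.get(universal_id_field)
        # A row with only one of the two IDs carries no pairing, so skip it.
        if not local_id or not universal_id:
            continue
        key = (local_id, universal_id)
        seen_at = event[time_field]
        # Keep the earliest timestamp observed for each ID pair.
        if key not in first_seen or seen_at < first_seen[key]:
            first_seen[key] = seen_at
    return first_seen


# Made-up sample rows: visitor_id plays the Local ID, customer_id the Universal ID.
events = [
    {"visitor_id": "v1", "customer_id": None, "event_time": datetime(2023, 1, 1, 9, 0)},
    {"visitor_id": "v1", "customer_id": "c100", "event_time": datetime(2023, 1, 2, 10, 0)},
    {"visitor_id": "v1", "customer_id": "c100", "event_time": datetime(2023, 1, 1, 12, 0)},
    {"visitor_id": "v2", "customer_id": "c100", "event_time": datetime(2023, 1, 3, 8, 0)},
]

pairs = register_identity(events, "visitor_id", "customer_id", "event_time")
for (local_id, universal_id), seen_at in sorted(pairs.items()):
    print(local_id, universal_id, seen_at.isoformat())
```

In this sketch, two different cookies (v1 and v2) resolve to the same customer (c100), which is the cross-source mapping the app is meant to provide.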
Test in development
Now you're ready to test your configuration.
- To test the Register Identity process node, click on the node while holding the Shift key. The node will be highlighted in grey with a tick (see the screenshot below) to indicate it has been selected.
- Now click on the "Job" button on the top right and then click "New Job" from the drop-down (see the screenshot below).
- You will now be presented with a window for configuring your job. Populate the following:
- The job Name.
- The job Description, informative for the user.
- A Tag for the job.
- Process name: auto-populated, non-editable.
- Runtime is a drop-down that allows you to choose the type/size of the cluster you want to use for processing.
- Process Mode: for the first run, Replace Date Range is sufficient. If you are running the job multiple times, you may opt for Drop and Replace or Add New and Modified. (Drop and Replace is not advised for production, as it will drop your data in BigQuery.)
- Date Range is a drop-down with a number of options. Custom allows you to select the dates you want to process (From Date / To Date). Alternatives such as "Last N Days" allow you to select a relative date range, e.g. the last 2 days, and add an offset if needed. For this example we use Custom, so we have to enter two dates.
- For Preview Record Limit and Default Test File Limit, leave the default values.
- Now click on "Save & Execute" and the job will start.
- Now click on Activity to expand and show job details on the right side menu.
- Activity will have a 1 to indicate a running job.
- Once the job completes, you can click on the output preview.
- The window will have a menu on the left, click on "Preview".
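The "Last N Days" option with an offset, mentioned in the Date Range step above, can be read as a relative window that ends some number of days before today. The sketch below shows one plausible interpretation. The function `last_n_days` is hypothetical, and whether the product counts today as part of the window, or treats the range as inclusive, is an assumption here rather than documented behavior.

```python
from datetime import date, timedelta


def last_n_days(n, offset=0, today=None):
    """Return (from_date, to_date), both inclusive.

    Assumption: the window ends `offset` days before `today` and spans `n`
    calendar days. The app's exact convention may differ.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if offset < 0:
        raise ValueError("offset must not be negative")
    today = today or date.today()
    to_date = today - timedelta(days=offset)
    from_date = to_date - timedelta(days=n - 1)
    return from_date, to_date


# Last 2 days with an offset of 1, evaluated on a fixed example date.
from_date, to_date = last_n_days(2, offset=1, today=date(2023, 3, 10))
print(from_date.isoformat(), to_date.isoformat())
```

Under these assumptions, the equivalent Custom range would use the printed values as From Date and To Date.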
Run job in production
- Deploy your development workflow to production. From the development workflow, click the "Deploy" button; you will then be presented with the screen below for the initial deployment.
- After the initial deployment, you will be required to create a snapshot name.
- Snapshot is a feature that saves the state of the app, so you can track changes over time.
- Now open the production workflow.
- Highlight the nodes you want to include in the job by holding "Shift" and then clicking on the Register Identity process.
- Click on the Job button on the top-right menu, and choose "New Job" in the sub-menu.
- Fill in the name, description and date range. Click "Save & Execute" to start the job.
- Now click on Activity to expand and show job details on the right side menu.
- Activity will have a 1 to indicate a running job.
- Once the job completes, you can click on the output preview.
- The window will have a menu on the left, click on "Preview".