A process to download a customized report from Google Analytics.
Process Configuration
The GA Report process has three primary configuration screens: Input, which describes the input dataset; Parameters, which sets the report parameters; and Output, which configures how the output file is handled.
Click on the GA Report node to access the editor.
General
Process Name - Provide a descriptive name for the process, ideally beginning with a verb.
Process Version - Select the desired process version from the dropdown list.
Input
Input Selection - Choose the input data source.
Parameters
Report Configuration
Metrics - Comma-separated list of GA metrics that match the Key Fields and are integrated with the GA tracking code (e.g. Client ID). Metric names take the form ga:metricsname.
Dimensions - Comma-separated list of existing GA dimensions that align with the data in the Data Field (e.g. ga:dimension2,ga:dimension3,ga:dimension4).
GA View ID - The ID of the GA View to report on. A Data Import must first be created in GA, and a View is associated with that Data Import. Use the same View here.
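To illustrate how the three Report Configuration fields relate to one another, the following is a minimal sketch that turns the comma-separated Metrics and Dimensions strings and the View ID into a request body shaped like a Google Analytics Reporting API v4 report request. How the GA Report process builds its request internally is not documented here, so this mapping is an assumption. The function names, the date range arguments, and the example View ID are hypothetical.

```python
def split_csv(value):
    """Split a comma-separated field value into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_report_request(view_id, metrics, dimensions, start_date, end_date):
    """Build a Reporting API v4 style request body from the field values.

    Assumption: the process forwards the field values unchanged. The date
    range is not one of the documented fields and is included only because
    a report request needs one.
    """
    metric_list = split_csv(metrics)
    dimension_list = split_csv(dimensions)
    # GA metric and dimension names carry a "ga:" prefix.
    for name in metric_list + dimension_list:
        if not name.startswith("ga:"):
            raise ValueError(f"expected a 'ga:' prefix: {name!r}")
    return {
        "reportRequests": [
            {
                "viewId": view_id,
                "dateRanges": [{"startDate": start_date, "endDate": end_date}],
                "metrics": [{"expression": m} for m in metric_list],
                "dimensions": [{"name": d} for d in dimension_list],
            }
        ]
    }


# Hypothetical values, for illustration only.
body = build_report_request(
    view_id="12345678",
    metrics="ga:sessions, ga:users",
    dimensions="ga:dimension2,ga:dimension3",
    start_date="7daysAgo",
    end_date="today",
)
```

Checking the ga: prefix before the request is sent catches a common typo in these fields early, instead of leaving it to surface as an API error.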
Authentication
Bucket Name - Name of the bucket where the service account file resides.
File Path - Full path, including the file name, of the service account JSON file that is generated when the service account is created.
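As a rough illustration of how the two Authentication fields fit together, the sketch below joins them into a single object URI and runs a basic sanity check on the contents of a service account key file. It assumes a Google Cloud Storage bucket (hence the gs:// scheme). The function names and sample values are hypothetical, and the process itself may locate and validate the file differently.

```python
import json

# Fields that a Google service account key file is expected to contain.
REQUIRED_KEYS = ("type", "client_email", "private_key")


def service_account_uri(bucket_name, file_path):
    """Join the Bucket Name and File Path fields into a gs:// style URI."""
    return f"gs://{bucket_name.strip('/')}/{file_path.lstrip('/')}"


def check_service_account_key(raw_json):
    """Return the client email if the JSON looks like a service account key."""
    key = json.loads(raw_json)
    missing = [k for k in REQUIRED_KEYS if k not in key]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"unexpected key type: {key['type']!r}")
    return key["client_email"]


# Hypothetical bucket and path, for illustration only.
uri = service_account_uri("my-bucket", "/keys/ga-service-account.json")
```

The client email returned by the check is the account that must be granted read access to the GA View before the report can be retrieved.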
Output
This is where the user can define the dataset names and other environment-specific options for data availability.
Table Name - Defines the name of the database table where the output data will be written. Ensure that the table name is unique among all tables within the defined Event Store; otherwise, data previously written by another process will be overwritten.
Display Name - The label of the process output icon displayed on the app graph canvas.
Configurations
Partition Scheme - Defines how the output table is partitioned for storage. Options are Daily, Hourly, and None. Daily is typically chosen.
File Format - Defines the format of the output file. The available option is Parquet.
Load To BQ - BQ stands for BigQuery. This option creates a BigQuery table from the output. On AWS, the option is Load To Redshift instead. On an on-premise installation, data is normally written to HDFS and no Load To option is displayed.
Location - Generated automatically and not editable. It is based on the paths and settings of the event store the app is created in, the key of the app, and the table name given above.
Expected Output
The GA Report process should retrieve the report from GA. The data will be written to a partitioned table.
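To make the effect of the Partition Scheme setting concrete, here is a minimal sketch of how a record's timestamp could be mapped to a partition under each option. The label formats are hypothetical and chosen only for illustration. The actual partition naming used by the event store is not documented here.

```python
from datetime import datetime


def partition_label(scheme, ts):
    """Return a partition label for a timestamp, or None when unpartitioned.

    The label formats below are assumptions, not the product's real layout.
    """
    if scheme == "Daily":
        return ts.strftime("%Y-%m-%d")
    if scheme == "Hourly":
        return ts.strftime("%Y-%m-%d-%H")
    if scheme == "None":
        return None
    raise ValueError(f"unknown partition scheme: {scheme!r}")


ts = datetime(2024, 1, 15, 13, 30)
daily = partition_label("Daily", ts)    # one partition per calendar day
hourly = partition_label("Hourly", ts)  # one partition per hour
```

With Daily, every record from the same calendar day lands in the same partition, which keeps the number of partitions small for reports that are pulled once a day.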