This article describes the synutils.eventstore module in the Syntasa application. Its utilities write DataFrames and files to an Event Store from Python; the Scala equivalents are shown alongside each example.
The synutils.eventstore module provides several methods for writing data from DataFrames to the Event Store. Here's a breakdown of each method and its parameters:
1. writeToEventStore(dataFrame: DataFrame, @outputDateSet1: String, noPartitions=0)
This method writes data from a DataFrame to the Event Store. With the default noPartitions value of 0, the data is written without any partitioning.
Parameters:
- dataFrame: The DataFrame containing the data to be written.
- @outputDateSet1: The user-defined name for the output dataset in the Event Store.
- noPartitions (optional): The number of partitions to create for the data (defaults to 0, meaning no partitioning).
Example:
Python
/**
*
* @param dataFrame
* @param @outputDateSet1 is a user parameter
* @param noPartitions is optional default is 0
*/
writeToEventStore(dataFrame: DataFrame, @outputDateSet1: String, noPartitions=0)
Example:
writeToEventStore(df,"@outputDateSet1")
Scala
/**
* @param dataFrame
* @param outputDateSet is a user parameter
*/
writeToEventStore(dataFrame: DataFrame, outputDateSet: String)
Example :
writeToEventStore(df,"@outputDateSet1")
2. writeToEventStore(dataFrame: DataFrame, @outputDateSet1: String, noPartitions:Int, partitionedDateColumn:String)
This method writes data from a DataFrame to the Event Store with data partitioning based on a specified date column.
Parameters:
- All parameters are the same as writeToEventStore(dataFrame: DataFrame, @outputDateSet1: String, noPartitions=0), with the addition of:
- partitionedDateColumn: The name of the column in the DataFrame to be used for partitioning the data.
Example:
Python
/**
*
* @param dataFrame
* @param @outputDateSet1 is a user parameter
* @param noPartitions is optional default is 0
* @param @partitionedDateColumn is a user parameter
*/
writeToEventStore(dataFrame: DataFrame, @outputDateSet1: String, noPartitions: Int, partitionedDateColumn: String)
Example:
writeToEventStore(df,"@outputDateSet1",1,"event_partition")
Scala
/**
* @param dataFrame
* @param outputDateSet is a user parameter
* @param numFiles : Number of files per partition. default is 0
* @param partitionedDateColumn is a user parameter
*/
writeToEventStore(dataFrame: DataFrame, outputDateSet: String, numFiles: Int, partitionedDateColumn: String)
Example:
writeToEventStore(df,"@outputDateSet1",1,"event_partition")
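Before calling the partitioned variant, it can help to confirm that the partition column actually exists in the DataFrame. The following is a minimal sketch and not part of synutils: write_partitioned, its writer argument, and FakeFrame are hypothetical names introduced for illustration. The writer is passed in so that the sketch runs outside Syntasa; inside a Syntasa process you would presumably pass writeToEventStore itself.

```python
from typing import Any, Callable


def write_partitioned(
    df: Any,
    dataset: str,
    partition_column: str,
    writer: Callable[..., Any],
    num_partitions: int = 1,
) -> None:
    """Check the partition column, then delegate to the supplied writer.

    `writer` is assumed to take the same positional arguments as
    writeToEventStore(dataFrame, outputDataset, noPartitions, partitionedDateColumn).
    """
    # Spark DataFrames expose their column names through `.columns`.
    if partition_column not in df.columns:
        raise ValueError(
            f"Partition column {partition_column!r} not found in {list(df.columns)}"
        )
    writer(df, dataset, num_partitions, partition_column)


class FakeFrame:
    """Hypothetical stand-in for a DataFrame so the sketch runs without Spark."""

    columns = ["user_id", "event_partition"]


# Record the calls instead of writing anywhere.
calls = []
write_partitioned(
    FakeFrame(),
    "@outputDateSet1",
    "event_partition",
    writer=lambda *args: calls.append(args),
)
```

Failing early with a clear message is usually cheaper than discovering a misspelled column name after the job has started.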
Notes for Hourly partitions:
- When partitioning data on an hourly basis, you must select a column that contains time values. Additionally, in the Output tab of certain code processes (such as the Spark Processor), you also need to choose Hourly as the partition type to ensure the output is generated correctly.
- Syntasa accepts the time format yyyy-MM-dd-HH. If your input source uses a different time format, you can convert it to a Syntasa-compatible format with the following code before writing to the Event Store:
from pyspark.sql.functions import date_format, to_timestamp

df = df.withColumn(
    "modified_time_syntasa",
    date_format(to_timestamp("modified_time", "yyyy-MM-dd-HH:mm:ss"), "yyyy-MM-dd-HH")
)
- Explanation:
modified_time – Column in the input source containing the original time.
yyyy-MM-dd-HH:mm:ss – Current time format of the modified_time column.
yyyy-MM-dd-HH – Target time format accepted by Syntasa (or any format you want to convert to).
modified_time_syntasa – New column that stores the time in Syntasa-compatible format. This column can be used as input to the writeToEventStore() method.
You can also overwrite the original column by using the same column name (modified_time) instead of creating a new one (modified_time_syntasa). This updates the time format in place, and you can use the same column in subsequent code.
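To sanity-check a format string before running the Spark job, the same conversion can be reproduced on a single value in plain Python. This is only an illustrative sketch: to_syntasa_hour is a hypothetical helper, and the strptime/strftime codes below are the Python equivalents of the Spark patterns yyyy-MM-dd-HH:mm:ss and yyyy-MM-dd-HH.

```python
from datetime import datetime

# Python equivalents of the Spark patterns used in the conversion.
SOURCE_FORMAT = "%Y-%m-%d-%H:%M:%S"  # yyyy-MM-dd-HH:mm:ss
TARGET_FORMAT = "%Y-%m-%d-%H"        # yyyy-MM-dd-HH


def to_syntasa_hour(value: str) -> str:
    """Parse a timestamp in the source format and truncate it to the hour."""
    parsed = datetime.strptime(value, SOURCE_FORMAT)
    return parsed.strftime(TARGET_FORMAT)


converted = to_syntasa_hour("2024-03-15-14:30:45")
print(converted)  # 2024-03-15-14
```

A value that does not match the source format raises ValueError here, which makes a wrong pattern easy to spot on a sample row.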
3. writeFileToEventStore(localFilePath: String, eventStoreFilePath: String)
This method writes the contents of a local file to the Event Store.
Parameters:
- localFilePath: The path to the local file on the system.
- eventStoreFilePath: The desired path for the file within the Event Store.
Example:
Python
/**
*
* @param localFilePath
* @param eventStoreFilePath
*/
writeFileToEventStore(localFilePath: String, eventStoreFilePath: String)
Example:
writeFileToEventStore("filePath","eventstoreFilePath")
Scala
/**
* @param localFilePath
* @param eventStoreFilePath
*/
writeFileToEventStore(localFilePath: String, eventStoreFilePath: String)
Example :
writeFileToEventStore("filePath","eventstoreFilePath")
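A common failure when uploading files is a wrong local path. The sketch below, which is not part of synutils, checks that the file exists before handing it to a writer. upload_file and its writer argument are hypothetical names; the writer is injected so the example runs anywhere, and in Syntasa you would presumably pass writeFileToEventStore.

```python
import os
import tempfile
from typing import Any, Callable


def upload_file(
    local_path: str,
    event_store_path: str,
    writer: Callable[[str, str], Any],
) -> None:
    """Verify the local file exists, then delegate to the supplied writer.

    `writer` is assumed to take (localFilePath, eventStoreFilePath),
    matching the writeFileToEventStore signature.
    """
    if not os.path.isfile(local_path):
        raise FileNotFoundError(f"Local file not found: {local_path}")
    writer(local_path, event_store_path)


# Create a small temporary file to stand in for real data.
with tempfile.NamedTemporaryFile(suffix=".csv", delete=False) as handle:
    handle.write(b"id,value\n1,a\n")
    sample_path = handle.name

# Record the calls instead of uploading anywhere.
uploads = []
upload_file(
    sample_path,
    "eventstoreFilePath",
    writer=lambda local, remote: uploads.append((local, remote)),
)
os.remove(sample_path)
```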
These utilities provide a convenient way to interact with the Event Store from your Syntasa application using Python. With these methods, you can write DataFrames and local files to the store for further processing or analysis.