Once an external dataset has been created, it can be used as the output destination for an application process in Syntasa. The transformed data generated by the process (for example, a Spark processor) is then written directly to the external location, according to the workflow configuration.
Steps to Select an External Dataset
Follow the steps below to configure an app process to use an external dataset:
- Open the application and navigate to the process for which you want to configure the output (for example, a Spark processor).
- Go to the Output tab of the selected process.
- Enable the External Dataset toggle.
- Once the toggle is enabled, a dropdown list becomes available. It displays all external datasets created under the Event Store used by the application.
- Select the required external dataset from the dropdown list.
- Click the tick (✔) icon to save the process configuration.
- Save the workflow.
At this point, the process is successfully configured to use the selected external dataset as its output destination.
Execution Behavior
- When the process is executed, the output data is written to the external location associated with the selected dataset, but only if the dataset is enabled for the current workflow (Development or Production).
- If the external dataset is not enabled for the active workflow, the output is written to the default Syntasa-managed internal path.
- When a job runs in DROP & REPLACE process mode, Syntasa does not drop the external dataset or its underlying data, so the data in the external location remains intact.
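The rules above can be summarized as a small decision function. The sketch below is illustrative only and is not Syntasa's implementation: the `ExternalDataset` class, the `resolve_output_path` and `may_drop_target` functions, and the example paths are all hypothetical names chosen for this example. It also assumes that internal targets are dropped in DROP & REPLACE mode, which this article does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ExternalDataset:
    """Hypothetical stand-in for an external dataset definition."""
    name: str
    external_path: str
    # Workflows for which the dataset is enabled, e.g. {"Development", "Production"}.
    enabled_envs: Set[str] = field(default_factory=set)


def resolve_output_path(dataset: Optional[ExternalDataset],
                        environment: str,
                        internal_path: str) -> str:
    """Pick the write destination for one process run."""
    if dataset is not None and environment in dataset.enabled_envs:
        # Dataset selected and enabled for this workflow: write externally.
        return dataset.external_path
    # Otherwise fall back to the default Syntasa-managed internal path.
    return internal_path


def may_drop_target(process_mode: str, target_is_external: bool) -> bool:
    """External data is never dropped, even in DROP & REPLACE mode.

    Assumption (not stated in this article): internal targets are
    dropped when the process mode is DROP & REPLACE.
    """
    return process_mode == "DROP & REPLACE" and not target_is_external


if __name__ == "__main__":
    ds = ExternalDataset("orders_ext", "s3://example-bucket/orders", {"Production"})
    print(resolve_output_path(ds, "Production", "/internal/orders"))
    print(resolve_output_path(ds, "Development", "/internal/orders"))
    print(may_drop_target("DROP & REPLACE", target_is_external=True))
```

In this sketch, a dataset enabled only for Production sends Production runs to the external path, while Development runs fall back to the internal path.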
Previewing Output Data
After the job execution completes:
- You can preview the output data directly from the output node of the process.
- The previewed data is fetched directly from the external location, not from the Syntasa internal storage.
This lets users verify the data written to the external storage without leaving the Syntasa application.
Using External Dataset as Input
An external dataset can also be used as an input through the Event Store. From a user perspective, using an external dataset as input works exactly the same way as using an internal dataset.
To configure this:
- Select Event Store as the input type.
- Choose the required Event Store.
- Select the external dataset from the dataset list.
No additional configuration or changes to the process are required. Once selected, the process reads data directly from the external location referenced by the dataset.
Deployment Logic
When an application is deployed from Development to Production, Syntasa applies a reuse strategy for external datasets, matching them by name.
Scenario 1: If a Production external dataset with the same name already exists, Syntasa reuses the existing Production dataset ID and does not create a new dataset.
Scenario 2: If no matching Production external dataset is found, Syntasa creates a new Production dataset using the production event store path.
Note: During deployment, external datasets should always have the copy mode set to None.
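The two deployment scenarios amount to a lookup-or-create step keyed on the dataset name. The following sketch illustrates that logic under stated assumptions; it is not Syntasa's actual code. `ProdDataset`, `deploy_external_dataset`, the integer IDs, and the way the new dataset's path is built from the production event store path are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class ProdDataset:
    """Hypothetical record for a Production external dataset."""
    dataset_id: int
    name: str
    path: str


def deploy_external_dataset(name: str,
                            prod_datasets: Dict[str, ProdDataset],
                            prod_event_store_path: str,
                            next_id: int) -> Tuple[ProdDataset, bool]:
    """Return (dataset, created) for a dataset being deployed to Production."""
    existing = prod_datasets.get(name)
    if existing is not None:
        # Scenario 1: same name already exists, so reuse its ID.
        return existing, False
    # Scenario 2: no match, so create a dataset under the production
    # event store path. The path layout here is an assumption.
    created = ProdDataset(next_id, name, f"{prod_event_store_path.rstrip('/')}/{name}")
    prod_datasets[name] = created
    return created, True


if __name__ == "__main__":
    prod = {"orders_ext": ProdDataset(7, "orders_ext", "s3://example-prod/orders_ext")}
    reused, created = deploy_external_dataset("orders_ext", prod, "s3://example-prod", 8)
    print(reused.dataset_id, created)
    fresh, created = deploy_external_dataset("clicks_ext", prod, "s3://example-prod", 8)
    print(fresh.dataset_id, fresh.path, created)
```

Keying the lookup on the name means that repeated deployments of the same application resolve to the same Production dataset rather than accumulating duplicates.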