The To File process in Syntasa allows you to export processed data from the platform into external file formats like CSV or text files. When you choose the Standard output format, the system creates a comma-separated (CSV-style) file that is easy to share, analyze, or integrate with external tools. This article guides you step by step through the configuration options for generating a Standard output file, including an explanation of each setting visible in the Parameters section of the To File process.
Parameters Screen
On the Parameters screen, you configure the fields for the output file. When you configure the To File process and choose the Standard file format, you will see several input fields and toggles that control how the file is generated.
Here is the explanation of each field shown on the parameter screen:
Output File Format
This dropdown lets you choose between two formats: Standard and Key-Value. To create a CSV-style file, select Standard. This is ideal when your data is structured in rows and columns, like a typical spreadsheet or table.
Incremental Load
This toggle should be enabled if your input data is partitioned (for example, by date or hour) and you want to export only the most recent or specific partitions. If it's disabled, the output file will include all records regardless of partitioning.
Field Delimiter
This defines the character used to separate values in the output file. Common delimiters include commas (,), tabs (\t), or pipes (|). For instance, using a comma as the delimiter will generate lines like this:
Name,Age,City
John,30,Boston
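To see how the delimiter choice changes each line, here is a minimal Python sketch using the standard csv module. It is only an illustration of delimited output in general, not Syntasa's own implementation; the render helper and the sample rows are hypothetical.

```python
import csv
import io

# Sample rows matching the example above (header row first).
rows = [["Name", "Age", "City"], ["John", "30", "Boston"]]

def render(rows, delimiter):
    """Serialize rows to text using a single-character field delimiter."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()

comma_text = render(rows, ",")   # comma-separated
pipe_text = render(rows, "|")    # pipe-separated
tab_text = render(rows, "\t")    # tab-separated

print(comma_text)
print(pipe_text)
```

Whatever reads the file downstream needs to be configured with the same delimiter that was chosen here.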
Include Headers
If your input data includes column headers and you want them to appear in the first row of your output file, turn this toggle ON. This is helpful when the output file is meant to be opened in tools like Excel or imported into other systems.
Compression Type
This dropdown allows you to compress the output file using one of the supported formats: Gzip, Lz4, Bzip2, or Snappy. If you don’t want any compression, select None (which is the default setting). Choose a compression type if you're working with large files or want to save storage space.
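If a downstream script consumes a compressed export, it has to decompress the file before parsing it. The sketch below assumes Gzip compression and a comma-delimited file with headers, and uses only Python's standard library; the sample bytes stand in for a real exported file. Bzip2 can be handled the same way with the standard bz2 module, while Lz4 and Snappy typically require third-party packages.

```python
import csv
import gzip
import io

# Stand-in for the bytes of a Gzip-compressed export (assumed content).
raw = "Name,Age,City\nJohn,30,Boston\n".encode("utf-8")
compressed = gzip.compress(raw)

# Decompress, then parse the text using the header row for field names.
text = gzip.decompress(compressed).decode("utf-8")
records = list(csv.DictReader(io.StringIO(text)))

print(records[0]["City"])
```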
Output File Name
Enter the name you want the output file to have. You can also use dynamic parameters to make filenames meaningful and unique. For example:
- output_{@ROWS}.csv results in a file like output_150.csv if 150 rows are exported.
- data_{@FROM_DATE}.csv inserts the start date of the job execution, resulting in a file like data_2025-01-01.csv.
This makes it easier to track files by time or size. Refer to the separate guide Parameters in To FILE Process for a full list of supported variables.
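The substitution of dynamic parameters can be pictured with a small Python sketch. This is an assumption about the general idea only: the expand_file_name helper is hypothetical, and the ISO date format used for {@FROM_DATE} is assumed rather than confirmed by this guide.

```python
import re
from datetime import date

def expand_file_name(template, values):
    """Replace each {@NAME} token with its value; unknown tokens stay as-is."""
    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))
    return re.sub(r"\{@([A-Z_]+)\}", substitute, template)

# Hypothetical values for one job run.
values = {"ROWS": 150, "FROM_DATE": date(2025, 1, 1).isoformat()}

rows_name = expand_file_name("output_{@ROWS}.csv", values)
date_name = expand_file_name("data_{@FROM_DATE}.csv", values)
untouched = expand_file_name("part_{@SPLIT}.csv", values)  # no SPLIT value given

print(rows_name)
print(date_name)
```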
External File Path
This is the path in your connected cloud storage where the output file will be saved. Note that you should provide only the directory path, without the bucket name. The bucket name is taken from the connected output connection. For example, entering /demo/to_file/standard will place the file in that directory in your cloud storage (e.g., S3 or GCS).
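As a rough illustration of how the pieces fit together, the sketch below combines a bucket name, the External File Path, and the Output File Name into a full storage location. The bucket name example-bucket, the s3:// scheme, and the build_object_key helper are all hypothetical; in Syntasa the bucket comes from the output connection, not from this field.

```python
import posixpath

def build_object_key(external_path, file_name):
    """Join the directory path and file name into a bucket-relative key."""
    return posixpath.join(external_path.strip("/"), file_name)

bucket = "example-bucket"  # hypothetical; supplied by the output connection
key = build_object_key("/demo/to_file/standard", "output_150.csv")
uri = f"s3://{bucket}/{key}"

print(uri)
```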
Max Split Size
This optional field controls the maximum size (in MB) of each output file by either merging smaller input files together or splitting larger input files into smaller ones. Please note that while the merge functionality is currently supported, the split functionality (breaking a large file into smaller chunks) is planned for a future release. Here is how the merge functionality works:
This merge feature is particularly useful when your input data is partitioned and exists as multiple smaller files. When you specify a value in this field, Syntasa will keep merging input files until the combined size reaches just under the specified limit. For example, if you have five partition files of 5 MB each and you set the Max Split Size to 16 MB, Syntasa will generate two output files. The first file will contain the merged data from the first three partitions (5 + 5 + 5 = 15 MB), because adding the fourth would exceed the 16 MB threshold. The system then starts a new group and merges the remaining two files (5 + 5 = 10 MB) into the second output file. This way, the final output consists of two files — one approximately 15 MB and another 10 MB — staying within the specified size limit.
To distinguish these files, you can use the @SPLIT parameter in the output file name. For example, if the file name is output_{@SPLIT}.csv, the generated files will be named:
output_1.csv
output_2.csv
If you do not include the {@SPLIT} parameter in the file name when multiple files are generated, every file gets the same name and each new file replaces the previous one. As a result, only the last file remains visible as the final output.
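The merge behavior described above can be sketched as a simple greedy grouping. This is a minimal illustration, not Syntasa's actual implementation: the group_by_max_size helper is hypothetical, and it assumes that a group may reach exactly the limit and that a single input larger than the limit forms its own group (since splitting is not yet supported).

```python
def group_by_max_size(sizes_mb, max_split_mb):
    """Greedily merge consecutive inputs while the total stays within the limit."""
    groups = []
    current = []
    current_total = 0
    for size in sizes_mb:
        # Close the current group if adding this input would exceed the limit.
        if current and current_total + size > max_split_mb:
            groups.append(current)
            current, current_total = [], 0
        current.append(size)
        current_total += size
    if current:
        groups.append(current)
    return groups

# Five 5 MB partition files with Max Split Size set to 16 MB.
groups = group_by_max_size([5, 5, 5, 5, 5], 16)
totals = [sum(group) for group in groups]

# With {@SPLIT} in the name, each group gets its own file.
split_names = [f"output_{i}.csv" for i in range(1, len(groups) + 1)]

# Without {@SPLIT}, every group writes to the same name, so only the last survives.
written = {}
for total in totals:
    written["output.csv"] = total

print(totals, split_names, written)
```

The last part mirrors the overwrite behavior: a fixed name leaves a single file holding only the final group's data.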
Additional Files
The Additional Files section in the To File process allows you to generate extra files alongside the main output file. These files are often used to log metadata, create status indicators (such as success flags), or pass custom messages to downstream systems. You can define both the file name and its content, which will be created in the same directory as the primary output file.
An important behavior to understand is that additional files are only created if the primary output file is successfully generated. This ensures that any downstream system relying on these files will only act when the main file is available and valid. You can configure multiple additional files under a single TO FILE process, and each one will be created as soon as the corresponding primary file is generated.
When working with merged or split output, especially where the Max Split Size setting is used, the system may generate multiple output files (e.g., output_1.csv, output_2.csv, etc.). In such cases, it is crucial to handle additional files correctly to prevent unintended overwrites. This is where the {@SPLIT} parameter becomes essential.
If you use the {@SPLIT} parameter in the names of your additional files, Syntasa will create a unique version of each additional file for each primary file. For example, suppose your primary output is split into two files: output_1.csv and output_2.csv. If you define two additional files as success_{@SPLIT}.txt and info_{@SPLIT}.txt, the system will generate:
- success_1.txt and info_1.txt for output_1.csv
- success_2.txt and info_2.txt for output_2.csv
This ensures that each primary file has its own set of supporting files.
However, if you do not include the {@SPLIT} parameter in the additional file name, all output file chunks will attempt to create an additional file with the same name, such as success.txt. In that case, the file created by the first output (e.g., output_1.csv) will be overwritten by the second (e.g., output_2.csv), leading to potential data loss or confusion.
Therefore, when using Max Split Size or expecting multiple output files, it is highly recommended to include the {@SPLIT} token in additional file names to maintain uniqueness and avoid overwriting.
In summary, additional files are powerful for signaling and metadata logging, but they must be carefully configured—especially in multi-file outputs—to ensure they align one-to-one with each primary output file.
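As an illustration of how a downstream job might rely on this one-to-one pairing, the sketch below picks up only those primary files whose matching success flag is present. It is a hypothetical consumer, not part of Syntasa: a temporary local directory stands in for the cloud storage location, and the output_{@SPLIT}.csv and success_{@SPLIT}.txt naming follows the example above.

```python
import tempfile
from pathlib import Path

def ready_outputs(directory):
    """Return primary files whose matching success_<n>.txt flag exists."""
    folder = Path(directory)
    ready = []
    for data_file in sorted(folder.glob("output_*.csv")):
        split = data_file.stem.split("_", 1)[1]  # "output_1" -> "1"
        if (folder / f"success_{split}.txt").exists():
            ready.append(data_file.name)
    return ready

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    # Simulate a moment when only the first file's flag has been written.
    for name in ("output_1.csv", "output_2.csv", "success_1.txt"):
        (folder / name).write_text("placeholder\n")
    result = ready_outputs(folder)

print(result)
```

Had the flag been named success.txt without {@SPLIT}, this kind of per-file check would not be possible, because a single flag cannot indicate which primary file it belongs to.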