For non-partitioned tables, Syntasa’s standard process modes typically operate on an “all-or-nothing” basis. For example, the Drop & Replace mode deletes the entire table and recreates it with the current session’s data.
The Code Managed process mode provides a more flexible alternative. It allows developers to perform granular operations—such as appending to a master list, updating specific rows, or refreshing a lookup table—without the platform automatically triggering a full table overwrite.
How “Code Managed” Affects Non-Partitioned Steps
When a job step targeting a non-partitioned dataset is set to Code Managed, the platform changes its execution logic in two key ways:
Bypassing the “Skip Check”
Normally, if the platform detects no new input data for a non-partitioned source, it may skip the step to save compute costs.
- Code Managed Behavior: The system always executes the step. This is critical for non-partitioned reference tables that may need to be refreshed from an external API or a database where Syntasa cannot automatically track “new” records via file timestamps.
Full Data Visibility
In standard modes, the platform attempts to scope the data to the specific “Dates to Process” defined in the job.
- Code Managed Behavior: The platform provides your code with the entire date range requested for the job. Since the table is not partitioned, your code has the freedom to read the existing table, join it with new data, and decide exactly how to write the result back.
Common Use Cases
Maintaining Reference & Lookup Tables
If you are maintaining a “Master Customer List” or a “Product Catalog” that is not partitioned by date, you often need to perform an Upsert (Update or Insert).
- Standard Mode: Would require dropping the whole catalog and rewriting it.
- Code Managed: Allows you to use a MERGE statement to update only the changed records, preserving the rest of the table.
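The upsert described above can be sketched as follows. This is a minimal illustration, not Syntasa's own implementation. It assumes the target table is stored in a format whose Spark SQL integration supports MERGE INTO (Delta Lake, for example); plain Parquet tables do not. The table names `product_catalog` and `catalog_updates`, the key `product_id`, and the helper `build_merge_sql` are all hypothetical.

```python
from typing import Sequence


def build_merge_sql(target: str, source: str, keys: Sequence[str],
                    columns: Sequence[str]) -> str:
    """Build a MERGE statement that updates matched rows and inserts new ones.

    Identifiers are interpolated as-is, so pass only trusted table and
    column names.
    """
    if not keys:
        raise ValueError("at least one key column is required")
    # Only non-key columns are updated when a row matches.
    update_cols = [c for c in columns if c not in keys]
    if not update_cols:
        raise ValueError("at least one non-key column is required")
    on_clause = " AND ".join(f"t.{k} = s.{k}" for k in keys)
    set_clause = ", ".join(f"{c} = s.{c}" for c in update_cols)
    all_cols = list(keys) + update_cols
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{c}" for c in all_cols)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {source} AS s\n"
        f"ON {on_clause}\n"
        f"WHEN MATCHED THEN UPDATE SET {set_clause}\n"
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )


merge_sql = build_merge_sql(
    target="product_catalog",   # hypothetical non-partitioned target table
    source="catalog_updates",   # hypothetical temp view holding changed rows
    keys=["product_id"],
    columns=["product_id", "name", "price"],
)
print(merge_sql)

# Inside a Code Managed step, assuming `spark` is the session provided by
# the runtime, the statement would then be executed with:
# spark.sql(merge_sql)
```

Building the statement as a string keeps the sketch runnable without a Spark session; rows that do not appear in `catalog_updates` are left untouched by the MERGE.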
Accumulating Snapshots
Some tables act as a running log or an audit trail to which you only ever want to append new rows.
- Standard Mode: Might default to overwriting the table if not configured carefully.
- Code Managed: Gives you total control to use .mode("append") in your Spark code, ensuring historical data is never touched by the platform’s automation.
External System Integration
Sometimes a step’s primary purpose is to push data to an external system (like a CRM or a marketing tool) rather than to write to a Syntasa table.
- Code Managed: Ensures the step runs every time the job is triggered, allowing your code to handle the connection and data transfer logic independently.
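A push to an external system could be structured roughly as below. This is a sketch under stated assumptions: the batching helper, the `push_records` function, the record fields, and the JSON-over-HTTP transport are all hypothetical, and a real CRM or marketing tool will have its own API, authentication, and rate limits.

```python
import json
import urllib.request
from typing import Callable, Dict, Iterable, Iterator, List

Record = Dict[str, object]


def batched(records: Iterable[Record], size: int) -> Iterator[List[Record]]:
    """Yield lists of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[Record] = []
    for record in records:
        batch.append(record)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def post_json(url: str, payload: List[Record], timeout_s: float = 30.0) -> None:
    """POST one batch as JSON; urlopen raises on HTTP error statuses."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s):
        pass


def push_records(records: Iterable[Record],
                 send: Callable[[List[Record]], None],
                 batch_size: int = 500) -> int:
    """Send records batch by batch and return how many were sent."""
    sent = 0
    for batch in batched(records, batch_size):
        send(batch)
        sent += len(batch)
    return sent


# Demo with an in-memory sender so the sketch runs without a network.
received: List[List[Record]] = []
rows = [{"customer_id": i, "segment": "vip"} for i in range(5)]
total = push_records(rows, send=received.append, batch_size=2)
print(total, [len(b) for b in received])
```

In a real step, the records might come from `(row.asDict() for row in df.toLocalIterator())`, and `send` might wrap `post_json` with the external system's endpoint. Passing the sender in as a function keeps the transfer logic easy to test without touching the external system.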
Implementation Example (Python/Spark)
In a Notebook or Code step, you define the write logic explicitly. Because the mode is “Code Managed,” the platform does not interfere with the target table.
# Example: Appending to a non-partitioned Audit Table
new_audit_logs = spark.sql("SELECT ...")
# 'append' adds the new rows to the existing table instead of overwriting it;
# in Code Managed mode the platform does not drop the table beforehand
new_audit_logs.write \
    .format("parquet") \
    .mode("append") \
    .saveAsTable("audit_log_master")
Best Practices for Non-Partitioned Data
- Manual Schema Management: Since the platform isn’t managing the table creation in this mode, ensure your code handles schema evolution. If you add a column to your DataFrame, you may need to manually ALTER TABLE or use a format like Delta Lake that supports mergeSchema.
- Handle Empty Inputs: Because the “Skip Check” is bypassed, your code will run even if the input DataFrame is empty. Include a check in your code (e.g., if df.count() > 0:, or the cheaper if df.head(1): which avoids counting every row) to avoid unnecessary processing or empty writes.
- Avoid Table Locks: When writing to non-partitioned tables in a high-concurrency environment, be mindful of table-level locks. Since you are managing the write, ensure your code handles potential write conflicts gracefully.
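The three practices above can be sketched with a few small helpers. This is an illustration only: the helper names, the retry policy, and the choice to retry on any exception are assumptions, and the right exception types to retry on depend on your table format and cluster.

```python
import time
from typing import Callable, Dict, List, Tuple, Type


def add_columns_ddl(table: str, table_cols: Dict[str, str],
                    df_cols: Dict[str, str]) -> List[str]:
    """Return ALTER TABLE statements for columns the table is missing."""
    existing = {name.lower() for name in table_cols}
    return [
        f"ALTER TABLE {table} ADD COLUMNS ({name} {dtype})"
        for name, dtype in df_cols.items()
        if name.lower() not in existing
    ]


def has_rows(df) -> bool:
    """True if the DataFrame has at least one row; head(1) avoids a full count."""
    return len(df.head(1)) > 0


def write_with_retry(write: Callable[[], None], attempts: int = 3,
                     base_delay_s: float = 2.0,
                     retry_on: Tuple[Type[BaseException], ...] = (Exception,),
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Call `write`, retrying with exponential backoff; return attempts used."""
    for attempt in range(1, attempts + 1):
        try:
            write()
            return attempt
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay_s * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


# Demo with stand-ins so the sketch runs without a Spark session.
ddl = add_columns_ddl(
    "audit_log_master",
    table_cols={"event_id": "bigint", "message": "string"},
    df_cols={"event_id": "bigint", "message": "string", "source": "string"},
)
print(ddl)

calls = {"n": 0}


def flaky_write() -> None:
    """Fail twice with a simulated conflict, then succeed."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("simulated write conflict")


attempts_used = write_with_retry(flaky_write, attempts=5, sleep=lambda s: None)
print(attempts_used)
```

In a real step, the two column mappings could come from `dict(spark.table("audit_log_master").dtypes)` and `dict(df.dtypes)`, each generated statement would be run with `spark.sql(...)`, and the write passed to `write_with_retry` would be guarded by `if has_rows(df):`.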
Summary Comparison
| Feature | Standard (Drop & Replace) | Code Managed |
| --- | --- | --- |
| Execution | Only if new data is detected | Always runs |
| Target Table | Dropped and Recreated | Left untouched by platform |
| Write Logic | Automated by Syntasa | Defined in your Code |
| Data Retention | Only current session data | Full control (Append/Update/Overwrite) |