Syntasa’s Cross-GCP Project Support introduces a decoupled architecture that separates the platform’s Control Plane from its Data Plane. This allows enterprise customers to install and manage the Syntasa platform in one GCP project while executing jobs, storing data, and managing connections in a completely separate GCP project.
This “Two-Project” model aligns with industry security standards: it enables better billing separation and compliance isolation, and it provides a foundation for future multi-tenant workspace architectures.
Key Concepts
The Control Plane (Platform Project)
The Control Plane is where the Syntasa platform orchestration and management services reside.
Components Included
- Syntasa Microservices: The GKE cluster running the UI, Job Orchestrator, and Authentication services.
- Application Metadata: Configurations, user identities, and workflow definitions.
- Orchestration Logic: Services responsible for determining when and how jobs should execute.
The Data Plane (Compute & Storage Project)
The Data Plane is where customer data, compute workloads, and storage resources reside.
Components Included
- Compute Resources: Dataproc clusters and Spark jobs.
- Storage: GCS buckets used for staging, intermediate, and output data.
- EventStore: BigQuery datasets and Pub/Sub topics.
- Connections: External source and sink connectors.
Why Use Cross-Project Support?
| Benefit | Description |
|---|---|
| Security Isolation | Keep sensitive data and compute resources in a restricted-access project while maintaining platform administration in a separate management project. |
| Billing Separation | Track and attribute cloud costs more easily, with compute and storage charges billed directly to the Data Plane project. |
| Compliance | Support regulatory requirements that mandate separation between application management and data processing environments. |
| Multi-Team Scaling | Establish the foundation for a single Syntasa installation to manage multiple independent data projects and workspaces. |
How It Works: The Two-Tier Resolution Model
Syntasa uses two-layer resolution logic to determine which GCP project should be used for a specific workload or operation. A Runtime-level override, when defined, takes precedence over the platform default.
Resolution Layers
- Platform Default: Administrators configure a global Data Plane Project ID in the Infrastructure Settings. By default, all jobs and storage operations are routed to this project.
- Runtime Override: Specific workloads can override the default by defining a GCP Project ID Override at the Runtime level.
Example Scenario
Most workloads may execute in the Marketing-Data project, while a highly restricted workload can use a Runtime configured for the Finance-Data project. In this case, Syntasa automatically routes that workload to the Finance project while leaving all other jobs unaffected.
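The resolution order described above can be illustrated with a minimal Python sketch. This is not Syntasa's implementation: the `Runtime` class, the `resolve_project` function, and the lowercase project IDs are hypothetical names chosen for illustration. The sketch only shows the precedence rule, where a Runtime override wins over the platform default.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Runtime:
    """Hypothetical stand-in for a Runtime configuration."""
    name: str
    gcp_project_id_override: Optional[str] = None


def resolve_project(platform_default: str, runtime: Runtime) -> Tuple[str, str]:
    """Return (project_id, source) for a workload.

    The Runtime override, if set, takes precedence over the platform default.
    """
    if runtime.gcp_project_id_override:
        return runtime.gcp_project_id_override, "runtime override"
    return platform_default, "platform default"


# Assumed example values mirroring the scenario above.
default_project = "marketing-data"
standard = Runtime(name="standard")
restricted = Runtime(name="finance-restricted", gcp_project_id_override="finance-data")

print(resolve_project(default_project, standard))
print(resolve_project(default_project, restricted))
```

Returning the source alongside the project ID makes it straightforward to report whether the default or an override was applied for a given job.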
Configuration & Setup
Infrastructure Settings
Administrators configure the primary Data Plane settings within the Infrastructure module.
Key Configuration Fields
| Setting | Description |
|---|---|
| Data Plane Project ID | The target GCP project used for data operations. |
| Region/Zone | The default compute location within the Data Plane project. |
| Worker Service Account | The IAM identity used to execute workloads within the Data Plane. |
IAM Permissions
For the Control Plane to communicate with the Data Plane, the Syntasa Service Account must be granted specific IAM roles in the Data Plane project.
Required Roles
| IAM Role | Purpose |
|---|---|
| roles/dataproc.editor | Dataproc job management |
| roles/storage.admin | GCS data management |
| roles/bigquery.dataEditor | EventStore access |
| roles/iam.serviceAccountUser | Identity impersonation |
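As a rough aid for administrators, the following Python sketch prints one `gcloud projects add-iam-policy-binding` command per required role. The project ID and service account email are placeholder assumptions, and the helper name `binding_commands` is hypothetical. Review the generated commands against your organization's IAM policy before running them.

```python
from typing import List

# Roles listed in the table above.
REQUIRED_ROLES = [
    "roles/dataproc.editor",
    "roles/storage.admin",
    "roles/bigquery.dataEditor",
    "roles/iam.serviceAccountUser",
]


def binding_commands(data_plane_project: str, service_account_email: str) -> List[str]:
    """Build one gcloud command per required role for the given service account."""
    member = f"serviceAccount:{service_account_email}"
    return [
        f"gcloud projects add-iam-policy-binding {data_plane_project} "
        f"--member={member} --role={role}"
        for role in REQUIRED_ROLES
    ]


# Placeholder values: substitute your own Data Plane project and service account.
for command in binding_commands(
    "my-data-plane-project",
    "syntasa-sa@my-platform-project.iam.gserviceaccount.com",
):
    print(command)
```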
Networking (Kafka Connectivity)
To allow the Control Plane to monitor jobs running in a separate GCP project, Syntasa uses a GCP TCP Load Balancer.
This provides a stable cross-project IP address for Kafka communication and avoids the limitations associated with internal cluster DNS resolution.
User Experience
Transparent Execution
For most users, the workflow experience remains unchanged. Pipelines, notebooks, and jobs are created and executed normally while the platform manages cross-project routing in the background.
Job Visibility
The Job Activity UI displays the specific GCP Project ID where each workload was executed, making it easy to identify whether the platform default or a Runtime override was used.
Error Handling
If the platform lacks the required permissions in a target project, Syntasa surfaces actionable error messages that clearly identify the missing IAM roles.
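The kind of check behind such a message can be sketched as a comparison between the roles granted in the target project and the required set. This is an illustrative sketch only: the function names and the message wording are assumptions, not the actual text Syntasa displays.

```python
from typing import Iterable, List

# Roles the Syntasa Service Account needs in the Data Plane project.
REQUIRED_ROLES = (
    "roles/dataproc.editor",
    "roles/storage.admin",
    "roles/bigquery.dataEditor",
    "roles/iam.serviceAccountUser",
)


def missing_roles(granted: Iterable[str]) -> List[str]:
    """Return the required roles absent from the granted set, in table order."""
    granted_set = set(granted)
    return [role for role in REQUIRED_ROLES if role not in granted_set]


def permission_error(project_id: str, granted: Iterable[str]) -> str:
    """Build a hypothetical error message; empty string if nothing is missing."""
    missing = missing_roles(granted)
    if not missing:
        return ""
    return f"Missing IAM roles in project '{project_id}': " + ", ".join(missing)


# Example: two of the four roles have been granted in an assumed project.
print(permission_error("finance-data", ["roles/storage.admin", "roles/dataproc.editor"]))
```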
Future Roadmap: Toward Full Workspaces
Cross-GCP Project Support serves as the foundation for a broader Multi-Workspace architecture.
Future enhancements will include:
- Isolated Storage: Per-workspace GCS bucket prefixes.
- Compute Abstraction: Support for Spark-on-Kubernetes and notebooks operating fully within the Data Plane project.
- Self-Service Provisioning: Automated provisioning of Data Plane resources directly from the Syntasa UI.