Runtimes are the compute resources used to execute the processes and logic of jobs and notebooks defined in the Syntasa platform. Users define runtime templates that configure the required cluster or container type, size, and settings.
Runtime templates are available for selection in jobs, notebooks, and the interactive mode of apps. Each use in one of these areas starts a new instance of the runtime template, so several instances of the same template may be active at a given time.
Categorizing runtime types
Several runtime types are available. For easier understanding, they fall into two primary groups, container runtimes and cloud-native runtimes, plus a third group, streaming runtimes, that is a slight variation of the runtimes found in the two primary groups.
Container runtimes are launched on the same Kubernetes cluster that the Syntasa platform is using. They have few or no settings to choose from because they are defined by the subsequently selected image. The image names, e.g. Syntasa Base Image or Syntasa Dash Image, help identify the use for which each image is optimized; this is detailed in the article Kubernetes Container.
Cloud-native runtimes, i.e. those with "cluster" in the name, use the appropriate cloud-native service (EMR in AWS; Dataproc in GCP; Synapse in Azure) to run jobs, execute code in notebooks, etc. These runtimes expose the settings you would see in the cloud-native service, for example the master and worker instance types, the number of worker instances, and the max uptime.
Streaming runtimes should only be used for streaming jobs, i.e. jobs that run continuously until manually stopped. Non-streaming runtimes, by contrast, are used for batch jobs and notebooks, and are shut down once the job completes or the inactivity or max timeout is reached.
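The grouping above can be sketched in a few lines of Python. This is only an illustration, not part of the Syntasa platform: the `RuntimeGroup` class and `classify_runtime` function are hypothetical names, and detecting streaming runtimes by the word "Stream" in the type name is an assumption based on how the runtime types are named in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuntimeGroup:
    base: str        # "cloud-native" or "container"
    streaming: bool  # True for the streaming variation of either group


def classify_runtime(type_name: str) -> RuntimeGroup:
    """Group a runtime type by its display name (hypothetical helper)."""
    words = type_name.lower().split()
    # Per the article, cloud-native runtimes have "cluster" in the name.
    base = "cloud-native" if "cluster" in words else "container"
    # Assumption: streaming variants carry the word "Stream" in the name.
    return RuntimeGroup(base=base, streaming="stream" in words)


for name in ("GCP Dataproc Cluster", "Kubernetes Container", "AWS EMR Stream Cluster"):
    print(name, "->", classify_runtime(name))
```

The streaming flag is kept separate from the base group because a streaming runtime is a variation of a container or cloud-native runtime rather than a distinct kind of compute.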
The runtime types available depend on the cloud platform on which the Syntasa environment is installed; the settings available for a runtime template depend on the runtime type selected. The following runtime types are available for GCP and AWS:
Google Cloud Platform (GCP)
- GCP Dataproc Cluster
- Kubernetes Container
- GCP Dataproc Stream Cluster
- Kubernetes Stream Container
Amazon Web Services (AWS)
- AWS EMR Cluster
- Kubernetes Container
- AWS EMR Stream Cluster
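The lists above can also be expressed as a simple lookup, for example to validate a runtime type against a platform in a script. This is a minimal sketch: the `RUNTIME_TYPES` mapping and the `available_runtime_types` function are hypothetical names rather than a Syntasa API, and the entries only mirror the lists in this article.

```python
from typing import Dict, List

# Runtime types per cloud platform, copied from the lists in this article.
RUNTIME_TYPES: Dict[str, List[str]] = {
    "GCP": [
        "GCP Dataproc Cluster",
        "Kubernetes Container",
        "GCP Dataproc Stream Cluster",
        "Kubernetes Stream Container",
    ],
    "AWS": [
        "AWS EMR Cluster",
        "Kubernetes Container",
        "AWS EMR Stream Cluster",
    ],
}


def available_runtime_types(platform: str) -> List[str]:
    """Return the runtime types listed for a platform (hypothetical helper)."""
    key = platform.upper()
    if key not in RUNTIME_TYPES:
        raise ValueError(f"No runtime types listed for platform: {platform!r}")
    # Return a copy so callers cannot modify the shared mapping.
    return list(RUNTIME_TYPES[key])


print(available_runtime_types("gcp"))
```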