Available in Syntasa environments installed in AWS, the AWS EMR Stream Cluster runtime type uses the cloud service to run jobs, execute code in notebooks, etc., and exposes the various settings seen in the cloud service.
This runtime type is a streaming runtime and should only be used for streaming jobs, i.e., jobs that run continuously until manually stopped. Non-streaming runtimes, by contrast, are used for batch jobs and notebooks and are shut down once the job completes or an inactivity or maximum timeout is reached.
The basic runtime attributes required for all runtime types are detailed in Creating Runtime Templates; the settings available for this runtime type are detailed below. Most fields are similar to those found in the AWS EMR Cluster runtime type, and the streaming-specific differences are called out in each section.
Instance type and options
The AWS EMR Stream Cluster runtime type enables several fields related to the master and worker instance types required for the runtime. The various instance types can be reviewed in AWS's Supported Instance Types article.
The fields are the same as those found in the AWS EMR Cluster runtime type, except that the "max uptime" fields are excluded here because this runtime type is intended for streaming jobs that run continuously.
There are several applications that can be utilized with the AWS EMR Stream Cluster runtime. Checking the associated checkbox will enable the feature(s) on new cluster instances of the runtime template.
There are also Spark configurations available. Key settings related to the number of cores and memory have default values but can be adjusted as needed. Other values available for configuration are detailed in the Apache Spark documentation on Spark Configuration and Running Spark on YARN.
The default configurations include those set as defaults in the AWS EMR Cluster runtime type, plus a number of additional settings that support the streaming use case.
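As a rough illustration of the kind of Spark settings discussed above, the sketch below collects a few core, memory, and streaming-related properties in a Python dictionary and renders them as spark-submit style arguments. The property names are taken from the Apache Spark documentation, but the values are placeholders and are not the Syntasa defaults. The helper name to_submit_args is hypothetical, and rendering the settings as --conf arguments is only an assumption made for the example; it is not a description of how the runtime template applies them.

```python
# Illustrative Spark properties for a long-running streaming cluster.
# The values below are placeholders, NOT the defaults shipped with the runtime type.
spark_conf = {
    "spark.executor.cores": "4",                     # cores per executor
    "spark.executor.memory": "8g",                   # heap memory per executor
    "spark.driver.memory": "4g",                     # heap memory for the driver
    "spark.streaming.backpressure.enabled": "true",  # throttle intake under load
    "spark.yarn.maxAppAttempts": "4",                # YARN application retry limit
}


def to_submit_args(conf):
    """Render a dict of Spark properties as a flat list of --conf arguments.

    Hypothetical helper for illustration; keys are sorted so the output
    order is deterministic.
    """
    args = []
    for key, value in sorted(conf.items()):
        args.extend(["--conf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    # Print the settings in the form accepted by spark-submit.
    print(" ".join(to_submit_args(spark_conf)))
```

Keeping the settings in a single mapping makes it easy to compare a streaming template against a batch template and see which properties were added or changed.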