Session Policy – SYNTASA™

AWS only

Session Policy applies to AWS-hosted data planes only. If you are on GCP or Azure, this chapter does not apply to you: Data Authorization is still in force, but the credential-scoping mechanism described here is not available outside AWS. Skip to How They Work Together for the comparison.

Session Policy is the platform's second layer of defense. Where Data Authorization catches anything that goes through the Spark engine, Session Policy catches what the Spark engine never sees: a boto3.client('s3').get_object(...) call inside a notebook cell, a Hadoop S3A read from outside Spark, or anything else that reaches AWS directly. It works by scoping the AWS credentials injected into your kernel and your jobs, so those credentials cannot reach anything outside your data planes, even if your code tries.

Both layers come from the same source of truth — your data plane assignments — and both are entirely automatic. You don't configure anything, pass credentials, or change your code. The only thing you might notice is an access denied error from the AWS SDK if you try to reach a bucket or catalog database outside your data planes.

How access scoping works

Each data plane you are assigned to has associated S3 paths and Glue catalog databases. When Session Policy is active, the platform generates temporary AWS credentials limited to exactly those resources — the S3 prefixes and Glue ARNs that belong to your data planes. The broader infrastructure-level permissions still exist, but your session credentials cannot exercise them.
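The platform's actual policy generation is not documented here. As a rough illustration only, the sketch below assumes the scoping is built on AWS STS session policies, where the effective permissions are the intersection of the underlying role and the policy attached to the session. The function name, the chosen actions, and the example bucket, prefix, and ARN are all hypothetical.

```python
import json


def build_session_policy(s3_prefixes, glue_arns):
    """Build an IAM policy document limited to the given S3 prefixes and Glue ARNs.

    s3_prefixes: iterable of (bucket, prefix) pairs, e.g. ("example-bucket", "plane-a/").
    glue_arns:   iterable of Glue resource ARNs (catalog, database, table) to allow.
    Both inputs are hypothetical stand-ins for a user's data plane assignments.
    """
    s3_prefixes = list(s3_prefixes)
    glue_arns = list(glue_arns)
    statements = []

    # Object-level access is limited to keys under each assigned prefix.
    object_arns = [f"arn:aws:s3:::{bucket}/{prefix}*" for bucket, prefix in s3_prefixes]
    if object_arns:
        statements.append({
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": object_arns,
        })

    # Listing is a bucket-level action, so it is narrowed with an s3:prefix condition.
    for bucket, prefix in s3_prefixes:
        statements.append({
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": f"arn:aws:s3:::{bucket}",
            "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
        })

    # Catalog access is limited to the Glue ARNs supplied by the caller.
    if glue_arns:
        statements.append({
            "Effect": "Allow",
            "Action": ["glue:GetDatabase", "glue:GetTable", "glue:GetTables"],
            "Resource": glue_arns,
        })

    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    policy = build_session_policy(
        [("example-bucket", "plane-a/")],
        ["arn:aws:glue:us-east-1:111122223333:database/plane_a"],
    )
    print(json.dumps(policy, indent=2))
```

If credentials are issued through STS AssumeRole, a document like this would be serialized with json.dumps and passed as the Policy parameter. Anything the role allows but the session policy omits is then unavailable to the session, which matches the behavior described above.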

Think of it as a key that opens only your assigned doors. If you are assigned to a data plane, your session can reach its storage and catalog. If you are not assigned, it cannot — even if your code uses raw boto3 with the credentials in your kernel environment.

When your data plane assignments change, the next session you start automatically reflects the updated set.

In notebook flows

Before your kernel becomes available, the platform generates scoped credentials based on your current data plane assignments and injects them into your kernel environment. Your kernel starts already scoped.
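To confirm from inside a notebook that the kernel holds temporary credentials, a small check like the one below may help. It is only a sketch: it assumes the platform injects credentials through the standard AWS environment variables (AWS_ACCESS_KEY_ID and AWS_SESSION_TOKEN), which this page does not state, and the function name is hypothetical. It relies on the AWS convention that temporary access key IDs begin with ASIA.

```python
import os


def describe_credentials(env=None):
    """Classify the AWS credentials visible in an environment mapping.

    Returns "none", "temporary", or "long-term-or-unknown".
    Assumption: credentials are exposed via the standard AWS environment variables.
    """
    env = os.environ if env is None else env
    key_id = env.get("AWS_ACCESS_KEY_ID", "")
    token = env.get("AWS_SESSION_TOKEN", "")
    if not key_id:
        return "none"
    # Temporary STS credentials carry a session token and an ASIA-prefixed key ID.
    if token and key_id.startswith("ASIA"):
        return "temporary"
    return "long-term-or-unknown"


if __name__ == "__main__":
    print(describe_credentials())
```

The function never prints the key or token themselves, so it is safe to run in a shared notebook.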

From that point on, any code in your notebook — a Spark DataFrame read, Pandas with an S3 URI, a direct boto3 call, an aws s3 cp shell command — operates within that scope. There is nothing to configure and nothing to change in how you write your code. Code that touches storage or catalog resources outside your data planes will receive an AWS access denied response.
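If you want your notebook to report an out-of-scope request clearly instead of surfacing a raw stack trace, you can inspect the error code on the exception. The helper below is a minimal sketch with a hypothetical name. It relies on the fact that botocore's ClientError exposes a response dictionary with an Error.Code entry, and that S3 and Glue report denials with codes such as AccessDenied, AccessDeniedException, or a bare 403.

```python
# Error codes that AWS services commonly use for a denied request.
ACCESS_DENIED_CODES = {"AccessDenied", "AccessDeniedException", "403"}


def is_access_denied(exc):
    """Return True if the exception looks like an AWS access denied error.

    Works with botocore.exceptions.ClientError, which carries a `response`
    dict shaped like {"Error": {"Code": "...", "Message": "..."}}.
    """
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False
    code = str(response.get("Error", {}).get("Code", ""))
    return code in ACCESS_DENIED_CODES
```

In a notebook cell you would wrap a call such as boto3.client("s3").get_object(...) in a try/except for botocore.exceptions.ClientError and pass the caught exception to is_access_denied. A True result on a path you expected to reach is a prompt to check your data plane assignments.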

Notebook process runs (scheduled or triggered) work the same way, whether the source is a notebook card or a plain JupyterLab notebook. Before the executor pod starts, the platform injects scoped credentials on behalf of the notebook owner. The run uses the same data plane boundaries that would apply to that user in an interactive session.

In Spark batch jobs

When you submit a Spark batch job, the platform scopes the credentials before the job is submitted to the cluster. Every operation within the job — PySpark transformations, Scala Spark code, any direct SDK calls inside UDFs or driver code — runs under those scoped credentials.

System jobs follow the same pattern. Credentials are scoped to the job owner's data plane assignments when the job runs.

What is and isn't covered

Session Policy covers any AWS API call that uses the session credentials issued to your kernel or job. That includes:

S3 reads and writes via any access method — Spark, boto3, Hadoop S3A, or any other AWS SDK usage.
Glue catalog access (AWS only) — catalog queries scoped to the databases belonging to your assigned data planes. Hive metastore access on non-AWS deployments is governed by Data Authorization, not by Session Policy.
Any AWS SDK operation that uses the session credentials in your kernel or job environment.

It does not cover:

Non-AWS cloud providers. Session Policy applies to AWS only. GCP and Azure environments are not in scope.
Non-S3 data sources. JDBC connections, on-premises databases, and similar sources are not affected by credential scoping.
Spark-engine queries. These are handled by Data Authorization. Session Policy and Data Authorization are complementary but independent — both are needed for full coverage.
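The two lists above can be summarized as a rough lookup by URI scheme. This is an illustrative sketch, not a platform API: the function name, the category labels, and the scheme lists are assumptions chosen to mirror the lists above. It cannot tell whether a given read goes through the Spark engine, so it says nothing about Data Authorization.

```python
# Schemes commonly used for S3 access (native S3 plus the Hadoop connectors).
S3_SCHEMES = {"s3", "s3a", "s3n"}
# Schemes for storage on other clouds, where Session Policy is not in scope.
OTHER_CLOUD_SCHEMES = {"gs", "abfs", "abfss", "wasb", "wasbs"}


def session_policy_coverage(uri):
    """Classify a data source URI against the Session Policy coverage lists.

    Returns "session-policy" for S3 locations, "other-cloud" for GCP or Azure
    storage, and "not-affected" for everything else (JDBC, local paths, ...).
    """
    scheme = uri.split(":", 1)[0].lower() if ":" in uri else ""
    if scheme in S3_SCHEMES:
        return "session-policy"
    if scheme in OTHER_CLOUD_SCHEMES:
        return "other-cloud"
    return "not-affected"
```

For example, an s3a:// path read through Hadoop falls under credential scoping, while a jdbc: connection string does not.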

Credential lifetime — the 12-hour window

Scoped credentials are valid for 12 hours from the moment they are issued. This applies to both notebook sessions and batch jobs. Twelve hours is the platform default and also the maximum session duration that AWS allows.

For notebooks: If your kernel has been running for more than 12 hours, the credentials it holds will have expired. Code that attempts to reach S3 or the catalog after that point gets an access denied error. To get fresh credentials, re-spawn your kernel from the Syntasa portal — the platform issues a new scoped set automatically.

For batch jobs: Jobs expected to complete within 12 hours are unaffected. Jobs running longer than that may encounter credential expiry mid-execution and fail with access denied. There is no automatic credential refresh today; the workaround is to break long workloads into staged jobs that each complete inside the window. Automatic refresh for long-running jobs is on the roadmap.
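The arithmetic behind the 12-hour window and the staged-job workaround can be sketched as follows. Both helpers are hypothetical, and the one-hour safety margin is an assumption rather than platform guidance. The first reports how much of the window is left for credentials issued at a known time. The second greedily groups sequential steps, given estimated durations in hours, into stages that each fit inside the window.

```python
from datetime import datetime, timedelta, timezone

CREDENTIAL_LIFETIME = timedelta(hours=12)


def time_remaining(issued_at, now=None, lifetime=CREDENTIAL_LIFETIME):
    """Return how long the credentials stay valid (never negative)."""
    now = datetime.now(timezone.utc) if now is None else now
    return max(issued_at + lifetime - now, timedelta(0))


def plan_stages(step_hours, window_hours=12.0, safety_margin_hours=1.0):
    """Group sequential steps into stages whose total stays within the window.

    step_hours: estimated duration of each step, in execution order.
    Raises ValueError if a single step cannot fit in one stage.
    """
    budget = window_hours - safety_margin_hours
    stages, current, used = [], [], 0.0
    for hours in step_hours:
        if hours > budget:
            raise ValueError(f"step of {hours}h exceeds the {budget}h stage budget")
        # Start a new stage when adding this step would overrun the budget.
        if current and used + hours > budget:
            stages.append(current)
            current, used = [], 0.0
        current.append(hours)
        used += hours
    if current:
        stages.append(current)
    return stages


if __name__ == "__main__":
    print(plan_stages([5, 4, 3, 6, 2]))
```

Each resulting stage would be submitted as its own job, so each one starts with a freshly issued set of scoped credentials.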

Admin bypass

Users with the System Admin role are exempt from Session Policy. Their notebook kernels and batch jobs run with the full underlying cloud role rather than scoped credentials. System admins often need to operate across the full infrastructure to manage configurations, diagnose access issues, and work across data planes on behalf of other users — restricting them to a specific set of data planes would interfere with those responsibilities. If you are a System Admin and you see broader access than a regular user would, that is by design.

How your admin enables it

Session Policy is enabled at the platform level by an admin in syntasa-config:

syntasa-config
syntasa_session_policy_enabled: "true"
syntasa_session_policy_duration: "43200"   # seconds; max and default = 12 hours

With the flag set, Session Policy operates entirely in the background — your workflow does not change. If your code is getting access denied errors when you expect to have access, ask your admin to confirm syntasa_session_policy_enabled is in the expected state and that your data plane assignments include the bucket or catalog you're trying to reach.
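For admins who want to sanity-check these two values before applying them, a validation sketch might look like this. The function name is hypothetical, and treating a missing flag as disabled is an assumption. The bounds assume the duration is passed through to AWS STS, which accepts session durations from 900 to 43200 seconds for an assumed role.

```python
MIN_DURATION_SECONDS = 900      # AWS STS minimum session duration
MAX_DURATION_SECONDS = 43200    # 12 hours, the maximum stated in this guide


def parse_session_policy_config(config):
    """Validate the Session Policy settings and return (enabled, duration_seconds).

    config: mapping of string keys to string values, as in syntasa-config.
    Assumption: a missing flag means disabled; a missing duration means 12 hours.
    """
    raw_enabled = str(config.get("syntasa_session_policy_enabled", "false"))
    enabled = raw_enabled.strip().lower() == "true"

    duration = int(config.get("syntasa_session_policy_duration", "43200"))
    if not MIN_DURATION_SECONDS <= duration <= MAX_DURATION_SECONDS:
        raise ValueError(
            f"duration {duration}s is outside "
            f"{MIN_DURATION_SECONDS}-{MAX_DURATION_SECONDS}s"
        )
    return enabled, duration
```

Calling it with the values shown above returns (True, 43200). A duration above 43200 raises ValueError before the setting reaches the platform.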

Things to know

No mid-session refresh. Credentials are issued once at kernel start (or job submission) and held for the lifetime of the session. If your data plane assignments change while your kernel is running, the change is not picked up until you restart your kernel.
12-hour ceiling on long-running jobs. Jobs that run longer than 12 hours will encounter credential expiry. See the credential lifetime section above for the workaround.
AWS only. Other cloud providers are not in scope today.

FAQ

Does this affect how I write my code?

No. You write notebooks and jobs exactly as you would otherwise. Session Policy operates on the credentials in your environment — it is invisible to your code. The only difference you may notice is an AWS access denied response if your code tries to reach storage or catalog resources outside your data planes.

What if I'm using boto3 directly?

Your boto3 calls automatically use the scoped credentials in your kernel environment. You do not configure anything. Calls within your data planes succeed; calls outside fail with access denied — exactly as intended.

What happens if my job runs longer than 12 hours?

Credentials expire after 12 hours. Cloud access attempts after that point fail with access denied. For jobs that need to run that long, break the workload into staged jobs that each complete inside the window.

I was assigned to a new data plane — do I need to restart my session?

Yes. Credentials are generated when your kernel starts and are not updated mid-session. Restart your kernel from the Syntasa portal to get a fresh scoped set that includes the new data plane.

Related sections

Data Authorization. The first layer of defense — Spark engine enforcement. Session Policy is the second layer that catches what Data Authorization can't see.
How They Work Together. Side-by-side comparison of the two layers, when each applies, and what neither covers.
