Data Authorization and Session Policy are complementary, not alternatives. They address different threat vectors at different layers of the stack and are designed to be deployed together. In production, both should be on. This chapter is the side-by-side recap so you can see, at a glance, which layer is doing what.
Side-by-side comparison
| | Data Authorization | Session Policy |
|---|---|---|
| What it stops | Unauthorized SQL, DataFrame, and DDL/DML operations through the Spark engine | Direct cloud SDK calls (boto3, Hadoop S3A) to unauthorized AWS resources |
| Where it enforces | Spark logical plan analysis — before any data is read | AWS IAM / STS — before the session or job receives credentials |
| When the user is blocked | Spark raises SyntasaAccessDeniedException with the specific resource named | AWS returns an access denied response on the first SDK call to an out-of-scope resource |
| Cloud scope | AWS, GCP, Azure | AWS only |
| What happens when assignments change | Reflected within 5 minutes (cache TTL); restart the kernel for immediate effect | Not reflected until next kernel start or job submission |
| Long-running job risk | None | Credentials expire after 12 hours |
| Admin enable key | syntasa_authz_enabled: "true" | syntasa_session_policy_enabled: "true" |
| System Admin behavior | Full bypass | Full bypass |
| User action required | None — fully automatic | None — fully automatic |
How a typical operation gets to the right layer
To make the split concrete, walk through Alice running four operations against resources she is not authorized to access (a database and an S3 bucket):
| What Alice runs | Caught by | What she sees |
|---|---|---|
| spark.sql("SELECT * FROM unauthorized_db.table") | Data Authorization | SyntasaAccessDeniedException at plan analysis — naming the database. No data read. |
| spark.read.parquet("s3://unauthorized-bucket/path/") | Data Authorization (path enforcement) | SyntasaAccessDeniedException at plan analysis — naming the path. |
| boto3.client("s3").get_object(Bucket="unauthorized-bucket", Key="...") | Session Policy (AWS only) | AWS access denied response from boto3. Spark never sees this call. |
| pd.read_csv("s3://unauthorized-bucket/file.csv") — Pandas with S3 URI | Session Policy (AWS only) | AWS access denied response from the underlying S3 call. |
On AWS, both layers are active and any of those four attempts is caught. On GCP or Azure, only the first two — the Spark-engine paths — are caught; the boto3 / Pandas-S3 examples don't apply (you would be using the equivalent GCP or Azure SDK, which Session Policy doesn't cover today).
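The routing described above can be summarized as a small lookup. This is a minimal sketch for illustration only: the function name `layer_catching` and the access-path labels (`spark_sql`, `boto3`, and so on) are hypothetical and are not Syntasa identifiers. It simply encodes the walkthrough table.

```python
from typing import Optional

# Hypothetical labels for the access paths in the walkthrough table.
SPARK_ENGINE_PATHS = {"spark_sql", "spark_dataframe"}
AWS_SDK_PATHS = {"boto3", "pandas_s3"}


def layer_catching(cloud: str, access_path: str) -> Optional[str]:
    """Return the layer that blocks an unauthorized attempt, or None."""
    # Spark-engine operations are checked at plan analysis on every cloud.
    if access_path in SPARK_ENGINE_PATHS:
        return "Data Authorization"
    # Direct SDK calls are only scoped by Session Policy, and only on AWS.
    if cloud == "aws" and access_path in AWS_SDK_PATHS:
        return "Session Policy"
    # Anything else is outside both layers.
    return None


for path in ("spark_sql", "spark_dataframe", "boto3", "pandas_s3"):
    print(path, "->", layer_catching("aws", path))
```

A `None` result means neither layer covers that path, so another control has to handle it.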
Diagnosing an access denied error
If you get an access denied error and don't know which layer caused it, work through these checks in order:
- Is the exception class SyntasaAccessDeniedException? → Data Authorization. Read the resource named in the message and request access to its data plane via your admin.
- Is it a generic AWS access denied response from boto3, Pandas, or aws s3 cp? → Session Policy (AWS only). The S3 bucket or Glue database is outside your data plane scope. Same fix: request access to the relevant data plane.
- Has your kernel been running for more than 12 hours? → Session Policy credentials have expired. Restart the kernel from the Syntasa portal — the new kernel gets fresh credentials.
- Did your admin change your assignments recently? → Either restart your kernel (immediate effect) or wait up to 5 minutes for the Data Authorization cache to refresh. Restart is the reliable path.
If none of these match, you may genuinely lack access to the resource you're trying to reach. Confirm with your admin which data plane owns it and request assignment. To see your current data plane assignments, check your user settings or the data catalog in the Syntasa portal (there is no notebook-side method that lists assignments today).
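The checks above can be approximated in a notebook with a small helper. This is a sketch under stated assumptions, not a Syntasa API: the function `diagnose` and its return labels are hypothetical. It matches `SyntasaAccessDeniedException` by name because the import path is not documented here. It reads the AWS error code from the `response` dictionary that botocore's `ClientError` carries, and it treats `PermissionError` as an AWS-side denial because s3fs-backed Pandas reads typically surface denials that way.

```python
from datetime import timedelta
from typing import Optional

# Session Policy credential lifetime, taken from the comparison table.
CREDENTIAL_LIFETIME = timedelta(hours=12)

AWS_DENIED_CODES = {"AccessDenied", "AccessDeniedException"}
AWS_EXPIRED_CODES = {"ExpiredToken", "ExpiredTokenException"}


def aws_error_code(exc: BaseException) -> str:
    """Return the AWS error code from a botocore-style error, or ''."""
    response = getattr(exc, "response", None)
    if isinstance(response, dict):
        return str(response.get("Error", {}).get("Code", ""))
    return ""


def diagnose(exc: BaseException, kernel_age: Optional[timedelta] = None) -> str:
    """Guess which layer produced an access denied error (hypothetical labels)."""
    name = "SyntasaAccessDeniedException"
    # Assumption: the class name appears in the Python type or in the message.
    if name in type(exc).__name__ or name in str(exc):
        return "data-authorization"

    code = aws_error_code(exc)
    aws_side = (
        code in AWS_DENIED_CODES
        or code in AWS_EXPIRED_CODES
        or isinstance(exc, PermissionError)
    )
    if aws_side:
        expired = code in AWS_EXPIRED_CODES or (
            kernel_age is not None and kernel_age >= CREDENTIAL_LIFETIME
        )
        return "session-policy-expired" if expired else "session-policy"
    return "unknown"
```

You would call it from an `except` block, for example `diagnose(err, kernel_age=timedelta(hours=3))`. Treat an `unknown` result as a prompt to confirm with your admin rather than as proof that neither layer was involved.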
What neither layer covers
There is a short list of access paths that fall outside both layers. Use other controls (network policy, JDBC credential management, application-level auth) for these:
- JDBC connections to external databases.
- On-premises systems reached over your network.
- Any access path that does not go through AWS credentials and does not go through the Spark engine.
Related sections
- Data Authorization. The Spark-engine layer.
- Session Policy. The credential-scoping layer (AWS only).