New feature | Medium

Amazon SageMaker Unified Studio Notebooks now support EMR Serverless

Amazon SageMaker Unified Studio Notebooks now support Amazon EMR Serverless, allowing data engineers and analysts to choose their Spark runtime for interactive analytics and data engineering workloads.

Amazon SageMaker Unified Studio Notebooks now support Amazon EMR Serverless with Apache Spark Connect, giving data engineers and analysts a choice of Spark runtime for interactive analytics and data engineering workloads. In addition to Amazon Athena Spark, users can now select Amazon EMR Serverless as their Spark runtime and pick the engine that best fits their requirements.

With this launch, users can run PySpark and Spark SQL in Notebook cells on an EMR Serverless Spark application. The Spark runtime is selected from the Notebook side panel, and the selection applies to both Python and SQL cells. Users can also ask SageMaker Data Agent, the built-in AI assistant, to generate code and execution plans from natural language prompts, which speeds up Spark development on EMR Serverless.

Organizations can use pre-initialized capacity to improve session start times, and they benefit from unified Spark UI monitoring across all supported engines for consistent visibility into job execution and performance. EMR Serverless also provides VPC connectivity for workloads that require network isolation.

This feature is available in all AWS Regions where Amazon SageMaker Unified Studio is available, and it supports both SageMaker Unified Studio notebooks and JupyterLab IDE environments.
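As a rough illustration of the PySpark and Spark SQL support described above, a Python cell might look something like the sketch below. It assumes the notebook exposes a session object named `spark` for the runtime chosen in the side panel; that name, the sample rows, and the `orders` view are hypothetical and do not come from the announcement. Only standard PySpark DataFrame calls are used, so the same cell should not need to change when the runtime changes.

```python
# Hypothetical sample data; the category and amount values are illustrative only.
ORDERS = [("books", 12.50), ("games", 40.00), ("books", 7.25)]


def revenue_by_category(spark):
    """Register the sample rows as a temp view and aggregate them with Spark SQL."""
    # Build a DataFrame from local rows using the session passed in.
    df = spark.createDataFrame(ORDERS, schema=["category", "amount"])
    # Expose the DataFrame to SQL under a hypothetical view name.
    df.createOrReplaceTempView("orders")
    # Run the aggregation as Spark SQL against the selected runtime.
    return spark.sql(
        "SELECT category, SUM(amount) AS revenue "
        "FROM orders GROUP BY category ORDER BY revenue DESC"
    )


# Assumption: the notebook provides a session named `spark`.
# Outside a notebook this guard makes the cell a no-op.
if "spark" in globals():
    revenue_by_category(spark).show()
```

Passing the session in as an argument keeps the function independent of which engine backs it, which matches the idea that the selected runtime applies to the whole notebook.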
