Amazon EMR Serverless Now Supports Interactive Sessions with Spark Connect
Amazon EMR Serverless now supports interactive sessions with Spark Connect, so you can develop and run Apache Spark applications from managed notebooks in Amazon SageMaker Unified Studio and from other notebook environments or IDEs, with monitoring, debugging, and cost visibility for individual sessions.
Amazon EMR Serverless now supports interactive sessions with Spark Connect. You can develop and run Apache Spark applications from managed notebooks in Amazon SageMaker Unified Studio, as well as from your favorite notebook environments and IDEs such as Jupyter and Visual Studio Code. An interactive session provides a persistent Spark context that spans cells and scripts, so you can combine local Python code execution with remote Spark operations in a single environment.

This is made possible by Spark Connect's client-server architecture, which decouples your application client from the Spark driver: you keep your preferred development environment and tooling while the Spark infrastructure runs independently on EMR Serverless. This architecture supports workflows such as ad hoc data exploration, iterative step-by-step debugging, and incremental PySpark job development before deploying to production. For observability, you get real-time session monitoring through the Spark UI, history tracking through the Spark History Server, and session management from the EMR console or through the API, CLI, or SDKs.

Spark Connect on Amazon EMR Serverless is available with EMR release 7.13 in all AWS Regions where Amazon EMR Serverless is available. The SageMaker Unified Studio experience is available in supported Regions.
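To make the client-server split described above concrete, here is a minimal PySpark sketch of a Spark Connect client. It relies only on the standard Spark Connect client API (`SparkSession.builder.remote`) and the documented `sc://host:port/;key=value` connection string format. This announcement does not describe how EMR Serverless exposes a session endpoint or how the client authenticates, so the host, port, and token are placeholders, and `build_remote_url`, `connect`, and `bucket_counts` are hypothetical helper names used for illustration only.

```python
from typing import List, Optional, Tuple


def build_remote_url(host: str, port: int = 443, token: Optional[str] = None,
                     use_ssl: bool = True) -> str:
    """Build a Spark Connect connection string: sc://host:port/;key=value;..."""
    params = []
    if use_ssl:
        params.append("use_ssl=true")
    if token:
        params.append(f"token={token}")
    url = f"sc://{host}:{port}/"
    if params:
        url += ";" + ";".join(params)
    return url


def connect(remote_url: str):
    """Open a Spark Connect session; the Spark driver runs on the remote side."""
    # Imported lazily so build_remote_url works without PySpark installed.
    from pyspark.sql import SparkSession

    return SparkSession.builder.remote(remote_url).getOrCreate()


def bucket_counts(spark) -> List[Tuple[int, int]]:
    """Run an aggregation remotely, then return plain Python tuples locally."""
    from pyspark.sql import functions as F

    df = spark.range(1000).withColumn("bucket", F.col("id") % 10)
    rows = df.groupBy("bucket").count().orderBy("bucket").collect()
    return [(row["bucket"], row["count"]) for row in rows]


# Example usage in a notebook cell (assumed endpoint and token, not real values):
# spark = connect(build_remote_url("your-session-endpoint.example", token="YOUR_TOKEN"))
# print(bucket_counts(spark))
```

Because the session object stays alive between cells, later cells can reuse `spark` for further DataFrame operations, while the results of `collect()` are ordinary Python objects available to local code.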