New feature · Medium

AWS Glue Interactive Sessions now supports Spark Connect for interactive workloads

AWS Glue Interactive Sessions now supports Apache Spark Connect, so you can develop Spark applications in your preferred environment and run them on AWS Glue's serverless infrastructure without managing clusters.

AWS Glue Interactive Sessions now supports Apache Spark Connect. You can develop Apache Spark applications in managed notebooks in Amazon SageMaker Unified Studio, or in other notebook environments and IDEs such as Jupyter and Visual Studio Code, and run them on AWS Glue's serverless infrastructure without managing clusters.

With Spark Connect, you submit Spark jobs to AWS Glue Interactive Sessions through a thin client architecture that decouples your client application from the Spark execution environment. This supports workflows such as ad hoc data exploration, iterative step-by-step debugging, and incremental PySpark job development before deploying to production, all from the tools you already use. Because client dependencies are isolated from the server-side Spark runtime, Spark Connect also simplifies upgrades and improves stability.

For observability, you get real-time session monitoring via the Spark UI, history tracking through the Spark History Server, and session management using the AWS Glue API, CLI, or SDK.

AWS Glue Interactive Sessions with Spark Connect is available in Asia Pacific (Mumbai, Seoul, Singapore, Sydney, Tokyo), Canada (Central), Europe (Frankfurt, Ireland, London, Paris, Stockholm), South America (São Paulo), US East (Ohio, N. Virginia), and US West (Oregon).

To get started, connect to AWS Glue Interactive Sessions using Spark Connect from notebooks in Amazon SageMaker Unified Studio, from your favorite IDE with a Python interpreter, or through the AWS API, SDK, and CLI. To learn more, visit the AWS Glue Interactive Sessions documentation.
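As a rough illustration of the thin-client model described above, the sketch below shows how a generic Spark Connect client might attach to a remote session from Python. It relies only on the open-source PySpark Spark Connect client (`SparkSession.builder.remote`, PySpark 3.4 or later) and the standard `sc://host:port/;key=value` connection string. The host name, port, and token are placeholders, and the helpers `build_connect_url` and `run_demo` are hypothetical names introduced for this example. How AWS Glue actually exposes the session endpoint and credentials is not covered by the announcement, so treat those details as assumptions and check the AWS Glue Interactive Sessions documentation.

```python
from urllib.parse import quote


def build_connect_url(host: str, port: int, **params: str) -> str:
    """Assemble a Spark Connect connection string: sc://host:port/;key=value;...

    Parameter values are percent-encoded so that characters such as ';' or '='
    inside a token cannot break the string apart.
    """
    suffix = "".join(
        f";{key}={quote(str(value), safe='')}" for key, value in params.items()
    )
    return f"sc://{host}:{port}/{suffix}"


def run_demo(url: str) -> None:
    """Open a remote session and run a tiny query on the server side.

    Requires the Spark Connect client, e.g. `pip install "pyspark[connect]"`.
    The import is kept local so that building URLs works without PySpark.
    """
    from pyspark.sql import SparkSession

    # The client only builds the query plan; execution happens remotely.
    spark = SparkSession.builder.remote(url).getOrCreate()
    try:
        spark.range(5).selectExpr("id", "id * id AS squared").show()
    finally:
        spark.stop()


# Placeholder endpoint and token (assumptions, not real AWS Glue values).
example_url = build_connect_url(
    "session-endpoint.example.invalid", 443, token="PLACEHOLDER", use_ssl="true"
)
```

Calling `run_demo(example_url)` with a real endpoint and credentials would execute the query on the remote Spark runtime while the local process holds only the lightweight client, which is the decoupling the announcement refers to.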
