Use multi-language notebooks with Spark kernels

Each Jupyter notebook kernel has a default language. For example, the Spark kernel's default language is Scala, and the PySpark kernel's default language is Python. With Amazon EMR 6.4.0 and later, EMR Studio supports multi-language notebooks. This means that each kernel in EMR Studio can support the following languages in addition to its default language: Python, Scala, R, and Spark SQL.

To activate this feature, specify one of the following magic commands at the beginning of any cell.

Language    Command        Notes
Python      %%pyspark
Scala       %%scalaspark
R           %%rspark       Not supported for interactive workloads with EMR Serverless.
Spark SQL   %%sql

When invoked, these commands execute the entire cell within the same Spark session using the interpreter of the corresponding language.

The %%pyspark cell magic allows users to write PySpark code in all Spark kernels.

%%pyspark
a = 1

The %%sql cell magic allows users to execute Spark SQL code in all Spark kernels.

%%sql
SHOW TABLES
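
Any other Spark SQL statement runs the same way. As a minimal sketch (the nyc_top_trips_report table is the same one used in the data-sharing example later on this page), a query might look like the following.

%%sql
SELECT * FROM nyc_top_trips_report LIMIT 10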

The %%rspark cell magic allows users to execute SparkR code in all Spark kernels.

%%rspark
a <- 1

The %%scalaspark cell magic allows users to execute Spark Scala code in all Spark kernels.

%%scalaspark
val a = 1

Share data across language interpreters with temporary tables

You can also share data between language interpreters using temporary tables. The following example uses %%pyspark in one cell to create a temporary table in Python and uses %%scalaspark in the following cell to read data from that table in Scala.

%%pyspark
df = spark.sql("SELECT * from nyc_top_trips_report LIMIT 20")
# create a temporary table called nyc_top_trips_report_view in python
df.createOrReplaceTempView("nyc_top_trips_report_view")
%%scalaspark
// read the temp table in scala
val df = spark.sql("SELECT * from nyc_top_trips_report_view")
df.show(5)
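
Because all of the cell magics execute against the same Spark session, the temporary view created above should also be visible to the other interpreters. As a minimal sketch (not part of the original example), a Spark SQL cell could read the same view directly.

%%sql
SELECT * FROM nyc_top_trips_report_view LIMIT 5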