Presto

Use Presto as a fast SQL query engine for large data sources.

Release Information

Application: Presto 0.170

Amazon EMR Release Label: emr-5.8.0

Components installed with this application: emrfs, emr-goodies, hadoop-client, hadoop-hdfs-datanode, hadoop-hdfs-library, hadoop-hdfs-namenode, hadoop-kms-server, hadoop-yarn-nodemanager, hadoop-yarn-resourcemanager, hadoop-yarn-timeline-server, hive-client, hcatalog-server, mysql-server, presto-coordinator, presto-worker

For more information about Presto, go to https://prestodb.io/.

Note

  • Some Presto configuration files cannot be set directly with the configuration API. You can configure log.properties and config.properties. However, the following files cannot be configured:

    • node.properties (this restriction applies before Amazon EMR version 5.6.0; the file is configurable in version 5.6.0 and later)

    • jvm.config

    For more information about these configuration files, see the Presto documentation.

  • Presto is not configured to use EMRFS. Instead, it uses PrestoS3FileSystem.

  • You can access the Presto web interface on the Presto coordinator using port 8889.
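As an illustration of configuring config.properties and log.properties through the configuration API, the following sketch writes a configuration file with one classification per file. The classification names presto-config and presto-log, the file name prestoProperties.json, and both property values are assumptions made for this example and are not stated in this section; check them against the configuration classification list for your release.

```shell
# Write a configuration file that targets config.properties and log.properties.
# "presto-config" and "presto-log" are assumed classification names, and both
# property values are illustrative only.
cat > prestoProperties.json <<'EOF'
[
  {
    "Classification": "presto-config",
    "Properties": {
      "query.max-memory": "30GB"
    }
  },
  {
    "Classification": "presto-log",
    "Properties": {
      "com.facebook.presto": "INFO"
    }
  }
]
EOF

# Pass the file at cluster launch by adding this option to
# aws emr create-cluster:
#   --configurations file://./prestoProperties.json
```

Each key under "Properties" is intended to become one line in the corresponding properties file on the cluster.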

Adding Database Connectors

You can add database connectors at cluster launch using configuration classifications. For more information about connectors, see https://prestodb.io/docs/current/connector.html.

The available connector classifications are:

  • presto-connector-blackhole

  • presto-connector-cassandra

  • presto-connector-hive

  • presto-connector-jmx

  • presto-connector-kafka

  • presto-connector-localfile

  • presto-connector-mongodb

  • presto-connector-mysql

  • presto-connector-postgresql

  • presto-connector-raptor

  • presto-connector-redis

  • presto-connector-tpch

Example: Configuring a Cluster with the PostgreSQL JDBC Connector

To launch a cluster with the PostgreSQL connector installed and configured, create a file, myConfig.json, with the following content:

    [
      {
        "Classification": "presto-connector-postgresql",
        "Properties": {
          "connection-url": "jdbc:postgresql://example.net:5432/database",
          "connection-user": "MYUSER",
          "connection-password": "MYPASS"
        },
        "Configurations": []
      }
    ]

Then, use the following command to create the cluster, supplying a release label, such as emr-5.8.0, after --release-label:

    aws emr create-cluster --name PrestoConnector --release-label --instance-type m3.xlarge \
      --instance-count 2 --applications Name=Hadoop Name=Hive Name=Pig Name=Presto \
      --use-default-roles --no-auto-terminate --ec2-attributes KeyName=myKey \
      --log-uri s3://my-bucket/logs --enable-debugging \
      --configurations file://./myConfig.json
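The other connector classifications presumably follow the same pattern. As a minimal sketch, assuming the presto-connector-mysql classification accepts the property names used by the Presto MySQL connector, a MySQL configuration file could be written as follows. The file name myMysqlConfig.json, the host, and the credentials are placeholders.

```shell
# Write a configuration file for the MySQL connector.
# Assumption: the classification accepts the same property names that the
# Presto MySQL connector reads from its catalog properties file.
cat > myMysqlConfig.json <<'EOF'
[
  {
    "Classification": "presto-connector-mysql",
    "Properties": {
      "connection-url": "jdbc:mysql://example.net:3306",
      "connection-user": "MYUSER",
      "connection-password": "MYPASS"
    },
    "Configurations": []
  }
]
EOF

# Pass the file at cluster launch with:
#   --configurations file://./myMysqlConfig.json
```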

Using LDAP Authentication with Presto

Amazon EMR version 5.5.0 and later supports Lightweight Directory Access Protocol (LDAP) authentication with Presto. To use LDAP, you must enable HTTPS access for the Presto coordinator by setting http-server.https.enabled=true in config.properties on the master node. For configuration details, see LDAP Authentication in the Presto documentation.
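One way to apply the HTTPS setting at cluster launch, rather than editing config.properties by hand, is through a configuration classification. The sketch below assumes a presto-config classification that maps to config.properties. The file name prestoHttps.json, the keystore path, and the keystore password are placeholders, and the port is illustrative. The LDAP-specific properties are omitted here and should be taken from the Presto documentation.

```shell
# Write a configuration file that enables HTTPS on the Presto coordinator.
# Assumption: "presto-config" maps to config.properties. The keystore path
# and password are placeholders to replace with your own values.
cat > prestoHttps.json <<'EOF'
[
  {
    "Classification": "presto-config",
    "Properties": {
      "http-server.https.enabled": "true",
      "http-server.https.port": "8446",
      "http-server.https.keystore.path": "/path/to/keystore.jks",
      "http-server.https.keystore.key": "KEYSTORE_PASSWORD"
    }
  }
]
EOF

# Pass the file at cluster launch with:
#   --configurations file://./prestoHttps.json
```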

Enabling SSL/TLS for Internal Communication Between Nodes

With Amazon EMR version 5.6.0 and later, you can enable SSL/TLS secured communication between Presto nodes by using a security configuration to enable in-transit encryption. For more information, see Specifying Amazon EMR Encryption Options Using a Security Configuration. The default port for internal HTTPS is 8446. The port used for internal communication must be the same port used for HTTPS access to the Presto coordinator. The http-server.https.port=port_num parameter in the Presto config.properties file specifies the port.
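A security configuration that enables in-transit encryption might look like the following sketch. The file name prestoSecurityConfig.json, the configuration name prestoInTransit, and the S3 location of the certificate archive are placeholders, and the example assumes PEM certificates supplied as a zip file in Amazon S3.

```shell
# Write a security configuration that enables in-transit encryption only.
# The S3 object below is a placeholder for your own certificate archive.
cat > prestoSecurityConfig.json <<'EOF'
{
  "EncryptionConfiguration": {
    "EnableInTransitEncryption": true,
    "EnableAtRestEncryption": false,
    "InTransitEncryptionConfiguration": {
      "TLSCertificateConfiguration": {
        "CertificateProviderType": "PEM",
        "S3Object": "s3://my-bucket/certs/my-certs.zip"
      }
    }
  }
}
EOF

# Register the security configuration, then reference it by name at launch.
# These commands need AWS credentials, so they are shown as comments:
#   aws emr create-security-configuration --name prestoInTransit \
#     --security-configuration file://./prestoSecurityConfig.json
#   aws emr create-cluster ... --security-configuration prestoInTransit
```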

When in-transit encryption is enabled, Amazon EMR does the following for Presto:

  • Distributes the artifacts you specify for in-transit encryption throughout the Presto cluster. For more information about encryption artifacts, see Providing Certificates for In-Transit Data Encryption.

  • Modifies the config.properties file for Presto as follows:

    • Sets http-server.http.enabled=false on core and task nodes, which disables HTTP in favor of HTTPS.

    • Sets http-server.https.*, internal-communication.https.*, and other values to enable HTTPS and specify implementation details, including LDAP parameters if you have enabled and configured LDAP.