
Working with Tables in Amazon Keyspaces (for Apache Cassandra)

This section provides details about working with tables in Amazon Keyspaces (for Apache Cassandra).

Creating Tables in Amazon Keyspaces

Amazon Keyspaces performs data definition language (DDL) operations, such as creating and deleting tables, asynchronously. You can monitor the creation status of new tables in the AWS Management Console, which indicates when a table is pending or active. You can also monitor the creation status of a new table programmatically by using the system schema table.

A table shows as active in the system schema when it's ready for use. The recommended design pattern to check when a new table is ready for use is to poll the Amazon Keyspaces system schema tables (system_schema_mcs.*). For a list of DDL statements for tables, see the Tables section in the CQL language reference.

The following query shows the status of a table.

SELECT keyspace_name, table_name, status FROM system_schema_mcs.tables WHERE keyspace_name = 'mykeyspace' AND table_name = 'mytable';

For a table that is still being created and has a pending status, the output of the query looks like the following.

 keyspace_name | table_name | status
---------------+------------+----------
    mykeyspace |    mytable | CREATING

For a table that has been successfully created and is active, the output of the query looks like the following.

 keyspace_name | table_name | status
---------------+------------+--------
    mykeyspace |    mytable | ACTIVE
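The polling pattern described above can be sketched in application code. The following Python sketch is illustrative only: `fetch_status` stands in for whatever call your driver uses to run the status query shown earlier, and the helper name and retry parameters are assumptions, not part of the Amazon Keyspaces API.

```python
import time

def wait_until_active(fetch_status, poll_interval=2.0, timeout=60.0):
    """Poll a status-returning callable until the table reports ACTIVE.

    fetch_status is any zero-argument callable that returns the current
    value of the status column from system_schema_mcs.tables
    (for example 'CREATING' or 'ACTIVE').
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if fetch_status() == "ACTIVE":
            return True
        time.sleep(poll_interval)
    return False

# Simulated example: the table reports CREATING twice, then ACTIVE.
statuses = iter(["CREATING", "CREATING", "ACTIVE"])
print(wait_until_active(lambda: next(statuses), poll_interval=0.0))  # True
```

In a real application, `fetch_status` would execute the `SELECT ... FROM system_schema_mcs.tables` query through your CQL driver and return the `status` column of the matching row.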

Static Columns in Amazon Keyspaces

When you declare a column in an Amazon Keyspaces table as static, the value stored in this column is shared between all rows in a logical partition. When you update the value of this column, Amazon Keyspaces applies the change automatically to all rows in the partition.

This section describes how to calculate the encoded size of data when you're writing to static columns. This process is handled separately from the process that writes data to the nonstatic columns of a row. In addition to size quotas for static data, read and write operations on static columns also affect metering and throughput capacity for tables independently.

Calculating Static Column Size per Logical Partition in Amazon Keyspaces

This section provides details about how to estimate the encoded size of static columns in Amazon Keyspaces. The encoded size is used when you're calculating your bill and quota use. You should also use the encoded size when you calculate provisioned throughput capacity requirements for tables. To calculate the encoded size of static columns in Amazon Keyspaces, you can use the following guidelines.

  • Partition keys can contain up to 2048 bytes of data. Each key column in the partition key requires up to 3 bytes of metadata. These metadata bytes count towards your static data size quota of 1 MB per partition. When calculating the size of your static data, you should assume that each partition key column uses the full 3 bytes of metadata.

  • Use the raw size of the static column data values based on the data type. For more information about data types, see Data Types.

  • Add 104 bytes to the size of the static data for metadata.

  • Clustering columns and regular, nonprimary key columns do not count towards the size of static data. To learn how to estimate the size of nonstatic data within rows, see Calculating Row Size in Amazon Keyspaces.

The total encoded size of a static column is based on the following formula:

partition key columns + static columns + metadata = total encoded size of static data
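The formula can be expressed as a small helper function. This Python sketch is illustrative; the function and parameter names are assumptions, while the 3 bytes of metadata per partition key column and the 104 bytes of static metadata come from the guidelines above.

```python
PK_METADATA_BYTES = 3        # metadata per partition key column
STATIC_METADATA_BYTES = 104  # fixed metadata added for static data

def encoded_static_size(pk_col_bytes, static_col_bytes):
    """Estimate the encoded size of static data in a logical partition.

    pk_col_bytes: raw data size of each partition key column, in bytes.
    static_col_bytes: raw data size of each static column, in bytes.
    """
    pk = sum(size + PK_METADATA_BYTES for size in pk_col_bytes)
    return pk + sum(static_col_bytes) + STATIC_METADATA_BYTES

# Two integer partition key columns (4 bytes each) and one integer
# static column, as in the example that follows:
print(encoded_static_size([4, 4], [4]))  # 122
```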

Consider the following example of a table where all columns are of type integer. The table has two partition key columns, two clustering columns, one regular column, and one static column.

CREATE TABLE mykeyspace.mytable (
    pk_col1 int,
    pk_col2 int,
    ck_col1 int,
    ck_col2 int,
    reg_col1 int,
    static_col1 int static,
    PRIMARY KEY ((pk_col1, pk_col2), ck_col1, ck_col2));

In this example, we calculate the size of static data of the following statement:

INSERT INTO mykeyspace.mytable (pk_col1, pk_col2, static_col1) values(1,2,6);

To estimate the total bytes required by this write operation, you can use the following steps.

  1. Calculate the size of a partition key column by adding the bytes for the data type stored in the column and the metadata bytes. Repeat this for all partition key columns.

    1. Calculate the size of the first column of the partition key (pk_col1):

      4 bytes for the integer data type + 3 bytes for partition key metadata = 7 bytes
    2. Calculate the size of the second column of the partition key (pk_col2):

      4 bytes for the integer data type + 3 bytes for partition key metadata = 7 bytes
    3. Add both columns to get the total estimated size of the partition key columns:

      7 bytes + 7 bytes = 14 bytes for the partition key columns
  2. Add the size of the static columns. In this example, we only have one static column that stores an integer (which requires 4 bytes).

  3. Finally, to get the total encoded size of the static column data, add up the bytes for the primary key columns and static columns, and add the additional 104 bytes for metadata:

    14 bytes for the partition key columns + 4 bytes for the static column + 104 bytes for metadata = 122 bytes.

You can also update static and nonstatic data with the same statement. To estimate the total size of such a write operation, first calculate the size of the static data update. Then calculate the size of the row update as shown in the example at Calculating Row Size in Amazon Keyspaces, and add the results. In this case, you can write a total of 2 MB: 1 MB is the maximum row size quota, and 1 MB is the quota for the maximum static data size per logical partition.

To calculate the total size of an update of static and nonstatic data in the same statement, you can use the following formula:

(partition key columns + static columns + metadata = total encoded size of static data) + (partition key columns + clustering columns + regular columns + row metadata = total encoded size of row) = total encoded size of data written
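The combined formula can be sketched by adding the row estimate to the static estimate. The following Python helpers are illustrative: the static constants come from the guidelines above, and the row constants (4 bytes of metadata per clustering column and 100 bytes of row metadata) follow Calculating Row Size in Amazon Keyspaces.

```python
def encoded_static_size(pk_col_bytes, static_col_bytes):
    """Encoded size of the static part: 3 metadata bytes per partition
    key column, plus 104 bytes of static metadata."""
    pk = sum(size + 3 for size in pk_col_bytes)
    return pk + sum(static_col_bytes) + 104

def encoded_row_size(pk_col_bytes, ck_col_bytes, reg_col_bytes):
    """Encoded size of the nonstatic part: 3 metadata bytes per
    partition key column, 4 per clustering column, and 100 bytes of
    row metadata, per Calculating Row Size in Amazon Keyspaces."""
    pk = sum(size + 3 for size in pk_col_bytes)
    ck = sum(size + 4 for size in ck_col_bytes)
    return pk + ck + sum(reg_col_bytes) + 100

# All columns are 4-byte integers, as in the example table below:
static = encoded_static_size([4, 4], [4])    # 122 bytes
row = encoded_row_size([4, 4], [4, 4], [4])  # 134 bytes
print(static + row)  # 256
```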

Consider the following example of a table where all columns are of type integer. The table has two partition key columns, two clustering columns, one regular column, and one static column.

CREATE TABLE mykeyspace.mytable (
    pk_col1 int,
    pk_col2 int,
    ck_col1 int,
    ck_col2 int,
    reg_col1 int,
    static_col1 int static,
    PRIMARY KEY ((pk_col1, pk_col2), ck_col1, ck_col2));

In this example, we calculate the size of data when we write a row to the table, as shown in the following statement:

INSERT INTO mykeyspace.mytable (pk_col1, pk_col2, ck_col1, ck_col2, reg_col1, static_col1) values(2,3,4,5,6,7);

To estimate the total bytes required by this write operation, you can use the following steps.

  1. Calculate the total encoded size of static data as shown earlier. In this example, it's 122 bytes.

  2. Add the size of the total encoded size of the row based on the update of nonstatic data, following the steps at Calculating Row Size in Amazon Keyspaces. In this example, the total size of the row update is 134 bytes.

    122 bytes for static data + 134 bytes for nonstatic data = 256 bytes.

Metering Read/Write Operations of Static Data in Amazon Keyspaces

Static data is associated with logical partitions in Cassandra, not individual rows. Logical partitions in Amazon Keyspaces can be virtually unbounded in size because they span multiple physical storage partitions. As a result, Amazon Keyspaces meters write operations on static and nonstatic data separately. Furthermore, writes that include both static and nonstatic data require additional underlying operations to provide data consistency.

If you perform a mixed write operation of both static and nonstatic data, this results in two separate write operations—one for nonstatic and one for static data. This applies to both on-demand and provisioned read/write capacity modes.

The following example provides details about how to estimate the required read capacity units (RCUs) and write capacity units (WCUs) when you're calculating provisioned throughput capacity requirements for tables in Amazon Keyspaces that have static columns. You can estimate how much capacity your table needs to process writes that include both static and nonstatic data by using the following formula:

2 x WCUs required for nonstatic data + 2 x WCUs required for static data

For example, if your application writes 27 KB of data per second and each write includes 25.5 KB of nonstatic data and 1.5 KB of static data, then your table requires 56 WCUs (2 x 26 WCUs + 2 x 2 WCUs).

Amazon Keyspaces meters the reads of static and nonstatic data the same as reads of multiple rows. As a result, the price of reading static and nonstatic data in the same operation is based on the aggregate size of the data processed to perform the read.