Managing collations and and character sets for Amazon RDS for Microsoft SQL Server
This topic provide guidance on how to manage collations and character sets for Microsoft SQL Server in Amazon RDS. It explains how to configure collations during database creation and modify them later, ensuring proper handling of text data based on language and locale requirements. Additionally, it covers best practices for maintaining compatibility and performance in SQL Server environments in Amazon RDS.
SQL Server supports collations at multiple levels. You set the default server collation when you create the DB instance. You can override the collation in the database, table, or column level.
Topics
Server-level collation for Microsoft SQL Server
When you create a Microsoft SQL Server DB instance, you can set the server collation that you want to use. If you don't choose a different collation, the server-level collation defaults to SQL_Latin1_General_CP1_CI_AS. The server collation is applied by default to all databases and database objects.
Note
You can't change the collation when you restore from a DB snapshot.
Currently, Amazon RDS supports the following server collations:
Collation | Description |
---|---|
Arabic_CI_AS |
Arabic, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Chinese_PRC_BIN2 |
Chinese-PRC, binary code point sort order |
Chinese_PRC_CI_AS |
Chinese-PRC, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Chinese_Taiwan_Stroke_CI_AS |
Chinese-Taiwan-Stroke, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Danish_Norwegian_CI_AS |
Danish-Norwegian, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Finnish_Swedish_CI_AS |
Finnish, Swedish, and Swedish (Finland), case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
French_CI_AS |
French, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Hebrew_BIN |
Hebrew, binary sort |
Hebrew_CI_AS |
Hebrew, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Japanese_BIN |
Japanese, binary sort |
Japanese_CI_AS |
Japanese, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Japanese_CS_AS |
Japanese, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Japanese_XJIS_140_CI_AS |
Japanese, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive, supplementary characters, variation selector insensitive |
Japanese_XJIS_140_CI_AS_KS_VSS |
Japanese, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive, supplementary characters, variation selector sensitive |
Japanese_XJIS_140_CI_AS_VSS |
Japanese, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive, supplementary characters, variation selector sensitive |
Japanese_XJIS_140_CS_AS_KS_WS |
Japanese, case-sensitive, accent-sensitive, kanatype-sensitive, width-sensitive, supplementary characters, variation selector insensitive |
Korean_Wansung_CI_AS |
Korean-Wansung, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Latin1_General_100_BIN |
Latin1-General-100, binary sort |
Latin1_General_100_BIN2 |
Latin1-General-100, binary code point sort order |
Latin1_General_100_BIN2_UTF8 |
Latin1-General-100, binary code point sort order, UTF-8 encoded |
Latin1_General_100_CI_AS |
Latin1-General-100, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Latin1_General_100_CI_AS_SC_UTF8 |
Latin1-General-100, case-insensitive, accent-sensitive, supplementary characters, UTF-8 encoded |
Latin1_General_BIN |
Latin1-General, binary sort |
Latin1_General_BIN2 |
Latin1-General, binary code point sort order |
Latin1_General_CI_AI |
Latin1-General, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive |
Latin1_General_CI_AS |
Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Latin1_General_CI_AS_KS |
Latin1-General, case-insensitive, accent-sensitive, kanatype-sensitive, width-insensitive |
Latin1_General_CS_AS |
Latin1-General, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Modern_Spanish_CI_AS |
Modern-Spanish, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Polish_CI_AS |
Polish, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
SQL_1xCompat_CP850_CI_AS |
Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 49 on Code Page 850 for non-Unicode Data |
SQL_Latin1_General_CP1_CI_AI |
Latin1-General, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 54 on Code Page 1252 for non-Unicode Data |
SQL_Latin1_General_CP1_CI_AS (default) |
Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 52 on Code Page 1252 for non-Unicode Data |
SQL_Latin1_General_CP1_CS_AS |
Latin1-General, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 51 on Code Page 1252 for non-Unicode Data |
SQL_Latin1_General_CP437_CI_AI |
Latin1-General, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 34 on Code Page 437 for non-Unicode Data |
SQL_Latin1_General_CP850_BIN |
Latin1-General, binary sort order for Unicode Data, SQL Server Sort Order 40 on Code Page 850 for non-Unicode Data |
SQL_Latin1_General_CP850_BIN2 |
Latin1-General, binary code point sort order for Unicode Data, SQL Server Sort Order 40 on Code Page 850 for non-Unicode Data |
SQL_Latin1_General_CP850_CI_AI |
Latin1-General, case-insensitive, accent-insensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 44 on Code Page 850 for non-Unicode Data |
SQL_Latin1_General_CP850_CI_AS |
Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 42 on Code Page 850 for non-Unicode Data |
SQL_Latin1_General_CP1256_CI_AS |
Latin1-General, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 146 on Code Page 1256 for non-Unicode Data |
SQL_Latin1_General_CP1255_CS_AS |
Latin1-General, case-sensitive, accent-sensitive, kanatype-insensitive, width-insensitive for Unicode Data, SQL Server Sort Order 137 on Code Page 1255 for non-Unicode Data |
Thai_CI_AS |
Thai, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
Turkish_CI_AS |
Turkish, case-insensitive, accent-sensitive, kanatype-insensitive, width-insensitive |
To choose the collation:
-
If you're using the Amazon RDS console, when creating a new DB instance choose Additional configuration, then enter the collation in the Collation field. For more information, see Creating an Amazon RDS DB instance.
-
If you're using the AWS CLI, use the
--character-set-name
option with thecreate-db-instance
command. For more information, see create-db-instance. -
If you're using the Amazon RDS API, use the
CharacterSetName
parameter with theCreateDBInstance
operation. For more information, see CreateDBInstance.
Database-level collation for Microsoft SQL Server
You can change the default collation at the database, table, or column level by overriding the collation when creating a new database or database object. For example, if your default server collation is SQL_Latin1_General_CP1_CI_AS, you can change it to Mohawk_100_CI_AS for Mohawk collation support. Even arguments in a query can be type-cast to use a different collation if necessary.
For example, the following query would change the default collation for the AccountName column to Mohawk_100_CI_AS
CREATE TABLE [dbo].[Account] ( [AccountID] [nvarchar](10) NOT NULL, [AccountName] [nvarchar](100) COLLATE Mohawk_100_CI_AS NOT NULL ) ON [PRIMARY];
The Microsoft SQL Server DB engine supports Unicode by the built-in NCHAR, NVARCHAR, and NTEXT data types. For example, if you need CJK support, use these Unicode data types for character storage and override the default server collation when creating your databases and tables. Here are several links from Microsoft covering collation and Unicode support for SQL Server: