Character sets for custom vocabularies and vocabulary filters

For each language Amazon Transcribe supports, there is a specific set of characters Amazon Transcribe can recognize. When you create a custom vocabulary or vocabulary filter, use only the characters listed in your language's character set. If you use unsupported characters, your custom vocabulary or vocabulary filter fails.

Important

Be sure to check that your custom vocabulary file uses only the supported Unicode code points and code point sequences listed within the following character sets.

Many Unicode characters can appear identical in popular fonts, even if they use different code points. Only the code points listed in this guide are supported. For example, the French word déjà can be rendered using precomposed characters (where one Unicode value represents an accented character) or decomposed characters (where two Unicode values represent an accented character, one value for the base character and another for the accent).

  • Precomposed version: 0064 00E9 006A 00E0 (renders as déjà)

  • Decomposed version: 0064 0065 0301 006A 0061 0300 (renders as déjà)
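
The difference between the two encodings can be checked programmatically. The following is a minimal sketch using Python's standard `unicodedata` module; the helper name `code_points` is made up for this illustration. It assumes that NFC normalization produces the precomposed form you need, which holds for the déjà example but is not guaranteed for every language, because some character sets in this guide list combining marks (for example 0301, 0303, or 0307) as supported code points in their own right.

```python
import unicodedata

# The two encodings of "déjà" described above, written with escapes so the
# exact code points are unambiguous.
precomposed = "\u0064\u00e9\u006a\u00e0"
decomposed = "\u0064\u0065\u0301\u006a\u0061\u0300"


def code_points(text):
    """Return the code points of text as space-separated 4-digit hex values."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


# The strings render identically but compare as different.
print(precomposed == decomposed)  # False

# NFC normalization combines a base character and a combining mark into a
# single precomposed code point where one exists.
normalized = unicodedata.normalize("NFC", decomposed)
print(code_points(normalized))  # 0064 00E9 006A 00E0
```

Printing the code points of each phrase before you upload a file is a quick way to spot look-alike characters that use unexpected code points.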

Abkhaz character set

For Abkhaz custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
а 0430 љ 0459
б 0431 њ 045A
в 0432 ћ 045B
г 0433 ќ 045C
д 0434 ѝ 045D
е 0435 ў 045E
ж 0436 џ 045F
з 0437 ґ 0491
и 0438 ғ 0493
й 0439 җ 0497
к 043A ҙ 0499
л 043B қ 049B
м 043C ҟ 049F
н 043D ҡ 04A1
о 043E ң 04A3
п 043F ҥ 04A5
р 0440 ҩ 04A9
с 0441 ҫ 04AB
т 0442 ҭ 04AD
у 0443 ү 04AF
ф 0444 ұ 04B1
х 0445 ҳ 04B3
ц 0446 ҵ 04B5
ч 0447 ҷ 04B7
ш 0448 һ 04BB
щ 0449 ҽ 04BD
ъ 044A ҿ 04BF
ы 044B ӊ 04CA
ь 044C ӑ 04D1
э 044D ӓ 04D3
ю 044E ӗ 04D7
я 044F ә 04D9
ѐ 0450 ӡ 04E1
ё 0451 ӣ 04E3
ђ 0452 ӧ 04E7
ѓ 0453 ө 04E9
є 0454 ӯ 04EF
ѕ 0455 ӱ 04F1
і 0456 ӳ 04F3
ї 0457 ӷ 04F7
ј 0458 ӹ 04F9
ԥ 0525

Afrikaans character set

For Afrikaans custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
á 00E1 ï 00EF
è 00E8 ó 00F3
é 00E9 ô 00F4
ê 00EA ö 00F6
ë 00EB ú 00FA
í 00ED û 00FB
î 00EE ü 00FC
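
A character set like the one above can be turned into a local pre-check. The following sketch builds the Afrikaans set from the list and table in this section and reports any character in a phrase that falls outside it. The names `AFRIKAANS_CHARS` and `unsupported_characters` are hypothetical, and the check only mirrors this page; it is not a description of how Amazon Transcribe validates input.

```python
import string

# Base characters from the list above: a - z, hyphen, period.
BASE_CHARS = set(string.ascii_lowercase) | {"-", "."}

# The Code column of the Afrikaans table above.
AFRIKAANS_CODES = [
    "00E1", "00E8", "00E9", "00EA", "00EB", "00ED", "00EE",
    "00EF", "00F3", "00F4", "00F6", "00FA", "00FB", "00FC",
]

# Convert each hexadecimal code to its character and add it to the base set.
AFRIKAANS_CHARS = BASE_CHARS | {chr(int(code, 16)) for code in AFRIKAANS_CODES}


def unsupported_characters(phrase, allowed):
    """List each character of phrase that is not in allowed, with its code point."""
    return [f"{ch} (U+{ord(ch):04X})" for ch in phrase if ch not in allowed]


print(unsupported_characters("ge\u00efllustreer", AFRIKAANS_CHARS))  # []
print(unsupported_characters("stra\u00dfe", AFRIKAANS_CHARS))  # ['ß (U+00DF)']
```

The same pattern applies to the other languages on this page: swap in the base characters and the Code column of the relevant table.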

Arabic character set

For Arabic custom vocabularies, you can use the following Unicode characters in the Phrase field. You can also use the hyphen (-) character to separate words.

Character Code Character Code
ء 0621 س 0633
آ 0622 ش 0634
أ 0623 ص 0635
ؤ 0624 ض 0636
إ 0625 ط 0637
ئ 0626 ظ 0638
ا 0627 ع 0639
ب 0628 غ 063A
ة 0629 ف 0641
ت 062A ق 0642
ث 062B ك 0643
ج 062C ل 0644
ح 062D م 0645
خ 062E ن 0646
د 062F ه 0647
ذ 0630 و 0648
ر 0631 ى 0649
ز 0632 ي 064A

Asturian character set

For Asturian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
á 00E1 ñ 00F1
é 00E9 ó 00F3
í 00ED ú 00FA
ü 00FC

Azerbaijani character set

For Azerbaijani custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ä 00E4 ğ 011F
ç 00E7 ı 0131
ö 00F6 ş 015F
ü 00FC ə 0259
̇ 0307

Armenian character set

For Armenian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ա 0561 մ 0574
բ 0562 յ 0575
գ 0563 ն 0576
դ 0564 շ 0577
ե 0565 ո 0578
զ 0566 չ 0579
է 0567 պ 057A
ը 0568 ջ 057B
թ 0569 ռ 057C
ժ 056A ս 057D
ի 056B վ 057E
լ 056C տ 057F
խ 056D ր 0580
ծ 056E ց 0581
կ 056F ւ 0582
հ 0570 փ 0583
ձ 0571 ք 0584
ղ 0572 օ 0585
ճ 0573 ֆ 0586

Bashkir character set

For Bashkir custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
а 0430 љ 0459
б 0431 њ 045A
в 0432 ћ 045B
г 0433 ќ 045C
д 0434 ѝ 045D
е 0435 ў 045E
ж 0436 џ 045F
з 0437 ґ 0491
и 0438 ғ 0493
й 0439 җ 0497
к 043A ҙ 0499
л 043B қ 049B
м 043C ҟ 049F
н 043D ҡ 04A1
о 043E ң 04A3
п 043F ҥ 04A5
р 0440 ҩ 04A9
с 0441 ҫ 04AB
т 0442 ҭ 04AD
у 0443 ү 04AF
ф 0444 ұ 04B1
х 0445 ҳ 04B3
ц 0446 ҵ 04B5
ч 0447 ҷ 04B7
ш 0448 һ 04BB
щ 0449 ҽ 04BD
ъ 044A ҿ 04BF
ы 044B ӊ 04CA
ь 044C ӑ 04D1
э 044D ӓ 04D3
ю 044E ӗ 04D7
я 044F ә 04D9
ѐ 0450 ӡ 04E1
ё 0451 ӣ 04E3
ђ 0452 ӧ 04E7
ѓ 0453 ө 04E9
є 0454 ӯ 04EF
ѕ 0455 ӱ 04F1
і 0456 ӳ 04F3
ї 0457 ӷ 04F7
ј 0458 ӹ 04F9

Basque character set

For Basque custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
á 00E1 ñ 00F1
é 00E9 ó 00F3
í 00ED ú 00FA
ü 00FC

Belarusian character set

For Belarusian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
а 0430 с 0441
б 0431 т 0442
в 0432 у 0443
г 0433 ф 0444
д 0434 х 0445
е 0435 ц 0446
ж 0436 ч 0447
з 0437 ш 0448
й 0439 ы 044B
к 043A ь 044C
л 043B э 044D
м 043C ю 044E
н 043D я 044F
о 043E ё 0451
п 043F і 0456
р 0440 ў 045E

Bengali character set

For Bengali custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ঁ 0981 দ 09A6
ং 0982 ধ 09A7
ঃ 0983 ন 09A8
অ 0985 প 09AA
আ 0986 ফ 09AB
ই 0987 ব 09AC
ঈ 0988 ভ 09AD
উ 0989 ম 09AE
ঊ 098A য 09AF
ঋ 098B র 09B0
এ 098F ল 09B2
ঐ 0990 শ 09B6
ও 0993 ষ 09B7
ঔ 0994 স 09B8
ক 0995 হ 09B9
খ 0996 ় 09BC
গ 0997 ঽ 09BD
ঘ 0998 া 09BE
ঙ 0999 ি 09BF
চ 099A ী 09C0
ছ 099B ু 09C1
জ 099C ূ 09C2
ঝ 099D ৃ 09C3
ঞ 099E ৄ 09C4
ট 099F ে 09C7
ঠ 09A0 ৈ 09C8
ড 09A1 ো 09CB
ঢ 09A2 ৌ 09CC
ণ 09A3 ্ 09CD
ত 09A4 ৎ 09CE
থ 09A5 ৗ 09D7

Bosnian character set

For Bosnian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ć 0107 đ 0111
č 010D š 0161
ž 017E

Bulgarian character set

For Bulgarian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
а 0430 п 043F
б 0431 р 0440
в 0432 с 0441
г 0433 т 0442
д 0434 у 0443
е 0435 ф 0444
ж 0436 х 0445
з 0437 ц 0446
и 0438 ч 0447
й 0439 ш 0448
к 043A щ 0449
л 043B ъ 044A
м 043C ь 044C
н 043D ю 044E
о 043E я 044F

Catalan character set

For Catalan custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
à 00E0 ï 00EF
ç 00E7 ò 00F2
è 00E8 ó 00F3
é 00E9 ú 00FA
í 00ED ü 00FC
ŀ 0140

Central Kurdish character set

For Central Kurdish custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ئ 0626 م 0645
ا 0627 ن 0646
ب 0628 و 0648
ت 062A پ 067E
ج 062C چ 0686
ح 062D ڕ 0695
خ 062E ژ 0698
د 062F ڤ 06A4
ر 0631 ک 06A9
ز 0632 گ 06AF
س 0633 ڵ 06B5
ش 0634 ھ 06BE
ع 0639 ۆ 06C6
غ 063A ۇ 06C7
ف 0641 ی 06CC
ق 0642 ێ 06CE
ل 0644 ە 06D5

Chinese, Mandarin (Mainland China), Simplified character set

For Chinese (Simplified) custom vocabularies, the Phrase field can use any of the characters listed in the following file:

The SoundsLike field can contain the pinyin syllables listed in the following file:

When you use pinyin syllables in the SoundsLike field, separate the syllables with a hyphen (-).

Amazon Transcribe represents the four tones in Chinese (Simplified) using numbers. The following table shows how tone marks are mapped for the word 'ma'.

Tone Tone mark Tone number
Tone 1 mā ma1
Tone 2 má ma2
Tone 3 mǎ ma3
Tone 4 mà ma4

Note

For the 5th (neutral) tone, you can use Tone 1, with the exception of 'er', which must be mapped to Tone 2. For example, 打转儿 would be represented as 'da3-zhuan4-er2'.
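
The tone rules above can be sketched as a small helper that builds a SoundsLike value from pinyin syllables and tone numbers. The function name `sounds_like` and the use of 5 to denote the neutral tone in the input are assumptions made for this illustration; only the output format comes from this section.

```python
def sounds_like(syllables):
    """Join (pinyin, tone) pairs into a hyphen-separated SoundsLike value.

    tone is 1-4, or 5 for the neutral tone.
    """
    parts = []
    for pinyin, tone in syllables:
        if tone not in (1, 2, 3, 4, 5):
            raise ValueError(f"unexpected tone {tone!r} for {pinyin!r}")
        if tone == 5:
            # Neutral tone: written as Tone 1, except 'er', which maps to Tone 2.
            tone = 2 if pinyin == "er" else 1
        parts.append(f"{pinyin}{tone}")
    return "-".join(parts)


print(sounds_like([("da", 3), ("zhuan", 4), ("er", 5)]))  # da3-zhuan4-er2
```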

Chinese (Simplified) custom vocabularies don't use the IPA field, but you must still include the IPA header in the custom vocabulary table.

The following example is an input file in text format. The example uses spaces to align the columns. Your input files should use TAB characters to separate the columns. Include spaces only in the DisplayAs column.

Phrase        SoundsLike                  IPA    DisplayAs
康健          kang1-jian4
谴责          qian3-ze2
国防大臣      guo2-fang2-da4-chen2
世界博览会    shi4-jie4-bo2-lan3-hui4            世博会
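
Because a real input file needs TAB characters rather than aligning spaces, it can help to generate the table from code. The following sketch serializes the example rows with TAB separators. The names `HEADER`, `ROWS`, and `vocabulary_table` are hypothetical, and writing every row with all four columns (leaving unused cells empty) is an assumption of this sketch rather than a documented requirement.

```python
HEADER = ["Phrase", "SoundsLike", "IPA", "DisplayAs"]

# Rows from the example above. The IPA column stays empty for
# Chinese (Simplified); DisplayAs is filled in only where it is used.
ROWS = [
    ["康健", "kang1-jian4", "", ""],
    ["谴责", "qian3-ze2", "", ""],
    ["国防大臣", "guo2-fang2-da4-chen2", "", ""],
    ["世界博览会", "shi4-jie4-bo2-lan3-hui4", "", "世博会"],
]


def vocabulary_table(rows):
    """Serialize rows as TAB-separated text, preceded by the four-column header."""
    lines = ["\t".join(HEADER)]
    for row in rows:
        if len(row) != len(HEADER):
            raise ValueError(f"expected {len(HEADER)} columns, got {len(row)}")
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"


print(vocabulary_table(ROWS))
```

Building each line with an explicit `"\t".join` keeps the separators visible in the code and avoids accidentally introducing spaces outside the DisplayAs column.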

Chinese, Mandarin (Taiwan), Traditional character set

For Chinese (Traditional) custom vocabularies, the Phrase field can use any of the characters listed in the following file:

The SoundsLike field can contain the zhuyin syllables listed in the following file:

When you use zhuyin syllables in the SoundsLike field, separate the syllables with a hyphen (-).

Amazon Transcribe represents the four tones in Chinese (Traditional) using zhuyin tone marks. The following table shows the tone marks for the syllable ㄇㄚ.

Tone Tone mark
Tone 1 ㄇㄚ
Tone 2 ㄇㄚˊ
Tone 3 ㄇㄚˇ
Tone 4 ㄇㄚˋ

Chinese (Traditional) custom vocabularies don't use the IPA field, but you must still include the IPA header in the custom vocabulary table.

The following example is an input file in text format. The example uses spaces to align the columns. Your input files should use TAB characters to separate the columns. Include spaces only in the DisplayAs column.

Phrase        SoundsLike                              IPA    DisplayAs
健康          ㄐㄧㄢˋ-ㄎㄤ
譴責          ㄑㄧㄢˇ-ㄗㄜˊ
國防部長      ㄍㄨㄛˊ-ㄈㄤˊ-ㄅㄨˋ-ㄓㄤˇ
世界博覽會    ㄕˋ-ㄐㄧㄝˋ-ㄅㄛˊ-ㄌㄢˇ-ㄏㄨㄟˋ               世博會

Croatian character set

For Croatian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ć 0107 đ 0111
č 010D š 0161
ž 017E

Czech character set

For Czech custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
á 00E1 ď 010F
é 00E9 ě 011B
í 00ED ň 0148
ó 00F3 ř 0159
ú 00FA š 0161
ý 00FD ť 0165
č 010D ů 016F
ž 017E

Danish character set

For Danish custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
Å 00C5 æ 00E6
Æ 00C6 é 00E9
Ø 00D8 ø 00F8
å 00E5

Dutch character set

For Dutch custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • ' (apostrophe)

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
à 00E0 î 00EE
á 00E1 ï 00EF
â 00E2 ñ 00F1
ä 00E4 ò 00F2
ç 00E7 ó 00F3
è 00E8 ô 00F4
é 00E9 ö 00F6
ê 00EA ù 00F9
ë 00EB ú 00FA
ì 00EC û 00FB
í 00ED ü 00FC

English character set

For English custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • ' (apostrophe)

  • - (hyphen)

  • . (period)

Estonian character set

For Estonian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ä 00E4 ü 00FC
õ 00F5 š 0161
ö 00F6 ž 017E

Farsi character set

For Farsi custom vocabularies, you can use the following characters in the Phrase field.

Character Code Character Code
ء 0621 ظ 0638
آ 0622 ع 0639
أ 0623 غ 063A
ؤ 0624 ف 0641
ئ 0626 ق 0642
ا 0627 ل 0644
ب 0628 م 0645
ت 062A ن 0646
ث 062B ه 0647
ج 062C و 0648
ح 062D َ 064E
خ 062E ُ 064F
د 062F ِ 0650
ذ 0630 ّ 0651
ر 0631 پ 067E
ز 0632 چ 0686
س 0633 ژ 0698
ش 0634 ک 06A9
ص 0635 گ 06AF
ض 0636 ی 06CC
ط 0637

Finnish character set

For Finnish custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ä 00E4 ö 00F6
å 00E5 š 0161
ž 017E

French character set

For French custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • ' (apostrophe)

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
À 00C0 à 00E0
 00C2 â 00E2
Ç 00C7 ç 00E7
È 00C8 è 00E8
É 00C9 é 00E9
Ê 00CA ê 00EA
Ë 00CB ë 00EB
Î 00CE î 00EE
Ï 00CF ï 00EF
Ô 00D4 ô 00F4
Ö 00D6 ö 00F6
Ù 00D9 ù 00F9
Û 00DB û 00FB
Ü 00DC ü 00FC

Galician character set

For Galician custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
á 00E1 ñ 00F1
é 00E9 ó 00F3
í 00ED ú 00FA
ü 00FC

Georgian character set

For Georgian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ა 10D0 რ 10E0
ბ 10D1 ს 10E1
გ 10D2 ტ 10E2
დ 10D3 უ 10E3
ე 10D4 ფ 10E4
ვ 10D5 ქ 10E5
ზ 10D6 ღ 10E6
თ 10D7 ყ 10E7
ი 10D8 შ 10E8
კ 10D9 ჩ 10E9
ლ 10DA ც 10EA
მ 10DB ძ 10EB
ნ 10DC წ 10EC
ო 10DD ჭ 10ED
პ 10DE ხ 10EE
ჟ 10DF ჯ 10EF
ჰ 10F0

German character set

For German custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • ' (apostrophe)

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ä 00E4 Ä 00C4
ö 00F6 Ö 00D6
ü 00FC Ü 00DC
ß 00DF

Greek character set

For Greek custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ά 03AC ν 03BD
έ 03AD ξ 03BE
ή 03AE ο 03BF
ί 03AF π 03C0
ΰ 03B0 ρ 03C1
α 03B1 ς 03C2
β 03B2 σ 03C3
γ 03B3 τ 03C4
δ 03B4 υ 03C5
ε 03B5 φ 03C6
ζ 03B6 χ 03C7
η 03B7 ψ 03C8
θ 03B8 ω 03C9
ι 03B9 ϊ 03CA
κ 03BA ϋ 03CB
λ 03BB ό 03CC
μ 03BC ώ 03CE
ΐ 0390

Gujarati character set

For Gujarati custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ઁ 0A81 દ 0AA6
ં 0A82 ધ 0AA7
ઃ 0A83 ન 0AA8
અ 0A85 પ 0AAA
આ 0A86 ફ 0AAB
ઇ 0A87 બ 0AAC
ઈ 0A88 ભ 0AAD
ઉ 0A89 મ 0AAE
ઊ 0A8A ય 0AAF
ઋ 0A8B ર 0AB0
ઍ 0A8D લ 0AB2
એ 0A8F ળ 0AB3
ઐ 0A90 વ 0AB5
ઑ 0A91 શ 0AB6
ઓ 0A93 ષ 0AB7
ઔ 0A94 સ 0AB8
ક 0A95 હ 0AB9
ખ 0A96 ઼ 0ABC
ગ 0A97 ા 0ABE
ઘ 0A98 િ 0ABF
ઙ 0A99 ી 0AC0
ચ 0A9A ુ 0AC1
છ 0A9B ૂ 0AC2
જ 0A9C ૃ 0AC3
ઝ 0A9D ૅ 0AC5
ઞ 0A9E ે 0AC7
ટ 0A9F ૈ 0AC8
ઠ 0AA0 ૉ 0AC9
ડ 0AA1 ો 0ACB
ઢ 0AA2 ૌ 0ACC
ણ 0AA3 ્ 0ACD
ત 0AA4 ૐ 0AD0
થ 0AA5 ૠ 0AE0

Hausa character set

For Hausa custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ƙ 0199 ɓ 0253
ƴ 01B4 ɗ 0257
̃ 0303

Hebrew character set

For Hebrew custom vocabularies, you can use the following Unicode characters in the Phrase field:

Character Code Character Code
- 002D ם 05DD
א 05D0 מ 05DE
ב 05D1 ן 05DF
ג 05D2 נ 05E0
ד 05D3 ס 05E1
ה 05D4 ע 05E2
ו 05D5 ף 05E3
ז 05D6 פ 05E4
ח 05D7 ץ 05E5
ט 05D8 צ 05E6
י 05D9 ק 05E7
ך 05DA ר 05E8
כ 05DB ש 05E9
ל 05DC ת 05EA

Hindi character set

For Hindi custom vocabularies, you can use the following Unicode characters in the Phrase field:

Character Code Character Code
- 002D थ 0925
. 002E द 0926
ँ 0901 ध 0927
ं 0902 न 0928
ः 0903 प 092A
अ 0905 फ 092B
आ 0906 ब 092C
इ 0907 भ 092D
ई 0908 म 092E
उ 0909 य 092F
ऊ 090A र 0930
ऋ 090B ल 0932
ए 090F व 0935
ऐ 0910 श 0936
ऑ 0911 ष 0937
ओ 0913 स 0938
औ 0914 ह 0939
क 0915 ा 093E
ख 0916 ि 093F
ग 0917 ी 0940
घ 0918 ु 0941
ङ 0919 ू 0942
च 091A ृ 0943
छ 091B ॅ 0945
ज 091C े 0947
झ 091D ै 0948
ञ 091E ॉ 0949
ट 091F ो 094B
ठ 0920 ौ 094C
ड 0921 ् 094D
ढ 0922 ज़ 095B
ण 0923 ड़ 095C
त 0924 ढ़ 095D

Amazon Transcribe maps the following characters:

Character Mapped to
ऩ (0929) न (0928)
ऱ (0931) र (0930)
क़ (0958) क (0915)
ख़ (0959) ख (0916)
ग़ (095A) ग (0917)
फ़ (095E) फ (092B)
य़ (095F) य (092F)
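
The mapping table above can be expressed as a translation table, which is useful for predicting how a phrase looks after the mapping. This is a sketch that mirrors the table only: the names `HINDI_MAPPING` and `apply_hindi_mapping` are hypothetical, and whether decomposed sequences (a base letter followed by the nukta 093C) are mapped the same way is not stated here, so the sketch does not handle them.

```python
# Code points from the mapping table above: source character -> mapped character.
HINDI_MAPPING = {
    0x0929: 0x0928,  # U+0929 -> U+0928
    0x0931: 0x0930,  # U+0931 -> U+0930
    0x0958: 0x0915,  # U+0958 -> U+0915
    0x0959: 0x0916,  # U+0959 -> U+0916
    0x095A: 0x0917,  # U+095A -> U+0917
    0x095E: 0x092B,  # U+095E -> U+092B
    0x095F: 0x092F,  # U+095F -> U+092F
}


def apply_hindi_mapping(text):
    """Replace each mapped character in text with its target from the table."""
    return text.translate(HINDI_MAPPING)


mapped = apply_hindi_mapping("\u0958\u0932\u092e")
print(" ".join(f"{ord(ch):04X}" for ch in mapped))  # 0915 0932 092E
```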

Hungarian character set

For Hungarian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
á 00E1 ö 00F6
é 00E9 ú 00FA
í 00ED ü 00FC
ó 00F3 ő 0151
ű 0171

Icelandic character set

For Icelandic custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
á 00E1 ú 00FA
é 00E9 ý 00FD
ð 00F0 þ 00FE
í 00ED æ 00E6
ó 00F3 ö 00F6

Indonesian character set

For Indonesian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • ' (apostrophe)

  • - (hyphen)

  • . (period)

Italian character set

For Italian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • ' (apostrophe)

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
À 00C0 à 00E0
Ä 00C4 ä 00E4
Ç 00C7 ç 00E7
È 00C8 è 00E8
É 00C9 é 00E9
Ê 00CA ê 00EA
Ë 00CB ë 00EB
Ì 00CC ì 00EC
Ò 00D2 ò 00F2
Ù 00D9 ù 00F9
Ü 00DC ü 00FC

Japanese character set

For Japanese custom vocabularies, the DisplayAs field supports all hiragana, katakana, and kanji characters, and fullwidth romaji capital letters.

The Phrase field supports the characters listed in the following file:

Kabyle character set

For Kabyle custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ï 00EF ḍ 1E0D
č 010D ḥ 1E25
ř 0159 ṛ 1E5B
ǧ 01E7 ṣ 1E63
ɛ 025B ṭ 1E6D
ɣ 0263 ẓ 1E93

Kannada character set

For Kannada custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ಂ 0C82 ಧ 0CA7
ಃ 0C83 ನ 0CA8
ಅ 0C85 ಪ 0CAA
ಆ 0C86 ಫ 0CAB
ಇ 0C87 ಬ 0CAC
ಈ 0C88 ಭ 0CAD
ಉ 0C89 ಮ 0CAE
ಊ 0C8A ಯ 0CAF
ಋ 0C8B ರ 0CB0
ಎ 0C8E ಲ 0CB2
ಏ 0C8F ಳ 0CB3
ಐ 0C90 ವ 0CB5
ಒ 0C92 ಶ 0CB6
ಓ 0C93 ಷ 0CB7
ಔ 0C94 ಸ 0CB8
ಕ 0C95 ಹ 0CB9
ಖ 0C96 ಼ 0CBC
ಗ 0C97 ಽ 0CBD
ಘ 0C98 ಾ 0CBE
ಙ 0C99 ಿ 0CBF
ಚ 0C9A ೀ 0CC0
ಛ 0C9B ು 0CC1
ಜ 0C9C ೂ 0CC2
ಝ 0C9D ೃ 0CC3
ಞ 0C9E ೆ 0CC6
ಟ 0C9F ೇ 0CC7
ಠ 0CA0 ೈ 0CC8
ಡ 0CA1 ೊ 0CCA
ಢ 0CA2 ೋ 0CCB
ಣ 0CA3 ೌ 0CCC
ತ 0CA4 ್ 0CCD
ಥ 0CA5 ೕ 0CD5
ದ 0CA6 ೖ 0CD6
ೠ 0CE0

Kazakh character set

For Kazakh custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
т 0442 ы 044B
б 0431 я 044F
о 043E с 0441
п 043F һ 04BB
ш 0448 д 0434
и 0438 р 0440
ч 0447 г 0433
н 043D ё 0451
қ 049B й 0439
і 0456 ө 04E9
щ 0449 в 0432
е 0435 э 044D
ә 04D9 ң 04A3
ю 044E л 043B
з 0437 ф 0444
х 0445 к 043A
ц 0446 у 0443
ү 04AF ж 0436
м 043C ғ 0493
ь 044C а 0430
ъ 044A ұ 04B1

Kinyarwanda character set

For Kinyarwanda custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
á 00E1 ó 00F3
â 00E2 ô 00F4
ã 00E3 ú 00FA
ç 00E7 ü 00FC
è 00E8 ā 0101
é 00E9 ē 0113
ê 00EA ī 012B
ë 00EB ō 014D
í 00ED ū 016B
ï 00EF ́ 0301

Korean character set

For Korean custom vocabularies, you can use any of the Hangul syllables in the Phrase field. For more information, see Hangul Syllables on Wikipedia.
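
As a rough local check, the Hangul Syllables block in Unicode spans U+AC00 through U+D7A3, so a phrase can be screened with a range test. The function name `non_hangul_characters` is hypothetical. This section does not say whether any characters outside that block are accepted, so treat the result as a prompt for review rather than a definitive verdict.

```python
# First and last code points of the Unicode Hangul Syllables block.
HANGUL_FIRST = 0xAC00
HANGUL_LAST = 0xD7A3


def non_hangul_characters(phrase):
    """Return the characters of phrase that are not precomposed Hangul syllables."""
    return [ch for ch in phrase if not HANGUL_FIRST <= ord(ch) <= HANGUL_LAST]


# U+D55C U+AE00 are both Hangul syllables.
print(non_hangul_characters("\ud55c\uae00"))  # []

# U+1100 is a conjoining jamo, which lies outside the syllables block.
print(len(non_hangul_characters("\ud55c\u1100")))  # 1
```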

Kyrgyz character set

For Kyrgyz custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
а 0430 љ 0459
б 0431 њ 045A
в 0432 ћ 045B
г 0433 ќ 045C
д 0434 ѝ 045D
е 0435 ў 045E
ж 0436 џ 045F
з 0437 ґ 0491
и 0438 ғ 0493
й 0439 җ 0497
к 043A ҙ 0499
л 043B қ 049B
м 043C ҟ 049F
н 043D ҡ 04A1
о 043E ң 04A3
п 043F ҥ 04A5
р 0440 ҩ 04A9
с 0441 ҫ 04AB
т 0442 ҭ 04AD
у 0443 ү 04AF
ф 0444 ұ 04B1
х 0445 ҳ 04B3
ц 0446 ҵ 04B5
ч 0447 ҷ 04B7
ш 0448 һ 04BB
щ 0449 ҽ 04BD
ъ 044A ҿ 04BF
ы 044B ӊ 04CA
ь 044C ӑ 04D1
э 044D ӓ 04D3
ю 044E ӗ 04D7
я 044F ә 04D9
ѐ 0450 ӡ 04E1
ё 0451 ӣ 04E3
ђ 0452 ӧ 04E7
ѓ 0453 ө 04E9
є 0454 ӯ 04EF
ѕ 0455 ӱ 04F1
і 0456 ӳ 04F3
ї 0457 ӷ 04F7
ј 0458 ӹ 04F9

Latvian character set

For Latvian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ā 0101 ķ 0137
č 010D ļ 013C
ē 0113 ņ 0146
ģ 0123 š 0161
ī 012B ū 016B
ž 017E

Lithuanian character set

For Lithuanian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ą 0105 į 012F
č 010D š 0161
ę 0119 ų 0173
ė 0117 ū 016B
ž 017E

Luganda character set

For Luganda custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ÿ 00FF ŋ 014B

Macedonian character set

For Macedonian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
а 0430 љ 0459
б 0431 њ 045A
в 0432 ћ 045B
г 0433 ќ 045C
д 0434 ѝ 045D
е 0435 ў 045E
ж 0436 џ 045F
з 0437 ґ 0491
и 0438 ғ 0493
й 0439 җ 0497
к 043A ҙ 0499
л 043B қ 049B
м 043C ҟ 049F
н 043D ҡ 04A1
о 043E ң 04A3
п 043F ҥ 04A5
р 0440 ҩ 04A9
с 0441 ҫ 04AB
т 0442 ҭ 04AD
у 0443 ү 04AF
ф 0444 ұ 04B1
х 0445 ҳ 04B3
ц 0446 ҵ 04B5
ч 0447 ҷ 04B7
ш 0448 һ 04BB
щ 0449 ҽ 04BD
ъ 044A ҿ 04BF
ы 044B ӊ 04CA
ь 044C ӑ 04D1
э 044D ӓ 04D3
ю 044E ӗ 04D7
я 044F ә 04D9
ѐ 0450 ӡ 04E1
ё 0451 ӣ 04E3
ђ 0452 ӧ 04E7
ѓ 0453 ө 04E9
є 0454 ӯ 04EF
ѕ 0455 ӱ 04F1
і 0456 ӳ 04F3
ї 0457 ӷ 04F7
ј 0458 ӹ 04F9

Malay character set

For Malay custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • ' (apostrophe)

  • - (hyphen)

  • . (period)

Malayalam character set

For Malayalam custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ം 0D02 ന 0D28
ഃ 0D03 പ 0D2A
അ 0D05 ഫ 0D2B
ആ 0D06 ബ 0D2C
ഇ 0D07 ഭ 0D2D
ഈ 0D08 മ 0D2E
ഉ 0D09 യ 0D2F
ഊ 0D0A ര 0D30
ഋ 0D0B റ 0D31
എ 0D0E ല 0D32
ഏ 0D0F ള 0D33
ഐ 0D10 ഴ 0D34
ഒ 0D12 വ 0D35
ഓ 0D13 ശ 0D36
ഔ 0D14 ഷ 0D37
ക 0D15 സ 0D38
ഖ 0D16 ഹ 0D39
ഗ 0D17 ാ 0D3E
ഘ 0D18 ി 0D3F
ങ 0D19 ീ 0D40
ച 0D1A ു 0D41
ഛ 0D1B ൂ 0D42
ജ 0D1C ൃ 0D43
ഝ 0D1D െ 0D46
ഞ 0D1E േ 0D47
ട 0D1F ൈ 0D48
ഠ 0D20 ൊ 0D4A
ഡ 0D21 ോ 0D4B
ഢ 0D22 ൌ 0D4C
ണ 0D23 ് 0D4D
ത 0D24 ൺ 0D7A
ഥ 0D25 ൻ 0D7B
ദ 0D26 ർ 0D7C
ധ 0D27 ൽ 0D7D

Maltese character set

For Maltese custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
à 00E0 ù 00F9
è 00E8 ċ 010B
ì 00EC ġ 0121
ò 00F2 ħ 0127
ż 017C

Marathi character set

For Marathi custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ँ 0901 थ 0925
ं 0902 द 0926
ः 0903 ध 0927
अ 0905 न 0928
आ 0906 प 092A
इ 0907 फ 092B
ई 0908 ब 092C
उ 0909 भ 092D
ऊ 090A म 092E
ऋ 090B य 092F
ऍ 090D र 0930
ए 090F ल 0932
ऐ 0910 ळ 0933
ऑ 0911 व 0935
ओ 0913 श 0936
औ 0914 ष 0937
क 0915 स 0938
ख 0916 ह 0939
ग 0917 ़ 093C
घ 0918 ा 093E
ङ 0919 ि 093F
च 091A ी 0940
छ 091B ु 0941
ज 091C ू 0942
झ 091D ृ 0943
ञ 091E ॅ 0945
ट 091F े 0947
ठ 0920 ै 0948
ड 0921 ॉ 0949
ढ 0922 ो 094B
ण 0923 ौ 094C
त 0924 ् 094D
ॐ 0950

Meadow Mari character set

For Meadow Mari custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
а 0430 љ 0459
б 0431 њ 045A
в 0432 ћ 045B
г 0433 ќ 045C
д 0434 ѝ 045D
е 0435 ў 045E
ж 0436 џ 045F
з 0437 ґ 0491
и 0438 ғ 0493
й 0439 җ 0497
к 043A ҙ 0499
л 043B қ 049B
м 043C ҟ 049F
н 043D ҡ 04A1
о 043E ң 04A3
п 043F ҥ 04A5
р 0440 ҩ 04A9
с 0441 ҫ 04AB
т 0442 ҭ 04AD
у 0443 ү 04AF
ф 0444 ұ 04B1
х 0445 ҳ 04B3
ц 0446 ҵ 04B5
ч 0447 ҷ 04B7
ш 0448 һ 04BB
щ 0449 ҽ 04BD
ъ 044A ҿ 04BF
ы 044B ӊ 04CA
ь 044C ӑ 04D1
э 044D ӓ 04D3
ю 044E ӗ 04D7
я 044F ә 04D9
ѐ 0450 ӡ 04E1
ё 0451 ӣ 04E3
ђ 0452 ӧ 04E7
ѓ 0453 ө 04E9
є 0454 ӯ 04EF
ѕ 0455 ӱ 04F1
і 0456 ӳ 04F3
ї 0457 ӷ 04F7
ј 0458 ӹ 04F9

Mongolian character set

For Mongolian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
а 0430 љ 0459
б 0431 њ 045A
в 0432 ћ 045B
г 0433 ќ 045C
д 0434 ѝ 045D
е 0435 ў 045E
ж 0436 џ 045F
з 0437 ґ 0491
и 0438 ғ 0493
й 0439 җ 0497
к 043A ҙ 0499
л 043B қ 049B
м 043C ҟ 049F
н 043D ҡ 04A1
о 043E ң 04A3
п 043F ҥ 04A5
р 0440 ҩ 04A9
с 0441 ҫ 04AB
т 0442 ҭ 04AD
у 0443 ү 04AF
ф 0444 ұ 04B1
х 0445 ҳ 04B3
ц 0446 ҵ 04B5
ч 0447 ҷ 04B7
ш 0448 һ 04BB
щ 0449 ҽ 04BD
ъ 044A ҿ 04BF
ы 044B ӊ 04CA
ь 044C ӑ 04D1
э 044D ӓ 04D3
ю 044E ӗ 04D7
я 044F ә 04D9
ѐ 0450 ӡ 04E1
ё 0451 ӣ 04E3
ђ 0452 ӧ 04E7
ѓ 0453 ө 04E9
є 0454 ӯ 04EF
ѕ 0455 ӱ 04F1
і 0456 ӳ 04F3
ї 0457 ӷ 04F7
ј 0458 ӹ 04F9

Norwegian Bokmål character set

For Norwegian Bokmål custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
å 00E5 æ 00E6
ø 00F8

Odia/Oriya character set

For Odia/Oriya custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ଁ 0B01 ଦ 0B26
ଂ 0B02 ଧ 0B27
ଃ 0B03 ନ 0B28
ଅ 0B05 ପ 0B2A
ଆ 0B06 ଫ 0B2B
ଇ 0B07 ବ 0B2C
ଈ 0B08 ଭ 0B2D
ଉ 0B09 ମ 0B2E
ଊ 0B0A ଯ 0B2F
ଋ 0B0B ର 0B30
ଏ 0B0F ଲ 0B32
ଐ 0B10 ଳ 0B33
ଓ 0B13 ଶ 0B36
ଔ 0B14 ଷ 0B37
କ 0B15 ସ 0B38
ଖ 0B16 ହ 0B39
ଗ 0B17 ଼ 0B3C
ଘ 0B18 ା 0B3E
ଙ 0B19 ି 0B3F
ଚ 0B1A ୀ 0B40
ଛ 0B1B ୁ 0B41
ଜ 0B1C ୂ 0B42
ଝ 0B1D ୃ 0B43
ଞ 0B1E େ 0B47
ଟ 0B1F ୈ 0B48
ଠ 0B20 ୋ 0B4B
ଡ 0B21 ୌ 0B4C
ଢ 0B22 ୍ 0B4D
ଣ 0B23 ୖ 0B56
ତ 0B24 ୟ 0B5F
ଥ 0B25 ୠ 0B60
ୱ 0B71

Pashto character set

For Pashto custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
آ 0622 و 0648
أ 0623 ي 064A
ؤ 0624 ً 064B
ئ 0626 ٌ 064C
ا 0627 ٍ 064D
ب 0628 َ 064E
ت 062A ُ 064F
ث 062B ِ 0650
ج 062C ّ 0651
ح 062D ْ 0652
خ 062E ٔ 0654
د 062F ٰ 0670
ذ 0630 ټ 067C
ر 0631 پ 067E
ز 0632 ځ 0681
س 0633 څ 0685
ش 0634 چ 0686
ص 0635 ډ 0689
ض 0636 ړ 0693
ط 0637 ږ 0696
ظ 0638 ژ 0698
ع 0639 ښ 069A
غ 063A ک 06A9
ف 0641 ګ 06AB
ق 0642 گ 06AF
ل 0644 ڼ 06BC
م 0645 ی 06CC
ن 0646 ۍ 06CD
ه 0647 ې 06D0

Polish character set

For Polish custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ó 00F3 ł 0142
ą 0105 ń 0144
ć 0107 ś 015B
ę 0119 ź 017A
ż 017C

Portuguese character set

For Portuguese custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • ' (apostrophe)

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
À 00C0 à 00E0
Á 00C1 á 00E1
 00C2 â 00E2
à 00C3 ã 00E3
Ä 00C4 ä 00E4
Ç 00C7 ç 00E7
È 00C8 è 00E8
É 00C9 é 00E9
Ê 00CA ê 00EA
Ë 00CB ë 00EB
Í 00CD í 00ED
Ñ 00D1 ñ 00F1
Ó 00D3 ó 00F3
Ô 00D4 ô 00F4
Õ 00D5 õ 00F5
Ö 00D6 ö 00F6
Ú 00DA ú 00FA
Ü 00DC ü 00FC

Punjabi character set

For Punjabi custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ਅ 0A05 ਧ 0A27
ਆ 0A06 ਨ 0A28
ਇ 0A07 ਪ 0A2A
ਈ 0A08 ਫ 0A2B
ਉ 0A09 ਬ 0A2C
ਊ 0A0A ਭ 0A2D
ਏ 0A0F ਮ 0A2E
ਐ 0A10 ਯ 0A2F
ਓ 0A13 ਰ 0A30
ਔ 0A14 ਲ 0A32
ਕ 0A15 ਵ 0A35
ਖ 0A16 ਸ 0A38
ਗ 0A17 ਹ 0A39
ਘ 0A18 ਼ 0A3C
ਙ 0A19 ਾ 0A3E
ਚ 0A1A ਿ 0A3F
ਛ 0A1B ੀ 0A40
ਜ 0A1C ੁ 0A41
ਝ 0A1D ੂ 0A42
ਞ 0A1E ੇ 0A47
ਟ 0A1F ੈ 0A48
ਠ 0A20 ੋ 0A4B
ਡ 0A21 ੌ 0A4C
ਢ 0A22 ੍ 0A4D
ਣ 0A23 ੜ 0A5C
ਤ 0A24 ੰ 0A70
ਥ 0A25 ੱ 0A71
ਦ 0A26 ੲ 0A72
ੳ 0A73

Romanian character set

For Romanian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ă 0103 ș 0219
â 00E2 ț 021B
î 00EE ş 015F
ţ 0163

Russian character set

For Russian custom vocabularies, you can use the following characters in the Phrase field:

Character Code Character Code
' 0027 п 043F
- 002D р 0440
. 002E с 0441
а 0430 т 0442
б 0431 у 0443
в 0432 ф 0444
г 0433 х 0445
д 0434 ц 0446
е 0435 ч 0447
ж 0436 ш 0448
з 0437 щ 0449
и 0438 ъ 044A
й 0439 ы 044B
к 043A ь 044C
л 043B э 044D
м 043C ю 044E
н 043D я 044F
о 043E ё 0451

Serbian character set

For Serbian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ć 0107 і 0456
č 010D ї 0457
đ 0111 ј 0458
š 0161 љ 0459
ž 017E њ 045A
а 0430 ћ 045B
б 0431 ќ 045C
в 0432 ѝ 045D
г 0433 ў 045E
д 0434 џ 045F
е 0435 ґ 0491
ж 0436 ғ 0493
з 0437 җ 0497
и 0438 ҙ 0499
й 0439 қ 049B
к 043A ҟ 049F
л 043B ҡ 04A1
м 043C ң 04A3
н 043D ҥ 04A5
о 043E ҩ 04A9
п 043F ҫ 04AB
р 0440 ҭ 04AD
с 0441 ү 04AF
т 0442 ұ 04B1
у 0443 ҳ 04B3
ф 0444 ҵ 04B5
х 0445 ҷ 04B7
ц 0446 һ 04BB
ч 0447 ҽ 04BD
ш 0448 ҿ 04BF
щ 0449 ӊ 04CA
ъ 044A ӑ 04D1
ы 044B ӓ 04D3
ь 044C ӗ 04D7
э 044D ә 04D9
ю 044E ӡ 04E1
я 044F ӣ 04E3
ѐ 0450 ӧ 04E7
ё 0451 ө 04E9
ђ 0452 ӯ 04EF
ѓ 0453 ӱ 04F1
є 0454 ӳ 04F3
ѕ 0455 ӷ 04F7
ӹ 04F9

Sinhala character set

For Sinhala custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ං 0D82 ද 0DAF
ඃ 0D83 ධ 0DB0
අ 0D85 න 0DB1
ආ 0D86 ඳ 0DB3
ඇ 0D87 ප 0DB4
ඈ 0D88 ඵ 0DB5
ඉ 0D89 බ 0DB6
ඊ 0D8A භ 0DB7
උ 0D8B ම 0DB8
ඌ 0D8C ඹ 0DB9
ඍ 0D8D ය 0DBA
එ 0D91 ර 0DBB
ඒ 0D92 ල 0DBD
ඓ 0D93 ව 0DC0
ඔ 0D94 ශ 0DC1
ඕ 0D95 ෂ 0DC2
ඖ 0D96 ස 0DC3
ක 0D9A හ 0DC4
ඛ 0D9B ළ 0DC5
ග 0D9C ෆ 0DC6
ඝ 0D9D ් 0DCA
ඞ 0D9E ා 0DCF
ඟ 0D9F ැ 0DD0
ච 0DA0 ෑ 0DD1
ඡ 0DA1 ි 0DD2
ජ 0DA2 ී 0DD3
ඣ 0DA3 ු 0DD4
ඤ 0DA4 ූ 0DD6
ඥ 0DA5 ෘ 0DD8
ට 0DA7 ෙ 0DD9
ඨ 0DA8 ේ 0DDA
ඩ 0DA9 ෛ 0DDB
ඪ 0DAA ො 0DDC
ණ 0DAB ෝ 0DDD
ඬ 0DAC ෞ 0DDE
ත 0DAD ෟ 0DDF
ථ 0DAE ෲ 0DF2

Slovak character set

For Slovak custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
á 00E1 ň 0148
ä 00E4 ó 00F3
č 010D ô 00F4
ď 010F ŕ 0155
é 00E9 š 0161
í 00ED ť 0165
ĺ 013A ú 00FA
ľ 013E ý 00FD
ž 017E

Slovenian character set

For Slovenian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
č 010D š 0161
ž 017E

Somali character set

For Somali custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
s 0073 d 0064
t 0074 a 0061
a 0061 r 0072
n 006E d 0064

Spanish character set

For Spanish custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • ' (apostrophe)

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
Á 00C1 á 00E1
É 00C9 é 00E9
Í 00CD í 00ED
Ó 00D3 ó 00F3
Ú 00DA ú 00FA
Ñ 00D1 ñ 00F1
ü 00FC    
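
Before uploading a custom vocabulary, it can help to scan each phrase for characters outside the language's set. The following is a minimal sketch for Spanish, assuming the set consists of exactly the bullet-list characters and the table entries above. The names `SPANISH_ALLOWED` and `unsupported_characters` are hypothetical and not part of any Amazon Transcribe API.

```python
import string

# Allowed characters for Spanish, built from the list and table above.
SPANISH_ALLOWED = (
    set(string.ascii_lowercase)
    | set(string.ascii_uppercase)
    | set("'-.")
    | {chr(cp) for cp in (
        0x00C1, 0x00E1,  # Á á
        0x00C9, 0x00E9,  # É é
        0x00CD, 0x00ED,  # Í í
        0x00D3, 0x00F3,  # Ó ó
        0x00DA, 0x00FA,  # Ú ú
        0x00D1, 0x00F1,  # Ñ ñ
        0x00FC,          # ü
    )}
)


def unsupported_characters(phrase, allowed=SPANISH_ALLOWED):
    """Return (index, character, hex code) for each character not in the set."""
    return [
        (i, ch, f"{ord(ch):04X}")
        for i, ch in enumerate(phrase)
        if ch not in allowed
    ]


print(unsupported_characters("Logro\u00f1o"))      # []
print(unsupported_characters("Los-\u00c1ngeles"))  # []
print(unsupported_characters("\u00dcbung"))        # [(0, 'Ü', '00DC')]
```

Reporting the code point alongside the character makes look-alike characters easier to spot, since two glyphs that render identically still print different codes.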

Sundanese character set

For Sundanese custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
s 0073 d 0064
t 0074 a 0061
a 0061 r 0072
n 006E d 0064

Swahili character set

For Swahili custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
s 0073 d 0064
t 0074 a 0061
a 0061 r 0072
n 006E d 0064

Swedish character set

For Swedish custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • ' (apostrophe)

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
Ä 00C4 ä 00E4
Å 00C5 å 00E5
Ö 00D6 ö 00F6

Tagalog/Filipino character set

For Tagalog/Filipino custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code
ñ 00F1

Tamil character set

For Tamil custom vocabularies, you can use the following characters in the Phrase field:

Character Code Character Code
அ 0B85 ர 0BB0
ஆ 0B86 ல 0BB2
இ 0B87 வ 0BB5
ஈ 0B88 ழ 0BB4
உ 0B89 ள 0BB3
ஊ 0B8A ற 0BB1
எ 0B8E ன 0BA9
ஏ 0B8F ஜ 0B9C
ஐ 0B90 ஶ 0BB6
ஒ 0B92 ஷ 0BB7
ஓ 0B93 ஸ 0BB8
ஔ 0B94 ஹ 0BB9
ஃ 0B83 ் 0BCD
க 0B95 ா 0BBE
ங 0B99 ி 0BBF
ச 0B9A ீ 0BC0
ஞ 0B9E ு 0BC1
ட 0B9F ூ 0BC2
ண 0BA3 ெ 0BC6
த 0BA4 ே 0BC7
ந 0BA8 ை 0BC8
ப 0BAA ொ 0BCA
ம 0BAE ோ 0BCB
ய 0BAF ௌ 0BCC

Tatar character set

For Tatar custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
а 0430 љ 0459
б 0431 њ 045A
в 0432 ћ 045B
г 0433 ќ 045C
д 0434 ѝ 045D
е 0435 ў 045E
ж 0436 џ 045F
з 0437 ґ 0491
и 0438 ғ 0493
й 0439 җ 0497
к 043A ҙ 0499
л 043B қ 049B
м 043C ҟ 049F
н 043D ҡ 04A1
о 043E ң 04A3
п 043F ҥ 04A5
р 0440 ҩ 04A9
с 0441 ҫ 04AB
т 0442 ҭ 04AD
у 0443 ү 04AF
ф 0444 ұ 04B1
х 0445 ҳ 04B3
ц 0446 ҵ 04B5
ч 0447 ҷ 04B7
ш 0448 һ 04BB
щ 0449 ҽ 04BD
ъ 044A ҿ 04BF
ы 044B ӊ 04CA
ь 044C ӑ 04D1
э 044D ӓ 04D3
ю 044E ӗ 04D7
я 044F ә 04D9
ѐ 0450 ӡ 04E1
ё 0451 ӣ 04E3
ђ 0452 ӧ 04E7
ѓ 0453 ө 04E9
є 0454 ӯ 04EF
ѕ 0455 ӱ 04F1
і 0456 ӳ 04F3
ї 0457 ӷ 04F7
ј 0458 ӹ 04F9

Telugu character set

For Telugu custom vocabularies, you can use the following characters in the Phrase field:

Character Code Character Code
- 002D త 0C24
ఁ 0C01 థ 0C25
ం 0C02 ద 0C26
ః 0C03 ధ 0C27
అ 0C05 న 0C28
ఆ 0C06 ప 0C2A
ఇ 0C07 ఫ 0C2B
ఈ 0C08 బ 0C2C
ఉ 0C09 భ 0C2D
ఊ 0C0A మ 0C2E
ఋ 0C0B య 0C2F
ర 0C30 ఎ 0C0E
ఱ 0C31 ఏ 0C0F
ల 0C32 ఐ 0C10
ళ 0C33 ఒ 0C12
వ 0C35 ఓ 0C13
శ 0C36 ఔ 0C14
ష 0C37 క 0C15
స 0C38 ఖ 0C16
హ 0C39 గ 0C17
ా 0C3E ఘ 0C18
ి 0C3F ఙ 0C19
ీ 0C40 చ 0C1A
ు 0C41 ఛ 0C1B
ూ 0C42 జ 0C1C
ృ 0C43 ఝ 0C1D
ౄ 0C44 ఞ 0C1E
ే 0C47 ట 0C1F
ై 0C48 ఠ 0C20
ొ 0C4A డ 0C21
ో 0C4B ఢ 0C22
ౌ 0C4C ణ 0C23
్ 0C4D

Thai character set

For Thai custom vocabularies, you can use the following characters in the Phrase field:

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ก 0E01 ล 0E25
ข 0E02 ฦ 0E26
ฃ 0E03 ว 0E27
ค 0E04 ศ 0E28
ฅ 0E05 ษ 0E29
ฆ 0E06 ส 0E2A
ง 0E07 ห 0E2B
จ 0E08 ฬ 0E2C
ฉ 0E09 อ 0E2D
ช 0E0A ฮ 0E2E
ซ 0E0B ฯ 0E2F
ฌ 0E0C ะ 0E30
ญ 0E0D ั 0E31
ฎ 0E0E า 0E32
ฏ 0E0F ิ 0E34
ฐ 0E10 ี 0E35
ฑ 0E11 ึ 0E36
ฒ 0E12 ื 0E37
ณ 0E13 ุ 0E38
ด 0E14 ู 0E39
ต 0E15 ฺ 0E3A
ถ 0E16 เ 0E40
ท 0E17 แ 0E41
ธ 0E18 โ 0E42
น 0E19 ใ 0E43
บ 0E1A ไ 0E44
ป 0E1B ๅ 0E45
ผ 0E1C ๆ 0E46
ฝ 0E1D ็ 0E47
พ 0E1E ่ 0E48
ฟ 0E1F ้ 0E49
ภ 0E20 ๊ 0E4A
ม 0E21 ๋ 0E4B
ย 0E22 ์ 0E4C
ร 0E23 ํ 0E4D
ฤ 0E24

Turkish character set

For Turkish custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • ' (apostrophe)

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
Ç 00C7 ö 00F6
Ö 00D6 û 00FB
Ü 00DC ü 00FC
â 00E2 Ğ 011E
ä 00E4 ğ 011F
ç 00E7 İ 0130
è 00E8 ı 0131
é 00E9 Ş 015E
ê 00EA ş 015F
í 00ED š 0161
î 00EE ž 017E
ó 00F3    

Ukrainian character set

For Ukrainian custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
а 0430 р 0440
б 0431 с 0441
в 0432 т 0442
г 0433 у 0443
д 0434 ф 0444
е 0435 х 0445
ж 0436 ц 0446
з 0437 ч 0447
и 0438 ш 0448
й 0439 щ 0449
к 043A ь 044C
л 043B ю 044E
м 043C я 044F
н 043D є 0454
о 043E і 0456
п 043F ї 0457
ґ 0491

Uyghur character set

For Uyghur custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
ؑ 0611 و 0648
ؓ 0613 ى 0649
ؔ 0614 ي 064A
ء 0621 ً 064B
آ 0622 ٌ 064C
أ 0623 ٍ 064D
ؤ 0624 َ 064E
إ 0625 ُ 064F
ئ 0626 ِ 0650
ا 0627 ّ 0651
ب 0628 ْ 0652
ة 0629 ٓ 0653
ت 062A ٔ 0654
ث 062B ٗ 0657
ج 062C ٰ 0670
ح 062D ٹ 0679
خ 062E ٺ 067A
د 062F ٻ 067B
ذ 0630 ټ 067C
ر 0631 ٽ 067D
ز 0632 پ 067E
س 0633 ٿ 067F
ش 0634 ڀ 0680
ص 0635 ځ 0681
ض 0636 ڃ 0683
ط 0637 ڄ 0684
ظ 0638 څ 0685
ع 0639 چ 0686
غ 063A ڇ 0687
ـ 0640 ڈ 0688
ف 0641 ډ 0689
ق 0642 ڊ 068A
ك 0643 ڌ 068C
ل 0644 ڍ 068D
م 0645 ڏ 068F
ن 0646 ڑ 0691
ه 0647 ړ 0693
ڕ 0695

Uzbek character set

For Uzbek custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
т 0442 я 044F
б 0431 с 0441
о 043E ҳ 04B3
п 043F д 0434
ш 0448 р 0440
и 0438 ў 045E
ч 0447 г 0433
н 043D ё 0451
қ 049B й 0439
е 0435 в 0432
ю 044E э 044D
з 0437 л 043B
х 0445 ф 0444
ц 0446 к 043A
м 043C у 0443
ь 044C ж 0436
ъ 044A ғ 0493
а 0430

Vietnamese character set

Amazon Transcribe represents the six tones in Vietnamese using numbers. The following table shows how tone marks are mapped for the word 'ma'.

Tone name Tone mark Tone number
ngang ma ma1
sắc má ma2
huyền mà ma3
hỏi mả ma4
ngã mã ma5
nặng mạ ma6
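
The mapping in this table can be reproduced by decomposing a syllable, reading off its tone mark, and appending the matching number. The following is a minimal sketch of that mapping only. The function name `to_tone_number` is hypothetical. Because the table shows only 'ma', keeping vowel-quality marks (as in ê or ơ) unchanged is an assumption here.

```python
import unicodedata

# Combining tone marks mapped to the tone numbers in the table above.
TONE_NUMBERS = {
    "\u0301": 2,  # sắc (acute)
    "\u0300": 3,  # huyền (grave)
    "\u0309": 4,  # hỏi (hook above)
    "\u0303": 5,  # ngã (tilde)
    "\u0323": 6,  # nặng (dot below)
}


def to_tone_number(syllable):
    """Replace the tone mark of one syllable with its tone number."""
    decomposed = unicodedata.normalize("NFD", syllable)
    tone = 1  # ngang: no tone mark
    kept = []
    for ch in decomposed:
        if ch in TONE_NUMBERS:
            tone = TONE_NUMBERS[ch]
        else:
            kept.append(ch)
    # Recompose so that remaining marks (breve, circumflex, horn) stay attached.
    return unicodedata.normalize("NFC", "".join(kept)) + str(tone)


for word in ["ma", "m\u00e1", "m\u00e0", "m\u1ea3", "m\u00e3", "m\u1ea1"]:
    print(to_tone_number(word))
# ma1, ma2, ma3, ma4, ma5, ma6 (one per line)
```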

For Vietnamese custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • A - Z

  • ' (apostrophe)

  • - (hyphen)

  • . (period)

  • & (ampersand)

  • ; (semicolon)

  • _ (low line)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
à 00E0 À 00C0
á 00E1 Á 00C1
â 00E2 Â 00C2
ã 00E3 Ã 00C3
è 00E8 È 00C8
é 00E9 É 00C9
ê 00EA Ê 00CA
ì 00EC Ì 00CC
í 00ED Í 00CD
ò 00F2 Ò 00D2
ó 00F3 Ó 00D3
ô 00F4 Ô 00D4
õ 00F5 Õ 00D5
ù 00F9 Ù 00D9
ú 00FA Ú 00DA
ý 00FD Ý 00DD
ă 0103 Ă 0102
đ 0111 Đ 0110
ĩ 0129 Ĩ 0128
ũ 0169 Ũ 0168
ơ 01A1 Ơ 01A0
ư 01B0 Ư 01AF
ạ 1EA1 Ạ 1EA0
ả 1EA3 Ả 1EA2
ấ 1EA5 Ấ 1EA4
ầ 1EA7 Ầ 1EA6
ẩ 1EA9 Ẩ 1EA8
ẫ 1EAB Ẫ 1EAA
ậ 1EAD Ậ 1EAC
ắ 1EAF Ắ 1EAE
ằ 1EB1 Ằ 1EB0
ẳ 1EB3 Ẳ 1EB2
ẵ 1EB5 Ẵ 1EB4
ặ 1EB7 Ặ 1EB6
ẹ 1EB9 Ẹ 1EB8
ẻ 1EBB Ẻ 1EBA
ẽ 1EBD Ẽ 1EBC
ế 1EBF Ế 1EBE
ề 1EC1 Ề 1EC0
ể 1EC3 Ể 1EC2
ễ 1EC5 Ễ 1EC4
ệ 1EC7 Ệ 1EC6
ỉ 1EC9 Ỉ 1EC8
ị 1ECB Ị 1ECA
ọ 1ECD Ọ 1ECC
ỏ 1ECF Ỏ 1ECE
ố 1ED1 Ố 1ED0
ồ 1ED3 Ồ 1ED2
ổ 1ED5 Ổ 1ED4
ỗ 1ED7 Ỗ 1ED6
ộ 1ED9 Ộ 1ED8
ớ 1EDB Ớ 1EDA
ờ 1EDD Ờ 1EDC
ở 1EDF Ở 1EDE
ỡ 1EE1 Ỡ 1EE0
ợ 1EE3 Ợ 1EE2
ụ 1EE5 Ụ 1EE4
ủ 1EE7 Ủ 1EE6
ứ 1EE9 Ứ 1EE8
ừ 1EEB Ừ 1EEA
ử 1EED Ử 1EEC
ữ 1EEF Ữ 1EEE
ự 1EF1 Ự 1EF0
ỳ 1EF3 Ỳ 1EF2
ỵ 1EF5 Ỵ 1EF4
ỷ 1EF7 Ỷ 1EF6
ỹ 1EF9 Ỹ 1EF8
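
Every entry in the table above is a single precomposed code point, so text that was typed or exported in decomposed form (base letter plus combining marks) needs to be recomposed before it matches. The following is a minimal sketch using Unicode NFC normalization from the Python standard library. The helper names are hypothetical, and whether NFC output always lands inside a given language's set should still be checked against that set.

```python
import unicodedata


def to_precomposed(phrase):
    """Recompose base letters and combining marks into single code points."""
    return unicodedata.normalize("NFC", phrase)


def leftover_combining_marks(phrase):
    """List hex codes of combining marks that NFC could not absorb."""
    return [
        f"{ord(ch):04X}"
        for ch in to_precomposed(phrase)
        if unicodedata.combining(ch)
    ]


# 'e' + combining circumflex + combining dot below, as some editors emit it.
decomposed = "Vie\u0302\u0323t"
fixed = to_precomposed(decomposed)

print(len(decomposed), len(fixed))            # 6 4
print([f"{ord(ch):04X}" for ch in fixed])     # ['0056', '0069', '1EC7', '0074']
print(leftover_combining_marks(decomposed))   # []
```

A non-empty result from `leftover_combining_marks` points at a sequence that has no precomposed form and therefore cannot match a single-code-point table entry.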

Welsh character set

For Welsh custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
à 00E0 ò 00F2
á 00E1 ó 00F3
â 00E2 ô 00F4
ä 00E4 ö 00F6
è 00E8 ù 00F9
é 00E9 ú 00FA
ê 00EA û 00FB
ë 00EB ü 00FC
ì 00EC ý 00FD
í 00ED ÿ 00FF
î 00EE ŵ 0175
ï 00EF ŷ 0177
ỳ 1EF3

Wolof character set

For Wolof custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
à 00E0 ê 00EA
ã 00E3 ë 00EB
ç 00E7 ñ 00F1
è 00E8 ó 00F3
é 00E9 ô 00F4
ŋ 014B

Zulu character set

For Zulu custom vocabularies, you can use the following characters in the Phrase field:

  • a - z

  • - (hyphen)

  • . (period)

You can also use the following Unicode characters in the Phrase field:

Character Code Character Code
s 0073 d 0064
t 0074 a 0061
a 0061 r 0072
n 006E d 0064