Transcribing numbers and punctuation - Amazon Transcribe

Amazon Transcribe automatically adds punctuation to transcripts in all supported languages, and capitalizes words appropriately for languages whose writing systems distinguish case.

For most languages, numbers are transcribed into their word forms. However, if your media is in English or German, Amazon Transcribe treats numbers differently depending on the context in which they're used.

For example, if a speaker says "Meet me at eight-thirty AM on June first at one-hundred Main Street with three-dollars-and-fifty-cents and one-point-five chocolate bars," this is transcribed as:

  • English and German dialects: Meet me at 8:30 a.m. on June 1st at 100 Main Street with $3.50 and 1.5 chocolate bars

  • All other languages: Meet me at eight thirty a m on June first at one hundred Main Street with three dollars and fifty cents and one point five chocolate bars

To see all the rules associated with spoken numbers in English and German, refer to the following table.

Each rule is listed with its English dialect examples first, followed by its German dialect examples (input audio → output text).

Convert cardinal numbers greater than ten to numbers.

  • "Fifty five" → 55

  • "a hundred" → 100

  • "One thousand and thirty one" → 1031

  • "One hundred twenty-three million four hundred fifty six thousand seven hundred eighty nine" → 123,456,789

  • "fünfundfünfzig" → 55

  • "vier tausend sechs hundert einundachtzig" → 4681

  • "eine Sache" → "eine Sache"

Convert cardinal numbers followed by "million" or "billion" to numerals followed by a word when "million" or "billion" is not followed by a number.

  • "one hundred million" → 100 million

  • "one billion" → 1 billion

  • "two point three million" → 2.3 million

  • "zehn Millionen Menschen" → 10 Millionen Menschen

  • "zehn Millionen fünf hundert tausend" → 10.500.000
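The digit grouping seen in the examples above (123,456,789 in English, 10.500.000 in German) can be mimicked with a short Python sketch. The function name `format_grouped` and its `lang` parameter are hypothetical, and the code only illustrates the output convention; it is not how Amazon Transcribe works internally. Some examples in this table (such as 1031 and 4681) appear without grouping, and the sketch does not try to model when grouping is applied.

```python
def format_grouped(n: int, lang: str) -> str:
    """Group an integer by thousands: ',' for English, '.' for German.

    Hypothetical helper for illustration only.
    """
    english = f"{n:,}"  # Python's ',' format option groups digits by thousands
    if lang == "de":
        # German output in the table uses '.' as the grouping character
        return english.replace(",", ".")
    return english


print(format_grouped(123456789, "en"))
print(format_grouped(10500000, "de"))
```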

Convert ordinal numbers greater than ten to numbers.

  • "Forty third" → 43rd

  • "twenty sixth avenue" → 26th avenue

  • "dreiundzwanzigste" → 23

  • "vierzigster" → 40

  • "ich war Erster" → "ich war Erster"
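For the English examples above, the suffix attached to the numeral (43rd, 26th) follows the usual English ordinal rule. A minimal sketch, assuming a hypothetical helper named `english_ordinal`; it illustrates the written form only and says nothing about Amazon Transcribe's internals. The German examples in this row are shown as bare numerals, so no German variant is sketched.

```python
def english_ordinal(n: int) -> str:
    """Return n with its English ordinal suffix (hypothetical helper)."""
    # Numbers ending in 11, 12, or 13 take "th" even though they end in 1, 2, 3
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


print(english_ordinal(43))
print(english_ordinal(26))
```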

Convert fractions to their numeric format.

  • "a quarter" → 1/4

  • "three sixteenths" → 3/16

  • "a half" → 1/2

  • "a hundredth" → 1/100

In German, fractions are not converted into a numeric format.

  • "ein Drittel" → "ein Drittel"

Convert numbers less than ten to digits when two or more occur in a row.

  • "three four five" → 345

  • "My phone number is four two five five five five one two one two" → My phone number is 4255551212

  • "eins zwei drei" → 123

  • "plus vier neun zwei vier eins" → +49241
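The English behavior in this row can be approximated by collapsing runs of two or more single-digit words into one digit string. The sketch below is an illustration under stated assumptions: the names `DIGIT_WORDS` and `collapse_digit_runs` are hypothetical, input is assumed to be whitespace-separated English words, and "zero" is treated as a digit word. It does not handle the German "plus" prefix shown above.

```python
DIGIT_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}


def collapse_digit_runs(text: str) -> str:
    """Replace runs of two or more digit words with a single digit string."""
    out = []
    run = []  # digit words collected since the last non-digit word

    def flush() -> None:
        if len(run) >= 2:
            # Two or more digit words in a row: join them into one number
            out.append("".join(DIGIT_WORDS[w.lower()] for w in run))
        else:
            # A lone digit word stays in its word form
            out.extend(run)
        run.clear()

    for word in text.split():
        if word.lower() in DIGIT_WORDS:
            run.append(word)
        else:
            flush()
            out.append(word)
    flush()
    return " ".join(out)


print(collapse_digit_runs("My phone number is four two five five five five one two one two"))
```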

In English, the words "dot" and "point" are displayed as a decimal point.

  • "three hundred and three dot five" → 303.5

  • "three point twenty three" → 3.23

  • "zero point four" → 0.4

  • "point three" → 0.3

In German, decimals are indicated by ",".

  • "zweiundzwanzig komma drei" → 22,3

Convert the word "percent" after a number to a percent symbol (%).

  • "twenty three percent" → 23%

  • "twenty three point four five percent" → 23.45%

  • "fünf Prozent Hürde" → 5% Hürde

  • "dreiundzwanzig komma vier Prozent" → 23,4%
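The percent examples above differ between the two languages only in the decimal separator. A minimal sketch of that output convention, assuming a hypothetical helper `format_percent` that takes a plain decimal value and a `lang` code; it is not Amazon Transcribe's implementation.

```python
from decimal import Decimal


def format_percent(value: Decimal, lang: str) -> str:
    """Render a number with a percent sign, using ',' as the German decimal mark."""
    text = str(value)  # assumes a plain value such as Decimal("23.45")
    if lang == "de":
        text = text.replace(".", ",")
    return f"{text}%"


print(format_percent(Decimal("23.45"), "en"))
print(format_percent(Decimal("23.4"), "de"))
```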

Convert monetary words to symbols.

Convert the words "dollar," "U S dollar," "Australian dollar," "AUD," or "USD" after a number to a dollar sign ($) before the number.

  • "one dollar and fifteen cents" → $1.15

  • "twenty three USD" → $23

  • "twenty three Australian dollars" → $23

Convert the words "pounds," "British pounds," or "GBP" after a number to a pound sign (£) before the number.

  • "twenty three pounds" → £23

  • "I have two thousand pounds" → I have £2,000

  • "five pounds thirty three pence" → £5.33

Convert the words "rupees," "Indian rupees," or "INR" after a number to a rupee sign (₹) before the number.

  • "twenty three rupees" → ₹23

  • "fifty rupees thirty paise" → ₹50.30

In German, convert the word "Euro" to a euro sign (€) after the number.

  • "ein euro" → 1 €

  • "ein Euro vierzig" → 1,40 €

  • "ein Euro vierzig Cent" → 1,40 €
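The monetary examples above place the symbol before the amount for dollars, pounds, and rupees, and after the amount (with a space and a decimal comma) for euros. The following sketch reproduces those written forms. The names `SYMBOL_BEFORE` and `format_money` and the use of currency codes as keys are assumptions made for illustration. Thousands grouping for euro amounts is not shown in the table, so the sketch leaves euro amounts ungrouped.

```python
# Currencies whose symbol is written before the amount in the examples
SYMBOL_BEFORE = {"USD": "$", "AUD": "$", "GBP": "£", "INR": "₹"}


def format_money(major: int, minor: int, currency: str) -> str:
    """Format an amount given as major units (e.g. dollars) and minor units (e.g. cents)."""
    if currency == "EUR":
        # German style: decimal comma, symbol after the amount
        amount = str(major) if minor == 0 else f"{major},{minor:02d}"
        return f"{amount} €"
    symbol = SYMBOL_BEFORE[currency]
    # English style: ',' groups thousands, '.' separates minor units
    amount = f"{major:,}" if minor == 0 else f"{major:,}.{minor:02d}"
    return f"{symbol}{amount}"


print(format_money(1, 15, "USD"))
print(format_money(1, 40, "EUR"))
```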

Convert times to numbers.

  • "seven a m eastern standard time" → 7 a.m. eastern standard time

  • "twelve thirty p m" → 12:30 p.m.

  • "vierzehn Uhr fünfzehn" → 14:15 Uhr
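The written time formats in this row can be sketched as follows. The helper `format_time` and its parameters are hypothetical, and the code only mirrors the output shapes shown above. How a German time with zero minutes is rendered is not shown in the table, so the "14:00 Uhr" style used here for that case is an assumption.

```python
from typing import Optional


def format_time(hour: int, minute: int, lang: str, meridiem: Optional[str] = None) -> str:
    """Format a clock time in the English or German style from the examples."""
    if lang == "de":
        # German example: "vierzehn Uhr fünfzehn" -> 14:15 Uhr
        return f"{hour}:{minute:02d} Uhr"
    # English examples omit ":00" when no minutes are spoken ("seven a m" -> 7 a.m.)
    clock = str(hour) if minute == 0 else f"{hour}:{minute:02d}"
    return f"{clock} {meridiem}" if meridiem else clock


print(format_time(12, 30, "en", "p.m."))
print(format_time(14, 15, "de"))
```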

Convert dates to numbers.

  • "May fifth twenty twelve" → May 5th 2012

  • "May five twenty twelve" → May 5 2012

  • "five May twenty twelve" → 5 May 2012

  • "dritter Dezember neunzehn hundert sechsundfünfzig" → 3. Dezember 1956

Separate spans of numbers by the word "to".

  • "twenty three to thirty seven" → 23 to 37

German: not applicable.

Years are represented as four digits; this is only valid for years in the 20th, 21st, and 22nd centuries.

  • "nineteen sixty two" → 1962

  • "the year is twenty twelve" → the year is 2012

  • "twenty nineteen" → 2019

  • "twenty one thirty" → 2130

German: not applicable.
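In the English year examples above, a year is spoken as two two-digit numbers ("nineteen" and "sixty two") that are joined into four digits. A minimal sketch of that arithmetic, assuming a hypothetical helper `year_from_pairs`. Accepting only the leading pairs 19, 20, and 21 is a simplification of the stated limit to the 20th, 21st, and 22nd centuries.

```python
from typing import Optional


def year_from_pairs(first: int, second: int) -> Optional[int]:
    """Join two spoken two-digit numbers into a four-digit year.

    Returns None when the leading pair is outside 19, 20, 21 (an
    approximation of the 20th-22nd century limit) or the second
    number is not a two-digit value.
    """
    if first not in (19, 20, 21) or not 0 <= second <= 99:
        return None
    return first * 100 + second


print(year_from_pairs(19, 62))
print(year_from_pairs(21, 30))
```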

Display slashes and dashes.

  • "fifty-five dash thirteen" → 55-13

In English, slashes are not displayed as symbols.

  • "fifty-five slash thirteen" → 55 slash 13

  • "fünfundfünfzig Schrägstrich dreizehn" → 55/13

  • "fünfundfünfzig Strich dreizehn" → 55-13

Display numbered paragraphs.

In English, numbered paragraphs are not displayed using the paragraph symbol (§).

  • "paragraph seventeen" → paragraph 17

  • "Paragraf siebzehn" → § 17