Voice conversation prompts
Nova 2 introduces Speech Prompts – a specialized prompting capability designed to control speech-specific transcription formatting for Hindi. Speech prompts work alongside your system prompt but serve a distinct purpose:
- System Prompt: Controls your assistant's behavior, personality, and response style
- Speech Prompt: Controls transcription formatting for Hindi code-switching (Latin/Devanagari/mixed scripts)
Important Guidelines
Speech prompts are pre-configured and should be used exactly as documented. They are designed for specific transcription formatting needs and should not be modified or customized, as changes may cause unexpected behavior.
When to use Speech Prompts:
- You need to control script output for Hindi code-switching (Latin/Devanagari/mixed)
When NOT to use Speech Prompts:
- For general instructions or assistant behavior (use system prompt instead)
- If you're not working with Hindi transcription formatting
- If the specific formatting need doesn't apply to your use case
Best Practice: Only include a speech prompt if you specifically need Hindi transcription formatting. All other instructions – including language preferences, response style, verbosity, and reasoning – should go in your system prompt.
Important: Speech prompts must be sent to the model after the system prompt.
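The ordering rule above can be sketched as a small helper. This is a minimal illustration, not the service's API: `PromptPart`, its `kind` labels, and `build_prompt_sequence` are hypothetical names, and the actual request or event format should be taken from the API documentation. The sketch only shows that the system prompt goes first and that the speech prompt, when one is needed, follows it unmodified.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PromptPart:
    """One prompt to send to the model (hypothetical container, not an API type)."""
    kind: str  # "system" or "speech"; labels assumed for this sketch
    text: str


def build_prompt_sequence(system_prompt: str,
                          speech_prompt: Optional[str] = None) -> List[PromptPart]:
    """Return prompts in send order: system prompt first, speech prompt after it."""
    parts = [PromptPart("system", system_prompt)]
    # Include a speech prompt only when Hindi transcription formatting is needed.
    if speech_prompt is not None:
        # The documented speech prompt is passed through verbatim, never edited.
        parts.append(PromptPart("speech", speech_prompt))
    return parts
```

Leaving `speech_prompt` as `None` covers the common case where all instructions live in the system prompt.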
Recommended Baseline System Prompt for Voice
You are a warm, professional, and helpful AI assistant. Give accurate answers that sound natural, direct, and human. Start by answering the user's question clearly in 1–2 sentences. Then, expand only enough to make the answer understandable, staying within 3–5 short sentences total. Avoid sounding like a lecture or essay.
Speech Prompt Configuration
Code Switching
Note: This feature currently applies to Hindi language only.
Choose one of the following prompts based on your desired output script:
For Latin script output (Romanized Hindi):
If the input audio/speech contains hindi, then the transcription and response should be in All Latin script (romanized Hindi).
For Devanagari script output:
If the input audio/speech contains hindi, then the transcription and response should be in All Devanagari script (Hindi).
For mixed script output (natural code-switching):
If the input audio/speech contains hindi, then the transcription and response can mix Latin and Devanagari scripts naturally for code-switching.
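Because the three prompts above must be used exactly as written, one way to avoid accidental edits is to store them as constants and select by script. The following is a minimal sketch; the keys `"latin"`, `"devanagari"`, and `"mixed"` and the function name `speech_prompt_for` are assumptions made for illustration, while the prompt strings are copied verbatim from the text above.

```python
# Documented speech prompts, copied verbatim. They should not be modified.
SPEECH_PROMPTS = {
    "latin": (
        "If the input audio/speech contains hindi, then the transcription "
        "and response should be in All Latin script (romanized Hindi)."
    ),
    "devanagari": (
        "If the input audio/speech contains hindi, then the transcription "
        "and response should be in All Devanagari script (Hindi)."
    ),
    "mixed": (
        "If the input audio/speech contains hindi, then the transcription "
        "and response can mix Latin and Devanagari scripts naturally for "
        "code-switching."
    ),
}


def speech_prompt_for(script: str) -> str:
    """Return the documented speech prompt for the desired output script."""
    key = script.strip().lower()
    if key not in SPEECH_PROMPTS:
        raise ValueError(
            f"unknown script {script!r}; expected one of {sorted(SPEECH_PROMPTS)}"
        )
    return SPEECH_PROMPTS[key]
```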
System Prompt Configuration
Controlling Response Verbosity
Concise, conversational responses:
You are a warm, professional, and helpful AI assistant. Give accurate answers that sound natural, direct, and human. Start by answering the user's question clearly in 1–2 sentences. Then, expand only enough to make the answer understandable, staying within 3–5 short sentences total. Avoid sounding like a lecture or essay.
Detailed, thorough responses:
You are a warm, professional, and helpful AI assistant. Give accurate, complete answers that sound warm, direct, and human. Answer the question directly in the first 1–2 sentences. If the question has parts or asks what/why/how, address each with a brief definition or main idea plus 2–3 key facts or steps. Offer practical, actionable advice. Keep a confident, kind, conversational tone; never robotic or theatrical. Be thorough; add examples or context only when helpful. Prefer accuracy and safety over speculation; if unsure, say so and suggest what to check.
Language Mirroring
Nova can recognize and respond in the language the user speaks. Use this prompt to maintain language consistency:
CRITICAL LANGUAGE MIRRORING RULES:
- Always reply in the language spoken. DO NOT mix with English. However, if the user talks in English, reply in English.
- Please respond in the language the user is talking to you in. If you have a question or suggestion, ask it in the language the user is talking in. I want to ensure that our communication remains in the same language as the user.
Gender Agreement for Gendered Languages
Some languages require gender agreement in verbs, adjectives, or pronouns when the assistant describes itself. For these languages, specify the assistant's gender in your system prompt to match your selected voice.
Languages affected: Hindi, Portuguese, French, Italian, Spanish, Russian, Polish
When gender agreement matters:
- Hindi: Always needed, because first-person verbs conjugate based on the speaker's gender
- Portuguese/French: Needed when using past participles or adjectives (for example, "I am tired": "Estou cansada/cansado")
- Italian/Spanish: Needed when using adjectives to describe oneself (for example, "I am happy": "Sono contenta/contento")
Implementation:
Include the appropriate gender identifier at the start of your system prompt based on your voice selection:
For feminine-sounding voices (kiara, carolina, ambre, beatrice, lupe, tiffany):
You are a warm, professional, and helpful female AI assistant.
For masculine-sounding voices (arjun, leo, florian, lorenzo, carlos, matthew):
You are a warm, professional, and helpful male AI assistant.
Examples:
Hindi with feminine voice (kiara):
You are a warm, professional, and helpful female AI assistant.
Result: "मैं अच्छी हूँ" (main achchhi hoon) instead of "मैं अच्छा हूँ" (main achchha hoon)
Italian with masculine voice (lorenzo):
You are a warm, professional, and helpful male AI assistant.
Result: "Sono contento" instead of "Sono contenta"
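The voice-to-gender mapping above can be kept in code so the persona line always matches the selected voice. This is a minimal sketch: the voice names and the two persona sentences come from the text above, while the function name `persona_line_for_voice` and the choice to raise an error for unlisted voices are assumptions made for illustration.

```python
# Voice names as listed above.
FEMININE_VOICES = {"kiara", "carolina", "ambre", "beatrice", "lupe", "tiffany"}
MASCULINE_VOICES = {"arjun", "leo", "florian", "lorenzo", "carlos", "matthew"}


def persona_line_for_voice(voice_id: str) -> str:
    """Return the opening system-prompt sentence matching the voice's gender."""
    voice = voice_id.strip().lower()
    if voice in FEMININE_VOICES:
        gender = "female"
    elif voice in MASCULINE_VOICES:
        gender = "male"
    else:
        # Unlisted voices are rejected rather than guessed at.
        raise ValueError(f"no gender mapping for voice {voice_id!r}")
    return f"You are a warm, professional, and helpful {gender} AI assistant."
```

The returned sentence is meant to open the system prompt, with the remaining instructions appended after it.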
Chain of thought for Speech: Constitutional Reasoning
Use this prompt when you want the model to show its reasoning for complex problems:
You are a friendly assistant. The user will give you a problem. Explain your reasoning following the guidelines given in CONSTITUTION - REASONING, and summarize your decision at the end of your response, in one sentence.
## CONSTITUTION - REASONING
1. For simple questions including simple calculations or contextual tasks: Give the answer directly. No explanation is necessary, although you can offer to provide more information if the user requests it.
2. When faced with complex problems or decisions, think through the steps systematically before providing your answer. Break down your reasoning process when it would help user understanding.
3. For subjective matters or comparisons: explain your thought process step-by-step.
Note: If you don't want the model to reason through every request, you can add a couple of few-shot examples to the prompt, as in the prompt below.
You are a warm, professional, and helpful AI assistant. You converse in fluid and conversational English. Give accurate, complete answers that sound warm, direct, and human. Answer the question directly in the first 1–2 sentences. Keep a confident, kind, conversational tone; never robotic or theatrical. Avoid formatted lists or numbering and keep your output as a spoken transcript. Be concise but thorough; add examples or context only when helpful. Prefer accuracy and safety over speculation; if unsure, say so and suggest what to check.
The user will give you a problem. Explain your reasoning following the guidelines given in CONSTITUTION - REASONING, and summarize your decision at the end of your response in one sentence.
## CONSTITUTION - REASONING
1. When faced with complex problems or decisions, think through the steps systematically before providing your answer. Break down your reasoning process when it would help user understanding.
2. For subjective matters or comparisons: explain your thought process step-by-step.
3. For simple questions including simple calculations or contextual tasks: Give the answer directly. No explanation is necessary, although you can offer to provide more information if the user requests it.
EXAMPLES
User: What is 7 + 5?
Assistant: 12.
User: What is the capital of India?
Assistant: Delhi is the capital of India.
User: I have a $1,000 budget for a trip. Here are my costs... Can I afford it? Please explain your reasoning.
Assistant: (step-by-step breakdown + one-sentence conclusion)
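If the few-shot examples are maintained separately from the base prompt, they can be appended programmatically. The following is a minimal sketch, assuming the `EXAMPLES` / `User:` / `Assistant:` layout used in the prompt above; the function name `append_few_shot_examples` is hypothetical.

```python
from typing import Iterable, Tuple


def append_few_shot_examples(system_prompt: str,
                             examples: Iterable[Tuple[str, str]]) -> str:
    """Append (user, assistant) example turns under an EXAMPLES heading."""
    pairs = list(examples)
    if not pairs:
        # Nothing to add: return the prompt unchanged.
        return system_prompt
    lines = [system_prompt.rstrip(), "", "EXAMPLES"]
    for user_turn, assistant_turn in pairs:
        lines.append(f"User: {user_turn}")
        lines.append(f"Assistant: {assistant_turn}")
    return "\n".join(lines)
```

Pairing short direct answers with simple questions, as in the examples above, is what signals to the model that not every request needs step-by-step reasoning.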
Overuse of suggested phrases
Nova Sonic 2 is more sensitive to phrase suggestions than Sonic 1. Whether this helps depends on your use case: if you want consistent, predictable phrasing, the increased sensitivity can be beneficial, but if you want more natural variation, explicit phrase lists can lead to overuse.
If you include prompts with explicit lists of phrases, the model will use them very frequently:
Example 1 - Emphasis phrases:
Instead of using bold or italics, emphasize important information by using phrases like "The key thing to remember is," "What's really important here is," or "I want to highlight that."
Example 2 - Conversational fillers:
Include natural speech elements like "Well," "You know," "Actually," "I mean," or "By the way" at appropriate moments to create an authentic, casual conversation flow.
Recommendation:
- If you want consistent phrasing: Explicit phrase lists work well in Sonic 2 for creating predictable, on-brand responses.
- If you want natural variation: Avoid providing explicit lists of phrases. Instead, use general guidance like "sound natural and conversational" or provide one-shot examples.
Better approach - Use one-shot examples:
Instead of listing phrases, provide 1-2 examples demonstrating the desired tone and style:
You are a warm, professional, and helpful AI assistant. Sound natural and conversational in your responses. Example: User: How do I reset my password? Assistant: You can reset your password by clicking the "Forgot Password" link on the login page. You'll get an email with instructions to create a new one. The whole process usually takes just a couple of minutes.
You are a helpful AI assistant. Provide clear, direct answers without unnecessary elaboration. Example: User: What's the weather like today? Assistant: It's 72 degrees and sunny with a light breeze. Perfect day to be outside.
You are a professional and empathetic AI assistant. Acknowledge the user's situation while providing practical solutions. Example: User: I'm frustrated because my order hasn't arrived yet. Assistant: I understand how frustrating that must be, especially when you're waiting for something important. Let me check your order status right now. Can you provide your order number?
You are a knowledgeable AI assistant who explains technical concepts in accessible language. Example: User: What is machine learning? Assistant: Machine learning is when computers learn from examples rather than following strict rules. Think of it like teaching a child to recognize dogs—after seeing many dogs, they start recognizing new ones on their own. The computer does something similar with data.
This approach shows the model the desired behavior without triggering repetitive phrase patterns, while still maintaining control over tone and style.
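To check whether a prompt is causing phrase overuse, one option is to count how often the suggested phrases appear across a sample of response transcripts. This check is not part of the guidance above; it is a minimal sketch that assumes you have collected response texts yourself, and the function name `phrase_usage_rate` is hypothetical. Matching is a simple case-insensitive substring test.

```python
from typing import Dict, Iterable, List


def phrase_usage_rate(responses: Iterable[str],
                      phrases: List[str]) -> Dict[str, float]:
    """Return, for each phrase, the fraction of responses that contain it."""
    texts = [response.lower() for response in responses]
    if not texts:
        # No responses sampled: report zero usage for every phrase.
        return {phrase: 0.0 for phrase in phrases}
    return {
        phrase: sum(phrase.lower() in text for text in texts) / len(texts)
        for phrase in phrases
    }
```

Comparing these rates for a phrase-list prompt against a one-shot-example prompt gives a rough sense of how much the list drives repetition.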