Core Concepts: Voice

Flex leverages the Twilio Client to manage calls in the browser via the Flex UI. The Twilio Client requires an installed web browser that supports Web Real-Time Communication (WebRTC) and an internet connection. To understand Twilio's requirements for network connectivity, see Voice Client JS and Mobile SDKs’ Network Connectivity Requirement.

When you create a new Flex project in the Twilio Console, Twilio automatically provisions a number that accepts incoming calls and SMS. Flex supports both inbound and outbound voice out of the box.

A Flex call can have one or more call legs. Each leg is a connection between a device and Twilio. For example, a bridged call has two legs: one from the customer to Twilio and one from Twilio to an agent. The Call Logs page of your Flex project displays each call leg individually.

Inbound Calls

Inbound calls (or incoming calls) are calls made by customers to your contact center. On a successful connection, an inbound call is routed as an incoming call request to an available agent.

To test the default inbound call experience for Flex:

  1. Log in and make yourself available as an agent.
  2. Place a call to the number that came with your Flex instance.

To customize inbound voice, you can set up a scalable Interactive Voice Response (IVR) system or build an intelligent bot for your contact center using Autopilot.

Interactive Voice Response (IVR) Apps

IVR is an automated telephony system that interacts with customers through voice and touch-tone keypad selections (DTMF tones). It is also commonly known as a phone tree or a phone menu.

You can build IVR systems for your contact center that gather richer call context before routing calls to your agents. For example, you can prompt your customers to supply general and account-based information to reduce delays and eliminate the need for call transfers. To get started, see the Build an IVR visually with Studio and Set up an IVR using Twilio Studio tutorials.
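The tutorials above build the menu visually in Studio. To make the idea of a phone tree concrete, here is a minimal sketch of the same pattern as hand-written TwiML: a `<Gather>` that waits for one DTMF digit, plus a lookup that maps the digit to a queue. The menu entries, the queue names, the `/ivr/route` path, and the function names are all hypothetical and chosen only for illustration.

```typescript
// Hypothetical menu: each key press maps to a queue name of your choosing.
interface MenuOption {
  digit: string;
  queue: string;
}

const MENU: MenuOption[] = [
  { digit: "1", queue: "billing" },
  { digit: "2", queue: "support" },
];

// Build TwiML that reads the menu aloud and waits for a single DTMF digit.
// `actionUrl` is a placeholder for the webhook that receives the digit and
// is assumed to contain no characters that need XML escaping.
function menuTwiml(actionUrl: string): string {
  const prompt = MENU.map(
    (option) => "For " + option.queue + ", press " + option.digit + "."
  ).join(" ");
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    "<Response>" +
    '<Gather input="dtmf" numDigits="1" action="' + actionUrl + '">' +
    "<Say>" + prompt + "</Say>" +
    "</Gather>" +
    "</Response>"
  );
}

// Resolve the caller's key press; unknown digits fall back to a default queue.
function routeDigit(digit: string, fallback: string = "general"): string {
  for (const option of MENU) {
    if (option.digit === digit) {
      return option.queue;
    }
  }
  return fallback;
}
```

The queue name returned by `routeDigit` is the kind of context an IVR can attach to a call before an agent picks it up.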


Autopilot

Autopilot is a platform that lets you build artificial intelligence (AI) bots to help your customers interact with Flex. It uses natural language understanding (NLU) and automatic speech recognition.

With Autopilot, you can build a bot and deploy it as an IVR. Common use cases include returning frequently requested information, performing simple, common tasks, and collecting further details from the end customer before passing the call to an agent. For example, Autopilot bots can automate routine requests like looking up account information or resetting passwords. Bots can also be trained to work across messaging channels like SMS, web, and mobile chat.

You can seamlessly hand off a voice call to your agents with the conversation context intact. When updating a Studio Flow for inbound calls on Flex, you can use the Autopilot widget with the Send to Flex widget for handing the conversation over from a bot to a human agent. Note that you can only trigger an Autopilot Execution using the Studio Canvas UI.

See How to build a conversational IVR for a step-by-step guide on a conversational IVR that uses an Autopilot bot to manage Disneyland vacation reservations. For details on handing off an Autopilot voice session to Flex, see Handoff to Task Router or Flex.

Outbound Calls

Outbound calls (or outgoing calls) refer to calls made by agents to your customers.

With the release of the native Dialpad in Flex UI v1.18.0, agents can initiate outbound calls or transfer an ongoing call to another agent or a supervisor. You can also leverage the StartOutboundCall action (provided by the Actions Framework) to implement use cases like click-to-dial and preview dialers. To start using the native Dialpad, follow the steps in Enabling the Flex Dialpad.
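As an illustration of the click-to-dial use case, the sketch below wraps the StartOutboundCall action in a small helper. It assumes the action accepts a payload with a `destination` phone number and that, inside a plugin, you would pass the `Actions` object exported by `@twilio/flex-ui`. The `ActionInvoker` interface and the `clickToDial` and `isE164` names are hypothetical stand-ins so the snippet stays self-contained.

```typescript
// Minimal shape of the Actions Framework entry point used by this sketch.
// In a real plugin, pass `Actions` from "@twilio/flex-ui" (assumption).
interface ActionInvoker {
  invokeAction(name: string, payload: { destination: string }): unknown;
}

// Loose E.164 check: a plus sign, a non-zero digit, then up to 14 more digits.
function isE164(phoneNumber: string): boolean {
  return /^\+[1-9]\d{1,14}$/.test(phoneNumber);
}

// Validate the number, then ask Flex to start an outbound call to it.
function clickToDial(actions: ActionInvoker, phoneNumber: string): unknown {
  if (!isE164(phoneNumber)) {
    throw new Error("Expected an E.164 number, got: " + phoneNumber);
  }
  return actions.invokeAction("StartOutboundCall", { destination: phoneNumber });
}
```

Validating the number before invoking the action keeps malformed input from a CRM link or a contact list from ever reaching the dialer.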

Transfers (Warm and Cold)

A warm transfer involves consulting with the agent receiving the call transfer before the initiating agent can wrap up the call on their end. The Warm Transfer or Consult button is represented as a Phone icon in the Flex UI. To learn more about a warm call transfer flow, see Call Control Concepts: Warm Transfer.

A cold transfer does not involve communicating with the receiving agent. When an agent initiates a cold transfer (represented as a right arrow icon), the voice task autocompletes for the agent transferring the call. To learn more about a cold call transfer flow, see Call Control Concepts: Cold Transfer.

For both warm and cold transfers, an agent can transfer the call either to another agent or to a Task Queue.

The native Flex Dialpad supports both warm and cold transfers on outbound calls.


Conferencing

Voice conferencing lets two or more people in different locations join the same voice call through technology like a conference bridge. Twilio's Voice Conference lets you manage multi-party calls with 2 to 250 participants. Voice Conferences can be used for standard multi-party audio bridges, inbound contact centers, or outbound dialers. To learn more about the lifecycle of a conference and how to manage conference participants, see Voice Conference.


Hold / Unhold

In telephony, a call may be placed on hold when an agent needs to review additional details, transfer the call, or consult with a supervisor. When a call is on hold, the connection is not terminated but no verbal communication is possible until the call is removed from hold by the same or another extension on the key phone system. In the Flex UI, the Hold button is represented by a pause icon. To remove a contact from hold status, click Hold again. Under the hood, Flex uses the hold property of the Voice API Participant resource to set the hold status of a call participant.

Twilio Flex allows you to change the hold music or record a message for the caller while a call is on hold. If you're on a paid Flex plan, you can review your contact center's Average Hold Time and other built-in metrics with Flex Insights.

Mute / Unmute

In the Flex UI, agents can mute or unmute themselves on a call by clicking Mute (or the microphone icon). Muting and unmuting affect the muted property of the Voice API Participant resource.
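The hold and muted properties both live on the Conference Participant resource, so the requests Flex makes on your behalf look alike. The sketch below only builds such a request (it does not send it), assuming the standard Participant update endpoint and its `Hold`, `HoldUrl`, and `Muted` form parameters. The SIDs in the usage note are placeholders, and the `participantUpdate` helper and its types are hypothetical names.

```typescript
// Fields this sketch knows how to change on a conference participant.
interface ParticipantChange {
  hold?: boolean;
  holdUrl?: string; // audio or TwiML URL played while the participant is held
  muted?: boolean;
}

interface HttpRequest {
  method: "POST";
  url: string;
  body: string; // application/x-www-form-urlencoded payload
}

// Build (but do not send) the REST request that updates one participant.
function participantUpdate(
  accountSid: string,
  conferenceSid: string,
  callSid: string,
  change: ParticipantChange
): HttpRequest {
  const params: string[] = [];
  if (change.hold !== undefined) {
    params.push("Hold=" + String(change.hold));
  }
  if (change.holdUrl !== undefined) {
    params.push("HoldUrl=" + encodeURIComponent(change.holdUrl));
  }
  if (change.muted !== undefined) {
    params.push("Muted=" + String(change.muted));
  }
  return {
    method: "POST",
    url:
      "https://api.twilio.com/2010-04-01/Accounts/" + accountSid +
      "/Conferences/" + conferenceSid +
      "/Participants/" + callSid + ".json",
    body: params.join("&"),
  };
}
```

For example, `participantUpdate("ACxxx", "CFxxx", "CAxxx", { muted: true })` describes a request that mutes one participant. Sending it requires HTTP Basic authentication with your account credentials, which this sketch leaves out.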

Call Default Limits

  • By default, each queue is limited to 100 voice tasks (calls). To change the default value, see Update the Call Queue Limit for Flex.
  • Calls and conferences have a four-hour limit.

Call Recordings

Recording voice calls is a must-have feature for many contact centers, whether for keeping an audit trail for legal reasons, resolving customer complaints, or helping supervisors coach agents.

The preferred recording mode is dual-channel recording, which means that each party's audio is recorded onto a separate track (typically one for the customer and one for the agent). Follow Enabling Dual-Channel Recordings to enable dual-channel recordings via Studio or custom code. Currently, this is only available for inbound calls.

The alternative is single-channel recording, where the audio from all participants is mixed into one track. If single-channel recording is sufficient for your use case, you can enable it with a single click on the Flex Settings page in the Console. Note that some Flex Insights features are not available for single-channel recordings (for example, cross-talk analysis and the graphical display of the audio timeline separated per participant).
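For the custom-code route, one option is to start a recording on the customer's call leg through the Call Recordings endpoint and request two channels. The sketch below builds that request without sending it. It assumes the endpoint's `RecordingChannels` parameter accepts `mono` or `dual`; the helper name and the SIDs shown in the usage note are placeholders, and the Enabling Dual-Channel Recordings guide remains the reference for the Flex-specific setup.

```typescript
// "dual" keeps each party on its own track; "mono" mixes all audio together.
type RecordingChannels = "mono" | "dual";

interface RecordingRequest {
  method: "POST";
  url: string;
  body: string; // application/x-www-form-urlencoded payload
}

// Build (but do not send) a request that starts recording an in-progress call.
function startRecordingRequest(
  accountSid: string,
  callSid: string,
  channels: RecordingChannels
): RecordingRequest {
  return {
    method: "POST",
    url:
      "https://api.twilio.com/2010-04-01/Accounts/" + accountSid +
      "/Calls/" + callSid + "/Recordings.json",
    body: "RecordingChannels=" + channels,
  };
}
```

Calling `startRecordingRequest("ACxxx", "CAxxx", "dual")` describes the dual-channel variant; swapping in `"mono"` gives the mixed, single-track recording.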

Voice Insights

Voice Insights lets you dive into the data behind your calls. It provides call quality analytics and aggregation tools for drilling into calls made within the last 30 days. To review the aggregate dashboard and call summary for your Flex instance, visit the Voice Insights page in the Twilio Console. For high-precision call metrics, event streams, and programmatic access, you need to enable the Voice Insights Advanced Features. See the Voice Insights documentation for more details on the advanced features.

Agent-Assisted <Pay>

Twilio <Pay> enables agents to securely capture caller payment information in a PCI-compliant manner during a voice conversation using the Agent-Assisted <Pay> API. When using the Agent-Assisted <Pay> feature within Flex, agents control the payment flow and guide callers by requesting one piece of payment information at a time (e.g., payment card number, expiration date, security code). Agents can continue to converse with callers but do not hear their DTMF input, which keeps the payment information secure. Once all the payment information is collected, agents complete the <Pay> session and Twilio securely transmits the collected information to the payment processor via your configured <Pay> connector. To get started, see the <Pay> Connector and the Agent-Assisted <Pay> APIs documentation.
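The flow described above (start a session, capture one field at a time, then complete) can be sketched as an ordered list of API steps. This is a planning helper only and sends nothing. The parameter names (`IdempotencyKey`, `StatusCallback`, `Capture`, `Status`) and the capture values reflect my understanding of the Agent-Assisted <Pay> API and should be checked against its documentation. The `planPaySession` name, the `sessionId` argument, and the way idempotency keys are derived are hypothetical.

```typescript
// Capture values assumed from the Agent-Assisted Pay API (verify in its docs).
type CaptureField = "payment-card-number" | "expiration-date" | "security-code";

interface PayStep {
  action: "create" | "update"; // create the Pay session, or update it
  params: { [key: string]: string };
}

// Plan the requests for one payment: start, capture each field, complete.
// `sessionId` is a hypothetical caller-supplied value used only to derive a
// distinct idempotency key per request; real code should use unique keys.
function planPaySession(
  sessionId: string,
  fields: CaptureField[],
  statusCallback: string
): PayStep[] {
  const steps: PayStep[] = [];
  const nextKey = (): string => sessionId + "-" + String(steps.length);

  steps.push({
    action: "create",
    params: { IdempotencyKey: nextKey(), StatusCallback: statusCallback },
  });
  for (const field of fields) {
    steps.push({
      action: "update",
      params: { IdempotencyKey: nextKey(), StatusCallback: statusCallback, Capture: field },
    });
  }
  steps.push({
    action: "update",
    params: { IdempotencyKey: nextKey(), StatusCallback: statusCallback, Status: "complete" },
  });
  return steps;
}
```

Each update step corresponds to the agent asking the caller for one piece of information, which is why the fields are captured sequentially rather than in a single request.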

Media Streams

Media Streams gives you access to the raw audio of your Programmable Voice calls by forking the audio stream in real time and sending it over WebSockets to a destination of your choosing. This enables use cases such as real-time transcription, sentiment analysis, and voice authentication. See the <Stream> API documentation to learn more. Raw audio can also be streamed into Twilio from another application, enabling use cases such as conversational IVR or integrations with a regional provider for custom Text-to-Speech. See the documentation for bi-directional streaming.
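A minimal sketch of both ends of a forked stream follows: the TwiML that starts it, and a helper that pulls the audio payload out of a message arriving on your WebSocket server. It assumes the `<Start><Stream>` TwiML form and the JSON message shape in which `media` events carry a base64-encoded audio `payload`. The `wss://example.com/audio` URL is a placeholder and both function names are hypothetical.

```typescript
// TwiML that forks the call audio to a WebSocket endpoint you operate.
// `wssUrl` is assumed to contain no characters that need XML escaping.
function streamTwiml(wssUrl: string): string {
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    "<Response>" +
    '<Start><Stream url="' + wssUrl + '"/></Start>' +
    "</Response>"
  );
}

// Extract the base64 audio from one WebSocket text frame.
// Returns null for non-media events (such as "start" or "stop") and for
// frames that do not have the expected shape.
function mediaPayload(rawFrame: string): string | null {
  let message: any;
  try {
    message = JSON.parse(rawFrame);
  } catch (error) {
    return null; // not JSON, so not a Media Streams message
  }
  if (
    message !== null &&
    typeof message === "object" &&
    message.event === "media" &&
    message.media !== null &&
    typeof message.media === "object" &&
    typeof message.media.payload === "string"
  ) {
    return message.media.payload;
  }
  return null;
}
```

A transcription or sentiment service would decode each returned payload from base64 and feed the audio bytes into its own pipeline.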
