Dia 1.6B

TTS
Web UI

Dia directly generates highly realistic dialogue from a transcript. You can condition the output on audio, enabling emotion and tone control.

Details

Modalities

audio

Recommended Hardware

1xRTX 4090

Provider

Nari Labs

Family

Dia

License

Apache 2.0

Dia 1.6B: Realistic Dialogue Generation from Text

Dia is a text-to-speech model developed by Nari Labs that directly generates highly realistic dialogue from transcripts. The model supports English language generation and enables emotion and tone control through audio conditioning.

Key Features

Dialogue Generation with Speaker Tags: Dia produces natural speech from transcripts using [S1] and [S2] speaker tags, making it easy to create multi-speaker conversations directly from text.
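As a rough illustration of the transcript format described here, the sketch below assembles a two-speaker script into a single tagged string. The helper name `build_transcript` is hypothetical and not part of Dia; it only prepares text, and the single-space separator between turns is an assumption.

```python
def build_transcript(turns):
    """Join (speaker, line) pairs into one transcript string with [S1]/[S2] tags."""
    allowed = {"S1", "S2"}  # the two speaker tags named on this page
    parts = []
    for speaker, line in turns:
        if speaker not in allowed:
            raise ValueError(f"unknown speaker tag: {speaker!r}")
        parts.append(f"[{speaker}] {line.strip()}")
    # Assumption: turns are separated by a single space.
    return " ".join(parts)


turns = [
    ("S1", "Have you tried Dia yet?"),
    ("S2", "Not yet. Is it any good?"),
]
transcript = build_transcript(turns)
print(transcript)
```

The resulting string is what would be passed to the model as its text input.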

Nonverbal Communication: The model recognizes and generates approximately 20 different nonverbal expressions including laughter, coughing, throat clearing, sighing, and gasps. These are triggered using simple tags like "(laughs)", "(clears throat)", and "(sighs)".
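Since only certain tags are recognized, it can help to check a transcript before sending it to the model. The following is a minimal sketch, assuming a partial allow-list that contains only the three tags quoted above; `KNOWN_TAGS` and `check_nonverbal_tags` are hypothetical names, and the full set of supported tags should be taken from the model's own documentation.

```python
import re

# Partial, illustrative list: only the tags quoted on this page.
KNOWN_TAGS = {"laughs", "clears throat", "sighs"}

# Matches any parenthesized span without nested parentheses, e.g. "(laughs)".
TAG_PATTERN = re.compile(r"\(([^()]+)\)")


def check_nonverbal_tags(transcript):
    """Return (known, unknown) lists of parenthesized tags found in the transcript."""
    found = [m.strip().lower() for m in TAG_PATTERN.findall(transcript)]
    known = [t for t in found if t in KNOWN_TAGS]
    unknown = [t for t in found if t not in KNOWN_TAGS]
    return known, unknown


text = "[S1] That was close. (sighs) [S2] (laughs) You worry too much. (whistles)"
known, unknown = check_nonverbal_tags(text)
print("known:", known)
# "unknown" only means the tag is missing from this partial list.
print("unknown:", unknown)
```

Note that this check flags any parenthesized text, so ordinary parenthetical remarks in a transcript would also show up as unknown.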

Voice Cloning: Dia was not fine-tuned on a specific voice, so by default each generation produces a different voice. Speaker consistency across generations can be maintained either through voice cloning from an audio prompt or by fixing the random seed, which also makes results reproducible.
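The effect of seed-fixing can be illustrated without the model itself. The sketch below uses Python's standard `random` module as a stand-in for the model's sampler; `pick_voice` and the `num_voices` value are hypothetical and do not reflect Dia's API or internals.

```python
import random


def pick_voice(seed=None, num_voices=1000):
    """Stand-in for a sampling step: draw one 'voice index' from a seeded generator."""
    # random.Random(seed) is an independent generator; seed=None falls back
    # to system entropy, so repeated calls will generally differ.
    rng = random.Random(seed)
    return rng.randrange(num_voices)


# Same seed, same draw: this is the reproducibility that seed-fixing provides.
first = pick_voice(seed=42)
second = pick_voice(seed=42)
print(first == second)

# No seed: the draw varies from call to call, like the default behavior.
print(pick_voice(), pick_voice())
```

In a real deployment the seed would be set in whichever framework runs the model, before each generation call.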

Audio Conditioning: The model can be conditioned on audio input, enabling precise control over emotion and tone in the generated speech output.

Use Cases

  • Creating realistic dialogue for audio content and storytelling
  • Generating conversational speech with multiple speakers
  • Producing speech with emotional expressions and nonverbal sounds
  • Voice synthesis applications requiring speaker consistency
  • Accessibility tools for text-to-speech conversion

Training and Architecture

Dia draws inspiration from SoundStorm and Parakeet architectures, utilizing the Descript Audio Codec for audio generation. The model development benefited from resources provided by the Google TPU Research Cloud program and a Hugging Face ZeroGPU grant.

Quick Start Guide

Choose a model and click 'Deploy' above to find available GPUs recommended for this model.

Rent your dedicated instance preconfigured with the model you've selected.

Start sending requests to your model instance and receive responses right away.

© 2025 Vast.ai. All rights reserved.
