Deploy LLMs with dstack on Vast.ai

January 16, 2026
2 Min Read
By Team Vast

dstack is an open-source GPU orchestration platform that automates instance provisioning and lifecycle management across cloud providers. This post introduces our new guide to using dstack with Vast.ai as the backend, combining declarative infrastructure with competitive GPU marketplace pricing.

What Makes dstack Special?

Key features include:

  • Infrastructure as Code: Define GPU requirements, pricing limits, and workloads in YAML
  • Automatic Provisioning: dstack finds and provisions the best available instances
  • Cost Controls: Set max_price to cap hourly costs automatically
  • Built-in Proxy: Access services through dstack's authenticated endpoint

What's in the New Guide

Our latest documentation walks you through deploying language models with dstack and vLLM on Vast.ai.

Complete Setup Instructions

  • Installing and configuring dstack with your Vast.ai API key
  • Starting the dstack server and CLI
  • Creating service configurations for vLLM deployments

Working Service Configurations

  • Ready-to-use YAML for deploying Qwen3-30B-A3B
  • GPU memory and pricing parameters
  • Real deployment outputs showing what to expect
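A service configuration along the lines described above might look like the following. This is a plausible sketch using dstack's service schema and vLLM's official Docker image, not the guide's exact YAML; the name, token limits, and price cap are example values:

```yaml
# service.dstack.yml -- example vLLM service (placeholder values)
type: service
name: qwen3-vllm
image: vllm/vllm-openai:latest
commands:
  # vLLM's OpenAI-compatible server listens on port 8000 by default
  - vllm serve Qwen/Qwen3-30B-A3B --max-model-len 8192
port: 8000
resources:
  gpu: 80GB    # e.g. a single H100 80GB
max_price: 2.5 # hourly cost cap in USD
```

Deployment is then a single `dstack apply -f service.dstack.yml`; dstack finds a matching Vast.ai offer, provisions it, and exposes the endpoint through its authenticated proxy.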

API Integration Examples

  • Python code using the OpenAI SDK
  • cURL examples for testing
  • Streaming response implementations
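Because vLLM exposes an OpenAI-compatible API, integration reduces to posting a standard chat-completions payload. The sketch below builds such a payload with the standard library only; the base URL and model name are placeholders for the values dstack prints once the service is up:

```python
import json

# Placeholders -- substitute the endpoint and token that dstack
# reports after `dstack apply` completes.
BASE_URL = "https://<your-dstack-proxy>/services/main/qwen3-vllm"
MODEL = "Qwen/Qwen3-30B-A3B"

def build_chat_request(prompt: str, stream: bool = False) -> dict:
    """Build an OpenAI-compatible /v1/chat/completions payload for vLLM."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        # With stream=True the server returns server-sent events,
        # which the OpenAI SDK surfaces as an iterator of chunks.
        "stream": stream,
    }

# With the official OpenAI SDK, the same request is roughly:
#   client = OpenAI(base_url=f"{BASE_URL}/v1", api_key="<dstack token>")
#   client.chat.completions.create(**build_chat_request("Hello!"))
```

The same payload works from cURL by POSTing the JSON to `{BASE_URL}/v1/chat/completions` with the token in an `Authorization: Bearer` header.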

Why dstack + Vast.ai?

The combination gives you orchestration on top of a GPU marketplace:

  • Simplified Workflow: No manual instance hunting or environment setup
  • Cost Optimization: dstack finds the cheapest instance meeting your requirements
  • Flexible Pricing: Access Vast.ai's competitive rates with automatic cost caps
  • Production-Ready API: vLLM provides OpenAI-compatible endpoints

The guide demonstrates deploying Qwen3-30B-A3B on an H100 80GB, provisioned with a single command.

Who Should Use This Guide?

This deployment guide is perfect for:

  • Teams wanting reproducible, version-controlled GPU deployments
  • Developers building LLM applications who want simple infrastructure management
  • Anyone tired of manually provisioning and configuring GPU instances

Get Started

The complete guide is now available in our documentation:

Read: Deploy LLMs with dstack and vLLM on Vast.ai →

Whether you're new to GPU orchestration or looking for a better way to manage LLM deployments, this guide provides everything you need to get started with dstack on Vast.ai.

Ready to try it? Sign up for Vast.ai and follow the guide to deploy your first model.

© 2026 Vast.ai. All rights reserved.