SOC 2 Certified

Instant GPUs.
Transparent Pricing.

Deploy high-performance GPU instances in seconds and save up to 80% vs. traditional clouds – with 24/7 expert support.

Trusted by developers from around the world

CHAI
BOSCH
Cognition
Inria
IBM
Brave
Speechify

Real-Time GPU Pricing

Prices set by supply and demand across our marketplace. No list prices. No hidden fees.

How It Works

From sign-up to running GPU workloads in under five minutes.

1

Add Credit

Start with as little as $5. No contracts, no minimums.

2

Search GPUs

Filter by model, VRAM, price, and availability across the marketplace.

3

Deploy

Launch instances in seconds. Scale up or down anytime.
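The search step above boils down to filtering marketplace offers by model, VRAM, and price, then taking the cheapest match. Here is a minimal standalone sketch of that logic; the field names (`gpu_name`, `gpu_ram`, `dph_total`) mirror common Vast.ai offer fields, but the records and the `cheapest_offer` helper are invented for illustration:

```python
# Hypothetical marketplace offers -- these records are made up for illustration,
# not real listings or real prices.
offers = [
    {"id": 101, "gpu_name": "RTX_4090", "gpu_ram": 24, "dph_total": 0.42},
    {"id": 102, "gpu_name": "H100_SXM", "gpu_ram": 80, "dph_total": 2.10},
    {"id": 103, "gpu_name": "H100_SXM", "gpu_ram": 80, "dph_total": 1.85},
]

def cheapest_offer(offers, gpu_name, min_vram_gb):
    """Step 2 in miniature: filter by GPU model and VRAM, pick the lowest hourly price."""
    matches = [o for o in offers
               if o["gpu_name"] == gpu_name and o["gpu_ram"] >= min_vram_gb]
    return min(matches, key=lambda o: o["dph_total"], default=None)

best = cheapest_offer(offers, "H100_SXM", min_vram_gb=80)
print(best["id"])  # -> 103, the cheaper of the two matching H100s
```

In practice the filtering happens server-side via the search API, and the winning offer's `id` is what you pass to the deploy step.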

Built for Developers

Provision GPU compute programmatically. CLI, Python SDK, and REST API — deploy from code, not clicks.

CLI
pip install vastai

Python SDK
pip install vastai-sdk

REST API
curl -H "Authorization: Bearer $VAST_API_KEY" https://cloud.vast.ai/api/v1/bundles/
deploy.py
from vastai_sdk import VastAI
vast = VastAI(api_key="...")

offers = vast.search_offers(
    query="gpu_name=H100_SXM num_gpus=8"
)
result = vast.launch_instance(
    id=offers[0]["id"],
    image="vllm/vllm-openai:latest",
    disk=100, ssh=True
)

One Platform, Three Ways to Deploy

GPU Cloud for full control. Serverless for zero-ops inference. Clusters for large-scale training.

GPU Cloud

On-demand instances across 40+ data centers and 20,000+ GPUs. Deploy in seconds, scale without limits.

Serverless

Deploy models as endpoints. Autoscale to zero, pay only for compute time — no idle costs.

Clusters

Dedicated multi-node GPU clusters with InfiniBand networking for large-scale training.
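Of the three modes above, serverless's scale-to-zero is easiest to see in numbers. A rough sketch, assuming an illustrative $2/hour GPU rate and an endpoint that only serves traffic 3 hours a day (both figures are assumptions, not Vast.ai pricing):

```python
# Illustrative numbers only -- the rate and duty cycle are assumptions.
hourly_rate = 2.00           # $/GPU-hour (assumed)
busy_hours_per_day = 3       # hours the endpoint actually serves traffic

always_on_daily = hourly_rate * 24                    # dedicated instance, billed idle or not
serverless_daily = hourly_rate * busy_hours_per_day   # billed only while active

print(f"always-on:  ${always_on_daily:.2f}/day")   # always-on:  $48.00/day
print(f"serverless: ${serverless_daily:.2f}/day")  # serverless: $6.00/day
```

The gap is entirely idle time: the lower the duty cycle, the more scale-to-zero saves, while a near-100% utilized workload is better served by a dedicated instance or cluster.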

Built for Every AI Workload

From training to inference, fine-tuning to rendering — run any GPU workload on Vast.

AI/ML Frameworks

AI Text Generation

AI Image + Video Generation

Batch Data Processing

Audio-to-Text Transcription

Virtual Computing

Popular Models, Ready to Deploy

Launch pre-configured templates for the most popular open-source models.

Qwen3.5 397B A17B

Efficient multimodal reasoning model with hybrid DeltaNet-attention architecture

Kimi K2.5

Open-source, native multimodal agentic model, built through continual pretraining on approximately 15 trillion mixed visual and text tokens atop Kimi-K2-Base

LTX-2

DiT-based audio-video foundation model that generates synchronized video and audio within a single model

DeepSeek OCR

Vision-language model for contexts optical compression

Vast.ai reduced our GPU costs by over 60% while giving us the flexibility to scale training jobs on demand. We serve 200K daily users without breaking the bank.

Giang, Creatix Technology

How Teams Win with Vast.ai

See how companies use Vast.ai to scale AI workloads, reduce costs, and accelerate innovation.

Creatix Technology

Creatix Technology Scales to 200K Daily Users with Vast.ai's GPU Cloud

How a fast-growing AI app company cut infrastructure costs by over 60% and powered millions of new users with Vast.ai.

Tech
PAICON

PAICON Accelerates Global, Data-Centric Cancer Diagnostics with Vast.ai

How a global oncology data platform used Vast.ai’s GPU marketplace to rapidly iterate on Athena—validating that diversity can matter more than scale—while significantly reducing research-phase training costs.

Medical AI

Start with $5. Scale to 20,000 GPUs.

No contracts, no minimums. Deploy GPU instances in seconds on the world's largest GPU marketplace.