Vast.ai GPUs Can Now Be Rented Through SkyPilot

August 8, 2025

By Team Vast

Why This Integration Matters

Running large-scale AI workloads can be expensive and resource-intensive, especially as GPU availability fluctuates across providers. With Vast.ai's extensive GPU supply and SkyPilot's intelligent cloud orchestration, you can enjoy streamlined access to affordable, on-demand compute resources.

SkyPilot automatically selects the most cost-effective and available infrastructure, provisioning GPUs to ensure seamless execution of AI and batch jobs at scale. As the market leader in low-cost cloud GPU rental, Vast.ai provides an ideal foundation for this approach.

What Is SkyPilot?

Put simply, SkyPilot is a powerful framework for managing cloud workloads in a way that abstracts away infrastructure complexity. It was developed in the UC Berkeley lab that proposed the concept of "sky computing," which aims to make cloud providers function as a single compute pool.

With SkyPilot, you can run AI models and batch jobs efficiently without worrying about infrastructure management. Recent SkyPilot releases have also improved managed jobs and Kubernetes support, sped up GPU provisioning, and added ready-to-use LLM recipes.

Vast.ai's Role

You can easily access and manage GPU resources on Vast.ai through SkyPilot. Vast.ai provides a vast supply of cost-effective GPU compute, while SkyPilot automates the selection and provisioning process to match your workload needs – allowing you to run thousands of concurrent jobs and scale efficiently.

With Vast.ai's extensive GPU marketplace, you get reliable, affordable compute on demand. SkyPilot builds on this by simplifying deployment and providing automatic failover when capacity is unavailable.

Getting Started: Practical Tutorial

Now let's walk through a complete example of deploying a language model on Vast.ai using SkyPilot.

Setting Up the Environment

First, install SkyPilot with Vast.ai support:

pip install -U "skypilot[vast]"
pip install "vastai-sdk>=0.1.12"

Configure your Vast.ai API key:

# Get your API key from https://vast.ai → Account → API Keys
echo "YOUR_VAST_API_KEY_HERE" > ~/.vast_api_key
chmod 600 ~/.vast_api_key

# Verify setup
sky check

Expected output:

Vast: enabled ✓
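
If sky check does not report Vast as enabled, one thing worth ruling out is the key file itself. The following is a minimal sketch, not part of SkyPilot or the Vast.ai SDK: the helper name check_key_file is hypothetical, and it only checks that the file exists, is non-empty, and is not readable by other users.

```python
import stat
from pathlib import Path


def check_key_file(path: Path) -> list[str]:
    """Return a list of problems found with an API key file (empty if none)."""
    if not path.is_file():
        return [f"{path} does not exist"]
    problems = []
    if not path.read_text().strip():
        problems.append(f"{path} is empty")
    mode = stat.S_IMODE(path.stat().st_mode)
    # Any group/other permission bit means the key is exposed beyond its owner.
    if mode & 0o077:
        problems.append(f"{path} has mode {oct(mode)}; expected 0o600")
    return problems


if __name__ == "__main__":
    # Path taken from the setup step above.
    for problem in check_key_file(Path.home() / ".vast_api_key"):
        print("warning:", problem)
```

The check does not validate the key against Vast.ai; sky check remains the authoritative test.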

Creating the YAML Configuration

Create a file named deepseek-r1-inference.yaml:

resources:
  accelerators: H100:1
  cloud: vast
  image_id: docker:vllm/vllm-openai:latest
  disk_size: 100

run: |
  vllm serve deepseek-ai/DeepSeek-R1-0528-Qwen3-8B \
    --host 0.0.0.0 \
    --port 8000 \
    --max-model-len 4096 \
    --gpu-memory-utilization 0.9 \
    --reasoning-parser deepseek_r1

This configuration:

- Requests one H100 GPU on Vast.ai
- Uses the pre-built vLLM Docker image
- Allocates 100 GB of disk space for the image and model
- Serves DeepSeek-R1-0528-Qwen3-8B (an 8B distilled variant of DeepSeek R1) on port 8000
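
To see why a single H100 is comfortable for this model, it can help to work through the memory arithmetic behind --gpu-memory-utilization 0.9. The sketch below is a rough estimate under stated assumptions that are not from the configuration itself: an 80 GiB H100, roughly 8.2 billion parameters, and bf16 weights at 2 bytes per parameter. The function name is hypothetical.

```python
GIB = 1024 ** 3


def memory_after_weights_gib(gpu_mem_gib: float, utilization: float,
                             n_params: float, bytes_per_param: int) -> float:
    """GiB left for KV cache and activations once the weights are loaded."""
    usable = gpu_mem_gib * utilization           # share of the GPU vLLM may claim
    weights = n_params * bytes_per_param / GIB   # size of the model weights
    return usable - weights


# Assumed values: 80 GiB H100, ~8.2e9 parameters, bf16 (2 bytes each).
remaining = memory_after_weights_gib(80, 0.9, 8.2e9, 2)
print(f"About {remaining:.0f} GiB remain for KV cache and activations")
```

Under these assumptions, well over half of the GPU is left after the weights load, which suggests there is room to raise --max-model-len beyond 4096 if longer contexts are needed.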

Deploying the Service

Launch your service with SkyPilot:

sky launch deepseek-r1-inference.yaml --cluster deepseek-r1

This command will:

1. Find an available H100 GPU on Vast.ai
2. Provision the instance
3. Download the model weights (~15 GB)
4. Start the vLLM API server

Deployment typically takes 5-10 minutes. Monitor progress with:

sky logs deepseek-r1 --follow

Verify deployment:

sky status deepseek-r1

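You should see Status: UP once the cluster is provisioned. The vLLM server may still be loading the model at that point, so confirm in the logs that it has started before sending requests.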

SSH Port Forwarding

Since direct port mapping isn't available, use SSH tunneling to access your service:

ssh -L 8000:localhost:8000 deepseek-r1

Keep this terminal open! The tunnel needs to stay active for API access.
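
If requests fail later on, a quick way to tell whether the tunnel is still up is to try a TCP connection to the forwarded port. This is a minimal sketch using only the Python standard library; the helper name is_port_open is hypothetical.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Port 8000 matches the local side of the ssh -L forwarding above.
    print("tunnel up" if is_port_open("localhost", 8000) else "tunnel down")
```

A successful connection only shows that ssh is listening locally. It does not show that the vLLM server on the remote side has finished starting.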

Testing the API

In a new terminal, test your deployed service:

# Test health endpoint
curl http://localhost:8000/health

# List available models
curl http://localhost:8000/v1/models
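
Because the model can take several minutes to load, the health check may fail at first. One way to handle this in a script is to poll until the endpoint answers. The sketch below uses only the standard library; the function names, attempt count, and delay are illustrative choices rather than values from vLLM or SkyPilot.

```python
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 5.0) -> bool:
    """Return True if a GET request to url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # urllib's URLError and HTTPError are both subclasses of OSError.
        return False


def wait_until_ready(probe: Callable[[], bool], attempts: int = 60,
                     delay: float = 10.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call probe up to `attempts` times, pausing `delay` seconds between tries."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Example usage against the tunnelled endpoint (commented out so that
# importing this file does not start polling):
# ready = wait_until_ready(lambda: http_ok("http://localhost:8000/health"))
```

Passing the probe as a callable keeps the retry loop independent of HTTP, so the same loop could wrap a port check or any other readiness test.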

Python Client Example

import openai

# Configure client for your local tunnel
client = openai.OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="sk-fake-key"  # placeholder; vLLM ignores the key unless started with --api-key
)

# Make a simple request
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
    messages=[{
        "role": "user",
        "content": "Write a brief history of Los Angeles in one paragraph."
    }],
    max_tokens=300,
    temperature=0.7
)

print(response.choices[0].message.content)
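
Since the server was started with --reasoning-parser deepseek_r1, the model's chain of thought may be returned separately from the final answer. The sketch below assumes the reasoning arrives in a message field named reasoning_content; that field name is an assumption that can vary between vLLM versions, and the sample message is made up for illustration. It works on the message converted to a plain dict.

```python
def split_reasoning(message: dict) -> tuple[str, str]:
    """Return (reasoning, answer) from a chat message given as a dict."""
    # Assumed field name; check the vLLM version you deploy.
    reasoning = message.get("reasoning_content") or ""
    # content can be None if the token limit is hit during reasoning.
    answer = message.get("content") or ""
    return reasoning.strip(), answer.strip()


# Hypothetical message for illustration only.
sample = {
    "role": "assistant",
    "reasoning_content": "The user wants one paragraph, so keep it short.\n",
    "content": "Los Angeles was founded in 1781.",
}
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

With a reasoning model, a small max_tokens value can be used up by the reasoning before any answer is produced. If the answer comes back empty, raising max_tokens is a reasonable first thing to try.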

Managing Your Deployment

# Check status
sky status deepseek-r1

# Stop cluster (saves money)
sky stop deepseek-r1

# Start stopped cluster
sky start deepseek-r1

# Terminate completely
sky down deepseek-r1
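
For scripted runs, it is easy to forget the final sky down and keep paying for an idle GPU. A minimal sketch of one way to guard against that is to wrap the work in try/finally. The function name run_then_teardown is hypothetical; the --yes flag skips the confirmation prompt of sky down.

```python
import subprocess
from typing import Callable, Sequence


def run_then_teardown(cluster: str, work: Callable[[], None],
                      run: Callable[[Sequence[str]], object] = subprocess.run) -> None:
    """Run work(), then terminate the cluster even if work() raised."""
    try:
        work()
    finally:
        # Runs on success and on failure, so the instance never lingers.
        run(["sky", "down", cluster, "--yes"])


# Example usage (commented out because it would terminate a real cluster):
# run_then_teardown("deepseek-r1", lambda: print("send requests here"))
```

The run parameter exists so the command can be replaced with a stub when testing; by default it calls the sky CLI through subprocess.run.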

Conclusion

You now have a working, OpenAI-compatible inference API running on a cost-effective Vast.ai H100 GPU, managed through SkyPilot's streamlined interface. This combination provides capable AI infrastructure at a fraction of traditional cloud costs.

The integration between Vast.ai and SkyPilot represents a powerful solution for developers who need reliable, affordable GPU access without the complexity of manual infrastructure management. Start experimenting with your own AI workloads using this cost-effective stack!