Vast.ai GPUs can be seamlessly rented through SkyPilot, an open-source framework that automates AI, LLM, and batch job execution. This integration makes it easy to access Vast.ai's low-cost, high-performance GPUs while SkyPilot handles provisioning, scaling, and cost optimization.
Whether you're training deep learning models, fine-tuning LLMs, or running large-scale batch jobs, this partnership provides an efficient and cost-effective solution for developers everywhere.
Running large-scale AI workloads can be expensive and resource-intensive, especially as GPU availability fluctuates across providers. With Vast.ai's extensive GPU supply and SkyPilot's intelligent cloud orchestration, you can enjoy streamlined access to affordable, on-demand compute resources.
SkyPilot automatically selects the most cost-effective and available infrastructure, provisioning GPUs to ensure seamless execution of AI and batch jobs at scale. As the market leader in low-cost cloud GPU rental, Vast.ai provides an ideal foundation for this approach.
Put simply, SkyPilot is a powerful framework for managing cloud workloads in a way that abstracts away infrastructure complexity. It was developed in the UC Berkeley lab that proposed the concept of "sky computing," which aims to make cloud providers function as a single compute pool.
With SkyPilot, you can run AI models and batch jobs efficiently without worrying about infrastructure management. Recent SkyPilot releases include improvements to managed jobs and Kubernetes support, faster GPU provisioning, and ready-to-use LLM recipes.
You can easily access and manage GPU resources on Vast.ai through SkyPilot. Vast.ai provides a vast supply of cost-effective GPU compute, while SkyPilot automates the selection and provisioning process to match your workload needs – allowing you to run thousands of concurrent jobs and scale efficiently.
With Vast.ai's extensive GPU marketplace, you get reliable, affordable compute on demand. SkyPilot simplifies deployment on top of that supply and adds failover when a requested GPU is unavailable.
Now let's walk through a complete example of deploying a language model on Vast.ai using SkyPilot.
First, install SkyPilot with Vast.ai support:
pip install -U "skypilot[vast]"
pip install "vastai-sdk>=0.1.12"
Configure your Vast.ai API key:
# Get your API key from https://vast.ai → Account → API Keys
echo "YOUR_VAST_API_KEY_HERE" > ~/.vast_api_key
chmod 600 ~/.vast_api_key
# Verify setup
sky check
Expected output:
Vast: enabled ✓
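If sky check does not report Vast as enabled, the key file is a reasonable first place to look. The sketch below is a hypothetical helper (not part of SkyPilot or the Vast.ai SDK) that checks whether the file written above exists, is non-empty, and is accessible only to its owner. It assumes a POSIX system.

```python
import stat
from pathlib import Path


def check_api_key_file(path):
    """Return a list of problems found with the key file (empty list means OK)."""
    key_path = Path(path).expanduser()
    if not key_path.is_file():
        return [f"{key_path} does not exist"]

    problems = []
    if not key_path.read_text().strip():
        problems.append(f"{key_path} is empty")

    # Any group/other permission bit means the file is wider than chmod 600.
    mode = stat.S_IMODE(key_path.stat().st_mode)
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        problems.append(f"{key_path} has mode {oct(mode)}; expected 0o600")
    return problems


for problem in check_api_key_file("~/.vast_api_key"):
    print("problem:", problem)
```

An empty result only means the file looks well-formed; sky check remains the authoritative test that the key itself is valid.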
Create a file named deepseek-r1-inference.yaml:
resources:
  accelerators: H100:1
  cloud: vast
  image_id: docker:vllm/vllm-openai:latest
  disk_size: 100
run: |
  vllm serve deepseek-ai/DeepSeek-R1-0528-Qwen3-8B \
    --host 0.0.0.0 \
    --port 8000 \
    --max-model-len 4096 \
    --gpu-memory-utilization 0.9 \
    --reasoning-parser deepseek_r1
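This configuration requests a single H100 GPU on Vast.ai with a 100 GB disk, runs the vllm/vllm-openai Docker image, and serves DeepSeek-R1-0528-Qwen3-8B on port 8000 with a 4096-token context limit, 90% GPU memory utilization, and the deepseek_r1 reasoning parser.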
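As a rough check on the resource request, the arithmetic below estimates how much of the GPU memory budget the model weights consume. The figures are assumptions rather than values from the configuration: roughly 8 billion parameters (inferred from the model name), 2 bytes per parameter for 16-bit weights, and the 80 GB H100 variant. Whatever remains inside the 0.9 utilization budget is what vLLM has available for the KV cache and activations.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory for model weights in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def vllm_budget_gb(gpu_memory_gb, gpu_memory_utilization):
    """Memory vLLM may claim on one GPU under --gpu-memory-utilization."""
    return gpu_memory_gb * gpu_memory_utilization


# Assumed figures; adjust them for your model and GPU.
NUM_PARAMS = 8e9      # ~8B parameters, inferred from the model name
GPU_MEMORY_GB = 80    # 80 GB H100 variant

weights = weight_memory_gb(NUM_PARAMS)
budget = vllm_budget_gb(GPU_MEMORY_GB, 0.9)
print(f"weights ~{weights:.0f} GB, budget ~{budget:.0f} GB, "
      f"left for KV cache and activations ~{budget - weights:.0f} GB")
```

Under these assumptions the weights take about 16 GB of a 72 GB budget, which suggests a single H100 has ample headroom for this model.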
Launch your service with SkyPilot:
sky launch deepseek-r1-inference.yaml --cluster deepseek-r1
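This command provisions a Vast.ai instance that matches the requested resources, starts the container image, and runs the vllm serve command.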
The deployment takes 5-10 minutes. Monitor progress with:
sky logs deepseek-r1 --follow
Verify deployment:
sky status deepseek-r1
You should see Status: UP when ready.
Since direct port mapping isn't available, use SSH tunneling to access your service:
ssh -L 8000:localhost:8000 deepseek-r1
Keep this terminal open! The tunnel needs to stay active for API access.
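Before sending requests, it can help to confirm that something is listening on the forwarded port. This is a minimal sketch using only the standard library; the helper name is hypothetical. A successful TCP connection only shows that the local end of the tunnel is open, not that vLLM has finished loading the model.

```python
import socket


def port_is_open(host="localhost", port=8000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


state = "open" if port_is_open() else "closed"
print(f"localhost:8000 is {state}")
```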
In a new terminal, test your deployed service:
# Test health endpoint
curl http://localhost:8000/health
# List available models
curl http://localhost:8000/v1/models
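The same checks can be scripted. The sketch below polls the health endpoint until it answers and then lists the served model IDs, using only the Python standard library. It assumes the SSH tunnel is active and that /v1/models returns the OpenAI-style list format, with a data array of objects that each carry an id. The function names and retry settings are illustrative.

```python
import json
import time
import urllib.error
import urllib.request


def wait_until_healthy(base_url, attempts=30, delay=10.0):
    """Poll {base_url}/health until it returns HTTP 200 or attempts run out."""
    for attempt in range(attempts):
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Server not reachable or not ready yet; retry after a pause.
        if attempt < attempts - 1:
            time.sleep(delay)
    return False


def list_model_ids(base_url):
    """Return the model IDs reported by the OpenAI-compatible /v1/models route."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=5) as resp:
        payload = json.load(resp)
    return [model["id"] for model in payload.get("data", [])]
```

Calling wait_until_healthy("http://localhost:8000") and then list_model_ids("http://localhost:8000") mirrors the two curl commands. With the defaults, the first call waits up to about five minutes before giving up.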
You can also query the service from Python with the OpenAI client (pip install openai):
import openai

# Configure the client for your local tunnel
client = openai.OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="sk-fake-key",  # vLLM doesn't require a real API key
)

# Make a simple request
response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-0528-Qwen3-8B",
    messages=[{
        "role": "user",
        "content": "Write a brief history of Los Angeles in one paragraph.",
    }],
    max_tokens=300,
    temperature=0.7,
)
print(response.choices[0].message.content)
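Because the server was started with --reasoning-parser deepseek_r1, vLLM separates the model's reasoning from its final answer. The field that carries the reasoning text is an assumption here: vLLM releases have exposed it as reasoning_content, and the hypothetical helper below also checks a reasoning attribute in case the name differs in your version. It also covers the case where content is empty because the token limit was reached while the model was still reasoning.

```python
from types import SimpleNamespace


def split_reasoning(message):
    """Return (reasoning, answer) from a chat completion message object.

    Field names are assumptions that depend on the vLLM version; missing
    or empty fields fall back to an empty string.
    """
    reasoning = (getattr(message, "reasoning_content", None)
                 or getattr(message, "reasoning", None)
                 or "")
    answer = getattr(message, "content", None) or ""
    return reasoning, answer


# Stand-in for response.choices[0].message, so the sketch runs offline.
example = SimpleNamespace(reasoning_content="Recall key dates...", content=None)
reasoning, answer = split_reasoning(example)
if not answer:
    print("No final answer yet; consider raising max_tokens.")
```

Against the live service, pass response.choices[0].message instead of the stand-in object.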
# Check status
sky status deepseek-r1
# Stop cluster (saves money)
sky stop deepseek-r1
# Start stopped cluster
sky start deepseek-r1
# Terminate completely
sky down deepseek-r1
You now have a production-ready AI API running on Vast.ai's cost-effective H100 GPUs, managed through SkyPilot's streamlined interface. This combination provides enterprise-grade AI capabilities at a fraction of traditional cloud costs.
The integration between Vast.ai and SkyPilot represents a powerful solution for developers who need reliable, affordable GPU access without the complexity of manual infrastructure management. Start experimenting with your own AI workloads using this cost-effective stack!