June 2026 Product Update

June's updates bring NVIDIA B200 and Blackwell Ultra B300 GPU supply to the marketplace, plus new templates, guides, and platform improvements.
NVIDIA B200 and B300 GPUs Now Available
Fresh GPU supply is live on Vast.ai, including the powerful NVIDIA B300. Built on the Blackwell Ultra architecture, the B300 features 288 GB of HBM3e memory, 8 TB/s of memory bandwidth, and 20,480 CUDA cores. It is designed for large-scale inference, reasoning models, distributed training, and any workload that keeps running out of VRAM. See up-to-date B300 pricing, and rent by the hour or reserve long-term.
We've also added hundreds of B200 GPUs to the marketplace. Rent individual instances or use Vast.ai Serverless to automatically scale workloads without managing infrastructure yourself.
Platform Updates: Serverless, Billing, and Security Fixes
We've continued our platform updates with several improvements this month. Serverless metrics reporting has been revamped with finer granularity and new metrics, and the serverless UI now performs significantly better on endpoints with high instance counts.
In addition, when you create a machine report, that machine now stays out of your search results for longer, so your results stay more relevant to you.
On the fixes side, we've improved support for manual invoices in billing transaction history exports, fixed instance SSH key issues involving key editing and stale data, addressed OAuth login and two-factor authentication (2FA) issues, and made a number of security improvements.
New Templates
This month, we've added a template for Qwen3.6 35B A3B, the first open-weight model in the Qwen3.6 series. It combines a hybrid Gated DeltaNet and Gated Attention architecture with sparse Mixture-of-Experts (MoE) routing and a vision encoder, and it is optimized for agentic coding, frontend workflows, and repository-level reasoning.
Serverless templates for Qwen3.6 and Gemma 4 are now available. The Qwen3.6 template supports a model with 35B total parameters, of which 3B are active.
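For a rough sense of what "35B total, 3B active" means for memory, here is a minimal back-of-the-envelope sketch. It assumes that all 35B parameters must stay resident in GPU memory even though only about 3B are used for a given token, and that weights take 2 bytes each in bf16 or 1 byte each in fp8. The helper name and the precision choices are illustrative assumptions, not part of the template, and the estimate ignores KV cache, activations, and runtime overhead.

```python
# Rough weight-memory estimate for a sparse MoE model (illustrative only).
# Ignores KV cache, activations, and framework overhead.

def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in decimal GB."""
    return params_billion * 1e9 * bytes_per_param / 1e9

TOTAL_PARAMS_B = 35    # total parameters, from the template description
ACTIVE_PARAMS_B = 3    # active parameters, from the template description
B300_MEMORY_GB = 288   # HBM3e capacity quoted for the B300

# Assumed precisions: bf16 (2 bytes/param) and fp8 (1 byte/param).
for name, bytes_per_param in [("bf16", 2), ("fp8", 1)]:
    resident = weight_memory_gb(TOTAL_PARAMS_B, bytes_per_param)
    active = weight_memory_gb(ACTIVE_PARAMS_B, bytes_per_param)
    fits = resident < B300_MEMORY_GB
    print(f"{name}: ~{resident:.0f} GB of weights resident, "
          f"~{active:.0f} GB of weights active, "
          f"fits in one B300's memory: {fits}")
```

Under these assumptions the weights alone come to roughly 70 GB in bf16, so most of a 288 GB card remains for KV cache and batching. Actual requirements depend on the serving stack and context length.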
We've also updated the vLLM template to include TurboQuant support and made some maintenance updates to the ComfyUI template.
Our Commitment
We remain committed to keeping high-performance AI infrastructure accessible and affordable for everyone. With fresh B200 and B300 GPU supply live on Vast.ai, we're continuing to expand the resources available to our users while making it easier to build, train, fine-tune, and deploy AI workloads at scale.
Need help? Reach out anytime at support@vast.ai or join our Discord server to connect with the community and keep up with our latest updates.
Change Log
New Features
- NVIDIA B200 and B300 GPUs are now available.
- Creating a machine report now removes the machine from your results for a longer duration.
- Revamped serverless metrics reporting: finer granularity and new metrics.
- Significantly improved serverless UI performance for endpoints with high instance counts.
- Billing page UI improvements.
- Search page performance improvements.
- B300 support.
Issues Resolved
- Improved support for manual invoices on billing transaction history export.
- Fixed instance SSH key issues with key editing and stale data.
- Fixed OAuth login issues.
- Various 2FA fixes.
- Security improvements.
New Templates
- Qwen3.6 35B A3B: the first open-weight model in the Qwen3.6 series, built on direct community feedback and focused on stability and real-world utility. It combines a hybrid Gated DeltaNet and Gated Attention architecture with sparse Mixture-of-Experts routing and a vision encoder for unified multimodal reasoning.
- Serverless templates for Qwen3.6 and Gemma 4.
Updated Templates
- vLLM template: TurboQuant support.
- ComfyUI template: maintenance updates.
New Guides
- Step-by-step instructions for moving workloads from Salad Cloud to Vast.ai, including instance setup and cost comparison.
- How to generate video with LTX-2.3 and ComfyUI on Vast.ai.
- A practical guide to fine-tuning open-source LLMs on Vast.ai GPU instances using the Axolotl framework.
- Deploy and run NVIDIA Nemotron 3 Super on Vast.ai for high-performance inference.


