
Hugging Face TGI with Llama3

This is a guide on how to set up and expose an API for Llama3 text generation.

For this guide, the model will be unquantized, using the 8B parameter version.

1) Select the Template #

Log in to your Vast account on the console.

Select the HuggingFace Llama3 TGI API template by clicking the link provided.

For this template we will be using the meta-llama/Meta-Llama-3-8B-Instruct model and TGI 2.0.4 from Hugging Face.

Templates encapsulate all the information required to run an application with the autoscaler, including machine parameters, docker image, and environment variables.

For this template, the only requirement is that you have your own Hugging Face access token. You will also need to apply for access to Llama3 on Hugging Face, since it is a gated repository.

The template comes with some filters that are minimum requirements for TGI to run effectively. These include, among others, a disk space requirement of 100GB and a GPU RAM requirement of at least 16GB.

After selecting the template your screen should look like this:

2) Modifying the Template #

This template will fail to run if you do not supply your Hugging Face access token, or if you have not been granted access to Meta's gated Llama3 repository on Hugging Face.

Once you have selected the template, you will then need to add your Hugging Face token and click the 'Select & Save' button.

You can add your Hugging Face token alongside the rest of the Docker run options.
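
For example, the token is usually passed as an environment variable in the Docker options field. The exact variable name is set by the template's pre-filled options, so treat the line below as a sketch with a placeholder token value; TGI reads the standard Hugging Face token variables (HUGGING_FACE_HUB_TOKEN, or HF_TOKEN in newer versions):

-e HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxx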

This is the only modification you will need to make on this template.

You can then press 'Select & Save' to get ready to launch your instance.

3) Rent a GPU #

Once you have selected the template, you can rent a GPU of your choice from either the search page or the CLI/API.

For someone just getting started, we recommend either an Nvidia RTX 4090 or an A5000.
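
If you prefer the CLI, you can search for offers that meet the template's requirements and then rent one of the returned offer IDs. This is only a sketch; the exact query fields and units can vary between CLI versions, so check vastai search offers --help for the current syntax:

vastai search offers 'gpu_name=RTX_4090 num_gpus=1 disk_space>=100'

Once you have picked an offer, you can rent it from the search page or create an instance from it via the CLI.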

4) Monitor Your Instance #

Once you rent a GPU, your instance will begin spinning up on the Instances page.

You know the API will be ready when your instance looks like this:

Once your instance is ready, you will need to find where your API is exposed. Open the IP & Port Config by pressing the blue button at the top of the instance card; the networking configuration is shown there.

After opening the IP & Port Config you should see a port forwarded from 5001; this is where your API resides. To hit TGI, you can use the '/generate' endpoint on that port.

Here is an example:
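
The sketch below uses Python's requests library; the IP address and port are placeholders for the public IP and the external port (mapped to 5001) shown in your IP & Port Config.

import requests

# Replace with the public IP and the external port forwarded to 5001 on your instance
INSTANCE_URL = "http://<INSTANCE_IP>:<FORWARDED_PORT>"

response = requests.post(
    f"{INSTANCE_URL}/generate",
    headers={"Content-Type": "application/json"},
    json={
        "inputs": "What is the best movie of all time?",
        "parameters": {"max_new_tokens": 256},
    },
)
print(response.text)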

5) Congratulations! #

You now have a running instance with an API that is using TGI loaded up with Llama3 8B!

Serverless/Autoscaler Guide #

As you use TGI, you may want to scale up to handle higher loads. We currently offer a serverless version of Hugging Face TGI via a template built to run with the Autoscaler.

The Autoscaler is built to run dynamic workloads that change over time, and it allows you to run multiple versions of the same model while allocating resources as effectively as possible.

1) Select the Autoscaler Template #

In order to use the Autoscaler version of TGI, you must use the Autoscaler version of the template, which can be found in the Autoscaler section of the Templates page.

When you find it and select it, it should look like this on the search page:

The setup is essentially the same: all you have to do is supply your Hugging Face access token, and then you will be good to go.

2) Create your Endpoint Group #

To use the API endpoint, you need to create an "Endpoint Group" (aka endptgroup) that manages your endpoint in response to incoming load.

You can do this through the CLI using this command:

vastai create endpoint --endpoint_name "my-endpoint" --cold_mult 1.0 --min_load 100 --target_util 0.9 --max_workers 20 --cold_workers 5
  • "min_load" : This is the baseline amount of load (tokens / second for LLMs) you want your autoscaling group to be able to handle. A good default is 100.0

  • "target_util" : The percentage of your autogroup compute resources that you want to be in-use at any given time. A lower value allows for more slack, which means your instance group will be less likely to be overwhelmed if there is a sudden spike in usage.

  • "cold_mult" : The multiple of your current load that is used to predict your future load, for example if you currently have 10 users, but expect there to be 20 in the near future, you can set cold_mult = 2.0. This should be set to 2.0 to begin.

  • "max_workers" : The maximum number of workers your endpoint group can have.

  • "cold_workers": The mimimum number of workers you want to keep "cold" (meaning stopped and fully loaded) when your group has no load. Note that this is only taken into account if you already have workers which are fully loaded but are no longer needed. A good way to ensure that you have enough workers that are loaded is setting the "test_workers" parameter of the autogroup correctly.

3) Create an Autoscaling Group #

Endpoint groups consist of one or more "Autoscaling Groups" (aka autogroups). For example, the following command adds an autoscaling group defined by the meta-llama/Meta-Llama-3-8B-Instruct template to your endpoint group.

vastai create autoscaler --endpoint_name "my-endpoint" --template_hash "XXXXXXXXXXXXXXXXXXXXXX" --test_workers 5
  • "template_hash" : Your template hash after adding your huggingface access key
  • "test_workers" : Min number of workers to create while initializing autogroup. This allows the autogroup to get performance estimates from machines running your configurations before deploying them to serve your endpoint. This will also allow you to create workers which are fully loaded and "stopped" (aka "cold") so that they can be started quickly when you introduce load to your endpoint.

Note: If you don't create an endpoint group explicitly before creating your autogroup, an endpoint group with the given name will be created in your account automatically; you will just need to make sure that the endpoint group parameters (--cold_mult, --min_load, --target_util) are set correctly.

Once you have an autogroup to define your machine configuration and an endpoint group to define a managed endpoint for your API, Vast's autoscaling server will automatically go to work finding offers and creating instances from them for your API endpoint. The instances the autoscaler creates will be accessible from your account and will have a tag corresponding to the name of your endpoint group.
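
You can check on these instances from the CLI as well; for example, listing the instances on your account will show any workers the autoscaler has created:

vastai show instances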

4) Send a request to your Endpoint Group #

Once your instance is loaded and running, you will be able to call the /route/ endpoint to get the address of your API endpoint on one of your worker servers. If you don't have any ready workers, the route endpoint will tell you the number of loading workers in the "status" field of the response.

Here is an example file that calls the https://run.vast.ai/route/ endpoint and then forwards a model request to the returned worker address.

You can copy this example.py and run it yourself with:

python example.py "my-endpoint" <YOUR API KEY>

import requests
import json
import argparse


def main(args):
    # Call /route endpoint on autoscaler server
    route_payload = {
        "endpoint": args.endpoint_name,
        "api_key": args.api_key,
        "cost": 256,
    }
    response = requests.post(
        "https://run.vast.ai/route/",
        headers={"Content-Type": "application/json"},
        data=json.dumps(route_payload),
        timeout=4,
    )
    if response.status_code != 200:
        print(
            f"Failed to get worker address, response.status_code: {response.status_code}"
        )
        return
    message = response.json()
    worker_address = message["url"]

    # Call /generate endpoint on worker
    generate_payload = message
    generate_url = f"{worker_address}/generate"
    # the following fields would be sent from the client to the proxy server
    generate_payload["inputs"] = "What is the best movie of all time?"
    generate_payload["parameters"] = {"max_new_tokens": 256}
    generate_response = requests.post(
        generate_url,
        headers={"Content-Type": "application/json"},
        json=generate_payload,
    )
    if generate_response.status_code != 200:
        print(f"Failed to call /generate endpoint for {generate_url}")
        return
    print(f"Response from {generate_url}:", generate_response.text)


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Call an API at a specified rate")
    parser.add_argument("endpoint_name", type=str, help="Name of the endpoint to call")
    parser.add_argument("api_key", type=str, help="API key for the endpoint")
    args = parser.parse_args()
    main(args)

Please note that the backend endpoint on the worker instance will depend on what backend you are running, and more information about the endpoints can be found here

5) Monitor your Groups #

There is an endpoint on the autoscaler server that lets you access the logs corresponding to your endpoint group and autogroups, which is described here

There is also an endpoint that allows you to see metrics for your groups, which is described here