Docs - Autoscaler

Stats Endpoint

Please note the Autoscaler is currently in Beta, and is subject to changes and downtime.

Both endptgroups and autogroups keep track of a number of different metrics relating to their performance over time.

The currently tracked metrics are:

  • nworkers: Current number of workers in the autogroup.
  • totreqs : Total number of requests your autogroup has served.
  • curload : The rate at which requested work units are coming in to the server, in work units per second.
  • capacity: The total number of requested work units your autogroup can handle.
  • reliable: Average rate of success for requests sent to workers in your autogroup.
  • reqrate : The rate at which new requests are coming into your autogroup.

The /get_endpoint_stats/ and /get_autogroup_stats/ endpoints return these metrics in one of two modes.

Below are the minimum required fields to include in your request to these endpoints.

/get_endpoint_stats/ :

  • One of the following
    • id: ID of your endpoint group
    • endpoint: name of your endpoint group
  • api_key: API_KEY corresponding to your endpoint group, used for authentication

/get_autogroup_stats/ :

  • id: ID of your autogroup
  • api_key: API_KEY corresponding to your autogroup, used for authentication
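
The required fields above can be assembled with a small helper. The following is a minimal sketch, not an official client: the function names build_endpoint_stats_payload and build_autogroup_stats_payload are hypothetical, and requiring exactly one of id or endpoint is an assumption based on the "One of the following" wording.

```python
import json
from typing import Optional


def build_endpoint_stats_payload(
    api_key: str,
    endpoint_id: Optional[int] = None,
    endpoint_name: Optional[str] = None,
) -> dict:
    """Minimal body for /get_endpoint_stats/: api_key plus id or endpoint."""
    # Assumption: exactly one identifier is sent; the docs only say "one of".
    if (endpoint_id is None) == (endpoint_name is None):
        raise ValueError("provide exactly one of endpoint_id or endpoint_name")
    payload = {"api_key": api_key}
    if endpoint_id is not None:
        payload["id"] = endpoint_id
    else:
        payload["endpoint"] = endpoint_name
    return payload


def build_autogroup_stats_payload(api_key: str, autogroup_id: int) -> dict:
    """Minimal body for /get_autogroup_stats/: id and api_key."""
    return {"id": autogroup_id, "api_key": api_key}


if __name__ == "__main__":
    # Print the JSON bodies that would be POSTed to each endpoint.
    print(json.dumps(build_endpoint_stats_payload("API_KEY_HERE", endpoint_name="test-local")))
    print(json.dumps(build_autogroup_stats_payload("API_KEY_HERE", 28)))
```

Keeping payload construction separate from the HTTP call makes it easy to check the request body before sending it.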

If you include only these parameters in your request, the endpoint returns the current value of every metric.

Here is an example output from calling the endpoint in this mode:

    {
      "capacity": 8306.60345739313,
      "curload": 9250.0,
      "endpoint": "test-local",
      "id": 28,
      "nworkers": 44,
      "reliable": 0.9942605208097545,
      "reqrate": 31.918882596395715,
      "totreqs": 8603534
    }
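
As a rough illustration of reading such a snapshot, the sketch below parses the example output above and compares curload with capacity. Treating the two values as directly comparable (both in work units per second) is an assumption, and the helper name load_ratio is hypothetical.

```python
import json

# The example output shown above, as a JSON string.
SNAPSHOT_JSON = """
{ "capacity": 8306.60345739313, "curload": 9250.0, "endpoint": "test-local",
  "id": 28, "nworkers": 44, "reliable": 0.9942605208097545,
  "reqrate": 31.918882596395715, "totreqs": 8603534 }
"""


def load_ratio(snapshot: dict) -> float:
    """Return curload / capacity; values above 1.0 suggest demand exceeds capacity.

    Assumption: curload and capacity are expressed in the same units.
    """
    capacity = snapshot["capacity"]
    if capacity <= 0:
        raise ValueError("capacity must be positive to compute a ratio")
    return snapshot["curload"] / capacity


if __name__ == "__main__":
    snapshot = json.loads(SNAPSHOT_JSON)
    print(f"workers: {snapshot['nworkers']}, load ratio: {load_ratio(snapshot):.2f}")
```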

If you want the values of a specific metric over a specific time range, you can use either endpoint in "timeseries" mode by supplying these extra parameters:

  • metric_name: Name of the metric you are querying
  • t1: Start timepoint of the timeseries
  • t2: End timepoint of the timeseries

Below is an example of calling the /get_autogroup_stats/ endpoint in "timeseries" mode to query for the nworkers metric.

    import json
    import requests

    # Query the nworkers metric for autogroup 28 between t1 and t2.
    stats_payload = {
        "id": 28,
        "metric_name": "nworkers",
        "t1": 5000,
        "t2": 5010,
        "api_key": "API_KEY_HERE",
    }
    response = requests.post(
        "https://run.vast.ai/get_autogroup_stats/",
        headers={"Content-Type": "application/json"},
        data=json.dumps(stats_payload),
        timeout=4,
    )
    if response.status_code != 200:
        print(f"Failed to call /get_autogroup_stats/, response.status_code: {response.status_code}")
    else:
        # On success, print the decoded JSON body.
        print(response.json())

Here is the same request in curl form, followed by its response:

curl https://run.vast.ai/get_autogroup_stats/ -X POST -d '{"id" : 28, "api_key" : "API_KEY_HERE", "metric_name" : "nworkers", "t1": 5000, "t2" : 5010}' -H 'Content-Type: application/json'

response:

{ "endpoint": "test-local", "id": 28, "nworkers": [ 8.0, 8.0 ], "range_len": 2 }
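
The response above suggests that the samples are returned under a key named after the requested metric, with range_len giving the number of samples. That reading is inferred from this single example rather than from a documented schema, so the following sketch (with a hypothetical helper, extract_series) checks it defensively.

```python
import json
from typing import List


def extract_series(response_body: dict, metric_name: str) -> List[float]:
    """Return the list of samples for metric_name from a timeseries response.

    Assumption: samples live under the metric's own name, and range_len,
    when present, equals the number of samples.
    """
    series = response_body.get(metric_name)
    if not isinstance(series, list):
        raise ValueError(f"no timeseries found for metric {metric_name!r}")
    expected = response_body.get("range_len")
    if expected is not None and expected != len(series):
        raise ValueError(f"range_len is {expected} but got {len(series)} samples")
    return [float(value) for value in series]


if __name__ == "__main__":
    body = json.loads(
        '{ "endpoint": "test-local", "id": 28, "nworkers": [ 8.0, 8.0 ], "range_len": 2 }'
    )
    samples = extract_series(body, "nworkers")
    print(f"{len(samples)} samples, mean {sum(samples) / len(samples)}")
```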

Note that the t1 and t2 timestamps are measured in seconds and are relative to the UNIX Epoch.
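
Since t1 and t2 are UNIX timestamps in seconds, a query window can be derived from ordinary datetime values. The following is a minimal sketch, assuming integer seconds are acceptable (as in the example request) and using a hypothetical helper name, last_minutes_window.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def last_minutes_window(minutes: int, now: Optional[datetime] = None) -> Tuple[int, int]:
    """Return (t1, t2) in UNIX seconds covering the last `minutes` minutes."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    end = now if now is not None else datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    # timestamp() on a timezone-aware datetime is seconds since the UNIX epoch.
    return int(start.timestamp()), int(end.timestamp())


if __name__ == "__main__":
    t1, t2 = last_minutes_window(10)
    stats_payload = {
        "id": 28,
        "metric_name": "nworkers",
        "t1": t1,
        "t2": t2,
        "api_key": "API_KEY_HERE",
    }
    print(stats_payload)
```

Passing a timezone-aware `now` keeps the result independent of the machine's local time zone.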