Google Colab

Google Colab is a flavor of Jupyter notebook that runs in the cloud. It is not really meant to be run on cloud-based GPUs. However, there are some workarounds to get it to work on Vast by forwarding ports and making Google Colab treat the Vast instance as a GPU on the local machine.

We do not recommend Google Colab for Disco Diffusion. Follow our Disco Diffusion guide to run it on Jupyter.

For simple notebooks, we recommend downloading the notebook from Google Colab as a .ipynb file, running a Vast Jupyter instance, and then uploading the notebook into Jupyter directly. Jupyter by itself is much more reliable than Google Colab.

Here is our Google Colab video guide for running Jupyter on a Vast instance, forwarding the ports to your local Windows 10 machine and then connecting to the Vast instance from within Google Colab.

Run any Google Colab notebook on Vast #

Colab supports a 'local runtime' option that lets people connect Colab to their local machine and use their own GPUs. This feature is intentionally restricted to allow only a localhost connection. Getting around that restriction requires using SSH forwarding to make a remote Jupyter instance appear local.

We highly recommend running a Jupyter instance, but if you must use Google Colab then it is important to understand its limitations.

Known issues and limitations #

Because you are connecting to a remote Jupyter instance using SSH forwarding, if you close your browser you can't simply re-open the browser and reconnect. You will need to stop and restart Jupyter through the SSH connection, get a new token, and then use that updated link to reconnect to the local runtime.

Another small limitation is that there is no way (unless you get Colab Pro) to open a terminal from within Colab. With Jupyter, you can easily open a terminal from within the Jupyter instance to zip files or run other commands.

Step 1 - Create a PyTorch SSH instance #

Go through the typical setup flow for a Vast instance and make sure that you select the PyTorch OS image with the SSH launch mode. Enabling a direct connection will limit your machine options to those with open ports, but it gives faster download/upload speeds.

Step 2 - SSH into the instance #

You will then need to connect over SSH with port forwarding. Our default SSH command for Linux/macOS already forwards port 8080, but if you have multiple SSH instances you will need to use a different port for each one to avoid conflicts. The default SSH command can be found by clicking the Connect button on a rented instance.

On Windows, PuTTY supports port forwarding. You will need to add source port 8080 and destination localhost:8080 in the Tunnels options menu.
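If port 8080 is already used by another tunnel, the same forwarding can be set up on a different port. The sketch below only prints the command so you can inspect it before running it. It assumes you log in as root, and the host `ssh.example.com` and SSH port `12345` are placeholders, not real values; `vast_forward_cmd` is a hypothetical helper name. Copy the real host and port from the Connect button.

```shell
# Hypothetical helper: prints (does not run) an SSH command that forwards one port.
# Arguments: <ssh_port> <host> <forward_port>
vast_forward_cmd() {
  ssh_port="$1"
  host="$2"
  fwd_port="$3"
  # -L local_port:localhost:remote_port makes the remote port reachable on your machine
  printf 'ssh -p %s root@%s -L %s:localhost:%s\n' "$ssh_port" "$host" "$fwd_port" "$fwd_port"
}

# Example: forward port 8081 instead of 8080 (placeholder host and SSH port)
vast_forward_cmd 12345 ssh.example.com 8081
```

Whatever port you choose here is the port Jupyter must listen on in Step 3.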

Step 3 - Install packages, run Jupyter #

Once connected to the instance, you'll want to install/upgrade some packages:

apt install -y git curl ffmpeg libsm6 libxext6;

As of 7/5/2022, Colab cannot connect at all to the latest versions of Jupyter due to a bug. We have a pinned install of Jupyter that we know works. The unpinned install will pull the latest versions, which should eventually include a fix for this bug.

Install Jupyter pinned (working as of 7/5/2022):

pip install notebook==6.4.11 tornado==6.1 jupyter ipywidgets jupyter_http_over_ws widgetsnbextension pandas-profiling opencv-python pandas matplotlib regex;

Or unpinned (may or may not work):

pip install jupyter tornado ipywidgets jupyter_http_over_ws widgetsnbextension pandas-profiling opencv-python pandas matplotlib regex;
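If you used the pinned install, it can be worth confirming that the pins actually took effect before starting Jupyter. This is a minimal sketch, assuming `pip freeze` lists packages in its usual `name==version` form; `check_pins` is a hypothetical helper, not part of pip or Jupyter.

```shell
# Hypothetical helper: reads `pip freeze` output on stdin and reports whether
# the two pinned versions from this guide are present.
check_pins() {
  frozen="$(cat)"
  for pin in 'notebook==6.4.11' 'tornado==6.1'; do
    # -x matches the whole line, -F treats the pin as a literal string
    if printf '%s\n' "$frozen" | grep -qxF "$pin"; then
      echo "ok: $pin"
    else
      echo "missing: $pin"
    fi
  done
}

# Usage on the instance:
#   pip freeze | check_pins
```

A "missing" line means a different version is installed, in which case re-running the pinned install command is the first thing to try.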

Then enable the extensions:

jupyter nbextension enable --py widgetsnbextension; jupyter serverextension enable --py jupyter_http_over_ws;

Then run Jupyter with options like these (adjust the port 8080 to match whatever port you forwarded over SSH):

jupyter notebook --NotebookApp.allow_origin='*' --port=8080 --NotebookApp.port_retries=0 --allow-root --NotebookApp.allow_remote_access=True

That will output a couple of http addresses. You want to use the localhost address with the access token. Make sure to copy the entire string.
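If the startup output has scrolled away, the URL can usually be recovered from a second SSH session with `jupyter notebook list`. The sketch below filters that output down to the localhost address with its token. It assumes the URL is printed in the form `http://localhost:<port>/?token=<hex>`; `local_token_url` is a hypothetical helper name.

```shell
# Hypothetical helper: reads Jupyter's startup or `jupyter notebook list` output
# on stdin and prints the first localhost URL that carries a token.
local_token_url() {
  grep -Eo 'http://(localhost|127\.0\.0\.1):[0-9]+/\?token=[0-9a-f]+' | head -n 1
}

# Usage on the instance, while Jupyter is running:
#   jupyter notebook list | local_token_url
```

The printed string is the complete value to paste into Colab, including everything after `?token=`.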

Step 4 - Connect to local runtime #

Open Google Colab, hit the Connect button, and select the option to "connect to local runtime". Paste the localhost URL with the access token from Step 3. That should then connect!

Note that your SSH session will need to remain open. If at any time your SSH connection dies, Google Colab will lose the connection to the Vast instance and will start to throw errors. To reconnect, you will need to re-establish the SSH connection to the Vast instance and possibly restart Jupyter by hitting CTRL+C to kill the running Jupyter and then re-running the last command to restart the notebook.

If SSH connection drops #

If your SSH connection disconnects due to a network error or for any other reason, the Google Colab instance will throw an error and give you the option to reconnect.

The first thing to do is to reconnect via SSH to the Vast instance. Once that is established, you can try to "reconnect" to the Google Colab instance, but that typically does not work.

The only way to re-establish a connection is to stop the Jupyter server running on the Vast instance and then restart it. Then you can take the new URL and token and reconnect from Google Colab.

This can cause other problems for the running notebook. You may or may not need to re-run all the cells of your notebook.
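The stop-and-restart sequence can be sketched as follows, run on the Vast instance after you have reconnected over SSH. It assumes Jupyter was started on port 8080 with the same options as in Step 3; `jupyter_launch_cmd` is a hypothetical helper that only prints the launch command, and the numbered comments show how it would be used.

```shell
PORT=8080  # assumed; use whichever port you forwarded over SSH

# Hypothetical helper: prints the Step 3 launch command for a given port.
jupyter_launch_cmd() {
  printf "jupyter notebook --NotebookApp.allow_origin='*' --port=%s --NotebookApp.port_retries=0 --allow-root --NotebookApp.allow_remote_access=True\n" "$1"
}

# 1. Stop the server still bound to the port (same effect as CTRL+C in its terminal):
#      jupyter notebook stop "$PORT"
# 2. Start a fresh server, which prints a new URL and token:
#      eval "$(jupyter_launch_cmd "$PORT")"
jupyter_launch_cmd "$PORT"
```

Because `--NotebookApp.port_retries=0` is set, the new server will refuse to start if the old one is still holding the port, so the stop step has to succeed first.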

All your data will still be on the Vast instance and available to be copied off of it.


Vast AI
@ 2023, all rights reserved