NVIDIA GPU Cloud (NGC) hosts a container registry of pre-built images optimized for NVIDIA GPUs. This includes images for PyTorch, NeMo Megatron, BERT, etc. In the following example we're going to use the pytorch:23.01-py3
image as the base for our Megatron-LM image.
The first step is to authenticate with NGC, which allows us to pull images from nvcr.io.
Register for an account at https://ngc.nvidia.com
Log in and fetch your API key; see the NVIDIA docs for instructions on how to do that.
Next, install the NGC CLI on your HeadNode like so:
wget --content-disposition https://ngc.nvidia.com/downloads/ngccli_linux.zip && unzip ngccli_linux.zip && chmod u+x ngc-cli/ngc
echo "export PATH=$(pwd)/ngc-cli:\$PATH" >> ~/.bashrc
source ~/.bashrc
ngc config set
Lastly, log in to the nvcr.io Docker registry. The username is the literal string $oauthtoken and the password is the API key from NGC.
ubuntu@ip-10-0-21-26:~$ docker login nvcr.io
Username: $oauthtoken
Password:
WARNING! Your password will be stored unencrypted in /home/ubuntu/.docker/config.json.
Configure a credential helper to remove this warning. See
https://docs.docker.com/engine/reference/commandline/login/#credentials-store
Login Succeeded
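If you would rather script the login than type at the prompt, the sketch below shows one way to do it. It assumes the API key has been exported as NGC_API_KEY, a variable name chosen for this example and not something the steps above set up. The helper name ngc_docker_login is likewise hypothetical.

```shell
# Log in to nvcr.io without an interactive prompt.
# Assumption: the NGC API key is exported as NGC_API_KEY (name chosen for this sketch).
ngc_docker_login() {
    if [ -z "${NGC_API_KEY:-}" ]; then
        echo "NGC_API_KEY is not set" >&2
        return 1
    fi
    # Single quotes keep $oauthtoken literal: it is the username itself,
    # not a shell variable to expand.
    printf '%s' "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin
}
```

Call it with `ngc_docker_login` once the key is exported. Reading the key from stdin keeps it off the docker command line. Docker still stores it in ~/.docker/config.json unless a credential helper is configured, as the warning above notes.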
Now pull the nemo:23.06 image from NGC:
docker pull nvcr.io/nvidia/nemo:23.06
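To confirm the pull worked before building on top of the image, a small check like the following may help. The function name image_present is hypothetical. It relies only on `docker image inspect`, which exits non-zero when the image is not available locally.

```shell
# Return success if the given image reference exists in the local Docker image store.
image_present() {
    docker image inspect "$1" > /dev/null 2>&1
}
```

For example: `image_present nvcr.io/nvidia/nemo:23.06 && echo "image is available locally"`.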