Utilizing Python on Keeling
If you are starting out, it may be easiest to use Conda. Conda is an open-source package and environment management system. It installs, runs, and updates packages and their dependencies. It also creates, saves, loads, and switches between environments, so that different environments can be used without breaking one another.
To acquire Conda:
wget https://repo.anaconda.com/archive/Anaconda3-2023.07-0-Linux-x86_64.sh
Note
To get the latest version of Conda for Linux-x86_64, find the newest release in the archive at https://repo.anaconda.com/archive/.
For a lightweight experience, you can use Miniconda, which includes fewer initial packages:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
Run the script to install it:
bash Anaconda3-2023.07-0-Linux-x86_64.sh
or
bash Miniconda3-latest-Linux-x86_64.sh
Accept the policy by responding yes when prompted.
Specify where you want to install it. The default path will typically put it in your $HOME directory under the directory anaconda3. After this, packages will be downloaded and installed.
Activate your conda installation. You may have to source .bashrc to get it to work initially.
You will then be in the base environment.
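For unattended installs, the interactive prompts described above can be skipped. The following is a minimal sketch, assuming the Miniconda installer's batch flags (-b to run without prompts, which implies agreeing to the license terms, and -p to set the install prefix). The function name install_miniconda and the default prefix $HOME/miniconda3 are illustrative choices, not part of the Keeling setup.

```shell
# Sketch: download and install Miniconda without interactive prompts.
# install_miniconda and the default prefix are illustrative assumptions.
install_miniconda() {
    prefix="${1:-$HOME/miniconda3}"
    installer="Miniconda3-latest-Linux-x86_64.sh"

    # Fetch the Miniconda installer used earlier in this section.
    wget "https://repo.anaconda.com/miniconda/$installer" -O "$installer" || return 1

    # -b: batch mode (no prompts; implies agreeing to the license terms)
    # -p: installation prefix
    bash "$installer" -b -p "$prefix" || return 1

    # Batch mode does not touch ~/.bashrc, so register conda's shell hook explicitly.
    "$prefix/bin/conda" init bash
}
```

Calling install_miniconda (optionally with a different prefix as its first argument) and then running source ~/.bashrc should leave you in the base environment.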
Managing environments
With conda, you can create, export, list, remove, and update environments that have different versions of Python and/or packages installed in them. Isolating different purposes to different environments ensures that changes do not break previously existing code due to dependencies.
To create a new environment:
conda create --name myenv
where myenv is the name you wish to use for the environment.
Switching or moving between environments is called activating the environment. To activate this new environment:
conda activate myenv
For more information about environments see
https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html
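The create, list, export, update, and remove operations mentioned above can be sketched as a single sequence. This is an illustration rather than a recommended workflow: the function name demo_env_lifecycle, the environment name demo-env, and the pinned Python version 3.11 are all hypothetical choices.

```shell
# Sketch: walk one environment through create, list, export, update, remove.
# demo_env_lifecycle, demo-env, and python=3.11 are illustrative assumptions.
demo_env_lifecycle() {
    env_name="${1:-demo-env}"

    # Create an environment with a specific Python version.
    conda create --yes --name "$env_name" python=3.11 || return 1

    # List all environments; the active one is marked with an asterisk.
    conda env list

    # Export the environment specification so it can be recreated elsewhere.
    conda env export --name "$env_name" > "$env_name.yml" || return 1

    # After editing the YAML file, bring the environment in line with it.
    conda env update --name "$env_name" --file "$env_name.yml" || return 1

    # Remove the environment once it is no longer needed.
    conda env remove --yes --name "$env_name"
}
```

The exported YAML file can also be used on another machine with conda env create --file demo-env.yml.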
Utilizing Jupyter notebooks on Keeling
Start an interactive session that meets your compute requirements. For example:
qlogin -p node -n 4 --mem-per-cpu=16G -time 24:00:00
would request 4 cores with 16 GB of memory each for 24 hours.
Hint
If you always use the same parameters in your command, consider setting up an alias in your .bashrc such as the following:
alias qlognb="qlogin -p node -n 4 --mem-per-cpu=16G -time 24:00:00"
which reduces issuing a compute job for your notebooks to simply
qlognb
Note the node that your job is on, as that is important. It should be presented to you at job startup with the displayed information regarding your job request (under “connecting to node”), or it can be obtained at any point by typing:
hostname
Start a Jupyter notebook:
jupyter notebook --port=XXXX --no-browser --ip=127.0.0.1
or, if you prefer to use JupyterLab (https://jupyterlab.readthedocs.io/en/stable/):
jupyter-lab --port=XXXX --no-browser --ip=127.0.0.1
where XXXX is a port selected by you. It is important that you select and use a port unique to yourself and not a port that will conflict with other users.
Hint
Similar to before, this command may be shortened as an alias if you find yourself using the same parameters. Example:
alias nb="jupyter-lab --port=XXXX --no-browser --ip=127.0.0.1"
which would then be invoked simply by
nb
Using a terminal, open a second ssh session to Keeling with the following command to access the compute node that is running your notebook server:
ssh -L XXXX:127.0.0.1:XXXX netID@keeling.earth.illinois.edu ssh -L XXXX:127.0.0.1:XXXX hostname
where hostname is the Keeling compute node (e.g., keeling-d01, keeling-g20), XXXX is your unique port, and netID is your netID.
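One way to choose a port that is unlikely to collide with other users, and to assemble the two-hop tunnel command above, is sketched below. The helpers pick_port and tunnel_cmd are hypothetical. Deriving the port from the numeric user ID is only a heuristic that assumes user IDs are distinct on the system, and it does not guarantee that the port is free.

```shell
# Sketch: derive a per-user port and build the two-hop ssh tunnel command.
# pick_port and tunnel_cmd are illustrative helpers, not Keeling tools.

# Map a numeric user ID (default: the current user's) into 10000-59999.
pick_port() {
    uid="${1:-$(id -u)}"
    echo $(( 10000 + uid % 50000 ))
}

# Print the tunnel command for a given netID, compute node, and port.
# $1 = netID, $2 = compute node hostname, $3 = port
tunnel_cmd() {
    echo "ssh -L $3:127.0.0.1:$3 $1@keeling.earth.illinois.edu ssh -L $3:127.0.0.1:$3 $2"
}
```

For example, tunnel_cmd netID keeling-d01 "$(pick_port)" prints a command of the same shape as the one shown above. The port must match the one passed to --port when the notebook server was started, so pick_port should be evaluated on Keeling, where the user ID is the relevant one.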