Utilizing Python on Keeling

If you are starting out, Conda is likely the easiest route. Conda is an open-source package and environment management system. It installs, runs, and updates packages together with their dependencies, and it creates, saves, loads, and switches between environments so that changes in one environment do not break another.

  1. To acquire Conda:

    wget https://repo.anaconda.com/archive/Anaconda3-2023.07-0-Linux-x86_64.sh
    

    Note

    To get the latest Anaconda installer for Linux-x86_64, find the newest release in the installer archive that hosts the file above (https://repo.anaconda.com/archive/).

    For a lighter-weight install, you can use Miniconda, which ships with fewer preinstalled packages:

    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
    
  2. Run the installer script:

    bash Anaconda3-2023.07-0-Linux-x86_64.sh
    

    or

    bash Miniconda3-latest-Linux-x86_64.sh
    
  3. Accept the license terms by responding yes when prompted.

  4. Specify where you want to install it. The default path typically places it in the anaconda3 directory under $HOME (miniconda3 if you chose Miniconda). After this, packages will be downloaded and installed.

  5. Activate your conda installation. If the conda command is not found at first, source your .bashrc (or open a new shell) so that any changes the installer made to it take effect.

  6. You will then be in the base environment.
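
A quick way to confirm that steps 5 and 6 worked is to ask the shell whether it can find conda after you have sourced your .bashrc. This is a minimal sketch, not an official check; the variable name conda_status is only illustrative.

```shell
# Check whether the conda command is reachable from this shell and report it.
# Run this after sourcing ~/.bashrc (or in a freshly opened shell).
if command -v conda >/dev/null 2>&1; then
    conda_status="found ($(conda --version 2>&1))"
else
    conda_status="not found (check that the install directory is on your PATH)"
fi
echo "conda: $conda_status"
```

If the result is "not found", revisit the install location you chose in step 4.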

Managing environments

With conda, you can create, export, list, remove, and update environments that have different versions of Python and/or packages installed in them. Keeping separate projects in separate environments means that a dependency change made for one project cannot break existing code in another.

To create a new environment:

conda create --name myenv

where myenv is the name you wish to use for the environment.

Switching or moving between environments is called activating the environment. To activate this new environment:

conda activate myenv
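
The other operations mentioned above (list, export, update, remove) follow the same pattern. The block below is a sketch rather than a prescribed workflow: the environment name demo-env, the Python version 3.11, the numpy package, and the file name environment.yml are all illustrative assumptions. The guard at the top only keeps the block from failing if conda is not yet on your PATH.

```shell
env_name="demo-env"         # hypothetical throwaway environment name
env_file="environment.yml"  # hypothetical file name for the exported specification

if command -v conda >/dev/null 2>&1; then
    # Create an environment with a pinned Python version and one starter package.
    conda create --yes --name "$env_name" python=3.11 numpy

    # List every environment conda knows about.
    conda env list

    # Export the environment's package list so it can be recreated elsewhere.
    conda env export --name "$env_name" > "$env_file"

    # Bring the environment in line with the file, dropping packages no longer listed.
    conda env update --name "$env_name" --file "$env_file" --prune

    # Remove the throwaway environment and everything installed in it.
    conda remove --yes --name "$env_name" --all
else
    echo "conda not found; complete the installation steps first"
fi
```

An exported file can later be turned back into an environment with conda env create --file environment.yml.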

For more information about environments, see:

https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html

Utilizing Jupyter notebooks on Keeling

  1. Start an interactive session that meets your compute requirements. For example:

    qlogin -p node -n 4 --mem-per-cpu=16G -time 24:00:00
    

    would request 4 cores with 16 GB of memory each for 24 hours.

    Hint

    If you always use the same parameters in your command, consider setting up an alias in your .bashrc, such as the following:

    alias qlognb="qlogin -p node -n 4 --mem-per-cpu=16G -time 24:00:00"
    

    which shortens requesting a compute job for your notebooks to simply

    qlognb
    
  2. Note which node your job is running on; you will need it later. The node name is shown when the job starts, along with the other information about your job request (under “connecting to node”), and can be retrieved at any point by typing:

    hostname
    
  3. Start a Jupyter notebook server:

    jupyter notebook --port=XXXX --no-browser --ip=127.0.0.1
    

    or, if you prefer JupyterLab (https://jupyterlab.readthedocs.io/en/stable/):

    jupyter-lab --port=XXXX --no-browser --ip=127.0.0.1
    

    where XXXX is a port number that you choose. Pick a port that is unique to you so that it does not conflict with the ports other users have chosen.

    Hint

    As before, this command can be shortened with an alias if you always use the same parameters. For example:

    alias nb="jupyter-lab --port=XXXX --no-browser --ip=127.0.0.1"
    

    which is then invoked simply by

    nb
    
  4. In a terminal, open a second SSH session to Keeling with the following command, which forwards your port through the login node to the compute node that is running your notebook server:

    ssh -L XXXX:127.0.0.1:XXXX netID@keeling.earth.illinois.edu ssh -L XXXX:127.0.0.1:XXXX hostname
    

    where hostname is the Keeling compute node (e.g., keeling-d01 or keeling-g20), XXXX is your unique port, and netID is your netID.
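
The port choice in step 3 and the tunnel command in step 4 can be sketched as a pair of small shell helpers. This is an illustration under stated assumptions, not part of Keeling's tooling: the function names pick_port and tunnel_cmd are hypothetical, deriving the port from your numeric user ID is just one way to get a number that is unlikely to match another user's, and the range 10000-59999 is an arbitrary choice.

```shell
# Derive a port from the numeric user ID so that different users get different numbers.
# Assumption: ports 10000-59999 are acceptable. This is only a heuristic and does
# not guarantee that the port is actually free.
pick_port() {
    echo $(( 10000 + $(id -u) % 50000 ))
}

# Print the two-hop ssh command from step 4.
# Arguments: $1 = netID, $2 = compute node name, $3 = port.
tunnel_cmd() {
    echo "ssh -L ${3}:127.0.0.1:${3} ${1}@keeling.earth.illinois.edu ssh -L ${3}:127.0.0.1:${3} ${2}"
}

port=$(pick_port)
# netID and keeling-d01 are placeholders; substitute your own netID and node name.
echo "On the compute node:  jupyter-lab --port=${port} --no-browser --ip=127.0.0.1"
echo "In a second terminal: $(tunnel_cmd netID keeling-d01 "$port")"
```

Use the printed port in place of XXXX in steps 3 and 4, and replace the placeholder node name with the one reported by hostname in step 2.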