How to Install Python Packages in the Linux Enclave Virtual Desktop

Purpose:

This document offers basic guidance on how to install Python packages when using the Linux Enclave Virtual Desktop.

Introduction:

To minimize the challenges you may face when installing Python packages, this guide walks through the installation process step by step.

 

The example shown below uses Anaconda to install the packages. Note that in April 2020, Anaconda underwent a licensing change and introduced a commercial license. Consequently, use of Anaconda's offerings at Mass General Brigham requires a license. More information on the license requirement is available here. To purchase the Anaconda licensed subscription, please visit: https://rc.partners.org/research-apps-and-services/academic-software.

 

Prerequisites:

Research the packages you need and the operating system (OS) required to use them.

Before you go any further, make sure you have verified the following information:


1. The specific Python packages and versions you want to install.

For example: tensorflow=2.3.0, pytorch=1.1.0, gensim, corextopic, wordcloud

2. The operating system you are using: are you logged on to the Windows 10 Enclave Virtual Desktop or the Linux Enclave Virtual Desktop?

3. The software you are using: Anaconda, Jupyter Notebook, Spyder, or another tool.

4. In the Linux Enclave Virtual Desktop, you will install the Python packages in an Anaconda environment and use them in Spyder or Jupyter Notebook.

 

Requirements for Installing Packages:

This section describes the steps to follow before installing other Python
packages.

1. Ensure you can run Python from the command line
Before you go any further, make sure you have Python and that the expected version is available from your command line. You can check this by running the command below:

python3 --version

You should get output such as Python 3.7.6 (in this example).

2. Ensure you can run pip from the command line. You can check this by running the command below:

python3 -m pip --version

3. Ensure pip, setuptools, and wheel are up to date

While pip alone is sufficient to install from pre-built binary archives, up-to-date copies of the setuptools and wheel projects are useful to ensure you can also install from source archives:

python3 -m pip install --upgrade pip setuptools wheel
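
The checks in steps 1 and 2 can also be scripted from inside Python. The following is a minimal sketch rather than part of the documented procedure; the function names python_version and pip_version are hypothetical helpers introduced only for illustration.

```python
import subprocess
import sys


def python_version():
    """Return the running interpreter's version as a string such as '3.7.6'."""
    return ".".join(str(part) for part in sys.version_info[:3])


def pip_version():
    """Return the output of `python -m pip --version`, or None if pip is unavailable."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    print("Python:", python_version())
    print("pip:", pip_version() or "not available")
```

Using sys.executable ensures that pip is queried for the same interpreter that runs the script, which mirrors the "python3 -m pip" form used above.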

 

Step-by-Step Instructions:

Creating Virtual Environments

Python “Virtual Environments” allow Python packages to be installed in an isolated location for a particular application, rather than being installed globally.

If you are looking to safely install global command line tools, see Installing stand alone command line tools.

Scenario I

Imagine you have an application that needs version 1 of LibFoo, but another application requires version 2. How can you use both these applications?

If you install everything into /usr/lib/python3.6/site-packages (or whatever your platform’s standard location is), it’s easy to end up in a situation where you unintentionally upgrade an application that shouldn’t be upgraded.

Scenario II

What if you want to install an application and leave it be? If an application works, any change in its libraries or the versions of those libraries can break the application.

Scenario III

What if you can’t install packages into the global site-packages directory? For instance, on a shared host.

In all these scenarios, virtual environments can help you. They have their own installation directories, and they do not share libraries with other virtual environments.

 

A common tool for creating Python virtual environments is venv. venv is available by default in Python 3.3 and later, and installs pip and setuptools into created virtual environments in Python 3.4 and later.

The basic usage is as follows:
Using venv: python3 -m venv <DIR>
For example: python3 -m venv my_env

When run from the user's home directory, the example above creates the my_env directory structure there.

Activate the virtual environment, my_env, by sourcing its activate script:
source ~/my_env/bin/activate
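
The same environment can be created from Python with the standard-library venv module. This is a minimal sketch under stated assumptions: create_env and in_virtual_env are hypothetical helper names, and the check in in_virtual_env relies on sys.prefix differing from sys.base_prefix inside an environment created by venv.

```python
import sys
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir, similar to `python3 -m venv <DIR>`."""
    venv.create(str(env_dir), with_pip=with_pip)
    return Path(env_dir)


def in_virtual_env():
    """Return True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix
```

For example, create_env(Path.home() / "my_env") would correspond to running "python3 -m venv my_env" from the home directory. Activation itself remains a shell operation (the source command above), so it is not reproduced here.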

Installing packages using Conda

Efficient Way to Activate Conda in VS Code
https://medium.com/analytics-vidhya/efficient-way-to-activate-conda-in-vscode-ef21c4c231f2
In VS Code, you can choose to run a Python file in the debugger or in the terminal.

Install Conda

You can install Conda in several different ways:

1. Install Conda as a package using pip:
pip install conda

2. Install Conda as a standalone application
Conda is a powerful package manager and environment manager that you use with command line commands at the Anaconda Prompt for Windows, or in a terminal window for Linux.

Before you proceed, ensure that you have procured the Anaconda license.

Anaconda Installation

For x86 systems

1. In your browser, download the Anaconda installer for Linux.

2. RECOMMENDED: Verify data integrity with SHA-256. For more information on hashes, see What about cryptographic hash verification?

Open a terminal and run the following: sha256sum /path/filename

3. Enter the following to install Anaconda for Python 3.7:
bash ~/Downloads/Anaconda3-2020.02-Linux-x86_64.sh
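
The integrity check in step 2 can also be done in Python with the standard-library hashlib module. This is a minimal sketch; sha256_of_file and matches_published_hash are hypothetical helper names, and the expected hash must be taken from the installer's download page (it is not reproduced here).

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large installers are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's digest with the hash published for the installer."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

The value returned by sha256_of_file should equal the first field printed by "sha256sum /path/filename" for the same file.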

Conda setup in the Linux Enclave Virtual Desktop

Follow these three steps:

1. Copy the global config file shown below to your home directory.

The global config (usually in /home/<username>/Anaconda3/.condarc) overrides the user settings in .condarc, which is typically found in the user’s home directory at ~/.condarc, similar to Rprofile.site in RStudio.

As long as this .condarc is in place in your home directory, all packages should be resolvable via “conda install <package>” or “pip install <package>” from within a Conda virtual environment.

The global config file looks like this:

# No SSL verification
ssl_verify: false
# Auto-activate the base environment (set to false to disable)
auto_activate_base: true
# Display what is going to be downloaded
show_channel_urls: true
# Channels
channels:
- http://repo.analyticsenclave.org:8082/artifactory/api/conda/
- http://repo.analyticsenclave.org:8082/artifactory/api/conda/conda-forge-remote
default_channels:
- http://repo.analyticsenclave.org:8082/artifactory/api/conda/
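
Copying the config file into place can be scripted. The sketch below is illustrative only: install_condarc is a hypothetical helper, and the source path passed to it is whatever location you received the config file at. It keeps a backup of any existing ~/.condarc before overwriting it.

```python
import shutil
from pathlib import Path


def install_condarc(source, home=None):
    """Copy a .condarc file into the home directory, backing up any existing one."""
    home = Path(home) if home is not None else Path.home()
    target = home / ".condarc"
    if target.exists():
        # Preserve the previous settings as .condarc.bak before replacing them.
        shutil.copy2(str(target), str(target) + ".bak")
    shutil.copyfile(str(source), str(target))
    return target
```

Keeping the backup makes it easy to restore earlier user settings if the new configuration does not behave as expected.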

2. Create a virtual environment and install packages.
For example: conda create -n my_env python=3.8.12 tensorflow pytorch gensim -y

 

 

3. If an older version of a package is not found (e.g., pytorch=1.1.0), two options are available:

Install the latest version of the package

Compile the specific version needed

Install packages using the "conda install <package>" command

We have added the conda-forge channel for resolution of packages that aren’t on the main Anaconda channel or on PyPI.

They can be resolved using the "conda install <package>" command.
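
The first option in step 3 (falling back to the latest version when a pinned one is not found) can be sketched as follows. This is an illustration, not the Enclave's official procedure: the helper names are hypothetical, and it assumes conda exits with a non-zero status when it cannot resolve the requested version.

```python
import subprocess


def conda_install_commands(package, version=None):
    """Return the conda commands to try, most specific first."""
    commands = []
    if version:
        commands.append(["conda", "install", "-y", f"{package}={version}"])
    commands.append(["conda", "install", "-y", package])
    return commands


def install_with_fallback(package, version=None, run=subprocess.call):
    """Try the pinned version first, then the latest; return the command that succeeded."""
    for command in conda_install_commands(package, version):
        if run(command) == 0:  # assumption: exit status 0 means the install succeeded
            return command
    return None
```

Calling install_with_fallback("pytorch", "1.1.0") inside an activated environment would first attempt pytorch=1.1.0 and then plain pytorch. The run parameter exists so the logic can be exercised without invoking conda.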

 

Use pip for Installing

pip is the recommended installer. Below, we’ll cover the most common usage scenarios. For more detail, see the pip docs, which include a complete Reference Guide.

 

Installing from PyPI

The most common usage of pip is to install from the Python Package Index using a requirement specifier. Generally speaking, a requirement specifier is composed of a project name followed by an optional version specifier. PEP 440 contains a full specification of the currently supported specifiers. Below are some examples.

To install the latest version of “SomeProject”:
python3 -m pip install "SomeProject"
For example: python3 -m pip install Tensorflow

 

To install a specific version of “SomeProject”:
python3 -m pip install "SomeProject==1.4"
For example: python3 -m pip install Tensorflow==2.3.0
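
After installing, you may want to confirm which version actually ended up in the environment. The sketch below uses the standard-library importlib.metadata module, which requires Python 3.8 or later; installed_version and is_pinned_version_installed are hypothetical helper names.

```python
from importlib import metadata  # available in Python 3.8 and later


def installed_version(project):
    """Return the installed version of a project, or None if it is not installed."""
    try:
        return metadata.version(project)
    except metadata.PackageNotFoundError:
        return None


def is_pinned_version_installed(project, expected):
    """Check that the installed version string equals the one requested with `==`.

    This is a plain string comparison and does not apply PEP 440 normalization,
    so for example 1.4 and 1.4.0 are treated as different here.
    """
    return installed_version(project) == expected
```

For example, after "python3 -m pip install Tensorflow==2.3.0" succeeds, is_pinned_version_installed("tensorflow", "2.3.0") would be expected to return True.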

Source Distributions vs Wheels

pip can install from either source distributions or wheels, but if both are present on PyPI, pip will prefer a compatible wheel. You can override pip's default behavior, for example by using its --no-binary option.

Wheels are a pre-built distribution format that provides faster installation compared to Source Distributions (sdist), especially when a project contains compiled extensions.

If pip does not find a wheel to install, it will locally build a wheel and cache it for future installs, instead of rebuilding the source distribution in the future.

 

Requirements files

“Requirements files” are files containing a list of items to be installed using pip install. Details on the format of the files are here: Requirements File Format.

Logically, a Requirements file is just a list of pip install arguments placed in a file. Note that you should not rely on the items in the file being installed by pip in any particular order.


Install a list of requirements specified in a Requirements file:

python3 -m pip install -r requirements.txt

In practice, here is a common use of Requirements files:

Requirements files are used to hold the result from pip freeze for the purpose of achieving Repeatable Installs.

In this case, your requirement file contains a pinned version of everything that was installed when pip freeze was run.

python3 -m pip freeze > requirements.txt

python3 -m pip install -r requirements.txt
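
To inspect what a frozen requirements file pins, the lines can be read back in Python. This is a minimal sketch that handles only the simple name==version lines produced by pip freeze; parse_pins is a hypothetical helper, and entries such as URLs or editable installs are not handled.

```python
def parse_pins(requirements_text):
    """Map project names to pinned versions for lines of the form name==version."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" not in line:
            continue  # skip blanks, options, and unpinned entries
        name, version = line.split("==", 1)
        version = version.split(";", 1)[0].strip()  # drop any environment marker
        pins[name.strip().lower()] = version
    return pins
```

For example, parse_pins(open("requirements.txt").read()) returns a dictionary that can be compared against another environment's pins to spot version differences.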

Relevant References

The following references may be useful to end users:

 

Go to KB0040433 in the IS Service Desk

Related articles

Analytics Enclave

ERISOne Cluster Applications

IDEA Platform