September 23, 2024
Purpose:
This document offers basic guidance on how to install Python packages when using the Linux Enclave Virtual Desktop.
Introduction:
To minimize the challenges you may face when installing Python packages, this guide outlines the installation process step by step.
The example shown below uses Anaconda to install the packages. Note that in April 2020, Anaconda underwent a licensing change and introduced a commercial license. Consequently, use of Anaconda's offerings at Mass General Brigham requires a license. More information on the license requirement is available here. To purchase the Anaconda licensed subscription, please visit: https://rc.partners.org/research-apps-and-services/academic-software.
Prerequisites:
Research the packages you need and the operating system (OS) required to use them.
Before you go any further, make sure you have verified the following information:
1. The specific Python packages, and their versions, that you want to install.
For example: tensorflow=2.3.0, pytorch=1.1.0, gensim, corextopic, wordcloud
2. The operating system you are using: Are you logged in to the Windows 10 Enclave Virtual Desktop or the Linux Enclave Virtual Desktop?
3. The software you are using: Anaconda, Jupyter Notebook, Spyder, or another application.
4. In the Linux Enclave Virtual Desktop, you will install the Python packages in an Anaconda environment and use them in Spyder or Jupyter Notebook.
Requirements for Installing Packages:
This section describes the steps to follow before installing other Python packages.
1. Ensure you can run Python from the command line
Before you go any further, make sure you have Python and that the expected version is available from your command line. You can check this by running the command below:
python3 --version
You should get output such as Python 3.7.6 (in this example).
2. Ensure you can run pip from the command line. You can check this by running the command below:
python3 -m pip --version
3. Ensure pip, setuptools, and wheel are up to date
While pip alone is sufficient to install from pre-built binary archives, up to date copies of the setuptools and wheel projects are useful to ensure you can also install from source archives:
python3 -m pip install --upgrade pip setuptools wheel
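The three checks above can also be gathered into one short script. This is a minimal sketch, not part of the official procedure: the function names `installed_version` and `tooling_report` and the `min_python` default are illustrative assumptions, and the script assumes Python 3.8 or later, where `importlib.metadata` is part of the standard library.

```python
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def tooling_report(min_python=(3, 7)):
    """Summarize the interpreter version and the packaging tools used in this guide."""
    report = {
        "python": ".".join(str(n) for n in sys.version_info[:3]),
        # min_python is an illustrative threshold, not a documented requirement.
        "python_ok": sys.version_info[:2] >= min_python,
    }
    for tool in ("pip", "setuptools", "wheel"):
        report[tool] = installed_version(tool)  # None means "not installed"
    return report


for name, value in tooling_report().items():
    print(f"{name}: {value}")
```

A value of None for pip, setuptools, or wheel suggests that the upgrade command in step 3 still needs to be run in that environment.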
Step-by-Step Instructions:
Creating Virtual Environments
Python “Virtual Environments” allow Python packages to be installed in an isolated location for a particular application, rather than being installed globally.
If you are looking to safely install global command line tools, see Installing stand alone command line tools.
Scenario I
Imagine you have an application that needs version 1 of LibFoo, but another application requires version 2. How can you use both these applications?
If you install everything into /usr/lib/python3.6/site-packages (or whatever your platform’s standard location is), it’s easy to end up in a situation where you unintentionally upgrade an application that shouldn’t be upgraded.
Scenario II
What if you want to install an application and leave it be? If an application works, any change in its libraries or the versions of those libraries can break the application.
Scenario III
What if you can’t install packages into the global site-packages directory? For instance, on a shared host.
In all these scenarios, virtual environments can help you. They have their own installation directories, and they do not share libraries with other virtual environments.
A common tool for creating Python virtual environments is venv. venv is available by default in Python 3.3 and later, and installs pip and setuptools into created virtual environments in Python 3.4 and later.
The basic usage is as follows:
Using venv: python3 -m venv <DIR>
For example: python3 -m venv my_env
When the command above is run from the user's home directory, venv creates the my_env directory structure there.
Activate the virtual environment, my_env, by sourcing its activate script: source ~/my_env/bin/activate
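The same environment can be created from Python with the standard-library venv module. The sketch below is illustrative only: `create_env` is a hypothetical helper name, and the demo writes to a temporary directory rather than to the home directory.

```python
import os
import tempfile
import venv


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir, similar to `python3 -m venv env_dir`."""
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    # pyvenv.cfg is written at the root of every environment that venv creates.
    return os.path.join(env_dir, "pyvenv.cfg")


# Demo in a throwaway directory; with_pip=False skips bootstrapping pip,
# which keeps the demo quick.
with tempfile.TemporaryDirectory() as tmp:
    cfg = create_env(os.path.join(tmp, "my_env"), with_pip=False)
    print(os.path.exists(cfg))
```

Activation still happens in the shell, because sourcing the activate script changes the environment of the current terminal session.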
Installing packages using Conda
Efficient Way to Activate Conda in VS Code
https://medium.com/analytics-vidhya/efficient-way-to-activate-conda-in-vscode-ef21c4c231f2
In VS Code, you can choose to run a Python file in the debugger or in the terminal.
Install Conda
You can install Conda in several different ways:
1. Install Conda as a package using pip: pip install conda
2. Install Conda as a standalone application
Conda is a powerful package manager and environment manager that you use with command line commands at the Anaconda Prompt for Windows, or in a terminal window for Linux.
Before you proceed, ensure that you have procured the Anaconda license.
Anaconda Installation
For x86 systems
1. In your browser, download the Anaconda installer for Linux.
2. RECOMMENDED: Verify data integrity with SHA-256. For more information on hashes, see What about cryptographic hash verification?
Open a terminal and run the following: sha256sum /path/filename
3. Enter the following to install Anaconda for Python 3.7:
bash ~/Downloads/Anaconda3-2020.02-Linux-x86_64.sh
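For step 2, the digest that sha256sum prints can also be computed and compared in Python. This is a minimal sketch using hashlib; the function names and the chunk size are illustrative choices, and the published hash must come from the official download page.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a large installer is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, published_hex):
    """Compare the computed digest with the value published for the installer."""
    return sha256_of_file(path) == published_hex.strip().lower()
```

If `matches_published_hash` returns False for the downloaded installer, the safest course is to download the file again before running it.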
Conda setup in the Linux Enclave Virtual Desktop
Please follow these three steps:
1. Copy the global config file described below to your home directory.
This global configuration (usually in /home/<username>/Anaconda3/.condarc) overrides the user settings in .condarc, which is typically found in the user's home directory at ~/.condarc, similar to Rprofile.site in RStudio.
As long as this .condarc is in place in your home directory, all packages should be resolvable via "conda install <package>" or "pip install <package>" from within a virtual environment in Conda.
The global config file looks like this:
# No SSL verification
ssl_verify: false
# Do not auto activate base
auto_activate_base: true
# Display what is going to be downloaded
show_channel_urls: true
# Channels
channels:
  - http://repo.analyticsenclave.org:8082/artifactory/api/conda/
  - http://repo.analyticsenclave.org:8082/artifactory/api/conda/conda-forge-remote
default_channels:
  - http://repo.analyticsenclave.org:8082/artifactory/api/conda/
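Before creating environments, it can help to confirm that the copied file is in place and carries the expected settings. The sketch below is only a rough check under stated assumptions: `missing_condarc_keys` is a hypothetical helper, the list of required keys is an illustrative choice, and the file is scanned as plain text rather than parsed as YAML.

```python
from pathlib import Path


def missing_condarc_keys(condarc_path,
                         required=("channels", "default_channels", "ssl_verify")):
    """Return the required top-level keys that do not appear in the file."""
    path = Path(condarc_path)
    if not path.is_file():
        return list(required)
    present = set()
    for line in path.read_text().splitlines():
        # Treat any unindented, non-comment line containing a colon as a top-level key.
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line:
            present.add(line.split(":", 1)[0].strip())
    return [key for key in required if key not in present]
```

For example, calling `missing_condarc_keys(Path.home() / ".condarc")` returns an empty list when all three keys are present in the file in your home directory.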
2. Create a virtual environment and install packages
For example: conda create -n my_env python=3.8.12 tensorflow pytorch gensim -y
3. If a specific older version of a package is not found (e.g., pytorch=1.1.0), two options are available:
Install the latest version of the package
Compile the specific version needed
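The first option in step 3 can be sketched as a small retry: try the pinned version, and if conda reports failure, try again without the pin. This is an illustration under assumptions, not a supported tool: `run_conda` and `install_with_fallback` are hypothetical names, conda is assumed to be on the PATH, and any non-zero exit status is treated as "version not found".

```python
import subprocess


def run_conda(args):
    """Run conda with the given arguments; True means the exit status was 0."""
    return subprocess.run(["conda", *args]).returncode == 0


def install_with_fallback(env_name, spec, runner=run_conda):
    """Try the pinned spec first, then fall back to the latest available version."""
    if runner(["install", "-n", env_name, "-y", spec]):
        return spec
    # Drop the version pin, e.g. "pytorch=1.1.0" becomes "pytorch".
    name = spec.split("=", 1)[0]
    if name != spec and runner(["install", "-n", env_name, "-y", name]):
        return name
    return None  # neither the pinned nor the unpinned install succeeded
```

The function returns the spec that was actually installed, so a caller can tell whether the pin was honored. The `runner` parameter exists so that the logic can be exercised without invoking conda.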
Install packages using the "conda install <package>" command
We have added the conda-forge channel to resolve packages that are on neither the main Anaconda channel nor PyPI.
They can be resolved using the "conda install <package>" command.
Use pip for Installing
pip is the recommended installer. Below, we’ll cover the most common usage scenarios. For more detail, see the pip docs, which include a complete Reference Guide.
Installing from PyPI
The most common usage of pip is to install from the Python Package Index using a requirement specifier. Generally speaking, a requirement specifier is composed of a project name followed by an optional version specifier. PEP 440 contains a full specification of the currently supported specifiers. Below are some examples.
To install the latest version of “SomeProject”:
python3 -m pip install "SomeProject"
For example: python3 -m pip install Tensorflow
To install a specific version of “SomeProject”:
python3 -m pip install "SomeProject==1.4"
For example: python3 -m pip install Tensorflow==2.3.0
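When installs are scripted, the same requirement specifiers can be assembled in Python. The sketch below only builds the command; `pip_install_command` is a hypothetical helper name, and the package and version are the ones used in the example above. Note that pip pins with a double equals sign.

```python
import shlex
import sys


def pip_install_command(project, version=None):
    """Build the argument list for `python -m pip install`, pinning if a version is given."""
    requirement = f"{project}=={version}" if version else project
    # sys.executable targets the interpreter (and environment) running this script.
    return [sys.executable, "-m", "pip", "install", requirement]


print(shlex.join(pip_install_command("tensorflow", "2.3.0")))
```

The returned list can be passed to `subprocess.run(..., check=True)` to perform the install inside the currently active environment.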
Source Distributions vs Wheels
pip can install from either source distributions or wheels, but if both are present on PyPI, pip will prefer a compatible wheel. You can override pip's default behavior, for example by using its --no-binary option.
Wheels are a pre-built distribution format that provides faster installation compared to Source Distributions (sdist), especially when a project contains compiled extensions.
If pip does not find a wheel to install, it will locally build a wheel and cache it for future installs, instead of rebuilding the source distribution in the future.
Requirements files
“Requirements files” are files containing a list of items to be installed using pip install. Details on the format of the files are here: Requirements File Format.
Logically, a Requirements file is just a list of pip install arguments placed in a file. Note that you should not rely on the items in the file being installed by pip in any particular order.
Install a list of requirements specified in a Requirements File.
python3 -m pip install -r requirements.txt
In practice, here is a common use of Requirements files:
Requirements files are used to hold the result from pip freeze for the purpose of achieving Repeatable Installs.
In this case, your requirement file contains a pinned version of everything that was installed when pip freeze was run.
python3 -m pip freeze > requirements.txt
python3 -m pip install -r requirements.txt
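To see what a repeatable install has actually pinned, two freeze outputs can be compared. This is a minimal sketch with hypothetical helper names, and it assumes plain `name==version` lines as written by pip freeze; lines with options, URLs, or environment markers are not handled.

```python
def parse_pinned(text):
    """Map project name to pinned version for `name==version` lines."""
    pins = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if "==" not in line:
            continue  # skip blank lines, options, and unpinned entries
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def changed_pins(old_text, new_text):
    """Return projects whose pinned version differs between two freeze outputs."""
    old, new = parse_pinned(old_text), parse_pinned(new_text)
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }
```

Each entry in the result maps a project to its (old, new) versions, with None marking a project that is missing on one side.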
Relevant References
The following references may be useful to end users: