Python Virtual Environments – Tutorial

Today, we’ll show you how to use virtual environments to create and manage separate environments. We’ll also take a look at how Python dependencies are stored and resolved.

Python Introduction

Why the Need for Virtual Environments?

Python, like most other modern programming languages, has its own unique way of downloading, storing, and resolving packages (or modules). While this has its advantages, there were some interesting decisions made about package storage and resolution, which has lead to some problems—particularly with how and where packages are stored.

There are a few different locations where these packages can be installed on your system. For example, most system packages are stored in a child directory of the path stored in sys.prefix.

>>> import sys
>>> sys.prefix
'/System/Library/Frameworks/Python.framework/Versions/3.5'

More relevant to the topic of this article, third party packages installed using easy_install or pip are typically placed in one of the directories pointed to by site.getsitepackages:

>>> import site
>>> site.getsitepackages()
[
  '/System/Library/Frameworks/Python.framework/Versions/3.5/Extras/lib/python',
  '/Library/Python/3.5/site-packages'
]

So, why do all of these little details matter?

It’s important to know this because, by default, every project on your system will use these same directories to store and retrieve site packages (third party libraries). At first glance, this may not seem like a big deal, and it isn’t really, for system packages (packages that are part of the standard Python library), but it does matter for site packages.

Consider the following scenario where you have two projects: ProjectA and ProjectB, both of which have a dependency on the same library, ProjectC. The problem becomes apparent when we start requiring different versions of ProjectC. Maybe ProjectA needs v1.0.0, while ProjectB requires the newer v2.0.0, for example.

This is a real problem for Python since it can’t differentiate between versions in the site-packages directory. So both v1.0.0 and v2.0.0 would reside in the same directory with the same name:

/System/Library/Frameworks/Python.framework/Versions/3.5/Extras/lib/python/ProjectC

Since projects are stored according to just their name, there is no differentiation between versions. Thus, both projects, ProjectA and ProjectB, would be required to use the same version, which is unacceptable in many cases.

What Is a Virtual Environment?

At its core, the main purpose of Python virtual environments is to create an isolated environment for Python projects. This means that each project can have its own dependencies, regardless of what dependencies every other project has.

In our little example above, we’d just need to create a separate virtual environment for both ProjectA and ProjectB, and we’d be good to go. Each environment, in turn, would be able to depend on whatever version of ProjectC they choose, independent of the other.

The great thing about this is that there are no limits to the number of environments you can have since they’re just directories containing a few scripts. Plus, they’re easily created using the virtualenv or pyenv command line tools.

Using Virtual Environments

To get started, if you’re not using Python 3, you’ll want to install the virtualenv tool with pip:

$ pip install virtualenv

If you are using Python 3, then you should already have the venv module from the standard library installed.

Start by making a new directory to work with:

$ mkdir python-virtual-environments && cd python-virtual-environments

Create a new virtual environment inside the directory:

# Python 2:
$ virtualenv env

# Python 3
$ python3 -m venv env

The Python 3 venv approach has the benefit of forcing you to choose a specific version of the Python 3 interpreter that should be used to create the virtual environment. This avoids any confusion as to which Python installation the new environment is based on.

From Python 3.3 to 3.4, the recommended way to create a virtual environment was to use the pyvenv command-line tool that also comes included with your Python 3 installation by default. But on 3.6 and above, python3 -m venv is the way to go.

In the above example, this command creates a directory called env, which contains a directory structure similar to this:

├── bin
│   ├── activate
│   ├── activate.csh
│   ├── activate.fish
│   ├── easy_install
│   ├── easy_install-3.5
│   ├── pip
│   ├── pip3
│   ├── pip3.5
│   ├── python -> python3.5
│   ├── python3 -> python3.5
│   └── python3.5 -> /Library/Frameworks/Python.framework/Versions/3.5/bin/python3.5
├── include
├── lib
│   └── python3.5
│       └── site-packages
└── pyvenv.cfg

Here’s what each folder contains:

  • bin: files that interact with the virtual environment
  • include: C headers that compile the Python packages
  • lib: a copy of the Python version along with a site-packages folder where each dependency is installed

Further, there are copies of, or symlinks to, a few different Python tools as well as to the Python executable themselves. These files are used to ensure that all Python code and commands are executed within the context of the current environment, which is how the isolation from the global environment is achieved. We’ll explain this in more detail in the next section.

More interesting are the activate scripts in the bin directory. These scripts are used to set up your shell to use the environment’s Python executable and its site-packages by default.

In order to use this environment’s packages/resources in isolation, you need to “activate” it. To do this, just run the following:

$ source env/bin/activate
(env) $

Notice how your prompt is now prefixed with the name of your environment (env, in our case). This is the indicator that env is currently active, which means the python executable will only use this environment’s packages and settings.

To show the package isolation in action, we can use the bcrypt module as an example. Let’s say we have bcrypt installed system-wide but not in our virtual environment.

Before we test this, we need to go back to the “system” context by executing deactivate:

(env) $ deactivate
$

Now your shell session is back to normal, and the python command refers to the global Python install. Remember to do this whenever you’re done using a specific virtual environment.

Now, install bcrypt and use it to hash a password:

$ pip -q install bcrypt
$ python -c "import bcrypt; print(bcrypt.hashpw('password'.encode('utf-8'), bcrypt.gensalt()))"
$2b$12$vWa/VSvxxyQ9d.WGgVTdrell515Ctux36LCga8nM5QTW0.4w8TXXi

Here’s what happens if we try the same command when the virtual environment is activated:

$ source env/bin/activate
(env) $ python -c "import bcrypt; print(bcrypt.hashpw('password'.encode('utf-8'), bcrypt.gensalt()))"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named 'bcrypt'

As you can see, the behavior of the python -c "import bcrypt..." command changes after the source env/bin/activate call.

In one instance, we have bcrypt available to us, and in the next we don’t. This is the kind of separation we’re looking to achieve with virtual environments, which is now easily achieved.

Conclusion

In this article, you learned about not only how Python dependencies are stored and resolved, but also how you can use different community tools to help get around various packaging and versioning problems.

As you can see, thanks to the huge Python community, there are quite a few tools at your disposal to help with these common problems. As you progress as a developer, be sure to take time to learn how to use these tools to your advantage. You may even find unintended uses for them or learn to apply similar concepts to other languages you use.

Leave a Comment

Show Buttons
Hide Buttons