In this post you will learn some best practices for coding in Python. Specifically, you will learn how to create separate virtual environments so that the dependencies of one project do not interfere with another. By the end you will:

  • Learn how to create virtual environments to isolate Python packages for each project
  • Learn how to save all of your project's dependencies to a file to use on another machine
  • Learn how to install from a requirements.txt file to reproduce a virtual environment and run code on a new machine

Prerequisites:

  • Basic working knowledge of how to write Python code. It does not need to be advanced, as long as you know how to write and run a module (script).

Reproducible Code

Many of the keys to writing good code in Python are also the keys to writing reproducible code. Why? Because

Code is meant for HUMANS, not machines

So writing good code means writing code that you or others can read, understand, and edit without much hassle. The better you get at that skill, the better coder you will become, and your code will also become more flexible and reproducible. In this post, we will focus on the last part, reproducibility. We will do this by making an environment that can be recreated so anyone can replicate your code on their machine.

Virtual Environments

Virtual environments are a mechanism for preventing dependency conflicts across your Python projects. They solve the problem that occurs when one project needs numpy 1.23.0 while another needs 0.97. A virtual environment is an isolated location where packages are installed for one specific project. That way, the packages installed for one project won’t affect any other project. There are several tools for implementing virtual environments:
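To make the idea of isolation concrete, here is a minimal sketch that asks the running interpreter whether it belongs to a virtual environment. The function name in_virtual_environment is made up for this post. The check relies on sys.prefix and sys.base_prefix, which differ inside environments created with venv; a conda environment is a full Python installation, so this particular check will not flag it.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment itself, while
    # sys.base_prefix points at the Python installation it was created from.
    # Outside a venv the two values are identical.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtual_environment())
```

Running this script with and without an activated environment is a quick way to confirm which Python you are actually using.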

Anaconda

For beginners and data scientists, I would say Anaconda (conda) environments are the easiest to use because they also let you specify a Python version. So I would recommend starting with those, and switching to one of the other tools once you have learned the basics or if a project needs it, because conda environments are quite heavy and can take up a lot of disk space.

After installing Anaconda (with a package manager, e.g., Homebrew: brew install --cask anaconda, or by downloading it from the website), you can create a conda env with the following command:

conda create -n NAME_OF_ENVIRONMENT python=VERSION_NUMBER

Then activate the environment with:

conda activate NAME_OF_ENVIRONMENT

Note: When active, you will see the name of the environment in parentheses before your terminal prompt (e.g., (my-env) ~).
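The prompt prefix is handy for humans, but a script can make the same check. The sketch below reads the CONDA_DEFAULT_ENV environment variable, which conda sets on activation (it holds the environment's name, or its path for environments created by path). The helper name active_conda_env is made up for this post.

```python
import os
from typing import Mapping, Optional


def active_conda_env(environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the active conda environment's name, or None if none is active."""
    # conda sets CONDA_DEFAULT_ENV when an environment is activated;
    # a missing or empty value is treated as "no active environment".
    name = environ.get("CONDA_DEFAULT_ENV")
    return name or None


if __name__ == "__main__":
    print("active conda environment:", active_conda_env())
```

Passing the environment in as a mapping keeps the function easy to test without touching your real shell.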

Now everything that you install with pip install or conda install goes into this environment, isolated from the rest of your Python environments, so you don’t need to worry about dependency incompatibilities.

Note: conda also allows you to install packages with conda install, but they are not always the most up to date, so I tend to stick with the pip versions unless it’s necessary.

To leave the environment just use:

conda deactivate

Venv

Using the built-in venv module is pretty common in GitHub projects, so I will show a basic example.

python3 -m venv NAME_OF_VIRTUAL_ENV # Creation
source ./NAME_OF_VIRTUAL_ENV/bin/activate # Activation Mac and linux
# NAME_OF_VIRTUAL_ENV\Scripts\activate # Activation on Windows
deactivate  # Deactivates the environment
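The creation step can also be done from Python itself, since venv is a standard-library module. This is only a sketch: the function name create_environment and the demo-env directory are made up for illustration, and it passes with_pip=False to stay fast, whereas the command-line form above installs pip into the environment by default.

```python
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir: Path) -> Path:
    """Create a virtual environment at env_dir and return the path to its config file."""
    # with_pip=False skips bootstrapping pip, which keeps this sketch quick;
    # use with_pip=True if you want to install packages into the environment.
    venv.create(env_dir, with_pip=False)
    # Every venv contains a pyvenv.cfg file recording the base interpreter.
    return env_dir / "pyvenv.cfg"


if __name__ == "__main__":
    # A temporary directory is used so the demo leaves nothing behind.
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_environment(Path(tmp) / "demo-env")
        print(cfg.read_text())
```

Activation is still a shell-level step, so you would activate the resulting directory with the same commands shown above.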

Now that you have a working (and active) virtual environment, you can install all the packages that you want, and they will not affect any other projects that you have.

Freeze the Packages!

Now that you know what a virtual environment is and how to set one up, you can learn how to replicate an existing virtual environment on a new machine. We just need to make sure the new machine has all the same packages and version numbers as the previous machine, but how do we do that? In Python, you can save all the installed packages in your current virtual environment, along with their version numbers, with the following command:

pip freeze > requirements.txt

This will save all the packages that are installed in your current environment to a file called requirements.txt. Then on a new system you can install all those packages into your environment with:

pip install -r /path/to/requirements.txt
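For a sense of what the requirements.txt file contains, the sketch below builds similar name==version lines using the standard-library importlib.metadata module. It is an approximation for illustration, not how pip implements freeze: pip handles more cases (editable installs, for example), and the function name freeze_lines is made up for this post.

```python
from importlib import metadata


def freeze_lines() -> list[str]:
    """Return sorted 'name==version' lines for the packages visible to this interpreter."""
    pins = set()
    for dist in metadata.distributions():
        try:
            name, version = dist.metadata["Name"], dist.version
        except Exception:  # unreadable or incomplete package metadata
            continue
        if name and version:
            pins.add(f"{name}=={version}")
    # Sort case-insensitively so the output is stable and easy to scan.
    return sorted(pins, key=str.lower)


if __name__ == "__main__":
    print("\n".join(freeze_lines()))
```

Because it inspects whichever interpreter runs it, the output changes depending on which environment is active.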

Now, whenever you make a new project, after you install all the packages, you can run the pip freeze command to save them and then install them in your virtual environment on your new machine with pip install -r. Now you have a replicated virtual environment and can run the code on the new machine with the same results!
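One way to double-check that the replication worked is to run pip freeze on both machines and compare the two files. The sketch below does that comparison under simplifying assumptions: it only understands plain name==version lines, it does not normalize package names beyond lowercasing, and the function names and example pins are made up for illustration.

```python
def parse_pins(requirements_text: str) -> dict[str, str]:
    """Parse 'name==version' lines; blank lines, comments, and other forms are ignored."""
    pins = {}
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(expected: dict[str, str], installed: dict[str, str]) -> dict[str, tuple]:
    """Map each mismatched package to (expected version, installed version or None)."""
    return {
        name: (version, installed.get(name))
        for name, version in expected.items()
        if installed.get(name) != version
    }


if __name__ == "__main__":
    # Illustrative file contents, not taken from a real project.
    old_machine = parse_pins("numpy==1.23.0\nrequests==2.31.0  # http client\n")
    new_machine = parse_pins("numpy==1.23.0\nrequests==2.28.0\n")
    print(find_mismatches(old_machine, new_machine))
```

An empty result means every pinned package is present on the new machine at the same version.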