Python for Scientists – Environments

If you are only here for a refresher on the commands, skip to the Wrap Up section at the end. Read on if you want background information and discussion.

Why make environments? Why can’t I just install everything into the root, or base, python environment and go happily on my way? Because sooner or later this is going to happen.

You find a new feature in the latest release for a package that finally allows you to simplify your code. This is great! You quickly type the command to update the package in your base environment. The installation is going fine until you get a message that installation failed and Conda will try to roll back changes. The roll back seems to work fine and you decide to live without the new feature. Then you try to run your script again and it fails due to a package conflict. What!? My script worked just a few minutes ago! After searching online for an hour for a fix, you decide to delete Miniconda3 and reinstall everything. You try to remember all of the packages that were used, but other scripts from different projects rely on different packages. Oh no…you’re in for a terrible, horrible, no good, very bad day(s).

This is where environments really shine! Each environment is essentially a sandbox (an isolated Python installation) that will not interfere with other environments. For example, if the “py36_myProject” environment becomes corrupt for whatever reason, the remaining Conda environments are unaffected. This also allows you to have multiple versions of Python available. You can have one environment for Python 2.7 even if your root environment is Python 3.6. Another good reason to use environments is the ability to share them with others, which makes a project reproducible. Conda environments make all of this possible.

In this post we will go through techniques to manage Conda environments for a single operating system. A separate post covers how to install environments on any operating system, but with less control over dependencies (read: exact environment replication is not guaranteed). Currently there are three separate ways to share Conda environments. Why are there three? I don’t know, but developers are trying to consolidate them. This post will be updated with the latest information if/when consolidation happens.

Topics covered

  • Create environments
  • Activate/deactivate environments
  • Environment naming conventions
  • Install packages
  • Update packages
  • Save environments
  • Delete environments


Updated: 2020-01-24


From Scratch

If you have never created an environment before, or you are not reproducing another environment, you will start with the basic method. Note that it’s better to manually type the commands instead of using copy/paste. I have found it builds muscle memory and helps a person remember things which results in less searching online for help.

The basic command is conda create --name <project name> <list of packages>. The --name option can be shortened to -n. Don’t let this confuse you. In general, Conda allows shorthand options. I will use the full option names as they require less explaining. The list of packages can use math symbols to specify versions. For example, in this list:

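python=3.7 "matplotlib>0.2" numpy

I’m requesting Python version 3.7.x, matplotlib with a version greater than 0.2, and the latest version of numpy that doesn’t conflict with the other packages. This comes in handy if you know certain versions of packages work well together. Latest releases rarely have bugs, but it does happen occasionally. Note the quotes around the matplotlib entry: on the command line, any package spec that contains > or < must be quoted, otherwise the shell treats the symbol as output redirection.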
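To make the pieces of a package spec explicit, here is a minimal Python sketch that splits a spec into name, operator, and version. The helper parse_spec is hypothetical (it is not part of Conda), and it only handles the simple forms shown above, not Conda's full spec syntax.

```python
import re

# Hypothetical helper, not part of Conda: handles only simple specs
# such as "python=3.7", "matplotlib>0.2", or a bare "numpy".
SPEC_PATTERN = re.compile(r"^([A-Za-z0-9_.\-]+)(?:(==|>=|<=|!=|=|>|<)(.+))?$")


def parse_spec(spec):
    """Return (name, operator, version); operator and version may be None."""
    match = SPEC_PATTERN.match(spec)
    if match is None:
        raise ValueError("Unrecognized package spec: " + spec)
    return match.groups()


for spec in ["python=3.7", "matplotlib>0.2", "numpy"]:
    print(spec, "->", parse_spec(spec))
```

A bare name such as numpy comes back with no operator and no version, which matches the "latest compatible version" request described above.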

Let’s create our first test environment. The commands are the same on Windows, Linux, and Mac. You can come up with your own naming convention, but I like to use py<version>_<project description>. For our first environment let’s name it py37_test. Open an Anaconda prompt and type the following command:

conda create --name py37_test python=3.7

Conda will solve the environment and list the packages it plans to install.

Either press Enter to accept the default (which is yes), or type y and then press the Enter key to accept. You can add the -y option to the end of the conda create command to automatically accept the installation plan, but I like to review the plan to make sure I didn’t make a typo or forget to add a package. For example, you could have typed (note the -y at the end)

conda create --name py37_test python=3.7 -y

And it would have automatically downloaded and installed the libraries. The choice is yours, but I recommend leaving off the -y option.
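If you ever script environment creation from Python, the sketch below builds the same command as a list of arguments. The function names env_name and build_create_command are my own (hypothetical), and the sketch only assembles and prints the command; running it is left as a commented-out line.

```python
import shlex


def env_name(python_version, project):
    """Apply the py<version>_<project description> naming convention."""
    return "py" + python_version.replace(".", "") + "_" + project


def build_create_command(python_version, project, extra_specs=(), assume_yes=False):
    """Assemble a 'conda create' command as an argument list (hypothetical helper)."""
    command = ["conda", "create", "--name", env_name(python_version, project),
               "python=" + python_version]
    command.extend(extra_specs)
    if assume_yes:
        # -y skips the confirmation prompt, so the plan is not reviewed first.
        command.append("-y")
    return command


command = build_create_command("3.7", "test")
# shlex.join (Python 3.8+) shows the command as it would be typed in a shell.
print(shlex.join(command))
# To actually run it: subprocess.run(command, check=True)
```

Passing the arguments as a list means specs such as matplotlib>0.2 never pass through a shell, so no quoting is needed.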

Conda will then download and install your new environment to the <conda install path>/Miniconda3/envs/ folder. Some packages can take a while to download so don’t fret if it looks like installation has stalled. Just give it some time. Eventually, you will see that the installation succeeded and instructions on how to activate your environment.

Activate the environment by typing conda activate py37_test. The environment prefix on your prompt will change from (base) to (py37_test). Congratulations! Your computer is now using the Python environment that we just installed. It cannot use any libraries in the base environment or any other installed environment.
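If you want to confirm from inside Python which environment is in use, a small check like the following can help. It relies on the CONDA_DEFAULT_ENV variable that conda activate sets; the function name active_environment is hypothetical.

```python
import os
import sys


def active_environment(environ=None):
    """Return the name of the active Conda environment, or None if none is set."""
    if environ is None:
        environ = os.environ
    # conda activate sets CONDA_DEFAULT_ENV to the environment name.
    return environ.get("CONDA_DEFAULT_ENV")


print("Active Conda environment:", active_environment())
# sys.prefix is the folder of the interpreter that is running this script.
print("Interpreter location:", sys.prefix)
```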

If you want to use a different environment, or create a new one, deactivate the current environment using conda deactivate. This is not needed if you are already in the base environment. You can see a list of installed environments by typing:

conda env list
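The same list can be read from a script. The sketch below assumes you ran conda env list --json and that the output is a JSON object with an "envs" list of folder paths; the sample text is made up for illustration, and environment_names is a hypothetical helper.

```python
import json
import re


def environment_names(json_text):
    """Return the last folder name of each environment path (hypothetical helper)."""
    paths = json.loads(json_text)["envs"]
    # Split on both / and \ so Linux, Mac, and Windows paths all work.
    return [re.split(r"[/\\]", path.rstrip("/\\"))[-1] for path in paths]


# Made-up sample of what 'conda env list --json' might print.
sample = '{"envs": ["/home/user/Miniconda3", "/home/user/Miniconda3/envs/py37_test"]}'
print(environment_names(sample))
```

Note that the base environment shows up under the name of the install folder (Miniconda3 in this sample) rather than as "base".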

Another Option

While perusing the interwebs you might find that some people use conda env create instead of conda create. Both of them will build your environment, but current guidance from Conda docs is to use conda create. The benefit of using conda env create is that you can build an environment from an environment.yml file. These files allow you to build an environment that contains libraries installed via pip. However, you should very rarely use pip. Please avoid using pip at all costs because it can lead to package conflicts as you might accidentally use a non-Conda version of pip. Please do not use pip to install libraries unless it’s absolutely needed. I hope I made my point.

In this post we will use the conda create version, which installs an exact copy of the coding environment. This increases reproducibility of your research/work. The conda env create method is discussed in a separate post.

Sharing Environments

I prefer to create specification files to share exact copies of an environment that only use packages from Conda. This increases reproducibility of your work on a single operating system. If somebody wants to convince me otherwise, please make a suggestion in the comments section below. Some people might prefer to use YAML files instead as they can install packages from both Conda and pip. My hope is that Conda developers consolidate environment sharing options so there is only one method to rule them all.

The awesome thing about whichever method you decide to use is that it also protects you from disaster. Specification files and YAML files can be saved with the rest of your project using version control (explained in an upcoming post). If your computer dies, it takes a matter of minutes to reproduce your environment (after reinstalling the OS) and get back to working on the project. This is only possible if you follow some sort of version control.

Usually I install libraries/packages from the conda-forge repository. I don’t have any environments that require pip, therefore I don’t use YAML files. The reason I like to use specification files is that they provide the full remote download paths of the installed libraries. This means that an environment can be exactly reproduced on a similar operating system. The downsides are that a specification file does not work on a different operating system, and it won’t install anything from pip.

Activate the test environment we created earlier (conda activate py37_test) and then type the following line to create a specifications file:

conda list --explicit > py37_test_specifications_file.txt

If you want to save the specifications file to a certain directory, append the full file path after the greater-than sign.
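To see what a specifications file holds, here is a minimal Python sketch that reads one. It assumes the usual layout: comment lines starting with # (one of which names the platform), an @EXPLICIT marker, and then one download URL per line. The sample text uses made-up URLs, and read_explicit_spec is a hypothetical helper.

```python
def read_explicit_spec(text):
    """Return (platform, [(name, version, build), ...]) from spec file text."""
    platform = None
    packages = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("# platform:"):
            platform = line.split(":", 1)[1].strip()
        elif line and not line.startswith("#") and line != "@EXPLICIT":
            # Keep the file name only; drop any "#<hash>" suffix after the URL.
            filename = line.split("#", 1)[0].rsplit("/", 1)[-1]
            for extension in (".tar.bz2", ".conda"):
                if filename.endswith(extension):
                    filename = filename[: -len(extension)]
            # Package names may contain hyphens, so split from the right.
            name, version, build = filename.rsplit("-", 2)
            packages.append((name, version, build))
    return platform, packages


# Made-up sample; a real file lists real channel URLs.
sample = """# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: linux-64
@EXPLICIT
https://example.invalid/channel/linux-64/python-3.7.6-build_0.tar.bz2
https://example.invalid/channel/linux-64/scikit-learn-0.22.1-build_1.conda
"""
platform, packages = read_explicit_spec(sample)
print(platform)
print(packages)
```

The platform line is the reason a specifications file only works on a similar operating system: every URL points at a build for that one platform.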

If you have a specifications file, then an exact copy of the environment can be created by typing:

conda create --name py37_test --file py37_test_specifications_file.txt

If the specifications file is in a different directory then you have to add the full path to the file.

Environment Maintenance

What happens when you need a new package that wasn’t installed when you initially created the environment? Easy. We activate the environment and install the package.

conda install packageName

Some packages are not available on the standard Conda channel (a channel is essentially a remote location where packages are stored; the default channel, named defaults, is populated with packages maintained by Anaconda, Inc., formerly Continuum Analytics), but there’s a secondary channel that will usually have the package you’re after. This channel is named conda-forge. Conda-forge is maintained by the community (instead of Anaconda, Inc.), and is usually updated with new packages faster than the default channel. To use the conda-forge channel for a specific package, type:

conda install -c conda-forge packageName

Remember, activate the environment first! Try not to install anything into the base environment. Keep it clean. Simply change packageName to whatever package you need to download. Conda will then attempt to install it.

Sometimes you install packages that go unused, either because the package didn’t solve the problem or because you found another way to solve the issue. Whatever the reason, package removal is easy.

conda remove packageName

Note that if the package is a dependency of other packages, those dependent packages will also be removed. This can be avoided using the --force flag, but it isn’t recommended because it can leave the environment broken. As I write this post, Conda does not remove packages that are no longer used by any other package. Developers are currently working on this feature.

You can update all packages at once, but sometimes this can cause package conflicts. If you are dead set on updating all packages at once, save yourself some trouble and create a specifications file beforehand. Then if something breaks after the upgrade you can start at the last point where everything worked.

conda update --all
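After an update, it helps to know exactly what changed. The sketch below compares two snapshots of an environment, each given as a dictionary of package name to version (for example, built from specifications files saved before and after the update). The function changed_packages is hypothetical and the version numbers are made up.

```python
def changed_packages(before, after):
    """Return {name: (old_version, new_version)} for every package that differs."""
    changes = {}
    for name in sorted(set(before) | set(after)):
        old_version = before.get(name)   # None means not installed before
        new_version = after.get(name)    # None means removed by the update
        if old_version != new_version:
            changes[name] = (old_version, new_version)
    return changes


# Made-up snapshots for illustration.
before = {"numpy": "1.17.0", "pandas": "0.25.3"}
after = {"numpy": "1.18.1", "pandas": "0.25.3", "scipy": "1.4.1"}
for name, (old_version, new_version) in changed_packages(before, after).items():
    print(name, old_version, "->", new_version)
```

If a script breaks after the update, this short list is a good place to start looking before you rebuild from the saved specifications file.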

The safer method is to update only the specific package that requires an update for new functionality. Updating for the sake of updating is not recommended. If your packages work, leave them alone.

conda update numpy

Multiple environments can take up a lot of disk space. A good technique to manage disk space with multiple environments is to save a specifications file for the environment and then delete the environment. If/when you need the environment again, you can create a replica with the specifications file. The command to remove an entire environment is:

conda remove --name py37_test --all

Replace py37_test with the name of the respective environment.
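Before deleting anything, you may want a rough idea of how much space an environment folder uses. Here is a minimal sketch using only the standard library; directory_size_bytes is a hypothetical helper, and the path in the usage comment is a placeholder. Keep in mind that Conda hard-links many files from its package cache, so the total can overstate the space you actually get back.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files below path, skipping symbolic links."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


# Example usage (replace the placeholder with your own envs folder):
# size = directory_size_bytes("<conda install path>/Miniconda3/envs/py37_test")
# print(round(size / 1e6), "MB")
```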

Wrap Up

Here is a review of the commands discussed in this post:

  • conda create --name py37_test python=3.7 numpy
    • Create an environment named ‘py37_test’ and install Python version 3.7 and the latest compatible NumPy package
  • conda activate py37_test
    • You must activate an environment before using it
  • conda install pandas
    • After activating an environment, additional packages can be installed such as Pandas.
  • conda install -c conda-forge packageName
    • After activating an environment, install a package from a specific repository.
  • conda list --explicit > py37_test_spec_file.txt
    • Save a specifications file that includes an explicit list of all installed libraries in a specific environment
  • conda create --name py37_test --file py37_test_spec_file.txt
    • Create an environment from a specifications file
  • conda update numpy
    • Update the numpy package in the activated environment
  • conda deactivate
    • If you want to change environments, deactivate the current environment and then activate the new one.
  • conda env list
    • Return a list of installed environments
  • conda remove --name py37_test --all
    • Remove all traces of the named environment

There you have it! These command examples should cover everything you need in reference to conda environments.

