An Introduction to Python Virtual Environment

Data Science, Machine Learning, Deep Learning, and Artificial Intelligence are some of the most heard about buzzwords in the modern analytical eco-space. The exponential growth of technology in this regard has simplified our lives and made us more machine dependent. The astonishing hype surrounding such technologies has prompted professionals from various disciples to hop on to the ship and consider analytics as their career option.

To master Data Science or Artificial Intelligence in that regard, one needs a myriad of skills which includes Programming, Mathematics, Statistics, Probability, Machine Learning, and also Deep Learning. The most sort after languages for programming in Data Science is Python, and R with the former being regarded as the holy grail of the programming world because of its functionality, flexibility, community, and others.

Python is comparatively easy to master but given its importance, it has various usages which demand certain specific areas to be mastered more efficiently compared to others. In this blog, we would learn about the virtual environments in Python and how they could be used.

What is a Python Virtual Environment?

A python virtual environment is a tool which ensures the separation of resources, and dependencies of a project by creating separate virtual environments for them.

As the virtual environments are just directories running a few scripts, it ensures the creation of an unlimited number of virtual environments.

Why Do We Need Virtual Environments?

Python has a rich list of modules, and packages used for different applications. However, often those packages would not come in the form of a standard library. Thus to ensure the fixation of a common bus, an application might need a version of a library specific to it.

It is often impossible for a single installation of python to include the requirements of every application. A conflict would be created when two applications would need two different versions of a particular module.

In our system, by default, each and every application would use the same directory for storing, and retrieval of the site-packages which are the third party libraries. This kind of situation may not be a cause of concern for system packages but certainly is for site-packages.

To eliminate such scenarios, Python has the facility of creating virtual environments which would separate the modules, and packages needed by each application in its own isolated environment. It would also have a standard self-contained directory consisting of the version of the python installed.

Imagine a scenario where both project A, and project B has their dependencies on the same project C. Now, at this points everything might seem fine, but when project A would need version v1.0.0 of Project C, and project B would need v2.0.0 of the project C, then a conflict would arise as it’s not possible for Python to differentiate between the two different versions in the directory called site-packages. As a result, both the versions would have the same name in the same directory.

This would lead to both the projects using the same version which would not be acceptable in many cases in real life. Thus Python Virtual Environments and the virtualenv/tools come to the rescue in those cases.

Creating a Virtual Environment

Python 3 already has the venv module for creating, and managing the virtual environments. For Python 2 users, the virtual environment could be created using the pip install virtualenvcommand. The venv module would ensure the installation of the last version of python available. In case of having multiple versions, the specific version like python3 could be selected for the creation.

The selection of directory is the first step as it is the place where the virtual environment would be located. Once the directory is decided, the command – python3 -m venv dimensionless-env could be executed on it to create a directory named dimensionless-env if it didn’t exist before, and would also create several directories inside it which includes the Python interpreter, various files, the standard library, and so on.

Once the virtual environment is created, it needs to be activated using the below commands –

dimensionless-envScriptsactivate.bat in the Windows operating system.
source dimensionless-env/bin/activate in the Unix or Mac operating system. The bash shell uses this script. For csh, or fish shells, there are alternate scripts that could be used such as activate.csh, and activate.fish.

The shell’s prompt would display the virtual environment that’s being used after its being activated. It would also modify the Python environment to get the exact version of Python, and its installation.

$ source ~/envs/tutorial-env/bin/activate

(tutorial-env)

>>> import sys

>>> sys.path

[”, ‘/usr/local/lib/python35.zip’, …,

‘~/envs/tutorial-env/lib/python3.5/site-packages’]

The creation of the virtual environment allows you to do anything like installing, upgrading or removing packages using the pip command. Let’s search for the package called astronomy in our environment.

(dimensionless-env) $ pip search astronomy

There are several sub-commands in pip like install, freeze, etc. The latest version of any package could be installed by specifying its name.

Often, an application needs a specific version of a particular package to be installed which could be accomplished using the == sign to mention the version number as shown below.

Re-running the same command would do nothing but to install the latest version from here, either the version name could be specified or the ‘upgrade’ keyword could be used as shown below.

(dimensionless-env) $ pip install –upgrade requests

Collecting requests

Installing collected packages: requests

Found existing installation: requests 2.6.0

Uninstalling requests-2.6.0:

Successfully uninstalled requests-2.6.0

Successfully installed requests-2.7.0

To uninstall a particular package pip uninstall package-name command is used. In order to get detailed information about a particular package, the pip show command is used. All the installed packages in the virtual environment could be displayed using the pip list command.

(dimensionless-env) $ pip list

novas (3.1.1.3)

numpy (1.9.2)

pip (7.0.3)

requests (2.7.0)

setuptools (16.0)

The pip freeze command would also do the same task but in the format of pip install. Thus a generic notion is to put that in a requirments.txt file.

(dimensionless-env) $ pip freeze > requirements.txt

(dimensionless-env) $ cat requirements.txt

novas==3.1.1.3

numpy==1.9.2

requests==2.7.0

This requirements.txt file could be shipped and committed to allowing users making necessary installations using the install –r command.

What is Virtualenvwrapper?

Python virtual environments provide flexibility in the development, and the maintenance of our project as creating isolated environments allows projects to be separated from each other with the required dependencies for an individual project could be installed in that particular environment.

Though the virtual environments resolve the conflicts which arise due to the packages management, it is not completely perfect. Some problems often arise while managing the environment which is resolved by the virtualenvwrapper tool.

Some of the useful features of virtualenvwrapper are –

Organization – Virtualenvwrapper ensures all the virtual environments are organized in one particular location

Flexibility – It eases the process of creating, deleting, and copying environments by proving the respective methods for each.

Simplicity – There is a single command which allows switching between the environments.

The virtualenvwrapper could be installed using the pip install virtualenvwrapper command and then activating it either by running source or by executing the virtualenvwrapper.sh script. After the first installation using pip, the exact location of the virtualenvwrapper.sh would be known from the output of the installation.

How Python Virtual Environment is Used in Data Science?

The field of Data Science encompasses several methodologies which include Deep Learning as well. Deep Learning works with the principle of neural networks which is similar to the neurons in the human brain. Unlike the traditional Machine Learning algorithms, Deep Learning needs a huge volume of data, and vast computational power to make accurate predictions.

There are several Python libraries used for deep Learning such as TensorFlow, Keras, PyTorch, and so on. TensorFlow, which was created by Google is mostly used for Deep Learning operations. However, to work with TensorFlow in the Jupyter Notebook, we need to create a virtual environment first, and then install all the necessary packages inside that environment.

Once, you are into the Anaconda prompt, the conda create -n myenv python=3.6 command would create a new virtual environment known as myenv. The environment could be activated using the conda activate myenv command. The activation of the environment would let us install all the below necessary packages required to work TensorFlow.

conda install jupyter

conda install scipy

pip install –upgrade tensorflow

TensorFlow is used in applications like Object Detection, Image Processing, and so on.

Conclusion

Python is the most important programming language to master in the 21^st century, and mastering it would open the door for numerous career opportunities. Its virtual environment feature allows to efficiently create, and manage a project, and its dependencies.

In this article, we learned that it’s not only about how virtual environments allows storing dependencies flawlessly but resolves various issues surrounding packaging, and the versioning in a project. The huge community of Python helps you find any tools needed for your project.

Dimensionless has several blogs and training to get started with Python Learning and Data Science in general.