In order to be a highly efficient, flexible, and production-ready library, TensorFlow uses dataflow graphs to represent computation in terms of the relationships between individual operations. Dataflow is a programming model widely used in parallel computing and, in a dataflow graph, the nodes represent units of computation while the edges represent the data consumed or produced by a computation unit.
This post is taken from the book Hands-On Neural Networks with TensorFlow 2.0 by Ekta Saraogi and Akshat Gupta. This book by Packt Publishing explains how TensorFlow works, from basics to advanced level with case-study based approach.
Representing computation using graphs comes with the advantage of being able to run the forward and backward passes required to train a parametric machine learning model via gradient descent, applying the chain rule to compute the gradient as a local process to every node; however, this is not the only advantage of using graphs.
Reducing the abstraction level and thinking about the implementation details of representing computation using graphs brings the following advantages:
The main structure – tf.Graph
Python simplifies the graph description phase since it even creates a graph without the need to explicitly define it; in fact, there are two different ways to define a graph:
In order to explicitly define a graph, TensorFlow allows the creation of tf.Graph objects that, through the as_default method, create a context manager; every operation defined inside the context is placed inside the associated graph. In practice, a tf.Graph object defines a namespace for the tf.Operation objects it contains.
The second peculiarity of the tf.Graph structure is its graph collections. Every tf.Graph uses the collection mechanism to store metadata associated with the graph structure. A collection is uniquely identified by a key and its content is a list of objects/operations. The user does not usually need to worry about the existence of a collection since they are used by TensorFlow itself to correctly define a graph.
Graph definition – from tf.Operation to tf.Tensor
A dataflow graph is the representation of a computation where the nodes represent units of computation, and the edges represent the data consumed or produced by the computation.
In the context of tf.Graph, every API call defines tf.Operation (node) that can have multiple inputs and outputs tf.Tensor (edges). For instance, referring to our main example, when calling tf.constant([[1, 2], [3, 4]], dtype=tf.float32), a new node (tf.Operation) named Const is added to the default tf.Graph inherited from the context. This node returns a tf.Tensor (edge) named Const:0.
Since each node in a graph is unique, if there is already a node named Const in the graph (that is the default name given to all the constants), TensorFlow will make it unique by appending the suffix '_1', '_2', and so on to the name. If a name is not provided, as in our example, TensorFlow gives a default name to each operation added and adds the suffix to make them unique in this case too.
The output tf.Tensor has the same name as the associated tf.Operation, with the addition of the :ID suffix. The ID is a progressive number that indicates how many outputs the operation produces. In the case of tf.constant, the output is just a single tensor, therefore ID=0; but there can be operations with more than one output, and in this case, the suffixes :0, :1, and so on are added to the tf.Tensor name generated by the operation.
It is also possible to add a name scope prefix to all operations created within a context—a context defined by the tf.name_scope call. The default name scope prefix is a / delimited list of names of all the active tf.name_scope context managers. In order to guarantee the uniqueness of the operations defined within the scopes and the uniqueness of the scopes themselves, the same suffix appending rule used for tf.Operation holds.
The following code snippet shows how our baseline example can be wrapped into a separate graph, how a second independent graph can be created in the same Python script, and how we can change the node names, adding a prefix, using tf.name_scope. First, we import the TensorFlow library:
(tf1)
import tensorflow as tf
Then, we define two tf.Graph objects (the scoping system allows you to use multiple graphs easily):
g1 = tf.Graph()
g2 = tf.Graph()
with g1.as_default():
A = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
x = tf.constant([[0, 10], [0, 0.5]])
b = tf.constant([[1, -1]], dtype=tf.float32)
y = tf.add(tf.matmul(A, x), b, name="result")
with g2.as_default():
with tf.name_scope("scope_a"):
x = tf.constant(1, name="x")
print(x)
with tf.name_scope("scope_b"):
x = tf.constant(10, name="x")
print(x)
y = tf.constant(12)
z = x * y
Then, we define two summary writers. We need to use two different tf.summary.FileWriter objects to log two separate graphs.
writer = tf.summary.FileWriter("log/two_graphs/g1", g1)
writer = tf.summary.FileWriter("log/two_graphs/g2", g2)
writer.close()
Run the example and use TensorBoard to visualize the two graphs, using the left-hand column on TensorBoard to switch between "runs."
Nodes with the same name, x in the example, can live together in the same graph, but they have to be under different scopes. In fact, being under different scopes makes the nodes completely independent and completely different objects. The node name, in fact, is not only the parameter name passed to the operation definition, but its full path, complete with all of the prefixes.
In fact, running the script, the output is as follows:
Tensor("scope_a/x:0", shape=(), dtype=int32)
Tensor("scope_b/x:0", shape=(), dtype=int32)
As we can see, the full names are different and we also have other information about the tensors produced. In general, every tensor has a name, a type, a rank, and a shape:
Being a C++ library, TensorFlow is strictly statically typed. This means that the type of every operation/tensor must be known at graph definition time. Moreover, this also means that it is not possible to execute an operation among incompatible types.
Looking closely at the baseline example, it is possible to see that both matrix multiplication and addition operations are performed on tensors with the same type, tf.float32. The tensors identified by the Python variables A and b have been defined, making the type clear in the operation definition, while tensor x has the same tf.float32 type; but in this case, it has been inferred by the Python bindings, which are able to look inside the constant value and infer the type to use when creating the operation.
Another peculiarity of Python bindings is their simplification in the definition of some common mathematical operations using operator overloading. The most common mathematical operations have their counterpart as tf.Operation; therefore, using operator overloading to simplify the graph definition is natural.
The following table shows the available operators overloaded in the TensorFlow Python API:
Python operator |
Operation name |
__neg__ |
unary - |
__abs__ |
abs() |
__invert__ |
unary ~ |
__add__ |
binary + |
__sub__ |
binary - |
__mul__ |
binary elementwise * |
__floordiv__ |
binary // |
__truediv__ |
binary / |
__mod__ |
binary % |
__pow__ |
binary ** |
__and__ |
binary & |
__or__ |
binary | |
__xor__ |
binary ^ |
__le__ |
binary /kbd> |
__lt__ |
binary <= |
__gt__ |
binary > |
__ge__ |
binary <= |
__matmul__ |
binary @ |
Operator overloading allows a faster graph definition and is completely equivalent to their tf.* API call (for example, using __add__ is the same as using the tf.add function). There is only one case in which it is beneficial to use the TensorFlow API call instead of the associated operator overload: when a name for the operation is needed.
tf.device creates a context manager that matches a device. The function allows the user to request that all operations created within the context it creates are placed on the same device. The devices identified by tf.device are more than physical devices; in fact, it is capable of identifying devices such as remote servers, remote devices, remote workers, and different types of physical devices (GPUs, CPUs, and TPUs). It is required to follow a device specification to correctly instruct the framework to use the desired device. A device specification has the following form:
/job:<JOB_NAME>/task:<TASK_INDEX>/device:<DEVICE_TYPE>:<DEVICE_INDEX>
Broken down as follows:
There is no need to specify every part of a device specification. For example, when running a single-machine configuration with a single GPU, you might use tf.device to pin some operations to the CPU and GPU.
We can thus extend our baseline example to place the operations on the device we choose. Thus, it is possible to place the matrix multiplication on the GPU, since it is hardware optimized for this kind of operation, while keeping all the other operations on the CPU.
Please note that since this is only a graph description, there's no need to physically have a GPU or to use the tensorflow-gpu package. First, we import the TensorFlow library:
(tf1)
import tensorflow as tf
Now, use the context manager to place operations on different devices, first, on the first CPU of the local machine:
with tf.device("/CPU:0"):
A = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
x = tf.constant([[0, 10], [0, 0.5]])
b = tf.constant([[1, -1]], dtype=tf.float32)
Then, on the first GPU of the local machine:
with tf.device("/GPU:0"):
mul = A @ x
When the device is not forced by a scope, TensorFlow decides which device is better to place the operation on:
y = mul + b
Then, we define the summary writer:
writer = tf.summary.FileWriter("log/matmul_optimized", tf.get_default_graph())
writer.close()
If we look at the generated graph, we'll see that it is identical to the one generated by the baseline example, with two main differences:
The matmul node is placed on the first GPU of the local machine, while any other operation is executed in the CPU. TensorFlow takes care of communication among different devices in a transparent manner:
tf.Session is a class that TensorFlow provides to represent a connection between the Python program and the C++ runtime.
The tf.Session object is the only object able to communicate directly with the hardware (through the C++ runtime), placing operations on the specified devices, using the local and distributed TensorFlow runtime, with the goal of concretely building the defined graph. The tf.Session object is highly optimized and, once correctly built, caches tf.Graph in order to speed up its execution.
Being the owner of physical resources, the tf.Session object must be used as a file descriptor to do the following:
Typically, instead of manually defining a session and taking care of its creation and destruction, a session is used through a context manager that automatically closes the session at the block exit.
The constructor of tf.Session is fairly complex and highly customizable since it is used to configure and create the execution of the computational graph.
In the simplest and most common scenario, we just want to use the current local hardware to execute the previously described computational graph as follows:
(tf1)
# The context manager opens the session
with tf.Session() as sess:
# Use the session to execute operations
sess.run(...)
# Out of the context, the session is closed and the resources released
There are more complex scenarios in which we wouldn't want to use the local execution engine, but use a remote TensorFlow server that gives access to all the devices it controls. This is possible by specifying the target parameter of tf.Session just by using the URL (grpc://) of the server:
(tf1)
# the IP and port of the TensorFlow server
ip = "192.168.1.90"
port = 9877
with tf.Session(f"grpc://{ip}:{port}") as sess:
sess.run(...)
By default, the tf.Session will capture and use the default tf.Graph object, but when working with multiple graphs, it is possible to specify which graph to use by using the graph parameter. It's easy to understand why working with multiple graphs is unusual, since even the tf.Session object is able to work with only a single graph at a time.
The third and last parameter of the tf.Session object is the hardware/network configuration specified through the config parameter. The configuration is specified through the tf.ConfigProto object, which is able to control the behavior of the session. The tf.ConfigProto object is fairly complex and rich with options, the most common and widely used being the following two (all the others are options used in distributed, complex environments):
The baseline example can now be extended to not only define a graph, but to proceed on to an effective construction and the execution of it:
import tensorflow as tf
import numpy as np
A = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
x = tf.constant([[0, 10], [0, 0.5]])
b = tf.constant([[1, -1]], dtype=tf.float32)
y = tf.add(tf.matmul(A, x), b, name="result")
writer = tf.summary.FileWriter("log/matmul", tf.get_default_graph())
writer.close()
with tf.Session() as sess:
A_value, x_value, b_value = sess.run([A, x, b])
y_value = sess.run(y)
# Overwrite
y_new = sess.run(y, feed_dict={b: np.zeros((1, 2))})
print(f"A: {A_value}\nx: {x_value}\nb: {b_value}\n\ny: {y_value}")
print(f"y_new: {y_new}")
The first sess.run call evaluates the three tf.Tensor objects, A, x, b, and returns their values as numpy arrays.
The second call, sess.run(y), works in the following way:
The addition is the entry point of the graph resolution (Python variable y) and the computation ends.
The first print call, therefore, produces the following output:
A: [[1. 2.]
[3. 4.]]
x: [[ 0. 10. ]
[ 0. 0.5]]
b: [[ 1. -1.]]
y: [[ 1. 10.]
[ 1. 31.]]
The third sess.run call shows how it is possible to inject into the computational graph values from the outside, as numpy arrays, overwriting a node. The feed_dict parameter allows you to do this: usually, inputs are passed to the graph using the feed_dict parameter and through the overwriting of the tf.placeholder operation created exactly for this purpose.
tf.placeholder is just a placeholder created with the aim of throwing an error when values from the outside are not injected inside the graph. However, the feed_dict parameter is more than just a way to feed the placeholders. In fact, the preceding example shows how it can be used to overwrite any node. The result produced by the overwriting of the node identified by the Python variable, b, with a numpy array that must be compatible, in terms of both type and shape, with the overwritten variable, is as follows:
y_new: [[ 0. 11.]
[ 0. 32.]]
The baseline example has been updated in order to show the following:
So far, we have used graphs with constant values and used the feed_dict parameter of the sess.run call to overwrite a node parameter. However, since TensorFlow is designed to solve complex problems, the concept of tf.Variable has been introduced: every parametric machine learning model can be defined and trained with TensorFlow.
In this post we talked about how dataflow graphs work in TensorFlow. To know, the implementation of convolution neural network in TensorFlow for via Churn Prediction Case Study and pneumonia detection from the x-ray case study, read the book Hands-On Neural Networks with TensorFlow 2.0 by Packt Publishing.
© 2020 Data Science Central ® Powered by
Badges | Report an Issue | Privacy Policy | Terms of Service
Most Popular Content on DSC
To not miss this type of content in the future, subscribe to our newsletter.
Other popular resources
Archives: 2008-2014 | 2015-2016 | 2017-2019 | Book 1 | Book 2 | More
Most popular articles
You need to be a member of Data Science Central to add comments!
Join Data Science Central