TensorFlow Basics
TensorFlow (TF) is quickly becoming the technology that powers many DL applications. There are other frameworks, such as Theano, but TF is the one that has gathered the greatest interest and is the most applicable to us. Higher-level frameworks, such as Keras, sit on top of these and let you build and deploy models on either a TF or Theano backend. This is great for prototyping and building a quick proof of concept, but, as a game developer, you know that when it comes to games, the dominant requirements are always performance and control. Working directly in TF generally gives you finer control, and more room to tune performance, than a higher-level framework such as Keras. In other words, to be a serious DL developer, you likely need and want to learn TF.
TF, as its name suggests, is all about tensors. A tensor is a mathematical concept that describes a set of data organized in n dimensions, where n is the number of axes and the shape could be 1, 2 x 2, 4 x 4 x 4, and so on. A single number (a scalar) is a tensor with no axes at all, a list of numbers is a one-dimensional tensor, or vector, and a 2 x 2 tensor is what you may refer to as a matrix. A 3 x 3 x 3 tensor would describe a cube shape. Essentially, the operations that you would apply to a matrix generalize to tensors, and everything in TF is a tensor. It is often helpful when you first start working with tensors, as someone with a game development background, to think of them as a matrix or vector.
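To make the idea of dimensions concrete, the following is a minimal sketch that uses NumPy arrays as stand-ins for tensors. This is an assumption made for illustration only, since the chapter's own code uses TF, and the variable names are hypothetical. The ndim attribute gives the number of axes and shape gives the size along each axis.

```python
import numpy as np

# NumPy arrays stand in for tensors here; all names are illustrative only.
scalar = np.array(5.0)                 # no axes: a single number
vector = np.array([1.0, 2.0, 3.0])     # one axis
matrix = np.array([[1.0, 2.0],
                   [3.0, 4.0]])        # two axes, shape 2 x 2
cube = np.zeros((3, 3, 3))             # three axes, shape 3 x 3 x 3

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, "axes:", t.ndim, "shape:", t.shape)

# A familiar matrix operation: multiplying by 2 * identity doubles each entry.
doubled = matrix @ np.array([[2.0, 0.0],
                             [0.0, 2.0]])
```

Thinking of a tensor as "an array with some number of axes" is usually enough to follow the TF code in this chapter.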
Tensors are nothing more than multidimensional arrays, vectors, or matrices, and many examples are shown in the following diagram:
Let's go back and open up Chapter_1_4.py and follow the next steps in order to better understand how the TF example runs:
- First, examine the top section again and pay special attention to where the placeholder and variable are declared; this is shown again in the following snippet:
tf.placeholder("float", [None, n_input])
...
tf.Variable(tf.random_normal([n_input, n_hidden_1]))
- A placeholder defines a tensor whose value is fed in at run time; here, placeholders are used for the input and output tensors. Variable sets up a tensor whose value persists and can be updated while the TF session or program executes. In the case of the example, a helper function called random_normal populates the hidden-layer weights with values drawn from a normal distribution. There are other helper functions such as this that can be used; check the docs for more info.
- Next, we construct the logits model as a function called multilayer_perceptron, as follows:
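def multilayer_perceptron(x):
    # First hidden layer: input times weights, plus bias
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    # Second hidden layer takes layer_1 as its input
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    # Output layer produces the raw scores (logits)
    out_layer = tf.matmul(layer_2, weights['out']) + biases['out']
    return out_layer

logits = multilayer_perceptron(X)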
- Inside the function, we see the definition of three network layers: two hidden layers and one output layer. Each layer is constructed by using the add function, or the + operator, to add the result of matmul(x, weights['h1']) to biases['b1']. matmul performs a matrix multiplication of the input x by the layer's weights. Think back to our first example perceptron; this is the same as multiplying all our inputs by their weights and then adding the bias. Note how the resultant tensors (layer_1, layer_2) are used as inputs into the following layer.
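The layer arithmetic in this step can be sketched outside TF with NumPy. This is a minimal illustration, not the chapter's code: the layer sizes (4 inputs, 5 and 3 hidden units, 2 outputs), the random seed, and the batch size are all assumptions chosen only so that the shapes are easy to follow.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

# Hypothetical sizes; the real values live in Chapter_1_4.py.
n_input, n_hidden_1, n_hidden_2, n_classes = 4, 5, 3, 2

weights = {
    'h1': rng.normal(size=(n_input, n_hidden_1)),
    'h2': rng.normal(size=(n_hidden_1, n_hidden_2)),
    'out': rng.normal(size=(n_hidden_2, n_classes)),
}
biases = {
    'b1': rng.normal(size=n_hidden_1),
    'b2': rng.normal(size=n_hidden_2),
    'out': rng.normal(size=n_classes),
}

def multilayer_perceptron(x):
    # Each layer: matrix-multiply the input by the weights, then add the bias.
    layer_1 = x @ weights['h1'] + biases['b1']
    layer_2 = layer_1 @ weights['h2'] + biases['b2']
    return layer_2 @ weights['out'] + biases['out']

batch = rng.normal(size=(6, n_input))   # 6 samples, 4 features each
logits = multilayer_perceptron(batch)
print(batch.shape, "->", logits.shape)  # (6, 4) -> (6, 2)
```

Each row of batch is one sample, so a single matrix multiplication processes the whole batch at once. This is also why the placeholder's first dimension can be left as None: the number of rows is free to vary.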
- Skip down to around line 50 and note how we grab references to the loss, optimizer, and initialization functions:
loss_op = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
train_op = optimizer.minimize(loss_op)
init = tf.global_variables_initializer()
- It is important to understand that we are only storing references to these operations and not executing them just yet. The loss and optimizer functions have been covered in some depth already, but also pay special attention to the global_variables_initializer() function. It returns the operation that initializes all the variables, and we are required to run that operation first.
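As a reminder of what loss_op will compute once it is actually run, here is a minimal NumPy sketch of the same idea: a softmax over each row of logits, the cross-entropy against one-hot labels, and then the mean over the batch. It is an illustration under assumptions (the sample values are made up and the function name is hypothetical), not TF's actual implementation.

```python
import numpy as np

def softmax_cross_entropy(logits, labels):
    # Subtract the row maximum before exponentiating for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Cross-entropy per sample: -sum(label * log(probability)).
    return -(labels * log_probs).sum(axis=1)

# Made-up example: 2 samples, 3 classes, one-hot labels.
logits = np.array([[2.0, 1.0, 0.1],
                   [0.0, 0.0, 0.0]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])

per_sample = softmax_cross_entropy(logits, labels)
loss = per_sample.mean()   # plays the role of tf.reduce_mean
print(per_sample, loss)
```

The second sample has equal logits for all three classes, so its loss is log(3), the cost of a uniform guess among three options.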
- Next, scroll down to where the session is created and started, as follows:
with tf.Session() as sess:
    sess.run(init)
- We construct a Session in TF as a container of execution for what is called a graph. This is a mathematical graph that describes nodes and connections, not unlike the networks we are simulating. In this style of TF code, every operation needs to be run within a session. Then we run the first operation, init, with run.
- As we have already covered the training in some detail, the next element we will look at is the call to run, shown in the following code:
_, c = sess.run([train_op, loss_op], feed_dict={X: batch_x, Y: batch_y})
- A lot is going on in the run function. We pass in the training and loss operations, train_op and loss_op, as a list, using the current feed_dict dictionary as input. The resultant output value, c, is the cost computed by loss_op for the current batch; the result of train_op is discarded by assigning it to _. Note that the list is given as train_op then loss_op. In this case, the order is train/loss, but it could also be reversed if you choose. You would then need to reverse the output values as well, since the output order matches the input order.
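The build-then-run pattern and the matching of output order to input order can be seen in a much smaller graph. The sketch below is an assumption-laden illustration rather than part of Chapter_1_4.py: it assumes a TensorFlow 2.x install and reaches the placeholder and Session style used in this chapter through tensorflow.compat.v1, and the names product and total are hypothetical.

```python
import numpy as np
import tensorflow.compat.v1 as tf

# Assumption: TF 2.x is installed, so graph mode must be switched on explicitly.
tf.disable_eager_execution()

graph = tf.Graph()
with graph.as_default():
    # Nothing is computed here; we only describe the graph.
    X = tf.placeholder(tf.float32, [None, 3])
    W = tf.Variable(tf.ones([3, 2]))
    product = tf.matmul(X, W)         # hypothetical op name
    total = tf.reduce_sum(product)    # hypothetical op name
    init = tf.global_variables_initializer()

with tf.Session(graph=graph) as sess:
    sess.run(init)  # variables must be initialized before anything else runs
    batch = np.array([[1.0, 2.0, 3.0],
                      [4.0, 5.0, 6.0]], dtype=np.float32)

    # Outputs come back in the same order as the list of requested operations.
    p, t = sess.run([product, total], feed_dict={X: batch})
    t2, p2 = sess.run([total, product], feed_dict={X: batch})

print(p)
print(t)
```

Reversing the list reverses the returned values and nothing else; both calls evaluate the same graph with the same fed-in batch.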
The rest of the code has already been covered in some detail, but it is important to understand some of the key differences when building your models with TF. As you can see, it is relatively easy for us to now build complex neural networks quickly. Yet, we are still missing some critical knowledge that will be useful in constructing more complex networks later. What we have been missing is the underlying math used to train a neural network, which we will explore in the next section.