Linear Transformations
In this topic, we will introduce linear transformations. Linear transformations are the backbone of modeling with ANNs. In fact, much of the computation in ANN modeling can be thought of as a series of linear transformations. The working components of linear transformations are scalars, vectors, matrices, and tensors. Operations such as additions, transpositions, and multiplications are performed on these components.
Scalars, Vectors, Matrices, and Tensors
Scalars, vectors, matrices, and tensors are the actual components of any deep learning model. While they may be simple in principle, a fundamental understanding of how to utilize all types, as well as the operations that can be performed on them, is key to the mathematics of ANNs. Scalars, vectors, and matrices are examples of the general entity known as a tensor, so the term tensor is used throughout this chapter to refer to any of these components. Scalars, vectors, and matrices refer to tensors with a specific number of dimensions. The rank of a tensor is an attribute that gives the number of dimensions the tensor spans. The definitions of each are listed here:
- Scalar: Scalars are single numbers and are an example of zeroth-order tensors.
- Vector: Vectors are 1-dimensional arrays of single numbers and are an example of first-order tensors.
- Matrix: Matrices are rectangular arrays of single numbers that span two dimensions. They are an example of second-order tensors.
- Tensor: Tensors are the general entity that encapsulates scalars, vectors, and matrices. In general, the name is reserved for tensors of order 3 or more.
Figure 2.4 shows examples of a scalar, a vector, a matrix, and a 3-dimensional tensor:
Figure 2.4: A visual representation of scalars, vectors, matrices, and tensors
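As a minimal sketch of these definitions, NumPy's ndim attribute reports the order (rank) of an array. The variable names and values here are illustrative and are not taken from the chapter's exercises.

```python
import numpy as np

# Illustrative values only; these names are not part of the exercises.
scalar = np.array(5)                        # order 0: a single number
vector = np.array([1, 2, 3])                # order 1
matrix = np.array([[1, 2], [3, 4]])         # order 2
tensor = np.array([[[1, 2], [3, 4]],
                   [[5, 6], [7, 8]]])       # order 3

# ndim gives the number of dimensions (the rank) of each array.
ranks = [a.ndim for a in (scalar, vector, matrix, tensor)]
print(ranks)  # [0, 1, 2, 3]
```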
Tensor Addition
Tensors can be added together to create new tensors. We will use the example of matrices in this chapter, but the concept extends to tensors with any rank. Matrices may be added to scalars, vectors, and other matrices under certain conditions.
Two matrices may be added (or subtracted) if they have the same shape. For such matrix-matrix addition, the resultant matrix is determined by element-wise addition of the input matrices. The resultant matrix will therefore have the same shape as the two input matrices. We can define the matrix sum as C = A + B, where each element in C is the sum of the corresponding elements in A and B. Matrix addition is commutative, which means that the order of A and B does not matter: A + B = B + A. Matrix addition is also associative, which means that the same result is achieved regardless of how the additions are grouped: A + (B + C) = (A + B) + C.
The same matrix addition principles apply for scalars, vectors, and tensors. An example is shown in figure 2.5:
Figure 2.5: An example of matrix-matrix addition
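The commutative and associative properties of matrix addition can be checked numerically. The following is a small sketch with made-up 2x2 matrices; the values are assumptions chosen for illustration and do not correspond to those in the figure.

```python
import numpy as np

# Illustrative matrices of the same shape (2 x 2).
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.array([[9, 10], [11, 12]])

total = A + B  # element-wise sum
print(total.tolist())  # [[6, 8], [10, 12]]

# Commutativity: the order of the operands does not matter.
commutative = np.array_equal(A + B, B + A)
# Associativity: the grouping of the additions does not matter.
associative = np.array_equal(A + (B + C), (A + B) + C)
print(commutative, associative)  # True True
```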
Scalars can also be added to matrices. Here, the scalar is added to each element of the matrix individually, as is shown in figure 2.6:
Figure 2.6: An example of matrix-scalar addition
It is possible to add a vector to a matrix if the length of the vector matches the number of columns of the matrix. The vector is then added to each row of the matrix. This is known as broadcasting.
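A short sketch of broadcasting with NumPy follows. The array names and values are illustrative assumptions; the point is that a vector with 3 elements can be added to a matrix with 3 columns, and it is applied to every row.

```python
import numpy as np

mat = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])  # shape (4, 3)
row = np.array([10, 20, 30])                                      # shape (3,)

# The vector is broadcast across the rows: it is added to each row of mat.
result = mat + row
print(result)
# [[11 22 33]
#  [14 25 36]
#  [17 28 39]
#  [20 31 42]]
```

The result keeps the shape of the matrix, (4x3).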
Exercise 6: Perform Various Operations with Vectors, Matrices, and Tensors
Note
For the exercises and activities within this chapter, you will need to have Python 3.6, Jupyter, and NumPy installed on your system. All exercises and activities will be primarily developed in the Jupyter Notebook. It is recommended to keep a separate notebook for different assignments, unless advised not to. The notebooks can be downloaded from the GitHub repository: https://github.com/TrainingByPackt/Applied-Deep-Learning-with-Keras/tree/master/Lesson02.
In this exercise, we are going to demonstrate how to create and work with vectors, matrices, and tensors within Python. We will assume familiarity with scalars. This can all be achieved with the NumPy library using the array and matrix functions. Tensors of any rank can be created with the NumPy array function. Follow these steps to perform this exercise:
- Open the Jupyter Notebook to implement this exercise. Import all the necessary dependencies. We can create a 1-dimensional array, or a vector, as follows:
import numpy as np
vec1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
vec1
The following figure shows the output of the preceding code:
Figure 2.7: Output of the created vector
- We can also create a 2-dimensional array, or matrix, with the array function:
mat1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
mat1
The following figure shows the output of the preceding code:
Figure 2.8: Screenshot of the output of the created matrix
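- We can also use the matrix function to create matrices, which will show a similar output. Note that the NumPy documentation no longer recommends the matrix class for new code and suggests regular arrays instead: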
mat2 = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
- We can create a 3-dimensional array, or tensor, using the array function:
ten1 = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
ten1
The following figure shows the output of the preceding code:
Figure 2.9: Output of the created tensor
- Determining the shape of a given vector, matrix, or tensor is important since certain operations, such as addition and multiplication, can only be applied to components of certain shapes. The shape of an n-dimensional array can be determined using the shape attribute. The following is the code for determining the shape of vec1:
vec1.shape
The following figure shows the output of the preceding code:
Figure 2.10: Output of the shape of the vector
Following is the code for determining the shape of mat1:
mat1.shape
The following figure shows the output of the preceding code:
Figure 2.11: Output of the shape of the matrix
Following is the code for determining the shape of ten1:
ten1.shape
The following figure shows the output of the preceding code:
Figure 2.12: Output of the shape of the tensor
- Matrices can be added or subtracted if the shapes of the matrices are the same. These are the input values for matrix 1:
mat1 = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
mat1
The following figure shows the output of the preceding code:
Figure 2.13: Values in matrix 1
These are the input values for matrix 2:
mat2 = np.matrix([[2, 1, 4], [4, 1, 7], [4, 2, 9], [5, 21, 1]])
mat2
The following figure shows the output of the preceding code:
Figure 2.14: Values in matrix 2
- Here, we will add matrix 1 and matrix 2:
mat3 = mat1 + mat2
mat3
The following figure shows the output of the preceding code:
Figure 2.15: Addition of matrix 1 and matrix 2
- Scalars can be added to arrays as follows:
mat1 + 4
The following figure shows the output of the preceding code:
Figure 2.16: Matrix addition with a scalar
In this exercise, we learned how to perform various operations with vectors, matrices, and tensors. We also learned how to determine the shape of the matrix.
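The exercise demonstrates addition only. As a hedged sketch, subtraction works the same way, and NumPy raises a ValueError when the shapes are incompatible. The names mat_a and mat_b are illustrative and reuse the values from the exercise as plain arrays.

```python
import numpy as np

mat_a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])  # shape (4, 3)
mat_b = np.array([[2, 1, 4], [4, 1, 7], [4, 2, 9], [5, 21, 1]])    # shape (4, 3)

# Element-wise subtraction of two matrices with the same shape.
difference = mat_a - mat_b
print(difference.tolist())
# [[-1, 1, -1], [0, 4, -1], [3, 6, 0], [5, -10, 11]]

# Shapes (4, 3) and (3, 4) do not match, so the addition is rejected.
try:
    mat_a + mat_b.T
    shapes_compatible = True
except ValueError:
    shapes_compatible = False
print(shapes_compatible)  # False
```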
Reshaping
A tensor of any size can be reshaped as long as the number of total elements remains the same. For example, a (4x3) matrix can be reshaped into a (6x2) matrix since they both have a total of 12 elements. The rank, or number of dimensions, can also be changed in the reshaping process. For example, a (4x3) matrix can be reshaped into a (3x2x2) tensor. Here, the rank has changed from 2 to 3. The (4x3) matrix can also be reshaped into a vector of 12 elements, in which case the rank has changed from 2 to 1. Figure 2.17 illustrates tensor reshaping: on the left is a tensor with shape (4x1x3), which can be reshaped to a tensor of shape (4x3). Here, the number of elements (12) has remained constant, though the shape and rank of the tensor have changed.
Figure 2.17: Visual representation of reshaping a (4x1x3) tensor to a (4x3) tensor
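The reshaping shown in the figure can be sketched in NumPy as follows. The values are generated with np.arange purely for illustration and are an assumption, since the figure's actual values are not reproduced here.

```python
import numpy as np

ten = np.arange(1, 13).reshape(4, 1, 3)   # 12 elements, rank 3
mat = ten.reshape(4, 3)                   # the same 12 elements, rank 2
print(ten.shape, ten.ndim, ten.size)      # (4, 1, 3) 3 12
print(mat.shape, mat.ndim, mat.size)      # (4, 3) 2 12

# A target shape with a different number of elements is rejected.
try:
    ten.reshape(5, 3)
    reshape_ok = True
except ValueError:
    reshape_ok = False
print(reshape_ok)  # False
```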
Matrix Transposition
The transpose of a matrix is an operator that flips the matrix over its diagonal. When this occurs, the rows become the columns and vice versa. The transpose operation is usually denoted as a T superscript upon the matrix. Tensors of any rank can also be transposed.
Figure 2.18: A visual representation of matrix transposition
The following figure shows the matrix transposition properties of matrices A and B:
Figure 2.19: Matrix transposition properties where A and B are matrices
A square matrix, that is, a matrix with an equal number of rows and columns, is said to be symmetric if its transpose is equal to the original matrix.
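As a brief sketch, two standard transposition identities and the symmetry condition can be checked with NumPy. The identities chosen here, (Aᵀ)ᵀ = A and (A + B)ᵀ = Aᵀ + Bᵀ, are well-known properties and are not necessarily the exact set shown in the figure; the matrices are illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
B = np.array([[7, 8, 9], [1, 2, 3]])   # shape (2, 3)

# Transposing twice returns the original matrix.
double_transpose = np.array_equal(A.T.T, A)
# The transpose of a sum equals the sum of the transposes.
sum_rule = np.array_equal((A + B).T, A.T + B.T)
print(double_transpose, sum_rule)  # True True

# A symmetric matrix equals its own transpose.
S = np.array([[1, 7, 3], [7, 4, 5], [3, 5, 6]])
is_symmetric = np.array_equal(S, S.T)
print(is_symmetric)  # True
```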
Exercise 7: Matrix Reshaping and Transposition
In this exercise, we are going to demonstrate how to reshape and transpose matrices. This will become important since some operations can only be applied to components if certain tensor dimensions match. For example, tensor multiplication can only be applied if the inner dimensions of the two tensors match. Reshaping or transposition of tensors is one way to modify the dimensions of the tensor to ensure that certain operations can be applied.
- Open a Jupyter notebook from the start menu to implement this exercise. We can create a 2-dimensional array, with 4 rows and 3 columns, as follows:
import numpy as np
mat1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
mat1
We can confirm the shape of the matrix using the shape attribute:
mat1.shape
- We can reshape the array to have 3 rows and 4 columns instead, as follows:
mat2 = np.reshape(mat1, [3,4])
mat2
The following figure shows the output of the preceding code:
Figure 2.20: Matrix reshaping
- We can confirm this by printing the shape:
mat2.shape
The following figure shows the output of the preceding code:
Figure 2.21: Shape of the reshaped matrix
- We can also reshape to 3-dimensional arrays as follows:
mat3 = np.reshape(mat1, [3,2,2])
mat3
The following figure shows the output of the preceding code:
Figure 2.22: Reshaped matrix to a 3-dimensional tensor
The number of dimensions can be confirmed by printing the shape of the array:
mat3.shape
- We can also reshape to a 1-dimensional array as follows:
mat4 = np.reshape(mat1, [12])
mat4
The following figure shows the output of the preceding code:
Figure 2.23: Reshaped matrix to a 1-dimensional tensor
The number of dimensions can be confirmed by printing the shape of the array:
mat4.shape
The following figure shows the output of the preceding code:
Figure 2.24: Shape of the reshaped matrix
- Taking the transpose of an array will flip it across its diagonal. For a 2-dimensional array, or matrix, each row becomes a column and vice versa, so a (1xn) row vector becomes an (nx1) column vector. Note that transposing a 1-dimensional NumPy array leaves it unchanged, since it has only one axis. The transpose of an array can be obtained using the T attribute:
mat = np.matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
mat.T
The following figure shows the output of the preceding code:
Figure 2.25: Visual demonstration of the transpose function
- We can check the shape of the matrix and its transpose to verify that the dimensions have changed.
mat.shape
The following figure shows the output of the preceding code:
Figure 2.26: Shape of a matrix
Here is the code for checking the shape of the transposed matrix:
mat.T.shape
The following figure shows the output of the preceding code:
Figure 2.27: Shape of the matrix transposition
- To reinforce the notion that reshaping and transposing are different, we can see which elements of each array match:
np.reshape(mat1, [3,4]) == mat1.T
The following figure shows the output of the preceding code:
Figure 2.28: The Boolean matrix showing element-wise equivalence
We can see that only the first and last elements match.
In this topic, we have introduced some of the basic components of linear algebra, including scalars, vectors, matrices, and tensors. We also covered some basic manipulation of linear algebra components, such as addition, transposition, and reshaping.
Matrix Multiplication
Matrix multiplication is fundamental to neural network operation. While the rules for addition are simple and intuitive, the rules for multiplication for matrices and tensors are more complex. Matrix multiplication involves more than simple element-wise multiplication of the elements. Rather, a more complicated procedure is implemented that involves the entire row of one matrix and an entire column of the other. We will explain how multiplication works for 2-dimensional tensors, or matrices; however, tensors of higher orders can also be multiplied.
Given a matrix $A = [a_{ij}]_{m \times n}$ and another matrix $B = [b_{ij}]_{n \times p}$, the product of the two matrices is $C = AB = [c_{ij}]_{m \times p}$, and each element $c_{ij}$ is defined element-wise as $c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$. We note that the shape of the resultant matrix is given by the outer dimensions of the matrix product, that is, the number of rows of the first matrix and the number of columns of the second matrix. In order for the multiplication to work, the inner dimensions of the matrix product must match, that is, the number of columns of the first matrix must equal the number of rows of the second matrix. The concept of inner and outer dimensions of matrix multiplication is shown in the following figure:
Figure 2.29: A visual representation of inner and outer dimensions in matrix multiplication
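To make the row-by-column rule concrete, here is a minimal sketch that multiplies an (m x n) matrix by an (n x p) matrix with explicit loops, where each output element is the sum of products of a row of the first matrix and a column of the second. The function name matmul_loops and the example values are hypothetical; in practice NumPy's built-in dot method would be used instead.

```python
import numpy as np

def matmul_loops(a, b):
    """Multiply an (m x n) matrix by an (n x p) matrix, one element at a time."""
    m, n = a.shape
    n_b, p = b.shape
    if n != n_b:
        raise ValueError("inner dimensions must match")
    c = np.zeros((m, p), dtype=np.result_type(a, b))
    for i in range(m):          # rows of the first matrix
        for j in range(p):      # columns of the second matrix
            for k in range(n):  # shared inner dimension
                c[i, j] += a[i, k] * b[k, j]
    return c

A = np.array([[1, 2, 3], [4, 5, 6]])        # (2 x 3)
B = np.array([[7, 8], [9, 10], [11, 12]])   # (3 x 2)
C = matmul_loops(A, B)
print(C)
# [[ 58  64]
#  [139 154]]
print(np.array_equal(C, A.dot(B)))  # True
```

The result has shape (2x2), the outer dimensions of the product, and matches the output of the dot method.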
Unlike matrix addition, matrix multiplication is not commutative, which means that the order of the matrices in the product matters.
Figure 2.30: Matrix multiplication is non-commutative
For example, let's say we have the following two matrices:
Figure 2.31: Two matrices, A and B
One way to construct the product is to have matrix A first, multiplied by B:
Figure 2.32: Visual representation of matrix A multiplied by B, A•B
This results in a 2x2 matrix. Another way to construct the product is to have B first, multiplied by A:
Figure 2.33: Visual representation of matrix B multiplied by A, B•A
Here we can see that the matrix formed from the product BA is a 3x3 matrix and is very different from the matrix formed from the product AB.
Scalar-matrix multiplication is much more straightforward: every element in the matrix is multiplied by the scalar, so that $\lambda A = [\lambda a_{ij}]$, where $\lambda$ is a scalar and A is a matrix.
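The following sketch illustrates both points with NumPy. The matrices are illustrative assumptions with shapes (2x3) and (3x2), matching the shapes implied by the products above but not the values in the figures.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])        # (2 x 3), illustrative values
B = np.array([[7, 8], [9, 10], [11, 12]])   # (3 x 2), illustrative values

AB = A.dot(B)   # outer dimensions give a (2 x 2) result
BA = B.dot(A)   # outer dimensions give a (3 x 3) result
print(AB.shape, BA.shape)  # (2, 2) (3, 3)

# Even for square matrices, the order of the product generally matters.
P = np.array([[1, 2], [3, 4]])
Q = np.array([[0, 1], [1, 0]])
same_product = np.array_equal(P.dot(Q), Q.dot(P))
print(same_product)  # False

# Scalar-matrix multiplication scales every element.
scaled = 2 * A
print(scaled)
# [[ 2  4  6]
#  [ 8 10 12]]
```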
Exercise 8: Matrix Multiplication
In this exercise, we are going to demonstrate how to multiply matrices together:
- Open a Jupyter Notebook from the start menu to implement this exercise.
To demonstrate the fundamentals of matrix multiplication, we will begin with two matrices of the same shape:
import numpy as np
mat1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
mat2 = np.array([[2, 1, 4], [4, 1, 7], [4, 2, 9], [5, 21, 1]])
- Since both matrices have the same shape and they are not square, they cannot be multiplied as is, since the inner dimensions of the product must match. One way we could resolve this is to take the transpose of one of the matrices, then we would be able to perform the multiplication.
We could take the transpose of the second matrix, which would mean that a (4x3) matrix is getting multiplied by a (3x4) matrix. The result would be a (4x4) matrix. Multiplication is performed using the dot method:
mat1.dot(mat2.T)
The following figure shows the output of the preceding code:
Figure 2.34: Matrix multiplication
- We can also take the transpose of the first matrix, which would mean that a (3x4) matrix is getting multiplied by a (4x3) matrix. The result would be a (3x3) matrix:
mat1.T.dot(mat2)
The following figure shows the output of the preceding code:
Figure 2.35: Matrix multiplication by transposing first
- We can also reshape one of the arrays to make sure the inner dimension of the matrix multiplication matches. For example, we can reshape the first array to make it a (3x4) matrix instead of transposing. We note that the result is not the same as with transposing:
np.reshape(mat1, [3,4]).dot(mat2)
The following figure shows the output of the preceding code:
Figure 2.36: The matrix multiplication with matrix reshaping
In the previous exercise, we have learned how to multiply two matrices together. The same concept can be applied to tensors of all ranks, not just second-order tensors. Tensors of different ranks can even be multiplied together if their inner dimensions match. The next exercise demonstrates how to multiply 3-dimensional tensors together.
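As a small sketch of multiplying tensors of different ranks, a (4x3) matrix can be multiplied by a vector of 3 elements, since the inner dimensions match. The vector values are an assumption chosen for illustration.

```python
import numpy as np

mat = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])  # (4 x 3)
vec = np.array([1, 0, 2])                                         # 3 elements

# Each output element is the dot product of one row of mat with vec.
product = mat.dot(vec)
print(product)        # [ 7 16 25 34]
print(product.shape)  # (4,)
```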
Exercise 9: Tensor Multiplication
In this exercise, we are going to apply our knowledge of matrix multiplication to higher-order tensors:
- Open a Jupyter notebook from the start menu to implement this exercise. We will begin by creating a 3-dimensional tensor using the NumPy library and the array function. Import all the necessary dependencies:
import numpy as np
mat1 = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
mat1
The following figure shows the output of the preceding code:
Figure 2.37: A screenshot of the output of the 3-dimensional tensor created
- The shape can be confirmed using the shape attribute:
mat1.shape
This tensor has the shape (2x2x3).
- Now we create a new 3-dimensional tensor that we will be able to multiply the tensor by. We can take the transpose of the original tensor, which for an array with more than two dimensions reverses the order of its axes:
mat2 = mat1.T
mat2
The following figure shows the output of the preceding code:
Figure 2.38: The transpose of the 3-dimensional tensor
- The shape can be confirmed using the shape attribute:
mat2.shape
This tensor has the shape (3x2x2).
- Now we can take the dot product of the two tensors as follows:
mat3 = mat2.dot(mat1)
mat3
The following figure shows the output of the preceding code:
Figure 2.39: Output of the product of the two 3-dimensional tensors
- We can look at the shape of this resultant tensor:
mat3.shape
The following figure shows the output of the preceding code:
Figure 2.40: Output of the shape of the product of tensors
Now we have a 4-dimensional tensor with shape (3x2x2x3).
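To see where the four dimensions come from, the sketch below writes the same contraction explicitly with np.einsum. For arrays with more than two dimensions, the dot method sums over the last axis of the first tensor and the second-to-last axis of the second, and keeps all remaining axes. The names ten_a and ten_b are illustrative stand-ins for the tensors used in the exercise.

```python
import numpy as np

ten_a = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])  # (2, 2, 3)
ten_b = ten_a.T                                                      # (3, 2, 2)

dot_result = ten_b.dot(ten_a)

# The same contraction written out: sum over the shared index l,
# keeping axes (i, j) from ten_b and (k, m) from ten_a.
einsum_result = np.einsum('ijl,klm->ijkm', ten_b, ten_a)

print(dot_result.shape)                           # (3, 2, 2, 3)
print(np.array_equal(dot_result, einsum_result))  # True
```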
In this topic, we have learned how to perform matrix multiplication using the NumPy library in Python. While we do not have to perform the matrix multiplication directly when we create ANNs with Keras, it is nevertheless useful to understand the underlying mathematics.