
Understanding NumPy arrays
You might already know that Python is a dynamically typed language. This means that you do not have to specify a data type whenever you create a new variable. For example, the following will automatically be represented as an integer:
In [5]: a = 5
You can double-check this by typing the following:
In [6]: type(a)
Out[6]: int
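To make this concrete, here is a minimal sketch showing that the type belongs to the value rather than to the variable name, and that Python still refuses to silently combine unrelated types. The names type_before, type_after, and result are introduced here purely for illustration:

```python
# A name can be rebound to values of different types without any declaration.
a = 5
type_before = type(a)   # the value 5 is an int

a = "five"
type_after = type(a)    # the same name now refers to a str

# Python does not implicitly convert between unrelated types:
# adding an int and a str raises a TypeError instead of guessing.
try:
    result = 5 + "5"
except TypeError:
    result = None
```
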
Going a step further, we can create a list of integers using the list() function. Lists are the standard multielement container in Python. The range(x) function generates all integers from 0 up to x-1:
In [7]: int_list = list(range(10))
... int_list
Out[7]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Similarly, we can create a list of strings by telling Python to iterate over all the elements in the integer list, int_list, and apply the str() function to each element:
In [8]: str_list = [str(i) for i in int_list]
... str_list
Out[8]: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
However, lists are not well suited to doing math. Let's say, for example, we wanted to multiply every element in int_list by a factor of 2. A naive approach might be to do the following, but see what happens to the output:
In [9]: int_list * 2
Out[9]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Python created a list whose content is simply all elements of int_list repeated twice; this is not what we wanted!
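For comparison, element-wise doubling is possible with a plain list, but it takes an explicit loop or list comprehension. A minimal sketch (the name doubled_list is introduced here for illustration):

```python
int_list = list(range(10))

# The * operator on a list repeats it, so the result has 20 elements.
repeated_list = int_list * 2

# To double each element, we have to visit every element ourselves.
doubled_list = [2 * i for i in int_list]
```

This works, but every new arithmetic operation needs its own comprehension, which quickly becomes tedious for larger computations.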
This is where NumPy comes in. NumPy has been designed specifically to make array arithmetic in Python easy. We can quickly convert the list of integers into a NumPy array:
In [10]: import numpy as np
... int_arr = np.array(int_list)
... int_arr
Out[10]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Let's see what happens now when we try to multiply every element in the array:
In [11]: int_arr * 2
Out[11]: array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])
Now we got it right! The same works with addition, subtraction, division, and many other operations, all of which are applied element by element.
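A few of these element-wise operations can be sketched as follows. The variable names are illustrative only, and np.arange(10) is used here as a shortcut that yields the same values as np.array(list(range(10))):

```python
import numpy as np

int_arr = np.arange(10)      # array([0, 1, ..., 9])

added = int_arr + 1          # adds 1 to every element
subtracted = int_arr - 1     # subtracts 1 from every element
halved = int_arr / 2         # true division, so the result holds floats
squared = int_arr ** 2       # squares every element
roots = np.sqrt(int_arr)     # NumPy functions also apply element by element
```

Note that dividing an integer array with / produces a floating-point array, just as dividing two Python integers with / produces a float.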
In addition, every NumPy array comes with the following attributes:
- ndim: The number of dimensions
- shape: The size of each dimension
- size: The total number of elements in the array
- dtype: The data type of the array (for example, int, float, string, and so on)
Let's check these attributes for our integer array:
In [12]: print("int_arr ndim: ", int_arr.ndim)
... print("int_arr shape: ", int_arr.shape)
... print("int_arr size: ", int_arr.size)
... print("int_arr dtype: ", int_arr.dtype)
Out[12]: int_arr ndim: 1
... int_arr shape: (10,)
... int_arr size: 10
... int_arr dtype: int64
From these outputs, we can see that our array has only one dimension, which contains ten elements, and that all elements are 64-bit integers. Of course, the default integer type depends on your platform: if you are executing this code on a 32-bit machine, or on Windows with a NumPy version older than 2.0, you might find dtype: int32.
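The same attributes become more informative for arrays with more than one dimension. The following sketch uses a small two-dimensional array and also shows how to request a specific dtype explicitly instead of relying on the platform default. The arrays arr_2d, float_arr, and int32_arr are made up for illustration:

```python
import numpy as np

# A nested list with 2 rows and 3 columns becomes a 2-D array.
arr_2d = np.array([[1, 2, 3], [4, 5, 6]])
dims = arr_2d.ndim       # 2 dimensions
shape = arr_2d.shape     # (2, 3): 2 rows, 3 columns
count = arr_2d.size      # 6 elements in total

# The dtype argument fixes the element type regardless of the platform.
float_arr = np.array([0, 1, 2], dtype=np.float32)
int32_arr = np.arange(10, dtype=np.int32)
```

Passing dtype explicitly is one way to keep results consistent across machines whose default integer size differs.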