SciPy Recipes

Using object arrays to store heterogeneous data

Up to this point, we have considered only arrays that contain native data types, such as floats or integers. If we need an array containing heterogeneous data, we can create an array with arbitrary Python objects as elements, as shown in the following code:

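x = np.array([2.5, 'a string', [2, 4], {'a': 0, 'b': 1}], dtype=object)

Passing dtype=object explicitly is required in NumPy 1.24 and later, which otherwise raises a ValueError because the nested list makes the input ragged. The result is a one-dimensional array with the object data type, as the output shows:

array([2.5, 'a string', list([2, 4]), {'a': 0, 'b': 1}], dtype=object)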

We mentioned that all elements in a NumPy array must be of the same type. In an array of objects, NumPy satisfies this requirement by treating every item as a generic Python object. The original object is returned when an item is accessed, so the conversion is transparent to the user.
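As a minimal sketch of this behavior, the following code checks the array's data type and shape and then inspects the type of each element on access. The variable name element_types is chosen only for illustration.

```python
import numpy as np

# dtype=object is passed explicitly so that the nested list is stored as a
# single element instead of being interpreted as an extra dimension.
x = np.array([2.5, 'a string', [2, 4], {'a': 0, 'b': 1}], dtype=object)

print(x.dtype)   # the common data type of all items
print(x.shape)   # one entry per Python object

# Indexing hands back the original Python objects.
element_types = [type(item) for item in x]
print(element_types)

# Each element keeps its usual methods and behavior.
print(x[1].upper())
print(x[3]['b'])
```

The array reports a single data type, while each individual element still behaves as an ordinary float, string, list, or dictionary.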

If the objects to be contained in the array are not known at construction time, we can create an empty array of objects with the following code:

x = np.empty((2, 2), dtype=object)

The first argument in the call to empty(), (2, 2), specifies the shape of the array, and dtype=object says that we want an array of objects. (The older spelling np.object was an alias for the built-in object and was removed in NumPy 1.24.) The resulting array is not really empty: every entry is set to None. We can then assign arbitrary objects to the entries of x.
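The following sketch illustrates this workflow: it creates the array, confirms that every entry starts out as None, and then fills the entries one at a time. The values assigned and the name initially_none are illustrative only.

```python
import numpy as np

# Create a 2x2 array of objects; the built-in object is the dtype.
x = np.empty((2, 2), dtype=object)

# Every entry of a freshly created object array is None.
initially_none = all(item is None for item in x.flat)
print(initially_none)

# Assign arbitrary Python objects to individual entries.
x[0, 0] = 3.14
x[0, 1] = 'text'
x[1, 0] = [1, 2, 3]
x[1, 1] = {'key': 'value'}

print(x)
```

Assigning with a full set of integer indices, as in x[1, 0] = [1, 2, 3], stores the list itself as a single element and leaves the shape of the array unchanged.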

In a NumPy array of objects, as in Python lists and tuples, the stored values are references to the objects, not copies of the objects themselves.
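The practical consequence of storing references can be seen in a short sketch. The names shared and y are hypothetical and are used here only for illustration.

```python
import numpy as np

shared = [2, 4]

# Store the same list in both entries of an object array.
x = np.empty(2, dtype=object)
x[0] = shared
x[1] = shared

# Both entries refer to the very same list object.
print(x[0] is shared, x[1] is shared)

# Mutating the list is therefore visible through the array.
shared.append(6)
print(x[0])

# copy() duplicates the references, not the objects they point to,
# so the copy still shares the underlying list.
y = x.copy()
y[0].append(8)
print(shared)
```

If independent copies of the stored objects are needed, one option is copy.deepcopy from the standard library, which also copies the objects referenced by the array.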