The __bytes__() method
There are relatively few occasions to transform an object into bytes. We'll look at this in detail in Part 2, Persistence and Serialization.
In the most common situation, an application can create a string representation, and the built-in encoding capabilities of the Python IO classes will be used to transform the string into bytes. This works perfectly for almost all situations. The main exception would be when we're defining a new kind of string. In that case, we'd need to define the encoding of that string.
The bytes() function does a variety of things, depending on the arguments:
- bytes(integer): This returns an immutable bytes object with the given number of 0x00 values.
- bytes(string, encoding): This will encode the given string into bytes. The encoding argument is required when the first argument is a string; an optional errors argument defines how encoding failures are handled.
- bytes(something): This will invoke something.__bytes__() to create a bytes object. The encoding or error arguments will not be used here.
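The three behaviors can be seen side by side in a short sketch. The Tag class here is hypothetical and exists only to show the __bytes__() delegation; it is not part of the Card examples.

```python
# bytes(integer): a zero-filled, immutable buffer of the given length.
zeros = bytes(3)
print(zeros)  # b'\x00\x00\x00'

# bytes(string, encoding): for str input, the encoding must be supplied.
encoded = bytes("héllo", encoding="utf-8")
print(encoded)  # b'h\xc3\xa9llo'


class Tag:
    """Hypothetical class used only to illustrate __bytes__()."""

    def __init__(self, name):
        self.name = name

    def __bytes__(self):
        return self.name.encode("ascii")


# bytes(something): delegates to something.__bytes__().
tagged = bytes(Tag("abc"))
print(tagged)  # b'abc'
```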
The base object class does not define __bytes__(). This means our classes don't provide a __bytes__() method by default.
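A quick check makes the consequence concrete. In this sketch, Plain is a hypothetical class with no __bytes__() method, so bytes() has nothing to delegate to and raises TypeError.

```python
class Plain:
    """Hypothetical class that does not define __bytes__()."""


# The base object class has no __bytes__ attribute to inherit.
has_default = hasattr(object, "__bytes__")

try:
    bytes(Plain())
    outcome = "converted"
except TypeError:
    # Without __bytes__() (or another supported protocol), conversion fails.
    outcome = "TypeError"

print(has_default, outcome)
```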
There are some exceptional cases where we might have an object that will need to be encoded directly into bytes before being written to a file. It's often simpler to work with strings and allow the str type to produce bytes for us. When working with bytes, it's important to note that there's no trivial way to decode bytes from a file or interface. The built-in bytes class can only decode bytes into strings, not into our unique, new objects. We might need to parse the strings that are decoded from the bytes. Or, we might need to explicitly parse the bytes using the struct module and create our unique objects from the parsed values.
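As a rough illustration of the struct route, the following sketch packs and unpacks three fields. The layout (a one-byte class code, an unsigned-byte rank, and a one-byte ASCII suit letter) is an assumption made for this example only; it is not the encoding used for Card later in this section.

```python
import struct

# Hypothetical fixed layout: class code (1 byte), rank (unsigned byte),
# suit letter (1 byte). The "<" prefix disables alignment padding.
CARD_FORMAT = "<cBc"

packed = struct.pack(CARD_FORMAT, b"N", 7, b"S")

# Parsing the bytes yields plain values; building an object from them
# is still our job.
code, rank, suit = struct.unpack(CARD_FORMAT, packed)
print(packed, code, rank, suit)
```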
We'll look at encoding and decoding Card to bytes. As there are only 52 card values, each card could be packed into a single byte. However, we've elected to use a character to represent suit and a character to represent rank. Further, we'll need to properly reconstruct the subclass of Card, so we have to encode several things:
- The subclass of Card (AceCard, NumberCard, FaceCard)
- The parameters to the subclass-defined __init__()
Note that some of our alternative __init__() methods will transform a numeric rank into a string, losing the original numeric value. For the purposes of a reversible byte encoding, we need to reconstruct this original numeric rank value.
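For contrast, the single-byte packing that was set aside above might look like the following sketch. The SUITS ordering, the one-letter suit codes, and both function names are assumptions for illustration; this approach also drops the subclass information, which is one reason the text-based encoding is used instead.

```python
SUITS = "CDHS"  # hypothetical one-letter suit codes, in an arbitrary order


def pack_card(rank, suit):
    """Map a rank in 1..13 and a suit letter to one byte with value 0..51."""
    return bytes([SUITS.index(suit) * 13 + (rank - 1)])


def unpack_card(buffer):
    """Recover (rank, suit) from the single packed byte."""
    suit_index, rank_offset = divmod(buffer[0], 13)
    return rank_offset + 1, SUITS[suit_index]


packed_king = pack_card(13, "S")
print(unpack_card(packed_king))  # (13, 'S')
```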
The following is an implementation of __bytes__(), which returns a UTF-8 encoding of the Card's class, rank, and suit:
    def __bytes__(self):
        # One-letter class code: 'A', 'N', or 'F'.
        class_code = self.__class__.__name__[0]
        # Map the rank string back to the numeric rank that __init__() expects.
        rank_number_str = {'A': '1', 'J': '11', 'Q': '12', 'K': '13'}.get(self.rank, self.rank)
        string = "(" + " ".join([class_code, rank_number_str, self.suit]) + ")"
        return bytes(string, encoding="utf8")
This works by creating a string representation of the Card object and then encoding the string into bytes. This is often the simplest and most flexible approach.
When we are given a pile of bytes, we can decode the string and then parse the string into a new Card object. The following is a function that can be used to create a Card object from bytes:
    def card_from_bytes(buffer):
        string = buffer.decode("utf8")
        # The encoded form is wrapped in parentheses.
        assert string[0] == "(" and string[-1] == ")"
        code, rank_number, suit = string[1:-1].split()
        # Look up the subclass from its one-letter code.
        class_ = {'A': AceCard, 'N': NumberCard, 'F': FaceCard}[code]
        return class_(int(rank_number), suit)
In the preceding code, we've decoded the bytes into a string. Then we've parsed the string into individual values. From those values, we can locate the class and build the original Card object.
We can build a bytes representation of a Card object as follows:
    b = bytes(someCard)
We can reconstruct the Card object from the bytes as follows:
    someCard = card_from_bytes(b)
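To see the round trip run end to end, here is a self-contained sketch. The Card, NumberCard, AceCard, and FaceCard definitions below are simplified stand-ins written for this example, on the assumption that each subclass accepts a numeric rank and stores a string rank; the real class definitions may carry more attributes.

```python
class Card:
    """Simplified stand-in: stores a string rank and a suit character."""

    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def __bytes__(self):
        class_code = self.__class__.__name__[0]
        rank_number_str = {"A": "1", "J": "11", "Q": "12", "K": "13"}.get(
            self.rank, self.rank
        )
        string = "(" + " ".join([class_code, rank_number_str, self.suit]) + ")"
        return bytes(string, encoding="utf8")


class NumberCard(Card):
    def __init__(self, rank, suit):
        super().__init__(str(rank), suit)


class AceCard(Card):
    def __init__(self, rank, suit):
        super().__init__("A", suit)


class FaceCard(Card):
    def __init__(self, rank, suit):
        # The numeric rank is replaced by a letter, as described above.
        super().__init__({11: "J", 12: "Q", 13: "K"}[rank], suit)


def card_from_bytes(buffer):
    string = buffer.decode("utf8")
    assert string[0] == "(" and string[-1] == ")"
    code, rank_number, suit = string[1:-1].split()
    class_ = {"A": AceCard, "N": NumberCard, "F": FaceCard}[code]
    return class_(int(rank_number), suit)


queen = FaceCard(12, "♠")
b = bytes(queen)
restored = card_from_bytes(b)
print(b, type(restored).__name__, restored.rank, restored.suit)
```

The encoded form keeps the numeric rank (12) rather than the letter, which is what lets card_from_bytes() call the subclass __init__() with the value it expects.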
It's important to note that the external bytes representation is often challenging to design. We're creating a representation of the state of an object. Python already has a number of representations that work well for our class definitions.
It's often better to use the pickle or json modules than to invent a low-level bytes representation of an object. This is the subject of Chapter 9, Serializing and Saving – JSON, YAML, Pickle, CSV, and XML.
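As a small preview of that approach, the sketch below serializes a plain dictionary describing a card with both modules. The dictionary keys are assumptions chosen for this example and do not reflect a format defined elsewhere in the book.

```python
import json
import pickle

# Hypothetical state dictionary describing one card.
state = {"class": "FaceCard", "rank": 12, "suit": "♠"}

# JSON produces text, which we then encode to bytes ourselves.
json_bytes = json.dumps(state).encode("utf-8")
from_json = json.loads(json_bytes.decode("utf-8"))

# pickle produces bytes directly.
pickle_bytes = pickle.dumps(state)
from_pickle = pickle.loads(pickle_bytes)

print(from_json == state, from_pickle == state)  # True True
```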