Mastering Objectoriented Python
上QQ阅读APP看书,第一时间看更新

The __bytes__() method

There are relatively few occasions to transform an object into bytes. We'll look at this in detail in Part 2, Persistence and Serialization.

In the most common situation, an application can create a string representation, and the built-in encoding capabilities of the Python IO classes will be used to transform the string into bytes. This works perfectly for almost all situations. The main exception would be when we're defining a new kind of string. In that case, we'd need to define the encoding of that string.

The bytes() function does a variety of things, depending on the arguments:

  • bytes(integer): This returns an immutable bytes object with the given number of 0x00 values.
  • bytes(string): This will encode the given string into bytes. Additional parameters for encoding and error handling will define the details of the encoding process.
  • bytes(something): This will invoke something.__bytes__() to create a bytes object. The encoding or error arguments will not be used here.

The base object class does not define __bytes__(). This means our classes don't provide a __bytes__() method by default.

There are some exceptional cases where we might have an object that will need to be encoded directly into bytes before being written to a file. It's often simpler to work with strings and allow the str type to produce bytes for us. When working with bytes, it's important to note that there's no trivial way to decode bytes from a file or interface. The built-in bytes class will only decode strings, not our unique, new objects. We might need to parse the strings that are decoded from the bytes. Or, we might need to explicitly parse the bytes using the struct module and create our unique objects from the parsed values.

We'll look at encoding and decoding Card to bytes. As there are only 52 card values, each card could be packed into a single byte. However, we've elected to use a character to represent suit and a character to represent rank. Further, we'll need to properly reconstruct the subclass of Card, so we have to encode several things:

  • The subclass of Card (AceCard, NumberCard, FaceCard)
  • The parameters to the subclass-defined __init__()

Note that some of our alternative __init__() methods will transform a numeric rank into a string, losing the original numeric value. For the purposes of a reversible byte encoding, we need to reconstruct this original numeric rank value.

The following is an implementation of __bytes__(), which returns a UTF-8 encoding of the Cards class, rank, and suit:

    def __bytes__( self ):
        class_code= self.__class__.__name__[0]
        rank_number_str = {'A': '1', 'J': '11', 'Q': '12', 'K': '13'}.get( self.rank, self.rank )
        string= "("+" ".join([class_code, rank_number_str, self.suit,] ) + ")"
        return bytes(string,encoding="utf8")

This works by creating a string representation of the Card object and then encoding the string into bytes. This is often the simplest and most flexible approach.

When we are given a pile of bytes, we can decode the string and then parse the string into a new Card object. The following is a method that can be used to create a Card object from bytes:

def card_from_bytes( buffer ):
    string = buffer.decode("utf8")
    assert string[0 ]=="(" and string[-1] == ")"
    code, rank_number, suit = string[1:-1].split()
    class_ = { 'A': AceCard, 'N': NumberCard, 'F': FaceCard }[code]
    return class_( int(rank_number), suit )

In the preceding code, we've decoded the bytes into a string. Then we've parsed the string into individual values. From those values, we can locate the class and build the original Card object.

We can build a bytes representation of a Card object as follows:

b= bytes(someCard)

We can reconstruct the Card object from the bytes as follows:

someCard = card_from_bytes(b)

It's important to note that the external bytes representation is often challenging to design. We're creating a representation of the state of an object. Python already has a number of representations that work well for our class definitions.

It's often better to use the pickle or json modules than to invent a low-level bytes representation of an object. This is the subject of Chapter 9, Serializing and Saving – JSON, YAML, Pickle, CSV, and XML.