Descriptors
A descriptor lets you customize what should be done when you refer to an attribute of an object.
Descriptors are the base of a complex attribute access in Python. They are used internally to implement properties, methods, class methods, static methods, and the super type. They are classes that define how attributes of another class can be accessed. In other words, a class can delegate the management of an attribute to another one.
The descriptor classes are based on three special methods that form the descriptor protocol:
- __set__(self, obj, value): This is called whenever the attribute is set. In the following examples, I will refer to this as a setter.
- __get__(self, obj, owner=None): This is called whenever the attribute is read (referred to as a getter).
- __delete__(self, obj): This is called when del is invoked on the attribute.
A descriptor that implements __get__() and __set__() is called a data descriptor. If it just implements __get__(), then it is called a non-data descriptor.
Methods of this protocol are, in fact, called by the object's special __getattribute__() method (do not confuse it with __getattr__(), which has a different purpose) on every attribute lookup. Whenever such a lookup is performed, either by using a dotted notation in the form of instance.attribute, or by using the getattr(instance, 'attribute') function call, the __getattribute__() method is implicitly invoked and it looks for an attribute in the following order:
- It verifies whether the attribute is a data descriptor on the class object of the instance.
- If not, it looks to see whether the attribute can be found in the __dict__ lookup of the instance object.
- Finally, it looks to see whether the attribute is a non-data descriptor on the class object of the instance.
In other words, data descriptors take precedence over __dict__ lookup, and __dict__ lookup takes precedence over non-data descriptors.
To make it clearer, here is an example from the official Python documentation that shows how descriptors work on real code:
class RevealAccess(object): """A data descriptor that sets and returns values normally and prints a message logging their access. """ def __init__(self, initval=None, name='var'): self.val = initval self.name = name def __get__(self, obj, objtype): print('Retrieving', self.name) return self.val def __set__(self, obj, val): print('Updating', self.name) self.val = val class MyClass(object): x = RevealAccess(10, 'var "x"') y = 5
Here is an example of using it in the interactive session:
>>> m = MyClass() >>> m.x Retrieving var "x" 10 >>> m.x = 20 Updating var "x" >>> m.x Retrieving var "x" 20 >>> m.y 5
The preceding example clearly shows that, if a class has the data descriptor for the given attribute, then the descriptor's __get__() method is called to return the value every time the instance attribute is retrieved, and __set__() is called whenever a value is assigned to such an attribute. Although the case for the descriptor's __del__ method is not shown in the preceding example, it should be obvious now: it is called whenever an instance attribute is deleted with the del instance.attribute statement or the delattr(instance, 'attribute') call.
The difference between data and non-data descriptors is important for the reasons highlighted at the beginning of the section. Python already uses the descriptor protocol to bind class functions to instances as methods. They also power the mechanism behind the classmethod and staticmethod decorators. This is because, in fact, the function objects are non-data descriptors too:
>>> def function(): pass >>> hasattr(function, '__get__') True >>> hasattr(function, '__set__') False
This is also true for functions created with lambda expressions:
>>> hasattr(lambda: None, '__get__') True >>> hasattr(lambda: None, '__set__') False
So, without __dict__ taking precedence over non-data descriptors, we would not be able to dynamically override specific methods on already constructed instances at runtime. Fortunately, thanks to how descriptors work in Python, it is possible; so, developers may use a popular technique called monkey patching to change the way in which instances work without the need for subclassing.