In Python, you use the dot (.) operator to access attributes of an object.
Normally, that isn’t something you have to give much thought to.
However, when you want to customize what happens during attribute access,
things can get complicated.
In this post, we’ll see how to control what happens when you use the dot operator.
Before we can talk about customizing attribute access,
we need to discuss two related topics:
class and instance attributes, and descriptors.
If both of those are familiar to you,
feel free to skip ahead.
Class and instance attributes
There are two kinds of attributes in Python:
class and instance attributes.
In the following class,
volume is a class attribute,
and line is an instance attribute.
class Speaker:
    volume = "low"

    def __init__(self, line):
        self.line = line

    def speak(self):
        # read the instance attribute via self; a bare `line` would raise NameError
        return self.line if self.volume == "low" else self.line.upper()
Each instance of the class has its own line attribute.
It’s easy to see why calling a.speak() below returns 'hello'
while b.speak() returns 'goodbye':
>>> a = Speaker("hello")
>>> b = Speaker("goodbye")
>>> a.speak()
'hello'
>>> b.speak()
'goodbye'
The class attribute volume is shared by all instances.
If you change its value, that change is visible to both a and b:
>>> Speaker.volume = "high"
>>> a.speak()
'HELLO'
>>> b.speak()
'GOODBYE'
Each instance has an instance dictionary where instance
attributes are stored:
>>> a.__dict__
{'line': 'hello'}
>>> b.__dict__
{'line': 'goodbye'}
Class attributes are stored in a class dictionary:
>>> Speaker.__dict__
mappingproxy({'__module__': '__main__',
'volume': 'low',
'__init__': <function __main__.Speaker.__init__(self, line)>,
'speak': <function __main__.Speaker.speak(self)>,
'__dict__': <attribute '__dict__' of 'Speaker' objects>,
'__weakref__': <attribute '__weakref__' of 'Speaker' objects>,
'__doc__': None})
It’s important to remember this distinction between
instance and class attributes (and where they are stored).
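To illustrate why the storage location matters, here is a minimal sketch that reuses a trimmed-down Speaker class (without speak()). It assigns volume through an instance, which adds an entry to that instance's dictionary and shadows the class attribute for that instance only.

```python
class Speaker:
    volume = "low"  # class attribute, stored in Speaker.__dict__

    def __init__(self, line):
        self.line = line  # instance attribute, stored in self.__dict__


a = Speaker("hello")
b = Speaker("goodbye")

# Assigning through an instance does not touch the class attribute;
# it adds a 'volume' entry to a.__dict__ instead.
a.volume = "high"

print(a.__dict__)      # {'line': 'hello', 'volume': 'high'}
print(b.__dict__)      # {'line': 'goodbye'}
print(Speaker.volume)  # low
print(b.volume)        # low (still read from the class dictionary)
```

Only a sees the new value; b and the class itself are unaffected.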
Descriptors
Descriptors are an important concept related to attribute access.
A descriptor is a class that defines one or more of the following methods:
__get__(), __set__(), or __delete__().
Below is a simple descriptor class. Its __get__() method always returns 0.
class ZeroAttribute:
    """
    Attribute that is always 0
    """

    def __get__(self, obj, owner=None):
        return 0
Descriptors only take effect when they are assigned to class variables:
class Foo:
    x = ZeroAttribute()
Accessing Foo.x will run its __get__() method:
>>> Foo.x # calls ZeroAttribute.__get__()
0
Even though x was defined like a class attribute,
you can also access x as an instance attribute,
which also invokes the ZeroAttribute.__get__() method:
>>> a = Foo()
>>> a.x # calls ZeroAttribute.__get__()
0
Arguments to the __get__() method
The __get__() method accepts two arguments, obj and owner.
- If the __get__() method is called by accessing a class attribute, obj is set to None, and owner is set to the class.
- If the __get__() method is called by accessing an instance attribute, obj is set to the instance, and owner is set to the type of the instance.
This lets you specify different behaviour
for class attribute access and instance attribute access.
For example, if you explicitly don’t want to allow class attribute access,
you can do something like this:
class ZeroAttribute:
    """
    Attribute that is always 0
    """

    def __get__(self, obj, owner=None):
        if obj is None:
            raise AttributeError()  # don't allow accessing as a class attribute
        return 0

class Foo:
    x = ZeroAttribute()
>>> Foo.x # accessing as class attribute, will raise
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 7, in __get__
AttributeError
>>>
>>> a = Foo()
>>> a.x # accessing as instance attribute, OK
0
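To see the two arguments directly, the following sketch uses a hypothetical descriptor, ShowArguments (not part of the examples above), whose __get__() simply returns whatever it was called with.

```python
class ShowArguments:
    """Hypothetical descriptor that returns the arguments passed to __get__()."""

    def __get__(self, obj, owner=None):
        return (obj, owner)


class Foo:
    x = ShowArguments()


foo = Foo()

obj, owner = Foo.x  # class attribute access
print(obj is None, owner is Foo)  # True True

obj, owner = foo.x  # instance attribute access
print(obj is foo, owner is Foo)  # True True
```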
Data descriptors and non-data descriptors
A descriptor that only defines __get__() is called a non-data descriptor.
A descriptor that also defines __set__() or __delete__() is called a data descriptor.
As you’ll see in the next section, this is an important difference to remember.
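One way to tell the two kinds apart is to check which methods an object's type defines. The helper names below (is_data_descriptor, is_non_data_descriptor) are made up for this sketch. It also shows that plain functions are non-data descriptors, while property objects are data descriptors.

```python
def is_data_descriptor(attr):
    """True if the type of attr defines __set__() or __delete__()."""
    attr_type = type(attr)
    return hasattr(attr_type, "__set__") or hasattr(attr_type, "__delete__")


def is_non_data_descriptor(attr):
    """True if the type of attr defines __get__() but neither __set__() nor __delete__()."""
    return hasattr(type(attr), "__get__") and not is_data_descriptor(attr)


class ZeroAttribute:
    def __get__(self, obj, owner=None):
        return 0


def plain_function():
    pass


print(is_non_data_descriptor(ZeroAttribute()))  # True
print(is_non_data_descriptor(plain_function))   # True: functions only define __get__()
print(is_data_descriptor(property()))           # True: property defines __set__() and __delete__()
print(is_data_descriptor(42))                   # False: an int is not a descriptor at all
```

The standard library's inspect.isdatadescriptor() performs a similar check, so in real code you would probably reach for that instead.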
Customizing attribute access
Now that you know about class and instance attributes,
and a little bit about descriptors,
you’re ready to understand how attribute access really works in Python,
and how to customize it.
When you write x.y, the x.__getattribute__() method is invoked.
The default implementation of __getattribute__() does the following:
- First, it checks if y is a data descriptor defined on the type of x. If so, it returns the result of its __get__() method.
- Next, it tries to find 'y' in the instance dictionary of x and return it.
- Next, it checks if y is a non-data descriptor defined on the type of x. If so, it returns the result of its __get__() method.
- Next, it tries to find 'y' in the class dictionary of the type of x and return it.
- Finally, if none of the above worked, it raises an AttributeError.
If __getattribute__() raises an AttributeError,
x.__getattr__() is called if it is defined.
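The lookup order described above can be sketched in pure Python. This is a simplified model for instances only, not the real C implementation, and the names find_in_mro, sketch_getattribute and sketch_dot are invented for this illustration.

```python
_MISSING = object()  # sentinel meaning "not found on the class"


def find_in_mro(cls, name):
    """Look for name in the class dictionaries of cls and its base classes."""
    for klass in cls.__mro__:
        if name in vars(klass):
            return vars(klass)[name]
    return _MISSING


def sketch_getattribute(x, name):
    """Simplified model of the default __getattribute__() for an instance x."""
    cls = type(x)
    cls_attr = find_in_mro(cls, name)
    attr_type = type(cls_attr)
    found = cls_attr is not _MISSING
    has_get = found and hasattr(attr_type, "__get__")
    is_data = found and (
        hasattr(attr_type, "__set__") or hasattr(attr_type, "__delete__")
    )

    # 1. data descriptors win over everything else
    if has_get and is_data:
        return attr_type.__get__(cls_attr, x, cls)

    # 2. the instance dictionary (bypass any overridden __getattribute__)
    try:
        instance_dict = object.__getattribute__(x, "__dict__")
    except AttributeError:
        instance_dict = {}
    if name in instance_dict:
        return instance_dict[name]

    # 3. non-data descriptors
    if has_get:
        return attr_type.__get__(cls_attr, x, cls)

    # 4. plain class attributes
    if found:
        return cls_attr

    # 5. nothing worked
    raise AttributeError(name)


def sketch_dot(x, name):
    """Model of x.<name>: fall back to __getattr__() if it is defined."""
    try:
        return sketch_getattribute(x, name)
    except AttributeError:
        fallback = find_in_mro(type(x), "__getattr__")
        if fallback is _MISSING:
            raise
        return fallback(x, name)
```

With this model, an instance dictionary entry named after a property is ignored (the property is a data descriptor), whereas an entry named after a method shadows the method (functions are non-data descriptors).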
So you can control the way attribute access works in a few different ways:
- Override __getattribute__()
- Write a __getattr__() method
- Make the attribute a descriptor object
Let’s look at some examples that will help you understand when to use which.
Example 1: returning None when an attribute isn’t found
By default, accessing an attribute that doesn’t exist gives you an AttributeError:
>>> class MyClass:
...     def __init__(self, x):
...         self.x = x
...
>>> obj = MyClass(42)
>>> obj.x
42
>>> obj.y
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'MyClass' object has no attribute 'y'
Let’s say we want to get None (or some other custom behaviour)
when we access non-existent attributes.
Overriding __getattribute__() - the wrong way
The first approach we might consider is overriding __getattribute__().
Try to spot the problem with the code below:
class MyClass:
    def __init__(self, x):
        self.x = x

    def __getattribute__(self, name):
        try:
            return self.__dict__[name]
        except KeyError:
            return None
If we try to access an attribute of a MyClass instance
(whether or not it exists),
we get a RecursionError!
>>> obj = MyClass(42)
>>> obj.y
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 6, in __getattribute__
File "<stdin>", line 6, in __getattribute__
File "<stdin>", line 6, in __getattribute__
[Previous line repeated 996 more times]
RecursionError: maximum recursion depth exceeded
The problem is this line in MyClass.__getattribute__(),
which itself calls MyClass.__getattribute__():
return self.__dict__[name]  # calls self.__getattribute__("__dict__") - infinite recursion!
Overriding __getattribute__() - the less wrong, but still wrong, way
To prevent MyClass.__getattribute__() from calling itself recursively,
we could use the implementation of __getattribute__() from the base class, object, instead.
If that fails, we return None:
class MyClass:
    def __init__(self, x):
        self.x = x

    def __getattribute__(self, name):
        try:
            return object.__getattribute__(self, name)
        except AttributeError:
            return None
This gives us the behaviour we want:
>>> obj = MyClass(42)
>>> obj.x
42
>>> obj.y # no error, returns None
>>>
Using __getattr__ - the right approach
The more elegant solution is to write a __getattr__() method.
Recall that the default implementation of __getattribute__()
will raise AttributeError when it doesn’t find an attribute.
When that happens, __getattr__() is called:
class MyClass:
    def __init__(self, x):
        self.x = x

    def __getattr__(self, name):
        return None
>>> obj = MyClass(42)
>>> obj.x # OK - __getattribute__ will find attribute 'x'
42
>>> obj.y # OK - __getattr__ is invoked and returns None
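One consequence of a catch-all __getattr__() is worth keeping in mind: because every lookup now succeeds, hasattr() reports True for any name, and the default argument of getattr() is never used. The sketch below repeats the class from above to demonstrate this.

```python
class MyClass:
    def __init__(self, x):
        self.x = x

    def __getattr__(self, name):
        # only called when the normal lookup raises AttributeError
        return None


obj = MyClass(42)

print(obj.x)                                # 42
print(obj.anything)                         # None
print(hasattr(obj, "anything"))             # True
print(getattr(obj, "anything", "default"))  # None
```

If that matters for your use case, one option is to return None only for the names you expect and raise AttributeError for everything else.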
Example 2 - an attribute that updates itself when accessed
As another example,
consider writing an attribute that is updated every time you access it:
>>> obj = MyClass()
>>> obj.x
0
>>> obj.x
1
>>> obj.x
2
Using a non-data descriptor
When you want to control the access behaviour of a specific attribute,
a descriptor is generally the right tool for the job.
The IncrementingAttribute.__get__() method below returns 0 the first time
it is called for an instance.
Subsequently, it returns 1, 2, 3, etc.
It does this by storing an internal attribute _value in the instance.
class IncrementingAttribute:
    def __get__(self, obj, owner=None):
        # don't allow accessing as a class attribute
        if obj is None:
            raise AttributeError()
        # if accessing for the first time, return 0,
        # otherwise, increment by 1 and return the result
        if not hasattr(obj, "_value"):
            obj._value = -1
        obj._value += 1
        return obj._value

class MyClass:
    x = IncrementingAttribute()
The attribute x is updated each time it is accessed:
>>> obj = MyClass()
>>> obj.x
0
>>> obj.x
1
What happens if we “reset” the value of x and then try to access it?
>>> obj.x
1
>>> obj.x = 0 # reset to 0
>>> obj.x
0
>>> obj.x
0
>>> obj.x
0
Oh no! x has somehow lost the ability to update itself.
To understand why,
it’s first important to understand what happens when the following line is executed:
obj.x = 0
Because x does not define a __set__() method,
this line will fall back to the default behaviour of setting an attribute.
That is, it will add an entry 'x' to the instance dictionary of obj.
You can see this by inspecting the __dict__ attribute before and after
setting the attribute x:
>>> obj = MyClass()
>>> obj.x
0
>>> obj.__dict__
{'_value': 0}
>>> obj.x
1
>>> obj.__dict__
{'_value': 1}
>>> obj.x = 1 # adds entry 'x' to obj.__dict__
>>> obj.__dict__
{'_value': 1, 'x': 1}
Recall that x is a non-data descriptor, that is,
it only defines a __get__() method.
Because __getattribute__() looks in the instance dictionary before
checking for non-data descriptors,
it finds x in the instance dictionary and returns that.
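As a quick check of this explanation, the sketch below repeats the non-data descriptor and then deletes the shadowing entry with del obj.x. Once the 'x' entry is gone from the instance dictionary, lookup reaches the descriptor again and counting resumes from the stored _value.

```python
class IncrementingAttribute:
    def __get__(self, obj, owner=None):
        if obj is None:
            raise AttributeError()
        if not hasattr(obj, "_value"):
            obj._value = -1
        obj._value += 1
        return obj._value


class MyClass:
    x = IncrementingAttribute()


obj = MyClass()
print(obj.x)         # 0
print(obj.x)         # 1

obj.x = 0            # stored in obj.__dict__, shadows the descriptor
print(obj.x)         # 0
print(obj.x)         # 0

del obj.x            # removes the 'x' entry from obj.__dict__
print(obj.__dict__)  # {'_value': 1}
print(obj.x)         # 2
```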
Using a data descriptor
The solution to the above problem is to also define a __set__()
method in our descriptor class:
class IncrementingAttribute:
    def __get__(self, obj, owner=None):
        # don't allow accessing as a class attribute
        if obj is None:
            raise AttributeError()
        # if accessing for the first time, return 0,
        # otherwise, increment by 1 and return the result
        if not hasattr(obj, "_value"):
            obj._value = -1
        obj._value += 1
        return obj._value

    def __set__(self, obj, value):
        # store one less than the new value, because __get__() increments before returning
        obj._value = value - 1

class MyClass:
    x = IncrementingAttribute()
Now, we get the expected reset behaviour:
>>> obj = MyClass()
>>> obj.x
0
>>> obj.x
1
>>> obj.x = 0
>>> obj.x
0
>>> obj.x
1
What about property?
You could also use a property to achieve the same result:
class MyClass:
    @property
    def x(self):
        if not hasattr(self, "_value"):
            self._value = -1
        self._value += 1
        return self._value

    @x.setter
    def x(self, value):
        self._value = value - 1
In fact, @property is really just a way to define a (data) descriptor!
So if you find yourself writing properties that all look the same
(either in the same class or across different classes),
that’s a sign that you should write a descriptor instead.
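As a rough sketch of what such a reusable descriptor could look like, the variant below uses the __set_name__() hook (available since Python 3.6) so that each attribute gets its own storage slot on the instance. The Counters class and the "_" + name storage convention are assumptions made for this illustration.

```python
class IncrementingAttribute:
    def __set_name__(self, owner, name):
        # called once when the owning class is created;
        # store each counter under its own private name, e.g. '_hits'
        self._storage_name = "_" + name

    def __get__(self, obj, owner=None):
        if obj is None:
            raise AttributeError("only accessible on instances")
        # start at -1 so that the first access returns 0
        value = getattr(obj, self._storage_name, -1) + 1
        setattr(obj, self._storage_name, value)
        return value

    def __set__(self, obj, value):
        # store one less, because __get__() increments before returning
        setattr(obj, self._storage_name, value - 1)


class Counters:
    hits = IncrementingAttribute()
    misses = IncrementingAttribute()


c = Counters()
print(c.hits)    # 0
print(c.hits)    # 1
print(c.misses)  # 0 (independent of hits)
c.hits = 10
print(c.hits)    # 10
```

Writing the same behaviour with @property would mean repeating the getter and setter once per attribute.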
Further reading
If you’re looking for a detailed “Introduction to descriptors”,
or examples of how descriptors can be used,
see the Descriptor HowTo by Raymond Hettinger.
It’s one of my favourite parts of the official Python docs!