Ashwin Srinath

__get__ it right: controlling attribute access in Python

In Python, you use the dot (.) operator to access attributes of an object. Normally, that isn’t something you have to give much thought to. However, when you want to customize what happens during attribute access, things can get complicated.

In this post, we’ll see how to control what happens when you use the dot operator. Before we can talk about customizing attribute access, we need to discuss two related topics: class and instance attributes, and descriptors. If both of those are familiar to you, feel free to skip ahead.

Class and instance attributes

There are two kinds of attributes in Python: class and instance attributes. In the following class, volume is a class attribute, and line is an instance attribute.

class Speaker:
    volume = "low"
    
    def __init__(self, line):
        self.line = line
    
    def speak(self):
        return self.line if self.volume == "low" else self.line.upper()

Each instance of the class has its own line attribute. It’s easy to see why calling a.speak() below returns "hello" while b.speak() returns "goodbye":

>>> a = Speaker("hello")
>>> b = Speaker("goodbye")
>>> a.speak()
'hello'
>>> b.speak()
'goodbye'

The class attribute volume is shared by all instances. If you change its value, that change is visible to both a and b:

>>> Speaker.volume = "high"
>>> a.speak()
'HELLO'
>>> b.speak()
'GOODBYE'

Each instance has an instance dictionary where instance attributes are stored:

>>> a.__dict__
{'line': 'hello'}
>>> b.__dict__
{'line': 'goodbye'}

Class attributes are stored in a class dictionary. For a freshly defined Speaker class (before volume is changed to "high"), it looks like this:

>>> Speaker.__dict__
mappingproxy({'__module__': '__main__',
              'volume': 'low',
              '__init__': <function __main__.Speaker.__init__(self, line)>,
              'speak': <function __main__.Speaker.speak(self)>,
              '__dict__': <attribute '__dict__' of 'Speaker' objects>,
              '__weakref__': <attribute '__weakref__' of 'Speaker' objects>,
              '__doc__': None})

It’s important to remember this distinction between instance and class attributes (and where they are stored).
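
As a small sketch of why the storage location matters (the Speaker class is repeated so the snippet runs on its own): assigning to volume through an instance does not change the class attribute. It adds an entry to that instance's dictionary, which then takes precedence for that instance only.

```python
class Speaker:
    volume = "low"

    def __init__(self, line):
        self.line = line

    def speak(self):
        return self.line if self.volume == "low" else self.line.upper()


a = Speaker("hello")
b = Speaker("goodbye")

# Assigning through the instance stores 'volume' in a.__dict__ only.
a.volume = "high"

print(a.__dict__)        # {'line': 'hello', 'volume': 'high'}
print(b.__dict__)        # {'line': 'goodbye'}
print(Speaker.volume)    # low
print(a.speak())         # HELLO
print(b.speak())         # goodbye
```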

Descriptors

Descriptors are an important concept related to attribute access. A descriptor is an object whose class defines one or more of the following methods:

  1. __get__()
  2. __set__()
  3. __delete__()

Below is a simple descriptor class. Its __get__() method always returns 0.

class ZeroAttribute:
    """
    Attribute that is always 0
    """
    def __get__(self, obj, owner=None):
        return 0

Descriptors are only useful as class variables:

class Foo:
    x = ZeroAttribute()

Accessing Foo.x will run its __get__() method:

>>> Foo.x  # calls ZeroAttribute.__get__()
0

Even though x was defined like a class attribute, you can also access x as an instance attribute, which also invokes the ZeroAttribute.__get__() method:

>>> a = Foo()
>>> a.x  # calls ZeroAttribute.__get__()
0
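
Roughly speaking, both lookups above amount to calling __get__() by hand on the descriptor object stored in the class dictionary. The sketch below spells that out, and also shows that a descriptor placed in an instance dictionary is not invoked (the attribute name y is made up for this illustration).

```python
class ZeroAttribute:
    """Attribute that is always 0"""
    def __get__(self, obj, owner=None):
        return 0


class Foo:
    x = ZeroAttribute()


a = Foo()

# The descriptor object itself lives in the class dictionary.
descriptor = Foo.__dict__["x"]
print(type(descriptor).__name__)        # ZeroAttribute

# Foo.x is roughly descriptor.__get__(None, Foo)
print(descriptor.__get__(None, Foo))    # 0

# a.x is roughly descriptor.__get__(a, Foo)
print(descriptor.__get__(a, Foo))       # 0

# Stored in an instance dictionary, a descriptor is returned as-is
# and its __get__() method is not called.
a.__dict__["y"] = ZeroAttribute()
print(isinstance(a.y, ZeroAttribute))   # True
```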

Arguments to the __get__() method

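The __get__() method accepts two arguments, obj and owner. When the attribute is accessed through an instance, obj is that instance and owner is its class. When the attribute is accessed through the class itself, obj is None and owner is the class.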

This lets you specify different behaviour for class attribute access and instance attribute access. For example, if you explicitly don’t want to allow class attribute access, you can do something like this:

class ZeroAttribute:
    """
    Attribute that is always 0
    """
    def __get__(self, obj, owner=None):
        if obj is None:
            raise AttributeError()  # don't allow accessing as a class attribute
        return 0

class Foo:
    x = ZeroAttribute()

>>> Foo.x  # accessing as class attribute, will raise
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in __get__
AttributeError
>>>
>>> a = Foo()
>>> a.x  # accessing as instance attribute, OK
0
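
To see exactly what __get__() receives in each case, here is a minimal sketch using a hypothetical descriptor, ShowArguments, that simply returns its arguments:

```python
class ShowArguments:
    """Hypothetical descriptor that returns the arguments passed to __get__()."""
    def __get__(self, obj, owner=None):
        return (obj, owner)


class Foo:
    x = ShowArguments()


a = Foo()

# Access through the class: obj is None, owner is the class.
obj, owner = Foo.x
print(obj is None, owner is Foo)   # True True

# Access through an instance: obj is the instance, owner is its class.
obj, owner = a.x
print(obj is a, owner is Foo)      # True True
```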

Data descriptors and non-data descriptors

A descriptor that only defines __get__() is called a non-data descriptor. A descriptor that also defines __set__() or __delete__() is called a data descriptor. As you’ll see in the next section, this is an important difference to remember.
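
If you are ever unsure which kind of descriptor you are dealing with, the standard library's inspect.isdatadescriptor() can tell you. The classes NonData and Data below are made-up examples for this sketch:

```python
import inspect


class NonData:
    # Only __get__(): a non-data descriptor.
    def __get__(self, obj, owner=None):
        return 0


class Data:
    # __get__() and __set__(): a data descriptor.
    def __get__(self, obj, owner=None):
        return 0

    def __set__(self, obj, value):
        raise AttributeError("read-only")


print(inspect.isdatadescriptor(NonData()))    # False
print(inspect.isdatadescriptor(Data()))       # True
print(inspect.isdatadescriptor(property()))   # True
```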

Customizing attribute access

Now that you know about class and instance attributes, and a little bit about descriptors, you’re ready to understand how attribute access really works in Python, and how to customize it.

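When you write x.y, the __getattribute__() method of x is invoked with the name "y". The default implementation, object.__getattribute__(), does the following:

  1. If y is a data descriptor defined on the class of x (or one of its base classes), it calls that descriptor’s __get__() method and returns the result.
  2. Otherwise, if y is in the instance dictionary x.__dict__, it returns that value.
  3. Otherwise, if y is a non-data descriptor defined on the class, it calls its __get__() method and returns the result.
  4. Otherwise, if y is an ordinary class attribute, it returns that value.
  5. Otherwise, it raises an AttributeError.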

If __getattribute__() raises an AttributeError, x.__getattr__() is called if it is defined.
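
The lookup that the default __getattribute__() performs, together with the __getattr__() fallback, can be approximated in pure Python. The sketch below is a simplification (it ignores metaclasses and other details), and the helper names find_in_mro and lookup are made up for this post. The order it follows is: data descriptors on the class, then the instance dictionary, then non-data descriptors and plain class attributes, then __getattr__().

```python
_MISSING = object()


def find_in_mro(cls, name):
    """Return the raw class-level value for name, without invoking descriptors."""
    for base in cls.__mro__:
        if name in vars(base):
            return vars(base)[name]
    return _MISSING


def lookup(obj, name):
    """Approximate the default lookup for obj.name (simplified sketch)."""
    cls = type(obj)
    cls_value = find_in_mro(cls, name)
    value_type = type(cls_value)
    has_get = hasattr(value_type, "__get__")
    is_data = hasattr(value_type, "__set__") or hasattr(value_type, "__delete__")

    # 1. A data descriptor on the class wins over everything else.
    if cls_value is not _MISSING and has_get and is_data:
        return value_type.__get__(cls_value, obj, cls)
    # 2. Next, the instance dictionary.
    if name in getattr(obj, "__dict__", {}):
        return obj.__dict__[name]
    # 3. Then a non-data descriptor on the class.
    if cls_value is not _MISSING and has_get:
        return value_type.__get__(cls_value, obj, cls)
    # 4. Then a plain class attribute.
    if cls_value is not _MISSING:
        return cls_value
    # 5. Finally, fall back to __getattr__() if the class defines one.
    fallback = find_in_mro(cls, "__getattr__")
    if fallback is not _MISSING:
        return fallback(obj, name)
    raise AttributeError(name)


class Demo:
    kind = "class attribute"

    def __init__(self):
        self.kind = "instance attribute"

    @property
    def size(self):
        return 3

    def __getattr__(self, name):
        return None


d = Demo()
d.__dict__["size"] = 99      # ignored: property is a data descriptor

print(lookup(d, "size"))     # 3
print(lookup(d, "kind"))     # instance attribute
print(lookup(d, "missing"))  # None
```

For each of these names, lookup(d, name) gives the same result as the real d.name.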

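So you can control the way attribute access works in a few different ways:

  1. Override __getattribute__() to change what happens on every attribute access.
  2. Define __getattr__() to handle only the attributes that the normal lookup fails to find.
  3. Use a descriptor to control access to one specific attribute.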

Let’s look at some examples that will help you understand when to use which.

Example 1: returning None when an attribute isn’t found

By default, accessing an attribute that doesn’t exist gives you an AttributeError:

>>> class MyClass: 
...     def __init__(self, x):
...         self.x = x
...
>>> obj = MyClass(42)
>>> obj.x
42
>>> obj.y
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'MyClass' object has no attribute 'y'

Let’s say we want to get None (or some other custom behaviour) when we access non-existent attributes.

Overriding __getattribute__() - the wrong way

The first approach we might consider is overriding __getattribute__(). Try to spot the problem with the code below:

class MyClass:
    def __init__(self, x):
        self.x = x
        
    def __getattribute__(self, name):
        try:
            return self.__dict__[name]
        except KeyError:
            return None

If we try to access any attribute of a MyClass instance, whether it exists or not, we get a RecursionError!

>>> obj = MyClass(42)
>>> obj.y
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 6, in __getattribute__
  File "<stdin>", line 6, in __getattribute__
  File "<stdin>", line 6, in __getattribute__
  [Previous line repeated 996 more times]
RecursionError: maximum recursion depth exceeded

The problem is this line in MyClass.__getattribute__(), which itself calls MyClass.__getattribute__():

return self.__dict__[name]  # calls self.__getattribute__("__dict__") - infinite recursion!

Overriding __getattribute__() - the less wrong, but still wrong, way

To prevent MyClass.__getattribute__() from calling itself recursively, we could delegate to the base class implementation, object.__getattribute__(), instead. If that fails, we return None:

class MyClass:
    def __init__(self, x):
        self.x = x
        
    def __getattribute__(self, name):
        try:
            return object.__getattribute__(self, name)
        except AttributeError:
            return None

This gives us the behaviour we want:

>>> obj = MyClass(42)
>>> obj.x
42
>>> obj.y  # no error, returns None
>>>

Using __getattr__ - the right approach

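The more elegant solution is to write a __getattr__() method. Unlike __getattribute__(), which intercepts every attribute access, __getattr__() is called only when the normal lookup fails. Recall that the default implementation of __getattribute__() will raise AttributeError when it doesn’t find an attribute. When that happens, __getattr__() is called: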

class MyClass:
    def __init__(self, x):
        self.x = x
        
    def __getattr__(self, name):
        return None

>>> obj = MyClass(42)
>>> obj.x  # OK - __getattribute__ will find attribute 'x'
42
>>> obj.y  # OK - __getattr__ is invoked and returns None

Example 2 - an attribute that updates itself when accessed

As another example, consider writing an attribute that is updated every time you access it:

>>> obj = MyClass()
>>> obj.x
0
>>> obj.x
1
>>> obj.x
2

Using a non-data descriptor

When you want to control the access behaviour of a specific attribute, a descriptor is generally the right tool for the job.

The IncrementingAttribute.__get__() method below returns 0 the first time it is called for an instance. Subsequently, it returns 1, 2, 3, etc. It does this by storing an internal attribute _value in the instance.

class IncrementingAttribute:
    def __get__(self, obj, owner=None):
        # don't allow accessing as a class attribute
        if obj is None:
            raise AttributeError()
            
        # if accessing for the first time, return 0,
        # otherwise, increment by 1 and return the result
        if not hasattr(obj, "_value"):
            obj._value = -1
        obj._value += 1
        return obj._value

class MyClass:
    x = IncrementingAttribute()

The attribute x is updated each time it is accessed:

>>> obj = MyClass()
>>> obj.x
0
>>> obj.x
1

What happens if we “reset” the value of x and then try to access it?

>>> obj.x
1
>>> obj.x = 0  # reset to 0
>>> obj.x
0
>>> obj.x
0
>>> obj.x
0

Oh no! x has somehow lost the ability to update itself. To understand why, it’s first important to understand what happens when the following line is executed:

obj.x = 0

Because x does not define a __set__() method, this line will fall back to the default behaviour of setting an attribute. That is, it will add an entry 'x' to the instance dictionary of obj. You can see this by inspecting the __dict__ attribute before and after setting the attribute x:

>>> obj = MyClass()
>>> obj.x
0
>>> obj.__dict__
{'_value': 0}
>>> obj.x
1
>>> obj.__dict__
{'_value': 1}
>>> obj.x = 1   # adds entry 'x' to obj.__dict__
>>> obj.__dict__
{'_value': 1, 'x': 1}

Recall that x is a non-data descriptor, that is, it only defines a __get__() method. Because __getattribute__() looks in the instance dictionary before checking for non-data descriptors, it finds x in the instance dictionary and returns that.
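
One way to convince yourself that the descriptor is only shadowed, not replaced, is to delete the instance dictionary entry again. The sketch below repeats the non-data descriptor so that it runs on its own:

```python
class IncrementingAttribute:
    """Non-data descriptor: defines only __get__()."""
    def __get__(self, obj, owner=None):
        if obj is None:
            raise AttributeError()
        if not hasattr(obj, "_value"):
            obj._value = -1
        obj._value += 1
        return obj._value


class MyClass:
    x = IncrementingAttribute()


obj = MyClass()
print(obj.x, obj.x)     # 0 1

obj.x = 0               # stored in obj.__dict__, shadows the descriptor
print(obj.x, obj.x)     # 0 0

# The descriptor is still in the class dictionary, untouched.
print(type(MyClass.__dict__["x"]).__name__)   # IncrementingAttribute

del obj.x               # removes the shadowing entry from obj.__dict__
print(obj.x)            # 2
```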

Using a data descriptor

The solution to the above problem is to also define a __set__() method in our descriptor class:

class IncrementingAttribute:
    def __get__(self, obj, owner=None):
        # don't allow accessing as a class attribute
        if obj is None:
            raise AttributeError()
        # if accessing for the first time, return 0,
        # otherwise, increment by 1 and return the result
        if not hasattr(obj, "_value"):
            obj._value = -1
        obj._value += 1
        return obj._value
        
    def __set__(self, obj, value):
        obj._value = value - 1

class MyClass:
    x = IncrementingAttribute()

Now, we get the expected reset behaviour:

>>> obj = MyClass()
>>> obj.x
0
>>> obj.x
1
>>> obj.x = 0
>>> obj.x
0
>>> obj.x
1
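
One limitation of the descriptor above is that every IncrementingAttribute in a class would share the same _value slot on the instance. A possible variation, sketched here as an assumption rather than as part of the original design, uses __set_name__() so that each descriptor stores its counter under its own name (the "_" prefix and the private_name attribute are choices made for this sketch):

```python
class IncrementingAttribute:
    def __set_name__(self, owner, name):
        # Called once when the owning class is created; name is the
        # attribute name ("x", "y", ...). The "_" prefix is an assumption.
        self.private_name = "_" + name

    def __get__(self, obj, owner=None):
        # don't allow accessing as a class attribute
        if obj is None:
            raise AttributeError()
        # start at 0 on first access, otherwise increment by 1
        value = getattr(obj, self.private_name, -1) + 1
        setattr(obj, self.private_name, value)
        return value

    def __set__(self, obj, value):
        setattr(obj, self.private_name, value - 1)


class MyClass:
    x = IncrementingAttribute()
    y = IncrementingAttribute()


obj = MyClass()
print(obj.x, obj.x, obj.x)   # 0 1 2
print(obj.y)                 # 0
obj.x = 10
print(obj.x, obj.y)          # 10 1
```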

What about property?

You could also use a property to achieve the same result:

class MyClass:
    @property
    def x(self):
        if not hasattr(self, "_value"):
            self._value = -1
        self._value += 1
        return self._value

    @x.setter
    def x(self, value):
        self._value = value - 1

In fact, @property is really just a way to define a (data) descriptor! So if you find yourself writing properties that all look the same (either in the same class or across different classes), that’s a sign that you should write a descriptor instead.
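
You can check this for yourself by pulling the property object out of the class dictionary and treating it like any other descriptor. A minimal sketch (the property here just returns 42 to keep things short):

```python
class MyClass:
    @property
    def x(self):
        return 42


prop = MyClass.__dict__["x"]
print(type(prop).__name__)               # property
print(hasattr(type(prop), "__get__"))    # True
print(hasattr(type(prop), "__set__"))    # True

obj = MyClass()
# obj.x is roughly prop.__get__(obj, MyClass)
print(prop.__get__(obj, MyClass))        # 42

# Because property is a data descriptor, an instance dictionary
# entry with the same name does not shadow it.
obj.__dict__["x"] = "shadow"
print(obj.x)                             # 42

# Without a setter, assignment is rejected rather than
# creating a new entry through normal attribute setting.
try:
    obj.x = 1
except AttributeError:
    print("cannot set x")
```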

Further reading

If you’re looking for a detailed “Introduction to descriptors”, or examples of how descriptors can be used, see the Descriptor HowTo by Raymond Hettinger. It’s one of my favourite parts of the official Python docs!