Handle Missing Keys in Python Dictionaries with defaultdict

Posted 2021-03-16

tl;dr: Use defaultdict when every missing key in a dictionary should start with the same default value.

Effective Python has a bunch of fun factoids in it, and one in particular I like is Item 17: Prefer defaultdict Over setdefault to Handle Missing Items in Internal State. This item introduces defaultdict from the collections module, which is something I’ve started to make regular use of.

Consider a dictionary where you keep track of your grades for all your classes. It might look something like this:

class_grades = {
    'CSC-110': ['B'],
    'BIO-161': ['C'],
}

The code for handling missing classes when logging a grade can be centralized in an add_grade function like so:

def add_grade(class_id: str, grade: str):
    found = class_grades.get(class_id)
    if found is None:
        found = []
        class_grades[class_id] = found
    found.append(grade)


add_grade('CSC-110', 'A-')
add_grade('SWE-200', 'A')
print(class_grades)
>>>
{'CSC-110': ['B', 'A-'], 'BIO-161': ['C'], 'SWE-200': ['A']}

The abstraction of add_grade will hide the complexity of managing a missing class_id from users of class_grades. While this is fine, defaultdict can handle all of this for us.
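Since the item title names setdefault as the approach to move away from, here is a rough sketch of what that middle step might look like for the same data. This is my own illustration under that assumption, not code taken from the book.

```python
# Same starting data as the example above.
class_grades = {
    'CSC-110': ['B'],
    'BIO-161': ['C'],
}


def add_grade(class_id: str, grade: str) -> None:
    # setdefault returns the existing list, or stores and returns the new [].
    class_grades.setdefault(class_id, []).append(grade)


add_grade('CSC-110', 'A-')
add_grade('SWE-200', 'A')
print(class_grades)
# {'CSC-110': ['B', 'A-'], 'BIO-161': ['C'], 'SWE-200': ['A']}
```

This is shorter, but the [] argument is built on every call, even when the class already exists and the new list is thrown away.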

Basically, defaultdict lets us supply a function that is called with no arguments to produce the value for a missing key the first time that key is looked up. For class_grades that function is list, which returns an empty list.
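To make the "what gets added" part concrete, here is a small sketch of the default factory in action. The scores name and the MAT-101 class are made up for this illustration.

```python
from collections import defaultdict

# "scores" is a hypothetical name used only for this illustration.
scores = defaultdict(list)  # pass the callable list, not an instance like []

print('MAT-101' in scores)      # False: nothing stored yet
scores['MAT-101'].append('B+')  # missing key: list() is called and stored first
print('MAT-101' in scores)      # True
print(scores['MAT-101'])        # ['B+']

# Passing a non-callable such as [] is rejected when the defaultdict is built.
try:
    defaultdict([])
except TypeError:
    print('defaultdict([]) raises TypeError')
```

The factory has to be something callable with no arguments, which is why list works and [] does not.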

The new definition of class_grades looks like this:

from collections import defaultdict

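# The factory must be a callable (list), not an instance ([]).
class_grades = defaultdict(list, {'CSC-110': ['B'], 'BIO-161': ['C']})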

Now add_grade can be eliminated entirely, and in its place we can look up the class with square brackets and call append directly. The lookup has to be class_grades[class_id] rather than class_grades.get(class_id), because get never triggers the default and still returns None for a missing key.

class_grades['CSC-110'].append('A-')
class_grades['SWE-200'].append('A')
print(dict(class_grades))
>>>
{'CSC-110': ['B', 'A-'], 'BIO-161': ['C'], 'SWE-200': ['A']}

This leads to an overall cleaner interface because we don’t have to program defensively around a missing key. We can always assume class_grades[class_id] is going to return a list, and just call append.
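One thing worth checking for yourself is how get and square-bracket lookup differ on a defaultdict. A minimal sketch, using a throwaway grades dictionary that is not part of the example above:

```python
from collections import defaultdict

grades = defaultdict(list)  # hypothetical name for this illustration

# get() bypasses the default factory: it returns None and stores nothing.
print(grades.get('SWE-200'))  # None
print('SWE-200' in grades)    # False

# Square-bracket lookup calls list() for the missing key and stores the result.
grades['SWE-200'].append('A')
print(dict(grades))           # {'SWE-200': ['A']}
```

The flip side is that merely reading grades[some_key] creates an entry, so use get or the in operator when you only want to check whether a class exists.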

You can find more tidbits like these in Effective Python.