Handle Missing Keys in Python Dictionaries with defaultdict
Posted 2021-03-16
tl;dr: Use defaultdict when missing values in a dictionary should start with the same default value.
Effective Python has a bunch of fun factoids in it, and one in particular I like is Item 17: Prefer defaultdict Over setdefault to Handle Missing Items in Internal State. This section introduces the defaultdict class from the collections module, which is something I’ve started to make regular use of.
Consider a dictionary where you keep track of your grades for all your classes. It might look something like this:
class_grades = {
    'CSC-110': ['B'],
    'BIO-161': ['C'],
}
The code for handling missing classes when logging a grade can be centralized in an add_grade function like so:
def add_grade(class_id: str, grade: str):
    found = class_grades.get(class_id)
    if found is None:
        found = []
        class_grades[class_id] = found
    found.append(grade)
add_grade('CSC-110', 'A-')
add_grade('SWE-200', 'A')
print(class_grades)
>>>
{'CSC-110': ['B', 'A-'], 'BIO-161': ['C'], 'SWE-200': ['A']}
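For comparison, here is a minimal sketch of the same function written with dict.setdefault, the approach the item title contrasts with defaultdict. This is my own variant for illustration, not code from the book, and it assumes the same starting class_grades as above.

```python
# Illustrative variant of add_grade built on dict.setdefault.
class_grades = {
    'CSC-110': ['B'],
    'BIO-161': ['C'],
}


def add_grade(class_id: str, grade: str) -> None:
    # setdefault returns the list already stored under class_id, or stores
    # the new empty list and returns it when class_id is missing.
    class_grades.setdefault(class_id, []).append(grade)


add_grade('CSC-110', 'A-')
add_grade('SWE-200', 'A')
print(class_grades)
```

It is shorter, but it builds a fresh empty list on every call, even when the class already exists.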
The abstraction of add_grade will hide the complexity of managing a missing class_id from users of class_grades. While this is fine, defaultdict can handle all of this for us.
Basically, defaultdict lets us supply a function that builds the value for a missing key the first time that key is looked up. For class_grades, that function is list, which returns an empty list.
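The new definition of class_grades passes list as the factory, along with the grades we already have, and looks like this:
from collections import defaultdict
class_grades = defaultdict(list, {'CSC-110': ['B'], 'BIO-161': ['C']})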
Now add_grade can be eliminated entirely, and in its place we can index class_grades and call append directly. Note that the default is only created when indexing with square brackets; class_grades.get still returns None for a missing key.
class_grades['CSC-110'].append('A-')
class_grades['SWE-200'].append('A')
print(class_grades)
>>>
defaultdict(<class 'list'>, {'CSC-110': ['B', 'A-'], 'BIO-161': ['C'], 'SWE-200': ['A']})
This leads to an overall cleaner interface because we don’t have to program defensively around a missing key. We can always assume class_grades[class_id] is going to return a list, and just call append.
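One detail worth checking for yourself is which lookups actually create the default. The sketch below uses a throwaway defaultdict named grades (a hypothetical name, separate from class_grades) to show that indexing calls the factory, while get and membership tests do not.

```python
from collections import defaultdict

grades = defaultdict(list)  # hypothetical name, separate from class_grades

# Indexing a missing key calls list() and stores the result under that key.
grades['CSC-110'].append('A-')

# get does not call the factory: it returns None (or the default you pass in).
missing = grades.get('SWE-200')
print(missing)              # None
print('SWE-200' in grades)  # False: get did not insert the key

# Merely reading a missing key by index inserts it with an empty list.
_ = grades['BIO-161']
print(dict(grades))         # {'CSC-110': ['A-'], 'BIO-161': []}
```

The last step is the usual gotcha: reading a missing key by index quietly adds it, so use in or get when you only want to check whether a class exists.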
You can find more tidbits like these in Effective Python.