At some point in your Python career, you may need to create dictionary-like classes. More precisely, you may want custom dictionaries with modified behavior, new functionality, or both. In Python, you can do this in a few different ways: by inheriting from an abstract base class, by subclassing the built-in dict class directly, or by inheriting from UserDict. In this guide, you’ll learn how to:
- Create dictionary-like classes by inheriting from the built-in dict class
- Identify common pitfalls that can happen when inheriting from dict
- Build dictionary-like classes by subclassing UserDict from the collections module
In addition, you’ll write some code to illustrate the differences between dict and UserDict, two classes you might use to build your own dictionaries.
To get the most out of this guide, you should be comfortable with the basics of Python’s dict class. You should also be familiar with inheritance and the fundamentals of object-oriented programming.
Creating Dictionary-Like Classes in Python
The built-in dict class provides a useful and flexible collection data type: the Python dictionary. You’ll find dictionaries everywhere, both in your own code and in the Python source code.
The default features of Python dictionaries aren’t always sufficient. Sometimes your data or your use case calls for a specialized dictionary-like class. In other words, you’ll need a new or modified class that mimics the dictionary’s behavior.
There are usually two main motivations for making your own dictionary-like classes:
- Extending the regular dictionary by adding new functionality
- Modifying the standard dictionary’s functionality
Keep in mind that you may sometimes need to do both: extend the standard dictionary and modify some of its default behavior.
There are several approaches to building custom dictionaries, and you should choose one that fits your goals and level of expertise. You can:
- Inherit from an appropriate abstract base class, such as MutableMapping
- Inherit from the Python built-in dict class directly
- Subclass UserDict from collections
Before deciding on a course of action, it’s important to keep a few things in mind. Learn more by reading on!
Building a Dictionary-Like Class From an Abstract Base Class
This strategy consists of inheriting from an abstract base class (ABC) such as MutableMapping. This class provides concrete generic implementations of all the dictionary methods except for .__getitem__(), .__setitem__(), .__delitem__(), .__iter__(), and .__len__(), which you have to implement yourself.
In addition, suppose you need to customize the behavior of any other standard dictionary method. In that case, you have to override the method in question and provide an implementation that meets your requirements.
A good deal of effort will be required for this procedure. It’s also difficult to get right without extensive experience with Python and its data model. As you’ll be developing the class entirely in Python, performance concerns are also possible.
The key benefit of this approach is that the parent ABC alerts you if you forget to implement any of its abstract methods: trying to instantiate the incomplete class raises a TypeError.
This approach should only be used if you want a dictionary-like class that significantly deviates from the standard dictionary.
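To make the ABC strategy more concrete, here is a minimal sketch. The class names UpperCaseMapping and Incomplete, and the private ._store attribute, are hypothetical and chosen only for illustration. The sketch implements the five required methods and then shows what happens when some of them are missing:

```python
from collections.abc import MutableMapping


class UpperCaseMapping(MutableMapping):
    """Hypothetical mapping that stores every key in uppercase."""

    def __init__(self, mapping=None, /, **kwargs):
        self._store = {}  # Assumed internal storage, not part of any API
        # The inherited update() routes every item through __setitem__().
        self.update(mapping or {}, **kwargs)

    def __getitem__(self, key):
        return self._store[key]

    def __setitem__(self, key, value):
        self._store[key.upper()] = value

    def __delitem__(self, key):
        del self._store[key]

    def __iter__(self):
        return iter(self._store)

    def __len__(self):
        return len(self._store)

    def __repr__(self):
        return f"{type(self).__name__}({self._store!r})"


class Incomplete(MutableMapping):
    """Deliberately leaves out __delitem__(), __iter__(), and __len__()."""

    def __getitem__(self, key):
        raise KeyError(key)

    def __setitem__(self, key, value):
        pass


numbers = UpperCaseMapping({"one": 1}, two=2)
numbers.setdefault("three", 3)  # Inherited method, also uses __setitem__()
print(numbers)

try:
    Incomplete()
except TypeError as error:
    print(f"TypeError: {error}")
```

Methods such as .update() and .setdefault() come for free from MutableMapping and respect the custom .__setitem__(), which is the main appeal of this route despite the extra boilerplate.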
This tutorial will teach you how to create classes that behave like dictionaries by deriving from the built-in dict class and the UserDict class.
Inheriting From the Python Built-in dict Class
For a long time, it was impossible to subclass Python types implemented in C. Python 2.2 fixed this issue, and now you can subclass built-in types, such as dict, directly. This change brings several technical advantages to the subclasses, because now they:
- Will work in every place that requires the original built-in type
- Can define new instance, static, and class methods
- Can store their instance attributes in a .__slots__ class attribute, which essentially replaces the .__dict__ attribute
The first item on this list may be a requirement for C code that expects a Python built-in class. The second item lets you add new functionality on top of the standard dictionary behavior. The third item lets you restrict a subclass’s attributes to those predefined in .__slots__.
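The three advantages above can be seen together in a short sketch. The LabeledDict class, its label slot, and its .describe() method are hypothetical names introduced only for this illustration:

```python
class LabeledDict(dict):
    # Only the names listed here can be set as instance attributes.
    __slots__ = ("label",)

    def describe(self):
        # A new instance method added on top of the regular dict behavior
        return f"{self.label}: {len(self)} item(s)"


stock = LabeledDict(apple=2, banana=3)
stock.label = "fruit"
print(stock.describe())
print(isinstance(stock, dict))  # Usable wherever a dict is expected

try:
    stock.owner = "Jane"
except AttributeError:
    print("There is no 'owner' slot, so the assignment fails")
```

Because the class defines .__slots__, its instances have no per-instance .__dict__, so any attribute not listed in the slots is rejected.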
Even though subclassing built-in types has several advantages, it also has some drawbacks. In the specific case of dictionaries, you’ll find a few annoying pitfalls. To illustrate, suppose you need a dictionary-like class that stores all its keys as strings in which every letter, if present, is uppercase.
You can achieve this by creating a dict subclass that overrides the .__setitem__() method:
>>> class UpperCaseDict(dict):
...     def __setitem__(self, key, value):
...         key = key.upper()
...         super().__setitem__(key, value)
...
>>> numbers = UpperCaseDict()
>>> numbers["one"] = 1
>>> numbers["two"] = 2
>>> numbers["three"] = 3
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3}
Cool! Your custom dictionary seems to work well. However, this class has some hidden issues. If you try to create an instance of UpperCaseDict with some initialization data, you’ll get surprising and buggy behavior:
>>> numbers = UpperCaseDict({"one": 1, "two": 2, "three": 3})
>>> numbers
{'one': 1, 'two': 2, 'three': 3}
What just happened? Why doesn’t your dictionary convert the keys to uppercase when you call the class constructor? It turns out that the class initializer, .__init__(), doesn’t implicitly call .__setitem__() to build the dictionary. Hence, the uppercase conversion never runs.
This issue also affects other dictionary methods, such as .update() and .setdefault(), as the following example shows:
>>> numbers = UpperCaseDict()
>>> numbers["one"] = 1
>>> numbers["two"] = 2
>>> numbers["three"] = 3
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3}
>>> numbers.update({"four": 4})
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3, 'four': 4}
>>> numbers.setdefault("five", 5)
5
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3, 'four': 4, 'five': 5}
Again, these examples show that your uppercase functionality doesn’t work consistently. To fix this, you need to provide your own implementations of all the affected methods. For instance, to resolve the initialization problem, you can write an .__init__() method like this:
# upper_dict.py

class UpperCaseDict(dict):
    def __init__(self, mapping=None, /, **kwargs):
        if mapping is not None:
            mapping = {
                str(key).upper(): value for key, value in mapping.items()
            }
        else:
            mapping = {}
        if kwargs:
            mapping.update(
                {str(key).upper(): value for key, value in kwargs.items()}
            )
        super().__init__(mapping)

    def __setitem__(self, key, value):
        key = key.upper()
        super().__setitem__(key, value)
Here, .__init__() converts the keys to uppercase and then initializes the current instance with the resulting data.
With this update, the initialization of your custom dictionary should work correctly. Go ahead and try it out by running the following code:
>>> from upper_dict import UpperCaseDict
>>> numbers = UpperCaseDict({"one": 1, "two": 2, "three": 3})
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3}
>>> numbers.update({"four": 4})
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3, 'four': 4}
Providing your own .__init__() method fixed the initialization issue. However, other methods, such as .update(), still work incorrectly, as you can tell from the “four” key not being uppercase.
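To get a feeling for how much extra code the dict route can demand, here is one possible sketch, not a definitive implementation, that also overrides .update() and .setdefault() so they go through the uppercase logic. It assumes that the positional argument is a mapping or an iterable of key-value pairs, and it leaves other entry points, such as the |= operator and key lookups, untouched:

```python
class UpperCaseDict(dict):
    def __init__(self, mapping=None, /, **kwargs):
        super().__init__()
        # Delegate to the overridden update() defined below.
        self.update(mapping or {}, **kwargs)

    def __setitem__(self, key, value):
        super().__setitem__(key.upper(), value)

    def update(self, mapping=(), /, **kwargs):
        # Normalize the input with dict(), then store items one at a time
        # so that each key passes through __setitem__().
        for key, value in dict(mapping, **kwargs).items():
            self[key] = value

    def setdefault(self, key, default=None):
        return super().setdefault(key.upper(), default)


numbers = UpperCaseDict({"one": 1}, two=2)
numbers.update({"three": 3}, four=4)
numbers.setdefault("five", 5)
print(numbers)
```

Every additional method that can insert keys needs the same kind of treatment, which is what makes this approach laborious and easy to get wrong.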
Why do dict subclasses behave this way? Built-in types were designed and implemented with the open-closed principle in mind. Hence, they’re open to extension but closed to modification. Allowing changes to the core features of these classes could break their invariants, so Python’s core developers decided to protect them from modification.
That’s why subclassing the built-in dict class can be a bit tricky, labor-intensive, and error-prone. Fortunately, you still have alternatives. The UserDict class from the collections module is one of them.
Subclassing UserDict From collections
UserDict has been part of Python’s standard library since version 1.6. The class initially lived in a module named after the class itself. In Python 3, UserDict moved to the collections module, which is a more natural home for it given its main purpose.
UserDict was created back when it wasn’t possible to inherit from Python’s dict directly. Even though the need for this class has been partially supplanted by the option of subclassing the built-in dict class, UserDict remains in the standard library, both for convenience and for backward compatibility.
UserDict is a convenient wrapper around a regular dict object. The class provides the same behavior as the built-in dict data type, with the additional feature of giving you access to the underlying dictionary through the .data instance attribute. As you’ll see later on in this guide, this feature can make it easier to build your own dictionary-like classes.
UserDict isn’t really intended for direct instantiation. Its primary goal is to let you create dictionary-like classes through inheritance.
There are other, subtler differences as well. To uncover them, go back to your original implementation of UpperCaseDict and update it as in the code below:
>>> from collections import UserDict
>>> class UpperCaseDict(UserDict):
...     def __setitem__(self, key, value):
...         key = key.upper()
...         super().__setitem__(key, value)
...
This time, instead of inheriting from dict, you inherit from UserDict, which you import from the collections module. How does this change affect the behavior of your UpperCaseDict class? Check out the following examples:
>>> numbers = UpperCaseDict({"one": 1, "two": 2})
>>> numbers["three"] = 3
>>> numbers.update({"four": 4})
>>> numbers.setdefault("five", 5)
5
>>> numbers
{'ONE': 1, 'TWO': 2, 'THREE': 3, 'FOUR': 4, 'FIVE': 5}
Now UpperCaseDict works correctly all the time. You don’t need to provide custom implementations of .__init__(), .update(), or .setdefault(). The class just works! This is because, in UserDict, every method that updates an existing key or adds a new one consistently relies on your .__setitem__() implementation.
As you learned before, the most notable difference between UserDict and dict is the .data attribute, which holds the wrapped dictionary. Using .data directly can make your code more straightforward because you don’t need to call super() all the time. You can access and modify the content of .data just as you would with any regular dictionary.
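As a small illustration of the .data attribute, the following sketch rewrites UpperCaseDict so that .__setitem__() writes straight to the wrapped dictionary instead of calling super(). It is one possible variant rather than the only way to write the class:

```python
from collections import UserDict


class UpperCaseDict(UserDict):
    def __setitem__(self, key, value):
        # Write directly to the wrapped dict instead of calling super().
        self.data[key.upper()] = value


numbers = UpperCaseDict({"one": 1})
numbers.update({"two": 2})
print(numbers)             # The UserDict instance
print(numbers.data)        # The wrapped dictionary
print(type(numbers.data))  # A regular dict
```

The initializer and .update() still convert the keys, because both end up calling the overridden .__setitem__().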
Coding Dictionary-Like Classes: Practical Examples
You already know that, in dict subclasses, methods like .update() and .__init__() never call .__setitem__(). Because of this, dict subclasses behave differently from a typical Python class that defines a .__setitem__() method.
You can work around this issue by inheriting from UserDict, where all the operations that update items in the underlying dictionary call .__setitem__(). This is one way in which UserDict helps you write better, more concise code.
Granted, when considering the creation of a dictionary-like class, it is more intuitive to inherit from dict rather than UserDict. This is due to the fact that although every Python programmer is familiar with dict, only a few of them know about UserDict.
Several issues that arise when you inherit from dict are usually easy to resolve by switching to UserDict. However, these issues aren’t always relevant. How much they matter depends heavily on how you want to customize the dictionary’s behavior.
The bottom line is that UserDict isn’t always the right solution. In general, it’s fine to inherit from dict if you want to extend the standard dictionary without affecting its core structure. On the other hand, UserDict is the better choice if you want to change the core dictionary behavior by overriding its special methods.
Keep in mind that dict is very fast because it’s written in C and highly optimized. UserDict, in contrast, is written in pure Python, which can be a significant limitation in terms of performance.
When deciding whether to inherit from dict or UserDict, you should take several factors into account. These include, but aren’t limited to, the following:
- Amount of work
- Risk of errors and bugs
- Ease of use and coding
- Performance
In the next section, you’ll experience the first three factors on this list by writing a few practical examples. You’ll learn about the performance implications later, in the section on performance.
A Dictionary That Accepts British and American Spelling of Keys
As a first example, say that you need a dictionary that stores keys in American English and allows key lookup in either American or British English. To code this dictionary, you’ll need to modify at least two special methods, .__setitem__() and .__getitem__().
The .__setitem__() method will let you always store keys in American English. The .__getitem__() method will let you retrieve the value associated with a given key, whether it’s spelled in American or British English.
Because you need to modify the core behavior of the dictionary class, UserDict is a better option than dict for this problem. With UserDict, you won’t have to provide custom implementations of .__init__(), .update(), and so on.
When you subclass UserDict, you have two main ways to code your class. You can rely on the .data attribute, which may facilitate the coding, or you can rely on super() and some special methods.
Here’s the code that relies on .data:
# spelling_dict.py

from collections import UserDict

UK_TO_US = {"colour": "color", "flavour": "flavor", "behaviour": "behavior"}

class EnglishSpelledDict(UserDict):
    def __getitem__(self, key):
        try:
            return self.data[key]
        except KeyError:
            pass
        try:
            return self.data[UK_TO_US[key]]
        except KeyError:
            pass
        raise KeyError(key)

    def __setitem__(self, key, value):
        try:
            key = UK_TO_US[key]
        except KeyError:
            pass
        self.data[key] = value
In this example, you first define a constant, UK_TO_US, that maps British words (the keys) to the matching American words (the values).
Then you define EnglishSpelledDict, which inherits from UserDict. The .__getitem__() method looks up the given key. If the key exists, the method returns the associated value. If the key doesn’t exist, the method checks whether the key was spelled in British English. If so, the method translates the key to American English and retrieves the value from the underlying dictionary. If both lookups fail, the method raises a KeyError.
The .__setitem__() method tries to find the input key in the UK_TO_US dictionary. If the key is there, it gets translated to American English. Finally, the method assigns the input value to the target key.
Here’s how your EnglishSpelledDict class works in practice:
>>> from spelling_dict import EnglishSpelledDict
>>> likes = EnglishSpelledDict({"color": "blue", "flavour": "vanilla"})
>>> likes
{'color': 'blue', 'flavor': 'vanilla'}
>>> likes["flavour"]
'vanilla'
>>> likes["flavor"]
'vanilla'
>>> likes["behaviour"] = "polite"
>>> likes
{'color': 'blue', 'flavor': 'vanilla', 'behavior': 'polite'}
>>> likes.get("colour")
'blue'
>>> likes.get("color")
'blue'
>>> likes.update({"behaviour": "gentle"})
>>> likes
{'color': 'blue', 'flavor': 'vanilla', 'behavior': 'gentle'}
By subclassing UserDict, you avoid writing duplicated code. For instance, the default implementations of .get(), .update(), and .setdefault() automatically rely on your .__getitem__() and .__setitem__() methods.
With less code to write, you’ll have less work to do. More importantly, you’ll be safer, because less code often means fewer bugs.
Note that if you want the del keyword to work with both spellings in your EnglishSpelledDict dictionary, you’ll need to implement a custom .__delitem__() method. Similarly, if you want membership tests to take both spellings into account, you’ll need to implement a custom .__contains__() method.
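The following sketch shows one way those two extra methods could look. The ._to_us() helper is a hypothetical addition that isn’t part of the original class, and the lookup logic is condensed under the assumption that keys are always stored in American spelling:

```python
from collections import UserDict

UK_TO_US = {"colour": "color", "flavour": "flavor", "behaviour": "behavior"}


class EnglishSpelledDict(UserDict):
    @staticmethod
    def _to_us(key):
        # Hypothetical helper: translate a British key or return it unchanged.
        return UK_TO_US.get(key, key)

    def __getitem__(self, key):
        return self.data[self._to_us(key)]

    def __setitem__(self, key, value):
        self.data[self._to_us(key)] = value

    def __delitem__(self, key):
        # Lets del work with either spelling
        del self.data[self._to_us(key)]

    def __contains__(self, key):
        # Lets membership tests accept either spelling
        return self._to_us(key) in self.data


likes = EnglishSpelledDict({"colour": "blue", "flavour": "vanilla"})
print("colour" in likes, "color" in likes)
del likes["flavour"]
print(likes)
```

Defining .__contains__() may also help .get(): depending on the Python version, .get() on a UserDict may run a membership test before the lookup, so it’s worth verifying how both spellings behave on your interpreter.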
The main drawback of this implementation is that if you ever decide to update EnglishSpelledDict to make it inherit from dict, you’ll have to rewrite most of the code to remove the use of .data.
The following example shows how to provide the same functionality using super() and some special methods. This time, your custom dictionary is compatible with dict, so you can change the parent class whenever you like:
# spelling_dict.py

from collections import UserDict

UK_TO_US = {"colour": "color", "flavour": "flavor", "behaviour": "behavior"}

class EnglishSpelledDict(UserDict):
    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            pass
        try:
            return super().__getitem__(UK_TO_US[key])
        except KeyError:
            pass
        raise KeyError(key)

    def __setitem__(self, key, value):
        try:
            key = UK_TO_US[key]
        except KeyError:
            pass
        super().__setitem__(key, value)
This implementation looks slightly different from the original one but works the same. However, it can be harder to write because you’re no longer using .data. Instead, you’re relying on super(), .__getitem__(), and .__setitem__(). This code requires some knowledge of Python’s data model, which is a complex and advanced topic.
The main advantage of this new implementation is that your class is now compatible with dict, so you can change the parent class at any time. Keep in mind that if your class inherits directly from dict, you’ll need to reimplement .__init__() and possibly other methods so that keys are also translated to American spelling whenever they’re added to the dictionary.
Subclassing UserDict is often more convenient than subclassing dict when you customize the standard dictionary functionality. The main reason is that the built-in dict has implementation shortcuts and optimizations that end up forcing you to override methods that you could simply inherit if you used UserDict as the parent class.
A Dictionary That Accesses Keys Through Values
Another common requirement for a custom dictionary is to provide additional functionality on top of the standard behavior. For example, say that you want a dictionary-like class with methods to retrieve the key that maps to a given target value.
You need a method that returns the first key that maps to the target value. You also want a method that returns an iterator over all the keys that map to that value.
Here’s a possible implementation of this custom dictionary:
# value_dict.py

class ValueDict(dict):
    def key_of(self, value):
        for k, v in self.items():
            if v == value:
                return k
        raise ValueError(value)

    def keys_of(self, value):
        for k, v in self.items():
            if v == value:
                yield k
This time, instead of inheriting from UserDict, you inherit from dict. Why? In this example, you’re adding functionality that doesn’t alter the dictionary’s core features. Therefore, inheriting from dict is more appropriate. It’s also more efficient in terms of performance, as you’ll see later in this guide.
The .key_of() method iterates over the key-value pairs in the underlying dictionary. The conditional statement checks for values that match the target value. The if code block returns the key of the first matching value. If the target value is missing, the method raises a ValueError.
As a generator method, .keys_of() yields only those keys whose value matches the value provided as an argument in the method call.
Here’s how this dictionary works in practice:
>>> from value_dict import ValueDict
>>> inventory = ValueDict()
>>> inventory["apple"] = 2
>>> inventory["banana"] = 3
>>> inventory.update({"orange": 2})
>>> inventory
{'apple': 2, 'banana': 3, 'orange': 2}
>>> inventory.key_of(2)
'apple'
>>> inventory.key_of(3)
'banana'
>>> list(inventory.keys_of(2))
['apple', 'orange']
Cool! All in all, your ValueDict dictionary is functional. It takes Python’s dict as its foundation and adds extra functionalities on top of that.
In general, you should use UserDict to create a dictionary-like class that acts like the built-in dict class but customizes some of its core behavior, mostly through special methods like .__setitem__() and .__getitem__().
On the other hand, if you just need a dictionary-like class with extended functionality that doesn’t affect or modify the core dict behavior, then you’re better off inheriting directly from dict. This approach will be quicker, more natural, and more efficient.
A Dictionary With Additional Functionalities
As a final example of how to implement a custom dictionary with additional features, say that you want to create a dictionary that provides the following methods:
Method | Description |
---|---|
.apply(action) | Takes a callable action as an argument and applies it to all the values in the underlying dictionary |
.remove(key) | Removes a given key from the underlying dictionary |
.is_empty() | Returns True or False depending on whether the dictionary is empty or not |
To implement these three methods, you don’t need to modify the core behavior of the built-in dict class. So, subclassing dict rather than UserDict seems to be the way to go.
Here’s the code that implements the required methods on top of dict:
# extended_dict.py

class ExtendedDict(dict):
    def apply(self, action):
        for key, value in self.items():
            self[key] = action(value)

    def remove(self, key):
        del self[key]

    def is_empty(self):
        return len(self) == 0
In this example, .apply() takes a callable and applies it to every value in the underlying dictionary. The transformed value is then reassigned to the original key. The .remove() method uses the del statement to remove the target key from the dictionary. Finally, .is_empty() uses the built-in len() function to find out if the dictionary is empty.
Here’s how ExtendedDict works:
>>> from extended_dict import ExtendedDict
>>> numbers = ExtendedDict({"one": 1, "two": 2, "three": 3})
>>> numbers
{'one': 1, 'two': 2, 'three': 3}
>>> numbers.apply(lambda x: x**2)
>>> numbers
{'one': 1, 'two': 4, 'three': 9}
>>> numbers.remove("two")
>>> numbers
{'one': 1, 'three': 9}
>>> numbers.is_empty()
False
In these examples, you first create an instance of ExtendedDict from a regular dictionary. Then you call .apply() on the extended dictionary. This method takes a lambda function as an argument and applies it to every value in the dictionary, replacing each value with its square.
Then, .remove() takes an existing key as an argument and removes the corresponding key-value pair from the dictionary. Finally, .is_empty() returns False because numbers isn’t empty. It would have returned True if the underlying dictionary were empty.
Considering Performance
Inheriting from UserDict may imply a performance cost because this class is written in pure Python. On the other hand, the built-in dict class is written in C and highly optimized for performance. So, if you need to use a custom dictionary in performance-critical code, make sure to time your code to find potential performance issues.
To check whether performance changes when you inherit from UserDict instead of dict, get back to your ExtendedDict class and copy its code into two different classes, one inheriting from dict and the other inheriting from UserDict.
Your classes should look like this:
# extended_dicts.py

from collections import UserDict

class ExtendedDict_dict(dict):
    def apply(self, action):
        for key, value in self.items():
            self[key] = action(value)

    def remove(self, key):
        del self[key]

    def is_empty(self):
        return len(self) == 0

class ExtendedDict_UserDict(UserDict):
    def apply(self, action):
        for key, value in self.items():
            self[key] = action(value)

    def remove(self, key):
        del self[key]

    def is_empty(self):
        return len(self) == 0
The only difference between these two classes is that ExtendedDict_dict subclasses dict, while ExtendedDict_UserDict subclasses UserDict.
To check their performance, you can start by timing core dictionary operations, such as class instantiation. Run the following code in your Python interactive session:
>>> import timeit
>>> from extended_dicts import ExtendedDict_dict
>>> from extended_dicts import ExtendedDict_UserDict
>>> init_data = dict(zip(range(1000), range(1000)))
>>> dict_initialization = min(
...     timeit.repeat(
...         stmt="ExtendedDict_dict(init_data)",
...         number=1000,
...         repeat=5,
...         globals=globals(),
...     )
... )
>>> user_dict_initialization = min(
...     timeit.repeat(
...         stmt="ExtendedDict_UserDict(init_data)",
...         number=1000,
...         repeat=5,
...         globals=globals(),
...     )
... )
>>> print(
...     f"UserDict is {user_dict_initialization / dict_initialization:.3f}",
...     "times slower than dict",
... )
UserDict is 35.877 times slower than dict
This code snippet uses the timeit module along with the min() function to measure the execution time of a piece of code. In this example, the target code consists of instantiating ExtendedDict_dict and ExtendedDict_UserDict.
Once you’ve run this timing code, you can compare both initialization times. In this run, the class based on UserDict is roughly 36 times slower to instantiate than the class derived from dict. This result indicates a serious performance difference.
Measuring the execution time of the new functionality may also be interesting. For example, you can check the execution time of .apply(). To do this check, go ahead and run the following code:
>>> extended_dict = ExtendedDict_dict(init_data)
>>> dict_apply = min(
...     timeit.repeat(
...         stmt="extended_dict.apply(lambda x: x**2)",
...         number=5,
...         repeat=2,
...         globals=globals(),
...     )
... )
>>> extended_user_dict = ExtendedDict_UserDict(init_data)
>>> user_dict_apply = min(
...     timeit.repeat(
...         stmt="extended_user_dict.apply(lambda x: x**2)",
...         number=5,
...         repeat=2,
...         globals=globals(),
...     )
... )
>>> print(
...     f"UserDict is {user_dict_apply / dict_apply:.3f}",
...     "times slower than dict",
... )
UserDict is 1.704 times slower than dict
This time, the performance difference between the class based on UserDict and the class based on dict is much smaller. In this run, the UserDict-based class is only about 1.7 times slower.
In general, when you create a custom dictionary by subclassing dict, you can expect standard dictionary operations to be faster than in a class based on UserDict. On the other hand, the new functionality may have a similar execution time in both classes. How do you know which way is the most efficient? You have to time your code.
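If you find yourself repeating this kind of comparison, you could wrap the timing pattern in a small helper. The slowdown() function below is a hypothetical convenience, not part of timeit, and its default number and repeat values are arbitrary assumptions that you may want to tune:

```python
import timeit
from collections import UserDict


def slowdown(baseline, candidate, number=200, repeat=5):
    """Return how many times slower candidate runs compared to baseline."""
    best_baseline = min(timeit.repeat(baseline, number=number, repeat=repeat))
    best_candidate = min(timeit.repeat(candidate, number=number, repeat=repeat))
    return best_candidate / best_baseline


init_data = dict(zip(range(1000), range(1000)))

# timeit.repeat() accepts callables as well as strings of code.
ratio = slowdown(lambda: dict(init_data), lambda: UserDict(init_data))
print(f"UserDict is {ratio:.3f} times slower than dict")
```

The exact ratio depends on your machine and Python version, so treat any single number as a rough indication rather than a fixed result.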
It’s worth noting that if you want to modify the core dictionary functionality, then UserDict is probably the way to go, because in that case you’ll be mostly rewriting the dict class in pure Python anyway.