Python's built-in str class offers a rich set of features that make processing text much easier. However, in some situations those features may not be enough, and you may need to create your own string-like classes. In Python, you can do this either by inheriting directly from the built-in str class or by subclassing UserString, which lives in the collections module. By the end of this guide, you'll know how to:
- Create custom string-like classes by inheriting from the built-in str class
- Build custom string-like classes by subclassing UserString from the collections module
- Decide when to use str or UserString to create custom string-like classes
Along the way, you'll write a few examples that will help you decide whether to use str or UserString when you create your own custom string classes. Your choice will mostly depend on your specific use case.
To follow along with this guide, it helps to be familiar with Python's built-in str class and its standard features. You'll also need a basic understanding of object-oriented programming and inheritance in Python.
Developing Classes in Python That Are Similar to Strings
Python's built-in str class lets you create strings. Strings are sequences of characters that you'll use in many situations, especially when working with textual data. From time to time, the standard functionality of str may not be enough to meet your needs, so you may want to create custom string-like classes that solve your specific problem.
You'll typically find at least two reasons for creating custom string-like classes:
- Extending the regular string by adding new functionality
- Modifying the standard string's functionality
You may also run into situations where you need to extend and modify the standard behaviour of strings at the same time.
In Python, you'll commonly use one of two techniques to create string-like classes: inheriting directly from the built-in str class or subclassing UserString from collections.
Note: In object-oriented programming, it's common practice to use the verbs inherit and subclass interchangeably, because both describe deriving a new class from an existing one.
An important characteristic of Python strings is immutability, which means that you can't modify them in place. So, when choosing the right technique for creating your own string-like classes, you need to consider whether the features you want will affect immutability.
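As a quick illustration of immutability, the following minimal sketch tries to replace a character by index. The variable names are only illustrative. The assignment raises a TypeError, and the original string stays untouched:

```python
greeting = "Hello"

try:
    # Strings don't support item assignment
    greeting[0] = "J"
except TypeError as error:
    message = str(error)

# Operations that seem to change a string build a new object instead
new_greeting = "J" + greeting[1:]
```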
For example, if you only need to change how existing string methods behave, then subclassing str is usually fine. In contrast, if you need to change how strings are created, then inheriting from str requires more advanced knowledge, because you'll have to override the .__new__() method. In that case, inheriting from UserString may make your life easier because you won't have to touch .__new__().
In the following sections, you'll learn the pros and cons of each technique so that you can decide which one best fits your problem.
Using the Built-in str Class in Python as a Parent Class
For a long time, it was impossible to inherit directly from Python types implemented in C. Python 2.2 fixed this issue, so you can now subclass built-in types, including str. This feature is quite handy when you need to create custom string-like classes.
By inheriting from str directly, you can extend and modify the standard behaviour of this built-in class. You can also tweak the instantiation process of your custom string-like classes to transform the input before new instances are ready.
Extending the String's Standard Behavior
One reason to create a custom string-like class is to extend standard Python strings with new behaviour. For example, say that you need a string-like class with a new method that counts the number of words in the underlying string.
In this example, your custom string will use whitespace as its default word separator. However, it should also allow you to provide a specific separator character. To code a class that fulfills these needs, you can do something like this:
>>> class WordCountString(str):
...     def words(self, separator=None):
...         return len(self.split(separator))
...
This class inherits from str directly, which means that it provides the same interface as its parent class.
On top of this inherited interface, you add a new method called .words(). This method takes a separator character as an argument and passes it to .split(). The argument defaults to None, which makes .split() split on runs of consecutive whitespace. Finally, you use len() to count the resulting words.
Here's how you can use this class in your code:
>>> sample_text = WordCountString(
... """Lorem ipsum dolor sit amet consectetur adipisicing elit. Maxime
... mollitia, molestiae quas vel sint commodi repudiandae consequuntur
... voluptatum laborum numquam blanditiis harum quisquam eius sed odit
... fugiat iusto fuga praesentium optio, eaque rerum! Provident similique
... accusantium nemo autem. Veritatis obcaecati tenetur iure eius earum
... ut molestias architecto voluptate aliquam nihil, eveniet aliquid
... culpa officia aut! Impedit sit sunt quaerat, odit, tenetur error,
... harum nesciunt ipsum debitis quas aliquid."""
... )
>>> sample_text.words()
68
Cool! Your .words() method works fine. It splits the input text into words and returns the word count. You can modify how this method delimits and processes words, but the current implementation works well enough for this demonstrative example.
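To see the separator argument in action, here's a small sketch that repeats the class so that it can run on its own. The sample strings are made up for illustration:

```python
class WordCountString(str):
    def words(self, separator=None):
        return len(self.split(separator))

# Default separator: any run of whitespace
spaced = WordCountString("one   two\nthree")
default_count = spaced.words()

# Custom separator: split on commas instead
csv_like = WordCountString("red,green,blue,yellow")
comma_count = csv_like.words(",")
```

Keep in mind that with an explicit separator, .split() doesn't merge consecutive separators, so a string like "a,,b" would count as three words.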
In this example, you haven't modified the standard behaviour of Python's str. You've only added new behaviour to your custom class. However, you can also change the default behaviour of str by overriding any of its methods, as you'll explore in the next section.
Modifying the String's Standard Behavior
To learn how to modify the standard behaviour of str in a custom string-like class, say that you need a string class that always prints its letters in uppercase. You can do this by overriding the .__str__() special method, which controls how string objects are printed.
Here's an UpperPrintString class that behaves as you need:
>>> class UpperPrintString(str):
...     def __str__(self):
...         return self.upper()
...
Again, this class inherits from str. The .__str__() method returns a copy of the underlying string, self, with all of its letters in uppercase. The .upper() method performs the conversion.
To try out your custom string-like class, go ahead and run the following code:
>>> sample_string = UpperPrintString("Hello, Pythonista!")
>>> print(sample_string)
HELLO, PYTHONISTA!
>>> sample_string
'Hello, Pythonista!'
When you print an instance of UpperPrintString, you get the string in uppercase letters on your screen. Note that the original string wasn't modified or affected. You only changed the standard printing behaviour of str.
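The following sketch, which repeats the class so that it's self-contained, suggests how narrow this change is. Only the conversion through str(), which print() relies on, goes through .__str__(). Comparisons and repr() still see the original value:

```python
class UpperPrintString(str):
    def __str__(self):
        return self.upper()

sample_string = UpperPrintString("Hello, Pythonista!")

printed_form = str(sample_string)    # What print() would display
repr_form = repr(sample_string)      # What the REPL echoes
still_equal = sample_string == "Hello, Pythonista!"
```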
Adjusting Some Aspects of str’s Instantiation Process
In this section, you'll do something different. You'll create a string-like class that transforms the original input string before creating the final string object. For example, say that you need a string-like class that stores all of its letters in lowercase. To do this, you might try to override the class initializer, .__init__(), like this:
>>> class LowerString(str):
...     def __init__(self, string):
...         super().__init__(string.lower())
...
In this code snippet, you provide an .__init__() method that overrides the default str initializer. Inside this .__init__() implementation, you use super() to access the parent class's .__init__() method. Then you call .lower() on the input string to convert all of its letters to lowercase before initializing the current string.
However, this code doesn't work, as you can see in the following example:
>>> sample_string = LowerString("Hello, Pythonista!")
Traceback (most recent call last):
...
TypeError: object.__init__() takes exactly one argument...
Because str objects are immutable, you can't change their value in .__init__(). The value is set during object creation rather than during object initialization. The only way to transform the value of a string during instantiation is to override the .__new__() method.
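To make this concrete, here's a minimal sketch with a hypothetical ProbeString class that isn't part of the original examples. Its .__init__() doesn't try to change anything. It only records the value that the instance already has by the time initialization runs:

```python
class ProbeString(str):
    def __init__(self, string):
        # By now, str.__new__() has already built the object and fixed
        # its value, so .__init__() can only observe it.
        self.value_seen_in_init = str(self)

probe = ProbeString("Hello, Pythonista!")
```

After this code runs, probe.value_seen_in_init holds the full original text, which suggests that the string's content was already in place before .__init__() was called.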
Here's how you can override .__new__() to lowercase the input:
>>> class LowerString(str):
...     def __new__(cls, string):
...         instance = super().__new__(cls, string.lower())
...         return instance
...
>>> sample_string = LowerString("Hello, Pythonista!")
>>> sample_string
'hello, pythonista!'
In this example, your LowerString class overrides the superclass's .__new__() method to customize how instances are created. In this case, you transform the input string before creating the new LowerString object. Now your class works as you need it to. It takes a string as input and stores it as a lowercase string.
If you ever need to transform the input string at instantiation time, then you'll have to override .__new__(). This technique requires a solid understanding of Python's data model and special methods.
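One caveat worth knowing is that overriding .__new__() only affects how instances are created. The sketch below repeats LowerString and checks what inherited operations return. In CPython, methods inherited from str, such as .upper() or concatenation with +, return plain str objects, so the lowercase guarantee doesn't carry over to their results:

```python
class LowerString(str):
    def __new__(cls, string):
        instance = super().__new__(cls, string.lower())
        return instance

sample_string = LowerString("Hello, Pythonista!")

# Inherited operations build regular str objects, not LowerString ones
shouted = sample_string.upper()
combined = sample_string + " WELCOME!"
```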
Subclassing UserString From collections
The second tool that allows you to create custom string-like classes is the UserString class from the collections module. This class is a wrapper around the built-in str type. It was designed for creating string-like classes back when inheriting directly from the built-in str class wasn't possible.
Now that you can subclass str directly, you may need UserString less often. However, the class is still available in the standard library, both for convenience and for backward compatibility. As you'll soon see, it also has some hidden features that can come in handy.
The most relevant feature of UserString is its .data attribute, which gives you access to the wrapped string object. This attribute can make it easier to create custom strings, especially when the customization you want affects the string's mutability.
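As a rough sketch of what wrapping means in practice, the snippet below creates a plain UserString and inspects it. The variable names are illustrative only:

```python
from collections import UserString

wrapped = UserString("Hello, Pythonista!")

# The real str object lives in the .data attribute
inner = wrapped.data
inner_is_str = isinstance(inner, str)

# The wrapper itself is not a str instance, although it compares
# equal to the string it wraps
wrapper_is_str = isinstance(wrapped, str)
compares_equal = wrapped == "Hello, Pythonista!"
```

Because a UserString instance isn't a real str, code that strictly requires a str may need you to pass .data or call str() on the object.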
In the following two sections, you'll revisit the examples from the previous sections, but this time you'll subclass UserString instead of str. To kick things off, you'll start by extending and modifying the standard behaviour of Python strings.
Extending and Altering the Behavior of the String in Its Standard Form
Instead of subclassing the built-in str class, you can implement WordCountString and UpperPrintString by inheriting from UserString. For the most part, you only need to change the superclass. The one adjustment is in .__str__(): because UserString.upper() returns a new instance of your class rather than a plain str, and .__str__() must return a real string, you call .upper() on the wrapped string in .data instead.
Here are the updated versions of WordCountString and UpperPrintString:
>>> from collections import UserString
>>> class WordCountString(UserString):
...     def words(self, separator=None):
...         return len(self.split(separator))
...
>>> class UpperPrintString(UserString):
...     def __str__(self):
...         return self.data.upper()
...
Apart from that change to .__str__(), the only difference between these new implementations and the original ones is that you now inherit from UserString. Note that to inherit from UserString, you first need to import the class from the collections module.
If you try out these classes with the same examples as before, then you'll see that they work the same as their str-based counterparts:
>>> sample_text = WordCountString(
... """Lorem ipsum dolor sit amet consectetur adipisicing elit. Maxime
... mollitia, molestiae quas vel sint commodi repudiandae consequuntur
... voluptatum laborum numquam blanditiis harum quisquam eius sed odit
... fugiat iusto fuga praesentium optio, eaque rerum! Provident similique
... accusantium nemo autem. Veritatis obcaecati tenetur iure eius earum
... ut molestias architecto voluptate aliquam nihil, eveniet aliquid
... culpa officia aut! Impedit sit sunt quaerat, odit, tenetur error,
... harum nesciunt ipsum debitis quas aliquid."""
... )
>>> sample_text.words()
68
>>> sample_string = UpperPrintString("Hello, Pythonista!")
>>> print(sample_string)
HELLO, PYTHONISTA!
>>> sample_string
'Hello, Pythonista!'
In these examples, your new implementations of WordCountString and UpperPrintString work the same as the old ones. So, why would you use UserString rather than str? Up to this point, there's no apparent reason to do so. However, UserString comes in handy when you need to customize how strings are created.
Adjusting Some Aspects of UserString's Instantiation Process
You can also code the LowerString class by inheriting from UserString. By changing the parent class, you can customize the initialization process in the instance initializer, .__init__(), without overriding the instance creator, .__new__().
Here's the new version of LowerString and how it works in practice:
>>> from collections import UserString
>>> class LowerString(UserString):
...     def __init__(self, string):
...         super().__init__(string.lower())
...
>>> sample_string = LowerString("Hello, Pythonista!")
>>> sample_string
'hello, pythonista!'
In this example, using UserString instead of str as your superclass makes it possible to transform the input string in .__init__(). The transformation works because UserString is a wrapper class that stores the final string in its .data attribute, which holds the real immutable str object.
Because UserString is a wrapper around str, it provides a flexible and straightforward way to create custom strings with mutable behaviours. Providing mutable behaviours by inheriting from str is complicated because of that class's inherent immutability.
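A side effect of this design is that UserString methods build their results by calling your class again, so the customization in .__init__() is reapplied. The following sketch repeats LowerString to illustrate this behaviour:

```python
from collections import UserString

class LowerString(UserString):
    def __init__(self, string):
        super().__init__(string.lower())

sample_string = LowerString("Hello, Pythonista!")

# Concatenation returns a new LowerString, so .__init__() runs again
# and lowercases the combined text
combined = sample_string + " WELCOME!"
```

Whether this propagation is desirable depends on your use case, but it's one practical difference from a str subclass, whose inherited methods return plain strings.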
In the next section, you will develop a string-like class that mimics a mutable string data type by making use of UserString.
Simulating Mutations in Your String-Like Classes
As a final example of why UserString belongs in your Python toolkit, say that you need a mutable string-like class. In other words, you need a string-like class that you can modify in place.
Unlike lists and dictionaries, strings don't provide the .__setitem__() special method, because they're immutable. Your custom string will need this method so that you can update characters and slices by index using an assignment statement.
Your string-like class will also need to change the standard behaviour of common string methods. To keep this example short, you'll only modify the .upper() and .lower() methods. Finally, you'll provide a .sort() method to sort your string in place.
Standard string methods don't mutate the underlying string. Instead, they return a new string object with the required transformation. In your custom string, you need these methods to make their changes in place.
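For reference, here's a short sketch of the standard behaviour with a regular str. The variable names are illustrative:

```python
original = "Hello, Pythonista!"

# .upper() leaves the original alone and hands back a new string
shouted = original.upper()

# To "change" a string variable, you have to rebind the name
text = "Hello, Pythonista!"
text = text.upper()
```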
To achieve all of these goals, fire up your favorite code editor, create a file named mutable_string.py, and write the following code:
 1  # mutable_string.py
 2
 3  from collections import UserString
 4
 5  class MutableString(UserString):
 6      def __setitem__(self, index, value):
 7          data_as_list = list(self.data)
 8          data_as_list[index] = value
 9          self.data = "".join(data_as_list)
10
11      def __delitem__(self, index):
12          data_as_list = list(self.data)
13          del data_as_list[index]
14          self.data = "".join(data_as_list)
15
16      def upper(self):
17          self.data = self.data.upper()
18
19      def lower(self):
20          self.data = self.data.lower()
21
22      def sort(self, key=None, reverse=False):
23          self.data = "".join(sorted(self.data, key=key, reverse=reverse))
Here's how this code works, line by line:
- Line 3 imports UserString from collections.
- Line 5 creates MutableString as a subclass of UserString.
- Line 6 defines .__setitem__(). Python calls this special method whenever you run an assignment operation on a sequence using an index, like in sequence[0] = value. This implementation of .__setitem__() turns .data into a list, replaces the item at index with value, builds the final string using .join(), and assigns its value back to .data. The whole process simulates an in-place transformation or mutation.
- Line 11 defines .__delitem__(), the special method that allows you to use the del statement for removing characters by index from your mutable string. It's implemented similarly to .__setitem__(). On line 13, you use del to delete items from the temporary list.
- Line 16 overrides UserString.upper() and calls str.upper() on .data. Then it stores the result back in .data. Again, this last operation simulates an in-place mutation.
- Line 19 overrides UserString.lower() using the same technique as in .upper().
- Line 22 defines .sort(), which combines the built-in sorted() function with the str.join() method to create a sorted version of the original string. Note that this method takes the same key and reverse arguments as list.sort() and the built-in sorted() function.
That's it! Your mutable string is ready. To try it out, go back to your Python shell and run the following code:
>>> from mutable_string import MutableString
>>> sample_string = MutableString("ABC def")
>>> sample_string
'ABC def'
>>> sample_string[4] = "x"
>>> sample_string[5] = "y"
>>> sample_string[6] = "z"
>>> sample_string
'ABC xyz'
>>> del sample_string[3]
>>> sample_string
'ABCxyz'
>>> sample_string.upper()
>>> sample_string
'ABCXYZ'
>>> sample_string.lower()
>>> sample_string
'abcxyz'
>>> sample_string.sort(reverse=True)
>>> sample_string
'zyxcba'
Great! Your new mutable string-like class works as expected. It lets you modify the underlying string in place, as you would with a mutable sequence. Note that this example covers only a few string methods. You can experiment with other methods and keep adding mutability features to your class.
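As one possible direction, the sketch below repeats .__setitem__() from mutable_string.py and adds a hypothetical .reverse() method that isn't part of the original class. It also shows that, because the implementation relies on a temporary list, slice assignment works too:

```python
from collections import UserString

class MutableString(UserString):
    def __setitem__(self, index, value):
        data_as_list = list(self.data)
        data_as_list[index] = value
        self.data = "".join(data_as_list)

    def reverse(self):
        # Hypothetical addition: reverse the string in place
        self.data = self.data[::-1]

sample_string = MutableString("ABC def")

# Replace a whole slice in one assignment
sample_string[0:3] = "xyz"
after_slice = sample_string.data

sample_string.reverse()
after_reverse = sample_string.data
```

Like .upper() and .lower() in the original class, .reverse() returns None and updates .data, which mirrors how in-place methods such as list.sort() behave.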