For better tracking of what your own code does, use the lightweight yet extensible Python logging module. Using it provides you a lot more freedom than simply jamming print() calls all over your code.
The logging package in Python, however, can be daunting at times. It's not easy to keep track of all the different components and how they interact: handlers, loggers, levels, namespaces, and filters.
Examining the CPython source code of logging can help you tie up any loose ends in your comprehension of it. Reading through the short, modular Python code that powers logging can help you have that “aha!” moment.
This article is meant to be read in conjunction with Logging in Python, which is a tour of how to use the package, and the logging HOWTO guide. After reading this article, you'll be familiar with the following concepts:

- logging levels and how they work
- Thread-safety versus process-safety in logging
- The design of logging from an OOP perspective
- Logging in libraries vs applications
- Best practices and design patterns for using logging
For the most part, we'll go line by line through the core module in the Python logging package in order to build a picture of how it's laid out.
How to Follow Along
Because the logging source code is central to this article, you can assume that every code block or link is based on a specific commit in the Python 3.7 CPython repository: commit d730719. The logging package itself is located in the Lib/ directory of the CPython source code.
The file you'll spend the most time on is logging/__init__.py, which is where most of the heavy lifting within the logging package happens. Here's how the package is laid out:
cpython/
│
├── Lib/
│ ├── logging/
│ │ ├── __init__.py
│ │ ├── config.py
│ │ └── handlers.py
│ ├── ...
├── Modules/
├── Include/
...
... [truncated]
With that, let's dive in.
Preliminaries
The first hundred or so lines of __init__.py introduce a few subtle but significant concepts that we need to cover before we get to the heavyweight classes.
Preliminary #1: A Level Is Just an int
It may seem a little opaque to use objects like logging.INFO or logging.DEBUG. What do these internal variables do, and how are they defined?
In fact, the uppercase constants in Python logging are just integers, forming an enum-like collection of numerical levels:
CRITICAL = 50
FATAL = CRITICAL
ERROR = 40
WARNING = 30
WARN = WARNING
INFO = 20
DEBUG = 10
NOTSET = 0
Why not just use the strings "INFO" or "DEBUG"? Levels are int constants so that comparing one level to another is simple and unambiguous. They're given names as well to lend them semantic meaning. Saying that a message has a severity of 50 may not be immediately clear, but saying that it has a level of CRITICAL tells you there's a flashing red light somewhere in your program.
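Because the levels are plain ints, you can compare them directly. Here's a quick REPL sketch using only the public constants and logging.getLevelName(), which maps a numeric level back to its name:

>>>
>>> import logging
>>> logging.DEBUG, logging.INFO, logging.ERROR
(10, 20, 40)
>>> logging.ERROR > logging.WARNING  # Plain integer comparison
True
>>> logging.getLevelName(logging.ERROR)
'ERROR'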
Now, technically, in some places you can pass just the str form of a level, such as in logger.setLevel("DEBUG"). Internally, this calls _checkLevel(), which does a dict lookup for the corresponding int:
_nameToLevel = {
    'CRITICAL': CRITICAL,
    'FATAL': FATAL,
    'ERROR': ERROR,
    'WARN': WARNING,
    'WARNING': WARNING,
    'INFO': INFO,
    'DEBUG': DEBUG,
    'NOTSET': NOTSET,
}

def _checkLevel(level):
    if isinstance(level, int):
        rv = level
    elif str(level) == level:
        if level not in _nameToLevel:
            raise ValueError("Unknown level: %r" % level)
        rv = _nameToLevel[level]
    else:
        raise TypeError("Level not an integer or a valid string: %r" % level)
    return rv
Which form should you prefer? I don't have strong opinions on this, but it's notable that the logging docs consistently use the form logging.DEBUG rather than "DEBUG" or 10. Also, passing the str form isn't an option in Python 2, and some logging methods, such as logger.isEnabledFor(), will accept only an int, not its str cousin.
Preliminary #2: Logging Is Thread-Safe, but Not Process-Safe
You'll find the next short code block a few lines down, and it is subtly important to the entire package:
import threading

_lock = threading.RLock()

def _acquireLock():
    if _lock:
        _lock.acquire()

def _releaseLock():
    if _lock:
        _lock.release()
A reentrant lock called _lock resides in the logging/__init__.py module’s global namespace. Almost all of the objects and operations in the entire logging package are made thread-safe by this, allowing threads to read and write data without worrying about a race condition. _acquireLock() and _releaseLock() are used frequently by the module and its classes, as can be seen in the module source code.
But there’s something missing here: what about process safety? The logging module is not process safe, to put it succinctly. In general, two processes cannot write to the same file without a lot of proactive work on the part of the programmer first, so this is not an inherent flaw of logging.
This indicates that you should exercise caution when using classes that involve multiprocessing, such as logging.FileHandler. Halfway through a lengthy procedure, you might encounter a nasty bug if two processes attempt to simultaneously read from and write to the same underlying file.
The official Logging Cookbook contains a thorough recipe for working around this limitation. Because that requires a fair amount of setup, one alternative is to have each process log to a separate file based on its process ID, which you can obtain with os.getpid().
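Here's a minimal sketch of that per-process alternative. The file path pattern and the logger name are placeholders invented for this example:

import logging
import os

# Each process gets its own file, so no two processes ever contend
# for the same underlying stream.
handler = logging.FileHandler(f"/tmp/myapp-{os.getpid()}.log")  # hypothetical path
logger = logging.getLogger("myapp")  # hypothetical logger name
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.debug("Logged safely from PID %d", os.getpid())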
Package Architecture: Logging’s MRO
After going over some initial setup code, let's take a broad look at how logging is organized. The logging package makes good use of inheritance and OOP. Here's a glimpse at some of the most significant classes in the method resolution order (MRO) for the package:
object
│
├── LogRecord
├── Filterer
│ ├── Logger
│ │ └── RootLogger
│ └── Handler
│ ├── StreamHandler
│ └── NullHandler
├── Filter
└── Manager
Only the classes that are most important to highlight are shown in the tree diagram above, which does not include all of the module’s classes. Note: To view the chain of inheritance, use the dunder attribute logging.StreamHandler.__mro__. The Python 2 docs contain a comprehensive overview of the MRO, though it also applies to Python 3.
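If you want to verify the chain of inheritance yourself, a quick REPL check does the trick (output shown here as of Python 3.7):

>>>
>>> import logging
>>> logging.StreamHandler.__mro__
(<class 'logging.StreamHandler'>, <class 'logging.Handler'>, <class 'logging.Filterer'>, <class 'object'>)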
This long list of classes can be confusing because there is a lot going on and a lot of jargon involved. Filter or Filterer? Logger or Handler? It can be challenging to keep track of everything, much less visualize how it all fits together. Because a picture really does speak a thousand words, here's a diagram of a scenario where one logger with two handlers attached to it writes a log message with level logging.INFO:
Objects in a logging flow (Image: Real Python)
In Python code, everything described above would look like this:
import logging
import sys
logger = logging.getLogger("pylog")
logger.setLevel(logging.DEBUG)
h1 = logging.FileHandler(filename="/tmp/records.log")
h1.setLevel(logging.INFO)
h2 = logging.StreamHandler(sys.stderr)
h2.setLevel(logging.ERROR)
logger.addHandler(h1)
logger.addHandler(h2)
logger.info("testing %d.. %d.. %d..", 1, 2, 3)
The Logging HOWTO contains a more thorough map of this flow. The scenario depicted above has been simplified.
Only one Logger instance, logger, and two Handler instances, h1 and h2, are defined in your code.
When you call logger.info("testing %d.. %d.. %d..", 1, 2, 3), the logger object acts as a filter first, because it also has a level associated with it. Only if the message is severe enough will the logger do anything with it. Because the logger has level DEBUG and the message carries the higher INFO level, it gets the go-ahead to move on.
Internally, logger calls logger.makeRecord() to put the message string "testing %d.. %d.. %d.." and its arguments (1, 2, 3) into a bona fide instance of the LogRecord class, which is just a container for the message and its metadata.
The logger object looks around for its handlers (instances of Handler), which may be tied to the logger itself or to its parents (a concept we'll explore later). In this instance, it finds two handlers:

- One with level INFO that dumps log data to a file at /tmp/records.log
- One that writes to sys.stderr, but only if the incoming message is at level ERROR or higher
At this point, another round of tests kicks in. Because the LogRecord and its message only carry level INFO, the record gets written to Handler 1 (green arrow) but not to Handler 2's stderr stream (red arrow). For Handlers, writing the LogRecord to a stream is called emitting it, and this is captured in their .emit() method.
Next, let's dissect everything from above in more detail.
The LogRecord Class
What is a LogRecord? When you log a message, an instance of the LogRecord class is the object you send to be logged. It's created for you by a Logger instance and encapsulates all the pertinent info about that event. Internally, it's little more than a wrapper around a dict that holds the record's attributes. A Logger instance sends a LogRecord instance to zero or more Handler instances.
The LogRecord contains some metadata, such as the following:
- A name
- The creation time as a Unix timestamp
- The message itself
- Information on what function made the logging call
Here's a look at some of the metadata it carries, which you can inspect by stepping through a call to logging.error() with the pdb module:
>>>
>>> import logging
>>> import pdb
>>> def f(x):
...     logging.error("bad vibes")
...     return x / 0
...
>>> pdb.run("f(1)")
After stepping through a few higher-level functions, you arrive at line 1517:
(Pdb) l
1514                     exc_info = (type(exc_info), exc_info, exc_info.__traceback__)
1515             elif not isinstance(exc_info, tuple):
1516                 exc_info = sys.exc_info()
1517         record = self.makeRecord(self.name, level, fn, lno, msg, args,
1518                                  exc_info, func, extra, sinfo)
1519 ->      self.handle(record)
1520
1521     def handle(self, record):
1522         """
1523         Call the handlers for the specified record.
1524
(Pdb) from pprint import pprint
(Pdb) pprint(vars(record))
{'args': (),
'created': 1550671851.660067,
'exc_info': None,
'exc_text': None,
'filename': '<stdin>',
'funcName': 'f',
'levelname': 'ERROR',
'levelno': 40,
'lineno': 2,
'module': '<stdin>',
'msecs': 660.067081451416,
'msg': 'bad vibes',
'name': 'root',
'pathname': '<stdin>',
'process': 2360,
'processName': 'MainProcess',
'relativeCreated': 295145.5490589142,
'stack_info': None,
'thread': 4372293056,
'threadName': 'MainThread'}
Internally, a LogRecord has a wealth of metadata that is used in various ways.
Since the Logger and Handler take care of this for you, you won't often need to interact with a LogRecord directly. Still, it's worth knowing what data a LogRecord holds, since this is where the timestamp and other pertinent details that appear in your log messages come from.

Note: Below the LogRecord class, you'll also find the setLogRecordFactory(), getLogRecordFactory(), and makeLogRecord() factory functions. You won't need these unless you want to swap in a custom class in place of LogRecord to encapsulate log messages and their metadata.
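If you ever do want to customize the records themselves, those factory hooks are the way in. Here's a minimal sketch, based on the factory pattern shown in the logging docs, that stamps a hypothetical attribute onto every LogRecord:

import logging

old_factory = logging.getLogRecordFactory()

def record_factory(*args, **kwargs):
    # Build the record the normal way, then attach extra metadata.
    record = old_factory(*args, **kwargs)
    record.app_version = "1.2.3"  # Hypothetical custom attribute
    return record

logging.setLogRecordFactory(record_factory)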
The Logger and Handler Classes
The Logger and Handler classes are both central to how logging works, and they interact with each other frequently. A Logger, a Handler, and a LogRecord each have a .level associated with them.
The Logger takes the LogRecord and passes it along to the Handler, but only if the LogRecord's level is at or above the Logger's effective level. The same test applies for the LogRecord versus the Handler. This is called level-based filtering, and Logger and Handler implement it in slightly different ways.
In other words, before the message that you log can do anything, a (minimum) two-step test is applied. A LogRecord must have a level at least as high as both the logger and handler in order to be fully passed from a logger to handler and then logged to the end stream (which could be sys.stdout, a file, or an email sent via SMTP).
PEP 282 explains how this works:

Each Logger object keeps track of an interest level (or threshold) for logs and ignores requests for logs below that level. (Source)
Where then, for both Logger and Handler, does this level-based filtering actually take place?
For the Logger class, it's a reasonable first guess that the logger would compare its .level attribute to the level of the LogRecord and be done there. However, it's slightly more involved than that.
For loggers, level-based filtering happens in .isEnabledFor(), which in turn calls .getEffectiveLevel(). Never just check logger.level; always use logger.getEffectiveLevel() instead. The reason is that Logger objects are arranged in a hierarchical namespace. (More on that later.)
By default, a Logger instance has a level of 0 (NOTSET). Loggers also have parent loggers, one of which is the root logger that serves as the ancestor of all other loggers. A logger walks up its hierarchy of parents to find its effective level, ultimately falling back on root if no other parent sets a level.
Here's where that happens in the Logger class:
class Logger(Filterer):
    # ...
    def getEffectiveLevel(self):
        logger = self
        while logger:
            if logger.level:
                return logger.level
            logger = logger.parent
        return NOTSET

    def isEnabledFor(self, level):
        try:
            return self._cache[level]
        except KeyError:
            _acquireLock()
            if self.manager.disable >= level:
                is_enabled = self._cache[level] = False
            else:
                is_enabled = self._cache[level] = level >= self.getEffectiveLevel()
            _releaseLock()
        return is_enabled
Here's an example using the source code you just saw above:
>>>
>>> import logging
>>> logger = logging.getLogger("app")
>>> logger.level # No!
0
>>> logger.getEffectiveLevel()
30
>>> logger.parent
<RootLogger root (WARNING)>
>>> logger.parent.level
30
The lesson is to not rely on .level. If you haven't explicitly set a level on your logger object, and you're depending on .level for some reason, then your logging setup will likely behave differently than you expected it to.
What about Handler? For handlers, the level-to-level comparison is simpler, though it actually happens in .callHandlers() from the Logger class:
class Logger(Filterer):
    # ...
    def callHandlers(self, record):
        c = self
        found = 0
        while c:
            for hdlr in c.handlers:
                found = found + 1
                if record.levelno >= hdlr.level:
                    hdlr.handle(record)
For a given LogRecord instance (named record in the source code above), a logger checks each of its registered handlers and does a quick check on the .level attribute of that Handler instance. The record is passed on only if the LogRecord's .levelno is greater than or equal to the handler's level. A docstring in logging describes this as "conditionally emit[ting] the specified logging record."
The Filter and Filterer Classes
Earlier, we posed the question: where does level-based filtering actually take place? In answering it, it's easy to get distracted by the Filter and Filterer classes. Somewhat ironically, neither the Filter nor the Filterer class is required for level-based filtering on Logger and Handler instances.
The level-based filtering that is carried out by default can be supplemented with additional function-based filters using Filter and Filterer. I like to think of it as filtering à la carte.
Filterer is the base class for both Logger and Handler, because both of these can receive additional custom filters that you specify. You add instances of Filter to them with logger.addFilter() or handler.addFilter(), which is what self.filters refers to in the following method:
class Filterer(object):
    # ...
    def filter(self, record):
        rv = True
        for f in self.filters:
            if hasattr(f, 'filter'):
                result = f.filter(record)
            else:
                result = f(record)
            if not result:
                rv = False
                break
        return rv
Given a record (which is a LogRecord instance), .filter() returns True or False depending on whether that record gets the okay from this class's filters.
Here is .handle() for the Logger and Handler classes, respectively:
class Logger(Filterer):
    # ...
    def handle(self, record):
        if (not self.disabled) and self.filter(record):
            self.callHandlers(record)

# ...

class Handler(Filterer):
    # ...
    def handle(self, record):
        rv = self.filter(record)
        if rv:
            self.acquire()
            try:
                self.emit(record)
            finally:
                self.release()
        return rv
There are no additional filters included with Logger or Handler by default, but here's a quick example of how you could add one:
>>>
>>> import logging
>>> logger = logging.getLogger("rp")
>>> logger.setLevel(logging.INFO)
>>> logger.addHandler(logging.StreamHandler())
>>> logger.filters # Initially empty
[]
>>> class ShortMsgFilter(logging.Filter):
...     """Only allow records that contain long messages (> 25 chars)."""
...     def filter(self, record):
...         msg = record.msg
...         if isinstance(msg, str):
...             return len(msg) > 25
...         return False
...
>>> logger.addFilter(ShortMsgFilter())
>>> logger.filters
[<__main__.ShortMsgFilter object at 0x10c28b208>]
>>> logger.info("Reeeeaaaaallllllly long message") # Length: 31
Reeeeaaaaallllllly long message
>>> logger.info("Done") # Length: <25, no output
Above, the class ShortMsgFilter is defined with its .filter() overridden. Alternatively, instead of defining a class, you could pass a callable to .addFilter(), such as a function, a lambda, or a class that defines .__call__().
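For instance, the same length check could be written as a plain function. This is a sketch that relies on the Filterer.filter() fallback you saw above, which calls the object directly when it has no .filter() method:

>>> def short_msg_filter(record):
...     """Same rule as the class above, but as a bare callable."""
...     return isinstance(record.msg, str) and len(record.msg) > 25
...
>>> logger.addFilter(short_msg_filter)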
The Manager Class
There's one more behind-the-scenes actor in logging that deserves a mention: the Manager class. What matters most isn't the Manager class itself but the single instance of it that acts as a container for the growing hierarchy of loggers defined across packages. You'll see in the next section how a single instance of this class holds the module together and lets its parts communicate with one another.
The All-Important Root Logger
One Logger instance stands out among the others. It's known as the root logger:
class RootLogger(Logger):
    def __init__(self, level):
        Logger.__init__(self, "root", level)

# ...

root = RootLogger(WARNING)
Logger.root = root
Logger.manager = Manager(Logger.root)
The last three lines of this code block are one of the clever tricks used by the logging package. Here are a couple of points to note:
- The root logger is just a no-frills Python object with the identifier root. It has a level of logging.WARNING and a .name of "root". As far as the class RootLogger is concerned, this unique name is all that's special about it.
- The root object in turn becomes a class attribute for the Logger class. This means that all instances of Logger, and the Logger class itself, have a .root attribute that is the root logger. This is another example of a singleton-like pattern being enforced in the logging package.
- A Manager instance is set as the .manager class attribute for Logger. This eventually comes into play in logging.getLogger("name"). The .manager does all the facilitation of searching for existing loggers with the name "name" and creating them if they don't exist.
The Logger Hierarchy
In the logger namespace, absolutely everything is a child of root. That applies to both the loggers you define yourself and the ones you import from third-party libraries.
Remember how .getEffectiveLevel() returned 30 (WARNING) for our logger instances earlier, even though we hadn't explicitly set it? That's because the root logger sits at the top of the hierarchy, and its level of WARNING is what any nested loggers left at the null level NOTSET fall back to:
>>>
>>> root = logging.getLogger() # Or getLogger("")
>>> root
<RootLogger root (WARNING)>
>>> root.parent is None
True
>>> root.root is root # Self-referential
True
>>> root is logging.root
True
>>> root.getEffectiveLevel()
30
The same logic applies to the search for a logger's handlers. The search is, in effect, a reverse-order search up a logger's parents.
A Multi-Handler Design
Although the logger hierarchy may appear to be a good idea in theory, how useful is it in actual use?
Instead of continuing to look at the logging code, let’s write our own mini-application that makes use of the logger hierarchy to lessen boilerplate code and maintain scalability as the project’s codebase expands.
Here is the project structure:
project/
│
└── project/
├── __init__.py
├── utils.py
└── base.py
The actual functions in utils.py and base.py are unimportant. What we're paying attention to here is how the logging objects in the modules of project/ interact with one another.
Say, for example, that you want to design a multifaceted logging setup:
- Each module gets a logger with multiple handlers.
- Some of the handlers are shared between different logger instances in different modules. These handlers only care about level-based filtering, not the module where the log record emanated from. There is a handler for DEBUG messages, one for INFO, one for WARNING, and so on.
- Each logger is also tied to one additional handler that only receives LogRecord instances from that lone logger. You can call this a module-based file handler.
What we’re aiming for would appear visually something like this:
a multifaceted logging strategy (Image: Real Python)
The two turquoise objects are Logger instances that were created using logging.getLogger(__name__) for each module in a package. Everything else is an instance of a handler.
This design was created with neat compartmentalization in mind. Messages from a single logger or messages of a certain level and higher coming from any logger or module can be conveniently viewed.
The logger hierarchy’s characteristics make it appropriate for establishing this multifaceted logger-handler layout. Why does that matter? The Django documentation provides the following succinct explanation:
What makes the hierarchy so important? Because loggers can be set to propagate their logging calls to their parents. This allows you to define a single set of handlers at the top of a logger tree and catch all logging calls in the subtree of loggers. A logging handler defined in the project namespace will catch all logging messages sent to the project.interesting and project.interesting.stuff loggers. (Source)
When a logger propagates, it keeps moving up its chain of parents in search of handlers. For a Logger instance, the .propagate attribute is True by default:
>>>
>>> logger = logging.getLogger(__name__)
>>> logger.propagate
True
In .callHandlers(), if propagate is True, each successive parent gets reassigned to the local variable c until the hierarchy is exhausted:
class Logger(Filterer):
    # ...
    def callHandlers(self, record):
        c = self
        found = 0
        while c:
            for hdlr in c.handlers:
                found = found + 1
                if record.levelno >= hdlr.level:
                    hdlr.handle(record)
            if not c.propagate:
                c = None
            else:
                c = c.parent
What this means is that, because the __name__ dunder variable within a package's __init__.py module is just the name of the package, the logger defined there becomes a parent to any loggers defined in the other modules of the same package.
Here are the resulting .name attributes assigned to logger via logging.getLogger(__name__):
Module | Attribute
---|---
project/__init__.py | 'project'
project/utils.py | 'project.utils'
project/base.py | 'project.base'
The ‘project.utils’ and ‘project.base’ loggers will latch onto not only their own direct handlers but also any handlers connected to ‘project’ because they are children of ‘project’.
Let's start creating the modules. __init__.py comes first:
# __init__.py
import logging

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

levels = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")
for level in levels:
    handler = logging.FileHandler(f"/tmp/level-{level.lower()}.log")
    handler.setLevel(getattr(logging, level))
    logger.addHandler(handler)

def add_module_handler(logger, level=logging.DEBUG):
    handler = logging.FileHandler(
        f"/tmp/module-{logger.name.replace('.', '-')}.log"
    )
    handler.setLevel(level)
    logger.addHandler(handler)
This module gets imported when the project package is imported. For each level from DEBUG through CRITICAL, you add one handler and attach it to a single logger at the top of the hierarchy.
Additionally, you define a utility function that extends loggers with an additional FileHandler, whose filename matches the name of the module where the logger is defined. (This is predicated on the logger being defined with __name__.)
Next, you can add some boilerplate logger setup to base.py and utils.py. Notice that you only need to add one extra handler, using the add_module_handler() function from __init__.py. You don't need to worry about the level-oriented handlers because they're already attached to the parent logger named "project":
# base.py
import logging

from project import add_module_handler

logger = logging.getLogger(__name__)
add_module_handler(logger)

def func1():
    logger.debug("debug called from base.func1()")
    logger.critical("critical called from base.func1()")
And here is utils.py:
# utils.py
import logging

from project import add_module_handler

logger = logging.getLogger(__name__)
add_module_handler(logger)

def func2():
    logger.debug("debug called from utils.func2()")
    logger.critical("critical called from utils.func2()")
Let's see how all of this works together in a fresh Python session:
>>>
>>> from pprint import pprint
>>> import project
>>> from project import base, utils
>>> project.logger
<Logger project (DEBUG)>
>>> base.logger, utils.logger
(<Logger project.base (DEBUG)>, <Logger project.utils (DEBUG)>)
>>> base.logger.handlers
[<FileHandler /tmp/module-project-base.log (DEBUG)>]
>>> pprint(base.logger.parent.handlers)
[<FileHandler /tmp/level-debug.log (DEBUG)>,
<FileHandler /tmp/level-info.log (INFO)>,
<FileHandler /tmp/level-warning.log (WARNING)>,
<FileHandler /tmp/level-error.log (ERROR)>,
<FileHandler /tmp/level-critical.log (CRITICAL)>]
>>> base.func1()
>>> utils.func2()
The resulting log files show that our filtering system works as intended. Module-oriented handlers direct one logger to a specific file, while level-oriented handlers direct multiple loggers to a different file:
$ cat /tmp/level-debug.log
debug called from base.func1()
critical called from base.func1()
debug called from utils.func2()
critical called from utils.func2()
$ cat /tmp/level-critical.log
critical called from base.func1()
critical called from utils.func2()
$ cat /tmp/module-project-base.log
debug called from base.func1()
critical called from base.func1()
$ cat /tmp/module-project-utils.log
debug called from utils.func2()
critical called from utils.func2()
A disadvantage worth mentioning is that this design introduces a lot of redundancy. One LogRecord instance can end up going to six or more files. That's also a non-negligible amount of file I/O that could add up in a performance-critical application.
Now that you've seen a practical example, let's change gears and dig into a possible source of confusion in logging.
The “Why Didn’t My Log Message Go Anywhere?” Dilemma
When logging, there are two common situations where it's easy to trip up:
- You logged a message that seemingly went nowhere, and you’re not sure why.
- Instead of being suppressed, a log message appeared in a place that you didn’t expect it to.
There are one or two common reasons for each of these. First up: you logged a message that seemingly went nowhere, and you're not sure why.
Remember that if you don't set a custom level for your logger, it will climb its hierarchy until it reaches the root logger, whose own level is WARNING. The effective level of your logger is therefore WARNING:
>>>
>>> import logging
>>> logger = logging.getLogger("xyz")
>>> logger.debug("mind numbing info here")
>>> logger.critical("storm is coming")
storm is coming
Because of this default, the .debug() call goes nowhere.

Now for the second scenario: a log message showed up in a place you didn't expect it to, instead of being suppressed.
When you defined your logger in the previous example, you didn't add any handlers to it. Why, then, is it writing to the console?
The reason is that logging subtly uses a module-level handler called lastResort that writes to sys.stderr if no other handlers are found:
class _StderrHandler(StreamHandler):
    # ...
    @property
    def stream(self):
        return sys.stderr

_defaultLastResort = _StderrHandler(WARNING)
lastResort = _defaultLastResort
The lastResort handler comes into play when a logger goes looking for its handlers:
class Logger(Filterer):
    # ...
    def callHandlers(self, record):
        c = self
        found = 0
        while c:
            for hdlr in c.handlers:
                found = found + 1
                if record.levelno >= hdlr.level:
                    hdlr.handle(record)
            if not c.propagate:
                c = None
            else:
                c = c.parent
        if (found == 0):
            if lastResort:
                if record.levelno >= lastResort.level:
                    lastResort.handle(record)
If the logger comes up empty-handed in its search for handlers (both its own direct handlers and the handlers of its parent loggers), then it uses the lastResort handler.
There's one more small detail worth knowing about here. So far, this section has mostly discussed instance methods (methods defined on classes) rather than the identically named module-level functions of the logging package, which are covered in a separate section below.
If you use the functions, such as logging.info() rather than logger.info(), then something slightly different happens internally: the function calls logging.basicConfig(), which adds a StreamHandler that writes to sys.stderr. In the end, the behavior is practically the same:
>>>
>>> import logging
>>> root = logging.getLogger("")
>>> root.handlers
[]
>>> root.hasHandlers()
False
>>> logging.basicConfig()
>>> root.handlers
[<StreamHandler <stderr> (NOTSET)>]
>>> root.hasHandlers()
True
Taking Advantage of Lazy Formatting
Now let's switch gears and take a closer look at how messages get joined with their data. Though it has since been supplanted by str.format() and f-strings, you've probably used Python's percent-style formatting to do something like this:
>>>
>>> print("To iterate is %s, to recurse %s" % ("human", "divine"))
To iterate is human, to recurse divine
Consequently, you may be tempted to do the same thing in a logging call:
>>>
>>> # Bad! Check out a more efficient alternative below.
>>> logging.warning("To iterate is %s, to recurse %s" % ("human", "divine"))
WARNING:root:To iterate is human, to recurse divine
This calls logging.warning() with the entire format string and its arguments as the message argument.
Here's the recommended alternative, straight from the logging docs:
>>>
>>> # Better: formatting doesn't occur until it really needs to.
>>> logging.warning("To iterate is %s, to recurse %s", "human", "divine")
WARNING:root:To iterate is human, to recurse divine
Yes, it looks a little odd. It seems to defy the conventions of how percent-style string formatting works, but the format string is merged with its arguments lazily rather than greedily. Here's what that means.
The method signature for Logger.warning() looks like this:
def warning(self, msg, *args, **kwargs)
The same holds true for all of the other methods, such as .debug(). When you call warning("To iterate is %s, to recurse %s", "human", "divine"), both "human" and "divine" get caught as *args, and within the scope of the method body, args is equal to ("human", "divine").
Compare this to the first call above:
logging.warning("To iterate is %s, to recurse %s" % ("human", "divine"))
In this form, args is an empty tuple and everything in the parentheses is combined into the phrase “To iterate is human, to recurse divine” and passed as msg.
Why does this matter? Repeated logging calls can slightly degrade runtime performance, and the logging package does its best to keep that in check. By not merging the format string with its arguments right away, logging delays the string formatting until the LogRecord is requested by a Handler.
This occurs in LogRecord.getMessage(), so the LogRecord does not become its fully merged self until logging determines that the LogRecord will actually be passed to a handler.
All of this simply means that the logging package implements some very precise performance optimizations where it is necessary. Even though this may seem like a minor detail, logging’s laziness in handling string formatting can matter if you’re calling logging.debug() the same way repeatedly inside of a loop with function calls as arguments.
Before any of that merging happens, though, a Logger instance will check its .isEnabledFor() to see whether the msg and args should be merged in the first place.
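Here's a small sketch of why that can matter in a hot loop. expensive_repr() is a hypothetical stand-in for any costly call:

import logging

logger = logging.getLogger("perf")
logger.setLevel(logging.INFO)  # DEBUG records will be discarded

def expensive_repr():
    return "a giant object dump"  # Hypothetical costly computation

# Greedy: the % merge happens immediately, whether or not the record
# is ever emitted.
logger.debug("state: %s" % expensive_repr())

# Lazy: msg and args are stored separately on the LogRecord, and the
# merge only happens in LogRecord.getMessage() if a handler needs it.
# (Note that expensive_repr() itself is still evaluated here.)
logger.debug("state: %s", expensive_repr())

# To skip even evaluating the argument, guard the call explicitly:
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("state: %s", expensive_repr())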
Functions vs Methods
Toward the bottom of logging/__init__.py are the module-level functions that are featured prominently in logging's public API. You've already seen the Logger methods such as .debug(), .info(), and .warning(). The top-level functions are wrappers around the correspondingly named methods, but they have two important features:
- They always call their corresponding method from the root logger, root.
- Before calling the root logger's method, they call logging.basicConfig() with no arguments if root doesn't have any handlers. As you saw earlier, it's this call that sets a sys.stderr handler for the root logger.
As an example, here's logging.error():
def error(msg, *args, **kwargs):
    if len(root.handlers) == 0:
        basicConfig()
    root.error(msg, *args, **kwargs)
The same pattern applies to logging.debug(), logging.info(), and the others. It's interesting to trace the chain of commands. Eventually, you'll end up in the same place, which is where the internal Logger._log() method gets called.
All calls to the level-based functions, such as debug(), info(), and warning(), route to this spot. _log() primarily has two purposes:
- Call self.makeRecord(): Make a LogRecord instance from the msg and other arguments you pass to it.
- Call self.handle(): This determines what actually gets done with the record. Where does it get sent? Does it make it there or get filtered out?
Here is a flowchart showing the entire procedure:
logging call internals (Image: Real Python)
You can also use pdb to trace the call stack yourself.
What Does getLogger() Really Do?
Also hiding in this section of the source code is the top-level getLogger(), which wraps Logger.manager.getLogger():
def getLogger(name=None):
    if name:
        return Logger.manager.getLogger(name)
    else:
        return root
This is the entry point for enforcing the singleton logger design:
- If you specify a name, then the underlying .getLogger() does a dict lookup on the string name. What this comes down to is a lookup in the loggerDict of logging.Manager. This is a dictionary of all registered loggers, including the intermediate PlaceHolder instances that are generated when you reference a logger far down in the hierarchy before referencing its parents.
- Otherwise, root is returned. There is only one root: the instance of RootLogger discussed above.
This function is what underlies a hack that lets you view all of the registered loggers:
>>>
>>> import logging
>>> logging.Logger.manager.loggerDict
{}
>>> from pprint import pprint
>>> import asyncio
>>> pprint(logging.Logger.manager.loggerDict)
{'asyncio': <Logger asyncio (WARNING)>,
'concurrent': <logging.PlaceHolder object at 0x10d153710>,
'concurrent.futures': <Logger concurrent.futures (WARNING)>}
Wait a minute. What happened here? It looks like importing another library changed something internal to the logging package, and that's exactly what happened.
First, recall that the class attribute Logger.manager refers to an instance of Manager that is attached to the Logger class. The manager is designed to track and manage all of the singleton instances of Logger, which are kept in its .loggerDict.
Now, this dictionary is empty when you first import logging. But after you import asyncio, the same dictionary is populated with three loggers. This is an example of one module setting the attributes of another module in place. Sure enough, inside asyncio/log.py, you'll find the following:
import logging
logger = logging.getLogger(__package__) # "asyncio"
The key-value pair gets set in Logger.getLogger() so that the manager has its hands on the full namespace of loggers. This means that the object asyncio.log.logger gets registered in the logger dictionary belonging to the logging package. Something similar happens in the concurrent.futures package, which asyncio imports.
You can see the power of the singleton design in an equivalence test:
>>>
>>> obj1 = logging.getLogger("asyncio")
>>> obj2 = logging.Logger.manager.loggerDict["asyncio"]
>>> obj1 is obj2
True
This comparison illustrates (by glossing over some details) what getLogger() ultimately does.
Library vs Application Logging: What Is NullHandler?
That brings us to the final hundred or so lines of the logging/__init__.py source, where NullHandler is defined. Here's the definition in all its glory:
class NullHandler(Handler):
    def handle(self, record):
        pass

    def emit(self, record):
        pass

    def createLock(self):
        self.lock = None
The NullHandler is all about the distinctions between logging in a library versus logging in an application. Let's see what that means.
A library is a generalizable, extensible Python package that is meant to be installed and set up by other users. It was created by a developer specifically to be distributed to users. Popular open-source initiatives like NumPy, dateutil, and cryptography are a few examples.
An application is designed for a more narrowly defined purpose and a much smaller group of users (possibly just one user). It's a program or set of programs that the user customizes closely to do a limited set of things. An example of an application is a Django app sitting behind a web page. Applications commonly use (import) libraries and the tools they contain.
Logging in an application and logging in a library call for different best practices, and that's where NullHandler comes in. It's basically a do-nothing stub class.
If you're writing a Python library, you really need to do this one minimalist piece of setup in your package's topmost __init__.py:
# Place this in your library's uppermost `__init__.py`
# Nothing else!
import logging
logging.getLogger(__name__).addHandler(NullHandler())
This accomplishes two vital goals.
First, a library logger that's declared with logger = logging.getLogger(__name__) (without any further configuration) logs to sys.stderr by default, even if that's not what the end user wants. This could be described as an opt-out approach, where the library's end users have to go in and manually turn off logging to their console if they don't want it.
Instead, common wisdom says to use an opt-in approach: don't emit any log messages by default, and let the library's end users decide whether they want to further configure the library's loggers and add handlers to them. Here's how the logging package's author, Vinay Sajip, put it more bluntly:
A third-party library that uses logging should not, by default, produce logging output that a developer or user of an application using it would not want. (Source)
It's now up to the library user, not the library developer, to incrementally call methods such as logger.addHandler() or logger.setLevel().
The second reason for NullHandler's existence is more of a legacy concern. In Python 2.7 and earlier, trying to log a LogRecord from a logger with no handler set would raise a warning. Adding the no-op class NullHandler avoids this.
Here's specifically what happens in the line logging.getLogger(__name__).addHandler(NullHandler()) from above:
- Python gets (creates) the Logger instance with the same name as your package. If you're designing the calculus package, within __init__.py, then __name__ will be equal to 'calculus'.
- A NullHandler instance gets attached to this logger. That means that Python will not default to using the lastResort handler.
Keep in mind that any logger created in any of the package's other .py modules will be a child of this logger in the logger hierarchy. Because this handler also belongs to them, they won't need to fall back on the lastResort handler and won't default to logging to standard error (stderr).
As a quick example, say your library has the following structure:
calculus/
│
├── __init__.py
└── integration.py
As the library developer, you are free to do the following in integration.py:
# calculus/integration.py
import logging

logger = logging.getLogger(__name__)

def func(x):
    logger.warning("Look!")
    # Do stuff
    return None
Now, a user installs your library from PyPI with pip install calculus. They use from calculus.integration import func in some application code. Like any other Python object, this user is free to modify and configure the library's logger object however they see fit, to their heart's content.
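For instance, the application developer could opt in to the calculus library's log output with something like this sketch:

# Application code, not library code (calculus is the example library above)
import logging
from calculus.integration import func

lib_logger = logging.getLogger("calculus")
lib_logger.addHandler(logging.StreamHandler())  # Opt in to console output
lib_logger.setLevel(logging.DEBUG)

func(1)  # The library's logger.warning("Look!") now reaches stderr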
What Logging Does With Exceptions
One thing you should be aware of is the potential for exceptions that arise from your calls to logging. It would be the height of irony if a logging.error() call, whose purpose is to give you more verbose debugging information, ended up raising an exception of its own.
In a clever move, the logging package will print the traceback but not raise the exception itself if it runs into a problem with logging.
Here's an example of a typical typo: passing two arguments to a format string that expects only one. What you see below is not an exception being raised, but rather a prettified printed traceback of the internal exception, which itself was suppressed:
>>>
>>> logging.critical("This %s has too many arguments", "msg", "other")
--- Logging error ---
Traceback (most recent call last):
  File "lib/python3.7/logging/__init__.py", line 1034, in emit
    msg = self.format(record)
  File "lib/python3.7/logging/__init__.py", line 880, in format
    return fmt.format(record)
  File "lib/python3.7/logging/__init__.py", line 619, in format
    record.message = record.getMessage()
  File "lib/python3.7/logging/__init__.py", line 380, in getMessage
    msg = msg % self.args
TypeError: not all arguments converted during string formatting
Call stack:
  File "<stdin>", line 1, in <module>
Message: 'This %s has too many arguments'
Arguments: ('msg', 'other')
This lets your program gracefully continue with its actual program flow. The rationale is that you wouldn't want an uncaught exception, originating from a logging call of all things, to stop a program dead in its tracks.

Tracebacks can be messy, but this one is informative and relatively straightforward. The suppression of logging-related exceptions is made possible by Handler.handleError(). When a handler calls .emit(), which is the method where it tries to log the record, it falls back to .handleError() if anything goes wrong. Here's the implementation of .emit() for the StreamHandler class:
def emit(self, record):
    try:
        msg = self.format(record)
        stream = self.stream
        stream.write(msg + self.terminator)
        self.flush()
    except Exception:
        self.handleError(record)
Any exception related to formatting or writing gets caught rather than raised, and handleError graciously writes the traceback to sys.stderr.
Logging Python Tracebacks
Speaking of exceptions and their tracebacks, what about cases where your program should log the exception and keep running regardless?
Let’s go over a few approaches to doing this.
Here's a contrived example of a lottery simulator that uses code that isn't exactly Pythonic. You're developing an online lottery game where users can wager on their lucky number:
import random

class Lottery(object):
    def __init__(self, n):
        self.n = n

    def make_tickets(self):
        for i in range(self.n):
            yield i

    def draw(self):
        pool = self.make_tickets()
        random.shuffle(pool)
        return next(pool)
Behind the front end of your application sits the critical code below. You want to make sure you keep track of any errors on the site that might lose a user their money. The first (suboptimal) approach is to use logging.error() and log the str form of the exception instance itself:
try:
    lucky_number = int(input("Enter your ticket number: "))
    drawn = Lottery(n=20).draw()
    if lucky_number == drawn:
        print("Winner chicken dinner!")
except Exception as e:
    # NOTE: See below for a better way to do this.
    logging.error("Could not draw ticket: %s", e)
At best, this gets you the literal exception message rather than the traceback. You check your website's server logs and find this cryptic message:
ERROR:root:Could not draw ticket: object of type 'generator' has no len()
Hmm. As the developer of the program, you have a serious problem on your hands, and a user has been cheated as a result. But the exception message by itself may not be very informative. Wouldn't it be nice to see the lineage of the traceback that led to this exception?
The proper solution is to use logging.exception(), which logs a message with level ERROR and also displays the exception traceback. Replace the final two lines above with these:
except Exception:
    logging.exception("Could not draw ticket")
Now you have a much better idea of what's going on:
>>>
ERROR:root:Could not draw ticket
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
  File "<stdin>", line 9, in draw
  File "lib/python3.7/random.py", line 275, in shuffle
    for i in reversed(range(1, len(x))):
TypeError: object of type 'generator' has no len()
Using exception() saves you from having to reference the exception yourself, because logging pulls it in with sys.exc_info().
This makes it much clearer that the source of the problem is random.shuffle(), which needs to know the length of the object it is shuffling. Because our Lottery class passes a generator to shuffle(), the pool raises an exception before it can even be shuffled, much less produce a winning ticket.
When deep, multi-library tracebacks are involved in huge, complex applications and you are unable to step into them using a live debugger like pdb, you’ll find that logging.exception() is even more helpful.
Internally, logging.Logger.exception(), and hence logging.exception(), is just one line of code:
def exception(self, msg, *args, exc_info=True, **kwargs):
    self.error(msg, *args, exc_info=exc_info, **kwargs)
In other words, logging.exception() just calls logging.error() with exc_info=True, which is otherwise False by default. If you want to log an exception traceback but at a level other than logging.ERROR, just call the corresponding function or method with exc_info=True.
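For example, if you'd rather record the traceback at WARNING severity instead of ERROR, a sketch like this works:

import logging

try:
    result = 1 / 0
except ZeroDivisionError:
    # Same traceback capture as logging.exception(), but at WARNING level
    logging.warning("Recoverable hiccup", exc_info=True)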
Keep in mind that exception() should only be called in the context of an exception handler, inside of an except block:
for i in data:
    try:
        result = my_longwinded_nested_function(i)
    except ValueError:
        # We are in the context of exception handler now.
        # If it's unclear exactly *why* we couldn't process
        # `i`, then log the traceback and move on rather than
        # ditching completely.
        logger.exception("Could not process %s", i)
        continue
Use this pattern sparingly rather than as a way to suppress any and all exceptions. It's most useful when you're debugging a long function call stack where you'd otherwise see an ambiguous, unclear, and hard-to-trace error.