Sunday, 19 September 2010

Python Logging Functionality - Facts vs. Myths

tl;dr: Some inaccuracies and misunderstandings about how stdlib logging works have been expressed in the documentation and marketing presentations of a suggested alternative. On closer examination, certain statements which imply a lack of functionality or other shortcomings in stdlib logging have been shown to be inaccurate. You should feel confident that in using the stdlib logging package you are very unlikely to find it wanting. If it seems too hard or not possible to achieve some result that you want, raise the issue on comp.lang.python or bugs.python.org and be assured of prompt responses and resolutions. Now, you can read on if you want more details :-).

Recently Armin Ronacher, developer of Werkzeug and Jinja among other things, released a library for logging which, in his opinion, is preferable to the logging package provided as part of Python since the 2.3 release. Now preferences are a very personal thing, and since Armin has made worthy software contributions to the Python community, there will no doubt be many people who trust his judgement and follow where he leads. In general, choice in software is good, and people should be free to make up their own minds about the software they want to use.   However, in order to make an informed decision, people need accurate information on which to base that decision. For example, I chose to use argparse over the stdlib's optparse even before argparse landed in the stdlib; in making that choice, I looked at Steven Bethard's rationale as described here and was fortunate to have the time to be able to examine and evaluate each of his points for myself.

In the case of choosing whether to use Python logging or something else, people may or may not have the time to investigate in any depth the relative merits of the alternatives on offer. Armin has seemingly made it easier for busy people by including, in his library's documentation, reasons why you should use his offering in preference to the stdlib's logging package, and what he perceives to be problems with the code in the stdlib. This is what Steven did for argparse, but in the case of logging Armin has made a number of statements which are inaccurate or misleading. No doubt he has not meant to deliberately mislead anyone, but some of the things he has said have led me to conclude that perhaps Armin doesn't really understand some of the aspects of how stdlib logging works, and how you can use it in Web applications, desktop applications, casual utility scripts, and of course libraries which support all of these application development constituencies. And if Armin - a clearly talented developer - is making some egregious mistakes in his thinking about how stdlib logging works and what you can or can't do with it, then it's possible that the lack of understanding which he demonstrates is more widespread; as maintainer of the stdlib logging package, I feel there's a need to do something to rectify this. Hence, this post.

Isn't that what the stdlib documentation is for? Of course, at least partly, and while the stdlib logging documentation undergoes continual additions and revisions (which hopefully can be viewed as continuous improvement), it is a difficult medium in which to address certain points. Of necessity, the stdlib documentation for most modules is presented largely in a reference style, with only limited examples which might serve as tutorials. And even for sections where much effort has been expended on presentation in a tutorial style, there is always room for expositions which do not fit easily in the stdlib documentation: for example, a "cookbook" style presentation which tries to show how to solve common and less common problems which arise in considering how best to use a package. And then, who reads documentation, anyway? ;-)

Of course no software is perfect, nor can any design or implementation hope to please all of its users all of the time. I have always viewed stdlib logging as capable of being improved, not because there was anything especially wrong with it to start with, but rather that through the philosophy of continuous improvement you have the best chance of providing something which remains relevant and useful to as many users as possible. That, it seems to me, is a core Python philosophy, enshrined in the procedures, standards and practices of the PEP process.

You might be thinking that this post is a bit of an over-reaction on my part. Perhaps, but as well as the concern I expressed about there being a lack of understanding out there about how logging works, I have another concern. This is based on a strong view that for certain types of functionality, too much choice is not a good thing. For example, imagine if there were several competing implementations of regular expressions in Python, or implementations of pickling functionality, in common use. Some library developers would use one particular version, others a different one. So an application developer using several such libraries would not only have to pull in additional dependencies but also spend some time and thought on how to get the different libraries to work together. And so it is with logging, in my view; it's an infrastructure function and a lot of libraries out there already use it. Of course if it fell short of meeting developer requirements, then it should not be immune from being supplanted by a better alternative (as with optparse and argparse). However, I believe that stdlib logging still remains fit for purpose, and am willing to stand up and say so when the criticisms made of it are without merit or overstated.

So, let's review some of the statements in Armin's documentation and presentations which merit closer examination. It's a long list, hence the tl;dr at the top of this post, but there are even more points that I could have made if not for the lack of time :-(

Each point below is presented in four parts: the statement made, what it implies (the "myth"), how it really is (the "facts"), and conclusions.
In the opening paragraph in the "Why you should use Logbook" section
Despite the existence of stdlib logging, you should use Logbook. Logbook is better than stdlib logging now, and is ready to be used. As stated in the same paragraph, Logbook is in the alpha stage of development and should be treated as a developer preview, and only supports Python 2.6 and 2.7. Logbook is at present suitable for evaluation use only.
In the "Advantages over Logging" section
If properly configured, Logbook’s logging calls will be very cheap and provide a great performance improvement over the standard library’s logging module. While we are not there yet, there will be some performance improvements in the upcoming versions when we implement certain critical code paths in C. stdlib logging is slow. You can evaluate these test scripts yourself: time_logging.py and time_logbook.py. In the output of these scripts, you need to look at the "log_simple" line on each for a meaningful comparison. It's not a scientific benchmark, but on my machine, with these scripts, the Logbook code runs around 30% slower than the stdlib code. Logbook is at present a fair bit slower than stdlib logging, so it is somewhat premature to boast about "great performance improvements".
While it's possible for Logbook to improve performance by writing performance-critical code in C, it's also possible for stdlib code to improve using the same approach. (Note, however, that building C extensions of third-party libraries from source under Windows can be a pain).
Anyway, according to the numbers shown by these test scripts, logging calls take of the order of tens of microseconds - not a real concern in most scenarios.
If you are having performance problems which you think are due to logging overhead, the recommendation is to profile and see where the bottlenecks really are. If they are found to be in logging, please post your findings on comp.lang.python or bugs.python.org if you think that's appropriate.
Logbook also supports the ability to inject additional information for all logging calls happening in a specific thread. For example, this makes it possible for a web application to add request-specific information to each log record such as remote address, request URL, HTTP method and more. stdlib logging can't do this. I have already explained to Armin how to do this - see this post. stdlib logging CAN do this fairly easily.
The Logbook system is (besides the stack) stateless and unit testing it is very simple. If context managers are used, it is impossible to corrupt the stack, so each test can easily hook in custom log handlers. There's something inherently good about a shared stack, but having a shared dict is somehow worse. Unit testing with stdlib logging is not simple. Unit tests with stdlib logging can't hook in custom handlers. There's no convincing argument that the type of shared state (execution context) offered by Logbook is better than how stdlib works. You just have to take it on trust. Unit testing with stdlib logging is not hard, since you can add/remove handlers to/from loggers, close handlers in tearDown which you opened in setUp, and disable loggers you don't need. Unit tests with stdlib logging CAN hook in any handlers you want; in fact, Python's own regression tests use unittest to do testing of the logging package itself. From the evidence presented, there's no proof of any actual functional advantage of Logbook over logging. You just have to take it on trust.
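The setUp/tearDown pattern just described can be sketched as follows (the `ListHandler` class and the logger name are made up for the example):

```python
import logging
import unittest

class ListHandler(logging.Handler):
    """A simple test handler which records formatted messages in a list."""
    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())

class LoggingTest(unittest.TestCase):
    def setUp(self):
        self.logger = logging.getLogger('component.under.test')
        self.logger.setLevel(logging.DEBUG)
        self.logger.propagate = False  # keep output away from root handlers
        self.handler = ListHandler()
        self.logger.addHandler(self.handler)

    def tearDown(self):
        self.logger.removeHandler(self.handler)
        self.handler.close()

    def test_logs_expected_message(self):
        self.logger.warning('something happened')
        self.assertIn('something happened', self.handler.messages)
```

Each test gets a fresh handler, and tearDown restores the logger to its prior state.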
In the "Design Principles" section
Logbook [...] supports paradigms we think are more suitable for modern applications than the traditional Java inspired logging system that can also be found in the Python standard library and many more programming languages. Having a design that was inspired by a Java library is somehow bad, just because of Java. The fact that it appears in many more programming languages damns it even further. Although the Java log4j system was an influence on stdlib logging, this is because it had some good abstract ideas, well worth using. Though we have a lot of very clever people in the Python community, not every clever person is a Pythonista; it's very parochial to think that good ideas can only come from the Python community. The ideas that emerged in log4j and other packages are the ideas of "what happened? where did it happen? how important is it? who wants to know?" and if you think about it, these ideas are hardly Java-specific. In fact, they are pretty central to the problem domain addressed by logging. So - "what happened?" is the details of the logging call, "where did it happen?" is the logger name, "how important is it?" is the level, and "who wants to know?" is the handler. Hardly that complicated, and pretty much a minimum requirement for any logging package that aspires to the name.
Also, anyone who bothers to look at log4j in detail will see that Python logging is not a mindless translation - it's very Pythonic. Beyond the basic abstractions of "What? Where? How Important? Who Needs To Know?", there's no real correspondence between the Java artefacts and the Python ones. Using David A. Wheeler's SLOCCount, log4j 1.2.15 = 168 source files, around 16K SLOC; Python 2.6 logging = 3 source files, < 1.5K SLOC. Almost every class in stdlib logging maps to a core concept of logging, other than the handler classes, which are geared towards specific types of audience for logging messages.
There's no indication that a Java influence has been detrimental to stdlib logging. All of the core concepts from stdlib logging are also present in Logbook: loggers ("where?"), levels ("how important?"), handlers ("who wants to know?") and the details passed to logging calls ("what happened?"). So if stdlib logging "suffers" from Java influences, so equally does Logbook.
It's also worth bearing in mind that Python logging works in Python 2.3 -> 2.7 and 3.0 -> 3.2, without the need to depend on specific features present only in the more recent versions of Python. And while a case can be made for certain programming styles like the with statement to have better support in stdlib logging, these are by no means essential to the basic business of logging; in any case support for new features can always be added as and when it's needed.
Logbook is unique in that it has the concept of logging channels but that it does not keep a global registry of them. In the standard library’s logging module a logger is attached to a tree of loggers that are stored in the logging module itself as global state. The shared state embodied in Logbook's process or thread context is somehow good, but the shared tree of loggers stored as "global" state in stdlib logging is somehow bad. One reason for the shared state in stdlib logging is that application developers often need explicit access to loggers used by libraries, in order to control the level of verbosity of logging from those libraries, in the context of the application being developed. This verbosity varies over the life of an application - for example, when a bug is discovered in an application in production, it's common to selectively turn the verbosity of different library loggers up and down, while diagnosing the problem. You can't do that when the logger used by a library is hidden completely from a user of that library.
This is a fundamental point which may not bite developers working in limited scenarios, but I've had reason to rely on this functionality time and again. For example: I sometimes need to turn up the verbosity of SQLAlchemy's logging so that I can examine SQL statements being generated from the ORM. If I couldn't access SQLAlchemy's top-level logger - using the name "sqlalchemy" and the fact that, because of the "global" registry, the SQLAlchemy code and my code are both referring to the same object when they say logging.getLogger('sqlalchemy') - I couldn't turn this verbosity up and down, as and when needed.
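In code, the whole technique is a couple of lines. The name 'sqlalchemy' is the logger name SQLAlchemy actually uses, but any library that follows the getLogger(__name__) convention works the same way:

```python
import logging

# Both the application and the library receive the same Logger object
# from the shared registry, so the application can control the library's
# verbosity without any cooperation from the library's code:
sa_logger = logging.getLogger('sqlalchemy')
sa_logger.setLevel(logging.DEBUG)    # temporarily show generated SQL
# ... reproduce the problem, examine the output ...
sa_logger.setLevel(logging.WARNING)  # and turn it back down again
```

This works because getLogger() with the same name always returns the same object - which is precisely the point of the registry.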
Strictly speaking, the tree of loggers in stdlib logging does not need to be global. As a look at logging's source code will show, the tree of loggers is held as an instance variable of an instance of a Manager class. A binding to this Manager instance is currently stored in the Logger class, and through this binding you can say that the tree of loggers is global. This state of affairs has been the situation since the first release of logging in the stdlib.
It would be well within the bounds of possibility to either expand the Manager class to be a context manager, or derive a context manager class from it. If this were done, then the tree of loggers need not be global: it would just be an attribute of the Manager instance, and there could be more than one Manager (or derived context manager) instance in a Python process. This improved context manager may well appear in the future, but at present there has not been a strong enough case made for the need for doing this. If such a case is made (for example, by someone generating a PEP or proto-PEP making the justification) then providing such a feature can be considered, and it would not be a major undertaking (but not a trivial one, either).
There is nothing wrong with logging's design just because it uses a shared registry of loggers. In fact, the shared registry confers some positive, practical benefits in real-life use cases.
In logbook a logger is just an opaque object [...] the lifetime and availability of that object is controlled by the person creating that logger. The registry is necessary for the logging library to give the user the ability to configure these loggers. It's better for a library developer to keep their loggers completely encapsulated in the library, so application developers using that library have no access to it. The need to access the loggers in libraries comes up again and again in practice, particularly in production environments and long-running service processes where you can't just fire up a debugger, when a problem arises, to see what's causing it. With stdlib logging, features exist to turn the verbosity of library logging up and down, for different libraries at different times, without needing to stop the long-running processes. The stated advantage of Logbook over stdlib logging seems illusory. The logger registry actually solves practical problems in real-world environments where multiple libraries from multiple vendors are integrated into an application. It's not clear, from Logbook's documentation, whether you can tune the verbosity of logging in a library component which utilised Logbook for its logging functionality.
Logbook has a completely different concept of dispatching from loggers to the actual handlers which removes the requirement and usefulness of such a registry. The advantage of the logbook system is that it’s a cheap operation to create a logger and that a logger can easily be garbage collected to remove all traces of it. There's no utility in a shared registry between different components of an application. Creating a logger in stdlib logging is expensive. It's important to be able to remove all traces of a logger that you created. The preceding discussions try to show why a shared registry is useful. There's no example given in the Logbook documentation of how you might (as an application developer) tune the logging verbosity of a component you use. It's not clear that this is even possible in Logbook.
Creating loggers in the stdlib is not particularly expensive, and is a one-off operation; so amortized over many uses, the creation should actually be cheaper in the stdlib.
It's true that loggers can't be garbage collected to remove all traces of them, but it's not clear why that is really important, as the memory used by loggers is not significant in normal usage.
stdlib logging is designed the way it is for good reasons, not because its designer lacks imagination. The stated disadvantages of stdlib logging over Logbook are not proven, just stated as if they were.
Logbook moves the burden of delivering a log record from the log channel’s attached log to an independent entity that looks at the context of the execution to figure out where to deliver it. The loggers in stdlib logging are suffering from some kind of burden when delivering log records. In both stdlib logging and Logbook, records are delivered using handlers. Loggers become a way for an application developer to finely control the logging verbosity of their application, so that they are in control rather than library developers who, of necessity, cannot anticipate every circumstance about how their libraries are used and deployed. The context of execution is insufficient to determine how records generated in libraries should be delivered. In practice, you need a combination of the execution context and the wishes of the application developer regarding how library logging is to be configured.
In the "Context sensitive handler stack" section
Let’s take a GUI application [which] might fail or log messages. The typical default behaviour here would be to log into a logfile. Fair enough, that’s how these applications work. But what’s the point in logging if not even a single warning happened? The traditional solution with the logging library from Python is to set the level high (like ERROR or WARNING) and log into a file. When things break, you have a look at the file and hope it contains enough information. When you are in full control [...] with a stack based system like Logbook has, there is a lot more you can do. For example you could immediately after your application boots up instanciate a FingersCrossedHandler. This handler buffers all log records in memory and does not emit them at all. What’s the point? That handler activates when a certain threshold is reached. For example, when the first warning occurs you can write the buffered messages as well as the warning that just happened into a logfile and continue logging from that point. Because there is no point in logging when you will never look at that file anyways. This kind of functionality is only possible with a context sensitive handler stack, which only Logbook has. You can't do this with stdlib logging. You can do exactly this with stdlib logging, where from the earliest release there's been a MemoryHandler class which buffers up records in memory until a message with a specified threshold level is seen, when all the messages buffered so far are then forwarded to a target handler for processing. The supposed advantage of a context-sensitive handler stack in this scenario isn't any particular advantage at all.
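A sketch of the stdlib equivalent of this "fingers crossed" behaviour, using MemoryHandler (the file name and the capacity of 200 are arbitrary choices for the example):

```python
import logging
from logging.handlers import MemoryHandler

target = logging.FileHandler('app.log', 'w')  # the eventual destination

# Buffer up to 200 records in memory; write them all to app.log only when
# a record of severity WARNING or higher arrives (or capacity is reached).
buffered = MemoryHandler(capacity=200, flushLevel=logging.WARNING,
                         target=target)

logger = logging.getLogger('gui.app')
logger.setLevel(logging.DEBUG)
logger.propagate = False
logger.addHandler(buffered)

logger.debug('initializing windowing system')  # buffered, not yet written
logger.warning('font not found')  # flushes the buffer plus this record
```

Until the warning occurs, nothing reaches the file; when it does, the preceding debug context arrives along with it - which is exactly the scenario described above.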
But that alone is not the killer feature of a stack. In a GUI application there is the point where we are still initializing the windowing system. So a file is the best place to log messages. But once we have the GUI initialized, it would be very helpful to show error messages to a user in a console window or a dialog. So what we can do is to initialize at that point a new handler that logs into a dialog. When then a long running tasks in the GUI starts we can move that into a separate thread and intercept all the log calls for that thread into a separate window until the task succeeded. This is somehow a killer feature of a stack, which you can't do any other way. You can do this kind of processing perfectly easily with stdlib logging. There are examples of dialog- or window-based handlers (for Qt, say) in answers to questions on Stack Overflow. You can combine the provided stdlib logging handlers such as MemoryHandler and FileHandler together with your own window-specific handlers (dependent on which windowing system you are using) to achieve the sort of effect described. Another supposed advantage of a context sensitive handler stack, which isn't.
In Armin's Logbook presentation
In the "Why not logging?" slide
stdlib logging is not really suited for Web applications. Don't even bother trying to use stdlib logging with Web applications, it's not worth it. It's perfectly possible to use stdlib logging with Web applications, as I showed Armin in this post, which meets his stated requirements. This statement is just plain wrong and illustrates a lack of understanding of how to use the functionality of stdlib logging, rather than any lack of functionality in stdlib logging.
Logging's central registry of loggers makes unittesting a pain. It's too much trouble to use stdlib logging in your unittests. It doesn't need to be a pain. A future post on this blog will describe scenarios showing how to use logging in your unit tests.
It's probably possible to provide additional utility functions in stdlib logging to make some aspects of unit testing easier, but there are no real showstoppers; if Armin describes his problem scenario in as much detail as he described his Web application configuration problem, I will (hopefully) be able to suggest some ways of easing his pain.
Unless specific details are given as to why unit testing with stdlib logging is considered to be a pain (i.e. what you'd like to do but what you can't do) then this statement must be taken with a pinch of salt.
The same registry also causes issues when libraries want to start logging. The registry is central to the issues caused by libraries doing logging. There are documented things that library authors need to do when setting up logging for libraries. If library developers do not follow the recommendations then this might in fact cause problems, but that has nothing to do with the existence of a central registry (or at least, the connection has not been demonstrated). There's no evidence that the existence of a central registry is somehow responsible for how third-party libraries misuse logging.
You can't delete loggers once created. It's important that you be able to delete loggers once created. The reason why loggers are not deleted once created is that multiple threads (other than the one that wants to delete the logger) can have references to the logger. This is facilitated by having a central registry, but even in a system without a central registry, loggers could be passed between threads. In that scenario, you can't actually guarantee to delete the logger since there could be references to it in another thread.
It's not demonstrated that it's important to expunge all loggers completely from memory. The number of logger instances is related to the number of logical "areas" in an application - the idea of "where" in a component a particular event occurred. This is not so large that the amount of memory used by loggers becomes significant. And stdlib logging allows loggers to be disabled, which means that they cease to perform any logging (until re-enabled) and are as good as gone.
It's true that you can't delete loggers, but they can be easily disabled; they do not, under normal usage patterns, occupy large amounts of memory.
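Disabling works like this (the logger name here is made up for illustration):

```python
import logging

noisy = logging.getLogger('chatty.library')
noisy.disabled = True       # every call through this logger is now a no-op
noisy.warning('discarded')  # produces no output anywhere

noisy.disabled = False      # logging through the logger resumes

# Alternatively, keep the logger active but raise its threshold:
noisy.setLevel(logging.CRITICAL)
```

A disabled logger keeps its identity in the registry, so any references to it held by other threads or modules remain valid - it just stops producing output until re-enabled.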
In the "Why does nobody like logging?" slide
Why does nobody like logging? Nobody likes logging. This is an opinion, not a fact. There are many people who have expressed their thanks to me for the existence of the logging package and feedback saying that it has been very useful to them.
Even Armin has told me that he liked logging "a lot", which is hard to reconcile with what he's now saying - since logging's basic design has been the same from the outset, and it seems to meet his functional needs even if it doesn't press his aesthetic hot buttons.
There's no basis in fact for this statement. It's marketing FUD.
The default configuration of stdlib logging for applications is bad. The statement speaks for itself. On the contrary, the default configuration for applications is consistent with the principle of least surprise and the Zen of Python's "explicit is better than implicit". In common with the Unix philosophy, software should not be excessively verbose except when it needs to be, or is asked to be. By default, stdlib logging will only output messages of severity WARNING or higher. (It's easy to change this default.)
In fact, in the default configuration of logbook 0.2, every DEBUG logging call in every library which uses Logbook in your application will print debug messages to stderr when your application runs, even if you specify that you want to see only messages of severity WARNING or greater. Perhaps you won't care if your application is a GUI application or a Web application, since you won't have a console, and perhaps logbook is expressly designed only for these types of application. For a console application, it's very likely you will get Too Much Information. Here is an example.
The default configuration of stdlib logging is quite sane; there's nothing wrong with it.
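The default behaviour is easy to demonstrate, and just as easy to change when more verbosity is wanted:

```python
import logging

# Out of the box, only WARNING and above reach the console:
logging.warning('shown by default')
logging.info('suppressed: INFO is below the default WARNING threshold')

# Turning the verbosity up is one line:
logging.getLogger().setLevel(logging.DEBUG)
logging.info('now visible')
```

Quiet by default, verbose on request - which is the Unix-philosophy point made above.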
Useless default configuration for libraries The statement speaks for itself. Library developers are supposed to add a NullHandler to their top-level logger and to set the level and propagation flag as needed, but not to add any other handlers to their logging configuration, because that configuration is the prerogative of the application developer. However, it's not possible to prevent library authors from breaking these rules. By default, libraries aren't supposed to handle messages - only to log them - and the application developer is expected to configure any logging (though they don't have to; if they don't, no logging output should be generated). This is consistent with the principle of least surprise and Zen of Python point mentioned earlier.
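The documented library-side configuration amounts to a few lines in the library's top-level module. Here, 'mylib' is a placeholder name; NullHandler is in the stdlib from Python 2.7/3.1 onwards, and the fallback shown is the documented recipe for earlier versions:

```python
import logging

try:
    from logging import NullHandler
except ImportError:
    # Recipe for Python < 2.7: a handler that does nothing with records.
    class NullHandler(logging.Handler):
        def emit(self, record):
            pass

# In the library's top-level module -- and this is ALL the library does;
# attaching real handlers is the application developer's business:
logging.getLogger('mylib').addHandler(NullHandler())
```

With this in place, a library which logs events produces no spurious output or complaints when the application chooses not to configure logging at all.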
Who sets up the logging config? Libraries sometimes call basicConfig. If library developers don't follow the guidelines set down for logging from libraries, that's somehow the fault of stdlib logging. There is documentation about how to configure logging in libraries, and while this may be lacking in some way, it's not immediately clear what that lack is. Perhaps more or better documentation is needed, but that won't eliminate completely the possibility that a library developer will misguidedly add handlers to the logging configuration, say by calling basicConfig().
logging.warn() and friends are not thread-safe. Stay away from stdlib logging if you want to use threads. If logging is correctly configured in an application, there should be no untoward behaviour by stdlib logging. In versions of Python earlier than 2.7.1 and 3.2, the code in basicConfig() omitted acquiring and releasing an internal lock, and you could demonstrate incorrect behaviour in stdlib logging by calling basicConfig() directly or indirectly from multiple threads; this has been fixed in 2.7.1 and 3.2, but nevertheless, it is wrong to do any logging from multiple threads before configuring handlers (if you want any logging output, that is) - because if no handlers have been configured before threads start running, any messages they log could be lost (because there are no handlers to deliver them).
So, good practice would mean that this problem shouldn't occur in practice, and even if it does, the worst consequence is that handlers are added multiple times, resulting in multiple messages output for the same event.
It's disingenuous to say that these functions are not thread-safe. Although the statement is technically correct, the unexpected behaviour only occurs if logging is not configured before multiple threads are started, and more than one of these threads calls basicConfig() either directly or via logging.warn() and friends.
Even then, the behaviour occurs only rarely, due to thread timing uncertainties.
In any case, the incorrect behaviour will not occur in Python versions >= 2.7.1 or >= 3.2, because the requisite lock acquisition/release code has now been added.

Friday, 17 September 2010

Configuring logging for Web applications

tl;dr: You can easily configure your web applications with Python logging, so that logged messages can go into web-application-specific log files (or other destinations) even for multiple web applications in the same Python process, and have those logs contain request-specific information such as remote user IP address, request method and others. Demonstration scripts are here and here, copy them to the same initially empty folder and run webapptest.py in that folder, and then look at app*.log. Look at the InjectingFilter class - that's the core of the solution. Read on for more information.

In a recent post on Reddit, Armin Ronacher, developer of Werkzeug and Jinja among other things, had this to say:
I don't know a setup of logging that would work out for me. But maybe you do and could share. Say I am a library called by anyone and use three or five loggers without any handlers. Now in the same Python process are two different web applications, each independently but using the same library. How can each of these applications have their own log where all logging messages from that library are redirected to the correct one of these application's log files (the correct one is the one where all calls were issued by the same handling thread) including additional information such as current IP and HTTP request method. I tried, I failed.
Now, Armin has recently been working on his own logging library, and saying at conferences and in other public fora that he's done this mainly because he thinks Python logging is not really suited to Web applications. This, despite the fact that it's used in Google App Engine, Pylons, Tornado and other Web application platforms, by the many developers that use them. Now Armin can make a little arm-waving FUD go a long way, and I was very glad to have an actual problem statement from him that I could try to nail with a solution. I'm pleased to say it only took around ten minutes or so of work to come up with something which seems to solve his problem for him. His response? "This is actually quite neat ..." ;-)

To illustrate the solution, I first wrote a simple module which represents the library. This demonstration example only uses one logger, but it could use three or five loggers and the solution would be the same. Here is the library module, christened webapplib.py:

import logging
import time

logger = logging.getLogger(__name__)

def useful():
    logger.debug('Hello from webapplib!')
    time.sleep(0.01)

The useful() function represents a call into the library from a Web application; this call will be made during request processing. The time.sleep() call is there to simulate real activity, so that multiple threads will definitely be running concurrently while processing requests, making the scenario more realistic.

The next step was to construct a test harness which simulates requests and web applications. First, the request:

class Request:
    def __init__(self, method, ip):
        self.method = method
        self.ip = ip

REQUESTS = [
    Request('GET', '192.168.2.20'),
    Request('POST', '192.168.2.20'),
    Request('GET', '192.168.2.21'),
    Request('POST', '192.168.2.21'),
    Request('GET', '192.168.2.22'),
    Request('POST', '192.168.2.22'),
]

The dummy Request class just has HTTP method and IP attributes, as examples of the request-specific data which needs to appear in the logs. The REQUESTS array is just a list of dummy requests which will be sent, during the test run, to the dummy web applications.

Next, we need to consider how to simulate the Web applications. Let's start with the following skeleton snippet:

import logging

root = logging.getLogger()
root.setLevel(logging.DEBUG)

formatter = logging.Formatter('%(asctime)s %(threadName)-11s %(appName)s %(name)-9s %(ip)s %(method)-4s %(message)s')

class WebApp:
    def __init__(self, name):
        self.name = name
        handler = logging.FileHandler(name + '.log', 'w')
        handler.setFormatter(formatter)
        root.addHandler(handler)
        self.num_requests = 0

    def process_request(self, request):
        self.num_requests += 1
        logger.debug('Request processing started')
        webapplib.useful()
        logger.debug('Request processing finished')

Note that appName, method and ip in the format string are not among the standard set of LogRecord attributes provided by logging: they are extra, request-specific attributes which we want to inject into all log messages. In order for the format string to have the desired effect, attributes with those names will need to be added to the LogRecord for the event being logged, before formatting takes place; we'll come to that in a bit.

The WebApp class, representing a dummy web application, is fairly straightforward: we initialize a FileHandler for that application in the constructor, based on its name, set its formatter, and add it to the root logger.

We maintain a counter of requests processed by each WebApp instance. This counter is incremented in process_request, which we will call from our top-level test harness to simulate processing a request. This method logs two DEBUG messages, and calls into the common webapplib library's useful() method, which will also log a DEBUG message. So in total, each processed request should result in 3 lines in the log.

Before looking at the specifics of the solution to the problem, let's see what the top-level test harness looks like:

def main():
    app1 = WebApp('app1')
    app2 = WebApp('app2')
    apps = [app1, app2]
    threads = []
    #Add a common handler which will capture all events
    handler = logging.FileHandler('app.log', 'w')
    handler.setFormatter(formatter)
    root.addHandler(handler)
    for i in xrange(1000):
        try:
            app = choice(apps)
            request = choice(REQUESTS)
            t = threading.Thread(target=app.process_request, args=(request,))
            threads.append(t)
            t.start()
        except KeyboardInterrupt:
            break    
    for t in threads:
        t.join()
    for app in apps:
        print '%s processed %s requests' % (app.name, app.num_requests)
    
if __name__ == '__main__':
    main()

So the test harness creates two dummy Web applications called app1 and app2, and sends a thousand requests (chosen at random from our array of pre-constructed requests) to an app which is also chosen at random. A FileHandler for 'app.log' is created and added to the root logger, which will capture all logging (i.e. across all Web applications). Each request is processed on a separate thread.

So far, we've described the simulation artefacts and the overall structure of the test harness. Now let's look at the core of the solution. Let's review the problems we have to solve:
  • Each LogRecord needs the relevant thread-specific information injected into it before formatting.
  • The handler for each of the application-specific logs needs to filter out any record which is not intended for that log.
We can solve both of these by using an appropriate Filter subclass. We'll call it InjectingFilter, because it injects data into the LogRecord as well as acting as a filter. But how will it work? We'll use a threading.local object to hold thread-specific state which is needed by the filter. Here's the first cut of what InjectingFilter looks like:

import threading

tlocal = threading.local()

class InjectingFilter(logging.Filter):
    def filter(self, record):
        record.method = tlocal.request.method
        record.ip = tlocal.request.ip
        return True

This shows just the injection logic of the filter, and not any filtering logic. The thread-local object tlocal will hold a reference to the request currently being processed, which the filter uses to inject HTTP method and IP address into the LogRecord. This request needs to be set into the thread-local in WebApp.process_request, which now looks like this:

    def process_request(self, request):
        tlocal.request = request
        self.num_requests += 1
        logger.debug('Request processing started')
        webapplib.useful()
        logger.debug('Request processing finished')
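In case threading.local is unfamiliar: every thread sees its own, independent set of attributes on the same object, which is exactly what allows concurrently processed requests to coexist without trampling on each other's state. A quick standalone illustration:

```python
import threading

tls = threading.local()
tls.value = 'set in main thread'
seen = []

def worker():
    # The attribute set in the main thread is invisible here: each
    # thread gets its own private namespace on the same tls object.
    seen.append(hasattr(tls, 'value'))
    tls.value = 'set in worker'
    seen.append(tls.value)

t = threading.Thread(target=worker)
t.start()
t.join()

print(seen)        # [False, 'set in worker']
print(tls.value)   # still 'set in main thread'
```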

So far, so good. Next, we need to implement the filtering part: this is done by attaching, to each app-specific handler, an app-specific filter. In order to operate effectively, the filter needs to know whether the current thread belongs to its Web application (in which case, the record is accepted for processing) or not (in which case, the record is rejected). One easy way of doing this is to ensure that the WebApp maintains a set of names of the threads it's currently using, and the filter can just check if the current thread belongs to its app's set. So we need to keep a reference to the app in the filter, and this also allows the injection of the appName attribute into the LogRecord. So the WebApp class will now look like this (simple implementation):

class WebApp:
    def __init__(self, name):
        self.name = name
        self.threads = set()
        handler = logging.FileHandler(name + '.log', 'w')
        f = InjectingFilter(self)
        handler.setFormatter(formatter)
        handler.addFilter(f)
        root.addHandler(handler)
        self.num_requests = 0

    def process_request(self, request):
        tlocal.request = request
        tlocal.appName = self.name
        tname = threading.currentThread().getName()
        self.threads.add(tname)
        self.num_requests += 1
        try:
            logger.debug('Request processing started')
            webapplib.useful()
            logger.debug('Request processing finished')
        finally:
            self.threads.remove(tname)

and the InjectingFilter class will look like this (simple implementation):

class InjectingFilter(logging.Filter):
    def __init__(self, app):
        logging.Filter.__init__(self)  # initialise the base class
        self.app = app

    def filter(self, record):
        record.method = tlocal.request.method
        record.ip = tlocal.request.ip
        record.appName = tlocal.appName
        tname = threading.currentThread().getName()
        return tname in self.app.threads

You can pick up the whole script as webapptest.py from here, and use it with the webapplib.py script shown as the first code in this post. What happens if we run it?
vinay@zeta-lucid:~/projects/scratch$ python webapptest.py
app1 processed 506 requests
app2 processed 494 requests
vinay@zeta-lucid:~/projects/scratch$ wc -l app*.log
  1518 app1.log
  1482 app2.log
  3000 app.log
  6000 total 
So the app.log file contains all the log messages - 3 per request, for a total of 3000 - and the lines in app1.log and app2.log add up to 3000 too, which looks promising. Plus the individual line counts for these files are 3 x the number of requests processed, so that's also consistent. Let's look at the first few lines of each file:
vinay@zeta-lucid:~/projects/scratch$ head app.log app1.log app2.log
==> app.log <==
2010-09-17 16:11:48,226 Thread-1    app1 __main__  192.168.2.20 GET  Request processing started
2010-09-17 16:11:48,227 Thread-1    app1 webapplib 192.168.2.20 GET  Hello from webapplib!
2010-09-17 16:11:48,228 Thread-2    app1 __main__  192.168.2.21 POST Request processing started
2010-09-17 16:11:48,229 Thread-2    app1 webapplib 192.168.2.21 POST Hello from webapplib!
2010-09-17 16:11:48,230 Thread-3    app2 __main__  192.168.2.21 POST Request processing started
2010-09-17 16:11:48,230 Thread-3    app2 webapplib 192.168.2.21 POST Hello from webapplib!
2010-09-17 16:11:48,231 Thread-4    app1 __main__  192.168.2.22 POST Request processing started
2010-09-17 16:11:48,232 Thread-4    app1 webapplib 192.168.2.22 POST Hello from webapplib!
2010-09-17 16:11:48,233 Thread-5    app2 __main__  192.168.2.21 GET  Request processing started
2010-09-17 16:11:48,233 Thread-5    app2 webapplib 192.168.2.21 GET  Hello from webapplib!

==> app1.log <==
2010-09-17 16:11:48,226 Thread-1    app1 __main__  192.168.2.20 GET  Request processing started
2010-09-17 16:11:48,227 Thread-1    app1 webapplib 192.168.2.20 GET  Hello from webapplib!
2010-09-17 16:11:48,228 Thread-2    app1 __main__  192.168.2.21 POST Request processing started
2010-09-17 16:11:48,229 Thread-2    app1 webapplib 192.168.2.21 POST Hello from webapplib!
2010-09-17 16:11:48,231 Thread-4    app1 __main__  192.168.2.22 POST Request processing started
2010-09-17 16:11:48,232 Thread-4    app1 webapplib 192.168.2.22 POST Hello from webapplib!
2010-09-17 16:11:48,236 Thread-7    app1 __main__  192.168.2.21 POST Request processing started
2010-09-17 16:11:48,236 Thread-7    app1 webapplib 192.168.2.21 POST Hello from webapplib!
2010-09-17 16:11:48,237 Thread-8    app1 __main__  192.168.2.22 POST Request processing started
2010-09-17 16:11:48,238 Thread-8    app1 webapplib 192.168.2.22 POST Hello from webapplib!

==> app2.log <==
2010-09-17 16:11:48,230 Thread-3    app2 __main__  192.168.2.21 POST Request processing started
2010-09-17 16:11:48,230 Thread-3    app2 webapplib 192.168.2.21 POST Hello from webapplib!
2010-09-17 16:11:48,233 Thread-5    app2 __main__  192.168.2.21 GET  Request processing started
2010-09-17 16:11:48,233 Thread-5    app2 webapplib 192.168.2.21 GET  Hello from webapplib!
2010-09-17 16:11:48,234 Thread-6    app2 __main__  192.168.2.22 GET  Request processing started
2010-09-17 16:11:48,235 Thread-6    app2 webapplib 192.168.2.22 GET  Hello from webapplib!
2010-09-17 16:11:48,239 Thread-9    app2 __main__  192.168.2.20 POST Request processing started
2010-09-17 16:11:48,239 Thread-9    app2 webapplib 192.168.2.20 POST Hello from webapplib!
2010-09-17 16:11:48,241 Thread-3    app2 __main__  192.168.2.21 POST Request processing finished
2010-09-17 16:11:48,244 Thread-5    app2 __main__  192.168.2.21 GET  Request processing finished
So app.log contains interspersed messages from app1 and app2, but app1.log seems to contain just app1 messages, and app2.log seems to contain just app2 messages. This is encouraging, and we can confirm with:
vinay@zeta-lucid:~/projects/scratch$ grep app1 app1.log | wc -l
1518
vinay@zeta-lucid:~/projects/scratch$ grep app2 app1.log | wc -l
0
vinay@zeta-lucid:~/projects/scratch$ grep app2 app2.log | wc -l
1482
vinay@zeta-lucid:~/projects/scratch$ grep app1 app2.log | wc -l
0
The log lines all contain appName, request method and IP address, as required. Job done, it seems.

Of course, this is a test scenario with dummy requests and web applications implemented simplistically, but the principle of the solution will apply equally to real web application server environments.
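To make that concrete for one common deployment style: in a WSGI stack, the thread-local could be populated by a small piece of middleware instead of by the application's request method. The following is only a sketch - LoggingContextMiddleware and RequestInfo are hypothetical names, and a real implementation would need to match your framework's request object:

```python
import threading

tlocal = threading.local()

class RequestInfo:
    # Minimal stand-in for the per-request data a logging filter would inject.
    def __init__(self, method, ip):
        self.method = method
        self.ip = ip

class LoggingContextMiddleware:
    """Hypothetical WSGI middleware: stores request data in a thread-local
    before calling the wrapped app, and clears it afterwards, so that any
    logging filter can pick it up during request processing."""
    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        tlocal.request = RequestInfo(environ.get('REQUEST_METHOD', '-'),
                                     environ.get('REMOTE_ADDR', '-'))
        try:
            return self.app(environ, start_response)
        finally:
            tlocal.request = None
```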

Monday, 13 September 2010

QueueHandler and ZeroMQ support

A new handler has been added to logging for Python 3.2: QueueHandler. This allows you to send logging events to a queue of some description. For users of earlier versions of Python, here is the latest code from the Py3K branch:

class QueueHandler(logging.Handler): 
    """ 
    This handler sends events to a queue. Typically, it would be used together 
    with a multiprocessing Queue to centralise logging to file in one process 
    (in a multi-process application), so as to avoid file write contention 
    between processes. 
 
    This code is new in Python 3.2, but this class can be copy pasted into 
    user code for use with earlier Python versions. 
    """ 
 
    def __init__(self, queue): 
        """ 
        Initialise an instance, using the passed queue. 
        """ 
        logging.Handler.__init__(self) 
        self.queue = queue 
 
    def enqueue(self, record): 
        """ 
        Enqueue a record. 
 
        The base implementation uses put_nowait. You may want to override 
        this method if you want to use blocking, timeouts or custom queue 
        implementations. 
        """ 
        self.queue.put_nowait(record) 
 
    def emit(self, record): 
        """ 
        Emit a record. 
 
        Writes the LogRecord to the queue, preparing it for pickling first. 
        """ 
        try: 
            # The format operation gets traceback text into record.exc_text 
            # (if there's exception data), and also puts the message into 
            # record.message. We can then use this to replace the original 
            # msg + args, as these might be unpickleable. We also zap the 
            # exc_info attribute, as it's no longer needed and, if not None, 
            # will typically not be pickleable. 
            self.format(record) 
            record.msg = record.message 
            record.args = None 
            record.exc_info = None 
            self.enqueue(record) 
        except (KeyboardInterrupt, SystemExit): 
            raise 
        except: 
            self.handleError(record) 

This code is perfectly usable in earlier Python versions, including 2.x - just copy and paste it into your own code. In addition to usage with queues from the queue and multiprocessing modules (as described in this earlier post), the QueueHandler makes it easy to support other queue-like objects, such as ZeroMQ sockets.
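For example, with a plain queue.Queue (the module is named Queue on 2.x), the basic flow looks like this - the consumer just pulls records off and handles them however it likes. The snippet repeats a trimmed version of the handler so it runs standalone:

```python
import logging
import queue  # the module is named 'Queue' on Python 2.x

class QueueHandler(logging.Handler):
    # Trimmed version of the class shown above, for a standalone demo.
    def __init__(self, q):
        logging.Handler.__init__(self)
        self.queue = q

    def emit(self, record):
        try:
            self.format(record)  # sets record.message (and exc_text, if needed)
            record.msg, record.args, record.exc_info = record.message, None, None
            self.queue.put_nowait(record)
        except Exception:
            self.handleError(record)

log_queue = queue.Queue()
logger = logging.getLogger('queuedemo')
logger.addHandler(QueueHandler(log_queue))
logger.warning('hello via %s', 'queue')

# Consumer side: fetch the record and process it.
record = log_queue.get_nowait()
print(record.getMessage())  # hello via queue
```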

In the example below, a PUBLISH socket is created separately and passed to the handler (as its ‘queue’):

import zmq # using pyzmq, the Python binding for ZeroMQ 
import json # for serializing records portably 
 
ctx = zmq.Context() 
sock = zmq.Socket(ctx, zmq.PUB) # or zmq.PUSH, or other suitable value 
sock.bind('tcp://*:5556') # or wherever 
 
class ZeroMQSocketHandler(QueueHandler): 
    def enqueue(self, record): 
        data = json.dumps(record.__dict__) 
        self.queue.send(data) 

handler = ZeroMQSocketHandler(sock)

Of course there are other ways of organizing this, for example passing in the data needed by the handler to create the socket:

class ZeroMQSocketHandler(QueueHandler): 
    def __init__(self, uri, socktype=zmq.PUB, ctx=None): 
        self.ctx = ctx or zmq.Context() 
        socket = zmq.Socket(self.ctx, socktype) 
        socket.bind(uri) 
        QueueHandler.__init__(self, socket) 
 
    def enqueue(self, record): 
        data = json.dumps(record.__dict__) 
        self.queue.send(data) 
 
    def close(self): 
        self.queue.close() 

To test this out, put this together into a little test script (imports not shown, but you can get the working script here):

def main(): 
    print('Enter messages to send:') 
    h = ZeroMQSocketHandler('tcp://*:5556') 
    logger = logging.getLogger() 
    logger.addHandler(h) 
    try: 
        while True: 
            s = raw_input('> ') 
            logger.warning(s) 
    finally: 
        logger.removeHandler(h) 
        h.close() 

For the receiving end, you can use a simple script like this:

import json 
import pprint 
import zmq 
 
URI = 'tcp://localhost:5556' 
 
def main(): 
    print('Receiving on a SUB socket: %s' % URI) 
    ctx = zmq.Context() 
    sock = zmq.Socket(ctx, zmq.SUB) 
    sock.setsockopt(zmq.SUBSCRIBE, '') 
    sock.connect(URI) 
    try: 
        while True: 
            msg = sock.recv() 
            data = json.loads(msg) 
            pprint.pprint(data) 
            print('-'*40) 
    finally: 
        sock.close() 
 
if __name__ == '__main__': 
    main() 

And the output would be something like:
Receiving on a SUB socket: tcp://localhost:5556
{u'args': None,
 u'created': 1284446528.9099669,
 u'exc_info': None,
 u'exc_text': None,
 u'filename': u'zmqlog.py',
 u'funcName': u'main',
 u'levelname': u'WARNING',
 u'levelno': 30,
 u'lineno': 78,
 u'message': u"It's easy to log to ZeroMQ!",
 u'module': u'zmqlog',
 u'msecs': 909.96694564819336,
 u'msg': u"It's easy to log to ZeroMQ!",
 u'name': u'root',
 u'pathname': u'zmqlog.py',
 u'process': 13647,
 u'processName': u'MainProcess',
 u'relativeCreated': 521204.14185523987,
 u'thread': -1215568192,
 u'threadName': u'MainThread'}
----------------------------------------

Using filters to add contextual information to logs

As an alternative to using LoggerAdapters to add contextual information to logs, you can also use Filters to achieve the same result. This approach has been documented with an example in the in-development Python documentation, here. This functionality has been available since the earliest version of Python logging (i.e. since 2.3) but since the advent of thread-locals in 2.4 it can be used to log thread-local information. This can be used, for example, in an HTTP request handler, to record request-specific information such as client IP address, HTTP method, path, query string etc. in your logs.
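A minimal sketch of the approach (the ContextFilter name and the user attribute are illustrative; in an HTTP request handler the injected value would typically come from a thread-local, as in the web application example above):

```python
import logging

class ContextFilter(logging.Filter):
    """Injects a 'user' attribute into every record passing through."""
    def __init__(self, user):
        logging.Filter.__init__(self)
        self.user = user

    def filter(self, record):
        record.user = self.user
        return True  # never reject; this filter only injects

logger = logging.getLogger('ctxdemo')
handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter('%(user)-8s %(name)s %(message)s'))
handler.addFilter(ContextFilter('jdoe'))
logger.addHandler(handler)
# Logs 'Something happened' with the injected user attribute prepended.
logger.warning('Something happened')
```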

Tuesday, 7 September 2010

Using logging with multiprocessing

There can be a few gotchas when using logging with the multiprocessing module. For example, if you want to write rotated log files from your multi-process application, a naïve implementation might just configure a RotatingFileHandler directly. This could lead to multiple processes trying to write to the file concurrently, which will almost certainly lead to the log getting corrupted because of interleaved writes by different processes.

Note that logging to the same rotated files from multiple threads in a single-process application would be fine; the logging package uses threading locks to ensure that no log corruption occurs. There's no equivalent cross-platform synchronisation for processes in the stdlib, however; that's why you can get corruption with multi-process applications.

To circumvent the problem scenario, you can use a multiprocessing Queue and a listener process which listens for logging events sent to the queue. When it sees these events, it pops them off the queue and processes them; as it's the only process which will write to files directly, there are no contention issues which lead to corruption. The other processes just need to configure a QueueHandler, which will send logging events via the queue to the listener process.

The plan is to add QueueHandler to Python 3.2, but the implementation here is simple enough and should be copy-pastable into your own code for use with earlier Python versions.

The script is fairly well annotated so I'll say no more.

#!/usr/bin/env python
# Copyright (C) 2010 Vinay Sajip. All Rights Reserved.
#
# Permission to use, copy, modify, and distribute this software and its
# documentation for any purpose and without fee is hereby granted,
# provided that the above copyright notice appear in all copies and that
# both that copyright notice and this permission notice appear in
# supporting documentation, and that the name of Vinay Sajip
# not be used in advertising or publicity pertaining to distribution
# of the software without specific, written prior permission.
# VINAY SAJIP DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
# ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL
# VINAY SAJIP BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR
# ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER
# IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
# OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
#
"""
An example script showing how to use logging with multiprocessing.

The basic strategy is to set up a listener process which can have any logging
configuration you want - in this example, writing to rotated log files. Because
only the listener process writes to the log files, you don't have file
corruption caused by multiple processes trying to write to the file.

The listener process is initialised with a queue, and waits for logging events
(LogRecords) to appear in the queue. When they do, they are processed according
to whatever logging configuration is in effect for the listener process.

Other processes can delegate all logging to the listener process. They can have
a much simpler logging configuration: just one handler, a QueueHandler, needs
to be added to the root logger. Other loggers in the configuration can be set
up with levels and filters to achieve the logging verbosity you need.

A QueueHandler processes events by sending them to the multiprocessing queue
that it's initialised with.

In this demo, there are some worker processes which just log some test messages
and then exit.

This script was tested on Ubuntu Jaunty and Windows 7.

Copyright (C) 2010 Vinay Sajip. All Rights Reserved.
"""
# You'll need these imports in your own code
import logging
import logging.handlers
import multiprocessing

# Next two import lines for this demo only
from random import choice, random
import time

class QueueHandler(logging.Handler):
    """
    This is a logging handler which sends events to a multiprocessing queue.
    
    The plan is to add it to Python 3.2, but this can be copy pasted into
    user code for use with earlier Python versions.
    """

    def __init__(self, queue):
        """
        Initialise an instance, using the passed queue.
        """
        logging.Handler.__init__(self)
        self.queue = queue
        
    def emit(self, record):
        """
        Emit a record.

        Writes the LogRecord to the queue.
        """
        try:
            ei = record.exc_info
            if ei:
                dummy = self.format(record) # just to get traceback text into record.exc_text
                record.exc_info = None  # not needed any more
            self.queue.put_nowait(record)
        except (KeyboardInterrupt, SystemExit):
            raise
        except:
            self.handleError(record)
#
# Because you'll want to define the logging configurations for listener and workers, the
# listener and worker process functions take a configurer parameter which is a callable
# for configuring logging for that process. These functions are also passed the queue,
# which they use for communication.
#
# In practice, you can configure the listener however you want, but note that in this
# simple example, the listener does not apply level or filter logic to received records.
# In practice, you would probably want to do this logic in the worker processes, to avoid
# sending events which would be filtered out between processes.
#
# The size of the rotated files is made small so you can see the results easily.
def listener_configurer():
    root = logging.getLogger()
    h = logging.handlers.RotatingFileHandler('/tmp/mptest.log', 'a', 300, 10)
    f = logging.Formatter('%(asctime)s %(processName)-10s %(name)s %(levelname)-8s %(message)s')
    h.setFormatter(f)
    root.addHandler(h)

# This is the listener process top-level loop: wait for logging events
# (LogRecords) on the queue and handle them; quit when you get a None for a
# LogRecord.
def listener_process(queue, configurer):
    configurer()
    while True:
        try:
            record = queue.get()
            if record is None: # We send this as a sentinel to tell the listener to quit.
                break
            logger = logging.getLogger(record.name)
            logger.handle(record) # No level or filter logic applied - just do it!
        except (KeyboardInterrupt, SystemExit):
            raise
        except:
            import sys, traceback
            print >> sys.stderr, 'Whoops! Problem:'
            traceback.print_exc(file=sys.stderr)

# Arrays used for random selections in this demo

LEVELS = [logging.DEBUG, logging.INFO, logging.WARNING,
          logging.ERROR, logging.CRITICAL]

LOGGERS = ['a.b.c', 'd.e.f']

MESSAGES = [
    'Random message #1',
    'Random message #2',
    'Random message #3',
]

# The worker configuration is done at the start of the worker process run.
# Note that on Windows you can't rely on fork semantics, so each process
# will run the logging configuration code when it starts.
def worker_configurer(queue):
    h = QueueHandler(queue) # Just the one handler needed
    root = logging.getLogger()
    root.addHandler(h)
    root.setLevel(logging.DEBUG) # send all messages, for demo; no other level or filter logic applied.

# This is the worker process top-level loop, which just logs ten events with
# random intervening delays before terminating.
# The print messages are just so you know it's doing something!
def worker_process(queue, configurer):
    configurer(queue)
    name = multiprocessing.current_process().name
    print('Worker started: %s' % name)
    for i in range(10):
        time.sleep(random())
        logger = logging.getLogger(choice(LOGGERS))
        level = choice(LEVELS)
        message = choice(MESSAGES)
        logger.log(level, message)
    print('Worker finished: %s' % name)

# Here's where the demo gets orchestrated. Create the queue, create and start
# the listener, create ten workers and start them, wait for them to finish,
# then send a None to the queue to tell the listener to finish.
def main():
    queue = multiprocessing.Queue(-1)
    listener = multiprocessing.Process(target=listener_process,
                                       args=(queue, listener_configurer))
    listener.start()
    workers = []
    for i in range(10):
        worker = multiprocessing.Process(target=worker_process,
                                       args=(queue, worker_configurer))
        workers.append(worker)
        worker.start()
    for w in workers:
        w.join()
    queue.put_nowait(None)
    listener.join()

if __name__ == '__main__':
    main()

Friday, 3 September 2010

Using a custom file naming scheme for rotated log files

When you use rotated log files with Python logging's built-in functionality, the rotated files are named by appending a number to the base log file name: thus, app.log would give rise to log files app.log, app.log.1, app.log.2 etc.

Sometimes, you may not want this: for example, you may want to preserve the file extension so that you can take advantage of file associations (typically on Windows). You can implement a scheme where log file names take the form app.log, app.1.log, app.2.log etc. by subclassing RotatingFileHandler and overriding the doRollover() method. The exact code for this method varies slightly across different versions of Python, so I won't reproduce the whole method, but there's always an if statement in the method which does the rotation. In your overridden method, you can use the following logic (in place of the default logic) to implement an extension-preserving rotation scheme:

if self.backupCount > 0:
    name, ext = os.path.splitext(self.baseFilename)
    for i in range(self.backupCount - 1, 0, -1):
        sfn = "%s.%d%s" % (name, i, ext)
        dfn = "%s.%d%s" % (name, i + 1, ext)
        if os.path.exists(sfn):
            if os.path.exists(dfn):
                os.remove(dfn)
            os.rename(sfn, dfn)
    dfn = "%s.1%s" % (name, ext)
    if os.path.exists(dfn):
        os.remove(dfn)
    os.rename(self.baseFilename, dfn)
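To put that fragment in context, a complete subclass might look something like the sketch below. It's simplified relative to the stdlib's doRollover (for instance, it just reopens the stream via the handler's _open() method, which exists in 2.6+/3.x; the class name is illustrative), so treat it as a sketch rather than a drop-in replacement:

```python
import logging
import logging.handlers
import os

class ExtRotatingFileHandler(logging.handlers.RotatingFileHandler):
    """Rotates app.log -> app.1.log, app.2.log, ..., preserving the
    file extension instead of appending the number after it."""

    def doRollover(self):
        if self.stream:
            self.stream.close()
            self.stream = None
        if self.backupCount > 0:
            name, ext = os.path.splitext(self.baseFilename)
            # Shift existing backups up by one: app.1.log -> app.2.log etc.
            for i in range(self.backupCount - 1, 0, -1):
                sfn = "%s.%d%s" % (name, i, ext)
                dfn = "%s.%d%s" % (name, i + 1, ext)
                if os.path.exists(sfn):
                    if os.path.exists(dfn):
                        os.remove(dfn)
                    os.rename(sfn, dfn)
            # The current log becomes backup number 1.
            dfn = "%s.1%s" % (name, ext)
            if os.path.exists(dfn):
                os.remove(dfn)
            os.rename(self.baseFilename, dfn)
        # _open() is available on FileHandler in 2.6+ / 3.x; older
        # versions would reopen the file inline instead.
        self.stream = self._open()
```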