Friday, 3 December 2010

Getting more control over LogRecord creation

Every logging event is represented by a LogRecord instance. When an event is logged and not filtered out by a logger’s level, a LogRecord is created, populated with information about the event and then passed to the handlers for that logger (and its ancestors, up to and including the logger where further propagation up the hierarchy is disabled). Before Python 3.2, there were only two places where this creation was done:

  1. Logger.makeRecord, which is called in the normal process of logging an event. This invoked LogRecord directly to create an instance.
  2. logging.makeLogRecord, which is called with a dictionary containing attributes to be added to the LogRecord. This is typically invoked when a suitable dictionary has been received over the network (e.g. in pickle form via a SocketHandler, or in JSON form via an HTTPHandler).

This has usually meant that if you need to do anything special with a LogRecord, you’ve had to do one of the following.

  1. Create your own Logger subclass, which overrides makeRecord(), and set it using logging.setLoggerClass before any loggers that you care about are instantiated.
  2. Add a Filter to a logger or handler, which does the necessary special manipulation you need when its filter() method is called.

The first approach would be a little unwieldy in the scenario where (say) several different libraries wanted to do different things. Each would attempt to set its own Logger subclass, and the one which did this last would win. The second approach works reasonably well for many cases, but does not allow you to e.g. use a specialized subclass of LogRecord. Library developers can set a suitable filter on their loggers, but they would have to remember to do this every time they introduced a new logger (which they would do simply by adding new packages or modules and doing logger = logging.getLogger(__name__) at module level). It’s probably one too many things to think about. They could also add the filter to a NullHandler attached to their top-level logger, but this would not be invoked if an application developer attached a handler to a lower-level library logger – so output from that handler would not reflect the intentions of the library developer.

In Python 3.2, LogRecord creation will be done through a factory, which you can specify. The factory is just a callable you can set with logging.setLogRecordFactory(factory), and interrogate with logging.getLogRecordFactory(). The factory is invoked with the same signature as the LogRecord constructor, as LogRecord is the default setting for the factory:

factory(name, lvl, pathname, lineno, msg, args, exc_info, func=None, sinfo=None, **kwargs) 

The parameters are as follows:

  • name – the name of the logger used.
  • lvl – the numeric value of the event’s level.
  • pathname, lineno – the location in the source code where the logging call appears.
  • msg, args – the logging message and arguments.
  • exc_info – any exception info, or None.
  • func – the name of the function or method from which the logging call was made.
  • sinfo – a traceback such as is produced by traceback.print_stack. (This is not the same as the exception traceback.)
  • kwargs – currently empty, but this could change in future versions of logging.

This approach allows a custom factory to control all aspects of LogRecord creation. For example, you could return a subclass, or just add some additional attributes to the record once created, using a pattern similar to this:

old_factory = logging.getLogRecordFactory()

def record_factory(*args, **kwargs):
record = old_factory(*args, **kwargs)
record.custom_attribute = 0xdecafbad
return record

logging.setLogRecordFactory(record_factory)

This pattern allows different libraries to chain factories together, and as long as they don’t overwrite each other’s attributes or unintentionally overwrite the attributes provided as standard, there should be no surprises. However, it should be borne in mind that each link in the chain adds run-time overhead to all logging operations, and the technique should only be used when the use of a Filter does not provide the desired result.

These changes have been added to Python 3.2 (i.e. in the py3k branch) and should be in the first beta. Feel free to try them out. Your feedback is, as always, most welcome.

2 comments:

  1. Hi Vinay,
    What if you want to make some changes to 'msg'; how do you code that ?
    -Yon

    ReplyDelete
  2. @Yon - Once you add stuff to the logging record, you should be able to inject a %(custom_attribute)s into your messages
    It will get resolved when the Handler calls the Formatter class's format(self,record) method on the record.

    ReplyDelete