Skip to content
Snippets Groups Projects
  • Richard van der Hoff's avatar
    39230d21
    Clean up some LoggingContext stuff (#7120) · 39230d21
    Richard van der Hoff authored
    * Pull Sentinel out of LoggingContext
    
    ... and drop a few unnecessary references to it
    
    * Factor out LoggingContext.current_context
    
    move `current_context` and `set_context` out to top-level functions.
    
    Mostly this means that I can more easily trace what's actually referring to
    LoggingContext, but I think it's generally neater.
    
    * move copy-to-parent into `stop`
    
    this really just makes `start` and `stop` more symetric. It also means that it
    behaves correctly if you manually `set_log_context` rather than using the
    context manager.
    
    * Replace `LoggingContext.alive` with `finished`
    
    Turn `alive` into `finished` and make it a bit better defined.
    39230d21
    History
    Clean up some LoggingContext stuff (#7120)
    Richard van der Hoff authored
    * Pull Sentinel out of LoggingContext
    
    ... and drop a few unnecessary references to it
    
    * Factor out LoggingContext.current_context
    
    move `current_context` and `set_context` out to top-level functions.
    
    Mostly this means that I can more easily trace what's actually referring to
    LoggingContext, but I think it's generally neater.
    
    * move copy-to-parent into `stop`
    
    this really just makes `start` and `stop` more symetric. It also means that it
    behaves correctly if you manually `set_log_context` rather than using the
    context manager.
    
    * Replace `LoggingContext.alive` with `finished`
    
    Turn `alive` into `finished` and make it a bit better defined.
log_contexts.md 17.52 KiB

Log Contexts

To help track the processing of individual requests, synapse uses a 'log context' to track which request it is handling at any given moment. This is done via a thread-local variable; a logging.Filter is then used to fish the information back out of the thread-local variable and add it to each log record.

Logcontexts are also used for CPU and database accounting, so that we can track which requests were responsible for high CPU use or database activity.

The synapse.logging.context module provides a facilities for managing the current log context (as well as providing the LoggingContextFilter class).

Deferreds make the whole thing complicated, so this document describes how it all works, and how to write code which follows the rules.

##Logcontexts without Deferreds

In the absence of any Deferred voodoo, things are simple enough. As with any code of this nature, the rule is that our function should leave things as it found them:

from synapse.logging import context         # omitted from future snippets

def handle_request(request_id):
    request_context = context.LoggingContext()

    calling_context = context.set_current_context(request_context)
    try:
        request_context.request = request_id
        do_request_handling()
        logger.debug("finished")
    finally:
        context.set_current_context(calling_context)

def do_request_handling():
    logger.debug("phew")  # this will be logged against request_id

LoggingContext implements the context management methods, so the above can be written much more succinctly as:

def handle_request(request_id):
    with context.LoggingContext() as request_context:
        request_context.request = request_id
        do_request_handling()
        logger.debug("finished")

def do_request_handling():
    logger.debug("phew")

Using logcontexts with Deferreds

Deferreds --- and in particular, defer.inlineCallbacks --- break the linear flow of code so that there is no longer a single entry point where we should set the logcontext and a single exit point where we should remove it.

Consider the example above, where do_request_handling needs to do some blocking operation, and returns a deferred:

@defer.inlineCallbacks
def handle_request(request_id):
    with context.LoggingContext() as request_context:
        request_context.request = request_id
        yield do_request_handling()
        logger.debug("finished")

In the above flow:

  • The logcontext is set
  • do_request_handling is called, and returns a deferred
  • handle_request yields the deferred
  • The inlineCallbacks wrapper of handle_request returns a deferred

So we have stopped processing the request (and will probably go on to start processing the next), without clearing the logcontext.

To circumvent this problem, synapse code assumes that, wherever you have a deferred, you will want to yield on it. To that end, whereever functions return a deferred, we adopt the following conventions:

Rules for functions returning deferreds:

  • If the deferred is already complete, the function returns with the same logcontext it started with.
  • If the deferred is incomplete, the function clears the logcontext before returning; when the deferred completes, it restores the logcontext before running any callbacks.

That sounds complicated, but actually it means a lot of code (including the example above) "just works". There are two cases:

  • If do_request_handling returns a completed deferred, then the logcontext will still be in place. In this case, execution will continue immediately after the yield; the "finished" line will be logged against the right context, and the with block restores the original context before we return to the caller.

  • If the returned deferred is incomplete, do_request_handling clears the logcontext before returning. The logcontext is therefore clear when handle_request yields the deferred. At that point, the inlineCallbacks wrapper adds a callback to the deferred, and returns another (incomplete) deferred to the caller, and it is safe to begin processing the next request.

    Once do_request_handling's deferred completes, it will reinstate the logcontext, before running the callback added by the inlineCallbacks wrapper. That callback runs the second half of handle_request, so again the "finished" line will be logged against the right context, and the with block restores the original context.

As an aside, it's worth noting that handle_request follows our rules -though that only matters if the caller has its own logcontext which it cares about.

The following sections describe pitfalls and helpful patterns when implementing these rules.

Always yield your deferreds

Whenever you get a deferred back from a function, you should yield on it as soon as possible. (Returning it directly to your caller is ok too, if you're not doing inlineCallbacks.) Do not pass go; do not do any logging; do not call any other functions.

@defer.inlineCallbacks
def fun():
    logger.debug("starting")
    yield do_some_stuff()       # just like this

    d = more_stuff()
    result = yield d            # also fine, of course

    return result

def nonInlineCallbacksFun():
    logger.debug("just a wrapper really")
    return do_some_stuff()      # this is ok too - the caller will yield on
                                # it anyway.

Provided this pattern is followed all the way back up to the callchain to where the logcontext was set, this will make things work out ok: provided do_some_stuff and more_stuff follow the rules above, then so will fun (as wrapped by inlineCallbacks) and nonInlineCallbacksFun.

It's all too easy to forget to yield: for instance if we forgot that do_some_stuff returned a deferred, we might plough on regardless. This leads to a mess; it will probably work itself out eventually, but not before a load of stuff has been logged against the wrong context. (Normally, other things will break, more obviously, if you forget to yield, so this tends not to be a major problem in practice.)

Of course sometimes you need to do something a bit fancier with your Deferreds - not all code follows the linear A-then-B-then-C pattern. Notes on implementing more complex patterns are in later sections.

Where you create a new Deferred, make it follow the rules

Most of the time, a Deferred comes from another synapse function. Sometimes, though, we need to make up a new Deferred, or we get a Deferred back from external code. We need to make it follow our rules.

The easy way to do it is with a combination of defer.inlineCallbacks, and context.PreserveLoggingContext. Suppose we want to implement sleep, which returns a deferred which will run its callbacks after a given number of seconds. That might look like:

# not a logcontext-rules-compliant function
def get_sleep_deferred(seconds):
    d = defer.Deferred()
    reactor.callLater(seconds, d.callback, None)
    return d

That doesn't follow the rules, but we can fix it by wrapping it with PreserveLoggingContext and yield ing on it:

@defer.inlineCallbacks
def sleep(seconds):
    with PreserveLoggingContext():
        yield get_sleep_deferred(seconds)

This technique works equally for external functions which return deferreds, or deferreds we have made ourselves.

You can also use context.make_deferred_yieldable, which just does the boilerplate for you, so the above could be written:

def sleep(seconds):
    return context.make_deferred_yieldable(get_sleep_deferred(seconds))

Fire-and-forget