When you’re debugging a web service, it’s handy if you can get all the log entries associated with a single HTTP request. This is easy if you generate a unique ID for each request and include it in all the log entries.
If your services calls other services, you can build a simple tracing system by including this ID in all those calls. If you’re using nginx as a reverse proxy, you can use it to generate the IDs.1
At work, we’re using the Tornado web framework in Python and we wanted to have the request IDs in our logs. The most obvious solution is to pass the request ID to every function that logs anything and manually include it in the logging calls. This leads to cluttered code, though, and it’s easy to forget to add the ID everywhere.
We wanted to have an easier, more magical solution and we found it in context variables and log filters.
Context variables are “context-local” variables. They’re similar to thread-local variables – and to Clojure’s dynamic vars – but they work correctly with Python’s asyncio. Thus they’re a great choice for storing “request-local” data such as the request ID.
In your handler, you can generate or extract the ID and store it in a context variable:
import contextvars
import tornado.web
request_id_var = contextvars.ContextVar("request_id")
class MyHandler(tornado.web.RequestHandler):
# prepare is called at the beginning of request handling
def prepare(self):
# If the request headers do not include a request ID, let's generate one.
request_id = self.request.headers.get("request-id") or str(uuid.uuid4())
request_id_var.set(request_id)
In my example, I’ve implemented prepare
in the same class as my actual
handler, but in our real application, all our handlers inherit from a custom
base class implements request ID preparation and some other common features.
Python’s logging cookbook has two recipes for adding contextual information to logging output: LoggerAdapters and filters. We chose filters to avoid having to wrap loggers everywhere. Then we can use the familiar pattern of getting a module-specific logger:
import logging
logger = logging.getLogger(__name__)
The main use case for filters is limiting which log entries get emitted, but
they’re allowed to mutate the records. Our filter always returns True
:
class MyFilter(logging.Filter):
def filter(self, record):
record.request_id = request_id_var.get()
return True
Unlike log handlers, filters do not propagate. This means that you have to add the filter to every logger… or you can add it to the handler of your root logger.
my_filter = MyFilter()
for handler in logging.getLogger().handlers:
handler.addFilter(my_filter)
Here’s the full example:
import contextvars
import logging
import uuid
import tornado.ioloop
import tornado.web
logger = logging.getLogger(__name__)
request_id_var = contextvars.ContextVar("request_id")
# Let's have an async function for the sake of demonstration
async def generate_number():
logger.info("generate a number")
return 4
class MyHandler(tornado.web.RequestHandler):
# prepare is called at the beginning of request handling
def prepare(self):
# If the request headers do not include a request ID, let's generate one.
request_id = self.request.headers.get("request-id") or str(uuid.uuid4())
request_id_var.set(request_id)
async def get(self):
number = await generate_number()
self.write(f"Here's a number: {number}")
def make_app():
return tornado.web.Application([(r"/", MyHandler),])
class MyFilter(logging.Filter):
def filter(self, record):
record.request_id = request_id_var.get("-")
return True
if __name__ == "__main__":
logging.basicConfig(
format="%(levelname)s %(request_id)s %(message)s", level=logging.INFO
)
# Log filters do not propagate, but handlers do. Thus we add the filter
# to the handlers of the root logger so that the messages of child loggers
# get filtered as well.
my_filter = MyFilter()
for handler in logging.getLogger().handlers:
handler.addFilter(my_filter)
port = 8000
app = make_app()
app.listen(port)
logger.info("Listening at http://localhost:%d/", port)
tornado.ioloop.IOLoop.current().start()
The example is also available as a Gist.
If you run it (remember to pip install tornado
) and do a HTTP request, you will see something like this:
INFO - Listening at http://localhost:8000/
INFO f00e793f-73e5-4210-8709-41aefe839e5a generate a number
INFO f00e793f-73e5-4210-8709-41aefe839e5a 200 GET / (::1) 1.27ms
See also my other posts about Python.
-
For a more sophisticated distributed tracing system, take a look at something like OpenTracing. ↩︎