In all my Flask projects to date, logging has been an afterthought: either poorly structured log statements scattered across modules, or reliance on the default framework logger alone. This might work in development, but once deployed, it is almost impossible to monitor the running application or observe issues.
We can apply a technique known as structured logging: recording logs in a well-defined, machine-readable format such as JSON. Free-form text logs are difficult to search and analyze when investigating production issues, whereas structured logs let us run more advanced queries to triage operational problems. AWS CloudWatch Logs supports this approach; we can publish our logs to CloudWatch in JSON format and query the individual fields.
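As a simple illustration (both entries below are made up for this example), the two lines record the same event, but only the second can be filtered reliably by field, for example by status_code:

ERROR 2025-07-18 15:36:36 GET /login failed with status 500 for 172.18.0.1

{"time": "2025-07-18 15:36:36", "level": "ERROR", "message": "request failed", "method": "GET", "path": "/login", "status_code": 500, "client_ip": "172.18.0.1"}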
The example below is based on a Flask web application running on ECS with the Fargate launch type, but the same concepts apply to any Python application.
First, we need to centralize logging. Based on the excellent dash0 guide to logging in Python, I refactored the application to load its logging configuration from external config files before the Flask application is instantiated. Flask uses the standard logging module, and it needs to be configured before the application object is created. This also allows us to load a different config file per environment: in development we only need plain-text logs on stdout, while on AWS we output the logs in JSON format.
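A minimal sketch of how this wiring might look, assuming the config files live in a config/ directory and the environment is chosen via an APP_ENV variable (both names are illustrative, not the actual project layout):

import logging.config
import os
import pathlib

import yaml
from flask import Flask


def create_app():
    # pick a logging config file based on the environment (file names are illustrative)
    env = os.environ.get("APP_ENV", "dev")
    config_file = pathlib.Path("config") / f"logging.{env}.yaml"

    # configure the logging module before the Flask application object is created
    with open(config_file) as f:
        logging.config.dictConfig(yaml.safe_load(f))

    return Flask(__name__)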
Below is an example of the logging config for deployment to AWS:
version: 1
disable_existing_loggers: False

filters:
  add_context:
    (): board.log_context.ContextFilter

formatters:
  stream_formatter:
    format: "%(asctime)s %(levelname)s: %(message)s [in %(pathname)s:%(lineno)d]"
  json_formatter:
    (): pythonjsonlogger.jsonlogger.JsonFormatter
    format: "%(asctime)s %(name)s %(levelname)s %(message)s"
    rename_fields:
      levelname: level
      asctime: time

handlers:
  console:
    class: logging.StreamHandler
    formatter: json_formatter
    level: INFO
    stream: ext://sys.stdout
    filters: [add_context]

loggers:
  board:
    handlers: [console]
    level: INFO
    propagate: False

root:
  handlers: [console]
  level: WARNING
The python-json-logger library provides a JSON formatter that converts each log record into a JSON object. We install the library and attach the formatter to a logging.StreamHandler: the application runs in a Docker container, so logs are streamed to stdout where they are picked up by CloudWatch.
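For a quick feel of what the formatter produces, a minimal standalone sketch (equivalent to the YAML config above, minus the context filter) might look like this:

import logging
import sys

from pythonjsonlogger.jsonlogger import JsonFormatter

logger = logging.getLogger("board")
logger.setLevel(logging.INFO)

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(
    JsonFormatter(
        "%(asctime)s %(name)s %(levelname)s %(message)s",
        rename_fields={"levelname": "level", "asctime": "time"},
    )
)
logger.addHandler(handler)

# prints one JSON object per line, e.g.
# {"time": "...", "name": "board", "level": "INFO", "message": "hello"}
logger.info("hello")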
We also define a logging filter, the ContextFilter class. It subclasses logging.Filter and overrides a single method, which is called for each log record as it is emitted. Here, we use it to attach additional fields, such as hostname and process_id, to every record. The context filter class is defined below:
import contextvars
import logging
import os
import socket
from contextlib import contextmanager

_log_context = contextvars.ContextVar("log_context", default={})


class ContextFilter(logging.Filter):
    def __init__(self, name=""):
        super().__init__(name)
        self.hostname = socket.gethostname()
        self.process_id = os.getpid()

    def filter(self, record):
        # attach static fields to every record
        record.hostname = self.hostname
        record.process_id = self.process_id
        # merge in any fields set via add_to_log_context
        context = _log_context.get()
        for key, value in context.items():
            setattr(record, key, value)
        return True


@contextmanager
def add_to_log_context(**kwargs):
    current_context = _log_context.get()
    new_context = {**current_context, **kwargs}
    # set the new context and keep the token so it can be restored
    token = _log_context.set(new_context)
    try:
        yield
    finally:
        _log_context.reset(token)
Note that we use a _log_context context variable, which is updated via the add_to_log_context context manager. This allows more fine-grained logging within specific code blocks, for example inside particular routes, which is explained later.
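As a quick illustration of the context-variable mechanics (separate from the route-level usage shown later, and assuming both helpers live in board.log_context as the config above suggests; the request_id field is made up), nested blocks merge their keys and each block's fields are removed again when it exits:

import logging

from board.log_context import add_to_log_context

logger = logging.getLogger("board")

with add_to_log_context(request_id="abc-123"):
    logger.info("outer")        # this record carries request_id
    with add_to_log_context(user_email="test@example.com"):
        logger.info("inner")    # carries both request_id and user_email
    logger.info("outer again")  # user_email has been removed again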
The second stage of centralized logging is a form of middleware that intercepts each request before it is handled and each response before it is returned. I used Flask's before_request and after_this_request decorators to achieve this. Below is my implementation:
import logging
import time

from flask import request, after_this_request

# app is the Flask application instance created elsewhere


@app.before_request
def log_before_request():
    start_time = time.monotonic()
    client_ip = request.headers.get("X-Forwarded-For") or request.remote_addr

    app.logger.info(
        f"incoming {request.method} from {request.path}",
        extra={
            "method": request.method,
            "url": request.url,
            "path": request.path,
            "client_ip": client_ip,
            "user_agent": request.headers.get("user-agent"),
        },
    )

    @after_this_request
    def log_after_request(response):
        # convert the elapsed time from seconds to milliseconds
        duration = (time.monotonic() - start_time) * 1000
        log_level = logging.INFO
        if response.status_code >= 500:
            log_level = logging.ERROR
        elif response.status_code >= 400:
            log_level = logging.WARNING

        app.logger.log(
            log_level,
            f"{request.method} to {request.path} completed with status {response.status_code}",
            extra={
                "method": request.method,
                "url": request.url,
                "path": request.path,
                "status_code": response.status_code,
                "latency": duration,
            },
        )
        return response
In log_before_request, we start a timer and capture request attributes such as the client IP address, path and method. The function registered via the after_this_request decorator runs once the request has been handled; there we log the method, path, status code and latency. Together they produce two log entries per request, one before and one after.
Below is an example of the pair of log entries produced when accessing the login page:
{"
time": "2025-07-18 15:36:36,965",
"name": "board",
"level": "INFO",
"message": "incoming GET from /login",
"method": "GET",
"path": "/login",
"client_ip": "172.18.0.1",
"user_agent": "Mozilla/5.0 (X11; Linux x86_64; rv:140.0) Gecko/20100101 Firefox/140.0", "hostname": "8a79d4222d78",
"process_id": 60,
"url": "http://localhost:8000/login?next=/",
"remote_addr": "172.18.0.1"
}
{
"time": "2025-07-18 15:36:37,057",
"name": "board",
"level": "INFO",
"message": "GET to /login completed with status 200",
"status_code": 200,
"duration_ms": 9.182815499934804,
"hostname": "8a79d4222d78",
"process_id": 60,
"url": "http://localhost:8000/login?next=/",
"remote_addr": "172.18.0.1"
}
The add_to_log_context function lets us attach extra fields and messages to the log stream from specific parts of the application. For example, we can emit an additional log entry carrying the user's email when someone logs in via the /login route:
@bp.route("/login", methods=['GET', 'POST'])
def login():
    with add_to_log_context(user_email=form.email.data):
        current_app.logger.info(f"User {current_user.email} has logged in.")
    ...
This would result in the following log entries:
{"time": "2025-07-19 16:06:18,940", "name": "board", "level": "INFO", "message": "incoming POST from /login", "method": "POST", "url": "http://localhost:8000/login?next=/", "path": "/login", "client_ip": "172.18.0.1", "user_agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/138.0.0.0 Safari/537.36", "hostname": "3832968803fa", "process_id": 13}
{"time": "2025-07-19 16:06:19,503", "name": "board", "level": "INFO", "message": "User test@example.com has logged in.", "hostname": "3832968803fa", "process_id": 13, "user_email": "test@example.com"}
{"time": "2025-07-19 16:06:19,503", "name": "board", "level": "INFO", "message": "POST to /login completed with status 302", "method": "POST", "url": "http://localhost:8000/login?next=/", "path": "/login", "status_code": 302, "latency": 56.536332900031994, "hostname": "3832968803fa", "process_id": 13}
Notice the additional record in the middle, which carries the custom message together with the user_email field added via the context manager. This can improve observability in specific parts of the application.
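This is where structured logs pay off in CloudWatch. Assuming the container's stdout is shipped to a log group (for example via the awslogs log driver on ECS), a CloudWatch Logs Insights query along these lines can surface slow or failing requests directly from the structured fields:

fields @timestamp, method, path, status_code, latency
| filter status_code >= 400
| sort latency desc
| limit 20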
In summary, centralizing logging configuration in a web application simplifies working with the Python logging module. Structured logging then provides better observability: the logs remain readable, carry more context, and can be searched and queried in CloudWatch.