We have an nginx server that, in some contexts, receives sensitive data in the HTTP username field. More specifically, it's an API key that clients send as the username, e.g. curl -u "$API_KEY:" ...
The default nginx access_log format includes $remote_user, which writes the entire client API key into the access log and taints the file with sensitive data. I know I can define a different log_format that omits the $remote_user variable entirely; however, I can see cases where having at least a hint about who the client was could be enormously helpful for log correlation or incident response. Is there a way to configure nginx to store a severely truncated copy of $remote_user in the access log instead of the full value from the client? (E.g. ABCDEFGH12345678 becomes ABCD* or something along those lines.)
(It also goes without saying that I don't want to wreck the actual REMOTE_USER-type variables that the WSGI backend relies on for authentication.)
This is nginx 1.10.3, as shipped in the default Debian Stretch repos.
You can use the map directive to set one variable based upon another.
I haven't tested this, so I'm not sure whether nginx plays nicely with the {,9} part of my regex syntax and you might have to adjust it slightly, but something like this should give you a variable with the first 9 characters of the remote_user variable. Edit your log format to include the truncated_user variable instead.

@miknik's answer almost worked, but it took a little tweaking and experimentation to actually get it to behave. The full configuration I ended up using is:
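As an illustration only (not necessarily the exact configuration referred to above), a minimal sketch of a map-based setup along these lines might look like the following. The 4-character prefix length, the capture name prefix, the format name truncated, and the log path are all assumptions chosen for the example; truncated_user is the variable name suggested in the first answer. The regex is quoted because nginx requires quoting when a pattern contains curly braces, and it uses an explicit {1,4} range rather than {,9}, since PCRE may not treat an omitted lower bound as a quantifier. The map result is kept as a lone variable because combining text and variables in a map value is only documented from nginx 1.11.0 onward.

```nginx
# Place in the http {} context.
# map only defines a new variable; $remote_user itself is left untouched,
# so whatever is passed to the backend still sees the full value.
map $remote_user $truncated_user {
    # No username supplied: log a dash, like the stock format does.
    default                  "-";
    # Keep at most the first 4 characters (assumed length; adjust to taste).
    "~^(?<prefix>.{1,4})"    $prefix;
}

# Same layout as the stock "combined" format, with $remote_user swapped
# for the truncated copy. The format name "truncated" is hypothetical.
log_format truncated '$remote_addr - $truncated_user [$time_local] '
                     '"$request" $status $body_bytes_sent '
                     '"$http_referer" "$http_user_agent"';

# Assumed log path; use whatever the server already logs to.
access_log /var/log/nginx/access.log truncated;
```

With this in place, a request authenticated as ABCDEFGH12345678 should be logged with ABCD in the user field. To get a visible truncation marker on 1.10.x, a literal * can be written directly after $truncated_user in the log_format string instead of in the map value.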