I have a client certificate with non-English characters in the subject. When I use the $ssl_client_s_dn variable of ngx_http_ssl_module, the value is escaped twice.
real value = "L=D'UNCONGÉ"
The \xC3 is encoded again and becomes \x5CC3; I need to avoid this.
actual value = "L=D'UNCONG\x5CC3\x5C89"
expected value = "L=D'UNCONG\xC3\x89"
Can someone please advise what I did wrong and how I can avoid this?
There are many parts to this puzzle.
First, let me comment that the "Expected Value" you mention includes hexadecimal numeric constants (format: \xHH). That form would only be "expected" where hex escaping is the default output encoding, as it is in Nginx's own logs.
Nginx itself handles most string literals and numeric constants (hex, octal, or Unicode representations) without an issue in configs, and likewise when it reads headers, so I was initially confused by the unusual format of this "double-escaping".
I checked the RFC regarding Cert encoding of the Distinguished Name (DN). There is a subtle difference between how the Cert string is formatted and something like a double-encoded URL string, but the difference is important.
Please note the RFC, and in particular the table of examples toward the end of section 5, and compare the "Quoted" values with UTF-8: https://datatracker.ietf.org/doc/html/rfc2253#section-5
The RFC states that UTF-8 "É" (bytes 0xC3 0x89) becomes \C3\89 when Quoted. The Actual Value you have reported confirms that Nginx is working with the Quoted form of the value.
Nginx then escapes each "\" in \C3\89 as \x5C (0x5C is the backslash), which explains why the result is \x5CC3\x5C89 as in your "Actual Value".
However, this unusual representation is (probably) limited to the logs of Nginx itself; depending on your configs, it is probably not the actual value of $ssl_client_s_dn.
By default Nginx escapes strings in logs but does not escape values in variables. You can prevent escaping in the log by adding the escape=none parameter to the log_format directive.
Append the variable $ssl_client_s_dn to the appropriate place in the log_format value so that it can be checked. Study the Nginx Log Module if necessary to see how to do this: https://nginx.org/en/docs/http/ngx_http_log_module.html
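As a minimal sketch of those two steps, assuming nginx 1.13.10 or later (the release that added escape=none) and a hypothetical format name and log path, the http block could contain something like:

```nginx
# Goes inside the http {} block.
# "dn_debug" and the log path are hypothetical names chosen for this example.
# escape=none writes variable values to the log exactly as Nginx holds them.
log_format dn_debug escape=none '$remote_addr [$time_local] "$request" '
                                'client_dn="$ssl_client_s_dn"';

access_log /var/log/nginx/dn_debug.log dn_debug;
```

With escape=none nothing in the line is escaped, so this format is better suited to a temporary debugging log than to a log that other tools parse.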
If you are new to Nginx this may be a lot to unpack... so let me review the main points, and after careful consideration hopefully you can tailor the config correctly to suit the exact use-case:
This problem is implementation-specific, meaning it does not have a generic solution. There isn't some simple fundamental issue here; it is a complex case of numerous dependent factors.
Preventing Escaped Logs will help you better understand the value you have received from the Cert; please ensure that you log at least $ssl_client_s_dn until you are sure everything is working as expected.
If some Proxy/Server/Service other than an End User sends the Client Cert to Nginx, keep in mind this is the chief Dependency - How they construct the Cert encoding determines how the Cert Subject DN will be read by Nginx.
Note that this point alone could solve cases in which the Upstream is mangling the header value: if you are sending the value of $ssl_client_s_dn to the App/Upstream in a header, first try it without enclosing quotes, because the value does not need to be escaped for Nginx and probably does not need to be escaped by the Upstream. If you already send it unquoted, then try the opposite and enclose it in quotes. While I prefer the non-quoted method, this simple change could "fix" the issue for the App/Upstream.
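The two variants can be sketched as follows. The header name X-SSL-Client-S-DN and the upstream name app_backend are assumptions made for illustration; use whatever your App/Upstream actually expects.

```nginx
# Hypothetical header name and upstream; adjust both to your setup.
location / {
    # Unquoted form (the one preferred above).
    proxy_set_header X-SSL-Client-S-DN $ssl_client_s_dn;

    # Quoted form: swap this line in for the one above to compare.
    # proxy_set_header X-SSL-Client-S-DN "$ssl_client_s_dn";

    proxy_pass http://app_backend;
}
```

Test one variant at a time and compare what the App/Upstream reports for the header with what Nginx logs for $ssl_client_s_dn.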
I am making many assumptions. You might not forward the Cert Subject DN to another resource at all; you might only evaluate it in Nginx; you might have already updated log_format with escape=none. There are so many possible situations...
In all of these cases, remember each component may either Escape the value handled, or escape it only in Logs. Knowing how each component is configured, and how they handle String Literals, UTF-8 multibyte chars and/or Numeric Constants is key!
Consider at least these factors:
a => Sender of Client Cert, how the initial Cert data is formatted
b => Nginx typically uses the exact Cert Subject DN it receives, but it escapes values in its own logs by default unless you override this behavior.
c => App/Upstream (if you send $ssl_client_s_dn in Header) may escape the Header value, or it may leave it untouched and only escape it in Logs like Nginx...
These are fundamental concepts - what about providing a solution to the question?
There is a way to modify the value of the Cert DN set in $ssl_client_s_dn when sending to Upstream if this is desired.
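A minimal diagnostic sketch of that idea, using the map directive, could look like the block below. The labels "hex escape" and "original" are placeholder values, the format name and log path are hypothetical, and the pattern is an assumption about what the DN contains (a backslash followed by two hex digits).

```nginx
# Goes inside the http {} block. This is a diagnostic sketch, not a fix.
# In a PCRE pattern \x5C stands for a literal backslash, which avoids
# having to stack backslash escapes in the config file.
map $ssl_client_s_dn $client_dn_fix {
    default                  "original";    # no backslash-hex pair in the DN
    "~\x5C[0-9A-Fa-f]{2}"    "hex escape";  # DN contains e.g. \C3 or \89
}

# Hypothetical format name and log path.
log_format dn_compare escape=none 'dn="$ssl_client_s_dn" fix="$client_dn_fix"';
access_log /var/log/nginx/dn_compare.log dn_compare;
```

The map only labels what Nginx sees at this stage; it does not change the value sent anywhere.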
Log the value of both $client_dn_fix and $ssl_client_s_dn.
When reviewing the logs, you will see the real value of the Cert's Subject DN in $ssl_client_s_dn, while $client_dn_fix shows which map entry matched and so validates Nginx's view of it.
After reviewing logs to understand the internal operations better, you may replace values like "hex escape" or "original" with a string like: "L=D'UNCONGÉ" (UTF-8 Literals) or "L=D'UNCONG\xC3\x89" so that the Upstream will receive the "Expected Value".
There are many possible ways to implement the concepts here; it just depends on what you really need.
Edit: OP has confirmed that "L=D'UNCONG\C3\89" appears in the logs when checking the value of $ssl_client_s_dn, so the following map is finally useful. Note that this reduces the key/value pairs to 2 entries:
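A minimal sketch of such a two-entry map is shown below. It assumes nginx 1.11.0 or later (needed to combine text and variables in a map value) and a config file saved as UTF-8. The capture names dn_head and dn_tail are hypothetical, and the pattern covers only the single pair \C3\89 from this question.

```nginx
# Goes inside the http {} block; the config file must be saved as UTF-8.
# Sketch only: it rewrites one occurrence of the literal text \C3\89.
# \x5C in the PCRE pattern stands for a literal backslash.
map $ssl_client_s_dn $client_dn_fix {
    # Entry 1: anything else passes through unchanged.
    default                                          $ssl_client_s_dn;
    # Entry 2: replace the quoted pair with the UTF-8 character itself.
    "~^(?<dn_head>.*)\x5CC3\x5C89(?<dn_tail>.*)$"    "${dn_head}É${dn_tail}";
}
```

$client_dn_fix can then be logged or sent upstream in place of $ssl_client_s_dn. Other non-ASCII characters would each need their own entry, since map cannot convert arbitrary hex pairs into bytes.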