Background
I have four Windows servers running Server 2012 R2, each running Apache 2.4. These are arranged into two pairs of load-balanced web servers, with the first pair facing the internet, and the second pair inside the LAN. There is an incoming firewall rule to allow port 80 traffic from the first pair to the second for API requests.
I am also running the Tableau Server visualisation software in the LAN, and my current task is to proxy traffic through the two sets of web servers to Tableau. I am using these rules on the outer web server:
/tableau-proxy/* : proxy to the inner web server
/* : run a web app that carries Tableau content
On the inner web server, I am using these rules:
/tableau-proxy/* : proxy to Tableau and strip off the /tableau-proxy prefix
/* : run an API app
For both sets of rules, I am using mod_proxy, mod_proxy_http, and mod_rewrite. The proxying itself is working; for example, an image request can be carried all the way from the internet, bounce off the two web servers, land on Tableau, and then be carried back to the user as PNG data.
Content rewriting problem
The last 5% of this problem is to rewrite HTML links coming back from Tableau so that the proxy subdir prefix is restored. This is on the LAN-side/inner web servers. The rule I am using is this:
/ -> /tableau-proxy
I am trying to use mod_proxy_html for this, but with the configuration I am using, I am getting a blank page. Interestingly, it is served with a response code of 200, so it is a bit harder to debug.
Configuration
The config I am using is this:
LoadModule proxy_html_module modules/mod_proxy_html.so
LoadModule xml2enc_module modules/mod_xml2enc.so
Include conf/extra/proxy-html.conf
<Directory "C:\Apache24\htdocs\public\tableau-proxy">
    LogLevel alert rewrite:trace2 proxy_html:trace2
    # Proxy requests to the Tableau LB
    RewriteEngine on
    # Here is a test Tableau server
    RewriteRule (.*) http://tabtest/$1 [P]
    # Don't make the x-forwarded-for header a list of proxy hops!
    # That will break redirects in Tableau
    ProxyAddHeaders Off
    ProxyHTMLExtended Off
    xml2EncDefault utf-8
    LogLevel debug
    ProxyHTMLEnable On
    ProxyHTMLURLMap / /tableau-proxy
</Directory>
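One detail in the mapping above may be worth checking separately from the blank page. As far as I understand mod_proxy_html, ProxyHTMLURLMap replaces the matched prefix of a link with the target string, so mapping / to /tableau-proxy would turn a link such as /views/x into /tableau-proxyviews/x. A hedged sketch with a trailing slash on the target (the /views/x path is hypothetical, used only for illustration):

```
# Assumption: the map is a plain prefix substitution, so the target
# needs its own trailing slash to keep the path separator.
# /views/x  ->  /tableau-proxy/views/x
ProxyHTMLURLMap / /tableau-proxy/
```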
I have also tried preserving the charset of the document as it comes back down the proxy path:
ProxyHTMLCharsetOut *
That does not seem to make a difference, and results in:
Info: Got charset utf-8 from HTTP headers
Error: Invalid argument: [client: IP] AH01427: xml2enc: Charset utf-8 not supported
(What, it doesn't support UTF-8?)
I have tried experimenting with different HTML doctypes, again with no change:
ProxyHTMLDoctype XHTML Legacy
I've tried adding the proxy HTML module as a filter, in case this does something more than ProxyHTMLEnable On, again to no avail:
SetOutputFilter proxy-html
I have seen some sporadic reports around the web of other folks experiencing this issue, and some of the extra items above represent my trying some suggested solutions, but the blank page persists. What can I try next?
I have discovered a solution. The Tableau package makes use of Apache to serve content, so it is able to benefit from detecting what encoding formats are acceptable to the client. A colleague suggested that perhaps the HTML link rewriting was choking on compressed content, and this turned out to be correct.
I therefore used mod_headers to turn compression off.
My sense is that this is not a problem with Tableau; I think the problem is in the Apache proxy, which is choking on the compressed content. I saw elsewhere on the web that some folks are adding decompress-modify-recompress filters into their proxy configuration, presumably to handle this exact issue, but that did not work for me (admittedly, I did not persist with this line of enquiry for long).
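A minimal sketch of turning compression off with mod_headers, assuming this is done by stripping the Accept-Encoding request header before the request reaches Tableau, so that Tableau replies with uncompressed HTML. The exact directive and where it sits in the configuration are assumptions, not taken from the original setup:

```
LoadModule headers_module modules/mod_headers.so

# Assumption: with no Accept-Encoding header on the proxied request,
# the backend has no reason to gzip the response, so mod_proxy_html
# receives plain HTML it can parse and rewrite.
RequestHeader unset Accept-Encoding
```

The trade-off is that responses pass uncompressed between the inner web servers and Tableau, which is usually cheap on a LAN.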
So, this is a workaround, though a very acceptable one for us. I will perhaps try to revisit this to get compression working again.
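For revisiting compression later, the decompress-modify-recompress arrangement mentioned above is commonly written as an output filter chain, with mod_deflate's INFLATE and DEFLATE filters placed around proxy-html. This is a sketch only, assuming mod_deflate is available in this build; as noted, a brief attempt along these lines did not work here, so treat it as a starting point for further testing:

```
LoadModule deflate_module modules/mod_deflate.so

# Inflate the compressed backend response, rewrite the links,
# then compress again for the client.
# Assumption: this explicit chain would stand in for ProxyHTMLEnable On,
# so that proxy-html is not inserted a second time.
SetOutputFilter INFLATE;proxy-html;DEFLATE
```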