I use a simple configuration in sites-enabled to publish files with Apache:
File: /etc/apache2/sites-enabled/contents.conf
<Directory "/mnt/data/contents/">
    Options FollowSymLinks
    Require all granted
    <IfModule mod_expires.c>
        ExpiresActive on
        ExpiresDefault "access plus 7 days"
    </IfModule>
</Directory>
The files are simple XML, an example starts with these lines:
<mets:mets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:kitodo="http://meta.kitodo.org/v1/"
When I download the file locally, wget complains about no headers:
user@myhostname:~$ wget http://myhostname/contents/example/example.xml
--2024-12-05 16:14:59-- http://myhostname/contents/example/example.xml
Resolving myhostname (myhostname)... 127.0.1.1
Connecting to myhostname (myhostname)|127.0.1.1|:80... connected.
HTTP request sent, awaiting response... 200 No headers, assuming HTTP/0.9
Length: unspecified
Saving to: ‘example.xml’
example.xml [ <=> ] 12,66K --.-KB/s in 4,8s
2024-12-05 16:15:04 (2,66 KB/s) - ‘example.xml’ saved [12966]
The downloaded file starts like:
12:25:45 GMT
Accept-Ranges: bytes
Content-Length: 12563
Cache-Control: max-age=0
Expires: Thu, 05 Dec 2024 09:45:44 GMT
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: application/xml; charset=utf-8
<mets:mets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:kitodo="http://meta.kitodo.org/v1/"
Obviously the first line doesn't belong there and prevents HTTP headers from being properly recognized. Where does the line come from and how can I turn it off? I don't experience anything like this on other systems.
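To make the symptom concrete: an HTTP/1.x response has to begin with a status line such as "HTTP/1.1 200 OK". The following is a minimal sketch of that check in Python, using the first bytes of the damaged download shown above. It only approximates what a client tests for and is not wget's or curl's actual code; the function name has_status_line and the "intact" sample are made up for illustration.

```python
import re

# An HTTP/1.x response must start with "HTTP/<major>.<minor> <3-digit status>".
STATUS_LINE = re.compile(rb"^HTTP/\d\.\d \d{3}(?: [^\r\n]*)?\r?\n")


def has_status_line(raw: bytes) -> bool:
    """Return True if the raw response begins with an HTTP/1.x status line."""
    return STATUS_LINE.match(raw) is not None


# First bytes of the damaged response, as shown in the downloaded file.
damaged = b"12:25:45 GMT\r\nAccept-Ranges: bytes\r\nContent-Length: 12563\r\n"
# What an intact response would be expected to start with (illustrative).
intact = b"HTTP/1.1 200 OK\r\nAccept-Ranges: bytes\r\n"

print(has_status_line(damaged))  # False
print(has_status_line(intact))   # True
```

Because the damaged response starts in the middle of a header line, the check fails and the client treats everything it receives as an HTTP/0.9 body.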
Server version: Apache/2.4.41 (Ubuntu)
Loaded Modules:
core_module (static)
so_module (static)
watchdog_module (static)
http_module (static)
log_config_module (static)
logio_module (static)
version_module (static)
unixd_module (static)
access_compat_module (shared)
alias_module (shared)
auth_basic_module (shared)
authn_core_module (shared)
authn_file_module (shared)
authnz_ldap_module (shared)
authz_core_module (shared)
authz_host_module (shared)
authz_user_module (shared)
autoindex_module (shared)
dav_module (shared)
dav_fs_module (shared)
deflate_module (shared)
dir_module (shared)
env_module (shared)
expires_module (shared)
filter_module (shared)
headers_module (shared)
jk_module (shared)
ldap_module (shared)
mime_module (shared)
mpm_prefork_module (shared)
negotiation_module (shared)
php7_module (shared)
reqtimeout_module (shared)
rewrite_module (shared)
setenvif_module (shared)
socache_shmcb_module (shared)
ssl_module (shared)
status_module (shared)
File: /etc/apache2/sites-enabled/000-default.conf
(comments removed)
<VirtualHost *:80>
    ServerAdmin webmaster@localhost
    DocumentRoot /var/www/typo3/public
    <Directory /var/www/typo3/public/>
        Options Indexes FollowSymLinks MultiViews
        AllowOverride All
        Order allow,deny
        Allow from all
    </Directory>
    ErrorLog ${APACHE_LOG_DIR}/error.log
    CustomLog ${APACHE_LOG_DIR}/access.log combined
    JkMount /kitodo ajp13_worker
    JkMount /kitodo/* ajp13_worker
    <Location /kitodo>
        Order allow,deny
        Allow from all
    </Location>
</VirtualHost>
File: /etc/apache2/sites-enabled/default-ssl.conf
(comments removed)
<IfModule mod_ssl.c>
    <VirtualHost _default_:443>
        ServerAdmin webmaster@localhost
        DocumentRoot /var/www/html
        ErrorLog ${APACHE_LOG_DIR}/error.log
        CustomLog ${APACHE_LOG_DIR}/access.log combined
        SSLEngine on
        SSLCertificateFile /etc/ssl/certs/ssl-cert-snakeoil.pem
        SSLCertificateKeyFile /etc/ssl/private/ssl-cert-snakeoil.key
        <FilesMatch "\.(cgi|shtml|phtml|php)$">
            SSLOptions +StdEnvVars
        </FilesMatch>
        <Directory /usr/lib/cgi-bin>
            SSLOptions +StdEnvVars
        </Directory>
    </VirtualHost>
</IfModule>
Edit: Output of curl -i:
user@myhostname:~# curl -i http://myhostname/contents/example/example.xml
curl: (1) Received HTTP/0.9 when not allowed
Output of wget -O - -o /dev/null --save-headers:
09:41:43 GMT
Accept-Ranges: bytes
Content-Length: 10971
Cache-Control: max-age=0
Expires: Mon, 09 Dec 2024 08:22:33 GMT
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: application/xml; charset=utf-8
(...)
ataTable:11:inputText",onco:function(xhr,status,args,data){preserveMetadata(); updateTitleMetadata();;}});" /><s
Here, (...) stands for the XML file content. I also see content at the end that doesn't belong there either. I recognize it as part of the web application that is mounted through JkMount. This trailing content is different with each request.
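The trailing content can be measured rather than eyeballed. The sketch below assumes the raw output of the wget command above has been saved to a file, and compares the Content-Length header (which survives in the damaged header block) with the number of body bytes actually received. The function name surplus_bytes and the file name response.raw are hypothetical.

```python
def surplus_bytes(raw: bytes):
    """Return (declared, received, surplus) for a raw HTTP response.

    'declared' is the Content-Length value, 'received' is the number of
    bytes after the blank line, and 'surplus' is received minus declared.
    """
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line separating headers from body")
    declared = None
    for line in head.split(b"\r\n"):
        name, colon, value = line.partition(b":")
        # A truncated first line such as b"12:25:45 GMT" is simply skipped,
        # because its "name" is not Content-Length.
        if colon and name.strip().lower() == b"content-length":
            declared = int(value.strip())
    if declared is None:
        raise ValueError("no Content-Length header found")
    return declared, len(body), len(body) - declared
```

For example, after saving the output with wget -O response.raw -o /dev/null --save-headers, calling surplus_bytes(open("response.raw", "rb").read()) returns a positive third value if extra bytes follow the XML document.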
Output of tcpdump -vv -i any -s 0 'tcp port http'
(I hope I captured the right lines, because other people are working on the server at the same time):
09:15:05.940645 IP (tos 0x0, ttl 64, id 26047, offset 0, flags [DF], proto TCP (6), length 60)
localhost.60850 > myhostname.http: Flags [S], cksum 0xff30 (incorrect -> 0x75f8), seq 4023966924, win 65495, options [mss 65495,sackOK,TS val 2467400527 ecr 0,nop,wscale 7], length 0
09:15:05.940660 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
myhostname.http > localhost.60850: Flags [S.], cksum 0xff30 (incorrect -> 0xbe63), seq 1059986559, ack 4023966925, win 65483, options [mss 65495,sackOK,TS val 1084365632 ecr 2467400527,nop,wscale 7], length 0
09:15:05.940673 IP (tos 0x0, ttl 64, id 26048, offset 0, flags [DF], proto TCP (6), length 52)
localhost.60850 > myhostname.http: Flags [.], cksum 0xff28 (incorrect -> 0xe51f), seq 1, ack 1, win 512, options [nop,nop,TS val 2467400527 ecr 1084365632], length 0
09:15:05.940703 IP (tos 0x0, ttl 64, id 26049, offset 0, flags [DF], proto TCP (6), length 241)
localhost.60850 > myhostname.http: Flags [P.], cksum 0xffe5 (incorrect -> 0x771f), seq 1:190, ack 1, win 512, options [nop,nop,TS val 2467400527 ecr 1084365632], length 189: HTTP, length: 189
GET /contents/example/example.xml HTTP/1.1
User-Agent: Wget/1.20.3 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: myhostname
Connection: Keep-Alive
09:15:05.940721 IP (tos 0x0, ttl 64, id 2993, offset 0, flags [DF], proto TCP (6), length 52)
myhostname.http > localhost.60850: Flags [.], cksum 0xff28 (incorrect -> 0xe463), seq 1, ack 190, win 511, options [nop,nop,TS val 1084365632 ecr 2467400527], length 0
09:15:05.946161 IP (tos 0x0, ttl 64, id 2994, offset 0, flags [DF], proto TCP (6), length 11426)
myhostname.http > localhost.60850: Flags [P.], cksum 0x2b97 (incorrect -> 0xe10f), seq 1:11375, ack 190, win 512, options [nop,nop,TS val 1084365637 ecr 2467400527], length 11374: HTTP
09:15:05.946180 IP (tos 0x0, ttl 64, id 26050, offset 0, flags [DF], proto TCP (6), length 52)
localhost.60850 > myhostname.http: Flags [.], cksum 0xff28 (incorrect -> 0xb81b), seq 190, ack 11375, win 463, options [nop,nop,TS val 2467400532 ecr 1084365637], length 0
09:15:10.951973 IP (tos 0x0, ttl 64, id 2995, offset 0, flags [DF], proto TCP (6), length 52)
myhostname.http > localhost.60850: Flags [F.], cksum 0xff28 (incorrect -> 0xa45b), seq 11375, ack 190, win 512, options [nop,nop,TS val 1084370643 ecr 2467400532], length 0
09:15:10.952765 IP (tos 0x0, ttl 64, id 26051, offset 0, flags [DF], proto TCP (6), length 52)
localhost.60850 > myhostname.http: Flags [F.], cksum 0xff28 (incorrect -> 0x90cb), seq 190, ack 11376, win 512, options [nop,nop,TS val 2467405539 ecr 1084370643], length 0
09:15:10.952792 IP (tos 0x0, ttl 64, id 2996, offset 0, flags [DF], proto TCP (6), length 52)
myhostname.http > localhost.60850: Flags [.], cksum 0xff28 (incorrect -> 0x90ca), seq 11376, ack 191, win 512, options [nop,nop,TS val 1084370644 ecr 2467405539], length 0
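You can see that the last three entries arrive exactly 5 seconds later, which matches the Keep-Alive timeout of 5. They carry no payload (length 0) and only close the connection, so the unrelated content at the bottom must already be part of the single 11374-byte response packet.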
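The same exchange can be reproduced without wget or tcpdump. This is a rough sketch, assuming plain HTTP on port 80: it sends a request modeled on the one in the capture over a raw socket and records when each chunk arrives, so the first bytes and the delay before the server closes the connection are visible. The function names and the User-Agent value are invented for illustration.

```python
import socket
import time


def build_request(host: str, path: str) -> bytes:
    """Build a GET request modeled on the one wget sent in the capture."""
    lines = [
        f"GET {path} HTTP/1.1",
        "User-Agent: raw-probe/0.1",  # hypothetical value, not wget's
        "Accept: */*",
        "Accept-Encoding: identity",
        f"Host: {host}",
        "Connection: Keep-Alive",
        "",
        "",  # the two empty entries produce the terminating blank line
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_raw(host: str, port: int, path: str, timeout: float = 10.0):
    """Return a list of (seconds_since_start, bytes) for every chunk read.

    Reads until the server closes the connection or the timeout expires.
    """
    chunks = []
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
        while True:
            try:
                data = sock.recv(65536)
            except socket.timeout:
                break  # server kept the connection open longer than 'timeout'
            if not data:
                break  # server closed the connection
            chunks.append((time.monotonic() - start, data))
    return chunks
```

Calling fetch_raw("myhostname", 80, "/contents/example/example.xml") and printing the first 60 bytes of the first chunk shows directly whether the response starts with a status line or in the middle of a header.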
Important additional findings:
I emptied the web folder completely, so no .htaccess files can be interfering.
The behavior does not occur if the XML file is requested with compression (Accept-Encoding: gzip).
If I remove 'security.conf' from 'conf-enabled', I get a slightly different (but still wrong) first line of output:
st-Modified: Thu, 28 Nov 2024 09:41:43 GMT [sic!]
The behavior only occurs if the XML file is downloaded via a symbolic link in the web folder that points to a CIFS mount point, not if it is located directly in the folder.
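Since the last finding depends on how the file is reached, it helps to confirm for a given path whether it goes through a symbolic link and which filesystem it finally lives on. The sketch below is Linux-only in its use of /proc/mounts; the function names are hypothetical, and the path in the usage part is an assumption about where the symlink sits (the post only gives the DocumentRoot and the URL).

```python
import os


def resolve_info(path: str) -> dict:
    """Report the real location of 'path' and whether a symlink is involved."""
    real = os.path.realpath(path)
    return {
        "path": path,
        "real_path": real,
        "via_symlink": os.path.abspath(path) != real,
    }


def fs_type(real_path: str, mounts_text: str) -> str:
    """Return the filesystem type of the longest mount point containing real_path.

    'mounts_text' is the content of /proc/mounts (device, mount point, type, ...).
    """
    best_mount, best_type = "", "unknown"
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if real_path == mount_point or real_path.startswith(prefix):
            if len(mount_point) > len(best_mount):
                best_mount, best_type = mount_point, fstype
    return best_type


if __name__ == "__main__":
    # Assumed location of the symlinked file below the DocumentRoot.
    info = resolve_info("/var/www/typo3/public/contents/example/example.xml")
    print(info)
    if os.path.exists("/proc/mounts"):
        with open("/proc/mounts") as handle:
            print(fs_type(info["real_path"], handle.read()))
```

If via_symlink is true and the reported type is cifs, the request falls into the failing case described above; a copy of the file placed directly in the folder should report the local filesystem instead.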