I use a simple site enabled to publish files in Apache:
File: /etc/apache2/sites-enabled/contents.conf
<Directory "/mnt/data/contents/">
Options FollowSymLinks
Require all granted
<IfModule mod_expires.c>
ExpiresActive on
ExpiresDefault "access plus 7 days"
</IfModule>
</Directory>
The files are simple XML, an example starts with these lines:
<mets:mets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:kitodo="http://meta.kitodo.org/v1/"
When I download the file locally, wget
complains about no headers:
user@myhostname:~$ wget http://myhostname/contents/example/example.xml
--2024-12-05 16:14:59-- http://myhostname/contents/example/example.xml
Resolving myhostname (myhostname)... 127.0.1.1
Connecting to myhostname (myhostname)|127.0.1.1|:80... connected.
HTTP request sent, awaiting response... 200 No headers, assuming HTTP/0.9
Length: unspecified
Saving to: ‘example.xml’
example.xml [ <=> ] 12,66K --.-KB/s in 4,8s
2024-12-05 16:15:04 (2,66 KB/s) - ‘example.xml’ saved [12966]
The downloaded file starts like:
12:25:45 GMT
Accept-Ranges: bytes
Content-Length: 12563
Cache-Control: max-age=0
Expires: Thu, 05 Dec 2024 09:45:44 GMT
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: application/xml; charset=utf-8
<mets:mets xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:kitodo="http://meta.kitodo.org/v1/"
Obviously the first line doesn't belong there and prevents HTTP headers from being properly recognized. Where does the line come from and how can I turn it off? I don't experience anything like this on other systems.
Server version: Apache/2.4.41 (Ubuntu)
Loaded Modules:
core_module (static)
so_module (static)
watchdog_module (static)
http_module (static)
log_config_module (static)
logio_module (static)
version_module (static)
unixd_module (static)
access_compat_module (shared)
alias_module (shared)
auth_basic_module (shared)
authn_core_module (shared)
authn_file_module (shared)
authnz_ldap_module (shared)
authz_core_module (shared)
authz_host_module (shared)
authz_user_module (shared)
autoindex_module (shared)
dav_module (shared)
dav_fs_module (shared)
deflate_module (shared)
dir_module (shared)
env_module (shared)
expires_module (shared)
filter_module (shared)
headers_module (shared)
jk_module (shared)
ldap_module (shared)
mime_module (shared)
mpm_prefork_module (shared)
negotiation_module (shared)
php7_module (shared)
reqtimeout_module (shared)
rewrite_module (shared)
setenvif_module (shared)
socache_shmcb_module (shared)
ssl_module (shared)
status_module (shared)
File: /etc/apache2/sites-enabled/000-default.conf
(comments removed)
<VirtualHost *:80>
ServerAdmin webmaster@localhost
DocumentRoot /var/www/typo3/public
<Directory /var/www/typo3/public/>
Options Indexes FollowSymLinks MultiViews
AllowOverride All
Order allow,deny
Allow from all
</Directory>
ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined
JkMount /kitodo ajp13_worker
JkMount /kitodo/* ajp13_worker
<Location /kitodo>
Order allow,deny
Allow from all
</Location>
</VirtualHost>
File: /etc/apache2/sites-enabled/default-ssl.conf
(comments removed)
<IfModule mod_ssl.c>
<VirtualHost _default_:443>
ServerAdmin webmaster@localhost
DocumentRoot /var/www/html
ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined
SSLEngine on
SSLCertificateFile /etc/ssl/certs/ssl-cert-snakeoil.pem
SSLCertificateKeyFile /etc/ssl/private/ssl-cert-snakeoil.key
<FilesMatch "\.(cgi|shtml|phtml|php)$">
SSLOptions +StdEnvVars
</FilesMatch>
<Directory /usr/lib/cgi-bin>
SSLOptions +StdEnvVars
</Directory>
</VirtualHost>
</IfModule>
Edit: Output of curl -i
:
user@myhostname:~# curl -i http://myhostname/contents/example/example.xml
curl: (1) Received HTTP/0.9 when not allowed
Output of wget -O - -o /dev/null --save-headers
09:41:43 GMT
Accept-Ranges: bytes
Content-Length: 10971
Cache-Control: max-age=0
Expires: Mon, 09 Dec 2024 08:22:33 GMT
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
Keep-Alive: timeout=5, max=100
Connection: Keep-Alive
Content-Type: application/xml; charset=utf-8
(...)
ataTable:11:inputText",onco:function(xhr,status,args,data){preserveMetadata(); updateTitleMetadata();;}});" /><s
(...)
stands for the XML file content. I also see there is content at the end that shouldn’t belong there either. I recognize that content as part of the web application included through the JkMount. And the content in the last line also is different with each request.
Output of tcpdump -vv -i any -s 0 'tcp port http'
(I hope I got the right lines, because there are people working on the server meanwhile):
09:15:05.940645 IP (tos 0x0, ttl 64, id 26047, offset 0, flags [DF], proto TCP (6), length 60)
localhost.60850 > myhostname.http: Flags [S], cksum 0xff30 (incorrect -> 0x75f8), seq 4023966924, win 65495, options [mss 65495,sackOK,TS val 2467400527 ecr 0,nop,wscale 7], length 0
09:15:05.940660 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
myhostname.http > localhost.60850: Flags [S.], cksum 0xff30 (incorrect -> 0xbe63), seq 1059986559, ack 4023966925, win 65483, options [mss 65495,sackOK,TS val 1084365632 ecr 2467400527,nop,wscale 7], length 0
09:15:05.940673 IP (tos 0x0, ttl 64, id 26048, offset 0, flags [DF], proto TCP (6), length 52)
localhost.60850 > myhostname.http: Flags [.], cksum 0xff28 (incorrect -> 0xe51f), seq 1, ack 1, win 512, options [nop,nop,TS val 2467400527 ecr 1084365632], length 0
09:15:05.940703 IP (tos 0x0, ttl 64, id 26049, offset 0, flags [DF], proto TCP (6), length 241)
localhost.60850 > myhostname.http: Flags [P.], cksum 0xffe5 (incorrect -> 0x771f), seq 1:190, ack 1, win 512, options [nop,nop,TS val 2467400527 ecr 1084365632], length 189: HTTP, length: 189
GET /contents/example/example.xml HTTP/1.1
User-Agent: Wget/1.20.3 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: myhostname
Connection: Keep-Alive
09:15:05.940721 IP (tos 0x0, ttl 64, id 2993, offset 0, flags [DF], proto TCP (6), length 52)
myhostname.http > localhost.60850: Flags [.], cksum 0xff28 (incorrect -> 0xe463), seq 1, ack 190, win 511, options [nop,nop,TS val 1084365632 ecr 2467400527], length 0
09:15:05.946161 IP (tos 0x0, ttl 64, id 2994, offset 0, flags [DF], proto TCP (6), length 11426)
myhostname.http > localhost.60850: Flags [P.], cksum 0x2b97 (incorrect -> 0xe10f), seq 1:11375, ack 190, win 512, options [nop,nop,TS val 1084365637 ecr 2467400527], length 11374: HTTP
09:15:05.946180 IP (tos 0x0, ttl 64, id 26050, offset 0, flags [DF], proto TCP (6), length 52)
localhost.60850 > myhostname.http: Flags [.], cksum 0xff28 (incorrect -> 0xb81b), seq 190, ack 11375, win 463, options [nop,nop,TS val 2467400532 ecr 1084365637], length 0
09:15:10.951973 IP (tos 0x0, ttl 64, id 2995, offset 0, flags [DF], proto TCP (6), length 52)
myhostname.http > localhost.60850: Flags [F.], cksum 0xff28 (incorrect -> 0xa45b), seq 11375, ack 190, win 512, options [nop,nop,TS val 1084370643 ecr 2467400532], length 0
09:15:10.952765 IP (tos 0x0, ttl 64, id 26051, offset 0, flags [DF], proto TCP (6), length 52)
localhost.60850 > myhostname.http: Flags [F.], cksum 0xff28 (incorrect -> 0x90cb), seq 190, ack 11376, win 512, options [nop,nop,TS val 2467405539 ecr 1084370643], length 0
09:15:10.952792 IP (tos 0x0, ttl 64, id 2996, offset 0, flags [DF], proto TCP (6), length 52)
myhostname.http > localhost.60850: Flags [.], cksum 0xff28 (incorrect -> 0x90ca), seq 11376, ack 191, win 512, options [nop,nop,TS val 1084370644 ecr 2467405539], length 0
You can see that the last three entries come exactly 5 secs later, which is the unrelated stuff at the bottom.
Important additional findings:
Web folder completely emptied, no .htaccess files can be playing in.
The behavior does not occur if the XML file is retrieved via a compressed query (Accept-Encoding gzip)
If I remove 'security.conf' from 'conf-enabled', I get a slightly different (but still wrong) first line of output:
st-Modified: Thu, 28 Nov 2024 09:41:43 GMT
[sic!]The behavior only occurs if the XML file is downloaded via a symbolic link in the web folder that points to a CIFS mount point, not if it is located right in the folder
We couldn’t find the true reason and fix it. We updated good old Ubuntu 20 to Ubuntu 22 now, and now the issue is gone. Sad to say that I cannot give a better answer.