I have a server that I use to send emails. Each email contains an image, used to track whether the email has been opened, of the form:
<img src="http://tracker.site.com/[email protected]&c=somemd5ishcode">
The script that serves up the image is as follows:
<?php
// turn off errors
ini_set('display_errors', 0);
ini_set('log_errors', 0);
// database settings
define('DB_HOST','localhost');
define('DB_USER','user');
define('DB_PASSWORD','pass');
define('DB_NAME','db');
// connecting to the db
$dbh = mysql_connect(DB_HOST,DB_USER,DB_PASSWORD,true);
mysql_select_db(DB_NAME, $dbh);
// clean the vars
$code = mysql_real_escape_string( $_GET['c'] );
$email = mysql_real_escape_string( $_GET['e'] );
// insert a record if need be
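if( $code <> '' && $email <> '' )
{
    // quick debug: dump the request environment
    $dump = '';
    foreach( $_SERVER as $k=>$v )
    {
        // "\n" needs double quotes to be a real newline; some values (e.g. argv) are arrays
        $dump.= '"'. $k .'" => "'. ( is_array($v) ? implode(' ', $v) : $v ) .'",' . "\n";
    }
    // $_SERVER contains client-supplied values (user agent etc.), so escape the dump too
    $dump = mysql_real_escape_string( $dump );
    mysql_query( "INSERT INTO `tracker_hits`
        (`CODE_TEXT`,`HIT_EMAIL`,`HIT_IP`,`HIT_TIMESTAMP`,`SERVER_DUMP`)
        VALUES
        ('$code','$email','{$_SERVER['REMOTE_ADDR']}','" .time(). "','$dump')", $dbh );
}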
// disables caching of the image
header("Expires: Sat, 26 Jul 1997 05:00:00 GMT");
header('Cache-Control: no-cache');
header('Pragma: no-cache');
// outputs a 1x1 transparent PNG
header('Content-type: image/png');
echo gzinflate(base64_decode('6wzwc+flkuJiYGDg9fRwCQLSjCDMwQQkJ5QH3wNSbCVBfsEMYJC3jH0ikOLxdHEMqZiTnJCQAOSxMDB+E7cIBcl7uvq5rHNKaAIA'));
die();
When I send the email out, though, I get hundreds of hits within the exact same second, all from the same IP address. The address varies from burst to burst, but whois says it belongs to Mediacom / Verizon / Frontier / etc. I looked around, and it seems my issue may be with an ISP cache server. I used http://redbot.org/ to check the 'cacheability' of my file, and this is what it reported:
HTTP/1.1 200 OK
Date: Thu, 17 Mar 2011 17:05:20 GMT
Server: Apache/2.2.10 (Win32) mod_ssl/2.2.10 OpenSSL/0.9.8i PHP/5.2.6
X-Powered-By: PHP/5.2.6
Expires: Sat, 26 Jul 1997 05:00:00 GMT
Cache-Control: no-cache
Pragma: no-cache
Content-Length: 87
Keep-Alive: timeout=5, max=500
Connection: Keep-Alive
Content-Type: image/png
With these notes:
General
- The Content-Length header is correct.
Caching
- Pragma: no-cache is a request directive, not a response directive.
- This response allows all caches to store it.
- This response cannot be served from cache without validation.
So, I looked around, and found that I could supposedly use Apache's expires_module to get the job done, but when setting it up with the following settings, I'm still getting the same results from REDbot:
LoadModule expires_module modules/mod_expires.so
<IfModule expires_module>
ExpiresActive On
ExpiresByType text/php A0
ExpiresByType image/png A0
ExpiresByType image/jpeg A0
ExpiresByType image/gif A0
</IfModule>
What am I missing? I looked around, and to my understanding the PHP header() calls and the <meta> headers are used only by the user's browser, not by the ISP cache.
One thought I had: since the grouped IP hits all arrive within a few seconds of the email going out, my log script could check that at least 10 seconds have passed since sending before counting a hit as valid. The issue I may run into with that, though, is what happens later, when a person actually reads the email: won't the ISP intercept the request that would have logged a hit on my server and just serve the cached version instead?
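The 10-second check described above could be sketched roughly as follows. This is only an illustration: the function name `is_countable_hit`, the `$sentAt` value, and the idea that the send time is available to the tracker script are all assumptions, not part of the original script.

```php
<?php
// Hypothetical helper: decide whether a tracker hit should count as an open.
// $sentAt and $hitAt are Unix timestamps; $graceSeconds is an assumed cutoff.
function is_countable_hit($sentAt, $hitAt, $graceSeconds = 10)
{
    // Hits arriving before the grace period has elapsed are treated as
    // automated fetches rather than a person opening the message.
    return ($hitAt - $sentAt) >= $graceSeconds;
}

// Assumed send time; in practice it would come from the mailing log.
$sentAt = 1300381520;

var_dump(is_countable_hit($sentAt, $sentAt + 2));    // bool(false)
var_dump(is_countable_hit($sentAt, $sentAt + 3600)); // bool(true)
```

A filter like this only discards the early burst. It does nothing about the concern raised above, namely that a later, genuine open may be answered from a cache and never reach the server.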
Depending on the caching an ISP employs, you may have no way of forcing a real request to your server instead of a cached response. Even if you use things like mod_expires, no-cache headers and so on, ISPs can still cache, especially if they run any sort of proxy.
So:
1. Even if you wait 10 seconds after sending, you can't guarantee that the real client view won't be served from cache (i.e. you'll see and discard the initial ISP hit, but then nothing after it).
2. You should look at a better method, e.g. email receipts.
3. If you update the question with what you want to achieve, and any limitations you have, we can probably help with your larger problem.
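For what it's worth, the REDbot note "This response allows all caches to store it" follows from the fact that `Cache-Control: no-cache` still permits storage and only requires revalidation before reuse. A stricter set of response headers might look like the sketch below. This is an assumption about what could help with well-behaved caches, not a guaranteed fix: as said above, a proxy that ignores these directives can still cache.

```php
<?php
// Sketch only: stricter anti-caching headers for the tracking image.
// Well-behaved caches should honor these; a misbehaving proxy may not.
header('Expires: Sat, 26 Jul 1997 05:00:00 GMT');
// no-store: do not keep a copy at all
// private: shared (proxy) caches must not store the response
// no-cache, must-revalidate, max-age=0: never reuse without asking the origin
header('Cache-Control: no-store, no-cache, must-revalidate, max-age=0, private');
// Kept only for legacy HTTP/1.0 caches; not a defined response directive
header('Pragma: no-cache');
header('Content-Type: image/png');
```

These calls would replace the existing `header()` lines in the tracker script, before the image bytes are echoed.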