My friend runs a popular YouTube-to-GIF conversion site. Right now he has converted 250,000 YouTube videos to GIFs (each video gets 6 thumbnails, so 1.5 million GIF files in total) and serves about 80 TB of bandwidth per month.
His server is I/O bound -- I'm not a guru admin, but it seems to be the hard-drive seek time on non-sequential GIF reads that's clogging everything up. He has a server with 100tb.com for $300/mo, which comes with 100 TB of free bandwidth. At first I advised him to get a CDN, because then the GIFs would be served without consuming his server's resources and his main box could just handle the encoding. We found one CDN for $600/mo that was too slow and unreliable, and the rest wanted at least $2,000/mo for 80 TB of bandwidth. We're trying to keep the whole project under $900/mo right now.
So the cheapest bandwidth we can find is with 100tb.com, but we're outgrowing a single server. We could add another server, but I don't really know how to partition the GIF storage so that the load is distributed evenly between the two boxes. Our host recommended software like Aflexi.net, but I'm sure there must be a cheaper solution.
Can anyone help? I'm a programmer by trade, not a sysadmin, but trying to learn the ropes. Thanks!
S3 is no alternative; the bandwidth bill for 80 TB alone would be over $8k per month.
It looks like you serve the GIFs straight off the filesystem. Why not split the GIFs across the 2 machines, using a hash algorithm that maps each filename to one of them, and serve every file from the machine it hashes to? This scales easily to more machines as long as your load balancer holds up…
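A minimal sketch of that mapping in PHP, assuming the page code builds the image URLs itself (the host names and paths here are placeholders):

    <?php
    // Illustrative only: deterministically map a GIF filename to one of N hosts,
    // so every file lives on -- and is always requested from -- exactly one box.
    $hosts = array('gif1.example.com', 'gif2.example.com');

    function hostForFile($filename, $hosts) {
        // First 6 hex chars of md5 -> small integer bucket; stable across requests.
        $bucket = hexdec(substr(md5($filename), 0, 6)) % count($hosts);
        return $hosts[$bucket];
    }

    // At page-render time, point the <img> tag at whichever box owns the file:
    $file = 'dQw4w9WgXcQ-3.gif';
    $url  = 'http://' . hostForFile($file, $hosts) . '/gifs/' . $file;

The catch with a plain modulo is that adding a third box remaps most filenames; consistent hashing avoids that, but with only two or three servers a one-off reshuffle script is usually simpler.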
Dump the files to S3 and serve them from there. The poor man's CDN :)
If you need more processing power, you can do the conversions on EC2 instances and dump straight to your "CDN" as well.
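For what it's worth, the upload side is only a couple of calls with the AWS SDK for PHP. A rough sketch -- the bucket name, region, and paths are made up, and the client setup varies by SDK version:

    <?php
    // Rough sketch: push a finished GIF to S3 with a public-read ACL so it can
    // be served straight from the bucket. Bucket, region and paths are placeholders.
    require 'vendor/autoload.php';

    use Aws\S3\S3Client;

    $s3 = new S3Client(array(
        'version' => 'latest',
        'region'  => 'us-east-1',
    ));

    $s3->putObject(array(
        'Bucket'      => 'my-gif-bucket',
        'Key'         => 'gifs/dQw4w9WgXcQ-3.gif',
        'SourceFile'  => '/var/gifs/dQw4w9WgXcQ-3.gif',
        'ACL'         => 'public-read',
        'ContentType' => 'image/gif',
    ));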
I can't add much to the other comments, but they sound good. I would look at lifting some of the load off the file servers by keeping your most commonly accessed (i.e. most popular) files in a memory cache, i.e. an HTTP handler along the lines of the sketch below:
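A rough sketch of that handler, using Memcached as the in-memory store (APC or anything similar would work too); the paths, query parameter, and TTL are made up, and keep in mind Memcached's default 1 MB per-item limit, which larger GIFs will exceed:

    <?php
    // Sketch of the handler idea: serve popular GIFs from RAM, fall back to disk.
    $mc = new Memcached();
    $mc->addServer('127.0.0.1', 11211);

    $file = basename($_GET['f']);   // e.g. gif.php?f=abc123-2.gif; basename() strips path components
    $key  = 'gif:' . $file;

    $data = $mc->get($key);
    if ($data === false) {
        // Cache miss: read from disk and keep it hot for an hour.
        $path = '/var/gifs/' . $file;
        if (!is_file($path)) {
            header('HTTP/1.1 404 Not Found');
            exit;
        }
        $data = file_get_contents($path);
        $mc->set($key, $data, 3600);
    }

    header('Content-Type: image/gif');
    header('Content-Length: ' . strlen($data));
    echo $data;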
If you can get a machine with a crap-load of RAM, you're laughing, as it's quite likely you'll be able to fit a large percentage of your popular files in memory.
And when you saturate that, add another image-handler server and round-robin between them. Keep doing this until something breaks -- throughput, scalability, or economics.
I've done something like this before to good effect.
If it's just 2 machines, you could consider using DRBD to keep the files in sync between them. Then just use PHP to decide, randomly or algorithmically, which server to pull from on each request. Simple but workable.
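A minimal sketch of the PHP side, assuming the GIF directory ends up identical and readable on both boxes so either one can serve any file (host names are placeholders):

    <?php
    // Sketch: with the same GIFs available on both boxes, pick a host per
    // request when building the image URL. Host names are placeholders.
    $hosts = array('gif1.example.com', 'gif2.example.com');

    // A random pick spreads requests roughly evenly across the two servers.
    $host = $hosts[array_rand($hosts)];

    $file = 'dQw4w9WgXcQ-3.gif';
    $url  = 'http://' . $host . '/gifs/' . $file;

The "algorithmic" variant would be the hash mapping from the earlier answer, which has the nice side effect that each file always hits the same box's page cache.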