This looks related to this one, but it's somewhat different.
There is this WAN link between two company sites, and we need to transfer a single very large file (Oracle dump, ~160 GB).
We've got full 100 Mbps bandwidth (tested), but it looks like a single TCP connection just can't max it out due to how TCP works (ACKs, etc.). We tested the link with iperf, and the results change dramatically when increasing the TCP window size: with base settings we get ~5 Mbps throughput, and with a bigger window we can get up to ~45 Mbps, but no more than that. The network latency is around 10 ms.
Out of curiosity, we ran iperf with more than a single connection, and we found that, when running four of them, each would indeed achieve a speed of ~25 Mbps, filling up all the available bandwidth; so the key seems to be running multiple simultaneous transfers.
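For what it's worth, those figures roughly match the usual bandwidth-delay-product ceiling for a single TCP stream (throughput is capped at about window size / RTT, ignoring loss, slow start and other overheads). A quick back-of-the-envelope check with our numbers plugged in:

    # Back-of-the-envelope bandwidth-delay-product check for one TCP stream.
    # Assumption: a single stream tops out at roughly window_size / RTT
    # (ignoring loss, slow start and other overheads).
    link_mbps = 100          # tested link capacity, Mbps
    rtt_s = 0.010            # measured round-trip time, ~10 ms

    # Window needed to keep the pipe full: BDP = bandwidth * RTT
    bdp_bytes = (link_mbps * 1_000_000 / 8) * rtt_s
    print(f"Window needed to fill the link: ~{bdp_bytes / 1024:.0f} KB")   # ~122 KB

    # Throughput ceiling for a few window sizes
    for window_kb in (8, 64, 128, 256):
        max_mbps = (window_kb * 1024 * 8) / rtt_s / 1_000_000
        print(f"{window_kb:4d} KB window -> at most ~{max_mbps:.0f} Mbps")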
With FTP, things get worse: even with optimized TCP settings (large window size, max MTU, etc.) we can't get more than 20 Mbps on a single transfer. We tried FTPing several big files at the same time, and things did get a lot better than with a single transfer; but then disk I/O became the culprit, because reading and writing four big files on the same disk quickly turns into a bottleneck. We also don't seem to be able to split the single large file into smaller ones and merge them back afterwards, at least not in acceptable times (obviously we can't spend as long splitting and re-merging the file as we would spend transferring it).
The ideal solution here would be a multithreaded tool that could transfer different chunks of the file at the same time; sort of like what peer-to-peer programs such as eMule or BitTorrent already do, but from a single source to a single destination. Ideally, the tool would let us choose how many parallel connections to use, and of course optimize disk I/O so it doesn't jump (too) madly between different sections of the file.
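Just to make clear what kind of tool we have in mind, here's a rough sketch in Python (purely illustrative, not something we run; it assumes the file is exposed over HTTP by a server that honours Range requests, and the URL, file names and connection count are made up):

    # Toy sketch of the idea: pull one large file over N parallel HTTP Range
    # requests and write each chunk at its own offset in a pre-allocated file.
    # URL, file names and the number of connections are made up for illustration.
    import concurrent.futures
    import urllib.request

    URL = "http://server.example/oracle-dump.dmp"   # hypothetical source
    DEST = "oracle-dump.dmp"
    CONNECTIONS = 4                                  # parallel streams

    def total_size(url):
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req) as resp:
            return int(resp.headers["Content-Length"])

    def fetch_range(url, dest, start, end):
        # One worker per byte range; each writes sequentially at its own offset,
        # so the disk isn't asked to jump madly between sections of the file.
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as resp, open(dest, "r+b") as out:
            out.seek(start)
            while True:
                block = resp.read(1024 * 1024)
                if not block:
                    break
                out.write(block)

    size = total_size(URL)
    with open(DEST, "wb") as f:
        f.truncate(size)                             # pre-allocate the destination

    chunk = size // CONNECTIONS
    ranges = [(i * chunk, size - 1 if i == CONNECTIONS - 1 else (i + 1) * chunk - 1)
              for i in range(CONNECTIONS)]

    with concurrent.futures.ThreadPoolExecutor(max_workers=CONNECTIONS) as pool:
        jobs = [pool.submit(fetch_range, URL, DEST, s, e) for s, e in ranges]
        for job in jobs:
            job.result()                             # surface any worker errors

The point is that each connection gets one large contiguous slice and writes it sequentially at its own offset, which is also roughly what would keep the disk I/O sane.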
Does anyone know of such a tool?
Or, can anyone suggest a better solution and/or something we already didn't try?
P.S. We have already thought of backing the dump up to tape/disk and physically shipping it to the destination; that would be our measure of last resort if the WAN just doesn't cut it. After all, as A. S. Tanenbaum said, "Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway."
Searching for "high latency file transfer" brings up a lot of interesting hits. Clearly, this is a problem that both the CompSci community and the commercial community have put thought into.
A few commercial offerings that appear to fit the bill:
FileCatalyst has products that can stream data over high-latency networks, either using UDP or multiple TCP streams. They've got a lot of other features, too (on-the-fly compression, delta transfers, etc.).
The fasp file transfer "technology" from Aspera appears to fit the bill for what you're looking for, as well.
In the open-source world, the uftp project looks promising. You don't particularly need its multicast capabilities, but the basic idea of blasting out a file to receivers, receiving NAKs for missed blocks at the end of the transfer, and then blasting out the NAK'd blocks (lather, rinse, repeat) sounds like it would do what you need, since there's no ACK'ing (or NAK'ing) from the receiver until after the file transfer has completed once. Assuming the network is just latent, and not lossy, this might do what you need, too.
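To make the "blast, then NAK" cycle concrete, here's a toy sender-side sketch (this is not uftp's actual protocol or packet format; the address, block size and NAK reply layout are all invented for illustration, and the receiver side is omitted):

    # Toy illustration of the "blast, then NAK" cycle (NOT uftp's actual protocol).
    # Assumes a cooperating receiver at RECEIVER that, on seeing the b"DONE" marker,
    # replies with a comma-separated list of missing block numbers (empty = done).
    # Address, port, block size and packet layout are all made up for illustration.
    import socket

    RECEIVER = ("198.51.100.10", 9000)    # hypothetical destination
    BLOCK = 1024                          # payload bytes per datagram

    def send_file(path):
        # Reads the whole file into memory: fine for a toy, not for 160 GB.
        with open(path, "rb") as f:
            blocks = []
            while True:
                data = f.read(BLOCK)
                if not data:
                    break
                blocks.append(data)

        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.settimeout(5.0)

        missing = list(range(len(blocks)))            # first pass: send everything
        while missing:
            for i in missing:                         # blast the outstanding blocks
                sock.sendto(i.to_bytes(4, "big") + blocks[i], RECEIVER)
            sock.sendto(b"DONE", RECEIVER)            # ask what the receiver missed
            reply, _ = sock.recvfrom(65536)           # NAK list comes back here
            missing = [int(n) for n in reply.split(b",") if n]

    send_file("oracle-dump.dmp")                      # hypothetical file name

The real uftp does far more (rate control, multicast, encryption), so treat this purely as a picture of the ACK-free send/NAK/resend loop.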
Really odd suggestion, this one. Set up a simple web server to host the file on your network (I suggest nginx, incidentally), then set up a PC with Firefox on the other end and install the DownThemAll extension.
It's a download accelerator that supports chunking and re-assembly.
You can break each download into 10 chunks for re-assembly, and it does actually make things quicker!
(Caveat: I've never tried it on anything as big as 160 GB, but it does work well with 20 GB ISO files.)
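If you do go this route, it's worth checking up front that the web server in front of the file actually honours Range requests, since that's what the chunked download relies on. A minimal check (Python, placeholder URL):

    # Quick check that the server honours byte-range requests, which is what the
    # chunked download relies on. The URL is a placeholder for wherever you host it.
    import urllib.request

    req = urllib.request.Request("http://server.example/oracle-dump.dmp",
                                 headers={"Range": "bytes=0-0"})
    with urllib.request.urlopen(req) as resp:
        # HTTP 206 Partial Content (and/or "Accept-Ranges: bytes") means ranges work.
        print(resp.status, resp.headers.get("Accept-Ranges"))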
UDT is probably the most popular transport for high-latency communications. This leads on to their other software, Sector/Sphere, a "High Performance Distributed File System and Parallel Data Processing Engine", which might be worth a look.
My answer is a bit late, but I just found this question while looking for fasp. During that search I also found the "Tsunami UDP Protocol": http://tsunami-udp.sourceforge.net/
From their website:
As far as speed goes, the page mentions this result (on a link between Helsinki, Finland and Bonn, Germany over a 1 Gbit link):
If you want to use a download accelerator, have a look at lftp; as far as I know, it's the only download accelerator that can do a recursive mirror.
The bbcp utility from the very relevant page 'How to transfer large amounts of data via network' seems to be the simplest solution.