I have a network of 20 machines running Ubuntu 10.04.
Each machine has about 200 [GB] of data that I'd like to share with all other 19 machines for READ ONLY PURPOSES. The reading should be done in the fastest possible way.
A friend told me to look into setting up HTTP / FTP. Is that indeed the optimal way to share data between the machines (better than NFS)? If so, how do I go about it?
Is there a Python module that would help in accessing/reading the data?
UPDATE: Just to clarify, all I want is to be able (from within machine X) to access one of machine Y's files and LOAD IT INTO MEMORY. All of the files are of uniform size (500 [KB]). Which method is fastest (SAMBA / NFS / HTTP / FTP)?
With Python you can start up a web server via a simple one-liner in the directory where the data is stored.
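The one-liner is presumably the SimpleHTTPServer module that ships with Python 2 (the default on Ubuntu 10.04); run it from the directory you want to share:

    python -m SimpleHTTPServer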
Edit: It creates a simple web server on port 8000. Performance-wise I cannot tell you much, and for that type of question it would be better to ask on Super User rather than SO.
It doesn't start automatically, but it wouldn't be hard to make it do so.
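On the reading side, here is a minimal sketch of loading one of the 500 [KB] files into memory with urllib2 (Python 2); the host name and file name are hypothetical, assuming machine Y runs the one-liner above in its data directory:

    import urllib2

    # Hypothetical host and file name; machine Y is assumed to be serving
    # its data directory on the default port 8000.
    url = 'http://machine-y:8000/chunk_0001.dat'

    # Fetch the whole ~500 KB file and keep it in memory as a byte string.
    response = urllib2.urlopen(url)
    data = response.read()
    response.close()

    print '%d bytes loaded' % len(data)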
There are hundreds of ways to solve this problem. You can mount an FTP or HTTP file system over FUSE, or even use NFS (why not?). Search for httpfs2 or curlftpfs (or even sshfs, which should not be used if you are looking for performance).
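Whichever of those you mount, machine Y's files show up as a local path on machine X, so loading one into memory is plain file I/O (the mount point and file name below are hypothetical):

    # Hypothetical mount point for machine Y's shared data
    # (e.g. created with curlftpfs, sshfs or an NFS mount).
    path = '/mnt/machine-y/chunk_0001.dat'

    # Read the whole ~500 KB file into memory.
    with open(path, 'rb') as f:
        data = f.read()

    print '%d bytes loaded' % len(data)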
But the problem I see is that you have a single point of failure: the one and only master machine. Why not distribute the storage?
I usually use GlusterFS [1], which is fast and can be used in different modes.
[1] http://en.wikipedia.org/wiki/Distributed_file_system