We are currently evaluating hardware and topology solutions for a new environment using GFS+iSCSI and would like some suggestions/tips. We have deployed a similar solution in the past where all hosts accessing the GFS nodes were the GFS nodes themselves. The new topology would be to separate the GFS nodes from the clients accessing them.
A basic diagram would look like:
GFS_client <-> gigE <-> GFS nodes <-> gigE <-> iSCSI SAN device
1. Is this the optimal way to set up GFS+iSCSI?
2. Do you have suggestions on hardware for the GFS nodes themselves (i.e., CPU- or memory-heavy)?
3. Do you have suggestions on tweaks/config settings to increase the performance of the GFS nodes?
4. Currently we are using 3-4 gigE connections per host for performance and redundancy. At this point, does 10GbE or fiber become more attractive for cost/scaling?
The only part of this question I can suggest an answer to is #4.
We evaluated and considered 10GbE for our SAN, and decided it was cheaper, more effective, and safer to stick with teamed/load-balanced 1Gb adaptors. The cost of achieving the same level of redundancy with 10GbE was astronomical, and it provided only a nominal performance increase for clients (you're not going to put a 10GbE card in each client, after all).
I don't think there's an "optimal" setup. Just make dead sure you start your iSCSI initiator before GFS. You've already specified bonding as a redundancy/performance measure. You should probably also think about setting up a multipath connection to your target; if you have 4 NICs, maybe create 2 paths over 2 bonded interfaces for better redundancy. You should also consider using jumbo frames if you have a dedicated iSCSI switch which supports that feature.
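To illustrate the start-up ordering point, here's a minimal sketch of a pre-mount check that refuses to mount GFS until the initiator actually has sessions and multipath reports an active path. It's only a sketch: it assumes open-iscsi (iscsiadm) and dm-multipath (multipath) are in use, and the mount point is a made-up example.

    #!/usr/bin/env python
    # Sketch: don't touch GFS until iSCSI sessions and multipath paths exist.
    # Assumes open-iscsi and dm-multipath; the mount point below is hypothetical.
    import subprocess
    import sys
    import time

    def iscsi_sessions_up():
        # "iscsiadm -m session" exits non-zero when no sessions are logged in
        devnull = open("/dev/null", "w")
        return subprocess.call(["iscsiadm", "-m", "session"],
                               stdout=devnull, stderr=devnull) == 0

    def multipath_has_active_path():
        p = subprocess.Popen(["multipath", "-ll"], stdout=subprocess.PIPE)
        out = p.communicate()[0]
        return b"active" in out          # crude: at least one path reported active

    for _ in range(30):                  # wait up to ~30 seconds for the SAN
        if iscsi_sessions_up() and multipath_has_active_path():
            sys.exit(subprocess.call(["mount", "/mnt/gfs"]))   # example mount point
        time.sleep(1)

    sys.exit("iSCSI/multipath not ready; refusing to mount GFS")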
GFS as a subsystem isn't very heavy on the system. There are locks held in the kernel and some membership/heartbeat traffic running around between the nodes, and that's pretty much it. On the other hand, since you plan to make them both GFS nodes and servers being accessed by clients, you should probably invest in your NICs/switches and in RAM for the servers.
Jumbo frames. 802.3ad link aggregation if possible, on both sides (iSCSI and clients). TCP stack optimizations (/proc/sys/net/ipv4/tcp_rmem and tcp_wmem).
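For reference, a quick sketch to see what you're starting from before tuning those buffers. The net.core entries and the 16MB example figure in the comment are assumptions of mine, not values from this thread.

    # Sketch: print the current TCP buffer sysctls before changing anything.
    SYSCTLS = [
        "/proc/sys/net/core/rmem_max",
        "/proc/sys/net/core/wmem_max",
        "/proc/sys/net/ipv4/tcp_rmem",   # min / default / max (bytes)
        "/proc/sys/net/ipv4/tcp_wmem",
    ]

    for path in SYSCTLS:
        with open(path) as f:
            print("%-35s %s" % (path, f.read().strip()))

    # To make changes persistent you would normally put lines such as
    #   net.ipv4.tcp_rmem = 4096 87380 16777216    (example values only)
    # in /etc/sysctl.conf and run "sysctl -p".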
I'll skip this one; I've no idea of the costs of 10GbE.
Have you thought about network redundancy? GFS clusters are very vulnerable to missed heartbeats. We use interface bonding for all our cluster and iSCSI links, connected to separate switches.
Just to add to #3 and #4:
Jumbo frames can make a huge beneficial difference in performance, especially for "storage" networks where 99.99% of packets will be large. Just make sure to do an audit first to ensure all hosts on the network support them.
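A rough way to do that audit: fire a single don't-fragment ping, sized for a 9000-byte MTU, at every host on the storage network. A sketch is below; the addresses are placeholders, and 8972 assumes the usual 20-byte IP plus 8-byte ICMP headers.

    # Sketch: jumbo-frame audit via don't-fragment pings.
    # 8972 bytes of ICMP payload + 20 (IP header) + 8 (ICMP header) = 9000 bytes.
    import subprocess

    HOSTS = ["192.168.10.11", "192.168.10.12"]    # placeholder addresses

    devnull = open("/dev/null", "w")
    for host in HOSTS:
        rc = subprocess.call(["ping", "-c", "1", "-M", "do", "-s", "8972", host],
                             stdout=devnull, stderr=devnull)
        print("%-15s %s" % (host, "jumbo OK" if rc == 0 else "NO jumbo (or host down)"))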
Second, it's worth verifying that all those extra GigE interfaces are actually giving you more speed; most switches (by default) use MAC- or IP-based hashes, so you may not actually see more than 1Gb between a single host pair.
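To make that concrete, here's a toy model of the Linux bonding layer2 transmit hash (XOR of the last MAC byte of source and destination, modulo the slave count). The MAC addresses are made up; the point is that every frame between the same two hosts maps to the same slave, so a single host pair never sees more than one link's worth of bandwidth.

    # Toy model of the Linux bonding "layer2" xmit hash policy:
    #   slave = (last byte of src MAC XOR last byte of dst MAC) % slave count
    def layer2_slave(src_mac, dst_mac, n_slaves):
        src = int(src_mac.split(":")[-1], 16)
        dst = int(dst_mac.split(":")[-1], 16)
        return (src ^ dst) % n_slaves

    gfs_node = "00:16:3e:aa:bb:01"      # made-up MAC addresses
    san      = "00:16:3e:aa:bb:22"

    for n in (2, 3, 4):
        # No matter how many slaves are in the bond, this host pair always
        # hashes to the same slave interface.
        print("bond with %d slaves -> slave %d" % (n, layer2_slave(gfs_node, san, n)))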
By the time you're putting in 10GbE you should just bite the bullet and use FC, which is much faster for the same link rate, or wait until early next year, when the converged Ethernet gear should finally be shipping at below "early adopter" pricing.
We are evaluating solutions for our new SAN, and the EqualLogic product looks really great for iSCSI. Each bundle is 15 disks and 2 controllers (active/passive, 4GB each). Since you add 2 controllers per 15 disks, you get a linear increase in performance as you add storage capacity.
They don't do 10GbE for now, but each controller has 4 links. They provide real thin provisioning.
Link to the official page
I can't comment (yet) on LapTop006's post, but he's absolutely spot on!
The pinch is that all your network equipment in the IP-SAN must support the same MTU (Maximum Transmission Unit). If I remember correctly, the maximum MTU for jumbo frames by spec is 9000 bytes, but I have seen people using 9100 and above.
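For what it's worth, here's a trivial sketch to eyeball the configured MTU on each Linux box in the IP-SAN; the switches still have to be checked through their own management interface.

    # Sketch: print the configured MTU of every interface on this host.
    # Run the same thing on each GFS node / initiator to spot mismatches.
    import os

    for iface in sorted(os.listdir("/sys/class/net")):
        with open("/sys/class/net/%s/mtu" % iface) as f:
            print("%-10s mtu %s" % (iface, f.read().strip()))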