Background/Environment Architecture:
My current environment for $corp_overlords$
is set up in a hub-and-spoke model, with a technologically well-endowed home office hub (SAN, bladecenter/bladesystem ESXi cluster, fiber internet connection, etc.) connected to a number of remote site spokes that are not so well off; each spoke typically contains a single ESXi host server and connects to the home office hub via a T1. All traffic originating at any remote site routes back to the home office over an "MPLS network" (which is really just a T1 connecting the remote site to the home office).
At the home office, on the SAN, we have a number of VM templates that I have created to deploy VMs from. They are stored on an NFS volume that is presented as a vSphere datastore attached to the home office datacenter object within vSphere.
Each remote site has a corresponding vSphere datacenter object, containing a datastore object that's connected to the locally-attached storage on the ESXi host server physically located at the remote site.
Because these VM templates sit on the NFS volume thin-provisioned, they occupy ~40 GiB. As files on NTFS (or a Linux filesystem), they would occupy ~100 GiB.
Question:
How should I copy this 40 GiB of thin-provisioned data (that occupies 100 GiB of filesystem space) between my sites?
I am under the constraints that I have approximately 5 days to do so, and cannot interfere (noticeably) with "normal network traffic."
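As a rough sanity check on that time constraint (assuming the nominal 1.544 Mbps T1 line rate), even a fully saturated link needs about two and a half days to move 40 GiB, which is why the throttling requirement makes the 5-day window tight:

    # Rough transfer-time estimate for 40 GiB over a fully saturated T1 (1.544 Mbps)
    $bits    = 40 * 1GB * 8        # 1GB in PowerShell is 2^30 bytes
    $seconds = $bits / 1.544e6     # T1 line rate in bits per second
    '{0:N1} hours (~{1:N1} days)' -f ($seconds / 3600), ($seconds / 86400)
    # => roughly 61.8 hours (~2.6 days) at 100% utilisation, before protocol overhead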
How about using ovftool to copy the templates directly between hosts?
I have used this for VMs before, and it works pretty well. I'm not sure whether it also works for templates, but if not, you can just convert the templates to VMs temporarily while copying them.
Instructions, with an example, are here.
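For illustration, a sketch of what that host-to-host copy might look like (this is not taken from the linked instructions; the server, datacenter, template, and datastore names are placeholders, and the ovftool flags should be checked against "ovftool --help" for your version):

    # Hypothetical names throughout; templates are converted to VMs before export.
    Connect-VIServer -Server vcenter.corp.local
    Set-Template -Template 'Win2012-Template' -ToVM

    # ovftool can copy straight from one vi:// locator to another; --diskMode=thin
    # should keep the disks thin on the destination datastore (verify flags locally).
    & 'C:\Program Files\VMware\VMware OVF Tool\ovftool.exe' `
        --diskMode=thin `
        --datastore=RemoteLocalDS `
        'vi://administrator@vcenter.corp.local/HQ-DC/vm/Win2012-Template' `
        'vi://root@esxi-remote01.corp.local'

    # Restore template status at the home office when the copy completes.
    Set-VM -VM 'Win2012-Template' -ToTemplate -Confirm:$false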
You could also use ovftool to convert your templates to .ovf packages, which should be very compact, and then transfer the packages between datacenters with BITS, FTP, SCP, or whatever protocol you want.
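If BITS is the transport, a rough sketch of pulling the exported package down at the remote site as a low-priority, resumable background job (the share, server, and folder names are placeholders):

    # Pull the exported .ovf/.vmdk files from an HQ file share as a background BITS job.
    Import-Module BitsTransfer

    $job = Start-BitsTransfer -Source '\\hq-files\templates\Win2012-Template*' `
                              -Destination 'D:\TemplateStaging\' `
                              -Priority Low `
                              -Asynchronous

    # Low-priority BITS jobs use idle bandwidth and survive reboots and link drops;
    # poll until the job finishes, then commit the downloaded files to disk.
    while ((Get-BitsTransfer -JobId $job.JobId).JobState -in 'Connecting','Transferring','TransientError') {
        Start-Sleep -Seconds 300
    }
    Complete-BitsTransfer -BitsJob $job

Run the same call through Invoke-Command (PowerShell Remoting) and the transfer can be kicked off on the remote staging box from HQ in a single command.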
Options:
The way I see it, I have three possible approaches, though I dearly hope I'm missing a better one that someone here can point me at. (Ideally one that has me moving only the 40 GiB of actual data, in a resumable, "background" or speed-throttled method.)
Bonus: PowerShell Remoting makes it possible to do this in one single command.
Here's a somewhat interesting idea for you. It won't help with your initial seeding, but I wonder if using something like CrashPlan's free product would help you with your templates.
https://www.code42.com/store/
It does dedupe and block-level differentials, so you could install it on one local server there at HQ as the "seeder", and on each spoke server (in a VM, I guess) as a "receiver". Set up the backups to include only the folder where the templates will be stored on the HQ server. It can also back up to multiple destinations (such as each "spoke"): https://support.code42.com/CrashPlan/Latest/Getting_Started/Choosing_Destinations
The rough flow (after setting up the CrashPlan app on each side) would be: back up the template folder at HQ, and let CrashPlan replicate it out to each spoke destination.
Just an idea...might be an interesting road to venture down and see if it works as a poor man's dedupe/block level replication for just these files.
I've done this type of move a number of ways, but given what you've described...
FedEx or UPS, with a twist...
I know that the servers in use are HP ProLiant and Dell PowerEdge servers. VMware does not have good support for removable devices (e.g. USB) as datastore targets. However, using a single drive RAID 0 logical drive (in HP-speak) at the main site can work. You can add and remove locally-attached disks on HP and Dell systems and use that as a means to transport datastores.
Being templates, you can move/copy them to your local disk via vCenter. Ship the disks. Insert into the receiving standalone server. The array and datastore will be recognized via a storage system rescan. Copy data. Profit.
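A hedged PowerCLI sketch of the receiving side (host, datastore, and folder names are placeholders):

    # On the standalone remote host: rescan so the shipped single-drive RAID 0
    # datastore appears, then copy the template folder onto the local datastore.
    Connect-VIServer -Server esxi-remote01.corp.local

    Get-VMHostStorage -VMHost esxi-remote01.corp.local -RescanAllHba -RescanVmfs | Out-Null

    # When connected directly to an ESXi host, the vmstore: drive sits under 'ha-datacenter'.
    Copy-DatastoreItem -Item 'vmstore:\ha-datacenter\TransportDS\Templates\*' `
                       -Destination 'vmstore:\ha-datacenter\LocalDS\Templates\' `
                       -Recurse -Force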
I've also used this as a means to seed copies for vSphere replication, as 24 hours of deltas is a lot easier to manage than multiple full syncs.
This is a method I use fairly often for this kind of scenario. It seems counter-intuitive because you are uploading files from inside a VM stored on the datastore to the datastore itself, but it gives you a lot more control over how the transfer is accomplished (see the sketch after the pros and cons below).
Pros:
By breaking the template into smaller pieces you reduce the risk of data corruption during transfer. (If a file gets corrupted, you only need to re-upload that piece of the RAR, rather than the entire 40GB file.)
You only transfer 40GB (probably less as RAR'ing will compress further).
You get your pick of transfer utilities as you're doing the transfer inside the OS of your choice.
Cons:
You have to create a staging VM. I make this easier by having a pre-created template that is <1GB that has just a bare OS install + SFTP server.
Compressing/decompressing a 40GB template will take ~4-6 hours depending on your CPU resources.
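To make the split-and-transfer step concrete, a sketch of what runs inside the staging VM (the rar.exe path, volume size, and hostnames are placeholders; check the RAR switches and volume naming against your version):

    # Split the template files into 1 GB volumes with a 5% recovery record, so a damaged
    # piece can be repaired or re-sent on its own instead of re-transferring everything.
    & 'C:\Program Files\WinRAR\rar.exe' a -v1g -rr5% D:\Staging\template.rar D:\Export\MyTemplate\

    # Push the volumes to the staging VM at the remote site; if the link drops,
    # only the volume that was in flight needs to be sent again.
    scp D:\Staging\template.part*.rar transfer@remote-staging:/upload/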
I've dealt with this same issue quite a few times, and about half the time I find that I'm far better off just building new machines in the remote location. This is especially true for what I call "template" machines. My version of that is a pretty basic machine; your version may be something a little different.