I have a KVM host machine with several VMs on it. Each VM uses a Logical Volume on the host. I need to copy the LVs to another host machine.
Normally, I would use something like:
dd if=/the/logical-volume of=/some/path/machine.dd
to turn the LV into an image file and scp to move it, then dd to copy the file back into a new LV on the new host.
The problem with this method is that you need twice as much disk space as the VM takes on both machines, i.e. a 5GB LV uses 5GB of space for the LV, and the dd copy also uses an additional 5GB of space for the image. This is fine for small LVs, but what if (as in my case) you have a 500GB LV for a big VM? The new host machine has a 1TB hard drive, so it can't hold a 500GB dd image file, have a 500GB logical volume to copy to, and still have room for the host OS and for other smaller guests.
What I would like to do is something like:
dd if=/dev/mygroup-mylv of=192.168.1.103/dev/newvgroup-newlv
In other words, copy the data directly from one logical volume to the other over the network and skip the intermediate image file.
Is this possible?
Sure, of course it's possible.
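A minimal sketch of that one-liner, assuming the host address and volume names from the question:

```shell
# Pipe the LV straight into dd on the remote host over ssh.
# Volume names and IP are the hypothetical ones from the question.
dd if=/dev/mygroup-mylv bs=4M | ssh root@192.168.1.103 "dd of=/dev/newvgroup-newlv bs=4M"
```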
Boom.
Do yourself a favor, though, and use something larger than the default blocksize. Maybe add bs=4M (read/write in chunks of 4 MB). You can see there's some nitpicking about blocksizes in the comments; if this is something you find yourself doing fairly often, take a little time to try it a few different times with different blocksizes and see for yourself what gets you the best transfer rates.
Answering one of the questions from the comments:
You can pipe the transfer through pv to get statistics about the transfer. It's a lot nicer than the output you get from sending signals to dd.
I will also say that while of course using netcat -- or anything else that does not impose the overhead of encryption -- is going to be more efficient, I usually find that the additional speed comes at some loss of convenience. Unless I'm moving around really large datasets, I usually stick with ssh despite the overhead because in most cases everything is already set up to Just Work.
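As an illustration, a sketch of wiring pv into such a pipeline; the paths, host, and the 5G size hint are hypothetical, taken from the question:

```shell
# pv sits between dd and ssh and reports throughput and progress;
# -s 5G tells pv the expected total so it can show a percentage
dd if=/dev/mygroup-mylv bs=4M | pv -s 5G | ssh root@192.168.1.103 "dd of=/dev/newvgroup-newlv bs=4M"
```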
Here's an optimized version, which shows the progress using pv, uses a larger block size for bigger chunks, and also uses gzip to reduce the network traffic. That's perfect when moving the data over slow connections like between internet servers. I recommend running the command inside a screen or tmux session. That way the ssh connection to the host from which you execute the command can be disconnected without trouble.
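A sketch of such a command, assuming the volume names and host address from the question:

```shell
# Compress before the network hop and decompress on the far side;
# pv in the middle reports progress of the compressed stream.
dd if=/dev/mygroup-mylv bs=4M | gzip -1 | pv | ssh root@192.168.1.103 "gzip -d | dd of=/dev/newvgroup-newlv bs=4M"
```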
How about using an old friend to do this: netcat.
On the system that is losing the logical volume, type:
$ dd if=/dev/[directory]/[volume-name] | nc -l [any high number port]
Then on the receiving system, type:
$ nc -w 10 [ip or name] [port] | dd of=/dev/[directory]/[volume name]
Translating: the origin box dd's the volume and pipes it to nc (netcat), which listens on that port. On the receiving system, nc connects to [ip or name] on [port], waits up to 10 seconds for data before closing, and pipes what it receives to dd, which writes it out.
First, I would take a snapshot of the LV:
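A sketch of the snapshot step; the group/volume names, snapshot name, and size are hypothetical:

```shell
# create a 1G copy-on-write snapshot of the (possibly running) VM's volume
lvcreate --snapshot --name lv_snap --size 1G /dev/mygroup/mylv
```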
After that you have to create a new LV on the new host (e.g. using lvcreate) with the same size. Then you can directly copy the data to the new host. Here is my example of the copy command:
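A sketch of what that copy command might look like, assuming a hypothetical snapshot named lv_snap in group mygroup and a same-size target LV already created on the new host:

```shell
# stream the snapshot to the new host's pre-created LV over ssh
dd if=/dev/mygroup/lv_snap bs=4M | ssh root@192.168.1.103 "dd of=/dev/newvgroup/newlv bs=4M"
```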
I used the procedure to copy a proxmox pve maintained VM to another host. The logical volume contained several additional LVs that were maintained by the VM itself.
First make sure that the logical volume is not mounted. If it is and you want to make a "hot copy", create a snapshot first and use this instead:
lvcreate --snapshot --name transfer_snap --size 1G /dev/vgname/lvname
I have to transfer a lot of data (7TB) between two 1Gbit-connected servers, so I needed the fastest possible way to do so.
Should you use SSH?
Using ssh is out of the question, not because of its encryption (if you have a CPU with AES-NI support, it does not hurt so much) but because of its network buffers. Those do not scale well. There is a patched SSH version that addresses this problem, but as there are no precompiled packages, it's not very convenient.
Using Compression
When transferring raw disk images, it is always advisable to use compression. But you do not want the compression to become a bottleneck. Most unix compression tools like gzip are single-threaded, so if the compression saturates one CPU, it will be a bottleneck. For that reason, I always use pigz, a gzip variant that uses all CPU cores for compression. And this is necessary if you want to go up to and above GBit speeds.
Using Encryption
As said before, ssh is slow. If you have an AES-NI CPU, encryption itself should not be a bottleneck. So instead of tunneling through ssh, we can use openssl directly.
Speeds
To give you an idea of the speed impact of the components, here are my results. Those are transfer speeds between two production systems, reading and writing to memory. Your actual results depend on network speed, HDD speed and source CPU speed! I'm doing this to show that there at least is no huge performance drop.
Simple nc + dd: 5033164800 bytes (5.0 GB, 4.7 GiB) copied, 47.3576 s, 106 MB/s
+ pigz compression level 1 (speed gain depends on actual data): network traffic 2.52 GiB; 5033164800 bytes (5.0 GB, 4.7 GiB) copied, 38.8045 s, 130 MB/s
+ pigz compression level 5: network traffic 2.43 GiB; 5033164800 bytes (5.0 GB, 4.7 GiB) copied, 44.4623 s, 113 MB/s
+ compression level 1 + openssl encryption: network traffic 2.52 GiB; 5033164800 bytes (5.0 GB, 4.7 GiB) copied, 43.1163 s, 117 MB/s
Conclusion: using compression gives a noticeable speedup, as it reduces the data size a lot. This is even more important if you have slower network speeds. When using compression, watch your CPU usage; if it gets maxed out, you can try without it. Adding encryption has only a small impact on AES-NI systems, imho only because it steals some 30-40% CPU from the compression.
Using Screen
If you are transferring a lot of data like me, you do not want it interrupted by a network disconnect of your ssh client, so you'd better start it with screen on both sides. This is just a note; I will not write a screen tutorial here.
Let's Copy
Install some dependencies (on source and destination):
apt install pigz pv netcat-openbsd
then create a volume on the destination with the same size as the source. If unsure, use lvdisplay on the source to get the size and create the target, e.g.:
lvcreate -n lvname vgname -L 50G
next, prepare the destination for receiving the data:
nc -l -p 444 | openssl aes-256-cbc -d -salt -pass pass:asdkjn2hb | pigz -d | dd bs=16M of=/dev/vgname/lvname
and when ready, start the transfer on the Source:
pv -r -t -b -p -e /dev/vgname/lvname | pigz -1 | openssl aes-256-cbc -salt -pass pass:asdkjn2hb | nc <destip/host> 444 -q 1
Note: If you are transferring the data locally or do not care about encryption, just remove the openssl part on both sides. If you do care: asdkjn2hb is the encryption key; you should change it.
The rest of the answers do not work well and don't fulfill the question's requirements, because they do not create the logical volume on the target server; instead they create a file under /dev/mygroup/myvol on the root disk, which also causes the copied volume not to appear in LVM tools like lvdisplay.
I created a bash script that automates the whole process: https://github.com/daniol/lvm-ssh-transfer