I understand what LVM is and what it accomplishes, but I feel like I am missing some things.
Let's say we have two physical drives, sda and sdb. Both are 100 megs. I put them into VolumeGroup1 and create one 200 meg LogicalVolume1.
What would happen if I create a 150 meg file? Would 100 megs physically be on sda and 50 on sdb? If so, what tells the OS that a piece of the file is on one drive, and another piece is on the other?
What about drive failure? Assuming no RAID, if sdb fails, will all the data on sda be lost? Is there any way to control which files are on which physical drives?
How do you generally manage LVM? Do you create one or two large Volume Groups then make partitions as it makes sense? Any other tips?
Correct (assuming the filesystem was empty before the file was created).
LVM tells the operating system that there is one single 200MB disk. The LVM part of the kernel (it comes in two parts, userspace management tools and kernel drivers) will then map what the operating system sees to physical locations/blocks on the disks.
Yes, consider the data lost.
If you create smaller Logical Volumes then you can use the
pvmove
command to move them from disk to disk.

I tend to create large Volume Groups and then create Logical Volumes as needed. There is no need to fully allocate all the space in a Volume Group; allocate it when it is needed. It's easy to increase the size of a Logical Volume, and pretty much all modern filesystems can be easily grown, too.
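That workflow can be sketched as follows; the disk, VG, and LV names (sda, sdb, vg00, home) are hypothetical, and the commands need root on a real system:

```shell
# Pool two disks into one volume group (hypothetical names).
pvcreate /dev/sda /dev/sdb          # label the disks as physical volumes
vgcreate vg00 /dev/sda /dev/sdb     # collect them into one volume group

# Allocate only what is needed now; leave the rest of the VG free.
lvcreate -L 20G -n home vg00
mkfs.ext4 /dev/vg00/home

# Later, grow the LV and its filesystem in one step
# (-r asks lvextend to run the filesystem resize tool as well).
lvextend -r -L +10G /dev/vg00/home
```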
The underlying thing that lets LVM and software RAID in Linux work is the device mapper portion of the kernel. This is what abstracts the block addresses of the physical devices to the virtual block devices that you're using.
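You can inspect that mapping directly with dmsetup. The sample output below is illustrative only (device names and numbers are hypothetical), but the table format for a simple linear LV spanning two disks looks like this:

```shell
# Dump the device-mapper tables for all mapped devices.
# Each line of a linear LVM target has the form:
#   <name>: <start-sector> <length> linear <major:minor> <offset>
dmsetup table
# vg01-lv1: 0      204800 linear 8:0  2048   <- first part maps to sda
# vg01-lv1: 204800 102400 linear 8:16 2048   <- remainder maps to sdb
```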
When using LVM, as with anything involving your data, you need to be aware of the data-availability repercussions. That's not to say that LVM is dangerous; in fact, when proper practices are used, its impact on availability is minimal.
In the scenario you suggest in your question, the availability of your data would be the same as with RAID 0: if any drive fails, you lose data.
In practice I would not use LVM without running it on some sort of RAID. I have used LVM on a 30 TB file server that had about 20 hardware RAID 5 volumes in one VG. But if you have enough free extents, you can use pvmove to migrate the data off one or more PVs should they start to give you problems.
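A minimal sketch of that migration, assuming a hypothetical VG called vg00 whose other PVs have enough free extents to absorb everything currently on the failing /dev/sdb:

```shell
pvmove /dev/sdb            # migrate all allocated extents off sdb
vgreduce vg00 /dev/sdb     # remove the now-empty PV from the VG
pvremove /dev/sdb          # wipe the LVM label from the disk
```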
But always have a backup strategy in place that is tested from time to time.
How do you generally manage LVM? Do you create one or two large Volume Groups then make partitions as it makes sense?
My general strategy is to put physical volumes that might possibly be migrated (as a whole set) to another system into a separate volume group.
If you have external storage, it is a good idea to put it in a separate volume group. It is physically easy to disconnect it from this computer and connect it to another, so it should be similarly logically easy to export/import it in LVM, keeping the data intact.
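The export/import dance can be sketched like this; the VG name vg01 and mount point are hypothetical, and all commands need root:

```shell
# On the old host:
umount /mnt/external       # unmount every filesystem in the VG
vgchange -an vg01          # deactivate its logical volumes
vgexport vg01              # mark the VG as exported

# Move the disks, then on the new host:
pvscan                     # detect the exported physical volumes
vgimport vg01
vgchange -ay vg01          # activate the LVs; the data is intact
```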
If you already have a vg00 on internal disk(s), and then you buy another internal disk for your machine, ask yourself a question: will the data on the new disk be bound to vg00, and there would be no sense ever in moving the data to another system? In this case, it should be part of vg00. Otherwise, I would create vg01, as it can be easily exported/imported on its own.
If you have two drives as physical volumes in a group like that, then what you have is a JBOD (Just a Bunch Of Disks) array. If one of the drives fails, you will be no better protected than if the drives were arranged in a RAID 0 array.
You cannot directly control what goes where on the two drives if you have one logical volume in the volume group (as this will be controlled by the filesystem in the volume, not LVM). However, if you split the volume group into multiple logical volumes, you can manually order their creation such that a given logical volume is on a given drive.
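One way to pin an LV to a particular disk is to name the PV(s) that lvcreate may allocate from; the VG, LV, and device names below are hypothetical:

```shell
# Restrict each LV's extents to a single physical volume by
# listing the allowed PV after the VG name.
lvcreate -L 90G -n lv_on_sda vg00 /dev/sda
lvcreate -L 90G -n lv_on_sdb vg00 /dev/sdb
```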
I believe that each PV in a VG has a copy of the LV layout, and the data isn't striped like with RAID 0, so you do have more chance of recovering something if one of your drives fails. But if data loss is any concern at all, I would not consider using two drives this way at all (via LVM or RAID 0).
LVM (Logical Volume Manager) collects physical volumes into volume groups. Every physical volume (the drive itself) is divided into small pieces called physical extents. Each extent has a unique identifier within the disk; in fact, they are sequentially numbered. When you create a logical volume, it is built from logical extents, which are paired with physical extents. Logical extents have unique IDs within the logical volume. On HP-UX you can check which logical extent is paired with which physical extent. On SLES 11, I could not figure out how to check it.
lvdisplay --maps
should be good, but is not perfect (for me).
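On Linux there are a couple of ways to inspect the LE-to-PE pairing; the VG and LV names below (vg01, lv_data) are hypothetical:

```shell
# Per-LV view: segments of logical extents and the PVs backing them.
lvdisplay -m /dev/vg01/lv_data

# Tabular view: each LV segment with its starting logical extent
# and the backing physical-extent ranges (e.g. /dev/sda:0-24).
lvs -o lv_name,seg_start_pe,seg_pe_ranges vg01
```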