It appears that I am able to successfully do a pvcreate on top of a raw block device, without ever taking the step of creating a partition table. I am then able to create a volume group, logical volume, and finally a filesystem, mount it, and test via dd.
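Roughly this sequence (device and volume names here are placeholders, not my actual setup):

```
pvcreate /dev/sdb                                    # PV directly on the raw device, no partition table
vgcreate vg_test /dev/sdb
lvcreate -n lv_test -l 100%FREE vg_test
mkfs.ext4 /dev/vg_test/lv_test
mount /dev/vg_test/lv_test /mnt
dd if=/dev/zero of=/mnt/testfile bs=1M count=1024    # quick write test
```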
It appears to work, but I need a sanity check. Is this a bad idea?
How do I create a GPT or MBR partition table on top of a raw block device?
How do I use parted to show what sort of partition table is in use? I have tried doing:
parted, then select /dev/sdb, then print, and I get:
Error: /dev/sdb: unrecognised disk label
Yet the drive is currently in use and I can read and write to it. Is that the expected output when doing LVM on top of a raw block device without a partition table? Any thoughts?
Thanks!
Even if LVM itself doesn't care about having a real partition, one reason to create it anyway is to inform partitioning programs that there's "something there." A nightmare scenario is a new sysadmin diagnosing a boot problem on a server, firing up a partitioning program, seeing unpartitioned disks, and concluding that the drive is corrupt.
I see no downside to creating an LVM partition. Do you?
While you can create a PV out of a raw block device, I normally try to avoid it, as it can cause confusion about what the block device is being used for. It may also break some of the auto-discovery routines that LVM can fall back on if its configuration files are missing.
Here's an example of using parted to create a GPT with one partition spanning the whole drive, with the partition flag set to lvm. mkpart requires that you specify a file system, but it doesn't actually create one; that seems to be a long-standing bug in parted. The 1M start offset ensures that you get proper alignment.
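Something along these lines (assuming the disk is /dev/sdb):

```
parted /dev/sdb
(parted) mklabel gpt
(parted) mkpart primary ext2 1M 100%
(parted) set 1 lvm on
(parted) quit
```

After that, pvcreate /dev/sdb1 creates the PV on the new partition.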
Even though in the past I used an MS-DOS or GPT disklabel for PVs, I now prefer to put LVM directly on the main block device. There is no reason to use two disklabels, unless you have a very specific use case (like a disk with a boot sector and boot partition).
The advantage of having LVM directly on the device is that you avoid an extra layer of labels and partitions to manage.
If you create a PV directly on a virtual storage device inside a KVM guest, then you will notice that the logical volumes from the guest are visible on the hypervisor. This can make things quite confusing if you use the same logical volume and volume group names across multiple guests. You may also get some warnings on the hypervisor saying that it can't find a device.
For example, I have recreated this problem on my test hypervisor:
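Assuming two guests that each created a volume group named vg_data on a whole-disk PV, vgs on the hypervisor would show something like this (illustrative names and sizes):

```
# vgs
  VG      #PV #LV #SN Attr   VSize  VFree
  vg_data   1   1   0 wz--n- 10.00g    0
  vg_data   1   1   0 wz--n- 10.00g    0
```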
Here you can see two volume groups with the same name, both from guests; they shouldn't really appear on the hypervisor at all.
For this reason, I would advise that you use parted or fdisk to create a partition on the device first (as shown in the previous answer by 3dinfluence), before creating a PV and adding it to a volume group. That way, the guest logical volumes remain hidden from the hypervisor.
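Inside the guest, that could look like this (assuming the virtual disk appears as /dev/vda; the VG name is a placeholder):

```
parted -s /dev/vda mklabel gpt
parted -s /dev/vda mkpart primary 1MiB 100%
parted -s /dev/vda set 1 lvm on
pvcreate /dev/vda1
vgcreate vg_guest /dev/vda1
```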
According to the Red Hat LVM guide, section 4.2.1 (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/physvol_admin):
There is no need for a partition table; Red Hat even suggests destroying an existing one if you use the whole disk for a volume group (VG), unless you intend to include only part of the disk (a partition).
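A sketch of that whole-disk approach (destructive; /dev/sdb and the VG name are placeholders):

```
wipefs -a /dev/sdb          # remove any existing partition table and filesystem signatures
pvcreate /dev/sdb           # PV on the whole disk
vgcreate vg_data /dev/sdb
```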
One downside is that it is not possible to hot-add space to a PV that sits inside a partition table. This is not an issue if you use the entire block device for the PV.
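With a whole-disk PV, growing the underlying (virtual) disk can be picked up in place, roughly like this (assuming a SCSI disk at /dev/sdb):

```
echo 1 > /sys/block/sdb/device/rescan   # make the kernel notice the new size
pvresize /dev/sdb                       # grow the PV to fill the device
```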
Does LVM need a partition table?
Is this a bad idea?
Some explanations, to complement the good answers above, which correspond to different use cases:
Nowadays, Linux and LVM are used so widely that many different use cases share these same technologies, but they have very different operational constraints and requirements.
The most common distinction is: is your LVM disk A) a storage system used by hypervisors to store many virtual machine image files?
Or is it B) for a single, probably virtual, machine?
In the first case A), you would want to split your LVM disk into smaller portions, both to make volume space management more granular and for failure-recovery risk management. You may also want to use more LVM features, including RAID redundancy, thin pools, etc. This is best done by splitting your huge disk into partitions of smaller sizes, each of them becoming an LVM PV in your volume group (see the sketch after this paragraph). This way, in case of an incident, you may reduce the probability of the incident (with redundancy), as well as its impact and/or resolution time if you need to resort to physical disk recovery of smaller partitions. How many partitions exactly will depend on your risk tolerance, your service commitments, your availability expectations and design, and so on. A rule of thumb is that reducing the impact and recovery time by a factor of 10 is always welcome, and 10 is still a manageable number of partitions.
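For instance, splitting a large disk into four PVs could look like this (device /dev/sdc and the VG name are placeholders):

```
parted -s /dev/sdc mklabel gpt
parted -s /dev/sdc mkpart pv1 1MiB 25%
parted -s /dev/sdc mkpart pv2 25% 50%
parted -s /dev/sdc mkpart pv3 50% 75%
parted -s /dev/sdc mkpart pv4 75% 100%
pvcreate /dev/sdc1 /dev/sdc2 /dev/sdc3 /dev/sdc4
vgcreate vg_images /dev/sdc1 /dev/sdc2 /dev/sdc3 /dev/sdc4
```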
Another reason is LVM metadata stored on each PV. LVM stores its metadata in several locations on each PV, according to the Red Hat LVM docs. If you use a raw device instead of a partition, creating a partition table later, for example during a data recovery attempt, would overwrite part of your LVM metadata and render your data unusable, or much more difficult to recover. This data loss would not happen on a disk with a pre-existing partition table.
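To see where metadata lives and keep a recoverable copy of it, something like this can help (the VG name vg_data is a placeholder):

```
pvs -o +pv_mda_count,pv_mda_free   # how many metadata areas each PV carries, and free space in them
vgcfgbackup vg_data                # text backup of the VG metadata, stored under /etc/lvm/backup
```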
In case B), a single system might benefit from the simplicity of a PV without a partition table, as recovery might be covered by the hypervisor layer's backup system, if it is a virtual server. But you need to make sure of that before deciding to go for this simplicity.