I've configured a new MySQL server on Amazon EC2 and decided to store my data on an EBS RAID0 array. So far so good, and I've tested taking snapshots of those devices with ec2-consistent-snapshot, which works great.
Now, how do you rebuild the array on a new instance, from these snapshots, quickly?
When you use ec2-consistent-snapshot to create snapshots of multiple volumes, you have no way to tell which volume was used for which device in the RAID. I may be completely wrong, but since you're striping data across the volumes, it would stand to reason that each NEW volume has to go in the same position in the RAID as the volume its snapshot was created from.
An example:
- 3 x 200 GB volumes in a RAID0 configuration.
- vol-1 is /dev/sdh device 0 in the RAID
- vol-2 is /dev/sdh1 device 1 in the RAID
- vol-3 is /dev/sdh2 device 2 in the RAID
You create the snapshots with: ec2-consistent-snapshot <options> vol-1 vol-2 vol-3
You now have 3 snapshots, and the only way to trace which device each one came from is to look at its source volume ID, check which device that volume is attached to on the instance, and then inspect the RAID configuration on the source volume's instance.
This is obviously incredibly manual, and not fast, which makes it hard to bring up a new MySQL instance quickly if the current one fails. Not to mention, you'd have to record the device positions in the RAID at snapshot time, because if the source volume's instance crashes, you have no way to get to the RAID configuration.
So, in conclusion:
- Am I missing something with how ec2-consistent-snapshot and a software RAID0 array work?
- If not, are there any known solutions / best practices around the problem of not knowing to which device/position in the RAID array a snapshot belongs?
I hope this was clear, and thanks for your help!
I tested your premise, and logical as it may seem, my observations say otherwise.
Let me detail this:
I have the exact same requirement as you do. However, the RAID0 that I am using has only 2 volumes.
I'm using Ubuntu 10 and have 2 EBS devices forming a RAID0 device formatted with XFS.
The RAID0 device was created using the following command:
sudo mdadm --create /dev/md0 --level 0 --metadata=1.1 --raid-devices 2 /dev/sdg /dev/sdh
I've installed MySQL and some other software that is configured to use /dev/md0 to store its data files.
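For completeness, the format-and-mount steps implied here might look like the following sketch (the /data mount point is my assumption, not something stated above):

```shell
# Format the freshly created array with XFS and mount it.
# /data is a hypothetical mount point; the software is then pointed at it.
sudo mkfs.xfs /dev/md0
sudo mkdir -p /data
sudo mount -t xfs /dev/md0 /data
```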
Using the same volumes: once done, I unmount everything, stop the RAID, and reassemble it like so:
sudo mdadm --assemble /dev/md0 /dev/sdh /dev/sdg
The thing is that irrespective of the order of /dev/sdg and /dev/sdh, the RAID reconstitutes itself correctly.
Using snapshots: after that, I use ec2-consistent-snapshot to create snapshots of the 2 EBS disks together. I then create volumes from those snapshots, attach them to a new instance (one already configured for the software), reassemble the RAID (I've tried interchanging the order of the EBS volumes too), mount it, and I'm ready to go.
Sounds strange, but it works.
I run a similar configuration (RAID0 over 4 EBS volumes), and consequently had the same concern about reconstituting the RAID array from snapshots created with ec2-consistent-snapshot.
Fortunately, each device in a RAID array contains metadata (in a superblock) that records its position in the array, the UUID of the array, and the RAID level (e.g. RAID0). To query this superblock on any device, run the following command (the line matching '^this' describes the queried device):
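A sketch of that query, assuming the member device is /dev/sdg. The values shown are illustrative, and the 'this' line is printed for 0.90-format superblocks; 1.x metadata reports a 'Device Role' line instead:

```shell
sudo mdadm --examine /dev/sdg
# Abridged, illustrative output -- the 'this' line describes /dev/sdg itself:
#
#   /dev/sdg:
#             Magic : a92b4efc
#              UUID : 8f9a2b1c:...               (UUID of the array)
#        Raid Level : raid0
#      Raid Devices : 2
#
#         Number   Major   Minor   RaidDevice State
#   this     0       8       96        0      active sync   /dev/sdg
#      0     0       8       96        0      active sync   /dev/sdg
#      1     1       8      112        1      active sync   /dev/sdh
```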
If you do the same query on a device which is not part of an array, you obtain:
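The failure for a device with no superblock looks like this (the device name here is just an example):

```shell
sudo mdadm --examine /dev/sdj
# mdadm: No md superblock detected on /dev/sdj.
```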
This proves that the command really relies on information stored on the device itself, not on some configuration file.
One can also examine the devices of a RAID array starting from the RAID device, retrieving similar information:
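A sketch of that direction, starting from the assembled array (output values are illustrative):

```shell
sudo mdadm --detail /dev/md0
# Abridged, illustrative output -- RaidDevice gives each member's position:
#
#        Raid Level : raid0
#      Raid Devices : 2
#              UUID : 8f9a2b1c:...
#
#     Number   Major   Minor   RaidDevice State
#        0       8       96        0      active sync   /dev/sdg
#        1       8      112        1      active sync   /dev/sdh
```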
I use the latter along with ec2-describe-volumes to build the list of volumes for ec2-consistent-snapshot (-n and --debug allow you to test the command without creating snapshots). The following command assumes that the directory /mysql is the mount point for the volume and that the AWS region is us-west-1:
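One way such a command could be put together is sketched below. The awk field positions and the metadata lookup are my assumptions; note also that on newer kernels the attached device may appear locally as /dev/xvdX while the EC2 attachment records /dev/sdX, which would require a mapping step:

```shell
#!/bin/sh
# Sketch: build the volume-id list for ec2-consistent-snapshot from the
# RAID device backing /mysql. Assumes the EC2 API tools are configured.
MD=$(awk '$2 == "/mysql" {print $1}' /proc/mounts)            # e.g. /dev/md0
MEMBERS=$(sudo mdadm --detail "$MD" | awk '$NF ~ /^\/dev\// {print $NF}')
INSTANCE=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
VOLS=""
for dev in $MEMBERS; do
  # Match each member device to its EBS volume id via its attachment record.
  VOLS="$VOLS $(ec2-describe-volumes --region us-west-1 \
      -F "attachment.instance-id=$INSTANCE" \
      -F "attachment.device=$dev" | awk '/^VOLUME/ {print $2}')"
done
# -n and --debug dry-run the snapshot, as noted above.
ec2-consistent-snapshot --debug -n --region us-west-1 \
    --freeze-filesystem /mysql $VOLS
```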
I know this doesn't answer your question, but I'm doing something similar with Amazon's basic ec2-create-snapshot tool and a cron script. It's not as fast as ec2-consistent-snapshot, but I get the extra control I need: I can fsync, lock writes, and, most importantly, name the snapshots so they can be reconstituted in the correct order.
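A minimal cron-style sketch of that approach (the volume ids, mount point, and description format are placeholders, and the MySQL write lock is left out for brevity):

```shell
#!/bin/sh
# Hypothetical sketch: freeze the filesystem so the striped volumes are
# snapshotted in a consistent state, and encode each volume's position in
# the array in the snapshot description so restore order is unambiguous.
VOLUMES="vol-aaaa1111 vol-bbbb2222"         # RAID members, in array order
xfs_freeze -f /mysql                        # block writes (XFS)
i=0
for vol in $VOLUMES; do
  ec2-create-snapshot -d "mysql-raid0 device=$i $(date +%Y-%m-%d)" "$vol"
  i=$((i+1))
done
xfs_freeze -u /mysql                        # thaw
```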