I am taking periodic snapshots of a 1 TB EBS (Amazon Elastic Block Store) volume as backups. If the whole AZ (Availability Zone) becomes unavailable, my Disaster Recovery plan is to create a new EBS volume from the latest snapshot in another AZ in the same region.
How can I figure out how long it will take to create the new EBS volume? I have an RTO (Recovery Time Objective) of 6 hours. Can I meet it with this approach?
It probably shouldn't/doesn't make any difference, but I am in the ap-southeast-2 region (i.e. Sydney).
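For concreteness, the restore step I have in mind is roughly this (the snapshot ID and target AZ below are just placeholders):

    # Create a new volume in a surviving AZ of ap-southeast-2 from the latest snapshot.
    # snap-0123456789abcdef0 and the AZ are placeholders.
    aws ec2 create-volume \
        --snapshot-id snap-0123456789abcdef0 \
        --availability-zone ap-southeast-2b \
        --volume-type gp2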
Create one.
And then, try using it. Continue using it over a period of hours and days, and note what you observe.
The first answer to your question is that it actually only takes a few seconds.
The problem with that answer is that it doesn't tell the whole story: the new volume is available almost immediately, but you have to understand what "immediately" means here. Immediately does not mean the volume is as fast, initially, as it will eventually be. Remember: the difference between microseconds and milliseconds seems intuitively small, but it is still a factor of 1,000.
This is my point, above -- creating the volume only requires a matter of seconds, at which point it is usable, but slow.
EBS volumes are logical entities. When a volume is restored from a snapshot, every block on the volume is logically present and logically available as soon as the new volume becomes available, but not necessarily physically present on the volume the first time you try to read it.
The lag in loading the blocks is, overall, a small price to pay for the immediate availability of any specific block anywhere on the volume, but the impact can be significant, with the significance depending in part on how the volume is used.
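One way to see this for yourself (a sketch only; /dev/xvdf is a placeholder for wherever the restored volume is attached) is to time a direct read of the same block twice: the first read has to fetch the block from S3, while the repeat is served from the EBS volume itself.

    # Read 1 MiB from ~500 GB into the device, bypassing the page cache.
    # The first read is slow (block pulled from S3); the repeat is much faster.
    sudo dd if=/dev/xvdf of=/dev/null bs=1M count=1 skip=500000 iflag=direct
    sudo dd if=/dev/xvdf of=/dev/null bs=1M count=1 skip=500000 iflag=direct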
The link, above, goes on to explain how you can speed up the warm-up process with dd or fio.
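Something along these lines, as a sketch (/dev/xvdf is a placeholder for the restored volume's device):

    # Read every block once so it is fetched from S3 ahead of demand.
    # /dev/xvdf is a placeholder for the restored volume's device name.
    sudo dd if=/dev/xvdf of=/dev/null bs=1M status=progress

    # Or with fio, which issues many reads in parallel and finishes sooner:
    sudo fio --filename=/dev/xvdf --rw=read --bs=128k --iodepth=32 \
        --ioengine=libaio --direct=1 --name=volume-initialize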
What the documentation omits is the fact that you can run either of these in a read-only mode with the volume mounted, and get the benefit of immediate availability while prepping the volume for action. This will have a further negative impact on initial random accesses, but the pain will end sooner than if you do nothing at all, and for that reason it is probably going to be your best choice... but you must put your DR scenario through its paces, observe its operation, and adjust your strategy accordingly.

Michael has a great answer to your question, as always. You can also pre-warm your volume, which takes a bit of time but brings all the blocks in quicker, so you take the performance hit up front. Spinning up an instance in another AZ could probably be scripted with some combination of events, Lambda, and CloudFormation or OpsWorks, though it would take some experimentation. It's not the way things are usually done in AWS, though.
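As a rough, untested outline of what the restore portion of such a script might do (all IDs, the device name, and the target AZ are placeholders):

    # Find the most recent completed snapshot of the original volume.
    SNAP_ID=$(aws ec2 describe-snapshots \
        --filters Name=volume-id,Values=vol-0123456789abcdef0 Name=status,Values=completed \
        --query 'sort_by(Snapshots,&StartTime)[-1].SnapshotId' --output text)

    # Restore it into a surviving AZ and wait for the volume to become available.
    VOL_ID=$(aws ec2 create-volume --snapshot-id "$SNAP_ID" \
        --availability-zone ap-southeast-2b --volume-type gp2 \
        --query VolumeId --output text)
    aws ec2 wait volume-available --volume-ids "$VOL_ID"

    # Attach it to the replacement instance (instance ID and device are placeholders).
    aws ec2 attach-volume --volume-id "$VOL_ID" \
        --instance-id i-0123456789abcdef0 --device /dev/sdf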
Another, potentially better, option depending on your use case and budget is to use an Elastic Load Balancer with Auto Scaling and multiple smaller instances, spreading your traffic across two or more AZs. This means that if an AZ fails, your other instances will keep serving traffic, and the ELB/Auto Scaling will create more instances in the working AZs automatically. Once the first AZ comes back up, the load will eventually be balanced across all AZs again.
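As a sketch of that setup using the CLI (the group, launch configuration, and load balancer names are placeholders):

    # Auto Scaling group spread across two AZs, registered with a classic ELB.
    aws autoscaling create-auto-scaling-group \
        --auto-scaling-group-name web-asg \
        --launch-configuration-name web-lc \
        --min-size 2 --max-size 4 \
        --availability-zones ap-southeast-2a ap-southeast-2b \
        --load-balancer-names web-elb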
If your application works just as well on two smaller instances as on one large instance, this will cost you a little more (for the ELB) but gives you an RTO of zero. If price is more important than availability, then you probably want to follow your original plan with its original RTO.
Note that snapshots are stored per region and are accessible from any AZ in that region. If a whole region goes out, though, you can't access its snapshots from another region.
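If a regional outage is something you want to cover, snapshots can be copied to a second region ahead of time; as a sketch (IDs and regions are placeholders):

    # Copy a snapshot out of ap-southeast-2 so it survives a regional outage.
    aws ec2 copy-snapshot \
        --source-region ap-southeast-2 \
        --source-snapshot-id snap-0123456789abcdef0 \
        --region us-west-2 \
        --description "DR copy of Sydney snapshot"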
How long creating an EBS snapshot takes mainly depends on the volume size, as well as the reads/writes happening on the disk and, to a minor extent, network latency. Personally, I wouldn't suggest taking snapshots of EBS root volumes as backups because of OS issues; if the volume is a data disk, then yes, you can use snapshots as a backup.

I believe your 6-hour RTO should be enough time to recover the volume this way.