Tuesday, July 14, 2009

Boot EC2 Instances from EBS

EBS offers the ability to have a "virtual disk drive" whose lifetime is independent of any EC2 instance. You can attach EBS drives to any running instance, and any changes made to the drive will persist even after the instance is terminated. You can also set up an EBS drive to be the root filesystem for an instance - giving you the benefits of an always-on instance but without paying for it when it's not in use - and this article shows you how to do that. I also explore how to estimate the cost savings you can achieve using this solution.

I'm not the first person to think of this idea: credit goes to AWS forum users rickdane for starting the discussion and N. Martin and Troy Volin for posts in this AWS forum thread that describe most of the heavy-lifting - this solution is largely based on their work.

Note: this article assumes you are comfortable in the linux shell and familiar with the EC2 AMI tools and EC2 API tools.

Why Boot EC2 Instances from EBS?

My company runs applications in EC2, and we test new application features in EC2 (after testing them locally) before we deploy them to our production environment. At first, each time we had a new feature to test out, we would construct a testing environment as follows:
  1. Launch a new instance of our base AMI (based on an Alestic Ubuntu Hardy Server AMI, which I highly recommend, and customized with our own basic application server stack).
  2. Make the necessary environmental changes (e.g. adding monitoring services) to the test instance.
  3. Deploy the application (the one with new features we need to test) to the test instance.
  4. Deploy the test database from S3 to the test instance.
  5. Update the test database with any schema or data changes necessary.
  6. Test and debug the application.
  7. Save the updated test database to S3 for later use.
  8. Terminate the test instance.
This process worked great at first (even if it was a manual process). Our application is a WAR and a bunch of jar files, so they were easily uploaded to the test instance. We stored our test database in S3 as a gzipped MySQL mysqldump file, and it was easy to import into / export from MySQL. That is, until we got to the point where our test database was big enough for it to take a long time - over an hour - to reconstitute. At that point it became very annoying to bring up the test environment.

The last thing you want, as a developer, is a test environment that is difficult or annoying to set up. You want testing to be easy, quick, and inexpensive (otherwise you will start looking for shortcuts to avoid testing, which is not good for quality). Once our test environment began to require almost an hour just to set up, we realized it was no longer serving our needs. I began researching a setup that:
  • Is ready-to-go within a short time (less than five minutes)
  • Contains the most-recent environment (tools, application, and database) already
  • Only costs money when it is being used
Basically, we wanted the benefits of a server that is always-on but without paying for it when we weren't using it.

That's the "why". Read on for the "what" and the "how".

Ingredients and Tools

The solution consists of two pieces:
  • An AMI that can boot from an EBS drive (the "boot AMI")
  • An EBS drive that contains the bootable linux stack and, later, anything else you add (the "bootable EBS volume")
The main tool used to put the pieces together is the linux utility pivot_root. This program can only be run by the /sbin/init process (the startup process itself, pid 1), and it "swaps" the root filesystem out from under the OS and replaces it with a directory you specify. This will be the root directory of the EBS drive.

EBS volumes are attached to specific devices (/dev/sdb through /dev/sdp), and you'll need to choose what attach device you want the boot AMI to use. We could, theoretically, build the boot AMI to read the attach device name from the user-data provided at launch time. But this would require bringing up networking in order to download the user-data to the instance, which is complicated. Instead of doing this we will hardcode the attach device into the boot AMI. You will need to remember what device you chose in order to know where to attach the EBS drive when you launch an instance of the boot AMI. A good way to remember the chosen attach device is to name the boot AMI accordingly, for example boot-from-EBS-to-dev-sdp-32bit-20090624.

As part of choosing the attach device you'll need to know the mknod minor number associated with the device. mknod minor numbers begin at 0 (for /dev/sda) and progress in increments of 16 (so /dev/sdb is number 16, /dev/sdc is number 32, etc.) until /dev/sdp (which is number 240). mknod minor numbers are detailed in the Documentation/devices.txt file in the linux kernel source code. In the procedure below I use /dev/sdp ("p" for "pivot_root"), with mknod minor number 240.
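The minor-number arithmetic above can be sketched as a small shell function. This is only an illustration: the function name sd_minor is hypothetical, and the mknod line is printed rather than executed. The device node name /new-root-dev and major number 8 are taken from the console output quoted in the comments further down.

```shell
#!/bin/sh
# Sketch: compute the mknod minor number for a whole-disk sd device.
# Minor numbers start at 0 for /dev/sda and advance by 16 per letter.
# sd_minor is a hypothetical helper name, not part of any tool.
sd_minor() {
  letter="${1#/dev/sd}"                 # "/dev/sdp" -> "p"
  code=$(printf '%d' "'$letter")        # character code, e.g. 112 for "p"
  echo $(( (code - 97) * 16 ))          # 97 is the character code of "a"
}

# Print (do not run) the mknod command for the chosen attach device.
# Major number 8 is the block-device major for sd devices.
echo "mknod /new-root-dev b 8 $(sd_minor /dev/sdp)"
```

For /dev/sdp this prints a mknod command with minor number 240, matching the value used in the procedure below.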

The EC2 AMI tools are useful in preparing the bootable EBS drive: they can create an image file from an existing instance. These tools will be used to create a temporary image containing the bootable linux stack copied from the running instance, and this temporary image will then be copied to the EBS drive to make it bootable.

How to Set it Up

Here is an outline of the setup process:
  1. Set up a bootable EBS volume.
  2. Set up the boot AMI.
The detailed setup instructions follow.

Setting up a bootable EBS volume:
  1. Launch an instance of an AMI you like. I use the Alestic Ubuntu Hardy Server AMI.
  2. Create an EBS volume in the same availability zone as the instance you launched. This volume should be large enough to fit everything you plan to put on the root partition (later, not now). It can even exceed 10GB: unlike the "real" root filesystem on EC2 instances which is limited to 10GB, the bootable EBS volume is not limited to this size.
  3. Once the EBS volume is created, attach it to the instance on /dev/sdp.
  4. SSH into the instance as root and format the EBS volume:
    mkfs.ext3 /dev/sdp
  5. Mount the EBS volume to the instance's filesystem:
    mkdir /ebs && mount /dev/sdp /ebs
  6. Make an image bundle containing the root filesystem:
    mkdir /mnt/prImage
    ec2-bundle-vol -c cert -k key -u user -e /ebs \
    -r i386 -d /mnt/prImage
    These arguments are the same as you would use to bundle an AMI - please see the Developer Guide: Bundling a Unix or Linux AMI for details. The certificate and private key credentials should be copied to the instance in the /mnt partition so they don't get bundled into the image and copied to the bootable EBS volume. If you're creating a 64-bit image you'll need to substitute -r x86_64 instead.
  7. Copy the contents of the image to the EBS drive:
    mkdir /mnt/img-mnt
    mount -o loop /mnt/prImage/image /mnt/img-mnt
    rsync -a /mnt/img-mnt/ /ebs/
    umount /mnt/img-mnt
  8. Prepare the EBS volume for being the target of pivot_root. First, edit /ebs/etc/fstab, changing the entry for / (the root filesystem) from /dev/sda1 to /dev/sdp. Next,
    mkdir /ebs/old-root
    rm /ebs/root/.ssh/authorized_keys
    umount /ebs
    rmdir /ebs
  9. Detach the EBS volume from your instance.
Your bootable EBS volume is now ready. You might want to take a snapshot of it. Don't terminate the EC2 instance yet, because it will be used to create the boot AMI.
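The fstab edit in Step 8 can also be scripted. The following is a minimal sketch, assuming the root entry begins with the old device name at the start of the line; the function name retarget_root is hypothetical. The demo at the end runs against a throwaway file so it is safe to try anywhere.

```shell
#!/bin/sh
# Sketch: rewrite the device of an fstab entry in place.
# retarget_root is a hypothetical helper name.
# Usage on the real volume would be:
#   retarget_root /ebs/etc/fstab /dev/sda1 /dev/sdp
retarget_root() {
  fstab="$1"; old_dev="$2"; new_dev="$3"
  # Replace the old device only where it is the first field of a line.
  sed "s|^${old_dev}\([[:space:]]\)|${new_dev}\1|" "$fstab" > "$fstab.new" &&
    cat "$fstab.new" > "$fstab" &&      # keep the original file's permissions
    rm -f "$fstab.new"
}

# Demo on a temporary file (the contents here are made up for illustration).
demo=$(mktemp)
printf '%s\n' '/dev/sda1 / ext3 defaults 0 0' \
              '/dev/sda2 /mnt ext3 defaults 0 0' > "$demo"
retarget_root "$demo" /dev/sda1 /dev/sdp
cat "$demo"
rm -f "$demo"
```

Only the line that starts with the old device is changed; other entries, such as the one for /mnt, are left alone.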

Setting up the boot AMI:
  1. Create the mount point where the bootable EBS will be mounted:
    mkdir /new-root
  2. Replace the original /sbin/init with a new one. First, rename the original:
    mv /sbin/init /sbin/init.old
    Then, copy this file into /sbin/init, as follows:
    curl -o /sbin/init -L https://sites.google.com/\
    As mentioned above, if you choose an attach device different than /dev/sdp you should edit the new /sbin/init file to assign DEVNO the corresponding mknod minor number.
    Finally, make it executable:
    chmod 755 /sbin/init
  3. Clean up the current SSH public key that was used to login to the instance:
    rm /root/.ssh/authorized_keys
    Not so fast! Once you perform this step you will no longer be able to SSH into the instance. Instead, you can move the SSH authorized_keys file out of the way, as follows:
    mv /root/.ssh/authorized_keys /mnt
  4. Bundle, upload, and register the AMI. Give the bundle a name that indicates the processor architecture and the device on which it expects the bootable EBS volume to be attached. For example, boot-from-EBS-to-dev-sdp-32bit-20090624.
  5. If you opted to move the SSH authorized_keys out of the way in Step 3, restore SSH access to the instance:
    mv /mnt/authorized_keys /root/.ssh/authorized_keys
The boot AMI is now ready to be launched. If you are feeling brave you can terminate the instance. Otherwise, you can leave it running and use it to troubleshoot problems with the launch process - but hopefully this won't be necessary.

How to Launch an Instance

Once the bootable EBS volume and boot AMI are set up, launch instances that boot from the EBS volume as follows:
  1. Launch an instance of the boot AMI.
  2. Attach the bootable EBS volume to the chosen device (/dev/sdp in the above instructions).
The boot AMI will wait until a volume is attached to the chosen device, pivot_root to the EBS volume, and then continue its boot sequence from the EBS volume.
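For readers who launch from the command line, the two steps above map onto two EC2 API tools commands. The sketch below only prints the commands so you can review them first. The function name launch_commands, the keypair name my-keypair, the instance type, and all IDs are placeholders I made up for illustration; the real instance ID is only known after the first command has run.

```shell
#!/bin/sh
# Sketch: print the EC2 API tools commands for launching the boot AMI and
# attaching the bootable EBS volume. Nothing is executed against EC2.
# launch_commands and my-keypair are hypothetical names.
launch_commands() {
  ami_id="$1"; volume_id="$2"; instance_id="$3"; zone="$4"
  device="${5:-/dev/sdp}"               # must match the device baked into the boot AMI
  instance_type="${6:-m1.small}"        # assumed default; pick one matching the AMI architecture
  # The instance must be launched in the same availability zone as the volume.
  echo "ec2-run-instances $ami_id -k my-keypair -t $instance_type -z $zone"
  echo "ec2-attach-volume $volume_id -i $instance_id -d $device"
}

launch_commands ami-00000000 vol-00000000 i-00000000 us-east-1a
```

Run the first printed command, note the instance ID it reports, then run the second with that ID.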

Some troubleshooting tips:
  • To troubleshoot problems launching the instance you should look at the console output from the instance. The console output might not get updated consistently. If the console output is still empty a few minutes after launching the instance and attaching the EBS volume, try rebooting the instance, which will force the console to refresh.
  • If you need to attach the bootable EBS volume to another instance (not the boot AMI), attach it to /dev/sdp and mount it as follows:
    mkdir /ebs && mount /dev/sdp /ebs
  • The boot AMI includes a hook that allows you to add a program to be executed before it performs a pivot_root to the EBS drive. This avoids the need to re-bundle the boot AMI if you need to change the startup process. The hook looks for a file called /pre-pivot.sh on the EBS drive and executes the file if it can.
  • Booting from an EBS drive seems to take slightly longer than booting a regular EC2 instance. Don't be surprised if it takes up to five minutes until you can get an SSH connection to the instance.
  • You don't need to re-attach the EBS volume when you reboot the instance - it stays attached.
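To make the boot sequence concrete, here is a rough sketch of what a replacement /sbin/init along these lines might do. It is not the actual script used in this article: it is my own simplified illustration, it omits details such as loading kernel modules and limiting retries, and the names run and boot_from_ebs are hypothetical. By default it runs in dry-run mode and only prints the commands.

```shell
#!/bin/sh
# Sketch of a pivot_root-style init. NOT the script from this article.
# DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"
DEVNO=240                       # mknod minor number for /dev/sdp
NEW_ROOT_DEV=/new-root-dev      # device node created for the EBS volume
NEW_ROOT=/new-root              # mount point created when building the boot AMI

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

boot_from_ebs() {
  run mknod "$NEW_ROOT_DEV" b 8 "$DEVNO"
  # Keep retrying until the EBS volume has been attached and can be mounted.
  attempt=1
  until run mount "$NEW_ROOT_DEV" "$NEW_ROOT"; do
    echo "$attempt: volume not attached yet, retrying"
    attempt=$((attempt + 1))
    sleep 1
  done
  # Optional hook stored on the EBS volume itself.
  if [ -x "$NEW_ROOT/pre-pivot.sh" ]; then
    run "$NEW_ROOT/pre-pivot.sh"
  fi
  # Swap root filesystems; the old root ends up under /old-root.
  run cd "$NEW_ROOT"
  run pivot_root . old-root
  # Hand control to the real init on the EBS volume.
  run exec chroot . /sbin/init
}

boot_from_ebs
```

The old-root directory is the one created on the EBS volume during setup, and the hook check corresponds to the /pre-pivot.sh tip above.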

Usage Tips

  • Create a file in the root of each bootable EBS volume labeling the purpose of the volume, such as /bootable-32bit-appserver. This helps when you mount the EBS drive to an instance as a "regular" volume: the label indicates what the purpose of the volume is, and helps you distinguish it from other attached EBS drives. This is good practice even for non-bootable EBS volumes. See below for a tip on how to track the purpose of EBS volumes without looking at their contents.
  • Once you get the boot AMI running properly with the bootable EBS volume, shut it down and take a snapshot of the volume before you make any other changes to it. This snapshot is your "master bootable volume" snapshot, containing only the minimum setup necessary to boot.
  • After creating a "master bootable volume" snapshot you can (and should!) launch the boot AMI again (remembering to attach the bootable EBS volume) and customize the instance any way you want: add your application stack, your database, your tools, anything else. These changes will persist on the bootable EBS volume even after the instance has terminated. This is the main motivation for the bootable EBS volume solution!
  • You can create multiple bootable EBS volumes from the "master bootable volume" snapshot and customize them each. See below for a tip on how to keep track of the purpose of each EBS volume.
  • The setup instructions above can be used for creating either a 32-bit or a 64-bit boot AMI and matching bootable EBS volume. Because you can't run an AMI for one architecture on an EC2 instance of the other architecture, you'll need to create two separate boot AMIs and two separate bootable EBS volumes if you plan to run on both 32-bit and 64-bit EC2 instances. And, because the bootable EBS volumes created by this procedure will contain a 32-bit or 64-bit specific linux stack, be sure to attach the corresponding bootable EBS volume for the boot AMI you launch.
  • If you work with multiple EBS volumes you will want to identify the purpose of each volume without attaching it to a running instance and looking at the label file you created. Unfortunately the EC2 API does not currently offer a way to tag EBS volumes. But the ElasticFox Firefox extension does - and I highly recommend it for this purpose. Note that the volume tags will only be visible in the browser that creates them, not on other machines. [See my article on Copying ElasticFox tags between browsers for a workaround.]

Cost Implications of Booting Instances from EBS

EBS costs money beyond what you pay for the disk drives that come as part of EC2 instances. There are four components of the EBS cost to consider:
  1. Allocated storage size ($0.10 per GB per month)
  2. I/O requests ($0.10 per million I/O requests)
  3. Snapshot storage (same as S3 storage costs; depends on the region)
  4. Snapshot transfer (same as S3 transfer costs; depends on the region)
The AWS Simple Monthly Calculator can help you estimate these costs. Here are some guidelines to help you figure out what numbers to put into these fields:
  • Component #1 is easy: just plug in the size of your bootable EBS volume.
  • Component #2 should be estimated based on the I/O usage of your existing application instance. You can use iostat to estimate this number as follows:
    iostat | awk -F" " 'BEGIN {x="0.0"} \
    /^sd/ {x=x+$2} END \
    {print x}'

    The result is the number of I/O transactions per second (the sum of the tps column, which is the second field of each sd device line). Multiply this figure by 2592000 (60 * 60 * 24 * 30, the number of seconds in a month) to get the number of I/O transactions per month, then divide by 1 million. Or, better yet, use this instead:
    iostat | awk -F" " 'BEGIN {x="0.0"} \
    /^sd/ {x=x+$2} END \
    {printf "%12.2f\n", x*2.592}'

    For my test environment, this figure comes to 29 (million I/O requests per month).
  • Component #3 can be estimated with the help of the guidelines at the bottom of the EBS Product Info page:
    Snapshot storage is based on the amount of space your data consumes in Amazon S3. Because data is compressed before being saved to Amazon S3, and Amazon EBS does not save empty blocks, it is likely that the size of a snapshot will be considerably less than the size of your volume. For the first snapshot of a volume, Amazon EBS will save a full copy of your data to Amazon S3. However for each incremental snapshot, only the part of your Amazon EBS volume that has been changed will be saved to Amazon S3.
    As a conservative estimate of snapshot storage size, I use the same size as the actual data on the EBS volume. I figure this covers the initial compressed snapshot plus a month's worth of delta snapshots.
  • Component #4 can also be estimated from the guidelines further down the same page:
    Volume data is broken up into chunks before being transferred to Amazon S3. While the size of the chunks could change through future optimizations, the number of PUTs required to save a particular snapshot to Amazon S3 can be estimated by dividing the size of the data that has changed since the last snapshot by 4MB. Conversely, when loading a snapshot from Amazon S3 into an Amazon EBS volume, the number of GET requests needed to fully load the volume can be estimated by dividing the full size of the snapshot by 4MB. You will also be charged for GETs and PUTs at normal Amazon S3 rates.
    My own monthly estimate includes taking one full EBS volume snapshot and creating one EBS volume from a snapshot. I keep around 20GB of data on the EBS volume. So I divide 20480 (20 GB expressed in MB) by 4 to get 5120. This is the estimated number of PUTs per month. It is also the estimated number of GETs per month.
For my usage in our test environment, the cost of running instances that boot from a 50GB EBS volume with 20GB of data on it in the US region comes to approximately $10.96 per month.
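The four components can be added up with a short script. This is a sketch under stated assumptions: the EBS rates are the ones listed above, but the S3 rates ($0.15 per GB-month of storage, $0.01 per 1,000 PUTs, $0.01 per 10,000 GETs) are my assumption of the US rates at the time of writing and should be checked against current pricing. The function name estimate_ebs_monthly_cost is hypothetical.

```shell
#!/bin/sh
# Sketch: estimate the monthly EBS cost from the four components.
# Arguments: volume size in GB, data size in GB, millions of I/O requests.
# The S3 rates below are assumptions; verify them against current pricing.
estimate_ebs_monthly_cost() {
  awk -v vol_gb="$1" -v data_gb="$2" -v io_millions="$3" 'BEGIN {
    storage      = vol_gb * 0.10           # component 1: allocated storage
    io           = io_millions * 0.10      # component 2: I/O requests
    snap_storage = data_gb * 0.15          # component 3: assumed S3 storage rate
    requests     = data_gb * 1024 / 4      # 4MB chunks per full snapshot
    puts         = requests / 1000 * 0.01  # component 4: assumed S3 PUT rate
    gets         = requests / 10000 * 0.01 # component 4: assumed S3 GET rate
    printf "%.2f\n", storage + io + snap_storage + puts + gets
  }'
}

# 50GB volume, 20GB of data, 29 million I/O requests per month
estimate_ebs_monthly_cost 50 20 29
```

With these inputs the script prints 10.96, which agrees with the monthly figure quoted above.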

Is it financially worthwhile?

As long as your EBS cost estimate is less than the cost of running the instance for the hours it would sit unused, this solution saves you money. My test instance is an m1.large in the US region, which costs $0.40 per hour. If my instance would sit idle for at least (10.96/0.4=) 28 hours a month, I save money by using a bootable EBS volume and terminating the instance when it is unused. As it happens, my test instance is unused more than 360 hours a month, so for me it is a no-brainer: I use bootable EBS volumes.
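The break-even arithmetic is easy to redo for your own numbers. A minimal sketch follows; the function names break_even_hours and monthly_savings are hypothetical.

```shell
#!/bin/sh
# Sketch: break-even point and savings for a bootable EBS volume.
# break_even_hours: monthly EBS cost, hourly instance rate -> whole idle hours
break_even_hours() {
  awk -v cost="$1" -v rate="$2" 'BEGIN {
    h = cost / rate
    whole = int(h)
    if (whole < h) whole++                # round up to a whole hour
    print whole
  }'
}

# monthly_savings: idle hours, hourly instance rate, monthly EBS cost
monthly_savings() {
  awk -v idle="$1" -v rate="$2" -v cost="$3" 'BEGIN {
    printf "%.2f\n", idle * rate - cost
  }'
}

break_even_hours 10.96 0.40      # prints 28
monthly_savings 360 0.40 10.96   # prints 133.04
```

At 360 idle hours a month, the instance time avoided is worth $144.00, so the net saving after the EBS cost is $133.04.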

EC2 instances that boot from an EBS volume offer the benefits of an always-on machine without the costs of paying for it when it is not in use. The implementation presented here was developed based on posts by the aforementioned forum users, and I thank them for their help.


  1. are you kidding? you are a rock star for this post! Thank you!!!

  2. After your tip-off in your excellent http://clouddevelopertips.blogspot.com/2009/08/mount-ebs-volume-created-from-snapshot.html, I was curious to see about putting the root file system onto EBS for development environments. When everything is on EBS, installing additional packages is less painful than it is if you only have your development application in persistent storage.

    I've been using the Debian AMIs for no better reason than brand loyalty, so my starting point was creating the EBS bootable volume from ami-ff46a796 alestic/debian-5.0-lenny-base-20090804.manifest.xml following your instructions here.

    However, when I booted from the pivot AMI, I saw this error in my console output:

    Making pivot_root device node /new-root-dev
    Attempting to mount pivot_root volume on /new-root-dev
    1 scsi_mod: no version for "struct_module" found: kernel tainted.
    SCSI subsystem initialized
    register_blkdev: cannot get major 8 for sd
    mount: /new-root-dev is not a valid block device
    2 mount: /new-root-dev is not a valid block device
    3 mount: /new-root-dev is not a valid block device
    4 mount: /new-root-dev is not a valid block device
    5 mount: /new-root-dev is not a valid block device
    6 mount: /new-root-dev is not a valid block device
    7 mount: /new-root-dev is not a valid block device
    sdp: unknown partition table

    I tried going through your procedure a couple of times and repeated the same error.

    Google took me to a similar error reported in a thread, which also had a public pivot AMI, which works nicely, if I mount my volume on /dev/sdj:


    So the public pivot AMI ami-2feb0f46 nimlabs/pivot-sdj-20080824.manifest.xml works nicely for me with the Debian Lenny.

    The only substantial differences I can see between your init scripts - his is at http://developer.amazonwebservices.com/connect/servlet/JiveServlet/download/30-24091-99041-1961/init.sh - are:

    1. N. Martin uses /dev/sdj rather than /dev/sdp (yes... I did attach accordingly :-)
    2. He doesn't invoke the pre-pivot script

    I suspect that following your instructions for setting up a boot AMI from Ubuntu would work fine with my Debian bootable EBS, but it makes sense to me to use a public AMI for the boot.

    This got me thinking - lazy person that I am - that there is a case now for public snapshots for use with public pivot AMIs. Get yourself started on your development project on a bootable EBS volume with a pre-configured set up for all of the usual suspects you see listed as public AMIs. I guess these would go into http://aws.amazon.com/publicdatasets/. You'd get yourself started with a public snapshot and then take private snapshots as your project evolves - i.e. carry the AMI paradigm through to EBS.

  3. @rstaveley,

    The problem reported by scsi_mod may be an issue with the kernel modules included in the ramdisk. What AKI and ARI did you use? I'll try to reproduce with the default AKI and ARI and see what happens....

    The ability to create public, shared EBS volumes has been requested by many. It would allow some really cool sharing scenarios, such as the one you describe. I have submitted a question to the AWS folks who manage the public data set submissions:

    I'd like to share an EBS volume with the public in the same way I share an AMI. Public EBS snapshots would allow many really cool use cases, including bypassing the 10GB root-partition limit of an AMI (in combination with the pivot_root technique). Can I submit an EBS snapshot containing the linux boot code and startup sequence? Would you approve it? It's more than "data", it would also be the kind of stuff you'd burn into an AMI - apps, configs, etc....

    I'll report back what they say.

  4. I've been playing with this some time now this morning and I have noticed that the public pivot AMI also throws up the same scsi_mod error. I apologise that was misleading. The Debian boot was similar to your Ubuntu right the way up to:

    Checking root file system...fsck 1.41.3 (12-Oct-2008)
    fsck.ext3: Device or resource busy while trying to open /lib/init/rw/rootdev
    Filesystem mounted or opened exclusively by another program?
    fsck died with exit status 8

    In 20 or so boots I found that the public pivot AMI also failed once, so the problem may be a subtle one.

    However... I made a happy discovery while I was playing with this, which probably seems obvious to you, but which may interest your readers.

    I (shortsightedly?) created a 1G EBS volume for my DEV environment, which was OK for experimenting but very soon filled up. I wished I made a 2G volume.

    I thought of going back to the drawing board, but then it occurred to me that I could do this:

    1. Create a snapshot of the 1G bootable EBS volume with my DEV system on it.
    2. Create a 2G volume from that snapshot.
    3. Attach the new volume as (say) /dev/sdk, but don't mount it.
    4. Resize the filesystem to 2G
    e2fsck -f /dev/sdk
    resize2fs /dev/sdk
    6. Snapshot the volume with the resized filesystem
    7. Use my new snapshot henceforth as the bootable EBS volume.

    This quickly gives me a way to increase the size of the bootable volume. An elastic smile appeared on my face :-)

  5. @rstaveley,

    Yes, it turns out that the actual problems begin later in the boot sequence. The entire console output (which can take a while to show up, and sometimes only flushes completely at instance termination) shows more interesting stuff.

    Boot-time problems are notoriously hard to debug in EC2 where you can't get console access.

    Using the same AMI/AKI/ARI as you, I get this:
    [skipping uninteresting stuff]
    EXT3 FS on sda1, internal journal
    Making pivot_root device node /new-root-dev
    Attempting to mount pivot_root volume on /new-root-dev
    1 scsi_mod: no version for "struct_module" found: kernel tainted.
    SCSI subsystem initialized
    register_blkdev: cannot get major 8 for sd
    mount: /new-root-dev is not a valid block device
    2 mount: /new-root-dev is not a valid block device
    3 mount: /new-root-dev is not a valid block device
    4 mount: /new-root-dev is not a valid block device
    5 mount: /new-root-dev is not a valid block device
    6 mount: /new-root-dev is not a valid block device
    7 mount: /new-root-dev is not a valid block device
    8 mount: /new-root-dev is not a valid block device
    9 mount: /new-root-dev is not a valid block device
    10 mount: /new-root-dev is not a valid block device
    11 mount: /new-root-dev is not a valid block device
    12 mount: /new-root-dev is not a valid block device
    13 mount: /new-root-dev is not a valid block device
    14 mount: /new-root-dev is not a valid block device
    15 mount: /new-root-dev is not a valid block device
    16 mount: /new-root-dev is not a valid block device
    17 mount: /new-root-dev is not a valid block device
    18 mount: /new-root-dev is not a valid block device
    sdp: unknown partition table
    19 kjournald starting. Commit interval 5 seconds
    EXT3-fs: mounted filesystem with ordered data mode.
    Mounted pivot_root volume.
    Mount failed for selinuxfs on /selinux: No such file or directory
    INIT: version 2.86 booting
    /proc is empty (not mounted ?)
    hostname: the specified hostname is invalid
    Starting the hotplug events dispatcher: udevd.
    Synthesizing the initial hotplug events...done.
    Waiting for /dev to be fully populated...input: PC Speaker as /class/input/input1
    Activating swap...Adding 917496k swap on /dev/sda3. Priority:-1 extents:1 across:917496k
    Checking root file system...fsck 1.41.3 (12-Oct-2008)
    fsck.ext3: Device or resource busy while trying to open /dev/sdp
    Filesystem mounted or opened exclusively by another program?
    fsck died with exit status 8
    failed (code 8).
    An automatic file system check (fsck) of the root filesystem failed. A manual fsck must be performed, then the system restarted. The fsck should be performed in maintenance mode with the root filesystem mounted in read-only mode. failed!
    The root filesystem is currently mounted in read-only mode. A maintenance shell will now be started. After performing system maintenance, press CONTROL-D to terminate the maintenance shell and restart the system. (warning).
    Press enter for maintenance
    (or type Control-D to continue):

    The 19 retries happened while waiting for the volume to be attached to /dev/sdp. Then the boot continues OK but dies with:
    Checking root file system...fsck 1.41.3 (12-Oct-2008)
    fsck.ext3: Device or resource busy while trying to open /dev/sdp
    Filesystem mounted or opened exclusively by another program?
    fsck died with exit status 8

    Ugh. I'll look into this some more.

  6. I'd only look into it if it satisfies an intellectual itch, Shlomo. A virtue of this pivot_root AMI approach is that the AMI doesn't need to be the same distro as the bootable EBS volume. As long as the AMI can [reasonably] reliably get any bootable EBS distro you want up and running, you might as well use the same AMI... Unless I am mistaken.

    I hate to think that my brand loyalty has got you delving into the differences between Debian and Ubuntu or Fedora, when any distro will suffice for the pivot AMI.

  7. @rstaveley,

    It's true - any distro can be used to build the boot AMI, and so the public AMI you mentioned (ami-2feb0f46 nimlabs/pivot-sdj-20080824.manifest.xml) can be used to boot from any EBS attached to /dev/sdj . Anyone who needs a boot AMI using a different attach point can follow the procedure in this article to build one from an Ubuntu AMI.

    I did identify that the difference between Ubuntu and Debian lies in the RAID support. As you say, there appears to be little practical value to discovering the underlying cause of failure.

    Thanks for your feedback!

  8. Looks like you got through to them, Shlomo! Shared snapshots have been announced now. They are implemented in the AWS console (http://aws.amazon.com/console/) and in the API, so collaboration on shared DEV environments using your technique should now be able to cross Amazon account boundaries. :-)

  9. @Robert Staveley (Tom),

    Yes, there are many cool use cases enabled by shared snapshots. I even wrote an article about some: http://clouddevelopertips.blogspot.com/2009/09/cool-things-you-can-do-with-shared-ebs.html

    I wish I could claim credit for "getting through to them" but I shouldn't. It takes more than one person to clamor, and AWS seem to be strongly influenced by customer clamor.

  10. Hi Shlomo,

    Nice post regarding boot from EBS.

    On Error: fsck died with exit status 8

    I think one possible cause of this is fsck finding an inconsistent block on the EBS volume used.

    A possible solution could be to skip the filesystem check by editing fstab and setting the pass field to 0, as posted at http://developer.amazonwebservices.com/connect/thread.jspa?threadID=24091&start=30&tstart=0

    Or attach that EBS volume to another instance, run fsck on the device, and then reattach it to the pivot AMI instance.

    Just a thought.