I see these steps for setting up the disk for a MapR installation at the linked page:
To determine if a disk or partition is ready for use by MapR:
Run the command sudo lsof to determine whether any processes are
already using the disk or partition.
There should be no output when running sudo fuser, indicating there is no process accessing the specific disk or partition.
The disk or partition should not be mounted, as checked via the output of the mount command.
The disk or partition should not have an entry in the /etc/fstab file.
The disk or partition should be accessible to standard Linux tools such as
mkfs. You should be able to successfully format the partition using a
command like sudo mkfs.ext3 as this is similar to the
operations MapR performs during installation. If mkfs fails to access
and format the partition, then it is highly likely MapR will encounter
the same problem.
I have issues in achieving this on an Amazon EC2 instance.
Steps that I have tried:
I have created a large EC2 instance.
Created a snapshot of the volume associated with that instance
Created a new 500 GB volume from the snapshot created above
I am not sure how to unmount this new volume and make it available for MapR. I also see an entry in /etc/fstab for this new volume.
Can someone give a step-by-step approach to creating a disk or partition that satisfies the above criteria for MapR?
MapR runs on raw disks, e.g., directly on /dev/sdb. Use the disksetup command to add disks to MapR. See http://mapr.com/doc/display/MapR/disksetup for information on how to use it.
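For the EC2 volume in your question, a rough sequence might look like the following sketch. It assumes the new volume shows up as /dev/xvdf (check lsblk for the actual device name) and that MapR is installed under the usual /opt/mapr path; verify both on your system:
sudo umount /dev/xvdf                       # unmount it if it is currently mounted
sudo sed -i.bak '/xvdf/d' /etc/fstab        # drop its /etc/fstab entry (a backup copy is kept)
sudo fuser -v /dev/xvdf                     # should print nothing
mount | grep xvdf                           # should print nothing
echo '/dev/xvdf' | sudo tee /tmp/disks.txt  # list the raw device for MapR
sudo /opt/mapr/server/disksetup -F /tmp/disks.txt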
Related
I ran into some issues with my EC2 micro instance and had to terminate it and create a new one in its place. But it seems even though the old instance is no longer visible in the list, it is still using up some space on my disk. My df -h is listed below:
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      7.8G  7.0G  719M  91% /
When I go to the EC2 console I see there are 3 volumes, each 8 GB, in the list. One of them is attached (/dev/xvda) and is showing as "in-use". The other 2 are simply showing as "Available".
Is the terminated instance really using up my disk space? If yes, how to free it up?
I have just solved my problem by running this command:
sudo apt autoremove
and a lot of old packages get removed, for instance many files like linux-aws-headers-4.4.0-1028.
Amazon Elastic Block Storage (EBS) is a service that provides virtual disks for use with Amazon EC2. It is network-attached storage that persists even when an EC2 instance is stopped or terminated.
When launching an Amazon EC2 instance, a boot volume is automatically attached to the instance. The contents of the boot volume is copied from an Amazon Machine Image (AMI), which can be chosen from a pre-populated list (including the ability to create your own AMI).
When an Amazon EC2 instance is Stopped, all EBS volumes remain attached to the instance. This allows the instance to be Started with the same configuration as when it was stopped.
When an Amazon EC2 instance is Terminated, EBS volumes might or might not be deleted, based upon the Delete on Termination setting of each volume:
By default, boot volumes are deleted when an instance is terminated. This is because the volume was originally just a copy of an AMI, so there is unlikely to be any important data on the volume. (Hint: Don't store data on a boot volume.)
Additional volumes default to "do not delete on termination", on the assumption that they contain data that should be retained. When the instance is terminated, these volumes will remain in an Available state, ready to be attached to another instance.
So, if you do not require any content on your remaining EBS volumes, simply delete them. In future, when launching instances, keep an eye on the Delete on Termination setting to make the clean-up process simpler.
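If you prefer the command line, you can list and remove the leftover Available volumes like this (a sketch only: it assumes the AWS CLI is configured, the volume ID is a placeholder, and deleting a volume is irreversible):
aws ec2 describe-volumes --filters Name=status,Values=available
aws ec2 delete-volume --volume-id vol-0123456789abcdef0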
Please note that the df -h command only shows currently-attached volumes. It does not show the volumes in the Available state, since they are not visible to that instance. The concept of "Disk Space" typically refers to the space within an EBS volume, while "EBS Storage" refers to the volumes themselves. So, the 7 GB of used space belongs to that specific (boot) volume only.
If you are running out of space on an EBS volume, see: Expanding the Storage Space of an EBS Volume on Linux. Expanding the volume involves the following steps (a rough command-line sketch follows the list):
Creating a snapshot
Creating a new (bigger) volume from the snapshot
Swapping the disks (requiring a Stop/Start if you are swapping a boot volume)
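As an AWS CLI sketch of those steps (all IDs, the size, the availability zone, and the device name are placeholders; the console works just as well):
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "pre-resize backup"
aws ec2 create-volume --snapshot-id snap-0123456789abcdef0 --size 100 --availability-zone us-east-1a
aws ec2 detach-volume --volume-id vol-0123456789abcdef0    # stop the instance first if this is the boot volume
aws ec2 attach-volume --volume-id vol-0fedcba9876543210 --instance-id i-0123456789abcdef0 --device /dev/xvda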
These two steps add an extra hard drive to your EC2 instance and format it for use (a command sketch follows the links):
Attach an extra hard drive (EBS: Elastic Block Storage) to an EC2
Format an EBS drive attached to an EC2
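At the operating-system level, the second step usually boils down to something like the following sketch. It assumes the new drive appears as /dev/xvdf; check lsblk first, and note that mkfs destroys any existing data on the device:
sudo mkfs -t ext4 /dev/xvdf
sudo mkdir /data
sudo mount /dev/xvdf /data
echo '/dev/xvdf /data ext4 defaults,nofail 0 2' | sudo tee -a /etc/fstab   # keep the mount across reboots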
Here's pricing info. Free Tier includes 30GB. Afterward it's $1.25/month for 10GB on a General Purpose SSD (gp2).
To see how much space you are using/need:
Check your current disk use/available in Linux with df -h.
Check the size of a directory in Linux with du -sh [path].
Production went down today with no disk space remaining error. After deleting files and restarting the machine, it still came up with this error, even if I just try to touch a new empty file.
It is probably caused by running out of inodes, but I went ahead and created an "Image" which seems to create an AMI, but after launching an instance of the AMI the same problem persisted... probably because it is using the same EBS volume.
Question is: how do I snapshot the EBS volume and then connect a new volume to the AMI as the root fs?
You are correct that the "Create Image" command creates an Amazon Machine Image (AMI). If you start a new EC2 instance with this AMI, it will contain the same data as the machine that was imaged. That's why you are copying your existing problem to the new instance.
Check your disk space with df -h to confirm that you have space available.
If you require more disk space, you can copy your disk to a larger volume as follows:
Option 1: If you already have an AMI of the volume:
Launch a new instance using the AMI, but expand the size of the volume in the Add Storage options
Option 2: If you want to retain the same instance:
Stop your instance
Create Snapshot of the EBS Volume
Create Volume from the Snapshot, specifying a larger storage size
Detach the original root volume
Attach the new volume in its place (keep the same Device identifier)
In both cases, after startup confirm that the partition has automatically expanded. If not, use the resize2fs command to extend the filesystem.
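Once the instance is back up, a quick sanity check looks like this (device names are the common Ubuntu/Amazon Linux defaults and may differ on your instance; df -i is worth checking because running out of inodes produces the same "no space" error):
df -h                        # free disk space
df -i                        # free inodes
sudo resize2fs /dev/xvda1    # grow an ext2/3/4 filesystem if it did not auto-expand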
When you create an image of an EC2 instance, it also takes snapshots of the volumes. You can see this under "Images > AMIs"; the snapshot information is visible in the "Block Devices" column of the table (by default, this column is not visible).
Now, if you are getting the "no disk space" error, you need to increase the size of the root volume. You can do that by following the link below:
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html
I need some sort of distributed file system running on a CoreOS cluster.
As such I'd like to run HDFS on CoreOS nodes. Is this possible?
I can see two options:
Expand CoreOS: install HDFS directly onto CoreOS. This is not ideal, as it breaks the whole concept of CoreOS's containerisation and would mean installing a lot of additional components.
Somehow run HDFS in a Docker container on CoreOS and set affinities
Option 2 seems like the best approach; however, there are some potential blockers:
How do I reliably expose the physical disks to the Docker container running HDFS?
How do you scale container affinities?
How does this work with the NameNodes, etc.?
Cheers.
I'll try to provide two possibilities. I haven't tried either of these, so they are mostly suggestions, but they could get you down the right path.
The first, if you want to do HDFS and it requires device access on the host, would be to run the HDFS daemons in a privileged container that had access to the required host devices (the disks directly). See https://docs.docker.com/reference/run/#runtime-privilege-linux-capabilities-and-lxc-configuration for information on the --privileged and --device flags.
In theory, you could pass the devices to the container that is handling the access to disks. Then the containers could use something like --link to talk to each other. The NameNode would store the metadata on the host using a volume (passed with -v). Though, given the little reading I have done about the NameNode, it seems like there won't be a good solution for high availability yet anyway, and it is a single point of failure.
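A very rough sketch of what that could look like (the image name, the commands it runs, and the device path are all hypothetical; --link was the container-linking mechanism current at the time):
docker run -d --name namenode -v /data/hdfs-name:/hadoop/dfs/name my-hdfs-image namenode
docker run -d --name datanode --privileged --device /dev/sdb:/dev/sdb --link namenode:namenode my-hdfs-image datanode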
The second option to explore, if you are looking for a clustered file system and not HDFS in particular, would be to check out the recent Ceph FS support added to the kernel in CoreOS 471.1.0: https://coreos.com/releases/#471.1.0. You might then be able to use the same approach of privileged container to access host disks to build a Ceph FS cluster. Then you might have a 'data only' container that had Ceph tools installed to mount a directory on the Ceph FS cluster, and expose this as a volume for other containers to use.
Though both of these are only ideas and I haven't used HDFS or Ceph personally (though I am keeping an eye on Ceph and would like to try something like this soon as a proof of concept).
I have a server running the recent Ubuntu AMIs from Canonical. The size of the EBS boot volume is 8GB. I know that I can resize EBS volumes by taking a snapshot, creating a new volume and expanding the partition on it. How can I increase the size of the volume while the machine is running? If this is not possible, what is the preferred method for increasing the boot volume size with minimal downtime?
We can increase the volume size with the new EBS feature, Elastic Volumes. After that, we need to follow these steps to make use of the increased size.
Assume your volume was 16 GB and you increased it to 32 GB.
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda 202:0 0 32G 0 disk
└─xvda1 202:1 0 16G 0 part /
To extend xvda1 from 16 GB to 32 GB, we need growpart. growpart is available as part of cloud-utils.
sudo apt install cloud-utils
Post installation of cloud-utils, execute the growpart command:
sudo growpart /dev/xvda 1
Now lsblk will show:
$ lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
xvda 202:0 0 32G 0 disk
└─xvda1 202:1 0 32G 0 part /
but df -h will still show only 16 GB.
The final command for extending xvda1 to 32 GB is:
In the case of an ext4 file system:
sudo resize2fs /dev/xvda1
In the case of an XFS file system:
sudo apt-get install xfsprogs
sudo xfs_growfs /dev/xvda1
Unfortunately it is not possible to increase the size of an Amazon EBS root device storage volume while the Amazon EC2 instance is running - Eric Hammond has written a detailed (I'm inclined to say the 'canonical' ;) article about Resizing the Root Disk on a Running EBS Boot EC2 Instance:
As long as you are ok with a little down time on the EC2 instance (a few minutes), it is possible to change out the root EBS volume with a larger copy, without needing to start a new instance.
If you properly prepare the steps he describes (I highly recommend testing them with a throwaway EC2 instance first to get acquainted with the procedure), you should indeed be able to finish the process with only a few minutes of downtime.
Good luck!
A late answer to this 5-year-old question
AWS has just announced a new EBS feature called Elastic Volumes, which allows you to increase volume size, adjust performance, or change the volume type while the volume is in use.
You can read more about it on AWS Blog here.
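As a sketch, growing an attached volume then becomes a single call (the volume ID and size are placeholders); you still need to grow the partition and filesystem afterwards, as described in the other answers:
aws ec2 modify-volume --volume-id vol-0123456789abcdef0 --size 100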
You just need to create a snapshot of the volume first, then create another volume from that snapshot. Once the new volume is ready, detach the old volume from the instance and attach the new one in its place. Make sure to stop the instance before starting this process, and restart the instance once it's done.
Refer http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-expand-volume.html
This will work for an XFS file system; just run this command: xfs_growfs /
I found that when trying to increase the root partition /dev/sda1, which was being reported as /dev/xvda1 on CentOS 6, I couldn't unmount the volume in order to expand the partition.
I got around this by mounting my original volume as /dev/sda1 and my snapshot as /dev/sdb. I then restarted the image and resized the /dev/sdb1 partition using parted.
Once the partition /dev/sdb1 was resized, I detached both volumes, reattached the new volume as /dev/sda1, and ran resize2fs /dev/xvda1.
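For what it's worth, growpart from cloud-utils (mentioned in another answer here) can do the same partition grow non-interactively; a sketch, assuming the snapshot volume is attached as /dev/sdb with its filesystem on partition 1:
sudo growpart /dev/sdb 1
sudo resize2fs /dev/sdb1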
You cannot do this. But if you are more focused on downtime than cost, you may be able to clone your main instance, attach a larger EBS storage device to it, copy the data over, and then redirect traffic to the new instance.
Alternatively, a method I have been using lately uses S3 as a medium for backups and deployment to other systems. For example, with your existing system running, set a script to upload your data to S3 every N minutes/hours/days, then write a script that new instances run at launch to download that data. If your data isn't constantly updated, this should work fine (for me, I use this to distribute updated versions of my codebase, while the data itself is managed on an EC2 database server).
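A minimal sketch of that approach (the bucket name and paths are placeholders, and it assumes the AWS CLI is installed and configured):
0 * * * * aws s3 sync /var/app/data s3://my-backup-bucket/data    # cron entry on the running system
aws s3 sync s3://my-backup-bucket/data /var/app/data              # run from the launch script of a new instance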
Hope that helps.
My instance swap file is disappearing when I start my instance.
I have an Ubuntu ec2 instance, and I follow the "Four-step Process to Add Swap File" instructions at https://help.ubuntu.com/community/SwapFaq:
sudo dd if=/dev/zero of=/mnt/512MiB.swap bs=1024 count=524288
sudo chmod 600 /mnt/512MiB.swap
sudo mkswap /mnt/512MiB.swap
sudo swapon /mnt/512MiB.swap
I then changed my /etc/fstab to include:
/mnt/512MiB.swap none swap sw 0 0
Since I am using a much bigger swap, this process takes some time, and I don't want to do it every time I start. I would rather pay for the storage. However, when I start my instance, the swap has disappeared. If I type 'top', the instance does not have a swap file in use.
What should I do?
While the Amazon EC2 instance you are using has EBS-backed Root Device Storage, all EC2 instance types still have EC2 instance storage (also known as an ephemeral store) available for use as well, and the smaller instance types (e.g. m1.small and c1.medium) even have it attached and mounted at /mnt by default (the larger ones do not!).
The most important characteristic of this storage type to be aware of is, that the data on the instance store volumes persists only during the life of the associated Amazon EC2 instance.
This statement is nowadays a tiny bit misleading, insofar as it applies to stopping an EBS-backed instance as well (not rebooting, though): the moment you stop that instance, the ephemeral volume mounted at /mnt is detached and deleted and all data stored there is lost, including your swap file of course; once you start the instance again, a new ephemeral volume will be attached and mounted at /mnt.
Solution
You can still use the EC2 instance storage (which is plentiful and free of charge) if you know exactly what you are doing (see section Background below); e.g. it is a perfect option for strictly temporary data or anything that can be recreated easily on demand, like a cache.
A swap file matches these requirements as well, of course, so you simply need to create a script with the commands outlined in your question and execute it on instance start to recreate the swap file. You should put a guard in place though, because the instance storage survives reboots, i.e. you neither need nor should recreate the swap file on reboots, just after real stop/start cycles (a sketch follows below).
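A minimal sketch of such a script, assuming the ephemeral store is mounted at /mnt as in your question; run it as root at instance start, e.g. from /etc/rc.local, an init script, or user data:
#!/bin/bash
SWAPFILE=/mnt/512MiB.swap
if [ ! -f "$SWAPFILE" ]; then                      # guard: skip recreation after a mere reboot
    dd if=/dev/zero of="$SWAPFILE" bs=1024 count=524288
    chmod 600 "$SWAPFILE"
    mkswap "$SWAPFILE"
fi
swapon "$SWAPFILE"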
Background
The instance storage used to be the only storage option when Amazon EC2 was first introduced, but the resulting severe limitations for everyday usage have fortunately been remedied with the Amazon Elastic Block Store (EBS), which you are using as well. Eric Hammond has provided a great summary of why You Should Use EBS Boot Instances on Amazon EC2, addressing this very topic:
If you are just getting started with Amazon EC2, then use EBS boot
instances and stop reading this article. Forget that you ever heard
about instance-store and accept my apology that I just mentioned it.
Once you are completely comfortable with using EBS boot instances on
EC2, you may (or may not) want to come back here and read why you made
a good decision.