Installing GlusterFS on HP Cloud
Gluster is a distributed filesystem that works well in the cloud. This post explains how to configure GlusterFS on an Ubuntu 12.04 image running in HP's cloud.
Using this setup, I gain all the benefits of a distributed and replicated (redundant) filesystem for my in-cloud services, and can back these servers with persistent block storage if I want. It's a great way to gain stable networked filesystems in the cloud, and reduce or eliminate single points of failure.
In this article, I'm using HP Cloud. I assume that you have an account and a basic knowledge of security groups, servers, and so on.
I am using the hpcloud CLI, which is open source and available from HP Cloud. I assume that hpcloud has already been configured.
Create A Security Group
First, create a new security group for your Gluster servers. I do this with the hpcloud command-line client:
hpcloud securitygroups:add GlusterServer "Gluster server cluster"
For now, I am adding the following rules:
hpcloud securitygroups:rules:add GlusterServer tcp -p 22..22 -c '0.0.0.0/0'
hpcloud securitygroups:rules:add GlusterServer tcp -p 24007..24007 -g GlusterServer
hpcloud securitygroups:rules:add GlusterServer tcp -p 24009..24099 -g GlusterServer
This opens port 22 for SSH, port 24007 for Gluster admin, and ports 24009-24099 for Gluster's filesystem protocol. Note that the last two are restricted to other compute instances in the GlusterServer group. I am not making any guarantees about the security of this setup. Since Gluster does not currently encrypt traffic over the wire, use at your own risk.
Create an HP Cloud instance
Next, create two HP Cloud Compute instances with the following:
- Size: Small
- Image: Ubuntu 12.04
- Security Group: GlusterServer
hpcloud servers:add gluster-server-1 small -i 75845 -k MyKey -s GlusterServer
hpcloud servers:add gluster-server-2 small -i 75845 -k MyKey -s GlusterServer
These two will be members of our Gluster cluster.
I'll refer to the private IP of each instance as IP_ONE and IP_TWO. These are 10.x.x.x IP addresses.
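The commands later in this post use these addresses as the shell variables $IP_ONE and $IP_TWO. Here is a minimal sketch of setting them and sanity-checking that each is in the 10.x range. The helper name is_10_net and the two addresses are placeholders of my own, not anything HP Cloud provides.

```shell
# Hypothetical helper: succeed if $1 looks like a 10.x.x.x IPv4 address.
is_10_net() {
    case "$1" in
        *[!0-9.]*|.*|*.|*..*) return 1 ;;  # digits and single dots only
        10.*.*.*) ;;                        # must start with "10."
        *) return 1 ;;
    esac
    # Split on dots and check for exactly four octets, each at most 255.
    old_ifs=$IFS; IFS=.
    set -- $1
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
}

# Placeholder addresses (assumptions); substitute your instances' private IPs.
IP_ONE=10.0.0.11
IP_TWO=10.0.0.12

for ip in "$IP_ONE" "$IP_TWO"; do
    is_10_net "$ip" || echo "warning: $ip is not a 10.x.x.x address" >&2
done
```

Set the same two variables in the shell on every machine where you run the commands below.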
Installing a GlusterFS Server
Repeat this process on each of the two servers you just created.
Log into the instance and install the GlusterFS server package along with the XFS tools:
sudo apt-get install glusterfs-server xfsprogs
Following the GlusterFS installation manual, we now create and mount volumes:
sudo umount /mnt
sudo mkfs.xfs -f -i size=512 /dev/vdb
sudo mkdir -p /export/brick1
Next, edit /etc/fstab so that /dev/vdb mounts on /export/brick1 instead of /mnt.
Once this is done, you can do something like this:
sudo mount /dev/vdb
This should mount your new XFS virtual disk on /export/brick1.
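If you would rather preview the /etc/fstab change than edit the file by hand, a small filter can do it. This is only a sketch: the function name retarget_fstab is my own, and it assumes the existing entry has the device name (here /dev/vdb) in its first column. It prints to standard output and never modifies the file.

```shell
# Hypothetical helper: print an fstab-format file with the entry for one
# device pointed at a new mount point and filesystem type.
# All other lines pass through unchanged.
# $1 = device, $2 = new mount point, $3 = filesystem type, $4 = fstab file
retarget_fstab() {
    awk -v dev="$1" -v mp="$2" -v fs="$3" '
        $1 == dev { $2 = mp; $3 = fs }   # rewrite only the matching entry
        { print }
    ' "$4"
}
```

Running retarget_fstab /dev/vdb /export/brick1 xfs /etc/fstab shows what the new file would look like. Mount options on the matching line are left as they were, so review the output before copying it into place with sudo.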
Clustering your two servers
From the first Gluster server, run this command:
sudo gluster peer probe $IP_TWO
The probe should report success. If it hangs forever, you probably did not add port 24007 to the security group.
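To tell a hung probe apart from a blocked port, you can test the port directly before probing. The sketch below assumes a netcat (nc) that supports -z (connect without sending data) and -w (timeout in seconds). The function name port_open is my own.

```shell
# Hypothetical helper: succeed if a TCP connection to host:port opens.
# Assumes an nc that supports -z and -w.
# $1 = host, $2 = port, $3 = timeout in seconds (defaults to 3)
port_open() {
    nc -z -w "${3:-3}" "$1" "$2" >/dev/null 2>&1
}
```

For example, port_open "$IP_TWO" 24007 || echo "port 24007 unreachable" fails quickly rather than hanging. A failure points at the GlusterServer security group rules, or at the Gluster daemon not running on the second server.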
Continuing on that same compute instance, we can create a new volume configured to replicate data across both of these servers:
sudo gluster volume create gv0 replica 2 $IP_ONE:/export/brick1 $IP_TWO:/export/brick1
Here, IP_ONE is the 10.x IP address of the present compute instance, and IP_TWO is the 10.x IP address of the second instance.
From here, we just need to start up that volume:
sudo gluster volume start gv0
sudo gluster volume info
The last command above should print out information about the volume named gv0, including lines like these:
Volume Name: gv0
Number of Bricks: 2
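If you want to script this check instead of reading the output by eye, here is a minimal sketch. It relies only on the two output lines shown above and treats the last field of the "Number of Bricks" line as the total. The function name volume_looks_ok is my own.

```shell
# Hypothetical helper: read `gluster volume info` output on stdin and
# succeed only if it names the volume and reports the expected brick total.
# $1 = volume name, $2 = expected number of bricks
volume_looks_ok() {
    awk -v name="$1" -v want="$2" '
        $1 == "Volume" && $2 == "Name:" && $3 == name { seen = 1 }
        /^Number of Bricks:/ && seen && !checked {
            if ($NF == want) ok = 1   # the total is the last field
            checked = 1
        }
        END { exit(ok ? 0 : 1) }
    '
}
```

Used as sudo gluster volume info gv0 | volume_looks_ok gv0 2 && echo "gv0 looks right", it stays silent when the volume is missing or has the wrong number of bricks.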
Now we have a Gluster server cluster running. The last step is to configure another VM to use the Gluster file system.
Connecting a Client
To mount a Gluster volume from another HP Cloud instance, that instance will need:
- To be a member of the GlusterServer security group
- To have the GlusterFS client package installed
Let's create an extra-small VM just for testing:
hpcloud servers:add TEST-GLUSTERFS xsmall -i 75845 -k MyKey -s GlusterServer
Once this server is up and running, SSH into it and install just the FUSE client for GlusterFS:
sudo apt-get install glusterfs-client
From there, you can mount a gluster volume like this:
sudo mount -t glusterfs $IP_ONE:/gv0 /mnt
You should be able to go to /mnt and work directly on the GlusterFS volume. The files you create there will be replicated across your new Gluster replicated cluster.
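Before trusting the mount, it is worth confirming that /mnt really is the Gluster volume and not the instance's local disk. Here is a sketch that looks for a fuse.glusterfs entry in the kernel's mount table. The function name is_gluster_mount is my own, and the optional second argument exists only so the check can be pointed at a different file.

```shell
# Hypothetical helper: succeed if the directory is a GlusterFS FUSE mount.
# $1 = mount point, $2 = mounts table (defaults to /proc/mounts)
is_gluster_mount() {
    awk -v mp="$1" '
        $2 == mp && $3 == "fuse.glusterfs" { found = 1 }
        END { exit(found ? 0 : 1) }
    ' "${2:-/proc/mounts}"
}
```

A quick replication test is then is_gluster_mount /mnt && echo hello | sudo tee /mnt/replication-test.txt, followed by ls /export/brick1 on each of the two servers. The file should appear in both bricks.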
In the example above, I created a two-node cluster using only ephemeral storage. If both of these instances died at the same time, your data would be lost. So for production workloads, you probably want to also create a new block storage volume and use that on at least one of your instances.
Also, keep in mind that GlusterFS does not encrypt traffic over the network. (That's slated to arrive in GlusterFS 2.4 or so.) So there's a level of risk involved in using GlusterFS.
As usual, I speak only for myself and not for my employer.