Installing GlusterFS on HP Cloud

Feb 27 2013

Gluster is a distributed filesystem that works well in the cloud. This post explains how to configure GlusterFS on an Ubuntu 12.04 image running in HP's cloud.

Using this setup, I gain all the benefits of a distributed and replicated (redundant) filesystem for my in-cloud services, and can back these servers with persistent block storage if I want. It's a great way to gain stable networked filesystems in the cloud, and reduce or eliminate single points of failure.

Prerequisites

In this article, I'm using HP Cloud. I assume that you have an account and a basic knowledge of security groups, servers, and so on.

I am using the hpcloud CLI, which is Open Source, and available from HP Cloud. I assume that hpcloud has already been configured.

Create A Security Group

First, create a new security group for your Gluster servers. I do this with the hpcloud command-line client:

hpcloud securitygroups:add GlusterServer "Gluster server cluster"

For now, I am adding the following rules:

hpcloud securitygroups:rules:add GlusterServer tcp -p 22..22 -c '0.0.0.0/0'
hpcloud securitygroups:rules:add GlusterServer tcp -p 24007..24007 -g GlusterServer
hpcloud securitygroups:rules:add GlusterServer tcp -p 24009..24099 -g GlusterServer

This opens port 22 for SSH from anywhere, port 24007 for Gluster admin, and ports 24009-24099 for Gluster's filesystem protocol. Note that the last two are restricted to other compute instances in the GlusterServer group. I am not making any guarantees about the security of this setup. Since Gluster does not currently encrypt traffic over the wire, use it at your own risk.

Create an HP Cloud instance

Next, create two HP Cloud Compute instances with the following:

  1. Size: Small
  2. Image: Ubuntu 12.04
  3. Security Group: GlusterServer

hpcloud servers:add gluster-server-1 small -i 75845 -k MyKey -s GlusterServer
hpcloud servers:add gluster-server-2 small -i 75845 -k MyKey -s GlusterServer

These two will be members of our Gluster cluster.

I'll refer to the private IP of each instance as IP_ONE and IP_TWO. These are 10.x.x.x IP addresses.
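Since the later commands use these names as shell variables, it can help to set them once per SSH session. Here is a minimal sketch; the two addresses are placeholders I made up, and is_10_net is a hypothetical helper that only does a rough shape check.

```shell
# Placeholder private addresses (assumptions): substitute the
# 10.x.x.x values of your own two instances.
IP_ONE=10.0.0.11
IP_TWO=10.0.0.12

# Rough sanity check: succeeds only for strings shaped like 10.a.b.c.
# It does not validate that each part is a number between 0 and 255.
is_10_net() {
    case "$1" in
        10.*.*.*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Warn (but do not stop) if either address looks wrong.
for addr in "$IP_ONE" "$IP_TWO"; do
    is_10_net "$addr" || echo "warning: $addr does not look like a 10.x address" >&2
done
```

The check is deliberately loose. Its only job is to catch the common mistake of pasting a public IP where the private one belongs.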

Installing a GlusterFS Server

Repeat this process on each of the two servers you just created.

Log in to the new instance and install the GlusterFS server package along with the XFS tools:

sudo apt-get install glusterfs-server xfsprogs

Following the GlusterFS installation manual, we now prepare the brick: unmount the disk from /mnt, format it as XFS, and create the directory it will be mounted on:

sudo umount /mnt
sudo mkfs.xfs -f -i size=512 /dev/vdb
sudo mkdir -p /export/brick1

Now edit /etc/fstab so that /dev/vdb is mounted on /export/brick1 instead of /mnt. Once this is done, you can do something like this:

sudo mount /dev/vdb

This should mount your new XFS virtual disk on /export/brick1.
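If you would rather script the fstab edit than do it by hand, something like the following might work. It is only a sketch: retarget_vdb is a hypothetical helper, and the "xfs defaults 0 0" fields are my assumption about reasonable options, not something the GlusterFS manual prescribes.

```shell
# retarget_vdb FILE: print FILE with its /dev/vdb entry rewritten so the
# disk mounts on /export/brick1 as XFS. All other lines pass through
# unchanged, and nothing is written back to FILE.
retarget_vdb() {
    awk '$1 == "/dev/vdb" {
             print "/dev/vdb /export/brick1 xfs defaults 0 0"
             next
         }
         { print }' "$1"
}
```

Because the function only prints the result, you can review it before touching the real file: run `retarget_vdb /etc/fstab > /tmp/fstab.new`, read /tmp/fstab.new, and then copy it over /etc/fstab with sudo once it looks right.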

Clustering your two servers

From the first Gluster server, run this command:

sudo gluster peer probe $IP_TWO

The probe should return successful. If it hangs forever, you probably did not add port 24007 to the security group.
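To tell a blocked port apart from some other problem, a quick TCP check can help. This is a rough sketch that assumes bash (built with /dev/tcp support) and the coreutils timeout command are present; port_open is a hypothetical helper name.

```shell
# port_open HOST PORT: succeed if a TCP connection to HOST:PORT can be
# opened within 5 seconds, fail otherwise (refused, filtered, or timed out).
port_open() {
    timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

From the first server, `port_open "$IP_TWO" 24007 && echo reachable || echo blocked` should print "reachable" once the security group rule is in place and glusterd is running on the second server.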

Continuing on that same compute instance, we can create a new volume configured to replicate data across both of these servers:

sudo gluster volume create gv0 replica 2 $IP_ONE:/export/brick1 $IP_TWO:/export/brick1

Here, IP_ONE is the 10.x IP address of the present compute instance, and IP_TWO is the 10.x IP address of the second instance.

From here, we just need to start up that volume:

sudo gluster volume start gv0
sudo gluster volume info

The last command above should print out the information about the volume named gv0:

Volume Name: gv0
Type: Replicate
Status: Started
Number of Bricks: 2
Transport-type: tcp
Bricks:
Brick1: $IP_ONE:/export/brick1
Brick2: $IP_TWO:/export/brick1
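If you want to check that output from a script rather than by eye, a couple of small filters are enough. This is a sketch that assumes the output keeps the "Status:" and "BrickN:" line format shown above; volume_started and list_bricks are hypothetical helper names.

```shell
# volume_started: read `gluster volume info` output on stdin and succeed
# only if it reports the volume as started.
volume_started() {
    grep -q '^Status: Started'
}

# list_bricks: read the same output on stdin and print one brick
# (host:/path) per line.
list_bricks() {
    awk '/^Brick[0-9]+:/ { print $2 }'
}
```

For example, `sudo gluster volume info gv0 | volume_started && echo "gv0 is up"` prints the message only when the volume has been started.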

Now we have a Gluster server cluster running. The last step is to configure another VM to use the Gluster file system.

Connecting a Client

To mount a Gluster volume from another HP Cloud instance, that instance will need:

  • To be a member of the GlusterServer security group
  • To have the glusterfs-client package installed

Let's create an extra-small VM just for testing:

hpcloud servers:add TEST-GLUSTERFS xsmall -i 75845 -k MyKey -s GlusterServer

Once this server is up and running, SSH into it and install just the FUSE client for GlusterFS:

sudo apt-get install glusterfs-client

From there, you can mount a gluster volume like this:

sudo mount -t glusterfs $IP_ONE:/gv0 /mnt

You should be able to go to /mnt and work directly on the GlusterFS volume. The files you create there will be replicated across both servers in your new Gluster cluster.
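One simple way to see the replication happen is to drop a uniquely named marker file on the client and then look for it on each server. A minimal sketch, with make_probe as a hypothetical helper:

```shell
# make_probe DIR: write a uniquely named marker file into DIR and print
# the file name, so you can look for it elsewhere.
make_probe() {
    dir="$1"
    name="replication-probe-$(date +%s)-$$"
    echo "written from $(hostname)" > "$dir/$name" && echo "$name"
}
```

On the client, run `make_probe /mnt` as a user that can write to /mnt. Then, on each of the two servers, `ls /export/brick1` should list a file with the same name.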

Next Steps

In the example above, I created a two-node cluster using only ephemeral storage. If both of these instances died at the same time, your data would be lost. So for production workloads, you probably want to also create a new block storage volume and use that on at least one of your instances.

Also, keep in mind that GlusterFS does not encrypt traffic over the network. (That's slated to arrive in GlusterFS 3.4 or so.) So there's a level of risk involved in running GlusterFS on a network you share with others.


As usual, I speak only for myself and not for my employer.