Ceph storage system


Ceph is currently perhaps the most popular open source distributed storage solution. I’m using it with OpenStack for block and object storage, for NFS, and as S3-compatible storage. I can tell you that I’m very impressed with its capabilities and can highly recommend it for large scale deployments. Here I will start with installation instructions on a lab machine with 2 disks and continue with the deployment of all services in future posts.

About Ceph

Ceph storage is a unified, distributed storage system designed for excellent performance, reliability and scalability. It provides:

  • One Object Storage: The Ceph Object Store, called RADOS, is the object storage component for CephFS filesystems, Ceph RADOS Gateways, and Ceph Block Devices.
  • Many Storage Interfaces: You can use CephFS, Ceph RADOS Gateway, or Ceph Block Devices in your deployment. You may also use all three interfaces with the same Ceph Object Store cluster! There’s no reason to build three different storage clusters for three different types of storage interface!
  • Use Commodity Hardware: You can deploy Ceph with commodity hardware. You don’t need to purchase proprietary storage or networking hardware commonly used in SAN systems. (http://ceph.com/docs/master/)

Installation

Ceph installation is really simple, like for all other services on Ubuntu. Just add the repository and run the installation through apt-get. Here I will install the latest stable release, v0.56.x Bobtail:
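For example, something along these lines; the repository and release key URLs here follow the Bobtail-era Ceph documentation, so treat them as placeholders:

    # add the Ceph release key and the Bobtail repository
    wget -q -O- 'https://ceph.com/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | sudo apt-key add -
    echo deb http://ceph.com/debian-bobtail/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
    # install the Ceph packages
    sudo apt-get update && sudo apt-get install -y ceph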

Configuration

Before I start with the configuration, you need to know the three main components of Ceph:

  • OSD: Object Storage Daemons (OSDs) store data, handle data replication, recovery, backfilling, rebalancing, and provide some monitoring information to Ceph monitors by checking other OSDs for a heartbeat. A cluster requires at least two OSDs to achieve an active + clean state.
  • MON: Ceph monitors maintain maps of the cluster state, including the monitor map, the OSD map, the Placement Group (PG) map, and the CRUSH map. Ceph maintains a history (called an “epoch”) of each state change in the monitors, OSDs, and PGs.
  • MDS: Metadata Servers (MDSs) store metadata on behalf of the CephFS filesystem (i.e., Ceph block devices and Ceph gateways do not use MDS). Ceph MDS servers make it feasible for POSIX file system users to execute basic commands like ls, find, etc. without placing an enormous burden on the object store.

For detailed information I highly recommend reading the official Ceph documentation, which is really well written. Here is an example of the main configuration file, ceph.conf. If you use multiple OSDs on multiple nodes, you can simply copy the same file to all of them. Also, in this example I’m using the XFS file system, which is recommended for production (/etc/ceph/ceph.conf):
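A minimal single-node example could look like this; the hostname ceph-lab, the monitor address and the /srv/ceph paths are just placeholders for my lab machine, so adjust them to your setup:

    [global]
        # use cephx authentication, or set to none to disable it
        auth supported = cephx

    [mon]
        mon data = /srv/ceph/mon$id

    [mon.0]
        host = ceph-lab
        mon addr = 192.168.0.10:6789

    [mds.0]
        host = ceph-lab

    [osd]
        osd data = /srv/ceph/osd$id
        # journal on the same disk as the data
        osd journal = /srv/ceph/osd$id/journal
        osd journal size = 1000
        # format and mount OSD disks as XFS
        osd mkfs type = xfs
        osd mount options xfs = rw,noatime

    [osd.0]
        host = ceph-lab
        # disk that mkcephfs --mkfs will format
        devs = /dev/sdb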

As you can see, here I’m using the OSD journal on the same disk as the data. For better performance it is recommended to put the journal on a separate disk, or even an SSD. Also, use one OSD daemon per disk. For authentication Ceph can use cephx or none. If you are using cephx you will need a key before you can access any Ceph resource. For the hostname use just the hostname, not the FQDN.
To continue with the above example, create directories for the osd0 and mon0 daemons:
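Using the data paths from the example configuration above:

    sudo mkdir -p /srv/ceph/osd0
    sudo mkdir -p /srv/ceph/mon0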

Now you can deploy the Ceph configuration. With the “--mkfs” parameter this command will also format the disk /dev/sdb to XFS:
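A sketch of the command, assuming you want the admin keyring in /etc/ceph; -a runs the required steps on every host listed in ceph.conf over ssh:

    sudo mkcephfs -a -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.keyring --mkfs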

NOTE: As of Ceph v0.60, mkcephfs is deprecated in favor of the ceph-deploy tool. (http://ceph.com/docs/master/rados/deployment/)

When everything is prepared, you can start Ceph:
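With the sysvinit script shipped in the packages (-a starts the daemons on all hosts from ceph.conf):

    sudo service ceph -a start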

Administration

Ceph is up and running, so you can check cluster health:
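For example:

    sudo ceph health
    # returns HEALTH_OK when everything is fine, HEALTH_WARN or HEALTH_ERR otherwise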

With the above configuration this command will show a WARN message, but that is because I’m using just one OSD and the replication level is set to two by default. Each pool has its own replication level.

You can easily check the replication level for all pools:
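In this release the replication level shows up as “rep size” in the pool listing of the OSD map (newer releases call it just “size”), so something like this works:

    sudo ceph osd dump | grep 'rep size'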

To list all pools:
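Either of these will do:

    sudo ceph osd lspools
    # or the same through the rados tool:
    sudo rados lspools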

If you want to create a new pool:
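For example, a pool named test with 128 placement groups; the PG count here is just a value that suits a small lab, so size it to your number of OSDs:

    sudo ceph osd pool create test 128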

Increase the replication level to 3 for the newly created pool test:
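The replication level is the pool’s size attribute:

    sudo ceph osd pool set test size 3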

Check cluster status:
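    sudo ceph -s
    # the long form "sudo ceph status" works as well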

You can also monitor your Ceph cluster live. With this command you will also see read/write speeds:
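    sudo ceph -w
    # prints the cluster status, then a running log line for every change, including client read/write rates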

Now that you have your Ceph cluster up and running, explore its functionality by going through the official documentation. In the next post I will show how to use Ceph storage as NFS, so stay tuned.


Alen Komljen

I'm a DevOps/Cloud engineer with experience that spans a broad portfolio of skills, including cloud computing, software deployment, process automation, shell scripting and configuration management, as well as Agile development and Scrum. This has allowed me to excel at solving challenges in cloud computing and across the entire IT infrastructure, together with my deep interest in OpenStack, Ceph, Docker and the open-source community.

  • Joel Berman

    Do many tools exist for monitoring, fixing any errors, performance tuning, fast backup/recovery? How long for a good sysadmin to pick up the skills?

    • Alen Komljen

      Ceph RBD and RADOS are production ready, but I can’t say that for CephFS. When you have distributed storage, that means you already have backup/recovery, of course only if you planned your storage correctly, picked the right replication level, etc. I’m more of a DevOps guy, so maybe I’m not the right person to say how much time a sysadmin will need to pick up the skills. If he has already worked with some distributed storage solution, probably much less. Just remember that Ceph is a complex system and it is more than file storage.

  • Have you tried native CephFS for instance storage? I have done some testing and found it to be fairly unstable during large copies and OSD failures. I agree that RADOS/block devices have worked great for Glance and Cinder.

    • Alen Komljen

      Currently I’m using it for Glance, Cinder, and object storage through the S3 API or Swift. When I say Cinder, that also means booting VMs from volumes stored on Ceph block storage. I haven’t tried CephFS on instances.

  • Pingback: CephFS filesystem | TechBar


  • RK

    What is your opinion on using ceph for personal backups? The solution for mitigating dead drives seems interesting.