| By Greg Schulz | Article Rating: |
|
| November 27, 2012 11:30 AM EST | Reads: |
3,550 |
Recently while I was in Europe presenting some sessions at conferences and doing some seminars, I was invited by Ed Saipetch (@edsai) of Inktank.com to attend the first Ceph Day in Amsterdam.
As luck or fate would turn out, I was in Nijkerk which is about an hour train ride from Amsterdam central station plus a free day in my schedule. After a morning train ride and nice walk from Amsterdam Central I arrived at the Tobacco Theatre (a former tobacco trading venue) where Ceph Day was underway, and in time for lunch of Krokettens sandwich.
Let's take a quick step back and address for those not familiar what is Ceph (Cephalanthera) and why it was worth spending a day to attend this event. Ceph is an open source distributed object scale out (e.g. cluster or grid) software platform running on industry standard hardware.
Ceph is used for deploying object storage, cloud storage and managed services, general purpose storage for research, commercial, scientific, high performance computing (HPC) or high productivity computing (commercial) along with backup or data protection and archiving destinations. Other software similar in functionality or capabilities to Ceph include OpenStack Swift, Basho Riak CS, Cleversafe, Scality and Caringo among others. There are also the tin wrapped software (e.g. appliances or pre-packaged) solutions such as Dell DX (Caringo), DataDirect Networks (DDN) WOS, EMC ATMOS and Centera, Amplidata and HDS HCP among others. From a service standpoint, these solutions can be used to build services similar Amazon S3 and Glacier, Rackspace Cloud files and Cloud Block, DreamHost DreamObject and HP Cloud storage among others.
At the heart of Ceph is RADOS a distributed object store that consists of peer nodes functioning as object storage devices (OSD). Data can be accessed via REST (Amazon S3 like) APIs, Libraries, CEPHFS and gateway with information being spread across nodes and OSDs using a CRUSH based algorithm (note Sage Weil is one of the authors of CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data). Ceph is scalable in terms of performance, availability and capacity by adding extra nodes with hard disk drives (HDD) or solid state devices (SSDs). One of the presentations pertained to DreamHost that was an early adopter of Ceph to make their DreamObjects (cloud storage) offering.
In addition to storage nodes, there are also an odd number of monitor nodes to coordinate and manage the Ceph cluster along with optional gateways for file access. In the above figure (via DreamHost), load balancers sit in front of gateways that interact with the storage nodes. The storage node in this example is a physical server with 12 x 3TB HDDs each configured as a OSD.
In the DreamHost example above, there are 90 storage nodes plus 3 management nodes, the total raw storage capacity (no RAID) is about 3PB (12 x 3TB = 36TB x 90 = 3.24PB). Instead of using RAID or mirroring, each objects data is replicated or copied to three (e.g. N=3) different OSDs (on separate nodes), where N is adjustable for a given level of data protection, for a usable storage capacity of about 1PB.
Note that for more usable capacity and lower availability, N could be set lower, or a larger value of N would give more durability or data protection at higher storage capacity overhead cost. In addition to using JBOD configurations with replication, Ceph can also be configured with a combination of RAID and replication providing more flexibility for larger environments to balance performance, availability, capacity and economics.
One of the benefits of Ceph is the flexibility to configure it how you want or need for different applications. This can be in a cost-effective hardware light configuration using JBOD or internal HDDs in small form factor generally available servers, or high density servers and storage enclosures with optional RAID adapters along with SSD. This flexibility is different from some cloud and object storage systems or software tools which take a stance of not using or avoiding RAID vs. providing options and flexibility to configure and use the technology how you see fit.
Here are some links to presentations from Ceph Day:
Introduction and Welcome by Wido den Hollander
Ceph: A Unified Distributed Storage System by Sage Weil
Ceph in the Cloud by Wido den Hollander
DreamObjects: Cloud Object Storage with Ceph by Ross Turk
Cluster Design and Deployment by Greg Farnum
Notes on Librados by Sage Weil
While at Ceph day, I was able to spend a few minutes with Sage Weil Ceph creator and founder of inktank.com to record a pod cast (listen here) about what Ceph is, where and when to use it, along with other related topics. Also while at the event I had a chance to sit down with Curtis (aka Mr. Backup) Preston where we did a simulcast video and pod cast. The simulcast involved Curtis recording this video with me as a guest discussing Ceph, cloud and object storage, backup, data protection and related themes while I recorded this pod cast.
One of the interesting things I heard, or actually did not hear while at the Ceph Day event that I tend to hear at related conferences such as SNW is a focus on where and how to use, configure and deploy Ceph along with various configuration options, replication or copy modes as opposed to going off on erasure codes or other tangents. In other words, instead of focusing on the data protection protocol and algorithms, or what is wrong with the competition or other architectures, the Ceph Day focused was removing cloud and object storage objections and enablement.
Where do you get Ceph? You can get it here, as well as via 42on.com and inktank.com.
Thanks again to Sage Weil for taking time out of his busy schedule to record a pod cast talking about Ceph, as well 42on.com and inktank for hosting, and the invitation to attend the first Ceph Day in Amsterdam.

Returning to Amsterdam central station after Ceph Day
Ok, nuff said.
Cheers Gs
Greg Schulz - Author Cloud and Virtual Data Storage Networking (CRC Press, 2011), The Green and Virtual Data Center (CRC Press, 2009), and Resilient Storage Networks (Elsevier, 2004)
twitter @storageio
All Comments, (C) and (TM) belong to their owners/posters, Other content (C) Copyright 2006-2012 StorageIO All Rights Reserved
Read the original blog entry...
Published November 27, 2012 Reads 3,550
Copyright © 2012 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Greg Schulz
Greg Schulz is founder of the Server and StorageIO (StorageIO) Group, an IT industry analyst and consultancy firm. Greg has worked with various server operating systems along with storage and networking software tools, hardware and services. Greg has worked as a programmer, systems administrator, disaster recovery consultant, and storage and capacity planner for various IT organizations. He has worked for various vendors before joining an industry analyst firm and later forming StorageIO.
In addition to his analyst and consulting research duties, Schulz has published over a thousand articles, tips, reports and white papers and is a sought after popular speaker at events around the world. Greg is also author of the books Resilient Storage Network (Elsevier) and The Green and Virtual Data Center (CRC). His blog is at www.storageioblog.com and he can also be found on twitter @storageio.
- Cloud People: A Who's Who of Cloud Computing
- Cloud Expo New York Speaker Profile: Dave Linthicum – Cloud Technology Partners
- Cloud Expo New York: Cloud Is Changing the Economics of Business
- Best CIO Practices Shared from SHI’s Customers
- Big Data Isn’t About the Database, It’s About the Application
- Cloud Expo New York: Rethink IT and Reinvent Business with IBM SmartCloud
- Cloud Expo New York: API Security, Does My Business Need an OAuth Server?
- Session Topics: 12th Cloud Expo / Cloud Expo New York
- Cloud Expo New York: Developing the World’s First IaaS Marketplace
- Cloud Expo NY: Best Practices for Delivering Oracle Database as a Service
- BEA Updates WebLogic SOA Portal for Web 2.0 Era
- UNIT4 Business Software: Three Retail Accounting Tips to Help Retailers Leverage the Cloud and Back Office Systems
- Cloud People: A Who's Who of Cloud Computing
- Cloud Expo New York Speaker Profile: Dave Linthicum – Cloud Technology Partners
- Cloud Expo New York: Cloud Is Changing the Economics of Business
- Best CIO Practices Shared from SHI’s Customers
- Cloud Expo New York: Deploying Hybrid Cloud for Performance and Uptime
- Big Data Isn’t About the Database, It’s About the Application
- Cloud Expo New York: Delivering Digital Marketing on the Cloud
- Cloud Expo New York: Rethink IT and Reinvent Business with IBM SmartCloud
- Cloud Expo New York: API Security, Does My Business Need an OAuth Server?
- Cloudant to Exhibit at Cloud Expo & Big Data Expo New York
- Cloud Expo New York: Basics of SSD Technology and Its Use in Cloud
- Session Topics: 12th Cloud Expo / Cloud Expo New York
- The i-Technology Right Stuff
- The Top 150 Players in Cloud Computing
- Who Are The All-Time Heroes of i-Technology?
- Where Are RIA Technologies Headed in 2008?
- Get the Message
- i-Technology Viewpoint: Is Web 2.0 the Global SOA?
- ESB Myth Busters: 10 Enterprise Service Bus Myths Debunked
- i-Technology Viewpoint: Thinking Outside the VC Box
- i-Technology Viewpoint: When to Leave Your First IT Job
- SOA Web Services Edge Conference Coverage on SYS-CON.TV
- SYS-CON.TV's "SOA Web Services" and "Enterprise Open Source" Programs To Air in December
- Five Reasons Why Web 2.0 Matters
































