Gartner Critical Capabilities for Scale-Out File System Storage – 27 January 2015

ID:G00269990

Analyst(s): Arun Chandrasekaran, Santhosh Rao

VIEW SUMMARY

Analytics, collaboration and cost-effective data retention are key imperatives that are driving interest in scale-out file system storage for I&O leaders. This research compares nine scale-out file system storage products on their capability to support key use cases via seven critical capabilities.

Overview

Key Findings

  • The products in this research are sufficiently differentiated from each other by use case or specific capabilities unique to each product to make them appropriate for purchase.
  • Scale-out file system storage products face competition from object storage products, due to object storage’s better scalability, easier management and robust multitenancy, as well as from traditional, scale-up network-attached storage products, due to NAS’s growing capacity and better interoperability.
  • Although nascent, cloud-based deployments of scale-out file system products are expected to challenge the growth of on-premises deployments due to the promise of low entry costs from public cloud infrastructure as a service, rapid scalability and a growing ecosystem of independent software vendors.
  • Big data analytics and cloud storage are emerging use cases for scale-out file system storage.

Recommendations

  • Focus on the workload characteristics by understanding storage needs across the critical capabilities, so that the appropriate product can be implemented to meet workload requirements.
  • Validate performance claims with proofs of concept, given that performance varies greatly by protocol type and file sizes.
  • Evaluate scale-out file system storage products for their interoperability with the ISV solutions that are dominant in your environment and for their support of public cloud IaaS.
  • Include an adequate training budget during procurement, because managing scale-out file system storage differs from storage area network management; as a result, storage administrators may need more training.

What You Need to Know

The growing demand for storage products that can scale linearly in capacity and performance to manage unstructured data is propelling scale-out file system products to the forefront for use cases such as high-performance computing (HPC), file sharing, backup and archiving.

In this research, we rate nine scale-out file system storage products on their ability to support four use cases by means of those products’ capabilities, which are critical to those cases. As revealed by the analysis in this research, the evaluated products, for the most part, vary greatly in their architecture, capabilities and alignment with the aforementioned use cases. Although many vendors continue to fine-tune their products to focus on specific use cases, the leading vendors in this research cater to a wide variety of use cases in enterprise environments.

I&O leaders must carefully select a scale-out file system storage product through a rigorous planning process that involves thoroughly evaluating the products’ critical capabilities. In addition, because awareness of scale-out file system storage and global namespaces is uncommon in enterprise IT organizations, I&O leaders should allocate a portion of the scale-out file system storage budget to training on the technology.

Analysis

Critical Capabilities Use-Case Graphics

Figure 1. Vendors’ Product Scores for the Overall Use Case

Source: Gartner (January 2015)

Figure 2. Vendors’ Product Scores for the Commercial HPC Use Case

Source: Gartner (January 2015)

Figure 3. Vendors’ Product Scores for the Large Home Directories Use Case

Source: Gartner (January 2015)

Figure 4. Vendors’ Product Scores for the Backup Use Case

Source: Gartner (January 2015)

Figure 5. Vendors’ Product Scores for the Archiving Use Case

Source: Gartner (January 2015)

Vendors

Dell Fluid File System

Dell Fluid File System (FluidFS) is based on the Exanet assets, which Dell acquired at the end of 2009. FluidFS supports several Dell storage arrays in the back end, including EqualLogic and Compellent. In the Version 3 release of FluidFS, which came out in June 2013, Dell added several new features, such as data reduction (e.g., deduplication and compression), NFSv4, 10GbE support and a unified management interface with Compellent Enterprise Manager. Many of the capabilities evaluated in this research, such as capacity, performance and resiliency, vary depending on FluidFS’s back-end storage arrays.

For example, the EqualLogic solution is designed to be easy to use for small and midsize businesses, while the Compellent solution targets performance-oriented deployments. FluidFS has a scale-out architecture based on high-availability, active-active pairs, and it stripes metadata and data across nodes in the cluster for performance and data protection. Although the Version 3 release was a significant improvement over the prior version, FluidFS still does not support multitenancy or WORM (see Note 1). It also lacks native tiering and only supports Server Message Block (SMB) v.2.0, although Dell has indicated availability of SMB 3.0 support in 1Q15.

EMC Isilon

Among the distributed file systems for scalable capacity and performance on the market, Isilon stands out, with its easy-to-deploy clustered storage appliance approach and well-rounded feature sets. The product includes a tightly integrated file system, volume manager and data protection in one software layer; a clusterwide snapshot capability at a granular level; asynchronous replication; high availability with multiple failover nodes; fast disk rebuild time; and a policy-based migration tool. In 2014, EMC added features such as postprocess deduplication and SMB 3.0 multichannel support.

From a performance standpoint, Isilon backup processes can be accelerated by adding the A100 performance accelerator node. From a security standpoint, Isilon has native encryption for data at rest, and SmartLock provides WORM capabilities that meet compliance requirements, such as SEC 17a-4 and HIPAA. Isilon also supports Hadoop deployments by uniquely supporting HDFS as a protocol. Isilon does not support compression, and geographically distributed deployments of Isilon can be complex and expensive to manage, due to the replication overhead and the lack of dispersed erasure coding.

Hitachi NAS Platform

The Hitachi NAS (HNAS) series is based on the SiliconFS object-based file system. In 2014, Hitachi introduced support for object-based replication, hardware-accelerated deduplication and automated tiering to the cloud via the AWS S3 API. The HNAS deduplication engine is unique in that it executes at the hardware level, using a field-programmable gate array (FPGA), thus relieving the system CPU and memory of this process. The HNAS deduplication engine automatically throttles based on workload levels. HNAS also introduced data migration tools that decrease data migration time by automatically assessing file data on third-party NFS servers and setting up associations. HNAS’s file-tiering capabilities include a built-in policy manager and a mechanism to automatically place the metadata table in the fastest tier to increase directory search speeds.

HNAS integrates with a wide variety of backup and archiving independent software vendors (ISVs). Although the HNAS 4000 series is rated high on performance, it can only scale up to eight nodes and lacks compression support. HNAS also lacks support for the SMB 3.0 protocol, which can affect its availability in Windows environments, because it won’t be able to handle transparent failovers.

HP StoreAll Storage

HP StoreAll Storage is based on the Ibrix parallel file system and has unique features such as HP Labs’ StoreAll Express Query, which can perform extremely fast metadata searches of massive content repositories. HP StoreAll supports up to 16PB within a single namespace. Automated, policy-based data tiering is a standard feature in the product, and integration with tools such as HP Systems Insight Manager (SIM) and HP Storage Essentials simplifies manageability. The StoreAll series has native retention, WORM and auditing features, OpenStack Swift API support and broad support for archiving ISVs, making it an attractive product for petabyte (PB)-scale archiving. HP also packages all hardware and software components in a single, unified pricing scheme.

HP StoreAll has only modest efficiency features and relies on back-end storage arrays for thin provisioning. In addition, the product lacks deduplication and compression, and HP has published few performance benchmarks for the product.

Huawei OceanStor 9000

Huawei offers two clustered network-attached storage (NAS) products: the OceanStor N8000 series and the OceanStor 9000 series. Gartner evaluated the latter in this research. OceanStor 9000 is based on Huawei’s proprietary Wushan file system and can scale out to 288 nodes, which is one of the highest node counts among scale-out file system storage products. Huawei OceanStor 9000 also supports NFS, SMB and InfiniBand protocols; the Amazon Web Services S3 API on the front end; and an HDFS plug-in. To further stimulate the demand for its product, Huawei has been aggressive in submitting it to publicly available performance benchmarks, such as those from the Standard Performance Evaluation Corp. (SPEC).

OceanStor 9000 has a resilient architecture and supports erasure coding and internode balancing; however, it lacks efficiency features, such as deduplication and compression. Huawei’s service, support and reseller network continues to be weak for this product line outside China.

IBM Elastic Storage

IBM’s Elastic Storage is a software-only platform based on the mature and scalable General Parallel File System (GPFS). Elastic Storage supports object access, file sharing, virtualization and analytics on a single converged platform. It is closely integrated with IBM’s FileNet for content management. The product scored highly on the scalability and performance capabilities. In 2014, IBM made a number of enhancements to Elastic Storage, adding an interface for OpenStack Swift (for object storage), as well as file encryption and NFSv4 support. Elastic Storage also includes a rich replication feature that supports two-way, three-way and metadata-level replication of individual files or the entire file system.

Elastic Storage is widely deployed for HPC and archive use cases, with actual deployments exceeding 10PB production capacities in some cases. However, Elastic Storage lacks features such as built-in deduplication, compression and thin provisioning. Although IBM has made improvements by modeling the graphical user interface (GUI) after the popular XIV interface, overall manageability continues to be complex.

NetApp Clustered Data Ontap 8.x

Clustered Data Ontap is a unified storage OS from NetApp, and this research evaluates Clustered Data Ontap v.8.x, which adds a global namespace, load balancing capabilities and federated management to the feature set that has made its nonclustered file systems popular. Clustered Data Ontap can support as many as 12 failover node pairs, which can scale to more than 100PB. In addition, the product enables user-transparent migration among different node pairs to perform load balancing, easing management complexities with high availability in a large environment. NetApp has been in a market-leading position in consolidating Windows and Unix/Linux file servers for home directories, and Clustered Data Ontap brings its NFS and SMB (v3.0) support into a more scalable environment.

The latest release of Clustered Data Ontap (v.8.3), which launched in October 2014, nearly brought the product to feature parity with the traditional seven-mode, including the addition of MetroCluster. The v.8.3 release does not include support for seven-mode, clearly signaling NetApp’s intention to focus its innovation on the Clustered Data Ontap architecture moving forward. With regard to the critical capabilities covered in this research, Clustered Data Ontap is highly rated for its storage efficiency, as well as interoperability, due to robust thin provisioning, data reduction and caching capabilities, as well as its tight integration with leading ISV products. Data Ontap 8.3 does not support SnapLock for WORM capabilities, lacks a parallel file system and involves a complex migration process for most seven-mode customers.

Quantum StorNext

Quantum is an established producer of data protection and data management products and is especially known for its disk backup appliances and tape libraries. Quantum’s StorNext scale-out file system offering is purpose-built to address the high-performance streaming of rich media, cross-OS file sharing and long-term archiving in industries such as life sciences, energy, media and entertainment, and government. In the past year, Quantum has enhanced StorNext, giving it the ability to handle bigger datasets and more IP-network-centric workloads, and to embed more-flexible, automated storage tiering.

In August 2014, Quantum introduced StorNext Connect, enabling easier deployment and operational management of multiple StorNext systems. StorNext is available as a software-only solution and as an appliance with dedicated hardware for metadata controllers, NAS gateways and archival storage. The product has tight integration with tape and Quantum’s object storage, and takes advantage of policy-based tiering to lower the total cost of ownership (TCO). Although StorNext is rated well for its performance, it lacks thin provisioning and snapshots. The product line remains niche, lacking broad appeal across vertical industries and use cases.

Red Hat Storage Server

The acquisition of Gluster by Red Hat in 4Q11 was beneficial for Gluster, since it brought backing from a pioneer in open-source software. Since then, Red Hat has relaunched Gluster’s open-source storage product, GlusterFS, with more stability, better features and additional prepackaged software. The product is a scale-out, multiprotocol (NFS, RESTful APIs, Server Message Block [SMB]), open-source storage software solution with PB-scale capacity and improved snapshot and replication capabilities. Red Hat Storage Server is a preintegrated software product consisting of Red Hat Enterprise Linux (RHEL), GlusterFS and the extensible file system (XFS). It can be installed on bare-metal hardware, or in kernel-based virtual machine (KVM) or VMware hypervisors, to pool storage resources.

The product benefits from Red Hat’s complementary open-source community projects and technical support capabilities, which include community, standard and premium support options. However, the product lacks some capabilities that enterprise IT buyers aspire to in a file system product, such as tiering, and native data reduction features, such as compression and deduplication. The IHV ecosystem supporting the product is small, but growing.

Context

Traditionally, the major market for scale-out file system storage has been academic and commercial HPC environments for workloads such as genomic sequencing, financial modeling, 3D animation, weather forecasting and seismic analysis. As such, scale-out file system storage solutions have focused on scalable capacity, raw computing power and aggregated bandwidth, with data protection, security and efficiency only as secondary considerations.

However, ever-increasing data growth — chiefly, unstructured data growth — in the enterprise has led many I&O leaders in these organizations to deploy the technology to support large home directories, backup and archiving. For these use cases, better security and multitenancy, easier manageability, robust data protection and ISV interoperability are growing in importance.

In addition to simply supporting these four use cases (academic HPC not included), I&O leaders are embracing scale-out file system storage for its added benefits. First and foremost, the technology includes embedded functionality for storage management, resiliency and security at the software level, easing the tasks related to those functions in the I&O organization. The technology also offers nearly linear horizontal scaling and delivers highly aggregated performance through parallelism. This means that scale-out file system storage enables pay-as-you-grow storage capacity and performance, making it a cost-effective alternative to scale-up storage, in particular, where I&O leaders are forced to purchase more storage than needed to ensure storage growth does not outpace capacity. Lastly, most scale-out file system storage vendors use standard x86 hardware, thus reducing the hardware acquisition costs.

Big data analytics, a scenario in which these file systems could run map/reduce processing jobs, and cloud storage for file sync and share and other SaaS workloads are emerging use cases for scale-out file system storage products.

Product/Service Class Definition

Scale-out file system storage is a category of storage products that use a global namespace to aggregate a loose file cluster that resides across distributed storage modules or nodes. In a scale-out file system storage environment, capacity, performance, throughput and connectivity scale with the number of nodes in the system. That being said, scalability is often limited by storage hardware and networking architectural constraints.

Critical Capabilities Definition

Capacity

The ability of the product to support growth in storage capacity in a nearly linear manner with capacity requirements often extending from hundreds of TBs to the PB scale.

Scoring for this capability takes into consideration the scalability limitations of a product’s file system capacity, in theory and in real-world practice. Scalability limitations include maximum storage capacity, the number of files/directories/user connections supported, and the number of nodes and disk drives supported by a file system, volume or namespace.

Storage Efficiency

The ability of the product to support storage efficiency technologies, such as compression, deduplication, thin provisioning and automated tiering to reduce TCO.

Scoring for this capability takes into consideration data reduction ratios, the performance impact of data reduction, and the granularity and application transparency of tiering algorithms.

Interoperability

The ability of the product to support third-party ISV applications, public cloud APIs and multivendor hypervisors.

Scoring for this capability takes into consideration the breadth and depth of ISV/independent hardware vendor (IHV) support, integration with common hypervisor and cloud APIs, flexible deployment models and support for various access protocols.

Manageability

The ability of the product to support automation, management and monitoring and provide reporting tools.

Reporting tools and programs can include single-pane management consoles, as well as monitoring and reporting tools designed to help storage team members manage systems seamlessly, monitor system usage and efficiencies, and anticipate and correct system alarms and fault conditions before or soon after they occur.

Performance

The aggregated IOPS, bandwidth and low latency that can be delivered by the cluster functioning at maximum specifications and observed in real-world configurations.

Scoring for this capability takes into consideration real-world implementations, as well as publicly available performance benchmarks, such as SPEC.

Resiliency

The ability of the product to provision a high level of system availability and data protection.

Resiliency features contributing to this capability include high tolerance for simultaneous disk and/or node failures, fault isolation techniques, built-in protection against data corruption and other techniques (such as snapshots and replication) to meet customers’ recovery point objectives (RPOs) and recovery time objectives (RTOs).

Security and Multitenancy

The depth and breadth of a product’s native security and multitenancy features, including granular access control, user-driven encryption, malware protection and data immutability.

Scoring was based on granularity of multitenancy settings, depth of data-at-rest encryption capabilities, integration with Lightweight Directory Access Protocol (LDAP)/Active Directory systems with user mapping, role-based access control and WORM capabilities for governance and compliance.

Use Cases

Overall

This is the general rating for scale-out file system storage products, based on the Overall weightings in Table 1.

Archiving

In this use case, an enterprise uses scale-out file system storage to meet the requirements of long-term data retention.

Scale-out file system products have been used as an archiving target for regulatory and cost optimization reasons. Security features that can guarantee data immutability (such as WORM), capacity scalability and resiliency are highly weighted for this use case.

Backup

Enterprises use scale-out file system storage to meet the requirements of large-scale, disk-based backup for low RTOs and RPOs.

I&O leaders have used scale-out file system storage as a backup target for years. This is because scale-out file system storage provides added scalability for large backup datasets to meet increasing demands for disk-based backup. Resiliency, storage efficiency and interoperability with a variety of backup ISVs are important selection considerations, and are heavily weighted.

Commercial HPC

In this use case, an enterprise uses scale-out file system storage to provide high throughput and parallel read-and-write access to large volumes of data.

Commercial HPC is the most prominent use case for scale-out file system storage, and most scale-out file system storage products are built to address it. Because they are the most important factors in choosing a product for commercial HPC, performance, capacity and resiliency are weighted heavily in this use case.

Large Home Directories

Enterprises use scale-out file system storage to support large home directories, as they would with scale-up NAS, only on a larger scale.

In environments characterized by file server sprawl, scale-out file system storage simplifies storage management by eliminating physical, client-to-server mappings through global namespaces, making it an ideal platform to perform tasks such as automated storage tiering and user-transparent data migration. Scale-out file system storage’s ability to provide operational simplicity and enable linear scalability also makes it particularly useful for consolidating file server or NAS filer sprawl. Resiliency, storage efficiency, and performance are weighted heavily in this use case.

Vendors Added and Dropped

Added

Huawei: In the 2013 release of this Critical Capabilities, Huawei did not yet meet our inclusion criteria, because the company did not have at least 10 customers with 300TB or more in production and/or did not have a fully owned product. However, the company now meets these criteria, along with the other requirements for inclusion.

Dropped

Nexenta has been excluded from this study due to its lack of support for the namespace cluster plug-in since the 4.0 release of NexentaStor.

Inclusion Criteria

The products covered in this research include scalable file system storage offerings with a sizable footprint in the market. In this research, we define scalable file system storage as storage that allows for (at a minimum):

  • 100TB per file system
  • 1PB per namespace, which can span two nodes or more

To be included in this research, scale-out file system storage products need:

  • At least 10 production customers — all with at least 300TB residing on the product
  • Support for horizontal scaling of drive capacity and throughput in a cluster mode or in independent node additions with a global namespace
  • The ability to support all four use cases in this research
  • Three or more vendor-provided customer references for the product
  • A deployment in at least two major global geographies (e.g., North America, EMEA, Latin America or the Asia/Pacific region)
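The definition and criteria above amount to a set of threshold checks. As a minimal sketch in Python, assuming a hypothetical `Candidate` record (the field and function names are invented here for illustration; the research states only the thresholds), a screening check might look like this:

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # All field names are hypothetical; only the thresholds come from the text.
    tb_per_file_system: float      # largest supported file system, in TB
    pb_per_namespace: float        # largest supported namespace, in PB
    nodes_per_namespace: int       # nodes a single namespace can span
    customers_with_300tb: int      # production customers with >= 300TB on the product
    scales_horizontally: bool      # capacity and throughput scale under a global namespace
    use_cases_supported: int       # how many of the four use cases are supported
    customer_references: int       # vendor-provided customer references
    major_geographies: int         # major global geographies with deployments


def is_scalable_file_system(c: Candidate) -> bool:
    """Minimum definition: 100TB per file system, 1PB per namespace over two or more nodes."""
    return (
        c.tb_per_file_system >= 100
        and c.pb_per_namespace >= 1
        and c.nodes_per_namespace >= 2
    )


def meets_inclusion_criteria(c: Candidate) -> bool:
    """Apply the definition plus the five listed inclusion requirements."""
    return (
        is_scalable_file_system(c)
        and c.customers_with_300tb >= 10
        and c.scales_horizontally
        and c.use_cases_supported == 4
        and c.customer_references >= 3
        and c.major_geographies >= 2
    )
```

A product that misses any single threshold, for example one with only nine qualifying customers, would be screened out under this sketch.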

Vendors, such as Intel, Panasas and DataDirect Networks, that focus on the technical computing (academic HPC) market and/or don’t cater to all the use cases outlined in this document are excluded from this study.

Table 1. Weighting for Critical Capabilities in Use Cases
| Critical Capabilities | Overall | Archiving | Backup | Commercial HPC | Large Home Directories |
|---|---|---|---|---|---|
| Capacity | 15% | 20% | 12% | 18% | 10% |
| Storage Efficiency | 13% | 8% | 20% | 3% | 20% |
| Interoperability | 9% | 10% | 15% | 7% | 6% |
| Manageability | 11% | 12% | 8% | 12% | 12% |
| Performance | 20% | 10% | 15% | 40% | 15% |
| Resiliency | 21% | 18% | 25% | 15% | 25% |
| Security and Multitenancy | 11% | 22% | 5% | 5% | 12% |
| Total | 100% | 100% | 100% | 100% | 100% |
As of January 2015

Source: Gartner (January 2015)

Critical Capabilities Rating

Table 2. Product/Service Rating on Critical Capabilities
| Product or Service Ratings | Dell Fluid File System | EMC Isilon | Hitachi NAS Platform | HP StoreAll Storage | Huawei OceanStor 9000 | IBM Elastic Storage | NetApp Clustered Data Ontap 8.x | Quantum StorNext | Red Hat Storage Server |
|---|---|---|---|---|---|---|---|---|---|
| Capacity | 3.5 | 4.3 | 3.8 | 3.9 | 4.2 | 4.8 | 4.2 | 3.7 | 3.3 |
| Storage Efficiency | 3.1 | 3.5 | 3.5 | 2.7 | 2.9 | 2.8 | 4.3 | 3.1 | 2.1 |
| Interoperability | 3.4 | 4.2 | 3.7 | 3.8 | 3.2 | 3.8 | 4.7 | 2.6 | 3.5 |
| Manageability | 3.0 | 4.3 | 3.6 | 3.7 | 3.0 | 3.7 | 4.1 | 3.4 | 2.7 |
| Performance | 3.6 | 4.1 | 3.9 | 2.6 | 4.1 | 4.3 | 3.9 | 4.1 | 2.8 |
| Resiliency | 3.5 | 4.3 | 3.8 | 3.9 | 3.4 | 4.1 | 4.2 | 3.2 | 3.7 |
| Security and Multitenancy | 2.9 | 4.5 | 3.9 | 4.2 | 3.2 | 3.7 | 3.8 | 3.0 | 3.3 |
As of January 2015

Source: Gartner (January 2015)

Table 3 shows the product/service scores for each use case. The scores, which are generated by multiplying the use case weightings by the product/service ratings, summarize how well the critical capabilities are met for each use case.

Table 3. Product Score in Use Cases
| Use Cases | Dell Fluid File System | EMC Isilon | Hitachi NAS Platform | HP StoreAll Storage | Huawei OceanStor 9000 | IBM Elastic Storage | NetApp Clustered Data Ontap 8.x | Quantum StorNext | Red Hat Storage Server |
|---|---|---|---|---|---|---|---|---|---|
| Overall | 3.34 | 4.17 | 3.76 | 3.49 | 3.51 | 3.96 | 4.14 | 3.39 | 3.08 |
| Archiving | 3.28 | 4.25 | 3.77 | 3.71 | 3.48 | 3.99 | 4.13 | 3.30 | 3.17 |
| Backup | 3.35 | 4.11 | 3.73 | 3.45 | 3.43 | 3.86 | 4.22 | 3.29 | 3.07 |
| Commercial HPC | 3.43 | 4.20 | 3.81 | 3.33 | 3.74 | 4.18 | 4.09 | 3.62 | 3.07 |
| Large Home Directories | 3.30 | 4.13 | 3.74 | 3.47 | 3.40 | 3.83 | 4.15 | 3.33 | 3.03 |
As of January 2015

Source: Gartner (January 2015)

To determine an overall score for each product/service in the use cases, multiply the ratings in Table 2 by the weightings shown in Table 1.
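The weighted-sum calculation described above can be sketched in a few lines of Python. The weightings and ratings are transcribed from Tables 1 and 2; the names `CAPABILITIES`, `WEIGHTS`, `RATINGS` and `use_case_score` are illustrative choices for this sketch, not part of the research. Each score is the sum, over the seven capabilities, of the use-case weighting multiplied by the product's rating.

```python
# Capability order shared by every list below (row order of Tables 1 and 2).
CAPABILITIES = [
    "Capacity", "Storage Efficiency", "Interoperability", "Manageability",
    "Performance", "Resiliency", "Security and Multitenancy",
]

# Table 1: weightings in percent, one list per use case, in CAPABILITIES order.
WEIGHTS = {
    "Overall":                [15, 13, 9, 11, 20, 21, 11],
    "Archiving":              [20, 8, 10, 12, 10, 18, 22],
    "Backup":                 [12, 20, 15, 8, 15, 25, 5],
    "Commercial HPC":         [18, 3, 7, 12, 40, 15, 5],
    "Large Home Directories": [10, 20, 6, 12, 15, 25, 12],
}

# Table 2: ratings per product, in CAPABILITIES order.
RATINGS = {
    "Dell Fluid File System":          [3.5, 3.1, 3.4, 3.0, 3.6, 3.5, 2.9],
    "EMC Isilon":                      [4.3, 3.5, 4.2, 4.3, 4.1, 4.3, 4.5],
    "Hitachi NAS Platform":            [3.8, 3.5, 3.7, 3.6, 3.9, 3.8, 3.9],
    "HP StoreAll Storage":             [3.9, 2.7, 3.8, 3.7, 2.6, 3.9, 4.2],
    "Huawei OceanStor 9000":           [4.2, 2.9, 3.2, 3.0, 4.1, 3.4, 3.2],
    "IBM Elastic Storage":             [4.8, 2.8, 3.8, 3.7, 4.3, 4.1, 3.7],
    "NetApp Clustered Data Ontap 8.x": [4.2, 4.3, 4.7, 4.1, 3.9, 4.2, 3.8],
    "Quantum StorNext":                [3.7, 3.1, 2.6, 3.4, 4.1, 3.2, 3.0],
    "Red Hat Storage Server":          [3.3, 2.1, 3.5, 2.7, 2.8, 3.7, 3.3],
}


def use_case_score(product: str, use_case: str) -> float:
    """Weighted sum of one product's ratings for one use case."""
    weights = WEIGHTS[use_case]
    ratings = RATINGS[product]
    # Weights are stored as whole percentages, so divide the sum by 100.
    return sum(w * r for w, r in zip(weights, ratings)) / 100


if __name__ == "__main__":
    for product in RATINGS:
        row = ", ".join(
            f"{use_case}: {use_case_score(product, use_case):.2f}"
            for use_case in WEIGHTS
        )
        print(f"{product} -> {row}")
```

As a hand check, the Overall score for Dell Fluid File System works out to 0.15 × 3.5 + 0.13 × 3.1 + 0.09 × 3.4 + 0.11 × 3.0 + 0.20 × 3.6 + 0.21 × 3.5 + 0.11 × 2.9 = 3.338, which rounds to the 3.34 shown in Table 3. Replacing a `WEIGHTS` entry with weightings that reflect a specific workload would re-rank the products for that workload; any such custom weightings are the reader's own assumption, not Gartner's.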

http://www.gartner.com/technology/reprints.do?id=1-28KO5XO&ct=150128&st=sb
