Let's start with a question.
When was the last time the Enterprise Storage [NAS, DAS and SAN] industry saw a new entrant challenge the incumbents and become successful?
Answer: 2001-2004, when NetApp successfully challenged EMC's dominance and picked up significant market share in the NAS space [NetApp today commands around 24% of the NAS market, while EMC has around 41%].
What happened at that time? Well, NetApp brought a new NAS appliance to the industry that was cheaper and a lot easier for IT folks to manage. There is another side to the story. EMC as we know it today was not exactly the same in 2003. Between 2003 and 2005, EMC went through a transformation under the leadership of Mr. Tucci, from an engineering-driven organization to a marketing-driven one. It bought a few companies which later changed the way people look at EMC. Some of the dazzlers it bought at that time were Documentum and VMware, much smaller but very interesting entities back then. But this is all past data. Why bring it up now? The next question will lead to the answer.
Question 2: The networking hardware segment [routers, switches] has already seen mass-scale commoditization. The HDD segment is also heavily commoditized. Surprisingly, the Enterprise Storage segment has resisted it till now. The question, therefore, is: how are they doing it?
Well, the easy and quick answer is that the storage technology space is not as transparent as the IETF-driven networking technologies and, more importantly, storage gear stores and manages the organization's data, which is critical to its survival, unlike networking gear, which only sees data in transit. High sensitivity to data has made organizations extremely risk-averse and prisoners of sorts to the particular set of vendors they bought their equipment from earlier.
Will my new system talk smoothly to my old systems? How do I trust that my data will not mutate (get corrupted or changed irretrievably), or that my storage infrastructure will not become shaky or unavailable, when I bring a relatively unknown vendor into the same setup and the vendors themselves have not certified the configuration? I want my vendor to guarantee that the setup will be stable, and I can demand that only if the setup is relatively homogeneous.
The life of data is another important piece of the puzzle. Since regulatory compliance requires firms to keep their internal data for a long period [some critical data must be stored for 25 years or more], IT managers tend to buy equipment from vendors who are likely to survive that long, and that leaves the buying decision heavily skewed towards the established players, making it difficult for new entrants to find a foothold.
All the leading storage system vendors capitalize on this fear in customers' minds, which largely explains why they have been successful in resisting commoditization of storage systems and in pushing their servers and disk arrays to customers year after year. If you have invested in NetApp gear, you will continue to buy NetApp boxes and software, and likewise for HP and EMC, unless the cost dynamics push you to change.
It is curious to note that all the leading storage OEMs today use processors, HDDs and flash drives from the same set of vendors and have their manufacturing outsourced to the same part of the world, yet they have maintained product differentiation in the customer's mind. Design patents and proprietary software are the secret sauce behind their dominance, they would argue. It is also curious that the relative market-share positions among the top five have not changed significantly over the past 5-6 years, with EMC leading the pack and IBM, NetApp and HP following from a distance. Quite miraculously, the incumbents have not really seen a challenger like NetApp in the last half a decade.
But is it really a miracle?
It is only when we look closely that the veil of miracle drops and we realize it is rather vigilant and clever market maneuvering by the incumbents that has not allowed a challenger to find success and sustain it the way NetApp did. Each time we saw a new entrant creating a niche for itself, we also knew that it was going to be bought over in a matter of months. When clustered storage became a hot domain for startups, we saw Spinnaker get acquired by NetApp and Isilon get bought by EMC. EMC bought XtremIO; HP bought 3PAR. When deduplication brought the spotlight onto efficient storage, IBM bought Storwize. Data Domain was bought by EMC just when people thought it would be the change-agent in disk-based backup. As many observed, the incumbents somehow always managed to gobble up new entrants before they became challengers.
How long can they sustain this? Well, they will sustain it as long as the game is played the same way by new entrants and buyers continue to look at storage the way they always have.
But what happens if the game itself changes? Is it realistic to expect the game to change when the incumbents have little incentive to change it? What incentive do customers have to take on the risk of changing the way they look at their IT puzzle?
We will explore that in the next post.
Tuesday, March 19, 2013
Commoditization of Enterprise Storage
Labels:
commodity storage,
EMC NetApp,
HP IBM,
NAS,
Storage Technology
Thursday, February 21, 2013
Future Energy Efficient Enterprise Storage Servers
When I was finishing my last post, another thought came to my mind. If 30% of a data centre's energy bill goes to storage and servers, and half of that is consumed by storage alone, we still have a large expense head to target. The Intel Xeon and other x86-based processors that most enterprise servers use today are quite power hungry. Given that processors are the biggest power consumers in server hardware, should we not be looking at making them more energy efficient?
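To get a feel for the sums involved, a back-of-envelope estimate helps. The sketch below is purely illustrative; the server count, CPU wattage, utilization, PUE and electricity tariff are all assumed numbers, not measurements or vendor figures.

    # Back-of-envelope annual energy cost of server processors in a data centre.
    # Every figure below is an assumption chosen for illustration only.
    def annual_energy_cost(servers, cpu_watts, avg_utilization, pue, price_per_kwh):
        avg_watts = servers * cpu_watts * avg_utilization
        kwh_per_year = avg_watts * 24 * 365 / 1000.0
        return kwh_per_year * pue * price_per_kwh   # PUE folds in cooling and power-distribution overhead

    # Assumed: 500 storage servers, ~95 W drawn by a Xeon-class CPU, 60% average utilization,
    # a PUE of 1.8 and $0.10 per kWh; the ARM-class figure assumes ~20 W per node.
    print("Xeon-class CPUs: $%.0f per year" % annual_energy_cost(500, 95, 0.6, 1.8, 0.10))
    print("ARM-class CPUs : $%.0f per year" % annual_energy_cost(500, 20, 0.6, 1.8, 0.10))

Even with such rough assumptions, the gap runs into tens of thousands of dollars a year for a modest server farm, which is why the processor question matters.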
ARM today enjoys the position of de-facto processor design for the energy-sensitive segments, particularly miniature and consumer devices such as smartphones. Intel brought out its low-power Atom processor (an x86 design) for this space, but somehow the server industry stayed away from Atom. Traditionally, server work profiles were considered too heavy for ARM-based processors, but the trend is changing slowly. Since ARM is inherently a lot more power efficient, there is interest in using clusters of ARM-core processors to replace the more powerful Intel variants. Baidu, the Chinese Google [search] equivalent, announced that it has deployed ARM-based processors from Marvell for its storage servers. The particular part Baidu is using is understood to be Marvell's 1.6GHz quad-core Armada processor, which Marvell launched in 2010. However, AMD and some semiconductor startups like Calxeda are also trying to bring a 64-bit version to the storage server market. In the last 3-4 years most storage vendors have moved to (mostly Intel-based) 64-bit processors for their enterprise servers, so it is quite obvious that for them to seriously consider an ARM-based processor, they would need at least a 64-bit version. Taking cognizance of this need, ARM has already announced two new core designs. Last October, ARM unveiled its 64-bit Cortex-A50 series of server processors. ZDNet reports that this design has already been licensed by AMD, Broadcom, Calxeda, HiSilicon, Samsung and STMicroelectronics. AMD announced that its first ARM-based server CPU is targeted for production in 2014.
[Embedded presentation: "AMD Bridges the X86 and ARM Ecosystems for the Data Center", from AMD]
At this point, it is not clear whether Intel's response will be another version of Atom or of Xeon. Storage vendors who have adopted Xeon in their storage controllers would definitely like Intel to make Xeon more energy efficient. Either way, we can expect data centres to become a lot more energy efficient compared to their present versions.
Labels:
AMD,
ARM,
Baidu,
Energy efficient storage,
Storage Technology
Thursday, February 7, 2013
Innovations towards more energy-efficient storage
[Chart: breakdown of data centre energy consumption. Source: ComEd]
Since power consumption has become such a big issue for data centres, there are many companies, like ComEd, whose business is entirely focused on solutions for reducing the energy use of data centres.
But given that 30% of that consumption is driven by servers and storage devices, the question arises as to why storage vendors are not bringing out more energy-efficient storage. The fact that 'energy-efficient storage' has not been highlighted as one of the major trends for 2013 tells us that addressing the problem is not very simple. Let us see why.
Electrical power efficiency in disk-based systems
Storage systems at present are primarily disk-based. Even our backup systems are predominantly disk-based, with the possible exception of mainframe systems. With disk-based systems, storage software has historically been designed assuming that all attached disks are online, i.e. the disks are connected to the data bus of the storage controller, powered on and always available to the higher layers. These systems traditionally cannot differentiate between a failed disk and a powered-off disk. The fact is that disk vendors initially did not provide any device state in the disk controller [the electrical circuitry attached to the disk that controls reads and writes] that would identify a disk as powered down. So the storage vendors, too, designed their file systems and RAID arrays with the assumption that disks are always powered on and active. Probably nobody imagined that hundreds of disks would one day be attached to a single storage controller.
With the rising clamour for better power management, by 2008-2010 the disk vendors had introduced power management schemes in SATA controllers. Since SATA is mostly used in near-line, backup and archive systems, and these systems have a large number of disks which are not used all the time, one can power down 'idle' disks where possible and achieve considerable power savings. SATA provides two link power management states in addition to the "Active" state. These states, "Partial" and "Slumber", by specification differ only in the command sent on the bus to enter the low-power state and in the return latency: Partial has a maximum return latency of 10 microseconds, while Slumber has a maximum return latency of 10 milliseconds [further read]. Storage systems need to tweak their file system and/or RAID controller to take advantage of SATA power management. Handling microseconds of latency is easy, but handling milliseconds of latency requires major design changes in the software.
EMC first brought this power management feature to its CLARiiON platform. The solution was to create a new RAID group and assign disks to it for power-down after 30 minutes of idle state. The controller recognizes these disk states and can wait for a maximum of 10 seconds for the disks to come back to the active state. EMC claims that this power-down feature saves around 54% on average [further read]. To my knowledge, other storage vendors are in the process of adopting this power-saving feature in their controllers. If they haven't done so already, it is probably because their disk-based systems would require fairly large design changes to accommodate these new disk states. I was personally involved in this analysis for one prominent storage vendor, and was made adequately aware of how deep the changes would go. My take, however, is that in the next 3-4 years most disk-based storage vendors will adopt SATA power management.
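On a plain Linux host one can see the same Partial/Slumber machinery exposed by the AHCI driver, which gives a feel for what the array vendors are wiring into their controllers. The sketch below assumes a Linux host whose kernel exposes link_power_management_policy under /sys/class/scsi_host; enterprise arrays implement the equivalent logic inside their own firmware.

    # sata_lpm.py - inspect or set the SATA link power management policy on a Linux host.
    # A minimal sketch: assumes the AHCI driver exposes
    # /sys/class/scsi_host/host*/link_power_management_policy.
    import glob, sys

    POLICIES = ("min_power", "medium_power", "max_performance")

    def show_policies():
        for path in sorted(glob.glob("/sys/class/scsi_host/host*/link_power_management_policy")):
            with open(path) as f:
                print(path, "->", f.read().strip())

    def set_policy(policy):
        if policy not in POLICIES:
            raise ValueError("unknown policy: %s" % policy)
        for path in glob.glob("/sys/class/scsi_host/host*/link_power_management_policy"):
            with open(path, "w") as f:      # needs root privileges
                f.write(policy)

    if __name__ == "__main__":
        if len(sys.argv) > 1:
            set_policy(sys.argv[1])         # e.g. python sata_lpm.py min_power
        show_policies()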
That obviously leaves out a large number of systems that use FC/SAS disks. Fortunately, SAS 2.1 brought in a new set of power management features, which disk vendors are expected to adopt in the next few years, and SAS is expected to replace FC disks going forward, so we have a workable solution for the future.
Tape-based system as an alternative
Tape controllers, on the other hand, do not suffer from such issues. Tapes, in fact, are designed with specific attention to offline storage. One can back up data to tapes, take the cartridges out of the online system, store them in a separate vault and insert them into the system when needed. Inside the vault, the tape cartridges do not consume any electrical power. They do, however, need periodic data auditing, since tape reads fail more frequently than disk reads. But with the new long-life, high-capacity LTO-5 and LTO-6 tapes, those problems are much reduced. Many are in fact bringing tape storage back into their systems; EMC, too, is promoting tapes for backup data. Although it sounds like a regressive step, one must accept that tape provides a considerable power-saving option, especially for backup and archival storage.
A little longer-term future
To envisage the future of power-efficient storage, we need to look at the problem holistically. One can power down idle disks, but more power is consumed by active disks, and data centres also spend considerable money on cooling. The pie chart at the top shows that almost 33% of total energy is spent on the cooling system, and that cost is going to rise with rising global temperatures. A better solution would therefore be to design storage media that consume almost zero power when idle and also consume much less power than existing hard disks even when active; much better still if these media can operate at room temperature, which would translate into a lower energy bill for cooling.
Towards this, flash provides an excellent option. Flash storage [see previous post] consumes an order of magnitude less power for regular read/write operations and almost zero power when left idle. It also provides much higher random read/write throughput, making it ideal for high-performance storage. At present its relatively higher cost and limited write endurance are the hindrances to it replacing disks in mainstream storage. With time, there is little doubt that further innovations will bring the cost/GB down drastically, and storage capacity will become comparable to SAS. The biggest dampener for SSD/flash so far has been its limit on the number of writes. A very recent article in IEEE Spectrum indicates that we may already have a breakthrough: Macronix, a Taiwanese company, has reported the invention of a self-healing NAND flash memory that survives more than 100 million cycles. In fact, they are yet to find the limit where it breaks, and they strongly believe it will survive a billion writes. Their method is essentially a local heat treatment on the chip to lengthen the life of the media. If that invention works out, we have an alternative storage medium that meets all our stated needs, namely: 1. it consumes low power, 2. it can operate at room temperature, 3. it provides both high capacity [around 2 TB] and high throughput, and 4. it consumes a fraction of the space of an HDD [the IEEE Spectrum article can be accessed here].
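To see why write endurance has been such a dampener, and why a jump to 100-million-plus cycles changes the picture, a rough lifetime calculation is enough. The figures below (capacity, P/E cycles, daily write volume, write amplification) are illustrative assumptions only.

    # flash_endurance.py - back-of-envelope flash write-endurance estimate.
    # All figures are illustrative assumptions, not vendor specifications.
    def drive_lifetime_years(capacity_tb, pe_cycles, daily_writes_tb, write_amplification=2.0):
        """Years until the drive exhausts its program/erase cycles under a steady write load."""
        total_writable_tb = capacity_tb * pe_cycles / write_amplification
        return total_writable_tb / (daily_writes_tb * 365)

    # Assumed: a 2 TB drive, ~3,000 P/E cycles (a typical MLC-class figure),
    # and a heavy sustained workload of 5 TB written per day.
    print("MLC-class media   : %.1f years" % drive_lifetime_years(2, 3000, 5.0))
    # If the media really survives ~100 million cycles, as the Macronix report suggests,
    # endurance effectively stops being the limiting factor.
    print("Self-healing media: %.0f years" % drive_lifetime_years(2, 100000000, 5.0))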
In a couple of years the technology is very likely to mature, with the full-fledged induction of flash-only storage into mainstream storage systems. EMC's XtremIO, WhipTail, Violin Memory and other all-flash storage systems are likely to define tomorrow's mainstream storage system.
Labels:
EMC,
Energy efficient storage,
Flash storage,
LTO,
macronix,
NAND Flash Storage,
NYT,
SAS power efficiency,
SATA energy efficiency,
SSD,
Storage Technology
Tuesday, February 5, 2013
Building a SAAS infrastructure using opensource components
Let's assume that you are in the low-cost web server business and you want to build your entire setup using only open-source components. This means that you probably are using an open-source LAMP stack for your web servers, and it also means that you almost certainly are using MySQL for your backend database. As far as programming tools are concerned, you have plenty of choices. For the present discussion, we will assume that you are using PHP and JavaScript, since these are the tools the majority use today; changing tools should not be a big issue, and we should be able to add new tools as needed. So, in summary, we need an application server setup comprising a LAMP stack, a PHP server and a MySQL server. In case you need a Windows configuration, we would simply replace the LAMP stack with a WAMP stack.
All right, now let's say you need many of these servers and you need to be able to provision an application fast, ideally in just a couple of hours. An application in this case means a single-server application, not the Hadoop type. This means you would prefer a virtual server to a single dedicated machine. Since you do not know how many applications you will run and how many users will subscribe to a single application, you want to design a setup that can scale out instead of scale up. What we mean is this: let's say your application handles 4,000 connections today. You can either design a server that can scale up to a load of 100,000 connections [which is decidedly more complex] or you can decide to share the load across multiple servers, which is relatively easier.
The advantage of the latter approach is that you can scale down very easily by decommissioning a few servers when your load goes down and re-provisioning those servers for different application(s).
In short, we need a cluster of VMs, with each VM running either a web server or MySQL.
Let's look at your hardware investment. To start with, you want a mid-size server machine. A good configuration would be a six-core x86-based processor [Intel or AMD] with at least 8 GB of RAM, somewhat similar to a Dell PowerEdge. The hardware must support the Intel VT spec for hardware-assisted virtualization. To cover for hardware failures, you may want two of the servers connected in hot-standby mode. For storage, you may want another set of two servers with a large set of SAS disks running GlusterFS. Needless to say, all these machines would be connected over a LAN and would interface with the external world through a firewall router.
Now let's bring in virtualization. We will install the Linux KVM hypervisor on the PowerEdge-equivalent servers. Remember that the Linux KVM we are talking about here is free and does not come with many of the management tools that ship with the commercial offerings. Both RedHat Enterprise Virtualization [RHEV] and the SUSE enterprise virtualization product are used in large enterprise setups and one can choose either of them; both come with a license fee. If you would like to check which guest OS is supported on which KVM version, check this page.
Once KVM is installed, we can create around a hundred VMs on each server. Each VM can be pre-configured with a guest OS [we used an Ubuntu version] and a web-server template which provides the default configuration for the LAMP stack, PHP server, IP address and domain name, in order to front-load our VM installation work. I know that RHEV provides a tool to create templates; with free KVM, one may have to do it manually or write a tool. Having a template makes the provisioning job easier: all one needs to do is modify the configuration before commissioning the web server.
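As a rough illustration of what such a home-grown provisioning tool could look like, here is a sketch using the libvirt Python bindings. It assumes python-libvirt is installed, the host runs the KVM/QEMU driver, and a disk image cloned from the template already exists at the (hypothetical) path used below.

    # provision_vm.py - define and start a web-server VM from a template using libvirt.
    import libvirt

    DOMAIN_XML = """
    <domain type='kvm'>
      <name>{name}</name>
      <memory unit='MiB'>1024</memory>
      <vcpu>1</vcpu>
      <os><type arch='x86_64'>hvm</type></os>
      <devices>
        <disk type='file' device='disk'>
          <driver name='qemu' type='qcow2'/>
          <source file='/var/lib/libvirt/images/{name}.qcow2'/>
          <target dev='vda' bus='virtio'/>
        </disk>
        <interface type='network'><source network='default'/></interface>
      </devices>
    </domain>
    """

    def provision(name):
        conn = libvirt.open("qemu:///system")                 # connect to the local hypervisor
        dom = conn.defineXML(DOMAIN_XML.format(name=name))    # register the VM definition
        dom.create()                                          # boot it
        print("started", dom.name())
        conn.close()

    if __name__ == "__main__":
        provision("webserver-01")    # the qcow2 image must be cloned from the template first

After boot, a small first-run script inside the template can patch the IP address and domain name, which is the "modify the configuration" step mentioned above.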
For GlusterFS, many prefer Debian Linux as the host. Since the storage nodes are not part of the web servers, we can have native Debian servers and install GlusterFS on them; this page can be helpful in getting that setup ready. Next, we need to install the MySQL cluster [it provides replication support]; this how-to document could be useful.
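For the storage step, the gluster CLI does most of the work. The sketch below wraps it from Python; the hostnames and brick paths are made up, and it assumes the GlusterFS server package is already installed on both storage nodes.

    # make_gluster_volume.py - create a two-way replicated GlusterFS volume for web data.
    # Hostnames and brick paths are placeholders; run as root on one of the storage nodes.
    import subprocess

    def run(cmd):
        print("+", " ".join(cmd))
        subprocess.check_call(cmd)

    BRICKS = ["store1:/export/brick1", "store2:/export/brick1"]   # assumed peers

    run(["gluster", "peer", "probe", "store2"])                   # join the second node
    run(["gluster", "volume", "create", "webdata", "replica", "2"] + BRICKS)
    run(["gluster", "volume", "start", "webdata"])
    # Web-server VMs can then mount it, e.g.: mount -t glusterfs store1:/webdata /var/www/data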
Now the setup is ready; we have almost the first version of our open-source-based cloud. You can commission web servers as needed. There is one glitch, though: you have no interface to monitor server load and health remotely. We picked the Nagios tool for that job as it is relatively lightweight and easy to install; it is also used by some of the leading large cloud service providers. You may still have to develop a few tools to make application provisioning and monitoring entirely autonomous, as per the specific needs of the setup.
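Custom checks slot into Nagios easily, because a check is just an executable that prints one status line and exits with 0 (OK), 1 (WARNING) or 2 (CRITICAL). A minimal sketch of a load-average check, with arbitrary thresholds, is below.

    # check_load_avg.py - a minimal Nagios-style plugin (exit codes: 0 OK, 1 WARNING, 2 CRITICAL).
    # The thresholds are arbitrary examples; tune them per server class.
    import os, sys

    WARN, CRIT = 4.0, 8.0

    load1, load5, load15 = os.getloadavg()
    if load5 >= CRIT:
        print("CRITICAL - 5min load %.2f | load5=%.2f" % (load5, load5))
        sys.exit(2)
    if load5 >= WARN:
        print("WARNING - 5min load %.2f | load5=%.2f" % (load5, load5))
        sys.exit(1)
    print("OK - 5min load %.2f | load5=%.2f" % (load5, load5))
    sys.exit(0)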
There is a nice article on building an Ubuntu Linux based cloud, and you may want to check that out too.
Friday, February 1, 2013
Building your own cloud
Let me start with a disclaimer: I do not intend to write a 'How-to' document in this post. My primary motivation for writing this article is that I see too much overloaded and clouded use of the word 'cloud', which in my opinion is becoming a serious demotivator for honest, hype-shy people to take a real interest in the cloud. From that point of view, my humble attempt here is to deglamourize the 'cloud' and lay it as bare as possible. Hopefully, along the way, we will be able to understand what it takes to build one.
Conceptually, cloud refers to an arrangement where application server configuration and storage space can be changed dynamically, preferably through a software interface. Essentially this means that 1. the application server is decoupled from the physical server configuration, and 2. there is a software interface which allows the administrator to monitor loads and performance and add or modify server configuration, most desirably remotely. The advantages are two-fold: the application server becomes a self-contained entity which can migrate from one piece of hardware to another as needed, and we get the freedom to think in terms of our application's needs alone. This simplifies application provisioning a lot, especially if the user does not have expertise in data-centre technologies or does not want to incur large CapEx on a data centre.
Technologically, there is one crucial element that has accelerated the shaping of the cloud, and that is virtualization. Not that the concept is anything new, but the way it has been made ubiquitous has brought new value. With more and faster processor cores and larger RAM becoming the norm for modern physical servers, it became evident some time back that running a single OS with a single server application would mean significant under-utilization of the server machine. For example, a Dell PowerEdge 310 is a relatively low-end server, yet in one configuration it provides a quad-core Intel Xeon 3400 series processor and 8GB of RAM (expandable to 32GB). Running a single application on it is a serious waste of all that processing power, unless the application is heavily used all the time. In the typical case, an application server's load is input-driven, and it only takes up compute and networking bandwidth for a fraction of the time the resources are up.
Instead, one can install VMware vSphere or Microsoft Hyper-V and have tens of VMs, each with its own application server, running on a single server machine. The great thing about these VMs (Virtual Machines) is that all the needed interfaces (serial, network) come bundled with them. One just has to install them (provisioning is almost effortless with all the commercial hypervisors) and they are ready for application provisioning. Best of all, with a one-time configuration of all the VMs, getting a new application up takes very little time. One can even have VM templates for different types of servers [e.g. an Oracle server VM, an Exchange server VM or an Ubuntu-based Linux web server VM] and install the template once a VM is allocated.
Now, a server needs storage for its data, which also keeps growing. Adding a physical storage volume or a physical LUN [in SAN terms] to each VM is bound to cause under-utilization of storage bandwidth. Instead, storage vendors provide virtual volumes / LUNs which can be provisioned over a physical volume or LUN [which is just a bundle of disks sharing the same RAID structure].
The VM and the VStorage (i.e. virtual storage volume or virtual LUN) can thus be thought of as the units of provisioning in an IT setup. All one needs is a horizontal software layer that monitors and configures the VMs [with all their interfaces] and the VStorages, and one has a basic working cloud. A user of this cloud can be allocated VMs with pre-installed applications and storage, plus a software interface with which to manage and monitor his resources. When he needs more compute or network bandwidth, he places a request with the cloud administrator, who assigns adequate resources to the user from the cloud's readily available VM pool. This generic model is what is known as IAAS, or Infrastructure As A Service. If the cloud needs to support a higher abstraction of service, it needs further sophistication in the horizontal software layer. For example, let's assume the user needs to run an application that sifts through huge data sets distributed across many compute and storage nodes. That application needs the support of a more sophisticated interface which can scale resources up or down with the volume of data while providing a consistent data manipulation engine across the nodes. Yes, we are talking about a Hadoop-like software layer. We cannot cover Hadoop here, but the point should be clear: the complexity of a cloud is driven by the sophistication of the applications it is going to host, but at the essential level, a cloud remains a set of virtualized compute and storage resources governed by a single software layer.
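To make that horizontal layer a little more concrete, here is a deliberately over-simplified sketch of its provisioning core. Everything in it is invented for illustration; a real layer would call hypervisor and storage-array APIs rather than pop names from in-memory lists.

    # A toy model of the horizontal provisioning layer described above.
    class CloudController:
        def __init__(self, vm_pool, volume_pool):
            self.vm_pool = list(vm_pool)          # pre-created, idle VMs
            self.volume_pool = list(volume_pool)  # pre-created virtual volumes
            self.allocations = {}                 # user -> list of (vm, volume)

        def provision(self, user, app_image):
            if not self.vm_pool or not self.volume_pool:
                raise RuntimeError("capacity exhausted - add hardware or reclaim resources")
            vm, vol = self.vm_pool.pop(), self.volume_pool.pop()
            # A real controller would attach the volume, install the application image
            # and configure networking here.
            self.allocations.setdefault(user, []).append((vm, vol))
            return vm, vol

        def release(self, user):
            for vm, vol in self.allocations.pop(user, []):
                self.vm_pool.append(vm)
                self.volume_pool.append(vol)

    controller = CloudController(["vm-%02d" % i for i in range(10)],
                                 ["vvol-%02d" % i for i in range(10)])
    print(controller.provision("alice", "lamp-template"))

Everything above this layer (IAAS portals, Hadoop-style frameworks) is sophistication added on top of this same provision/release cycle.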
As one can imagine, a basic cloud setup can be built entirely from open-source elements too. In the next post we will talk about a basic cloud setup that we built with Linux KVM and Python.
Monday, January 28, 2013
Storage Trends in 2013
As the year begins, it is a popular game to try to predict what is coming. However, as far as the storage industry goes, whichever way one looks, it does not appear that the game can be very interesting this time. One good measure of interestingness is (1) how unpredictable the market is, and another could be (2) how likely a new technology is to be disruptive. As far as the market is concerned, results from the last few years tell us it has been quite stable, with EMC leading the pack on almost all fronts of storage systems with more than a third of the market, while IBM, NetApp, HDS and HP compete closely with each other for second position in the ranking. Gartner's 3rd Quarter 2012 chart [source], below, shows the relative positions of the storage system vendors.
Growth projections for the individual players also do not throw up any possibility of surprises, with EMC continuing to lead by a large margin for the foreseeable future. IDC, for example, in its press release last November, forecast 2013 to be a slow-growth year: "While both EMC and NetApp continue to gain market share, which should enable both vendors to outpace the overall market growth rate, we are modestly concerned with the current estimates for EMC (+9.5% in 2013) and NTAP (+7.6%) vs. the (forecast) industry growth of 4%," the report says. Gartner sounded similar in its 2012-end forecast about the market. To sum up, we do not expect much reordering of the ranks this year either.
As far as technology trends are concerned, we have seen published views from 3PAR (HP) CEO David Scott and NetApp CTO Jay Kidd.
While both focus on their respective solution portfolios and positioning, Mr. Kidd paints the picture with a broader brush. He sees dominant market play for virtualization, clustered storage, flash storage and cloud access in 2013. EMC, in addition, talks very strongly about tapes. Tapes?? Some would sneer that it looks like a regressive step, but frankly, if enterprises are buying tape storage, there must be strong reasons for it. With larger archives, the need to drive down power and rack density for archive storage has become stronger, and EMC is offering a solution where tape adequately addresses that need, especially where EMC gear constitutes most of the data centre equipment. But we are digressing.
Coming back to where we started, based on what we can distill from all the chatter, there are three distinct technological patterns:
1. more penetration of solid-state storage as a differentiator in tiered storage systems,
2. a stronger play of virtualization in storage deployment to enable more data mobility, and
3. growth of object storage.
Let's take them individually.
Flash-based Storage / Solid-state Storage
[Image: Samsung SSD vs SATA HDD. Source: CNET]
Solid-state storage [based on NOR or NAND flash] gained popularity over the last decade, especially in consumer devices. With passing years, SSD vendors have brought more reliability, higher memory density and longer device lifetimes, so much so that enterprises now see SSDs as a strong alternative to high-speed HDDs. Compared to disk-based HDDs, SSDs offer almost 3 times lower power consumption and an order of magnitude faster access [there is no seek time on an SSD], making them a better fit for high-transaction-throughput servers. SAP's HANA, for example, runs entirely in memory to provide faster throughput, and SSDs become the cheaper alternative in such scenarios. The big storage players, however, have so far shown a lukewarm response due to the high cost of SSDs compared to HDD-based systems. Most of the large storage players have brought in flash as a fast cache or accelerator in otherwise disk-based storage controllers to improve read/write throughput [some use it to store metadata for active storage volumes], but so far complete SSD arrays have not reached the mainstream. Startups like Whiptail and Violin Memory are betting on all-flash storage arrays and, not surprisingly, they are making quite a few positive news splashes too. Many believe that 2013 will herald the era of SSD-based storage arrays in mainstream enterprise storage. Here is a recent story where the Indian Railways' ticketing arm [IRCTC] is looking at SSDs to boost performance of its online real-time ticketing system. In a tiered storage structure, it looks like flash-based storage and SSDs will see a dominant role in Tier 1, the performance tier, this year. [For more on the tiered storage concept see my previous post.]
Virtualization
Virtualization is not a new story. VMware continues to shape the storage deployment topography, where mobility not only of virtual machines but of entire solutions bundled with storage, server and networking infrastructure is making headway. Here we are talking about mobility of the entire application ensemble. While EMC gets the biggest benefit of VMware's dominance, by working with all the leading storage players like NetApp and HP and the networking giant Cisco, VMware has almost created the de-facto virtualization solution for enterprises. There are a few IT managers, though, who are brave enough to try Linux-based virtualization. IBM, for a change, is pushing KVM [the Linux virtualization solution] and trying to position it as an alternative to VMware's offerings. Read more at the IBM blog. There is, however, hardly any difference of opinion that virtualization will drive most storage deployment this year [I am not counting tape storage here]. IDC also forecast that in 2013, 69 percent of workloads will be virtualized. Software Defined Data Centre [SDDC] is a term that has become quite popular in the electronic chatter these days. Although VMware coined the term some time back, the way people use it today is very different from the way VMware outlined it. SDDC describes a scenario where the entire data centre is defined in software and provided as a service. It takes a lot more than just virtualization of servers and storage, but the primary constituent of the solution is definitely virtualization. From that perspective, we would put SDDC under Virtualization in the present context.
Object Storage
Object storage largely comprises all cloud-based storage access. Typically a cloud is accessed over HTTP-based interfaces where all storage entities are referred to as objects: a URL is an object, a file is an object, a database entry too is an object. In other words, whenever one accesses storage using any of the cloud APIs, one is accessing object storage. In a sense this is an abstraction, but in many other senses it is a new way of dealing with storage, one that gets closer to application semantics. As enterprises move to the cloud [public or private], storage access is getting objectified. In 2013, adoption of specialized cloud-based software will expand as applications become more and more cloud-aware. Mark Goros, CEO of Caringo, the leading provider of object storage software, tells us that "The shift to object storage is being driven by IT trends, including the adoption of cloud services, emerging analytics applications, BYOD and mobility, as well as the market momentum of research, healthcare, government and life sciences." [source]. While there are public cloud gateways like the Nasuni File server or the NetApp® StorageGRID® gateway that connect enterprise data centres to public clouds like Amazon or Rackspace, the challenge of object storage is less about handling throughput and more about how one can organize, move and manage a huge number of objects of varied sizes in an unconstrained namespace. As is evident, enterprise object storage will closely follow the evolution of large public cloud infrastructure like Amazon EC2 or Microsoft Azure.
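To make "storage entities as objects over HTTP" concrete, here is a minimal sketch using the Python requests library against a hypothetical S3-compatible endpoint. The URL and bucket name are placeholders, and real services additionally require signed authentication headers, which are omitted here.

    # object_put_get.py - minimal illustration of HTTP-style object storage access.
    # The endpoint and bucket below are placeholders; real services (S3, Swift, Azure Blob)
    # also require authentication headers/signatures, omitted for brevity.
    import requests

    ENDPOINT = "https://objects.example.com"      # hypothetical S3-compatible endpoint
    BUCKET, KEY = "backups", "reports/q1-2013.pdf"

    # PUT: an object is just bytes plus metadata addressed by a URL, not a file in a filesystem.
    with open("q1-2013.pdf", "rb") as f:
        resp = requests.put("%s/%s/%s" % (ENDPOINT, BUCKET, KEY), data=f)
    resp.raise_for_status()

    # GET: retrieve it back through the same URL.
    resp = requests.get("%s/%s/%s" % (ENDPOINT, BUCKET, KEY))
    resp.raise_for_status()
    print("fetched", len(resp.content), "bytes")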
Labels:
Cloud storage,
EMC,
Gartner,
IDC,
NetApp,
Software Defined Data centre,
Storage Technology
Tuesday, January 15, 2013
A storage system potpourri for beginners
Storage is easily one of the most talked about, most attention-consuming and most confusing technologies around. Anyone who can read this sentence is aware of digital storage as a concept: any data generated by a computing machine is digital and requires digital storage. When it comes to the technology aspect, however, storage is easily the most clouded concept, infested with an unending series of acronyms (DAS, NAS, SAN, SCSI, SATA, SAS, NFS, CIFS, RAID and so on) and multiple technology families, like tape storage, disk storage and solid-state storage, and then there is the all-encompassing Cloud. If you hoped that with cloud you finally have one thing you can take refuge in, hold that hope, for you must first ascertain what constitutes a cloud before you can rest with it.
One way to make sense of this apparent forest of acronyms and concepts is to appreciate what we need storage for. Essentially, the entire purpose of all storage technologies is to help us store our ever-expanding digital data in such a way that it is
- safe and persistent, that is data does not get destroyed, lost, mutated or corrupted once stored
- secure against unauthorized access
- accessible when one needs and
- affordable.
Personal Storage
For non-professional personal needs, the 300GB hard disk that typically comes by default with a laptop is more than sufficient. A 250GB hard disk, for example, can hold around 50,000 normal-size photos or mp3 music files. If you are an avid user of video, you will probably buy a few 1 TB external hard disks in addition, and those would be the DAS, or Directly Attached Storage, in your system. If you are a cloud aficionado, you probably rely on Google Drive or Microsoft SkyDrive for your additional needs, in which case you have both DAS and public cloud in your system.
Enterprise Storage
When it comes to the enterprise, many aspects such as data growth, data retention, preparedness for recovery of data after a site disaster, and the access frequency of the data come into consideration, making storage planning a costly and complex business. Additionally, with increasing sensitivity towards unstructured data, enterprises are experiencing an even faster expansion of storage demands. According to IDC's Worldwide Quarterly Disk Storage Systems Tracker, 3Q12 marked the first time that external disk storage systems makers shipped over seven exabytes, or 7,104 petabytes, of capacity in a single quarter, a year-over-year growth rate of 24.4 percent [source: Infostor]. This means that in the next 5-6 years there will be many organizations that hit an exabyte of enterprise data.
Storage Tiers
To get around this challenge of data explosion, enterprises try to bring in storage tiers, where data is organized into different classes based on how actively it is used. For example, very active data (high data modification rate and very high access rate) needs to be kept online in the fastest and most reliable storage tier [let's say Tier 1], while the least active data [no modification, and only accessed in special scenarios like a past-data audit or a recovery] can be archived in offline storage. This way, the enterprise gives the most resources to the most active data and efficiently reduces the cost of storing less active data.
[Fig 1. Storage tiers based on data usage]
[Fig. 2: Tapes and disks]
Typically, most of the online storage in an enterprise is maintained on disk-based storage. Traditionally, digital tapes were used for all offline storage, for the advantages that tapes can be preserved with very low electrical power consumption and can be physically moved to a different location at very little cost. But tapes are serial and therefore require a different hardware setup, and they are more prone to read failures than disks. The last ten years of innovation increased the storage density of disks manifold and brought the cost/GB of disk storage down below that of tape, eventually establishing disks very strongly for archival storage, so much so that most enterprises of late are opting for disk-based backup over tape. It started with VTL [Virtual Tape Library] appliances replacing physical tape backup appliances, and of late VTLs have merged with standard disk-based backup appliances. Almost all backup appliances use deduplication in a major way to reduce the storage footprint. An added advantage of this transition is that archived data can be brought online within a very small time window. Data Domain appliances are a very good example of how disk-based backup appliances have shaped up. Additionally, backup appliances provide desirable features such as compliance support, where the system can be configured to ensure immutability of data once written, for a duration defined by the administrator, and automatic data shredding, where the data gets destroyed when someone tries to access it from disk without going through the proper authentication procedure.
Compared to archival storage, Tier-1 storage employs high-end, faster disks [15K RPM], quite often along with SSDs [Solid State Drives]. SSDs are the new favourite in this segment, with vendors like Samsung and SanDisk competing with each other to bring out new products that are cheaper, denser and last longer. SSDs are a lot faster and support true random read/write compared to disks. With fast-falling prices, higher capacities and longer lifetimes, solid-state drives are finding their place in a big way in Tier-1 storage gear. Other advantages of SSDs are that they occupy less physical space, draw less electrical power and can transition from offline to online a lot more quickly than disks. It will, however, take some time before we see SSDs completely replacing disks in this tier.
Sometimes called primary or mission-critical storage appliances, Tier-1 storage gear provides fast, reliable storage for mission-critical data, often with multiple levels of redundancy in order to reduce data downtime. Since this gear is the most expensive of the lot, many storage vendors provide mechanisms to transparently move less active data to less expensive disk storage. This low-cost storage tier, sometimes referred to as near-line storage, is often made up of a large set of high-capacity but slower SATA disks [5400/7200 RPM]. NAS (Network Attached Storage) designs are inherently suited to this type of tiered use, which partly explains why NAS sells more than SAN. Also, SAN uses Fibre Channel or SAS disks, making it more expensive than NAS when the data is not mission-critical [see the slides for an illustrative comparison between NAS and SAN]. In either a SAN or a NAS, a single disk array must have all its disks of a similar type and speed: either they will all be high-speed FC disks or they will all be SAS. Either way, the higher-level data access syntax is built into the NAS/SAN software: NAS mimics the file access syntax provided by a file system, while SAN provides the block access that file systems can use. So NFS (Network File System) and CIFS are the two primary interfaces that a NAS server supports, whereas iSCSI and FC are the two interfaces through which a SAN primarily serves the host server's file systems.
[Fig 3: Simple stack comparison - SAN, NAS and DAS]
[Fig. 4: Tiered NAS storage organization in a data centre]
Fig 4 provides an illustration of a typical enterprise with two data centres, both simultaneously serving their users as well as providing storage replication service to the other site (a popular configuration to support site disaster recovery), while internally each data centre organizes data into three tiers. Tier-1 storage almost always comes in a primary-standby configuration in order to support high availability.
Cloud Storage
[Image courtesy of HDS: Thin Provisioning with Virtual Volume]
The cloud as a concept became popular only after virtualization became successful at large scale. With virtualization, one could have hundreds of virtual servers running on a single physical server. With that came software that could make provisioning hundreds of applications a matter of running a few software commands, which could be invoked remotely over HTTP. The ability to dynamically configure servers using software brought about a new paradigm where an application can be commissioned to run across multiple virtual servers (communicating with each other over a common communication structure), serving a large user base, entirely through software commands that the administrator can execute remotely from his desktop. This type of server provisioning demanded a new way of storage provisioning, and the concept of the virtual volume, or logical storage container, became popular. Now one can define multiple containers residing in the same physical storage volume and provision them to the server manager remotely. The concept of thin provisioning became predominant in storage provisioning: the idea is that a server is given a virtual volume that uses little physical storage to start with, but as it grows, the physical storage allocation underneath also grows on demand. The advantage is that one does not need to plan for all the storage in advance; as the data grows, one can keep adding more storage to the virtual volume, making the virtual volume grow. That decoupled physical storage planning from the server's storage provisioning, and storage provisioning became dynamic, like virtualized server provisioning. As long as the software can provision, monitor and manage the servers and the virtual volumes allotted to them over a software-defined interface, without errors and within acceptable performance degradation, the model can scale to any size. As is apparent, there is no real category called 'cloud storage'; what we have, rather, is a 'cloud service'. The data centres behind it are designed and maintained the same way data centres have been designed and built all along, using a combination of NAS, SAN and DAS.
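A toy sketch of the thin-provisioning idea described above follows. It is purely illustrative; real arrays allocate extents or pages from a shared pool, track them in controller metadata and reclaim them on delete.

    # A toy model of thin provisioning: the virtual volume advertises its full size,
    # but physical extents are taken from the shared pool only when blocks are first written.
    EXTENT_MB = 64

    class Pool:
        def __init__(self, capacity_mb):
            self.free_mb = capacity_mb
        def take(self, mb):
            if mb > self.free_mb:
                raise RuntimeError("pool exhausted - add disks to the pool")
            self.free_mb -= mb

    class ThinVolume:
        def __init__(self, name, virtual_size_mb, pool):
            self.name, self.virtual_size_mb, self.pool = name, virtual_size_mb, pool
            self.allocated = {}                 # extent index -> True once backed by real storage

        def write(self, offset_mb):
            extent = offset_mb // EXTENT_MB
            if extent not in self.allocated:    # first touch: back it with physical capacity
                self.pool.take(EXTENT_MB)
                self.allocated[extent] = True

        def physical_usage_mb(self):
            return len(self.allocated) * EXTENT_MB

    pool = Pool(capacity_mb=10240)                                  # 10 GB of real disk behind the pool
    vol = ThinVolume("vvol1", virtual_size_mb=102400, pool=pool)    # advertised to the server as 100 GB
    vol.write(0); vol.write(130)                                    # touch two extents
    print(vol.physical_usage_mb(), "MB physically allocated;", pool.free_mb, "MB free in pool")

The server sees a 100 GB volume throughout, while the array has only committed two extents; that decoupling is exactly what lets storage provisioning become as dynamic as VM provisioning.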
The cloud provides a software framework to manage the resources in the data centres by bringing them into a common, sharable pool. The cloud, in that sense, is more about integrating and managing the resources and less about which storage technologies or systems per se are used in the data centre(s). Given that the cloud software is the essential element of a cloud service, as long as the software is designed carefully, one can have any type of device or system below it, ranging from inexpensive storage arrays of JBODs (Just a Bunch Of Disks) to highly sophisticated HDS, HP or EMC disk arrays or NAS servers. The figure below, from EMC's online literature, illustrates this nicely.
It is apparent that as the cloud grows larger and larger, the complexity and sophistication of the software increase by magnitudes, and so does the cost advantage of data storage. One can look at the cost of provisioning (server and storage) in public clouds like those of Google, Rackspace and Amazon and imagine the complexity and sophistication of their cloud management software. Fortunately, many have published a version of their software as open source for others to learn from and try.
[Image source: http://managedview.emc.com/2012/08/the-software-defined-data-center/, courtesy of EMC]
Further Reading:
Brocade Document on Data centre infrastructure
My slides on slideshare
Labels:
Cloud storage,
EMC,
exabyte storage,
NAS vs SAN,
NetApp,
Storage Technology