Thursday, September 15, 2011

Big Data Fuelling Storage Growth?

A recent IDC report tells us that enterprises are spending on storage again, and preparing for 'big data' appears to be a major growth driver this time. The boost in storage has come along with investments in cloud computing and data-centre virtualisation, IDC analyst Liz Conner said. Companies are updating their storage systems for the era of "big data," to deal with huge and growing volumes of information, she said.
While money spent on external storage increased by 12.2% year over year in the second quarter of this year, total capacity grew by more than 47%.
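Taken together, those two figures imply that the average price paid per unit of capacity fell. A rough back-of-the-envelope sketch follows; it assumes the 12.2% spending figure and the 47% capacity figure cover the same external storage segment, and the function name is only illustrative:

```python
# Growth figures quoted from the IDC report (year over year, Q2 2011).
spend_growth = 0.122      # money spent on external storage: +12.2%
capacity_growth = 0.47    # total capacity: "more than 47%", so 47% is a floor

def cost_per_unit_change(spend_growth, capacity_growth):
    """Relative change in average spend per unit of capacity.

    Assumes both growth rates describe the same market segment.
    """
    return (1 + spend_growth) / (1 + capacity_growth) - 1

change = cost_per_unit_change(spend_growth, capacity_growth)
print(f"{change:.1%}")  # prints -23.7%
```

Since capacity grew by more than 47%, the average spend per unit of capacity dropped by at least roughly 24%. In other words, a good part of the capacity growth may simply reflect cheaper storage rather than bigger budgets.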

Sales increased across all major product categories, including NAS (network-attached storage) and all types of SANs (storage-area networks). The total market for non-mainframe networked storage systems, including NAS and iSCSI (Internet SCSI) SANs, grew 15.0% from a year earlier to $4.8 billion (£2.96 billion) in revenue, IDC reported. EMC led that market with 31.9% of total revenue, followed by NetApp with a 15.0% share. NAS revenue alone increased 16.9% from a year earlier, and EMC dominated this market with 47.2% of revenue. NetApp came in second at 30.7%.
EMC also led the non-mainframe SAN market with a 25.7% share, followed by IBM with 16.7% and HP with 13.4%, according to IDC.
[IDC is a division of International Data Group, the parent company of IDG News Service.]

Unfortunately, the report does not elaborate on how big data influences storage growth.
Is it that enterprises anticipate that their internal data will grow faster and are therefore investing in expanding storage? Or is the growth happening primarily because enterprises are building new storage infrastructure dedicated to 'big data'?
The first scenario is not much different from the decade-old pattern of enterprise storage expansion. In the second scenario, enterprises need to think differently: they would essentially be building their own cloud infrastructure. They would need to decide how to distribute objects or storage elements, which distributed file system to use, how applications will access the data, and so on, and those decisions would drive the choice of the storage systems they buy. But given that both NetApp and EMC are leading the growth and are selling their already established products in the SAN and NAS space, the actual scenario is most likely closer to the first case. In that case, it is the expansion of existing NAS and SAN infrastructure that is propelling storage growth. Should we then talk about 'Big NAS' and 'Big SAN' instead of Big Data?

Friday, September 9, 2011

Some interesting posts on Big Data and NoSQL

BusinessWeek reports that Hadoop is becoming the dominant choice for organizations dabbling with Big Data. They cite Walmart, Nokia, GE and BoA, all of which are moving their big data to Hadoop. Here is the article: http://www.businessweek.com/technology/getting-a-handle-on-big-data-with-hadoop-09072011.html
A couple of posts from Nati Shalom are also interesting. The first is on a big data platform for real-time analytics; he takes the Facebook model and tries to refine it. The post: http://natishalom.typepad.com/nati_shaloms_blog/2011/07/real-time-analytics-for-big-data-an-alternative-approach.html
And the latest, a slightly lengthy one: http://natishalom.typepad.com/nati_shaloms_blog/2011/09/big-data-application-platform.html
And Alex Popescu's blog should not be missed if you are a NoSQL enthusiast. Here is one dig at Digg's Cassandra implementation: http://nosql.mypopescu.com/post/334198583/presentation-cassandra-in-production-digg-arin

And I found a couple of interesting infographics:
The first one, created by Mozy, shows an interesting comparison among the largest data centres, and the second is a graphical illustration of the growth of big data.