Tuesday, February 5, 2013

Building a SaaS infrastructure using open-source components

Let's assume that you are in the low-cost web server business and you want to build your entire setup using only open-source components. This means that you are probably using an open-source LAMP stack for your web servers, which in turn means that you are using MySQL for your backend database. As far as programming tools are concerned, you have plenty of choices. For the present discussion, we will assume that you are using PHP and JavaScript, since these are the tools most people use today. However, changing the tools should not be a big issue, and we should be able to add new ones as needed. So, in summary, we need an application server setup comprising a LAMP stack, that is, Linux, Apache, a MySQL server and PHP. In case you need a Windows configuration, we would simply replace the LAMP stack with a WAMP stack.
All right, now let's say you need many of these servers and you need to be able to provision an application fast, ideally in just a couple of hours. An application in this case means a single-server application, not the Hadoop type. This means you would prefer a virtual server over a single dedicated machine. Since you do not know how many applications you will run or how many users will subscribe to a single application, you want to design a setup that can scale out instead of scaling up. What we mean is this: let's say your application handles 4,000 connections today. You can either design a server that can scale up to handle 100,000 connections [which is decidedly more complex], or you can design to share the load across multiple servers, which is relatively easier.
The advantage of the latter approach is that you can scale down very easily by decommissioning a few servers when your load goes down and re-provisioning them for different application(s).
In short, we need a cluster of VMs, with each VM running either a web server or MySQL.
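To make the scale-out arithmetic concrete, here is a minimal Python sketch. It assumes every web-server VM is identical and can carry a fixed number of connections (the 4,000 figure from above); the function names are hypothetical and only illustrate the idea.

```python
def servers_needed(total_connections: int, per_server_capacity: int) -> int:
    """Number of identical servers required to carry the given load."""
    if per_server_capacity <= 0:
        raise ValueError("per_server_capacity must be positive")
    if total_connections <= 0:
        return 0
    # Ceiling division using integers only.
    return -(-total_connections // per_server_capacity)


def servers_to_change(current_servers: int, total_connections: int,
                      per_server_capacity: int) -> int:
    """Positive: servers to add. Negative: servers that can be decommissioned."""
    return servers_needed(total_connections, per_server_capacity) - current_servers


# Growing from 4,000 to 100,000 connections at 4,000 connections per VM.
print(servers_needed(100_000, 4_000))          # 25
# Load drops back to 40,000 connections while 25 VMs are running.
print(servers_to_change(25, 40_000, 4_000))    # -15
```

The negative result in the second call is the scale-down case: those VMs can be decommissioned and re-provisioned for another application.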
Let's look at your hardware investment. To start with, you want a mid-size server machine. A good configuration would be a six-core x86-based processor [Intel or AMD] with at least 8 GB of RAM, somewhat similar to a Dell PowerEdge. The hardware must support hardware-assisted virtualization [Intel VT-x or AMD-V]. To cover for hardware failures, you may want to have two of the servers connected in hot-standby mode. For storage, you may want another set of two servers with a large set of SAS disks running GlusterFS. Needless to say, all these machines would be connected over a LAN and would interface with the external world through a firewall router.
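One quick way to verify the virtualization requirement on a Linux host is to look at the CPU flags in /proc/cpuinfo: Intel VT-x shows up as the vmx flag and AMD-V as the svm flag. The following Python sketch does that check; the function names are hypothetical, and it assumes a Linux host.

```python
def virtualization_support(cpuinfo_text: str):
    """Return 'Intel VT-x', 'AMD-V', or None, based on the CPU flags line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # The line looks like "flags : fpu vme ... vmx ..."
            _, _, value = line.partition(":")
            flags = value.split()
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD-V"
    return None


def host_virtualization_support(path: str = "/proc/cpuinfo"):
    """Run the same check against the local machine (Linux only)."""
    with open(path) as cpuinfo:
        return virtualization_support(cpuinfo.read())
```

Note that the flag only tells you the CPU has the feature; it may still need to be enabled in the BIOS before KVM can use it.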
Now let's bring in virtualization. We will install the Linux KVM hypervisor on the PowerEdge-equivalent servers. Remember that the Linux KVM we are talking about here is free and does not come with many of the management tools that come with the commercial versions. Both Red Hat Enterprise Virtualization [RHEV] and the SUSE enterprise virtualization version are used in large enterprise setups, and one can choose either of them. Both of these versions come with a license fee. If you would like to check which guest OS is supported on which KVM version, check this page.
Once KVM is installed, we can create around a hundred VMs on the server. Each VM can be pre-configured with a guest OS [we used Ubuntu] and a web-server template that provides a default configuration for the LAMP stack, PHP, IP address and domain name, so that most of the VM installation work is done in advance. I know that RHEV provides a tool to create templates. For free KVM, one may have to do it manually or write a tool. Having a template makes the provisioning job easier: all one needs to do is modify the configuration before commissioning the web server.
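If you end up writing your own template tool, the core of it can be very small. The sketch below uses Python's standard string.Template to fill a per-VM Apache virtual host configuration from a default template. The template contents, the document root layout and the function name are assumptions made for illustration, not the configuration we actually used.

```python
from string import Template

# Default web-server template; only the per-VM values are left as placeholders.
VHOST_TEMPLATE = Template("""\
<VirtualHost $ip_address:80>
    ServerName $domain
    DocumentRoot /var/www/$domain
</VirtualHost>
""")


def render_vhost(domain: str, ip_address: str) -> str:
    """Fill in the per-VM values; raises KeyError if a placeholder is missing."""
    return VHOST_TEMPLATE.substitute(domain=domain, ip_address=ip_address)


# Example values are made up.
print(render_vhost("shop.example.com", "10.0.0.21"))
```

The rendered text would then be written into the guest image before the web server is commissioned, so the only manual step left is choosing the domain name and IP address.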
For GlusterFS, many prefer Debian Linux as the host. Since the storage servers are not part of the web-server cluster, we can actually use a native Debian server and install GlusterFS on it. This page can be helpful in getting this setup ready. Now we need to install the MySQL cluster [it provides replication support]. This how-to document could be useful.
Now the setup is ready, and we have almost the first version of an open-source-based cloud. You can commission web servers as needed. There is one glitch, though: you have no interface to monitor server load and health remotely. We picked the Nagios tool for that job, as it was relatively lightweight and easy to install. It is also used by some of the leading large cloud service providers. You may still have to develop a few tools to make application provisioning and monitoring entirely autonomous, as per the specific needs of the setup.
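Custom monitoring tools can plug into Nagios fairly easily, because a Nagios check is just a program that prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). Here is a minimal Python sketch of a load check following that convention. The thresholds and function names are assumptions, and Nagios already ships its own load check, so treat this only as a starting point for setup-specific checks.

```python
import os

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def classify_load(load_1min: float, warn: float, crit: float) -> int:
    """Map a 1-minute load average to a Nagios status code."""
    if load_1min >= crit:
        return CRITICAL
    if load_1min >= warn:
        return WARNING
    return OK


def check_load(warn: float = 4.0, crit: float = 8.0):
    """Return (status_code, status_line) for the local machine."""
    try:
        load_1min = os.getloadavg()[0]
    except (OSError, AttributeError):
        # Load average could not be read, or the platform does not provide it.
        return UNKNOWN, "LOAD UNKNOWN - load average unavailable"
    status = classify_load(load_1min, warn, crit)
    return status, f"LOAD {LABELS[status]} - 1-minute load average: {load_1min:.2f}"
```

A wrapper script would print the status line and pass the status code to sys.exit(), which is all Nagios needs to pick up the result.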
There is a nice article on building an Ubuntu Linux based cloud, and you may want to check that too.

