There are many issues to consider when you're building a clustered environment. Proper planning of your Web site architecture is important as well. Many factors are involved and laying out a plan before purchasing and building your clustered environment can save you many headaches later. Questions you may want to ask include:
How many servers do we need? The number of servers will depend on how much traffic you expect and how Web site functionality is distributed in your server farm.
What types of servers and operating systems do we want to deploy? Choosing servers and operating systems depends on many factors, including your team's skill sets and experience in these areas.
How do we balance traffic between the servers? The methods that you select for load-balancing may affect your load-balancer choice. You may want users to stay on one machine for the length of their session. Failover and server monitoring are other considerations when balancing traffic in a cluster.
How will we keep our Web site content in sync between all of the servers and how will we deploy our Web site? This is potentially one of the most troublesome areas in Web site maintenance. Not only do you need to keep Web site content in sync, each server requires periodic configuration changes, patches, and hot fixes to be deployed as well.
I'll try to answer some of these questions by breaking the Web site infrastructure into major elements and then discussing their implementation. These major elements include tiered application architecture, server and hardware components, and cluster load balancing. What do you have when you have a Web site? You have a server or servers with operating systems, files, directories, configurations, hardware, and software. Your environment may be tiered, consisting of the Web server, application server, and a separate database server. Let's discuss tiered application architecture first.
Tiered Application Architecture
Before you begin scaling, you should limit the activities on your Web server to include only those related to the operation of the Web server software and ColdFusion MX application server. Other servers in your Web server farm will provide the remaining functionality for your Web site. This approach is called tiered architecture, and it can help provide more stability and scalability as well as improve your Web site performance. Figure 3.1 shows a three-tiered Web site architecture where ColdFusion MX is installed in the application server tier. This configuration can be accomplished by installing ColdFusion MX on a supported J2EE application server platform. For more about deploying ColdFusion MX on J2EE see Chapter 4, "Scaling with J2EE."
ColdFusion MX can also be deployed in distributed mode. Installing ColdFusion in distributed mode is now quite different than in prior versions of ColdFusion. ColdFusion MX in distributed mode can be clustered, but still is not the recommended solution for deploying ColdFusion. To set up ColdFusion MX in distributed mode, a connector needs to be installed on the Web server, allowing it to interact with the ColdFusion MX application server. The embedded version of JRun supplies a Java connector for this purpose. There is a TechNote article on Macromedia's Web site explaining this
Front-End Servers Versus Back-End Servers
If you are running your database server on the machine that is also running the Web server software and ColdFusion MX application server, it is time to move the database to another computer. Be sure to move all other services off of the Web server to other machines as well. Such services include the FTP server, mail server, network file server, backup server, and others.
Figure 3.1 Three-tiered server farm with ColdFusion MX installed on J2EE.
In a two-tiered architecture, the Web server, all its content, and Web pages are separate from the database server for a single Web site.
A tiered Web server network works best if it's divided into separate front- and back-end segments (see Figure 3.2).
The front end is the network segment between the public Internet and your Web cluster. The front end should be optimized for speed. Place a switched segment with lots of bandwidth in front of your Web servers. Your two primary goals on the front end are to avoid collisions and to minimize the number of hops (intervening network devices) between your Web servers and the public Internet.
If you are using a hardware-based load-balancing solution, you could have a hardware load balancer in front of your front-end network.
The back end is the network segment between your Web cluster and your supporting servers. Because your support servers need to talk only to your Web servers and your LAN, you don't need to make this segment directly accessible to the public Internet. In fact, you might do better to deliberately prevent any access to these machines from the public Internet by using private IP addresses or a firewall. Doing so can enable you to take advantage of useful network protocols that would be a security risk if they were made available to the public Internet. Be sure to spend some time trying to minimize collisions on your back-end network as well.
Figure 3.2 A sample two-tiered configuration for a Web cluster.
To protect the back-end servers from unwanted traffic, you can implement dual-homed servers. This strategy employs two network interface cards (NICs) in a Web server: one that speaks to the front end and one that speaks to the back end. This approach improves your Web server's network performance by preventing collisions between front-end and back-end packets.
If you choose to dual-home your Windows 2000 servers, you must contend with a particularly nasty problem known as dead gateway detection. Your server needs to detect whether a client across the Net has ended communications even though the request has not been fulfilled. This problem commonly occurs when a user clicks the Stop button on a Web browser in the middle of a download and goes somewhere else. If errors occur, Windows 2000 will eventually stop responding. The solution to this problem in Windows is an advanced networking topic and beyond the scope of this book. You can find information on this subject at the Microsoft Web site at www.microsoft.com/. If you want to find information about the concept in general, it is covered in RFC-816 (RFCs, or Requests for Comments, are specific standards for Internet communications). The full text of this RFC is available on many public sites throughout the Internet.
In a dual-homed configuration, depending on which type of load balancing you are using, you can use private, non-routable IP addresses to address machines on the back-end server farm (see Figure 3.3). Using private non-routables introduces another layer of complexity to your setup but can be a significant security advantage.
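As a quick sanity check when planning back-end addressing, the following minimal sketch uses Python's standard `ipaddress` module to test whether an address is private. The addresses shown are hypothetical examples, not values from this chapter. Note that `is_private` also reports some other reserved ranges (such as loopback) as private, so treat this as a rough check rather than a strict test for the classic private ranges.

```python
import ipaddress


def is_private_address(address: str) -> bool:
    """Return True if the address is in a private or otherwise reserved range."""
    return ipaddress.ip_address(address).is_private


# Hypothetical back-end addresses, for illustration only.
backend_hosts = ["10.0.0.5", "172.16.4.20", "192.168.1.10"]

for host in backend_hosts:
    print(host, "private" if is_private_address(host) else "public")
```

A back-end host that does not pass this check is reachable in principle from the public Internet and would need a firewall in front of it.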
Server and Hardware Components
Several considerations regarding server and hardware configurations crop up when you attempt to scale your site. These issues include the number of CPUs per box, the amount of RAM, and the hard drive speed and server configuration in general.
Figure 3.3 Using private nonroutable IP addresses to access back-end servers.
If your server is implemented with one CPU, turning this system into a two-CPU system does not double your performance, even if the two processors are identical. Adding a third CPU increases the performance even less, and the fourth CPU gives an even smaller boost. This is true because each additional CPU consumes operating system resources simply to keep each processor in sync with the others. Generally, if a two-processor machine is running out of processor resources, you're better off adding a second two-processor machine than adding two processors to your existing machine. To illustrate, see Figure 3.4, which shows performance gains when adding up to four CPUs on one server. Notice that the performance gains are not linear. Each additional CPU adds less performance than the previous CPU.
Figure 3.4 Performance gains by adding CPUs to a server are not linear.
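The diminishing returns described above can be illustrated with a toy model. The sketch below assumes that each additional CPU contributes a fixed fraction (here 0.8) of the gain delivered by the one before it. Both the model and the 0.8 figure are assumptions chosen for illustration; they are not measurements and are not the data behind Figure 3.4.

```python
def relative_throughput(cpus: int, efficiency: float = 0.8) -> float:
    """Throughput relative to one CPU, under an assumed geometric fall-off.

    The first CPU contributes 1.0, the second `efficiency`, the third
    `efficiency ** 2`, and so on. This is a toy model, not a benchmark.
    """
    return sum(efficiency ** k for k in range(cpus))


for n in range(1, 5):
    print(f"{n} CPU(s): {relative_throughput(n):.2f}x")

# Compare one four-CPU server with two two-CPU servers, assuming (optimistically)
# that a load balancer spreads work evenly across the two machines.
one_big = relative_throughput(4)
two_small = 2 * relative_throughput(2)
print(f"one 4-CPU box: {one_big:.2f}x, two 2-CPU boxes: {two_small:.2f}x")
```

Under these assumptions, the two smaller machines come out ahead, which matches the guideline above. Your own performance-test data should replace the assumed efficiency before you draw any conclusions.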
You might ask why you would want a two-processor machine at all. Why not use four one-processor machines instead? In an abstract measure of processor utilization, you might be right. But you also must deal with problems of user experience. Even though you're not using 100 percent of the second processor on the server, you are getting a strong performance boost. This performance boost might make a page that takes two seconds to process on a one-processor box take just over one second to process on a two-processor box. This amount can be the difference between a site that feels slow and a site with happy users. Another point in favor of two-processor machines: Many server-class machines, with configurations that support other advanced hardware features necessary for a robust server, support dual processors as part of their feature sets. If you're investing in server-class machines, adding a second processor before adding a second server can be cost effective.
Macromedia has worked with Intel and Microsoft to greatly improve multiple-processor performance in Windows 2000. If you are using Windows 2000 Server, Advanced Server, or DataCenter Server, you will see a far better performance improvement with additional processors than you would see if you were using NT 4.0. If you are developing a new site and you haven't yet chosen a Windows-based operating system, look into Windows 2000 for better performance.
Unix environments, on the other hand, are designed to take advantage of multiple processors and use them efficiently; ColdFusion takes advantage of the extra processing power Unix environments provide. To determine which way to scale a Unix environment (meaning whether to add processing power or another server), you should use your performance-test data and make your best judgment. However, while adding a few more processors will definitely increase your Unix site's performance, if you have only one Web server and that server goes down, no number of processors will beat having an additional machine for redundancy.
RAM is another hardware issue to consider. The bottom line is that RAM is cheap, so put as much RAM in each machine as you can afford. I recommend at least 512 MB. Additional RAM allows for more cached database queries, templates, and memory-resident data. The more RAM you have, the more information you will be able to cache in memory rather than on disk, and the faster your site will run.
Hard-disk drive speed is an often-overlooked aspect of server performance. Be sure to use fast SCSI drives for all your Web servers. Think about using a redundant array of independent disks, or RAID, on a dedicated drive controller for fastest access. Most production-level RAID controllers enable you to add RAM to the controller itself. This memory, called the first in first out (FIFO) cache, allows recently accessed data to be stored and processed directly from the RAM on the controller. You get a pronounced speed increase from this type of system because cached data does not have to be sought out and read from the drive.
If you use a RAID controller with a lot of RAM on board, you also should invest in redundant power supplies and a good uninterruptible power supply (UPS). The RAM on the RAID controller is written back to the hard disk only if the system is shut down in an orderly fashion. If your system loses power, all the data in RAM on the controller is lost. If you don't understand why this is bad, imagine that the records of your last 50 orders for your product were in the RAM cache, instead of written to the disk, when the power failed. The more RAM you have on the controller, the greater the magnitude of your problem in the event of a power outage.
The type of load-balancing technology you use has a big impact on the way you build your boxes. If you are using load-balancing technology that distributes traffic equally to all boxes, you want each of your servers to be configured identically. Most dedicated load-balancing hardware can detect a failed server and stop sending traffic to it; if your system works this way, and you have some extra capacity in your cluster, each box can be somewhat less reliable because if it goes down, the others can pick up the slack. But if you're using a simple load-balancing technology such as round robin DNS (RRDNS), which can't detect a down server, you need each box to be as reliable as possible because a single failure means some of your users cannot use your site.
Because you want your users to have the same experience on your site, regardless of which server responds to their requests, you need to keep your system configurations as close to identical as possible. Unfortunately, because of the advanced complexity of today's operating systems and applications, doing so is a lot harder than it sounds. Identical configurations also help to alleviate quality assurance issues for your Web site. If your servers are not identical, your Web site may not function the same way on these different servers. This condition makes managing your Web site unnecessarily complex. If you must have different servers in your configuration, plan to spend extra time performing quality assurance on your Web applications to ensure that they will run as expected on all servers in the cluster.
Considerations for Choosing a Load-Balancing Option
Before deploying your clustered server farm, you should consider how you want your servers to handle and distribute load. There are two methods for handling load: user-request distribution algorithms or a round robin configuration. User-request distribution algorithms can distribute user requests to a pre-specified server, to a server with the least load, or through other methods. A round robin configuration passes each user request to the next available server. This is sometimes performed regardless of the selected server's current load. Round robin configurations may involve DNS changes. Consult with your network administrator when discussing this option.
Round Robin DNS
The round robin DNS (RRDNS) method of load balancing takes advantage of the way the Internet's domain name system handles multiple IP addresses assigned to the same domain name. To configure round robin DNS, you need to be comfortable with making changes to your DNS server.
Be careful when making DNS changes. Making an incorrect DNS change is roughly equivalent to sending out incorrect change of address and change of phone number forms to every one of your customers and vendors and having no way to tell the people at the incorrect postal destination or the incorrect phone number to forward the errant mail and calls back to you. If you broadcast incorrect DNS information, you could cut off all traffic to your site for days or weeks.
Simply put, RRDNS centers around the concept of giving your public domain name (www.mycompany.com) more than one IP address. You should give each machine in your cluster two domain names: one for the public domain and one that lets you address each machine uniquely. See Table 3.1 for some examples.
Table 3.1 Examples of IP Addresses
When a remote domain name server queries your domain name server for information about www.mycompany.com (because a user has requested a Web page and needs to know the address of your server), your DNS returns one of the multiple IP addresses you've listed for www.mycompany.com. The remote DNS then uses that IP address until its DNS cache expires, upon which it queries your DNS again, possibly getting a different IP address. Each sequential request from a remote DNS server receives a different IP address as a response.
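The rotation described above can be modeled in a few lines. This is a simplified sketch of the behavior as the text describes it, not an implementation of a real DNS server; the class name and the IP addresses (taken from a range reserved for documentation) are hypothetical.

```python
from itertools import cycle


class RoundRobinDNS:
    """Toy model: hands out the next address in the list on each query."""

    def __init__(self, addresses):
        self._rotation = cycle(list(addresses))

    def resolve(self) -> str:
        # Each sequential query receives the next address in the rotation.
        return next(self._rotation)


# Hypothetical addresses for www.mycompany.com.
dns = RoundRobinDNS(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
for _ in range(4):
    print(dns.resolve())
```

Notice that nothing in this rotation looks at server load or availability, which is the root of the weaknesses discussed next.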
Round robin DNS is a crude way to balance load. When a remote DNS gets one of your IP addresses in its cache, it uses that same IP address until the cache expires, no matter how many requests originate from the remote domain and regardless of whether the target IP address is responding. This type of load balancing is extremely vulnerable to what is known as the mega-proxy problem. Internet Service Providers (ISPs) manage user connections by caching Web site content and rotating their IP addresses among users using proxy servers. This allows the ISP to manage more user connections than it has available IP addresses. A user on your e-commerce site may be in the middle of checking out when the ISP changes the user's IP address. The user's connection to your Web site would be broken and the shopping cart would be empty. Similarly, an ISP's cached content may point to only one of your Web servers. If that server crashes, any user who tries to access your site from the ISP is still directed to that down IP address. The user's experience will be that your site is down, even though you might have two or three other Web servers ready to respond to the request.
Because DNS caches generally take one to seven days to expire, any DNS change you make to an RRDNS cluster will take a long time to propagate. This means that in the case of a server crash, removing the down server's IP address from your DNS server doesn't solve the mega-proxy problem because the IP address of the down server is still in the ISP's DNS cache. You can partially address this problem by setting your DNS record's time to live (TTL) to a very low value, so that remote DNS servers are instructed to expire their records of your domain's IP address after a brief period of time. This solution can cause undue load on your DNS, however. Even with a low TTL, an IP address you remove from the RRDNS cluster still might be in the cache of some remote DNS for a week or more.
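To see why a cached address outlives a change on your own DNS server, consider this minimal sketch of a remote resolver that honors a TTL. The class, the TTL value, and the addresses are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
class CachingResolver:
    """Toy model of a remote DNS that caches one answer until its TTL expires."""

    def __init__(self, lookup, ttl_seconds: float):
        self._lookup = lookup          # callable that asks the authoritative DNS
        self._ttl = ttl_seconds
        self._cached = None
        self._expires_at = 0.0

    def resolve(self, now: float) -> str:
        # Only go back to the authoritative server once the cache has expired.
        if self._cached is None or now >= self._expires_at:
            self._cached = self._lookup()
            self._expires_at = now + self._ttl
        return self._cached


# Hypothetical authoritative answers: the first server, then a replacement.
answers = iter(["192.0.2.1", "192.0.2.2"])
resolver = CachingResolver(lambda: next(answers), ttl_seconds=300)

print(resolver.resolve(now=0))     # fetched from the authoritative DNS
print(resolver.resolve(now=299))   # still served from cache, even if that server is down
print(resolver.resolve(now=300))   # cache expired, so the resolver asks again
```

A lower TTL shortens the window in which a dead address keeps being handed out, at the cost of more queries against your DNS, and only for resolvers that actually honor the TTL.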
User-Request Distribution Algorithms
Many load-balancing hardware and software devices offer customizable user-request distribution algorithms. Users will be directed to an available server based upon a particular algorithm. These methods offer more alternatives and are preferable to using RRDNS configurations.
User-request distribution algorithms can include the following:
Users are directed to the server with the least amount of load or CPU utilization.
Clustered servers are set up with a priority hierarchy. The available server with the highest priority handles the next user request.
Web site objects can be clustered and managed when deployed with J2EE. Objects include Enterprise Java Beans (EJBs) and servlets.
Web server response time is used to determine which server handles the user's request. For example, the fastest server in the cluster handles the next request.
The distribution algorithms listed above are not meant to be a complete list, but they do illustrate that many methods are available to choose from. They offer very granular and intelligent control over request distribution in a cluster. Choosing your load-balancing device may depend on deciding among these methods for your preferred cluster configuration.
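Three of the selection rules listed above (least load, priority, and fastest response) can be sketched as simple functions over a list of servers. This is an illustrative model only; the `Server` fields, the function names, and the sample figures are assumptions, and a real load balancer would gather these metrics through its own monitoring.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    load: float         # assumed metric, e.g. CPU utilization from 0.0 to 1.0
    priority: int       # lower number means higher priority
    response_ms: float  # most recently measured response time
    available: bool = True


def _available(servers):
    """Return only the servers that can take traffic; fail if there are none."""
    pool = [s for s in servers if s.available]
    if not pool:
        raise RuntimeError("no servers available")
    return pool


def least_load(servers) -> Server:
    return min(_available(servers), key=lambda s: s.load)


def highest_priority(servers) -> Server:
    return min(_available(servers), key=lambda s: s.priority)


def fastest_response(servers) -> Server:
    return min(_available(servers), key=lambda s: s.response_ms)


# Hypothetical cluster; web3 is down and is never selected.
cluster = [
    Server("web1", load=0.70, priority=1, response_ms=120.0),
    Server("web2", load=0.35, priority=2, response_ms=95.0),
    Server("web3", load=0.50, priority=3, response_ms=60.0, available=False),
]

print(least_load(cluster).name)
print(highest_priority(cluster).name)
print(fastest_response(cluster).name)
```

Unlike RRDNS, every rule here filters out unavailable servers first, which is what lets the cluster keep serving requests when one machine fails.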
Session State Management
Another load-balancing consideration is session-aware or "sticky" load balancing. Session-aware load balancing keeps each user on the same server as long as their session is active. This is an effective approach for applications requiring that a session's state be maintained while processing the user's requests. It fails, however, if the server fails. The user's session is effectively lost, and even if the user fails over to an alternative server in the cluster, the session restarts and all information accumulated by the original session no longer exists. Storing session information centrally for all clustered servers helps alleviate this issue. See Chapter 5, "Managing Session State in Clusters," for more information on implementing session state management.
Consider how your Web site responds to server or application failover when you're designing your cluster server farm. An effective strategy will allow seamless failover to an alternative server without the user knowing that a problem occurred. Utilizing a load-balancing option with centralized session state management can help maintain state for the user while the user's session is transferred to a healthy machine.
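The combination of sticky routing and centrally stored session data can be sketched as follows. This is a minimal in-memory model under stated assumptions: the class and method names are hypothetical, and the `session_store` dictionary stands in for whatever shared store (for example, a database) your servers would actually use.

```python
class StickyBalancer:
    """Toy session-aware balancer with a shared session store."""

    def __init__(self, servers):
        self.healthy = list(servers)
        self.assignments = {}    # session ID -> server currently handling it
        self.session_store = {}  # stand-in for a central store all servers can read
        self._next = 0

    def route(self, session_id: str) -> str:
        server = self.assignments.get(session_id)
        # New session, or its server has failed: pick a healthy server.
        if server not in self.healthy:
            if not self.healthy:
                raise RuntimeError("no healthy servers")
            server = self.healthy[self._next % len(self.healthy)]
            self._next += 1
            self.assignments[session_id] = server
        return server

    def mark_down(self, server: str) -> None:
        """Take a failed server out of rotation."""
        self.healthy.remove(server)


balancer = StickyBalancer(["web1", "web2"])
first = balancer.route("session-abc")
balancer.session_store["session-abc"] = {"cart": ["widget"]}

balancer.mark_down(first)                # simulate a server failure
second = balancer.route("session-abc")   # the user moves to a healthy server
print(first, "->", second, balancer.session_store["session-abc"])
```

Because the cart lives in the shared store rather than in one server's memory, the second server can pick up the session where the first one left off.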
Failover considerations also come into play with Web site deployment. You can shut down a server that is ready for deployment without having to shut down your entire Web site, enabling you to deploy to each server in your cluster, in turn, while maintaining an active functioning Web site. As each server is brought back into the cluster, another is shut down for deployment.
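The server-by-server deployment described above follows a simple loop. The sketch below is illustrative only: `rolling_deploy` and the `deploy` callback are hypothetical names, and in practice removing a server from rotation and restoring it would be done through your load balancer's own controls.

```python
def rolling_deploy(servers, deploy) -> int:
    """Deploy to one server at a time and return the lowest number left online."""
    in_rotation = set(servers)
    if len(in_rotation) < 2:
        raise ValueError("need at least two servers to keep the site up")

    min_online = len(in_rotation)
    for server in servers:
        in_rotation.remove(server)   # stop sending traffic to this server
        min_online = min(min_online, len(in_rotation))
        deploy(server)               # hypothetical deployment step
        in_rotation.add(server)      # bring it back into the cluster
    return min_online


deployed = []
lowest = rolling_deploy(["web1", "web2", "web3"], deployed.append)
print(deployed, "lowest number online:", lowest)
```

The returned value is a reminder to check capacity: during the deployment, the remaining servers must be able to carry the full load on their own.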
Mixed Web Application Environments
If your Web site consists of mixed applications and application servers, choosing your load-balancing solution becomes even more difficult. Let's take an example where your current Web site is being rewritten and transformed from an Active Server Pages (ASP) Web site to a ColdFusion (CFML) Web site. Your current Web site is in the middle of this transformation, where ASP pages co-exist with CFML pages. Not all load-balancing solutions will be able to effectively handle server load at the application level. Some will be able to handle load at the Web-server level only. In addition, session state management may not work as planned. Because ASP sessions and ColdFusion sessions are not necessarily known between the two systems, you may want to implement session-aware load balancing in this "mixed" environment. This type of session-aware load balancing could consist of cookies or other variables that both applications can read.
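One way to picture a cookie that both applications can read is a plain cookie naming the server that owns the user's session. The sketch below parses such a cookie with Python's standard `http.cookies` module. The cookie name `lb_server` and the server names are hypothetical, and the other cookies in the example simply stand in for whatever session cookies each application sets on its own.

```python
from http.cookies import SimpleCookie

STICKY_COOKIE = "lb_server"  # hypothetical cookie name, readable by either application


def server_from_cookie(cookie_header, servers):
    """Return the server named in the sticky cookie, or None if absent or unknown."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    morsel = cookie.get(STICKY_COOKIE)
    if morsel is not None and morsel.value in servers:
        return morsel.value
    return None


servers = ["web1", "web2"]
header = "ASPSESSIONID=abc; lb_server=web2; CFID=123"
print(server_from_cookie(header, servers))
```

When the function returns `None`, the request carries no usable sticky cookie, and the load balancer would fall back to its normal distribution algorithm and set the cookie on the response.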