A plan to improve web server availability:

Assume one has a backup webserver with similar capacity to the production server. Redundancy can be achieved by adding a second A record for the www name, pointing to the backup webserver. The DNS server will then return both records for each query, in random order. If both webservers are up, there is obviously no problem. If one is down, we anticipate that browsers will go on to the second address when the first does not respond.

The question is, how universal is this ability? I have been doing some experiments and I believe it is near universal among recent browsers. I used the latest versions of six browsers on various machines about the office:

  MSIE 8.0.6001.18702
  Opera 10.10
  Safari 4.0.4 531.21.10
  Firefox 3.5.30729
  Chrome 4.0.239.30
  Konqueror 4.3.3

The worst result when one server was down was a delay of about 30 seconds before the page was loaded. I conclude that some browsers have a 30 second timeout before trying the next IP address.

FF 2.0 and Lynx never switched. I was able to test Safari 4.0, FF 2.0 and Chrome 3.0 on Adobe Browserlab, and those all failed to switch within the Browserlab timeout. On my PC, Safari 4.0.4 did manage to switch after about 30 seconds, so perhaps Browserlab is too quick to give up.

During periods when one server was down, users of non-switching browsers would have a 50% chance of getting the bad server in an individual browser session, but the chance of one of two servers being down is about double the chance of one server being down. This is close to a wash, then, for the older browsers and a pretty big win for the newer ones. It is true that a user with an older browser could close his browser and wait 5 minutes (our DNS TTL) for another chance, but probably most users wouldn't do that.

This isn't perfect, but I think it achieves some valuable redundancy at low cost, and does not introduce any single point of failure that didn't exist before.
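The "close to a wash" arithmetic can be checked with a small sketch. It rests on assumptions not stated in the post: each server is down independently with the same probability p, a non-switching browser picks one of the two addresses at random and never retries, and a switching browser fails only when both servers are down. The function name and the 1% downtime figure are purely illustrative.

```python
def failure_probabilities(p):
    """Chance that a visitor cannot load the site, for per-server downtime p.

    Assumes independent failures with equal probability p on both servers
    (an assumption made for this sketch, not a measured figure).
    """
    exactly_one_down = 2 * p * (1 - p)
    both_down = p * p
    return {
        # One server, one A record: the site is down whenever the server is.
        "single_server": p,
        # Two A records, browser never tries the second address:
        # half the time it lands on the dead server when exactly one is down.
        "two_servers_non_switching": 0.5 * exactly_one_down + both_down,
        # Two A records, browser falls through to the second address:
        # the visitor only fails when both servers are down.
        "two_servers_switching": both_down,
    }


if __name__ == "__main__":
    # 1% downtime is an illustrative value only.
    for name, prob in failure_probabilities(0.01).items():
        print(f"{name}: {prob:.6f}")
```

Under these assumptions the non-switching case works out to exactly p, the same as a single server, while the switching case drops to p squared, not counting the delay of about 30 seconds before the switch.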
The scheme does not require any special topology, hardware or skills either.

On our web site all internal links are relative. This means that once a browser session finds a working server, it will stay with that server, so there is only one delay per visitor. If the links were absolute, there would be a delay on 50% of page views, which is not very attractive.

See also:

  http://www.tenereillo.com/GSLBPageOfShame.htm

but he doesn't name any compatible browsers, and back in 2004 I don't think there were many.

I did a similar experiment some years ago with mail clients. The results were much worse, but maybe I should try again:

  http://www.nber.org/redundant.html

Comments or suggestions?

Daniel Feenberg
feenberg-fCu/yNAGv6M at public.gmane.org