Website Performance and Optimization

A couple of months ago, I noticed that I was getting pretty close to using up my monthly bandwidth allocation for my server, and that was a surprise. I run several blogs that get quite a few hits, but I didn't think I was anywhere near going over my 250 GB allotment. So I decided to spend a little time optimizing my server to get the most performance out of my little box. Jeff Atwood's wonderful blog entry about Reducing Your Website's Bandwidth Usage inspired me to write about my experience and what I ended up doing to squeeze the most out of my server.

I had done some of the obvious things that people typically do to minimize traffic to their site. First and foremost was outsourcing my RSS feeds to FeedBurner. I've been using FeedBurner for several years now, after I learned the hard way how badly programmed a lot of the RSS readers out there were. I had to ban several IP addresses because they were fetching my full feed every 2 seconds. I'm hoping that was some bad configuration on their side, but who knows. Maybe it was an RSS DoS attack :).

After taking a little time to see what was using up the bandwidth, I discovered several things that needed immediate attention. The most pressing was the missing HTTP compression. It looks like an Apache or PHP upgrade I did in the past few months had ended up disabling the Apache module for GZIP compression, so all the traffic was going out uncompressed. HTTP compression delivers amazing speed enhancements via file size reduction, and most if not all browsers support it, so I enabled compression for all content of type text/html and for all CSS and JS files.

Some older browsers don't handle compressed JS and CSS files, but anything from IE 6 onward seemed to handle JS/CSS compression just fine, and my usage tracking (pictured above) indicated that most of my IE users were using IE 6 and above.

Enabling HTTP compression shrank my blog index page by 78%, resulting in a statistical performance improvement of almost 4.4x. While your mileage may vary, the resulting performance improvement got me on the Top 20 column at GrabPERF almost every single day.
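To get a feel for numbers like these on your own pages, here is a minimal sketch that gzips a payload and reports the savings. The sample page is a made-up stand-in, not my actual index page, and the function name is just something I picked for illustration. For reference, a 78% reduction in bytes works out to the original being about 4.5 times the compressed size (1 / 0.22).

```python
import gzip


def compression_stats(payload: bytes) -> tuple[float, float]:
    """Return (percent_saved, size_ratio) for gzip-compressing payload."""
    compressed = gzip.compress(payload)
    percent_saved = 100.0 * (1 - len(compressed) / len(payload))
    size_ratio = len(payload) / len(compressed)
    return percent_saved, size_ratio


# Hypothetical stand-in for a blog index page; repetitive markup compresses well,
# so real pages will usually save less than this sample does.
sample_page = (
    b"<html><body>"
    + b"<div class='post'><p>Hello, world.</p></div>\n" * 200
    + b"</body></html>"
)

saved, ratio = compression_stats(sample_page)
print(f"saved {saved:.1f}% of bytes; original is {ratio:.1f}x the compressed size")
```

Feeding it the bytes of one of your own pages (for example, read from a saved copy of the HTML) gives a rough idea of what the server-side compression module will save on the wire.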

Another issue I had was the number of images being loaded from my web server. As most of you already know, browsers will typically limit themselves to 2 connections per hostname, so if a webpage has 4 CSS files, 2 JS files and 10 images, you are loading a lot of content over those 2 connections. So I used a simple CNAME trick to create image.j2eegeek.com to complement http://www.j2eegeek.com and started serving images from image.j2eegeek.com. That did help, and I considered doing something similar for CSS and JS files, but decided instead to outsource image handling to Amazon's S3.
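The hostname trick boils down to rewriting image URLs so they point at the second hostname. This is only a rough sketch of the idea, not the way my blog templates actually do it; the function name, the extension list and the example paths are all made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed set of image extensions; adjust to whatever your site serves.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")


def to_image_host(url: str, image_host: str = "image.j2eegeek.com") -> str:
    """Point image URLs at the image hostname; leave everything else alone."""
    parts = urlsplit(url)
    if parts.path.lower().endswith(IMAGE_EXTENSIONS):
        parts = parts._replace(netloc=image_host)
    return urlunsplit(parts)


# Hypothetical paths, purely for illustration.
print(to_image_host("http://www.j2eegeek.com/blog/logo.png"))
print(to_image_host("http://www.j2eegeek.com/blog/style.css"))
```

Since both hostnames point at the same server, nothing has to move on disk; the browser simply opens a separate pair of connections for the second name.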

Amazon's S3, or Simple Storage Service, is a highly scalable, reliable, fast and relatively inexpensive data storage infrastructure. S3 allows you to create a 'bucket', which is essentially a folder that must have a globally unique name and cannot have any sub-buckets or directories, so it basically emulates a flat directory structure. Everything you put in your bucket and make publicly available is accessible via HTTP using the URL http://s3.amazonaws.com/bucketname/itemname.png. Amazon's S3 web service also lets you address the bucket through the HTTP Host header, so the URL above would become http://bucketname.s3.amazonaws.com/itemname.png. You can take this further if you have access to your DNS server. In my case, I created a bucket in S3 called s3.j2eegeek.com. I then created a CNAME in my DNS for s3.j2eegeek.com and pointed it to s3.amazonaws.com. And presto: http://s3.j2eegeek.com/ essentially resolves to http://s3.amazonaws.com/s3.j2eegeek.com/.

I then used John Spurlock's NS3 Manager to get my content onto S3. NS3 Manager is a simple tool (Windows only) to transfer files to and from an Amazon S3 storage account, as well as manage existing data. It is an attempt to provide a useful interface for some of the most basic S3 operations: uploading and downloading, managing ACLs, system metadata (e.g. content-type) and user metadata (custom name-value pairs). In my opinion, NS3 Manager is the best tool out there for getting data in and out of S3, and I have used close to 20 web-based, browser plug-in and desktop applications.
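The three ways of addressing a public S3 object described above can be summed up in a few lines. This is just a sketch that builds the URL strings; the function name is made up, and the CNAME form only makes sense when the bucket is named after the hostname, as with my s3.j2eegeek.com bucket.

```python
from urllib.parse import quote


def s3_urls(bucket: str, key: str) -> dict[str, str]:
    """Build the three URL forms for a publicly readable S3 object."""
    item = quote(key)  # percent-encode characters that are not URL-safe
    return {
        # Bucket name as the first path segment.
        "path_style": f"http://s3.amazonaws.com/{bucket}/{item}",
        # Bucket name carried in the Host header.
        "virtual_hosted": f"http://{bucket}.s3.amazonaws.com/{item}",
        # Assumes the bucket name is itself a hostname with a CNAME
        # pointing at s3.amazonaws.com.
        "cname": f"http://{bucket}/{item}",
    }


for style, url in s3_urls("s3.j2eegeek.com", "itemname.png").items():
    print(f"{style}: {url}")
```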

In addition, I decided to try out a couple of the PHP accelerators out there to see if I could squeeze a little more performance out of my web server. Compile caches are a no-brainer, and I saw a decent performance improvement in my PHP applications. I blogged about this topic in a little more detail, and you can read that if you care about PHP performance.

The last thing I did probably had the biggest impact after enabling HTTP compression: moving my Tomcat application server off my current Linux box and onto Amazon's EC2. Amazon's EC2, or Elastic Compute Cloud, is a virtualized compute cloud available to you for $0.10 per instance-hour. I've been playing around with EC2 for a while now and just started using it for something real. I have tons of notes that I took during my experimentation with EC2, where I took the stock Fedora Core 4 images from Amazon and turned that server into my Java application server running Tomcat and Glassfish. I also created my own Fedora Core 6 and CentOS 4.4 images and deployed them as my server. My current AMI running my Java applications is a Fedora Core 6 image, and I am hoping to get RHEL 5.0 deployed in the next few weeks, but all of that will be a topic for another blog.

In conclusion, HTTP compression offered me the biggest reduction in bandwidth utilization. And it is so easy to set up on Apache, IIS or virtually any Java application server that it is almost criminal not to do so. 🙂 Maybe that's overstating it a bit, but there are some really simple ways to optimize your website, and you too can make your site hum and perform like you’ve got a cluster of servers behind it.

Daily del.icio.us for Mar 10, 2007 through Mar 15, 2007

  • video.onflex.org – video.onflex.org is maintained by Mike Chambers and Ted Patrick of Adobe. It is focused on providing videos about developing with Adobe Flex, ActionScript and Apollo.
  • How to Use Java at a Startup – Cardsharp on Software – The embarrassment of riches in the Java Open Source movement makes it a slam dunk for startups. The fact that you can find an Open Source framework for every conceivable use means that you can focus on your core business instead of on plumbing
  • InfoQ: JP Rangaswami on open source in the enterprise & the future of information – CIO JP Rangaswami explains how open source became a corporate IT strategy at investment bank Dresdner Kleinwort Wasserstein and why CIOs of major enterprises should open source for software development initiatives. JP also explains his vision of four pill
  • Ajaxian » Compressed versions of Prototype – John-David Dalton has spent some time compressing Prototype in a couple of ways to keep your download time to a minimum.
  • jsjuicer – jsjuicer is a free tool for safely reducing the size of your JavaScript files. Reducing the size and number of the JavaScript files included in a web page will enable it to load faster

New Theme for this blog: NigaRila

A lot of you read this blog using an RSS reader, so you probably don’t see the theme that adorns it, but I just switched this blog to the NigaRila theme by Sadish Bala. I have been looking for a great 3-column theme, and Sadish has created one of the best looking and most usable themes out there.

NigaRila is an awesome theme for WordPress 2.0 that has 3 columns on the front page, with a fixed width of 900 pixels, and 2 columns on all other pages. The theme has two sidebars on the right side. If you have the sidebar widgets plugin installed, you can use it for both of them. NigaRila produces valid XHTML and offers a great deal of functionality. I’ve made a couple of modifications to add support for a few other plugins, but most of the functionality you see on my blog is out of the box, including the archive and contact pages. Sadish wants $15.00 for this theme and I think it’s well worth the cost.

In addition to NigaRila, Sadish recently created a new WordPress theme called Intense after learning about my wife’s first cousin’s son, Gavin Winslow. Sadish was moved by Gavin’s story and decided to help by adding a link from his theme to Gavin’s site at www.savebabygavin.com. This has resulted in Gavin’s site getting thousands of visits from people who normally wouldn’t know about Gavin. Thank you, Sadish, for helping raise awareness about Gavin’s story, bringing additional visibility to his site and creating a great WordPress theme in the process.

The impact of Scoble

It’s great to see all the coverage of Scoble leaving Microsoft. For the uninitiated, Robert Scoble is a very popular blogger who works for Microsoft, and he achieved what millions and millions of dollars could not. Through his blog, Scoble humanized Microsoft and offered some much needed transparency that led a lot of people to rethink their assessment of Microsoft as an evil company (Disclaimer: my brother works for Microsoft). By opening up Microsoft via Channel 9 and getting other people within Microsoft (3000 by latest count) to blog, Scoble gave people direct access into Microsoft and peeled away the facade to show Microsoft as a company of people, where product decisions get made by developers and managers coming to some consensus and not via some master evil plan. For the record, Scoble is leaving Microsoft to join a startup in San Francisco named PodTech.net, where he will serve as vice president of content media and help get PodTech.net a ton of exposure. Congratulations to the PodTech.net team: they are getting a great person, and the added bonus is that all this publicity is a huge plus. You can’t buy publicity like this.



I knew this story would be great fodder for the blogosphere, but it’s great to see ‘real’ news organizations like AP, Reuters and BBC News covering his departure. Who would have guessed just a year ago that a blogger’s departure from a company would generate this much attention from the media? I guess this just reaffirms the power of the blog and how important blogs will continue to be as companies move forward to get their message out and market their brand. I can see a future where bloggers will be like free agents in sports, blogging for the highest bidder. 🙂 For the record, I am willing to leave my current employment for a seven figure salary and I’ll bring my 500 blog readers with me. 🙂

I wonder what Microsoft will do to replace Scoble. I do hope they replace him with another blogger, as companies need a public face and I think it’s crucial to have that one blog that’s the face of the company. For Microsoft, Scoble has been that, just as Jonathan Schwartz is for Sun, and there are countless other examples. I guess the one positive for Microsoft is that people will now finally believe Scoble that Vista does indeed rock. 🙂

Blog Post from Microsoft Word 2007

Just downloaded the latest beta of Microsoft Office 2007 and am testing out the blog posting feature. Microsoft Word has added functionality that allows you to create blog posts and publish them directly to your blog. It supports most of the blog platforms out of the box, including MSN Spaces, Blogger, SharePoint, Community Server and Other, which covers any blog platform that supports the MetaWeblog or Atom API.

Word 2007 blog setup screen

The HTML created by Word 2007 is also pretty clean, and I think I can really get used to this as my primary blogging interface. I still need to play around with some of the settings to customize the layout. I am hoping I can import or point to a CSS document and have it format the contents of the blog post. Haven’t found the option yet, but I am hoping it’s here somewhere. If not, it would be a great feature for the final GA release.

Blog setup in Word 2007

Some of the bugs I’ve discovered so far are:

  • Posting doesn’t work – Pretty critical bug, I would think. 🙂 I need to reverify my settings, but I am able to post a blog entry as a draft and not publish it directly.
  • The post date is set to December 31, 1969.
  • Images included from Flickr (or anything off the Internet addressable by a URL) show in the Word interface, but the blog entry doesn’t include the img tag reference.

I’m sure that’s not a complete list of bugs but Word does show promise as a decent WYSIWYG blog editor.
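The December 31, 1969 post date in the list above looks a lot like a timestamp of zero (the Unix epoch) being displayed in a US time zone. That is only my guess at the cause, and I have no idea what Word is actually doing internally. A quick sketch of why a zero timestamp shows up as that date, assuming a fixed UTC-6 offset purely for illustration:

```python
from datetime import datetime, timedelta, timezone

UNSET_TIMESTAMP = 0  # assumption: the date was never filled in, so it is 0

# Assumed fixed offset (US Central, ignoring daylight saving) for illustration.
us_central = timezone(timedelta(hours=-6))

utc_date = datetime.fromtimestamp(UNSET_TIMESTAMP, tz=timezone.utc).date()
local_date = datetime.fromtimestamp(UNSET_TIMESTAMP, tz=us_central).date()

print(utc_date)    # 1970-01-01
print(local_date)  # 1969-12-31
```

Any time zone west of UTC gives the same result, which is why this date tends to pop up whenever a date field is left unset.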


WordPress Upgrade Notes

I finally upgraded my blog to WordPress 2.0 a few weekends ago and am only now getting around to blogging about it. I had blogged previously about issues I had upgrading my blog software, but those issues were related to a MySQL upgrade and version incompatibilities. To get around the database issues, I used MySQL Administrator to back up the entire database from the old server and restored it on the new server. Not sure why that worked, but it did and didn’t create any issues. In the past, I had backed up the database using MySQL Administrator and then used the MySQL Query Browser to create the new database and insert the data. I’ve spent a lot of time figuring out the differences between the versions of MySQL and the issues I had, and will post a lengthy (and boring) blog entry about that in the near future.

After upgrading the database, I upgraded my blog software to WordPress 2.0 and everything worked with the exception of a few minor issues.  One of the biggest issues I ran into was an internal rewrite issue where my blog is deployed under /blog and I had a page whose slug was blogs-i-read.  WordPress was generating a 404 for that page and I fixed this issue by simply renaming the slug to remove the use of the word blog. This issue is fixed in the latest maintenance release of WordPress which currently happens to be 2.0.1.  In addition, I had a problem with the awesome WordPress FeedBurner Plugin that was also related to rewrite rules but Steve Smith had already fixed that problem in the last revision of the plugin.

Since WordPress 2.0 had been working pretty smoothly and 2.0.1 was working in my development area, I applied that today to this site and it worked like a charm.  The biggest improvement (besides the bug fixes) appears to be performance.  It’s too early to tell if the numbers will hold up but here is a performance chart that shows dramatic improvement in performance since the upgrade.

WordPress 2.0.1 Performance
Chart courtesy of GrabPERF

I hope this performance improvement lasts, as WordPress 2.0 was a lot slower than WordPress 1.5 with the WP-Cache plugin. Incidentally, the WP-Cache plugin has not been upgraded to work with WordPress 2.0, but if this performance holds, who needs WP-Cache 🙂

I am such a loser :)

Discovered EgoSurf via Steve Rubel’s blog, so I gave it a shot and scored 9130 points. I need a life 🙂 The idea is simple: you enter your name and your blog’s web address, and EgoSurf searches Google (and also MSN, Yahoo, Technorati and Del.icio.us) and finds links to your blog, which are used to calculate your ego ranking. Pretty interesting and a great way to kill a few hours at work 😉 EgoSurf is the creation of Jason Hyland.
