I love Wordle

Wordle is a toy for generating “word clouds” from text that you provide and I love playing with it. Every couple of weeks, I’ll stumble onto Wordle again and fall in love all over again :). Here is a tag cloud of all of the tags from my del.icio.us bookmarks.

Go create your tag/word clouds at http://www.wordle.net/.

iTunes & Ehcache – You figure it out

Thanks to Greg Luck, I discovered something new in iTunes called My iTunes that lets you export your purchases as RSS or as a widget to display on your website. Check out a sample of my purchases below. With DRM-free music available from Amazon, I’m not buying anything from iTunes that’s also available on Amazon. By the way, Greg Luck is one of the lead developers of Ehcache, which IMHO is the best and most widely used distributed caching framework for Java.

http://phobos.apple.com/WebObjects/MZStoreServices.woa/wa/widget?type=1&sf=143441

 

Books I am currently reading

As I’ve said before, I am a voracious book collector and (usually) reader as well. I love books and could spend hours reading. With a demanding job, a wife and a young daughter, I’ve built up quite a backlog and hope to get to most of these books in the next few weeks. Here are the books on my current ‘reading’ list:

Technical Books

Scripting in Java: Languages, Frameworks, and Patterns
By Dejan Bosanac
Addison-Wesley Professional
ISBN: 0321321936
Publication Date: August 2007
Price: $49.99   $25.97
Rating: (Total Reviews: 2)
Sales Rank: 355093


Spring in Action
By Craig Walls, Ryan Breidenbach
Manning Publications
ISBN: 1933988134
Publication Date: August 2007
Price: $49.99   $28.43
Rating: (Total Reviews: 43)
Sales Rank: 6432


iText in Action: Creating and Manipulating PDF
By Bruno Lowagie
Manning Publications
ISBN: 1932394796
Publication Date: December 2006
Price: $49.99   $31.06
Rating: (Total Reviews: 4)
Sales Rank: 30402


Next Generation Java Testing: TestNG and Advanced Concepts
By Cédric Beust, Hani Suleiman
Addison-Wesley Professional
ISBN: 0321503104
Publication Date: October 2007
Price: $49.99   $29.94
Rating: (Total Reviews: 2)
Sales Rank: 140707


Prototype and script.aculo.us: You Never Knew JavaScript Could Do This!
By Christophe Porteneuve
Pragmatic Bookshelf
ISBN: 1934356018
Publication Date: January 2008
Price: $34.95   $23.07
Rating: (Total Reviews: 0)
Sales Rank: 49628


Non-Technical Books

The Big Switch: Rewiring the World, from Edison to Google
By Nicholas Carr
W. W. Norton
ISBN: 0393062287
Publication Date: January 2008
Price: $25.95   $17.13
Rating: (Total Reviews: 1)
Sales Rank: 1672


The Nine: Inside the Secret World of the Supreme Court
By Jeffrey Toobin
Doubleday
ISBN: 0385516401
Publication Date: September 2007
Price: $27.95   $16.22
Rating: (Total Reviews: 67)
Sales Rank: 9


Daily del.icio.us for Apr 03, 2007

Website Performance and Optimization

A couple of months ago, I was surprised to notice that I was getting pretty close to using up the monthly bandwidth allocation for my server. I run several blogs that get quite a few hits, but I didn't think I was anywhere near my 250 GB allotment. So I decided to spend a little time optimizing my server to get the most performance out of my little box. Jeff Atwood's wonderful blog entry about Reducing Your Website's Bandwidth Usage inspired me to write about my experience and what I ended up doing to squeeze the most out of my server.

I had already done some of the obvious things that people typically do to minimize traffic to their site. First and foremost was outsourcing my RSS feeds to FeedBurner. I've been using FeedBurner for several years now, after I learned the hard way how badly programmed a lot of the RSS readers out there were. I had to ban several IP addresses because they were fetching my full feed every 2 seconds. I hope that was just bad configuration on their side, but who knows. Maybe it was an RSS DoS attack :).

After taking a little time to see what was using up the bandwidth, I discovered several things that needed immediate attention. The biggest was missing HTTP compression. It looks like an Apache or PHP upgrade I did in the past few months had disabled the Apache module for GZIP compression, so all the traffic was going out uncompressed. HTTP compression delivers amazing speed improvements by reducing file size, and most if not all browsers support it, so I enabled compression for all content of type text/html and for all CSS and JS files.

Some older browsers don't handle compressed JS and CSS files, but IE 6 and later seemed to handle JS/CSS compression just fine, and my usage tracking (pictured above) indicated that most of my IE users were using IE 6 and above.

Enabling HTTP compression shrank my blog index page by 78%, resulting in a performance improvement of almost 4.4x. While your mileage may vary, the resulting improvement got me into the Top 20 column at GrabPERF almost every single day.
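To make the numbers above a bit more concrete, here is a minimal Python sketch of the two ideas involved: deciding whether a response is worth compressing, and measuring what gzip saves. It is not my Apache configuration. The function names, the sample page and the exact MIME strings for CSS and JS are assumptions made for illustration. As a sanity check on the arithmetic, a page that shrinks by 78% is about 1 / (1 - 0.78), or roughly 4.5 times smaller, which is in the same neighborhood as the figure above.

```python
import gzip

# Content types worth compressing, per the text: HTML, CSS and JavaScript.
# The exact MIME strings for CSS and JS are an assumption for this sketch.
COMPRESSIBLE_TYPES = {"text/html", "text/css", "application/javascript"}


def should_compress(content_type, accept_encoding):
    """Compress only text-like responses for clients that advertise gzip.

    Simplification: q-values in Accept-Encoding (e.g. "gzip;q=0") are ignored.
    """
    media_type = content_type.split(";")[0].strip().lower()
    encodings = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    return media_type in COMPRESSIBLE_TYPES and "gzip" in encodings


def compression_stats(body):
    """Return (percent saved, size ratio) for a gzip-compressed body."""
    compressed = gzip.compress(body)
    percent_saved = 100.0 * (1 - len(compressed) / len(body))
    size_ratio = len(body) / len(compressed)
    return percent_saved, size_ratio


if __name__ == "__main__":
    # Hypothetical, highly repetitive page; real pages compress less well.
    page = b"<html><body>" + b"<p>Hello, blog markup!</p>\n" * 500 + b"</body></html>"
    saved, ratio = compression_stats(page)
    print(f"saved {saved:.1f}% ({ratio:.1f}x smaller)")
```

The sample page is far more repetitive than a real blog page, so the savings it prints will be much higher than what you should expect in practice.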

Another issue I had was the number of images being loaded from my web server. As most of you already know, browsers typically limit themselves to 2 simultaneous connections per hostname, so if a webpage has 4 CSS files, 2 JS files and 10 images, you are loading a lot of content over those 2 connections. So I used a simple CNAME trick to create image.j2eegeek.com to complement http://www.j2eegeek.com and started serving images from image.j2eegeek.com. That did help, and I considered doing something similar for CSS and JS files, but decided instead to outsource image handling to Amazon's S3.
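The CNAME trick only pays off once the pages actually reference images on the second hostname. A rough sketch of that rewrite step in Python might look like this. The function name, the list of image extensions and the example paths are hypothetical; this is not how my blog templates were actually changed.

```python
from urllib.parse import urlsplit, urlunsplit

MAIN_HOST = "www.j2eegeek.com"
IMAGE_HOST = "image.j2eegeek.com"
# Assumed set of extensions that count as "images" for this sketch.
IMAGE_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif")


def shard_image_url(url):
    """Point image URLs on the main host at the image hostname instead.

    Anything that is not an image, or not on the main host, is returned as-is.
    """
    parts = urlsplit(url)
    if parts.netloc == MAIN_HOST and parts.path.lower().endswith(IMAGE_EXTENSIONS):
        parts = parts._replace(netloc=IMAGE_HOST)
    return urlunsplit(parts)


if __name__ == "__main__":
    print(shard_image_url("http://www.j2eegeek.com/blog/images/chart.png"))
    print(shard_image_url("http://www.j2eegeek.com/blog/style.css"))
```

Because both hostnames point at the same server, the browser opens a second pair of connections without any change on the server side.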

Amazon's S3, or Simple Storage Service, is a highly scalable, reliable, fast and relatively inexpensive data storage infrastructure. S3 allows you to create a 'bucket', which is essentially a folder that must have a globally unique name and cannot contain any sub-buckets or directories, so it basically emulates a flat directory structure. Everything you put in your bucket and make publicly available is accessible via HTTP using a URL like http://s3.amazonaws.com/bucketname/itemname.png. S3 also lets you address a bucket through the HTTP Host header, so the URL above would become http://bucketname.s3.amazonaws.com/itemname.png. You can take this further if you have access to your DNS server. In my case, I created a bucket in S3 called s3.j2eegeek.com. I then created a CNAME in my DNS for s3.j2eegeek.com and pointed it to s3.amazonaws.com. And presto: s3.j2eegeek.com essentially resolves to http://s3.amazonaws.com/s3.j2eegeek.com/.

I then used John Spurlock's NS3 Manager to get my content onto S3. NS3 Manager is a simple tool (Windows only) to transfer files to/from an Amazon S3 storage account, as well as manage existing data. It is an attempt to provide a useful interface for some of the most basic S3 operations: uploading/downloading, managing ACLs, system metadata (e.g. content-type) and user metadata (custom name-value pairs). In my opinion, NS3 Manager is the best tool out there for getting data in and out of S3, and I have used close to 20 web-based, browser plug-in and desktop applications.
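The three ways of addressing the same public S3 object described above can be summed up in a few lines of Python. This is only a sketch of the URL shapes. The function names and the sample key are hypothetical, and it assumes the bucket is named exactly like the CNAME that points at s3.amazonaws.com.

```python
from urllib.parse import quote

S3_ENDPOINT = "s3.amazonaws.com"


def path_style_url(bucket, key):
    """Bucket name appears as the first path segment."""
    return f"http://{S3_ENDPOINT}/{bucket}/{quote(key)}"


def host_style_url(bucket, key):
    """Bucket name appears as a subdomain, so S3 reads it from the Host header."""
    return f"http://{bucket}.{S3_ENDPOINT}/{quote(key)}"


def cname_url(bucket, key):
    """Bucket is named after a hostname whose CNAME points at the S3 endpoint."""
    return f"http://{bucket}/{quote(key)}"


if __name__ == "__main__":
    print(path_style_url("bucketname", "itemname.png"))
    print(host_style_url("bucketname", "itemname.png"))
    # "images/logo.png" is a made-up key for illustration.
    print(cname_url("s3.j2eegeek.com", "images/logo.png"))
```

Note that a key such as images/logo.png is still a single flat name inside the bucket. The slash only looks like a directory.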

In addition, I decided to try out a couple of the PHP accelerators out there to see if I could squeeze a little more performance out of my web server. Compile caches are a no-brainer, and I saw a decent performance improvement in my PHP applications. I blogged about this topic in a little more detail and you can read that if you care about PHP performance.

The last thing I did, which probably had the biggest impact after enabling HTTP compression, was moving my Tomcat application server off my current Linux box and onto Amazon's EC2. Amazon's EC2, or Elastic Compute Cloud, is a virtualized cloud of computing available to you for $0.10 per hour of CPU utilization. I've been playing around with EC2 for a while now and just started using it for something real. I have tons of notes that I took during my experimentation with EC2, where I took the stock Fedora Core 4 images from Amazon and turned that server into my Java application server running Tomcat and Glassfish. I also created my own Fedora Core 6 and CentOS 4.4 images and deployed them as my server. My current AMI running my Java applications is a Fedora Core 6 image, and I am hoping to get RHEL 5.0 deployed in the next few weeks, but all of that will be a topic for another blog.

In conclusion, HTTP compression offered me the biggest reduction in bandwidth utilization. And it is so easy to set up on Apache, IIS or virtually any Java application server that it is almost criminal not to do so. 🙂 Maybe that's overstating it a bit, but there are some really simple ways to optimize your website, and you too can make your site hum and perform like you’ve got a cluster of servers behind it.

Will (Or Should) Adobe open-source Flex?

I have been building AJAX applications for a while now and absolutely love AJAX and the improvements it can offer in user-interface design, making applications easy and fun to use. But AJAX does have limitations, and I, like many others, have come to the realization that while AJAX is great for most things, it is not a silver bullet. For data-intensive applications, specifically those that involve dynamic charting with vector graphics and data mining, AJAX falls short.

There are a couple of alternatives out there that fill the niche AJAX still hasn’t successfully filled, and Adobe’s Flex 2 framework is definitely one of them. Adobe Flex 2 is a rich Internet application framework based on Adobe Flash that enables you to create applications that are cross-platform and browser independent, as they run inside the Flash VM. Flash has fulfilled the promise that Java applets never delivered for a variety of reasons. The Flex programming model is fairly simple: developers write MXML and ActionScript source code, and the Flex compiler compiles that source into bytecode, resulting in a binary file with the *.swf extension. Developers use MXML to declaratively define the application user interface elements and use ActionScript for client logic and procedural control. MXML provides declarative abstractions for client-tier logic and bindings between the user interface and application data. ActionScript 3.0 is an implementation of ECMAScript, and it provides support for strong typing, interfaces, delegation, namespaces, error handling, and ECMAScript for XML (E4X).

Adobe gives away the Flex 2 SDK for free, so anyone can create Flex 2 applications and compile them into SWF bytecode files. Adobe sells Flex Builder, the Eclipse-based IDE for Flex development, and Flex Data Services, a J2EE component deployed inside a container that provides adapters to connect to EJBs, JMS queues, backend data stores, etc.

One of the barriers to wider Flex adoption is the proprietary nature of the technology. Flex is a closed technology and Adobe controls every aspect of it. There’s nothing wrong with that, but I, and I am guessing a lot of other people, prefer open architectures, open systems and open platforms for application development to prevent vendor lock-in. Adobe has taken some positive steps by releasing the Flex-Ajax Bridge (FABridge) library, which automatically exposes the public data and methods within a Flex application to the JavaScript engine and vice versa. This enables developers to easily integrate Flex applications with existing sites as well as to deliver new applications that combine Ajax with applications created in Flex. A great example of the Flex-AJAX interaction is the charting application on Google Finance. It was interesting to see that Yahoo also decided to use Flash for charting when they deployed the new version of the Yahoo Finance portal.

Open sourcing Flex would certainly lead to wider adoption of Flex as an application development framework. So why doesn’t Adobe do it? It seems to fit the Adobe business model. Take a look at Acrobat or Flash or really any of the other Adobe products: they give away the client for free and monetize the creation part of the process. With PDF and Acrobat, Adobe gives away the reader for free but makes money by selling Adobe Distiller. Why couldn’t that model work for Flex? Open-source Flex and continue making money on Flex Builder, Flex Data Services, training, consulting, support and custom components. I’m sure there is already a fairly robust marketplace for Flex components, but Adobe can take that to the next level.

I know Adobe has spent a significant amount of time and money on the engineering effort to create Flex, but its proprietary nature will always be a limiting factor and will never let Flex be the premier platform for RIAs. If Adobe waits too long, browsers will get better and fully support SVG, CSS3 and JavaScript JIT compilers, and the advantage Flex offers will narrow. The next generation of AJAX frameworks is also just around the corner, and they will compete with Flex. OpenLaszlo is another dark horse in this race that may eat Flex’s lunch. OpenLaszlo is everything I want Flex to be: OpenLaszlo programs are written in XML and JavaScript and transparently compiled to Flash. The OpenLaszlo APIs provide animation, layout, data binding, server communication, and declarative UI. And what sets it apart from Flex is that OpenLaszlo is an open source platform. Adobe, are you listening?

PHP Acceleration – Pick Your Poison

As I deployed more applications and web sites on my server, I started running into resource issues. Since most of the applications I write are in Java, I run Tomcat on my Linux server. But I also run Apache as a front-end host for Tomcat as well as several PHP applications like WordPress, Vanilla and a few other PHP applications that I’ve written. I am not an expert PHP developer by any stretch of the imagination but I tinker with enough PHP that I decided to take a look at PHP Acceleration software.

For the uninitiated, PHP is a scripting language whose scripts are compiled to intermediate code and then executed on the server side on every request. PHP accelerators cache the PHP scripts in their compiled state, and some also optimize them. There are several PHP optimization products out there, and I decided to give eAccelerator, XCache and APC a try on my Linux machine. For the record, the box is running CentOS 4.4, which is essentially a repackaged distribution of Red Hat Enterprise Linux 4.x.

  • eAccelerator – eAccelerator is a free open-source PHP accelerator, optimizer, and dynamic content cache. It increases the performance of PHP scripts by caching them in their compiled state, so that the overhead of compiling is almost completely eliminated. It also optimizes scripts to speed up their execution. eAccelerator typically reduces server load and increases the speed of your PHP code by 1-10 times.
  • XCache – XCache is a fast, stable PHP opcode cacher that has been tested and is now running on production servers under high load.
  • APC – The Alternative PHP Cache (APC) is a free and open opcode cache for PHP. It was conceived of to provide a free, open, and robust framework for caching and optimizing PHP intermediate code.
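To illustrate what "caching scripts in their compiled state" means, here is a small analogy written in Python rather than PHP. It is not how eAccelerator, XCache or APC are implemented. The class and its fields are hypothetical and only show the general idea: compile a script once, keep the compiled form in memory, and recompile only when the file on disk changes.

```python
import os


class CompileCache:
    """Keep compiled code objects in memory, keyed by script path.

    An entry is reused until the file's modification time changes,
    which is the same basic idea an opcode cache relies on.
    """

    def __init__(self):
        self._cache = {}  # path -> (mtime in ns, compiled code object)
        self.hits = 0
        self.misses = 0

    def load(self, path):
        mtime = os.stat(path).st_mtime_ns
        entry = self._cache.get(path)
        if entry is not None and entry[0] == mtime:
            self.hits += 1
            return entry[1]
        # Cache miss or stale entry: pay the compile cost once.
        self.misses += 1
        with open(path, "r", encoding="utf-8") as source_file:
            code = compile(source_file.read(), path, "exec")
        self._cache[path] = (mtime, code)
        return code

    def run(self, path, env=None):
        """Execute the (possibly cached) compiled script and return its globals."""
        env = {} if env is None else env
        exec(self.load(path), env)
        return env
```

Running the same script twice through one CompileCache instance compiles it only the first time. The second call is a cache hit, which is where the accelerators get their speedup. The hits and misses counters are the kind of statistics an admin page like the one in XCache exposes.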

I compiled and installed these PHP accelerators and found that APC worked the best for me. XCache seemed to work well and actually provided a nice admin application that lets you peek inside the cache to see what’s cached, the hit/miss ratio, etc. eAccelerator also seemed to work well and offered a great performance boost, but it caused segmentation faults that made the Apache web server unusable. It could have been bad PHP code that was causing the segmentation faults, but I didn’t really spend any time getting to the root cause. APC just worked, pretty much like XCache, but seemed to offer slightly better performance. Now, I didn’t really perform any rigorous benchmarking here. I simply relied on my website monitor GrabPERF as I ran each PHP extension for a few days. Your mileage may vary based on your server architecture, application, lunar phase, etc., but APC seemed to work the best for me.

Red Hat to Oracle (and market) – Oh No You Didn’t!

Red Hat shares jumped about 25% after the company reported a quarterly profit and outlook that topped Wall Street forecasts. After the close, the stock is still going up in after-hours trading. Kudos to Red Hat, whose stock had been hammered after Oracle announced at Oracle World that it was going to redistribute Red Hat’s Linux under the name Oracle Unbreakable Linux and offer complete support for less than Red Hat charges. Here is Red Hat’s stock chart over the last 3 months.

Hopefully Red Hat will send Larry Ellison one of those cool Unfakeable Linux T-shirts 🙂

Hosting Woes Over

The day started off fairly normally – Check GMail for anything that needs immediate attention, then move to blog stats and then hit GrabPERF to see how my sites are behaving. And much to my very pleasant surprise, my blog made it on the front page of GrabPERF and it was on the good (Top 20 performance) side, not the bad side. 🙂 Check out the screenshot below and squint really hard to see Vinny’s blog in there at #17 with an average page load time of 0.4739 seconds.

[Screenshot: GrabPERF performance chart showing Vinny Carpenter's blog]

If you haven’t heard about GrabPERF, it is an awesome free (community supported) service created by Stephen Pierzchala that provides distributed measurement and monitoring for tracking key performance benchmarks of many sites, including my blog. The GrabPERF agents gather detailed component, page size, and response code data for the sites they monitor at regularly scheduled intervals and ship it to the central database in real time, where it is available for presentation in the GrabPERF interface. I’ve been a big fan of GrabPERF for a while now and use it as THE key measure of my sites’ performance.

It was really exciting to see the latest performance results, as I’ve had a rough couple of months in terms of hosting. For a while, I had this blog hosted at Kattare to see how their Java/JSP (Tomcat) hosting services stacked up. While my blog was hosted there, I ran into a bug in the awesome Ultimate Tag Warrior 3 WordPress plugin, where the plugin filled up my wp_postmeta database table with an empty meta_value row for every viewed post that didn’t have one or more tags defined. I had just started using the tag plugin, so not all my posts had tags defined, and with the traffic I get, I was causing major issues for Kattare’s shared MySQL database server, so they disabled my site. Since I already had an account with TextDrive and a reliable daily MySQL backup, I moved my site to TextDrive. With the Ultimate Tag Warrior plugin fix, my blog worked for a while, but it started causing problems for the folks at TextDrive, as I was on shared hosting and the traffic I was getting was adversely affecting other people on my server. So I decided to move to A Small Orange and try one of their virtual (VPS) offerings to see how they compare to a traditional dedicated server. I picked the professional plan, which got me 512 MB of RAM, 20 GB of disk space, and 250 GB of bandwidth running CentOS (RedHat Enterprise Linux 4) on a quad 2GHz CPU box for $90 a month. I have been incredibly happy with the performance of the server, the support team and the overall performance of their network, and the results from GrabPERF show it.

I am still continuing to ‘play’ with Amazon’s EC2 (Elastic Compute Cloud) offering to see if it could really become the killer solution that will change the hosting landscape. A dedicated (albeit virtual) machine for $70.00 a month is a really compelling story, and if Amazon can back that up with additional offerings where you can geographically distribute your applications across multiple datacenters and still scale computing capacity up and down as needed, why would you host anywhere else? I know Amazon’s EC2 offering is ‘bare-bones’ on purpose: you have to build your server from scratch, and you don’t get a web interface like cPanel or Plesk to manage your server instance or help with your server configuration once you get up and running. This has to open up opportunities for VARs to offer value on top of the EC2 platform by creating ‘hosting-in-a-box’ services where they build custom Linux deployments, manage them and offer simple management tools. S3, the Amazon storage service, has created a huge marketplace for storage, backup tools, online backup vendors and other niche products. I think EC2 is going to do the same for the hosting market.
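As a rough check on that monthly figure, here is the arithmetic as a tiny Python sketch. It assumes the $0.10 per hour rate quoted for EC2, a single instance running around the clock, and no bandwidth or storage charges. The function name is made up for illustration.

```python
HOURLY_RATE_CENTS = 10  # $0.10 per hour; assumed constant for the whole month


def monthly_cost_dollars(days=30, hourly_rate_cents=HOURLY_RATE_CENTS):
    """Cost in dollars of one always-on instance for the given number of days."""
    # Work in whole cents to avoid floating-point rounding on the hourly rate.
    total_cents = hourly_rate_cents * 24 * days
    return total_cents / 100


if __name__ == "__main__":
    print(monthly_cost_dollars())    # 30-day month: 72.0
    print(monthly_cost_dollars(31))  # 31-day month: 74.4
```

So the compute portion of an always-on instance lands a little above $70 a month, before any data transfer or S3 storage is added.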

Almost forgot: I am running WordPress 2.0.5 with a ton of plugins that are listed on my colophon page, and the real difference maker is WP-Cache with this fix to wp-cache-phase2.php that makes it compatible with WordPress 2.0.x.