Caching proxy replacement

I recently replaced a caching proxy infrastructure that ran commercial software on commodity hardware. The software was, in my opinion, an awful pile of buggy code with a management interface that never saw a usability test in its life. One of the most frustrating things about it was that I had absolutely no control over the operating system or the software; all I had was the web-based management interface. In its defense, it did chug away for about a year without any issues. The real problems started once I introduced URL filtering software. The URL filtering component was added and configured, and it worked as expected. But when the filter list flagged a URL as inappropriate that in actuality was not, the URL had to be added to the allowed site list in the configuration, and that change then had to be applied to the system. Clicking the Apply button either worked or brought the system down, typically the latter. The real fun was that when the system came down and rebooted, it would either boot back up, which took a whole hour, or need to be rebuilt from scratch. Yes, rebuilt from scratch! I later found out that the cause was an interoperability issue between this caching proxy software and the hardware I was using, HP Proliant DL380s. Around the same time I learned that the software was being EOL’ed in the near future, which was definitely a good thing. These two things, the interoperability problem between the software and my hardware and the pending EOL, gave me the catalyst I needed to begin investigating other solutions.

I had three options. First, I could purchase hardware similar to the DL380s and keep using the software, bypassing the interoperability issue, but the software would still need to be replaced in the near future. That wouldn’t be a smart decision. Second, I could purchase a bundled caching proxy software and hardware solution. I took a long look at commercial appliance offerings from reputable vendors such as Blue Coat and NetCache, but ultimately they were just too expensive and had high recurring costs. Lastly, I could leverage the existing hardware and use an enterprise-grade open source caching proxy such as Squid. I decided to go with this option.

One of the requirements was that URL filtering needed to be in place per company policy. I was using N2H2, which was bought by Secure Computing, and after talking with them they pushed me towards Smartfilter DA (Delegated Admin). I set up a test Red Hat machine running Squid and Smartfilter DA and had a small test group use it for a few weeks. Everything seemed satisfactory, so I moved forward with installing Red Hat, Squid and Smartfilter DA over one of the production cache servers. This went smoothly as well, but over the course of a few days performance degraded dramatically, and I tracked it down to Smartfilter DA. Smartfilter DA uses a redirector, which is defined within the squid.conf file, to check URLs against its list. This *does not* work on a caching proxy that has any sort of userbase. Secure Computing suggested multiple things that didn’t pan out, and then they suggested using Smartfilter instead. Yes, not Smartfilter DA, Smartfilter. Smartfilter is actually a patch against the Squid source. I didn’t necessarily like applying their proprietary code against the Squid source, but it did the trick: performance was as expected. Interestingly, I recently received notification from Secure Computing about a new Smartfilter DA release, and the release notes stated that the performance issue with the redirector was fixed. However, I can’t and wouldn’t recommend using DA.

A little information on how I configured the hardware. There are two caching proxies, both HP Proliant DL380s with a 2.8 GHz Xeon CPU, 2.5 GB of PC2100 DDR RAM, and six 36.4 GB 15K Ultra320 SCSI drives. The six drives were configured under the RAID controller as follows: two as a RAID 1 pair that holds the OS, and four as RAID 0, with each drive presented as its own logical drive and mounted as a cache disk.

Once the machine was built with the OS, the four cache disks needed to be added. With the cciss driver, the first logical drive (d0) on the first controller (c0) appears as /dev/cciss/c0d0, so the cache disks are /dev/cciss/c0d1 and up. The following commands partition a cache disk and create a filesystem on it.

Create a new partition spanning the whole drive:

fdisk /dev/cciss/c0d1

Use the mke2fs command, using the -j option to make an ext3 file system:

mke2fs -j /dev/cciss/c0d1

Create the directory where the disk will be mounted:

mkdir /mnt/cachedisk1

Set ownership on these mountpoints:

chown squid:root /mnt/cachedisk*

The disk can be mounted now:

mount -t ext3 /dev/cciss/c0d1 /mnt/cachedisk1

Add the following line to the /etc/fstab file in order for the disk to be mounted at boot time:

/dev/cciss/c0d1 /mnt/cachedisk1 ext3 defaults 0 0

The previous commands need to be repeated for the other three disks that will be mounted as cache disks: /dev/cciss/c0d2, /dev/cciss/c0d3 and /dev/cciss/c0d4.
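Since the same steps repeat for all four disks, a small loop can save some typing. The following is only a sketch, not what I actually ran: the function names cache_disk_commands and fstab_line are made up for illustration, and the script prints the commands instead of executing them so they can be reviewed first. The interactive fdisk step is left out. It also runs chown after the mount, on the assumption that ownership should apply to the root of the new filesystem (a freshly created one is owned by root) and not to the empty directory underneath it.

```shell
#!/bin/sh
# Sketch: print (do not execute) the setup commands for one cache disk.
# $1 is the logical drive number N in /dev/cciss/c0dN.
cache_disk_commands() {
    local n="$1"
    local dev="/dev/cciss/c0d${n}"
    local mnt="/mnt/cachedisk${n}"
    echo "mke2fs -j ${dev}"
    echo "mkdir -p ${mnt}"
    echo "mount -t ext3 ${dev} ${mnt}"
    # chown after mounting so it applies to the mounted filesystem's root
    echo "chown squid:root ${mnt}"
}

# Print the matching /etc/fstab line for one cache disk.
fstab_line() {
    local n="$1"
    echo "/dev/cciss/c0d${n} /mnt/cachedisk${n} ext3 defaults 0 0"
}

# Logical drives 1 through 4 are the cache disks; drive 0 holds the OS.
for n in 1 2 3 4; do
    cache_disk_commands "$n"
    fstab_line "$n"
done
```

Once the printed commands look right, they can be pasted into a root shell, with the fstab lines appended to /etc/fstab by hand.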

I am fairly happy with the current setup. It has been in production for months now and is performing well. I recently set up Calamaris to produce monthly statistics on cache performance. Most interesting to me and to the PHBs is that in a typical month around 600 GB of total traffic moves through the caches, of which about 125 GB is served from the local cache, for a bandwidth savings of ~20%.
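For anyone who wants to check the savings figure, it is just the cached volume divided by the total volume. A minimal sketch of the arithmetic, where cache_savings_pct is a made-up helper name and the two inputs are the rounded monthly figures quoted above:

```shell
# Sketch: percentage of total traffic that was served from cache.
# $1 = volume served from cache, $2 = total volume (same unit for both).
cache_savings_pct() {
    awk -v hit="$1" -v total="$2" 'BEGIN { printf "%.1f\n", hit / total * 100 }'
}

cache_savings_pct 125 600   # prints 20.8
```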

Two small notes. One, there was a parsing problem when using Calamaris, because Smartfilter modifies the Squid access.log file to append the category the URL falls under. This problem can be alleviated either by running awk over the logs before piping them to Calamaris: awk '{print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10}' access.log or by adding "use bytes;" to the Calamaris code. Two, I needed to run the cache stats on the last day of the month from cron, and this does the trick:

00 20 * * * [ `date -d tomorrow +\%d` -eq '01' ] && /usr/local/sbin/cache-stats.sh
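The trick is that cron has no "last day of the month" field, so the entry fires every day at 20:00 and only runs the script when tomorrow is the 1st. The percent sign is escaped because cron treats a bare % in the command as a newline. The same check can be pulled into a function that is easier to test by hand. This is a sketch that assumes GNU date (for the -d option), and is_last_day_of_month is a name I made up for illustration:

```shell
# Sketch: succeed (exit 0) when the given date is the last day of its month.
# $1 = date as YYYY-MM-DD; defaults to today. Assumes GNU date for -d.
is_last_day_of_month() {
    local day="${1:-$(date +%F)}"
    # The day after the last day of any month is always the 1st.
    [ "$(date -d "$day + 1 day" +%d)" = "01" ]
}

is_last_day_of_month 2024-02-29 && echo "2024-02-29 is the last day of the month"
is_last_day_of_month 2024-02-28 || echo "2024-02-28 is not"
```

Inside a script no escaping of % is needed; the backslash is only required on the crontab line itself.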
