
Speed Up Google Analytics with urchin.js

Ever notice that sometimes your sites take a while to finish loading because Google’s urchin.js file is taking forever?
You may recognize this problem when you see something like this in your browser’s status bar: “Transferring data from google-analytics.com…”

Time To Set Up? 4 minutes.

I got tired of seeing that all the time, so I set up an automated cron job that runs every 12 hours, downloads the latest version from Google, and saves it to my website directory. Then I reference /urchin.js instead of http://www.google-analytics.com/urchin.js, and my site loads a lot faster!
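
If a lot of existing pages hard-code the Google-hosted URL, a find/sed one-liner can point them at the local copy instead. Treat this as a rough sketch: the path and the *.html pattern are only examples, and sed -i rewrites files in place, so back up first.

# Rewrite references to the remote urchin.js so they use the local copy
find /home/user/websites/askapache.com -name '*.html' \
  -exec sed -i 's|http://www.google-analytics.com/urchin.js|/urchin.js|g' {} +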

Take a look at the source for this page if you want to see what is going on (look at the bottom).
There are 2 pretty major things that you accomplish by hosting urchin.js locally:

  1. You enable persistent connections
  2. You ensure that the correct 304 Not Modified header is sent back to your site visitors instead of re-serving the entire file.
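
To see the first point for yourself, here is a rough curl check (www.example.com is a placeholder for wherever you host the file; run the same thing against www.google-analytics.com/urchin.js to compare):

# Fetch the same file twice in one curl run; if the server allows keep-alive,
# the verbose output shows the second request re-using the first connection.
curl -sv -o /dev/null -o /dev/null \
  http://www.example.com/urchin.js http://www.example.com/urchin.js 2>&1 \
  | grep -i 'connection'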

The urch.sh shell script
Create a shell script called urch.sh. This will be executed once a day or once a week, whatever you want. The following script removes the local urchin.js, then wgets the latest version into the local directory.

#!/bin/sh
rm /home/user/websites/askapache.com/z/j/urchin.js
cd /home/user/websites/askapache.com/z/j/
wget http://www.google-analytics.com/urchin.js
chmod 644 /home/user/websites/askapache.com/z/j/urchin.js
cd ${OLDPWD}
exit 0;
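
One caveat: if the wget fails, urchin.js has already been deleted and visitors get a 404 until the next run. A slightly safer sketch, using the same paths, downloads to a temporary name and only replaces the live file on success:

#!/bin/sh
# Download to a temp name; only swap it in if wget succeeded
J=/home/user/websites/askapache.com/z/j/urchin.js
if wget -q -O ${J}.new http://www.google-analytics.com/urchin.js; then
  chmod 644 ${J}.new
  mv ${J}.new ${J}
fi
exit 0;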

Improved shell script
I realized right away that a more modular shell script would be needed. I admin 50+ websites, and it would be stupid to have to type the same block of code 50 times, wget the same file 50 times, etc. So this version downloads the urchin.js file into a temporary directory, then copies it OVER the old file in each site's directory. That's 1 GET request for unlimited sites. The version below is just for 2 sites.

#!/bin/sh

# SETTINGS
export TMP=${HOME}/sites/tmp/
export SH=${HOME}/sites/
export SJ=/htdocs/z/j/
export UA="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3"

# SITES
export S1=htaccesselite.com
export S2=askapache.com

# RESOURCE URLS
L1=http://www.google-analytics.com/urchin.js

# SETS CORRECT PERMS AND COPIES TO EACH SITES DOC_ROOT
setit(){
  chmod 644 *.js
  cp *.js $SH$1$SJ
}

cd $TMP

curl --header "Pragma:" -A "${UA}" --retry 200 --retry-delay 15 -O ${L1}

setit $S1
setit $S2

cd ${OLDPWD}

exit 0;
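
If you admin far more than 2 sites, the numbered S1/S2 variables can be swapped for a loop over a site list. A small sketch, assuming the same setit function and directory layout:

# Loop variant: call setit once per site; add hostnames to the list as needed
for SITE in htaccesselite.com askapache.com; do
  setit ${SITE}
done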

The crontab
Add this to your crontab by typing crontab -e:

11 12 * * * /home/user/websites/urch.sh >/dev/null 2>&1

Or, to check just once a week:

0 2 * * 6 /home/user/websites/urch.sh >/dev/null 2>&1
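
While you are still testing the script, it can help to keep the output instead of throwing it away; the log path below is just an example:

# Same weekly schedule, but append the script's output to a log for debugging
0 2 * * 6 /home/user/websites/urch.sh >>/home/user/websites/urch.log 2>&1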


Finished! Read on for more in-depth overkill.

In the past year urchin.js has only been updated once, yet the Last-Modified header reflects an updated date on every request, and not even in a linear fashion, I might add!

The problem happens when requests for the urchin.js file on google-analytics.com spike, even with the load-balancing technologies that are obviously in place. When this happens, your browser makes the request for the urchin.js file, but instead of an immediate connection and transfer of the file, you get a lagging transfer.

One reason is that the server the urchin.js file is served from does not allow persistent connections.

This object will be fresh for 1 week. It has a validator present, but when a conditional request was made with it, the same object was sent anyway. It doesn’t have a Content-Length header present, so it can’t be used in a HTTP/1.0 persistent connection.

Another big reason is that even though Cache-Control headers are correctly set by google-analytics when serving urchin.js, instead of responding to an If-Modified-Since header correctly with a 304 Not Modified response (indicating the file has not been modified), google-analytics returns the entire urchin.js file again, rendering the cache-control void.

You can see this problem in a Wireshark capture of an exchange:

GET /urchin.js HTTP/1.1
Accept: */*
Referer: http://www.askapache.com
Accept-Language: en-us
UA-CPU: x86
Accept-Encoding: gzip, deflate
If-Modified-Since: Tue, 20 Mar 2007 22:49:11 GMT
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; SU 2.011; .NET CLR 1.1.4322; .NET CLR 2.0.50727; Alexa Toolbar; .NET CLR 3.0.04506.30)
Host: www.google-analytics.com
Connection: Keep-Alive

HTTP/1.1 200 OK
Cache-Control: max-age=604800, public
Content-Type: text/javascript
Last-Modified: Tue, 20 Mar 2007 22:54:02 GMT
Content-Encoding: gzip
Server: ucfe
Content-Length: 5675
Date: Sat, 24 Mar 2007 18:23:12 GMT
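
You can reproduce this without Wireshark by sending the same conditional request with curl. A server that honours If-Modified-Since would answer 304 Not Modified, while google-analytics.com (at least at the time of this capture) answers 200 OK with the full body again:

# Send back the Last-Modified value from the capture above; print only the status code
curl -s -o /dev/null -w '%{http_code}\n' \
  -H 'If-Modified-Since: Tue, 20 Mar 2007 22:54:02 GMT' \
  -H 'Accept-Encoding: gzip, deflate' \
  http://www.google-analytics.com/urchin.js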

Of course you should implement your own caching scheme for best results.
So, to recap, there are 2 pretty major things you accomplish by using a locally hosted copy of the urchin.js file (a quick check of your own server follows the list):

  1. Enable persistent connections
  2. Send correct 304 Not Modified headers
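
Running the same kind of conditional request against your own copy is an easy way to confirm the 304 behaviour; www.example.com below is a placeholder for wherever you host the file:

# Grab the Last-Modified value your server reports for the local copy...
LM=$(curl -sI http://www.example.com/urchin.js | tr -d '\r' | sed -n 's/^Last-Modified: //p')

# ...and send it straight back; Apache answers 304 Not Modified for an unchanged file
curl -s -o /dev/null -w '%{http_code}\n' \
  -H "If-Modified-Since: ${LM}" \
  http://www.example.com/urchin.js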


Still, this only becomes a real issue when you notice lag from the google-analytics.com servers, which does happen from time to time.
