The evolution of the Google Analytics snippet

How the ubiquitous Google Analytics tracking code has changed over the years, and what we can learn from it.


google-analytics ga4 javascript

Recently, while upgrading this site’s projects pages to the latest version of Google Analytics (GA4), I got to see how the GA snippet has changed since around 2009, when this website first went up. Over that time I created many quick projects, and for each one I copied whatever the latest GA tracking code was at the time.

The Google Analytics snippet is almost certainly the most widely copy-pasted piece of JavaScript from the last decade, with the widest variety of both implementers (ranging from super-technical programmers to novices setting up a cooking blog) and end users (the script should load on any browser on any device).

The following is by no means a comprehensive history of the Google Analytics snippet—I’m sure I missed many iterations. Looking back at these snapshots that I had copy-pasted led to some interesting observations as browsers improved, patterns changed, and business needs evolved.

Early on

At some point in the distant past, the GA snippet looked like this:

<script type="text/javascript">
  var gaJsHost =
    'https:' == document.location.protocol ? 'https://ssl.' : 'http://www.';
  document.write(
    unescape(
      "%3Cscript src='" +
        gaJsHost +
        "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"
    )
  );
</script>
<script type="text/javascript">
  try {
    var pageTracker = _gat._getTracker('UA-12345678-1');
    pageTracker._trackPageview();
  } catch (err) {}
</script>

You can tell it’s old: document.write is no longer a commonly used browser API. The unescape hackery is probably there so that the source never contains a literal </script> inside a string, which an HTML parser would treat as the end of the enclosing script block.
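To see what the unescape call produces, here is a minimal sketch that runs outside a browser. The buildLoaderTag helper is my own name for illustration; the string pieces are copied from the snippet above.

```javascript
// Hypothetical helper (not part of GA): rebuilds the string that the old
// snippet handed to document.write.
function buildLoaderTag(protocol) {
  var gaJsHost = 'https:' == protocol ? 'https://ssl.' : 'http://www.';
  // %3C and %3E are the percent-encoded forms of "<" and ">", so the inline
  // source never spells out a closing script tag.
  return unescape(
    "%3Cscript src='" +
      gaJsHost +
      "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"
  );
}

console.log(buildLoaderTag('https:'));
console.log(buildLoaderTag('http:'));
```

The first call yields a tag pointing at https://ssl.google-analytics.com/ga.js and the second at http://www.google-analytics.com/ga.js, which is the string the browser then parsed as a second, blocking script.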

It’s also from well before the mass adoption of HTTPS, because it still supported the insecure HTTP protocol. Supporting both protocols likely made it easier for website authors to paste a single snippet that would work anywhere: the main script’s URL wasn’t hard-coded, but was chosen depending on whether the page was served over HTTP or HTTPS.

It was also not that performant — the code to figure out whether to load securely had to be parsed and executed before the main script even started loading. Then the ga.js script had to load before tracking would begin, since the second <script> block called methods from the script directly. This would block rendering of any further page components until the script loaded and was parsed and executed, a practice that today is frowned upon.

Getting async

Later on, the script looked something like this:

<script type="text/javascript">
  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-12345678-1']);
  _gaq.push(['_trackPageview']);

  (function () {
    var ga = document.createElement('script');
    ga.type = 'text/javascript';
    ga.async = true;
    ga.src =
      ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') +
      '.google-analytics.com/ga.js';
    var s = document.getElementsByTagName('script')[0];
    s.parentNode.insertBefore(ga, s);
  })();
</script>

At this point browsers had better support for the async attribute for loading scripts, which is used here. The page could go on rendering and become interactive while the main GA script loaded. Events could even be logged before loading completed, by pushing commands onto a global array (_gaq); each command is itself an array.
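The queueing idea can be sketched as follows. This is my guess at the general pattern, not the actual ga.js internals: the handlers table, run, and takeOverQueue are hypothetical names, and the "tracking" here just appends to a log.

```javascript
// Stand-in for the global queue the page snippet creates before ga.js loads.
var _gaq = [];
_gaq.push(['_setAccount', 'UA-12345678-1']);
_gaq.push(['_trackPageview']);

// Hypothetical handlers; the real ga.js implementation is not shown here.
var log = [];
var handlers = {
  _setAccount: function (id) {
    log.push('account=' + id);
  },
  _trackPageview: function () {
    log.push('pageview');
  }
};

// Run one command of the form [name, arg1, arg2, ...].
function run(command) {
  var name = command[0];
  var args = command.slice(1);
  if (handlers[name]) handlers[name].apply(null, args);
}

// What a loaded script could do: replay queued commands in order, then
// replace push so that later commands run immediately.
function takeOverQueue(queue) {
  for (var i = 0; i < queue.length; i++) run(queue[i]);
  queue.length = 0;
  queue.push = function () {
    for (var j = 0; j < arguments.length; j++) run(arguments[j]);
    return 0;
  };
}

takeOverQueue(_gaq);
_gaq.push(['_trackPageview']); // handled right away, no longer stored
console.log(log);
```

The nice property is that page code never needs to know whether the main script has loaded yet: it calls _gaq.push either way.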

The code of the script got wrapped in an IIFE (Immediately Invoked Function Expression), presumably to isolate the logic from any global variables that might interfere.

The snippet also ended up being self-contained within one <script> tag. I wonder if that had to do with preventing people from accidentally re-ordering scripts within their pages without realizing that there were dependencies between them, causing GA not to load. This was also the era of WordPress dominating the blogging ecosystem, and I’m guessing that some script managers might have been validating that a snippet is a single tag.

Fancy IIFE

At one point, this got a little cuter, with the function receiving a few parameters passed from the global scope rather than accessing them directly.

<script>
  (function (i, s, o, g, r, a, m) {
    i['GoogleAnalyticsObject'] = r;
    (i[r] =
      i[r] ||
      function () {
        (i[r].q = i[r].q || []).push(arguments);
      }),
      (i[r].l = 1 * new Date());
    (a = s.createElement(o)), (m = s.getElementsByTagName(o)[0]);
    a.async = 1;
    a.src = g;
    m.parentNode.insertBefore(a, m);
  })(
    window,
    document,
    'script',
    '//www.google-analytics.com/analytics.js',
    'ga'
  );

  ga('create', 'UA-12345678-1', 'filosophy.org');
  ga('send', 'pageview');
</script>

At this point, the script is looking pretty hard-to-read, if not obfuscated. That’s not necessarily a problem, since people are expected to copy-paste it without modification, but there’s something slightly off-putting about inserting aesthetically messy third-party code into your own codebase.

I suspect the main reason for passing globals into the initialization might have been testing: with this approach, it’s possible to initialize the snippet with another source URL, or to run unit tests in a non-DOM environment such as Node with mocked window and document objects.
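To illustrate the testing theory, here is a rough sketch that calls the same function with hand-rolled fakes instead of the real window and document. The fakes, the example.test URL, and the inserted array are all my own assumptions; I have no idea how Google actually tested this.

```javascript
// Hand-rolled fakes standing in for window and document (illustrative only;
// a real test suite might use a DOM library instead).
var inserted = [];
var firstScript = {
  parentNode: {
    insertBefore: function (newNode, referenceNode) {
      inserted.push(newNode);
    }
  }
};
var fakeDocument = {
  createElement: function (tag) {
    return { tagName: tag };
  },
  getElementsByTagName: function (tag) {
    return [firstScript];
  }
};
var fakeWindow = {};

// The snippet's function body, unchanged; only the arguments differ.
(function (i, s, o, g, r, a, m) {
  i['GoogleAnalyticsObject'] = r;
  (i[r] =
    i[r] ||
    function () {
      (i[r].q = i[r].q || []).push(arguments);
    }),
    (i[r].l = 1 * new Date());
  (a = s.createElement(o)), (m = s.getElementsByTagName(o)[0]);
  a.async = 1;
  a.src = g;
  m.parentNode.insertBefore(a, m);
})(fakeWindow, fakeDocument, 'script', '//example.test/analytics.js', 'ga');

// Commands issued before the real script loads are queued on the stub.
fakeWindow.ga('create', 'UA-12345678-1', 'example.test');
fakeWindow.ga('send', 'pageview');

console.log(fakeWindow.GoogleAnalyticsObject); // ga
console.log(fakeWindow.ga.q.length); // 2
console.log(inserted[0].src); // //example.test/analytics.js
```

Nothing here touches a real DOM or the network, which is the whole appeal of passing the globals in as arguments.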

The parameter signature (function (i, s, o, g, r, a, m)) is neat, since it spells a word. My hunch is that Google engineers were trying to be cute and eyeing “Google” or “Analytics”, neither of which would work because they repeat letters. So they needed an isogram (a word without repeating letters), and it probably got caught in committee and ended up just being the inert “isogram” rather than something funny. a, b, c, d, e, f, g would have worked just as well.

An unlikely motivation for passing in global parameters with a single-character alias might have been to reduce the number of characters in the snippet, since words like window and document appear multiple times within the body. If we “expand” the variables out, the snippet looks like the following, which ends up being 543 characters, 4 fewer than the original 547. I doubt this is the case, because the aliasing does not actually shorten the snippet, and the repetitive window and document strings would compress nicely with gzip anyway.

<script>
  (function () {
    window['GoogleAnalyticsObject'] = 'ga';
    (window['ga'] =
      window['ga'] ||
      function () {
        (window['ga'].q = window['ga'].q || []).push(arguments);
      }),
      (window['ga'].l = 1 * new Date());
    (a = document.createElement('script')),
      (m = document.getElementsByTagName('script')[0]);
    a.async = 1;
    a.src = '//www.google-analytics.com/analytics.js';
    m.parentNode.insertBefore(a, m);
  })();

  ga('create', 'UA-12345678-1', 'filosophy.org');
  ga('send', 'pageview');
</script>

The initial event is also now capturing the time (new Date()) at which the script was executed. This is probably useful to track how long it took to load the main script, and time elapsed between events prior to the main script initializing.
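The 1 * new Date() expression is just a terse way of turning a Date into a number of milliseconds. A quick sketch, using a fixed date so the result is reproducible (the elapsedSince helper is hypothetical, only there to show how such a timestamp might be used):

```javascript
// A fixed date so that both values can be compared deterministically.
var d = new Date(Date.UTC(2015, 0, 1));

// The snippet's trick: multiplying coerces the Date to a number.
var viaMultiply = 1 * d;
// The more explicit equivalent.
var viaGetTime = d.getTime();

console.log(viaMultiply === viaGetTime); // true
console.log(typeof viaMultiply); // number

// Hypothetical use: milliseconds between snippet execution and a later event.
function elapsedSince(startMs, nowMs) {
  return nowMs - startMs;
}
console.log(elapsedSince(viaMultiply, viaMultiply + 250)); // 250
```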

There’s no more explicit handling of pages loaded via HTTP either. By this point protocol-relative URLs (// rather than http://) were in common use, so that logic was dropped entirely.
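A URL that starts with // inherits the scheme of the page that references it. The standard URL API shows the same resolution rule; the example.com page addresses below are placeholders of mine.

```javascript
// The source URL exactly as it appears in the snippet.
var src = '//www.google-analytics.com/analytics.js';

// Resolve it against two placeholder pages, one per scheme.
var onHttps = new URL(src, 'https://example.com/page.html').href;
var onHttp = new URL(src, 'http://example.com/page.html').href;

console.log(onHttps); // https://www.google-analytics.com/analytics.js
console.log(onHttp); // http://www.google-analytics.com/analytics.js
```

So the browser ends up doing the work that the earlier snippets did by hand with the 'https:' == document.location.protocol check.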

The Google Analytics <> Google Tag Manager merge

Later on, in what must have been an enormously bureaucratic corporate reorganization over at Google, the Google Analytics and Google Tag Manager (GTM) offerings started to merge. It makes sense: Google was offering two products geared towards website authors, each with its own snippet.

Consolidating them into a single resource would make it easier to cross-sell Google services without requiring developers to make changes to code.

The GA script started looking like the following:

<script
  async
  src="https://www.googletagmanager.com/gtag/js?id=UA-12345678-1"
></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag() {
    dataLayer.push(arguments);
  }
  gtag('js', new Date());
  gtag('config', 'UA-12345678-1');
</script>

It’s much, much cleaner. And the same analytics ID from before continued to work.

One improvement here is that the JavaScript is back to being simple again. The script does less and the browser does more, using a sibling <script> to do the loading rather than using JavaScript to load the script.

There’s no more HTTP support either: the script is loaded only via HTTPS. Perhaps there was just too much risk associated with logging events and user data on pages that had insecure connections, and by this point most browsers were marking pages served over HTTP as insecure to end users.

There is, in my opinion, a regression here: the ID now appears in two places, the script’s URL and the config event. This could lead to unexpected breakage when code using this is maintained over time. For example, if one needed to change the tag ID, would one need to change it in both places? Would a developer even notice the duplication?
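If I were maintaining many pages, one way to sidestep the duplication would be to derive both uses from a single value. This is only a sketch: gtagSetup is a hypothetical helper of mine, not something Google provides, and it takes the data layer as an argument so it can run outside a browser.

```javascript
// Hypothetical helper: derive both uses of the ID from a single argument.
function gtagSetup(measurementId, dataLayer) {
  function gtag() {
    dataLayer.push(arguments);
  }
  gtag('js', new Date());
  gtag('config', measurementId);
  // The loader URL is built from the same ID, so the two cannot drift apart.
  return (
    'https://www.googletagmanager.com/gtag/js?id=' +
    encodeURIComponent(measurementId)
  );
}

var layer = [];
var scriptSrc = gtagSetup('UA-12345678-1', layer);

console.log(scriptSrc); // https://www.googletagmanager.com/gtag/js?id=UA-12345678-1
console.log(layer[1][1]); // UA-12345678-1
```

In a page, scriptSrc would still need to be assigned to a dynamically created script element, which gives up the simplicity of the plain async <script> tag. That trade-off may be exactly why the official snippet accepts the duplication.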

Presumably the ID-aware script loading is to allow loading different JavaScript code for different customers, which is important for Google Tag Manager, since it might be loading many different pieces of code or just one depending on what tags a user has configured.

I wonder whether the second gtag() call could be dropped altogether — a dynamically-generated script loading with a URL search parameter could presumably also template the ID into its own body.
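As a thought experiment, the server side of that idea might look something like the sketch below. To be clear, this is purely hypothetical: renderTagScript is my own invention and says nothing about how the real gtag/js endpoint works. The ID pattern only covers the two ID shapes shown in this post.

```javascript
// Hypothetical server-side sketch: build a script body from the request URL.
function renderTagScript(requestUrl) {
  var id = new URL(requestUrl).searchParams.get('id');
  // Accept only IDs shaped like UA-12345678-1 or G-AB1CDE2F3G before
  // templating them into code, so arbitrary input never reaches the output.
  if (!/^(UA-\d+-\d+|G-[A-Z0-9]+)$/.test(id || '')) {
    return '/* unknown id */';
  }
  return [
    'window.dataLayer = window.dataLayer || [];',
    'function gtag() { dataLayer.push(arguments); }',
    "gtag('js', new Date());",
    "gtag('config', " + JSON.stringify(id) + ');'
  ].join('\n');
}

var body = renderTagScript(
  'https://www.googletagmanager.com/gtag/js?id=G-AB1CDE2F3G'
);
console.log(body);
```

With something like this, the page would need only the single async <script> tag, and the ID would live in exactly one place.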

The breaking change: GA4

Up until this point, the key thing is that the identifier (the 'UA-...' part) stayed the same — regardless of whether you added a GA tag in 2009 or 2022, it would continue to work and data would flow into your Google Analytics project.

Google Analytics 4 is the first backward-incompatible change. Nothing actually changed about the structure of the snippet itself, but IDs went from looking like UA-... to G-...:

<script
  async
  src="https://www.googletagmanager.com/gtag/js?id=G-AB1CDE2F3G"
></script>
<script>
  window.dataLayer = window.dataLayer || [];
  function gtag() {
    dataLayer.push(arguments);
  }
  gtag('js', new Date());
  gtag('config', 'G-AB1CDE2F3G');
</script>

Why? I don’t know. I presume there could have been a way to preserve the old-style IDs or migrate them to a new system, especially since this is exactly the same script being loaded as before.

Knowing how annoying it was to update my mere two dozen project pages with the new snippet, and how rampant copy-paste is on the web, I am sure that a large percentage of GA projects will just stop receiving events altogether. Rest easy, old chaps.

I suspect dropping legacy GA tags might be a cost-saving move on Google’s part—decades of random pages continuing to send events into GA can’t be cheap, and a “bulk reset” where only actively-maintained pages (whose authors did the work to change the analytics ID) continue to have their events processed could really save on bandwidth and data storage costs.

Snippet design best practices

What are some lessons learned from looking back at this?

1: People will almost never update copy-pasted code.

I have never in my life gone back to update copy-pasted code, unless it’s actively broken. If your snippet is used on more than the handful of pages that you control, it’s no longer in your own hands.

2: The less code in the snippet the better.

It’s helpful to have a little bit of logic in your snippet, as GA does to capture events that are logged before the script initializes. But, per #1, you have almost no opportunity to fix bugs in the pasted content, so it’s better to update the script that loads and keep the snippet to the bare minimum.

3: Loading async is fast.

Modern webpages have lots of things loading at once. If you can load asynchronously and do your work whenever that’s ready, it’s a much better user experience.

4: Lean on the browser.

Browsers these days do a whole lot more than just rendering HTML, and it’s unnecessary to reproduce that work in Javascript. When possible and widely supported, use built-in functionality of the browser to reduce the size of the snippet.

5: Backward-compatibility rocks.

In over a decade of changes, only the latest upgrade is backward-incompatible. My 2009 projects were happily recording pageviews until their GA ID got deprecated. If you can upgrade the script that is loaded to be compatible with all previous versions of a snippet, rather than requiring snippet upgrades to support your new script, your users will thank you for having lower costs of maintenance and headache.


A decade ago, Google Analytics seemed to be the only player in town when it came to webpage analytics. These days, there’s a whole lot more competition, and the GA admin tools continue to get more complicated to use as more and more functionality gets jammed into them. It’s hard to beat a totally free analytics tool, but I’m very curious to see what the webpage analytics landscape looks like ten years hence.