analytics-datalayer
v0.17.0
This is still not considered production-ready but is currently being tested for production. If you come across this module by chance or through a search engine, use at your own risk.
CODE SNIPPETS ARE OUTDATED. API HAS CHANGED AND I HAVE MOVED TO USING THE DJV SCHEMA VALIDATION LIB INSTEAD OF JOI. THE CONCEPTS REMAIN THE SAME THOUGH
Analytics Data Layer
The Problem
There are instances where an application may need to capture application analytics data and send it to multiple 3rd party analytics services, such as Google Analytics, Google Tag Manager, Clicky, Mint etc. Each service comes with its own API, its own way of collecting analytical data in its own format.
TBD: Provide examples
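As a rough illustration of the mismatch described above, the sketch below records the same product click twice: once in the style of the Google Analytics (analytics.js) ga('send', 'event', ...) command and once in the style of a Google Tag Manager data layer push. The ga function and gtmDataLayer array are local stand-ins so the snippet runs anywhere, and the category, label and field values are made up for illustration.

```javascript
// Local stand-ins for the vendor globals so the snippet is self-contained.
const gaCalls = [];
const ga = (...args) => gaCalls.push(args); // mimics the analytics.js command function
const gtmDataLayer = [];                    // mimics GTM's window.dataLayer array

// Google Analytics (analytics.js) style: positional category / action / label.
ga('send', 'event', 'Product', 'click', 'book-123');

// Google Tag Manager style: a plain object with an `event` key.
gtmDataLayer.push({ event: 'ProductClick', productId: 123, productCategory: 'book' });

console.log(gaCalls[0]);
console.log(gtmDataLayer[0]);
```

The same user action ends up as two differently shaped calls, which is the duplication this library aims to hide behind one API.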
The need to use multiple analytics services generally arises either because those services fulfil different business needs, or because the decision has been made to move from one service to another and the two need to run in parallel during the transition.
Even if you currently intend to use only a single service, you have the issue of vendor lock-in. You are writing code specific to that service's API and any future decision to move away to something else will incur the cost of a rewrite of some or all of that analytics code.
This Javascript library is an abstraction above analytics services that provides a single API for Developers to capture event-based analytics data within their applications in a way that is simple, consistent and in a format that is agnostic to 3rd party vendors. Then through the power of service adapters, that data can be sent to any number of external analytics services via the API they expect and in the format they understand.
Features/Benefits
- Can be used in any Javascript Node or browser application.
- Use multiple analytics services within your application, but interact with them using only one, simple, robust API.
- Capture data in a simple, consistent and vendor-agnostic format without the need for Developers to learn potentially multiple, vendor-specific APIs and formats.
- Changing services later down the line becomes a simple case of swapping out adapters, rather than rewriting code to fit the new service.
- Even if you do only ever use a single service, this library provides extra features usually not provided, such as a way to define the structure of expected analytics data and then validate against it, ensuring analytics data isn't missed or malformed.
- It may also increase the testability of the data you are capturing as it has been written with automated testing in mind.
Great for Single Page Applications
This library is ideal for stateful Single Page Applications (SPAs) such as those made with React or AngularJS. As with a mobile or desktop app, analytics data needs to be sent off the back of events triggered explicitly by the application (e.g. user input, route changes, AJAX data requests etc.) rather than relying largely on DOM events being detected automatically by the Javascript code snippets that these services ask you to add to your site.
In a more traditional website, analytics data is generally captured off the back of DOM events that are automatically detected by the vendor's code snippet, such as the 'DOM Ready' event for a 'Page View'. In most cases an SPA won't even have finished initialising, let alone be in a position to send analytics data, by the time the 'DOM Ready' event fires.
Then there is the issue of the application's statefulness. Because the page is only technically ever loaded from the server once and then subsequent page changes in the application's session are a case of Javascript updating parts of that very same page, events such as 'DOM Ready' are only ever fired once, regardless of the number of page changes that happen after the first.
Some services may also expect analytics data to be generated server-side and to be made available for when the 'DOM Ready' event fires, as with Google Tag Manager. This is not an option for a pure front-end Javascript SPA. Even with an isomorphic/universal application, this is problematic and cumbersome because of the same statefulness issue above. The page is only generated server-side once and then all other subsequent page changes (or 'route changes') happen in pure browser Javascript-land.
Therefore an SPA cannot really rely on the automatic detection of standard DOM events. Instead, the application generally needs to take complete control of what data gets tracked and when, through its own events, e.g. triggering 'Page View' tracking off the back of a 'routeChange' event.
For example, in Google Tag Manager, instead of tracking a 'Page View' using its built-in 'Page View' trigger option, you would use the 'Custom Event' option which is set to listen for the 'routeChange' event.
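A minimal sketch of that idea: the application pushes a 'routeChange' event itself after every client-side navigation, and a GTM 'Custom Event' trigger listening for that name can then fire the page view tag. The trackRouteChange helper and the pagePath and pageTitle keys are hypothetical, and gtmDataLayer stands in for GTM's window.dataLayer so the snippet runs outside a browser.

```javascript
// Stand-in for GTM's global data layer array (window.dataLayer in a browser).
const gtmDataLayer = [];

// Hypothetical helper that the application's router would call after each navigation.
function trackRouteChange(pagePath, pageTitle) {
  gtmDataLayer.push({ event: 'routeChange', pagePath, pageTitle });
}

// Two client-side navigations within the same page load.
trackRouteChange('/range/books', 'Our Store - Books');
trackRouteChange('/range/mugs', 'Our Store - Mugs');

console.log(gtmDataLayer.map((entry) => entry.pagePath));
```

Unlike 'DOM Ready', this event fires once per route change, however many navigations happen in a session.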
TBD: explain 'Why Not Just Use a Tag Manager like GTM?'
Installation
If you use yarn:
yarn add analytics-datalayer
If you use NPM:
npm install analytics-datalayer --save
How It Works
The idea is to push analytics data and the application event that triggered it into a 'Data Layer'. The Data Layer is just a plain Javascript array that has been extended with extra methods to facilitate the validation and sending of that analytics data to the 3rd party services you have set up.
The Data Layer exists for the lifetime of the current application session and acts as a log of the analytics data captured, until the application is restarted.
Because the Data Layer is just a simple array, it is very easy to view what analytics data has been tracked and to test it.
The Data Layer array has been extended in the following way:
Data Schemas
The dataLayer.define() method
This provides a means to define any number of 'data schemas'. A schema represents the structure and expected values of the analytics data to send to the external analytics services after a given event has occurred. For example, a 'Main Menu' schema may have values specific only to menus, so we define a menuItemId and menuItemName which we send when a menu item is clicked. We may also have a 'Product' schema with product-specific data such as productId and productCategory, which we send when a product item is clicked.
Schema Validation
In a data schema, we define the names of the values we need to send, such as productCategory. We can also specify the expected format of that value, such as text, number or perhaps one out of a fixed list of values such as book, mug or canvas. We can also specify whether the value is required or optional, or if it has a default value in the absence of one provided. If validation fails, an error is thrown. This will help ensure that the analytics data we wish to capture and send is always there, in the format that we expect.
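To make those rules concrete, here is a minimal, dependency-free sketch of that kind of check. It is not the library's implementation (which delegates to a schema validation library); the validate helper, the rule format and the optional currency field are all hypothetical.

```javascript
// Hypothetical rule format: expected type, optional list of allowed values,
// a required flag and an optional default.
const productRules = {
  productId: { type: 'number', required: true },
  productCategory: { type: 'string', oneOf: ['book', 'mug', 'canvas'], required: true },
  currency: { type: 'string', default: 'GBP' } // optional, falls back to a default
};

// Returns the validated values (with defaults filled in) or throws on the first problem.
function validate(rules, data) {
  const result = {};
  for (const [name, rule] of Object.entries(rules)) {
    let value = data[name];
    if (value === undefined) {
      if (rule.required) throw new Error(`"${name}" is required`);
      if (rule.default === undefined) continue; // optional with no default: skip
      value = rule.default;
    }
    if (typeof value !== rule.type) throw new Error(`"${name}" must be a ${rule.type}`);
    if (rule.oneOf && !rule.oneOf.includes(value)) {
      throw new Error(`"${name}" must be one of: ${rule.oneOf.join(', ')}`);
    }
    result[name] = value;
  }
  return result;
}

const valid = validate(productRules, { productId: 123, productCategory: 'book' });
console.log(valid);
```

Passing productCategory: 'poster' to the same helper would throw, because that value is not in the fixed list.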
Schema Nesting
The ability to 'nest' (or 'compose') schemas.
Using the 'Main Menu' and 'Product' schema examples above, both will always include page level data such as pageHost and pageTitle. Rather than repeating those same values across both schemas, we can create a parent schema called Page Schema that specifies those values and then add the Main Menu Schema and Product Schema as children of Page Schema. The child schemas then inherit those value definitions when used, so we don't have to repeat and maintain the same list across multiple schemas (see the example below).
Schema Example
import dataLayer from 'analytics-datalayer';
import joi from 'joi'; // a great lib for schema validation. See: https://www.npmjs.com/package/joi

dataLayer.define({
  name: 'Page Schema',
  schema: joi.object({ // joi based schemas are required
    pageHost: joi.string().hostname().required(), // required, must be text, in a correct hostname format
    pageTitle: joi.string().required(),
    // ensures the format 'N.N'; the value is hard coded here by a developer using default(x),
    // so it is left optional (required() would reject a missing value before the default applies)
    appVersion: joi.string().regex(/^[0-9]+\.[0-9]$/).default('1.0')
  }),
  children: [{
    name: 'Main Menu Schema',
    schema: joi.object({
      menuItemId: joi.number().integer().required(), // whole numbers only
      menuItemName: joi.string().required()
    })
  }, {
    name: 'Product Schema',
    schema: joi.object({
      productId: joi.number().integer().required(),
      productCategory: joi.string().valid('book', 'mug', 'canvas').required() // one of a fixed list
    })
  }]
});
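One way to picture the inheritance described above is as a merge of the parent's value definitions into the child's. The sketch below walks a schema tree and collects the field names that would apply to a given schema; schemaTree, resolveFields and the use of plain name lists instead of real schemas are simplifications for illustration, not the library's internals.

```javascript
// Hypothetical, simplified tree: field names only, rather than full schemas.
const schemaTree = {
  name: 'Page Schema',
  fields: ['pageHost', 'pageTitle', 'appVersion'],
  children: [
    { name: 'Main Menu Schema', fields: ['menuItemId', 'menuItemName'] },
    { name: 'Product Schema', fields: ['productId', 'productCategory'] }
  ]
};

// Returns the named schema's fields plus everything inherited from its ancestors,
// or null if no schema with that name exists in the tree.
function resolveFields(node, targetName, inherited = []) {
  const fields = [...inherited, ...node.fields];
  if (node.name === targetName) return fields;
  for (const child of node.children || []) {
    const found = resolveFields(child, targetName, fields);
    if (found) return found;
  }
  return null;
}

console.log(resolveFields(schemaTree, 'Product Schema'));
// → [ 'pageHost', 'pageTitle', 'appVersion', 'productId', 'productCategory' ]
```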
An (overridden) dataLayer.push() method
With our schemas defined, we can then start pushing analytics data to the Data Layer array. We have overridden the default behaviour of the plain Javascript Array.push() to require an explicit event name and the schema name(s) to validate the provided analytics data against: dataLayer.push(eventName, [schemaNames], data)
dataLayer.push() Example
import dataLayer from 'analytics-datalayer';

dataLayer.push('ProductClick', 'Product Schema', {
  pageHost: 'www.example.com', // must be a valid hostname to pass the Page Schema
  pageTitle: 'Our Store - Books',
  productId: 123,
  productCategory: 'book'
});
The data above is validated against the Product Schema we have defined and, if it fails, an error is thrown for us to handle.
The dataLayer is still just an array, so if you were to console.log(dataLayer[0]) you would see:
{
  _event: 'ProductClick',
  pageHost: 'www.example.com',
  pageTitle: 'Our Store - Books',
  productId: 123,
  productCategory: 'book'
}
You can also validate analytics data against multiple data schemas:
dataLayer.push() Multiple Schemas Example
import dataLayer from 'analytics-datalayer';

dataLayer.push('ProductClick', ['Product Schema', 'Main Menu Schema'], {
  pageHost: 'www.example.com',
  pageTitle: 'Our Store - Books',
  productId: 123,
  productCategory: 'book'
}); // An error is thrown
The above will throw an error because the data required by the 'Main Menu Schema' is missing.
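The behaviour described in this section can be approximated in a few lines. The following is only a sketch of the idea, not the library's code: createDataLayer and requireKeys are hypothetical helpers, and each schema is reduced to a function that throws when a required key is missing.

```javascript
// Hypothetical factory; `validators` maps a schema name to a function that throws on bad data.
function createDataLayer(validators) {
  const layer = [];
  // Shadow Array.prototype.push on this one array instance only.
  layer.push = (eventName, schemaNames, data) => {
    // Accept either a single schema name or an array of names.
    const names = Array.isArray(schemaNames) ? schemaNames : [schemaNames];
    for (const name of names) {
      const check = validators[name];
      if (!check) throw new Error(`Unknown schema: ${name}`);
      check(data); // throws if the data does not satisfy this schema
    }
    // Only validated data is logged, tagged with the event that triggered it.
    return Array.prototype.push.call(layer, { _event: eventName, ...data });
  };
  return layer;
}

// Simplified stand-in for a schema: every listed key must be present.
const requireKeys = (...keys) => (data) => {
  const missing = keys.filter((key) => data[key] === undefined);
  if (missing.length) throw new Error(`Missing: ${missing.join(', ')}`);
};

const dataLayer = createDataLayer({
  'Product Schema': requireKeys('productId', 'productCategory'),
  'Main Menu Schema': requireKeys('menuItemId', 'menuItemName')
});

const productData = { productId: 123, productCategory: 'book' };
dataLayer.push('ProductClick', 'Product Schema', productData); // passes and is logged

let pushError = null;
try {
  dataLayer.push('ProductClick', ['Product Schema', 'Main Menu Schema'], productData);
} catch (error) {
  pushError = error; // the menu values are missing, so nothing is logged
}
console.log(dataLayer.length, pushError.message);
```

Because the failed push throws before anything is stored, the array only ever contains entries that satisfied every schema they were checked against.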
Adapters for External Services
Adapters are used to push analytics data to any number of external analytics services. The Data Layer provides a plugin adapter interface which accepts adapters specific to a service e.g. a Google Tag Manager adapter or an Adobe Analytics adapter. The job of the adapter is to take the data we push to our own Data Layer, translate it into a format that the external service understands and then send it to where it needs to go.
NOTE: You may still need to add the respective vendor's code snippet to your application for the adapter to interface with. Adapters will never generate this for you, as analytics code snippets generally need to be added very early in the application's lifecycle (i.e. in the <head> tag) before this library is even initialised. If your interest is in reducing the number of code snippets within your application, consider using a Tag Manager such as Google Tag Manager along with this library.
Google Tag Manager Adapter Example
import dataLayer from 'analytics-datalayer';

const gtmAdapter = (window) => (dataLayer) => ({
  // this adapter's push is triggered when we do dataLayer.push(eventName, schemaName, data)
  // but only if the data has passed the schema validation
  push: (eventName, data) => {
    // this is GTM's own data layer
    window.dataLayer.push({
      event: eventName,
      ...data
    });
  }
});

// this adapter needs to be added BEFORE we start pushing data
dataLayer.adapter(gtmAdapter(window));
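Because every adapter exposes the same push(eventName, data) function, supporting a second service should mostly be a matter of writing another adapter. The sketch below follows the shape of the GTM example for the legacy Google Analytics (analytics.js) ga('send', ...) command. The mapping of our fields to eventCategory and eventLabel is an assumption for illustration, fakeWindow replaces the browser's window so the snippet runs anywhere, and the loop at the end merely stands in for what the library would do after a successful push.

```javascript
// Same shape as the GTM adapter: (window) => (dataLayer) => ({ push }).
const gaAdapter = (win) => () => ({
  push: (eventName, data) => {
    // Field-object form of the analytics.js `send` command.
    win.ga('send', {
      hitType: 'event',
      eventCategory: data.productCategory || 'general', // illustrative mapping
      eventAction: eventName,
      eventLabel: String(data.productId)
    });
  }
});

const gtmAdapter = (win) => () => ({
  push: (eventName, data) => win.dataLayer.push({ event: eventName, ...data })
});

// Fake browser globals that simply record what each vendor would receive.
const gaCalls = [];
const fakeWindow = { ga: (...args) => gaCalls.push(args), dataLayer: [] };

// Stand-in for the library: build each adapter, then forward one validated event to all of them.
const adapters = [gtmAdapter(fakeWindow), gaAdapter(fakeWindow)].map((create) => create([]));
adapters.forEach((adapter) => {
  adapter.push('ProductClick', { productId: 123, productCategory: 'book' });
});

console.log(fakeWindow.dataLayer[0]);
console.log(gaCalls[0][1]);
```

The application code that triggers the event stays the same; only the list of adapters changes when a service is added or removed.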
Debugging & Testing
Because the Data Layer is based on a plain Javascript Array (albeit extended), we can easily test what is being pushed into it by our app, either through the standard tools Developers use, such as Chrome Dev Tools, or through a more in-depth and user-friendly debugging tool that could be built around it for use by non-Developers for QA (much like GTM's own preview mode).
The Data Layer array exposes a dataLayer.attachTo() method, so you can freely attach the Data Layer array to any object you desire, such as the global window object, to expose it to an E2E testing framework for automated testing, for example:
import dataLayer from 'analytics-datalayer';
dataLayer.attachTo(window, '_dataLayer') // the dataLayer is then visible on window._dataLayer
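Once the array is exposed like this, a test can read the entries straight off it. A minimal sketch follows, using a plain object in place of window, a bare array in place of the Data Layer and a hand-rolled attachTo as a stand-in for the library's method; the entry shape follows the _event example shown earlier in this README.

```javascript
// Stand-ins: a bare array for the Data Layer and a plain object for `window`.
const fakeWindow = {};
const dataLayer = [];
dataLayer.attachTo = (target, key) => { target[key] = dataLayer; };

dataLayer.attachTo(fakeWindow, '_dataLayer');

// Entries as the application would have logged them.
dataLayer.push({ _event: 'RouteChange', pageTitle: 'Our Store - Books' });
dataLayer.push({ _event: 'ProductClick', productId: 123, productCategory: 'book' });

// What an E2E test might check after simulating a product click.
const clicks = fakeWindow._dataLayer.filter((entry) => entry._event === 'ProductClick');
console.log(clicks.length === 1 && clicks[0].productId === 123); // true
```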