sourcebuster
v1.1.0
Published
Get sources of your site's visitors (utm / organic / referral / typein).
Downloads
20,347
Maintainers
Readme
Sourcebuster JS
1.0.5 is merged to master
Read upgrade guide for details.
About
- Sourcebuster tracks the sources of your site’s visitors and stores the data in the cookies for further analysis.
- It handles sources-overriding just like Google Analytics does.
- It’s written in pure JavaScript, without any dependency from third-party libraries.
- You can use received data for:
- phones change
- site content change
- export data within forms to your CRM or analysis system.
Links
Download · Upgrade guide · Changelog · Test page
Install
# from `npm`
npm install --save sourcebuster
# or `bower`
bower install --save sourcebuster
Or you can download this repo and use sourcebuster.min.js
from /dist
folder.
Setup
I'm gonna walk you through the inline-HTML setup pattern, but keep in mind, that Sourcebuster is available as CommonJS
/ AMD
module, so you can require
it as dependency as well.
Script is written in pure JavaScript, doesn’t have any dependency on third-party libraries and doesn’t interacts with DOM’s objects, so it can be called as soon as you want it. Higher you’ll place it in the <head>
tag, sooner you’ll get the cookies, whose data can be used for DOM’s objects manipulation (phones change etc.).
Place in the <head>
tag:
<script src="/path/to/sourcebuster.min.js"></script>
Or require it:
var sbjs = require('sourcebuster');
It will expose Sourcebuster object — sbjs
. This object has just 2 methods: init
and get
. Basically first one is setter
, and the second one is getter
.
So, lets initialize it:
<script src="/path/to/sourcebuster.min.js"></script>
<script>
sbjs.init();
</script>
This will:
- set all the cookies
- give you the data (available through the second method
sbjs.get
)
Configure
sbjs.init
accepts one argument: object with the settings. It's optional and usually it will be something like this:
sbjs.init({
domain: 'alexfedoseev.com',
lifetime: 3,
callback: doSmth
});
There are 11 types of user settings:
- lifetime
- session_length
- domain
- referrals
- organics
- typein_attributes
- timezone_offset
- campaign_param
- user_ip
- promocode
- callback
Now I'll show you the whole list of configurable options. The short description is provided in comments. The rest is explained below.
sbjs.init({
// Set custom expiration period for cookies in months
// 6 months is default
lifetime: 6,
// Set custom session length in minutes
// 30 minutes is default
session_length: 30,
// Set domain name in cookies
domain: {
host: 'alexfedoseev.com',
isolate: false
},
// Set custom referral sources
referrals: [
{
host: 't.co', // This is host from Twitter's `http referer`
medium: 'social', // This is custom `utm_medium`, you can drop it and it'll be `referral`
display: 'twitter.com' // And this is how you'll see it in the result data
},
{
host: 'plus.url.google.com',
display: 'plus.google.com'
}
],
// Set custom organic sources
organics: [
{
host: 'bing.com',
param: 'q',
display: 'bing'
}
],
// Set `utm_source` & `utm_medium` values for `typein` traffic
// Defaults are `(direct)` & `(none)`
typein_attributes: {
source: '(direct)',
medium: '(none)'
},
// Set time zone
timezone_offset: 3,
// Set custom `utm_campaign` param
campaign_param: 'my_adwords_campaign',
// Set user ip
user_ip: '192.168.1.1',
// Set promocode
// Default range is [100000..999999]
promocode: {
min: 100000,
max: 999999
},
// Set callback-function,
// It will be executed right after `sbjs` cookies will be set
callback: doSmth
});
lifetime
lifetime: 6
Setting up custom expiration period for sbjs
cookies (in months). Default is 6.
session_length
session_length: 30
Setting up user's session duration (in minutes). Default is 30.
This parameter affects only referral sources overriding. When the visitor comes on your site for the first time, we receive and store the data about his source. After some time this visitor can return to your website, but from another source, and we need to have some rules to decide — should we overwrite previous source or we should not.
The rules are the same with Google Analytics:
- UTM and Organic source always overrides any previous source.
- Typein never overrides a previous source.
- Referral source overrides previous source only if there is no user session at the moment. If it’s inside the same session — a referral source will never override previous source.
Explanation to referral
logic: sometimes visitor within the current visit (session) comes to the website from the “source” which is not actually a “source”. For example, it can be visit from the email service, where he had a registration activation link.
domain
domain: {
host: 'alexfedoseev.com',
isolate: false
}
Setting up cookies domain.
First of all lets talk about how script handles the absence of this custom option: if no value is set, cookies will be placed for current domain and all its subdomains.
Why it is so. Lets say the site doesn't have subdomains to share traffic with, and now it doesn't matter: do we share the cookies with subdomains or we do not. But what's gonna happen if someday subdomains will occur? If cookies wasn't shared — subdomains won't get them, it means they won't share the traffic, and visitors, which came from main domain to subdomains (and vise versa), will be considered as referral
traffic.
So by default the cookies are shared. However if you don't want to share them — use isolate: true
, to isolate domain.
Let’s take a look at further examples.
No. 1
The first scenario. You have a site: wow.com. Also you have a blog on this site: blog.wow.com. And you don’t want to separate traffic between them. It means that when the visitor come from blog.wow.com to wow.com, it won’t be a new visit for wow.com (with blog.wow.com as Referral source), it will be a common inner click without source change (to sourcebuster.js this scenario will be equal to inner page change: from wow.com/about to wow.com/contacts for example). To achive this you have to add this line on both sites: wow.com & blog.wow.com.
domain: 'wow.com'
No. 2
That was simple. Now lets take a look at opposite scenario — you want to separate traffic between your subdomains and consider it as Referral. There is the main site — wow.com, and there is a blog on this site — blog.wow.com, which also has users subdomains — alex.blog.wow.com is one of them. You don’t want to separate traffic between blog.wow.com and alex.blog.wow.com, but you do want to separate it between all the blogs and the main site. Here’s how we can sort it out:
// put this on the main domain wow.com
domain: {
host: 'wow.com',
isolate: true
}
// and this on the blogs subdomains (blog.wow.com & alex.blog.wow.com)
domain: 'blog.wow.com'
Pay attention to the isolate: true
param in setting for the main domain. Use it only when all incoming traffic from all subdomains should be referral in relation to provided host.
In our example if the visitor came for the first time on the main site wow.com by clicking on the link in user’s blog alex.blog.wow.com, his source (for the wow.com) will be alex.blog.wow.com (traffic type: referral).
NB! Do not change isolate
value after you pushed the script in production! If you'll do — your visitors will get doubled cookies and bad things may happen.
The option, that will give the ability to force-update cookies domain (if you need to change subdomains handling rules), is in my TODO-list, but not available yet.
How to check that isolate
is set right:
- host of the page, which code have setting
domain
with paramisolate: true
must be equal to the host provided indomain
line:
// CORRECT: on the pages of wow.com
domain: {
host: 'wow.com',
isolate: true
}
// DOESN’T MAKE SENSE: on the pages blog.wow.com
domain: {
host: 'wow.com',
isolate: true
}
- all incoming traffic from all subdomains will be referral in relation to provided host:
domain: {
host: 'wow.com',
isolate: true
}
// visit from *.wow.com → wow.com will be referral
referrals
referrals: [
{
host: 't.co', // This is host from Twitter's `http referer`
medium: 'social', // This is custom `utm_medium`, you can drop it and it'll be `referral`
display: 'twitter.com' // And this is how you'll see it in the result data
},
{
host: 'plus.url.google.com',
display: 'plus.google.com'
}
]
Adds custom referral
sources.
In general if you’re ok with the fact that medium (utm_medium
) of traffic from facebook.com
is referral
, you don’t need this setting. But if you want to make this kind of traffic social
(utm_medium=social
), you can set it up using referrals
. First param is host
of the source from http referer
, second — medium
— preferred value of utm_medium
.
Moreover some of the traffic sources have different referer host in relation to their main domain (for example, traffic from Twitter has referer with the host — t.co
). In these cases you can assign alias to the source using optional display
param. Also with this param you can group the traffic from the set of the sites into one virtual source.
Twitter (host: 't.co', display: 'twitter.com'
) and Google+ (host: 'plus.url.google.com', display: 'plus.google.com'
) added to default referral
sources. You still can override it by your custom setting (to mark it as social
for example).
organics
There are already a number of predefined organic
sources in the core:
Source -> Alias
-------------------------
google.all -> google
yandex.all -> yandex
bing.com -> bing
yahoo.com -> yahoo
about.com -> about
aol.com -> aol
ask.com -> ask
globososo.com -> globo
go.mail.ru -> go.mail.ru
rambler.ru -> rambler
tut.by -> tut.by
But you can use this setting, if you want to add more organic sources or override aliases of predefined ones.
organics: [
{
host: 'bing.com',
param: 'q',
display: 'bing_in_da_house'
}
]
For example you want the traffic from the SERP of bing.com to be organic and the alias for this source to be bing_in_da_house
. So you need to provide host: 'bing.com'
, and the query param
of keyword — 'q'
. Both are required. Also, to set custom alias for the source, provide optional third param display: 'bing_in_da_house'
.
To get the keyword param go to bing.com and search for smth, “apple” for example. After you’ll get to the SERP, explore its URL: http://www.bing.com/search?q=apple&go=&qs=n&form=QBLH&pq=apple&sc=8-5&sp=-1&sk=&cvid=718ad07527244c319ecebf44aa261f64
Keyword param — 'q'
— is a symbol/word between “?”
(or “&”
if the param is not the first after question sign) and “=apple”
in URL of SERP.
typein_attributes
typein_attributes: {
source: '(direct)',
medium: '(none)'
}
Sets custom utm_source
and utm_medium
for typein
traffic.
By default the values of source
and medium
for typein
traffic are (direct)
& (none)
.
You can override this via typein_attributes
.
timezone_offset
timezone_offset: 3
Setting up time zone.
By default datetime is taken from the visitor's system. But you can normalize it to predefined time zone via timezone_offset
.
Example. Your visitor is in London (UTC +00:00
). His local time is 03:00 AM
. If no timezone_offset
was set, the time in cookie will be 03:00 AM
. Another visitor at the same moment is from Berlin (UTC +01:00
) and his local time is 04:00 AM
. The time in cookie will be 04:00 AM
.
If you want to normalize time of all visitors (let it be UTC +03:00
for example), you should set it via timezone_offset: 3
. So the time in cookies of both visitors will be 06:00 AM
.
campaign_param (Google AdWords gclid
param handler)
campaign_param: 'my_adwords_campaign'
Sets custom GET-param, whose value (if present) will be set as utm_campaign
in cookies (if there is no original utm_campaign
in request). This feature was added mainly because of Google AdWords gclid
param.
Here is the use-case. If you have traffic from Google AdWords and you use gclid
param, you can shorten your urls by removing utm
out of it. Sourcebuster will match this traffic as utm
from Google.
If there is only gclid
param in url:
http://statica.alexfedoseev.com/sourcebuster-js/?gclid=sMtH
This will give you the following results:
- Traffic type: utm
- utm_source: google
- utm_medium: cpc
- utm_campaign: google_cpc
- utm_content: (none)
- utm_term: (none)
You can provide a custom utm_campaign
name via campaign_param
and value of this GET-param:
http://statica.alexfedoseev.com/sourcebuster-js/?gclid=sMtH&my_adwords_campaign=test_custom
You'll get the following:
- Traffic type: utm
- utm_source: google
- utm_medium: cpc
- utm_campaign: test_custom
- utm_content: (none)
- utm_term: (none)
WARNING
- If there is original utm-param in request (
utm_source
,utm_medium
,utm_campaign
), it will overridegclid
param andcampaign_param
param value. - If there is only custom campaign param (
campaign_param
) in request, Sourcebuster will consider it asutm
traffic.
user_ip
user_ip: '192.168.1.1'
Sets user’s ip address.
By default sourcebuster can’t get ip address of the visitor. But if you need it, you can get it on your back-end and push it using user_ip
setting.
promocode (beta)
promocode: true
// or
promocode: {
min: 100000,
max: 999999
}
Sets random promocode for visitor.
If you don't want to do promocode-stuff on your back, Sourcebuster can generate them or you.
There is no check for uniqueness, of course. But you can rely on probability and set range of promocode values min
and max
params.
They are optional, by the way. If they are not set, range will be between 100 000 and 999 999.
This option is beta — mainly because I haven't check the percent of duplicated promos on big projects.
callback
callback: doSmth
Callback. Just pass the name of the function to the option, and it will be executed right after the cookies will be set. Callback will get the object with sbjs
data as argument.
sbjs.init({ callback: go });
function go(sb) {
console.log('Cookies are set! Your current source is: ' + sb.current.src);
}
Usage
Getting the data
Grab the data through sbjs.get
method (it's object
actually).
Here is the list of what you can get.
sbjs.get.current
Params of the latest visitor’s source. If the visitor had more than one source, this will be the latest values.
sbjs.get.current.typ
Traffic type. Possible values:utm
,organic
,referral
,typein
.sbjs.get.current.src
Source. utm_source, actually.sbjs.get.current.mdm
Medium, utm_medium. Values can be customized using utm-params andreferrals
.sbjs.get.current.cmp
Campaign. Value of utm_campaign.sbjs.get.current.cnt
Content. Value of utm_content.sbjs.get.current.trm
Keyword. Value of utm_term.
sbjs.get.current_add
Additional info about the visit, when the current
source was written.
sbjs.get.current_add.fd
Date and time of the visit. Format:yyyy-mm-dd hh:mm:ss
. Time zone can be customized viatimezone_offset
.sbjs.get.current_add.ep
Entrance point.sbjs.get.current_add.rf
Referer URL.
sbjs.get.first
& sbjs.get.first_add
Just like sbjs.get.current
& sbjs.get.current_add
, but holds params of the very first visit. Stored once, never overwritten.
sbjs.get.session
Current opened session data.
sbjs.get.session.pgs
How many pages user have seen during the current session.sbjs.get.session.cpg
Current page URL.
sbjs.get.udata
Additional user data: visits, ip & user-agent.
sbjs.get.udata.vst
How many times user visited site.sbjs.get.udata.uip
Current ip-address.sbjs.get.udata.uag
Current user-agent (browser).
sbjs.get.promo
Visitor's promocode. Cookie set only if promocode
option is present.
sbjs.get.promo.code
Promocode.
Cookies
All data are stored in cookies. When the script is pushed to production, visitors will get the following yummies:
- sbjs_current
- sbjs_current_add
- sbjs_first
- sbjs_first_add
- sbjs_session
- sbjs_udata
- sbjs_promo
sbjs_current & sbjs_first
Formattyp=organic|||src=google|||mdm=organic|||cmp=(none)|||cnt=(none)|||trm=(none)
Examples
# source: adv campaign with utms
typ=utm|||src=yandex|||mdm=cpc|||cmp=my_adv_campaign|||cnt=banner_1|||trm=buy_my_stuff
# source: google's SERP
typ=organic|||src=google|||mdm=organic|||cmp=(none)|||cnt=(none)|||trm=(none)
# source: referral from site.com/referer-path
typ=referral|||src=site.com|||mdm=referral|||cmp=(none)|||cnt=/referer-path|||trm=(none)
# source: facebook with custom `referrals` setting
typ=referral|||src=facebook.com|||mdm=social|||cmp=(none)|||cnt=(none)|||trm=(none)
# source: direct visit
typ=typein|||src=(direct)|||mdm=(none)|||cmp=(none)|||cnt=(none)|||trm=(none)
sbjs_current_add & sbjs_first_add
Formatfd=2014-06-11 17:28:26|||ep=http://statica.alexfedoseev.com/sourcebuster-js/|||rf=https://www.google.com
sbjs_session
Format
pgs=3|||cpg=http://statica.alexfedoseev.com/sourcebuster-js/
sbjs_udata
Formatvst=2|||uip=192.168.1.1|uag=Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36
sbjs_promo
Format
code=002354
Limitations
Visits from https to http
When the visitor come from https
web-site to http
, the request don't have a referer. So the script will consider it as typein
(direct visit).