superdesk-analytics
v2.1.2
Published
_Jan 2017_
Superdesk Analytics
Overview
This is a plugin for superdesk.
It adds an analytics application page with its API endpoints.
License Notice
You must have a valid license for Highcharts JS v6.x to use this plugin.
The Highcharts JS library is used to render the charts. It is available under different licenses depending on whether it is intended for commercial/government use, or for a personal or non-profit project.
Table of contents
- Installation
- Highcharts License
- Config Options
- Highcharts Export Server
- Scheduled Reports
- Archive Statistics
- Reports
1: These reports require archive statistics to be enabled
Installation
In order to see the analytics feature in the application, you need to enable it.
Client: Enable the superdesk-analytics module
In client/superdesk.config.js from superdesk, add this line:
features : {analytics: true},
This will import the superdesk-analytics node module and load the superdesk.analytics angular module in the main angular application.
Add the following line in the package.json file, dependencies:
"superdesk-analytics": "superdesk/superdesk-analytics#1.5"
Run the command:
npm install
This will install the analytics module.
Server: Load the superdesk-analytics module
--- a/server/settings.py
+++ b/server/settings.py
@@ -132,7 +131,8 @@ INSTALLED_APPS.extend([
+ 'analytics',
])
Add the following line in the requirements.txt file:
git+git://github.com/superdesk/superdesk-analytics.git@<version>#egg=superdesk-analytics
Run the command:
pip install -r requirements.txt
This will install the analytics package.
Development setup
Download and install
git clone git@github.com:superdesk/superdesk-analytics.git
make install
Run the tests
source server/env/bin/activate
make test
Connect the repository to superdesk-client-core
npm link
cd /path/to/superdesk-client-core
npm link superdesk-analytics
Connect the repository to superdesk-core
cd /path/to/superdesk-core
source env/bin/activate
pip install -e /path/to/superdesk-analytics
Highcharts License
You must have a valid license for Highcharts JS v6.x to use this plugin.
The Highcharts JS library is used to render the charts. It is available under different licenses depending on whether it is intended for commercial/government use, or for a personal or non-profit project.
To have the license details available to the end user in the Analytics page of Superdesk, provide the following config options in your settings.py:
- HIGHCHARTS_LICENSE_TYPE (High-Five, Develop or OEM)
- HIGHCHARTS_LICENSEE (Name of the entity that owns the license)
- HIGHCHARTS_LICENSEE_CONTACT (A contact email address for the licensee)
- HIGHCHARTS_LICENSE_ID (The license ID provided by Highsoft)
- HIGHCHARTS_LICENSE_CUSTOMER_ID (A custom license field to use an internal customer number, if required)
- HIGHCHARTS_LICENSE_EXPIRY (the expiry of the license)
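As a minimal sketch, the license options could be set in settings.py along these lines. Every value is a placeholder rather than real license data, and the format of HIGHCHARTS_LICENSE_EXPIRY is an assumption, since it is not specified here.

```python
# settings.py (sketch): Highcharts license details shown on the Analytics page.
# All values are placeholders; replace them with the details of your own license.
HIGHCHARTS_LICENSE_TYPE = 'OEM'  # one of 'High-Five', 'Develop' or 'OEM'
HIGHCHARTS_LICENSEE = 'Example News Agency'
HIGHCHARTS_LICENSEE_CONTACT = 'licensing@example.com'
HIGHCHARTS_LICENSE_ID = 'placeholder-license-id'
HIGHCHARTS_LICENSE_CUSTOMER_ID = 'placeholder-customer-number'
# Assumption: free-form text; the expected format is not stated in this README.
HIGHCHARTS_LICENSE_EXPIRY = 'Perpetual'
```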
Config Options
- HIGHCHARTS_SERVER_HOST (defaults to 'localhost')
- HIGHCHARTS_SERVER_PORT (defaults to '6060')
- HIGHCHARTS_SERVER_WORKERS (defaults to 4) - Number of workers to spawn
- HIGHCHARTS_SERVER_WORK_LIMIT (defaults to 60) - The pieces of work that can be performed before restarting a phantom process
- HIGHCHARTS_SERVER_LOG_LEVEL (defaults to 3) - Set the log level. Available options are 0 (off), 1 (errors), 2 (warn), 3 (notice) and 4 (verbose)
- HIGHCHARTS_SERVER_QUEUE_SIZE (defaults to 10) - How many requests can be held in the overflow queue when there are not enough workers
- HIGHCHARTS_SERVER_RATE_LIMIT (defaults to False) - The max requests allowed in one minute
- ANALYTICS_ENABLE_SCHEDULED_REPORTS (defaults to False) - Enable the emailing of scheduled reports
- ANALYTICS_ENABLE_ARCHIVE_STATS (defaults to False)
- STATISTICS_MONGO_DBNAME (defaults to 'statistics')
- STATISTICS_MONGO_URI (defaults to 'mongodb://localhost/statistics')
- STATISTICS_ELASTIC_URL (defaults to ELASTICSEARCH_URL config)
Highcharts Export Server
To be able to generate charts on the server, we need to install/run the Highcharts Export Server.
Installing the service
npm install -g highcharts-export-server
There is also a script that can be used to automate the install:
cd server/scripts && ./install-highcharts-export-server.sh
This will automatically accept the end user licence and use a specific version of Highcharts, with the following settings:
- ACCEPT_HIGHCHARTS_LICENSE=1
- HIGHCHARTS_VERSION=6.2.0
- HIGHCHARTS_USE_STYLED=1
- HIGHCHARTS_MOMENT=1
Running the service
There is a python module to allow running the highcharts export server.
python3 -u -m analytics.reports.highcharts_server
This will start the service using the host/port configured in settings.py (relative to your current working directory)
- HIGHCHARTS_SERVER_HOST (defaults to 'localhost')
- HIGHCHARTS_SERVER_PORT (defaults to '6060')
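Once the service has been started, a quick way to confirm that something is listening on the configured host and port is a plain TCP connection test. The helper below is a hypothetical sketch and not part of the analytics package; it only checks that the port accepts connections, not that the process behind it is the Highcharts export server.

```python
import socket

# Defaults taken from the options above; adjust to match your settings.py.
HIGHCHARTS_SERVER_HOST = 'localhost'
HIGHCHARTS_SERVER_PORT = '6060'


def export_server_is_reachable(host=HIGHCHARTS_SERVER_HOST,
                               port=HIGHCHARTS_SERVER_PORT,
                               timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # The port setting is documented as a string, so convert it first.
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `export_server_is_reachable()` with no arguments tests the default `localhost:6060`.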
Adding the service to a Honcho Procfile
You can add the service to your Honcho Procfile with the following line:
highcharts: python3 -u -m analytics.reports.highcharts_server
Configuring Process Workers
The export server uses a pool of PhantomJS worker processes. You can configure this pool in settings.py:
- HIGHCHARTS_SERVER_WORKERS (defaults to 4) - Number of workers to spawn
- HIGHCHARTS_SERVER_WORK_LIMIT (defaults to 60) - The pieces of work that can be performed before restarting a phantom process
- HIGHCHARTS_SERVER_LOG_LEVEL (defaults to 3) - Set the log level. Available options are:
- 0 - off
- 1 - errors
- 2 - warn
- 3 - notice
- 4 - verbose
- HIGHCHARTS_SERVER_QUEUE_SIZE (defaults to 10) - How many requests can be held in the overflow queue when there are not enough workers
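For illustration, a settings.py fragment that overrides the pool defaults might look like the following. The numbers are arbitrary examples, not tuning recommendations.

```python
# settings.py (sketch): PhantomJS worker pool for the export server.
# The values are illustrative only; the documented defaults are in the comments.
HIGHCHARTS_SERVER_WORKERS = 8       # default is 4
HIGHCHARTS_SERVER_WORK_LIMIT = 100  # default is 60
HIGHCHARTS_SERVER_LOG_LEVEL = 2     # 0 off, 1 errors, 2 warn, 3 notice, 4 verbose
HIGHCHARTS_SERVER_QUEUE_SIZE = 20   # default is 10
```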
Rate limiting Highcharts requests
The Highcharts export server can rate limit the requests that it receives. Rate limiting is turned off by default; you can turn it on in your settings.py:
- HIGHCHARTS_SERVER_RATE_LIMIT (defaults to False) - The max requests allowed in one minute
Scheduled Reports
To enable reports to be periodically scheduled (emailed), you must enable the config in settings.py. If this is enabled, then the celery queue entry will be created.
- ANALYTICS_ENABLE_SCHEDULED_REPORTS (defaults to False) - Enable the emailing of scheduled reports
Archive Statistics
Archive statistics are generated from the archive_history collection and stored in an archive_statistics collection. This allows in-depth reports on content, desk and user activity.
Generating these stats is disabled by default (due to resource requirements), and can be enabled in settings.py:
- ANALYTICS_ENABLE_ARCHIVE_STATS=True
The data for archive statistics is stored in a separate mongodb database and elastic index. These can be configured with the following:
- STATISTICS_MONGO_DBNAME (defaults to 'statistics')
- STATISTICS_MONGO_URI (defaults to 'mongodb://localhost/statistics')
- STATISTICS_ELASTIC_URL (defaults to ELASTICSEARCH_URL config)
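A hedged sketch of how these options could sit together in settings.py. Reading the values from environment variables of the same name is an assumption about your deployment, not something the plugin requires, and the ELASTICSEARCH_URL line merely stands in for the value your settings already define.

```python
import os

# settings.py (sketch): enable archive statistics and configure their storage.
ANALYTICS_ENABLE_ARCHIVE_STATS = True

# Fall back to the documented defaults when the environment does not override them.
STATISTICS_MONGO_DBNAME = os.environ.get('STATISTICS_MONGO_DBNAME', 'statistics')
STATISTICS_MONGO_URI = os.environ.get(
    'STATISTICS_MONGO_URI',
    'mongodb://localhost/{}'.format(STATISTICS_MONGO_DBNAME)
)

# Placeholder for the ELASTICSEARCH_URL already present in your settings.
ELASTICSEARCH_URL = os.environ.get('ELASTICSEARCH_URL', 'http://localhost:9200')
STATISTICS_ELASTIC_URL = os.environ.get('STATISTICS_ELASTIC_URL', ELASTICSEARCH_URL)
```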
Without enabling the archive stats, the reports that require archive statistics will be disabled.
Statistics are generated using the following celery beat schedule:
- CELERY_BEAT_SCHEDULE['analytics:gen_archive_stats']
If the above is not defined, the statistics are generated at 3am every day.
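To run the generation at a different time, the schedule entry can be defined in settings.py. The fragment below is a sketch that assumes the standard Celery beat entry format. The task path is a made-up placeholder, because the real task name is not given in this README; look it up in the installed analytics package before using this.

```python
from celery.schedules import crontab

# Placeholder for the CELERY_BEAT_SCHEDULE dict already defined in your settings.
CELERY_BEAT_SCHEDULE = {}

CELERY_BEAT_SCHEDULE['analytics:gen_archive_stats'] = {
    # HYPOTHETICAL task path: replace with the task name used by the package.
    'task': 'analytics.gen_archive_stats',
    # Run at 01:30 every day instead of the default 3am.
    'schedule': crontab(minute=30, hour=1),
}
```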
Archive Reports
Content Publishing
As an administrator I would like to be able to generate metadata reports around content publishing.
I would like to be able to group, and optionally subgroup, the report by the following metadata:
- Source
- Category
- News Value
- Genre
The report should have the ability to filter data based on the following:
- Date (yesterday, last week, last month, range)
- Desk
- User
- Published State (published, killed, corrected, recalled, rewrite/update)
I should be able to choose one of the following chart types for the report:
- Bar
- Column
- Table
Publishing Performance
As an administrator I would like to generate a report detailing Publishing Performance.
I would like to be able to group the report by one of the following:
- Desk
- User
- Category
- Source
- News Value
- Genre
Each group should be broken down to the following publish states:
- Kills
- Corrections
- Rewrites
- Updates
- Originals
Advanced Archive Reports
Desk Activity
As an administrator I would like to be able to analyse peak times on a desk.
This report will show the total incoming and outgoing actions on the desk, grouped by hour or by day (based on a drop down field on the form). These actions should include:
Incoming:
- Manually created by a user
- Sent to the desk (by the send to action)
- Fetched from ingested content
- Unspiked
- Duplicated
- System routed
Outgoing:
- Published
- Sent from the desk (by the send to action)
- Spiked
Below the chart should be a count of all the actions that made up this report, displayed in a table.
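The grouping described above can be sketched in a few lines of Python. This is an illustration of the counting logic only, not the plugin's implementation; the operation names and the `(timestamp, operation)` input shape are hypothetical.

```python
from collections import Counter
from datetime import datetime

# Hypothetical operation names, split by the direction they count towards.
INCOMING = {'create', 'move_to', 'fetch', 'unspike', 'duplicate', 'route'}
OUTGOING = {'publish', 'move_from', 'spike'}


def desk_activity(actions, interval='hour'):
    """Count incoming and outgoing actions per hour or per day.

    `actions` is an iterable of (timestamp, operation) tuples.
    Returns {bucket label: Counter with 'incoming' and 'outgoing' counts}.
    """
    fmt = '%Y-%m-%d %H:00' if interval == 'hour' else '%Y-%m-%d'
    buckets = {}
    for timestamp, operation in actions:
        if operation in INCOMING:
            direction = 'incoming'
        elif operation in OUTGOING:
            direction = 'outgoing'
        else:
            continue  # not a desk transition, so it is not counted
        buckets.setdefault(timestamp.strftime(fmt), Counter())[direction] += 1
    return buckets


actions = [
    (datetime(2018, 6, 1, 9, 5), 'create'),
    (datetime(2018, 6, 1, 9, 40), 'publish'),
    (datetime(2018, 6, 1, 10, 15), 'fetch'),
    (datetime(2018, 6, 1, 10, 20), 'spike'),
    (datetime(2018, 6, 1, 10, 55), 'publish'),
]
by_hour = desk_activity(actions, 'hour')
by_day = desk_activity(actions, 'day')
```

Here `by_hour` holds one bucket per hour that saw activity, while `by_day` collapses the same actions into a single bucket for the day.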
Featuremedia Updates
As an administrator I would like to be able to analyse updates to story attachments.
This report will display a table of each update to stories that have assets attached to them.
The updates will include:
- Pictures added
- Pictures removed
- Pictures altered
Links should be provided to the images and stories so the user can view the image in detail.
Production Time
As an administrator I would like to be able to analyse time spent on stories.
This report will provide statistics on the minimum, average and maximum amount of time spent on producing stories for desks.
These figures will be calculated based on when a story enters a desk, and when that same story exits the desk.
Incoming:
- Manually created by a user
- Sent to the desk (by the send to action)
- Fetched from ingested content
- Unspiked
- Duplicated
- System routed
Outgoing:
- Published
- Sent from the desk (by the send to action)
- Spiked
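As a worked example of the calculation, the sketch below derives the minimum, average and maximum time from pairs of enter and exit timestamps. The function name and input shape are hypothetical and only illustrate the arithmetic; they do not reflect how the plugin stores or queries its statistics.

```python
from datetime import datetime, timedelta


def production_time_stats(visits):
    """Return (minimum, average, maximum) time spent on a desk.

    `visits` is a list of (entered, exited) datetime pairs, one pair per
    visit of a story to the desk. Returns None when there are no visits.
    """
    durations = [exited - entered for entered, exited in visits]
    if not durations:
        return None
    average = sum(durations, timedelta()) / len(durations)
    return min(durations), average, max(durations)


visits = [
    (datetime(2018, 6, 1, 9, 0), datetime(2018, 6, 1, 9, 30)),    # 30 minutes
    (datetime(2018, 6, 1, 10, 0), datetime(2018, 6, 1, 11, 30)),  # 90 minutes
    (datetime(2018, 6, 1, 12, 0), datetime(2018, 6, 1, 13, 0)),   # 60 minutes
]
shortest, average, longest = production_time_stats(visits)
```

With these three visits the shortest time is 30 minutes, the longest is 90 minutes and the average is 60 minutes.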
Update Time
As an administrator I would like to be able to analyse updates to 3 par stories.
I expect this report to show how long it took to create these updates.
User Activity
This report will show a breakdown of a user's workload during a single day, broken down by the individual stories that they worked on.
I expect to be able to click on any single story in the chart to display a second chart showing the life-cycle of that story, from initial creation to the last action taken on it.
Planning Module Reports
Planning Usage
As an administrator I would like to be able to analyse who is using Planning.
I expect this report to include the list of users who are using Planning, and a list of users who are not using Planning.
The report will include counts per user for the following actions:
- Planning items created
- Coverages created
- Coverages that are assigned to workflow? - (do we need this)
- Events created? - (do we need this)
I also expect users who do not have permission to use Planning to be excluded from this report.