@cazoo/telemetry
v0.16.8
Codified standards for open telemetry
Cazoo Telemetry
A wrapper around OpenTelemetry for getting traces and telemetry into your life
Basic concepts
OpenTelemetry (https://opentelemetry.io/) is a standard for observability.
Instead of logger.info('request sent to aws'), you'll have something more like const trace = parent.startChild('awsRequest'), followed at some point by trace.end().
A standard log line records when an event happened. A trace records when it happened, how long it took, and where it sits in the hierarchy of operations within the trace.
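To make the difference concrete, here is a toy model of what a trace record carries beyond a log line. ToySpan, ToyRecord and the injected clock are hypothetical names made up for this sketch; they are not part of @cazoo/telemetry and say nothing about how the library is implemented.

```typescript
// Hypothetical toy model, NOT the @cazoo/telemetry implementation.
type Clock = () => number

interface ToyRecord {
  name: string
  parentName: string | undefined
  startedAt: number
  duration: number
}

class ToySpan {
  private readonly startedAt: number

  constructor(
    private readonly spanName: string,
    private readonly clock: Clock,
    private readonly sink: ToyRecord[],
    private readonly parentName?: string
  ) {
    // The start time is captured when the span is created.
    this.startedAt = clock()
  }

  startChild(childName: string): ToySpan {
    return new ToySpan(childName, this.clock, this.sink, this.spanName)
  }

  // Nothing is recorded until the span is ended.
  end(): void {
    this.sink.push({
      name: this.spanName,
      parentName: this.parentName,
      startedAt: this.startedAt,
      duration: this.clock() - this.startedAt,
    })
  }
}

// Fake clock that advances 5 units on every reading, so the run is deterministic.
let tick = 0
const fakeClock: Clock = () => (tick += 5)

const toyRecords: ToyRecord[] = []
const handlerSpan = new ToySpan('handler', fakeClock, toyRecords)
const awsRequestSpan = handlerSpan.startChild('awsRequest')
awsRequestSpan.end()
handlerSpan.end()
```

Each record knows its start, its duration and its parent, which is the information a plain log line lacks.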
Our telemetry data is sent through to https://www.honeycomb.io/ where it can be viewed and analysed.
Basic usage
NB: All examples can be found in the examples directory of this repository. Follow the directions in the README over there.
yarn add @cazoo/telemetry
npm install --save @cazoo/telemetry
The entry point is Telemetry. Call Telemetry.start(name) and it will return a Trace object. You have to end the trace in order for its telemetry to be logged.
// yarn example:basic
import { Telemetry } from '@cazoo/telemetry'
const trace = Telemetry.start('basic')
trace.end()
/*
{
"traceId":"38d55155fb57b62757f509288b14ea4f",
"name":"basic",
"id":"421f5158d03eac4a",
"kind":0,
"timestamp":1658228429531994,
"duration":0,
"attributes":{},
"status": {
"code":0
},
"events":[]
}
*/
Including AWS Context
The syntax for this is Telemetry.startWithContext(name, event, context, options).
The startWithContext method pulls relevant information out of your AWS event and context.
// yarn example:awsContext
// event and context from the unit test data used in @cazoo/telemetry
import { event, context } from '../tests/data/awsgateway'
import { APIGatewayProxyEvent, Context } from 'aws-lambda'
import { Telemetry } from '@cazoo/telemetry'
function handle(event: APIGatewayProxyEvent, context: Context): void {
const trace = Telemetry.startWithContext('handler', event, context)
trace.end()
}
handle(event, context)
/*
{
"traceId": "4d110d20fdcc8516d23df6c594833f76",
"name": "handler",
"id": "e6ba7f033be6f7af",
"kind": 0,
"timestamp": 1658228683032974,
"duration": 1,
"attributes": {
"request_id": "request-id",
"account_id": "12345678912",
"function.name": "my-function",
"function.version": "v1.0.1",
"function.service": "log-stream",
"http.path": "/hello/world",
"http.method": "POST",
"http.stage": "testStage",
"http.query": "{\"name\":[\"me\"],\"multivalueName\":[\"you\",\"me\"]}"
},
"status": {
"code": 0
},
"events": []
}
*/
Child Traces
One of the things OpenTelemetry offers is a hierarchy of execution.
You build this hierarchy by taking your root trace and creating children from it.
// yarn example:children
import { Telemetry } from '@cazoo/telemetry'
const queryDynamo = (): void => {
// dummy function
}
const trace = Telemetry.start('root')
const child = trace.startChild('queryingDynamo')
queryDynamo()
child.end()
trace.end()
/* This generated two traces. This one is a root trace and has no parentId
{
"traceId": "5d38dc29c40d55c6bb781c44112728dc",
"name": "root",
"id": "338340157a29db83",
"kind": 0,
"timestamp": 1658228710320277,
"duration": 2,
"attributes": {},
"status": {
"code": 0
},
"events": []
}
*/
/* This trace is a child of the root and is linked by the parent id
{
"traceId": "5d38dc29c40d55c6bb781c44112728dc",
"parentId": "338340157a29db83",
"name": "queryingDynamo",
"id": "63b7ef930d8e14fe",
"kind": 0,
"timestamp": 1658228710320763,
"duration": 0,
"attributes": {},
"status": {
"code": 0
},
"events": []
}
*/
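The parentId link is enough to rebuild the hierarchy from the emitted records. The sketch below assumes only the id, parentId and name fields shown in the output above; SpanRecord and renderTree are hypothetical helpers, not exports of @cazoo/telemetry.

```typescript
// Only the fields used here; they match the example output above.
interface SpanRecord {
  id: string
  parentId?: string
  name: string
}

// Render each span indented under its parent. Spans whose parent is not
// in the list are treated as roots.
function renderTree(records: SpanRecord[]): string[] {
  const ids = new Set(records.map((r) => r.id))
  const lines: string[] = []
  const visit = (parentId: string | undefined, depth: number): void => {
    for (const record of records) {
      const isRoot = record.parentId === undefined || !ids.has(record.parentId)
      const belongsHere = parentId === undefined ? isRoot : record.parentId === parentId
      if (belongsHere) {
        lines.push(`${'  '.repeat(depth)}${record.name}`)
        visit(record.id, depth + 1)
      }
    }
  }
  visit(undefined, 0)
  return lines
}

// The two records from the example, deliberately listed child first.
const childExampleTree = renderTree([
  { id: '63b7ef930d8e14fe', parentId: '338340157a29db83', name: 'queryingDynamo' },
  { id: '338340157a29db83', name: 'root' },
])
console.log(childExampleTree.join('\n'))
```

The order in which records arrive does not matter, because the tree is derived from the ids alone.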
Adding supplementary context
If you need to include additional information in a trace, you can do so with appendContext.
// yarn example:appendContext
import { Telemetry } from '@cazoo/telemetry'
const queryDynamo = (): string => {
// dummy function
return 'some result'
}
const trace = Telemetry.start('root')
const result = queryDynamo()
trace.appendContext({ result })
trace.end()
/*
{
"traceId": "8f59474987c2840e8c7b97091099e036",
"name": "root",
"id": "2431dbc40ffda91d",
"kind": 0,
"timestamp": 1614180894315418,
"duration": 1,
"attributes": {
"result": "some result"
},
"status": {
"code": 0
},
"events": []
}
*/
Propagating supplementary context
If you create a child after appending context, the appended information is propagated to that child.
// yarn example:propagate
import { Telemetry } from '@cazoo/telemetry'
const queryDynamo = (): string => {
// dummy function
return 'some result'
}
const trace = Telemetry.start('root')
const child = trace.startChild('queryingDynamo')
const withoutContext = child.startChild('subChildWithoutContext')
withoutContext.end()
const result = queryDynamo()
child.appendContext({ result })
const withContext = child.startChild('subChildWithContext')
withContext.end()
child.end()
trace.end()
/* This time, we're producing 4 traces. Any context appended is *not* propagated to parents
{
"traceId": "9e964e47b1d0ecd166056c9242391d2b",
"name": "root",
"id": "fafddc9e47a060a7",
"kind": 0,
"timestamp": 1614180944224978,
"duration": 3,
"attributes": {},
"status": {
"code": 0
},
"events": []
}
*/
/* The context appended is included within the `attributes` property
{
"traceId": "65f96121416465d10344824fb24379e0",
"parentId": "3936b85acc29c85d",
"name": "queryingDynamo",
"id": "6e05a7bebfa166c7",
"kind": 0,
"timestamp": 1658228790300710,
"duration": 1,
"attributes": {
"result": "some result"
},
"status": {
"code": 0
},
"events": []
}
*/
/* context is not propagated to a child that has already been created
{
"traceId": "9e964e47b1d0ecd166056c9242391d2b",
"parentId": "594302faad4cbb6c",
"name": "subChildWithoutContext",
"id": "f673b204193f3884",
"kind": 0,
"timestamp": 1614180944225634,
"duration": 0,
"attributes": {},
"status": {
"code": 0
},
"events": []
}
*/
/* context is propagated to any children created afterwards
{
"traceId": "65f96121416465d10344824fb24379e0",
"parentId": "6e05a7bebfa166c7",
"name": "subChildWithContext",
"id": "29f947c2334e8a59",
"kind": 0,
"timestamp": 1658228790302075,
"duration": 0,
"attributes": {
"result": "some result"
},
"status": {
"code": 0
},
"events": []
}
*/
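The three rules in the comments above (not to parents, not to existing children, yes to later children) fall out naturally if a child copies its parent's attributes at creation time. The toy model below illustrates that idea; ToyTrace is hypothetical and this is an assumption about one way to get the behaviour, not a description of the library's internals.

```typescript
// Hypothetical toy model of the propagation rules described above.
type Attributes = Record<string, unknown>

class ToyTrace {
  private attributes: Attributes

  constructor(readonly label: string, inherited: Attributes = {}) {
    // Copy, so later changes on the parent do not leak into this span.
    this.attributes = { ...inherited }
  }

  appendContext(extra: Attributes): void {
    this.attributes = { ...this.attributes, ...extra }
  }

  startChild(label: string): ToyTrace {
    // The child receives whatever the parent holds at this moment.
    return new ToyTrace(label, this.attributes)
  }

  snapshot(): Attributes {
    return { ...this.attributes }
  }
}

const toyRoot = new ToyTrace('root')
const toyChild = toyRoot.startChild('queryingDynamo')
const earlyChild = toyChild.startChild('subChildWithoutContext')
toyChild.appendContext({ result: 'some result' })
const lateChild = toyChild.startChild('subChildWithContext')
```

Here toyRoot and earlyChild keep empty attributes, while toyChild and lateChild both hold result.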
Matching the schema in honeycomb
Honeycomb is looking for a set of specific fields. I've added a utility method to help you place them.
// yarn example:schema
import { Telemetry } from '@cazoo/telemetry'
const trace = Telemetry.start('root')
trace.schema({
error: 'error',
httpStatusCode: 404,
route: '/search',
})
trace.end()
/* We've added the error, the httpStatusCode and the route to the attributes. These will then be accessible to honeycomb.
{
"traceId": "689d4d4a725ba260b961b347af9298c9",
"name": "root",
"id": "e71955b42a8c6031",
"kind": 0,
"timestamp": 1614181146057262,
"duration": 1,
"attributes": {
"error": "error",
"isError": true,
"httpStatusCode": 404,
"route": "/search"
},
"status": {
"code": 0
},
"events": []
}
*/
Handling errors
Errors can be added through the schema({}) method, but it will probably be more convenient to use endWithError(error). This does the same, except it also ends the trace.
// yarn example:error
import { Telemetry } from '@cazoo/telemetry'
const trace = Telemetry.start('root')
try {
throw new Error('oops!')
trace.end()
} catch (error) {
trace.endWithError(error)
}
/*
{
"traceId": "3d59dc56bfa4deb5b1f2da66dcb889a5",
"name": "root",
"id": "ce3749a2d8510d8f",
"kind": 0,
"timestamp": 1614181187125848,
"duration": 47,
"attributes": {
"error": "oops!",
"errorStackTrace": "Error: oops!\n at Object.<anonymous> (/Users/jason.luong/Documents/projects/telemetry/examples/error.ts:5:9)\n at Module._compile (internal/modules/cjs/loader.js:1063:30)\n at Module.m._compile (/Users/jason.luong/Documents/projects/telemetry/examples/node_modules/ts-node/src/index.ts:1043:23)\n at Module._extensions..js (internal/modules/cjs/loader.js:1092:10)\n at Object.require.extensions.<computed> [as .ts] (/Users/jason.luong/Documents/projects/telemetry/examples/node_modules/ts-node/src/index.ts:1046:12)\n at Module.load (internal/modules/cjs/loader.js:928:32)\n at Function.Module._load (internal/modules/cjs/loader.js:769:14)\n at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:72:12)\n at main (/Users/jason.luong/Documents/projects/telemetry/examples/node_modules/ts-node/src/bin.ts:225:14)\n at Object.<anonymous> (/Users/jason.luong/Documents/projects/telemetry/examples/node_modules/ts-node/src/bin.ts:512:3)",
"isError": true
},
"status": {
"code": 0
},
"events": []
}
*/
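The try/catch above tends to repeat at every call site, so it can be folded into a small helper. The sketch below assumes only the end() and endWithError(error) methods used in the example; Endable and withTrace are hypothetical names, and the parameter type of endWithError is assumed to accept any thrown value.

```typescript
// Only the two methods used here; both appear in the example above.
interface Endable {
  end(): void
  endWithError(error: unknown): void
}

// Run `work`, then end the trace exactly once: with the error if it threw,
// normally otherwise. The error is rethrown so callers still see it.
function withTrace<T>(trace: Endable, work: () => T): T {
  let result: T
  try {
    result = work()
  } catch (error) {
    trace.endWithError(error)
    throw error
  }
  trace.end()
  return result
}

// A fake trace that records which method was called, for demonstration only.
const traceCalls: string[] = []
const fakeTrace: Endable = {
  end: () => {
    traceCalls.push('end')
  },
  endWithError: (error) => {
    traceCalls.push(`endWithError:${error instanceof Error ? error.message : String(error)}`)
  },
}

const okValue = withTrace(fakeTrace, () => 42)

let rethrown = false
try {
  withTrace(fakeTrace, () => {
    throw new Error('oops!')
  })
} catch {
  rethrown = true
}
```

An async variant would await work() inside the try block and return a Promise.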
Masking sensitive information
Currently this library provides two exporters designed to remove sensitive details from the attributes.
The masked exporter lets the user specify the set of attributes that are allowed through to the telemetry backend.
The sanitised exporter instead lets the user specify a set of regular expressions and redacts every string or substring that matches any of them.
Combining exporters: because the exporters follow the decorator pattern, they can be combined by passing one exporter to another's constructor.
The masked exporter
If you need to mask sensitive information in a trace, you can do it using the MaskedExporterDecorator exporter. N.B. This will mask all attributes by default.
If you need to unmask information, you can supply an array of allowedFieldPaths to the masker.
// yarn example:masker
import {
Telemetry,
StdOutExporter,
MaskedExporterDecorator,
} from '@cazoo/telemetry'
const stdOutExporter = new StdOutExporter()
const allowedFieldPaths = ['data.id']
const maskedExporterDecorator = new MaskedExporterDecorator(
stdOutExporter,
allowedFieldPaths
)
const trace = Telemetry.start('root', { exporter: maskedExporterDecorator })
trace.appendContext({ data: { email: '[email protected]', id: '1234-5678-9101' } })
trace.end()
/*
{
"traceId": "0f32f5dd3465771a31cf5c155ae20cbe",
"name": "root",
"id": "57ff6ebdb662267d",
"kind": 0,
"timestamp": 1617699714136696,
"duration": 1,
"attributes": {
"data.email": "[REDACTED]",
"data.id": "1234-5678-9101"
},
"status": {
"code": 0
},
"events": []
}
*/
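The allow-list behaviour in that output can be pictured as "flatten to dotted paths, then redact every path that is not allowed". The sketch below reproduces the output shape under that assumption; flatten and maskAllExcept are hypothetical helpers written for illustration and are not the decorator's actual code.

```typescript
type Flat = Record<string, unknown>

// Turn nested objects into a single level keyed by dotted paths.
function flatten(value: Record<string, unknown>, prefix = ''): Flat {
  const out: Flat = {}
  for (const [key, inner] of Object.entries(value)) {
    const path = prefix ? `${prefix}.${key}` : key
    if (inner !== null && typeof inner === 'object' && !Array.isArray(inner)) {
      Object.assign(out, flatten(inner as Record<string, unknown>, path))
    } else {
      out[path] = inner
    }
  }
  return out
}

// Redact everything by default; only explicitly allowed paths keep their value.
function maskAllExcept(attributes: Record<string, unknown>, allowedFieldPaths: string[]): Flat {
  const allowed = new Set(allowedFieldPaths)
  const masked: Flat = {}
  for (const [path, inner] of Object.entries(flatten(attributes))) {
    masked[path] = allowed.has(path) ? inner : '[REDACTED]'
  }
  return masked
}

const maskedExample = maskAllExcept(
  { data: { email: 'someone@example.com', id: '1234-5678-9101' } },
  ['data.id']
)
console.log(JSON.stringify(maskedExample))
```

Defaulting to redaction means a newly added attribute stays hidden until someone deliberately allows it.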
The sanitised exporter
If you want to redact sensitive values, such as email addresses or phone numbers, from the trace attributes regardless of where they appear, you can use the sanitised exporter.
Specify the patterns to mask as a list of RegExp objects, and the exporter will replace any occurrence of those expressions with [REDACTED] or with a custom placeholder.
The module CommonSensitiveInfoPatterns, located at src/utils, provides a set of regular expressions for values that are usually considered sensitive information.
// yarn example:sanitised
import {
Telemetry,
StdOutExporter,
SanitisedExporterDecorator,
CommonSensitiveInfoPatterns,
} from '@cazoo/telemetry'
const stdOutExporter = new StdOutExporter()
const sanitisedExporterDecorator = new SanitisedExporterDecorator(
stdOutExporter,
[CommonSensitiveInfoPatterns.EMAIL, CommonSensitiveInfoPatterns.PHONE_NUMBER, /bar/],
'*****'
)
const trace = Telemetry.start('root', { exporter: sanitisedExporterDecorator })
trace.appendContext({
contacts: {
email: '[email protected]',
mobile: '+44 8087339090',
random_list: ['foobar', 'barbaz']
},
id: '1234-5678-9101'
})
trace.end()
/*
{
"traceId": "d4ace7437c073568f07628b1742b45f0",
"name": "root",
"id": "69e33bc21cf9f4b4",
"kind": 0,
"timestamp": 1634727242512792,
"duration": 1,
"attributes": {
"contacts.email": "*****",
"contacts.mobile": "*****",
"contacts.random_list": {
"0": "foo*****",
"1": "*****baz"
},
"id": "1234-5678-9101"
},
"status": {
"code": 0
},
"events": []
}
*/
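Pattern-based redaction of this kind amounts to walking every value and replacing each match in every string. The sketch below is a minimal illustration under that assumption; redact is a hypothetical helper, the email pattern is a deliberately crude stand-in for CommonSensitiveInfoPatterns.EMAIL, and unlike the output above it keeps arrays as arrays.

```typescript
// Recursively replace every match of every pattern inside string values.
function redact(value: unknown, patterns: RegExp[], placeholder = '[REDACTED]'): unknown {
  if (typeof value === 'string') {
    return patterns.reduce((text, pattern) => {
      // Force the global flag so every occurrence is replaced, not just the first.
      const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g'
      // A replacer function avoids special handling of `$` in the placeholder.
      return text.replace(new RegExp(pattern.source, flags), () => placeholder)
    }, value)
  }
  if (Array.isArray(value)) {
    return value.map((item) => redact(item, patterns, placeholder))
  }
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {}
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = redact(inner, patterns, placeholder)
    }
    return out
  }
  // Numbers, booleans, null and undefined pass through untouched.
  return value
}

// Crude illustrative email pattern, an assumption made for this sketch.
const crudeEmail = /[^\s@]+@[^\s@]+\.[^\s@]+/

const redactedExample = redact(
  { contact: 'mail someone@example.com now', random_list: ['foobar', 'barbaz'], count: 3 },
  [crudeEmail, /bar/],
  '*****'
)
console.log(JSON.stringify(redactedExample))
```

Because matching works on substrings, only the sensitive part of a longer string is replaced and the rest stays readable.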
Timeout logging
The library closes all open traces just before a Lambda timeout; otherwise every open trace, including the root trace, would be lost.
Because of the way Lambda works, this has to be logged before the actual timeout happens. We call the time between closing the traces and the timeout the buffer. The default buffer is 10ms, and it can be overridden with the environment variable CAZOO_LOGGER_TIMEOUT_BUFFER_MS.
The traces are closed with an error attribute of type lambda.timeout, indicating that the timeout happened. This will also count as an error in Honeycomb.
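The buffer arithmetic can be sketched as follows. In a real handler the remaining time would come from the AWS Lambda context's getRemainingTimeInMillis(); here it is passed in as a number so the example stays self-contained. The helper names, and the choice to fall back to the default on a missing or invalid value, are assumptions made for this sketch rather than documented library behaviour.

```typescript
const DEFAULT_BUFFER_MS = 10

type Env = Record<string, string | undefined>

// Read the buffer from an environment-like map. Falling back to the default
// on a missing, empty or non-numeric value is an assumption of this sketch.
function timeoutBufferMs(env: Env): number {
  const raw = env['CAZOO_LOGGER_TIMEOUT_BUFFER_MS']
  const parsed = raw === undefined || raw.trim() === '' ? NaN : Number(raw)
  return Number.isFinite(parsed) && parsed >= 0 ? parsed : DEFAULT_BUFFER_MS
}

// How long to wait before closing open traces, given the time the Lambda
// runtime reports as remaining. Never negative.
function closeTracesAfterMs(remainingMs: number, env: Env): number {
  return Math.max(0, remainingMs - timeoutBufferMs(env))
}

const withDefaultBuffer = closeTracesAfterMs(3000, {})
const withCustomBuffer = closeTracesAfterMs(3000, { CAZOO_LOGGER_TIMEOUT_BUFFER_MS: '50' })
```

A larger buffer leaves more room for the traces to be written out, at the cost of cutting the handler's useful time slightly shorter.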
Telemetry Debug mode
You can enable debug logging in the Telemetry library by setting the environment variable TELEMETRY_DEBUG=1. Any truthy value will work. This provides debug logging of the creation and destruction of spans and other Telemetry behaviours.
Cross services tracing
Let's say you have a front-end server-side lambda that uses the library to trace the time it takes to talk to an API endpoint.
That API endpoint also uses this package and starts its trace with startWithContext, passing in the AWS Proxy event.
You can link those two traces and get a single trace for the whole request across front end and back end.
To do this, first change the request to your API so that it passes headers generated from the front-end trace.
const serviceARootTrace = Telemetry.startWithContext('serviceA', someEvent);
// ...
// We create a child trace to track the call to our API
const apiCallTrace = serviceARootTrace.startChild("serviceA_queryingServiceB");
try {
const response = await fetch(
`/serviceB/api/endpoint`,
{
headers: {
...apiCallTrace.asHttpHeaders()
}
}
);
} finally {
apiCallTrace.end();
}
You then need to update your API to use continueFromContext to signal that you wish to try to continue the incoming trace:
const serviceBTrace = Telemetry.continueFromContext('serviceB', apiGatewayEvent);
Once this is set up, the API endpoint trace will automatically be a child of the front end one.
This is what the output of the trace will look like:
[
{
"traceId": "c210bdfbdc8b731f93d6111ab162bdfc",
"name": "serviceA",
"id": "ae805c11916bae76",
"kind": 0,
"timestamp": 1658229123052375,
"duration": 2,
"attributes": {},
"status": {
"code": 0
},
"events": []
},
{
"traceId": "5cf3221f8e1904b21aad2190644acfe1",
"parentId": "d8ea76a78b9ab874",
"name": "serviceA_queryingServiceB",
"id": "066b40f1e319a702",
"kind": 0,
"timestamp": 1618231896652018,
"duration": 2,
"attributes": {},
"status": {
"code": 0
},
"events": []
},
{
"traceId": "5cf3221f8e1904b21aad2190644acfe1",
"parentId": "066b40f1e319a702",
"name": "serviceB",
"id": "043ff5c29516f2e4",
"kind": 0,
"timestamp": 1618231896652238,
"duration": 0,
"attributes": {
"account_id": "missing"
},
"status": {
"code": 0
},
"events": []
}
]
This is what it looks like in Honeycomb (screenshot not included): myAccount.fetchOrders was created by the account-app-main-account My Account SSR Lambda, and getCustomerOrders was created by the order-service-getCustomerOrders API.
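This README does not say which headers asHttpHeaders() produces. One widely used convention for carrying a trace across an HTTP call is the W3C Trace Context traceparent header, which packs a version, the trace id, the parent span id and a flags byte into one string. The sketch below shows that format only as an illustration of the idea; toTraceparent and fromTraceparent are hypothetical helpers, and it is an assumption that the library uses anything like this.

```typescript
interface SpanIdentity {
  traceId: string // 32 lowercase hex characters
  spanId: string // 16 lowercase hex characters
}

// Build a W3C traceparent value: version-traceId-parentId-flags.
function toTraceparent({ traceId, spanId }: SpanIdentity, sampled = true): string {
  return `00-${traceId}-${spanId}-${sampled ? '01' : '00'}`
}

// Parse a traceparent value, returning undefined when it is malformed.
function fromTraceparent(header: string): SpanIdentity | undefined {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-[0-9a-f]{2}$/.exec(header.trim())
  if (!match) return undefined
  return { traceId: match[1], spanId: match[2] }
}

// Ids taken from the serviceA_queryingServiceB record in the output above.
const outgoingHeader = toTraceparent({
  traceId: '5cf3221f8e1904b21aad2190644acfe1',
  spanId: '066b40f1e319a702',
})
const incomingIdentity = fromTraceparent(outgoingHeader)
```

The receiving service reuses the trace id and records the incoming span id as its parentId, which is the relationship visible between serviceA_queryingServiceB and serviceB in the output.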
Contributing
CI package version check
The GitHub Actions workflows used for continuous integration and deployment are configured to automatically test and release new versions of the Cazoo-uk/telemetry package on the NPM registry at https://www.npmjs.com/package/@cazoo/telemetry.
The process includes a verification of the package version set in package.json.
If the version is not updated, the CI "test" workflow is configured to fail, preventing the pull request from being merged.
To skip this check on a workflow run, insert skip-release in the commit message. Doing this on the merge commit will skip the release of a new package version.
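The decision the check makes can be summarised in a few lines. This sketch assumes that "not updated" means the version in package.json equals the latest published version; the actual workflow may compare differently, and versionCheck is a hypothetical function, not part of the repository's CI scripts.

```typescript
type CheckResult = 'skipped' | 'pass' | 'fail'

// Decide the outcome of the version check for one workflow run.
function versionCheck(
  packageVersion: string,
  publishedVersion: string,
  commitMessage: string
): CheckResult {
  // The skip marker takes priority over any version comparison.
  if (commitMessage.includes('skip-release')) return 'skipped'
  // Assumption: an unchanged version is one equal to the published version.
  return packageVersion === publishedVersion ? 'fail' : 'pass'
}

const bumped = versionCheck('0.16.9', '0.16.8', 'fix: handle empty events')
const unchanged = versionCheck('0.16.8', '0.16.8', 'docs: tidy README')
const skipped = versionCheck('0.16.8', '0.16.8', 'docs: tidy README skip-release')
```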