AWS Log
The aws-log module is intended for AWS serverless functions to use as their exclusive way to write to STDOUT and STDERR. By using this library you will:
- be assured that all data sent to your logging solution arrives as a structured JSON string
- have contextual information sent along with the details of each log message
- get automatic creation of a correlation-id for cross-function tracing
- allow your "shipper" function to filter log messages based on a configured "severity"
Installing
In your project root add the aws-log module:
# npm
npm install --save aws-log
# yarn
yarn add aws-log
Logging
Now that the dependency is installed you can import it in your code like so:
import { logger } from "aws-log";
const log = logger();
log.info("this is a log message", { foo: 1, bar: 2 });
In this simple example this will produce the following output to STDOUT:
{
"@x-correlation-id": "1234-xyzd-abcd",
"@severity": 1,
"message": "this is a log message",
"foo": 1,
"bar": 2
}
Things to note about usage:
- You must call the logger() function to get the primary logging functions, which are: info, debug, warn and error.
- We ALWAYS get a JSON object as a return (good for logging frameworks).
- The first calling parameter is mapped to the message parameter in the output.
- The second calling parameter is optional but allows you to add other structured attributes which help to define the log message.
- Every message will have a @severity attached to it. This is one-to-one mapped to which log function you choose: { DEBUG: 0, INFO: 1, WARN: 2, ERROR: 3 }
- Every message will have a @x-correlation-id attached to it ... more on that later.

Note: there is no "timestamp" attribute appended; we leave that off because AWS includes the timestamp by default on each log entry. Please do ensure your shipper function picks up the timestamp and adds it into the JSON payload.
Context
While each log message has unique data which you will want to log, there is also "context" that when placed next to the specific message can make the data much more searchable and thereby more useful. This idea of context will be broken into two parts:
- Global Context: context that is relevant for the full execution of the function and whose parameters you would never expect to be overwritten by the local logging.
- Local Context: context which may only be relevant for a shorter period or might be overwritten by the local logging event.
Global Context
There are two primary ways to set global context but here's the most basic:
const log = logger().context({ foo: "bar" });
log.info("this is a log message", { foo: 1, bar: 2 });
In this example the output would be:
{
"@x-correlation-id": "1234-xyzd-abcd",
"@severity": 1,
"@timestamp": 2234234,
"message": "this is a log message",
"foo": 1,
"bar": 2,
"context": {
"foo": "bar"
}
}
Every call to debug, info, warn and error will now always include the properties you have passed in as context.
Note: If your specific log content includes a property context then the logger will rename it to _context_. It is important for function-to-function consistency that the meaning of "context" remain consistent.
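As an illustrative sketch of that rename (the values here are made up), logging a payload which itself carries a context property would produce something along these lines:

const log = logger().context({ service: "checkout" });
log.info("user logged in", { context: "password-flow" });

// expected shape of the output (illustrative):
// {
//   "@x-correlation-id": "1234-xyzd-abcd",
//   "@severity": 1,
//   "message": "user logged in",
//   "_context_": "password-flow",        // local property renamed by the logger
//   "context": { "service": "checkout" } // global context remains untouched
// }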
Global Context in AWS Lambda
The signature of a Lambda function looks like this:
export function handler(event, context) { ... }
In order to provide consistent "context" in Lambda functions as described above we suggest you initialize your logging functions like so:
const { log, debug, info, warn, error } = logger().lambda(event, context);
This allows for "smart" extraction of context. By smart we mean that typically there are two distinct types of Lambda execution:
- Functions called from API Gateway (aka, an external API endpoint)
- Functions called from other functions
The main difference in these two situations is in the data passed in as the event. In the case of an API-Gateway call, the event has lots of meta-data travelling with it. For a complete list refer to Lambda Proxy Request. The quick summary is that it passes the client's "query parameters", "path parameters" and "body" of the message. This makes up the distinct "request" that will be considered in your functions but it also passes a bunch of variant data about the client such as "what browser?", "which geography?", etc. For a normal lambda-to-lambda function call the "event" is exactly what the calling function passed in.
The context object is largely the same between the two types of Lambdas mentioned above but in both cases provides some useful meta-data for logging. For those interested, the full typing is here: IAWSLambaContext.
All this information, regardless of which type of function it is, becomes "background knowledge" as aws-log will take care of all the contextual information for you if you use .lambda(event, context), providing you with the following attributes on your context property:
/** the REST command (e.g., GET, PUT, POST, DELETE) */
httpMethod: string;
/** the path to the endpoint called */
path: string;
/** query parameters; aka, the name-value pairs following the "?" in the URL */
queryStringParameters: string;
/** parameters passed in via the path itself */
pathParameters: string;
/** the caller's user agent string */
userAgent: string;
/** the country from which the request hit CloudFront */
country: string;
/** the function handler which led to this log message */
functionName: string;
functionVersion: string;
/** the CloudWatch log stream the log was sent to */
logStreamName: string;
/** the AWS requestId which is unique for this function call */
requestId: string;
/** the version from package.json file (for serverless function, not other libs) */
packageVersion: string;
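As a hedged sketch of what this looks like in practice (the attribute values below are invented purely for illustration), a handler using .lambda(event, context) would emit entries shaped roughly like this:

export function handler(event, context) {
  const log = logger().lambda(event, context);
  log.info("order received", { orderId: "abc123" });
}

// illustrative output shape:
// {
//   "@x-correlation-id": "1234-xyzd-abcd",
//   "@severity": 1,
//   "message": "order received",
//   "orderId": "abc123",
//   "context": {
//     "httpMethod": "POST",
//     "path": "/orders",
//     "functionName": "myapp-prod-createOrder",
//     "requestId": "c6af9ac6-7b61-11e6-9a41-aaaaaaaaaaaa",
//     "packageVersion": "1.0.0"
//   }
// }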
Lambda Context outside of Handler Function
We've already discussed the utility of passing the event and context attributes to the logger, and in the handler function we have a simple way of achieving this as these two objects are immediately available:
export function handler(event, context) {
const log = logger().lambda(event, context);
// ...
}
But unless we keep passing around the event and context, how would we maintain context in logging that's in a utility function, etc.? The answer is that after the context has been set with logger().lambda(event, context) you can simply write:
const log = logger().reloadContext();
function doSomething() {
log.info("something has happened");
}
More Context
The original, and generic, logger.context(obj) method allowed us to add whatever name/value pairs we pleased, but with logger.lambda(event, context) we rely on aws-log to choose context for us. This is probably good enough for most situations but wherever you want to add more you can do so easily enough:
// in the handler function
const log = logger().lambda(event, context, moreContext);
// somewhere else
const log = logger().reloadContext(moreContext);
NOTE: while both signatures are valid, the first one is STRONGLY recommended because "context" is meant to be information which is valid for the full execution of the function. Typically we'd expect this to be established on the first line of the handler function, not later in the execution.
Correlation ID
The correlation ID -- which shows up as @x-correlation-id in the log entry -- is an ID whose scope is meant to stay consistent across a whole graph or fan-out of function executions. This scoping is SUPER useful because within AWS most logging is isolated to a single function execution, but in a micro-services architecture this often represents too narrow a view.
The way the correlation ID is set is that when "context" is provided -- typically via the lambda(event, context) parameters -- the logger looks for a property x-correlation-id in the "headers" property of the event. This means that if you are originating a request via API-Gateway, you can pass in this value as part of the request. In fact, it is often the case that a graph of function executions does originate from API-Gateway, but even in this situation we typically suggest the client does not send in a correlation ID unless there is a chain of logging that preceded this call on the client side. In most cases, the absence of a correlation ID results in one being created automatically. Once it is created, though, it must be forwarded to all other executions downstream. This is achieved via a helper method provided by this library called invoke.
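To sketch that end-to-end flow (the ARN and payload below are placeholders, not part of the library's documented examples):

import { logger, invoke } from "aws-log";

// Function A: if the caller sent an "x-correlation-id" header it is picked up,
// otherwise a correlation ID is generated automatically.
export async function handlerA(event, context) {
  const log = logger().lambda(event, context);
  log.info("received request"); // entry carries "@x-correlation-id"

  // forwarding with `invoke` (see below) carries the correlation ID downstream
  const fnArn = "arn:aws:lambda:us-east-1:837955399040:function:myapp-prod-functionB";
  await invoke(fnArn, { foo: 1 });
}

// Function B: its log entries share the same "@x-correlation-id" as Function A.
export function handlerB(event, context) {
  const log = logger().lambda(event, context);
  log.info("continuing the same execution graph");
}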
ENV filtering
While in development you almost always want ALL log entries to make it to your logging solution, this is not always the case when you're in production (or other environments). For this reason aws-log provides a means to configure which logs should be sent. Options include:
- all - all logs of a given log level will be sent to STDOUT
- none - no logs of a given log level will be sent
- sample-by-session - when a function is initialized, the sampling rate is evaluated once and, if within the bounds, all messages of the given log level will be logged for that session (aka, as long as this logger stays in memory)
- sample-by-event - each time a log event is encountered, the sampling rate is evaluated and, if within the sampling window, the event is logged
Filtering based on the environment is set by default based on the value of the AWS_STAGE
environment variable. The defaults turn on logging for all events in DEV/TEST but block debug logs and only sample logs at the info level in STAGE/PROD. The default sampling rate is 10%.
If you'd like to set your own configuration then you can by passing it in as part of the call to logger( config ). You can explicitly state the configuration you want or you can just state a "delta" off of the default configuration:
Delta Config: changes sampling rate to 25% when in STAGE/PROD
const config: IAwsLogConfig = { sampleRate: .25 }
const log = logger(config);
Full Config: explicitly state config; it is assumed that caller has considered the ENV
const config: IAwsLogConfig = { debug: "none", info: "none", warn: "all", error: "all" }
const log = logger(config);
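A combined sketch is shown below; note that using the option names from the list above (e.g., "sample-by-session") as per-level values is an assumption on our part, so check the IAwsLogConfig typing before relying on it:

const config: IAwsLogConfig = {
  debug: "none",             // never ship debug logs
  info: "sample-by-session", // assumed value: sample whole sessions at the configured rate
  warn: "all",
  error: "all",
  sampleRate: 0.25           // 25% sampling wherever sampling applies
};
const log = logger(config);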
Keeping Secrets
Logging is great until you start logging secrets. This is a common problem, so aws-log has features to help you avoid this:
- addToMaskedValues(...values: string) - adds one or more values to the "masked" category
- setMaskedValues(...values: string) - same as addToMaskedValues but instead of adding, it sets the masked values (removing any previously set values)
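A minimal sketch of how this might be wired up is below; note that importing these functions directly from aws-log is an assumption here, so confirm the export location in the package's typings:

import { logger, addToMaskedValues } from "aws-log"; // export location assumed

// register the secret so it is masked wherever it would appear in log output
addToMaskedValues(process.env.DB_PASSWORD || "");

const log = logger();
log.info("connecting to database"); // any occurrence of the masked value is obscured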
Once you've stated the values which should be masked you should consider the masking "strategy"; the strategies available are:
- astericksWidthFixed
- astericksWidthDynamic
- revealEnd4
- revealStart4
All masked values will default to a single strategy, which is astericksWidthDynamic. You can also override the default strategy by passing in a tuple of the form [value, strategy] when setting the value with addToMaskedValues/setMaskedValues; for example:
const log = logger().mask('abcd', 'efghi', [ 'foobar', "revealEnd4" ])
In this example the first two values will pick up and use the default strategy but the value "foobar" will be masked using the "revealEnd4" strategy.
The invoke Function
The standard way of calling a Lambda function from within a Lambda is through the invoke method of the AWS Lambda interface:
import { Lambda } from "aws-sdk";
const lambda = new Lambda({ region: "us-east-1" });
lambda.invoke(params, function (err, data) { ... });
As a convenience, this library provides invoke, which provides the same functionality but with a simplified calling structure:
import { invoke } from "aws-log";
try {
await invoke(fnArn, request, options);
} catch (e) {
// your error handler here or if you like just ignore the try/catch
// and let the error cascade down to a more general error handler
}
Note: the AWS API exposes both an invoke and an invokeAsync, which is somewhat confusing because invoke can also be asynchronous! At this point no one should use the invokeAsync call as it is deprecated, and therefore we do not expose it in our API.
The invoke API
The invoke API only requires two parameters:
- the ARN representing the function you are calling
- the parameters you want to send in (as an object)
See below for an example:
await invoke("arn:aws:lambda:us-east-1:837955399040:function:myapp-prod-myfunction", {
foo: 1,
bar: 2
});
To make it more compact, you can set the following environment variables:
- First, you don't actually need the arn:aws:lambda prefix at all; it will be assumed if the string doesn't start with "arn".
- Second, if you set the AWS_REGION environment variable for your function then you can leave off that component.
- Third, if you provide AWS_ACCOUNT as a variable then you no longer need to state that in the string.
- Finally, if you provide AWS_STAGE then you can leave off the prod | dev | etc. portion.
That means if you do all of the above you only need the following:
await invoke("myfunction", { foo: 1, bar: 2 });
This also has the added benefit of dynamically adjusting to the stage you are in (which you'll almost always want).
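As a sketch of what that relies on (the values are borrowed from the full-ARN example above; normally these would be set by your deployment, not in code):

// with environment variables along these lines ...
//   AWS_REGION  = "us-east-1"
//   AWS_ACCOUNT = "837955399040"
//   AWS_STAGE   = "prod"
// ... the compact call resolves to the same target as the full ARN:
//   arn:aws:lambda:us-east-1:837955399040:function:myapp-prod-myfunction
await invoke("myfunction", { foo: 1, bar: 2 });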
The last parameter in the signature is options (which is typed for those of you with intellisense), but basically this gives you an option to:
- turn on the "dryrun" feature AWS exposes
- specify a specific version of the function (rather than the default)
Now if you weren't already sold on why you should be invoking using this more compact API rather than the API AWS provides, here's the clincher ... using invoke ensures that x-correlation-id and other contextual parameters are passed along to the next function, so that logging in the next function will have the same correlation ID (which is, after all, the intent of a correlation ID).
To see what parameters are being passed forward to the next function, look at the IAwsInvocationContext interface defined in types.ts.
The stepFunction API
AWS Step Functions are a quite useful tool in the "serverless toolkit", but they also provide a manner in which many Lambda functions will be executed together in concert. For this reason we must ensure that the Correlation ID is maintained throughout this fan-out. Fortunately this is easily achieved with the stepFunction API surface:
// To start a step function
import { stepFunction } from "aws-log";
function anyFunction() {
stepFunction.start(arn, props);
}
Then, in any function which is involved in the steps of a step function, be sure to complete your handler with:
import { logger, stepFunction } from "aws-log";
function handler(event, context, callback) {
const log = logger().lambda(event, context);
/** ... */
callback(null, stepFunction.response( response ));
}
Shipping
In Lambda you can specify a particular Lambda function to be executed with your serverless
functions to accept your STDOUT and STDERR streams. If you have an external logging
solution then you should attach a "shipping" function to ship these entries to that
external solution. Here at Inocan Group we use Logzio and if you do
as well you should feel free to use ours:
logzio-shipper
.
License
Copyright (c) 2019 Inocan Group
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.