Streaming AWS Lambda logs to AWS Elasticsearch

AWS Lambda is a great service for developing and deploying serverless applications. A Lambda function stores its log messages in CloudWatch Logs, and one invariably ends up with a large and ever-increasing number of log streams, as in the screenshot below. Doing log analysis and debugging operational issues there is possible, but it is neither fun nor effective.

[Screenshot: CloudWatch log streams]

This post provides a step-by-step guide on how to stream the logs from an AWS Lambda function to Elasticsearch Service so that you can use Kibana to search and analyze the log messages.

1. Create Elasticsearch Endpoint

First, you will have to create an AWS Elasticsearch domain. Follow the instructions on AWS here. Once the domain is created, click on the link to it under the Elasticsearch Dashboard and note the Kibana URL (DNS name) under the Overview tab.

2. Format Log Messages in Lambda Function

The log messages from the Lambda function need to be in a format that can be parsed by CloudWatch filter patterns. Typically this means using a logging library instead of print statements to write your log messages. For example, the standard logging module in Python:

import logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

def handle(event, context):
    logger.info('lambda function rocks!')
    ...

will generate the following line in the CloudWatch log:

[INFO]	2019-08-04T15:00:01.283Z	d7d9d6cc-7b2d-447d-a07e-ffc96c940610	lambda function rocks!

One suggestion here is to format your log messages properly using your logging library, as this will make the configuration later much easier and less error-prone.
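As a minimal sketch of what "formatting with the logging library" can look like, the formatter below produces the same tab-separated layout as the sample line above. The class name TabSeparatedFormatter is made up for this example, and the assumption that the request ID is available on the log record as aws_request_id (here passed in via extra) should be checked against your runtime.

```python
import io
import logging
import time


class TabSeparatedFormatter(logging.Formatter):
    """Emit '[LEVEL]<TAB>timestamp<TAB>request_id<TAB>message'."""

    converter = time.gmtime  # render timestamps in UTC

    def format(self, record):
        timestamp = "%s.%03dZ" % (
            self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            record.msecs,
        )
        # Assumption: the request ID is on the record as `aws_request_id`
        # (or supplied through `extra`); fall back to "-" otherwise.
        request_id = getattr(record, "aws_request_id", "-")
        return "\t".join(
            ["[%s]" % record.levelname, timestamp, request_id, record.getMessage()]
        )


# Local demonstration: log to an in-memory stream instead of CloudWatch.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(TabSeparatedFormatter())

demo_logger = logging.getLogger("formatter_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False
demo_logger.addHandler(handler)

demo_logger.info(
    "lambda function rocks!",
    extra={"aws_request_id": "d7d9d6cc-7b2d-447d-a07e-ffc96c940610"},
)
line = stream.getvalue().rstrip("\n")
```

Keeping every message in one predictable layout like this means a single filter pattern can cover all of them.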

3. Enable Streaming from CloudWatch Logs to Elasticsearch Service

Next go to the CloudWatch dashboard in AWS Console and do the following:

  1. Click on Logs to view all your log groups.
  2. Find and select the log group of the Lambda function whose logs you want to stream to Elasticsearch Service, e.g. /aws/lambda/my-lambda-function.
  3. Click on the Actions button at the top and select "Stream to Amazon Elasticsearch Service" from the drop-down.
  4. In the Amazon ES cluster drop-down, select the Elasticsearch domain you created in Step 1. Click Next.
  5. Use this page to select the log format and test it against your log files to make sure the messages are indexed properly in Elasticsearch. For example, with the log message in Step 2, select Log Format Common Log Format with the following Subscription Filter Pattern: [level, timestamp=*Z, request_id="*-*", message]
  6. Click Next and then Start Streaming.
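To get a feel for what the filter pattern in step 5 extracts, here is a rough local approximation in Python. It is not CloudWatch's actual matcher: the function name parse_lambda_log_line is hypothetical, and the sketch assumes whitespace-delimited fields with the last field (message) taking the remainder of the line. Stripping the square brackets from the level is a choice made here for readability.

```python
from typing import Dict, Optional


def parse_lambda_log_line(line: str) -> Optional[Dict[str, str]]:
    """Approximate [level, timestamp=*Z, request_id="*-*", message]."""
    # Split on whitespace into at most four fields; the last one keeps
    # the rest of the line, including any spaces in the message.
    parts = line.strip().split(None, 3)
    if len(parts) != 4:
        return None
    level, timestamp, request_id, message = parts
    # timestamp=*Z      -> the field must end with "Z"
    # request_id="*-*"  -> the field must contain a hyphen
    if not timestamp.endswith("Z") or "-" not in request_id:
        return None
    return {
        "level": level.strip("[]"),
        "timestamp": timestamp,
        "request_id": request_id,
        "message": message,
    }


SAMPLE = (
    "[INFO]\t2019-08-04T15:00:01.283Z\t"
    "d7d9d6cc-7b2d-447d-a07e-ffc96c940610\tlambda function rocks!"
)
fields = parse_lambda_log_line(SAMPLE)
```

Lines that do not fit the pattern return None in this sketch, which is a handy way to spot messages that would not be indexed with the fields you expect. The test box on the console page remains the authoritative check.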

4. Configure Kibana

Now we are ready to set up Kibana to view and search the log messages:

  1. Open Kibana in your browser using the URL noted in Step 1.
  2. Click on the Management tab on the left.
  3. Click Index Patterns->Create index pattern and create an index pattern cwl-*. By default, the Lambda function generated in the previous step (more on this later) creates indices named cwl-<yyyy.mm.dd>.
  4. Click on the Discover tab. The index pattern you just created should appear in the dropdown under the Add a filter+ link. Select it to view the log messages streaming from your Lambda function.

Now that you have the log messages indexed and stored in Elasticsearch, you can use Kibana to, for example:

  • Search log messages by indexed fields, e.g. by log level, keywords in the message, etc.
  • Create dashboards to visualize how the Lambda function is performing
  • Create alerts on certain log events
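As an illustration of the first item, the sketch below builds an Elasticsearch query body that combines a log level with a keyword. The helper name build_log_query is hypothetical, and the field names level and message are assumed to be the ones extracted by the filter pattern in Step 3. Check the field list of your index pattern in Kibana before relying on them.

```python
import json


def build_log_query(level, keyword, size=20):
    """Build a query body matching both a log level and a keyword."""
    return {
        "size": size,
        "query": {
            "bool": {
                "must": [
                    # Assumed field names, taken from the filter pattern.
                    {"match": {"level": level}},
                    {"match": {"message": keyword}},
                ]
            }
        },
    }


query = build_log_query("ERROR", "timeout")
query_json = json.dumps(query, indent=2)
```

A body like this can be pasted into the Kibana Dev Tools console or sent to the search endpoint of the cwl-* indices.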

[Screenshot: Kibana]

What’s Going On Under the Hood

The diagram below shows the solution created by the procedure above (Step 3 in particular). AWS creates a Lambda function that listens to the log events of the source Lambda functions (via their associated CloudWatch log groups). It processes each event payload before sending it to the target Elasticsearch Service domain.

[Diagram: streaming Lambda logs to Elasticsearch]
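The event payload mentioned above arrives base64-encoded and gzip-compressed under event["awslogs"]["data"]. The generated function is written in JavaScript, so the Python sketch below only illustrates the decoding step. The function name decode_cloudwatch_event and all sample values are made up for the example.

```python
import base64
import gzip
import json


def decode_cloudwatch_event(event):
    """Decode a CloudWatch Logs subscription event into a dict."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    return json.loads(gzip.decompress(compressed))


# Build a sample event the same way CloudWatch Logs packs it.
sample_payload = {
    "logGroup": "/aws/lambda/my-lambda-function",
    "logStream": "2019/08/04/[$LATEST]example",  # made-up stream name
    "logEvents": [
        {
            "id": "1",  # made-up event ID
            "timestamp": 1564930801283,
            "message": "[INFO]\t2019-08-04T15:00:01.283Z\t"
            "d7d9d6cc-7b2d-447d-a07e-ffc96c940610\tlambda function rocks!",
        }
    ],
}
packed = gzip.compress(json.dumps(sample_payload).encode("utf-8"))
event = {"awslogs": {"data": base64.b64encode(packed).decode("ascii")}}

decoded = decode_cloudwatch_event(event)
```

The decoded payload carries the log group name alongside the log events, which is what makes it possible to route several log groups through the same function.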

Streaming multiple log groups

The generated Lambda function can be used to stream logs from multiple log groups (and hence multiple source Lambda functions), as shown in the diagram above. However, you need to change the code in the generated Lambda function to include the log group name in the Elasticsearch index name, which works around an issue with the current version of Elasticsearch:

    // index name format: cwl-<log group>-YYYY.MM.DD
    // (slashes in the log group name are replaced with hyphens)
    var indexName = [
        'cwl-' + payload.logGroup.toLowerCase().split('/').join('-') + '-' + timestamp.getUTCFullYear(),  // log group + year
        ('0' + (timestamp.getUTCMonth() + 1)).slice(-2),  // month
        ('0' + timestamp.getUTCDate()).slice(-2)          // day
    ].join('.');
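
To check which index names this change produces without deploying anything, the same logic can be mirrored in a few lines of Python. This is only a sketch for experimentation: the function name index_name is hypothetical, and the code that actually runs is the JavaScript above.

```python
from datetime import datetime, timezone


def index_name(log_group: str, timestamp: datetime) -> str:
    """Mirror the JavaScript above: cwl-<log group>-YYYY.MM.DD in UTC."""
    utc = timestamp.astimezone(timezone.utc)  # expects an aware datetime
    prefix = "cwl-" + log_group.lower().replace("/", "-")
    return f"{prefix}-{utc:%Y.%m.%d}"


name = index_name(
    "/aws/lambda/my-lambda-function",
    datetime(2019, 8, 4, 15, 0, 1, tzinfo=timezone.utc),
)
# name == "cwl--aws-lambda-my-lambda-function-2019.08.04"
```

Note the double hyphen: log group names start with a slash, which also becomes a hyphen. The resulting names still match the cwl-* index pattern created in Step 4, so the Kibana setup does not need to change.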