Ping a Specific Port

Question

Michał Paluchowski

Asked: 2020-05-21 10:12:41 +0800 CST2020-05-21 10:12:41 +0800 CST 2020-05-21 10:12:41 +0800 CST

Outputting JSON logs on Elastic Beanstalk with Amazon Linux 2

772

We're trying to migrate our Java applications from from the current Elastic Beanstalk JDK 8 platforms to the new ones running Corretto 11 on Amazon Linux 2. The app works well, but the way logs are handled changed. The output from the web process is now stored in /var/log/web.stdout.log and every line is prefixed with the timestamp and process name, ie:

May 20 17:00:00 ip-10-48-41-129 web: {"timestamp":"2020-05-20T17:00:00.035Z","message":"...","logger":"...","thread":"MessageBroker-2","level":"INFO"}

How can we get rid of the prefix? These logs are streamed into CloudWatch and we're outputting them to stdout in JSON so that we can later query them with Logs Insights. But with the prefix, Insights doesn't "see" the JSON and just treats the whole line as a text blob.

I cannot find any documentation for that on AWS. Almost all of Elastic Beanstalk's documentation refers to the first version of Amazon Linux.

4 Answers

Voted

Michał Paluchowski · Answer 1 · 2020-05-21T23:17:13+08:00

I found a solution that works well enough, so I'll post it here for posterity. If someone can suggest a better one, please do.

Elastic Beanstalk on Amazon Linux 2 relies on rsyslog for log processing and output. There's a file at /opt/elasticbeanstalk/config/private/rsyslog.conf that during deployment gets copied over to /etc/rsyslog.d/web.conf and it's the one that's directing all output from the web application to /var/log/web.stdout.log.

The file doesn't include any custom template. It relies on rsyslog's default template, which prefixes any %msg% with a timestamp and $programname (which is web in this case).

I tried replacing this file via an .ebextensions config, but that didn't work, because Elastic Beanstalk seems to be overwriting this file after .ebextensions run. So I added an additional platform hook that deletes the file, keeping the custom one I added.

Here's the .ebextensions/logs.config file:

files:
  "/etc/rsyslog.d/web-files.conf":
    mode: "000644"
    owner: root
    group: root
    content: |
      template(name="WebTemplate" type="string" string="%msg%\n")

      if $programname == 'web' then {
        *.=warning;*.=err;*.=crit;*.=alert;*.=emerg; /var/log/web.stderr.log;WebTemplate
        *.=info;*.=notice /var/log/web.stdout.log;WebTemplate
      }

commands:
  remove-.bak-rsyslog:
    command: rm -f *.bak
    cwd: /etc/rsyslog.d

and .platform/hooks/predeploy/remove-default-rsyslog-conf.sh (make sure you chmod +x this one):

#!/bin/sh
rm /etc/rsyslog.d/web.conf
systemctl restart rsyslog.service

Mike · Answer 2 · 2021-08-07T14:15:25+08:00

I used Platform Hooks to accomplish this. The only catch is /etc/rsyslog.d/web.conf is replaced on both application and configuration deployments, so you need a hook for both.

This approach avoids messing with Elastic Beanstalk's internal files in /opt/elasticbeanstalk/config/private (which have changed since the previous answers - rsyslog.conf no longer exists). Also, platform hooks are now preferred over ebextensions.

If you use CodeBuild, don't forget to include the platformFiles directory (or wherever you put your files) in your output artifact.

NOTE: This code assumes the name of the process is web. If you defined a different process name in your Procfile, use that instead. However, I think the rsyslog config should always be at /etc/rsyslog.d/web.conf despite the process name.

Make sure all of your .sh files are executable using chmod +x.

.platform/hooks/predeploy/10_logs.sh

#!/bin/sh
sudo platformFiles/setupLogs.sh

.platform/confighooks/predeploy/10_logs.sh

#!/bin/sh
sudo platformFiles/setupLogs.sh

platformFiles/setupLogs.sh

#!/bin/sh
# By default logs output to /var/log/web.stdout.log are prefixed. We want just the raw logs from the app.
# This updates the rsyslog config. Also grants read permissions to the log files.

set -eu

mv platformFiles/rsyslogWebConf.conf /etc/rsyslog.d/web.conf

touch /var/log/web.stdout.log
touch /var/log/web.stderr.log
chmod +r /var/log/web.stdout.log
chmod +r /var/log/web.stderr.log

systemctl restart rsyslog.service

platformFiles/rsyslogWebConf.conf

# This file is created from Elastic Beanstalk platform hooks.

template(name="WebTemplate" type="string" string="%msg%\n")

if $programname == 'web' then {
  *.=warning;*.=err;*.=crit;*.=alert;*.=emerg; /var/log/web.stderr.log;WebTemplate
  *.=info;*.=notice /var/log/web.stdout.log;WebTemplate
}

Conjecture

It looks like /opt/elasticbeanstalk/config/private/rsyslog.conf was replaced with /opt/elasticbeanstalk/config/private/rsyslog.conf.template:

# This rsyslog file redirects Elastic Beanstalk platform logs.
# Logs are initially sent to syslog, but we also want to divide
# stdout and stderr into separate log files.

{{range .ProcessNames}}if $programname  == '{{.}}' then {
  *.=warning;*.=err;*.=crit;*.=alert;*.=emerg /var/log/{{.}}.stderr.log
  *.=info;*.=notice /var/log/{{.}}.stdout.log
}
{{end}}

Based on this, I speculate Elastic Beanstalk uses this template to generate a single /etc/rsyslog.d/web.conf file which contains a block for every defined process name. Because both an application and a configuration deployment can change the defined processes, it makes sense this file is recreated after both.

puttputt · Answer 3 · 2020-06-25T06:36:54+08:00

I'm going to put my solution down here, although its a bit of a different problem than op - its around the same idea and hopefully answers some of the other commenters questions about nodejs.

This is for Amazon Linux 2 running Node.js 12.x

Problem: nodejs stdout logs are mixed in with nginx logs under "web" and are badly formatted.

Previously, in Amazon Linux 1 running Nodejs, these logs are separated into /var/log/nodejs/nodejs.log and /var/log/nginx/access.log. Having them combined and prefixed with the ip address just makes them an an absolute mess.

I followed the solutions put forwarded and modified them a bit.

.ebextension config that modifies rsyslog.conf to split logs into two different files. I'm filtering on a pattern I saw in my log files, but you could use any regex that is compatible with RainerScript.

I'm not sure if theey want you to be editing this file, as another commenter has pointed out, sincei is private. If you're not comfortable with this, I would recommend writing to your own log files rather than stdout. You have more control that way.

files:
  "/opt/elasticbeanstalk/config/private/rsyslog.conf" :
    mode: "000755"
    owner: root
    group: root
    content: |
        template(name="WebTemplate" type="string" string="%msg%\n")

        if $programname  == 'web' and $msg startswith '#033[' then {
            *.=warning;*.=err;*.=crit;*.=alert;*.=emerg; /var/log/web.stderr.log
            *.=info;*.=notice /var/log/web.stderr.log;
        } else if( $programname == 'web') then /var/log/node/nodejs.log;WebTemplate

Now that I have a new log file that is more or less, just node stdout, that needs to be streamed to cloudwatch as well as included in the log bundle. These configs are documented (although not updated for Amazon Linux 2). No configs are overwritten here, they simple add new configurations .

files:
  "/opt/elasticbeanstalk/config/private/logtasks/bundle/node.conf" :
    mode: "000755"
    owner: root
    group: root
    content: |
      /var/log/node/nodejs.log


files:
  "/opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.d/node.json" :
    mode: "000755"
    owner: root
    group: root
    content: |
        {
            "logs": {
                "logs_collected": {
                    "files": {
                        "collect_list": [
                           
                            {
                                "file_path": "/var/log/node/nodejs.log",
                                "log_group_name": "/aws/elasticbeanstalk/[environment_name]/var/log/node/nodejs.log",
                                "log_stream_name": "{instance_id}"
                            }
                        ]
                    }
                }
            }
        }

commands:
    append_and_restart:
        command: /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a append-config -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/etc/node.json -s

Anthony Webb · Answer 4 · 2021-12-04T11:52:07+08:00

If you are using a lambda to move stuff into loggly (as per their documentation here https://documentation.solarwinds.com/en/success_center/loggly/content/admin/cloudwatch-logs.htm) you can simply modify the lambda to extract the JSON and drop off the leading string that is causing grief. Here is the lambda I am using, and it works great to clean stuff up and post my JSON logs.

Sidenote, I use NodeJS - Fastify which includes Pino under the hood for nice JSON logs. Some great insight on setting up the logs can be found here https://jaywolfe.dev/blog/setup-your-fastify-server-with-logging-the-right-way-no-more-express/

'use strict';

/*
 *  To encrypt your secrets use the following steps:
 *
 *  1. Create or use an existing KMS Key - http://docs.aws.amazon.com/kms/latest/developerguide/create-keys.html
 *
 *  2. Expand "Encryption configuration" and click the "Enable helpers for encryption in transit" checkbox
 *
 *  3. Paste <your loggly customer token> into the kmsEncryptedCustomerToken environment variable and click "Encrypt"
 *
 *  4. Give your function's role permission for the `kms:Decrypt` action using the provided policy template
*/

const AWS = require('aws-sdk');
const http = require('http');
const zlib = require('zlib');


// loggly url, token and tag configuration
// user needs to edit environment variables when creating function via blueprint
// logglyHostName, e.g. logs-01.loggly.com
// logglyTags, e.g. CloudWatch2Loggly
const logglyConfiguration = {
    hostName: process.env.logglyHostName,
    tags: process.env.logglyTags,
};

// use KMS to decrypt customer token in kmsEncryptedCustomerToken environment variable
const decryptParams = {
    CiphertextBlob: Buffer.from(process.env.kmsEncryptedCustomerToken, 'base64'),
    EncryptionContext: { LambdaFunctionName: process.env.AWS_LAMBDA_FUNCTION_NAME },
};

const kms = new AWS.KMS({ apiVersion: '2014-11-01' });

kms.decrypt(decryptParams, (error, data) => {
    if (error) {
        logglyConfiguration.tokenInitError = error;
        console.log(error);
    } else {
        logglyConfiguration.customerToken = data.Plaintext.toString('ascii');
    }
});

// entry point
exports.handler = (event, context, callback) => {
    const payload = Buffer.from(event.awslogs.data, 'base64');

    // converts the event to a valid JSON object with the sufficient infomation required
    function parseEvent(logEvent, logGroupName, logStreamName) {
        return {
            // remove '\n' character at the end of the event
            message: extractJSON(logEvent.message.trim())[0],
            logGroupName,
            logStreamName,
            timestamp: new Date(logEvent.timestamp).toISOString(),
        };
    }

    // joins all the events to a single event
    // and sends to Loggly using bulk endpoint
    function postEventsToLoggly(parsedEvents) {
        if (!logglyConfiguration.customerToken) {
            if (logglyConfiguration.tokenInitError) {
                console.log('error in decrypt the token. Not retrying.');
                return callback(logglyConfiguration.tokenInitError);
            }
            console.log('Cannot flush logs since authentication token has not been initialized yet. Trying again in 100 ms.');
            setTimeout(() => postEventsToLoggly(parsedEvents), 100);
            return;
        }

        // get all the events, stringify them and join them
        // with the new line character which can be sent to Loggly
        // via bulk endpoint
        const finalEvent = parsedEvents.map(JSON.stringify).join('\n');

        // creating logglyURL at runtime, so that user can change the tag or customer token in the go
        // by modifying the current script
        // create request options to send logs
        try {
            const options = {
                hostname: logglyConfiguration.hostName,
                path: `/bulk/${logglyConfiguration.customerToken}/tag/${encodeURIComponent(logglyConfiguration.tags)}`,
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                    'Content-Length': finalEvent.length,
                },
            };

            const req = http.request(options, (res) => {
                res.on('data', (data) => {
                    const result = JSON.parse(data.toString());
                    if (result.response === 'ok') {
                        callback(null, 'all events are sent to Loggly');
                    } else {
                        console.log(result.response);
                    }
                });
                res.on('end', () => {
                    console.log('No more data in response.');
                    callback();
                });
            });

            req.on('error', (err) => {
                console.log('problem with request:', err.toString());
                callback(err);
            });

            // write data to request body
            req.write(finalEvent);
            req.end();
        } catch (ex) {
            console.log(ex.message);
            callback(ex.message);
        }
    }
    
    function extractJSON(str) {
        var firstOpen, firstClose, candidate;
        firstOpen = str.indexOf('{', firstOpen + 1);
        do {
            firstClose = str.lastIndexOf('}');
            console.log('firstOpen: ' + firstOpen, 'firstClose: ' + firstClose);
            if(firstClose <= firstOpen) {
                return null;
            }
            do {
                candidate = str.substring(firstOpen, firstClose + 1);
                console.log('candidate: ' + candidate);
                try {
                    var res = JSON.parse(candidate);
                    console.log('...found');
                    return [res, firstOpen, firstClose + 1];
                }
                catch(e) {
                    console.log('...failed');
                }
                firstClose = str.substr(0, firstClose).lastIndexOf('}');
            } while(firstClose > firstOpen);
            firstOpen = str.indexOf('{', firstOpen + 1);
        } while(firstOpen != -1);
    }

    zlib.gunzip(payload, (error, result) => {
        if (error) {
            callback(error);
        } else {
            const resultParsed = JSON.parse(result.toString('ascii'));
            const parsedEvents = resultParsed.logEvents.map((logEvent) =>
                    parseEvent(logEvent, resultParsed.logGroup, resultParsed.logStream));

            postEventsToLoggly(parsedEvents);
        }
    });
};

Outputting JSON logs on Elastic Beanstalk with Amazon Linux 2

Conjecture

Can you pass user/pass for HTTP Basic Authentication in URL parameters?

Ping a Specific Port

Check if port is open or closed on a Linux server?

How to automate SSH login with password?

How do I tell Git for Windows where to find my private RSA key?

What's the default superuser username/password for postgres after a new install?

What port does SFTP use?

Command line to list users in a Windows Active Directory group?

What is a Pem file and how does it differ from other OpenSSL Generated Key File Formats?

How to determine if a bash variable is empty?