Logging for CoreOS and Kubernetes: How Containerization Saved the Day!
It seemed a simple request: store the system logs from our EC2 instances living in AWS. Seems pretty fundamental. After all, Amazon already provides an agent for shipping logs to their API. This should be a no-brainer. A slam-dunk!
Not so fast! Our instances run CoreOS as part of our Kubernetes infrastructure, and nothing was showing up in CloudWatch. Time to get creative…
Things get interesting
On a more traditional Linux-based system it would be a simple matter of running the CloudWatch agent, which tails one or more log files and sends that information to AWS via the CloudWatch APIs.
CoreOS is a minimal OS designed for running containers, and it comes with Docker pre-installed as a systemd service. This, along with its focus on security and reliability, is a large part of why we chose CoreOS for running our Kubernetes infrastructure - more on this in a future blog post. As part of this minimal design, CoreOS does not provide a package-management system, like `yum` or `apt`, and does not come with a traditional syslog daemon.
Accessing the logs
If you’re familiar with CoreOS you will know that it uses `systemd`'s logging system, `journald`, for managing logs. The `journald` daemon stores log information in a binary format. A quick look at the docs for `journald` and `journald.conf` shows some configuration options - none of which gave us access to the logs in the format we required. CoreOS does, however, provide the `journalctl` utility for accessing the machine’s journal.
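As a quick illustration of what `journalctl` gives you, here are a couple of invocations we found useful while exploring (the output formats are documented in the `journalctl` man page):

```bash
# Tail the journal in the classic syslog-style "short" format
journalctl -o short -f

# Dump the most recent entry as JSON to see every field journald stores
journalctl -o json -n 1
```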
With this information, the obvious solution is to use `journalctl`. The `journalctl` utility can tail the systemd journal with the `-f` flag: `journalctl -f`. Using this approach we can pipe the logs somewhere. But where? One option would be to pipe them to a file and have the CloudWatch agent read that file. However, without a package-management system, installing the CloudWatch agent on the host is not simple. Plus, we want something that is easy to update and works with our existing container infrastructure. We need something else…
Sending the logs
We are big fans of Docker at InVision, and it’s only natural to look for a solution running in a container. A while back, Amazon announced its Container Service, ECS, and posted a blog article about sending container logs to CloudWatch: “[Send ECS Container Logs to CloudWatch Logs for Centralized Monitoring](http://blogs.aws.amazon.com/application-management/post/TxFRDMTMILAA8X/Send-ECS-Container-Logs-to-CloudWatch-Logs-for-Centralized-Monitoring)”. They were also kind enough to provide a working example that someone had already ported to CoreOS: cloudwatchlogs.
This implementation runs `rsyslog` and the CloudWatch agent together in a container. `rsyslog` listens on a port (in this case `514`) and writes the syslog messages it receives to the container’s filesystem. The CloudWatch agent reads these files and sends the logs to the CloudWatch API. Both processes are managed by a `supervisord` process manager. We modified this implementation a bit internally - mostly to allow setting the Log Group name via an environment variable and changing the log file location.
Now we have somewhere to send the output of `journalctl -f`, but how do we get the logs to the container? There happens to be a handy utility already installed on CoreOS that is perfect for this situation: `ncat`. Ncat is a small utility for reading and writing data across a network. Using `ncat` allows us to do the following: `/usr/bin/journalctl -o short -f | /usr/bin/ncat 127.0.0.1 514`.
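Before baking that pipe into a unit, it’s easy to smoke-test the path by hand - a minimal sketch, assuming the cloudwatchlogs container is already listening on `127.0.0.1:514` (the message text is purely illustrative):

```bash
# Send one syslog-style line to the rsyslog listener in the container
echo "$(date '+%b %d %H:%M:%S') $(hostname) smoke-test: hello CloudWatch" | ncat 127.0.0.1 514
```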
Putting it all together
The CloudWatch agent requires permission to write to the CloudWatch Logs service. You will need to grant the IAM user whose credentials the container uses the following permissions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "logs:Create*",
        "logs:PutLogEvents"
      ],
      "Effect": "Allow",
      "Resource": "arn:aws:logs:*:*:*"
    }
  ]
}
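If you drive IAM from the command line, one way to attach this as an inline policy is via the AWS CLI - a sketch, assuming the JSON above is saved as `cloudwatchlogs-policy.json` and your logging user is named `cloudwatch-logger` (both names are hypothetical):

```bash
# Attach the policy document above as an inline policy on the IAM user
aws iam put-user-policy \
  --user-name cloudwatch-logger \
  --policy-name cloudwatchlogs-write \
  --policy-document file://cloudwatchlogs-policy.json
```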
CoreOS uses `cloud-config` for its instance setup. We make extensive use of `cloud-config` for the initial configuration of all our EC2 instances, so it’s a simple matter of adding these as systemd units to the cloud-config script.
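For orientation, both of the units below sit under the `coreos.units` key of the cloud-config file, roughly like this (a trimmed sketch; the full unit definitions follow):

```yaml
#cloud-config
coreos:
  units:
    - name: cloudwatchlogs.service
      command: start
      content: |
        # full unit file contents, shown below
    - name: journalctl-output.service
      command: start
      content: |
        # full unit file contents, shown below
```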
First we need to configure the CloudWatch container to listen for logs. Since this is a Docker container we want it to run after the Docker service has started.
- name: cloudwatchlogs.service
  command: start
  content: |
    [Unit]
    Description=Cloudwatch Logs Service
    After=docker.service
    Requires=docker.service

    [Service]
    User=core
    Restart=always
    TimeoutStartSec=30
    RestartSec=10
    ExecStartPre=-/usr/bin/docker kill cloudwatchlogs
    ExecStartPre=-/usr/bin/docker rm cloudwatchlogs
    ExecStartPre=/usr/bin/docker pull roundsphere/cloudwatchlogs:latest
    ExecStart=/usr/bin/docker run --name cloudwatchlogs -p 514:514 -e "AWS_ACCESS_KEY_ID=[YOUR-AWS-ACCESS-KEY]" -e "AWS_SECRET_ACCESS_KEY=[YOUR-AWS-SECRET-KEY]" roundsphere/cloudwatchlogs:latest
    ExecStop=/usr/bin/docker stop cloudwatchlogs

    [Install]
    WantedBy=multi-user.target
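Once the instance boots, it’s worth confirming that the unit came up and the container is actually running before moving on - for example:

```bash
# The unit should be active and the container running
systemctl status cloudwatchlogs.service
docker ps --filter name=cloudwatchlogs

# Tail the container's output (supervisord, rsyslog, and the agent all log here)
docker logs -f cloudwatchlogs
```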
For the worker nodes in our Kubernetes infrastructure, the `cloudwatchlogs.service` container can instead be deployed as a DaemonSet. This lets Kubernetes, rather than `systemd`, manage the lifecycle of the CloudWatch log service, and it fits in well with the rest of our application infrastructure.
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: cloudwatch-agent
  labels:
    app: system
    tier: backend
    release: stable
spec:
  template:
    metadata:
      labels:
        app: system
        name: cloudwatch-pod
        version: v1
    spec:
      containers:
        - name: cloudwatch-con
          imagePullPolicy: IfNotPresent
          securityContext:
            privileged: true
          ports:
            - hostPort: 514
              containerPort: 514
              name: rsyslogport
          image: 'roundsphere/cloudwatchlogs:latest'
          env:
            - name: AWS_ACCESS_KEY_ID
              value: [YOUR-AWS-ACCESS-KEY]
            - name: AWS_SECRET_ACCESS_KEY
              value: [YOUR-AWS-SECRET-KEY]
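Rolling it out is the usual kubectl workflow - a sketch, assuming the manifest above is saved as `cloudwatch-daemonset.yaml` (a hypothetical filename):

```bash
# Create the DaemonSet; Kubernetes schedules one cloudwatch pod per node
kubectl create -f cloudwatch-daemonset.yaml

# Verify a pod landed on each worker node
kubectl get pods -o wide -l app=system
```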
Second, we need to configure a service to send the logs to the container. It is just a matter of implementing our `journalctl` command as a systemd service unit. We want this to start after `cloudwatchlogs.service` has started.
- name: journalctl-output.service
  command: start
  content: |
    [Unit]
    Description=Sends log output to container
    After=cloudwatchlogs.service

    [Service]
    Type=simple
    Restart=always
    TimeoutStartSec=60
    RestartSec=60
    ExecStart=/usr/bin/bash -c '/usr/bin/journalctl -o short -f | /usr/bin/ncat 127.0.0.1 514'

    [Install]
    WantedBy=multi-user.target
If you prefer to install these manually to test things out, you can save the content of each service to individual files (in this example, `journalctl-output.service` and `cloudwatchlogs.service`) and place them in the `/etc/systemd/system/` directory. Make sure permissions are set correctly, then enable and start them like so:
sudo systemctl daemon-reload
sudo systemctl enable cloudwatchlogs
sudo systemctl start cloudwatchlogs
sudo systemctl enable journalctl-output
sudo systemctl start journalctl-output
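With both units running, you can sanity-check the whole pipeline from the host:

```bash
# Both units should report active (running)
systemctl status cloudwatchlogs.service journalctl-output.service

# The container's rsyslog should be listening on port 514
ss -tln | grep ':514'

# Watch the shipper unit itself for pipe or connection errors
journalctl -u journalctl-output.service -f
```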
And That’s It!
You should start to see your logs showing up in the AWS CloudWatch Logs console. Note: logs will be grouped under the defined Log Group and then streamed under each instance name. You can modify this behavior by editing the `awslogs.conf` file.
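For reference, the agent’s configuration follows the standard CloudWatch Logs agent `awslogs.conf` format - a minimal sketch, assuming rsyslog inside the container writes to `/var/log/syslog` (the paths and group name here are illustrative, not necessarily what the image uses):

```ini
[general]
state_file = /var/awslogs/state/agent-state

[syslog]
file = /var/log/syslog
log_group_name = coreos-syslog
log_stream_name = {instance_id}
datetime_format = %b %d %H:%M:%S
```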
It’s important to note that this implementation does not guarantee that 100% of the `journald` logs will be sent to CloudWatch. While the container will be restarted if it crashes (and `supervisord` manages the individual processes inside the container), some logs may be lost to timing issues during restarts. We have been running this in production for a few months now and have not had any issues. The overall system impact is minimal, and the combined services have been stable.