The Big Easy: Visualizing Logging Data by Integrating Fluentd and vRealize Log Insight with VMware PKS

Editor’s note: On February 26th, 2019, VMware renamed VMware PKS to VMware Enterprise PKS. To learn more about the change, read here.

Centralized logging is an essential part of any enterprise Kubernetes deployment. Configuring and maintaining a real-time high-performance central repository for log collection can ease the day-to-day operations of tracking what went wrong and its impact. Effective central logging also helps development teams quickly observe application logs to characterize application performance. Security compliance and auditing often require a company to maintain digital trails of who did what and when. In most cases, a robust logging solution is the most efficient way to satisfy these requirements.

Out of the box, VMware PKS creates a powerful logging layer on top of Kubernetes by using a combination of Fluentd and VMware vRealize Log Insight. This post describes how this integration works and how you can leverage it to quickly capture aggregated container logs from your Kubernetes pods and view them in the vRealize Log Insight dashboard. The following diagram gives you a high-level view of this integration: 

Container Log Format and Log File

By default, container engines such as Docker capture each container's standard output and standard error streams and use the json-file logging driver on each host to write the messages to files. Docker maintains a separate log file for each container, and on a Kubernetes node these files are reachable through the /var/log/containers directory. Each log entry consists of the following:

  • Log message
  • Message origin – stdout or stderr
  • Timestamp
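
As a minimal sketch of how such an entry can be read, the Python below parses one line in the shape the json-file driver writes (the log, stream, and time keys). The sample line and the parse_entry helper are illustrative assumptions, not output from a real cluster, and the sketch assumes the timestamp always carries a fractional part.

```python
import json
from datetime import datetime, timezone

# Illustrative line in the json-file driver's shape; the content is made up.
SAMPLE_LINE = (
    '{"log":"GET /healthz 200\\n","stream":"stdout",'
    '"time":"2019-02-26T10:15:30.123456789Z"}'
)


def parse_entry(line):
    """Split one json-file log line into message, origin, and timestamp."""
    entry = json.loads(line)
    # Docker writes nanosecond precision; Python's datetime keeps microseconds,
    # so trim the fractional part to six digits before parsing.
    date_part, frac = entry["time"].rstrip("Z").split(".")
    timestamp = datetime.strptime(
        f"{date_part}.{frac[:6]}", "%Y-%m-%dT%H:%M:%S.%f"
    ).replace(tzinfo=timezone.utc)
    return {
        "message": entry["log"].rstrip("\n"),
        "origin": entry["stream"],  # "stdout" or "stderr"
        "timestamp": timestamp,
    }


if __name__ == "__main__":
    print(parse_entry(SAMPLE_LINE))
```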

  

Fluentd

Fluentd is an open source log processor and forwarder that allows you to collect logs from different sources, unify them, and send them to monitoring destinations. With Fluentd, an operator starts by defining directories where log files are stored, applying transform or filter rules based on the type of the message, and deciding how to route the transformed message to a set of destinations by using output rules.  

Source Configuration

Source configuration tells Fluentd where to look for logs. VMware PKS sources include BOSH, VMware NSX, etcd, Kubernetes worker and master nodes, and container log directories. Fluentd is configured to tail all log sources. As Fluentd follows each log file, it standardizes the time format, appends tags to uniquely identify the logging source, and updates the position file to bookmark its place within each log. Because read_from_head is set, newly discovered files are read from the beginning rather than from the end. Here is an example of a VMware PKS container source Fluentd config:

<source>
  @type tail  # tails each log
  format json
  time_key time
  path /var/log/containers/*.log  # defining log source directory
  pos_file /var/vcap/sys/log/fluentd/containers.log.pos  # scan offset
  time_format %Y-%m-%dT%H:%M:%S.%NZ
  read_from_head true
  tag kubernetes.*  # tag Kubernetes to incoming payload
</source>
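
To make the role of the position file more concrete, here is a simplified Python sketch of tailing with a bookmark. It is not how Fluentd implements the tail plugin: the read_new_lines helper and the single-number position file are assumptions for illustration, and Fluentd's real position file tracks more than one offset.

```python
import os
import tempfile


def read_new_lines(log_path, pos_path, read_from_head=True):
    """Return lines appended since the last call and bookmark the new offset."""
    if os.path.exists(pos_path):
        with open(pos_path) as pos_file:
            offset = int(pos_file.read().strip() or 0)
    else:
        # No bookmark yet: a newly discovered file starts at byte 0 when
        # read_from_head is set, otherwise at its current end.
        offset = 0 if read_from_head else os.path.getsize(log_path)

    with open(log_path, "rb") as log_file:
        log_file.seek(offset)
        data = log_file.read()
        offset = log_file.tell()

    # Persist the offset so a restart resumes where this call stopped.
    with open(pos_path, "w") as pos_file:
        pos_file.write(str(offset))
    return data.decode("utf-8").splitlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "app.log")
        pos_path = os.path.join(tmp, "app.log.pos")
        with open(log_path, "w") as f:
            f.write("first\nsecond\n")
        print(read_new_lines(log_path, pos_path))  # the two existing lines
        with open(log_path, "a") as f:
            f.write("third\n")
        print(read_new_lines(log_path, pos_path))  # only the appended line
```

The sketch ignores log rotation and partially written lines, both of which a production tailer has to handle.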

Filter Configuration

Filtering is about transforming the data stream, appending additional information for simplified queries, or extracting information to provide global context. The record_transformer and kubernetes_metadata filters are the two Fluentd filter plugins used extensively in VMware PKS. The record_transformer filter is part of the Fluentd core and is often used with the <record> directive to insert new key-value pairs into log messages. Here is an example of a Fluentd config adding deployment information to log messages:

<filter **>
  @type record_transformer
  enable_ruby true
  <record>
    message ${record["message"].nil? ? "" : record["message"]}
    bosh_deployment service-instance_290fe873-a2b0-43fd-82eb-50f9a116c7eb
    instance_type worker
    bosh_index 0
    bosh_id afc9c51d-256a-47ee-9694-5234fa5b77ef
  </record>
</filter>
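
The effect of this filter on a single record can be sketched in Python as follows. The transform function and the STATIC_FIELDS name are illustrative assumptions; the field values are copied from the example above and are treated as plain strings here.

```python
# Static fields copied from the example config; in practice they differ
# per deployment and per node.
STATIC_FIELDS = {
    "bosh_deployment": "service-instance_290fe873-a2b0-43fd-82eb-50f9a116c7eb",
    "instance_type": "worker",
    "bosh_index": "0",
    "bosh_id": "afc9c51d-256a-47ee-9694-5234fa5b77ef",
}


def transform(record):
    """Return a copy of record with a non-null message and the static fields."""
    enriched = dict(record)
    # Mirrors the Ruby expression: record["message"].nil? ? "" : record["message"]
    message = enriched.get("message")
    enriched["message"] = "" if message is None else message
    enriched.update(STATIC_FIELDS)
    return enriched


if __name__ == "__main__":
    print(transform({"stream": "stderr"}))
```

Every record that passes through gains the same deployment fields, which is what later makes it possible to filter a whole cluster's logs by bosh_deployment.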

For container logs, the kubernetes_metadata filter is an open source plugin that uses basic container information, such as the container name, to query the Kubernetes API server and append Kubernetes metadata to raw container logs:

  • Pod ID
  • Labels
  • Annotations
  • Cluster Events

The following diagram gives you a sense of how this happens: 

The code snippet below shows the filter for Kubernetes:

<filter kubernetes.**>
  @type kubernetes_metadata
</filter>

Fluentd further uses the <record> directive to insert additional fields such as hostname and docker_id.

<filter kubernetes.**>
  @type record_transformer
  enable_ruby true
  tag ${record["docker"]["container_id"][0..12]}
  <record>
    hostname ${record["kubernetes"]["pod_name"] + "." + record["kubernetes"]["container_name"]}
    message ${record["log"]}
    docker_id ${record["docker"]["container_id"][0..12]}
  </record>
</filter>
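
As a rough Python equivalent of the <record> expressions above, the sketch below derives the same three fields from a record that already carries the kubernetes and docker metadata. The add_container_fields helper and the sample values are hypothetical.

```python
def add_container_fields(record):
    """Derive hostname, message, and docker_id as the <record> block does."""
    kubernetes = record["kubernetes"]
    container_id = record["docker"]["container_id"]
    enriched = dict(record)
    enriched["hostname"] = f'{kubernetes["pod_name"]}.{kubernetes["container_name"]}'
    enriched["message"] = record["log"]
    # Ruby's [0..12] is an inclusive range, so it keeps 13 characters;
    # the Python equivalent is [:13].
    enriched["docker_id"] = container_id[:13]
    return enriched


if __name__ == "__main__":
    # Hypothetical record, shaped like the output of the metadata filter.
    sample = {
        "log": "server started\n",
        "kubernetes": {"pod_name": "web-1a2b", "container_name": "nginx"},
        "docker": {"container_id": "0123456789abcdef0123456789abcdef"},
    }
    print(add_container_fields(sample))
```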

Output Configuration

The output plugin determines the routing treatment of formatted log outputs. Fluentd offers three types of output plugins: non-buffered, buffered, and time sliced. The VMware PKS implementation is based on a customized buffered approach with full integration with vRealize Log Insight. Here are some of the default parameters:  

  • Port: 9000
  • Rate_limit_msec: 0 # 0 is for no limit
  • Flush_interval (seconds): 20
  • ssl_verify : true  

Every 20 seconds, Fluentd checks the buffered messages against the configured rate limit. If the number of log messages exceeds the limit, Fluentd drops the excess messages and records an informational message in its own log. By default, no rate limit is set, and Fluentd uploads all messages through the vRealize Log Insight REST API.
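
The buffer-then-flush behavior described above can be illustrated with a toy Python class. This is only a sketch under stated assumptions: BufferedOutput, its send callable, and the count-based max_per_flush knob are all hypothetical, and the real plugin's Rate_limit_msec parameter may be interpreted differently.

```python
import logging

logger = logging.getLogger("output_sketch")


class BufferedOutput:
    """Toy buffered output: collect records, then upload them in one batch."""

    def __init__(self, send, max_per_flush=0):
        self.send = send                    # callable that uploads a batch
        self.max_per_flush = max_per_flush  # hypothetical knob; 0 means no limit
        self.buffer = []

    def emit(self, record):
        """Queue a record until the next flush."""
        self.buffer.append(record)

    def flush(self):
        """Upload the buffered batch; meant to run once per flush interval."""
        batch, self.buffer = self.buffer, []
        if self.max_per_flush and len(batch) > self.max_per_flush:
            dropped = len(batch) - self.max_per_flush
            batch = batch[: self.max_per_flush]
            logger.info("rate limit reached, dropped %d records", dropped)
        if batch:
            self.send(batch)
        return len(batch)


if __name__ == "__main__":
    uploaded = []
    output = BufferedOutput(uploaded.append, max_per_flush=2)
    for n in range(5):
        output.emit({"n": n})
    output.flush()
    print(uploaded)
```

A scheduler or timer would call flush at the configured interval; with the limit left at 0, every buffered record is handed to send.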

Log Aggregation and vRealize Log Insight

Log aggregation requirements are much more than message rendering. An effective log aggregator must support the processing of events from thousands of endpoints, the ability to accommodate real-time queries, and a superior analytics engine to provide intelligent metrics to solve complex technical and business problems. You have the option to implement log aggregation using vRealize Log Insight or a number of popular open source or commercial logging analytics solutions, such as Elasticsearch, Fluentd, Kibana, or Splunk. Each solution has a set of strengths and weaknesses. VMware PKS offers the flexibility to let you choose a solution that most aligns with your processes and tooling.

The Big Easy button that VMware offers is vRealize Log Insight. vRealize Log Insight works with VMware PKS straight out of the box, without any additional customization or integration. For those not familiar with vRealize Log Insight, it delivers heterogeneous and highly scalable log management with intuitive, actionable dashboards, sophisticated analytics, and broad third-party extensibility. It provides deep operational visibility and faster troubleshooting across physical, virtual, and cloud environments. Setting up the vRealize Log Insight integration is simple. My colleague William Lam wrote an excellent step-by-step blog post on how to install and integrate vRealize Log Insight with VMware PKS. A video recording of this integration is also available.

A few clicks are all it takes to enable the Fluentd and vRealize Log Insight integration with VMware PKS. VMware PKS configures the entire infrastructure to send log messages to vRealize Log Insight whenever a new cluster is provisioned. Worker and master nodes tag and synchronize the logs of all the Kubernetes cluster components, the system components on the nodes, and the standard output and standard error of all your applications. Fluentd routes all this information to vRealize Log Insight. So if you are an operator or a DevOps or SRE engineer who wants consolidated logging, you get visibility not only into the infrastructure and the Kubernetes cluster components, but also into the state of your applications and their data. The following diagram illustrates this process:

  

Viewing Logs with Log Insight

Log messages forwarded by Fluentd can be visualized and analyzed in real time with Log Insight interactive analytics and custom filters. Historical data can be searched from the same interface.

If you are interested in all log events for a Kubernetes cluster, you can set your filter to match based on bosh_deployment. Event logs for etcd, worker and master nodes, NSX-T, all application containers, etc., will be displayed. Here’s an example:

On the other hand, if your goal is to view logs from a set of container instances, you can define a match-all filter based on cluster deployment ID, Kubernetes namespace, Kubernetes pod name, and container names. Here’s another example:

You can make your interactive queries as complex as you want, including wildcards, to match any or all conditions.
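
The match-all logic behind such filters can be sketched in a few lines of Python. This only illustrates the idea and is not the Log Insight query engine: the matches and select helpers are hypothetical, and apart from bosh_deployment the flat field names (namespace_name, pod_name, container_name) are assumptions about how the enriched records might be labeled.

```python
from fnmatch import fnmatchcase


def matches(record, **conditions):
    """True when every named field matches its wildcard pattern (match-all)."""
    return all(
        fnmatchcase(str(record.get(field, "")), pattern)
        for field, pattern in conditions.items()
    )


def select(records, **conditions):
    """Keep only the records that satisfy all conditions."""
    return [record for record in records if matches(record, **conditions)]


if __name__ == "__main__":
    # Hypothetical enriched records.
    records = [
        {"bosh_deployment": "service-instance_a", "namespace_name": "shop",
         "pod_name": "cart-7d9f", "container_name": "cart"},
        {"bosh_deployment": "service-instance_a", "namespace_name": "shop",
         "pod_name": "web-1a2b", "container_name": "nginx"},
        {"bosh_deployment": "service-instance_b", "namespace_name": "shop",
         "pod_name": "cart-55aa", "container_name": "cart"},
    ]
    # Cluster-wide view: every record from one deployment.
    print(len(select(records, bosh_deployment="service-instance_a")))
    # Narrow view: one deployment, pods whose names start with "cart-".
    print(select(records, bosh_deployment="service-instance_a", pod_name="cart-*"))
```

Filtering on bosh_deployment alone corresponds to the cluster-wide view, while adding namespace, pod, and container conditions narrows the result to a set of container instances.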

Converting Filters into Dashboards and Alerts

Kubernetes operators can also convert interactive analysis filters into custom dashboards. Custom dashboards allow you to monitor specific Kubernetes pods, namespaces, or types of events based on the events that come in. Once you have created custom dashboards for the metrics you want to monitor closely, you can share them. Sharing dashboards ensures that everyone responsible for the infrastructure or application is looking at the same consistent set of metrics. RBAC is fully supported and recommended with vRealize Log Insight; please refer to the vRealize Log Insight administration guide for details. Here’s what a dashboard looks like:

Kubernetes operators can also convert interactive analysis filters into alerts. When you expect something to happen and you want to be notified, you can turn your interactive query into an alert by clicking the red alarm icon and selecting the Create Alert from Query option. Alerts can be configured to forward to your email as well as to VMware vRealize Operations.

You can track how often the event occurred in a certain time interval by using a variety of options or a customized time range. Here’s an example of creating a new alert: 

Conclusion

Centralized logging is a mandatory requirement of any enterprise Kubernetes deployment. The ability to view and filter logs in real time across thousands of endpoints is vital for triaging and resolving infrastructure and application issues quickly. VMware PKS provides a big easy button with out-of-the-box integration with Fluentd and vRealize Log Insight. With this integration, providing centralized access to all logs for both operators and developers is quick and straightforward.