Collect Metrics with Metricbeat

Learn how to use Metricbeat to monitor your servers by collecting metrics from the operating system and from the services running on them.


Introduction

Until now, we have covered several components of the Elastic Stack. We learned about Elasticsearch for storing the data that we collect and how to deploy it, Kibana as a web UI for visualizing the collected data, Filebeat for collecting data from our cluster, and, lastly, what Logstash can do. Now it is time to learn about Metricbeat and how it helps you monitor your servers by collecting metrics from the system and the services running on them.

Overview

Metricbeat is an Elastic Beat and, as such, it is based on the libbeat framework. It is a lightweight shipper that periodically collects metrics from your server’s operating system and from the services running on the server. Metricbeat then ships the metrics and statistics it collects to the output that you specify, such as Elasticsearch or Logstash.

Metricbeat consists of modules and metricsets. A module defines the basic logic for collecting data from a specific service, such as Kafka, Apache, or MySQL. It specifies the details about the service, including how to connect, how often to collect metrics, and which metrics to collect.

Each module has one or more metricsets. A metricset is the part of the module that fetches and structures the data. Rather than collecting each metric as a separate event, metricsets retrieve a list of multiple related metrics in a single request to the remote system.
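
To make the relationship between modules and metricsets more concrete, here is a minimal sketch of what a module entry can look like, using the MySQL module as an illustration. The host address and the credentials are placeholder assumptions and are not part of this project.

```yaml
metricbeat.modules:
  # Module: the service to monitor and how to reach it.
  - module: mysql
    # Metricsets: which groups of related metrics to fetch from the service.
    metricsets: ["status"]
    # How often to collect the metrics.
    period: 10s
    # How to connect (hypothetical DSN; replace user, password, and host).
    hosts: ["root:secret@tcp(127.0.0.1:3306)/"]
```

Each collection cycle of a metricset produces events that carry several related metrics at once, rather than one event per metric.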

Deploying Metricbeat in Docker

Let’s start by adding a folder that will hold Metricbeat’s files. The additions to the project are the metricbeat folder and the docker-compose-metricbeat.yml file.

elastic-stack-demo
  +- elasticsearch-single-node-cluster
       +- elasticsearch
       |    +- Dockerfile-elasticsearch-single-node
       |    +- elasticsearch-single-node.yml
       +- filebeat
       |    +- Dockerfile
       |    +- filebeat-to-elasticsearch.yml
       |    +- filebeat-to-logstash.yml
       +- kibana
       |    +- Dockerfile-kibana-single-node
       |    +- kibana-single-node.yml
       +- logstash
       |    +- config
       |    |    +- logstash.yml
       |    |    +- pipelines.yml
       |    +- pipeline
       |    |    +- beats-example.conf
       |    |    +- data-stream-example.conf
       |    |    +- output.conf
       |    +- Dockerfile
       +- metricbeat
       |    +- Dockerfile
       |    +- metricbeat.yml
       +- .env
       +- docker-compose-es-single-node.yml
       +- docker-compose-filebeat-to-elasticseach.yml
       +- docker-compose-filebeat-to-logstash.yml
       +- docker-compose-logstash.yml
       +- docker-compose-metricbeat.yml

The first file we will be creating is the Dockerfile. Create it under elastic-stack-single-node-cluster/metricbeat/, and paste the following code:

ARG ELK_VERSION
FROM docker.elastic.co/beats/metricbeat:${ELK_VERSION}

# add custom configuration
COPY --chown=root:metricbeat metricbeat.yml /usr/share/metricbeat/metricbeat.yml

There is nothing extraordinary in this file. It just specifies the base image and copies Metricbeat’s YAML configuration file into the image. The configuration file looks like this:

########################## Metricbeat Configuration ###########################
# You can find the full configuration reference here:
# https://www.elastic.co/guide/en/beats/metricbeat/index.html

#============================  Config Reloading ===============================
# Config reloading allows to dynamically load modules. Each file which is
# monitored must contain one or multiple modules as a list.
metricbeat.config.modules:
  # Glob pattern for configuration reloading
  path: ${path.config}/modules.d/*.yml
  # Period on which files under path should be checked for changes
  reload.period: 10s
  # Set to true to enable config reloading
  reload.enabled: false

# Maximum amount of time to randomly delay the start of a metricset. Use 0 to
# disable startup delay.
metricbeat.max_start_delay: 10s

#============================== Autodiscover ===================================
# Autodiscover allows you to detect changes in the system and spawn new modules
# as they happen.
metricbeat.autodiscover:
  providers:
    - type: docker
      hints.enabled: true

#==========================  Modules configuration =============================
metricbeat.modules:
  #-------------------------------- Docker Module --------------------------------
  - module: docker
    metricsets:
      - "container"
      - "cpu"
      - "diskio"
      - "healthcheck"
      - "info"
      - "memory"
      - "network"
    hosts: ["unix:///var/run/docker.sock"]
    period: 10s
    enabled: true
    # If set to true, collects metrics per core.
    cpu.cores: true

# ================================= Processors =================================
# Processors are used to reduce the number of fields in the exported event or to
# enhance the event with external metadata. This section defines a list of
# processors that are applied one by one and the first one receives the initial
# event:
#
#   event -> filter1 -> event1 -> filter2 ->event2 ...
#
# The supported processors are drop_fields, drop_event, include_fields,
# decode_json_fields, and add_cloud_metadata.
processors:
  # The following example enriches each event with docker metadata, it matches
  # container id from log path available in `source` field (by default it expects
  # it to be /var/lib/docker/containers/*/*.log).
  - add_docker_metadata: ~
  # The following example enriches each event with host metadata.
  - add_host_metadata: ~
  # The following example enriches each event with process metadata using
  # process IDs included in the event.
  - add_process_metadata: ~

# ================================== Outputs ===================================
# Configure what output to use when sending the data collected by the beat.
# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  # Boolean flag to enable or disable the output module.
  enabled: true
  # Array of hosts to connect to.
  # Scheme and port can be left out and will be set to the default (http and 9200)
  # In case you specify an additional path, the scheme is required: http://localhost:9200/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
  hosts: ['elasticsearch-demo:9200']

# ================================= Dashboards =================================
# These settings control loading the sample dashboards to the Kibana index. Loading
# the dashboards is disabled by default and can be enabled either by setting the
# options here, or by using the `-setup` CLI flag or the `setup` command.
setup.dashboards.enabled: true

# =================================== Kibana ===================================
# Starting with Beats version 6.0.0, the dashboards are loaded via the Kibana API.
# This requires a Kibana endpoint configuration.
setup.kibana:
  # Kibana Host
  # Scheme and port can be left out and will be set to the default (http and 5601)
  # In case you specify an additional path, the scheme is required: http://localhost:5601/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:5601
  host: "kibana-demo:5601"

# ================================== Logging ===================================
# There are four options for the log output: file, stderr, syslog, eventlog
# The file output is the default.
# Sets log level. The default log level is info.
# Available log levels are: error, warning, info, debug
logging.level: info

# ============================= X-Pack Monitoring ==============================
# Metricbeat can export internal metrics to a central Elasticsearch monitoring
# cluster.  This requires xpack monitoring to be enabled in Elasticsearch.  The
# reporting is disabled by default.

# Set to true to enable the monitoring reporter.
monitoring.enabled: true

# Uncomment to send the metrics to Elasticsearch. Most settings from the
# Elasticsearch output are accepted here as well.
# Note that the settings should point to your Elasticsearch *monitoring* cluster.
# Any setting that is not set is automatically inherited from the Elasticsearch
# output configuration, so if you have the Elasticsearch output configured such
# that it is pointing to your Elasticsearch monitoring cluster, you can simply
# uncomment the following line.
monitoring.elasticsearch:
  # Array of hosts to connect to.
  # Scheme and port can be left out and will be set to the default (http and 9200)
  # In case you specify an additional path, the scheme is required: http://localhost:9200/path
  # IPv6 addresses should always be defined as: https://[2001:db8::1]:9200
  #hosts: ["elasticsearch-demo:9200"]

# =============================== HTTP Endpoint ================================
# Each beat can expose internal metrics through a HTTP endpoint. For security
# reasons the endpoint is disabled by default. This feature is currently experimental.
# Stats can be accessed through http://localhost:5066/stats . For pretty JSON output
# append ?pretty to the URL.
# Defines if the HTTP endpoint is enabled.
http.enabled: true

# The HTTP endpoint will bind to this hostname, IP address, unix socket or named pipe.
# When using IP addresses, it is recommended to only use localhost.
http.host: metricbeat-demo

# Port on which the HTTP endpoint will bind. Default is 5066.
http.port: 5066

As you can see, each configuration option comes with a description, which should make the file easier to understand. In short, the configuration does the following:

Enable the autodiscover feature, based on hints. Autodiscover allows you to track applications and monitor services as they start running. The hints system looks for hints in Kubernetes Pod annotations or Docker labels that have the prefix co.elastic.metrics. As soon as a container starts, Metricbeat checks whether it contains any hints and launches the proper configuration for it.
Enable the Docker module, so that we can monitor other containers in the same network.
Enable providers, which work by watching for events on the system and translating those events into internal autodiscover events with a common format.
Send the collected data to Elasticsearch for indexing.
Automatically create predefined dashboards and load them into Kibana.
Export internal metrics to a central Elasticsearch monitoring cluster by enabling X-Pack monitoring. In our case, we will be using the same cluster.
Enable the experimental HTTP endpoint, which exposes internal metrics.
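
As an illustration of hints-based autodiscover, the sketch below shows how a container could advertise its own monitoring configuration through Docker labels. The redis-demo service and its image tag are hypothetical and are not part of this project. The label names follow the co.elastic.metrics prefix described above.

```yaml
services:
  redis-demo:                        # hypothetical service, for illustration only
    image: redis:7
    container_name: redis-demo
    labels:
      # Which Metricbeat module and metricsets to launch for this container.
      co.elastic.metrics/module: redis
      co.elastic.metrics/metricsets: info,keyspace
      # "$$" escapes "$" in docker-compose, so Metricbeat receives ${data.host}.
      co.elastic.metrics/hosts: '$${data.host}:6379'
      co.elastic.metrics/period: 10s
    networks:
      - elastic-stack-service-network

networks:
  elastic-stack-service-network:
    name: elastic-stack-service-network
```

The container is attached to the same network as Metricbeat so that the address resolved from the hint is reachable.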

Now, we create a separate docker-compose file under elastic-stack-single-node-cluster/ and name it docker-compose-metricbeat.yml.

version: '3.9'
services:
  metricbeat-demo:
    hostname: metricbeat-demo
    container_name: metricbeat-demo
    build:
      context: ./metricbeat
      dockerfile: Dockerfile
      args:
        - ELK_VERSION=${ELK_VERSION}
    ports:
      - 5366:5066
    # Need to override user so we can access the log files, and docker.sock
    user: root
    cap_add:
      - SYS_PTRACE
      - DAC_READ_SEARCH
    volumes:
      - /proc:/hostfs/proc:ro
      - /sys/fs/cgroup:/hostfs/sys/fs/cgroup:ro
      - /:/hostfs:ro
      - /var/run/docker.sock:/var/run/docker.sock
      - /var/run/dbus/system_bus_socket:/var/run/dbus/system_bus_socket:ro
      - data_metricbeat_demo:/usr/share/metricbeat/data
    # disable strict permission checks
    command: [ '-e', '-v', '--strict.perms=false', '-system.hostfs=/hostfs' ]
    #network_mode: host # Mandatory to monitor HOST filesystem, memory, processes,...
    networks:
      - elastic-stack-service-network

# Networks to be created to facilitate communication between containers
networks:
  elastic-stack-service-network:
    name: elastic-stack-service-network

# Volumes
volumes:
  data_metricbeat_demo:
    driver: local

When running Metricbeat in a container, there are some important things to be aware of if you want to monitor the host machine or other containers. For instance, Metricbeat’s system module collects much of its data through the Linux proc filesystem, which is normally located at /proc. Because containers are isolated as much as possible from the host, the data inside the container’s /proc differs from the data in the host’s /proc. To solve this, you can mount the host’s /proc filesystem inside the container and use the -system.hostfs=/hostfs CLI flag to tell Metricbeat to look inside the /hostfs directory when looking for /proc.

Out of the box, cgroup reporting is enabled for the system process metricset. That is why you need to mount the host’s cgroup mountpoints within the container, inside the directory specified by the -system.hostfs CLI flag. In this case, that is the /hostfs/sys/fs/cgroup directory.

If you want to be able to monitor filesystems from the host by using the system filesystem metricset, then those filesystems need to be mounted inside of the container. They can be mounted at any location.
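
For reference, a system module entry that collects filesystem metrics might look like the following sketch. The period and the regular expression that filters out virtual and system mount points are illustrative assumptions that you would adapt to your own host.

```yaml
metricbeat.modules:
  - module: system
    # Filesystem usage changes slowly, so a longer period is usually enough.
    period: 1m
    metricsets:
      - "filesystem"
      - "fsstat"
    processors:
      # Drop events for mount points that are usually not interesting.
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|snap)($|/)'
```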

When using -system.hostfs=/hostfs, the system network metricset reads its data from /hostfs/proc/net/dev; otherwise, it reads from /proc/net/dev. Due to Linux namespacing, it is not enough to bind mount the host’s /proc to /hostfs/proc. The only way to make this file contain the host’s network devices is to use the --net=host flag. In docker-compose, this flag is represented by network_mode: host. However, a service can either use the host network mode or be attached to one of Docker’s isolated virtual networks, not both. For now, we will leave Metricbeat in a virtual network so that it can reach the other containers.
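
If you prefer host network metrics over access to the virtual network, the service definition could be adapted roughly as in the sketch below. This is an untested variant under the assumption that Elasticsearch and Kibana publish their ports on the host, so the hosts in metricbeat.yml (and http.host) would have to be changed to addresses reachable from the host, such as localhost.

```yaml
services:
  metricbeat-demo:
    build:
      context: ./metricbeat
      dockerfile: Dockerfile
      args:
        - ELK_VERSION=${ELK_VERSION}
    user: root
    volumes:
      - /proc:/hostfs/proc:ro
      - /sys/fs/cgroup:/hostfs/sys/fs/cgroup:ro
      - /:/hostfs:ro
      - /var/run/docker.sock:/var/run/docker.sock
    command: [ '-e', '--strict.perms=false', '-system.hostfs=/hostfs' ]
    # Share the host's network namespace. This takes the place of the
    # "networks" entry, and "ports" mappings are no longer needed.
    network_mode: host
```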

Since we are using a Linux-based operating system, Metricbeat needs to be granted more privileges if we want to use the system socket metricset. By default, Docker disables the capabilities needed to read files in /proc that are an interface to internal objects owned by other users. The capabilities needed to read all these files are sys_ptrace and dac_read_search, which is why both are listed under cap_add in the docker-compose file.

Great. We are now ready to start Metricbeat by executing the following command:

$ docker-compose -f docker-compose-metricbeat.yml up -d --build

In Kibana, go to Analytics > Dashboards and look for the dashboard called [Metricbeat System] Host overview ECS. Click it to see an overview of the host’s metrics.

Finally, since we also enabled the Docker module, there is also a dashboard ([Metricbeat Docker] Overview ECS) that shows an overview of the metrics of all our Docker containers.

Clean Up

To do a complete clean up, execute this command:

$ docker-compose -f docker-compose-es-single-node.yml -f docker-compose-filebeat-to-elasticseach.yml -f docker-compose-filebeat-to-logstash.yml -f docker-compose-logstash.yml -f docker-compose-metricbeat.yml down -v

Summary

In this post, we learned about Metricbeat and how to configure and deploy it in a dockerized environment.

Please feel free to contact us. We will gladly respond to any question you might have. In the meantime, you can download the source code from our official GitHub repository.
