Collectd is a Unix daemon that periodically collects system usage statistics, which can help identify CPU or memory bottlenecks. The collected data can then be turned into graphs using RRDtool or a Grafana dashboard (Grafana provides real-time graphs and complex search queries).
The daemon itself is modular and works through external plugins, each performing a distinct function. This post explores a plugin that collects weather information for a given city. The first section explains how the plugin configuration works and how to plot a graph of the output data using RRDtool. Finally, we delve into the plugin internals and see how it is written.
Note: For an intro on how to set up Collectd, refer to the Collectd wiki.
Plugin configuration
All enabled Collectd plugins must be listed in the collectd.conf file. The default file paths used by Collectd can be viewed with:
$ collectd -h
Builtin defaults:
Config file /etc/collectd.conf
PID file /var/run/collectd.pid
Plugin directory /usr/lib/collectd
Data directory /var/lib/collectd
By default, the plugins live under the /usr/lib/collectd directory.
Note: Python plugins might not work when placed under the default plugin directory, as this causes anomalies when importing external modules.
Invoking the plugin
Let’s say we have a script named log-weather.py which we want Collectd to call periodically. We first need to copy it to our custom plugins directory:
$ cp /path/to/log-weather.py /usr/lib/collectd/python/
Next, we will add this plugin to our /etc/collectd.conf config file. We also need to load the Python plugin (this exposes the collectd Python module which we then import in our script):
$ cat /etc/collectd.conf
LoadPlugin python
<Plugin python>
    ModulePath "/usr/lib/collectd/python/"
    Import "log-weather"
    <Module weather_logger>
        CITY "<city>"
        API_KEY "<api_key>"
    </Module>
</Plugin>
The import structure remains similar for all Python plugins, except for the module path, the module name, and its configuration parameters.
- Import directive: This takes the script name as its value. Notice how this differs from the module name in the following line. We’ll see this in detail in a later section.
- Module tag: This module requires two configuration parameters, CITY and API_KEY. It calls an external API with the given key to collect the weather information.
Enabling RRDtool
As we intend to plot the collected data at the end, we will enable the RRDtool plugin. This plugin enables writing the data dispatched by our script to an RRD (Round Robin Database) file.
LoadPlugin rrdtool
Setting the call interval
In our /etc/collectd.conf file we can tweak how often our plugin is called with the Interval directive. This expects a value in seconds:
Interval 1800
Our script will now collect the weather information every 1800 seconds, or every half an hour.
Note: It is also possible to set the interval for each plugin separately. This requires setting the interval in the register_read function, which we will see later.
Starting the Collectd service
Once the configuration is done, we must (re)start the Collectd service using systemd.
Note: this step might be different based on your Linux distribution. Consult the Collectd Wiki for more details.
Enabling the Collectd systemd service will start Collectd automatically at boot.
$ systemctl start collectd
$ systemctl enable collectd
Plotting the data
As soon as the Collectd service is running, data will be written to the RRD file using the RRDtool plugin which we enabled earlier. This data is stored inside the data directory, which in our case is /var/lib/collectd/home/weather_alert/temperature.rrd.
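The path above is made up of the data directory, the host name, the plugin name, and the value type. Here is a minimal sketch of how it is composed, assuming no plugin or type instances are set. The helper name rrd_path is hypothetical and not part of Collectd:

```python
from pathlib import PurePosixPath


def rrd_path(data_dir, host, plugin, type_name):
    """Compose <data_dir>/<host>/<plugin>/<type>.rrd (hypothetical helper)."""
    return str(PurePosixPath(data_dir) / host / plugin / f"{type_name}.rrd")


# Values taken from this post: data directory, host, plugin name, and type.
path = rrd_path("/var/lib/collectd", "home", "weather_alert", "temperature")
print(path)
```

If the file does not show up where expected, comparing each path component against the configuration is usually a good first check.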
To plot this data we will use the rrdtool Linux utility. (This tool is different from the one used by Collectd, which uses the librrd API for writing the data.) We will first view the information written to the temperature.rrd file with:
$ rrdtool info temperature.rrd
rrd_version = "0003"
step = 2
last_update = 1589090986
header_size = 3496
ds[value].index = 0
ds[value].type = "GAUGE"
...
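The info output is a list of key = value lines, so it is easy to inspect from a script. The sketch below parses the lines shown above and converts last_update from a Unix timestamp to a readable UTC time. The function name parse_rrd_info is hypothetical, and the sample text is copied from this post rather than read from a live file:

```python
from datetime import datetime, timezone


def parse_rrd_info(text):
    """Parse 'key = value' lines from `rrdtool info` into a dict."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if not sep:
            continue  # skip lines such as "..." that carry no key/value pair
        key, value = key.strip(), value.strip()
        if value.startswith('"') and value.endswith('"'):
            info[key] = value[1:-1]  # quoted string
        else:
            try:
                info[key] = int(value)
            except ValueError:
                info[key] = value  # leave anything else untouched
    return info


SAMPLE = """\
rrd_version = "0003"
step = 2
last_update = 1589090986
header_size = 3496
ds[value].index = 0
ds[value].type = "GAUGE"
"""

info = parse_rrd_info(SAMPLE)
last_update = datetime.fromtimestamp(info["last_update"], tz=timezone.utc)
print(info["ds[value].type"], last_update.isoformat())
```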
And then proceed with creating the graph:
$ rrdtool graph weather.png \ # (1)
--title "Weather (Athens)" \ # (2)
--vertical-label "Temperature (°C)" \ # (3)
--start now-8000s --end now \ # (4)
DEF:a=temperature.rrd:value:AVERAGE # (5)
It produces this result:
Let’s go over the above command in detail:
- We specify the name for the output graph.
- This title appears at the top of the graph.
- The vertical label appears along the y-axis, on the left side of the graph.
- Each graph has a start and end time, which can be given as either relative or absolute timestamps. now-8000s means that we view the weather statistics from the past 8000 seconds (a little over 2 hours) up to now.
- The DEF directive defines the graph variable a, followed by the pattern <source file>:<key>:<metric>.
  - <source file> contains the path to the RRD file.
  - <key> is what we saw in the rrdtool info output as ds[value]. In this case we will plot the value key.
  - <metric> is usually either MIN, MAX, or AVERAGE. It is also visible in the info output.
Note: this tool can produce some very complex graphs, but for our use case we will just stick with a simple line graph.
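For scripted use, the same graph command can be assembled as an argument list. This is only a sketch: the function name build_graph_args and the color are hypothetical. The LINE1 element is an assumption added here and is not part of the command shown earlier; rrdtool needs a drawing element such as LINE1 to actually plot the variable defined by DEF.

```python
import shlex


def build_graph_args(output, rrd_file, title, vertical_label,
                     start="now-8000s", end="now", color="#FF0000"):
    """Return the argument list for an `rrdtool graph` invocation."""
    return [
        "rrdtool", "graph", output,
        "--title", title,
        "--vertical-label", vertical_label,
        "--start", start, "--end", end,
        # DEF:<vname>=<source file>:<key>:<metric>
        f"DEF:a={rrd_file}:value:AVERAGE",
        # Assumption: LINE1 draws variable `a`; not in the original command.
        f"LINE1:a{color}:Temperature",
    ]


args = build_graph_args("weather.png", "temperature.rrd",
                        "Weather (Athens)", "Temperature (°C)")
print(shlex.join(args))
```

On a machine with rrdtool installed, the list can be handed to subprocess.run(args, check=True). Passing a list avoids shell quoting issues with the title and label.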
Having configured the script we will now go over the plugin structure and see how the weather collection script is written.
Weather Logger plugin
The Python Collectd plugin begins by importing the collectd module. It then defines a series of callback functions. For our use case we will only define a config and a read callback function.
import collectd
import requests

PLUGIN = 'weather_alert'
CITY, API_KEY = '', ''  # placeholders

def config_callback(config):
    global CITY, API_KEY
    for node in config.children:
        key = node.key.lower()
        val = node.values[0]
        if key == 'city':
            CITY = val
        if key == 'api_key':
            API_KEY = val
    if not CITY:
        collectd.error('{} plugin: [Error] The city name must be provided'.format(PLUGIN))
    if not API_KEY:
        collectd.error('{} plugin: [Error] The API key must be provided'.format(PLUGIN))
The configuration function above sets the CITY and API_KEY variables that we passed over from the config file. Later on we will see how to view Collectd logs.
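Since the collectd module is only available inside the daemon, the parsing logic can be tried out with stand-in objects. The sketch below is not the plugin's own code: FakeNode and parse_settings are hypothetical names, and the node shape (key, values, children) is assumed from how config_callback uses it.

```python
from collections import namedtuple

# Stand-in for the nodes Collectd passes to the config callback
# (hypothetical; the real objects come from the collectd module).
FakeNode = namedtuple("FakeNode", ["key", "values", "children"])


def parse_settings(config):
    """Collect CITY and API_KEY from the children of a config node."""
    settings = {"city": "", "api_key": ""}
    for node in config.children:
        key = node.key.lower()  # keys are matched case-insensitively
        if key in settings:
            settings[key] = node.values[0]
    return settings


# Mirrors the <Module weather_logger> block from the config file.
module_block = FakeNode("Module", ("weather_logger",), (
    FakeNode("CITY", ("Athens",), ()),
    FakeNode("API_KEY", ("secret",), ()),
))
settings = parse_settings(module_block)
print(settings)
```

Returning a dictionary instead of setting globals makes the function easy to check in isolation; the real callback could copy the result into CITY and API_KEY.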
We now define the read_callback function:
def read_callback():
    weather_metric = collectd.Values(plugin=PLUGIN, type='temperature')
    try:
        # The request sits inside the try block so that network and
        # parsing errors are handled the same way as a missing key.
        weather = requests.get(f'http://api.openweathermap.org/data/2.5/forecast?q={CITY}&units=metric&appid={API_KEY}').json()
        weather_metric.dispatch(values=[weather['list'][0]['main']['temp']])
    except Exception as e:
        # Dispatch a sentinel value and log the reason for the failure.
        weather_metric.dispatch(values=[-1])
        collectd.error('{} plugin: [Error] Unable to run {}'.format(PLUGIN, e))
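We call the OpenWeatherMap API to collect the data. collectd.Values() creates a value object tagged with the plugin name and the temperature type, which is where the weather_alert/temperature.rrd part of the path we saw earlier comes from. The weather_metric.dispatch() function hands the value to Collectd, which passes it on to the RRDtool plugin to be written to the RRD file.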
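The lookup of the temperature can be separated from the network call, which makes it testable without an API key. This is a minimal sketch: extract_temperature is a hypothetical helper, and both sample payloads are illustrative, with the response shape assumed only from the indexing used in read_callback.

```python
def extract_temperature(payload, default=-1):
    """Return the first forecast temperature, or `default` if it is missing."""
    try:
        return payload["list"][0]["main"]["temp"]
    except (KeyError, IndexError, TypeError):
        return default


# Assumed response shape, inferred from the indexing used in read_callback.
sample = {"list": [{"main": {"temp": 21.5}}]}
# Hypothetical error payload without a "list" key.
error_sample = {"message": "something went wrong"}

print(extract_temperature(sample))
print(extract_temperature(error_sample))
```

The default of -1 matches the sentinel value that read_callback dispatches on failure.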
The final step is to register these two functions with the collectd module:
collectd.register_config(config_callback)
collectd.register_read(read_callback)
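The per-plugin interval mentioned earlier is passed to register_read through its interval argument. Below is a minimal sketch: READ_INTERVAL and the register helper are hypothetical names, and the import fallback is an assumption added so the file can also be loaded outside the daemon, for example by a test runner.

```python
try:
    import collectd  # only importable inside the Collectd daemon
except ImportError:
    collectd = None  # e.g. when the file is loaded by a test runner

READ_INTERVAL = 1800  # seconds; hypothetical per-plugin interval


def read_callback():
    """Placeholder for the read callback defined earlier."""


def register(module):
    """Register the read callback with its own interval."""
    module.register_read(read_callback, interval=READ_INTERVAL)


if collectd is not None:
    register(collectd)
```

With this in place, the plugin is read every 1800 seconds regardless of the global Interval setting.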
Diagnosing plugin issues
Let’s try removing the CITY from our /etc/collectd.conf configuration file and see if it raises any errors. We restart the Collectd service and check its status:
$ systemctl restart collectd.service
$ systemctl status collectd.service
● collectd.service - System statistics collection daemon
Loaded: loaded (/usr/lib/systemd/system/collectd.service; enabled; vendor preset: disabled)
Active: active (running) ...
...
May 10 13:42:07 home collectd[29390]: weather_alert plugin: [Error] The city name must be provided
It can also be helpful to view the full logs for more details. By default, these are present in the /var/log/collectd.log file.
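When the log grows large, the plugin's own errors can be picked out with a few lines of Python. This is a rough sketch: plugin_errors is a hypothetical helper that relies on the message prefix used by our plugin, and the second sample line is made up for illustration.

```python
def plugin_errors(lines, plugin="weather_alert"):
    """Return the log lines reported as errors by the given plugin."""
    marker = f"{plugin} plugin: [Error]"
    return [line.rstrip("\n") for line in lines if marker in line]


sample_log = [
    "May 10 13:42:07 home collectd[29390]: weather_alert plugin: [Error] "
    "The city name must be provided\n",
    # Made-up line standing in for unrelated log output.
    "May 10 13:42:08 home collectd[29390]: some unrelated message\n",
]
for line in plugin_errors(sample_log):
    print(line)
```

The function accepts any iterable of lines, so an open file object for the log can be passed in directly.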