Data visualization with Telegraf, InfluxDB and Grafana on Synology

How to configure Telegraf, InfluxDB and Grafana on a Synology NAS using Docker to visualize home automation statistics.


In this article, we will expand on an earlier TIG stack setup done for Home Assistant and integrate other data sources to create amazing dashboards.

( Photo by Sergey Pesterev )

At the beginning of the year, I spent some time setting up InfluxDB and Grafana for my Home Assistant installation. Now several months have passed and I think that it is a good time to experiment a bit further with the collection of data that I have been slowly gathering.

I've been using Home Assistant since last fall and I am very impressed with it! The ease of setting up complex automation rules and seamlessly integrating many devices has been the key reason for me to use and enjoy it. On the data visualization side of things, Lovelace does its job but, of course, cannot match the power of dedicated data visualization software such as Grafana.

Today I am going to share some of the updates I have made to my initial setup. If you need to get going from scratch, be sure to read the first chapter.

Here's what we are going to do:

  • Upgrade InfluxDB collector to accept data over TLS;
  • Spin up Telegraf on Docker;
  • Collect Synology and Docker metrics with Telegraf and push them into InfluxDB;
  • Optional - Collect pfSense metrics with Telegraf and push them into InfluxDB;
  • Connect Grafana with the new data sources;
  • Create some dashboards;

Ready? Let's get started!

Configure InfluxDB over TLS

In my first setup, I didn't bother configuring InfluxDB to accept incoming API calls over HTTPS. The reason for this was mainly that all connections happen inside my home network. However, it is always possible that a device within my LAN gets compromised and starts snooping on the data. Regardless of how likely this is to happen, it never hurts to adopt a safer configuration by default.

I already have a wildcard Let's Encrypt certificate for the hostname I want to use, so it is a simple matter of reconfiguring the InfluxDB Docker container to use it, then updating Home Assistant to point to the new TLS endpoint. If you do not know how to obtain and renew the certificates, you can look at one of my earlier posts.

The Docker setup is pretty simple - I am using InfluxDB 1.8.1. First, I will print the existing /etc/influxdb/influxdb.conf file from the container, as I will want to replace it with my modified version later. In a Synology SSH session, run the following (replacing influxdb with the name of your container):

sudo docker exec -it influxdb cat /etc/influxdb/influxdb.conf

In my case, this is just the default as I didn't modify anything:

[meta]
  dir = "/var/lib/influxdb/meta"

[data]
  dir = "/var/lib/influxdb/data"
  engine = "tsm1"
  wal-dir = "/var/lib/influxdb/wal"
/etc/influxdb/influxdb.conf

I will now add the TLS configuration as described on the official documentation page :

[meta]
  dir = "/var/lib/influxdb/meta"

[data]
  dir = "/var/lib/influxdb/data"
  engine = "tsm1"
  wal-dir = "/var/lib/influxdb/wal"

[http]
  https-enabled = true
  https-certificate = "/etc/ssl/fullchain.pem"
  https-private-key = "/etc/ssl/privkey.pem"
/etc/influxdb/influxdb.conf

Here, fullchain.pem and privkey.pem are the names of my certificate and private key files.

We can now stop the InfluxDB container and add the volume mappings required to expose both our configuration file and the certificates. Below are the mappings I used on my system.

Docker volume mappings

Restart the container, then check that the TLS configuration is working correctly. You can use the influx CLI:

influx -ssl -host <your InfluxDB host>

Or you can also use openssl:

openssl s_client -connect <your InfluxDB host>:8086
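If you would rather script this check, here is a minimal Python sketch. It assumes InfluxDB 1.x, whose /ping endpoint answers with HTTP 204 when the server is up. The helper names ping_url and influx_tls_ok are made up for illustration and are not part of any InfluxDB tooling.

```python
import urllib.error
import urllib.request


def ping_url(host: str, port: int = 8086) -> str:
    """Build the HTTPS URL of the InfluxDB 1.x /ping health endpoint."""
    return f"https://{host}:{port}/ping"


def influx_tls_ok(host: str, port: int = 8086, timeout: float = 5.0) -> bool:
    """Return True if /ping answers 204 over a verified TLS connection."""
    try:
        with urllib.request.urlopen(ping_url(host, port), timeout=timeout) as resp:
            return resp.status == 204
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts and certificate errors.
        return False
```

Calling influx_tls_ok("<your InfluxDB host>") should return True once the container is back up. Because urllib verifies the certificate chain and the host name by default, a True result also suggests that the certificate matches the host name you are connecting to.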

If all works fine, remember to update your existing configurations to use the new endpoint:

  • Home Assistant: update configuration.yaml to point to the TLS endpoint, then restart. Below, I have defined the actual hostname in my secrets.yaml file:
influxdb:
  host: !secret influxdb_host
  ssl: true
  verify_ssl: true
configuration.yaml
  • Grafana: update your existing Home Assistant data source to read from the TLS endpoint in the configuration dashboard.

We have now improved our original configuration so that it communicates over TLS. Next, we will add Synology metrics to our collection of measurements.

Capture Synology system data with Telegraf

Telegraf is a lightweight agent which can capture metrics and events for a lot of systems and has a very small footprint. There are a large number of plugins available. In this tutorial, we will be running an instance on our Synology station to push system metrics into InfluxDB and expose them to Grafana.

The first thing to do is to pull the telegraf image from DockerHub. On Synology, we can use the Docker frontend to search the registry and download it. I have used telegraf:alpine.

However, the Synology frontend is limited and does not allow me to set up the volume mounts that the agent needs to gather the required metrics. To do this, I can start the container directly from a Synology SSH session:

sudo docker run -d \
--name=telegraf \
--net=host \
--pid=host \
--restart always \
--volume /volume1/docker/telegraf/config/telegraf.conf:/etc/telegraf/telegraf.conf:ro \
--volume /var/run/docker.sock:/var/run/docker.sock:ro \
--volume /sys:/host/sys:ro \
--volume /proc:/host/proc:ro \
--volume /etc:/host/etc:ro \
-e "HOST_PROC=/host/proc" \
-e "HOST_SYS=/host/sys" \
-e "HOST_ETC=/host/etc" \
telegraf:alpine

Specifically, here are the interesting parts:

  • --volume /volume1/docker/telegraf/config/telegraf.conf:/etc/telegraf/telegraf.conf:ro: this is where I have placed my telegraf.conf file - more on that shortly.
  • We then have some volume mounts that tell the Telegraf agent what to monitor. In our case, we want to monitor our host (the Synology) and also other Docker containers. This is why we are mapping /var/run/docker.sock:/var/run/docker.sock:ro - as seen in the examples on the DockerHub page.

The configuration file is where all the magic happens. Here is mine - please note that some values are placeholders for obvious reasons:

[global_tags]

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  hostname = ""
  omit_hostname = false
  
[[outputs.influxdb]]
  urls = ["https://***:8086"]
  database = "yourdb"
  skip_database_creation = true
  retention_policy = ""
  timeout = "5s"
  username = "user-write"
  password = "password2"
  
[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = false

[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

[[inputs.diskio]]

[[inputs.kernel]]

[[inputs.mem]]

[[inputs.processes]]

[[inputs.swap]]

[[inputs.system]]

[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
  container_names = [ "container-1", "container-2"]
  
[[inputs.net_response]]
   protocol = "tcp"
   address = "<some_host_to_monitor>:443"

This configuration collects several system stats, metrics about specific Docker containers and response times for an HTTPS endpoint. It will then push all the measurements into our InfluxDB instance.

Before we can start Telegraf, we need to configure the InfluxDB database and users. You can do this by connecting to your InfluxDB Docker container with sudo docker exec -it influxdb influx -ssl -host <your host> (or by installing the influx CLI locally) and running:

CREATE USER "user-read" WITH PASSWORD 'password1'
CREATE USER "user-write" WITH PASSWORD 'password2'
CREATE DATABASE "yourdb"
GRANT READ ON "yourdb" TO "user-read"
GRANT WRITE ON "yourdb" TO "user-write"

-- or create one user and GRANT ALL to them

I recommend setting up the read user in Grafana and the write user in Telegraf. Or deploy a read/write user in both if you don't care too much.

Once you have completed your configuration, you are ready to issue the Docker command as seen earlier and start your instance.
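Once Telegraf has been running for a minute or so, you may want to confirm that data is actually arriving. The following is a rough Python sketch that asks the InfluxDB 1.x /query endpoint for SHOW MEASUREMENTS using the read user. The function names are hypothetical, and it assumes that authentication is enabled and that HTTP basic auth is accepted by your instance.

```python
import base64
import json
import urllib.parse
import urllib.request


def query_url(host: str, db: str, q: str, port: int = 8086) -> str:
    """Build an InfluxDB 1.x /query URL for the given database and statement."""
    params = urllib.parse.urlencode({"db": db, "q": q})
    return f"https://{host}:{port}/query?{params}"


def measurement_names(response: dict) -> list:
    """Extract measurement names from a SHOW MEASUREMENTS JSON response."""
    names = []
    for result in response.get("results", []):
        for series in result.get("series", []):
            names.extend(row[0] for row in series.get("values", []))
    return names


def show_measurements(host: str, db: str, user: str, password: str) -> list:
    """List the measurements in a database, authenticating with basic auth."""
    request = urllib.request.Request(query_url(host, db, "SHOW MEASUREMENTS"))
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request, timeout=5) as resp:
        return measurement_names(json.load(resp))
```

With the configuration above, show_measurements("<your host>", "yourdb", "user-read", "password1") would be expected to list names such as cpu, disk, mem and system. An empty list usually means that Telegraf is not writing yet, so the container logs are the next place to look.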

Optional - Install Telegraf in pfSense

If you are using pfSense, I recommend installing the Telegraf agent there as well. This will allow you to monitor the status of the firewall in Grafana, as well as gather other stats and metrics on the firewall activity.

In the pfSense package manager, you should find Telegraf as one of the available packages. Here is my entry at the time of posting this blog:

Telegraf package in pfSense

Install it, then head to Services > Telegraf. The configuration is self-explanatory. You will need to set up the usual InfluxDB output. I recommend creating a new database and user pair so that you will be able to run your queries on separate databases. This is the same approach we adopted earlier.

Finally, in the Additional configuration for Telegraf area, include the parameters below (credits to Tokugero and this Reddit user for the pfBlockerNG one). If you do not use pfBlockerNG, well, you should! Jokes aside, both [[inputs.logparser]] entries below read pfBlockerNG logs, so you can leave them out if you do not use it.

[[inputs.logparser]]
	files = ["/var/log/pfblockerng/dnsbl.log"]
	from_beginning=true
	[inputs.logparser.grok]
		measurement = "dnsbl_log"
		patterns = ["^%{WORD:BlockType}-%{WORD:BlockSubType},%{SYSLOGTIMESTAMP:timestamp:ts-syslog},%{IPORHOST:destination:tag},%{IPORHOST:source:tag},%{GREEDYDATA:call},%{WORD:BlockMethod},%{WORD:BlockList},%{IPORHOST:tld:tag},%{WORD:DefinedList:tag},%{GREEDYDATA:hitormiss}"]
		timezone = "Local"

[[inputs.logparser]]
	files = ["/var/log/pfblockerng/ip_block.log"]
	from_beginning=true
	[inputs.logparser.grok]
		measurement = "ip_block_log"
		patterns = ["^%{SYSLOGTIMESTAMP:timestamp:ts-syslog},%{NUMBER:TrackerID},%{GREEDYDATA:Interface},%{WORD:InterfaceName},%{WORD:action},%{NUMBER:IPVersion},%{NUMBER:ProtocolID},%{GREEDYDATA:Protocol},%{IPORHOST:SrcIP:tag},%{IPORHOST:DstIP:tag},%{NUMBER:SrcPort},%{NUMBER:DstPort},%{WORD:Dir},%{WORD:GeoIP:tag},%{GREEDYDATA:AliasName},%{GREEDYDATA:IPEvaluated},%{GREEDYDATA:FeedName:tag},%{HOSTNAME:ResolvedHostname},%{HOSTNAME:ClientHostname},%{GREEDYDATA:ASN},%{GREEDYDATA:DuplicateEventStatus}"]
		timezone = "Local"
Additional configuration for Telegraf (pfSense)

Save and restart the Telegraf service on pfSense.

Connect Grafana with the new metrics

We now have three InfluxDB databases available:

  • Home Assistant, which we created in the first part of the tutorial, already connected to Grafana
  • Synology
  • pfSense

In Grafana, we just need to create new data sources. The configuration is the same as before, using our HTTPS endpoint and the correct InfluxDB users that we have set up. If you do not remember how to do it, have a look at the "Set up Grafana" section of my earlier post.

Grafana Data Sources configuration dashboard
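If you prefer to script the data source creation instead of clicking through the UI, Grafana also exposes an HTTP API for it. Below is a small Python sketch of that approach. The function names are hypothetical, the API token is something you would need to create in Grafana first, and the exact field layout may differ between Grafana versions, so treat it as a starting point rather than a reference.

```python
import json
import urllib.request


def influx_datasource(name, influx_url, database, user, password):
    """Build the JSON body for an InfluxDB (InfluxQL) data source."""
    return {
        "name": name,
        "type": "influxdb",
        "access": "proxy",  # Grafana's backend performs the queries
        "url": influx_url,
        "database": database,
        "user": user,
        "secureJsonData": {"password": password},
    }


def create_datasource(grafana_url, api_token, payload):
    """POST the payload to Grafana's /api/datasources endpoint."""
    request = urllib.request.Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)
```

For the Synology database, the payload could be built with influx_datasource("Synology", "https://<your InfluxDB host>:8086", "yourdb", "user-read", "password1") and then passed to create_datasource together with your Grafana URL and token.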

Create some dashboards

Well, we now have a mass of metrics flowing into our InfluxDB data storage. I will need to keep an eye on the hard disk usage on my Synology, or tweak the retention policies, but at least I can watch that on a Grafana panel now! For now, let's make use of all the information we are collecting to set up some nice dashboards.

Ready-made dashboard

On the Grafana website, there are many ready-made dashboards. Some of these are excellent to get started, and also to see how dashboards and queries are structured. You can import them by noting the dashboard ID. I have used the following ones:

  • pfSense System Dashboard (ID 12023): a nice overview of the pfSense System status, including classic load stats and a map visualization showing which countries are responsible for triggering the most block rules on your firewall. If you use DNSBL, it will also show the worst offending domains. An updated version is also available on Github.
pfBlocker - IP Block Geomap
  • Synology Dashboard (ID 9961): a dashboard to monitor your Synology NAS. It is based on SNMP, so it will require additional setup and configuration for Telegraf to gather and populate that data. Setting up SNMP is beyond the scope of this tutorial, but you are welcome to experiment further with it.

Example Panels

Below you can see some examples of panels I created using the data collected by Telegraf (and Home Assistant).

System Load

Let's create a new dashboard. Using the data collected by the Telegraf agents running on the NAS and on the pfSense gateway, I created a graph to report the load metrics for each box:

Load metrics for the NAS and the Firewall

In the top one, I have included some threshold areas to ensure that I consistently stay within the expected parameters - except of course that huge spike at midday when I rebooted the NAS 😜
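For reference, here is a sketch of the kind of InfluxQL a load panel like this could use. It assumes the default Telegraf system input, which writes the load1, load5 and load15 fields to the system measurement along with a host tag. $timeFilter and $__interval are Grafana macros that are filled in at query time. The Python wrapper and its function name are only there to generate the three query strings and are not part of my actual dashboard.

```python
def load_queries(fields=("load1", "load5", "load15")):
    """Return one Grafana-style InfluxQL query per load average field."""
    template = (
        'SELECT mean("{field}") FROM "system" '
        'WHERE $timeFilter GROUP BY time($__interval), "host" fill(null)'
    )
    return [template.format(field=field) for field in fields]


if __name__ == "__main__":
    # Print the queries so they can be pasted into the Grafana query editor.
    for query in load_queries():
        print(query)
```

Grouping by the host tag gives one series per machine, which is handy when the NAS and the firewall write to the same database. With separate databases, as in this setup, each panel simply points at its own data source.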

Weather sensors and graphs

UV Index panel

Using the Met Office integration with Home Assistant, I can pull weather data about my location. In this case, I'm plotting the UV index values over time and including a couple of areas showing me when it is unsafe to be in the Sun for too long without adequate protection.

Or, you may want to feel like a ship captain and see what the wind speed is in your neighbourhood:

Set sail master! 🏴‍☠️ Wind speed panel

If you have some sensors at home (check out this earlier article I wrote to figure out how to set that up), you might instead want to see at a glance what the temperatures are around your house. For example, it looks like it was quite toasty under the roof yesterday!

Temperatures across the house

Finally, you may be interested in monitoring the temperature and humidity of your bedroom, making sure they stay within the optimal values:

Let's hope it gets less balmy later today!

pfSense IP Block Feed

pfSense IP Block feed

Using the data coming from the pfSense Telegraf agent, I can check which feed has been triggering the most actions over time. This helps me identify and review the performance of each security feed.

Conclusion

In this tutorial, we have reviewed how to set up Telegraf and pull different data sources into InfluxDB. We can then read this data from Grafana and set up dashboards and panels for our monitoring.

I am still working on my dashboard, and I plan to share it once I have refined it a little. Let me know in the comments below what you have built!

If you liked this article, follow me on Twitter for more updates!