In this article, we will expand on an earlier TIG stack setup done for Home Assistant and integrate other data sources to create amazing dashboards.
( Photo by Sergey Pesterev )
At the beginning of the year, I spent some time setting up InfluxDB and Grafana for my Home Assistant installation. Now several months have passed and I think that it is a good time to experiment a bit further with the collection of data that I have been slowly gathering.
I've been using Home Assistant since last fall and I am very impressed with it! The ease of setting up complex automation rules and seamlessly integrating many devices has been the key reason for me to use and enjoy it. On the data visualization side of things, Lovelace does its job but, of course, cannot match the power of dedicated data visualization software such as Grafana.
Today I am going to share some of the updates I have made over my initial setup. If you need to get going from scratch, be sure to read the first chapter.
Here's what we are going to do:
- Upgrade InfluxDB collector to accept data over TLS;
- Spin up Telegraf on Docker;
- Collect Synology and Docker metrics with Telegraf and store them in InfluxDB;
- Optional: collect pfSense metrics with Telegraf and store them in InfluxDB;
- Connect Grafana with the new data sources;
- Create some dashboards.
Ready? Let's get started!
Configure InfluxDB over TLS
In my first setup, I didn't bother configuring InfluxDB to accept incoming API calls over HTTPS. The reason was mainly that all connections happen inside my home network. However, it is always possible that a device within my LAN gets compromised and starts snooping on the data. Regardless of how likely this is to happen, it never hurts to adopt a safer configuration by default.
I already have a wildcard Let's Encrypt certificate for the hostname I want to use, so it is a simple matter of reconfiguring the InfluxDB Docker container to use it, then updating Home Assistant to point to the new TLS endpoint. If you do not know how to obtain and renew the certificates, you can look at one of my earlier posts.
The Docker setup is pretty simple: I am using InfluxDB 1.8.1. First, I will output the existing /etc/influxdb/influxdb.conf file in the container, as I will want to update it with my modified version later. In a Synology SSH session, run the following (replacing influxdb with the name of your container):
sudo docker exec -it influxdb cat /etc/influxdb/influxdb.conf
In my case, this is just the default as I didn't modify anything:
I will now add the TLS configuration as described on the official documentation page:
privkey.pem are the names of my certificate and private key file.
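For reference, the TLS settings in InfluxDB 1.x live in the [http] section of influxdb.conf. The sketch below only prints the three settings to merge into your existing [http] section. The directory /etc/ssl/influxdb and the certificate file name fullchain.pem are assumptions made for illustration, so substitute the paths and file names that match your own certificate and volume mappings.

```shell
#!/bin/sh
# Assumed in-container locations; change them to match your volume mappings.
CERT_DIR="/etc/ssl/influxdb"
CERT_FILE="fullchain.pem"   # assumption: certificate (with chain) file name
KEY_FILE="privkey.pem"      # private key file name

# Print the TLS settings that belong in the [http] section of influxdb.conf.
# Merge them into the existing [http] table instead of appending a second one,
# because TOML does not allow the same table to be defined twice.
influxdb_tls_settings() {
  cat <<EOF
[http]
  https-enabled = true
  https-certificate = "${CERT_DIR}/${CERT_FILE}"
  https-private-key = "${CERT_DIR}/${KEY_FILE}"
EOF
}

influxdb_tls_settings
```

Printing the fragment rather than editing the file in place keeps the sketch safe to run anywhere; copy the three settings into the configuration file you extracted from the container.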
We can now stop the InfluxDB container and add the volume mappings required to expose both our configuration file and the certificates. Below are the mappings I used on my system.
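As a rough sketch of what such mappings can look like when expressed as a docker command: the host paths under /volume1/docker are hypothetical, the in-container certificate directory /etc/ssl/influxdb is an assumption, and the influxdb:1.8 tag is picked to match the release line used in this article. The script only prints the command so that you can review it first.

```shell
#!/bin/sh
# Hypothetical host paths on the Synology; adjust them to your own shares.
HOST_CONF="/volume1/docker/influxdb/influxdb.conf"
HOST_CERTS="/volume1/docker/certs"
HOST_DATA="/volume1/docker/influxdb/data"

# Build the docker command with read-only mounts for the config and certificates.
influxdb_run_command() {
  echo "docker run -d --name=influxdb -p 8086:8086" \
    "-v ${HOST_CONF}:/etc/influxdb/influxdb.conf:ro" \
    "-v ${HOST_CERTS}:/etc/ssl/influxdb:ro" \
    "-v ${HOST_DATA}:/var/lib/influxdb" \
    "influxdb:1.8 -config /etc/influxdb/influxdb.conf"
}

# Print the command for review; run it with sudo once the paths are correct.
influxdb_run_command
```

Mounting the configuration and certificates read-only means the container cannot modify them, and the -config argument tells the official image which configuration file to load.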
Restart the container, then check that the TLS configuration is working correctly. You can use the influx CLI:
influx -ssl -host <your InfluxDB host>
Alternatively, you can use openssl:
openssl s_client -connect <your InfluxDB host>:8086
If all works fine, remember to update your existing configurations to use the new endpoint:
- Home Assistant: update configuration.yaml to point to the TLS endpoint, then restart. Below, I have defined the actual hostname in my
- Grafana: update your existing Home Assistant data source to read from the TLS endpoint in the configuration dashboard.
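The Home Assistant change from the list above might look like the block printed by this sketch. The ssl and verify_ssl options belong to Home Assistant's influxdb integration, while the !secret names are hypothetical entries that you would define yourself in secrets.yaml.

```shell
#!/bin/sh
# Print an influxdb block for Home Assistant's configuration.yaml.
# The !secret names are hypothetical entries in secrets.yaml.
ha_influxdb_config() {
  cat <<'EOF'
influxdb:
  host: !secret influxdb_host
  port: 8086
  ssl: true
  verify_ssl: true
  database: !secret influxdb_database
  username: !secret influxdb_username
  password: !secret influxdb_password
EOF
}

ha_influxdb_config
```

With ssl enabled, the host value must match a name covered by the certificate, otherwise the verify_ssl check will reject the connection.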
We have now improved our original configuration so that it communicates over TLS. Next, we will add Synology metrics to our collection of measurements.
Capture Synology system data with Telegraf
Telegraf is a lightweight agent which can capture metrics and events for a lot of systems and has a very small footprint. There are a large number of plugins available. In this tutorial, we will be running an instance on our Synology station to push system metrics into InfluxDB and expose them to Grafana.
The first thing to do is to pull the telegraf image from DockerHub. On Synology, we can use the Docker frontend to search the registry and download it. I have used the alpine tag (telegraf:alpine), which is the image referenced in the docker run command below.
However, the Synology frontend is limited and does not allow me to set up the volume mounts that the agent needs to gather the required metrics. To do this, I can start the container directly from a Synology SSH session:
sudo docker run -d \
  --name=telegraf \
  --net=host \
  --pid=host \
  --restart always \
  --volume /volume1/docker/telegraf/config/telegraf.conf:/etc/telegraf/telegraf.conf:ro \
  --volume /var/run/docker.sock:/var/run/docker.sock:ro \
  --volume /sys:/host/sys:ro \
  --volume /proc:/host/proc:ro \
  --volume /etc:/host/etc:ro \
  -e "HOST_PROC=/host/proc" \
  -e "HOST_SYS=/host/sys" \
  -e "HOST_ETC=/host/etc" \
  telegraf:alpine
Specifically, here are the interesting parts:
- --volume /volume1/docker/telegraf/config/telegraf.conf:/etc/telegraf/telegraf.conf:ro: this is where I have placed my telegraf.conf file, more on that shortly.
- We then have some volume mounts that tell the Telegraf agent what to monitor. In our case, we want to monitor our host (the Synology) and also other Docker containers. This is why we are mapping the host's /sys, /proc and /etc directories, as well as /var/run/docker.sock:/var/run/docker.sock:ro, as seen in the examples on the DockerHub page.
The configuration file is where all the magic happens. Here is mine - please note that some values are placeholders for obvious reasons:
[global_tags]

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  precision = ""
  hostname = ""
  omit_hostname = false

# Push all measurements to InfluxDB over TLS
[[outputs.influxdb]]
  urls = ["https://***:8086"]
  database = "yourdb"
  skip_database_creation = true
  retention_policy = ""
  timeout = "5s"
  username = "user-write"
  password = "password2"

# Host metrics (read through the /host/* mounts)
[[inputs.cpu]]
  percpu = true
  totalcpu = true
  collect_cpu_time = false
  report_active = false

[[inputs.disk]]
  ignore_fs = ["tmpfs", "devtmpfs", "devfs", "iso9660", "overlay", "aufs", "squashfs"]

[[inputs.diskio]]
[[inputs.kernel]]
[[inputs.mem]]
[[inputs.processes]]
[[inputs.swap]]
[[inputs.system]]

# Docker container metrics, read through the mounted socket
[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
  container_names = ["container-1", "container-2"]

# TCP response time of a remote endpoint; the address is host:port, not a URL
[[inputs.net_response]]
  protocol = "tcp"
  address = "<some_host_to_monitor>:443"
This configuration collects several system stats, metrics about specific Docker containers, and response time statistics from an HTTPS endpoint. It then pushes all the measurements into our InfluxDB instance.
Before we can start Telegraf, we need to configure the InfluxDB database and users. You can do this by connecting to your InfluxDB Docker container with sudo docker exec -it influxdb influx -ssl -host <your host> (or by installing the influx CLI locally), then running:
-- InfluxQL requires passwords to be wrapped in single quotes
CREATE USER "user-read" WITH PASSWORD 'password1'
CREATE USER "user-write" WITH PASSWORD 'password2'
CREATE DATABASE "yourdb"
GRANT READ ON "yourdb" TO "user-read"
GRANT WRITE ON "yourdb" TO "user-write"
-- or create one user and GRANT ALL to them
I recommend setting up the read user in Grafana and the write user in Telegraf. Alternatively, use a single read/write user in both if you don't care too much.
Once you have completed your configuration, you are ready to issue the Docker command as seen earlier and start your instance.
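Before starting the long-running container, it can help to run Telegraf once in test mode, which gathers the inputs a single time and prints the metrics to the console without writing them to InfluxDB. The sketch below prints a one-off docker command that reuses the mounts from the earlier command; treat it as an optional sanity check and not as part of the original setup.

```shell
#!/bin/sh
# Path of the configuration file used earlier in this article.
CONF="/volume1/docker/telegraf/config/telegraf.conf"

# Build a one-off docker command that runs Telegraf in test mode.
telegraf_test_command() {
  echo "docker run --rm --net=host" \
    "-v ${CONF}:/etc/telegraf/telegraf.conf:ro" \
    "-v /var/run/docker.sock:/var/run/docker.sock:ro" \
    "-v /sys:/host/sys:ro -v /proc:/host/proc:ro -v /etc:/host/etc:ro" \
    "-e HOST_PROC=/host/proc -e HOST_SYS=/host/sys -e HOST_ETC=/host/etc" \
    "telegraf:alpine telegraf --config /etc/telegraf/telegraf.conf --test"
}

# Print the command for review; run it with sudo from the Synology SSH session.
telegraf_test_command
```

If the printed metrics look right, the same configuration file can be used unchanged by the permanent container.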
Optional - Install Telegraf in pfSense
If you are using pfSense, I recommend installing the Telegraf agent there as well. This allows you to monitor the status of the firewall in Grafana, as well as gather other stats and metrics on the firewall activity.
In the pfSense package manager, you should find Telegraf as one of the available packages. Here is my entry at the time of posting this blog
Install it, then head to Services > Telegraf. The configuration is self-explanatory. You will need to set up the usual InfluxDB output. I recommend creating a new database and user pair so that you will be able to run your queries on separate databases. This is the same approach we adopted earlier.
Finally, in the Additional configuration for Telegraf area, include the parameters below (credits to Tokugero and this Reddit user for the pfBlockerNG one). If you do not use pfBlockerNG, well, you should! Jokes aside, you can remove the second [[inputs.logparser]] entry from the configuration below.
Save and restart the Telegraf service on pfSense.
Connect Grafana with the new metrics
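We now have three InfluxDB databases available:
- Home Assistant, which we created in the first part of the tutorial, already connected to Grafana;
- Telegraf on the Synology, holding the NAS and Docker metrics;
- Telegraf on pfSense, holding the firewall metrics (if you followed the optional step).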
In Grafana, we just need to create new data sources. The configuration is the same as before, using our HTTPS endpoint and the correct InfluxDB users that we have set up. If you do not remember how to do it, have a look at the "Set up Grafana" section of my earlier post.
Create some dashboards
Well. We now have a mass of metrics flowing into our InfluxDB data storage. I will need to start watching the hard disk usage on my Synology, or at least tweak the retention policies, but at least I can keep an eye on it from a Grafana panel now! For now, let's make use of all the information we are collecting to set up some nice dashboards.
On the Grafana website, there are many ready-made dashboards. Some of these are an excellent way to get started and to see how dashboards and queries are structured. You can import them by noting the dashboard ID. I have used the following ones:
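On the retention policy point, here is a minimal sketch of how the statement could be built. It assumes the database still uses autogen, the default retention policy name in InfluxDB 1.x, and the 90d duration is an arbitrary value chosen for illustration. The helper function name is hypothetical.

```shell
#!/bin/sh
# Print an InfluxQL statement that caps how long a database keeps its data.
# "autogen" is the default retention policy name in InfluxDB 1.x.
retention_statement() {
  db="$1"
  duration="$2"
  printf 'ALTER RETENTION POLICY "autogen" ON "%s" DURATION %s DEFAULT\n' "$db" "$duration"
}

# Example: keep 90 days of Telegraf metrics (arbitrary illustrative value).
retention_statement "yourdb" "90d"
```

The printed statement can be pasted into the influx shell opened earlier, or passed to the CLI with its -execute option. Data older than the chosen duration is dropped, so pick the value with care.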
- pfSense System Dashboard (ID 12023): a nice overview of the pfSense System status, including classic load stats and a map visualization showing which countries are responsible for triggering the most block rules on your firewall. If you use DNSBL, it will also show the worst offending domains. An updated version is also available on Github.
- Synology Dashboard (ID 9961): a dashboard to monitor your Synology NAS. It is based on SNMP, so it requires additional setup and configuration for Telegraf to gather and populate that data. Setting up SNMP is beyond the scope of this tutorial, but you are welcome to experiment further with it.
Below you can see some examples of panels I created using the data collected by Telegraf (and Home Assistant).
Let's create a new dashboard. Using the data collected by the Telegraf agents running on the NAS and on the pfSense gateway, I created a graph to report the load metrics for each box:
In the top one, I have included some threshold areas to ensure that I consistently stay within the expected parameters - except of course that huge spike at midday when I rebooted the NAS 😜
Weather sensors and graphs
Using the Met Office integration with Home Assistant, I can pull weather data about my location. In this case, I'm plotting the UV index values over time and including a couple of areas showing me when it is unsafe to be in the Sun for too long without adequate protection.
Or, you may want to feel like a ship captain and see what the wind speed is in your neighbourhood:
If you have some sensors at home (check out this earlier article I wrote to figure out how to set that up), you might instead want to see at a glance what the temperatures are around your house. For example, it looks like it was quite toasty under the roof yesterday!
Finally, you may be interested in monitoring the temperature and humidity of your bedroom, making sure they stay within the optimal values:
pfSense IP Block Feed
Using the data coming from the pfSense Telegraf agent, I can check which feed has been triggering the most actions over time. This helps me identify and review the performance of each security feed.
In this tutorial, we have reviewed how to set up Telegraf and pull different data sources into InfluxDB. We can then pull this into Grafana and set up dashboards and panels for our monitoring.
I am still working on my dashboard, and I plan to share it once I have refined it a little. Let me know in the comments below what you have built!