How to collect OPNSense metrics - Christos Georgiadis

Why monitor OPNSense

For those not familiar with OPNSense, it is an open-source firewall and routing platform. I use it mainly as a DHCP and DNS server, and for port forwarding, dynamic DNS and VPN (viva wireguard!). Since many things depend on it, I really wanted a way to monitor it. Now you will ask me: what’s the point of monitoring if you will not even be notified when OPNSense goes down? True, without internet any notification outside my LAN is impossible. Through monitoring, though, I can at least look into the historical data and find out what might have gone wrong.

Monitoring Solution

I already have influxdb and grafana running in docker containers to monitor my openHAB server, so almost everything is in place. Influxdb is a time series database that stores data from various sources, and grafana pulls that data and creates very nice-looking graphs. At this point I am still adding data sources and metrics to monitor, so I am not too focused on making everything look nice. That will come later.

How to push data to InfluxDB

Normally applications are supposed to push data to influxdb themselves, as my openHAB instance does, but there is also telegraf, which is influxdb’s data collector. It is an agent installed on the target machine that collects data and sends it over to influxdb. Luckily, OPNSense has telegraf available on the plugins page. You just go to System –> Firmware –> Plugins and install telegraf.

Once this is done, telegraf will be available via the menu Services –> Telegraf

Configuration

influxdb

In influxdb v2.x we should already have configured an organization. We also need to configure a bucket where our OPNSense metrics will be stored. Head over to Load Data –> Buckets –> Create Bucket

You give it a name and also define the retention policy, i.e. how long you are going to keep the data.
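If you prefer to script this step instead of clicking through the UI, influxdb 2.x also accepts bucket creation over its HTTP API (POST /api/v2/buckets). The sketch below only builds the JSON body, to show how a retention period in days turns into the expiry rule in seconds. The organization ID and the helper name bucket_payload are made up for illustration, and I am assuming that an empty rule list means "keep forever".

```python
import json

SECONDS_PER_DAY = 24 * 60 * 60


def bucket_payload(org_id: str, name: str, retention_days: int) -> str:
    """Build the JSON body for POST /api/v2/buckets.

    retention_days=0 is treated here as "no expiry rule" (assumption:
    influxdb then keeps the data indefinitely).
    """
    rules = []
    if retention_days > 0:
        rules.append(
            {"type": "expire", "everySeconds": retention_days * SECONDS_PER_DAY}
        )
    return json.dumps({"orgID": org_id, "name": name, "retentionRules": rules})


# Hypothetical organization ID; the real one is shown in the influxdb UI.
print(bucket_payload("0123456789abcdef", "opnsense", 30))
```

The request itself would need a token that is allowed to manage buckets, which is more than the write-only token telegraf uses.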

You also need to create an API token in the Load Data section. A custom token with write access to that bucket is all telegraf on OPNSense needs to write data to it.
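Before involving telegraf at all, it can be handy to check that the token really can write to the bucket. Here is a minimal sketch that sends one line-protocol record to the influxdb 2.x write endpoint using only the Python standard library. The address, organization "home", bucket "opnsense", the token and the measurement name token_test are all placeholders, not values you must use.

```python
import urllib.parse
import urllib.request


def build_write_request(base_url, org, bucket, token, line):
    """Build a POST to /api/v2/write carrying one line-protocol record."""
    query = urllib.parse.urlencode(
        {"org": org, "bucket": bucket, "precision": "s"}
    )
    return urllib.request.Request(
        url=f"{base_url}/api/v2/write?{query}",
        data=line.encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "text/plain; charset=utf-8",
        },
        method="POST",
    )


def send(request, timeout=5.0):
    """Send the request and return the HTTP status (204 on a successful write)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Placeholder values: replace with your own address, org, bucket and token.
req = build_write_request(
    "http://ip_address_of_influxdb:8086", "home", "opnsense", "MY_TOKEN",
    "token_test,host=laptop value=1i",
)
print(req.get_method(), req.full_url)
# status = send(req)  # uncomment on a machine that can reach influxdb
```

If the token is wrong or lacks write access, urlopen raises an HTTPError with an authorization status instead of returning 204.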

Telegraf

First, we go to Services –> Telegraf –> Input to configure the metrics we want to collect such as CPU, Memory, Disk, Disk I/O, Internet Speed, network, ping etc.

Then, we go to Services –> Telegraf –> Output to configure where to send all this data. If you have influxdb 1.x (legacy), you want to check Enable Influx Output. If you, like me, have a recent 2.x version of the db (which uses the open-source Flux query language), you will need to check Enable Influx v2 Output. On top of that, you will need to provide the URL of influxdb, plus username/password and database for legacy, or token, organization and bucket for 2.x. There are plenty of other options there, so make sure you have a good look.
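For orientation, these GUI fields correspond to the output sections of a regular telegraf.conf file. The sketch below renders both variants as text so the mapping is visible. The section and key names are Telegraf's own; the helper names and all values are placeholders, and I have not checked exactly what the OPNSense plugin writes to disk, so treat this as an approximation.

```python
def influxdb_v2_output(url, token, organization, bucket):
    """Render a Telegraf [[outputs.influxdb_v2]] section as TOML text."""
    return "\n".join([
        "[[outputs.influxdb_v2]]",
        f'  urls = ["{url}"]',
        f'  token = "{token}"',
        f'  organization = "{organization}"',
        f'  bucket = "{bucket}"',
    ])


def influxdb_v1_output(url, database, username, password):
    """Render a Telegraf [[outputs.influxdb]] section for influxdb 1.x (legacy)."""
    return "\n".join([
        "[[outputs.influxdb]]",
        f'  urls = ["{url}"]',
        f'  database = "{database}"',
        f'  username = "{username}"',
        f'  password = "{password}"',
    ])


# Placeholder values only.
print(influxdb_v2_output(
    "http://ip_address_of_influxdb:8086", "MY_TOKEN", "home", "opnsense"
))
```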

Finally, you go to Services –> Telegraf –> General and check the box that enables the telegraf agent. If all goes well, you should start seeing data coming into the bucket. Monitor the log file in Services –> Telegraf –> Log File, as it will give you pointers in case something is wrong. A typical error is:

E! [outputs.influxdb_v2] When writing to [http://ip_address_of_influxdb:8086]: Post "http://ip_address_of_influxdb:8086/api/v2/write?bucket=opnsense&org=home": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
E! [agent] Error writing to outputs.influxdb_v2: failed to send metrics to any configured server(s)

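The above almost surely means that telegraf cannot reach influxdb, because the request timed out before any answer came back. Check the IP address and port details, and that nothing on the way blocks the connection. Also make sure the API token gives write access to the correct bucket, although a token problem would normally show up as an authorization error rather than a timeout.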
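A quick way to separate a network problem from a token problem is to test whether the influxdb port accepts TCP connections at all. This is a small sketch of such a check; the function name can_connect is my own, and the address in the example is only a stand-in for the machine running influxdb. Ideally you run it from a host that sits on the same network path as OPNSense.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Replace 127.0.0.1 with the address of the machine running influxdb.
print(can_connect("127.0.0.1", 8086))
```

If this prints False, the timeout in the telegraf log is a connectivity issue and the token is not the first thing to look at.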

If you head over to Data Explorer in influxdb you should be able to see data in your bucket. From there you can create queries, inspect the incoming data and visualize it.
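The queries that Data Explorer builds are Flux scripts, and the same script can be sent to the influxdb 2.x query endpoint. Below is a rough sketch that asks for the average idle CPU over the last hour in 5-minute windows. It assumes the CPU input is enabled in telegraf (measurement cpu, field usage_idle) and uses placeholder values for the address, organization, bucket and token.

```python
import urllib.parse
import urllib.request

# Average idle CPU over the last hour, in 5-minute windows.
FLUX = """
from(bucket: "opnsense")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_idle")
  |> aggregateWindow(every: 5m, fn: mean)
"""


def build_query_request(base_url, org, token, flux):
    """Build a POST to /api/v2/query that returns annotated CSV."""
    query = urllib.parse.urlencode({"org": org})
    return urllib.request.Request(
        url=f"{base_url}/api/v2/query?{query}",
        data=flux.encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/vnd.flux",
            "Accept": "application/csv",
        },
        method="POST",
    )


# Placeholder values: replace with your own address, org and token.
query_req = build_query_request(
    "http://ip_address_of_influxdb:8086", "home", "MY_READ_TOKEN", FLUX
)
print(query_req.get_method(), query_req.full_url)
# with urllib.request.urlopen(query_req, timeout=10) as resp:
#     print(resp.read().decode("utf-8"))
```

Note that the write-only token created for telegraf will not work here; querying needs a token with read access to the bucket.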

Although influxdb has its own visualization layer (dashboards built into the 2.x UI, formerly the separate Chronograf), I have never tried it as I already have Grafana installed, and I think it is much more popular (although this does not necessarily mean that it is better for my needs).