Mqttrunner Introduction

Four years ago, in 2016 November, I slung together a few hundred lines of Python to do some logging of data from around the house. It was intended as a plan-to-throw-one-away effort for exploration before writing something a bit more general. As with all things temporary, it's continued in operation since and has grown a bit to about 10'000 lines of Python and 5000 lines of HTML, CSS, JavaScript, JSON, Rust, C/C++ and other eldritch horrors.

There are a few aspects of it which might be worth discussing, so a brief overview to give some context seems a good way to start.

The general “architecture” is to fling all of the available data at an MQTT broker (I use mosquitto) then have a process pick it all up from there, add a timestamp, and write it to a database for future analysis.
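Roughly, the picking-up side looks something like the following. This is only a minimal sketch using paho-mqtt, not the real mqttrunner code; store() here is just a stand-in for whatever does the writing.

import datetime
import json
import paho.mqtt.client as mqtt

def on_connect(client, userdata, flags, rc):
    # Grab everything the broker sees; filtering can happen later.
    client.subscribe("#")

def on_message(client, userdata, msg):
    record = {
        "timestamp": datetime.datetime.utcnow().isoformat() + "Z",
        "topic": msg.topic,
        "value": msg.payload.decode("utf-8", errors="replace"),
    }
    store(record)

def store(record):
    # Stand-in for the real database write.
    print(json.dumps(record))

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("localhost", 1883)
client.loop_forever()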

The original program was so trivial that I didn't bother to give it a proper name. The actual logging program was just called mqttlog.py living in a directory called mqtt_utils. Now it's made up of quite a few more components (though a few lines of the original mqttlog.py still exist in modules mqttreader.py and main_loop.py). The whole lot is coordinated at the top level by mqttrunner.py so I'll retrospectively use that name for the whole system.

For the first month or so I ran the system on my netbook (Acer Aspire One) but towards the end of 2017 January I got a Raspberry Pi and have been running it on that ever since.

The original system logged data to flat text files as lines of JSON data.

Each time a log file grew larger than 8 MB it was closed and a new one opened. An early addition was a set of index files to record the timespan of the data in each file. Because I wanted to allow for the possibility of out-of-order data it was, in theory, possible for the timespans of log files to overlap, so reading the entries in order could require merging the contents of two (or more) log files rather than just sorting the (already mostly sorted) contents of a single one. This all got a bit convoluted.
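The basic file writing was the easy part. Something along these lines would do it (a rough sketch of the scheme rather than the original code, with made-up file naming and paths):

import json
import os
import time

MAX_SIZE = 8 * 1024 * 1024            # rotate once a file passes 8 MB
LOG_DIR = "logs"

class JsonLineLogger:
    def __init__(self):
        os.makedirs(LOG_DIR, exist_ok=True)
        self._open_new_file()

    def _open_new_file(self):
        # One file per rotation, named for the time it was started.
        name = time.strftime("%Y%m%dT%H%M%SZ.log", time.gmtime())
        self.file = open(os.path.join(LOG_DIR, name), "a")

    def write(self, record):
        self.file.write(json.dumps(record) + "\n")
        self.file.flush()
        if self.file.tell() > MAX_SIZE:
            self.file.close()
            self._open_new_file()

It was the index files and the merge-on-read, not this part, that made things messy.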

Accordingly, towards the end of 2017 I moved to using a “proper” database. IIRC, I first had a play with MongoDB but decided against it for reasons I don't remember. In the end I transferred all the existing data into a sqlite3 database where I've put all the new data since. It's now about 4 GB in size with just over 48 million individual data points.
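The writing side doesn't need to be much more than the following. This is a guessed-at minimal version rather than the actual schema or code; the table, column and index names are assumptions.

import sqlite3

db = sqlite3.connect("mqttlog.db")
db.execute("""
    CREATE TABLE IF NOT EXISTS data_points (
        timestamp TEXT NOT NULL,   -- ISO 8601, UTC
        topic     TEXT NOT NULL,   -- MQTT topic
        value     TEXT NOT NULL    -- value as published (usually JSON)
    )
""")
db.execute("CREATE INDEX IF NOT EXISTS data_points_by_topic_time"
           " ON data_points (topic, timestamp)")

def store(record):
    # Drop-in replacement for the stand-in store() in the earlier sketch.
    db.execute("INSERT INTO data_points VALUES (?, ?, ?)",
               (record["timestamp"], record["topic"], record["value"]))
    db.commit()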

The main table in the database just contains data points in the form of timestamp, topic, value triples where the topic is the MQTT label applied to the data. Here's some data extracted in JSON format just now:

{"timestamp": "2020-11-21T15:44:40.75Z", "topic": "study/temperature", "value": 20.79}
{"timestamp": "2020-11-21T15:44:40.87Z", "topic": "study/humidity", "value": 61.71}
{"timestamp": "2020-11-21T15:44:41.65Z", "topic": "ow/28e3f2c7010000/temperature", "value": 31.0625}
{"timestamp": "2020-11-21T15:44:41.71Z", "topic": "study/radiator/temperature", "value": 31.0625}
{"timestamp": "2020-11-21T15:44:44.52Z", "topic": "esp/cc50e322cb67/commanded_state", "value": false}
{"timestamp": "2020-11-21T15:44:44.54Z", "topic": "esp/cc50e322cb67/ip", "value": "192.168.77.10"}
{"timestamp": "2020-11-21T15:44:44.58Z", "topic": "esp/cc50e322cb67/state", "value": false}
{"timestamp": "2020-11-21T15:44:44.59Z", "topic": "ch/commanded_state", "value": false}
{"timestamp": "2020-11-21T15:44:44.60Z", "topic": "esp/cc50e322cb67/uptime_millis", "value": 68165450}
{"timestamp": "2020-11-21T15:44:44.64Z", "topic": "ch/state", "value": false}
{"timestamp": "2020-11-21T15:44:46.24Z", "topic": "esp/cc50e322cb54/fridge_state", "value": 0}
{"timestamp": "2020-11-21T15:44:46.27Z", "topic": "esp/cc50e322cb54/uptime_millis", "value": 2617730480}
{"timestamp": "2020-11-21T15:44:46.29Z", "topic": "fridge/state", "value": 0}
{"timestamp": "2020-11-21T15:44:46.29Z", "topic": "ow/28aa07f94e1401/temperature", "value": 1.25}
{"timestamp": "2020-11-21T15:44:46.33Z", "topic": "fridge/temperature", "value": 1.25}
{"timestamp": "2020-11-21T15:44:48.69Z", "topic": "cc/power", "value": 88}
{"timestamp": "2020-11-21T15:44:54.58Z", "topic": "cc/power", "value": 86}
{"timestamp": "2020-11-21T15:44:57.81Z", "topic": "esp/30aea4739ca4/uptime_millis", "value": 2617590249}
{"timestamp": "2020-11-21T15:44:57.85Z", "topic": "ow/28b21ac8010000/temperature", "value": 14.38}
{"timestamp": "2020-11-21T15:44:57.90Z", "topic": "living-room/temperature", "value": 14.38}
{"timestamp": "2020-11-21T15:45:01.20Z", "topic": "esp/30aea48bd5cc/uptime_millis", "value": 2617650376}
{"timestamp": "2020-11-21T15:45:01.23Z", "topic": "ow/28d517c8010000/temperature", "value": 15.06}
{"timestamp": "2020-11-21T15:45:01.57Z", "topic": "cc/power", "value": 86}

“ow” prefixes data identified by 1-wire address and “esp” prefixes data from ESP8266 or ESP32 devices identified by their Wi-Fi MAC address. Data items often appear twice, such as the study radiator flow pipe's temperature which appears first on line 3 of the above data under its 1-wire address then on line 4 under its “logical” meaning. Updating this mapping allows devices to be moved around and reconfigured without finishing up with glitched data in the database.

As this mapping works by re-posting the raw data back to MQTT with the logical topic, there's a bit of a flaw in that the timestamp isn't preserved. In this example the delay is very small (from 41.65 seconds into the minute until 41.71 seconds) but it can be longer if I'm fiddling with the configuration.
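The re-posting itself is nothing fancy; something like this sketch, where the mapping shown is just an example taken from the data above rather than my actual configuration:

import paho.mqtt.client as mqtt

# Example mapping from "physical" topics to "logical" ones; the real
# configuration is larger and lives elsewhere.
TOPIC_MAP = {
    "ow/28e3f2c7010000/temperature": "study/radiator/temperature",
    "ow/28b21ac8010000/temperature": "living-room/temperature",
}

def on_connect(client, userdata, flags, rc):
    for raw_topic in TOPIC_MAP:
        client.subscribe(raw_topic)

def on_message(client, userdata, msg):
    # Re-publish under the logical name.  The original receive time is
    # lost here, which is the flaw mentioned above.
    client.publish(TOPIC_MAP[msg.topic], msg.payload)

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("localhost", 1883)
client.loop_forever()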

“ch” is a prefix for topics relating to the central heating control and “cc” to the CurrentCost meter which records (via a clip-on current transformer) the mains power entering the house.

This data looks very verbose (because it is). However, the whole 4 years of data in JSON format is about the same size as the sqlite3 database (4 GB) and it compresses well; gzip gets 10:1 compression on it whereas it only compresses the sqlite3 database by about 4:1.

My backup strategy for the database is to, every few days, turn off writes temporarily then rsync it to my main laptop for inclusion with other backups. Data is not lost while doing this; it gets queued up and written as soon as writes are turned back on (so long as there isn't a power cut or other catastrophe in the meantime).
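The queuing only needs a flag and a list in the writing process. A simplified illustration of the idea, not the real implementation:

import collections

class PausableWriter:
    def __init__(self, store):
        self.store = store                # function that actually writes a record
        self.paused = False
        self.queue = collections.deque()

    def write(self, record):
        if self.paused:
            self.queue.append(record)     # hold on to it for now
        else:
            self.store(record)

    def pause(self):
        # Called before rsyncing the database file.
        self.paused = True

    def resume(self):
        # Flush everything that arrived during the backup, then carry on.
        while self.queue:
            self.store(self.queue.popleft())
        self.paused = False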

Apart from the change of database there have been a few main extensions to the system over the years, but more on those later…