Setting Up Automated E-Mail GoAccess Reports via systemd

Despite what are, to me, obvious drawbacks of client-side analytics libraries like Google Analytics, people insist on including them on their websites because they “need the data” or whatever, and think that there is no other way to get it. I disagree.
GoAccess is a great command line tool that generates web server reports directly in the terminal, which lets you quickly interact with the data to answer questions you may have. It can also output an HTML report that is fully contained in a single static file. The out-of-the-box styling of that file is great and lets you quickly glance at a report, or even send it to colleagues to help support their efforts. I really recommend giving it a try, and I strongly recommend tearing privacy-invading client-side libraries out of everything you have control over. While the internet of the early 2000s was a really cool place to be, Google Analytics is one of those relics that should have died a decade ago.
Most web server logs hold tons of interesting information sent by the visitor's web browser, which can then be analysed asynchronously by a tool like GoAccess in a privacy-respecting way. You will miss out on a few things, but nobody is sitting in front of the Google Analytics dashboard frantically optimising their website for every possible viewport height & width pixel by pixel.

I have been using this setup for over a year now but recently ran into some trouble with my cron automation, which this Raspberry Pi ran from a file in /etc/cron.daily/. If you don’t know, placing files into those directories is a little tricky because various requirements need to be met, and sometimes things just stop for inexplicable reasons. Oh, and there are no logs about anything.
I was finally over it and simply ignored the reports for a while, thinking that it would either fix itself or that I would fix it myself at some point. Recently, though, I ran into a great blog post about setting up systemd timers to run automations. I have since converted various things in my life into systemd timers and have not looked back. Some of y’all don’t quite understand the powers of systemd or fundamentally don’t like it, but it is quite compatible with my brain, and getting worry-free logs which are instantly available, queryable and sortable via journalctl makes it all even better in my eyes.

The last remaining puzzle piece in this entire chain of events is the transfer of the GoAccess report into my E-Mail inbox. I looked at a few of the common recommendations on the internet but ended up using a great OSS project fittingly called eMail. It does a few fancy things, but it tries to get out of your way and simply be a good *NIX tool. There isn’t too much to say about it and that is a good thing. I can wholeheartedly recommend using it!

After installing GoAccess, one would usually get started by running something like this command to generate a static HTML report (drop the -o static-report.html to check out the analysis in the terminal first):

zcat -f /var/log/nginx/access.* -f /var/log/nginx/error.* | goaccess -o static-report.html

You could even exclude paths that you do not care about from the GoAccess report by running the logs through grep first, for example:

zcat -f /var/log/nginx/access.* -f /var/log/nginx/error.* | grep -Ev '/admin|/wp/login/' | goaccess -o static-report.html

This, however, will not work with systemd, nor will it work if you throw it all into a shell script which systemd then executes, for some reason. GoAccess will give you scary-sounding, nonsensical errors, and grep and gzip will complain about broken pipes.

Aug 05 09:15:56 raspberrypi systemd[1]: Started GoAccess Report generation script.
Aug 05 09:15:56 raspberrypi bash[22837]: GoAccess - version 1.5.1 - Jul 19 2021 17:11:41
Aug 05 09:15:56 raspberrypi bash[22837]: Config file: /usr/local/etc/goaccess/goaccess.conf
Aug 05 09:15:56 raspberrypi bash[22837]: Fatal error has occurred
Aug 05 09:15:56 raspberrypi bash[22837]: Error occurred at: src/goaccess.c - initializer - 1459
Aug 05 09:15:56 raspberrypi bash[22837]: No input data was provided nor there's data to restore.
Aug 05 09:15:56 raspberrypi bash[22837]: grep: write error: Broken pipe
Aug 05 09:15:56 raspberrypi bash[22837]: gzip: stdout: Broken pipe
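
The broken pipe lines at the end of that log look like a follow-on symptom rather than the cause: GoAccess bails out first, and grep and gzip are then left writing into a pipe that nobody reads anymore. The sketch below reproduces only that part in isolation. It assumes GNU seq as a stand-in producer and `true` as a stand-in for a consumer that exits without reading its input. Ignoring SIGPIPE via trap mirrors systemd's default IgnoreSIGPIPE=yes for services, which is why the writers print an error instead of being killed silently. It does not explain why GoAccess rejected the input in the first place.

```shell
#!/bin/bash
# Ignore SIGPIPE in this shell and its children, similar to what systemd
# does for services by default (IgnoreSIGPIPE=yes).
trap '' PIPE

# seq is a stand-in for zcat/grep, 'true' for a consumer that quits early.
# The output is far larger than a pipe buffer, so seq is still writing
# when the read end of the pipe goes away.
seq 1 1000000 | true

# PIPESTATUS holds the exit status of every command in the last pipeline.
producer_status=${PIPESTATUS[0]}
echo "producer exit status: $producer_status"
```

The producer fails with a write error even though nothing is wrong with the producer itself, so the first error in the log is the one worth chasing.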

I ended up fixing this by creating a working directory for my GoAccess script and defining a handful of files which are created, used while the logs are being analysed, and finally deleted once the report has been sent out.

# mkdir -p /usr/local/goaccess-reports
# touch /usr/local/goaccess-reports/excluded-paths.txt
# touch /usr/local/goaccess-reports/goaccess-report.sh


Once that is set you can populate the files accordingly and start receiving daily reports about the traffic on your web server, straight into your inbox.
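
For excluded-paths.txt, grep -f expects one pattern per line. Here is a minimal sketch of how that behaves, using a scratch directory so nothing under /usr/local is touched. The two patterns and the three log lines are made up for illustration; put whatever paths you want to hide into your own file.

```shell
#!/bin/bash
# Everything here lives in a throwaway directory created by mktemp.
demo_dir="$(mktemp -d)"

# One basic regular expression per line. Avoid blank lines: an empty
# pattern matches every line, so grep -v would then drop the whole log.
cat > "$demo_dir/excluded-paths.txt" <<'EOF'
/admin
/wp-login
EOF

# Hypothetical log lines in the combined log format.
cat > "$demo_dir/sample.log" <<'EOF'
203.0.113.7 - - [05/Aug/2021:09:15:56 +0000] "GET /blog/ HTTP/1.1" 200 512 "-" "curl/7.74.0"
203.0.113.8 - - [05/Aug/2021:09:15:57 +0000] "GET /admin/login HTTP/1.1" 404 153 "-" "curl/7.74.0"
203.0.113.9 - - [05/Aug/2021:09:15:58 +0000] "GET /wp-login.php HTTP/1.1" 404 153 "-" "curl/7.74.0"
EOF

# -v inverts the match, -f reads the patterns from the file.
kept="$(grep -v -f "$demo_dir/excluded-paths.txt" "$demo_dir/sample.log")"
printf '%s\n' "$kept"

rm -rf "$demo_dir"
```

Only the /blog/ request survives the filter. Note that the patterns match anywhere in the line, so a pattern like /admin would also drop a request whose referrer happens to contain /admin.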

/usr/local/goaccess-reports/goaccess-report.sh

#! /bin/bash

OUTPUTFILELOCATION="/usr/local/goaccess-reports/"
OUTPUTFILE="todays-report.html"
TMPLOGFILEFULL="/usr/local/goaccess-reports/tmp-full.txt"
TMPLOGFILECLEANED="/usr/local/goaccess-reports/tmp-cleaned.txt"

echo "Assembling report from Nginx logs"
zcat -f /var/log/nginx/access.* -f /var/log/nginx/error.* > "$TMPLOGFILEFULL" 

echo "Cleaning the log"
grep -v -f /usr/local/goaccess-reports/excluded-paths.txt "$TMPLOGFILEFULL" > "$TMPLOGFILECLEANED"

echo "Analyzing the log"
goaccess -f "$TMPLOGFILECLEANED" --log-format=COMBINED --http-protocol=no --http-method=no --ignore-crawlers -a -o "$OUTPUTFILELOCATION$OUTPUTFILE"

echo "Sending report with eMail"
email -b -s "Website Report For $(date +"%Y-%m-%d")" -attach "$OUTPUTFILELOCATION$OUTPUTFILE" johndoe@example.com

echo "Deleting todays-report.html file"
rm "$OUTPUTFILELOCATION$OUTPUTFILE"

echo "Deleting tmp-full.txt"
rm "$TMPLOGFILEFULL"

echo "Deleting tmp-cleaned.txt"
rm "$TMPLOGFILECLEANED"

echo "Done for today"
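
The rm lines at the end only run if the script actually reaches them. If it ever stops early (for example after adding set -e), the temporary files stay behind. One possible variation, not what I run myself, is to register the cleanup with trap so it happens on every exit path. The make_report function, the mktemp scratch directory and the marker file below are all hypothetical names for this sketch.

```shell
#!/bin/bash
# Sketch: clean up scratch files on every exit path, including failures.

make_report() (
    # The parentheses run the function body in a subshell, so the EXIT
    # trap fires as soon as the function is done, successful or not.
    workdir="$(mktemp -d)" || exit 1
    trap 'rm -rf "$workdir"' EXIT

    # Record the scratch path so the caller can check it afterwards.
    printf '%s\n' "$workdir" > "$1"

    printf 'sample log line\n' > "$workdir/tmp-full.txt"

    # Stand-in for an analysis step that fails partway through.
    exit 1
)

marker="$(mktemp)"
make_report "$marker" || echo "report failed"

scratch="$(cat "$marker")"
[ -d "$scratch" ] || echo "scratch directory was cleaned up"
rm -f "$marker"
```

The same trap line works at the top level of a script too. The subshell is only there so the cleanup can be observed from the outside in this example.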


/etc/systemd/system/goaccess-reports.service

[Unit]
Description=GoAccess Report generation script

[Service]
ExecStart=/bin/bash /usr/local/goaccess-reports/goaccess-report.sh


/etc/systemd/system/goaccess-reports.timer

[Unit]
Description=Timer for Daily GoAccess report

[Timer]
OnBootSec=300
OnUnitActiveSec=1d

[Install]
WantedBy=multi-user.target

Finally, remember to start and enable the systemd timer:

# systemctl start goaccess-reports.timer
# systemctl enable goaccess-reports.timer

You can check on the status of your timer and when it’ll be executed again via

# systemctl status goaccess-reports.timer

07.08.2021