Interesting. I switched from Google Analytics to GoAccess ~5 years ago. I let both run in parallel for a month and compared the results. The relative numbers were very similar (so I get the same information about which blog posts are most popular), but the absolute numbers were in fact lower for GoAccess. It might be because tech blog visitors use ad blockers more often (and hence block GA).
> I'd love if I could change the time interval of the shown html stats.
GoAccess displays the data that you pass it. So while it doesn't have a date filter option (at least it didn't the last time I checked), you can just filter your logs beforehand. There's an even simpler solution that I'm using: set logrotate to a specific time frame (e.g. weekly), so you can pass "access.log" to GoAccess to get only the latest stats. You can still pass "access.log*" to get ALL stats at the same time.
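For the "filter your logs beforehand" route, here is a minimal Python sketch. It assumes the log is in Common/Combined Log Format, with timestamps like [10/Jan/2024:12:00:00 +0000] and English month names. The script name filter_log.py and the function names are made up for illustration.

```python
import sys
from datetime import datetime, timezone


def parse_timestamp(line):
    """Return the timestamp of a Common/Combined Log Format line, or None."""
    start = line.find("[")
    end = line.find("]", start + 1)
    if start == -1 or end == -1:
        return None
    try:
        return datetime.strptime(line[start + 1:end], "%d/%b/%Y:%H:%M:%S %z")
    except ValueError:
        return None


def filter_by_date(lines, since, until):
    """Yield only the lines whose timestamp t satisfies since <= t < until."""
    for line in lines:
        ts = parse_timestamp(line)
        if ts is not None and since <= ts < until:
            yield line


# Hypothetical usage: python filter_log.py 2024-01-01 2024-02-01 < access.log
if __name__ == "__main__" and len(sys.argv) == 3:
    since = datetime.strptime(sys.argv[1], "%Y-%m-%d").replace(tzinfo=timezone.utc)
    until = datetime.strptime(sys.argv[2], "%Y-%m-%d").replace(tzinfo=timezone.utc)
    sys.stdout.writelines(filter_by_date(sys.stdin, since, until))
```

The output can then be piped into GoAccess, e.g. "python filter_log.py 2024-01-01 2024-02-01 < access.log | goaccess - --log-format=COMBINED -o report.html". Lines without a parsable timestamp are dropped silently, which may or may not be what you want.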
About the numbers: thanks for your input. I guess you get more accurate results the more visitors you have. On my small blog I doubt the numbers and think that GoAccess does not filter out some bots. You can try to identify them and filter them out, but that takes some time.
However, even if the numbers may not be accurate, you still see overall trends, which is valuable.
And thanks for the log-rotation trick. I will definitely make use of it.
Most bot filtering in analytics tools (GA, Adobe and the like) is quite efficient. So you would expect lower traffic in these tools than in a tool that uses your server's log files.
On the other hand, a lot of browser plugins and privacy/incognito modes kill analytics but have no effect on your log files. This also leads to higher numbers in your log files.
So I would expect the numbers from your log files to be somewhere between 10% and 25% higher, depending on the audience you serve and the overall traffic volume of your site.
At least these were the numbers some years back, when we did some additional backend tracking for some clients, where we linked the frontend tracking tool ID (from the cookie) to the tracking hit being sent from the backend with additional information. Back in 2015 it was between 7% and 19%, and in 2018 (before GDPR kicked in) it was between 15% and 27% of backend tracking hits that did not have a frontend ID associated with them. So we knew the share of tracking calls that had frontend tracking blocked.
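To make the arithmetic concrete, here is a rough Python sketch of how such a share could be computed. The record layout (dicts with an optional "frontend_id" key) is purely an assumption for illustration, not the setup that was actually used.

```python
def share_without_frontend_id(backend_hits):
    """Return the fraction of backend hits that carry no frontend tracking ID.

    backend_hits: iterable of dicts; a missing, None, or empty "frontend_id"
    is counted as "frontend tracking blocked" (an assumed data layout).
    """
    total = 0
    missing = 0
    for hit in backend_hits:
        total += 1
        if not hit.get("frontend_id"):
            missing += 1
    return missing / total if total else 0.0
```

With 20 backend hits of which 3 have no frontend ID, this returns 3/20, i.e. 15% of the traffic would be invisible to the frontend tool.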
Very interesting. I just checked my log files and, as expected, most traffic seems to originate from search crawlers, feed readers and bots running through all kinds of exploit URLs. It's hard to tell, but a wild guess is that more than 75% of hits and visitors are bot-related.
I learned that it is possible to exclude bots through browsers.list [1], and in goaccess.conf you can exclude IP ranges. Unfortunately, updating those entries is very time-consuming and probably not worth it.
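For a quick estimate of the bot share without maintaining any lists, a crude user-agent check might be enough. This is only a sketch: it assumes the Combined Log Format (user agent as the last quoted field), and the marker substrings are my own guess, not what GoAccess uses internally.

```python
import re
from collections import Counter

# Assumed marker substrings; extend them to match what shows up in your logs.
BOT_MARKERS = ("bot", "crawl", "spider", "feed", "curl", "wget")

# In Combined Log Format the user agent is the last quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')


def user_agent(line):
    """Return the user agent of a log line, or "" if none is found."""
    match = UA_PATTERN.search(line)
    return match.group(1) if match else ""


def looks_like_bot(ua):
    """Treat empty user agents and known marker substrings as bots."""
    ua = ua.lower()
    return ua in ("", "-") or any(marker in ua for marker in BOT_MARKERS)


def bot_share(lines):
    """Return the fraction of lines whose user agent looks like a bot."""
    counts = Counter(looks_like_bot(user_agent(line)) for line in lines)
    total = counts[True] + counts[False]
    return counts[True] / total if total else 0.0
```

Bots that send a regular browser user agent (which many exploit scanners do) slip through, so the result is a lower bound at best.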