I think a big part of modern analytics is a JavaScript component that runs in the browser. JavaScript can help you tell crawlers from real browsers, and repeat clients from large school networks (where everyone shares the same source IP).
This raw TinyAnalytics log is just temporary: it is digested every day into a summary and then removed. No need for a more complex DB than that, I think.
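To make the "digest, then delete" idea concrete, here is a minimal sketch in Python. It is not TinyAnalytics' actual code: the tab-separated line layout (day, site, IP, referrer), the JSON summary file and the function name `digest` are all assumptions made up for illustration.

```python
import json
import os
from collections import defaultdict


def digest(raw_log_path, summary_path):
    """Fold a raw hit log into per-site, per-day unique-visitor counts,
    write the summary as JSON, then delete the raw log."""
    # Start from the existing summary so every run adds to it.
    summary = {}
    if os.path.exists(summary_path):
        with open(summary_path) as f:
            summary = json.load(f)

    # Hypothetical raw format, one hit per line: day<TAB>site<TAB>ip<TAB>referrer
    visitors = defaultdict(set)  # (site, day) -> set of IPs seen
    with open(raw_log_path) as f:
        for line in f:
            parts = line.rstrip("\n").split("\t")
            if len(parts) != 4:
                continue  # skip malformed lines
            day, site, ip, _referrer = parts
            visitors[(site, day)].add(ip)

    for (site, day), ips in visitors.items():
        per_day = summary.setdefault(site, {})
        # Counts from separate runs are added, so an IP seen in two runs
        # on the same day would be counted twice.
        per_day[day] = per_day.get(day, 0) + len(ips)

    # Write the summary first, and only then remove the raw log.
    with open(summary_path, "w") as f:
        json.dump(summary, f)
    os.remove(raw_log_path)
    return summary
```

Run once a day (from cron, for example), this keeps the raw file small, and the summary is the only thing the dashboard has to read.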
--
About the JavaScript part: yes, it would be totally possible to add a JS tracker to TinyAnalytics (I'm currently thinking about adding one). You're right, it would be interesting to add this.
The main reason I wrote TinyAnalytics is to have a simple, super-quick dashboard where you see all your websites at a glance, without a single click.
More self-hosted analytics options are nice to see!
Why did you choose to write your own log-file instead of parsing server logs (which would work without my site being PHP, which for me would be a large benefit)?
I went with my own log because it just worked easily, and it doesn't require any configuration to match the server's logging policy (log layout can vary from one config to another).
Note: The logs are digested each day, so they will never grow to a super big file.
Most other analytics solutions (Piwik, OWA) also do their own logging.
I think it's a nice idea and probably fits the use case of some small blogs, but for anything that's more serious you'd want much more than just unique visitors and referrers.
If you have a website that is making you money (SaaS service, blog with ads, etc.), these two basic metrics mean very little. Segmentation, conversion funnels, device/OS info, etc. are necessities.
Basic web server log analysis stopped being mainstream in the early 2000s, and I see no reason to bring it back.
Over the years, I've noticed that I prefer to have a few (important) pieces of information about my websites that I can check each day in 30 seconds, rather than lots of information that would need 15 or 30 minutes per day of in-depth analysis.
(When it took me 15 minutes per day, I ended up looking at the analytics only once every 2 weeks or so.)
Well, fair enough, but what is stopping you from just not doing the in-depth analysis when using something like Google Analytics? I mean, the simple stuff is right there, easily and quickly accessible on the dashboard, with no need to follow up on any of it, right?
Good idea indeed! I hadn't thought about it. Anyway, learning how the GA API works would probably have taken me more time than I actually spent writing TinyAnalytics ;)
If you have time to share a ready-to-use GA API -> HTML renderer on GitHub (just the numbers would be enough, I'll be able to do the chart rendering), that would be awesome, and I'll probably use it too!
I started in Python for the "worker process" because I'm much faster at writing Python code for data parsing, digesting, etc.
The plan is to rewrite this file in PHP to have a 100% PHP solution: https://github.com/josephernest/TinyAnalytics/issues/5
If someone feels like doing it, I'd appreciate it :)
If you are not using Javascript, then just parsing Nginx/Caddy/Apache log files will give you the same information: https://github.com/josephernest/TinyAnalytics/blob/master/tr...
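For anyone who wants to try the server-log route, here is a rough Python sketch that pulls unique visitor IPs and referrers out of access-log lines. It assumes the server writes the common "combined" log format (what Apache and Nginx call `combined`; Caddy would need to be configured to emit it), and the names `LINE_RE` and `summarize` are made up for this example.

```python
import re
from collections import Counter

# Combined log format:
# ip ident user [time] "METHOD path protocol" status bytes "referrer" "user-agent"
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Return (number of unique IPs, Counter of referrers) for the given lines."""
    ips = set()
    referrers = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # line is not in the assumed format
        ips.add(m["ip"])
        ref = m["referrer"]
        if ref and ref != "-":  # "-" means no referrer was sent
            referrers[ref] += 1
    return len(ips), referrers
```

Usage would be something like `summarize(open("/var/log/nginx/access.log"))`, with the path adjusted to your setup. The regex has to match the server's log layout, which can vary from one config to another.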
Speaking of which, what compact datastore is recommended for this? An RRD?