Show HN: TinyAnalytics – A lightweight web analytics tool (github.com/josephernest)
47 points by josephernest on Feb 18, 2017 | 17 comments



I think a big part of modern analytics is a JavaScript component that runs in the browser. JavaScript can help you tell crawlers apart from real users, and repeat clients apart from users on large school networks (which all share the same source IP).

If you are not using JavaScript, then simply parsing Nginx/Caddy/Apache log files will give you the same information: https://github.com/josephernest/TinyAnalytics/blob/master/tr...
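For reference, a minimal sketch in Python of that kind of log parsing, assuming the standard Nginx/Apache "combined" format (the regex is an assumption; adjust it if your log format differs):

    import re

    # Matches the default "combined" access-log format; other formats need a different regex.
    COMBINED = re.compile(
        r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
        r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
        r'(?P<status>\d{3}) \S+ "(?P<referrer>[^"]*)" "(?P<useragent>[^"]*)"'
    )

    def hits(logfile):
        with open(logfile) as f:
            for line in f:
                m = COMBINED.match(line)
                if m:
                    yield m.groupdict()   # ip, time, method, path, status, referrer, useragent

    for hit in hits("access.log"):
        print(hit["ip"], hit["path"], hit["referrer"])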

Speaking of which, what compact datastore is recommended for this? An RRD?


The raw TinyAnalytics log is just temporary: it is digested every day into a summary and then removed. No need for a more complex DB than that, I think.
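A rough sketch of that "digest daily, then discard" idea in Python (the file names and the tab-separated raw-log layout here are hypothetical, not the actual TinyAnalytics format):

    import json
    from collections import Counter
    from datetime import date

    # Hypothetical raw log: one "timestamp<TAB>ip<TAB>page<TAB>referrer" line per hit.
    def digest(rawlog="hits.log", summary="summary.jsonl"):
        visitors, pages = set(), Counter()
        with open(rawlog) as f:
            for line in f:
                ts, ip, page, referrer = line.rstrip("\n").split("\t")
                visitors.add(ip)
                pages[page] += 1
        day = {"date": date.today().isoformat(),
               "unique_visitors": len(visitors),
               "top_pages": pages.most_common(10)}
        with open(summary, "a") as out:
            out.write(json.dumps(day) + "\n")   # append one summary line per day
        open(rawlog, "w").close()               # raw log is no longer needed: truncate it

    digest()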

--

About the JavaScript part: yes, it would be totally possible to add a JS tracker to TinyAnalytics (I'm thinking about adding one at the moment). You're right, it would be interesting to add.

The main reason I made TinyAnalytics is to have a simple, super quick dashboard in which you see all your websites at a glance, without a single click.


More self-hosted analytics options are nice to see!

Why did you choose to write your own log file instead of parsing server logs (which would work without my site being PHP, which for me would be a big benefit)?


I started this project because I tried both Piwik and OpenWebAnalytics, and I was a bit unhappy with them (see http://josephbasquin.fr/aboutanalytics).

Own log file because it just worked easily, and it doesn't require configuration to match the server's log policies (log layout can vary from one config to another). Note: the logs are digested each day, so they never grow into a huge file.

Most other analytics solutions (Piwik, OWA) also do their own logging.


Curious to know if you have tried out Countly (http://github.com/countly/countly-server)? If so, what do you think?


I think it's a nice idea and probably fits the use case of some small blogs, but for anything that's more serious you'd want much more than just unique visitors and referrers.

If you have a website that is making you money (a SaaS service, a blog with ads, etc.), these two basic metrics mean very little. Segmentation, conversion funnels, device/OS info, etc. are necessities.

Basic web server log analysis stopped being mainstream in the early 2000s, and I see no reason to bring it back.


> Basic web server log analysis stopped being mainstream in the early 2000s, and I see no reason to bring it back.

I can think of some reasons off the top of my head:

* It works even if the user is using a blocker;

* You don't have to add a script that's constantly making AJAX calls, so your site should load and run faster;


This is the reason, at least for me: https://news.ycombinator.com/item?id=13674984

15 or 30 seconds per day, instead of several minutes + twenty clicks in a row on UI elements (too annoying for me).

--

Yes, as I mentioned in the README.md, it's cool for small or medium-sized websites, but not for more advanced use cases.

But then, if you use OWA or Piwik, be prepared for bugs, a slow UI with the default settings (I tried several times, with different servers), etc.


> Basic web server log analysis stopped being mainstream in the early 2000s, and I see no reason to bring it back.

I still use it, and I find many discrepancies between the server logs and what appears in Google Analytics.

And having the raw NCSA logs allows me to run specialized queries on the data.
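As an illustration, a small Python query of that kind over a combined/NCSA log (the page path, the naive bot filter, and the file name are made up): hits per day for one page, skipping obvious bots.

    import re
    from collections import Counter
    from datetime import datetime

    hits = Counter()
    with open("access.log") as f:
        for line in f:
            m = re.match(r'\S+ \S+ \S+ \[(\d{2}/\w{3}/\d{4})', line)   # e.g. 18/Feb/2017
            if m and '"GET /pricing' in line and "bot" not in line.lower():
                hits[m.group(1)] += 1

    for day in sorted(hits, key=lambda d: datetime.strptime(d, "%d/%b/%Y")):
        print(day, hits[day])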


A bit of explanation about why I made this project:

Over the years, I've noticed that I prefer to have a few (important) pieces of information about my websites that I can consult in 30 seconds each day, rather than lots of information that would require 15 or 30 minutes per day for an in-depth analysis.

(When it took me 15 minutes per day, I ended up checking the analytics only once every 2 weeks or so.)


Well, fair enough, but what is stopping you from just not doing the in-depth analysis when using something like Google Analytics? I mean, the simple stuff is right there, easily and quickly accessible on the dashboard, with no need to follow up on any of it, right?


In Google Analytics, I would have to:

* click on "Websites" (top left of http://gget.it/5e8tyn6g/2.jpg)

* choose the right website in the list (http://gget.it/8jptjvso/3.jpg)

* wait for the page to reload

This, ten times in a row (once per website I manage), which is annoying.

Conclusion: more time spent clicking twenty times on UI elements than actually looking at the charts.

I like to go super fast: now I open http://mywebsite.com/tinyanalytics/ and have an immediate overview of all my websites in 10 seconds.


Or you could just fetch the most important data from the GA API and show it in the same kind of view you built for TinyAnalytics.


Good idea indeed! I hadn't thought about it. Anyway, learning how the GA API works would probably have taken me more time than I actually spent writing TinyAnalytics ;)

If you have time to share a ready-to-use GA API -> HTML renderer on GitHub (just the numbers would be enough, I'll be able to do the chart rendering), that would be awesome, and I'll probably use it too!
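Not a ready-to-use renderer, but for reference, a rough sketch of the fetching part in Python, assuming the (2017-era) Core Reporting API v3 with google-api-python-client and a service-account key; the key file name and view ID are placeholders:

    from googleapiclient.discovery import build
    from oauth2client.service_account import ServiceAccountCredentials

    # Placeholders: your service-account key file and your GA view (profile) ID.
    credentials = ServiceAccountCredentials.from_json_keyfile_name(
        "keyfile.json", ["https://www.googleapis.com/auth/analytics.readonly"])
    analytics = build("analytics", "v3", credentials=credentials)

    report = analytics.data().ga().get(
        ids="ga:XXXXXXXX",
        start_date="7daysAgo",
        end_date="today",
        metrics="ga:sessions,ga:users").execute()

    print(report["totalsForAllResults"])   # e.g. {'ga:sessions': '123', 'ga:users': '98'}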


Why are you invoking Python through a shell call inside a PHP script, instead of just doing all of that in PHP directly?


I started with Python for the "worker process" because I'm much faster at writing Python code for data parsing, digesting, etc. It's planned to rewrite this file in PHP to have a 100% PHP solution: https://github.com/josephernest/TinyAnalytics/issues/5 If someone feels like doing it, I'd appreciate it :)


Does this use JavaScript tracking as well?



