Matomo: Open-source analytics platform (github.com/matomo-org)
183 points by AbdHicham on Jan 6, 2021 | 131 comments



I've been using Matomo for years now and it works quite reliably. (A few updates failed the automatic updater, but nothing serious.)

The only thing that bothered me is that most ad blockers block Matomo as well. I built a little script to circumvent that; you might find it handy as well: https://gumroad.com/l/matomo_circumvent_adblock

I use it on my website. Check if your ad blocker is capable of blocking it: https://simon-frey.com


It's pretty disgusting to track people against their consent, even more so to circumvent their protections against tracking. I added your domain to my blocklist: https://github.com/lightswitch05/hosts/commit/bb2cd77c9ec028...


It's your call really, but a website owner tracking you with their own software on their own Matomo instance is not the problem. This is essentially the same as monitoring website logs... that's not disgusting at all.


I think grouping server-side tracking with JavaScript based tracking is an oversimplification. JavaScript tracking is much more invasive and can access significantly more data. From something as straightforward as fingerprinting to potentially even more invasive data such as geo-location, battery status, webcam, microphone - you name it. Server access logs aren't going to track my eyes.

I think we can all agree there are different levels of acceptable tracking and use of that data, but the degrees of acceptance are going to be different depending on the user and service. I don't consider bypassing my restrictions to run unauthorized code to be an acceptable tracking method, and it raises serious concerns about how the data will then be used.


Anyone can do all sorts of things. I can punch anyone I see on the street in the face. Doesn't mean they're actually doing it.

Now, I have a vested interest in this as I work on one of those tracking tools, but it actually collects less data than those Apache access_logs that people have been keeping for 25 years. Plus, the JS is unminified and easily examinable if you want (as is the HTTP request), so you also have more insight in what is being collected exactly.

"It's using JavaScript" and "it can do [..]" are massive red herrings; browsers are actually fairly sandboxed and there are millions upon millions of lines of code on your computer that can do much more than JavaScript inside a webpage.


> I can punch anyone I see on the street in the face.

Yes, and then you would be charged with assault. It is great that you work on a tool that respects people's privacy. I suppose I failed to put an emphasis on trust. With server-side logs, less trust is required because there is less that can be done. Paired with a VPN, I can have a reasonable belief that server-side logging is not logging anything unreasonable, and it does not require trust that they are not fingerprinting me. As you say, just because someone can do something doesn't mean they will - but trust is required, especially if there are no repercussions if that trust is violated.


OTOH, JavaScript tracking is an easy way to filter out a lot of the bots. I use a little bit of JS-based tracking for exactly this reason, but I'm not extracting anything that wouldn't show up in server logs. Eventually I also want to get some "time spent on page" metric so I have some idea how useful my blog posts are (are people clicking and leaving right away, or are they sticking around to read?). You pretty much need JS for this. In whatever case, web analytics like these aren't "tracking"; you're looking at user behavior on your own site, not trying to follow them around the internet or otherwise identify them.
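A minimal sketch of that "time spent on page" idea in plain JS, for the curious. The /collect endpoint is a hypothetical placeholder for your own backend, not anything Matomo-specific:

  // Report approximate time-on-page when the tab is hidden or closed.
  const start = Date.now();
  document.addEventListener('visibilitychange', () => {
    if (document.visibilityState === 'hidden') {
      // sendBeacon survives page unload, unlike a regular fetch/XHR
      navigator.sendBeacon('/collect', JSON.stringify({
        path: location.pathname,
        secondsOnPage: Math.round((Date.now() - start) / 1000),
      }));
    }
  });

As a bonus, requiring JS to run filters out the dumb bots for free, since most of them never execute scripts.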


Matomo doesn't track all that much. Screen size (which is reported inconsistently between Firefox and Chrome), visit time, unique user and visit IDs, and some e-commerce parameters if you set them, as well as whether your browser supports tech like Flash, Silverlight, etc. It's a slightly better server log.


When the GDPR was entering into force, I remember some speculating that monitoring Apache logs could violate it, since the user has not consented to having their personal details (i.e. the IP address) processed. What was the final consensus reached on this?


Ad blocking != blocking tracking. If you don't want to be tracked, turn on Do Not Track in your browser. Matomo and most other privacy-focused analytics scripts respect that setting.


While I agree that is the proper solution, most analytics do not respect the Do Not Track header. Beyond it being mostly ignored, Safari (which currently has 20% global browser share) removed support for Do Not Track in 12.1. So even though Matomo might respect the header request, there is no way for me to send that header on many of my devices. Blocking is the only solution left to me to 'opt out' of tracking regardless of the good intentions of Matomo.
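For what it's worth, honoring DNT on the script side is only a few lines. A rough sketch (navigator.doNotTrack is deprecated and the legacy window.doNotTrack fallback is vendor-specific, but this is roughly what "respecting the header" looks like):

  // Skip tracking entirely when Do Not Track is enabled.
  // "1"/"yes" mean the user opted out; null means unspecified.
  const dnt = navigator.doNotTrack || window.doNotTrack;
  if (dnt !== '1' && dnt !== 'yes') {
    // only inject the analytics script here
  }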


+1. I set Matomo to respect Do Not Track, and you can opt out of the tracking in my privacy settings.


Sure, so I have tracking protection too, both through uBlock Origin and Firefox's tracking protection feature. Yet here you are, bypassing my tracking protections.


If you're using Firefox tracking protection (which I'm guessing sends DNT as well), then Matomo by default does not track you. So no, your tracking protections aren't being bypassed.


Do-Not-Track is pointless and dead. Pretty much none of the trackers that actually matter pay one iota of attention to it.


It's pretty disgusting to access creators' content for free while blocking their attempts to monetize it.


Why not block access to the content then? You can't watch Netflix streams without paying for them, that's trivial to implement.

Ah, right, creators want their content to show up for my search keywords, Google won't let them have pages only visible to Google bots (though even that is changing with the rise of paywalled sites), and they want the money from that same Google showing ads from their ad network.

Google initially promised to deliver search for the open, unencumbered web. It has become a sort of paywall itself (accept our ads, or our search results will be useless, pointing you to pages that only work if you have ads enabled).

Sure, it would be fair if they hadn't pushed out the competition by acting entirely differently (from "we have no ads" and "our ads are clearly marked" to the current "see if you can tell the difference between an ad and your search results").


Unfortunately your script still calls a third-party domain, which is trivial to block using a generic AdBlock/uBlock rule. Instead, you should host the Matomo script (under a different filename, of course) on your own domain. That way it won't be as easily blocked.

I go as far as to send all the tracking parameters through a custom server script before they are proxied to GA and Matomo. That way, I can change the script and parameter names at will, making them much more difficult to block. For example, Matomo-related blocking rules are as follows:

  /matomo-tracking.
  /matomo.js$domain=~github.com
  /matomo.php
  /matomo/$domain=~github.com|~matomo.org|~wordpress.org
  /piwik-$domain=~github.com|~matomo.org|~piwik.org|~piwik.pro|piwikpro.de
  /piwik.$image,script,domain=~matomo.org|~piwik.org|~piwik.pro|piwikpro.de
  /piwik./ping?
  /piwik.js
  /piwik.php
  /piwik/*$domain=~github.com|~matomo.org|~piwik.org|~piwik.pro
  /piwik1.
  /piwik2.js
  /piwik_
  /piwikapi.js
  /piwikC_
  /piwikTracker.


If someone decides to explicitly block my Matomo tracking server, I am fine with that.

I experimented with tracking on the same site, and the overhead is not worth it for me. A central solution for all my projects works quite reliably.


Sure, if someone explicitly blocks you and you alone, that's fine. The problem is you getting blocked generically, because you're using the same scripts or patterns as everyone else, such that there exists a very wide and generic block rule in uBlock Origin or some other filter that happens to apply to your own domain. That's unacceptable and worth fighting against.


But that is exactly what the script is doing. It changes the structure to prevent being blocked generically.


Using a third-party domain means you're already blocked in most cases involving adblockers and non-standard CDNs.


Oh thanks. That is an insight I did not have so far!


Unless I am missing something, that's trivial to block.

Any tracker can be made to work around ad blockers by making callbacks to the site itself and having a small shim there that forwards these pingbacks to the actual tracking service. But even then they still can be blocked based on the request contents.
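To illustrate, the shim described above can be a dozen lines of Node; a rough sketch, with tracker.example.com standing in for the real tracking service:

  // First-party shim: accepts pingbacks on your own domain and forwards
  // them (query string intact) to the real tracker. No dependencies.
  const http = require('http');

  http.createServer((req, res) => {
    const qs = req.url.includes('?') ? req.url.slice(req.url.indexOf('?')) : '';
    const upstream = http.request(
      { host: 'tracker.example.com', path: '/matomo.php' + qs },
      (up) => { res.writeHead(up.statusCode); up.pipe(res); }
    );
    upstream.end();
  }).listen(8080);

And as said, it's still blockable by inspecting the request contents, since the tracking parameters themselves give it away.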

PS. Here's how your website looks in Firefox - https://i.imgur.com/uFKEB4X.jpg. That's with uBlock off. No console errors.


Weird that you don't see any text. In FF on Linux it works as expected. What system are you on?


Windows


FYI: The gumroad link is to a support license.

You purchase a support license to help me to continue working on MCAB. MCAB itself is Open Source and can be found on Github: https://github.com/simonfrey/matomo_circumvent_adblock


I've tried to use it and... they could work on improving the installation guide. I gave up after an hour or so of failing to provision the database and create configuration in a way that gets me to the initial setup screen. I was using their docker setup for this.

It's kind of my job to take random apps, deploy them, and manage them properly, so I'm not a clueless user here. I could press on and figure it out with more time, and I understand they'd be happy with people using the cloud offering / paid support instead. But I also feel like a working docker-compose (or comparable) setup is table stakes these days for an open-source service.

See Loki+Grafana for a good example: https://grafana.com/docs/loki/latest/installation/docker/#in... - it's not a production setup, but it's a valid "play around with it in 2min" setup.


It was pretty easy to get up and running with Docker Compose:

  version: '3'

  services:
      mysql:
          image: mysql:8
          environment:
              MYSQL_ROOT_PASSWORD: root
              MYSQL_DATABASE: matomo
      matomo:
          image: matomo:4
          ports:
              - 4000:80

This lets me in at localhost:4000 and I just enter "mysql" as the DB host, "root" as the username & password, and "matomo" as the database name, and it's basically done.

Of course, I should probably point out (or someone else will) that it's a bad idea to use the MySQL root user instead of creating a user with the rights that Matomo needs: https://matomo.org/faq/how-to-install/faq_23484/
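For anyone copying the snippet above: the official mysql image can create that dedicated user on first start via environment variables, so it's two extra lines (the password is a placeholder, obviously):

  services:
      mysql:
          image: mysql:8
          environment:
              MYSQL_ROOT_PASSWORD: root
              MYSQL_DATABASE: matomo
              MYSQL_USER: matomo        # created on first start,
              MYSQL_PASSWORD: matomo_pw # with rights on MYSQL_DATABASE only

Then enter "matomo"/"matomo_pw" in the installer instead of root.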


I’ve found that using geerlingguy’s Ansible roles for MySQL, Nginx, and PHP, most random PHP applications can be deployed with default configuration. I’ve had them in production with Matomo for the past year or so and had no problems so far.

A lot of the challenges faced with a 'from scratch' install will revolve around which PHP version and extensions to install and how to get Nginx to talk to FPM - neither of which is trivial for someone wanting to test/evaluate without much prior knowledge.
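For reference, the Nginx-to-FPM wiring is only a few lines once you know them; a minimal sketch (the web root and socket path vary by distro and PHP version, so treat both as placeholders):

  server {
      listen 80;
      root /var/www/matomo;   # wherever Matomo is unpacked
      index index.php;

      location ~ \.php$ {
          include fastcgi_params;
          fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
          # socket path differs per distro/PHP version
          fastcgi_pass unix:/run/php/php-fpm.sock;
      }
  }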


I did some research a while ago and found that https://github.com/crazy-max/docker-matomo dockerizes it best.


I switched to it about a year ago to get rid of Google Analytics. Quite happy with the decision.


My anecdotal experience is that people liked it better than GA. I also really like how easily you can extend it. In one case (a webshop), we added the currently logged-in user so we could track what they specifically look for, letting us improve the search and categories.


Where are you hosting it and how much do you pay for it? Do you have a lot of visitors?


Any cheap Hetzner dedicated (~30€/month) can handle tracking thousands of daily visits without breaking a sweat.

If we're talking about 10s of thousands, you're gonna need to invest in some SSD (€50-70/month probably).

If we're talking about dozens of sites, some of which have millions of yearly visitors (and a bunch of plugins and reports that need to be generated), then you're gonna encounter some issues and have to spend a considerable amount of time optimizing every part of it, and hardware cost will rise to a hundred or two per month.


To add another level: to track millions of requests per day for a couple of sites, it will probably cost around 1k-5k per month hosting on AWS. It's advised not to use the cheapest host here, because any little outage directly affects every site that implements your tracking technology.


> It's advised not to use the cheapest host here, because any little outage directly affects every site that implements your tracking technology.

I guess it depends on how essential your tracking is, and how you've implemented it. It shouldn't be added in a way that can take out your site unless there is some business critical reason to track.

Then I'd ask, just how critical is the tracking? If losing a few hours of data is going to throw off your product development, do you have enough data to be making decisions? My experience is that bugs and misconfiguration of experiments is common in most orgs, so even if the system is up to capture all data, product managers check an experiment a week later to find they have only 50% of the data.


While it makes sense for some projects to use AWS, Azure, Google Cloud, etc, you could track the same number of requests on Digital Ocean, Vultr or Linode reliably and for less money.


Formerly known as Piwik; it's been around for a long time.


Thank you :) I didn't know that, found it here as well https://piwik.com/


It's pretty easy to make your own analytics. It's not famous, but I open-sourced mine; it's called Bast, written mostly in Rust, and it's easy to deploy. https://github.com/kooparse/bast


I did the same for my Go library [0], but I don't think it's "pretty easy" if you want to do it right. Especially filtering out bots is a constant hassle (the naive sketch below shows why it's only a start); it needs to be tested and maintained, just like any other software. So paying something like $4 is worth it if you don't want to think about it as much.

You have a very good looking UI there. I really love the simplicity.

[0] https://github.com/pirsch-analytics/pirsch
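To illustrate why bot filtering is the hard part: the obvious user-agent check is trivial, but it only catches bots that announce themselves. Everything else (headless browsers, spoofed UAs) is where the constant maintenance goes.

  // Naive UA-based bot filter -- catches only the honest bots.
  const BOT_UA = /bot|crawler|spider|crawl|slurp|curl|wget|headless/i;

  function isProbablyBot(userAgent) {
    return !userAgent || BOT_UA.test(userAgent);
  }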


I used it a bunch back when it was Piwik (why did they change the name, it was great!) and have been quite satisfied.

Still, even though at least it isn't cloud-based, it's quite scary what kinds of things it will tell you about your visitors.


I'll bite: what kinds of things did you learn about your visitors?


Worth noting that they also have a WordPress plugin that handles all of the installation and related setup: https://wordpress.org/plugins/matomo/

I use it on a personal project site and it works very well.


Installed it on some websites recently. Quite pleased with the UI and the functionality. Pretty spot on in terms of the right level of information about users - providing useful info without stalking people inappropriately. The new version (4.0) seems to improve on some earlier stability issues.


Is the performance better than Google Analytics?


It is definitely snappier, yes. There are some sub-second loading times, which should not be surprising, but not the sometimes multi-second lags I have seen in Google Analytics.


Regarding data protection and the GDPR, a good thing with Matomo is that, if configured properly, it can be used without requiring the user's consent (since Matomo doesn't use the data for its own purposes). Of course, less information is collected, but at least you don't have to display a consent form as soon as a user enters your website.

The French data protection authority issued a piece of code (JS) which must be used to avoid having to collect the user's consent. I don't know about other data protection authorities in the EU, but it shouldn't be much different.


> it can be used without requiring the user's consent (since Matomo doesn't use the data for its own purposes)

This is not how the GDPR works. If you are collecting personal data, or if you are dropping analytics cookies on someone's device, you need consent. No ifs or buts.


You should apply to the CNIL since you seem to know GDPR better than they do. (https://www.cnil.fr/fr/cookies-solutions-pour-les-outils-de-...)

I never said no personal data was collected, but, if configured properly, the processing of data falls within the legitimate-interest basis.


I believe that page may be out of date, or they've updated their github repo prematurely.

https://github.com/LINCnil/Guide-RGPD-du-developpeur/commit/...

/edit Ignore me. I appear to have misunderstood the changes when I last read this. Sorry


It is true that an opt-out system must be installed on the website (Matomo provides that piece of code), but - as noted in the GitHub link you posted - that is very different from an opt-in system (which is the standard GDPR requirement).


GDPR does not require consent. If you use consent, then it must be freely given, but you can often use a different legal basis when processing personal data.

The ePrivacy Directive requires consent for reading or writing from a terminal device. This includes anything with cookies, even if they're not personal data. While the ePD refers to GDPR for its definition of consent, it is a separate piece of legislation and many things that are true about GDPR are not true about ePD (such as being able to invoke Legitimate Interest instead of consent).


Sure, but when it comes to cookies, consent is almost always required on the GDPR basis (other legal bases rarely work).

You're right to point to ePrivacy, to which consent is central. But the latest draft of its new version states (art. 8): "1. The use of processing and storage capabilities of terminal equipment and the collection of information from end-users' terminal equipment, including about its software and hardware, other than by the end-user concerned shall be prohibited, except on the following grounds: [...] (d) it is necessary for audience measuring, provided that such measurement is carried out by the provider of the information society service requested by the end-user or by a third party, or by third parties jointly, on behalf of the one or more providers of the information society service, provided that conditions laid down in Article 28, or where applicable Article 26, of Regulation (EU) 2016/679 are met."

So Matomo can still do without the user's consent (from what I understand, the relation between the GDPR and ePrivacy is no easy business).


> when it comes to cookies, consent is almost always required

We are in agreement. It seems I wasn't clear enough in my original post, but this is my overall point. GDPR doesn't require consent, but consent is required because of ePD.

> latest draft of its new version

> So Matomo can still do without the user consent

The new draft is not law yet. It's been 6 months away from passing for several years now. In the meantime, fines are still being issued under the existing law. Google got fined a hundred million euro last month in France, and that fine was very specifically ePD and not GDPR, for a variety of reasons.


> So Matomo can still do without the user consent (from what I understand, the relation between GDPR and e-privacy is no easy business).

It also depends on the jurisdiction. For example the ICO has been clear that using a cookie based analytics tool requires a GDPR level of consent, without exceptions.


This is not how the GDPR works. It lays out several legal bases for the collection of personal information, of which consent is one. There are others as well.

I'd have to re-read it to be sure about analytics cookies, but I don't think it says a whole lot about that off-hand. That's the ePrivacy Directive.


I have set up a load balancer with 3 instances of Matomo connected to one MySQL database to handle tracking on a website with around 7000 visits a day. It could all probably be handled by just one instance, but that is sort of the standard setup we have for things.

Matomo is very comparable to Google Analytics in terms of reports. Matomo has some things that seem a little easier to get to, like visitor flows.

However, Matomo seems to just give up on big-data, complex reports. Similar reports in Google Analytics take a long time to complete, 10 to 40 seconds, but they at least complete, eventually.


Isn't the performance limited by the MySQL database rather than the web server? Wouldn't it make more sense to have a cluster of three MySQL database instances and one Matomo instance instead?


I used Matomo once for a basic WordPress blog (with Cloudron), but for some reason it led to my site being flagged as distributing malware, and I vanished from search engines. Apparently there is a Microsoft form you need to fill out to get unflagged, where you explain what data your site is collecting, but I just took the site offline because I was too busy to dig into it. Extremely annoying, since the entire idea is to collect as little personal information as you can. It wasn't Matomo's fault; it's somebody's dysfunctional web crawler bot auto-generating reports.


Offtopic: what is your experience with Cloudron? Does it make it a lot easier to self-host apps, or are the complexity/limitations it adds on top enough to make it not worth using?


As a non-web developer, I always wondered:

Are these alternatives fully able to replace Google Analytics?

I sort of thought Google Analytics would tell you more about your visitors, since with Google cookies they could map them to other visited websites, centers of interest, age group, etc.

Are you losing all that when switching to a less intrusive analytics platform such as this, or is Google Analytics not leveraging its ability to disclose more about the visitors?


Google Analytics tells you more about your audience because it stalks people across the web. Matomo can never provide that without having a broad range of websites from which you collect data and writing custom code to annotate visitors with your own interest tags.

Matomo purely tracks analytics: who visited what page, for how long, from what device, from what location, from what inbound website, and what outbound links they clicked. It also provides a log of pages requested per session so you can analyze people's flows through your website.

It's certainly not a replacement for Google Analytics if you use it to collect background information on your visitors. Even though Google's information is very broad (you mostly get ranges and the interests aren't that reliable), some marketeers use it to make decisions about their marketing strategies. Matomo won't help you there, your alternative would probably be Facebook or another big tech tracking solution.

It does provide a replacement for the type of tracking that I personally find acceptable, assuming the IP addresses are anonymized sufficiently. Matomo recommends shortening IP addresses to /16 after analysis, which I consider good enough, but that's a setting administrators can change.
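For concreteness, masking to /16 just means keeping the first two octets of an IPv4 address. Matomo has this built in as a setting; the sketch below only illustrates the idea (IPv6 needs separate handling):

  // 203.0.113.42 -> 203.0.0.0, i.e. one of ~65k possible addresses
  function maskIpv4(ip) {
    const [a, b] = ip.split('.');
    return `${a}.${b}.0.0`;
  }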


> Google Analytics tells you more about your audience because it stalks people across the web.

That data is mostly garbage and only getting worse.


What information exactly does GA tell you from stalking people across the web? I don't think Google sharing with you accurate information about the people visiting other websites would be completely legal (GDPR). Where exactly do you find this information in the dashboard?


It's under Audience -> Demographics.

It's off by default. Turning it on gives you basic demographic data but also means you consent to sharing your GA data with Google to use for advertising purposes.


Importantly, it's always bucketed. You can get a breakdown for a group of visitors, but not the demographics for an individual.


Even if they don't share it with you, they do it for themselves.


Yes, that was my point: they probably get this data for themselves and for improving their own services, but as a webmaster you don't get all this data yourself.


Demographic data like: age (in buckets of 10 years), gender, household income for some countries, whether you're a parent [1] + Interests/"Affinities" [2]. I think these are derived from your Google search history and the sites you visit.

You do need to explicitly enable this in the GA dashboard, and ask users' consent under the GDPR.

[1] https://support.google.com/google-ads/answer/2580383?hl=en [2] https://support.google.com/google-ads/answer/2497941?hl=en


Hey there, we are working on Pirsch [0] (another GA alternative).

Whether you can replace GA depends on your needs. GA collects more personal data, so you get better insight into your audience. This is important if you do online marketing and want to see how well your campaigns perform. GA does track visitors across days, so you can see if someone came back after a week and made a purchase.

In case you don't do that, or are simply not interested in specifics, all the alternatives are good enough right now, I think. You can still tell how visitors navigate your page, what content they visit most, and all that stuff. We are currently thinking about what we can add to gain more insight for businesses without invading privacy the way Google does.

[0] https://pirsch.io/


Due to the number of people blocking Google Analytics with browser extensions, Pi-holes, and other tools, I find GA increasingly lacking for good analytics.


I assume most tools will block Matomo as well. I know uBlock origin with the default blocklist does.


This is why you self host it, to avoid a third party cookie.


Still doesn't help if you consider the GDPR et al., at least in the EU.


I think that using a self-hosted platform where you don't store any PII or cookies allows you to store visitor statistics without explicit consent.


Only half the truth, and a common misunderstanding. Define "visitor statistics". The GDPR is about personal information, yes, but the cookie directive is also about tracking features in general. See the current so-called "cookie verdict". To break it down: if you just count page impressions, you should be fine. Everything beyond that is complicated. (Besides, it does not matter where you host the data, on-premise or on foreign soil, or whether you use cookies or any other storage technology.)


In some countries (eg France), there are exemptions for tracking purposes if the tracking is done only to the benefit of the site's editor.


I meant it helps with uBlock type rules against domains.


The real kick is when you link both google ads and google analytics https://blog.littledata.io/2019/02/25/why-link-google-ads-ad...

Something you can't do when leaving google land.


29 euros a month for the managed, on-cloud version. Does anyone know inexpensive Google Analytics alternatives for small sites that are hosted for you?


https://pirsch.io/

$4/month if you pay annually or $6 to pay monthly, but free during beta.

We are actively working on it right now, but the core is working well and is open-source: https://github.com/pirsch-analytics/pirsch


Anyone know a Node alternative to GA? Matomo is great, but for something like GA, Node would be the best option for handling large amounts of HTTP requests.


I started using Umami (Node/Next) to replace GA on my major site (70k/30days). It provides the data I care about seeing, and nothing more. Public preview: https://analytics.kbg.rip

Project: https://umami.is


This looks really good. Kudos!


That's also a really cool alternative.


I'd also recommend this one. Relatively "basic" data but it's GDPR compliant and easy to install and update. Big thanks to the author!


Node? For handling large amounts of requests? What did I miss? lol, sorry, but I had to laugh.

I couldn't find any benchmarks. How many concurrent requests can it handle without errors for a simple hello world?

Never mind, I found a post: https://blog.rh-flow.de/2016/01/10/benchmark-helloworld-in-p...

As expected, Node is not the thing to handle large amounts of requests.


I use a PHP analytics platform on shared VPS hosting and it can track 1M+ monthly visits without any issues.

Why would Node be able to handle so many more HTTP requests than Apache or Nginx? I think the throughput is mostly dictated by the implementation.


Why would you need node to handle "large amounts" of http requests?


Of course you don't have to have Node.js to handle large amounts of HTTP requests; if you spend enough time, you can get any language/framework to handle the amount you need :)

But it seems that, at least in the TechEmpower framework benchmarks, es4x (JS) ends up in position 9, while the closest PHP framework ends up at 13. Now, it's just a small benchmark with specific tests, but I do think it's easier to make Node.js handle large amounts of requests than PHP. Although, again, you can definitely handle large amounts of requests with PHP too. I've spent about 5 years on each and found that getting good performance out of V8 is easier than out of PHP.


I've been using Plausible and don't miss a thing about GA so far.


Is 1.7k open issues something to worry about or is it normal for a project of this size?

PS: I have also been building something similar, but not completely open-source: https://www.usertrack.net


While I agree that 1.7k issues is a lot, also keep in mind that there are 9.6k closed issues, and that these are bugs and feature requests spanning 11 years that never get closed, since someone else might come across a feature request one day and want to implement it.


Fun fact that should be pointed out: most of those issues and PRs come from core team members. There is almost no community around it, but looking at these numbers you get the impression that there is.


It's normal for a project of that size. I've never had any major issues with it, and it's in active development, so it's good to see issues being reported.


I think it's more a consequence of not tidying up. There are irrelevant issues open from 2008. On the other hand, I prefer 1.7k open issues to a project that aggressively auto-closes tickets.


If projects are given the option to auto-close tickets (which, honestly, I don't mind - having loads of open stuff can hamper finding more recent/useful info), wouldn't it be helpful to have a filter to view auto-closed tickets vs. manually closed ones?


I also compare open issues to closed issues. If the numbers are roughly the same (or there are more open issues than closed), I'd say that's a problem.


Which parts of userTrack are open source? You say it's not completely open-source.


I say it's not completely open-source mainly because it's not free and people usually expect open-source to equal free.

Once you purchase it you get full access to the original server-side code (PHP, MySQL).

For the client-side part you only get the bundled JS/HTML/CSS (the original client-side source code is TypeScript/React), mostly because otherwise I would have to provide all the build tools and better document the code, tooling, building, releasing, etc.


> people usually expect open-source to equal free.

"Open source" has a specific meaning: is the software released under an open-source license? (https://opensource.org/licenses) For example, if you pay enough, you get the MS Windows source as well - that doesn't make it "not completely open source". Your project doesn't seem to be open source at all.


Sorry for the misunderstanding then, I might be using the wrong terminology.

I have seen many other products that are marketed as "open source" because you get the source code after you purchase it, so it is literally "open source", but not "open-source" as in released under an open-source license.

I am personally not marketing userTrack as open-source, and I will stop using similar terms if people do have a strong opinion about what "open-source" actually means.


I think the term 'shared source' was coined to describe that particular business model (under certain conditions, the code is shared, and perhaps modifications may be allowed in some scenarios, but no redistribution).


From your license agreements (this language appears in all 3):

You are NOT allowed to:

Redistribute in any way any of the userTrack files or any parts of the userTrack's source code (with the exception of the public tracker JavaScript files that have to be included on your site).

Install userTrack on someone else's server.

Continue using userTrack or offering userTrack access to others after this license agreement has been voided (either via a refund, license period expiration or legal action).

This is not open source (or even "fair code" as Redis etc. advocate for). Providing the source under a license like this is usually referred to as visible source or shared source.


You are correct, I did confuse the terms "visible source" with "open source".

The way userTrack is currently distributed is like any other digital product (you pay for it and are not allowed to sell or redistribute copies of it), with the mention that the server-side code is un-compiled and un-obfuscated, so you can transparently see what it does and how, and change it if you want.

I am not sure that fully open-sourcing it is the way to go, as I've seen so many projects die or disappear because the maintainers didn't have much incentive to keep improving them or simply no longer had time to work on them. I also think it's fair to pay for something that brings value to you, knowing that by paying for it you support its further development.


Did you change the license? I think I stumbled upon your website in an HN comment a few months ago. Wasn't it free?

Great product and an excellent demo!


Thank you! userTrack was never free, but I did change the pricing model from lifetime, to yearly, to now a one-time payment plus yearly payments for updates.

I would love to make userTrack free if I can find a sustainable way to work on it. Most other similar open-source software offers a "hosted" version to get revenue, but my goal is to promote decentralization and self-hosting in general, so focusing on a hosted version would go against my goals and beliefs. I really want to see a future where any non-technical person can choose a few products and have them running on their own VPS/server in a few clicks. This would have many advantages for clients AND for developers:

* Clients pay a lot less for products

* Developers must focus more on product and performance, leading to higher quality products.

* Hugely increased privacy for the average internet user and for the client's own data

* Better performance (each client has their own server so it is more likely to have more resources)

* Better latencies (each client can choose to use/host their product on a local datacenter)

* Better data transparency, easier migrations and fewer vendor lock-ins (if you own the server and the data on it you can most likely always export it in some form)

I think there are many other advantages for both companies and clients. The current SaaS environment makes it really easy for companies to ask huge amounts of money for services just because they want to, as the client has no real alternative unless they are really technical and can spend days installing and maintaining self-hosted software that rarely gets updated.


> userTrack was never free...

Sorry if it seemed like I was complaining about the pricing change. I was just wondering whether I remembered it correctly from here (https://news.ycombinator.com/item?id=24207129)

It's a great product. People will pay for it.


No worries, I was just making the history of the pricing structure clear.

Thank you for the kind words. I love working on this project and hope to be able to continue. Existing customers absolutely love it and keep recommending it, but I am still struggling to find a pricing structure that makes sense for everyone.

I do hope that one day I will find a way to make userTrack free for everyone, but looking at Matomo, making a project open-source seems to drastically slow its development, as there are so many more people involved and so many more decisions to be made. Apart from that, I would still have to earn a living somehow, and if I get a job and keep userTrack open-source, I won't be able to spend much energy on maintaining it - and I hate not being able to make a product as good as it can be.


I understand now, thank you for clarifying :)


A startup I worked with used Piwik (now Matomo) and defrauded investors with fake visits by tampering with the DB. Any VC should be actively involved in and knowledgeable about the analytics presented to them in order to avoid being ripped off.


I'm not sure if this is specific to Matomo. You could use headless agents over web proxies around the world to inflate Google Analytics as well. It just costs less to do it in your own DB.


I used to use Matomo, but I found GoatCounter simpler to set up and maintain. I also like the UX more [0].

[0]: https://stats.arp242.net/


Has anyone evaluated Matomo vs. Countly vs. PostHog?


Does anyone know how to get it to show the full referrer URL? I use a self-hosted Matomo and it only shows the referrer domain.


I don't think that's always possible, as it's a browser security limitation. The referring site can decide to pass on only the domain, not the full URL.

https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Re...
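Concretely, it's the referring site that decides, via the Referrer-Policy header (or the equivalent meta tag). With the policy modern browsers now default to, cross-origin requests only carry the origin:

  # Response header set by the *referring* site, not yours:
  Referrer-Policy: strict-origin-when-cross-origin
  # -> other sites (and your Matomo) see only https://example.com/,
  #    never the full path of the linking page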


Would recommend Countly. Not affiliated.


Anyone here have experience comparing this against Open Web Analytics (OWA)?


I just wish it used ClickHouse as its persistence layer.


Maybe "website analytics" is more appropriate? The word "analytics" is taking on a new meaning these days.


A slightly different angle, also open source: Mautic.


I started writing a backend in Go. So far it parses the request, turns it into a struct, and saves it to MongoDB. However, 2 requests per page are recorded, and I have no idea why that happens. One has 24 or 25 fields filled, the other 15-19 fields.

Anyhow, that's as far as I got. The next and largest step would be analyzing those records, aka creating all those beautiful graphs, plus creating a management interface for sites and generating the tracking code. While it's open source (search for gopiwik), it still uses the mgo.v2 driver; just yesterday I added Go module support and made some minor changes.

I thought, why reinvent the wheel, let's just grab the piwik.js and build a receiver. Turns out it's a bit more complicated than that.


No screenshots, no examples in the readme, builds failing.

Nah.


There's some of that on https://matomo.org/, not sure why OP linked to github instead.


Open source does not make it okay. Do not spy on people. It's just that simple.


What do you use to track whether your site is growing or shrinking in popularity? Server logs? If so, I'm curious whether not being able to filter out bot visits is a problem.


So, you shouldn't be allowed to know the conversion rate on your page?


It's easy enough to run $nconversions / $nrequests without spying on anyone.


But $nrequests is not accurate, as it can be 10x or 100x more than the number of visitors, so your conversion rate will be 10x or 100x off.
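One rough way to narrow that gap from logs alone is to dedupe requests into approximate visits by (IP, user agent, day). A sketch, assuming the log lines are already parsed into records with ip, userAgent, and date fields:

  // Approximate $nvisitors from parsed access-log records.
  // Crude (NAT, shared UAs), but far closer than raw request counts.
  function uniqueVisits(records) {
    const seen = new Set();
    for (const r of records) {
      seen.add(`${r.ip}|${r.userAgent}|${r.date}`);
    }
    return seen.size;
  }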





