We ditched Google Analytics (missiveapp.com)
212 points by AdriaanvRossum on June 1, 2020 | 74 comments


On my personal blog, I just decided to go without any analytics. I feel like I'm writing for myself and to engage a close group of friends. Maybe make some new friends along the way.

I don't want to blog for page views or retweets, it just feels like it creates a bad set of incentives. Ultimately the feeder bar of vanity metrics feels so draining. The important post that 5 people deeply appreciate is just as valuable as a shallow clickbait listicle post that 1000s of people scan through.


This! Exactly this! The moment you start worrying about which posts/pages people looked at is the very moment you start getting sucked into the vortex that's called "pandering to the crowd". The extreme end of that is "clickbait content".

Down that road I wish not to go!


Yes, there are much better ways of getting user feedback, for example a comments section.


Or simply an email or a contact form.

The reality is that one usually does not have large discussions, only one-to-one exchanges (if any). Email is fine for that, and you can always update the post if there are interesting outcomes.


Disqus? Just joking. Any good alternatives?


I've seen a bunch of people just link to HN and reddit discussions of their posts. On my blog I am using utteranc.es, which lets people make comments using GitHub issues. It does require making a GitHub account though. Here's an example of some comments on a post: https://blog.kdheepak.com/writing-papers-with-markdown.html


/r/exlurker? (Only half joking!)


I haven't looked at personal website analytics for years. I left them out of my current blog redesign.

Like you, I don't maintain this blog for the fame and glory.

However, I depend on analytics for my monetised website. I depend on that website to make rent. I'd still like to find a less invasive solution.


Nice, idea for someone (I wish I had time)...

"Good Housekeeping Seal" for Analytics

- Self host analytics

- Don't use google analytics

- Don't use tracking pixels (FB, Reddit, etc)

- etc

"Good Housekeeping Seal" for Billing Practices (in same vein, a little off topic though)

- Can cancel your subscription online

- Sends reminders prior to charging card (dark pattern on some sites)

- Open receipt API to scrape out historical spend data - started some thoughts here- https://github.com/HipSpec/invoice-data-api-specification

- No sneaky trial => subscription without warning

- etc

====

I think most of us here acknowledge that privacy or "how the sausage is made" generally doesn't sell to businesses & consumers. But I think there's some fertile ground between a self attestation (PCI style) and a $40-50k SOC Audit.

There are a ton of incumbents that at this point could not strip out spyware from their application/data stacks even with 100 human years of engineering effort... could be a competitive advantage for smaller/newer organizations to leverage.

Good Housekeeping reference: https://www.goodhousekeeping.com/institute/about-the-institu...

Thanks for sharing the article, SimpleAnalytics, Posthog & Fathom are steps in the right direction. Also shout out to Matomo/Piwik, one of the OG's in self hosted/ Google analytics alternatives.


Together with jivings I'm working on an ethical code of conduct [1]. Please add more if you want. Happy to make this a standard for websites. Maybe add a seal as well.

[1] https://hackmd.io/EmPIHGhTRh6pDSJ1VEgkkA


Looks good, but needs additional specificity / testability.

> No hidden costs

What is a "hidden cost"? Sometimes pricing is complicated. For example, shipping and sales taxes vary based on the customer's location.

> No making it difficult to cancel/unsubscribe from a plan

Maybe: "An authenticated user must be able to review their past and upcoming charges within 2 clicks from the default view. This page must provide immediate options for cancelling/unsubscribing (2 additional clicks to allow for confirmation)."

Or weaker: "Users must be able to cancel/unsubscribe by any mechanism that they can use to sign up / subscribe. For example, if users can purchase a subscription on the website, they cannot be required to make a phone call to cancel that subscription."

> Automated emails to not self generated mailing lists/social platforms

I don't understand what this means. Are you trying to prevent the companies from using third-party advertising targeting? That seems like an unreasonable ask. It would prevent using Google/Facebook/Twitter for basic marketing tasks.

> No spammy follow up emails

This is not testable. It would be more valuable to identify quantitative best practices and publish those, e.g. "When a user cancels a subscription, do not send them more than 1 marketing email per month asking them to re-subscribe."

> Allow recipients to easily unsubscribe from mailing-list emails

This should be covered by the "no making it difficult to cancel" clause.


Incidentally, California state law requires transparency around cancellation of subscriptions. Including online cancellation if the sub was started online.

https://blogs.findlaw.com/common_law/2018/07/what-to-know-ab...


Despite that you would be surprised how many companies still make it difficult and employ dark patterns here. Apparently the WSJ only allows that flow for California residents?

https://m.signalvnoise.com/subscription-hostages/


> Can cancel your subscription online

I think this is legally required in EU at least? I mean, if you started the subscription online. Same with the trial I believe.


Despite that, many companies employ dark patterns where you can take that action if you are in the EU, but they add friction in other markets.


Call me old-fashioned, but if developers really want to minimize their impact on visitors, I wonder why they shifted away from using plain old log files. In the screenshot of the dashboard, "screen size" is the only attribute I see which can't be derived from a web server request log entry.

I was using Analog to analyze Apache logs back in the 1990's -- it's older than JS.
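
For anyone who has not done it this way, here is a minimal sketch of the log-file approach. It assumes the server writes the common "combined" log format; the sample lines, the list of static-asset suffixes and the function names are all made up for illustration, and this is not how Analog itself works.

```python
import re
from collections import Counter

# Apache/nginx "combined" log format. The sample lines further down are invented.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def page_views(lines):
    """Count successful GET requests per path, skipping obvious static assets."""
    static_suffixes = (".css", ".js", ".png", ".jpg", ".ico", ".woff")
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        if entry["method"] != "GET" or entry["status"] != "200":
            continue
        if entry["path"].lower().endswith(static_suffixes):
            continue
        counts[entry["path"]] += 1
    return counts


SAMPLE = [
    '203.0.113.7 - - [01/Jun/2020:10:00:00 +0000] "GET /blog/post HTTP/1.1" 200 5120 "https://news.ycombinator.com/" "Mozilla/5.0"',
    '203.0.113.7 - - [01/Jun/2020:10:00:01 +0000] "GET /style.css HTTP/1.1" 200 900 "https://example.com/blog/post" "Mozilla/5.0"',
    '198.51.100.4 - - [01/Jun/2020:10:05:00 +0000] "GET /blog/post HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [01/Jun/2020:10:06:00 +0000] "GET /missing HTTP/1.1" 404 300 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    print(page_views(SAMPLE))
```

Path, referrer, user agent and timestamp all come straight out of the request line, which is the point being made above; screen size is the one thing that never shows up here.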


I think it's part of a larger trend of outsourcing logic/services to outside SaaSs. Reminds me of that image posted here on HN of a spider-web of all of the SaaS products that just one company used centered around Salesforce.

For almost every SaaS one can always say "yeah but I can do it myself by doing A, B, and C" but the fact is that time is precious, and if you outsource parts of your website/product, it means for $19 a month you're getting a product that is worked on full-time by 4 engineers.


Analytics from server logging breaks in the cases where either:

- Page navigation often happens client-side (so most dynamic SPAs)

- Content is often cached and/or served from a CDN (so most non-trivial static sites)

In reality, nowadays most page loads don't ever touch the origin server, for a wide range of good reasons. Including analytics in the client-side page sidesteps that whole problem.


CDN could give you your logs though. I think cloudflare gives you analytics for example (not very detailed in the free version, but you can get more by paying).


AFAIK CloudFront also lets one store access logs.


> Page navigation often happens client-side

But most will still load the content from the backend, so you can count those requests.


Because modern web apps/sites don't make tons of requests to look at in logs - they load up the SPA bundle and all the magic of navigation/etc. happens in the browser, with round trips back only to fetch occasional new resources.


Because a lot of times things are hosted on services where you don't have access to raw log files (Netlify for example).


And in the Netlify example, they'll sell you analytics based on the logs (adblocker proof) for $9/month. For one case on a site I operate, the numbers are ~2x higher than GA.


This was always the way back before GA really gained traction - a lot of the old school log monitoring analytics basically equated requests with hits (with only some slight filtering) and so the numbers were always inflated (I never normally got to 2x, normally to about 1.5x - but I guess it depends on the implementation's filtering of bots, spam IPs etc... Edit: remember we were using AWStats).

Was great being able to spin the traffic figures on our game review site for PRs who didn't understand web analytics at that time.


2x is usually just unique visitors, because most people block ads and trackers (at least those who visit tech blogs).

Counting all requests will be much more than 2x the GA visitors.
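
To make the hits-versus-visitors gap concrete, here is a rough sketch. The records are invented, already-parsed log entries, the bot filter is a crude substring match (real lists are far longer), and "visitor" is approximated as a distinct IP plus user agent pair, which is an assumption and not how GA or Netlify count.

```python
# Hypothetical, pre-parsed log records: (ip, user_agent, path).
RECORDS = [
    ("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)", "/"),
    ("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)", "/about"),
    ("203.0.113.7", "Mozilla/5.0 (X11; Linux x86_64)", "/blog"),
    ("198.51.100.4", "Mozilla/5.0 (Macintosh)", "/"),
    ("192.0.2.9", "Googlebot/2.1 (+http://www.google.com/bot.html)", "/"),
    ("192.0.2.9", "Googlebot/2.1 (+http://www.google.com/bot.html)", "/blog"),
]

# Crude substring filter, for illustration only.
BOT_MARKERS = ("bot", "crawler", "spider")


def is_bot(user_agent):
    """Return True when the user agent contains an obvious bot marker."""
    agent = user_agent.lower()
    return any(marker in agent for marker in BOT_MARKERS)


def summarize(records):
    """Return (raw_hits, human_hits, unique_visitors) for the given records."""
    raw_hits = len(records)
    human = [record for record in records if not is_bot(record[1])]
    # Approximate a visitor as a distinct (ip, user agent) pair.
    visitors = {(ip, agent) for ip, agent, _path in human}
    return raw_hits, len(human), len(visitors)


if __name__ == "__main__":
    print(summarize(RECORDS))
```

With this toy data, six raw requests shrink to four human hits and two visitors, so which of the three numbers a tool reports matters a lot when comparing it against GA.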


Maybe they should not sell out their users like this and host their own files.


Logs are missing some information I occasionally need to make decisions:

- Browser and screen resolution

- Time on page

- Tracking of on-page events


I also made the switch two weeks ago and I don't miss anything. I even removed google-fonts from my site and host the .woff files myself, so I don't force my users to send requests to Google. I replaced GA with my own self-hosted analytics platform[0] and I plan to add a lot more privacy features to it.

[0]: https://www.userTrack.net


I put this line:

This site collects no analytics and calls no third-party scripts

On the footer of https://remotivo.com and I've already had a couple of people comment that they thought it was a nice touch. I just built this site/bot for a fun side-project, I don't care how many views it gets.


I have been in the process of moving all of my stuff off of Google services (while I have an excessive amount of free time during Shelter In Place). So I recently moved from Google Analytics to https://usefathom.com . I quite like it. I would like to see how many people read my stuff, but I value my privacy and would like to do the same for others. It's pretty good, and


Fathom has been investing a lot of $ in legal / privacy consultants recently. Excited for the next steps :)


My problem with usefathom is that their analytics hasn't been proved in court or by a European GDPR data watchdog to be in the clear of not storing personal information. If you only want to ditch GA because you don't trust Google, then I'd also use usefathom.

I think what they do is very clever, but we settled on SimpleAnalytics instead (not as sophisticated and less analytics, but determining unique visitors by referral is cleaner and enough for us).
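
A rough sketch of what "unique by referral" could look like, assuming the heuristic is simply: a page view counts as a new visit when the referrer is empty or points at a different host. That is my reading of the idea, not SimpleAnalytics' actual code, and the function name and URLs are made up.

```python
from urllib.parse import urlsplit


def is_unique_visit(page_url, referrer):
    """Treat a page view as a new visit when it did not come from the same site."""
    if not referrer:
        # Direct visit, bookmark, or the referrer was stripped.
        return True
    page_host = urlsplit(page_url).hostname
    referrer_host = urlsplit(referrer).hostname
    return page_host != referrer_host


if __name__ == "__main__":
    print(is_unique_visit("https://example.com/pricing", "https://news.ycombinator.com/"))
    print(is_unique_visit("https://example.com/pricing", "https://example.com/"))
```

Nothing is stored about the visitor, so no cookie or identifier is needed. The trade-off is that someone arriving several times from outside is counted each time.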


I'd still prefer Fathom because the data isn't fed into Google's fine-tuned tracking machine. Even if it's not a perfect solution, it's likely better than Analytics in that regard


You can talk to the Google Analytics API directly and control exactly what is sent to Google's servers. This also allows you to create custom identifiers for users or sessions, or to track no personally identifiable information at all. A good starting point is Minimal Analytics, which also removes lots of unnecessary bloat:

https://minimalanalytics.com/
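
For illustration, this is roughly what a hand-built hit looks like with the Universal Analytics Measurement Protocol (v1), the API that was current when this was written. The sketch only builds the request body and does not send anything; the tracking ID is a placeholder, the helper names are made up, and it is not the Minimal Analytics code.

```python
import uuid
from urllib.parse import urlencode

# Universal Analytics Measurement Protocol (v1) collection endpoint.
COLLECT_URL = "https://www.google-analytics.com/collect"


def random_client_id():
    """Generate an anonymous client ID that you control."""
    return str(uuid.uuid4())


def build_pageview(tracking_id, client_id, host, path, title=""):
    """Build the form-encoded body for a single pageview hit."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # property ID (placeholder such as UA-XXXXX-Y)
        "cid": client_id,    # anonymous client ID of your choosing
        "t": "pageview",     # hit type
        "dh": host,          # document hostname
        "dp": path,          # document path
        "aip": "1",          # ask GA to anonymize the IP
    }
    if title:
        params["dt"] = title  # document title, optional
    return urlencode(params)


if __name__ == "__main__":
    body = build_pageview("UA-XXXXX-Y", random_client_id(), "example.com", "/blog/post")
    print(COLLECT_URL)
    print(body)
```

Everything Google receives in the payload is whatever you put in `params`; the request itself still carries the sender's IP address and headers.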


That's cool! Isn't the client-side request to `www.google-analytics.com` enough for Google to track users between sessions and domains?


Yes definitely. They collect the IP address and a browser fingerprint at the very least.

I'm not sure if this is doable by proxying through the server, so only custom events (with an unrelated IP - the server's one) are sent.


Google Analytics never uses browser fingerprints. The default tracking scripts only use first-party cookies. If you call the GA API directly, you don't even need cookies or local storage. This will make some of the reports meaningless, of course. The user's IP address is obviously sent, but it can be truncated by setting an option.


> The user's IP address is obviously sent, but it can be truncated by setting an option.

The IP address is inherently sent due to how the internet operates (unless you proxy GA calls through your server, which I'm not sure is even possible).

The option to "truncate" the IP address just tells Google you don't want to store it for your analytics. It has no effect on whether Google still keeps it on their side for their own benefit.
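
For reference, this is what truncation means mechanically if you do it yourself before data leaves your own server, for example in a proxy. The sketch assumes the commonly described scheme of zeroing the last octet of an IPv4 address and the last 80 bits of an IPv6 address; the function name is made up, and it says nothing about what Google retains on its side.

```python
import ipaddress


def truncate_ip(address):
    """Zero the last octet of an IPv4 address or the last 80 bits of an IPv6 address."""
    ip = ipaddress.ip_address(address)
    # Keep the first 24 bits of IPv4, or the first 48 bits of IPv6.
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


if __name__ == "__main__":
    print(truncate_ip("203.0.113.77"))
    print(truncate_ip("2001:db8:abcd:12:34:56:78:9a"))
```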


Is anyone using an open source statistics system that's as simple as simple analytics? Maybe even something that can be setup to run stats for multiple clients from our own VPS?


A relatively simple log analyzer, the kind we used to use before the Analytics craze, is GoAccess: https://goaccess.io/

If you want something more complex, Analytics style, you could try Matomo (Used to be called Piwik): https://matomo.org/


Shameless plug, but PostHog is MIT licensed, open source product analytics [0]. You can use it for super simple pageview analytics and even give all of your clients their own team on one instance.

If you want something more advanced, we're comparable in features to Amplitude or Mixpanel too, but more engineering focussed.

[0] https://github.com/posthog/posthog


Freshlytics - https://github.com/sheshbabu/freshlytics

* Cookies are not used

* Personally identifiable information (PII) is not collected

* See the pageview in different dimensions like page urls, referrers, browsers etc

* Supports multiple projects

* Supports RBAC

Screenshots - https://github.com/sheshbabu/freshlytics/blob/master/docs/sc...


Looks useful, but it depends on PipelineDB, a PostgreSQL extension for streaming data. Unfortunately PipelineDB hasn't been updated since May 2019 [0] when they were acquired by Confluent [1]. The former PipelineDB team appears to be focused on Confluent's KSQL product [2]. There's an open source "ksqlDB" but it appears to depend on Kafka, so it's not a 1:1 replacement for PipelineDB[3].

[0] https://github.com/pipelinedb/pipelinedb

[1] https://www.confluent.io/blog/pipelinedb-team-joins-confluen...

[2] https://www.confluent.io/blog/confluent-cloud-ksql-as-a-serv...

[3] https://ksqldb.io/quickstart.html


Yes, I'm relying on the "continuous views" feature of PipelineDB which is like autorefreshing materialized views. I'm planning to swap PipelineDB with TimescaleDB in near future

Most of the heavy lifting is done by Postgres/PipelineDB with Node.js as a simple wrapper so it's both performant and consumes less resources.

http://docs.pipelinedb.com/continuous-views.html

https://docs.timescale.com/latest/using-timescaledb/continuo...


I really like the look of this. And I really like the response here regarding "blockers" [0].

Is there any documented way to let users opt-out sans ublock:o (or other)? Or configure it to respect "Do Not Track"?

[0] https://github.com/sheshbabu/freshlytics/issues/12


Thanks!

> Is there any documented way to let users opt-out sans ublock:o (or other)? Or configure it to respect "Do Not Track"?

I initially wanted to implement DNT but there's a lot of confusion about whether it's useful or not. I'm not sure if there's a standardized way for users to opt-out.
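
If someone does want to honor it, the server-side check is small. A minimal sketch, assuming the collector sees the request headers as a plain dict; the function name is hypothetical and this is not Freshlytics code.

```python
def should_track(headers):
    """Return False when the request carries the header DNT: 1."""
    # Header names are case-insensitive, so normalize before looking up.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("dnt") != "1"


if __name__ == "__main__":
    print(should_track({"DNT": "1", "User-Agent": "Mozilla/5.0"}))
    print(should_track({"User-Agent": "Mozilla/5.0"}))
```

A missing header or `DNT: 0` is treated as "no preference expressed" here, which is a design choice; a stricter collector could require an explicit opt-in instead.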


How does it track individual users without cookies?


Yes, it doesn't actually track individual visitors as we can't use cookies or any fingerprinting methods.


There is also https://plausible.io/

I personally currently don't use any analytics, but they claim to be simple and lightweight so I keep an eye on them in case of future use.


Shameless plug, I wanted to develop a simple tiny stats dashboard using firebase as a backend: https://github.com/Karalix/feu-analytics

However, while it does what it is intended to do, it needs further dev that I am too lazy to do.


I would say go with Countly [1]. It is both for mobile, web and desktop, unlike several products mentioned here.

[1] https://github.com/countly/countly-server


GoatCounter[1] was featured on HN at some point. Open source, written in Go. It has a "subsite" feature which I haven't played with.

[1] https://www.goatcounter.com/


Question: If you don't use google analytics are you penalized in google SEO?


No. Google employees have denied that on multiple occasions[1]

[1]: https://plausible.io/blog/google-analytics-seo


The answer to "will using other analytics penalize you?" is "probably". Now let's rephrase the original question, "Will using other analytics penalize you more than using ones from Google?".


Absolutely not. Deep dive here: https://usefathom.com/blog/google-analytics-seo


To remove google analytics, I will be implementing snow-plow. First PoC was quite good.

Dashboards via BigQuery + Google Datastudio + Custom stuff.


I ran for quite a while without any analytics after removing GA. I've now settled on Fathom for Kubestack.

It was important to me to be mindful about the privacy of my visitors but at the same time I need some data to see if I'm on the right track. Fathom seemed like a good compromise.


How private is using localStorage for ID persistence instead of cookies for Google Analytics - and in general? It's what I'm using, but obviously it takes an expert to figure out just how much of an improvement it is.

Apparently Service Workers can also be used.

https://developers.google.com/analytics/devguides/collection...


localStorage is no different than cookies from the perspective of privacy laws. https://softwareengineering.stackexchange.com/questions/2905...


The biggest advantage of localStorage is that it is, as its name implies, "local" to the site you are visiting. Cookies can be shared across domains and sites you visit, but localStorage cannot. If on one domain you store `localStorage.ID = 5`, you cannot read that from another domain, as `localStorage.ID` will be undefined in a different domain context.


Seems to me like LocalStorage in this case is just a way to get around people deleting/blocking cookies. What would be the legitimate use?


> What would be the legitimate use?

Caching/storing GUI state and user configuration.

You (the server) don't need to know about LocalStorage contents. You can read and write from it via JS without ever sending that data somewhere. It actually improves privacy if done this way, because then the user owns the data and you just act upon it.

Unfortunately Apple thinks otherwise, they clear LocalStorage in Safari after a while by default.


Yeah we use this on our Chrome Extension Next Up to avoid sending data to our server. Keeps everything cached locally and lot less data in transit


See my comment above, localStorage can not be shared between domains as cookies can. So it's a lot better for privacy. With localStorage you can only track multiple visits of a user on the same site, not across sites.


Both Google Analytics and Google Tag Manager are blocked in NoScript for me. If your site is using them, they're useless for me and my actions on your site. Also, every single one of my clients is blocking them too. Part of my talk with my clients is to also make them install NoScript + uBlock Origin.


I was just in this place two days ago, wanting some basic analytics from nginx, but not wanting to add GA tracking to my site.

I found: https://goaccess.io/

Had not heard about Simple Analytics before but I will check it out!


Let me first of all congratulate you. And secondly throw my hat in the ring. We offer a commercial alternative to Google Analytics and have seen massive interest in this space.

Please let us know if you are looking for alternatives.


My personal blog also ditched GA and uses an alternative tool without cookies. Admittedly, it's harder for businesses whose management is steadily being replaced by analytics.


Sounds great! Though I think GA has a "more-private" mode (though it still requires cookies).

Their product looks interesting, something like a collective inbox + Google Wave builtin

> How to run analytics without a consent banner? It is simple; don't use cookies nor collect personal information.

Again, please. Louder for the people in the back. Thanks


if all you need is counting a number of users, then you can use cloudflare analytics


Does Cloudflare have decent user permissions yet?

Last I checked, it wasn't possible to grant permissions to (for example) a marketing employee who you don't also want permission to edit DNS records.

Edit: I see "Role based account access" is a feature of their "Enterprise" / "request pricing" tier...


Why don't you use Mixpanel or something? You'll write the code yourself, so you won't have to use JavaScript snippets that do shady things with cookies etc.



