Had nothing but problems with their product over the years. In 2012, they had a bug in their analytics.js which resulted in me losing about 10k new users. Last year, I was experiencing multi-day delays between an event and it appearing in Mixpanel, Intercom, etc. Earlier this year, I was using Branch and realized how many sacrifices you make when you use Segment -- they will never be up to date with all the most recent SDKs/APIs, so you will always be pulling your hair out. We ended up using Segment + 4 other services; it would have been better to not use Segment at all.
I hope they sort these problems out, and build a better developer product. Right now, they're focused on the enterprise, and I'm amazed their product is sufficient.
I'm writing this because their team is amazing and kind, but I hope they improve their product.
Shoot. I'm sorry this happened. I'd love to chat more about your experience with Branch SDK/APIs recently. I found your email in another HN comment and will reach out shortly.
On the other two issues...
Analytics.js launched on HN in December 2012. It had some bugs in its early form, but it has certainly become a lot more stable in the five years since then. Today it runs without issue on 500m+ browsers and hundreds of thousands of websites each day.
For event delivery, you can see real-time event deliverability and latency stats on our public status page:
https://status.segment.com
For most services we deliver data in a few hundred milliseconds, but some services consistently drop data. We retry that data over the next 12 hours, which increases deliverability to our partner endpoints. (For example, we're improving deliverability to Iron.io from 95.4% to 98.4% right now.)
Latency can be more or less questionable depending on when the stopwatch button is pressed. For example, in VR, the gold standard is "motion to photon" -- anything else is just not representative of what actual latency means for actual users (i.e. getting motion sick). There are many definitions of latency, so it's important to choose the right one (the one that matters for your users).
For us, latency means the time between when the _event_ actually occurs in real life (user clicks on a button) and when it showed up in all of our analytics integrations (Mixpanel, Intercom, etc.). As developers, this is the definition of "latency" that really matters for us because it represents the minimum time before we can take action on our users' behaviors.
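That definition can be made concrete with very little code. This is a minimal sketch (not Segment's actual implementation): stamp each event with a client-side timestamp when the user acts, then diff against the time it shows up at the destination. The field names `sentAt` and the helper names are assumptions for illustration.

```javascript
// Sketch: measure end-to-end latency by stamping each event on the client
// and diffing against the time it arrives at the destination.
// `sentAt` and these helper names are hypothetical, not any vendor's spec.

function stampEvent(name, properties) {
  // Record when the event actually occurred, at the source.
  return { event: name, properties, sentAt: Date.now() };
}

function endToEndLatencyMs(event, receivedAt) {
  // Wall-clock time from user action to destination arrival.
  return receivedAt - event.sentAt;
}

// Example: an event stamped at t=1000 that lands in the destination at t=1440
const e = { event: 'Button Clicked', properties: {}, sentAt: 1000 };
console.log(endToEndLatencyMs(e, 1440)); // 440 ms end-to-end
```

Measured this way, the number includes every queue and retry in between, which is exactly why it can diverge from an API's ingestion-time metric.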
The reason I'm very skeptical is that what we were actually seeing was orders-of-magnitude differences between the 44 ms "latency" claimed on your status graph and what we were measuring ourselves (~1-10 hours for us) -- not just 10% differences, which would have been more than acceptable for our use case.
For example, Intercom support were able to identify a four-hour delay between when the /identify actually took place and when Segment sent the event to Intercom. Segment support then mentioned a race condition in their Intercom integration that they had yet to solve (we ran into this ourselves -- we had to guarantee we called /identify BEFORE calling any /track calls).
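The workaround we landed on can be sketched in a few lines: buffer track() calls until identify() has gone out, so the destination never sees a track for a not-yet-identified user. This wrapper is hypothetical; `client` stands in for any analytics.js-style object with identify/track methods.

```javascript
// Hypothetical sketch of the identify-before-track workaround described
// above: hold track() calls until identify() has been issued.

function orderedClient(client) {
  let identified = false;
  const pending = [];
  return {
    identify(userId, traits) {
      client.identify(userId, traits);
      identified = true;
      // Flush any track calls that arrived before the identify.
      while (pending.length) client.track(...pending.shift());
    },
    track(event, properties) {
      if (identified) client.track(event, properties);
      else pending.push([event, properties]);
    },
  };
}
```

With this in place, a track() fired before identify() is simply delayed on the client rather than racing the identify through the pipeline.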
With Mixpanel, not only were we experiencing multi-hour delays, we also experienced many nonsensically mis-ordered events when using Segment (even with timestamping -- it turned out that Mixpanel still relied on events being ordered properly within a 2-minute window -- something we were only able to guarantee by directly sending events to Mixpanel ourselves).
Segment's response: "Our infrastructure is built on top of a queue system that accommodates high scalability. The drawback to this is that sometimes events are received by our API in a mixed up order because we have multiple queues feeding events to multiple workers."
These problems were all resolved immediately when we started issuing direct calls to our various analytics providers.
Hey, I’m one of the Segment founders. I helped instrument a lot of these metrics and built the status page Peter linked to above. I’m hoping I can help shed some light on some of those numbers.
The 44ms you’re referring to is the time it takes our API to respond to an incoming call and ingest an event. It’s certainly not the wall clock time, but it is a good measure of the overall health of the system. Your feedback about its prominence is definitely good–it’s our goal to be transparent, not misleading. We’ll change the area where it’s displayed shortly.
If you’re looking for the ‘end-to-end’ wall clock time, you can find that a little further down the page. For every event coming through our pipeline, we average the time from ingestion to successful delivery and display those metrics on a per-destination basis.
You can see right now that the Google Analytics end-to-end delivery latency is ~400ms within the past hour, and is pretty consistently near that number. We’ve also developed internal systems which break down exactly how many events from a given source were delivered to a given destination, and what the latency distribution for those events was.
The ordering problem you mention is indeed tricky. Like TCP, if we wanted to keep a loose ordering, we’d have to keep a window and ensure the partner API would then re-order messages appropriately. If you want a total ordering, this window of delivery _has_ to be 1 for a given user. It does have some pretty serious implications on the throughput of the system, so we’ve been working with both Mixpanel and Intercom (we’re users ourselves) to try and solve the issue just with timestamps. Ideally, partners would be able to re-order events received based upon time, which is what we do inside of customers’ data warehouses.
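The timestamp-based re-ordering idea is easy to sketch on the receiving side: buffer what was delivered, then sort on the client-supplied timestamp before acting on it. This is only an illustration of the approach, not Segment's pipeline code, and the field name `timestamp` is an assumption.

```javascript
// Sketch of re-ordering received events by their client timestamp,
// the approach described above for customers' data warehouses.

function reorderByTimestamp(events) {
  // A sort on the client-supplied timestamp restores per-user order
  // even if multiple queues delivered the events out of order.
  return [...events].sort((a, b) => a.timestamp - b.timestamp);
}

// Events as delivered, out of order:
const delivered = [
  { event: 'Purchased', timestamp: 300 },
  { event: 'Signed Up', timestamp: 100 },
  { event: 'Added to Cart', timestamp: 200 },
];
console.log(reorderByTimestamp(delivered).map(e => e.event));
// → ['Signed Up', 'Added to Cart', 'Purchased']
```

The catch, as noted above, is that this only works when the partner API re-orders on time too; otherwise total ordering per user forces a delivery window of 1.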
As far as the deliverability issues you mentioned, I’m terribly sorry to hear that we failed you here. We’ve hit some scaling bottlenecks that we’ve been working hard to fix–and we do our best to keep the status page updated whenever we have production incidents.
All that said, reliability is our top focus as a company. Teams present their SLA metrics on a weekly basis at all hands, and it's a key part of our monthly board reporting. We'll be surfacing these metrics and event traces inside the webapp so that you, as a customer, can see exactly where your data is and what has been delivered. And we're in the process of building an entirely revamped pipeline that will provide better deliverability guarantees. We plan on sharing the architecture on the blog once it's all rolled out. Giving you transparency into where your data is stored and how it is processed is exactly what we want to achieve as a company.
If there’s anything I missed–please reach out! I’m calvin at segment.
>For us, latency means the time between when the _event_ actually occurs in real life (user clicks on a button) and when it showed up in all of our analytics integrations (Mixpanel, Intercom, etc.). As developers, this is the definition of "latency" that really matters for us because it represents the minimum time before we can take action on our users' behaviors.
This. I call it "end-to-end latency" and it doesn't get nearly as much attention as it should. This is, by the way, why it's a bad idea to use most OLAPs (like Redshift) as a backend for Segment if you want "real-time" analytics. Column-oriented OLAPs are not designed for real-time ingest because either they don't support streaming inserts (Redshift) or it's not particularly reliable (BigQuery).
>Can you please elaborate on why you think BigQuery's streaming inserts are not particularly reliable?
I've heard this from many users through my work on Fluentd (in fact, many end up using fluent-plugin-bigquery because they don't want to deal with the minutiae of insertId and whatnot on their own).
>Separately, you can always use BigQuery's federated query capability of Bigtable.
Yes, and very few people understand what that means (I had to look it up, and I am reasonably educated on GCP). These are not necessarily product shortcomings but packaging deficiencies that can be addressed with better product marketing.
Just to offer perspective from the other side: we've been using their product since 2015, and we send millions of events through Segment every day without a problem.
There was one short outage I can remember in 2015; the only other issues we ever had, we traced back to misconfiguration on our side.
I think Segment's core code is _really solid_, it shows in how fast a relatively small company is able to iterate, update and release new product.
I think their biggest challenge is educating potential users on what they do (no exact competitor) and educating new users on how to set it all up properly (gotta read the documentation.)
I agree here--we've never had any availability problems like this, and using Segment has been delightful.
I think judging a 6-year old startup based on what its product was one year into the company is rather unreasonable. To put this into perspective, 80% of the company's lifespan has occurred after the grandparent's negative experience. Things change dramatically in that period of time.
We used Segment for application analytics and exception reporting in an enterprise setting. Their .NET libraries were abysmal and had some show-stopper bugs that we wrestled with, particularly broken dependencies. I eventually had to fork the Analytics.NET repo and fix them myself. It was months before I heard any response to the issues or PRs, although they did eventually get merged in. It was just disappointing overall. I don't know why they'd release .NET tooling but have it be broken out-of-the-box, or maintain an open source project but not respond to any contributions.
Hey, this is Ilya - I'm one of the co-founders and the original author of the .NET library - I'm really sorry that happened. In the early days, I was the author of 5 of our open source libraries and also a student of the .NET platform.
Today we have a larger team working on our libraries - .NET in particular has 200+ customers (mostly on our business tier) and we're processing 21B+ API calls/mo originating from .NET. If you could send me an email at ilya@segment.com, I'd love to chat live and collect any other feedback you have from the experience. Thanks again for leaving the note.
Usually these kinds of startups that produce native libraries for use in other software produce poor .NET libraries (see SendGrid). I wouldn't place the blame totally on them -- you often have one person supporting multiple libraries in multiple languages, and .NET is not often used at these sorts of startups. On the other hand, a few thousand dollars for a consultant to get their situation in order would not be a bad investment, because a bad experience like this festers in the community for years.
Thank you! Late comment but I'm starting a new business and was missing the old cheaper/startup plans. I would be fine with badging the site Mixpanel-style. Congrats on the success, your tools have saved me many many man-hours in past businesses.
This was my take when we investigated integrating Segment into our SaaS app too. We are purely bootstrapped, at the level where our monthly site visitors are building up at a great clip, and we would occasionally push the free limits of Segment into real dollars that would eat up a chunk of our slowly growing revenue.
I'm a massive fan of Segment - we use it everywhere at my company (Clearbit) and my only regret was not integrating it sooner. For example, some things we do:
* Sync Salesforce to Redshift - then perform queries across it
* Sync all incoming leads to Salesforce
* Enrich incoming leads with person/company data
* Fire webhooks to a Lambda that processes incoming signup events, qualifies them, and then triggers subsequent events
Segment is one of my favorite products, and I consider it a core part of our stack. It's enabled us to be leaps and bounds ahead of other companies our size as far as analytics tracking, distribution, analysis in our warehouse, and more. We use pretty much every piece of it--even the esoteric options like their embedded pixel API. It's such a core part of my day that I sometimes actually spend time daydreaming about new features I'd add to the product rather than actually doing my job :)
All I need now is for them to add the ability to push my events in real time to my Kafka cluster. (Please add that!)
I'm not a Segment user, but my understanding is that it allows you to track events via a single API, and then push those events to lots of other services. So instead of needing to have js script tags/tracking codes/event calls for Google Analytics, Mixpanel, Optimizely, etc. etc., you just send your events to Segment, and Segment sends them on to wherever you want.
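That fan-out model is simple to sketch. This is a toy illustration of the concept, not Segment's API: one track() call goes in, and the hub delivers it to every configured destination. The destination names and adapter shape are made up for the example.

```javascript
// Toy sketch of the fan-out model: one track() call, delivered to every
// configured destination. Destinations here are just stub functions.

function createHub(destinations) {
  return {
    track(event, properties) {
      const results = {};
      // Deliver the same event to each destination adapter.
      for (const [name, send] of Object.entries(destinations)) {
        results[name] = send(event, properties);
      }
      return results;
    },
  };
}

const hub = createHub({
  googleAnalytics: (e) => `ga:${e}`,
  mixpanel: (e) => `mp:${e}`,
});
console.log(hub.track('Signed Up', {})); // one call, two deliveries
```

Swapping an analytics vendor then means changing the destination map, not every call site in your app.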
I'm a massive fan (and user!) of Segment. I think it could be revolutionary. I do worry a lot about how they tell the world about what they _are_ though, it's an inherently complex and technical product - and often the buyers of this software may not be technical enough to appreciate the sophistication of this product. In my experience Segment is confusing to explain until you actually use it, then you love it.
I keep going back and forth between understanding the value of this product (dashboard for marketing to add integrations w/o dev time; better integration of analytics), to wondering why technical people pay for this. It doesn't seem cheap, and you're throwing a lot of control over latency-sensitive data to a black box.
The founder straight-up admits in this article that the problems they're solving are largely simple. To my mind, there's no need for a hosted service. Why hasn't Segment been replaced by an open-source project?
I'm not trolling here...I really don't understand the full value.
Any individual connection is simple. Website to Mixpanel? Easy. Mobile app to Facebook App Events? Also simple. Both website and mobile app to warehouse? Bigger investment but still doable. Data out of Stripe and Zendesk into the warehouse? Also reasonably straightforward to build, more annoying to maintain, very annoying to scale.
The issue is that companies need all of this together. And they need it now. Many enterprises have 100+ business units each using 50+ tools and all of that needs to be connected together.
So the issue isn't that building a single connection is hard. It's that managing N interconnected things is N^2 hard, and N^2 gets big very very fast. Wrangling that mess is just way too complicated for an open source project... even though analytics.js is open source, the vast, vast majority of a.js users only use it through Segment.
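The "N^2 hard" point is quick to put in numbers: connecting N tools pairwise needs N*(N-1)/2 links, while routing everything through a hub needs N.

```javascript
// Pairwise integrations vs hub-and-spoke, for N tools.
const pairwise = (n) => (n * (n - 1)) / 2;
const viaHub = (n) => n;

for (const n of [5, 10, 50]) {
  console.log(`n=${n}: ${pairwise(n)} pairwise links vs ${viaHub(n)} via a hub`);
}
// At n=50 that's 1225 pairwise links vs 50 through a hub.
```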
I've personally built all of the things you're talking about, on real products, with lots of traffic, and it just isn't that hard. Annoying, yes, but arguably a core competency of any data team, and something you want to think very carefully about outsourcing. Also, most teams won't have to "scale" many of the things you're talking about...that's a problem that you only have to solve when you're the central point of failure for lots of companies.
As for the n^2 problem, yes, I can see that argument, but in my experience, n is always a small number (i.e. 2-10), and they rarely need to cross-communicate in a fully connected graph. In short, piping everything through an expensive, centralized point-of-failure run by strangers seems like such a risky proposition that I can't imagine many companies would have the integration complexity to justify the risks.
But again, people seem to be finding value in it, so maybe I'm missing something. I was honestly expecting you to tell me something about advanced data warehousing or integrated analytics tools or something like that. If you tell me that integrating with segment automatically starts correlating session data across providers, well...that starts to be compelling.
Go tell a marketer that if they use Segment they can onboard new marketing tools and it'll 'just work' without any engineering time, and they'll shit a brick. (the data warehouse part is imho folly.)
Yes, I see this value. But again, that's something that could rather easily exist as an open-source, self-hosted tool, and I kind of wonder why the middleman hasn't been optimized away.
We're in a world where people would rather self-host github to save a few bucks...this is an easier problem.
open source is basically dead courtesy of SaaS. people would much rather pay a few bucks than self-host. hell, there are entire successful companies that 'just' host open source software and wrap it up in a SaaS experience.
You're just too technical and think everyone is on your level. I work for a mobile analytics provider and we once sent a json log lines file format to a customer, who replied that they didn't know how to parse it and asked us for CSV instead. So, if that's the caliber of developer you have, ask again how easy everything you just talked about is.
I use Segment, I love Segment, and I love the fact that the problem they're solving is forehead-slappingly simple: data fanout and delivery is a pain in the ass, fix it. It's one of those "but I could have done that!" businesses that just comes down to foresight, acumen, and most importantly, execution.
Analytics.js could be replaced by a self-hosted/bundled solution in theory, but Segment's server-side integrations are an extremely valuable piece of network plumbing. It's awfully nice that I was able to set up a few endpoints on our API (quick) and went through the approval process (sorry, not quick) and now my customers have a super-easy option if they want to point their data hose at us.
My brain often cracks on startups like these, and particularly on the user/client companies they list. It kind of worries me that those companies can't do their own analytics and outsource it to Segment or the like.
All they need to do is install a dashboard and some data-processing tools, and think about what data they want to correlate with other data and whether that correlation has any interesting meaning.
But of course, having your company listed on a hot startup's site is simply exposure or marketing too. Maybe those companies don't even use it; they just want their logo shown everywhere on the 'hot' parts of the web.
And listing on your site that 15,000+ companies trust you? Why do you still need to raise money then? Start charging people if your product has value.
Also, as a private citizen, can I ask Segment whether they are storing, processing, or handling my private data (provided by parties I did business with) in any way?
I was implementing Segment this week, and I was excited to see that they do far more than customer analytics now. With a single installation of analytics.js, you can turn on tools like Sentry.io error tracking, HelloBar announcements, or secure user accounts with Castle.io. With a single integration, Segment is becoming an app store for websites.
Just bundle it in with the rest of JS payload with something like webpack. Individual calls to certain domains (GA, Heap, etc) might still get blocked, but the payload itself will be fine.
To me this is the primary challenge of outsourcing your client side tracking -- you risk a wholesale block of your analytics if the Segment file gets blacklisted.
Odd to see hn cheering on segment, but hate on analytics solutions in almost every other thread. Is it okay because anything yc does is good, or is enabling analytics different than using analytics somehow?
It's not that surprising. HN isn't a single being... it's a bunch of individuals with differing opinions. Some are web devs who like usage analytics; some email from the command line because they're afraid of web browsers.
Segment is perfect example of a small solution that proved to be extremely useful for everyone. I imagine Segment can grow to an extent that it'll start affecting other products' customer base—or maybe it already has. Earlier if a company was using Mixpanel, moving away would have meant wasting hundreds of man-hours to switch to a newer solution, which could have deterred companies from having that discussion. With Segment, it'll only take a few clicks.
We need more solutions like Segment that make the we're-too-invested-in-that excuse less and less common.
From the discussion here, it appears that a lot of 'picks and shovels' app tool providers seem to expect that their customer base will be well venture funded companies.
Their pricing plans seem to reflect this by having a 'free' tier that supports nothing more than a quick test MVP, then a paid plan in a VERY high price range that would be considered negligible by someone who has just secured $100M in venture capital.
Feel for us founders who want to stay bootstrapped, and who will be paying for tools out of nothing but the revenue slowly trickling in from paying customers. For us, $250+ per month is a real cost, not a flippant consideration in the early stages of our growth.
I'd much prefer to see vendors have more gradually tiered pricing plans, or else peg the cost to a meaningful metric, such as 'per paying customer' rather than 'per anonymous site visitor' etc.
I work at Astronomer, another company in the clickstream and data integration industry, so I am very familiar with the problem they are trying to solve. From an engineering perspective I can attest that they are tackling some huge and very complex problems, especially with the massive volume of events they process.
I have been very impressed with the content of their engineering blog lately, and am not surprised to hear they are doing well. They have an awesome team, and a lot of developers really respect them.
Some other comments in here mentioned having a bad experience with their product, but I think that was back in 2012 when they were still a small start up. In my experience, their product is solid.
This comment might be in jest, but we've actually been working on a few different ways to help optimize [1] and manage our AWS spend [2].
That said, there are quite a few places left for us to optimize. We're hoping to share more of the techniques and architecture we've used to get better performance in upcoming blog posts.
yes, sorry for the snark. It was more about: how does one go about spending $94MM in this day and age? And even after reading your well-written post on AWS cost control, I still can't figure out how to spend that much money on a software biz other than giving the lion's share to AWS. e.g. even if your engineers and salespeople cost (all in) $200K a year, $94MM = 470 person-years of expensive staff.
completely theoretically with nice round numbers, let's say a SaaS company has $50m in annual revenue, and they want to grow to $100m in annual revenue next year. lastly, let's assume time required to pay back the cost of customer acquisition (marketing, sales, setup support) is 24 months, and customers always pay annually up front. (24 months is kind of a long payback time, so this is a bit extreme.)
doing the math... you have $50m in revenue so you can buy another $25m in recurring revenue with that. and then you need another $50m in capital from somewhere to add the other $25m in recurring revenue to get to $100m ARR total.
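Working those round numbers through explicitly: a 24-month payback with annual upfront payment means $2 of acquisition spend per $1 of new ARR, so the $50m collected upfront buys $25m of new ARR and the remaining $25m takes another $50m of outside capital.

```javascript
// The hypothetical SaaS math above, spelled out.
const paybackMonths = 24;
const costPerDollarArr = paybackMonths / 12;               // $2 spend per $1 ARR
const cashFromRevenue = 50e6;                              // $50m, paid annually upfront
const arrFromOwnCash = cashFromRevenue / costPerDollarArr; // $25m new ARR
const arrStillNeeded = 50e6 - arrFromOwnCash;              // $25m more to reach $100m
const capitalNeeded = arrStillNeeded * costPerDollarArr;   // $50m of outside capital

console.log(arrFromOwnCash / 1e6, capitalNeeded / 1e6); // 25 50
```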
it also follows from the math above that the faster you grow revenue, the more capital you need. :) saas math is a bit funky.
fair enough, but sometimes no matter how much you spend you won't be able to double revenues from $50MM to $100MM. Will the market allow you to grow that fast? So yes, the math is funky.
As a counter example, many companies are using excess cash to buy back stock and not grow the biz.
As another example, what did GitHub do with their $100MM? I'd argue that the GitHub product and client base look substantially similar to what they did before a16z gave them that chunk of change.