I worked at a self-described "data-driven" company, and the analogy senior leadership liked to make was that the company was like a machine learning algorithm, using data (particularly A/B tests) to do "gradient descent" of the product into its optimal form.
My first take-away was that using data to make decisions is tremendously, tremendously powerful. A/B tests, in particular, can help determine causality and drive any key metric in the direction you want. Short-term, it seems to work great.
Long-term, it fails. Be purely data-driven, without good intuition and long-term bets (that can't be "proven" with data), and the product loses its soul. You can (and should) invest in metrics that are more indicative of the long-term. And you should use data to help guide and improve your intuition.
But data is not a substitute for good judgment, or for a deep understanding of your users and their problems, or of "where the puck is going". It's just a tool. It's a very powerful tool, but if it's your main or only tool, you will lose.
I don't have a way to prove this, but I've long suspected that A/B testing might be one of the major culprits behind the modern problem of software churn, i.e., constantly changing GUIs, features, and products. Some percentage of users will always have problems with an application or interface. Constantly chasing every user hurdle might sound like a good idea, but I do wonder if it's similar to listening to too much fan feedback in the case of video games, movies, etc. Implementing every fan suggestion will usually make a terrible game. Instead, the ideal is to figure out which fan suggestions to ignore, and which to listen to. Only some of them will work for a given application.
I wonder if A/B testing is a bit like that. You're suddenly listening to every single issue encountered by users, but lack the wisdom to understand which issues to ignore. As a result, software changes constantly, users cannot really learn a GUI because it changes so frequently, and software teams feel like they're constantly improving things, but the software itself is not actually getting better when the user's whole experience is taken into consideration.
The churn is the result of the fact that we don't understand what is optimal.
We know how to optimize for things like the shortest distance between two points, but we don't understand how to optimize software design, or even a GUI.
This is literally what "Design" means. If something needs to be designed, it means there's no theory or notion of what it would mean for it to be "optimal." So we "design" a "better" solution, but we don't actually know if it's better. So we design another solution, and the cycle continues.
A/B testing when done on a population that gives consistent answers should converge on a consistent solution. If the population gives different answers at different times then of course there will be churn.
I would say the methodology of A/B testing is indeed like machine learning: it's quite good and accurate. It's the data source that's the problem. If you have users who don't know what they want, or who behave differently and inconsistently, then your conclusions reflect the data. Perhaps the data is accurate and there is no consistent conclusion, OR the data is inaccurate and you need to get it from a better source than users telling you what they think is better.
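To make that concrete, here's a toy simulation (everything in it is invented for illustration): if the population's preference drifts back and forth over time, honestly-run A/B tests will keep flipping the "winner", and the product churns along with it.

    import random

    random.seed(1)
    for quarter in range(6):
        # Preference for variant A drifts between 60% and 40% each quarter.
        p_prefers_A = 0.5 + 0.1 * (-1) ** quarter
        wins_A = sum(random.random() < p_prefers_A for _ in range(10_000))
        winner = "A" if wins_A > 5_000 else "B"
        print(f"quarter {quarter}: ship variant {winner}")

Each test is run correctly and reaches a confident conclusion; the conclusions still alternate because the underlying population does.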
> Apply these assumptions to finding an answer to the
> question of what you should make for dinner, and you'll
> quickly see there are problems.
And, to amplify your point a bit, there are problems with A/B testing for which of my children I should love the most, or what I find beautiful or interesting; or should I A/B test out whether enslaving others works out well for me?
I chose extreme examples, but the point is that there are many, many things in human experience that don't lend themselves to easy, simplistic rules. It's genuinely hard to work through a lot of the issues that people face in real life.
With that said, it's also important to work through and to understand objective data as best as possible. For clearly defined and nicely-behaved problems, objective data is certainly the way to go. The problem is that a lot of the problems people actually face aren't so easy to understand, and aren't so well-behaved.
I did a lot of reading on this particular subject. The opinion I find most convincing is that the war was driven mostly by knee-jerk reactions, and whatever faulty data there was was never considered.
LBJ didn't want to "lose" Vietnam. LBJ remembered Joseph McCarthy who accused the Truman administration of losing China to communism.
JFK and Ngo Dinh Diem were murdered in 63, Gulf of Tonkin incident in 64, and full US armed intervention by 65. This was a rollercoaster ride that the US military didn't plan for and didn't want.
McNamara picked Westmoreland to lead the military effort. It was a disaster. With no coherent strategy, the effort was doomed from the start. The body counts and bullets-per-enemy-killed figures were akin to rearranging the deck chairs on the sinking Titanic. At the top of their minds was the Korean War, when the US military operated at China's border and China intervened militarily. Hence, US military land operations were limited to South Vietnam. There were no plans to invade North Vietnam.
Nixon and Kissinger decided the end game was to break up the USSR-China alliance. With China on the US side, the USA no longer had a national interest in what happened to Vietnam. The US left Vietnam in 73 and the country was overrun two years later.
Another fundamental problem with this type of approach is that science is hard.
If you don't know how to properly construct experiments and interpret data it's honestly not much better than hiring an ornithomancer to spot omens in the skies. To be clear, this is something that even scientists with decades of both education and experience sometimes get wrong. If you're coming at it with an MBA I've got some concerns.
"Intuition" and "judgement" isn't really the best way to think about this.
The entire problem here is that when people say "data-driven", what they actually mean is "model-blind". If you make decisions based on a model, nobody will say it's data-driven, no matter how empirical the model is. Yet, if you rephrase the question as "Do model-blind companies win?" the answer is obvious.
Running incoherent experiments testing for each little event you can think of is certainly better than walking at random. But as everybody has known for centuries, you use data to improve your model, and use your model to improve your craft. Jumping over the model part is an incredibly lousy use of data.
Intuition and judgement come from changing the context of the analysis: the time range and history considered, the details included. Geopolitical and economic forecasting appear to fit this pattern. The question posed about whether data-driven companies win seems simplistic and formulaic to me personally. Often outcomes owe more to fortune and circumstance than to careful analysis and using the right approach. Of course, both together tend to be where we'd see success looking backward.
Haha. The only thing is that gradient hill-climbing algorithms are called “local” because they get stuck in local maxima. Probably that’s what happened “in the long term”.
100% agree, and I think the same issues apply. For our search algorithms we learned how to handle local minima etc. I just wonder if those solutions apply to data-driven companies, since the iteration time and failure cost are so much higher. I'm afraid good intuition will outperform this in the long run.
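As a rough sketch of what I mean (the objective function and numbers are made up), plain hill climbing gets stuck on the nearest bump, and random restarts, one of the standard fixes, usually recover the better peak:

    import math
    import random

    def objective(x):
        # Two bumps: a local peak of ~1 near x = -1 and a global peak of ~2 near x = 2.
        return math.exp(-(x + 1) ** 2) + 2 * math.exp(-(x - 2) ** 2)

    def hill_climb(x, step=0.05, iters=1000):
        # Greedy local search: move only while a neighbour is better.
        for _ in range(iters):
            x = max((x - step, x, x + step), key=objective)
        return x

    print(round(hill_climb(-3.0), 2))  # stops near -1.0, the local peak
    best = max((hill_climb(random.uniform(-4, 4)) for _ in range(20)), key=objective)
    print(round(best, 2))              # almost always near 2.0, the global peak

The question is whether a company can afford twenty "restarts" when each one is a quarter of product work rather than a millisecond of compute.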
Data is useful for proving my decisions once I have made them, so what's most useful is having short feedback loops so I can catch bad decisions quickly, or reinforce good ones.
1. To try or to ascertain by an experiment, or by a test or standard; to test; as, to prove the strength of gunpowder or of ordnance; to prove the contents of a vessel by a standard measure.
Thou hast proved mine heart.
Ps. xvii. 3.
(Webster's 1913)
Which makes sense to me. Try something you think might work, then test it: that's just plain old experimental science.
Something that also tends to be under-appreciated is the constant bias towards successful A/B testing. Successful A/B tests lead to promotions and other positive employee outcomes - negative A/B tests can lead to dismissal in some places. As such people will naturally try to beef up the A/B testing result, or ship a "mixed" result and claim success.
>Be purely data-driven, without good intuition and long-term bets (that can't be "proven" with data), and the product loses its soul.
This sounds like post-hoc, anti-intellectual rationalization.
How do you place long-term bets without a model to measure expectations versus outcomes? How do you know what good intuition is? There is a ton of research on "gut" calls that demonstrates they're random.
> This sounds like post-hoc, anti-intellectual rationalization.
How is this anti-intellectual?
If you want, I can formalize as a game the problem of choosing business/product strategy in a competitive market with a continuous flow of imperfect information. I can then use ideas from controls to establish some upper bounds on what can be inferred from a continuous flow of information. I can then use that result to prove an impossibility result about the game. I can even tweak assumptions to get bounds on probability distributions which imply we'd be better off flipping a coin or whatever.
I'm not going to do the work, because intuition is almost always enough to identify these situations, but it's absolutely clear to me that results like this obviously exist and correspond to many real-world situations.
> Good judgement requires data.
It used to be that insisting on data-driven decision making was a hard pull. Now it's the opposite. Insisting on data where data cannot possibly provide enough signal to make a decision is the new form of anti-intellectualism. IMO.
I would say good judgement requires experience. Data may or may not be available and applicable, but its absence doesn't mean one can't exhibit good judgement.
Waiting for data to somehow materialize to support a new action, without actually trying anything new seems like a recipe for just spinning your wheels doing the same things over and over and getting nowhere in a hurry.
Surely the data comes after the action, not prior to it?
I think "dashboard" is perhaps a better analogy than some even realize. When I hear someone describe their startup as purely data-driven, I picture someone driving a car while focusing exclusively on the dashboard.
Data is valuable. Some KPIs can show important, even vital information -- often not directly but through the first or second derivative. If something changes rapidly, you probably want to take notice.
Overall though, I think the metrics that we have available are about as helpful to a company as the metrics we can track about our bodies are to our health. Sure, monitor your weight, HRV or even your blood sugar levels and get some insight. Let it help inform your decisions. But only that.
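A rough sketch of what I mean by the derivative point (the series and threshold are invented): watch the first and second differences of a daily metric rather than its raw level, and take notice when the rate of change itself jumps.

    daily_signups = [120, 122, 119, 125, 124, 131, 160, 210]  # invented numbers

    first_diff = [b - a for a, b in zip(daily_signups, daily_signups[1:])]
    second_diff = [b - a for a, b in zip(first_diff, first_diff[1:])]

    for day, (d1, d2) in enumerate(zip(first_diff[1:], second_diff), start=2):
        if abs(d2) > 10:  # arbitrary threshold on the "acceleration" of the metric
            print(f"day {day}: change {d1:+}, acceleration {d2:+}, take notice")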
I see too many "data driven companies" that are not data driven, but selectivly use data to do what the executive wants to be done.
On top of that, I see companies with employees who didn't understand college mathematics but now want to be data driven - e.g., basing major decisions on 10 customer feedback data points, or not knowing the difference between median and average.
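For the median-versus-average point, a toy example (numbers invented): ten session lengths where one user who left a tab open drags the average far away from the typical value.

    from statistics import mean, median

    session_seconds = [30, 32, 28, 35, 31, 29, 33, 30, 27, 900]  # one tab left open
    print(mean(session_seconds), median(session_seconds))        # 117.5 vs 30.5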
There's also something to be said about having the wisdom to understand which data you can do something about and which you can't.
E.g. if the metrics of customer acquisition vs customer LTV don't make sense on a fundamental level, it doesn't matter what your other metrics are showing.
> but selectively use data to do what the executive wants done.
The thirst of some people when I give them something that sounds like what they want. When they re-parrot last year's findings back to me incorrectly because they only listened to the parts that backed up their instinctual beliefs. Makes me angry I waste my time on the data when all they wanted was an excuse. I coulda given them an excuse without having to do all that data digging...
Sounds like you need to embrace the "Zen of Wally", my friend. Just pull up Dilbert.com, search for Wally, and then read every strip associated with him. At the end you will understand. You will also receive extra points if you do this while you are supposed to be working, as that is practicing, not just living, the Zen of Wally.
>I see too many "data driven companies" that are not data driven, but selectively use data to do what the executive wants done.
Bingo. And judging by many of the comments here, it's spoiled people's opinions on what it really means to be "data driven". I mean, imagine advocating for "good judgement" when so much empirical research says it's not repeatable?
Do you have empirical research / data supporting data driven companies being more successful? Genuine question - I'd be interested to see it but I notice a lack of data supporting the argument that being data driven is associated with success in this thread.
Yup, just ran into this at my company. All of the higher-ups claim they want the org to be data driven. I then presented them with exhaustive research on a potential tool to use, but they pushed it aside because it didn't confirm their biases.
Bad execs will find a way to manipulate any good philosophy toward their control
I find his rankings of which companies to invest in hilarious. It's like, take the attributes that you think will make a good solid company - experience in the industry, strong analytical skills, good operational skills. You know, the things we know for a fact are going to make a good solid successful company. Ok, take them, and throw them away. We're not interested in them. We're interested in variance.
I think fundamentally this suffers from the problem lots of VCs have right now. Cheap money meant that the correct investment strategy was to just pile money into high risk bets. I think this calculation looks very different without free money.
I think this is akin to focusing too hard on things that appear easy to measure over things that are (perhaps incorrectly) perceived as hard to measure.
The parable of the drunk found looking for his keys under the light instead of inside the dark bar where he lost them comes to mind. What's needed is a flashlight, however imperfect.
Nobody has a big enough test set to A/B test every combination of elements for a web page, a product feature set, or even a conference presentation. The combinatorics are just unwieldy.
Someone has to have multiple good ideas and the ability to carry them out properly before an A/B test is even valuable. Otherwise you're measuring one uninformed random change against another. A/B confirms whether you've succeeded in improving something. It can't really suggest what to try next.
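Some back-of-the-envelope arithmetic (all numbers are illustrative guesses) on why the combinatorics get unwieldy so fast:

    elements, variants_each = 8, 3
    combinations = variants_each ** elements           # 3 ** 8 = 6561 distinct versions
    users_per_arm = 10_000                              # rough figure for a detectable effect
    print(combinations, combinations * users_per_arm)   # 6561 arms, ~65.6 million users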
Like any good idea that sees wide adoption, being "data-driven" has jumped the shark to cargo-cult methodology in most places.
The truth is that data is only powerful if applied judiciously, with solid knowledge of statistical fundamentals and careful thinking about causality (which may not be practically falsifiable!). "Data" can also be misread and applied in big PowerPoints with a reckless disregard for reality.
Enlightenment is understanding enough about data to use it correctly, but also acknowledging its limitations and that to create a successful business you also need vision and insight about potentials and trends for which accurate data does not and can not exist (except as a post-hoc trailing indicator).
"Data-driven" is a fancier term for "empirical", a method for building knowledge based on observation. The other method is "epistemology", a method for building knowledge by reasoning about things. A successful organisation should not snob the one over the other.
Why not just use the layman's terms: science and logic?
The terms derived from philosophy are the most snobbish, in my opinion. Additionally, logic and science formalize these concepts; epistemology, on the other hand, has a lot of qualitative mumbo jumbo, with concepts like "belief."
Data-driven progress is largely for optimizing choices, not discovering them. Being data-driven can be just as dangerous as intentionally ignoring data. Companies “win” by focusing on what matters, with the right people, at the right time.
I believe that, everything else being equal, data-driven is better.
Of course, if data-driven means short-term with bad KPIs, this version of data-driven will fail. I think a lot of people picture this kind of short-term A/B testing when they think of it.
A data-driven approach is still far from objective and relies on choosing good metrics to optimize for.
> I believe that, everything else being equal, data-driven is better.
While I understand your point, it also depends on what you mean by "better". It's certainly safer, less risky, more predictable, and makes planning easier. It also limits creativity and high payoffs.
Someone else commented that it was like "driving by looking only at the dashboard". You can do that very safely in planes, but you're limited in where you can go. I feel it's the same in business.
There is the option of having an absolutely massive dataset available. It just seems a bit far-fetched to assume that any company would have data that could allow them to move from developing a SaaS product to running a chain of burger joints, because the data indicates that would be a good move.
I think this fully depends on the metric that you choose. The metrics can be very creative.
> could allow them to move from developing a SaaS product to running a chain of burger joints, because the data indicates that would be a good move.
In this scenario the counterfactual is a company without data making the decision based on a hunch.
I think we are saying similar things. For me any decision that is rational is based on data (better or worse data, but _something_ is quantified). What and how to quantify from the real world to incentivise the right behaviour is the creative part, from my point of view.
Because being 100% data driven means that you're removing ideas and opportunities that exist outside your datasets.
It's not realistic to have a dataset large enough to cover all situations, especially not those areas and opportunities that you don't know exist. That's also why Google's 20% projects were such a good idea. They allowed the business to grow and develop in directions it couldn't possibly predict.
I liked this article and agree with the point - when it comes to innovation, new products, and new markets, instinct and experience could probably help more than data.
Using data for decision making is more relevant for more mature processes where the data exists to learn from. Being data driven is especially important for data-intensive sectors where the complexity of issues cannot be fathomed just by being brilliant; it's here that you need to be able to analyse the intricacies using the data to find possible answers.
We use the term data-driven to describe certain parts of our product wherein we use SQL queries against faithful models of the domain to make deterministic, traceable choices.
Certainly works for us, but I don’t think this is how most people understand “data-driven” anymore.
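As a minimal sketch of what I mean (the table, rule, and numbers are all invented for illustration), a plain SQL query over a model of the domain picks the result, so the choice is deterministic and the query itself is the trace:

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE carriers (name TEXT, max_kg REAL, cost REAL)")
    db.executemany("INSERT INTO carriers VALUES (?, ?, ?)",
                   [("bike", 5, 3.0), ("van", 200, 8.0), ("freight", 5000, 40.0)])

    parcel_kg = 12
    choice = db.execute(
        "SELECT name FROM carriers WHERE max_kg >= ? ORDER BY cost LIMIT 1",
        (parcel_kg,),
    ).fetchone()
    print(choice[0])  # 'van': the cheapest carrier that can take the parcel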
I think it's fairly safe to say that most true data-driven companies sustain their innovation and business over the years. They have little incentive to take many risks and will hold their own throughout time.
That's to say that real innovation comes from being more data-inspired: taking risks with what is known, based on your intuition/opinion.
A timeless read on this topic is Clayton Christensen's "The Innovator's Dilemma".
I think the exact same phenomenon applies to investing too. You can sustain your growth by making moves purely off data, but your big wins may come from being "data-inspired".
The irony in this all is that nobody can tell the future and therefore what we know in the current moment is the best thing we've got. Even those with more experience, resources, and knowledge can be disrupted. That's the beauty of the world.
Is it about making money? Or is it about using your money to gain greater control? Or some combination of both?
This article assumes it's all about the money. But if - as I believe - we are under some hybrid corporate-governance system, where these two domains are working together whilst appearing to be separate, then the measure is not just money.
Look at the 13Fs for last quarter, find the funds that are still up, and ask them if they made trades based on technicals and data or purely on a gut feeling that it's time to buy lean hog futures again because they've been through this before. Careful with the crypto bros; they are looking at the sidewalk on 5th Ave from their penthouses.
Data driven companies tend to use the data to quickly fill in dark spots, collect that income and tap the source dry, and then bureaucracy and nepotism set in and the script is flipped. Data is used to explain away why we lost our shirt and how it’s not our fault.
The VC is looking for the moon and trying to argue that more data would somehow make it feel less like Las Vegas. Sorry, it's all Vegas unless you want to spread out and settle for making a few percent by investing in all 5 companies and not trying to guess the pack leader.
Sometimes, but the company will lack soul, purpose and a meaning that it stands for. Sometimes it's better to stand by what you want your company's values to be, even if that means losing some potential customers. As a result, you'll have more passionate and engaged customers who are more likely to stick with you long term.
I think the main issue with being a data-driven company is that it's an open secret. Sure, you can snowball a small edge into a winning position over a few years, but isn't it more likely that your competitors become more data-driven, mitigating your one advantage and enhancing their advantages?
I don't think any company is more data driven than any other. After they're successful, companies all need some minimum of measurement. Some say they're data driven but internally data is usually a mess. The right question is, are there any companies where data is not a mess? I'm guessing not.
>The right question is, are there any companies where data is not a mess? I'm guessing not.
I've been in a company that claimed to be "data driven". There was plenty of data that wasn't a mess - and we used it to make a lot of decisions. But, of course, we discovered new uses for our data - and that had to be cleaned.
The distinction is that all data starts messy, but you clean some, then discover new dirt. So you clean that, then discover some more. Some companies have more clean data than others.