
The goal of science has always been to discover underlying principles and not merely to predict the outcome of experiments. I don't see any way to classify an opaque ML model as a scientific artifact since by definition it can't reveal the underlying principles. Maybe one could claim the ML model itself is the scientist and everyone else is just feeding it data. I doubt human scientists would be comfortable with that, but if they aren't trying to explain anything, what are they even doing?



That's the aspirational goal, and I would say it's a bit of an inflexible one. For example, if we had an ML model that could generate molecules that cure diseases and pass FDA approval, I wouldn't really care if scientists couldn't explain the underlying principles. But I'm an ex-scientist who is now an engineer, because I care more about tools that produce useful predictions than about understanding underlying principles. I used to think that we could, in principle, identify all the laws of the universe, simulate them with enough accuracy, inspect the results, and gain enlightenment, but over time I've concluded that's a really good way to waste lots of time, money, and resources.


It's not either-or, it's yes-and. We don't have to abandon one for the other.

AlphaFold 3 can rapidly reduce a vast search space in a way physics-based methods alone cannot. This narrowed search space lets scientists apply their rigorous, explainable, physical methods, which are slow and expensive, to a small set of promising alternatives. This accelerates drug discovery and uncovers insights that would otherwise be too costly or time-consuming.
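As a toy sketch of that funnel (every name and the cutoff here are hypothetical, purely for illustration; a real pipeline would use AlphaFold's actual confidence outputs and a docking or MD package for the physics stage):

    # Hypothetical two-stage screen: cheap opaque model first,
    # expensive interpretable physics second.
    def screen(candidates, ml_score, physics_sim, ml_cutoff=0.8):
        # Stage 1: the fast ML model prunes the vast search space.
        shortlist = [c for c in candidates if ml_score(c) >= ml_cutoff]
        # Stage 2: slow, explainable physics runs only on the survivors.
        return {c: physics_sim(c) for c in shortlist}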

The future of science isn't about AI versus traditional methods, but about their intelligent integration.


Or you can treat AlphaFold as a black box / oracle and work at the systems biology level, i.e. at the pathway and cellular level. Protein structures and interactions are always going to be hard to predict with interpretable models, which I would otherwise prefer.

My only worry is that AlphaFold and others, e.g. ESM, seem to be a bit fragile for out-of-distribution sequences. They are not doing a great job with unusual sequences, at least in my experience. But hopefully they will improve and provide better uncertainty measures.
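For what it's worth, AlphaFold does expose a per-residue confidence score (pLDDT, on a 0-100 scale, with values below ~70 conventionally treated as low confidence), so you can at least flag predictions to distrust. A minimal sketch (the 0.5 fraction is an arbitrary illustrative threshold):

    # Flag a prediction as suspect when confidence is low on average or
    # for many residues: a crude proxy for "out of distribution".
    def flag_low_confidence(plddt, cutoff=70.0, max_low_frac=0.5):
        mean_plddt = sum(plddt) / len(plddt)
        low_frac = sum(p < cutoff for p in plddt) / len(plddt)
        return mean_plddt < cutoff or low_frac > max_low_frac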


> if we had an ML model that could generate molecules that cure diseases and pass FDA approval, I wouldn't really care if scientists couldn't explain the underlying principles

It's actually required as part of the submission for FDA approval that you posit a specific Mechanism of Action for why your drug works the way it does. You can't get approval without it.


A substantial proportion of FDA-approved drugs have an unknown mechanism of action - we can handwave about protein interactions, but we have no useful insight into how they actually work. Drug discovery is bureaucratically rigid, but scientifically haphazard.


How much do you believe that the MoA actually matches what is happening in the underlying reality of a disease and its treatment?

Vioxx is a nice example of a molecule that got all the way to large-scale deployment before being taken off the market for side effects that were known. Only a decade before that, I saw a very proud pharma scientist explaining their "mechanism of action" for vioxx, which was completely wrong.


Underlying principles are nice for science, whatever works is nice for engineering. There is plenty of historical precedent where we build stuff that works without knowing exactly why it works.


I like that career path. Interesting.


Discovering underlying principles and predicting outcomes are two sides of the same coin, in that there is no way to confirm you have discovered underlying principles unless they have some predictive power.

Some have tried to come up with other criteria for confirming you have discovered an underlying principle without predictive power, such as aesthetics, but this is seen by the majority of scientists as basically a cop-out. See the debate around string theory.

Note that this comment is summarizing a massive debate in the philosophy of science.


If all you can do is predict an outcome without being able to explain how, then what have you really discovered? Asking someone to just believe you can predict outcomes without any reasoning as to how, even if you're always right, sounds like the concept of faith in religion.


The how is actually just further hypotheses. It's turtles all the way down:

There is a car. We think it drives by burning petrol somehow.

How do we test this? We take petrol away and it stops driving.

Ok, so we know it has something to do with petrol. How does it burning the petrol make it drive?

We think it is caused by the burned petrol pushing the cylinders, which are attached to the wheels through some gearing. How do we test it? Take away the gearing and see if it drives.

Anyway, this never ends. You can keep asking questions, and as long as the hypothesis is something you can test, you are doing science.
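A minimal sketch of that loop, with every name hypothetical (it's the same ablation logic ML researchers apply to their own models):

    # Remove a component, observe the effect, restore it, refine the question.
    def ablation_test(system, component, observe):
        baseline = observe(system)
        system.remove(component)        # take the petrol/gearing away
        changed = observe(system) != baseline
        system.restore(component)
        return changed  # True means "it matters"; the next question is how.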


>There is a car. We think it drives by burning petrol somehow. How do we test this? We take petrol away and it stops driving.

You discovered a principle.

Better example:

There is a car. We don't know how it drives. We turn the blinkers on and off. It still drives. Driving is useful. I drive it to the store.


In the vein of "can a biologist fix a radio" and "can a neuroscientist understand a microprocessor", see https://review.ucsc.edu/spring04/bio-debate.html which is an absolutely wonderful explanation of how geneticists and biochemists would go about reverse-engineering cars.

The best part is where the geneticist ties the arms of all the suit-wearing employees and it has no functional effect on the car.


> what have you really discovered?

You’ve discovered magic.

When you read about a wizard using magic to lay waste to invading armies, how much value would you guess the armies place in whether or not the wizard truly understands the magic being used against them?

Probably none. Because the fact that the wizard doesn’t fully understand why magic works does not prevent the wizard from using it to hand invaders their asses. Science is very much the same - our own wizards used medicine that they did not understand to destroy invading hordes of bacteria.


Exactly! The magic to lay waste to invading armies is packaged into a large flask, and magical metal birds are flown above the army. There the flask is released from the birds' bellies and gently glides down. When the flask is at optimum height it releases the power of the sun, and all who are beneath it get vaporized. A newer version of this magic is attached to a gigantic fireworks rocket that can fly over whole mountain ranges and seas.


Do you know what the stories say happens to wizards who don't understand magic?

https://youtu.be/B4M-54cEduo?si=RoRZIyWRULUnNKLM


It's still an extremely valuable tool. Just as we see in mathematics, closed forms (and short, elegant proofs) are much-coveted luxury items.

For many basic/fundamental mathematical objects we don't (yet) have simple mechanistic ways to compute them.

So if a probabilistic model spits out something very useful, we can slap a nice label on it and call it a day. That's how engineering works anyway. And then hopefully someday someone will be able to derive that result from "first principles"... maybe it'll be even more funky/crazy/interesting, just like mathematics arguably became more exciting when someone noticed that many things are not provable/constructible without an explicit Axiom of Choice.

https://en.wikipedia.org/wiki/Nonelementary_integral#Example...


>closed forms (and short and elegant proofs) are much coveted luxury items.

Yes, but we're talking about roughly the opposite of a proof.


But in the usual natural sciences we don't have proofs, only data and models, and then we do model selection (and through careful experiments we end up with confidence intervals).

And it seems that with these molecular biology problems we constantly face a trade-off between specificity (model prediction quality) and sensitivity (model applicability), right? But due to information-theoretic constraints there's also a dimension along model size/complexity.

So if an ML model can push the ROC curve toward the magic upper-left corner, then it's likely getting more and more complex.
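A toy illustration of that corner-chasing (synthetic data, nothing biological; assumes scikit-learn and numpy):

    import numpy as np
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, 1000)              # ground-truth labels
    weak = y + rng.normal(0, 0.8, 1000)       # noisier model's scores
    sharp = y + rng.normal(0, 0.3, 1000)      # better model's scores
    print(round(roc_auc_score(y, weak), 3))   # ~0.81
    print(round(roc_auc_score(y, sharp), 3))  # ~0.99, nearer the corner

And in practice the sharper model is usually the one with far more parameters.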

And at some point we are simply left with models that are completely parametrized by data, with virtually zero (direct) influence from first principles. (I mean that at some point, as we get more data, we can't even use "first principles" to do model selection, because what we know through them is already incorporated into previous versions of the models. I.e., the information we gained from those principles was already used to make decisions in earlier iterations.)

Of course, then in theory we can do model distillation, and if there's some hidden small/elegant theory we can probably find it. (Which would be like a proof by contradiction, because it would mean that we found a model with the same predictive power but with smaller complexity than expected.)

// NB: it's 01:30 here, but independent of ignorance-o-clock ... it's quite possible I'm totally wrong about this, happy to read any criticism/replies


Isn’t that basically true of most of the fundamental laws of physics? There’s a lot we don’t understand about gravity, space, time, energy, etc., and yet we compose our observations of how they behave into very useful tools.


>there is no way to confirm you have discovered underlying principles unless they have some predictive power.

Yes, but a perfect oracle has no explanatory power, only predictive.


Increasing the volume of predictions produces patterns that often lead to underlying principles.


And much of the 20th century was characterized by a very similar progression - we had no clue what the actual mechanism of action was for hundreds of life saving drugs until relatively recently, and we still only have best guesses for many.

That doesn’t diminish the value that patients received in any way even though it would be more satisfying to make predictions and design something to interact in a way that exactly matches your theory.


We used the compass for navigation for centuries without any clue about what it was doing or why. Of course, a lot of people got lost, because compasses are not perfect. And the same will happen here. The theory of bounded rationality applies.


That ship sailed with quantum physics: nearly perfect at prediction, very poor at giving us a concrete understanding of what it all means.

This has happened before. Newtonian gravity was incomprehensible spooky action at a distance, but Einstein clarified gravity as the bending of spacetime.


I think this relies on either the word “concrete” or a particular choice of sense for “concrete understanding”.

Like, quantum mechanics doesn’t seem, to me, to just be a way of describing how to predict things. I view it as saying substantial things about how things are.

Sure, there are different interpretations of it, which make the same predictions, but, these different interpretations have a lot in common in terms of what they say about “how the world really is” - specifically, they have in common the parts that are just part of quantum mechanics.

The Tao that can be spoken in plain language without getting into the mathematics is not the eternal Tao, or whatever.


The goal of science has always been to predict the outcome of experiments, because that's what distinguishes science from philosophy or alchemy or faith. Anyone who believes that they've discovered an underlying principle is almost certainly mistaken; with time, "underlying principles" usually become discredited theories or, sometimes, useful but crude approximations that we teach to high schoolers and undergrads.

Prediction is understanding. What we call "understanding" is a cognitive illusion, generated by plausible but brittle abstractions. A statistically robust prediction is an explanation in itself; an explanation without predictive power explains nothing at all. Feeling like something makes sense is immeasurably inferior to being able to make accurate predictions.

Scientists are at the dawn of what chess players experienced in the 90s. Humans are just too stupid to say anything meaningful about chess. All of the grand theories we developed over centuries are just dumb heuristics that are grossly outmatched by an old smartphone running Stockfish. Maybe the computer understands chess, maybe it doesn't, but we humans certainly don't and we've made our peace with the fact that we never will. Moore's law does not apply to thinking meat.
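For concreteness, the oracle is one pip install away (a sketch assuming the python-chess package and a Stockfish binary on PATH):

    import chess
    import chess.engine

    engine = chess.engine.SimpleEngine.popen_uci("stockfish")
    board = chess.Board()  # the starting position
    info = engine.analyse(board, chess.engine.Limit(depth=20))
    print(info["score"])   # a precise evaluation, with no legible "why"
    engine.quit()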


Kepler famously compiled troves of data on the night sky, and just fitted some functions to them. He could not explain why but he could say what. Was he not a scientist?
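For the record, the fit in question is astonishingly clean. A minimal sketch with modern values (numpy assumed):

    # Semi-major axis a (AU) and orbital period T (years) for six planets.
    import numpy as np
    a = np.array([0.387, 0.723, 1.000, 1.524, 5.203, 9.537])
    T = np.array([0.241, 0.615, 1.000, 1.881, 11.862, 29.457])
    slope, _ = np.polyfit(np.log(a), np.log(T), 1)
    print(slope)  # ~1.5, i.e. T^2 proportional to a^3: Kepler's third law,
                  # found by curve fitting, only later explained by Newton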


He did attempt to explain why. Wikipedia: "On 4 February 1600, Kepler met Tycho Brahe....Tycho guarded his data closely, but was impressed by Kepler's theoretical ideas and soon allowed him more access. Kepler planned to test his theory from Mysterium Cosmographicum based on the Mars data, but he estimated that the work would take up to two years (since he was not allowed to simply copy the data for his own use)."


Mixed it up! I meant Tycho Brahe actually.


Sure he was. And then Newton came along and said it's all because of gravity and Kepler's laws were nothing but his laws of motion applied to planets.

Newton was a bit of a brat but everybody accepted his explanation. Then the problem turned to trying to explain gravity.

Thus science advances, one explanation at a time.


He might not have been able to explain why _but_ I'd bet anything he would have wanted to if he could.


Can underlying principles be discovered using the framework of the scientific method? The primary goal of the models and theories it develops is to support more experiments and eventually be disproven. If no model can be correct, complete, and provable in finite time, then a theory about underlying principles that claims completeness would have to be unfalsifiable. This is reasonable in the context of philosophy, but not in the natural sciences.

The scientific method can help us rule out what the underlying principles definitely are not. Any such principles are not actually there to be "discovered".

If probabilistic ML comes along and does a decent job at predicting things, we should keep in mind that those predictions are made not in the context of absolute truth, but in the context of theories and models we have previously developed. I.e., it's not just that it can predict how molecules interact, but that the entire concept of molecules is an artifact of just some model we (humans) came up with previously—a model which, per above, is probably incomplete/incorrect. (We could or should use this prediction to improve our model or come up with a better one, though.)

Even if a future ML product could be creative enough to actually come up with and iterate on models all on its own from first principles, it would not be able to give us the answer to the question of underlying principles, for the above-mentioned reasons. It could merely suggest to us another incomplete/incorrect model; to believe otherwise would be to ascribe it qualities more fit for religion than science.


I don't find that argument convincing.

People clearly have been able to discover many underlying principles using the scientific method. Then they have been able to explain and predict many complex phenomena using the discovered principles, and create even more complex phenomena based on that. Complex phenomena such as the technology we are using for this discussion.

Words don't have any inherent meaning, just the meaning they gain from usage. The entire concept of truth is an artifact of just some model (language) we came up with previously—a model which, per above, is probably incomplete/incorrect. The kind of absolute truth you are talking about may make sense when discussing philosophy or religion. Then there is another idea of truth more appropriate for talking about the empirical world: less absolute, less immutable, less certain, but more practical.


> The kind of absolute truth you are talking about may make sense when discussing philosophy or religion.

Exactly—except you are talking about it, too. When you say "discovering underlying principles", you are implying the idea of absolute truth where there is none—the principles are not discovered, they are modeled, and that model is our fallible human construct. It's a similar mistake to the one where you wrote "explain": every model (there should always be more than one) provides a metaphor that 1) first and foremost, jibes with our preexisting understanding of the world, and 2) offers a lossy map of some part of [directly inaccessible] reality from a particular angle—but not any sort of explanation with absolute truth in mind. Unless you treat the scientific method as something akin to religion, which is a common fallacy and philosophical laziness, it does not possess any explanatory powers—and that is very much by design.


Now we come back to words gaining their meaning from usage.

You are assigning meanings to words like "discovering", "principles", and "explain" that other people don't share, particularly people doing science. Because these absolute philosophical meanings are impossible in the real world, they are also useless when discussing reality. Reserving common words for impossible concepts would not make sense. It would only hinder communication.


I can see what you mean. Then perhaps you could give a non-circular definition of what you mean by "underlying principles", and how that is different from any other prediction or model so as to deserve this distinct and quite strong-sounding term? Or what you mean by "explain" that is different from "predict" or "model" to warrant such a distinctive term, and where exactly such explanatory activity fits within the scientific method?


Communication is inherently circular, and words don't have definitions. But people are often capable of discovering what a particular word means in a particular context. And other people can sometimes help that by giving useful explanations.

Science is pretty much the same. We can often discover how the reality works, and use the discoveries to explain many things. Somehow that keeps happening all the time, even if we can never be fully sure about anything.


Any word can be given a definition, that’s how we communicate. A non-circular definition is a way to define what you mean by a term to another person.

Again: the scientific method does not explain. Religion and philosophy are about explaining. The scientific method is about experimentation and making testable predictions. What experiments we perform is determined by how we understand the world, and if there is any subsequent explanation about "how things really are" (a.k.a. "the underlying principles"), then it has nothing to do with the scientific method, which makes no such claims by design; those are untestable/unfalsifiable beliefs, a product of either philosophical or religious thinking.

Since you insist on using the specific words "explain" and "discover", rather than the more conventional scientific terms "predict" or "model", it implies they mean something different to you. I have provided the meanings of "explain" and "discover" I am familiar with, as they apply to the discussion at hand (which is about the philosophy of the scientific process, underlying principles, and truths about objectively existing reality). If you refuse to identify the meanings in which you are using those words, I take it that you concede whatever point you had.


I've never met anyone capable of communicating with well-defined terms, or of giving definitions that actually match the real-world usage of a term. And all definitions are ultimately circular, because the number of words is finite. In any chain of definitions, you will eventually have to use a term you were trying to define.

What you call the scientific method is a philosophical construct that has little to do with actual science. And philosophers disagree on whether it's a good ideal for science. Given that it's neither a good description of science nor a universal ideal for science, I wouldn't focus too much on it when discussing science.


> And all definitions are ultimately circular, because the number of words is finite.

I can’t help thinking I’m talking to an LLM or a troll.

If you use a complex term that needs definition in a casual discussion, most likely none of the words you use in the definition will themselves require definitions—and if this were to happen repeatedly, the conversation would halt long before we ran out of words. It's enough to avoid circularity within a couple of levels, in good faith.

Anyway, I'm not sure whether we disagree or not, or what exactly we are arguing about. My point is "ML making predictions is not a threat to us getting at underlying principles, because natural science (scientific method, predicting things) in general does not lead us to any provable facts about those principles, and because ML would make predictions within an incorrect/incomplete model that we gave it." In that, by "underlying principles" I mean some statements about objective reality. If we are on the same page here, we can continue the discussion; otherwise let's not.


It's an analogy. All communication is ultimately circular, and we can never be sure that we understand the terms the same way as the other party. Still, people often seem to be able to communicate.

Similarly, the scientific method cannot discover the underlying principles or explain nature. It can only rule out principles and explanations. Regardless, science seems to come up with principles and explanations all the time.

And that's because the scientific method is not science. It's a theoretical model for a subset of (often ritualized) activities within science. Actual science is more than that. It can do things the scientific method cannot, because it's less focused on philosophical ideas such as absolute truth or provable facts.

In my experience, scientific method is like a picture of an elephant drawn from a written description. Drawn by someone who has never seen the animal or a picture of it, and who has no idea what kind of an animal it is. There are some recognizable elements, but it definitely does not look like the real thing.


Sorry, what’s less focused, scientific method or “actual science”?


What if the underlying principles of the universe are too complex for human understanding but we can train a model that very closely follows them?


Then we should dedicate large fractions of human engineering towards finding ethical ways to improve human intelligence so that we can appreciate the underlying principles better.


I spent about 30 minutes reading this thread and links from it, and I don't really follow your line of argument. I find it fascinating and well-communicated; the lack of understanding is on me: my attention flits around like a butterfly, in a way that makes it hard for me to follow people writing original content.

High level, I see a distinction between theory and practice: between an oracle predicting without explanation, and a well-thought-out theory built on a partnership between theory and experiment over centuries, e.g. gravity.

I have this feeling I can't shake that the knife you're using is too sharp, both in the specific example we're discussing, and in general.

In the specific example, folding, my understanding is we know how proteins fold & the mechanisms at work. It just takes an ungodly amount of time to compute and you'd still confirm with reality anyway. I might be completely wrong on that.

Given that, the proposal to "dedicate...engineer[s] towards finding ethical ways to improve...intelligence so that we can appreciate the underlying principles better" raises the question of whether we are failing to appreciate the underlying principles now.

It feels like a close cousin of the physics theorist/experimentalist debate pre-LHC, circa 2006: the experimentalists wanted more focus on building colliders or new experimental methods, and at the extremes thought string theory was a complete waste of time.

Which was working towards appreciating the underlying principles?

I don't really know. I'm not sure there's a strong divide between the work of recording reality and explaining it. I'll peer into a microscope in the afternoon, and take a shower in the evening, and all of a sudden, free associating gives me a more high-minded explanation for what I saw.

I'm not sure a distinction exists for protein folding; actually, I'm virtually certain this distinction does not exist in reality, only in extremely stilted examples (e.g. a very successful oracle at Delphi).


There's a much easier route: consciousness is not included in the discussion...what a coincidence.


That sounds like useful engineering, but not useful science.


I think that a lot of scientific discoveries originate from initial observations made during engineering work or just out of curiosity without rigour.

Not saying ML methods haven't shown important reproducibility challenges, but to just shut them down due to not being "useful science" is inflexible.


What if it turns out that nature simply doesn't have nice, neat models that humans can comprehend for many observable phenomena?


I read an article arguing that the "unreasonable effectiveness of mathematics" was basically the result of a drunk looking for his keys under a lamppost, because that's where the light is. We know how to use math to model parts of the world, and everywhere we look there's _something_ we can model with math, but that doesn't mean that's all there is to the universe. We could be understanding .0000001% of what's out there to understand, and it's precisely the stuff that's amenable to mathematical analysis.


The ML model can also be an emulator of parts of the system that you don't want to personally understand, to help you get on with focusing on what you do want to figure out. Alternatively, the ML model can pretend to be the real world while you do experiments on it, to figure out aspects of nature in minutes rather than the hours or days of biological turnaround.
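A sketch of that emulator idea, with a synthetic stand-in for the slow experiment (assumes scikit-learn and numpy):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    def slow_experiment(x):                # stand-in for days of wet-lab work
        return np.sin(3 * x).ravel()

    X_costly = np.linspace(0, 2, 15).reshape(-1, 1)   # 15 real runs
    emulator = GaussianProcessRegressor().fit(X_costly, slow_experiment(X_costly))

    X_free = np.linspace(0, 2, 200).reshape(-1, 1)    # 200 instant "runs"
    mean, std = emulator.predict(X_free, return_std=True)

The uncertainty estimate (std) even tells you where the emulator is extrapolating and the real experiment is still needed.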


The machine understands, we do not, and so it is not science?

Can we differentiate?


Maybe the science of the past was studying things of lesser complexity than the things we are studying now.


If you have an oracle that can predict the outcome of experiments, does it _matter_ whether you understand why?


AFAIK in wet science you need (or needed) to do tons of experimentation, with liquids of specific molar compositions and temperatures splurging in and out of test tubes: basically just physically navigating a search space. I would view an AI model with super-powerful guesstimation capability as a much faster way of A) cutting through the search space and B) providing accidental discoveries along the way.

Now, if we look at the history of science and technology, there is a shit ton of practical stuff that was found only by pure accident: discoveries that could not have been predicted from any previous theory.

I would view both A) and B) as net positives. But our teaching of the next generation of scientists needs to adapt.

The worst-case scenario is of course that the middle-management-driven enshittification of science proceeds to the point where only a few people are actually scientists rather than glorified accountants. But I'm optimistic this will actually supercharge science.

With good luck we will get rid of both of the biggest pathologies in modern science: 1. the number of papers published and refereed as a KPI, and 2. hype-driven, super-politicized funding where you can focus on only one topic "because that's what's hot" (i.e. string theory).

The best possible outcome is that we get excitement and creativity back into science, plus level up our tech this century to something totally unforeseen (singularity? That’s just a word for “we don’t know what’s gonna happen” - not a specific concrete forecasted scenario).


> singularity? That’s just a word for “we don’t know what’s gonna happen” - not a specific concrete forecasted scenario

It's more specific than you make it out to be. The singularity idea is that smart AIs working on improving AI will produce smarter AIs, leading to an ever-increasing curve that at some point hits a mathematical singularity.
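The toy math behind that picture (illustrative only, not a forecast): if capability x grows at a rate proportional to x^2 rather than to x, then

    \frac{dx}{dt} = x^2, \qquad x(0) = x_0 > 0 \quad\Longrightarrow\quad x(t) = \frac{x_0}{1 - x_0 t},

which blows up in finite time at t = 1/x_0, whereas merely exponential growth (dx/dt = kx) never diverges. Whether AI progress follows anything like that curve is, of course, the entire debate.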


No, it's not at all specific in predicting technological progress, which was the point of my comment.

Nobody knows what singularity would actually mean from the point of view of specific technological development.


They offered a good tool for science... so this is a part of science.



