God I hope this becomes a trend in other fields as well.
Spending time in universities has made me very cynical about the research that comes out of them. There is just too much incentive for profs to ignore biases in their research. I've seen it happen many times. For them, it is the difference between becoming a prof or a lab tech, and it is literally worth millions.
You cannot trust any research done where the career of the researcher depends on them finding results.
Universities, by hiring based on research credentials that currently translate roughly into the number of positive results a person has generated, render the research going on in their departments completely worthless.
Aggravating the situation is the fact that the peer review system is too incestuous to be relied on, especially when the peers are probably also 'bias ignorers' with incentives to keep the flaws in the system.
I strongly believe that for hiring purposes, the research skills of profs should be evaluated on criteria that are incidental to research: mostly math, probability, and statistics (yes, even for psychology), then methodology skills, and maybe also leadership, communication skills, and dedication to science (last only because it is difficult to measure).
"I take the hard view that science involves the creation of testable hypotheses" - Michael Crichton
I'm not trying to be a troll. Really, I promise. But I really hope that the next field they do this in is Climate Science, specifically articles on climate change. It seems like the bulk of their work is based on computer models, and I just don't know how you're supposed to replicate any of that. How do you isolate the millions of variables to show cause and effect with a closed-source computer model? That is my biggest problem with climate science.
> "It seems like bulk of their work is based on computer models"
Then it seems that you are confused. Climate science is about gathering data, putting forth a testable hypothesis and seeing how future data fits the prediction. Yes, they use computer models. In the same way that you or I might gather data from a wind tunnel test and make a computer model of wind flow from that data to extrapolate a theory and create new testable hypotheses about a more aerodynamic shape.
But so long as new data is used to test those hypotheses, the use of a computer model is irrelevant. Science is being done. Testable hypotheses are being created and new data is being compared to those predictions. As long as we aren't running around half-cocked, believing the predictions of hypotheses that have not yet been tested, there's no concern.
And the core hypotheses under the umbrella of Climate science have been tested. Some disproved by data, leading to changes in thinking, further research and new testable theories. Some supported by data. None of the 'alternate' hypotheses can say the same. (No, it's not the solar cycle. No, it's not distorted temperature readings from data stations in urban areas.)
The very reason that there is a general consensus around the so-called 'tent-pole' theories in Climate Science is precisely because they have been tested and have been shown to make accurate predictions and continue to do so. Some of them, for decades now.
Somehow I find it uncomfortable that the complexity of factors affecting something as complex as climate gets reduced to a wind tunnel test.
One is a highly controlled scenario where all (or almost all) of the externalities can be controlled. The other is some extremely complex modeling, which will, at least in my view, always depend on some consensus among the participants about which factors are more important than others.
I am far from a climate change skeptic, but the best example of the fundamental failure to grasp a complex scenario with factor-based prediction systems is the complete failure of economics to predict something as "simple" as the financial crisis. But then again, I am coming from a social science perspective, assuming that as soon as humans are involved, data and data analysis will always be subjective.
Predicting the financial crisis is probably orders of magnitude more complex than climate change. Climate science is based on well-known (although complex) systems of equations and laws that have been studied independently from each other for hundreds of years (heat diffusion, fluid flow, etc.). The tricky part is combining all the pieces in a system the size of the planet, where a lot of quantities are still relatively unknown.
Compare that with the financial crisis, where there are no hard laws and the behavior of the system depends directly and inextricably on how a large group of people scattered around the world will react to certain hypothetical scenarios at an unknown point in history. Will investor A be rational? Panic? How about B? What happens if B sees A panicking? Or vice versa?
I'm not an economist, but I can give you an example from epidemiology that I worked on recently. Starting with one of the simplest epidemic models (SIR), as soon as you add a simple behavioral assumption (people can become afraid of the disease and behave more carefully), the behavior of the system changes completely.
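Here is a minimal sketch of what I mean; the parameter values and the exact form of the "fear" term are illustrative assumptions, not the ones from my actual study:

    # Minimal SIR sketch: adding a simple "fear" feedback (people reduce
    # contacts as infections rise) changes the epidemic curve substantially.
    # All parameter values here are illustrative, not from any real study.

    def run_sir(beta0=0.4, gamma=0.1, fear=0.0, days=200, dt=0.1, i0=1e-3):
        s, i, r = 1.0 - i0, i0, 0.0
        peak = i
        for _ in range(int(days / dt)):
            beta = beta0 / (1.0 + fear * i)   # contact rate shrinks as infections grow
            ds = -beta * s * i
            di = beta * s * i - gamma * i
            dr = gamma * i
            s, i, r = s + ds * dt, i + di * dt, r + dr * dt
            peak = max(peak, i)
        return peak, r   # peak prevalence and recovered fraction at the end

    print("no behaviour change:", run_sir(fear=0.0))
    print("with fear feedback: ", run_sir(fear=50.0))

With the feedback switched on, the peak is far lower and the dynamics look qualitatively different, even though the "disease" itself is identical.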
Predicting the financial crisis is probably orders of magnitude more complex than climate change.
Orders of magnitude simpler, I'd think. "If real wages and household incomes keep dropping while household indebtedness keeps rising, you will eventually have a financial crisis."
Marx never needed any math to work this out, just the good sense to realize that money never appears out of thin air.
There's as much content to that prediction as there is to the prediction "if you release carbon dioxide into the environment at an accelerated rate, the average temperature of the Earth will increase".
What people are meaning by "prediction" here is something much more specific than what you mean and you know it.
You are not "far from a climate change skeptic". You are a climate change skeptic. At best, your post is the vague claim that complex systems cannot be modeled or tested empirically, and is empirically wrong in its claim that no-one predicted the financial crisis.
Chaotic systems have a horizon of predictability (http://www.britannica.com/EBchecked/media/3227/Sensitivity-o...), so climate models are still useful to an extent. They will tell us, with some degree of accuracy, roughly what will happen in the near future. The long term remains a mystery and there's little that can be done about that.
It goes much too far to say that the climate is chaotic and therefore totally unpredictable in the long run. After all, if you buy that whole hog, you're also led to say that if you moved the Earth to Mercury's orbit it would be impossible to predict that the Earth would get hotter, in the long term. This results from the flaw of confusing weather (which is indeed chaotic) with statistics about weather (i.e. climate).
A butterfly flapping its wings can cause a hurricane. But it can also prevent a hurricane. All those perturbations don't change the fact that, within an order of magnitude, we can predict how many hurricanes will occur in a given year. Alternatively, think about the classical three body problem: even though it's the preeminent example of a chaotic system, you can still make broad predictions about it. Center of mass, average speed of the bodies, average collision time. This isn't some obscure controversial field, either: in physics, these emergent statistical properties form the basis of statistical mechanics and therefore thermodynamics as a whole.
Chaos doesn't mean you have to throw your hands up and give up on making any long term predictions at all about a system.
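A toy illustration, not a climate model: the logistic map is about as chaotic as systems get, yet the long-run statistics of a trajectory are perfectly stable. It's the weather-versus-climate distinction in miniature.

    # The logistic map at r = 4 is chaotic: trajectories from nearby starting
    # points diverge quickly. But a long-run statistic of the trajectory (its
    # mean) barely moves when the initial condition is perturbed.

    def trajectory(x0, r=4.0, n=100_000):
        xs, x = [], x0
        for _ in range(n):
            x = r * x * (1.0 - x)
            xs.append(x)
        return xs

    a = trajectory(0.2)
    b = trajectory(0.2 + 1e-9)   # tiny perturbation of the initial state

    # Point-by-point "weather" prediction fails after a short horizon...
    print("difference at step 60:", abs(a[60] - b[60]))
    # ...but the "climate" statistic is essentially unchanged.
    print("long-run means:", sum(a) / len(a), sum(b) / len(b))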
>Chaos doesn't mean you have to throw your hands up and give up on making any long term predictions at all about a system.
But we can't tell what the weather will be in 45 days in New York.
One of the attributes of chaotic systems is their self-similarity. The behavior on one level is similar to the behavior on another - in this case weather is similar to climate. We can make predictions about the weather in the next few days, just as we can with climate, but the next few months or years (or centuries or millennia for climate) is another story. The tiny perturbations in the system which we do not model (the butterflies) end up throwing our predictions off in the long run.
Yes, and that's why nobody is claiming to be able to predict the weather in the long run. Nevertheless, some things still can be predicted, even in chaotic systems. What is your point?
My point is that pollution should be reduced, climate models should still be developed and that at the same time we shouldn't take their predictions as dogma and continue to question them.
I agree with all of what you say above and am skeptical of the models as they exist now, as another comment of mine elsewhere suggests.
I'm more interested in the theoretical question here, though. I'm unconvinced that self-similarity in chaotic systems means that since weather is a chaotic system, climate is itself a chaotic system that's unpredictable in the long term. One way of thinking about chaos is that at some point in the future, a system's position in phase space will become wholly uncorrelated with the system's initial position.
So for weather, tomorrow's weather is correlated with today's weather, but weather two weeks from now (or whenever) is wholly uncorrelated. But with climate you do end up with persistent correlations, year on year, decade on decade, millennia on millennia. Temperate regions show (anti-)correlations of deviation from annual average temperature between 180-day periods, and daily solar energy impingement is correlated with positive deviation from global mean temperature.
I've always made a mental analogy between that and a gas (the n-body problem): even though the positions of gas molecules rapidly become uncorrelated from their starting position, certain statistical properties--e.g., the pressure the gas exerts on the walls of its container--are highly correlated from one moment to another. Not just initially but into the indefinite future.
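Here's a rough numerical version of that analogy; it uses non-interacting particles in a 1-D box, so it's only a cartoon of a real gas, and every number is arbitrary. Individual positions decorrelate from their starting values, while the "pressure" on the walls stays essentially constant from one time window to the next.

    import random

    random.seed(1)
    N = 500
    pos = [random.random() for _ in range(N)]          # positions in a box [0, 1]
    vel = [random.gauss(0.0, 1.0) for _ in range(N)]   # fixed random velocities

    def step(dt=0.01):
        """Advance all particles; return the momentum delivered to the walls."""
        impulse = 0.0
        for i in range(N):
            pos[i] += vel[i] * dt
            if pos[i] < 0.0 or pos[i] > 1.0:           # elastic bounce off a wall
                impulse += 2 * abs(vel[i])             # transfers 2*m*|v| (with m = 1)
                vel[i] = -vel[i]
                pos[i] = min(max(pos[i], 0.0), 1.0)
        return impulse

    start = pos[:]

    def cov_with_start():
        return sum((a - 0.5) * (b - 0.5) for a, b in zip(start, pos)) / N

    print("initial covariance with starting positions:", cov_with_start())
    for window in range(5):
        pressure = sum(step() for _ in range(2000))    # impulse per fixed time window
        print(f"window {window}: wall impulse ~ {pressure:.0f}, covariance ~ {cov_with_start():.4f}")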
I'm open to arguments that that analogy is flawed and could see some assumptions that I'm not 100% confident in, but I don't see any obvious flaws. What's going wrong here, in your view?
The point is, while the models are inaccurate, the data (and there is a lot of data) indicate that warming is happening. This is shown not merely in temperature measurements, but in ice coverage changes, both in the Arctic Ocean and at mountaintops; and in biological change: increased growing seasons, earlier spring flowering, earlier bird migrations (and in some cases increased overwintering bird populations), increased ranges of species, including pests; not to mention permafrost melting and increased droughts and other severe weather anomalies in many areas.
That argumentation is a bit silly. Chaotic systems are not necessarily unpredictable. For example, take the pendulum that swings among magnets (one of those classic examples). The exact path of the pendulum might be unpredictable, but it is predictable that it will eventually reach equilibrium.
What I mean is: some aspects can be predictable while others are not predictable. Therefore your argument is wrong.
Based on the second claim of stfu’s post, “as soon as humans are involved data and data analysis will always be subjective”, I think he is really trying to highlight the unpredictability of a model that must generalize human choices using statistics. Since even conservative scientists will agree that climate change has some anthropogenic dimension, we should understand the limitations of modeling it.
Being skeptical of the model, however, does not prohibit you from accepting what the model is suggesting. Historical data and scientific opinion can be enough to convince us that climate change is occurring. But no model can predict how our reaction to that knowledge will affect climate change in the future.
As long as they are producing testable hypotheses and gathering new data to (in)validate their predictions, they are doing science. And if their theories make accurate predictions, it literally does not matter if those theories came from computer modelling, mathematical proofs, inspiration or divination.
If we could make accurate predictions about the function of the universe by casting chicken bones, that would be science. Even though we wouldn't have the first clue what mechanisms might make it so.
There's no shortage of such situations in science, where theories produce useful results despite no understanding of the underlying, clearly complex systems that drive them. e.g. The Placebo Effect, Gravity, etc.
You seem to be arguing that every time science approaches a complex situation, which it doesn't understand and can't explain from first principles, that we cannot apply the Scientific Method. And that is patently false. That is exactly how we have come as far as we have. By creating falsifiable hypotheses, testing their predictions and refining those hypotheses.
Your hand-waving doubts and fears about complexity and methods are fundamentally anti-science. You are not only a 'climate skeptic', you are apparently skeptical of man's ability to use science to investigate and understand the universe at all.
> One is a highly controlled scenario where all (or almost all) of the externalities can be controlled. The other is some extremely complex modeling, which will, at least in my view, always depend on some consensus among the participants about which factors are more important than others.
You do know that precisely modeling just a few seconds of turbulent flow would take more computer time than all the climate simulations in history, right? They're both approximations of intractably complex situations.
The mathematics of economies falls in the field of evolutionary dynamics. This is a relatively young field, having its origins in the 1960-1970 time frame, and much of the fundamental principles are still being worked out. For one thing, evolutionary dynamics necessarily involves quite a bit of graph theory, much of which is difficult or impossible to describe with continuous functions.
The mathematics of climate, on the other hand, falls in the field of statistical mechanics. While still a rather complicated field, statistical mechanics is much older, starting with Boltzmann in the late 19th century, and considerably more of the principles have been worked out. Statistical mechanics also deals more with continuous functions which, while not exactly easy to work with, can still be manipulated more easily than graphs.
It will never be possible to credibly predict financial crises: if you learn that in a month your stocks are going to drop 50% you will try to sell them now. But, since other people have heard the same prediction, they will do the same, and the drop would happen today, rather than in a month. Such a prediction can only be possible if it's made by someone people do not believe. Which is, accidentally, completely in agreement with very classic economic theory.
That being said, you are right in that climate models are woefully inadequate.
Complete failure of neoclassical economics, perhaps. There were plenty of people who predicted the GFC with varying degrees of accuracy; Steve Keen being but one.
The problem is that these people (and others) don't know enough to realize their economic predictions are worthless. There is truth to the old saw, "Economists are good at predicting recessions. They've predicted eight of the last three. . . "
In many cases the prognosticators are fully aware that their predictions are worthless but persist in the facade to build up a reputation. It takes only a few lucky coincidences to get set up for life. People love to hear bold predictions.
There's an interesting issue in climate science, and I say this as someone who thinks that dealing with climate change is probably the most important task of our generation. The (small subset of) professional statisticians I've spoken to are similarly suspicious of climate models (not climate change itself), though for more technical reasons than you mention.
At the same time, the physics dictate that the general sign and order of magnitude of CO2 forcing in climate models is correct, as surely as doubling solar output would increase mean surface temperatures. The physics also don't preclude the effects being even more severe than the models predict. But there's considerable uncertainty involved in both directions, and when people hear "well, things might end up better than scientists think is the average case, and at no cost to us!" they jump for that option and ignore the average prediction, let alone the worst case scenario that's within plausibility.
Coupled with an extremely well-funded group of fossil fuel industrialists putting hundreds of millions of dollars into the outright shutting down of climate research, good scientists and especially not-at-all-scientific activists end up on the defensive, overemphasizing the finality of the models and using them as solid predictions instead of tools to vindicate the general thrust of the physics.
As far as your particular point, think of the models in climate journals as tools to understand the issue instead of the final word. It is somewhere between very difficult and impossible to come up with a climate model where CO2 forcing doesn't cause significant warming, but individual parts of those models need to be and often are tested. Indeed, those are the main points of dispute in the legitimate research and end up being thoroughly vetted.
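To put rough numbers on "the sign and order of magnitude are dictated by the physics, but the spread is wide", here is a back-of-the-envelope sketch using the standard simplified logarithmic expression for CO2 forcing; the range of sensitivity values below is my own illustrative assumption, not a quote from any particular model:

    import math

    def co2_forcing(c_ppm, c0_ppm=280.0):
        # Widely used simplified approximation: dF ~= 5.35 * ln(C / C0) W/m^2
        return 5.35 * math.log(c_ppm / c0_ppm)

    forcing_2x = co2_forcing(560.0)       # a doubling of pre-industrial CO2
    for sensitivity in (0.4, 0.8, 1.2):   # assumed K of warming per W/m^2
        print(f"lambda = {sensitivity}: equilibrium warming ~ {sensitivity * forcing_2x:.1f} K")

The sign never flips and the forcing is a few watts per square metre regardless, but the implied warming spans roughly a factor of three across plausible sensitivities, which is exactly the uncertainty people latch onto.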
There are a billion dollars worth of research money for group-think infested alarmist journals for every million dollars available from the fossil fuels industry. Your life as a climate scientist is MUCH easier if you shut up and find a hockey stick than if you take Exxon funding.
I'll try to interpret this as charitably as possible =) I think there are two separate issues you're bringing up here.
The first is the implication that climatologists are all involved in a conspiracy with power hungry politicians to institute a kind of global eco-Stalinism, in exchange for research grants. It's the only imaginable scheme where you can treat all government-funded ecology, meteorology, clean-energy tech, and climatology research as part of a coherent but corrupt bargain (As you must to get your billions of dollars figure. I would also add that your mere seven figure fossil fuel budget is grossly underestimated.). Frankly, I don't think you actually believe it, as it's implausible rhetoric that could come straight from the fevered fantasies of Fox News and Rush Limbaugh.
The other is that there's group think among scientists. The idea seems to be that academics and scientists are all too often willing to get caught up in petty vendettas, battles for turf and recognition, and back scratching, instead of focusing on the angelically pure pursuit of knowledge. The issue with that is... well, there isn't an issue. It's totally true, as anyone who's spent much time in research knows. Hence the CRU emails.
It's a fair criticism. But science has soldiered on despite it through the centuries, and scientific institutions, even being plagued with those flaws, have consistently produced better explanations of the world than hacks-for-hire employed by Big Tobacco, Lysenkoist Communists, or the fossil fuel industry.
There's only one small problem with your reasoning: the Lysenkoists called their work "science" as well. As do of course the other two.
Unless you adopt a tautological definition, in which "science" does not include pseudoscience, "science" is whatever the people in your society who practice and organize it choose to call "science."
More specifically, since basically all "science" is government-funded, you'll find that your actual working definition of "science" is "whatever my government funds and calls science."
So your statement boils down to: climatology can't be pseudoscience, because it's funded by the US Government. And Washington (unlike Moscow) would never fund pseudoscience, and call it "science."
This is a pretty interesting epistemology to say the least. Do I have it right? If not, where's the error? If so, what information do you, as (no doubt) a rationalist, have about the US Government that justifies this extension of trust?
And if USG is not the institution you're trusting, what is? What set of human beings are you investing your trust in? If the field of climate science as presently practiced was not in fact scientific, but rather pseudoscientific, who would you expect to have stepped in and shut it down?
[Edit: see also the links to the actual funding levels a couple of posts down. If you're interested in reconsidering your position on this issue, the blog to read is Steve McIntyre's.]
You're welcome. Actually, the funding for the actual climate skeptics who are actually fighting this goliath is so close to zero as to be indistinguishable. The significant players are all basically amateur bloggers. I think Anthony Watts got $40K from Heartland, but AFAIK it was for a contract not officially related to his blog at all.
The good news is - there's really nothing wrong with being wrong by five orders of magnitude. So long as you keep an open mind and are willing to learn from it.
I think computer models just mean you do replication the other way around. Of course anyone who runs the same program gets the same results, so ordinary replication is pointless--instead you answer the same question with a totally different program and see if your answer is close.
And then, of course, you wait and see if the predictions of the models come true. You can't reset the world and re-test it, but you can re-run the models and ask for more prediction in the future and wait some more.
Climate scientists do both of those things all the time because they're in one of the most heavily scrutinized fields.
Isolating variables just means you compare two setups with everything the same except one. That's actually one of the things the models are for. You can't re-run the world without humans, but you can re-run the model with humans turned off.
Then someone else can do the same with their totally different model. And if both of your answers match reality with humans on and each other with humans off, well maybe the difference between humans on and humans off is the impact of humans. Or maybe not. But adding more different models helps.
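A cartoon of that comparison, with both "models" below being made-up toys and all numbers arbitrary; the point is only the methodology of flipping one switch and taking the difference:

    def model_a(humans_on):
        natural = 0.1                        # warming from natural forcing (made up)
        anthropogenic = 0.9 if humans_on else 0.0
        return natural + anthropogenic       # simulated warming, K

    def model_b(humans_on):                  # an independently built toy model
        return 0.15 + (0.85 if humans_on else 0.0)

    for model in (model_a, model_b):
        attribution = model(humans_on=True) - model(humans_on=False)
        print(model.__name__, "attributes", attribution, "K of warming to humans")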
In short: climate science generates testable hypotheses, does replication, and isolates variables. It's possible they're wrong (and publishing source code is a good idea) but they don't have a methodology problem. And they're probably right.
Also, Michael Crichton basically wrote Hollywood scripts in novel form. He's not a good source on anything.
> Of course anyone who runs the same program gets the same results, so ordinary replication is pointless
That would be true, if anyone actually distributed their actual code. Pick a journal article at random in any field that describes results from a computational model, and 99% of the paper will describe the results and not the model. The paper will never contain the complete code (which is fair enough, since it would be too long); 1 paper out of 100 will have excerpts of the code, and another 10/100 will have a URL that claims to have the code. If you actually follow that link, you'll find that 2-3 times out of 10 the code won't actually compile or run, and 9 times out of 10 the figures in the paper were generated by tweaking some parameters not defined in the paper, whose particular values the author never recorded, so even the author couldn't reproduce what he actually published, even if he wanted to.
Perhaps you mean this in the sense that we should trust in science and not in authority? Because as hack writers go, his academic qualifications are better than most. He's an intelligent guy, and it seems fair to give at least some weight to his opinion. Equally, it seems unfair to presume a priori that he's not a good source of information. At the least, you should argue "He's looney about X, which we all agree is false, therefore we should not trust him on Y". Simply saying "He's looney about X" doesn't add much information and merely pits your authority against his.
I agree with your summary of computer models, and your basic judgement of climate scientists, but fear that many of the influential climate science papers don't adhere to this standard. Frequently the technique is to tweak the parameters of a number of models until each creates "realistic" results, generate a small number (1-3) of simulations with each model, and then create an unweighted average of this ensemble so that the high and low estimates cancel. The meaning of this is much harder to interpret than the case where all the models predict a similar outcome with identical inputs.
CRICHTON, (John) Michael. American. Born in Chicago, Illinois, October 23, 1942. Died in Los Angeles, November 4, 2008. Educated at Harvard University, Cambridge, Massachusetts, A.B. (summa cum laude) 1964 (Phi Beta Kappa). Henry Russell Shaw Travelling Fellow, 1964-65. Visiting Lecturer in Anthropology at Cambridge University, England, 1965. Graduated Harvard Medical School, M.D. 1969; post-doctoral fellow at the Salk Institute for Biological Sciences, La Jolla, California 1969-1970. Visiting Writer, Massachusetts Institute of Technology, 1988.
Crichton's scientific standards can't have been THAT high, considering how he got hoodwinked into believing Jack Houck's whole spoon bending spiel. I've attended one of Houck's parties and the whole thing was quite remarkable, not for the spoon bending but for the gullibility of 90% of the people there.
That certainly could be a legitimate case of "He's looney about X therefore we shouldn't trust him about Y", but I'm not familiar with either Houck or Crichton's beliefs. The quote I can find from him on it seems within reason: 'I think that spoon bending is not "psychic" or bugga-bugga. It's something pretty normal, but we don't understand it. So we deny its existence.'
On the other hand, he also says "More than seeing adults bend spoons (they might be using brute force to do it, although if you believe that I suggest you try, with your bare hands, to bend a decent-weight spoon from the tip of the bowl back to the handle. I think you'd need a vise.)"
I just grabbed a spoon and tried it. As expected, contra Crichton, I had no trouble bending the handle to touch the bowl -- no vise required. And no trouble twisting it 360 degrees after bending. But it would be a little surprising that one could exert that much force without noticing. And there was some interesting annealing and tempering going on: it was much harder to untwist than to twist, easier to unbend than to bend, and subsequent bends preferred new locations to repeat bending. So the scale tips a little toward looney, but I'd have to read more before discounting him. And I'm willing to believe there might be some metallurgical property worth exploring here, although I'm pretty sure it has nothing to do with telekinetics or psychics.
But you've actually been to a spoon bending party, and I haven't. Do you have a loonier link?
Funny anecdote - I honestly tried to use his method of 'feeling the metal get soft and then quickly use this moment to bend the spoon'. I didn't feel anything, so after seeing everyone around me get into some kind of ecstasy, I decided to actually bend the spoon to get an idea of how hard it was. It wasn't hard at all! (Just get some low quality, cheap spoons and forks, they're very easy to bend.) Now, when Jack Houck came around, I showed him my spoon with a sad face and told him it hadn't worked for me, and I had just 'used my muscles' to bend it. He took a few moments to examine it, then proclaimed that he could see in the metal that it had actually melted, that there were features inconsistent with 'cold bending' and that I had very great mindpower but just didn't realize it.
The crowd at this party was very much into new age stuff, crystal healing and all that. In fact, Jack Houck was doing a seminar the next day to teach people healing powers using the same 'energy' that was used to bend spoons, which he had come to consider as a party trick of little interest compared to the healing powers.
Go read James Hansen's 1981 predictions, and its match with current reality, and weep. Climate change is real, and is probably going to be disastrously significant in the lives of many now alive.
I think we see this graph very differently. I see a linear rise in measured temperature, and a prediction that shows an inflection point at the year 2000. If I were trying to match the actual to the prediction, without knowing the labels, I'd probably match it to 2c. This turns out to be the hypothetical coal phaseout in 2000.
After looking for a bit, my questioning side kicks in:
How did they choose the zero point for the overlay? Since it was published 1981, shouldn't the measured match up with the prediction until then? If not, why not?
For that matter, what exactly is the overlay? Is the pink line a time-smoothed average of the red, or penciled in by hand? Would it create a different impression if extended to 2012?
In what way is a prediction that is 30% off a good prediction? Has it been long enough that we should have seen the change in slope? If anything it seems like it levels off.
I don't think that I ask these questions because I'm a "climate skeptic", but because I'm "generally skeptical". If you tell me a graph tells me something, my first instinct is to doubt you, and then see if the evidence supports your position.
This one feels more fuzzy to me than terrifying, and there are lots of things about the direction of the world that terrify me. What about it makes you weep more than the rest of the news?
I agree with you about the rest of the news. This (and the whole climate denial[not skepticism]) is just one more piece of ongoing tragedy, which is going to get quite a bit worse before it gets better (if it does).
As someone who accepts global warming (and is uncertain how much alarmism is warranted), I completely agree with you. The best way to debunk (or validate) the climate skeptics is to attack the theory as vigorously as possible, and to see whether it holds up or not.
Of course, I still think in the meantime we should be building nuclear plants, investing in solar tech, etc, but there are many reasons to wean ourselves off of oil and coal other than just climate change.
You are very right. Things like rising cancer rates, alarming increases in extinctions, massive destruction of natural habitats, destabilization of entire ecosystems, etc. Some may attack anthropogenic climate change despite strong evidence, rightly or ignorantly, but the degree to which actual environmental damage happening right now gets ignored is absurd (and tragically common). Some of it is likely due to global warming, but there are a lot of other externalities that are really dangerous for our planet and even our species.
You are soooo right. One of the worst fields, in my opinion, is nutritional science. Gary Taubes has written two books, "Good Calories, Bad Calories" and "Why We Get Fat" which eviscerate the core studies that are foundational to modern nutritional science.
Not all fields rely on grants -- but in all fields the careers depend on results. Which is not a bad thing in itself; but it would have been nice if "results" were not quite as narrowly defined.
> I strongly believe that for hiring purposes, the research skills of profs should be evaluated on criteria that are incidental to research: mostly math, probability, and statistics (yes, even for psychology), then methodology skills, and maybe also leadership, communication skills, and dedication to science (last only because it is difficult to measure).
I like the sentiment, but the problem is that to do good research requires real initiative and creativity. Testing for the skills you listed would filter for people who are skilled, but it doesn't distinguish at all between those who can do new research versus those who are merely good at following instructions from other people.
What's your evidence for that? Don't you think it's possible that initiative and creativity are important for getting your research noticed by jey on Hacker News, but not for actually doing the research?
"When I was at Cornell, I often talked to the people in the psychology department. One of the students told me she wanted to do an experiment that went something like this--it had been found by others that under certain circumstances, X, rats did something, A. She was curious as to whether, if she changed the circumstances to Y, they would still do A. So her proposal was to do the experiment under circumstances Y and see if they still did A.
I explained to her that it was necessary first to repeat in her laboratory the experiment of the other person--to do it under condition X to see if she could also get result A, and then change to Y and see if A changed. Then she would know that the real difference was the thing she thought she had under control.
She was very delighted with this new idea, and went to her professor. And his reply was, no, you cannot do that, because the experiment has already been done and you would be wasting time. This was in about 1947 or so, and it seems to have been the general policy then to not try to repeat psychological experiments, but only to change the conditions and see what happens."
What is interesting is that they are engaging in an activity that wishes to avoid reproducibility. They allow "facts" and "ideas" to swap.
Articles like http://ir.canterbury.ac.nz/handle/10092/5828 exist. There's the Philosophical Foundations of Neuroscience. All that theorizing needs to record facts. To do that, we need to be able to follow what the hell is going on without blatant wankery like with Anthony Crick or early John Searle's "foot in the mind" rubbish or Dennett's everything's-a-scientist metaphysics. Philosophers got the descriptivist bug with Experimental Philosophy. Linguistics has always been largely descriptivist. Now psychology is to do the same. Who cares that they may have false theories, let's see if they've described anything.
As Frasier obviously demonstrates, psychologists have had their Golden Age. Psychology needs management.
Small sample sizes and publication bias are a lethal combination in any field.
Suppose scientists want to test whether a certain coin is biased toward heads, when in fact the coin is fair. Due to budgetary issues, scientists are only able to toss the coin 5 times. 40 groups of researchers conduct the experiment in universities around the world. One of them is quite likely to get 5 heads. Guess which result is likely to be published (or which group is going to even attempt to publish its result). Moreover, this result is statistically significant according to the well-accepted peer-reviewed journal standard of p < 0.05.
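A quick simulation of that setup (the 40-lab arrangement is just the hypothetical above, nothing more):

    import random

    # 40 labs each toss a fair coin 5 times. How often does at least one lab
    # see 5 heads in a row (p = 0.5**5 ~= 0.031, i.e. "significant" at 0.05)?
    def at_least_one_significant(labs=40, tosses=5, trials=20_000):
        hits = 0
        for _ in range(trials):
            if any(all(random.random() < 0.5 for _ in range(tosses))
                   for _ in range(labs)):
                hits += 1
        return hits / trials

    # Analytically: 1 - (1 - 0.5**5)**40 ~= 0.72
    print(at_least_one_significant())

So roughly three times out of four, somebody somewhere gets a publishable "significant" result from a coin that does nothing at all.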
There are of course quite a few relevant books as well, but that's probably enough to keep most people busy for a few days. Also, most of these articles are about drug research, medicine, or psychology, but it applies to basically every field.
I suggest you stay away from "The Truth Wears Off" by Jonah Lehrer. It's filled with errors and exaggerations, and it seems like Lehrer's concern was demonstrating the existence of a Gladwell-like bogus "phenomenon" which he calls the "decline effect".
It is a well written popular article about his own experiences. A candid honesty about his own failings. The decline effect is humour (and a catchy tagline) rather than attempting to be a testable hypothesis.
The main point: Psychology is not "science" in the same way F=ma is science. It's a model.
The secondary point:
> When the article then goes on to describe the breakdown of this sweeping generalization in studies after 1994 (on other species), it attributes that to the Decline Effect. It's not. When you look at the studies together, what you should have inferred is "symmetry is an associated factor in mate selection by females in only some species and not others, and more research is needed to explain why." Instead, the article attributes its inability to summarize the variety and complexity of nature in a 140-character Twitter message to an underlying failure in the 500-year-old guiding principle of science.
Thanks! (And also thanks to pron, boredguy8, and duwease.)
I have an enormous mindmap/document of all the most important facts and reading for understanding various major areas of society, so I am going to work all of this material in once I get through it. Eventually this entire document will be published, but that will take a while.
I suggest the book "Wrong: Why experts* keep failing us--and how to know when not to trust them" by David H. Freedman, the author of that Atlantic piece. It's a book length treatment, and very damning in many ways.
If psychology is "under attack", or for that matter any scientific field is under attack, the solution is to do the hard work to prove that your work is meaningful and accurate, and if you discover it isn't, to fix those problems.
"People" are gullible and easy to fool in the short term, but I think it is commonly underestimated how smart "people" can be in the long term. Yes, you could throw up a smokescreen and dodge out of the spotlight cast on your bad science today, but that will only be a momentary reprieve of the pain. In the long term you'll still be under attack, and given that you will have been witnessed using smokescreens and obfuscations, you'll probably be on the losing side of that attack. In the short term, the pain of revealing just how much flimflam there is might hurt, but the result will be a discipline that in 5 or 10 years is no longer under serious attack, because "people" will notice that honesty and react to it.
If science is under attack, it is only because "people" are noticing that a lot of it is bunk... and the problem is people are right. We've seen that in a number of studies in a number of fields lately. The only answer that's going to truly restore confidence and respect is to eliminate the bunk. Politics won't work. Pay the piper now, or pay the piper more later. How often those are the choices....
and of all the people who should understand the psychological appeal of paying the piper later and taking the easy road today, you'd think it would be the psychologists...
That is a reaction to the statement from the original article:
"Nosek told Science that a senior colleague warned him not to take this on “because psychology is under threat and this could make us look bad.” In a Google discussion group, one of the researchers involved in the project wrote that it was important to stay “on message” and portray the effort to the news media as “protecting our science, not tearing it down.”"
I see they say threat in that text, not attack, which appears elsewhere. Might I point out that if you carefully read my post it is vehemently agreeing that replication is necessary, and from there it should seem logical that I would not consider something necessary an "attack".
"Reproducing results is science. Perhaps the most important kind."
Right, which is the problem, because it's supposedly not being done and/or there are incentives against it in three journals of a particular field. This could be a larger problem in other fields as well. Saying what you said is right, but if the system is set up to work against it, then there is a problem.
In 2003 I did the work that led to this paper (http://bit.ly/I1qvuI). Could it be replicated? Probably not. Let me outline a few challenges that don't have to do with a result being true or not.
1. The passage of time
Unlike physics and biology, psychology is heavily shaped by culture, and cultures change. Additionally, my work was about addiction to online games. I have no idea if the same type of people exist today that did then. I do know that the game I was studying (Asheron's Call) does not exist in anywhere near the same form as it did then.
2. Copyright
While I was lucky enough to find psychological measures (aka tests) that were in the public domain, most are not. To replicate a result you may have to pay to access the tests and then most likely cannot re-publish them. At the very least this makes replicating work inconvenient and expensive.
3. Data
Because psychology is largely driven by statistics, to replicate a result you should really start by re-analyzing the data. The reality? This data simply isn't available. It is not published alongside the results, and I don't have it anymore. I might be able to find the cleaned data, but that's not important. Error and bias can just as easily be introduced during data cleaning. For example I removed many data points that appeared as outliers, but perhaps these should have been included. No one will ever know.
I hope barriers such as these are addressed in this investigation. It would be disappointing to impugn the work of scientists when it is the process that could really use reform.
As a layman, I find your three points terrifying for the state of science. Well, numbers two and three anyway. There are still people addicted to video games -- WoW and Minecraft probably more so than Asheron's Call back in 2003.
As to point number two: Are these questionnaires or something that are used to get psychological data? Why would those not be accessible--don't you need to know the questions asked to determine if the results are valid?
As to point three: Data cleaning?! I would understand throwing out invalid results (i.e.-you find out somebody's lying, errors with data collection, etc), but to throw out results because they look like they don't belong feels disingenuous. I've done a lot of work with public company financials and operational systems. I cannot, on my worst day, imagine telling the SEC, "Yeah, well that return was really abnormal and was an outlier, so we decided not to record it." Why are these data points thrown out?
To me, again, a layman, it appears that if this is standard practice, these experiments are starting with a conclusion and then just going through the motions to get that conclusion published. Why are data cleansing and secrecy normal practices? What about the scientific method don't I understand?
You can clean the data according to stated metrics: e.g., "We removed all results that were more than 6 standard deviations from the mean".
There are other metrics of dropping data that is "out there".
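A sketch of what a documented, auditable cleaning rule might look like; the threshold, the fake data, and the decision to report what was dropped are all illustrative choices, not anyone's actual protocol:

    import random
    from statistics import mean, stdev

    def clean(data, k=6):
        """Drop points more than k sample standard deviations from the mean."""
        m, s = mean(data), stdev(data)
        kept = [x for x in data if abs(x - m) <= k * s]
        dropped = [x for x in data if abs(x - m) > k * s]
        return kept, dropped   # report both, so a replicator can audit the rule

    random.seed(0)
    scores = [random.gauss(13, 2) for _ in range(200)] + [400.0]   # one wild point
    kept, dropped = clean(scores, k=6)
    print(len(kept), "kept; dropped:", dropped)

Publishing the rule and the dropped points alongside the kept ones is what makes the cleaning step reproducible rather than a judgment call hidden in the raw data.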
As someone finishing my Master's, I will be releasing all my work, including test inputs and reproducibility results, as a Mercurial repository. Some of it depends on an exterior compiler, but the interesting ideas don't. I feel it is absolutely critical and honest to release source code and all data.
The line between an invalid result and one that looks like it doesn't belong is fuzzy. I believe I was removing the former, but if an independent replicator doesn't have access to the raw data they would never discover that I had been fooling myself.
As to secrecy, there is nothing being hidden, instead it's that nothing in the system requires you to publish all your data, and so because you're pressed for time you don't. Never ascribe to malice what laziness can explain :)
I'm not ascribing this to malice. It strikes me as a paragon of negligence if that's the way that the science of psychology is being done. In fact, it strikes me as an affront to science.
Take this in contrast to the neutrinos that were going faster than light at CERN. Instead of going, "Okay, well, those can't happen, so those are outliers," they reported their results, published their methods and data, and said to the world, "Help us validate this."
It seems to me like the modern psychology scientific method is the antithesis of this--"These are the results of my massaged data, and no you can't see how I got them, just trust me."
I'm not attributing this to malice, I'm attributing this to negligence and laziness. Putting your work in front of people for critique is hard. Really hard. But, it's part of science. If psychology as a field has convinced themselves that they're above that reproach, then I think that's a huge condemnation of the field.
When I first read the article, I thought it was a bit sensationalist to call psychology "under attack," but now I'm not so sure. I've assumed that reproducibility is the standard litmus test of all science, but I guess not, and that leaves me with a bitter taste in my mouth toward the field of psychology.
Outlier detection is an important part of basic statistics, and has been for a long time. It isn't about just deleting a few data points that "look" strange. Some statistical tests are robust to outliers and some are not... it's always important to use the appropriate tests in the appropriate way and be open about them.
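A tiny illustration of what "robust to outliers" means in practice (the numbers are made up): one wild data point drags the mean, and anything built on it, much further than it moves the median.

    from statistics import mean, median

    baseline = [11, 12, 12, 13, 13, 14, 14, 15]
    with_outlier = baseline + [400]

    print("mean:  ", mean(baseline), "->", mean(with_outlier))
    print("median:", median(baseline), "->", median(with_outlier))

Which is why the choice of test matters as much as the choice of which points to keep.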
"It would be disappointing to impune the work of scientists"
I hate to point this out, but I've never even heard of any attempt to "impune the work" of any scientist that is anywhere near as damning as what you wrote above. You've described your work as temporary, local, and proprietary, and you've discarded the actual data. Being unable to replicate it is the least of the issues.
Note: I am not a psychologist. What do you mean by "some tests are copyrighted"? Are you talking about a statistical method being copyrighted, an implementation thereof, or something else entirely?
Probably something like the Rorschach tests being copyrighted until recently. The actual test questions and the scoring rubric that, applied to a subject, yield a number or set of numbers.
I have found Hacker News to be full of some of the smartest people I've encountered on the internet. However, every time a discussion pops up where a social or soft science is involved, the discussion becomes mired in arrogance, bias, and general small-mindedness about what constitutes "real science." I see comment after comment where the author implies or outright says that psychology is pseudoscience or close to it. The challenge in designing, implementing, and supporting a study in psychology or a similar science is staggering.
That's no excuse for questionable statistics interpretation or outright manipulations, but to write off an entire field of study because it doesn't have the convenient quantitative measuring capabilities that other sciences have is ridiculous.
I have noticed a pattern as of late, or maybe it's just the articles I have been reading on HN, but there seem to be a lot more people making dubious claims, commenting on things they know nothing about, and just generally turning this place into a cesspool like the rest of the internet.
What is so hard about not posting an opinion on subjects where you have no knowledge or any proof to back up your claims? If you haven't studied psychology and don't have even a vague idea of what it's about or what its past or present state is, then why muddle up the conversation with your BS conjecture?
I'm sure a legitimate licensed clinician would have a field day with some of the posters on this site.
While I agree with much of the substance of your post, the fact remains that psychology is a mess. A friend of mine once told me that doing a doctorate in a subject only really gave you the ability to see how your field is wrong and all the problems with it, and I would agree.
The problems with psychology, from someone who's been at it for a while:
1) lack of focus on replication
2) in survey studies, failure to correlate psychometric measures with behaviour or other kinds of measures (surveys versus reaction time measures versus physiological measures).
3) An unfortunate lack of understanding of the assumptions behind the statistical procedures used routinely within the field.
4) Misplaced emphasis on theory at the expense of prediction. A relatively well-known psychologist, quite statistically aware, posted on stats.stackexchange.com that the goal of psychology was theoretical understanding, not prediction. For the life of me, I can't see how one can develop good theories without prediction, but it appears to be a dirty word within much of psychology.
That being said, people are hard. They change their behaviour based on what they think you are trying to do, they tell you what you want to hear, and even when they tell the truth as they see it, they may well be mistaken.
So while it's not right to write off an entire field for some errors (in fact, my results are all perfect), it's also useless to deny the problem and pull a Freud by saying that everyone who disagrees with your methods has some kind of psychological disorder.
> What is so hard about not posting an opinion on subjects where you have no knowledge or any proof to back up your claims? If you haven't studied psychology and don't have even a vague idea of what it's about or what its past or present state is, then why muddle up the conversation with your BS conjecture?
This isn't a psych journal -- this is a discussion site -- and what's so hard about not reading "muddled up" discussions? If your concern is that laypeople are getting the wrong idea about Important Things, then you're squandering your opportunity to nudge them in the right direction.
It seems that psychology has always been about the atypical rather than the typical. In most other sciences, the goal is to hopefully build a useful model of the typical, and observations outside that model result in alterations to the model.
In psychology the idea seems to be to produce the inverse of that, yet there doesn't appear to be any particular end goal of producing the typical model, only in defining the atypical cases to the nth-degree. The result? We have no better idea what the typical mind is like than we did a hundred years ago, but we have exhaustive lists of subtly different atypical models so encompassing that almost anybody could be recognized as having a psychological problem of some sort!
Describing a new disease (atypical model) is one of the only ways to get recognized in the field. But all the easy cases have been taken, so bizarre models seem to make the publishing rounds much more readily than new subtle delineations on previously recognized diseases.
In treatment this turns into quack and fad medicine like "I prescribe shared strip club night for marriage problems, with the idea that it forces couples to talk to each other about their sexual problems blah blah blah" or "Anger therapy" or "Primal Therapy" or other such nonsense.
I'd say that with the tools we have available today, that psychology is due for an Einstein level revolution, but I'm not sure that the field, internally, is ready for this.
It happened already 50 years ago, it just doesn't get any press. Pretty much everything I've been reading in the past few years (Kahneman, Baumeister, Baron, Stanovich, Wilson for example) will pass this with flying colors.
It is true, however, that psychology requires a bit more intellectual honesty than other fields. Mistakes are more subtle and easier to cover up, samples are smaller, and there's always the excuse of different cultures. I'm really curious what this project will uncover.
Kahneman's getting a little more press lately, with a bestseller on the shelves.
His story is interesting - he was given a lot of responsibility at a young age, because he was doing officer evaluation for the newly-formed Israeli army. He had to produce real, repeatable results, had a wealth of data to test his earlier conclusions against, and perhaps the sheer newness of the entire enterprise made it safer for him to be honest. Within a few years he realized the vacuity of his own psychological results and, by implication, many of the methods he had been taught. That's what kicked off his whole program of investigating sources of bias and error.
Would this have ever happened in a standard academic setting?
Quite possibly. Roy Baumeister has possibly the best thinking process I've ever met. He had the gall to approach a subject like this, and do it with flawless rigor: http://www.amazon.com/Meanings-Life-Roy-F-Baumeister/dp/0898... Too bad his latest book (Willpower) is coauthored with a professional writer... I miss his style.
You're referring to clinical psychology. However, most psychological research (particularly that published in the journals chosen for review) does try to build a model of the typical mind.
We have no better idea what the typical mind is like than we did a hundred years ago
I can't take this statement seriously. Have you ever even taken an intro cognition or perception class? Almost everything is about how minds typically work, and all of the findings are less than one hundred years old.
Many years ago. Perhaps the field has changed or I had a lousy course. But most of it seemed to be about describing the edge cases where our brains produce false models or find wrong patterns in the world.
Of course brains produce false models and find wrong patterns. Modeling and pattern-matching is what brains do. So they're going to get it wrong a lot.
The interesting thing isn't how funny it is that the mind screws up sometimes, it's knowing the actual mechanisms of perception, cognition, memory, behavior, etc. Edge cases and failures are only the beginning of understanding, because they hint at how things work.
For example, visual perception is heavily based on detecting edges. So there are a set of optical illusions where you fail to accurately perceive the colors or shades of different areas (like the chessboard illusion), because the relative shading of adjacent areas is more important for producing edges and shapes in your mind.
> that psychology is due for an Einstein level revolution
The Behaviour Therapy model was that revolution. The old model had many sessions trying to uncover past trauma before allowing the patient to move on. People with a phobia of something spent many weeks talking about their childhood and their adult life.
Now a severe phobia can be cured for most people in about an hour. And it stays gone.
I'm glad that psychology studies are being given more rigorous scrutiny. There's been too much sloppy science in the field for much too long.
I am talking about "Systematic Desensitization" as developed by Wolpe. That helped form CBT.
The evidence base for SD and for CBT is pretty good. (It's a good treatment for depression, if applied by a skilled practitioner.)
Here's one link to a BBC Radio 4 programme about treatment of phobia using SD. (Might not be available outside UK, but there's probably some way of getting it.)
It appears that he is referring to cognitive-behavioural therapy, or CBT. The Wikipedia page is pretty good.
That being said, it works very well for phobias, but in other areas it's about as good as all the other methods of therapy, which implies that it's much more down to the meanings that a patient takes from the experience rather than the treatment itself.
I've always had problems with fields like psychology. Part of the field revolves around hard science - neuroscience researchers (as one example) use tools like fMRI to capture observable phenomena. On the other hand, there is the art of psychology - social/developmental psychologists use dubious tools like surveys and interviews to try to prove a thesis.
I think that the average Hacker News reader could write a survey that would 'prove' that gravity doesn't exist. What kind of experimental integrity do surveys have?
And then there are participant pools. These pools are primarily composed of undergraduates (who often earn bonus marks/money for participating).
Add in a strong publish or perish mentality and you can see some serious problems. A whole lot of researchers are using dubious methods on a set of participants that do not adequately reflect humanity as a whole.
There are some issues I (as a psychologist) would have with your assumptions.
Firstly, fMRI and neuroscience research is probably the single biggest source of errors and shoddy research in psychology. It's a combination of really small sample sizes, poor statistical tests (brain regions are independent of one another, really?) and huge amounts of data dredging to find significance.
See for a roundup: http://escholarship.org/uc/item/51d4r5tn
Interestingly enough, most of the observable phenomena you note in neuroscience are linked to traits that people report on (optimism, personality, etc.) through surveys, so even if their methods were perfect, the results still wouldn't be.
On your survey point, I would agree that the participant pools are quite limited and generalizability is quite low, but your example is ludicrous. No one does surveys about gravity in psychology; surveys are carried out to investigate the manner in which people conceptualise their experience.
For a really great roundup of the problems with typical social science participant pools, see Henrich et al.: humancond.org/_media/papers/weirdest_people.pdf
Indeed, the entire issue of BBS that that article appears in is well worth a look for a deeper understanding of these problems.
These surveys should then be calibrated against behavioural outcomes, but this does not happen often enough, which is a major issue in my view.
That's the major problem with surveys; that, and some poor methods accepted too uncritically within the field (factor analysis, for instance).
To sum up, psychology has many, many problems and I fully support this reproducibility effort (and it will expose a lot of findings as non-replicable). But don't single out surveys for derision; neuroscience deserves as much of your scorn, if not more.
As a recent Psychology PhD, I agree completely, and I'll add this: people outside the field commonly believe that neuroimaging tools like fMRI are better and "harder" science than more traditional experiments that measure human behavior, but they have it totally wrong.
Here's an analogy for computer nerds: imagine if you had an "fMRI" of your computer's operation. You'd see that different tasks result in different parts of the computer "lighting up". Tasks such as graphics, disk I/O, numerical processing, etc., would lead to different patterns of activation. And if your scanner could resolve, say, 0.1mm voxels in your CPU, you might even learn that certain parts of the CPU are related to certain tasks. But what all this tells you is something about the gross physical structure of the computer; it doesn't tell you much about its abstract, logical structure. In the same way, you learn something about the brain, but not very much about the mind.
What's of interest to most people (except hardware engineers) is what the computer does, not what parts activate.
With a "computer fMRI", you'd learn little about how, say, a filesystem works, how a programming language works, or much of anything that's of interest at the functional level of the computer. The same is true of fMRI and humans: it doesn't tell us much about the mind works. Instead, it tells us that there's a brain region associated with some task. For example, one of the major recent-ish findings with fMRI is that there's a small brain region that's associated with face recognition.
I'm not saying that traditional experimental psychology is going to answer everything. The mind is very complicated and I'm pessimistic that we'll ever have a good reductive model, the way that we do in many other sciences. I'm also not saying that neuroimaging is totally useless -- but it's certainly not as enlightening as many people imagine.
Here's a summary of a paper that found that lay people thought that psychological explanations were more convincing with neuroscience talk (even though the "neuroscience" was totally irrelevant), and that those with experience in neuroscience/psychology thought the opposite:
http://scienceblogs.com/cognitivedaily/2008/03/when_we_see_a...
You do not use surveys to learn about gravity. That would be stupid. You use surveys whenever there is no better way to get the data you want. We have better ways to learn about gravity than surveys, so we use those. This is true for all scientific methods we have: none is perfect (some are better than others, though) and none is appropriate for answering every question.
(I know very little about psychology so I’m going to put on my social scientist hat.) The Large Hadron Collider can tell us nothing about which political issue people think is the most important one (one of the variables needed for Agenda Setting research, for example), so even though it’s a great tool to learn about gravity, it’s not a great tool to learn about Agenda Setting.
fMRIs are currently also about as useful as the LHC when it comes to answering questions like that. fMRIs are not very precise, we don’t understand the underlying system (the brain!) thoroughly enough, and they are expensive, so we can’t look at enough people to get a representative sample.
They are just not good tools when it comes to answering certain questions and surveys become then our only realistic option, warts and all. (Some warts every good social scientist knows about: social desirability bias, reactivity, ceiling and floor effects, …)
So, Agenda Setting. The hypothesis is that issues about which the media reports a lot are also seen as important by the recipients. If you want to answer that question you have to do a quantitative content analysis and a survey. There is no other way to get to that data.
Surveys are not dubious tools. In the context of this research question, for example, an fMRI would be a much more dubious research tool.
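As a concrete illustration of that kind of analysis, here's a minimal Python sketch. The coverage counts stand in for the content analysis and the importance shares for the survey; the rank correlation between the two is the basic agenda-setting test. None of the numbers come from a real study.

    # Entirely made-up numbers, purely to show the shape of the analysis.
    from scipy.stats import spearmanr

    issues = ["economy", "healthcare", "crime", "climate", "education"]

    # Hypothetical content-analysis result: news items per issue.
    media_coverage = [340, 210, 180, 90, 60]

    # Hypothetical survey result: share of respondents naming the issue
    # as the most important problem.
    perceived_importance = [0.38, 0.22, 0.17, 0.13, 0.10]

    for issue, cov, imp in zip(issues, media_coverage, perceived_importance):
        print(f"{issue:<11} coverage={cov:>4}  importance={imp:.2f}")

    # The basic agenda-setting test: rank correlation between the two lists.
    rho, p_value = spearmanr(media_coverage, perceived_importance)
    print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")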
It's funny because, coming from an "interdisciplinary" research area, many of us cower in fear of psychology journals, completely daunted by the rigorous level of (perceived?) statistical knowledge required to be accepted. Psychology is viewed by many as one of the most hard-core sciences, just because it has taken such a hard-line attitude toward statistical work, out of sheer necessity, due to past problems arising from a history of more philosophical approaches to the subject. Other sciences (e.g. HCI) are often somehow "softer" just because they take pains to avoid the criticism that can come from the complex statistical interpretations we see in pure psychology journals.
Although, I suppose I could be mixing up psychology with what we call "psychometrics", which is the kind of psychology that I'm more familiar with due to my research area, which involves perception of virtual reality.
By the way I agree with others that this is not an "attack" on psychology, it is simply science. Verification of results can invalidate claims, but it can also easily provide further evidence for claims. Nothing bad can come of this initiative, if it's carried out properly.
You are definitely mixing up psychology with psychometrics. Psychometrics is awesomely thorough and aware of statistical models and the limits thereof. Psychology, on the other hand, relies on SPSS and what other papers have done to determine its methods.
"If you’re a psychologist, the news has to make you a little nervous—particularly if you’re a psychologist who published an article in 2008 in any of these three journals...<snip>...Because, if you did, someone is going to check your work.'
Can somebody in the field comment on this statement? Is it really so out of the ordinary to attempt to reproduce work in psychology that this kind of statement is warranted?
I only have undergrad research experience. In general, I haven't really seen focus on replicating previous results - but it depends on the work and the researcher.
Most of what I've seen is researchers who extend their own work. In doing so, they replicate their previous research and reveal other information. In this case, replication is definitely a goal, but it's focused on a researcher's own work and is a byproduct of exploring a particular research problem.
I've witnessed people discussing replications and saying that they were unable to replicate X's results. Here, there were clear efforts made to replicate results, but I've only witnessed this in an informal setting, so the results were never published. I've never seen a big push to publish findings that discredited a result due to an inability to replicate it.
I suspect that there is little effort made after the fact unless the theory was particularly 'significant,' or if a researcher can offer a competing account. The one time I learned about replication was when two competing researchers were offering competing accounts and discrediting one another's results. However, in these cases, there were flaws in each other's research.
Obviously, in a classroom setting, we have seen many cases where replication has occurred, which makes a particular theory very strong. It is rare that the published results are straight replications, though. They are often extensions.
It's not out of the ordinary for people to replicate, but it's mostly undergraduate and master's students who do it, as it's difficult to get pure replications published.
This is a terrible state of affairs, but I think it may be an artifact of the current pressures of publication, coupled with the fact that psychology has relatively little research dating from the times when replication was more respected.
Personally, if someone was going to replicate my results for me without me having to do anything, I'd be delighted. People coming up for tenure and desperately needing findings might be less pleased, however.
The article shows how many of these issues have long been on the radar screen of psychologists at the minority of universities (for example, the University of Minnesota) where the psychology departments train graduate students in the general scientific method. There are some very amusing, thought-provoking, and shocking examples in the article.
The challenge of publishing "novel" work exists in many fields, not just psychology, and the debate about potential false positives is ongoing. It's great to see this reproducibility work being done. I'd like to see it in my field.
I personally hope the software world picks this up, so we might get some actual data on things like TDD, agile, and waterfall, instead of people's anecdotes and experiences on past projects and whatever Martin Fowler thinks up (no offense, I love Fowler, but he would benefit from proving some of his assumptions).
I agree. It seems we have a problem where things that are most likely just opinion or preference are treated as fact, and, to worsen the situation, there is a disregard for evidence or the need for it.
For example, take the argument for why indexes should start at zero. [0] Some people will claim, without providing evidence, that it's easier to learn, or that those who use such a system produce fewer bugs, again without evidence.
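For what it's worth, the usual argument is about convenience rather than measured learnability. Here's a tiny Python sketch of the half-open-interval convention that zero-based indexing enables (this only illustrates the claim; it isn't evidence for it):

    # Half-open intervals [lo, hi) pair naturally with zero-based indexing:
    # lengths and splits need no +1/-1 adjustments.
    items = ["a", "b", "c", "d", "e", "f"]

    lo, mid, hi = 0, 3, len(items)

    left = items[lo:mid]    # ["a", "b", "c"]
    right = items[mid:hi]   # ["d", "e", "f"]

    assert left + right == items          # adjacent ranges compose cleanly
    assert len(items[lo:hi]) == hi - lo   # length is simply hi - lo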
A cynic would say that psychology can't come undone since it's never really been "done" in the first place. We know so little about the fundamental nature of human thought that the whole field is basically a joke. Their journal articles are built on a foundation of sand.
I expect that 200 years in the future historians will look back at 2012 psychology as a pseudoscience, little better than phrenology.
I am glad someone is doing this. I recently tried to implement an algorithm described in a paper in a well known chemistry journal and discovered that the paper's results were incorrect. It turns out that the author's program (which was not made available with the paper) didn't correspond with what was described in the paper. That is to say, when I used the described algorithm, I got very different results than those in the paper.
Now I'm not sure what to do. I want to correct the paper, but I don't want to openly criticize the work of others.
If you are prepared to back your claim up in a way that can be independently verified, why don't you want to criticize the original author's work? Surely correct science is more important than sparing someone's blushes?
How about reaching out to the original author and outlining the problems that you've been encountering? This way, you can at least give them a chance to respond privately or correct potential misunderstandings with the paper (perhaps publicly).
A similar analogy would be approaching a vendor regarding a security vulnerability disclosure.
I already have. They recently gave me the source code and I pointed out the problems. They agree that there are bugs in their code. I just haven't approached them about publishing a comment on their paper because I feel bad about it.
While this is part of the 'problem' that is being solved by this project, I think as a whole, this issue is separate from the intent of that project.
Making experiments transparent (i.e. including data sets and the algorithm) is an extremely important issue, and given the increased reliance on software in these experiments, it is important to make them available. I'd wager that in this project, many of the reproductions will be contingent on using the exact same apparatus. Unfortunately, not enough information is usually made available to reproduce the work.
"Yet when Stuart Ritchie, a doctoral student in psychology at the University of Edinburgh, and two colleagues failed to replicate his findings, they had a heck of a time getting the results into print"
Maybe because they were a bunch of hacks. They didn't actually replicate Bem's methodology, so whether or not they got the same results is irrelevant. The journals were right for rejecting their work.
Got a cite to back that up? Because Ritchie claims that no serious methodological issues were raised during the review process, other than the ridiculous "You can't get positive results from an ESP test unless you believe in it."
I don't remember the exact article that discussed it, but I'm sure you could find it by searching HN. Other than the issue you mentioned, the other issue was that the original study was done in person using Ivy league students, whereas the replication was done over the Internet using a non-comparable demographic.
Regardless of whether or not one thinks it's ridiculous that these issues could have any effect, it's intellectually dishonest for them to say that they've replicated Bem's methodology and failed to reproduce the results when in fact they haven't actually done so.
(The researcher-belief issue matters since we already know that how well a drug works depends on how much the person administering the drug believes it will work, which is why in well-designed drug trials the investigators aren't the ones who administer the drug. So it's not like the replication was being dinged for not subscribing to some exotic new methodology; rather, they were being shot down for not following existing best practices. That said, I have no idea whether there were videos, or how much the researchers' beliefs would have actually been apparent, so I don't really know whether or not it was reasonable for the journal to reject it based on this one point.)
> we already know that how well a drug works depends on how much the person administering the drug believes it will work
hahahahahahaahahahahahahahahah what? No, no we do not.
If you have a citation that suggests otherwise, I would be thrilled to read it.
The reason that we blind is because there ARE effects based on perception if the person administering the drug knows it's experimental. There have been studies indicating that the administrator would say things like "this really should work", for example. That invokes the placebo effect response in the patients (which is also well documented), but the concept that "we know how well a drug works depends on how much the person administering it believes in it" is completely incorrect.
Everything you said in your comment supports what I said. If there is any non-zero effect attributable to the beliefs of the person administering the drug, then it is 100% correct to say that how well a drug works depends on how much the person administering it believes it will work, regardless of whether the difference in outcomes is enormous or tiny.
No, there is a difference between belief and actions. The action of saying "this one should really work" of course has an effect.
There is absolutely no scientific evidence anywhere to support that the "belief" on the part of an administrator in a drug trial has anything to do with any outcomes.
I think that should be sufficient to prove that beliefs can be transmitted non-verbally. You are trying to argue there is some distinction between belief and action, but the only reason any 'action' has an effect is because it transmits a belief. What you are saying is that the non-verbal transmission of a belief could not have an effect on outcomes, whereas the verbal transmission of a belief can. This makes no sense.
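To make the expectancy point concrete, here's a toy Python simulation (every effect size is invented; it only illustrates the logic, not the magnitude of any real-world effect): if an unblinded administrator's confidence adds even a small nudge in the drug arm, the apparent drug effect is inflated relative to a double-blind comparison.

    # Toy model, all effect sizes invented: compare the estimated drug
    # effect from an "unblinded" trial (where the administrator's belief
    # adds a small expectancy nudge in the drug arm) with a double-blind one.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 500                   # patients per arm
    true_drug_effect = 1.0    # genuine pharmacological effect
    belief_effect = 0.5       # hypothetical expectancy nudge from an
                              # unblinded, confident administrator

    def noise():
        return rng.normal(0.0, 2.0, n)

    # Unblinded: the administrator knows who gets the drug and conveys it.
    unblinded_drug = true_drug_effect + belief_effect + noise()
    unblinded_placebo = noise()

    # Double-blind: neither arm gets the extra expectancy nudge.
    blinded_drug = true_drug_effect + noise()
    blinded_placebo = noise()

    print(f"unblinded estimate: {unblinded_drug.mean() - unblinded_placebo.mean():.2f}")
    print(f"blinded estimate:   {blinded_drug.mean() - blinded_placebo.mean():.2f}")
    print(f"true effect:        {true_drug_effect:.2f}")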