I think this is a great idea theoretically, but in reality for most papers I don't want to see the data/underlying code. While it would be great to publish data/code with the paper (in the field I've worked in the most, astronomy, most data is already published with the paper anyway), I don't want/need to look through a notebook with the underlying code of the paper in order to just read the intro/conclusions (and maybe one key methods section). Interactive figures are a great idea, but again, oftentimes I don't really care to interact with the figure or fiddle stuff around; I just want to know why the paper is important and how I should use its conclusions. The two-column format of most papers is very useful for skimming. So instead I would argue notebooks shouldn't replace papers, but supplement them (as they sometimes do already, in fact, but perhaps journals could make it an actual requirement to create a supplementary notebook).
As the article mentions, scientific fields are gigantic nowadays, and skimming papers is critical when you're citing 100+ references in your paper.
EMBL-EBI and others had some RDF-related effort to provide machine readable abstracts, which I thought was a really cool idea.
IMHO, the biggest problem with papers is politics and reviews. In many top journals like Nature there's no double-blind review (actually in Nature it's now optional but big groups never use it). And even if there was double-blind review, referees have no skin in the game. So the usual outcome is to get reviewed by a big name in your field, who is actually interested in controlling research trends and killing "competitors".
This is hindering progress and hurting new ideas. For example, proponents of the theory that Alzheimer's disease is caused by an infection or dysbiosis have had a hard time doing research, getting grants and publishing articles during the last two decades, even though their theory explains the etiology quite well, unlike competing alternatives.
Another problem is that to publish in good journals you need cool results. Cool results are rare, but Nature, Science, Cell et al. are full of articles every month. So, most groups are overselling and misreporting things. Research fraud, p-value hacking and data manipulation are really common.
That's a big problem. I just got a paper rejected. Reviewer 1 fixated on a single detail I mentioned somewhere, not central to the paper at all, yet based most of their criticism around it. Reviewer 2 had difficulty understanding a table containing 2 columns and 3 rows, and what N, V, ADJ and ADV mean in a paper about a dictionary (never mind that the same abbreviations were defined just before, and spelled out in plain words numerous times in multiple paragraphs). Reviewer 3 is the only one saying a remotely nice thing and who seems to have grasped what the paper is about. There is of course some valid criticism raised in the reviews, but half of it is bullshit that would be dispelled in a more interactive process and/or if reviewers had incentives to put a minimal effort into understanding the paper.
It’s not really possible to conduct double-blind reviews in most cases: authors or at least the group can often be easily guessed from the list of references, “in our previous work…”, and research domain and approach in general.
Sensible anonymisation policies prevent people from referring to "our previous work" in submissions - e.g., the policy for CHI [1] states:
> We do expect that authors leave citations to their previous work unanonymized so that reviewers can ensure that all previous research has been taken into account by the authors. However, authors are required to cite their own work in the third person, e.g., avoid “As described in our previous work [10], … ” and use instead “As described by [10], …”
However, it is true that things like choice of research questions, approach, and equipment used can be quite suggestive of the authors' identity.
You should always want to have the underlying code available. Without the exact procedures they used to process their data, the only kind of "using their conclusions" you can do is the superficial "take it at face value" kind. So many important details get hand-waved away in papers that say things like "we used the well known blahblahblah method to analyze the data."
If you do it right, the code should in no way interfere with your ability to read abstracts.
If I publish a paper saying I have an algorithm which can factor large composites, and in the paper publish the factors to all of the RSA numbers listed at https://en.wikipedia.org/wiki/RSA_Factoring_Challenge , then I think people will take it seriously, and not consider it at the superficial level.
Even if I don't publish the algorithm. ("Because of the security implications of this work, I have decided to withhold publication for a year.")
Furthermore, some things are worth publishing even if the method was "it came to me in a dream" à la Kekulé's snake. If you can demonstrate a sorting network of size 47 for n=14 inputs (which is the known lower bound) then you can publish that exemplar, even without publishing the method used to generate it.
(If you used computer assistance then that method would likely also be publishable, but that's a different point. Newton famously used the calculus to solve problems, but published the proofs using more traditional approaches.)
If you can come up with a protein model that is a significantly better fit to the X-ray diffraction data, then that's publishable too, no matter how you came up with that model.
In all of these cases, there are ways to verify the validity of the results without reproducing the methods used to come up with the result.
This won’t work for empirical research. I vividly recall weeks spent trying to reproduce a paper on information retrieval (a deep learning model). What saved me was skimming through the author’s codebase and chancing upon an undocumented sampling step. They were only using the first and last passage in a document as training data and uniformly sampling from 10% of the remaining passages, and the paper didn't mention this. I adopted their sampling strategy, and I was able to obtain their results.
My argument is that there are nuances and subtleties that are often omitted in a paper (accidentally or otherwise), but are nevertheless required to reproduce the research.
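For what it's worth, an undocumented step like that is easy to state in code once you know it exists. Here is a minimal sketch under one reading of the description above (the function name and the exact handling of the 10% fraction are my own hypothetical reconstruction, not the authors' actual implementation):

```python
import random

def sample_training_passages(passages, keep_fraction=0.10, seed=0):
    """Hypothetical reconstruction of the sampling step described above:
    keep a document's first and last passage, plus a uniform sample
    drawn from 10% of the remaining passages."""
    if len(passages) <= 2:
        return list(passages)
    rng = random.Random(seed)
    middle = passages[1:-1]
    k = max(1, int(len(middle) * keep_fraction))
    return [passages[0]] + rng.sample(middle, k) + [passages[-1]]

# e.g. a 22-passage document keeps the first, the last, and 2 of the 20 middle passages
print(sample_training_passages([f"p{i}" for i in range(22)]))
```

Ten lines of code, but without them the paper's numbers were unreachable.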
My example of a protein model is an example of empirical research, yes?
My understanding is the X-ray gives you a diffraction pattern which is hard to invert to a structure, while if you have the structure the diffraction pattern is easy to compute. The diffraction pattern therefore gives you a way to verify that one model is a better fit than another model.
It may not be perfect, certainly not. It might not even be correct once more data arrives. But if you predict a novel fold, and that fold matches the diffraction pattern significantly better than the current model, then it doesn't matter how you came up with the new fold, does it?
It could have been a dream. It could have been search software. The result is still publishable.
All of what you have said is true, but my point is for some research being able to verify the correctness of the result is all that matters, not being able to reproduce the research.
What do you see as the fundamental point of scientific communication? In your counterpoints you narrow in on papers being a means of communicating concepts or proof of work. In this view, showing the process itself is pointless or at least irrelevant to the main axiom.
However, others (myself included) see the communication of methods as a primary function of the literature, because this is what enables others to understand, critique, and build upon the idea.
If you want to be that broad about it, science journals publish a lot more than just method development, including obituaries and opinion pieces on where funding should be directed.
> A direct search on the CDC 6600 yielded 27⁵ + 84⁵ + 110⁵ + 133⁵ = 144⁵ as the smallest instance in which four fifth powers sum to a fifth power. This is a counterexample to a conjecture by Euler [1] that at least n nth powers are required to sum to an nth power, n > 2.
Do I need to know how the direct search was carried out to confirm Euler's conjecture was false?
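For concreteness, the published counterexample can be checked directly, with no reference to the CDC 6600 search that produced it:

```python
# Confirming the counterexample requires only arithmetic, not the search method:
assert 27**5 + 84**5 + 110**5 + 133**5 == 144**5   # both sides equal 61917364224
```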
And now that you know it isn't true, you might adjust which project areas to spend your time on. Which is part of what we get from scientific publications.
Just because you prefer one sort of scientific research doesn't mean other forms aren't science.
Again, is Kekulé's model of the benzene ring less scientific because it came to him in a daydream?
We accept Newton's publications where he secretly used the calculus, even though he didn't publish the calculus, because they could be proved through other more laborious means.
Why is it not scientific to write publications which use secret software, so long as we can verify the results?
For the article on Euler's conjecture, I am aware of this paper and that it serves as a sufficient proof of work for publication. It includes context for its purpose by citation, and the methods for verification were well-established. There is a class of literature where this type of structure works.
For the Kekule paper [1] there is a significant amount of information about the context and reasoning for the claim. This is not an isolated concept and he wrote at length as to why the idea might be plausible given the current evidence. He also could have written solely about the dream without context, but that lacks a grounding in the reality he was attempting to describe.
If it is possible to write a paper where the result is possible to verify using already-known methods, then by all means write in that style. But this is a subset of the useful papers to be written, and in my experience a small one.
> But this is a subset of the useful papers to be written, and in my experience a small one.
Certainly. I never claimed otherwise.
But bloaf and lonesword seem to think such papers are of only superficial merit at best, and that detailed steps to reproduce the research are essential.
Yes, there are occasional exceptions where you don't have to repeat or replicate the experiments reported in a paper to verify them. But that is very much the exception.
Generally you are expected to explain what you did in enough detail that the reader can replicate your experiment. If you're fitting a protein model to X-ray diffraction data, you aren't expected to include all the other protein models you considered that didn't fit, or explain to the reader your procedure for generating protein models, but you are expected to explain how you measured the fit to the X-ray diffraction data (with what algorithms or software, etc.) so that the reader can in theory do the same thing themself.
Sure, but "I found the structure after 5 months playing around with it in Foldit" isn't that reproducible or informative either.
The result is still the same - a novel fold which is a significantly better fit than existing models, based on measured vs. predicted x-ray diffraction patterns and whatever other data you might have.
Which is publishable, yes?
When the Wikipedia entry at https://en.wikipedia.org/wiki/Foldit says "Foldit players reengineered the enzyme by adding 13 amino acids, increasing its activity by more than 18 times", how is that much different than "A magical wizard added 13 amino acids, increasing its activity by more than 18 times"?
Or "secret software".
What's publishable is that the result is novel (and hopefully interesting), and can be verified. The publication does not require that all steps can be repeated.
Unfortunately we have a long way to go to make it easy to repeat the calculation that a novel structure is "a significantly better fit than existing models, based on measured vs. predicted x-ray diffraction patterns". (If I run STEREOPOLE and it says the diffraction pattern from your new structure is a worse fit, is that because I'm running a different version of IDL? Maybe there's a bug in my FPU? Or the version of BLAS my copy of IDL is linked with? Or you're using a copy of STEREOPOLE that a previous grad student fixed a bug in, while my copy still has the bug? And stochastic software like GAtor is potentially even worse.)
This is something we could and should completely automate. There's been work on this by people like Konrad Hinsen, Yihui Xie, Jeremiah Orians, Eelco Dolstra, Ludovic Courtès, Shriram Krishnamurthi, Ricardo Wurmus, and Sam Tobin-Hochstadt, but there's a long way to go.
>Yes, there are occasional exceptions where you don't have to repeat or replicate the experiments reported in a paper to verify them. But that is very much the exception.
And even in this exceptional case, the algorithm itself is interesting above and beyond the fact of its existence.
It is, but if the algorithm produces a result such as a protein structure or a sorting network that is itself novel and verifiable, you can very reasonably publish that result separately. As long as it doesn't require knowing the search algorithm to replicate your result that the sorting network sorts correctly, which it wouldn't.
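To make the sorting-network case concrete: verifying that a claimed comparator network sorts correctly needs nothing from the search that found it. By the zero-one principle it suffices to check all 2^n inputs of 0s and 1s (16,384 checks for n=14). Here is a minimal checker, shown with a standard 5-comparator network for n=4 as an illustrative stand-in, since the n=14 network itself isn't reproduced in this thread:

```python
from itertools import product

def sorts_correctly(n, comparators):
    """Zero-one principle: a comparator network sorts all inputs of length n
    iff it sorts every 0/1 input, so 2**n checks suffice."""
    for bits in product((0, 1), repeat=n):
        v = list(bits)
        for i, j in comparators:
            if v[i] > v[j]:
                v[i], v[j] = v[j], v[i]
        if v != sorted(bits):
            return False
    return True

# A standard 5-comparator sorting network for n=4 (illustrative stand-in for the n=14 case).
print(sorts_correctly(4, [(0, 1), (2, 3), (0, 2), (1, 3), (1, 2)]))  # True
```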
> If I publish a paper saying I have an algorithm which can factor large composites, and in the paper publish the factors to all of the RSA numbers
If some factors of those numbers are also large composites, without access to a good algorithm, nobody can truly verify your claims.
If not and you include all of those factors in an easily digestible way for computers to process (let's call that "code"), it will be easy for anyone to reproduce your results (run that code which multiplies all the factors and gets the resulting RSA numbers).
With code, they could easily check that there's not an error in your verification method too (eg. large number multiplication broken).
This would achieve both goals: you'd withhold your algorithm for security reasons, and your results would be easier to verify.
Edit: but to be honest, I think withholding the research is a bit of a special case. You are doing it on purpose, and you can easily offer a service to prove your algorithm works (eg. imagine a "factoring" web service that instantly gives you a hash of the resulting sequence of factors, and then only mails you the actual sequence in two days).
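As a sketch of that verification idea (using a toy semiprime rather than an actual RSA challenge number, and sympy only as a convenient primality test), the check needs the claimed factors, not the factoring algorithm:

```python
from sympy import isprime  # any trusted primality test will do

def verify_factorization(n, claimed_factors):
    """Verify a claimed factorization without knowing how it was found:
    the factors must multiply back to n, and (per the objection above)
    each must itself be prime."""
    product = 1
    for f in claimed_factors:
        product *= f
    return product == n and all(isprime(f) for f in claimed_factors)

# Toy semiprime standing in for an RSA challenge number: 3233 = 53 * 61.
print(verify_factorization(3233, [53, 61]))  # True
print(verify_factorization(3233, [7, 461]))  # False: the product is 3227, not 3233
```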
The point of interactive notebooks is not seeing and having access to all the data - it's seeing the abstractions at work, having a direct grasp of how they act on particular examples as an aid to understanding their formal definition.
Nothing prevents you from having two-column notebooks, if you find that advantageous, as well as abstract and conclusions sections. The part that you don't get with static paper is that of navigating the abstraction ladder[1] up and down with direct manipulation aids, instead of having to work it all in your head or by following dense detailed paragraphs.
I made an experiment with my last paper: write everything from scratch in Jupyter Notebook, including data preprocessing and generation of all figures, etc. (10 notebooks in total).
Start of the conceptualization was in 2017, and we just submitted it 2 weeks ago (it got desk rejected for not fitting the journal's topic).
I learned a lot and it was definitely worth it. The next paper will be easier with this knowledge. Nonetheless, there is an overhead, and I feel that this overhead is not valued with the current makeup of journals, where you really need to dig deep to find any supplementary materials.
I did [something similar] too when I started my PhD ... I had one Makefile-managed project that ran everything with dependencies, from raw data, to figures, and even embedding the numbers into the final, LaTeX-based PDF.
My supervisor manually copied all of the text from my PDF into a word document on his first revision ...
I think having the ability to focus on the parts of the paper you care about most is what would be most beneficial for all readers. You care more about an overview? You can easily find it (perhaps with graphics and walkthroughs). You care more about proofs? Then you can get them. What about code and experiments? And so on and so forth.
Readability and scalability are about making all this data available in the publication record, but easy to navigate for whoever is looking for whatever.
Moving the burden of assessing everything in a paper completely to the reader is an interesting idea but seems somewhat a step back when at the same time good and curated data gets ever more expensive. So the market for validated results is already not bad where those results "matter".
And not every paper has a lot of code or data associated with it. If you do experiments on organisms etc. then there is so much happening in the actual lab work - where would that go? Endless hours of video documentation?
In an IPython notebook you can fold away "blocks" of code, which means you can have everything there that produces the graphs and still be able to look under the hood if you like.
Isn't that the practical part about digital technology? That you are not limited to one view?
Papers today are longer than ever and full of jargon and symbols. They depend on chains of computer programs that generate data, and clean up data, and plot data, and run statistical models on data. These programs tend to be both so sloppily written and so central to the results that it’s contributed to a replication crisis, or put another way, a failure of the paper to perform its most basic task: to report what you’ve actually discovered, clearly enough that someone else can discover it for themselves.
I think this almost every time I read a paper. It’s like Linus’ “show me the code.” I just want papers now to “show me the data and the code.” And include a discussion about why these results are important. I think it’s a great time for the scientific community to improve transparency on these fronts.
Sincerely, someone who reads a lot of research but contributes none because I’m an amateur.
> I just want papers now to “show me the data and the code.”
But the code is secondary to the idea. The idea and the discussion around how it was arrived at and what it means is the key thing. The code is just there to implement it. You could code the same idea ten different ways.
I translated what you said in the context of mathematics:
“But the proof is secondary to theorem. The theorem and the discussion around how it was arrived at and what it means is the key thing. The proof is just there to show it’s true. You could write a proof for the same theorem ten different ways.”
Which is all true. But man it wastes so much time having to re-prove everything. Also some lemmas/theorems are so hard to prove. It’s much easier when you see some incredible statement and can’t believe it’s true to look at the proof and see where the mistake / contentious part is.
Yes! There is "reproducibility" in the sense that you can run the code and get the same answer, and this can be very useful. But I don't see that as the core domain of the research paper, which is to describe a new discovery. The paper's job should be to explain the discovery and give appropriate supporting evidence. This leads to a stronger form of reproducibility, like you say, which is "I understand the discovery and I can do it too". That's not the same as generating the exact result the authors show. And a paper that is reproducible in the first sense but not in the second is of limited scientific value.
If anything, I'd like to see more focus on giving evidence of generality of a result, vs just sharing everything needed to get back the same specific result
Yes, but chances are it only appears to work because the analysis code has bugs. So first I want to check the code and that it works before I put effort into understanding the idea.
Those aren’t CS papers; social science and biology research have constraints that CS does not. I haven’t seen any evidence that there is anywhere close to that level of issue here. A couple of conferences adopted artifact review, where an independent reviewer attempts to reproduce the experiments listed in the paper. Nearly all papers that participate do end up passing.
One case where this happens a lot is papers that pick bad comparisons as state of the art. If you have the code, you can run it vs better configurations of existing tools to see if the promises still hold up.
Yes, that's what I tried to say. I can easily try a piece of code with my own data to verify that its results are plausible. I can't do that for a paper. So if I don't have the code, I might waste a significant amount of time trying to reproduce a fake paper.
Some papers are about an idea, but others (perhaps most others) are about results. And the results are very much dependent on the data and how you analyzed it.
If I've learned anything in my career it is that no, ideas are not valuable. There are vastly more bad ideas than good ideas. What makes an idea valuable is validation. Papers aren't to present ideas: papers are to present ideas that have been validated. We proposed an idea, we went and ran some experiments or gathered data some other way, and we concluded the idea was valid (or not valid). The point of this discussion here is that ideas that require huge amounts of computer effort to validate are prone to bugs. The conclusions cannot be relied upon to be validated without having the software available so that it, too, can be validated.
Really? I've read a lot of SIGGraph papers, and sure, they didn't provide all the code. But you know that the code exists. And certainly for those there's a lot of trust. I think we're talking here about something different. Not "hey, you can use quaternions for animation" but "If you factor in hippopotamus usage, young adults experience 22% more flange per square doughnut, and we got all this data, and we ran it through 25 separate computer programs written in lolcode, and look, proof!"
> But there are so many valuable papers in CS that just presented an idea. If you ignored them you’d be ignorant of how to do 90% of modern engineering.
You are right. But that is why we are having this discussion, so we can improve the situation.
Having even bad code (and corresponding data) available is always better than not. You can always just ignore it, and read the papers like today.
Honestly I am OK with just a zip file of the project directory that you have anyway, ideally with a list of versions of the OS, libraries and programs used.
We could do a lot better than just a zip file, but that would be a nice start.
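As one possible step beyond a bare zip file, the "list of versions" could be generated automatically at submission time. A minimal sketch in Python (the package names are placeholders for whatever the project actually imports):

```python
import json
import platform
import sys
from importlib import metadata

def environment_manifest(packages):
    """Dump OS, Python, and package versions next to the project archive -
    a minimal sketch of the 'list of versions' suggested above."""
    return {
        "os": platform.platform(),
        "python": sys.version,
        "packages": {p: metadata.version(p) for p in packages},
    }

# Placeholder package list; list whatever your analysis actually depends on.
with open("environment.json", "w") as f:
    json.dump(environment_manifest(["numpy", "pandas"]), f, indent=2)
```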
For some papers. Others make a claim about some statistically significant look at data that might not have any basis in reality because the code is wrong. A famous example being the R&R paper in economics, where a second look at the analysis showed massive mistakes in the Excel document they were using, invalidating the central thesis. Unfortunately not before being used by the World Bank for years as a metric for forcing austerity on countries.
Not if the code is wrong, and therefore the conclusion may be wrong. I'm no scientist, but I don't think the point of scientific papers is to get unfounded ideas out into the world.
I can list many major influential papers in computer science that described an idea and didn't really give any concrete code, where we're still using the idea today.
For example the paper on polymorphic inline caching, which is the key idea for the performance of many programming languages today, just described the idea, and didn't present any code. How was it evaluated? People sat and thought about it. Holds up today.
You can reason about an idea through other things than concrete code. Code is transient and incidental. Ideas persist.
I think you're talking past each other. Both are true under different circumstances. In some cases an abstract idea is the important takeaway. In other cases the central point of a paper is to present conclusions that were arrived at based on analysis of some dataset. If the code used to generate or analyze the dataset is wrong, then conclusions based on it are likely worthless.
A lot of times the idea is wrong (and thus not valuable), and that can’t be proven either way without the code and data. So an idea that depends on code without the code is less valuable.
We can only possibly gain from publishing the code, and lose by not publishing.
It's not like it takes a whole lot of time to just dump your code in a github repo once you're done and link it somewhere on the paper (if you wrote code at all while working on the paper).
Sometimes I did just want to run my own experiments with different datasets, and those algorithms aren't always trivial to implement :|
Yep, but we should still show we did actually simulate our idea, and the methodology that gave rise to the simulation. Not for the code's own sake, but to test that the simulation we describe actually outputs what we propose.
Not everyone is a programmer, but they could find one to confirm, or better yet invalidate, the code my team relied on.
I agree that they should ideally come with raw data along with all code that was used to process it to produce the results as presented.
> but contributes none because I’m an amateur
I don't mean to be rude but it seems relevant to point out. Papers aren't written for the benefit of amateurs. They're written for experts who actively work in that specific field. I don't think there's anything wrong with that.
Yeah I agree, but I read papers mainly in domains I do have university level degrees in. So while I’m not as expert as a lifetime professor, I do know the fields relatively well.
And I don’t think it’s rude, that’s why I included that statement!
There are also legal and privacy concerns. I've worked on a few research papers where exactly one researcher had access to the data under a very strict NDA. And even they did not get full access to the raw data, only the ability to run vetted code against it and some subsets for development.
This is because the datasets were subscriber logs from mobile operators. They are both highly privacy sensitive and contain sensitive business knowledge. There is no way they will ever get published, even in some anonymized form.
Ultimately it always comes down to trust. You need to convince your peer reviewers to trust you that you have correctly done what you have claimed to have done. Of course, even when you publish datasets, you need to convince the peer reviewers to trust you that you didn't fake the data.
It doesn’t really work like that. For instance, imagine you have a simulation with billions of particles in it. To construct reduced data you may need to use many fields (position, temperature, composition) of all particles over many outputs (usually at different times).
In that case you shouldn’t need to ship the data at all. Just include the code for the simulation and let the researchers run it to generate the data themselves.
Sorry I'm a bit late to this, but those simulations take 10s to 100s of millions of CPU hours (i.e. costs of millions to 10s of millions of dollars), so that's not practical.
I think in astronomy they generate tens of terabytes per night and an experiment may involve automatically searching through the data for instances of something rare, like one star almost exactly behind another star, or an imminent supernova, or whatever. To test the program that does the searching you need the raw data, which until recently, at least, was stored on magnetic tape because they don't need random access to it: they read through all the archived data once per month (say) and apply all current experiments to it, so whenever you submit a new experiment you get the results back one month later.
I like the idea of publishing the data with the paper but it's not feasible in every case.
The GP is making a completely legitimate point here that broad sharing of large raw datasets is pretty hard, but I don't think anyone is arguing we should give up. Here's a few thoughts, though they're more directed at the general thread than the parent.
In my case I'm currently finishing up a paper where the raw data it's derived from comes to 1.5 PB. It is not impossible to share that, but it costs time and money (which academia is rarely flush with), and even if it was easy at our end, very few groups that could reproduce it have the spare capacity to ingest that. We do plan to publicly release it, but those plans have a lot of questions.
Alternatively we could try to share summary statistics (as suggested by a post above), but then we need to figure out at what level is appropriate. In our case we have a relevant summary statistic of our data that comes to about 1 TB that is now far easier to share (1 TB really isn't a problem these days, though you're not embedding it in a notebook). But a large amount of data processing was applied to produce that, and if I give you that summary I'm implicitly telling you to trust me that what we did at that stage was exactly what we said we'd done and was done correctly. Is that reproducibility?
You could also argue this the other way. What we've called "raw data" is just the first thing we're able to archive, but our acquisition system that generates it is a large pile of FPGAs and GPUs running 50k lines of custom C++. Without the input voltage streams you could never reproduce exactly what it did, so do you trust that? Then you're into the realm of is our test suite correct, and does it have good enough coverage?
I think we have a pretty good handle on one aspect of this: is our analysis internally reproducible? i.e. with access to the raw data, can I reproduce everything you see in the paper? That's a mixture of systems (e.g. configs and git repo hashes being automatically embedded into output files), and culture (e.g. making sure no one thinks it's a good idea to insert some derived data into our analysis pipeline that doesn't have that description embedded; data naming and versioning).
But the external reproducibility question is still challenging, and I think it's better to think about it as being more of a spectrum with some optimal point balancing practicality and how much an external person could reasonably reproduce. Probably with some weighting for how likely is it that someone will actually want to attempt a reproduction from that level. This seems like the question that could do with useful debate in the field.
I remember in grade school reports seemed so logical. Propose something, create a hypothesis about what you think you will see, record your data and what you observe during the experiment, summarize the results as to what actually happened vs what you initially expected. Most papers now seem like a foreign language and I can only glimpse at what is happening, relying on some math genius to relay the significance.
One of the first suggestions I have is to use source control and store the Git hash of code used to generate data. A few times I've heard back "we don't have time for that" - pretty easy to see how the replication crisis flows from processes like that.
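A minimal sketch of that suggestion, assuming the analysis lives in a Git repository: record the commit hash (and whether the working tree was dirty) next to every generated output.

```python
import json
import subprocess

def git_commit():
    """Commit hash of the code that produced this output."""
    return subprocess.check_output(["git", "rev-parse", "HEAD"], text=True).strip()

def git_dirty():
    """True if there are uncommitted changes, in which case the hash alone isn't enough."""
    return bool(subprocess.check_output(["git", "status", "--porcelain"], text=True).strip())

# Write provenance alongside every generated dataset or figure.
with open("results_provenance.json", "w") as f:
    json.dump({"git_commit": git_commit(), "uncommitted_changes": git_dirty()}, f, indent=2)
```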
Certain scientific software packages (e.g., Tensorflow, pymc3, etc.) do have frameworks that you follow to return pipeline and result objects that follow some data model that others can learn quickly (e.g., an arviz::InferenceData result object). I wish there was a more extensive framework where this is applied end-to-end, from data input, to library components in a pipeline process, to the result, and then to the plot.
Papers aren’t pop-sci articles; they do not target an audience that does not know anything about the field yet. They are from experts for experts. If someone wants to familiarize themselves with the language, symbols and methods of a field, a textbook is a better thing to start with. Over time they will also learn the shared knowledge of the field that isn’t even mentioned in these articles.
Worse in that a lot of researchers actually have only the slightest grasp on statistics. To the point that I would assume that a lot (1 in 20? :-)) papers will contain an error in their statistical analysis of their results.
I feel like the website paperswithcode.com addresses this very well, especially with their feature "quick start in Colab". For example, here's the top paper on the website as of now: https://paperswithcode.com/paper/towards-real-world-blind-fa.... Instead of going through the process of cloning a repo, initializing a fresh Anaconda environment from scratch, reading through nebulous, haphazard documentation about how to download the necessary training data, and then converting that training data into a format that's compatible with the code, I just click a link and run a couple of lines. Bam. I have an intuition about the code that's 100x better than reading the paper alone. Even though Colab isn't applicable to all fields and is largely used by the Data Science and Computer Science community, it is a promising step at modernizing science, especially the replicability of discoveries.
It's really disappointing that technical societies like the ACM and IEEE haven't done this already.
For many journals and conferences there isn't even a way to submit the code or other digital artifacts with the PDF. A few have badging for whether digital artifacts are provided and whether the results have been reproduced or repeated by others - steps in the right direction at least.
As much as I intensely dislike their practices of overcharging for journals and milking digital library subscriptions to fund administrative overhead, the technical societies are technically non-profits and exist to serve their members and the research and professional community. This is really something they should be doing.
> A few have badging for whether digital artifacts are provided and whether the results have been reproduced or repeated by others - steps in the right direction at least.
This part really downplays the considerable resources (ie time contributed by unpaid volunteers) required to do artifact evaluation.
It seems like you should be able to include more or less arbitrary binary artifacts as a form of supplemental information provided the peer reviewers skim and sign off on it and it fits within some size limit.
> ...For many journals and conferences there isn't even a way to submit the code or other digital artifacts with the PDF.
Why not stuff it into a repo and reference it in the text?
Of course, a central location within a stable institution helps with continuity of availability of such a repo. But at least this gives you some control over it.
I don't know if they necessarily should. It's easy enough to post your own supporting materials online. But it doesn't fit the publisher's mission of keeping a timeless paper trail of ideas.
In my field, at least, I think the problem is less about the medium, and more about the incentives. Researchers are incentivized to write papers that seem impressive (and intimidating) rather than clear and intuitive.
To make matters worse, this is an evolved trait: researchers whose papers are intimidating are more likely to succeed, which means they're more likely to have future PhD students, which means that the style of writing is more likely to get passed on.
I think the main way to address this is to change the incentives. In particular, by creating publication venues that value simplicity and clarity (one such conference is SOSA, which has had a lot of impact on theoretical computer science in the last few years).
> Researchers are incentivized to write papers that seem impressive (and intimidating) rather than clear and intuitive.
Ah, a fellow economist lol. Lack of clarity is a strategic advantage because (1) (as you said) it looks impressive and (2) it's hard to validate that it's correct.
So many papers contain elementary statistics mistakes such as survivorship bias, e.g. 'returns to education' is almost exclusively measured by asking individuals who graduated (on average 50% of enrolled students don't) and respond to surveys (good chance of bias).
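A toy simulation of that selection problem, with entirely made-up numbers (not a claim about the actual literature), shows how measuring only the "survivors" inflates the estimate when ability drives both graduation and wages:

```python
import random

random.seed(0)
TRUE_RETURN = 0.10  # wage premium actually caused by finishing the degree

grads, dropouts = [], []
for _ in range(200_000):
    ability = random.gauss(0, 1)
    finished = random.random() < 0.5 + 0.2 * ability   # abler students graduate more often
    wage = 1.0 + 0.2 * ability + (TRUE_RETURN if finished else 0.0)
    (grads if finished else dropouts).append(wage)

# Comparing survey respondents (graduates) against dropouts attributes the
# ability premium to education and overstates the true return.
naive = sum(grads) / len(grads) - sum(dropouts) / len(dropouts)
print(f"naive estimate: {naive:.2f}   true return: {TRUE_RETURN}")
```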
Pubs are how you get jobs. It's not about science anymore, it's about navigating bureaucracy for an elite job.
> Ah, a fellow economist lol. Lack of clarity is a strategic advantage because (1) (as you said) it looks impressive and (2) it's hard to validate that it's correct.
Reminds me of the old "there are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies."
There is the whole WEIRD participants thing ... of which I am guilty too, only because that's the crowd you can easily recruit for experiments on campus.
I hope the "trend" of researchers writing for the public, via books & blog posts, can contribute some incentive towards being clear and intuitive.
I'm thinking about the success of Freakonomics, Thinking Fast and Slow, The Elegant Universe, etc. These are all academics, who've "translated" their research for the masses. That translation ended up being much more impactful - and prestigious - than an intimidating paper.
I hope this becomes more of a trend, and an incentive structure, in the future.
As a practicing scientist, I firmly believe the world would be much better off if we simply published version-controlled Jupyter notebooks on a free site, such as GitHub or ArXiv.
That's awfully field specific. It probably wouldn't work for most of STEM. Even for ML I shudder to imagine trying to make sense of the inevitable monstrosities. Writing a paper is part of the thinking process. It forces the author to sit down and work through things in an orderly manner and they're still often difficult to read.
I'm definitely in favor of all papers being accompanied by working source code when relevant though.
> It forces the author to sit down and work through things in an orderly manner and they're still often difficult to read.
As a former academic: Papers are difficult to read primarily because the academic community does not value making them easier to read - no other reason. You may hear things like "papers should be written for other experts", but even that doesn't hold up to scrutiny.
They typically spend 99% of their research time on the actual research, and less than 1% on writing the paper. They definitely can afford the time and energy to make the papers easier to read, but game theory holds sway: Why should a particular researcher use his/her time to do it, when his/her peers will not appreciate it? It's purely an internal, cultural problem. There are no external constraints leading to this.
I've seen referees send papers back saying they contained too much explanation, and suggest leaving out most of the details - just include the big picture methodology and show the results. I can guarantee most who will read the paper would not be able to reproduce those details, if they ever want to. Likewise, I've found papers where I couldn't reproduce the results, because the results were wrong - but since including a derivation of your final expressions is discouraged, no referee caught the errors.
> They typically spend 99% of their research time on the actual research, and less than 1%
This isn’t anywhere close to my experience. 2-3 days of writing per year seems like a wild underestimate for any academic I know. I’d maybe believe only 20% of time spent writing, but for some folks even that’s probably way too low
As a former academic, I mostly disagree. In my experience, papers are difficult to read because they are as concise as possible, striving to refer to previous work for anything that's not original, and only elaborating on original things. This is done to make them as quick as possible to read by experts, which is pretty important given the immense volume of papers that appear in many fields.
I do think papers nowadays need to include a link to a zip file (or whatever other format - but it should be a boring old format unlikely to change or be abandoned, and not proprietary either) including all data, code, and so on. This data is necessary to verify the paper's results, but it is not the results themselves.
> I mostly disagree. In my experience, papers are difficult to read because they are as concise as possible, striving to refer to previous work for anything that's not original, and only elaborating on original things. This is done to make them as quick as possible to read by experts, which is pretty important given the immense volume of papers that appear in many fields.
This is consistent with what I said: Papers are difficult to read by choice. Yes, they strive to make it as concise as possible, which translates to making them harder to read.
Where I would disagree is the claim that it is done to make it as quick as possible to read by experts. In my field, the experts would skim papers quickly to get an idea, but if they then honed in on a paper to actually extract the meat of it so they can use it in their own work, it would take a lot of time, and was a pain.
I've heard from math professors that it takes about a day to read and digest one page of a journal article in their field.
Also disagree on it merely being a matter of consulting references. My work was theoretical/computational. It was common to see the final equations without any derivation, and many experts would not be able to reproduce it. There are lots of tricks that go into the derivation, but they are not provided under the pretext that any expert should be able to solve the equations and derive them.
And in the day of digital media, it's quite trivial to write a paper the way you suggest, and then put all the extra details in appendices. I guarantee that they will be read by most people who want to read the paper in detail, as opposed to merely skimming it.
Explorable explanations[1] is what you want, not Jupyter notebooks.
Explorables have all the same requirements to carefully think them through and prepare them for reading and clarity of exposition, but they also have the interactions that ease the introduction of concepts to their readers through a hands-on approach (rather than forcing them to read the mind of the writer by reverse-engineering their thought process, by running in your short-term memory the examples given in a non-interactive paper).
I'm pretty sure all of us have the experience of cramming most of the paper into the two weeks before the deadline. Expecting someone to firm up their presentation without the risk of finalizing something half-assed is ... too much to ask for mere mortals.
So they would suffer the same decay all other web resources do? Broken links, no longer maintained tools, their opinionated choice (or lack of flexibility - which text + mathematical notation do have - if a set of stacks and data formats gets standardized).
Nobody is preventing scientists from publishing code and data in addition to & before the paper, which imho itself should be as conservative in format as possible to provide the most universal baseline for understanding, reproducibility, and reliability.
> So they would suffer the same decay all other web resources do? Broken links, no longer maintained tools, their opinionated choice (or lack of flexibility - which text + mathematical notation do have - if a set of stacks and data formats gets standardized).
Tools like Zenodo [1] are meant to solve this exact problem, and ensure these kinds of data don't suffer web decay.
I think what you really want is a FOSS Mathematica.
It's sort of code, but more convenient for just getting work done. You can pass around the whole thing (data + code). No need to learn software development skills or set up a development environment to replicate results; it runs everywhere. Plenty of power to do advanced things. Already a standard in research.
I'm not an academic, but a physicist working in R&D at a company. So my "papers" are only for internal consumption, and not earth shattering anyway. My colleagues and I are using Jupyter extensively.
My observation, from seeing papers that have been written in Jupyter, and observing how people work, is that Jupyter will first gain traction in disciplines that are already computation-heavy, and where open software is closer to the front end of the data pipeline.
For instance in my case, I develop measurement instruments, so everything I make is computerized, by me or my colleagues. While "raw data" may be in the form of things like voltages, they are almost immediately turned into a Python friendly data format by code that I wrote myself. So I'm up to my armpits in data and code just to get my experiments even barely working in the first place. I have a computer with coding tools literally at every bench in the lab. Jupyter is my lab notebook, and often my "report" is just the same notebook, dressed up with some readable commentary.
Now, contrast that with somebody like a synthetic bench chemist. The data that they get may be in computer readable form, but they rarely do any coding during the course of a project. For analysis, they're satisfied with the computations rolled into their instrument software, or Excel. And a fair amount of their analysis is in the form of explaining their way through an argument that connects data from disparate measurement techniques, using pictures and graphs. They don't program. The ones who can program have gone into software development. The ones who are using Jupyter are motivated to use it, as an end unto itself. Bringing that stuff together in Jupyter wouldn't help much. Many of their journals do require submission of raw data.
This is similar to questions about why so many people use Excel. I think you have to actually immerse yourself in the specific work environment an observe or even experience what people are experiencing, what they're actually studying, how they think, and so forth. There's a certain Chesterton's Fence aspect to discussions that start with the premise that some widespread activity is hopelessly broken beyond repair and must be immediately abolished.
I do share code that way, but the traditional ivory tower standards by which I am judged require "refereed journal publications" in high impact factor traditional journals. I'm trying to fight back against that, largely unsuccessfully.
What would help me is to have the old geezers consider GitHub issues, PRs, and commits as a type of citation and to have a better way of tracking when my code gets used by others that is more detailed than forks.
I also think citations of your work that find errors or correct things should count as a negative citation. Because otherwise you are incentivized to publish something early and wrong. Thus the references at the end of the paper should be split into two sections: stuff that was right and stuff that was wrong.
> I also think citations of your work that find errors or correct things should count as a negative citation.
Strong disagree. Given how much influence colleagues can have over one another's career prospects, how petty academic disagreements can get, admin focus on metrics like citation count, and how it's easier to prove someone else wrong than to do your own original work (both have value, one is just easier), it would end up with people ONLY publishing 'negative citations' (or at least the proportion would skyrocket). I think that would be bad for science and also REALLY bad for the public's ability to value and understand science.
> Thus the references at the end of the paper should be split into two sections: stuff that was right and stuff that was wrong.
This, on the other hand, is brilliant and I love it and want to reform all the citation styles to accommodate it.
Superficially yes, but in actuality it would be very different due to the context surrounding academic papers vs. Reddit.
Organizationally speaking, Reddit is a dumpster fire; check out the 'search' function (I'm just speaking on a taxonomical/categorization perspective, I can't speak to their dev practices).
Academic papers aren't. (They're a dumpster fire in their own ways: The replication crisis and the lack of publishing negative results comes to mind, but damn if they aren't all organized!)
There are two key differences:
1.) Academic papers have other supporting metadata that could combine with the more in-depth citation information to offer clear improvements to the discovery process. Imagine being able to click on a MeSH term and then see, in order, what every paper published on that topic in the past year recommends you read. I also think improving citation information would do a lot to make research more accessible for students.
2.) Reddit's system lets anybody with an account upvote or downvote. Given you don't even need an email address to make a Reddit account, there's functionally zero quality control for expressing an opinion. For academic publications, there is a quality control process (albeit an imperfect one). If only 5 people in the world understand a given topic, it's really helpful to be able to see THEIR votes: If they all 'downvote' a paper that would suggest it's wrong.
> the references at the end of the paper should be split into two sections: stuff that was right and stuff that was wrong
I've seen stuff like this said before but I don't think it would work. Most citations are mixed in my experience. A few objections, a bunch of stuff you aren't commenting on, and some things you're building on. Or you agree with the raw data but completely disagree with the interpretation. Others are topical - see <work> for more information about <background>. Probably more patterns I'm not thinking of.
Yes. We should record all of this, and turn them into easily browsable graphs/hypertext to easily assemble sets of papers to read/look into. At the very least things like 'background reading', 'further reading', 'supporting evidence' and 'addressed arguments' would be useful.
'We' meaning the librarians and archivists. You guys actually researching have more than enough to do.
Actually I think that's an intriguing idea for how to improve citations. Instead of a single <work> citation, have multiple <work, subset> citations that include a region of text as well as basic categorization of how the citation is being used in that instance.
I'm not sure if it would prove feasible in practice. It seems like it would aid the writing process in some cases by helping the author keep track of details. But in other cases maintaining all that metadata would become too much of a burden while writing, so it would get put off, and then it would all fall apart.
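As a sketch of what a <work, subset> citation record might look like (the field names are my own, and the categories loosely follow the list suggested a few comments up; nothing here is an existing standard):

```python
from dataclasses import dataclass
from enum import Enum

class CitationUse(Enum):
    BACKGROUND = "background reading"
    FURTHER = "further reading"
    SUPPORTING = "supporting evidence"
    ADDRESSED = "addressed argument"

@dataclass
class Citation:
    """One <work, subset> citation: which work, which part of it, and how it is used."""
    cited_doi: str      # identifier of the cited work
    cited_span: str     # e.g. "Section 3.2" or "Table 1"
    use: CitationUse
    citing_text: str    # the passage in the new paper that relies on it

# Hypothetical example of the richer citation record sketched above.
print(Citation("10.1000/example", "Table 2", CitationUse.SUPPORTING,
               "Our effect size is consistent with prior estimates."))
```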
I was imagining a post-writing process akin to assigning a paper its DOI[0] or a book its cataloguing info. Citations as they are can be done by researchers without impacting the process because they're binary: Something either is cited or it isn't. It either contributed to the creation of the research or it didn't. This probably couldn't be done by the researchers, but you identified why: The citations are data but this would be metadata.
Definitely don't want to encourage papers to take even LONGER.
> What would help me is to have the old geezers consider GitHub issues, PRs, and commits as a type of citation
As a geezer myself I am imagining what this type of request would look like in less than 5 years from now.
"What would help me is to have the old geezers consider Tiktok videos and replies as a type of citation"
i.e. if you open this particular can of worms for a very restricted subset of users (not only programmers but specifically programmers who also happen to use github), you have to open it to everyone else. I am sure plenty of Youtube research qualifies as "citation" if you start counting Github commits.
Papers qua papers aren't the goal. The idea is to advance our collective understanding of a field. Papers are certainly a means to that end, but other things can too, like code, videos, and blogposts, even if they don't fit into the "6000 words and 6 figures" box.
I get that citations and citation metrics feel objective, but they emphatically aren't great measures of research/researcher "quality".
Tiktok videos are primarily used for entertainment, rather unlike Jupyter notebooks and source code repositories. Surely you have a more serious objection.
For entertainment, I tend to read about things outside my field — things like In the Pipeline, where I learn about FOOF and chlorine trifluoride and freaky molecules like jawsamycin (insert shark theme here). I also watch Chemical Safety and Hazard Investigation Board videos on YouTube, like that time a refrigerator accident at a poultry plant caused hydraulic shock and released a massive cloud of ammonia, and ~150 contractors hanging out across the river working on Deepwater Horizon cleanup measures got sent to the hospital.
This is a little exaggerated. Most papers have to be somewhat readable to be accepted into journals and notable conferences. The fact that the layman cannot understand an advanced biology paper is nothing new. I'd wager the scientific papers of the "golden age" the author cites as being so readable were not very readable for the general public of the time. It's just that we are taught those things in elementary school and so can see the concepts in those older papers much better than the people of the time could.
Some medical journals have done something very similar to this for quite some time. They also define any acronyms used and have a brief summary of the key results.
I definitely agree with the author that a major gap that has continually failed to be bridged between the scientific community and the more general bulk of people is a true understanding of the field of science as a whole. Even the way most people casually throw around the term "research" is telling: it reveals a genuine lack of understanding of not just complex scientific and mathematical models but science as an industry and a tool for understanding life on this planet.
Nonetheless, I wonder whether the goal should be to make it so any person could understand a complex paper. Should all people strain to understand every study? There are experts in certain fields for that very reason. It is not always possible to accurately explain higher-level concepts to people who lack foundational knowledge that can take years to accrue. I am not certain that changing papers to be more interactive is going to bridge the gap as this author hopes, or that it is even the goal that should be pursued.
James Somers is one of my favorite science/tech writers. Thank you, James, for enriching my mind over the years. I have a bunch of your New Yorker articles to read but thoroughly enjoyed your 'The Coming Software Apocalypse' story. Guys, check it out here: https://www.theatlantic.com/technology/archive/2017/09/savin...
It's easy to say things are obsolete when you are your own publisher (Victor, Wolfram, Perez) or when you can suggest your favorite (even if it's very cool) approach (Jupyter Notebooks) as a potential key solution. If you can't be your own publisher, it's a much more difficult proposition.
We're trying to figure out how to facilitate taxonomists publishing their own taxon pages, i.e. species descriptions, from a science 250+ years old. Our MVP use case is ~20k pages, one per species, for one project. There will be many of these projects, though maybe not many with 20k pages, and some with much more than 20k pages. Updates are needed with as little latency as possible, with data from multiple sources. There has to be basically zero cost to serve these (I know, nothing is free). Sites must be trivially configurable (e.g. clone a GH template repo and edit a YAML file and some markdown). Even if we can get this infrastructure in place, we then have to figure out how to get the social structure in place to have this type of product recognized as equivalent to traditional on-paper publishing, i.e. advance people's careers because they "published".
In my field, until we give people the power to publish on their own, I don't see traditional publishing going away. Many in the past have indeed published (traditionally) their own species descriptions on their own dime, meeting the rules within the various international codes of nomenclature. I also don't have a problem with dead wood - if we go digital too fast we will lose so much, for any number of reasons associated with the ephemeral nature of electron-based infrastructures.
Great read. Taking from my field (CS), I think a lot of papers suffer from the idea that you only are supposed to show "the interface", like, what the result is, what you achieved. The "how" or the "why" are sometimes neglected, regarded merely as a "technicality" to account for the rigorous mathematical framework that "must be there".
There is little effort put into making your results understandable and easy to replicate. Academia values paper production, which requires convincing peer reviewers that your results are not trivial and are worth publishing. Contrary to what the essay states, I don't think many scientists today think of their research as "incremental". In fact, the word is used in many places as a derogatory term, indicating that a result doesn't contain enough novelty to deserve publication. Researchers are instead incentivized to make their constructions and results as complicated and inaccessible as possible.
This is not just a theory, this is something I've seen over and over throughout the years.
As an academically employed scientist of 20 years, the notion that scientific communication suddenly needs better standards puzzles me. The core research curriculum of nearly every scientific field I’ve seen, STEM or otherwise, is that the data needed for replication are non-negotiable. A paper that doesn’t include it would be table rejected by any editor. Or one would hope. This is taught at the UNDERgraduate level, for heaven’s sake.
The thought clusters emerging from the recent “replication crisis” are a fascinating rabbit hole to crawl into. If you stay near the surface, you will find mostly young scholars cheerleading open science as the obvious solution to replication difficulties. The concepts of pre-registering your study, committing to sharing data, and publishing online are all various components of this idea, varying in their necessity by the author’s devotion to their cause.
But there are several downsides to such a system that aren't immediately obvious. For example, does the skill set of the successful scientist broaden to include how good they are at poaching findings from public data that the original authors haven't yet noticed themselves?
Some of the more recent criticisms invoked the spectre of “platform capitalism”, and suggested the Facebook and Linkedin-ification of science by dumping all its data on a centralized platform would likely have a net negative effect.
This article was written in 2018, and most of the discussions I’ve read since then have suggested that the open science initiative has failed despite the rapid penetration of Jupyter and visualization tools in the scientific process. Perhaps, like most things, the unseen market will pick and choose the good out of the dubious.
> The core research curriculum of nearly every scientific field I’ve seen, STEM or otherwise, is that the data needed for replication are non-negotiable. A paper that doesn’t include it would be table rejected by any editor. Or one would hope. This is taught at the UNDERgraduate level, for heaven’s sake.
This may vary based on discipline, but in both the subdisciplines of experimental and theoretical physics I was involved in: No - very few will provide the data/derivation. My professors were very open about this: They don't want to lose their competitive edge. Almost no experimentalist I knew could take papers from his/her field and reproduce the results, because the papers lacked enough detail to do so. They would mention a technique, but there are lots and lots of nuances involved when building equipment to carry out the technique[1], and these are intentionally excluded. It's unlikely you'll be able to build the equipment the same way the original authors would.
[1] Most experimental physics involves building your own equipment, or at the least modifying existing equipment.
> if the experiment can't be reproduced to be verified, how does the paper provide more proof than a blanket 'trust me'?
It doesn't, and it is a big "trust me". People review papers based on the merits of the idea and methodology, and then tend to trust the results. Of course, if the results are very "significant" (e.g. cold fusion), then it will be scrutinized more, people will fail to reproduce, and they will harass the author. 99% of papers don't fall in this category, though.
> isn't this letting politics come before science?
Yep. The games at play are often: "How do I write my paper in the most convincing way?" and "As a referee, this paper is hurting the research work I am currently doing. What is the best way to reject this paper?"
The extremely annoying part was that I felt I was back in literature courses, where I'm graded on very subjective metrics. It was horrible, especially when all my work was extremely objective. But the publishing system is not incentivized to be that objective.
Simple example: A colleague's paper was rejected because he explained a phenomenon using method A, and the referee complained there was no mention of method B. Method B was the hot topic of the day. Neither method A nor method B had good empirical data to support it - it was almost purely theoretical at that point. But that community was gravitating towards method B, and really did not want to see alternative explanations.
If you are working in the same niche, you will likely know the tricks of the trade. You usually have an idea of what your peers are working on and it's often a race to see who gets a paper out first. This is what conferences are for as well, to try and figure out what your peers are up to.
Sounds like there's a need for some sort of time-based escrow on "the full paper"... Of course, game-theory strikes again because nobody will want to pay for it.
> At no point do I want pretty visualisations made by wannbe PhD candidates
See Figure (5). Your argument doesn't really counter any part of the scientific notebook. A notebook will still have the abstract and conclusion (results & discussion). The tools mentioned in the article describe how to restructure the methods, data, and figures. You're not going to look at those anyway until the abstract and conclusion intrigue you.
>The earliest papers were in some ways more readable than papers are today. They were less specialized, more direct, shorter, and far less formal. Calculus had only just been invented. Entire data sets could fit in a table on a single page. What little “computation” contributed to the results was done by hand and could be verified in the same way.
This is not even close to true. Look up Tycho Brahe's observations, and he's only the earliest I can think of.
Looked at another way: if you read the early articles in biology, they are very easy to read, and funny! Wallace's articles, for example, read like travel notes. But using them in research is a nightmare...
Computational notebooks are great, but only seeing what the author (or coder) wanted you to see is not enough to evaluate their work. In addition to data and code, we need to see the path they took, the exploration and experimentation that led to the final presentation of ideas in the notebook.
For this we could use cloud-based environments controlled by funding agencies/universities that ensure every interaction with the data is recorded from the very beginning.
Something like this would at least reduce the risk of p-hacking practices that would otherwise be there even if everyone used notebooks instead of papers.
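To make that concrete, here is a minimal sketch of what "every interaction with the data is recorded" could look like from the researcher's side: an append-only, hash-chained audit log wrapped around each analysis step. The decorator, log format, and file names are made up for illustration; a real system would record this on the platform side, out of the researcher's control.

```python
# Sketch of an append-only, hash-chained audit log for analysis steps.
# Illustrative only: a real platform would keep this log server-side.
import hashlib, json, time
from functools import wraps
from pathlib import Path

LOG = Path("analysis_audit.log")

def _last_hash():
    if not LOG.exists():
        return "genesis"
    return json.loads(LOG.read_text().splitlines()[-1])["entry_hash"]

def audited(func):
    """Record every call (name, arguments, timestamp), chained to the previous
    entry so earlier records cannot be silently rewritten later."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        entry = {
            "step": func.__name__,
            "args": repr(args) + repr(kwargs),
            "time": time.time(),
            "prev_hash": _last_hash(),
        }
        entry["entry_hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        with LOG.open("a") as fh:
            fh.write(json.dumps(entry) + "\n")
        return func(*args, **kwargs)
    return wrapper

@audited
def run_test(dataset, alpha=0.05):
    ...  # the actual analysis; every invocation, including abandoned re-runs, is on record
```

Even this toy version makes the "try twenty tests, report one" pattern visible after the fact, which is most of what you need to discourage p-hacking.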
As a publishing scientist myself I would have hoped to read more about how I can actually publish Jupyter Notebooks in a way that is recognised academically. That's at least what the title implied for me.
But the article is actually about Mathematica vs. Jupyter notebooks. Still, it's well researched and very interesting.
Nevertheless, the question of how to publish better remains open. I for one think some progress could already be made if arXiv published HTML articles by default, rather than those unwieldy PDFs that really only work well when printed on paper.
I don't disagree with the main point of the article, but I think it underplays the extent to which most publications being information-poor is the fault of the low standard of writing we've become accustomed to and complacent with over the years, rather than of the medium itself.
This is not too unlike maintainable code. The code platform itself matters to some extent, but far less than the extent to which the author wrote with maintainability in mind.
Years ago I did a lot of original research in a field that wasn't very well developed at the time. However, that research took place in the context of a startup, not academia. I was rewarded for producing solutions that worked - not for publishing papers.
I was approached by a few academics about publishing what I'd worked on, but I never did. I never did because I consumed large stacks of papers every month, and I absolutely hated the pompous, obfuscated portioning out of ideas fragment by fragment. It was an unnecessarily time-consuming, and often quite useless, way of sharing information. Especially since source code often wasn't part of what was published, so a lot of important information got lost (which I guess was the entire point of not publishing code).
I particularly remember a 4-5 page paper that was so poorly written it took me a couple of readings (weeks apart) to realize that it described something I too had worked on. How bad is a paper when it is so obfuscated that it takes effort to recognize something you have worked on too?
I wasn't interested in wasting time dressing up my notes in drag. And if my notes as they were were not good enough, well, then someone else would surely do the same work independently and publish something at some point. Lots of the things I worked on inevitably were described by other people.
I have a love-hate relationship with scientific papers for the simple reason that they sometimes aren't really about science, but about scoring points in academia and certain types of research organizations. Yes, a lot of interesting goodies are published, but my god there is a lot of garbage that gets published. Not least because people in academia are incentivized to get as many papers as possible out of what ought to be a single publishable unit.
If we incentivize authors to spam us, they will spam us.
If it's any consolation, even if the source code was provided, it is highly unlikely that it would have been of much better organisation or quality than the paper itself.
I'm afraid you are right, but it should be a priority that people with PhDs know how to write decent code. In several workplaces I've worked in, the term "PhD code" was a euphemism for "disorganized, buggy mess".
what came first? notebooks in mathematica or knuth's ideas on literate programming[1] ?
regarding notebooks themselves, i feel like they're a high-concept idea, but i've yet to see them really click for me in practice. i find the small cells for code extremely unergonomic, and the interspersal of code and plots distracting from both the code and the plots (although it's pretty fantastic for demonstrating high-level features of a library, programming language or environment).
on a more fundamental level, i completely agree that mathematical notation is lossy, and that it takes a lot of skill to go from some arcane notation to an actual sense of what the relationships are- but, it requires no specific functioning technology to do so. i can review a paper from 100 years ago and understand it, where running a computer program from 20 years ago can be a challenge at best.
i think that additional high touch experiences for data exploration and teaching are fantastic ideas, but i also think that maybe the base level of communication should be kept simple; both for the purposes of maintaining accessibility and history. where the linux kernel developers insist on 78 column listservs, maybe scientists should insist on camera ready documents when it comes time to share.
i think that everyone agrees better science would come from full data and code being supplied with publications, but interop is quite difficult, as is keeping code alive. i suppose the big question is: does it make sense to move science towards how software is done, where every bit of code is actively maintained over the years to avoid rot, or does it make sense to come up with a scheme for freezing and archiving the computing environments used in science, so those in the future can reproduce results or errors as they see fit? (something like: every paper must ship with a vm image for a widely available architecture that includes no proprietary code and all data used for the results)
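as a rough sketch of the freezing-and-archiving option (file names and layout here are just illustrative, not any existing standard), a paper's build step could at least emit a manifest like this next to the vm image:

```python
# sketch: capture interpreter, package versions, and data-file hashes into a
# manifest archived with the publication. a full scheme would also snapshot
# the os image; this only records enough to detect drift later.
import hashlib, json, platform, sys
from importlib.metadata import distributions
from pathlib import Path

manifest = {
    "python": sys.version,
    "platform": platform.platform(),
    "packages": sorted(f"{d.metadata['Name']}=={d.version}" for d in distributions()),
    "data_files": {
        str(p): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in Path("data").glob("**/*") if p.is_file()
    },
}

Path("environment_manifest.json").write_text(json.dumps(manifest, indent=2))
```

it doesn't solve code rot by itself, but it at least tells a future reader exactly which frozen environment they need to resurrect.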
interesting questions. how to fundamentally change scientific communication such that it is enriched with data and code properly is a harder/organizational problem that i think many have tried to solve (not to mention how this ties into another problem in science- idea validation/replication and knowledge rot). building software systems for exploration and data analysis (ie; computer as partner in exploration) sounds much more fun and likely to produce useful results!
> Similar things can be said about the textbook and the lecture
If you haven't yet, maybe look at Andy Matuschak's "Why books donʼt work" [1] and "How can we develop transformative tools for thought?" [2], which connect these same ideas to education in general.
They do. First, just like with everything else, there are brilliant, good, mediocre, and outright poorly written books. Among those, some may work for you, others miss the mark completely, depending on your prior experience and background (as the author rightly notices, books are just a medium). Second, did the author expect to become a domain expert after finishing a single book? Clearly, his expectations are unrealistic then. You start somewhere, then use references to deepen your knowledge. That's a task requiring interest and dedication, and no "several lifetimes of research" (into mnemonics and learning methods), as he puts it, will replace that.
I'm uncertain if you read the entire article, but the point is that even the greatest non-fiction books aren't optimal when the objective is learning.
His expectations aren't unrealistic; the methods he's suggesting have decades of research behind them pointing to learning that's much more efficient than traditional reading.
> even the greatest non-fiction books aren't optimal when the objective is learning
Says who? I say he's doing it wrong.
> the methods he's suggesting have decades of research behind them pointing to learning that's much more efficient than traditional reading
Where are the results of that research, then, the culmination in the form of a medium superior to books? His mnemonic quantum book doesn't look like one. People need understanding, not memorization.
The paper as a way to publish information is rooted in a world that no longer exists. That world consisted of scientists accessing information in printed form through libraries that subscribed to the relevant magazines, journals, and whatnot from publishers. That's still a thing as far as publishers are concerned, but I have rarely set foot in a library since Google became a thing last century.
The last thirty years have changed that game: publishing basically no longer involves printing (other than for a very few select publications) and has switched to a digital-only form.
I completed my PhD and read thousands of papers (well, skimmed mostly; it's a fine art to zoom in on the relevant stuff), all without visiting the library more than once or twice. I only printed the ones actually worth reading in detail. And these days I consume vast amounts of information on screen without ever using a printer. My printer is fifteen years old and I just installed the second toner cartridge I ever bought for it.
Yet we still pretend to have "journals" as if we're in the 19th century. It's the equivalent of writing your friend a letter to inform them that your train is delayed by 5 minutes. Most sane people use some kind of instant messaging tool for that. Writing letters of course used to be a primary way for scientists to communicate. That too has stopped being a thing. People use email now.
The whole point of publishing is to convey information in a form that's convenient to the reader and to solicit endorsement from your peers (via peer review). Peer review used to be implied by virtue of an editor choosing to select a certain paper for publication, which in turn implies they consulted a number of peers about the suitability of that paper. It's sort of the super tedious equivalent of soliciting a thumbs-up in a social network.
If you publish on LinkedIn because you are some kind of wannabe influencer, you basically need to get people to 1) read your stuff and 2) click the like or share button. Scientific publishing is not that different. You have wannabe scientists who want the attention of the influencers (reputable peers) so people will be convinced they know their shit. This ultimately translates into degrees, research funding, and tenure-track positions. The whole process is biased towards metrics because that's how universities choose to allocate their money.
An ambitious scientist behaves much like a LinkedIn influencer and will try to game the system by flooding it with content and getting their buddies to sign off on it. There are a lot of mediocre articles published in obscure places, with cliques of scientists doing each other favors by referencing each other's work, or worse, self-referencing. In LinkedIn terms, this would be the absolute drivel that nobody likes but that gets re-shared by a few people who also don't manage to produce much content of interest.
So here's a thought: maybe get this a bit more out in the open and give scientists some modern tools to endorse each other's work. The best endorsement is a reference, which is basically a link. Tracking links between bits of paper is super tedious, so these papers need permanent URLs. And they need to be digitally signed by their authors so we have some authenticity and can prevent cheating. Scientists also need a place where they can debate and exchange thoughts about these papers; that used to be a big tradition among scientists back when they still wrote letters to each other or used journals to criticize each other's work.
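As a rough sketch of what author-signed papers could look like (this uses the third-party Python "cryptography" package; the key handling and file names are purely illustrative, not an existing system):

```python
# Illustrative only: sign the hash of the artifact behind a paper's permanent
# URL with an author key, so any reader or citing service can verify that what
# they fetched is what the author actually published.
import hashlib

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

author_key = Ed25519PrivateKey.generate()     # in reality: a long-lived author identity key
public_key = author_key.public_key()          # published on the author's profile

paper_bytes = open("paper.pdf", "rb").read()  # the file behind the permanent URL
digest = hashlib.sha256(paper_bytes).digest()
signature = author_key.sign(digest)           # distributed with the paper's metadata

# Verification by anyone holding the public key:
try:
    public_key.verify(signature, digest)
    print("signature valid: this is the file the author published")
except InvalidSignature:
    print("file does not match the author's signature")
```

Signed, permanently addressable papers would also make the reference graph itself verifiable, which is the foundation any endorsement layer would need.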
Curating and aggregating work by linking to it is a job that should not be reserved for the fussy editors of non-paper-based journals that absolutely nobody ever reads cover to cover. HN for science; why not? Why not have a multitude of websites referring to, editorializing on, and commenting on published work? How is that not a thing?
While I agree with the premise of this article - that the scientific paper is obsolete - the proposed alternative does not address the primary inefficiencies with our current system. Dynamic 'papers' with illustrative examples are surely an improvement over current manuscripts. However, a flashy paper should be well down the list of goals for a research program. At the top of the list should be to make an important discovery or breakthrough.
Currently there is way too much trivial shit being published in the ever-expanding number of scientific journals.
Our current model for career advancement in academia is partially to blame for this. Landing a tenure track job requires having a prolific publication record. And once you've landed that coveted ladder-rank position, the pressure to publish only heats up, with tenure on the line. And then even after you've secured tenure, career advancement still depends heavily on the ol' publication record.

Pre-tenure the pressure to publish in high-impact journals is immense; it's the kind of pressure that drives otherwise honest people to consider partaking in fraud (and unfortunately those who stick to their morals often lose out to those who fabricate results to some degree). Post-tenure the pressure changes from securing high-impact papers to just getting on whatever papers you can.

Review boards for associate professors would more readily give a promotion to someone with 20 meaningless papers over the last three years than someone with 2 papers in CNS journals over the same timeframe, even though a single paper in Cell/Nature/Science is typically more impactful than 100 papers in Frontiers or other similarly dogshit journals. So post-tenure pay raises are based on getting as many words in print as possible, using the least amount of effort to do so.
I almost forgot where I was going with this... right, so, in my opinion we need to switch from a publication-based mindset to a discovery-based mindset. We (the public) provide the NIH with $30 billion per year, with the idea that such an investment will lead to medical breakthroughs, discoveries, and other innovations that can concretely improve our health and wellbeing. However, so much of that is wasted on the idiosyncrasies of career advancement inside the ivory tower.
If I were the director of the NIH, my sole purpose would be to end this nonsense. My first order of action would be to stop accepting grant applications from individual PIs; I would only entertain applications from a small force of scientists (4-8 lab equivalents) with thorough and cogent plans for making breakthroughs on cancer or heart disease etc. I would change the minimum R01 funding amount from <$500k to >$10 million, and change the grant renewal timeline from every year to every 5 years, but require yearly progress updates to ensure the proposed experiments were soundly conducted.

I would encourage the equal reporting of both positive and null findings. Performance would not in any way be based on positive findings, only that the experiments were carefully run. I would require that all raw data be deposited into publicly accessible repos (not at the end of the study, not yearly, but whenever data is generated it should be made accessible asap). I would encourage research groups to bother drafting/submitting interim manuscripts (interim meaning prior to completion of the full 5-10 year study) only if they find something important; otherwise, just provide a comprehensive writeup at the end of the study. This final writeup would not be submitted to a 3rd-party journal. It would be posted directly on the NIH website I'd have created for such results reporting, which would naturally also be publicly accessible. That would be a start...
No, that's how it currently works. What I propose aims to fix that. You must have missed this entire section...
"I would encourage the equal reporting of both positive and null findings. Performance would not in any way be based on positive findings, only that the experiments were carefully run."
Would you have considered the LHC experiments "failures" had we not found evidence of the Higgs Boson? I certainly wouldn't have. What matters most is that important questions backed by sound theoretical reasoning are addressed carefully, collaboratively, openly, and with plenty of long-term support. If we do that, we get genuine answers - such answers represent important information whether or not they support our original hypotheses.