Even in science there is no 'requirement' that you have a controlled experiment in order to have evidence that a claim is true. Following your argument you can't substantiate that humans are the result of evolution, because we can't take two groups of early primates, subject one to evolutionary forces and not the other, and see what happens. Instead we can observe a chain of correlations with plausible mechanisms that indicate causation and say it's evidentiary.

For example: data indicating that unvaccinated people died at a higher rate, plus data indicating that people who chose not to vaccinate self-report that they made that choice based on particular information they believed. That would be evidence that helps substantiate the theory that the information led to deaths. It's not 'proof'. We can't 'prove' that exposure to the information actually led to the decision (people sometimes misattribute their own decisions), it would be impractical to collect vaccine-decision rationales from a large number of folks pre-death (though someone might have), and you can't attribute a particular death to a particular decision (because vaccines aren't perfectly protective), so you have to do statistics over a large sample. But the causal chain is entirely plausible based on everything I know, and there's no reason to believe data around those correlations can't exist.

And science isn't about 'proof'. Science is about theories that best explain a set of observations and, in particular, have predictive power. You almost never run experiments (in the 8th-grade-science-fair sense) in fields like astronomy or geology, but we have strong 'substantiated' theories in those fields nonetheless.
A causal chain being plausible does not justify or substantiate a claim of causation.
I absolutely would say that we can't prove humans are the result of evolution. The theory seems very likely and explains what we have observed, but that's why it's a theory and not a fact: it's the last hypothesis standing, generally accepted but not proven.
My argument here isn't with whether the causation seemed likely, though we can have that debate if you prefer, and we'd have to dig deep into the accuracy and reliability of data reporting during the pandemic.
My argument is that we can't make blanket statements that misinformation killed people. Not only is that not a proven (or provable) fact, it skips past what we define as misinformation and ignores what was known at the time in favor of what we know today. Even if the data you point to shows correlation and possible causation today, we didn't have that information during the pandemic, at the time YouTube was pulling down content for questioning efficacy or safety.
The journal appears to be published by an office with 7 FTEs, which is presumably funded by the money raised by the paywall and by sales of their journals and books. Fully-loaded costs for 7 folks are on the order of $750k/year.
https://www.kentstateuniversitypress.com/
Someone has to foot that bill. Open-access publishing implies the authors are paying the cost of publication and its popularity in STEM reflects an availability of money (especially grant funds) to cover those author page charges that is not mirrored in the social sciences and humanities.
Unrelatedly, given recent changes in federal funding, Johns Hopkins is probably feeling like it could use a little extra cash (losing $800 million in USAID funding, overhead rates potentially dropping to existential-crisis levels, etc...).
> Open-access publishing implies the authors are paying the cost of publication and its popularity in STEM reflects an availability of money
No, it implies the journal isn't double-dipping by extorting both the author and the reader while not actually performing any valuable task whatsoever for that money.
> while not actually performing any valuable task whatsoever for that money.
Like with complaints about landlords not producing any value, I think this is an overstatement? Rather, in both cases, the income they bring in is typically substantially larger than what they contribute, due to economic rent, but they do both typically produce some non-zero value.
Johns Hopkins University has an endowment of $13B, but as I already noted above, this journal has no direct affiliation with Johns Hopkins whatsoever so the size of Johns Hopkins' endowment is completely irrelevant here. They just host a website which allows online reading of academic journals.
This particular journal is published by Kent State University, which has an endowment of less than $200 million.
Except if any of your pages are cached between the eyeball and your server, in which case your server logs don't capture everything that is going on. You can get fancy with web server logs, but depending on what you're trying to understand, they may not be the data you need.
<source: did fancy things with logs over the last 25 years, including running multiple tools on the same site in parallel to do comparisons (Analog, AWStats, Urchin, GA, Omniture, homegrown, etc...)>
If you control the cache layer, log it there. If you don't control the cache layer, does a read from the end user cache really count as a separate visit anyway?
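As a toy sketch of what 'log it there' could look like (the origin URL, TTL, and in-process dicts are just placeholders for whatever your real CDN or reverse proxy does): the cache layer records every request before deciding hit vs. miss, so the origin's logs only ever see the misses.

    import time
    from urllib.request import urlopen

    ORIGIN = "https://example.com"   # placeholder origin server
    CACHE_TTL = 60                   # seconds a cached copy stays fresh

    cache: dict[str, tuple[float, bytes]] = {}     # path -> (fetched_at, body)
    access_log: list[tuple[float, str, str]] = []  # (timestamp, path, HIT/MISS)

    def handle(path: str) -> bytes:
        now = time.time()
        entry = cache.get(path)
        if entry and now - entry[0] < CACHE_TTL:
            access_log.append((now, path, "HIT"))   # the origin never sees this request
            return entry[1]
        access_log.append((now, path, "MISS"))
        body = urlopen(ORIGIN + path).read()        # only misses reach the origin
        cache[path] = (now, body)
        return body

Count only at the origin and you miss every HIT line above.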
There are plenty of situations where someone visiting a page once and someone repeatedly looking at that page over a period of days (even if it is pulled from their browser cache) is an important difference. Obviously it depends on what you're using the data to try to understand.
A simple reason would be if you're just using it as a proxy signal for bad bots and you want to reduce the load on your real servers and let them get rejected at the CDN level. Obvious SQL injection attempt = must be malicious bot = I don't want my servers wasting their time
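Roughly like this (the patterns and paths are just illustrative, and a real setup would express this in the CDN's own rule language rather than in Python):

    import re
    from urllib.parse import unquote_plus

    # A few obvious SQL-injection probe signatures; real lists are much longer.
    SQLI_PATTERNS = [
        re.compile(r"union\s+select", re.IGNORECASE),
        re.compile(r"'\s*or\s+1\s*=\s*1", re.IGNORECASE),
        re.compile(r";\s*drop\s+table", re.IGNORECASE),
    ]

    def looks_like_sqli(request_target: str) -> bool:
        """True if the decoded path+query matches an obvious injection probe."""
        decoded = unquote_plus(request_target)
        return any(p.search(decoded) for p in SQLI_PATTERNS)

    def edge_decision(request_target: str) -> str:
        """Reject obvious probes at the edge; forward everything else to the origin."""
        return "reject 403" if looks_like_sqli(request_target) else "forward to origin"

    print(edge_decision("/products?id=1'%20OR%201=1--"))  # reject 403
    print(edge_decision("/products?id=42"))                # forward to origin

The point isn't perfect detection, just shedding obviously malicious traffic before it costs you origin capacity.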
Sounds about right. I remember first hearing about it in a talk Doug Crockford gave at my university around that time. It blew my mind. I thought it was like gcc for the Internet. It's kind of wild that the complete rise and fall of MongoDB and Node.js we've experienced in the interim, and even today's React paradigm, are all expressions of this tiny little functional scripting language.
Do they really use 90% less water cleaning the floors of women's restrooms than men's currently? Cause that's one implication of the 'new design will save cleaning' claim. Places with public restrooms that I'm familiar with seem to get their floors mopped with identical frequency (and similar apparent rigor) regardless of whether they have urinals or not.
Not if they hop to a different IP address every few requests. And they generally aren't bothered by slow responses. It's not like they have to wait for one request to finish before they make another one (especially if they are making requests from thousands of machines).
How do you think they do crawling if not like that? They'd be IP banned instantly if they used any kind of predictable IP regime for more than a few minutes.
I don't know what is actually happening, that's why I'm asking.
Also, you're implying that the only way to crawl is to essentially DDoS a website by blasting it from thousands of IP addresses. There is no reason crawlers can't do more sites in parallel and avoid hitting individual sites so hard. Plenty of crawlers over the last few decades haven't caused problems; these are just stories about the ones that do.
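All it takes is per-host rate limiting: fan out across many hosts in parallel while spacing out requests to any one host. A rough sketch (the delay, URLs, and fetch() stand-in are made up for illustration):

    import asyncio
    import time
    from urllib.parse import urlparse

    PER_HOST_DELAY = 5.0  # minimum seconds between requests to the same host

    class HostThrottle:
        def __init__(self, delay: float):
            self.delay = delay
            self.lock = asyncio.Lock()
            self.next_allowed = 0.0

        async def wait_turn(self) -> None:
            # Reserve the next slot for this host, then sleep outside the lock.
            async with self.lock:
                now = asyncio.get_running_loop().time()
                wait = max(0.0, self.next_allowed - now)
                self.next_allowed = max(now, self.next_allowed) + self.delay
            if wait:
                await asyncio.sleep(wait)

    async def fetch(url: str) -> None:
        print(f"{time.strftime('%X')} fetching {url}")  # stand-in for a real HTTP request

    async def crawl(urls: list[str]) -> None:
        throttles: dict[str, HostThrottle] = {}

        async def worker(url: str) -> None:
            host = urlparse(url).netloc
            throttle = throttles.setdefault(host, HostThrottle(PER_HOST_DELAY))
            await throttle.wait_turn()
            await fetch(url)

        await asyncio.gather(*(worker(u) for u in urls))

    asyncio.run(crawl([
        "https://example.com/a", "https://example.com/b",  # spaced 5s apart
        "https://example.org/a", "https://example.net/a",  # fetched in parallel
    ]))

You can fan out across as many hosts as you like; the per-host spacing is what keeps any individual site from ever noticing you.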