
I would guess that every day they're comparing the current weather against the forecast from 15 days ago. Not a lot of data points, to be sure, but perhaps enough to be confident of very high accuracy.



I don't think that's necessary.

You can do a backtest for any point in the past, as long as you only use the data that was available up to 15 days before the day being predicted.
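A minimal sketch of what that could look like (the `model` and `archive` interfaces here are hypothetical placeholders, not the paper's actual API): for each target day, hand the model only what was known 15 days earlier, then score against what was actually observed.

    from datetime import date, timedelta

    import numpy as np

    def backtest(model, archive, start: date, end: date, lead_days: int = 15):
        """Score lead_days-ahead forecasts over the dates in [start, end]."""
        errors = []
        day = start
        while day <= end:
            cutoff = day - timedelta(days=lead_days)
            history = archive.up_to(cutoff)        # only data known at forecast time
            forecast = model.predict(history, target=day)
            observed = archive.observed(day)       # ground truth, used only for scoring
            errors.append(np.abs(forecast - observed).mean())
            day += timedelta(days=1)
        return float(np.mean(errors))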


So they say. I'm reminded of Google Flu Trends [0]. They likely also did similar "verification", and it didn't work.

> The initial Google paper stated that the Google Flu Trends predictions were 97% accurate comparing with CDC data.[4] However subsequent reports asserted that Google Flu Trends' predictions have been very inaccurate, especially in two high-profile cases. Google Flu Trends failed to predict the 2009 spring pandemic[12] and over the interval 2011–2013 it consistently overestimated relative flu incidence [...]

[0] https://en.wikipedia.org/wiki/Google_Flu_Trends


Disclaimer: I work at Google.

One of the difficulties with using user data to understand society is that the company isn't a static entity. Engineers are always changing their algorithms for purposes that have nothing to do with the things you're trying to observe. For Google Flu Trends specifically, here's a great paper:

https://gking.harvard.edu/files/gking/files/0314policyforumf...


2013 was a bad year for H1N1 or whatever it was. It killed my sister while she was vacationing back east. I think it was aggressive, but it was quite cold that winter, and that may have completely screwed with their data/interpretation at the time. For instance, it snowed five times and stuck in central Louisiana that winter, continuing into February/March. It's been colder since, but that year was a real outlier (in my experience on earth; this is my supposition, and I've been trying to figure it out for 11 years).


This gets tricky. Once you look into the past, you're presumably looking at data that was used to generate your training corpus, so you'd expect better accuracy on that than you would find on present/future predictions.
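One way to guard against that, sketched below with a hypothetical dataset of dated records (not anything from the paper): keep a hard cutoff so nothing inside the evaluation window, minus the lead time, ever appeared in the training corpus.

    # Hypothetical dated records; ISO date strings compare correctly as plain strings.
    TRAIN_END = "2018-12-31"   # last date allowed in the training corpus
    EVAL_START = "2019-01-15"  # evaluation starts after train end + 15-day lead time

    train = [x for x in dataset if x["date"] <= TRAIN_END]
    evaluation = [x for x in dataset if x["date"] >= EVAL_START]

    # Sanity check: no training record overlaps the evaluation window.
    assert max(x["date"] for x in train) < min(x["date"] for x in evaluation)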


No need to guess. It's in the paper:

https://www.nature.com/articles/s41586-024-08252-9

See section "Baselines".


Alternatively, they can do backtesting: using historical data, they feed a subset into the predictor, then compare its predictions to actual history.



