
If ARC-AGI were a good benchmark for "AGI", then MindsAI should effectively be blowing away current frontier models by an order of magnitude. I don't know what MindsAI is, but the post implies they're basically fine-tuning or using a very specific strategy for ARC-AGI that isn't really generalizable to other tasks.

I think it's a nice benchmark of a certain type of spatial/visual intelligence, but if you have a model or technique specifically fine-tuned for ARC-AGI, then it's no longer A"G"I.



Perhaps a benchmark could be a good approximate upper bound for something without being a good approximate lower bound for that thing?


I clarified in another post that I meant benchmarking standalone models, not ones fine-tuned for solving ARC.


I mean, there are a lot of tasks that frontier models excel at that many humans wouldn't be able to complete.



