> That's fine if you want to measure sample efficiency, but ARC-AGI is supposed to measure progress towards AGI.
"On the Measure of Intelligence" defines intelligence as skill-acquisition efficiency, I believe, where efficiency is measured with respect to whatever the limiting factor is. For each ARC task, the primary limiting factor is the number of samples it provides, and the skill is the ability to convert inputs into the correct outputs. In other words, in this context, intelligence is sample efficiency, as I see it.
What do you think about limiting the submission size? Kaggle does this sometimes.
With a limit like 0.1-1MB (compressed), you are basically saying: "Give me sample-efficient learning algorithms, not pretrained models."
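
To make the idea concrete, here is a rough sketch of what such a check could look like. Everything in it is my own assumption, not how Kaggle actually enforces size limits: the names `LIMIT_BYTES`, `compressed_size`, and `within_limit` are made up, gzip is just one possible choice of compressor, and I'm reading "1MB" as 10^6 bytes.

```python
import gzip
import os

# Assumed cap: the upper end of the 0.1-1MB range, read as 10**6 bytes.
LIMIT_BYTES = 1_000_000


def compressed_size(payload: bytes) -> int:
    """Size in bytes of the payload after gzip compression."""
    return len(gzip.compress(payload))


def within_limit(payload: bytes, limit: int = LIMIT_BYTES) -> bool:
    """True if the compressed payload fits under the (hypothetical) cap."""
    return compressed_size(payload) <= limit


# A small, repetitive piece of solver source code compresses to very little.
solver_source = b"def solve(grid):\n    return grid\n" * 100

# Random bytes stand in for pretrained weights here: like most weight
# files, they are close to incompressible.
fake_weights = os.urandom(2_000_000)

source_ok = within_limit(solver_source)
weights_ok = within_limit(fake_weights)
```

Source code for a learning algorithm fits easily, while anything resembling a couple of megabytes of weights does not, which is the whole point of the cap.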