Interactions are usually smaller than main effects; the idea that they are half the size is just a convenient figure for getting a point across. If you allowed the interaction size to vary from 1% to 99% of the main effect, you'd get such a huge range of required sample sizes that it wouldn't be useful for illustrating his point: that interactions require a larger sample size to estimate than just tacking on another main effect. If you wanted to detect a main effect that is half the size of the others, you would only need (roughly) 4x the sample size, not 16x. Part of the 16x figure is due to the way Gelman defines "interactions half the size of the main effects", but the idea that interactions are harder to estimate than main effects is true.
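To make the 4x versus 16x arithmetic concrete, here is a rough sketch. It assumes a balanced 2x2 design with a common residual standard deviation, which is my reading of the setup rather than something spelled out above; the function names and the values of `sigma` and `n` are made up for illustration.

```python
import math

def se_main_effect(sigma, n):
    # Main effect = difference between two group means,
    # with n/2 observations in each group (assumed balanced design).
    return sigma * math.sqrt(1 / (n / 2) + 1 / (n / 2))

def se_interaction(sigma, n):
    # Interaction = difference of two differences across four cells,
    # with n/4 observations in each cell (assumed balanced design).
    return sigma * math.sqrt(4 / (n / 4))

def sample_size_multiplier(se_ratio, effect_ratio):
    # Factor by which n must grow to keep effect / SE unchanged,
    # since the standard error shrinks like 1 / sqrt(n).
    return (se_ratio / effect_ratio) ** 2

sigma, n = 1.0, 400  # illustrative values only
se_ratio = se_interaction(sigma, n) / se_main_effect(sigma, n)

# A main effect half the size: same SE, half the effect.
half_size_main = sample_size_multiplier(1.0, 0.5)
# An interaction half the size: larger SE and half the effect.
half_size_interaction = sample_size_multiplier(se_ratio, 0.5)

print(se_ratio, half_size_main, half_size_interaction)
```

Under these assumptions the interaction's standard error is twice that of a main effect, so halving the effect size on top of that multiplies the required sample size by 4 x 4 = 16, while a half-size main effect only costs a factor of 4.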
The assumption that interactions are smaller than main effects generally holds in real data. There's an idea in statistical modeling called the hierarchy of effects: main effects tend to be the largest, while second-order effects (two-way interactions) are smaller and rarely appear without the corresponding main effects. It's possible to construct datasets in which the hierarchy of effects doesn't hold (synergy only, without any benefit from a single variable), but it's uncommon in real life to have an interaction without both corresponding main effects being present.