Most fictional long-form video (whether live-action movies or cartoons, etc) is composed of many shots, most of them much shorter than 7 seconds, let alone 60.
I think the main factor that will be key to generate a whole movie is being able to pass some reference images of the characters/places/objects so they remain congruent between two generations.
You could already write a whole book in GPT-3 from running a series of one-short-chapter-at-a-time generations and passing the summary/outline of what's happened so far. (I know I did, in a time that feels like ages ago but was just early last year)
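For anyone curious what that loop looks like, here is a minimal sketch of the chapter-at-a-time idea. The `generate` and `summarize` callables are hypothetical placeholders for whatever model call you use (they are not a real GPT-3 API), and the prompt wording is just an assumption. The same shape would presumably apply to shots if you swapped the text summary for reference images.

```python
from typing import Callable, List


def write_book(
    outline: List[str],
    generate: Callable[[str], str],   # hypothetical: prompt -> chapter text
    summarize: Callable[[str], str],  # hypothetical: long text -> short summary
) -> List[str]:
    """Generate one short chapter per outline entry, carrying a running summary."""
    chapters: List[str] = []
    summary = ""
    for number, beat in enumerate(outline, start=1):
        # Each generation only sees the summary so far plus the next outline beat.
        prompt = (
            f"Story so far: {summary or '(nothing yet)'}\n"
            f"Write chapter {number}: {beat}\n"
        )
        chapter = generate(prompt)
        chapters.append(chapter)
        # Fold the new chapter into the summary passed to the next generation.
        summary = summarize(summary + " " + chapter)
    return chapters


if __name__ == "__main__":
    # Stand-in callables so the sketch runs without any model behind it.
    fake_generate = lambda prompt: "[chapter text for] " + prompt.splitlines()[1]
    fake_summarize = lambda text: text.strip()[-200:]
    for chapter in write_book(["the heist", "the escape"], fake_generate, fake_summarize):
        print(chapter)
```

The only state carried between generations is `summary`, which is why continuity depends entirely on how good that summary (or, for video, the reference material) is.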
> I think the main factor that will be key to generate a whole movie is being able to pass some reference images of the characters/places/objects so they remain congruent between two generations.
I partly agree with this. The congruency, however, needs to extend beyond two generations. If a single scene is composed of multiple shots, then all of those shots need to belong to the same world the scene is set in.
If you check the video with the title `A beautiful homemade video showing the people of Lagos, Nigeria in the year 2056. Shot with a mobile phone camera.`, the surroundings do not seem to make sense: the view starts in a market, spirals around a point, and then ends at a bridge that does not fit into the market.
Getting the different shots generated by the model to fit together seamlessly is where the difficulty comes in. However, I do not have any experience in video editing, so this is just speculation.
> You could already write a whole book in GPT-3 from running a series of one-short-chapter-at-a-time generations and passing the summary/outline of what's happened so far.
Why would this be different?