AI Dungeon (and its successors like NovelAI [1] and HoloAI [2], which emerged after its infamous censorship purges and mutinies) is designed almost entirely around that problem, and it has a number of clever solutions that let writers keep relevant context in memory for a story that is longer than the model's maximum input length.
* "Memory", which is a customizable block of text that is repeated at the top of each API input. If your story revolves around your character being a vampire who melts in the sunlight, you can include something like "you are a vampire who will die if you go out in the sunlight" in the memory, and even 1000 characters back a paragraph of context can prime the AI accordingly.
* "Author's Note", which is a customizable short block of text that is inserted invisibly a few lines before your current place in the story when it's sent to the API. A note such as "[A/N: The following section includes depictions of graphic violence]" or "The following section is written in Shakespearean old English", as obvious and blunt as it might seem, actually works surprisingly well for nudging the AI towards a certain style or content.
* "World Info", which is a customizable dictionary of short text blocks that are conditionally added to the top of the API input like memory when a certain key appears in the current context. Imagine you have a story with 10 important characters who cycle in and out of the story. If you create an entry in the world info about Bob, then when you write that "Bob appears from behind the shrub", the blurb about Bob is automatically tacked on to the context so long as Bob is mentioned by name in the last few dozen inputs.
In general, both GPT-3 and the open-source alternatives from EleutherAI such as GPT-J-6B can use a context primer from 1000 tokens earlier to shape the current tail of a story. It's actually kind of uncanny how good they are at it: you can have a story whose memory at the top says "goblins always have purple skin" and then notice the AI mentioning it as an offhand detail much farther down in the context.
* "Memory", which is a customizable block of text that is repeated at the top of each API input. If your story revolves around your character being a vampire who melts in the sunlight, you can include something like "you are a vampire who will die if you go out in the sunlight" in the memory, and even 1000 characters back a paragraph of context can prime the AI accordingly.
* "Author's Note", which is a customizable short block of text that is inserted invisibly a few lines before your current place in the story when it's sent to the API. A note such as "[A/N: The following section includes depictions of graphic violence]" or "The following section is written in Shakespearean old English", as obvious and blunt as it might seem, actually works surprisingly well for nudging the AI towards a certain style or content.
* "World Info", which is a customizable dictionary of short text blocks that are conditionally added to the top of the API input like memory when a certain key appears in the current context. Imagine you have a story with 10 important characters who cycle in and out of the story. If you create an entry in the world info about Bob, then when you write that "Bob appears from behind the shrub", the blurb about Bob is automatically tacked on to the context so long as Bob is mentioned by name in the last few dozen inputs.
In general, both GPT-3 and the open source alternatives by EleutherAI such as GPT-J-6B are able to use a context primer from 1000 tokens prior to affect the current tail of a story. It's actually kind of uncanny how good they are at it -- you can have a story that in the memory at the top says "goblins always have purple skin" and notice that the AI will mention it as an offhand detail much farther down in the context.
[1] https://novelai.net/
[2] https://writeholo.com/