Though it's just prompt basics, there was a story on NPR recently about Michelle Huang. She fed GPT-3 her diary entries and then had a conversation with the resulting model.
> Michelle Huang: Younger Michelle is trained on these diary entries. So I put in diary entries from the age of 7 to 18. I kept diaries for a really long time, and then ended up creating chat prompts where I had lines from a present Michelle. So I was able to ask my younger Michelle questions, and then the AI essentially just populated the younger Michelle text for what she would have theoretically answered based off of the diary entries that I was able to give her.
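For anyone curious what that prompt-basics version looks like in practice, here's a minimal sketch (the file name, diary text, and persona framing are my guesses, not details from the story), using the legacy pre-1.0 `openai` Python package's Completion API:

```python
import openai  # legacy pre-1.0 interface; reads OPENAI_API_KEY from the environment

# Hypothetical: a handful of diary excerpts pasted into the prompt as context.
diary_excerpts = open("diary_excerpts.txt").read()

prompt = (
    "The following are diary entries Michelle wrote between ages 7 and 18:\n"
    f"{diary_excerpts}\n\n"
    "Answer the next question in the voice of younger Michelle.\n"
    "Present Michelle: What did you worry about back then?\n"
    "Younger Michelle:"
)

response = openai.Completion.create(
    model="text-davinci-003",
    prompt=prompt,
    max_tokens=150,
    stop=["Present Michelle:"],  # stop before the model writes the next turn itself
)
print(response["choices"][0]["text"].strip())
```

Everything has to fit into the context window with this approach, which is exactly the limit fine-tuning would sidestep.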
I suspect an even more rigorous approach would be to bake the material into the model directly through fine-tuning. That would be a way around the 4k-token context limit, and it would avoid having to ask ChatGPT to pretend that something is the case.
Fine-tuning would be worth experimenting with for a public figure for whom a sizable transcribed corpus of interviews exists, since interviews are easy to convert into "prompt"/"completion" pairs:
{"prompt": "What do you think of him, Socrates? Has he not a beautiful face?", "completion": "Most beautiful"},
{"prompt": "But you would think nothing of his face if you could see his naked form: he is absolutely perfect.", "completion": "By Heracles there never was such a paragon, if he has only one other slight addition."}
....
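For what it's worth, here is a rough sketch of running such a file through the legacy fine-tuning endpoint (the file name and base model are assumptions; this is the pre-1.0 `openai` Python interface, which OpenAI has since superseded):

```python
import openai  # legacy pre-1.0 interface; reads OPENAI_API_KEY from the environment

# Upload the training file: one {"prompt": ..., "completion": ...} object per line.
upload = openai.File.create(
    file=open("socrates.jsonl", "rb"),
    purpose="fine-tune",
)

# Start a fine-tune of a base completion model on the uploaded file.
job = openai.FineTune.create(
    training_file=upload["id"],
    model="davinci",  # assumption: any of the base completion models would do
)
print(job["id"])

# After the job finishes, the tuned model is queried like any other:
#   openai.Completion.create(model=job["fine_tuned_model"], prompt="...", stop=["\n"])
```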
You raise a good point about using fine-tuning this way. Honestly, from the documentation I struggled to see how I could apply it to some situations, but I think I just need to dig deeper. My use case is essentially getting data about properties and trying to answer unanticipated questions about them.
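For my case, I imagine flattening each property record into the same prompt/completion shape, something like this (the schema and records below are made up); whether the tuned model then generalizes to questions I didn't anticipate is exactly what I'd want to test:

```python
import json

# Made-up property records; the real schema would come from my data.
properties = [
    {"address": "12 Elm St", "beds": 3, "baths": 2, "year_built": 1987},
    {"address": "4 Oak Ave", "beds": 2, "baths": 1, "year_built": 2004},
]

with open("properties.jsonl", "w") as f:
    for p in properties:
        pair = {
            "prompt": f"Tell me about the property at {p['address']}.",
            # Legacy-format convention: the completion starts with a space.
            "completion": f" {p['beds']} bed, {p['baths']} bath, built in {p['year_built']}.",
        }
        f.write(json.dumps(pair) + "\n")
```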
Thanks for the link on how this was done; I'll be trying to learn from that.