Maybe a dumb question - how does a model that is trained to predict the next word answer questions, as shown in the reading comprehension example? Do you just feed it the question and watch it generate the answer, or is something else going on?
If I remember correctly, they say that since the training set contains extracts of question-answer sessions, the model will detect the pattern and follow it when you give it an appropriate prompt.
So yes, you just feed it the question and, detecting that it is a question, it answers.
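To make that concrete, here's a rough sketch using the Hugging Face transformers library. The passage, question, and "Q:/A:" template are just illustrative, GPT-2 wasn't trained on any fixed format; it simply tends to continue whatever Q&A-looking pattern you condition it on:

```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Made-up passage and question, formatted as a Q&A-style prompt
passage = ("The Amazon rainforest covers much of the Amazon basin "
           "of South America.")
prompt = passage + "\nQ: What does the Amazon rainforest cover?\nA:"

inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(
    **inputs,
    max_new_tokens=20,
    do_sample=False,                      # greedy decoding
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
)
# Keep only the continuation past the prompt, i.e. the generated "answer"
answer = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:])
print(answer)
```

No QA-specific training happens here; the "answer" is just the most likely continuation of the prompt.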
You add a linear classifier on top to predict the start and end positions of the answer span. The augmented model is then trained on a QA dataset like SQuAD to actually learn how to answer questions.
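The head itself is tiny; something like this in PyTorch (a hypothetical module, with the hidden size assumed to be BERT-base's 768):

```python
import torch.nn as nn

class SpanHead(nn.Module):
    """Linear classifier over each token's final hidden state,
    producing a start score and an end score per token."""
    def __init__(self, hidden_size=768):
        super().__init__()
        self.qa_outputs = nn.Linear(hidden_size, 2)

    def forward(self, hidden_states):
        # hidden_states: (batch, seq_len, hidden_size) from the encoder
        logits = self.qa_outputs(hidden_states)           # (batch, seq_len, 2)
        start_logits, end_logits = logits.split(1, dim=-1)
        return start_logits.squeeze(-1), end_logits.squeeze(-1)
```

Fine-tuning then minimizes cross-entropy between these logits and the gold start/end token positions from SQuAD.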
Hugging Face has a simple implementation that augments BERT in this manner, and you can see the code there. Their BERT QA model gets about an 84 F1 on SQuAD 1.1, which is really strong performance. You can augment their GPT-2 implementation similarly.
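For example, running one of their pretrained SQuAD-fine-tuned checkpoints looks roughly like this (the model class and checkpoint name are from the transformers library; the passage and question are made up):

```python
import torch
from transformers import BertForQuestionAnswering, BertTokenizer

name = "bert-large-uncased-whole-word-masking-finetuned-squad"
tokenizer = BertTokenizer.from_pretrained(name)
model = BertForQuestionAnswering.from_pretrained(name)

question = "What does the Amazon rainforest cover?"
passage = ("The Amazon rainforest covers much of the Amazon basin "
           "of South America.")

# BERT takes the question and passage as a sentence pair
inputs = tokenizer(question, passage, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Argmax over the start and end logits picks the predicted answer span
start = outputs.start_logits.argmax()
end = outputs.end_logits.argmax()
print(tokenizer.decode(inputs["input_ids"][0][start : end + 1]))
```

Note the contrast with the prompting approach above: here the answer is extracted as a span of the passage rather than generated token by token.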