Thank you for sharing this! I am currently studying NLP.
Along the way, I've been struggling with a question and I hope someone can help me understand how to go about it: how would you build a model that does more than one NLP task? For a simple classifier like input: text (a tweet) and output: text (an emotion), you can fine-tune an existing classifier on such a dataset. But how would you build a model that does NER and sentiment analysis? E.g. input: text (a Yelp review of a restaurant) and output: a list of (entity, sentiment) tuples (e.g. [("tacos", "good"), ("margaritas", "good"), ("salsa", "bad")]). If you have a dataset structured this way and want to fine-tune a model, how does that model know how to make use of a Python list of tuples?
If you have the dataset, you could try fine-tuning a model like T5 [1] (see the notebook in [2]).
You just need to create [(input, output)] examples in the format you want.
For example:
[("a Yelp review of a restaurant", [("tacos", "good"), ("margaritas", "good"), ("salsa", "bad")])]
With enough data, the model should be able to learn to generate the output in the right format.
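Very roughly, the fine-tuning step could look like this (an untested sketch, not from [1] or [2]; the model name, learning rate, and tiny toy dataset are just placeholders):

    # Minimal sketch of fine-tuning T5 on (review, serialized-tuple-list) pairs
    # with Hugging Face transformers. Hyperparameters and data are assumptions.
    import torch
    from transformers import T5TokenizerFast, T5ForConditionalGeneration

    examples = [
        ("The tacos and margaritas were great but the salsa was bland.",
         '[("tacos", "good"), ("margaritas", "good"), ("salsa", "bad")]'),
    ]

    tokenizer = T5TokenizerFast.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained("t5-small")
    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

    model.train()
    for epoch in range(3):
        for review, target in examples:
            inputs = tokenizer(review, return_tensors="pt", truncation=True)
            labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids
            loss = model(**inputs, labels=labels).loss  # standard seq2seq loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

The only "multi-task" part is that the target string happens to encode both the entities and their sentiments; the model just learns to reproduce that format.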
> Python list of tuples
Things get interesting if you want to generate actual Python code. You can use a large language model with just a few examples of the task to generate such code. For example, see https://reasonwithpal.com/.
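In practice the few-shot version can be as simple as a prompt with a couple of worked (review -> list-of-tuples) examples, handed to whatever large language model you have access to. The examples and wording below are made up for illustration; how you call the model and parse the completion is up to you:

    # Hypothetical few-shot prompt; send `prompt` to an LLM of your choice.
    FEW_SHOT_PROMPT = """\
    Extract (entity, sentiment) pairs from the review as a Python list of tuples.

    Review: "Great burgers, terrible fries."
    Pairs: [("burgers", "good"), ("fries", "bad")]

    Review: "The tacos and margaritas were great but the salsa was bland."
    Pairs: [("tacos", "good"), ("margaritas", "good"), ("salsa", "bad")]

    Review: "{review}"
    Pairs:"""

    prompt = FEW_SHOT_PROMPT.format(review="Loved the carnitas, service was slow.")
    # parse the model's completion, e.g. with ast.literal_eval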
You could start by looking into either multitask transformers or really general seq2seq models like T5. With T5, for example, it just learns to transform one text sequence into another. So you could fine-tune T5 to produce your target sequence, but rather than outputting an actual Python list of tuples, it would output a string that looks like a sequence of tuples, which you would then parse back into Python objects yourself.
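Something like this for the round trip (a sketch, assuming you already have a fine-tuned checkpoint; "my-finetuned-t5" is a placeholder, and ast.literal_eval plus a try/except is just one way to guard against malformed generations):

    # The model emits a *string* shaped like a list of tuples; parsing it back
    # into Python objects is your job, not the model's.
    import ast
    from transformers import T5TokenizerFast, T5ForConditionalGeneration

    tokenizer = T5TokenizerFast.from_pretrained("my-finetuned-t5")  # placeholder
    model = T5ForConditionalGeneration.from_pretrained("my-finetuned-t5")

    inputs = tokenizer("Loved the tacos, hated the salsa.", return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64)
    raw = tokenizer.decode(output_ids[0], skip_special_tokens=True)

    try:
        pairs = ast.literal_eval(raw)   # e.g. [("tacos", "good"), ("salsa", "bad")]
    except (ValueError, SyntaxError):
        pairs = []                      # the model can emit something unparseable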
Ah, so if the model is just converting input text into output text, it can really learn how to do just about anything? But there may be certain aspects of model design that make it better at some types of conversions ("tasks") than others? And there may be certain datasets you'd want to train a base model on first, to get general language comprehension, and then build on top of that for your specific use case?
Yeah, I can see that being the case for specialized domains. With state-of-the-art models widely available to the public, your edge will probably come from knowledge of the domain and its workflows, and from fine-tuning models to suit that domain.
Yours is an example of aspect-based sentiment analysis. Typically it has been tackled in two steps: first extract the aspects, then classify them as positive/negative. GPT or T5 are possible options for doing both in one go, but splitting the task still seems to be a good option [1].
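A hand-wavy version of the two-step pipeline, just to make the shape concrete (this is not what [1] does; spaCy noun chunks are a crude stand-in for a trained aspect extractor, and the off-the-shelf sentiment pipeline is an assumption):

    # Step 1: pull candidate aspects out of the review.
    # Step 2: score each aspect by pairing it with its sentence.
    import spacy
    from transformers import pipeline

    nlp = spacy.load("en_core_web_sm")
    sentiment = pipeline("sentiment-analysis")

    review = "The tacos and margaritas were great but the salsa was bland."
    doc = nlp(review)

    results = []
    for chunk in doc.noun_chunks:                                  # step 1
        label = sentiment(f"{chunk.text}: {chunk.sent.text}")[0]["label"]  # step 2
        results.append((chunk.text, label.lower()))

    print(results)  # e.g. [("the tacos", "positive"), ("the salsa", "negative"), ...]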