spaCy's implementation, assuming it's roughly equivalent to the one syllog1sm blogged about, just does a greedy incremental parse, so it only produces one candidate parse.
It is possible to do incremental dependency parsing with a beam, but all the copying of beam "states" is expensive, and there's no guarantee that the n complete parses in the beam are really the n best parses w.r.t. the model.
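To make the state-copying concrete, a beam over transition-parser states looks roughly like the sketch below. This is not spaCy's code; score_actions, apply_action and is_final stand in for whatever transition system and model you have. The point is that every extension copies the state, and keeping the top-k partial parses at each step doesn't guarantee the k best complete parses under the model.

    import copy
    import heapq

    def beam_parse(initial_state, score_actions, apply_action, beam_width=8):
        beam = [(0.0, initial_state)]                  # (model score, parser state)
        while any(not state.is_final() for _, state in beam):
            candidates = []
            for score, state in beam:
                if state.is_final():
                    candidates.append((score, state))
                    continue
                for action, action_score in score_actions(state):
                    new_state = copy.deepcopy(state)   # the copying in question
                    apply_action(new_state, action)
                    candidates.append((score + action_score, new_state))
            # prune to the k highest-scoring partial parses
            beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
        return beam                                    # highest-scoring survivors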
Yes, I do greedy parsing. There are many advantages to this, which I'll write about before too long. Fundamentally, it's "the way forward". As models improve in accuracy, search matters less and less.
By the way, the beam isn't slow because of the copying. That's really not so important. What matters is simply that you're evaluating, say, 8 times as many decisions. This makes the parser 6-7x slower (you can claw a little back with memoisation).
In that case, I wonder whether it can output a probability score for each tag at each position, like pycrfsuite does? Then the output could be ensembled with other taggers, or the confidence information could be passed downstream.
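For concreteness, this is the sort of per-position confidence I mean, as pycrfsuite exposes it (a sketch; 'model.crfsuite' and the toy features are placeholders, and the features must match whatever the model was trained with):

    import pycrfsuite

    tagger = pycrfsuite.Tagger()
    tagger.open('model.crfsuite')   # hypothetical path to a trained CRF model

    # per-token feature dicts, using the same feature template as training
    xseq = [{'word': 'Profits'}, {'word': 'soared'}, {'word': 'today'}]

    labels = tagger.tag(xseq)
    for i, best in enumerate(labels):
        # marginal probability of every candidate tag at position i
        dist = {y: tagger.marginal(y, i) for y in tagger.labels()}
        print(i, best, dist[best])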
Also, maybe a dumb question: is there any library or best-practice method for ensembling taggers / chunkers, or do I have to build it myself from scratch?
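The from-scratch version I have in mind is nothing fancier than a per-token majority vote over the taggers' outputs, something like this (a sketch; the input sequences are assumed to be aligned, one tag per token):

    from collections import Counter

    def vote(tag_sequences):
        """Combine aligned tag sequences from several taggers by majority vote."""
        return [Counter(tags).most_common(1)[0][0]
                for tags in zip(*tag_sequences)]

    # e.g. vote([spacy_tags, crf_tags, hmm_tags]) -> one tag per token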