A Unified Generative Framework for Various NER Subtasks

This paper adapts the text-to-text format from Google’s T5 to the NER task and reports results that are somewhat competitive with biaffine-ner. The authors use BART-Large to generate the indices and tags of the named entities in the input sentence.


  • T5 should be credited more prominently as the source of this text-to-text formulation and as the inspiration for the work.
  • Is there a better way to interpolate the pointer/tag distribution with the token distribution learnt by the pre-trained decoder?
  • Using seq2seq for NER seems like overkill, as it is very slow according to the authors’ own appendix.
  • Regarding “the sentencepiece tokenization used in T5 will cause different tokenizations for the same token, making it hard to generate pointer indexes to
    conduct the entity extraction”, I am not sure why this is a problem. You can still feed the pre-tokenized tokens into SentencePiece and get a deterministic tokenization for each token.
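
The last point can be illustrated with a minimal sketch. It assumes the input is already split into words and that each word is encoded on its own, so the same word always yields the same pieces and pointer indices can be mapped back to words. The function `toy_subword_tokenize` is a hypothetical stand-in for a real subword encoder (it is not the SentencePiece API), and `tokenize_pretokenized` is an illustrative helper, not something taken from the paper.

```python
from typing import Callable, List, Tuple


def toy_subword_tokenize(word: str) -> List[str]:
    """Hypothetical stand-in for a subword encoder.

    Splits a word into chunks of at most 3 characters and marks the
    first chunk with the word-boundary symbol, mimicking the general
    shape of subword output. It is NOT the SentencePiece algorithm.
    """
    if not word:
        return ["\u2581"]
    pieces = [word[i:i + 3] for i in range(0, len(word), 3)]
    pieces[0] = "\u2581" + pieces[0]
    return pieces


def tokenize_pretokenized(
    words: List[str],
    encode_word: Callable[[str], List[str]],
) -> Tuple[List[str], List[Tuple[int, int]]]:
    """Encode each word independently and record its piece span.

    Returns the flat piece sequence and, for every word, the half-open
    range [start, end) of piece positions that belong to it.
    """
    pieces: List[str] = []
    spans: List[Tuple[int, int]] = []
    for word in words:
        start = len(pieces)
        pieces.extend(encode_word(word))
        spans.append((start, len(pieces)))
    return pieces, spans


words = ["Barack", "Obama", "visited", "Paris"]
pieces, spans = tokenize_pretokenized(words, toy_subword_tokenize)

# An entity covering words 0..1 maps to a contiguous piece range.
entity_piece_span = (spans[0][0], spans[1][1])
```

Because every word is encoded in isolation, its pieces do not depend on the surrounding context, and the `spans` table gives a fixed mapping between word-level entity boundaries and piece-level pointer indices.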
