Principled Paraphrase Generation with Parallel Corpora

This paper argues that round-trip MT is a flawed way to produce paraphrase training data, because it overweights paraphrases that have ambiguous translations. In response, the authors propose using the information bottleneck method to remove as much information from the source text as possible while retaining the information needed to predict its translation. In this way, the distributions of the translation given the source and given the paraphrase match everywhere, which resolves the overweighting problem.
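
As a rough sketch of the argument above, using standard textbook formulations rather than the paper's exact notation (the symbols $x$, $x_p$, $y$, $T$, and $\beta$ are assumptions introduced here): round-trip MT scores a paraphrase $x_p$ of a source $x$ by marginalizing over pivot translations $y$, while the generic information bottleneck objective trades compression of the source against prediction of its translation.

\[
P(x_p \mid x) = \sum_{y} P(y \mid x)\, P(x_p \mid y)
\]

\[
\min_{p(t \mid x)} \; I(X; T) - \beta\, I(T; Y)
\]

Here $T$ is the learned representation of the source $X$, $Y$ is its translation, and $\beta > 0$ controls how much translation-relevant information is retained.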



  • This paper demonstrates in detail how an idea develops from intuition, to mathematical formulation, to implementation.
  • The math is solid and the implementation is practical.
  • It should be considered for a best paper award.
