PAWS: Paraphrase Adversaries from Word Scrambling

1114 · May 23, 2021, 6:34pm

Existing paraphrase datasets are too easy because they lack sentence pairs that have high lexical overlap without being paraphrases. This paper presents a methodology for creating high-lexical-overlap pairs through constrained sequence generation and back translation.

Specifically, a language model is used to generate a sentence with the same bag-of-words (BOW) vector as the original sentence. The generation is constrained by part-of-speech and NER tags to reduce the search space. This procedure yields paraphrases and non-paraphrases at a 1:4 ratio (as judged by humans). The second procedure back-translates a sentence with beam size 5 and aggressively filters out easy pairs using the word-order inversion rate and BOW similarity; it yields mostly paraphrases instead.
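To make the quantities above concrete, here is a minimal sketch of the three measures mentioned: BOW equality, BOW similarity, and word-order inversion rate. The function names, the whitespace tokenization, the use of cosine similarity, and the pairwise definition of inversion rate are all my own assumptions for illustration; the paper's exact definitions and thresholds may differ.

```python
from collections import Counter
from itertools import combinations
import math


def bow(sentence):
    """Bag-of-words vector as a token -> count mapping.
    Assumption: lowercased whitespace tokenization."""
    return Counter(sentence.lower().split())


def same_bow(a, b):
    """True if both sentences use exactly the same words with the same counts."""
    return bow(a) == bow(b)


def bow_cosine(a, b):
    """Cosine similarity between the two BOW vectors (an assumed similarity measure)."""
    va, vb = bow(a), bow(b)
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values()) * sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def inversion_rate(a, b):
    """Fraction of shared-word pairs whose relative order differs between a and b.
    Assumption: only words occurring exactly once in each sentence are compared,
    which keeps the word alignment unambiguous."""
    ta, tb = a.lower().split(), b.lower().split()
    ca, cb = Counter(ta), Counter(tb)
    shared = [t for t in ta if ca[t] == 1 and cb[t] == 1]  # in the order of a
    pos_b = {t: i for i, t in enumerate(tb)}
    pairs = list(combinations(shared, 2))  # x precedes y in a
    if not pairs:
        return 0.0
    inverted = sum(1 for x, y in pairs if pos_b[x] > pos_b[y])
    return inverted / len(pairs)


src = "flights from new york to florida"
swapped = "flights from florida to new york"
print(same_bow(src, swapped), bow_cosine(src, swapped), inversion_rate(src, swapped))
```

The example pair has identical BOW vectors but a different meaning, which is the kind of pair the word-swapping procedure targets. For back-translated pairs, a filter along these lines would drop pairs whose BOW similarity or inversion rate is too low to be challenging.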

These two procedures can generate negative pairs, which are then labeled by humans. Sentences that are not well-formed are corrected as well.

The final step balances the labels in this human-labeled dataset by applying a set of rules to each sentence and its labeled counterparts from the two procedures above.

Comments

Their methodology is very sound; each detail is carefully considered.
Their experiments on DIIN are very interesting. A small DIIN model performs on par with BERT.

Rating

5: Transformative: This paper is likely to change our field. It should be considered for a best paper award.
4.5: Exciting: It changed my thinking on this topic. I would fight for it to be accepted.
4: Strong: I learned a lot from it. I would like to see it accepted.
3.5: Leaning positive: It can be accepted more or less in its current form. However, the work it describes is not particularly exciting and/or inspiring, so it will not be a big loss if people don’t see it in this conference.
3: Ambivalent: It has merits (e.g., it reports state-of-the-art results, the idea is nice), but there are key weaknesses (e.g., I didn’t learn much from it, evaluation is not convincing, it describes incremental work). I believe it can significantly benefit from another round of revision, but I won’t object to accepting it if my co-reviewers are willing to champion it.
2.5: Leaning negative: I am leaning towards rejection, but I can be persuaded if my co-reviewers think otherwise.
2: Mediocre: I would rather not see it in the conference.
1.5: Weak: I am pretty confident that it should be rejected.
1: Poor: I would fight to have it rejected.

0 voters