Stolen Probability: A Structural Weakness of Neural Language Models

1114 · May 27, 2021, 4:25pm

NNLMs produce a distribution over the next word by applying a softmax to the dot products of a context vector with every word vector in a high-dimensional embedding space. This paper shows theoretically that the norm of a word vector lying in the interior of the convex hull of the embeddings puts an upper bound on its softmax probability, meaning that some words can never be predicted with probability close to 1, even when the context provides very strong clues. Empirical experiments on some small NNLMs confirm the finding and show that infrequent words tend to be placed inside the convex hull.
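To make the bound concrete, here is a minimal sketch of my own (not code from the paper). It assumes a made-up 2-D embedding table with four words on the convex hull and one word strictly inside it, then sweeps context vectors over many directions and magnitudes. The names `EMBEDDINGS`, `INTERIOR` and `sweep` are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D array of logits.
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical toy embeddings: rows 0-3 are the vertices of the convex hull,
# row 4 lies strictly inside it.
EMBEDDINGS = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [-1.0, 0.0],
    [0.0, -1.0],
    [0.2, 0.1],
])
INTERIOR = 4

def sweep(embeddings, n_dirs=360, scales=(0.1, 1.0, 10.0, 100.0)):
    """Return each word's max probability and how often it was the argmax."""
    n_words = embeddings.shape[0]
    max_prob = np.zeros(n_words)
    argmax_count = np.zeros(n_words, dtype=int)
    for theta in np.linspace(0.0, 2.0 * np.pi, n_dirs, endpoint=False):
        direction = np.array([np.cos(theta), np.sin(theta)])
        for scale in scales:
            # Logits are dot products of the context vector with every word vector.
            probs = softmax(embeddings @ (scale * direction))
            max_prob = np.maximum(max_prob, probs)
            argmax_count[probs.argmax()] += 1
    return max_prob, argmax_count

max_prob, argmax_count = sweep(EMBEDDINGS)
print("max probability per word:", np.round(max_prob, 3))
print("times each word was the top prediction:", argmax_count)
```

In this toy setup the interior word is never the top prediction, while each hull word gets arbitrarily close to probability 1 once the context vector is long enough and points its way. The sweep only samples contexts, so it illustrates the claim rather than proving it.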

Comments

Very good paper with both theoretical and experimental results.
I wonder whether large models such as Transformers fare any better. Perhaps masked LMs arrange the embedding space so that more words lie on the convex hull?

Rating

5: Transformative: This paper is likely to change our field. It should be considered for a best paper award.
4.5: Exciting: It changed my thinking on this topic. I would fight for it to be accepted.
4: Strong: I learned a lot from it. I would like to see it accepted.
3.5: Leaning positive: It can be accepted more or less in its current form. However, the work it describes is not particularly exciting and/or inspiring, so it will not be a big loss if people don’t see it in this conference.
3: Ambivalent: It has merits (e.g., it reports state-of-the-art results, the idea is nice), but there are key weaknesses (e.g., I didn’t learn much from it, evaluation is not convincing, it describes incremental work). I believe it can significantly benefit from another round of revision, but I won’t object to accepting it if my co-reviewers are willing to champion it.
2.5: Leaning negative: I am leaning towards rejection, but I can be persuaded if my co-reviewers think otherwise.
2: Mediocre: I would rather not see it in the conference.
1.5: Weak: I am pretty confident that it should be rejected.
1: Poor: I would fight to have it rejected.

0 voters