# NAT: Noise-Aware Training for Robust Neural Sequence Labeling

Taggers are expected to perform reliably not only on clean text but also on noisy real-world text. This paper proposes two training strategies that improve the robustness of popular sequence labeling models while preserving accuracy on the original input.

### Data Augmentation Method

The first method treats noisy data as a form of data augmentation: the model is trained on perturbed sentences together with the clean text.

\begin{align*}
\mathcal{L}_{augm}(x,\tilde{x},y;\theta) &= \mathcal{L}_0(x,y;\theta) + \alpha\mathcal{L}_0(\tilde{x},y;\theta),
\end{align*}

where $\mathcal{L}_0$ is the standard training objective, $\tilde{x}$ is the perturbed sentence, and $\alpha$ is the weight of the noisy loss component.
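As a minimal sketch of how the two terms combine, the snippet below computes the augmentation objective from per-token tag distributions. It assumes that $\mathcal{L}_0$ is a mean token-level cross-entropy and that the perturbation keeps the tokens aligned with the gold labels. The function names and toy numbers are illustrative and do not come from the paper.

```python
import math

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the gold tags (assumed stand-in for L_0)."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def augmentation_loss(probs_clean, probs_noisy, labels, alpha=1.0):
    """L_augm = L_0(x, y) + alpha * L_0(x_tilde, y)."""
    clean_term = cross_entropy(probs_clean, labels)
    noisy_term = cross_entropy(probs_noisy, labels)
    return clean_term + alpha * noisy_term

# Toy tag distributions for a 2-token sentence with 2 possible tags.
probs_clean = [[0.9, 0.1], [0.2, 0.8]]  # model output on the clean sentence x
probs_noisy = [[0.6, 0.4], [0.5, 0.5]]  # model output on the perturbed sentence
labels = [0, 1]                          # gold tags, shared by both inputs

loss = augmentation_loss(probs_clean, probs_noisy, labels, alpha=0.5)
print(f"augmentation loss: {loss:.4f}")
```

Setting `alpha=0.0` recovers the clean-only objective, which makes the role of the weight easy to check.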

### Stability Training Method

\begin{align*}
\mathcal{L}_{stabil}(x,\tilde{x},y;\theta) &= \mathcal{L}_0(x,y;\theta) + \alpha\mathcal{L}_{sim}(x,\tilde{x};\theta), \\
\mathcal{L}_{sim}(x,\tilde{x};\theta) &= \mathcal{D}\big(y(x), y(\tilde{x})\big),
\end{align*}

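where $\mathcal{L}_{sim}$ encourages the model to produce similar outputs for $x$ and $\tilde{x}$, $\mathcal{D}$ is a task-specific feature distance measure (usually the KL divergence $\mathcal{D}_{KL}$), and $\alpha$ balances the strength of the similarity objective. Unlike the augmentation objective, the noisy input contributes only through the similarity term, which does not use the labels $y$.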
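The stability objective can be sketched in the same way. This is an illustration under stated assumptions rather than the authors' implementation: $\mathcal{L}_0$ is taken to be a mean token-level cross-entropy, $\mathcal{D}$ is the KL divergence from the clean to the noisy output distribution averaged over tokens, and all function names and numbers are hypothetical.

```python
import math

def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the gold tags (assumed stand-in for L_0)."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; q is assumed strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def similarity_loss(probs_clean, probs_noisy):
    """L_sim: mean per-token KL between outputs on x and on x_tilde."""
    pairs = zip(probs_clean, probs_noisy)
    return sum(kl_divergence(p, q) for p, q in pairs) / len(probs_clean)

def stability_loss(probs_clean, probs_noisy, labels, alpha=1.0):
    """L_stabil = L_0(x, y) + alpha * L_sim(x, x_tilde); labels enter only via L_0."""
    return cross_entropy(probs_clean, labels) + alpha * similarity_loss(probs_clean, probs_noisy)

# Toy tag distributions for a 2-token sentence with 2 possible tags.
probs_clean = [[0.9, 0.1], [0.2, 0.8]]
probs_noisy = [[0.6, 0.4], [0.5, 0.5]]
labels = [0, 1]

loss = stability_loss(probs_clean, probs_noisy, labels, alpha=1.0)
print(f"stability loss: {loss:.4f}")
```

When the model produces identical distributions for the clean and perturbed inputs, the similarity term vanishes and the loss reduces to the clean objective.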

### Results

Both approaches, and the stability method in particular, achieved significant error reduction across all perturbation levels and all entity types.

• The motivation is very clear and practical.
• The idea is simple, but it works surprisingly well.
