Language Model

Topic	Replies	Views	Activity
LIMIT-BERT : Linguistics Informed Multi-Task BERT	0	1083	October 18, 2021
Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese	0	1319	October 18, 2021
What Context Features Can Transformer Language Models Use?	0	1107	September 22, 2021
A question: why do LTP and HanLP both use ELECTRA rather than other pre-trained models?	2	2278	September 11, 2021
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation	0	1382	September 3, 2021
Are Pretrained Convolutions Better than Pretrained Transformers?	0	992	August 18, 2021
ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for...	0	1786	August 7, 2021
Multi-view Subword Regularization	0	1054	July 19, 2021
Enabling Language Models to Fill in the Blanks	0	886	June 30, 2021
Stolen Probability: A Structural Weakness of Neural Language Models	0	1096	May 27, 2021
Exploring the Limits of Transfer Learning with a Unified Text-to-Text...	0	1582	May 14, 2021
Spelling Error Correction with Soft-Masked BERT	0	2332	May 9, 2021
Structured Pruning of Large Language Models	0	1217	February 25, 2021
Losing Heads in the Lottery: Pruning Transformer Attention in Neural Machine...	0	1055	February 23, 2021
A Mixture of h - 1 Heads is Better than h Heads	0	992	February 20, 2021
FastBERT: a Self-distilling BERT with Adaptive Inference Time	0	2905	February 19, 2021
How does BERT’s attention change when you fine-tune? An analysis methodology...	0	1294	February 19, 2021
What Does BERT Learn about the Structure of Language?	0	3540	January 25, 2021
How can I train my own model on top of an existing model?	2	1944	December 11, 2020
BPE-Dropout: Simple and Effective Subword Regularization	0	3479	October 17, 2020
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language...	0	2495	September 21, 2020
I reproduced FastBert from ACL 2020; feel free to try it and give feedback	1	1355	August 17, 2020
