Language Model

Topic	Replies	Views	Activity
LIMIT-BERT : Linguistics Informed Multi-Task BERT	0	770	October 18, 2021
Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese	0	926	October 18, 2021
What Context Features Can Transformer Language Models Use?	0	852	September 22, 2021
A question: why do LTP and HanLP both use ELECTRA rather than other pre-trained models?	2	1945	September 11, 2021
KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation	0	1122	September 3, 2021
Are Pretrained Convolutions Better than Pretrained Transformers?	0	799	August 18, 2021
ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for...	0	1411	August 7, 2021
Multi-view Subword Regularization	0	781	July 19, 2021
Enabling Language Models to Fill in the Blanks	0	690	June 30, 2021
Stolen Probability: A Structural Weakness of Neural Language Models	0	865	May 27, 2021
Exploring the Limits of Transfer Learning with a Unified Text-to-Text...	0	1050	May 14, 2021
Spelling Error Correction with Soft-Masked BERT	0	1531	May 9, 2021
Structured Pruning of Large Language Models	0	927	February 25, 2021
Losing Heads in the Lottery: Pruning Transformer Attention in Neural Machine...	0	774	February 23, 2021
A Mixture of h - 1 Heads is Better than h Heads	0	714	February 20, 2021
FastBERT: a Self-distilling BERT with Adaptive Inference Time	0	1556	February 19, 2021
How does BERT’s attention change when you fine-tune? An analysis methodology...	0	1064	February 19, 2021
What Does BERT Learn about the Structure of Language?	0	1516	January 25, 2021
How can I train my own model on top of an existing model?	2	1649	December 11, 2020
BPE-Dropout: Simple and Effective Subword Regularization	0	1888	October 17, 2020
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language...	0	1455	September 21, 2020
Reproduced FastBert from ACL 2020; trial use and feedback welcome	1	1123	August 17, 2020
