| Topic | Replies | Views | Activity |
| --- | --- | --- | --- |
| About Language Model | 0 | 261 | August 28, 2020 |
| SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis | 1 | 67 | February 17, 2022 |
| Phrase-BERT: Improved Phrase Embeddings from BERT with an Application to... | 0 | 54 | January 24, 2022 |
| Finetuning Pretrained Transformers into RNNs | 0 | 52 | January 16, 2022 |
| Block Pruning For Faster Transformers | 0 | 59 | January 16, 2022 |
| What’s in Your Head? Emergent Behaviour in Multi-Task Transformer Models | 0 | 83 | December 30, 2021 |
| AdapterDrop: On the Efficiency of Adapters in Transformers | 0 | 80 | December 30, 2021 |
| Frustratingly Simple Pretraining Alternatives to Masked Language Modeling | 0 | 129 | November 30, 2021 |
| Condenser: a Pre-training Architecture for Dense Retrieval | 0 | 137 | November 21, 2021 |
| How to Train BERT with an Academic Budget | 0 | 103 | November 17, 2021 |
| The Power of Scale for Parameter-Efficient Prompt Tuning | 0 | 72 | November 15, 2021 |
| Constrained Language Models Yield Few-Shot Semantic Parsers | 0 | 112 | November 15, 2021 |
| ConvFiT: Conversational Fine-Tuning of Pretrained Language Models | 0 | 101 | November 10, 2021 |
| #EMNLP21# The Stem Cell Hypothesis: neural networks also have stem cells, yet can hardly become all-rounders | 0 | 177 | November 6, 2021 |
| Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting | 0 | 103 | November 6, 2021 |
| Lower Perplexity is Not Always Human-Like | 0 | 119 | November 5, 2021 |
| Bird’s Eye: Probing for Linguistic Graph Structures with a Simple Information-Theoretic Approach | 0 | 88 | November 2, 2021 |
| When Do You Need Billions of Words of Pretraining Data? | 0 | 132 | October 26, 2021 |
| CPT: A Pre-Trained Unbalanced Transformer for Both Chinese Language... | 1 | 174 | October 18, 2021 |
| LIMIT-BERT: Linguistics Informed Multi-Task BERT | 0 | 122 | October 18, 2021 |
| Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese | 0 | 144 | October 18, 2021 |
| What Context Features Can Transformer Language Models Use? | 0 | 149 | September 22, 2021 |
| A question: why do ltp and hanlp both use electra rather than other pretrained models? | 2 | 351 | September 11, 2021 |
| KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation | 0 | 151 | September 3, 2021 |
| Are Pretrained Convolutions Better than Pretrained Transformers? | 0 | 138 | August 18, 2021 |
| ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for... | 0 | 250 | August 7, 2021 |
| Multi-view Subword Regularization | 0 | 154 | July 19, 2021 |
| Enabling Language Models to Fill in the Blanks | 0 | 149 | June 30, 2021 |
| Stolen Probability: A Structural Weakness of Neural Language Models | 0 | 206 | May 27, 2021 |
| Exploring the Limits of Transfer Learning with a Unified Text-to-Text... | 0 | 176 | May 14, 2021 |