Finetune NER tf_model: load_vocab() is not used

When I use TransformerNamedEntityRecognizerTF from ner_tf to fine-tune the MSRA_NER_BERT_BASE_ZH model, the vocabs.json produced after training differs from the one shipped with the pretrained model.
Looking at the source, in keras_component.py the if finetune: branch only calls load_weights and never load_vocabs, while num_examples = self.build_vocab() earlier in the code has already rebuilt the vocabulary. The relevant flow is sketched below.
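(The original screenshot is not preserved; this is a rough paraphrase of the fragment inside the training method as I read it, not the verbatim HanLP source, and the argument names are my assumptions.)

```python
# Paraphrased flow from keras_component.py (simplified):
num_examples = self.build_vocab(trn_data, logger)  # vocabs are rebuilt from the new corpus

# ... the model is then built with the freshly built vocabs ...

if finetune:
    self.load_weights(finetune)  # only the weights are restored;
                                 # there is no self.load_vocabs(...) call here,
                                 # so vocabs.json ends up different from the pretrained model
```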

By contrast, the finetune branch in torch_component.py calls load(), and load() in turn calls load_vocabs(), which is why the PyTorch side behaves correctly.
The flow is sketched below.
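(Again a rough paraphrase rather than the verbatim source; argument names are assumed.)

```python
# Paraphrased flow from torch_component.py (simplified):
if finetune:
    self.load(finetune)  # load() restores the saved config, vocabs and weights;
                         # it calls load_vocabs() internally, so the tag vocabulary
                         # stays aligned with the pretrained checkpoint
```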

It was designed this way to incorporate new tags introduced in your corpus. Otherwise you will need to implement weights loading for the classifier head, like this:
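(The snippet originally posted here is not preserved. Below is a minimal sketch of the general idea, assuming a Keras-style Dense classifier head whose kernel has one column per tag; the helper name and shapes are hypothetical and not part of the HanLP API.)

```python
import numpy as np

def transfer_classifier_head(old_kernel, old_bias, old_tags, new_tags, new_kernel, new_bias):
    """Copy classifier-head weights for tags shared by the old and new tag vocabularies.

    old_kernel: (hidden, n_old), old_bias: (n_old,)  -- from the pretrained model
    new_kernel: (hidden, n_new), new_bias: (n_new,)  -- freshly initialized for the new tag set
    Columns for tags that only exist in the new corpus keep their fresh initialization.
    """
    old_index = {tag: i for i, tag in enumerate(old_tags)}
    for j, tag in enumerate(new_tags):
        i = old_index.get(tag)
        if i is not None:
            new_kernel[:, j] = old_kernel[:, i]
            new_bias[j] = old_bias[i]
    return new_kernel, new_bias
```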

Thanks for your answer! But incorporating my new tags this way changes the order of all tags compared to before fine-tuning, and that may cause all of the model parameters to be updated substantially, not just the classifier head that is retrained anyway.

Yes, you're right. The classifier head will be trained from scratch, but that might not be a critical issue, because the number of parameters in the classifier head is very small compared to the transformer encoder.

Ideally, you can implement the resize trick I mentioned before. Since I'm not actively maintaining the TensorFlow versions, PRs are more than welcome.
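(For reference, one hypothetical way to apply such a resize with plain Keras calls, reusing the sketch above; the layer name 'dense' and the model/tag variables are assumptions, not the actual HanLP code.)

```python
# Hypothetical usage: copy overlapping tag weights from the old head into the resized one.
old_kernel, old_bias = old_model.get_layer('dense').get_weights()
new_dense = new_model.get_layer('dense')
new_kernel, new_bias = new_dense.get_weights()
new_dense.set_weights(list(transfer_classifier_head(old_kernel, old_bias,
                                                    old_tags, new_tags,
                                                    new_kernel, new_bias)))
```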


Ok, thank you