HanLP model licensing question



My apologies, I don’t speak Chinese, so hopefully someone will be able to answer my question. I understand that HanLP is released under the Apache licence. My question is: does the licence cover only the Python package, or also all of the models (tokeniser and tagger) that I can download?

For example:
import hanlp
tokenizer = hanlp.load('PKU_NAME_MERGED_SIX_MONTHS_CONVSEG')
tagger = hanlp.load(hanlp.pretrained.pos.CTB5_POS_RNN_FASTTEXT_ZH)


Hi Roger,

Thank you for asking. It’s an open question: I’m not sure whether a model trained on a corpus must inherit the licence of that corpus. Stanford University is in exactly the same situation as us. They say:

The copyright and licensing status of machine learning models is not very clear (to us). We list in the table below the Treebank License of the underlying data from which each language pack (set of machine learning models for a treebank) was trained. To the extent that The Trustees of Leland Stanford Junior University have ownership and rights over these language packs, all these Stanza language packs are made available under the Open Data Commons Attribution License v1.0.

We have a research licence for the corpora, but that licence doesn’t permit commercial use. It’s safer to assume the Apache licence doesn’t apply to the models.


Thanks for the detailed response. Much appreciated.