Loading the Tencent AI Lab model fails with a file-not-found error

With HanLP 2.0 I call load('TENCENT_AI_LAB_EMBEDDING') directly. The archive downloaded successfully, but I get the error below.
FileNotFoundError: The identifier $HANLP_HOME/thirdparty/ai.tencent.com/ailab/nlp/data/Tencent_AILab_ChineseEmbedding.tar/Tencent_AILab_ChineseEmbedding.txt resolves to a non-exist meta file $HANLP_HOME/thirdparty/ai.tencent.com/ailab/nlp/data/Tencent_AILab_ChineseEmbedding.tar/Tencent_AILab_ChineseEmbedding.txt/meta.json.

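Most third-party models do not follow HanLP's conventions, so they cannot be loaded directly with hanlp.load. You can load them through the class designed for them: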

import hanlp
from hanlp.layers.embeddings.word2vec import Word2VecEmbedding

print(Word2VecEmbedding(filepath=hanlp.pretrained.word2vec.TENCENT_AI_LAB_EMBEDDING).input_dim)
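As a side note on why the original call fails: judging by the error messages quoted in this thread, hanlp.load resolves the identifier to a path and then looks for a meta file inside it (meta.json in one message, config.json in others). A raw embedding file has no such file next to it. The check can be sketched with the standard library alone. The helper name `find_meta_file` and its default file names are assumptions made for illustration; this is not HanLP's own code.

```python
import os


def find_meta_file(save_dir, names=("meta.json", "config.json")):
    """Return the path of the first meta file found under save_dir, or None.

    Hypothetical helper: it only mimics the lookup that the error
    message describes, it does not call into HanLP.
    """
    for name in names:
        candidate = os.path.join(save_dir, name)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Point this at the path printed in your own FileNotFoundError.
    print(find_meta_file(os.getcwd()))
```

If this returns None for the path printed in the FileNotFoundError, the download itself is probably fine and the resource is simply not a model directory that hanlp.load can open.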


Thanks for the answer, it loads now.

Where should the TENCENT_AI_LAB_EMBEDDING file be placed so that it can be loaded?
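The error message at the top of this thread shows the file under `$HANLP_HOME/thirdparty/...`, and the other tracebacks here show roots of `C:\Users\Administrator\AppData\Roaming\hanlp` on Windows and `/home/gavin/.hanlp` on Linux. Based only on those paths, the root directory can be guessed roughly as follows. The function `guess_hanlp_home` is a hypothetical helper and the default locations are an assumption inferred from this thread, so the actual rules in your HanLP version may differ.

```python
import os
import sys


def guess_hanlp_home(environ=None, platform=None):
    """Guess the HanLP data root from the environment.

    Assumption: HANLP_HOME wins if set; otherwise fall back to the
    locations seen in the error messages quoted in this thread.
    """
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    custom = environ.get("HANLP_HOME")
    if custom:
        return custom
    if platform.startswith("win"):
        # e.g. C:\Users\<user>\AppData\Roaming\hanlp
        return os.path.join(environ.get("APPDATA", os.path.expanduser("~")), "hanlp")
    # e.g. /home/<user>/.hanlp
    return os.path.join(os.path.expanduser("~"), ".hanlp")


if __name__ == "__main__":
    print(guess_hanlp_home())
```

A manually downloaded file would then need to sit at the same relative path that the error message prints under that root.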

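I cannot load a pretrained Word2Vec model: the config file is reported as missing. I tried extracting the zip downloaded from the shared network drive, and I also tried deleting everything and rerunning the code, but the same error still appears. What is the problem?
Also, is there a reasonably complete offline documentation package for 2.x? My Python environment cannot use the online API. The code I ran: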
import hanlp

model = hanlp.load(hanlp.pretrained.word2vec.CONVSEG_W2V_NEWS_TENSITE_WORD_PKU)
model([
    '看图猜一电影名',
    '无线路由器怎么无线上网',
    '北京到上海的动车票',
])

Traceback (most recent call last):
  File "E:/Project/python/jxnlp-sdk/test/Word2VecTest.py", line 9, in <module>
    model = hanlp.load(hanlp.pretrained.word2vec.CONVSEG_W2V_NEWS_TENSITE_WORD_PKU)
  File "E:\Project\python\jxnlp-sdk\hanlp\__init__.py", line 43, in load
    return load_from_meta_file(save_dir, 'meta.json', verbose=verbose, **kwargs)
  File "E:\Project\python\jxnlp-sdk\hanlp\utils\component_util.py", line 53, in load_from_meta_file
    raise FileNotFoundError(f'The identifier {save_dir} resolves to a non-exist meta file {metapath}. {tips}')
FileNotFoundError: The identifier C:\Users\Administrator\AppData\Roaming\hanlp\hanlp\embeddings\convseg_embeddings\news_tensite.pku.words.w2v50 resolves to a non-exist meta file C:\Users\Administrator\AppData\Roaming\hanlp\hanlp\embeddings\convseg_embeddings\news_tensite.pku.words.w2v50\config.json.

Hello, I came here from this thread, but I ran into the following two problems (https://bbs.hankcs.com/t/topic/3804)

  • I am using a pretrained model referenced in the source code (hanlp.pretrained.word2vec.SEMEVAL16_EMBEDDINGS_300_TEXT_CN)

  • I commented that out and ran the code given above, but it still fails:
    Traceback (most recent call last):
    File "E:/Project/python/jxnlp-sdk/test/Word2VecTest.py", line 24, in <module>
    print(Word2VecEmbedding(filepath=hanlp.pretrained.word2vec.TENCENT_AI_LAB_EMBEDDING).input_dim)
    TypeError: __init__() got an unexpected keyword argument 'filepath'

I cloned the code from git, so it should be 2.x. I then created an arbitrary local Python file to test it.
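A TypeError about an unexpected keyword argument usually means the installed version's constructor has a different signature from the one the snippet was written for. One way to check this without reading the source is the standard `inspect` module. The sketch below uses two hypothetical stand-in classes (`OldStyleEmbedding`, `NewStyleEmbedding`) whose parameters are made up for the demonstration and do not reflect HanLP's actual signatures.

```python
import inspect


def accepts_keyword(cls, name):
    """Return True if cls.__init__ can take `name` as a keyword argument."""
    params = inspect.signature(cls.__init__).parameters
    if name in params and params[name].kind is not inspect.Parameter.POSITIONAL_ONLY:
        return True
    # A **kwargs catch-all accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class OldStyleEmbedding:  # hypothetical stand-in, not a HanLP class
    def __init__(self, filepath, trainable=False):
        self.filepath = filepath


class NewStyleEmbedding:  # hypothetical stand-in, not a HanLP class
    def __init__(self, source, trainable=False):
        self.source = source


if __name__ == "__main__":
    print(accepts_keyword(OldStyleEmbedding, "filepath"))
    print(accepts_keyword(NewStyleEmbedding, "filepath"))
```

With HanLP installed, the same helper could be pointed at the imported Word2VecEmbedding class, or `help(Word2VecEmbedding)` can be used to read the parameter list of the version you actually have.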

In 2.1 it was renamed to:

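Modules such as word2vec and FastText are designed as embedding layers for other networks. They do not offer end-user features as weak as computing sentence vectors.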

import hanlp
Hanlp = hanlp.load(hanlp.pretrained.word2vec.RADICAL_CHAR_EMBEDDING_100)

FileNotFoundError: The identifier /home/gavin/.hanlp/embeddings/radical_char_vec_20191229_013849/character.vec.txt resolves to a non-exist meta file /home/gavin/.hanlp/embeddings/radical_char_vec_20191229_013849/character.vec.txt/config.json.

I checked, and the file has already been downloaded; there was no network problem.
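The path in this last error ends in `character.vec.txt`, which is a plain vector file rather than a model directory, so the message is consistent with a successful download: hanlp.load just cannot find a config.json inside it. To confirm the file itself is intact, its dimensionality can be read directly. This is a minimal sketch that assumes the common word2vec text layout (an optional "vocab_size dim" header, then one token followed by whitespace-separated floats per line); `peek_embedding_dim` is a hypothetical helper, not a HanLP function.

```python
def peek_embedding_dim(path, encoding="utf-8"):
    """Infer the vector size from the first data line of a text embedding file.

    Returns None if the file contains no data lines.
    """
    with open(path, encoding=encoding) as f:
        for line_no, line in enumerate(f):
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            # An optional first line of two integers is the "vocab_size dim" header.
            if line_no == 0 and len(parts) == 2 and all(p.isdigit() for p in parts):
                continue
            # One token followed by the vector components.
            return len(parts) - 1
    return None
```

For example, `peek_embedding_dim("/home/gavin/.hanlp/embeddings/radical_char_vec_20191229_013849/character.vec.txt")` would report the vector size if the file follows that layout.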