Environment:
1. OS: Windows 11
2. Python: 3.9.10
1. Installation
pip install bert-score
2. Example Code
This example was generated by ChatGPT; both the reference sentences and the candidate (generated) sentences are passed in as lists.
from bert_score import score

# define the reference sentences and the candidate sentences
refs = ["the cat sat on the mat.", "it was raining outside."]
cands = ["the cat sat on the mat.", "it was pouring outside."]

# compute BERTScore (verbose must be the Python boolean True, not "true")
p, r, f1 = score(cands, refs, lang='en', model_type="roberta-large", verbose=True)

# print the results (each is a tensor with one value per candidate)
print("Precision:", p)
print("Recall:", r)
print("F1 score:", f1)
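`score()` returns three torch tensors (precision, recall, F1), one entry per candidate sentence. To report a single corpus-level number, average the F1 tensor with `f1.mean().item()`. A stand-in sketch of the same aggregation with plain Python floats (the values are illustrative, not real BERTScore outputs):

```python
# stand-in per-candidate F1 values; with bert_score you would instead
# call f1.mean().item() on the returned tensor
f1_values = [1.0, 0.92]

# corpus-level F1 is the mean over candidates
avg_f1 = sum(f1_values) / len(f1_values)
print(f"avg F1: {avg_f1:.2f}")  # prints "avg F1: 0.96"
```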
3. Runtime Error
OSError: We couldn't connect to 'https://huggingface.co' to load this file, couldn't find it in the cached files and it looks like roberta-large is not the path to a directory containing a file named config.json.
Checkout your internet connection or see how to run the library in offline mode at 'https://huggingface.co/docs/transformers/installation#offline-mode'.
4. Solution (if huggingface.co is reachable)
Because the code above first checks whether the roberta-large model files exist locally, you can pre-download them into the local .cache directory with the following code.
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# downloading once caches the files under ~/.cache/huggingface
bert_tokenizer = AutoTokenizer.from_pretrained('roberta-large')
bert_model = AutoModelForSequenceClassification.from_pretrained('roberta-large')
5. Solution (if huggingface.co is blocked)
Because the code above first checks whether the roberta-large model files exist locally, you can instead download the model files from a domestic Hugging Face mirror site and place them in the correct location.
Directory structure of the local cache when the roberta-large model files are absent:
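Alternatively, recent versions of huggingface_hub honor the `HF_ENDPOINT` environment variable, so you can point downloads at a mirror without editing any code. A sketch for Windows cmd (hf-mirror.com is one widely used mirror; this is an assumption — substitute whichever mirror you trust, and `your_script.py` stands for your own script):

```shell
:: point huggingface_hub / transformers at a mirror for this session
set HF_ENDPOINT=https://hf-mirror.com
python your_script.py
```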
c:\users\user\.cache\huggingface> tree
c:.
├─datasets
├─hub
├─metrics
│ └─rouge
│ └─default
└─modules
Directory structure of the local cache once the roberta-large model files are present (722…1d59 is a SHA hash identifying the snapshot):
c:\users\user\.cache\huggingface> tree
c:.
├─datasets
├─hub
│ ├─.locks
│ │ └─models--roberta-large
│ └─models--roberta-large
│ ├─blobs
│ ├─refs
│ │ └─main
│ └─snapshots
│ └─722cf37b1afa9454edce342e7895e588b6ff1d59
│ ├─config.json
│ ├─merges.txt
│ ├─pytorch_model.bin
│ ├─tokenizer.json
│ ├─tokenizer_config.json
│ └─vocab.json
├─metrics
│ └─rouge
│ └─default
└─modules
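After placing the downloaded files, you can check the expected layout from Python before rerunning bert-score. A minimal sketch (paths assume the default cache location shown in the tree above):

```python
from pathlib import Path

# default Hugging Face hub cache, matching the tree listing above
hub_cache = Path.home() / ".cache" / "huggingface" / "hub"
model_dir = hub_cache / "models--roberta-large"

# the actual model files live under snapshots/<commit-hash>/
snapshots = model_dir / "snapshots"
if snapshots.is_dir():
    for snap in snapshots.iterdir():
        print(snap.name, sorted(p.name for p in snap.iterdir()))
else:
    print("roberta-large is not cached yet:", model_dir)
```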
6. Test Code
from transformers import pipeline

# quick sanity check that transformers can load and run a model
classifier = pipeline("sentiment-analysis")
res = classifier(["we are very happy.", "we are very sad."])
print(res)