I am using huggingface to build a model that can identify mistakes in a given sentence. Say I have a sentence and a corresponding label as follows:
correct_sentence = "we used to play together."
correct_label = [1, 1, 1, 1, 1]
changed_sentence = "we use play to together."
changed_label = [1, 2, 2, 2, 1]
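These labels are then padded with 0s to a fixed length of 512. The sentences are also tokenized and padded (or truncated) to this length. The model is as follows: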
class Camembert(torch.nn.Module):
    """
    The definition of the custom model, last 15 layers of Camembert will be retrained
    and then a fcn to 512 (the size of every label).
    """
    def __init__(self, cam_model):
        super(Camembert, self).__init__()
        self.l1 = cam_model
        total_layers = 199
        for i, param in enumerate(cam_model.parameters()):
            if total_layers - i > hparams["retrain_layers"]:
                param.requires_grad = False
            else:
                pass
        self.l2 = torch.nn.Dropout(hparams["dropout_rate"])
        self.l3 = torch.nn.Linear(768, 512)

    def forward(self, ids, mask):
        _, output = self.l1(ids, attention_mask=mask)
        output = self.l2(output)
        output = self.l3(output)
        return output
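Say batch_size=2. The output layer will therefore have shape (2, 512), which is the same shape as target_label. As far as I understand, this setup amounts to saying there are 512 classes to choose from, which is not what I want. The problem shows up when I try to compute the loss with torch.nn.CrossEntropyLoss(), which gives me the following error (truncated):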
File "D:\Anaconda\lib\site-packages\torch\nn\functional.py", line 1838, in nll_loss
    ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: multi-target not supported at C:/w/1/s/tmp_conda_3.7_100118/conda/conda-bld/pytorch_1579082551706/work/aten/src\THCUNN/generic/ClassNLLCriterion.cu:15
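For reference, the shape mismatch behind this error can be shown in isolation. The snippet below is a minimal sketch, not my actual training code: it assumes three classes (0 for padding, 1 and 2 as in the labels above) and uses random tensors in place of real model output. As far as I understand the documentation, torch.nn.CrossEntropyLoss expects logits of shape (N, C) with class-index targets of shape (N,), or logits of shape (N, C, L) with targets of shape (N, L).

```python
import torch

loss_fn = torch.nn.CrossEntropyLoss()

# Assumed sizes: 3 classes = padding (0) plus the two labels (1, 2) used above.
batch_size, seq_len, num_classes = 2, 512, 3

# Sentence-level classification: one class index per example.
logits_2d = torch.randn(batch_size, num_classes)            # (N, C)
target_1d = torch.randint(0, num_classes, (batch_size,))    # (N,)
sentence_loss = loss_fn(logits_2d, target_1d)

# Per-position classification: class dimension second, one index per position.
logits_3d = torch.randn(batch_size, num_classes, seq_len)          # (N, C, L)
target_2d = torch.randint(0, num_classes, (batch_size, seq_len))   # (N, L)
token_loss = loss_fn(logits_3d, target_2d)

# The setup from the question: (N, 512) logits with an (N, 512) integer target.
# This is expected to be rejected; the exact message depends on the PyTorch version.
bad_logits = torch.randn(batch_size, seq_len)
bad_target = torch.randint(0, num_classes, (batch_size, seq_len))
try:
    loss_fn(bad_logits, bad_target)
    mismatch_error = None
except Exception as exc:
    mismatch_error = exc

print(sentence_loss.shape, token_loss.shape, type(mismatch_error).__name__)
```

So with a (2, 512) output the 512 axis is read as the class axis, and a (2, 512) target then looks like several targets per example.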
How should I solve this issue, and are there any tutorials for similar kinds of models?
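One direction that seems plausible is to classify every token instead of the whole sentence: keep the per-token hidden states (the first element returned by the base model, which the forward above discards) and map each 768-dimensional vector to the number of label classes. The sketch below is only an illustration under assumptions. TokenTagger and ToyEncoder are hypothetical names, ToyEncoder is a stand-in for the pretrained model so the code runs without downloads, and three classes with 0 as the padding label are assumed from the example labels. The transformers library also ships CamembertForTokenClassification, which may cover this use case directly.

```python
import torch


class ToyEncoder(torch.nn.Module):
    """Hypothetical stand-in for the pretrained encoder (no download needed)."""

    def __init__(self, vocab_size=100, hidden_size=768):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab_size, hidden_size)

    def forward(self, ids, mask):
        hidden = self.embed(ids)                          # (batch, seq_len, hidden)
        return hidden * mask.unsqueeze(-1).to(hidden.dtype)


class TokenTagger(torch.nn.Module):
    """Hypothetical per-token classifier: hidden states -> class logits."""

    def __init__(self, encoder, hidden_size=768, num_classes=3, dropout_rate=0.1):
        super().__init__()
        self.encoder = encoder
        self.dropout = torch.nn.Dropout(dropout_rate)
        self.classifier = torch.nn.Linear(hidden_size, num_classes)

    def forward(self, ids, mask):
        # Assumes the encoder returns per-token states (batch, seq_len, hidden).
        hidden = self.encoder(ids, mask)
        return self.classifier(self.dropout(hidden))      # (batch, seq_len, classes)


batch_size, seq_len, num_classes = 2, 512, 3
ids = torch.randint(0, 100, (batch_size, seq_len))
mask = torch.ones(batch_size, seq_len, dtype=torch.long)
labels = torch.randint(1, num_classes, (batch_size, seq_len))

# Pretend only the first 12 positions are real tokens; the rest is padding.
mask[:, 12:] = 0
labels[:, 12:] = 0

model = TokenTagger(ToyEncoder())
logits = model(ids, mask)

# Flatten to (batch * seq_len, classes) vs (batch * seq_len,) and skip padding.
loss_fn = torch.nn.CrossEntropyLoss(ignore_index=0)
loss = loss_fn(logits.reshape(-1, num_classes), labels.reshape(-1))
loss.backward()
print(tuple(logits.shape), loss.item())
```

Note that the labels in the example are per word, while the tokenizer may split a word into several subword tokens, so the labels would probably need to be aligned with the tokens before padding.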