Are training and validation losses plotted per sample or per batch?

artificial-intelligence  convolutional-neural-network  objective-function  supervised-learning  pytorch
2021-10-31 20:11:10

I am training a CNN on some data, where the training set has 21700 samples and the test set has 653 samples, and say I use a batch_size of 500.

I have been searching for a long time but cannot find a definite answer: when plotting the loss to check whether the model is overfitting, do I plot it like this:

for j in range(num_epochs):
    # <some training code: take a gradient descent step>
    total_loss = 0
    for i in range(num_batches_train):
        batch_loss = criterion(output, target)  # loss for this batch
        total_loss += batch_loss
    Losses_Train_Per_Epoch.append(total_loss / num_samples_train)

The last line is where I need help. Should it instead be:

Losses_Train_Per_Epoch.append(total_loss / num_batches_train)

and the same for Losses_Validation_Per_Epoch, then plotting both curves:

plt.plot(Losses_Train_Per_Epoch)
plt.plot(Losses_Validation_Per_Epoch)

So, basically, what I am asking is: should I divide by num_samples, num_batches, or batch_size? Which one?

1 Answer

You want to compute the average loss over all batches, so divide the sum of the batch losses by the number of batches.

In your case:

You have a training set of 21700 samples and a batch size of 500. This means you take 21700 / 500 ≈ 43 training iterations, so in each epoch the model is updated 43 times (the remaining 200 samples are dropped if the last partial batch is discarded). Given the way you compute the training loss, num_batches_train is what you need to divide by.
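The iteration count can be checked directly. A quick sketch using the figures from the question (whether you get 43 or 44 updates depends on whether the final partial batch is dropped or kept):

```python
import math

num_samples = 21700
batch_size = 500

# Floor division drops the final partial batch (21700 - 43*500 = 200 samples)
full_batches = num_samples // batch_size
print(full_batches)   # 43

# Rounding up keeps the partial batch as one extra, smaller iteration
all_batches = math.ceil(num_samples / batch_size)
print(all_batches)    # 44
```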

Note: I am not sure exactly what you want to plot, but I assume it is the training and validation loss:

training_loss = []
validation_loss = []
training_steps = num_samples // batch_size
validation_steps = num_validation_samples // batch_size

for epoch in range(num_epochs):

    # Training steps
    total_loss = 0
    for b in range(training_steps):
        batch_loss = ...  # compute batch loss
        total_loss += batch_loss
    training_loss.append(total_loss / training_steps)

    # Validation steps
    total_loss = 0
    for b in range(validation_steps):
        batch_loss = ...  # compute batch validation loss
        total_loss += batch_loss
    validation_loss.append(total_loss / validation_steps)

# Plot training and validation curves
plt.plot(range(num_epochs), training_loss)
plt.plot(range(num_epochs), validation_loss)
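To see that this loop produces one averaged value per epoch for each curve, here is a self-contained sketch of the same structure with synthetic per-batch losses standing in for `criterion(output, target)` (the random values, seed, and step counts are made up for the example):

```python
import random

num_epochs = 3
training_steps = 43     # 21700 // 500
validation_steps = 1    # 653 // 500

training_loss = []
validation_loss = []

random.seed(0)
for epoch in range(num_epochs):
    # Training: sum per-batch losses, then divide by the number of batches
    total = 0.0
    for b in range(training_steps):
        batch_loss = random.random()  # stand-in for criterion(output, target)
        total += batch_loss
    training_loss.append(total / training_steps)

    # Validation: same averaging, appended to the *validation* list
    total = 0.0
    for b in range(validation_steps):
        batch_loss = random.random()
        total += batch_loss
    validation_loss.append(total / validation_steps)

# One averaged value per epoch for each curve
print(len(training_loss), len(validation_loss))  # 3 3
```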

An alternative is to store the batch losses in a list and take their mean. You can use this if you are not sure what to divide by:

import numpy as np

...

for epoch in range(num_epochs):

    list_of_batch_losses = []  # initialize list that is going to store batch losses

    # Training steps
    for b in range(training_steps):
        batch_loss = ...  # compute batch loss
        list_of_batch_losses.append(batch_loss)  # store loss in a list

    epoch_loss = np.mean(list_of_batch_losses)
    training_loss.append(epoch_loss)

    ...

plt.plot(range(num_epochs), training_loss)
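One caveat with a plain mean of per-batch losses: since 21700 is not a multiple of 500, a kept final batch has only 200 samples, and an unweighted mean gives that batch the same weight as a full one. Weighting by batch size recovers the true per-sample average. A sketch with made-up loss values, assuming each batch loss is already the mean over its batch:

```python
import numpy as np

batch_losses = [0.9, 0.8, 0.7]   # per-batch mean losses (made-up values)
batch_sizes = [500, 500, 200]    # last batch is smaller

# Unweighted mean treats every batch equally
unweighted = np.mean(batch_losses)

# Weighted mean recovers the average loss per sample
weighted = np.average(batch_losses, weights=batch_sizes)

print(round(unweighted, 4))  # 0.8
print(round(weighted, 4))    # 0.825
```

With equal-sized batches the two agree exactly, which is why dividing by the number of batches is fine in the common case.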