Classifier using PyTorch

data-mining pytorch
2022-02-27 09:43:31

I am writing demo code that performs 2-class classification on a 10-dimensional input dataset. Below, the function _data generates the data:

import math
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

def _data(dimension, num_examples):
    # Create a simulated training dataset of num_examples labeled examples with
    # `dimension` features each; num_mislabeled_examples of them are mislabeled.
    num_mislabeled_examples = 20
    # We will constrain the recall to be at least 90%.
    recall_lower_bound = 0.9

    # Create random "ground truth" parameters for a linear model.
    ground_truth_weights = np.random.normal(size=dimension) / math.sqrt(dimension)
    ground_truth_threshold = 0

    # Generate a random set of features for each example.
    features = np.random.normal(size=(num_examples, dimension)).astype(
        np.float32) / math.sqrt(dimension)
    # Compute the labels from these features given the ground truth linear model.
    labels = (np.matmul(features, ground_truth_weights) >
              ground_truth_threshold).astype(np.float32)
    # Add noise by randomly flipping num_mislabeled_examples labels.
    mislabeled_indices = np.random.choice(
        num_examples, num_mislabeled_examples, replace=False)
    labels[mislabeled_indices] = 1 - labels[mislabeled_indices]

    return torch.tensor(labels), torch.tensor(features)

The code below shows my attempt, where predictor is the model and the loss function is chosen to be a hinge loss.

import math
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim

dim = 10
N = 100
target, features = _data(dim, N)

class predictor(nn.Module):
    def __init__(self):
        super(predictor, self).__init__()
        self.f_1 = nn.Linear(10, 1)

    def forward(self, features):
        return self.f_1(features)

model = predictor()

optimizer = optim.Adam(model.parameters(), lr=1e-2)
loss = torch.nn.HingeEmbeddingLoss(margin=1.0, size_average=None, reduce=None, reduction='mean')

running_loss = 0
for _ in range(1000):
    optimizer.zero_grad()
    output = model(features)
    objective = loss(output, target)
    objective.backward()
    running_loss += objective.item()
    optimizer.step()

    print(running_loss)

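My questions:

  1. I see my loss rise from zero to about 20 and then go deep into negative territory. I would like to know whether my implementation is correct.
  2. I tried to implement my predictor without nn.Linear, defining the computation myself: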
class predictor(nn.Module):
    def __init__(self):
        super(predictor, self).__init__()
        self.weights = torch.zeros(dim, dim, requires_grad=True)
        self.threshold = torch.zeros(1, 1, requires_grad=True)

    def forward(self, features):
        return torch.matmul(self.weights, features) - self.threshold

But when setting up the optimizer,

model = predictor()
optimizer = optim.Adam(model.parameters(), lr=1e-3) 

I get the following error:

ValueError: optimizer got an empty parameter list

I would appreciate any guidance or comments on how to fix these issues. Thank you.

1 Answer
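  1. Choose a much smaller lr for the optimizer. The behavior you see may be caused by exploding gradients.

  2. For self.weights (and likewise self.threshold), wrap your torch.zeros() tensor in nn.Parameter() so that it becomes a model parameter. A plain tensor attribute is not registered with the module, so model.parameters() comes back empty.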
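On the first question, one further thing may be worth checking besides the learning rate. This is my own reading, not something the answer above states. nn.HingeEmbeddingLoss expects targets of 1 or -1 and is documented for distance-like inputs; for a target of 1 the loss is the input itself, so with an unconstrained linear output it can be pushed below zero. The question's labels are 0/1, the model output has shape (N, 1) while the target has shape (N,), and running_loss is a cumulative sum rather than a per-step loss. The sketch below swaps in the standard hinge loss, mean(max(0, 1 - y * s)), with labels mapped to -1/+1. The synthetic data and the helper name hinge_loss are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Synthetic stand-in for the question's data (assumption): 100 examples, 10 features.
N, dim = 100, 10
features = torch.randn(N, dim)
true_w = torch.randn(dim)
labels01 = (features @ true_w > 0).float()   # labels in {0, 1}
targets = 2.0 * labels01 - 1.0               # map to {-1, +1}

model = nn.Linear(dim, 1)
optimizer = optim.Adam(model.parameters(), lr=1e-2)


def hinge_loss(scores, targets):
    # Standard hinge loss: mean(max(0, 1 - y * s)); it is never negative.
    return torch.clamp(1.0 - targets * scores, min=0).mean()


losses = []
for _ in range(200):
    optimizer.zero_grad()
    scores = model(features).squeeze(1)      # (N, 1) -> (N,) to match targets
    loss = hinge_loss(scores, targets)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())               # per-step loss, not a running sum

print(losses[0], losses[-1])
```

Recording the per-step loss makes it easier to see whether training is actually converging.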
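A minimal sketch of the nn.Parameter fix for the hand-written predictor follows. It makes two assumptions beyond the question's code: the weight is given shape (dim, 1) instead of (dim, dim), and the product is written as features times weights so that a batch of shape (N, dim) yields one score per example. The class name Predictor and the random test batch are illustrative only.

```python
import torch
import torch.nn as nn
import torch.optim as optim

dim = 10  # input dimension, as in the question


class Predictor(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # nn.Parameter registers the tensors with the module, so they
        # are returned by model.parameters().
        self.weights = nn.Parameter(torch.zeros(dim, 1))
        self.threshold = nn.Parameter(torch.zeros(1))

    def forward(self, features):
        # features: (N, dim) -> scores: (N, 1)
        return torch.matmul(features, self.weights) - self.threshold


model = Predictor(dim)
# The parameter list is no longer empty, so the optimizer can be built.
optimizer = optim.Adam(model.parameters(), lr=1e-3)

num_params = len(list(model.parameters()))
scores = model(torch.randn(5, dim))
print(num_params, scores.shape)
```

Tensors created with requires_grad=True but stored as plain attributes still track gradients; they are simply not registered with the module, so the optimizer never sees them.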