import numpy as np

X = np.array([1, 2, 3])
Y = np.array([2, 1, 2])
params = np.array([1.0, 0.0])  # [intercept, slope]

def loss(y, yhat):
    return ((y - yhat) ** 2).mean()

def model(x):
    return params[0] + params[1] * x

def loss_grad(y, yhat, x):
    # d/d(intercept) of mean((yhat-y)^2) = mean(2*(yhat-y))
    # d/d(slope)     of mean((yhat-y)^2) = mean(2*(yhat-y)*x)
    return np.array([(2 * (yhat - y)).mean(), (2 * (yhat - y) * x).mean()])

lr = 0.1
for _ in range(3):
    yhat = model(X)
    l = loss(Y, yhat)          # loss at the current params, before the step
    g = loss_grad(Y, yhat, X)
    params = params - lr * g
    print(f'thetas are now {params}, loss before this update was {l}')
Output:
thetas are now [1.13333333 0.26666667], loss before this update was 0.6666666666666666
thetas are now [1.13333333 0.23111111], loss before this update was 0.2696296296296296
thetas are now [1.14755556 0.22874074], loss before this update was 0.262887242798354
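If you want to sanity-check the analytic gradient directly, a centered finite-difference comparison is a quick way to do it. This is just a sketch I'm adding, reusing loss and loss_grad from above; eps is a typical step size, not anything from the original run:

eps = 1e-6
yhat = model(X)
analytic = loss_grad(Y, yhat, X)
numeric = np.zeros_like(params)
for i in range(len(params)):
    # nudge parameter i up and down and difference the losses
    p_hi = params.copy(); p_hi[i] += eps
    p_lo = params.copy(); p_lo[i] -= eps
    numeric[i] = (loss(Y, p_hi[0] + p_hi[1] * X) -
                  loss(Y, p_lo[0] + p_lo[1] * X)) / (2 * eps)
print(analytic)  # the two vectors should agree to roughly 1e-9
print(numeric)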
I also double-checked this in keras, but in the numpy version the gradient is written out explicitly, so I'd suggest double-checking your gradient or your arithmetic.
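For completeness, the keras cross-check looks roughly like this. It's a sketch assuming tensorflow.keras: the random weight initialization means the per-step thetas won't match the numpy run exactly, but full-batch SGD on an MSE loss converges to the same least-squares line:

import numpy as np
from tensorflow import keras

X = np.array([[1.0], [2.0], [3.0]])
Y = np.array([2.0, 1.0, 2.0])

# a single Dense(1) unit is exactly the linear model params[0] + params[1]*x
model = keras.Sequential([keras.Input(shape=(1,)), keras.layers.Dense(1)])
model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.1), loss='mse')
model.fit(X, Y, epochs=200, batch_size=3, verbose=0)  # batch_size=3 = full batch, like the numpy loop
print(model.get_weights())  # [slope kernel, intercept bias]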