What is wrong with my implementation of the Adaline algorithm?

data-mining machine-learning python classification perceptron
2021-09-24 12:41:02

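I am reading the textbook "Learning From Data", and one of the problems in the first chapter asks the reader to implement the Adaline algorithm from scratch. I chose to implement it in Python. The problem I am running into is that my weights w blow up to infinity almost immediately, long before the algorithm converges. Am I doing something wrong here? As far as I can tell, I am implementing it the way the text describes. Below I have provided the problem and my Python code. Here y takes the values -1 and 1, so this is a classification problem. [Image: Problem 1.5]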

import numpy as np
import pandas as pd

#Generate w* vector, the true weights
dim=2
wstar=2000*np.random.rand(dim+1)-1000

#Generate the random sample of size 100
trainSize=100
train=pd.DataFrame(2000*np.random.rand(trainSize,dim)-1000)
train['intercept']=np.ones(trainSize)
cols=train.columns.tolist()
cols=cols[-1:]+cols[:-1]
train=train[cols]

#Classify the points
train['y']=np.sign(np.dot(train.iloc[:,0:3],wstar))

#Now we run the ADALINE algorithm on the training data
#Declare w vector
w=np.zeros(dim+1)

#Column of guesses
train['guess']=np.ones(trainSize)

#s column
train['s']=np.dot(train.iloc[:,0:3],w)

#Set eta
eta=5
iterations=0
while (all((train['y']*train['s'])>1)==False):
    if iterations>=1000:
        break
    #Picking a random point
    randInt=np.random.randint(len(train))
    #Temporary values for calculating new w
    temp_s=train['s'].iloc[randInt]
    temp_x=train.iloc[randInt,0:3]
    temp_y=train['y'].iloc[randInt]
    #Calculating new w
    if temp_y*temp_s<=1:
        w=w+eta*(temp_y-temp_s)*temp_x
        #Calculating new guesses and s values
        train['s']=np.dot(train.iloc[:,0:3],w)
        train['guess']=np.sign(train['s'])
    iterations+=1
1 Answer

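First, let me add this schematic, which I think is very helpful for understanding the transition from the original Rosenblatt perceptron to the Adaline algorithm, and what the latter improves: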

[Image: schematic not reproduced here]

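In Adaline, the cost function (built from your y(t) - s(t)) is assumed to be differentiable, so the weights can be updated by following its gradient, and there is no requirement that y and s have the same sign: the goal is to minimize the error y - s.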
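To make the update concrete, here is a minimal sketch of a single Adaline step on one sample, written in the same form as the update in the question's code (`w = w + eta*(y - s)*x`). The function name `adaline_step` and the sample values are made up for illustration. Note that the question's version applies this step only when `y*s <= 1`, while the sketch below applies it unconditionally.

```python
import numpy as np

def adaline_step(w, x, y, eta):
    """Apply one Adaline update for a single sample.

    w, x : 1d arrays of equal length (x includes the constant 1 for the bias)
    y    : target label, -1 or +1
    eta  : learning rate
    """
    s = np.dot(w, x)               # linear output s(t), before any thresholding
    return w + eta * (y - s) * x   # move w along x in proportion to the error y - s

# Illustrative values only (not taken from the question's data).
w = np.zeros(3)
x = np.array([1.0, 0.5, -0.2])
w = adaline_step(w, x, y=1.0, eta=0.1)
print(w)
```

Because the step is proportional to `(y - s) * x`, both the learning rate and the magnitude of the features determine how large each update is.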

Below you can find the code provided in Sebastian Raschka's excellent book:

import numpy as np


class AdalineSGD(object):
    """ADAptive LInear NEuron classifier.

    Parameters
    ------------
    eta : float
        Learning rate (between 0.0 and 1.0)
    n_iter : int
        Passes over the training dataset.
    shuffle : bool (default: True)
        Shuffles training data every epoch if True
        to prevent cycles.
    random_state : int
        Random number generator seed for random weight
        initialization.

    Attributes
    -----------
    w_ : 1d-array
        Weights after fitting.
    cost_ : list
        Sum-of-squares cost function value averaged over all
        training samples in each epoch.
    """
    def __init__(self, eta=0.01, n_iter=10,
                 shuffle=True, random_state=None):
        self.eta = eta
        self.n_iter = n_iter
        self.w_initialized = False
        self.shuffle = shuffle
        self.random_state = random_state

    def fit(self, X, y):
        """ Fit training data.

        Parameters
        ----------
        X : {array-like}, shape = [n_samples, n_features]
            Training vectors, where n_samples is the number
            of samples and
            n_features is the number of features.
        y : array-like, shape = [n_samples]
            Target values.

        Returns
        -------
        self : object
        """
        self._initialize_weights(X.shape[1])
        self.cost_ = []
        for i in range(self.n_iter):
            if self.shuffle:
                X, y = self._shuffle(X, y)
            cost = []
            for xi, target in zip(X, y):
                cost.append(self._update_weights(xi, target))
            avg_cost = sum(cost) / len(y)
            self.cost_.append(avg_cost)

        return self

    def partial_fit(self, X, y):
        """Fit training data without reinitializing the weights"""
        if not self.w_initialized:
            self._initialize_weights(X.shape[1])
        if y.ravel().shape[0] > 1:  # if we have more than one sample
            for xi, target in zip(X, y):
                self._update_weights(xi, target)
        else:
            self._update_weights(X, y)

        return self

    def _shuffle(self, X, y):
        """Shuffle training data"""
        r = self.rgen.permutation(len(y))

        return X[r], y[r]

    def _initialize_weights(self, m):
        """Initialize weights to small random numbers"""
        self.rgen = np.random.RandomState(self.random_state)
        self.w_ = self.rgen.normal(loc=0.0, scale=0.01,
                                   size=1 + m)

        self.w_initialized = True

    def _update_weights(self, xi, target):
        """Apply Adaline learning rule to update the weights"""
        output = self.activation(self.net_input(xi))
        error = (target - output)
        self.w_[1:] += self.eta * xi.dot(error)
        self.w_[0] += self.eta * error
        cost = 0.5 * error**2

        return cost

    def net_input(self, X):
        """Calculate net input"""

        return np.dot(X, self.w_[1:]) + self.w_[0]

    def activation(self, X):
        """Compute linear activation"""
        return X

    def predict(self, X):
        """Return class label after unit step"""

        return np.where(self.activation(self.net_input(X))
                        >= 0.0, 1, -1)
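
As a rough illustration of why the weights in the question might blow up, the sketch below runs the plain single-sample update on data generated the way the question generates it (features in [-1000, 1000], `eta=5`), and then again on the same data with the features divided by 1000 and `eta=0.01`, the default in the class above. This is my own assumption about a likely contributor, not a diagnosis taken from the book. The helper `run_adaline`, the seeds, and the choice of dividing by 1000 are all made up for the example, and the update is applied unconditionally rather than only when `y*s <= 1`.

```python
import numpy as np

def run_adaline(X, y, eta, n_steps, seed=0):
    """Run n_steps single-sample Adaline updates from w = 0.

    X is expected to already contain a leading column of ones (bias term).
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    # Silence overflow/invalid warnings so the divergent run finishes quietly.
    with np.errstate(all="ignore"):
        for _ in range(n_steps):
            i = rng.integers(len(y))          # pick a random training point
            s = np.dot(X[i], w)               # linear output for that point
            w = w + eta * (y[i] - s) * X[i]   # Adaline update
    return w

# Data generated as in the question: 100 points, 2 features in [-1000, 1000].
rng = np.random.default_rng(42)
raw = 2000 * rng.random((100, 2)) - 1000
wstar = 2000 * rng.random(3) - 1000
X_raw = np.column_stack([np.ones(100), raw])
y = np.sign(np.dot(X_raw, wstar))

# Raw features with the question's learning rate.
w_raw = run_adaline(X_raw, y, eta=5, n_steps=1000)

# Same labels, features rescaled to [-1, 1], small learning rate.
X_scaled = np.column_stack([np.ones(100), raw / 1000.0])
w_scaled = run_adaline(X_scaled, y, eta=0.01, n_steps=1000)

print("raw features, eta=5:       finite weights?", bool(np.all(np.isfinite(w_raw))))
print("scaled features, eta=0.01: finite weights?", bool(np.all(np.isfinite(w_scaled))))
```

The reasoning behind the comparison: after one update on a sample x, the error on that same sample is multiplied by (1 - eta * |x|^2). With `eta=5` and feature values in the hundreds, that factor is on the order of millions in magnitude, so each step amplifies the error instead of shrinking it. With features in [-1, 1] and `eta=0.01`, the factor stays between 0 and 1.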