
What is wrong with my implementation of the Adaline algorithm?

Tags: data mining, machine learning, Python, classification, perceptron

2021-09-24 12:41:02

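I am reading the textbook "Learning From Data", and one of the problems in the first chapter asks the reader to implement the Adaline algorithm from scratch. I chose to do it in Python. The problem is that my weights $\textbf{w}$ blow up to infinity almost immediately, long before the algorithm converges. Am I doing something wrong here? As far as I can tell, I am implementing it the way the text describes. I have included the problem and my Python code below. Here $\textbf{y}$ takes the values -1 and 1, so this is a classification problem.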

import numpy as np
import pandas as pd

#Generate w* vector, the true weights
dim=2
wstar=2000*np.random.rand(dim+1)-1000

#Generate the random sample of size 100
trainSize=100
train=pd.DataFrame(2000*np.random.rand(trainSize,dim)-1000)
train['intercept']=np.ones(trainSize)
cols=train.columns.tolist()
cols=cols[-1:]+cols[:-1]
train=train[cols]

#Classify the points
train['y']=np.sign(np.dot(train.iloc[:,0:3],wstar))

#Now we run the ADALINE algorithm on the training data
#Declare w vector
w=np.zeros(dim+1)

#Column of guesses
train['guess']=np.ones(trainSize)

#s column
train['s']=np.dot(train.iloc[:,0:3],w)

#Set eta
eta=5
iterations=0
while (all((train['y']*train['s'])>1)==False):
    if iterations>=1000:
        break
    #Picking a random point
    randInt=np.random.randint(len(train))
    #Temporary values for calculating new w
    temp_s=train['s'].iloc[randInt]
    temp_x=train.iloc[randInt,0:3]
    temp_y=train['y'].iloc[randInt]
    #Calculating new w
    if temp_y*temp_s<=1:
        w=w+eta*(temp_y-temp_s)*temp_x
        #Calculating new guesses and s values
        train['s']=np.dot(train.iloc[:,0:3],w)
        train['guess']=np.sign(train['s'])
    iterations+=1

1 Answer

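First, let me add this diagram, which I find very helpful for understanding how the Adaline algorithm follows from, and improves on, the original Rosenblatt perceptron: [diagram not included in this copy]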

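In Adaline, the weights can be updated as long as the cost function (built from your error term $y(t) - s(t)$) is differentiable, and there is no requirement that $y$ and $s$ have the same sign: the goal is simply to minimize the cost, that is, to drive the error $y - s$ toward zero.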
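As a short worked step (an addition based only on the update written in the question's code, $\textbf{w} \leftarrow \textbf{w} + \eta\,(y - s)\,\textbf{x}$ with $s = \textbf{w}^T\textbf{x}$), it may help to look at what a single update does to the error on the point $\textbf{x}$ that was just used:

$$s' = \big(\textbf{w} + \eta\,(y - s)\,\textbf{x}\big)^T \textbf{x} = s + \eta\,(y - s)\,\|\textbf{x}\|^2$$

$$y - s' = (y - s)\,\big(1 - \eta\,\|\textbf{x}\|^2\big)$$

So the error on that point shrinks only when $|1 - \eta\,\|\textbf{x}\|^2| < 1$, that is, when $0 < \eta < 2 / \|\textbf{x}\|^2$. With features drawn from $[-1000, 1000]$ as in the question, $\|\textbf{x}\|^2$ can be on the order of $10^6$, so $\eta = 5$ multiplies the error by a factor of roughly $-5 \times 10^6$ per update. This looks like a plausible cause of the exploding weights, although it is an inference from the posted code rather than something the textbook problem states.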

Below is the code provided in Sebastian Raschka's excellent book:

import numpy as np


class AdalineSGD(object):
    """ADAptive LInear NEuron classifier.

    Parameters
    ------------
    eta : float
        Learning rate (between 0.0 and 1.0)
    n_iter : int
        Passes over the training dataset.
    shuffle : bool (default: True)
        Shuffles training data every epoch if True
        to prevent cycles.
    random_state : int
        Random number generator seed for random weight
        initialization.

    Attributes
    -----------
    w_ : 1d-array
        Weights after fitting.
    cost_ : list
        Sum-of-squares cost function value averaged over all
        training samples in each epoch.
    """

    def __init__(self, eta=0.01, n_iter=10,
                 shuffle=True, random_state=None):
        self.eta = eta
        self.n_iter = n_iter
        self.w_initialized = False
        self.shuffle = shuffle
        self.random_state = random_state

    def fit(self, X, y):
        """Fit training data.

        Parameters
        ----------
        X : {array-like}, shape = [n_samples, n_features]
            Training vectors, where n_samples is the number
            of samples and n_features is the number of features.
        y : array-like, shape = [n_samples]
            Target values.

        Returns
        -------
        self : object
        """
        self._initialize_weights(X.shape[1])
        self.cost_ = []
        for i in range(self.n_iter):
            if self.shuffle:
                X, y = self._shuffle(X, y)
            cost = []
            # One weight update per training sample (stochastic gradient descent)
            for xi, target in zip(X, y):
                cost.append(self._update_weights(xi, target))
            avg_cost = sum(cost) / len(y)
            self.cost_.append(avg_cost)

        return self

    def partial_fit(self, X, y):
        """Fit training data without reinitializing the weights"""
        if not self.w_initialized:
            self._initialize_weights(X.shape[1])
        if y.ravel().shape[0] > 1:  # if we have more than one sample
            for xi, target in zip(X, y):
                self._update_weights(xi, target)
        else:
            self._update_weights(X, y)

        return self

    def _shuffle(self, X, y):
        """Shuffle training data"""
        r = self.rgen.permutation(len(y))

        return X[r], y[r]

    def _initialize_weights(self, m):
        """Initialize weights to small random numbers"""
        self.rgen = np.random.RandomState(self.random_state)
        # w_[0] is the bias unit, w_[1:] are the feature weights
        self.w_ = self.rgen.normal(loc=0.0, scale=0.01,
                                   size=1 + m)

        self.w_initialized = True

    def _update_weights(self, xi, target):
        """Apply Adaline learning rule to update the weights"""
        output = self.activation(self.net_input(xi))
        error = (target - output)
        self.w_[1:] += self.eta * xi.dot(error)
        self.w_[0] += self.eta * error
        cost = 0.5 * error**2

        return cost

    def net_input(self, X):
        """Calculate net input"""
        return np.dot(X, self.w_[1:]) + self.w_[0]

    def activation(self, X):
        """Compute linear activation"""
        return X

    def predict(self, X):
        """Return class label after unit step"""
        return np.where(self.activation(self.net_input(X))
                        >= 0.0, 1, -1)
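To relate this back to the question, here is a minimal sketch of one likely reason the weights blow up there. It is an assumption on my part, not something the book code states: the features span [-1000, 1000] while the learning rate is 5, whereas the class above defaults to eta=0.01. The sketch applies the plain Adaline update to every sample, as the class above does, rather than the question's conditional update. The helper `adaline_sgd`, the name `w_star` and the seeds are hypothetical and exist only for illustration.

```python
import numpy as np

def adaline_sgd(X, y, eta, n_epochs=20, seed=0):
    """Plain Adaline SGD. X is expected to already contain a bias column of ones."""
    rng = np.random.RandomState(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_epochs):
        for i in rng.permutation(len(y)):
            error = y[i] - X[i].dot(w)  # y - s for the chosen sample
            w += eta * error * X[i]     # w <- w + eta * (y - s) * x
    return w

# Data generated the same way as in the question (seed chosen arbitrarily).
rng = np.random.RandomState(1)
raw = 2000 * rng.rand(100, 2) - 1000      # features in [-1000, 1000]
w_star = 2000 * rng.rand(3) - 1000        # hypothetical "true" weights
X_raw = np.hstack([np.ones((100, 1)), raw])
y = np.sign(X_raw.dot(w_star))

# Same data with each feature standardized to zero mean and unit variance.
scaled = (raw - raw.mean(axis=0)) / raw.std(axis=0)
X_scaled = np.hstack([np.ones((100, 1)), scaled])

# Overflow is the expected outcome of the first run, so silence its warnings.
with np.errstate(over="ignore", invalid="ignore"):
    w_raw = adaline_sgd(X_raw, y, eta=5, n_epochs=1)  # the question's step size
w_scaled = adaline_sgd(X_scaled, y, eta=0.01)

accuracy = np.mean(np.sign(X_scaled.dot(w_scaled)) == y)
print("raw features, eta=5:       ", w_raw)
print("scaled features, eta=0.01: ", w_scaled)
print("training accuracy (scaled):", accuracy)
```

If this diagnosis is right, the first run should end with astronomically large or non-finite weights, while the second stays finite. Adaline minimizes squared error rather than classification error, so the scaled run is not guaranteed to classify every training point correctly even though the data are linearly separable.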
