Can machine learning learn a function like finding the maximum of a list?

Tags: data mining, machine learning, deep learning

2021-10-04 22:03:46

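I have an input that is a list, and the output is the maximum of the elements in that list.

Can machine learning learn a function that always selects the largest element present in the input?

This may seem like a very basic question, but it might help me understand what machine learning can do in general. Thanks!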

4 answers

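Maybe, but note that this is one of those cases where machine learning is not the answer. There is a tendency to shoehorn machine learning into situations where a plain rule-based solution is faster, simpler, and usually the right choice :P

Just because you can doesn't mean you should.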
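For comparison, here is what the rule-based solution looks like. This is a minimal sketch; the list `values` is made up for illustration.

```python
# Hypothetical example list, not taken from the question.
values = [3, 41, 7, 19]

# The built-in max() is exact for any input and needs no training data.
largest = max(values)

# Position of the maximum, the quantity the models below try to predict.
index_of_largest = values.index(largest)

print(largest, index_of_largest)
```

It is correct for every input, which none of the trained models below can claim.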

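Edit: I originally wrote this as "Yes, but note that..." but then started to doubt myself, having never seen it done. I tried it this afternoon and it is indeed doable: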

import numpy as np
from keras.models import Model
from keras.layers import Input, Dense, Dropout
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
from keras.callbacks import EarlyStopping

# Create an input array of 50,000 samples of 20 random numbers each
x = np.random.randint(0, 100, size=(50000, 20))

# And a one-hot encoded target denoting the index of the maximum of the inputs
y = to_categorical(np.argmax(x, axis=1), num_classes=20)

# Split into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(x, y)

# Build a network, probably needlessly complicated since it needs a lot of dropout to
# perform even reasonably well.

i = Input(shape=(20, ))
a = Dense(1024, activation='relu')(i)
b = Dense(512, activation='relu')(a)
ba = Dropout(0.3)(b)
c = Dense(256, activation='relu')(ba)
d = Dense(128, activation='relu')(c)
o = Dense(20, activation='softmax')(d)

model = Model(inputs=i, outputs=o)

es = EarlyStopping(monitor='val_loss', patience=3)

model.compile(optimizer='adam', loss='categorical_crossentropy')

# validation_data is passed as a tuple, the form Keras documents for (inputs, targets)
model.fit(x_train, y_train, epochs=15, batch_size=8, validation_data=(x_test, y_test), callbacks=[es])

print(np.where(np.argmax(model.predict(x_test), axis=1) == np.argmax(y_test, axis=1), 1, 0).mean())

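The output is 0.74576, so it correctly finds the maximum about 74.6% of the time. I have no doubt that this could be improved, but as I said, this is not a use case I would recommend for ML.

Edit 2: Actually, I re-ran this this morning using sklearn's RandomForestClassifier and it performed significantly better: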

from sklearn.ensemble import RandomForestClassifier

# instantiation of the arrays is identical

rfc = RandomForestClassifier(n_estimators=1000, verbose=1)
rfc.fit(x_train, y_train)

yhat_proba = rfc.predict_proba(x_test)


# We have some annoying transformations to do because this .predict_proba() call returns the data in a weird format of shape (20, 12500, 2).

for i in range(len(yhat_proba)):
    yhat_proba[i] = yhat_proba[i][:, 1]

pyhat = np.reshape(np.ravel(yhat_proba), (12500,20), order='F')

print(np.where(np.argmax(pyhat, axis=1) == np.argmax(y_test, axis=1), 1, 0).mean())

The score here is 94.4% of samples with the maximum correctly identified, which is pretty good indeed.

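Yes. Very importantly, YOU decide the architecture of a machine learning solution. Architectures and training procedures don't write themselves; they must be designed or templated, and training follows as a means of discovering a parameterization of the architecture that fits a set of data points.

You can construct a very simple architecture that actually contains a maximum function:

net(x) = a * max(x) + b * min(x)

where a and b are learned parameters.

Given enough training samples and a reasonable training routine, this very simple architecture will quickly learn to set a to 1 and b to 0 for your task.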
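The architecture above can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the answerer's code: the data are random uniform lists of length 20, and the "training routine" is ordinary least squares via `np.linalg.lstsq` rather than gradient descent, since the model is linear in a and b.

```python
import numpy as np

# Assumed training set: 1,000 random lists of 20 numbers each.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 100.0, size=(1000, 20))
y = x.max(axis=1)  # target: the maximum of each list

# The architecture net(x) = a * max(x) + b * min(x) is linear in (a, b),
# so its two "sub-functions" become the columns of a design matrix.
features = np.column_stack([x.max(axis=1), x.min(axis=1)])

# Fit a and b by least squares (a stand-in for a generic training loop).
(a, b), *_ = np.linalg.lstsq(features, y, rcond=None)

print(f"a = {a:.6f}, b = {b:.6f}")
```

Because the target is exactly representable by the architecture, the fit recovers a close to 1 and b close to 0, up to floating-point error.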

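Machine learning generally takes the form of entertaining multiple hypotheses about the featurization and transformation of input data points, and learning to keep only those hypotheses that are correlated with the target variable. Hypotheses are encoded explicitly in the architecture and sub-functions available in a parameterized algorithm, or as the assumptions built into a "parameterless" algorithm.

For example, the choice to use dot products and nonlinearities, as is common in vanilla neural network ML, is somewhat arbitrary; it expresses the encompassing hypothesis that a function can be constructed from a predetermined compositional network structure of linear transformations and threshold functions. Different parameterizations of that network embody different hypotheses about which linear transformations to use. Any toolbox of functions can be used, and a machine learner's job is to discover, through differentiation or trial and error or some other repeatable signal, which functions or features in its array best minimize an error metric. In the example given above, the learned network simply reduces to the maximum function itself, whereas an undifferentiated network could alternatively "learn" a minimum function. These functions can be expressed or approximated by other means, as in the linear or neural network regression function in another answer. In sum, it really depends on which functions or LEGO pieces you have in your ML architecture toolbox.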
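As one concrete instance of expressing the maximum with linear maps and a threshold-style nonlinearity, the identity max(a, b) = b + ReLU(a - b) can be chained across a list. The sketch below uses fixed, hand-chosen weights rather than learned ones, and the helper names `relu`, `pairwise_max`, and `relu_max` are made up for illustration.

```python
import numpy as np

def relu(z):
    # Standard rectifier: passes positive values, zeroes out negatives.
    return np.maximum(z, 0.0)

def pairwise_max(a, b):
    # max(a, b) = b + relu(a - b): a difference, a ReLU, and a sum.
    return b + relu(a - b)

def relu_max(values):
    # Fold the pairwise identity over the list to get the overall maximum.
    result = values[0]
    for v in values[1:]:
        result = pairwise_max(result, v)
    return result

print(relu_max(np.array([2.0, 9.0, 4.0])))  # prints 9.0
```

So a ReLU network with the right weights can represent the maximum exactly; whether training actually finds those weights is a separate question, as the first answer's experiment suggests.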

Yes, machine learning can learn to find the maximum in a list of numbers.

Here is a simple example of learning to find the index of the maximum:

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Create training pairs where the input is a list of numbers and the output is the argmax
training_data = np.random.rand(10_000, 5)  # Each list is 5 elements; 10K examples
training_targets = np.argmax(training_data, axis=1)

# Train a decision tree with scikit-learn
clf = DecisionTreeClassifier()
clf.fit(training_data, training_targets)

# Let's see if the trained model can correctly predict the argmax for new data.
# A single random list is usually, but not always, classified correctly,
# so measure accuracy over many new lists instead of asserting on one.
test_data = np.random.rand(1_000, 5)
predictions = clf.predict(test_data)
print((predictions == np.argmax(test_data, axis=1)).mean())  # Fraction predicted correctly

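Learning algorithms

Instead of learning a function as a calculation done by a feed-forward neural network, there is a whole research domain concerned with learning algorithms from sample data. For example, one might use something like a Neural Turing Machine, or some other method in which the execution of an algorithm is controlled by machine learning at its decision points. Toy algorithms such as finding a maximum, sorting a list, reversing a list, or filtering a list are commonly used as examples in algorithm learning research.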
