Data mining - How do I maintain CBOW dataset dimensions and fit the data to a neural network?

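I am new to neural networks. I am trying to train word embeddings without using a word2vec package.

Using headlines from the reddit worldnews dataset, I have built some CBOW representations.

For a window size of 3, this is some of my output: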

0 Context : ['scores', 'killed', 'pakistan'] ---> Target: clashes
1 Context : ['japan', 'resumes', 'refuelling'] ---> Target: mission
2 Context : ['us', 'presses', 'egypt'] ---> Target: gaza
3 Context : ['presses', 'egypt', 'gaza'] ---> Target: border
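
For reference, pairs like these could be produced by a sliding window along the following lines. This is only a sketch: the function name `make_cbow_pairs` and the sample headline are hypothetical, and it assumes, based on the sample output above, that each target is predicted from the three words that precede it rather than from words on both sides.

```python
def make_cbow_pairs(tokens, window=3):
    """Return (context, target) pairs: `window` preceding words -> next word."""
    pairs = []
    for i in range(window, len(tokens)):
        # The context is the `window` tokens immediately before position i.
        pairs.append((tokens[i - window:i], tokens[i]))
    return pairs


# Hypothetical tokenized headline, chosen to match outputs 2 and 3 above.
headline = ["us", "presses", "egypt", "gaza", "border"]
for idx, (context, target) in enumerate(make_cbow_pairs(headline)):
    print(idx, "Context :", context, "---> Target:", target)
```

A headline with fewer than four tokens yields no pairs under this scheme.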

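With a vocabulary size of 513, I collected 369 target words and 369 three-word (3-gram) contexts. Each context word is one-hot encoded as a vector of length 513.

So the shapes of my dataset become:
X.shape = (369, 3, 1, 513)
Y.shape = (369, 1, 513)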
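
To make the origin of those shapes concrete, here is a small sketch of one-hot encoding that yields arrays with the same layout. The helper names (`one_hot`, `encode_pairs`, `word_to_index`) and the two-pair toy dataset are assumptions for illustration, not the code actually used to build the dataset.

```python
import numpy as np


def one_hot(index, vocab_size):
    """Return a (1, vocab_size) row vector with a single 1 at `index`."""
    vec = np.zeros((1, vocab_size), dtype=np.float32)
    vec[0, index] = 1.0
    return vec


def encode_pairs(pairs, word_to_index):
    """Encode (context, target) pairs as one-hot arrays X and Y."""
    vocab_size = len(word_to_index)
    X = np.stack([
        np.stack([one_hot(word_to_index[w], vocab_size) for w in context])
        for context, _ in pairs
    ])
    Y = np.stack([one_hot(word_to_index[t], vocab_size) for _, t in pairs])
    return X, Y


# Toy data: 2 pairs over a 5-word vocabulary (hypothetical).
pairs = [
    (["us", "presses", "egypt"], "gaza"),
    (["presses", "egypt", "gaza"], "border"),
]
vocab = sorted({w for context, target in pairs for w in context + [target]})
word_to_index = {w: i for i, w in enumerate(vocab)}

X, Y = encode_pairs(pairs, word_to_index)
print(X.shape, Y.shape)  # (2, 3, 1, 5) (2, 1, 5)
```

With 369 pairs and a 513-word vocabulary, the same layout gives (369, 3, 1, 513) and (369, 1, 513).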

Now I am having trouble fitting this data to a neural network. My model is built with keras.

from keras.models import Sequential
from keras.layers import Dense

# create model
model = Sequential()
model.add(Dense(100, input_dim=1, init='uniform', activation='sigmoid'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])

# train
history = model.fit(X, Y, nb_epoch=100)

This raises the error:

ValueError: Error when checking input: expected dense_9_input to have 2 dimensions, but got array with shape (369, 3, 1, 513)
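
The message indicates that the first Dense layer expects a 2-D array of shape (samples, features), whereas X has four dimensions. One possible way to reconcile the two, offered as a sketch rather than a definitive fix, is to collapse each sample into a single feature vector before fitting. The example below uses random stand-in data with the shapes quoted above; the names `X_flat`, `X_mean`, and `Y_flat` are illustrative.

```python
import numpy as np

n_samples, window, vocab_size = 369, 3, 513
rng = np.random.default_rng(0)

# Stand-in data with the shapes from the question (random one-hot rows).
X = np.zeros((n_samples, window, 1, vocab_size), dtype=np.float32)
Y = np.zeros((n_samples, 1, vocab_size), dtype=np.float32)
ctx_idx = rng.integers(0, vocab_size, size=(n_samples, window))
tgt_idx = rng.integers(0, vocab_size, size=n_samples)
for i in range(n_samples):
    for j in range(window):
        X[i, j, 0, ctx_idx[i, j]] = 1.0
    Y[i, 0, tgt_idx[i]] = 1.0

# Option 1: concatenate the three context vectors into one long vector.
X_flat = X.reshape(n_samples, window * vocab_size)  # (369, 1539)

# Option 2: average the context vectors, as CBOW models commonly do.
X_mean = X.mean(axis=1).reshape(n_samples, vocab_size)  # (369, 513)

# Targets: drop the singleton axis so each row is one one-hot vector.
Y_flat = Y.reshape(n_samples, vocab_size)  # (369, 513)

print(X_flat.shape, X_mean.shape, Y_flat.shape)
```

With arrays shaped like this, the first Dense layer's `input_dim` would need to match the feature count (1539 or 513), and the output layer would need 513 units to match `Y_flat`. A softmax output with categorical cross-entropy is a common pairing for one-hot targets.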