My task is to classify news articles as Interesting [1] or Uninteresting [0]. The training set has 4053 articles, of which 179 are Interesting; the validation set has 664 articles, of which 17 are Interesting. I have preprocessed the articles and converted them to vectors using word2vec.
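The conversion looks roughly like this (a minimal sketch; the gensim lookup and the zero-pad/truncate policy are illustrative assumptions, not necessarily the exact pipeline):

import numpy as np

sentence_length, vector_length = 500, 100

def article_to_matrix(tokens, w2v_model):
    """tokens: list of words; w2v_model: a trained gensim Word2Vec model (assumption)."""
    # Look up each in-vocabulary word; out-of-vocabulary words are skipped.
    vecs = [w2v_model.wv[w] for w in tokens if w in w2v_model.wv]
    # Zero-pad or truncate to a fixed sentence_length x vector_length matrix.
    mat = np.zeros((sentence_length, vector_length), dtype=np.float32)
    if vecs:
        arr = np.asarray(vecs, dtype=np.float32)[:sentence_length]
        mat[:len(arr)] = arr
    # Trailing channel axis so the result matches the (500, 100, 1) CNN input.
    return mat[..., np.newaxis]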
The CNN architecture is as follows:
import keras
from keras.layers import Input, Conv2D, MaxPooling2D, Dropout, Flatten, Dense, concatenate
from keras.models import Model
from keras.utils import plot_model

sentence_length, vector_length = 500, 100

def create_convnet(img_path='../new_out/cnn_model_word2vec.png'):
    # Each article arrives as a (sentence_length, vector_length) word2vec matrix
    # with a single channel.
    input_shape = Input(shape=(sentence_length, vector_length, 1))

    # Three parallel Conv -> MaxPool -> Dropout towers that differ only in the
    # second kernel/pool dimension (3, 4, 5).
    tower_1 = Conv2D(8, (vector_length, 3), padding='same', activation='relu')(input_shape)
    tower_1 = MaxPooling2D((1, vector_length - 3 + 1), strides=(1, 1), padding='same')(tower_1)
    tower_1 = Dropout(0.25)(tower_1)

    tower_2 = Conv2D(8, (vector_length, 4), padding='same', activation='relu')(input_shape)
    tower_2 = MaxPooling2D((1, vector_length - 4 + 1), strides=(1, 1), padding='same')(tower_2)
    tower_2 = Dropout(0.25)(tower_2)

    tower_3 = Conv2D(8, (vector_length, 5), padding='same', activation='relu')(input_shape)
    tower_3 = MaxPooling2D((1, vector_length - 5 + 1), strides=(1, 1), padding='same')(tower_3)
    tower_3 = Dropout(0.25)(tower_3)

    # Concatenate the towers, flatten, and classify with a single sigmoid unit.
    merged = concatenate([tower_1, tower_2, tower_3], axis=1)
    merged = Flatten()(merged)
    dropout1 = Dropout(0.5)(merged)
    out = Dense(1, activation='sigmoid')(dropout1)

    model = Model(input_shape, out)
    plot_model(model, to_file=img_path)
    return model

some_model = create_convnet()
some_model.compile(loss=keras.losses.binary_crossentropy,
                   optimizer='adam',
                   metrics=['accuracy'])
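Since every layer uses padding='same' with unit strides, each tower outputs a (500, 100, 8) tensor, so the axis-1 concatenation is (1500, 100, 8) and Flatten feeds 1500 * 100 * 8 = 1,200,000 values into the single Dense unit. This can be verified with:

some_model.summary()  # prints layer-by-layer output shapes and parameter counts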
The model predicts every article in the validation set as Uninteresting [0]. The accuracy is 97.44%, which is exactly the proportion of Uninteresting articles in the validation set ((664 - 17)/664 ≈ 0.9744). I have tried several variants of this architecture, but the problem persists.
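A quick way to confirm that the network has collapsed to the always-Uninteresting baseline (a sketch; x_val and y_val are assumed to hold the vectorized validation articles and their 0/1 labels):

import numpy as np

probs = some_model.predict(x_val)          # sigmoid outputs in [0, 1]
preds = (probs.ravel() > 0.5).astype(int)

print(preds.sum())               # 0 -> every validation article predicted Uninteresting
print((preds == y_val).mean())   # 647/664 ≈ 0.9744, the majority-class baseline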
As an experiment, I also ran predictions on the training data itself; there, too, the model predicts everything as Uninteresting [0]. Here is the log for 10 epochs:
some_model.fit_generator(train_gen, train_steps, epochs=num_epoch, verbose=1, callbacks=callbacks_list, validation_data=val_gen, validation_steps=val_steps)
Epoch 1/10
254/253 [==============================] - 447s 2s/step - loss: 0.7119 - acc: 0.9555 - val_loss: 0.4127 - val_acc: 0.9744
Epoch 00001: val_loss improved from inf to 0.41266, saving model to ../new_out/cnn_model_word2vec
Epoch 2/10
254/253 [==============================] - 440s 2s/step - loss: 0.7099 - acc: 0.9560 - val_loss: 0.4127 - val_acc: 0.9744
Epoch 00002: val_loss did not improve
Epoch 3/10
254/253 [==============================] - 440s 2s/step - loss: 0.7099 - acc: 0.9560 - val_loss: 0.4127 - val_acc: 0.9744
Epoch 00003: val_loss did not improve
Epoch 00003: ReduceLROnPlateau reducing learning rate to 0.00010000000474974513.
Epoch 4/10
254/253 [==============================] - 448s 2s/step - loss: 0.7099 - acc: 0.9560 - val_loss: 0.4127 - val_acc: 0.9744
Epoch 00004: val_loss did not improve
Epoch 00004: ReduceLROnPlateau reducing learning rate to 1.0000000474974514e-05.
Epoch 5/10
254/253 [==============================] - 444s 2s/step - loss: 0.7099 - acc: 0.9560 - val_loss: 0.4127 - val_acc: 0.9744
Epoch 00005: val_loss did not improve
Epoch 00005: ReduceLROnPlateau reducing learning rate to 1.0000000656873453e-06.
Epoch 6/10
254/253 [==============================] - 443s 2s/step - loss: 0.7099 - acc: 0.9560 - val_loss: 0.4127 - val_acc: 0.9744
Epoch 00006: val_loss did not improve
Epoch 00006: ReduceLROnPlateau reducing learning rate to 1.0000001111620805e-07.
Epoch 7/10
254/253 [==============================] - 443s 2s/step - loss: 0.7099 - acc: 0.9560 - val_loss: 0.4127 - val_acc: 0.9744
Epoch 00007: val_loss did not improve
Epoch 00007: ReduceLROnPlateau reducing learning rate to 1e-07.
Epoch 8/10
254/253 [==============================] - 443s 2s/step - loss: 0.7099 - acc: 0.9560 - val_loss: 0.4127 - val_acc: 0.9744
Epoch 00008: val_loss did not improve
Epoch 00008: ReduceLROnPlateau reducing learning rate to 1e-07.
Epoch 9/10
254/253 [==============================] - 444s 2s/step - loss: 0.7099 - acc: 0.9560 - val_loss: 0.4127 - val_acc: 0.9744
Epoch 00009: val_loss did not improve
Epoch 00009: ReduceLROnPlateau reducing learning rate to 1e-07.
Epoch 10/10
254/253 [==============================] - 440s 2s/step - loss: 0.7099 - acc: 0.9560 - val_loss: 0.4127 - val_acc: 0.9744
Epoch 00010: val_loss did not improve
Epoch 00010: ReduceLROnPlateau reducing learning rate to 1e-07.
Out[3]: <keras.callbacks.History at 0x7f19898b90f0>
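For completeness, the callbacks_list referenced in the fit_generator call is not shown above; the following sketch is consistent with the log (the ModelCheckpoint path and the factor-0.1, min_lr=1e-7 schedule can be read off the messages, while the batch size of 16 and the patience value are assumptions inferred from the "254/253" step count and the reduction pattern; the generator definitions are omitted):

import math
from keras.callbacks import ModelCheckpoint, ReduceLROnPlateau

batch_size = 16                             # assumption: 4053 / 16 ≈ 253.3 matches "254/253"
train_steps = math.ceil(4053 / batch_size)
val_steps = math.ceil(664 / batch_size)
num_epoch = 10

callbacks_list = [
    # "val_loss improved ... saving model to ../new_out/cnn_model_word2vec"
    ModelCheckpoint('../new_out/cnn_model_word2vec', monitor='val_loss',
                    save_best_only=True, verbose=1),
    # Factor 0.1 with a 1e-7 floor reproduces the 1e-3 -> 1e-4 -> ... -> 1e-7
    # trace above; the patience value here is an assumption.
    ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=1,
                      min_lr=1e-7, verbose=1),
]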