How to use Embedding() with 3D tensors in Keras?

data-mining python tensorflow keras rnn lstm

2021-09-26 07:16:22

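I have a list of stock price sequences with 20 timesteps each. That is a 2D array of shape (total_seq, 20). I can reshape it to (total_seq, 20, 1) so that it can be concatenated with other features.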

I also have a news headline of 10 words for each timestep. So I have a 3D array of news tokens of shape (total_seq, 20, 10), produced by Tokenizer.texts_to_sequences() and sequence.pad_sequences().
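To make the token array concrete, here is a minimal pure-Python sketch of how headlines could be turned into fixed-length id lists. It is not the Keras Tokenizer itself: `build_vocab`, `encode`, and the toy headlines are hypothetical names and data introduced only for illustration, and the front padding and front truncation are assumed to mirror the defaults of pad_sequences().

```python
from collections import Counter

def build_vocab(headlines):
    """Map each word to an integer id, most frequent first; id 0 is reserved for padding."""
    counts = Counter(word for h in headlines for word in h.lower().split())
    return {word: i + 1 for i, (word, _) in enumerate(counts.most_common())}

def encode(headline, vocab, max_words=10):
    """Turn one headline into exactly max_words ids, padded with 0 at the front."""
    ids = [vocab[w] for w in headline.lower().split() if w in vocab]
    ids = ids[-max_words:]                       # keep only the last max_words tokens
    return [0] * (max_words - len(ids)) + ids    # pad at the front

# Two toy sequences of three timesteps each (the question uses 20 timesteps).
news = [
    ["stocks rally on earnings", "fed holds rates", "oil slides"],
    ["tech shares dip", "earnings beat forecasts", "markets close higher"],
]
vocab = build_vocab([h for seq in news for h in seq])
tokens = [[encode(h, vocab) for h in seq] for seq in news]

# Nested list with dimensions (sequences, timesteps, words per headline)
print(len(tokens), len(tokens[0]), len(tokens[0][0]))  # 2 3 10
```

Wrapping `tokens` in a NumPy array would give the (total_seq, timesteps, 10) integer array described above.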

I want to concatenate the embedded news with the stock prices and make predictions.

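My idea is that the news embedding should return a tensor of shape (total_seq, 20, embed_size), so that I can concatenate it with the stock prices of shape (total_seq, 20, 1) and then feed the result to LSTM layers.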

To do that, I should use the Embedding() function to convert the news tokens of shape (total_seq, 20, 10) to (total_seq, 20, 10, embed_size).
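The shape change described here can be checked without Keras. The following is a minimal NumPy sketch, assuming that an embedding is a plain table lookup; `embedding_table` is a random stand-in for trained weights, and the vocabulary size of 50 and embedding size of 16 are illustrative values.

```python
import numpy as np

total_seq, timesteps, n_words = 4, 20, 10   # small total_seq for illustration
vocab_size, embed_size = 50, 16             # assumed sizes

rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(vocab_size, embed_size))  # one row per token id
tokens = rng.integers(0, vocab_size, size=(total_seq, timesteps, n_words))

# Integer-array indexing replaces every token id with its row of the table,
# so an axis of length embed_size is appended to the shape of the index array.
embedded = embedding_table[tokens]
print(embedded.shape)  # (4, 20, 10, 16)
```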

But in Keras, the Embedding() function takes a 2D tensor rather than a 3D tensor. How do I get around this problem?

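Suppose Embedding() did accept a 3D tensor. After getting a 4D tensor as output, I would remove the third dimension by using an LSTM that returns only the embedding of the last word, so the output of shape (total_seq, 20, 10, embed_size) would be converted to (total_seq, 20, embed_size).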

But then I would run into another problem: an LSTM accepts 3D tensors, not 4D ones. So:

How do I get around the problem that Embedding and LSTM do not accept my inputs?

1 Answer

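I am not entirely sure this is the cleanest solution, but I stitched everything together. Each of the 10 word positions gets its own input, but that should not be too much of a problem. The idea is to create a single Embedding layer and use it multiple times. First, we will generate some data: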

import numpy as np

n_samples = 1000
time_series_length = 50
news_words = 10
news_embedding_dim = 16
word_cardinality = 50

x_time_series = np.random.rand(n_samples, time_series_length, 1)
x_news_words = np.random.choice(np.arange(word_cardinality), replace=True, size=(n_samples, time_series_length, news_words))
## Split the 3D token array into one 2D array per word position
x_news_words = [x_news_words[:, :, i] for i in range(news_words)]
y = np.random.randint(2, size=(n_samples))

Now we will define the layers:

from tensorflow.keras.layers import Input, Embedding, LSTM, Dense, concatenate
from tensorflow.keras.models import Model

## Input of normal time series
time_series_input = Input(shape=(time_series_length, 1), name='time_series')

## Every word position has its own input
news_word_inputs = [Input(shape=(time_series_length,), name='news_word_' + str(i + 1)) for i in range(news_words)]

## Shared embedding layer (the sequence length is inferred from the inputs)
news_word_embedding = Embedding(word_cardinality, news_embedding_dim)

## Repeat this for every word position
news_words_embeddings = [news_word_embedding(inp) for inp in news_word_inputs]

## Concatenate the time series input and the embedding outputs
concatenated_inputs = concatenate([time_series_input] + news_words_embeddings, axis=-1)

## Feed into LSTM
lstm = LSTM(16)(concatenated_inputs)

## Output, in this case single classification
output = Dense(1, activation='sigmoid')(lstm)

## Build and compile the model (the optimizer is an example choice)
model = Model(inputs=[time_series_input] + news_word_inputs, outputs=output)
model.compile(optimizer='adam', loss='binary_crossentropy')
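To see what the LSTM receives, the shared-embedding trick can be reproduced in plain NumPy. This is only a sketch under the assumption that the embedding is a table lookup; `table` is a random stand-in for the shared layer's weights and the sample count is reduced to 8 for illustration. It also shows that using one input per word position gives the same values as embedding the full 3D token array and flattening the last two axes.

```python
import numpy as np

n_samples, time_series_length, news_words = 8, 50, 10
word_cardinality, news_embedding_dim = 50, 16

rng = np.random.default_rng(0)
table = rng.normal(size=(word_cardinality, news_embedding_dim))  # stand-in for the shared weights
x_time_series = rng.random((n_samples, time_series_length, 1))
x_news = rng.integers(0, word_cardinality, size=(n_samples, time_series_length, news_words))

# One 2D input per word position, each looked up in the same table.
per_position = [table[x_news[:, :, i]] for i in range(news_words)]
concatenated = np.concatenate([x_time_series] + per_position, axis=-1)
print(concatenated.shape)  # (8, 50, 161): 1 price feature + 10 words * 16 dims

# Equivalent view: look up the whole 3D array, then flatten the last two axes.
flattened = table[x_news].reshape(n_samples, time_series_length, -1)
print(np.array_equal(concatenated[:, :, 1:], flattened))  # True
```

Each timestep therefore carries 161 features, and the word order inside a headline is preserved only as a position within that feature vector.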

After compiling the model, we can fit it like this:

model.fit([x_time_series] + x_news_words, y)

Edit:

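Following what you mentioned in the comments, you can add a dense layer that summarizes the news, and concatenate its output with your time series (the stock prices):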

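from tensorflow.keras.layers import TimeDistributed

## Size of the summarized news vector (example value, choose your own)
combined_news_embedding = 16

## Summarize the news:
news_words_concat = concatenate(news_words_embeddings, axis=-1)
news_words_transformation = TimeDistributed(Dense(combined_news_embedding))(news_words_concat)

## New concat
concatenated_inputs = concatenate([time_series_input, news_words_transformation], axis=-1)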
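As a rough illustration of what this summarizing step computes, a Dense layer without activation applied to every timestep is an affine map on the last axis. The NumPy sketch below assumes exactly that; `W` and `b` are random stand-ins for trained weights, and the summary size of 8 is a hypothetical choice.

```python
import numpy as np

n_samples, timesteps = 8, 50
concat_dim = 10 * 16            # 10 word positions * 16 embedding dims
combined_news_embedding = 8     # hypothetical summary size

rng = np.random.default_rng(0)
news_words_concat = rng.normal(size=(n_samples, timesteps, concat_dim))
x_time_series = rng.random((n_samples, timesteps, 1))

# Random weights stand in for a trained Dense layer.
W = rng.normal(size=(concat_dim, combined_news_embedding))
b = np.zeros(combined_news_embedding)

# Matrix multiplication acts on the last axis only, so every timestep
# is transformed independently with the same W and b.
news_summary = news_words_concat @ W + b
lstm_input = np.concatenate([x_time_series, news_summary], axis=-1)
print(news_summary.shape, lstm_input.shape)  # (8, 50, 8) (8, 50, 9)
```

Compared with feeding all 160 embedding features to the LSTM, the summary keeps the per-timestep input small, at the cost of one extra learned projection.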
