I am trying to build a Keras model based on an LSTM layer that performs binary classification on sequences of images.
The input data has the shape (sample_number, timesteps, width, height, channels); a concrete example is (1200, 100, 100, 100, 3).
So it is a 5D tensor, equivalent to video data.
timesteps equals 100 -> each sample (image sequence) has 100 frames
channels equals 3 -> RGB data
Here is a minimal working example:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras import models, layers, optimizers
class TestingStuff():

    def __sequence_image_generator(self, x, y, batch_size, generator, seq_len):
        new_y = np.repeat(y, seq_len)
        helper_flow = generator.flow(x.reshape(x.shape[0] * seq_len,
                                               x.shape[2],
                                               x.shape[3],
                                               x.shape[4]),
                                     new_y,
                                     batch_size=seq_len * batch_size)
        for x_temp, y_temp in helper_flow:
            yield x_temp.reshape(x_temp.shape[0] // seq_len,
                                 seq_len,
                                 x.shape[2] * x.shape[3] * x.shape[4]), y_temp[::seq_len]

    def testStuff(self):
        batch_size = 50
        training_epochs = 60

        # Randomly generated, similar to the actual dataset
        train_samples_num = 50
        valid_samples_num = 50
        data_train = np.random.randint(0, 65536, size=(train_samples_num, 100, 100, 100, 3), dtype='uint16')
        data_valid = np.random.randint(0, 65536, size=(valid_samples_num, 100, 100, 100, 3), dtype='uint16')
        labels_train = np.random.randint(0, 2, size=(train_samples_num), dtype='uint8')
        labels_valid = np.random.randint(0, 2, size=(valid_samples_num), dtype='uint8')

        train_data_generator = ImageDataGenerator()
        valid_data_generator = ImageDataGenerator()

        num_frames_per_sample = data_train.shape[1]
        data_dimension = data_train.shape[2] * data_train.shape[3] * data_train.shape[4]  # height * width * channels
        data_train_num_samples = data_train.shape[0]
        data_valid_num_samples = data_valid.shape[0]

        train_generator = self.__sequence_image_generator(x = data_train,
                                                          y = labels_train,
                                                          batch_size = batch_size,
                                                          generator = train_data_generator,
                                                          seq_len = num_frames_per_sample)
        valid_generator = self.__sequence_image_generator(x = data_valid,
                                                          y = labels_valid,
                                                          batch_size = batch_size,
                                                          generator = valid_data_generator,
                                                          seq_len = num_frames_per_sample)

        num_units = 100
        model = models.Sequential()
        model.add(layers.LSTM(num_units, input_shape=(num_frames_per_sample, data_dimension)))
        model.add(layers.Dense(1, activation='sigmoid'))
        model.compile(optimizer=optimizers.Adam(), loss='binary_crossentropy', metrics=['acc'])
        model.summary()
        model.fit_generator(train_generator,
                            steps_per_epoch = data_train_num_samples // batch_size,
                            epochs = training_epochs,
                            validation_data = valid_generator,
                            validation_steps = data_valid_num_samples // batch_size,
                            verbose = 1)
my_class = TestingStuff()
my_class.testStuff()
This example was tested with the following versions:
python 3.6.8
keras 2.2.4
tensorflow 1.13.1
Code explanation:
- data_train has shape (50, 100, 100, 100, 3), representing 50 samples of 100 frames of 100x100 images with 3 channels. The images are 16-bit. The same applies to data_valid.
- labels_train and labels_valid are 1D tensors with the possible values 0 and 1.
- ImageDataGenerator() is intended for data augmentation purposes, but no transformations are specified in this example.
- __sequence_image_generator() is adapted from here; its purpose is to reshape the initial input data (a 5D tensor) into the input shape expected by the flow() method of ImageDataGenerator (a 4D tensor), and then into the input shape expected by the LSTM layer (a 3D tensor with shape (batch_size, timesteps, input_dim)).
- The model architecture is a starting point (to be improved), with only 1 LSTM layer and 1 Dense layer.
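The reshaping done by the generator can be traced in isolation with NumPy. The following sketch uses small made-up dimensions (4 samples, 5 frames, 8x8 RGB images) instead of the real ones, purely to show how the 5D -> 4D -> 3D shape transitions line up:

```python
import numpy as np

# Tiny stand-in dimensions, assumed for illustration only.
samples, seq_len, h, w, c = 4, 5, 8, 8, 3
x = np.random.randint(0, 65536, size=(samples, seq_len, h, w, c), dtype='uint16')
y = np.random.randint(0, 2, size=(samples,), dtype='uint8')

# Step 1: 5D -> 4D, so ImageDataGenerator.flow() sees one image per row.
x_4d = x.reshape(samples * seq_len, h, w, c)

# Each label is repeated once per frame, keeping labels aligned with images.
y_per_frame = np.repeat(y, seq_len)

# Step 2 (done inside the generator loop): a batch of frames is folded back
# into sequences and flattened to (batch, timesteps, input_dim) for the LSTM.
batch = x_4d[:2 * seq_len]                      # frames of 2 whole sequences
x_3d = batch.reshape(2, seq_len, h * w * c)
y_batch = y_per_frame[:2 * seq_len][::seq_len]  # one label per sequence

print(x_4d.shape)    # (20, 8, 8, 3)
print(x_3d.shape)    # (2, 5, 192)
print(y_batch.shape) # (2,)
```

Taking every seq_len-th label (`[::seq_len]`) undoes the earlier `np.repeat`, which is why each yielded batch ends up with exactly one label per sequence.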
The problem:
I noticed that the code runs fine with values of train_samples_num and valid_samples_num up to 50. If these variables have larger values (e.g. 1000), then memory usage becomes excessive and the whole training appears to be blocked. Training never gets past the first epoch.
I suspect the problem lies somewhere in __sequence_image_generator(), where the data generation may be inefficient. But I may be wrong.
Changing num_units or batch_size to smaller values does not solve the problem. The excessive memory usage persists even with num_units = 1 and batch_size = 1.
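A rough back-of-the-envelope calculation supports the suspicion that the problem is the full array handed to flow(), not the model size: ImageDataGenerator converts the array it receives to Keras's floatx dtype (float32 by default), so passing the whole reshaped dataset at once materializes a float32 copy of everything, regardless of num_units or batch_size. Assuming such a float32 copy:

```python
def float32_footprint_gib(samples, timesteps, height, width, channels):
    """GiB needed for a float32 array holding the whole dataset at once."""
    elements = samples * timesteps * height * width * channels
    return elements * 4 / 1024 ** 3  # 4 bytes per float32 element

# 50 samples (works) vs. 1000 samples (blocks during epoch 1):
print(round(float32_footprint_gib(50, 100, 100, 100, 3), 2))    # ~0.56 GiB
print(round(float32_footprint_gib(1000, 100, 100, 100, 3), 2))  # ~11.18 GiB
```

This would explain why 50 samples fit comfortably while 1000 do not, and why shrinking num_units and batch_size changes nothing.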
Output with train_samples_num and valid_samples_num equal to 50:
Using TensorFlow backend.
Epoch 1/60
1/1 [==============================] - 16s 16s/step - loss: 0.7258 - acc: 0.5400 - val_loss: 0.7119 - val_acc: 0.6200
Epoch 2/60
1/1 [==============================] - 18s 18s/step - loss: 0.7301 - acc: 0.4800 - val_loss: 0.7445 - val_acc: 0.4000
Epoch 3/60
1/1 [==============================] - 21s 21s/step - loss: 0.7312 - acc: 0.4200 - val_loss: 0.7411 - val_acc: 0.4200
(...training continues...)
Output with train_samples_num and valid_samples_num equal to 1000:
Using TensorFlow backend.
Epoch 1/60
(...never finishes training the 1st epoch and memory usage grows until a MemoryError occurs...)
The question:
How can I modify my code to prevent this excessive memory usage when I use larger numbers of samples?
My data has about 5000 samples in the training dataset, with fewer in the validation and test datasets.