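I am trying to build a 1D variational autoencoder that takes a 931x1 vector as input, but I am running into two problems:
- Getting an output size of 931, since max pooling and upsampling produce even sizes
- Getting the layer sizes right
This is what I have so far. Before training, I added zero padding to both sides of the input array (which is why you see the input as h+2, i.e. 931+2 = 933), and then cropped the decoder output to get an output size of 933. Using a 931 input produces a 928 output, and I am not sure what the best way is to get 931 from there without cropping.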
input_sig = Input(batch_shape=(w,h+2, 1))
x = Conv1D(8,3, activation='relu', padding='same',dilation_rate=2)(input_sig)
# x = ZeroPadding1D((2,1))(x)
x1 = MaxPooling1D(2)(x)
x2 = Conv1D(4,3, activation='relu', padding='same',dilation_rate=2)(x1)
x3 = MaxPooling1D(2)(x2)
x4 = AveragePooling1D()(x3)
flat = Flatten()(x4)
encoder = Dense(2)(flat)
x = encoder
z_mean = Dense(latent_dim, name="z_mean")(x)
z_log_var = Dense(latent_dim, name="z_log_var")(x)
z = Sampling()([z_mean, z_log_var])
encoder = Model(input_sig, [z_mean, z_log_var, z], name="encoder")
encoder.summary()
latent_inputs = keras.Input(shape=(latent_dim,))
# d1 = Dense(464)(latent_inputs)
d1 = Dense(468)(latent_inputs)
# d2 = Reshape((117,4))(d1)
d2 = Reshape((117,4))(d1)
d3 = Conv1D(4,1,strides=1, activation='relu', padding='same')(d2)
d4 = UpSampling1D(2)(d3)
d5 = Conv1D(8,1,strides=1, activation='relu', padding='same')(d4)
d6 = UpSampling1D(2)(d5)
d7 = UpSampling1D(2)(d6)
d8 = Conv1D(1,1, strides=1, activation='sigmoid', padding='same')(d7)
decoded = Cropping1D(cropping=(1,2))(d8) # this is the added step
decoder = Model(latent_inputs, decoded, name="decoder")
decoder.summary()
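To make the size problem concrete, the length bookkeeping for the layers above can be sketched in plain Python. This is only a sketch: the helper names are made up for illustration, and it assumes each pooling layer uses pool size 2 with the default 'valid' padding (so the length is floor-divided by 2), each UpSampling1D(2) doubles the length, and the 'same'-padded convolutions leave it unchanged.

```python
def pooled_length(n, pool_size=2):
    """Length after a 'valid' pooling layer whose stride equals pool_size."""
    return n // pool_size


def upsampled_length(n, size=2):
    """Length after an upsampling layer that repeats each step `size` times."""
    return n * size


def encoder_length(n, num_pools=3):
    """Apply the three pooling steps (two max pools, one average pool)."""
    for _ in range(num_pools):
        n = pooled_length(n)
    return n


def decoder_length(n, num_upsamples=3):
    """Apply the three upsampling steps of the decoder."""
    for _ in range(num_upsamples):
        n = upsampled_length(n)
    return n


for input_len in (931, 933, 936):
    bottleneck = encoder_length(input_len)
    print(input_len, "->", bottleneck, "->", decoder_length(bottleneck))
# 931 -> 116 -> 928
# 933 -> 116 -> 928
# 936 -> 117 -> 936
```

Three halvings followed by three doublings can only return a multiple of 8, and 931 is not one, which is where the 928 comes from. One option (an untested assumption, not something verified against this model) would be to pad the input to 936 = 8 x 117 so that encoder and decoder lengths match, and then crop 5 steps from the output.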
Here is the printed summary:
Model: "encoder"
__________________________________________________________________________________________________
Layer (type) Output Shape Param # Connected to
==================================================================================================
input_99 (InputLayer) [(1, 933, 1)] 0
__________________________________________________________________________________________________
conv1d_209 (Conv1D) (1, 933, 8) 32 input_99[0][0]
__________________________________________________________________________________________________
max_pooling1d_90 (MaxPooling1D) (1, 466, 8) 0 conv1d_209[0][0]
__________________________________________________________________________________________________
conv1d_210 (Conv1D) (1, 466, 4) 100 max_pooling1d_90[0][0]
__________________________________________________________________________________________________
max_pooling1d_91 (MaxPooling1D) (1, 233, 4) 0 conv1d_210[0][0]
__________________________________________________________________________________________________
average_pooling1d_45 (AveragePo (1, 116, 4) 0 max_pooling1d_91[0][0]
__________________________________________________________________________________________________
flatten_45 (Flatten) (1, 464) 0 average_pooling1d_45[0][0]
__________________________________________________________________________________________________
dense_89 (Dense) (1, 2) 930 flatten_45[0][0]
__________________________________________________________________________________________________
z_mean (Dense) (1, 2) 6 dense_89[0][0]
__________________________________________________________________________________________________
z_log_var (Dense) (1, 2) 6 dense_89[0][0]
__________________________________________________________________________________________________
sampling_45 (Sampling) (1, 2) 0 z_mean[0][0]
z_log_var[0][0]
==================================================================================================
Total params: 1,074
Trainable params: 1,074
Non-trainable params: 0
__________________________________________________________________________________________________
Model: "decoder"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input_100 (InputLayer) [(None, 2)] 0
_________________________________________________________________
dense_90 (Dense) (None, 468) 1404
_________________________________________________________________
reshape_44 (Reshape) (None, 117, 4) 0
_________________________________________________________________
conv1d_211 (Conv1D) (None, 117, 4) 20
_________________________________________________________________
up_sampling1d_117 (UpSamplin (None, 234, 4) 0
_________________________________________________________________
conv1d_212 (Conv1D) (None, 234, 8) 40
_________________________________________________________________
up_sampling1d_118 (UpSamplin (None, 468, 8) 0
_________________________________________________________________
up_sampling1d_119 (UpSamplin (None, 936, 8) 0
_________________________________________________________________
conv1d_213 (Conv1D) (None, 936, 1) 9
_________________________________________________________________
cropping1d_18 (Cropping1D) (None, 933, 1) 0
=================================================================
Total params: 1,473
Trainable params: 1,473
Non-trainable params: 0
______________________________
However, when I try to fit my model, I get the following exception:
ValueError: Invalid reduction dimension 2 for input with 2 dimensions. for '{{node Sum}} = Sum[T=DT_FLOAT, Tidx=DT_INT32, keep_dims=false](Mean, Sum/reduction_indices)' with input shapes: [1,933], [2] and with computed input tensors: input[1] = <1 2>.
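One possible reading of this message, assuming the training step still uses the reconstruction loss from the Keras MNIST VAE example (a sum over axis=(1, 2) of a binary cross-entropy that has already averaged over the last axis): for images of shape (batch, 28, 28, 1) the averaged loss has three dimensions, but for a signal of shape (batch, 933, 1) it has only two, so axis 2 no longer exists. The sketch below only traces shapes in plain Python; `reduced_shape` is a hypothetical helper, not a Keras or TensorFlow function.

```python
def reduced_shape(shape, axes):
    """Return the shape left after reducing over `axes`.

    Raises ValueError when an axis does not exist, similar in spirit to
    the error a tensor library reports for an invalid reduction axis.
    """
    ndim = len(shape)
    for axis in axes:
        if not -ndim <= axis < ndim:
            raise ValueError(
                f"Invalid reduction dimension {axis} "
                f"for input with {ndim} dimensions"
            )
    dropped = {axis % ndim for axis in axes}
    return tuple(d for i, d in enumerate(shape) if i not in dropped)


# MNIST case: the mean over the last axis leaves (batch, 28, 28),
# so a sum over axes (1, 2) is valid and leaves one value per sample.
mnist_per_pixel = reduced_shape((1, 28, 28, 1), axes=(-1,))
mnist_per_sample = reduced_shape(mnist_per_pixel, axes=(1, 2))

# 1D case: the mean over the last axis leaves (batch, 933),
# which matches the [1,933] input shape in the error message.
signal_per_step = reduced_shape((1, 933, 1), axes=(-1,))
try:
    reduced_shape(signal_per_step, axes=(1, 2))
except ValueError as err:
    print(err)

# Summing over axis 1 only is valid for the 2D loss tensor.
signal_per_sample = reduced_shape(signal_per_step, axes=(1,))
```

If that assumption holds, changing the sum in the loss to run over axis 1 only would be the corresponding adjustment for 1D data.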
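Has anyone run into this error, or can anyone see what I am doing wrong in building the model? I am new to this and not sure where the mistake is.
Note that I adapted this from the 28x28 MNIST VAE example in the Keras documentation.
Thanks in advance.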