Autoencoders for time-series compression

data-mining keras time-series lstm convolution autoencoder
2021-09-25 21:28:42

I am trying to use autoencoders (simple, convolutional, LSTM) to compress time series.

Here are the models I have tried.

Simple autoencoder:

    from keras.layers import Input, Dense
    from keras.models import Model
    import keras

    # this is the size of our encoded representations
    encoding_dim = 50

    # this is our input placeholder
    input_ts = Input(shape=(2100,))
    # "encoded" is the encoded representation of the input
    encoded = Dense(encoding_dim, activation='relu')(input_ts) #, activity_regularizer=regularizers.l2(10e-5)
    # "decoded" is the lossy reconstruction of the input
    decoded = Dense(2100, activation='tanh')(encoded)

    # this model maps an input to its reconstruction
    autoencoder = Model(input_ts, decoded)

    # this model maps an input to its encoded representation
    encoder = Model(input_ts, encoded)

    # create a placeholder for an encoded (%encoding_dim%-dimensional) input
    encoded_input = Input(shape=(encoding_dim,))
    # retrieve the last layer of the autoencoder model
    decoder_layer = autoencoder.layers[-1]
    # create the decoder model
    decoder = Model(encoded_input, decoder_layer(encoded_input))

    autoencoder.summary()

    adamax = keras.optimizers.Adamax(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.01)
    autoencoder.compile(optimizer=adamax, loss='mean_absolute_percentage_error')
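
The post does not show the training call; for completeness, here is a minimal sketch assuming the series are stacked into a 2-D array `X_train` of shape `(n_samples, 2100)`, already scaled into [-1, 1] to match the tanh output (the array name and the preprocessing are my assumptions):

    # train the autoencoder to reconstruct its own input
    autoencoder.fit(X_train, X_train, epochs=100, batch_size=32, validation_split=0.1)

    compressed = encoder.predict(X_train)        # shape (n_samples, 50)
    reconstructed = decoder.predict(compressed)  # shape (n_samples, 2100)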

Convolutional autoencoder:

    from keras.layers import Input, Dense, Conv1D, MaxPooling1D, UpSampling1D
    from keras.models import Model

    window_length = 518

    input_ts = Input(shape=(window_length,1))

    x = Conv1D(32, 3, activation="relu", padding="valid")(input_ts)
    x = MaxPooling1D(2, padding="valid")(x)

    x = Conv1D(1, 3, activation="relu", padding="valid")(x)

    encoded = MaxPooling1D(2, padding="valid")(x)

    encoder = Model(input_ts, encoded)

    x = Conv1D(16, 3, activation="relu", padding="valid")(encoded)
    x = UpSampling1D(2)(x) 

    x = Conv1D(32, 3, activation='relu', padding="valid")(x)
    x = UpSampling1D(2)(x)

    decoded = Conv1D(1, 1, activation='tanh', padding='valid')(x)

    convolutional_autoencoder = Model(input_ts, decoded)

    convolutional_autoencoder.summary()

    optimizer = "nadam"
    loss = "mean_absolute_error"

    convolutional_autoencoder.compile(optimizer=optimizer, loss=loss)
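
Note that with "valid" padding the reconstruction comes out 500 timesteps long, not 518: each Conv1D with kernel size 3 trims 2 steps, and each pooling halves the length (516 → 258 → 256 → 128 on the way down, 126 → 252 → 250 → 500 on the way up). Fitting X to X directly would therefore raise a shape error. One way to train it is to crop 9 steps from each end of the targets; a sketch, assuming `X_train` has shape `(n_samples, 518, 1)`:

    # the decoder output is 500 steps long, so crop the targets to match:
    # 518 - 500 = 18 steps, i.e. 9 from each end
    X_target = X_train[:, 9:-9, :]
    convolutional_autoencoder.fit(X_train, X_target, epochs=100, batch_size=32)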

LSTM autoencoder:

    from keras.layers import Input, LSTM, RepeatVector
    from keras.models import Model

    inputs = Input(shape=(1, 500))
    encoded = LSTM(128)(inputs)

    decoded = RepeatVector(1)(encoded)

    decoded = LSTM(500, return_sequences=True)(decoded)

    sequence_autoencoder = Model(inputs, decoded)
    encoder = Model(inputs, encoded)

    sequence_autoencoder.summary()

    sequence_autoencoder.compile(optimizer='nadam', loss='mean_absolute_error')
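
With `Input(shape=(1, 500))` the whole series is fed to the LSTM as a single timestep of 500 features, so the recurrence never actually unrolls over time. A sketch of the more conventional framing, treating each series as 500 timesteps of one feature (my assumption, not the setup the post reports results for):

    from keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense
    from keras.models import Model

    timesteps, n_features = 500, 1

    inputs = Input(shape=(timesteps, n_features))
    encoded = LSTM(128)(inputs)                  # (batch, 128) latent code

    decoded = RepeatVector(timesteps)(encoded)   # repeat the code once per timestep
    decoded = LSTM(128, return_sequences=True)(decoded)
    decoded = TimeDistributed(Dense(n_features, activation='tanh'))(decoded)

    sequence_autoencoder = Model(inputs, decoded)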

To check the compression loss, I use the SMAPE formula.
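
For reference, a common SMAPE definition in code (the post does not spell out which variant is used, so this is an assumption):

    import numpy as np

    def smape(actual, predicted):
        # symmetric mean absolute percentage error, in percent
        denom = (np.abs(actual) + np.abs(predicted)) / 2.0
        return 100.0 * np.mean(np.abs(predicted - actual) / denom)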

These are the results I got: an average loss of 14.28% for the simple autoencoder, 8.04% for the convolutional autoencoder, and 9.25% for the LSTM autoencoder.

My questions: is it viable at all to compress time series lossily with neural networks, given that compression time does not matter? Should I perhaps look at other methods? And how can I achieve better compression?

Thanks!

1 Answer

Most importantly, you should pay attention to the time-series nature of the data. That said, AEs are used extensively for time series, especially LSTM+AE. Have you varied the topology? The CAE looks reasonable, but the other models seem to be missing some layers, no? Some general advice: standardize the input, experiment with the activation functions (tanh worked well for me in the output layer), and vary the number of neurons per layer and the number of layers overall. Could you provide the head() of your input data? An AE is expected to fit X to X; maybe you missed that?
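
For instance, one way to preprocess is to min-max scale each series into [-1, 1] so a tanh output layer can actually reach the targets; a sketch, one possible reading of the "standardize the input" advice (assumes no constant series):

    import numpy as np

    def scale_per_series(X):
        # scale each series (row) independently into [-1, 1]
        mn = X.min(axis=1, keepdims=True)
        mx = X.max(axis=1, keepdims=True)
        return 2.0 * (X - mn) / (mx - mn) - 1.0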