Why does the TFLearn DCGAN not run on my GPU (on Windows)?

data-mining machine-learning python tensorflow tflearn
2022-02-26 19:30:45

I took the TFLearn DCGAN example code and put it into my local Jupyter environment. Apart from changing a few comments, I added with tf.device('/gpu:0'): before the call to gan.fit(...), giving the following code:

# coding: utf-8

# In[1]:


get_ipython().magic('matplotlib inline')
from __future__ import division, print_function, absolute_import

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tflearn


# In[2]:


# Data loading and preprocessing
import tflearn.datasets.mnist as mnist
X, Y, testX, testY = mnist.load_data()
X = np.reshape(X, newshape=[-1, 28, 28, 1])


# In[3]:


# Noise data input
z_dim = 200
total_samples = len(X)


# In[4]:


# Generator
def generator(x, reuse=False):
    with tf.variable_scope('Generator', reuse=reuse):
        x = tflearn.fully_connected(x, n_units=7 * 7 * 128)
        x = tflearn.batch_normalization(x)
        x = tf.nn.tanh(x)
        x = tf.reshape(x, shape=[-1, 7, 7, 128])
        x = tflearn.upsample_2d(x, 2)
        x = tflearn.conv_2d(x, 64, 5, activation='tanh')
        x = tflearn.upsample_2d(x, 2)
        x = tflearn.conv_2d(x, 1, 5, activation='sigmoid')
        return x


# In[5]:


# Discriminator
def discriminator(x, reuse=False):
    with tf.variable_scope('Discriminator', reuse=reuse):
        x = tflearn.conv_2d(x, 64, 5, activation='tanh')
        x = tflearn.avg_pool_2d(x, 2)
        x = tflearn.conv_2d(x, 128, 5, activation='tanh')
        x = tflearn.avg_pool_2d(x, 2)
        x = tflearn.fully_connected(x, 1028, activation='tanh')
        x = tflearn.fully_connected(x, 2)
        x = tf.nn.softmax(x)
        return x


# In[6]:


# Input data
gen_input = tflearn.input_data(shape=[None, z_dim], name='input_gen_noise')
input_disc_noise = tflearn.input_data(shape=[None, z_dim], name='input_disc_noise')
input_disc_real = tflearn.input_data(shape=[None, 28, 28, 1], name='input_disc_real')


# In[7]:


# Build discriminator
disc_fake = discriminator(generator(input_disc_noise))
disc_real = discriminator(input_disc_real, reuse=True)
disc_net = tf.concat([disc_fake, disc_real], axis=0)


# In[8]:


# Build stacked Generator/Discriminator
gen_net = generator(gen_input, reuse=True)
stacked_gan_net = discriminator(gen_net, reuse=True)


# In[9]:


# Build training ops for Discriminator
# Each network optimization should only update its own variable, thus we need
# to retrieve each network variable (with get_layer_variables_by_name)
disc_vars = tflearn.get_layer_variables_by_name('Discriminator')
# We need 2 target placeholders, for both the real and fake image target
disc_target = tflearn.multi_target_data(['target_disc_fake', 'target_disc_real'],
                                        shape=[None, 2])
disc_model = tflearn.regression(disc_net, optimizer='adam',
                                placeholder=disc_target,
                                loss='categorical_crossentropy',
                                trainable_vars=disc_vars,
                                batch_size=64, name='target_disc',
                                op_name='DISC')


# In[10]:


# Build training ops for Generator
gen_vars = tflearn.get_layer_variables_by_name('Generator')
gan_model = tflearn.regression(stacked_gan_net, optimizer='adam',
                               loss='categorical_crossentropy',
                               trainable_vars=gen_vars,
                               batch_size=64, name='target_gen',
                               op_name='GEN')


# In[11]:


# Define GAN model, that outputs the generated images
gan = tflearn.DNN(gan_model, tensorboard_verbose=3)


# In[12]:


# Training
# Prepare input data to feed to the discriminator
disc_noise = np.random.uniform(-1., 1., size=[total_samples, z_dim])
# Prepare target data to feed to the discriminator (0: fake image, 1: real image)
y_disc_fake = np.zeros(shape=[total_samples])
y_disc_real = np.ones(shape=[total_samples])
y_disc_fake = tflearn.data_utils.to_categorical(y_disc_fake, 2)
y_disc_real = tflearn.data_utils.to_categorical(y_disc_real, 2)


# In[13]:


# Prepare input data to feed to the stacked Generator/Discriminator
gen_noise = np.random.uniform(-1., 1., size=[total_samples, z_dim])
# Prepare target data to feed to the Discriminator
# The Generator tries to fool the Discriminator, thus target is 1 (real images)
y_gen = np.ones(shape=[total_samples])
y_gen = tflearn.data_utils.to_categorical(y_gen, 2)


# In[14]:


# Start training, feed both noise and real images
with tf.device('/gpu:0'):
    gan.fit(X_inputs={'input_gen_noise': gen_noise,
                      'input_disc_noise': disc_noise,
                      'input_disc_real': X},
            Y_targets={'target_gen': y_gen,
                       'target_disc_fake': y_disc_fake,
                       'target_disc_real': y_disc_real},
            n_epoch=10)


# In[15]:


# Create another model from the Generator graph to generate some samples
# for testing (re-using the same session to re-use the weights learnt)
gen = tflearn.DNN(gen_net, session=gan.session)


# In[16]:


f, a = plt.subplots(4, 10, figsize=(10, 4))
for i in range(10):
    # Noise input
    z = np.random.uniform(-1., 1., size=[4, z_dim])
    g = np.array(gen.predict({'input_gen_noise': z}))
    for j in range(4):
        # Generate image from noise. Extend to 3 channels for matplot figure.
        img = np.reshape(np.repeat(g[j][:, :, np.newaxis], 3, axis=2),
                         newshape=(28, 28, 3))
        a[j][i].imshow(img)

f.show()
plt.draw()

I want to run this code on my NVIDIA GPU. CUDA and cuDNN are already installed on my machine. Watching the Windows Task Manager during training, I see that my CPU is under heavy load while my GPU stays idle.

[Screenshot: Windows Task Manager during training]

Can anyone advise on how to apply with tf.device('/gpu:0'): correctly, since the code above clearly does not run on my NVIDIA GPU?

1 Answer

The part of the code that needs to be inside the with tf.device('/gpu:0'): context is the actual computation graph in which the neural network lives. Every parameter that should be updated on the GPU has to be defined inside it. I don't know tflearn very well, but I assume it should work if you move the context manager so that it wraps the code from In[6] through In[12].
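Roughly, the change I have in mind looks like the sketch below (untested; I'm assuming the variables that the tflearn layers create under the hood respect the usual TF 1.x tf.device placement):

with tf.device('/gpu:0'):
    # In[6]: input placeholders
    gen_input = tflearn.input_data(shape=[None, z_dim], name='input_gen_noise')
    input_disc_noise = tflearn.input_data(shape=[None, z_dim], name='input_disc_noise')
    input_disc_real = tflearn.input_data(shape=[None, 28, 28, 1], name='input_disc_real')

    # In[7]-In[8]: discriminator and stacked generator/discriminator
    disc_fake = discriminator(generator(input_disc_noise))
    disc_real = discriminator(input_disc_real, reuse=True)
    disc_net = tf.concat([disc_fake, disc_real], axis=0)
    gen_net = generator(gen_input, reuse=True)
    stacked_gan_net = discriminator(gen_net, reuse=True)

    # In[9]-In[11]: the two regression ops and the DNN wrapper, exactly as in
    # the question, just moved inside the device context
    disc_vars = tflearn.get_layer_variables_by_name('Discriminator')
    disc_target = tflearn.multi_target_data(['target_disc_fake', 'target_disc_real'],
                                            shape=[None, 2])
    disc_model = tflearn.regression(disc_net, optimizer='adam',
                                    placeholder=disc_target,
                                    loss='categorical_crossentropy',
                                    trainable_vars=disc_vars,
                                    batch_size=64, name='target_disc',
                                    op_name='DISC')
    gen_vars = tflearn.get_layer_variables_by_name('Generator')
    gan_model = tflearn.regression(stacked_gan_net, optimizer='adam',
                                   loss='categorical_crossentropy',
                                   trainable_vars=gen_vars,
                                   batch_size=64, name='target_gen',
                                   op_name='GEN')
    gan = tflearn.DNN(gan_model, tensorboard_verbose=3)

# gan.fit(...) can then be called exactly as before, without any device context.

One caveat: not every op has a GPU kernel, so pinning everything to /gpu:0 can raise a placement error; allowing soft placement (allow_soft_placement=True in the session config) lets TensorFlow fall back to the CPU for those ops.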

What happens when you define variables (weights) and operations in the graph is that you create them, but they also have to be placed somewhere and a reference to them kept. Normally this happens in your RAM, and the variable itself is essentially an address into that RAM. In TensorFlow's case two things are different. First, you sometimes want to put some of the graph, or all of it, on your GPU, because the GPU has inherent advantages for the kind of computations TensorFlow performs. These context managers are how you declare where things should be placed.
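A small self-contained illustration in plain TF 1.x (nothing tflearn-specific): the device an op or variable lives on is fixed when it is created inside a tf.device block, and log_device_placement shows where everything ended up.

import tensorflow as tf

with tf.device('/cpu:0'):
    a = tf.Variable(tf.random_normal([1000, 1000]), name='a_on_cpu')

with tf.device('/gpu:0'):
    b = tf.Variable(tf.random_normal([1000, 1000]), name='b_on_gpu')
    c = tf.matmul(a, b)  # pinned to the GPU at graph-construction time

config = tf.ConfigProto(log_device_placement=True, allow_soft_placement=True)
with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(c)  # where run() is called does not matter; placement was decided above

Running this prints one placement line per op, which is a quick way to verify whether your graph actually landed on the GPU.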

The fit inside your context manager only calls training ops that have already been defined, and those were placed on your CPU.

Doing it this way may seem a bit verbose, but it buys you a lot of flexibility. You can distribute the graph across several GPUs, or run part of the computation on the CPU because it is faster there, or because you don't have enough GPU RAM to hold the whole graph.
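For example (again just a TF 1.x sketch, with made-up names and sizes), you might keep a large lookup table in host RAM because it would not fit on the GPU, while the dense math stays on the GPU:

import tensorflow as tf

with tf.device('/cpu:0'):
    # Large embedding table kept in host RAM
    embeddings = tf.Variable(tf.random_uniform([1000000, 128]), name='embeddings')
    ids = tf.placeholder(tf.int32, shape=[None], name='ids')
    looked_up = tf.nn.embedding_lookup(embeddings, ids)

with tf.device('/gpu:0'):
    # Matmul-heavy part runs on the GPU
    weights = tf.Variable(tf.random_normal([128, 10]), name='dense_weights')
    logits = tf.matmul(looked_up, weights)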