DCGAN Handwritten Digit Generation with TensorFlow 2.0

I recently found some spare time to catch up on the TensorFlow 2.0 implementations of a few GANs that I had been putting off.

First, a GAN (Generative Adversarial Network) consists of a generator (Generator) and a discriminator (Discriminator). The two are opponents in a zero-sum game: one side's gain is the other's loss.

Its loss function is usually defined as follows:
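$$\min_G \max_D V(D, G) = \mathbb{E}_{x\sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z\sim p_z(z)}[\log(1 - D(G(z)))]$$

(This is the standard minimax objective from the original GAN paper.)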

The discriminator D should output a high probability, as close to 1 as possible, for samples $x\sim p_{data}$ drawn from the real dataset, and a low probability, as close to 0 as possible, for samples generated from noise $z\sim p_z$. Since this is a zero-sum game, the generator G has exactly the opposite goal: it wants its samples to be indistinguishable from real ones so D cannot tell them apart. The two sides take turns improving against each other until they settle into an equilibrium.

Of course, this is not the only possible loss function. For example, the generator's loss can instead be defined as the KL divergence between the distribution of generated samples and the real data distribution.

KL divergence measures how similar two distributions are.
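For discrete distributions $P$ and $Q$, it is defined as $D_{KL}(P\,\|\,Q) = \sum_x P(x)\log\frac{P(x)}{Q(x)}$, which is non-negative and equals 0 exactly when the two distributions coincide.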

Next, we will use the familiar MNIST dataset to implement a simple DCGAN (Deep Convolutional Generative Adversarial Networks) model, proposed in this paper. It is essentially a GAN whose generator and discriminator are implemented with deep convolutional networks. Without further ado, on to the code:

1. Loading the data

The MNIST dataset comes in several formats; here we use the npz format, which numpy can parse directly. We concatenate the training and test sets and feed them to the model together. The labels are not needed, so we simply discard them.

import numpy as np

data = np.load("dataset/mnist.npz")
train_images = np.concatenate((data["x_train"], data["x_test"]), axis=0)  # [70000, 28, 28]
# add a channel dimension: [70000, 28, 28, 1]
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype("float32")
train_images = (train_images - 127.5) / 127.5  # normalize pixel values to [-1, 1]
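Before training, the images still need to be shuffled and batched. The full code at the end of this post does this with tf.data; the batch size of 256 matches the noise batch drawn in train_step later, and drop_remainder=True keeps every batch the same size:

train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(train_images.shape[0]).batch(batch_size=256, drop_remainder=True)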
2. Defining the generator

The generator's job is to transform a 100-dimensional random noise vector into a 28×28 grayscale digit image. It uses one fully connected layer followed by three transposed convolution (deconvolution) layers.

For a detailed walkthrough of transposed convolution, see this Zhihu post. In essence, the input is first padded according to the stride (zeros are inserted between elements and along the borders, including the left and top of the matrix), and then an ordinary stride-1 convolution is applied. A transposed convolution only restores the spatial size, not the original values.
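As a quick sanity check of the upsampling (a minimal sketch, not part of the training script): with padding='same' and a stride of 2, Conv2DTranspose exactly doubles the spatial size, which is how the generator goes 7 → 14 → 28.

import tensorflow as tf

x = tf.random.normal([1, 7, 7, 1])  # a dummy 7x7 single-channel feature map
up = tf.keras.layers.Conv2DTranspose(filters=1, kernel_size=(5, 5), strides=(2, 2), padding='same')
print(up(x).shape)  # (1, 14, 14, 1)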

def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())
    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)

    model.add(layers.Conv2DTranspose(filters=128, kernel_size=(5, 5), strides=(1, 1), padding='same', use_bias=False))  # transposed convolution layer
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(filters=64, kernel_size=(5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(filters=1, kernel_size=(5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)
    return model
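As a quick shape check (a throwaway sketch, not part of the training script), we can feed the untrained generator a single noise vector:

g = make_generator_model()
sample_noise = tf.random.normal([1, 100])
sample_image = g(sample_noise, training=False)
print(sample_image.shape)  # (1, 28, 28, 1); the tanh keeps pixel values in [-1, 1]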
3. Defining the discriminator

The discriminator's job is to output a high probability for samples from the real dataset and a low one for samples from the generator. It consists of two convolution layers and one fully connected layer.

def make_discriminator_model():
    model = tf.keras.Sequential()
    model.add(layers.Conv2D(filters=64, kernel_size=(5, 5), strides=(2, 2), padding='same', input_shape=[28, 28, 1]))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(rate=0.3))

    model.add(layers.Conv2D(filters=128, kernel_size=(5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(rate=0.3))

    model.add(layers.Flatten())
    model.add(layers.Dense(1))  # a single logit: large for real images, small for fakes
    return model
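Continuing the sketch above, the untrained discriminator returns one raw logit per image; the from_logits=True loss defined below expects exactly this:

d = make_discriminator_model()
decision = d(sample_image, training=False)
print(decision.shape)  # (1, 1): a single unbounded logit per input image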
4. Defining the generator and discriminator losses

The generator's loss takes the cross entropy between the discriminator's output on fake_output and an all-ones tensor, which is exactly the opposite of what the discriminator wants.

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

# discriminator loss
def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

# generator loss
def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)
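A quick numeric check of generator_loss with made-up logits: when the discriminator is fooled (a large positive logit on a fake), the generator's loss is almost zero; when the fake is caught, the loss is large.

fooled = tf.constant([[10.0]])   # discriminator is confident the fake is real
caught = tf.constant([[-10.0]])  # discriminator is confident the fake is fake
print(generator_loss(fooled).numpy())  # ~4.5e-05
print(generator_loss(caught).numpy())  # ~10.0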
5. Defining the optimizers and the training step

During training, each step draws a random noise matrix of shape [256, 100] and feeds it to the generator to produce fake images.

# instantiate the generator and discriminator
generator = make_generator_model()
discriminator = make_discriminator_model()
# define the optimizers for the discriminator and the generator
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

# @tf.function compiles the training step into a static graph, which speeds up training
@tf.function
def train_step(images):
    noise = tf.random.normal([256, 100])
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)

        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)

    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))
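Before launching the full loop, a single step can be smoke-tested on one batch (a sketch; train_dataset is the pipeline built in step 1):

for batch in train_dataset.take(1):
    train_step(batch)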
6. Training

checkpoint defines the objects to save: the generator, the discriminator, and each of their optimizers, saved once every 15 epochs.

seed = tf.random.normal([16, 100])  # fixed noise for visualizing the generator's progress
# define the objects to checkpoint during training
checkpoint = tf.train.Checkpoint(generator_optimizer=generator_optimizer, discriminator_optimizer=discriminator_optimizer, generator=generator, discriminator=discriminator)

def train(dataset, epochs):
    for epoch in range(epochs):
        start = time.time()
        for image_batch in dataset:
            train_step(image_batch)
        generate_and_save_images(generator, epoch+1, seed)
        if (epoch+1) % 15 == 0:
            checkpoint.save(file_prefix="training_checkpoints/dcgan")
        print("Time for epoch", epoch+1, "is:", time.time() - start)

At the end of every epoch, we use a fixed seed (of shape [16, 100]) to inspect the digits the generator produces. The visualization function, defined below, simply feeds the seed to the generator and plots the resulting digit images in a 4×4 grid:

def generate_and_save_images(model, epoch, test_input):
    predictions = model(test_input, training=False)
    fig = plt.figure(figsize=(4, 4))
    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i+1)
        plt.imshow(predictions[i, :, :, 0]*127.5 + 127.5, cmap='gray')  # map [-1, 1] back to [0, 255]
        plt.axis('off')
    plt.savefig("dcgan_image_save/epoch_" + str(epoch) + ".png")
    plt.show()
7. Results

The digits generated after each training epoch looked like this:

To get a dynamic feel for the training process, we use the following function to stitch the per-epoch images into a gif.

def gif_animation_generate():
    gif_name = "dcgan_gif.gif"
    filenames = []
    for i in range(1, 61):  # one image per epoch, 60 epochs in total
        filenames.append("dcgan_image_save/epoch_" + str(i) + ".png")
    frames = []
    for filename in filenames:
        im = imageio.imread(filename)
        frames.append(im)
    imageio.mimsave(gif_name, frames, "GIF", duration=0.1)  # save as an animated gif (imageio v2 API)

The final animation:

The full code:

# _*_ coding: utf-8 _*_
"""
@author: Jibao Wang
@time: 2019/12/23 19:23
Generating MNIST handwritten digits with a deep convolutional GAN (DCGAN)
"""

import tensorflow as tf
import imageio, os, time
import matplotlib.pyplot as plt
from tensorflow.keras import layers

# load the training data
(train_images, train_labels), (_, _) = tf.keras.datasets.mnist.load_data()  # numpy arrays, [60000, 28, 28] and [60000]
train_images = train_images.reshape(train_images.shape[0], 28, 28, 1).astype("float32")  # add a channel dimension
train_images = (train_images - 127.5) / 127.5  # normalize pixel values to [-1, 1]
# shuffle and batch: every batch is [256, 28, 28, 1]; the leftover 96 images are dropped by drop_remainder=True
train_dataset = tf.data.Dataset.from_tensor_slices(train_images).shuffle(train_images.shape[0]).batch(batch_size=256, drop_remainder=True)

# build the models
# generator
def make_generator_model():
    model = tf.keras.Sequential()
    model.add(layers.Dense(7*7*256, use_bias=False, input_shape=(100,)))
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())
    model.add(layers.Reshape((7, 7, 256)))
    assert model.output_shape == (None, 7, 7, 256)

    model.add(layers.Conv2DTranspose(filters=128, kernel_size=(5, 5), strides=(1, 1), padding='same', use_bias=False))  # transposed convolution layer
    assert model.output_shape == (None, 7, 7, 128)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(filters=64, kernel_size=(5, 5), strides=(2, 2), padding='same', use_bias=False))
    assert model.output_shape == (None, 14, 14, 64)
    model.add(layers.BatchNormalization())
    model.add(layers.LeakyReLU())

    model.add(layers.Conv2DTranspose(filters=1, kernel_size=(5, 5), strides=(2, 2), padding='same', use_bias=False, activation='tanh'))
    assert model.output_shape == (None, 28, 28, 1)
    return model

# discriminator
def make_discriminator_model():
    model = tf.keras.Sequential()
    model.add(layers.Conv2D(filters=64, kernel_size=(5, 5), strides=(2, 2), padding='same', input_shape=[28, 28, 1]))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(rate=0.3))

    model.add(layers.Conv2D(filters=128, kernel_size=(5, 5), strides=(2, 2), padding='same'))
    model.add(layers.LeakyReLU())
    model.add(layers.Dropout(rate=0.3))

    model.add(layers.Flatten())
    model.add(layers.Dense(1))  # a single logit: large for real images, small for fakes
    return model

cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)  # loss object

# discriminator loss
def discriminator_loss(real_output, fake_output):
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    total_loss = real_loss + fake_loss
    return total_loss

# generator loss
def generator_loss(fake_output):
    return cross_entropy(tf.ones_like(fake_output), fake_output)

generator = make_generator_model()
discriminator = make_discriminator_model()
# optimizers
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)

@tf.function
def train_step(images):
    noise = tf.random.normal([256, 100])
    with tf.GradientTape() as gen_tape, tf.GradientTape() as disc_tape:
        generated_images = generator(noise, training=True)

        real_output = discriminator(images, training=True)
        fake_output = discriminator(generated_images, training=True)
        gen_loss = generator_loss(fake_output)
        disc_loss = discriminator_loss(real_output, fake_output)
    gradients_of_generator = gen_tape.gradient(gen_loss, generator.trainable_variables)
    gradients_of_discriminator = disc_tape.gradient(disc_loss, discriminator.trainable_variables)
    generator_optimizer.apply_gradients(zip(gradients_of_generator, generator.trainable_variables))
    discriminator_optimizer.apply_gradients(zip(gradients_of_discriminator, discriminator.trainable_variables))

seed = tf.random.normal([16, 100])  # fixed noise for visualizing the generator's progress
# checkpointing
checkpoint = tf.train.Checkpoint(generator_optimizer=generator_optimizer, discriminator_optimizer=discriminator_optimizer, generator=generator, discriminator=discriminator)

def generate_and_save_images(model, epoch, test_input):
    predictions = model(test_input, training=False)
    fig = plt.figure(figsize=(4, 4))
    for i in range(predictions.shape[0]):
        plt.subplot(4, 4, i+1)
        plt.imshow(predictions[i, :, :, 0]*127.5 + 127.5, cmap='gray')  # map [-1, 1] back to [0, 255]
        plt.axis('off')
    plt.savefig("dcgan_image_save/epoch_" + str(epoch) + ".png")
    plt.show()

def train(dataset, epochs):
    for epoch in range(epochs):
        start = time.time()
        for image_batch in dataset:
            train_step(image_batch)
        generate_and_save_images(generator, epoch+1, seed)
        if (epoch+1) % 15 == 0:
            checkpoint.save(file_prefix="training_checkpoints/dcgan")
        print("Time for epoch", epoch+1, "is:", time.time() - start)

# stitch the per-epoch images into a training-progress animation with imageio
def gif_animation_generate():
    gif_name = "dcgan_gif.gif"
    filenames = []
    for i in range(1, 61):  # one image per epoch, 60 epochs in total
        filenames.append("dcgan_image_save/epoch_" + str(i) + ".png")
    frames = []
    for filename in filenames:
        im = imageio.imread(filename)
        frames.append(im)
    imageio.mimsave(gif_name, frames, "GIF", duration=0.1)  # save as an animated gif (imageio v2 API)


if __name__ == "__main__":
    # make sure the output directories exist
    os.makedirs("dcgan_image_save", exist_ok=True)
    os.makedirs("training_checkpoints", exist_ok=True)
    train(dataset=train_dataset, epochs=60)
    gif_animation_generate()