16-726 SP22 Assignment #3 - When Cats meet GANs

 

Run

Please run the code according to the instructions on the assignment page.

To reproduce the results that use diff_augment.py, append the option --use-diffaug True to the command.

Introduction

This assignment requires us to implement two GANs: DCGAN and CycleGAN. The first generates cat images from noise, and the second transfers style from one group of pictures to another.

Part1: DCGAN

Explanation

Two kinds of data augmentation are used in this assignment. The first is implemented in data_loader.py and enabled with the option -data_preprocessing 'deluxe'. The second is Differentiable Augmentation, which is applied during training and enabled with the option --use-diffaug True.

For the padding value, I use the output-size formula from the official PyTorch documentation for nn.Conv2d:

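$$H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding}[0] - \text{dilation}[0] \times (\text{kernel\_size}[0] - 1) - 1}{\text{stride}[0]} + 1 \right\rfloor$$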

Substituting kernel_size = 4, stride = 2, dilation = 1, and the input and output image sizes into this formula gives a padding value of 1.

For the first layer of the Generator, which does not use up_conv, substituting that layer's values into the same formula gives a padding value of 3.
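
The padding values above can be checked with a small script. This is only a sketch of the nn.Conv2d output-size formula in plain Python. The helper names `conv_out_size` and `find_padding` are made up for illustration, and the concrete sizes (a 64 to 32 downsampling step, and a first generator layer that maps 1x1 noise to a 4x4 map with kernel_size = 4 and stride = 1) are assumptions that are not stated in this report.

```python
def conv_out_size(size_in, kernel_size, stride, padding, dilation=1):
    """Output size of nn.Conv2d along one spatial dimension (floor division)."""
    return (size_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


def find_padding(size_in, size_out, kernel_size, stride, dilation=1, max_padding=16):
    """Smallest non-negative padding that maps size_in to size_out, or None."""
    for padding in range(max_padding + 1):
        if conv_out_size(size_in, kernel_size, stride, padding, dilation) == size_out:
            return padding
    return None


# Downsampling layer: halve the resolution with kernel_size=4, stride=2.
# The 64 -> 32 sizes are an assumed example.
down_padding = find_padding(size_in=64, size_out=32, kernel_size=4, stride=2)

# Assumed first generator layer: 1x1 noise -> 4x4 map, kernel_size=4, stride=1.
first_padding = find_padding(size_in=1, size_out=4, kernel_size=4, stride=1)

print(down_padding, first_padding)  # prints "1 3"
```

Under these assumptions the script reproduces the two padding values used here, 1 and 3.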

Result

In the basic and DiffAug-only modes, the network overfits the dataset in late iterations, and the loss does not appear to converge. With the deluxe augmentation, the results are better.

When DiffAug and Deluxe are both enabled, the details become clearer over the iterations and the cats' features become recognizable. In the early stages only vague outlines are generated, but later the faces and eyes are generated as well.

| Mode | Early Iteration (200) | Mid Iteration (2000) | Late Iteration (6400) | Loss* |
|---|---|---|---|---|
| Basic | sample-000200 | sample-002000 | sample-006400 | loss |
| Deluxe | sample-000200 | sample-002000 | sample-006400 | loss |
| DiffAug | sample-000200 | sample-002200 | sample-006400 | loss |
| DiffAug+Deluxe | sample-000200 | sample-002000 | sample-006400 | loss |

(*: Curves smoothed with a factor of 0.9 in TensorBoard)

 

Part2: CycleGAN

Generally speaking, the more iterations, the better the result. Adding the cycle consistency loss makes the output images look more stable and carry more features from the other image. Take the following pair as an example: with cycle loss, the brown fur is kept and the output looks more like the original picture of Grumpy cat.

comparasion1_nocycleloss

Without cycle loss

comparasion1_cycleloss

With cycle loss
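
To make the cycle consistency term concrete, here is a minimal sketch in plain Python. It assumes the L1 form of the loss from the CycleGAN paper; the assignment code may use a different norm or weighting. The "generators" are hypothetical toy functions on flat pixel lists, not trained networks, and all names are illustrative.

```python
def l1_mean(a, b):
    """Mean absolute difference between two equally sized flat pixel lists."""
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)


def cycle_consistency_loss(x, y, g_x_to_y, g_y_to_x):
    """L1 reconstruction error of X -> Y -> X plus that of Y -> X -> Y."""
    x_rec = g_y_to_x(g_x_to_y(x))
    y_rec = g_x_to_y(g_y_to_x(y))
    return l1_mean(x_rec, x) + l1_mean(y_rec, y)


# Toy stand-ins for the two generators (hypothetical).
def to_y(img):
    return [p + 0.5 for p in img]


def to_x_good(img):
    # Exact inverse of to_y, so every round trip reproduces its input.
    return [p - 0.5 for p in img]


def to_x_bad(img):
    # Ignores its input, so the content of the source image is lost.
    return [0.0 for _ in img]


x = [0.0, 0.25, 0.5, 1.0]
y = [0.5, 0.75, 1.0, 1.5]

loss_good = cycle_consistency_loss(x, y, to_y, to_x_good)
loss_bad = cycle_consistency_loss(x, y, to_y, to_x_bad)
print(loss_good, loss_bad)
```

A generator pair that throws away the input's content is penalized by this term, which fits the observation that features such as fur color survive when cycle loss is enabled.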

As for the choice of Discriminator, it is hard to tell which one is better. My observation is that the results generated with the Patch Discriminator usually have the expected shape and look more like real pictures. The DC Discriminator may produce very clear facial expressions, but the cat as a whole is sometimes deformed. The only difference between the two discriminators is the output layer: the Patch Discriminator outputs a 4x4 map, while the DC Discriminator outputs a single 1x1 value. So the Patch Discriminator seems to capture the general features well, while the DC Discriminator focuses more on the detailed features of the images.

One interesting observation is that the X->Y direction usually gives better results than the Y->X direction.
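
The 4x4 versus 1x1 output sizes can be traced with the nn.Conv2d output-size formula. The layer stacks below are assumptions made for illustration (a 64x64 input, four kernel_size = 4, stride = 2, padding = 1 convolutions, and one extra kernel_size = 4, stride = 1, padding = 0 convolution for the DC Discriminator); the actual networks in the assignment code may differ.

```python
def conv_out_size(size_in, kernel_size, stride, padding):
    """Output size of nn.Conv2d along one spatial dimension (dilation = 1)."""
    return (size_in + 2 * padding - (kernel_size - 1) - 1) // stride + 1


def output_size(size_in, layers):
    """Apply a list of (kernel_size, stride, padding) layers in order."""
    size = size_in
    for kernel_size, stride, padding in layers:
        size = conv_out_size(size, kernel_size, stride, padding)
    return size


# Assumed shared trunk: four stride-2 convolutions, 64 -> 32 -> 16 -> 8 -> 4.
downsample = [(4, 2, 1)] * 4

patch_layers = downsample              # stops at a 4x4 map of real/fake scores
dc_layers = downsample + [(4, 1, 0)]   # one more conv collapses 4x4 to 1x1

patch_out = output_size(64, patch_layers)
dc_out = output_size(64, dc_layers)
print(patch_out, dc_out)  # prints "4 1"
```

With these assumed layers, the Patch Discriminator gives 16 separate scores per image, one per output location, while the DC Discriminator gives a single score for the whole image.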

 

Cat Dataset

| Discriminator | Cycle Consistency Enabled? | Direction | Early Iteration (500) | Mid Iteration (5000) | Late Iteration (10000) | Generator Loss | Discriminator Loss |
|---|---|---|---|---|---|---|---|
| PatchDiscriminator | No | X->Y | sample-000500-X-Y | sample-005000-X-Y | sample-010000-X-Y | GLoss | DLoss |
|  |  | Y->X | sample-000500-Y-X | sample-005000-Y-X | sample-010000-Y-X |  |  |
| PatchDiscriminator | Yes | X->Y | sample-000500-X-Y | sample-005000-X-Y | sample-010000-X-Y | GLoss | DLoss |
|  |  | Y->X | sample-000500-Y-X | sample-005000-Y-X | sample-010000-Y-X |  |  |
| DCDiscriminator | No | X->Y | sample-000500-X-Y | sample-005000-X-Y | sample-010000-X-Y | GLoss | DLoss |
|  |  | Y->X | sample-000500-Y-X | sample-005000-Y-X | sample-010000-Y-X |  |  |
| DCDiscriminator | Yes | X->Y | sample-000500-X-Y | sample-005000-X-Y | sample-010000-X-Y | GLoss | DLoss |
|  |  | Y->X | sample-000500-Y-X | sample-005000-Y-X | sample-010000-Y-X |  |  |

The same holds for the Apple-orange Dataset. With cycle loss, the background is preserved and the edges of the apple look good.

comparasion2_nocycleloss

Without cycle loss

comparasion2_cycleloss

With cycle loss

Apple-orange Dataset

| Discriminator | Cycle Consistency Enabled? | Direction | Early Iteration (500) | Mid Iteration (5000) | Late Iteration (10000) | Generator Loss | Discriminator Loss |
|---|---|---|---|---|---|---|---|
| PatchDiscriminator | No | X->Y | sample-000500-X-Y | sample-005000-X-Y | sample-010000-X-Y | GLoss | DLoss |
|  |  | Y->X | sample-000500-Y-X | sample-005000-Y-X | sample-010000-Y-X |  |  |
| PatchDiscriminator | Yes | X->Y | sample-000500-X-Y | sample-005000-X-Y | sample-010000-X-Y | GLoss | DLoss |
|  |  | Y->X | sample-000500-Y-X | sample-005000-Y-X | sample-010000-Y-X |  |  |
| DCDiscriminator | No | X->Y | sample-000500-X-Y | sample-005000-X-Y | sample-010000-X-Y | GLoss | DLoss |
|  |  | Y->X | sample-000500-Y-X | sample-005000-Y-X | sample-010000-Y-X |  |  |
| DCDiscriminator | Yes | X->Y | sample-000500-X-Y | sample-005000-X-Y | sample-010000-X-Y | GLoss | DLoss |
|  |  | Y->X | sample-000500-Y-X | sample-005000-Y-X | sample-010000-Y-X |  |  |