Smoothing the Generative Latent Space with Mixup-based Distance Learning

Chaerin Kong, Jeesoo Kim, Donghoon Han, Nojun Kwak
Seoul National University
{veztylord, kimjiss0305, dhk1349, nojunk}@snu.ac.kr

Abstract

Producing diverse and realistic images with generative models such as GANs typically requires large-scale training with a vast amount of images. GANs trained with extremely limited data can easily overfit to the few training samples and display undesirable properties such as a "stairlike" latent space, where transitions in latent space suffer from discontinuity, occasionally yielding abrupt changes in outputs. In this work, we consider the situation where neither a large-scale dataset of our interest nor a transferable source dataset is available, and seek to train existing generative models with minimal overfitting and mode collapse. We propose a latent mixup-based distance regularization on the feature spaces of both the generator and the counterpart discriminator that encourages the two players to reason not only about the scarce observed data points but also about the relative distances in the feature space they reside in. Qualitative and quantitative evaluation on diverse datasets demonstrates that our method is generally applicable to existing models and enhances both fidelity and diversity under the constraint of limited data. Code will be made public.

1. Introduction

Remarkable features of Generative Adversarial Networks (GANs), such as impressive sample quality and smooth latent space interpolation, have drawn enormous attention from the community, but what we have enjoyed with little gratitude claims its worth in a data-limited regime. As naive training of GANs with small datasets often fails both in terms of fidelity and diversity, many have proposed novel approaches specifically designed for few-shot image synthesis.
Among the most successful are those adapting a pretrained source generator to the target domain [21, 26, 28] and those seeking generalization to unseen categories through feature fusion [13, 16]. Despite their impressive synthesis quality, these approaches are often critically constrained in practice, as they all require a semantically related, large source-domain dataset to pretrain on. For some domains, such as abstract art paintings, medical images of rare symptoms, and cartoon illustrations, it is very difficult or impossible to collect thousands of samples, while at the same time, finding an adequate source domain to transfer from is not straightforward either. This poses an intimidating challenge for generative modeling, but we seek to find solutions in the darkest of times.

[Figure 1. Training GANs with as few as 10 training samples typically results in severe overfitting, as illustrated here by the stairlike generator latent transition in the first row. A generator trained with our method learns a smooth and interpolable latent space.]

One of the biggest challenges of learning generative models with scarce data is that the model easily overfits. To directly address this issue, [46] and [18] have proposed data augmentation techniques that show promising results on low-shot generation tasks with datasets containing hundreds to thousands of training samples. Nevertheless, they show unsatisfactory performance with a handful of data points (e.g., 10), and generative modeling under these circumstances still remains extremely challenging.

Inspired by [28], which proposes a novel distance regularization to effectively transfer diversity information from the source to the target, we recast the overfitting issue as a stairlike latent space problem and suggest Mixup-based Distance Learning (MDL) for both the generator and the discriminator to effectively control it.
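The paper defines its MDL objective later; purely as a rough illustration of the idea, the sketch below shows one hedged reading of "mixup-based distance learning": latent codes are mixed with Dirichlet-sampled coefficients, and the softmax-normalized pairwise distances among the resulting samples are encouraged to follow the distances among their mixing coefficients. All names (`latent_mixup`, `mixup_distance_loss`) and the exact loss form are our assumptions, not the authors' formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - x.max())
    return e / e.sum()

def latent_mixup(z_anchors, alpha=1.0, rng=rng):
    """Convex combination of anchor latent codes.

    Dirichlet-sampled weights keep the mixed code inside the
    convex hull of the anchors (hypothetical helper, not from the paper).
    """
    n = z_anchors.shape[0]
    c = rng.dirichlet(np.full(n, alpha))  # mixing coefficients, sum to 1
    return c @ z_anchors, c

def mixup_distance_loss(feats, coeff_sets):
    """Illustrative surrogate loss: match softmax-normalized pairwise
    feature distances to the distances between mixing-coefficient
    vectors, via KL divergence averaged over anchors."""
    f = np.stack(feats)
    c = np.stack(coeff_sets)
    loss = 0.0
    for i in range(len(f)):
        df = np.linalg.norm(f[i] - np.delete(f, i, axis=0), axis=1)
        dc = np.linalg.norm(c[i] - np.delete(c, i, axis=0), axis=1)
        p, q = softmax(-dc), softmax(-df)  # target vs. feature profile
        loss += np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12)))
    return loss / len(f)

# Toy usage: mix 10 anchor latents and treat the mixed latents
# themselves as stand-in "features" of a generator.
anchors = rng.standard_normal((10, 64))
mixed, coeffs = zip(*[latent_mixup(anchors) for _ in range(5)])
reg = mixup_distance_loss(mixed, coeffs)
assert np.isfinite(reg)
```

In an actual GAN training loop, `feats` would come from intermediate generator or discriminator activations rather than raw latents; this sketch only demonstrates the distance-matching mechanics.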
As diversity is already heavily constrained by the small dataset, we wish to maximally exploit the given data points by continuously exploring their semantic mixups [44]. With overfitting, however, the discriminator is convinced only by the few observed samples, showing overly confident and abrupt decision boundaries. This induces the generator

arXiv:2111.11672v1 [cs.CV] 23 Nov 2021