Discovering Interpretable Latent Space Directions of GANs Beyond Binary Attributes — Supplementary Materials

Huiting Yang^1, Liangyu Chai^1, Qiang Wen^1, Shuang Zhao^2, Zixun Sun^2, Guoqiang Han^1, Shengfeng He^1*
1 School of Computer Science and Engineering, South China University of Technology
2 Interactive Entertainment Group, Tencent Inc.

* Corresponding author (hesfe@scut.edu.cn). This work is supported by the National Natural Science Foundation of China (No. 61972162) and the CCF-Tencent Open Research fund. Code is available at https://github.com/BERYLSHEEP/AdvStyle.

1. Introduction

In this document, we provide additional experiments to further examine our method, AdvStyle. In Section 2, we provide more implementation details. In Section 3, we elaborate on how we collect and process our training datasets. Lastly, we show additional experimental results, including the limitations of our proposed method, statistical measurements, editing results and analysis of binary and non-binary attributes, and a comparison on real image manipulation.

2. Implementation details

2.1. Distribution of Training Samples

The latent code z is sampled from the Gaussian distribution N(0, I_d), where d = 512 is the dimensionality of the latent code. The selected direction index k is sampled from the uniform distribution U{1, K}, where K = 100. The step size α is sampled from the uniform distribution U{-6, 6}. A minimal sketch of this sampling scheme is given below.
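For concreteness, the following is a minimal PyTorch sketch of the sampling scheme above. The batch size, the 0-based indexing of k, the continuous interpretation of U{-6, 6}, and the placeholder direction matrix D are illustrative assumptions rather than details of our released implementation.

```python
import torch

d = 512         # latent code dimensionality
K = 100         # number of candidate directions
batch_size = 8  # illustrative; not specified above

# Latent code z ~ N(0, I_d)
z = torch.randn(batch_size, d)

# Direction index k ~ U{1, ..., K} (0-based here for tensor indexing)
k = torch.randint(low=0, high=K, size=(batch_size,))

# Step size alpha ~ U{-6, 6} (interpreted as continuous uniform)
alpha = torch.empty(batch_size, 1).uniform_(-6.0, 6.0)

# With a direction matrix D of shape (K, d), here a random placeholder,
# the edited latent code is the usual additive move along direction k.
D = torch.randn(K, d)
z_edit = z + alpha * D[k]  # (B, 1) * (B, d) broadcasts to (B, d)
```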
2.2. Comparison Details

When comparing to InterFaceGAN [12], we directly use the five well-trained human face attribute directions provided in their released code: old, smile, pose, eyeglasses, and female. The other human style attributes and the anime attributes are trained with their provided model and default parameters. These attribute assessors achieve higher than 95% accuracy on the validation set, which is higher than the accuracy reported in their paper [12].

When comparing to GANSpace [5], we directly use the four well-trained human face attribute directions provided in their released code: smile, pose, eyeglasses, and female.

2.3. Pretrained Generator Models

For anime attribute editing, the generator of StyleGAN [7] is trained on the Danbooru2018 dataset [4]. For human face attribute editing, the generator of StyleGAN is trained on the FFHQ dataset [7]. Note that all results are generated by these generators, i.e., all images share the same learned latent space; the only difference is how the semantic directions are discovered. Some images may show water-droplet-like artifacts, a known defect of the StyleGAN network [8].

3. Attribute Datasets

For anime attributes, we aim to generate anime characters at a high resolution of 512 × 512. Some publicly available anime datasets do not meet this need; e.g., the images in [6] have a low resolution of 64 × 64. To obtain datasets with diverse attribute labels, we collect 9 attribute datasets, corresponding to 6 character attributes and 3 anime styles. All images are resized to 512 × 512 for training.

1. The Danbooru2018 dataset [4] provides large-scale, high-resolution anime images with associated metadata. However, the original images often contain multiple characters in the same scene. We detect anime faces using a Faster R-CNN [11] based anime face detector (https://github.com/qhgz2013/anime-face-detector), then crop each image at a larger scale of 1.2 to include the hair of the character (a sketch of this cropping step follows the list). With this preprocessing we obtain 7 attribute datasets, covering 6 character attributes (open mouth, blunt bangs, short hair, black hair, blonde hair, pink hair) and 1 style attribute (Itomugi-Kun). Each attribute has more than 800 images.

2. Manga109 [10, 3] consists of 109 comic books of 21,142 pages drawn by professional artists. We extract 2,689 anime faces according to the annotations with a
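The 1.2x crop-and-resize step described in item 1 can be sketched as follows. This is a minimal illustration assuming the detector returns a face bounding box as an (x0, y0, x1, y1) tuple in pixel coordinates; the function name and the center-based square expansion are illustrative assumptions, not the exact released preprocessing.

```python
from PIL import Image

def crop_face(image_path, box, scale=1.2, out_size=512):
    """Crop a detected face box, enlarged by `scale` to include the
    hair, then resize to out_size x out_size.

    `box` is assumed to be an (x0, y0, x1, y1) tuple in pixel
    coordinates, as a Faster R-CNN style detector would return.
    """
    img = Image.open(image_path).convert("RGB")
    x0, y0, x1, y1 = box
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    # Expand the longer side so the crop stays square after scaling.
    half = max(x1 - x0, y1 - y0) * scale / 2.0
    left = int(round(cx - half))
    top = int(round(cy - half))
    right = int(round(cx + half))
    bottom = int(round(cy + half))
    # PIL pads regions of the crop that fall outside the image with black.
    face = img.crop((left, top, right, bottom))
    return face.resize((out_size, out_size), Image.LANCZOS)
```

Expanding from the box center along the longer side keeps the crop square, so the subsequent resize to 512 × 512 does not distort the face.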