3. Properties of the style-based generator
Our generator architecture makes it possible to control the image synthesis via scale-specific modifications to the styles. We can view the mapping network and affine transformations as a way to draw samples for each style from a learned distribution, and the synthesis network as a way to generate a novel image based on a collection of styles. The effects of each style are localized in the network, i.e., modifying a specific subset of the styles can be expected to affect only certain aspects of the image.
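For concreteness, this decomposition can be sketched as follows. This is a minimal illustration, not the paper's reference implementation: the layer counts, dimensions, and names (f for the mapping network, affines for the per-layer affine transformations) are assumptions made here for the example.

```python
import torch
import torch.nn as nn

latent_dim, style_dim, num_layers = 512, 512, 18

# Hypothetical stand-in for the mapping network f (the real network is
# deeper); it maps a latent code z in Z to an intermediate code w in W.
f = nn.Sequential(
    nn.Linear(latent_dim, 512), nn.LeakyReLU(0.2),
    nn.Linear(512, latent_dim),
)

# One learned affine transformation per synthesis layer, specializing
# the shared w into a separate style for that layer.
affines = nn.ModuleList(
    [nn.Linear(latent_dim, style_dim) for _ in range(num_layers)]
)

z = torch.randn(4, latent_dim)      # sampled latent codes
w = f(z)                            # intermediate latent codes
styles = [A(w) for A in affines]    # one style per layer; the synthesis
                                    # network consumes this collection
```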
To see the reason for this localization, let us consider how the AdaIN operation (Eq. 1) first normalizes each channel to zero mean and unit variance, and only then applies scales and biases based on the style. The new per-channel statistics, as dictated by the style, modify the relative importance of features for the subsequent convolution operation, but they do not depend on the original statistics because of the normalization. Thus each style controls only one convolution before being overridden by the next AdaIN operation.
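The following sketch spells this out in PyTorch. The function signature and the per-channel scale/bias layout are assumptions made for illustration; only the two-step structure (normalize, then restyle) follows Eq. 1.

```python
import torch

def adain(x, style_scale, style_bias, eps=1e-8):
    """Adaptive instance normalization, sketched after Eq. 1.

    x           : feature maps, shape (batch, channels, height, width)
    style_scale : per-channel scales y_s from the style, shape (batch, channels)
    style_bias  : per-channel biases y_b from the style, shape (batch, channels)
    """
    # Step 1: normalize each channel of each sample to zero mean and
    # unit variance, erasing the original statistics.
    mean = x.mean(dim=(2, 3), keepdim=True)
    std = x.std(dim=(2, 3), keepdim=True)
    x = (x - mean) / (std + eps)
    # Step 2: impose new per-channel statistics dictated by the style.
    # The output depends on the style alone, which is why each style
    # controls only the one convolution that follows.
    return style_scale[:, :, None, None] * x + style_bias[:, :, None, None]
```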
3.1. Style mixing
To further encourage the styles to localize, we employ mixing regularization, where a given percentage of images are generated using two random latent codes instead of one during training. When generating such an image, we simply switch from one latent code to another, an operation we refer to as style mixing, at a randomly selected point in the synthesis network. To be specific, we run two latent codes z_1, z_2 through the mapping network, and have the corresponding w_1, w_2 control the styles so that w_1 applies before the crossover point and w_2 after it. This regularization technique prevents the network from assuming that adjacent styles are correlated.
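In code, the procedure amounts to the following sketch. The mapping and synthesis callables are hypothetical stand-ins for the two networks, and the assumption that the synthesis network accepts one w vector per layer is made here for illustration.

```python
import torch

def style_mixing_forward(mapping, synthesis, num_layers, batch_size, latent_dim):
    """One generator forward pass with mixing regularization.

    mapping   : hypothetical callable, z -> w
    synthesis : hypothetical callable taking a list of per-layer w vectors
    """
    # Two independent latent codes instead of one.
    z1 = torch.randn(batch_size, latent_dim)
    z2 = torch.randn(batch_size, latent_dim)
    w1, w2 = mapping(z1), mapping(z2)

    # Randomly selected crossover point in the synthesis network:
    # w1 controls the styles before it, w2 the styles after it.
    crossover = torch.randint(1, num_layers, (1,)).item()
    ws = [w1 if i < crossover else w2 for i in range(num_layers)]
    return synthesis(ws)
```

Because the crossover point is drawn anew for each such image, the network cannot rely on any fixed correlation between adjacent styles.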
Table 2 shows how enabling mixing regularization during training improves the localization considerably, indicated by improved FIDs in scenarios where multiple latents are mixed at test time.