1: \begin{abstract}
2: Data simulation engines like Unity are becoming an increasingly important data source that allows us to acquire ground-truth labels conveniently. Moreover, we can flexibly edit the \emph{content} of an image in the engine, such as objects (position, orientation) and environments (illumination, occlusion).
3: When simulated data are used as a training set, their editable content can be leveraged to mimic the distribution of real-world data and thus reduce the content difference between the synthetic and real domains.
4: This paper explores content adaptation in the context of semantic segmentation, where complex street scenes are fully synthesized using 19 classes of virtual objects from a first-person driver perspective and controlled by 23 attributes.
5: To optimize the attribute values and obtain a training set whose content is similar to that of real-world data, we propose a scalable discretization-and-relaxation (SDR) approach.
6: %We formulate the attribute optimization as a distribution mapping problem that maps random attribute value to optimized one.
7: Under a reinforcement learning framework, we formulate attribute optimization as a random-to-optimized mapping problem using a neural network.
8: Our method has three characteristics.
9: 1) Instead of editing attributes of individual objects, we focus on global attributes that have a large influence on the scene structure, such as object density and illumination.
10: 2) Attributes are quantized to discrete values to reduce the search space and training complexity.
11: 3) Correlated attributes are jointly optimized as a group to avoid meaningless scene structures and to find better convergence points.
12: Experiments show that our system can generate reasonable and useful scenes, from which we obtain promising real-world segmentation accuracy compared with existing synthetic training sets.
13: \end{abstract}
14: