Object-Oriented Learning (OOL): Perception, Representation, and Reasoning

International Conference on Machine Learning (ICML)

Friday, July 17, 2020, Virtual Workshop

Structured Generative Modeling of Images with Object Depths and Locations

We present a generative model of images that incorporates a structured latent representation separating objects from each other and from the background. It explicitly models the depth ordering of objects as well as their 2D positions, using a novel and efficient approach to placement that avoids computationally expensive spatial transformers. The model can be trained from images alone, without object masks or depth supervision. It learns to generate coherent scenes and to decompose novel images into their constituent objects, predicting their depth ordering, their locations, and the segmentation of their occluded parts.
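To make the idea of depth-ordered composition concrete, the following is a minimal sketch of how a scene could be assembled from per-object appearance patches, masks, 2D positions, and depths. The abstract does not describe the actual placement mechanism, so everything here is an assumption for illustration: the names `SceneObject` and `compose_scene` are hypothetical, positions are integer pixel offsets, larger depth means farther away, and the rendering is a plain painter's algorithm with alpha blending rather than the paper's learned, differentiable approach.

```python
from dataclasses import dataclass
from typing import List

Image = List[List[float]]  # grayscale image as rows of floats in [0, 1]


@dataclass
class SceneObject:
    """Hypothetical per-object latent, decoded into pixel space."""
    appearance: Image  # per-pixel intensity of the object patch
    mask: Image        # per-pixel opacity in [0, 1], same shape as appearance
    top: int           # canvas row of the patch's top-left corner
    left: int          # canvas column of the patch's top-left corner
    depth: float       # assumed convention: larger means farther away


def compose_scene(background: Image, objects: List[SceneObject]) -> Image:
    """Paste objects onto a copy of the background, farthest first."""
    canvas = [row[:] for row in background]
    height, width = len(canvas), len(canvas[0])
    # Painter's algorithm: draw far objects first so nearer ones occlude them.
    for obj in sorted(objects, key=lambda o: o.depth, reverse=True):
        for dy, (app_row, mask_row) in enumerate(zip(obj.appearance, obj.mask)):
            y = obj.top + dy
            if not 0 <= y < height:
                continue  # this patch row falls outside the canvas
            for dx, (value, alpha) in enumerate(zip(app_row, mask_row)):
                x = obj.left + dx
                if 0 <= x < width:
                    # Alpha-blend the object pixel over what is already there.
                    canvas[y][x] = alpha * value + (1.0 - alpha) * canvas[y][x]
    return canvas
```

Because each object keeps its full mask even where a nearer object covers it, a representation of this kind retains the shape of occluded parts, which is what makes predicting their segmentation possible. The hard sort and integer offsets used here are not differentiable; a trainable model would need a smooth relaxation of both.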