Object-Oriented Learning (OOL): Perception, Representation, and Reasoning
International Conference on Machine Learning (ICML)
Friday July 17, 2020, Virtual Workshop
Rigid body systems are among the most important subjects of study in physics and are widely used in both simulated and real-world applications. As classical mechanics suggests, the motion of a rigid body system is strongly governed by its geometric constraints, i.e., how the different rigid components are connected and allowed to move with respect to each other. Extracting this information from observations and applying it to the modeling of the system's dynamics are important steps towards a deeper and more structured understanding of the physical world. In this work, we propose a computational framework that both extracts the geometric constraints and models the forward dynamics of rigid body systems, starting from raw pixel observations. Our model first extracts, in an unsupervised fashion, a hierarchical representation from visual observations; this representation is built from keypoints and their groupings and describes the system's constraints. Then, a dynamics model aware of these constraints predicts the forward dynamics of the state representation. Finally, a reconstruction network recovers the visual frames from the predicted states. Experimental results on classic rigid body control environments show that our model accurately infers the constraints, and that geometry-aware dynamics modeling leads to more accurate and physically sensible future predictions.
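To make the notion of a geometric constraint concrete, the following is a minimal illustrative sketch and not the model described above. It assumes keypoint trajectories are already available as an array of shape (T, K, 2), whereas the framework learns keypoints from pixels, and it swaps the learned grouping for a simple heuristic: two keypoints on the same rigid component keep a constant distance over time. The function names, the tolerance `tol`, and the toy pendulum-and-cart data are all hypothetical.

```python
import numpy as np


def pairwise_distance_std(traj):
    """Std. dev. over time of every pairwise keypoint distance.

    traj: array of shape (T, K, 2), K keypoints tracked over T frames.
    Returns a (K, K) array; entries near zero indicate a rigid pair.
    """
    diffs = traj[:, :, None, :] - traj[:, None, :, :]  # (T, K, K, 2)
    dists = np.linalg.norm(diffs, axis=-1)             # (T, K, K)
    return dists.std(axis=0)


def group_rigid_keypoints(traj, tol=1e-6):
    """Group keypoints whose mutual distance stays constant (heuristic).

    Uses a small union-find; this is an assumption-laden stand-in for a
    learned grouping. A keypoint placed exactly on a joint would merge
    the two components it connects.
    """
    std = pairwise_distance_std(traj)
    k = std.shape[0]
    parent = list(range(k))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(k):
        for j in range(i + 1, k):
            if std[i, j] < tol:
                parent[find(j)] = find(i)

    roots = [find(i) for i in range(k)]
    # Relabel roots to consecutive group ids 0, 1, ...
    _, labels = np.unique(roots, return_inverse=True)
    return labels.tolist()


# Toy data: a link rotating about the origin and a cart sliding along x.
T = 50
theta = np.linspace(0.0, 1.0, T)
x = np.linspace(0.0, 2.0, T)
direction = np.stack([np.cos(theta), np.sin(theta)], axis=-1)  # (T, 2)
cart = np.stack([x, np.zeros(T)], axis=-1)                     # (T, 2)
traj = np.stack(
    [
        0.5 * direction,               # keypoint 0 on the link
        1.0 * direction,               # keypoint 1 on the link
        cart + np.array([-0.2, 0.0]),  # keypoint 2 on the cart
        cart + np.array([0.2, 0.0]),   # keypoint 3 on the cart
    ],
    axis=1,
)  # (T, 4, 2)

groups = group_rigid_keypoints(traj)
print(groups)  # [0, 0, 1, 1]
```

In this toy setting the two link keypoints and the two cart keypoints fall into separate groups, which is the kind of structure a constraint-aware dynamics model could then exploit.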