A recipe for learning generalist robot policies from large-scale human and robot videos without action labels. A novel approach for extracting motion-centric latent actions that capture fine-grained physical ...