Self-supervised pre-training for the embodied Turing Test