Beyond Image to Depth: Improving Depth Prediction using Echoes


We address the problem of estimating depth from multimodal audio-visual data. Inspired by the ability of animals such as bats and dolphins to infer the distance of objects through echolocation, we propose an end-to-end deep learning pipeline that uses RGB images, binaural echoes, and estimated material properties of the objects in a scene for depth estimation.


The code is tested with

- Python 3.6 
- PyTorch 1.6.0
- NumPy 1.19.5
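
To compare a local environment against the versions listed above, a small check along the following lines may help. It is only a sketch and not part of the repository: the `check_environment` helper and the `TESTED` table are hypothetical names introduced here, and other versions may well work even if they are reported as mismatches.

```python
import importlib
import sys

# Versions listed above as tested (assumption: other versions may also work).
TESTED = {"torch": "1.6.0", "numpy": "1.19.5"}
TESTED_PYTHON = (3, 6)


def check_environment(tested=TESTED, tested_python=TESTED_PYTHON):
    """Return a list of human-readable mismatches against the tested versions."""
    problems = []
    if sys.version_info[:2] != tuple(tested_python):
        problems.append(
            "Python {}.{} found, tested with {}.{}".format(
                sys.version_info[0], sys.version_info[1], *tested_python
            )
        )
    for name, wanted in tested.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            problems.append("{} is not installed (tested with {})".format(name, wanted))
            continue
        found = getattr(module, "__version__", "unknown")
        # Some builds append a local suffix such as "+cu101"; compare the base version only.
        if found.split("+")[0] != wanted:
            problems.append("{} {} found, tested with {}".format(name, found, wanted))
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment matches the tested versions."]:
        print(line)
```

An empty result means the interpreter and both packages match the tested versions exactly; any mismatch is reported as a message rather than raised as an error.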


Replica-VisualEchoes can be obtained from here. We used the 128×128 image resolution for our experiment.

MatterportEchoes is an extension of the existing Matterport3D dataset. To obtain the raw frames, please forward the access request acceptance you received from the authors of the Matterport3D dataset. We



