EPIC Fields Dataset and Benchmarks


Watch the Trailer

EPIC Fields Dataset

We introduce EPIC Fields, an augmentation of EPIC-KITCHENS with 3D camera information. Similar to other datasets for neural rendering, EPIC Fields removes the complex and expensive step of reconstructing cameras using photogrammetry, and allows researchers to focus on more interesting modeling problems. We illustrate the challenges of photogrammetry in egocentric videos and propose several technical innovations to address them.

Compared to other datasets for neural rendering, EPIC Fields is better tailored to video understanding because it can be combined with the recently released VISOR annotations. Furthermore, it covers the complex and increasingly important case of egocentric video understanding. It also offers new challenges for the neural rendering community, such as modeling long videos with complex dynamic changes. To further jump-start the community's interest in this area, we also define neural rendering and motion segmentation benchmarks and provide several strong baselines for each, characterizing what is and is not possible today.

Recovered Camera Poses

19M frames in 99 hours of 671 videos, recorded in 45 kitchens.

Dynamic View Synthesis

Dynamic Object Segmentation

Video Object Segmentation

When combined with EPIC-KITCHENS annotations, actions can now be grounded in 3D.

Download Data

Downloading point clouds and camera poses

The dataset is now publicly available for download from here (7.5G).

Raw data in COLMAP format:
  • The dense registered frames in raw COLMAP format are available here (133G).
  • The sparse frames, including the raw COLMAP database, are available here (91.6G).
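For readers unfamiliar with COLMAP poses, the following is a minimal sketch of how one pose line could be read and turned into a camera position. It assumes the model has been exported to COLMAP's text format, where each pose line of `images.txt` reads `IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME` and stores the world-to-camera transform. Whether the files linked above are text or binary is not stated here, and the function names and the sample line below are illustrative only.

```python
def qvec_to_rotmat(qw, qx, qy, qz):
    """Convert a unit quaternion (COLMAP order: w, x, y, z) to a 3x3 rotation matrix."""
    return [
        [1 - 2 * (qy * qy + qz * qz), 2 * (qx * qy - qw * qz), 2 * (qx * qz + qw * qy)],
        [2 * (qx * qy + qw * qz), 1 - 2 * (qx * qx + qz * qz), 2 * (qy * qz - qw * qx)],
        [2 * (qx * qz - qw * qy), 2 * (qy * qz + qw * qx), 1 - 2 * (qx * qx + qy * qy)],
    ]


def parse_pose_line(line):
    """Parse one pose line of a text-format images.txt (assumed layout, see above)."""
    parts = line.strip().split(maxsplit=9)
    return {
        "image_id": int(parts[0]),
        "qvec": [float(v) for v in parts[1:5]],   # QW QX QY QZ
        "tvec": [float(v) for v in parts[5:8]],   # TX TY TZ
        "camera_id": int(parts[8]),
        "name": parts[9],
    }


def camera_center(qvec, tvec):
    """Camera position in world coordinates: C = -R^T t for a world-to-camera pose."""
    R = qvec_to_rotmat(*qvec)
    return [-sum(R[j][i] * tvec[j] for j in range(3)) for i in range(3)]


if __name__ == "__main__":
    # Hypothetical pose line, not taken from the dataset.
    pose = parse_pose_line("1 1.0 0.0 0.0 0.0 1.0 2.0 3.0 1 frame_0000000001.jpg")
    print(pose["name"], camera_center(pose["qvec"], pose["tvec"]))
```

In the text format, every pose line is followed by a second line listing 2D keypoints, and lines beginning with `#` are comments, so a full reader would need to skip those.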
You can also download the files from our servers at the University of Oxford in case the Dropbox servers are too busy.

You can verify your downloads with the SHA-512 hashes available here.
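As a minimal sketch of the verification step, the snippet below computes the SHA-512 digest of a downloaded file in chunks (so large archives need not fit in memory) and compares it with a published hash. The function names are illustrative, and the file name and hash in the usage comment are placeholders to be replaced with the actual values.

```python
import hashlib


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Compare a file's SHA-512 digest with an expected hex string (case-insensitive)."""
    return sha512_of_file(path) == expected_hex.strip().lower()


# Usage (placeholder values):
# ok = verify_download("downloaded_archive.zip", "<published SHA-512 hash>")
```

On Linux and macOS, `sha512sum <file>` or `shasum -a 512 <file>` produce the same digest from the command line.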

Code

We provide demo code for visualising the data, as well as the reconstruction pipeline. We also make the benchmark code public, which replicates the baselines in the EPIC Fields paper.

Paper and Citation

When using these annotations, cite our paper (preprint now available on arXiv):

@inproceedings{EPICFields2023,
           title={{EPIC Fields}: {M}arrying {3D} {G}eometry and {V}ideo {U}nderstanding},
           author={Tschernezki, Vadim and Darkhalil, Ahmad and Zhu, Zhifan and Fouhey, David and Laina, Iro and Larlus, Diane and Damen, Dima and Vedaldi, Andrea},
           booktitle   = {Proceedings of the Neural Information Processing Systems (NeurIPS)},
           year      = {2023}
} 
Also cite the EPIC-KITCHENS-100 paper where the videos originate:
@ARTICLE{Damen2022RESCALING,
           title={Rescaling Egocentric Vision: Collection, Pipeline and Challenges for EPIC-KITCHENS-100},
           author={Damen, Dima and Doughty, Hazel and Farinella, Giovanni Maria and Furnari, Antonino
           and Ma, Jian and Kazakos, Evangelos and Moltisanti, Davide and Munro, Jonathan
           and Perrett, Toby and Price, Will and Wray, Michael},
           journal   = {International Journal of Computer Vision (IJCV)},
           year      = {2022},
           volume = {130},
           pages = {33--55},
           Url       = {https://doi.org/10.1007/s11263-021-01531-2}
} 

Disclaimer

The underlying data that power EPIC Fields, EPIC-KITCHENS-100, were collected as a tool for research in computer vision. The dataset may have unintended biases (including those of a societal, gender or racial nature).

Copyright

The EPIC Fields dataset is copyrighted by us and published under the Creative Commons Attribution-NonCommercial 4.0 International License. This means that you must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. You may not use the material for commercial purposes.

For commercial licenses of EPIC-KITCHENS, email us at uob-epic-kitchens@bristol.ac.uk

The Team

EPIC Fields is the result of a collaboration between the Universities of Oxford, Bristol and Michigan, and NAVER LABS Europe.

Vadim Tschernezki*

University of Oxford

Ahmad Darkhalil*

University of Bristol

Zhifan Zhu*

University of Bristol

David Fouhey

University of Michigan

Iro Laina

University of Oxford

Diane Larlus

NAVER LABS Europe

Dima Damen

University of Bristol

Andrea Vedaldi

University of Oxford

Research Funding

The work on EPIC Fields was supported by:

  • UKRI Engineering and Physical Sciences Research Council (EPSRC) Program Grant Visual AI (EP/T028572/1)
  • V. Tschernezki and D. Larlus are supported by NAVER LABS Europe.
  • A. Darkhalil is supported by EPSRC DTP program.
  • Z. Zhu is supported by UoB-CSC Scholarship.
  • I. Laina and A. Vedaldi are supported by ERC-CoG UNION 101001212.
  • D. Damen is supported by EPSRC Fellowship UMPIRE (EP/T004991/1).