# Neural Volumes

We present a learning-based approach to representing dynamic objects, supervised directly from 2D images in a multi-view capture setting. The method consists of an encoder-decoder network that transforms input images into a 3D volume representation, and a differentiable ray-marching operation that enables end-to-end training. The method also learns a dynamic warp field, which improves the apparent resolution of the volume and reduces grid-like artifacts and jagged motion.

### Differentiable Ray Marching

The decoder generates a volume of RGB and opacity values. To render this volume to an image, we use a differentiable ray-marching algorithm: for each pixel, we step along the ray that passes through that pixel and accumulate the RGB and opacity values sampled from the volume.
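The accumulation along each ray can be illustrated with a minimal NumPy sketch. It is not the repository's implementation. It assumes a volume stored as a `(D, H, W, 4)` RGBA array covering the cube `[-1, 1]^3`, nearest-neighbor sampling, and an accumulation rule that sums opacity along the ray and caps the total at 1; none of these choices are stated in the text above. The names `sample_volume` and `ray_march` are hypothetical.

```python
import numpy as np


def sample_volume(volume, points):
    """Nearest-neighbor lookup of a (D, H, W, 4) RGBA volume.

    `points` has shape (n_rays, 3) with coordinates in [-1, 1]; coordinate k
    indexes array axis k (an assumption of this sketch). Points outside the
    cube return zeros, i.e. empty space.
    """
    res = np.array(volume.shape[:3])
    inside = np.all(np.abs(points) <= 1.0, axis=-1)
    # Map [-1, 1] to voxel indices [0, res - 1].
    idx = np.rint((points + 1.0) * 0.5 * (res - 1)).astype(int)
    idx = np.clip(idx, 0, res - 1)
    rgba = volume[idx[:, 0], idx[:, 1], idx[:, 2]]
    return rgba * inside[:, None]


def ray_march(volume, origins, directions, t_near, t_far, n_steps=64):
    """Accumulate color and opacity along each ray with fixed-size steps."""
    dt = (t_far - t_near) / n_steps
    n_rays = origins.shape[0]
    rgb = np.zeros((n_rays, 3))
    alpha = np.zeros((n_rays, 1))
    for i in range(n_steps):
        # Sample at the midpoint of the i-th step.
        t = t_near + (i + 0.5) * dt
        sample = sample_volume(volume, origins + t * directions)
        # Opacity added by this step, limited so the total never exceeds 1.
        contrib = np.minimum(sample[:, 3:4] * dt, 1.0 - alpha)
        rgb += sample[:, :3] * contrib
        alpha += contrib
    return rgb, alpha


# Usage: a uniform orange volume viewed by one ray that hits it and one that misses.
volume = np.zeros((4, 4, 4, 4))
volume[..., :3] = [1.0, 0.5, 0.0]
volume[..., 3] = 1.0
origins = np.array([[0.0, 0.0, -2.0], [5.0, 5.0, -2.0]])
directions = np.array([[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]])
rgb, alpha = ray_march(volume, origins, directions, t_near=0.0, t_far=4.0)
```

Every operation in the loop (multiply, add, minimum) has a well-defined gradient with respect to the volume values, so the same loop written in an autodiff framework would let image-space losses propagate back to the decoder. A practical version would likely replace the nearest-neighbor lookup with trilinear interpolation so that gradients also flow through the sample positions.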

