Insights into Neural Radiance Fields – NeRF

November 12, 2023

In this blog post we will give some insight into a method called the neural radiance field (NeRF), which has become popular in recent years. The method uses deep learning to reconstruct a 3D representation of a scene from 2D images. Once the scene has been learnt, NeRF can render new high-resolution views of it, which can be utilised for novel view synthesis, object detection, image segmentation and other use cases where realistic-looking images are needed. We will also share our experiences with Nerfstudio.

NeRF works by training a neural network to approximate the radiance field of a scene, i.e. to predict the colour and opacity (volume density) of every point in space. The network takes as input the 3D coordinates of a point together with a viewing direction, and outputs the corresponding colour and density values. To train the network, we need a dataset of images of the scene taken from different, known viewpoints. For each training image, rays are cast through its pixels, the network is queried at sample points along every ray, and the samples are accumulated by volume rendering into a predicted pixel colour. The network is trained to minimise the difference between these predicted colours and the ground-truth pixel values from the dataset.
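To make the idea more concrete, here is a small, self-contained sketch in PyTorch of such a network and of how one pixel colour could be rendered from it. The names (TinyNeRF, render_ray), the network size and the simple uniform sampling along the ray are our own simplifications for illustration; this is not the actual NeRF or Nerfacto implementation.

```python
import torch
import torch.nn as nn

def positional_encoding(x, num_freqs=6):
    """Map each coordinate to [x, sin(2^k x), cos(2^k x)] so the MLP can represent fine detail."""
    feats = [x]
    for k in range(num_freqs):
        feats.append(torch.sin((2.0 ** k) * x))
        feats.append(torch.cos((2.0 ** k) * x))
    return torch.cat(feats, dim=-1)

class TinyNeRF(nn.Module):
    """Maps an encoded 3D position and viewing direction to (RGB colour, density sigma)."""
    def __init__(self, pos_freqs=6, dir_freqs=4, hidden=128):
        super().__init__()
        pos_dim = 3 * (1 + 2 * pos_freqs)
        dir_dim = 3 * (1 + 2 * dir_freqs)
        self.pos_freqs, self.dir_freqs = pos_freqs, dir_freqs
        self.trunk = nn.Sequential(
            nn.Linear(pos_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(hidden, 1)          # density depends on position only
        self.rgb_head = nn.Sequential(                  # colour also depends on viewing direction
            nn.Linear(hidden + dir_dim, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),
        )

    def forward(self, xyz, view_dir):
        h = self.trunk(positional_encoding(xyz, self.pos_freqs))
        sigma = torch.relu(self.sigma_head(h))
        rgb = self.rgb_head(torch.cat([h, positional_encoding(view_dir, self.dir_freqs)], dim=-1))
        return rgb, sigma

def render_ray(model, origin, direction, near=0.1, far=4.0, n_samples=64):
    """Volume rendering along one ray: composite the sampled colours weighted by opacity."""
    t = torch.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction               # (n_samples, 3) sample points on the ray
    dirs = direction.expand(n_samples, 3)
    rgb, sigma = model(pts, dirs)
    delta = torch.full((n_samples, 1), (far - near) / n_samples)
    alpha = 1.0 - torch.exp(-sigma * delta)             # opacity contributed by each segment
    trans = torch.cumprod(torch.cat([torch.ones(1, 1), 1.0 - alpha + 1e-10], dim=0), dim=0)[:-1]
    weights = trans * alpha
    return (weights * rgb).sum(dim=0)                   # composited pixel colour
```

Training then amounts to minimising the mean squared error between the colour returned by render_ray and the ground-truth pixel colour, for batches of rays sampled from the training images.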

We chose Nerfstudio because it provides a simple API and command-line tools for an end-to-end workflow of creating, training and testing NeRFs. This let us try out a set of pre-implemented methods, which made the journey of learning about NeRFs easier.

With the help of CUDA libraries, training was accelerated on an NVIDIA GeForce GTX 1650 graphics card. The training ran for 30,000 iterations on a dataset of 764 images, together with the camera poses (positions and rotations) that we estimated earlier in our COLMAP post. Training took around 2 hours with our chosen method, called Nerfacto.
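For reference, a training run like the one described above can be launched from the command line or from a small Python wrapper, as sketched below. The data and output paths are placeholders for our own setup, and the exact flag names may differ between Nerfstudio versions:

```python
import subprocess

# Train the Nerfacto model on a COLMAP-processed dataset for 30,000 iterations.
# Paths are placeholders; flag names may vary between Nerfstudio versions.
subprocess.run(
    [
        "ns-train", "nerfacto",
        "--data", "data/our_scene",          # images + camera poses from COLMAP
        "--output-dir", "outputs",           # where checkpoints and config.yml end up
        "--max-num-iterations", "30000",
    ],
    check=True,
)
```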

During training, Nerfstudio provides a browser-based GUI tool for interacting with the model. Here the viewport can be rotated and moved around. The segment below was taken from the browser viewer of Nerfacto at the 5,196th iteration. In the scene, the images that were used during training can be seen floating in space, positioned and rotated according to the input data. The background is rendered in “real time” according to the current stage of the training. As the network went through more iterations, the view became increasingly clear.
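If you want to come back to the viewer after training has finished, it can be reopened from the saved run configuration. A hedged sketch, with a placeholder path for the config.yml that ns-train writes:

```python
import subprocess

# Reopen the browser viewer for an already-trained model.
# The config path below is a placeholder for our own training run.
subprocess.run(
    ["ns-viewer", "--load-config", "outputs/our_scene/nerfacto/2023-11-12_120000/config.yml"],
    check=True,
)
```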

Once the training is complete, we can look around and render new 2D images from any direction in the 3D environment. Additionally, using Nerfstudio’s API we can export a 3D point cloud or mesh of a defined segment of the learnt environment. Below is a glimpse of the exported point cloud viewed in MeshLab.
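As an illustration of how such an export could be triggered, here is a hedged sketch using Nerfstudio’s export command; the paths and the number of points are placeholders for our setup, and the flags may vary between versions:

```python
import subprocess

# Export a point cloud from the trained model; a mesh can be exported similarly
# (e.g. with "ns-export poisson"). All paths below are placeholders.
config = "outputs/our_scene/nerfacto/2023-11-12_120000/config.yml"
subprocess.run(
    [
        "ns-export", "pointcloud",
        "--load-config", config,
        "--output-dir", "exports/pointcloud",
        "--num-points", "1000000",           # number of points to sample from the field
    ],
    check=True,
)
```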

In our experience, NeRF performs better when the input data is captured in an open environment with well-lit objects. During our testing, the learnt model was inconsistent when the input data contained dynamic lighting and enclosed spaces, for example a sewer pipe system.

Want to learn more?
Name Surname
name.surname@qamcom.se