AI Research
Neural Radiance Fields (NeRFs)
With the aspiration of unlocking the third dimension for our food imagery, and the need to use the same imagery in more ways, I stumbled upon Neural Radiance Fields in 2023 and decided to actively research how they could make our range of assets future-proof. The main ideas that sprang to mind with this technology were the following:
- NeRFs render extremely fast, which makes animation exports really efficient in a fast-paced environment.
- You can take the images trained inside the neural engine and move the model into any scenario: from studio photography to lifestyle sets, everything is possible once you have the model.
- The textures are spot on and photorealistic, since the database consists of real items, so no weird artefacts from generative AI are possible.
So, how does it work?
Neural Radiance Fields build upon the approach of photogrammetry, which has been around for decades, and mix it with powerful GPU machine-learning capabilities. This new approach does not just recreate a 3D environment from a point cloud derived with COLMAP (COLMAP being the tool that maps out the array of camera positions around the scene); it also learns the texture from each angle, taking into account the correct reflectance and reflections of an object or scene depending on the camera's position. This allows for lifelike results when moving through your finalized scene without the need for extra heavy render engines, since the texture of the object isn't baked in but learned by the underlying engine. The object you get is also not a 3D model: it consists of radiance fields that have no physical attributes like a .obj or .fbx file would. Being able to open a scene file and re-animate the camera afterwards allows for a fluent and fast post-production flow that leaves unlimited possibilities for the artist to jump back in and re-animate the camera however they see fit for their next project.
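To make the "learned, view-dependent texture" idea concrete, here is a minimal sketch of how a radiance field is queried. It is my own toy illustration, not the network used by any specific tool: a small network takes a 3D position and a viewing direction and returns a density plus a view-dependent colour, and the colours along one camera ray are blended by volume rendering. It assumes PyTorch is installed.

```python
import torch
import torch.nn as nn

class TinyRadianceField(nn.Module):
    """Toy radiance field: (position, view direction) -> (density, RGB).

    Real systems add positional encodings and far larger networks; this only
    illustrates why highlights can follow the camera: the colour head sees
    the viewing direction, so appearance is view-dependent.
    """
    def __init__(self, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())
        self.density_head = nn.Linear(hidden, 1)
        self.color_head = nn.Sequential(nn.Linear(hidden + 3, hidden), nn.ReLU(),
                                        nn.Linear(hidden, 3), nn.Sigmoid())

    def forward(self, xyz, view_dir):
        feat = self.trunk(xyz)
        sigma = torch.relu(self.density_head(feat))            # non-negative density
        rgb = self.color_head(torch.cat([feat, view_dir], -1))  # colour depends on view
        return sigma, rgb

def render_ray(field, origin, direction, n_samples=64, near=0.1, far=4.0):
    """Composite colours along one camera ray (standard volume rendering)."""
    t = torch.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction                 # sample points along the ray
    dirs = direction.expand(n_samples, 3)
    sigma, rgb = field(pts, dirs)
    delta = (far - near) / n_samples
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * delta)   # opacity per sample
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha[:-1]]), dim=0)
    weights = trans * alpha                                # contribution of each sample
    return (weights[:, None] * rgb).sum(dim=0)             # final pixel colour

field = TinyRadianceField()
pixel = render_ray(field, origin=torch.zeros(3), direction=torch.tensor([0.0, 0.0, 1.0]))
print(pixel)  # random before training; training fits this output to the photos
```

Training then simply nudges the network until rendered pixels match the captured photos from every camera angle.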
Take these two examples:
These are results from an early test in 2023. I shot around 60 photos of the burger on a plain background to capture its shape and texture. Using multiple angles helps the AI engine understand not just the shape, but also the texture and reflective properties. See how the highlights on these burgers follow the camera, as if they were filmed like this? That is the power of Neural Radiance Fields.
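For anyone wanting to reproduce a capture like this, the first step after shooting is recovering the camera poses from the photos. A minimal sketch, assuming the open-source COLMAP command-line tools are installed and the shots sit in ./images (all folder names here are placeholders, and this stands in for whatever tooling your trainer of choice wraps around COLMAP):

```python
import os
import subprocess

IMAGES = "./images"         # the ~60 photos of the object (placeholder path)
WORKSPACE = "./colmap_out"  # where the database and sparse model will be written

os.makedirs(f"{WORKSPACE}/sparse", exist_ok=True)

def run(*args):
    """Run one COLMAP command and fail loudly if it errors."""
    subprocess.run(["colmap", *args], check=True)

# 1. Detect features in every photo.
run("feature_extractor", "--database_path", f"{WORKSPACE}/db.db",
    "--image_path", IMAGES)

# 2. Match features between photos taken from neighbouring angles.
run("exhaustive_matcher", "--database_path", f"{WORKSPACE}/db.db")

# 3. Solve for camera positions and a sparse point cloud.
run("mapper", "--database_path", f"{WORKSPACE}/db.db",
    "--image_path", IMAGES, "--output_path", f"{WORKSPACE}/sparse")
```

The recovered poses and sparse point cloud are then handed to whichever trainer you use (Instant-NGP, Nerfstudio, Postshot and the like), which is where the radiance field itself is learned.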
Post-Production Workflow and Culprits:
Since there are around 60-120 photographs of each item (as recommended by NVIDIA's research labs for best results), colours and textures need to be exact on all photos before they are processed by the AI engine. To handle this many shots and keep the consistency across all of them, I used Adobe Lightroom and restricted the corrections mainly to curves adjustments targeting colour values only. One of the culprits I encountered in this process is lighting conditions changing between shots: when the camera reaches the position of the image that wasn't coloured correctly, you end up with visible artefacts on screen.
To avoid these difficult situations, best practice is normally to erase the offending image from the database and run the learning algorithm again.
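Before retraining, it helps to spot the offending frames automatically. Below is a minimal sketch of my own making, not part of any NeRF tool, that flags photos whose average colour drifts noticeably from the rest of the set; it assumes Pillow and NumPy are installed and that the threshold is tuned per shoot.

```python
import glob
import numpy as np
from PIL import Image

def flag_color_outliers(folder, threshold=12.0):
    """Return photos whose mean RGB deviates strongly from the set average.

    A crude stand-in for "did the lighting change between shots?": compute
    the average colour of every frame and flag the ones that sit far from
    the median of the whole capture.
    """
    paths = sorted(glob.glob(f"{folder}/*.jpg"))
    means = np.array([
        np.asarray(Image.open(p).convert("RGB"), dtype=np.float32).mean(axis=(0, 1))
        for p in paths
    ])
    reference = np.median(means, axis=0)            # typical colour of the capture
    distances = np.linalg.norm(means - reference, axis=1)
    return [(p, round(float(d), 1)) for p, d in zip(paths, distances) if d > threshold]

for path, drift in flag_color_outliers("./images"):  # placeholder folder
    print(f"check or remove before training: {path} (colour drift {drift})")
```

Flagged frames can then be re-graded in Lightroom or simply dropped from the dataset before the next training run.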
Once a dataset is trained accurately and correctly, we can use After Effects with the "Gaussian Splatting" plugin to import our models directly into the program and use cameras to animate around the object.
We can also merge multiple models together or create a whole new scene inside our composition, using real-life measurements, camera settings and After Effects lights to interact with the model. However, these techniques are not yet fully production-ready: the results still don't hold up against the post-lighting capabilities of Cinema 4D and similar tools, and they still need a lot of post-processing that slows down render times.
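Outside of After Effects, two trained models can also be combined at the file level before import. A minimal sketch, assuming both exports are Gaussian splat .ply files sharing the same vertex layout and that the plyfile package is installed; the file names and the offset are illustrative placeholders only.

```python
import numpy as np
from plyfile import PlyData, PlyElement

def merge_splats(path_a, path_b, out_path, offset=(0.3, 0.0, 0.0)):
    """Concatenate two Gaussian splat .ply files into one scene.

    The second model is shifted by `offset` (in the scene's own units) so
    the two objects don't sit on top of each other in the merged file.
    """
    a = PlyData.read(path_a)["vertex"].data
    b = PlyData.read(path_b)["vertex"].data.copy()
    for axis, delta in zip(("x", "y", "z"), offset):
        b[axis] = b[axis] + delta
    merged = np.concatenate([a, b])
    PlyData([PlyElement.describe(merged, "vertex")]).write(out_path)

merge_splats("burger.ply", "plate.ply", "scene.ply")  # hypothetical file names
```

The merged file can then be brought into the compositing step as a single model, which keeps the camera animation workflow unchanged.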
A Chinese company is currently developing video NeRFs that allow a scene to be captured in 3D with animated cameras. I really can't wait to see what the future holds for these types of technologies, since they will certainly change the world of CGI forever, making it more accessible to the wider public and less expensive to process.
Limitations:
- Objects need to be static to be shot. Movement (like a sauce drip from a burger) is only possible with an array of cameras shooting at the same time.
- Any direction the object isn't shot from has no data after training. The results there are therefore unusable, since there is no generative fill-in option yet.
- Post-production can be difficult if lighting conditions change throughout the recording, or if items need to be retouched after the shoot, since making such continuous changes across 60+ frames is impossible or at least uneconomical.
- The usage of NeRFs is still quite technical, since there is no solid UI for them yet. Apps like "Postshot" or "Polycam" are good starting points for anyone who would like to try creating Gaussian splats.