"NeRF" turns 2D photos into convincing 3D scenes

Originally published at: "NeRF" turns 2D photos into convincing 3D scenes | Boing Boing


And there’s the catch.

(Emphasis mine)

It’s impressive that it happens in real time, but it’s not taking one photo, or even a number of unstructured photos like the SfM systems we’re familiar with (the best of which is currently being sanctioned out of existence because it’s from Russia). The known-position requirement means you either need a number of permanently installed cameras, basically building a giant 3D scanner, or you need to do a lot of measuring while shooting.
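For concreteness, a “known position” here means a full camera pose: intrinsics plus a camera-to-world transform for every shot. A minimal numpy sketch (function and variable names are my own, not from any NeRF codebase) of what that pose buys you, turning a pixel into the world-space ray the renderer marches along:

```python
import numpy as np

def pixel_to_ray(u, v, K, c2w):
    """Turn pixel (u, v) into a world-space ray, given pinhole
    intrinsics K and a 4x4 camera-to-world pose c2w. The pose is
    exactly what the 'known camera position' requirement supplies."""
    # Ray direction in camera coordinates (z points forward).
    d_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Rotate into world space; the ray origin is the camera center.
    d_world = c2w[:3, :3] @ d_cam
    origin = c2w[:3, 3]
    return origin, d_world / np.linalg.norm(d_world)

# Example: identity pose, toy intrinsics with the principal point at (100, 100).
K = np.array([[100.0, 0.0, 100.0],
              [0.0, 100.0, 100.0],
              [0.0,   0.0,   1.0]])
c2w = np.eye(4)
o, d = pixel_to_ray(100, 100, K, c2w)  # center pixel looks straight down +z
```

Without the pose, none of those rays can be placed in a common coordinate frame, which is why SfM systems spend most of their effort estimating it.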


There’s a thing called MiDaS that can look at a single image and make a depth map from that. It doesn’t need any information about the camera. That thing is amazing.

Check out this Colab page that utilizes MiDaS: 3D Photo Inpainting - Turn Any Picture Into 3D Photo with Deep Learning and Python
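The interesting step after MiDaS runs is what you do with its depth map. A hedged numpy sketch of back-projecting a depth map into a point cloud, which is the core of the “3D photo” effect; the pinhole intrinsics `fx, fy, cx, cy` are assumed values, and note that MiDaS actually outputs *relative* inverse depth, so a real pipeline rescales it first:

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an HxW depth map into an (H*W, 3) point cloud,
    assuming a simple pinhole camera model."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]      # per-pixel coordinates
    z = depth
    x = (u - cx) * z / fx          # unproject each pixel along its ray
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Toy example: a 2x2 "depth map" at a constant depth of 2.0.
pts = depth_to_points(np.full((2, 2), 2.0), fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

That point cloud (plus inpainting of the disoccluded gaps) is roughly what the Colab notebook animates.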


I find it extremely implausible that this current requirement will not be sidestepped in the next iteration or two. “Two more papers down the line”, as Dr. Károly Zsolnai-Fehér always says.


It turns out that the deepfake scene from 1987’s The Running Man (embedded below) was not only disarmingly good, but has arrived in reality on its science-fictional schedule.

This very much isn’t deepfake, though - it’s really the opposite, as it’s constructing the scene in the images. Deepfake needs far more data, time and manual massaging and delivers results that aren’t nearly as good. (Though I wonder if being able to plug the results of this into the deepfake data pipeline could improve it…)

It could be amazing for making 3D models of things, though - photogrammetry can be a pain in the ass.


NeRF isn’t from NVIDIA. The original NeRF methods, and many of the improvements, were developed at UC Berkeley and Google Research, starting with: NeRF: Neural Radiance Fields
It’s been a very hot research area since then and there have been many improvements from different labs.


Is it a 3d scene? The sparse video seems to be an animated 2d scene with a moving camera. It’s not in 3d, it’s not a 3d model, it’s not a 3d set. It’s impressive but confusing.


I…don’t really understand what your definition here is. Are you saying it should be like VR-ready 3d, with two points of view? Because I would absolutely describe that as a 3D scene - you can pan around the character in 3 dimensions and they have a clear depth beyond two dimensions.


I’m a 3d modeler and animator who’s studied a lot of AI image generation, and that’s why I’m asking: I’m confused. I don’t think there’s actually any 3d data being generated here, but I could be wrong. Yes, the camera “moves,” and yes, we see the back of the subject. But these are (maybe) rapidly generated 2d images giving us the impression the camera is moving through 3d space.

So I’m asking for clarification: it does look 3d, but is it 3d? A stereogram looks 3d but isn’t really, and a 3d movie looks 3d but isn’t really. I want to know if there’s full 3d data being made, i.e. is it a model? We call both of those things 3d, so you’re right. I was asking about the other kind of 3d: geometry, materials, surfaces, etc.


It’s a full 3D representation: NeRF learns density and color for every point in the volume, so you can extract an actual mesh from it (e.g. with marching cubes), even though the scene itself is stored in a neural network rather than as polygons.


What? No, just get the inertial gyro info off the cellphone that took the pix. When it’s not in ‘poor accuracy mode’ asking you to do loop de loops to calibrate it (looking at you, Android.)

