Dolly Zoom using Möbius Transforms

Dolly zoom is the technique of zooming a camera while a dolly moves the camera towards or away from the subject. The combined effect is that the subject doesn’t change size but the background moves in a weird way that distorts perspective. This effect was pioneered in the movie Vertigo but has been used countless times since. I first noticed it in Spike Lee’s “Do the Right Thing”.

Some time ago, Henry Segerman posted a video showing how to generate a similar effect with 360 video.

He achieved this effect using Möbius transforms. There is a unique Möbius transformation mapping any 3 distinct points to any other 3 distinct points. So you can use such a transform to keep the same 3 points fixed no matter what!
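For concreteness, here is a minimal sketch of how that unique transform can be computed (my own illustration, not Henry’s code): represent a Möbius map (az + b)/(cz + d) as a 2×2 complex matrix, build the classic map sending each triple to 0, 1, ∞, and compose one with the inverse of the other. It assumes all six points are finite and distinct.

```python
import numpy as np

def mobius_from_points(zs, ws):
    """2x2 matrix of the unique Möbius transform sending zs[i] -> ws[i].

    Assumes three distinct, finite points in each triple. A Möbius map
    (az + b)/(cz + d) is represented by the matrix [[a, b], [c, d]].
    """
    def to_standard(p):
        # The classic map sending p[0] -> 0, p[1] -> 1, p[2] -> infinity.
        return np.array([[p[1] - p[2], -p[0] * (p[1] - p[2])],
                         [p[1] - p[0], -p[2] * (p[1] - p[0])]], dtype=complex)

    # Send zs to (0, 1, inf), then pull back to ws.
    return np.linalg.inv(to_standard(ws)) @ to_standard(zs)

def apply_mobius(M, z):
    (a, b), (c, d) = M
    return (a * z + b) / (c * z + d)
```

To keep three tracked points fixed while everything else moves, you would compute, for each frame, the transform sending the points’ new positions back to their original ones.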

That was something I wanted to experiment with. Here is what I came up with first:

On the left is an equirectangular projection of a 360 still I took on my rooftop. I superimposed 3 Nyan Cats wandering around the image. On the right is the same video, BUT I keep the cats’ locations fixed using a Möbius transform and the world has to adjust around them.

So on the left the world stays fixed and the cats move. On the right the cats stay fixed and the world moves. If you don’t believe that the cats stay still just put your finger on one and you’ll see that while it might shrink or grow or flip, it always stays under your finger.
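To make “the world moves around the cats” concrete, here is a rough self-contained sketch (my reconstruction, not the actual code behind these videos) of applying such a transform to an equirectangular frame: for each output pixel, go pixel → sphere → complex plane via stereographic projection, apply the inverse transform, and sample the source frame at the result.

```python
import numpy as np

def mobius_warp(img, M):
    """Inverse-warp an equirectangular image by the Möbius transform given
    by the 2x2 complex matrix M (acting as z -> (az + b)/(cz + d)).

    Nearest-neighbour sampling only; assumes M does not send any sampled
    point exactly to infinity.
    """
    h, w = img.shape[:2]
    out = np.empty_like(img)
    (a, b), (c, d) = np.linalg.inv(M)  # the inverse Möbius map
    for j in range(h):
        for i in range(w):
            # Pixel centre -> longitude/latitude on the image sphere.
            lon = (i + 0.5) / w * 2 * np.pi - np.pi
            lat = np.pi / 2 - (j + 0.5) / h * np.pi
            # Sphere point, then stereographic projection from the north pole.
            X, Y, Z = np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)
            z = (X + 1j * Y) / (1 - Z)
            zz = (a * z + b) / (c * z + d)
            # Back from the plane to longitude/latitude.
            n = abs(zz) ** 2
            slon, slat = np.arctan2(zz.imag, zz.real), np.arcsin((n - 1) / (n + 1))
            # Sample the source image at that location.
            si = int((slon + np.pi) / (2 * np.pi) * w) % w
            sj = int((np.pi / 2 - slat) / np.pi * h) % h
            out[j, i] = img[sj, si]
    return out
```

A real implementation would vectorize the loop and interpolate between neighbouring pixels, but the geometry is exactly this.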

The changes in perspective in the above Nyan Cat video are weird and dramatic indeed. But since the cats are all in a line, the equator doesn’t move. In this next video the cats move at different latitudes for more dramatic ‘tiny world’ effects:


Remember that Möbius transforms are possible with spherical (360) images because we identify the image sphere with the Riemann sphere, as I covered in an earlier note. So when I speak of an equator or a latitude, I am of course referring to the image sphere around the camera, not the Earth.
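That identification is just stereographic projection. As a small sketch (the helper names are my own), here is the round trip between a point on the image sphere, given by longitude and latitude, and a point in the complex plane:

```python
import numpy as np

def sphere_to_complex(lon, lat):
    """Longitude/latitude on the unit image sphere -> complex number,
    via stereographic projection from the north pole (undefined exactly
    at the pole itself)."""
    x = np.cos(lat) * np.cos(lon)
    y = np.cos(lat) * np.sin(lon)
    z = np.sin(lat)
    return (x + 1j * y) / (1 - z)

def complex_to_sphere(w):
    """Inverse stereographic projection: complex number -> longitude/latitude."""
    n = abs(w) ** 2
    return np.arctan2(w.imag, w.real), np.arcsin((n - 1) / (n + 1))
```

Under this identification the equator of the sphere maps to the unit circle, which is why fixing points on the equator leaves the whole equator in place.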

Aside: I find it quite natural to read equirectangular images now. The fact is, it is inconvenient to always be scrolling left, right, up, and down in order to view a real 360 image. An equirectangular projection has all the same information but requires less work. I wonder if this will become a common way for people to understand images. There is nothing special about the perspective and projection we’ve been using since the Renaissance.

Finally, let’s try the same effect on a video with people instead of Nyan Cats:


One effect I love: when 2 people meet, the algo has to flip the whole picture upside down to preserve spacing.