A pan, zoom, and rotate gesture model for touch devices

No mobile app would be complete without a few gestures. I already had swiping in place, so now I turned my attention to panning, zooming, and rotation. These nicely round out the basic gestures of Fuse. The math turned out quite simple this time, but it was a bit of a challenge figuring out exactly how the model works.

The model

As per usual, I just started writing the gestures without thinking too much about how the model works. I started with zooming, then added rotation, and then wedged in some panning. Though this clearly wasn’t a nice unified model, I prefer starting this way. I find it easier to come up with a nice model once I’ve already introduced myself to the problem thoroughly.

I should note this approach involves a lot of rewriting. But really, any new feature in a product should involve this. Rewriting is just a natural part of programming; your first attempt should just be considered a draft. Assuming you can design once, then implement once, is just a dreamer’s fallacy.

Once I had some basics working I identified the problem: rotation and panning involve the same finger movements and thus must be intimately related. The rotation and panning aspects could not be separated from each other.

I thought about how it should work by playing with a piece of paper on my desk. I slid and spun the paper, watching what happened. I noticed that my fingers never actually change position relative to the paper. If I placed my fingers in two little circles, I could move and spin the paper all I wanted and they would never leave that position. That was the model I wanted.

The rotation math

The gesture starts when the user places two fingers on the device. The calculations are done relative to this start position. The gesture tracks the total rotation and panning relative to the start, rather than an incremental amount from each movement. This leads to much more stable results, and works better when constraints are involved.

The angle of rotation is easy to calculate. Our friend atan2 basically does the work for us:

var startVector = point#0.start - point#1.start
var currentVector = point#0.current - point#1.current

var startAngle = math.atan2( startVector.Y, startVector.X )
var endAngle = math.atan2( currentVector.Y, currentVector.X )

var deltaAngle = endAngle - startAngle

This calculation is done interactively, on each pointer moved event. deltaAngle is the change in angle since the start of the gesture: how much the user has rotated the on-screen element.
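As a rough, runnable illustration of the same calculation, here is a small Python sketch. The function name `rotation_delta` and the use of plain `(x, y)` tuples are my own choices for the example, not part of Fuse. The final wrap-around step is also an assumption that the snippet above doesn’t include: subtracting two atan2 results can jump by a full turn when the fingers cross the ±180° line, so the sketch folds the difference back into the range [-pi, pi).

```python
import math

def rotation_delta(start0, start1, current0, current1):
    """Angle in radians the finger pair has turned since the gesture began.

    Each argument is an (x, y) tuple; the names are hypothetical.
    """
    # Vector from the second finger to the first, at the start and now.
    start_vec = (start0[0] - start1[0], start0[1] - start1[1])
    current_vec = (current0[0] - current1[0], current0[1] - current1[1])

    start_angle = math.atan2(start_vec[1], start_vec[0])
    end_angle = math.atan2(current_vec[1], current_vec[0])

    # Assumption (not in the original snippet): fold the difference into
    # [-pi, pi) so crossing the +/-180 degree line doesn't cause a jump.
    delta = end_angle - start_angle
    return (delta + math.pi) % (2 * math.pi) - math.pi
```

The folded angle describes the same on-screen orientation as the raw difference, so whether the wrap is wanted depends on what the caller does with the value, for example whether it needs to count full turns.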

The panning math

Panning at first seems odd since we have two points to work with: the first and second finger. Looking back at the paper model though, I realized that the center point between them never moves relative to either finger. The center point is a single point we can use for the calculation.

var startCenter = (point#0.start - point#1.start)/2 + point#1.start
var currentCenter = (point#0.current - point#1.current)/2 + point#1.current
var rawTrans = currentCenter - startCenter

I’m using the name rawTrans to indicate this isn’t the actual translation we’re interested in. It’s in the wrong coordinate system. If the user didn’t rotate at all this would be the correct value, but they can rotate, and then this translation is measured in that rotated space. This assumes we want the translation measured in the container of the thing we’re rotating, which in our case we do. It is a reasonable assumption for UI.

Consider what happens if you put your two fingers vertically above each other and then slide up/down. This translates to a movement in the y-axis. What if you rotate your fingers to be horizontal and then slide up/down? This translates to an x-axis movement relative to the original finger position.

var rotationMatrix = matrix.rotationZ(-deltaAngle)
var translation = vector.transform(rawTrans, rotationMatrix)

Flat rotations in 2D are just rotations around the z-axis in a 3D API. We already calculated the angle the user rotated, so we just create the transform matrix to undo that rotation.

Creating a full 4×4 matrix and doing a generic 3D transform is mathematical overkill. If this were part of a critical pathway I’d take the time to specialize this code, since I know a 2D transform of a vector is really quite simple. But this isn’t a critical pathway; it happens only once per pointer event. Using generic functions makes it very clear what I’m doing and saves me having to debug extra code.
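For the curious, the specialized 2D version might look something like the Python sketch below. It is not the code used in Fuse; `rotate_2d` and `undo_rotation` are hypothetical names, and I’m assuming the usual convention where a positive angle turns the vector counter-clockwise in a y-up coordinate system (on a y-down screen the same math looks clockwise).

```python
import math

def rotate_2d(vec, angle):
    """Rotate the (x, y) tuple vec by angle radians around the origin."""
    c, s = math.cos(angle), math.sin(angle)
    x, y = vec
    # Standard 2D rotation: the upper-left 2x2 block of a rotation about z.
    return (x * c - y * s, x * s + y * c)

def undo_rotation(raw_trans, delta_angle):
    """Express raw_trans in the unrotated space by rotating it back."""
    return rotate_2d(raw_trans, -delta_angle)
```

For example, after a quarter-turn gesture, `undo_rotation((0.0, 1.0), math.pi / 2)` comes out as roughly `(1.0, 0.0)`: a slide along the y-axis in the rotated space becomes an x-axis movement in the original space.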

The zooming math

My piece-of-paper model doesn’t really work for zooming, but fortunately zooming is rather easy. We just need the ratio of the current distance between the two fingers to their distance at the start.

var startVector = point#0.start - point#1.start
var currentVector = point#0.current - point#1.current

var scale = vector.length(currentVector) / vector.length(startVector)

There’s one other minor adjustment to be made to the translation, since rawTrans is being measured not just in the rotated space, but also in the scaled space:

var translation = vector.transform(rawTrans, rotationMatrix) / scale
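To see the three pieces working together, here is a self-contained Python sketch that computes the rotation, scale, and corrected translation from the start and current positions of two fingers. It is only an illustration under my own assumptions: the function name `two_finger_transform`, the `(x, y)` tuples, and the error raised for coincident fingers are all hypothetical, and the rotation is done with plain 2D math rather than a matrix API.

```python
import math

def two_finger_transform(start0, start1, current0, current1):
    """Return (delta_angle, scale, translation) for a two-finger gesture.

    All points are (x, y) tuples; every name here is hypothetical.
    """
    # Vectors between the two fingers at the start and now.
    sx, sy = start0[0] - start1[0], start0[1] - start1[1]
    cx, cy = current0[0] - current1[0], current0[1] - current1[1]

    start_len = math.hypot(sx, sy)
    current_len = math.hypot(cx, cy)
    if start_len == 0 or current_len == 0:
        # Assumption: coincident fingers are treated as invalid input.
        raise ValueError("the two fingers must not be at the same point")

    delta_angle = math.atan2(cy, cx) - math.atan2(sy, sx)
    scale = current_len / start_len

    # Movement of the center point between the fingers.
    raw_x = (current0[0] + current1[0]) / 2 - (start0[0] + start1[0]) / 2
    raw_y = (current0[1] + current1[1]) / 2 - (start0[1] + start1[1]) / 2

    # Undo the rotation, then the scale, to leave the gesture's space.
    c, s = math.cos(-delta_angle), math.sin(-delta_angle)
    translation = ((raw_x * c - raw_y * s) / scale,
                   (raw_x * s + raw_y * c) / scale)
    return delta_angle, scale, translation
```

A pure two-finger slide, such as moving both fingers by (3, 4) without changing their spacing or orientation, gives an angle of 0, a scale of 1, and a translation of (3, 4), which is a handy sanity check.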

Pulling it together

The above is all the basic math needed for the core of the pan, zoom, and rotate gestures. There’s a slight bit more involved in the wrapping API, but I can get into that in a future article. Overall this felt a bit easier to implement than the swipe gesture, and a lot easier than the nasty finger velocity measurement.
