We use quaternion slerping to interpolate between rotations. But I've seldom seen a precise discussion on what it does and why it's correct. This tutorial/review article provides a walkthrough of some core identities for 3D rotations, and ends with a discussion that establishes the equivalence between axis-angle interpolation and quaternion slerping.
[u^]2=uuT−(uTu)I. tr([u^]2)=−2uTu. If u has norm 1, [u^]2=uuT−I.
2. Matrix Exponential
Define matrix exponential of a square matrix X∈Rn×n to be the series
exp(X):=k=0∑∞k!Xk.
Fact 1: The series converges (absolutely; allows reordering) [proof]. exp(A)exp(B)=exp(B)exp(A) in general. exp(X) results in an invertible matrix [proof]: exp(−X)exp(X)=exp(X)exp(−X)=I.
Fact 2: Matrix exponential is generally not a 1-to-1 mapping (it is for R1). It's generally many-to-1 [example]. See the detailed case breakdown for R3×3 in the axis-angle mapping section below.
Fact 3: If X is skew-symmetric, then exp(X) is an orthogonal matrix, i.e. its transpose is its inverse:
exp(X)T=exp(XT)=exp(−X)=exp(X)−1.
Furthermore, its determinant det(exp(X))=exptr(X)=exp(0)=1, by Jacobi's identity [proof]. Thus, exp(X) produces special orthogonal matrices SO(n).
We also claim that every special orthogonal matrix R can be generated by some skew-symmetric matrix X. The mapping is surjective [missing proof].
Fact 4: If (λ,v) is an eigen val/vec pair of X, then exp(X)v=∑k=0∞k!λkv. (eλ,v) is an eigen pair of exp(X).
Fact 5: exp(BAB−1)=I+2!(BAB−1)2+⋯=BIB−1+B2!A2B−1+…=Bexp(A)B−1.
Fact 6: ∂texp(tA)=∂t(I+tA+2!t2A2+…)=A+tA2+2!t2A3+…=Aexp(tA)=exp(tA)A. Let y(t)=exp(tA)x. Then y˙(t)=Aexp(tA)x=Ay(t).
We are therefore able to come up with guesses when encountering ODEs of the form
B˙=AB or BAy˙=Ay⟹B(t)=exp(tA)⟹y(t)=exp(tA)xfor some initial x.
These are less useful for rotations, but central in modeling probability distributions of markov chains.
3. Rotation Exponential Map
3.1. axis-angle map: 4-number, constrained.
In 3D, given (θ∈R,u∈R3), with ∥u∥=1 and θwithout range restriction, [u^] is skew-symmetric, and by fact 3
rotates tx on the x,y plane, which is orthogonal and right-handed to u, from x to y by angle θ. The same conclusion can be reached by a spectral analysis on the complex eigenvalues. Action on any vector can be decomposed into action on some su+tx.
Therefore it is appropriate to name (θ,u) the axis-angle parameterization of a rotation matrix.
This mapping is not injective. To produce the same rotation matrix, for some integer k, one could have
(θ,u) and (−θ+k⋅2π,−u), e.g. k=−1,(−(2π−θ),u) rotates through the opposite turn.
(θ,u) and (−θ+k⋅2π,−u), e.g. k=1,(2π−θ,−u) keeps the rotation angle under π.
At θ=k⋅2π, any u maps to identity matrix.
See the cases with quaternion for contrast.
3.2. exponential map: 3-number, unconstrained
Sometimes the terms axis-angle map and exponential map are used interchangeably. We make the distinction more explicit.
Axis-angle map takes 4 numbers (θ,u) to rotation matrix via matrix exponential. ∥u∥=1 simplifies the infinite series into the concise Rodrigues formula.
Given 3 unconstrained numbers h∈R3,h=0, the axis-angle form is (∥h∥,∥h∥h), and
because numerical software is unaware of the interaction bewtween trig terms and 1/x when the two parts are evaluated separately. To look up series like these, use wolfram alpha.
Eq. 9 bijectively identifies h=0h⟷(θ>0θ,u). At h=0, we stipulate h⟷(0,0).
Due to their 1-to-1 correspondence, statements about h∈R3 can be made using (θ≥0θ,u) as its proxy.
To put it differently, exponential map is just axis-angle map restricted to θ≥0. For example, exp([h^])=exp([−h]), and their proxies (θ,u), (θ,−u) share the same angle but flipped axis, (not opposite angle, same axis).
[Invertibility of exponential map] exponential map exp([h^]) is injective and thus invertible when the proxy angle (θ:=∥h∥,u) is in range 0≤θ<π. The inverse is called the logarithm map.
Proof. Earlier we have listed the three types for ambiguous cases of axis-angle mapping. exp([h^])=I⇒h=0. For θ:=∥h∥∈[0,π), the most important case (2π−θ,−u) is the same rotation but its angle 2π−θ>π violates the range. Other cases are outside the range too. Within [0,π) the mapping is 1-to-1.
Some authors do not make a distinction between axis-angle map and exponential map; we emphasize it at the cost of verbosity.
An advantage of exponential map is that it allows unconstrained optimization of rotations via h. Optimizing axis-angle (θ,u) requires care to maintain the norm 1 constraint on u.
3.3. Log Map, Matrix → Axis-Angle, and inversion in general
Log map inverts rotation matrix R to h such that exp([h^])=R. Again, inverting to h is 1-to-1 identified as inverting to some (θ≥0θ,u). So here we consider the general problem of inverting to axis-angle mapping.
Every rotation matrix can be produced by some axis-angle (θ,u) s.t. θ∈[0,π], because if θ>π, (2π−θ,−u) produces the same rotation.
Knowledge of cosθ (or tanθ) is sufficient to determine rotation angle θ, since arccos (or arctan) is unambiguous for [0,π]. On the contrary sinθ alone is insufficient to determine θ.
It is no coincidence that both rotation matrix and quaternion provide unambiguous ways to get cosθ (for quat, cosθ=2w2−1), and sinθ=±1−cos2θ is only decidable up to sign, with + chosen to have θ in range [0,π].
R−RT=2[u^]sinθ. u=2sinθ(R−RT)∨ if sinθ=0. By deciding the sign of sinθ to be +, we have also locked the sign of u. If tr(R)=−1,cosθ=−1,θ=π, R is symmetric, and sinθ=0; we obtain pairwise products of entries of u by R=I+2[u^]2⇒uuT=2R+I. With sinθ=0, the sign of u is free to flip. Set one of the non-zero element to positive, and get the rest from pairwise products.
R+RT=2(I+(1−(tr(R)−1)/2)cosθ)[u^]2)⟹uuT=3−tr(R)R+RT+(1−tr(R))I if tr(R)=3. Rotation axis at I is undefined. This identity provides pairwise products of u at all θ>0, and we can compute u up to sign flip as described above. But make sure that the sign of u is compatible with the +ve sign of sinθ as their products are locked by [u^]sinθ=2R−RT. The need for the compatibility check makes this route not that useful vs item 2.
The alternative is rotation matrix → quaternion → axis-angle. The baseline approach in rotation matrix → quaternion is similar to what is outlined in this section.
4. A Historical Note on Quaternions
We discuss an easier and more historically faithful way to motivate the use of quaternion for rotations.
Quaternions are often introduced as hypercomplex numbers.
The problem with this presentation is that it is difficult to intuit a priori why conjugation R(q)x=qxq∗ should perform rotation acting on x while product q3=q2q1 implements rotation composition. The formulae can be algebraically verified posthoc, but their origins are puzzling. Why not use x′=qx to rotate vectors?
Hamilton himself seemed to have struggled with this exact question, even after Cayley's article came to his attention.
It turns out that historically Rodrigues first developed the use of a 3- or 4-number tuple as a rational representation of rotations without trigonometric functions. The concept of a matrix was not yet prevalent, but his proposed formulae in effect 1) converts the 3/4 numbers to a rotation matrix, and 2) composes successive rotations into a new 3/4-number tuple.
Arthur Cayley studied [#6, 1843] Rodrigues' results,
and noticed [#20, 1845] that quaternion conjugation coincides with Rodrigues' formulae.
His background in Rodrigues' prior work made the discovery much more plausible, and he went on to extend quaternions in several directions, one of which is the representation of 4D rotations [#137].
5. Absorbing ambiguities with intermediary:
a simple 5D representation
The spirit is to capture intermediate variables in Rodrigues mapping of eq. 6.
One source of ambiguity in axis-angle is the angle. We over-parameterize θ to absorb its ambiguities.
Let (x,y):=(cosθ,sinθ). The naming is consistent with cosθ measuring the x-axis of a point on unit circle.
The forward mapping becomes
R(x,y,u)=I+y[u^]+(1−x)[u^]2.
The ambiguities have been reduced. To map to the same rotation matrix,
(x,y,u) and (x,−y,−u). This keeps the rotation angle under π.
When (x,y)=(1,0), any u maps to identity matrix.
One way to motivate quaternion is that it resolves case 2 by merging sine and axis u.
6. Quaternion, a 4D representation
Using the fact that sinθ=2sin2θcos2θ and cosθ=2cos22θ−1 [derive with Euler's identity]
We capture the intermediary by letting (w,v):=(cos2θ,sin2θu). (w,v) is referred to as quaternion. This 4d vector is naturally of norm 1.
[Reduced ambiguities]
Since (θ,u)many-to-1⟹(w,v)2-to-1⟹R. The intermediary (w,v) absorbs and consolidates the many-to-one exponential map of (θ,u)⇒R into a 2-to-1 mapping through
(w,v):=(cos2θ,sin2θu). (w,v) and (−w,−v) map to the same rotation, but this 2-to-1 mapping does not correspond to simply flipping the axis-angle. Flipping the signs of (θ,u) in (cos2θ,sin2θu) doesn't change its value.
To change (w,v) to (−w,−v), the options are
(2θ,u)→(2−θ+k⋅2π,−u) for odd int k, e.g. k=1,(2−(2π−θ),u) rotates through the opposite turn.
(2θ,u)→(2−θ+k⋅2π,−u) for odd int k, e.g. k=1,(22π−θ,−u) keeps rotation angle under π.
To keep at the same (w,v), the options are
(2θ,u)→(2−θ+k⋅2π,−u) for some even int k, e.g. k=0
(2θ,u)→(2−θ+k⋅2π,−u) for some even int k, e.g. k=0,(−2θ,−u) for flipped axis-angle.
In this sense, the 2-to-1 ambiguities of quaternion is the result of grouping the many-to-1 ambiguities that axis-angle possesses into "even" and "odd" sides, because all four axis-angle configurations, with k being even or odd, map to the same rotation matrix.
6.1. Quaternion → Rotation Matrix
The forward mapping starting from the intermediary (w,v) is
The off-diagonal entries in eq. 13 are unambiguous. For the diagonal entries, using the constraint w2+v12+v22+v32=1, there are three different ways to write it in the literature and various codebases. WLOG consider the 2nd diagonal entry i.e. R11 with 0-based indexing,
Version 3 / eq. 16 provides critical insight for (parallelizable) inverse mapping from rotation matrix to quaternion. Version 1 / eq. 14 is for example used in Gaussian splatting. A few other papers use version 2.
These different diagonal parameterizations will result in different gradients when backprop-ing from R(w,v) to [w,v]. But after projecting the gradient (w.r.t. norm 1 constraint / to be orthogonal to [w,v]), the gradients become the same [missing proof].
6.2. Rotation Matrix → Quaternion
The problem with the inversion step is always "what happens if one of them is 0".
[Baseline]
We first present a baseline method that involves explicit if-else branching. This is not parallel-friendly and also numerically not ideal because small input perturbations could execute a different code branch.
If tr(R)>−1, then we set w=+41+tr(R) and v=4w(R−RT)∨. Note that when R=I,tr(R)=3,w=1,v=40 is correct.
If tr(R)=−1, then w=0,R=RT,∥v∥=1 and vvT=2R+I, from which we read off v12,v22,v32,v1v2,v1v3,v2v3. At most two of v1,v2,v3 could be 0. Fix the non-zero one to be positive, and obtain the other two.
Overall this routine uses a significant number of conditional branching.
[Better method]
Using the diagonal parameterization in eq. 16, we have
The first row of L comes from the norm 1 constraint. We supply it so that the linear system is solvable. Interestingly L−1 is 41L. It makes 21L an involutary, symmetric, orthogonal matrix.
Now that we have the squares of each number, it remains to determine the signs. Using eq. 18, we can use the off-diagonal entries to obtain pairwise products
At any time, at least one of [w,v1,v2,v3] is non-zero due to norm-1 constraint, and thus a common strategy is to select the one with the largest denominator out of the 4 options.
A batch-parallel implementation can be found in PyTorch3d.
6.3. Quaternion → Axis-Angle
[to 4 numbers (θ,u)]
Over the batch set w=cos2θ non-negative so that θ is within [0,π]. Recall and contrast that for axis-angle, no restriction is placed on the value of cosθ for θ to be in [0,π].
[to 3 numbers h∈R3]
If the goal is instead to directly invert to unnormalized axis,
then it is numerically stabler to use small angle approximation arctan(x)=x−31x3+….
See implementation in LieTorch, which is itself based on Sophus.
Moreover, this stackexchange post pointed out that we can use half-angle formula of tangent to implement atan2.
Again the gradient is different depending on the diag parameterization used. But they become the same after projection.
7. Interpolation
It is often stated that quaternion is the ideal representation to interpolate rotations with.
Its advantage is not obvious because the behaviors of quaternions are hard to intuit. This section attempts to establish some precise characterizations.
7.1. Axis-Angle Interpolation
Rotations are actions / groups,
and an intuitive way to interpolate from action R1 to R2 by ratio t is to gradually apply the connecting action by
interp(R1→R2,t)=R1(R1TR2)t=(R2R1T)tR1.
It would satisfy the endpoints: interp(R1→R2,0)=R1 and interp(R1→R2,1)=R2.
For diagonalizable A=XΛX−1,t∈R, matrix power is defined as At:=XΛtX−1. In particular, using fact 5, exp(A)t=(Xexp(Λ)X−1)t:=Xexp(Λ)tX−1=Xexp(tΛ)X−1=exp(tXΛX−1)=exp(tA) .
The natural way to implement this interpolation is to invert R1TR2 to its axis-angle mapping and then do R1(expθ[w^])t.
[Ambiguity of (R1TR2)t,t∈(0,1)]
Just as the axis-angle form of a rotation is not unique, raising matrix to a power t is not unique either.
Rotation matrices are unitarily diagonalizable, and in 3D Λ is some diag[1,a+bi,a−bi] with a2+b2=1.
The problem is that for matrix with complex eigenvalues, non-integer power of a complex number (a+bi)t is not uniquely defined [details].
We need to first take complex log and convert a+bi to polar form eiθ, before raising to eit⋅θ. Complex logarithm is not unique (just like axis-angle of a rotation is not). eiθ=ei(θ+k⋅2π) for any integer k. When raised to power t, however,
eit⋅(θ+k⋅2π)=eit⋅θ⋅=1 unless t is integereit⋅k⋅2π.
Hence in general (eiθ)t=(ei(θ+k⋅2π))t for t∈(0,1).
The situation manifests as the ambiguity of log-map from rotation matrix to axis-angle R1TR2=expθ[w^]. We need to ensure θ∈[0,π], so that interpolation takes place over the small arc,
This formula is not easily inventable without thinking about hypercomplex numbers.
The matrix version of quaternion product in eq. 30 is used in rigid body dynamics to write a linear ODE.
Rodrigues's derivation in 1840 uses the geometry of the spherical triangle formed by the 3 rotation axes. That picture is intuitive enough and clearly explains the critical role of half angles and bisecting lines.
See Altmann Rotations Quaternion and Double Groups p159.
But working out the actual formula from the picture requires spherical trigonometry (not too bad) in addition to extensive, non-obvious algebraic steps. It seems improbable that one could naturally arrive at the quaternion product without knowing the final goal beforehand as a guide. Detecting the similarity between Rodrigues' rotation composition rule and quaternion product seems only possible in hindsight.
7.3. Slerp: spherical linear interpolation
The slerping routine is generally applicable to vectors. Besides rotation, it is most often used to interpolate Gaussian noise vectors that act as initiating latents of deep generative models.
To interpolate from u to v, the gist is to establish a basis in the plane spanned by {u,v}, and we need 2 orthogonal axis vectors.
The angle θ between u,v is dependent on the choice of basis. The ordering of the basis vectors may change θ to −θ or 2π−θ. What is invariant is cosθ=cos(−θ)=cos(2π−θ). Assuming u,v are of unit norm, cosθ=u⋅v, and again dot product is invariant to change of basis.
Since the interpolation starts at u, we let it be one of the axis for simpler algebra.
To find the other orthogonal axis w, we need the precise phase, for which we let sinθ=+1−cos2θ in order that the angle θ∈[0,π] and the interpolation travels along the small arc.
v=ucosθ+wsinθ⟹w=+1−(v⋅u)2v−(v⋅u)u.
To compute the interpolated vector with ratio t∈[0,1],
But this is using quaternion product, not slerping. To slerp between q1,q2, and show that's it's equivalent to quat product, we let their geometric angle be Ω.