Why slerp the quaternions?

2024-12

We use quaternion slerping to interpolate between rotations. But I've seldom seen a precise discussion on what it does and why it's correct. This tutorial/review article provides a walkthrough of some core identities for 3D rotations, and ends with a discussion that establishes the equivalence between axis-angle interpolation and quaternion slerping.

1. 3D skew-symmetric matrix

$\mX$ is skew-symmetric if $\mX^T = -\mX$. The common way to parameterize such a matrix in $\R^{3 \times 3}$ is with a vector $\vu = [x, y, z]^T$ s.t.
\begin{align} \skewm{\vu} = \begin{bmatrix*}[r] 0 & -z & y \\ z & 0 & -x \\ -y & x & 0 \end{bmatrix*}.\end{align}
The variable placement and the signs are very deliberate. $\skewm{\vu}$ is rank-2, with null space spanned by $\vu$ itself.
Useful identities of a skew-symmetric matrix:
  • Cross product matrix: $\skewm{\vx} \vy = \vx \times \vy$, $\quad \skewm{\vx} \vy = -\skewm{\vy} \vx$, $\quad \skewm{\vx} \vx = \bm{0}$.
  • $\skewm{\vu}^2 = \vu \vu^T - (\vu^T\vu) \mI$, so $\tr{\skewm{\vu}^2} = -2\vu^T\vu$. If $\vu$ has norm 1, $\skewm{\vu}^2 = \vu \vu^T - \mI$.
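These identities are easy to verify numerically. A numpy sketch (`skew` is our illustrative helper name, not from the text):

```python
import numpy as np

def skew(u):
    """Return the 3x3 skew-symmetric (cross-product) matrix [u]_x."""
    x, y, z = u
    return np.array([[0.0,  -z,   y],
                     [  z, 0.0,  -x],
                     [ -y,   x, 0.0]])

u = np.array([1.0, 2.0, 3.0])
v = np.array([-4.0, 0.5, 2.0])

# [u]_x v = u x v, and [u]_x u = 0 (u spans the null space).
assert np.allclose(skew(u) @ v, np.cross(u, v))
assert np.allclose(skew(u) @ u, 0.0)

# [u]_x^2 = u u^T - (u^T u) I, so tr([u]_x^2) = -2 u^T u.
assert np.allclose(skew(u) @ skew(u), np.outer(u, u) - u.dot(u) * np.eye(3))
assert np.isclose(np.trace(skew(u) @ skew(u)), -2 * u.dot(u))
```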

2. Matrix Exponential

Define the matrix exponential of a square matrix $\mX \in \R^{n \times n}$ to be the series
\begin{align} \exp{(\mX)} := \sum_{k=0}^\infty \frac{\mX^k}{k!}. \label{eq:expm}\end{align}
Fact 1: The series converges absolutely, which allows reordering [proof]. $\exp{(\mA)}\exp{(\mB)} \neq \exp{(\mB)}\exp{(\mA)}$ in general. $\exp{(\mX)}$ is always an invertible matrix [proof]: $\exp{(-\mX)} \exp{(\mX)} = \exp{(\mX)} \exp{(-\mX)} = \mI$.
Fact 2: The matrix exponential is generally not a 1-to-1 mapping (it is for $\R^1$); it is generally many-to-1 [example]. See the detailed case breakdown for $\R^{3 \times 3}$ in the axis-angle mapping section below.
Fact 3: If $\mX$ is skew-symmetric, then $\exp{(\mX)}$ is an orthogonal matrix, i.e. its transpose is its inverse: $\exp{(\mX)}^T = \exp{(\mX^T)} = \exp{(-\mX)} = \exp{(\mX)}^{-1}$. Furthermore, its determinant $\text{det}(\exp(\mX)) = \exp{\tr{\mX}} = \exp(0) = 1$ by Jacobi's identity [proof]. Thus $\exp(\mX)$ produces special orthogonal matrices in $SO(n)$.
We also claim that every special orthogonal matrix $\mR$ can be generated by some skew-symmetric matrix $\mX$; the mapping is surjective [missing proof].
Fact 4: If $(\lambda,\vv)$ is an eigenvalue/eigenvector pair of $\mX$, then $\exp(\mX) \vv = \sum_{k=0}^\infty \frac{\lambda^k}{k!} \vv$, so $(e^\lambda, \vv)$ is an eigenpair of $\exp(\mX)$.
Fact 5: $\exp{(\mB \mA \mB^{-1})} \mycolor{gray}{ = \mI + \mB\mA\mB^{-1} + \frac{(\mB \mA \mB^{-1})^2}{2!} + \dots = \mB \mI \mB^{-1} + \mB\mA\mB^{-1} + \mB\frac{\mA^2}{2!}\mB^{-1} + \dots } = \mB \exp{(\mA)} \mB^{-1}$.
Fact 6: $\partial_t \exp({t\mA}) \mycolor{gray}{ = \partial_t(\mI + t\mA + \frac{t^2}{2!}\mA^2 + \dots ) = \mA + t\mA^2 + \frac{t^2}{2!} \mA^3 + \dots } = \mA \exp{(t\mA)} = \exp{(t\mA)} \mA$. Let $\vy(t) = \exp({t\mA})\vx$. Then $\dot \vy(t) \mycolor{gray}{ = \mA \exp{(t\mA)} \vx } = \mA \vy(t)$.
We are therefore able to come up with guesses when encountering ODEs of the form
\begin{align} \dot \mB = \mA \mB \text{ or } \mB \mA &~\Longrightarrow~ \mB(t) = \exp(t\mA) \nonumber \\ \dot \vy = \mA \vy &~\Longrightarrow~ \vy(t) = \exp({t\mA}) \vx \quad \text{for some initial $\vx$}.\end{align}
These are less useful for rotations, but central in modeling probability distributions of Markov chains.
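Facts 3 and 6 are easy to check numerically. A sketch assuming `scipy` is available (`scipy.linalg.expm` computes the series of eq. 2):

```python
import numpy as np
from scipy.linalg import expm

# Fact 3: exp of a skew-symmetric matrix is special orthogonal.
X = np.array([[ 0.0, -0.3,  0.2],
              [ 0.3,  0.0, -0.1],
              [-0.2,  0.1,  0.0]])
R = expm(X)
assert np.allclose(R.T @ R, np.eye(3))
assert np.isclose(np.linalg.det(R), 1.0)

# Fact 6 / the ODE guess: y(t) = exp(tA) x solves y' = A y.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
x = np.array([1.0, 0.0])
y = lambda t: expm(t * A) @ x
t, h = 0.7, 1e-6
assert np.allclose((y(t + h) - y(t - h)) / (2 * h), A @ y(t), atol=1e-5)
```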

3. Rotation Exponential Map

3.1. axis-angle map: 4-number, constrained.

In 3D, given $(\theta \in \R, \vu \in \R^3)$, with $\|\vu\| = 1$ and $\theta$ without range restriction, $\theta\skewm{\vu}$ is skew-symmetric, and by Fact 3
\begin{align} \exp{(\theta \skewm{\vu})} = \sum_k \frac{\theta^k}{k!}\skewm{\vu}^k\end{align}
is a spatial rotation matrix in $SO(3)$.
Using the constraint $\| \vu \| = 1$, $\skewm{\vu}^2 = \vu \vu^T - \mI$,
\begin{align} \skewm{\vu}^0 = \mI \quad \skewm{\vu}^1 = \skewm{\vu} \quad \skewm{\vu}^2 = \vu \vu^T - \mI \quad \skewm{\vu}^3 = -\skewm{\vu} \quad \skewm{\vu}^4 = - \skewm{\vu}^2 \dots\end{align}
and thus
\begin{align} \exp{(\theta \skewm{\vu})} &= \mI + \skewm{\vu}\left(\theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \dots\right) + \skewm{\vu}^2\left(\frac{\theta^2}{2!} - \frac{\theta^4}{4!} + \frac{\theta^6}{6!} - \dots\right) \nonumber \\ &= \boxed{ \mI + \sin\theta \skewm{\vu} + (1 - \cos\theta) \skewm{\vu}^2 \quad \text{Rodrigues formula} } \label{eq:rod}\end{align}
produces a rotation matrix. Furthermore, the action preserves vectors along the axis $\vu$,
\begin{align} \exp{(\theta \skewm{\vu})} (s\vu) = s\mI \vu + s \cdot \sin\theta \cancel{\skewm{\vu} \vu} + s \cdot (1 - \cos\theta) \cancel{\skewm{\vu}^2 \vu} = s\vu .\end{align}
For a unit-length vector $\vx$ orthogonal to $\vu$, $\skewm{\vu}\vx = \vu \times \vx$ provides the third orthogonal vector $\vy$; $[\vu, \vx, \vy]$ forms a right-hand oriented basis.
\begin{align} \exp{(\theta \skewm{\vu})} \, (t\vx) = \mI (t\vx) + \sin\theta \cdot t \cdot \underbrace{(\vu \times \vx)}_{\vy} + (1 - \cos\theta) \cdot t \cdot \underbrace{ (\vu \times (\vu \times \vx)) }_{-\vx}\end{align}
rotates $t\vx$ in the $\vx, \vy$ plane, which is orthogonal to $\vu$ and right-handed, from $\vx$ toward $\vy$ by angle $\theta$. The same conclusion can be reached by a spectral analysis of the complex eigenvalues. The action on any vector can be decomposed into the action on some $s\vu + t\vx$.
Therefore it is appropriate to name $(\theta, \vu)$ the axis-angle parameterization of a rotation matrix.
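The Rodrigues formula (eq. 6) in code, checked against the full matrix exponential (a numpy sketch assuming `scipy` for `expm`; `rodrigues` and `skew` are our illustrative names):

```python
import numpy as np
from scipy.linalg import expm

def skew(u):
    x, y, z = u
    return np.array([[0, -z, y], [z, 0, -x], [-y, x, 0]], dtype=float)

def rodrigues(theta, u):
    """R = I + sin(theta) [u]_x + (1 - cos(theta)) [u]_x^2, with ||u|| = 1."""
    K = skew(u)
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

theta = 1.2
u = np.array([1.0, 2.0, 2.0]) / 3.0            # unit axis
R = rodrigues(theta, u)
assert np.allclose(R, expm(theta * skew(u)))   # closed form matches the series
assert np.allclose(R @ u, u)                   # the axis is preserved
```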
This mapping is not injective. To produce the same rotation matrix, for some integer $k$, one could have
  1. $(\theta, \vu)$ and $(\theta + k \cdot 2 \pi, \vu)$, e.g. $k=-1$: $(-(2\pi - \theta), \vu)$ rotates through the opposite turn.
  2. $(\theta, \vu)$ and $(-\theta + k \cdot 2 \pi, -\vu)$, e.g. $k=1$: $(2\pi - \theta, -\vu)$ keeps the rotation angle under $\pi$.
  3. At $\theta = k \cdot 2\pi$, any $\vu$ maps to the identity matrix.
See the corresponding quaternion cases below for contrast.

3.2. exponential map: 3-number, unconstrained

Sometimes the terms axis-angle map and exponential map are used interchangeably. We make the distinction more explicit. The axis-angle map takes 4 numbers $(\theta, \vu)$ to a rotation matrix via the matrix exponential; $\|\vu\|=1$ simplifies the infinite series into the concise Rodrigues formula.
Given 3 unconstrained numbers $\vh \in \R^3$, $\vh \neq \bm{0}$, the axis-angle form is $(\|\vh\|, \frac{\vh}{\|\vh\|})$, and
\begin{align} \exp{(\skewm{\vh})} &= \exp \left( \|\vh\| \frac{ \skewm{\vh} }{\|\vh\|} \right) = \mI + \frac{ \sin \|\vh\|}{\|\vh\|} \skewm{\vh} + \frac{(1 - \cos \|\vh\|)}{\|\vh\|^2} \skewm{\vh}^2 . \label{eq:rod_h}\end{align}
For $\vh = \bm{0}$, $\exp{(\skewm{\vh})} = \mI$ by the series in eq. 2. For $\|\vh\|$ small and rotation close to $\mI$, it is numerically stabler to use
\begin{align*} \frac{\sin x}{x} = 1 - \frac{x^2}{6} + \frac{x^4}{120} + O(x^6), \quad \frac{1 - \cos x}{x^2} = \frac{1}{2} - \frac{x^2}{24} + \frac{x^4}{720} + O(x^6),\end{align*}
because numerical software is unaware of the interaction between the trig terms and $1/x$ when the two parts are evaluated separately. To look up series like these, use Wolfram Alpha.
Eq. 9 bijectively identifies $\underset{\vh \neq \bm{0}}{\vh} \longleftrightarrow (\underset{\mycolor{teal}{\theta > 0}}{\theta}, \vu)$. At $\vh = \bm{0}$, we stipulate $\vh \longleftrightarrow (0, \bm{0})$.
Due to this 1-to-1 correspondence, statements about $\vh \in \R^3$ can be made using $(\underset{\theta \ge 0}{\theta}, \vu)$ as a proxy. To put it differently, the exponential map is just the axis-angle map restricted to $\theta \ge 0$. For example, $\exp(\skewm{\vh}) \neq \exp([\widehat{-\vh}])$: their proxies $(\theta, \vu)$, $(\theta, -\vu)$ share the same angle but flipped axes (not opposite angles about the same axis).
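A numerically robust exponential map following the expansions above (a numpy sketch; the function name `exp_map` and the `eps` threshold are illustrative choices):

```python
import numpy as np

def skew(h):
    x, y, z = h
    return np.array([[0, -z, y], [z, 0, -x], [-y, x, 0]], dtype=float)

def exp_map(h, eps=1e-4):
    """exp([h]_x) with Taylor fallbacks for sin(x)/x and (1 - cos x)/x^2 near 0."""
    theta2 = h.dot(h)
    if theta2 < eps ** 2:
        a = 1.0 - theta2 / 6.0           # sin(x)/x   ~ 1 - x^2/6
        b = 0.5 - theta2 / 24.0          # (1-cos)/x^2 ~ 1/2 - x^2/24
    else:
        theta = np.sqrt(theta2)
        a = np.sin(theta) / theta
        b = (1.0 - np.cos(theta)) / theta2
    K = skew(h)
    return np.eye(3) + a * K + b * (K @ K)

assert np.allclose(exp_map(np.zeros(3)), np.eye(3))
R = exp_map(np.array([1e-9, -2e-9, 1e-9]))     # tiny rotation, no 0/0 issues
assert np.allclose(R.T @ R, np.eye(3))
```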
[Invertibility of the exponential map] The exponential map $\exp(\skewm{\vh})$ is injective, and thus invertible, when the proxy angle $(\theta := \|\vh\|, \vu)$ is in the range $0 \le \theta < \pi$. The inverse is called the logarithm map.
Proof. Earlier we listed the three types of ambiguous cases for the axis-angle mapping. $\exp(\skewm{\vh}) = \mI \Rightarrow \vh = \bm{0}$. For $\theta := \|\vh\| \in [0, \pi)$, the most important case $(2\pi - \theta, -\vu)$ is the same rotation, but its angle $2\pi - \theta > \pi$ violates the range; the other cases fall outside the range too. Within $[0, \pi)$ the mapping is therefore 1-to-1.
Some authors do not make a distinction between the axis-angle map and the exponential map; we emphasize it at the cost of verbosity. An advantage of the exponential map is that it allows unconstrained optimization of rotations via $\vh$. Optimizing axis-angle $(\theta, \vu)$ requires care to maintain the norm-1 constraint on $\vu$.

3.3. Log Map, Matrix $\rightarrow$ Axis-Angle, and inversion in general

The log map inverts a rotation matrix $\mR$ to $\vh$ such that $\exp(\skewm{\vh}) = \mR$. Again, inverting to $\vh$ is 1-to-1 identified with inverting to some $(\underset{\theta \ge 0}{\theta}, \vu)$, so here we consider the general problem of inverting the axis-angle mapping.
Every rotation matrix can be produced by some axis-angle $(\theta, \vu)$ s.t. $\theta \in [0, \pi]$, because if $\theta > \pi$, $(2\pi - \theta, -\vu)$ produces the same rotation.
Knowledge of $\cos \theta$ (or $\tan \theta$) is sufficient to determine the rotation angle $\theta$, since $\arccos$ (or $\arctan$) is unambiguous on $[0, \pi]$. On the contrary, $\sin \theta$ alone is insufficient to determine $\theta$. It is no coincidence that both the rotation matrix and the quaternion provide unambiguous ways to get $\cos \theta$ (for the quaternion, $\cos \theta = 2 w^2 - 1$), while $\sin \theta = \pm \sqrt{1 - \cos^2 \theta}$ is only decidable up to sign, with $+$ chosen to keep $\theta$ in the range $[0, \pi]$.
In the case of the rotation matrix,
  1. $\tr{\mR} = 3 + 0 + (1 - \cos\theta)\cdot(-2) = 1 + 2\cos\theta \Longrightarrow \cos\theta = \frac{\tr{\mR} - 1}{2}$, $\sin \theta = + \sqrt{1 - \cos^2\theta}$.
  2. $\mR - \mR^T = 2 \skewm{\vu} \sin \theta$, so $\vu = \frac{(\mR - \mR^T)^{\vee}}{2\sin\theta}$ if $\sin \theta \neq 0$. By deciding the sign of $\sin \theta$ to be $+$, we have also locked the sign of $\vu$. If $\tr{\mR}=-1$, then $\cos\theta=-1$, $\theta = \pi$, $\mR$ is symmetric, and $\sin\theta = 0$; we obtain the pairwise products of the entries of $\vu$ from $\mR = \mI + 2\skewm{\vu}^2 \Rightarrow \vu \vu^T = \frac{\mR + \mI}{2}$. With $\sin\theta=0$, the sign of $\vu$ is free to flip: set one of the non-zero elements to be positive, and get the rest from the pairwise products.
  3. $\mR + \mR^T = 2(\mI + (1 - \underbrace{\cos \theta}_{(\tr{\mR} - 1)/2}) \skewm{\vu}^2 ) \Longrightarrow \vu \vu^T = \frac{\mR + \mR^T + (1 - \tr{\mR})\mI}{3 - \tr{\mR}}$ if $\tr{\mR} \neq 3$ (the rotation axis at $\mI$ is undefined). This identity provides the pairwise products of $\vu$ at all $\theta > 0$, and we can compute $\vu$ up to a sign flip as described above. But make sure that the sign of $\vu$ is compatible with the $+$ sign of $\sin \theta$, as their product is locked by $\skewm{\vu} \sin \theta = \frac{\mR - \mR^T}{2}$. The need for this compatibility check makes this route less useful than item 2.
The alternative is rotation matrix $\rightarrow$ quaternion $\rightarrow$ axis-angle. The baseline approach for rotation matrix $\rightarrow$ quaternion is similar to what is outlined in this section.
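Items 1 and 2 above, as a numpy sketch (`log_map` is an illustrative name; the $\theta \approx \pi$ branch follows the symmetric-case recipe in item 2):

```python
import numpy as np

def log_map(R, eps=1e-7):
    """Invert R to (theta, u) with theta in [0, pi], following items 1 and 2."""
    cos_t = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_t)                 # arccos is unambiguous on [0, pi]
    if theta < eps:                          # R ~ I: axis undefined, pick zero
        return 0.0, np.zeros(3)
    if np.pi - theta < eps:                  # theta ~ pi: symmetric-R branch
        uuT = (R + np.eye(3)) / 2.0
        d = np.sqrt(np.clip(np.diag(uuT), 0.0, None))
        k = np.argmax(d)                     # fix the sign via the largest entry
        u = uuT[k] / d[k]
        return np.pi, u / np.linalg.norm(u)
    w = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]])
    return theta, w / (2.0 * np.sin(theta))  # (R - R^T)^vee / (2 sin theta)

# round trip against the Rodrigues formula
def _rod(theta, u):
    K = np.array([[0, -u[2], u[1]], [u[2], 0, -u[0]], [-u[1], u[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

t, u = log_map(_rod(1.0, np.array([0.0, 0.6, 0.8])))
assert np.isclose(t, 1.0) and np.allclose(u, [0.0, 0.6, 0.8])
```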

4. A Historical Note on Quaternions

We discuss an easier and more historically faithful way to motivate the use of quaternions for rotations.
Quaternions are often introduced as hypercomplex numbers. The problem with this presentation is that it is difficult to intuit a priori why the conjugation $\mR(\vq) \vx = \vq \vx \vq^*$ should perform a rotation acting on $\vx$, while the product $\vq_3 = \vq_2 \vq_1$ implements rotation composition. The formulae can be algebraically verified post hoc, but their origins are puzzling. Why not use $\vx' = \vq \vx$ to rotate vectors? Hamilton himself seemed to have struggled with this exact question, even after Cayley's article came to his attention.
It turns out that historically Rodrigues first developed the use of a 3- or 4-number tuple as a rational representation of rotations without trigonometric functions. The concept of a matrix was not yet prevalent, but his proposed formulae in effect 1) convert the 3/4 numbers to a rotation matrix, and 2) compose successive rotations into a new 3/4-number tuple.
Arthur Cayley studied [#6, 1843] Rodrigues' results, and noticed [#20, 1845] that quaternion conjugation coincides with Rodrigues' formulae. His familiarity with Rodrigues' prior work made the discovery much more plausible, and he went on to extend quaternions in several directions, one of which is the representation of 4D rotations [#137].

5. Absorbing ambiguities with intermediary: a simple 5D representation

The spirit is to capture intermediate variables in the Rodrigues mapping of eq. 6.
One source of ambiguity in axis-angle is the angle. We over-parameterize θ\theta to absorb its ambiguities.
Let $(x, y) := (\cos \theta, \sin \theta)$. The naming is consistent with $\cos\theta$ measuring the $x$-coordinate of a point on the unit circle.
The forward mapping becomes
\begin{align} \mR(x, y, \vu) = \mI + y \skewm{\vu} + (1 - x) \skewm{\vu}^2 .\end{align}
The ambiguities have been reduced. To map to the same rotation matrix,
  1. $(x, y, \vu)$ and $(x, -y, -\vu)$. This keeps the rotation angle under $\pi$.
  2. When $(x, y) = (1, 0)$, any $\vu$ maps to the identity matrix.
One way to motivate the quaternion is that it resolves case 2 by merging the sine and the axis $\vu$.

6. Quaternion, a 4D representation

Using the fact that $\sin \theta = 2 \sin\tfrac{\theta}{2} \cos\tfrac{\theta}{2}$ and $\cos \theta = 2 \cos^2 \tfrac{\theta}{2} - 1$ [derive with Euler's identity],
\begin{align} \exp(\theta \skewm{\vu} ) &= \mI + 2 \cos\frac{\theta}{2} \sin\frac{\theta}{2} \skewm{\vu} + (1 - 2\cos^2 \frac{\theta}{2} + 1) \skewm{\vu}^2 \nonumber \\ &= \mI + 2 \cos\frac{\theta}{2} \sin\frac{\theta}{2} \skewm{\vu} + 2 \left( \sin \frac{\theta}{2} \skewm{\vu} \right)^2.\end{align}
We capture the intermediary by letting $\boxed{ (w, \vv) := ( \cos \tfrac{\theta}{2}, \sin \tfrac{\theta}{2} \vu )}$. $(w, \vv)$ is referred to as a quaternion. This 4D vector is naturally of norm 1.
[Reduced ambiguities] We have $(\theta, \vu) \underbrace{\Longrightarrow}_{\text{many-to-1}} (w, \vv) \underbrace{\Longrightarrow}_{\text{2-to-1}} \mR$. The intermediary $(w, \vv)$ absorbs and consolidates the many-to-1 exponential map of $(\theta, \vu) \Rightarrow \mR$ into a 2-to-1 mapping through $(w, \vv) := (\cos \frac{\theta}{2}, \sin \frac{\theta}{2} \vu )$. $(w, \vv)$ and $(-w, -\vv)$ map to the same rotation, but this 2-to-1 mapping does not correspond to simply flipping the axis-angle: flipping the signs of $(\theta, \vu)$ in $(\cos \frac{\theta}{2}, \sin \frac{\theta}{2} \vu )$ doesn't change its value. To change $(w, \vv)$ to $(-w, -\vv)$, the options are
  • $(\frac{\theta}{2}, \vu) \rightarrow (\frac{\theta + k \cdot 2\pi}{2}, \vu)$ for odd integer $k$, e.g. $k=-1$: $(\frac{-(2\pi - \theta)}{2}, \vu)$ rotates through the opposite turn.
  • $(\frac{\theta}{2}, \vu) \rightarrow (\frac{-\theta + k \cdot 2\pi}{2}, -\vu)$ for odd integer $k$, e.g. $k=1$: $(\frac{2\pi - \theta}{2}, -\vu)$ keeps the rotation angle under $\pi$.
To stay at the same $(w, \vv)$, the options are
  • $(\frac{\theta}{2}, \vu) \rightarrow (\frac{\theta + k \cdot 2 \pi}{2}, \vu)$ for some even integer $k$, e.g. $k=0$, the identity.
  • $(\frac{\theta}{2}, \vu) \rightarrow (\frac{-\theta + k \cdot 2\pi}{2}, -\vu)$ for some even integer $k$, e.g. $k=0$: $(-\frac{\theta}{2}, -\vu)$, the flipped axis-angle.
In this sense, the 2-to-1 ambiguity of the quaternion is the result of grouping the many-to-1 ambiguities of axis-angle into "even" and "odd" sides, because all four axis-angle configurations, with $k$ even or odd, map to the same rotation matrix.

6.1. Quaternion $\rightarrow$ Rotation Matrix

The forward mapping starting from the intermediary $(w, \vv)$ is
\begin{align} \mR(w, \vv) &= \mI + 2 w \cdot \skewm{\vv} + 2\skewm{\vv}^2 \\ &= \mI + 2 w \cdot \skewm{\vv} + 2 (\vv \vv^T - \vv^T \vv \mI) \nonumber \\ &= \left(1 - 2(v_1^2 + v_2^2 + v_3^2) \right) \mI + 2(\vv \vv^T + w \cdot \skewm{\vv}) \nonumber \\ &= \underbrace{ \left(1 - 2(v_1^2 + v_2^2 + v_3^2) \right) \mI + 2 \begin{bmatrix} v_1^2 & 0 & 0 \\ 0 & v_2^2 & 0 \\ 0 & 0 & v_3^2 \end{bmatrix} }_{\text{diagonal entries}} + \underbrace{ 2 \begin{bmatrix} 0 & v_1 v_2 - v_3 w & v_1 v_3 + v_2 w \\ v_2 v_1 + v_3 w & 0 & v_2 v_3 - v_1 w \\ v_3 v_1 - v_2 w & v_3 v_2 + v_1 w & 0 \end{bmatrix} }_{\text{off-diagonal entries}} . \label{eq:quat2mat}\end{align}
The off-diagonal entries in eq. 13 are unambiguous. For the diagonal entries, using the constraint $w^2 + v_1^2 + v_2^2 + v_3^2 = 1$, there are three different ways to write them in the literature and various codebases. WLOG consider the 2nd diagonal entry, i.e. $\mR_{11}$ with 0-based indexing,
\begin{align} \mR_{11} = 1 - 2(v_1^2 + v_2^2 + v_3^2) + 2v_2^2 &= 1 - 2v_1^2 - 2 v_3^2 \label{eq:quatm_v1} \\ &= 1 - 2(1 - w^2) + 2v_2^2 = 2 w^2 + 2v_2^2 - 1 \label{eq:quatm_v2} \\ &= (w^2 + v_1^2 + v_2^2 + v_3^2) - 2v_1^2 - 2 v_3^2 = w^2 - v_1^2 + v_2^2 - v_3^2 . \label{eq:quatm_v3}\end{align}
Two remarks:
  1. Version 3 / eq. 16 provides the critical insight for the (parallelizable) inverse mapping from rotation matrix to quaternion. Version 1 / eq. 14 is, for example, used in Gaussian splatting. A few other papers use version 2.
  2. These different diagonal parameterizations result in different gradients when backpropagating from $\mR(w, \vv)$ to $[w, \vv]$. But after projecting the gradient (w.r.t. the norm-1 constraint, i.e. to be orthogonal to $[w, \vv]$), the gradients become the same [missing proof].
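Eq. 13 in code (a numpy sketch with the illustrative name `quat_to_mat`), together with the sign ambiguity and the eq. 15 diagonal identity:

```python
import numpy as np

def quat_to_mat(w, v):
    """R = I + 2 w [v]_x + 2 [v]_x^2 for a unit quaternion (w, v)."""
    v1, v2, v3 = v
    K = np.array([[0, -v3, v2], [v3, 0, -v1], [-v2, v1, 0]], dtype=float)
    return np.eye(3) + 2 * w * K + 2 * (K @ K)

theta = 0.9
u = np.array([2.0, -1.0, 2.0]) / 3.0                     # unit axis
w, v = np.cos(theta / 2), np.sin(theta / 2) * u
R = quat_to_mat(w, v)
assert np.allclose(R @ u, u)                             # axis is preserved
assert np.allclose(R, quat_to_mat(-w, -v))               # (w,v), (-w,-v): same R
assert np.isclose(R[1, 1], 2 * w**2 + 2 * v[1]**2 - 1)   # eq. 15 variant
```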

6.2. Rotation Matrix $\rightarrow$ Quaternion

The problem with the inversion step is always "what happens if one of them is $0$".
[Baseline] We first present a baseline method that involves explicit if-else branching. This is not parallel-friendly and also numerically not ideal, because small input perturbations could execute a different code branch.
\begin{align} \tr{\mR} = \mycolor{lightgray}{3 + 0 + 2 \cdot (-2 \vv^T \vv) =} 3 - 4 \| \vv \|^2 &~\Longrightarrow~ \| \vv \|^2 = \frac{3 - \tr{\mR}}{4}, ~ w^2 = \mycolor{lightgray}{ 1 - \| \vv \|^2 = } \frac{1 + \tr{\mR}}{4} . \\ \mR - \mR^T = 4w \cdot \skewm{\vv} &, \qquad \mR + \mR^T = \mycolor{lightgray}{ 2\mI + 4\skewm{\vv}^2 } = 4 \vv \vv^T + (2 - 4\|\vv\|^2)\mI \label{eq:quatm_offdiag} .\end{align}
If $\tr{\mR} > -1$, then we set $w = + \sqrt{\frac{1 + \tr{\mR}}{4}}$ and $\vv = \frac{(\mR - \mR^T)^{\vee}}{4w}$. Note that when $\mR = \mI$: $\tr{\mR} = 3$, $w=1$, and $\vv = \frac{\bm{0}}{4}$ is correct.
If $\tr{\mR} = -1$, then $w=0$, $\mR = \mR^T$, $\| \vv \| = 1$, and $\vv \vv^T = \tfrac{\mR + \mI}{2}$, from which we read off $v_1^2, v_2^2, v_3^2, v_1v_2, v_1v_3, v_2v_3$. At most two of $v_1, v_2, v_3$ can be $0$. Fix a non-zero one to be positive, and obtain the other two from the pairwise products.
Overall this routine uses a significant amount of conditional branching.
[Better method] Using the diagonal parameterization in eq. 16, we have
\begin{align} \underbrace{ \begin{bmatrix*}[r] 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix*} }_{\mL} \begin{bmatrix} w^2 \\ v_1^2 \\ v_2^2 \\ v_3^2 \end{bmatrix} = \begin{bmatrix} 1 \\ \mR_{00} \\ \mR_{11} \\ \mR_{22} \end{bmatrix} \quad \Longrightarrow \quad \begin{bmatrix} w^2 \\ v_1^2 \\ v_2^2 \\ v_3^2 \end{bmatrix} = \frac{1}{4} \mL \begin{bmatrix} 1 \\ \mR_{00} \\ \mR_{11} \\ \mR_{22} \end{bmatrix} .\end{align}
The first row of $\mL$ comes from the norm-1 constraint; we supply it so that the linear system is solvable. Interestingly, $\mL^{-1} = \frac{1}{4}\mL$, which makes $\frac{1}{2}\mL$ an involutory, symmetric, orthogonal matrix.
Now that we have the squares of each number, it remains to determine the signs. Using eq. 18, we can use the off-diagonal entries to obtain the pairwise products
\begin{align} w v_1 = \frac{\mR_{21} - \mR_{12}}{4},& \quad w v_2 = \frac{\mR_{02} - \mR_{20}}{4}, \quad w v_3 = \frac{\mR_{10} - \mR_{01}}{4} , \nonumber \\ v_1 v_2 = \frac{\mR_{01} + \mR_{10}}{4},& \quad v_1 v_3 = \frac{\mR_{02} + \mR_{20}}{4}, \quad v_2 v_3 = \frac{\mR_{12} + \mR_{21}}{4} ,\end{align}
and finally
\begin{align} \myscalebox{1.15}{ [w, v_1, v_2, v_3] = \frac{[w^2, wv_1, w v_2, w v_3]}{+\sqrt{w^2}} = \frac{[wv_1, v_1^2, v_1v_2, v_1v_3]}{+\sqrt{v_1^2}} = \frac{[wv_2, v_1v_2, v_2^2, v_2v_3]}{+\sqrt{v_2^2}} = \frac{[wv_3, v_1v_3, v_2v_3, v_3^2]}{+\sqrt{v_3^2}} . }\end{align}
At any time, at least one of $[w, v_1, v_2, v_3]$ is non-zero due to the norm-1 constraint, and thus a common strategy is to select the option with the largest denominator out of the 4.
A batch-parallel implementation can be found in PyTorch3d.
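The better method above, in a scalar (non-batched) numpy sketch; PyTorch3D is the reference batch-parallel implementation, and `mat_to_quat` is our illustrative name:

```python
import numpy as np

def mat_to_quat(R):
    """Branch-light R -> [w, v1, v2, v3] via squares and pairwise products."""
    L = np.array([[1, 1, 1, 1], [1, 1, -1, -1], [1, -1, 1, -1], [1, -1, -1, 1]])
    sq = 0.25 * L @ np.array([1.0, R[0, 0], R[1, 1], R[2, 2]])  # w^2, v1^2, ...
    sq = np.clip(sq, 0.0, None)
    # Pairwise products from the off-diagonal entries (eq. 18).
    wv = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]]) / 4
    vv = np.array([R[0, 1] + R[1, 0], R[0, 2] + R[2, 0], R[1, 2] + R[2, 1]]) / 4
    # The 4 candidate rows of eq. 20: divide by the largest square root.
    rows = np.array([
        [sq[0], wv[0], wv[1], wv[2]],
        [wv[0], sq[1], vv[0], vv[1]],
        [wv[1], vv[0], sq[2], vv[2]],
        [wv[2], vv[1], vv[2], sq[3]],
    ])
    k = np.argmax(sq)                        # largest denominator is safest
    return rows[k] / np.sqrt(sq[k])

# round trip: quaternion -> matrix (eq. 13) -> quaternion, up to global sign
w, v = 0.8, np.array([0.0, 0.36, 0.48])
K = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
R = np.eye(3) + 2 * w * K + 2 * (K @ K)
q = mat_to_quat(R)
assert np.allclose(q, [w, *v]) or np.allclose(q, [-w, *(-v)])
```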

6.3. Quaternion $\rightarrow$ Axis-Angle

[To 4 numbers $(\theta, \vu)$] Over the batch, set $w = \cos \frac{\theta}{2}$ non-negative so that $\theta$ is within $[0, \pi]$. Recall and contrast that for axis-angle, no restriction is placed on the value of $\cos \theta$ for $\theta$ to be in $[0, \pi]$.
\begin{align} \vq &= [\vq.w \ge 0] \cdot \vq \nonumber \\ \theta &= 2 \cdot \text{atan2}(\|\vv\|, w) \nonumber \\ \vu &= \vv / \| \vv \| \text{~~if~~} \theta > 0 \text{~~else~~} \vu = \bm{0} .\end{align}
[To 3 numbers $\vh \in \R^3$] If the goal is instead to invert directly to the unnormalized axis, then it is numerically stabler to use the small-angle approximation $\arctan(x) = x - \tfrac{1}{3}x^3 + \dots$:
\begin{align} \vq &= [\vq.w \ge 0] \cdot \vq \nonumber \\ \vh &= 2\left( \frac{\|\vv\|}{w} - \frac{1}{3} \left(\frac{\|\vv\|}{w} \right)^3 \right) \frac{\vv}{\|\vv\|} = \left( \frac{2}{w} - \frac{2}{3} \cdot \frac{\|\vv\|^2}{w^3} \right) \vv \text{~~if~~} \| \vv \| < \epsilon \nonumber \\ \vh &= 2 \cdot \text{atan2}(\|\vv\|, w) \cdot \frac{\vv}{\|\vv\|} \text{~~else} .\end{align}
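The two branches above in a scalar numpy sketch (`quat_to_h` is our illustrative name; LieTorch/Sophus are the reference implementations):

```python
import numpy as np

def quat_to_h(w, v, eps=1e-6):
    """Unit quaternion -> exponential coordinates h = theta * u, theta in [0, pi]."""
    if w < 0:                                # pick the representative with w >= 0
        w, v = -w, -v
    n = np.linalg.norm(v)
    if n < eps:                              # small angle: arctan(x) ~ x - x^3/3
        return (2.0 / w - 2.0 / 3.0 * n**2 / w**3) * v
    return 2.0 * np.arctan2(n, w) * v / n

# theta = 1 about z: h should come back as exactly [0, 0, 1]
h = quat_to_h(np.cos(0.5), np.sin(0.5) * np.array([0.0, 0.0, 1.0]))
assert np.allclose(h, [0.0, 0.0, 1.0])
assert np.allclose(quat_to_h(1.0, np.zeros(3)), 0.0)   # identity maps to 0
```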
See the implementation in LieTorch, which is itself based on Sophus. Moreover, this stackexchange post points out that we can use the half-angle formula of the tangent to implement atan2.

6.4. Gradient of Matrix w.r.t. Quaternion

\begin{align} \partial_{v_1} \underbrace{ \skewm{\vv}^2 }_{\vv \vv^T - \vv^T \vv \mI} = \begin{bmatrix} 0 & v_2 & v_3 \\ v_2 & -2v_1 & 0 \\ v_3 & 0 & -2v_1 \end{bmatrix} ~ \partial_{v_2} \skewm{\vv}^2 = \begin{bmatrix} -2v_2 & v_1 & 0 \\ v_1 & 0 & v_3 \\ 0 & v_3 & -2v_2 \end{bmatrix} ~ \partial_{v_3} \skewm{\vv}^2 = \begin{bmatrix} -2v_3 & 0 & v_1 \\ 0 & -2v_3 & v_2 \\ v_1 & v_2 & 0 \end{bmatrix} .\end{align}
Using the diagonal parameterization of eq. 14, $\partial_w \mR = 2 \skewm{\vv}$, and
\begin{align} \myscalebox{0.65}{ \partial_{v_1} \mR = 2\left( \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -w \\ 0 & w & 0 \end{bmatrix} + \begin{bmatrix} 0 & v_2 & v_3 \\ v_2 & -2v_1 & 0 \\ v_3 & 0 & -2v_1 \end{bmatrix} \right) } ~ \myscalebox{0.65}{ \partial_{v_2} \mR = 2\left( \begin{bmatrix} 0 & 0 & w \\ 0 & 0 & 0 \\ -w & 0 & 0 \end{bmatrix} + \begin{bmatrix} -2v_2 & v_1 & 0 \\ v_1 & 0 & v_3 \\ 0 & v_3 & -2v_2 \end{bmatrix} \right) } ~ \myscalebox{0.65}{ \partial_{v_3} \mR = 2\left( \begin{bmatrix} 0 & -w & 0 \\ w & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} + \begin{bmatrix} -2v_3 & 0 & v_1 \\ 0 & -2v_3 & v_2 \\ v_1 & v_2 & 0 \end{bmatrix} \right) } .\end{align}
Again the gradients differ depending on the diagonal parameterization used, but they become the same after projection.
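A finite-difference sanity check of $\partial_w \mR = 2 \skewm{\vv}$ (a numpy sketch; `quat_to_mat` is an illustrative helper implementing eq. 13, without renormalization, i.e. the unprojected gradient):

```python
import numpy as np

def quat_to_mat(w, v):
    """R = I + 2 w [v]_x + 2 [v]_x^2 (eq. 13), no normalization."""
    K = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]], dtype=float)
    return np.eye(3) + 2 * w * K + 2 * (K @ K)

w, v = 0.8, np.array([0.0, 0.36, 0.48])
K = np.array([[0, -v[2], v[1]], [v[2], 0, -v[0]], [-v[1], v[0], 0]])
h = 1e-6

# Central difference in w matches the closed form dR/dw = 2 [v]_x.
num = (quat_to_mat(w + h, v) - quat_to_mat(w - h, v)) / (2 * h)
assert np.allclose(num, 2 * K, atol=1e-5)
```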

7. Interpolation

It is often stated that the quaternion is the ideal representation for interpolating rotations. Its advantage is not obvious, because the behavior of quaternions is hard to intuit. This section attempts to establish some precise characterizations.

7.1. Axis-Angle Interpolation

Rotations are actions / groups, and an intuitive way to interpolate from action $\mR_1$ to $\mR_2$ by ratio $t$ is to gradually apply the connecting action:
\begin{align} \text{interp}(\mR_1 \rightarrow \mR_2, t) = \mR_1 (\mR_1^T\mR_2)^t \mycolor{lightgray}{= (\mR_2 \mR_1^T)^t \mR_1}.\end{align}
It satisfies the endpoints: $\text{interp}(\mR_1 \rightarrow \mR_2, 0) = \mR_1$ and $\text{interp}(\mR_1 \rightarrow \mR_2, 1) = \mR_2$.
For diagonalizable $\mA = \mX \bm{\Lambda} \mX^{-1}$ and $t \in \R$, the matrix power is defined as $\mA^t := \mX \bm{\Lambda}^t \mX^{-1}$. In particular, using Fact 5, $\exp(\mA)^t = \mycolor{lightgray}{ (\mX\exp(\bm{\Lambda})\mX^{-1})^t := \mX\exp(\bm{\Lambda})^t\mX^{-1} = \mX\exp(t\bm{\Lambda})\mX^{-1} = \exp(t\mX\bm{\Lambda}\mX^{-1}) } = \exp(t\mA)$. The natural way to implement this interpolation is to invert $\mR_1^T\mR_2$ to its axis-angle form and then compute $\mR_1 (\exp \theta \skewm{\vw})^t$.
[Ambiguity of $(\mR_1^T\mR_2)^t$, $t \in (0, 1)$] Just as the axis-angle form of a rotation is not unique, raising a matrix to a power $t$ is not unique either. Rotation matrices are unitarily diagonalizable, and in 3D $\bm{\Lambda}$ is some $\textbf{diag}[1, a+bi, a-bi]$ with $a^2 + b^2=1$. The problem is that for a matrix with complex eigenvalues, the non-integer power of a complex number $(a + bi)^t$ is not uniquely defined [details]. We need to first take the complex log and convert $a + bi$ to polar form $e^{i\theta}$, before raising it to $e^{it\theta}$. The complex logarithm is not unique (just as the axis-angle of a rotation is not): $e^{i\theta} = e^{i(\theta + k \cdot 2\pi)}$ for any integer $k$. When raised to the power $t$, however,
\begin{align} e^{it (\theta + k \cdot 2\pi)} = e^{it \theta} \cdot \underbrace{e^{it k \cdot 2\pi}}_{\neq 1 \text{ unless } t \text{ is an integer}} .\end{align}
Hence in general $( e^{i\theta} )^t \neq ( e^{i(\theta + k \cdot 2\pi)} )^t$ for $t \in (0, 1)$.
The situation manifests as the ambiguity of the log map from rotation matrix to axis-angle, $\mR_1^T\mR_2 = \exp \theta \skewm{\vw}$. We need to ensure $\theta \in [0, \pi]$, so that the interpolation takes place over the small arc,
\begin{align} \text{interp}(\mR_1 \rightarrow \mR_2, t) &= \mR_1 (\exp \theta \skewm{\vw})^t = \mR_1 \exp (t\theta \skewm{\vw}) \nonumber \\ &= \mR_1 (\mI + \sin t\theta \skewm{\vw} + (1 - \cos t\theta) \skewm{\vw}^2).\end{align}
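This interpolation can be sketched directly with a matrix log/exp (assuming `scipy`; `scipy.linalg.logm` computes the principal branch, which is exactly the small-arc choice, and `interp` is our illustrative name):

```python
import numpy as np
from scipy.linalg import expm, logm

def skew(u):
    x, y, z = u
    return np.array([[0, -z, y], [z, 0, -x], [-y, x, 0]], dtype=float)

def interp(R1, R2, t):
    """R1 (R1^T R2)^t via the principal matrix log (the small arc)."""
    return R1 @ expm(t * np.real(logm(R1.T @ R2)))

u = np.array([0.0, 0.0, 1.0])
R1, R2 = expm(0.2 * skew(u)), expm(1.0 * skew(u))
assert np.allclose(interp(R1, R2, 0.0), R1)                    # endpoint t = 0
assert np.allclose(interp(R1, R2, 1.0), R2)                    # endpoint t = 1
assert np.allclose(interp(R1, R2, 0.5), expm(0.6 * skew(u)))   # midpoint angle
```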
The overall goal of this section is to show that quaternion slerping is exactly equivalent to this formula.

7.2. Quaternion product

The quaternion product is
\begin{align} (w_2, \vv_2) (w_1, \vv_1) &= (w_2w_1 - \vv_2 \cdot \vv_1, \; w_2 \vv_1 + w_1 \vv_2 + \vv_2 \times \vv_1) \\ &= \left( w_2 \mI + \begin{bmatrix} 0 & -\vv_2^T \\ \vv_2 & \skewm{\vv_2} \end{bmatrix} \right) \begin{bmatrix}w_1 \\ \vv_1\end{bmatrix}. \label{eq:quat_prod_mat}\end{align}
This formula is not easily inventable without thinking about hypercomplex numbers. The matrix form of the quaternion product in eq. 30 is used in rigid body dynamics to write a linear ODE.
Rodrigues's derivation in 1840 uses the geometry of the spherical triangle formed by the 3 rotation axes. That picture is intuitive enough and clearly explains the critical role of half angles and bisecting lines; see Altmann, Rotations, Quaternions, and Double Groups, p. 159.
But working out the actual formula from the picture requires spherical trigonometry (not too bad) in addition to extensive, non-obvious algebraic steps. It seems improbable that one could naturally arrive at the quaternion product without knowing the final goal beforehand as a guide. Detecting the similarity between Rodrigues' rotation composition rule and the quaternion product seems possible only in hindsight.
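A minimal sketch of the product formula, scalar part first (`qmul` and `q_from_axis_angle` are our illustrative names): composing two rotations about the same axis should add the angles.

```python
import numpy as np

def qmul(q2, q1):
    """Quaternion product (w2, v2)(w1, v1), scalar part first."""
    w2, v2 = q2[0], np.asarray(q2[1:])
    w1, v1 = q1[0], np.asarray(q1[1:])
    return np.concatenate([[w2 * w1 - v2.dot(v1)],
                           w2 * v1 + w1 * v2 + np.cross(v2, v1)])

def q_from_axis_angle(theta, u):
    """(w, v) = (cos(theta/2), sin(theta/2) u)."""
    return np.concatenate([[np.cos(theta / 2)],
                           np.sin(theta / 2) * np.asarray(u)])

# Same axis: the half angles add, i.e. the rotation angles add.
z = [0.0, 0.0, 1.0]
q = qmul(q_from_axis_angle(0.4, z), q_from_axis_angle(0.3, z))
assert np.allclose(q, q_from_axis_angle(0.7, z))
```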

7.3. Slerp: spherical linear interpolation

The slerping routine is generally applicable to vectors. Besides rotation, it is most often used to interpolate Gaussian noise vectors that act as initiating latents of deep generative models.
To interpolate from u\vu to v\vv, the gist is to establish a basis in the plane spanned by {u,v}\{\vu, \vv\}, and we need 22 orthogonal axis vectors. The angle θ\theta between u,v\vu, \vv is dependent on the choice of basis. The ordering of the basis vectors may change θ\theta to θ-\theta or 2πθ2\pi - \theta. What is invariant is cosθ=cos(θ)=cos(2πθ)\cos \theta = \cos (-\theta) = \cos(2\pi - \theta) . Assuming u,v\vu, \vv are of unit norm, cosθ=uv\cos\theta = \vu \cdot \vv, and again dot product is invariant to change of basis.
Since the interpolation starts at u\vu, we let it be one of the axis for simpler algebra. To find the other orthogonal axis w\vw, we need the precise phase, for which we let sinθ=+1cos2θ\sin \theta = + \sqrt{1 - \cos^2 \theta} in order that the angle θ[0,π]\theta \in [0, \pi] and the interpolation travels along the small arc.
v=ucosθ+wsinθ  w=v(vu)u+1(vu)2.\begin{align} \vv = \vu \cos \theta + \vw \sin \theta ~\Longrightarrow ~ \vw = \frac{\vv - (\vv \cdot \vu) \vu }{+\sqrt{1 - (\vv \cdot \vu)^2}} .\end{align}
To compute the interpolated vector with ratio t[0,1]t \in [0, 1],
slerp(uv,t)=ucostθ+wsintθ=ucostθ+vucosθsinθsintθ=u(sinθcostθcosθsintθ)+vsintθsinθ=u(sinθcos(tθ)+cosθsin(tθ))+vsintθsinθ=usin(1t)θ+vsintθsinθeqnlabel-eq:slerp.\begin{align} \text{slerp}(\vu \rightarrow \vv, t) &= \vu \cos t\theta + \vw \sin t\theta = \vu \cos t\theta + \frac{\vv - \vu \cos \theta }{\sin \theta} \cdot \sin t\theta \nonumber \\ &= \frac{\vu(\sin\theta \cos t\theta - \cos \theta \sin t\theta ) + \vv \sin t\theta }{\sin \theta} \nonumber \\ &= \frac{\vu(\sin\theta \cos (-t\theta) + \cos \theta \sin (-t\theta) ) + \vv \sin t\theta }{\sin \theta} \nonumber \\ &= \frac{\vu\sin(1 - t)\theta + \vv \sin t\theta }{\sin \theta} \label{eq:slerp} .\end{align}
The slerping formula in eq. 32 is symmetrical i.e. slerp(uv,t)=slerp(vu,1t)\text{slerp}(\vu \rightarrow \vv, t) = \text{slerp}(\vv \rightarrow \vu, 1- t).
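Eq. 32 translates directly into code. A minimal numpy sketch (the function name `slerp` and the small-angle guard are my own additions):

```python
import numpy as np

def slerp(u, v, t):
    """Spherical linear interpolation between unit vectors u and v (eq. 32)."""
    cos_theta = np.clip(np.dot(u, v), -1.0, 1.0)
    theta = np.arccos(cos_theta)   # theta in [0, pi]: travel along the small arc
    if np.isclose(theta, 0.0):
        return u                   # vectors (nearly) coincide; nothing to interpolate
    return (np.sin((1 - t) * theta) * u + np.sin(t * theta) * v) / np.sin(theta)

u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 1.0]) / np.sqrt(2)
# endpoints, the symmetry slerp(u->v, t) = slerp(v->u, 1-t), and unit norm
assert np.allclose(slerp(u, v, 0.0), u) and np.allclose(slerp(u, v, 1.0), v)
assert np.allclose(slerp(u, v, 0.3), slerp(v, u, 0.7))
assert np.isclose(np.linalg.norm(slerp(u, v, 0.6)), 1.0)
```

The interpolant stays on the unit sphere for all t, which linear interpolation followed by renormalization only approximates.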

7.4. Quaternion Slerp

First of all, R1(R1TR2)t\mR_1 (\mR_1^T\mR_2)^t directly corresponds to q1(q1q2)t\vq_1 (\vq_1^* \vq_2)^t, and we expand the connecting quaternion
q1q2=(w1,v1)(w2,v2)=(w1w2+v1v2cos(θ/2),w1v2w2v1v1×v2sin(θ/2)u12).\begin{align} \vq_1^* \vq_2 = (w_1, -\vv_1) (w_2, \vv_2) = (\underbrace{w_1w_2 + \vv_1 \cdot \vv_2}_{\cos (\theta/2)}, \underbrace{w_1 \vv_2 - w_2 \vv_1 - \vv_1 \times \vv_2}_{\sin (\theta/2) \, \vu_{1 \rightarrow 2}} ) .\end{align}
The axis-angle interpolation when written in quaternion form becomes
interp(R1R2,t)=R1(R1TR2)t=R1(I+sintθ[w^]+(1costθ)[w^]2)=q1(q1q2)t=q1(costθ2,sintθ2u12).\begin{align} \text{interp}(\mR_1 \rightarrow \mR_2, t) &= \mR_1 (\mR_1^T\mR_2)^t \nonumber \\ &= \mR_1 (\mI + \sin t\theta \skewm{\vw} + (1 - \cos t\theta) \skewm{\vw}^2) \nonumber \\ &= \vq_1 (\vq_1^* \vq_2)^t \nonumber \\ &= \vq_1 ( \cos t \frac{\theta}{2} ,\: \sin t \frac{\theta}{2} \, \vu_{1 \rightarrow 2}) .\end{align}
But this is using the quaternion product, not slerping. To slerp between q1,q2\vq_1, \vq_2 and show that it is equivalent to the quaternion product above, we let their geometric angle be Ω\Omega.
cosΩ=q1q2=w1w2+v1v2=(q1q2).w=cos(θ/2)u12=w1v2w2v1v1×v2sinΩ.\begin{align} \cos \Omega = \vq_1 \cdot \vq_2 &= \mycolor{gray}{ w_1 w_2 + \vv_1 \cdot \vv_2 = (\vq_1^*\vq_2).w =} \cos (\theta / 2) \\ \vu_{1 \rightarrow 2} &= \frac{w_1 \vv_2 - w_2 \vv_1 - \vv_1 \times \vv_2}{\sin \Omega}.\end{align}
The power interpolation, expressed using cosΩ=cos(θ/2)\cos \Omega = \cos (\theta / 2) and u12=w1v2w2v1v1×v2sinΩ\vu_{1 \rightarrow 2} = \frac{w_1 \vv_2 - w_2 \vv_1 - \vv_1 \times \vv_2}{\sin \Omega}, becomes
q1(q1q2)t=q1(costθ2,sintθ2u12)=(w1,v1)(costΩ,  sintΩu12)=(w1costΩsintΩv1u12, w1sintΩu12+costΩv1+sintΩv1×u12)=(w1costΩsintΩv1u12,  costΩv1+sintΩ(w1u12+v1×u12).\begin{align} \vq_1 (\vq_1^* \vq_2)^t &= \vq_1 ( \cos t \frac{\theta}{2} ,\: \sin t \frac{\theta}{2} \, \vu_{1 \rightarrow 2}) = ( w_1, \vv_1) ( \cos t \Omega ,\; \sin t \Omega \, \vu_{1 \rightarrow 2} ) \nonumber \\ &= ( w_1 \cos t \Omega - \sin t \Omega \, \vv_1 \cdot \vu_{1 \rightarrow 2} ,~ w_1 \sin t \Omega \, \vu_{1 \rightarrow 2} + \cos t \Omega \vv_1 + \sin t \Omega \, \vv_1 \times \vu_{1 \rightarrow 2} ) \nonumber \\ &= ( w_1 \cos t \Omega - \sin t \Omega \, \vv_1 \cdot \vu_{1 \rightarrow 2} ,\; \cos t \Omega \, \vv_1 + \sin t \Omega (w_1 \vu_{1 \rightarrow 2} + \vv_1 \times \vu_{1 \rightarrow 2} ) .\end{align}
First look at the scalar part of this quaternion,
sintΩv1u12=sintΩv1(w1v2w2v1v1×v2)sinΩ=sintΩw2v1v1w1v1v2sinΩ  where  v1v1=1w12=sintΩw2w1(w1w2+v1v2)sinΩ=sintΩw2w1cosΩsinΩ.\begin{align} - \sin t \Omega \, \vv_1 \cdot \vu_{1 \rightarrow 2} &= - \sin t \Omega \, \frac{\vv_1 \cdot (w_1 \vv_2 - w_2 \vv_1 - \vv_1 \times \vv_2)}{\sin \Omega} \nonumber \\ &= \sin t \Omega \, \frac{w_2 \vv_1 \cdot \vv_1 -w_1\vv_1 \cdot \vv_2 }{\sin \Omega} ~~\text{where}~~ \vv_1 \cdot \vv_1 = 1 - w_1^2 \nonumber \\ &= \sin t \Omega \, \frac{w_2 - w_1( w_1 w_2 + \vv_1 \cdot \vv_2) }{\sin \Omega} = \sin t \Omega \, \frac{w_2 - w_1 \cos \Omega }{\sin \Omega}.\end{align}
The scalar part of the quaternion is already very close to the form of the slerping formula. We now do the same for the vector component of the quaternion,
sintΩ(w1u12+v1×u12)=sintΩ(w1w1v2w2v1v1×v2sinΩ+v1×w1v2w2v1v1×v2sinΩ)=sintΩw1w1v2w1w2v1+v1×(v1×v2)sinΩ=sintΩw1w1v2w1w2v1+(v1Tv1)v2(v1Tv2)v1sinΩ=sintΩ(w1w1+v1Tv1)v2(w1w2+v1Tv2)v1sinΩ=sintΩv2v1cosΩsinΩ.\begin{align} & \sin t \Omega (w_1 \vu_{1 \rightarrow 2} + \vv_1 \times \vu_{1 \rightarrow 2} ) \nonumber \\ =& \sin t \Omega \left( w_1 \frac{w_1 \vv_2 - w_2 \vv_1 - \vv_1 \times \vv_2}{\sin \Omega} + \vv_1 \times \frac{w_1 \vv_2 - w_2 \vv_1 - \vv_1 \times \vv_2}{\sin \Omega} \right) \nonumber \\ =& \sin t \Omega \frac{ w_1 w_1 \vv_2 - w_1 w_2 \vv_1 + \vv_1 \times (\vv_1 \times \vv_2) }{\sin \Omega} \nonumber \\ =& \sin t \Omega \frac{ w_1 w_1 \vv_2 - w_1 w_2 \vv_1 + (\vv_1^T \vv_1) \vv_2 - (\vv_1^T \vv_2) \vv_1 }{\sin \Omega} \nonumber \\ =& \sin t \Omega \frac{ (w_1 w_1 + \vv_1^T \vv_1) \vv_2 - (w_1 w_2 + \vv_1^T \vv_2) \vv_1 }{\sin \Omega} = \sin t \Omega \frac{ \vv_2 - \vv_1\cos \Omega }{\sin \Omega} .\end{align}
The last step is to put the scalar and the vector part together,
q1(q1q2)t=(w1costΩsintΩv1u12,  costΩv1+sintΩ(w1u12+v1×u12)=(w1costΩsintΩw2w1cosΩsinΩ, v1costΩ+sintΩv2v1cosΩsinΩ)=q1costΩsintΩq2q1cosΩsinΩ=slerp(q1q2,t).\begin{align} \vq_1 (\vq_1^* \vq_2)^t &= ( w_1 \cos t \Omega - \sin t \Omega \, \vv_1 \cdot \vu_{1 \rightarrow 2} ,\; \cos t \Omega \, \vv_1 + \sin t \Omega (w_1 \vu_{1 \rightarrow 2} + \vv_1 \times \vu_{1 \rightarrow 2} ) \nonumber \\ &= \left( w_1 \cos t \Omega - \sin t \Omega \, \frac{w_2 - w_1 \cos \Omega }{\sin \Omega},~ \vv_1 \cos t \Omega + \sin t \Omega \frac{ \vv_2 - \vv_1\cos \Omega }{\sin \Omega} \right) \nonumber \\ &= \vq_1 \cos t \Omega - \sin t \Omega \, \frac{\vq_2 - \vq_1 \cos \Omega }{\sin \Omega} = \text{slerp}(\vq_1 \rightarrow \vq_2, t).\end{align}
We have established that
interp(R1R2,t)=R1(R1TR2)t=R1(I+sintθ[w^]+(1costθ)[w^]2)=q1(q1q2)t=q1(costθ2,sintθ2u12)=slerp(q1q2,t).\begin{align} \text{interp}(\mR_1 \rightarrow \mR_2, t) &= \mR_1 (\mR_1^T\mR_2)^t \nonumber \\ &= \mR_1 (\mI + \sin t\theta \skewm{\vw} + (1 - \cos t\theta) \skewm{\vw}^2) \nonumber \\ &= \vq_1 (\vq_1^* \vq_2)^t \nonumber \\ &= \vq_1 ( \cos t \frac{\theta}{2} ,\: \sin t \frac{\theta}{2} \, \vu_{1 \rightarrow 2}) \nonumber \\ &= \text{slerp}(\vq_1 \rightarrow \vq_2, t).\end{align}

Last updated on 2025-05-07. Design inspired by distill.