Lecture 02.2 – Rotations and Kinematic Chains
Lecture Slides: Rotations and Kinematic Chains
In this lecture, we explore different parameterizations of rotations and kinematic chains, which are core ingredients for modeling human motion. We’ll examine rotation matrices, Euler angles, quaternions, and twist-exponential maps, comparing their advantages and disadvantages. Then we’ll see how these parameterizations apply to articulated kinematic chains for controlling virtual humans.
When evaluating a parameterization for rotations or rigid body motion, we should consider:
Minimum number of parameters: This makes controlling characters easier, inference more efficient, and storage more lightweight.
Avoiding singularities: Degenerate cases where degrees of freedom are lost or multiple parameter combinations produce the same result should be avoided.
Ease of computing derivatives: How efficiently we can calculate changes in positions and orientations with respect to parameters is crucial for optimization.
Placing biomechanical constraints: The ability to incorporate natural movement constraints (e.g., an elbow rotating along a single axis).
Ease of motion concatenation: How easily we can accumulate motion over time, which is important for tracking and animation.
1. Representations of 3D Rotations
First, let’s understand what a rotation actually is. Informally, a rotation is a linear transformation that: - Preserves angles between vectors - Preserves distances (length of vectors) - Does not mirror the object (no reflection)
A key property to note is that rotations in 2D commute (order doesn’t matter), but rotations in 3D do not commute (order matters significantly).
3D rotations lie in the special orthogonal group \(SO(3)\), i.e., all \(3\times 3\) matrices \(R\) with \(R^T R = I\) and \(\det(R)=+1\). Because \(SO(3)\) is a non-linear manifold, we have multiple parameterizations:
A) Rotation Matrices
A rotation matrix \(R \in SO(3)\) is a \(3\times 3\) orthonormal matrix. Applying it to a vector \(v\) yields \(v' = R\,v\). Composing two rotations is matrix multiplication: \(R_2 R_1\), which is non-commutative in 3D.
Understanding Rotation Matrices: Consider a point \(P_B\) expressed in body frame coordinates with coordinates \((\lambda_x, \lambda_y, \lambda_z)\). To express this point in spatial (world) frame coordinates, we can write:
where \(X_{BS}\), \(Y_{BS}\), and \(Z_{BS}\) are the axes of the body frame expressed in spatial coordinates. This can be written in matrix form as:
where \(R_{SB}\) is the rotation matrix mapping from body frame to spatial frame. The columns of \(R_{SB}\) are simply the axes of the body frame expressed in spatial coordinates.
There are two ways to interpret rotations (though mathematically identical): 1. Coordinate transformation: Moving between coordinate frames that are fixed in space 2. Motion: Actually rotating a point (or object) in space
Important properties of rotation matrices:
The columns (or rows) form an orthonormal basis
Algorithm for enforcing orthogonality: When numerical errors accumulate, rotation matrices may drift from being perfectly orthogonal. A common correction method is:
Apply Gram-Schmidt orthogonalization to the columns
Use SVD: If \(R = USV^T\), set \(R' = UV^T\) (discarding singular values)
Pros: Directly used for transformations (OpenGL, DirectX). Often good for learning tasks due to its trigonometric nature.
Cons: Uses 9 numbers to represent 3 degrees of freedom. Requires maintaining orthonormality constraints during optimization, which is inefficient.
B) Euler Angles
A triple \((\alpha,\beta,\gamma)\) describing successive rotations about coordinate axes (e.g., yaw/pitch/roll). These parameterize a rotation as three consecutive planar rotations.
Basic 2D rotation matrices: - Around x-axis:
\[\begin{split}R_x(\theta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta \\ 0 & \sin\theta & \cos\theta \end{pmatrix}\end{split}\]
Around y-axis:
\[\begin{split}R_y(\theta) = \begin{pmatrix} \cos\theta & 0 & \sin\theta \\ 0 & 1 & 0 \\ -\sin\theta & 0 & \cos\theta \end{pmatrix}\end{split}\]Around z-axis:
\[\begin{split}R_z(\theta) = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}\end{split}\]
A complete Euler angle rotation might look like:
Important: When using Euler angles, two things must be specified:
Convention: Which axes are being rotated around (e.g., XYZ, ZYX, ZXZ)
Frame of reference: Whether rotations are happening in the static spatial frame or the moving body frame (extrinsic vs. intrinsic rotations)
Common Euler angle conventions:
ZYX (yaw-pitch-roll): Common in aerospace, robotics
XYZ: Often used in computer graphics
ZXZ: Used in physics and classical mechanics
Converting from rotation matrix to Euler angles (ZYX convention): Given a rotation matrix \(R\), the Euler angles can be extracted as:
where \(r_{ij}\) is the element at row \(i\), column \(j\) of matrix \(R\). For the gimbal lock case (\(\beta = \pm \pi/2\)), special handling is required as \(\alpha\) and \(\gamma\) are not uniquely determined (only their sum or difference matters).
Pros: Intuitive for small rotations. Three parameters match the degrees of freedom. Common in animation.
Cons: Suffers from gimbal lock—a configuration where one degree of freedom is lost. For example, with ZYX Euler angles, when \(\theta_y = \pi/2\), the rotation matrix becomes:
\[\begin{split}R = \begin{pmatrix} 0 & -\sin(\theta_x+\theta_z) & \cos(\theta_x+\theta_z) \\ 0 & -\cos(\theta_x+\theta_z) & -\sin(\theta_x+\theta_z) \\ -1 & 0 & 0 \end{pmatrix}\end{split}\]Notice that \(\theta_x\) and \(\theta_z\) only appear as a sum, so we’ve lost one degree of freedom—only their combined effect matters.
C) Quaternions
Quaternions are 4D numbers \(q = (w, x, y, z)\) with \(\|q\| = 1\) for rotation. To understand quaternions, it helps to first understand complex numbers as representing 2D rotations.
Complex numbers as 2D rotations: In a geometric interpretation, the imaginary unit \(i\) represents a quarter turn. Multiplying by \(i\) twice gives a half turn (\(i^2 = -1\)). Complex numbers can be expressed in polar form as \(re^{i\theta}\), where \(r\) is the magnitude and \(\theta\) is the angle.
When we multiply complex numbers, the magnitudes multiply and the angles add:
Quaternions as an extension to 3D: William Rowan Hamilton generalized complex numbers to 3D, discovering that you need 4 dimensions with the fundamental formula:
A quaternion can be written as \(q = q_w + q_x i + q_y j + q_z k\) or more compactly as \(q = (q_w, \vec{v})\) where \(\vec{v} = (q_x, q_y, q_z)\).
Multiplication rules for quaternion basis elements:
\(i^2 = j^2 = k^2 = -1\)
\(ij = k\), \(jk = i\), \(ki = j\)
\(ji = -k\), \(kj = -i\), \(ik = -j\)
For a rotation by angle \(\theta\) around a unit axis \(k = (k_x, k_y, k_z)\), the corresponding quaternion is:
Quaternion operations:
Conjugate: \(q^* = (q_w, -\vec{v})\)
Norm: \(\|q\| = \sqrt{q_w^2 + q_x^2 + q_y^2 + q_z^2}\)
Inverse: \(q^{-1} = q^*/\|q\|^2\) (for unit quaternions, \(q^{-1} = q^*\))
The quaternion product for two quaternions \(q_1 = (w_1, \vec{v}_1)\) and \(q_2 = (w_2, \vec{v}_2)\) is:
Derivation of quaternion product: The quaternion product can be derived by expanding using the definitions:
Expanding and applying the multiplication rules:
This can be rewritten more compactly using vector operations as shown above.
To rotate a vector \(v\) by a quaternion \(q\), we use:
Where \(v\) is treated as a pure quaternion \((0, v)\) with zero scalar part. The formula expands to:
This can also be written as a matrix-vector multiplication, where a quaternion \(q=(w,x,y,z)\) can be converted to a rotation matrix:
Quaternion SLERP (Spherical Linear Interpolation) is excellent for interpolating between orientations:
where \(t \in [0, 1]\) and \(q_0\) and \(q_1\) are the start and end orientations.
Computational formula for SLERP: If \(\cos\theta = q_0 \cdot q_1\) is the dot product between unit quaternions:
For small angles or numerical stability, a linear interpolation followed by normalization (NLERP) is often used as an approximation:
Converting from rotation matrix to quaternion: Given a rotation matrix \(R\):
When \(w\) is close to zero, alternative formulas must be used to avoid numerical instability.
Pros: No gimbal lock. Computationally efficient. Perfect for interpolation. Derivatives exist and are linearly independent.
Cons: Four numbers for 3 DOF. Requires unit norm constraint in optimization. Sign ambiguity \(q \leftrightarrow -q\).
2. Lie Algebra \(so(3)\) and Exponential Map
\(SO(3)\) is a Lie group—a continuously differentiable manifold with group properties. Its Lie algebra is \(so(3)\), the space of all \(3\times 3\) real skew-symmetric matrices:
Any \(A\in so(3)\) can be written:
Given a vector \(w = (w_1, w_2, w_3)\), we can define the skew-symmetric matrix:
This skew-symmetric matrix has the property that for any vector \(p\):
That is, the matrix represents the cross product operation.
Properties of skew-symmetric matrices: For \(\hat{\omega} = [\omega]_\times\) where \(\omega\) is a unit vector:
\(\hat{\omega}^2 = \omega\omega^T - I\)
\(\hat{\omega}^3 = -\hat{\omega}\)
\(\hat{\omega}^4 = -\hat{\omega}^2\)
\(\hat{\omega}^{2n} = (-1)^n\hat{\omega}^2\) for \(n \geq 1\)
\(\hat{\omega}^{2n+1} = (-1)^n\hat{\omega}\) for \(n \geq 1\)
These properties are crucial for deriving Rodrigues’ formula.
Exponential Map Exponentiating \(A\in so(3)\) yields a rotation matrix in \(SO(3)\):
If we have a vector \(w\) with \(\|w\| = \theta\), we get a rotation by angle \(\theta\) about axis \(w/\|w\|\).
Derivation of the exponential map: For \(A = [w]_\times\) where \(\|w\| = \theta\) and \(\hat{w} = w/\|w\|\) is a unit vector, we group the infinite series of the matrix exponential:
Using the properties of skew-symmetric matrices, we can separate even and odd powers:
With \(A = \theta[\hat{w}]_\times\), this becomes:
Using the properties above, and recognizing the series expansions of sine and cosine:
This is Rodrigues’ rotation formula.
Physical Interpretation of the Exponential Map: The exponential map comes from the differential equation for rotating a point \(P_s\) around an axis \(w\):
The solution to this differential equation is:
This is why we use the exponential to represent rotations—it integrates the angular velocity over time.
Logarithmic map (inverse of exponential map): The logarithm of a rotation matrix \(R\) gives the corresponding element in \(so(3)\):
Where:
\(\theta = \arccos((tr(R) - 1)/2)\) is the rotation angle
\(\hat{w}\) is the unit rotation axis extracted from:
\[[\hat{w}]_\times = \frac{R - R^T}{2\sin\theta}\]
For small rotations or when \(\theta \approx 0\) (when \(tr(R) \approx 3\)), special handling is required to avoid numerical issues.
3. Rodrigues’ Rotation Formula
Rodrigues’ formula provides a closed-form expression for the exponential map, avoiding the need to compute the infinite Taylor series.
Given a unit axis \(k=(k_x,k_y,k_z)\) and angle \(\theta\):
where \([k]_\times\) is the skew-symmetric matrix of \(k\).
Computing \([k]_\times^2\) explicitly: For \(k = (k_x, k_y, k_z)\) with \(\|k\| = 1\):
This identity simplifies the implementation of Rodrigues’ formula.
Vector Form: For any \(v\in\mathbb{R}^3\),
Derivation of vector form: Applying the matrix form to a vector \(v\):
Expanding using \([k]_\times v = k \times v\) and \([k]_\times^2 v = k(k \cdot v) - v\):
Simplifying:
This formula has a nice geometric interpretation: - The first term \(v\cos\theta\) scales the original vector. - The second term \((k \times v)\sin\theta\) adds a perpendicular component for rotation. - The third term \(k\,(k\cdot v)\,(1-\cos\theta)\) is a correction for the parallel component.
Extending to Translation with Twist Coordinates:
To represent both rotation and translation (a full rigid body motion), we can use twist coordinates:
where \(\omega\) is the rotation axis (scaled by angle) and \(v\) relates to translation. These six numbers can be arranged into a twist matrix:
Exponential of a twist: For twist \(\xi = (\omega, v)\) with \(\|\omega\| = \theta > 0\):
Where:
\(\exp([\omega]_\times)\) is the rotation matrix from Rodrigues’ formula
\(G = I\theta + (1-\cos\theta)[\omega]_\times + (\theta-\sin\theta)[\omega]_\times^2\)
For the special case when \(\omega = 0\) (pure translation):
The exponential of this matrix gives us the full rigid body motion matrix (a 4×4 matrix including translation).
Rodrigues’ Parameters: An alternative way to represent rotations is using Rodrigues’ parameters:
These provide a minimal 3-parameter representation but have a singularity at \(\theta = \pm\pi\).
Modified Rodrigues Parameters (MRPs): A variation that moves the singularity to \(\theta = \pm2\pi\):
MRPs are useful in attitude control and estimation.
4. Kinematic Chains: Forward & Inverse Kinematics
A kinematic chain models an articulated structure like the human skeleton as a sequence of rigid segments connected by joints.
Forward Kinematics (FK): Given joint angles, compute the position of end effectors (e.g., hands, feet):
where: - \(G_{SB}(\theta)\) is the transformation from body to spatial frame - \(G_{SB}(0)\) is the transformation in the rest pose - \(\xi_i\) are the twist coordinates for each joint - \(\theta_i\) are the joint angles
Product of Exponentials (PoE) derivation: For a serial chain with \(n\) joints where the \(i\)-th joint contributes a twist motion \(\xi_i\):
Define the “home configuration” \(M \in SE(3)\) as the end-effector pose when all joint variables are zero
Define a twist \(\xi_i\) for each joint, expressed in the space frame at the home configuration
For a given set of joint values \(\theta = (\theta_1, \ldots, \theta_n)\), the end-effector pose is:
This formula is exact and accounts for the full 3D kinematics.
Types of Joints and Their Twists:
Revolute Joint (1 DOF rotation): \(\xi = (\omega, -\omega \times q)\) where \(\omega\) is the unit axis of rotation and \(q\) is a point on the axis.
Prismatic Joint (1 DOF translation): \(\xi = (0, v)\) where \(v\) is the unit direction of translation.
Spherical Joint (3 DOF rotation): Represented as three revolute joints with orthogonal axes intersecting at a point.
Universal Joint (2 DOF): Represented as two revolute joints with orthogonal axes intersecting at a point.
Inverse Kinematics (IK): Given a desired end-effector pose, find the joint angles:
This is typically a non-linear problem solved using the Jacobian, which maps small changes in joint angles to small changes in end-effector position:
where \(\xi'_i\) are the twist coordinates transformed to the current configuration.
Computation of the Jacobian: For a serial chain with \(n\) joints, the spatial Jacobian columns are:
Where \(Ad\) is the adjoint transformation that maps twists between different coordinate frames. For revolute joints, this simplifies to:
Where \(\omega_i\) is the axis of the \(i\)-th joint after being transformed by previous joints, and \(p_{i-1,n}\) is the vector from joint \(i\) to the end-effector.
The Jacobian gives us a linearization:
This allows us to solve IK iteratively:
where \(J^+\) is the pseudoinverse of the Jacobian.
Pseudoinverse computation: For a Jacobian \(J\):
When \(J\) is singular or near-singular (at singularities), damped least squares provides numerical stability:
Where \(\lambda\) is a damping factor.
IK Algorithm (Damped Least Squares):
Initialize joint angles \(\theta^{(0)}\)
For iteration \(k = 0, 1, 2, ...\): a. Compute forward kinematics \(X(\theta^{(k)})\) b. Compute error: \(e = X_{desired} - X(\theta^{(k)})\) c. Compute Jacobian \(J(\theta^{(k)})\) d. Update: \(\theta^{(k+1)} = \theta^{(k)} + J^+(\theta^{(k)})e\) e. Check for convergence or maximum iterations
Handling Joint Limits: For practical applications, joints often have limits. A weighted damped least squares approach can be used:
Where: - \(W_x\) is a weighting matrix for task space - \(W_\theta\) is a weighting matrix for joint space - \(\theta_{mid}\) are the midpoints of joint ranges
The weights in \(W_\theta\) can be adjusted to increasingly penalize joints as they approach their limits:
Where \(w_0\) is a small base weight and \(w_s\) is a scaling factor.
Null Space Projection: For redundant manipulators (more DOFs than task constraints), secondary objectives can be optimized in the null space:
Where \(z\) is a vector pointing in the direction of optimization for secondary objectives (e.g., avoiding joint limits, obstacles, or singularities).
Singularity-Robust IK: Near singularities, the standard damped least squares approach can be enhanced with a variable damping factor:
Where \(\sigma_{min}\) is the smallest singular value of the Jacobian, \(\epsilon\) is a threshold, and \(\lambda_0\) is the maximum damping factor.
Practical Applications for Human Body Modeling:
In the SMPL model (a widely used parametric human body model), kinematic chains are used with twist coordinates to represent the articulated human skeleton. Each joint can be modeled as having: - A fixed axis of rotation with variable angle (e.g., hinge joints like elbows) - Multiple degrees of freedom (e.g., ball joints like shoulders)
SMPL model formulation: The SMPL model represents the human body as:
Where: - \(\beta\) are shape parameters - \(\theta\) are pose parameters (joint angles) - \(T_P\) is a template mesh deformed by pose and shape - \(J(\beta)\) are joint locations - \(W\) is a skinning function with weights \(\mathcal{W}\)
The joint rotations are parametrized using axis-angle representation, which can be converted to rotation matrices using Rodrigues’ formula.
Practical IK Implementations for Human Models: When applying IK to human models, special considerations include:
Biomechanical constraints: Human joints have physical limits (e.g., elbows don’t bend backwards)
Posture naturalness: Solutions that satisfy end-effector positions may still look unnatural
Joint coupling: Some human joints have coupled motion (e.g., finger joints)
These are often addressed by: - Using data-driven priors for natural poses - Adding energy terms to penalize unnatural configurations - Defining hierarchy of task priorities (e.g., feet placement > hand position > gaze direction)
Full-Body IK: For full-body human IK, the problem is often decomposed into:
Root positioning: Placing the hip/pelvis in global space
Lower-body IK: Solving for legs to maintain balance and foot placement
Upper-body IK: Solving for arms, spine, and head
A common approach is a multi-stage IK with prioritization: - Primary tasks: Balance constraints, foot placement - Secondary tasks: Hand positions, gaze direction - Tertiary tasks: Joint limit avoidance, pose naturalness
Comparison of Rotation Representations
Here is a summary comparison of the different rotation parameterizations:
Representation |
Advantages |
Disadvantages |
---|---|---|
Rotation Matrices |
|
|
Euler Angles |
|
|
Quaternions |
|
|
Twist/Axis-Angle |
|
|
Recommended Choices: - For optimization and machine learning: Twist coordinates or quaternions - For interpolation: Quaternions (SLERP) - For intuitive control: Euler angles or axis-angle - For direct transformation: Rotation matrices
Conversion Between Representations: In practice, it’s often necessary to convert between different representations:
Rotation Matrix to Euler Angles: - Extract angles using arctangent functions as described earlier - Handle special cases like gimbal lock
Rotation Matrix to Axis-Angle: - Extract angle: \(\theta = \arccos(\frac{\text{tr}(R) - 1}{2})\) - Extract axis: \(\hat{\omega} = \frac{1}{2\sin\theta}(r_{32} - r_{23}, r_{13} - r_{31}, r_{21} - r_{12})^T\)
Axis-Angle to Quaternion: - \(q = (\cos\frac{\theta}{2}, \hat{\omega}\sin\frac{\theta}{2})\)
Quaternion to Rotation Matrix: - Use the formula shown in the quaternion section
Each conversion may have numerical considerations, especially near singularities or special cases.