Coordinate transformations

To describe 3D space we will use the Cartesian coordinate system, in which each point is uniquely determined by its three coordinates (x, y, z). Other systems exist as well, such as cylindrical and spherical coordinates.

It is convenient to represent affine coordinate transformations as matrices. This is how major 3D APIs such as OpenGL represent them. To transform coordinates in 3D space we use a 4x4 matrix and a 4D vector extended with a fourth coordinate d = 1 (a homogeneous coordinate), i.e. v = (x, y, z, 1).

You can combine one transformation with another by multiplying their matrices. In this way you can build complex transformations out of three basic ones:

  1. translation
  2. scaling
  3. rotation

Transformation matrices work in 2D graphics as well. Just remove the third row and the third column from the 3D matrix to get a 3x3 matrix.

Java source code accompanies this article (the MatrixFloat and LinearAlgebra classes).

compute transformed coordinates

So how do we compute the transformed coordinates? Just multiply the transformation matrix by the vector that represents the point. The result is a vector with the transformed coordinates.

A vector can be written either as a column, i.e. a 4x1 matrix, or as a row, i.e. a 1x4 matrix.

The first case is called column-major order or notation, and the matrix is accordingly called a column-major matrix. (Strictly speaking this is the column-vector convention; "column-major" more commonly describes how matrix elements are laid out in memory.) To compute the transformed coordinates, the matrix is the first factor and the vector is the second factor.

v' = M * v =
|mx0 mx1 mx2 mx3|
|my0 my1 my2 my3|
|mz0 mz1 mz2 mz3|
|md0 md1 md2 md3|
*
|x|
|y|
|z|
|d|
=
{
  x' =  mx0*x + mx1*y + mx2*z + mx3*d
  y' =  my0*x + my1*y + my2*z + my3*d
  z' =  mz0*x + mz1*y + mz2*z + mz3*d  
  d' = ...	
}
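
The multiplication above can be sketched in a few lines of Python. This is only an illustration, not the article's Java classes: the helper name mat_vec and the sample matrix are assumptions made for the example, and a matrix is stored as a plain list of rows.

```python
def mat_vec(m, v):
    """Return M * v for a 4x4 matrix (list of rows) and a 4-component vector."""
    return [sum(m[row][k] * v[k] for k in range(4)) for row in range(4)]


# Hypothetical sample matrix: it moves every point by (10, 20, 30).
M = [
    [1, 0, 0, 10],
    [0, 1, 0, 20],
    [0, 0, 1, 30],
    [0, 0, 0, 1],
]
point = [1, 2, 3, 1]  # (x, y, z) extended with d = 1

print(mat_vec(M, point))  # [11, 22, 33, 1]
```

Each output component is the dot product of one matrix row with the vector, exactly as in the expanded formulas for x', y' and z'.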

The second case is called row-major order or notation, and the matrix is accordingly called a row-major matrix. To compute the transformed coordinates, the vector is the first factor and the matrix is the second factor. In other words, we multiply in the reverse order compared to column-major notation.

Use the transpose to convert a column-major matrix to a row-major matrix and vice versa.
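
A small sketch of that conversion, assuming matrices are stored as lists of rows. The helper names transpose, mat_vec and vec_mat are made up for the illustration. It checks that M * v in column notation gives the same numbers as v * transpose(M) in row notation.

```python
def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(column) for column in zip(*m)]


def mat_vec(m, v):
    """Column notation: v' = M * v."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def vec_mat(v, m):
    """Row notation: v' = v * M."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]


# Hypothetical sample matrix in column notation (moves points by 10, 20, 30).
M = [
    [1, 0, 0, 10],
    [0, 1, 0, 20],
    [0, 0, 1, 30],
    [0, 0, 0, 1],
]
v = [1, 2, 3, 1]

print(mat_vec(M, v))             # [11, 22, 33, 1]
print(vec_mat(v, transpose(M)))  # [11, 22, 33, 1]
```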

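For example, the OpenGL documentation writes matrices in column-major notation, while in code the 16 elements are stored in memory column by column, so a matrix typed out as a flat array looks transposed. This confuses programmers.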

This article uses column-major notation.

matrix composition

As mentioned above, you can accumulate several transformations in one matrix using multiplication.

Suppose we want to accumulate two transformations: translation, then rotation. Denote their matrices as T and R.

column-major order
// translate
v' = T * v
// rotate
v'' = R * v'
  or
v'' = R * (T * v) = (R * T) * v
row-major order
// translate
v' = v * T
// rotate
v'' = v' * R
  or
v'' =  (v * T) * R = v * (T * R)
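
The column-major case can be checked numerically. The sketch below is an illustration under assumptions: the helpers mat_mul and mat_vec and the concrete T and R are chosen for the example (a 90 degree rotation is used so that all entries are exact integers).

```python
def mat_mul(a, b):
    """Return the 4x4 product A * B (matrices are lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def mat_vec(m, v):
    """Return M * v (column-major notation)."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


# T translates by 5 along x; R rotates by 90 degrees around the z-axis
# (cos = 0, sin = 1, so the entries are exact integers).
T = [[1, 0, 0, 5], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
R = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
v = [1, 0, 0, 1]

step_by_step = mat_vec(R, mat_vec(T, v))  # translate, then rotate
combined = mat_vec(mat_mul(R, T), v)      # the same with one matrix R * T
wrong_order = mat_vec(mat_mul(T, R), v)   # T * R rotates first, then translates

print(step_by_step)  # [0, 6, 0, 1]
print(combined)      # [0, 6, 0, 1]
print(wrong_order)   # [5, 1, 0, 1]
```

Matrix multiplication is associative but not commutative, which is why R * T and T * R give different results.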

As you can see, in column-major order the matrices are multiplied in the reverse of the order in which the transformations are applied. For example, suppose you want to rotate around a pivot point (step 1: move the pivot to the origin, step 2: rotate, step 3: move the pivot back) and you are using an API like OpenGL or the HTML canvas. Then in code you have to add the matrices for step 3, then step 2 and finally step 1.

// rotate the canvas around its center
// by 90 degrees (snippet; ctx is a 2D context, img is a loaded image)
ctx.save();
ctx.translate(img.width / 2, img.height / 2);   // step 3
ctx.rotate(Math.PI / 2);                        // step 2
ctx.translate(-img.width / 2, -img.height / 2); // step 1
ctx.drawImage(img, 0, 0);
ctx.restore();

identity matrix

The identity matrix represents the identity transformation, i.e. the new coordinates are equal to the old ones. It is used as the starting point when accumulating transformations.

    | 1 0 0 0|
I=  | 0 1 0 0|
    | 0 0 1 0|
    | 0 0 0 1|

translation matrix

Suppose we want to translate the point (x, y, z) by ax, ay, az along the respective axes. Then the new coordinates can be computed as

x' = x + ax
y' = y + ay
z' = z + az

So the matrix is

|1 0 0 ax|
|0 1 0 ay|
|0 0 1 az|
|0 0 0 1 |
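
A minimal sketch of this matrix in Python (the function names translation and mat_vec are assumptions for the example). It also shows why the fourth coordinate matters: with d = 1 the vector is moved as a point, while with d = 0 the last column is multiplied by zero and the translation has no effect.

```python
def translation(ax, ay, az):
    """Translation matrix in column-major notation (list of rows)."""
    return [
        [1, 0, 0, ax],
        [0, 1, 0, ay],
        [0, 0, 1, az],
        [0, 0, 0, 1],
    ]


def mat_vec(m, v):
    """Return M * v."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


T = translation(10, 20, 30)
print(mat_vec(T, [1, 2, 3, 1]))  # d = 1: [11, 22, 33, 1]
print(mat_vec(T, [1, 2, 3, 0]))  # d = 0: [1, 2, 3, 0]
```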

scaling matrix

Suppose we want to scale a shape by sx, sy and sz along the respective axes. Then the new coordinates can be computed as

x' = sx*x
y' = sy*y
z' = sz*z

So the matrix is

|sx 0 0 0|
|0 sy 0 0|
|0 0 sz 0|
|0 0 0 1 |

There is a special case when a scale factor is -1, which represents a reflection. For example, if sx = -1, then the shape is mirrored in the yz coordinate plane, i.e. flipped in the x direction.
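
Scaling and reflection can be sketched the same way. The helper names scaling and mat_vec are assumptions made for this illustration.

```python
def scaling(sx, sy, sz):
    """Scaling matrix in column-major notation (list of rows)."""
    return [
        [sx, 0, 0, 0],
        [0, sy, 0, 0],
        [0, 0, sz, 0],
        [0, 0, 0, 1],
    ]


def mat_vec(m, v):
    """Return M * v."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


point = [2, 3, 4, 1]
print(mat_vec(scaling(2, 3, 4), point))   # [4, 9, 16, 1]
print(mat_vec(scaling(-1, 1, 1), point))  # mirrored in the yz plane: [-2, 3, 4, 1]
```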

scale about arbitrary center

Suppose we want to scale a shape by sx, sy and sz along the respective axes, and let p = (px, py, pz) be the center of scaling.

To get the transformation matrix, perform the following steps:

  1. translate the center to the origin, i.e. translate by -px, -py, -pz
  2. scale by sx, sy and sz
  3. reverse step 1, i.e. translate by px, py, pz
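
The three steps can be sketched as one matrix product. The function names below, including scaling_about, are hypothetical and chosen for the example. In column-major notation the step applied first is the rightmost factor.

```python
def mat_mul(a, b):
    """Return the 4x4 product A * B (matrices are lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def mat_vec(m, v):
    """Return M * v."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def translation(ax, ay, az):
    return [[1, 0, 0, ax], [0, 1, 0, ay], [0, 0, 1, az], [0, 0, 0, 1]]


def scaling(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]


def scaling_about(sx, sy, sz, px, py, pz):
    """Scale by (sx, sy, sz) keeping the point (px, py, pz) fixed."""
    to_origin = translation(-px, -py, -pz)  # step 1
    scale = scaling(sx, sy, sz)             # step 2
    back = translation(px, py, pz)          # step 3
    return mat_mul(back, mat_mul(scale, to_origin))


M = scaling_about(2, 3, 4, 10, 20, 30)
print(mat_vec(M, [10, 20, 30, 1]))  # the center stays put: [10, 20, 30, 1]
print(mat_vec(M, [11, 21, 31, 1]))  # offset (1, 1, 1) becomes (2, 3, 4): [12, 23, 34, 1]
```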

rotation matrix

Suppose we want to rotate the point (x, y, z) around the z-axis by an angle α. Then the new coordinates can be calculated as

x' = x*cos(α) - y*sin(α)
y' = x*sin(α) + y*cos(α)
z' = z                  

So the matrix is

|cos(α) -sin(α) 0 0|
|sin(α)  cos(α) 0 0|
| 0       0     1 0|
| 0       0     0 1|

The rotation matrices around the x-axis and the y-axis are built similarly.

     around x-axis         around y-axis    
| 1   0      0      0|    |cos(α)  0  sin(α) 0|
| 0 cos(α) -sin(α)  0|,   |  0     1   0     0|
| 0 sin(α)  cos(α)  0|    |-sin(α) 0  cos(α) 0|
| 0   0      0      1|    |  0     0   0     1|

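The placement of the minus sign on sin(α) corresponds to a right-handed coordinate system, where a positive angle rotates counterclockwise when looking from the positive end of the axis toward the origin.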
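
As a sketch, the three basic rotation matrices could be written in Python as follows. The function names are assumptions made for the example, and angles are in radians. Rotating each coordinate axis by 90 degrees shows the cyclic pattern x to y, y to z, z to x.

```python
import math


def rotation_x(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]


def rotation_y(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, 0, s, 0], [0, 1, 0, 0], [-s, 0, c, 0], [0, 0, 0, 1]]


def rotation_z(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def mat_vec(m, v):
    """Return M * v."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


quarter = math.pi / 2
on_y = mat_vec(rotation_z(quarter), [1, 0, 0, 1])  # x-axis goes to about (0, 1, 0)
on_z = mat_vec(rotation_x(quarter), [0, 1, 0, 1])  # y-axis goes to about (0, 0, 1)
on_x = mat_vec(rotation_y(quarter), [0, 0, 1, 1])  # z-axis goes to about (1, 0, 0)
```

The results are only approximately exact because cos(π/2) is not exactly zero in floating point.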

rotation around an arbitrary axis

Suppose we have a unit vector υ with coordinates (υx, υy, υz) and we want to rotate a point around the axis along this vector (passing through the origin) by an angle α.

To get the transformation matrix, perform the following steps:

  1. rotate the given axis and the point so that the axis lies in one of the coordinate planes (xy, yz or zx)
  2. rotate the given axis and the point so that the axis is aligned with one of the two coordinate axes of that coordinate plane
  3. use the basic rotation matrix for that coordinate axis to rotate the point by α
  4. undo the rotation from step 2
  5. undo the rotation from step 1

If you carry out these steps by hand on paper, you end up with the matrix

|t*υx²+cos(α)        t*υx*υy-sin(α)*υz   t*υx*υz+sin(α)*υy   0|
|t*υx*υy+sin(α)*υz   t*υy²+cos(α)        t*υy*υz-sin(α)*υx   0|
|t*υx*υz-sin(α)*υy   t*υy*υz+sin(α)*υx   t*υz²+cos(α)        0|
|    0                   0                   0               1|
where t = 1 - cos(α)
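
The final matrix can be sketched directly in Python. The function name rotation_axis is an assumption, the axis must already be normalized, and the angle is in radians. As a sanity check, a rotation by 120 degrees around the diagonal (1, 1, 1) should move the x-axis onto the y-axis.

```python
import math


def rotation_axis(ux, uy, uz, alpha):
    """Rotation by alpha around the unit vector (ux, uy, uz)."""
    c, s = math.cos(alpha), math.sin(alpha)
    t = 1 - c
    return [
        [t * ux * ux + c,      t * ux * uy - s * uz, t * ux * uz + s * uy, 0],
        [t * ux * uy + s * uz, t * uy * uy + c,      t * uy * uz - s * ux, 0],
        [t * ux * uz - s * uy, t * uy * uz + s * ux, t * uz * uz + c,      0],
        [0, 0, 0, 1],
    ]


def mat_vec(m, v):
    """Return M * v."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


u = 1 / math.sqrt(3)  # components of the normalized diagonal (1, 1, 1)
M = rotation_axis(u, u, u, 2 * math.pi / 3)
moved = mat_vec(M, [1, 0, 0, 1])  # approximately (0, 1, 0)
```

With the axis (0, 0, 1) the same function reduces to the basic rotation matrix around the z-axis.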

rotation around arbitrary point

Suppose we want to rotate a point around a pivot point p with coordinates (px, py, pz).

To get the transformation matrix, perform the following steps:

  1. translate the pivot point to the origin, i.e. translate by -px, -py, -pz
  2. use a basic rotation matrix to rotate the point by an angle α
  3. reverse step 1, i.e. translate by +px, +py, +pz
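
A sketch of these steps, assuming the rotation in step 2 is the basic rotation around the z-axis. All function names are chosen for the example and the angle is in radians.

```python
import math


def mat_mul(a, b):
    """Return the 4x4 product A * B (matrices are lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def mat_vec(m, v):
    """Return M * v."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def translation(ax, ay, az):
    return [[1, 0, 0, ax], [0, 1, 0, ay], [0, 0, 1, az], [0, 0, 0, 1]]


def rotation_z(alpha):
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def rotation_about_point(alpha, px, py, pz):
    """Rotate by alpha around the line parallel to the z-axis through p."""
    to_origin = translation(-px, -py, -pz)  # step 1
    rotate = rotation_z(alpha)              # step 2
    back = translation(px, py, pz)          # step 3
    # The step applied first is the rightmost factor.
    return mat_mul(back, mat_mul(rotate, to_origin))


M = rotation_about_point(math.pi / 2, 1, 1, 0)
moved = mat_vec(M, [2, 1, 0, 1])  # approximately (1, 2, 0)
pivot = mat_vec(M, [1, 1, 0, 1])  # the pivot stays at about (1, 1, 0)
```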

rotate around center

A special case of rotation around an arbitrary point is rotating the whole canvas by 90 degrees around its center. After the rotation you may want the top left corner of the rotated image to be at the point (0,0).

Let the canvas be sized (w, h). After rotation, the size will be (h, w).

  1. translate by -w/2, -h/2
  2. rotate by 90°
  3. translate by +w/2, +h/2 (reverse step 1)
  4. translate by (h-w)/2, (w-h)/2 to align the rotated image with the point (0,0)
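
The four steps can be verified numerically. The sketch below is an illustration with made-up helper names and a hypothetical 400x200 canvas. It uses 4x4 matrices with z = 0 and an exact 90 degree rotation (cos = 0, sin = 1), then checks where the four corners land.

```python
def mat_mul(a, b):
    """Return the 4x4 product A * B (matrices are lists of rows)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def mat_vec(m, v):
    """Return M * v."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def translation(ax, ay):
    """Translation in the xy plane written as a 4x4 matrix."""
    return [[1, 0, 0, ax], [0, 1, 0, ay], [0, 0, 1, 0], [0, 0, 0, 1]]


# Rotation by exactly 90 degrees around the z-axis.
ROTATE_90 = [[0, -1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def rotate_canvas_90(w, h):
    steps = [
        translation(-w / 2, -h / 2),            # step 1
        ROTATE_90,                              # step 2
        translation(w / 2, h / 2),              # step 3
        translation((h - w) / 2, (w - h) / 2),  # step 4
    ]
    m = steps[0]
    for step in steps[1:]:
        m = mat_mul(step, m)  # later steps multiply from the left
    return m


w, h = 400, 200
M = rotate_canvas_90(w, h)
corners = [(0, 0), (w, 0), (w, h), (0, h)]
moved = [mat_vec(M, [x, y, 0, 1])[:2] for x, y in corners]
xs = [p[0] for p in moved]
ys = [p[1] for p in moved]
# The rotated corners span x from 0 to h and y from 0 to w.
print(min(xs), max(xs), min(ys), max(ys))
```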

shear matrices

Shear is a deformation that slants a shape; it is also called skew.

Suppose we want to shear a shape, where shx, shy and shz are the amounts by which the x, y and z coordinates are added to the other two coordinates. Then the matrix looks like this

| 1   shy   shz  0|
|shx   1    shz  0|
|shx  shy   1    0|
| 0   0     0    1|

You can also specify shear as an angle α, where ctg(α) = 1/tan(α) is the cotangent. In this case the matrix looks like this

shear along x-axis
| 1 ctg(α) 0  0|
| 0   1    0  0|
| 0   0    1  0|
| 0   0    0  1|
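
A minimal sketch of the angle form, with assumed function names and the angle in radians. With α = 45 degrees the cotangent is 1, so each point is shifted along x by its own y coordinate and the unit square turns into a parallelogram.

```python
import math


def shear_x_by_angle(alpha):
    """Shear along the x-axis: x' = x + ctg(alpha) * y."""
    ctg = 1 / math.tan(alpha)
    return [
        [1, ctg, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1],
    ]


def mat_vec(m, v):
    """Return M * v."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


M = shear_x_by_angle(math.pi / 4)
bottom = mat_vec(M, [1, 0, 0, 1])  # points with y = 0 do not move
top = mat_vec(M, [0, 1, 0, 1])     # approximately (1, 1, 0)
```

At α = 90 degrees the cotangent is practically zero and the matrix is practically the identity.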

projection matrices

The projection matrix is used to project all points from the bounded volume onto a plane.

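There are two types of projection: perspective and orthogonal (also called orthographic). In perspective projection distant objects look smaller and nearby objects look larger, while in orthogonal projection the size does not depend on the distance.

One way to bound the volume is to use six planes: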

  • r - the right plane
  • l - the left plane
  • t - the top plane
  • b - the bottom plane
  • f - the far plane
  • n - the near plane to which we will project

perspective projection
| 2*n/(r-l)  0          (r+l)/(r-l)    0          |
|   0       2*n/(t-b)   (t+b)/(t-b)    0          |
|   0       0           (f+n)/(n-f)   -2*f*n/(f-n)|
|   0       0          -1             0           |
orthogonal projection
| 2/(r-l)   0          0         -(r+l)/(r-l)|
|   0      2/(t-b)     0         -(t+b)/(t-b)|
|   0       0        -2/(f-n)    -(f+n)/(f-n)|
|   0       0          0             1       |
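
A hedged sketch of both matrices in Python. The function names are assumptions. frustum follows the perspective matrix above, ortho follows the convention of OpenGL's glOrtho, and both assume the camera looks along the negative z-axis. After multiplying by a perspective matrix the result still has to be divided by d' (the perspective divide), which the helper project does.

```python
def frustum(l, r, b, t, n, f):
    """Perspective projection bounded by six planes."""
    return [
        [2 * n / (r - l), 0, (r + l) / (r - l), 0],
        [0, 2 * n / (t - b), (t + b) / (t - b), 0],
        [0, 0, (f + n) / (n - f), -2 * f * n / (f - n)],
        [0, 0, -1, 0],
    ]


def ortho(l, r, b, t, n, f):
    """Orthogonal projection bounded by six planes."""
    return [
        [2 / (r - l), 0, 0, -(r + l) / (r - l)],
        [0, 2 / (t - b), 0, -(t + b) / (t - b)],
        [0, 0, -2 / (f - n), -(f + n) / (f - n)],
        [0, 0, 0, 1],
    ]


def project(m, x, y, z):
    """Apply m to (x, y, z, 1), then divide by d'."""
    v = (x, y, z, 1)
    xp, yp, zp, dp = [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
    return (xp / dp, yp / dp, zp / dp)


# Hypothetical volume: l = -1, r = 1, b = -1, t = 1, n = 1, f = 3.
P = frustum(-1, 1, -1, 1, 1, 3)
O = ortho(-1, 1, -1, 1, 1, 3)

print(project(P, 1, 1, -1))  # corner of the near plane: (1.0, 1.0, -1.0)
print(project(O, 1, 1, -3))  # same x, y on the far plane: (1.0, 1.0, 1.0)
shrunk = project(P, 1, 1, -3)  # perspective: x and y shrink to about 1/3
```

Both matrices map the bounded volume to the cube from -1 to 1 on every axis; only the perspective one makes distant points move toward the center.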

Another way to bound the volume is to use the near and far planes together with a vertical field of view:

  • n - the near plane to which we will project
  • f - the far plane
  • ar - the ratio between the width and the height of the rectangular area which will be the target of projection
  • α - vertical field of view (FOVy), the vertical angle of the camera through which we are looking at the world

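In this case the matrix looks like this. Note that it assumes the camera looks along the positive z-axis (the last row copies +z into d'), unlike the perspective matrix above, which looks along the negative z-axis.

perspective projection
| 1/(ar*tan(α/2))  0             0                 0     |
|   0            1/tan(α/2)      0                 0     |
|   0              0        (-n-f)/(n-f)      2*f*n/(n-f)|
|   0              0             1                 0     |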
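
A sketch of the field-of-view form in Python, under stated assumptions: the function names are made up, the angle is in radians, the camera looks along the positive z-axis, and the depth offset is taken as 2*f*n/(n-f) so that z = n maps to -1 and z = f maps to +1 after dividing by d'.

```python
import math


def perspective_fov(ar, alpha, n, f):
    """Perspective projection from aspect ratio, vertical FOV, near and far."""
    k = 1 / math.tan(alpha / 2)
    return [
        [k / ar, 0, 0, 0],
        [0, k, 0, 0],
        [0, 0, (-n - f) / (n - f), 2 * f * n / (n - f)],
        [0, 0, 1, 0],
    ]


def project(m, x, y, z):
    """Apply m to (x, y, z, 1), then divide by d' (the perspective divide)."""
    v = (x, y, z, 1)
    xp, yp, zp, dp = [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]
    return (xp / dp, yp / dp, zp / dp)


# Hypothetical camera: 2:1 viewport, 90 degree vertical field of view.
n, f = 1.0, 10.0
alpha = math.pi / 2
P = perspective_fov(2.0, alpha, n, f)

near_center = project(P, 0, 0, n)  # depth is about -1
far_center = project(P, 0, 0, f)   # depth is about +1
top_edge = project(P, 0, n * math.tan(alpha / 2), n)  # y is about 1
```

The top edge of the near plane lies at height n*tan(α/2), which is why it lands on y = 1 after projection.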