## Prerequisites

No really advanced math is necessary. Find a good book on matrix operations, that’s all you might need above school math course. If you would like to refresh your knowledge below are some quick articles:
Cartesian Coordinate System – http://en.wikipedia.org/wiki/Cartesian_coordinate_system
Rotation – http://en.wikipedia.org/wiki/Rotation_%28mathematics%29
Vector scalar product – http://en.wikipedia.org/wiki/Dot_product
Vector cross product – http://en.wikipedia.org/wiki/Cross_product
Matrix Multiplication – http://en.wikipedia.org/wiki/Matrix_multiplication
Block Matrix – http://en.wikipedia.org/wiki/Block_matrix
Transpose Matrix – http://en.wikipedia.org/wiki/Transpose
Triple Product – http://en.wikipedia.org/wiki/Triple_product

## Notations

Vectors are marked in bold text so for example “v is a vector and “v” is a scalar (if you can’t distinguish the two there’s problem with the text formatting wherever you’re reading this).

## Part 1. The DCM Matrix

Generally speaking orientation kinematics deals with calculating the relative orientation of a body relative to a global coordinate system. It is useful to attach a coordinate system to our body frame and call it Oxyz, and another one to our global frame and call it OXYZ. Both the global and the body frames have the same fixed origin O (see Fig. 1). Let’s also define i, j, k to be unity vectors co-directional with the body frame’s x, y, and z axes – in other words they are versors of Oxyz and let I, J, K be the versors of global frame OXYZ.

Figure 1

Thus, by definition, expressed in terms of global coordinates vectors I, J, K can be written as:

IG = {1,0,0} T, JG={0,1,0} T , KG = {0,0,1} T

Note: we use {…} T notation to denote a column vector, in other words a column vector is a translated row vector. The orientation of vectors (row/column) will become relevant once we start multiplying them by a matrix later on in this text.

And similarly, in terms of body coordinates vectors i, j, k can be written as:

iB = {1,0,0} T, jB={0,1,0} T , kB = {0,0,1} T

Now let’s see if we can write vectors i, j, k in terms of global coordinates. Let’s take vector i as an example and write its global coordinates:

iG = {ixG , iyG , izG} T

Again, by example let’s analyze the X coordinate ixG, it’s calculated as the length of projection of the i vector onto the global X axis.

ixG = |i| cos(X,i) = cos(I,i)

Where |i| is the norm (length) of the i unity vector and cos(I,i) is the cosine of the angle formed by the vectors I and i. Using the fact that |I| = 1 and |i| = 1 (they are unit vectors by definition). We can write:

ixG = cos(I,i) = |I||i| cos(I,i) = I.i

Where I.i. is the scalar (dot) product of vectors I and i. For the purpose of calculating scalar product I.i it doesn’t matter in which coordinate system these vectors are measured as long as they are both expressed in the same system, since a rotation does not modify the angle between vectors so: I.i = IB.iB = IG.iG = cos(IB.iB) = cos(IG.iG) , so for simplicity we’ll skip the superscript in scalar products I.i , J.j , K.k and in cosines cos(I,i), cos(J,j), cos(K,k).

Similarly we can show that:

iyG = J.i , izG=K.i , so now we can write vector i in terms of global coordinate system as:

iG= { I.i, J.i, K.i}T

Furthermore, similarly it can be shown that jG= { I.j, J.j, K.j} T , kG= { I.k, J.k, K.k} T.

We now have a complete set of global coordinates for our body’s versors i, j, k and we can organize these values in a convenient matrix form:

(Eq. 1.1)

This matrix is called Direction Cosine Matrix for now obvious reasons – it consists of cosines of angles of all possible combinations of body and global versors.

The task of expressing the global frame versors IG, JG, KG in body frame coordinates is symmetrical in nature and can be achieved by simply swapping the notations I, J, K with i, j, k, the results being:

IB= { I.i, I.j, I.k}T , JB= { J.i, J.j, J.k}T , KB= { K.i, K.j, K.k}T

and organized in a matrix form:

(Eq. 1.2)

It is now easy to notice that DCMB = (DCMG)T or DCMG = (DCMB)T , in other words the two matrices are translates of each other, we’ll use this important property later on.

Also notice that DCMB. DCMG = (DCMG)T .DCMG = DCMB. (DCMB)T = I3 , where I3 is the 3×3 identity matrix. In other words the DCM matrices are orthogonal.

This can be proven by simply expanding the matrix multiplication in block matrix form:

(Eq. 1.3)

To prove this we use such properties as for example: iGT. iG = | iG|| iG|cos(0) = 1 and iGT. jG = 0 because (i and j are orthogonal) and so forth.

The DCM matrix (also often called the rotation matrix) has a great importance in orientation kinematics since it defines the rotation of one frame relative to another. It can also be used to determine the global coordinates of an arbitrary vector if we know its coordinates in the body frame (and vice versa).

Let’s consider such a vector with body coordinates:

rB= { rxB, ryB, rzB} T and let’s try to determine its coordinates in the global frame, by using a known rotation matrix DCMG.

We start by doing following notation:

rG = { rxG , ryG , rzG } T.

Now let’s tackle the first coordinate rxG:

rxG = | rG| cos(IG,rG) , because rxG is the projection of rG onto X axis that is co-directional with IG.

Next let’s note that by definition a rotation is such a transformation that does not change the scale of a vector and does not change the angle between two vectors that are subject to the same rotation, so if we express some vectors in a different rotated coordinate system the norm and angle between vectors will not change:

| rG| = | rB| , | IG| = | IB| = 1 and cos(IG,rG) = cos(IB,rB), so we can use this property to write

rxG = | rG| cos(IG,rG) = | IB || rB| cos(IB,rB) = IB. rB = IB. { rxB, ryB, rzB} T , by using one the two definition of the scalar product.

Now recall that IB= { I.i, I.j, I.k}T and by using the other definition of scalar product:

rxG = IB. rB = { I.i, I.j, I.k}T . { rxB, ryB, rzB} T = rxB I.i + ryB I.j + rzB I.k

In same fashion it can be shown that:

ryG = rxB J.i + ryB J.j + rzB J.k
rzG = rxB K.i + ryB K.j + rzB K.k

Finally let’s write this in a more compact matrix form:

(Eq. 1.4)

Thus the DCM matrix can be used to covert an arbitrary vector rB expressed in one coordinate system B, to a rotated coordinate system G.

We can use similar logic to prove the reverse process:

(Eq. 1.5)

Or we can arrive at the same conclusion by multiplying both parts in (Eq. 1.4) by DCMB which equals to DCMGT, and using the property that DCMGT.DCMG = I3 , see (Eq. 1.3):

DCMB rG = DCMB DCMG rB = DCMGT DCMG rB = I3 rB = rB

## Part 2. Angular Velocity

So far we have a way to characterize the orientation of one frame relative to another rotated frame, it is the DCM matrix and it allows us to easily convert the global and body coordinates back and forth using (Eq. 1.4) and (Eq. 1.5). In this section we’ll analyze the rotation as a function of time that will help us establish the rules of updating the DCM matrix based on a characteristic called angular velocity. Let’s consider an arbitrary rotating vector r and define it’s coordinates at time t to be r(t). Now let’s consider a small time interval dt and make the following notations: r = r (t) , r’= r (t+dt) and dr = r’ – r:

Figure 2

Let’s say that during a very small time interval dt → 0 the vector r has rotated about an axis co-directional with a unity vector u by an angle dθ and ended up in the position r’. Since u is our axis of rotation it is perpendicular to the plane in which the rotation took place (the plane formed by r and r’) so u is orthogonal to both r and r’. There are two unity vectors that are orthogonal to the plane formed by r and r’, they are shown on the picture as u and u’ since we’re still defining things we’ll choose the one that is co-directional with the cross product r x r’, following the rule of right-handed coordinate system. Thus because u is a unity vector |u| = 1 and is co-directional with r x r’ we can deduct it as follows:

u = (r x r’) / |r x r’| = (r x r’) / (|r|| r’|sin(dθ)) = (r x r’) / (|r|2 sin(dθ)) (Eq. 2.1)

Since a rotation does not alter the length of a vector we used the property that| r’| = |r|.

The linear velocity of the vector r can be defined as the vector:

v = dr / dt = ( r’ – r) / dt (Eq. 2.2)

Please note that since our dt approaches 0 so does dθ → 0, hence the angle between vectors r and dr (let’s call it α) can be found from the isosceles triangle contoured by r , r’ and dr:

α = (π – dθ) / 2 and because dθ → 0 , then α → π/2

What this tells us is that r is perpendicular to dr when dt → 0 and hence r is perpendicular to v since v and dr are co-directional from (Eq. 2.2):

v ⊥ r (Eq. 2.21)

We are now ready to define the angular velocity vector. Ideally such a vector should define the rate of change of the angle θ and the axis of the rotation, so we define it as follows:

w = (dθ/dt ) u (Eq. 2.3)

Indeed the norm of the w is |w| = dθ/dt and the direction of w coincides with the axis of rotation u. Let’s expand (Eq. 2.3) and try to establish a relationship with the linear velocity v:

Using (Eq. 2.3) and (Eq. 2.1):

w = (dθ/dt ) u = (dθ/dt ) (r x r’) / (|r|2 sin(dθ))

Now note that when dt → 0, so does dθ → 0 and hence for small dθ, sin(dθ) ≈ dθ , we end up with:

w = (r x r’) / (|r|2 dt) (Eq. 2.4)

Now because r’ = r + dr , dr/dt = v , r x r = 0 and using the distributive property of cross product over addition:

w = (r x (r + dr)) / (|r|2 dt) = (r x r + r x dr)) / (|r|2 dt) = r x (dr/dt) / |r|2

And finally:

w = r x v / |r|2­ (Eq. 2.5)

This equation establishes a way to calculate angular velocity from a known linear velocity v.

We can easily prove the reverse equation that lets us deduct linear velocity from angular velocity:

v = w x r ­ (Eq. 2.6)

This can be proven simply by expanding w from (Eq. 2.5) and using vector triple product rule (a x b) x c = (a.c)b – (b.c)a. Also we’ll use the fact that v and r are perpendicular (Eq. 2.21) and thus v.r = 0

w x r = (r x v / |r|2­) x r = (r x v) x r / |r|2­ = ((r.r) v + (v.r)r) / |r|2­ = ( |r|2­ v + 0) |r|2 = v

So we just proved that (Eq. 2.6) is true. Just to check (Eq. 2.6) intuitively – from Figure 2 indeed v has the direction of w x r using the right hand rule and indeed vr and vw because it is in the same plane with r and r’.

## Part 3. Gyroscopes and angular velocity vector

A 3-axis MEMS gyroscope is a device that senses rotation about 3 axes attached to the device itself (body frame). If we adopt the device’s coordinate system (body’s frame), and analyze some vectors attached to the earth (global frame), for example vector K pointing to the zenith or vector I pointing North – then it would appear to an observer inside the device that these vector rotate about the device center. Let wx , wy , wz be the outputs of a gyroscope expressed in rad/s – the measured rotation about axes x, y , z respectively. Converting from the raw output of the gyroscope to physical values is discussed for example here: http://www.starlino.com/imu_guide.html . If we query the gyroscope at regular, small time intervals dt, then what gyroscope output tells us is that during this time interval the earth rotated about gyroscope’s x axis by an angle of dθx = wxdt, about y axis by an angle of dθy = wydt and about z axis by an angle of dθz = wzdt. These rotations can be characterized by the angular velocity vectors: wx = wx i = {wx , 0 , 0 }T , wy = wy j = { 0 , wy , 0 }T , wz = wz k = { 0 , 0, wz }T , where i,j,k are versors of the local coordinate frame (they are co-directional with body’s axes x,y,z respectively). Each of these three rotations will cause a linear displacement which can be expressed by using (Eq. 2.6):

dr1 = dt v1 = dt (wx x r) ; dr2 = dt v2 = dt (wy x r) ; dr3 = dt v3 = dt (wz x r) .

The combined effect of these three displacements will be:

dr = dr1 + dr2 + dr3 = dt (wx x r + wy x r + wz x r) = dt (wx + wy + wz) x r (cross product is distributive over addition)

Thus the equivalent linear velocity resulting from these 3 transformations can be expressed as:

v = dr/dt = (wx + wy + wz) x r = w x r , where we introduce w = wx + wy + wz = {wx , wy , wz }

Which looks exactly like (Eq. 2.6) and suggests that the combination of three small rotations about axes x,y,z characterized by angular rotation vectors wx , wy , wz is equivalent to one small rotation characterized by angular rotation vector w = wx + wy + wz = {wx , wy , wz }. Please note that we’re stressing out that these are small rotations, since in general when you combine large rotations the order in which rotations are performed become important and you cannot simply sum them up. Our main assumption that let us go from a linear displacement to a rotation by using (Eq. 2.6) was that dt is really small, and thus the rotations dθ and linear displacement dr are small as well. In practice this means that the larger the dt interval between gyro queries the larger will be our accumulated error, we’ll deal with this error later on. Now, since wx , wy , wz are the output of the gyroscope, then we arrive at the conclusion that in fact a 3 axis gyroscope measures the instantaneous angular velocity of the world rotating about the device’s center.

## Part 4. DCM complimentary filter algorithm using 6DOF or 9DOF IMU sensors

In the context of this text a 6DOF device is an IMU device consisting of a 3 axis gyroscope and a 3 axis accelerometer. A 9DOF device is an IMU device of a 3 axis gyroscope, a 3 axis accelerometer and a 3 axis magnetometer. Let’s attach a global right-handed coordinate system to the Earth’s frame such that the I versor points North, K versor points to the Zenith and thus, with these two versors fixed, the J versor will be constrained to point West.

Figure 3

Also let’s consider the body coordinate system to be attached to our IMU device (acc_gyro used as an example),

Figure 4

We already established the fact that gyroscopes can measure the angular velocity vector. Let’s see how accelerometer and magnetometer measurements will fall into our model.

Accelerometers are devices that can sense gravitation. Gravitation vector is pointing towards the center of the earth and is opposite to the vector pointing to Zenith KB. If the 3 axis accelerometer output is A = {Ax , Ay , Az } and we assume that there are no external accelerations or we have corrected them then we can estimate that KB = –A. (See this IMU Guide for more clarifications http://www.starlino.com/imu_guide.html).

Magnetometers are devices that are really similar to accelerometers, except that instead of gravitation they can sense the Earth’s magnetic North. Just like accelerometers they are not perfect and often need corrections and initial calibration. If the corrected 3-axis magnetometer output is M = {Mx , My , Mz }, then according to our model IB is pointing North , thus IB = M.

Knowing IB and KB allows us calculate JB = KB x IB.

Thus an accelerometer and a magnetometer alone can give us the DCM matrix , expressed either as DCMB or DCMG

DCMG = DCMBT = [IB, JB, KB]T

The DCM matrix can be used to convert any vector from body’s(devices) coordinate system to the global coordinate system. Thus for example if we know that the nose of the plane has some fixed coordinates expressed in body’s coordinate system as rB = {1,0,0}, the we can find where the device is heading in other words the coordinates of the nose in global coordinate systems using (Eq. 1.4):

rG = DCMG rB

So far you’re asking yourself if an accelerometer and a magnetometer gives us the DCM matrix at any point in time, why do we need the gyroscope ? The gyroscope is actually a more precise device than the accelerometer and magnetomer are , it is used to “fine-tune” the DCM matrix returned by the accelerometer and magnetometer.

Gyroscopes have no sense of absolute orientation of the device , i.e. they don’t know where north is and where zenith is (things that we can find out using the accelerometer and magnetometer), instead if we know the orientation of the device at time t, expressed as a DCM matrix DCM(t) , we can find a more precise orientation DCM(t+dt) using the gyroscope , then the one estimated directly from the accelerometer and magnetometer direct readings which are subject to a lot of noise in form of external (non-gravitational) inertial forces (i.e. acceleration) or magnetically forces that are not caused by the earth’s magnetic field.

These facts call for an algorithm that would combine the readings from all three devices (accelerometer, magnetometer and gyroscope) in order to create our best guess or estimate regarding the device orientation in space (or space’s orientation in device’s coordinate systems), the two orientations are related since they are simply expressed using two DCM matrices that are transpose of one another (DCMG = DCMBT ).

We’ll now go ahead and introduce such an algorithm.

We’ll work with the DCM matrix that consists of the versors of the global (earth’s) coordinate system aligned on each row:

If we read the rows of DCMG we get the vectors IB, JB, KB. We’ll work mostly with vectors KB (that can be directly estimated by accelerometer) and vector IB (that can be directly estimated by the magnetometer). The vector JB is simply calculated as JB = KB x IB , since it’s orthogonal to the other two vectors (remember versors are unity vectors with same direction as coordinate axes).

Let’s say we know the zenith vector expressed in body frame coordinates at time t0 and we note it as KB0. Also let’s say we measured our gyro output and we have determined that our angular velocity is w = {wx , wy , wz }. Using our gyro we want to know the position of our zenith vector after a small period of time dt has passed we’ll note it as KB1G . And we find it using (Eq. 2.6):

KB1GKB0 + dt v = KB0 + dt (wg x KB0) = KB0 + ( dθg x KB0)

Where we noted dθg = dt wg. Because wg is angular velocity as measured by the gyroscope. We’ll call dθg angular displacement. In other words it tells us by what small angle (given for all 3 axis in form of a vector) has the orientation of a vector KB changed during this small period of time dt.

Obviously, another way to estimate KB is by making another reading from accelerometer so we can get a reading that we note as KB1A .

In practice the values KB1G will be different from from KB1A. One was estimated using our gyroscope and the other was estimated using our accelerometer.

Now it turns out we can go the reverse way and estimate the angular velocity wa or angular displacement dθa ­= dt wa , from the new accelerometer reading KB1A­ , we’ll use (Eq. 2.5):

w­a = KB0 x va / | KB0|2­

Now va = (KB1A­ KB0) / dt , and is basically the linear velocity of the vector KB0. And | KB0|2­­ = 1 , since KB0 is a unity vector. So we can calculate:

dθa ­= dt wa = KB0 x (KB1A­ KB0)

The idea of calculating a new estimate KB1 ­ that combines both KB1A and KB1G is to first estimate dθ as a weighted average of dθa and dθg :

dθ = (sa dθa + sg dθg) / (sa + s), we’ll discuss about the weights later on , but shortly they are determined and tuned experimentally in order to achieve a desired response rate and noise rejection.

Note for advanced readers: To improve algorithm we could divide dϴg into two components: one orthogonal to KB0 and one parallel to it. We’d then only apply averaging formula on the component orthogonal to KB0 , the idea being that the accelerometer cannot sense rotation  about KB0 (azimuth to zenith) axis, thus the equivalent parallel component  of dϴa will always be zero.

And then KB1 ­ is calculated similar to how we calculated KB1G:

KB1KB0 + ( dθ x KB0)

Why we went all the way to calculate dθ and did not apply the weighted average formula directly to KB1A and KB1G ? Because dθ can be used to calculate the other elements of our DCM matrix in the same way:

IB1IB0 + ( dθ x IB0)

JB1JB0 + ( dθ x JB0)

The idea is that all three versors IB, JB, KB are attached to each other and will follow the same angular displacement dθ during our small interval dt. So in a nutshell this is the algorithm that allows us to calculate the DCM1 matrix at time t1 from our previous estimated DCM0 matrix at time t­­0. It is applied recursively at regular small time intervals dt and gives us an updated DCM matrix at any point in time. The matrix will not drift too much because it is fixed to the absolute position dictated by the accelerometer and will not be too noisy from external accelerations because we also use the gyroscope data to update it.

So far we didn’t mention a word about our magnetometer. One reasons being that it is not available on all IMU units (6DOF) and we can go away without using it, but our resulting orientation will then have a drifting heading (i.e. it will not show if we’re heading north, south, west or east), or we can introduce a virtual magnetometer that is always pointing North, to introduce stability in our model. This situation is demonstrated in the accompanying source code that used a 6DOF IMU.

Now we’ll show how to integrate magnetometer readings into our algorithm. As it turns out it is really simple since magnetometer is really similar to accelerometer (they even use similar calibration algorithms), the only difference being that instead of estimating the Zenith vector KB vector it estimates the vector pointing North IB. Following the same logic as we did for our accelerometer we can determine the angular displacement according to the updated magnetometer reading as being:

dθm ­= dt wm = IB0 x (IB1M­ IB0)

Now let’s incorporate it into our weighted average:

dθ = (sa dθa + sg dθg + sm dθm) / (sa + sg +­ sm)

From here we go the same path to calculate the updated DCM

IB1IB0 + ( dθ x IB0) , KB1KB0 + ( dθ x KB0) and JB1 JB0 + ( dθ x JB0),

In practice we’ll calculate JB1 = KB1 x IB1, after correcting KB1 and IB1 to be perpendicular unity vectors again , note that all our logic is approximated and dependent on dt being small, the larger the dt the larger the error we’ll accumulate.

So if vectors IB0, JB0, KB0 form a valid DCM matrix , in other words they are orthogonal to each other and are unity vectors, then we can’t say the same about IB1, JB1, KB1 , the formulas used for calculating them does not guarantee the orthogonality or length of the vector to be preserved , however we will not get a big error if dt is small, all we need to do is to correct them after each iteration.

First let’s see how we can ensure that two vectors are orthogonal again. Let’s consider two unity vectors a and b that are “almost orthogonal” in other words the angle between these two vectors is close to 90°, but not exactly 90°. We’re looking to find a vector b’ that is orthogonal to a and that is in the same plane formed by the vectors a and b. Such a vector is easy to find as shown in Figure 5. First we find vector c = a x b that by the rules of cross product is orthogonal to both a and b and thus is perpendicular to the plane formed by a and b. Next the vector b’ = c x a is calculated as the cross product of c and a. From the definition of cross product b’ is orthogonal to a and because it is also orthogonal to c – it end up in the plane orthogonal to c , which is the plane formed by a and b. Thus b’ is the corrected vector we’re seeking that is orthogonal to a and belongs to the plane formed by a and b.

Figure 5

We can extend the equation using the triple product rule and the fact that a.a = |a| = 1:

b’ = c x a = (a x b) x a = –a (a.b) + b(a.a) = ba (a.b) = b + d , where d = – a (a.b) (Scenario 1, a is fixed b is corrected)

You can reflect a little bit on the results … So we obtain corrected vector b’ from vector b by adding a “correction” vector d = – a (a.b). Notice that d is parallel to a. Its direction is dependent upon the angle between a and b, for example in Figure 5 a.b = cos (a,b) > 0 , because angle between a and b is less than 90°thus d has opposite direction from a and a magnitutde of cos(a,b) = sin(b,b’).

In the scenario above we considered that vector a is fixed and we found a corrected vector b’ that is orthogonal to a. We can consider the symmetric problem – we fix b and find the corrected vector a’:

a’ = ab (b.a) = ab (a.b) = a + e, where e = b (a.b) (Scenario 2, b is fixed a is corrected)

Finally in the third scenario we want both vectors to move towards their corrected state, we consider them both “equally wrong”, so intuitively we apply half correction to both vectors from scenario 1 and 2:

a’ = ab (a.b) / 2 (Scenario 3, both a and b are corrected)
b’ = ba (a.b) / 2

Figure 6

This is an relatively easy formula to calculate on a microprocessor since we can pre-compute Err = (a.b)/2 and then use it to correct both vectors:

a’ = a – Err * b
b’ = b – Err * a

Please note that we’re not proving that a’ and b’ are orthogonal in Scenario 3, but we presented the intuitive reasoning why the angle between a’ and b’ will get closer to 90°if we apply the above corrective transformations.

Now going back to our updated DCM matrix that consists of three vectors IB1, JB1, we apply the following corrective actions before reintroducing the DCM matrix into the next loop:

Err = ( IB1 . JB1 ) / 2

IB1= IB1 – Err * JB1
JB1= JB1 – Err * IB1
IB1’’ = Normalize[IB1]
JB1’’ = Normalize[JB1]
KB1’’ = IB1’’ x JB1’’

Where Normalize[a] = a / |a| , is the formula calculating the unit vector co-directional with a.

So finally our corrected DCM1 matrix can be recomposed from vectors IB1’’, JB1’’, KB1’’ that have been ortho-normalized (each vector constitutes a row of the updated and corrected DCM matrix).

We repeat the loop to find DCM2 , DCM3 , or in general DCM n , at any time interval n.