3Blue1Brown has an outstanding series of short videos, "Essence of linear algebra"[1], which covers vectors, matrix transformations, and the related math. What I particularly like about these videos is that the concepts are introduced first as abstract animations, without the numbers and calculations (that stuff is covered later), to emphasize the "shape" of what vectors, matrices, etc. really represent.
Open question I still have - what is the "geometric interpretation" of the transpose operation? A^T?
Considering the fact that the transpose shows up all the time, I'm very surprised that I've never seen a good explanation of how I should be visualizing it.
So the adjacency matrix is not only an index of edges, it's also a little machine that can push nodes around on a graph. You can even do it with two at a time:
Well, the adjacency matrix I used as an example is orthogonal, but they don't have to be. Any matrix with 1's and 0's can be interpreted as an unweighted adjacency matrix for an undirected graph (if the matrix is symmetric) or directed graph (if it's not symmetric). For example, here's an adjacency matrix that's not an orthogonal matrix:
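A minimal sketch of the "little machine" idea, with a made-up 3-node directed graph (the graph, and the convention that row i / column j means an edge i -> j, are my own choices here):

```python
import numpy as np

# Hypothetical directed graph on 3 nodes: 0 -> 1, 1 -> 2, 2 -> 1.
# Row i has a 1 in column j when there is an edge i -> j, so this
# matrix is not symmetric (directed) and not orthogonal.
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 1, 0]])

# A one-hot vector marks a node; multiplying by A^T pushes the
# marker along the edges to the node's out-neighbors.
node0 = np.array([1, 0, 0])
print(A.T @ node0)          # marker moves from node 0 to node 1

# Two at a time: just add the one-hot vectors.
both = np.array([1, 0, 1])  # markers on nodes 0 and 2
print(A.T @ both)           # both markers land on node 1
```

The entry 2 in the second result is the point: the matrix doesn't just move markers, it counts how many arrive at each node.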
It's a bit difficult to explain the visualization without pictures, but I'll give it a shot.
The transpose is really about converting the matrix to operate on a different vector space, namely, the dual space. In particular, the dual space of a vector space V is the vector space of "linear functionals", which are linear functions
\phi: V -> R
A linear functional on R^2 looks like a gradient (the "gradient fill" gradient, not a calculus gradient). These gradients are in one-to-one correspondence with vectors in R^2. In particular, given a vector w \in R^2, the direction of the gradient is along the direction of w, and the speed with which the gradient changes corresponds to the magnitude of w.
The precise mathematical correspondence is that (i) given a vector w \in R^2, the function
f_w(v) = <w, v>
is a linear function (here, <,> is the inner/dot product), and (ii) every linear function has this form. Now, note that f_w is exactly multiplication by the transpose w^T of w! In particular,
f_w(v) = w^T v
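To make the correspondence concrete, here's a small numeric sketch (the particular w and v are my own picks):

```python
import numpy as np

# The linear functional f_w is literally "multiply by the row vector w^T".
w = np.array([3.0, 4.0])
v = np.array([1.0, 2.0])

f_w = lambda v: np.dot(w, v)       # f_w(v) = <w, v>
print(f_w(v))                      # 3*1 + 4*2 = 11.0

# The same thing as an explicit 1x2 row matrix times a 2x1 column:
row = w.reshape(1, 2)              # this is w^T
print(row @ v.reshape(2, 1))       # [[11.]]
```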
More generally, for any linear map A : V -> W, the adjoint A* of A is defined to be the linear map from the dual space of W to the dual space of V that satisfies
<w, A v> = <A* w, v>
The transpose A^T is the adjoint of A when V is a finite-dimensional real vector space:
<A^T w, v> = (A^T w)^T v = w^T A v = <w, A v>
In summary, you can try to visualize A^T as a linear map "acting" on dual vectors. For example, let v \in R^2 and let w be a dual vector (i.e., a gradient), and suppose that A rotates v clockwise by 90 degrees. To preserve the inner product <w, A v>, A^T rotates w counter-clockwise by 90 degrees.
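A quick numeric check of that rotation example (the specific vectors are my own choices):

```python
import numpy as np

# A rotates clockwise by 90 degrees; A^T rotates counter-clockwise,
# and the pairing <w, A v> = <A^T w, v> is preserved.
A = np.array([[ 0.0, 1.0],
              [-1.0, 0.0]])    # clockwise 90-degree rotation

v = np.array([1.0, 2.0])       # an ordinary vector
w = np.array([3.0, -1.0])      # a dual vector (a "gradient")

lhs = np.dot(w, A @ v)         # <w, A v>
rhs = np.dot(A.T @ w, v)       # <A^T w, v>
print(lhs, rhs)                # the two pairings agree
```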
And a practical example of this is checking transformed points against a view frustum. Instead of transforming points into view space, a transposed matrix lets you transform the frustum into object space and check untransformed points against it. This only works for non-perspective transforms, of course, but the view transform shouldn't be perspective anyway.
To visualize this, take the simplest case of 2D space and non-homogeneous coordinates. A simple frustum would be the angle made by two rays from the origin. You can see that rotating the space is the same as rotating the frustum in the opposite direction (though in this case the transposed matrix is the same as the inverse), but stretching the space opens or closes the frustum depending on which direction it pulls the normals.
> Open question I still have - what is the "geometric interpretation" of the transpose operation? A^T?
AFAIK, there is not a solid geometric interpretation. Part of the trouble is that the transpose can change the shape of the matrix.
For example, transposing a (column) vector turns it into a row vector: a matrix that takes a vector and outputs a single number. These two objects seem rather incomparable geometrically.
The best intuition I have for transposition is that it represents a time reversal (but not necessarily an inverse). In the case of a vector, you have to think of it as a linear transformation that maps 1 to that vector. The transpose instead maps that vector to its length squared. I have a vaguer intuition for why it's the length squared and how it relates to projections, but it's hard to put into words.
With rotation matrices, this time reversal results in the inverse. So essentially rotation/skewing is reversed, but scaling is not.
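A small sketch of that asymmetry (the particular matrices and vector are my own examples):

```python
import numpy as np

# For a rotation R, the transpose undoes it: R^T R = I.
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
rot_undone = np.allclose(R.T @ R, np.eye(2))

# For a scaling S, the transpose repeats it: S^T S = S^2, not I.
S = np.diag([2.0, 3.0])
scale_not_undone = np.allclose(S.T @ S, S @ S)

# And for a vector v, composing v^T with v gives its length squared.
v = np.array([3.0, 4.0])
len_sq = v @ v               # v^T v = |v|^2 = 25

print(rot_undone, scale_not_undone, len_sq)
```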
I was hoping this was the top comment. Linear algebra would have been much more interesting if I had watched those videos first. 3B1B is such an amazing teacher!
[1] https://www.youtube.com/playlist?list=PLZHQObOWTQDPD3MizzM2x...