Lie Groups and their Algebras

There is a saying that “groups, like men, will be known by their actions.” The groups we are interested in here are known by their actions as linear transformations on vector spaces (of finite dimension, over \mathbb{R} or \mathbb{C}). Thus for every g \in G, v, w \in V we must have g \cdot (av+bw) = ag \cdot v + bg \cdot w for all scalars a and b. Since we require that g \cdot g^{-1} \cdot v = v \; \forall g , \; \forall v, the group element g must be invertible, so it lives in \mathrm{GL}(V), the group of linear transformations with nonzero determinant. The term for this type of action, or equivalently a homomorphism \rho : G \longrightarrow \mathrm{GL}(V), is a representation of G.

Here’s a basic example of a representation: let G = \mathbb{Z}/n\mathbb{Z} for some n \geq 2. We can define \rho: G \longrightarrow \mathrm{GL}(2, \mathbb{R}) by sending

m \mapsto \begin{pmatrix} \cos(2m\pi/n) & -\sin(2m\pi/n) \\ \sin(2m\pi/n) & \cos(2m\pi/n) \end{pmatrix}.

This identifies G as the group of rotations of order n about the origin.
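To see that this really is a representation, here is a minimal numerical sketch. It assumes NumPy is available; the function name `rho` and the choice n = 6 are just for illustration. It checks the homomorphism property and that the generator has order n.

```python
import numpy as np

def rho(m: int, n: int) -> np.ndarray:
    """Rotation by the angle 2*pi*m/n, the image of m in Z/nZ."""
    theta = 2.0 * np.pi * m / n
    return np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])

n = 6  # illustrative choice of n

# Homomorphism property: rho(a + b mod n) equals rho(a) rho(b).
homomorphism_ok = all(
    np.allclose(rho((a + b) % n, n), rho(a, n) @ rho(b, n))
    for a in range(n) for b in range(n)
)

# The generator rho(1) has order n: its n-th power is the identity.
order_ok = np.allclose(np.linalg.matrix_power(rho(1, n), n), np.eye(2))

print(homomorphism_ok, order_ok)
```

Addition mod n on the left turns into matrix multiplication on the right, which is all that "representation" means here.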

The group G = \mathrm{GL}(V) has additional structure. If dim(V) = n as a vector space over F = \mathbb{R} \textrm{ or } \mathbb{C}, we can identify G as a subset of M^{n \times n}(F) \simeq F^{n^2}, consisting of the n^2-tuples whose determinant is nonzero. This means G is an open subset of F^{n^2}. This gives G a geometric structure as a smooth manifold, which allows us to define things like tangent spaces, derivatives of functions, and vector fields in terms of G. The important point here is that the group structure is compatible with this geometric structure, that is, the group law f(g, h) = g \cdot h is a smooth function f : G \times G \longrightarrow G, and so is the inverse operation g \mapsto g^{-1} as a function G \longrightarrow G.

This motivates a definition. If G is both a group and a smooth manifold (think “open subset of \mathbb{R}^n or \mathbb{C}^n”), and the operations (g, h) \mapsto g\cdot h and g \mapsto g^{-1} define smooth maps, then we call G a Lie group. If H is another Lie group, we can also define a map of Lie groups \rho: G \longrightarrow H to be a group homomorphism which is also smooth.

An obvious example of a Lie group is just GL_{n}(\mathbb{R} \textit{ or  } \mathbb{C}) itself. This has the normal Lie subgroup SL_{n}(\mathbb{R} \textit{ or } \mathbb{C}) consisting of matrices with determinant 1.

Another example is the special unitary group U = \mathrm{SU}(2), which consists of 2\times 2 matrices with determinant 1 that preserve the complex inner product \langle v, w \rangle = \overline{v_1}w_1 + \overline{v_2}w_2 = v^{\dag}w, where v^{\dag} = (\overline{v})^{\intercal} is the conjugate transpose of v. If A \in U then we must have

\langle Av, Aw \rangle = (Av)^{\dag}(Aw) = v^{\dag}(A^{\dag}A)w = v^{\dag}w \; \mathrm{for \; all} \; v, w \in \mathbb{C}^2 \iff A^{\dag}A=I,

so that A^{-1}=A^{\dag}. Writing A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SU}(2), then since \mathrm{det}(A)=1, we must have

\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}^{\dag} = \begin{pmatrix} \bar{a} & \bar{c} \\ \bar{b} & \bar{d} \end{pmatrix}

\iff d = \overline{a}, c = -\overline{b}, \textrm{ and } ad-bc = 1 = |a|^2+|b|^2,

\Longrightarrow \mathrm{SU}(2) = \{ \begin{pmatrix} \alpha & \beta \\ -\overline{\beta} & \overline{\alpha} \end{pmatrix} \ \textrm{ with } |\alpha|^2+|\beta|^2=1\}.

Now if we let \alpha = x+iy, \beta = z+iw, we see that x^2+y^2+z^2+w^2=1, so that \textrm{SU}(2) is the 3-dimensional sphere S^3.
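The parametrization can be checked numerically. The sketch below assumes NumPy. The helper names `su2` and `in_su2` are made up for this example. It builds matrices from points of S^3 and confirms that they, and their products, are unitary with determinant 1.

```python
import numpy as np

def su2(x, y, z, w):
    """Matrix with alpha = x + iy, beta = z + iw, after normalising (x, y, z, w) onto S^3."""
    v = np.array([x, y, z, w], dtype=float)
    x, y, z, w = v / np.linalg.norm(v)
    alpha, beta = complex(x, y), complex(z, w)
    return np.array([[alpha, beta],
                     [-np.conj(beta), np.conj(alpha)]])

def in_su2(M):
    """Check A^dagger A = I and det(A) = 1 up to floating-point tolerance."""
    unitary = np.allclose(M.conj().T @ M, np.eye(2))
    det_one = np.isclose(np.linalg.det(M), 1.0)
    return unitary and det_one

rng = np.random.default_rng(0)
A = su2(*rng.normal(size=4))
B = su2(*rng.normal(size=4))

print(in_su2(A), in_su2(B), in_su2(A @ B))
```

The product A @ B lands back in the same family, which is the group structure on S^3 in concrete form.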

The nice thing about Lie groups is that, although in theory their geometry might be complicated, their group operations give us extra tools that we can use, which makes working with them a lot easier. For example, left multiplication by g, m_g(x) = g\cdot x is a smooth map with a smooth inverse – a diffeomorphism – such that m_g(1) = g, where 1 \in G is the identity. So every point in G is topologically the same as the identity, thus G is homogeneous. Further, if G is connected, then any neighborhood of the identity element 1 generates the whole group (you can show that the group generated by this neighborhood is both open and closed).

So a Lie group’s topology is described by the neighborhoods about the identity, and if it is connected, any arbitrarily small neighborhood gives us the whole group. We will go even further and consider infinitesimal transformations, which are vectors in the tangent space of G at the identity (these are “infinitesimally” close to 1).

First, note that while left-multiplication m_g(x) = g \cdot x is a diffeomorphism of G to itself, it isn’t a group homomorphism. We can fix this by instead using conjugation, defining C_g(x) = gxg^{-1}, which is an automorphism of G, C_g \in \mathrm{Aut}(G). In fact, C_g \circ C_h = C_{g\cdot h}, so that C: G \longrightarrow \mathrm{Aut}(G), \; C(g)(x) = C_g(x) is a homomorphism into the automorphism group of G.

We are one step away from finding a canonical representation of G, an action on a vector space. The differential of C_g at 1 \in G is a linear map denoted \mathrm{d}(C_g), that goes from \mathfrak{g} = T_1 G, the tangent space of G at 1, to T_{C_g(1)}G = T_{1}G (using local coordinates, the differential of a map is just the Jacobian matrix, A_{ij}=\frac{\partial f_i}{\partial x_j}). Let \mathrm{Ad}(g) = (\mathrm{d} C_g)_1 . Then using the chain rule,

\mathrm{Ad}(gh) = \mathrm{d}(C_{gh})=\mathrm{d}(C_g \circ C_h)=\mathrm{d}(C_g) \circ \mathrm{d}(C_h) = \mathrm{Ad}(g) \circ \mathrm{Ad}(h).

Thus \mathrm{Ad} : G \longrightarrow \mathrm{GL}(\mathfrak{g}) is a homomorphism, or in other words, a representation. We call this the adjoint representation of G. Our last step is taking the differential of \mathrm{Ad} at 1. This gives a map

\mathrm{ad} = \mathrm{d}(\mathrm{Ad})_1 : T_1 G \longrightarrow T_I \mathrm{GL}(\mathfrak{g})

Since \mathrm{GL}(\mathfrak{g}) is an open subset of the vector space \mathfrak{gl(g)} = \mathrm{End}(\mathfrak{g}) of linear maps from \mathfrak{g} to itself, its tangent space at any point can be identified with \mathfrak{gl(g)}. The map \mathrm{ad} then assigns to every X \in \mathfrak{g} a linear transformation \mathrm{ad}(X): \mathfrak{g} \longrightarrow \mathfrak{g}. This naturally gives an operation on \mathfrak{g}, where we combine two tangent vectors X, Y \in \mathfrak{g} by taking [X, Y] = \mathrm{ad}(X)(Y). We call [X, Y] the Lie bracket of X and Y. The vector space \mathfrak{g} together with the Lie bracket is the Lie algebra of G.
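For a concrete feel, here is a sketch of the adjoint representation as an honest matrix. It assumes G = \mathrm{GL}_3(\mathbb{R}), so that conjugation acts on 3 \times 3 matrices M by M \mapsto gMg^{-1}. NumPy and the helper name `Ad` are assumptions of this example. Flattening M row by row turns this linear map into a 9 \times 9 matrix given by a Kronecker product.

```python
import numpy as np

def Ad(g):
    """9x9 matrix of M -> g M g^{-1}, acting on row-major flattened 3x3 matrices."""
    return np.kron(g, np.linalg.inv(g).T)

rng = np.random.default_rng(1)
# Small perturbations of the identity, so both matrices are safely invertible.
g = np.eye(3) + 0.1 * rng.normal(size=(3, 3))
h = np.eye(3) + 0.1 * rng.normal(size=(3, 3))
M = rng.normal(size=(3, 3))

# Ad(g) really is conjugation by g once M is flattened.
acts_by_conjugation = np.allclose(Ad(g) @ M.flatten(),
                                  (g @ M @ np.linalg.inv(g)).flatten())

# Ad is a homomorphism into GL of the 9-dimensional tangent space.
homomorphism_ok = np.allclose(Ad(g @ h), Ad(g) @ Ad(h))

print(acts_by_conjugation, homomorphism_ok)
```

The Kronecker form relies on the identity vec(AMB) = (A \otimes B^{\intercal}) vec(M) for row-major flattening. Any other basis of the tangent space would work equally well.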

What exactly is this Lie bracket? Let’s assume our group is a matrix group, that is, a subgroup of \mathrm{GL}_n(F). Then the tangent space \mathfrak{g} is a subspace of \mathrm{End}(F^n), the space of n \times n matrices. The nice thing is that the map \mathrm{Ad} is still just conjugation, so \mathrm{Ad}(g)(M) = gMg^{-1} for any matrix M. Now if X, Y \in \mathfrak{g}, and \gamma : [0, 1] \longrightarrow G is a curve such that \gamma(0) = I \in G, \gamma'(0) = X, then using matrix derivatives,

[X, Y] = \frac{d}{dt}|_{t=0} (\mathrm{Ad}(\gamma(t))(Y)) = \frac{d}{dt}|_{t=0} (\gamma(t) Y \gamma(t)^{-1})

= \gamma'(0) Y \gamma(0)^{-1} + \gamma(0) Y (\gamma^{-1})'(0),

(Y does not depend on t, so the product rule produces only these two terms), and differentiating both sides of \gamma(t) \gamma(t)^{-1} = I at t=0 shows that (\gamma^{-1})'(0) = - \gamma(0)^{-1} \gamma'(0) \gamma(0)^{-1} = -X, which, together with \gamma(0) = I, gives us

[X, Y] = XY-YX.
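This derivative is easy to test with a finite difference. The sketch below assumes NumPy and uses the hypothetical curve \gamma(t) = I + tX, which lies in \mathrm{GL}_3(\mathbb{R}) for small t and satisfies \gamma(0) = I, \gamma'(0) = X. The step size `h` is an arbitrary small number.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(3, 3))
Y = rng.normal(size=(3, 3))

def gamma(t):
    """A curve through the identity with velocity X at t = 0."""
    return np.eye(3) + t * X

def conjugated(t):
    """The curve t -> gamma(t) Y gamma(t)^{-1} in the space of matrices."""
    g = gamma(t)
    return g @ Y @ np.linalg.inv(g)

# Central difference approximation of the derivative at t = 0.
h = 1e-5
derivative = (conjugated(h) - conjugated(-h)) / (2 * h)
commutator = X @ Y - Y @ X

print(np.max(np.abs(derivative - commutator)))
```

The printed discrepancy should be tiny compared with the entries of the commutator. Any other curve with the same velocity at the identity would give the same answer.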

So the bracket is just the commutator of matrices. With a little work, we can show two properties of [ \> , \> ] from this definition:

  • Bilinearity: [aX, Y] = [X, aY] = a[X, Y], [X+X', Y] = [X, Y]+[X', Y] and [X, Y+Y'] = [X, Y]+[X, Y']
  • Skew-symmetry: [X, Y] = -[Y, X]

Now since \mathrm{ad} : \mathfrak{g} \longrightarrow \mathrm{End}(\mathfrak{g}), and the latter space has a Lie bracket on it (the commutator of matrices), we should also expect that \mathrm{ad} preserves the Lie bracket. That means that

[[X, Y], Z] = \mathrm{ad}([X, Y]_{\mathfrak{g}})(Z) = [\mathrm{ad}(X), \mathrm{ad}(Y)]_{\mathrm{End}(\mathfrak{g})}(Z)

= \mathrm{ad}(X) \circ \mathrm{ad}(Y) (Z) -\mathrm{ad}(Y) \circ \mathrm{ad}(X) (Z) = [X, [Y, Z]] - [Y, [X, Z]].

Rearranging some terms using skew-symmetry gives us the third property:

  • Jacobi identity: [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0
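For the commutator bracket on matrices, both the Jacobi identity and the statement that \mathrm{ad} preserves brackets can be checked directly. This is a sketch assuming NumPy, with random 4 \times 4 matrices standing in for elements of \mathfrak{gl}_4. The helper name `bracket` is just for this example.

```python
import numpy as np

def bracket(A, B):
    """Commutator bracket [A, B] = AB - BA."""
    return A @ B - B @ A

rng = np.random.default_rng(3)
X, Y, Z = (rng.normal(size=(4, 4)) for _ in range(3))

# Jacobi identity: the cyclic sum of double brackets vanishes.
jacobi = (bracket(X, bracket(Y, Z))
          + bracket(Y, bracket(Z, X))
          + bracket(Z, bracket(X, Y)))

# ad([X, Y])(Z) compared with (ad(X) ad(Y) - ad(Y) ad(X))(Z).
lhs = bracket(bracket(X, Y), Z)
rhs = bracket(X, bracket(Y, Z)) - bracket(Y, bracket(X, Z))

print(np.allclose(jacobi, 0), np.allclose(lhs, rhs))
```

The two checks are the same identity in disguise: moving terms across with skew-symmetry turns one into the other.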

These three properties define a general Lie algebra: any (real/complex) vector space with a skew-symmetric, bilinear bracket satisfying the Jacobi identity is a Lie algebra. Note that in a general Lie algebra, the bracket [X, Y] is not necessarily the commutator XY-YX. If we identify \mathfrak{g} as a subalgebra of \mathfrak{gl}_n, then the bracket does become the commutator, but a priori there is no way to “multiply” elements of \mathfrak{g} intrinsically: we could have XY, YX \notin \mathfrak{g} even though XY-YX \in \mathfrak{g}.

The natural question is whether any general Lie algebra is the Lie algebra of some Lie group. The answer lies in the exponential function. First, we use Ado’s theorem to make our lives easier. This just says that any finite-dimensional Lie algebra is isomorphic to a Lie subalgebra (a subspace which contains the brackets of all its elements) of the matrix algebra \mathfrak{gl}_n, for some n, equipped with the commutator bracket. If your Lie algebra is a subalgebra of \mathfrak{gl}(V), then the exponential function is just the matrix exponential, which is given by the power series

\exp(X) = I + \sum_{n=1}^{\infty} \frac{1}{n!} X^n.

This converges for any X \in \mathfrak{gl}(V). If X and Y commute with each other, we get \exp(X+Y)=\exp(X) \cdot \exp(Y). In particular \exp(X)\exp(-X) = \exp(0) = I, so \exp(X) is invertible and \exp takes values in \mathrm{GL}(V). Also, since X commutes with its scalar multiples, for every X \in \mathfrak{gl}(V), the map f : \mathbb{R} \longrightarrow \mathrm{GL}(V), f(t) = \exp(tX), is a group homomorphism. In fact, for closed subgroups G \leq \mathrm{GL}_n(\mathbb{C}), the Lie algebra of G consists of precisely those X \in \mathfrak{gl}_n(\mathbb{C}) for which this homomorphism has its image in G, so that \exp(tX) \in G \; \forall \; t \in \mathbb{R}.

This allows us to compute the Lie algebras of some Lie groups explicitly, by differentiating the exponential map and using the property that

\frac{d}{dt} \exp(tX) = X \exp(tX) = \exp(tX) X.
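These properties of the matrix exponential can be sketched numerically. The code assumes NumPy. The helper `expm_series` is a hypothetical truncated power series written for this example. It is fine for small matrices of modest norm but is not a production-quality exponential.

```python
import numpy as np

def expm_series(X, terms=40):
    """Truncated power series I + X + X^2/2! + ... for the matrix exponential."""
    X = np.asarray(X, dtype=complex)
    total = np.eye(X.shape[0], dtype=complex)
    term = np.eye(X.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ X / k          # term is now X^k / k!
        total = total + term
    return total

rng = np.random.default_rng(4)
X = rng.normal(size=(3, 3))

# t -> exp(tX) is a homomorphism from (R, +): exp(sX) exp(tX) = exp((s + t)X).
one_parameter_ok = np.allclose(expm_series(0.3 * X) @ expm_series(0.5 * X),
                               expm_series(0.8 * X))

# d/dt exp(tX) = X exp(tX) = exp(tX) X, checked with a central difference.
t, h = 0.4, 1e-6
numeric = (expm_series((t + h) * X) - expm_series((t - h) * X)) / (2 * h)
E = expm_series(t * X)
derivative_ok = np.allclose(numeric, X @ E, atol=1e-6) and np.allclose(X @ E, E @ X)

# For non-commuting matrices, exp(A + B) differs from exp(A) exp(B).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
product_rule_fails = not np.allclose(expm_series(A + B), expm_series(A) @ expm_series(B))

print(one_parameter_ok, derivative_ok, product_rule_fails)
```

The last check is a reminder that \exp(X+Y)=\exp(X)\exp(Y) genuinely needs X and Y to commute.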

For example consider the matrix group \mathrm{GL}_n(\mathbb{C}). For any n \times n matrix X, \exp(tX) is always invertible, so the Lie algebra of \mathrm{GL}_n(\mathbb{C}) is just \mathfrak{gl}_n(\mathbb{C}).

What about its subgroup G = \mathrm{SL}_n(\mathbb{C})? It’s easy to check that the eigenvalues of \exp(X) are just e^{\lambda} where \lambda is an eigenvalue of X. The determinant of \exp(X) is just the product of all these eigenvalues,

\det(\exp(X)) = e^{\lambda_1}\cdot e^{\lambda_2} \cdot ... \cdot e^{\lambda_n}

= \exp(\lambda_1+\lambda_2+...+\lambda_n) = \exp(\mathrm{Tr}(X)),

where \mathrm{Tr}(X) is the sum of the eigenvalues, or trace of X. Now X lies in \mathfrak{sl}_n(\mathbb{C}) exactly when \exp(tX) \in \mathrm{SL}_n(\mathbb{C}) for all t, that is, when

\det(\exp(tX)) = 1 \; \forall \; t \in \mathbb{R}

\iff \exp(\mathrm{Tr}(tX)) = \exp(t \mathrm{Tr}(X)) = 1 \; \forall \; t \in \mathbb{R} \iff \mathrm{Tr}(X) = 0.
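The identity \det(\exp(X)) = \exp(\mathrm{Tr}(X)) and its consequence for traceless matrices can be checked numerically. This sketch assumes NumPy. `expm_series` is a hypothetical truncated-series helper for this example, and the values of t are arbitrary.

```python
import numpy as np

def expm_series(X, terms=40):
    """Truncated power series for the matrix exponential (illustrative only)."""
    X = np.asarray(X, dtype=complex)
    total = np.eye(X.shape[0], dtype=complex)
    term = np.eye(X.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ X / k
        total = total + term
    return total

rng = np.random.default_rng(5)
X = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

# det(exp(X)) agrees with exp(Tr(X)) for a generic complex matrix.
det_matches = np.isclose(np.linalg.det(expm_series(X)), np.exp(np.trace(X)))

# Subtracting Tr(X)/n times the identity makes X traceless ...
X0 = X - (np.trace(X) / 3) * np.eye(3)
# ... and then exp(t X0) has determinant 1 for every t sampled here.
dets = [np.linalg.det(expm_series(t * X0)) for t in (0.25, 0.5, 1.0)]

print(det_matches, dets)
```

Sampling a few values of t is of course not a proof, but it shows the one-parameter subgroup staying inside \mathrm{SL}_3(\mathbb{C}).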

So \mathfrak{sl}_n consists of all the traceless n \times n matrices. Similarly, by differentiating both sides of

\exp(tX)\exp(tX)^{\dag} = I

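at t = 0, we can show that the elements of the Lie algebra \mathfrak{su}(n) of the special unitary group \mathrm{SU}(n) are skew-Hermitian n \times n matrices, that is, matrices X for which X + X^{\dag} = 0. Combined with the determinant condition above, \mathfrak{su}(n) consists of exactly the traceless skew-Hermitian matrices.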
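Going in the other direction, exponentiating a skew-Hermitian matrix should give a unitary one. The sketch below assumes NumPy and a hypothetical truncated-series helper `expm_series`. It also removes the trace of X, since determinant 1 requires \mathrm{Tr}(X) = 0.

```python
import numpy as np

def expm_series(X, terms=40):
    """Truncated power series for the matrix exponential (illustrative only)."""
    X = np.asarray(X, dtype=complex)
    total = np.eye(X.shape[0], dtype=complex)
    term = np.eye(X.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ X / k
        total = total + term
    return total

rng = np.random.default_rng(6)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

X = M - M.conj().T                      # skew-Hermitian: X + X^dagger = 0
X = X - (np.trace(X) / 3) * np.eye(3)   # the trace is purely imaginary; remove it

U = expm_series(0.7 * X)                # 0.7 is an arbitrary value of t

skew_hermitian = np.allclose(X + X.conj().T, 0)
unitary = np.allclose(U @ U.conj().T, np.eye(3))
det_one = np.isclose(np.linalg.det(U), 1.0)

print(skew_hermitian, unitary, det_one)
```

Subtracting an imaginary multiple of the identity keeps X skew-Hermitian, so both conditions hold at once.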

For g \in \mathrm{GL}(V) in a small neighborhood of the identity, the exponential map has an inverse called the logarithm, which is again given by a power series

\log(g) = (g-I)-\frac{(g-I)^2}{2}+\frac{(g-I)^3}{3}-..., \; \exp(\log(g))=g.
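Here is a sketch of the logarithm series in action. It assumes NumPy. Both `expm_series` and `logm_series` are hypothetical truncated-series helpers. The scale factor 0.05 is chosen only to keep g close enough to the identity for the series to converge quickly.

```python
import numpy as np

def expm_series(X, terms=40):
    """Truncated power series for the matrix exponential (illustrative only)."""
    X = np.asarray(X, dtype=complex)
    total = np.eye(X.shape[0], dtype=complex)
    term = np.eye(X.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ X / k
        total = total + term
    return total

def logm_series(g, terms=60):
    """Truncated series (g-I) - (g-I)^2/2 + (g-I)^3/3 - ..., for g near the identity."""
    g = np.asarray(g, dtype=complex)
    n = g.shape[0]
    A = g - np.eye(n)
    total = np.zeros((n, n), dtype=complex)
    power = np.eye(n, dtype=complex)
    for k in range(1, terms):
        power = power @ A                      # power is now (g - I)^k
        total = total + ((-1) ** (k + 1)) * power / k
    return total

rng = np.random.default_rng(7)
X = 0.05 * rng.normal(size=(3, 3))   # a small element of gl_3
g = expm_series(X)                   # a group element near the identity

log_of_exp = np.allclose(logm_series(g), X)
exp_of_log = np.allclose(expm_series(logm_series(g)), g)

print(log_of_exp, exp_of_log)
```

Far from the identity the series stops converging, which is why the logarithm is only a local inverse.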

Suppose G is connected. Then because the logarithm map exists in some neighborhood of 1 \in G, this neighborhood lies in the image of the exponential map. Since this neighborhood also generates G, the image \exp(\mathfrak{g}) generates G. Conversely, any finite-dimensional Lie algebra \mathfrak{g}, embedded in \mathfrak{gl}_n(\mathbb{C}) via Ado’s theorem, is the Lie algebra of the Lie group G generated by the image \exp(\mathfrak{g}) in \mathrm{GL}_n(\mathbb{C}).

If X, Y \in \mathfrak{g} are close to the origin, so that \exp(X), \; \exp(Y) and \exp(X) \cdot \exp(Y) are close to the identity in G, then \exp(X) \cdot \exp(Y) is the exponential of some vector Z \in \mathfrak{g}. This Z is given by

Z = \log(\exp(X) \cdot \exp(Y)),

and we use X * Y to denote \log(\exp(X) \cdot \exp(Y)).

This might be unsatisfying, because the definition of X * Y involves the map \exp, which is not defined intrinsically in terms of the Lie algebra itself: we can’t take powers of X \in \mathfrak{g} and expect them to always be in \mathfrak{g}. We would like to define Z in terms of the Lie algebra, which means only using the Lie bracket and linear combinations. However, the shocking result of the Baker-Campbell-Hausdorff formula is that although \exp and \log don’t involve the Lie bracket in their definitions, X * Y can be written in terms of brackets. The first few terms are

X * Y = X + Y + \frac{1}{2} [X, Y] + \frac{1}{12} [X, [X, Y]] + \frac{1}{12} [Y, [Y, X]] - \frac{1}{24} [Y, [X, [X, Y]]] + ...
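The truncated formula can be compared against \log(\exp(X) \cdot \exp(Y)) numerically. This sketch assumes NumPy. The helpers `expm_series`, `logm_series` and `bracket` are hypothetical names for this example. X and Y are random matrices scaled by 0.02 so that everything stays near the identity.

```python
import numpy as np

def expm_series(X, terms=40):
    """Truncated power series for the matrix exponential (illustrative only)."""
    X = np.asarray(X, dtype=complex)
    total = np.eye(X.shape[0], dtype=complex)
    term = np.eye(X.shape[0], dtype=complex)
    for k in range(1, terms):
        term = term @ X / k
        total = total + term
    return total

def logm_series(g, terms=60):
    """Truncated logarithm series, valid for g near the identity."""
    g = np.asarray(g, dtype=complex)
    n = g.shape[0]
    A = g - np.eye(n)
    total = np.zeros((n, n), dtype=complex)
    power = np.eye(n, dtype=complex)
    for k in range(1, terms):
        power = power @ A
        total = total + ((-1) ** (k + 1)) * power / k
    return total

def bracket(A, B):
    """Commutator bracket [A, B] = AB - BA."""
    return A @ B - B @ A

rng = np.random.default_rng(8)
X = 0.02 * rng.normal(size=(3, 3))
Y = 0.02 * rng.normal(size=(3, 3))

Z = logm_series(expm_series(X) @ expm_series(Y))   # X * Y computed directly

order1 = X + Y
order2 = order1 + bracket(X, Y) / 2
order4 = (order2
          + bracket(X, bracket(X, Y)) / 12
          + bracket(Y, bracket(Y, X)) / 12
          - bracket(Y, bracket(X, bracket(X, Y))) / 24)

# Largest entrywise error of each truncation against the direct computation.
errors = [np.max(np.abs(Z - approx)) for approx in (order1, order2, order4)]
print(errors)
```

Each extra layer of brackets should shrink the error, since the neglected terms involve more factors of the small matrices X and Y.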

The \exp map and the Baker-Campbell-Hausdorff formula make the philosophy that the Lie algebra “locally” describes the nature of its Lie group precise. In a neighborhood of the identity, not only can we write everything as the exponential of something in the Lie algebra, but then using only the structure of the Lie bracket, we can also compute the product of elements in this neighborhood by Baker-Campbell-Hausdorff.
