In my post last week on the policy gradient theorem, I mentioned that the matrix inverse has a nice derivative:

$$\partial (A^{-1}) = -A^{-1}\, (\partial A)\, A^{-1},$$

where we denote $\partial = \frac{d}{dt}$ and $A = A(t)$, a smooth curve of invertible matrices. After writing this, I wondered what other differential equations of this sort have solutions. For a simple example, consider

$$Df(X)[H] = f(X)\, H.$$

Setting $g(t) = f(tX)$ and differentiating in $t$ gives

$$g'(t) = Df(tX)[X] = g(t)\, X,$$

which implies $f(X) = e^X$ for a choice of initial condition $f(0) = I$. However, we know that $D\exp(X)[H] \neq e^X H$ in general for non-commutative $X$ and $H$. In fact, the only function solving this equation for $n \times n$ matrices with $n > 1$ turns out to be $f \equiv 0$. How can we tell in advance when such a differential equation will have non-trivial solutions?
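Both claims are easy to poke at numerically. Here is a small pure-Python sketch (all helper names are ad hoc for this sketch, and the matrix exponential is a truncated power series) that compares a finite difference against the inverse-derivative formula $-A^{-1}\dot{A}A^{-1}$, and then shows the naive product rule $D\exp(X)[H] = e^X H$ failing for a non-commuting pair:

```python
# A numerical sanity check, in pure Python with 2x2 matrices as nested
# lists.  All helper names here are ad hoc for this sketch.

def mm(A, B):  # matrix product
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def madd(A, B, s=1.0):  # A + s*B
    return [[A[i][j] + s * B[i][j] for j in range(2)] for i in range(2)]

def minv(A):  # inverse of a 2x2 matrix
    d = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / d, -A[0][1] / d], [-A[1][0] / d, A[0][0] / d]]

def mexp(A, terms=30):  # matrix exponential, truncated power series
    S = [[1.0, 0.0], [0.0, 1.0]]
    T = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, terms):
        T = [[t / n for t in row] for row in mm(T, A)]
        S = madd(S, T)
    return S

h = 1e-6

# d/dt (A + t*Adot)^{-1} at t = 0 versus -A^{-1} Adot A^{-1}:
A, Adot = [[2.0, 1.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]]
fd = madd(minv(madd(A, Adot, h)), minv(madd(A, Adot, -h)), -1.0)
fd = [[x / (2 * h) for x in row] for row in fd]
formula = [[-x for x in row] for row in mm(minv(A), mm(Adot, minv(A)))]
err = max(abs(fd[i][j] - formula[i][j]) for i in range(2) for j in range(2))

# d/dt exp(X + t*H) at t = 0 versus exp(X) H, with XH != HX:
X, H = [[0.0, 1.0], [0.0, 0.0]], [[0.0, 0.0], [1.0, 0.0]]
dexp = madd(mexp(madd(X, H, h)), mexp(madd(X, H, -h)), -1.0)
dexp = [[x / (2 * h) for x in row] for row in dexp]
naive = mm(mexp(X), H)
gap = max(abs(dexp[i][j] - naive[i][j]) for i in range(2) for j in range(2))

print(err, gap)  # err is tiny; gap is not
```

For this particular nilpotent pair one can even work out $D\exp(X)[H]$ in closed form and see that the gap is exactly $1/2$ in two of the entries.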
Let's consider this problem from the point of view of differential geometry. Let $u \colon \Omega \subseteq \mathbb{R}^n \to \mathbb{R}^m$ be a smooth map and let $Du$ be its total differential. We are curious to know when the equation

$$Du(x) = F(x, u(x)), \qquad \text{i.e.}\quad \frac{\partial u^j}{\partial x^i} = F^j_i(x, u(x)),$$

admits local solutions. Clearly, making nice regularity assumptions about $F$—which we will make use of liberally in the following—solutions to this equation are unique over path-connected domains, should they exist. Indeed, taking some path $\gamma \colon [0,1] \to \Omega$, the equation above gives us enough information to compute

$$\frac{d}{dt}\, u(\gamma(t)) = \sum_i \dot\gamma^i(t)\, F_i(\gamma(t), u(\gamma(t))).$$

Thus, subject to a choice for the value of $u$ at some point along $\gamma$, $u \circ \gamma$ is determined as a unique solution to this ODE. On the other hand, it may happen that the values of $u$ we obtain by fixing its value at some point and solving ODEs along different paths are not path-independent, in which case our equation won't have a solution.
An intuition for differential geometry tells us that path-independence of this integral over a simply connected domain will come down to a system of equations involving the first partial derivatives of $F$. The nicest way to figure out exactly what these are is by using the Frobenius theorem. Let us introduce coordinates $(x^1, \dots, x^n, u^1, \dots, u^m)$ on $\mathbb{R}^n \times \mathbb{R}^m$ and define the vector fields

$$V_i = \frac{\partial}{\partial x^i} + \sum_j F^j_i(x, u)\, \frac{\partial}{\partial u^j}.$$

It is fairly clear that a (local) solution to our equation is the same as an integral submanifold for the distribution spanned by $V_1, \dots, V_n$. Furthermore, involutivity of our distribution in this case boils down to the equations

$$[V_i, V_k] = 0,$$

for the simple reason that $[V_i, V_k]$ is, at each point, a linear combination of the tangent vectors $\partial/\partial u^j$, and our distribution admits no elements of this form except $0$.
These brackets are readily computed, taking a bit of care with indices of summation:

$$[V_i, V_k] = \sum_j \left( \frac{\partial F^j_k}{\partial x^i} - \frac{\partial F^j_i}{\partial x^k} + \sum_l \left( F^l_i\, \frac{\partial F^j_k}{\partial u^l} - F^l_k\, \frac{\partial F^j_i}{\partial u^l} \right) \right) \frac{\partial}{\partial u^j}.$$

From this, we can (perhaps in a future post) understand something about what matrix operator differential equations can be integrated.
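To make the coefficient concrete, here is a finite-difference check in the smallest interesting case $n = 2$, $m = 1$ (both example systems are my own toy choices): for $F_1 = u$, $F_2 = u$, which integrates to $u = C e^{x^1 + x^2}$, the coefficient vanishes; for $F_1 = u$, $F_2 = x^1 u$ it comes out to $u \neq 0$:

```python
# Finite-difference check of the bracket coefficient for n = 2, m = 1.
# Both example systems below are my own toy choices.

def coeff(F1, F2, x1, x2, u, h=1e-5):
    # d_{x1} F2 - d_{x2} F1 + F1 * d_u F2 - F2 * d_u F1, via central
    # differences
    dF2_dx1 = (F2(x1 + h, x2, u) - F2(x1 - h, x2, u)) / (2 * h)
    dF1_dx2 = (F1(x1, x2 + h, u) - F1(x1, x2 - h, u)) / (2 * h)
    dF2_du = (F2(x1, x2, u + h) - F2(x1, x2, u - h)) / (2 * h)
    dF1_du = (F1(x1, x2, u + h) - F1(x1, x2, u - h)) / (2 * h)
    return (dF2_dx1 - dF1_dx2
            + F1(x1, x2, u) * dF2_du - F2(x1, x2, u) * dF1_du)

# du/dx1 = u, du/dx2 = u integrates to u = C e^{x1 + x2}: coefficient 0.
good = coeff(lambda x1, x2, u: u, lambda x1, x2, u: u, 0.4, 0.7, 2.0)

# du/dx1 = u, du/dx2 = x1 * u admits no solution: coefficient u != 0.
bad = coeff(lambda x1, x2, u: u, lambda x1, x2, u: x1 * u, 0.4, 0.7, 2.0)

print(good, bad)  # ≈ 0.0 and ≈ 2.0 (the value of u)
```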
This result is actually an exercise in Lee's Introduction to Smooth Manifolds, in the chapter on the Frobenius theorem. However, if we forget to use the Frobenius theorem—as I did when I considered this question a few days ago—we can also discover the utility of Lie brackets for ourselves.
Suppose for simplicity that $u$ is real-valued, and consider an equation of the simpler type

$$du = \omega.$$

(Here, $\omega$ is a differential $1$-form.) If a solution exists, we can recover it by integrating $\omega$ over paths. Furthermore, path-independence of our integral is the same as saying that it vanishes over loops.
When do our loop integrals vanish? Stokes' theorem gives the answer: when our domain is simply connected, every loop is the boundary of a disk, a disk can be partitioned into little tiny subregions, and integrals over big loops are sums of many integrals over little tiny loops. So for our equation to be integrable, we just have to check that the $2$-form telling us the integral of $\omega$ over little tiny loops—its exterior derivative $d\omega$—vanishes.
Let's recall the usual way these little-tiny-loop integrals are computed. Think of a $1$-form $\omega = \sum_i \omega_i\, dx^i$ near the origin and integrate over loops traversing the points

$$0, \quad \epsilon v, \quad \epsilon v + \epsilon w, \quad \epsilon w$$

for two basis vectors $v = e_i$ and $w = e_j$. We have

$$\oint \omega = \int_0^\epsilon \big( \omega_j(\epsilon e_i + t e_j) - \omega_j(t e_j) \big)\, dt - \int_0^\epsilon \big( \omega_i(t e_i + \epsilon e_j) - \omega_i(t e_i) \big)\, dt.$$

By making substitutions of the form $\omega_i(t e_i + \epsilon e_j) - \omega_i(t e_i) = \int_0^\epsilon \partial_j \omega_i(t e_i + s e_j)\, ds$ within the integral, we get the Stokes'-theorem-type formula

$$\oint \omega = \int_0^\epsilon \!\! \int_0^\epsilon \big( \partial_i \omega_j - \partial_j \omega_i \big)(t e_i + s e_j)\, ds\, dt.$$

In particular, the approximation for small $\epsilon$ is

$$\oint \omega \approx \epsilon^2\, \big( \partial_i \omega_j - \partial_j \omega_i \big)(0).$$

The scalars $\partial_i \omega_j - \partial_j \omega_i$ are exactly the coefficients of the exterior derivative $d\omega$.
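We can watch this approximation happen numerically. The sketch below (the test form is an arbitrary choice of mine) integrates $\omega = xy\, dx + x^2\, dy$ around a small square based at $(x_0, y_0)$ and divides by $\epsilon^2$; since $d\omega = (2x - x)\, dx \wedge dy = x\, dx \wedge dy$, the ratio should come out to roughly $x_0$:

```python
# Numerically recover the coefficient of d(omega) from a small-loop
# integral.  Test form (arbitrary choice): omega = x*y dx + x^2 dy,
# whose exterior derivative is (2x - x) dx^dy = x dx^dy.

def loop_integral(P, Q, corners, substeps=1000):
    # midpoint-rule line integral of P dx + Q dy around a closed polygon
    total = 0.0
    n = len(corners)
    for i in range(n):
        (ax, ay), (bx, by) = corners[i], corners[(i + 1) % n]
        for k in range(substeps):
            t = (k + 0.5) / substeps
            x, y = ax + t * (bx - ax), ay + t * (by - ay)
            total += (P(x, y) * (bx - ax) + Q(x, y) * (by - ay)) / substeps
    return total

P = lambda x, y: x * y
Q = lambda x, y: x * x

x0, y0, eps = 0.5, 0.3, 1e-2
square = [(x0, y0), (x0 + eps, y0), (x0 + eps, y0 + eps), (x0, y0 + eps)]
ratio = loop_integral(P, Q, square) / eps**2
print(ratio)  # ≈ x0 = 0.5
```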
Can we do the same thing for a system of equations

$$\frac{\partial u^j}{\partial x^i} = F^j_i(x, u)?$$

Path integrals of a sort can still be defined; for any path $\gamma$ parameterized on $[0, 1]$ and initial value $u_0$, define $\int_\gamma F$ to equal $u(1)$, where $u$ solves the ODE

$$\dot u(t) = \sum_i \dot\gamma^i(t)\, F_i(\gamma(t), u(t)), \qquad u(0) = u_0.$$

Figuring out a Stokes' theorem in this situation will be considerably more confusing, since a path integral over an infinitesimal loop is now a vector field on the codomain of $u$. Nevertheless, we have the inkling that taking the limit $\epsilon \to 0$ should give us some functions playing a role similar to the one played above by the coefficients of the exterior derivative. Unfortunately, computing this limit seems pretty hopeless without either divine intuition or hard work.
One easy way out is to forget about path integrals and instead use the symmetry of partial derivatives of a prospective solution $u$. Taking partial derivatives with respect to $x^k$ of

$$\frac{\partial u^j}{\partial x^i} = F^j_i(x, u)$$

gives

$$\frac{\partial^2 u^j}{\partial x^k\, \partial x^i} = \frac{\partial F^j_i}{\partial x^k} + \sum_l \frac{\partial F^j_i}{\partial u^l}\, F^l_k,$$

from which we conclude that

$$\frac{\partial F^j_i}{\partial x^k} + \sum_l \frac{\partial F^j_i}{\partial u^l}\, F^l_k = \frac{\partial F^j_k}{\partial x^i} + \sum_l \frac{\partial F^j_k}{\partial u^l}\, F^l_i.$$

Actually, these functions are exactly what we would find by computing the small-$\epsilon$ approximation to our loop integrals, as we will see next. The "divine intuition" we will need is to relate the flow of vector fields to the Lie bracket.
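When this condition fails, the failure shows up concretely as path-dependence of the "path integral" above. In the following sketch (my own toy system again, with crude Euler integration), we take $F_1 = u$ and $F_2 = x^1 u$, for which the two sides of the condition differ by $u$, and integrate the ODE along the two edge-paths of the unit square from $(0,0)$ to $(1,1)$:

```python
# Path-dependence of the "path integral" when the integrability condition
# fails.  Toy system (my own choice): du/dx1 = u, du/dx2 = x1 * u, for
# which the two sides of the condition differ by u.

def integrate(F1, F2, path, u0, steps=100_000):
    # path: straight segments ((ax, ay) -> (bx, by)); Euler steps on each
    u = u0
    for (ax, ay), (bx, by) in path:
        for k in range(steps):
            t = k / steps
            x1, x2 = ax + t * (bx - ax), ay + t * (by - ay)
            u += ((bx - ax) * F1(x1, x2, u)
                  + (by - ay) * F2(x1, x2, u)) / steps
    return u

F1 = lambda x1, x2, u: u
F2 = lambda x1, x2, u: x1 * u

right_then_up = [((0, 0), (1, 0)), ((1, 0), (1, 1))]
up_then_right = [((0, 0), (0, 1)), ((0, 1), (1, 1))]

ua = integrate(F1, F2, right_then_up, 1.0)
ub = integrate(F1, F2, up_then_right, 1.0)
print(ua, ub)  # ≈ e^2 and ≈ e: same endpoints, different values
```

Going right then up multiplies the initial value by $e$ twice, while going up first does nothing on the first leg, so the two answers are $e^2$ and $e$; no single-valued $u$ can satisfy both.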
One of the many fun ways you get Lie brackets to show up is by developing a commutator product of formal exponential series in two non-commuting variables $A$ and $B$:

$$e^{sA}\, e^{tB}\, e^{-sA}\, e^{-tB} = 1 + st\, [A, B] + (\text{higher-order terms}), \qquad [A, B] = AB - BA.$$

The operator sending vector fields to their flows can also be dealt with formally as an exponential map. Specifically, when $\Phi^t_V(p)$ denotes the image of $p$ under the flow of $V$ for time $t$, the estimate

$$\Phi^{-t}_W \circ \Phi^{-s}_V \circ \Phi^{t}_W \circ \Phi^{s}_V(p) = p + st\, [V, W]_p + (\text{higher-order terms})$$

can be proven from the formal calculation above by looking at things from the right angle. The key observation is that, if $\gamma$ is an integral curve of $V$, then for any smooth function $f$ we have

$$\frac{d}{dt}\, f(\gamma(t)) = (Vf)(\gamma(t)).$$

Iterating then gives

$$\frac{d^k}{dt^k}\, f(\gamma(t)) = (V^k f)(\gamma(t)).$$

Using the suggestive notation $f(\Phi^t_V(p)) = (e^{tV} f)(p)$, we conclude that $e^{tV} = \sum_k \frac{t^k}{k!} V^k$ gives the right power series for $f \circ \Phi^t_V$. More precisely, we mean that evaluating this series at any point encodes the higher derivatives of $f(\Phi^t_V(p))$ with respect to $t$. Furthermore, a bit of thought reveals that this correspondence from families of smooth maps to formal power series of differential operators (really, asymptotic series) is an antihomomorphism for composition. This curious observation simplifies the derivation of many series approximations involving compositions of flows; for example,

$$f\big(\Phi^{-t}_W \circ \Phi^{-s}_V \circ \Phi^{t}_W \circ \Phi^{s}_V(p)\big) = \big(e^{sV}\, e^{tW}\, e^{-sV}\, e^{-tW} f\big)(p) = f(p) + st\, ([V, W]f)(p) + \cdots$$

With this result in hand, let's return to our equation

$$\frac{\partial u^j}{\partial x^i} = F^j_i(x, u).$$

The vector fields $V_i$ that we defined above have another use: their flows give "path integrals" along coordinate axes. In particular,

$$\Phi^{-t}_{V_k} \circ \Phi^{-s}_{V_i} \circ \Phi^{t}_{V_k} \circ \Phi^{s}_{V_i}(x, u) = (x, u) + st\, [V_i, V_k]_{(x, u)} + (\text{higher-order terms}),$$

so the first-order failure of path-independence around a small coordinate rectangle is measured exactly by the brackets $[V_i, V_k]$. For me, this is a nice way to see why Lie brackets express the integrability conditions of our differential equation so well.
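The commutator-of-flows estimate is also pleasant to verify numerically. In the sketch below (the fields and step counts are arbitrary choices of mine), we take $V = -y\,\partial_x + x\,\partial_y$ and $W = \partial_x$, for which $[V, W] = -\partial_y$, integrate the four flows with a classical Runge-Kutta step, and divide the resulting displacement by $st$:

```python
# Numerical check of the commutator-of-flows estimate.  The fields are an
# arbitrary choice: V = -y d/dx + x d/dy (rotation), W = d/dx
# (translation), with [V, W] = -d/dy.

def flow(V, p, t, steps=400):
    # classical Runge-Kutta (RK4) integration of the flow of V for time t
    x, y = p
    h = t / steps
    for _ in range(steps):
        k1x, k1y = V(x, y)
        k2x, k2y = V(x + h / 2 * k1x, y + h / 2 * k1y)
        k3x, k3y = V(x + h / 2 * k2x, y + h / 2 * k2y)
        k4x, k4y = V(x + h * k3x, y + h * k3y)
        x += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
    return (x, y)

V = lambda x, y: (-y, x)
W = lambda x, y: (1.0, 0.0)

s = t = 0.02
p = (1.0, 0.5)
q = flow(V, p, s)    # flow along V for time s,
q = flow(W, q, t)    # then along W for time t,
q = flow(V, q, -s)   # then back along V,
q = flow(W, q, -t)   # then back along W
dx, dy = (q[0] - p[0]) / (s * t), (q[1] - p[1]) / (s * t)
print(dx, dy)  # ≈ (0, -1), the components of [V, W] at p
```

For this pair the flows are a rotation and a translation, so the displacement can also be computed in closed form as $(t(\cos s - 1),\, -t\sin s)$, matching $st\,[V, W]_p$ to first order.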