Gram-Schmidt: Tying together matrices and functions
A discussion of how two seemingly unrelated topics have an intimate connection.
Different topics in mathematics are normally taught separately, with very little discussion of how they are connected. Granted, most of the time the connection requires a deep understanding of the topics involved, and possibly of other, seemingly unrelated ones. In some cases, however, the connection is almost obvious.
Gram-Schmidt Orthogonalization
The Gram-Schmidt orthogonalization is a process that transforms a set of vectors (or functions) into a set of orthogonal (or orthonormal, depending on the formulation) vectors. It is a useful procedure if you want to perform the QR decomposition of a matrix, where Q is the matrix whose orthonormal columns are obtained by applying Gram-Schmidt to the columns of the original matrix.
Consider a matrix A with columns a_i. We want to generate a matrix Q with columns q_i such that the columns are orthonormal. In other words,

$$q_i \cdot q_j = \delta_{ij},$$

where $\delta_{ij}$ is the Kronecker delta (1 if i = j, 0 otherwise).
Gram-Schmidt gives us a procedure to get from A to Q. It is as follows. Let

$$A_1, A_2, \ldots, A_n$$

denote the intermediate orthogonal (but not yet normalized) vectors we will build from the columns $a_1, a_2, \ldots, a_n$. The choice of the first vector (the one with respect to which all the other vectors will be made orthogonal) is arbitrary. So,

$$A_1 = a_1, \qquad q_1 = \frac{A_1}{\lVert A_1 \rVert}.$$
Now, to get the next orthogonal vector, we need to remove any part of the vector $a_2$ that is parallel to the vector $A_1$. This can be done simply by

$$A_2 = a_2 - \frac{a_2 \cdot A_1}{A_1 \cdot A_1}\, A_1.$$
This process of removing the components parallel to the previous $A_i$ can be repeated for the remaining vectors, giving the general formula

$$A_n = a_n - \sum_{i=1}^{n-1} \frac{a_n \cdot A_i}{A_i \cdot A_i}\, A_i, \qquad q_n = \frac{A_n}{\lVert A_n \rVert}.$$
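To make the procedure concrete, here is a minimal NumPy sketch of classical Gram-Schmidt applied to the columns of a matrix. The function name `gram_schmidt` and the example matrix are just illustrative choices, and the code assumes the columns of A are linearly independent.

```python
import numpy as np

def gram_schmidt(A):
    """Classical Gram-Schmidt on the columns of A.

    Returns Q with orthonormal columns spanning the same space as the
    columns of A (assumes those columns are linearly independent).
    """
    A = np.asarray(A, dtype=float)
    n_rows, n_cols = A.shape
    Q = np.zeros((n_rows, n_cols))
    for j in range(n_cols):
        # Start from the original column a_j ...
        v = A[:, j].copy()
        # ... and subtract its projection onto every previously built q_i.
        for i in range(j):
            v -= (Q[:, i] @ A[:, j]) * Q[:, i]
        # Normalize to get the next orthonormal column.
        Q[:, j] = v / np.linalg.norm(v)
    return Q

# Quick check: Q^T Q should be (numerically) the identity.
A = np.array([[1.0, 1.0], [0.0, 1.0], [1.0, 2.0]])
Q = gram_schmidt(A)
print(np.round(Q.T @ Q, 10))
```

Up to the signs of the columns, the Q produced this way matches the Q factor returned by np.linalg.qr, which computes the same decomposition with a more numerically stable algorithm.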
Now one may wonder: how could a process that uses dot products and projections be applied to functions? One application arises in the Sturm-Liouville theory of differential equations.
Gram-Schmidt for functions
When looking at differential equations of the Sturm-Liouville form, we sometimes get multiple eigenfunctions for the same eigenvalue as solutions to the differential equation. This poses a problem for one of the postulates of Sturm-Liouville theory, which states that the eigenfunctions obtained as solutions to the differential equation are orthogonal to each other: eigenfunctions sharing an eigenvalue need not be orthogonal. Hence, we can apply Gram-Schmidt to make the eigenfunctions orthogonal to each other. The process is as follows.
Let

$$\phi_1, \phi_2, \ldots, \phi_n$$

be the eigenfunctions corresponding to an eigenvalue. We want to find eigenfunctions

$$\psi_1, \psi_2, \ldots, \psi_n$$

such that they are orthogonal,

$$\int \psi_i(x)\, \psi_j(x)\, dx = k\, \delta_{ij},$$

where $k = \int \psi_i(x)^2\, dx$ is the squared norm of the function and the integral is taken over the domain of the problem.
The procedure is as follows. As with the vectors, the first eigenfunction can be any one of the given eigenfunctions. So,

$$\psi_1 = \phi_1.$$
Now, we know that in order to make the second eigenfunction orthogonal to the first, we need to remove the part of the eigenfunction that is ‘parallel’ to the first, so we can assume the second eigenfunction is of the form

$$\psi_2 = \phi_2 - c\, \psi_1,$$

where c is a constant. We can determine c by requiring orthogonality:

$$\int \psi_1(x)\, \psi_2(x)\, dx = \int \psi_1(x)\,\bigl(\phi_2(x) - c\, \psi_1(x)\bigr)\, dx = 0.$$

Expanding the brackets and moving the terms around, we get

$$c = \frac{\int \psi_1(x)\, \phi_2(x)\, dx}{\int \psi_1(x)^2\, dx}.$$
As with the vectors, we can iterate this procedure and get the general formula

$$\psi_n = \phi_n - \sum_{i=1}^{n-1} \frac{\int \psi_i(x)\, \phi_n(x)\, dx}{\int \psi_i(x)^2\, dx}\, \psi_i.$$
This is the formula for finding orthogonal eigenfunctions. Making them orthonormal is then trivial: just divide each one by its norm.
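As a concrete illustration (not tied to any particular Sturm-Liouville problem), here is a small Python sketch that applies the same formula to ordinary callables, approximating the integrals with the trapezoidal rule. The interval [-1, 1], the helper names and the monomial example are assumptions made purely for the demo.

```python
import numpy as np

def inner(f, g, a=-1.0, b=1.0, n=2001):
    """Approximate the inner product <f, g> = integral of f(x) g(x) over [a, b]."""
    x = np.linspace(a, b, n)
    return np.trapz(f(x) * g(x), x)

def orthogonalize(phis, a=-1.0, b=1.0):
    """Apply Gram-Schmidt to a list of functions under the L2 inner product."""
    psis = []
    for phi in phis:
        # Projection coefficients onto the functions built so far.
        coeffs = [inner(p, phi, a, b) / inner(p, p, a, b) for p in psis]
        prev = list(psis)

        def psi(x, phi=phi, coeffs=coeffs, prev=prev):
            value = phi(x)
            for c, p in zip(coeffs, prev):
                value = value - c * p(x)
            return value

        psis.append(psi)
    return psis

# Orthogonalizing 1, x, x^2 on [-1, 1] reproduces the (unnormalized)
# Legendre polynomials 1, x, x^2 - 1/3.
monomials = [lambda x: np.ones_like(x), lambda x: x, lambda x: x**2]
psi0, psi1, psi2 = orthogonalize(monomials)
xs = np.array([-1.0, 0.0, 1.0])
print(np.round(psi2(xs), 3))               # approx [ 0.667 -0.333  0.667]
print(round(float(inner(psi1, psi2)), 6))  # approx 0.0
```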
The Connection
You might have already noticed that both formulas involve the same kind of terms and were derived in a very similar manner. For clarity’s sake, here are both formulas again:

$$A_n = a_n - \sum_{i=1}^{n-1} \frac{a_n \cdot A_i}{A_i \cdot A_i}\, A_i,$$

$$\psi_n = \phi_n - \sum_{i=1}^{n-1} \frac{\int \psi_i(x)\, \phi_n(x)\, dx}{\int \psi_i(x)^2\, dx}\, \psi_i.$$
It makes you wonder: is there a more abstract mathematical notion that encompasses vectors, functions, dot products and integrals all at once? The answer is yes, and it is called an inner product space. An inner product space is a vector space equipped with an inner product. Inner products are generalizations of the dot product; for continuous functions, the inner product is defined as the integral of the product of the two functions over the domain. In general, the inner product of two elements u and v is written as

$$\langle u, v \rangle,$$

so that $\langle u, v \rangle = u \cdot v$ for column vectors and $\langle f, g \rangle = \int f(x)\, g(x)\, dx$ for functions.
So, we can now write the Gram-Schmidt orthogonalization in a very general form as

$$e_n = v_n - \sum_{i=1}^{n-1} \frac{\langle v_n, e_i \rangle}{\langle e_i, e_i \rangle}\, e_i,$$

where the $v_n$ are the given elements of the inner product space and the $e_n$ are the orthogonal elements we construct from them.
So essentially, Gram-Schmidt can be used to generate orthogonal elements of any inner product space given elements from that inner product space. This is a very useful result that goes well beyond matrices and functions.
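To close, here is a short sketch of that general statement in Python: one routine that only knows about an inner product, reused for both of the earlier cases. The helper names and the sample data are invented for the example.

```python
import numpy as np

def gram_schmidt(vectors, inner):
    """Gram-Schmidt in an arbitrary inner product space.

    `vectors` is a list of elements supporting +, - and scalar multiplication;
    `inner(u, v)` is the inner product on that space.
    """
    ortho = []
    for v in vectors:
        e = v
        # Subtract the projection of v onto each element built so far.
        for u in ortho:
            e = e - (inner(u, v) / inner(u, u)) * u
        ortho.append(e)
    return ortho

# 1. Column vectors in R^3 with the ordinary dot product.
dot = lambda u, v: float(u @ v)
vecs = gram_schmidt([np.array([1.0, 0.0, 1.0]),
                     np.array([1.0, 1.0, 2.0])], dot)
print(round(dot(vecs[0], vecs[1]), 10))    # approx 0

# 2. Functions on [-1, 1], represented by their samples on a grid, with
#    <f, g> = integral of f(x) g(x) dx approximated by the trapezoidal rule.
x = np.linspace(-1.0, 1.0, 2001)
l2 = lambda f, g: float(np.trapz(f * g, x))
funcs = gram_schmidt([np.ones_like(x), x, x**2], l2)
print(round(l2(funcs[1], funcs[2]), 6))    # approx 0
```

The only thing that changes between the two cases is the inner product passed in; the orthogonalization routine itself is identical.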