Why You Should Think of Functions as Vectors.


Occasionally one might hear that functions are actually vectors living in the more abstract world of a vector space. This may seem like a useless generalization touted by practitioners of functional analysis, but it is in fact quite useful for understanding many concepts in physics.

Specifically, the intuition gained by modelling functions as vectors allows us to project functions onto different basis sets; to model differentiation and integration as linear operators, much like matrices; and to extract useful representations of functions in the context of self-adjoint operators.

Below, we'll first give a brief refresher on the axioms of vector spaces, and then show how these properties can be applied to functions.

Axioms of a Vector Space

For a vector space \(V\) over a field \(F\), with vectors denoted \(\vec{a},\vec{b}\) and scalars denoted \(\alpha,\beta\), we have (at least) two operations: one termed vector addition, denoted \(\vec{a}+\vec{b}\), and another termed scalar multiplication, denoted \(\alpha \vec{a}\). These operations are compatible with one another: scalar multiplication is associative with respect to the field's own multiplication (that is, \(\alpha(\beta\vec{a})=(\alpha\beta)\vec{a}\)), and scalar multiplication distributes over both vector addition and field addition.

See Terry Tao's overview for a more formal statement.

Functions as Vectors

Functions may be seen as vectors by making the following identifications: scalar multiplication is multiplication of the function's value at every point by the field element, and vector addition is pointwise addition of functions. That is,

\begin{align} \text{Scalar Multiplication:}&\quad (\alpha f)(x)=\alpha f(x)\\ \text{Vector Addition:}&\quad (f+g)(x)=f(x)+g(x)\\ \end{align}

It is straightforward to check that these two operations satisfy the vector space axioms.

Below, we consider just one-dimensional, complex-valued functions as a running example, with the complex numbers as the underlying scalar field.
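As a concrete sketch of these identifications, the pointwise operations can be written directly as Python closures and a couple of the axioms spot-checked numerically (the function and scalar choices here are arbitrary, for illustration only):

```python
import numpy as np

# Pointwise operations that make functions behave as vectors.
def add(f, g):
    """Vector addition: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(alpha, f):
    """Scalar multiplication: (alpha f)(x) = alpha * f(x)."""
    return lambda x: alpha * f(x)

f, g = np.sin, np.cos
x = np.linspace(0.0, 1.0, 5)  # sample points for the spot-check

# Commutativity of vector addition: f + g == g + f
assert np.allclose(add(f, g)(x), add(g, f)(x))

# Distributivity of scalar multiplication: alpha (f + g) == alpha f + alpha g
alpha = 2.0 + 1.0j
lhs = scale(alpha, add(f, g))(x)
rhs = add(scale(alpha, f), scale(alpha, g))(x)
assert np.allclose(lhs, rhs)
```

Checking at finitely many sample points is of course not a proof, but it illustrates that the axioms reduce to ordinary arithmetic applied pointwise.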

Functional Bases

If we think of vectors in a typical vector space, we often wish to work in terms of the components of these vectors in some basis. That is, for a vector \(\vec{v}\) in a \(d\)-dimensional vector space, we may always choose some basis set of \(d\) orthonormal unit vectors \(\lbrace\hat{b}_i\rbrace\) such that:

\[ \vec{v}=\sum_{i=1}^d c_i \hat{b}_i \] where \[ \langle \hat{b}_i, \hat{b}_j\rangle=\delta_{ij} \]
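A useful consequence of orthonormality, which we will lean on for functions later, is that the components are recovered by inner products: \(c_i = \langle\vec{v},\hat{b}_i\rangle\). A minimal numerical sketch (the particular basis here is just a random orthonormal frame built with a QR factorization):

```python
import numpy as np

# Build a non-trivial orthonormal basis: the columns of Q from a QR
# factorization of a random matrix are orthonormal.
Q, _ = np.linalg.qr(np.random.default_rng(1).standard_normal((3, 3)))
v = np.array([1.0, 2.0, 3.0])

# Components in the basis: c_i = <v, b_i>, i.e. projections onto each column.
c = Q.T @ v

# Reconstruction: v = sum_i c_i b_i
assert np.allclose(Q @ c, v)
```

The same projection-and-reconstruction pattern carries over to function spaces once an inner product on functions is defined.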
The same then applies to functions, though we now require infinite-dimensional basis sets.

As an example, consider the representation of a (sufficiently well-behaved, e.g. analytic) function as a power series, much as we do for Taylor series expansions.

\[ f(x)= \sum_{n=0}^{\infty}c_nx^n \]
In this case, we may consider the function to be projected onto a certain set of vectors \(\lbrace x^n\rbrace\) that span the space (though the terms \(x^n\) aren't orthonormal here, with respect to the inner product defined below).

Note that the definition above of an orthonormal basis requires that we introduce an inner product \(\langle \cdot,\cdot \rangle\) on the space. Below, we define this 'inner product of functions' in the context of general linear operators.

Linear Operators as Change of Bases

How is a derivative or integral like a matrix? They're all linear operators! As such, we can treat the usual operations of differentiation and integration somewhat analogously to matrices, which means we can apply the usual decompositions of linear operators to them.

As an example of a derivative acting like a matrix, let's again consider some function \(f\) represented as a power series, so that we have the following identification:

\[ f(x)= \sum_{n=0}^{\infty}c_nx^n =\begin{bmatrix}c_0 \\ c_1 \\ c_2 \\ \vdots \\ \end{bmatrix} \]
And now, according to the power rule of differentiation, we may represent the derivative's action as the following:
\[ \frac{d}{dx}f(x)= \begin{bmatrix} 0 & 1 & 0 & 0 & \cdots \\ 0& 0 & 2 & 0 &\cdots \\ 0& 0 & 0 & 3 &\cdots\\ \vdots &\vdots &\vdots &\vdots & \ddots \\ \end{bmatrix} \begin{bmatrix}c_0 \\ c_1 \\ c_2 \\ \vdots \\ \end{bmatrix} \]
While this identification as a matrix isn't so pretty in all possible bases, it should now be clear that some parallel between the two may be drawn.
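The matrix above can be tried out directly by truncating the series at some finite order (the truncation order below is an arbitrary choice for the demo):

```python
import numpy as np

# Truncated derivative matrix acting on power-series coefficients:
# d/dx sum_n c_n x^n = sum_n (n+1) c_{n+1} x^n.
N = 6  # truncation order (arbitrary, for illustration)
D = np.diag(np.arange(1, N), k=1)  # superdiagonal 1, 2, 3, ...

# f(x) = 1 + 2x + 3x^2  ->  coefficient vector [1, 2, 3, 0, 0, 0]
c = np.array([1.0, 2.0, 3.0, 0.0, 0.0, 0.0])

# f'(x) = 2 + 6x  ->  coefficient vector [2, 6, 0, 0, 0, 0]
print(D @ c)  # -> [2. 6. 0. 0. 0. 0.]
```

Note that truncation is lossy in general: each application of \(D\) shifts coefficients down one slot, so the top-order information falls off the end of a finite vector.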

We now need to introduce one last operation, which allows us to define a special type of operator highly relevant to physics, and which naturally yields nice sets of orthonormal bases. This operation is the inner product between vectors, a specific type of bi-linear form.

Bi-Linear Forms

Bi-linear forms are simply maps which take in two elements of the vector space, return an element of the underlying scalar field, and are linear in both arguments.

Inner Products

In many vector spaces, we define a specific such form on vectors: the inner product. In the case of functions, we may often do the same.

A commonly defined inner product \(\langle\cdot,\cdot\rangle\) between two functions \(f\) and \(g\) is the following:

\[ \langle f , g \rangle = \int_a^b dx \ f(x) g^*(x) w(x) \]
where \(w(x)\) is some weighting function, which may depend on the choice of coordinates (though one set of coordinates is often implicit); and the integral is taken over some definite interval, that is, over \(x\in[a,b]\).

Note that the complex conjugate is included so that inner products of (complex-valued) functions with themselves (that is, \(\langle f,f\rangle \)) are guaranteed to be real; this condition allows us to define a real-valued norm on the space. (Strictly speaking, the conjugation makes the form sesquilinear rather than bilinear over the complex numbers.) The inner product is sometimes instead defined with the complex conjugate applied to the left-hand argument (as in bra-ket notation).
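This inner product is easy to approximate numerically. As a sketch (with a hand-rolled trapezoidal rule, and a demo choice of basis: the complex exponentials \(e^{inx}\), which are orthonormal on \([0,2\pi]\) with constant weight \(w(x)=1/2\pi\)):

```python
import numpy as np

def inner(f, g, a, b, w=lambda x: 1.0, num=2000):
    """<f, g> = integral over [a, b] of f(x) g*(x) w(x) dx (trapezoidal rule)."""
    x = np.linspace(a, b, num + 1)
    y = f(x) * np.conj(g(x)) * w(x)
    dx = (b - a) / num
    return dx * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)

# Complex exponentials e^{inx} on [0, 2*pi] with weight w(x) = 1/(2*pi).
w = lambda x: 1.0 / (2 * np.pi)
e = lambda n: (lambda x: np.exp(1j * n * x))

print(abs(inner(e(2), e(2), 0, 2 * np.pi, w)))  # ~ 1 (normalized)
print(abs(inner(e(2), e(3), 0, 2 * np.pi, w)))  # ~ 0 (orthogonal)
```

The conjugate on \(g\) follows the convention in the formula above; with the bra-ket convention, conjugate \(f\) instead.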

Self-adjoint Operators

We may now define a self-adjoint operator (or Hermitian operator) \(L\) as one satisfying, for all \(f\) and \(g\) in its domain:

\[ \langle L f , g \rangle = \langle f , L g\rangle \]
So, in the specific case of complex-valued functions, we have that self-adjoint operators (such as Sturm-Liouville operators) satisfy the following:
\[ \int_a^b dx \ L\big( f(x)\big) g^*(x) w(x) = \int_a^b dx \ f(x) \big[ L\big( g(x)\big)\big]^* w(x) \]

If we now recall the spectral decomposition theorem, we know that a self-adjoint operator's set of eigenvectors forms an orthonormal basis for the underlying vector space. So if we have some Hermitian (self-adjoint) linear operator, such as a Sturm-Liouville operator, we can get bases for function spaces! In fact, this is where spherical harmonics, Legendre polynomials, and Bessel functions come from. The sometimes strange-looking coefficients in these functions often exist simply to normalize them with respect to the inner product.
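The finite-dimensional analogue of this statement is easy to verify: the eigenvectors of any Hermitian matrix form an orthonormal basis, and its eigenvalues are real. A minimal sketch (the random matrix is an arbitrary demo choice):

```python
import numpy as np

# Build a Hermitian matrix: A + A^dagger is Hermitian by construction.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
H = A + A.conj().T

# eigh assumes (and exploits) Hermitian input.
eigvals, eigvecs = np.linalg.eigh(H)

# Eigenvalues of a self-adjoint operator are real...
assert np.allclose(eigvals.imag, 0.0)

# ...and the eigenvector matrix is unitary: its columns are orthonormal.
assert np.allclose(eigvecs.conj().T @ eigvecs, np.eye(4))
```

For differential operators the space is infinite-dimensional and some analytic care is needed, but the picture is the same: real eigenvalues, orthonormal eigenfunctions.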

Where do Bessel functions come from?

The dreaded Bessel functions really are there to make your life easier. Note that the Bessel equation of the first kind arises as the radial component of Laplace's equation in cylindrical coordinates (after applying separation of variables).

\[ \frac{r}{R}\frac{\partial}{\partial r}\left(r\frac{\partial R}{\partial r} \right)+k^2r^2-n^2 = 0 \]

Note that this operator is linear and (with suitable boundary conditions) also happens to be Hermitian (self-adjoint). Hence, we may define some basis \(J_n\):

\[ \Rightarrow \ \ R(r)=J_n(kr)\quad \text{(Bessel Functions of First Kind)} \]
indexed by \(n\), which takes integer values from \(0\) to \(\infty\), with the corresponding definite interval for the inner product being \(0\) to \(1\) (and weight function \(w(r)=r\)). The coefficients of the functions are chosen so that they are normalized (with respect to the inner product with themselves) over this range.

Of course, the Bessel equation is second-order, so there are really two sets of solutions that we may extract from it. The second set is the lesser-used Neumann functions (also called Bessel functions of the second kind), which diverge at the origin.

Note also that other bases may be extracted from the Bessel equation (though they'll always be linear combinations of the previous two sets of solutions), such as the Hankel functions.
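The orthogonality of the Bessel basis can be checked numerically: with weight \(w(r)=r\) on \([0,1]\), the functions \(J_n(\alpha_{n,i}\, r)\), where \(\alpha_{n,i}\) is the \(i\)-th positive zero of \(J_n\), are mutually orthogonal, and their norm is the standard closed form \(\tfrac{1}{2}J_{n+1}(\alpha_{n,i})^2\). A sketch using scipy (the choice \(n=1\) is arbitrary):

```python
import numpy as np
from scipy.special import jv, jn_zeros

def inner_r(f, g, num=4000):
    """<f, g> = integral over [0, 1] of f(r) g(r) r dr (trapezoidal rule, weight w(r) = r)."""
    r = np.linspace(0.0, 1.0, num + 1)
    y = f(r) * g(r) * r
    return (1.0 / num) * (y[0] / 2 + y[1:-1].sum() + y[-1] / 2)

n = 1
alphas = jn_zeros(n, 3)  # first three positive zeros of J_1

b1 = lambda r: jv(n, alphas[0] * r)
b2 = lambda r: jv(n, alphas[1] * r)

# Modes built from different zeros are orthogonal...
print(abs(inner_r(b1, b2)))  # ~ 0

# ...and the squared norm matches (1/2) J_{n+1}(alpha)^2.
print(inner_r(b1, b1), 0.5 * jv(n + 1, alphas[0]) ** 2)
```

Dividing each mode by its norm yields the normalized basis functions mentioned above.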

So, if we have a self-adjoint differential equation (as we often do in physics), we can always project our functions onto a particularly nice basis. This is similar to the concept of quantum numbers: for example, the \(n,l,m\) quantum numbers describing a Hydrogen-like wavefunction are exactly the index values for the related bases (that is, the wave-functions are characterizable by which basis functions they are composed of). This suggests that non-simultaneous quantum numbers are evidence of a poor or inconsistent choice of basis.