Quasi matrices, orthogonal polynomials, and Lanczos

$\def\sopmatrix#1{ \begin{pmatrix}#1\end{pmatrix} } \def\map[#1]{\mapengine #1,\void.} \def\mapenginesep_#1#2,#3.{\mapfunction{#2}\ifx\void#3\else#1\mapengine #3.\fi } \def\mapsep_#1[#2]{\mapenginesep_{#1}#2,\void.} \def\vcbr[#1]{\pr(#1)} \def\qqand{\qquad\hbox{and}\qquad} \def\qqfor{\qquad\hbox{for}\qquad} \def\qqas{\qquad\hbox{as}\qquad} \def\half{ {1 \over 2} } \def\D{ {\rm d} } \def\I{ {\rm i} } \def\E{ {\rm e} } \def\C{ {\mathbb C} } \def\R{ {\mathbb R} } \def\H{ {\mathbb H} } \def\Z{ {\mathbb Z} } \def\T{ {\cal T} } \def\CC{ {\cal C} } \def\FF{ {\cal F} } \def\HH{ {\cal H} } \def\LL{ {\cal L} } \def\vc#1{ {\mathbf #1} } \def\bbC{ {\mathbb C} } \def\norm#1{\left\| #1 \right\|} \def\pr(#1){\left({#1}\right)} \def\br[#1]{\left[{#1}\right]} \def\acos{{\rm acos}\,} \def\ip#1{ \left\langle{#1}\right\rangle } \def\addtab#1={#1\;&=} \def\ccr{\\\addtab}$

Orthogonal polynomials are an important tool in applied and computational mathematics. In numerical analysis one normally uses Chebyshev polynomials ($T_n(x) = \cos n \acos x$), as Chebyshev expansions are cosine series in disguise: setting $x = \cos \theta$, one is immediately able to leverage the power of computational Fourier series, e.g., the fast cosine transform. Under some circumstances it is beneficial to use other classical orthogonal polynomials, e.g., Legendre, ultraspherical, Jacobi, Laguerre, or Hermite polynomials. For example, in ApproxFun ultraspherical polynomials are used for numerically solving differential equations, and high-parameter Jacobi polynomials are needed for constructing multivariate orthogonal polynomials on balls and tetrahedra. And, of course, weighted Hermite polynomials[1] give eigenfunctions for Schrödinger operators with quadratic well potentials, diagonalise the Fourier transform, describe statistics of random matrices, e.g. the Gaussian Unitary Ensemble (GUE), ... the list goes on and on.

This post is on computing general non-classical orthogonal polynomials, that is, polynomials orthogonal with respect to an inner product of the form

\[ \ip{f,g} = \int_a^b f(x) g(x) w(x) \D x \]

where $w(x)$ is not a classical weight (such as the Jacobi weight $(1-x)^\alpha (1+x)^\beta$). Such polynomials are admittedly a bit esoteric, though there are specialised applications where they are useful:

  1. Constructing quadrature rules for integrands with singularities, when the integrand is of the form $w(x) g(x)$ where $g$ is well-approximated by polynomials. This is only ever useful if one has many different $g$'s for a single $w$, to make up for the expensive precomputation of quadrature nodes and weights, which must be redone for each new $w$.

  2. Computing random matrix statistics for general invariant ensembles. Universality results mean that these statistics actually look like GUE apart from very special circumstances, but it is interesting to explore these special circumstances.

  3. Constructing orthogonal polynomials on higher dimensional geometries such as half-disks and trapeziums, arcs and hyperbolas, and related surfaces of revolution.

The last application is why I'm currently interested in computing general orthogonal polynomials: they can be used to solve partial differential equations on quite exotic geometries, including ones with corners, cusps, etc. But to do this, it is important to be able to generate them efficiently, at least up to degrees in the thousands.

In this post I'm going to advocate a way of implementing the standard procedure for computing orthogonal polynomials, the Stieltjes procedure, that lends itself to adaptivity and manipulating the polynomials, their expansions, their derivatives, etc. This will be done by recasting the Stieltjes procedure as the Lanczos algorithm applied to infinite matrices with a non-standard inner product. This was motivated by a comment in Nick Trefethen's John von Neumann Prize lecture that the Stieltjes procedure is "Vandermonde with Arnoldi", which is a reference to the Brubeck, Nakatsukasa and Trefethen paper [2]. In our case we do not have any Vandermonde matrices as we do not work with a quadrature rule; instead, one could call this approach "Jacobi with Lanczos": as we will see, it is simply the Lanczos procedure applied to the Jacobi matrix with a non-standard inner product.

Computing general orthogonal polynomials is a classical problem, and is discussed in detail in Gautschi's excellent book. There also exist more involved methods for computing orthogonal polynomials from their Riemann–Hilbert problem representations with cost independent of the degree of the polynomial. I won't even attempt to compare any of the different classical or non-classical approaches.

Quasi- and continuum-arrays

We're going to introduce orthogonal polynomials in a slightly odd way, as quasi-matrices. A quasi-array as implemented in QuasiArrays.jl is a generalisation of an array that supports non-integer indexing. The most important use of a quasi-array is the case where one axis is continuous: we call this a continuum-array, as implemented in ContinuumArrays.jl. A continuum-matrix is essentially a row vector of functions, where we use array-like notation to access the entries: if P is a continuum-matrix corresponding to the row vector of functions $[p_1(x) | p_2(x) | \ldots | p_n(x)]$, then P[x,n] corresponds to evaluating $p_n(x)$.

It's best seen in code. Let's use a simple example of linear splines before moving on to orthogonal polynomials. L = LinearSpline(points) represents the basis of "hat functions": piecewise affine functions that equal one at a single grid point and zero at all the others. To evaluate the nth basis function at the point x we use the notation L[x,n]. Here we show how to plot a linear spline basis with 5 evenly spaced points:

using QuasiArrays, ContinuumArrays, OrthogonalPolynomialsQuasi, Plots, LinearAlgebra

L = LinearSpline(range(0, 1; length=5))
xx = range(0,1; length=10_000) # plotting grid
plot(xx, L[xx,:]; label=permutedims(string.("L[x,", 1:5,"]")))

That is, L is like a matrix but the first axis is continuous:

axes(L)
(Inclusion(0.0..1.0), Base.OneTo(5))

Here Inclusion(0.0..1.0) is an axis representing the interval 0.0..1.0, which is itself represented by a quasi-vector version of the identity function:

x = axes(L,1)
x[0.1]
0.1

Adjoints are then defined naturally by (L')[n,x] == L[x,n]. More interesting are products. For example, what does x'x mean? There is only one obvious definition (to me): for vectors this is the dot product dot(x,x), so it must be the continuous analogue, the inner product:

\[ \ip{f,g} = \int_0^1 f(x) g(x) \D x \]

And indeed:

x'x # integral of x^2 from 0 to 1
0.3333333333333333

So what's L'L? We understand it exactly like matrix multiplication: just as for matrices (A*B)[k,j] == dot(A[k,:],B[:,j]), for quasi-matrices we have (L'L)[k,j] == dot(L[:,k], L[:,j]) where now the dot corresponds to integrals:

L'L
5×5 SymTridiagonal{Float64,Array{Float64,1}}:
 0.0833333  0.0416667   ⋅          ⋅          ⋅ 
 0.0416667  0.166667   0.0416667   ⋅          ⋅ 
  ⋅         0.0416667  0.166667   0.0416667   ⋅ 
  ⋅          ⋅         0.0416667  0.166667   0.0416667
  ⋅          ⋅          ⋅         0.0416667  0.0833333

This is also called the mass matrix for linear splines.

Why bother with quasi-arrays? Because they give a clean way of manipulating functions using linear algebra language. This grew out of certain deficiencies in ApproxFun.jl; for example, there was no natural way of describing mass matrices.

Orthogonal polynomials as quasi-arrays

In this quasi-array language, orthogonal polynomials can be defined as a continuum-matrix Q whose columns are polynomials of increasing degree, satisfying, for a given weight w represented by a quasi-vector, that Q' * (w .* Q) is diagonal, i.e., dot(Q[:,j], w .* Q[:,k]) is zero for k ≠ j. Here broadcast notation works exactly like it does for vectors: (w .* Q[:,k])[x] == w[x] * Q[x,k]. In more classical notation, if the $(n+1)$th column of Q is a polynomial

\[ q_n(x) = k_n x^n + O(x^{n-1}) \]

with $k_n \neq 0$, we have

\[ \int_a^b q_n(x) q_m(x) w(x) \D x = 0 \]

for $n \neq m$. If we further have Q' * (w .* Q) == I then they are orthonormal polynomials.

Let's see this in action, beginning with Legendre polynomials, which are orthogonal with respect to the standard inner product. Let's first plot the first 5 polynomials, showing that they are of degree 0,1,...,4:

P = Legendre()
xx = range(-1,1; length=10_000) # plotting grid
plot(xx, P[xx,1:5]; label=permutedims(string.("P_", 0:4,"(x)")))

Since these are orthogonal, the mass matrix is diagonal:

M = P'P
∞×∞ Diagonal{Float64,LazyArrays.BroadcastArray{Float64,1,typeof(/),Tuple{In
t64,InfiniteArrays.InfStepRange{Int64,Int64}}}} with indices OneToInf()×One
ToInf():
 2.0   ⋅         ⋅    ⋅         ⋅        …  
  ⋅   0.666667   ⋅    ⋅         ⋅           
  ⋅    ⋅        0.4   ⋅         ⋅           
  ⋅    ⋅         ⋅   0.285714   ⋅           
  ⋅    ⋅         ⋅    ⋅        0.222222     
  ⋅    ⋅         ⋅    ⋅         ⋅        …  
  ⋅    ⋅         ⋅    ⋅         ⋅           
  ⋅    ⋅         ⋅    ⋅         ⋅           
  ⋅    ⋅         ⋅    ⋅         ⋅           
  ⋅    ⋅         ⋅    ⋅         ⋅           
 ⋮                                       ⋱

Note we can normalize them via Q = P * inv(sqrt(M)), so that Q'Q == inv(sqrt(M)) * P'P * inv(sqrt(M)) == I, which is wrapped up in a convenience type:

Q = Normalized(P) # Orthonormal version of Legendre
Q'Q
∞×∞ Diagonal{Float64,LazyArrays.BroadcastArray{Float64,1,typeof(*),Tuple{La
zyArrays.BroadcastArray{Float64,1,typeof(*),Tuple{OrthogonalPolynomialsQuas
i.NormalizationConstant{Float64,LazyArrays.BroadcastArray{Float64,1,typeof(
/),Tuple{InfiniteArrays.InfStepRange{Float64,Float64},InfiniteArrays.InfSte
pRange{Int64,Int64}}},LazyArrays.BroadcastArray{Float64,1,typeof(/),Tuple{I
nfiniteArrays.InfStepRange{Float64,Float64},InfiniteArrays.InfStepRange{Int
64,Int64}}}},LazyArrays.BroadcastArray{Float64,1,typeof(/),Tuple{Int64,Infi
niteArrays.InfStepRange{Int64,Int64}}}}},OrthogonalPolynomialsQuasi.Normali
zationConstant{Float64,LazyArrays.BroadcastArray{Float64,1,typeof(/),Tuple{
InfiniteArrays.InfStepRange{Float64,Float64},InfiniteArrays.InfStepRange{In
t64,Int64}}},LazyArrays.BroadcastArray{Float64,1,typeof(/),Tuple{InfiniteAr
rays.InfStepRange{Float64,Float64},InfiniteArrays.InfStepRange{Int64,Int64}
}}}}}} with indices OneToInf()×OneToInf():
 1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   …  
  ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅    ⋅      
  ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅    ⋅      
  ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅    ⋅      
  ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅    ⋅      
  ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅    ⋅   …  
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0   ⋅      
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅   1.0     
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅      
  ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅    ⋅      
 ⋮                        ⋮              ⋱

Other classical orthogonal polynomials have similar relationships. For example, for Chebyshev polynomials we have:

w = ChebyshevTWeight() # w[x] == 1/sqrt(1-x^2)
T = ChebyshevT()
T' * (w .* T)
∞×∞ Diagonal{Float64,LazyArrays.ApplyArray{Float64,1,typeof(vcat),Tuple{Flo
at64,FillArrays.Fill{Float64,1,Tuple{InfiniteArrays.OneToInf{Int64}}}}}} wi
th indices OneToInf()×OneToInf():
 3.14159   ⋅       ⋅       ⋅      …  
  ⋅       1.5708   ⋅       ⋅         
  ⋅        ⋅      1.5708   ⋅         
  ⋅        ⋅       ⋅      1.5708     
  ⋅        ⋅       ⋅       ⋅         
  ⋅        ⋅       ⋅       ⋅      …  
  ⋅        ⋅       ⋅       ⋅         
  ⋅        ⋅       ⋅       ⋅         
  ⋅        ⋅       ⋅       ⋅         
  ⋅        ⋅       ⋅       ⋅         
 ⋮                                ⋱

Lanczos procedure on infinite arrays

We now arrive at the main point of the blog post: given w, how can we find the corresponding orthonormal polynomials Q? The classic method is the Stieltjes procedure, which is an optimised version of Gram–Schmidt: if we know $q_{n-1}(x)$ and $q_n(x)$, write

\[ q_{n+1}(x) = A_n x q_n(x) + B_n q_n(x) - C_{n-1} q_{n-1}(x) \]

where $A_n$, $B_n$ and $C_{n-1}$ are determined by orthogonalising $x q_n(x)$. That is,

\[ \begin{align*} v &= x q_n(x), \\ \gamma_n &= \ip{v, q_{n-1}}, \qquad \beta_n = \ip{v, q_n}, \\ u &= v - \gamma_n q_{n-1} - \beta_n q_n, \\ A_n &= 1/\norm{u}, \qquad B_n = -\beta_n A_n, \qquad C_{n-1} = \gamma_n A_n, \\ q_{n+1}(x) &= A_n u. \end{align*} \]

We get for free that the coefficients in front of lower order polynomials are zero since $\ip{x q_n(x), q_m(x)} = \ip{q_n(x), x q_m(x)} = 0$ for $m < n-1$.

This Stieltjes procedure is, in fact, an infinite-dimensional standard Lanczos algorithm with a non-standard inner product, a fact that was incorporated into ApproxFun.jl [3]. What's interesting about this implementation is that it does not take the obvious approach of discretising the inner product, as discussed in Gautschi's book. Unfortunately, the lanczos.jl code is quite long in the tooth, and is not particularly efficient. It is much cleaner in quasi-matrix language.

The idea is to represent orthogonal polynomials by re-expanding them in a simpler basis. For example, if our weight is w[x] == x + x^2 + 1, which has no singularities, it is natural to write the new orthogonal expansion as Q = P*R where P = Normalized(Legendre()) and R is upper-triangular: it must be upper-triangular as the columns of Q and P are both polynomials of increasing degree, thus the first n columns of Q must be linear combinations of the first n columns of P. In light of the fact that P'P == I, this is in fact a QR decomposition of Q.

Now to do computations we need to drop quasi-matrices and deal only with the (infinite) matrix R. The key is that we can simplify w .* P as P*W:

P = Normalized(Legendre())
x = axes(P,1)
w = x .+ x.^2 .+ 1 # w[x] == x + x^2 + 1
W = P \ (w .* P)
∞×∞ Clenshaw{Float64} with 3 degree polynomial:
 1.33333   0.57735   0.298142   ⋅        …  
 0.57735   1.6       0.516398  0.261861     
 0.298142  0.516398  1.52381   0.507093     
  ⋅        0.261861  0.507093  1.51111      
  ⋅         ⋅        0.255551  0.503953     
  ⋅         ⋅         ⋅        0.253246  …  
  ⋅         ⋅         ⋅         ⋅           
  ⋅         ⋅         ⋅         ⋅           
  ⋅         ⋅         ⋅         ⋅           
  ⋅         ⋅         ⋅         ⋅           
 ⋮                                       ⋱

Therefore, our equation becomes Q' * (w .* Q) == R' * P' * P * W * R == R' * W * R.

Note that W is an infinite-dimensional, symmetric positive definite banded matrix, hence defines an inner product on coefficient vectors. If the vector has only a finite number of entries this can be realised on a computer:

v = [randn(100); zeros(∞)]
dot(v, W, v) # squared norm of a random degree 99 polynomial `P * v` w.r.t. the weight `w`
176.74273723058576
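
As a sanity check, since v vanishes beyond entry 100 and W is banded, the same number can be computed from an exact finite section of W, obtained here by indexing (the truncation size 100 is mine, purely for illustration):

vn = v[1:100]                       # the nonzero part of v
dot(vn, Matrix(W[1:100,1:100]), vn) # same value as above, computed finitely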

This tells us the inner product to use, but the Lanczos algorithm is all about tridiagonalising an operator A. Note that it is essentially the same as the Arnoldi iteration in that it performs Gram–Schmidt on the Krylov matrix [b A*b A^2*b … A^r*b]. Since we want to make sure our basis is polynomial, we want A to correspond to multiplication by x. When represented as acting on P this is a simple symmetric tridiagonal matrix, the Jacobi operator:

X = P \ (x .* P)
∞×∞ Symmetric{Float64,BandedMatrices.BandedMatrix{Float64,LazyArrays.ApplyA
rray{Float64,2,typeof(vcat),Tuple{Adjoint{Float64,FillArrays.Zeros{Float64,
1,Tuple{InfiniteArrays.OneToInf{Int64}}}},Adjoint{Float64,LazyArrays.Broadc
astArray{Float64,1,typeof(/),Tuple{LazyArrays.BroadcastArray{Float64,1,type
of(*),Tuple{LazyArrays.BroadcastArray{Float64,1,typeof(/),Tuple{InfiniteArr
ays.InfStepRange{Float64,Float64},InfiniteArrays.InfStepRange{Int64,Int64}}
},OrthogonalPolynomialsQuasi.NormalizationConstant{Float64,LazyArrays.Broad
castArray{Float64,1,typeof(/),Tuple{InfiniteArrays.InfStepRange{Float64,Flo
at64},InfiniteArrays.InfStepRange{Int64,Int64}}},LazyArrays.BroadcastArray{
Float64,1,typeof(/),Tuple{InfiniteArrays.InfStepRange{Float64,Float64},Infi
niteArrays.InfStepRange{Int64,Int64}}}}}},SubArray{Float64,1,OrthogonalPoly
nomialsQuasi.NormalizationConstant{Float64,LazyArrays.BroadcastArray{Float6
4,1,typeof(/),Tuple{InfiniteArrays.InfStepRange{Float64,Float64},InfiniteAr
rays.InfStepRange{Int64,Int64}}},LazyArrays.BroadcastArray{Float64,1,typeof
(/),Tuple{InfiniteArrays.InfStepRange{Float64,Float64},InfiniteArrays.InfSt
epRange{Int64,Int64}}}},Tuple{InfiniteArrays.InfUnitRange{Int64}},false}}}}
}},InfiniteArrays.OneToInf{Int64}}} with indices OneToInf()×OneToInf():
 0.0      0.57735    ⋅         ⋅        …  
 0.57735  0.0       0.516398   ⋅           
  ⋅       0.516398  0.0       0.507093     
  ⋅        ⋅        0.507093  0.0          
  ⋅        ⋅         ⋅        0.503953     
  ⋅        ⋅         ⋅         ⋅        …  
  ⋅        ⋅         ⋅         ⋅           
  ⋅        ⋅         ⋅         ⋅           
  ⋅        ⋅         ⋅         ⋅           
  ⋅        ⋅         ⋅         ⋅           
 ⋮                                      ⋱

So we can perform Lanczos directly on X w.r.t. the inner product induced by W: this is purely linear algebra, without need for quasi-arrays at all. Moreover, we can do it adaptively: R is upper-triangular so each of its columns has only finitely many nonzero entries, X is tridiagonal so multiplying by it just adds one more nonzero entry, and W is banded so the inner products can be computed in finite time.
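
To make this concrete, here is a minimal fixed-size sketch (the name lanczos_sketch and the truncation sizes are mine, purely for illustration: the LanczosPolynomial implementation used below works adaptively on the infinite matrices instead). Since X is tridiagonal and W is banded, exact finite sections obtained by indexing suffice to reproduce the first n steps:

# Sketch: Lanczos on the Jacobi matrix Xn with the inner product ⟨v,u⟩ = v'Wn*u.
# Returns the new n×n Jacobi matrix and the P-coefficients R of q_0,…,q_{n-1}.
function lanczos_sketch(Xn, Wn, n)
    R = zeros(size(Xn,1), n)      # column k holds the P-coefficients of q_{k-1}
    α, β = zeros(n), zeros(n-1)   # diagonal and off-diagonal of the new Jacobi matrix
    R[1,1] = 1/sqrt(Wn[1,1])      # q_0 is constant, normalised so that ⟨q_0,q_0⟩ == 1
    for k = 1:n-1
        v = Xn*R[:,k]                    # P-coefficients of x*q_{k-1}(x)
        α[k] = dot(v, Wn, R[:,k])        # ⟨x q_{k-1}, q_{k-1}⟩
        v -= α[k]*R[:,k]                 # orthogonalise against q_{k-1}
        k > 1 && (v -= β[k-1]*R[:,k-1])  # …and against q_{k-2}, via γ_k == β_{k-1}
        β[k] = sqrt(dot(v, Wn, v))       # the remaining norm, that is 1/A_{k-1}
        R[:,k+1] = v/β[k]                # next orthonormal polynomial
    end
    α[n] = dot(Xn*R[:,n], Wn, R[:,n])
    SymTridiagonal(α, β), R
end

n = 10
J, R = lanczos_sketch(Matrix(X[1:n+1,1:n+1]), Matrix(W[1:n+1,1:n+1]), n)

The returned J reproduces the top-left section of the Jacobi operator Q \ (x .* Q) computed in the examples below, and the columns of R match the conversion matrix P \ Q.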

Examples

This is all wrapped up in a convenient type LanczosPolynomial, which automatically and adaptively performs Lanczos to construct the orthogonal polynomials. Here we plot the first 5 orthogonal polynomials:

Q = LanczosPolynomial(w)
plot(xx, Q[xx,1:5]; label=permutedims(string.("q_", 0:4,"(x)")))

Those in the know will recognise one of the classical properties of orthogonal polynomials: the roots of $q_{n-1}$ (Q[x,n]) interlace those of $q_n$ (Q[x,n+1]). We can easily scale up to much higher orders:

plot(xx, Q[xx,1001:1005]; label=permutedims(string.("q_", 1000:1004,"(x)")))

As the computation is cached we can keep going without starting from scratch.

We can also find the Jacobi operator corresponding to multiplication by x:

Q \ (x .* Q)
∞×∞ SymTridiagonal{Float64} Jacobi operator from Lanczos:
 0.25       0.580948   ⋅         …  
 0.580948  -0.101852  0.515638      
  ⋅         0.515638  0.0145877     
  ⋅          ⋅        0.502487      
  ⋅          ⋅         ⋅            
  ⋅          ⋅         ⋅         …  
  ⋅          ⋅         ⋅            
  ⋅          ⋅         ⋅            
  ⋅          ⋅         ⋅            
  ⋅          ⋅         ⋅            
 ⋮                               ⋱
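
As a quick application, the classical Golub–Welsch connection (a standard fact, not anything specific to this library) says that the roots of $q_n$ are the eigenvalues of the top-left $n \times n$ principal section of this Jacobi operator, so we can compute them with a symmetric tridiagonal eigensolver:

J = Q \ (x .* Q)
n = 10
eigvals(SymTridiagonal(Matrix(J[1:n,1:n]))) # the 10 roots of q_10, all inside (-1,1)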

And we can find the R such that Q = P*R:

P \ Q
∞×∞ LanczosConversion{Float64}:
 0.866025  -0.372678  -0.0856594   0.100801  …  
  ⋅         0.860663  -0.247277   -0.088639     
  ⋅          ⋅         0.861931   -0.279146     
  ⋅          ⋅          ⋅          0.869831     
  ⋅          ⋅          ⋅           ⋅           
  ⋅          ⋅          ⋅           ⋅        …  
  ⋅          ⋅          ⋅           ⋅           
  ⋅          ⋅          ⋅           ⋅           
  ⋅          ⋅          ⋅           ⋅           
  ⋅          ⋅          ⋅           ⋅           
 ⋮                                           ⋱

Or even more fun, we can differentiate expansions in Q as it automatically exploits the relationship Q = P*R:

f = Q * [randn(10); zeros(∞)] # random degree 9 polynomial
D = Derivative(x) # Derivative operator on -1..1
plot(xx, f[xx]; label="f")
plot!(xx, (D*f)[xx]; label="f'")

One nice aspect of this approach is that it extends naturally and immediately to higher-precision arithmetic, whereas a quadrature-based approach has difficulties: with Golub–Welsch one needs a high-precision eigensolver, and with asymptotics-based approaches one needs error bounds not tied to standard Float64 arithmetic; that is, FastGaussQuadrature.jl is not going to work. Here we repeat the above computation with BigFloat:

x̃ = Inclusion(ChebyshevInterval{BigFloat}())
w̃ = x̃.^2 .+ x̃ .+ 1 # w[x] == x + x^2 + 1
Q̃ = LanczosPolynomial(w̃)
Q̃[0.25,1:5]
5-element Array{BigFloat,1}:
  0.612372435695794524549321018676472847991486870164167532108173141812740094364​3334
 -2.644275786581307751970119443650562280386050313162326166903075102157292669306343e-78
 -0.6899337483895560298026006338837552005933236729120150452021647660689711019978372
 -0.3232302900202879729991123972508837435943771022941191965002520525327979550510895
  0.5260762153208967627625773099517814232150359031266487585548381042110279886443275

Interval arithmetic (for rigorous computation of orthogonal polynomials and their recurrence coefficients) and automatic differentiation (for differentiating with respect to a varying parameter in the weight) would also work, except that we cannot yet compute the expansion of exp.(x) or other non-polynomial functions in Legendre polynomials.

Orthogonal tridiagonalisation?

Lanczos is a way of tridiagonalising matrices, and when run to completion is equivalent to Householder tridiagonalisation if the standard inner product is used and b = [1; zeros(size(A,1)-1)] [4]. Something interesting is going on in the orthogonal polynomial setting: X is already tridiagonal, and so Lanczos is mapping it to a different tridiagonal matrix. One way to think of this is to consider the fact that R'W*R == I means U = sqrt(W)*R is an orthogonal matrix, whose columns are precisely the Legendre coefficients of sqrt(w) .* Q, which is an orthonormal basis. Thus it is really an orthogonal similarity transform: J = U'X*U.
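
We can sanity-check this on a finite section (the size 20 is mine, for illustration; since R is upper-triangular and W is banded, the top-left section of R'W*R is computed exactly, so any deviation from I is down to roundoff and the loss of orthogonality discussed next):

n = 20
Rn = Matrix((P \ Q)[1:n,1:n])      # the columns of R are finitely supported
Wn = Symmetric(Matrix(W[1:n,1:n]))
Un = sqrt(Wn)*Rn                   # columns: Legendre coefficients of sqrt.(w) .* Q
norm(Un'Un - I)                    # ≈ 0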

The annoying thing about Lanczos (and therefore Stieltjes) is that there is an inherent instability where orthogonality is eventually lost, as is typical of Gram–Schmidt methods [5]. For QR there is a beautiful alternative to Gram–Schmidt: use Householder reflections. That is, to quote Trefethen & Bau, use orthogonal triangularisation instead of triangular orthogonalisation. I'll leave it for now as a thought experiment: does there exist an "orthogonal tridiagonalisation" version of the Stieltjes algorithm as an alternative, more stable approach to computing orthogonal polynomial expansions?


Acknowledgments Thanks to Marco Fasondini, Yuji Nakatsukasa, Mikael Slevinsky, Ben Snowball, and Nick Trefethen for useful comments!

[1] I should mention that there is some controversy over whether weighted Hermite polynomials are ever useful in computational settings. The lack of a numerically stable fast transform is a significant issue: FastTransforms.jl does have weightedhermitetransform but it is neither fast nor stable. A common alternative is to truncate the real line and use Fourier series.

[2] To appear in SIAM Review.

[3] This was added as an example back in 2014 at the request of Alan Edelman: as mentioned above, non-classical orthogonal polynomials come up in random matrix theory, which is one of Alan's research interests. If I recall correctly he used the code in 18.338 Eigenvalues of Random Matrices.

[4] I learned this seldom mentioned fact from Tom Trogdon.

[5] Yuji Nakatsukasa mentioned that this issue can be overcome by reorthogonalisation, but I think there is still a benefit from having a representation that encodes the orthogonality directly.


Comments

  1. Regarding the last point about Lanczos losing orthogonality, and whether a robust version using Householder transformations instead of modified Gram-Schmidt can be used, it appears to be possible. See "Implementation of the GMRES Method Using Householder Transformations" by Homer Walker (SISC 1988) https://doi.org/10.1137/0909010. It requires about the same storage but twice as many operations as modified Gram-Schmidt.
