Levinson recursion: Difference between revisions

From Wikipedia, the free encyclopedia
Jump to navigation Jump to search
imported>OAbot
m Open access bot: url-access updated in citation with #oabot.
 
imported>Quondum
No edit summary
 
Line 1: Line 1:
{{Short description|Recursive algorighm in linear algebra}}
{{short description|Recursive algorighm in linear algebra}}
'''Levinson recursion''' or '''Levinson–Durbin recursion''' is a procedure in [[linear algebra]] to [[recursion|recursively]] calculate the solution to an equation involving a [[Toeplitz matrix]]. The [[algorithm]] runs in {{math|[[Big O notation|Θ]](''n''<sup>2</sup>)}} time, which is a strong improvement over [[Gauss–Jordan elimination]], which runs in Θ(''n''<sup>3</sup>).
'''Levinson recursion''' or '''Levinson–Durbin recursion''' is a procedure in [[linear algebra]] to [[recursion|recursively]] calculate the solution to an equation involving a [[Toeplitz matrix]]. The [[algorithm]] runs in {{math|[[Big O notation|Θ]](''n''<sup>2</sup>)}} time, which is a strong improvement over [[Gauss–Jordan elimination]], which runs in Θ(''n''<sup>3</sup>).


Line 8: Line 8:
The Bareiss algorithm for [[Toeplitz matrix|Toeplitz matrices]] (not to be confused with the general [[Bareiss algorithm]]) runs about as fast as Levinson recursion, but it uses {{math|''O''(''n''<sup>2</sup>)}} space, whereas Levinson recursion uses only ''O''(''n'') space.  The Bareiss algorithm, though, is [[numerical stability|numerically stable]],<ref>Bojanczyk et al. (1995).</ref><ref>Brent (1999).</ref> whereas Levinson recursion is at best only weakly stable (i.e. it exhibits numerical stability for [[Condition number|well-conditioned]] linear systems).<ref>Krishna & Wang (1993).</ref>
The Bareiss algorithm for [[Toeplitz matrix|Toeplitz matrices]] (not to be confused with the general [[Bareiss algorithm]]) runs about as fast as Levinson recursion, but it uses {{math|''O''(''n''<sup>2</sup>)}} space, whereas Levinson recursion uses only ''O''(''n'') space.  The Bareiss algorithm, though, is [[numerical stability|numerically stable]],<ref>Bojanczyk et al. (1995).</ref><ref>Brent (1999).</ref> whereas Levinson recursion is at best only weakly stable (i.e. it exhibits numerical stability for [[Condition number|well-conditioned]] linear systems).<ref>Krishna & Wang (1993).</ref>


Newer algorithms, called ''asymptotically fast'' or sometimes ''superfast'' Toeplitz algorithms, can solve in {{math|Θ(''n'' log<sup>''p''</sup>''n'')}} for various ''p'' (e.g. ''p'' = 2,<ref>{{Cite web |url=http://www.maths.anu.edu.au/~brent/pd/rpb143tr.pdf |title=Archived copy |access-date=2013-04-01 |archive-date=2012-03-25 |archive-url=https://web.archive.org/web/20120325215317/http://maths.anu.edu.au/~brent/pd/rpb143tr.pdf |url-status=dead }}</ref><ref>{{cite web |url=http://etd.gsu.edu/theses/available/etd-04182008-174330/unrestricted/kimitei_symon_k_200804.pdf |title=Archived copy |access-date=2009-04-28 |url-status=dead |archive-url=https://web.archive.org/web/20091115041852/http://etd.gsu.edu/theses/available/etd-04182008-174330/unrestricted/kimitei_symon_k_200804.pdf |archive-date=2009-11-15 }}</ref> ''p'' = 3 <ref>{{cite web |url=http://saaz.cs.gsu.edu/papers/sfast.pdf |title=Archived copy |website=saaz.cs.gsu.edu |access-date=12 January 2022 |archive-url=https://web.archive.org/web/20070418074240/http://saaz.cs.gsu.edu/papers/sfast.pdf |archive-date=18 April 2007 |url-status=dead}}</ref>). Levinson recursion remains popular for several reasons; for one, it is relatively easy to understand in comparison; for another, it can be faster than a superfast algorithm for small ''n'' (usually ''n''&nbsp;<&nbsp;256).<ref>{{Cite web |url=http://www.math.niu.edu/~ammar/papers/amgr88.pdf |title=Archived copy |access-date=2006-08-15 |archive-date=2006-09-05 |archive-url=https://web.archive.org/web/20060905064921/http://www.math.niu.edu/~ammar/papers/amgr88.pdf |url-status=dead }}</ref>
Newer algorithms, called ''asymptotically fast'' or sometimes ''superfast'' Toeplitz algorithms, can solve in {{math|Θ(''n'' (log ''n'')<sup>''p''</sup>)}} for various ''p'' (e.g. ''p'' = 2,<ref>{{cite web |url=http://www.maths.anu.edu.au/~brent/pd/rpb143tr.pdf |title=Archived copy |access-date=2013-04-01 |archive-date=2012-03-25 |archive-url=https://web.archive.org/web/20120325215317/http://maths.anu.edu.au/~brent/pd/rpb143tr.pdf |url-status=dead }}</ref><ref>{{cite web |url=http://etd.gsu.edu/theses/available/etd-04182008-174330/unrestricted/kimitei_symon_k_200804.pdf |title=Archived copy |access-date=2009-04-28 |url-status=dead |archive-url=https://web.archive.org/web/20091115041852/http://etd.gsu.edu/theses/available/etd-04182008-174330/unrestricted/kimitei_symon_k_200804.pdf |archive-date=2009-11-15 }}</ref> ''p'' = 3<ref>{{cite web |url=http://saaz.cs.gsu.edu/papers/sfast.pdf |title=Archived copy |website=saaz.cs.gsu.edu |access-date=12 January 2022 |archive-url=https://web.archive.org/web/20070418074240/http://saaz.cs.gsu.edu/papers/sfast.pdf |archive-date=18 April 2007 |url-status=dead}}</ref>). Levinson recursion remains popular for several reasons; for one, it is relatively easy to understand in comparison; for another, it can be faster than a superfast algorithm for small ''n'' (usually ''n''&nbsp;<&nbsp;256).<ref>{{cite web |url=http://www.math.niu.edu/~ammar/papers/amgr88.pdf |title=Archived copy |access-date=2006-08-15 |archive-date=2006-09-05 |archive-url=https://web.archive.org/web/20060905064921/http://www.math.niu.edu/~ammar/papers/amgr88.pdf |url-status=dead }}</ref>


== Derivation ==
== Derivation ==
Line 14: Line 14:
=== Background ===
=== Background ===
Matrix equations follow the form
Matrix equations follow the form
: <math>\mathbf M \, \vec x = \vec y.</math>
: <math>\mathbf M \, \vec x = \vec y.</math>


The Levinson–Durbin algorithm may be used for any such equation, as long as '''M''' is a known [[Toeplitz matrix]] with a nonzero main diagonal. Here <math>\vec y</math> is a known [[vector space|vector]], and <math>\vec x</math> is an unknown vector of numbers ''x''<sub>''i''</sub> yet to be determined.
The Levinson–Durbin algorithm may be used for any such equation, as long as '''M''' is a known [[Toeplitz matrix]] with a nonzero [[main diagonal]]. Here <math>\vec y</math> is a known [[vector space|vector]], and <math>\vec x</math> is an unknown vector of numbers ''x''<sub>''i''</sub> yet to be determined.


For the sake of this article, ''ê''<sub>''i''</sub> is a vector made up entirely of zeroes, except for its ''i''th place, which holds the value one. Its length will be implicitly determined by the surrounding context. The term ''N'' refers to the width of the matrix above – '''M''' is an ''N''×''N'' matrix. Finally, in this article, superscripts refer to an ''inductive index'', whereas subscripts denote indices. For example (and definition), in this article, the matrix '''T'''<sup>''n''</sup> is an ''n''×''n'' matrix that copies the upper left ''n''×''n'' block from '''M''' – that is, ''T''<sup>''n''</sup><sub>''ij''</sub> = ''M''<sub>''ij''</sub>.
For the sake of this article, ''ê''<sub>''i''</sub> is a vector made up entirely of zeroes, except for its ''i''th place, which holds the value one. Its length will be implicitly determined by the surrounding context. The term ''N'' refers to the width of the matrix above – '''M''' is an ''N''×''N'' matrix. Finally, in this article, superscripts refer to an ''inductive index'', whereas subscripts denote indices. For example (and definition), in this article, the matrix '''T'''<sup>''n''</sup> is an ''n''×''n'' matrix that copies the upper left ''n''×''n'' block from '''M''' – that is, ''T''<sup>''n''</sup><sub>''ij''</sub> = ''M''<sub>''ij''</sub>.


'''T'''<sup>''n''</sup> is also a Toeplitz matrix, meaning that it can be written as
'''T'''<sup>''n''</sup> is also a Toeplitz matrix, meaning that it can be written as
: <math>\mathbf T^n = \begin{bmatrix}
: <math>\mathbf T^n = \begin{bmatrix}
     t_0    & t_{-1}  & t_{-2}  & \dots  & t_{-n+1}  \\
     t_0    & t_{-1}  & t_{-2}  & \dots  & t_{-n+1}  \\
Line 34: Line 32:
The algorithm proceeds in two steps. In the first step, two sets of vectors, called the ''forward'' and ''backward'' vectors, are established. The forward vectors are used to help get the set of backward vectors; then they can be immediately discarded. The backwards vectors are necessary for the second step, where they are used to build the solution desired.
The algorithm proceeds in two steps. In the first step, two sets of vectors, called the ''forward'' and ''backward'' vectors, are established. The forward vectors are used to help get the set of backward vectors; then they can be immediately discarded. The backwards vectors are necessary for the second step, where they are used to build the solution desired.


Levinson–Durbin recursion defines the ''n''<sup>th</sup> "forward vector", denoted <math>\vec f^n</math>, as the vector of length ''n'' which satisfies:
Levinson–Durbin recursion defines the ''n''th "forward vector", denoted <math>\vec f^n</math>, as the vector of length ''n'' which satisfies:
 
: <math>\mathbf T^n \vec f^n = \hat e_1.</math>
:<math>\mathbf T^n \vec f^n = \hat e_1.</math>
 
The ''n''<sup>th</sup> "backward vector" <math>\vec b^n</math> is defined similarly; it is the vector of length ''n'' which satisfies:


:<math>\mathbf T^n \vec b^n = \hat e_n.</math>
The ''n''th "backward vector" <math>\vec b^n</math> is defined similarly; it is the vector of length ''n'' which satisfies:
: <math>\mathbf T^n \vec b^n = \hat e_n.</math>


An important simplification can occur when '''''M''''' is a [[symmetric matrix]]; then the two vectors are related by ''b''<sup>''n''</sup><sub>''i''</sub> = ''f''<sup>''n''</sup><sub>''n''+1−''i''</sub>—that is, they are row-reversals of each other. This can save some extra computation in that special case.
An important simplification can occur when '''''M''''' is a [[symmetric matrix]]; then the two vectors are related by ''b''<sup>''n''</sup><sub>''i''</sub> = ''f''<sup>''n''</sup><sub>''n''+1−''i''</sub> – that is, they are row-reversals of each other. This can save some extra computation in that special case.


=== Obtaining the backward vectors ===
=== Obtaining the backward vectors ===
Even if the matrix is not symmetric, then the ''n''<sup>th</sup> forward and backward vector may be found from the vectors of length ''n''&nbsp;−&nbsp;1 as follows. First, the forward vector may be extended with a zero to obtain:
Even if the matrix is not symmetric, then the ''n''th forward and backward vector may be found from the vectors of length ''n''&nbsp;−&nbsp;1 as follows. First, the forward vector may be extended with a zero to obtain:
 
: <math>\mathbf T^n \begin{bmatrix} \vec f^{n-1} \\ 0 \\ \end{bmatrix} =
:<math>\mathbf T^n \begin{bmatrix} \vec f^{n-1} \\ 0 \\ \end{bmatrix} =
   \begin{bmatrix}
   \begin{bmatrix}
     \        & \              & \    & t_{-n+1}  \\
     \        & \              & \    & t_{-n+1}  \\
Line 67: Line 62:
   \end{bmatrix}. </math>
   \end{bmatrix}. </math>


In going from '''''T''<sup>''n''−1</sup>''' to '''''T<sup>n</sup>''''', the extra ''column'' added to the matrix does not perturb the solution when a zero is used to extend the forward vector. However, the extra ''row'' added to the matrix ''has'' perturbed the solution; and it has created an unwanted error term ''ε''<sub>''f''</sub> which occurs in the last place. The above equation gives it the value of:
In going from '''''T''<sup>''n''−1</sup>''' to '''''T''<sup>''n''</sup>''', the extra ''column'' added to the matrix does not perturb the solution when a zero is used to extend the forward vector. However, the extra ''row'' added to the matrix ''has'' perturbed the solution; and it has created an unwanted error term ''ε''<sub>''f''</sub> which occurs in the last place. The above equation gives it the value of:
 
: <math> \varepsilon_f^n \ = \  \sum_{i=1}^{n-1} \ M_{ni} \  f_{i}^{n-1} \ = \ \sum_{i=1}^{n-1} \  t_{n-i} \ f_{i}^{n-1}. </math>
: <math> \varepsilon_f^n \ = \  \sum_{i=1}^{n-1} \ M_{ni} \  f_{i}^{n-1} \ = \ \sum_{i=1}^{n-1} \  t_{n-i} \ f_{i}^{n-1}. </math>


This error will be returned to shortly and eliminated from the new forward vector; but first, the backwards vector must be extended in a similar (albeit reversed) fashion. For the backwards vector,
This error will be returned to shortly and eliminated from the new forward vector; but first, the backwards vector must be extended in a similar (albeit reversed) fashion. For the backwards vector,
 
: <math> \mathbf T^n \begin{bmatrix} 0 \\ \vec b^{n-1} \\ \end{bmatrix} =
:<math> \mathbf T^n \begin{bmatrix} 0 \\ \vec b^{n-1} \\ \end{bmatrix} =
\begin{bmatrix}
\begin{bmatrix}
     t_0    & \dots & t_{-n+2}        & t_{-n+1} \\
     t_0    & \dots & t_{-n+2}        & t_{-n+1} \\
Line 94: Line 87:


As before, the extra column added to the matrix does not perturb this new backwards vector; but the extra row does. Here we have another unwanted error ''ε''<sub>''b''</sub> with value:
As before, the extra column added to the matrix does not perturb this new backwards vector; but the extra row does. Here we have another unwanted error ''ε''<sub>''b''</sub> with value:
 
: <math> \varepsilon_b^n \ = \ \sum_{i=2}^n \  M_{1i} \ b_{i-1}^{n-1} \ = \ \sum_{i=1}^{n-1} \  t_{-i} \ b_i^{n-1}. \ </math>
:<math> \varepsilon_b^n \ = \ \sum_{i=2}^n \  M_{1i} \ b_{i-1}^{n-1} \ = \ \sum_{i=1}^{n-1} \  t_{-i} \ b_i^{n-1}. \ </math>


These two error terms can be used to form higher-order forward and backward vectors described as follows. Using the linearity of matrices, the following identity holds for all <math>(\alpha,\beta)</math>:
These two error terms can be used to form higher-order forward and backward vectors described as follows. Using the linearity of matrices, the following identity holds for all <math>(\alpha,\beta)</math>:
 
: <math>\mathbf T \left( \alpha
:<math>\mathbf T \left( \alpha
   \begin{bmatrix}
   \begin{bmatrix}
                   \vec f \\
                   \vec f \\
Line 123: Line 114:
   \end{bmatrix}.</math>
   \end{bmatrix}.</math>


If ''α'' and ''β'' are chosen so that the right hand side yields ''ê''<sub>1</sub> or ''ê''<sub>''n''</sub>, then the quantity in the parentheses will fulfill the definition of the ''n''<sup>th</sup> forward or backward vector, respectively. With those alpha and beta chosen, the vector sum in the parentheses is simple and yields the desired result.
If ''α'' and ''β'' are chosen so that the right hand side yields ''ê''<sub>1</sub> or ''ê''<sub>''n''</sub>, then the quantity in the parentheses will fulfill the definition of the ''n''th forward or backward vector, respectively. With those alpha and beta chosen, the vector sum in the parentheses is simple and yields the desired result.


To find these coefficients, <math>\alpha^n_{f}</math>, <math>\beta^n_{f}</math> are such that :
To find these coefficients, <math>\alpha^n_{f}</math>, <math>\beta^n_{f}</math> are such that :
:<math>
: <math>
\vec f^n = \alpha^n_{f} \begin{bmatrix} \vec f^{n-1}\\
\vec f^n = \alpha^n_{f} \begin{bmatrix} \vec f^{n-1}\\
0
0
Line 135: Line 126:
</math>
</math>
and respectively  <math>\alpha^n_{b}</math>, <math>\beta^n_{b}</math> are such that :
and respectively  <math>\alpha^n_{b}</math>, <math>\beta^n_{b}</math> are such that :
:<math>\vec b^n = \alpha^n_{b}
: <math>\vec b^n = \alpha^n_{b}
\begin{bmatrix}
\begin{bmatrix}
\vec f^{n-1}\\
\vec f^{n-1}\\
Line 162: Line 153:


Now, all the zeroes in the middle of the two vectors above being disregarded and collapsed, only the following equation is left:
Now, all the zeroes in the middle of the two vectors above being disregarded and collapsed, only the following equation is left:
: <math> \begin{bmatrix} 1 & \varepsilon^n_b \\ \varepsilon^n_f & 1 \end{bmatrix} \begin{bmatrix} \alpha^n_f & \alpha^n_b \\ \beta^n_f & \beta^n_b \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.</math>
: <math> \begin{bmatrix} 1 & \varepsilon^n_b \\ \varepsilon^n_f & 1 \end{bmatrix} \begin{bmatrix} \alpha^n_f & \alpha^n_b \\ \beta^n_f & \beta^n_b \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.</math>


With these solved for (by using the Cramer 2×2 matrix inverse formula), the new forward and backward vectors are:
With these solved for (by using the Cramer 2×2 [[Invertible matrix|matrix inverse]] formula), the new forward and backward vectors are:
 
: <math>\vec f^n = {1 \over { 1 - \varepsilon_b^n \varepsilon_f^n }}          \begin{bmatrix} \vec f^{n-1} \\ 0 \end{bmatrix}
: <math>\vec f^n = {1 \over { 1 - \varepsilon_b^n \varepsilon_f^n }}          \begin{bmatrix} \vec f^{n-1} \\ 0 \end{bmatrix}
                 - { \varepsilon_f^n \over { 1 - \varepsilon_b^n \varepsilon_f^n }}\begin{bmatrix} 0 \\ \vec b^{n-1} \end{bmatrix}</math>
                 - { \varepsilon_f^n \over { 1 - \varepsilon_b^n \varepsilon_f^n }}\begin{bmatrix} 0 \\ \vec b^{n-1} \end{bmatrix}</math>
: <math>\vec b^n =  { 1 \over { 1 - \varepsilon_b^n \varepsilon_f^n }}        \begin{bmatrix} 0 \\ \vec b^{n-1} \end{bmatrix}
: <math>\vec b^n =  { 1 \over { 1 - \varepsilon_b^n \varepsilon_f^n }}        \begin{bmatrix} 0 \\ \vec b^{n-1} \end{bmatrix}
                 - { \varepsilon_b^n \over { 1 - \varepsilon_b^n \varepsilon_f^n }}\begin{bmatrix} \vec f^{n-1} \\ 0 \end{bmatrix}.</math>
                 - { \varepsilon_b^n \over { 1 - \varepsilon_b^n \varepsilon_f^n }}\begin{bmatrix} \vec f^{n-1} \\ 0 \end{bmatrix}.</math>


Performing these vector summations, then, gives the ''n''<sup>th</sup> forward and backward vectors from the prior ones. All that remains is to find the first of these vectors, and then some quick sums and multiplications give the remaining ones. The first forward and backward vectors are simply:
Performing these vector summations, then, gives the ''n''th forward and backward vectors from the prior ones. All that remains is to find the first of these vectors, and then some quick sums and multiplications give the remaining ones. The first forward and backward vectors are simply:
 
: <math>\vec f^1 = \vec b^1 = \left[ {1 \over M_{11}} \right] = \left[ {1 \over t_0} \right].</math>
: <math>\vec f^1 = \vec b^1 = \left[ {1 \over M_{11}} \right] = \left[ {1 \over t_0} \right].</math>


=== Using the backward vectors ===
=== Using the backward vectors ===
The above steps give the ''N'' backward vectors for '''''M'''''. From there, a more arbitrary equation is:
The above steps give the ''N'' backward vectors for '''''M'''''. From there, a more arbitrary equation is:
: <math> \vec y = \mathbf M \  \vec x. </math>
: <math> \vec y = \mathbf M \  \vec x. </math>


Line 185: Line 171:


The solution is then built recursively by noticing that if
The solution is then built recursively by noticing that if
: <math> \mathbf T^{n-1}
: <math> \mathbf T^{n-1}
   \begin{bmatrix}  x_1^{n-1}    \\
   \begin{bmatrix}  x_1^{n-1}    \\
Line 199: Line 184:


Then, extending with a zero again, and defining an error constant where necessary:
Then, extending with a zero again, and defining an error constant where necessary:
: <math> \mathbf T^{n}
: <math> \mathbf T^{n}
   \begin{bmatrix}  x_1^{n-1}    \\
   \begin{bmatrix}  x_1^{n-1}    \\
Line 214: Line 198:
   \end{bmatrix}.</math>
   \end{bmatrix}.</math>


We can then use the ''n''<sup>th</sup> backward vector to eliminate the error term and replace it with the desired formula as follows:
We can then use the ''n''th backward vector to eliminate the error term and replace it with the desired formula as follows:
 
: <math> \mathbf T^{n} \left (
: <math> \mathbf T^{n} \left (
   \begin{bmatrix}  x_1^{n-1}    \\
   \begin{bmatrix}  x_1^{n-1}    \\
Line 235: Line 218:


== Block Levinson algorithm ==
== Block Levinson algorithm ==
If '''''M''''' is not strictly Toeplitz, but [[block matrix|block]] Toeplitz, the Levinson recursion can be derived in much the same way by regarding the block Toeplitz matrix as a Toeplitz matrix with matrix elements (Musicus 1988). Block Toeplitz matrices arise naturally in signal processing algorithms when dealing with multiple signal streams (e.g., in [[System analysis#Characterization of systems|MIMO]] systems) or cyclo-stationary signals.
If '''''M''''' is not strictly Toeplitz, but [[block matrix|block]] Toeplitz, the Levinson recursion can be derived in much the same way by regarding the block Toeplitz matrix as a Toeplitz matrix with matrix elements (Musicus 1988). Block Toeplitz matrices arise naturally in [[signal processing]] algorithms when dealing with multiple signal streams (e.g., in [[System analysis#Characterization of systems|MIMO]] systems) or cyclo-stationary signals.


== See also ==
== See also ==
*[[Split Levinson recursion]]
* [[Split Levinson recursion]]
*[[Linear prediction]]
* [[Linear prediction]]
*[[Autoregressive model]]
* [[Autoregressive model]]


== Notes ==
== Notes ==
Line 246: Line 229:


== References ==
== References ==
'''Defining sources'''
'''Defining sources'''
* Levinson, N. (1947). "The Wiener RMS error criterion in filter design and prediction." ''J. Math. Phys.'', v. 25, pp.&nbsp;261–278.
* Levinson, N. (1947). "The Wiener RMS error criterion in filter design and prediction". ''J. Math. Phys.'', v. 25, pp.&nbsp;261–278.
* Durbin, J. (1960). "The fitting of time series models." ''Rev. Inst. Int. Stat.'', v. 28, pp.&nbsp;233–243.
* Durbin, J. (1960). "The fitting of time series models". ''Rev. Inst. Int. Stat.'', v. 28, pp.&nbsp;233–243.
* Trench, W. F. (1964).  "An algorithm for the inversion of finite Toeplitz matrices." ''J. Soc. Indust. Appl. Math.'', v. 12, pp.&nbsp;515–522.
* Trench, W. F. (1964).  "An algorithm for the inversion of finite Toeplitz matrices".  ''J. Soc. Indust. Appl. Math.'', v. 12, pp.&nbsp;515–522.
* Musicus, B. R. (1988). "Levinson and Fast Choleski Algorithms for Toeplitz and Almost Toeplitz Matrices." ''RLE TR'' No. 538, MIT. [http://dspace.mit.edu/bitstream/1721.1/4954/1/RLE-TR-538-20174000.pdf]
* Musicus, B. R. (1988). "Levinson and Fast Choleski Algorithms for Toeplitz and Almost Toeplitz Matrices". ''RLE TR'' No. 538, MIT. [http://dspace.mit.edu/bitstream/1721.1/4954/1/RLE-TR-538-20174000.pdf]
* Delsarte, P. and Genin, Y. V. (1986). "The split Levinson algorithm." ''IEEE Transactions on Acoustics, Speech, and Signal Processing'', v. ASSP-34(3), pp.&nbsp;470–478.
* Delsarte, P. and Genin, Y. V. (1986). "The split Levinson algorithm". ''IEEE Transactions on Acoustics, Speech, and Signal Processing'', v. ASSP-34(3), pp.&nbsp;470–478.
 
'''Further work'''
'''Further work'''
*{{cite journal | last1 = Bojanczyk | first1 = A.W. | last2 = Brent | first2 = R.P. | last3 = De Hoog | first3 = F.R. | last4 = Sweet | first4 = D.R. | year = 1995 | title = On the stability of the Bareiss and related Toeplitz factorization algorithms | journal = [[SIAM Journal on Matrix Analysis and Applications]] | volume = 16 | pages = 40–57 | doi = 10.1137/S0895479891221563 | arxiv = 1004.5510 | s2cid = 367586 }}
* {{cite journal | last1 = Bojanczyk | first1 = A.W. | last2 = Brent | first2 = R.P. | last3 = De Hoog | first3 = F.R. | last4 = Sweet | first4 = D.R. | year = 1995 | title = On the stability of the Bareiss and related Toeplitz factorization algorithms | journal = [[SIAM Journal on Matrix Analysis and Applications]] | volume = 16 | pages = 40–57 | doi = 10.1137/S0895479891221563 | arxiv = 1004.5510 | s2cid = 367586 }}
*[[Richard P. Brent|Brent R.P.]] (1999), "Stability of fast algorithms for structured linear systems", ''Fast Reliable Algorithms for Matrices with Structure'' (editors—T. Kailath, A.H. Sayed), ch.4 ([[Society for Industrial and Applied Mathematics|SIAM]]).
* [[Richard P. Brent|Brent R.P.]] (1999), "Stability of fast algorithms for structured linear systems", ''Fast Reliable Algorithms for Matrices with Structure'' (editors T. Kailath, A.H. Sayed), ch.4 ([[Society for Industrial and Applied Mathematics|SIAM]]).
* Bunch, J. R. (1985). "Stability of methods for solving Toeplitz systems of equations." ''SIAM J. Sci. Stat. Comput.'', v. 6, pp.&nbsp;349–364. [https://doi.org/10.1137/0906025]
* Bunch, J. R. (1985). "Stability of methods for solving Toeplitz systems of equations". ''SIAM J. Sci. Stat. Comput.'', v. 6, pp.&nbsp;349–364. [https://doi.org/10.1137/0906025]
*{{cite journal | last = Krishna | first = H. |author2=Wang, Y.  | title = The Split Levinson Algorithm is weakly stable | journal = [[SIAM Journal on Numerical Analysis]] | volume = 30 | issue = 5 | pages = 1498–1508 | date = 1993 | url = http://scitation.aip.org/getabs/servlet/GetabsServlet?prog=normal&id=SJNAAM000030000005001498000001&idtype=cvips&gifs=yes | doi = 10.1137/0730078| url-access = subscription }}
* {{cite journal | last = Krishna | first = H. |author2=Wang, Y.  | title = The Split Levinson Algorithm is weakly stable | journal = [[SIAM Journal on Numerical Analysis]] | volume = 30 | issue = 5 | pages = 1498–1508 | date = 1993 | url = http://scitation.aip.org/getabs/servlet/GetabsServlet?prog=normal&id=SJNAAM000030000005001498000001&idtype=cvips&gifs=yes | doi = 10.1137/0730078| url-access = subscription }}
 
'''Summaries'''
'''Summaries'''
* Bäckström, T. (2004). "2.2. Levinson–Durbin Recursion." ''Linear Predictive Modelling of Speech – Constraints and Line Spectrum Pair Decomposition.'' Doctoral thesis. Report no. 71 / Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing. Espoo, Finland. [http://lib.tkk.fi/Diss/2004/isbn9512269473/isbn9512269473.pdf]
* Bäckström, T. (2004). "2.2. Levinson–Durbin Recursion". ''Linear Predictive Modelling of Speech – Constraints and Line Spectrum Pair Decomposition''. Doctoral thesis. Report no. 71 / Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing. Espoo, Finland. [http://lib.tkk.fi/Diss/2004/isbn9512269473/isbn9512269473.pdf]
* Claerbout, Jon F. (1976). "Chapter 7 – Waveform Applications of Least-Squares." ''Fundamentals of Geophysical Data Processing.''  Palo Alto: Blackwell Scientific Publications. [https://web.archive.org/web/20060921204908/http://sep.stanford.edu/oldreports/fgdp2/fgdp_07.pdf]
* Claerbout, Jon F. (1976). "Chapter 7 – Waveform Applications of Least-Squares". ''Fundamentals of Geophysical Data Processing''. Palo Alto: Blackwell Scientific Publications. [https://web.archive.org/web/20060921204908/http://sep.stanford.edu/oldreports/fgdp2/fgdp_07.pdf]
*{{Citation |last1=Press|first1=WH|last2=Teukolsky|first2=SA|last3=Vetterling|first3=WT|last4=Flannery|first4=BP|year=2007|title=Numerical Recipes: The Art of Scientific Computing|edition=3rd|publisher=Cambridge University Press| publication-place=New York|isbn=978-0-521-88068-8|chapter=Section 2.8.2. Toeplitz Matrices|chapter-url=http://apps.nrbook.com/empanel/index.html?pg=96}}
* {{citation |last1=Press|first1=WH|last2=Teukolsky|first2=SA|last3=Vetterling|first3=WT|last4=Flannery|first4=BP|year=2007|title=Numerical Recipes: The Art of Scientific Computing|edition=3rd|publisher=Cambridge University Press| publication-place=New York|isbn=978-0-521-88068-8|chapter=Section 2.8.2. Toeplitz Matrices|chapter-url=http://apps.nrbook.com/empanel/index.html?pg=96}}
* Golub, G.H., and Loan, C.F. Van (1996). "Section 4.7 : Toeplitz and related Systems" ''Matrix Computations'', Johns Hopkins University Press
* Golub, G.H., and Loan, C.F. Van (1996). "Section 4.7: Toeplitz and related Systems" ''Matrix Computations'', Johns Hopkins University Press


[[Category:Matrices (mathematics)]]
[[Category:Matrices (mathematics)]]
[[Category:Numerical analysis]]
[[Category:Numerical analysis]]

Latest revision as of 13:13, 25 October 2025

Template:Short description Levinson recursion or Levinson–Durbin recursion is a procedure in linear algebra to recursively calculate the solution to an equation involving a Toeplitz matrix. The algorithm runs in Θ(n2)Script error: No such module "Check for unknown parameters". time, which is a strong improvement over Gauss–Jordan elimination, which runs in Θ(n3).

The Levinson–Durbin algorithm was proposed first by Norman Levinson in 1947, improved by James Durbin in 1960, and subsequently improved to 4n2Script error: No such module "Check for unknown parameters". and then 3n2Script error: No such module "Check for unknown parameters". multiplications by W. F. Trench and S. Zohar, respectively.

Other methods to process data include Schur decomposition and Cholesky decomposition. In comparison to these, Levinson recursion (particularly split Levinson recursion) tends to be faster computationally, but more sensitive to computational inaccuracies like round-off errors.

The Bareiss algorithm for Toeplitz matrices (not to be confused with the general Bareiss algorithm) runs about as fast as Levinson recursion, but it uses O(n2)Script error: No such module "Check for unknown parameters". space, whereas Levinson recursion uses only O(n) space. The Bareiss algorithm, though, is numerically stable,[1][2] whereas Levinson recursion is at best only weakly stable (i.e. it exhibits numerical stability for well-conditioned linear systems).[3]

Newer algorithms, called asymptotically fast or sometimes superfast Toeplitz algorithms, can solve in Θ(n (log n)p)Script error: No such module "Check for unknown parameters". for various p (e.g. p = 2,[4][5] p = 3[6]). Levinson recursion remains popular for several reasons; for one, it is relatively easy to understand in comparison; for another, it can be faster than a superfast algorithm for small n (usually n < 256).[7]

Derivation

Background

Matrix equations follow the form

𝐌x=y.

The Levinson–Durbin algorithm may be used for any such equation, as long as M is a known Toeplitz matrix with a nonzero main diagonal. Here y is a known vector, and x is an unknown vector of numbers xi yet to be determined.

For the sake of this article, êi is a vector made up entirely of zeroes, except for its ith place, which holds the value one. Its length will be implicitly determined by the surrounding context. The term N refers to the width of the matrix above – M is an N×N matrix. Finally, in this article, superscripts refer to an inductive index, whereas subscripts denote indices. For example (and definition), in this article, the matrix Tn is an n×n matrix that copies the upper left n×n block from M – that is, Tnij = Mij.

Tn is also a Toeplitz matrix, meaning that it can be written as

𝐓n=[t0t1t2tn+1t1t0t1tn+2t2t1t0tn+3tn1tn2tn3t0].

Introductory steps

The algorithm proceeds in two steps. In the first step, two sets of vectors, called the forward and backward vectors, are established. The forward vectors are used to help get the set of backward vectors; then they can be immediately discarded. The backwards vectors are necessary for the second step, where they are used to build the solution desired.

Levinson–Durbin recursion defines the nth "forward vector", denoted fn, as the vector of length n which satisfies:

𝐓nfn=e^1.

The nth "backward vector" bn is defined similarly; it is the vector of length n which satisfies:

𝐓nbn=e^n.

An important simplification can occur when M is a symmetric matrix; then the two vectors are related by bni = fnn+1−i – that is, they are row-reversals of each other. This can save some extra computation in that special case.

Obtaining the backward vectors

Even if the matrix is not symmetric, then the nth forward and backward vector may be found from the vectors of length n − 1 as follows. First, the forward vector may be extended with a zero to obtain:

𝐓n[fn10]=[   tn+1 𝐓n1 tn+2   tn1tn2t0][ fn1 0 ]=[100εfn].

In going from Tn−1 to Tn, the extra column added to the matrix does not perturb the solution when a zero is used to extend the forward vector. However, the extra row added to the matrix has perturbed the solution; and it has created an unwanted error term εf which occurs in the last place. The above equation gives it the value of:

εfn = i=1n1 Mni fin1 = i=1n1 tni fin1.

This error will be returned to shortly and eliminated from the new forward vector; but first, the backwards vector must be extended in a similar (albeit reversed) fashion. For the backwards vector,

𝐓n[0bn1]=[t0tn+2tn+1   tn2 𝐓n1 tn1  ][ 0 bn1 ]=[εbn001].

As before, the extra column added to the matrix does not perturb this new backwards vector; but the extra row does. Here we have another unwanted error εb with value:

εbn = i=2n M1i bi1n1 = i=1n1 ti bin1. 

These two error terms can be used to form higher-order forward and backward vectors described as follows. Using the linearity of matrices, the following identity holds for all (α,β):

𝐓(α[f 0]+β[0 b])=α[100εf]+β[εb001].

If α and β are chosen so that the right hand side yields ê1 or ên, then the quantity in the parentheses will fulfill the definition of the nth forward or backward vector, respectively. With those alpha and beta chosen, the vector sum in the parentheses is simple and yields the desired result.

To find these coefficients, αfn, βfn are such that :

fn=αfn[fn10]+βfn[0bn1]

and respectively αbn, βbn are such that :

bn=αbn[fn10]+βbn[0bn1].

By multiplying both previous equations by 𝐓n one gets the following equation:

[1εbn0000εfn1][αfnαbnβfnβbn]=[10000001].

Now, all the zeroes in the middle of the two vectors above being disregarded and collapsed, only the following equation is left:

[1εbnεfn1][αfnαbnβfnβbn]=[1001].

With these solved for (by using the Cramer 2×2 matrix inverse formula), the new forward and backward vectors are:

fn=11εbnεfn[fn10]εfn1εbnεfn[0bn1]
bn=11εbnεfn[0bn1]εbn1εbnεfn[fn10].

Performing these vector summations, then, gives the nth forward and backward vectors from the prior ones. All that remains is to find the first of these vectors, and then some quick sums and multiplications give the remaining ones. The first forward and backward vectors are simply:

f1=b1=[1M11]=[1t0].

Using the backward vectors

The above steps give the N backward vectors for M. From there, a more arbitrary equation is:

y=𝐌 x.

The solution can be built in the same recursive way that the backwards vectors were built. Accordingly, x must be generalized to a sequence of intermediates xn, such that xN=x.

The solution is then built recursively by noticing that if

𝐓n1[x1n1x2n1xn1n1]=[y1y2yn1].

Then, extending with a zero again, and defining an error constant where necessary:

𝐓n[x1n1x2n1xn1n10]=[y1y2yn1εxn1].

We can then use the nth backward vector to eliminate the error term and replace it with the desired formula as follows:

𝐓n([x1n1x2n1xn1n10]+(ynεxn1) bn)=[y1y2yn1yn].

Extending this method until n = N yields the solution x.

In practice, these steps are often done concurrently with the rest of the procedure, but they form a coherent unit and deserve to be treated as their own step.

Block Levinson algorithm

If M is not strictly Toeplitz, but block Toeplitz, the Levinson recursion can be derived in much the same way by regarding the block Toeplitz matrix as a Toeplitz matrix with matrix elements (Musicus 1988). Block Toeplitz matrices arise naturally in signal processing algorithms when dealing with multiple signal streams (e.g., in MIMO systems) or cyclo-stationary signals.

See also

Notes

<templatestyles src="Reflist/styles.css" />

  1. Bojanczyk et al. (1995).
  2. Brent (1999).
  3. Krishna & Wang (1993).
  4. Script error: No such module "citation/CS1".
  5. Script error: No such module "citation/CS1".
  6. Script error: No such module "citation/CS1".
  7. Script error: No such module "citation/CS1".

Script error: No such module "Check for unknown parameters".

References

Defining sources

  • Levinson, N. (1947). "The Wiener RMS error criterion in filter design and prediction". J. Math. Phys., v. 25, pp. 261–278.
  • Durbin, J. (1960). "The fitting of time series models". Rev. Inst. Int. Stat., v. 28, pp. 233–243.
  • Trench, W. F. (1964). "An algorithm for the inversion of finite Toeplitz matrices". J. Soc. Indust. Appl. Math., v. 12, pp. 515–522.
  • Musicus, B. R. (1988). "Levinson and Fast Choleski Algorithms for Toeplitz and Almost Toeplitz Matrices". RLE TR No. 538, MIT. [1]
  • Delsarte, P. and Genin, Y. V. (1986). "The split Levinson algorithm". IEEE Transactions on Acoustics, Speech, and Signal Processing, v. ASSP-34(3), pp. 470–478.

Further work

  • Script error: No such module "Citation/CS1".
  • Brent R.P. (1999), "Stability of fast algorithms for structured linear systems", Fast Reliable Algorithms for Matrices with Structure (editors T. Kailath, A.H. Sayed), ch.4 (SIAM).
  • Bunch, J. R. (1985). "Stability of methods for solving Toeplitz systems of equations". SIAM J. Sci. Stat. Comput., v. 6, pp. 349–364. [2]
  • Script error: No such module "Citation/CS1".

Summaries

  • Bäckström, T. (2004). "2.2. Levinson–Durbin Recursion". Linear Predictive Modelling of Speech – Constraints and Line Spectrum Pair Decomposition. Doctoral thesis. Report no. 71 / Helsinki University of Technology, Laboratory of Acoustics and Audio Signal Processing. Espoo, Finland. [3]
  • Claerbout, Jon F. (1976). "Chapter 7 – Waveform Applications of Least-Squares". Fundamentals of Geophysical Data Processing. Palo Alto: Blackwell Scientific Publications. [4]
  • Script error: No such module "citation/CS1".
  • Golub, G.H., and Loan, C.F. Van (1996). "Section 4.7: Toeplitz and related Systems" Matrix Computations, Johns Hopkins University Press