Schur complement
Latest revision as of 06:10, 29 October 2025
The Schur complement is a key tool in the fields of linear algebra, the theory of matrices, numerical analysis, and statistics.
It is defined for a block matrix. Suppose p, q are nonnegative integers such that p + q > 0, and suppose A, B, C, D are respectively p × p, p × q, q × p, and q × q matrices of complex numbers. Let
<math display="block">M = \left[\begin{matrix} A & B \\ C & D \end{matrix}\right]</math>
so that M is a (p + q) × (p + q) matrix.
If D is invertible, then the Schur complement of the block D of the matrix M is the p × p matrix defined by
<math display="block">M/D := A - BD^{-1}C.</math>
If A is invertible, the Schur complement of the block A of the matrix M is the q × q matrix defined by
<math display="block">M/A := D - CA^{-1}B.</math>
In the case that A or D is singular, substituting a generalized inverse for the inverses on M/A and M/D yields the generalized Schur complement.
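As an illustrative sketch (not part of the article), both complements can be formed numerically; the block sizes p = 2, q = 3 and the random blocks below are arbitrary choices for the example:

```python
import numpy as np

# Hypothetical example data: block sizes and entries are arbitrary choices.
rng = np.random.default_rng(0)
p, q = 2, 3
A = rng.standard_normal((p, p)) + p * np.eye(p)   # diagonal shift keeps A invertible
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q)) + q * np.eye(q)   # diagonal shift keeps D invertible

M = np.block([[A, B], [C, D]])                    # (p+q) x (p+q)

# Schur complement of D in M (a p x p matrix): M/D = A - B D^{-1} C
M_over_D = A - B @ np.linalg.solve(D, C)
# Schur complement of A in M (a q x q matrix): M/A = D - C A^{-1} B
M_over_A = D - C @ np.linalg.solve(A, B)
```

Note the use of `np.linalg.solve` rather than forming explicit inverses, which is the numerically preferred way to apply <math>D^{-1}</math> and <math>A^{-1}</math>.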
The Schur complement is named after Issai Schur[1] who used it to prove Schur's lemma, although it had been used previously.[2] Emilie Virginia Haynsworth was the first to call it the Schur complement.[3] The Schur complement is sometimes referred to as the Feshbach map after the physicist Herman Feshbach.[4]
Background
The Schur complement arises when performing a block Gaussian elimination on the matrix M. In order to eliminate the elements below the block diagonal, one multiplies the matrix M by a block lower triangular matrix on the right as follows:
<math display="block">M \left[\begin{matrix} I_p & 0 \\ -D^{-1}C & I_q \end{matrix}\right] = \left[\begin{matrix} A - BD^{-1}C & B \\ 0 & D \end{matrix}\right],</math>
where Ip denotes a p×p identity matrix. As a result, the Schur complement <math>M/D = A - BD^{-1}C</math> appears in the upper-left p×p block.
Continuing the elimination process beyond this point (i.e., performing a block Gauss–Jordan elimination), leads to an LDU decomposition of M, which reads
<math display="block">M = \left[\begin{matrix} I_p & BD^{-1} \\ 0 & I_q \end{matrix}\right]\left[\begin{matrix} A - BD^{-1}C & 0 \\ 0 & D \end{matrix}\right]\left[\begin{matrix} I_p & 0 \\ D^{-1}C & I_q \end{matrix}\right].</math>
Thus, the inverse of M may be expressed involving <math>D^{-1}</math> and the inverse of Schur's complement, assuming it exists, as
<math display="block">M^{-1} = \left[\begin{matrix} I_p & 0 \\ -D^{-1}C & I_q \end{matrix}\right]\left[\begin{matrix} (M/D)^{-1} & 0 \\ 0 & D^{-1} \end{matrix}\right]\left[\begin{matrix} I_p & -BD^{-1} \\ 0 & I_q \end{matrix}\right] = \left[\begin{matrix} (M/D)^{-1} & -(M/D)^{-1}BD^{-1} \\ -D^{-1}C(M/D)^{-1} & D^{-1} + D^{-1}C(M/D)^{-1}BD^{-1} \end{matrix}\right].</math>
The above relationship comes from the elimination operations that involve <math>D^{-1}</math> and M/D. An equivalent derivation can be done with the roles of A and D interchanged. By equating the expressions for <math>M^{-1}</math> obtained in these two different ways, one can establish the matrix inversion lemma, which relates the two Schur complements of M: M/D and M/A (see "Derivation from LDU decomposition" in the article on the Woodbury matrix identity).
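The block LDU decomposition above can be checked numerically; this is a minimal sketch with arbitrary hypothetical blocks (any choice with D invertible works):

```python
import numpy as np

# Hypothetical data; any blocks with D invertible work for this sketch.
rng = np.random.default_rng(1)
p, q = 2, 2
A = rng.standard_normal((p, p))
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q)) + 2 * np.eye(q)   # diagonal shift keeps D invertible
M = np.block([[A, B], [C, D]])

Ip, Iq = np.eye(p), np.eye(q)
Z = np.zeros((p, q))
Dinv = np.linalg.inv(D)
S = A - B @ Dinv @ C                              # Schur complement M/D

upper = np.block([[Ip, B @ Dinv], [Z.T, Iq]])     # block upper triangular factor
middle = np.block([[S, Z], [Z.T, D]])             # block diagonal factor
lower = np.block([[Ip, Z], [Dinv @ C, Iq]])       # block lower triangular factor

# The product of the three factors recovers M.
reconstructed = upper @ middle @ lower
```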
Properties
- If p and q are both 1 (i.e., A, B, C and D are all scalars), we get the familiar formula for the inverse of a 2-by-2 matrix:
<math display="block">M^{-1} = \left[\begin{matrix} A & B \\ C & D \end{matrix}\right]^{-1} = \frac{1}{AD - BC}\left[\begin{matrix} D & -B \\ -C & A \end{matrix}\right],</math>
  provided that AD − BC is non-zero.
- In general, if A is invertible, then
<math display="block">M^{-1} = \left[\begin{matrix} A^{-1} + A^{-1}B(M/A)^{-1}CA^{-1} & -A^{-1}B(M/A)^{-1} \\ -(M/A)^{-1}CA^{-1} & (M/A)^{-1} \end{matrix}\right],</math>
  whenever this inverse exists.
- (Schur's formula) When A, respectively D, is invertible, the determinant of M is also clearly seen to be given by
<math display="block">\det(M) = \det(A)\det\left(D - CA^{-1}B\right),</math>
  respectively
<math display="block">\det(M) = \det(D)\det\left(A - BD^{-1}C\right),</math>
  which generalizes the determinant formula for 2 × 2 matrices.
- (Guttman rank additivity formula) If D is invertible, then the rank of M is given by
<math display="block">\operatorname{rank}(M) = \operatorname{rank}(D) + \operatorname{rank}\left(A - BD^{-1}C\right).</math>
- (Haynsworth inertia additivity formula) If A is invertible, then the inertia of the block matrix M is equal to the inertia of A plus the inertia of M/A.
- (Quotient identity) <math>A/B = ((A/C)/(B/C))</math>.[5]
- The Schur complement of a Laplacian matrix is also a Laplacian matrix.[6]
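Two of the properties above, Schur's determinant formula and the Guttman rank additivity formula, lend themselves to a quick numerical sanity check; the blocks below are arbitrary hypothetical data (the diagonal shifts only serve to keep A and D invertible):

```python
import numpy as np

# Hypothetical blocks; shifts keep A and D invertible for the identities below.
rng = np.random.default_rng(2)
A = rng.standard_normal((2, 2)) + 2 * np.eye(2)
B = rng.standard_normal((2, 3))
C = rng.standard_normal((3, 2))
D = rng.standard_normal((3, 3)) + 3 * np.eye(3)
M = np.block([[A, B], [C, D]])

# Schur's determinant formula: det M = det(A) * det(D - C A^{-1} B)
det_via_schur = np.linalg.det(A) * np.linalg.det(D - C @ np.linalg.solve(A, B))

# Guttman rank additivity: rank M = rank(D) + rank(A - B D^{-1} C)
rank_via_schur = (np.linalg.matrix_rank(D)
                  + np.linalg.matrix_rank(A - B @ np.linalg.solve(D, C)))
```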
Application to solving linear equations
The Schur complement arises naturally in solving a system of linear equations such as[7]
<math display="block">\left[\begin{matrix} A & B \\ C & D \end{matrix}\right]\left[\begin{matrix} x \\ y \end{matrix}\right] = \left[\begin{matrix} u \\ v \end{matrix}\right],</math>
where <math>x</math>, <math>u</math> are p-dimensional column vectors, <math>y</math>, <math>v</math> are q-dimensional column vectors, and A, B, C, D are as above.
Assuming that the submatrix <math>A</math> is invertible, we can eliminate <math>x</math> from the equations, as follows:
<math display="block">x = A^{-1}(u - By).</math>
Substituting this expression into the second equation yields
<math display="block">\left(D - CA^{-1}B\right)y = v - CA^{-1}u.</math>
We refer to this as the reduced equation obtained by eliminating <math>x</math> from the original equation. The matrix appearing in the reduced equation is called the Schur complement of the first block <math>A</math> in <math>M</math>:
<math display="block">S := D - CA^{-1}B.</math>
Solving the reduced equation, we obtain
<math display="block">y = S^{-1}\left(v - CA^{-1}u\right).</math>
Substituting this into the first equation yields
<math display="block">x = \left(A^{-1} + A^{-1}BS^{-1}CA^{-1}\right)u - A^{-1}BS^{-1}v.</math>
We can express the above two equations as:
<math display="block">\left[\begin{matrix} x \\ y \end{matrix}\right] = \left[\begin{matrix} A^{-1} + A^{-1}BS^{-1}CA^{-1} & -A^{-1}BS^{-1} \\ -S^{-1}CA^{-1} & S^{-1} \end{matrix}\right]\left[\begin{matrix} u \\ v \end{matrix}\right].</math>
Therefore, a formulation for the inverse of a block matrix is:
<math display="block">\left[\begin{matrix} A & B \\ C & D \end{matrix}\right]^{-1} = \left[\begin{matrix} A^{-1} + A^{-1}BS^{-1}CA^{-1} & -A^{-1}BS^{-1} \\ -S^{-1}CA^{-1} & S^{-1} \end{matrix}\right].</math>
In particular, we see that the Schur complement <math>S</math> is the inverse of the lower-right (q × q) block entry of the inverse of <math>M</math>.
In practice, one needs <math>A</math> to be well-conditioned in order for this algorithm to be numerically accurate.
This method is useful in electrical engineering to reduce the dimension of a network's equations. It is especially useful when element(s) of the output vector are zero. For example, when <math>u</math> or <math>v</math> is zero, we can eliminate the associated rows of the coefficient matrix without any changes to the rest of the output vector. If <math>v</math> is null then the above equation for <math>x</math> reduces to <math>x = \left(A^{-1} + A^{-1} B S^{-1} C A^{-1}\right) u</math>, thus reducing the dimension of the coefficient matrix while leaving <math>u</math> unmodified. This is used to advantage in electrical engineering where it is referred to as node elimination or Kron reduction.
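The elimination procedure above can be sketched numerically: solve the reduced q × q equation first, then back-substitute, and compare against a direct solve of the full system. The sizes and data below are arbitrary hypothetical choices:

```python
import numpy as np

# Hypothetical system: sizes and entries are arbitrary; A must be invertible.
rng = np.random.default_rng(3)
p, q = 3, 2
A = rng.standard_normal((p, p)) + 3 * np.eye(p)   # shift keeps A invertible
B = rng.standard_normal((p, q))
C = rng.standard_normal((q, p))
D = rng.standard_normal((q, q)) + 2 * np.eye(q)
u = rng.standard_normal(p)
v = rng.standard_normal(q)

# Reduced equation: (D - C A^{-1} B) y = v - C A^{-1} u
S = D - C @ np.linalg.solve(A, B)          # Schur complement of A
y = np.linalg.solve(S, v - C @ np.linalg.solve(A, u))
x = np.linalg.solve(A, u - B @ y)          # back-substitution for x

# Cross-check against a direct solve of the full (p+q) x (p+q) system.
direct = np.linalg.solve(np.block([[A, B], [C, D]]), np.concatenate([u, v]))
```

Only a p × p and a q × q system are solved this way, which is the dimension reduction the text describes.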
Applications to probability theory and statistics
Suppose the random column vectors X, Y live in Rn and Rm respectively, and the vector (X, Y) in Rn + m has a multivariate normal distribution whose covariance is the symmetric positive-definite matrix
<math display="block">\Sigma = \left[\begin{matrix} A & B \\ B^\mathrm{T} & C \end{matrix}\right],</math>
where <math>A \in \mathbb{R}^{n \times n}</math> is the covariance matrix of X, <math>C \in \mathbb{R}^{m \times m}</math> is the covariance matrix of Y and <math>B \in \mathbb{R}^{n \times m}</math> is the covariance matrix between X and Y.
Then the conditional covariance of X given Y is the Schur complement of C in <math>\Sigma</math>:[8]
<math display="block">\begin{align}
\operatorname{Cov}(X \mid Y) &= A - BC^{-1}B^\mathrm{T}, \\
\operatorname{E}(X \mid Y) &= \operatorname{E}(X) + BC^{-1}(Y - \operatorname{E}(Y)).
\end{align}</math>
If we take the matrix <math>\Sigma</math> above to be, not a covariance of a random vector, but a sample covariance, then it may have a Wishart distribution. In that case, the Schur complement of C in <math>\Sigma</math> also has a Wishart distribution.[citation needed]
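As a hedged numerical sketch (with an arbitrary hypothetical positive-definite <math>\Sigma</math>), the conditional covariance computed as the Schur complement agrees with the well-known fact that it equals the inverse of the X-block of the precision matrix <math>\Sigma^{-1}</math>:

```python
import numpy as np

# Hypothetical covariance: G G^T plus a diagonal shift is positive definite.
rng = np.random.default_rng(4)
n, m = 2, 3
G = rng.standard_normal((n + m, n + m))
Sigma = G @ G.T + (n + m) * np.eye(n + m)

A = Sigma[:n, :n]      # Cov(X)
B = Sigma[:n, n:]      # Cov(X, Y)
C = Sigma[n:, n:]      # Cov(Y)

# Conditional covariance of X given Y: the Schur complement of C in Sigma.
cond_cov = A - B @ np.linalg.solve(C, B.T)

# Cross-check: it equals the inverse of the X-block of the precision matrix,
# by the block-inverse formula from the Properties section.
precision_xx = np.linalg.inv(Sigma)[:n, :n]
```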
Conditions for positive definiteness and semi-definiteness
Let X be a symmetric matrix of real numbers given by
<math display="block">X = \left[\begin{matrix} A & B \\ B^\mathrm{T} & C \end{matrix}\right].</math>
Then by the Haynsworth inertia additivity formula, we find
- If A is invertible, then X is positive definite if and only if A and its complement X/A are both positive definite:[2]: 34 
<math display="block">X \succ 0 \Leftrightarrow A \succ 0,\ X/A = C - B^\mathrm{T} A^{-1} B \succ 0.</math>
- If C is invertible, then X is positive definite if and only if C and its complement X/C are both positive definite:
<math display="block">X \succ 0 \Leftrightarrow C \succ 0,\ X/C = A - B C^{-1} B^\mathrm{T} \succ 0.</math>
- If A is positive definite, then X is positive semi-definite if and only if the complement X/A is positive semi-definite:[2]
<math display="block">\text{If } A \succ 0,\text{ then } X \succeq 0 \Leftrightarrow X/A = C - B^\mathrm{T} A^{-1} B \succeq 0.</math>
- If C is positive definite, then X is positive semi-definite if and only if the complement X/C is positive semi-definite:
<math display="block">\text{If } C \succ 0,\text{ then } X \succeq 0 \Leftrightarrow X/C = A - B C^{-1} B^\mathrm{T} \succeq 0.</math>
The first and third statements can also be derived[7] by considering the minimizer of the quantity
<math display="block">u^\mathrm{T} A u + 2 v^\mathrm{T} B^\mathrm{T} u + v^\mathrm{T} C v</math>
as a function of v (for fixed u).
Furthermore, since
<math display="block">\left[\begin{matrix} A & B \\ B^\mathrm{T} & C \end{matrix}\right] \succ 0 \Longleftrightarrow \left[\begin{matrix} C & B^\mathrm{T} \\ B & A \end{matrix}\right] \succ 0</math>
and similarly for positive semi-definite matrices, the second (respectively fourth) statement is immediate from the first (resp. third) statement.
There is also a sufficient and necessary condition for the positive semi-definiteness of X in terms of a generalized Schur complement.[2] Precisely,
- <math>X \succeq 0 \Leftrightarrow A \succeq 0,\ C - B^\mathrm{T} A^g B \succeq 0,\ \left(I - A A^{g}\right)B = 0</math> and
- <math>X \succeq 0 \Leftrightarrow C \succeq 0,\ A - B C^g B^\mathrm{T} \succeq 0,\ \left(I - C C^g\right)B^\mathrm{T} = 0,</math>
where <math>A^g</math> denotes a generalized inverse of <math>A</math>.
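The definiteness conditions above can be exercised numerically. This sketch (with hypothetical data) constructs X so that, with A positive definite, the Schur complement X/A is either +I or −I, and checks that X is positive semi-definite exactly when X/A is:

```python
import numpy as np

def is_psd(X, tol=1e-10):
    """Check positive semi-definiteness via the symmetric eigenvalues."""
    return bool(np.all(np.linalg.eigvalsh(X) >= -tol))

# Hypothetical construction: with A positive definite, X is PSD exactly
# when the Schur complement X/A = C - B^T A^{-1} B is PSD.
rng = np.random.default_rng(5)
n, m = 2, 2
A = 2.0 * np.eye(n)                          # positive definite by construction
B = rng.standard_normal((n, m))

for shift in (+1.0, -1.0):                   # makes X/A equal +I or -I
    C = B.T @ np.linalg.solve(A, B) + shift * np.eye(m)
    X = np.block([[A, B], [B.T, C]])
    schur = C - B.T @ np.linalg.solve(A, B)
    assert is_psd(X) == is_psd(schur)
```

Testing definiteness through the smaller Schur complement rather than the full matrix is exactly how these conditions are used in semidefinite programming.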
See also
- Woodbury matrix identity
- Quasi-Newton method
- Haynsworth inertia additivity formula
- Gaussian process
- Total least squares
- Guyan reduction in computational mechanics
References
- ↑ Schur, J. (1917). "Über Potenzreihen die im Inneren des Einheitskreises beschränkt sind". J. Reine U. Angewandte Mathematik. 147.
- ↑ a b c d Zhang, Fuzhen, ed. (2005). The Schur Complement and Its Applications. Springer.
- ↑ Haynsworth, E. V., "On the Schur Complement", Basel Mathematical Notes, #BNB 20, 17 pages, June 1968.
- ↑ Feshbach, Herman (1958). "Unified theory of nuclear reactions". Annals of Physics. 5: 357–390.
- ↑ Crabtree, Douglas E.; Haynsworth, Emilie V. (1969). "An identity for the Schur complement of a matrix". Proceedings of the American Mathematical Society. 22 (2): 364–366. doi:10.1090/S0002-9939-1969-0255573-1.
- ↑ Devriendt, Karel (2022). "Effective resistance is more than distance: Laplacians, Simplices and the Schur complement". Linear Algebra and Its Applications. 639: 24–49. arXiv:2010.04521. doi:10.1016/j.laa.2022.01.002.
- ↑ a b Boyd, S. and Vandenberghe, L. (2004), "Convex Optimization", Cambridge University Press (Appendix A.5.5)
- ↑ von Mises, Richard (1964). Mathematical Theory of Probability and Statistics. Academic Press. Chapter VIII.9.3. ISBN 978-1483255385.