{{short description|Algorithm for finding a zero of a function}}
{{about|searching zeros of continuous functions|searching a finite sorted array|binary search algorithm|the method of determining what software change caused a change in behavior|Bisection (software engineering)}}
[[Image:Bisection method.svg|250px|thumb|A few steps of the bisection method applied over the starting interval [a<sub>1</sub>;b<sub>1</sub>]. The larger red dot is the root of the function.]]
In [[mathematics]], the '''bisection method''' is a [[Root-finding algorithm|root-finding method]] that applies to any [[continuous function]] for which one knows two values with opposite signs. The method consists of repeatedly [[Bisection|bisecting]] the [[Interval (mathematics)|interval]] defined by these values and then selecting the subinterval in which the function changes sign, and therefore must contain a [[Root of a function|root]]. It is a very simple and robust method, but it is also relatively slow. Because of this, it is often used to obtain a rough approximation to a solution which is then used as a starting point for more rapidly converging methods.<ref>{{Harvnb|Burden|Faires|1985|p=31}}</ref> The method is also called the '''interval halving''' method,<ref>{{cite web |url=http://siber.cankaya.edu.tr/NumericalComputations/ceng375/node32.html |title=Interval Halving (Bisection) |access-date=2013-11-07 |url-status=dead |archive-url=https://web.archive.org/web/20130519092250/http://siber.cankaya.edu.tr/NumericalComputations/ceng375/node32.html |archive-date=2013-05-19 }}</ref> the '''[[Binary search algorithm|binary search method]]''',<ref>{{Harvnb|Burden|Faires|1985|p=28}}</ref> or the '''dichotomy method'''.<ref>{{Cite web|title = Dichotomy method - Encyclopedia of Mathematics|url = https://www.encyclopediaofmath.org/index.php/Dichotomy_method|website = www.encyclopediaofmath.org|access-date = 2015-12-21}}</ref>

For [[polynomial]]s, more elaborate methods exist for testing the existence of a root in an interval ([[Descartes' rule of signs]], [[Sturm's theorem]], [[Budan's theorem]]). They allow extending the bisection method into efficient algorithms for finding all real roots of a polynomial; see [[Real-root isolation]].

== The method ==
The method is applicable for numerically solving the equation ''f''(''x'') = 0 for the [[Real number|real]] variable ''x'', where ''f'' is a [[continuous function]] defined on an interval [''a'', ''b''] and where ''f''(''a'') and ''f''(''b'') have opposite signs. In this case ''a'' and ''b'' are said to bracket a root since, by the [[intermediate value theorem]], the continuous function ''f'' must have at least one root in the interval (''a'', ''b'').

At each step the method divides the interval in two halves by computing the midpoint ''c'' = (''a'' + ''b'')/2 of the interval and the value of the function ''f''(''c'') at that point. If ''c'' itself is a root then the process has succeeded and stops. Otherwise, there are now only two possibilities: either ''f''(''a'') and ''f''(''c'') have opposite signs and bracket a root, or ''f''(''c'') and ''f''(''b'') have opposite signs and bracket a root.<ref>If the function has the same sign at the endpoints of an interval, the endpoints may or may not bracket roots of the function.</ref> The method selects the subinterval that is guaranteed to be a bracket as the new interval to be used in the next step. In this way an interval that contains a zero of ''f'' is reduced in width by 50% at each step. The process is continued until the interval is sufficiently small.

Explicitly, if ''f''(''c'') = 0 then ''c'' may be taken as the solution and the process stops. Otherwise, if ''f''(''a'') and ''f''(''c'') have opposite signs, then the method sets ''c'' as the new value for ''b'', and if ''f''(''b'') and ''f''(''c'') have opposite signs then the method sets ''c'' as the new ''a''. In both cases, the new ''f''(''a'') and ''f''(''b'') have opposite signs, so the method is applicable to this smaller interval.<ref>{{Harvnb|Burden|Faires|1985|p=28}} for section</ref>
=== Iteration tasks ===
The input for the method is a continuous function ''f'', an interval [''a'', ''b''], and the function values ''f''(''a'') and ''f''(''b''). The function values are of opposite sign (there is at least one zero crossing within the interval). Each iteration performs these steps:
# Calculate ''c'', the midpoint of the interval, ''c'' = {{sfrac|''a'' + ''b''|2}}.
# Calculate the function value at the midpoint, ''f''(''c'').
# If convergence is satisfactory (that is, ''c'' − ''a'' is sufficiently small, or |''f''(''c'')| is sufficiently small), return ''c'' and stop iterating.
# Examine the sign of ''f''(''c'') and replace either (''a'', ''f''(''a'')) or (''b'', ''f''(''b'')) with (''c'', ''f''(''c'')) so that there is a zero crossing within the new interval.
When implementing the method on a computer, there can be problems with finite precision, so there are often additional convergence tests or limits to the number of iterations. Although ''f'' is continuous, finite precision may preclude a function value ever being zero. For example, consider {{math|1=''f''(''x'') = cos ''x''}}; there is no floating-point value approximating {{math|1=''x'' = {{pi}}/2}} that gives exactly zero. Additionally, the difference between ''a'' and ''b'' is limited by the floating point precision; i.e., as the difference between ''a'' and ''b'' decreases, at some point the midpoint of {{math|[''a'', ''b'']}} will be numerically identical to (within floating point precision of) either ''a'' or ''b''.
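For instance, the following Python snippet illustrates both limits in double-precision arithmetic:

<syntaxhighlight lang="python">
import math

# cos has a root at pi/2, but no double is exactly pi/2, so the
# computed function value at the best possible argument is not zero.
a = math.pi / 2
print(math.cos(a))               # about 6.1e-17, small but nonzero

# Once a and b are adjacent doubles, the midpoint rounds to an
# endpoint, so the interval [a, b] cannot shrink any further.
b = math.nextafter(a, math.inf)  # smallest double greater than a (Python 3.9+)
c = (a + b) / 2
print(c == a or c == b)          # True
</syntaxhighlight>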
=== Algorithm ===
The method may be written in [[pseudocode]] as follows:<ref>{{Harvnb|Burden|Faires|1985|p=29}}. This version recomputes the function values at each iteration rather than carrying them to the next iterations.</ref>
 '''input:''' Function ''f'',
        endpoint values ''a'', ''b'',
        tolerance ''TOL'',
        maximum iterations ''NMAX''
 '''conditions:''' ''a'' < ''b'',
        either ''f''(''a'') < 0 and ''f''(''b'') > 0 or ''f''(''a'') > 0 and ''f''(''b'') < 0
 '''output:''' value which differs from a root of ''f''(''x'') = 0 by less than ''TOL''
 
 ''N'' ← 1
 '''while''' ''N'' ≤ ''NMAX'' '''do''' ''// limit iterations to prevent infinite loop''
     ''c'' ← (''a'' + ''b'')/2 ''// new midpoint''
     '''if''' ''f''(''c'') = 0 or (''b'' − ''a'')/2 < ''TOL'' '''then''' ''// solution found''
         Output(''c'')
         '''Stop'''
     '''end if'''
     ''N'' ← ''N'' + 1 ''// increment step counter''
     '''if''' sign(''f''(''c'')) = sign(''f''(''a'')) '''then''' ''a'' ← ''c'' '''else''' ''b'' ← ''c'' ''// new interval''
 '''end while'''
 Output("Method failed.") ''// max number of steps exceeded''
=== Example: Finding the root of a polynomial ===
Suppose that the bisection method is used to find a root of the polynomial
:<math> f(x) = x^3 - x - 2 \,.</math>
First, two numbers <math> a </math> and <math> b </math> have to be found such that <math>f(a)</math> and <math>f(b)</math> have opposite signs. For the above function, <math> a = 1 </math> and <math> b = 2 </math> satisfy this criterion, as
:<math> f(1) = (1)^3 - (1) - 2 = -2 </math>
and
:<math> f(2) = (2)^3 - (2) - 2 = +4 \,.</math>
Because the function is continuous, there must be a root within the interval [1, 2].

In the first iteration, the end points of the interval which brackets the root are <math> a_1 = 1 </math> and <math> b_1 = 2 </math>, so the midpoint is
:<math> c_1 = \frac{2+1}{2} = 1.5 </math>
The function value at the midpoint is <math> f(c_1) = (1.5)^3 - (1.5) - 2 = -0.125 </math>. Because <math> f(c_1) </math> is negative, <math> a = 1 </math> is replaced with <math> a = 1.5 </math> for the next iteration, ensuring that <math> f(a) </math> and <math> f(b) </math> have opposite signs. As this continues, the interval between <math> a </math> and <math> b </math> becomes increasingly small, converging on the root of the function, as the table below shows.
{| class="wikitable" | |||
! Iteration !! <math>a_n</math> !! <math>b_n</math> !! <math>c_n</math> !! <math>f(c_n)</math> | |||
|- style="text-align: left;" | |||
| style="text-align: right;" |1|| 1 || 2 || 1.5 || −0.125 | |||
|- style="text-align: left;" | |||
| style="text-align: right;" |2|| 1.5|| 2|| 1.75|| style="text-align: right;" | 1.6093750 | |||
|- style="text-align: left;" | |||
| style="text-align: right;" |3|| 1.5|| 1.75|| 1.625|| style="text-align: right;" | 0.6660156 | |||
|- style="text-align: left;" | |||
| style="text-align: right;" |4|| 1.5|| 1.625|| 1.5625|| style="text-align: right;" | 0.2521973 | |||
|- style="text-align: left;" | |||
| style="text-align: right;" |5|| 1.5|| 1.5625|| 1.5312500|| style="text-align: right;" | 0.0591125 | |||
|- style="text-align: left;" | |||
< | | style="text-align: right;" |6|| 1.5|| 1.5312500|| 1.5156250|| −0.0340538 | ||
|- style="text-align: left;" | |||
| style="text-align: right;" |7|| 1.5156250|| 1.5312500|| 1.5234375|| style="text-align: right;" | 0.0122504 | |||
|- style="text-align: left;" | |||
| style="text-align: right;" |8|| 1.5156250|| 1.5234375|| 1.5195313|| −0.0109712 | |||
|- style="text-align: left;" | |||
</ | | style="text-align: right;" |9|| 1.5195313|| 1.5234375|| 1.5214844|| style="text-align: right;" | 0.0006222 | ||
|- style="text-align: left;" | |||
| style="text-align: right;" |10|| 1.5195313|| 1.5214844|| 1.5205078|| −0.0051789 | |||
|- style="text-align: left;" | |||
| style="text-align: right;" |11|| 1.5205078|| 1.5214844|| 1.5209961|| −0.0022794 | |||
|- style="text-align: left;" | |||
| style="text-align: right;" |12|| 1.5209961|| 1.5214844|| 1.5212402|| −0.0008289 | |||
|- style="text-align: left;" | |||
| style="text-align: right;" |13|| 1.5212402|| 1.5214844|| 1.5213623|| −0.0001034 | |||
|- style="text-align: left;" | |||
= | | style="text-align: right;" |14|| 1.5213623|| 1.5214844|| 1.5214233|| style="text-align: right;" | 0.0002594 | ||
|-style="text-align: left;" | |||
| style="text-align: right;" |15|| 1.5213623|| 1.5214233|| 1.5213928|| style="text-align: right;" | 0.0000780 | |||
|- | |||
|style=" | |||
|style=" | |||
|style=" | |||
|style=" | |||
|style=" | |||
|style=" | |||
|style=" | |||
| | |||
|style=" | |||
|style=" | |||
|style=" | |||
| | |||
| | |||
| | |||
|style=" | |||
|style=" | |||
|style=" | |||
| | |||
| | |||
| | |||
| | |||
|style=" | |||
|style=" | |||
| | |||
| | |||
| | |||
|style=" | |||
|style=" | |||
|style=" | |||
| | |||
| | |||
| | |||
| | |||
|style=" | |||
|style=" | |||
| | |||
| | |||
| | |||
|style=" | |||
|style=" | |||
|style=" | |||
| | |||
| | |||
| | |||
| | |||
|style=" | |||
|style=" | |||
| | |||
| | |||
| | |||
| | |||
|style=" | |||
|style=" | |||
| | |||
| | |||
| | |||
| | |||
|style=" | |||
|style=" | |||
| | |||
| | |||
| | |||
| | |||
|- | |||
|style="text-align: | |||
|style="text-align: | |||
|- | |||
|style="text-align: right; | |||
|style="text-align: right; | |||
|} | |} | ||
After 13 iterations, it becomes apparent that the iterates converge to about 1.521, a root of the polynomial.
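Using the Python sketch from the Algorithm section, this computation can be reproduced; with a tolerance of 10<sup>−4</sup> the routine stops at iteration 14 of the table:

<syntaxhighlight lang="python">
root = bisection(lambda x: x**3 - x - 2, a=1, b=2, tol=1e-4, nmax=100)
print(round(root, 4))  # 1.5214 (the root is approximately 1.5213797)
</syntaxhighlight>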
== Analysis ==
The method is guaranteed to converge to a root of ''f'' if ''f'' is a [[continuous function]] on the interval [''a'', ''b''] and ''f''(''a'') and ''f''(''b'') have opposite signs. The [[approximation error|absolute error]] is halved at each step so the method [[Rate of convergence|converges linearly]]. Specifically, if ''c''<sub>1</sub> = {{sfrac|''a''+''b''|2}} is the midpoint of the initial interval, and ''c''<sub>''n''</sub> is the midpoint of the interval in the ''n''th step, then the difference between ''c''<sub>''n''</sub> and a solution ''c'' is bounded by<ref>{{Harvnb|Burden|Faires|1985|p=31}}, Theorem 2.1</ref>
:<math>|c_n-c|\le\frac{|b-a|}{2^n}.</math>
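For instance, an illustrative check of this bound against iteration 8 of the example above (the root value used is approximate):

<syntaxhighlight lang="python">
c8, root = 1.5195313, 1.5213797    # midpoint at n = 8; root of x**3 - x - 2
print(abs(c8 - root))              # about 0.0018
print(abs(c8 - root) <= 1 / 2**8)  # True: the error is below |b - a| / 2^8
</syntaxhighlight>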
This bound can be used to determine, in advance, an upper bound on the number of iterations that the bisection method needs to converge to a root to within a certain tolerance.
The number ''n'' of iterations needed to achieve a required tolerance ε (that is, an error guaranteed to be at most ε) is bounded by
:<math>n \le n_{1/2} \equiv \left\lceil\log_2\left(\frac{\epsilon_0}{\epsilon}\right)\right\rceil, </math>
where the initial bracket size <math>\epsilon_0 = |b-a|</math> and the required bracket size <math>\epsilon \le \epsilon_0.</math> The main motivation to use the bisection method is that, over the set of continuous functions, no other method can guarantee an estimate ''c''<sub>''n''</sub> of the solution ''c'' whose worst-case absolute error is at most <math>\epsilon</math> in fewer than ''n''<sub>1/2</sub> iterations.<ref name=":0">{{Cite journal|last=Sikorski|first=K.|date=1982-02-01|title=Bisection is optimal|url=https://doi.org/10.1007/BF01459080|journal=Numerische Mathematik|language=en|volume=40|issue=1|pages=111–117|doi=10.1007/BF01459080|s2cid=119952605 |issn=0945-3245}}</ref> This remains true under several common assumptions on the function ''f'' and its behaviour in the neighbourhood of the root.<ref name=":0" /><ref>{{Cite journal|last=Sikorski|first=K|date=1985-12-01|title=Optimal solution of nonlinear equations|journal=Journal of Complexity|language=en|volume=1|issue=2|pages=197–209|doi=10.1016/0885-064X(85)90011-1|issn=0885-064X|doi-access=}}</ref>
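For instance, for the starting bracket [1, 2] of the example above and a required tolerance of 10<sup>−4</sup>, the bound evaluates as follows:

<syntaxhighlight lang="python">
import math

eps0, eps = 1.0, 1e-4                   # initial and required bracket size
n_half = math.ceil(math.log2(eps0 / eps))
print(n_half)                           # 14 iterations suffice in the worst case
</syntaxhighlight>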
However, despite being optimal with respect to worst-case performance under absolute error criteria, the bisection method is sub-optimal with respect to ''average performance'' under standard assumptions<ref>{{Cite journal|last1=Graf|first1=Siegfried|last2=Novak|first2=Erich|last3=Papageorgiou|first3=Anargyros|date=1989-07-01|title=Bisection is not optimal on the average|url=https://doi.org/10.1007/BF01396051|journal=Numerische Mathematik|language=en|volume=55|issue=4|pages=481–491|doi=10.1007/BF01396051|s2cid=119546369 |issn=0945-3245}}</ref><ref>{{Cite journal|last=Novak|first=Erich|date=1989-12-01|title=Average-case results for zero finding|journal=Journal of Complexity|language=en|volume=5|issue=4|pages=489–501|doi=10.1016/0885-064X(89)90022-8|issn=0885-064X|doi-access=free}}</ref> as well as ''asymptotic performance''.<ref name=":1">{{Cite journal|last1=Oliveira|first1=I. F. D.|last2=Takahashi|first2=R. H. C.|date=2020-12-06|title=An Enhancement of the Bisection Method Average Performance Preserving Minmax Optimality|url=https://doi.org/10.1145/3423597|journal=ACM Transactions on Mathematical Software|volume=47|issue=1|pages=5:1–5:24|doi=10.1145/3423597|s2cid=230586635 |issn=0098-3500}}</ref> Popular alternatives to the bisection method, such as the [[secant method]], [[Ridders' method]] or [[Brent's method]] (amongst others), typically perform better since they trade off worst-case performance to achieve higher [[Rate of convergence|orders of convergence]] to the root. A strict improvement to the bisection method can be achieved with the [[ITP Method]], which attains a higher order of convergence without trading off worst-case performance.<ref name=":1" /><ref>{{Cite web|last=Ivo|first=Oliveira|date=2020-12-14|title=An Improved Bisection Method|doi=10.1145/3423597 |s2cid=230586635 |url=https://link.growkudos.com/1iwxps83474}}</ref>{{Primary source inline|date=November 2024}}
== Generalization to higher dimensions ==
The bisection method has been generalized to multi-dimensional functions. Such methods are called '''generalized bisection methods'''.

=== Methods based on degree computation ===
Some of these methods are based on computing the [[topological degree]], which for a bounded region <math>\Omega \subseteq \mathbb{R}^n</math> and a differentiable function <math>f: \mathbb{R}^n \to \mathbb{R}^n</math> is defined as a sum over its roots:
:<math>\deg(f, \Omega) := \sum_{y \,\in\, f^{-1}(\mathbf{0})} \sgn \det(Df(y)),</math>
where <math>Df(y)</math> is the [[Jacobian matrix]], <math>\mathbf{0} = (0, 0, \ldots, 0)^T</math>, and
:<math>\sgn(x) = \begin{cases}
1, & x>0 \\
0, & x=0 \\
-1, & x<0 \\
\end{cases}</math>
is the [[sign function]].<ref>{{cite journal |last1=Polymilis |first1=C. |last2=Servizi |first2=G. |last3=Turchetti |first3=G. |last4=Skokos |first4=Ch. |last5=Vrahatis |first5=M. N. |journal=Libration Point Orbits and Applications |title=Locating Periodic Orbits by Topological Degree Theory |date=May 2003 |pages=665–676 |doi=10.1142/9789812704849_0031 |arxiv=nlin/0211044 |isbn=978-981-238-363-1 }}</ref> In order for a root to exist, it is sufficient that <math>\deg(f, \Omega) \neq 0</math>, and this can be verified using a [[surface integral]] over the boundary of <math>\Omega</math>.<ref>{{Cite journal |last=Kearfott |first=Baker |date=1979-06-01 |title=An efficient degree-computation method for a generalized method of bisection |url=https://doi.org/10.1007/BF01404868 |journal=Numerische Mathematik |language=en |volume=32 |issue=2 |pages=109–127 |doi=10.1007/BF01404868 |s2cid=122058552 |issn=0945-3245}}</ref>
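As an illustration of the defining sum (with an example function chosen here for demonstration; in practice the roots are unknown, which is why the degree is estimated from the boundary instead):

<syntaxhighlight lang="python">
import numpy as np

# f(x, y) = (x^2 - y, y^2 - x) has exactly two real roots: (0, 0) and (1, 1).
def jacobian(x, y):
    return np.array([[2.0 * x, -1.0],   # row 1: df1/dx, df1/dy
                     [-1.0, 2.0 * y]])  # row 2: df2/dx, df2/dy

roots = [(0.0, 0.0), (1.0, 1.0)]
deg = sum(int(np.sign(np.linalg.det(jacobian(x, y)))) for x, y in roots)
print(deg)  # 0, since sgn det Df is -1 at (0, 0) and +1 at (1, 1)
</syntaxhighlight>

Note that in this example the degree is zero even though roots exist: a nonzero degree is sufficient, but not necessary, for a root.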
=== Characteristic bisection method ===
The '''characteristic bisection method''' uses only the signs of a function at different points. Let ''f'' be a function from R<sup>''d''</sup> to R<sup>''d''</sup>, for some integer ''d'' ≥ 2. A '''characteristic polyhedron'''<ref>{{Cite journal |last=Vrahatis |first=Michael N. |date=1995-06-01 |title=An Efficient Method for Locating and Computing Periodic Orbits of Nonlinear Mappings |url=https://www.sciencedirect.com/science/article/pii/S0021999185711199 |journal=Journal of Computational Physics |language=en |volume=119 |issue=1 |pages=105–119 |doi=10.1006/jcph.1995.1119 |bibcode=1995JCoPh.119..105V |issn=0021-9991}}</ref> (also called an '''admissible polygon''')<ref name=":2">{{Cite journal |last1=Vrahatis |first1=M. N. |last2=Iordanidis |first2=K. I. |date=1986-03-01 |title=A rapid Generalized Method of Bisection for solving Systems of Non-linear Equations |url=https://doi.org/10.1007/BF01389620 |journal=Numerische Mathematik |language=en |volume=49 |issue=2 |pages=123–138 |doi=10.1007/BF01389620 |s2cid=121771945 |issn=0945-3245}}</ref> of ''f'' is a [[polytope]] in R<sup>''d''</sup>, having 2<sup>''d''</sup> vertices, such that in each vertex '''v''', the combination of signs of ''f''('''v''') is unique and the [[Degree of a continuous mapping#Maps from closed region|topological degree of ''f'' on its interior]] is not zero (a necessary criterion to ensure the existence of a root).<ref name=":3">{{cite journal |last1=Vrahatis |first1=M.N. |last2=Perdiou |first2=A.E. |last3=Kalantonis |first3=V.S. |last4=Perdios |first4=E.A. |last5=Papadakis |first5=K. |last6=Prosmiti |first6=R. |last7=Farantos |first7=S.C. |title=Application of the Characteristic Bisection Method for locating and computing periodic orbits in molecular systems |journal=Computer Physics Communications |date=July 2001 |volume=138 |issue=1 |pages=53–68 |doi=10.1016/S0010-4655(01)00190-4|bibcode=2001CoPhC.138...53V }}</ref> For example, for ''d'' = 2, a characteristic polyhedron of ''f'' is a [[quadrilateral]] with vertices (say) A, B, C, D, such that:
* {{tmath|1=\sgn f(A) = (-,-)}}, that is, ''f''<sub>1</sub>(A)<0, ''f''<sub>2</sub>(A)<0.
* {{tmath|1=\sgn f(B) = (-,+)}}, that is, ''f''<sub>1</sub>(B)<0, ''f''<sub>2</sub>(B)>0.
* {{tmath|1=\sgn f(C) = (+,-)}}, that is, ''f''<sub>1</sub>(C)>0, ''f''<sub>2</sub>(C)<0.
* {{tmath|1=\sgn f(D) = (+,+)}}, that is, ''f''<sub>1</sub>(D)>0, ''f''<sub>2</sub>(D)>0.
A '''proper edge''' of a characteristic polygon is an edge between a pair of vertices, such that the sign vector differs by only a single sign. In the above example, the proper edges of the characteristic quadrilateral are AB, AC, BD and CD. A '''diagonal''' is a pair of vertices, such that the sign vector differs by all ''d'' signs. In the above example, the diagonals are AD and BC.

At each iteration, the algorithm picks a proper edge of the polyhedron (say, A{{--}}B), and computes the signs of ''f'' in its mid-point (say, M). Then it proceeds as follows:
* If {{tmath|1=\sgn f(M) = \sgn f(A)}}, then A is replaced by M, and we get a smaller characteristic polyhedron.
* If {{tmath|1=\sgn f(M) = \sgn f(B)}}, then B is replaced by M, and we get a smaller characteristic polyhedron.
* Else, we pick a new proper edge and try again.
Suppose the diameter (= length of longest proper edge) of the original characteristic polyhedron is ''D''. Then, at least <math>\log_2(D/\epsilon)</math> bisections of edges are required so that the diameter of the remaining polygon will be at most ''ε''.<ref name=":2" /> If the topological degree of the initial polyhedron is not zero, then there is a procedure that can choose an edge such that the next polyhedron also has nonzero degree.<ref name=":3" />
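A minimal Python sketch of this iteration for ''d'' = 2 (the test function, the edge-selection heuristic and the stopping rule are illustrative choices, not part of the published method):

<syntaxhighlight lang="python">
import numpy as np

def sign_vec(f, v):
    """Tuple of the signs of the components of f at the point v."""
    return tuple(int(s) for s in np.sign(f(v)))

def characteristic_bisection(f, vertices, tol=1e-8, max_iter=500):
    """Shrink a characteristic quadrilateral of f (d = 2) around a root."""
    verts = [np.asarray(v, float) for v in vertices]
    signs = [sign_vec(f, v) for v in verts]
    for _ in range(max_iter):
        # Proper edges: vertex pairs whose sign vectors differ in one sign.
        proper = [(i, j) for i in range(4) for j in range(i + 1, 4)
                  if sum(a != b for a, b in zip(signs[i], signs[j])) == 1]
        if max(np.linalg.norm(verts[i] - verts[j]) for i, j in proper) < tol:
            return verts[0]       # all proper edges are sufficiently short
        # Try proper edges from longest to shortest until one can be bisected.
        for i, j in sorted(proper, key=lambda e:
                           -np.linalg.norm(verts[e[0]] - verts[e[1]])):
            m = (verts[i] + verts[j]) / 2
            if sign_vec(f, m) == signs[i]:
                verts[i] = m      # replace the endpoint with matching signs
                break
            if sign_vec(f, m) == signs[j]:
                verts[j] = m
                break
    return verts[0]

# The quadrilateral (0,0), (0,5), (5,0), (5,5) is characteristic for this f,
# and the iteration converges to the root (2, 3) in its interior.
f = lambda v: np.array([v[0]**2 - 4.0, v[1]**2 - 9.0])
print(characteristic_bisection(f, [(0, 0), (0, 5), (5, 0), (5, 5)]))
</syntaxhighlight>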
== See also ==
* [[Binary search algorithm]]
* [[Lehmer–Schur algorithm]], generalization of the bisection method in the complex plane
* [[Nested intervals]]

== References ==
{{reflist|30em}}
* {{Citation | last1=Burden | first1=Richard L. | last2=Faires | first2=J. Douglas | title=Numerical Analysis | publisher=PWS Publishers | edition=3rd | isbn=0-87150-857-5 | year=1985 | chapter=2.1 The Bisection Algorithm | url-access=registration | url=https://archive.org/details/numericalanalys00burd }}
== Further reading ==
* {{Citation | last1=Corliss | first1=George | title=Which root does the bisection algorithm find? | year=1977 | journal=SIAM Review | issn=1095-7200 | volume=19 | issue=2 | pages=325–327 | doi=10.1137/1019044 |ref=none}}
* {{Citation | last1=Kaw | first1=Autar | last2=Kalu | first2=Egwu | year=2008 | title=Numerical Methods with Applications | edition=1st | url=http://numericalmethods.eng.usf.edu/topics/textbook_index.html | ref=none | url-status=dead | archive-url=https://web.archive.org/web/20090413123941/http://numericalmethods.eng.usf.edu/topics/textbook_index.html | archive-date=2009-04-13 }}
== External links ==
* Bisection Method Notes, PPT, Mathcad, Maple, Matlab, Mathematica from Holistic Numerical Methods Institute

{{Root-finding algorithms}}