Linear search

In computer science, linear search or sequential search is a method for finding an element within a list. It sequentially checks each element of the list until a match is found or the whole list has been searched.

A linear search runs in linear time in the worst case, and makes at most $n$ comparisons, where $n$ is the length of the list. If each element is equally likely to be searched, then linear search has an average case of $\frac{n + 1}{2}$ comparisons, but the average case can be affected if the search probabilities for each element vary. Linear search is rarely practical because other search algorithms and schemes, such as the binary search algorithm and hash tables, allow significantly faster searching for all but short lists.

Algorithm

A linear search sequentially checks each element of the list until it finds an element that matches the target value. If the algorithm reaches the end of the list, the search terminates unsuccessfully.

Basic algorithm

Given a list $L$ of $n$ elements with values or records $L_0, \ldots, L_{n-1}$, and target value $T$, the following subroutine uses linear search to find the index of the target $T$ in $L$.

1. Set $i$ to 0.
2. If $L_i = T$, the search terminates successfully; return $i$.
3. Increase $i$ by 1.
4. If $i < n$, go to step 2. Otherwise, the search terminates unsuccessfully.
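
The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `linear_search` and the choice to return `None` on failure are assumptions made for this example. Unlike the steps as written, the sketch tests the index bound before the first comparison, so that an empty list is handled as well.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def linear_search(items: Sequence[T], target: T) -> Optional[int]:
    """Return the index of the first element equal to target, or None."""
    i = 0
    # Testing the bound first also covers an empty list.
    while i < len(items):
        if items[i] == target:
            return i      # the search terminates successfully
        i += 1
    return None           # the search terminates unsuccessfully
```

For example, `linear_search([4, 8, 15, 16], 15)` returns `2`, while searching the same list for `7` returns `None`.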

With a sentinel

The basic algorithm above makes two comparisons per iteration: one to check if $L_i$ equals $T$, and the other to check if $i$ still points to a valid index of the list. By adding an extra record $L_n$ to the list (a sentinel value) that equals the target, the second comparison can be eliminated until the end of the search, making the algorithm faster. The search will reach the sentinel if the target is not contained within the list.

1. Set $i$ to 0.
2. If $L_i = T$, go to step 4.
3. Increase $i$ by 1 and go to step 2.
4. If $i < n$, the search terminates successfully; return $i$. Else, the search terminates unsuccessfully.
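
A possible Python rendering of the sentinel variant is sketched below. The name `sentinel_search` is hypothetical, and temporarily appending the sentinel to the caller's list (then removing it again) is one design choice among several. In an interpreted language the saved comparison may not produce a measurable speed-up; the sketch is only meant to show the control flow.

```python
from typing import List, Optional


def sentinel_search(items: List[int], target: int) -> Optional[int]:
    """Linear search with a single comparison per iteration."""
    n = len(items)
    items.append(target)          # L_n: sentinel equal to the target
    i = 0
    try:
        # No bounds check is needed: the sentinel guarantees a match.
        while items[i] != target:
            i += 1
    finally:
        items.pop()               # restore the caller's list
    # A match at index n means only the sentinel was found.
    return i if i < n else None
```

The index test `i < n` is performed once, after the loop, instead of once per iteration.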

In an ordered table

If the list is ordered such that $L_0 \leq L_1 \leq \cdots \leq L_{n-1}$, the search can establish the absence of the target more quickly by concluding the search once $L_i$ exceeds the target. This variation requires a sentinel that is greater than the target.

1. Set $i$ to 0.
2. If $L_i \geq T$, go to step 4.
3. Increase $i$ by 1 and go to step 2.
4. If $L_i = T$, the search terminates successfully; return $i$. Else, the search terminates unsuccessfully.
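
The ordered variant might look like this in Python. The sketch assumes a list of numbers sorted in non-decreasing order and a finite target, and uses `math.inf` as the sentinel; the function name `ordered_search` is an assumption for illustration.

```python
import math
from typing import List, Optional


def ordered_search(items: List[float], target: float) -> Optional[int]:
    """Linear search in a sorted list that stops at the first element >= target."""
    n = len(items)
    items.append(math.inf)        # sentinel greater than any finite target
    i = 0
    try:
        while items[i] < target:  # advance while the element is too small
            i += 1
    finally:
        items.pop()               # restore the caller's list
    # i == n means the scan stopped at the sentinel.
    if i < n and items[i] == target:
        return i
    return None
```

Searching `[1, 3, 5, 7]` for `4` stops at the element `5` without examining `7`.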

Analysis

For a list with n items, the best case is when the value is equal to the first element of the list, in which case only one comparison is needed. The worst case is when the value is not in the list (or occurs only once at the end of the list), in which case n comparisons are needed.

If the value being sought occurs k times in the list, and all orderings of the list are equally likely, the expected number of comparisons is

$$\begin{cases} n & \text{if } k = 0 \\ \frac{n + 1}{k + 1} & \text{if } 1 \leq k \leq n. \end{cases}$$

For example, if the value being sought occurs once in the list, and all orderings of the list are equally likely, the expected number of comparisons is $\frac{n + 1}{2}$. However, if it is known that it occurs once, then at most $n - 1$ comparisons are needed, and the expected number of comparisons is

$$\frac{(n + 2)(n - 1)}{2n}$$

(for example, for n = 2 this is 1, corresponding to a single if-then-else construct).

Either way, asymptotically the worst-case cost and the expected cost of linear search are both O(n).
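
The closed forms in this section can be checked for small lists by brute force. The sketch below is an illustration only: the helper names are hypothetical, exact arithmetic uses `fractions.Fraction`, and the enumeration is exponential in $n$, so it is suitable only for small $n$.

```python
from fractions import Fraction
from itertools import permutations


def comparisons(items, target):
    """Count the element comparisons made by a basic linear search."""
    for count, value in enumerate(items, start=1):
        if value == target:
            return count
    return len(items)             # target absent: every element was compared


def expected_comparisons(n, k):
    """Average over all distinct orderings of k targets among n items."""
    base = [1] * k + [0] * (n - k)        # 1 marks a copy of the target
    orderings = set(permutations(base))
    total = sum(comparisons(order, 1) for order in orderings)
    return Fraction(total, len(orderings))


def expected_known_once(n):
    """Target known to occur exactly once, so the last comparison is skipped."""
    total = sum(min(position, n - 1) for position in range(1, n + 1))
    return Fraction(total, n)
```

For instance, `expected_comparisons(5, 2)` gives 2, which agrees with $\frac{n + 1}{k + 1}$, and `expected_known_once(2)` gives 1.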

Non-uniform probabilities

The performance of linear search improves if the desired value is more likely to be near the beginning of the list than near its end. Therefore, if some values are much more likely to be searched for than others, it is desirable to place them at the beginning of the list.

In particular, when the list items are arranged in order of decreasing probability, and these probabilities are geometrically distributed, the cost of linear search is only O(1).^[1]
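
As a rough numerical illustration of the geometric case, the sketch below assumes search probabilities proportional to $2^{-i}$ for positions $i = 1, \ldots, n$, normalised to sum to 1. The ratio of one half and the function names are assumptions chosen for this example.

```python
def expected_cost(probabilities):
    """Expected comparisons: sum of i * p_i with 1-based positions."""
    return sum(i * p for i, p in enumerate(probabilities, start=1))


def truncated_geometric(n, ratio=0.5):
    """Probabilities proportional to ratio**i for i = 1..n, summing to 1."""
    weights = [ratio ** i for i in range(1, n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


for n in (10, 100, 1000):
    print(n, expected_cost(truncated_geometric(n)))
```

Under these assumptions the expected cost approaches 2 however large $n$ becomes, which is the sense in which the cost is constant.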

In general, if items are arranged in order of decreasing probability and the probability of searching for the $i$th element is $p_i$, the expected cost of a single search is $E S^{OPT} = \sum_{i = 1}^{n} i p_i$. Under the natural assumption that the probabilities are not known in advance, or that one cannot spend the time to sort the list by probability, one can use a self-adjusting data structure that moves elements towards the head of the list when they are requested in a search. Two natural heuristics for this self-adjustment are Move to Front (MF), where the requested element is moved to the head of the list, and Transpose (T), where the requested element trades places with its predecessor. It is known that the expected cost of an access in a large sequence of independent accesses, averaged over all initial orders of the list, satisfies $E S^{T} \leq E S^{MF} \leq \frac{\pi}{2} E S^{OPT}$. In terms of amortized cost, averaged over a worst-case sequence of operations (among sequences satisfying the assumption on the probabilities), we have $S^{MF} \leq 2 S^{OPT}$, while $S^{T}$ can be as bad as $O(m S^{OPT})$.^[2]
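
The two self-adjusting heuristics can be sketched as follows. The function names are hypothetical; both functions modify the list in place and return the position at which the element was found, or `None` if it is absent.

```python
from typing import List, Optional, TypeVar

T = TypeVar("T")


def search_move_to_front(items: List[T], target: T) -> Optional[int]:
    """Linear search; a found element is moved to the head of the list."""
    for i, value in enumerate(items):
        if value == target:
            items.insert(0, items.pop(i))
            return i              # position before the adjustment
    return None


def search_transpose(items: List[T], target: T) -> Optional[int]:
    """Linear search; a found element swaps places with its predecessor."""
    for i, value in enumerate(items):
        if value == target:
            if i > 0:
                items[i - 1], items[i] = items[i], items[i - 1]
            return i              # position before the adjustment
    return None
```

Starting from `[1, 2, 3, 4]`, a search for `3` leaves the list as `[3, 1, 2, 4]` under Move to Front and as `[1, 3, 2, 4]` under Transpose.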

Application

Linear search is usually very simple to implement, and is practical when the list has only a few elements, or when performing a single search in an unordered list.

When many values have to be searched in the same list, it often pays to preprocess the list in order to use a faster method. For example, one may sort the list and use binary search, or build an efficient search data structure from it. Should the content of the list change frequently, repeated reorganization may be more trouble than it is worth.
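
One way to realise the sort-then-binary-search option is sketched below, using Python's standard `bisect` module. The helper names `build_index` and `contains` are assumptions for illustration.

```python
import bisect


def build_index(values):
    """One-off preprocessing step: sort a copy of the data."""
    return sorted(values)


def contains(sorted_values, target):
    """Binary search on the sorted copy using bisect_left."""
    i = bisect.bisect_left(sorted_values, target)
    return i < len(sorted_values) and sorted_values[i] == target
```

The sorting cost is paid once in `build_index`; each later call to `contains` takes logarithmic rather than linear time, which is why the approach pays off only when many searches are made against the same list.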

As a result, even though in theory other search algorithms (for instance binary search) may be faster than linear search, in practice even on medium-sized arrays (around 100 items or fewer) it might be infeasible to use anything else. On larger arrays, other, faster search methods only make sense if the data is large enough, because the initial time to prepare (sort) the data is comparable to that of many linear searches.^[3]

References

Works

Knuth, Donald. The Art of Computer Programming. Volume 3: Sorting and Searching.
