Abstract
Consider the following three-stage strategy for recognizing patterns in larger scenes. (1) Mimic randomization deterministically: sample several positions of the pattern. (2) Search for the sample: find all occurrences of the sample in the scene. (3) Verify: for each occurrence of the sample, verify an occurrence of the full pattern. This strategy has led to the core of the new idea given in this paper. Consider the string matching problem. Given the pattern, a sample of its positions is carefully selected whose size is at most logarithmic (the deterministic sample). Then, the sample is searched for. For nonperiodic patterns, the sample has the following perhaps surprising property: within each "neighborhood" of locations in the text, it is possible to disqualify all occurrences of the sample but one, without any further comparisons of characters. This provides sparse verification. This approach enables the text analysis (the "search for sample" and "verify" stages) to be performed in O(log* n) time with optimal speedup on a PRAM, improving on the previous fastest optimal-speedup result. It also leads to a new serial algorithm for string matching that runs in linear time, including preprocessing. The approach is expected to be applicable to pragmatic pattern recognition problems. In some sense the algorithms are based on degenerate forms of computation, such as the AND and OR of a large number of bits. However, traditional machine designs do not take advantage of such degeneracies, and the usual complexity measures do not even allow them to be reflected. The paper therefore concludes with some speculative thoughts on desirable capabilities that would enhance computing machinery for some pattern recognition applications.
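To make the three-stage strategy concrete, the following is a minimal, hypothetical serial sketch in Python. It is not the paper's algorithm: the sample positions are chosen naively (a few spread-out positions) rather than by the deterministic-sampling construction, and no neighborhood disqualification (sparse verification) is performed. The function name find_with_sample and its parameters are illustrative assumptions only; the sketch shows just the overall sample/search/verify structure.

```python
def find_with_sample(pattern, text, sample_size=None):
    """Illustrative serial sketch of the sample-search-verify strategy.

    The sample here is chosen naively (spread-out positions), NOT by the
    paper's deterministic-sampling construction, and no neighborhood
    disqualification is performed.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Stage 1 (stand-in for "mimic randomization deterministically"):
    # pick a small set of pattern positions as the sample (~log m of them).
    k = sample_size or max(1, m.bit_length())
    sample = sorted({i * (m - 1) // max(1, k - 1) for i in range(k)})

    matches = []
    for start in range(n - m + 1):
        # Stage 2 ("search for sample"): compare only the sampled positions.
        if all(text[start + i] == pattern[i] for i in sample):
            # Stage 3 ("verify"): confirm the full pattern occurrence.
            if text[start:start + m] == pattern:
                matches.append(start)
    return matches


if __name__ == "__main__":
    print(find_with_sample("abab", "xabababx"))  # prints [1, 3]
```

In the paper, the logarithmic-size deterministic sample plus the disqualification property is what keeps the verification stage sparse; the sketch above verifies at every surviving sample match and is therefore only linear-work in the serial sense, not a parallel O(log* n) procedure.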
