• ~8 Pi Pn•
Pl Pjo
• •
Fig. 5. input data and result of a closest pair problem.
A rough sketch of a better algorithm [HNS88] shows that 5 n distance eval-
uations always suffice to determine a closest pair. This remarkable reduction
from O(n 2) geometric operations down to <9(n), however, is not entirely free. It
requires more complex data management, such as sorting, in order to carefully
keep track of all the relevant information that can be deduced from the few dis-
tance evaluations actually performed. As we will see, the resulting trade-off "less
geometry at the cost of more book-keeping" is beneficial. It leads to an
O(n
log n)
algorithm which is a clear improvement over the naive O(n 2) algorithm.
The algorithm to be presented is of a type called plane sweep or line sweep, a
paradigm that works well for many 2-dimensional problems. A vertical line, the
"moving front", sweeps across the data from left to right, without ever backing
up, processing "events" as it "sees" or encounters them. Sweep algorithms rely
on an invariant characteristic of incremental algorithms: at all times, the front
has accumulated the answer to the problem defined by the subset of data already
seen. Once it has scanned all the data, the answer to the entire problem is at
hand.
This algorithm schema is tailored to the closest pair problem as follows. The
n points Pl,...,Pn are the events to be processed, sorted by x-coordinate. The
sweep invariant Ik, for 2 < k < n, states: when the front has processed the events
Pl,... ,pk we know a closest pair
(Pi,Pj)
among the subset {Pl,... ,Pk} and its
distance 6 =
did.
The invariant I2 is initialized with 6 = d12. At termination,
k = n, I~ solves the closest pair problem. The essence of a sweep algorithm, as
in a proof by induction, is the step from k to k + 1.
Fig. 6 illustrates how an event p is processed. The minimal distance 6 among
the k points to the left of the front is known. The current event, the (k + 1)-
th point p, either changes the value of 6, or doesn't, depending on the data
configuration in its immediate vicinity. Specifically, a new closest pair and a new
6 come into existence iff there exists a point r within a semi-circle to the left of
the front with center p and radius 6. Imagine the plane covered by a sheet of
paper into which a semi-circular hole of the right shape and size has been cut,
positioned with its center over p. This window defines a semi-circular query. The
response to this query, i.e. all the points visible through the window, must be
examined. If there are none, the event p leaves the previous solution unchanged.
If any point r is visible, a new closer pair and a new value of 6 emerge.
We can now substantiate the claim stated earlier: "5 n distance evaluations
always suffice to determine a closest pair". How many points can be seen through
the semi-circular window? The sweep invariant implies that points to the left of
the front must have a distance > 6; and the window has radius 6. Clearly, only